Wing Sheung Chan

Event selection and classification

as they affect the separation of the different background processes along the combined NN output, which in turn affects our ability to constrain the background modelling using observed data in statistical fits. To optimise the values of $\{w_b\}$, a grid search was performed, and the values that yield the best expected upper limit on the LFV branching fractions were chosen. The optimised values have the ratio $w_{\mathrm{Ztt}} : w_{\mathrm{Wjets}} : w_{\mathrm{Zll}} = 1.0 : 1.5 : 0.33$, common to both the eτ and µτ channels.

Figure 4.3 shows the expected ROC curves for the combined NN output in the different channels, while Figure 4.4 shows the expected signal and major background distributions in the SR. As shown in the figures, the distributions of the different background processes are clearly separated from each other, as expected from the optimisation of $\{w_b\}$. The distribution of the combined NN output in the SR is used in binned maximum-likelihood fits to extract evidence of signal or to set upper limits on the LFV branching fractions (see Section 6.1). It is also used to reject the most background-like events in the SR, which are kinematically very different from the signal events.

Alternative ways of creating a one-dimensional final discriminant were also explored. These include what will be referred to as the “combined-background classifier approach” and the “multiclass classifier approach”. The combined-background classifier approach uses a single binary classifier trained with a combined set of background samples, instead of multiple individual classifiers for the different backgrounds. The background samples are weighted according to the relative importance of the backgrounds in the SR. The multiclass classifier approach uses a neural network with (number of major backgrounds + 1) output nodes to classify the signal and the major background events at the same time.
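As a minimal illustration of a classifier head with (number of major backgrounds + 1) output nodes, the sketch below normalises the raw node scores with a softmax and scores one event with a categorical cross-entropy, a standard pairing for multiclass classification. The class ordering, the example scores and the function names are assumptions made for illustration only; they are not taken from the analysis.

```python
import math


def softmax(scores):
    """Convert raw output-node scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_class):
    """Loss for a single event: minus the log-probability of its true class."""
    return -math.log(probs[true_class])


# Hypothetical class order: 0 = signal, 1..3 = three major backgrounds.
raw_scores = [2.0, 0.5, -1.0, 0.1]  # made-up output-node values
probs = softmax(raw_scores)
loss_if_signal = categorical_cross_entropy(probs, true_class=0)
```

Training would minimise the average of this loss over all events. A one-dimensional discriminant then still has to be derived from the output probabilities, for example the signal probability `probs[0]`; that choice is likewise an assumption here.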
The multiclass classifier uses a softmax activation [97] and is trained to minimise the categorical cross-entropy.

The performance of the final discriminants created with the alternative approaches is found to be worse, in terms of expected sensitivity, than that of the chosen approach. This is not unexpected: in the alternative approaches, the NN classifiers that directly output the final discriminant can only be optimised with respect to proxy figures of merit, such as the cross-entropy or the area under the ROC curve, which might not faithfully reflect the actual sensitivity of the analysis. In the chosen approach, by contrast, the combination parameters $\{w_b\}$ can be optimised using the actual maximum-likelihood fitting procedure, with the systematic uncertainties taken into account.
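The grid search over the combination weights can be sketched on toy data as follows. This is only an illustration under stated assumptions: the combination is taken to be a weighted average of the individual classifier outputs (the actual combination formula is defined outside this section), the events are randomly generated, and a simple binned figure of merit stands in for the expected upper limit from the full maximum-likelihood fit. All function names, the grid values and the toy figure of merit are hypothetical.

```python
import itertools
import math
import random

BACKGROUNDS = ["Ztt", "Wjets", "Zll"]


def combine(outputs, weights):
    """Assumed combination: weighted average of the per-background NN outputs."""
    total = sum(weights[b] for b in BACKGROUNDS)
    return sum(weights[b] * outputs[b] for b in BACKGROUNDS) / total


def histogram(values, n_bins=10):
    """Fill a histogram over [0, 1]; a value of exactly 1 goes in the last bin."""
    counts = [0.0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1.0
    return counts


def figure_of_merit(sig_hist, bkg_hist):
    """Toy stand-in for the expected sensitivity (NOT the real limit-setting fit)."""
    return math.sqrt(
        sum(s * s / (s + b) for s, b in zip(sig_hist, bkg_hist) if s + b > 0)
    )


def toy_events(n, centre, rng):
    """Toy events whose classifier outputs scatter around `centre` in [0, 1]."""
    return [
        {b: min(max(rng.gauss(centre, 0.2), 0.0), 1.0) for b in BACKGROUNDS}
        for _ in range(n)
    ]


rng = random.Random(1)  # fixed seed so the toy study is reproducible
signal = toy_events(200, 0.7, rng)
background = toy_events(2000, 0.3, rng)

grid = [0.33, 1.0, 1.5]  # arbitrary toy grid of candidate weight values
best_fom, best_weights = -1.0, None
for values in itertools.product(grid, repeat=len(BACKGROUNDS)):
    weights = dict(zip(BACKGROUNDS, values))
    sig_hist = histogram([combine(e, weights) for e in signal])
    bkg_hist = histogram([combine(e, weights) for e in background])
    fom = figure_of_merit(sig_hist, bkg_hist)
    if fom > best_fom:  # keep the weights with the best toy sensitivity
        best_fom, best_weights = fom, weights
```

In a real analysis, the call to `figure_of_merit` would be replaced by the complete fit, including systematic uncertainties, and the weights giving the lowest expected upper limit would be kept. With the weighted-average form assumed here, only the ratios of the weights matter.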
