
FIGURE 2. Explanation of the most frequently used prediction models: (A) artificial neural networks; (B) support vector machines (SVM); (C) decision trees; (D) naïve Bayes; (E) K-nearest neighbor; (F) fuzzy C-means.

A: Artificial neural networks are inspired by the neural networks in the brain and are organized in layers of interconnected nodes. The nodes in the upper layer (red) represent the input features, and the node in the lower layer (blue) represents the distinct output. The nodes in the 'hidden' (orange) and output (blue) layers base the value of their output on the total input they receive. The rapid increase in computing power in recent years has allowed researchers to develop artificial neural networks with many hidden layers and millions of parameters. This stacking of multiple layers, referred to as deep learning, allows the model to recognize complex patterns in higher-dimensional data. However, these models are also referred to as 'black box' algorithms, because interpreting their predictive mechanisms can be challenging.

B: Support vector machines classify data points by calculating the ideal decision boundary, the 'separating hyperplane' (a straight line in two dimensions). Support vector machines select the hyperplane with the maximal distance to the nearest data points. A kernel function is a mathematical trick that adds an extra dimension to the data: non-separable two-dimensional data, for example, could then be separated in a three-dimensional space.

C: Decision trees make predictions or classifications based on several input features by repeatedly bifurcating the feature space. These algorithms search for the optimal feature at which to make a split and, in the case of a continuous feature, the optimal split value. Random forests are an ensemble learning method that takes the mode of the classes or the mean of the predictions of the individual trees, to avoid the overfitting of a single tree.

D: Naïve Bayes calculates the most likely outcome (blue) as the product of the a priori probability (red) and the conditional probabilities given the individual features. It therefore assumes that the presence (or absence) of a feature is unrelated to the presence of any other feature, which is often not the case in real life.

E: K-nearest neighbors compares a data point of unknown class with its K nearest neighbors and assigns it the most common class among those neighbors. For K = 1, the algorithm assigns the data point the class of the single closest neighbor.

F: Fuzzy C-means is an unsupervised learning algorithm that clusters data points based on their input features, without a desired output. The 'fuzzy' aspect gives the algorithm the flexibility to assign a data point to each cluster to a degree that reflects the likelihood of belonging to that cluster.

Minimal code sketches illustrating each of these models follow below.
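As a rough illustration of panel A, the sketch below trains a small feed-forward neural network with two hidden layers on synthetic data. The use of scikit-learn, the synthetic dataset, and the layer sizes are illustrative assumptions, not part of the figure.

```python
# A minimal sketch of a feed-forward neural network, assuming scikit-learn
# and synthetic data purely for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers; stacking many such layers is what the caption
# refers to as deep learning.
net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
net.fit(X_train, y_train)
print(net.score(X_test, y_test))
```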
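For panel B, a kernelized SVM can separate the classic concentric-circles example, which is not linearly separable in two dimensions. The RBF kernel used here is one common choice of kernel function, assumed for illustration.

```python
# A minimal SVM sketch, assuming scikit-learn and a toy dataset.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: no straight line separates them in 2-D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel implicitly maps the data into a higher-dimensional space
# in which a separating hyperplane exists (the 'kernel trick').
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)
print(clf.score(X, y))
```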
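Panel C could be sketched as follows: a single decision tree versus a random forest that aggregates many trees. Again, scikit-learn and the synthetic data are assumptions for the sake of a runnable example.

```python
# A minimal sketch contrasting a single tree with a random forest.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# A single tree bifurcates the feature space on the best feature/value splits.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# A random forest takes the mode of the classes over many trees,
# reducing the overfitting of any single tree.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(tree.score(X, y), forest.score(X, y))
```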
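For panel D, the Gaussian variant of naïve Bayes (an assumption; other variants exist for discrete features) multiplies the class prior by per-feature conditional probabilities, exactly as the caption describes.

```python
# A minimal naïve Bayes sketch, assuming scikit-learn's Gaussian variant.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# The posterior is proportional to the class prior times the product of
# per-feature likelihoods, under the feature-independence assumption.
nb = GaussianNB().fit(X, y)
print(nb.predict_proba(X[:3]))
```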
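Panel E maps almost directly onto a library call; the value K = 5 below is an arbitrary illustrative choice.

```python
# A minimal K-nearest-neighbors sketch, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Each query point takes the majority class of its K nearest neighbors;
# with n_neighbors=1 it simply copies the class of the single closest point.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict(X[:3]))
```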
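Finally, for panel F, scikit-learn does not ship fuzzy C-means, so the sketch below is a bare-bones NumPy implementation of the standard alternating updates (membership-weighted centers, then distance-based soft memberships); the fuzzifier m = 2 and two clusters are illustrative assumptions.

```python
# A minimal fuzzy C-means sketch in plain NumPy.
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Return cluster centers and soft membership degrees for each point."""
    rng = np.random.default_rng(seed)
    # Random soft memberships, one row per point, normalized to sum to 1.
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centers are membership-weighted means of the data points.
        centers = um.T @ X / um.sum(axis=0)[:, None]
        # Distance of every point to every center (small epsilon avoids /0).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        # Membership update: closer centers receive higher degrees of belonging.
        u = 1.0 / (d ** (2.0 / (m - 1.0)))
        u /= u.sum(axis=1, keepdims=True)
    return centers, u

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(50, 2)), rng.normal(size=(50, 2)) + 4])
centers, u = fuzzy_c_means(X, c=2)
print(centers)           # two cluster centers
print(u[:3].round(2))    # soft memberships of the first three points
```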
