CFA Level 2 Machine Learning (2)
Commonly used algorithms in each machine learning category, with their basic principles and applications.
Edited at 2020-01-05 03:30:49
ML Algorithms (machine learning algorithms)
Supervised ML Algorithms
Penalized regression
LASSO (least absolute shrinkage and selection operator)
λ > 0
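The LASSO penalty (λ > 0 times the sum of absolute coefficient values) can be illustrated with a minimal coordinate-descent sketch; the function name and test data below are illustrative, not from the curriculum. The soft-thresholding step is what shrinks small coefficients to exactly zero, performing automatic feature selection:

```python
import numpy as np

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize ||y - Xb||^2 + lam * sum(|b_k|) by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: remove feature j's current contribution
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # soft-thresholding: coefficients with |rho| <= lam/2 become exactly 0
            b[j] = np.sign(rho) * max(abs(rho) - lam / 2, 0.0) / z
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=100)  # feature 1 is irrelevant
b = lasso_coordinate_descent(X, y, lam=10.0)
```

With a sufficiently large λ the irrelevant feature's coefficient is driven to zero, while the relevant one is only slightly shrunk toward zero.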
classification
Support vector machine (SVM)
classification
linear classifier (a binary classifier)
soft margin classification
Application: particularly suited for small- to medium-size but complex high-dimensional data sets, such as corporate financial statements or bankruptcy databases. Investors seek to predict company failures for identifying stocks to avoid or to short sell, and SVM can generate a binary classification (e.g., bankruptcy likely vs. bankruptcy unlikely) using many fundamental and technical feature variables.
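A soft-margin linear SVM can be sketched as sub-gradient descent on the hinge loss; this is a minimal illustration with made-up cluster data (the trainer name and hyperparameters are assumptions, not curriculum material):

```python
import numpy as np

def train_soft_margin_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Sub-gradient descent on (1/2)||w||^2 + C * sum(max(0, 1 - y_i(w.x_i + b)))."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (X[i] @ w + b) < 1:
                # point is inside the (soft) margin or misclassified: hinge loss active
                w -= lr * (w - C * y[i] * X[i])
                b += lr * C * y[i]
            else:
                # point is outside the margin: only the regularizer contributes
                w -= lr * w
    return w, b

rng = np.random.default_rng(1)
# two separable clusters, binary labels in {-1, +1}
X = np.vstack([rng.normal(2.0, 0.5, size=(20, 2)),
               rng.normal(-2.0, 0.5, size=(20, 2))])
y = np.array([1] * 20 + [-1] * 20)
w, b = train_soft_margin_svm(X, y)
pred = np.sign(X @ w + b)
```

The sign of the decision function w·x + b gives the binary classification (e.g., bankruptcy likely vs. unlikely).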
K-nearest neighbor (KNN)
e.g., to classify a diamond, look at the diamond’s k nearest neighbors
A critical challenge of KNN is defining what it means to be “similar” (or near).
Besides the selection of features, an important decision relates to the distance metric used to model similarity because an inappropriate measure will generate poorly performing models.
Applications: including bankruptcy prediction, stock price prediction, corporate bond credit rating assignment, and customized equity and bond index creation.
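The core KNN step is short enough to sketch directly; the Euclidean metric below is one choice among many, and as noted above the distance metric is a key modeling decision:

```python
import numpy as np

def knn_classify(X_train, y_train, x_new, k=3):
    # the distance metric is a key modeling choice; Euclidean is used here
    dist = np.linalg.norm(X_train - x_new, axis=1)
    nearest_labels = y_train[np.argsort(dist)[:k]]
    # majority vote among the k nearest neighbors
    labels, counts = np.unique(nearest_labels, return_counts=True)
    return labels[np.argmax(counts)]

# toy training set: two well-separated classes
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
```

A new observation is labeled by whichever class dominates among its k closest training points.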
Classification and Regression Tree (CART)
CART is applied to binary classification or regression.
Application: enhancing detection of fraud in financial statements, generating consistent decision processes in equity and fixed-income selection, and simplifying communication of investment strategies to clients.
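The heart of CART is the recursive binary split; a single split (a stump) chosen to minimize weighted Gini impurity can be sketched as follows (function names and the toy data are illustrative):

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Find the (feature, threshold) binary split minimizing weighted Gini impurity."""
    n, p = X.shape
    best = (None, None, np.inf)
    for j in range(p):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

X = np.array([[1.0, 9.0], [2.0, 8.0], [3.0, 1.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
feature, threshold, score = best_split(X, y)
```

A full tree repeats this search recursively on each resulting partition, which is what makes the decision process easy to visualize and communicate.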
ensemble learning
Classification
aggregation of heterogeneous learners
different types of algorithms combined together with a voting classifier
aggregation of homogenous learners
a combination of the same algorithm, using different training data that are based, for example, on a bootstrap aggregating
Example
Bootstrap Aggregating (Bagging)
The original training data set is used to generate n new training data sets, or "bags," of data.
The algorithm can now be trained on n independent data sets that will generate n new models.
Bagging is a very useful technique because it helps to improve the stability of predictions and protects against overfitting the model.
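The bagging process above can be sketched in a few lines: draw n bootstrap samples of the same size as the original data (with replacement), train one model per bag, then aggregate predictions by majority vote. The helper names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_bags(X, y, n_bags):
    """Draw n_bags bootstrap samples, each the same size as the original data."""
    bags = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), size=len(y))  # sample row indices with replacement
        bags.append((X[idx], y[idx]))
    return bags

def majority_vote(predictions):
    """Aggregate the n models' class predictions by majority vote."""
    labels, counts = np.unique(predictions, return_counts=True)
    return labels[np.argmax(counts)]

X = np.arange(200).reshape(100, 2)
y = np.arange(100)
bags = bootstrap_bags(X, y, n_bags=5)
```

Because sampling is with replacement, each bag contains only about 63% of the distinct original rows on average; the duplicated and omitted rows are what make the n trained models differ.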
random forest
a collection of a large number of decision trees trained via a bagging method.
For example, a CART algorithm would be trained using each of the n independent data sets (from the bagging process) to generate the multitude of different decision trees that make up the random forest classifier.
black box-type algorithm.
Unsupervised ML Algorithms
dimension reduction
Principal Components Analysis (PCA)
PCA is used to summarize or reduce highly correlated features of data into a few main, uncorrelated composite variables.
A composite variable is a variable that combines two or more variables that are statistically strongly related to each other
two key concepts: eigenvectors and eigenvalues
The eigenvectors define new, mutually uncorrelated composite variables that are linear combinations of the original features.
An eigenvalue gives the proportion of total variance in the initial data that is explained by each eigenvector.
black box
Application: typically performed as part of exploratory data analysis, before training another supervised or unsupervised learning model.
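The eigenvector/eigenvalue mechanics above can be sketched via an eigendecomposition of the covariance matrix of the centered data (the function name and toy data are illustrative):

```python
import numpy as np

def pca(X):
    """Eigendecomposition of the covariance matrix of the centered data."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh: cov is symmetric
    order = np.argsort(eigenvalues)[::-1]             # sort by variance, descending
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # each eigenvalue's share of total variance = proportion of variance
    # in the data explained by its eigenvector (composite variable)
    explained = eigenvalues / eigenvalues.sum()
    return eigenvectors, eigenvalues, explained

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
# two strongly correlated features, as in the highly correlated case PCA targets
X = np.column_stack([x1, 2.0 * x1 + 0.1 * rng.normal(size=500)])
_, _, explained = pca(X)
```

With two strongly correlated features, the first composite variable captures nearly all of the total variance, so the data can be summarized in one dimension with little information loss.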
clustering
A cluster contains a subset of observations from the data set such that all the observations within the same cluster are deemed “similar.”
k-means clustering
K-means is a relatively old algorithm that repeatedly partitions observations into a fixed number, k, of non-overlapping clusters
Application: in data exploration for discovering patterns in high dimensional data or as a method for deriving alternatives to existing static industry classifications.
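The repeated partitioning in k-means (Lloyd's algorithm) alternates two steps: assign each observation to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal sketch with illustrative toy data:

```python
import numpy as np

def k_means(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm for k non-overlapping clusters."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # step 1: assign each observation to its nearest centroid
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # step 2: recompute each centroid as the mean of its cluster
        # (keep the old centroid if a cluster happens to be empty)
        centroids = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                              else centroids[c] for c in range(k)])
    return labels, centroids

rng = np.random.default_rng(1)
# two tight, well-separated clusters
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(10.0, 0.3, size=(20, 2))])
labels, centroids = k_means(X, k=2)
```

Note that k is fixed in advance; choosing it poorly is a common failure mode in practice.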
hierarchical clustering
agglomerative clustering (or bottom-up)
begins with each observation being treated as its own cluster.
divisive clustering (or top-down)
starts with all the observations belonging to a single cluster
Dendrograms: tree diagrams that visualize the sequence of cluster merges in the hierarchy
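The bottom-up (agglomerative) procedure can be sketched directly: start with every observation as its own cluster and repeatedly merge the two closest clusters; each merge is one join in the dendrogram. Single linkage is used here as one possible cluster-distance choice, and the toy data is illustrative:

```python
import numpy as np

def agglomerative(points, n_clusters):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]   # each observation starts alone
    while len(clusters) > n_clusters:
        best = (np.inf, None, None)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]    # one merge = one dendrogram join
        del clusters[b]
    return clusters

points = np.array([[0.0], [0.2], [10.0], [10.3], [20.0]])
clusters = agglomerative(points, n_clusters=3)
```

Divisive (top-down) clustering runs the same idea in reverse, starting from one all-inclusive cluster and splitting it.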
Deep Learning and Reinforcement Learning
Neural Networks
(4-5-1) Neural Network
4: input layer
4 features
5: hidden layer (5 nodes)
where learning occurs in training and inputs are processed on trained nets
1: output layer
here consists of a single node for the target variable y
summation operator
A functional part of a neural network’s node that multiplies each input value received by a weight and sums the weighted values to form the total net input, which is then passed to the activation function.
activation function
A functional part of a neural network’s node that transforms the total net input received into the final output of the node. The activation function operates like a light dimmer switch that decreases or increases the strength of the input.
forward propagation
The process of calculating the network's output by passing input values forward through the network's layers, applying each node's summation operator and activation function in turn.
backward propagation
The process of adjusting weights in a neural network, to reduce total error of the network, by moving backward through the network’s layers.
learning rate
A parameter that affects the magnitude of adjustments in the weights in a neural network.
Application: a variety of tasks characterized by non-linearities and complex interactions among features.
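A forward pass through the (4-5-1) network described above can be sketched in a few lines; the random weights here are placeholders for values that training (forward and backward propagation) would determine, and a sigmoid is used as one common activation function:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    """Activation function: squashes the total net input into (0, 1),
    like a dimmer switch on the signal's strength."""
    return 1.0 / (1.0 + np.exp(-z))

# placeholder random weights for a (4-5-1) network:
# 4 input features, one hidden layer of 5 nodes, 1 output node
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)
W2, b2 = rng.normal(size=(1, 5)), np.zeros(1)

def forward(x):
    # each node applies the summation operator (weighted sum of its inputs)
    # followed by the activation function
    hidden = sigmoid(W1 @ x + b1)          # hidden layer: 5 node outputs
    return sigmoid(W2 @ hidden + b2)[0]    # output layer: single node for y

output = forward(np.array([0.5, -1.0, 2.0, 0.0]))
```

Training would compare this output with the target y, then adjust W1, W2, b1, b2 via backward propagation, with the step size controlled by the learning rate.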
deep learning nets (DLNs)
Algorithms based on complex neural networks, ones with many hidden layers (more than 3), that address highly complex tasks, such as image classification, face recognition, speech recognition, and natural language processing.
Reinforcement learning (RL)
Machine learning in which a computer learns from interacting with itself (or data generated by the same algorithm).
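The learning-from-interaction idea can be sketched with tabular Q-learning on a toy environment (the 5-state line, rewards, and hyperparameters below are all invented for illustration): the agent tries actions, observes rewards, and updates its value estimates from its own experience.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy environment: 5 states in a line; only reaching the right end (state 4) pays
n_states, n_actions = 5, 2          # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.3   # learning rate, discount factor, exploration rate

for _ in range(500):
    s = int(rng.integers(n_states - 1))      # random non-terminal start state
    for _ in range(100):                     # cap episode length
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: nudge the estimate toward reward + discounted future value
        Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

policy = Q.argmax(axis=1)   # learned action in each state
```

No labeled examples are provided; the agent discovers the "always move right" policy purely from the rewards its own actions generate.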