SVM self-study: SMO efficient optimization algorithm mind map
Separating data with the maximum margin, finding the maximum margin, the SMO efficient optimization algorithm, speeding up optimization with the full Platt SMO algorithm, and applying kernel functions to complex data
Edited at 2023-02-23 21:23:35
SVM self-study SMO efficient optimization algorithm mind map
Separating data with the maximum margin
Advantages
Low generalization error
Computationally inexpensive
Results are easy to interpret
Disadvantages
Sensitive to tuning parameters and to the choice of kernel function
Without modification, the basic classifier handles only two-class problems
Applicable data types
Numeric
Nominal
Linearly separable
A straight line can be drawn that separates the two sets of data points
Separating hyperplane
The decision boundary used for classification
Margin
Distance from a point to the separating surface
Margin of a point: its distance to the separating surface
Margin of the classifier (data set): twice the minimum distance from any point in the data set to the separating surface
Support vectors
The points closest to the separating hyperplane
Maximize the distance from the support vectors to the separating surface
Finding the maximum margin
Distance from a point to the separating hyperplane
b: a constant offset, analogous to the intercept w0 in logistic regression
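The point-to-hyperplane distance mentioned above is usually written |w^T x + b| / ||w||. The mind map's own formula did not survive, so the following is a minimal sketch under that standard definition; the function name `distance_to_hyperplane` and the sample numbers are illustrative assumptions.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Return |w^T x + b| / ||w|| for a point x (standard geometric distance)."""
    dot = sum(wi * xi for wi, xi in zip(w, x))      # w^T x
    norm = math.sqrt(sum(wi * wi for wi in w))      # ||w||
    return abs(dot + b) / norm

# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0 and the point (3, 4)
d = distance_to_hyperplane([3.0, 4.0], -5.0, [3.0, 4.0])
print(d)
```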
The optimization problem solved by the classifier
Applying a function similar to the unit step function to w^T x + b gives f(w^T x + b)
f(u) outputs -1 when u < 0, otherwise 1
Using -1 and +1 as class labels simplifies the math
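As a small illustration of the decision rule described above, the sketch below applies the step-like function f to w^T x + b. The names `f` and `predict` and the example weights are assumptions chosen for the example, not taken from the source.

```python
def f(u):
    """Step-like function: -1 when u < 0, otherwise 1."""
    return -1 if u < 0 else 1

def predict(w, b, x):
    """Classify x by applying f to w^T x + b."""
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical weights: the boundary is the line x1 = x2
print(predict([1.0, -1.0], 0.0, [2.0, 1.0]))   # point on the positive side
print(predict([1.0, -1.0], 0.0, [1.0, 2.0]))   # point on the negative side
```

Because the labels are -1 and +1, the product label * (w^T x + b) is positive exactly when a point is classified correctly, which is what makes this choice of labels convenient.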
Margin calculation
The constraint is that label * (w^T x + b) >= 1 for every point
Find the data points with the smallest margin
Then maximize that margin
Hard to solve directly
Find the optimal value subject to some constraints
Lagrange multiplier method
Constraints
Slack variables
The derivation so far assumes the data is 100% linearly separable
The constraints change to
The constant C controls the trade-off between two goals: maximizing the margin, and ensuring that most points have a functional margin of at least 1.0
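The formulas for this node were not preserved. For orientation, the soft-margin dual problem is commonly stated as below; this is the textbook form, given here as an assumption about what the original showed, with y_i denoting the class label of point x_i and m the number of points.

```latex
\max_{\alpha}\ \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
    y_i \, y_j \, \alpha_i \, \alpha_j \, \langle x_i, x_j \rangle
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{m} \alpha_i \, y_i = 0
```

Without slack variables the first constraint is only \alpha_i \ge 0; introducing C adds the upper bound.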
SVM general framework
Collect data
Any method
Prepare data
Numeric values are required
Analyze data
It helps to visualize the separating hyperplane
Train the algorithm
Most of the time is spent in training
Two parameters can be tuned during this phase
Test the algorithm
A very simple calculation
Use the algorithm
SVMs can be used for almost any classification problem
An SVM is a binary classifier and must be modified to handle more than two classes
SMO efficient optimization algorithm
Quadratic programming solvers
Software for optimizing a quadratic objective function of several variables subject to linear constraints
Disadvantages
Require substantial computing power
Very complex to implement
Platt’s SMO algorithm
Sequential Minimal Optimization
Idea
Break a large optimization problem into many small optimization problems
Advantage
Much shorter solution time
Goal
Find a set of alphas and b
Once the alphas are found, the weight vector w is easy to compute, which gives the separating hyperplane
How it works
Each cycle selects two alphas to optimize
Once a suitable pair of alphas is found, one is increased and the other is decreased
Suitable: the pair meets two conditions
Both alphas must lie outside their margin boundary
Neither alpha has already been clamped or bounded
Applying a simplified SMO algorithm to small data sets
Features
Skips the outer-loop step that determines the best alpha pair to optimize
Less code
Slower execution
Steps
Iterate over every alpha in the data set
Randomly select a second alpha from the remaining alphas to form a pair
Note
Both alphas must always be changed at the same time
Pseudocode
Create an alpha vector and initialize it to all zeros
While the number of iterations is less than the maximum number of iterations (outer loop)
  For each data vector in the data set (inner loop)
    If this data vector can be optimized
      Randomly select another data vector
      Optimize the two vectors together
      If the two vectors cannot be optimized, exit the inner loop
  If no vectors were optimized, increment the iteration count and continue with the next loop
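The pseudocode above can be sketched in NumPy roughly as follows. This is an illustrative sketch, not the author's code: it assumes a linear kernel, labels of -1 and +1, and hypothetical names (`smo_simple`, `tol`, `seed`). A pair that cannot be optimized is skipped with `continue`, and `passes` counts consecutive passes in which nothing changed.

```python
import numpy as np

def smo_simple(X, y, C, tol, max_iter, seed=0):
    """Simplified SMO sketch: random choice of the second alpha, linear kernel."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = X.shape[0]
    K = X @ X.T                        # Gram matrix of inner products
    alphas = np.zeros(m)
    b = 0.0
    passes = 0                         # consecutive passes with no update
    while passes < max_iter:
        changed = 0
        for i in range(m):
            Ei = (alphas * y) @ K[:, i] + b - y[i]     # prediction error on i
            # alpha_i can be optimized if it violates the KKT conditions by > tol
            if (y[i] * Ei < -tol and alphas[i] < C) or (y[i] * Ei > tol and alphas[i] > 0):
                j = int(rng.integers(m - 1))           # random j different from i
                if j >= i:
                    j += 1
                Ej = (alphas * y) @ K[:, j] + b - y[j]
                ai_old, aj_old = alphas[i], alphas[j]
                # Bounds that keep both alphas inside [0, C]
                if y[i] != y[j]:
                    L, H = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0.0, aj_old + ai_old - C), min(C, aj_old + ai_old)
                if L == H:
                    continue
                eta = 2.0 * K[i, j] - K[i, i] - K[j, j]
                if eta >= 0:
                    continue
                aj_new = min(H, max(L, aj_old - y[j] * (Ei - Ej) / eta))
                if abs(aj_new - aj_old) < 1e-5:
                    continue                           # step too small to matter
                # Change alpha_i so that sum(alphas * y) stays equal to 0
                ai_new = ai_old + y[i] * y[j] * (aj_old - aj_new)
                alphas[i], alphas[j] = ai_new, aj_new
                b1 = b - Ei - y[i] * (ai_new - ai_old) * K[i, i] - y[j] * (aj_new - aj_old) * K[i, j]
                b2 = b - Ej - y[i] * (ai_new - ai_old) * K[i, j] - y[j] * (aj_new - aj_old) * K[j, j]
                if 0 < ai_new < C:
                    b = b1
                elif 0 < aj_new < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alphas, b
```

A call such as `smo_simple(X, y, C=1.0, tol=1e-3, max_iter=40)` returns the alphas and b; the parameter values here are only examples. Because the second alpha is picked at random, results can vary with `seed`.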
Array filtering
Works only with NumPy types
alphas[alphas > 0]
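A short sketch of the filtering idiom, using made-up alpha values. The comparison builds a boolean mask, and indexing with the mask keeps only the matching entries; a plain Python list does not support this comparison.

```python
import numpy as np

alphas = np.array([0.0, 0.125, 0.0, 0.125])   # illustrative values

mask = alphas > 0                  # boolean array, one entry per alpha
support_alphas = alphas[mask]      # only the alphas greater than 0
support_indices = np.nonzero(mask)[0]   # positions of those alphas

print(support_alphas)
print(support_indices)
```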
Speeding up optimization with the full Platt SMO algorithm
Choosing the alpha values
First alpha
Alternate between two approaches
A single pass over the entire data set
A single pass over the non-bound alphas
Non-bound alphas
Alphas not equal to the bounds 0 or C
Process
Create a list of these alpha values
Traverse the list
Skip alpha values known not to change
Second alpha
Maximize the step size
Create a global cache that stores the error values
Choose the alpha that maximizes the step size, |Ei - Ej|
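Building the list of non-bound alphas can be sketched with a boolean mask. The helper name `non_bound_indices` and the sample values are assumptions for illustration.

```python
import numpy as np

def non_bound_indices(alphas, C):
    """Indices of alphas strictly between the bounds 0 and C."""
    alphas = np.asarray(alphas, dtype=float)
    return np.nonzero((alphas > 0) & (alphas < C))[0]

# With C = 1.0, only the entries 0.3 and 0.7 are non-bound
print(non_bound_indices([0.0, 0.3, 1.0, 0.7], C=1.0))
```

The outer loop would then iterate over these indices only, instead of over every data point.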
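The second-alpha heuristic can be sketched as follows. The error cache is modeled here as a dict from data index to cached error, which is a simplification chosen for the example; the function name and the random fallback for an empty cache are likewise assumptions.

```python
import random

def select_second_alpha(i, Ei, error_cache, m, rng=random):
    """Pick j != i that maximizes |Ei - Ej| among cached errors.

    error_cache maps index -> cached error value (hypothetical layout).
    Falls back to a random j when no other error is cached yet.
    """
    candidates = [k for k in error_cache if k != i]
    if not candidates:
        return rng.choice([k for k in range(m) if k != i])
    return max(candidates, key=lambda k: abs(Ei - error_cache[k]))

cache = {0: 0.5, 2: -1.5, 3: 0.2}      # illustrative cached errors
j = select_second_alpha(0, 0.5, cache, m=4)
print(j)
```

A larger |Ei - Ej| tends to give a larger change in the pair of alphas, which is why it is used as a cheap proxy for the step size.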
Classification
Compute the hyperplane from the alpha values, which includes computing w
In the end only the support vectors matter
Classify the data
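For a linear SVM, w is the sum of alpha_i * label_i * x_i over all points, so points with alpha equal to 0 contribute nothing. The sketch below assumes that standard relation; the function names and the toy values are illustrative.

```python
import numpy as np

def calc_w(alphas, y, X):
    """w = sum_i alpha_i * y_i * x_i; rows with alpha_i == 0 drop out."""
    return (np.asarray(alphas) * np.asarray(y)) @ np.asarray(X)

def classify(x, w, b):
    """Return 1 if w^T x + b >= 0, otherwise -1."""
    return 1 if float(np.dot(x, w)) + b >= 0 else -1

# Toy data: only the first and third points are support vectors
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
y = [1.0, 1.0, -1.0, -1.0]
alphas = [0.125, 0.0, 0.125, 0.0]
w = calc_w(alphas, y, X)
print(w, classify([1.0, 5.0], w, 0.0), classify([-4.0, 2.0], w, 0.0))
```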
Applying kernel functions to complex data
Use a kernel function to map data into a higher-dimensional space
Kernel function
Can be thought of as a wrapper or interface
Maps data from one feature space to another
SVM optimization
A useful property
All operations can be written as inner products
Kernel trick (kernel substitution)
Replace the inner product with a kernel function
Radial basis kernel function
Radial basis function
Takes a vector as its argument
Outputs a scalar based on the vector's distance
The Gaussian version of the radial basis kernel function is used here
Formula
User-defined parameter that determines the "reach", that is, how quickly the function value falls to 0
Using the kernel function in testing
The number of support vectors has an optimal value
Too few: a poor decision boundary
Too many: behaves almost like k-nearest neighbors
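The formula itself was not preserved here. The Gaussian radial basis kernel is commonly written k(x, y) = exp(-||x - y||^2 / (2σ^2)), with σ as the user-defined parameter; the sketch below assumes that form. Some implementations omit the factor of 2, which only rescales σ.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF: exp(-||x - y||^2 / (2 * sigma^2)), assumed standard form."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Identical vectors give 1.0; the value decays toward 0 as distance grows.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.3))
# A smaller sigma makes the value fall off faster for the same distance.
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=1.0))
```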
Example: revisiting the handwriting recognition problem
References
Multiclass classification
A Comparison of Methods for Multiclass Support Vector Machines
ν-SVM
Pattern Recognition
Note
The minimum training error rate does not coincide with the minimum number of support vectors
The linear kernel does not perform particularly badly