MindMap Gallery Ensemble learning
This map covers the two major categories of ensemble algorithms, bagging and boosting, introducing each in detail. I hope it is helpful to interested readers!
Edited at 2023-12-23 14:09:40
Ensemble Learning (Part 1)
Introduction
Idea: build and combine multiple weak learners to complete a learning task
Two issues to consider in ensemble learning
How to train a single weak learner?
Method 1: Change the weight of the training data set
Method 2: Change the probability distribution of the training data set
How to combine weak learners into strong learners?
Method 1: Parallel voting method
Method 2: Serial weighting method
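The two combination schemes can be sketched in code. This is a minimal illustration (function names are ours), assuming label predictions in {0, 1} for voting and in {-1, +1} for the weighted scheme:

```python
import numpy as np

def parallel_vote(predictions):
    """Parallel voting (bagging-style): the mode of all weak-learner outputs."""
    predictions = np.asarray(predictions)  # shape: (n_learners, n_samples)
    # For each sample, pick the most frequent label across learners.
    return np.array([np.bincount(col).argmax() for col in predictions.T])

def serial_weighted(predictions, alphas):
    """Serial weighting (boosting-style): sign of the alpha-weighted sum
    of {-1, +1} predictions."""
    predictions = np.asarray(predictions, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    return np.sign(alphas @ predictions)

preds = [[0, 1, 1], [1, 1, 0], [0, 1, 0]]
print(parallel_vote(preds))  # -> [0 1 0]
```

Note that voting treats all learners as equals, while the weighted scheme lets more accurate learners contribute more.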
Two major categories of ensemble learning
Bagging: base learners have no strong dependencies on one another; a parallel method whose learners can be generated simultaneously
representative algorithm
random forest
Algorithm idea: use decision trees as weak learners and combine them via bagging
How is random forest random? (by changing the probability distribution of the data set)
Method 1: Forest-RI
Each time a training set is built, randomly draw k samples (with replacement) from data set D, and randomly select n of the M features.
Method 2: Forest-RC
Each time a training set is constructed, randomly select n of the M features of data set D and combine them linearly to form F new features. (Weight coefficients are random numbers in [-1, 1].)
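Both randomization schemes can be sketched as follows; the sample count k, feature count n, and new-feature count F here are illustrative choices, not values fixed by the algorithms:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))  # data set D: 100 samples, M = 8 features

def forest_ri(X, k, n):
    """Forest-RI: draw k samples (with replacement) and n of the M features."""
    rows = rng.integers(0, X.shape[0], size=k)            # bootstrap sample
    cols = rng.choice(X.shape[1], size=n, replace=False)  # feature subset
    return X[np.ix_(rows, cols)]

def forest_rc(X, n, F):
    """Forest-RC: build F new features, each a random linear combination of
    n original features with weights drawn uniformly from [-1, 1]."""
    new_feats = []
    for _ in range(F):
        cols = rng.choice(X.shape[1], size=n, replace=False)
        w = rng.uniform(-1, 1, size=n)
        new_feats.append(X[:, cols] @ w)
    return np.stack(new_feats, axis=1)

print(forest_ri(X, k=50, n=3).shape)  # (50, 3)
print(forest_rc(X, n=2, F=5).shape)   # (100, 5)
```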
Algorithm steps
Step 1: Choose a weak learner (decision tree, KNN, logistic regression, etc.)
Step 2: Construct a training set based on randomness
Forest-RI
Forest-RC
Step 3: Train the current weak learner
Step 4: Determine whether the strong learner is qualified based on the voting mechanism
Voting mechanism: the mode of all weak learner results
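The four steps can be sketched end to end. As a minimal stand-in for a full decision tree, the weak learner here is a one-feature decision stump; the bootstrap sampling and majority vote follow Steps 2-4:

```python
import numpy as np

rng = np.random.default_rng(42)

def fit_stump(X, y):
    """Pick the (feature, threshold, labels) with the lowest training error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            for ll, rl in ((0, 1), (1, 0)):
                err = np.sum(left != ll) + np.sum(right != rl)
                if err < best_err:
                    best_err, best = err, (j, t, ll, rl)
    return best

def predict_stump(stump, X):
    j, t, ll, rl = stump
    return np.where(X[:, j] <= t, ll, rl)

# Toy data: label depends on the first two features.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Steps 2-3: bootstrap a training set per round and fit each weak learner.
stumps = []
for _ in range(15):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample
    stumps.append(fit_stump(X[idx], y[idx]))

# Step 4: combine by majority vote (the mode of all weak-learner outputs).
votes = np.array([predict_stump(s, X) for s in stumps])
forest_pred = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy:", (forest_pred == y).mean())
```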
Advantages and Disadvantages
Advantages
Trees are trained independently of one another, so training is fast and parallelizable.
The out-of-bag (OOB) error gives an unbiased estimate of the generalization error, and the model generalizes well.
Bootstrap sampling leaves out-of-bag data, so no separate cross-validation set is needed.
Accuracy remains high even on imbalanced data sets or data with missing values.
Disadvantages
Random forests can overfit on noisy classification or regression problems.
Random forests have many hyperparameters, which makes tuning difficult.
Optimization
For the many hard-to-tune parameters:
Become familiar with each parameter first, then tune via grid search.
Illustration: influence of each parameter on the model
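Grid search itself is simple to sketch: enumerate parameter combinations and keep the best-scoring one. Here `evaluate` is a hypothetical placeholder for training and validating a forest with the given settings, and the parameter names and values are illustrative:

```python
from itertools import product

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [4, 8, None],
}

def evaluate(params):
    # Placeholder score; in practice, train a forest with `params`
    # and return its validation accuracy.
    return -abs(params["n_estimators"] - 100) / 100.0

best_params, best_score = None, float("-inf")
for combo in product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), combo))
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params)  # {'n_estimators': 100, 'max_depth': 4}
```

The grid grows multiplicatively with each parameter, which is why understanding the parameters first (to prune the grid) matters.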
Boosting: base learners have strong dependencies on one another; a sequential method whose learners must be generated serially
representative algorithm
AdaBoost
Algorithm idea: train one weak learner per round; adjust the training-sample weights based on the previous round's results and use the reweighted samples as the next round's training data; finally, combine all weak learners into an ensemble model through linear weighting.
Algorithm steps
Step 1: Choose a weak learner (decision tree, KNN, logistic regression, etc.)
Step 2: Initialize or update sample weights
Initialize sample weights, that is, each sample has the same weight
Update sample weights, that is, reduce the weight of correctly classified samples and increase the weight of incorrectly classified samples.
Step 3: Train the current weak learner
Step 4: Calculate the weight of the current weak learner
Step 1: Calculate the current weak learner's error rate (the weighted proportion of misclassified samples)
Step 2: Calculate the weight of the current weak learner based on the error rate
Step 5: Add the current weak learner to the linear model and determine whether it is qualified
linear model
How to judge qualification?
The strong learner's accuracy
The number of weak learners in the strong learner
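The steps above can be condensed into a compact sketch. The weak learner is a weighted decision stump standing in for any base classifier, and labels are assumed to be in {-1, +1}:

```python
import numpy as np

def fit_weighted_stump(X, y, w):
    """Best (feature, threshold, polarity) under the sample weights w."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= t, -sign, sign)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best_err, best = err, (j, t, sign)
    return best, best_err

def stump_predict(stump, X):
    j, t, sign = stump
    return np.where(X[:, j] <= t, -sign, sign)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

w = np.full(len(X), 1 / len(X))  # Step 2: uniform initial weights
stumps, alphas = [], []
for _ in range(20):
    stump, err = fit_weighted_stump(X, y, w)  # Step 3: train weak learner
    err = max(err, 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)     # Step 4: learner weight
    pred = stump_predict(stump, X)
    w *= np.exp(-alpha * y * pred)            # Step 2 (next round): reweight
    w /= w.sum()                              # keep weights a distribution
    stumps.append(stump)
    alphas.append(alpha)

# Step 5: linear combination of all weak learners
F = sum(a * stump_predict(s, X) for a, s in zip(alphas, stumps))
print("training accuracy:", (np.sign(F) == y).mean())
```

The weight update shrinks correctly classified samples (y·pred = +1) and grows misclassified ones (y·pred = -1), exactly as Step 2 describes.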
Advantages and Disadvantages
Advantages
AdaBoost has high accuracy
AdaBoost can use different classification algorithms as weak classifiers and is not limited to decision trees.
Disadvantages
Parameter training is time-consuming
Imbalanced data can easily degrade accuracy
The number of weak classifiers is hard to determine
Optimization
For time-consuming training: use the forward stagewise algorithm to speed up parameter optimization
For the hard-to-determine number of classifiers: use cross-validation to help decide
GBDT (Gradient Boosting Decision Tree)
Boosting trees
Regression boosting tree: a simple additive combination of weak regressors
Classification boosting tree: a simple additive combination of weak classifiers
Gradient boosting tree: unifies classification and regression boosting trees
Algorithm idea: use CART regression trees as weak learners; construct each new weak learner from the loss (negative gradient) of the current ensemble; finally, add all weak learners linearly.
Algorithm steps
Step 1: Choose the weak learner (a CART regression tree)
Step 2: Construct the training set by computing the negative gradient of the loss function at the current model (i.e., fitting the residuals), together with random subsampling of the samples and features of data set D
Step 3: Train the current weak learner
Step 4: Add the current weak learner to the linear model and determine whether it is qualified
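The steps can be sketched for regression with squared loss, where the negative gradient is simply the residual y - F(x). The weak learner is a one-split regression stump standing in for a CART regression tree, and feature/sample subsampling is omitted for brevity:

```python
import numpy as np

def fit_reg_stump(X, r):
    """Best single split minimizing squared error against the residuals r."""
    best, best_sse = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = r[X[:, j] <= t], r[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            sse = ((left - left.mean()) ** 2).sum() \
                + ((right - right.mean()) ** 2).sum()
            if sse < best_sse:
                best_sse, best = sse, (j, t, left.mean(), right.mean())
    return best

def stump_predict(stump, X):
    j, t, left_val, right_val = stump
    return np.where(X[:, j] <= t, left_val, right_val)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])

lr = 0.1                              # shrinkage (learning rate)
F = np.zeros_like(y)                  # initial model
for _ in range(100):
    residual = y - F                  # Step 2: negative gradient of squared loss
    stump = fit_reg_stump(X, residual)  # Step 3: fit the weak learner
    F += lr * stump_predict(stump, X)   # Step 4: add to the linear model

print("training MSE:", np.mean((y - F) ** 2))
```

Each round fits whatever error the current ensemble still makes, which is why the weak learners depend on one another and must be trained serially.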
Advantages and Disadvantages
Advantages
Suitable for low-dimensional data and can handle nonlinear data
With robust loss functions, it is very robust to outliers
Combining the strengths of bagging and boosting, it generally achieves higher accuracy than random forest and AdaBoost
Disadvantages
Dependencies between the weak learners make parallel training difficult
Higher data dimensionality increases the algorithm's computational cost
Since the weak learner is a regressor, it cannot be used for classification directly
Optimization
Achieve partial parallelism through subsampling (stochastic gradient boosting, SGBT)
XGBoost: an efficient implementation of GBDT that adds a regularization term and fits the loss function with a second-order Taylor expansion
LightGBM: a further efficiency improvement in the XGBoost lineage; it discretizes continuous floating-point features into k discrete values and builds histograms of width k, speeding up computation and saving memory
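The histogram trick can be sketched in a few lines; equal-width binning is one simple choice (LightGBM's actual binning is more sophisticated). Split search then scans k bin boundaries instead of every unique feature value:

```python
import numpy as np

def build_histogram(x, k):
    """Map each value of a continuous feature to one of k equal-width bins
    and count bin occupancy."""
    edges = np.linspace(x.min(), x.max(), k + 1)
    # digitize against the k-1 interior edges yields bin ids in 0..k-1
    bins = np.clip(np.digitize(x, edges[1:-1]), 0, k - 1)
    counts = np.bincount(bins, minlength=k)
    return bins, counts, edges

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
bins, counts, edges = build_histogram(x, k=32)
print(len(counts), counts.sum())  # 32 10000
```

With 10,000 distinct values reduced to 32 bins, the split search per feature drops from thousands of candidate thresholds to 31, at the cost of a coarser threshold grid.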