Ensemble methods are techniques that create multiple models and then combine them to produce improved results. They usually produce more accurate solutions than a single model would.
The main causes of error in learning are noise, bias, and variance. Ensemble methods help minimize these factors and are designed to improve the stability and accuracy of Machine Learning algorithms.
The two main types of Ensemble methods are Bagging and Boosting.
In this blog, I will explain the difference between Bagging and Boosting ensemble methods.
Bagging (short for Bootstrap Aggregating) is a parallel ensemble method that decreases the variance of the prediction model by generating additional training data through random sampling with replacement from the original set. Because sampling is done with replacement, some observations may be repeated in each new training set, and every element has the same probability of appearing in a new dataset. Increasing the size of the training data in this way does not improve the model's predictive power; rather, it decreases the variance and narrowly tunes the prediction to an expected outcome.
The process for training an ensemble through bagging is as follows (a code sketch follows the list):
- Grab a sizable sample from your dataset, with replacement
- Train a classifier on this sample
- Repeat until all classifiers have been trained on their own sample from the dataset
- When making a prediction, have each classifier in the ensemble make a prediction
- Aggregate all predictions from all classifiers into a single prediction, using the method of your choice
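To make those steps concrete, here is a minimal sketch of the bagging loop in Python. The base learner (a scikit-learn decision tree), the toy dataset, and the ensemble size are illustrative assumptions, not anything prescribed above.

```python
# Minimal bagging sketch: bootstrap samples -> one classifier each -> majority vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)  # toy binary dataset

n_estimators = 10
rng = np.random.default_rng(0)
ensemble = []

for _ in range(n_estimators):
    # Step 1: draw a bootstrap sample (same size as the data, with replacement)
    idx = rng.integers(0, len(X), size=len(X))
    # Step 2: train a classifier on that sample
    clf = DecisionTreeClassifier().fit(X[idx], y[idx])
    ensemble.append(clf)

# Steps 4-5: let every classifier predict, then aggregate by majority vote
preds = np.stack([clf.predict(X) for clf in ensemble])   # shape: (n_estimators, n_samples)
majority_vote = (preds.mean(axis=0) >= 0.5).astype(int)  # works for binary labels {0, 1}
print("training accuracy of the bagged ensemble:", (majority_vote == y).mean())
```

In practice the aggregation method is a design choice: majority voting for classification, averaging for regression.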
Boosting is a sequential ensemble method that, in general, decreases the bias error and builds strong predictive models. The term 'Boosting' refers to a family of algorithms that convert weak learners into a strong learner. Boosting trains multiple learners one after another; the data samples are weighted, so the observations that earlier learners handle poorly carry more weight in later training rounds.
The process for training an ensemble through boosting is as follows:
- The base algorithm reads the data and assigns equal weight to each sample observation.
- Observations that the base learner misclassifies are identified. In the next iteration, these observations are given a higher weight so that the next base learner focuses on correcting them.
- Repeat step 2 until the algorithm can correctly classify the output or a maximum number of learners has been trained.
Therefore, the main aim of Boosting is to focus more on misclassified predictions.
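The reweighting loop can be sketched in a few lines. The sketch below follows the classic discrete AdaBoost recipe with decision stumps as weak learners; the dataset, number of rounds, and learner choice are illustrative assumptions rather than the only way to boost.

```python
# Minimal discrete AdaBoost sketch: reweight misclassified samples each round.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_signed = np.where(y == 1, 1, -1)         # easiest to write AdaBoost with labels in {-1, +1}

n_rounds = 20
weights = np.full(len(X), 1 / len(X))      # step 1: equal weight for every observation
learners, alphas = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y_signed, sample_weight=weights)
    pred = stump.predict(X)

    # step 2: identify misclassified observations and compute the weighted error
    miss = pred != y_signed
    err = np.clip(weights[miss].sum(), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)  # this learner's say in the final vote

    # step 3: increase the weight of misclassified samples so the next learner
    # concentrates on them, then renormalize
    weights = weights * np.exp(alpha * np.where(miss, 1.0, -1.0))
    weights /= weights.sum()

    learners.append(stump)
    alphas.append(alpha)

# final prediction: weighted vote over all weak learners
scores = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
print("training accuracy:", (np.sign(scores) == y_signed).mean())
```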
Bagging vs. Boosting
Which technique to use depends on the data available, the problem, and the circumstances at hand. Both bagging and boosting significantly reduce the variance of the estimate during the combination step, thereby increasing accuracy, so their results are more stable than those of the individual models.
If a single model underperforms because of high bias, bagging will rarely improve it. Boosting, on the other hand, can produce a combined model with lower error because each new learner concentrates on correcting the shortcomings of the previous ones.
When the challenge with a single model is overfitting, bagging performs better than boosting; boosting does not help here because it is itself prone to overfitting.
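In practice, both strategies are available off the shelf in scikit-learn. The short comparison below, using an arbitrary synthetic dataset and hyperparameters purely for illustration, is one way to try both and let cross-validation decide.

```python
# Compare a bagged and a boosted ensemble with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)

# Both ensembles use scikit-learn's default decision-tree base learners.
models = {
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean cross-validated accuracy {scores.mean():.3f}")
```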