XGBoost
XGBoost stands for eXtreme Gradient Boosting.
XGBoost is a decision-tree-based ensemble Machine Learning algorithm that uses a gradient boosting framework: an implementation of gradient boosted decision trees designed for speed and performance. XGBoost has dominated applied machine learning and Kaggle competitions on structured or tabular data.
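Before digging into the mechanics, here is a minimal sketch of what using XGBoost looks like in practice, via the scikit-learn-style API of the xgboost Python package (the dataset and hyperparameter values below are illustrative choices, not prescriptions):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Load a small tabular dataset and split it for evaluation.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train a gradient boosted tree classifier with a few common hyperparameters.
model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

print(f"test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```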
The XGBoost algorithm was developed as a research project at the University of Washington. Tianqi Chen and Carlos Guestrin presented their paper at the SIGKDD Conference in 2016 and set the Machine Learning world on fire.
How does it work?
To understand XGBoost, we must first understand Gradient Descent and Gradient Boosting. Gradient Descent is an iterative optimization algorithm for minimizing a function of several variables. It can therefore be used to minimize the cost function, which measures how close the predicted values are to the corresponding actual values. The model first runs with initial weights, then the weights are updated over several iterations to drive the cost function down. Gradient Boosting carries the principle of Gradient Descent and Boosting over to supervised learning. Gradient Boosted Models (GBMs) are trees built sequentially, in series; the final prediction is a weighted sum of the predictions of the individual models.
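The idea can be made concrete with a toy, hand-rolled sketch (this is not XGBoost's actual implementation). Assuming a squared-error cost and using scikit-learn's DecisionTreeRegressor as the weak learner, each new tree is fit to the residuals of the current ensemble, which for squared error are exactly the negative gradients of the cost:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy gradient boosting with squared-error loss (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
pred = np.full_like(y, y.mean())       # start from a constant model
trees = []
for _ in range(100):
    residual = y - pred                # negative gradient of 0.5 * (y - pred)^2
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    pred += learning_rate * tree.predict(X)  # weighted sum of models

print(f"training MSE: {np.mean((y - pred) ** 2):.4f}")
```

Each iteration nudges the ensemble's predictions downhill on the cost, which is Gradient Descent applied in function space rather than to a fixed set of weights.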
- Each new model uses Gradient Descent to minimize the cost, correcting the errors of the models built before it.