Learning rate, also known as shrinkage (the `eta` or `learning_rate` parameter in XGBoost), scales the contribution of each tree to the final prediction. This mechanism balances learning speed against the risk of overfitting: smaller steps learn more slowly but usually generalize better.
Mechanism
An XGBoost prediction is the sum of the outputs of all trees plus a base score. The learning rate scales the contribution of each new tree before it is added, leaving room for later trees to correct the remaining errors; with a smaller rate, the model typically needs more trees to reach the same training loss.
\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta \, f_t(x_i), \qquad \eta \in (0, 1]

where \hat{y}_i^{(t)} is the prediction for sample i after t trees, f_t is the t-th tree, and \eta is the learning rate.
Because each tree's output is multiplied by \eta before being added, no single tree can dominate the final prediction.
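A minimal sketch of this mechanism may help. The stub trees, the base score of 0.5, and the learning rates below are illustrative assumptions, not XGBoost internals; the point is only that every tree's vote is damped by the same factor.

```python
# A minimal sketch of shrinkage in additive boosting (illustrative only,
# not XGBoost's actual implementation). Each "tree" is stubbed as a
# function returning a raw prediction for one sample.
def boosted_prediction(trees, x, learning_rate=0.1, base_score=0.5):
    """Sum the trees' outputs, scaling each one by the learning rate."""
    pred = base_score
    for tree in trees:
        pred += learning_rate * tree(x)  # shrinkage: damp each tree's vote
    return pred

# Three stub trees that each nudge the prediction up or down.
trees = [lambda x: 0.8, lambda x: -0.3, lambda x: 0.5]
print(boosted_prediction(trees, x=None, learning_rate=0.1))  # 0.5 + 0.1 * 1.0 = 0.6
print(boosted_prediction(trees, x=None, learning_rate=1.0))  # 0.5 + 1.0 * 1.0 = 1.5
```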
Core Functions
Regularization: Shrinkage acts as an implicit regularizer in its own right: damping every tree's contribution discourages the ensemble from fitting noise in any single round. It complements XGBoost's explicit L1 (alpha) and L2 (lambda) penalties rather than changing their strength.
Effect on Overfitting: A lower rate slows how quickly the model fits the training data, giving early stopping more opportunity to halt before the model memorizes noise, which helps mitigate overfitting.
Speed and Convergence: A higher learning rate accelerates training but can overshoot, causing the validation loss to oscillate or stall. A lower rate gives steadier convergence at the cost of more boosting rounds, as the sketch after this list illustrates.
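The sketch below compares a high and a low learning rate under early stopping; the lower rate typically needs many more rounds before the validation error plateaus. The synthetic data, split, and parameter values are assumptions for illustration, and exact round counts will vary.

```python
# Sketch: compare a high and a low learning rate under early stopping.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 10))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=1000)

dtrain = xgb.DMatrix(X[:800], label=y[:800])
dval = xgb.DMatrix(X[800:], label=y[800:])

for eta in (0.3, 0.03):
    params = {"objective": "reg:squarederror", "eta": eta, "max_depth": 3}
    booster = xgb.train(
        params,
        dtrain,
        num_boost_round=2000,
        evals=[(dval, "val")],
        early_stopping_rounds=20,  # stop once validation RMSE plateaus
        verbose_eval=False,
    )
    print(f"eta={eta}: best iteration {booster.best_iteration}")
```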
Practical Tuning Tips
Grid Search: Search over a range of learning rates (for example, 0.01 up to XGBoost's default of 0.3) together with the number of trees, since the two interact; see the combined example after this list.
Cross-Validation: Evaluate learning rates along with other hyperparameters under cross-validation so the chosen value generalizes beyond a single train/validation split.
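A combined sketch of both tips, assuming scikit-learn and XGBoost's scikit-learn wrapper; the grid values and synthetic dataset are placeholders for your own.

```python
# Sketch: tune learning_rate jointly with the number of trees via
# cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "n_estimators": [100, 300, 500],
}
search = GridSearchCV(
    XGBClassifier(max_depth=3, eval_metric="logloss"),
    param_grid,
    cv=5,                # 5-fold cross-validation per combination
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_)  # the rate/tree-count pair with the best CV accuracy
```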