Early Stopping in XGBoost helps prevent overfitting and makes training more efficient by stopping iterations when performance on a validation dataset doesn't improve. This is achieved by monitoring a metric like AUC, logloss, or error.
Implementation
Define Parameters: Set the parameters for early stopping:
eval_metric: Metric to evaluate on the validation set.
eval_set: Data to use for evaluation. This should be a list of tuples, with each tuple in the format (X, y). You can have multiple tuples to evaluate on multiple datasets.
early_stopping_rounds: The number of rounds with no improvement after which training will stop.
Training the XGBoost Model: Train the model with the defined parameters using the native xgb.train(...) API or the scikit-learn wrapper xgb.XGBClassifier and its fit(...) method.
from xgboost import XGBClassifier
# eval_metric and early_stopping_rounds are set on the estimator;
# eval_set is passed to fit(), not to the constructor.
model = XGBClassifier(n_estimators=100, eval_metric='logloss', early_stopping_rounds=5)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
Monitoring: During training, XGBoost evaluates performance on the validation set after every boosting round. If the monitored metric has not improved for early_stopping_rounds consecutive rounds, training stops, and the best-scoring iteration is recorded.