Assume you need to generate a predictive model using multiple regression. Explain how you intend to validate this model.

1 Answer

There are two main ways that you can do this:

A) Adjusted R-squared

R-squared measures the proportion of the variance in the dependent variable that is explained by the independent variables. In simpler terms, while the coefficients estimate trends, R-squared describes how tightly the data scatter around the line of best fit.
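To make the definition concrete, here is a minimal sketch that computes R-squared as 1 - SS_res / SS_tot from observed and predicted values. The function name `r_squared` and the sample numbers are made up for illustration and do not come from any particular library.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative data only (hypothetical observations and model predictions).
y_obs = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 6.5, 9.5]
r2 = r_squared(y_obs, y_hat)
print(round(r2, 4))
```

A value close to 1 means the predictions sit close to the observations; a value close to 0 means the model does little better than always predicting the mean.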

However, adding an independent variable to a model never decreases the R-squared value, and in practice almost always increases it. A model with many independent variables may therefore seem to be a better fit even if it isn't. This is where adjusted R² comes in. Adjusted R² penalizes each additional independent variable and increases only if the new variable improves the model more than would be expected by chance. This is important since we are creating a multiple regression model.
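The penalty can be written using the standard formula: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of independent variables. Below is a small sketch of it. The function name and the example numbers are hypothetical, chosen only to show a case where a slightly higher R² still gives a lower adjusted R².

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical comparison: 7 extra predictors raise R^2 only from 0.80 to 0.81.
small = adjusted_r_squared(0.80, n_obs=50, n_predictors=3)
large = adjusted_r_squared(0.81, n_obs=50, n_predictors=10)
print(small > large)  # the larger model is penalized for its extra variables
```

In this made-up case the model with 10 predictors has the higher plain R² but the lower adjusted R², which is the situation the adjustment is meant to flag.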

B) Cross-Validation

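A more widely known method is cross-validation. In its simplest form, the data is split into two sets, training and testing data: the model is fitted on the training set and evaluated on the testing set. In k-fold cross-validation, this split is repeated k times so that every observation is used for testing exactly once. See the answer to the first question for more on this.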
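As a rough sketch of how this could look for a multiple regression, the code below shuffles the rows, splits them into k folds, fits ordinary least squares on k - 1 folds, and measures the error (RMSE) on the held-out fold. It assumes NumPy is available; the function name `k_fold_rmse`, the synthetic data, and the choice of RMSE as the score are all assumptions made for illustration.

```python
import numpy as np


def k_fold_rmse(X, y, k=5, seed=0):
    """Return the held-out RMSE of an OLS fit for each of k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))      # shuffle rows before splitting
    folds = np.array_split(indices, k)     # k roughly equal index groups
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Prepend a column of ones so the fit includes an intercept.
        X_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        X_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(X_train, y[train_idx], rcond=None)
        residuals = y[test_idx] - X_test @ coef
        scores.append(float(np.sqrt(np.mean(residuals ** 2))))
    return scores


# Synthetic data (hypothetical): three predictors plus a little noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

scores = k_fold_rmse(X, y, k=5)
print(scores, sum(scores) / len(scores))
```

If the held-out errors are similar across folds and close to the training error, the model is probably generalizing reasonably well; a large gap suggests overfitting.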

...