How can Clustering (Unsupervised Learning) be used to improve the accuracy of the Linear Regression model (Supervised Learning)?

1. Creating different models for different cluster groups.

2. Creating an input feature for cluster ids as an ordinal variable.

3. Creating an input feature for cluster centroids as a continuous variable.

4. Creating an input feature for cluster size as a continuous variable.

Options:

A. 1 only

B. 1 and 2

C. 1 and 4

D. 3 only

E. 2 and 4

F. All of the above

1 Answer


Answer: (F)

For multidimensional data, an input feature that stores cluster ids as an ordinal variable, or cluster centroids as a continuous variable, might not convey any relevant information to the regression model. When the clustering is done on a single dimension, however, all four of the given methods are expected to convey meaningful information, because the clusters then have a natural order along that one axis. For example, if people are clustered into two groups based on their hair length, storing the cluster ids as an ordinal variable and the cluster centroids as a continuous variable will both convey meaningful information.
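
As a rough illustration of the four methods in the single-dimension case, here is a minimal sketch using only the Python standard library. The hair-length values, the target values, and the helper names (`nearest`, `kmeans_1d`, `fit_line`, `sse`) are all made up for this example and are not part of the original question; a real project would more likely use a library such as scikit-learn.

```python
from statistics import mean

def nearest(v, centroids):
    """Index of the centroid closest to value v."""
    return min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))

def kmeans_1d(values, k=2, iters=50):
    """Tiny 1-D k-means (k >= 2). Centroids start spread over the sorted data."""
    srt = sorted(values)
    centroids = [srt[i * (len(srt) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:  # converged
            break
        centroids = updated
    labels = [nearest(v, centroids) for v in values]
    return labels, centroids

def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def sse(xs, ys, slope, intercept):
    """Sum of squared errors of the line on (xs, ys)."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: hair length in cm and a made-up target whose
# relationship to hair length differs between the two groups.
hair_cm = [2.0, 3.0, 4.0, 5.0, 6.0, 30.0, 32.0, 35.0, 38.0, 40.0]
target = [5.0, 7.0, 9.0, 11.0, 13.0, 70.0, 68.0, 65.0, 62.0, 60.0]

labels, centroids = kmeans_1d(hair_cm, k=2)
sizes = [labels.count(c) for c in range(len(centroids))]

# Methods 2, 3 and 4: cluster id, cluster centroid and cluster size
# appended to each row as extra input features.
features = [(x, lab, centroids[lab], sizes[lab])
            for x, lab in zip(hair_cm, labels)]

# Baseline: one regression line for all rows.
sse_global = sse(hair_cm, target, *fit_line(hair_cm, target))

# Method 1: a separate regression line per cluster.
sse_split = 0.0
for c in range(len(centroids)):
    xs = [x for x, lab in zip(hair_cm, labels) if lab == c]
    ys = [y for y, lab in zip(target, labels) if lab == c]
    sse_split += sse(xs, ys, *fit_line(xs, ys))

print("features:", features[0], features[-1])
print("global SSE:", sse_global, "per-cluster SSE:", sse_split)
```

On this toy data the per-cluster models fit much better than the single global line, because the target was deliberately constructed to follow a different line in each group. Whether the extra features help on real data depends on the data and should be checked with proper validation.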
