Build a decision tree classification model where the dependent variable is “Species” and the independent variable is “Sepal.Length”.
Solution:
y = iris[['Species']]
x = iris[['Sepal.Length']]
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.4)
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(x_train,y_train)
y_pred=dtc.predict(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)
# Accuracy = correctly classified samples (the diagonal) / all test samples
(22+7+9)/(22+2+0+7+7+11+1+1+9)  # 38/60, roughly 0.63
Code explanation:
We start by extracting the dependent variable (y) and the independent variable (x) from the iris DataFrame:
y = iris[['Species']]
x = iris[['Sepal.Length']]
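The two lines above assume that `iris` already exists as a pandas DataFrame with R-style column names. The original does not show how it was loaded, so the following is only one possible sketch: it builds such a DataFrame from scikit-learn's bundled copy of the dataset and renames the columns to match. The renaming map and the `bunch` variable are assumptions made for illustration.

```python
from sklearn.datasets import load_iris

# Load the bundled iris data as a pandas DataFrame (assumption: the
# original `iris` had the same content under R-style column names).
bunch = load_iris(as_frame=True)
iris = bunch.frame.rename(columns={
    "sepal length (cm)": "Sepal.Length",
    "sepal width (cm)": "Sepal.Width",
    "petal length (cm)": "Petal.Length",
    "petal width (cm)": "Petal.Width",
})

# Map the integer target codes to species names and drop the code column.
iris["Species"] = bunch.target_names[iris["target"].to_numpy()]
iris = iris.drop(columns="target")

y = iris[["Species"]]
x = iris[["Sepal.Length"]]
print(x.shape, y.shape)
```

Double brackets keep `x` two-dimensional, which is the shape scikit-learn estimators expect for the feature matrix.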
Then, we divide the data into training and test sets, holding out 40% of the rows for testing:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.4)
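Because the split above is random, every run produces a different test set and therefore a different confusion matrix. As a hedged variation (not part of the original solution), fixing `random_state` makes the split repeatable, and `stratify` keeps the three species equally represented in both parts. The data loading here uses scikit-learn's bundled iris copy as a stand-in for the `iris` DataFrame, and the seed value 0 is an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Stand-ins for iris[['Sepal.Length']] and iris[['Species']] (assumption).
data = load_iris(as_frame=True)
x = data.data[["sepal length (cm)"]]
y = data.target

# random_state fixes the shuffle; stratify=y preserves class proportions.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=0, stratify=y
)

print(len(x_train), len(x_test))
print(y_test.value_counts().to_dict())
```

With 150 rows and `test_size=0.4`, the test set holds 60 rows, which matches the 60 samples counted in the accuracy calculation.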
After that, we fit the decision tree on the training data and predict the species for the test set:
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(x_train,y_train)
y_pred=dtc.predict(x_test)
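Finally, we build the confusion matrix and compute the accuracy from it, dividing the sum of the diagonal (the correct predictions) by the total number of test samples: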
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)
(22+7+9)/(22+2+0+7+7+11+1+1+9)
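The arithmetic above can also be written out step by step. This sketch assumes the nine terms in the denominator are the confusion matrix read row by row, which is consistent with 22, 7 and 9 being its diagonal. The variable names are illustrative, and the numbers belong to that one run only.

```python
# Confusion matrix reconstructed from the terms of the accuracy expression
# (assumption: terms are listed row by row; rows = true class, columns = predicted).
cm = [
    [22, 2, 0],
    [7, 7, 11],
    [1, 1, 9],
]

# Correct predictions sit on the diagonal.
correct = sum(cm[i][i] for i in range(len(cm)))
total = sum(sum(row) for row in cm)
accuracy = correct / total

# Per-class recall: diagonal entry divided by its row total.
recalls = [round(cm[i][i] / sum(cm[i]), 2) for i in range(len(cm))]

print(correct, total, round(accuracy, 3))  # 38 60 0.633
print(recalls)  # [0.92, 0.28, 0.82]
```

The per-class figures suggest that most of the errors come from the second class, which is plausible when a single feature such as sepal length is the only input.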