The correct answer to the question above is: a) Initialize -> Train -> Predict -> Evaluate
In Unstructured Data Classification, a classifier is built in the following order: Initialize -> Train -> Predict -> Evaluate
Creating the Experiment
Drag and drop the Adult Census Income Binary Classification dataset module into your experiment's workspace.
Add a Clean Missing Data module and keep the default settings, which replace missing values with zeros. Connect the output of the dataset module to its input port.
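Outside the Studio, the effect of this step can be approximated in a few lines of Python. This is only a rough sketch, not the module's implementation: it assumes rows are held as dictionaries and that a missing value is represented by None. The sample values are made up for illustration.

```python
def clean_missing_data(rows, replacement=0):
    """Return a copy of rows with every missing (None) value replaced."""
    return [
        {col: (replacement if value is None else value) for col, value in row.items()}
        for row in rows
    ]


# Illustrative rows only; None marks a missing value (an assumption of this sketch).
raw_rows = [
    {"age": 39, "hours-per-week": 40, "capital-gain": None},
    {"age": None, "hours-per-week": 13, "capital-gain": 0},
]

cleaned_rows = clean_missing_data(raw_rows)
print(cleaned_rows)
```

The function builds new dictionaries rather than editing in place, so the original rows remain available for comparison.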
Add a Project Columns module, and connect the output of Clean Missing Data module to the input port.
Use the column selector to exclude these columns: workclass, occupation, and native-country. We are excluding these columns because we don't want their values to be used in the training process. By default, Azure ML Studio treats all columns as features except for the target variable (the Label column). Alternatively, you could use the Metadata Editor module, select the excluded columns, and then choose ClearFeatures from the Fields dropdown list.
Select Columns
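The column exclusion described above amounts to dropping three keys from every row. A minimal Python sketch follows, assuming rows are dictionaries keyed by column name; the helper name exclude_columns and the sample row are hypothetical.

```python
# Columns the tutorial excludes from training.
EXCLUDED_COLUMNS = {"workclass", "occupation", "native-country"}


def exclude_columns(rows, excluded=EXCLUDED_COLUMNS):
    """Return copies of rows without the excluded columns."""
    return [
        {col: value for col, value in row.items() if col not in excluded}
        for row in rows
    ]


# Illustrative row only, not taken from the actual dataset.
sample_rows = [
    {
        "age": 39,
        "workclass": "Private",
        "occupation": "Sales",
        "native-country": "United-States",
        "income": "<=50K",
    }
]

projected_rows = exclude_columns(sample_rows)
print(projected_rows)
```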
Add a Split module to create the training and test sets. Set the Fraction of rows in the first output dataset to 0.7. This means that 70% of the data will be output to the left port and the remaining 30% to the right port of this module. We will use the left dataset for training and the right one for testing.
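For readers who want to see what a 70/30 split does to the rows, here is a small Python sketch. It assumes a randomized split with a fixed seed; the function name split_rows and the seed parameter are choices made for this example, not settings of the Split module.

```python
import random


def split_rows(rows, fraction=0.7, seed=0):
    """Shuffle rows and return (first_part, second_part).

    first_part holds `fraction` of the rows (the "left port"),
    second_part holds the rest (the "right port").
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]


all_rows = list(range(10))  # stand-in for ten dataset rows
train_rows, test_rows = split_rows(all_rows, fraction=0.7)
print(len(train_rows), len(test_rows))
```

Fixing the seed keeps the split reproducible, which makes it easier to compare results between runs.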
Add a Two-Class Boosted Decision Tree module to initialize a boosted decision tree classifier.
Add a Train Model module and connect the untrained classifier from the previous step and the training set (left output port of the Split module) to its left and right input ports respectively. In the module's properties, select income as the label column. This module will perform the training of the classifier.
Add a Score Model module and connect the trained model and the test set (right port of the Split module). This module will make the predictions. You can click on its output port to see the actual predictions and the positive class probabilities.
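To make the train and score steps more concrete, the sketch below trains a small boosted ensemble and scores new rows with it. It is a stand-in, not the algorithm behind the Two-Class Boosted Decision Tree module: it uses AdaBoost over one-level decision trees (stumps), labels are assumed to be -1 and +1, and the data is invented for illustration.

```python
import math


def stump_predict(stump, x):
    """A stump votes `polarity` when the feature exceeds the threshold."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity


def train_boosted_stumps(X, y, rounds=10):
    """AdaBoost over decision stumps. Labels in y must be -1 or +1."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []  # list of (alpha, stump) pairs
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feature in range(len(X[0])):
            for threshold in sorted({x[feature] for x in X}):
                for polarity in (1, -1):
                    stump = (feature, threshold, polarity)
                    err = sum(
                        w for w, x, t in zip(weights, X, y)
                        if stump_predict(stump, x) != t
                    )
                    if best is None or err < best[0]:
                        best = (err, stump)
        err, stump = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified rows, then renormalize.
        weights = [
            w * math.exp(-alpha * t * stump_predict(stump, x))
            for w, x, t in zip(weights, X, y)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def score(model, x):
    """Return (predicted label, estimated probability of the positive class)."""
    margin = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    label = 1 if margin > 0 else -1
    probability = 1.0 / (1.0 + math.exp(-2.0 * margin))
    return label, probability


# Invented rows: [age, hours-per-week]; +1 stands for the positive class.
X_train = [[25, 20], [30, 40], [45, 50], [50, 60], [23, 35], [52, 45]]
y_train = [-1, -1, 1, 1, -1, 1]

model = train_boosted_stumps(X_train, y_train, rounds=10)
high_label, high_prob = score(model, [48, 55])
low_label, low_prob = score(model, [24, 30])
print(high_label, low_label)
```

Like the Score Model output, score returns both a predicted label and a positive class probability, so the two can be inspected side by side.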
Add an Evaluate Model module and connect the scored dataset to the left input port. To see the evaluation results, click on the output port of the Evaluate Model module and select Visualize.
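The figures reported for a binary classifier typically include accuracy, precision, recall, and F1 score. As a rough illustration of how such numbers follow from the scored labels, here is a short Python sketch; the function name evaluate and the label lists are made up and are not results from this experiment.

```python
def evaluate(actual, predicted, positive=1):
    """Compute confusion-matrix counts and common binary metrics."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Invented labels for illustration: 1 is the positive class.
actual_labels = [1, 1, 0, 0, 1, 0, 0, 1]
scored_labels = [1, 0, 0, 0, 1, 1, 0, 1]

metrics = evaluate(actual_labels, scored_labels)
print(metrics)
```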
Results
From these results, you can see that the Two-Class Boosted Decision Tree is fairly accurate in predicting income for the Adult Census Income dataset.