In my previous post I explained how to generate a single ROC curve for a single classifier, but in practical cases you will often need to generate multiple ROC curves for multiple classifiers. This is very important for evaluating and comparing the performance of classifiers. The following steps give you the complete idea of how to draw multiple ROC curves for multiple classifiers.
Open the Weka tool and select the KnowledgeFlow tab.
|
Figure 1 |
|
Figure 2 |
- When it has loaded, the first thing is to select the ArffLoader from the DataSources menu, which is used to load the input data. Drag the ArffLoader onto the knowledge flow layout as shown in figure 3.
- From the evaluation tab select the ClassAssigner and put it into the knowledge flow layout.
- From the evaluation tab select the ClassValuePicker and put it into the knowledge flow layout.
- From the evaluation tab select the CrossValidationFoldMaker (we are using 10-fold cross-validation) and put it into the knowledge flow layout.
- The next step is to choose the classifiers from the classifiers tab; in this tutorial I am using Random Forest (RF) and Naïve Bayes as the classifiers. Select RF from the trees group and Naïve Bayes from the bayes group.
- To evaluate the performance of the classifiers, select the ClassifierPerformanceEvaluator from the evaluation tab (we need two performance evaluators, one for each classifier) and put them into the knowledge flow layout.
- Finally we have to plot the ROC curves; from the visualization tab select the ModelPerformanceChart and put it into the knowledge flow layout.
|
Figure 3 |
Now all the components needed to draw the ROC curves are on the layout; the next thing to do is make the connections.
- To connect the ArffLoader to the ClassAssigner, right-click on the ArffLoader, select dataSet, and connect it to the ClassAssigner.
|
Figure 4 |
- Right-click on the ClassAssigner, select dataSet, and connect it to the ClassValuePicker.
- Right-click on the ClassValuePicker, select dataSet, and connect it to the CrossValidationFoldMaker.
- Next we have to feed the training and test set data to the classifier algorithms.
- Right-click on the CrossValidationFoldMaker, select trainingSet, and connect it to the RF classifier. Right-click on the CrossValidationFoldMaker again, select testSet, and connect it to the RF classifier. Do the same for the Naïve Bayes classifier.
- Right-click on the RF classifier, select batchClassifier, and connect it to one ClassifierPerformanceEvaluator. Do the same for the Naïve Bayes classifier.
- Right-click on each ClassifierPerformanceEvaluator, select thresholdData, and connect it to the ModelPerformanceChart. Now the complete arrangement looks like figure 5.
|
Figure 5 |
- Next we are going to input the data; right-click on the ArffLoader and select Configure. Browse for the arff file on your system and click the OK button.
- Right-click on the ClassValuePicker and select the class for which we are going to draw the ROC curve.
- Right-click on the CrossValidationFoldMaker and select how many folds to use for splitting the training and testing data (the default is 10-fold cross-validation). Ten-fold cross-validation means that in each fold 90% of the input data is used as training data and the remaining 10% as testing data.
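The 90%/10% split above can be sketched in plain Python. This is only a minimal illustration of the fold arithmetic, not Weka's actual implementation (Weka also stratifies the folds so each one preserves the class proportions):

```python
def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k (train, test) pairs."""
    indices = list(range(n))
    fold_size = n // k
    folds = []
    for i in range(k):
        start = i * fold_size
        # the last fold absorbs any remainder
        end = (i + 1) * fold_size if i < k - 1 else n
        test = indices[start:end]
        train = indices[:start] + indices[end:]
        folds.append((train, test))
    return folds

folds = k_fold_indices(100, k=10)
train, test = folds[0]
print(len(train), len(test))  # 90 10
```

Each of the 100 instances appears in exactly one test fold, so every instance gets tested exactly once across the ten runs.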
- Next we have to run the model; right-click on the ArffLoader and select Start loading.
|
Figure 6 |
To see the ROC curves, right-click on the ModelPerformanceChart and select Show chart. The result will look like figure 7.
|
Figure 7 |
|
Figure 8 |
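Under the hood, the thresholdData that the ModelPerformanceChart plots is a set of (false-positive-rate, true-positive-rate) points obtained by sweeping a threshold over each classifier's predicted scores. A minimal sketch of that computation in plain Python (the labels and scores below are made-up values for illustration, not actual Weka output):

```python
def roc_points(labels, scores):
    """Compute (FPR, TPR) points by sweeping a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    # visit instances from highest score to lowest
    pairs = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# hypothetical scores from two classifiers on the same test labels
labels = [1, 1, 0, 1, 0, 0, 1, 0]
rf_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
nb_scores = [0.9, 0.4, 0.7, 0.6, 0.8, 0.3, 0.2, 0.1]
print(auc(roc_points(labels, rf_scores)))  # 0.75
print(auc(roc_points(labels, nb_scores)))  # 0.5625
```

Plotting both point lists on one chart gives the same kind of picture as figure 7: the curve (and AUC) closer to the top-left corner belongs to the better-performing classifier.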