Need the answers for the following uploaded documents

Laboratory III:

To download additional .arff data sets go to:

http://www.hakank.org/weka/

zoo.arff, wine.arff, soybean.arff, zoo2_x.arff, sunburn.arff, disease.arff

Use the following learning schemes to compare the training set and 10-fold stratified cross-validation scores of the disease data (in disease.arff):

Decision table	- weka.classifiers.DecisionTable -R
C4.5	- weka.classifiers.j48.J48
Id3	- weka.classifiers.Id3
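
For readers unsure what "stratified" means here, the following is a minimal sketch of how instances could be dealt into folds so that each fold keeps roughly the same class proportions as the full data set. It is not Weka's own implementation. The class name StratifiedFolds, its methods, and the labels in main are hypothetical, and the sketch skips the random shuffle that real tools typically apply before stratifying.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Weka: assigns each instance to one of k folds.
class StratifiedFolds {

    /** Returns an array where entry i is the fold index (0..k-1) of instance i. */
    static int[] assign(String[] labels, int k) {
        // Group instance indices by class label, keeping first-seen class order.
        Map<String, List<Integer>> byClass = new LinkedHashMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], c -> new ArrayList<>()).add(i);
        }
        // Deal instances to folds round-robin, class by class, so every fold
        // receives a similar share of each class.
        int[] fold = new int[labels.length];
        int next = 0;
        for (List<Integer> members : byClass.values()) {
            for (int idx : members) {
                fold[idx] = next % k;
                next++;
            }
        }
        return fold;
    }

    /** Counts how many instances of class cls landed in each fold. */
    static int[] classCountsPerFold(String[] labels, int[] fold, int k, String cls) {
        int[] counts = new int[k];
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].equals(cls)) {
                counts[fold[i]]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up labels for illustration only; these are NOT taken from disease.arff.
        String[] labels = {"sick", "healthy", "sick", "sick", "healthy",
                           "sick", "healthy", "sick", "sick", "healthy"};
        int k = 5;
        int[] fold = assign(labels, k);
        System.out.println("fold of each instance: " + Arrays.toString(fold));
        System.out.println("'sick' per fold:       "
                + Arrays.toString(classCountsPerFold(labels, fold, k, "sick")));
    }
}
```

In a k-fold run, each fold serves once as the test set while the remaining k-1 folds form the training set, and the reported score is aggregated over all k test folds.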

C) Which one of these models would you say is the best? Why?

C4.5	- weka.classifiers.j48.J48
Decision List	- weka.classifiers.PART

A) What is the most important descriptor (attribute) in wine.arff?
B) How well were these two schemes able to learn the patterns in the dataset? How would you quantify your answer?
C) Compare the training set and 10-fold cross-validation scores of the two schemes.
D) Would you trust these two models? Did they really learn what is important for proper classification of wine?
E) Which one would you trust more, even if just very slightly?
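
One possible way to quantify how well a scheme learned is to work from its confusion matrix. The sketch below computes plain accuracy and Cohen's kappa (agreement beyond what chance alone would give). The class ConfusionStats and the matrix in main are hypothetical illustrations; the numbers are not results for wine.arff, and other measures may suit the question equally well.

```java
// Hypothetical helper, not part of Weka: summary measures from a confusion matrix
// where m[i][j] = number of instances of actual class i predicted as class j.
class ConfusionStats {

    /** Fraction of instances on the diagonal, i.e. correctly classified. */
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return (double) correct / total;
    }

    /** Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement). */
    static double kappa(int[][] m) {
        int n = m.length;
        double total = 0;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSum[i] += m[i][j];
                colSum[j] += m[i][j];
                total += m[i][j];
            }
        }
        // Chance agreement: probability that actual and predicted labels match
        // if they were independent with the observed marginals.
        double expected = 0;
        for (int i = 0; i < n; i++) {
            expected += (rowSum[i] / total) * (colSum[i] / total);
        }
        double observed = accuracy(m);
        return (observed - expected) / (1 - expected);
    }

    public static void main(String[] args) {
        // Made-up two-class matrix for illustration only.
        int[][] m = {
            {50, 10},
            {5, 35}
        };
        System.out.println("accuracy = " + accuracy(m));
        System.out.println("kappa    = " + kappa(m));
    }
}
```

Computing the same measures once on the training set and once under cross-validation makes the gap between the two scores explicit; a large gap suggests the model fits the training data better than it generalizes.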

Perform the same analysis of sunburn.arff as in 2. Instead of 10-fold cross-validation, use 5-fold.

A)-E) Same as in 2.

F) Why could we not use 10-fold cross-validation in this example?

Choose one of the following three files: soybean.arff, zoo.arff, or zoo2_x.arff, and use any two learning schemes of your choice to build and compare the models.