You will be using the dataset HW4transactiondatasettxt.pdf

  1. You will be using the dataset HW4-transaction-dataset.txt. The description of the features is given in the file features.pdf.

     Load the dataset into a variable mydata. How many rows contain missing values? Remove all the rows with missing values. Convert the values of the column order from {y,n} to {1,0}. Split the dataset into training and testing sets with a 70-30 split ratio. (2 Points)

     Convert the variable order in your original, train and test datasets to a factor. Run a decision tree using the variables duration, startHour, cCount and bCount to predict the variable order on the training set. Visualize the decision tree on the training set using the rpart.plot() function. Predict using the test set. Calculate precision, recall and overall accuracy on the test set. (2 Points)

     Execute the following command before proceeding to the next question

     Run a 10-fold cross validation using a decision tree. Are the results better now? (1 Point)

     Run Naive Bayes, SVM (RBF kernel), KNN and Random Forest using 10-fold cross validation. Which of them gives the best prediction accuracy? (2 Points)

     Run a bagging model with a decision tree and a random forest using 10-fold cross validation repeated three times. Which of the two gives better performance? Now train a boosting method using gbm. Which of the two (bagging or boosting) performs better? (3 Points)
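
     The precision, recall and overall accuracy requested for the test set can be written in terms of the confusion matrix. The definitions below are the standard ones, under the assumption (not stated in the assignment) that order = 1 is treated as the positive class; TP, FP, TN and FN are the counts of true positives, false positives, true negatives and false negatives on the test set.

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}, \qquad
\text{accuracy} = \frac{TP + TN}{TP + FP + TN + FN}
\]
```

     If order = 0 were taken as the positive class instead, precision and recall would change, while overall accuracy would stay the same.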