Edith Cowan University MAT3170 Data Analysis and Visualisation – Assignment Study Guide
Report on analysis of a real data set
MAT3170 – Data Analysis and Visualisation – Assignment 2

Binary logistic regression
Binary logistic regression is used to develop machine learning models with categorical input features and a binary outcome variable (e.g., Yes/No). While linear regression is assessed with measures such as RMSE, bias and correlation, binary logistic regression models the probability of the outcome. Both the predictors and the outcome variable can take one or more categorical values. If the predicted probability of success is greater than 0.5, the event is classified as a success, and otherwise as a failure.

Logistic LASSO regression
Logistic LASSO regression is a penalised logistic regression model that determines the optimal set of features rather than using all of them. If a feature is not significant, LASSO removes it by shrinking its coefficient to exactly 0. Its hyperparameter is lambda, which is obtained by searching over a range of values.

Classification tree
Classification trees, otherwise known as decision trees, are simpler models and easier to interpret. An advantage of this model is that it can handle both categorical and continuous outcomes and input features. A classification tree splits the data at cut-off values of the input variables. It uses one hyperparameter, the complexity parameter (cp).

The three supervised learning algorithms randomly selected with respect to the student ID are binary logistic regression, logistic LASSO regression and classification tree.

Models from the training set (Malware Samples 10000)
The MalwareSamples1000 file was first analysed and tuned with the selected machine learning algorithms. The data was cleaned and split in an 80/20 ratio to form the training and test sets. Each algorithm is trained on the 80% training set and then used to predict the test set. The test results are discussed below.

Binary logistic regression
Model summary after running the model and cross-validating it through the recursive feature elimination (RFE) process, which here used 10-fold cross-validation repeated 10 times to obtain accurate predictions. The asterisks (*) in the summary indicate the optimal variables needed for accurate prediction.

[Table truncated in the preview: chosen features and their coefficients, followed by the RFE results with columns Variables, Accuracy, Kappa, AccuracySD, KappaSD, Selected.]
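The preview does not show the code itself, but the workflow described above (80/20 split, 10-fold cross-validation repeated 10 times, RFE for the logistic model, lambda tuning for the LASSO and cp tuning for the tree) maps naturally onto R's caret package. The following is a minimal sketch of that workflow only, not the original script: the file name MalwareSamples.csv, the outcome column isMalware, the random seed, the feature-subset sizes and the tuning grids are assumptions introduced for illustration.

    library(caret)     # model training, cross-validation and RFE
                       # (also needs the glmnet and rpart packages installed)

    ## Load the data; the file and column names below are placeholders.
    malware <- read.csv("MalwareSamples.csv")
    malware$isMalware <- factor(malware$isMalware)      # binary outcome (Yes/No)

    ## 80/20 split into training and test sets
    set.seed(123)
    idx   <- createDataPartition(malware$isMalware, p = 0.8, list = FALSE)
    train <- malware[idx, ]
    test  <- malware[-idx, ]

    ## 10-fold cross-validation repeated 10 times, shared by all models
    ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)

    ## 1. Binary logistic regression with recursive feature elimination (RFE)
    rfe_ctrl <- rfeControl(functions = lrFuncs, method = "repeatedcv",
                           number = 10, repeats = 10)
    rfe_fit  <- rfe(x = train[, setdiff(names(train), "isMalware")],
                    y = train$isMalware,
                    sizes = 1:10, rfeControl = rfe_ctrl)
    print(rfe_fit)   # reports Variables, Accuracy, Kappa, AccuracySD, KappaSD, Selected

    ## 2. Logistic LASSO regression: glmnet with alpha = 1, lambda tuned over a grid
    lasso_fit <- train(isMalware ~ ., data = train, method = "glmnet",
                       trControl = ctrl,
                       tuneGrid = expand.grid(alpha  = 1,
                                              lambda = 10^seq(-4, 1, length.out = 50)))

    ## 3. Classification tree: rpart, tuning the complexity parameter cp
    tree_fit <- train(isMalware ~ ., data = train, method = "rpart",
                      trControl = ctrl, tuneLength = 20)

    ## Predict the held-out 20% and summarise test performance
    for (fit in list(lasso_fit, tree_fit)) {
      preds <- predict(fit, newdata = test)
      print(confusionMatrix(preds, test$isMalware))
    }

In a setup like this, printing the RFE object is what produces the Variables, Accuracy, Kappa, AccuracySD, KappaSD and Selected columns summarised in the truncated table above.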