Week six assignment
Rahul Kundu
Personal Loan Acceptance
# packages
library("data.table")
# UniversalBank.csv Read
Bank % mutate_if(is.numeric, round, 3)
##   Option # Probability
## 1        1       0.667
## 2        2       0.000
## 3        3       0.000
## 4        4       0.167
## 5        5       0.000
## 6        6         NaN
NOTE: Among the 12 observations above, none has (Injury = yes, WEATHER_R = 2, TRAF_CON_R = 2). The conditional probability for that combination is undefined, since the denominator is zero.
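The zero-denominator case can be reproduced with a short sketch. The data frame `toy` below is hypothetical illustration data, not the 12 accidents from the assignment, and the names `toy` and `combos` are assumptions introduced here. It computes the exact conditional probability of an injury for every predictor combination and returns `NaN` wherever a combination never occurs.

```r
# Hypothetical toy data (NOT the assignment's 12 accidents).
toy <- data.frame(
  INJURY     = c("yes", "no", "no", "yes", "no", "no"),
  WEATHER_R  = c(1, 1, 1, 2, 2, 2),
  TRAF_CON_R = c(0, 0, 1, 0, 0, 0)
)

# Every predictor combination, including ones that never occur in toy.
combos <- expand.grid(WEATHER_R = 1:2, TRAF_CON_R = 0:2)

# P(INJURY = yes | combination) = (# yes in the cell) / (# records in the cell).
combos$Probability <- mapply(function(w, traf) {
  in_cell <- toy$WEATHER_R == w & toy$TRAF_CON_R == traf
  sum(toy$INJURY[in_cell] == "yes") / sum(in_cell)  # 0 / 0 evaluates to NaN
}, combos$WEATHER_R, combos$TRAF_CON_R)

print(combos)
```

Cells with no records produce `0 / 0`, which R evaluates to `NaN` rather than raising an error, which is why an undefined probability can appear in the printed table.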
iii. Classify the 12 accidents using these probabilities and a cutoff of 0.5.
# probability results
new.df.prob$PREDICT_PROB <- ifelse(new.df.prob$PROB_INJURY > .5, "yes", "no")
new.df.prob
##      MAX_SEV WEATHER_adverse TRAF_two_way PROB_INJURY PREDICT_PROB
## 1  non-fatal               1            0       0.667          yes
## 2         no               0            1       0.167           no
## 3  non-fatal               1            0       0.000           no
## 4  non-fatal               1            0       0.000           no
## 5         no               1            0       0.667          yes
## 6         no               1            0       0.167           no
## 7  non-fatal               0            0       0.167           no
## 8         no               0            1       0.667          yes
## 9  non-fatal               0            0       0.167           no
## 10        no               0            0       0.167           no
## 11 non-fatal               0            0       0.167           no
## 12        no               0            0       0.000           no
iv. Compute manually the naive Bayes conditional probability of an injury given WEATHER_R = 1 and TRAF_CON_R = 1.
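As a minimal sketch of what such a manual calculation looks like: naive Bayes multiplies the class prior by each predictor's conditional probability within that class, then normalises across the classes. The data frame `toy`, the helper `nb_numerator`, and the variable `man_prob_yes` are hypothetical names introduced for illustration, and the records are made up rather than taken from the assignment.

```r
# Hypothetical toy data (NOT the assignment's records).
toy <- data.frame(
  INJURY     = c("yes", "yes", "yes", "no", "no", "no", "no", "no"),
  WEATHER_R  = c(1, 1, 2, 1, 2, 2, 1, 2),
  TRAF_CON_R = c(1, 0, 1, 1, 0, 0, 0, 1)
)

# Naive Bayes numerator for one class:
# P(class) * P(WEATHER_R = w | class) * P(TRAF_CON_R = traf | class)
nb_numerator <- function(df, cls, w, traf) {
  in_class <- df$INJURY == cls
  prior    <- mean(in_class)                       # P(class)
  p_w      <- mean(df$WEATHER_R[in_class] == w)    # P(WEATHER_R = w | class)
  p_traf   <- mean(df$TRAF_CON_R[in_class] == traf) # P(TRAF_CON_R = traf | class)
  prior * p_w * p_traf
}

num_yes <- nb_numerator(toy, "yes", w = 1, traf = 1)
num_no  <- nb_numerator(toy, "no",  w = 1, traf = 1)

# Normalise so the two class probabilities sum to one.
man_prob_yes <- num_yes / (num_yes + num_no)
man_prob_yes
```

Unlike the exact Bayes calculation, this estimate stays defined even when the specific combination of predictor values never appears in the data, as long as each value appears at least once within each class.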
man.prob
##     P-Value [Acc > NIR] : 0.0002441
##
##                   Kappa : 1
##
##  Mcnemar's Test P-Value : NA
##
##             Sensitivity : 1.0
##             Specificity : 1.0
##          Pos Pred Value : 1.0
##          Neg Pred Value : 1.0
##              Prevalence : 0.5
##          Detection Rate : 0.5
##    Detection Prevalence : 0.5
##       Balanced Accuracy : 1.0
##
##        'Positive' Class : no
##
# compare with manually calculated results
new.df.prob$PREDICT_PROB_NB
##     P-Value [Acc > NIR] : 0.002333
##
##                   Kappa : 0.2588
##
##  Mcnemar's Test P-Value : 0.224765
##
##             Sensitivity : 0.6312
##             Specificity : 0.6300
##          Pos Pred Value : 0.5771
##          Neg Pred Value : 0.6811
##              Prevalence : 0.4444
##          Detection Rate : 0.2806
##    Detection Prevalence : 0.4861
##       Balanced Accuracy : 0.6306
##
##        'Positive' Class : non-fatal
##
ner=1-.5384
nerp=scales::percent(ner,0.01)
iii. What is the overall error for the validation set?
confusionMatrix(valid.df$MAX_SEV, predict(nbTotal, valid.df[, vars]))
## Confusion Matrix and Statistics
##
##            Reference
## Prediction  no non-fatal
##   no        67        56
##   non-fatal 60        57
##
##                Accuracy : 0.5167
##                  95% CI : (0.4515, 0.5814)
##     No Information Rate : 0.5292
##     P-Value [Acc > NIR] : 0.6749
##
##                   Kappa : 0.0319
##
##  Mcnemar's Test P-Value : 0.7806
##
##             Sensitivity : 0.5276
##             Specificity : 0.5044
##          Pos Pred Value : 0.5447
##          Neg Pred Value : 0.4872
##              Prevalence : 0.5292
##          Detection Rate : 0.2792
##    Detection Prevalence : 0.5125
##       Balanced Accuracy : 0.5160
##
##        'Positive' Class : no
##
ver
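The overall error is one minus the accuracy, and it can be checked directly from the four cell counts of the validation confusion matrix reported above. The following is a small sketch of that arithmetic; the names `cm`, `accuracy`, and `overall_error` are assumptions introduced here, not variables from the original code.

```r
# Cell counts copied from the validation confusion matrix
# (rows = Prediction, columns = Reference). matrix() fills column by column.
cm <- matrix(c(67, 60, 56, 57), nrow = 2,
             dimnames = list(Prediction = c("no", "non-fatal"),
                             Reference  = c("no", "non-fatal")))

# Accuracy = records on the diagonal / all records.
accuracy <- sum(diag(cm)) / sum(cm)

# Overall error = share of records off the diagonal.
overall_error <- 1 - accuracy
round(overall_error, 4)
```

With 124 of the 240 validation records on the diagonal, this reproduces the reported accuracy of 0.5167 and gives an overall error of about 0.4833.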