Use machine learning to classify opioid overdose incidents that occurred in Pennsylvania.
There are two classification models: one predicts survival, and the other predicts Naloxone administration. The two models are similar, so their input data is similar as well. Below are the features for the survival classification model:
As can be seen below, the classes are imbalanced, so class weights are taken into account, and the models are also evaluated with metrics better suited to imbalanced data, such as ROC AUC and the Matthews correlation coefficient (MCC).
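To illustrate why class weights and MCC matter here, the following is a minimal sketch in plain Python. It is not the project's code: the function names and the 90/10 toy labels are assumptions made for illustration. The weighting formula is the common "balanced" heuristic (the one scikit-learn applies for `class_weight="balanced"`), and MCC is computed directly from confusion-matrix counts.

```python
from collections import Counter
from math import sqrt


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels (not the real data): 1 = survived, 0 = did not survive.
labels = [1] * 90 + [0] * 10
weights = balanced_class_weights(labels)  # the minority class gets the larger weight

# A model that always predicts "survived" is 90% accurate on these labels,
# yet its MCC is 0 because it never identifies the minority class.
always_survived_mcc = matthews_corrcoef(tp=90, tn=0, fp=10, fn=0)
```

Accuracy alone would make the always-"survived" model look good, which is why a metric such as MCC is useful alongside it.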
A simple rule-based prediction: an incident is predicted as survived if Naloxone was administered and multiple drugs were not consumed:
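The rule above can be sketched in a few lines. The field names `naloxone_administered` and `multiple_drugs` and the sample rows are hypothetical, since the actual column names are not given here.

```python
def rule_predict_survived(naloxone_administered, multiple_drugs):
    """Predict survival only when Naloxone was given and a single drug was involved."""
    return bool(naloxone_administered) and not bool(multiple_drugs)


# Hypothetical incidents covering all four combinations of the two flags.
incidents = [
    {"naloxone_administered": True, "multiple_drugs": False},
    {"naloxone_administered": True, "multiple_drugs": True},
    {"naloxone_administered": False, "multiple_drugs": False},
    {"naloxone_administered": False, "multiple_drugs": True},
]

predictions = [rule_predict_survived(**row) for row in incidents]
```

A rule like this gives a reference point: a trained model is only worth keeping if it scores better than the rule on the same metrics.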
Computed the Spearman correlation and drew a heatmap with the Seaborn package:
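A heatmap of this kind could be produced roughly as follows. This is a sketch on synthetic data: the column names, the random values, and the output file name are assumptions, not the project's dataset or code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the incident table (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(15, 80, size=n),
    "naloxone_administered": rng.integers(0, 2, size=n),
    "multiple_drugs": rng.integers(0, 2, size=n),
    "survived": rng.integers(0, 2, size=n),
})

# Spearman works on ranks, so it handles ordinal and binary columns reasonably.
corr = df.corr(method="spearman")

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, square=True, ax=ax)
ax.set_title("Spearman correlation (synthetic data)")
fig.tight_layout()
fig.savefig("spearman_heatmap.png")
```

Fixing `vmin` and `vmax` at -1 and 1 keeps the color scale comparable between runs.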
Created a graph that shows the AUC and Matthews correlation coefficient (MCC) scores as a function of the threshold. The graph is generated by a function, so it can be produced for any model we use. Below is the result for Random Forest:
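The computation behind such a graph can be sketched as below. This is plain Python with hypothetical names and toy probabilities, not the project's function. It assumes the AUC is computed on the thresholded (hard) predictions, in which case it reduces to (TPR + TNR) / 2; an AUC computed on raw probabilities would not change with the threshold.

```python
from math import sqrt


def confusion_counts(y_true, y_prob, threshold):
    """Count TP, TN, FP, FN when predicting 1 for probabilities >= threshold."""
    tp = tn = fp = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        elif pred == 1 and truth == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def scores_by_threshold(y_true, y_prob, thresholds):
    """Return (threshold, auc, mcc) tuples, one per threshold."""
    rows = []
    for t in thresholds:
        tp, tn, fp, fn = confusion_counts(y_true, y_prob, t)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        tnr = tn / (tn + fp) if tn + fp else 0.0
        auc = (tpr + tnr) / 2  # ROC AUC of a single hard-threshold classifier
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom else 0.0
        rows.append((t, auc, mcc))
    return rows


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0]
y_prob = [0.95, 0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.4, 0.55, 0.3]
thresholds = [i / 10 for i in range(1, 10)]

rows = scores_by_threshold(y_true, y_prob, thresholds)
best_threshold, best_auc, best_mcc = max(rows, key=lambda r: r[2])
```

Because the function only needs true labels and predicted probabilities, the same call works for any model that outputs probabilities, and the returned rows can be passed to a plotting library to draw both curves against the threshold.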
After choosing a threshold of 0.6, manual iterations were carried out to improve the confusion matrix results: