Summary: Machine Learning on the Rosanne-ABC Firing Incident Dataset

Summary of results: which methodology/modality “wins?”

Vanilla Merged
Algorithm Speed CCI % ROC AUC RMSE F-1 CCI % ROC AUC F-1 RMSE
ZeroR Instant 37.9333 0.4990 0.4144 NULL 47.1000 0.4990 NULL 0.4536
OneR Instant 43.0000 0.5420 0.5339 NULL 52.9833 0.5660 NULL 0.5599
NaiveBayes Fast 63.8500 0.8160 0.3808 0.6410 63.9667 0.8000 0.6430 0.4374
IBK Fast 56.5333 0.6910 0.4386 0.5230 59.5833 0.6510 0.5470 0.4972
RandomTree Fast 59.5833 0.6800 0.4474 0.5920 62.7167 0.6700 0.6210 0.4954
SimpleLogistic Moderate 73.6500 0.8850 0.3065 0.7320 73.6500 0.8730 0.7300 0.3502
DecisionTable Slow too slow for viable computation on consumer-grade hardware
MultilayerPerceptron Slow
RandomForest Slow
Vanilla Merged
Meta-Classifier Speed CCI % ROC AUC RMSE F-1 CCI % ROC AUC F-1 RMSE
Stack (ZR, NB) Moderate 37.9333 0.4990 0.4144 NULL vacuous results, omitted
Stack (NB. RT) Moderate 63.7000 0.8230 0.3795 0.6350 61.9833 0.6980 0.6130 0.4523
Vote (ZR, NB, RT) Moderate 62.0833 0.8430 0.3414 0.6110 64.0500 0.8330 0.6260 0.3830
CostSensitive (ZR) Instant 37.9333 0.4990 0.4144 NULL 36.6667 0.4990 NULL 0.4623
CostSensitive (OR) Instant 42.7000 0.5400 0.5353 NULL 39.6167 0.5170 NULL 0.6345
CostSensitive (NB) Fast 63.8500 0.8160 0.3808 0.6410 64.0833 0.8010 0.6450 0.4365
CostSensitive (IBK) Fast 56.5333 0.6910 0.4386 0.5230 59.5833 0.6510 0.5470 0.4972
CostSensitive (RT) Fast 59.5833 0.6800 0.4474 0.5920 63.3833 0.7050 0.6350 0.4728
CostSensitive (SL) Moderate 73.6500 0.8850 0.3065 0.7320 74.7833 0.8780 0.7450 0.3478

My results are contained in a separate text file in lab journal format. Salient results consisted of:

Continue reading “Summary: Machine Learning on the Rosanne-ABC Firing Incident Dataset”

Lab Journal: Machine Learning on tweets related to the Roseanne-ABC firing incident

This is my lab journal for the analysis of a data set composed from tweets related to the 2018 firing of Rosanne from ABC over a racist statement.

A summary of these results with methodological comments can be found here.

1) Preparation/Parsing the data set

2) Running the data-set as-is (“vanilla”)

3) Merged data set (Pro, Anti, UncNeut)

4) Meta classifications – Voting and Stacking on the vanilla and merged data sets

5) Introduction of Penalties via CostSensitiveClassifier

Continue reading “Lab Journal: Machine Learning on tweets related to the Roseanne-ABC firing incident”