This serves as a placeholder page for all the installments of my tutorial on preparing data sets for machine learning. Creating good data sets is where machine learning and data science intersect (otherwise they aren’t exactly interchangeable fields), so this will have relevant information for machine learning engineers and data scientists alike.
This focus on data set creation coincides with my guiding principles: machine learning and data science ought to be explainable and ethically sound. If it works, but you can’t explain why, then it doesn’t really work. If it can’t be used to make the world a better place, then it isn’t useful.
The tutorial shall mainly consist of a series of topics that build upon each other. Each topic shall end with some deeper challenges, drawn mainly from deductive and inductive logic, that correspond to research I have done in the past and am doing at the moment. If you want to read a tutorial on handwriting recognition or how to import Keras into a Jupyter notebook, those are a dime a dozen and you should look somewhere else.
Although this series will cover a lot of flavor-of-the-week ML topics, my overall goal is to delve into problems for ML that come from decision theory, inductive logic, and computational complexity. These problems are relevant yet underrepresented in the larger artificial intelligence corpus. In some cases they pose major foundational issues. Along the way I will pose some ethical quandaries which, as most ethical quandaries tend to go, have no straightforward or correct solution.
Tutorial: Machine Learning Data Set Preparation, Part 1
Tutorial: Machine Learning Data Set Preparation, Part 2
Tutorial: Machine Learning Data Set Preparation, Part 3
Tutorial: Machine Learning Data Set Preparation, Part 4
This tutorial is not tied to a class or a book. It is free to use non-commercially with attribution, but it may not be modified without explicit permission (Creative Commons Attribution-NonCommercial-NoDerivs, CC BY-NC-ND).