What does “causality” mean, and how can you represent it mathematically? How can you encode causal assumptions, and what bearing do they have on data analysis? These types of questions are at the core of the practice of data science, but deep knowledge about them is surprisingly uncommon.
If you analyze data without regard to causality, you expose your results to potentially enormous biases. This applies to everything from recommendation system results, to post-hoc reports on observational data, to experiments run without proper holdout groups.
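To make the bias concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: a binary confounder `z` that makes treatment `x` more likely and also raises the outcome `y`, a true treatment effect of 1.0, and the hypothetical helper names `simulate`, `diff_in_means`, and `adjusted_effect`. It compares a naive difference in means against one computed within levels of `z`.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate (z, x, y) rows where z confounds the effect of x on y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                   # confounder
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0   # treatment depends on z
        y = 1.0 * x + 2.0 * z + rng.gauss(0.0, 1.0)          # true effect of x is 1.0
        rows.append((z, x, y))
    return rows


def diff_in_means(rows):
    """Mean outcome of treated rows minus mean outcome of untreated rows."""
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return mean(treated) - mean(control)


def adjusted_effect(rows):
    """Difference in means within each level of z, weighted by that level's share."""
    total = 0.0
    for z_val in (0, 1):
        stratum = [r for r in rows if r[0] == z_val]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total


rows = simulate()
naive = diff_in_means(rows)
adjusted = adjusted_effect(rows)
print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

Under these assumptions the naive comparison targets 1 + 2 × (0.8 − 0.2) = 2.2, more than double the true effect of 1.0, while the stratified estimate lands close to 1.0. The adjustment only works here because we know `z` is the sole confounder, which is exactly the kind of causal assumption the data alone cannot supply.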
Recent posts have been aimed at a more general audience. This one is aimed at practitioners and assumes a basic working knowledge of math and data analysis. To get the most from this post, you should have a reasonable understanding of linear regression and probability (although we’ll review a lot of probability). Prior knowledge of graphical models will make some concepts more familiar, but is not required.