13:00 - 13:45
Good, Fast, Cheap: How to do Data Science with Missing Data
If you've never heard of the "good, fast, cheap" dilemma, it goes something like this: You can have something good and fast, but it won't be cheap. You can have something good and cheap, but it won't be fast. You can have something fast and cheap, but it won't be good. In short, you can pick two of the three but you can't have all three.
If you've worked on a data science problem before, I can all but guarantee that you've run into missing data. How do we handle it? Well, we can avoid, ignore, or try to account for missing data. The problem is, none of these strategies is good, fast, *and* cheap.
We'll start by visualizing missing data and identifying the three different types of missing data, which will show us how each type affects whether we should avoid, ignore, or account for the missing data. We will walk through the advantages and disadvantages of each approach, as well as how to visualize and implement it. We'll wrap up with practical tips for working with missing data and recommendations for integrating these techniques into your workflow!