This is an introduction to the exciting world of healthcare data science through three real-world efforts we've delivered over the past three years, and some of the open problems behind them. First, automated clinical coding demonstrates the challenges of natural language understanding, the variety of jargon, and measurement stability. Second, patient risk prediction illuminates the need for localized models, non-stationary models, and explainability. Third, the master patient index covers some of the data quality, privacy, and compliance constraints and the design choices they imply. We hope to convince you that tackling these open problems is as intellectually exciting as it is socially important. Presented by David Talby, SVP Engineering at Atigeo.