Once an obscure branch of applied mathematics, machine learning is now the darling of tech. This talk (https://www.youtube.com/watch?v=E6KpRwoH18M) discusses lessons learned in democratizing machine learning: how scikit-learn, built by a community of developers, was designed to empower users, and how the Python data ecosystem grew out of scientific computing tools, which underlines the importance of good numerics. It also covers the remaining challenges and the progress that we are making. Scaling out brings different bottlenecks to numerics. Integrating data into statistical models, a hurdle to data-science practice, requires rethinking data-cleaning pipelines. This talk draws on my experience as a scikit-learn developer, and also as a researcher in machine learning and its applications.
1. Building a toolkit for all
2. Tackling scalability
3. Bridging to data engineering