A lot of data science coverage in the media focuses on big data: storage systems, deep learning, and analyzing datasets with billions or trillions of observations. However, there’s an equally pressing problem in many industries and smaller companies today: small sample sizes or small subgroups within larger datasets. In these settings, machine learning algorithms can fail to converge, statistical methods can break down, and valuable insight is lost.
Recent advances in a branch of machine learning called topological data analysis (TDA), along with novel applications of topology to existing statistical methods, have provided a toolset suited to the challenges of small data. These methods have great potential as the field of data science shifts its focus from quantity to quality of data. This talk gives an overview of several of TDA’s major tools, as well as their applications to three projects in which traditional methods fail.
I will link to the video when it is made available :)