This document gives a short introduction to three machine learning (ML) frameworks in the Hadoop ecosystem: Hadoop, Spark, and Petuum. It notes that Hadoop is the de facto standard for distributed storage and processing of big data. Spark runs some applications about 10x faster than Hadoop by caching data in memory. For ML workloads, Petuum is faster still than Spark: it uses asynchronous communication to reduce network costs while still guaranteeing convergence, and it also provides deep learning APIs.
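
To make the in-memory caching claim concrete, here is a minimal single-machine sketch in plain Python. It does not use Hadoop, Spark, or Petuum; the `Dataset` class, the `fit_mean` function, and the load counter are all hypothetical names invented for illustration. It only shows why an iterative ML algorithm benefits when the input is read from storage once and then reused across iterations.

```python
class Dataset:
    """Toy dataset (hypothetical) that counts simulated reads from storage."""

    def __init__(self, points):
        self._points = list(points)
        self._cache = None
        self.loads = 0  # number of simulated expensive reads

    def read(self, use_cache):
        # Serve from memory when a cached copy exists and caching is enabled.
        if use_cache and self._cache is not None:
            return self._cache
        self.loads += 1  # stands in for a read from distributed storage
        data = list(self._points)
        if use_cache:
            self._cache = data
        return data


def fit_mean(dataset, iterations, use_cache, lr=0.5):
    """Gradient descent on squared error; the minimizer is the sample mean."""
    theta = 0.0
    for _ in range(iterations):
        data = dataset.read(use_cache)  # every iteration needs the full input
        grad = sum(theta - x for x in data) / len(data)
        theta -= lr * grad
    return theta


uncached = Dataset([1.0, 2.0, 3.0, 4.0])
cached = Dataset([1.0, 2.0, 3.0, 4.0])

theta_uncached = fit_mean(uncached, iterations=20, use_cache=False)
theta_cached = fit_mean(cached, iterations=20, use_cache=True)

print("reads without cache:", uncached.loads)
print("reads with cache:   ", cached.loads)
```

Both runs reach the same estimate, but the uncached run reads the input once per iteration while the cached run reads it once in total. In a real cluster the saved reads would be disk and network I/O, which is where a speedup for iterative workloads would come from; the actual factor depends on the application.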