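3. Ease of use. Word count in Python:

    # `spark` is assumed to be an existing SparkContext.
    text_file = spark.textFile("hdfs://...")
    # Split each line into words, pair each word with 1, then sum per word.
    (text_file.flatMap(lambda line: line.split())
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b))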
Spark offers over 80 high-level operators, and it can be used interactively
from the Scala, Python and R shells.
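To see what the three operators in the word count compute, here is a minimal sketch in plain Python that mimics them on an in-memory list. It does not use Spark; `flat_map` and `reduce_by_key` are hypothetical local helpers written for illustration, and the two sample lines are made up.

```python
from collections import defaultdict
from functools import reduce


def flat_map(func, items):
    """Apply func to each item and flatten the resulting lists into one list."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Merge the values that share a key using func (a local stand-in for reduceByKey)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


# Made-up input standing in for the lines of a text file.
lines = ["to be or not", "to be"]

words = flat_map(lambda line: line.split(), lines)   # one flat list of words
pairs = [(word, 1) for word in words]                # the map step: (word, 1)
counts = reduce_by_key(lambda a, b: a + b, pairs)    # sum the 1s per word

print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In Spark the same three steps run in parallel across partitions of the data, which is why the function passed to reduceByKey should be associative and commutative, as addition is here.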
4. Generality: combine SQL, streaming, and complex analytics.
Spark powers a stack of libraries including SQL and DataFrames, MLlib for
machine learning, GraphX, and Spark Streaming.
5. Runs everywhere: Spark runs on Hadoop, Mesos, standalone, or in the cloud.
It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
Spark can run in its standalone cluster mode, on EC2, on Hadoop YARN, or on
Apache Mesos. It can access data in HDFS, Cassandra, HBase, Hive, Tachyon,
and any Hadoop data source.
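The cluster managers listed above are usually selected through a master URL passed to the application. The sketch below builds those URL strings in plain Python. The URL schemes (`local[*]`, `spark://`, `mesos://`, `yarn`) are the commonly documented forms, but the YARN form has varied across Spark versions, so check the documentation for the version in use. The helper `master_url` and the host name `master.example.com` are hypothetical.

```python
# Master URL patterns by cluster manager. The host name and ports used below
# are hypothetical placeholders, not values taken from this document.
MASTER_URL_PATTERNS = {
    "local": "local[*]",                    # all cores of one machine, no cluster
    "standalone": "spark://{host}:{port}",  # Spark's own standalone cluster manager
    "mesos": "mesos://{host}:{port}",       # Apache Mesos
    "yarn": "yarn",                         # Hadoop YARN; cluster found via Hadoop config
}


def master_url(manager, host="master.example.com", port=7077):
    """Return the master URL string for the given cluster manager name."""
    return MASTER_URL_PATTERNS[manager].format(host=host, port=port)


print(master_url("standalone"))        # spark://master.example.com:7077
print(master_url("mesos", port=5050))  # mesos://master.example.com:5050
print(master_url("yarn"))              # yarn
```

The resulting string is what would typically be given to `spark-submit --master` or to the application's configuration; the application code itself stays the same across cluster managers.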