A description of a complete Big Data ecosystem for operating on huge collections of data: up to gigabytes of data per second, with a few hundred thousand customers connected at the same time. The ecosystem can be extended with additional Apache tools: Apache Flume, Ambari, Mesos, and YARN.
6. Drone software written in Scala
It streams data to the server in real time
using Kafka
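A minimal sketch of what one streamed message could look like. The slides do not show the message format, so the `Telemetry` record, its field names, and the comma-separated encoding are all assumptions made for illustration. The code uses only the Scala standard library; with the Kafka Java client, the encoded line would typically be wrapped in a `ProducerRecord` (keyed by drone ID so that one drone's messages stay ordered within a partition) and passed to `KafkaProducer.send`.

```scala
// Hypothetical telemetry record; the field names are assumptions, not taken from the slides.
final case class Telemetry(
  droneId: String,
  timestampMs: Long,
  lat: Double,
  lon: Double,
  altitudeM: Double
)

object TelemetryCodec {
  // Encode a record as one comma-separated line, usable as a message value.
  def encode(t: Telemetry): String =
    Seq(t.droneId, t.timestampMs.toString, t.lat.toString, t.lon.toString, t.altitudeM.toString)
      .mkString(",")

  // Decode a line back into a record; returns None for malformed input.
  def decode(line: String): Option[Telemetry] =
    line.split(",", -1) match {
      case Array(id, ts, lat, lon, alt) =>
        scala.util.Try(Telemetry(id, ts.toLong, lat.toDouble, lon.toDouble, alt.toDouble)).toOption
      case _ => None
    }
}
```

A plain-text encoding keeps the sketch dependency-free; a real deployment would more likely pick a schema-based format.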
9. On the server, the data is read by Spark Streaming.
It allows us to:
- save data to Cassandra
- send calculated data to the browser through a websocket
- send it to another Kafka consumer
- save the whole log to a Hadoop cluster
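The fan-out described above can be sketched roughly as follows. This is not the actual Spark job: `Sink`, `MemorySink`, and `FanOut` are hypothetical names, and the in-memory sink stands in for Cassandra, the websocket, Kafka, and Hadoop. In Spark Streaming, logic like this would typically run inside an output operation such as `foreachRDD`, once per micro-batch.

```scala
// A sink is anything that can accept one micro-batch of records.
trait Sink {
  def name: String
  def write(batch: Seq[String]): Unit
}

// In-memory stand-in for a real destination (Cassandra, websocket, Kafka, Hadoop).
final class MemorySink(val name: String) extends Sink {
  private val buffer = scala.collection.mutable.ArrayBuffer.empty[String]
  def write(batch: Seq[String]): Unit = buffer ++= batch
  def contents: Seq[String] = buffer.toList
}

object FanOut {
  // Deliver the batch to every sink. A failure in one sink does not stop
  // delivery to the others. Returns the names of the sinks that failed.
  def deliver(batch: Seq[String], sinks: Seq[Sink]): Seq[String] =
    sinks.flatMap { s =>
      scala.util.Try(s.write(batch)).failed.toOption.map(_ => s.name)
    }
}
```

Isolating failures per sink is a design choice made for this sketch, so that a closed browser connection, for example, would not prevent the raw log from being stored.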
11. By saving logs to the Hadoop cluster, we can
access them later to recover anything we did not
save in Cassandra
12. By sending data to the browser through a
websocket, we can see where our drones are in
real time, monitor sensors, and much more
14. Using Cassandra and Apache Spark, data
scientists can analyze the collected data later
with:
1. Apache Zeppelin
- Apache Spark (DataFrames, RDDs) + Scala
- Apache Spark MLlib
2. Azure Machine Learning
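As a rough illustration of the kind of later analysis meant here, the sketch below computes an average sensor value per drone with plain Scala collections. The `Reading` record and its `temperatureC` field are assumptions, not part of the original data model. On a Spark DataFrame, the equivalent would be along the lines of `groupBy("droneId").agg(avg("temperatureC"))`.

```scala
// Hypothetical sensor reading; the field names are assumptions for illustration.
final case class Reading(droneId: String, temperatureC: Double)

object SensorStats {
  // Average temperature per drone, computed with standard-library collections.
  def averageByDrone(readings: Seq[Reading]): Map[String, Double] =
    readings.groupBy(_.droneId).map { case (id, rs) =>
      id -> (rs.map(_.temperatureC).sum / rs.size)
    }
}
```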
15. We prefer to use Azure Machine Learning
instead of Spark MLlib because it makes predictions
much easier to understand and to design
Read our blog post about Azure ML:
http://espeo.eu/blog/azure-machine-learning-predictions/