Apache Storm based Real Time Analytics for Recommending Trending Topics and Sentiment Analysis on Cloud Computing Environment

  1. Storm based Real Time Analytics for Recommending Trending Topics and Sentiment Analysis on Cloud Computing Environment. Akhmedov Khumoyun (humoyun@konkuk.ac.kr), SMCC Lab, Social Media Cloud Computing Research Center, Konkuk, 2015
  2. Outline
     • Motivation
     • Real Time Systems and CEP
     • Storm Introduction
     • Used Technologies
     • Related Work
     • System Overview
     • System Architecture
     • Use Case: Social Media Analytics by SAS
  3. Motivation
     • Real time computation is in demand
     • Responding to a problem almost instantly
     • Business value
     • Tightly connected to Cloud Computing
     • Batch processing limitations
     • and …
  4. Real Time Systems and CEP
     • Real Time System? A real-time system has been described as one which “controls an environment by receiving data, processing them, and returning the results sufficiently quickly to affect the environment at that time”. Real-time response latency is often on the order of seconds or milliseconds.
     • CEP (Complex Event Processing)? CEP is event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. The goal of CEP is to identify meaningful events (such as threats of attacks) and respond to them as soon as possible.
  5. Apache Storm is
     • Fast & scalable
     • Fault-tolerant
     • Guarantees messages will be processed
     • Easy to set up & operate
     • Free & open source distributed real time computation system
       - originally developed by Nathan Marz at BackType (acquired by Twitter)
  6. Conceptual and Physical View of Storm
  7. Real Time Streaming: Apache Storm and Apache Kafka
  8. Why we need Kafka
     • Apache Kafka is an ideal source for Storm topologies. It provides everything necessary for:
       - At most once processing
       - At least once processing
       - Exactly once processing
     • Apache Storm includes Kafka spout implementations for all levels of reliability.
     • Kafka supports a wide variety of languages and integration points for both producers and consumers.
  9. Used Technologies
     • Apache Storm
     • Apache HBase
     • MySQL
     • Hadoop 2
     • Apache ZooKeeper
     • Apache Kafka (message broker)
     • Java and some Python
     • jQuery and Bootstrap
     • Play Framework (Java) or Django (Python)
  10. System Overview
     • Trending Topics? “Twitter Trends are automatically generated by an algorithm that attempts to identify topics that are being talked about more now than they were previously.” The Trends list is designed to help people discover the hottest topics and breaking news from across the world, in real time.
     • Sentiment Analysis? Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic, or the overall contextual polarity of a document.
  11. Trending Topics
     • Fashion: uniqlo, adidas, shanel, #armani
     • Politics: putin, NATO, #obama, ISIS
     • Sports: #messi, UEFA, #NBA, archery
     • Economics: crisis, #Greece, loan, finance
     • Health: #MERS, CDO, #Cardio, Cancer
  12. Trending Topics (real time feel) [Chart: axis ticks 0, 5, 10, 100, 1000, 10000, 100K; topics shown: #MERS, obama, NATO, #cancer, shanel, #crisis, #MERS…..]
  13. Sentiment Analysis (of tweets)
     • Positive
     • Negative
     • Neutral
  14. Top Ten Trending Tweets
     N | User | Tweets | Sentiment
     1 | BigData | Red Hat Offers Apache Hadoop Big Data Services For Business Critical Workloads : http://tinyurl.com/qb83boj | Positive
     2 | Checkmax | Secure your source code. http://bit.ly/1MnVRwQ Get a full vulnerability report and prevent security breaches | Negative
     3 | Time.com | 5 players to follow in the Women’s World Cup http://ti.me/1LkM0Ku | Neutral
     4 | …. | …. | ….
     . | …. | …. | ….
     . | …. | …. | ….
     8 | …. | …. | ….
     9 | Iran | #Iran, #Russia discuss regional development, #SCO membership http://theiranproject.com/blog/2015/06/20/iran-russia-discuss-regional-development-sco-membership/ … | Negative
     10 | ….. | …. | ….
  15. Sentiment Analysis
     • To find the sentiment of incoming tweets, I will use Machine Learning algorithms such as the Naïve Bayes algorithm (predictive learning) and other related techniques.
     • In addition, I will use a predefined reference sentiment dictionary as a model to determine the sentiment value of tweets efficiently.
  16. System Architecture [Diagram: labeled components include three TCrawler instances and a Dashboard]
  17. System Workflow [Diagram: labeled components are Tweet Spout (two instances), Tweet ManipulationBolt, Trending TopicsBolt, Sentiment AnalyserBolt, DBWriterBolt, MySQL, Dashboard, AllTweets, HBase]
  18. Social Media Analytics by SAS
  19. Thank you. Any questions are welcome…
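The Trending Topics definition quoted on the System Overview slide (topics "talked about more now" than previously) can be illustrated with a small sketch. This is not the algorithm used by Twitter or by the presented system; it is a minimal illustration that assumes topics are hashtags and compares counts in two adjacent time windows. The names `extract_topics`, `trending`, and the `smoothing` parameter are hypothetical.

```python
from collections import Counter


def extract_topics(tweet):
    """Return lower-cased hashtags found in a tweet (assumption: topic == hashtag)."""
    topics = []
    for word in tweet.split():
        token = word.strip(".,!?:;").lower()
        if token.startswith("#") and len(token) > 1:
            topics.append(token)
    return topics


def trending(previous_window, current_window, top_n=3, smoothing=1.0):
    """Rank topics by how much more often they appear now than before.

    score = current_count / (previous_count + smoothing)
    The smoothing term (an assumption of this sketch) avoids division by
    zero for topics that did not appear in the previous window.
    """
    prev = Counter(t for tweet in previous_window for t in extract_topics(tweet))
    curr = Counter(t for tweet in current_window for t in extract_topics(tweet))
    scores = {topic: count / (prev[topic] + smoothing) for topic, count in curr.items()}
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


previous = ["#NBA finals tonight", "#NBA game", "#Greece loan talks"]
current = ["#MERS update", "#MERS cases rise", "#NBA recap", "#Greece crisis #Greece"]
print(trending(previous, current, top_n=2))
```

In a Storm topology, logic of this kind would typically live inside a bolt that keeps per-window counters; the windowing itself is omitted here.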
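The Sentiment Analysis slide mentions a predefined reference sentiment dictionary. A minimal sketch of dictionary-based scoring is shown below; the word lists are tiny made-up placeholders, not the dictionary used by the author, and the function name `classify` is hypothetical. The Naïve Bayes approach also named on that slide is not shown.

```python
# Placeholder lexicons (assumption): a real system would load a full dictionary.
POSITIVE_WORDS = {"good", "great", "love", "win", "secure"}
NEGATIVE_WORDS = {"bad", "crisis", "hate", "fail", "breach"}


def classify(tweet):
    """Label a tweet Positive, Negative, or Neutral by counting lexicon hits."""
    words = [w.strip(".,!?:;#").lower() for w in tweet.split()]
    # Each positive word adds 1, each negative word subtracts 1.
    score = sum((w in POSITIVE_WORDS) - (w in NEGATIVE_WORDS) for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


for text in ["Great win for the team!", "#crisis talks fail", "5 players to follow"]:
    print(text, "->", classify(text))
```

Plain word counting ignores negation and context, which is one reason a learned classifier is often combined with a dictionary.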
