At the event organized in cooperation with Vodafone, Cyberpark and Türkiye Teknoloji Geliştirme Vakfı (Technology Development Foundation of Turkey), the concept of big data, the Apache Hadoop ecosystem, and example applications from Turkey and around the world were presented.
1 June 2016 - Onur Karadeli, Mustafa Murat Sever
5. Big Data is growing (Google Trends)
C1 - Public
6. Definition of Big Data
Big data is a term for data sets that are so large or complex
that traditional data processing applications are inadequate.
Challenges include analysis, capture, data curation, search,
sharing, storage, transfer, visualization, querying and
information privacy.
- Wikipedia, "Big data"
What is Big?
7. The ‘3Vs’
• Volume
• Velocity
• Variety
8. Volume
• 40% growth per year
• 50 Zettabytes by 2020
Ref:Where-is-your-data-FINAL-5a
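To make the two figures above concrete, here is a small arithmetic sketch. It assumes the 40% rate compounds annually and that 50 zettabytes is the 2020 end point; neither assumption is stated on the slide, and the function names are illustrative only.

```python
import math

# Figures taken from the slide; the compounding model is an assumption.
GROWTH_RATE = 0.40   # 40% growth per year
TARGET_YEAR = 2020
TARGET_ZB = 50.0     # 50 zettabytes by 2020


def projected_volume_zb(year: int) -> float:
    """Volume in zettabytes for a given year, back- or forward-projected
    from the 2020 target under constant compound growth."""
    return TARGET_ZB * (1 + GROWTH_RATE) ** (year - TARGET_YEAR)


def doubling_time_years(rate: float) -> float:
    """Years needed for the volume to double at a constant annual rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    for year in range(2016, 2021):
        print(year, round(projected_volume_zb(year), 1), "ZB")
    print("doubling time (years):", round(doubling_time_years(GROWTH_RATE), 2))
```

Under these assumptions the data volume doubles roughly every two years, which is why a 40% annual rate adds up quickly.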
32. Apache Hadoop
• Open-source projects/sub-projects of Apache.
• Core projects
HDFS: Hadoop Distributed File System
MapReduce: Distributed Data processing
...
• Hadoop is not a database.
• Move computation to the data!
• Now: 32% of all enterprises use Apache Hadoop.
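The MapReduce idea named above can be sketched as a word count in plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not the Hadoop API: the function names are hypothetical, and each string in `blocks` stands in for a data block that a real cluster would process on the node that stores it.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(block: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one block."""
    for word in block.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values of one key into a total."""
    return key, sum(values)


def word_count(blocks: List[str]) -> Dict[str, int]:
    """Run map over every block, then shuffle, then reduce per key."""
    mapped = [pair for block in blocks for pair in map_phase(block)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

In this sketch everything runs locally; the point is only that the map step needs nothing beyond its own block, which is what lets a framework run it where the data already lives.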
33. Apache Hadoop History
• 2003 Google File System paper
• 2006 Hadoop subproject created
• 2008 Sort record: running on a 910-node cluster, Hadoop sorted one terabyte in 209 seconds
• 2009 Yahoo runs 17 clusters with 24,000 machines
• 2011 Facebook, LinkedIn, eBay and IBM collectively contribute 200,000
lines of code
Ref: https://en.wikipedia.org/wiki/Apache_Hadoop
34. Apache Hadoop Base Components & Enablers
Ref: http://synerzip.com
35. BI & Visualization example
Ref: http://forums.bsdinsight.com/articles/?page=4
41. The Best Big Data Team should have ...
• Data Hygienists – for clean data
• Data Explorers – discover data to use
• Business Solution Architects – combine data for a use case
• Data Scientists – for the right model
• Campaign Experts – for the best benefit
* From HBR: https://hbr.org/2013/07/five-roles-you-need-on-your-bi