2. LEN CHANG
• MACHINE LEARNING & DATA MINING
• DISTRIBUTED SYSTEMS & NOSQL
• WEB CRAWLING & CHINESE TEXT MINING
• Communication Engineering, General Study - CCU
• Software Engineering, Master Study - NCU
• Pixnet Hackathon 2014 – EXIT MINING
• Pixnet Hackathon 2015 – Spam User Detection
• Taipei Open Data Hackathon 2015
– The relation between Religion and Taipei City
• BI SYSTEM & DATA VISUALIZATION
• FINANCE & EDUCATION & ART & SPORT
• BLIZZARD GAMES PLAYER
3. AGENDA
• A GOOD STORY
• TOOL 1: DATABASE
• TOOL 2: COLLECTION AND REPLICATION
• TOOL 3: VISUALIZATION
• TOOL 4: MACHINE LEARNING
• SAMPLE
• SUMMARY
9. A GOOD STORY TELLS US…
• FIND YOUR “UNIQUE CUSTOMER DATA”.
• USE “CUSTOMER DATA” TO IMPROVE THE “DIGITAL CUSTOMER EXPERIENCE”.
• USE THE “DIGITAL CUSTOMER EXPERIENCE” TO HELP THE ORGANIZATION “MAKE MONEY”.
13. THE PURPOSE IS IMPORTANT
CDC
ETL
SQL
100% accurate answers when I read the report
14. THE PURPOSE IS IMPORTANT
Machine Learning
Real time feedback
Real-time dashboard
Less accurate but faster responses when I need a rough answer
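The trade-off on this slide (a rough answer now instead of an exact answer later) can be illustrated with sampling. This is only a sketch: sampling is one possible way to get a fast approximate answer, not necessarily what the talk has in mind, and `order_amounts`, `exact_mean`, and `approximate_mean` are hypothetical names with made-up data.

```python
import random


def exact_mean(values):
    """Scan every value: always correct, but the cost grows with the data."""
    return sum(values) / len(values)


def approximate_mean(values, sample_size, seed=0):
    """Average a random sample: cheaper, at the price of some error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, sample_size)
    return sum(sample) / sample_size


# Hypothetical data standing in for a large table of order amounts.
order_amounts = list(range(1, 100_001))

exact = exact_mean(order_amounts)                           # reads 100,000 values
rough = approximate_mean(order_amounts, sample_size=1_000)  # reads 1,000 values
```

The rough figure is usually good enough for a real-time dashboard, while a report still needs the exact one.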
15. THE PURPOSE IS IMPORTANT
Machine Learning
Powerful at full-text search, weak at numerical computation.
16. THE PURPOSE IS IMPORTANT
High frequency
Real-time dashboard
To ensure both accuracy and speed, cost isn’t important.
17. DATABASE
• 100% ACCURATE
– RELATIONAL DATABASE
• LESS ACCURATE, FASTER
– HBASE, SPARK, CASSANDRA, MONGODB, OTHERS…
• SPECIAL CASES
– FULL-TEXT SEARCH: ELASTICSEARCH
– ACCURACY AND SPEED: REDIS OR ANOTHER IN-MEMORY DB
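The rule of thumb on this slide, choosing the store by purpose, can be written down as a small decision helper. This is a minimal sketch: the `Requirement` fields and the `suggest_store` function are hypothetical names, and the returned strings simply repeat the slide's categories rather than giving a definitive recommendation.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """What the workload needs (hypothetical flags for illustration)."""
    needs_exact_answers: bool = False
    needs_low_latency: bool = False
    needs_full_text_search: bool = False


def suggest_store(req: Requirement) -> str:
    """Map a purpose to the store category named on the slide."""
    if req.needs_full_text_search:
        return "Elasticsearch"
    if req.needs_exact_answers and req.needs_low_latency:
        # Accuracy and speed together: pay for memory.
        return "Redis or another in-memory DB"
    if req.needs_exact_answers:
        return "relational database"
    # A rough answer is acceptable, so favour speed and scale.
    return "HBase, Spark, Cassandra, MongoDB, or similar"


report = suggest_store(Requirement(needs_exact_answers=True))
dashboard = suggest_store(Requirement(needs_low_latency=True))
```

The order of the checks encodes the priority: a special case (search, or accuracy plus speed) wins over the general accurate-versus-fast split.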
19. Collection: Any Data in, Any Data out
[Diagram: location, mobile pay, loyalty card, social network; a good digital customer experience; BI system, data warehousing, etc.]
23. COMPARISON
FLUENTD
• LANG: RUBY, WITH PERFORMANCE-CRITICAL PARTS IN C
• PLATFORM: LINUX
• MAJOR OUTPUT DB: MONGODB
LOGSTASH
• LANG: JRUBY AND JAVA (RUNS ON THE JVM)
• PLATFORM: LINUX AND WINDOWS
• MAJOR OUTPUT DB: ELASTICSEARCH
• PART OF THE ELK STACK (ELASTICSEARCH, LOGSTASH, KIBANA)
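Collectors such as Fluentd and Logstash are built around pluggable inputs and outputs, which is what makes "any data in, any data out" possible. The sketch below shows that shape in plain Python. It does not use the API of either tool; the `Collector` class, the two parsers, and the sample lines are all hypothetical.

```python
import json


def parse_json_line(line):
    """Input format 1: one JSON object per line."""
    return json.loads(line)


def parse_key_value_line(line):
    """Input format 2: 'user=42 store=taipei' becomes a dict of strings."""
    return dict(pair.split("=", 1) for pair in line.split())


class Collector:
    """Turn lines from any source into tagged events and fan them out."""

    def __init__(self):
        self.outputs = []

    def add_output(self, write):
        # An output is any callable that accepts one event dict.
        self.outputs.append(write)

    def collect(self, lines, parse, tag):
        count = 0
        for line in lines:
            line = line.strip()
            if not line:
                continue
            event = parse(line)
            event["tag"] = tag
            for write in self.outputs:
                write(dict(event))  # each output gets its own copy
            count += 1
        return count


# Two in-memory lists stand in for a data warehouse and a search index.
warehouse, search_index = [], []
collector = Collector()
collector.add_output(warehouse.append)
collector.add_output(search_index.append)

collector.collect(['{"user": 42, "action": "pay"}'], parse_json_line, tag="mobile_pay")
collector.collect(["user=42 store=taipei"], parse_key_value_line, tag="location")
```

Adding a new source means adding a parser, and adding a new destination means adding an output; neither side needs to know about the other.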
24. Replicate: replicate data from DB_A to DB_B
[Diagram: location, mobile pay, loyalty card, social network, transaction DB; BI system, data warehouse, etc.]
• Case 1: RDB to RDB
• Case 2: NoSQL to NoSQL
• Case 3: NoSQL to RDB
ETL: Extract-Transform-Load
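The three ETL steps named on this slide can be sketched with two SQLite databases from the Python standard library, as a stand-in for the RDB-to-RDB case. The table names, column names, and sample rows are assumptions made for illustration; a production job would also need incremental extraction and error handling.

```python
import sqlite3

# Source: a transaction database (DB_A) with hypothetical sample orders.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (customer TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("amy", 1200), ("bob", 450), ("amy", 800)],
)

# Target: a reporting database (DB_B).
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE customer_spend (customer TEXT PRIMARY KEY, total_cents INTEGER)"
)


def extract(conn):
    """Extract: read the raw rows from the source."""
    return conn.execute("SELECT customer, amount_cents FROM orders").fetchall()


def transform(rows):
    """Transform: aggregate spend per customer."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return sorted(totals.items())


def load(conn, rows):
    """Load: write the aggregated rows in a single transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO customer_spend VALUES (?, ?)", rows
        )


load(target, transform(extract(source)))
result = target.execute(
    "SELECT customer, total_cents FROM customer_spend ORDER BY customer"
).fetchall()
```

Because the load uses `INSERT OR REPLACE` on the primary key, running the job again overwrites the totals instead of duplicating rows.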
28. COMPARISON
RDB TO RDB
• TRADITIONAL MECHANISM
• ENSURES “DATA CONSISTENCY”
• FINANCIAL INDUSTRY
NOSQL TO NOSQL
• HUGE DATA ANALYSIS
• LOW-COST HARDWARE, POWERFUL AND FAST COMPUTATION
• NEEDS PROGRAMMING SKILL, NOT ONLY SQL
NOSQL TO RDB
• MAKE AN RDB A NODE OF THE NOSQL CLUSTER
• MAY BE A BALANCE BETWEEN NOSQL AND RDB
Question 1: Why do we feel it is reasonable that a Starbucks latte costs more?
Question 2: Can anyone identify the difference between an ordinary latte and a Starbucks latte?
Question 3: So, what does Starbucks do about this? (Animation)
Starbucks 1.0: The relationship between person and person.
Starbucks 2.0: Give customers a good digital experience.