More from Raul Chong (14)
An Intro to Text Analytics on Big Data with a use case
- 3. #TOSMAC
Twitter's numbers
© 2014 IBM Corporation
As you know:
- 500 million Tweets are sent per day.
- Twitter supports 35+ languages.
- Twitter has 255 million monthly active users.
That is a huge amount of data!
- 4. #TOSMAC
Overview
Section1 Section2 Section3 Section4 Section5
- 9. #TOSMAC
Section2
- 12. #TOSMAC
Extractor: used to extract structured information from unstructured and semi-structured data.
AQL: Annotation Query Language, a rule language with a familiar SQL-like syntax.
- 13. #TOSMAC
Profiler: used to troubleshoot performance problems.
- 14. #TOSMAC
Main concepts
Types of extraction specifications:
- Dictionaries
- Regular expressions
- Part of speech
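As a rough illustration of the first specification type, dictionary matching can be sketched in Python. This is not AQL and not the toolkit's own implementation; the dictionary entries, the function name `match_dictionary`, and the sample sentence are all assumptions made for the example.

```python
import re

# Hypothetical dictionary of unit words; the entries are illustrative only.
UNIT_DICTIONARY = {"million", "thousand", "billion"}


def match_dictionary(text, dictionary):
    """Return (start, end, token) for every word of `text` found in `dictionary`."""
    matches = []
    for m in re.finditer(r"[A-Za-z]+", text):
        # Case-insensitive lookup of each alphabetic token.
        if m.group().lower() in dictionary:
            matches.append((m.start(), m.end(), m.group()))
    return matches


units = match_dictionary("Revenue rose to $7.5 million from $4 thousand.", UNIT_DICTIONARY)
print(units)
```

Each match keeps its character offsets so that later steps can combine it with other annotations found nearby.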
- 20. #TOSMAC
Main concepts
Types of extraction specifications:
- Dictionaries
- Regular expressions
- Part of speech
Numbers: 7.5, 4, 13
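A regular-expression specification for numbers such as those above might look like the following Python sketch. The pattern, the function name `extract_numbers`, and the sample sentence are assumptions for illustration; the deck's own AQL rule is not shown in the source.

```python
import re

# Assumed pattern: an integer with an optional decimal part.
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return (start, end, match) for every number found in `text`."""
    return [(m.start(), m.end(), m.group()) for m in NUMBER_PATTERN.finditer(text)]


numbers = extract_numbers("Sales of 7.5 million, 4 thousand and 13 units.")
print([n[2] for n in numbers])
```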
- 22. #TOSMAC
Main concepts
Types of extraction specifications:
- Dictionaries
- Regular expressions
- Part of speech
- 25. #TOSMAC
AQL Guidelines
Basic feature AQL statements
- Develop the core building blocks of the extractor.
- 26. #TOSMAC
AQL Guidelines
Candidate generation AQL statements
- Combine basic feature AQL statements.
- 27. #TOSMAC
Candidate generation AQL statements
$7.5 million
$4 thousand
$ 7.5 million
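The candidates above can be read as a combination of three basic features: a currency symbol, a number, and a unit word. The Python sketch below is a stand-in for the idea rather than AQL; the three sub-patterns, the optional space after the symbol, and the sample sentence are assumptions chosen so that all three candidates shown on the slide are produced.

```python
import re

# Basic features, each expressed as a small pattern (all hypothetical).
CURRENCY = r"\$"
NUMBER = r"\d+(?:\.\d+)?"
UNIT = r"(?:million|thousand)"

# Candidate rule: currency symbol, optional space, number, one space, unit.
AMOUNT = re.compile(rf"{CURRENCY}\s?{NUMBER}\s{UNIT}")


def amount_candidates(text):
    """Return (start, end, match) for every amount candidate in `text`."""
    return [(m.start(), m.end(), m.group()) for m in AMOUNT.finditer(text)]


text = "Deals of $7.5 million, $4 thousand and $ 7.5 million were announced."
candidates = amount_candidates(text)
print([c[2] for c in candidates])
```

Keeping the basic features as separate named pieces mirrors the guideline: the candidate rule only states how they are combined.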
- 28. #TOSMAC
Candidate generation AQL statements
$7.5 million
$4 thousand
$ 7.5 million
$7.5 million
- 29. #TOSMAC
AQL Guidelines
Filter and consolidate AQL statements
- Refine results
- Remove invalid annotations
- Resolve overlap between annotations.
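The filter and consolidate steps can be sketched in Python as follows. This is an analogy, not AQL: the validity rule (an amount must start with "$"), the longest-span-wins overlap policy, and the sample candidates are assumptions made for the example.

```python
def is_valid(span):
    """Hypothetical filter rule: an amount annotation must start with '$'."""
    return span[2].startswith("$")


def consolidate(spans):
    """Resolve overlap by keeping the longest span of each overlapping group."""
    kept = []
    # Visit the longest spans first; ties are broken by the earliest start.
    for span in sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0])):
        # Keep a span only if it does not overlap anything already kept.
        if all(span[1] <= k[0] or span[0] >= k[1] for k in kept):
            kept.append(span)
    return sorted(kept)


# Candidates as (start, end, text); offsets are made up for illustration.
candidates = [
    (0, 12, "$7.5 million"),
    (0, 4, "$7.5"),          # overlaps the longer candidate above
    (20, 31, "$4 thousand"),
    (40, 42, "13"),          # no currency symbol, so treated as invalid
]

valid = [c for c in candidates if is_valid(c)]
result = consolidate(valid)
print(result)
```

Other overlap policies, such as preferring the leftmost span, are equally plausible; the longest-span choice here is only one option.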
- 31. #TOSMAC
Conclusion
- 33. #TOSMAC
What we have done
Section1 Section2 Section3
- 34. #TOSMAC
What are we going to do?
Section4 Section5
- 36. #TOSMAC
Also using R
1.75 0.32
- 37. #TOSMAC
What are we going to do?