May 2013
From Big Data to Smart Data
Marin Dimitrov - CTO
About Ontotext
• Provides products and services for creating,
managing and exploiting semantic data
– Founded in 2000
– Offices in Bulgaria, USA and UK
• Major clients and industries
– Media & Publishing (BBC, Press Association, EuroMoney,
NDP Nieuwsmedia)
– Healthcare & Life Sciences (AstraZeneca, UCB, NIBIO)
– Cultural Heritage (The British Museum, The National
Archives, Polish National Museum, Dutch Public Library)
– Government (UK Parliament, United Nations FAO, LMI)
From Big Data to Smart Data (Semantic Days 2013), May 2013
Contents
• The Problem with Big Data for BI
• From Big Data to Smart Data
• Success Stories by Ontotext
BIG DATA FOR BUSINESS INTELLIGENCE
The Problem with Big Data for BI
• It’s not only about Volume, Velocity & Variety
• Too much focus on processing speed & storage
volume
• “Brute force” approaches increase the amount of
data processed…
– But not necessarily the Value & insight derived from data
– May lead to even more data quality & inconsistency
problems
– Problems with data visualisation & exploration
– Often do not lead to better decision making
The Problem with Big Data for BI
• BI success is not measured by Volume, Velocity &
Variety, but by more derived Value
• Organisations should learn how to better utilise their
“small data” before targeting Big Data
– Quality over quantity
– Better understanding of the data leads to better decision
making
– Avoid “needle in a haystack” situations
Smart Data for Better BI
• Efficiently analyse unstructured data
– Most of the enterprise data is still unstructured
– Even within structured & transactional data sources there
is a lot of embedded unstructured data
– … and this unstructured data is poorly analysed (if at all), so a lot of potential value remains locked
– (sometimes even within semantic / Linked Data with
insufficient granularity)
Smart Data for Better BI
• Focus on metadata first, Big Data later
– (As opposed to: Big Data first, metadata later)
• Enrich data
• Interlink data
• Provide a common metadata layer
– Break legacy silos
– Align heterogeneous metadata if necessary
• Better analysis of the data, better insight
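The "common metadata layer" idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two silos, their field names, the shared vocabulary in `FIELD_MAP`, and the crude name normalisation in `link_key` are hypothetical and do not come from the talk or from any Ontotext product.

```python
# Hypothetical records from two legacy silos with different field names.
crm_rows = [{"cust_name": "Acme Ltd.", "sector": "Media"}]
billing_rows = [{"company": "ACME LTD", "total_gbp": 1200}]

# Hypothetical mapping from each silo's fields to one shared vocabulary.
FIELD_MAP = {
    "crm": {"cust_name": "name", "sector": "industry"},
    "billing": {"company": "name", "total_gbp": "revenue_gbp"},
}


def align(source, row):
    """Rename a row's fields to the shared vocabulary."""
    return {FIELD_MAP[source][field]: value for field, value in row.items()}


def link_key(name):
    """Crude normalisation used as an interlinking key (illustrative only)."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def interlink(sources):
    """Merge aligned rows that share the same normalised name."""
    merged = {}
    for source, rows in sources.items():
        for row in rows:
            aligned = align(source, row)
            entity = merged.setdefault(link_key(aligned["name"]), {})
            for field, value in aligned.items():
                # Keep the first value seen for each shared field.
                entity.setdefault(field, value)
    return merged


entities = interlink({"crm": crm_rows, "billing": billing_rows})
```

Here the two records end up as a single entity carrying both the industry and the revenue field. A real reconciliation step would need far more robust matching than string normalisation.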
SUCCESS STORIES
UK Job Market Intelligence
• Comprehensive recruitment database for the UK
– 4 million job ads / vacancies (dynamic)
– 220,000 company websites & 700 job boards monitored
• Questions we can answer
– What skills are in demand at present?
– Which are the top job boards in a region?
– Which is the right job board for your industry sector?
– Which are the most active job advertisers / employers?
– Which are the agencies and employers that do not
advertise on your job board?
UK Job Market Intelligence
• Technology stack
– Web mining & focussed crawling
– KB construction from open & proprietary data sources
– Skills taxonomy (based on DISCO)
– Text mining & semantic enrichment
– Reconciliation & interlinking
– BI reporting & dashboards
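As a rough illustration of how a skills taxonomy and text mining could answer "What skills are in demand at present?", the Python sketch below tags job ads against a tiny lookup table and counts the hits. The `SKILLS` table, the sample ads, and the whole-word matching rule are hypothetical; the sketch is not based on DISCO's actual content or on the system described in the talk.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: preferred skill label -> surface forms.
SKILLS = {
    "Java": ["java"],
    "SQL": ["sql", "structured query language"],
    "Project management": ["project management", "project manager"],
}


def tag_skills(text):
    """Return the set of taxonomy skills mentioned in one job ad."""
    lowered = text.lower()
    found = set()
    for skill, forms in SKILLS.items():
        for form in forms:
            # Whole-word match, so "java" does not fire on "javascript".
            if re.search(r"\b" + re.escape(form) + r"\b", lowered):
                found.add(skill)
                break
    return found


def skill_demand(ads):
    """Count in how many ads each skill appears."""
    counts = Counter()
    for ad in ads:
        counts.update(tag_skills(ad))
    return counts


ads = [
    "Senior Java developer with strong SQL skills",
    "Project manager for a data migration programme",
    "JavaScript engineer",
]
demand = skill_demand(ads)
```

Mapping several surface forms to one preferred label is what lets the counts feed a BI report directly, without a separate clean-up pass over free-text skill names.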
Asset Recovery Intelligence System (ARIS)
• Support Financial Intelligence Units in tracking
stolen assets and fighting corruption & money laundering
• Questions we can answer
– What are the reported activities related to a person?
– What is the person’s personal/professional network?
– What corruption cases are reported in regional news?
• Data sources
– News feeds from major news agencies
– Dow Jones data & news feeds
– Suspicious Activity Reports (SARs) submitted to the FIU
– Open data (people & companies, Wikipedia)
Asset Recovery Intelligence System (ARIS)
• Technology stack
– Web Mining
– Text mining & semantic enrichment (KIM)
– ARIS ontology
• People, companies, assets, relations, financial transactions, …
– Reconciliation & Interlinking
– Triplestore (OWLIM)
– Semantic search & exploration UX
– BI reporting / factsheets / alerts
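To make the ontology-plus-triplestore idea concrete, here is a minimal in-memory Python sketch of how a question such as "What is the person's professional network?" could be answered over subject-predicate-object facts. The identifiers, the `directorOf` and `ownerOf` predicates, and the `match` and `network` helpers are hypothetical; they are not the ARIS ontology, and a production system would query a triplestore such as OWLIM rather than a Python set.

```python
# Hypothetical facts, each a (subject, predicate, object) triple.
TRIPLES = {
    ("person:A", "directorOf", "company:X"),
    ("person:B", "directorOf", "company:X"),
    ("person:B", "ownerOf", "asset:Yacht1"),
    ("person:C", "directorOf", "company:Y"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def network(person):
    """People who are directors of at least one company `person` directs."""
    companies = {obj for _, _, obj in match(s=person, p="directorOf")}
    linked = {
        subj
        for company in companies
        for subj, _, _ in match(p="directorOf", o=company)
        if subj != person
    }
    return sorted(linked)


colleagues = network("person:A")
```

The same pattern-matching idea underlies graph queries in a real triplestore, where reconciliation first ensures that mentions of the same person or company in different sources resolve to one identifier.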
Semantic Information Integration & Enrichment
Q & A
Thank you!
@ontotext
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 


From Big Data to Smart Data

  • 1. May 2013 From Big Data to Smart Data Marin Dimitrov - CTO
  • 2. About Ontotext
    • Provides products and services for creating, managing and exploiting semantic data
      – Founded in 2000
      – Offices in Bulgaria, USA and UK
    • Major clients and industries
      – Media & Publishing (BBC, Press Association, EuroMoney, NDP Nieuwsmedia)
      – HCLS (AstraZeneca, UCB, NIBIO)
      – Cultural Heritage (The British Museum, The National Archives, Polish National Museum, Dutch Public Library)
      – Government (UK Parliament, United Nations FAO, LMI)
  • 3. Contents
    • The Problem with Big Data for BI
    • From Big Data to Smart Data
    • Success Stories by Ontotext
  • 4. BIG DATA FOR BUSINESS INTELLIGENCE
  • 5. The Problem with Big Data for BI
  • 6. The Problem with Big Data for BI
    • It’s not only about Volume, Velocity & Variety
    • Too much focus on processing speed & storage volume
    • “Brute force” approaches increase the amount of data processed…
      – But not necessarily the Value & insight derived from data
      – May lead to even more data quality & inconsistency problems
      – Problems with data visualisation & exploration
      – Often do not lead to better decision making
  • 7. The Problem with Big Data for BI
    • BI success is not measured by Volume, Velocity & Variety, but by more derived Value
    • Organisations should learn how to better utilise their “small data” before targeting Big Data
      – Quality over quantity
      – Better understanding of the data leads to better decision making
      – Avoid “needle in a haystack” situations
  • 8. The Problem with Big Data for BI
  • 9. Smart Data for Better BI
    • Efficiently analyse unstructured data
      – Most of the enterprise data is still unstructured
      – Even within structured & transactional data sources there is a lot of embedded unstructured data
      – … and this unstructured data is poorly analysed (if at all) => lots of potential value still remains locked
      – (sometimes even within semantic / Linked Data with insufficient granularity)
  • 10. Smart Data for Better BI
    • Focus on metadata first, Big Data later
      – (As opposed to: Big Data first, metadata later)
    • Enrich data
    • Interlink data
    • Provide a common metadata layer
      – Break legacy silos
      – Align heterogeneous metadata if necessary
    • Better analysis of the data, better insight
  • 11. SUCCESS STORIES
  • 12. UK Job Market Intelligence
    • Comprehensive recruitment database for the UK
      – 4 million job ads / vacancies (dynamic)
      – 220,000 company websites & 700 job boards monitored
    • Questions we can answer
      – What skills are in demand at present?
      – Which are the top job boards in a region?
      – Which is the right job board for your industry sector?
      – Which are the most active job advertisers / employers?
      – Which are the agencies and employers that do not advertise on your job board?
  • 13. UK Job Market Intelligence
  • 14. UK Job Market Intelligence
    • Technology stack
      – Web mining & focussed crawling
      – KB construction from open & proprietary data sources
      – Skills taxonomy (based on DISCO)
      – Text mining & semantic enrichment
      – Reconciliation & interlinking
      – BI reporting & dashboards
  • 15. UK Job Market Intelligence
  • 16. UK Job Market Intelligence
  • 17. UK Job Market Intelligence
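
The slides name a skills taxonomy and text mining as the basis for answering "What skills are in demand at present?" but do not show how. The following is only a minimal sketch of the general idea, assuming a simple dictionary lookup; the tiny `SKILL_TAXONOMY`, the sample ads and all function names are hypothetical and are not taken from DISCO or from Ontotext's pipeline.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: canonical skill -> surface forms seen in job ads.
# Illustrative only; this is NOT the DISCO taxonomy.
SKILL_TAXONOMY = {
    "Java": ["java", "j2ee"],
    "SQL": ["sql", "mysql", "postgresql"],
    "Project management": ["project management", "prince2"],
}


def build_matchers(taxonomy):
    """Compile one case-insensitive pattern per canonical skill."""
    matchers = {}
    for skill, labels in taxonomy.items():
        # Longest labels first so multi-word forms win over shorter ones.
        ordered = sorted(labels, key=len, reverse=True)
        alternatives = "|".join(re.escape(label) for label in ordered)
        # Lookarounds keep "java" from matching inside "javascript".
        matchers[skill] = re.compile(
            r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE
        )
    return matchers


def tag_skills(ad_text, matchers):
    """Return the set of canonical skills mentioned in one job ad."""
    return {skill for skill, pattern in matchers.items() if pattern.search(ad_text)}


def skill_demand(ads, matchers):
    """Count in how many ads each canonical skill appears."""
    counts = Counter()
    for ad in ads:
        counts.update(tag_skills(ad, matchers))
    return counts


if __name__ == "__main__":
    sample_ads = [
        "Senior Java developer with MySQL experience",
        "PRINCE2 certified project manager",
        "JavaScript front-end engineer",
    ]
    for skill, n in skill_demand(sample_ads, build_matchers(SKILL_TAXONOMY)).most_common():
        print(skill, n)
```

Mapping every surface form to one canonical skill is what lets the counts be aggregated across differently worded ads; a production system would need far richer matching than this lookup.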
  • 18. Asset Recovery Intelligence System (ARIS)
    • Supports Financial Intelligence Units in tracking stolen assets and fighting corruption & money laundering
    • Questions we can answer
      – What are the reported activities related to a person?
      – What is the person’s personal/professional network?
      – What corruption cases are reported in regional news?
    • Data sources
      – News feeds from major news agencies
      – Dow Jones data & news feeds
      – SARs to the FIU
      – Open data (people & companies, Wikipedia)
  • 19. Asset Recovery Intelligence System (ARIS)
  • 20. Asset Recovery Intelligence System (ARIS)
    • Technology stack
      – Web Mining
      – Text mining & semantic enrichment (KIM)
      – ARIS ontology
        • People, companies, assets, relations, financial transactions, …
      – Reconciliation & Interlinking
      – Triplestore (OWLIM)
      – Semantic search & exploration UX
      – BI reporting / factsheets / alerts
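
To illustrate how interlinked people, companies and assets can answer "What is the person’s personal/professional network?", here is a rough sketch using a toy in-memory triple store. It is an assumption-laden illustration: the `TripleStore` class, the predicate names and the sample entities are all hypothetical, and it does not use or imitate the OWLIM API or the actual ARIS ontology.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        # entity -> set of (predicate, other entity), indexed in both directions
        self._links = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._links[subject].add((predicate, obj))
        self._links[obj].add((predicate, subject))

    def connected(self, start, max_hops=2):
        """Entities reachable from `start` within `max_hops` links,
        ignoring link direction (breadth-first traversal)."""
        seen = {start}
        frontier = {start}
        for _ in range(max_hops):
            next_frontier = set()
            for node in frontier:
                for _predicate, other in self._links.get(node, ()):
                    if other not in seen:
                        seen.add(other)
                        next_frontier.add(other)
            frontier = next_frontier
        seen.discard(start)
        return seen


if __name__ == "__main__":
    store = TripleStore()
    # Hypothetical facts, as they might come out of enrichment and interlinking.
    store.add("person:A", "directorOf", "company:X")
    store.add("person:B", "shareholderOf", "company:X")
    store.add("person:B", "relativeOf", "person:C")
    store.add("company:X", "owns", "asset:Yacht")

    print(sorted(store.connected("person:A", max_hops=1)))
    print(sorted(store.connected("person:A", max_hops=2)))
```

Once reconciliation has mapped different mentions of the same person or company to one identifier, a network question reduces to a bounded graph traversal like the one above; a real triplestore would express it as a query instead.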
  • 21. Semantic Information Integration & Enrichment
  • 22. Q & A Thank you! @ontotext