SlideShare ist ein Scribd-Unternehmen logo
1 von 23
Downloaden Sie, um offline zu lesen
Towards a Big Data Taxonomy
Bill Mandrick, PhD
Data Tactics
Version 26_August_2013
Scientific Taxonomies Represent
• Types of Processes
• Types of Objects
– Physical Objects
– Information Artifacts
• Types of Characteristics
– Qualities
– Roles
• Relationships
– Between Processes
– Between Objects
– Between Characteristics
2
Big Data Taxonomy
• Big Data Related Processes
• Big Data Characteristics
• Big Data Information Artifacts
• Big Data Information Bearers
• Relationships between Big Data Elements
• Mapping Instances to the Taxonomy
• Creating Situational Awareness
3
Relations Between Processes
• Processes A <relation> Processes B
– Complex Process <has part> Sub-Process
– Sub-Process <part of> Complex Process
– Process A <precedes> Process B
– Process A <follows> Process B
Examples:
Data Curation Process <has part> Data Selection Process
Data Curation Process <has part> Data Collection Process
Data Curation Process <has part> Data Archiving Process
4
Information Artifact Lifecycle Processes
• Collecting
• Curating
• Representing
• Storing
– Cluster Storing
• Managing
– Processing
• Distributed Processing
– Map Reduce
• Analyzing
– Data Mining
– Causal Analysis
– Probabilistic Analysis
– Correlation Analysis
• Data Collection Process
• Data Curation Process
• Data Representation Process
• Data Storing Process
– Cluster Storing Process
• Data Management Process
– Processing
• Distributed Data Process
– Map Reduce Process
• Data Analytics Process
– Data Mining Process
– Causal Analysis Process
– Probabilistic Analysis Process
– Correlation Analysis Process
Common Labels Taxonomy Labels
5
Big Data Processes
6
Big Data Processes can be
decomposed and related to
other (sub)processes
…as well as to their outputs
(Information Artifacts).
Relating Processes to Products
7
Big Data Information Artifacts
8
9
10
Information Content Entities
11
Use Case
Data Characteristics
12
Information Bearers
13
Partial Taxonomy
14
Human Genome Data
15
Terms from Human Genome Data Use Case
Use Case Term:
Genomic Measurements
Reference Materials
Reference Data
Reference Methods
Assess Performance
Genome Sequencing
Integrate Data
Sequencing Technologies
Sequencing Methods
Characterization
Whole Human Genomes
Assess Performance
Genome Sequencing Run
Computer System
Storage
Networking
Processing
Software
Open Source Sequencing Bioinformatics Software
Data Source
Sequencer
Volume
Variety
Variability
Veracity
Visualization
Data Quality
Data Types
Data Analytics
Taxonomical Term:
Genomic Measurement Result (Measurement Result)
Reference Material Role
Reference Data Role
Reference Method
Performance Assessment Process
Genome Sequencing Process
Data Integration Process
Data Sequencing Technology (Tool)
Sequencing Method (Process)
Characterization (Data Characterization, IA or ICE)
Whole Human Genome Characterization (IA or ICE?)
Performance Assessment Process
Genome Sequencing Run
Computer System
Data Storage Process
Computer Networking Process
Data Processing Process
Software (IAO placement?)
Bioinformatics Sequencing Software
Data Source Role
Sequencer
Data Volume (Characteristic)
Data Variety (Characteristic)
Data Variability (Characteristic)
Data Veracity (Characteristic)
Data Visualization Process
Data Quality (Characteristic)
Data Type
Data Analytics Process
16
Information Artifacts:
Human Genome Data Measurement Result
Characterization (Data Characterization, IA or ICE)
Whole Human Genome Characterization (IA or ICE?)
Performance Assessment
Genome Sequence
Software (IAO placement?)
Data Visualization
Processes:
Human Genome Data Measurement Process
Reference Method
Performance Assessment Process
Genome Sequencing Process
Data Integration Process
Sequencing Method (Process)
Data Characterization Process
Performance Assessment Process
Genome Sequencing Run
Data Storage Process
Computer Networking Process
Data Processing Process
Data Visualization Process
Data Analytics Process
Roles and Characteristics:
Reference Material Role
Reference Data Role
Data Source Role
Data Volume (Characteristic)
Data Variety (Characteristic)
Data Variability (Characteristic)
Data Veracity (Characteristic)
Data Visualization Process
Data Quality (Characteristic)
Artifacts/Tools:
Data Sequencing Technology (Tool)
Computer System
Computer Network
Software (IAO placement?)
Bioinformatics Sequencing Software
Sequencer
17
Terms from Human Genome Data Use Case
Genomic Research Organizations
18
Instances
DNA Data Sets
19
Instances
DNA Organizational Roles
20Instances
Agent Roles
21
DNA Visualization
22
Instances
Conclusion
• This method can be done for any part of the
Big Data Taxonomy
• Need SME input for various areas/domains
• Need to add definitions in owl
• Need to expand set of standardized relations
• Link instances to the taxonomy (e.g. actual
data sets, organizations, etc.)
23

Weitere ähnliche Inhalte

Was ist angesagt?

Reading Group: From Database to Dataspaces
Reading Group: From Database to DataspacesReading Group: From Database to Dataspaces
Reading Group: From Database to Dataspaces
Jürgen Umbrich
 
Data mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAPData mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAP
jaya lakshmi
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland Project
Alasdair Gray
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
suganmca14
 
Data pre processing
Data pre processingData pre processing
Data pre processing
pommurajopt
 

Was ist angesagt? (19)

Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
 
Data mining techniques unit 2
Data mining techniques unit 2Data mining techniques unit 2
Data mining techniques unit 2
 
Reading Group: From Database to Dataspaces
Reading Group: From Database to DataspacesReading Group: From Database to Dataspaces
Reading Group: From Database to Dataspaces
 
Role of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big DataRole of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big Data
 
Data mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAPData mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAP
 
Discovery informaticsstanton
Discovery informaticsstantonDiscovery informaticsstanton
Discovery informaticsstanton
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
 
Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
 
Data mining
Data miningData mining
Data mining
 
TAIR ICAR 2010 Presentation
TAIR ICAR 2010 PresentationTAIR ICAR 2010 Presentation
TAIR ICAR 2010 Presentation
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland Project
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data pre processing
Data pre processingData pre processing
Data pre processing
 
Resume xiaodan(vinci)
Resume xiaodan(vinci)Resume xiaodan(vinci)
Resume xiaodan(vinci)
 
Data Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | EdurekaData Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | Edureka
 
Introduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, MembersIntroduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, Members
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 

Andere mochten auch

Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
William LaPorte
 
Data Grid Taxonomies
Data Grid TaxonomiesData Grid Taxonomies
Data Grid Taxonomies
awesomesos
 
A comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesA comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notes
João Gabriel Lima
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
Open Data Support
 

Andere mochten auch (12)

Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
 
Data Grid Taxonomies
Data Grid TaxonomiesData Grid Taxonomies
Data Grid Taxonomies
 
Topic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and StorageTopic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and Storage
 
Global taxonomy initiative ppt
Global taxonomy initiative  pptGlobal taxonomy initiative  ppt
Global taxonomy initiative ppt
 
A comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesA comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notes
 
Taxonomy 101
Taxonomy 101Taxonomy 101
Taxonomy 101
 
Successful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata DesignSuccessful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata Design
 
Taxonomy And Metadata
Taxonomy And MetadataTaxonomy And Metadata
Taxonomy And Metadata
 
Enterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and MethodologyEnterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and Methodology
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information Architecture
 
Database mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomyDatabase mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomy
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 

Ähnlich wie Big Data Taxonomy 8/26/2013

Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
bhagathk
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
dataminers.ir
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
Phi Jack
 

Ähnlich wie Big Data Taxonomy 8/26/2013 (20)

Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
 
data mining
data miningdata mining
data mining
 
Data Mining Application and Trends
Data Mining Application and TrendsData Mining Application and Trends
Data Mining Application and Trends
 
Data mininng trends
Data mininng trendsData mininng trends
Data mininng trends
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
 
Introduction to data warehouse
Introduction to data warehouseIntroduction to data warehouse
Introduction to data warehouse
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalface
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt
 
Chaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberChaper 13 trend, Han & Kamber
Chaper 13 trend, Han & Kamber
 
Dma unit 1
Dma unit   1Dma unit   1
Dma unit 1
 
13 trend
13 trend13 trend
13 trend
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introduction
 
Data analytics, a (short) tour
Data analytics, a (short) tourData analytics, a (short) tour
Data analytics, a (short) tour
 
Provinance in scientific workflows in e science
Provinance in scientific workflows in e scienceProvinance in scientific workflows in e science
Provinance in scientific workflows in e science
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 

Mehr von DataTactics

Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics Practice
DataTactics
 
Big Data Conference
Big Data ConferenceBig Data Conference
Big Data Conference
DataTactics
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities Demo
DataTactics
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag
DataTactics
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and Reports
DataTactics
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and Description
DataTactics
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
DataTactics
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence Data
DataTactics
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2
DataTactics
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013
DataTactics
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief Analytics
DataTactics
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
DataTactics
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1
DataTactics
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3
DataTactics
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source Brief
DataTactics
 

Mehr von DataTactics (20)

NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATANETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
 
C Star Analytic Presentation
C Star Analytic PresentationC Star Analytic Presentation
C Star Analytic Presentation
 
Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics Practice
 
Big Data Conference
Big Data ConferenceBig Data Conference
Big Data Conference
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities Demo
 
DLISA
DLISADLISA
DLISA
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and Reports
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and Description
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence Data
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief Analytics
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source Brief
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Big Data Taxonomy 8/26/2013

  • 1. Towards a Big Data Taxonomy Bill Mandrick, PhD Data Tactics Version 26_August_2013
  • 2. Scientific Taxonomies Represent • Types of Processes • Types of Objects – Physical Objects – Information Artifacts • Types of Characteristics – Qualities – Roles • Relationships – Between Processes – Between Objects – Between Characteristics 2
  • 3. Big Data Taxonomy • Big Data Related Processes • Big Data Characteristics • Big Data Information Artifacts • Big Data Information Bearers • Relationships between Big Data Elements • Mapping Instances to the Taxonomy • Creating Situational Awareness 3
  • 4. Relations Between Processes • Processes A <relation> Processes B – Complex Process <has part> Sub-Process – Sub-Process <part of> Complex Process – Process A <precedes> Process B – Process A <follows> Process B Examples: Data Curation Process <has part> Data Selection Process Data Curation Process <has part> Data Collection Process Data Curation Process <has part> Data Archiving Process 4
  • 5. Information Artifact Lifecycle Processes • Collecting • Curating • Representing • Storing – Cluster Storing • Managing – Processing • Distributed Processing – Map Reduce • Analyzing – Data Mining – Causal Analysis – Probabilistic Analysis – Correlation Analysis • Data Collection Process • Data Curation Process • Data Representation Process • Data Storing Process – Cluster Storing Process • Data Management Process – Processing • Distributed Data Process – Map Reduce Process • Data Analytics Process – Data Mining Process – Causal Analysis Process – Probabilistic Analysis Process – Correlation Analysis Process Common Labels Taxonomy Labels 5
  • 6. Big Data Processes 6 Big Data Processes can be decomposed and related to other (sub)processes …as well as to their outputs (Information Artifacts).
  • 8. Big Data Information Artifacts 8
  • 9. 9
  • 10. 10
  • 16. Terms from Human Genome Data Use Case Use Case Term: Genomic Measurements Reference Materials Reference Data Reference Methods Assess Performance Genome Sequencing Integrate Data Sequencing Technologies Sequencing Methods Characterization Whole Human Genomes Assess Performance Genome Sequencing Run Computer System Storage Networking Processing Software Open Source Sequencing Bioinformatics Software Data Source Sequencer Volume Variety Variability Veracity Visualization Data Quality Data Types Data Analytics Taxonomical Term: Genomic Measurement Result (Measurement Result) Reference Material Role Reference Data Role Reference Method Performance Assessment Process Genome Sequencing Process Data Integration Process Data Sequencing Technology (Tool) Sequencing Method (Process) Characterization (Data Characterization, IA or ICE) Whole Human Genome Characterization (IA or ICE?) Performance Assessment Process Genome Sequencing Run Computer System Data Storage Process Computer Networking Process Data Processing Process Software (IAO placement?) Bioinformatics Sequencing Software Data Source Role Sequencer Data Volume (Characteristic) Data Variety (Characteristic) Data Variability (Characteristic) Data Veracity (Characteristic) Data Visualization Process Data Quality (Characteristic) Data Type Data Analytics Process 16
  • 17. Information Artifacts: Human Genome Data Measurement Result Characterization (Data Characterization, IA or ICE) Whole Human Genome Characterization (IA or ICE?) Performance Assessment Genome Sequence Software (IAO placement?) Data Visualization Processes: Human Genome Data Measurement Process Reference Method Performance Assessment Process Genome Sequencing Process Data Integration Process Sequencing Method (Process) Data Characterization Process Performance Assessment Process Genome Sequencing Run Data Storage Process Computer Networking Process Data Processing Process Data Visualization Process Data Analytics Process Roles and Characteristics: Reference Material Role Reference Data Role Data Source Role Data Volume (Characteristic) Data Variety (Characteristic) Data Variability (Characteristic) Data Veracity (Characteristic) Data Visualization Process Data Quality (Characteristic) Artifacts/Tools: Data Sequencing Technology (Tool) Computer System Computer Network Software (IAO placement?) Bioinformatics Sequencing Software Sequencer 17 Terms from Human Genome Data Use Case
  • 23. Conclusion • This method can be done for any part of the Big Data Taxonomy • Need SME input for various areas/domains • Need to add definitions in owl • Need to expand set of standardized relations • Link instances to the taxonomy (e.g. actual data sets, organizations, etc.) 23