Ontology2 platform

Paul Houle, Founder Ontology2
Bill Freeman, President KMSolutions
(774) 301-1301

OUR PLATFORM
For organizations handling complex, heterogeneous big data from
a large number of sources: structured, unstructured, and
semistructured.
We rapidly (in terms of both computer time and configuration time)
combine, curate, and index your data, in batch and in real time.
Based on our experience with Freebase (the basis for the Google
Knowledge Graph), we combine Hadoop technology with SQL and
NoSQL databases on next-generation cloud technology.
Focus: quality, usability, cross-domain integration and inference,
standards-driven interoperability, open-source components

Current State as we understand it
Technical: Need for extreme agility
• High-quality, curated data is important
• Limited by MySQL speed/scalability (and slow schema changes because of row store)
• Difficulty of handling taxonomy/ontology/schema changes
• Dealing with data loss and broken inter-concept links caused by changes
• Difficulty of linking entities between silos; inability to infer accurate, high-quality relationships
between collections
• Need for clean, normalized data for input to machine learning algorithms
• Need ability to manage spatial and temporal data
• To keep up with the competition: changes must be easy to make and fast to implement
• Need for data typing beyond SQL (currency, length, time interval, etc.) to support inference and
user interfaces
• Infrastructure built ad-hoc is difficult to document, maintain, expand
Business Challenges
• To be discussed

Benefits from cloud-native Infovore™ platform
Index construction does not interfere with user-facing real-time services
Development, test, and staging do not interfere with production
Batch jobs don't interfere with interactive services

Next Generation Cloud
Hardware Virtualization
• Near Bare Metal Performance
SSD Drives
• Incredible Speed
• Predictable Response Time
Hybrid cloud
• Take advantage of competition between cloud providers
• Use existing on-premises capacity; control physical security, flexible options

Hadoop Powered Index Construction
Files
Databases
Hadoop Mappers
Hadoop Reducers
We deliver the exact data required by your index builders, partitioned, sorted, and filtered for maximum efficiency.
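As a rough illustration of what "partitioned, sorted and filtered" can mean for an index builder, here is a minimal single-process sketch in plain Python. It is not the platform's code and does not use Hadoop; the record fields (`id`, `lang`), the `keep` predicate, and the function names are all assumptions made for the example.

```python
from collections import defaultdict


def map_record(record):
    """Mapper step: emit (key, value) pairs keyed by the record's id (hypothetical field)."""
    yield record["id"], record


def partition_for(key, num_partitions):
    """Assign a key to a partition with a simple, stable byte-sum hash."""
    return sum(key.encode("utf-8")) % num_partitions


def build_partitions(records, num_partitions, keep):
    """Filter records, route them to partitions, and sort each partition by key."""
    partitions = defaultdict(list)
    for record in records:
        if not keep(record):  # filter: drop data the index builder does not need
            continue
        for key, value in map_record(record):
            partitions[partition_for(key, num_partitions)].append((key, value))
    # Sort within each partition so a builder can consume its input sequentially.
    return {p: sorted(pairs, key=lambda kv: kv[0]) for p, pairs in partitions.items()}


if __name__ == "__main__":
    sample = [
        {"id": "b", "lang": "en"},
        {"id": "a", "lang": "en"},
        {"id": "c", "lang": "de"},
    ]
    result = build_partitions(sample, 2, keep=lambda r: r["lang"] == "en")
    for part, pairs in sorted(result.items()):
        print(part, [key for key, _ in pairs])
```

In a real Hadoop job the partitioning and sorting are done by the framework's shuffle phase; the sketch only shows the shape of the data each builder would receive.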

Index Construction in Hybrid Cloud
New Index Construction Never Conflicts With Production
time
Old index (multiple copies for throughput & availability)
Source data
Test
Clone
New Index
Terminate and recover resources

Batch Index plus Real-Time Index
Effortless and efficient scalability
Message Queue
Bulk Data
time-stamped master data
small real-time index
large bulk index
merger
RESULTS
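One way to picture the merger of a large bulk index with a small real-time index is a lookup that consults both and returns the most recently time-stamped record. The sketch below is only an assumption about how such a merge could work: the dictionary-based indexes, the `timestamp` field, and the `merge_lookup` name are hypothetical.

```python
def merge_lookup(bulk_index, realtime_index, key):
    """Return the freshest record for `key` across both indexes, or None if absent.

    Each index maps a key to a record carrying a numeric `timestamp` (hypothetical field).
    On a timestamp tie the bulk record is returned, because it is listed first.
    """
    candidates = [index[key] for index in (bulk_index, realtime_index) if key in index]
    if not candidates:
        return None
    return max(candidates, key=lambda record: record["timestamp"])


if __name__ == "__main__":
    bulk = {
        "acme": {"name": "Acme Corp", "timestamp": 100},
        "globex": {"name": "Globex", "timestamp": 100},
    }
    realtime = {
        "acme": {"name": "Acme Corporation", "timestamp": 250},
        "initech": {"name": "Initech", "timestamp": 260},
    }
    for key in ("acme", "globex", "initech", "missing"):
        print(key, merge_lookup(bulk, realtime, key))
```

When the bulk index is rebuilt from newer master data, the real-time index can be trimmed to entries newer than that build, which keeps it small.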

New approach to data management
A FRAMEWORK FOR DATA QUALITY
Multiple sources of instance data
Facts
classifications
Reference data…
Examples
Test Data
Training Data
Requirements
Quality metrics

WE DELIVER FAST CYCLE TIME
HYBRID CLOUD: No waiting for hardware
PARALLEL DATA PROCESSING: Handle large data sets quickly
DEVOPS AUTOMATION: Little system administration overhead
EFFICIENT DATA REPRESENTATION: Rapid turnaround, low hardware cost
COMPETITIVE ADVANTAGE
MINIMIZE WASTED CYCLES
automation eliminates errors
MINIMIZE TIME AROUND CYCLE

Ontology2 Spatial Hierarchy
Freebase data enriched for Language+Contextual Performance
Global coverage
30+ languages
250 countries
36,000 regions
1.5M names
400,000 cities & towns
8M names
Large alternative name bank + hierarchical constraint =
• Resolution of jurisdictions in international business listings
• Resolution of place names in free text
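The idea of combining an alternative name bank with a hierarchical constraint can be sketched as follows: look up every place that a name may refer to, then keep only the candidates that sit inside the stated containing region. The tiny gazetteer, its identifiers, and the `resolve` function are hypothetical toy data for illustration, not the Ontology2 Spatial Hierarchy itself.

```python
# Hypothetical toy gazetteer: place id -> (canonical name, parent place id or None)
PLACES = {
    "fr": ("France", None),
    "us": ("United States", None),
    "us-tx": ("Texas", "us"),
    "paris-fr": ("Paris", "fr"),
    "paris-tx": ("Paris", "us-tx"),
}

# Hypothetical alternative name bank: lowercase name -> candidate place ids
ALT_NAMES = {
    "paris": ["paris-fr", "paris-tx"],
    "france": ["fr"],
    "frankreich": ["fr"],
    "texas": ["us-tx"],
    "tx": ["us-tx"],
}


def ancestors(place_id):
    """Return the chain of containing places, nearest first."""
    chain = []
    parent = PLACES[place_id][1]
    while parent is not None:
        chain.append(parent)
        parent = PLACES[parent][1]
    return chain


def resolve(name, container_name):
    """Resolve `name` to the candidates that lie inside a place named `container_name`."""
    candidates = ALT_NAMES.get(name.lower(), [])
    containers = set(ALT_NAMES.get(container_name.lower(), []))
    return [c for c in candidates if containers & set(ancestors(c))]


if __name__ == "__main__":
    print(resolve("Paris", "TX"))
    print(resolve("Paris", "Frankreich"))
```

The ambiguous name "Paris" yields two candidates on its own; the containing region, given in any known language or abbreviation, narrows it to one.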

Extensive Graph-Based Schema
META-MODEL SYSTEMATICALLY DESCRIBES PROCESSES AND THINGS
RDFS: types + properties
XML SCHEMA: data types
EXTENDED: data types
DECLARATIVE MAPPINGS: CSV, RDBMS, XML, …
DECLARATIVE HINTS: formatting, editing, …
LINGUISTIC + CONTEXTUAL Knowledge Representation
SOLVES ISSUES, SEE SLIDE 3!

Compiled representation
databases
COMMON TEXT FORMATS: CSV, XML, JSON, RDF
FAST BINARY FORMATS: THRIFT, AVRO, PROTOCOL BUFFERS

RAW DATA
Event-driven real-time pipeline
applications
MERGED PRODUCTION INDEX
batch pipeline
MODEL-DRIVEN ARCHITECTURE
HANDLING CONTENT AND DATA WITH CONTEXTUAL UNDERSTANDING

SUMMARY
For organizations handling complex, heterogeneous big data from
a large number of sources: structured, unstructured, and
semistructured.
We rapidly (in terms of both computer time and configuration time)
combine, curate, and index your data, in batch and in real time.
Based on our experience with Freebase (the basis for the Google
Knowledge Graph), we combine Hadoop technology with SQL and
NoSQL databases on next-generation cloud technology.
Focus: quality, usability, cross-domain integration and inference,
standards-driven interoperability, open-source components
Bill Freeman, President KMSolutions
william.freeman3@outlook.com (774) 301-1301
