1 FOSS Asia 2010
The State of the Engine
➔
Brief Technology Overview
➔
New SQL Parser (lemon/quex)
➔
User Defined Functions
➔
BlackRay as a storage engine
➔
Outlook: Realtime Data Updates
Brief BlackRay History
What is BlackRay?
●
BlackRay is a relational, in-memory database
●
Supports SQL, utilizes PostgreSQL drivers
●
Fulltext (Tokenized) Search in Text fields
●
Object-Oriented API Support
●
Persistence via Files, Transaction support
●
Scalable and Fault Tolerant
●
Open Source, Open Community
●
Available under the GPLv2
Current Release
Current 0.10.0 – Released December 2009
●
Complete rewrite of SQL Parser (boost::spirit2)
●
PostgreSQL client compatibility (via network protocol) to
allow JDBC/ODBC... via PostgreSQL driver
●
Rewritten CLI tools
●
Major bugfixes (potential memory leaks)
●
Better Authentication support for Instances
Technology Overview
Why call it Data Engine?
●
BlackRay is a hybrid between a relational database
and a search engine, thus we call it a „data engine“
●
Database features:
●
Relational structure, with Join between tables
●
Wildcards and index functions
●
SQL and JDBC/ODBC
●
Search Engine Features
●
Fulltext retrieval (token index)
●
Phonetic and similar approximation search
●
Extremely low latency
BlackRay Architecture
[Diagram: BlackRay architecture. Components shown: C++, Java, Python, PHP and C# APIs; SQL Interface for Postgres* Clients; Management Server; Instance Server; Data Universe (RAM resident) with the 5-Perspective Index (L1: Dictionary, L2: Postings, L3: Row Index, L4: Multi-Tokens, L5: Multi-Values); Redo Log; Snapshots]
Data Universe
●
BlackRay features a 5-Perspective Index
●
Layer 1: Dictionary
●
Layer 2: Postings
●
Layer 3: Row Index
●
Layer 4: Multi-Token Layer
●
Layer 5: Multi-Value Layer
●
Layers 1 and 2 comprise a fully inverted Index
●
Statistics in this Index are used for Query Plan Building
●
All data - index and raw output - are held in memory
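The first two layers can be pictured with a small sketch. This Python is only an illustration under stated assumptions, not BlackRay's implementation: the names `build_inverted_index` and `match_count`, the whitespace tokenizer and the in-memory layout are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Build a toy two-layer inverted index for one text column.

    Returns (dictionary, postings):
      dictionary: sorted list of distinct tokens (stands in for Layer 1)
      postings:   token -> ascending list of row ids (stands in for Layer 2)
    """
    postings = defaultdict(list)
    for row_id, text in enumerate(rows):
        # Assumption: naive lower-case whitespace tokenizer.
        for token in set(text.lower().split()):
            postings[token].append(row_id)
    dictionary = sorted(postings)
    return dictionary, dict(postings)


def match_count(postings, token):
    """Rows containing the token: the kind of statistic a planner can use."""
    return len(postings.get(token, []))


rows = ["black ray engine", "data engine", "black box"]
dictionary, postings = build_inverted_index(rows)
print(dictionary)
print(postings["engine"])
print(match_count(postings, "black"))
```

Because the posting list length is known without touching any row data, a planner can estimate selectivity per token before it evaluates anything.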
Core BlackRay Features
●
Standard loaders enable high performance loading
of data into tables
●
Persistence is done via file based snapshots
●
Snapshots enable data versioning and simple
backups
●
Basic ACID Transaction compliance is implemented
in BlackRay, without crash recovery support.
Query Interfaces
●
BlackRay implements the PostgreSQL server socket
interface and binary APIs in Java, C++ and Python
●
PostgreSQL compatible drivers can be utilized
against BlackRay (JDBC/ODBC)
●
Native API enables object oriented data access
●
Performance of native APIs currently is substantially
better than SQL via PostgreSQL drivers
●
Dynamic query building is very efficient with native
APIs
A New SQL Parser
A New SQL Parser - again?
●
The 0.10 release included a much improved SQL
parser, built with boost::spirit
●
Quite solid, fast and simple to use
●
However, boost deprecates spirit1
●
boost::spirit2 is not compatible with spirit1, requiring a
rewrite anyway
●
Our impression: spirit2 requires too many resources
and large grammars result in huge generated files
●
Also: spirit and C++ templates do not mix well
What would be a better choice?
●
Flex/Bison:
●
The obvious choice of MySQL and PostgreSQL
●
Two-step compile process, generates C not C++
●
No Unicode support
●
ANTLR:
●
Odd grammar rules, not optimal for C++
●
Recursive Descent parsers are not suited for SQL
●
Lemon/QUEX
●
Our new choice ;)
Lemon/Quex: Our experience
●
Lemon:
●
Lemon is part of SQLite
●
Much more intuitive syntax than Flex syntax
●
Quex:
●
Generates tokenizers in C++
●
Unicode and external Parser support
●
Partially buggy, but all issues were fixed within days
●
Synopsis: Lemon/Quex are like Bison/Flex, just with
Unicode and C++ support and maybe easier to
debug
Current Progress
●
Basic SQL Features are ported from spirit to
Lemon/Quex
●
The „issue-77“ branch contains all recent SQL
parser code
●
Unit-Testing and Database level testing very solid
●
Will be part of the 0.11 release
Recent Additions
●
Support for simple (single column) User Defined
Functions is now complete
●
Query portion (no subselect, no aggregate
functions) is very stable
●
Data Definition Language was added recently
●
CREATE SCHEMA
●
CREATE TABLE
●
ALTER TABLE
●
Index is created dynamically, so no CREATE INDEX
required
User Defined Functions
User Defined Functions
●
BlackRay was designed with support for Index
functions that operate on data in tables
●
Functions pre-compute index results, improving
speed and enabling queries that are not possible
otherwise
●
Functions are called on data load, and also on
queries.
●
Functions must not maintain state outside of tables
of the same instance they operate on.
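A minimal sketch of a function-backed index, assuming a stateless function that is applied once per value at load time and once to the search term at query time. `simple_phonetic` and `FunctionIndex` are hypothetical names for this illustration; the phonetic rule is deliberately crude and is not BlackRay's PHONETIC function.

```python
from collections import defaultdict

VOWELS = set("aeiouy")


def simple_phonetic(value):
    """Hypothetical phonetic key: first letter plus later consonants, no repeats."""
    letters = [c for c in value.lower() if c.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    for c in letters[1:]:
        if c not in VOWELS and c != key[-1]:
            key.append(c)
    return "".join(key)


class FunctionIndex:
    """Toy index over fn(column value), filled once at load time."""

    def __init__(self, fn):
        self.fn = fn
        self.rows_by_key = defaultdict(list)

    def load(self, values):
        # The function runs on every value while the data is loaded ...
        for row_id, value in enumerate(values):
            self.rows_by_key[self.fn(value)].append(row_id)

    def lookup(self, literal):
        # ... and again on the search term when a query arrives.
        return self.rows_by_key.get(self.fn(literal), [])


names = ["Maier", "Meyer", "Mike", "Miller"]
index = FunctionIndex(simple_phonetic)
index.load(names)
print([names[i] for i in index.lookup("mayr")])
```

The function keeps no state of its own, so the same input always maps to the same key; that is what makes the pre-computed results reusable at query time.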
A Sample Function
●
Using functions in BlackRay
SELECT name FROM employee_table WHERE
fx_phonetic (name) = 'mike';
●
Functions need to be loaded beforehand:
CREATE FUNCTION fx_phonetic(varchar,
varchar) RETURNS int AS
'DIRECTORY/funcs', 'phonetic' ;
●
The function must implement the BlackRay default
function signature, which is almost identical to the
MySQL and PGSQL signatures
Current State
●
User Defined Function Repository fully implemented
●
All built-in functions ported to be compatible to User
Defined Functions
●
SQL support for User Defined Functions under way
●
Will be part of the 0.11 release
Our Adventure:
BlackRay As A Storage Engine
Why even bother?
●
In Fall 2009, we embarked on a little adventure to
implement BlackRay as a storage engine
●
The old Engine had only a minimal SQL interface
and we lacked the expertise to build it ourselves
●
Plugging into the MySQL ecosystem seemed like a
very pleasant choice
●
The features of BlackRay would make it a good
query cache for large disk tables.....
Our First Problem
BlackRay does not support a simple table scan.....
●
It may seem strange, but due to its design as an in-memory
index, we do not separate table and index
●
Each column index basically is the data of the column
●
BlackRay distinguishes select and output columns, both
of which remain in RAM
●
The index therefore was never designed to be forced
back into a row format, for a simple table walk
Possible Solution?
So, we can walk the Index instead?
●
Rather than scanning the table, it is possible to scan the
index instead
●
This only works for the columns marked „searchable“
●
Causes nasty errors when trying to select against result-
only columns
●
In tokenized index columns, getting the data back out
means concatenation with a blank between values – not
nice, as tokenizing can follow complex rules
●
Requires Refactoring of our Layer 3 (Row-Index)
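The concatenation problem can be shown with a toy positional index. This is a sketch under assumptions: the tokenizer (lower-case, punctuation dropped) and the names `tokenize`, `index_column` and `walk_index` are hypothetical and only stand in for whatever rules a real tokenized column applies.

```python
import re


def tokenize(text):
    """Assumed tokenizer: lower-case alphanumeric runs, punctuation dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_column(values):
    """Toy positional index: token -> list of (row_id, position)."""
    index = {}
    for row_id, text in enumerate(values):
        for pos, token in enumerate(tokenize(text)):
            index.setdefault(token, []).append((row_id, pos))
    return index


def walk_index(index, row_count):
    """Rebuild each row by joining its tokens with a blank, in position order."""
    rows = [[] for _ in range(row_count)]
    for token, hits in index.items():
        for row_id, pos in hits:
            rows[row_id].append((pos, token))
    return [" ".join(tok for _, tok in sorted(r)) for r in rows]


original = ["Hello, World!", "in-memory index"]
rebuilt = walk_index(index_column(original), len(original))
print(rebuilt)
print(rebuilt == original)
```

Case, punctuation and the hyphen are gone after the round trip, which is why walking a tokenized index cannot stand in for a true table scan.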
Next Issue
Optimizing Queries
●
The BlackRay Optimizer uses the Layer 1/2 (Inverted
Index) and Layer 4 (Multi-Tokens) Data to choose a Query
Path
●
In BlackRay „SELECT text FROM t WHERE text LIKE
'*pattern1*' AND text LIKE '*pattern2*'“ is extremely
efficient as the inverted index has all the data
●
Even with OR this is an efficient Query, due to the fact
that we can immediately choose the smaller query first
and eliminate double matches
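The "smaller query first" idea can be sketched on plain posting lists. This is a simplification and an assumption, not the BlackRay optimizer: it matches exact tokens rather than wildcard patterns, and `query_and` and `query_or` are hypothetical names.

```python
def query_and(postings, tokens):
    """Rows containing every token, intersecting the smallest posting list first."""
    lists = sorted((postings.get(t, []) for t in tokens), key=len)
    if not lists:
        return []
    result = set(lists[0])
    for plist in lists[1:]:
        if not result:
            break  # nothing left to match, skip the larger lists
        result &= set(plist)
    return sorted(result)


def query_or(postings, tokens):
    """Rows containing any token; the set union removes double matches."""
    result = set()
    for plist in sorted((postings.get(t, []) for t in tokens), key=len):
        result |= set(plist)
    return sorted(result)


# Hypothetical postings: token -> ascending row ids.
postings = {
    "black": [0, 2, 5, 7],
    "engine": [0, 1, 5],
    "ray": [0],
}
print(query_and(postings, ["black", "engine"]))
print(query_or(postings, ["ray", "engine"]))
```

Starting with the shortest list bounds the size of every intermediate result, and an empty intermediate result ends an AND query early.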
Next Issue
●
Optimizing in the Storage Engine Interface?
●
In BlackRay, the Optimizer uses the AST from the SQL
Parser to figure out what to optimize
●
At the level of a single field or a single Index, the number of
matches is not really useful
●
Without utilizing the Layer2 and Layer4 structures, we
lose performance by several orders of magnitude
●
Personal Opinion: The MySQL Optimizer really seems to
like table scans, and tricking it with random vs
sequential read cost did not do the trick
Functions in the Index
●
Columns can take functions to be used on the data
upon indexing, and when select is carried out
●
The most common functions are
– TOKENIZE – to support multi-token indexes
– PHONETIC – match against defined phonetic rules
– ALIAS – match a token against words with similar meaning
●
Internally these functions could be considered
Meta-Columns on the Index
●
To be able to choose the proper column, we need to
know what function was used in the select
Functions in the Index
Consider this Query:
SELECT text FROM t WHERE fx_phonetic(text)
LIKE 'maier%';
●
Functions can take more than one parameter, and
may be nested
●
We could not quite figure out how to explain this to
the MySQL Parser
●
The function data would need to be available to the
Index to choose where to look
Threading Models...
●
BlackRay has a highly optimized Threading Model
●
In RAM, we do not expect I/O-waits, so a model of
two dedicated Threads per CPU core works really
well
●
Locking in the Index is built around this model
●
„One Thread per Connection“ requires at least a
careful review of the way critical data structures are
accessed
.... Our Conclusion
●
Currently, BlackRay really does not fit too well into
the storage engine architecture
●
Did we lose all hope? Absolutely not.....
●
BlackRay Applications could really benefit from
being able to utilize MySQL features, including the
Archive Engine as well as temporary tables in Heap
●
Thanks to the excellent Blog and postings by Brian
Aker, which allowed us to not make all beginner
mistakes ourselves
Outlook:
Realtime Update/Insert
Current Challenges
●
Bulk Updates
●
BlackRay supports Insert and Delete via the Bulk Loader
●
Updates are done via Insert & Delete
●
Insert/Delete via API
●
An API exists for Insert/Delete
●
The Insert/Delete API is separate from the Query API
●
Both APIs cannot be used in the same Thread
●
Insert/Delete via SQL
●
Currently Insert/Delete are not available via SQL
Supporting Insert/Delete
●
Pull together Insert/Delete and Query APIs
●
Take out the separate APIs
●
Unified API will then support transactions
●
Enable Insert/Delete via SQL
●
Extend the SQL Grammar to include INSERT/DELETE
●
Implement the functions via the unified API
●
The Bulk Loader and SQL
●
Rewrite of the Bulk Loader to utilize the unified API,
rather than SQL
Performance Impact
●
Insert and Delete have a severe performance impact
on parallel queries
●
Locking needs to be utilized to ensure transactional
integrity, causing queries to stall on data
modification
●
Currently BlackRay uses sorted lists for the data
dictionary and the other index layers
●
For indexes that change frequently, it may be
much more desirable to utilize other basic data
structures underneath the index
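The trade-off of a sorted list can be seen in a few lines. This sketch assumes a dictionary kept as a plain sorted Python list; `SortedDictionary` is a hypothetical name and says nothing about BlackRay's actual layout.

```python
import bisect


class SortedDictionary:
    """Toy token dictionary kept as a sorted Python list."""

    def __init__(self, tokens=()):
        self.tokens = sorted(set(tokens))

    def contains(self, token):
        # Binary search: O(log n) comparisons, good for read-mostly data.
        i = bisect.bisect_left(self.tokens, token)
        return i < len(self.tokens) and self.tokens[i] == token

    def prefix(self, prefix):
        # Sorted order keeps all tokens with a common prefix contiguous.
        lo = bisect.bisect_left(self.tokens, prefix)
        hi = lo
        while hi < len(self.tokens) and self.tokens[hi].startswith(prefix):
            hi += 1
        return self.tokens[lo:hi]

    def insert(self, token):
        # Finding the slot is cheap, but list.insert shifts every later
        # element: O(n) per insert, which hurts under frequent changes.
        i = bisect.bisect_left(self.tokens, token)
        if i == len(self.tokens) or self.tokens[i] != token:
            self.tokens.insert(i, token)


d = SortedDictionary(["maier", "mike", "miller"])
d.insert("meyer")
print(d.tokens)
print(d.prefix("mi"))
print(d.contains("mike"))
```

Lookups and prefix scans stay fast, while every insert in the middle of the list moves the tail, and a writer doing that under a lock stalls concurrent readers.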
Project Roadmap
Immediate Roadmap
●
Planned 0.11.0 – Due in Fall 2010
●
Pluggable Function architecture (loadable libraries)
●
Make all index functions available in SQL
●
Support for Prepared Statements (ODBC/JDBC)
●
Improved thread and memory management (Perftools?)
●
BlackRay Admin Console (Remora) 0.11
●
Engine Statistics via GUI
●
Cluster Node management
Short-term Roadmap
●
Planned 0.12.0 – Due in February 2012
●
Realtime INSERT/UPDATE/DELETE
●
SQL to support subselect
●
Default aggregate functions (SUM/AVG/....)
●
Fix several potential memory leaks (smart pointers)
●
The 0.12 release should be the last pre-GA release
Midterm Roadmap
●
Scalability Features
●
Sharding & Partitioning Options
●
Federated Search
●
Fully portable snapshot format (across platforms)
●
Query Performance Analyzer
●
Improved Statistics Module with GUI
●
BlackRay as a Storage Backend for SUN OpenDS
LDAP Engine
Midterm Roadmap
●
Security Features
●
Improved User and Access Control concepts
●
SSL for all connections
●
External User Store (LDAP/OpenSSO/PAM...)
●
Increased Platform support
●
Windows 7 and Windows Server platforms
●
Embedded platforms
●
Other, random features by popular request.
The Team behind BlackRay
SoftMethod GmbH
●
SoftMethod GmbH initiated the project in 2005
●
Company was founded in 2004 and currently has
10 employees
●
Focus of SoftMethod is high performance software
engineering
●
Product portfolio includes telco/contact center and
LDAP applications
●
SoftMethod also offers load testing and technical
software quality assurance support.
Development Team
●
Felix Schupp, Initiator and Project Sponsor
●
Thomas Wunschel, Architect and Lead Developer
●
Mike Alexeev, Key Contributor (SQL/Functions)
●
Souvik Roy, Performance Analysis and Tools
●
Simon Courtenage, C++ and boost expert
Wrap-Up
What to do next
●
Get BlackRay:
●
Register yourself on http://forge.softmethod.de
●
SVN checkout available at
http://svn.softmethod.de/opensource/blackray/trunk
●
Get Involved
●
Anyone can register and create tickets, news etc
●
We have an active mailing list for discussion as well
●
Contribute
●
We require a signed Contributor agreement before
granting commit access to the repository
Contact Us
●
Website: http://www.blackray.org
●
Twitter: http://twitter.com/dataengine
●
Facebook: http://facebook.com/dataengine
●
Mailing List: http://lists.softmethod.de
●
Download: http://sourceforge.net/projects/blackray
●
Felix: felix.schupp@softmethod.de

 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

BlackRay FOSS Asia 2010

  • 1. FOSS Asia 2010
  • 2. The State of the Engine ➔ Brief Technology Overview ➔ New SQL Parser (Lemon/Quex) ➔ User Defined Functions ➔ BlackRay as a storage engine ➔ Outlook: Realtime Data Updates
  • 3. Brief BlackRay History
  • 4. What is BlackRay? ● BlackRay is a relational, in-memory database ● Supports SQL, utilizes PostgreSQL drivers ● Fulltext (Tokenized) Search in Text fields ● Object-Oriented API Support ● Persistence via Files, Transaction support ● Scalable and Fault Tolerant ● Open Source, Open Community ● Available under the GPLv2
  • 5. Current Release Current 0.10.0 – Released December 2009 ● Complete rewrite of SQL Parser (boost::spirit2) ● PostgreSQL client compatibility (via network protocol) to allow JDBC/ODBC... via PostgreSQL driver ● Rewritten CLI tools ● Major bugfixes (potential memory leaks) ● Better Authentication support for Instances
  • 7. Why call it Data Engine? ● BlackRay is a hybrid between a relational database and a search engine, thus we call it a "data engine" ● Database features: ● Relational structure, with Join between tables ● Wildcards and index functions ● SQL and JDBC/ODBC ● Search Engine Features ● Fulltext retrieval (token index) ● Phonetic and similar approximation search ● Extremely low latency
  • 8. BlackRay Architecture [Diagram: client APIs (C++, Java, Python, PHP, C#) and Postgres* clients via the SQL Interface; Management Server and Instance Server; Data Universe (RAM resident) holding the 5-Perspective Index (L1: Dictionary, L2: Postings, L3: Row Index, L4: Multi-Tokens, L5: Multi-Values); Redo Log and Snapshots]
  • 9. Data Universe ● BlackRay features a 5-Perspective Index ● Layer 1: Dictionary ● Layer 2: Postings ● Layer 3: Row Index ● Layer 4: Multi-Token Layer ● Layer 5: Multi-Value Layer ● Layer 1 and 2 comprise a fully inverted Index ● Statistics in this Index used for Query Plan Building ● All data - index and raw output - are held in memory
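A minimal sketch of how a dictionary layer and a postings layer can together form an inverted index, and how match counts fall out of it as planning statistics. This is an illustration only, not BlackRay's implementation: the class name, the whitespace tokenizer, and the sample rows are all assumptions.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy two-layer index: a token dictionary plus one postings set per token."""

    def __init__(self):
        # Layer 1 (dictionary): token -> token id.
        self.dictionary = {}
        # Layer 2 (postings): token id -> ids of the rows containing that token.
        self.postings = defaultdict(set)

    def add_row(self, row_id, text):
        # Hypothetical tokenizer: lowercase, then split on whitespace.
        for token in text.lower().split():
            token_id = self.dictionary.setdefault(token, len(self.dictionary))
            self.postings[token_id].add(row_id)

    def rows_for(self, token):
        token_id = self.dictionary.get(token.lower())
        return set() if token_id is None else set(self.postings[token_id])

    def match_count(self, token):
        # A statistic a planner can read without touching any row data.
        return len(self.rows_for(token))


index = InvertedIndex()
index.add_row(1, "Mike Meyer Berlin")
index.add_row(2, "Anna Maier Berlin")
index.add_row(3, "Mike Maier Hamburg")
```

Here `index.match_count("berlin")` is available before any row is materialized, which is the kind of figure a query planner could use to order its work.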
  • 10. Core BlackRay Features ● Standard loaders enable high performance loading of data into tables ● Persistence is done via file based snapshots ● Snapshots enable data versioning and simple backups ● Basic ACID Transaction compliance is implemented in BlackRay, without crash recovery support.
  • 11. Query Interfaces ● BlackRay implements the PostgreSQL server socket interface and binary APIs in Java, C++ and Python ● PostgreSQL compatible drivers can be utilized against BlackRay (JDBC/ODBC) ● Native API enables object oriented data access ● Performance of native APIs currently is substantially better than SQL via PostgreSQL drivers ● Dynamic query building is very efficient with native APIs
  • 12. A New SQL Parser
  • 13. A New SQL Parser - again? ● The 0.10 release included a much improved SQL parser, built with boost::spirit ● Quite solid, fast and simple to use ● However, boost deprecates spirit1 ● boost::spirit2 is not compatible with spirit1, requiring a rewrite anyway ● Our impression: spirit2 requires too many resources and large grammars result in huge generated files ● Also: spirit and C++ templates do not mix well
  • 14. What would be a better choice? ● Flex/Bison: ● The obvious choice of MySQL and PostgreSQL ● Two-step compile process, generates C not C++ ● No Unicode support ● ANTLR: ● Odd grammar rules, not optimal for C++ ● Recursive Descent parsers are not suited for SQL ● Lemon/Quex ● Our new choice ;)
  • 15. Lemon/Quex: Our experience ● Lemon: ● Lemon is part of SQLite ● Much more intuitive syntax than Flex syntax ● Quex: ● Generates tokenizers in C++ ● Unicode and external Parser support ● Partially buggy but all issues were fixed within days ● Synopsis: Lemon/Quex are like Bison/Flex, just with Unicode and C++ support and maybe easier to debug
  • 16. Current Progress ● Basic SQL Features are ported from spirit to Lemon/Quex ● The "issue-77" branch contains all recent SQL parser code ● Unit-Testing and Database level testing very solid ● Will be part of the 0.11 release
  • 17. Recent Additions ● Support for simple (single column) User Defined Functions is now complete ● Query portion (no subselect, no aggregate functions) is very stable ● Data Definition Language was added recently ● CREATE SCHEMA ● CREATE TABLE ● ALTER TABLE ● Index is created dynamically, so no CREATE INDEX required
  • 18. User Defined Functions
  • 19. User Defined Functions ● BlackRay was designed with support for Index functions that operate on data in tables ● Functions pre-compute index results, improving speed and enabling queries that are not possible otherwise ● Functions are called on data load, and also on queries. ● Functions must not maintain state outside of tables of the same instance they operate on.
  • 20. A Sample Function ● Using functions in BlackRay SELECT name FROM employee_table WHERE fx_phonetic (name) = 'mike'; ● Functions need to be loaded beforehand: CREATE FUNCTION fx_phonetic(varchar, varchar) RETURNS int AS 'DIRECTORY/funcs', 'phonetic' ; ● The function must implement the BlackRay default function signature, which is almost identical to the MySQL and PGSQL signatures
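The idea of a function that runs both at load time and at query time can be sketched as follows. The name `fx_phonetic` is borrowed from the slide, but the rule set, the single-argument signature, and the sample rows are invented for illustration; BlackRay's real phonetic rules and function signature are not shown in the deck.

```python
from collections import defaultdict


def fx_phonetic(value):
    """Hypothetical phonetic key; NOT BlackRay's actual rule set."""
    key = value.lower()
    # Assumed rule: fold a few similar-sounding vowel pairs into "ai".
    for variant in ("ey", "ay", "ei"):
        key = key.replace(variant, "ai")
    return key


# Sample data, invented for this sketch.
rows = {1: "Maier", 2: "Meyer", 3: "Mike", 4: "Mayer"}

# Load time: the function result is pre-computed and stored in the index.
phonetic_index = defaultdict(set)
for row_id, name in rows.items():
    phonetic_index[fx_phonetic(name)].add(row_id)


def select_by_phonetic(term):
    # Query time: the same function is applied to the search term,
    # so the lookup becomes a plain key comparison.
    matches = phonetic_index.get(fx_phonetic(term), set())
    return sorted(rows[row_id] for row_id in matches)
```

Because the key is computed once per row at load time, a call such as `select_by_phonetic("meier")` needs no per-row function evaluation when the query runs.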
  • 21. Current State ● User Defined Function Repository fully implemented ● All built-in functions ported to be compatible with User Defined Functions ● SQL support for User Defined Functions under way ● Will be part of the 0.11 release
  • 22. Our Adventure: BlackRay As A Storage Engine
  • 23. Why even bother? ● In Fall 2009, we embarked on a little adventure to implement BlackRay as a storage engine ● The old Engine had only a minimal SQL interface and we lacked the expertise to build it ourselves ● Plugging into the MySQL ecosystem seemed like a very pleasant choice ● The features of BlackRay would make it a good query cache for large disk tables.....
  • 24. Our First Problem BlackRay does not support a simple table scan..... ● It may seem strange, but due to its design as an in-memory index, we do not separate table and index ● Each column index basically is the data of the column ● BlackRay distinguishes select and output columns, both of which remain in RAM ● The index therefore was never designed to be forced back into a row format, for a simple table walk
  • 25. Possible Solution? So, we can walk the Index instead? ● Rather than scanning the table, it is possible to scan the index instead ● This only works for the columns marked "searchable" ● Causes nasty errors when trying to select against result-only columns ● In tokenized index columns, getting the data back out means concatenation with a blank between values – not nice, as tokenizing can follow complex rules ● Requires Refactoring of our Layer 3 (Row-Index)
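Why rebuilding a value from a tokenized index is lossy can be shown with a small sketch. The tokenizer rule, the position-based postings layout, and the sample value are assumptions made for illustration; BlackRay's tokenizing rules are more complex and are not described in the deck.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Hypothetical rule: split on any run of non-alphanumeric characters.
    return [token for token in re.split(r"[^0-9A-Za-z]+", text) if token]


# token -> list of (row id, position of the token within the row)
positions = defaultdict(list)


def index_row(row_id, text):
    for position, token in enumerate(tokenize(text)):
        positions[token].append((row_id, position))


def rebuild_row(row_id):
    # Walk the index, collect this row's tokens, and join them with a blank.
    found = [
        (position, token)
        for token, entries in positions.items()
        for entry_row, position in entries
        if entry_row == row_id
    ]
    return " ".join(token for _, token in sorted(found))


original = "Hans-Peter O'Neil"
index_row(1, original)
```

In this sketch `rebuild_row(1)` yields the tokens in order, but the hyphen and apostrophe of the original value cannot be recovered, since only the tokens were stored.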
  • 26. Next Issue Optimizing Queries ● The BlackRay Optimizer uses the Layer 1/2 (Inverted Index) and Layer 4 (Multi-Tokens) Data to choose a Query Path ● In BlackRay, "SELECT text FROM t WHERE text LIKE '*pattern1*' AND text LIKE '*pattern2*'" is extremely efficient, as the inverted index has all the data ● Even with OR this is an efficient Query, because we can immediately choose the smaller query first and eliminate double matches
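A rough sketch of answering AND and OR pattern queries from an inverted index alone, starting with the smallest candidate set. It assumes patterns are matched against individual dictionary tokens, which simplifies whatever BlackRay actually does; the function names and the sample postings are hypothetical.

```python
def rows_matching(postings, pattern):
    """Union of the postings of every dictionary token containing the pattern."""
    rows = set()
    for token, row_ids in postings.items():
        if pattern in token:
            rows |= row_ids
    return rows


def match_all(postings, patterns):
    # AND: sort candidate sets by size so the smallest one is processed first.
    # Assumes at least one pattern is given.
    candidates = sorted((rows_matching(postings, p) for p in patterns), key=len)
    result = set(candidates[0])
    for rows in candidates[1:]:
        result &= rows
        if not result:
            break  # nothing left to intersect
    return result


def match_any(postings, patterns):
    # OR: a set union eliminates double matches automatically.
    result = set()
    for pattern in patterns:
        result |= rows_matching(postings, pattern)
    return result


# Sample postings (token -> row ids), invented for this sketch.
postings = {
    "blackray": {1, 2, 3},
    "raytracer": {4},
    "engine": {2, 4},
    "search": {3},
}
```

Neither function reads row data: the set sizes and the sets themselves come straight from the index, which is the property the slide relies on.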
  • 27. Next Issue ● Optimizing in the Storage Engine Interface? ● In BlackRay, the Optimizer uses the AST from the SQL Parser to figure out what to optimize ● At the level of a single field or index, the number of matches is not really useful ● Without utilizing the Layer 2 and Layer 4 structures, we lose performance by several orders of magnitude ● Personal Opinion: The MySQL Optimizer really seems to like table scans, and tricking it with random vs sequential read cost did not do the trick
  • 28. Functions in the Index ● Columns can take functions to be used on the data upon indexing, and when select is carried out ● The most common functions are – TOKENIZE – to support multi-token indexes – PHONETIC – match against defined phonetic rules – ALIAS – match a token against words with similar meaning ● Internally these functions could be considered Meta-Columns on the Index ● To be able to choose the proper column, we need to know what function was used in the select
  • 29. Functions in the Index Consider this Query: SELECT text FROM t WHERE fx_phonetic(text) LIKE 'maier%'; ● Functions can take more than one parameter, and may be nested ● We could not quite figure out how to explain this to the MySQL Parser ● The function data would need to be available to the Index to choose where to look
  • 30. Threading Models... ● BlackRay has a highly optimized Threading Model ● In RAM, we do not expect I/O-waits, so a model of two dedicated Threads per CPU core works really well ● Locking in the Index is built around this model ● "One Thread per Connection" requires at least a careful review of the way critical data structures are accessed
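The sizing rule of two worker threads per CPU core, as opposed to one thread per connection, can be sketched with a fixed pool. This only illustrates the pool sizing: BlackRay is written in C++, and `run_queries` and its `execute` callback are hypothetical names introduced here.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Fixed pool size: two workers per core, independent of the number of clients.
WORKERS = 2 * (os.cpu_count() or 1)


def run_queries(queries, execute):
    """Run every query on the fixed pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        return list(pool.map(execute, queries))
```

With a fixed pool, the number of threads touching shared index structures is bounded and known in advance, which is what the locking design can be built around; a thread-per-connection model removes that bound.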
  • 31. .... Our Conclusion ● Currently, BlackRay really does not fit too well into the storage engine architecture ● Did we lose all hope? Absolutely not..... ● BlackRay Applications could really benefit from being able to utilize MySQL features, including the Archive Engine as well as temporary tables in Heap ● Thanks to Brian Aker for his excellent blog and postings, which spared us from making every beginner mistake ourselves
  • 33. Current Challenges ● Bulk Updates ● BlackRay supports Insert and Delete via the Bulk Loader ● Updates are done via Insert & Delete ● Insert/Delete via API ● An API exists for Insert/Delete ● The Insert/Delete API is separate from the Query API ● Both APIs cannot be used in the same Thread ● Insert/Delete via SQL ● Currently Insert/Delete are not available via SQL
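Expressing an update as a delete followed by an insert can be sketched on a toy indexed table. The `Table` class, its methods, and the whitespace tokenizer are assumptions for illustration and do not reflect BlackRay's Insert/Delete API.

```python
from collections import defaultdict


class Table:
    """Toy table whose only mutations are insert and delete."""

    def __init__(self):
        self.rows = {}                    # row id -> stored text
        self.postings = defaultdict(set)  # token -> row ids
        self.next_id = 1

    def insert(self, text):
        row_id = self.next_id
        self.next_id += 1
        self.rows[row_id] = text
        for token in text.lower().split():
            self.postings[token].add(row_id)
        return row_id

    def delete(self, row_id):
        text = self.rows.pop(row_id)
        for token in text.lower().split():
            self.postings[token].discard(row_id)
            if not self.postings[token]:
                del self.postings[token]  # drop tokens no row uses any more

    def update(self, row_id, text):
        # An update is a delete of the old row plus an insert of the new one,
        # so the row receives a new id.
        self.delete(row_id)
        return self.insert(text)

    def search(self, token):
        return set(self.postings.get(token.lower(), set()))


table = Table()
old_id = table.insert("Mike Maier")
new_id = table.update(old_id, "Mike Meyer")
```

One consequence visible in the sketch is that the updated row gets a new identifier, so every index entry for the old row has to be removed and rebuilt.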
  • 34. Supporting Insert/Delete ● Pull together Insert/Delete and Query APIs ● Take out the separate APIs ● Unified API will then support transactions ● Enable Insert/Delete via SQL ● Extend the SQL Grammar to include INSERT/DELETE ● Implement the functions via the unified API ● The Bulk Loader and SQL ● Rewrite of the Bulk Loader to utilize the unified API, rather than SQL
  • 35. Performance Impact ● Insert and Delete have a severe performance impact on parallel queries ● Locking needs to be utilized to ensure transactional integrity, causing queries to stall on data modification ● Currently BlackRay uses sorted lists for the data dictionary and the other index layers ● For indexes that have frequent changes, it may be much more desirable to utilize other basic data structures underneath the index
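The trade-off of a sorted list as a dictionary layer can be sketched with Python's `bisect` module: lookups and prefix ranges are binary searches, while each insert shifts the elements behind it. The class and its methods are hypothetical and stand in for whatever structure BlackRay uses internally.

```python
import bisect


class SortedDictionary:
    """Toy dictionary layer kept as one sorted list of tokens."""

    def __init__(self):
        self.tokens = []

    def insert(self, token):
        i = bisect.bisect_left(self.tokens, token)
        if i == len(self.tokens) or self.tokens[i] != token:
            # list.insert shifts every element after i: O(n) per new token.
            self.tokens.insert(i, token)
        return i

    def contains(self, token):
        # Binary search: O(log n).
        i = bisect.bisect_left(self.tokens, token)
        return i < len(self.tokens) and self.tokens[i] == token

    def prefix_range(self, prefix):
        # All tokens starting with the prefix form one contiguous slice.
        # "\uffff" serves as an upper bound for ordinary text tokens.
        lo = bisect.bisect_left(self.tokens, prefix)
        hi = bisect.bisect_left(self.tokens, prefix + "\uffff")
        return self.tokens[lo:hi]


dictionary = SortedDictionary()
for name in ["mike", "anna", "meyer", "maier", "mike"]:
    dictionary.insert(name)
```

Reads stay cheap, but every insert of a new token moves part of the list, and a writer would have to hold a lock while doing so; this is the cost that makes other underlying structures attractive for frequently changing indexes.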
  • 37. Immediate Roadmap ● Planned 0.11.0 – Due in Fall 2010 ● Pluggable Function architecture (loadable libraries) ● Make all index functions available in SQL ● Support for Prepared Statements (ODBC/JDBC) ● Improved thread and memory management (Perftools?) ● BlackRay Admin Console (Remora) 0.11 ● Engine Statistics via GUI ● Cluster Node management
  • 38. Short-term Roadmap ● Planned 0.12.0 – Due in February 2012 ● Realtime INSERT/UPDATE/DELETE ● SQL to support subselect ● Default aggregate functions (SUM/AVG/....) ● Fix several potential memory leaks (smart pointers) ● The 0.12 release should be the last pre-GA release
  • 39. Midterm Roadmap ● Scalability Features ● Sharding & Partitioning Options ● Federated Search ● Fully portable snapshot format (across platforms) ● Query Performance Analyzer ● Improved Statistics Module with GUI ● BlackRay as a Storage Backend for SUN OpenDS LDAP Engine
  • 40. Midterm Roadmap ● Security Features ● Improved User and Access Control concepts ● SSL for all connections ● External User Store (LDAP/OpenSSO/PAM...) ● Increased Platform support ● Windows 7 and Windows Server platforms ● Embedded platforms ● Other, random features by popular request.
  • 41. The Team behind BlackRay
  • 42. SoftMethod GmbH ● SoftMethod GmbH initiated the project in 2005 ● Company was founded in 2004 and currently has 10 employees ● Focus of SoftMethod is high performance software engineering ● Product portfolio includes telco/contact center and LDAP applications ● SoftMethod also offers load testing and technical software quality assurance support.
  • 43. Development Team ● Felix Schupp, Initiator and Project Sponsor ● Thomas Wunschel, Architect and Lead Developer ● Mike Alexeev, Key Contributor (SQL/Functions) ● Souvik Roy, Performance Analysis and Tools ● Simon Courtenage, C++ and boost expert
  • 45. What to do next ● Get BlackRay: ● Register yourself on http://forge.softmethod.de ● SVN checkout available at http://svn.softmethod.de/opensource/blackray/trunk ● Get Involved ● Anyone can register and create tickets, news etc ● We have an active mailing list for discussion as well ● Contribute ● We require a signed Contributor agreement before being allowed commit access to the repository
  • 46. Contact Us ● Website: http://www.blackray.org ● Twitter: http://twitter.com/dataengine ● Facebook: http://facebook.com/dataengine ● Mailing List: http://lists.softmethod.de ● Download: http://sourceforge.net/projects/blackray ● Felix: felix.schupp@softmethod.de