This presentation introduces the ideas behind Clusterpoint, a document-oriented NoSQL database technology with ACID transaction support that is used to run the Clusterpoint Cloud DBaaS. It also provides a brief overview of the Clusterpoint team and company.
2. 2
Clusterpoint is an operational database with high-speed ACID-compliant
database transactions, built-in fast full-text search and endless scale-out ability
The platform delivers reliable distributed transactions at a performance level
previously available only from SQL technology, ultra-fast web and mobile UI
responsiveness, and relevance ranking of results in Big Data content search
GB ► TB ► PB
XML
JSON
OLTP
DOCUMENT
DATABASE INSTANT SEARCH
Use it to safely manage industry-standard XML / JSON data at top speed and at scale
3. 3
Distributed XML / JSON document
store with linear scale out ability
and fault-tolerant replication
Low-cost rack & stack commodity hardware running
application software code that scales from day one
Expensive legacy hardware running
complex SQL application software code
Clusterpoint simplifies your database model and application software code,
generating significant TCO savings over your IT systems life-time
RELATIONAL SQL DATABASE
Fragmented Indexes, Complex Schemas,
Rigid Data Structure, Scales Up
CLUSTERPOINT DATABASE
Full Content Index, No schemas, Flexible
Data Structure, Scales Out
4. 4
RANKING
INDEX
All-in-one Database Server Platform with REST API & management GUI
JSON
DISTRIBUTED DOCUMENT
STORE DATABASE
SCALABLE BUILT-IN SEARCH
WITH BIG DATA INDEXING
FAULT-TOLERANT CLUSTER
WITH REPLICATION
OPEN
API
Distributed high-speed OLTP document-oriented
database with built-in full-text indexing and search
to manage structured & unstructured Big Data
1 2 3
High-performance distributed transactions architecture, ACID compliant
XML
What is inside our software product?
C/C++
7. 7
Founder: Gints Ernestsons
CTO: Jurgis Orups
Direct Sales Director: Martins Berzins
Infrastructure S/w Architect: Janis Sermulins
CEO: Zigmars Rasscevskis
Partner Sales Director: Peteris Janovskis
The Team
Developer, tech entrepreneur with 25 years of experience in IT product design and services; expert in SQL, NoSQL, BI and Web search
8 years at Google as Technical Lead for the infrastructure core software development team
9 years leading the Clusterpoint core software engineering team; expert in C/C++, web-scale search and databases
4 years at Hewlett-Packard; 5 years as Global Sales Director at SAF Tehnika
MIT alumnus; internship at Intel Research (USA), 5 years at Google (Zurich, Switzerland) & 2 years at TietoEnator (Finland)
12 years at Oracle; Alliance & Channel Director, Central & East Europe; Regional Sales Director
8. 8
Clusterpoint blends together SQL, NoSQL and SEARCH benefits
into a single database server software platform with a single API
Enjoy these features out-of-the-box:
• high performance distributed ACID transactions, including essential SQL support
• simplicity of all-in-one: database, search, high-availability replication, sharding
• fast productivity with format-independent schema-less XML/JSON database model
• cost-efficient scale out ability by clustering of commodity rack&stack hardware
• great end user experience from instant, relevant text search and real-time analytics
OLTP
XML
JSON
9. 9
Why is Clusterpoint the Swiss Army knife of a database developer?
• simplicity of use (all features inside one API)
• universal usability (handles mixed json/xml/object/text data)
• fast productivity at minimal cost (top speed on commodity h/w)
• endless scale out ability from multiplying tools (elastic clustering)
• great end user experience (fast relevant search at ms latency)
10. 10
Cloud-ready scalable database delivers cost-efficiency for Big Data
• FREE license, no data cap, s/w runs on Linux, Windows & MacOS/BSD
• 24/7 managed service for on-premises use & cloud DBAAS subscribers
• (new) simple one-click replication of our customer database to the Cloud
• easy capacity to process very large data sets on the Clusterpoint Cloud
• no need to buy extra hardware for fast prototyping / development
XML JSON
HTML MIME
Clusterpoint
Cloud Service
11. 11
Efficiently manage all your business data using Clusterpoint database
Business information mostly lives in documents:
Web pages, XML, MS Office docs, JSON, invoices, contracts, purchase orders, customer profiles, application source code, product descriptions, payment orders, sales proposals, user profiles, databases, email & messages, session tickets, log records
Clusterpoint Server: FAST, SCALABLE, INSTANTLY SEARCHABLE DOCUMENT-ORIENTED OPERATIONAL DATABASE ( no fragmentation, no complex SQL tables, no joins )
RANKING INDEX: date, time, chars, full-text, numeric, tags, links, geospatial
ORDERS, CONTRACTS, PAYMENTS, CUSTOMERS, INVOICES, MAIL & MSG., SALES DOCS, LOG & AUDIT, PRODUCTS, USERS & APPS SESSIONS
Customer business application (web, mobile or middleware application server) that interacts with end users via online GUI (XML / JSON)
No vendor lock-in: open data format & API. Secure ACID transactions. Full-text search. Essential SQL.
12. 12
All your data, indexes and replicas in one IT software system deliver solid security
HIGH-PERFORMANCE OLTP DB,
BIG DATA QUERIES & ANALYTICS,
ESSENTIAL SQL SUPPORT
ENTERPRISE SEARCH, INCL. FULL
TEXT, FACETS, SNIPPETS, STEMMING,
GEOSPATIAL, COLLATION ETC
DISTRIBUTED, FAULT-TOLERANT,
SCALABLE XML/JSON DATA STORAGE
ALL YOUR DATA IN ONE SECURE,
SCALABLE, INSTANTLY SEARCHABLE
DATABASE SOFTWARE PLATFORM
XML
JSON
ONE
API
13. 13
Clusterpoint database is indexed for FREE TEXT SEARCH in all content
Relational database indexing model:
Multiple fragmented indexes with selected index keys,
managed by complex relations
COMPLEX SQL QUERIES:
Hard to learn, unforgiving syntax
and performance problems at scale
SQL query: tens of seconds
Clusterpoint database indexing model:
One single structured and unstructured data index (RANKED INDEX)
over all text, date, numeric and data markup content
SIMPLE AND USER-FRIENDLY QUERIES:
Use fast and relevant web-style free
text search, essential SQL and analytics
Our query: milliseconds
14. 14
Think about our index as a “giant tree” where all database content is
organized into small parts (“leaves”) and ranked by relative “weights”
Distributed storage of
all loaded documents
Clusterpoint Index™: ranked by the customer's own relevance rules
words
strings
numbers
dates
names tags
values
relations
Clusterpoint
database
XML
&
JSON
The Clusterpoint database's unique RANKING INDEX is an inverted graph
in which every data item carries customer-defined relevance rules
RAM
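The "one index over all content" idea can be illustrated with a toy inverted index. This is only a minimal sketch under stated assumptions: the function names, the sample documents and the whitespace tokenizer are hypothetical, and nothing here reflects Clusterpoint's actual index structures.

```python
from collections import defaultdict

def iter_fields(doc, path=""):
    """Yield (tag_path, value) pairs for every leaf of a nested JSON-like dict."""
    for key, value in doc.items():
        tag = f"{path}/{key}" if path else key
        if isinstance(value, dict):
            yield from iter_fields(value, tag)
        else:
            yield tag, value

def build_index(docs):
    """Map each lowercased token to the set of (doc_id, tag) places it occurs."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for tag, value in iter_fields(doc):
            # Text, numbers and dates all go through the same tokenizer here.
            for token in str(value).lower().split():
                index[token].add((doc_id, tag))
    return index

# Hypothetical sample documents, for illustration only.
docs = {
    1: {"title": "Java developer", "area": "London", "salary": 4000},
    2: {"title": "Sales director", "text": "Java experience welcome"},
}
index = build_index(docs)
print(sorted(index["java"]))  # every place the word occurs, with its tag
```

Because each posting remembers the tag it came from, a ranking layer can later weight a hit in `title` differently from a hit in `text`.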
15. 15
Programmable, human-meaningful relative ranking delivers
superior relevance sorting and grouping for free text search results
RANKING INDEX: REAL-TIME BIG DATA SEARCH in milliseconds
Two problems solved
Ranking for your database structure: applied
as relative weights in % for your XML / JSON
tags (organizes relevance rules for search hits)
Example tag weights: Title 100%, Text 80%, Comments 10%
Ranking for your database documents:
applied as document rating (your own unique
formula, time stamp, popularity, votes etc.),
an integer 0 ... 2^32 ( used when tag weightings are the same )
Ranking for query terms: overrides the policy-defined default ranking rules (term boosting)
Query: word1 ^100% word2 ^+30% word3 ^-20%
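The term-boosting query shown on this slide (`word1 ^100% word2 ^+30% word3 ^-20%`) can be split into terms and boosts with a few lines of code. This is a rough sketch only: the regular expression and the reading of unsigned values as absolute weights and signed values as relative adjustments are assumptions, not documented Clusterpoint behavior.

```python
import re

# term, optional sign, integer percentage: e.g. "word2 ^+30%"
BOOST = re.compile(r"(\S+)\s*\^([+-]?)(\d+)%")

def parse_boosts(query):
    """Return (term, kind, percent) triples; kind is 'absolute' or 'relative'."""
    parsed = []
    for term, sign, number in BOOST.findall(query):
        percent = int(number)
        if sign == "-":
            percent = -percent
        # Assumption: a signed value adjusts the default weight,
        # an unsigned value replaces it.
        kind = "relative" if sign else "absolute"
        parsed.append((term, kind, percent))
    return parsed

boosts = parse_boosts("word1 ^100% word2 ^+30% word3 ^-20%")
for term, kind, percent in boosts:
    print(term, kind, percent)
```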
16. 16
Ranking rules are precisely customizable by your application needs
Your data items for search ( XML / JSON tags in a database ): Address, Company, Email, Category, Product
[Diagram: one slider per tag, running from 0% (least relevant) through 50% to 100% (most relevant)]
Documents with search-term hits in tags that have higher weightings are sorted to the front
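A small sketch of tag weighting, using the Title 100% / Text 80% / Comments 10% weights that appear elsewhere in this deck. The scoring rule (take the best-weighted tag that contains every query term) and all names are assumptions made for illustration; the actual ranking formula is not given in the slides.

```python
TAG_WEIGHTS = {"title": 100, "text": 80, "comments": 10}  # percent

def relevance(doc, terms):
    """Highest tag weight among tags that contain every query term (0 if none)."""
    best = 0
    for tag, weight in TAG_WEIGHTS.items():
        words = str(doc.get(tag, "")).lower().split()
        if all(term in words for term in terms):
            best = max(best, weight)
    return best

def search(docs, query):
    """Return ids of matching documents, highest tag weight first."""
    terms = query.lower().split()
    scored = [(relevance(doc, terms), doc) for doc in docs]
    hits = [(score, doc) for score, doc in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return [doc["id"] for _, doc in hits]

# Hypothetical documents, for illustration only.
docs = [
    {"id": 1, "comments": "great java developer"},
    {"id": 2, "title": "Java developer"},
    {"id": 3, "text": "senior java developer wanted"},
    {"id": 4, "title": "Sales director"},
]
print(search(docs, "java developer"))
```

The document whose hit is in the title comes first, then the one matching in the text, then the one matching only in comments; the non-matching document is dropped.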
17. 17
If a Document rating is used as an extra ranking dimension (for example, the time stamp of a news article could serve as the Document rating), its value is used to sub-group the results within the same % tag weighting relevancy group, creating cascading sort orders for the entire result set
Query: [ w1 w2 w3 ]
Paged result set is sorted, grouped and ordered by
the customizable RANKING RULES for the search
results that best match the database query context
Top group of results has all w1, w2, w3 hits in the Title tag
Next group of results has all w1, w2, w3 hits in the Text tag
The least relevant group has all hits in the Comments tag
[Diagram: within each group, results are ordered from First to Last]
In the Clusterpoint architecture you can optionally define additional sort orders (e.g., by votes, by alphabetic value, by click-price etc.). Whenever results with the same Tag weighting and Document rating fall into the same sub-group, they are sorted again by the next cascading sort rule into even more human-friendly sub-groups
Tag weighting ► Document rating ► Optional sorting ► RESULT
Ranking enables search results to match the relevant human intent
Pages: 1 2 3 ...
Sorted by the best relevance
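The cascade described on this slide (tag weighting first, then document rating, then an optional sort rule) behaves like sorting on a composite key. The sketch below assumes each hit already carries its three values; the field names and numbers are hypothetical.

```python
# Each hit carries: tag weighting (%), document rating, and an optional
# extra sort value (here: votes). All values are made up for illustration.
hits = [
    {"id": "a", "tag_weight": 80, "rating": 5, "votes": 2},
    {"id": "b", "tag_weight": 100, "rating": 1, "votes": 9},
    {"id": "c", "tag_weight": 80, "rating": 7, "votes": 1},
    {"id": "d", "tag_weight": 80, "rating": 7, "votes": 4},
]

# Sort descending on each level; later levels only break ties in earlier ones.
ordered = sorted(
    hits,
    key=lambda h: (-h["tag_weight"], -h["rating"], -h["votes"]),
)
order_ids = [h["id"] for h in ordered]
print(order_ids)
```

Hit `b` leads because of its 100% tag weight despite a low rating; `d` and `c` share weight and rating, so the optional vote count decides between them.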
18. 18
Ranking solves the Big Data information overload problem for our customers while
reducing the complexity of application software development (coding effort)
You can search the Clusterpoint database with a simple text search, the way everyone is used to searching the Web.
Replace the formalized data-sorting statements of an SQL query, designed for expert users and for machine processing:
SQL SELECT .... WHERE .... LIKE ....
GROUP BY .... ORDER BY .... JOIN ....
with a web-style ad hoc search query that uses simple, human-friendly free-text terms in the Clusterpoint API SEARCH command:
any text or “any text” or any tex*
FULL CONTENT RANKING INDEX
Sorted and grouped results in output pages, matching the database content by relevance ranking
Instant and relevant results on the 1st page! Great customer satisfaction!
19. 19
Constant and predictable query latency enables real-time Big Data search and analytics
PB
GB
TB
MB
Milliseconds for a
CLUSTERPOINT query
Minutes ... hours
for SQL query
The ranking index lets you scale out your XML / JSON database to billions of
documents while keeping response times very low for web and mobile apps
FULL CONTENT
RANKING INDEX
XML
JSON
20. 20
Free text:
java developer London
Phrase:
“John Smith”
Wildcards:
Joh* Smi* or “John Smi*”
Pattern match in strings:
John Sm?th John Sm[iy]th
In XML database structure (SQL-like):
java developer <salary>3500..4500</salary> <area>London</area>
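The wildcard and pattern forms above resemble shell-style globbing, so Python's standard `fnmatch` module can illustrate how `*`, `?` and `[iy]` behave. This is an analogy only: whether Clusterpoint's matcher follows exactly these rules is an assumption, as is treating the `3500..4500` range as inclusive.

```python
from fnmatch import fnmatchcase

# Hypothetical word list, for illustration only.
names = ["John", "Johanna", "Smith", "Smyth", "Smooth", "Smithers"]

def matching(pattern):
    """Names that match a glob-style pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]

def in_range(value, spec):
    """True if value lies in a 'low..high' range (assumed inclusive)."""
    low, high = (int(part) for part in spec.split(".."))
    return low <= value <= high

print(matching("Joh*"))      # any suffix after "Joh"
print(matching("Sm?th"))     # exactly one character in place of "?"
print(matching("Sm[iy]th"))  # "i" or "y" in that position
print(in_range(4000, "3500..4500"))
```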
Awesome end-user experience:
query the Clusterpoint database with web-style
free-format text search terms and
instantly get the most relevant results
No need to learn a special query language
syntax; the ranking index takes care of sorting
results by relevance and of web paging
Developers can take advantage of combined
full-text and SQL-like structured data queries
using multiple SEARCH API options
Query your database for instant and relevant answers without SQL complexity
and enjoy the world’s most simply and efficiently searchable database
Examples of free text, structured & combined queries delivering results in milliseconds:
21. 21
What customer benefits does Clusterpoint database content ranking deliver?
Sorts RELEVANT data first, critical for ultra-fast, productive Big Data access
Groups together valuable information for insight, navigation & analysis
Reduces server-side computing costs eliminating excessive data sorting
Makes databases instantly responsive on web and mobile GUI screens
Organizes information by natural language driven (human) context rules
RANKING
INDEX
Enables high-performance, natively scalable application programming
22. 22
Develop your application software code to be scalable from day one, and the same code
will run efficiently as your database volume and usage grow
OPEX, TCO
Database life-cycle
Save > 80% Your web or
mobile application
software code will
scale for any usage
(write once)
TEST YEAR 1 YEAR 2 YEAR 3 YEAR 4 YEAR N
23. 23
replica 1
replica 2
replica 3
Why pay extra for scalability, high-availability and fault-tolerance? All included!
OUT-OF-THE-BOX
SCALABILITY
AND
LOAD-BALANCING
OUT-OF-THE-BOX
FAULT-TOLERANCE
AND
HIGH-AVAILABILITY
24. 24
Save time and increase productivity with our elastic DBAAS cloud service, capable of
processing very large data volumes through efficient workload distribution
Number of
cluster nodes
10% 50% 100%
Our developers can share Clusterpoint Cloud DBAAS infrastructure at low cost
Some metrics for Clusterpoint API users:
Built-in cluster-wide Map-Reduce
49 000 TPS on a 30-server cluster
4Bn transactions per day per cluster
Handles mixed web / mobile data & text
Millisecond latency for free text queries,
with native pagination for web / mobile UI
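As a quick consistency check on the metrics above, sustained throughput of 49 000 TPS works out to roughly 4.2 billion transactions per day, in line with the "4Bn per day" figure. The per-server value assumes load is spread evenly over the 30 servers, which the slide does not state.

```python
TPS = 49_000                     # transactions per second, from the slide
SERVERS = 30                     # cluster size, from the slide
SECONDS_PER_DAY = 24 * 60 * 60   # 86 400

per_day = TPS * SECONDS_PER_DAY
per_server = TPS / SERVERS       # assumes an even spread across servers

print(f"{per_day:,} transactions per day")
print(f"{per_server:.0f} TPS per server")
```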
Join Clusterpoint DBAAS Cloud for on-demand Big Data computing at a fraction
of the cost of owning and maintaining your own hardware
[Chart: your work time needed (1% to 100%) for TERABYTES OF DATA AND BILLIONS OF TRANSACTIONS on DBAAS, plotted against cluster size up to 100 nodes]
25. 25
Easy clustering, sharding and replication using a centralized administration tool (Manager Web GUI)
[Diagram, DATABASE SHARDING: DB shard I, shard II, shard III on separate servers with RAM, growing GB ► TB ► PB]
[Diagram, DATABASE REPLICATION: Replica 1 to Replica 5 ... N copies, each on its own server]
[Diagram, REPLICATION OF THE ENTIRE CLUSTER DATABASE ACROSS MULTIPLE DATACENTERS: shards I, II, III mirrored as replica 1, replica 2, replica 3]
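A minimal sketch of how sharding and replication combine: each document is routed to one shard, and that shard is copied to every replica. Hash-based routing is an assumption chosen for illustration; the slides do not say how Clusterpoint assigns documents to shards, and the function names are hypothetical.

```python
import hashlib

SHARDS = ["shard I", "shard II", "shard III"]        # labels from the slide
REPLICAS = ["replica 1", "replica 2", "replica 3"]   # labels from the slide

def shard_for(doc_id):
    """Pick a shard from a stable hash of the document id (assumed scheme)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

def placements(doc_id):
    """Every (shard, replica) location that holds a copy of the document."""
    shard = shard_for(doc_id)
    return [(shard, replica) for replica in REPLICAS]

for location in placements("invoice-2014-0001"):
    print(location)
```

Sharding spreads different documents over servers to add capacity, while replication keeps several copies of the same shard so that a failed server does not make its documents unavailable.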
26. 26
Clusterpoint API uses industry standard web services and REST architecture
Open engineering standards: open data format, open API, open web protocols
Application
server
Developer’s
computer
Administrator’s
computer Database Storage
Rack &Stack cluster
hardware nodes
XML / JSON
.NET JAVA PHP PYTHON C/C++
CLUSTERPOINT SERVER SOFTWARE (ultra-fast C/C++ code)
insert add a document to the storage
update update or add a document
replace replace the existing document
delete delete the existing document
retrieve retrieve original document
search perform a search query
similar search for similar content
status return storage status information
......... More > 40 API commands
Clusterpoint API
commands
XML / JSON
HTTP HTTPS
fast TCP/IP driver
Transparent Map Reduce for all database operations
CLIENT SOFTWARE / API LIBRARIES
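To make the "command plus XML / JSON document over HTTP(S)" idea concrete, here is a sketch that only builds a request object and sends nothing. The URL, the JSON envelope with `command` and `data` keys, and the helper name are all hypothetical; the real Clusterpoint wire format is not shown in these slides and may differ.

```python
import json
from urllib.request import Request

# Placeholder endpoint; not a real Clusterpoint address.
BASE_URL = "https://db.example.invalid/api"

def build_command(command, payload):
    """Wrap a command name and a JSON document in an HTTP POST request.

    The {"command": ..., "data": ...} envelope is an illustrative assumption.
    """
    body = json.dumps({"command": command, "data": payload}).encode("utf-8")
    return Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_command("insert", {"id": "42", "title": "Java developer"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The same helper would cover the other commands listed on the slide (`update`, `replace`, `delete`, `retrieve`, `search`) by changing the command name and payload.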
27. 27
What about money? How can Clusterpoint database help to save
money for our customer business?
28. 28
Scaling SQL database escalates your costs while Clusterpoint doesn’t
Does Clusterpoint help to save money for me and my business?
29. 29
Minimal search latency to quickly find the most relevant results is a crucial database
feature for web / mobile applications, saving end-user time and corporate money
Sample scenario for 100 employees: corporate work-time savings for a database
application where each employee runs ~ 100 database queries per day
Database query time reduced from 1.5 s (SQL query) to 0.15 s (CLUSTERPOINT query)
10x faster, saves 1.35 s per query
100 employees x 100 queries x 220 days x 1.35 s = 825 hours saved / year
825 work hours x $29.63 * = $ 24 445
Clusterpoint’s 10x faster database search delivers annual work-time savings of $ 24 445
Saving even more for 1000 employees ► $ 244 500 (nearly a quarter of a million)
* - Hourly cost of labor; source: US Department of Labor, Bureau of Labor Statistics, March 2014 (http://www.bls.gov/news.release/ecec.nr0.htm)
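The savings figure on this slide can be reproduced step by step. All inputs come from the slide; only the variable names are introduced here.

```python
EMPLOYEES = 100
QUERIES_PER_DAY = 100          # per employee
WORK_DAYS = 220                # per year
SECONDS_SAVED = 1.5 - 0.15     # per query: 1.35 s
HOURLY_COST = 29.63            # USD, hourly cost of labor cited on the slide

seconds_per_year = EMPLOYEES * QUERIES_PER_DAY * WORK_DAYS * SECONDS_SAVED
hours_saved = seconds_per_year / 3600
annual_savings = hours_saved * HOURLY_COST

print(f"{hours_saved:.0f} hours saved per year")
print(f"${annual_savings:,.0f} saved per year")
```

Scaling the head count to 1000 employees multiplies both results by ten, which gives the roughly quarter-million figure quoted on the slide.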
30. 30
A single platform simplifies our customer IT software stack, boosts
performance and cost-efficiency, and decreases our customer TCO
OLTP
database,
ACID
compliant
Full-Text Search &
Real Time Analytics
Distributed cluster
computing
XML
JSON
Masterless, transactional, high-availability operational database platform with fault-tolerant
replication and scale-out architecture using inexpensive commodity hardware
31. 31
Clusterpoint delivers > 80% savings in TCO
TCO estimates for a 3-year budget, calculated for a 100-user company, in $

Cost item | Commercial SQL * | Open source integration ** | Clusterpoint database ***
Database software license (enterprise edition) | 14 000 | 0 | 0
Database software maintenance (3 years) | 20% / 3 x 2 800 | DIY / 0 | 3 x 7 200
Database and ESS s/w integration through custom code: developer months ($60k salary / year) | 3m / 15 000 | 6m / 30 000 | 0
Enterprise search software (ESS) license | 10 000 | 0 | 0
ESS software maintenance fee (3 years) | 20% / 3 x 2 000 | DIY / 0 | 0
Implementing DB+ESS clustering and high-availability: developer months ($60k salary / year) | 2m / 10 000 | 4m / 12 000 | 0
Client software access licenses (if required for 100 users) | 10 000 | 0 | 0
Maintaining DB + ESS integration code and indexes over system’s life-time: DBA man-months ($60k salary / year) | 9m / 45 000 | 18m / 90 000 | 0
Total | 118 400 | 132 000 | 21 600
Reduce # of software platforms (complexity) to reduce your costs
* - data varies among vendors, approximated for average cost
** - assuming that open source integration takes ~ 50% more time and effort
*** - Clusterpoint database 24/7 support price for 2 high-availability replica servers (2 x $ 3600 / year)
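Summing the line items of the TCO estimate reproduces the three totals and the "> 80%" savings claim. The amounts are taken from the slide, in the row order license, maintenance, integration, ESS license, ESS maintenance, clustering, client licenses, lifetime upkeep; the variable names are introduced here.

```python
# 3-year cost items in USD, per column of the TCO estimate.
COSTS = {
    "Commercial SQL": [
        14_000, 3 * 2_800, 15_000, 10_000, 3 * 2_000, 10_000, 10_000, 45_000,
    ],
    "Open source integration": [0, 0, 30_000, 0, 0, 12_000, 0, 90_000],
    "Clusterpoint database": [0, 3 * 7_200, 0, 0, 0, 0, 0, 0],
}

totals = {name: sum(items) for name, items in COSTS.items()}

def saving_versus(baseline):
    """Fraction saved by choosing Clusterpoint instead of the baseline."""
    return 1 - totals["Clusterpoint database"] / totals[baseline]

for name, total in totals.items():
    print(f"{name}: {total:,}")
print(f"vs commercial SQL: {saving_versus('Commercial SQL'):.1%}")
print(f"vs open source:    {saving_versus('Open source integration'):.1%}")
```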
32. 32
Productivity benefits for Clusterpoint database customers
1. Reduces customer IT complexity: simplifies customer database and application software
with a schema-less XML/JSON document store model and scalable, write-once s/w code
2. Delivers high-performance computing: blends OLTP database, free text search, querying
and analytics in the same software platform, without using search engines or BI tools
3. Provides cost-efficient Big data scalability: distributed database architecture scales out
on commodity rack&stack hardware, no complex software skills in MapReduce needed
XML
JSON
TCO
savings
> 80%
33. 33
Vertical-markets
application products for
Big Data real-time
management, running on
the Clusterpoint database
All products scale out linearly by
using inexpensive rack & stack
commodity hardware architecture
Web Content Crawler, Monitoring, Analytics
ECM market sector: $4,7 billion 2012, CAGR 7.2%
GOL (Machine-data Log & Event Analytics)
SIEM market sector: $1,3 billion 2013, CAGR 14%
GB -► TB -► PB
NTSS (All Network Traffic Storage & Search)
Cyber Security Market worth $95 billion in 2014, CAGR 10.3%
All products feature instantly
responsive web-style keyword
search across the entire database
content, essential SQL, real-time
Big Data reporting and analytics
GB -► TB -► PB
Portfolio of some of our Big Data applications
34. 34
MyInstaBank is the most
recent Proof-of-Concept
application product in $60
billion financial services
market, driven by
Clusterpoint database
platform
Next-gen Touch&Go online banking Solution
Clusterpoint database drives innovative banking payments
archive data management, reporting and analytics solution
The solution is gaining
growing interest among
financial data
management, banking and
ERP vendors, sectors still
largely dominated by
legacy SQL platforms
We are helping legacy SQL vendors where they are struggling
35. 35
To see the full potential of our database software solutions, you
have to see them in action!
You are welcome to contact us to arrange a live demonstration for you!
support@clusterpoint.com
USA: +1 (650) 681 9710
Europe: +371 (2) 9243460