With the introduction of the Oracle Database In-Memory Option it is now possible
to run real-time, ad-hoc, analytic queries on your business data as it exists
right at this moment and receive the results in sub-seconds.
True real-time analytics that works directly in either an OLTP database or a data warehouse.
Imagine being able to know the total sales you have made in the state of California as of right now. Not last week, or even last night, but right now, with the query
returning in sub-second time.
The Oracle Database In-Memory Option will also provide large speedups for OLTP applications.
The Oracle Database In-Memory Option is designed to be completely compatible with and transparent to existing Oracle applications and is trivial to deploy.
Row format is optimized for OLTP workloads.
OLTP operations tend to access only a few rows but touch all of the columns. A row format allows quick access to all of the columns in a record, since all the data for a given record is kept together in-memory and on storage. Because all data for a row is kept together, much of the row data will be brought into the CPU with a single memory reference. Row format is also much more efficient for row updates and inserts.
Analytical workloads access few columns but scan the entire data set. They also typically require some sort of aggregation. A columnar format allows for much faster data retrieval when only a few columns in a table are selected, because all the data for a column is kept together in-memory and a single memory access will load many column values into the CPU. It also lends itself to faster filtering and aggregation, making it the most optimized format for analytics.
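The difference between the two layouts can be sketched in a few lines of Python. This is a conceptual illustration only, with made-up column names and data; Oracle's actual on-disk and in-memory formats are far more sophisticated.

```python
# Row format: all columns of a record are stored contiguously --
# ideal for OLTP access patterns like "fetch this whole record".
rows = [
    {"id": 1, "state": "CA", "amount": 100},
    {"id": 2, "state": "NY", "amount": 250},
    {"id": 3, "state": "CA", "amount": 75},
]

def get_record(rows, rid):
    # One lookup touches every column of a single record.
    return next(r for r in rows if r["id"] == rid)

# Column format: all values of one column are stored contiguously --
# an analytic query touches only the columns it actually needs.
columns = {
    "id": [1, 2, 3],
    "state": ["CA", "NY", "CA"],
    "amount": [100, 250, 75],
}

# "Total sales" reads just one of the three columns.
total = sum(columns["amount"])
```

The aggregation over `columns["amount"]` never reads the `id` or `state` data, which is the core reason the columnar format wins for analytics.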
Until now you have been forced to pick just one format and suffer the tradeoff of either sub-optimal OLTP or sub-optimal analytics.
Other databases have row and column formats but you must choose ONE format for a given table.
Therefore you get either fast OLTP or fast Analytics on a table but not both.
Oracle's unique dual format architecture allows data to be stored in both row and column format simultaneously. This eliminates the tradeoffs required by others.
Up until now, to achieve fast analytics and OLTP you needed a second copy of the data
(Data Mart, Reporting DB, Operational Data Store), which adds cost and complexity
to the environment, requires additional ETL processing and incurs time delays.
With Oracle's unique approach, there is a single copy of the table on storage.
So there are no additional storage costs, synchronization issues, etc.
The Oracle optimizer is In-Memory aware. It has been optimized to automatically
run analytic queries using the column format, and OLTP queries using the row format.
Oracle's In-Memory columnar technology is a pure in-memory format. The in-memory columnar format is not persisted on storage.
Other vendors have stored representations of the columnar format, which adds overhead to change operations. Maintaining transactional consistency for a stored column format is very inefficient: a single row insert must separately update all the column segments.
Also, in other columnar products, changes must be logged, and the changes must be merged periodically into the stored representation of the data.
This means the ENTIRE table must periodically be rewritten (sometimes called checkpointed) to disk which adds much overhead and IO.
Oracle's pure in-memory columnar format never causes extra writes to disk.
Users decide which objects should be loaded in the in-memory column store and when. It is possible to pick all of the tables or just some tables or partitions.
The user can also optionally choose the specific columns that they want loaded or not loaded into memory.
The in-memory column store uses a highly efficient compression format that is 2x to 10x smaller than the raw uncompressed data. This format is optimized for high performance scan operations. Queries run directly against this format without the need to decompress data and are significantly faster than queries run against uncompressed data.
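One common columnar compression technique, dictionary encoding, shows why a scan can run directly on compressed data without decompressing it first. The Python sketch below is illustrative only, with made-up data; Oracle's actual compression algorithms differ.

```python
# A column of string values (made-up sample data).
states = ["CA", "NY", "CA", "TX", "CA", "NY"]

# Encode: each distinct value gets a small integer code.
dictionary = {v: i for i, v in enumerate(sorted(set(states)))}
encoded = [dictionary[v] for v in states]  # the "compressed" column

# Scan WHERE state = 'CA' directly on the encoded data: translate the
# predicate once, then compare small integers instead of strings.
target = dictionary["CA"]
ca_count = sum(1 for code in encoded if code == target)
```

Comparing fixed-width integer codes is cheaper than comparing variable-length strings, so the scan on compressed data is faster than the scan on raw data would be.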
The In-Memory Column Store is populated (loaded into memory) by background processes either when the table or partition is first accessed or when the database instance is started. The database is fully active and accessible while population occurs. This is quite different from a pure in-memory database. With a pure in-memory database, the database cannot be accessed until all the data is loaded, which causes severe availability issues.
When using the In-Memory option, the large majority of the database shared memory should be used for the column store. Oracle has run efficiently for decades with a row-store buffer cache that is orders of magnitude smaller than the database. Existing Oracle databases often have a buffer cache sized at less than 1% of the database size; for example, there are many production OLTP databases that are terabytes in size with only gigabytes of row-store buffer cache. OLTP workloads can very effectively cache hot blocks and efficiently read blocks from disk or flash into the buffer cache. By contrast, to get in-memory level performance from the column store, the column store should be sized to hold the entire table or partition (after compression).
A table that is in the in-memory column store does not have to be loaded in to the in-memory row store (buffer cache) and vice versa.
For warehousing and other analytic workloads, the in-memory column storage can be sized to be the vast majority of the SGA, and the row-store buffer cache can be made small.
For pure OLTP workloads that perform little or no analytics, the column store can be made small or non-existent.
For mixed workloads, the row store works efficiently with just a small percentage of the database in-memory. So most memory can be allocated to the column store.
The Oracle in-memory column format is designed to enable very fast SIMD processing (single instruction processing multiple data values). You can imagine SIMD or vector processing as array processing.
SIMD was originally designed for accelerating computer generated animation and High Performance Scientific computing.
Let's assume we are looking for the total number of sales we have had in the state of California this year.
The sales table is stored in the In-Memory Column Store, so we simply have to scan the state column and count the number of occurrences of California. With SIMD processing we can check 16 values or entries in the state column in a single CPU instruction.
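A rough Python sketch of the idea follows. Real SIMD uses CPU vector registers and happens in hardware; the 16-wide chunking below only mimics how many comparisons happen per instruction, and the data is made up.

```python
# A state column with 32 entries (made-up data; "CA" appears 16 times).
states = ["CA", "NY", "CA", "TX"] * 8

VECTOR_WIDTH = 16  # values compared per "instruction"
count = 0
for i in range(0, len(states), VECTOR_WIDTH):
    chunk = states[i:i + VECTOR_WIDTH]           # one "vector register" worth
    count += sum(1 for s in chunk if s == "CA")  # one comparison per lane
```

With true SIMD hardware each chunk costs roughly one instruction rather than sixteen, which is where the scan speedup comes from.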
Columnar speed comes from scanning only the needed columns, a SIMD-optimized format, and column-specific compression algorithms.
We have shown that we can scan and filter in-memory data extremely quickly.
But with any new data format we need to be able to join and aggregate the data as well as scan it.
The join between the stores and sales table can be converted into a scan of the sales
table with the join being converted into an additional filter on the sales table.
This allows us to take full advantage of what In-memory is best for, fast scans and filters.
This may sound familiar: it is a Bloom filter. Bloom filters were originally introduced in Oracle 10g,
but only kicked in when you were using parallel execution. With the column store you now get Bloom
filters for serial queries too.
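A minimal Bloom filter sketch in Python shows how the join becomes a filter: build a small bit-set from the qualifying store ids, then scan the sales table testing each row against it. This is illustrative only, not Oracle's implementation, and the store ids and sales rows are made up.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size=64, num_hashes=3):
        self.size, self.num_hashes, self.bits = size, num_hashes, 0

    def _positions(self, key):
        # Derive several bit positions from independent hashes of the key.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all(self.bits >> p & 1 for p in self._positions(key))

# Build the filter from the small side of the join:
# store ids located in California (hypothetical ids).
bf = BloomFilter()
for store_id in [10, 42]:
    bf.add(store_id)

# Scan the large sales table once, applying the filter as a predicate.
sales = [(10, 100), (7, 50), (42, 75), (3, 20)]   # (store_id, amount)
ca_sales = [amt for sid, amt in sales if bf.might_contain(sid)]
```

The join has been replaced by a single fast scan of the sales table with an extra cheap filter, which is exactly what the in-memory column store is best at.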
Up until now, the only way to run analytic queries with an acceptable response on an OLTP environment was to create specific indexes for these queries.
The good thing about indexes is that they are extremely scalable. They work well in-memory and also are extremely efficient on-disk since they minimize disk IO needed to find the requested data.
All of these additional indexes need to be maintained as the data changes,
which increases the elapsed time for each of those changes. Every time the base table is changed, every index must also be changed, and each of these operations is separately logged.
The In-Memory Column Store can now remove the need for additional analytic indexes if the tables fit in memory.
This makes database changes (DML) much faster since every change does not need to update all the analytic indexes.
It also reduces the overall storage space required for the system.
And unlike a pure in-memory database, if the system should crash and need to restart your business can still operate fully.
OLTP queries and updates (the heart of any transaction based system) will perform just as they always do against the indexed row store.
Analytical queries will execute slowly until the In-Memory Column Store is populated, but they will still run.
You don't have to wait for all of the data to be populated in memory before resuming your business.
Removing the need for analytic indexes greatly simplifies tuning and reduces ongoing administration.
Even queries that previously did not benefit from an index will run quickly since they will be serviced from the in-memory column store. This enables true ad-hoc reporting since queries are not limited to just those that benefit from existing indexes. Queries on large tables no longer need to be indexed to perform well.
Creating and tuning a database is also much simpler, since administrators no longer need to anticipate analytic queries and pre-create indexes for them.
At Oracle OpenWorld 2013 we demonstrated the performance of the Oracle Database In-Memory Option by running analytic queries simultaneously against the traditional row store and the new in-memory column store. Both the row store and the column store were completely loaded into memory, eliminating any slowdown caused by the storage subsystem.
This demonstration showed that queries on over 3 billion rows of data completed in less than a sixth of a second when run on a single processor core using the new in-memory database option. This performance was hundreds of times faster than the same query run against the traditional row store, even though all data was completely loaded into memory.
The query run against the traditional row store processed data at a very good throughput of 25 million rows per second. The same query run against the in-memory column store ran at an amazing throughput of 20 Billion rows per second. These queries run even faster when executed in parallel.
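The quoted throughput follows directly from the demo numbers; a quick sanity check of the arithmetic:

```python
# "Over 3 billion rows in less than a sixth of a second" on one core.
rows_scanned = 3_000_000_000
scans_per_second = 6            # at least six full scans per second
throughput = rows_scanned * scans_per_second   # rows per second, lower bound

# Column store vs row store: 20 billion vs 25 million rows per second.
speedup = 20_000_000_000 / 25_000_000
```

The lower bound of 18 billion rows per second is consistent with the roughly 20 billion rows per second observed, and the 800x ratio matches the "hundreds of times faster" claim.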
Remember users get to specify which tables, partitions and columns are stored in-memory.
The amount of data that can fit in memory is greatly increased because we automatically
compress the data as it is loaded into the column store. Oracle's specialized compression algorithms deliver 2x to 10x compression rates depending on the data types and the distribution of the data.
There are three different types of compression available in-memory and we have special query
processing algorithms that allow queries to execute directly against the compressed data.
Even so, it is not necessary for all of the data to reside in-memory. Data can be stored in a multi-tiered environment, with the hottest data in-memory, active data on flash, and older or colder data on disk.
The Oracle database does not require data to be accessed differently depending on its location. In fact, a single query can transparently access data from all three tiers: memory, flash, and disk.
Transparently using all tiers of data storage enables customers to simultaneously achieve the highest performance and the lowest cost.
It also enables even the largest databases to benefit from in-memory technologies without gigantic increases in budget.
In a multi-server (RAC) environment each database instance has its own In-Memory Column Store. A table can reside in each column store (duplicated),
or larger tables that will not fit within the column store of a single server can be distributed across the cluster.
When analytic queries execute against a table distributed across the cluster,
in-memory parallel execution runs a parallel server process on each server; each process accesses the data local to its server and returns the results. This allows Oracle to
seamlessly scale out the In-Memory Column Store.
The Cache Fusion protocol has been extended to manage the column-store and provide complete and transparent cross-node consistency.
OLTP query and change (DML) operations work exactly as they do today on RAC clusters. In today's RAC cache fusion algorithm, the database ensures that any update to a row transparently invalidates copies of the row kept in memory (in the buffer cache) of other servers in the cluster. The only difference for the in-memory column store is that a row update will invalidate copies of the row in the in-memory column store as well (no matter which node of the cluster holds the row in its in-memory column store).
All communication or messaging across the cluster is extremely efficient as it uses a new direct-to-wire InfiniBand protocol that enables the database to talk directly to the InfiniBand hardware, bypassing the operating system and the network software layers.
Not only do we scale out in a RAC environment, we can also scale up on very large SMP systems, such as large x86 based SMPs, or the new Sparc M6 machine which has 32TB of RAM.
SMPs are a good match for in-memory databases because SMP scaling removes the overhead of distributing queries across a cluster and coordinating transactions (such as two phase commits that are needed by shared-nothing databases) and data updates.
The inter-process bandwidth on an SMP system also far exceeds any network, allowing large queries to operate more efficiently.
Implementing the In-Memory Column Store is as simple as flipping a switch. All you have to do is specify how much memory the column store can use,
then decide which objects (tables, partitions, and columns) you want loaded into the column store, and you're done. Oracle will take care of the rest.
The optimizer will simply direct any analytic queries to the In-Memory Column Store, allowing you to drop any indexes that were specifically created to
enable analytic query performance.
Because the In-Memory Column Store is embedded in the Oracle Database it is fully compatible with ALL existing features,
and requires absolutely no changes in the application layer. This means you can start taking full advantage of it on day one, regardless of what applications you run in your environment.
Any application that runs against the Oracle database will transparently benefit from the in-memory column store.
The Oracle database offers the industry's most sophisticated and robust High Availability features, including
Data Guard, GoldenGate, RAC, RMAN backup and recovery, ASM mirroring, Flashback, online indexing and table reorganization, and more.
These features have matured over decades of use at leading Financial Companies, Telecoms, Web Services, SaaS companies, etc.
All of these features work completely transparently with the new Oracle Database In-Memory Option.
Remember, the in-memory columnar format is never written to disk, so it can't corrupt your database or interfere with your existing HA strategy.
The in-memory columnar format does not change logging, backup, recovery, Data Guard, or Golden Gate. All existing HA configurations, backup scripts and procedures, etc. work with no changes.
It is extremely safe to implement in critical databases and does not add extra management complexity or limitations.
With the introduction of the Oracle Database In-Memory Option it is now possible
to run real-time, ad-hoc, analytic queries on your business data as it exists
right at this moment and receive the results in sub-seconds.
True real-time analytics that works directly in either an OLTP database or a data warehouse.
The Oracle Database In-Memory Option will also provide large speedups for OLTP applications by removing the need for analytic indexes.
Less management and tuning is needed because analytic queries no longer need to be anticipated and pre-indexed.
The Oracle Database In-Memory Option is designed to be completely compatible with and transparent to existing Oracle applications and is trivial to deploy.