In the IQ-based historical database system, the most important parts are the historical data storage and the data query service.

The historical data storage consists of three tiers. From the middle to the left, the first tier is the HDS cache database, which stores the latest 2 years (0-2 years) of original transaction data from the business systems. The second tier is the HDS Main database, which stores the latest 12 years of transaction data. The third tier is the archive library on tape, which archives all transaction data older than 12 years.

On the right, there are four types of application: real-time query, data retrieval, legacy data query and other back-end applications. These applications run in a Sybase IQ multiplex environment.

The HDS cache database is a dedicated data store for real-time queries from the various business systems. Because the data volume of the HDS cache is relatively small, the database resides in an IQ dbspace on high-performance disk (Fibre Channel, FC).

To keep storage cost low and ensure that frequently accessed data is readily available, the HDS Main database is composed of multiple IQ dbspaces. Transaction data less than 7 years old resides in IQ dbspaces on FC disk, which supports high-performance data access, and transaction data 7-12 years old resides in IQ dbspaces on cheaper, slower storage (SATA disk). As time goes on, data is moved from the high-performance storage to the cheaper, slower storage. In this way an information lifecycle management (ILM) strategy can be implemented.
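The age-based placement rules described above can be summarized in a short sketch. This is only an illustration of the tiering logic, not part of the actual system: the function name `storage_locations`, the tier labels, and the handling of the exact boundary ages (2, 7 and 12 years) are assumptions made for the example.

```python
from typing import List


def storage_locations(age_years: float) -> List[str]:
    """Return the tiers that would hold a transaction of the given age.

    Hypothetical helper: boundary handling at exactly 2, 7 and 12 years
    is an assumption, since the description does not specify it.
    """
    if age_years < 0:
        raise ValueError("age_years must be non-negative")

    # Data older than 12 years lives only in the tape archive.
    if age_years > 12:
        return ["Tape archive"]

    locations = []
    # The cache holds the latest 0-2 years for real-time queries.
    if age_years <= 2:
        locations.append("HDS cache (FC disk)")
    # HDS Main holds the latest 12 years, split by age across disk types.
    if age_years < 7:
        locations.append("HDS Main (FC disk)")
    else:
        locations.append("HDS Main (SATA disk)")
    return locations


if __name__ == "__main__":
    for age in (1, 5, 9, 15):
        print(age, storage_locations(age))
```

Recent data appears in two places in this sketch because the HDS Main database is described as holding the latest 12 years, which includes the 0-2 year range also kept in the cache.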
Background of the historical data system project

After business operations and data were centralized, the volume of data grew rapidly. At the same time, growing demand for access (ad hoc queries) to historical transaction detail data from business departments, customers and various external organizations put stress on the existing IT systems. Keeping a large amount of historical transaction detail data in the core banking system on the mainframe slowed down business processing. The bank therefore wanted to offload historical data queries from the business system to protect transaction response time, by moving the transaction data out of the business system and consolidating it into a centralized historical data store.

From 2005 to 2009, the HDS project was implemented in three phases:
Phase 1: In 2005, based on Teradata.
Phase 2: In 2007, migrating from Teradata to Sybase IQ.
Phase 3: In 2009, expanding and extending with Sybase IQ Multiplex.
In IQ, the index is the data.
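Traditional database engines cannot compress data in a general-purpose way, mainly because of the following three problems:
1. Their row-oriented storage format does not lend itself to compression. Data (which is stored as binary at the lowest level of the database) contains little repetition when stored this way. We found that row-stored data achieves a compression ratio of at most 5-10%.
2. For the many 2K and 4K binary data pages, the overhead added by compression and decompression is too high.
3. In an OLTP environment, large numbers of reads and updates are mixed together. Every update requires a compression operation, while a read requires only decompression, and most data compression algorithms are 4 times slower at compressing than at decompressing. This overhead would noticeably reduce the transaction processing efficiency of an OLTP database engine, making data compression almost intolerably expensive.

In data warehouse applications, data compression yields greater benefits at a much lower cost. These benefits include reduced storage requirements and increased data throughput (with large page settings; an IQ page is generally at least 128K), which amounts to shorter query response times.

Sybase IQ uses data compression. Because data is stored by column, adjacent field values share the same data type and their binary values usually fall within a much smaller range, so compression is easier and the compression ratio is higher. Sybase IQ typically achieves better than 50% compression on column-stored data. This higher compression ratio, combined with large-page I/O, allows Sybase IQ to deliver excellent query performance while reducing the demand for storage space.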
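The effect of storage layout on compression can be illustrated with a small experiment. The sketch below serializes the same synthetic transaction records once row by row and once column by column, then compresses both with Python's `zlib`. Everything here is an assumption made for illustration: the record fields, the synthetic data, and the use of `zlib` as a stand-in compressor. It does not reproduce Sybase IQ's storage format or compression algorithm, and the ratios it produces are not the figures quoted above.

```python
import random
import struct
import zlib


def make_rows(n=10_000, seed=0):
    """Build synthetic (id, day, branch, currency, amount) records."""
    rng = random.Random(seed)
    currencies = [b"CNY", b"USD", b"EUR"]
    rows = []
    for i in range(n):
        txn_day = 20090101 + i // 500  # long runs of the same day
        rows.append((
            i,                          # sequential transaction id
            txn_day,
            rng.randrange(20),          # low-cardinality branch code
            rng.choice(currencies),     # 3-byte currency code
            rng.randrange(1_000_000),   # amount
        ))
    return rows


def row_layout(rows):
    """Serialize record by record, so different types are interleaved."""
    return b"".join(struct.pack("<IIH3sI", *r) for r in rows)


def column_layout(rows):
    """Serialize column by column, so same-typed values sit together."""
    ids, days, branches, currencies, amounts = zip(*rows)
    n = len(rows)
    return (
        struct.pack(f"<{n}I", *ids)
        + struct.pack(f"<{n}I", *days)
        + struct.pack(f"<{n}H", *branches)
        + b"".join(currencies)
        + struct.pack(f"<{n}I", *amounts)
    )


def compressed_sizes(rows):
    """Return (row-layout size, column-layout size) after zlib compression."""
    return (
        len(zlib.compress(row_layout(rows))),
        len(zlib.compress(column_layout(rows))),
    )


if __name__ == "__main__":
    rows = make_rows()
    raw = len(row_layout(rows))
    row_size, col_size = compressed_sizes(rows)
    print(f"raw bytes:          {raw}")
    print(f"row layout, zlib:   {row_size}")
    print(f"column layout, zlib: {col_size}")
```

Both layouts contain exactly the same bytes in a different order, so any difference in compressed size comes from the layout alone: grouping values of one type together gives the compressor longer repeats and a narrower range of byte values to encode.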