Presenter: Takuma Wakamori (NTT)
[VLDB2013 Study Session]
Session 1: Emerging Hardware
Hybrid Storage Management for Database Systems	
- Goal: use a hybrid storage system consisting of SSDs and HDDs for database management
- Contributions:
  - GD2L: a cost-aware buffer pool management algorithm for hybrid storage
    - Based on the GreedyDual algorithm [1], taking the performance difference between SSD and HDD into account
  - CAC: an anticipatory, cost-based SSD management technique
  - Experimental performance evaluation of GD2L and CAC
    - The proposed mechanisms are implemented in MySQL's InnoDB storage engine
System architecture (from Figure 1)
…illustrated in Figure 1. When writing data to storage, the DBMS chooses which type of device to write it to.

Figure 1: System Architecture (the buffer pool manager in RAM and the SSD manager mediate read/write requests; the buffer pool writes to the SSD and the HDD)
Previous work has considered how a DBMS should place data in a hybrid storage system [11, 1, 2, 16, 5]. We provide a summary of such work in Section 7. In this paper, we take a broader view of the problem than is used by most of this work. Our view includes the DBMS buffer pool as well as the two types of storage devices. We consider two related problems. The first is determining which data should be retained in the DBMS buffer pool. The answer to this question is affected by the presence of hybrid storage because blocks evicted from the buffer cache to an SSD are much cheaper to retrieve later than blocks evicted to the HDD. Thus, we consider cost-aware buffer management, which can take this distinction into account. Second, assuming that the SSD is not large enough to hold the entire database, we face the problem of deciding which data should be placed on the SSD. This should depend on the physical access pattern…
[1] N. Young. The k-server dual and loose competitiveness for paging. Algorithmica, 11:525–541, 1994.
- All database pages are stored on the HDD; copies of some pages are also held on the SSD, in the buffer pool, or in both.
- When the buffer pool is full, the buffer pool manager evicts pages from the buffer according to its page replacement policy.
- The SSD manager:
  - adds pages evicted from the buffer pool to the SSD according to its SSD admission policy
  - removes data from the SSD according to its SSD replacement policy
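The interplay of the three policies above can be sketched with a minimal, runnable toy model. All class and method names here are mine, and the stand-in policies (LRU page replacement, admit-everything SSD admission, FIFO SSD replacement) are deliberately naive placeholders, not the paper's GD2L/CAC:

```python
from collections import OrderedDict

# Toy model of the architecture in Figure 1: every page lives on the HDD,
# and copies of some pages are additionally cached in the buffer pool
# and/or on the SSD.  The three policies are naive stand-ins; the paper's
# contribution is replacing them with GD2L and CAC.

class HybridStorage:
    def __init__(self, bp_capacity, ssd_capacity):
        self.bp = OrderedDict()    # buffer pool, LRU -> MRU order
        self.ssd = OrderedDict()   # SSD cache, FIFO order
        self.bp_capacity = bp_capacity
        self.ssd_capacity = ssd_capacity
        self.reads = {"ssd": 0, "hdd": 0}   # physical reads by device

    def read(self, page_id):
        if page_id in self.bp:                 # buffer pool hit: no physical I/O
            self.bp.move_to_end(page_id)
            return
        # Buffer pool miss: fetch from the SSD if a copy is cached there
        # (cheap), otherwise from the HDD (expensive).
        self.reads["ssd" if page_id in self.ssd else "hdd"] += 1
        if len(self.bp) >= self.bp_capacity:
            self._evict_from_bp()              # page replacement policy
        self.bp[page_id] = True

    def _evict_from_bp(self):
        victim, _ = self.bp.popitem(last=False)     # LRU victim
        if victim not in self.ssd:                  # SSD admission policy: admit all
            if len(self.ssd) >= self.ssd_capacity:
                self.ssd.popitem(last=False)        # SSD replacement policy: FIFO
            self.ssd[victim] = True

hs = HybridStorage(bp_capacity=2, ssd_capacity=2)
for p in [1, 2, 3, 1, 2, 3]:
    hs.read(p)
# After the warm-up pass, re-reads of evicted pages hit their SSD copies
# instead of going to the HDD.
```

The point of the model is only the division of labor: the buffer pool manager decides what stays in RAM, and the SSD manager independently decides what is worth keeping on the SSD.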
GD2L: a cost-aware buffer pool management algorithm
- Preliminaries
  - Each page p has a cost H
  - H is computed using an inflation value L
  - Two priority queues are used:
    - QS: for managing pages that are on the SSD
    - QD: for managing pages that are not on the SSD
- Algorithm
  - When page p is not in the cache:
    - Compare the cost H of the LRU page of QS with that of the LRU page of QD, and evict the page q with the smaller H
    - Set L ← H(q), and place p in the cache
  - If p is on the SSD:
    - H(p) ← L + RS; place p at the MRU end of QS
  - Otherwise, i.e. p is only on the HDD:
    - H(p) ← L + RD; place p at the MRU end of QD
Figure 3: Buffer pool managed by GD2L (QS holds SSD-resident pages and QD the rest; pages move between the queues when they are flushed to the SSD or evicted from it)
Buffer pool management by GD2L (from Figure 3)

Storage device parameters:
- RD: time to read from the HDD
- WD: time to write to the HDD
- RS: time to read from the SSD
- WS: time to write to the SSD

When the buffer pool runs out of free space, the page cleaner flushes dirty buffers to storage according to a cost-aware policy.
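The algorithm above can be sketched as follows. This is an illustrative reimplementation from the slide's description, not the paper's InnoDB code; the class name and example costs are mine. The key invariant: every page is (re)inserted at the MRU end with H = L + R and the inflation value L never decreases, so each queue is ordered by H and the global minimum-H page is always at one of the two LRU ends.

```python
from collections import OrderedDict

RS, RD = 0.1, 1.0   # example read costs: the SSD is much cheaper to read than the HDD

class GD2LBufferPool:
    def __init__(self, capacity, on_ssd):
        self.capacity = capacity
        self.on_ssd = on_ssd     # callable: is this page currently cached on the SSD?
        self.L = 0.0             # inflation value
        self.QS = OrderedDict()  # pages on the SSD, LRU -> MRU, page_id -> H
        self.QD = OrderedDict()  # pages not on the SSD, LRU -> MRU, page_id -> H

    def _evict(self):
        # Compare H of the two LRU pages and evict the one that is
        # cheaper to fetch back later (smaller H).
        lru_s = next(iter(self.QS.items()), (None, float("inf")))
        lru_d = next(iter(self.QD.items()), (None, float("inf")))
        queue, (victim, h) = (self.QS, lru_s) if lru_s[1] <= lru_d[1] else (self.QD, lru_d)
        del queue[victim]
        self.L = h               # L <- H(q)
        return victim

    def access(self, page_id):
        """Reference page_id; return the evicted page, if any."""
        evicted = None
        self.QS.pop(page_id, None)   # a hit is re-inserted at the MRU end below
        self.QD.pop(page_id, None)
        if len(self.QS) + len(self.QD) >= self.capacity:
            evicted = self._evict()
        if self.on_ssd(page_id):
            self.QS[page_id] = self.L + RS   # H(p) <- L + RS, MRU of QS
        else:
            self.QD[page_id] = self.L + RD   # H(p) <- L + RD, MRU of QD
        return evicted

bp = GD2LBufferPool(capacity=2, on_ssd=lambda p: p == "s1")
bp.access("s1")              # SSD page: H(s1) = 0 + 0.1
bp.access("d1")              # HDD page: H(d1) = 0 + 1.0
victim = bp.access("d2")     # cache full: s1 is evicted (H = 0.1 < 1.0)
```

The eviction is cost-aware rather than purely recency-based: because the SSD page is cheap to re-read, it loses to HDD-resident pages even when their recencies are comparable, which is exactly the behavior the next slide discusses.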
Effect of cost-aware caching
- While a page is cached on the SSD, its buffer pool miss rate and physical write rate become higher
- Reasons:
  - Once a page is placed on the SSD, its read cost becomes low, so the page is evicted from the buffer pool more readily
  - The page cleaner must flush dirty pages before they can be evicted
InnoDB issues two types of writes: replacement writes and recoverability writes. Replacement writes are issued when dirty pages are identified as eviction candidates. To remove the latency associated with synchronous writes, the page cleaners try to ensure that pages that are likely to be replaced are clean at the time of the replacement. In contrast, recoverability writes are those that are used to limit failure recovery time. InnoDB uses write-ahead logging to ensure that committed database updates survive failures. The failure recovery time depends on the age of the oldest changes in the buffer pool. The page cleaners issue recoverability writes for the least recently modified pages to ensure that a configurable recovery time threshold will not be exceeded. In InnoDB, when the free space of the buffer pool is below…
Figure 4: Miss rate/write rate while on the HDD (only) vs. miss rate/write rate while on the SSD. Each point represents one page.
Cache miss/write rates on the HDD vs. cache miss/write rates on the SSD (from Figure 4)
CAC: Cost-Adjusted Caching
- To decide whether to place a page p on the SSD, estimate the benefit B of doing so
- If some page p' already in the SSD cache satisfies B(p') < B(p), place p in the SSD cache and evict the page with the smallest B
- Miss rate expansion factor (α)
  - Introduced to estimate how a page's physical read/write rates change depending on whether the page is on the SSD
Figure 5: Summary of Notation
- rD, wD: measured physical read/write count while not on the SSD
- rS, wS: measured physical read/write count while on the SSD
- r̂D, ŵD: estimated physical read/write count if never on the SSD
- r̂S, ŵS: estimated physical read/write count if always on the SSD
- mS: buffer cache miss rate for pages on the SSD
- mD: buffer cache miss rate for pages not on the SSD
- α: miss rate expansion factor
Figure 6: Miss rate expansion factor for pages from three TPC-C tables.

Miss rate expansion factors for pages of three TPC-C tables (from Figure 6)
…would experience if it were placed on the SSD, and r̂D(p) and ŵD(p) are the physical read and write counts p would experience if it were not. Using these hypothetical physical read and write counts, we can write our desired estimate of the benefit of placing p on the SSD as follows:

B(p) = (r̂D(p)·RD − r̂S(p)·RS) + (ŵD(p)·WD − ŵS(p)·WS)   (2)

Thus, the problem of estimating benefit reduces to the problem of estimating values for r̂D(p), r̂S(p), ŵD(p), and ŵS(p). To estimate r̂S(p), CAC uses two measured read counts: rS(p) and rD(p). (In the following, we will drop the explicit page reference from our notation as long as the page is clear from context.) In general, p may spend some time on the SSD and some time not on the SSD. rS is the count of the number of physical reads experienced by p while p is on the SSD. rD is the number of physical reads experienced by p while it is not on the SSD. To estimate what p's physical read count would be if it were on the SSD full time (r̂S), CAC uses…
…make room for a new entry.

The Miss Rate Expansion Factor

The purpose of the miss rate expansion factor (α) is to estimate how much a page's physical read and write rates would change if the page is admitted to the SSD. A simple way to estimate α is to compare the overall miss rate of pages on the SSD to that of pages that are not on the SSD. Suppose that mS represents the overall miss rate of logical read requests for pages that are on the SSD, i.e., the total number of physical reads from the SSD divided by the total number of logical reads of pages on the SSD. Similarly, let mD represent the overall miss rate of logical read requests for pages that are not located on the SSD. Both mS and mD are easily measured. Using mS and mD, we can define the miss rate expansion factor as:

α = mS / mD   (7)

For example, α = 3 means that the miss rate is three times higher for pages on the SSD than for pages that are not on the SSD…
(Fragmentary column text: mS(g) is the miss rate of pages in group g while they are on the SSD. Logical reads are tracked per individual page, with counts updated on each logical read; group counts are updated when certain events occur, e.g., when a page is evicted from the buffer pool or from the SSD. When the group to which a page belongs changes, its counts are subtracted from the old group and added to the new one. It is possible that pages move between groups…)
…scaling it up to account for any time in which the page was not on the SSD. While this might be effective, it will work only if the page has actually spent time on the SSD, so that rS can be observed. We still require a way to estimate r̂S for pages that have not been observed on the SSD. In contrast, estimation using Equation 3 will work even if rS or rD are zero due to lack of observations.

We track reference counts for all pages in the buffer pool and all pages in the SSD. In addition, we maintain an outqueue to track reference counts for a fixed number (Noutq) of additional pages. When a page is evicted from the SSD, …
…expansion factors of pages grouped by table and by logical read rate. The three lines represent pages from the TPC-C STOCK, CUSTOMER, and ORDER… tables. Since different pages may have substantially different miss rate expansion factors, we use different expansion factors for different groups of pages. Specifically, we group pages based on the database object (e.g., …) in which they store data, and on their logical read rates, maintaining a different expansion factor for each group. We divide the range of possible logical read rates into subranges of equal size. We define a group as pages that store data from the same database object and whose logical read rates fall into the same subrange. For example, in o…
- r̂D, ŵD: estimated number of physical reads/writes if the page were never on the SSD
- r̂S, ŵS: estimated number of physical reads/writes if the page were always on the SSD
- mS: buffer cache miss rate for pages on the SSD
- mD: buffer cache miss rate for pages not on the SSD
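Putting the notation together, here is a small sketch of how the benefit B(p) (Equation 2) and the miss rate expansion factor α (Equation 7) fit together. Equation 2 and α = mS/mD come from the text; the α-based scaling of measured counts into the hypothetical counts r̂S and r̂D is my reading of the idea and may differ in form from the paper's exact estimators (its Equations 3 to 6):

```python
def expansion_factor(m_s, m_d):
    """alpha = mS / mD  (Equation 7)."""
    return m_s / m_d

def estimated_counts(r_s, r_d, alpha):
    # Assumed scaling: reads observed while the page was NOT on the SSD
    # (r_d) would have been alpha times more frequent had the page been
    # on the SSD the whole time, and conversely for reads observed on
    # the SSD.  This is an illustration, not the paper's exact estimator.
    r_hat_s = r_s + alpha * r_d   # estimated reads if always on the SSD
    r_hat_d = r_d + r_s / alpha   # estimated reads if never on the SSD
    return r_hat_s, r_hat_d

def benefit(r_hat_d, r_hat_s, w_hat_d, w_hat_s, RD, RS, WD, WS):
    """B(p) = (rD_hat*RD - rS_hat*RS) + (wD_hat*WD - wS_hat*WS)  (Equation 2)."""
    return (r_hat_d * RD - r_hat_s * RS) + (w_hat_d * WD - w_hat_s * WS)

# Toy numbers: pages on the SSD miss twice as often (alpha = 2), and
# each HDD I/O is far more expensive than each SSD I/O.
alpha = expansion_factor(m_s=0.25, m_d=0.125)
r_hat_s, r_hat_d = estimated_counts(r_s=30, r_d=10, alpha=alpha)
B = benefit(r_hat_d, r_hat_s, w_hat_d=20, w_hat_s=20,
            RD=10.0, RS=0.1, WD=10.0, WS=0.2)
```

A positive B means the HDD I/O time the page would avoid outweighs the SSD I/O time it would add, so the page is a good SSD candidate; CAC then keeps the pages with the largest B on the SSD.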
Performance evaluation of GD2L and CAC
Next, we consider the impact of switching from a non-anticipatory cost-based SSD manager (CC) to an anticipatory one (CAC). Figure 7 shows that GD2L+CAC provides additional performance gains above and beyond those achieved by GD2L+CC in the case of the large (30GB) database. Together, GD2L and CAC provide a TPC-C performance improvement of about a factor of two relative to the LRU+CC baseline in our 30GB tests. The performance gain was less significant in the 15GB database tests and non-existent in the 8GB database tests.

Figure 8 shows that GD2L+CAC results in lower total I/O costs on both the SSD and HDD devices, relative to GD2L+CC, in the 30GB experiments. Both policies result in similar buffer pool hit ratios, so the lower I/O cost achieved by GD2L+CAC is attributable to better decisions about which pages to retain on the SSD. To better understand the reasons for the lower total I/O cost achieved by CAC, we analyzed logs of system activity to try to identify specific situations in which GD2L+CC and GD2L+CAC make different placement decisions. One interesting situation we encountered is one in which a very hot page that is in the buffer pool is placed in the SSD. This may occur, for example, when the page is cleaned by the buffer manager and there is free space in the SSD, either during cold start or because of invalidations. When this occurs, I/O activity for the hot page will spike because GD2L will consider the page to be a good eviction candidate. Under the CC policy, such a page will tend to remain in the SSD because CC prefers to keep pages with high I/O activity in the SSD. In contrast, CAC is much more likely to evict such a page from the SSD, since it can (correctly) estimate that moving the page will result in a substantial drop in I/O activity. Thus, we find that GD2L+CAC tends to keep very hot pages in the buffer pool and out of the SSD, while with GD2L+CC such pages tend to remain in the SSD and bounce into and out of the buffer pool. Such dynamics illustrate why it is important to use an anticipatory SSD manager (like CAC) if the buffer pool manager is cost-aware.

For the experiments with smaller databases (15GB and 8GB), there is little difference in performance between GD2L+CC and GD2L+CAC. Both policies result in similar per-transaction I/O costs and similar TPC-C throughput. This is not surprising, since in these settings most or all of the hot part of the database can fit into the SSD, i.e., there is no need to be…

…marginally faster than LRU+LRU2, and for the smallest database (8GB) they were essentially indistinguishable. LRU+MV-FIFO performed much worse than the other two algorithms in all of the scenarios we tested.
Figure 10: TPC-C Throughput Under LRU+MV-FIFO, LRU+LRU2, GD2L+LRU2, and GD2L+CAC for Various Database and DBMS…

- GD2L+CAC achieves the best performance
Alg & BP size (GB)   HDD util (%)   HDD I/O (ms)   SSD util (%)   SSD I/O (ms)   total I/O (ms)   BP miss rate (%)
GD2L+CAC
  1G                 85             39.6           19              8.8           48.4             7.4
  2G                 83             31.8           20              7.8           39.6             6.3
  4G                 82             23.5           20              5.8           29.3             4.8
LRU+LRU2
  1G                 85             49.4           15              8.6           58.0             6.7
  2G                 87             50.3           12              6.7           57.0             4.4
  4G                 90             48.5            9              4.7           53.2             2.4
GD2L+LRU2
  1G                 73             41.1           43             24.6           65.8            10.7
  2G                 79             46.5           32             18.9           65.3             8.8
  4G                 79             34.4           32             13.8           48.2             7.6
LRU+FIFO
  1G                 91            101.3            8              9.2          110.5             6.6
  2G                 92             92.3            6              5.8           98.1             4.4
  4G                 92             62.9            5              3.7           66.6             2.5

Figure 11: Device Utilizations, Buffer Pool Miss Rate, and Normalized I/O Time (DB size = 30GB)
…to 400M. We tested k set to 1%, 2%, 5% and … of the SSD size. Our results showed that k values in this range had little impact on TPC-C throughput. In InnoDB, each page identifier is eight bytes and the size of each page is …, so the hash map for a 400M SSD fits into ten pages. We measured the rate with which the SSD hash map was flushed and found that even with k = 1%, the highest rate of change to the hash map experienced by any of the three SSD management algorithms (CAC, CC, and LRU2) is less than … per second. Thus, the overhead imposed by checkpointing the hash map is negligible.
7. RELATED WORK
Placing hot data in fast storage (e.g., hard disks) and cold data in slow storage (e.g., tapes) is not a new idea. Hierarchical storage management (HSM) is a technique which automatically moves data between high-cost and low-cost storage media. It uses f…
TPC-C throughput (from Figure 10)

Device utilization, buffer pool miss rate, and normalized I/O time (from Figure 11)
Reference: the GreedyDual algorithm
- Setting: objects have uniform size but different retrieval costs
- Each page p is given a non-negative cost H
- When a page is brought into the cache or referenced, H is set to the cost of reading the page into the cache
- When making room in the cache, the page with the smallest H (Hmin) in the cache is evicted
- Hmin is then subtracted from the H of every remaining page
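A literal sketch of GreedyDual as summarized above (uniform page sizes, per-page retrieval costs; function and variable names are mine):

```python
def greedy_dual(capacity, requests, cost):
    """Simulate GreedyDual; return the number of cache misses."""
    H = {}          # cached page -> current cost value H
    misses = 0
    for p in requests:
        if p in H:
            H[p] = cost(p)               # on reference, reset H to the retrieval cost
            continue
        misses += 1
        if len(H) >= capacity:
            h_min = min(H.values())      # Hmin
            victim = min(H, key=H.get)   # a page with H == Hmin
            del H[victim]
            for q in H:                  # subtract Hmin from all remaining pages
                H[q] -= h_min
        H[p] = cost(p)                   # bring p in with H = its retrieval cost
    return misses

# Pages prefixed "s" are cheap to re-read (cost 1, say SSD-resident),
# pages prefixed "d" are expensive (cost 10, HDD-resident): the cheap
# pages are evicted preferentially.
cost = lambda p: 1 if p.startswith("s") else 10
misses = greedy_dual(2, ["s1", "d1", "s2", "d1", "s1"], cost)
```

The per-eviction subtraction of Hmin from every cached page is O(n); GD2L's inflation value L is the standard equivalent trick that avoids it, inflating newly inserted pages by the running L instead of deflating everything else.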

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (9)

20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMX
 
TRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseTRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use Case
 
Supporting HDF5 in GrADS
Supporting HDF5 in GrADSSupporting HDF5 in GrADS
Supporting HDF5 in GrADS
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
SSD Aware Scan Operation Optimization in PostGreSQL Database
SSD Aware Scan Operation Optimization in PostGreSQL DatabaseSSD Aware Scan Operation Optimization in PostGreSQL Database
SSD Aware Scan Operation Optimization in PostGreSQL Database
 
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...
 

Andere mochten auch (6)

ICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data WarehousingICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data Warehousing
 
ICDE2015 Research 3: Distributed Storage and Processing
ICDE2015 Research 3: Distributed Storage and ProcessingICDE2015 Research 3: Distributed Storage and Processing
ICDE2015 Research 3: Distributed Storage and Processing
 
data.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理するdata.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理する
 
巨大な表を高速に扱うData.table について
巨大な表を高速に扱うData.table について巨大な表を高速に扱うData.table について
巨大な表を高速に扱うData.table について
 
遅延価値観数と階層ベイズを用いた男心をくすぐる女の戦略.R
遅延価値観数と階層ベイズを用いた男心をくすぐる女の戦略.R遅延価値観数と階層ベイズを用いた男心をくすぐる女の戦略.R
遅延価値観数と階層ベイズを用いた男心をくすぐる女の戦略.R
 
Rデータフレーム自由自在
Rデータフレーム自由自在Rデータフレーム自由自在
Rデータフレーム自由自在
 

Ähnlich wie VLDB2013 Session 1 Emerging Hardware

ssd_performance_tech_brief
ssd_performance_tech_briefssd_performance_tech_brief
ssd_performance_tech_brief
Dave Glen
 
Designing SSD-friendly Applications for Better Application Performance and Hi...
Designing SSD-friendly Applications for Better Application Performance and Hi...Designing SSD-friendly Applications for Better Application Performance and Hi...
Designing SSD-friendly Applications for Better Application Performance and Hi...
Zhenyun Zhuang
 
AsawariKhedkar_SSD_HDD_Comparison
AsawariKhedkar_SSD_HDD_ComparisonAsawariKhedkar_SSD_HDD_Comparison
AsawariKhedkar_SSD_HDD_Comparison
Asawari Khedkar
 
fdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.pptfdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.ppt
yashsharma863914
 
Hp smart cache technology c03641668
Hp smart cache technology c03641668Hp smart cache technology c03641668
Hp smart cache technology c03641668
Paul Cao
 

Ähnlich wie VLDB2013 Session 1 Emerging Hardware (20)

ssd_performance_tech_brief
ssd_performance_tech_briefssd_performance_tech_brief
ssd_performance_tech_brief
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
 
Designing SSD-friendly Applications for Better Application Performance and Hi...
Designing SSD-friendly Applications for Better Application Performance and Hi...Designing SSD-friendly Applications for Better Application Performance and Hi...
Designing SSD-friendly Applications for Better Application Performance and Hi...
 
AsawariKhedkar_SSD_HDD_Comparison
AsawariKhedkar_SSD_HDD_ComparisonAsawariKhedkar_SSD_HDD_Comparison
AsawariKhedkar_SSD_HDD_Comparison
 
An Assessment of SSD Performance in the IBM System Storage DS8000
An Assessment of SSD Performance in the IBM System Storage DS8000An Assessment of SSD Performance in the IBM System Storage DS8000
An Assessment of SSD Performance in the IBM System Storage DS8000
 
An Assessment of SSD Performance in the IBM System Storage DS8000
An Assessment of SSD Performance in the IBM System Storage DS8000An Assessment of SSD Performance in the IBM System Storage DS8000
An Assessment of SSD Performance in the IBM System Storage DS8000
 
fdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.pptfdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.ppt
 
Demartek Lenovo Storage S3200 i a mixed workload environment_2016-01
Demartek Lenovo Storage S3200  i a mixed workload environment_2016-01Demartek Lenovo Storage S3200  i a mixed workload environment_2016-01
Demartek Lenovo Storage S3200 i a mixed workload environment_2016-01
 
Speeding time to insight: The Dell PowerEdge C6620 with Dell PERC 12 RAID con...
Speeding time to insight: The Dell PowerEdge C6620 with Dell PERC 12 RAID con...Speeding time to insight: The Dell PowerEdge C6620 with Dell PERC 12 RAID con...
Speeding time to insight: The Dell PowerEdge C6620 with Dell PERC 12 RAID con...
 
Aerospike: Key Value Data Access
Aerospike: Key Value Data AccessAerospike: Key Value Data Access
Aerospike: Key Value Data Access
 
Hp smart cache technology c03641668
Hp smart cache technology c03641668Hp smart cache technology c03641668
Hp smart cache technology c03641668
 
lecture3.pdf
lecture3.pdflecture3.pdf
lecture3.pdf
 
SSDs Deliver More at the Point-of-Processing
SSDs Deliver More at the Point-of-ProcessingSSDs Deliver More at the Point-of-Processing
SSDs Deliver More at the Point-of-Processing
 
Demartek lenovo s3200_mixed_workload_environment_2016-01
Demartek lenovo s3200_mixed_workload_environment_2016-01Demartek lenovo s3200_mixed_workload_environment_2016-01
Demartek lenovo s3200_mixed_workload_environment_2016-01
 
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerHow Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
 
Comparing desktop drive performance: Seagate Solid State Hybrid Drive vs. har...
Comparing desktop drive performance: Seagate Solid State Hybrid Drive vs. har...Comparing desktop drive performance: Seagate Solid State Hybrid Drive vs. har...
Comparing desktop drive performance: Seagate Solid State Hybrid Drive vs. har...
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
 
1a_query_processing_sil_7ed_ch15.ppt
1a_query_processing_sil_7ed_ch15.ppt1a_query_processing_sil_7ed_ch15.ppt
1a_query_processing_sil_7ed_ch15.ppt
 
1a_query_processing_sil_7ed_ch15.pptx
1a_query_processing_sil_7ed_ch15.pptx1a_query_processing_sil_7ed_ch15.pptx
1a_query_processing_sil_7ed_ch15.pptx
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

VLDB2013 Session 1 Emerging Hardware

  • 2. Hybrid Storage Management for Database Systems 2 }  目的:SSDとHDDからなるハイブリッドなストレージシステムを データベース管理に用いる }  貢献 }  GD2L:ハイブリッドストレージにおけるcost-awareなbuffer pool管理 アルゴリズムの提案 }  GreedyDualアルゴリズム[1]をベースに,SSDとHDDのパフォーマンスの差異 を考慮 }  CAC:予測的なコストベースのSSD管理技術の提案 }  GD2LとCACの実験によるパフォーマンス評価 }  MySQLのInnoDBストレージエンジンに提案機構を実装 Session 1: Emerging Hardware 担当:若森拓馬(NTT) システムアーキテクチャ(Figure: 1より) ustrated in Figure 1. When writing data to storage, the MS chooses which type of device to write it to. RAM Buffer pool manager SSD manager HDD read/write requests Buffer pool SSD write Figure 1: System Architecture revious work has considered how a DBMS should place in a hybrid storage system [11, 1, 2, 16, 5]. We provide mmary of such work in Section 7. In this paper, we a broader view of the problem than is used by most his work. Our view includes the DBMS bu↵er pool as as the two types of storage devices. We consider two ted problems. The first is determining which data should etained in the DBMS bu↵er pool. The answer to this tion is a↵ected by the presence of hybrid storage because ks evicted from the bu↵er cache to an SSD are much er to retrieve later than blocks evicted to the HDD. Thus, onsider cost-aware bu↵er management, which can take distinction into account. Second, assuming that the is not large enough to hold the entire database, we e the problem of deciding which data should be placed on SSD. This should depend on the physical access pattern [1] N. Young. The k-server dual and loose competitiveness for paging. Algorithmica, 11:525–541, 1994. •  全てのDBのページはHDDに格納され,いくつかのページのコ ピーがSSDとbuffer poolのいずれかor両方に格納されている •  buffer poolがいっぱいになった時,buffer pool managerが ページ置換ポリシに基づいてページをbufferから取り除く •  SSD managerは, •  SSD追加ポリシに基づいて,buffer poolから取り除かれ たページをSSDに追記する •  SSD置換ポリシに基づいて,SSDからデータを取り除く
  • 3. GD2L:cost-awareなバッファプール管理アルゴリズム 3 }  前提 }  ページpはコストHをもつ }  L(inflation value)を用いてHを計算 する }  2つの優先度付キューを使用する }  QS:SSD上のページ管理用 }  QD:SSD上にないページ管理用 }  アルゴリズム }  ページpがキャッシュされていない 時 }  QsのLRUページとQDのLRUページ のコストHを比較し,小さい方のHを もつページqを取り除く }  L←H(q),pをキャッシュに置く }  pがSSD上にあるとき }  H(p)←L + RS,pをQsのMRUに置く }  それ以外で,pがHDD上にあるとき }  H(p) = L + RD,pをQDのMRUに置く Session 1: Emerging Hardware 担当:若森拓馬(NTT) H S S S S H H H H HQD QS S QSLRU QDLRU Page is flushed to the SSD Page is evicted from the SSD Figure 3: Bu↵er pool managed by GD2L GD2Lによるbuffer pool管理(Figure: 3より) ストレージデバイスのパラメータ RD:HDDからの読み込み時間 WD:HDDへの書き込み時間 RS:SSDからの読み込み時間 WS:SSDへの書き込み時間 buffer poolの空きが足りなくなった時,page cleanerは dirty bufferをcost-awareのポリシに基づいて ストレージにフラッシュする
  • 4. cost-awareキャッシングの効果 4 }  ページがSSDにキャッシュ されているとき,buffer poolミス率と物理書き込み 率が高くなる }  理由 }  ページが一度SSDに置か れると,そのページは読み 出しコストが低いため, buffer poolから取り除かれ やすくなる }  page cleanerは取り除く前 にdirty pageをフラッシュし なければならない Session 1: Emerging Hardware 担当:若森拓馬(NTT) gure 3: Bu↵er pool managed by GD2L types of writes: replacement writes and recoverabil- s. Replacement writes are issued when dirty pages ified as eviction candidates. To remove the latency d with synchronous writes, the page cleaners try to hat pages that are likely to be replaced are clean me of the replacement. In contrast, recoverabil- s are those that are used to limit failure recovery he InnoDB uses write ahead logging to ensure that ed database updates survive failures. The failure time depends on the age of the oldest changes in r pool. The page cleaners issue recoverability writes ast recently modified pages to ensure that a config- covery time threshold will not be exceeded. oDB, when the free space of the bu↵er pool is below Figure 4: Miss rate/write rate while on the HDD (only) vs. miss rate/write rate while on the SSD. Each point represents one page HDDのキャッシュミス/書き込み率 対 SSDのキャッシュミス/書き込み率 (Figure: 4より)
  • 5. CAC:Cost-Adjusted Caching 5 }  pをSSDに置くかどうかを決定するために,利得Bを見積もる }  p’がすでにSSDキャッシュにあるとき,B(p’)<B(p)を満たすpをSSD キャッシュに置き,最も小さいBをもつページを取り除く }  miss rate expansion factor (α) }  ページの物理read/write率が, SSDにページがあるかどうかで どのように変化するかを見積もる ために導入 Session 1: Emerging Hardware 担当:若森拓馬(NTT) Figure 5: Summary of Notation Symbol Description rD, wD Measured physical read/write count while not on the SSD rS, wS Measured physical read/write count while on the SSD crD, cwD Estimated physical read/write count if never on the SSD crS, cwS Estimated physical read/write count if al- ways on the SSD mS Bu↵er cache miss rate for pages on the SSD mD Bu↵er cache miss rate for pages not on the SSD ↵ Miss rate expansion factor Figure 6: Miss rate expansion factor for pages from three TPC-C tables. expansion factors of pages grouped by table and by logical 3つのTPC-Cのテーブル(のページ)に対する miss rate expansion factor(Figure: 6より) would experience if it were placed on the SSD, and crD(p) and cwD(p) are the physical read and write counts p would experience if it were not. Using these hypothetical physical read and write counts, we can write our desired estimate of the benefit of placing p on the SSD as follows B(p) = (crD(p)RD crS(p)RS)+( cwD(p)WD cwS(p)WS) (2) Thus, the problem of estimating benefit reduces to the prob- lem of estimating values for crD(p), crS(p), cwD(p), and cwS(p). To estimate crS(p), CAC uses two measured read counts: rS(p) and rD(p). (In the following, we will drop the explicit page reference from our notation as long as the page is clear from context.) In general, p may spend some time on the SSD and some time not on the SSD. rS is the count of the number of physical reads experienced by p while p is on the SSD. rD is the number of physical reads experienced by p while it is not on the SSD. To estimate what p’s physical read count would be if it were on the SSD full time (crS), CAC uses a room for a new entry. 
Paper excerpt: the miss rate expansion factor α estimates how much a page's physical read and write rates change if the page is admitted to the SSD. With mS the overall miss rate of logical reads of pages on the SSD and mD the overall miss rate of logical reads of pages not on the SSD, both easily measured:
  α = mS / mD    (7)
For example, α = 3 means the miss rate is three times higher for pages on the SSD than for pages that are not. Estimation based on α works even when rS or rD is zero for lack of observations; CAC therefore tracks reference counts for all pages in the buffer pool and all pages on the SSD.
Paper excerpt: CAC additionally maintains an outqueue that tracks reference counts for a fixed number (Noutq) of further pages, such as pages recently evicted from the SSD. Because different pages can have substantially different miss rate expansion factors, CAC keeps a separate expansion factor per group of pages, grouping pages by the database object they belong to and by their logical read rate.
r̂D, ŵD: estimated number of physical reads/writes if the page were never on the SSD
r̂S, ŵS: estimated number of physical reads/writes if the page were always on the SSD
mS: buffer cache miss rate for pages on the SSD
mD: buffer cache miss rate for pages not on the SSD
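The benefit estimate and expansion factor above can be sketched in Python. This is a minimal illustration, not the InnoDB implementation from the paper: the device costs (R_D, R_S, W_D, W_S) and miss rates below are hypothetical values, and the α-based scaling of the observed counts into the two hypothetical counts is a simplified reading of the paper's estimation equations.

```python
def expansion_factor(m_s: float, m_d: float) -> float:
    """Eq. 7: alpha = mS / mD, ratio of the SSD to the non-SSD miss rate."""
    return m_s / m_d

def benefit(rhat_d, rhat_s, what_d, what_s, R_D, R_S, W_D, W_S):
    """Eq. 2: estimated I/O-cost saving from keeping a page on the SSD."""
    return (rhat_d * R_D - rhat_s * R_S) + (what_d * W_D - what_s * W_S)

# Hypothetical miss rates: pages on the SSD miss twice as often.
alpha = expansion_factor(m_s=0.5, m_d=0.25)   # alpha = 2.0

# Observed counts for a page that has so far lived off the SSD.
r_d, r_s = 10, 0

# Simplified alpha-scaling of observed counts into the two hypotheticals:
rhat_s = r_s + alpha * r_d    # reads expected if the page were always on the SSD
rhat_d = r_d + r_s / alpha    # reads expected if the page were never on the SSD

# Hypothetical per-operation costs: HDD reads 10x, writes 5x SSD cost.
b = benefit(rhat_d, rhat_s, 0, 0, R_D=10.0, R_S=1.0, W_D=10.0, W_S=2.0)
# b = (10*10 - 20*1) + 0 = 80: placing this page on the SSD still saves cost
```

Note how the expansion factor works against the SSD: the page would incur twice as many physical reads there, yet the benefit stays positive because each SSD read is so much cheaper.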
  • 6. Performance evaluation of GD2L and CAC 6 Session 1: Emerging Hardware Presenter: 若森拓馬 (NTT) }  GD2L+CAC achieves the best performance }  On the large (30GB) database, GD2L+CAC improves TPC-C throughput by about a factor of two over the LRU+CC baseline; the gain is smaller at 15GB and absent at 8GB, where most or all of the hot data fits in the SSD }  GD2L+CAC achieves lower total I/O cost on both devices than GD2L+CC by making better decisions about which pages to retain on the SSD: CAC correctly estimates that evicting a very hot buffered page from the SSD causes little extra I/O, so hot pages stay in the buffer pool and out of the SSD, while under CC such pages remain in the SSD and bounce into and out of the buffer pool }  LRU+MV-FIFO performs much worse than the other algorithms in every scenario tested }  Checkpointing the SSD hash map imposes negligible overhead for all three SSD management algorithms (CAC, CC, and LRU2)
TPC-C throughput under LRU+MV-FIFO, LRU+LRU2, GD2L+LRU2, and GD2L+CAC for various database and DBMS sizes (from Figure 10)
Device utilization, buffer pool miss rate, and normalized I/O time, DB size = 30GB (from Figure 11):
  Alg         BP size  HDD util  HDD I/O  SSD util  SSD I/O  total I/O  BP miss
              (GB)     (%)       (ms)     (%)       (ms)     (ms)       rate (%)
  GD2L+CAC    1G       85        39.6     19         8.8      48.4       7.4
              2G       83        31.8     20         7.8      39.6       6.3
              4G       82        23.5     20         5.8      29.3       4.8
  LRU+LRU2    1G       85        49.4     15         8.6      58.0       6.7
              2G       87        50.3     12         6.7      57.0       4.4
              4G       90        48.5      9         4.7      53.2       2.4
  GD2L+LRU2   1G       73        41.1     43        24.6      65.8      10.7
              2G       79        46.5     32        18.9      65.3       8.8
              4G       79        34.4     32        13.8      48.2       7.6
  LRU+FIFO    1G       91       101.3      8         9.2     110.5       6.6
              2G       92        92.3      6         5.8      98.1       4.4
              4G       92        62.9      5         3.7      66.6       2.5
  • 7. Reference: the GreedyDual algorithm 7 }  Setting: objects have uniform size but differing fetch costs }  Each page p is given a non-negative cost H }  When a page is brought into the cache or referenced, H is set to the cost of fetching the page into the cache }  When making room in the cache, the page with the smallest H (Hmin) is evicted }  Hmin is then subtracted from the H values of all remaining pages Session 1: Emerging Hardware Presenter: 若森拓馬 (NTT)
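The steps above can be sketched directly. This is a minimal illustrative Python sketch, not code from the paper; names are chosen for readability, and a production version would avoid the O(n) subtraction pass by keeping a global offset instead (an equivalent, standard optimization).

```python
class GreedyDualCache:
    """GreedyDual for uniform-size objects with per-object fetch costs."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.h = {}  # page -> current non-negative cost H

    def access(self, page, fetch_cost: float) -> str:
        if page in self.h:
            # On a hit, H is reset to the cost of fetching the page.
            self.h[page] = fetch_cost
            return "hit"
        if len(self.h) >= self.capacity:
            # Evict the page with the smallest H (Hmin) ...
            victim = min(self.h, key=self.h.get)
            h_min = self.h.pop(victim)
            # ... and subtract Hmin from the H of every remaining page.
            for p in self.h:
                self.h[p] -= h_min
        # Cache the new page with H set to its fetch cost.
        self.h[page] = fetch_cost
        return "miss"
```

With capacity 2, accessing "a" (cost 10), "b" (cost 1), then "c" (cost 5) evicts "b", the cheapest page to re-fetch, and ages "a" down to H = 9; this aging is what lets expensive but cold pages eventually leave the cache.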