A PeopleSoft & OBIEE Consolidation Success Story
1. Welcome to this session where we will be talking about consolidation on Exadata.
Before we get started, can I see a show of hands…
• How many of you have Exadata in your data center?
• How many have plans to bring Exadata in this year?
The company in this story…
• Large, multi-national commercial real estate company
• Offices in North America, Asia, the Middle East, Africa, Europe, and Australia
• Last year they consolidated all of their core business applications in North America
onto an Exadata Half Rack
• This year and next, they will be consolidating the remaining international offices onto
the new Exadata-based system
• Applications include PeopleSoft Financials, HR, OBIEE, a home-grown application built
on the PeopleTools framework, and various other applications.
In this session we'll discuss how we got there: the tools and methodology we used, as
well as the challenges (surprises) we encountered along the way.
10. The scenario: 7 databases that will be spread across the 4 compute nodes.
11. Let's say we have the following DBs to migrate onto Exadata…
I call this table "the node layout."
It is read as follows: database "A" requires 4 CPUs and will run on nodes 1 and 2 (2 CPUs
each).
You have 4 nodes with 7 databases spread across them. Now you want to see the
"cluster-level utilization," which you get by summing up all of the databases' core
requirements and dividing by the total number of cores across the cluster.
12. BUT more important is seeing the per-compute-node utilization, because you may
have one node that is 80% utilized while the rest of the nodes are in the 10% range.
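The cluster-level and per-node arithmetic above can be sketched as follows. The layout and core counts here are made-up illustrations, not the actual node layout from the slides:

```python
# Toy node layout: each database lists the cores it needs on each node.
# Numbers are hypothetical; the real worksheet uses CPU_COUNT-based
# requirements gathered from the source systems.
CORES_PER_NODE = 24  # threads counted as cores, per compute node

# db -> {node: cores assigned on that node}
layout = {
    "A": {1: 2, 2: 2},   # DB "A" needs 4 cores: 2 each on nodes 1 and 2
    "B": {2: 4},
    "C": {3: 6, 4: 6},
}

# Sum the cores landing on each node.
nodes = {1: 0, 2: 0, 3: 0, 4: 0}
for db, assignment in layout.items():
    for node, cores in assignment.items():
        nodes[node] += cores

total_cores = CORES_PER_NODE * len(nodes)
cluster_util = sum(nodes.values()) / total_cores * 100
per_node_util = {n: used / CORES_PER_NODE * 100 for n, used in nodes.items()}

print(f"cluster-level utilization: {cluster_util:.1f}%")
for n, pct in sorted(per_node_util.items()):
    print(f"node {n}: {pct:.1f}%")
```

The point of computing both views: the cluster number can look healthy while a single node is running hot, which is exactly the imbalance the node layout table is meant to expose.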
13. Here's another view of the node layout, where we distribute the CPU cores of the
instances based on their node assignments.
Each block on the left side is one CPU core. That's 24 cores per node, counting threads
as cores.
That count is based on the CPU_COUNT parameter and /proc/cpuinfo. You set (or take)
the number of CPUs from CPU_COUNT when you do instance caging, and using it keeps
the numbers consistent with the monitoring in OEM and AWR.
So at the cluster level the utilization is 29.2%, while per compute node this is the
utilization.
14. Now, what we don't want to happen is this: we change the node layout and assign
more instances to node 2, while still using the same total CPU core requirement across
the databases.
At the cluster level the utilization will be the same, BUT per compute node you end up
with node 2 at 80% utilization while the rest are pretty much idle.
So we created a set of tools that let us easily create a provisioning plan, play around
with scenarios, and audit the results.
And that's what Randy will be introducing.
BTW, I liked the part of Cary's interview where he mentioned that even with a 135-lane
highway you will still have the same traffic problem as with a 35-lane highway if you
saturate it with cars. So a capacity issue on small hardware can also be an issue on big
hardware, and on Exadata too.
This slide is similar to monitoring the utilization of the whole highway as well as the
per-lane utilization of that highway.
16. The three legs of the process:
1. Gather Requirements
2. Provision Resources
3. Audit Results
Utilization metrics from the audit should be fed back into the provisioning process for
re-evaluation.
17. • The more accurate your numbers, the better your plan will be… and the more
confident you will be about your provisioning plan.
• Allow time for testing!
18. • It's a capacity planning tool where we make sure that the 3 basic components (CPU,
memory, IO) do not exceed the available capacity.
• Oracle has created a set of tools to standardize the installation of Exadata, which
helps avoid mistakes and configuration issues.
• BUT the problem is: once all of the infrastructure is in place, how do you get to the
end state where all of the instances are up and running?
• This tool bridges that gap for you to get to that end state.
• And since it's an Excel-based tool it's pretty flexible, and you can hand it off to your
boss as documentation of the instance layout.
19. Now we move on to capacity…
There's a section on the provisioning worksheet where you input the capacity of the
Exadata that you currently have:
• For the node count you put 2, 4, or 8.
• Then we get the SPECint_rate equivalent of the Exadata processor so you'll be able to
compare the SPEED of the Exadata CPUs against your source servers.
• As I explained earlier, we are counting threads as cores, and I have an investigation
on that which is available at this link.
• Each node has 96GB of memory.
• Disk space depends on ASM redundancy and the DATA/RECO allocation.
• The table compression factor lets you gain disk space as you compress the big
tables.
• The OFFLOAD FACTOR is the fraction of CPU work that will be offloaded to the
storage cells.
This is art. It is NOT something you can calculate; it's not math, it's like black magic. We
have done a bunch of Exadata projects, so when we see a workload we can guess what
we think the offload percentage is. That definitely affects the CPU, but it's not
something you can scientifically calculate.
20. So there's a source platform and a destination platform, which is the Exadata, and you
are transferring a bunch of databases from different platforms. You have to get the
equivalent number of Exadata cores for each source system. What we do is:
• Find the type, speed, and number of CPU cores of the source platform.
• Make use of SPECint comparisons to find the equivalent number of Exadata cores needed.
• Of course, the CPU core capacity will depend on the Exadata that you have (Quarter, Half,
Full, Multiple Racks).
Let me introduce you to some simple math…
The chip efficiency factor is the multiplier that converts source cores into equivalent
Exadata cores, and we use it to get the "Exa cores requirement" (let me explain the formula).
Now, if it's a DW database you will probably be doing a lot of offloading, so that's where we
factor in the offload factor.
And I'm pretty sure that if you attended Tim's presentation or the tuning class… you will get a
higher offload factor here ;)
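A minimal sketch of the core-equivalence math just described. The SPECint_rate figures, the 40% offload, and the exact formula shape are illustrative assumptions, not the worksheet's actual numbers:

```python
# Hypothetical per-core SPECint_rate figures, for illustration only;
# real values come from published SPEC results for each chip.
source_specint_per_core = 20.0   # e.g. an older source server CPU
exadata_specint_per_core = 35.0  # the Exadata compute node CPU

source_cores = 16                # cores used by the source database

# Chip efficiency factor: how a source core compares to an Exadata core.
chip_eff = source_specint_per_core / exadata_specint_per_core

# Offload factor: fraction of CPU work pushed to the storage cells.
# This is the "black magic" judgment call; 40% is assumed here for a
# DW-style workload.
offload_factor = 0.40

exa_cores_required = source_cores * chip_eff * (1 - offload_factor)
print(f"Exa cores requirement: {exa_cores_required:.1f}")
```

The requirement computed this way for each database is what then gets placed onto nodes in the layout table and checked against the rack's core capacity.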
21. And we gather the raw data for CPU, memory, and storage.
22. Then the raw data gets translated to requirements
23. And then you play around with the node layout, where you spread the instances
across the compute nodes.
24. And we visually check the allocation of resources, see the details of that 75%
utilization in that part of the node layout, and spot any imbalance in the provisioning
plan at a glance.
25. And you can now simulate some scenarios, like a node failure.
If I kill node 1, all of the resources of that node fail over to the secondary node, and
you'll be able to see the effect on utilization on the rest of the nodes.
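A node-failure scenario like this can be sketched as follows. The layout, the failover mapping, and the core counts are all hypothetical; the real tool works off the provisioning worksheet:

```python
CORES_PER_NODE = 24

# Cores each instance uses, per node (made-up numbers).
layout = {
    1: {"hcmprd": 6, "fsprd": 4},
    2: {"biprd": 8, "mtaprd": 2},
    3: {"hcmprd": 6},
    4: {"fsprd": 4},
}

# Each instance's secondary node if its primary fails (assumed mapping;
# a fuller model would also handle the secondary itself being down).
failover = {"hcmprd": 2, "fsprd": 3, "biprd": 3, "mtaprd": 4}

def kill_node(layout, dead):
    """Return a new layout with the dead node's cores moved to each
    instance's failover node."""
    new = {n: dict(dbs) for n, dbs in layout.items() if n != dead}
    for db, cores in layout[dead].items():
        target = failover[db]
        new[target][db] = new[target].get(db, 0) + cores
    return new

after = kill_node(layout, 1)
for n in sorted(after):
    used = sum(after[n].values())
    print(f"node {n}: {used}/{CORES_PER_NODE} cores = {used/CORES_PER_NODE:.0%}")
```

Recomputing the per-node utilization on the post-failure layout is what tells you whether the surviving nodes can actually absorb the failed node's instances.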
26. I know IOPS and MB/s are important, but they're not something you can simplify in a
worksheet. Gathering the IO requirements is a more involved process and a bit
complicated, so altogether we have a different process for that.
Check out this link for an idea about sizing for IOPS:
http://karlarao.wordpress.com/2012/06/29/the-effect-of-asm-redundancyparity-on-readwrite-iops-slob-test-case-for-exadata-and-non-exa-environments/
The IOPS formula is shown here:
https://lh6.googleusercontent.com/-00PkzwfwnOE/T-N0oo2Q-FI/AAAAAAAABqE/EbTOnHBlpmQ/s2048/20120621_IOPS.png
where you factor in the total workload IOPS and the read and write percentages of the
IO to get the matching Exadata rack that you need.
And check out the example IO visualizations here for an idea of how to build detailed IO
time series data: http://goo.gl/xZHHY
IO breakdown (single/multiblock IOPS, MB/s): http://goo.gl/UFavy
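As a back-of-the-envelope sketch of the kind of factoring the linked IOPS formula does: writes get multiplied by the number of copies ASM keeps (2 for normal redundancy, 3 for high), while reads do not. The workload numbers below are hypothetical:

```python
# Back-of-the-envelope IOPS sizing sketch (illustrative numbers).
workload_iops = 20000
read_pct = 0.70          # 70% reads, 30% writes
asm_write_copies = 2     # ASM normal redundancy mirrors each write twice

# Reads hit one copy; each write lands on every mirror copy.
disk_iops = (workload_iops * read_pct
             + workload_iops * (1 - read_pct) * asm_write_copies)
print(f"effective disk IOPS needed: {disk_iops:.0f}")
```

The effective number is what you compare against the rack's rated IOPS; the full process in the linked post also accounts for flash cache hits and single- vs multiblock IO, which is why it doesn't fit in the worksheet.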
27. So that is the provisioning spreadsheet.
• How many of you think this might be a useful tool?
• We're exploring the idea of converting this tool to an Apex application.
33. The first success story we want to talk about came about on a busy Friday, right in the
middle of month-end processing.
During month-end processing the 4 primary business databases become extremely busy
and Exadata is put to the test.
34. Just a quick review of the instance/node layout. Notice that the HR database (HCMPRD)
shares node 2 with BIPRD (and two other smaller databases).
35. Our first hint that there was a problem was the Oracle Load Map, which showed 66
active sessions on the HR database, all waiting for CPU!
• Complaints were coming in that people could not enter their time (HR) and that
OBIEE was running painfully slow.
• When we tried to log in to the database server, it was so saturated that we could
hardly get logged in.
36. We went to the Top Activity page for HCMPRD and found that the problem was with one
particular SQL statement.
We knew that we probably had a SQL statement with a bad plan, but we needed to take
the pressure off of the CPU before we could do anything.
37. Our first course of action was to implement instance caging to reduce the strain on CPU
resources and lessen the impact on the other databases sharing that node.
• We caged the instance to 12 CPUs; notice what happened to the operating
system run queue when we did this.
• Once we limited the instance to 12 CPUs, we went about the task of
investigating what went wrong with the SQL statement.
• We found that the execution plan *had* in fact changed.
• We used a SQL Profile to lock in the good execution plan; now look at what
happened to the active session count when we implemented the profile.
38. From the O/S view we can see that the load average dropped from 36.42 down to 10.36
due to Instance Caging.
We also see that the run queue dropped from 58 to 14 due to Instance Caging.
We were so pleased with Instance Caging that we decided to roll it out to all production
databases.
Instance Caging is not a fix for SQL. What it does is protect your system and the other
databases from the effects of the occasional load spike due to plan changes or new SQL
that performs poorly.
40. Point out that this is a new application without any utilization history, so we didn't
know what to expect from it.
• biprd is still contending with hcmprd, fsprd, and mtaprd when it runs inefficient
SQLs with Cartesian joins, causing a high level of swapping, which translates to
high wait IO and high load average…
• Even with instance caging it still affects those databases, because the swapping
happens outside the Oracle kernel and instance caging is all about capping the CPU
inside the Oracle kernel.
The tool I used here is the "AWR Tableau Toolkit," which is available here:
http://karlarao.wordpress.com/scripts-resources/
41. • Here's the stacked bar graph of the CPU usage of the entire cluster during that
workload spike from OBIEE.
• The half rack has 96 cores. What is being graphed here is AWR data coming from the
top events, pulling just the events "CPU" and "Scheduler," which correspond to
actual CPU usage and CPU capped by instance caging; that's why on the graph
you see it exceed the 96-core limit.
• This is one of the visualizations that Tableau can do that is not available in Enterprise
Manager, and it is very helpful in capacity planning. Check out this link to see more
examples: http://goo.gl/xZHHY
42. We had to be drastic with our solution, so we segregated OBIEE from everyone else and
ran it as a standalone database.
The advantage of this is that we have isolated the anti-social database.
You don't want to be in a situation where, every now and then, whenever this guy goes
bad it affects this guy and this guy, because they don't really fit together in terms of
workload.
43. And that leads us to the OBIEE issue…
This is the performance page whenever that inefficient batch of SQLs runs.
44. And this is mining the AWR with Tableau. Whenever this happens we find spikes in
PGA, wait IO, and load average.
45. And really, when kswapd kicks in, you are in trouble :)
This is bad because you only have 4 internal drives in your compute node, and you are
dumping that bunch of data from memory to disk, which causes the wait IO and
translates to high load average.
The only way to get rid of it is to cancel the SQLs or kill the sessions.
46. And this is the inefficient SQL.
It's a SQL that runs for 2 minutes, does a tiny bit of smart scans, and consumes 1.6GB
of PGA.
Well, it's not really that bad on its own, but when you execute 60 of these at the same
time, that is already the total memory that you have.
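The arithmetic behind that statement, using the 96GB-per-node figure from the capacity slide:

```python
# One execution of the inefficient SQL vs. the node's physical memory.
pga_per_exec_gb = 1.6
concurrent_execs = 60
node_memory_gb = 96

total_pga_gb = pga_per_exec_gb * concurrent_execs
print(f"total PGA demand: {total_pga_gb:.0f}GB of {node_memory_gb}GB node memory")
# 60 executions x 1.6GB = 96GB: the entire memory of the compute node,
# which is why kswapd kicks in and the node starts swapping.
```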
47. So we did some tuning and engaged some Enkitec resources, Karen and Martin.
Then we DBAs helped them with load testing and pushing it from TND to PROD…
The design of the fact tables was fixed so that the resulting OBIEE SQL would be
efficient and consume less memory; some of the ETL was changed to pre-calculate the
time series metrics and store them in those fact tables; and we also partitioned the
FACT tables.
And this is now the load after the tuning, with the target number of users that they
want, which is around 200 users.
48. Mention John Clark's presentation on IORM.
…and hopefully you made it to John Clark's presentation, "I/O Resource Management on
Exadata," where he explained the mechanics of how IORM works and why it is important.
49. Got an email from the DBA that his database refresh took 12 hours when it usually takes
40 minutes.
Immediately we investigated IO contention issues.
50. We saw that this BIUAT database was the only active database, and it was hogging the IO.
52. From the gas.sql script
https://www.dropbox.com/sh/jzcl5ydt29mvw69/XiBJ3MgV1q/PerformanceAndTroubleshooting
you can see about 31 sessions, most of them doing smart scans.
53. And looking at the SQL, it is scanning 4TB of data and returning 400GB of it. If you do
this kind of SQL in orders of magnitude, then you will saturate the IO bandwidth of
Exadata.
Going back to what Cary mentioned: even with a 135-lane highway you will still have
the same traffic problem as with a 35-lane highway if you saturate it with cars. So a
capacity issue on small hardware can also be an issue on big hardware, and on Exadata
too.
54. You can see the details of the smart scans in the row source operations of the SQL.
Here, each row source is scanning 1TB and returning 144GB.
55. And this is what the Top Activity page of the other databases looks like when they
encounter an IO contention issue.
You'll see here that the database is having high System I/O waits on the critical
background processes: CKPT, LGWR, DBWR, LMON.
56. This led us to a simple IORM plan that just caps the BIPRD database, the "anti-social
database," at Level 1, with the OTHER group (the rest of the databases on the cluster)
at Level 2.
The IORM objective we have is AUTO.
This decision is based on analysis of the workload of the databases doing the smart
scans: BIPRD will still get 100% if the other databases are idle, and will pull back to 30%
when the databases from the OTHER group need IO bandwidth (and vice versa).
SYS@biprd2> show parameter db_uniq
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string BIPRDDAL
# main commands
alter iormplan dbPlan=( -
(name= BIPRDDAL, level=1, allocation=30), -
(name=other, level=2, allocation=100));
alter iormplan active
list iormplan detail
list iormplan attributes objective
alter iormplan objective = auto
# list
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'
# implement
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan dbPlan=( (name= BIPRDDAL, level=1, allocation=30), (name=other, level=2, allocation=100));'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan active'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan objective = auto'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'
# revert
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan dbPlan=""'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan catPlan=""'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan inactive'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
dcli -g ~/cell_group -l root 'cellcli -e alter iormplan objective=""'
dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'
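As a toy model of how this two-level plan shares bandwidth under the AUTO objective (a deliberate simplification of real IORM scheduling, for intuition only: an idle group's share is redistributed, and under contention the level-1 allocation caps BIPRD):

```python
def share(biprd_active, other_active):
    """Toy model of the two-level plan above: BIPRD at level 1 with a
    30% allocation, OTHER at level 2 taking 100% of what level 1 leaves.
    Real IORM schedules individual IO requests; this only shows the
    steady-state split."""
    if biprd_active and not other_active:
        return {"BIPRD": 1.0, "OTHER": 0.0}   # idle group's share cascades
    if other_active and not biprd_active:
        return {"BIPRD": 0.0, "OTHER": 1.0}
    return {"BIPRD": 0.30, "OTHER": 0.70}     # contention: the 30% cap bites

print(share(True, False))   # BIPRD alone: it still gets 100%
print(share(True, True))    # OTHER needs bandwidth: BIPRD pulls back to 30%
```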
57. This is a simple test case from the IOsaturationtoolkit-v2, run on the test and dev
environment:
http://karlarao.wordpress.com/2012/05/14/iosaturationtoolkit-v2-with-iorm-and-awesome-text-graph/
All of the databases are doing sustained IO from the toolkit.
With the IORM plan set (previous slide), you can see here that the BI database pulls
back and gives priority to the HCM database.
In the summary below, you can see the total bandwidth given to each of the databases
(last column) and the response time in seconds for each database (second-to-last
column).
59. This is a PeopleSoft SQL, and it executes pretty fast, finishing in seconds.
But the problem here is that the execution plan has a Full Table Scan on one particular
table. Now you'll see in this scenario how a batch job that should not have been
executed in that morning period triggers a workload change and transforms this SQL
from an "all CPU" workload to an "all IO" workload.
60. This is what normal looks like for this database: "all green," meaning mostly CPU.
61. Then it shifted to this "all blue," which, given that the Average Active Sessions shot up
to the range of 60, means a lot of IO.
So there's something wrong here. Check out the SQL_ID 7r9k: it went from all CPU to all
IO. But why?
62. Also, this other SQL, which has a Full Scan on that same table, is being affected and is
contributing to the sudden "all IO" workload change.
63. Now, the moment of truth.
We found out that there was a batch job executed by the Application DBA, a batch job
that was out of schedule.
And as soon as he cancelled that job, the workload went back from "all IO" to the
normal "all CPU" workload.
64. Now everything went back to the original workload…
And notice that the SQL_ID 7r9k went back from "all blue" (IO) to "all green" (CPU).
And this is interesting!
65. Even more interesting: as I was investigating the wait events of this SQL, when the
batch job was cancelled it flipped from doing smart scans back to CPU right away,
without a plan change. Interesting!
So what could have caused this?
Explanation:
When the batch run was executed it flooded the buffer cache with blocks. This SQL
7r9k normally goes to CPU (on a normal workload) because the optimizer thinks it is
faster to fetch the blocks from the buffer cache (LIOs). Now, this is a serial SQL, and for
it to go smart scan it has to do serial direct path reads. The underlying concept is that
the decision relies on an algorithm, a computation, and part of that computation checks
the number of the table's blocks that are in the buffer cache. So that batch job altered
the computation, and it made the SQL go directly to the storage cells and favor smart
scans.
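The shape of that decision can be sketched as follows. The real serial direct path read algorithm is internal to Oracle and version-dependent; the threshold percentages and cutoff below are illustrative assumptions, not the actual values:

```python
# Hedged sketch of the serial direct path read decision described above:
# a segment bigger than the small-table threshold favors direct path
# (and thus smart scan), but a large cached fraction of that segment
# tips the decision back to buffered reads (LIOs).
def serial_direct_path(segment_blocks, cached_blocks, buffer_cache_blocks,
                       small_table_pct=0.02, cached_cutoff=0.5):
    small_table_threshold = buffer_cache_blocks * small_table_pct
    if segment_blocks <= small_table_threshold:
        return False                         # small table: buffered reads
    cached_fraction = cached_blocks / segment_blocks
    return cached_fraction < cached_cutoff   # mostly uncached: direct path

# Normal workload: a big chunk of the table is cached -> buffered (CPU)
print(serial_direct_path(100_000, 60_000, 500_000))   # False
# Batch job evicts the table's blocks from the cache -> smart scan (IO)
print(serial_direct_path(100_000, 5_000, 500_000))    # True
```

This is why the flip happened without a plan change: the plan still says Full Table Scan either way; only the read mechanism underneath it changed.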
How can this be avoided?
Well, this can be avoided by creating an index on that table. In this case it's just one
table that is affected and causing this, so whenever that batch run is executed in the
morning again, having an index will let the SQL run on LIOs and it will not flip to smart
scans.
BTW, having this SQL flip from LIOs to smart scans made its runtime go from seconds to
minutes.
So workload change, or workload tuning, is another thing that you have to be aware of.