Always on high availability best practices for informix

© 2015 IBM Corporation
DMX-2628 – Always On: High Availability Best Practices for Informix
Nagaraju Inturi
nagaraju@us.ibm.com
Scott Lashley
slashley@us.ibm.com

Please Note:
•  IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal
without notice at IBM’s sole discretion.
•  Information regarding potential future products is intended to outline our general product direction
and it should not be relied on in making a purchasing decision.
•  The information mentioned regarding potential future products is not a commitment, promise, or
legal obligation to deliver any material, code or functionality. Information about potential future
products may not be incorporated into any contract.
•  The development, release, and timing of any future features or functionality described for our
products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will vary
depending upon many factors, including considerations such as the amount of multiprogramming in the
user’s job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results similar to those stated
here.

Industry Terms
•  Recovery Point Objective (RPO)
§  How much data are you willing to lose?
•  Recovery Time Objective (RTO)
§  How much time are you willing to spend recovering from a failure?
•  Example
§  ONCONFIG parameter RTO_SERVER_RESTART
Monitors transaction activity and coordinates checkpoints so
that, in the event of a server crash, the server can restart within the
time specified by RTO_SERVER_RESTART

Hot Standby
•  Fred wants to implement an RTO policy of 15 seconds in the
event of a failure.
[Diagram: Primary and Secondary]

Updatable Secondary
•  Fred wants to extend his HDR solution to utilize the secondary.
[Diagram: Primary and Secondary]

Updatable Secondary
•  How do updates on the secondary work?
§  Row locks are acquired on secondary as updates are applied
from primary
§  Initial read is done on secondary
§  Update is forwarded to primary
•  If row versioning is defined in the schema for the table, the version is
compared to determine if update can be applied
•  Otherwise, whole row is compared to determine if update can be
applied
•  What isolation levels are supported on a secondary?
§  Dirty Read
§  Committed Read
§  Committed Read Last Committed
http://www-01.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.admin.doc/ids_admin_0874.htm%23ids_admin_0874?lang=en
http://www-01.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.admin.doc/ids_admin_0875.htm?lang=en
http://www-01.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.admin.doc/ids_admin_0877.htm?lang=en
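The check that decides whether a forwarded update can be applied can be sketched as a small model. This is an illustration only, not the server's implementation: `Row`, `StaleRowError`, and `apply_forwarded_update` are hypothetical names, and a `version` of `None` stands for a table defined without row versioning.

```python
from dataclasses import dataclass
from typing import Optional


class StaleRowError(Exception):
    """Hypothetical stand-in for the server rejecting a stale update."""


@dataclass
class Row:
    data: dict
    version: Optional[int] = None  # None models a table without row versioning


def apply_forwarded_update(current: Row, before_image: Row, new_data: dict) -> Row:
    """Apply an update forwarded from a secondary if the row is unchanged.

    `current` is the row as it exists on the primary, `before_image` is the
    row as the secondary read it before forwarding the update.
    """
    if current.version is not None and before_image.version is not None:
        # Row versioning is defined: compare only the version.
        unchanged = current.version == before_image.version
    else:
        # No row versioning: compare the whole row.
        unchanged = current.data == before_image.data
    if not unchanged:
        raise StaleRowError("row changed on the primary since it was read")
    next_version = None if current.version is None else current.version + 1
    return Row(dict(new_data), next_version)
```

Comparing a single version value is cheaper than comparing every column, which is the practical reason to define row versioning on tables that are updated from secondaries.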

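Application Perspective - Locking & Queries
[Diagram: anatomy of an update, with the locks held and what a concurrent query sees]
•  Anatomy of update
§  Begin Work → Read Row (V1) → Update Row (V2) → Apply Update Sec → Commit Work → Apply Commit Sec
§  Locks on primary: ulock, xlock, release lock Pri
§  Locks on secondary: xlock Sec, release lock Sec
•  Query on primary
§  row(V1), then while the row is locked: DR=row(V2), CRLC=row(V1), CR,CS,RR=block
•  Query on secondary
§  row(V1), then while the row is locked: DR=row(V2), CRLC=row(V1), CR=block
•  DR = Dirty Read, CRLC = Committed Read Last Committed, CR = Committed Read, CS = Cursor Stability, RR = Repeatable Read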
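Application Perspective - Locking & Updates
[Diagram: anatomy of an update, with the locks held and what a concurrent update sees]
•  Anatomy of update
§  Begin Work → Read Row (V1) → Update Row (V2) → Apply Update → Commit Work → Apply Commit
§  Locks: ulock, xlock, release lock; xlock Sec, release lock Sec
•  Update on primary
§  row(V1), then block while the row is locked
•  Update on secondary
§  If hot row, push to primary
§  Otherwise, row(V1), then block while the row is locked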

Application behavior
•  I’m on an updatable secondary and my application just updated a row,
but the update is not committed yet. If I read the row,
what version of the row will I see?
§  When my session (or any other session) attempts to read a
recently updated row, it waits for the secondary server I’m
connected to to replay that row update before reading the row.
[Diagram: Begin Work → Read Row (V1) → Forward Update Sec → Wait Apply Update Sec → Commit Work → Apply Commit Sec. "Read Row Again" blocks until the row is applied.]

Application behavior
•  When I get error 7350 “Attempt to update a stale version of a
row”, what happened?
§  My application read a row from the secondary node, and between
the time the row was read and the time the update was forwarded
to the primary, another transaction completed an update to
the row.
[Diagram: two timelines]
•  Update on secondary: Read row (V1) → Forward update (V3)
•  Update on primary: Read row (V1) → Update row (V2) → Commit row (V2)
At this point, the forwarded update is based on a different version
from the one that is committed; an error is returned.
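One common application-side response to a stale-row error is to re-read the row and try again. The sketch below shows that pattern in isolation. It is an assumption about how an application might react, not something the slides prescribe: `StaleRowError` is a hypothetical stand-in for however your driver surfaces error 7350, and `read_row`, `write_row`, and `change` are callables supplied by the caller.

```python
class StaleRowError(Exception):
    """Hypothetical stand-in for the driver reporting error 7350."""


def update_with_retry(read_row, write_row, change, max_attempts=3):
    """Re-read the row and re-apply `change` when the write hits a stale row.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        row = read_row()                 # fresh read on every attempt
        try:
            write_row(row, change(row))  # may raise StaleRowError
            return attempt
        except StaleRowError:
            if attempt == max_attempts:
                raise                    # give up and let the caller decide
```

Recomputing the change from the fresh read matters: replaying the old values would silently overwrite the other transaction's update.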

What’s new?
•  My application is using UPDATABLE_SECONDARY
configuration to perform queries and updates on all the
members of my HDR cluster. How do I coordinate transactions
across the HDR cluster?
•  CLUSTER_TXN_SCOPE
ONCONFIG and session parameter used to control when the
application receives an acknowledgement of the commit of a
user’s transaction.
CLUSTER_TXN_SCOPE | Connected to Primary | Connected to Secondary
SESSION | ACK when commit is complete | ACK when commit is complete on primary
SERVER (default) | ACK when commit is complete | ACK when commit is complete on primary and processed on the node I’m connected to
CLUSTER | ACK when commit has been applied to all nodes | ACK when commit has been applied to all nodes
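The acknowledgement rules in the CLUSTER_TXN_SCOPE table can be written down as a small predicate. This is a toy model under stated assumptions, not server code: log positions are plain integers, `applied` is a hypothetical map from node name to the highest position that node has applied, and the key `"primary"` identifies the primary.

```python
def commit_acknowledged(scope, connected_to, commit_pos, applied):
    """Return True once the commit at `commit_pos` may be acknowledged.

    scope        -- "SESSION", "SERVER" or "CLUSTER"
    connected_to -- name of the node the client session is connected to
    applied      -- dict: node name -> highest log position applied there
    """
    on_primary = applied["primary"] >= commit_pos
    if scope == "SESSION":
        # Complete on the primary is enough, wherever the client is connected.
        return on_primary
    if scope == "SERVER":
        # Also processed on the node the client is connected to.
        return on_primary and applied[connected_to] >= commit_pos
    if scope == "CLUSTER":
        # Applied on every node in the cluster.
        return all(pos >= commit_pos for pos in applied.values())
    raise ValueError(f"unknown CLUSTER_TXN_SCOPE: {scope}")
```

With SERVER scope, a session that commits and then immediately queries the same secondary sees its own change; with SESSION scope it may not.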

What’s new?
•  DRINTERVAL & HDR_TXN_SCOPE
These parameters work together to determine synchronization
between primary and secondary nodes
•  FULL_SYNC is new
DRINTERVAL | HDR_TXN_SCOPE | Buffered logging | Unbuffered logging
-1 | n/a | Async | Near sync
0 | FULL_SYNC | Full sync | Full sync
0 | ASYNC | Async | Async
0 | NEAR_SYNC | Near sync | Near sync
>0 | n/a | Async | Async
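The table above reads as a lookup from three inputs to one effective synchronization mode. A minimal sketch of that lookup follows; the function name `hdr_sync_mode` and the string results are illustrative choices, and the logic is taken only from the rows of the table.

```python
def hdr_sync_mode(drinterval, hdr_txn_scope=None, buffered_logging=False):
    """Return the effective HDR mode: "ASYNC", "NEAR_SYNC" or "FULL_SYNC"."""
    if drinterval > 0:
        # Interval-based flushing is asynchronous for both logging modes.
        return "ASYNC"
    if drinterval == -1:
        # HDR_TXN_SCOPE does not apply; the database logging mode decides.
        return "ASYNC" if buffered_logging else "NEAR_SYNC"
    if drinterval == 0:
        # HDR_TXN_SCOPE decides, independent of the logging mode.
        if hdr_txn_scope not in ("FULL_SYNC", "ASYNC", "NEAR_SYNC"):
            raise ValueError(f"unknown HDR_TXN_SCOPE: {hdr_txn_scope}")
        return hdr_txn_scope
    raise ValueError(f"unsupported DRINTERVAL: {drinterval}")
```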

DRINTERVAL & HDR_TXN_SCOPE
•  My RPO is 0 for a single point of failure
DRINTERVAL=0
HDR_TXN_SCOPE=NEAR_SYNC
This setting makes sure that committed transactions are received by the
secondary. If the primary fails, all committed transactions are
guaranteed to be at least in volatile memory on the secondary.
•  My RPO is 10 seconds for a single point of failure
DRINTERVAL=10
This setting makes sure that a buffer is sent to the secondary at least every 10 seconds.
•  My RPO is 0 for multiple points of failure
DRINTERVAL=0
HDR_TXN_SCOPE=FULL_SYNC
This setting makes sure that committed transactions are received and
written to disk by the secondary. If the primary fails, all committed
transactions are guaranteed to be hardened to disk on the secondary.
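Going the other way, from an RPO goal to settings, the three scenarios on this slide can be captured in a helper. This is a sketch that encodes only those three cases; `hdr_settings_for_rpo` is a hypothetical name, and the combination the slide does not cover (a non-zero RPO with multiple points of failure) is rejected rather than guessed.

```python
def hdr_settings_for_rpo(rpo_seconds, survive_multiple_failures=False):
    """Return the ONCONFIG settings the slide suggests for an RPO goal."""
    if rpo_seconds < 0:
        raise ValueError("RPO cannot be negative")
    if rpo_seconds == 0:
        # Zero data loss: NEAR_SYNC keeps commits in the secondary's memory,
        # FULL_SYNC also hardens them to the secondary's disk.
        scope = "FULL_SYNC" if survive_multiple_failures else "NEAR_SYNC"
        return {"DRINTERVAL": 0, "HDR_TXN_SCOPE": scope}
    if survive_multiple_failures:
        raise ValueError("case not covered by the slide")
    # Time-based RPO: flush a buffer to the secondary at least this often.
    return {"DRINTERVAL": rpo_seconds}
```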

Offsite disaster
•  Fred wants to extend his HDR solution to include offsite
replication in case of site disaster.
[Diagram: Primary, Secondary, and RSS Secondary]

Remote Standalone Secondary (RSS)
•  You want our remote site located in Timbuktu?
How’s the network connectivity to that site?
•  You dropped what database?
§  DELAY_APPLY
•  You’re planning to do what maintenance
this weekend?
§  Stop Apply command
•  RSS Limitations
§  Can only be promoted to HDR secondary, not primary
§  SYNC mode not supported

Improved Network performance
•  SMX_NUMPIPES
§  There is a limit on how many TCP buffers can be in flight across a
wire between a pair of ports until a TCP ACK is sent to the
sender. This is referred to as the TCP window. SMX can be
configured to have multiple pairs of ports between two given
servers, in effect filling in the gaps that would otherwise occur on
the network wire. This is especially advantageous if the network
connection is over a WAN or of less than the best quality. In such
conditions, setting SMX_NUMPIPES to 2 can result in up to twice as
much data being sent across the wire.
§  SMX reorders the transmissions on the target node so that
the data appears to have been received across a single serial
connection.
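The arithmetic behind this is the usual window-over-round-trip bound: one connection can have at most one window of unacknowledged data in flight per round trip, so N pipes raise the bound N times until the link itself is the limit. The sketch below shows that bound and a simple reordering step. Both are illustrations under assumptions: the per-transmission sequence numbers are hypothetical, and nothing here describes SMX's actual wire format.

```python
def max_throughput_bytes_per_sec(window_bytes, rtt_seconds, pipes=1,
                                 link_bytes_per_sec=None):
    """Upper bound on throughput imposed by the TCP window.

    Each pipe can carry at most one window per round trip; the optional
    link capacity caps the total.
    """
    if window_bytes <= 0 or rtt_seconds <= 0 or pipes < 1:
        raise ValueError("window, RTT and pipe count must be positive")
    bound = pipes * window_bytes / rtt_seconds
    if link_bytes_per_sec is not None:
        bound = min(bound, link_bytes_per_sec)
    return bound


def reorder(transmissions):
    """Yield payloads in sequence order from (seq, payload) pairs.

    Pairs may arrive out of order because they travelled over different
    pipes; sequence numbers are assumed to start at 0 with no gaps.
    """
    pending = {}
    next_seq = 0
    for seq, payload in transmissions:
        pending[seq] = payload
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1
```

For example, a 64 KiB window over a 100 ms round trip bounds a single pipe at roughly 640 KiB/s regardless of link speed, which is why the gain shows up mainly on WAN links.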

What’s new (and really cool)?
•  Informix Warehouse Accelerator (IWA)

What’s really cool?
Hey Scott, we are having an online sale this weekend and we expect a
huge influx of internet activity on our web site. I might have forgotten to tell
you that. Can our infrastructure handle that?
•  Shared Disk Secondary (SDS)
§  Adjust capacity as demand
changes
§  Does not duplicate disk space
§  No special hardware
•  Cluster mgr or SDS_LOGCHECK
§  Coexist with ER, HDR & RSS
§  Primary can fail over to any SDS
•  ifxclone
§  Make a quick copy

What’s improved?
•  Index page logging (IPL)
§  Copies a newly created index from primary to secondary using
the logical log.
§  Required for RSS secondary servers
§  Big performance boost (4x)

Best Practices for HDR, RSS, SDS
•  All nodes that are candidates for failover (HDR secondary &
SDS) should have similar specs in case there is a failover
•  Use unbuffered database logging to minimize lost transactions
•  ONCONFIG parameter OFF_RECVRY_THREADS should be
set to prime (# of CPUs) * 3
•  Turn on AUTO_READAHEAD on secondary
•  Larger BUFFERPOOL can alleviate some random I/O
•  ONCONFIG parameter TEMPTAB_NOLOG=1 to default temp
tables to non-logging
•  ONCONFIG parameter HA_ALIAS=TCP network-based server
alias
§  Tells the server which network interface and port to use for
server-to-server replication traffic.
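The slide's "prime (# of CPUs) * 3" is terse. One plausible reading, and it is only an assumption, is "the first prime number at or above three times the CPU count". The sketch below computes that value; `suggested_recovery_threads` is a hypothetical helper, and the result should be checked against the product documentation before it goes into an ONCONFIG file.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small thread counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def suggested_recovery_threads(cpus):
    """First prime >= 3 * cpus (one reading of the slide's rule of thumb)."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    candidate = cpus * 3
    while not is_prime(candidate):
        candidate += 1
    return candidate
```

Under this reading, a 4-CPU secondary would get 13 and an 8-CPU secondary 29.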

Best practices for HDR
•  ONCONFIG parameter DRINTERVAL=0 and use
HDR_TXN_SCOPE (ASYNC, NEAR_SYNC or FULL_SYNC)
•  ONCONFIG parameter DRAUTO=3 and use Connection
Manager to arbitrate failover
•  ONCONFIG parameter LOG_STAGING_DIR always set
§  Some log records, like CHECKPOINT, require serialized
processing which can block the primary from sending log data.
When an HDR secondary is configured with a log staging
directory, the logs can be spooled to disk while the serialized log
record is applied on the secondary. Once the log record has
been applied, the secondary will apply the spooled log until it
catches up with the primary. This can alleviate backflow pressure
from the secondary to the primary.
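The staging behaviour described above can be pictured with a toy model in which the secondary always accepts incoming log records and queues them while a serialized record is being processed. This is a simplified sketch, not the server's design: `StagingSecondary` is a hypothetical class, and an in-memory deque stands in for files under the log staging directory.

```python
from collections import deque


class StagingSecondary:
    """Toy model of an HDR secondary with a log staging area."""

    def __init__(self):
        self.staged = deque()      # stands in for spooled log on disk
        self.apply_order = []      # records in the order apply started
        self.serialized_in_progress = False

    def receive(self, record, serialized=False):
        """Accept a record without ever blocking the sender."""
        if self.serialized_in_progress:
            self.staged.append((record, serialized))
        else:
            self._start_apply(record, serialized)

    def _start_apply(self, record, serialized):
        self.apply_order.append(record)
        # A serialized record (for example a checkpoint) holds up apply
        # until serialized_done() is called.
        self.serialized_in_progress = serialized

    def serialized_done(self):
        """The serialized record finished; drain the staged log in order."""
        self.serialized_in_progress = False
        while self.staged and not self.serialized_in_progress:
            self._start_apply(*self.staged.popleft())
```

Without the staging area, the `receive` call would have to block while the serialized record is processed, and that stall is what pushes back on the primary.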

Best practices for RSS
•  ONCONFIG parameter RSS_FLOW_CONTROL
§  This ONCONFIG parameter controls RPO (units=amount of data
rather than time) for the RSS node so it doesn’t fall too far behind
•  ONCONFIG parameter SMX_NUMPIPES
§  Take advantage of parallel data transmission using multiple
network pipes

Best practices for SDS
•  ONCONFIG parameter SDS_LOGCHECK
User scenario…
I’m using HDR SDS with no cluster manager. How do I avoid disk
corruption and split brain in a failover scenario?
§  SDS_LOGCHECK is used to watch the log space in the event of a
failover scenario. After waiting N seconds, if no log activity is
seen, the SDS secondary will assume takeover.
§  10 is a good starting value
•  ONCONFIG parameter SDS_FLOW_CONTROL
§  This ONCONFIG parameter controls RTO (units=amount of data
rather than time) for the SDS node so it doesn’t fall too far behind
•  No data will be lost because the disks are shared!
•  By not falling too far behind, it maintains RTO in the event of a
failover so there isn’t too much log to apply in order to catch up
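The SDS_LOGCHECK idea, watching the shared log for N seconds before taking over, can be sketched as a decision function over sampled log positions. This is an illustration of the rule as the slide states it, not the server's algorithm: `should_take_over`, the integer log positions, and the fixed sampling interval are all assumptions.

```python
import math


def should_take_over(log_positions, logcheck_seconds, sample_interval=1):
    """Decide whether an SDS secondary may assume the primary is gone.

    log_positions    -- samples of the primary's log position on the shared
                        disk, oldest first, one every `sample_interval`
                        seconds since contact with the primary was lost
    logcheck_seconds -- the waiting period (the role SDS_LOGCHECK plays)
    """
    if logcheck_seconds <= 0 or sample_interval <= 0:
        raise ValueError("intervals must be positive")
    # Samples needed to span the whole waiting period.
    needed = math.ceil(logcheck_seconds / sample_interval) + 1
    if len(log_positions) < needed:
        return False  # have not watched long enough yet
    window = log_positions[-needed:]
    # Any movement means the primary is still writing: do not take over.
    return all(pos == window[0] for pos in window)
```

A primary that is alive but unreachable over the network keeps advancing the log on the shared disk, so the check fails and two servers never write to the same disks at once.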

Connection Manager
•  Route client connection…
[Diagram: a client connection ("?") being routed to a Cluster or to Flexible Grid / ER]

Connection Manager
•  Failover arbitration
[Diagram: Cluster with a New Primary after failover]

Connection Manager
•  Act as a proxy
[Diagram: CM as Proxy. The server port is blocked and the CM-used port is allowed, for a client that cannot be recompiled.]

Connection Manager
•  Connection unit types
§  1) CLUSTER (Primary, HDR, RSS)
§  2) REPLSET (Enterprise Replication)
§  3) GRID
§  4) SERVERSET

Connection Manager – Best Practices
•  Avoid single point of failure
Client’s INFORMIXSQLHOSTS:
g_mySLA group - - c=1,i=123456
cm1_mySLA onsoctcp cm1Host cm1Port g=g_mySLA
For the meaning of c=1, see:
http://publib.boulder.ibm.com/infocenter/idshelp/v117/topic/com.ibm.admin.doc/ids_admin_0175.htm

Network paths offer perspective
[Diagram: two views of PRI and HDR compared. Along the path through the switch: "Is PRI down? Yes". Along the other path: "Is PRI down? No".]


Notices and Disclaimers
Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form
without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for
accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to
update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO
EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO,
LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted
according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as
illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other
results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or
services available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the
views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or
other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the
identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the
customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will
ensure that the customer is in compliance with any law.

Notices and Disclaimers (con’t)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly
available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to
interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
•  IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DB2® , DOORS®, Emptoris®, Enterprise
Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™,
IBM SmartCloud®, IBM Social Business®, IMS™, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®,
OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®,
pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are
trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names
might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark
information" at: www.ibm.com/legal/copytrade.shtml.
