HBase in the Enterprise Data Hub
It’s all about co-existing peacefully!
Swarnim Kulkarni
May 24th 2016
  
innovation

innovation (noun)
in·no·va·tion  ˌi-nə-ˈvā-shən
1. the introduction of something new
2. a new idea, method or device
  
Chart Search
Sepsis
Millennium +
Population Health
  
HBase sits at the core of a lot of these solutions!

Each solution built with a specific business problem in mind
  
CHART SEARCH    SEPSIS    MILLENNIUM +    POP HEALTH
  
Prevents assets from being shared
High investment
High maintenance costs
High barrier of entry for innovation
  
Enter The Data Hub
  
[Diagram: Bob, Tim, Innovations]
[Diagram labels: Source of data, Raw data, Data Hub Owner, Data Hub Consumers, Workers]
  
“An Enterprise Data Hub is a centralized location to store all data, for as long as needed, in its original fidelity; with flexibility to run variety of enterprise workloads including batch processing and interactive SQL - with robust security, auditing and management.”
  
Data Hub is…
Multi-tenant
Secure and Compliant
Active archive of all data
Low barrier of entry
Hadoop as a Service
  
So why provide HBase as a service?
  
HBase as a Service
Maximize cluster utilization
Minimum resource guarantee
Sharing
Low barrier of entry
[Diagram labels: Knowledge, Data, Data]
  
Requirements
Support multi-tenant environment for multiple users
Isolated deployment
Set quota per consumer
ACLs for project level administration
Appropriate security for user authentication
  
Architecture
Smaller HBase clusters part of the bigger “hub”
Scale as per need
Easier management
  
Resource Management
Making sure that you have sufficient resources before you start the job!
  
CGroups
•  Control Groups
•  Police and limit CPU, Disk I/O and Memory usage
•  Resource guarantee by static partitioning
•  Very useful in case of contention
•  Value for resource limits driven by individual tenants depending on the workload
•  Defaults to 70-30 usage for memory and CPU between (NodeManager + DataNode) and RegionServer
•  Higher memory on RS (at least 4 GB) helps avoid swapping and keeps GC happier
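
The 70-30 default above can be made concrete with a little arithmetic. The following is a minimal sketch in Python and is not part of the original deck: the function name `split_node_resources`, the integer-percent interface and the example node size are all hypothetical. Only the 70/30 ratio and the 4 GB RegionServer floor come from the slides.

```python
# Hypothetical helper: split one node's memory and CPU between
# (NodeManager + DataNode) and the RegionServer using the deck's 70-30 default.
MIN_RS_MEMORY_MB = 4 * 1024  # "at least 4 GB" recommendation from the slides


def split_node_resources(total_memory_mb, total_vcores, rs_percent=30):
    """Return the resource split for one node as a dict.

    rs_percent is the RegionServer share; the rest goes to NodeManager + DataNode.
    """
    if not 0 < rs_percent < 100:
        raise ValueError("rs_percent must be between 1 and 99")
    rs_memory_mb = total_memory_mb * rs_percent // 100
    rs_vcores = total_vcores * rs_percent // 100
    return {
        "regionserver_memory_mb": rs_memory_mb,
        "regionserver_vcores": rs_vcores,
        "yarn_hdfs_memory_mb": total_memory_mb - rs_memory_mb,
        "yarn_hdfs_vcores": total_vcores - rs_vcores,
        "rs_meets_4gb_floor": rs_memory_mb >= MIN_RS_MEMORY_MB,
    }


if __name__ == "__main__":
    # Assumed example node: 64 GB of RAM and 20 cores.
    print(split_node_resources(64 * 1024, 20))
```

On a small node the 30% share can fall below the 4 GB floor, which is one case where a tenant might want to move away from the default split.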
  
[Diagram: CPU and MEMORY partitions for SEPSIS, M+ and POP HEALTH]
  
What we learned
•  Recommend that tenants give at least 4 GB of memory to the RegionServer for smoother operation
•  Cooperative memory (e.g. via JVM heaps, max. container sizes) works better than cgroup limits
•  Disable swapping on HBase nodes
•  Disable HDFS load balancer
•  CMS GC performs way better
•  Limit number of containers running on nodes to maximize performance
  
Property                                                             Value
yarn.nodemanager.container-executor.class                            org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
yarn.nodemanager.linux-container-executor.group                      hadoop
yarn.nodemanager.linux-container-executor.resources-handler.class    org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
yarn.nodemanager.linux-container-executor.cgroups.hierarchy          /yarn
yarn.nodemanager.linux-container-executor.cgroups.mount              true

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html
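
These properties normally live in `yarn-site.xml`. As an illustration only, the sketch below uses Python's standard library to render the property/value pairs listed above in Hadoop's `<configuration>` layout. The names `CGROUP_PROPERTIES` and `render_yarn_site` are hypothetical, and the values are simply those shown on the slide, not recommendations.

```python
import xml.etree.ElementTree as ET

# Property/value pairs as listed on the slide.
CGROUP_PROPERTIES = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.linux-container-executor.group": "hadoop",
    "yarn.nodemanager.linux-container-executor.resources-handler.class":
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler",
    "yarn.nodemanager.linux-container-executor.cgroups.hierarchy": "/yarn",
    "yarn.nodemanager.linux-container-executor.cgroups.mount": "true",
}


def render_yarn_site(properties):
    """Render properties as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_yarn_site(CGROUP_PROPERTIES))
```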
  
Request queues
•  Prioritize the variety of workflows that need access to HBase
•  Important to meet SLAs for tenants
•  “FIFO” vs “Deadline” queue type
•  “Deadline” (default) proved to work pretty well for most of the cases
•  Cannot be set per tenant but has to be set per cluster – support coming soon
  
Quotas
•  Makes sure that no single tenant abuses the system
•  Peaceful coexistence
•  Promotes a pay-per-use model. Could buy a higher throttle limit by contributing more nodes
•  Usually we set the throttle on a per-namespace basis but could also set it on a per-table/user basis if needed
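
To illustrate what a per-namespace throttle does for co-existing tenants, here is a simplified sketch in Python. It is not HBase's implementation: the class name `NamespaceThrottle`, the fixed one-second window and the example limits are assumptions chosen for illustration.

```python
class NamespaceThrottle:
    """Toy fixed-window limiter: at most `limit` requests per namespace per second."""

    def __init__(self, limits_per_sec):
        self.limits = dict(limits_per_sec)  # namespace -> allowed requests/sec
        self.windows = {}                   # namespace -> (window_start_sec, count)

    def allow(self, namespace, now_sec):
        """Return True if a request from `namespace` at time `now_sec` is admitted."""
        limit = self.limits.get(namespace)
        if limit is None:
            return True  # no quota configured for this namespace
        window = int(now_sec)
        start, count = self.windows.get(namespace, (window, 0))
        if start != window:
            start, count = window, 0  # a new one-second window begins
        if count >= limit:
            self.windows[namespace] = (start, count)
            return False  # over quota: this tenant is throttled
        self.windows[namespace] = (start, count + 1)
        return True


if __name__ == "__main__":
    # Hypothetical tenants with small limits so the effect is easy to see.
    throttle = NamespaceThrottle({"tenant_a": 3, "tenant_b": 1})
    print([throttle.allow("tenant_a", 0.1) for _ in range(4)])
    print(throttle.allow("tenant_b", 0.2))
```

Each namespace has its own counter, so one tenant exhausting its limit does not affect requests from another.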
  
Tenant Isolation

Namespace
•  Logical grouping of HBase tables
•  Analogous to databases
•  Provides tenants with individual space to operate on
•  Tied to the AD group used when onboarding
•  Could also apply quotas (max regions/tables) per namespace (HBASE-8410) but not using that feature for now
  
Security

Authentication
•  Cluster secured by Kerberos
•  Disallow impersonation
•  Require kinit to first authenticate with the KDC before accessing the cluster
  
ACLs
•  Provides the authorization piece
•  Set per namespace and tied to the AD groups
•  Required properties
  
ACLs
<action> - Determines the type of action – grant or revoke
<entity> - Determines the entity to grant access to – user or group
<level>  - Determines the access level – RWXCA
<scope>  - Determines the scope for access – namespace, table, column family or cell
Must be a super user to run these commands (determined by hbase.superuser)
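
The entity, level and scope placeholders can be read as the arguments of an HBase shell `grant` statement. As a rough sketch, the hypothetical Python helper `build_grant` below only formats such a command string for a table or namespace scope, in the same shape as the onboarding commands later in the deck. It does not talk to HBase and does not cover column family or cell scopes.

```python
VALID_LEVELS = set("RWXCA")  # Read, Write, eXecute, Create, Admin


def build_grant(entity, level, scope):
    """Format an HBase shell grant statement as a string.

    entity: a user name, or a group prefixed with '@' (e.g. '@poph_users')
    level:  any combination of the letters R, W, X, C, A
    scope:  a table name, or a namespace prefixed with '@' (e.g. '@poph')
    """
    if not level or not set(level) <= VALID_LEVELS:
        raise ValueError("level must be a non-empty combination of RWXCA")
    return "grant '{}', '{}', '{}'".format(entity, level, scope)


if __name__ == "__main__":
    print(build_grant("@poph_users", "RWCX", "@poph"))
```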
  
Example

~33% of all US healthcare data
  
Millennium +
4 CDH 5.5.2 clusters (soon to be 5)
1245 tables
113,982 total regions
673 regionservers
895 TB of data! (unreplicated)
  
Population Health
900 tables
2 million requests/sec/day
780 regionservers
115,000 regions
700 TB of data! (unreplicated)
  
Lots of common data…
•  Reference Data
    •  Provider information
    •  Insurance information
    •  SNOMED / ICD9 / ICD10 data
•  Activity Data
    •  Visits
    •  Procedures
    •  Vitals
  
[Diagram: MILLENNIUM, with a Crawler, Collector and HBase Cluster each for M+ and Pop. H]
  
Makes sense to be hosted in the Data hub instead!
  
Onboarding – Capacity planning
M+
Population Health
  
Onboarding - Deployment
M+
Pop. Health
DATA HUB
Tenants could choose to modify the cgroup configuration depending on expected workloads or just stay with defaults
  
Onboarding - Isolation
•  Create AD groups
    •  poph_users, poph_admins
    •  mplus_users, mplus_admins
•  Create namespaces (as super user)
hbase(main):001:0> create_namespace 'mplus'
0 row(s) in 0.5650 seconds
hbase(main):001:0> create_namespace 'poph'
0 row(s) in 0.7262 seconds

Onboarding - Quotas
•  Set up quotas
hbase(main):001:0> set_quota TYPE => THROTTLE, NAMESPACE => 'mplus', LIMIT => '10000req/sec'
0 row(s) in 0.7255 seconds
hbase(main):001:0> set_quota TYPE => THROTTLE, NAMESPACE => 'poph', LIMIT => '1500req/sec'
0 row(s) in 0.5677 seconds
	
  
Onboarding - Security
•  Kerberos secured.
•  Require kinit for regular users to access the cluster
•  Deploy keytabs for service users
•  Set up ACLs
hbase(main):001:0> grant '@poph_users', 'RWCX', '@poph'
0 row(s) in 0.3250 seconds

hbase(main):001:0> grant '@poph_admins', 'RWACX', '@poph'
0 row(s) in 0.4332 seconds

Future

What we can do better
•  Namespace quota support (HBASE-8410)
    •  Limit tables/regions per namespace
•  Region Server Groups (HBASE-6721)
    •  Pin namespace/tables to subset of regionservers
•  Advanced namespace security (HBASE-9206)
    •  Higher flexibility to admins and tenants for namespace management
  
	
  
@CernerEng
http://engineering.cerner.com/
  

More Related Content

What's hot

Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestHBaseCon
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera FieldHBaseCon
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster Cloudera, Inc.
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa HBaseCon
 
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon
 
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase Cloudera, Inc.
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseCloudera, Inc.
 
HBase Backups
HBase BackupsHBase Backups
HBase BackupsHBaseCon
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon
 
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini Cloudera, Inc.
 
HBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHBaseCon
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseCloudera, Inc.
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path HBaseCon
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestCloudera, Inc.
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaCloudera, Inc.
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementDataWorks Summit/Hadoop Summit
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBaseCon
 
HBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon
 

What's hot (20)

Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera Field
 
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
HBaseCon 2013: Using Coprocessors to Index Columns in an Elasticsearch Cluster
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa
 
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
 
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBase
 
HBase Backups
HBase BackupsHBase Backups
HBase Backups
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
 
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
 
HBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and Spark
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBase
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at Pinterest
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop Management
 
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
 
HBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at Xiaomi
 

Viewers also liked

Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb HBaseCon
 
Apache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseApache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseHBaseCon
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the BasicsHBaseCon
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search HBaseCon
 
In Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPIn Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPHBaseCon
 
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...HBaseCon
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBaseHBaseCon
 
Real-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the CloudReal-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the CloudHBaseCon
 
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...Cloudera, Inc.
 
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web ArchivingHBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web ArchivingHBaseCon
 
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems Cloudera, Inc.
 
Digital Library Collection Management using HBase
Digital Library Collection Management using HBaseDigital Library Collection Management using HBase
Digital Library Collection Management using HBaseHBaseCon
 
HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBaseCon
 
Content Identification using HBase
Content Identification using HBaseContent Identification using HBase
Content Identification using HBaseHBaseCon
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiHBaseCon
 
HBaseCon 2015: HBase @ CyberAgent
HBaseCon 2015: HBase @ CyberAgentHBaseCon 2015: HBase @ CyberAgent
HBaseCon 2015: HBase @ CyberAgentHBaseCon
 
Keynote: Welcome Message/State of Apache HBase
Keynote: Welcome Message/State of Apache HBase Keynote: Welcome Message/State of Apache HBase
Keynote: Welcome Message/State of Apache HBase HBaseCon
 

Viewers also liked (20)

Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb
 
Apache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseApache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBase
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the Basics
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search
 
In Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPIn Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAP
 
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
 
Real-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the CloudReal-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the Cloud
 
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
 
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web ArchivingHBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
 
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
 
Digital Library Collection Management using HBase
Digital Library Collection Management using HBaseDigital Library Collection Management using HBase
Digital Library Collection Management using HBase
 
HBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial IndustryHBase at Bloomberg: High Availability Needs for the Financial Industry
HBase at Bloomberg: High Availability Needs for the Financial Industry
 
Content Identification using HBase
Content Identification using HBaseContent Identification using HBase
Content Identification using HBase
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at Xiaomi
 
HBaseCon 2015: HBase @ CyberAgent
HBaseCon 2015: HBase @ CyberAgentHBaseCon 2015: HBase @ CyberAgent
HBaseCon 2015: HBase @ CyberAgent
 
Keynote: Welcome Message/State of Apache HBase
Keynote: Welcome Message/State of Apache HBase Keynote: Welcome Message/State of Apache HBase
Keynote: Welcome Message/State of Apache HBase
 

Similar to Apache HBase in the Enterprise Data Hub at Cerner

Oracle RAC - Customer Proven Scalability
Oracle RAC - Customer Proven ScalabilityOracle RAC - Customer Proven Scalability
Oracle RAC - Customer Proven ScalabilityMarkus Michalewicz
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016StampedeCon
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Building a scalable analytics environment to support diverse workloads
Apache HBase in the Enterprise Data Hub at Cerner

  • 1. HBase in the Enterprise Data Hub: It’s all about co-existing peacefully! Swarnim Kulkarni, May 24th 2016
  • 3. innovation (noun) in·no·va·tion ˌi-nə-ˈvā-shən 1. the introduction of something new 2. a new idea, method or device
  • 4.
  • 9. HBase sits at the core of a lot of these solutions!
  • 10. Each solution is built with a specific business problem in mind
  • 11. CHART SEARCH, SEPSIS, MILLENNIUM +, POP HEALTH
  • 13. Prevents assets from being shared; high investment; high maintenance costs; high barrier of entry for innovation
  • 14. Enter the Data Hub
  • 15.
  • 18. = + +
  • 19.
  • 21. = + +
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30. Innovations; Source of data; Raw data
  • 31. Data Hub Owner; Data Hub Consumers; Workers
  • 32. “An Enterprise Data Hub is a centralized location to store all data, for as long as needed, in its original fidelity; with flexibility to run variety of enterprise workloads including batch processing and interactive SQL - with robust security, auditing and management.”
  • 37. Data Hub is: multi-tenant; secure and compliant; an active archive of all data; low barrier of entry; Hadoop as a Service
  • 38. So why provide HBase as a service?
  • 39. HBase as a Service: maximize cluster utilization; minimum resource guarantee; low barrier of entry; knowledge / data sharing
  • 41. Support multi-tenant environment for multiple users; isolated deployment; set quota per consumer; ACLs for project-level administration; appropriate security for user authentication
  • 43.
  • 44. Smaller HBase clusters part of the bigger “hub”; scale as per need; easier management
  • 46. Making sure that you have sufficient resources before you start the job!
  • 47. CGroups • Control Groups • Police and limit CPU, disk I/O and memory usage • Resource guarantee by static partitioning • Very useful in case of contention
  • 48. • Values for resource limits are driven by individual tenants depending on the workload • Defaults to a 70-30 split of memory and CPU between (NodeManager + DataNode) and RegionServer • Higher memory on the RS (at least 4 GB) helps avoid swapping and keeps GC happier [Figure: per-tenant CPU and memory partitions for Sepsis, M+ and Pop Health]
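The 70-30 default on the slide above can be worked through with a little arithmetic. The sketch below is only an illustration: the function name, its parameters and the reading that the 30% side belongs to the RegionServer are assumptions, not something the deck spells out.

```python
def split_node_memory(total_mem_mb, rs_share_pct=30, rs_min_mem_mb=4096):
    """Split node memory between (NodeManager + DataNode) and the RegionServer.

    Assumes the RegionServer receives the smaller share (30% by default) and
    checks the result against the 4 GB floor recommended in the talk.
    """
    if not 0 < rs_share_pct < 100:
        raise ValueError("rs_share_pct must be between 1 and 99")
    # Integer arithmetic keeps the split exact and reproducible.
    rs_mem_mb = total_mem_mb * rs_share_pct // 100
    return {
        "regionserver_mem_mb": rs_mem_mb,
        "yarn_hdfs_mem_mb": total_mem_mb - rs_mem_mb,
        "rs_meets_min": rs_mem_mb >= rs_min_mem_mb,
    }
```

Under these assumptions a 64 GB node leaves the RegionServer with roughly 19 GB, while an 8 GB node would leave it under the 4 GB floor, which is the case where a tenant would want to override the default.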
  • 49. What we learned • Recommend that tenants give at least 4 GB of memory to the RegionServer for smoother operation • Cooperative memory limits (e.g. via JVM heaps, max container sizes) work better than cgroup limits • Disable swapping on HBase nodes • Disable the HDFS load balancer • CMS GC performs way better • Limit the number of containers running on nodes to maximize performance
  • 50. Property = Value
      yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
      yarn.nodemanager.linux-container-executor.group = hadoop
      yarn.nodemanager.linux-container-executor.resources-handler.class = org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
      yarn.nodemanager.linux-container-executor.cgroups.hierarchy = /yarn
      yarn.nodemanager.linux-container-executor.cgroups.mount = true
      https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html
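The properties on this slide normally live in yarn-site.xml. As a rough sketch of how they could be rendered into that format, the helper below (`render_yarn_site`, a hypothetical name) writes each name/value pair as a `<property>` element using only the Python standard library. The values are copied from the slide and should be checked against the Hadoop version in use.

```python
import xml.etree.ElementTree as ET

# Property names and values as listed on the slide.
CGROUP_PROPERTIES = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.linux-container-executor.group": "hadoop",
    "yarn.nodemanager.linux-container-executor.resources-handler.class":
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler",
    "yarn.nodemanager.linux-container-executor.cgroups.hierarchy": "/yarn",
    "yarn.nodemanager.linux-container-executor.cgroups.mount": "true",
}


def render_yarn_site(properties):
    """Return a <configuration> document with one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_yarn_site(CGROUP_PROPERTIES))
```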
  • 51. Request queues • Prioritize the variety of workflows that need access to HBase • Important to meet SLAs for tenants • “FIFO” vs “Deadline” queue type • “Deadline” (default) proved to work pretty well for most cases • Cannot be set per tenant but has to be set per cluster; support coming soon
  • 52. Quotas • Makes sure that no single tenant abuses the system • Peaceful coexistence • Promotes a pay-per-use model: a tenant could buy a higher throttle limit by contributing more nodes • Usually we set the throttle on a per-namespace basis but could also set it on a per-table or per-user basis if needed
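To make the per-namespace throttle idea concrete, here is a minimal sketch of a fixed-window request counter keyed by namespace. It is not how HBase implements quotas internally; the class name `NamespaceThrottle` and its methods are hypothetical and exist only to show why one tenant hitting its limit leaves the others untouched.

```python
class NamespaceThrottle:
    """Fixed-window limiter: at most `limit` requests per namespace per second."""

    def __init__(self, limits_per_sec):
        self.limits = dict(limits_per_sec)
        self.windows = {}  # namespace -> (window start second, requests seen)

    def allow(self, namespace, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        limit = self.limits.get(namespace)
        if limit is None:
            return True  # no quota configured for this namespace
        second = int(now)
        start, count = self.windows.get(namespace, (second, 0))
        if start != second:
            start, count = second, 0  # a new one-second window begins
        if count >= limit:
            self.windows[namespace] = (start, count)
            return False  # this tenant is throttled until the next window
        self.windows[namespace] = (start, count + 1)
        return True
```

Each namespace keeps its own counter, so a noisy tenant only exhausts its own budget. Raising a tenant's entry in `limits_per_sec` mirrors the "buy a higher throttle limit" idea from the slide.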
  • 54. Namespace • Logical grouping of HBase tables • Analogous to databases • Provides tenants with individual space to operate on • Tied to the AD group used when onboarding • Could also apply quotas (max regions/tables) per namespace (HBASE-8410) but not using that feature for now
  • 56. Authentication • Cluster secured by Kerberos • Disallow impersonation • Require kinit to first authenticate with the KDC before accessing the cluster
  • 57. ACLs • Provides the authorization piece • Set per namespace and tied to the AD groups • Required properties
  • 58. ACLs • <action>: determines the type of action, grant or revoke • <entity>: determines the entity to grant access to, a user or a group • <level>: determines the access level, any of RWXCA • <scope>: determines the scope for access: namespace, table, column family or cell • Must be a super user to run these commands (determined by hbase.superuser)
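The four parts above map onto the arguments of the HBase shell `grant` and `revoke` commands. The sketch below composes such a command string from them. The helper name `acl_command` is hypothetical, and it assumes the shell conventions that groups and namespaces carry an `@` prefix and that `revoke` takes no permission string.

```python
VALID_LEVELS = set("RWXCA")


def acl_command(action, entity, level=None, *scope, is_group=False):
    """Build an HBase shell grant/revoke command string.

    scope is zero or more positional parts, e.g. ("@poph",) for a namespace
    or ("t1", "cf") for a table and column family.
    """
    if action not in ("grant", "revoke"):
        raise ValueError("action must be 'grant' or 'revoke'")
    principal = "@" + entity if is_group else entity
    args = [principal]
    if action == "grant":
        if not level or not set(level) <= VALID_LEVELS:
            raise ValueError("level must be a non-empty subset of RWXCA")
        args.append(level)
    elif level is not None:
        raise ValueError("revoke does not take an access level")
    args.extend(scope)
    return action + " " + ", ".join("'%s'" % a for a in args)


if __name__ == "__main__":
    print(acl_command("grant", "poph_users", "RWCX", "@poph", is_group=True))
```

The generated strings are meant to be pasted into the HBase shell by a super user. The function only formats text and does not talk to a cluster.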
  • 60. ~33% of all US healthcare data
  • 61. Millennium +: 4 CDH 5.5.2 clusters (soon to be 5); 1245 tables; 113,982 total regions; 673 regionservers; 895 TB of data! (unreplicated)
  • 62. Population Health: 900 tables; 2 million requests/sec/day; 780 regionservers; 115,000 regions; 700 TB of data! (unreplicated)
  • 63. Lots of common data…. • Reference data: provider information, insurance information, SNOMED / ICD9 / ICD10 data • Activity data: visits, procedures, vitals
  • 64. [Diagram: Millennium; Crawler, Collector and HBase Cluster shown twice, once for M+ and once for Pop. Health] Makes sense to be hosted in the Data Hub instead!
  • 65. Onboarding – Capacity planning (M+, Population Health)
  • 66. Onboarding - Deployment [Diagram: M+ and Pop. Health inside the Data Hub] Tenants could choose to modify the cgroup configuration depending on expected workloads or just stay with defaults
  • 67. Onboarding - Isolation • Create AD groups: poph_users, poph_admins; mplus_users, mplus_admins • Create namespaces (as super user)
      hbase(main):001:0> create_namespace 'mplus'
      0 row(s) in 0.5650 seconds
      hbase(main):001:0> create_namespace 'poph'
      0 row(s) in 0.7262 seconds
  • 68. Onboarding - Quotas • Set up quotas
      hbase(main):001:0> set_quota TYPE => THROTTLE, NAMESPACE => 'mplus', LIMIT => '10000req/sec'
      0 row(s) in 0.7255 seconds
      hbase(main):001:0> set_quota TYPE => THROTTLE, NAMESPACE => 'poph', LIMIT => '1500req/sec'
      0 row(s) in 0.5677 seconds
  • 69. Onboarding - Security • Kerberos secured • Require kinit for regular users to access the cluster • Deploy keytabs for service users • Set up ACLs
      hbase(main):001:0> grant '@poph_users', 'RWCX', '@poph'
      0 row(s) in 0.3250 seconds
      hbase(main):001:0> grant '@poph_admins', 'RWACX', '@poph'
      0 row(s) in 0.4332 seconds
  • 71. What we can do better • Namespace quota support (HBASE-8410): limit tables/regions per namespace • Region Server Groups (HBASE-6721): pin namespaces/tables to a subset of regionservers • Advanced namespace security (HBASE-9206): higher flexibility for admins and tenants in namespace management