Multi-Cluster Live Synchronization
with Kerberos Federated Hadoop
張雅芳 Mammi Chang
@ 2015 Taiwan HadoopCon
Who am I ?
• Mammi Chang 張雅芳
• Sr. Engineer, SPN, Trend Micro
• SPN Hadoop Cluster Administrator
• DevOps on Hadoop ecosystem and AWS
• 2014 HadoopCon Speaker
This is a story of a move …
[Diagram: a service spanning TMH6 in the original data center and TMH7 in the new data center, with data sync between them]
[Diagram: TMH6 and TMH7, each with Production and Staging environments in the original and new data centers, linked by data sync]
Data Synchronization
Data synchronization is the process of establishing consistency among data from a source to a target data storage and vice versa and the continuous harmonization of the data over time.
- From Wikipedia, “Data synchronization”
One-way file synchronization
 Updated files are copied from source to destination
Two-way file synchronization
 Updated files are copied in both directions
 e.g. Dropbox, SafeSync
Linux One-Way File Synchronization
$ cp fileA fileB
$ scp ./directory/my_file
mammi@198.167.0.3:/home/mammi/
$ rsync -avP /source/data /destination/
Hadoop One-Way File Synchronization
$ hadoop fs -cp /user/mammi/file1 /user/mammi/dir/
$ hadoop distcp hdfs://cluster1/file
hdfs://cluster2/file
#TrendInsight
Hadoop Data Synchronization
DistCp with the same
Hadoop version is trivial.
Hadoop - 2.6 Cluster1 Hadoop – 2.6 Cluster2
$ hadoop distcp hdfs://cluster1_nn:8020/test
hdfs://cluster2_nn:8020/test
Hadoop - 2.6 Cluster1 Hadoop – 2.6 Cluster2
$ hadoop distcp hdfs://cluster2_nn:8020/test
hdfs://cluster1_nn:8020/test
DistCp with a different
Hadoop version is a little bit tricky.
Oops …
[root@tw-spnhadoop1 hadooppet]# hadoop distcp hdfs://cluster1/test hdfs://krb-1.spn.lab.trendnet.org:8020/test
15/01/22 15:11:44 INFO tools.DistCp: srcPaths=[hdfs://cluster1/test]
15/01/22 15:11:44 INFO tools.DistCp: destPath=hdfs://krb-1.spn.lab.trendnet.org:8020/test
15/01/22 15:11:45 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 381 for hdfs on ha-hdfs:cluster1
15/01/22 15:11:45 INFO security.TokenCache: Got dt for hdfs://cluster1; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster1, Ident:
(HDFS_DELEGATION_TOKEN token 381 for hdfs)
15/01/22 15:11:46 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw-
spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null):
org.apache.hadoop.ipc.RPC$VersionMismatch
15/01/22 15:11:46 INFO security.UserGroupInformation: Initiating logout for hdfs/tw-spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM
15/01/22 15:11:46 INFO security.UserGroupInformation: Initiating re-login for hdfs/tw-spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM
15/01/22 15:11:50 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw-
spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null):
org.apache.hadoop.ipc.RPC$VersionMismatch
15/01/22 15:11:50 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds
before.
15/01/22 15:11:53 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw-
spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null):
org.apache.hadoop.ipc.RPC$VersionMismatch
Apache Hadoop – 2.0
Cluster1
Apache Hadoop – 2.6
Cluster2
$ hadoop distcp hftp://cluster1_nn:50070/test
hdfs://cluster2_nn:8020/test
HftpFileSystem is a read-only FileSystem, so DistCp
must be run on the destination cluster
TMH6 Cluster1 TMH7 Cluster2
$ hadoop distcp ????://TMH6_NN:????/test
hdfs://TMH7_NN:8020/test
CDH Based Apache Based
TMH6 Cluster1 TMH7 Cluster2
$ hadoop distcp hftp://TMH6_NN:50070/test
hdfs://TMH7_NN:8020/test
CDH Based Apache Based
Only supports data sync from TMH6 to TMH7
DistCp with different Hadoop
versions is a little bit tricky;
plus kerberos security, it is
annoying !!
TMH6 Cluster1 TMH7 Cluster2
$ hadoop distcp ????://TMH6_NN:XXXX/test
????://TMH7_NN:XXXX/test
DistCp Data Copy Matrix:
HDP1/HDP2 to HDP2
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.1/bk_system-admin-
guide/content/distcp-table.html
WebHDFS is an HTTP REST API that
supports the complete
FileSystem interface for HDFS
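As a sketch of what that REST interface looks like on the wire (the host, port, and path below are hypothetical examples; 50070 is the usual NameNode HTTP port):

```shell
# Sketch: a WebHDFS operation is a plain HTTP request against the
# NameNode's HTTP port. Host, port, and path here are illustrative.
NN_HOST="TMH7_NN"
NN_HTTP_PORT=50070
HDFS_PATH="/test"
URL="http://${NN_HOST}:${NN_HTTP_PORT}/webhdfs/v1${HDFS_PATH}?op=LISTSTATUS"
echo "$URL"
# On a live cluster this would be fetched with, e.g.:  curl -i "$URL"
```

Because it is plain HTTP, WebHDFS works across Hadoop versions, which is what makes it usable for cross-version DistCp.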
DistCp Data Copy Matrix:
TMH6/TMH7 to TMH6/TMH7
[Matrix: which of hdfs, hftp, and webhdfs work between secure and insecure TMH6 and TMH7 clusters]
TMH6 Cluster1 TMH7 Cluster2
$ hadoop distcp webhdfs://TMH6_NN:8020/test
webhdfs://TMH7_NN:8020/test
Hadoop Security with Kerberos
Kerberos is a computer
network authentication protocol which
works on the basis of 'tickets' to
allow nodes communicating over a non-
secure network to prove their identity to
one another in a secure manner
- From wikipedia “Kerberos_(Protocol)”
REALM – CLUSTER.DOMAIN.COM
Kerberos Negotiation
KDC (Key Distribution Center)
TGT (Ticket-Granting Ticket)
Client ↔ KDC ↔ Hadoop Servers
Msg1 : client login KDC
Msg2 : client TGT
Msg3 : Authenticator, TGT
Msg4 : client/server ticket
Msg5 : Authenticator, ticket
Msg6 : time auth
Kerberos Cross-Realm Authentication
REALM – CLUSTER1.DOMAIN.COM ↔ REALM – CLUSTER2.DOMAIN.COM
Client ↔ KDC (in each realm) ↔ Hadoop Servers
Msg1 : client login KDC
Msg2 : client TGT
Msg3 : Authenticator, TGT
Msg4 : client/server ticket
Msg5 : Authenticator, ticket
Msg6 : time auth
Kerberos Federation for Hadoop
Kerberos Setting
• Set a different REALM in
each cluster’s KDC
• Add both clusters’ kerberos
information to configs
• Add federated kerberos
principals to both KDC DBs
• Restart kerberos services
Hadoop Setting
• Add Hadoop configurations
• Make sure both clusters’
nodes can recognize each
other
• Restart necessary Hadoop
services
Multi-Cluster Kerberos Federation
For each cluster (Cluster1, Cluster2, … Cluster N):
•Set a different REALM in each cluster’s KDC
•Add all other clusters’ kerberos information to the configuration
•Add all federated kerberos principals to the KDC DB
•Add Hadoop configurations
•Make sure all cluster nodes can recognize each other
•Restart necessary services
DistCp with different Hadoop
versions plus kerberos federation in
cross-DC multi-clusters is not easy
at all.
Done!!
[Diagram: TMH6 and TMH7 Production and Staging environments across the original and new data centers, joined by a two-way kerberos federation link and data sync]
More Than Functionality …
Issues
• Computing resource
• Zero-downtime
• Schedule limitation
• Network bandwidth
Computing Resource
• Principle
– No impact on production services when
many DistCp jobs are running
• Strategy
– Run distcp on the Staging Env. instead of the
Production Env.
[Diagram: DistCp runs on Staging, syncing data from TMH6 Production to TMH7 Production over the two-way kerberos federation link]
$ hadoop distcp
webhdfs://TMH6_PROD_NN:8020/test
webhdfs://TMH7_PROD_NN:8020/test
Zero-downtime
• Principle
– No Production Env. downtime
• Strategy
– Change the KDC REALM in Staging only
– Rolling restart of services
Schedule Limitation
• Principle
– Provide the minimum dataset that fulfills production
services’ requirements
• Strategy
– Divide the dataset into cold data and hot data
– All necessary hot data must be ready before
services move to the new DC
Lessons Learned
Automation is vital !!!
• Automated CI tests on such complex and
repeated tasks
– save you a lot of time
– prevent plenty of human errors
Customization is
necessary
• Home-made distcp runner script with
error handling
• Permissions set according to real use cases
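A minimal sketch of what such a home-made runner might look like (the retry policy, names, and usage below are illustrative assumptions, not the actual script):

```shell
#!/bin/sh
# Illustrative distcp runner with basic error handling: retry the given
# command up to MAX_TRIES times, then give up with a non-zero exit code.
# Hypothetical usage:
#   run_with_retry hadoop distcp webhdfs://SRC_NN:8020/a webhdfs://DST_NN:8020/a
MAX_TRIES=3
tries=0
run_with_retry() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        if "$@"; then
            return 0        # command succeeded
        fi
        tries=$((tries + 1))
        echo "attempt $tries failed, retrying" >&2
    done
    echo "giving up after $MAX_TRIES attempts" >&2
    return 1
}
```

A wrapper like this keeps repeated sync jobs from silently half-completing when a transient Kerberos or network error kills one attempt.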
Just try it
• Research is important, but sometimes it
cannot completely solve your problem
Thank you
QUESTION?
Backups
Kerberos Cross-Realm Federation
• Set a different REALM in each cluster’s KDC
• Add both clusters’ kerberos information to configs
• Add federated kerberos principals to both KDC DBs
• Add Hadoop configurations
• Make sure both clusters’ nodes can recognize each
other
• Restart necessary services
Set a different REALM in each cluster’s KDC
Cluster1 krb5.conf
[realms]
CLUSTER1.DOMAIN.COM = {
kdc = cluster1_kdc_master:88
kdc = cluster1_kdc_slave:88
admin_server = cluster1_kdc_master:749
}
[domain_realm]
cluster1.domain.com = CLUSTER1.DOMAIN.COM
.cluster1.domain.com = CLUSTER1.DOMAIN.COM
Cluster2 krb5.conf
[realms]
CLUSTER2.DOMAIN.COM = {
kdc = cluster2_kdc_master:88
kdc = cluster2_kdc_slave:88
admin_server = cluster2_kdc_master:749
}
[domain_realm]
cluster2.domain.com = CLUSTER2.DOMAIN.COM
.cluster2.domain.com = CLUSTER2.DOMAIN.COM
Add both cluster’s kerberos information to krb5.conf
Both Cluster1 and Cluster2 krb5.conf
[realms]
CLUSTER1.DOMAIN.COM = {
kdc = cluster1_kdc_master:88
kdc = cluster1_kdc_slave:88
admin_server = cluster1_kdc_master:749
}
CLUSTER2.DOMAIN.COM = {
kdc = cluster2_kdc_master:88
kdc = cluster2_kdc_slave:88
admin_server = cluster2_kdc_master:749
}
[domain_realm]
cluster1.domain.com = CLUSTER1.DOMAIN.COM
.cluster1.domain.com = CLUSTER1.DOMAIN.COM
cluster2.domain.com = CLUSTER2.DOMAIN.COM
.cluster2.domain.com = CLUSTER2.DOMAIN.COM
Add federated kerberos principal to both KDC DB
$ kadmin.local: addprinc -e "rc4-hmac:normal des3-hmac-sha1:normal" krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM
WARNING: no policy specified for krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM; defaulting to no policy
Enter password for principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM": // 123456
Re-enter password for principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM": // 123456
Principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM" created.
$ kadmin.local: addprinc -e "rc4-hmac:normal des3-hmac-sha1:normal" krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM
WARNING: no policy specified for krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM; defaulting to no policy
Enter password for principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM": // 654321
Re-enter password for principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM": // 654321
Principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM" created.
Use the same password for the same principal in both KDCs to make sure the encryption key is the same
Add Hadoop Configuration
core-site.xml
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[1:$1@$0](^.*@CLUSTER.DOMAIN.COM$)s/^(.*)@CLUSTER.DOMAIN.COM$/$1/g
RULE:[2:$1@$0](^.*@CLUSTER.DOMAIN.COM$)s/^(.*)@CLUSTER.DOMAIN.COM$/$1/g
DEFAULT
</value>
</property>
hdfs-site.xml
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
Verify the rule setting
hadoop org.apache.hadoop.security.HadoopKerberosName mapred/machine.cluster.domain.com@CLUSTER.DOMAIN.COM
Name: mapred/machine.cluster.domain.com@CLUSTER.DOMAIN.COM to mapred
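Outside Hadoop, the effect of that RULE can be approximated with a plain regex; the sed expression below only mimics the mapping for illustration and is not how Hadoop itself evaluates auth_to_local:

```shell
# Approximate the auth_to_local mapping with sed (illustration only):
# strip an optional host component and the CLUSTER.DOMAIN.COM realm,
# leaving the short user name.
principal="mapred/machine.cluster.domain.com@CLUSTER.DOMAIN.COM"
short=$(printf '%s\n' "$principal" \
  | sed -E 's#^([^/@]+)(/[^@]+)?@CLUSTER\.DOMAIN\.COM$#\1#')
echo "$short"   # prints: mapred
```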
Make sure both cluster nodes can recognize each other
• /etc/hosts for both cluster1 and cluster2 nodes
10.1.145.1 machine1.cluster1.domain.com
10.1.145.2 machine2.cluster1.domain.com
10.1.145.3 machine3.cluster1.domain.com
10.1.144.1 machine1.cluster2.domain.com
10.1.144.2 machine2.cluster2.domain.com
10.1.144.3 machine3.cluster2.domain.com
Restart necessary services
• KDC server
– service krb5kdc restart
– service kadmin restart
• Namenodes, Datanodes
– service hadoop-hdfs-namenode restart
– service hadoop-hdfs-datanode restart

Weitere ähnliche Inhalte

Was ist angesagt?

Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data Meetup
Gwen (Chen) Shapira
 

Was ist angesagt? (20)

CBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFSCBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFS
 
Kafka Security
Kafka SecurityKafka Security
Kafka Security
 
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environmentLessons learned from scaling YARN to 40K machines in a multi tenancy environment
Lessons learned from scaling YARN to 40K machines in a multi tenancy environment
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
 
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheConTechnical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
ha_module5
ha_module5ha_module5
ha_module5
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Automation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataAutomation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure Data
 
YARN
YARNYARN
YARN
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Hortonworks.Cluster Config Guide
Hortonworks.Cluster Config GuideHortonworks.Cluster Config Guide
Hortonworks.Cluster Config Guide
 
Storage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduceStorage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduce
 
Visualizing Kafka Security
Visualizing Kafka SecurityVisualizing Kafka Security
Visualizing Kafka Security
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
 
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
 
Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data Meetup
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
 

Andere mochten auch

The Angela Apartments @ Malate, The Heritage
The Angela  Apartments @ Malate, The HeritageThe Angela  Apartments @ Malate, The Heritage
The Angela Apartments @ Malate, The Heritage
Evangeline Yia
 
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
Swiss Big Data User Group
 
Cluster Housing in a Cultural Park on the Coeur d'Alene Reservation
Cluster Housing in a Cultural Park on the Coeur d'Alene ReservationCluster Housing in a Cultural Park on the Coeur d'Alene Reservation
Cluster Housing in a Cultural Park on the Coeur d'Alene Reservation
Joshua Arnold
 

Andere mochten auch (20)

The Angela Apartments @ Malate, The Heritage
The Angela  Apartments @ Malate, The HeritageThe Angela  Apartments @ Malate, The Heritage
The Angela Apartments @ Malate, The Heritage
 
Soldagem 2009 2-emi
Soldagem 2009 2-emiSoldagem 2009 2-emi
Soldagem 2009 2-emi
 
Prdc2012
Prdc2012Prdc2012
Prdc2012
 
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
14.05.2012 Social Media Monitoring with Hadoop (Nils Kübler, MeMo News)
 
Cluster Housing in a Cultural Park on the Coeur d'Alene Reservation
Cluster Housing in a Cultural Park on the Coeur d'Alene ReservationCluster Housing in a Cultural Park on the Coeur d'Alene Reservation
Cluster Housing in a Cultural Park on the Coeur d'Alene Reservation
 
An Overview of Ambari
An Overview of AmbariAn Overview of Ambari
An Overview of Ambari
 
Cloudera security and enterprise license by Athemaster(繁中)
Cloudera security and enterprise license by Athemaster(繁中)Cloudera security and enterprise license by Athemaster(繁中)
Cloudera security and enterprise license by Athemaster(繁中)
 
Secure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosSecure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With Kerberos
 
DevOps Overview
DevOps OverviewDevOps Overview
DevOps Overview
 
Dev ops
Dev opsDev ops
Dev ops
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
 
Cluster Housing
Cluster HousingCluster Housing
Cluster Housing
 
Designing Puppet: Roles/Profiles Pattern
Designing Puppet: Roles/Profiles PatternDesigning Puppet: Roles/Profiles Pattern
Designing Puppet: Roles/Profiles Pattern
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Housing Presentation
Housing Presentation Housing Presentation
Housing Presentation
 
Aranya Low Cost Housing
Aranya Low Cost HousingAranya Low Cost Housing
Aranya Low Cost Housing
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Kudu Cloudera Meetup Paris
Kudu Cloudera Meetup ParisKudu Cloudera Meetup Paris
Kudu Cloudera Meetup Paris
 
DevOps and Continuous Delivery Reference Architectures (including Nexus and o...
DevOps and Continuous Delivery Reference Architectures (including Nexus and o...DevOps and Continuous Delivery Reference Architectures (including Nexus and o...
DevOps and Continuous Delivery Reference Architectures (including Nexus and o...
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 

Ähnlich wie HadoopCon2015 Multi-Cluster Live Synchronization with Kerberos Federated Hadoop

Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
DataWorks Summit
 
E2E PVS Technical Overview Stephane Thirion
E2E PVS Technical Overview Stephane ThirionE2E PVS Technical Overview Stephane Thirion
E2E PVS Technical Overview Stephane Thirion
sthirion
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
DataWorks Summit
 

Ähnlich wie HadoopCon2015 Multi-Cluster Live Synchronization with Kerberos Federated Hadoop (20)

To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
 
Deploying Big-Data-as-a-Service (BDaaS) in the Enterprise
Deploying Big-Data-as-a-Service (BDaaS) in the EnterpriseDeploying Big-Data-as-a-Service (BDaaS) in the Enterprise
Deploying Big-Data-as-a-Service (BDaaS) in the Enterprise
 
Denver SQL Saturday The Next Frontier
Denver SQL Saturday The Next FrontierDenver SQL Saturday The Next Frontier
Denver SQL Saturday The Next Frontier
 
Introduction to Kubernetes
Introduction to KubernetesIntroduction to Kubernetes
Introduction to Kubernetes
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
SQL Saturday San Diego
SQL Saturday San DiegoSQL Saturday San Diego
SQL Saturday San Diego
 
E2E PVS Technical Overview Stephane Thirion
E2E PVS Technical Overview Stephane ThirionE2E PVS Technical Overview Stephane Thirion
E2E PVS Technical Overview Stephane Thirion
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
 
Couchbase Singapore Meetup #2: Why Developing with Couchbase is easy !!
Couchbase Singapore Meetup #2:  Why Developing with Couchbase is easy !! Couchbase Singapore Meetup #2:  Why Developing with Couchbase is easy !!
Couchbase Singapore Meetup #2: Why Developing with Couchbase is easy !!
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
KubeCon EU 2016: Leveraging ephemeral namespaces in a CI/CD pipeline
KubeCon EU 2016: Leveraging ephemeral namespaces in a CI/CD pipelineKubeCon EU 2016: Leveraging ephemeral namespaces in a CI/CD pipeline
KubeCon EU 2016: Leveraging ephemeral namespaces in a CI/CD pipeline
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
 
Copy Data Management for the DBA
Copy Data Management for the DBACopy Data Management for the DBA
Copy Data Management for the DBA
 
Couchbase Chennai Meetup: Developing with Couchbase- made easy
Couchbase Chennai Meetup:  Developing with Couchbase- made easyCouchbase Chennai Meetup:  Developing with Couchbase- made easy
Couchbase Chennai Meetup: Developing with Couchbase- made easy
 
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
 
Monitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECSMonitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECS
 
TIAD 2016 : Application delivery in a container world
TIAD 2016 : Application delivery in a container worldTIAD 2016 : Application delivery in a container world
TIAD 2016 : Application delivery in a container world
 
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Kürzlich hochgeladen (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

HadoopCon2015 Multi-Cluster Live Synchronization with Kerberos Federated Hadoop

  • 1. Multi-Cluster Live Synchronization with Kerberos Federated Hadoop 張雅芳 Mammi Chang @ 2015 Taiwan HadoopCon
  • 2. Who am I ? • Mammi Chang 張雅芳 • Sr. Engineer, SPN, Trend Micro • SPN Hadoop Cluster Administrator • DevOps on Hadoop ecosystem and AWS • 2014 HadoopCon Speaker
  • 3.
  • 6. Data SynchronizationData synchronization is the process of establishing consistency among data from a source to a target data storage and vice versa and the continuous harmonization of the data over time. - From wikipedia “Data synchronization”
  • 7. One-way file synchronization  Updated files copied from source to destination Two-way file synchronization  Updated files are copied in both directories  Dropbox, SafeSync, etc
  • 8. Linux One-Way File Synchronization $ cp fileA fileB $ scp ./directory/my_file mammi@198.167.0.3:/home/mammi/ $ rsync -avP /source/data /destination/
  • 9. Hadoop One-Way File Synchronization $ hadoop fs -cp /user/mammi/file1 /user/mammi/dir/ $ hadoop distcp hdfs://cluster1/file hdfs://cluster2/file
  • 11. DistCp with the same Hadoop version is trivial.
  • 12. Hadoop - 2.6 Cluster1 Hadoop – 2.6 Cluster2 $ hadoop distcp hdfs://cluster1_nn:8020/test hdfs://cluster2_nn:8020/test
  • 13. Hadoop - 2.6 Cluster1 Hadoop – 2.6 Cluster2 $ hadoop distcp hdfs://cluster2_nn:8020/test hdfs://cluster1_nn:8020/test
  • 14. DistCp with the same Hadoop version is trivial. different a little bit tricky
  • 15. Oops … [root@tw-spnhadoop1 hadooppet]# hadoop distcp hdfs://cluster1/test hdfs://krb-1.spn.lab.trendnet.org:8020/test 15/01/22 15:11:44 INFO tools.DistCp: srcPaths=[hdfs://cluster1/test] 15/01/22 15:11:44 INFO tools.DistCp: destPath=hdfs://krb-1.spn.lab.trendnet.org:8020/test 15/01/22 15:11:45 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 381 for hdfs on ha-hdfs:cluster1 15/01/22 15:11:45 INFO security.TokenCache: Got dt for hdfs://cluster1; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster1, Ident: (HDFS_DELEGATION_TOKEN token 381 for hdfs) 15/01/22 15:11:46 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw- spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null): org.apache.hadoop.ipc.RPC$VersionMismatch 15/01/22 15:11:46 INFO security.UserGroupInformation: Initiating logout for hdfs/tw-spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM 15/01/22 15:11:46 INFO security.UserGroupInformation: Initiating re-login for hdfs/tw-spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM 15/01/22 15:11:50 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw- spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null): org.apache.hadoop.ipc.RPC$VersionMismatch 15/01/22 15:11:50 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before. 15/01/22 15:11:53 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs/tw- spnhadoop1.spn.tw.trendnet.org@ISPN.TRENDMICRO.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null): org.apache.hadoop.ipc.RPC$VersionMismatch
  • 16. Apache Hadoop – 2.0 Cluster1 Apache Hadoop – 2.6 Cluster2 $ hadoop distcp hftp://cluster1_nn:50070/test hdfs://cluster2_nn:8020/test HftpFileSystem is a read-only FileSystem, so DistCp must be run on the destination cluster
  • 17. TMH6 Cluster1 TMH7 Cluster2 $ hadoop distcp ????://TMH6_NN:????/test hdfs://TMH7_NN:8020/test CDH Based Apache Based
  • 18. TMH6 Cluster1 TMH7 Cluster2 $ hadoop distcp hftp://TMH6_NN:50070/test hdfs://TMH7_NN:8020/test CDH Based Apache Based Only supports data sync from TMH6 to TMH7
  • 19. DistCp with different Hadoop versions is a little bit tricky, and kerberos security on top makes it annoying !!
  • 20. TMH6 Cluster1 TMH7 Cluster2 $ hadoop distcp ????://TMH6_NN:XXXX/test ????://TMH7_NN:XXXX/test
  • 21. DistCp Data Copy Matrix: HDP1/HDP2 to HDP2 http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.1/bk_system-admin- guide/content/distcp-table.html WebHDFS is an HTTP REST API that supports the complete FileSystem interface for HDFS
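Because WebHDFS URLs follow one fixed shape (`/webhdfs/v1/<path>?op=<OP>` on the NameNode's HTTP port, 50070 by default in Hadoop 2), the DistCp source and destination URIs can be built mechanically. A minimal sketch; `webhdfs_url` is a hypothetical helper and the hostnames are placeholders:

```shell
#!/bin/bash
# Build a WebHDFS REST URL for a given NameNode, HDFS path, and operation.
# The /webhdfs/v1 prefix and op= query parameter follow the WebHDFS API;
# host and port are placeholders for your NameNode's HTTP address.
webhdfs_url() {
  nn_host="$1"; nn_port="$2"; hdfs_path="$3"; op="$4"
  echo "http://${nn_host}:${nn_port}/webhdfs/v1${hdfs_path}?op=${op}"
}

# A read-only operation such as LISTSTATUS is a safe connectivity check:
#   curl -s "$(webhdfs_url TMH6_NN 50070 /test LISTSTATUS)"
webhdfs_url TMH6_NN 50070 /test LISTSTATUS
# -> http://TMH6_NN:50070/webhdfs/v1/test?op=LISTSTATUS
```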
  • 22. DistCp Data Copy Matrix: TMH6/TMH7 to TMH6/TMH7 TMH6 TMH7 insecure secure hdfs hftp webhdfs 2
  • 23. TMH6 Cluster1 TMH7 Cluster2 $ hadoop distcp webhdfs://TMH6_NN:50070/test webhdfs://TMH7_NN:50070/test
  • 24. Hadoop Security with Kerberos Kerberos is a computer network authentication protocol which works on the basis of 'tickets' to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner - From wikipedia “Kerberos_(Protocol)”
  • 25. REALM – CLUSTER.DOMAIN.COM Kerberos Negotiation KDC (Key Distribution Center) TGT (Ticket-Granting Ticket) KDC Client Hadoop Servers Msg3 : Authenticator, TGT Msg4 : client/server ticket Msg1 : client login KDC Msg2 : client TGT Msg5 : Authenticator, ticket Msg6 : time auth
  • 26. REALM – CLUSTER2.DOMAIN.COM Kerberos Cross-Realm Authentication REALM – CLUSTER1.DOMAIN.COM KDC Client Hadoop Servers Msg3 : Authenticator, TGT Msg4 : client/server ticket Msg1 : client login KDC Msg2 : client TGT Msg5 : Authenticator, ticket Msg6 : time auth KDC
  • 27. Kerberos Federation for Hadoop Kerberos Setting • Set different REALM in each cluster’s KDC • Add both cluster’s kerberos information to configs • Add federated kerberos principal to both KDC DB • Restart kerberos services Hadoop Setting • Add Hadoop configurations • Make sure both cluster nodes can recognize each other • Restart necessary Hadoop services
  • 28. Multi-Cluster Kerberos Federation Cluster1 •Set different REALM in each cluster’s KDC •Add all other cluster’s kerberos information to configuration •Add all federated kerberos principal to KDC DB •Add Hadoop configurations •Make sure all cluster nodes can recognize each others •Restart necessary services Cluster2 •Set different REALM in each cluster’s KDC •Add all other cluster’s kerberos information to configuration •Add all federated kerberos principal to KDC DB •Add Hadoop configurations •Make sure all cluster nodes can recognize each others •Restart necessary services … •… Cluster N •Set different REALM in each cluster’s KDC •Add all other cluster’s kerberos information to configuration •Add all federated kerberos principal to KDC DB •Add Hadoop configurations •Make sure all cluster nodes can recognize each others •Restart necessary services
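Since the per-cluster steps above are identical for every cluster, the repeated kerberos configuration invites scripting. A sketch that generates the krb5.conf `[realms]` stanza for any number of clusters; the cluster names, `DOMAIN.COM` suffix, and host naming scheme are placeholders:

```shell
#!/bin/bash
# Generate the [realms] stanza of krb5.conf for N clusters instead of
# hand-editing each node. Naming conventions here are illustrative only.
realms_stanza() {
  echo "[realms]"
  for c in "$@"; do
    # Realm names are conventionally the upper-cased domain.
    upper=$(printf '%s' "$c" | tr '[:lower:]' '[:upper:]')
    cat <<EOF
${upper}.DOMAIN.COM = {
  kdc = ${c}_kdc_master:88
  kdc = ${c}_kdc_slave:88
  admin_server = ${c}_kdc_master:749
}
EOF
  done
}

realms_stanza cluster1 cluster2
```

The same idea extends to the `[domain_realm]` section and to kadmin principal creation, which is what makes the N-cluster federation checklist repeatable.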
  • 29. DistCp with different Hadoop versions plus kerberos federation in cross DC multi-clusters is not easy. Done!!
  • 30. DistCp with different Hadoop versions plus kerberos federation in cross DC multi-clusters is not easy at all.
  • 31. Production TMH6 TMH7 TMH6 TMH7 Original Data Center New Data Center Production Staging Staging Two-way kerberos federation link Data Sync Data Sync Data Sync
  • 33. Issues • Computing resource • Zero-downtime • Schedule limitation • Network bandwidth
  • 34. Computing Resource • Principle – Do not impact production services when many DistCp jobs are running • Strategy – Run distcp on Staging Env. instead of Production Env.
  • 35. Production TMH6 TMH7 TMH6 TMH7 Original Data Center New Data Center Production Staging Staging Two-way kerberos federation link $ hadoop distcp webhdfs://TMH6_PROD_NN:50070/test webhdfs://TMH7_PROD_NN:50070/test Data Sync Data flow
  • 36. Production TMH6 TMH7 TMH6 TMH7 Original Data Center New Data Center Production Staging Staging Two-way kerberos federation link $ hadoop distcp webhdfs://TMH6_PROD_NN:50070/test webhdfs://TMH7_PROD_NN:50070/test Data Sync Data flow
  • 37. Zero-downtime • Principle – Do not have Production Env. downtime • Strategy – Change KDC REALM in Staging only – Rolling restart services
  • 38. Schedule Limitation • Principle – Provide the minimum dataset that fulfills production service requirements • Strategy – Divide the dataset into cold data and hot data – All necessary hot data needs to be ready before services move to the new DC
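The hot/cold split can be sketched as a small filter over a dataset listing. The `date path` input format and the cutoff date are assumptions for illustration; in practice the listing could come from something like `hdfs dfs -ls -R`:

```shell
#!/bin/bash
# Classify dataset paths into "hot" (recent, must be synced before the
# service moves) and "cold" (older, can be synced later) by a date cutoff.
classify() {
  cutoff="$1"
  while read -r day path; do
    # ISO dates sort lexicographically, so string comparison is enough.
    if [ "$day" \> "$cutoff" ] || [ "$day" = "$cutoff" ]; then
      echo "hot  $path"
    else
      echo "cold $path"
    fi
  done
}

printf '%s\n' \
  '2015-01-10 /data/logs/old' \
  '2015-06-01 /data/logs/new' | classify 2015-03-01
# -> cold /data/logs/old
# -> hot  /data/logs/new
```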
  • 40. Automation is vital !!! • Automated CI tests on such complex and repeated tasks – save your time – prevent plenty of human errors
  • 41. Customization is necessary • Home-made distcp running script with error handling • Setting permissions per real-world use case
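The core of such a home-made distcp script is usually a retry loop with error handling around the copy. A minimal sketch; `run_with_retry` is a hypothetical helper and the wrapped distcp command in the comment is a placeholder:

```shell
#!/bin/bash
# Retry a command up to a maximum number of attempts before giving up,
# reporting which attempt succeeded. Any command can be wrapped.
run_with_retry() {
  max="$1"; shift
  attempt=1
  while true; do
    if "$@"; then
      echo "succeeded on attempt ${attempt}"
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep 1   # back off a little before retrying
  done
}

# Usage (placeholder command):
#   run_with_retry 3 hadoop distcp webhdfs://src_nn:50070/test webhdfs://dst_nn:50070/test
```

A real script would also log each failure and alert after the final attempt, since long distcp jobs across data centers fail for transient reasons (token expiry, network blips) far more often than short local jobs.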
  • 42. Just try it • Survey is important, but sometimes it cannot totally solve your problem
  • 45. Kerberos Cross-Realm Federation • Set different REALM in each cluster’s KDC • Add both cluster’s kerberos information to configs • Add federated kerberos principal to both KDC DB • Add Hadoop configurations • Make sure both cluster nodes can recognize each other • Restart necessary services
  • 46. Set different REALM in each cluster’s KDC
Cluster1 krb5.conf
[realms]
CLUSTER1.DOMAIN.COM = {
kdc = cluster1_kdc_master:88
kdc = cluster1_kdc_slave:88
admin_server = cluster1_kdc_master:749
}
[domain_realm]
cluster1.domain.com = CLUSTER1.DOMAIN.COM
.cluster1.domain.com = CLUSTER1.DOMAIN.COM
Cluster2 krb5.conf
[realms]
CLUSTER2.DOMAIN.COM = {
kdc = cluster2_kdc_master:88
kdc = cluster2_kdc_slave:88
admin_server = cluster2_kdc_master:749
}
[domain_realm]
cluster2.domain.com = CLUSTER2.DOMAIN.COM
.cluster2.domain.com = CLUSTER2.DOMAIN.COM
  • 47. Add both clusters’ kerberos information to krb5.conf
Both Cluster1 and Cluster2 krb5.conf
[realms]
CLUSTER1.DOMAIN.COM = {
kdc = cluster1_kdc_master:88
kdc = cluster1_kdc_slave:88
admin_server = cluster1_kdc_master:749
}
CLUSTER2.DOMAIN.COM = {
kdc = cluster2_kdc_master:88
kdc = cluster2_kdc_slave:88
admin_server = cluster2_kdc_master:749
}
[domain_realm]
cluster1.domain.com = CLUSTER1.DOMAIN.COM
.cluster1.domain.com = CLUSTER1.DOMAIN.COM
cluster2.domain.com = CLUSTER2.DOMAIN.COM
.cluster2.domain.com = CLUSTER2.DOMAIN.COM
  • 48. Add federated kerberos principal to both KDC DB
$ kadmin.local: addprinc -e "rc4-hmac:normal des3-hmac-sha1:normal" krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM
WARNING: no policy specified for krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM; defaulting to no policy
Enter password for principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM": // 123456
Re-enter password for principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM": // 123456
Principal "krbtgt/CLUSTER1.DOMAIN.COM@CLUSTER2.DOMAIN.COM" created.
$ kadmin.local: addprinc -e "rc4-hmac:normal des3-hmac-sha1:normal" krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM
WARNING: no policy specified for krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM; defaulting to no policy
Enter password for principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM": // 654321
Re-enter password for principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM": // 654321
Principal "krbtgt/CLUSTER2.DOMAIN.COM@CLUSTER1.DOMAIN.COM" created.
Use the same password for a given principal on both KDCs to make sure the encryption key is the same.
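The "Add Hadoop configurations" step of the checklist usually comes down to two properties: `hadoop.security.auth_to_local`, which maps the remote realm's principals to local user names, and `dfs.namenode.kerberos.principal.pattern`, which lets clients accept the other cluster's NameNode principals during cross-realm DistCp. A sketch; the property names are real Hadoop settings, while the realm names and RULE patterns are placeholders to adapt:

```xml
<!-- core-site.xml: map principals from the federated realm to local users -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@CLUSTER2\.DOMAIN\.COM)s/@.*//
    RULE:[2:$1@$0](.*@CLUSTER2\.DOMAIN\.COM)s/@.*//
    DEFAULT
  </value>
</property>

<!-- hdfs-site.xml: accept the remote realm's NameNode principal -->
<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>
```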
  • 50. Make sure both cluster nodes can recognize each other
• /etc/hosts for both cluster1 and cluster2 nodes
10.1.145.1 machine1.cluster1.domain.com
10.1.145.2 machine2.cluster1.domain.com
10.1.145.3 machine3.cluster1.domain.com
10.1.144.1 machine1.cluster2.domain.com
10.1.144.2 machine2.cluster2.domain.com
10.1.144.3 machine3.cluster2.domain.com
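A quick sanity check before restarting services is to confirm that every remote-cluster FQDN resolves on each node, whether via /etc/hosts or DNS. A sketch; `check_host` is a hypothetical helper and the hostnames in the commented loop are placeholders:

```shell
#!/bin/bash
# Report whether a hostname resolves on this node (via /etc/hosts or DNS).
check_host() {
  if getent hosts "$1" >/dev/null; then
    echo "ok   $1"
  else
    echo "FAIL $1"
  fi
}

# Run on every node of each cluster against the other cluster's hosts, e.g.:
#   for h in machine1.cluster2.domain.com machine2.cluster2.domain.com; do
#     check_host "$h"
#   done
check_host localhost
```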
  • 51. Restart necessary services • KDC server – service krb5kdc restart – service kadmin restart • Namenodes, Datanodes – service hadoop-hdfs-namenode restart – service hadoop-hdfs-datanode restart