Ambari 1.2
Using a Local Repository




© Hortonworks Inc. 2013    Page 1
Summary
• Terminology
   – Repository: a server location that hosts software packages for download and installation
   – Package Manager: a tool that fetches and installs software packages from a repository
         – On RHEL/CentOS, typically “yum”
         – On SLES, typically “zypper”
   – Local Repository: a repository that is hosted on a local network.


• When working with machines that have no Internet access, or only
  restricted network access, you will need to set up a Local
  Repository in order to obtain the Ambari and HDP software packages.




        Architecting the Future of Big Data
Repositories
• Ambari uses four Repositories…

• Ambari
   – Hosts the Ambari Server, Ambari Agent and monitoring software packages

• HDP
   – Hosts the Hadoop “Stack” packages (such as
     Hadoop, Pig, Hive, HCatalog, Oozie, HBase, ZooKeeper, Sqoop)

• HDP-UTILS
   – Utility packages for Ambari and HDP, including Ganglia, Nagios, snappy and rrd

• EPEL (Extra Packages for Enterprise Linux)
   – Set of additional packages for Enterprise Linux
   – Some components installed by Ambari (for example, Ganglia and Nagios) have
     dependencies that are available in EPEL
   – http://fedoraproject.org/wiki/EPEL



How Ambari Uses Repositories
[Diagram: Ambari Repo, HDP-UTILS Repo, HDP Repo, Ambari Server, and cluster hosts HostA through HostF, connected by the numbered steps below]

   1. The Ambari repo definition tells the Ambari Server which Ambari and HDP-UTILS repositories to use
   2. The Ambari Server tells all hosts in the cluster which Ambari and HDP-UTILS repositories to use
   3. The Ambari Server tells all hosts in the cluster which HDP (“Stack”) repository to use

Configuring Repositories
• By default, Ambari will use the publicly hosted software repositories
• To use Local Repositories, you must…


   – First, set up the local repositories
   – Second, configure Ambari to use the local repositories


Note: We will not cover setting up a local EPEL repository. Many Data Centers
already mirror or proxy the EPEL repository, so discuss with your Data Center
team whether EPEL is already available from within your firewall.




First, Set Up Local Repositories
   • Depending on your Internet access you have a few options
   • Highlighting the two most popular…

   • Option #1: Mirror the Packages to a Local Repository
         – Use this option if you have Internet access available but not from the cluster hosts
         1. Set up a host with Apache httpd
         2. Copy the entire contents of each repository locally using the “reposync” command
         3. Run “createrepo” to build the repo metadata
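As a rough sketch of steps 2 and 3 of Option #1, the snippet below only builds and prints the two commands for each repository; it does not run them. The repo IDs and the web root are hypothetical placeholders: reposync expects each ID to match a repo definition already configured on the mirror host, and it writes packages into a subdirectory named after the ID.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical values for illustration only: adjust to your own mirror host.
WEB_ROOT = PurePosixPath("/var/www/html")
REPO_IDS = ["ambari-1.x", "HDP-1.2.0", "HDP-UTILS-1.1.0.15"]


def mirror_commands(repo_id, web_root=WEB_ROOT):
    """Return the reposync and createrepo commands for one repository."""
    target = web_root / repo_id
    return [
        # reposync copies every package of the named repo into web_root/<repo_id>.
        ["reposync", "-r", repo_id, "-p", str(web_root)],
        # createrepo builds the repodata/ metadata that package managers read.
        ["createrepo", str(target)],
    ]


if __name__ == "__main__":
    for repo_id in REPO_IDS:
        for cmd in mirror_commands(repo_id):
            print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything on the mirror host.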


   • Option #2: Download Repository Tarball and Build Local Repository
         – Use this option if you have no Internet access available
         1. Set up a host with Apache httpd
         2. Download a tarball copy of each repository's entire contents
         3. Extract contents
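Step 3 of Option #2 could be sketched as follows, assuming the tarball has already been copied to the web server host. The function name and the paths in the usage note are hypothetical; the sketch also refuses archive members that would land outside the web root.

```python
import tarfile
from pathlib import Path


def extract_repo_tarball(tarball, web_root):
    """Unpack a repository tarball under web_root; return the top-level entries."""
    root = Path(web_root)
    root.mkdir(parents=True, exist_ok=True)
    root = root.resolve()
    with tarfile.open(tarball) as archive:
        # Reject members that would escape web_root (absolute paths or "..").
        for member in archive.getmembers():
            dest = (root / member.name).resolve()
            if dest != root and root not in dest.parents:
                raise ValueError(f"unsafe path in tarball: {member.name}")
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions offer a safer extraction filter.
            archive.extractall(root, filter="data")
        else:
            archive.extractall(root)
    return sorted(p.name for p in root.iterdir())
```

A call such as `extract_repo_tarball("/tmp/repo.tar.gz", "/var/www/html")` (both paths are placeholders) would leave the repository contents where Apache httpd can serve them.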



http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_reference/content/reference_chap4.html

Second, Configure Ambari
  • Set ambari.repo to point to your Ambari and HDP-UTILS Local Repositories




  • Install Ambari Server




  • Configure the Ambari “Stacks” to point to your HDP Local Repository
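To illustrate the first bullet, here is a minimal sketch that renders a yum-style repo file pointing at a local mirror. The mirror host name, the section names, and the URL layout are assumptions made for the example, not values taken from the Ambari documentation; GPG settings are left out and should follow your site policy.

```python
import configparser
import io

# Hypothetical mirror host and repo IDs: replace with your own values.
LOCAL_MIRROR = "http://repo.example.internal"
REPO_SECTIONS = ["ambari-1.x", "HDP-UTILS-1.1.0.15"]


def render_ambari_repo(mirror=LOCAL_MIRROR, sections=REPO_SECTIONS):
    """Return the text of a yum .repo file whose baseurls point at a local mirror."""
    repo = configparser.ConfigParser(interpolation=None)
    repo.optionxform = str  # keep option names exactly as written
    for section in sections:
        repo[section] = {
            "name": section,
            # Assumes the mirror serves each repo under a directory named after its ID.
            "baseurl": f"{mirror}/{section}",
            "enabled": "1",
        }
    buffer = io.StringIO()
    repo.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_ambari_repo())
```

On RHEL/CentOS the resulting text would normally be saved as ambari.repo under /etc/yum.repos.d/ before installing Ambari Server.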




http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_using_Ambari_book/content/ambari-chap1-6.html

And Don’t Forget: Install Wizard Option




Need to check this box




A Word About the JDK…
• Ambari (and Hadoop) need a JDK to run
• By default, Ambari automatically fetches the JDK from the Internet and installs it

• If you have limited Internet connectivity and need to use Local
  Repositories for software packages, you also need to plan how to
  provide the JDK without Internet access

   – Perform JDK install on each machine
         – And let Ambari know the location during install



   – Alternatively, download JDK & put in place on Ambari Server before cluster install
         – Ambari Server will distribute JDK to cluster hosts




Useful Resources
Resource                                         Link

Deploying HDP in Data Centers with Firewalls     http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_reference/content/reference_chap4.html

Configure Ambari to Use Local Repositories       http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_using_Ambari_book/content/ambari-chap1-6.html

Apache Ambari Project Page                       http://incubator.apache.org/ambari/

Apache Ambari User Group                         http://www.meetup.com/Apache-Ambari-User-Group/

Apache Ambari JIRA                               https://issues.apache.org/jira/browse/AMBARI




