Automate Hadoop Cluster Deployment in a Banking Ecosystem

Presentation at Continuous Lifecycle London, 04 May 2016

Published in: Software

  1. Automate Hadoop Cluster Deployment in a Banking Ecosystem: Lessons from Practice. Hellmar Becker, ING. Continuous Lifecycle London, May 4, 2016.
  2. Who am I?
  3. Automate Hadoop Cluster Deployment in a Banking Ecosystem
     • The Goal
     • Prelude: Hadoop Patterns in ING
     • Chapter 1: First Steps
     • Chapter 2: Standardizing
     • Chapter 3: The Cloud
     • Conclusion
     • Questions
  4. The Goal. IN WHICH we look at the challenges that a bank has to face in the 21st century, and how these translate into decisions made in the IT landscape.
  5. The world of ING – Data Driven Since 1881
     • Market leaders Benelux, Growth markets, Commercial Banking, Challengers
     • Customers: 33 million private, corporate and institutional customers
     • Countries: 41, in Europe, Asia, Australia, North and South America
     • Employees: 52,000
  7. We accelerate through the Concept of One
     • Provide standardized and easy-to-use global capabilities and services
     • Accelerate strategy and concentrate on business value
  8. Prelude: Hadoop Patterns in ING. IN WHICH we describe the journey of some interesting characters that set out to get Hadoop adopted within a large, venerable institution, and across the world.
  9. Data Lake and Advanced Analytics within ING
     • External and internal reporting for own or regulatory purposes
     • Integrate all data sources within the bank into one processing platform
        • Batch data streams
        • Live transactions
        • Model building for customer interaction
     • Better understand customer needs in an increasingly digital world
     • Data can help us offer tailored products and services
     • Empower data scientists and analysts to get the best results with advanced analytics tools and predictive models
     • Open source software where possible – Hadoop as a core component
  10. Usage Patterns: (1) File Storage, (2) Deep Data, (3) Analytical Hadoop, (4) Real Time Hadoop
  11. Patterns and their maintenance
      • Analytical Hadoop
         • Our first use case
         • Development and Production environments
         • P environment has Production level security but Test level SLA
      • FileStore and Deep Data
         • Completely automated
         • Full DTAP street (Development, Test, Acceptance, Production)
  12. Standard installation doesn’t cut it
      • Vendors give us tools to do a GUI based install
      • Maintain several clusters in parallel, DTAP!
      • Auditability!
      • Not for us, we need to do automated installs
      • APIs and scripting facilities do exist, but are often poorly tested and documented
  13. Organizational challenge
      • Layers – IaaS, PaaS, Application (we want IaaS not PaaS)
      • Organizational divide: Platform team vs. Infra team
      • Different privileges
      • Different tool choices
      • Trust and collaboration need to be actively built
      • Convince security audit teams!
  14. Chapter 1: First Steps. IN WHICH a first expedition ventures into uncharted territory, encounters strange monsters and reconsiders their equipment.
  15. Tooling part 1
      • First take by Exploration teams (Analytical Pattern)
      • Unusual Ops mode: No Production system (although we use production data)
      • Install everything with Ansible
      • YAML based, ssh based access
      • All text files. Easy to put in git and to document
      • The Power of Root
      • Great power and flexibility
      • Risk people and GUI users do not like it
      • You are on your own
      • We tried to learn from this!
  16. Chapter 2: Standardizing. IN WHICH a larger party sets out with better equipment, reaches the shores of a new world but finds that still, much is to be improved.
  17. Implementing the FileStore and DeepData patterns
      • Now we needed a Datalake integrated solution with full support
      • Also need a full DTAP street
      • Infra team has legacy tooling (proprietary tools) but limited flexibility.
         • Basically, we roll our specific configuration into homemade rpm packages.
      • Tool choice for application deployment: CA Lisa aka Nolio
         • GUI based
         • No version control (tagging added as an afterthought)
         • Slow and awkward to use
         • Dumbed down by organizational restrictions
      • Conclusion: Don’t go there!
  18. Implementing the FileStore and DeepData patterns
      • By then, we had a lot of structure to help us
      • Standardized build server with GitBlit, Artifactory, Jenkins
      • Agile Way of Working
      • Now deployment is a split approach
      • Infra parts use TEM (and Ambari blueprint) to deploy full Hadoop stack
      • On top of the stack we deploy our own applications with Nolio
      • Handovers CIO-Infra still hurt us
      • We do have: Deployment on a given system at the press of a button
      • We do have: Automatic propagation of Git changes into Artifactory via Jenkins
      • We do not (yet) have: Automated propagation D->T->A->P via Jenkins
  19. Chapter 3: The Cloud. IN WHICH our heroes learn from the cloud experience and from explorers around the world, and make deployment a safe experience for everyone.
  20. The Cloud
      • ING Private Cloud is essentially Datacenter v2.0
      • However, we get the chance to rethink our tooling
      • Puppet integrates nicely with RH Satellite and is used to provision PaaS solutions
      • Ansible is gaining ground in the internal discussion
      • External Ansible community: the Meetup has grown a lot over the last year. Now more than Puppet and Chef combined
      • ING has an initiative to come up with a standardized way to deploy packaged software, based on Ansible
  21. Conclusion
  22. Conclusion
      • Be aware: Deployment of mostly prepackaged software is different from developing your own software
      • Full automation might not be needed because we do not change as quickly as, e.g., a mobile app
      • Use tools that are scriptable
      • GUIs suck
      • Own your stack
  23. Questions
  24. Attributions
      • Crane Gears by Kevin Utting is licensed under CC BY 2.0
      • Hellmar in Nîmes / With Python in Mindanao, by the author
      • Domtoren in het oranje licht by helena_is_here is licensed under CC BY 2.0
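
Slide 18 mentions that the infra parts deploy the full Hadoop stack through an Ambari blueprint, and slide 12 stresses the need to maintain several clusters in parallel across DTAP. The following is a minimal sketch, in Python, of how a single blueprint might be reused for every environment. Everything specific in it is an assumption made for illustration and does not come from the talk: the host names, the `hadoop-base` blueprint name, the HDP 2.4 stack, and the component list (which is deliberately too short for a real cluster). The sketch only builds the JSON documents that Ambari's blueprint REST API accepts (`POST /api/v1/blueprints/<name>` and `POST /api/v1/clusters/<name>`); it does not contact a server.

```python
import json

# Hypothetical hosts per DTAP environment; none of these names come from the talk.
DTAP_HOSTS = {
    "development": {"master": ["d-master1.example.com"],
                    "worker": ["d-worker1.example.com"]},
    "test": {"master": ["t-master1.example.com"],
             "worker": ["t-worker1.example.com"]},
    "acceptance": {"master": ["a-master1.example.com"],
                   "worker": ["a-worker1.example.com", "a-worker2.example.com"]},
    "production": {"master": ["p-master1.example.com"],
                   "worker": ["p-worker1.example.com", "p-worker2.example.com"]},
}


def build_blueprint(name, stack_name, stack_version):
    """Return a blueprint: which components run in which host group.

    The component list is an illustrative assumption; a real blueprint
    needs more services (ZooKeeper, history servers, clients, ...).
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {"name": "master", "cardinality": "1",
             "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}]},
            {"name": "worker", "cardinality": "1+",
             "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}]},
        ],
    }


def build_cluster_template(blueprint_name, hosts_by_group):
    """Return a cluster creation template mapping concrete hosts to host groups."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": fqdn} for fqdn in fqdns]}
            for group, fqdns in sorted(hosts_by_group.items())
        ],
    }


def render_all(blueprint_name="hadoop-base"):
    """Render one shared blueprint plus one cluster template per environment."""
    # Stack name and version are assumptions; the talk does not name a distribution.
    documents = {"blueprint.json": build_blueprint(blueprint_name, "HDP", "2.4")}
    for env, hosts in DTAP_HOSTS.items():
        documents[f"cluster-{env}.json"] = build_cluster_template(blueprint_name, hosts)
    return {filename: json.dumps(doc, indent=2, sort_keys=True)
            for filename, doc in documents.items()}


if __name__ == "__main__":
    for filename, text in render_all().items():
        print(f"--- {filename}\n{text}")
```

Because the rendered documents are plain text, they can be kept in git next to the rest of the deployment code, which fits the auditability and "all text files" points made on slides 12 and 15. Only the host mapping differs between environments; the blueprint itself stays identical from D to P.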