Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)

Teradata has been hard at work on Presto, and we want to share with you what we've done so far and our roadmap going forward. From presto-admin, a tool for installing and administering Presto, to YARN/Ambari support, to fully certified JDBC and ODBC drivers, we are committed to making Presto the best, most enterprise-ready SQL-on-Hadoop solution out there.


  1. Hello, Enterprise! Meet Presto: Teradata Contributions to Presto. 10/6/15, Christina Wallin
  2. Who are we?
     • Teradata Center for Hadoop
     • Formerly Hadapt, the first SQL-on-Hadoop company (founded in 2010)
     • Offices in Boston and Warsaw, some remote employees in CA and CT
     • Around 20 employees working on Presto
     • Contributors to the open source project Presto!
  3. What is Presto?
     • 100% open source distributed ANSI SQL engine for Big Data
       – Modern architecture and implementation
       – Proven scalability and performance
       – Optimized for low latency, interactive querying
     • Cross platform query capability, not only SQL on Hadoop
     • Distributed under the Apache license, now supported by Teradata
     • Used by a community of well known, well respected technology companies
  4. Presto Architecture (diagram): Client; Coordinator (Parser/analyzer, Planner, Scheduler); Workers
  5. Presto Pluggable Data Sources (diagram). Capabilities: push-down to Hadoop system, push-down to other databases, push-down to NoSQL databases. Sources shown: Hadoop (HDFS), Kafka, other databases, NoSQL databases
  6. Teradata Contributions to Presto
     • Phases: Phase 1 "Implement" (June 8, 2015), Phase 2 "Integrate" (Q4 2015), Phase 3 "Proliferate" (2016)
     • Items: Installer, Documentation, Monitoring & Support Tools, Management Tool Integration, YARN Integration, ODBC Driver, JDBC Driver, BI Certification, Security, Connectors
     • Also shown: Commercial Support; Expanding ANSI SQL Coverage
  7. Easy Installation and Administration
  8. presto-admin: a tool to manage and install Presto
     • presto-admin can:
       – Install and uninstall Presto
       – Deploy configuration files across the cluster
       – Start/stop/restart Presto servers
       – Show you the status of the cluster
       – Add and remove connectors
       – Upgrade Presto to a different version
       – Collect logs, query info, system info for support
     • Additionally, we added an RPM for Presto
     • https://github.com/prestodb/presto-admin
  9. Hadoop Ecosystem Integration
  10. Ambari Integration (Work In Progress)
     • http://github.com/prestodb/ambari-presto-service
  11. (no transcribed text)
  12. (no transcribed text)
  13. (no transcribed text)
  14. (no transcribed text)
  15. Resource Allocation with YARN
     • Slated for Q4 2015
     • Allow Presto to run its services within YARN containers so that YARN knows about memory/CPU allocated to Presto.
       – Using Apache Slider
       – The allocation is fixed and upfront
       – Supports HDP and CDH Hadoop versions
     • YARN CGroups Integration
     • http://github.com/prestodb/presto-yarn
  16. Enterprise Database Features
  17. Unleashing Presto on Business Intelligence Tools
     • Improved ODBC driver: Q4 2015
     • Improved JDBC driver: Q1 2016
     • Certification against Tableau, Qlik, etc.: mid 2016
  18. Expanded ANSI SQL Support
     • Current contributions
       – DECIMAL type (WIP)
       – Additional smaller things: new functions, bug fixes, TIMESTAMP support for Parquet
     • Future goal: Support TPC-H and TPC-DS unmodified!
       – Additional subquery and join support
       – EXISTS, EXCEPT, INTERSECT
       – Various other odds and ends
  19. Demo of presto-admin!
  20. How can I give Presto a try?
     • https://github.com/facebook/presto
     • https://github.com/prestodb/presto-admin
     • Certified distro: http://www.teradata.com/presto/
       – VM images pre-installed with Presto can also be downloaded
  21. Questions?
  22. (no transcribed text)
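
Slide 18 names EXISTS, EXCEPT, and INTERSECT among the planned ANSI SQL additions. As a rough illustration of what those three constructs return, the sketch below runs them with Python's built-in sqlite3 module. SQLite stands in for Presto only so the example is runnable; the tables and rows are invented, and nothing here reflects how Presto implements or plans these operators.

```python
import sqlite3


def run_set_queries():
    """Run EXISTS, EXCEPT and INTERSECT on a tiny made-up data set."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables; customer 2 has no orders, order 13 has no customer.
    conn.executescript(
        """
        CREATE TABLE customer (custkey INTEGER, name TEXT);
        CREATE TABLE orders (orderkey INTEGER, custkey INTEGER);
        INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3), (13, 4);
        """
    )
    queries = {
        # Customers that have at least one order.
        "exists": """
            SELECT name FROM customer c
            WHERE EXISTS (SELECT 1 FROM orders o WHERE o.custkey = c.custkey)
            ORDER BY name
        """,
        # Customer keys that never appear in orders.
        "except": """
            SELECT custkey FROM customer
            EXCEPT
            SELECT custkey FROM orders
            ORDER BY custkey
        """,
        # Customer keys present in both tables.
        "intersect": """
            SELECT custkey FROM customer
            INTERSECT
            SELECT custkey FROM orders
            ORDER BY custkey
        """,
    }
    results = {
        name: [row[0] for row in conn.execute(sql)]
        for name, sql in queries.items()
    }
    conn.close()
    return results
```

Calling `run_set_queries()` returns one list per construct, which makes it easy to compare the three side by side.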

Editor's notes

  • Interactive performance of execution engine
    Code generation for operators (similar to Impala)
    Data is pipelined MPP-style
    Runs at Facebook scale
    *Capable of querying other non-HDFS data stores as well*
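
The coordinator/worker division from slide 4 can be pictured with a toy example. The sketch below is a conceptual model only: a "coordinator" function cuts the input into splits, hands each split to a "worker" thread for a partial count, and sums the partial results. All function names are hypothetical, and the sketch does not model Presto's pipelining, code generation, or actual scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def make_splits(rows, n_splits):
    """Hypothetical scheduling step: cut the input into roughly equal splits."""
    size = max(1, -(-len(rows) // n_splits))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def worker_partial_count(split):
    """Hypothetical worker task: count the rows in one split."""
    return len(split)


def coordinator_count(rows, n_workers=3):
    """Toy count(*): fan splits out to workers, then sum the partial counts."""
    splits = make_splits(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_partial_count, splits))
    return sum(partials), partials
```

The function returns both the final count and the per-split partial counts, so the fan-out and the final aggregation step are both visible.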

  • Presto-YARN integration objective: resource allocation meant for long-running services. In addition, when Presto and Hadoop share the same hardware (or cluster), YARN integration provides a unified way of accounting for and monitoring cluster utilization.
    The goal is to be transparent to YARN about how much RAM/CPU is allocated to Presto, so that YARN knows less is available to other YARN applications (MapReduce, Tez, etc.).
    The allocation is fixed and upfront: no dynamic changes to resource allocation are supported in Phase 2. Reconfiguring memory/CPU settings requires a restart.
    YARN has introduced support for CPU sharing via CGroups. Currently, CGroups is used only for limiting CPU usage, so we will leverage it to limit Presto's CPU usage. (Slider also has some CPU resource sharing support.)
    Apache Slider is a YARN application that deploys existing distributed applications on YARN, monitors them, and makes them larger or smaller as desired. Slider's objective is to make it easy for existing distributed applications, like Presto, to be deployed on a YARN cluster without changes and with little or no custom code.
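
The accounting idea in these notes (a fixed, upfront Presto allocation that leaves less for other YARN applications) comes down to simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, the figures in the usage note are made up, and it does not call YARN, Slider, or Presto.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    """Memory and CPU on one node (hypothetical container for this sketch)."""
    memory_mb: int
    vcores: int


def remaining_for_other_apps(node: NodeResources, presto: NodeResources) -> NodeResources:
    """Subtract Presto's fixed allocation from a node's total capacity.

    The allocation is treated as static: changing it would mean building a
    new NodeResources value, mirroring the restart the notes describe.
    """
    if presto.memory_mb > node.memory_mb or presto.vcores > node.vcores:
        raise ValueError("Presto allocation does not fit on this node")
    return NodeResources(
        memory_mb=node.memory_mb - presto.memory_mb,
        vcores=node.vcores - presto.vcores,
    )
```

For example, with an assumed 64 GB, 16-core node and a Presto container of 32 GB and 8 cores, `remaining_for_other_apps(NodeResources(65536, 16), NodeResources(32768, 8))` leaves 32768 MB and 8 vcores for MapReduce, Tez, and other applications.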

  • Untar presto-admin and install it
    ./presto-admin server install presto-server-rpm.rpm
    ./presto-admin server start
    Pause briefly so that the coordinator finds the workers
    ./presto-admin server status
    ./presto-admin configuration show
    cat hive.properties
    mv hive.properties /opt/prestoadmin/connectors
    ./presto-admin connector add hive
    ./presto-admin server restart
    Wait
    ./presto-admin server status
    Presto CLI: ./presto --server localhost:8080 --catalog hive --schema default
    SHOW TABLES;
    CREATE TABLE lineitem AS SELECT * FROM tpch.1gb.lineitem;
    SELECT count(*) FROM lineitem;
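
The demo steps above could be scripted. The sketch below is one possible way to do that, assuming the commands are run from the directory holding presto-admin and hive.properties; the command strings are taken from the notes, while the function name and the dry-run behaviour are hypothetical. By default it only returns the commands it would run. The pauses mentioned in the notes are not modelled, so a real run would need to wait after the start and restart steps.

```python
import subprocess

# Commands from the demo notes, in order, as argv lists.
DEMO_STEPS = [
    ["./presto-admin", "server", "install", "presto-server-rpm.rpm"],
    ["./presto-admin", "server", "start"],
    ["./presto-admin", "server", "status"],
    ["./presto-admin", "configuration", "show"],
    ["mv", "hive.properties", "/opt/prestoadmin/connectors"],
    ["./presto-admin", "connector", "add", "hive"],
    ["./presto-admin", "server", "restart"],
    ["./presto-admin", "server", "status"],
]


def run_demo(steps=DEMO_STEPS, dry_run=True):
    """Return the demo commands as strings; execute them only if dry_run is False."""
    rendered = []
    for argv in steps:
        rendered.append(" ".join(argv))
        if not dry_run:
            # check=True stops the demo at the first failing command.
            subprocess.run(argv, check=True)
    return rendered
```

Calling `run_demo()` with the defaults is safe on any machine, since nothing is executed; pass `dry_run=False` only on a host where presto-admin is actually set up.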
