Nicholas:hdfs what is new in hadoop 2
hdhappy001
BDTC 2013 Beijing China
1.
HDFS: What is New in Hadoop 2
Sze Tsz-Wo Nicholas 施子和
December 6, 2013
© Hortonworks Inc. 2013
2.
About Me
• 施子和 Sze Tsz-Wo Nicholas, Ph.D.
  – Software Engineer at Hortonworks
  – PMC Member at Apache Hadoop
  – One of the most active contributors/committers of HDFS
    • Started in 2007
  – Used Hadoop to compute Pi at the two-quadrillionth (2×10^15th) bit
    • It is the current World Record.
    • π = 3.141592654…
  – Received Ph.D. from the University of Maryland, College Park
    • Discovered a novel square root algorithm over finite fields.
Architecting the Future of Big Data
3.
Agenda
• New HDFS features in Hadoop-2
  – New appendable write-pipeline
  – Multiple Namenode Federation
  – Namenode HA
  – File System Snapshots
4.
We have been hard at work…
• Progress is being made in many areas
  – Scalability
  – Performance
  – Enterprise features
  – Ongoing operability improvements
  – Enhancements for other projects in the ecosystem
  – Expand Hadoop ecosystem to more platforms and use cases
• 2192 commits in Hadoop in the last year
  – Almost a million lines of changes
  – ~150 contributors
  – Many new contributors: ~80 with fewer than 3 patches
• 350K lines of changes in HDFS and Common
5.
Building on Rock-solid Foundation
• Original design choices: simple and robust
  – Single Namenode metadata server, all state in memory
  – Fault tolerance: multiple replicas, active monitoring
  – Storage: rely on the OS's file system, not raw disk
• Reliability
  – Over 7 9's of data reliability, less than 0.38 failures across 25 clusters
• Operability
  – Small teams can manage large clusters
    • An operator per 3K node cluster
  – Fast time to repair on node or disk failure
    • Minutes to an hour vs. RAID array repairs taking many long hours
• Scalable: proven by large scale deployments, not bits
  – > 100 PB storage, > 400 million files, > 4500 nodes in a single cluster
  – ~100 K nodes of HDFS in deployment and use
6.
New Appendable Write-Pipeline
7.
HDFS Write Pipeline
• The write pipeline has been improved dramatically
  – Better durability
  – Better visibility
  – Consistency guarantees
  – Appendable data
[Figure: Writer → DN1 → DN2 → DN3; "data" flows forward along the pipeline and "ack" flows back]
8.
New Feature in Write Pipeline
• Earlier versions of HDFS
  – Files were immutable
  – Write-once-read-many model
• New features in Hadoop 2
  – Files can be reopened for append
  – New primitives: hflush and hsync
  – Read consistency
  – Replace datanode on failure
9.
HDFS hflush and hsync
• Java flush (or C++ fflush)
  – Forces any buffered output bytes to be written out.
• HDFS hflush
  – Flushes data to all the datanodes in the write pipeline
  – Guarantees the data is visible for reading
  – The data may still be in the datanodes' memory
• HDFS hsync
  – hflush with local file system sync
  – May also update the file length in the Namenode
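The difference between the two primitives can be illustrated with a toy model. This is only a sketch: `ToyDatanode` and `ToyWriter` are hypothetical names, not Hadoop classes, and the model ignores packets, acks and failures. In the real Java API the calls are `hflush()` and `hsync()` on `FSDataOutputStream`.

```python
class ToyDatanode:
    """Hypothetical datanode: bytes held in memory vs. bytes synced to local disk."""

    def __init__(self):
        self.memory = b""
        self.disk = b""


class ToyWriter:
    """Hypothetical client with a local buffer in front of a write pipeline."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        self.buffer = b""  # client-side buffer, not yet visible to any reader

    def write(self, data):
        self.buffer += data

    def hflush(self):
        # Push buffered bytes to every datanode in the pipeline.
        # After this, readers can see the data, but it may only be in memory.
        for dn in self.pipeline:
            dn.memory += self.buffer
        self.buffer = b""

    def hsync(self):
        # hflush, then each datanode syncs what it holds to its local file system.
        self.hflush()
        for dn in self.pipeline:
            dn.disk = dn.memory


pipeline = [ToyDatanode() for _ in range(3)]
writer = ToyWriter(pipeline)
writer.write(b"abc")
writer.hflush()          # visible on all three datanodes, not yet on disk
writer.write(b"de")
writer.hsync()           # visible and synced to local disk
```

In this model a crash of all datanodes after `hflush` but before `hsync` would lose the data, which is the durability gap the slide points at.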
10.
Read Consistency
• A reader may read data during write
  – It can read from any datanode in the pipeline
  – and then fail over to any other datanode to read the same data
[Figure: Writer → DN1 → DN2 → DN3 with data and ack arrows; a Reader reads from datanodes in the pipeline]
11.
In the past …
• When a datanode fails, the pipeline is reconstructed with the remaining datanodes
[Figure: Writer, DN1, DN2, DN3 with data and ack arrows; one datanode has failed]
• When another datanode fails, only one datanode remains!
[Figure: Writer, DN1, DN2, DN3 with data and ack arrows; two datanodes have failed]
12.
Replace Datanode on Failure
• Add new datanodes to the pipeline
[Figure: Writer, DN1, DN2, DN3 with data and ack arrows; DN4 is added to the pipeline]
• User clients may choose the replacement policy
  – Performance vs. data reliability
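The before/after behaviour of pipeline recovery can be sketched as follows. The function name, the `spares` list and the `target` width are assumptions made for illustration; the real client chooses replacements through the Namenode, and in Hadoop 2 the policy is set with the client property `dfs.client.block.write.replace-datanode-on-failure.policy`.

```python
def rebuild_pipeline(pipeline, failed, spares, replace_on_failure=True, target=3):
    """Return the pipeline after `failed` drops out (toy model).

    With replace_on_failure=False this mimics the old behaviour: the
    pipeline simply shrinks. With True, spare datanodes are appended
    until the pipeline is back at `target` width or spares run out.
    """
    survivors = [dn for dn in pipeline if dn != failed]
    if replace_on_failure:
        available = list(spares)
        while len(survivors) < target and available:
            survivors.append(available.pop(0))
    return survivors


# Old behaviour: two failures leave a single replica being written.
old = rebuild_pipeline(["DN1", "DN2", "DN3"], "DN2", [], replace_on_failure=False)
old = rebuild_pipeline(old, "DN3", [], replace_on_failure=False)

# New behaviour: DN4 takes the place of the failed DN2.
new = rebuild_pipeline(["DN1", "DN2", "DN3"], "DN2", ["DN4"])
```

The trade-off named on the slide shows up here as the cost of the replacement step: adding a datanode keeps the replica count up but requires copying the block written so far to the new node.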
13.
Multiple Namenode Federation
14.
HDFS Architecture
[Figure: two layers.
 Namespace: the Namenode holds the namespace state (hierarchical namespace, file name → block IDs) and the block map (block ID → block locations), backed by persistent namespace metadata & journal.
 Block Storage: datanodes on JBOD store blocks (block ID → data), each holding a subset of blocks b1, b2, b3, b5, and send heartbeats & block reports to the Namenode.
 Caption: Horizontally scale IO and storage.]
15.
Single Namenode Limitations
• Namespace size is limited by the namenode memory size
  – 64GB memory can support ~100m files and blocks
  – Solution: Federation
• Single point of failure (SPOF)
  – The service is down when the namenode is down
  – Solution: HA
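As a back-of-envelope reading of the memory figure above, dividing the two numbers on the slide gives a rough per-object heap budget. This is only arithmetic on the slide's own figures, assuming "~100m files and blocks" means about 100 million namespace objects in total; it is not a measured per-inode or per-block cost.

```python
GIB = 1024 ** 3

heap_bytes = 64 * GIB          # namenode memory from the slide
objects = 100_000_000          # ~100m files and blocks from the slide

# Implied average heap budget per namespace object (about 687 bytes).
bytes_per_object = heap_bytes / objects


def estimate_heap_gib(n_objects):
    """Scale the slide's ratio linearly to another object count (assumption)."""
    return n_objects * bytes_per_object / GIB


# Under this linear assumption, 400 million objects would need about 256 GiB,
# which is why a single namenode's memory becomes the limiting factor.
heap_for_400m = estimate_heap_gib(400_000_000)
```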
16.
Federation Cluster
• Multiple namenodes and namespace volumes in a cluster
  – The namenodes/namespaces are independent
  – Scalability by adding more namenodes/namespaces
  – Isolation: separating applications into their own namespaces
  – Client side mount tables/ViewFS for integrated views
• Block Storage as generic storage service
  – Datanodes store blocks in block pools for all namespaces
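A client-side mount table can be pictured as a longest-prefix lookup from a path to the namenode that owns it. The sketch below is a simplification under stated assumptions: the mount points and `hdfs://nn…` names are hypothetical, and real ViewFS links map to a target path on the namenode rather than to the namenode alone.

```python
def resolve(mount_table, path):
    """Return the namenode responsible for `path` using longest-prefix match."""
    best = None
    for prefix in mount_table:
        # Match whole path components only, so "/data" does not match "/database".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(path)
    return mount_table[best]


# Hypothetical mount table: three independent namespaces behind one client view.
MOUNTS = {
    "/user": "hdfs://nn1",
    "/data": "hdfs://nn2",
    "/data/logs": "hdfs://nn3",
}

owner = resolve(MOUNTS, "/data/logs/2013/part-0")  # "hdfs://nn3"
```

Because the table lives on the client, the namenodes stay independent of each other, which matches the first bullet above.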
17.
Multiple Namenode Federation
[Figure: two layers.
 Namespace: namenodes NN-1 … NN-k … NN-n serve namespaces NS1 … NS k … Foreign NS n.
 Block Storage: block pools Pool 1 … Pool k … Pool n are stored across datanodes DN 1, DN 2, … DN m as common storage.]
18.
Namenode HA
19.
High Availability – No SPOF
• Support standby namenode and failover
  – Planned downtime
  – Unplanned downtime
• Release 1.1
  – Cold standby
    • Requires reconstructing in-memory data structures during failover
  – Uses NFS as shared storage
  – Standard HA frameworks as failover controller
    • Linux HA and VMware vSphere
  – Suitable for small clusters up to 500 nodes
20.
Hadoop Full Stack HA
[Figure labels: Slave Nodes of Hadoop Cluster (running jobs); Apps Running Outside; Failover; JT into Safemode; NN Server, JT Server, NN Server; HA Cluster for Master Daemons]
21.
High Availability – Release 2.0
• Support for Hot Standby
  – The standby namenode maintains in-memory data structures
• Supports manual and automatic failover
• Automatic failover with Failover Controller
  – Active NN election and failure detection using ZooKeeper
  – Periodic NN health check
  – Failover on NN failure
• Removed shared storage dependency
  – Quorum Journal Manager
    • 3 to 5 Journal Nodes for storing editlog
    • Edit must be written to quorum number of Journal Nodes
• Replay cache for correctness & transparent failovers
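The quorum rule for edits can be sketched in a few lines. This is a toy model, not the Quorum Journal Manager protocol: journal nodes are plain dictionaries with hypothetical `up` and `log` fields, and epochs, recovery and fencing are left out.

```python
def quorum_size(n_journal_nodes):
    """Majority of the journal nodes, e.g. 2 of 3 or 3 of 5."""
    return n_journal_nodes // 2 + 1


def write_edit(journal_nodes, txid, edit):
    """Send one edit to every reachable journal node.

    Returns True only if a quorum (majority) stored the edit.
    """
    acks = 0
    for jn in journal_nodes:
        if jn["up"]:
            jn["log"].append((txid, edit))
            acks += 1
    return acks >= quorum_size(len(journal_nodes))


# Three journal nodes, one of them down: the edit still commits.
jns = [
    {"up": True, "log": []},
    {"up": True, "log": []},
    {"up": False, "log": []},
]
committed = write_edit(jns, 1, "mkdir /tmp/a")
```

With 3 journal nodes the namenode tolerates one journal node failure, and with 5 it tolerates two, which is consistent with the "3 to 5 Journal Nodes" sizing on the slide.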
22.
Namenode HA in Hadoop 2
[Figure: a ZooKeeper (ZK) ensemble receives heartbeats from two FailoverControllers, one active and one standby. Each FailoverController monitors the health of its NN, OS and hardware and sends commands to it. The active and standby NNs share state through a quorum of JournalNodes (JN). Datanodes send block reports to both the active and the standby NN. DN fencing: datanodes only obey commands from the active NN.]
• Namenode HA has no external dependency
23.
File System Snapshots
24.
Before Snapshots…
• Deleted files cannot be restored
  – Trash is buggy and not well understood
  – Trash works only for CLI based deletion
• No point-in-time recovery
• No periodic snapshots to restore from
  – No admin/user managed snapshots
25.
HDFS Snapshot
• Point-in-time image of the file system
• Read-only
• Copy-on-write
26.
Use Cases
• Protection against user errors
• Backup
• Experimental/Test setups
27.
Example: Periodic Snapshots for Backup
• A typical snapshot policy: take a snapshot
  – every 15 mins and keep it for 24 hrs
  – every 1 hr, keep 2 days
  – every 1 day, keep 14 days
  – every 1 week, keep 3 months
  – every 1 month, keep 1 year
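To get a feel for how many snapshots such a policy keeps in steady state, each tier can be counted as retention divided by interval. This is a rough sketch with two assumptions that are not on the slide: "3 months" is taken as 90 days and "1 month" as 30 days, and tiers are counted separately even where their snapshot times coincide.

```python
from datetime import timedelta

# (interval between snapshots, how long each snapshot is kept)
POLICY = [
    (timedelta(minutes=15), timedelta(hours=24)),
    (timedelta(hours=1), timedelta(days=2)),
    (timedelta(days=1), timedelta(days=14)),
    (timedelta(weeks=1), timedelta(days=90)),   # "3 months" approximated as 90 days
    (timedelta(days=30), timedelta(days=365)),  # "1 month" approximated as 30 days
]


def retained(interval, keep):
    """Number of snapshots of one tier alive at any time in steady state."""
    return keep // interval


counts = [retained(interval, keep) for interval, keep in POLICY]
total = sum(counts)  # 96 + 48 + 14 + 12 + 12 = 182
```

Under these assumptions the policy holds fewer than two hundred snapshots per directory at any time.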
28.
Design Goal: Efficiency
• Storage efficiency
  – No block data copying
  – No metadata copying for unmodified files
• Processing efficiency
  – No additional costs for processing current data
• Cheap snapshot creation
  – Must be fast and lightweight
  – Must support a very large number of snapshots
29.
Design Goal: Features
• Read-only
  – Files and directories in a snapshot are immutable
  – Nothing can be added to or removed from directories
• Hierarchical snapshots
  – Snapshots of the entire namespace
  – Snapshots of subtrees
• User operation
  – Users can take snapshots for their data
  – Admins manage where users can take snapshots
30.
HDFS-2802: Snapshot Development
• Available in Hadoop 2 GA release (v2.2.0)
• Community-driven
  – Special thanks to those who provided valuable discussion and feedback on the feature requirements and the open questions
• 136 subtask JIRAs
  – Mainly contributed by Hortonworks
• The merge patch has about 28k lines
• ~8 months of development
31.
Namenode Only Operation
• No complicated distributed mechanism
• Snapshot metadata is stored in the Namenode
• Datanodes have no knowledge of snapshots
• The block management layer also does not know about snapshots
32.
Fast Snapshot Creation
• Snapshot creation: O(1)
  – It just adds a record to an inode
[Figure: directory tree with root /, directories d1 and d2, files f1, f2, f3, and snapshot S1]
33.
Low Memory Overhead
• NameNode memory usage: O(M)
  – M is the number of modified files/directories
  – Additional memory is used only when modifications are made relative to a snapshot
[Figure: directory tree with root /, directories d1 and d2, files f1, f2, f3, f4, and snapshot S1. Modifications: 1. rm f3, 2. add f4]
34.
File Blocks Sharing
• Blocks in datanodes are not copied
  – The snapshot files record the block list and the file size
  – No data copying
[Figure labels: root /, directory d, file f, snapshot copies f' (S1) and f'' (S2), blocks blk0, blk1, blk2, blk3]
35.
Persistent Data Structures
• A well-known data structure for "time travel"
  – Supports querying previous versions of the data
• Access slowdown
  – The additional time required for the data structure
• In traditional persistent data structures
  – There is a slowdown on accessing both current data and snapshot data
• In our implementation
  – No slowdown on accessing current data
  – Slowdown happens only on accessing snapshot data
36.
No Slow Down on Accessing Current Data
• The current data can be accessed directly
  – Modifications are recorded in reverse chronological order
Snapshot data = Current data - Modifications
[Diagram: current namespace tree (/, d1, d2, f1, f2, f4) beside the S1 view of d2 (f2, f3); modifications after S1: 1. rm f3, 2. add f4]
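The relation "Snapshot data = Current data - Modifications" can be sketched with a toy directory that keeps its current children directly and rebuilds a snapshot view by undoing recorded changes, newest first. The class and method names are hypothetical, and real HDFS diff lists carry more detail than this model.

```python
class SnapshottableDir:
    """Toy directory: current children plus one diff per snapshot (hypothetical)."""

    def __init__(self, children):
        self.current = set(children)
        self.diffs = []  # oldest first: (snapshot_name, created, deleted)

    def create_snapshot(self, name):
        self.diffs.append((name, set(), set()))

    def add(self, child):
        if child in self.current:
            raise ValueError(f"{child} already exists")
        self.current.add(child)
        if self.diffs:
            self.diffs[-1][1].add(child)

    def remove(self, child):
        self.current.remove(child)
        if self.diffs:
            _, created, deleted = self.diffs[-1]
            if child in created:
                created.discard(child)  # never visible in any snapshot
            else:
                deleted.add(child)

    def list_current(self):
        # Current data is read directly; no diff is consulted.
        return sorted(self.current)

    def list_snapshot(self, name):
        # Start from the current state and undo diffs from newest to oldest.
        view = set(self.current)
        for snap, created, deleted in reversed(self.diffs):
            view = (view - created) | deleted
            if snap == name:
                return sorted(view)
        raise KeyError(name)


d2 = SnapshottableDir(["f2", "f3"])
d2.create_snapshot("S1")
d2.remove("f3")  # 1. rm f3
d2.add("f4")     # 2. add f4
print(d2.list_current())
print(d2.list_snapshot("S1"))
```

Only `list_snapshot` walks the diffs, which matches the slide's point that the slowdown is confined to snapshot reads.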
37.
Easy Management
• Snapshots can be taken on any directory
  – Set the directory to be snapshottable
• Supports 65,536 simultaneous snapshots per snapshottable directory
• No limit on the number of snapshottable directories
  – Nested snapshottable directories are currently NOT allowed
38.
Admin Ops
• Allow snapshots on a directory
  – hdfs dfsadmin -allowSnapshot <path>
• Reset a snapshottable directory
  – hdfs dfsadmin -disallowSnapshot <path>
• Example [terminal screenshot not preserved in the transcript]
39.
User Ops
• Create/delete/rename snapshots
  – hdfs dfs -createSnapshot <path> [<snapshotName>]
  – hdfs dfs -deleteSnapshot <path> <snapshotName>
  – hdfs dfs -renameSnapshot <path> <oldName> <newName>
• Get snapshottable directory listing
  – hdfs lsSnapshottableDir
• Get snapshots difference report
  – hdfs snapshotDiff <path> <from> <to>
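A typical sequence of the admin and user commands above can be sketched as argument lists. The helper function names are hypothetical, as are the example path `/test` and snapshot names `s3` and `s4`. The sketch only builds and prints the command lines; passing them to `subprocess.run` is assumed to happen on a host with the `hdfs` client installed.

```python
import shlex


def allow_snapshot(path):
    """Admin step: make a directory snapshottable."""
    return ["hdfs", "dfsadmin", "-allowSnapshot", path]


def create_snapshot(path, name=None):
    """User step: take a snapshot; the snapshot name is optional."""
    cmd = ["hdfs", "dfs", "-createSnapshot", path]
    if name is not None:
        cmd.append(name)
    return cmd


def snapshot_diff(path, from_snapshot, to_snapshot):
    """User step: request the difference report between two snapshots."""
    return ["hdfs", "snapshotDiff", path, from_snapshot, to_snapshot]


workflow = [
    allow_snapshot("/test"),
    create_snapshot("/test", "s3"),
    create_snapshot("/test", "s4"),
    snapshot_diff("/test", "s3", "s4"),
]
for cmd in workflow:
    # Printed only; run with subprocess.run(cmd, check=True) where hdfs exists.
    print(shlex.join(cmd))
```

Keeping each command as a list rather than a shell string avoids quoting problems when paths contain spaces.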
40.
Use snapshot paths in CLI
• All regular commands and APIs can be used against a snapshot path
  – /<snapshottableDir>/.snapshot/<snapshotName>/foo/bar
• List all the files in a snapshot
  – ls /test/.snapshot/s4
• List all the snapshots under that path
  – ls <path>/.snapshot
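The snapshot path layout above is easy to build programmatically. The helpers below are a hypothetical sketch (the function names are not part of any Hadoop API) and rely only on the `/<snapshottableDir>/.snapshot/<snapshotName>/...` convention shown on the slide.

```python
import posixpath


def snapshots_dir(snapshottable_dir):
    """Path that lists all snapshots of a snapshottable directory."""
    return posixpath.join(snapshottable_dir, ".snapshot")


def snapshot_path(snapshottable_dir, snapshot_name, relative=""):
    """Path of a file or directory as it existed in the named snapshot."""
    if not snapshot_name or "/" in snapshot_name:
        raise ValueError("snapshot name must be a single path component")
    parts = [snapshots_dir(snapshottable_dir), snapshot_name]
    relative = relative.strip("/")
    if relative:
        parts.append(relative)
    return posixpath.join(*parts)


print(snapshots_dir("/test"))
print(snapshot_path("/test", "s4"))
print(snapshot_path("/test", "s4", "foo/bar"))
```

The resulting strings can be passed to any regular command or API that accepts an HDFS path, since a snapshot path is read like an ordinary one.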
41.
Test Snapshot Functionalities
• ~100 unit tests
• ~1.4 million generated system tests
  – Covering most combinations of (snapshot + rename) operations
• Automated long-running tests for months
42.
NFS Support and Other Features
43.
NFS Support
• NFS Gateway provides NFS access to HDFS
  – File browsing, data download/upload, data streaming
  – No client-side library
  – Better alternative to Hadoop + FUSE based solutions
• Better consistency guarantees
• Supports NFSv3
• Stateless Gateway
  – Simpler design, easy to handle failures
• Future work
  – High Availability for NFS Gateway
  – NFSv4 support?
44.
Other Features
• Protobuf, wire compatibility
  – Post 2.0 GA: stronger wire compatibility
• Rolling upgrades
  – With relaxed version checks
• Improvements for other projects
  – Stale node detection to improve HBase MTTR
• Block placement enhancements
  – Better support for other topologies such as VMs and Cloud
• On-the-wire encryption
  – Both data and RPC
• Expanding ecosystem, platforms and applicability
  – Native support for Windows
45.
Enterprise Readiness
• Storage fault-tolerance
  – Built into HDFS
  – 100% data reliability
• High Availability
• Standard Interfaces
  – WebHDFS (REST), FUSE, NFS, HttpFS, libwebhdfs, and libhdfs
• Wire protocol compatibility
  – Protocol buffers
• Rolling upgrades
• Snapshots
• Disaster Recovery
  – DistCp for parallel and incremental copies across clusters
  – Apache Ambari and HDP for automated management
46.
Work in Progress
• HDFS-2832: Heterogeneous storages
  – Datanode abstraction from a single storage to a collection of storages
  – Support different storage types: Disk and SSD
• HDFS-5535: Zero downtime rolling upgrade
  – Namenodes and Datanodes can be upgraded independently
  – No upgrade downtime
• HDFS-4685: ACLs
  – More flexible than user-group-permission
47.
Future Work
• HDFS-5477: Block manager as a service
  – Move block management out of the Namenode
  – Support different name services, e.g. a key-value store
• HDFS-3154: Immutable files
  – Write-once and then read-only
• HDFS-4704: Transient files
  – Tmp files will not be recorded in snapshots
48.
Q&A
• Myths and misinformation about HDFS
  – Not reliable (was never true)
  – Namenode dies, all state is lost (was never true)
  – Does not support disaster recovery (distcp in Hadoop 0.15)
  – Hard to operate for newcomers
  – Performance improvements (always ongoing)
• Major improvements in 1.2 and 2.x
  – Namenode is a single point of failure
  – Needs shared NFS storage for HA
  – Does not have point-in-time recovery
Thank You!