Hadoop Ecosystem
Motivation
Have you ever tried to process a 1 TB text file?
Doug Cutting tried...
What is Hadoop?
A distributed file system
● HDFS (Hadoop Distributed File System)
○ Based on very large blocks
○ High availability
○ Almost POSIX
○ Low friction for *nix users
HDFS is a file system
>hdfs dfs -ls -h /data/customers/customers_joined
Found 3 items
-rw-r--r-- 4 hadoop supergroup 0 2016-05-18 21:37 /data/customers/customers_joined/_SUCCESS
drwxr-xr-x - hadoop supergroup 0 2016-05-18 22:43 /data/customers/customers_joined/output
-rw-r--r-- 4 hadoop supergroup 380.1 M 2016-05-18 21:37 /data/customers/customers_joined/part-r-00000.lzo
>hdfs fsck /data/customers/customers_joined/part-r-00000.lzo -files -blocks
0. BP-351819023-10.113.142.94-1441385804011:blk_1096286844_357944571 len=134217728 repl=4
1. BP-351819023-10.113.142.94-1441385804011:blk_1096286845_357945328 len=134217728 repl=4
2. BP-351819023-10.113.142.94-1441385804011:blk_1096286846_357946005 len=130107025 repl=4
The filesystem under path '/data/customers/customers_joined/part-r-00000.lzo' is HEALTHY
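The fsck output above shows the 380.1 M file stored as two full 128 MiB blocks plus a shorter final block. A minimal Python sketch of that arithmetic follows; the function name `split_into_blocks` is hypothetical, and the file length is simply the sum of the three `len=` values reported by fsck.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 134217728 bytes, the len= of the full blocks above


def split_into_blocks(file_length, block_size=BLOCK_SIZE):
    """Return the byte length of each block a file of file_length would occupy."""
    blocks = []
    remaining = file_length
    while remaining > 0:
        size = min(block_size, remaining)
        blocks.append(size)
        remaining -= size
    return blocks


# Total length taken from the fsck output: sum of the three len= values.
file_length = 134217728 + 134217728 + 130107025
print(split_into_blocks(file_length))
print(round(file_length / (1024 * 1024), 1))  # size in MiB, comparable to ls -h
```

Only the last block is shorter than the block size, which is why a small file does not waste a full 128 MiB on disk.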
A distributed processing platform
YARN (Yet Another Resource Negotiator)
Distributes MapReduce applications
Splits processing across the nodes
Manages resources
Hadoop is an easily scalable platform for distributed storage and processing, agnostic to data type, hardware, and operating system.
Topology
HDFS: Namenode, Secondaries, Datanodes
YARN: ResourceManager, Secondaries, NodeManager
Locality
[Diagram: distances between nodes, labeled D=0, D=2, D=4, D=6]
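The D values in the locality diagram are read here as Hadoop's network distance between two nodes. This is an interpretation, since the slide image is not available: 0 for the same node, 2 for two nodes in the same rack, 4 for different racks in one data center, 6 for different data centers. A minimal sketch, assuming hypothetical locations written as `/datacenter/rack/node` paths:

```python
def network_distance(a, b):
    """Sum of the hops from each node up to their closest common ancestor."""
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")
    common = 0
    for part_a, part_b in zip(path_a, path_b):
        if part_a != part_b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


# Hypothetical node locations, for illustration only.
print(network_distance("/d1/r1/n1", "/d1/r1/n1"))
print(network_distance("/d1/r1/n1", "/d1/r1/n2"))
print(network_distance("/d1/r1/n1", "/d1/r2/n3"))
print(network_distance("/d1/r1/n1", "/d2/r3/n4"))
```

The scheduler prefers to run a task where this distance to the data is smallest.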
MapReduce
MapReduce with Pydoop
import pydoop.mapreduce.api as api
import pydoop.mapreduce.pipes as pp

class Mapper(api.Mapper):
    def map(self, context):
        # Emit (word, 1) for every word in the input line.
        words = context.value.split()
        for w in words:
            context.emit(w, 1)

class Reducer(api.Reducer):
    def reduce(self, context):
        # Sum the counts emitted for this word.
        s = sum(context.values)
        context.emit(context.key, s)

def __main__():
    pp.run_task(pp.Factory(Mapper, Reducer))
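To see what the framework does around the Mapper and Reducer above, here is a minimal pure-Python sketch that runs the same word count locally. It does not use Pydoop or Hadoop; the names `map_phase`, `reduce_phase`, and `run_local` are hypothetical, and the in-memory grouping only stands in for the distributed shuffle.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (word, 1) for every word, like Mapper.map above."""
    for word in line.split():
        yield word, 1


def reduce_phase(key, values):
    """Sum the counts for one word, like Reducer.reduce above."""
    return key, sum(values)


def run_local(lines):
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            groups[key].append(value)  # shuffle step: group values by key
    return dict(reduce_phase(key, values) for key, values in sorted(groups.items()))


print(run_local(["sofa tablet sofa", "tablet tv"]))
```

On a cluster, the map calls run next to the data blocks and the grouped values travel over the network to the reducers.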
Ecosystem
Big Data Landscape 2016
mattturck.com/2016/02/01/big-data-landscape
Hive
Created by Facebook
Exposes MapReduce as SQL
Maps huge files
Supports compression (splittable)
Features and syntax similar to MySQL
Analytical
Hive in action
hive> select keyword, count(1) total from searches group by keyword order by total desc limit 10;
Hadoop job information for Stage-1: number of mappers: 70; number of reducers: 2
2016-05-17 11:39:27,729 Stage-1 map = 0%, reduce = 0%
2016-05-17 11:40:12,255 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 406.39 sec
OK
sofa 138361
tablet 125837
fogao 106451
geladeira 99641
notebook 92787
celular 92195
tv 87594
iphone 82045
microondas 75806
Time taken: 116.755 seconds, Fetched: 10 row(s)
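For readers less used to SQL, the query above groups the searches by keyword, counts each group, and keeps the ten largest. A minimal in-memory sketch of the same result, assuming a hypothetical list of keywords in place of the `searches` table (Hive would run this as MapReduce jobs over files in HDFS, not like this):

```python
from collections import Counter


def top_keywords(searches, limit=10):
    """Count each keyword and return the most frequent ones, largest first."""
    return Counter(searches).most_common(limit)


# Hypothetical sample data, for illustration only.
sample = ["sofa", "tv", "sofa", "tablet", "sofa", "tv"]
print(top_keywords(sample, limit=2))
```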
Flume
Collector/Aggregator/Streamer/Sink
Distributed "tail -f" ingester
Log aggregator
Stream compression
HTTP requests
Sinks for HBase, HDFS, Avro,
Searches in Flume
# Sources
search_stream.sources = SearchStream
search_stream.sources.SearchStream.type = http
search_stream.sources.SearchStream.handler = org.apache.flume.source.http.JSONHandler
# Channels
search_stream.channels = FileChannel
search_stream.channels.FileChannel.type = file
search_stream.channels.FileChannel.dataDirs = /mnt/flume/data
# Sinks
search_stream.sinks = HDFSSink
search_stream.sinks.HDFSSink.type = hdfs
search_stream.sinks.HDFSSink.hdfs.filePrefix = search
search_stream.sinks.HDFSSink.hdfs.path = /flume/events/search
search_stream.sinks.HDFSSink.hdfs.fileType = CompressedStream
search_stream.sinks.HDFSSink.hdfs.codeC = lzop
# Tie everything together
search_stream.sources.SearchStream.channels = FileChannel
search_stream.sinks.HDFSSink.channel = FileChannel
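With the HTTP source and `JSONHandler` configured above, a client posts a JSON array of events, each with a `headers` map and a string `body`. A minimal sketch of such a client follows. The URL, the `origin` header, and the function names are assumptions for illustration; the configuration above does not show the host or port the source listens on.

```python
import json
import urllib.request


def build_flume_payload(keywords, origin="site"):
    """Build the JSON array of events that Flume's JSONHandler accepts."""
    events = [{"headers": {"origin": origin}, "body": kw} for kw in keywords]
    return json.dumps(events)


def send_to_flume(url, keywords):
    """POST the events to the HTTP source; url is a hypothetical endpoint."""
    request = urllib.request.Request(
        url,
        data=build_flume_payload(keywords).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


print(build_flume_payload(["sofa", "tablet"]))
```

Each event then flows through the file channel and is written by the HDFS sink under `/flume/events/search`.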
Impala
Uses its own engine (MPP)
Uses Hive metadata
Makes good use of Apache Parquet
Billions of rows in very few seconds
Performance scales linearly with cluster size
Impala and its billions of rows in a few seconds
impala-shell> select count(1) total from price_changes;
+------------+
| total |
+------------+
| 2127126382 |
+------------+
Returned 1 row(s) in 2.11s
impala-shell> select status, count(1) total from price_changes group by status;
+-----------+------------+
| status | total |
+-----------+------------+
| 0 | 2119762471 |
| 1 | 7363911 |
+-----------+------------+
Returned 2 row(s) in 2.82s
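A quick check of the numbers in the two results above: the per-status counts add up to the total, and the first query implies roughly a billion rows counted per second. This sketch only does arithmetic on the reported figures and says nothing about how Impala executes the queries; the function name is hypothetical.

```python
def rows_per_second(rows, seconds):
    """Throughput implied by a reported row count and elapsed time."""
    return rows / seconds


total_rows = 2127126382                      # from the first query
by_status = {0: 2119762471, 1: 7363911}      # from the second query

print(sum(by_status.values()) == total_rows)
print(f"{rows_per_second(total_rows, 2.11) / 1e9:.2f} billion rows/s")
```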
HBase
Based on Google BigTable
Columnar (dynamic): trillions of columns and rows
Random, real-time access
Splits keys into regions, which live on different machines
Customer in HBase
hbase> get 'customers', '40019869212'
COLUMN CELL
cliente:nomcli timestamp=1462020386697, value=NELSON FORTE
cliente:numdoc timestamp=1462020386697, value=598745328
cliente:origem timestamp=1462020386697, value=ERP
cliente:qtcompras timestamp=1462020386697, value=3
cliente:sexo timestamp=1462020386697, value=M
1 row(s) in 0.0910 seconds
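HBase keeps rows sorted by key and splits the key space into regions, each served by one machine, so a `get` only needs to find the region whose range contains the row key. A minimal sketch of that lookup, assuming hypothetical region start keys and server names (a real client finds regions through HBase's own metadata, not a hard-coded list):

```python
import bisect

# Hypothetical regions: each starts at its key and ends where the next begins.
REGION_START_KEYS = ["", "2", "4", "6", "8"]
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs4", "rs5"]


def locate_region(row_key):
    """Return (region index, server) for the region whose range holds row_key."""
    index = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return index, REGION_SERVERS[index]


print(locate_region("40019869212"))
```

Because keys are compared as sorted strings, rows with similar prefixes land in the same region, which is worth keeping in mind when designing row keys.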
Apache Pig
Pig
Pig Latin
Simple expressions
Builds analytical results with MapReduce
ETL over large data
Extensible (Java, Python, Ruby,
Joining customer databases
grunt> set output.compression.enabled true;
grunt> set output.compression.codec com.hadoop.compression.lzo.LzopCodec;
erp = LOAD 'hbase://customers_erp' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('client:customer_id', '-loadKey=true') AS (id:bytearray, name_erp:chararray);
site = LOAD 'hbase://customers_site' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('client:name', '-loadKey=true') AS (id:bytearray, name_site:chararray);
joined = JOIN erp BY id, site BY id;
stripped = FOREACH joined GENERATE name_erp, LOWER(name_erp) AS lowered_name_erp, name_site, LOWER(name_site) AS lowered_name_site;
rmf /data/customers/joined
STORE stripped INTO '/data/customers/joined' USING PigStorage('|');
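The Pig script above is an inner join by row key followed by a projection with lowercased names, written out with `|` as the field delimiter. A minimal in-memory sketch of the same steps, assuming hypothetical dictionaries keyed by customer id with one name per id (Pig's JOIN also handles repeated keys and runs distributed, which this does not):

```python
def join_customers(erp, site):
    """Inner join on id, then generate names and their lowercased forms."""
    rows = []
    for customer_id, name_erp in erp.items():
        if customer_id in site:
            name_site = site[customer_id]
            rows.append((name_erp, name_erp.lower(), name_site, name_site.lower()))
    return rows


def to_pig_storage(rows, delimiter="|"):
    """Format rows the way PigStorage('|') would lay out each line."""
    return [delimiter.join(row) for row in rows]


# Hypothetical sample data, for illustration only.
erp = {"1": "MARIA SILVA", "2": "ANA SOUZA"}
site = {"1": "Maria Silva", "3": "Joao Lima"}
print(to_pig_storage(join_customers(erp, site)))
```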
Sqoop
Imports structured data into HDFS and vice versa
Distributes the import across machines (via MapReduce)
Simple configuration
Job server
Importing pushes
sqoop job --create push_notification -- import --connect jdbc:mysql://server/push \
--username importer \
--password-file file:///.password_file \
--split-by created_at \
--num-mappers 4 \
--compress \
--compression-codec com.hadoop.compression.lzo.LzopCodec \
--delete-target-dir \
--target-dir /tmp/push \
--null-string 'N' \
--null-non-string 'N' \
--as-textfile \
--outdir /sqoop-scripts/src \
--bindir /sqoop-scripts/jar \
--direct --query 'select push_id, created_at, email, sent + 0 from pushes where $CONDITIONS'
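The `--split-by` and `--num-mappers` options above are what let Sqoop parallelize the import: it looks up the minimum and maximum of the split column, divides that range into one slice per mapper, and each mapper runs the query with `$CONDITIONS` replaced by its slice. A simplified sketch of the range division, assuming integer bounds (for example timestamps converted to epoch seconds); the function name and sample bounds are hypothetical and Sqoop's real splitters handle more types and edge cases.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide [lo, hi] into num_mappers contiguous ranges of near-equal width."""
    step = (hi - lo) / num_mappers
    bounds = [lo + round(i * step) for i in range(num_mappers)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


# Hypothetical bounds for the split column, one range per mapper.
for start, end in split_ranges(0, 1000, 4):
    print(f"created_at >= {start} AND created_at < {end}")
```

In this sketch the last range would also need to include its upper bound so the maximum value is not dropped. A skewed split column gives uneven slices, so some mappers finish much later than others.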
Who uses it and contributes?
https://wiki.apache.org/hadoop/PoweredBy
2000+ nodes | 8800 cores | 60 PB
Hive + Presto
Cassandra -> HDFS (analysis)
"New" LZO + Elephant Bird
1000+ nodes
Genie + Lipstick
900+ nodes
Luigi + Snakebite
200+ nodes | ~1 PB/day
(metadata)
~1 GB/day
Climate data
100+ nodes 1400+ nodes
10000+ nodes | 100000+ cores
72 PB RAM
6 nodes | 56 cores | 458 GB RAM | 30 TB
1000 jobs/day
Questions?
Thank you!

Hadoop Ecosystem at Magazine Luiza

Editor's notes

  1. Solves the problem of sequential and random reads (100 MB/s) from a hard disk by distributing parts of the file across several machines. The transfer rate of current disks is much higher than their seek speed. B-Trees work well up to gigabytes at this transfer rate.
  2. HDDs typically have 512-byte blocks. Ext4 has 4 KB blocks. HDFS has 128 MB blocks.
  3. Supports: S3, FTP, distributed RAID, local, WebHDFS (REST)
  4. CERN - 30 PB/day
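Note 1 can be made concrete with a little arithmetic: at 100 MB/s, a single disk needs hours just to read the 1 TB file from the motivation slide, while spreading the blocks over many machines divides that time. A minimal sketch, assuming perfectly even distribution and no coordination overhead (an idealization; the function name and the choice of 100 machines are hypothetical):

```python
def scan_seconds(size_bytes, rate_bytes_per_s=100e6, machines=1):
    """Time to read size_bytes when machines disks read in parallel."""
    return size_bytes / (rate_bytes_per_s * machines)


one_tb = 10**12
print(scan_seconds(one_tb) / 3600)         # hours on a single disk
print(scan_seconds(one_tb, machines=100))  # seconds on 100 machines
```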