MapR, Implications for Integration CHUG - August 2011
Outline
- MapR system overview
  - Map-reduce review
  - MapR architecture
  - Performance results
  - Map-reduce on MapR
- Architectural implications
  - Search indexing / deployment
  - EM algorithm for machine learning
  - ... and more ...

Map-Reduce
[diagram: input -> shuffle -> output]
Bottlenecks and Issues
- Read-only files
- Many copies in I/O path
- Shuffle based on HTTP
  - Can't use new technologies
  - Eats file descriptors
- Spills go to local file space
  - Bad for skewed distribution of sizes
MapR Areas of Development
MapR Improvements
- Faster file system
  - Fewer copies
  - Multiple NICs
  - No file descriptor or page-buf competition
- Faster map-reduce
  - Uses distributed file system
  - Direct RPC to receiver
  - Very wide merges
MapR Innovations
- Volumes
  - Distributed management
  - Data placement
- Read/write random access file system
  - Allows distributed meta-data
  - Improved scaling
  - Enables NFS access
- Application-level NIC bonding
- Transactionally correct snapshots and mirrors
MapR's Containers
Files/directories are sharded into blocks, which are placed into mini NNs (containers) on disks
  - Directories & files
  - Data blocks
  - Replicated on servers
  - No need to manage directly
Containers are 16-32 GB segments of disk, placed on nodes
Container locations and replication
[diagram: CLDB table listing, for each container, the nodes that host it, for example "N1, N2", "N3, N2", "N1, N3"]
Container location database (CLDB) keeps track of nodes hosting each container
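The lookup role described on this slide can be pictured as a map from container id to the nodes holding a replica. The following is only a toy sketch under that assumption; the class name, method names and container ids are hypothetical and are not MapR's API.

```python
from collections import defaultdict


class ContainerLocationDB:
    """Toy stand-in for a CLDB: maps container id -> nodes hosting a replica."""

    def __init__(self):
        self._locations = defaultdict(list)

    def register(self, container_id, node):
        # Record that `node` hosts a replica of `container_id`.
        if node not in self._locations[container_id]:
            self._locations[container_id].append(node)

    def lookup(self, container_id):
        # Nodes hosting the container (empty list if the id is unknown).
        return list(self._locations.get(container_id, []))

    def containers_on(self, node):
        # Inverse query: containers that have a replica on `node`.
        return sorted(c for c, nodes in self._locations.items() if node in nodes)


cldb = ContainerLocationDB()
# Hypothetical placements in the style of the diagram labels.
for container_id, nodes in [(1, ["N1", "N2"]), (2, ["N3", "N2"]), (3, ["N1", "N3"])]:
    for node in nodes:
        cldb.register(container_id, node)

print(cldb.lookup(1))
print(cldb.containers_on("N2"))
```

The point of the sketch is that the database tracks containers rather than individual blocks, so the table stays small compared with per-block bookkeeping.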
MapR Scaling
Containers represent 16 - 32GB of data
  - 100M containers = ~ 2 Exabytes (a very large cluster)
250 bytes DRAM to cache a container
  - But not necessary, can page to disk
  - Typical large 10PB cluster needs 2GB
Container-reports are 100x - 1000x < HDFS block-reports
  - Increase container size to 64G to serve 4EB cluster
  - Map/reduce not affected
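The headline figures on this slide can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted above (100M containers, 16 to 32 GB each, 250 bytes of DRAM per cached container) and assumes decimal units; the variable names are illustrative.

```python
GB = 10**9   # assumption: decimal gigabytes
EB = 10**18  # assumption: decimal exabytes

containers = 100_000_000          # 100M containers, from the slide
container_size_gb = (16, 32)      # container size range, from the slide
cache_bytes_per_container = 250   # DRAM per cached container, from the slide

low_eb = containers * container_size_gb[0] * GB / EB
high_eb = containers * container_size_gb[1] * GB / EB
cache_gb = containers * cache_bytes_per_container / GB

print(f"addressable data: {low_eb:.1f} to {high_eb:.1f} EB")
print(f"DRAM to cache every container entry: {cache_gb:.0f} GB")
```

The 1.6 to 3.2 EB range brackets the "~ 2 Exabytes" quoted for 100M containers.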
Terasort on MapR
10+1 nodes: 8 core, 24GB DRAM, 11 x 1TB SATA 7200 rpm
[chart: elapsed time (mins), lower is better]
HBase on MapR
YCSB Random Read with 1 billion 1K records
10+1 node cluster: 8 core, 24GB DRAM, 11 x 1TB 7200 RPM
[chart: records per second, higher is better]
Small Files (Apache Hadoop, 10 nodes)
Op:
  - create file
  - write 100 bytes
  - close
Notes:
  - NN not replicated
  - NN uses 20G DRAM
  - DN uses 2G DRAM
[chart: rate (files/sec) vs. # of files (m); series: out of box, tuned]
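The operation being timed (create file, write 100 bytes, close) is simple enough to sketch. This is not the harness behind the chart, which the slides do not show; it is a minimal local illustration that writes into a temporary directory, and the function name and file naming scheme are assumptions.

```python
import os
import tempfile
import time


def create_small_files(directory, count, payload=b"x" * 100):
    """Create `count` files of 100 bytes each and return files per second."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:08d}")
        with open(path, "wb") as f:   # create file
            f.write(payload)          # write 100 bytes
        # leaving the with-block closes the file
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


with tempfile.TemporaryDirectory() as d:
    rate = create_small_files(d, 1000)
    created = len(os.listdir(d))
    print(f"{rate:.0f} files/sec")
```

Pointing `directory` at a mounted cluster path instead of a temporary directory would exercise the file system under test rather than the local disk.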
MUCH faster for some operations
Same 10 nodes ...
[chart: create rate vs. # of files (millions)]
What MapR is not
- Volumes != federation
  - MapR supports > 10,000 volumes all with independent placement and defaults
  - Volumes support snapshots and mirroring
- NFS != FUSE
  - Checksum and compress at gateway
  - IP fail-over
  - Read/write/update semantics at full speed
- MapR != maprfs
New Capabilities
NFS mounting models
- Export to the world
  - NFS gateway runs on selected gateway hosts
- Local server
  - NFS gateway runs on local host
  - Enables local compression and check summing
- Export to self
  - NFS gateway runs on all data nodes, mounted from localhost
Export to the world
[diagram: an NFS client reaching the cluster through several NFS servers]
Local server
[diagram: client application with an NFS server on the same host, connected to the cluster nodes]
Universal export to self
[diagram: every cluster node runs its tasks alongside its own NFS server]
Nodes are identical
Application architecture
So now we have a hammer
Let's find us some nails!
Sharded text indexing
[diagram: input documents -> map -> reducer -> local disk -> clustered index storage -> local disk -> search engine]
- Map: assign documents to shards
- Reducer: index text to local disk and then copy index to distributed file store
- Search engine: copy to local disk typically required before index can be loaded
Sharded text indexing
- Mapper assigns document to shard
  - Shard is usually hash of document id
- Reducer indexes all documents for a shard
  - Indexes created on local disk
  - On success, copy index to DFS
  - On failure, delete local files
- Must avoid directory collisions: can't use shard id!
- Must manage and reclaim local disk space
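The conventional flow described on this slide can be sketched in a few functions. Everything here is illustrative: the function names, the use of MD5 as the hash, the per-attempt directory naming, and the text file standing in for a real index are all assumptions, and two temporary directories play the roles of local disk and the DFS.

```python
import hashlib
import os
import shutil
import tempfile
import uuid


def shard_for(doc_id, num_shards):
    # Stable hash, so every mapper sends a given document to the same shard.
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def build_index(shard, doc_ids, local_root, dfs_root):
    # A unique directory per attempt avoids collisions between retried or
    # concurrent attempts for the same shard (the shard id alone is not enough).
    work = os.path.join(local_root, f"shard-{shard}-{uuid.uuid4().hex}")
    os.makedirs(work)
    try:
        # Placeholder "index": a sorted list of document ids.
        with open(os.path.join(work, "index.txt"), "w") as f:
            for doc_id in sorted(doc_ids):
                f.write(doc_id + "\n")
        dest = os.path.join(dfs_root, f"shard-{shard}")
        shutil.copytree(work, dest)               # on success, copy index to the "DFS"
        return dest
    finally:
        shutil.rmtree(work, ignore_errors=True)   # reclaim local disk space


with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as dfs:
    by_shard = {}
    for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]:
        by_shard.setdefault(shard_for(doc_id, 2), []).append(doc_id)
    indexed = 0
    for shard, doc_ids in by_shard.items():
        dest = build_index(shard, doc_ids, local, dfs)
        with open(os.path.join(dest, "index.txt")) as f:
            indexed += len(f.read().split())
    leftover = os.listdir(local)
```

The `try`/`finally` is where the bookkeeping burden shows up: the reducer itself has to clean up local files, which is the part the later slides hand over to the framework.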
Conventional data flow
[diagram: input documents -> map -> reducer -> local disk -> clustered index storage -> local disk -> search engine]
- Failure of a reducer causes garbage to accumulate in the local disk
- Failure of search engine requires another download of the index from clustered storage
Simplified NFS data flows
[diagram: input documents -> map -> reducer -> clustered index storage -> search engine]
- Index to task work directory via NFS
- Failure of a reducer is cleaned up by map-reduce framework
- Search engine reads mirrored index directly
Simplified NFS data flows
[diagram: input documents -> map -> reducer -> mirrors -> search engines]
- Mirroring allows exact placement of index data
- Arbitrary levels of replication also possible
How about another one?
K-means
Classic E-M based algorithm
Given cluster centroids,
  - Assign each data point to nearest centroid
  - Accumulate new centroids
  - Rinse, lather, repeat
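The loop described on this slide can be written out directly. This is a minimal single-machine sketch in plain Python; the function names, the fixed iteration count and the choice to keep an old centroid when its cluster is empty are assumptions made for illustration.

```python
def nearest(point, centroids):
    # Index of the centroid with the smallest squared Euclidean distance.
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))


def kmeans_step(points, centroids):
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)        # assign to nearest centroid
        counts[k] += 1
        for j, value in enumerate(point):    # accumulate new centroid
            sums[k][j] += value
    # Keep the old centroid when a cluster received no points.
    return [[s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
            for k in range(len(centroids))]


def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):              # rinse, lather, repeat
        centroids = kmeans_step(points, centroids)
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

Each pass needs the full set of current centroids as side data, which is the detail the later slides focus on when moving this into map-reduce.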
K-means, the movie
[diagram: input and centroids feed "assign to nearest centroid", followed by "aggregate new centroids"]
But ...

Parallel Stochastic Gradient Descent
[diagram: input and model feed "train sub model", followed by "average models"]
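The "train sub model, then average models" pattern on this slide can be illustrated on a toy problem. The sketch below assumes a least-squares linear model, synthetic noise-free data and four partitions trained one after another in a single process; none of these choices come from the slides.

```python
import random


def sgd_train(examples, dim, epochs=50, rate=0.05, seed=0):
    # Plain SGD for least-squares linear regression on one partition.
    rng = random.Random(seed)
    w = [0.0] * dim
    examples = list(examples)
    for _ in range(epochs):
        rng.shuffle(examples)
        for x, y in examples:
            error = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - rate * error * xi for wi, xi in zip(w, x)]
    return w


def average_models(models):
    # Component-wise mean of the sub-model weight vectors.
    return [sum(ws) / len(models) for ws in zip(*models)]


# Synthetic data for y = 2*x1 - 1*x2 (an assumption made for the example).
rng = random.Random(42)
data = []
for _ in range(400):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    data.append((x, 2.0 * x[0] - 1.0 * x[1]))

partitions = [data[i::4] for i in range(4)]          # one partition per "mapper"
sub_models = [sgd_train(p, dim=2, seed=i) for i, p in enumerate(partitions)]
model = average_models(sub_models)
print(model)
```

In a cluster setting each call to `sgd_train` would run in its own task, and only the small weight vectors would need to be exchanged.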
Variational Dirichlet Assignment
[diagram: input and model feed "gather sufficient statistics", followed by "update model"]
Old tricks, new dogs
- Mapper
  - Assign point to cluster
  - Emit cluster id, (1, point)
- Combiner and reducer
  - Sum counts, weighted sum of points
  - Emit cluster id, (n, sum/n)
- Output to HDFS
[annotations: read from HDFS to local disk by distributed cache; read from local disk from distributed cache; written by map-reduce]
Old tricks, new dogs
- Mapper
  - Assign point to cluster
  - Emit cluster id, (1, point)
- Combiner and reducer
  - Sum counts, weighted sum of points
  - Emit cluster id, (n, sum/n)
- Output to HDFS
[annotations: read from NFS; written by map-reduce; MapR FS]
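The mapper and combiner/reducer outlined here can be simulated in one process. Because every value is a (count, mean) pair, the same function can serve as combiner and reducer. The function names and the way input splits are simulated are assumptions for illustration, not Hadoop API calls.

```python
from collections import defaultdict


def mapper(point, centroids):
    # Assign point to cluster, emit cluster id, (1, point).
    k = min(range(len(centroids)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))
    return k, (1, tuple(point))


def combine(values):
    # Sum counts and the count-weighted sum of points, emit (n, sum/n).
    values = list(values)
    n = sum(count for count, _ in values)
    dim = len(values[0][1])
    weighted = [sum(count * mean[j] for count, mean in values) for j in range(dim)]
    return n, tuple(s / n for s in weighted)


def run(points, centroids, splits=2):
    partial = defaultdict(list)
    for s in range(splits):                          # one simulated map task per split
        grouped = defaultdict(list)
        for point in points[s::splits]:
            k, value = mapper(point, centroids)
            grouped[k].append(value)
        for k, values in grouped.items():
            partial[k].append(combine(values))       # combiner output
    return {k: combine(values) for k, values in partial.items()}  # reducer output


points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (10.0, 10.0)]
result = run(points, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

Weighting each partial mean by its count is what makes the combiner safe to apply any number of times before the reducer.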
Poor man's Pregel
Mapper (lines in bold can use conventional I/O via NFS)
    while not done:
        read and accumulate input models
        for each input:
            accumulate model
        write model
        synchronize
        reset input format
    emit summary
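The loop on this slide can be mimicked with ordinary file reads and writes. The sketch below is a sequential, single-process simulation: a temporary directory stands in for an NFS-mounted cluster path, the model is a single number, the update rule is a toy, and the file naming is hypothetical.

```python
import json
import os
import tempfile


def read_models(shared_dir, step):
    # Read and accumulate (average) every model written during `step`.
    models = []
    for name in sorted(os.listdir(shared_dir)):
        if name.startswith(f"step{step}-"):
            with open(os.path.join(shared_dir, name)) as f:
                models.append(json.load(f))
    return sum(models) / len(models) if models else None


def worker_step(shared_dir, worker_id, step, inputs):
    model = read_models(shared_dir, step - 1)
    if model is None:
        model = 0.0                          # no earlier models: start from zero
    for x in inputs:                         # for each input: accumulate model
        model += 0.5 * (x - model)           # toy update: move towards the input
    path = os.path.join(shared_dir, f"step{step}-worker{worker_id}.json")
    with open(path, "w") as f:               # write model with ordinary file I/O
        json.dump(model, f)


def run(shared_dir, partitions, steps):
    for step in range(1, steps + 1):
        for worker_id, inputs in enumerate(partitions):
            worker_step(shared_dir, worker_id, step, inputs)
        # Finishing the inner loop plays the role of "synchronize".
    return read_models(shared_dir, steps)    # emit summary


with tempfile.TemporaryDirectory() as shared:
    summary = run(shared, [[1.0, 1.0], [3.0, 3.0]], steps=20)
    print(summary)
```

With real concurrent mappers the barrier would need an actual synchronization mechanism; the sketch only shows that reading and writing models reduces to plain file operations once the cluster is visible as a file system.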
Click modeling architecture
[diagram labels: input, feature extraction and down sampling, data join, side-data (now via NFS), map-reduce, sequential SGD learning]

 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

MapR, Implications for Integration

  • 1. MapR, Implications for Integration CHUG – August 2011
  • 2. Outline: MapR system overview; map-reduce review; MapR architecture; performance results; map-reduce on MapR; architectural implications; search indexing / deployment; EM algorithm for machine learning; ... and more ...

  • 4. Bottlenecks and Issues: read-only files; many copies in I/O path; shuffle based on HTTP; can't use new technologies; eats file descriptors; spills go to local file space; bad for skewed distribution of sizes
  • 5. MapR Areas of Development
  • 6. MapR Improvements: faster file system; fewer copies; multiple NICs; no file descriptor or page-buf competition; faster map-reduce; uses distributed file system; direct RPC to receiver; very wide merges
  • 7. MapR Innovations: volumes; distributed management; data placement; read/write random access file system; allows distributed meta-data; improved scaling; enables NFS access; application-level NIC bonding; transactionally correct snapshots and mirrors
  • 12. No need to manage directly. Containers are 16-32 GB segments of disk, placed on nodes.
  • 13. Container locations and replication: the container location database (CLDB) keeps track of the nodes hosting each container. [Diagram: CLDB mapping containers to node lists such as N1, N2; N3, N2; N1, N3.]
  • 16. But not necessary, can page to disk
  • 18. Increase container size to 64G to serve 4EB cluster
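The container sizing above is simple arithmetic, and a rough sketch can make it concrete. The snippet below counts how many containers of a given size are needed to cover a cluster. Decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and the helper name `containers_needed` are assumptions for illustration, not figures or APIs from the talk.

```python
# Back-of-the-envelope container counts (illustrative only).
GB = 10**9   # assumption: decimal gigabytes
EB = 10**18  # assumption: decimal exabytes


def containers_needed(cluster_bytes: int, container_bytes: int) -> int:
    """Number of containers required to hold cluster_bytes (ceiling division)."""
    return -(-cluster_bytes // container_bytes)


if __name__ == "__main__":
    for size_gb in (16, 32, 64):
        n = containers_needed(4 * EB, size_gb * GB)
        print(f"{size_gb} GB containers: {n:,} needed for a 4 EB cluster")
```

Larger containers mean fewer entries for the CLDB to track for the same amount of data, which is the trade-off the slide points at.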
  • 20. Terasort on MapR. 10+1 nodes: 8 core, 24 GB DRAM, 11 x 1 TB SATA 7200 rpm. [Chart: elapsed time (mins); lower is better.]
  • 21. HBase on MapR. YCSB random read with 1 billion 1K records. 10+1 node cluster: 8 core, 24 GB DRAM, 11 x 1 TB 7200 RPM. [Chart: records per second; higher is better.]
  • 22. Small Files (Apache Hadoop, 10 nodes). Op: create file, write 100 bytes, close. Notes: NN not replicated; NN uses 20 GB DRAM; DN uses 2 GB DRAM. [Chart: rate (files/sec) vs. # of files (millions); series "Out of box" and "Tuned".]
  • 23. MUCH faster for some operations. Same 10 nodes ... [Chart: create rate vs. # of files (millions).]
  • 24. What MapR is not. Volumes != federation: MapR supports > 10,000 volumes, all with independent placement and defaults; volumes support snapshots and mirroring. NFS != FUSE: checksum and compress at gateway; IP fail-over; read/write/update semantics at full speed. MapR != maprfs.
  • 26. NFS mounting models. Export to the world: NFS gateway runs on selected gateway hosts. Local server: NFS gateway runs on local host; enables local compression and checksumming. Export to self: NFS gateway runs on all data nodes, mounted from localhost.
  • 27. Export to the world. [Diagram labels: NFS Server (x4); NFS Client.]
  • 28. Local server. [Diagram labels: Client; Application; NFS Server; Cluster Nodes.]
  • 29. Universal export to self. [Diagram labels: Cluster Nodes; Cluster Node; Task; NFS Server.]
  • 30. Nodes are identical. [Diagram labels: three cluster nodes, each with a Task and an NFS Server.]
  • 31. Application architecture. So now we have a hammer. Let's find us some nails!
  • 32. Sharded text indexing. Map: assign documents to shards. Reducer: index text to local disk and then copy index to distributed file store. Copy to local disk typically required before index can be loaded. [Diagram labels: Input documents; Map; Reducer; Local disk; Clustered index storage; Local disk; Search Engine.]
  • 33. Sharded text indexing. Mapper assigns document to shard; shard is usually hash of document id. Reducer indexes all documents for a shard; indexes created on local disk; on success, copy index to DFS; on failure, delete local files. Must avoid directory collisions (can't use shard id!). Must manage and reclaim local disk space.
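As a rough illustration of the mapper-side shard assignment described above, the sketch below hashes a document id to a shard and builds a per-attempt output directory so that a retried reducer does not collide with an earlier attempt. The function names, the directory layout, and the choice of SHA-256 are assumptions for illustration; the talk only says the shard is usually a hash of the document id and that the shard id alone cannot name the directory.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard with a hash that is stable across processes.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used here to keep the assignment repeatable.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def attempt_dir(base: str, shard: int, attempt_id: str) -> str:
    """Hypothetical layout: one directory per (shard, task attempt)."""
    return f"{base}/shard-{shard:04d}/{attempt_id}"


if __name__ == "__main__":
    docs = ["doc-1", "doc-2", "doc-3", "doc-4"]
    by_shard = {}
    for doc_id in docs:
        by_shard.setdefault(shard_for(doc_id, 2), []).append(doc_id)
    print(by_shard)
    print(attempt_dir("/tmp/index", 0, "attempt_0"))
```

Keying the directory on the task attempt as well as the shard is one way to let two attempts for the same shard coexist until one of them succeeds.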
  • 34. Conventional data flow. Failure of a reducer causes garbage to accumulate in the local disk. Failure of search engine requires another download of the index from clustered storage. [Diagram labels: Input documents; Map; Reducer; Local disk; Clustered index storage; Local disk; Search Engine.]
  • 35. Simplified NFS data flows. Index to task work directory via NFS. Failure of a reducer is cleaned up by map-reduce framework. Search engine reads mirrored index directly. [Diagram labels: Input documents; Map; Reducer; Clustered index storage; Search Engine.]
  • 36. Simplified NFS data flows. Mirroring allows exact placement of index data. Arbitrary levels of replication also possible. [Diagram labels: Input documents; Map; Reducer; Mirrors; Search Engine (x2).]
  • 38. K-means. Classic E-M based algorithm. Given cluster centroids: assign each data point to nearest centroid; accumulate new centroids; rinse, lather, repeat.
  • 39. K-means, the movie. [Diagram labels: Input; Centroids; Assign to nearest centroid; Aggregate new centroids.]
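The assign-then-aggregate loop on the K-means slides can be sketched in a few lines of plain Python. This is a minimal single-machine illustration, not the speaker's code; the data, the initial centroids, and the rule of leaving an empty cluster's centroid unchanged are assumptions.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def kmeans_step(points, centroids):
    """One iteration: assign each point to its nearest centroid, then re-average."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    # Assumption: a cluster that received no points keeps its old centroid.
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [[1.0, 1.0], [9.0, 9.0]]
for _ in range(5):  # "rinse, lather, repeat"
    centroids = kmeans_step(points, centroids)
print(centroids)
```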
  • 41. Parallel Stochastic Gradient Descent. [Diagram labels: Input; Model; Train sub model; Average models.]
  • 42. Variational Dirichlet Assignment. [Diagram labels: Input; Model; Gather sufficient statistics; Update model.]
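The "train sub model, average models" pattern can be illustrated with a toy example: each shard of the input trains its own model by SGD, and the sub-models are then averaged. This is a sketch under simple assumptions (a noiseless 1-D linear model, three shards, fixed learning rate); the talk does not specify the model or the hyperparameters.

```python
def sgd_fit(samples, epochs=200, lr=0.1):
    """Fit y = w * x + b on one shard with plain stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y  # gradient of 0.5 * err**2 w.r.t. the prediction
            w -= lr * err * x
            b -= lr * err
    return w, b


def average_models(models):
    """Average the parameters of independently trained sub-models."""
    n = len(models)
    return (sum(m[0] for m in models) / n, sum(m[1] for m in models) / n)


# Toy data generated from y = 2x + 1 (an assumption for the demo).
data = [(x, 2.0 * x + 1.0) for x in (i / 30.0 - 1.0 for i in range(61))]
shards = [data[i::3] for i in range(3)]           # each "mapper" sees one shard
sub_models = [sgd_fit(shard) for shard in shards]  # train sub-models independently
w, b = average_models(sub_models)                  # the "reduce" step
print(w, b)
```

In a map-reduce setting each call to `sgd_fit` would run in its own task, and only the small parameter tuples would be shuffled to the averaging step.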
  • 43. Old tricks, new dogs. Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. [Annotations: read from HDFS to local disk by distributed cache; read from local disk from distributed cache; written by map-reduce.]
  • 44. Old tricks, new dogs. Mapper: assign point to cluster; emit cluster id, (1, point). Combiner and reducer: sum counts, weighted sum of points; emit cluster id, (n, sum/n). Output to HDFS. [Annotations: read from NFS; written by map-reduce; MapR FS.]
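The mapper/combiner/reducer contract on these two slides can be simulated in one process. The point of emitting `(n, sum/n)` is that the same merge function works as both combiner and reducer: partial means are recombined by weighting each with its count. The sketch below is an in-memory illustration with made-up data; `run_iteration` and the split layout are hypothetical names, and no Hadoop API is used.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_point(point, centroids):
    """Mapper: emit cluster id, (1, point)."""
    return nearest(point, centroids), (1, tuple(point))


def combine(partials):
    """Combiner and reducer: merge (n, mean) pairs into a single (n, mean)."""
    total = 0
    weighted = None
    for n, mean in partials:
        if weighted is None:
            weighted = [0.0] * len(mean)
        for d, value in enumerate(mean):
            weighted[d] += n * value  # undo the division to recover the sum
        total += n
    return total, tuple(v / total for v in weighted)


def run_iteration(splits, centroids):
    """Simulate map + combine per split, then shuffle and reduce by cluster id."""
    shuffled = defaultdict(list)
    for split in splits:
        grouped = defaultdict(list)
        for point in split:
            cluster, value = map_point(point, centroids)
            grouped[cluster].append(value)
        for cluster, values in grouped.items():
            shuffled[cluster].append(combine(values))  # combiner output
    return {cluster: combine(values) for cluster, values in shuffled.items()}


splits = [[(0.0, 0.0), (0.0, 3.0), (10.0, 10.0)], [(0.0, 6.0)]]
result = run_iteration(splits, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```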
  • 45. Poor man's Pregel. Mapper: while not done: read and accumulate input models; for each input: accumulate model; write model; synchronize; reset input format. Emit summary. (Lines in bold can use conventional I/O via NFS.)
  • 46. Click modeling architecture. [Diagram labels: Input; Feature extraction and down sampling; Data join; Side-data (now via NFS); Map-reduce; Sequential SGD learning.]
  • 47. Click modeling architecture. Map-reduce cooperates with NFS. [Diagram labels: Input; Feature extraction and down sampling; Data join; Side-data; Map-reduce (x2); Sequential SGD learning (x4).]
  • 49. Hybrid model flow. [Diagram labels: Feature extraction and down sampling; Map-reduce (x2); SVD (PageRank) (spectral); ??; Down stream modeling; Deployed model.]
  • 51. Hybrid model flow. [Diagram labels: Feature extraction and down sampling; Sequential; Map-reduce; SVD (PageRank) (spectral); Down stream modeling; Deployed model.]
  • 53. Trivial visualization interface. Map-reduce output is visible via NFS. Legacy visualization just works: $ R > x <- read.csv("/mapr/my.cluster/home/ted/data/foo.out") > plot(error ~ t, x) > q(save='n')
  • 54. Conclusions. We used to know all this. Tab completion used to work. 5 years of work-arounds have clouded our memories. We just have to remember the future.