Large-scale Neural Modeling in MapReduce and Giraph
Co-authors
Nicholas D. Spielman
Neuroscience Program
University of St. Thomas
Presenter
Shuo Yang
Graduate Programs in Software
University of St. Thomas
Special thanks
Bhabani Misra, PhD
Graduate Programs in Software
University of St. Thomas
Jadin C. Jackson, PhD
Department of Biology
University of St. Thomas
Bradley S. Rubin, PhD
Graduate Programs in Software
University of St. Thomas
Why Hadoop & What is Hadoop
Why not supercomputers?
Expensive
Limited access
Scalability
Why Hadoop?
Runs on commodity hardware
Scalable
Full-fledged ecosystem & community
Open-source implementation of MapReduce, based on Java
MapReduce Model
[Figure: a client submits a job; the input data in HDFS is split across Map tasks, and Reduce tasks write the output.]
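Neural Model (Izhikevich model)
[Figure: each neuron sums the input currents I1, I2, ..., In from its neighbors (∑ I), updates its membrane potential (∆v), and sends currents to all neighbors. Panels: synaptic weight matrix; simulation results, with axes Time Step (0 to 1000) and Neuron ID (tick labels 0 and 2500).]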
This is a graph structure
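The per-neuron update in the figure (sum the input currents, then apply ∆v) can be sketched as follows. This is a minimal illustration and not the authors' code: the equations are the standard published form of the Izhikevich model, while the regular-spiking parameter values, the 1 ms forward-Euler step, and the names `Neuron`, `step`, and `count_spikes` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Neuron:
    # Assumed regular-spiking parameters; the slides do not state which values were used.
    v: float = -65.0   # membrane potential (mV)
    u: float = -13.0   # recovery variable, initialised to b * v
    a: float = 0.02
    b: float = 0.2
    c: float = -65.0   # reset value for v after a spike
    d: float = 8.0     # increment applied to u after a spike


def step(n: Neuron, total_current: float, dt: float = 1.0) -> bool:
    """Advance one neuron by one time step; return True if it spiked."""
    fired = n.v >= 30.0
    if fired:
        # Spike reset rule of the Izhikevich model.
        n.v = n.c
        n.u += n.d
    dv = 0.04 * n.v ** 2 + 5.0 * n.v + 140.0 - n.u + total_current
    du = n.a * (n.b * n.v - n.u)
    n.v += dt * dv   # forward Euler; an assumption, not taken from the slides
    n.u += dt * du
    return fired


def count_spikes(total_current: float, steps: int = 1000) -> int:
    """Drive a single neuron with a constant summed current and count spikes."""
    neuron = Neuron()
    return sum(step(neuron, total_current) for _ in range(steps))


spikes_driven = count_spikes(10.0)
spikes_silent = count_spikes(0.0)
```

In a network, `total_current` would be the sum ∑ I of the currents arriving from neighboring neurons, which is the part that the MapReduce and Giraph implementations distribute.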
Basic MapReduce Implementation
[Figure: dataflow for one iteration (Map, Sort & Shuffle, Reduce). Each Mapper reads a neuron (N1, N2, N3) and its local structure from HDFS (the initial input, or the output of the previous job) and emits the neuron together with the currents for its neighbors. Each Reducer sums the currents to its neuron, updates the neuron, and writes it back to HDFS.]
Problems:
Synaptic currents are sent directly to the reducers without local aggregation
The graph structure is shuffled in each iteration
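One iteration of the basic design can be simulated in plain Python to make both problems visible. This is a sketch under stated assumptions, not the authors' Hadoop job: the record layout, the names `map_neuron`, `reduce_neuron`, and `run_iteration`, and the toy update rule (add the summed current to `v`) are all hypothetical, and every neuron is assumed to send its synaptic weight as a current on each step.

```python
from collections import defaultdict


def map_neuron(neuron_id, neuron):
    """Emit the neuron itself plus one current record per outgoing synapse."""
    # The graph structure travels through the shuffle on every iteration.
    yield neuron_id, ("NODE", neuron)
    for target, weight in neuron["synapses"].items():
        # One record per synapse: no local aggregation.
        yield target, ("CURRENT", weight)


def reduce_neuron(neuron_id, values):
    """Recover the neuron record, sum its input currents, and update it."""
    neuron, total = None, 0.0
    for kind, payload in values:
        if kind == "NODE":
            neuron = payload
        else:
            total += payload
    # Toy update rule standing in for the real neuron model.
    return neuron_id, dict(neuron, v=neuron["v"] + total)


def run_iteration(graph):
    """Simulate map, sort & shuffle, and reduce in memory."""
    shuffled = defaultdict(list)
    record_count = 0
    for neuron_id, neuron in graph.items():
        for key, value in map_neuron(neuron_id, neuron):
            shuffled[key].append(value)
            record_count += 1
    updated = dict(reduce_neuron(k, vals) for k, vals in sorted(shuffled.items()))
    return updated, record_count


graph = {
    "N1": {"v": 0.0, "synapses": {"N2": 1.0, "N3": 2.0}},
    "N2": {"v": 0.0, "synapses": {"N1": 0.5, "N3": 0.5}},
    "N3": {"v": 0.0, "synapses": {"N1": 0.25, "N2": 0.25}},
}
next_graph, shuffled_records = run_iteration(graph)
```

For this three-neuron graph the shuffle carries nine records: three neuron records and six individual currents.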
In-Mapper Combining (IMC, introduced by Lin & Schatz)
[Figure: a Mapper reads N1, N2, and N3 from HDFS (initial input), sums (∑) the currents for each target neuron locally, and sends the neurons and the aggregated currents I1, I2, I3 through Sort & Shuffle to the Reducers, which update N1, N2, and N3.]
The graph structure is still shuffled!
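The in-mapper combining pattern can be sketched as a mapper object that keeps partial sums in memory and emits them only when it finishes. The class name `CombiningMapper`, its `map` and `close` methods, and the record layout are hypothetical; they mirror the general pattern rather than Hadoop's actual Mapper API or the authors' implementation.

```python
from collections import defaultdict


class CombiningMapper:
    """Aggregates currents per target neuron inside the mapper."""

    def __init__(self):
        self.partial_sums = defaultdict(float)

    def map(self, neuron_id, neuron):
        # The neuron record is still emitted, so the graph structure is still shuffled.
        yield neuron_id, ("NODE", neuron)
        for target, weight in neuron["synapses"].items():
            # Accumulate locally instead of emitting one record per synapse.
            self.partial_sums[target] += weight

    def close(self):
        # Emit one aggregated current per target once all input is processed.
        for target, total in self.partial_sums.items():
            yield target, ("CURRENT", total)


graph = {
    "N1": {"v": 0.0, "synapses": {"N2": 1.0, "N3": 2.0}},
    "N2": {"v": 0.0, "synapses": {"N1": 0.5, "N3": 0.5}},
    "N3": {"v": 0.0, "synapses": {"N1": 0.25, "N2": 0.25}},
}

mapper = CombiningMapper()
records = []
for neuron_id, neuron in graph.items():
    records.extend(mapper.map(neuron_id, neuron))
records.extend(mapper.close())

currents = {key: payload for key, (kind, payload) in records if kind == "CURRENT"}
node_records = [key for key, (kind, _) in records if kind == "NODE"]
```

With all three neurons handled by one mapper, six current records collapse into three, but the three neuron records remain in the shuffle.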
Schimmy (introduced by Lin & Schatz)
[Figure: dataflow for one iteration (Map, sort & shuffle, Reduce). Mappers read N1, N2, and N3 from HDFS and emit only the currents. Each Reducer remotely reads the graph structure (its neuron and local structure) from HDFS, sums the currents to its neuron, updates the neuron, and writes it back to HDFS.]
Problems:
Remote reading from HDFS
The graph structure is read and written in each iteration
Observation:
The graph structure is read-only!
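The reducer side of Schimmy can be sketched as a merge join between two streams sorted by neuron ID: the graph partition read from storage and the shuffled currents. The function name `schimmy_reduce`, the record layout, and the toy update rule are assumptions for illustration, and the sketch assumes every message key also appears in the partition.

```python
def schimmy_reduce(sorted_partition, sorted_messages):
    """Merge-join a sorted graph partition with sorted current messages.

    sorted_partition: iterable of (neuron_id, neuron), sorted by neuron_id.
    sorted_messages:  iterable of (neuron_id, [currents]), sorted by neuron_id.
    """
    messages = iter(sorted_messages)
    pending = next(messages, None)
    for neuron_id, neuron in sorted_partition:
        total = 0.0
        if pending is not None and pending[0] == neuron_id:
            total = sum(pending[1])
            pending = next(messages, None)
        # Toy update rule; neurons without messages pass through unchanged.
        yield neuron_id, dict(neuron, v=neuron["v"] + total)


# Graph partition as it would be read from storage, sorted by key.
partition = [
    ("N1", {"v": 0.0, "synapses": {"N2": 1.0, "N3": 2.0}}),
    ("N2", {"v": 0.0, "synapses": {}}),
    ("N3", {"v": 0.0, "synapses": {"N1": 0.25}}),
]
# Only currents go through the shuffle; N2 receives nothing in this example.
messages = [("N1", [0.5, 0.25]), ("N3", [2.0, 0.5])]

updated = dict(schimmy_reduce(partition, messages))
```

Because the structure never enters the shuffle, the cost moves to reading and rewriting the partition in every iteration, which is what the observation above targets.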
Mapper-side Schimmy
[Figure: dataflow for one iteration (Map, sort & shuffle, Reduce). Mappers read N1, N2, and N3 from HDFS (initial input) and emit currents; Reducers sum the currents to N1, N2, and N3, update the neurons, and write back to HDFS.]
Drawbacks of Graph Algorithms in MapReduce
Non-intuitive and hard to implement
Not efficiently expressed as iterative algorithms
Not optimized for large numbers of iterations
[Figure: every iteration reads its input from HDFS, runs the Mapper, writes intermediate files, runs the Reducer, and writes its output to HDFS. Labels: Startup Penalty, Disk Penalty, Disk Penalty.]
Giraph
[Figure: the graph (N1, N2, N3 and their currents) is loaded from HDFS, processed across supersteps separated by synchronous barriers, and the results are written back to HDFS.]
Iterative graph processing system
Powers Facebook graph search
Highly scalable
Based on BSP model
Mapper-only job on Hadoop
In-memory computation
“Think like a vertex”
More intuitive APIs
“Think like a NEURON”
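The vertex-centric BSP idea can be sketched in plain Python, with each neuron as a vertex that receives messages, updates itself, and sends currents for the next superstep. This does not use Giraph's Java API: the names `compute` and `run_bsp`, the record layout, and the toy update rule are hypothetical, and the synchronous barrier is simulated by swapping the message buffers.

```python
from collections import defaultdict


def compute(neuron, incoming):
    """Per-vertex logic: consume currents, update state, send new currents."""
    neuron["v"] += sum(incoming)           # toy update rule, applied in memory
    return list(neuron["synapses"].items())  # (target, current) pairs to send


def run_bsp(graph, supersteps):
    """Run a fixed number of supersteps over an in-memory graph."""
    inbox = defaultdict(list)
    for _ in range(supersteps):
        outbox = defaultdict(list)
        for neuron_id, neuron in graph.items():
            for target, current in compute(neuron, inbox[neuron_id]):
                outbox[target].append(current)
        # Synchronous barrier: messages become visible only in the next superstep.
        inbox = outbox
    return graph


graph = {
    "N1": {"v": 0.0, "synapses": {"N2": 1.0, "N3": 2.0}},
    "N2": {"v": 0.0, "synapses": {"N1": 0.5, "N3": 0.5}},
    "N3": {"v": 0.0, "synapses": {"N1": 0.25, "N2": 0.25}},
}
result = run_bsp(graph, supersteps=2)
```

The graph stays in memory between supersteps and only messages move, so there is no per-iteration job startup or disk round trip in this model.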
[Figure: Comparison of running time of each iteration]
[Figure: Comparison of speeds, 40 ms simulation. Labels: 6%, 0%, -11%, -48%, -64%, -91%]
Conclusion
Hadoop is capable of modeling large-scale neural networks.
Based on IMC and Schimmy, our Mapper-side Schimmy improves MapReduce graph algorithms where the graph structure is read-only.
Vertex-centric approaches such as Giraph showed superior performance. However:
The number of iterations is specified as a global variable
Limited by memory per node
Not widely adopted by industry
[Figures: Comparison of speeds, 40 ms simulation; Comparison of speeds, 20 ms to 40 ms simulation]
