Beyond the GFLOPS
Dominic Mallinson
Vice President, US R & D
Sony Computer Entertainment Inc.
“Why not go out on a limb?
That’s where the fruit is.”
(Will Rogers, cowboy, actor, philanthropist)
© 2007 SCE
The Cell Broadband Engine (Cell/B.E.) Processor
The Cell/B.E. Processor
Leading the industry in heterogeneous multi-core
200+ GFLOPS high performance computing
But what lies beyond the GFLOPS statistics?
Why does an application need Cell/B.E.’s power?
How can we make Cell/B.E.’s performance more accessible?
What part do you and the Cell/B.E. software community play?
Why does SCE need Cell/B.E. performance?
Games and Virtual World
GBytes of data streaming through the CPU in real-time
100s of animating 3D characters on screen
True HD 3D Graphics with millions of vertices visible
Complex Artificial Intelligence techniques
Physical Simulation, cloth, fluids, soft and rigid bodies
Real-time spatial audio processing and encode
Millions of simultaneous users
Potential for client and server to use Cell/B.E. processor
Demo Time
Media Processing
Blu-ray movie playback
1080p video decode in AVC, VC1 or MPEG2
Simultaneous 480p “picture in picture” decode
7.1 multi-channel audio decode and mixing
… and a Java™ VM
Remote Play function of PLAYSTATION®3 (PS3™)
Realtime AV encoding and streaming to a PlayStation®Portable
Multi-person AV Chat
1 encode plus up to 5 decodes, AEC noise reduction
Folding@home™ on PS3
A distributed computing project from Stanford University
Research into protein misfolding to help understand and find treatments for diseases such as Alzheimer’s and cancer.
PS3 Client launched in March 2007
Over 250,000 unique PS3 users in the first month
488 TFLOPS (Stanford metrics from June 14th 2007)
26,961 Active Cell/B.E. CPUs
More than doubled previous PC/GPU contributions
DEMO
Accessing the power of Cell/B.E.
The Cell/B.E. is designed for performance
Maximum performance requires complex software
The upper quartile of engineers already achieve it
The lower quartile currently cannot
Research and Industry must bridge this gap
Many programming models are emerging
How does SCE tackle this problem?
SCE’s SPURS Environment
A flexible, cooperative SPE management layer
SPE-centric scheduling (minimal PPU overhead)
Low or zero context switch overhead
Application control for scheduling priorities
Supports sharing SPE with 3rd party middleware
Built on top of OS SPE Threads
Policy manager allows multiple models
Duck Demo SPE Usage
Old Code – no machine vision – 6 SPEs
SPE0 – Surface water physics
SPE1 – Splash physics
SPE2 – Boat 1 physics
SPE3 – Boat 2 physics
SPE4 – Collision physics
SPE5 – Graphics
Old Code – machine vision – 8 SPEs
SPE0-SPE5 UNCHANGED
Added machine vision, particle water
SPE6 – Particle water physics
SPE7 – Machine vision
Goal: Everything on 6 SPEs
Naïve use of SPURS
Just try to move work around
Water + Boat 2 is over time
Graphics + Machine vision fits, but no room to flex
Refactor with SPURS
Refactor machine vision
Refactor particle water
Use SPURS to share SPEs
Room to ‘breathe’
SCE’s SPURS Environment
The “Tasks” policy module
Similar to threads but cooperative scheduling
SPEs pull tasks from a shared memory pool
Good for mid to high complexity programs
The “Jobs” policy module
Stateless execution kernels (specify all input/output)
SPEs pull from a shared queue of jobs
Good for low to mid complexity programs
Ideal for stream processing
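The “Jobs” idea of stateless kernels pulled from a shared queue can be sketched roughly as follows. This is not SPURS code: Python threads stand in for SPEs, and the names `run_jobs`, `scale` and `num_workers` are hypothetical, chosen only for illustration.

```python
import queue
import threading


def run_jobs(jobs, num_workers=6):
    """Run stateless jobs that workers pull from one shared queue.

    Each job is a (kernel, inputs) pair. The kernel sees only its inputs
    and returns its output, mirroring "specify all input/output".
    """
    pending = queue.Queue()
    for index, job in enumerate(jobs):
        pending.put((index, job))

    results = [None] * len(jobs)

    def worker():
        # Hypothetical stand-in for one SPE: pull until the queue is empty.
        while True:
            try:
                index, (kernel, inputs) = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = kernel(*inputs)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results


def scale(values, factor):
    # A stateless kernel: the output depends only on the arguments.
    return [v * factor for v in values]


jobs = [(scale, ([i, i + 1], 2)) for i in range(8)]
outputs = run_jobs(jobs)
```

Because no job keeps state between runs, any idle worker can take the next job, which is what makes the model a natural fit for stream processing.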
Job Streaming
Divide a program and data into pieces (called Jobs)
Define dependencies between groups of jobs
Build Job Lists
SPEs grab Jobs and execute them in parallel
[Figure labels: PPE thread, Program and Data, Job List, Job]
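A minimal sketch of the four steps listed on this slide, assuming a thread pool in place of SPEs. The helpers `build_job_list`, `run_group` and `square_chunk` are hypothetical names, and the dependency between groups is modeled simply by finishing one group before the next one starts.

```python
from concurrent.futures import ThreadPoolExecutor


def build_job_list(data, chunk_size):
    """Divide the data into pieces; each piece is the input of one job."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def run_group(kernel, job_list, num_workers=6):
    """Run every job of one group in parallel and wait for all of them."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # list() blocks until the whole group has finished.
        return list(pool.map(kernel, job_list))


def square_chunk(chunk):
    return [v * v for v in chunk]


data = list(range(12))
# Group 1: square each chunk independently.
squares = run_group(square_chunk, build_job_list(data, 4))
# Group 2 depends on group 1: it consumes the outputs of the first group.
totals = run_group(sum, squares)
```

Here the second group cannot start until the first has produced its outputs, while the jobs inside each group are free to run in any order.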
Job Streaming Pipeline
[Figure: four stages, “prefetch”, “input”, “execute”, “output”. Labels: RAM; JD Address; Code*, Parameters, I/O addresses, I/O sizes, etc.; CODE; Input Data; SPU Execute; Output Data]
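One way to read the four stage names is sketched below for a single job. It assumes that the “JD Address” label points at a record holding the code reference, parameters, I/O addresses and I/O sizes; the `JobDescriptor` class, its field names and the dictionary used as RAM are all hypothetical and not taken from SPURS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class JobDescriptor:
    code: Callable[[List[int], int], List[int]]  # stands in for "Code*"
    parameter: int
    input_address: int
    input_size: int
    output_address: int


def run_job(ram: Dict[int, object], jd_address: int) -> None:
    # "prefetch": fetch the descriptor from main memory.
    jd = ram[jd_address]
    # "input": copy the input data into a local buffer.
    local_in = [ram[jd.input_address + i] for i in range(jd.input_size)]
    # "execute": run the kernel on local data only.
    local_out = jd.code(local_in, jd.parameter)
    # "output": write the results back to main memory.
    for i, value in enumerate(local_out):
        ram[jd.output_address + i] = value


def add_offset(values, offset):
    return [v + offset for v in values]


ram = {100: 1, 101: 2, 102: 3}
ram[0] = JobDescriptor(add_offset, 10, input_address=100,
                       input_size=3, output_address=200)
run_job(ram, 0)
```

Keeping everything a job needs inside the descriptor is what lets each stage be issued separately, so the stages of different jobs can later be overlapped.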
Multi-Buffering
Job stages are interleaved so that DMA memory transfers will be in progress during job execution.
Each color represents a different job.
[Figure: timeline over TIME with rows prefetch, Input, Exec, Output; five jobs, each staggered one stage behind the previous one]
Potentially, there is no stalling for memory transfers!
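The interleaving can be illustrated with a small schedule calculation. This sketch assumes every stage takes one equal time slot, which real DMA transfers and kernels will not; the function name `interleaved_schedule` is hypothetical.

```python
STAGES = ["prefetch", "input", "exec", "output"]


def interleaved_schedule(num_jobs, stages=STAGES):
    """Return, for each time slot, the (job, stage) pairs in flight.

    Job j enters stage s at slot j + s, so while one job executes,
    its neighbors are transferring their data.
    """
    num_slots = num_jobs + len(stages) - 1
    schedule = []
    for slot in range(num_slots):
        active = []
        for s, stage in enumerate(stages):
            job = slot - s
            if 0 <= job < num_jobs:
                active.append((job, stage))
        schedule.append(active)
    return schedule


schedule = interleaved_schedule(5)
for slot, active in enumerate(schedule):
    print(slot, " ".join(f"{stage}[job{job}]" for job, stage in active))

# Slots needed if the same five jobs ran their stages strictly one after another.
serial_slots = 5 * len(STAGES)
```

Under the equal-slot assumption, five jobs finish in 8 slots instead of 20, and from the fourth slot onward the execute stage always has a job whose input has already arrived.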
SCE’s SPURS Environment
SPURS solves part of the problem
Allows effective sharing of the SPE resources
Simplifies the programming and synchronization
But it still doesn’t bridge the gap
We need higher level models which provide …
Automatic DMA for large code and data on SPE
Parallel programming abstractions
Scalable synchronization methods
Full debug and performance analysis
The Cell/B.E. Software Community
The Importance of the CoC
The Center of Competence is a focal point
To bring together researchers and industry
To help develop optimized ‘standard’ libraries for Cell/B.E.
Research new programming languages/models
Research new compiler techniques
General multi-core / parallel programming research
Dealing with distributed memory hierarchies
Research scalability of synchronization methods
Develop tools that can help visualize parallel software
Industry Support
Terra Soft Solutions – Yellow Dog Linux for PS3
Mercury Systems
RapidMind
Cmpware, Inc.
Reservoir Labs
Gedae
allinea
Concluding Thoughts
The Cell/B.E. has amazing performance
It’s available now in consumer and HPC markets
We need more software targeting Cell/B.E.
We need Cell/B.E.’s power to be more accessible
We need more research into Cell/B.E. and multi-core
We need YOU to help us go...
Beyond the GFLOPS

Beyond the GFLOPS

  • 1. G O SG O SBeyond the GFLOPSBeyond the GFLOPS Dominic Mallinson Vice President, US R & D Dominic Mallinson Vice President, US R & D Sony Computer Entertainment Inc.Sony Computer Entertainment Inc.
  • 2. “Wh t t li b?“Why not go out on a limb? That’s where the fruit is.”That s where the fruit is (Will Rogers, cowboy, actor, philanthropist) © 2007 SCE
  • 3. Th C ll B db d E iTh C ll B db d E iThe Cell Broadband Engine (Cell/B E ) Processor The Cell Broadband Engine (Cell/B E ) Processor(Cell/B.E.) Processor(Cell/B.E.) Processor © 2007 SCE
  • 4. The Cell/B.E. ProcessorThe Cell/B.E. Processor Leading the industry in heterogeneous multi-core 200+ GFLOPS high performance computing Leading the industry in heterogeneous multi-core 200+ GFLOPS high performance computing200+ GFLOPS high performance computing But what lies beyond the GFLOPS statistics ? 200+ GFLOPS high performance computing But what lies beyond the GFLOPS statistics ? Why does an application need Cell/B.E.’s power ? How can we make Cell/B.E.’s performance more accessible ? Why does an application need Cell/B.E.’s power ? How can we make Cell/B.E.’s performance more accessible ? What part do you and the Cell/B.E.’s software community play ?What part do you and the Cell/B.E.’s software community play ? © 2007 SCE
  • 5. Why does SCE need C ll/B E f ? Why does SCE need C ll/B E f ?Cell/B.E. performance ?Cell/B.E. performance ? © 2007 SCE
  • 6. Games and Virtual WorldGames and Virtual World GBytes of data streaming through the CPU in real-time 100 f i ti 3D h t GBytes of data streaming through the CPU in real-time 100 f i ti 3D h t100s of animating 3D characters on screen True HD 3D Graphics with millions of vertices visible 100s of animating 3D characters on screen True HD 3D Graphics with millions of vertices visible Complex Artificial Intelligence techniques Physical Simulation, cloth, fluids, soft and rigid bodies Complex Artificial Intelligence techniques Physical Simulation, cloth, fluids, soft and rigid bodies Real-time spatial audio processing and encode Millions of simultaneous users Real-time spatial audio processing and encode Millions of simultaneous users © 2007 SCE Potential for client and server to use Cell/B.E. processorPotential for client and server to use Cell/B.E. processor
  • 7. Demo TimeDemo TimeDemo TimeDemo Time © 2007 SCE
  • 8. Media ProcessingMedia Processinggg Blu-ray movie playback 1080p video decode in AVC VC1 or MPEG2 Blu-ray movie playback 1080p video decode in AVC VC1 or MPEG21080p video decode in AVC, VC1 or MPEG2 Simultaneous 480p “picture in picture” decode 7.1 multi-channel audio decode and mixing 1080p video decode in AVC, VC1 or MPEG2 Simultaneous 480p “picture in picture” decode 7.1 multi-channel audio decode and mixing7.1 multi channel audio decode and mixing 
 and a Javaℱ VM Remote Play function of PLAYSTATIONÂź3 (PS3ℱ) 7.1 multi channel audio decode and mixing 
 and a Javaℱ VM Remote Play function of PLAYSTATIONÂź3 (PS3ℱ)y ( ) Realtime AV encoding and streaming to a PlayStationÂźPortable Multi-person AV Chat y ( ) Realtime AV encoding and streaming to a PlayStationÂźPortable Multi-person AV Chat © 2007 SCE 1 encode plus up to 5 decodes, AEC noise reduction1 encode plus up to 5 decodes, AEC noise reduction
  • 9. Folding@homeTM on PS3Folding@homeTM on PS3g@g@ A distributed computing project from Stanford University R h i t t i i f ldi t h l d t d d fi d A distributed computing project from Stanford University R h i t t i i f ldi t h l d t d d fi dResearch into protein misfolding to help understand and find treatments for diseases such as Alzheimer’s and cancer. PS3 Client launched in March 2007 Research into protein misfolding to help understand and find treatments for diseases such as Alzheimer’s and cancer. PS3 Client launched in March 2007PS3 Client launched in March 2007 Over 250,000 unique PS3 users in the first month 488 TFLOPS (Stanford metrics from June 14th 2007) PS3 Client launched in March 2007 Over 250,000 unique PS3 users in the first month 488 TFLOPS (Stanford metrics from June 14th 2007)( ) 26,961 Active Cell/B.E. CPUs More than doubled previous PC/GPU contributions ( ) 26,961 Active Cell/B.E. CPUs More than doubled previous PC/GPU contributions © 2007 SCE DEMODEMO
  • 10. Accessing the power of Cell/B.E.
  • 11. Accessing the power of Cell/B.E.: The Cell/B.E. is designed for performance. Maximum performance requires complex software. The upper quartile of engineers already achieve it; the lower quartile currently cannot. Research and industry must bridge this gap. Many programming models are emerging. How does SCE tackle this problem?
  • 12. SCE’s SPURS Environment: A flexible, cooperative SPE management layer. SPE-centric scheduling (minimal PPU overhead); low or zero context switch overhead; application control for scheduling priorities; supports sharing SPEs with 3rd party middleware; built on top of OS SPE Threads; policy manager allows multiple models.
  • 13. Duck Demo SPE Usage: Old code, no machine vision, 6 SPEs: SPE0 – Surface water physics; SPE1 – Splash physics; SPE2 – Boat 1 physics; SPE3 – Boat 2 physics; SPE4 – Collision physics; SPE5 – Graphics. Old code, machine vision, 8 SPEs: SPE0-SPE5 unchanged; added machine vision and particle water: SPE6 – Particle water physics; SPE7 – Machine vision.
  • 14. Goal: Everything on 6 SPEs. Naïve use of SPURS: just try to move work around; Water + Boat 2 is over time; Graphics + Machine vision fits but has no room to flex. Refactor with SPURS: refactor machine vision; refactor particle water; use SPURS to share SPEs; room to ‘breathe’.
  • 15. SCE’s SPURS Environment: The “Tasks” policy module: similar to threads but with cooperative scheduling; SPEs pull tasks from a shared memory pool; good for mid to high complexity programs. The “Jobs” policy module: stateless execution kernels (specify all input/output); SPEs pull from a shared queue of jobs; good for low to mid complexity programs; ideal for stream processing.
  • 16. Job Streaming: Divide a program and data into pieces (called Jobs); define dependencies between groups of jobs; build Job Lists; SPEs grab Jobs and execute them in parallel. [Diagram labels: PPE thread, Program and Data, Job List, Job.]
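The job-streaming steps on this slide can be sketched on an ordinary host machine. The following is only an illustration under stated assumptions: it uses Python threads in place of SPEs, a `queue.Queue` in place of the shared job queue, and treats each group of jobs as depending on the previous group. The names `run_job_lists` and `make_square_job` are hypothetical and are not part of SPURS or any SCE API.

```python
import queue
import threading


def run_job_lists(job_lists, num_workers=6):
    """Run groups of jobs in order; jobs inside one group run in parallel.

    job_lists is a list of groups, each a list of zero-argument callables.
    A group starts only after the previous group has finished, which is the
    simplest possible dependency between groups of jobs (an assumption here).
    """
    results = []
    results_lock = threading.Lock()

    for group in job_lists:
        pending = queue.Queue()
        for job in group:
            pending.put(job)

        def worker():
            # Each worker stands in for an SPE: it grabs jobs until none remain.
            while True:
                try:
                    job = pending.get_nowait()
                except queue.Empty:
                    return
                value = job()
                with results_lock:
                    results.append(value)

        workers = [threading.Thread(target=worker) for _ in range(num_workers)]
        for thread in workers:
            thread.start()
        for thread in workers:
            thread.join()  # barrier: the next group depends on this one
    return results


def make_square_job(x):
    """Build a stateless job: all input is captured, all output is returned."""
    return lambda: x * x


if __name__ == "__main__":
    first_group = [make_square_job(i) for i in range(4)]
    second_group = [lambda: -1]
    print(run_job_lists([first_group, second_group]))
```

Within a group the completion order is not fixed, so only the ordering between groups is guaranteed. A real implementation would express finer-grained dependencies than a barrier per group.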
  • 17. Job Streaming Pipeline: [Diagram labels: RAM; SPE; SPU; JD Address; Code, Parameters, I/O addresses, I/O sizes, etc.; Input Data; Execute; Output Data. Pipeline stages: “prefetch”, “input”, “execute”, “output”.]
  • 18. Multi-Buffering: Job stages are interleaved so that DMA memory transfers will be in progress during job execution. [Diagram: rows for the prefetch, input, exec and output stages plotted against time; each color represents a different job.] Potentially, there is no stalling for memory transfers!
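The interleaving described on this slide can be illustrated with a small schedule calculation. This is a sketch under an idealized assumption that every stage of every job takes one equal time step; real DMA and execution times differ. The function names are hypothetical and not part of any SCE API.

```python
STAGES = ("prefetch", "input", "execute", "output")


def pipelined_schedule(num_jobs, stages=STAGES):
    """Return one list per time step of the (job, stage) pairs active in it.

    Job j enters stage s at time step j + s, so transfers for one job
    overlap with the execution of its neighbours.
    """
    total_steps = num_jobs + len(stages) - 1 if num_jobs > 0 else 0
    schedule = []
    for step in range(total_steps):
        active = []
        for stage_index, stage in enumerate(stages):
            job = step - stage_index
            if 0 <= job < num_jobs:
                active.append((job, stage))
        schedule.append(active)
    return schedule


def serial_steps(num_jobs, stages=STAGES):
    """Time steps needed if no stages overlap at all."""
    return num_jobs * len(stages)


if __name__ == "__main__":
    for step, active in enumerate(pipelined_schedule(5)):
        print(step, active)
    print("pipelined:", len(pipelined_schedule(5)), "serial:", serial_steps(5))
```

Under these assumptions, five jobs finish in 8 time steps instead of 20, and in the steady state all four stages are busy at once, each working on a different job.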
  • 19. SCE’s SPURS Environment: SPURS solves part of the problem: it allows effective sharing of the SPE resources and simplifies the programming and synchronization. But it still doesn’t bridge the gap. We need higher level models which provide: automatic DMA for large code and data on SPE; parallel programming abstractions; scalable synchronization methods; full debug and performance analysis.
  • 20. The Cell/B.E. Software Community
  • 21. The Importance of the CoC: The Center of Competence is a focal point: to bring together researchers and industry; to help develop optimized ‘standard’ libraries for Cell/B.E.; research new programming languages/models; research new compiler techniques; general multi-core / parallel programming research; dealing with distributed memory hierarchies; research scalability of synchronization methods; develop tools that can help visualize parallel software.
  • 22. Industry Support: Terra Soft Solutions – Yellow Dog Linux for PS3; Mercury Systems; RapidMind; Cmpware, Inc.; Reservoir Labs; Gedae; allinea.
  • 24. Concluding Thoughts: The Cell/B.E. has amazing performance. It’s available now in consumer and HPC markets. We need more software targeting Cell/B.E. We need Cell/B.E.’s power to be more accessible. We need more research into Cell/B.E. and multi-core.
  • 25. We need YOU to help us go... Beyond the GFLOPS