Power Software Development with Apache Spark
Josiah Samuel
IBM Systems Development Labs – Bangalore
Agenda
§ Motivation
§ Introduction to Apache Spark
§ Spark SQL
§ Spark Internals
§ Spark ML Pipelines
Big-Data Era
§ Collect, Store & Process information at scale
§ Rise of Open Source Software
– Leverage clusters of commodity computers to process the data
§ Data Science
– Bridge between the data and the tools
– Starts with running simple queries
  § Place a schema on the data and run SQL queries
– R, Octave, Python scikit-learn
§ Data is partitioned and spread across nodes (HDFS)
– Algorithms with wide data dependencies suffer from network delays
– Probability of node failure increases
Examples of Big-Data Processing
§ Build a model to detect credit card fraud using thousands of features and billions of transactions.
§ Intelligently recommend millions of products to millions of users.
§ Estimate financial risk through simulations of portfolios including millions of instruments.
§ Easily manipulate data from thousands of human genomes to detect genetic associations with disease.
Parallel Systems
§ Tightly coupled systems
§ Multiple processors share the same memory address space
§ Scale-up servers
§ High Performance Computing (HPC)
§ Disadvantages:
– Scalability
– Expensive
Distributed Systems
§ Loosely coupled systems
§ Communicate with each other over the network
§ Scale-out servers
§ Capable of collaborating to complete a task
§ Disadvantages:
– Difficult to develop distributed software
– Network problems
– Reliability & Fault Tolerance
Apache Hadoop
§ Hadoop emerges as a leader
– Filesystem abstraction
– M/R (MapReduce) programming model
– Linear scalability
– Automatic failure recovery
– Cheaper solution
§ Challenges
– Transformational APIs missing for Feature Engineering
– Not suitable for ML modeling
  § ML modeling needs multiple passes over the same data sets
Apache Spark
§ Analytical Operating System
§ No need to write intermediate results to disk
§ All transformations are represented as a DAG
– Directed Acyclic Graph of operators
§ Results are passed directly to the next step in the pipeline
§ In-Memory Processing
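The idea of chaining transformations and handing each result straight to the next step, without storing intermediate results, can be sketched in plain Python with generators. This is only an analogy and not Spark's API; the function names `parse`, `keep_large` and `run_pipeline` are hypothetical.

```python
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[int]:
    # Transformation 1: lazily convert each line to an int.
    for line in lines:
        yield int(line)


def keep_large(values: Iterable[int], threshold: int) -> Iterator[int]:
    # Transformation 2: lazily filter, consuming parse() output one item at a time.
    for value in values:
        if value > threshold:
            yield value


def run_pipeline(lines: Iterable[str], threshold: int) -> int:
    # Building the chain does no work yet; sum() plays the role of an
    # "action" that pulls records through both steps in one pass.
    return sum(keep_large(parse(lines), threshold))


total = run_pipeline(["1", "5", "10", "20"], threshold=4)
print(total)
```

No intermediate list is ever materialised between the two steps, which is the property the bullets above attribute to Spark's pipelined execution.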
[Figure: Performance Aspect]
[Figure: Ecosystem Aspect]
Development Aspect: Apache Spark APIs
• Easy-to-use APIs
• Improve productivity when operating on large datasets
• APIs are intuitive and expressive
• RDDs, DataFrames/Datasets
– The APIs allow seamless movement between DataFrames or Datasets and RDDs
– DataFrames and Datasets are built on top of RDDs
– REPL – Interactive Analytics
Hardware Trends
            2010             2017              Rate of Increase
Storage     50+MB/s (HDD)    500+MB/s (SSD)    10X
Network     1Gbps            10Gbps            10X
CPU         ~3GHz            ~3GHz             ??
Spark Software Stack Trend
§ IO has been optimized
– Reduce IO by pruning input data that is not needed
– New shuffle and network implementations (2014 sort record – Ref: http://sortbenchmark.org/)
§ Data formats have improved
– E.g. Parquet is a “dense” columnar format
§ CPU is increasingly the bottleneck; trend expected to continue
Project Tungsten
§ Phase 1 – Foundation – Spark 1.6.1
– Memory Management
– Code Generation
– Cache-aware Algorithms
§ Phase 2 – Spark 2.0.1
– Whole-stage Codegen
– Vectorization
Project Tungsten: Phase 1
§ Perform explicit memory management instead of relying on Java objects
– Reduce memory footprint
– Eliminate garbage collection overheads
– Use sun.misc.Unsafe rows and off-heap memory
§ Code generation for expression evaluation
– Reduce virtual function calls and interpretation overhead
§ Cache-conscious sorting
– Reduce bad memory access patterns
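To illustrate the code-generation bullet, here is a toy sketch in Python. It contrasts evaluating an expression tree node by node (one method call per node per row) with generating source once and compiling it to a single function. The classes `Col`, `Lit`, `Add`, `Mul` and the helper `compile_expr` are hypothetical and are not Spark's Catalyst classes; Spark generates Java code, not Python.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Row = Dict[str, float]


@dataclass
class Col:
    name: str

    def eval(self, row: Row) -> float:   # interpreted path
        return row[self.name]

    def gen(self) -> str:                # code-generation path
        return f"row[{self.name!r}]"


@dataclass
class Lit:
    value: float

    def eval(self, row: Row) -> float:
        return self.value

    def gen(self) -> str:
        return repr(self.value)


@dataclass
class Add:
    left: object
    right: object

    def eval(self, row: Row) -> float:
        return self.left.eval(row) + self.right.eval(row)

    def gen(self) -> str:
        return f"({self.left.gen()} + {self.right.gen()})"


@dataclass
class Mul:
    left: object
    right: object

    def eval(self, row: Row) -> float:
        return self.left.eval(row) * self.right.eval(row)

    def gen(self) -> str:
        return f"({self.left.gen()} * {self.right.gen()})"


def compile_expr(expr) -> Callable[[Row], float]:
    # Emit the whole expression as one source string and compile it once.
    # The source comes only from the tree above, so eval() is used here
    # purely as a stand-in for a real code generator and compiler.
    source = f"lambda row: {expr.gen()}"
    return eval(source, {"__builtins__": {}})


expr = Add(Mul(Col("price"), Col("qty")), Lit(1.5))
generated = compile_expr(expr)
row = {"price": 2.0, "qty": 3.0}
print(expr.eval(row), generated(row))
```

Both paths compute the same value; the generated function does so without walking the tree for every row.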
Ref:
https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html
https://www.youtube.com/watch?v=5ajs8EIPWGI&feature=youtu.be
Project Tungsten: Phase 1: Memory Management
String str = “abcd”;
As a Java object, this 4-character string takes 48 bytes
In Spark, the entire object fits in 80 bytes
Note: This applies only to Dataset/DataFrame, because the schema is known
Project Tungsten: Phase 2
Volcano Iterator Model: Simplicity
§ Shortcomings of the Volcano Iterator Model
– Too many virtual function calls
– Intermediate data is kept in memory (or L1/L2/L3 cache) rather than in CPU registers
– Can’t take advantage of modern CPU features
  § Loop unrolling
  § SIMD
  § Pipelining
  § Prefetching
  § Branch prediction
§ Fuse operators together so the generated code looks like hand-optimized code
§ Identify chains of operators (“stages”)
§ Compile each stage into a single function
§ Functionality of a general purpose execution engine
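The contrast between the Volcano iterator model and a fused stage can be sketched in plain Python. The operator classes `Scan`, `Filter` and `Project` below are hypothetical illustrations, not Spark's physical operators, and `fused_sum` is written by hand to show the kind of single loop that whole-stage code generation aims to produce.

```python
from typing import Callable, List, Optional


class Scan:
    """Leaf operator: returns one row per next() call, None when exhausted."""

    def __init__(self, rows: List[int]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[int]:
        return next(self._it, None)


class Filter:
    """Pulls rows from its child until one passes the predicate."""

    def __init__(self, child, predicate: Callable[[int], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[int]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class Project:
    """Applies a function to each row produced by its child."""

    def __init__(self, child, fn: Callable[[int], int]) -> None:
        self.child = child
        self.fn = fn

    def next(self) -> Optional[int]:
        row = self.child.next()
        return None if row is None else self.fn(row)


def volcano_sum(rows: List[int]) -> int:
    # Every row crosses three next() calls on its way to the aggregate.
    plan = Project(Filter(Scan(rows), lambda x: x % 2 == 0), lambda x: x * 10)
    total = 0
    row = plan.next()
    while row is not None:
        total += row
        row = plan.next()
    return total


def fused_sum(rows: List[int]) -> int:
    # The same stage collapsed into one loop: no per-row operator calls,
    # and intermediate values stay in local variables.
    total = 0
    for x in rows:
        if x % 2 == 0:
            total += x * 10
    return total


data = list(range(1, 7))
print(volcano_sum(data), fused_sum(data))
```

Both functions return the same result; the fused version simply removes the per-row call chain that the shortcomings list describes.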
Challenges with Data Science
§ Majority of the work lies in preprocessing the data
– Feature Engineering
– Choosing the algorithms
– Converting data to vectors for ML algorithms
§ Iteration
– Scans over the input vectors until the model converges
– Results are based on experimentation
§ Putting the models into production
– Evaluate model accuracy over time
– Rebuild the model periodically
Implications for the system:
§ It should support more flexible transformations
§ Repeated data access from disk should be handled efficiently
§ Model creation should be easy and suitable for production use
SparkML – High level functionality
• Built on top of DataFrames
• org.apache.spark.mllib.* (RDD-based API) – in maintenance mode since Spark 2.0
SparkML – TF-IDF
§ Term Frequency – Inverse Document Frequency
§ Used to build search engines
– The score indicates how important a word is to a document within a collection of documents
§ If a word appears frequently in a doc, it’s important
§ But if a word appears in many docs (the, and, of: stop-words), the word is not meaningful, so lower its score
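The scoring rule above can be worked through with a small pure-Python sketch. It assumes the smoothed inverse document frequency log((N + 1) / (df + 1)), which is the form given in the Spark MLlib feature extraction documentation; the function name `tf_idf` and the sample documents are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)  # term frequency within this document
        scores.append(
            {t: tf[t] * math.log((n_docs + 1) / (df[t] + 1)) for t in tf}
        )
    return scores


docs = [
    ["the", "spark", "engine"],
    ["the", "hadoop", "cluster"],
    ["the", "spark", "spark", "api"],
]
scores = tf_idf(docs)
for doc_scores in scores:
    print(doc_scores)
```

A word such as "the" that occurs in every document gets a score of zero, while a word repeated within one document and rare elsewhere scores higher.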
SparkML – KMeans Clustering
§ Unsupervised learning
§ Group items into K different groups
§ Randomly initialize the centroids for these groups
§ Compute the Euclidean distance between each data point and the centroids to assign the point to a group
§ Recompute each centroid from all the data points that are part of the same group
§ Repeat until the centroid movement is negligible
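The steps above can be sketched in a few lines of plain Python. This is a minimal single-machine illustration, not Spark's distributed KMeans; the names `kmeans`, `distance`, `mean` and the parameters `tol`, `max_iter` and `seed` are assumptions chosen for the example.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance(a: Point, b: Point) -> float:
    # Euclidean distance between two points of equal dimension.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points: Sequence[Point]) -> Point:
    # Coordinate-wise average of a non-empty group of points.
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points: Sequence[Point], k: int, tol: float = 1e-6,
           max_iter: int = 100, seed: int = 0) -> List[Point]:
    rng = random.Random(seed)
    centroids = rng.sample(list(points), k)  # random initial centroids
    for _ in range(max_iter):
        # Assign every point to the group of its nearest centroid.
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        shift = max(distance(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:  # centroid movement is negligible
            break
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = sorted(kmeans(points, k=2))
print(centers)
```

On this small, well-separated data set the two centroids settle at the means of the two visible groups.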
Spark ML Pipeline
§ DataFrame:
– uses the DataFrame (DF) from Spark SQL as an ML dataset. Different columns can store text, feature vectors, true labels and predictions
§ Transformer:
– an algorithm which can transform one DataFrame into another DataFrame
(example: an ML model is a Transformer that transforms a DF with features into a DF with predictions)
§ Estimator:
– an algorithm which can be fit on a DF to produce a Model
(example: a learning algorithm is an Estimator which trains on a DF and produces a model)
§ Pipeline:
– chains multiple Transformers and Estimators together to specify an ML workflow
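How the four concepts fit together can be sketched with toy Python classes. A list of dictionaries stands in for a DataFrame, and `Tokenizer`, `VocabularyCounter` and `VocabularyModel` are simplified, hypothetical stages written for this example; none of this is the `pyspark.ml` API, only the same fit/transform contract in miniature.

```python
from typing import Any, Dict, List

DataFrame = List[Dict[str, Any]]  # stand-in for a Spark DataFrame


class Transformer:
    def transform(self, df: DataFrame) -> DataFrame:
        raise NotImplementedError


class Estimator:
    def fit(self, df: DataFrame) -> Transformer:
        raise NotImplementedError


class Tokenizer(Transformer):
    # Adds a "words" column derived from the "text" column.
    def transform(self, df: DataFrame) -> DataFrame:
        return [{**row, "words": row["text"].lower().split()} for row in df]


class VocabularyModel(Transformer):
    # Fitted model: adds a "features" column of per-word counts.
    def __init__(self, vocab: List[str]) -> None:
        self.vocab = vocab

    def transform(self, df: DataFrame) -> DataFrame:
        return [{**row, "features": [row["words"].count(w) for w in self.vocab]}
                for row in df]


class VocabularyCounter(Estimator):
    # Learns the vocabulary from the training data and returns a model.
    def fit(self, df: DataFrame) -> Transformer:
        vocab = sorted({w for row in df for w in row["words"]})
        return VocabularyModel(vocab)


class PipelineModel(Transformer):
    def __init__(self, stages: List[Transformer]) -> None:
        self.stages = stages

    def transform(self, df: DataFrame) -> DataFrame:
        for stage in self.stages:
            df = stage.transform(df)
        return df


class Pipeline(Estimator):
    def __init__(self, stages: list) -> None:
        self.stages = stages

    def fit(self, df: DataFrame) -> PipelineModel:
        fitted: List[Transformer] = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(df)   # Estimator -> fitted Transformer
            fitted.append(stage)
            df = stage.transform(df)    # feed the next stage
        return PipelineModel(fitted)


train = [{"text": "Spark is fast"}, {"text": "Hadoop is reliable"}]
model = Pipeline([Tokenizer(), VocabularyCounter()]).fit(train)
out = model.transform([{"text": "spark spark is"}])
print(out[0]["features"])
```

The Pipeline is itself an Estimator: fitting it yields a PipelineModel, a Transformer that replays every fitted stage on new data.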
[Figure: Pipeline (Estimator) with stages Tokenizer → HashingTF → KMeans; Pipeline.fit() takes Raw Text through Words and Feature Vectors to produce a KMeans Model]
Backup
GPU Acceleration
§ Target computation-heavy Spark applications
• Machine learning algorithms such as linear regression, logistic regression, etc.
• The same lambda function is applied to a huge set of rows
§ Rising need to offload CPU work, as CPUs have become the bottleneck in Spark
§ Goal is to shorten the execution time of a long-running, computation-heavy Spark application
§ Approach
– Accelerate a Spark application by using GPUs effectively and transparently
– Minimum change to the user’s Spark program
– No change to the existing Spark code base
GPUEnabler in Catalyst
§ Put GPU kernel launcher and code generator into Catalyst
[Figure: architecture diagram. Labels: User’s Spark program; DataFrame; Dataset; Catalyst; Tungsten; Logical optimizer; Memory manager; CPU code generator; GPU code generator; GPU kernel launcher; Off-heap UnsafeRow; Columnar; GPU device memory (Columnar)]
How to use GPUEnabler Plugin
Available at https://github.com/IBMSparkGPU/GPUEnabler
Build the package:
This will install the package to the local Maven repository.
To include this package in your Spark application, add the dependency to the application’s pom.xml file:
More information about the APIs and sample programs can be found in the GitHub repository.
API usage compared to Spark APIs
§ map & reduce are replaced with mapExtFunc & reduceExtFunc
§ Handles are passed to them, which hold the mapping to CUDA kernels
§ Handles are also used to specify the input and output parameter mappings
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 


Power Software Development with Apache Spark