Cloud and Information Services Lab
Furong Huang
UC Irvine
Anima Anandkumar
UC Irvine
Nikos Karampatziakis
Microsoft CISL
Paul Mineiro + 𝜀
Microsoft CISL
Sergiy Matusevych
Microsoft CISL
Shravan Narayanamurthy
Microsoft CISL
Markus Weimer
Microsoft CISL
Apache REEF Contributors
Worldwide
/pos/cv107_24319.txt
is evil dead ii a bad movie ?
it's full of terrible acting ,
pointless violence , and plot
holes yet it remains a cult
classic nearly fifteen years
after its release ...
/pos/cv108_15571.txt
it's rather strange too have
two computer animated talking
ant movies come out in a single
year , but that is what disney
and pixar animation ; s latest
film represents ...
http://www.cs.cornell.edu/People/pabo/movie-review-data
LDAvis library for R https://github.com/cpsievert/LDAvis
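$$M_1 \overset{\mathrm{def}}{=} \mathbb{E}[x_1], \qquad M_2 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2]$$
$$M_3 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$$
With shift terms:
$$M_1 \overset{\mathrm{def}}{=} \mathbb{E}[x_1]$$
$$M_2 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2] - \frac{\alpha_0}{\alpha_0 + 1} M_1 \otimes M_1$$
$$M_3 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2 \otimes x_3] - [\ldots \text{ more shift terms}]$$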
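As a rough illustration of the moment definitions above, the following sketch estimates M1 and the shifted M2 from bag-of-words data. It assumes each document is a word-count array of length d with at least two tokens, and that x1 and x2 are one-hot encodings of two distinct token positions in the same document. The class and method names are hypothetical and are not taken from the TensorFactorization code.

```java
import java.util.Arrays;

/** Hypothetical helper: empirical first and second moments of word-count vectors. */
final class EmpiricalMoments {

    /** M1 estimate: average over documents of the word frequencies c / n. */
    static double[] firstMoment(int[][] docs, int d) {
        double[] m1 = new double[d];
        for (int[] c : docs) {
            int n = Arrays.stream(c).sum();
            for (int i = 0; i < d; i++) {
                m1[i] += (double) c[i] / n / docs.length;
            }
        }
        return m1;
    }

    /**
     * Shifted M2 estimate: average over documents of (c c^T - diag(c)) / (n (n - 1)),
     * minus alpha0 / (alpha0 + 1) * M1 (x) M1. Assumes every document has n >= 2 tokens.
     */
    static double[][] secondMoment(int[][] docs, int d, double alpha0) {
        double[] m1 = firstMoment(docs, d);
        double[][] m2 = new double[d][d];
        for (int[] c : docs) {
            int n = Arrays.stream(c).sum();
            double pairs = (double) n * (n - 1); // ordered pairs of distinct positions
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    double cooc = (double) c[i] * c[j] - (i == j ? c[i] : 0);
                    m2[i][j] += cooc / pairs / docs.length;
                }
            }
        }
        double shift = alpha0 / (alpha0 + 1.0);
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) {
                m2[i][j] -= shift * m1[i] * m1[j];
            }
        }
        return m2;
    }

    public static void main(String[] args) {
        int[][] docs = { {2, 1, 0}, {0, 1, 3} }; // two toy documents, vocabulary size 3
        System.out.println(Arrays.toString(firstMoment(docs, 3)));
        System.out.println(Arrays.deepToString(secondMoment(docs, 3, 0.0)));
    }
}
```

With alpha0 = 0 the shift vanishes and the result is the plain co-occurrence estimate of E[x1 ⊗ x2], whose entries sum to one.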
$$M_2 = \sum_{i=1}^{k} \alpha_i \cdot \beta_i \otimes \beta_i$$
$$M_3 = \sum_{i=1}^{k} \alpha_i \cdot \beta_i \otimes \beta_i \otimes \beta_i$$
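$$M_3 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$$
$$M_3 = \lambda_1 \, a_1 \otimes b_1 \otimes c_1 + \lambda_2 \, a_2 \otimes b_2 \otimes c_2 + \cdots = \sum_{i} \lambda_i \cdot a_i \otimes b_i \otimes c_i$$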
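To make the sum of rank-one terms concrete, here is a minimal sketch that rebuilds a third-order tensor from its weights and factor vectors. It assumes the vectors a_i, b_i, c_i are stored as the columns of three matrices; the class name and array layout are illustrative choices, not part of the original deck.

```java
/** Hypothetical helper: rebuilds a third-order tensor from its rank-one terms. */
final class CpReconstruct {

    /**
     * Returns t[p][q][r] = sum_i lambda[i] * a[p][i] * b[q][i] * c[r][i],
     * where column i of a, b and c holds the vectors a_i, b_i and c_i.
     */
    static double[][][] reconstruct(double[] lambda, double[][] a, double[][] b, double[][] c) {
        double[][][] t = new double[a.length][b.length][c.length];
        for (int i = 0; i < lambda.length; i++) {
            for (int p = 0; p < a.length; p++) {
                for (int q = 0; q < b.length; q++) {
                    for (int r = 0; r < c.length; r++) {
                        t[p][q][r] += lambda[i] * a[p][i] * b[q][i] * c[r][i];
                    }
                }
            }
        }
        return t;
    }

    public static void main(String[] args) {
        double[] lambda = {2.0, 3.0};
        double[][] e = { {1.0, 0.0}, {0.0, 1.0} }; // columns are the unit vectors e_1, e_2
        double[][][] t = reconstruct(lambda, e, e, e);
        // prints "2.0 3.0 0.0": only the two diagonal entries are non-zero
        System.out.println(t[0][0][0] + " " + t[1][1][1] + " " + t[0][1][0]);
    }
}
```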
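$$\lambda, A \leftarrow \operatorname*{argmin}_{\lambda \in \mathbb{R}^{k},\ A \in \mathbb{R}^{k \times k}} \left\| A \cdot \mathrm{Diag}(\lambda) \cdot (C \odot B)^{\top} - M_3 \right\|^{2}$$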
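For reference, this least-squares subproblem has a well-known closed-form solution when B and C are held fixed. The sketch below assumes that M3 in the objective stands for the mode-1 unfolding of the tensor (written M_{3(1)} here, a notation not used on the slide), that the squared norm is the Frobenius norm, and that the k×k Gram matrix is invertible. The symbol ∗ denotes the elementwise (Hadamard) product.

```latex
\begin{aligned}
(C \odot B)^{\top} (C \odot B) &= (C^{\top} C) \ast (B^{\top} B), \\
A \cdot \mathrm{Diag}(\lambda) &= M_{3(1)} \, (C \odot B) \, \bigl[ (C^{\top} C) \ast (B^{\top} B) \bigr]^{-1}, \\
\lambda_i &= \bigl\| \bigl( A \cdot \mathrm{Diag}(\lambda) \bigr)_{:,i} \bigr\|_2, \qquad
a_i = \bigl( A \cdot \mathrm{Diag}(\lambda) \bigr)_{:,i} \,/\, \lambda_i .
\end{aligned}
```

The last line is one common convention: the column norms are absorbed into λ so that the columns of A have unit length.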
http://reef.incubator.apache.org
Storage (Focus: HDFS): HDFS ... Azure Block Storage ... Office 365
Programming Models (Domain Specific Languages): SQL / HIVE / LINQ, Cloud Numerics, Pregel, GraphLab
Applications: Datalab, Machine Learning, BI, Power*
Resource Manager (Focus: YARN): YARN ... Mesos ... Azure Tasks, Drawbridge
REEF, The Application Server for Big Data: Communications, Storage, Fault Management, Interoperability
Operator Layer (Future Work): REEF Operator API and Library
REEF Logical Abstraction
Easy to reason about
Centralized control flow
• Evaluator allocation and configuration
• Task configuration and submission
Centralized error handling
• Task exceptions thrown to the Driver
• Evaluator failures reported to the Driver
Scalable
Event-based programming
• Driver sends requests as events to REEF
• REEF sends events to the Driver
Mostly stateless design
• REEF maintains minimal state
• Majority of state keeping (e.g. work queues) is maintained by the Driver
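The control flow described in these bullets can be imitated in a few lines of plain Java. This is a toy sketch only: the `Handler` interface, the `ToyDriver` class and the string identifiers are hypothetical stand-ins and do not use or reproduce the REEF API. It shows the driver owning the work queue while the "runtime" merely delivers events.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Toy driver: all state (the work queue) lives here; the runtime only delivers events. */
final class ToyDriver {

    /** Hypothetical stand-in for an event handler interface; not the REEF type. */
    interface Handler<T> {
        void onNext(T event);
    }

    private final Queue<String> workQueue = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    ToyDriver(List<String> tasks) {
        workQueue.addAll(tasks);
    }

    /** Fired when a context becomes available: submit the first task to it. */
    final Handler<String> contextActive = this::submitNext;

    /** Fired when a task finishes on a context: reuse the context for the next task. */
    final Handler<String> taskCompleted = this::submitNext;

    /** Fired when a task throws: error handling is centralized in the driver. */
    final Handler<String> taskFailed = taskId -> log.add("failed " + taskId);

    private void submitNext(String contextId) {
        String task = workQueue.poll();
        log.add(task == null ? "close " + contextId : "submit " + task + " to " + contextId);
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        ToyDriver driver = new ToyDriver(Arrays.asList("whiten", "factorize"));
        driver.contextActive.onNext("ctx-0"); // runtime reports a new context
        driver.taskCompleted.onNext("ctx-0"); // first task done
        driver.taskCompleted.onNext("ctx-0"); // second task done, queue is empty
        driver.log().forEach(System.out::println);
    }
}
```

Because every decision is taken in one place, the sequence of log entries is enough to replay what the application did.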
// Submit task to the newly created context
public class ContextActiveHandler implements EventHandler<ActiveContext> {
  @Override
  public void onNext(final ActiveContext context) {
    taskGroups.submitNext(context);
  }
}

// Submit next task to current context
public class TaskCompletedHandler implements EventHandler<CompletedTask> {
  @Override
  public void onNext(final CompletedTask task) {
    final ActiveContext context = task.getActiveContext();
    taskGroups.submitNext(context);
  }
}
@Inject
public WhitenTask(
    final @Parameter(TaskConfigurationOptions.Identifier.class) String taskId,
    final @Parameter(Launch.DimD.class) int dimD,
    final @Parameter(Launch.DimK.class) int dimK,
    final GroupCommClient groupCommClient,
    final InputData data,
    final TaskEnvironment env) {
  // ...
}
Use Java "type system" to validate the configuration
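One way to picture configuration checked by the type system is a binder whose parameter names carry their value type. The sketch below is an assumption-laden illustration in plain Java: `TypedConfig`, `Name<T>` and the nested parameter classes are hypothetical and are not the API of REEF or its configuration library.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical typed configuration: each parameter name fixes the type of its value. */
final class TypedConfig {

    /** Marker for a named parameter whose value has type T. */
    interface Name<T> { }

    static final class DimD implements Name<Integer> { }
    static final class DimK implements Name<Integer> { }
    static final class TaskId implements Name<String> { }

    private final Map<Class<?>, Object> values = new HashMap<>();

    /** The compiler rejects bind(DimD.class, "ten") because DimD is a Name<Integer>. */
    <T> TypedConfig bind(Class<? extends Name<T>> name, T value) {
        values.put(name, value);
        return this;
    }

    /** Fails fast when a required parameter was never bound. */
    @SuppressWarnings("unchecked")
    <T> T get(Class<? extends Name<T>> name) {
        Object value = values.get(name);
        if (value == null) {
            throw new IllegalStateException("Unbound parameter: " + name.getSimpleName());
        }
        return (T) value;
    }

    public static void main(String[] args) {
        TypedConfig config = new TypedConfig()
            .bind(DimD.class, 1000)
            .bind(DimK.class, 10)
            .bind(TaskId.class, "whiten-0");
        int dimD = config.get(DimD.class);
        String taskId = config.get(TaskId.class);
        System.out.println(taskId + " dimD=" + dimD);
    }
}
```

A mistyped value is caught at compile time, and a missing value is reported when it is first requested rather than deep inside a running task.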
// We can send and receive any Java serializable data, e.g. JBLAS matrices
private final Broadcast.Sender<DoubleMatrix> modelSender;
// reduce() is the receiving side of a Reduce operator, not of a Broadcast
private final Reduce.Receiver<DoubleMatrix[]> resultReceiver;

// Broadcast the model, collect the results, repeat.
do {
  this.modelSender.send(sliceA);
  // ...
  final DoubleMatrix[] result = this.resultReceiver.reduce();
} while (notConverged(sliceA, prevSliceA));
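The broadcast, reduce, repeat pattern can be tried out without a cluster. The following in-process toy replaces the group communication operators with plain loops and swaps in a different computation, power iteration for the dominant direction of the summed X^T X, purely because it is short. All names are hypothetical, no REEF classes are used, and the data is assumed to be non-zero.

```java
import java.util.Arrays;

/** In-process toy of the broadcast / reduce loop; no REEF classes are used. */
final class BroadcastReduceToy {

    /** Worker side: partial result X^T (X v) for the rows held by one worker. */
    static double[] partial(double[][] rows, double[] v) {
        double[] out = new double[v.length];
        for (double[] x : rows) {
            double dot = 0.0;
            for (int j = 0; j < v.length; j++) {
                dot += x[j] * v[j];
            }
            for (int j = 0; j < v.length; j++) {
                out[j] += dot * x[j];
            }
        }
        return out;
    }

    /** Master side: broadcast v, sum the partial results, normalize, repeat. */
    static double[] dominantDirection(double[][][] shards, int dim) {
        double[] v = new double[dim];
        Arrays.fill(v, 1.0 / Math.sqrt(dim));
        double change;
        int iterations = 0;
        do {
            double[] sum = new double[dim];
            for (double[][] shard : shards) {       // "broadcast": every worker sees v
                double[] p = partial(shard, v);
                for (int j = 0; j < dim; j++) {
                    sum[j] += p[j];                 // "reduce": elementwise sum
                }
            }
            double norm = 0.0;
            for (double s : sum) {
                norm += s * s;
            }
            norm = Math.sqrt(norm);
            change = 0.0;
            for (int j = 0; j < dim; j++) {
                double next = sum[j] / norm;
                change = Math.max(change, Math.abs(next - v[j]));
                v[j] = next;
            }
        } while (change > 1e-12 && ++iterations < 1000);
        return v;
    }

    public static void main(String[] args) {
        double[][][] shards = { { {2.0, 0.0} }, { {0.0, 1.0} } }; // one row per worker
        System.out.println(Arrays.toString(dominantDirection(shards, 2)));
    }
}
```

Only the current vector travels to the workers and only a vector of the same size travels back, which is the property that makes the pattern attractive for iterative algorithms.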
https://github.com/Microsoft-CISL/TensorFactorization
http://reef.incubator.apache.org
motus@apache.org
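$$M_2 = \sum_{i} \lambda_i \cdot u_i \otimes v_i$$
$$M_2 = \lambda_1 \cdot u_1 \otimes v_1 + \lambda_2 \cdot u_2 \otimes v_2 + \cdots$$
$$M_3 \overset{\mathrm{def}}{=} \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$$
$$M_3 = \lambda_1 \, u_1 \otimes v_1 \otimes w_1 + \lambda_2 \, u_2 \otimes v_2 \otimes w_2 + \cdots = \sum_{i} \lambda_i \cdot u_i \otimes v_i \otimes w_i$$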
• Find whitening matrix $W$ s.t. the whitened components $W^{\top} a_i$ are orthogonal
• Use $M_2$ to find $W$ s.t. $W^{\top} M_2 W = I$
• Whiten $M_3$: $M_3(W, W, W)$
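Why whitening leads to an orthogonal problem can be seen in a few lines. The sketch assumes the symmetric forms $M_2 = \sum_i \alpha_i \, \beta_i \otimes \beta_i$ and $M_3 = \sum_i \alpha_i \, \beta_i \otimes \beta_i \otimes \beta_i$ given earlier in the deck, positive weights $\alpha_i$, and a whitening matrix $W \in \mathbb{R}^{d \times k}$ with $W^{\top} M_2 W = I$. The symbols $W$ and $v_i$ are labels chosen here for illustration.

```latex
\begin{aligned}
I = W^{\top} M_2 W
  &= \sum_{i=1}^{k} \alpha_i \, (W^{\top} \beta_i)(W^{\top} \beta_i)^{\top}
   = \sum_{i=1}^{k} v_i v_i^{\top},
  \qquad v_i \equiv \sqrt{\alpha_i} \, W^{\top} \beta_i, \\
&\Rightarrow\; v_i^{\top} v_j = \delta_{ij}
  \quad \text{(the } k \text{ vectors } v_i \in \mathbb{R}^{k} \text{ are orthonormal)}, \\
M_3(W, W, W)
  &= \sum_{i=1}^{k} \alpha_i \, (W^{\top} \beta_i)^{\otimes 3}
   = \sum_{i=1}^{k} \frac{1}{\sqrt{\alpha_i}} \; v_i \otimes v_i \otimes v_i .
\end{aligned}
```

The second line follows because the square matrix with columns $v_i$ satisfies $V V^{\top} = I$, hence also $V^{\top} V = I$.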

Topic Modeling via Tensor Factorization - Use Case for Apache REEF

  • 1. Cloud and Information Services Lab
  • 2. Furong Huang (UC Irvine), Anima Anandkumar (UC Irvine), Nikos Karampatziakis (Microsoft CISL), Paul Mineiro + 𝜀 (Microsoft CISL), Sergiy Matusevych (Microsoft CISL), Shravan Narayanamurthy (Microsoft CISL), Markus Weimer (Microsoft CISL), Apache REEF Contributors (Worldwide)
  • 3.
  • 4.
  • 5. /pos/cv107_24319.txt is evil dead ii a bad movie ? it's full of terrible acting , pointless violence , and plot holes yet it remains a cult classic nearly fifteen years after its release ... /pos/cv108_15571.txt it's rather strange too have two computer animated talking ant movies come out in a single year , but that is what disney and pixar animation ; s latest film represents ... http://www.cs.cornell.edu/People/pabo/movie-review-data
  • 6. LDAvis library for R https://github.com/cpsievert/LDAvis
  • 7. =*
  • 8.
  • 9.
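  • 10. $M_1 \triangleq \mathbb{E}[x_1]$, $M_2 \triangleq \mathbb{E}[x_1 \otimes x_2]$
  • 11. $M_3 \triangleq \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$
  • 12. $M_1 \triangleq \mathbb{E}[x_1]$, $M_2 \triangleq \mathbb{E}[x_1 \otimes x_2] - \frac{\alpha_0}{\alpha_0 + 1} M_1 \otimes M_1$, $M_3 \triangleq \mathbb{E}[x_1 \otimes x_2 \otimes x_3] - [\dots \text{more shift terms}]$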
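As a rough illustration of how the first two moments could be estimated from a bag-of-words corpus, here is a minimal Java sketch. It assumes each document is given as a vector of word counts with at least two words, and it estimates the pair moment from ordered pairs of distinct word positions. The class and method names (`MomentsSketch`, `m1`, `m2`) are hypothetical and not from the talk; the third moment and its shift terms are left out.

```java
import java.util.Arrays;

/** Sketch: empirical first and shifted second moments from word-count vectors. */
class MomentsSketch {

    /** M1 estimate: average over documents of the word frequencies c / n. */
    static double[] m1(int[][] counts) {
        int vocab = counts[0].length;
        double[] m1 = new double[vocab];
        for (int[] c : counts) {
            int n = Arrays.stream(c).sum(); // document length
            for (int w = 0; w < vocab; w++) {
                m1[w] += (double) c[w] / n / counts.length;
            }
        }
        return m1;
    }

    /** M2 estimate: E[x1 (x) x2] - alpha0 / (alpha0 + 1) * M1 (x) M1. */
    static double[][] m2(int[][] counts, double alpha0) {
        int vocab = counts[0].length;
        double[] m1 = m1(counts);
        double[][] m2 = new double[vocab][vocab];
        for (int[] c : counts) {
            int n = Arrays.stream(c).sum();
            double pairs = (double) n * (n - 1); // ordered pairs of distinct positions
            for (int i = 0; i < vocab; i++) {
                for (int j = 0; j < vocab; j++) {
                    // c_i * c_j pairs, minus the pairs of a position with itself
                    double cooc = (double) c[i] * c[j] - (i == j ? c[i] : 0);
                    m2[i][j] += cooc / pairs / counts.length;
                }
            }
        }
        double shift = alpha0 / (alpha0 + 1.0);
        for (int i = 0; i < vocab; i++) {
            for (int j = 0; j < vocab; j++) {
                m2[i][j] -= shift * m1[i] * m1[j];
            }
        }
        return m2;
    }

    public static void main(String[] args) {
        int[][] counts = {{2, 1, 0}, {0, 1, 2}}; // two toy documents, vocabulary of 3
        System.out.println(Arrays.toString(m1(counts)));
        System.out.println(Arrays.deepToString(m2(counts, 0.0)));
    }
}
```

With `alpha0 = 0` the shift vanishes and `m2` reduces to the plain pair-frequency estimate, which is a convenient sanity check.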
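  • 13. $M_2 = \sum_{i=1}^{k} \alpha_i \cdot \beta_i \otimes \beta_i$, $M_3 = \sum_{i=1}^{k} \alpha_i \cdot \beta_i \otimes \beta_i \otimes \beta_i$
  • 14. $M_3 \triangleq \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$; $M_3 = \lambda_1\, a_1 \otimes b_1 \otimes c_1 + \lambda_2\, a_2 \otimes b_2 \otimes c_2 + \lambda_3 \cdots = \sum_{i} \lambda_i \cdot a_i \otimes b_i \otimes c_i$
  • 15. $\lambda, A \leftarrow \operatorname*{argmin}_{\lambda \in \mathbb{R}^{k},\, A \in \mathbb{R}^{k \times k}} \left\lVert A \cdot \operatorname{Diag}(\lambda) \cdot (C \odot B)^{\top} - M_3 \right\rVert^{2}$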
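With $B$ and $C$ held fixed, the least-squares problem on slide 15 has a closed-form solution, which is what makes an alternating least squares (ALS) iteration practical. The sketch below rests on assumptions the slide does not state: $M_3$ is read as the mode-1 unfolding of the tensor, the norm is the Frobenius norm, $\odot$ is the Khatri-Rao (column-wise Kronecker) product, $\ast$ is the element-wise product, and the $k \times k$ matrix being inverted is non-singular (otherwise a pseudo-inverse would be used). The symbol $\hat{A}$ is introduced here for $A \cdot \operatorname{Diag}(\lambda)$.

```latex
\begin{aligned}
\hat{A} &\triangleq A \cdot \operatorname{Diag}(\lambda)
        = M_3 \,(C \odot B)\,\bigl[(C \odot B)^{\top}(C \odot B)\bigr]^{-1} \\
        &= M_3 \,(C \odot B)\,\bigl[(C^{\top} C) \ast (B^{\top} B)\bigr]^{-1}, \\
\lambda_i &= \lVert \hat{a}_i \rVert_2, \qquad a_i = \hat{a}_i / \lambda_i .
\end{aligned}
```

The second line uses the identity $(C \odot B)^{\top}(C \odot B) = (C^{\top} C) \ast (B^{\top} B)$, so only a $k \times k$ system has to be solved per update; the last line splits $\hat{A}$ into unit-norm columns $a_i$ and weights $\lambda_i$.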
  • 16.
  • 17.
  • 19. REEF: The Application Server for Big Data. Applications: Datalab, Machine Learning, BI, Power*. Programming Models (Domain Specific Languages): SQL / HIVE / LINQ, Cloud Numerics, Pregel, GraphLab. Operator Layer (Future Work): REEF Operator API and Library. REEF Logical Abstraction: Communications, Storage, Fault Management, Interoperability. Resource Manager (Focus: YARN): YARN ... Mesos ... Azure Tasks, Drawbridge. Storage (Focus: HDFS): HDFS ... Azure Block Storage ... Office 365.
  • 22. Easy to reason about: centralized control flow (Evaluator allocation and configuration; Task configuration and submission) and centralized error handling (Task exceptions thrown to the Driver; Evaluator failures reported to the Driver). Scalable: event-based programming (the Driver sends requests as events to REEF; REEF sends events to the Driver) and a mostly stateless design (REEF maintains minimal state; the majority of state keeping, e.g. work queues, is maintained by the Driver).
  • 23.
    // Submit task to the newly created context
    public class ContextActiveHandler implements EventHandler<ActiveContext> {
      @Override
      public void onNext(final ActiveContext context) {
        taskGroups.submitNext(context);
      }
    }

    // Submit next task to current context
    public class TaskCompletedHandler implements EventHandler<CompletedTask> {
      @Override
      public void onNext(final CompletedTask task) {
        final ActiveContext context = task.getActiveContext();
        taskGroups.submitNext(context);
      }
    }
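The two handlers on slide 23 rely on a `taskGroups` object that the slide does not show. The following self-contained sketch illustrates one way such a Driver-side work queue could behave. Everything in it is an assumption made for illustration: the nested `EventHandler`, `ActiveContext`, and `CompletedTask` types are simplified stand-ins rather than the Apache REEF types of the same name, and `TaskGroups`, its `submitNext` behaviour, and the task names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/**
 * Sketch of a Driver-side work queue. The nested types are simplified
 * stand-ins, NOT the Apache REEF types of the same name.
 */
class DriverQueueSketch {

    /** Stand-in for an event handler: one callback per event. */
    interface EventHandler<T> {
        void onNext(T event);
    }

    /** Stand-in for a context; "submitting" a task only records it in a log. */
    static final class ActiveContext {
        final String id;
        final List<String> log;
        ActiveContext(String id, List<String> log) { this.id = id; this.log = log; }
        void submitTask(String taskName) { log.add(id + ":" + taskName); }
    }

    /** Stand-in for the event fired when a task finishes. */
    static final class CompletedTask {
        private final ActiveContext context;
        CompletedTask(ActiveContext context) { this.context = context; }
        ActiveContext getActiveContext() { return context; }
    }

    /** Hypothetical helper: all scheduling state lives here, in the Driver. */
    static final class TaskGroups {
        private final Queue<String> pending = new ArrayDeque<>();
        TaskGroups(List<String> tasks) { pending.addAll(tasks); }
        synchronized void submitNext(ActiveContext context) {
            String next = pending.poll();
            if (next != null) {
                context.submitTask(next); // nothing left to do once the queue is empty
            }
        }
    }

    /** Simulates one context becoming active and every submitted task completing. */
    static List<String> run(List<String> tasks) {
        List<String> log = new ArrayList<>();
        TaskGroups taskGroups = new TaskGroups(tasks);
        EventHandler<ActiveContext> onActive = taskGroups::submitNext;
        EventHandler<CompletedTask> onCompleted =
                task -> taskGroups.submitNext(task.getActiveContext());

        ActiveContext context = new ActiveContext("ctx-0", log);
        onActive.onNext(context);
        int seen = 0;
        while (log.size() > seen) { // a new task was submitted; let it "complete"
            seen = log.size();
            onCompleted.onNext(new CompletedTask(context));
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("whiten", "als", "unwhiten")));
    }
}
```

Keeping the queue inside the Driver mirrors the "mostly stateless design" point on slide 22: the handlers themselves hold no state and only forward events to the queue.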
  • 24.
  • 25.
    @Inject
    public WhitenTask(
        final @Parameter(TaskConfigurationOptions.Identifier.class) String taskId,
        final @Parameter(Launch.DimD.class) int dimD,
        final @Parameter(Launch.DimK.class) int dimK,
        final GroupCommClient groupCommClient,
        final InputData data,
        final TaskEnvironment env) {
      // ...
    }

    Use the Java “type system” to validate the configuration.
  • 26.
  • 27.
  • 28.
  • 29.
    // We can send and receive any Java serializable data, e.g. JBLAS matrices
    private final Broadcast.Sender<DoubleMatrix> modelSender;
    private final Reduce.Receiver<DoubleMatrix[]> resultReceiver;

    // Broadcast the model, collect the results, repeat.
    do {
      this.modelSender.send(sliceA);
      // ...
      final DoubleMatrix[] result = this.resultReceiver.reduce();
    } while (notConverged(sliceA, prevSliceA));
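To make the broadcast/reduce pattern on slide 29 concrete without any cluster, the sketch below simulates it in a single process. It is an illustration under assumptions, not the talk's algorithm: the computation is a plain power iteration for the dominant direction of the data rather than the tensor ALS step, "broadcast" is a shared array, "reduce" is a pairwise tree combine with an explicit aggregation operator, the reduce input must be non-empty, and the data must not be all zeros. All names (`BroadcastReduceSketch`, `treeReduce`, `partial`, `topDirection`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

/** In-process sketch of a broadcast/reduce loop; no network and no REEF involved. */
class BroadcastReduceSketch {

    /** Tree reduce: combine neighbours pairwise until a single value is left. */
    static <T> T treeReduce(List<T> values, BinaryOperator<T> op) {
        List<T> level = new ArrayList<>(values);
        while (level.size() > 1) {
            List<T> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                next.add(i + 1 < level.size()
                        ? op.apply(level.get(i), level.get(i + 1))
                        : level.get(i)); // odd one out moves up unchanged
            }
            level = next;
        }
        return level.get(0);
    }

    /** Worker step: X_p^T (X_p v) for the rows X_p held by one worker. */
    static double[] partial(double[][] shard, double[] v) {
        double[] out = new double[v.length];
        for (double[] row : shard) {
            double dot = 0;
            for (int j = 0; j < v.length; j++) dot += row[j] * v[j];
            for (int j = 0; j < v.length; j++) out[j] += dot * row[j];
        }
        return out;
    }

    /** Aggregation operator for the reduce stage: element-wise sum. */
    static double[] add(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int j = 0; j < a.length; j++) out[j] = a[j] + b[j];
        return out;
    }

    /** Master loop: broadcast v, reduce the partial results, normalise, repeat. */
    static double[] topDirection(List<double[][]> shards, int dim, double tol, int maxIter) {
        double[] v = new double[dim];
        Arrays.fill(v, 1.0 / Math.sqrt(dim));
        for (int iter = 0; iter < maxIter; iter++) {
            List<double[]> partials = new ArrayList<>();
            for (double[][] shard : shards) {
                partials.add(partial(shard, v)); // v plays the role of the broadcast model
            }
            double[] next = treeReduce(partials, BroadcastReduceSketch::add);
            double norm = 0;
            for (double s : next) norm += s * s;
            norm = Math.sqrt(norm);
            double change = 0;
            for (int j = 0; j < dim; j++) {
                next[j] /= norm;
                change = Math.max(change, Math.abs(next[j] - v[j]));
            }
            v = next;
            if (change < tol) break; // the master decides when the loop has converged
        }
        return v;
    }

    public static void main(String[] args) {
        List<double[][]> shards = new ArrayList<>();
        shards.add(new double[][] {{2, 0}});
        shards.add(new double[][] {{0, 1}});
        System.out.println(Arrays.toString(topDirection(shards, 2, 1e-12, 100)));
    }
}
```

Passing the aggregation operator as a `BinaryOperator` keeps the tree logic independent of what is being combined, so the same `treeReduce` works for sums of vectors, matrices, or plain numbers.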
  • 30.
  • 33.
  • 34.
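  • 35. $M_2 = \sum_{i} \lambda_i \cdot u_i \otimes v_i$; $M_2 = \lambda_1 \cdot u_1 \otimes v_1 + \lambda_2 \cdot u_2 \otimes v_2 + \lambda_3 \cdots$
  • 36.
  • 37. $M_3 \triangleq \mathbb{E}[x_1 \otimes x_2 \otimes x_3]$; $M_3 = \lambda_1\, u_1 \otimes v_1 \otimes w_1 + \lambda_2\, u_2 \otimes v_2 \otimes w_2 + \lambda_3 \cdots = \sum_{i} \lambda_i \cdot u_i \otimes v_i \otimes w_i$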
  • 39. • Find whitening matrix s.t. orthogonal • Use to find s.t. • Whiten :
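The formulas on slide 39 did not survive extraction. As a sketch only, and assuming the standard whitening construction for orthogonal tensor decomposition rather than the slide's own notation (the symbols $W$, $v_i$, and $T$ are assumptions introduced here), the three steps are commonly written in terms of the $M_2$ and $M_3$ of slide 13 as:

```latex
\begin{aligned}
&W^{\top} M_2 W = I_k
  \quad\Longrightarrow\quad
  v_i \triangleq \sqrt{\alpha_i}\; W^{\top} \beta_i \ \text{ are orthonormal}, \\
&T \triangleq M_3(W, W, W)
  = \sum_{i=1}^{k} \alpha_i \,\bigl(W^{\top} \beta_i\bigr)^{\otimes 3}
  = \sum_{i=1}^{k} \alpha_i^{-1/2}\; v_i \otimes v_i \otimes v_i .
\end{aligned}
```

The first line follows from $W^{\top} M_2 W = \sum_i \alpha_i (W^{\top}\beta_i)(W^{\top}\beta_i)^{\top} = \sum_i v_i v_i^{\top}$: $k$ vectors in $\mathbb{R}^k$ whose outer products sum to the identity must be orthonormal. One common choice is $W = U \Sigma^{-1/2}$ from the rank-$k$ decomposition $M_2 = U \Sigma U^{\top}$.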

Editor's Notes

  1. We are hiring!
  2. What problem we are solving, why it is important, and what the state-of-the-art solutions are. Then the new approach, our algorithm, etc.
  3. In general: given data (e.g. a corpus of text, a social graph, user pageview/click logs), reveal the latent parameters that influence the distribution: communities, user preferences, text topics. We'll talk about text because it is easy to demo and reason about even on a small dataset.
  4. Top 10 topics. Each document has a mixture of topics; some topics are common, e.g. film/movie/time. Words appear in many topics, e.g. action/crime/cop and action/Jackie Chan. Topics are sparse.
  5. Start 3:20
  6. It's all bag of words to me. Nikolai Ge, Portrait of Leo Tolstoy, 1884, Tretyakov Gallery, Moscow. Writing "What I Believe".
  7. Start 10
  8. Introduced by Karl Pearson in 1894; everything new is well-forgotten old. So M1 is a vector and M2 a matrix; M2 is not enough for topics (there is spectral clustering; will talk about it later if asked). We need to capture triplets: a cube of data…
  9. It was shown that with these shifted terms, M1..M3 are sufficient to reveal not only clusters but also mixtures of latent parameters. In fact, if you squint right, M2 is a covariance matrix and α0 is a Dirichlet hyperprior. Similarly, M3 is the (shifted) skewness. I will give more details later. So this is the information that we collect. How do we get the topics?
  10. 8:25 We can factorize the tensor into a sum of outer products of eigenvectors that reveal the topics, i.e. each vector beta_i contains the probabilities of words in topic i.
  11. We can factorize the tensor into a sum of outer products of eigenvectors that reveal the topics, i.e. each vector beta_i contains the probabilities of words in topic i.
  12. It's linear. We need a resource manager, e.g. YARN, and a distributed FS. The master node checks for convergence.
  13. Markus gave a talk at Hadoop Summit 2014 – see on YouTube
  14. Much nicer in C#. REEF itself has very little state; all state is in the Driver.
  15. 18:00
  16. Centralized error handling: mention Erlang/OTP supervisor architecture
  17. Much nicer in C#. REEF itself has very little state; all state is in the Driver.
  18. Centralized error handling: mention Erlang/OTP supervisor architecture
  19. Java “type system”… Annotate constructor with @Inject, mark leaf parameters with @Parameter, other params must be classes with @Inject
  20. Centralized error handling: mention Erlang/OTP supervisor architecture
  21. Centralized error handling: mention Erlang/OTP supervisor architecture
  22. Centralized error handling: mention Erlang/OTP supervisor architecture
  23. Form a communication tree: nodes pass data along. At the reduce stage we also specify the aggregation operator.
  24. Future work: community detection, larger datasets (PubMed), comparison with LightLDA; in general, we need better support for tensors (libraries, CUDA, parameter server).
  25. Future work: community detection, larger datasets (PubMed), comparison with LightLDA; in general, we need better support for tensors (libraries, CUDA, parameter server).
  26. Future work: community detection, larger datasets (PubMed), comparison with LightLDA; in general, we need better support for tensors (libraries, CUDA, parameter server). End: 20 min sharp. Total: ~24 min with questions.
  27. The model (LDA) is independent of the inference algorithm (variational Bayes, MCMC, tensors).