Efficient Processing of Large Data Volumes (Effiziente Verarbeitung von grossen Datenmengen)
Part II
Tristan Schneider

January 9, 2014

Contents
Introduction
  Social Graph
  Problems and Motivation
Approaches
  TAO
  Horton
  Pregel
  Trinity
  Unicorn
Conclusion
  Comparison
  Future Work

Social Graph

consists of nodes and edges
describes entities and their relations
used by Facebook, Google, Amazon, etc.
more than 100 million nodes and more than 10 billion edges

Problems and Motivation

amount of data exceeds the capacity of a single machine
necessary to distribute data and computation
data access managed by framework
different requirements (latency, throughput)

TAO

developed by Facebook
read-optimized
fixed set of queries
Strength: low-latency access

TAO: Data Model

data identified by 64-bit integer
Objects (id) → (otype, (key → value)*)
Associations (id1, atype, id2) → (time, (key → value)*)
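
The two mappings above can be pictured as plain records. The following is a minimal Python sketch under stated assumptions, not TAO's actual implementation: the class names `TaoObject` and `TaoAssociation`, the field names, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class TaoObject:
    """(id) -> (otype, (key -> value)*)"""
    id: int                      # 64-bit integer identifier
    otype: str                   # object type, e.g. "user"
    data: Dict[str, str] = field(default_factory=dict)


@dataclass
class TaoAssociation:
    """(id1, atype, id2) -> (time, (key -> value)*)"""
    id1: int                     # source object id
    atype: str                   # association type, e.g. "likes"
    id2: int                     # target object id
    time: int                    # timestamp attached to the association
    data: Dict[str, str] = field(default_factory=dict)

    def key(self) -> Tuple[int, str, int]:
        # The triple that identifies an association.
        return (self.id1, self.atype, self.id2)


# Hypothetical sample data.
alice = TaoObject(id=1, otype="user", data={"name": "Alice"})
post = TaoObject(id=2, otype="post", data={"text": "hello"})
likes = TaoAssociation(id1=alice.id, atype="likes", id2=post.id, time=1389225600)
```

An object is addressed by its id alone, while an association is addressed by the triple returned from `key()`.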

TAO: API

fixed set of queries
assoc_add, assoc_delete, assoc_change_type
assoc_get, assoc_count, assoc_range, assoc_time_range
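
To make the fixed query set more concrete, here is a toy single-process stand-in written in Python. It is a sketch only: the class `AssocStore`, the simplified parameter lists, and the newest-first ordering are assumptions made for illustration, and `assoc_change_type` is left out for brevity.

```python
class AssocStore:
    """Hypothetical in-memory stand-in for an association store."""

    def __init__(self):
        # (id1, atype) -> {id2: time}
        self._lists = {}

    def assoc_add(self, id1, atype, id2, time):
        self._lists.setdefault((id1, atype), {})[id2] = time

    def assoc_delete(self, id1, atype, id2):
        self._lists.get((id1, atype), {}).pop(id2, None)

    def _sorted(self, id1, atype):
        # Association list of (id2, time) pairs, newest first (assumed order).
        items = self._lists.get((id1, atype), {}).items()
        return sorted(items, key=lambda kv: kv[1], reverse=True)

    def assoc_get(self, id1, atype, id2set):
        return [(i, t) for i, t in self._sorted(id1, atype) if i in id2set]

    def assoc_count(self, id1, atype):
        return len(self._lists.get((id1, atype), {}))

    def assoc_range(self, id1, atype, pos, limit):
        return self._sorted(id1, atype)[pos:pos + limit]

    def assoc_time_range(self, id1, atype, high, low, limit):
        hits = [(i, t) for i, t in self._sorted(id1, atype) if low <= t <= high]
        return hits[:limit]


# Hypothetical sample data: object 1 likes objects 10, 11 and 12.
store = AssocStore()
store.assoc_add(1, "likes", 10, time=100)
store.assoc_add(1, "likes", 11, time=200)
store.assoc_add(1, "likes", 12, time=300)
newest_two = store.assoc_range(1, "likes", pos=0, limit=2)
```

Because every call names a source id and an association type, each query touches exactly one association list, which is what makes a fixed query set easy to serve with low latency.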

TAO: Architecture

data is divided into shards (via hashing)
each server handles one or more shards
objects and their associations are stored in the same shard
an object never changes its shard

TAO: Architecture

servers are divided into leaders and followers
clients always communicate with followers
cache misses and writes are forwarded to the leader
slave servers support master servers if necessary
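
The read and write paths described above can be sketched as follows. This is a simplified Python model, not TAO's code: the classes `Leader` and `Follower`, the dict standing in for the database, and the synchronous cache invalidation are all assumptions made for illustration.

```python
class Leader:
    """Holds the authoritative copy; a dict stands in for the database."""

    def __init__(self):
        self.db = {}
        self.followers = []

    def read(self, key):
        return self.db.get(key)

    def write(self, key, value):
        self.db[key] = value
        # Assumed simplification: invalidate follower caches synchronously.
        for follower in self.followers:
            follower.cache.pop(key, None)


class Follower:
    """Clients talk only to followers."""

    def __init__(self, leader):
        self.leader = leader
        self.cache = {}
        self.misses = 0
        leader.followers.append(self)

    def read(self, key):
        if key not in self.cache:
            # Cache miss: fetch from the leader and remember the answer.
            self.misses += 1
            self.cache[key] = self.leader.read(key)
        return self.cache[key]

    def write(self, key, value):
        # Writes are never applied locally, they go to the leader.
        self.leader.write(key, value)


leader = Leader()
f1, f2 = Follower(leader), Follower(leader)
f1.write("obj:1", "v1")
first = f2.read("obj:1")     # miss, served via the leader
second = f2.read("obj:1")    # hit, served from the follower cache
f1.write("obj:1", "v2")
third = f2.read("obj:1")     # miss again after invalidation
```

Repeated reads of the same key are answered by the follower alone, so adding followers scales read capacity without adding load on the leader.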

Figure: TAO architecture scheme

TAO: Fault Tolerance and Performance

efficiency and availability take priority over consistency
servers that are down are marked globally
followers are interchangeable
a slave database is promoted to master if the master crashes

TAO: Fault Tolerance and Performance

Figure: Write Access Latencies
https://www.facebook.com/download/273893712748848/atc13-bronson.pdf

Horton

query language execution engine
written in C#
Strength: interactive queries with low latency

Horton: Data Model

similar to TAO
divided into partitions
additional data can be attached (e.g. key-value pairs)
directed edges are stored at both source and target

Horton: API

Horton query language
queries are initiated via a client library

Horton: Architecture

Graph Client Library: translates the query into a regular expression
Graph Coordinator: translates the regular expression into a finite state machine and finds the most efficient execution plan
Graph Partition: executes the finite state machine and traverses the graph
Graph Manager: provides an interface for administering the graph
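
The idea of running a finite state machine while traversing the graph can be illustrated with a small Python sketch. It is not Horton's query language or optimizer: a linear list of node types stands in for the regular expression, a single process stands in for the partitions, and the graph, `compile_path` and `run` are hypothetical.

```python
# Hypothetical graph: node id -> (node type, outgoing neighbour ids).
GRAPH = {
    "alice": ("Person", ["photo1"]),
    "bob": ("Person", ["photo1", "photo2"]),
    "photo1": ("Photo", ["carol"]),
    "photo2": ("Photo", []),
    "carol": ("Person", []),
}


def compile_path(node_types):
    """Turn a linear pattern into FSM transitions: state -> {type: next state}."""
    transitions = {i: {t: i + 1} for i, t in enumerate(node_types)}
    return transitions, len(node_types)   # last value is the accepting state


def run(graph, transitions, accepting):
    """Walk the graph and advance the state machine on every visited node."""
    results = []

    def step(node, state, path):
        node_type, neighbours = graph[node]
        next_state = transitions.get(state, {}).get(node_type)
        if next_state is None:
            return                        # node type does not match the pattern
        path = path + [node]
        if next_state == accepting:
            results.append(path)
            return
        for neighbour in neighbours:
            if neighbour not in path:     # do not walk in circles
                step(neighbour, next_state, path)

    for start in graph:
        step(start, 0, [])
    return results


transitions, accepting = compile_path(["Person", "Photo", "Person"])
paths = run(GRAPH, transitions, accepting)
```

Here `paths` holds every Person, Photo, Person chain in the sample graph. In a distributed setting the pair (node, state) is what would have to be shipped to another partition when an edge crosses a partition boundary.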

Pregel

C++ based
computation consists of parallel iterations
communication via message passing
Strength: high throughput (for analysis)

Pregel: Data Model

graph is divided into partitions
partition assignment based on node id (hash(id) mod n)
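
As a small worked example of `hash(id) mod n`, the Python sketch below assigns eight vertices to three partitions. The function names and the sample ids are assumptions; Python's built-in `hash` is used only because it maps small non-negative integers to themselves.

```python
from collections import defaultdict


def partition_of(vertex_id, num_partitions):
    # hash(id) mod n
    return hash(vertex_id) % num_partitions


def assign(vertex_ids, num_partitions):
    """Group vertex ids by the partition they fall into."""
    parts = defaultdict(list)
    for v in vertex_ids:
        parts[partition_of(v, num_partitions)].append(v)
    return dict(parts)


parts = assign(range(8), 3)
```

Since the partition follows from the id alone, any machine can work out where a vertex lives without consulting a lookup table.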

Pregel: API

the task is implemented as a subclass of the Vertex class
the subclass overrides Compute(...) and can call methods such as SendMessageTo(...)
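
A vertex program in this style might look like the following Python sketch, which propagates the largest value through a graph. Pregel's API is C++; the snake_case method names, the `run` loop that plays the role of the master, and the sample graph are assumptions made for illustration.

```python
class Vertex:
    def __init__(self, vid, value, out_edges):
        self.id = vid
        self.value = value
        self.out_edges = out_edges    # ids of target vertices
        self.active = True
        self.outbox = []              # (target id, message) pairs

    def send_message_to(self, target, message):
        self.outbox.append((target, message))

    def vote_to_halt(self):
        self.active = False

    def compute(self, superstep, messages):
        """Keep the largest value seen so far and tell the neighbours."""
        new_value = max([self.value] + messages)
        if superstep == 0 or new_value > self.value:
            self.value = new_value
            for target in self.out_edges:
                self.send_message_to(target, self.value)
        else:
            self.vote_to_halt()


def run(vertices):
    """Single-process stand-in for the superstep loop."""
    inbox = {v.id: [] for v in vertices}
    superstep = 0
    while True:
        for v in vertices:
            messages = inbox[v.id]
            if messages:
                v.active = True       # incoming messages reactivate a vertex
            if v.active:
                v.compute(superstep, messages)
        # Messages sent in this superstep are delivered in the next one.
        inbox = {v.id: [] for v in vertices}
        sent = False
        for v in vertices:
            for target, message in v.outbox:
                inbox[target].append(message)
                sent = True
            v.outbox = []
        superstep += 1
        if not sent and not any(v.active for v in vertices):
            return superstep


# Hypothetical four-vertex chain with values 3, 6, 2, 1.
vertices = [
    Vertex(1, 3, [2]),
    Vertex(2, 6, [1, 3]),
    Vertex(3, 2, [2, 4]),
    Vertex(4, 1, [3]),
]
supersteps = run(vertices)
```

The computation stops once every vertex has voted to halt and no messages are in flight; at that point every vertex holds the maximum value.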

Pregel: Architecture

runs on a cluster management system
uses distributed storage (e.g. GFS or Bigtable)

Pregel: Basic Work Flow

1. the task is copied to the worker machines; one of them becomes the master
2. the master assigns one or more partitions to each worker
3. the master invokes the supersteps
4. the graph is saved after the computation

Pregel: Fault Tolerance and Performance

workers save their progress at checkpoint supersteps
worker failures are detected using pings
partitions of failed workers are reassigned to available workers
the state of the most recent available checkpoint superstep is reloaded
the computation is terminated if the master fails
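
The checkpoint and reload behaviour above can be simulated in a few lines of Python. This is a toy model under stated assumptions: the checkpoint interval, the stand-in computation in `superstep`, and the way a failure is injected via `fail_at` are all hypothetical.

```python
import copy

CHECKPOINT_EVERY = 2   # assumed interval, in supersteps


def superstep(state):
    # Stand-in computation: every vertex value grows by one per superstep.
    return {vertex: value + 1 for vertex, value in state.items()}


def run(state, total_supersteps, fail_at=None):
    """Run supersteps; optionally simulate one worker failure."""
    checkpoint = (0, copy.deepcopy(state))
    step = 0
    recovered = False
    while step < total_supersteps:
        if step % CHECKPOINT_EVERY == 0:
            checkpoint = (step, copy.deepcopy(state))
        if step == fail_at and not recovered:
            # Failure detected: reload the most recent checkpoint and redo
            # the supersteps that were lost.
            step, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            recovered = True
            continue
        state = superstep(state)
        step += 1
    return state


without_failure = run({"a": 0, "b": 10}, total_supersteps=5)
with_failure = run({"a": 0, "b": 10}, total_supersteps=5, fail_at=3)
```

Both runs end in the same state; the failure only costs the supersteps between the last checkpoint and the crash, which is the trade-off behind choosing a checkpoint interval.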

Pregel: Fault Tolerance and Performance

Figure: Varying number of workers on a binary tree with 1 billion vertices
http://kowshik.github.io/JPregel/pregel_paper.pdf

Trinity
developed by Microsoft
flexible in data and computation
supports online query processing and offline computation
runs on top of a well-connected cluster (memory cloud)
based on TFS (similar to HDFS)
Strength: low latency and high throughput (not at the same time)

Trinity: Data Model

key-value store
one table for nodes
one table for each type of relation
relations are represented by id pairs in the corresponding table
customisation possible with the Trinity Specification Language (TSL)
data is backed up in a persistent file system
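
The table layout described above can be mimicked with plain Python containers. This sketch says nothing about Trinity's memory layout or about TSL; the table contents and the helper functions `neighbours` and `names` are hypothetical.

```python
# Node table: node id -> attributes (hypothetical data).
nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
}

# One table per relation type, each holding (source id, target id) pairs.
relations = {
    "friend": {(1, 2), (2, 3)},
    "follows": {(3, 1)},
}


def neighbours(node_id, relation):
    """Ids reachable from node_id over one relation type."""
    return sorted(dst for src, dst in relations[relation] if src == node_id)


def names(ids):
    """Look the ids up in the node table."""
    return [nodes[i]["name"] for i in ids]


friends_of_bob = names(neighbours(2, "friend"))
```

Keeping each relation type in its own table means that a query over one relation never has to scan pairs that belong to another.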

Trinity: API

Trinity Desktop Environment (TDE)
supports query requests (similar to Horton/SQL)
supports offline computation (similar to Pregel)

Trinity: Architecture

Slaves: store a part of the data, process tasks and messages.
Proxies: optional middle tier between slaves and clients; handle messages, do not store data.
Clients: responsible for user interaction with the cluster.

Trinity: Architecture

Figure: Trinity Cluster Structure
https://research.microsoft.com/pubs/161291/trinity.pdf

Trinity: Fault Tolerance and Performance

no ACID support, but individual operations are atomic
dead machines are replaced by live ones, which reload their memory from TFS
a requesting machine waits until the dead machine has been replaced
computation recovers from the state of the most recent checkpoint superstep (similar to Pregel)

Trinity: Fault Tolerance and Performance

Figure: Response time of subgraph match queries
https://research.microsoft.com/pubs/161291/trinity.pdf

Unicorn

in-memory social graph-aware indexing system
search backend of Facebook
based on Hadoop
Strength: typeahead; good performance on complex queries

Unicorn: Data Model

sharded data (similar to Facebook's TAO)
indices built and converted using custom Hadoop pipeline

Unicorn: API

Queries in Unicorn Query Language
e.g. (term likers:104076956295773)
≈ 6M likers of "Computer Science"
apply: queries a (truncated) set of ids and then uses those to construct a new query
extract: attaches matches as metadata within the forward index of the query set
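
How `term` and `apply` fit together can be shown with a toy inverted index in Python. This is a sketch, not Unicorn's implementation: the index contents, the short id `42`, the truncation limit, and the helper names `or_` and `apply_` are assumptions, and ranking is ignored.

```python
# Hypothetical inverted index: term -> sorted list of ids (posting list).
INDEX = {
    "likers:42": [1, 2, 3],
    "friend:1": [2, 7],
    "friend:2": [1, 8],
    "friend:3": [8, 9],
}


def term(t):
    """Look up one posting list."""
    return INDEX.get(t, [])


def or_(*lists):
    """Union of several posting lists."""
    return sorted(set().union(*lists))


def apply_(prefix, inner_ids, limit):
    """Truncate the inner result, then query prefix+id for every id left."""
    truncated = inner_ids[:limit]
    return or_(*(term(f"{prefix}{i}") for i in truncated))


likers = term("likers:42")
friends_of_likers = apply_("friend:", likers, limit=2)
```

With `limit=2` only the first two likers are expanded, so the result is incomplete by design; truncating the inner set is what keeps the second round of lookups bounded.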

Unicorn: Architecture

top-aggregator: dispatches the query to one rack-aggregator of each rack, combines and returns the results
rack-aggregator: forwards the query to all index servers of its rack (high bandwidth), combines the results
index server: about 40-80 machines per rack; stores adjacency lists, performs operations
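
The three tiers can be pictured as nested fan-out and combine steps. The Python sketch below is an illustration only: the data, the function names, and the use of plain id order in place of ranking are assumptions.

```python
def index_server(adjacency, term):
    """Each index server answers from its own part of the adjacency lists."""
    return adjacency.get(term, [])


def rack_aggregator(servers, term):
    """Forward the query to every index server of the rack and combine."""
    hits = []
    for adjacency in servers:
        hits.extend(index_server(adjacency, term))
    return hits


def top_aggregator(racks, term, limit):
    """Ask one rack-aggregator per rack, merge, and truncate the result."""
    hits = set()
    for servers in racks:
        hits.update(rack_aggregator(servers, term))
    return sorted(hits)[:limit]     # id order stands in for ranking


# Hypothetical cluster: two racks with two index servers each.
RACKS = [
    [{"likers:42": [1, 5]}, {"likers:42": [9]}],
    [{"likers:42": [3]}, {}],
]
result = top_aggregator(RACKS, "likers:42", limit=3)
```

Combining inside the rack first means that only one merged answer per rack has to cross the slower links between racks.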

Unicorn: Fault Tolerance and Performance

sharding and replication
failed machines are replaced automatically
serving incomplete results is strongly preferable to serving empty results

Unicorn: Fault Tolerance and Performance
(apply friend: likers:104076956295773) ≈ friends of likers of "Computer Science"

https://www.facebook.com/download/138915572976390/UnicornVLDB-final.pdf

Comparison

Framework   Query language   Low latency   High throughput
TAO         no               yes           no
Horton      yes              yes           no
Pregel      no               no            yes
Trinity     yes              yes           yes
Unicorn     yes              yes           no

Future Work

query language vs. fixed set of queries
an all-in-one framework is difficult (Trinity is the best attempt so far)

Thank you for your attention.

Questions?

Sources
1. https://research.microsoft.com/pubs/161291/trinity.pdf
2. http://research.microsoft.com/pubs/162643/icde12_demo_679.pdf
3. http://kowshik.github.io/JPregel/pregel_paper.pdf
4. https://www.facebook.com/download/273893712748848/atc13-bronson.pdf
5. https://www.facebook.com/download/138915572976390/UnicornVLDB-final.pdf

Effziente Verarbeitung von grossen Datenmengen Teil II

Conclusion

Weitere ähnliche Inhalte

Was ist angesagt?

Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Rusif Eyvazli
 
Event-Driven, Client-Server Archetypes for E-Commerce
Event-Driven, Client-Server Archetypes for E-CommerceEvent-Driven, Client-Server Archetypes for E-Commerce
Event-Driven, Client-Server Archetypes for E-Commerceijtsrd
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGcsandit
 
Ijircce publish this paper
Ijircce publish this paperIjircce publish this paper
Ijircce publish this paperSANTOSH WAYAL
 
Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill MapR Technologies
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaNithin Kakkireni
 
Data mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configurationData mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configurationijcsit
 
Fota Delta Size Reduction Using FIle Similarity Algorithms
Fota Delta Size Reduction Using FIle Similarity AlgorithmsFota Delta Size Reduction Using FIle Similarity Algorithms
Fota Delta Size Reduction Using FIle Similarity AlgorithmsShivansh Gaur
 
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer..." NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...Dataconomy Media
 
Using R for Cyber Security Part 1
Using R for Cyber Security Part 1Using R for Cyber Security Part 1
Using R for Cyber Security Part 1Ajay Ohri
 
Web Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using HadoopWeb Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using Hadoopdbpublications
 
Replication and Synchronization Algorithms for Distributed Databases - Lena W...
Replication and Synchronization Algorithms for Distributed Databases - Lena W...Replication and Synchronization Algorithms for Distributed Databases - Lena W...
Replication and Synchronization Algorithms for Distributed Databases - Lena W...distributed matters
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTijwscjournal
 
A Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSXA Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSXStuart Chalk
 
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...Till Blume
 
AOTO: Adaptive overlay topology optimization in unstructured P2P systems
AOTO: Adaptive overlay topology optimization in unstructured P2P systemsAOTO: Adaptive overlay topology optimization in unstructured P2P systems
AOTO: Adaptive overlay topology optimization in unstructured P2P systemsZhenyun Zhuang
 

Was ist angesagt? (20)

Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...
 
Event-Driven, Client-Server Archetypes for E-Commerce
Event-Driven, Client-Server Archetypes for E-CommerceEvent-Driven, Client-Server Archetypes for E-Commerce
Event-Driven, Client-Server Archetypes for E-Commerce
 
Big Data & Hadoop. Simone Leo (CRS4)
Big Data & Hadoop. Simone Leo (CRS4)Big Data & Hadoop. Simone Leo (CRS4)
Big Data & Hadoop. Simone Leo (CRS4)
 
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTINGREPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
REPLICATION STRATEGY BASED ON DATA RELATIONSHIP IN GRID COMPUTING
 
Ijircce publish this paper
Ijircce publish this paperIjircce publish this paper
Ijircce publish this paper
 
Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_Sharmila
 
Data mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configurationData mining model for the data retrieval from central server configuration
Data mining model for the data retrieval from central server configuration
 
Fota Delta Size Reduction Using FIle Similarity Algorithms
Fota Delta Size Reduction Using FIle Similarity AlgorithmsFota Delta Size Reduction Using FIle Similarity Algorithms
Fota Delta Size Reduction Using FIle Similarity Algorithms
 
Harvard poster
Harvard posterHarvard poster
Harvard poster
 
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer..." NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...
" NoSQL Databases: An Overview" Lena Wiese, Research Group Knowledge Engineer...
 
Using R for Cyber Security Part 1
Using R for Cyber Security Part 1Using R for Cyber Security Part 1
Using R for Cyber Security Part 1
 
Poster (1)
Poster (1)Poster (1)
Poster (1)
 
Web Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using HadoopWeb Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using Hadoop
 
Replication and Synchronization Algorithms for Distributed Databases - Lena W...
Replication and Synchronization Algorithms for Distributed Databases - Lena W...Replication and Synchronization Algorithms for Distributed Databases - Lena W...
Replication and Synchronization Algorithms for Distributed Databases - Lena W...
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
 
A Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSXA Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSX
 
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
 
AOTO: Adaptive overlay topology optimization in unstructured P2P systems
AOTO: Adaptive overlay topology optimization in unstructured P2P systemsAOTO: Adaptive overlay topology optimization in unstructured P2P systems
AOTO: Adaptive overlay topology optimization in unstructured P2P systems
 

Ähnlich wie Effiziente Verarbeitung von grossen Datenmengen

Google Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 DayGoogle Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 Dayprogrammermag
 
Fundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvediFundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvediAnimesh Chaturvedi
 
Streaming Data in R
Streaming Data in RStreaming Data in R
Streaming Data in RRory Winston
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaborationJulien Pivotto
 
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...Nati Shalom
 
Big Data Meetup #7
Big Data Meetup #7Big Data Meetup #7
Big Data Meetup #7Paul Lo
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onDony Riyanto
 
My Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataMy Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataRobert Grossman
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionEtu Solution
 
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...IJECEIAES
 
Survey Paper on Big Data and Hadoop
Survey Paper on Big Data and HadoopSurvey Paper on Big Data and Hadoop
Survey Paper on Big Data and HadoopIRJET Journal
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom IndustryCloudera, Inc.
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersIan Foster
 
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...Dipayan Dev
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationDenodo
 

Ähnlich wie Effiziente Verarbeitung von grossen Datenmengen (20)

Paper ijert
Paper ijertPaper ijert
Paper ijert
 
Google Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 DayGoogle Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 Day
 
Fundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvediFundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvedi
 
Streaming Data in R
Streaming Data in RStreaming Data in R
Streaming Data in R
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...
Designing a Scalable Twitter - Patterns for Designing Scalable Real-Time Web ...
 
Big Data Meetup #7
Big Data Meetup #7Big Data Meetup #7
Big Data Meetup #7
 
Hadoop basics
Hadoop basicsHadoop basics
Hadoop basics
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
My Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataMy Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big Data
 
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
 
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
 
Survey Paper on Big Data and Hadoop
Survey Paper on Big Data and HadoopSurvey Paper on Big Data and Hadoop
Survey Paper on Big Data and Hadoop
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
 
disertation
disertationdisertation
disertation
 
BigData
BigDataBigData
BigData
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and Supercomputers
 
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
Dr.Hadoop- an infinite scalable metadata management for Hadoop-How the baby e...
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
 

Mehr von Florian Stegmaier

Ansätze für gemeinschaftliches Filtering
Ansätze für gemeinschaftliches FilteringAnsätze für gemeinschaftliches Filtering
Ansätze für gemeinschaftliches FilteringFlorian Stegmaier
 
Fortschritte im Bereich Collaborative Filtering
Fortschritte im Bereich Collaborative FilteringFortschritte im Bereich Collaborative Filtering
Fortschritte im Bereich Collaborative FilteringFlorian Stegmaier
 
Realtime
 Distributed Analysis
 of Datastreams
Realtime
 Distributed Analysis
 of DatastreamsRealtime
 Distributed Analysis
 of Datastreams
Realtime
 Distributed Analysis
 of DatastreamsFlorian Stegmaier
 
Effiziente Verarbeitung von großen Datenmengen
Effiziente Verarbeitung von großen DatenmengenEffiziente Verarbeitung von großen Datenmengen
Effiziente Verarbeitung von großen DatenmengenFlorian Stegmaier
 
Trust-based recommender systems
Trust-based recommender systemsTrust-based recommender systems
Trust-based recommender systemsFlorian Stegmaier
 
Trust und Interest Similarity und deren Anwendung für Empfehlungssysteme
Trust und Interest Similarity und deren Anwendung für EmpfehlungssystemeTrust und Interest Similarity und deren Anwendung für Empfehlungssysteme
Trust und Interest Similarity und deren Anwendung für EmpfehlungssystemeFlorian Stegmaier
 
Robustheit in Empfehlungssystemen
Robustheit in EmpfehlungssystemenRobustheit in Empfehlungssystemen
Robustheit in EmpfehlungssystemenFlorian Stegmaier
 
Linked Open Data als Basis für Empfehlungssysteme
Linked Open Data als Basis für EmpfehlungssystemeLinked Open Data als Basis für Empfehlungssysteme
Linked Open Data als Basis für EmpfehlungssystemeFlorian Stegmaier
 
Entscheidungshilfe: Recommender System
Entscheidungshilfe: Recommender SystemEntscheidungshilfe: Recommender System
Entscheidungshilfe: Recommender SystemFlorian Stegmaier
 
Funktionsweise und Ansätze von inhaltsbasiertem Filtern
Funktionsweise und Ansätze von inhaltsbasiertem FilternFunktionsweise und Ansätze von inhaltsbasiertem Filtern
Funktionsweise und Ansätze von inhaltsbasiertem FilternFlorian Stegmaier
 
Context Basierte Personalisierungsansätze
Context Basierte PersonalisierungsansätzeContext Basierte Personalisierungsansätze
Context Basierte PersonalisierungsansätzeFlorian Stegmaier
 
Evaluierung von Empfehlungssystemen
Evaluierung von EmpfehlungssystemenEvaluierung von Empfehlungssystemen
Evaluierung von EmpfehlungssystemenFlorian Stegmaier
 
Introduction to the FP7 CODE project @ BDBC
Introduction to the FP7 CODE project @ BDBCIntroduction to the FP7 CODE project @ BDBC
Introduction to the FP7 CODE project @ BDBCFlorian Stegmaier
 
Generische Datenintegration zur semantischen Diagnoseunterstützung im Projekt...
Generische Datenintegration zur semantischen Diagnoseunterstützung im Projekt...Generische Datenintegration zur semantischen Diagnoseunterstützung im Projekt...
Generische Datenintegration zur semantischen Diagnoseunterstützung im Projekt...Florian Stegmaier
 
AIR: Architecture for Interoperable Retrieval on Distributed and Heterogeneou...
AIR: Architecture for Interoperable Retrieval on Distributed and Heterogeneou...AIR: Architecture for Interoperable Retrieval on Distributed and Heterogeneou...
AIR: Architecture for Interoperable Retrieval on Distributed and Heterogeneou...Florian Stegmaier
 

Effiziente Verarbeitung von grossen Datenmengen