Introducing Riak
Kevin A. Smith
Senior Developer
Basho Technologies
What Is Riak?

• Key/Value store
• Document-oriented database
• Web-shaped storage
Key/Value Store


• Data organized by bucket/key pairs
• Simple REST API (GET, PUT, DELETE)
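The bucket/key mapping onto REST verbs can be sketched as follows. This is a minimal illustration that only builds request descriptions and sends nothing. The `/riak/<bucket>/<key>` path follows the URLs used elsewhere in this deck; the helper name `riakRequest`, the operation names, and the base URL are assumptions made for the example.

```javascript
// Hypothetical helper: describes a key/value operation as an HTTP request.
// Nothing is sent over the network; the base URL is an assumed local node.
const BASE_URL = "http://localhost:8098";

function riakRequest(operation, bucket, key, value) {
  const methods = { fetch: "GET", store: "PUT", remove: "DELETE" };
  const method = methods[operation];
  if (!method) {
    throw new Error("unknown operation: " + operation);
  }
  const request = {
    method,
    url: `${BASE_URL}/riak/${encodeURIComponent(bucket)}/${encodeURIComponent(key)}`,
  };
  if (method === "PUT") {
    // JSON is used here only as an example content type.
    request.headers = { "Content-Type": "application/json" };
    request.body = JSON.stringify(value);
  }
  return request;
}

console.log(riakRequest("store", "artist", "REM", { genre: "rock" }));
```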
Document Store

• Store values as JSON
• Many clients support automatic JSON
  encoding/decoding
• Javascript Map/Reduce built on top of JSON
Web-Shaped Storage

• Content neutral
• Highly distributed
• Replicated
• Fault-tolerant
What Is Riak?
A flexible storage engine...
...with a REST API...
...and map/reduce capability...
...designed to be fault-tolerant...
...distributed...
...and ops friendly
Influences

• CAP Theorem
• Amazon’s Dynamo Paper
• Experience running large networks
  (Akamai)
CAP Theorem
Consistent: Reads and writes reflect a
globally consistent system state

Available: System is available for reads and
writes

Partition Tolerant: System can handle
the failure of individual parts
Common Wisdom


   Pick two.
The Riak Way

     Pick Two.

For each operation.
Dynamo Influences
• N = The number of replicas
• R = The number of replicas needed for a
  successful read
• W = The number of replicas needed for a
  successful write
Dynamo Math


N - R = read fault tolerance
N - W = write fault tolerance
Dynamo Math
N = 4, W = 2, R = 1


4 - 2 = 2 hosts can be down and Riak can
still perform writes.
4 - 1 = 3 hosts can be down and Riak can
still perform reads.
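The arithmetic above can be written as a small helper. This is only a sketch of the N - R and N - W formulas from the slides; the function names `faultTolerance` and `quorumMet` are made up for the example.

```javascript
// Computes how many replica hosts may be unavailable while reads and
// writes still succeed, following N - R and N - W from the slides.
function faultTolerance({ n, r, w }) {
  if (r < 1 || w < 1 || r > n || w > n) {
    throw new RangeError("R and W must be between 1 and N");
  }
  return { read: n - r, write: n - w };
}

// True when enough replicas are reachable to satisfy a quorum (R or W).
function quorumMet(replicasUp, quorum) {
  return replicasUp >= quorum;
}

const tolerance = faultTolerance({ n: 4, r: 1, w: 2 });
console.log(tolerance); // { read: 3, write: 2 }
```

With N = 4 and two hosts down, two replicas remain reachable, so `quorumMet(2, 2)` holds for writes (W = 2) and `quorumMet(2, 1)` holds for reads (R = 1).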
Riak Improvements

• N can vary per bucket
• R and W can vary per operation
   Choose your own fault tolerance/performance tradeoff
Consistent Hashing
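[Figure: hash ring running from 0 to 2^160, with 2^160/4 and 2^160/2 marked, divided among node 0, node 1, node 2, and node 3. hash(<<"artist">>,<<"REM">>) places the bucket/key pair on the ring.]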
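A rough sketch of the ring lookup, under stated assumptions: the 2^160 ring size and the four nodes come from the figure, while the partition count (64), the `"/"` join used before hashing, the round-robin assignment of partitions to nodes, and all function names are assumptions for illustration. Riak's own encoding and placement details may differ.

```javascript
const crypto = require("crypto");

const RING_SIZE = 2n ** 160n; // SHA-1 output space, as in the figure
const NUM_PARTITIONS = 64n;   // assumed partition count for this sketch
const NODES = ["node 0", "node 1", "node 2", "node 3"];

// Hash a bucket/key pair to a point on the ring.
// Assumption: the pair is joined with "/" before hashing.
function ringPosition(bucket, key) {
  const digest = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  return BigInt("0x" + digest);
}

// Index of the equal-sized partition that contains a ring position.
function partitionFor(position) {
  const partitionSize = RING_SIZE / NUM_PARTITIONS;
  return Number(position / partitionSize);
}

// The N partitions holding replicas: the owning partition and the next
// N - 1, wrapping around the ring. Partitions map to nodes round-robin.
function preferenceList(bucket, key, n) {
  const first = partitionFor(ringPosition(bucket, key));
  const total = Number(NUM_PARTITIONS);
  const list = [];
  for (let i = 0; i < n; i++) {
    const partition = (first + i) % total;
    list.push({ partition, node: NODES[partition % NODES.length] });
  }
  return list;
}

console.log(preferenceList("artist", "REM", 3));
```

Because consecutive partitions belong to different nodes in this layout, N = 3 replicas land on three distinct nodes.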
R value
get(<<"artist">>,<<"REM">>, R=2) with N=3 returns {ok, Object}
[Figure: ring with one node marked X.]
W value
put(<<"artist">>,<<"REM">>, W=2) with N=3 returns ok
[Figure: ring with one node marked X.]
N=10, R/W=2
get/put("artist", "REM", R/W=2) with N=10 returns {ok, Object}
[Figure: ring with eight nodes marked X.]
Resolving Conflicts

• Riak focuses on the AP of CAP
• Data could be briefly inconsistent
• Inconsistency must be resolved
Detecting & Resolving Conflicts
[Figure sequence: nodes 0 to 3 hold replicas of an object. Both replicas start at v0; one replica is updated to v1 while the other is still at v0; finally both replicas are at v1.]
Client Resolution

• Can be set per-bucket or server-wide
• Conflicting data is “bubbled up” to the
  client
• Client picks the winner
Server Resolution

• “Last write wins”
• Enabled by default
• What most apps need 80% of the time
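The two resolution styles can be contrasted in a short sketch. The sibling shape `{ value, timestamp }`, the function names, and the shopping-cart policy are all hypothetical; they illustrate the idea and are not Riak's internal representation.

```javascript
// Hypothetical sibling shape: { value, timestamp }, where timestamp is
// milliseconds since the epoch recorded at write time.

// Server-style resolution: the most recently written sibling wins.
function lastWriteWins(siblings) {
  if (siblings.length === 0) {
    throw new Error("no siblings to resolve");
  }
  return siblings.reduce((winner, candidate) =>
    candidate.timestamp > winner.timestamp ? candidate : winner
  );
}

// Client-style resolution: conflicting siblings are handed to application
// code, which picks the winner.
function resolveOnClient(siblings, pickWinner) {
  return siblings.length === 1 ? siblings[0] : pickWinner(siblings);
}

const siblings = [
  { value: { cart: ["book"] }, timestamp: 1000 },
  { value: { cart: ["book", "pen"] }, timestamp: 900 },
];

// Example client policy: prefer the sibling with the larger cart.
const largestCart = (candidates) =>
  candidates.reduce((a, b) => (b.value.cart.length > a.value.cart.length ? b : a));

console.log(lastWriteWins(siblings).value);                // { cart: [ 'book' ] }
console.log(resolveOnClient(siblings, largestCart).value); // { cart: [ 'book', 'pen' ] }
```

The example shows why the choice matters: last write wins keeps the newer but smaller cart, while the client policy keeps the larger one.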
Live Demo!
Linking Objects

• Objects can store pointers, or links, to
  other objects
• Linked objects don’t have to be in the same bucket
• Object links described in a Link header
Link Header Format

</riak/demo/test1>; riaktag="userinfo"

• </riak/demo/test1>: Object URL
• riaktag="userinfo": Link tag
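A minimal sketch of pulling the bucket, key, and tag out of one Link header entry in the format shown above. The function name `parseRiakLink` is made up for the example, and the pattern only covers this exact shape, not the full HTTP Link header grammar.

```javascript
// Parses one Link header entry of the form:
//   </riak/<bucket>/<key>>; riaktag="<tag>"
// Returns null when the entry does not match that shape.
function parseRiakLink(entry) {
  const pattern = /^\s*<\/riak\/([^/>]+)\/([^/>]+)>\s*;\s*riaktag="([^"]*)"\s*$/;
  const match = pattern.exec(entry);
  if (!match) {
    return null;
  }
  const [, bucket, key, tag] = match;
  return { bucket, key, tag };
}

console.log(parseRiakLink('</riak/demo/test1>; riaktag="userinfo"'));
// { bucket: 'demo', key: 'test1', tag: 'userinfo' }
```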
Link Walking

• Ask Riak to “walk” a sequence of links
• Optionally, collect objects along the walk
  and return them
• Can be arbitrarily deep
Link Walking Examples


      /riak/demo/test1/_,_,1
Start walking at /demo/test1 and return all
linked objects
Link Walking Examples


    /riak/demo/test1/demo,_,1
Start walking at /demo/test1 and return all
linked objects contained in the demo bucket
Link Walking Examples


     /riak/demo/test1/_,_,0/_,_,1
Start walking at /demo/test1, find any linked objects,
then find and return any objects linked to those
Link Walking Examples

  /riak/demo/test1/_,child,0/_,_,1
Start walking at /demo/test1, find any linked objects
with the link tag “child”, then find and return any objects
linked to those
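The walk URLs in these examples appear to follow a `bucket,tag,keep` pattern per step, with `_` as a wildcard and the last field selecting whether that step's objects are returned. Under that reading, a small builder might look like this; `linkWalkUrl` and the step object shape are assumptions for illustration.

```javascript
// Builds a link-walking URL. Each step is { bucket, tag, keep }: an omitted
// bucket or tag becomes the "_" wildcard, and keep selects whether that
// step's objects are returned (1) or only traversed (0).
function linkWalkUrl(bucket, key, steps) {
  const segments = steps.map(({ bucket: b = "_", tag = "_", keep = false }) =>
    [b, tag, keep ? 1 : 0].join(",")
  );
  return ["/riak", bucket, key, ...segments].join("/");
}

console.log(linkWalkUrl("demo", "test1", [{ keep: true }]));
// /riak/demo/test1/_,_,1
console.log(linkWalkUrl("demo", "test1", [{ tag: "child" }, { keep: true }]));
// /riak/demo/test1/_,child,0/_,_,1
```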
Map/Reduce Terms
• Phase: A step within a job
• Job: A sequence of phases and inputs
• Map: Data collection phase
• Reduce: Data collation or processing
  phase
Map/Reduce Overview
• Map phases execute in parallel w/data
  locality
• Reduce phases execute in parallel on the
  node where job was submitted
• Results are not cached or stored
• Phases can be written in Erlang or
  Javascript
Map Phase

• Inputs must be bucket/key pairs
• Must return a list
• Parallel results are aggregated into a single
  list
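The three rules above can be simulated in-process. This sketch groups bucket/key inputs by the node that stores them, maps each group, and concatenates the per-node lists. It runs sequentially, so it models the data flow rather than real parallelism; `runMapPhase`, `locate`, and `fetchObject` are hypothetical names standing in for the cluster.

```javascript
// Simulated map phase. `locate(bucket, key)` names the node that stores an
// input and `fetchObject(bucket, key)` returns the stored object; both are
// hypothetical stand-ins supplied by the caller.
function runMapPhase(inputs, locate, fetchObject, mapFn) {
  // Group inputs by owning node (data locality).
  const byNode = new Map();
  for (const [bucket, key] of inputs) {
    const node = locate(bucket, key);
    if (!byNode.has(node)) {
      byNode.set(node, []);
    }
    byNode.get(node).push([bucket, key]);
  }
  // Each node maps its own inputs; in a cluster these groups run in parallel.
  const perNode = [...byNode.values()].map((group) =>
    group.flatMap(([bucket, key]) => {
      const result = mapFn(fetchObject(bucket, key));
      if (!Array.isArray(result)) {
        throw new TypeError("map functions must return a list");
      }
      return result;
    })
  );
  // Aggregate the per-node lists into a single list.
  return perNode.flat();
}
```

Result order depends on how inputs are grouped, so callers should not rely on it matching input order.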
Parallel Map
Erlang Map Phase

• Two types: modfun and qfun
• modfuns reference the module and name
  of the Erlang function to call
• qfuns are anonymous Erlang functions*

  *Must be on the server-side codepath
Erlang Map Phase
map_object_value(Obj, _KeyData, _Arg) ->
    [riak_object:get_value(Obj)].

• Obj: riak_object retrieved from bucket/key
• KeyData: Static argument specified with the bucket/key
• Arg: Static argument specified with the phase
Erlang Map Built-Ins
riak_mapreduce:map_object_value/3

• Returns object value wrapped in a list
riak_mapreduce:map_object_value_list/3

• Returns object value. Object value must already
  be a list
Javascript Map Phase
• Two types: jsanon and jsfun
• jsanons are anonymous JS functions:
  function(value) { return [value]; }

• jsfuns are named JS functions:
      Riak.mapValuesJson
Erlang & Javascript

• Same environment as Firefox minus
  browser bits
• Erlang to Javascript data is JSON encoded
• Javascript to Erlang data is JSON decoded
Javascript Map Phase
function(value, keyData, arg)


• value: JSON-encoded version of
  riak_object

• keyData: Same as Erlang
• arg: Same as Erlang
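A sketch of a map function with this signature. It assumes the JSON-encoded object exposes its stored payload at `value.values[0].data` as a string; that shape, the function name `extractClose`, and the document fields `ticker` and `close` are assumptions for illustration.

```javascript
// Returns the closing price from a stored JSON document when its ticker
// matches `arg`, otherwise an empty list. Always returns a list.
// Assumed object shape: { bucket, key, values: [{ data: "<json string>" }] }.
function extractClose(value, keyData, arg) {
  const doc = JSON.parse(value.values[0].data);
  return doc.ticker === arg ? [doc.close] : [];
}

// Hand-built stand-in for what the map phase would receive.
const stored = {
  bucket: "stocks",
  key: "goog",
  values: [{ data: JSON.stringify({ ticker: "GOOG", close: 550.1 }) }],
};

console.log(extractClose(stored, undefined, "GOOG")); // [ 550.1 ]
console.log(extractClose(stored, undefined, "CSCO")); // []
```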
Javascript Map Built-Ins
Riak.mapValues

• Returns object values. Handles detecting
  when/if to use list wrapping.
Riak.mapValuesJson

• Returns JSON parsed object values. Also
  performs list wrapping, if needed.
Reduce Phase

• Performed on the node coordinating the
  map/reduce job
• Two processes per reduce phase to add
  minor parallelism
• Must return a list
Erlang Reduce Built-Ins
riak_mapreduce:reduce_set_union/2
• Returns unique set of values
riak_mapreduce:reduce_sum/2
• Returns the sum of inputs
riak_mapreduce:reduce_sort/2
• Returns the sorted list of inputs
Javascript Reduce Built-Ins
  Riak.reduceMin

• Returns the minimum value of the input set
  Riak.reduceMax

• Returns the maximum value of the input set
  Riak.reduceSort

• Returns a sorted list of the input set
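To show what "list in, list out" means for a reduce phase, here are stand-alone reimplementations of the three behaviors described above. They are written for illustration, are not Riak's own source, and assume numeric inputs.

```javascript
// Each function takes a list and returns a list, so a reduce can be
// re-applied to the combined output of earlier partial reductions.
function reduceMin(values) {
  return values.length === 0 ? [] : [Math.min(...values)];
}

function reduceMax(values) {
  return values.length === 0 ? [] : [Math.max(...values)];
}

function reduceSort(values) {
  // Numeric ascending sort on a copy, leaving the input untouched.
  return [...values].sort((a, b) => a - b);
}

// Two partial reductions combined by reducing again.
const partialA = reduceMin([7, 3, 9]);
const partialB = reduceMin([4, 8]);
console.log(reduceMin([...partialA, ...partialB])); // [ 3 ]
```

Returning a list even for a single result is what lets partial results from separate reduce processes be merged by running the same function once more.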
Building M/R Job

• Job is a list of phases and starting inputs
• Each phase can:
 • Receive a static argument
 • Accumulate and return results
Submitting Jobs via HTTP
• Riak exposes M/R via its REST API
• Job is described in JSON
• Submitted via POST
• Default URL is /mapred
Erlang Phase (JSON)
{Type: {"language": "erlang", "module": Module,
        "function": Function, "keep": Flag}}

• Type: "map" or "reduce"
• Module: String name of Erlang module
• Function: String name of Erlang function
• Flag: Boolean accumulation toggle
Javascript Phase (JSON)
{Type: {"language": "javascript",
        "source": Source, "keep": Flag}}

• Type: "map" or "reduce"
• Source: Source for anonymous function
• Flag: Boolean accumulation toggle
Javascript Phase (JSON)
{Type: {"language": "javascript",
        "name": Name, "keep": Flag}}

• Type: "map" or "reduce"
• Name: String name of Javascript function
• Flag: Boolean accumulation toggle
Putting It Together

{"inputs": [["stocks", "goog"]],
 "query": [{"map": {"language": "javascript",
                    "name": "Riak.mapValuesJson",
                    "keep": true}}]}
Putting It Together

{"inputs": [["stocks", "goog"],
            ["stocks", "csco"]],
 "query": [{"map": {"language": "javascript",
                    "name": "Riak.mapValuesJson",
                    "keep": true}}]}
Putting It Together

{"inputs": "stocks",
 "query": [{"map": {"language": "javascript",
                    "name": "App.extractTickers",
                    "arg": "GOOG",
                    "keep": false}},
           {"reduce": {"language": "javascript",
                       "name": "Riak.reduceMin",
                       "keep": true}}]}
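A job like the ones above can be assembled programmatically. This sketch follows the named-function phase template from the earlier slides, with "keep" inside the phase object, and the POST to /mapred described under Submitting Jobs. The helper names `namedPhase` and `buildJobRequest` are made up for the example, and the request is only described, not sent.

```javascript
// Builds one entry for the job's "query" list using a named Javascript
// function, e.g. {"map": {"language": "javascript", "name": ..., "keep": ...}}.
function namedPhase(type, name, { keep = false, arg } = {}) {
  if (type !== "map" && type !== "reduce") {
    throw new Error('phase type must be "map" or "reduce"');
  }
  const spec = { language: "javascript", name, keep };
  if (arg !== undefined) {
    spec.arg = arg; // static argument passed to the phase
  }
  return { [type]: spec };
}

// Describes the HTTP request that would submit the job.
function buildJobRequest(inputs, phases) {
  return {
    method: "POST",
    path: "/mapred",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ inputs, query: phases }),
  };
}

const request = buildJobRequest("stocks", [
  namedPhase("map", "App.extractTickers", { arg: "GOOG" }),
  namedPhase("reduce", "Riak.reduceMin", { keep: true }),
]);
console.log(request.body);
```

Only the final reduce phase sets keep to true here, so the intermediate map output would not be included in the response.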
Live Demo!
Thank You

     Kevin A. Smith
Email: ksmith@basho.com
  Twitter: @kevsmith

Embrace NoSQL and Eventual Consistency with Ripple
 
Adding Riak to your NoSQL Bag of Tricks
Adding Riak to your NoSQL Bag of TricksAdding Riak to your NoSQL Bag of Tricks
Adding Riak to your NoSQL Bag of Tricks
 
Rack
RackRack
Rack
 
Processing Large Graphs
Processing Large GraphsProcessing Large Graphs
Processing Large Graphs
 
Dynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresDynamo: Not Just For Datastores
Dynamo: Not Just For Datastores
 
Riak at The NYC Cloud Computing Meetup Group
Riak at The NYC Cloud Computing Meetup GroupRiak at The NYC Cloud Computing Meetup Group
Riak at The NYC Cloud Computing Meetup Group
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
 
When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?
 
Let's Get to the Rapids
Let's Get to the RapidsLet's Get to the Rapids
Let's Get to the Rapids
 
Riak at Engine Yard Cloud
Riak at Engine Yard CloudRiak at Engine Yard Cloud
Riak at Engine Yard Cloud
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
 
Java script basics
Java script basicsJava script basics
Java script basics
 
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDBScylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
Scylla Summit 2018: Scalable Stream Processing with KSQL, Kafka and ScyllaDB
 
AWS re:Invent 2016: Cross-Region Replication with Amazon DynamoDB Streams (DA...
AWS re:Invent 2016: Cross-Region Replication with Amazon DynamoDB Streams (DA...AWS re:Invent 2016: Cross-Region Replication with Amazon DynamoDB Streams (DA...
AWS re:Invent 2016: Cross-Region Replication with Amazon DynamoDB Streams (DA...
 
Rolling With Riak
Rolling With RiakRolling With Riak
Rolling With Riak
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache Spark
 
Distributed Search in Riak - Integrating Search in a NoSQL Database: Presente...
Distributed Search in Riak - Integrating Search in a NoSQL Database: Presente...Distributed Search in Riak - Integrating Search in a NoSQL Database: Presente...
Distributed Search in Riak - Integrating Search in a NoSQL Database: Presente...
 
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
2014-10-20 Large-Scale Machine Learning with Apache Spark at Internet of Thin...
 

Introducing Riak

  • 1. Introducing Riak Kevin A. Smith Senior Developer Basho Technologies
  • 5. What Is Riak? • Key/Value store • Document-oriented database • Web-shaped storage
  • 8. Key/Value Store • Data organized by bucket/key pairs • Simple REST API (GET, PUT, DELETE)
  • 12. Document Store • Store values as JSON • Many clients support automatic JSON encoding/decoding • Javascript Map/Reduce built on top of JSON
  • 17. Web-Shaped Storage • Content neutral • Highly distributed • Replicated • Fault-tolerant
  • 24. What Is Riak? A flexible storage engine with a REST API and map/reduce capability, designed to be fault-tolerant, distributed, and ops friendly
  • 28. Influences • CAP Theorem • Amazon’s Dynamo Paper • Experience running large networks (Akamai)
  • 34. CAP Theorem • Consistent: Reads and writes reflect a globally consistent system state • Available: System is available for reads and writes • Partition Tolerant: System can handle the failure of individual parts
  • 36. Common Wisdom Pick two.
  • 39. The Riak Way Pick Two. For each operation.
  • 43. Dynamo Influences • N = The number of replicas • R = The number of replicas needed for a successful read • W = The number of replicas needed for a successful write
  • 46. Dynamo Math • N - R = read fault tolerance • N - W = write fault tolerance
  • 51. Dynamo Math N = 4, W = 2, R = 1 • 4 - 2 = 2 hosts can be down and Riak can still perform writes. • 4 - 1 = 3 hosts can be down and Riak can still perform reads.
  • 55. Riak Improvements • N can vary per bucket • R and W can vary per operation • Choose your own fault tolerance/performance tradeoff
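  • 56. Consistent Hashing [Figure: hash ring running from 0 to 2^160, divided among node 0, node 1, node 2, and node 3, with marks at 2^160/4 and 2^160/2; hash(<<"artist">>,<<"REM">>) is placed on the ring]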
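  • 57. R value: get(<<"artist">>,<<"REM">>, R=2) with N=3 returns {ok, Object} [Figure: one of the three replicas is marked X]
  • 58. W value: put(<<"artist">>,<<"REM">>, W=2) with N=3 returns ok [Figure: one of the three replicas is marked X]
  • 59. N=10, R/W=2: get/put("artist", "REM", R/W=2) with N=10 returns {ok, Object} [Figure: eight of the ten replicas are marked X]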
  • 63. Resolving Conflicts • Riak focuses on the AP of CAP • Data could be briefly inconsistent • Inconsistency must be resolved
  • 64-68. Detecting & Resolving Conflicts [Figure sequence over nodes 0 to 3: Object v0 is stored and then appears on two nodes; one copy becomes v1 while the other is still v0; finally both copies are v1]
  • 72. Client Resolution • Can be set per-bucket or server-wide • Conflicting data is “bubbled up” to the client • Client picks the winner
  • 76. Server Resolution • “Last write wins” • Enabled by default • What most apps need 80% of the time
  • 81. Linking Objects • Objects can store pointers, or links, to other objects • Doesn’t have to be the same bucket • Object links described in a Link header
  • 82. Link Header Format </riak/demo/test1>; riaktag="userinfo" (object URL: /riak/demo/test1; link tag: "userinfo")
  • 86. Link Walking • Ask Riak to “walk” a sequence of links • Optionally, collect objects along the walk and return them • Can be arbitrarily deep
  • 89. Link Walking Examples /riak/demo/test1/_,_,1: Start walking at /demo/test1 and return all linked objects
  • 92. Link Walking Examples /riak/demo/test1/demo,_,1: Start walking at /demo/test1 and return all linked objects contained in the demo bucket
  • 95. Link Walking Examples /riak/demo/test1/_,_,0/_,_,1: Start walking at /demo/test1, find any linked objects, then find and return any objects linked to those
  • 98. Link Walking Examples /riak/demo/test1/_,child,0/_,_,1: Start walking at /demo/test1, find any linked objects with the link tag “child”, then find and return any objects linked to those
  • 103. Map/Reduce Terms • Phase: A step within a job • Job: A sequence of phases and inputs • Map: Data collection phase • Reduce: Data collation or processing phase
  • 108. Map/Reduce Overview • Map phases execute in parallel w/data locality • Reduce phases execute in parallel on the node where job was submitted • Results are not cached or stored • Phases can be written in Erlang or Javascript
  • 112. Map Phase • Inputs must be bucket/key pairs • Must return a list • Parallel results are aggregated into a single list
  • 121. Erlang Map Phase • Two types: modfun and qfun • modfuns reference the module and name of the Erlang function to call • qfuns are anonymous Erlang functions* *Must be on the server-side codepath
  • 128. Erlang Map Phase map_object_value(Obj, _KeyData, _Arg) -> [riak_object:get_value(Obj)]. • Obj: riak_object retrieved from bucket/key • KeyData: Static argument specified with the bucket/key • Arg: Static argument specified with the phase
  • 133. Erlang Map Built-Ins riak_mapreduce:map_object_value/3 • Returns object value wrapped in a list riak_mapreduce:map_object_value_list/3 • Returns object value. Object value must already be a list
  • 139. Javascript Map Phase • Two types: jsanon and jsfun • jsanons are anonymous JS functions: function(value) { return [value]; } • jsfuns are named JS functions: Riak.mapValuesJson
  • 143. Erlang & Javascript • Same environment as Firefox minus browser bits • Erlang to Javascript data is JSON encoded • Javascript to Erlang data is JSON decoded
  • 149. Javascript Map Phase function(value, keyData, arg) • value: JSON-encoded version of riak_object • keyData: Same as Erlang • arg: Same as Erlang
  • 154. Javascript Map Built-Ins Riak.mapValues • Returns object values. Handles detecting when/if to use list wrapping. Riak.mapValuesJson • Returns JSON parsed object values. Also performs list wrapping, if needed.
  • 158. Reduce Phase • Performed on the node coordinating the map/reduce job • Two processes per reduce phase to add minor parallelism • Must return a list
  • 165. Erlang Reduce Built-Ins riak_mapreduce:reduce_set_union/2 • Returns unique set of values riak_mapreduce:reduce_sum/2 • Returns the sum of inputs riak_mapreduce:reduce_sort/2 • Returns the sorted list of inputs
  • 172. Javascript Reduce Built-Ins Riak.reduceMin • Returns the minimum value of the input set Riak.reduceMax • Returns the maximum value of the input set Riak.reduceSort • Returns a sorted list of the input set
  • 177. Building M/R Job • Job is a list of phases and starting inputs • Each phase can: • Receive a static argument • Accumulate and return results
  • 182. Submitting Jobs via HTTP • Riak exposes M/R via its REST API • Job is described in JSON • Submitted via POST • Default URL is /mapred
  • 190. Erlang Phase (JSON) {Type:{"language":"erlang", "module": Module, "function": Function, "keep":Flag}} • Type: "map" or "reduce" • Module: String name of Erlang module • Function: String name of Erlang function • Flag: Boolean accumulation toggle
  • 197. Javascript Phase (JSON) {Type:{"language":"javascript", "source": Source, "keep":Flag}} • Type: "map" or "reduce" • Source: Source for anonymous function • Flag: Boolean accumulation toggle
  • 204. Javascript Phase (JSON) {Type:{"language":"javascript", "name":Name, "keep":Flag}} • Type: "map" or "reduce" • Name: String name of Javascript function • Flag: Boolean accumulation toggle
  • 205. Putting It Together {"inputs": [["stocks", "goog"]], "query": [{"map":{"language":"javascript", "name": "Riak.mapValuesJson", "keep": true}}]}
  • 206. Putting It Together {"inputs": [["stocks", "goog"], ["stocks", "csco"]], "query": [{"map":{"language":"javascript", "name": "Riak.mapValuesJson", "keep": true}}]}
  • 207. Putting It Together {"inputs": "stocks", "query": [{"map":{"language":"javascript", "name": "App.extractTickers", "arg": "GOOG", "keep": false}}, {"reduce":{"language":"javascript", "name": "Riak.reduceMin", "keep": true}}]}
  • 209. Thank You Kevin A. Smith Email: ksmith@basho.com Twitter: @kevsmith
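
The "Dynamo Math" slides (N - R for reads, N - W for writes) can be sketched in a few lines. This is a simplified model under stated assumptions: the function names `faultTolerance` and `canSatisfy` are made up for illustration and are not part of any Riak client, and the model only counts replicas without modelling how a cluster actually routes requests.

```javascript
// Sketch of the "Dynamo Math" slides: N replicas, R or W replies required.
// faultTolerance and canSatisfy are illustrative names, not a Riak API.
function faultTolerance({ n, r, w }) {
  if (r > n || w > n) throw new RangeError("R and W cannot exceed N");
  // N - R replicas may be down for reads, N - W for writes.
  return { read: n - r, write: n - w };
}

// A request can succeed while the surviving replicas still meet the quorum.
function canSatisfy(n, quorum, replicasDown) {
  return n - replicasDown >= quorum;
}

// The worked example from the slides: N = 4, W = 2, R = 1.
const slideExample = faultTolerance({ n: 4, r: 1, w: 2 });
console.log(slideExample); // { read: 3, write: 2 }

// The N=10, R/W=2 slide: eight replicas down, two still answer.
console.log(canSatisfy(10, 2, 8)); // true
// With N=3 and R=2, losing two replicas leaves too few to answer.
console.log(canSatisfy(3, 2, 2)); // false
```

Because R and W can vary per operation, the same check could be run with a different quorum for each request.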
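
The consistent hashing slide places hash(<<"artist">>,<<"REM">>) on a ring of size 2^160 shared by four nodes. The sketch below illustrates that idea in Node.js and makes several assumptions: the bucket and key are joined with "/" before hashing (a real cluster hashes a different byte encoding, so positions will not match), each node owns one equal arc as drawn on the slide, and `ringPosition` and `preferenceList` are hypothetical helper names.

```javascript
const crypto = require("crypto");

// The ring covers the SHA-1 output space, 0 .. 2^160 - 1, as on the diagram.
const RING_SIZE = 2n ** 160n;

// Assumption: bucket and key are joined with "/" purely for illustration.
function ringPosition(bucket, key) {
  const hex = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  return BigInt(`0x${hex}`);
}

// Split the ring into one equal arc per node and return N owners: the node
// whose arc contains the key, followed by the next nodes in ring order.
function preferenceList(bucket, key, nodes, n) {
  const arc = RING_SIZE / BigInt(nodes.length);
  const first = Number(ringPosition(bucket, key) / arc);
  const owners = [];
  for (let i = 0; i < Math.min(n, nodes.length); i += 1) {
    // The modulo wraps around the end of the ring back to the first node.
    owners.push(nodes[(first + i) % nodes.length]);
  }
  return owners;
}

const nodes = ["node 0", "node 1", "node 2", "node 3"];
console.log(preferenceList("artist", "REM", nodes, 3));
```

The point of the design is that the same bucket/key pair always hashes to the same ring position, so any node can compute where an object lives without a central lookup.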
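
The link walking examples all follow one pattern: a starting bucket and key, then one `bucket,tag,keep` segment per step, with `_` as the wildcard. A minimal sketch of building those paths follows. The helper name `linkWalkPath` and its option names are assumptions for illustration, and the meaning of the third field (1 returns that step's objects, 0 does not) is inferred from the slide descriptions.

```javascript
// Builds a link-walk path in the shape shown on the slides.
// linkWalkPath and its option names are illustrative, not a Riak client API.
function linkWalkPath(bucket, key, steps) {
  const segments = steps.map(({ bucket: b = "_", tag = "_", keep = false }) =>
    // Each step is "bucket,tag,keep"; "_" matches any bucket or tag.
    [encodeURIComponent(b), encodeURIComponent(tag), keep ? 1 : 0].join(",")
  );
  return ["/riak", encodeURIComponent(bucket), encodeURIComponent(key), ...segments].join("/");
}

// All linked objects.
console.log(linkWalkPath("demo", "test1", [{ keep: true }]));
// Only linked objects in the demo bucket.
console.log(linkWalkPath("demo", "test1", [{ bucket: "demo", keep: true }]));
// Follow "child" links, then return whatever those objects link to.
console.log(linkWalkPath("demo", "test1", [{ tag: "child" }, { keep: true }]));
```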
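
The map and reduce phase rules from the slides (a map function receives value, keyData, and arg and must return a list; parallel map results are aggregated into a single list; a reduce function must also return a list) can be imitated locally. This is only a single-process sketch: `runJob`, `extractClose`, and `reduceMin` are hypothetical stand-ins written for this example, and the input objects are plain values rather than the JSON-encoded riak_object that Riak passes to a real map function.

```javascript
// Runs phases in order, feeding each phase's output list to the next one.
// runJob is an illustrative stand-in, not how Riak schedules phases.
function runJob(inputs, phases) {
  let current = inputs;
  const kept = [];
  for (const { type, fn, arg, keep } of phases) {
    if (type === "map") {
      // One call per input; the returned lists are merged into one list.
      current = current.flatMap((input) => {
        const out = fn(input.value, input.keyData, arg);
        if (!Array.isArray(out)) throw new TypeError("map phase must return a list");
        return out;
      });
    } else {
      current = fn(current, arg);
      if (!Array.isArray(current)) throw new TypeError("reduce phase must return a list");
    }
    if (keep) kept.push(current); // "keep" accumulates this phase's results
  }
  return kept.length === 1 ? kept[0] : kept;
}

// Hypothetical map: emit the closing price when the ticker matches arg.
const extractClose = (value, keyData, arg) => (value.ticker === arg ? [value.close] : []);
// Hypothetical reduce: smallest input, wrapped in a list.
const reduceMin = (values) => (values.length ? [Math.min(...values)] : []);

const stocks = [
  { value: { ticker: "GOOG", close: 530.1 } },
  { value: { ticker: "CSCO", close: 24.5 } },
  { value: { ticker: "GOOG", close: 512.4 } },
];
const result = runJob(stocks, [
  { type: "map", fn: extractClose, arg: "GOOG", keep: false },
  { type: "reduce", fn: reduceMin, keep: true },
]);
console.log(result); // [ 512.4 ]
```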
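
A job description like the ones in "Putting It Together" can also be assembled programmatically before it is POSTed to /mapred. The sketch below only builds the JSON body and follows the phase template from the "Javascript Phase (JSON)" slides, with "keep" inside the phase object. The helper names `phase` and `buildJob` are assumptions, and no request is sent.

```javascript
// Builds one phase in the template shape {Type: {"language": ..., "keep": Flag}}.
// phase and buildJob are illustrative helpers, not part of a Riak client.
function phase(type, spec, keep = false) {
  if (type !== "map" && type !== "reduce") {
    throw new TypeError('phase type must be "map" or "reduce"');
  }
  return { [type]: { language: "javascript", ...spec, keep } };
}

// A job is its starting inputs plus an ordered list of phases.
function buildJob(inputs, phases) {
  return { inputs, query: phases };
}

// The whole "stocks" bucket as input, as on the last example slide.
const job = buildJob("stocks", [
  phase("map", { name: "App.extractTickers", arg: "GOOG" }),
  phase("reduce", { name: "Riak.reduceMin" }, true),
]);
console.log(JSON.stringify(job, null, 2));
```

Here only the reduce phase sets keep to true, so the job would return the reduce output and discard the intermediate map results.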

Editor's Notes