Hadoop Use Cases at Salesforce.com

Narayan Bharadwaj
Director, Product Management, Monitoring & Big Data
Salesforce.com
@nadubharadwaj
  
Safe harbor

Safe harbor statement under the Private Securities Litigation Reform Act of 1995:

This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.

The risks and uncertainties referred to above include, but are not limited to, risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our quarterly report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.

Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
  
Agenda

•  Technology
•  Big Data use cases
•  Use case discussion
•  Q&A
  
Got “Cloud Data”?

•  130k customers
•  1 billion transactions/day
•  Millions of users
•  Terabytes/day
  
Technology

Big Data Ecosystem

[Figure: logos of the ecosystem components, including Phoenix and Oozie]
  
Phoenix

“We put the SQL back in NoSQL”

•  SQL layer on HBase
•  Seamless application integration
     –  Standard JDBC interface
     –  DDL statement support
•  Low query latency
     –  SQL query → multiple HBase scans
     –  Co-processors, custom filters
     –  Milliseconds for small queries
     –  Seconds for tens of millions of rows
•  https://github.com/forcedotcom/phoenix
  
Contributions

@pRaShAnT1784: Prashant Kommireddi
Lars Hofhansl
@thefutureian: Ian Varley
  
Data Science tools ecosystem

[Figure: logos of the data science tools, including Apache Pig]
  
Big Data Use Cases

•  Product Metrics
•  User behavior analysis
•  Capacity planning
•  Monitoring intelligence
•  Collections
•  Query Runtime Prediction
•  Early Warning System
•  Collaborative Filtering
•  Search Relevancy

Legend: Internal App | Product feature
  
Product Metrics

Product Metrics – Problem Statement

•  Track feature usage/adoption across 130k+ customers
     –  E.g.: Accounts, Contacts, Visualforce, Apex, …
•  Track standard metrics across all features
     –  E.g.: #Requests, #UniqueOrgs, #UniqueUsers, AvgResponseTime, …
•  Track features and metrics across all channels
     –  API, UI, Mobile
•  Primary audience: Executives, Product Managers
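
As a rough illustration of the standard metrics listed above, the sketch below aggregates a handful of request-log records per feature. The record fields (`feature`, `org_id`, `user_id`, `response_ms`) and the sample values are assumptions made up for this example, not the actual Salesforce log schema, and a real job at this scale would run on Hadoop rather than in memory.

```python
from collections import defaultdict

# Hypothetical log records; field names and values are illustrative only.
LOG_RECORDS = [
    {"feature": "Accounts", "org_id": "org1", "user_id": "u1", "response_ms": 120},
    {"feature": "Accounts", "org_id": "org1", "user_id": "u2", "response_ms": 80},
    {"feature": "Accounts", "org_id": "org2", "user_id": "u3", "response_ms": 100},
    {"feature": "Apex", "org_id": "org2", "user_id": "u3", "response_ms": 300},
]


def feature_metrics(records):
    """Return {feature: metrics dict} with the four standard metrics."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["feature"]].append(record)

    metrics = {}
    for feature, rows in grouped.items():
        metrics[feature] = {
            "requests": len(rows),
            "unique_orgs": len({r["org_id"] for r in rows}),
            "unique_users": len({r["user_id"] for r in rows}),
            "avg_response_ms": sum(r["response_ms"] for r in rows) / len(rows),
        }
    return metrics


METRICS = feature_metrics(LOG_RECORDS)
for name, values in sorted(METRICS.items()):
    print(name, values)
```

Adding a `channel` field (API, UI, Mobile) to the grouping key would give the per-channel breakdown mentioned in the third bullet.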
  
Product Metrics Pipeline

[Figure: pipeline diagram. Labeled components: User Input (Page Layout); Collaboration (Chatter); Reports, Dashboards; Workflow; Formula Fields; Feature Metrics (Custom Object); Trend Metrics (Custom Object); API; Client Machine (Java Program, Pig script generator); Workflow; Log Pull; Hadoop; Log Files]
  
Visualization (Reports & Dashboards)

[Figure: report and dashboard screenshots. Note: Feature Names are not displayed]

Collaborate, Iterate (Chatter)
  
User Behavior Analysis

Problem Statement

§  How do we reduce the number of clicks on the user interface?
§  What are the top user click path sequences?
§  What are the user clusters/personas?

•  Approach:
  •  Markov transitions for click paths, D3.js visuals
  •  K-means (unsupervised) clustering for user groups
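
To make the click-path approach concrete, here is a minimal sketch of estimating first-order Markov transition probabilities from click sequences. The page names and sessions are invented for illustration, and the deck does not say how the actual analysis was implemented.

```python
from collections import Counter, defaultdict

# Hypothetical click paths, one list of page names per user session.
SESSIONS = [
    ["Home", "Setup", "Users", "Profiles"],
    ["Home", "Setup", "Users"],
    ["Home", "Reports"],
    ["Setup", "Users", "Profiles"],
]


def transition_probabilities(sessions):
    """Estimate P(next page | current page) from consecutive clicks."""
    counts = defaultdict(Counter)
    for path in sessions:
        # Pair each page with the page clicked immediately after it.
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1

    probabilities = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probabilities[current] = {
            page: count / total for page, count in followers.items()
        }
    return probabilities


TRANSITIONS = transition_probabilities(SESSIONS)
for page, row in TRANSITIONS.items():
    print(page, row)
```

Each row of the result sums to 1, so it can be read as one row of a transition matrix and drawn as weighted edges between pages.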
  
Markov Transitions for "Setup" pages

[Figure. Note: Based on an internal Salesforce org]

K-means clustering of "Setup" pages

[Figure. Note: Based on an internal Salesforce org]
  
Collaborative Filtering

Collaborative Filtering – Problem Statement

•  Show similar files within an organization
     –  Content-based approach
     –  Community-based approach
  
Popular File

Related File

We found this relationship using item-to-item collaborative filtering

•  Amazon published this algorithm in 2003.
     –  Amazon.com Recommendations: Item-to-Item Collaborative Filtering, by Gregory Linden, Brent Smith, and Jeremy York. IEEE Internet Computing, January-February 2003.
•  At Salesforce, we adapted this algorithm for Hadoop, and we use it to recommend files to view and users to follow.
  
Example: CF on 5 files

Files: Annual Report, Vision Statement, Dilbert Comic, Darth Vader Cartoon, Disk Usage Report
  
View History Table

                 Annual    Vision       Dilbert    Darth Vader   Disk Usage
                 Report    Statement    Cartoon    Cartoon       Report
Miranda (CEO)    1         1            1          0             0
Bob (CFO)        1         1            1          0             0
Susan (Sales)    0         1            1          1             0
Chun (Sales)     0         0            1          1             0
Alice (IT)       0         0            1          1             1
  
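Relationships between the files

[Figure: graph with one node per file (Annual Report, Vision Statement, Dilbert Cartoon, Darth Vader Cartoon, Disk Usage Report) and an edge between each pair of files]

Relationships between the files

[Figure: the same graph, with each edge labeled by the number of users who viewed both files]

Annual Report / Vision Statement: 2
Annual Report / Dilbert Cartoon: 2
Vision Statement / Dilbert Cartoon: 3
Dilbert Cartoon / Darth Vader Cartoon: 3
Vision Statement / Darth Vader Cartoon: 1
Darth Vader Cartoon / Disk Usage Report: 1
Dilbert Cartoon / Disk Usage Report: 1
Annual Report / Darth Vader Cartoon: 0
Annual Report / Disk Usage Report: 0
Vision Statement / Disk Usage Report: 0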
  
Sorted relationships for each file

Annual Report       Vision Statement    Dilbert Cartoon     Darth Vader Cartoon   Disk Usage Report
Dilbert (2)         Dilbert (3)         Vision Stmt. (3)    Dilbert (3)           Dilbert (1)
Vision Stmt. (2)    Annual Rpt. (2)     Darth Vader (3)     Vision Stmt. (1)      Darth Vader (1)
                    Darth Vader (1)     Annual Rpt. (2)     Disk Usage (1)
                                        Disk Usage (1)

The popularity problem: notice that Dilbert appears first in every other file's list. This is probably not what we want.

The solution: divide the relationship tallies by file popularities.
  
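Normalized relationships between the files

[Figure: the same graph, with each edge labeled by its normalized relationship score]

Annual Report / Vision Statement: .82
Annual Report / Dilbert Cartoon: .63
Vision Statement / Dilbert Cartoon: .77
Dilbert Cartoon / Darth Vader Cartoon: .77
Vision Statement / Darth Vader Cartoon: .33
Darth Vader Cartoon / Disk Usage Report: .58
Dilbert Cartoon / Disk Usage Report: .45
Annual Report / Darth Vader Cartoon: 0
Annual Report / Disk Usage Report: 0
Vision Statement / Disk Usage Report: 0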
  
Sorted relationships for each file, normalized by file popularities

Annual Report         Vision Statement       Dilbert Cartoon        Darth Vader Cartoon   Disk Usage Report
Vision Stmt. (.82)    Annual Report (.82)    Darth Vader (.77)      Dilbert (.77)         Darth Vader (.58)
Dilbert (.63)         Dilbert (.77)          Vision Stmt. (.77)     Disk Usage (.58)      Dilbert (.45)
                      Darth Vader (.33)      Annual Report (.63)    Vision Stmt. (.33)
                                             Disk Usage (.45)

High relationship tallies AND similar popularity values now drive closeness.
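
The normalized values on this slide can be reproduced from the view history table if each tally is divided by the square root of the product of the two files' popularities, which is what the sqrt(3/5) in Example 2b and the cosine formula in the appendix suggest. The following is a small in-memory sketch under that assumption; the shortened file names and the helper names are chosen here for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# View history from the example table: user -> files viewed.
VIEWS = {
    "Miranda": ["Annual Report", "Vision Statement", "Dilbert"],
    "Bob": ["Annual Report", "Vision Statement", "Dilbert"],
    "Susan": ["Vision Statement", "Dilbert", "Darth Vader"],
    "Chun": ["Dilbert", "Darth Vader"],
    "Alice": ["Dilbert", "Darth Vader", "Disk Usage"],
}

# Popularity: number of users who viewed each file.
popularity = Counter(f for files in VIEWS.values() for f in files)

# Tally: number of users who viewed both files of a pair.
tallies = Counter(
    pair for files in VIEWS.values() for pair in combinations(sorted(files), 2)
)


def similarity(file_a, file_b):
    """Tally divided by sqrt(popularity_a * popularity_b)."""
    pair = tuple(sorted((file_a, file_b)))
    return tallies[pair] / sqrt(popularity[file_a] * popularity[file_b])


def ranked(file_name):
    """Other files with a nonzero score, most similar first."""
    others = [f for f in popularity if f != file_name]
    scored = [(f, round(similarity(file_name, f), 2)) for f in others]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: (-s[1], s[0]))


for name in popularity:
    print(name, ranked(name))
```

Dilbert still has the highest raw tallies, but its popularity of 5 now lowers its score against less popular files, so it no longer leads the Annual Report and Disk Usage lists.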
  
The item-to-item CF algorithm

  1)  Compute file popularities
  2)  Compute relationship tallies and divide by file popularities
  3)  Sort and store the results
  
MapReduce Overview

Map → Shuffle → Reduce

[Figure: MapReduce data flow (adapted from http://code.google.com/p/mapreduce-framework/wiki/MapReduce)]
  
1. Compute File Popularities

<user, file>
    ↓ Inverse identity map
<file, List<user>>
    ↓ Reduce
<file, (user count)>

Result is a table of (file, popularity) pairs that you store in the Hadoop distributed cache.
  
Example: File popularity for Dilbert

(Miranda, Dilbert), (Bob, Dilbert), (Susan, Dilbert), (Chun, Dilbert), (Alice, Dilbert)
    ↓ Inverse identity map
<Dilbert, {Miranda, Bob, Susan, Chun, Alice}>
    ↓ Reduce
(Dilbert, 5)
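
Step 1 can be imitated locally with a few lines of Python. This is only a single-process stand-in for a Hadoop job: the `shuffle` helper and the function names are assumptions of this sketch, not part of any Hadoop API.

```python
from collections import defaultdict

# (user, file) view records: the Dilbert example plus one Disk Usage view.
VIEW_RECORDS = [
    ("Miranda", "Dilbert"), ("Bob", "Dilbert"), ("Susan", "Dilbert"),
    ("Chun", "Dilbert"), ("Alice", "Dilbert"), ("Alice", "Disk Usage"),
]


def inverse_identity_map(user, file_name):
    """Swap key and value so records are grouped by file."""
    yield file_name, user


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def popularity_reduce(file_name, users):
    """Count the distinct users who viewed the file."""
    return file_name, len(set(users))


mapped = [kv for record in VIEW_RECORDS for kv in inverse_identity_map(*record)]
POPULARITY = dict(
    popularity_reduce(f, users) for f, users in shuffle(mapped).items()
)
print(POPULARITY)
```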
  
2a. Compute relationship tallies - find all relationships in the view history table

<user, file>
    ↓ Identity map
<user, List<file>>
    ↓ Reduce
<(file1, file2), Integer(1)>,
<(file1, file3), Integer(1)>,
…
<(file(n-1), file(n)), Integer(1)>

Relationships have their file IDs in alphabetical order to avoid double counting.
  
Example 2a: Miranda’s (CEO) file relationship votes

(Miranda, Annual Report), (Miranda, Vision Statement), (Miranda, Dilbert)
    ↓ Identity map
<Miranda, {Annual Report, Vision Statement, Dilbert}>
    ↓ Reduce
<(Annual Report, Dilbert), Integer(1)>,
<(Annual Report, Vision Statement), Integer(1)>,
<(Dilbert, Vision Statement), Integer(1)>
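
The reduce side of step 2a amounts to emitting every unordered pair of files a user has viewed, with the names sorted so that (A, B) and (B, A) collapse into one key. A minimal local sketch follows; the function name is illustrative and the code is plain Python rather than a Hadoop reducer.

```python
from itertools import combinations


def relationship_votes(user, files):
    """Reducer for step 2a: emit ((file1, file2), 1) for each pair of files.

    The user key is not needed for the output; it only groups the input.
    Sorting the file names first keeps each pair in alphabetical order,
    so the same relationship always produces the same key.
    """
    for pair in combinations(sorted(set(files)), 2):
        yield pair, 1


MIRANDA_VOTES = list(
    relationship_votes("Miranda", ["Annual Report", "Vision Statement", "Dilbert"])
)
for vote in MIRANDA_VOTES:
    print(vote)
```

A user who viewed n files emits n(n-1)/2 votes, which is where the quadratic term in the worst-case cost noted in the appendix comes from.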
  
2b. Tally the relationship votes - just a word count, where each relationship occurrence is a word

<(file1, file2), Integer(1)>
    ↓ Identity map
<(file1, file2), List<Integer(1)>>
    ↓ Reduce: count and divide by popularities
<file1, (file2, similarity score)>, <file2, (file1, similarity score)>

Note that we emit each result twice, once for each file that belongs to a relationship.
  
Example 2b: the Dilbert/Darth Vader relationship

<(Dilbert, Vader), Integer(1)>,
<(Dilbert, Vader), Integer(1)>,
<(Dilbert, Vader), Integer(1)>
    ↓ Identity map
<(Dilbert, Vader), {1, 1, 1}>
    ↓ Reduce: count and divide by popularities
<Dilbert, (Vader, sqrt(3/5))>, <Vader, (Dilbert, sqrt(3/5))>
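
Below is a local sketch of the step 2b reducer, assuming the popularity table from step 1 is available as a plain dictionary (standing in for the distributed cache). The division used here, tally / sqrt(popularity1 × popularity2), is inferred from the sqrt(3/5) in the example rather than spelled out on the slide.

```python
from math import sqrt

# Popularities from step 1 for the two files in this example.
POPULARITY = {"Dilbert": 5, "Vader": 3}


def similarity_reduce(pair, votes, popularity):
    """Sum the votes for a pair, normalize, and emit one record per file."""
    file1, file2 = pair
    tally = sum(votes)
    score = tally / sqrt(popularity[file1] * popularity[file2])
    # Emit the score under both files so step 3 can group by either one.
    yield file1, (file2, score)
    yield file2, (file1, score)


RESULTS = list(similarity_reduce(("Dilbert", "Vader"), [1, 1, 1], POPULARITY))
for key, value in RESULTS:
    print(key, value)
```

With a tally of 3 and popularities of 5 and 3, the score is 3 / sqrt(15), which equals sqrt(3/5), roughly 0.77.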
  
3. Sort and store results

<file1, (file2, similarity score)>
    ↓ Identity map
<file1, List<(file2, similarity score)>>
    ↓ Reduce
<file1, {top n similar files}>

Store the results in your location of choice
  
Example 3: Sorting the results for Dilbert

<Dilbert, (Annual Report, .63)>,
<Dilbert, (Vision Statement, .77)>,
<Dilbert, (Disk Usage, .45)>,
<Dilbert, (Darth Vader, .77)>
    ↓ Identity map
<Dilbert, {(Annual Report, .63), (Vision Statement, .77), (Disk Usage, .45), (Darth Vader, .77)}>
    ↓ Reduce
<Dilbert, {Darth Vader, Vision Statement}> (Top 2 files)

Store results
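
The final reduce is a per-file sort followed by a cut-off. Here is a minimal sketch, again in plain Python with an illustrative function name; breaking ties by file name is a choice made for this example so that the output is deterministic, not something the slides specify.

```python
def top_n_reduce(file_name, scored_files, n=2):
    """Keep the n most similar files, highest score first."""
    # Sort by descending score, then by name so that ties are stable.
    best = sorted(scored_files, key=lambda item: (-item[1], item[0]))[:n]
    return file_name, [name for name, _score in best]


DILBERT_SCORES = [
    ("Annual Report", 0.63),
    ("Vision Statement", 0.77),
    ("Disk Usage", 0.45),
    ("Darth Vader", 0.77),
]
TOP = top_n_reduce("Dilbert", DILBERT_SCORES)
print(TOP)
```

For very long candidate lists, a bounded heap would avoid sorting the whole list, but the result would be the same.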
  
Appendix

•  Cosine formula and normalization trick to avoid the distributed cache

\[
\cos\theta_{AB} = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{A}{\|A\|} \cdot \frac{B}{\|B\|}
\]

•  Mahout has CF
•  Asymptotic order of the algorithm is O(M*N^2) in worst case, but is helped by sparsity.
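
For the view history example, the cosine formula reduces to the tally-and-popularity arithmetic of step 2b. The short derivation below assumes, which the slides do not state explicitly, that each file is represented by a 0/1 vector with one entry per user.

```latex
\begin{align*}
A \cdot B &= \sum_{u} a_u b_u = \text{tally}(A, B)
  && \text{(users who viewed both files)} \\
\|A\| &= \sqrt{\sum_{u} a_u^2} = \sqrt{\text{popularity}(A)}
  && \text{(since } a_u \in \{0, 1\}\text{)} \\
\cos\theta_{AB} &= \frac{A \cdot B}{\|A\|\,\|B\|}
  = \frac{\text{tally}(A, B)}{\sqrt{\text{popularity}(A)\,\text{popularity}(B)}} \\
\cos\theta_{\text{Dilbert},\,\text{Vader}} &= \frac{3}{\sqrt{5 \cdot 3}}
  = \sqrt{\tfrac{3}{5}} \approx 0.77
\end{align*}
```

The second form of the formula, the dot product of A/‖A‖ and B/‖B‖, is presumably the normalization trick: if each file's vector is scaled by its own length before pairs are formed, the pairwise products already carry the normalization, and the reducer needs no popularity lookup.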
  
Narayan Bharadwaj
Monitoring, Big Data @salesforce
@nadubharadwaj
  
Hadoop Summit San Diego Feb2013

Weitere ähnliche Inhalte

Was ist angesagt?

SAP HANA SPS09 - HANA Modeling
SAP HANA SPS09 - HANA ModelingSAP HANA SPS09 - HANA Modeling
SAP HANA SPS09 - HANA ModelingSAP Technology
 
HANA SPS07 Modeling Enhancements
HANA SPS07 Modeling EnhancementsHANA SPS07 Modeling Enhancements
HANA SPS07 Modeling EnhancementsSAP Technology
 
DMM161 HANA_MODELING_2015
DMM161 HANA_MODELING_2015DMM161 HANA_MODELING_2015
DMM161 HANA_MODELING_2015Luc Vanrobays
 
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f3
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f330fab7f5 f95f-2d10-8ba7-8edb4d69b9f3
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f3Yogeeswar Reddy
 
Attain Superior Sales Performance Through Insight Driven Oracle Sales Analytics
Attain Superior Sales Performance Through Insight Driven Oracle Sales AnalyticsAttain Superior Sales Performance Through Insight Driven Oracle Sales Analytics
Attain Superior Sales Performance Through Insight Driven Oracle Sales AnalyticsJerome Leonard
 
SAP HANA SPS10- SAP HANA Development Tools
SAP HANA SPS10- SAP HANA Development ToolsSAP HANA SPS10- SAP HANA Development Tools
SAP HANA SPS10- SAP HANA Development ToolsSAP Technology
 
What's New in SAP HANA View Modeling
What's New in SAP HANA View ModelingWhat's New in SAP HANA View Modeling
What's New in SAP HANA View ModelingSAP Technology
 
SAP HANA SPS09 - SAP HANA Core & SQL
SAP HANA SPS09 - SAP HANA Core & SQLSAP HANA SPS09 - SAP HANA Core & SQL
SAP HANA SPS09 - SAP HANA Core & SQLSAP Technology
 
Reporting _ Paul Vella _ OBI Analytics for JDE.pdf
Reporting _ Paul Vella _ OBI Analytics for JDE.pdfReporting _ Paul Vella _ OBI Analytics for JDE.pdf
Reporting _ Paul Vella _ OBI Analytics for JDE.pdfInSync2011
 
SAP HANA SPS10- Text Analysis & Text Mining
SAP HANA SPS10- Text Analysis & Text MiningSAP HANA SPS10- Text Analysis & Text Mining
SAP HANA SPS10- Text Analysis & Text MiningSAP Technology
 

Was ist angesagt? (17)

SAP HANA SPS09 - HANA Modeling
SAP HANA SPS09 - HANA ModelingSAP HANA SPS09 - HANA Modeling
SAP HANA SPS09 - HANA Modeling
 
EA261_2015
EA261_2015EA261_2015
EA261_2015
 
HANA SPS07 Modeling Enhancements
HANA SPS07 Modeling EnhancementsHANA SPS07 Modeling Enhancements
HANA SPS07 Modeling Enhancements
 
SAP_HANA_FAQ
SAP_HANA_FAQSAP_HANA_FAQ
SAP_HANA_FAQ
 
DMM161 HANA_MODELING_2015
DMM161 HANA_MODELING_2015DMM161 HANA_MODELING_2015
DMM161 HANA_MODELING_2015
 
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f3
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f330fab7f5 f95f-2d10-8ba7-8edb4d69b9f3
30fab7f5 f95f-2d10-8ba7-8edb4d69b9f3
 
SAP Runs SAP Mobile
SAP Runs SAP MobileSAP Runs SAP Mobile
SAP Runs SAP Mobile
 
SAP NetWeaver Gateway - RFC & BOR Generators
SAP NetWeaver Gateway - RFC & BOR GeneratorsSAP NetWeaver Gateway - RFC & BOR Generators
SAP NetWeaver Gateway - RFC & BOR Generators
 
101 ab 1600-1630
101 ab 1600-1630101 ab 1600-1630
101 ab 1600-1630
 
Attain Superior Sales Performance Through Insight Driven Oracle Sales Analytics
Attain Superior Sales Performance Through Insight Driven Oracle Sales AnalyticsAttain Superior Sales Performance Through Insight Driven Oracle Sales Analytics
Attain Superior Sales Performance Through Insight Driven Oracle Sales Analytics
 
SAP HANA SPS10- SAP HANA Development Tools
SAP HANA SPS10- SAP HANA Development ToolsSAP HANA SPS10- SAP HANA Development Tools
SAP HANA SPS10- SAP HANA Development Tools
 
TZH300_EN_COL96
TZH300_EN_COL96TZH300_EN_COL96
TZH300_EN_COL96
 
What's New in SAP HANA View Modeling
What's New in SAP HANA View ModelingWhat's New in SAP HANA View Modeling
What's New in SAP HANA View Modeling
 
SAP NetWeaver Gateway - Gateway Service Consumption
SAP NetWeaver Gateway - Gateway Service Consumption SAP NetWeaver Gateway - Gateway Service Consumption
SAP NetWeaver Gateway - Gateway Service Consumption
 
SAP HANA SPS09 - SAP HANA Core & SQL
SAP HANA SPS09 - SAP HANA Core & SQLSAP HANA SPS09 - SAP HANA Core & SQL
SAP HANA SPS09 - SAP HANA Core & SQL
 
Reporting _ Paul Vella _ OBI Analytics for JDE.pdf
Reporting _ Paul Vella _ OBI Analytics for JDE.pdfReporting _ Paul Vella _ OBI Analytics for JDE.pdf
Reporting _ Paul Vella _ OBI Analytics for JDE.pdf
 
SAP HANA SPS10- Text Analysis & Text Mining
SAP HANA SPS10- Text Analysis & Text MiningSAP HANA SPS10- Text Analysis & Text Mining
SAP HANA SPS10- Text Analysis & Text Mining
 

Ähnlich wie Hadoop Summit San Diego Feb2013

How Salesforce.com uses Hadoop
How Salesforce.com uses HadoopHow Salesforce.com uses Hadoop
How Salesforce.com uses HadoopNarayan Bharadwaj
 
Sybase Complex Event Processing
Sybase Complex Event ProcessingSybase Complex Event Processing
Sybase Complex Event ProcessingSybase Türkiye
 
Performance Monitoring and Testing in the Salesforce Cloud
Performance Monitoring and Testing in the Salesforce CloudPerformance Monitoring and Testing in the Salesforce Cloud




Hadoop Summit San Diego Feb2013

  • 1. Hadoop Use Cases at Salesforce.com
      Narayan Bharadwaj, Director, Product Management, Monitoring & Big Data, Salesforce.com
      @nadubharadwaj
  • 2. Safe harbor
      Safe harbor statement under the Private Securities Litigation Reform Act of 1995:

      This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.

      The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. This document and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.

      Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
  • 3. Agenda
      •  Technology
      •  Big Data use cases
      •  Use case discussion
      •  Q&A
  • 4. Got “Cloud Data”?
      •  130k customers
      •  1 billion transactions/day
      •  Millions of users
      •  Terabytes/day
  • 6. Big Data Ecosystem: Phoenix, Oozie
  • 7. Phoenix: “We put the SQL back in NoSQL”
      •  SQL layer on HBase
      •  Seamless application integration
          –  Standard JDBC interface
          –  DDL statement support
      •  Low query latency
          –  SQL query → multiple HBase scans
          –  Co-processors, custom filters
          –  Milliseconds for small queries
          –  Seconds for tens of millions of rows
      •  https://github.com/forcedotcom/phoenix
  • 8. Contributions
      •  @pRaShAnT1784: Prashant Kommireddi
      •  Lars Hofhansl
      •  @thefutureian: Ian Varley
  • 9. Data Science tools ecosystem: Apache Pig
  • 10. Big Data Use Cases
      •  Product Metrics
      •  User behavior analysis
      •  Capacity planning
      •  Monitoring intelligence
      •  Collections
      •  Query Runtime Prediction
      •  Early Warning System
      •  Collaborative Filtering
      •  Search Relevancy
      Legend: Internal App, Product feature
  • 12. Product Metrics – Problem Statement
      •  Track feature usage/adoption across 130k+ customers
          –  E.g., Accounts, Contacts, Visualforce, Apex, …
      •  Track standard metrics across all features
          –  E.g., #Requests, #UniqueOrgs, #UniqueUsers, AvgResponseTime, …
      •  Track features and metrics across all channels
          –  API, UI, Mobile
      •  Primary audience: Executives, Product Managers
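The standard metrics named on this slide can be illustrated with a small aggregation. This is only a sketch in plain Python: the deck says the production pipeline uses generated Pig scripts on Hadoop, and the record layout below (feature, channel, org, user, response time) is an assumption made for illustration, not the actual Salesforce log schema.

```python
from collections import defaultdict

# Hypothetical log records: (feature, channel, org_id, user_id, response_ms).
# The real log format is not given in the slides.
LOGS = [
    ("Accounts", "UI", "org1", "u1", 120),
    ("Accounts", "API", "org1", "u2", 80),
    ("Accounts", "UI", "org2", "u3", 100),
    ("Contacts", "Mobile", "org2", "u3", 60),
]

def feature_metrics(logs):
    """Aggregate the slide's standard metrics per (feature, channel)."""
    groups = defaultdict(list)
    for feature, channel, org, user, ms in logs:
        groups[(feature, channel)].append((org, user, ms))
    metrics = {}
    for key, rows in groups.items():
        metrics[key] = {
            "requests": len(rows),                                # #Requests
            "unique_orgs": len({org for org, _, _ in rows}),      # #UniqueOrgs
            "unique_users": len({user for _, user, _ in rows}),   # #UniqueUsers
            "avg_response_ms": sum(ms for _, _, ms in rows) / len(rows),
        }
    return metrics

METRICS = feature_metrics(LOGS)
```

Grouping by (feature, channel) keeps the API, UI, and Mobile channels separate, which is one way to satisfy the "across all channels" requirement.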
  • 13. Product Metrics Pipeline (diagram). Components shown: User Input (Page Layout); Collaboration (Chatter); Reports, Dashboards; Workflow; Formula Fields; Feature Metrics (Custom Object); Trend Metrics (Custom Object); API; Client Machine; Java Program; Pig script generator; Workflow; Log Pull; Hadoop; Log Files
  • 14. Visualization (Reports & Dashboards). Note: Feature names are not displayed
  • 15. Visualization (Reports & Dashboards)
  • 18. Problem Statement
      •  How do we reduce the number of clicks on the user interface?
      •  What are the top user click path sequences?
      •  What are the user clusters/personas?
      •  Approach:
          –  Markov transitions for click paths, D3.js visuals
          –  K-means (unsupervised) clustering for user groups
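To make the Markov-transition approach concrete, here is a minimal sketch that estimates first-order transition probabilities from click paths. The session data and page names are hypothetical, and the slides do not say how the actual model was estimated; this shows only the standard count-and-normalize estimate.

```python
from collections import Counter, defaultdict

def transition_probabilities(sessions):
    """Estimate first-order Markov transition probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count each consecutive (current page, next page) pair in the click path.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in counts.items()
    }

# Hypothetical click paths through "Setup" pages.
SESSIONS = [
    ["Home", "Setup", "Users"],
    ["Home", "Setup", "Profiles"],
    ["Home", "Setup", "Users", "Profiles"],
]
P = transition_probabilities(SESSIONS)
```

Here `P["Setup"]` gives the share of clicks leaving the Setup page that go to each next page, which is the kind of quantity a click-path visualization would draw as edge weights.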
  • 19. Markov Transitions for "Setup" pages. Note: Based on an internal Salesforce org
  • 20. K-means clustering of "Setup" pages. Note: Based on an internal Salesforce org
  • 22. Collaborative Filtering – Problem Statement
      •  Show similar files within an organization
          –  Content-based approach
          –  Community-based approach
  • 25. We found this relationship using item-to-item collaborative filtering
      •  Amazon published this algorithm in 2003.
          –  Amazon.com Recommendations: Item-to-Item Collaborative Filtering, by Gregory Linden, Brent Smith, and Jeremy York. IEEE Internet Computing, January-February 2003.
      •  At Salesforce, we adapted this algorithm for Hadoop, and we use it to recommend files to view and users to follow.
  • 26. Example: CF on 5 files
      •  Vision Statement
      •  Annual Report
      •  Dilbert Comic
      •  Darth Vader Cartoon
      •  Disk Usage Report
  • 27. View History Table
                         Annual    Vision      Dilbert    Darth Vader   Disk Usage
                         Report    Statement   Cartoon    Cartoon       Report
      Miranda (CEO)      1         1           1          0             0
      Bob (CFO)          1         1           1          0             0
      Susan (Sales)      0         1           1          1             0
      Chun (Sales)       0         0           1          1             0
      Alice (IT)         0         0           1          1             1
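  • 28. Relationships between the files (graph). Nodes: Annual Report, Vision Statement, Darth Vader Cartoon, Dilbert Cartoon, Disk Usage Report
  • 29. Relationships between the files (graph with edge tallies: the number of users who viewed both files)
                             Annual    Vision      Dilbert    Darth Vader   Disk Usage
                             Report    Statement   Cartoon    Cartoon       Report
      Annual Report          -         2           2          0             0
      Vision Statement       2         -           3          1             0
      Dilbert Cartoon        2         3           -          3             1
      Darth Vader Cartoon    0         1           3          -             1
      Disk Usage Report      0         0           1          1             -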
  • 30. Sorted relationships for each file
      Annual Report      Vision Statement   Dilbert Cartoon    Darth Vader Cartoon   Disk Usage Report
      Dilbert (2)        Dilbert (3)        Vision Stmt. (3)   Dilbert (3)           Dilbert (1)
      Vision Stmt. (2)   Annual Rpt. (2)    Darth Vader (3)    Vision Stmt. (1)      Darth Vader (1)
                         Darth Vader (1)    Annual Rpt. (2)    Disk Usage (1)
                                            Disk Usage (1)
      The popularity problem: notice that Dilbert appears first in every other file's list. This is probably not what we want.
      The solution: divide the relationship tallies by file popularities.
  • 31. Normalized relationships between the files (graph with normalized edge weights)
                             Annual    Vision      Dilbert    Darth Vader   Disk Usage
                             Report    Statement   Cartoon    Cartoon       Report
      Annual Report          -         .82         .63        0             0
      Vision Statement       .82       -           .77        .33           0
      Dilbert Cartoon        .63       .77         -          .77           .45
      Darth Vader Cartoon    0         .33         .77        -             .58
      Disk Usage Report      0         0           .45        .58           -
  • 32. Sorted relationships for each file, normalized by file popularities
      Annual Report        Vision Statement      Dilbert Cartoon        Darth Vader Cartoon   Disk Usage Report
      Vision Stmt. (.82)   Annual Report (.82)   Darth Vader (.77)      Dilbert (.77)         Darth Vader (.58)
      Dilbert (.63)        Dilbert (.77)         Vision Stmt. (.77)     Disk Usage (.58)      Dilbert (.45)
                           Darth Vader (.33)     Annual Report (.63)    Vision Stmt. (.33)
                                                 Disk Usage (.45)
      High relationship tallies AND similar popularity values now drive closeness.
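The normalized scores are consistent with dividing each tally by the square root of the product of the two files' popularities, which is the cosine similarity named in the appendix. As a worked check (the symbols n_AB for the tally and p_A, p_B for the popularities are notation introduced here, not taken from the slides):

```latex
s_{AB} = \frac{n_{AB}}{\sqrt{p_A \, p_B}}

% Annual Report (popularity 2) and Vision Statement (popularity 3), tally 2:
s = \frac{2}{\sqrt{2 \cdot 3}} \approx 0.82

% Dilbert (popularity 5) and Darth Vader (popularity 3), tally 3:
s = \frac{3}{\sqrt{5 \cdot 3}} = \sqrt{3/5} \approx 0.77

% Dilbert (popularity 5) and Disk Usage (popularity 1), tally 1:
s = \frac{1}{\sqrt{5 \cdot 1}} \approx 0.45
```

Dilbert's raw tally with every file is high only because everyone viewed it; the division shrinks its scores by a factor of sqrt(5), so it no longer leads every list.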
  • 33. The item-to-item CF algorithm
      1)  Compute file popularities
      2)  Compute relationship tallies and divide by file popularities
      3)  Sort and store the results
  • 34. MapReduce Overview: Map → Shuffle → Reduce (adapted from http://code.google.com/p/mapreduce-framework/wiki/MapReduce)
  • 35. 1. Compute File Popularities
      <user, file>
          ↓ Inverse identity map
      <file, List<user>>
          ↓ Reduce
      <file, (user count)>
      Result is a table of (file, popularity) pairs that you store in the Hadoop distributed cache.
  • 36. Example: File popularity for Dilbert
      (Miranda, Dilbert), (Bob, Dilbert), (Susan, Dilbert), (Chun, Dilbert), (Alice, Dilbert)
          ↓ Inverse identity map
      <Dilbert, {Miranda, Bob, Susan, Chun, Alice}>
          ↓ Reduce
      (Dilbert, 5)
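Step 1 can be simulated locally without Hadoop. The sketch below mimics the map, shuffle, and reduce phases with plain Python functions; the function names are illustrative, and the in-memory `shuffle` stands in for what the Hadoop framework does between the two phases. The input pairs come from the view history table.

```python
from collections import defaultdict

# View history from the example slides as <user, file> pairs.
VIEWS = [
    ("Miranda", "Annual Report"), ("Miranda", "Vision Statement"), ("Miranda", "Dilbert"),
    ("Bob", "Annual Report"), ("Bob", "Vision Statement"), ("Bob", "Dilbert"),
    ("Susan", "Vision Statement"), ("Susan", "Dilbert"), ("Susan", "Darth Vader"),
    ("Chun", "Dilbert"), ("Chun", "Darth Vader"),
    ("Alice", "Dilbert"), ("Alice", "Darth Vader"), ("Alice", "Disk Usage"),
]

def inverse_identity_map(user, file):
    """Map: swap key and value so that records group by file."""
    yield file, user

def shuffle(pairs):
    """Stand-in for the Hadoop shuffle: group values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def count_reduce(file, users):
    """Reduce: popularity is the number of users who viewed the file."""
    return file, len(users)

mapped = [kv for user, file in VIEWS for kv in inverse_identity_map(user, file)]
POPULARITY = dict(count_reduce(f, users) for f, users in shuffle(mapped).items())
```

The sketch assumes each (user, file) pair appears at most once; if the raw log contained repeat views, the reducer would need to count distinct users instead.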
  • 37. 2a. Compute relationship tallies: find all relationships in the view history table
      <user, file>
          ↓ Identity map
      <user, List<file>>
          ↓ Reduce
      <(file1, file2), Integer(1)>,
      <(file1, file3), Integer(1)>,
      …
      <(file(n-1), file(n)), Integer(1)>
      Relationships have their file IDs in alphabetical order to avoid double counting.
  • 38. Example 2a: Miranda’s (CEO) file relationship votes
      (Miranda, Annual Report), (Miranda, Vision Statement), (Miranda, Dilbert)
          ↓ Identity map
      <Miranda, {Annual Report, Vision Statement, Dilbert}>
          ↓ Reduce
      <(Annual Report, Dilbert), Integer(1)>,
      <(Annual Report, Vision Statement), Integer(1)>,
      <(Dilbert, Vision Statement), Integer(1)>
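The reduce phase of step 2a amounts to emitting every unordered pair from one user's file list. A minimal sketch (the function name is illustrative) using `itertools.combinations` on the sorted list, which yields each pair once and in alphabetical order:

```python
from itertools import combinations

def relationship_votes(user_files):
    """Reduce for step 2a: emit one vote per unordered pair of files a user viewed."""
    # Sorting first puts every pair in alphabetical order, so (A, B) and
    # (B, A) can never both be emitted for the same relationship.
    return [(pair, 1) for pair in combinations(sorted(set(user_files)), 2)]

# Miranda's view history from the example.
VOTES = relationship_votes(["Annual Report", "Vision Statement", "Dilbert"])
```

A user who viewed k files produces k(k-1)/2 votes, which is where the quadratic worst case noted in the appendix comes from.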
  • 39. 2b. Tally the relationship votes: just a word count, where each relationship occurrence is a word
      <(file1, file2), Integer(1)>
          ↓ Identity map
      <(file1, file2), List<Integer(1)>>
          ↓ Reduce: count and divide by popularities
      <file1, (file2, similarity score)>, <file2, (file1, similarity score)>
      Note that we emit each result twice, once for each file that belongs to a relationship.
  • 40. Example 2b: the Dilbert/Darth Vader relationship
      <(Dilbert, Vader), Integer(1)>,
      <(Dilbert, Vader), Integer(1)>,
      <(Dilbert, Vader), Integer(1)>
          ↓ Identity map
      <(Dilbert, Vader), {1, 1, 1}>
          ↓ Reduce: count and divide by popularities
      <Dilbert, (Vader, sqrt(3/5))>, <Vader, (Dilbert, sqrt(3/5))>
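The step 2b reducer can be sketched as follows. The slide's sqrt(3/5) result implies that "divide by popularities" means dividing the tally by the square root of the product of the two popularities (3 / sqrt(5 · 3) = sqrt(3/5)); the function name and the plain dict standing in for the distributed-cache lookup are assumptions made for illustration.

```python
import math

# Popularities from step 1. In the Hadoop job described on the slides these
# would be read from the distributed cache; a dict stands in for it here.
POPULARITY = {"Dilbert": 5, "Vader": 3}

def tally_and_normalize(pair, votes, popularity):
    """Reduce for step 2b: sum the votes, normalize, and emit once per file."""
    file1, file2 = pair
    score = sum(votes) / math.sqrt(popularity[file1] * popularity[file2])
    return [(file1, (file2, score)), (file2, (file1, score))]

RESULT = tally_and_normalize(("Dilbert", "Vader"), [1, 1, 1], POPULARITY)
```

Emitting the score under both keys lets the next step group by file and see every neighbor of that file in one reducer call.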
  • 41. 3. Sort and store results
      <file1, (file2, similarity score)>
          ↓ Identity map
      <file1, List<(file2, similarity score)>>
          ↓ Reduce
      <file1, {top n similar files}>
      Store the results in your location of choice
  • 42. Example 3: Sorting the results for Dilbert
      <Dilbert, (Annual Report, .63)>,
      <Dilbert, (Vision Statement, .77)>,
      <Dilbert, (Disk Usage, .45)>,
      <Dilbert, (Darth Vader, .77)>
          ↓ Identity map
      <Dilbert, {(Annual Report, .63), (Vision Statement, .77), (Disk Usage, .45), (Darth Vader, .77)}>
          ↓ Reduce
      <Dilbert, {Darth Vader, Vision Statement}> (Top 2 files)
          ↓ Store results
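The step 3 reducer is a top-n selection over one file's scored neighbors. A minimal sketch, with an illustrative function name and the rounded scores from the Dilbert example:

```python
def top_n_similar(scored_files, n):
    """Reduce for step 3: keep the n files with the highest similarity scores."""
    # Darth Vader and Vision Statement tie at .77 here, so their relative
    # order depends on the tie-break; the slide shows them as an unordered set.
    ranked = sorted(scored_files, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]

DILBERT = [
    ("Annual Report", 0.63),
    ("Vision Statement", 0.77),
    ("Disk Usage", 0.45),
    ("Darth Vader", 0.77),
]
TOP2 = top_n_similar(DILBERT, 2)
```

For files with very many neighbors, a bounded heap (for example `heapq.nlargest`) would avoid sorting the full list, but the result is the same.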
  • 43. Appendix
      •  Cosine formula and normalization trick to avoid the distributed cache:
         $$\cos\theta_{AB} = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{A}{\|A\|} \cdot \frac{B}{\|B\|}$$
      •  Mahout has CF
      •  Asymptotic order of the algorithm is $O(M \cdot N^2)$ in the worst case, but is helped by sparsity.
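The two forms of the cosine formula can be checked numerically on the Dilbert and Darth Vader columns of the view history table. The slides do not spell out how the trick was implemented; one plausible reading, assumed here, is that each file's view vector is scaled to unit length up front, so the later tally is a plain dot product and no popularity lookup is needed in the reducer.

```python
import math

# Binary view vectors over the users (Miranda, Bob, Susan, Chun, Alice),
# taken from the view history table.
DILBERT = [1, 1, 1, 1, 1]
VADER = [0, 0, 1, 1, 1]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Form 1: dot product divided by the product of the norms.
COS_DIVIDE = dot(DILBERT, VADER) / (
    math.sqrt(dot(DILBERT, DILBERT)) * math.sqrt(dot(VADER, VADER))
)
# Form 2: normalize each vector first, then take a plain dot product.
COS_PRENORMALIZED = dot(unit(DILBERT), unit(VADER))
```

For binary view vectors the squared norm equals the file's popularity, which is why the cosine reduces to the tally divided by the square root of the popularity product.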
  • 44. Narayan Bharadwaj, Monitoring, Big Data @salesforce, @nadubharadwaj