Recommender Systems
  using Python
Marcel Pinheiro Caraciolo
marcel@pingmind.com
  @marcelcaraciolo




                      http://www.pycursos.com
Who is Marcel?
Marcel Pinheiro Caraciolo - @marcelcaraciolo

   Born in Sergipe, but a Recife local.
   M.Sc. in Computer Science from CIN/UFPE, in the area of data mining
   Director of Research and Development at Atépassar
   CEO and co-founder of PyCursos/Pingmind
   Member and moderator of the Pernambuco Python User Group (PUG-PE)
       My areas of interest: mobile computing and intelligent computing

       My blogs: http://www.mobideia.com (on mobility, since 2006)
                 http://aimotion.blogspot.com (on A.I., since 2009)
WEB 1.0: Source of Information
WEB 2.0: Continuous Flow of Information
                                 VI Encontro do PUG-PE
WEB SITES
WEB APPLICATIONS
 WEB SERVICES
                   3.0          SEMANTIC WEB




                   USERS
Use collective information
   effectively in order to
 improve an application
Intelligence from
                    Mining Data




                                                     User
                                                      User
User                                                   User
                                                        User
              A user influences others
       through reviews, ratings, recommendations, and blogs

          A user is influenced by others
        through reviews, ratings, recommendations, and blogs
aggregation information: lists
                                             ratings
   user-generated content
                                 reviews
   blogs                                    recommendations

       wikis      Collective Intelligence      voting
                    Your application             bookmarking
               Search
                            tag cloud        tagging
                                                        saving
   Natural Language Processing

    Clustering and                 Harness external content
   predictive models
WEB SITES
WEB APPLICATIONS
 WEB SERVICES
                   3.0            SEMANTIC WEB




                   USERS
                           before...
Nowadays
we are overloaded
     with information
that is often useless
sometimes
we look for
   this...
and find this!
google?
social media?
meeee...
Recommender Systems
“A lot of times, people don’t know what
 they want until you show it to them.”
                                     Steve Jobs

“We are leaving the Information age, and
entering into the Recommendation age.”
                  Chris Anderson, from book Long Tail
Social Recommendations

                                      Family/Friends
 What should
   I read?

                                                 Ref: Flickr-BlueAlgae

                                     “I think you
                                    should read
 Ref: Flickr photostream: jefield    these books.”
Recommendations through Interaction

                  Input: Rate some books

   What should
     I read?

                                                  Output:
                                                  “Books you
                                                    may like
                                                     are …”
Systems designed to suggest something of interest to me!
Why Recommendation?
Netflix
 - 2/3 of rented movies come from recommendations

Google News
 - 38% of the most-clicked news stories come from recommendations

Amazon
 - 38% of sales come from recommendations

                                    Source: Celma & Lamere, ISMIR 2007
We are overloaded with information
    * Thousands of new articles and posts every day
    * Millions of songs, movies, and books
    * Thousands of offers and promotions
What can be recommended?

                  Social network contacts      Articles
  Products      Advertising messages
E-learning courses                              Books
        Tags        Music
                                Future girlfriends
                      Clothes        Movies
            Restaurants
                                 TV shows
    Videos               Papers
      Investment options                 Professionals
                  Code modules
And how does
 recommendation work?
What do recommender systems
        actually do?
 1. Predict how much you may like a given
                  product or service
 2. Suggest a list of N items ordered according
                   to your interest
 3. Suggest a list of N users ordered
            for a given product/service
 4. Explain to you why these items were
                  recommended
 5. Adjust predictions and recommendations based on
           your feedback and that of others.
Content-Based Filtering

                  Similar

 Die Hard     Gone with the Wind     Armageddon     Toy Story      Items

                                     recommends
          likes

                            Marcel                       Users
Problems with content-based
              filtering
 1. Limited data analysis
  - Items and users are poorly described. Worse for audio or images
 2. Specialized data
  - A person with no history with sushi is never recommended the
             best sushi restaurant in town
 3. Portfolio effect
  - Just because I watched one Xuxa movie as a child, the system
             has to recommend all of her movies
Collaborative Filtering

 Thor     Gone with the Wind     Armageddon     Toy Story      Items

likes
                                  recommends

        Marcel        Rafael           Amanda           Users

                       Similar
Problems with collaborative filtering
    1. Scalability
        - Amazon with 5M users, 50K items, 1.4B ratings
    2. Sparse data
        - New users and items that have no history
    3. Cold start
        - I have rated only a single book on Amazon!
    4. Popularity
        - Everyone reads ‘Harry Potter’
    5. Hacking
        - The person who reads ‘Harry Potter’ reads the Kama Sutra
Hybrid Filtering
          Combination of multiple methods

 Die Hard     Gone with the Wind     Armageddon     Toy Story      Items

                                                            Ontologies
                                                            Symbolic data

            Marcel        Rafael          Luciana           Users
How are they
               presented?
   Highlights                More about this artist...
  Someone similar to you also liked this
              The most popular in your group...
Since you listened to this one, you may want this one...
 New releases        Listen to music by similar artists.
     These two items go together...
How are they evaluated?
How do we know whether a recommendation is good?
The data is usually split into training/test sets (80/20)

Criteria used:
 - Prediction error: RMSE
 - ROC curve*, rank-utility, F-measure
                               *http://code.google.com/p/pyplotmining/
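A minimal sketch of the evaluation idea above, assuming ratings are stored as (user, item, rating) triples. The helper names `train_test_split` and `rmse`, the made-up ratings, and the mean-rating baseline are all illustrative assumptions, not part of the original talk.

```python
import math
import random

def train_test_split(ratings, test_fraction=0.2, seed=42):
    """Shuffle (user, item, rating) triples and split them into train and test lists."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ratings, for illustration only
ratings = [("u%d" % (i % 5), "item%d" % i, float(1 + i % 5)) for i in range(10)]
train, test = train_test_split(ratings)  # 80/20 split

# Baseline predictor: always predict the mean rating seen in training
mean_rating = sum(r for _, _, r in train) / len(train)
error = rmse([mean_rating] * len(test), [r for _, _, r in test])
print(len(train), len(test), round(error, 3))
```

A real recommender would replace the mean-rating baseline with its own predictions for the held-out pairs; a lower RMSE on the test set suggests better rating prediction.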
How to build a recommender
        system with Python ?

There is one option...         But it’s still in development!




                         Crab
             A Python Framework for Building
                 Recommendation Engines

       https://github.com/python-recsys/crab
But here we will build one from
       scratch with Python!
Find someone similar to you

  Thor     Gone with the Wind     Armageddon     Toy Story     Items

   likes
                                   recommends

          Marcel        Rafael           Amanda           Users

                         Similar
Find someone similar to you
           Movie Ratings Dataset

Mr. X rated Snow Crash 4 and The Girl with the Dragon Tattoo 2.
What should we recommend to him?
We find that Amy is the most similar among the options,
so we can recommend a movie she rated 5 stars :)
One more similarity metric: Euclidean distance
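The two distance measures coded in the slides that follow can be written out as below. This is a sketch of the standard definitions, restricted to the items both users have rated, which is what the `manhattan` and `euclidean` functions do. The symbols $r_{x,i}$, $r_{y,i}$ (ratings of users $x$ and $y$ for item $i$) and $I_{xy}$ (items rated by both) are notation introduced here.

```latex
d_{\text{Manhattan}}(x, y) = \sum_{i \in I_{xy}} \lvert r_{x,i} - r_{y,i} \rvert
\qquad
d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i \in I_{xy}} \left( r_{x,i} - r_{y,i} \right)^2}
```

For example, two users who share two items rated (4, 5) and (4, 3) have a Manhattan distance of $|4-5| + |4-3| = 2$ and a Euclidean distance of $\sqrt{1 + 1} \approx 1.414$.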
Show me the code!
>>> # Representing the data in Python
>>> users = {"Angelica": {"Blues Traveler": 3.5, "Broken Bells": 2.0,
 "Norah Jones": 4.5, "Phoenix": 5.0,
 "Slightly Stoopid": 1.5,
 "The Strokes": 2.5, "Vampire Weekend": 2.0},
 "Bill": {"Blues Traveler": 2.0, "Broken Bells": 3.5,
 "Deadmau5": 4.0,
 "Phoenix": 2.0, "Slightly Stoopid": 3.5,
 "Vampire Weekend": 3.0},
 "Chan": {"Blues Traveler": 5.0, "Broken Bells": 1.0,
 "Deadmau5": 1.0, "Norah Jones": 3.0,
 "Phoenix": 5, "Slightly Stoopid": 1.0},
 "Dan": {"Blues Traveler": 3.0, "Broken Bells": 4.0,
 "Deadmau5": 4.5, "Phoenix": 3.0,
 "Slightly Stoopid": 4.5, "The Strokes": 4.0,
 "Vampire Weekend": 2.0},
 "Hailey": {"Broken Bells": 4.0, "Deadmau5": 1.0,
 "Norah Jones": 4.0, "The Strokes": 4.0,
 "Vampire Weekend": 1.0},
 "Jordyn": {"Broken Bells": 4.5, "Deadmau5": 4.0, "Norah Jones": 5.0,
 "Phoenix": 5.0, "Slightly Stoopid": 4.5,
 "The Strokes": 4.0, "Vampire Weekend": 4.0},
 "Sam": {"Blues Traveler": 5.0, "Broken Bells": 2.0,
 "Norah Jones": 3.0, "Phoenix": 5.0,
 "Slightly Stoopid": 4.0, "The Strokes": 5.0},
 "Veronica": {"Blues Traveler": 3.0, "Norah Jones": 5.0,
 "Phoenix": 4.0, "Slightly Stoopid": 2.5,
 "The Strokes": 3.0}}
Coding the Manhattan distance

def manhattan(rating1, rating2):
    """Computes the Manhattan distance. Both rating1 and rating2 are
    dictionaries of the form {'The Strokes': 3.0, 'Slightly Stoopid': 2.5}"""
    distance = 0
    commonRatings = False
    for key in rating1:
        if key in rating2:
            distance += abs(rating1[key] - rating2[key])
            commonRatings = True
    if commonRatings:
        return distance
    else:
        return -1  # Indicates no ratings in common

>>> manhattan(users['Hailey'], users['Veronica'])
2.0
>>> manhattan(users['Hailey'], users['Jordyn'])
7.5
Coding the Euclidean distance
def euclidean(rating1, rating2):
    """Computes the euclidean distance.
    Both rating1 and rating2 are dictionaries of the form
    {'The Strokes': 3.0, 'Slightly Stoopid': 2.5}"""
    distance = 0.0
    commonRatings = False
    for key in rating1:
        if key in rating2:
            distance += pow(abs(rating1[key] - rating2[key]), 2.0)
            commonRatings = True
    if commonRatings:
        return pow(distance, 1/2.0)
    else:
        return -1 #Indicates no ratings in common

>>> euclidean(users['Hailey'], users['Veronica'])
1.4142135623730951
Find the closest users
def computeNearestNeighbor(username, users):
    """creates a sorted list of users based on their distance to
    username"""
    distances = []
    for user in users:
        if user != username:
            distance = manhattan(users[user], users[username])
            distances.append((distance, user))
    # sort based on distance -- closest first
    distances.sort()
    return distances

>>> computeNearestNeighbor('Hailey', users)
[(2.0, 'Veronica'), (4.0, 'Chan'),(4.0, 'Sam'), (4.5, 'Dan'), (5.0,
'Angelica'), (5.5, 'Bill'), (7.5, 'Jordyn')]
>>>
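One caveat with the approach above: because `manhattan` returns -1 when two users share no ratings, such a user would sort ahead of every real neighbor. The sketch below is one possible way to guard against that. `nearest_neighbors` and the `toy_users` data are hypothetical names made up for illustration, not the talk's code.

```python
def manhattan(rating1, rating2):
    """Manhattan distance over co-rated items, or -1 if there are none (as on the slides)."""
    common = [key for key in rating1 if key in rating2]
    if not common:
        return -1
    return sum(abs(rating1[key] - rating2[key]) for key in common)

def nearest_neighbors(username, users, distance=manhattan):
    """Hypothetical variant of computeNearestNeighbor that drops users with no overlap."""
    distances = []
    for user, ratings in users.items():
        if user == username:
            continue
        d = distance(ratings, users[username])
        if d < 0:  # the -1 sentinel: nothing in common, so not comparable
            continue
        distances.append((d, user))
    distances.sort()  # closest first; ties are broken by user name
    return distances

# Tiny made-up dataset for illustration
toy_users = {
    "Ana": {"Phoenix": 5.0, "Norah Jones": 4.0},
    "Bob": {"Phoenix": 4.0, "Norah Jones": 2.0},
    "Cid": {"Deadmau5": 5.0},  # shares nothing with Ana
}
print(nearest_neighbors("Ana", toy_users))
```

Passing the distance function as a parameter also makes it easy to swap in `euclidean` without touching the neighbor search.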
The recommender
def recommend(username, users):
    """Give list of recommendations"""
    # first find nearest neighbor
    nearest = computeNearestNeighbor(username, users)[0][1]
    recommendations = []
    # now find bands neighbor rated that user didn't
    neighborRatings = users[nearest]
    userRatings = users[username]
    for artist in neighborRatings:
        if not artist in userRatings:
             recommendations.append((artist, neighborRatings[artist]))
    recommendations.sort(key=lambda artistTuple: artistTuple[1],
         reverse = True)
    return recommendations

>>> recommend('Hailey', users)
[('Phoenix', 4.0), ('Blues Traveler', 3.0), ('Slightly Stoopid', 2.5)]

>>> recommend('Chan', users)
[('The Strokes', 4.0), ('Vampire Weekend', 1.0)]

>>> recommend('Angelica', users)
[]

>>> computeNearestNeighbor('Angelica', users)
[(3.5, 'Veronica'), (4.5, 'Chan'), (5.0, 'Hailey'), (8.0, 'Sam'), (9.0,
'Bill'), (9.0, 'Dan'), (9.5, 'Jordyn')]
But we need to improve it more...
The Pearson Correlation

Output: -1 (perfect disagreement) to 1 (perfect agreement)
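A single-pass form of the Pearson coefficient, written over the $n$ items rated by both users, is sketched below. It accumulates the same five sums used in the `pearson` function (`sum_xy`, `sum_x`, `sum_y`, `sum_x2`, `sum_y2`). Here $x_i$ and $y_i$ are notation introduced for the two users' ratings of item $i$.

```latex
r = \frac{\displaystyle \sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \, \sum_{i=1}^{n} y_i}{n}}
         {\displaystyle \sqrt{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}
          \; \sqrt{\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}}}
```

Unlike the distance metrics, this measure is insensitive to users who rate consistently higher or lower than others, since only the shape of their ratings matters.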
Pearson Correlation
from math import sqrt

def pearson(rating1, rating2):
    """Computes the Pearson correlation over the items rated in both
    rating1 and rating2. Returns 0 when it cannot be computed."""
    sum_xy = 0
    sum_x = 0
    sum_y = 0
    sum_x2 = 0
    sum_y2 = 0
    n = 0
    for key in rating1:
        if key in rating2:
            n += 1
            x = rating1[key]
            y = rating2[key]
            sum_xy += x * y
            sum_x += x
            sum_y += y
            sum_x2 += x**2
            sum_y2 += y**2
    if n == 0:
        return 0  # no ratings in common
    # now compute denominator
    denominator = (sqrt(sum_x2 - (sum_x**2) / n) *
                   sqrt(sum_y2 - (sum_y**2) / n))
    if denominator == 0:
        return 0
    else:
        return (sum_xy - (sum_x * sum_y) / n) / denominator

>>> pearson(users['Angelica'], users['Bill'])
-0.90405349906826993
>>> pearson(users['Angelica'], users['Hailey'])
0.42008402520840293
>>> pearson(users['Angelica'], users['Jordyn'])
0.76397486054754316
>>>
Which one to choose?
K-nearest Neighbors (kNN)
Find the k users most similar to you

A challenge for you!
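One possible answer to the challenge, as a hedged sketch rather than the talk's final code: take the k users with the highest Pearson correlation and predict each unseen item as the similarity-weighted average of their ratings. The function name `knn_recommend`, the choice to ignore non-positive correlations, and the `toy` dataset are assumptions made for illustration.

```python
from math import sqrt

def pearson(rating1, rating2):
    """Pearson correlation over co-rated items; 0 if it cannot be computed."""
    common = [key for key in rating1 if key in rating2]
    n = len(common)
    if n == 0:
        return 0
    sum_x = sum(rating1[k] for k in common)
    sum_y = sum(rating2[k] for k in common)
    sum_xy = sum(rating1[k] * rating2[k] for k in common)
    var_x = sum(rating1[k] ** 2 for k in common) - sum_x ** 2 / n
    var_y = sum(rating2[k] ** 2 for k in common) - sum_y ** 2 / n
    if var_x <= 0 or var_y <= 0:
        return 0
    return (sum_xy - sum_x * sum_y / n) / sqrt(var_x * var_y)

def knn_recommend(username, users, k=3, n=5):
    """Score unseen items by the similarity-weighted ratings of the k nearest users."""
    neighbors = sorted(
        ((pearson(users[username], ratings), other)
         for other, ratings in users.items() if other != username),
        reverse=True)[:k]
    # Assumption: only positively correlated neighbors contribute
    neighbors = [(sim, other) for sim, other in neighbors if sim > 0]
    totals, weights = {}, {}
    for sim, other in neighbors:
        for item, rating in users[other].items():
            if item not in users[username]:
                totals[item] = totals.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = [(totals[item] / weights[item], item) for item in totals]
    ranked.sort(reverse=True)  # highest predicted rating first
    return ranked[:n]

# Made-up data: Bob agrees with Ana, Cid disagrees
toy = {
    "Ana": {"A": 5.0, "B": 3.0, "C": 1.0},
    "Bob": {"A": 4.0, "B": 3.0, "C": 2.0, "D": 5.0},
    "Cid": {"A": 1.0, "B": 3.0, "C": 5.0, "D": 1.0},
}
print(knn_recommend("Ana", toy, k=2))
```

Using several neighbors instead of one smooths out the quirks of any single user, at the cost of more similarity computations per request.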
Final Code




recsys.py
Item Based Filtering
Change people to items


{'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5},
 'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5}}


                              to
{'Lady in the Water':{'Lisa Rose':2.5,'Gene Seymour':3.0},
'Snakes on a Plane':{'Lisa Rose':3.5,'Gene Seymour':3.5}} etc.

def transformPrefs(prefs):
    result = {}
    for person in prefs:
        for item in prefs[person]:
            result.setdefault(item, {})
            # Flip item and person
            result[item][person] = prefs[person][item]
    return result
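A quick sanity check of the transformation, as a small self-contained sketch: flipping the dictionary twice should give back the original structure. The `critics` variable below holds only the two entries shown on the slide, and the round-trip assertion is an illustration added here.

```python
def transformPrefs(prefs):
    """Invert {person: {item: rating}} into {item: {person: rating}}."""
    result = {}
    for person in prefs:
        for item in prefs[person]:
            result.setdefault(item, {})
            result[item][person] = prefs[person][item]
    return result

critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5},
}

movies = transformPrefs(critics)
print(movies['Lady in the Water'])

# Applying the transformation twice restores the user-centric view
assert transformPrefs(movies) == critics
```

Because the item-centric dictionary has the same shape as the user-centric one, the distance functions defined earlier can be applied to items without modification.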

>>> movies=recommendations.transformPrefs(recommendations.critics)
>>> recommendations.computeNearestNeighbors(movies,'Superman Returns')
[(0.657, 'You, Me and Dupree'), (0.487, 'Lady in the Water'), (0.111, 'Snakes on a Plane'),
 (-0.179, 'The Night Listener'), (-0.422, 'Just My Luck')]
User Based Filtering so far!

Scalability and sparsity problems
Item Based Filtering
Find the k items most similar to a given item
Find the closest items
def calculateSimilarItems(prefs, sim_distance=manhattan):
    # Create a dictionary of items showing which other items they
    # are most similar to.
    result = {}
    # Invert the preference matrix to be item-centric
    itemPrefs = transformPrefs(prefs)
    c = 0
    for item in itemPrefs:
        # Status updates for large datasets
        c += 1
        if c % 100 == 0:
            print("%d / %d" % (c, len(itemPrefs)))
        # Find the most similar items to this one. This call assumes a
        # version of computeNearestNeighbor that accepts a distance argument.
        scores = computeNearestNeighbor(item, itemPrefs, distance=sim_distance)
        result[item] = scores
    return result

>>> itemsim=recommendations.calculateSimilarItems(users)
>>> itemsim
{'Lady in the Water': [(0.40000000000000002, 'You, Me and Dupree'), (0.2857142857142857, 'The
Night Listener'),... 'Snakes on a Plane': [(0.22222222222222221, 'Lady in the Water'),
(0.18181818181818182, 'The Night Listener'),... etc.
The recommender
def recommend(username, users, similarities, n=3):
    scores = {}
    totalSim = {}
    # Get the ratings for the user
    userRatings = users[username]
    # Loop over items rated by this user
    for item, rating in userRatings.items():
        # Loop over items similar to this one
        for sim, other_item in similarities[item]:
            # Ignore if this user has already rated this item
            if other_item in userRatings:
                continue
            # Weighted sum of rating times similarity
            scores.setdefault(other_item, 0.0)
            scores[other_item] += sim * rating
            # Sum of all the similarities
            totalSim.setdefault(other_item, 0.0)
            totalSim[other_item] += sim
    # Divide each total score by total weighting to get an average
    recommendations = [(score / totalSim[item], item) for item, score in scores.items()]
    # Finally sort by predicted score, highest first
    recommendations.sort(key=lambda scoreTuple: scoreTuple[0], reverse=True)
    # Return the first n items
    return recommendations[:n]

>>> recommend('Hailey', users, similarities, 3)
[(3.1176470588235294, 'Slightly Stoopid'), (2.64476386036961, 'Blues Traveler'),
 (2.639207507820647, 'Phoenix')]
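The weighted average in this recommender only makes sense when a larger weight means "more alike", whereas a raw distance grows as items become less alike. The sketch below shows one common way to bridge that gap, converting a distance d into a similarity 1 / (1 + d). The names `item_similarities` and `recommend_items`, the conversion itself, and the `toy` data are assumptions made for illustration and are not taken from the talk.

```python
def manhattan(rating1, rating2):
    """Manhattan distance over co-rated entries, or None if there are none."""
    common = [key for key in rating1 if key in rating2]
    if not common:
        return None
    return sum(abs(rating1[key] - rating2[key]) for key in common)

def transform_prefs(prefs):
    """Invert {user: {item: rating}} into {item: {user: rating}}."""
    result = {}
    for user, ratings in prefs.items():
        for item, rating in ratings.items():
            result.setdefault(item, {})[user] = rating
    return result

def item_similarities(users):
    """For each item, list (similarity, other_item) pairs, most similar first."""
    items = transform_prefs(users)
    sims = {}
    for item in items:
        scored = []
        for other in items:
            if other == item:
                continue
            d = manhattan(items[item], items[other])
            if d is not None:
                scored.append((1.0 / (1.0 + d), other))  # distance -> similarity
        scored.sort(reverse=True)
        sims[item] = scored
    return sims

def recommend_items(username, users, sims, n=3):
    """Predict unseen items as a similarity-weighted average of the user's ratings."""
    user_ratings = users[username]
    scores, total_sim = {}, {}
    for item, rating in user_ratings.items():
        for sim, other in sims[item]:
            if other in user_ratings:
                continue
            scores[other] = scores.get(other, 0.0) + sim * rating
            total_sim[other] = total_sim.get(other, 0.0) + sim
    ranked = [(scores[item] / total_sim[item], item) for item in scores]
    ranked.sort(reverse=True)  # highest predicted rating first
    return ranked[:n]

# Made-up data: item C is rated exactly like item A by everyone who rated both
toy = {
    "Ana": {"A": 5.0, "B": 1.0},
    "Bob": {"A": 5.0, "B": 1.0, "C": 5.0},
    "Cid": {"A": 1.0, "B": 5.0, "C": 1.0},
}
print(recommend_items("Ana", toy, item_similarities(toy)))
```

Since the item similarities depend only on the rating matrix, they can be precomputed offline and reused across requests, which is the main scalability argument for item-based filtering.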
Content Based Filtering

                   Similar

 Die Hard     Gone with the Wind     Armageddon     Toy Story     Items

                                      recommends
          likes

                             Marcel                       Users
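A minimal sketch of the content-based idea in the diagram: describe each item by its attributes and recommend the items whose attributes are closest to something the user liked. The genre features below are invented for illustration, and cosine similarity is one common choice of measure, not necessarily the one used in the talk.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical genre features, for illustration only
features = {
    "Die Hard": {"action": 1.0, "thriller": 1.0},
    "Armageddon": {"action": 1.0, "sci-fi": 1.0},
    "Gone with the Wind": {"romance": 1.0, "drama": 1.0},
    "Toy Story": {"animation": 1.0, "comedy": 1.0},
}

def similar_items(liked, features, n=2):
    """Rank the other items by feature similarity to the liked item."""
    scored = [(cosine(features[liked], attrs), title)
              for title, attrs in features.items() if title != liked]
    scored.sort(reverse=True)
    return scored[:n]

print(similar_items("Die Hard", features, n=1))
```

Note that no other user's ratings are involved here, which is why this approach avoids the cold-start problem for new items but depends heavily on how well the items are described.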
Hybrid Meta Approach

[Excerpt from the paper shown on the slide]

[...] source, the recommendation architecture that we propose will aggregate the results of such filtering techniques.

We aim at integrating the previously mentioned hybrid product recommendation approach in a mobile application so that users can benefit from useful and logical recommendations. Moreover, we aim at providing a suitable explanation for each recommendation to the user, since the current approaches only deliver product recommendations with an overall score, without pointing out the appropriateness of such a recommendation [13]. Besides the basic information provided by the suppliers, the system will deliver the explanation by providing relevant reviews from similar users. We believe that this will increase confidence in the buying decision process and the product acceptance rate. In the mobile context this approach could help users in this process, and showing the user opinions could contribute to achieving this task.

Fig. 1. Meta Recommender Architecture

Since one of the goals of this work is to incorporate different data sources of user opinions and descriptions, we have adopted a meta recommendation architecture. By using a meta recommender architecture, the system would provide a personalized control over the generated recommendation list [...]

[...] would rely more on collaborative-filtering techniques, that is, the reviews from similar users.

Figure 1 shows an overview of our meta recommender approach. By combining the content-based filtering and the collaborative-based one into a hybrid recommender system, it would use the services/products repositories, which catalogue the services to be recommended, and the review repository, which contains the user opinions about those services. All this data can be extracted from data source containers on the web, such as the location-based social network Foursquare [17], as displayed in Figure 2, and the location recommendation engine from Google: Google HotPot [18].

Fig. 2. User Reviews from Foursquare Social Network

The content-based filtering approach will be used to filter the product/service repository, while the collaborative-based approach will derive the product review recommendations. In addition, we will use text mining techniques to distinguish the polarity of each user review as positive or negative. This summarized information would contribute to the computation of the product recommendation score. The final product recommendation score is computed by integrating the results of both recommenders. For now, we are considering different options for this integration approach. One is the symbolic data analysis (SDA) approach [19], in which each product description and also the user ratings/reviews are modeled as a set of modal symbolic descriptions that summarize the information provided by the corresponding data sources. [...]

[...] Bezerra and Carvalho proposed approaches where the results achieved showed to be very promising [19].

III. SYSTEM DESIGN

The application data in our mobile recommender system can be divided into two parts: the product description (such as location, description and its attributes) and the user reviews or ratings provided by the user (such as rating, comments, tags, etc.). Figure 3 gives the system's architecture and relative components.

Fig. 3. Mobile Recommender System Architecture

In our mobile product/service recommender, the user could filter some products or services at special [...] and get a list of recommendations. The user can also enter his preferences or give his feedback on some offered product recommendation.
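One way to picture the integration step described in the excerpt is a simple weighted blend. This is only a sketch under assumptions: the excerpt leaves the integration method open (it names symbolic data analysis as one option), so the linear weighting, the `min_content` threshold, the product names, and the encoding of review polarity as +1/-1 are all hypothetical.

```python
def meta_recommend(content_scores, review_polarities, weight=0.5, min_content=0.3):
    """Filter by content match, then blend it with the share of positive reviews.

    content_scores: product -> content match in [0, 1] (content-based step).
    review_polarities: product -> list of +1/-1 polarities of reviews
        written by similar users (collaborative step).
    """
    results = []
    for product, content in content_scores.items():
        if content < min_content:
            continue  # the content-based step filters the repository
        votes = review_polarities.get(product, [])
        positive_share = sum(1 for v in votes if v > 0) / len(votes) if votes else 0.0
        # Assumed integration rule: a plain weighted average of both signals
        score = weight * content + (1.0 - weight) * positive_share
        results.append((score, product))
    results.sort(reverse=True)
    return results


# Hypothetical data
content_scores = {"Cafe A": 0.9, "Cafe B": 0.6, "Bar C": 0.1}
review_polarities = {
    "Cafe A": [1, -1, -1, -1],
    "Cafe B": [1, 1, 1],
    "Bar C": [1, 1],
}
ranking = meta_recommend(content_scores, review_polarities)
```

Here "Bar C" is dropped by the content filter, and "Cafe B" outranks "Cafe A" because its reviews from similar users are all positive. The reviews behind each score could be returned alongside it as the explanation.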
Crab is already in production

  Brazilian Social Network called Atepassar.com
         Educational network with more than 60,000 students and 120 video classes

  Running on Python + NumPy + SciPy and Django

  Backend for Recommendations: MongoDB - mongoengine

  Daily Recommendations with Explanations
Distributing the recommendation computations


Use Hadoop and MapReduce intensively
  Investigating the Yelp mrjob framework     https://github.com/pfig/mrjob

Implement the state-of-the-art techniques, both those used for Netflix and newer ones
    Matrix Factorization, Singular Value Decomposition (SVD), Boltzmann machines

The most commonly used technique is Slope One.
   Simple algebra: a linear predictor y = a*x + b with slope a = 1, that is, y = x + b
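A minimal sketch of weighted Slope One, assuming ratings are stored as a dict of user to `{item: rating}`. The users and ratings are toy values for illustration, and the function names are hypothetical rather than Crab's API. For each pair of items the average rating difference plays the role of b in y = x + b.

```python
from collections import defaultdict


def build_deviations(ratings):
    """Compute dev[i][j] = mean(r_i - r_j) over users who rated both i and j."""
    diffs = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diffs[i][j] += r_i - r_j
                    counts[i][j] += 1
    devs = {i: {j: diffs[i][j] / counts[i][j] for j in diffs[i]} for i in diffs}
    return devs, counts


def predict(user_ratings, item, devs, counts):
    """Predict a rating as the count-weighted average of (r_j + dev[item][j])."""
    num = den = 0.0
    for j, r_j in user_ratings.items():
        if j in devs.get(item, {}):
            c = counts[item][j]
            num += (r_j + devs[item][j]) * c
            den += c
    return num / den if den else None


# Toy ratings
ratings = {
    "John": {"A": 5, "B": 3, "C": 2},
    "Mark": {"A": 3, "B": 4},
    "Lucy": {"B": 2, "C": 5},
}
devs, counts = build_deviations(ratings)
lucy_a = predict(ratings["Lucy"], "A", devs, counts)
```

For Lucy and item A: the deviation of A over B is 0.5 (two users) and of A over C is 3 (one user), so the prediction is ((2 + 0.5) * 2 + (5 + 3) * 1) / 3 = 13/3.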
Distributed Computing with mrJob
                     https://github.com/Yelp/mrjob




http://aimotion.blogspot.com/2012/08/introduction-to-recommendations-with.html
"""The classic MapReduce job: count the frequency of words.
"""
from mrjob.job import MRJob
import re

# Match runs of word characters and apostrophes
WORD_RE = re.compile(r"[\w']+")

class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        # Emit (word, 1) for every word in the input line
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def reducer(self, word, counts):
        # Sum the counts emitted for each word
        yield (word, sum(counts))

if __name__ == '__main__':
    MRWordFreqCount.run()




It supports Amazon’s Elastic MapReduce (EMR) service, your own Hadoop cluster, or local execution (for testing).
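To see what the framework does with a mapper and a reducer, the word-count flow can be simulated in plain Python without mrjob or Hadoop. This is only an illustration of the map, shuffle, and reduce phases; `run_local` is a hypothetical helper, not part of mrjob.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[\w']+")


def mapper(line):
    """Map phase: emit (word, 1) for every word in the line."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def run_local(lines):
    """Run map, group values by key (the shuffle), then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


result = run_local(["Crab is a crab", "a framework"])
```

On a cluster the shuffle step is what the framework distributes: all values for the same key are routed to the same reducer.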

Future studies with Sparse Matrices
 Real datasets come with lots of empty values
  http://aimotion.blogspot.com/2011/05/evaluating-recommender-systems.html

Solutions:
       scipy.sparse package
       Sharding operations
       Matrix Factorization techniques (SVD)

[Figure: Apontador Reviews Dataset]
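The "lots of empty values" point can be made concrete with a small sketch that stores only the observed ratings, keyed by (user, item). The data is invented for illustration. The scipy.sparse package offers the same idea in optimized form, for example through its `dok_matrix` and `csr_matrix` classes; the plain dict below just keeps the example dependency-free.

```python
# Only observed ratings are stored; every missing (user, item) pair costs nothing.
ratings = {
    ("u1", "i1"): 4.0,
    ("u1", "i3"): 2.0,
    ("u2", "i2"): 5.0,
}

users = sorted({user for user, _ in ratings})
items = sorted({item for _, item in ratings})

# Fraction of the full user x item matrix that actually holds a rating
density = len(ratings) / (len(users) * len(items))


def get_rating(user, item, default=0.0):
    """Look up a rating, returning a default for the (many) empty cells."""
    return ratings.get((user, item), default)


def row(user):
    """Materialize one dense row on demand, for algorithms that need it."""
    return [get_rating(user, item) for item in items]
```

In real review datasets the density is usually far lower than in this toy case, which is why a dense matrix wastes most of its memory.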
  Crab implements a Matrix Factorization with Expectation Maximization algorithm
      scikits.crab.svd package
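To give a feel for matrix factorization on sparse ratings, here is a minimal sketch that learns user and item factors with stochastic gradient descent. Note the swap: the slide says Crab uses an Expectation Maximization algorithm, which this sketch does not reproduce. The ratings, hyperparameters, and function names are all assumptions chosen for illustration.

```python
import math
import random


def init_factors(n_rows, k, rng):
    """Create an n_rows x k matrix of small random values."""
    return [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_rows)]


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings only."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items())
    return math.sqrt(total / len(ratings))


def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Fit P (users x k) and Q (items x k) by SGD with L2 regularization."""
    rng = random.Random(seed)
    P = init_factors(n_users, k, rng)
    Q = init_factors(n_items, k, rng)
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - predict(P, Q, u, i)
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


# Hypothetical sparse ratings: (user index, item index) -> rating
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 3): 5,
    (3, 0): 1, (3, 3): 4,
    (4, 1): 1, (4, 2): 5, (4, 3): 4,
}
P0, Q0 = factorize(ratings, 5, 4, epochs=0)  # untrained starting point
P, Q = factorize(ratings, 5, 4)
```

Comparing `rmse(ratings, P0, Q0)` with `rmse(ratings, P, Q)` shows how much the fit improved; `predict(P, Q, u, i)` can then be called for pairs that have no rating yet.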
How are we working?
        Our Project’s Home Page
        http://github.com/python-recsys/crab
Future Releases
       Planned Release 0.1
   Collaborative Filtering Algorithms working, sample datasets to load and test


       Planned Release 0.11
                Sparse Matrices and Database Models support


       Planned Release 0.12
                Slope One Algorithm, new factorization techniques implemented



....
Join us!

1. Read our Wiki Page
   https://github.com/python-recsys/crab/wiki/Developer-Resources

2. Check out our current sprints and open issues
   https://github.com/python-recsys/crab/issues

3. Forks, Pull Requests mandatory

4. Join us at irc.freenode.net #muricoca or at our discussion list
   http://groups.google.com/group/scikit-crab
Building the Social Genoma
collect discounts

http://aimotion.blogspot.com.br/2013/01/how-recommend-deals-on-line-for-coupon.html

WWW.FAVORITOZ.COM
Recommended Books




Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007
Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009

   ACM RecSys, KDD, SBSC...
Recommended Conferences
- ACM RecSys

- ICWSM: Weblogs and Social Media

- WebKDD: Web Knowledge Discovery and Data Mining

- WWW: The original WWW conference

- SIGIR: Information Retrieval

- ACM KDD: Knowledge Discovery and Data Mining

- ICML: Machine Learning

Weitere ähnliche Inhalte

Was ist angesagt?

Aprendizado de Máquina
Aprendizado de MáquinaAprendizado de Máquina
Aprendizado de Máquina
butest
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
Liang Xiang
 

Was ist angesagt? (20)

Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Aprendizado de Máquina
Aprendizado de MáquinaAprendizado de Máquina
Aprendizado de Máquina
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNN
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems
 
Recommender system
Recommender systemRecommender system
Recommender system
 
BD I - Aula 04 A - Resumo MER e Mapeamento Relacional
BD I - Aula 04 A - Resumo MER e Mapeamento RelacionalBD I - Aula 04 A - Resumo MER e Mapeamento Relacional
BD I - Aula 04 A - Resumo MER e Mapeamento Relacional
 
Movie Recommendation System - MovieLens Dataset
Movie Recommendation System - MovieLens DatasetMovie Recommendation System - MovieLens Dataset
Movie Recommendation System - MovieLens Dataset
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Movie Recommendation engine
Movie Recommendation engineMovie Recommendation engine
Movie Recommendation engine
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Machine Learning - Introdução e Aplicações
Machine Learning - Introdução e AplicaçõesMachine Learning - Introdução e Aplicações
Machine Learning - Introdução e Aplicações
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
 
Curso CSS 3 - Aula Introdutória com conceitos básicos
Curso CSS 3 - Aula Introdutória com conceitos básicosCurso CSS 3 - Aula Introdutória com conceitos básicos
Curso CSS 3 - Aula Introdutória com conceitos básicos
 
Introduction to Recommendation System
Introduction to Recommendation SystemIntroduction to Recommendation System
Introduction to Recommendation System
 

Andere mochten auch

Recommendations with hadoop streaming and python
Recommendations with hadoop streaming and pythonRecommendations with hadoop streaming and python
Recommendations with hadoop streaming and python
Andrew Look
 
Item Based Collaborative Filtering Recommendation Algorithms
Item Based Collaborative Filtering Recommendation AlgorithmsItem Based Collaborative Filtering Recommendation Algorithms
Item Based Collaborative Filtering Recommendation Algorithms
nextlib
 

Andere mochten auch (15)

Como interpretar seu próprio genoma com Python
Como interpretar seu próprio genoma com PythonComo interpretar seu próprio genoma com Python
Como interpretar seu próprio genoma com Python
 
Python em Sistemas de Recomendação: A Cobra é Inteligente!
Python em Sistemas de Recomendação: A Cobra é Inteligente!Python em Sistemas de Recomendação: A Cobra é Inteligente!
Python em Sistemas de Recomendação: A Cobra é Inteligente!
 
Twittando Com Python
Twittando Com PythonTwittando Com Python
Twittando Com Python
 
Desenvolvendo DSLs Em Python
Desenvolvendo DSLs Em PythonDesenvolvendo DSLs Em Python
Desenvolvendo DSLs Em Python
 
Recommendations with hadoop streaming and python
Recommendations with hadoop streaming and pythonRecommendations with hadoop streaming and python
Recommendations with hadoop streaming and python
 
Django Módulo Básico Parte I - Desenvolvimento de uma aplicação Web
Django Módulo Básico Parte I - Desenvolvimento de uma aplicação WebDjango Módulo Básico Parte I - Desenvolvimento de uma aplicação Web
Django Módulo Básico Parte I - Desenvolvimento de uma aplicação Web
 
Sistema de Recomendação de Produtos Utilizando Mineração de Dados
Sistema de Recomendação de Produtos Utilizando Mineração de DadosSistema de Recomendação de Produtos Utilizando Mineração de Dados
Sistema de Recomendação de Produtos Utilizando Mineração de Dados
 
Pip - Instalando Pacotes facilmente para Python
Pip - Instalando Pacotes facilmente para PythonPip - Instalando Pacotes facilmente para Python
Pip - Instalando Pacotes facilmente para Python
 
Acessando o MySql com o Python
Acessando o MySql com o PythonAcessando o MySql com o Python
Acessando o MySql com o Python
 
Introdução à Programação Python e Tk
Introdução à Programação Python e TkIntrodução à Programação Python e Tk
Introdução à Programação Python e Tk
 
Python Interface Gráfica Tkinter
Python Interface Gráfica TkinterPython Interface Gráfica Tkinter
Python Interface Gráfica Tkinter
 
Como Python pode ajudar na automação do seu laboratório
Como Python pode ajudar na automação do  seu laboratórioComo Python pode ajudar na automação do  seu laboratório
Como Python pode ajudar na automação do seu laboratório
 
Hackeando Dados públicos com python
Hackeando Dados públicos com pythonHackeando Dados públicos com python
Hackeando Dados públicos com python
 
Item Based Collaborative Filtering Recommendation Algorithms
Item Based Collaborative Filtering Recommendation AlgorithmsItem Based Collaborative Filtering Recommendation Algorithms
Item Based Collaborative Filtering Recommendation Algorithms
 
Python para iniciantes
Python para iniciantesPython para iniciantes
Python para iniciantes
 

Ähnlich wie Construindo Sistemas de Recomendação com Python

Glasgow: OPAC 2.0 and Beyond
Glasgow: OPAC 2.0 and BeyondGlasgow: OPAC 2.0 and Beyond
Glasgow: OPAC 2.0 and Beyond
daveyp
 
Design of recommender systems
Design of recommender systemsDesign of recommender systems
Design of recommender systems
Rashmi Sinha
 
The hunt for the perfect interface in a googlified world
The hunt for the perfect interface in a googlified worldThe hunt for the perfect interface in a googlified world
The hunt for the perfect interface in a googlified world
nabot
 
Beyond Task Based Testing: Interviews and Personas
Beyond Task Based Testing: Interviews and PersonasBeyond Task Based Testing: Interviews and Personas
Beyond Task Based Testing: Interviews and Personas
Jeff Wisniewski
 
SEMPO Canada Summit in Vancouver May 2013
SEMPO Canada Summit in Vancouver May 2013SEMPO Canada Summit in Vancouver May 2013
SEMPO Canada Summit in Vancouver May 2013
Duane Forrester
 
Ola ei top tech trends
Ola ei top tech trendsOla ei top tech trends
Ola ei top tech trends
Stephen Abram
 
Money for Mission Conference: Fundraising 2.0
Money for Mission Conference: Fundraising 2.0Money for Mission Conference: Fundraising 2.0
Money for Mission Conference: Fundraising 2.0
Beth Kanter
 

Ähnlich wie Construindo Sistemas de Recomendação com Python (20)

TVOT June 2012
TVOT June 2012TVOT June 2012
TVOT June 2012
 
Lec7 collaborative filtering
Lec7 collaborative filteringLec7 collaborative filtering
Lec7 collaborative filtering
 
Jumper 2.0 Collaborative Search MIN15 060310
Jumper 2.0 Collaborative Search MIN15 060310Jumper 2.0 Collaborative Search MIN15 060310
Jumper 2.0 Collaborative Search MIN15 060310
 
Recommendation Systems Roadtrip
Recommendation Systems RoadtripRecommendation Systems Roadtrip
Recommendation Systems Roadtrip
 
Beyond Usability
Beyond UsabilityBeyond Usability
Beyond Usability
 
Glasgow: OPAC 2.0 and Beyond
Glasgow: OPAC 2.0 and BeyondGlasgow: OPAC 2.0 and Beyond
Glasgow: OPAC 2.0 and Beyond
 
Remote Research Workshop, UX Week 2012 - Cyd Harrell
Remote Research Workshop, UX Week 2012 - Cyd HarrellRemote Research Workshop, UX Week 2012 - Cyd Harrell
Remote Research Workshop, UX Week 2012 - Cyd Harrell
 
Design of recommender systems
Design of recommender systemsDesign of recommender systems
Design of recommender systems
 
Google: Spotting Fake Reviewer Groups
Google: Spotting Fake Reviewer GroupsGoogle: Spotting Fake Reviewer Groups
Google: Spotting Fake Reviewer Groups
 
You’re not a dog
You’re not a dogYou’re not a dog
You’re not a dog
 
You’re Not A Dog: How Lawyers Can Put Their Best Foot Forward Online
You’re Not A Dog: How Lawyers Can Put Their Best Foot Forward OnlineYou’re Not A Dog: How Lawyers Can Put Their Best Foot Forward Online
You’re Not A Dog: How Lawyers Can Put Their Best Foot Forward Online
 
The hunt for the perfect interface in a googlified world
The hunt for the perfect interface in a googlified worldThe hunt for the perfect interface in a googlified world
The hunt for the perfect interface in a googlified world
 
Beyond Task Based Testing: Interviews and Personas
Beyond Task Based Testing: Interviews and PersonasBeyond Task Based Testing: Interviews and Personas
Beyond Task Based Testing: Interviews and Personas
 
Nwill2012
Nwill2012Nwill2012
Nwill2012
 
SEMPO Canada Summit in Vancouver May 2013
SEMPO Canada Summit in Vancouver May 2013SEMPO Canada Summit in Vancouver May 2013
SEMPO Canada Summit in Vancouver May 2013
 
FSU SLIS InfoSvcs Wk 3 - Web Search & Evaluation
FSU SLIS InfoSvcs Wk 3 - Web Search & EvaluationFSU SLIS InfoSvcs Wk 3 - Web Search & Evaluation
FSU SLIS InfoSvcs Wk 3 - Web Search & Evaluation
 
Computer-Assisted Consumer Profiles on Twitter
Computer-Assisted Consumer Profiles on TwitterComputer-Assisted Consumer Profiles on Twitter
Computer-Assisted Consumer Profiles on Twitter
 
Ola ei top tech trends
Ola ei top tech trendsOla ei top tech trends
Ola ei top tech trends
 
Recommender Systems and the Human Factor
Recommender Systems and the Human FactorRecommender Systems and the Human Factor
Recommender Systems and the Human Factor
 
Money for Mission Conference: Fundraising 2.0
Money for Mission Conference: Fundraising 2.0Money for Mission Conference: Fundraising 2.0
Money for Mission Conference: Fundraising 2.0
 

Mehr von Marcel Caraciolo

Mehr von Marcel Caraciolo (20)

Joblib: Lightweight pipelining for parallel jobs (v2)
Joblib:  Lightweight pipelining for parallel jobs (v2)Joblib:  Lightweight pipelining for parallel jobs (v2)
Joblib: Lightweight pipelining for parallel jobs (v2)
 
Construindo softwares de bioinformática para análises clínicas : Desafios e...
Construindo softwares  de bioinformática  para análises clínicas : Desafios e...Construindo softwares  de bioinformática  para análises clínicas : Desafios e...
Construindo softwares de bioinformática para análises clínicas : Desafios e...
 
Como Python ajudou a automatizar o nosso laboratório v.2
Como Python ajudou a automatizar o nosso laboratório v.2Como Python ajudou a automatizar o nosso laboratório v.2
Como Python ajudou a automatizar o nosso laboratório v.2
 
Python on Science ? Yes, We can.
Python on Science ?   Yes, We can.Python on Science ?   Yes, We can.
Python on Science ? Yes, We can.
 
Oficina Python: Hackeando a Web com Python 3
Oficina Python: Hackeando a Web com Python 3Oficina Python: Hackeando a Web com Python 3
Oficina Python: Hackeando a Web com Python 3
 
Recommender Systems with Ruby (adding machine learning, statistics, etc)
Recommender Systems with Ruby (adding machine learning, statistics, etc)Recommender Systems with Ruby (adding machine learning, statistics, etc)
Building Recommender Systems with Python

  • 1. Recommender Systems using Python. Marcel Pinheiro Caraciolo, marcel@pingmind.com, @marcelcaraciolo, http://www.pycursos.com
  • 2. Who is Marcel? Marcel Pinheiro Caraciolo (@marcelcaraciolo). From Sergipe, but a Recife local. Master's degree in Computer Science from CIN/UFPE, in the area of data mining. Director of Research and Development at Atépassar. CEO and co-founder of PyCursos/Pingmind. Member and moderator of the Pernambuco Python User Group (PUG-PE). Areas of interest: mobile computing and intelligent computing. Blogs: http://www.mobideia.com (on mobility, since 2006) and http://aimotion.blogspot.com (on AI, since 2009)
  • 3. WEB
  • 5. Web 1.0: source of information. Web 2.0: continuous flow of information. (VI Encontro do PUG-PE)
  • 6. Web 3.0: web sites, web applications, web services, semantic web, users
  • 7. Use collective information effectively in order to improve an application
  • 8. Intelligence from mining data. A user influences others through reviews, ratings, recommendations, and blogs. A user is influenced by others through reviews, ratings, recommendations, and blogs.
  • 9. Collective Intelligence in your application (diagram labels): aggregation information: lists; ratings; user-generated content; reviews; blogs; recommendations; wikis; voting; bookmarking; search; tag cloud; tagging; saving; natural language processing; clustering and predictive models; harness external content
  • 10. Before... Web 3.0: web sites, web applications, web services, semantic web, users
  • 12. We are overloaded with information
  • 18. Meeee... Google? Social media?
  • 20. “A lot of times, people don’t know what they want until you show it to them.” Steve Jobs. “We are leaving the Information age, and entering into the Recommendation age.” Chris Anderson, from the book The Long Tail
  • 21. Social recommendations (family/friends). “What should I read?” “I think you should read these books.” Refs: Flickr-BlueAlgae; Flickr photostream: jefield
  • 22. Recommendations by interaction. Input: rate some books. “What should I read?” Output: “Books you might like are …”
  • 23. Systems designed to suggest something that interests me!
  • 25. Netflix: 2/3 of rented movies come from recommendations. Google News: 38% of the most-clicked news stories come from recommendations. Amazon: 38% of sales come from recommendations. Source: Celma & Lamere, ISMIR 2007
  • 26. We are overloaded with information: thousands of new articles and posts every day; millions of songs, movies, and books; thousands of offers and promotions
  • 27. What can be recommended? Social network contacts, articles, products, advertising messages, e-learning courses, books, tags, music, future girlfriends, clothes, movies, restaurants, TV shows, videos, papers, investment options, professionals, code modules
  • 28. And how does recommendation work?
  • 29. What do recommender systems actually do? 1. Predict how much you may like a certain product or service. 2. Suggest a list of N items ordered by your interest. 3. Suggest an ordered list of N users for a product/service. 4. Explain to you why these items were recommended. 5. Adjust the prediction and the recommendation based on your feedback and that of others.
  • 30. Content-Based Filtering (diagram). Items: Duro de Matar, O Vento Levou, Toy Story, Armageddon. User: Marcel. Marcel likes some items; the system recommends items similar to them.
  • 31. Problems with content-based filtering: 1. Restricted data analysis: items and users are described in little detail; worse for audio or images. 2. Specialized data: a person with no experience of sushi is never recommended the best sushi restaurant in town. 3. Portfolio effect: just because I watched one Xuxa movie as a child, it has to recommend all of hers.
  • 32. Collaborative Filtering (diagram). Items: O Vento Levou, Toy Story, Thor, Armageddon. Users: Marcel, Rafael, Amanda. Similar users like some of the same items; items liked by a similar user are recommended.
  • 33. Problems with collaborative filtering: 1. Scalability: Amazon with 5M users, 50K items, 1.4B ratings. 2. Sparse data: new users and items that have no history. 3. Cold start: I have rated only a single book on Amazon! 4. Popularity: everyone reads ‘Harry Potter’. 5. Hacking: the person who reads ‘Harry Potter’ reads Kama Sutra.
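To put the scalability and sparsity figures on slide 33 in perspective, the short calculation below shows how little of the user-item matrix is actually filled. It is only an illustrative sketch: the three numbers come from the slide, while the variable names are ours.

```python
# Figures quoted on the slide: 5M users, 50K items, 1.4B ratings.
n_users = 5_000_000
n_items = 50_000
n_ratings = 1_400_000_000

possible = n_users * n_items        # cells in the full user-item matrix
density = n_ratings / possible      # fraction of cells that hold a rating
avg_per_user = n_ratings / n_users  # average number of ratings per user

print(f"density: {density:.2%}, ratings per user: {avg_per_user:.0f}")
```

Even with 1.4 billion ratings, well under 1% of the matrix is filled, which is why any two users often have few or no items in common.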
  • 34. Hybrid Filtering: a combination of multiple methods (diagram). Items: Duro de Matar, O Vento Levou, Toy Story, Armageddon. Users: Marcel, Rafael, Luciana. Additional inputs: ontologies, symbolic data.
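The slide does not say how the methods are combined. One simple possibility, shown here purely as an assumption, is a weighted blend of the scores produced by a content-based and a collaborative recommender. The function `hybrid_score`, the parameter `alpha`, and the scores below are all hypothetical.

```python
def hybrid_score(content_score, collaborative_score, alpha=0.5):
    """Blend two scores on the same scale; alpha weights the content-based one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * content_score + (1.0 - alpha) * collaborative_score

# Hypothetical scores (0-5 scale) for one user from two separate recommenders.
content = {"Thor": 4.0, "Toy Story": 2.0}
collaborative = {"Thor": 3.0, "Toy Story": 5.0}

# alpha=0.25 trusts the collaborative recommender three times as much.
blended = {item: hybrid_score(content[item], collaborative[item], alpha=0.25)
           for item in content}
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking, blended)
```

Other hybrid designs exist (switching between recommenders, or feeding one recommender's output into another); the weighted blend is just the easiest to reason about.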
  • 35. How are they presented? Highlights. More about this artist... Someone similar to you also liked this. The most popular in your group... Since you listened to this one, you may want this one... New releases. Listen to music by similar artists. These two items go together.
  • 36. How are they evaluated? How do we know whether a recommendation is good? The data is usually split into training and test sets (80/20). Criteria used: prediction error (RMSE); ROC curve*, rank-utility, F-measure. *http://code.google.com/p/pyplotmining/
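As a rough illustration of the evaluation procedure on slide 36, the sketch below holds out 20% of the ratings and computes the RMSE of a trivial baseline on them. The helper names (`train_test_split`, `rmse`), the toy ratings, and the mean-rating baseline are assumptions made for the example, not part of the talk.

```python
import math
import random

def train_test_split(ratings, test_fraction=0.2, seed=42):
    """Shuffle (user, item, rating) triples and hold out a fraction for testing."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ratings; the "model" is a baseline that predicts the training mean.
ratings = [("u%d" % i, "item%d" % (i % 3), float(1 + i % 5)) for i in range(10)]
train, test = train_test_split(ratings)
mean_rating = sum(r for _, _, r in train) / len(train)
error = rmse([mean_rating] * len(test), [r for _, _, r in test])
print("RMSE of the mean-rating baseline: %.3f" % error)
```

A real recommender would replace the baseline prediction; a lower RMSE on the held-out 20% then indicates better rating predictions.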
  • 38. How to build a recommender system with Python? There is one option: Crab, a Python framework for building recommendation engines (https://github.com/python-recsys/crab). But it’s still in development!
  • 39. But here we will create one from scratch with Python! Find someone similar to you (diagram). Items: O Vento Levou, Toy Story, Thor, Armageddon. Users: Marcel, Rafael, Amanda. A similar user likes an item, so it is recommended.
  • 41. But here we will create one from scratch with Python! Find someone similar to you. Movie ratings dataset: Mr. X gave Snow Crash a 4 and The Girl with the Dragon Tattoo a 2. What should we recommend to him?
  • 43. But here we will create one from scratch with Python! Find someone similar to you. We find that Amy is the most similar among the options, so we can recommend a movie she rated with 5 stars :)
  • 44. But here we will create one from scratch with Python! One more similarity metric: Euclidean distance
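For reference, the two distances coded later in the talk can be written as follows for two users $x$ and $y$, where $I$ is the set of items both have rated. The notation is ours, not the slides'.

```latex
d_{\text{Manhattan}}(x, y) = \sum_{i \in I} \lvert x_i - y_i \rvert
\qquad
d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i \in I} (x_i - y_i)^2}
```

With the ratings given later in the talk, Hailey and Veronica share two bands: Norah Jones (4.0 vs 5.0) and The Strokes (4.0 vs 3.0). Their Manhattan distance is therefore $1 + 1 = 2.0$ and their Euclidean distance is $\sqrt{1 + 1} \approx 1.414$.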
  • 49. Show me the code!

        # Representing the data in Python
        users = {
            "Angelica": {"Blues Traveler": 3.5, "Broken Bells": 2.0, "Norah Jones": 4.5, "Phoenix": 5.0,
                         "Slightly Stoopid": 1.5, "The Strokes": 2.5, "Vampire Weekend": 2.0},
            "Bill": {"Blues Traveler": 2.0, "Broken Bells": 3.5, "Deadmau5": 4.0, "Phoenix": 2.0,
                     "Slightly Stoopid": 3.5, "Vampire Weekend": 3.0},
            "Chan": {"Blues Traveler": 5.0, "Broken Bells": 1.0, "Deadmau5": 1.0, "Norah Jones": 3.0,
                     "Phoenix": 5.0, "Slightly Stoopid": 1.0},
            "Dan": {"Blues Traveler": 3.0, "Broken Bells": 4.0, "Deadmau5": 4.5, "Phoenix": 3.0,
                    "Slightly Stoopid": 4.5, "The Strokes": 4.0, "Vampire Weekend": 2.0},
            "Hailey": {"Broken Bells": 4.0, "Deadmau5": 1.0, "Norah Jones": 4.0, "The Strokes": 4.0,
                       "Vampire Weekend": 1.0},
            "Jordyn": {"Broken Bells": 4.5, "Deadmau5": 4.0, "Norah Jones": 5.0, "Phoenix": 5.0,
                       "Slightly Stoopid": 4.5, "The Strokes": 4.0, "Vampire Weekend": 4.0},
            "Sam": {"Blues Traveler": 5.0, "Broken Bells": 2.0, "Norah Jones": 3.0, "Phoenix": 5.0,
                    "Slightly Stoopid": 4.0, "The Strokes": 5.0},
            "Veronica": {"Blues Traveler": 3.0, "Norah Jones": 5.0, "Phoenix": 4.0,
                         "Slightly Stoopid": 2.5, "The Strokes": 3.0},
        }
  • 57. Coding the Manhattan distance

        def manhattan(rating1, rating2):
            """Computes the Manhattan distance. Both rating1 and rating2 are
            dictionaries of the form {'The Strokes': 3.0, 'Slightly Stoopid': 2.5}"""
            distance = 0
            commonRatings = False
            for key in rating1:
                if key in rating2:
                    distance += abs(rating1[key] - rating2[key])
                    commonRatings = True
            if commonRatings:
                return distance
            else:
                return -1  # Indicates no ratings in common

        >>> manhattan(users['Hailey'], users['Veronica'])
        2.0
        >>> manhattan(users['Hailey'], users['Jordyn'])
        7.5
  • 62. Coding the Euclidean distance

        def euclidean(rating1, rating2):
            """Computes the Euclidean distance. Both rating1 and rating2 are
            dictionaries of the form {'The Strokes': 3.0, 'Slightly Stoopid': 2.5}"""
            distance = 0.0
            commonRatings = False
            for key in rating1:
                if key in rating2:
                    distance += pow(abs(rating1[key] - rating2[key]), 2.0)
                    commonRatings = True
            if commonRatings:
                return pow(distance, 1 / 2.0)
            else:
                return -1  # Indicates no ratings in common

        >>> euclidean(users['Hailey'], users['Veronica'])
        1.4142135623730951
  • 67. Find the closest users

        def computeNearestNeighbor(username, users, distance=manhattan):
            """Creates a sorted list of users based on their distance to username.
            distance is the metric function to use (Manhattan by default)."""
            distances = []
            for user in users:
                if user != username:
                    d = distance(users[user], users[username])
                    distances.append((d, user))
            # sort based on distance -- closest first
            distances.sort()
            return distances

        >>> computeNearestNeighbor('Hailey', users)
        [(2.0, 'Veronica'), (4.0, 'Chan'), (4.0, 'Sam'), (4.5, 'Dan'), (5.0, 'Angelica'), (5.5, 'Bill'), (7.5, 'Jordyn')]
  • 72. The recommender

        def recommend(username, users):
            """Give list of recommendations"""
            # first find nearest neighbor
            nearest = computeNearestNeighbor(username, users)[0][1]
            recommendations = []
            # now find bands neighbor rated that user didn't
            neighborRatings = users[nearest]
            userRatings = users[username]
            for artist in neighborRatings:
                if artist not in userRatings:
                    recommendations.append((artist, neighborRatings[artist]))
            # highest-rated bands first
            recommendations.sort(key=lambda artistTuple: artistTuple[1], reverse=True)
            return recommendations

        >>> recommend('Hailey', users)
        [('Phoenix', 4.0), ('Blues Traveler', 3.0), ('Slightly Stoopid', 2.5)]
        >>> recommend('Chan', users)
        [('The Strokes', 4.0), ('Vampire Weekend', 1.0)]
        >>> recommend('Angelica', users)
        []
  • 74. The recommender. Angelica's nearest neighbor is Veronica, who has not rated any band that Angelica has not already rated, so the recommendation list is empty.

        >>> computeNearestNeighbor('Angelica', users)
        [(3.5, 'Veronica'), (4.5, 'Chan'), (5.0, 'Hailey'), (8.0, 'Sam'), (9.0, 'Bill'), (9.0, 'Dan'), (9.5, 'Jordyn')]
  • 76. But we need to improve it more...
  • 81. The Pearson Correlation Output: -1 (perfect disagreement) to 1 (perfect agreement)
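One reason commonly given for preferring Pearson correlation over a raw distance is that it ignores differences in how generously people rate. The sketch below illustrates this with hypothetical users. `pearson_centered` is our own two-pass implementation written for this example, not the function from the slides.

```python
from math import sqrt

def pearson_centered(rating1, rating2):
    """Pearson correlation over the items both users rated (two-pass form)."""
    common = [k for k in rating1 if k in rating2]
    n = len(common)
    if n == 0:
        return 0.0
    mean_x = sum(rating1[k] for k in common) / n
    mean_y = sum(rating2[k] for k in common) / n
    cov = sum((rating1[k] - mean_x) * (rating2[k] - mean_y) for k in common)
    var_x = sum((rating1[k] - mean_x) ** 2 for k in common)
    var_y = sum((rating2[k] - mean_y) ** 2 for k in common)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user who gives every item the same rating has no trend
    return cov / sqrt(var_x * var_y)

# Hypothetical users: same taste, but the "harsh" user rates 2 points lower.
generous = {"Phoenix": 5.0, "Norah Jones": 4.0, "The Strokes": 3.0}
harsh = {"Phoenix": 3.0, "Norah Jones": 2.0, "The Strokes": 1.0}
opposite = {"Phoenix": 1.0, "Norah Jones": 2.0, "The Strokes": 3.0}

manhattan_gap = sum(abs(generous[k] - harsh[k]) for k in generous)
same_taste = pearson_centered(generous, harsh)
opposite_taste = pearson_centered(generous, opposite)
print(manhattan_gap, same_taste, opposite_taste)
```

The Manhattan distance makes the first two users look far apart, while the correlation recognizes that they rank the bands identically.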
  • 87. Pearson Correlation

        from math import sqrt

        def pearson(rating1, rating2):
            """Computes the Pearson correlation over the items rated by both users."""
            sum_xy = 0
            sum_x = 0
            sum_y = 0
            sum_x2 = 0
            sum_y2 = 0
            n = 0
            for key in rating1:
                if key in rating2:
                    n += 1
                    x = rating1[key]
                    y = rating2[key]
                    sum_xy += x * y
                    sum_x += x
                    sum_y += y
                    sum_x2 += x**2
                    sum_y2 += y**2
            if n == 0:
                return 0  # no ratings in common
            # now compute denominator
            denominator = (sqrt(sum_x2 - (sum_x**2) / n)
                           * sqrt(sum_y2 - (sum_y**2) / n))
            if denominator == 0:
                return 0
            else:
                return (sum_xy - (sum_x * sum_y) / n) / denominator

        >>> pearson(users['Angelica'], users['Bill'])
        -0.90405349906826993
        >>> pearson(users['Angelica'], users['Hailey'])
        0.42008402520840293
        >>> pearson(users['Angelica'], users['Jordyn'])
        0.76397486054754316
  • 88. Which one to choose ?
  • 89. K-nearest Neighbors (kNN): find the k users most similar to you
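The slides do not include code for the kNN step, so the following is only a sketch of one common way to do it: take the k most similar users and weight each neighbour's ratings by that neighbour's share of the total similarity. The names `knn_recommend` and `inverse_manhattan`, and the toy data, are hypothetical. Any similarity function where higher means closer (for example a Pearson correlation) could be passed in instead.

```python
def inverse_manhattan(rating1, rating2):
    """Similarity in (0, 1]: 1 / (1 + Manhattan distance) over common items."""
    common = [k for k in rating1 if k in rating2]
    if not common:
        return 0.0
    return 1.0 / (1.0 + sum(abs(rating1[k] - rating2[k]) for k in common))

def knn_recommend(username, users, similarity, k=3):
    """Score unseen items by the similarity-weighted ratings of the k nearest users."""
    # Rank the other users by similarity to `username`, highest first.
    neighbours = sorted(
        ((similarity(users[username], users[other]), other)
         for other in users if other != username),
        reverse=True,
    )[:k]
    # Keep only neighbours with positive similarity so every weight is > 0.
    neighbours = [(sim, other) for sim, other in neighbours if sim > 0]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return []
    scores = {}
    for sim, other in neighbours:
        weight = sim / total  # this neighbour's share of the total similarity
        for item, rating in users[other].items():
            if item not in users[username]:
                # Items rated by only some neighbours accumulate less weight.
                scores[item] = scores.get(item, 0.0) + weight * rating
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: Bob agrees with Ana exactly, Cid roughly, Dee not at all.
users = {
    "Ana": {"A": 5.0, "B": 1.0},
    "Bob": {"A": 5.0, "B": 1.0, "C": 4.0},
    "Cid": {"A": 4.0, "B": 2.0, "C": 2.0, "D": 5.0},
    "Dee": {"A": 1.0, "B": 5.0, "D": 1.0},
}
recs = knn_recommend("Ana", users, inverse_manhattan, k=2)
print(recs)
```

Using several neighbours instead of one reduces the influence of a single user's quirks, at the cost of choosing k.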
  • 102. Change people to items

        {'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5},
         'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5}}

    to

        {'Lady in the Water': {'Lisa Rose': 2.5, 'Gene Seymour': 3.0},
         'Snakes on a Plane': {'Lisa Rose': 3.5, 'Gene Seymour': 3.5}}

    etc.

        def transformPrefs(prefs):
            result = {}
            for person in prefs:
                for item in prefs[person]:
                    result.setdefault(item, {})
                    # Flip item and person
                    result[item][person] = prefs[person][item]
            return result

        >>> movies = recommendations.transformPrefs(recommendations.critics)
        >>> recommendations.computeNearestNeighbors(movies, 'Superman Returns')
        [(0.657, 'You, Me and Dupree'), (0.487, 'Lady in the Water'), (0.111, 'Snakes on a Plane'), (-0.179, 'The Night Listener'), (-0.422, 'Just My Luck')]
  • 103. User-based filtering so far! Scalability and sparsity problems
  • 104. Item-Based Filtering: find the k items most similar to the item
  • 107. Find the closest items

        def calculateSimilarItems(prefs, sim_distance=manhattan):
            # Create a dictionary of items showing which other items they
            # are most similar to.
            result = {}
            # Invert the preference matrix to be item-centric
            itemPrefs = transformPrefs(prefs)
            c = 0
            for item in itemPrefs:
                # Status updates for large datasets
                c += 1
                if c % 100 == 0:
                    print("%d / %d" % (c, len(itemPrefs)))
                # Find the most similar items to this one
                scores = computeNearestNeighbor(item, itemPrefs, distance=sim_distance)
                result[item] = scores
            return result

        >>> itemsim = recommendations.calculateSimilarItems(users)
        >>> itemsim
        {'Lady in the Water': [(0.40000000000000002, 'You, Me and Dupree'), (0.2857142857142857, 'The Night Listener'), ...
         'Snakes on a Plane': [(0.22222222222222221, 'Lady in the Water'), (0.18181818181818182, 'The Night Listener'), ...
        etc.
  • 111. The recommender

        def recommend(username, users, similarities, n=3):
            scores = {}
            totalSim = {}
            # now get the ratings for the user
            userRatings = users[username]
            # Loop over items rated by this user
            for item, rating in userRatings.items():
                # Loop over items similar to this one
                for sim, other_item in similarities[item]:
                    # Ignore if this user has already rated this item
                    if other_item in userRatings:
                        continue
                    # Weighted sum of rating times similarity
                    scores.setdefault(other_item, 0.0)
                    scores[other_item] += sim * rating
                    # Sum of all the similarities
                    totalSim.setdefault(other_item, 0.0)
                    totalSim[other_item] += sim
            # Divide each total score by total weighting to get an average
            recommendations = [(score / totalSim[item], item)
                               for item, score in scores.items()
                               if totalSim[item] != 0]
            # finally sort by predicted score, highest first
            recommendations.sort(key=lambda pair: pair[0], reverse=True)
            # Return the first n items
            return recommendations[:n]

        >>> recommend('Hailey', users, similarities, 3)
        [(3.1176470588235294, 'Slightly Stoopid'), (2.64476386036961, 'Blues Traveler'), (2.639207507820647, 'Phoenix')]
  • 112. Content-Based Filtering (diagram). Items: Duro de Matar, O Vento Levou, Toy Story, Armageddon. User: Marcel. Marcel likes some items; similar items are recommended.
  • 113. Crab is already in production [slide image: a page from a paper describing a hybrid meta recommender architecture for a mobile product/service recommender system]. Legible passage: "The content-based filtering approach will be used to filter the product/service repository, while the collaborative-based approach will derive the product review recommendations. In addition, we will use text mining techniques to distinguish the polarity of each user review (positive or negative). This summarized information contributes to the computation of the product recommendation score. The final product recommendation score is computed by integrating the results of both recommenders." Figure captions: Fig. 1, Meta Recommender Architecture; Fig. 2, User Reviews from Foursquare Social Network; Fig. 3, Mobile Recommender System Architecture.
  • 114. Crab is already in production: Atepassar.com, a Brazilian educational social network with more than 60,000 students and 120 video classes. Running on Python + NumPy + SciPy and Django. Backend for recommendations: MongoDB (mongoengine). Daily recommendations with explanations.
  • 115. Distributing the recommendation computations: use Hadoop and MapReduce intensively. Investigating the Yelp mrjob framework https://github.com/pfig/mrjob. Develop the Netflix and other novel state-of-the-art techniques: Matrix Factorization, Singular Value Decomposition (SVD), Boltzmann machines. The most commonly used is the Slope One technique: simple algebra with a line of slope one, y = a*x + b with a = 1, that is, y = x + b.
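The Slope One idea mentioned on this slide can be sketched in a few lines of Python: for each pair of items, compute the average rating difference b, then predict a missing rating as y = x + b, weighting each predictor by how many users co-rated the pair. This is a sketch of the weighted Slope One scheme under assumed data; the ratings and the function names `build_deviations` and `predict` are hypothetical and are not Crab's implementation.

```python
from collections import defaultdict

# Hypothetical ratings: user -> {item: rating}.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0, "C": 2.0},
    "bob": {"A": 3.0, "B": 4.0},
    "carol": {"B": 2.0, "C": 5.0},
}


def build_deviations(ratings):
    """Return dev[i][j] = mean(r_i - r_j) over users who rated both, plus counts."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    sums[i][j] += r_i - r_j
                    counts[i][j] += 1
    devs = {i: {j: sums[i][j] / counts[i][j] for j in sums[i]} for i in sums}
    return devs, counts


def predict(user_ratings, item, devs, counts):
    """Weighted Slope One: average of (r_j + dev[item][j]), weighted by co-rating counts."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j != item and j in devs.get(item, {}):
            n = counts[item][j]
            numerator += (r_j + devs[item][j]) * n
            denominator += n
    return numerator / denominator if denominator else None


devs, counts = build_deviations(RATINGS)
# Predict bob's rating for item "C", which he has not rated.
prediction = predict(RATINGS["bob"], "C", devs, counts)
```

The deviation table depends only on pairs of items, so it can be precomputed offline and updated incrementally as new ratings arrive.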
  • 116. Distributed Computing with mrJob https://github.com/Yelp/mrjob http://aimotion.blogspot.com/2012/08/introduction-to-recommendations-with.html
  • 117. Distributed Computing with mrjob https://github.com/Yelp/mrjob It supports Amazon's Elastic MapReduce (EMR) service, your own Hadoop cluster, or local runs (for testing). http://aimotion.blogspot.com/2012/08/introduction-to-recommendations-with.html
  • 119. Distributed Computing with mrjob https://github.com/Yelp/mrjob

        """The classic MapReduce job: count the frequency of words."""
        from mrjob.job import MRJob
        import re

        # Match runs of word characters and apostrophes.
        WORD_RE = re.compile(r"[\w']+")

        class MRWordFreqCount(MRJob):

            def mapper(self, _, line):
                # Emit (word, 1) for every word on the input line.
                for word in WORD_RE.findall(line):
                    yield (word.lower(), 1)

            def reducer(self, word, counts):
                # Sum the counts emitted for each distinct word.
                yield (word, sum(counts))

        if __name__ == '__main__':
            MRWordFreqCount.run()

    It supports Amazon's Elastic MapReduce (EMR) service, your own Hadoop cluster, or local runs (for testing). http://aimotion.blogspot.com/2012/08/introduction-to-recommendations-with.html
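To make the data flow behind the word-count job on this slide concrete, here is a pure-Python sketch that runs equivalent mapper and reducer logic in a single process, with an explicit grouping step standing in for the shuffle phase that Hadoop performs between the two. It deliberately does not use mrjob; `run_local` and the sample input lines are hypothetical.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[\w']+")


def mapper(line):
    """Emit (word, 1) for every word on the line."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def reducer(word, counts):
    """Emit the total count for one word."""
    yield word, sum(counts)


def run_local(lines):
    """Run map, shuffle (group by key), and reduce in a single process."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)  # shuffle: collect values per key
    result = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            result[out_key] = out_value
    return result


word_counts = run_local(["Crab recommends items", "items similar to items"])
```

In a real cluster the mapper calls run on different machines and the framework routes all values for one key to the same reducer; the grouping dictionary here plays that role.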
  • 120. Future studies with Sparse Matrices: real datasets come with lots of empty values. http://aimotion.blogspot.com/2011/05/evaluating-recommender-systems.html Solutions: scipy.sparse package; sharding operations; Matrix Factorization techniques (SVD). (Apontador Reviews Dataset)
  • 121. Future studies with Sparse Matrices: real datasets come with lots of empty values. http://aimotion.blogspot.com/2011/05/evaluating-recommender-systems.html Solutions: scipy.sparse package; sharding operations; Matrix Factorization techniques (SVD). Crab implements a Matrix Factorization with Expectation Maximization algorithm. (Apontador Reviews Dataset)
  • 122. Future studies with Sparse Matrices: real datasets come with lots of empty values. http://aimotion.blogspot.com/2011/05/evaluating-recommender-systems.html Solutions: scipy.sparse package; sharding operations; Matrix Factorization techniques (SVD). Crab implements a Matrix Factorization with Expectation Maximization algorithm: scikits.crab.svd package. (Apontador Reviews Dataset)
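As an illustration of two of the solutions listed on these slides, the sketch below stores a small ratings matrix with `scipy.sparse` (only the observed values are kept in memory) and computes a rank-2 truncated SVD with `scipy.sparse.linalg.svds`. The ratings are invented, and a plain SVD treats missing ratings as zeros, which is a simplification; this is not the Expectation Maximization factorization in `scikits.crab.svd`.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical (user, item, rating) triples; every other cell is empty.
rows = np.array([0, 0, 1, 2, 3, 3])
cols = np.array([0, 2, 1, 3, 0, 4])
vals = np.array([5.0, 3.0, 4.0, 2.0, 1.0, 5.0])

# 4 users x 5 items, stored in compressed sparse row format.
ratings = csr_matrix((vals, (rows, cols)), shape=(4, 5))

# Fraction of cells that actually hold a rating.
density = ratings.nnz / (ratings.shape[0] * ratings.shape[1])

# Rank-2 truncated SVD; k must be smaller than min(ratings.shape).
u, s, vt = svds(ratings, k=2)

# Low-rank reconstruction: a dense score for every (user, item) pair.
approx = u @ np.diag(s) @ vt
```

On real datasets the density is typically far lower than in this toy matrix, which is why the sparse representation matters: memory grows with the number of ratings rather than with users times items.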
  • 123. How are we working? Our project's home page: http://github.com/python-recsys/crab
  • 124. Future Releases. Planned Release 0.1: Collaborative Filtering algorithms working, sample datasets to load and test. Planned Release 0.11: sparse matrices and database models support. Planned Release 0.12: Slope One algorithm, new factorization techniques implemented ....
  • 125. Join us! 1. Read our Wiki Page https://github.com/python-recsys/crab/wiki/Developer-Resources 2. Check out our current sprints and open issues https://github.com/python-recsys/crab/issues 3. Forks, Pull Requests mandatory 4. Join us at irc.freenode.net #muricoca or at our discussion list http://groups.google.com/group/scikit-crab
  • 129. Recommended Books: Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007; Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009. ACM RecSys, KDD, SBSC...
  • 130. Recommended Conferences: ACM RecSys; ICWSM: Weblog and Social Media; WebKDD: Web Knowledge Discovery and Data Mining; WWW: The original WWW conference; SIGIR: Information Retrieval; ACM KDD: Knowledge Discovery and Data Mining; ICML: Machine Learning
  • 131. Recommender Systems using Python. Marcel Pinheiro Caraciolo marcel@pingmind.com @marcelcaraciolo http://www.pycursos.com