Deriving Conversational Insight by Learning Emoji Representation by Jeff Weintraub

Abstract: It is rare to observe the rise of a new language within a population, and rarer still to observe its adoption on a global scale. Since the introduction of the emoji keyboard on iOS in 2011, emoji use in textual communication has grown steadily into a common vernacular on social media. As of April 2015, Instagram reported that nearly half of all text contained emojis and that, in some countries, over 60% of texts contained emoji characters. For power users of social media, and for marketers looking for audiences on these platforms, it is increasingly important to capture emoji data and derive insight from its use, to better understand what intent or meaning it carries in the conversation. Jeff Weintraub, VP of Technology at theAmplify, a creative Brandtech Influencer Service and a subsidiary of You & Mr Jones, the World's First Brandtech Group, will briefly summarize the data science behind learning emoji representations and present recent trends in emoji usage in advertising and branded marketing campaigns on social media.

Published in: Technology

  1. Deriving Conversational Insight by Learning Emoji Representations. Jeff Weintraub, VP, Technology. A You & Mr Jones company.
  2. Agenda (//BigDataLA2017): 1. Emoji Adoption; 2. Emojineering; 3. Conversational Insight
  3. Product & Technology Development: 1. Emoji Adoption
  4. Emoji Adoption - Instagram
     - October 2011: emoji keyboard launches on iOS
     - 10% of Instagram comments contained emoji (Nov 2011)
     - 50%+ of Instagram comments contained emoji (March 2015)
     - See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trendsmachine-learning-for-emoji-trends-7f5f9cb979ad
  5. Emoji Adoption - Instagram
     - 2,666 emojis in the Unicode Standard as of May 2017
     - -0.93 correlation coefficient within respective cohorts
     - See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trendsmachine-learning-for-emoji-trends-7f5f9cb979ad
  6. Product & Technology Development: 2. Emojineering
  7. Emojineering: "Ford GTs are the [emoji]"; "Ford GTs are [emoji]" (emoji images not preserved)
  8. Emojineering: "Ford GTs are the [emoji]" (Pos); "Ford GTs are [emoji]" (Neg) (emoji images not preserved)
  9. Emojineering: NLP Semantic Analysis
     - N-gram Neural Network Language Model (NNLM)
     - Continuous Skip-gram Model
     - Trained with stochastic gradient descent (SGD) and backpropagation
     - Maximize classification of a word based on another word in the same sentence
     - Q = training complexity; the goal is to minimize it so the model can be trained efficiently on more data
     - C is the maximum distance of the words
     - V is the size of the vocabulary (the output layer dimensionality)
     - See Mikolov et al., Efficient Estimation of Word Representations in Vector Space, 2013
  10. Emojineering: Skip-gram Model
     - If we choose C = 5, for each training word we randomly select a number R in the range <1; C>, and then use R words from the history and R words from the future of the current word as correct labels.
     - Increasing the range improves the quality of the resulting word vectors, but it also increases the computational complexity.
     - See Mikolov et al., Efficient Estimation of Word Representations in Vector Space, 2013
  11. Emojineering: Distributional Hypothesis
     - Words that occur in similar contexts tend to have similar meanings (Harris, 1954; Firth, 1957; Deerwester et al., 1990)
     - Training Accuracy: 300-dimensional vectors (words and emojis); 3 million phrases; 6B tokens
     - the, Ford, GT; cars, Ford, :)
     - See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trendsmachine-learning-for-emoji-trends-7f5f9cb979ad
  12. Emojineering - Visualization: the, Ford, GT; cars, Ford, :) (figure not preserved)
  13. Emojineering: Distributional Hypothesis
     - Words that occur in similar contexts tend to have similar meanings (Harris, 1954; Firth, 1957; Deerwester et al., 1990)
     - 100 billion words
     - Model contains 300-dimensional vectors for 3 million words and phrases
     - the, Ford, GT; cars, Ford, :)
     - 3. Conversational Insight
  14. Conversational Insight - Entertainment Vertical
     - 65.23% of emojis used were Top 10 emojis
     - 34.7% of emoji uses were 😂 and 😍
     - 30.01% of emojis used were semantically relevant to key words
  15. Conversational Insight - Retail Vertical
     - 58.14% of emojis used were Top 10 emojis
     - 22.5% of emoji uses were 😍 and ❤
     - 11.78% of emojis used were semantically relevant to key words
  16. Conversational Insight - Beauty Vertical
     - 71.22% of emojis used were Top 10 emojis
     - 37.8% of emoji uses were 😂 and 😍
     - 4% of emojis used were semantically relevant to key words
  17. Agenda: 1. Emoji Adoption; 2. Emojineering; 3. Conversational Insight
  18. Thank You! jeff@theamplify.com, @jeff_weintraub. A You & Mr Jones company.
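
Slide 9 defines the symbols of a training-complexity formula, but the formula itself is missing from this transcript. For the continuous skip-gram model, the paper cited on that slide (Mikolov et al., 2013) gives the expression below. The symbol D, the dimensionality of the word vectors, is not defined on the slide and is taken from the paper, so treat it as an assumption about what the slide showed.

```latex
Q = C \times \left( D + D \times \log_2 V \right)
```

The \log_2 V term comes from the hierarchical softmax output layer used in the paper, which is why the cost per training word grows logarithmically rather than linearly with vocabulary size.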

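The window sampling described on slide 10 (draw R from 1 to C, then take R words before and R words after the current word as labels) can be sketched as follows. This is a minimal illustration, not the code used in the talk: the function name `skipgram_pairs`, its parameters, and the example sentence with its emoji are hypothetical.

```python
import random
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], max_window: int = 5,
                   seed: int = 0) -> List[Tuple[str, str]]:
    """Build (center, context) training pairs with a randomly sized window.

    max_window plays the role of C on the slide; r plays the role of R.
    """
    rng = random.Random(seed)
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        # R is drawn uniformly from 1..C, both ends inclusive.
        r = rng.randint(1, max_window)
        start = max(0, i - r)                # up to R words of history
        stop = min(len(tokens), i + r + 1)   # up to R words of future
        for j in range(start, stop):
            if j != i:                       # a word is not its own label
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical example sentence; the emoji is treated as an ordinary token.
tokens = ["ford", "gts", "are", "the", "🔥"]
for center, context in skipgram_pairs(tokens, max_window=5):
    print(center, "->", context)
```

Because R varies per word, nearby words are sampled as labels more often than distant ones, which is the effect the random window is meant to produce.
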
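
The "Top 10" percentages on slides 14 to 16 are shares of all emoji uses accounted for by the most frequent emojis. A minimal sketch of that arithmetic is shown here, assuming the emojis have already been extracted from the comments into a flat list. The function name `top_k_share` and the sample data are hypothetical and are not the campaign data behind the slides.

```python
from collections import Counter
from typing import Iterable


def top_k_share(emojis: Iterable[str], k: int = 10) -> float:
    """Return the percentage of all emoji uses covered by the k most used emojis."""
    counts = Counter(emojis)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return 100.0 * top / total


# Hypothetical sample: 10 emoji uses, 8 of which are the two most common emojis.
sample = ["😂"] * 5 + ["😍"] * 3 + ["❤"] + ["🔥"]
print(top_k_share(sample, k=2))  # 80.0
```

The same function with k=2 gives the kind of figure the slides report for the two leading emojis in each vertical.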