The solr power

Motivation for using solr as a NoSQL backend

Published in: Technology, Health & Medicine

  1. The solr Power
     Tareque Hossain, Sr. Software Engineer
  2. What about it?
     • We always associate solr with searching
     • solr can also serve as your non-relational data layer
  3. NoSQL? solr?
  4. Why solr?
     • Hey, solr is already part of my stack
     • I love solr
     • It’s fast, scalable and there are some great Python interfaces out there
  5. When would you consider it?
     • You have a DB that’s frequently read and infrequently written
     • You want robust search & filtering on your data
     • You want to leverage the faceting feature
     • You want a decently scalable data layer
  6. What’s not so cool?
     • Doesn’t support transactions
     • Not all SQL queries can be translated into solr queries
     • Generating indices can take a long time
     • Searching and indexing at the same time brings down performance
  7. But..
     • You don’t have to give up your relational data layer
     • Create a non-relational layer on top of your relational data layer
     • Get the best of both worlds
  8. So what’s the use case?
     • We deal with medical survey data
     • Say:
       – About 300 multiple choice questions
       – Responses can be multi-dimensional
       – 7000+ different answer choices per question
       – 2000+ respondents per survey
       – 15+ surveys and growing
  9. What a survey question looks like
     When were you diagnosed with the following types of Arthritis?

                            Rheumatoid   Traumatic   Psoriatic   Osteoarthritis   Other
                            Arthritis    Arthritis   Arthritis
     Less than a year ago       ☑            ☐           ☐             ☐            ☐
     More than a year ago       ☐            ☐           ☑             ☐            ☐
  10. Storing a single response
     When were you diagnosed with the following types of Arthritis?

                            Rheumatoid   Traumatic   Psoriatic   Osteoarthritis   Other
                            Arthritis    Arthritis   Arthritis
     Less than a year ago       1            0           0             0            0
     More than a year ago       0            0           1             0            0
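The 0/1 matrix on this slide can be produced by flattening each (row, column) cell into one boolean flag. A minimal sketch of that encoding follows; the labels come from the example question, while the `row__column` field-naming scheme and the function name `encode_response` are assumptions made for illustration, not the author's code.

```python
# Labels from the example question; the naming scheme is hypothetical.
ROWS = ["less_than_a_year_ago", "more_than_a_year_ago"]
COLUMNS = [
    "rheumatoid_arthritis",
    "traumatic_arthritis",
    "psoriatic_arthritis",
    "osteoarthritis",
    "other",
]


def encode_response(checked):
    """Return {field_name: bool} with one flag per (row, column) cell."""
    return {
        f"{row}__{col}": (row, col) in checked
        for row in ROWS
        for col in COLUMNS
    }


# The boxes ticked on the "What a survey question looks like" slide.
checked = {
    ("less_than_a_year_ago", "rheumatoid_arthritis"),
    ("more_than_a_year_ago", "psoriatic_arthritis"),
}
flags = encode_response(checked)

# Rebuild the 0/1 matrix shown on the slide from the flat flags.
matrix = [[int(flags[f"{r}__{c}"]) for c in COLUMNS] for r in ROWS]
print(matrix)  # [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]]
```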
  11. Aggregating over 2000 responses
     When were you diagnosed with the following types of Arthritis?

                            Rheumatoid   Traumatic   Psoriatic   Osteoarthritis   Other
                            Arthritis    Arthritis   Arthritis
     Less than a year ago       63          155          19            27          268
     More than a year ago      190           46           8           213          325
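Conceptually, the aggregate table is the cell-by-cell sum of every respondent's 0/1 matrix. The sketch below shows that arithmetic in plain Python on three made-up responses; it illustrates the idea only and is not how solr computes facet counts internally.

```python
def aggregate(responses):
    """Sum equally shaped 0/1 matrices cell by cell."""
    n_rows, n_cols = len(responses[0]), len(responses[0][0])
    totals = [[0] * n_cols for _ in range(n_rows)]
    for matrix in responses:
        for i in range(n_rows):
            for j in range(n_cols):
                totals[i][j] += matrix[i][j]
    return totals


# Three hypothetical respondents (illustrative data, not from the survey).
responses = [
    [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]],
    [[1, 0, 0, 0, 0], [0, 0, 0, 1, 0]],
    [[0, 0, 0, 0, 1], [0, 0, 1, 0, 0]],
]
totals = aggregate(responses)
print(totals)  # [[2, 0, 0, 0, 1], [0, 0, 2, 1, 0]]
```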
  12. The Document Structure
     • Each survey response = solr document
     • Up to 3000 boolean variables per document indicating chosen answers
     • Added meta information: age, profession, interests
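A document of this shape might look like the dictionary built below. This is a sketch under assumptions: the `_b` suffix follows the dynamic boolean field convention found in Solr's example schemas, and the field names, the id format and the metadata values are all hypothetical.

```python
import json


def build_document(response_id, meta, chosen, all_choices):
    """Build one solr-style document for a single survey response.

    Every known answer choice becomes a boolean field, true only
    if the respondent picked it; meta fields are copied as-is.
    """
    doc = {"id": response_id}
    doc.update(meta)
    for choice in all_choices:
        doc[f"{choice}_b"] = choice in chosen  # "_b" suffix is an assumption
    return doc


# Hypothetical answer-choice identifiers and metadata.
all_choices = ["q1_ra_lt1y", "q1_ra_gt1y", "q1_psa_lt1y", "q1_psa_gt1y"]
doc = build_document(
    "survey1-resp42",
    {"age": 54, "profession": "teacher", "interests": ["gardening", "cycling"]},
    chosen={"q1_ra_lt1y", "q1_psa_gt1y"},
    all_choices=all_choices,
)
print(json.dumps(doc, indent=2))
```

Storing an explicit `false` for unchosen answers keeps every document the same shape, at the cost of a wider document.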
  13. Querying
     • Filter by age, interest, profession
     • Facet across boolean field
     • Result: what group of people chose what group of answers
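One way to express "filter, then facet" is with Solr's standard request parameters: `fq` for each filter and `facet.query` for each boolean field to count. The sketch below only builds the query string and sends nothing; the field names and filter values are hypothetical, and the deck does not show the author's actual queries.

```python
from urllib.parse import urlencode, parse_qs


def facet_query_params(filters, boolean_fields):
    """Build a Solr query string that filters, then counts true flags."""
    params = [
        ("q", "*:*"),       # match all documents; filters narrow them down
        ("rows", "0"),      # only facet counts are wanted, no documents
        ("wt", "json"),
        ("facet", "true"),
    ]
    params += [("fq", f) for f in filters]
    params += [("facet.query", f"{name}:true") for name in boolean_fields]
    return urlencode(params)


# Hypothetical filters and field names.
qs = facet_query_params(
    filters=["age:[40 TO 60]", "profession:teacher"],
    boolean_fields=["q1_ra_lt1y_b", "q1_psa_gt1y_b"],
)
print(qs)
```

The resulting string would be appended to a core's `/select` endpoint; the host and core name depend on the deployment.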
  14. Why solr is awesome..
     • Faceting across boolean field uses very little memory
     • Combining 3000 fields for 2000 documents takes 1 to 2 ms
     • Allowed us to reduce API response time from a variable 2 to 15 seconds (sucked!) to an almost constant ~50 ms
  15. Good to know..
     • sunburnt: Awesome Python solr interface (github.com/tow/sunburnt)
     • Programmatic querying as well as raw queries
     • Supports most advanced solr options
     • If you only require facets, specify rows=0
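With rows=0 the response carries no documents, only the match count and the facet counts. The sketch below reads per-answer counts out of a hand-written payload shaped like Solr's JSON response for `facet.query` parameters. It does not use sunburnt, and the field names and numbers in the payload are illustrative only.

```python
import json

# Hand-written sample in the shape of a Solr JSON response (illustrative).
sample_response = json.loads("""
{
  "response": {"numFound": 2000, "start": 0, "docs": []},
  "facet_counts": {
    "facet_queries": {"q1_ra_lt1y_b:true": 63, "q1_psa_gt1y_b:true": 8},
    "facet_fields": {}
  }
}
""")


def answer_counts(response):
    """Map each boolean field name to the number of matching documents."""
    counts = {}
    for query, count in response["facet_counts"]["facet_queries"].items():
        field, _, _ = query.partition(":")  # drop the ":true" part
        counts[field] = count
    return counts


print(answer_counts(sample_response))
```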
  16. Questions?
     • wisertogether.com
     • slideshare.net/tarequeh/the-solr-power
     • @tarequeh
