2. "Big data is like teenage sex:
everyone talks about it,
nobody really knows how to do it,
everyone thinks everyone else is doing it,
so everyone claims they are doing it..."
- Dan Ariely, Duke University
25. 4.4 vs. 4.5
4.4 - no routing
vs.
4.5 - routing via CloudSolrServer
Faster AFS flushes
Faster indexing: 3.5 M vs. 4 M docs in 55 min
Lower heap: circa 10%
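A rough sketch of what client-side routing buys: the client works out which shard owns each document and sends it straight there, instead of letting an arbitrary node forward it. This is an illustration only. Solr itself hashes ids with MurmurHash3 and looks up shard hash ranges from the cluster state in ZooKeeper; the CRC32-modulo scheme and the names `shard_for` and `route_batch` below are assumptions made for the example.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard from a stable hash of the document id.

    Stand-in for Solr's hash-range lookup (assumption: CRC32 modulo shard count).
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def route_batch(docs, num_shards):
    """Group a batch by target shard so each group can go directly to that shard."""
    batches = defaultdict(list)
    for doc in docs:
        batches[shard_for(doc["id"], num_shards)].append(doc)
    return dict(batches)


# Hypothetical batch of ten documents spread over three shards.
docs = [{"id": f"doc-{i}"} for i in range(10)]
routed = route_batch(docs, num_shards=3)
```

Without routing, every document may take an extra hop from the receiving node to the shard leader; grouping on the client removes that hop.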
26. DocValues
<field name="id" type="string" indexed="true" stored="true" required="true" docValues="true" />
<dynamicField name="dim_str_*" type="string" indexed="true" stored="false" multiValued="true" docValues="true" />
<dynamicField name="dim_num_*" type="tlong" indexed="true" stored="false" multiValued="true" docValues="true" />
Forward index data structure
… in memory and/or on disk
… compressible
… off-heap
See
https://cwiki.apache.org/confluence/display/solr/DocValues
http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
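To make "forward index" concrete, here is a toy comparison with an inverted index. It is a conceptual sketch, not how Lucene stores DocValues on disk; the field name `dim_str_host` is a hypothetical match for the `dim_str_*` pattern, and the helper names are made up for the example.

```python
from collections import defaultdict

# Three hypothetical documents with one multi-valued string field.
docs = [
    {"id": "a", "dim_str_host": ["web1"]},
    {"id": "b", "dim_str_host": ["web2"]},
    {"id": "c", "dim_str_host": ["web1"]},
]


def build_inverted(docs, field):
    """Inverted index: value -> list of document numbers (good for searching)."""
    inverted = defaultdict(list)
    for doc_num, doc in enumerate(docs):
        for value in doc.get(field, []):
            inverted[value].append(doc_num)
    return dict(inverted)


def build_forward(docs, field):
    """Forward index: document number -> values (good for faceting and sorting)."""
    return [doc.get(field, []) for doc in docs]


def facet_counts(forward, matching_doc_nums):
    """Count field values over a result set by direct per-document lookup."""
    counts = defaultdict(int)
    for doc_num in matching_doc_nums:
        for value in forward[doc_num]:
            counts[value] += 1
    return dict(counts)


inverted = build_inverted(docs, "dim_str_host")
forward = build_forward(docs, "dim_str_host")
counts = facet_counts(forward, [0, 2])  # facet over documents 0 and 2
```

Faceting, sorting, and grouping all ask "what is the value for this document?", which the forward layout answers with a single lookup instead of un-inverting the inverted index on the heap.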
27. !DocValues vs. DocValues
With DocValues:
… smaller heap
… more frequent, but smaller GCs
… roughly equal query latency
… faster faceting, sorting, and grouping (but we are not using that in SPM with Solr)
… didn’t see a slowdown in indexing (hmmm)