Native Code & Off-Heap Data Structures for Solr: Presented by Yonik Seeley, Heliosearch

Native Code & Off-Heap Data
Structures for Solr
Yonik Seeley
Lucene/Solr Revolution 2014
Washington, D.C.

My Background
• Creator of Solr
• Heliosearch Founder
• LucidWorks Co-Founder
• Lucene/Solr committer, PMC member
• Apache Software Foundation member
• M.S. in Computer Science, Stanford

Heliosearch Project
• The Next Evolution of Solr
• Forked from Solr, Developing at github
– Started Jan 2014
– Well aligned community
– Open Source, Apache licensed
• Bring back to Apache in the future?
• Currently drop-in replacement for Solr at the HTTP-API level
– A super-set… we continually merge in upstream changes
– Latest version of Heliosearch includes latest Solr
• Current Features: Off-heap filters, Off-heap fieldcache, facet-by-function,
sub-facets, native code performance enhancements

Garbage Collection Basics
[Diagram: heap regions (Eden Space, Survivor Space 1, Survivor Space 2, Tenured Space, Permanent Space), with a thread as a GC root]
• New objects allocated in Eden
• Find live objects by tracing from GC “roots” (threads, stack locals, etc)
• Make a copy of live objects, leaving “garbage” behind
• Eden + Survivor Space copied together to other Survivor space
• Tenured from Survivor when old enough
• “stop-the-world” needed when GC can’t keep up
• Out of memory when too much time spent in GC

Java Memory Waste
- Need to size for worst case scenario
- OS needs free memory to cache index files
- JVMs aren’t good at “sharing” with rest of the system
- mmap allocations managed by OS, can be immediately reused on free
[Diagram: OS real memory divided among two JVMs (each with max heap, unused heap, and heap in use), two C processes (each with unused C heap and C heap in use), and mmap allocations. The remaining “free” memory includes the buffer cache, which is important for caching index files.]

GC Impact
q GC Reduces Throughput
q Time to copy all that memory around could be spent
better!
q Stop-the-world pauses
q Seconds to Minutes long
q Pause time proportional to heap size
q Still exists in all Hotspot GCs… CMS, G1GC, etc
q Breaks Application SLAs (request timeouts, etc)
q Can cause SolrCloud Zookeeper session timeouts
q Reducing max pause size normally means reduced
throughput
q Non-graceful degradation
q if you don't size your heap big enough… BOOM!

GC Tuning
UseSerialGC
UseParallelGC
UseParallelOldGC
UseParallelOldGCCompacting
UseParallelDensePrefixUpdate
HeapMaximumCompactionInterval
HeapFirstMaximumCompactionCount
UseMaximumCompactionOnSystemGC
ParallelOldDeadWoodLimiterMean
ParallelOldDeadWoodLimiterStdDev
UseParallelOldGCDensePrefix
ParallelGCThreads
ParallelCMSThreads
YoungPLABSize
OldPLABSize
GCTaskTimeStampEntries
AlwaysTenure
NeverTenure
ScavengeBeforeFullGC
UseConcMarkSweepGC
ExplicitGCInvokesConcurrent
UseCMSBestFit
UseCMSCollectionPassing
UseParNewGC
ParallelGCVerbose
ParallelGCBufferWastePct
ParallelGCRetainPLAB
TargetPLABWastePct
PLABWeight
ResizePLAB
PrintPLAB
ParGCArrayScanChunk
ParGCDesiredObjsFromOverflowList
CMSParPromoteBlocksToClaim
AlwaysPreTouch
CMSUseOldDefaults
CMSYoungGenPerWorker
CMSIncrementalMode
CMSIncrementalDutyCycle
CMSIncrementalPacing
CMSIncrementalDutyCycleMin
CMSIncrementalSafetyFactor
CMSIncrementalOffset
CMSExpAvgFactor
CMS_FLSWeight
CMS_FLSPadding
FLSCoalescePolicy
CMS_SweepWeight
CMS_SweepPadding
CMS_SweepTimerThresholdMillis
CMSClassUnloadingEnabled
CMSCompactWhenClearAllSoftRefs
UseCMSCompactAtFullCollection
CMSFullGCsBeforeCompaction
CMSIndexedFreeListReplenish
CMSLoopWarn
CMSMarkStackSize
CMSMarkStackSizeMax
CMSMaxAbortablePrecleanLoops
CMSMaxAbortablePrecleanTime
CMSAbortablePrecleanMinWorkPerIteration
CMSAbortablePrecleanWaitMillis
CMSRescanMultiple
CMSConcMarkMultiple
CMSRevisitStackSize
CMSAbortSemantics
CMSParallelRemarkEnabled
CMSParallelSurvivorRemarkEnabled
CMSPLABRecordAlways
CMSConcurrentMTEnabled
CMSPermGenPrecleaningEnabled
CMSPermGenSweepingEnabled
CMSPrecleaningEnabled
CMSPrecleanIter
CMSPrecleanNumerator
CMSPrecleanDenominator
CMSPrecleanRefLists1
CMSPrecleanRefLists2
CMSPrecleanSurvivors1
CMSPrecleanSurvivors2
CMSPrecleanThreshold
CMSCleanOnEnter
CMSRemarkVerifyVariant
CMSScheduleRemarkEdenSizeThreshold
CMSScheduleRemarkEdenPenetration
CMSScheduleRemarkSamplingRatio
CMSSamplingGrain
CMSScavengeBeforeRemark
CMSWorkQueueDrainThreshold
CMSWaitDuration
CMSYield
CMSBitMapYieldQuantum
UseGCLogFileRotation
NumberOfGCLogFiles
GCLogFileSize
LargePageSizeInBytes
LargePageHeapSizeThreshold
PrintGCApplicationConcurrentTime
PrintGCApplicationStoppedTime
OnOutOfMemoryError
ClassUnloading
BlockOffsetArrayUseUnallocatedBlock
RefDiscoveryPolicy
ParallelRefProcEnabled
CMSTriggerRatio
CMSBootstrapOccupancy
CMSInitiatingOccupancyFraction
UseCMSInitiatingOccupancyOnly
HandlePromotionFailure
PreserveMarkStackSize
ZeroTLAB
PrintTLAB
TLABStats
AlwaysActAsServerClassMachine
DefaultMaxRAM
DefaultMaxRAMFraction
DefaultInitialRAMFraction
UseAutoGCSelectPolicy
AutoGCSelectPauseMillis
UseAdaptiveSizePolicy
UsePSAdaptiveSurvivorSizePolicy
UseAdaptiveGenerationSizePolicyAtMinorCollection
UseAdaptiveGenerationSizePolicyAtMajorCollection
UseAdaptiveSizePolicyWithSystemGC
UseAdaptiveGCBoundary
AdaptiveSizeThroughPutPolicy
AdaptiveSizePausePolicy
AdaptiveSizePolicyInitializingSteps
AdaptiveSizePolicyOutputInterval
UseAdaptiveSizePolicyFootprintGoal
AdaptiveSizePolicyWeight
AdaptiveTimeWeight
PausePadding
PromotedPadding
SurvivorPadding
AdaptivePermSizeWeight
PermGenPadding
ThresholdTolerance
AdaptiveSizePolicyCollectionCostMargin
YoungGenerationSizeIncrement
YoungGenerationSizeSupplement
YoungGenerationSizeSupplementDecay
TenuredGenerationSizeIncrement
TenuredGenerationSizeSupplement
TenuredGenerationSizeSupplementDecay
MaxGCPauseMillis
MaxGCMinorPauseMillis
GCTimeRatio
AdaptiveSizeDecrementScaleFactor
UseAdaptiveSizeDecayMajorGCCost
AdaptiveSizeMajorGCDecayTimeScale
MinSurvivorRatio
InitialSurvivorRatio
BaseFootPrintEstimate
UseGCOverheadLimit
GCTimeLimit
GCHeapFreeLimit
PrintAdaptiveSizePolicy
DisableExplicitGC
CollectGen0First
BindGCTaskThreadsToCPUs
UseGCTaskAffinity
ProcessDistributionStride
CMSCoordinatorYieldSleepCount
CMSYieldSleepCount
PrintGCTaskTimeStamps
TraceClassLoadingPreorder
TraceGen0Time
TraceGen1Time
PrintTenuringDistribution
PrintHeapAtSIGBREAK
TraceParallelOldGCTasks
PrintParallelOldGCPhaseTimes
MaxHeapSize
MaxNewSize
PretenureSizeThreshold
MinTLABSize
TLABAllocationWeight
TLABWasteTargetPercent
TLABRefillWasteFraction
TLABWasteIncrement
MaxLiveObjectEvacuationRatio
OldSize
MinHeapFreeRatio
MaxHeapFreeRatio
SoftRefLRUPolicyMSPerMB
MinHeapDeltaBytes
MinPermHeapExpansion
MaxPermHeapExpansion
QueuedAllocationWarningCount
MaxTenuringThreshold
InitialTenuringThreshold
TargetSurvivorRatio
MarkSweepDeadRatio
PermMarkSweepDeadRatio
MarkSweepAlwaysCompactCount
PrintCMSStatistics
PrintCMSInitiationStatistics
PrintFLSStatistics
PrintFLSCensus
DeferThrSuspendLoopCount
DeferPollingPageLoopCount
SafepointSpinBeforeYield
UseDepthFirstScavengeOrder
GCDrainStackTargetSize
ThreadSafetyMargin
CodeCacheMinimumFreeSpace
MaxDirectMemorySize
PerfDataMemorySize
AggressiveHeap
UseCompressedStrings
UseStringCache
HeapDumpOnOutOfMemoryError
HeapDumpPath
PrintGC
PrintGCDetails
PrintGCTimeStamps
G1HeapRegionSize
G1ReservePercent
G1ConfidencePercent
PrintPromotionFailure
PrintGCDateStamps
-XX:InitiatingHeapOccupancyPercent=n
-XX:MaxGCPauseMillis=n
-XX:ConcGCThreads=n
-XX:MaxHeapFreeRatio=70
-XX:MaxTenuringThreshold=n
-XX:+ScavengeBeforeFullGC

GC Reduction
q Reuse objects – cause less garbage
q Move certain things off-heap (invisible to GC)
q Option1: Direct ByteBuffers
q Limited to “int” (2GB)
q No way to directly “free” – still relies on GC
q Option2: sun.misc.Unsafe
q malloc() + free() + direct memory access
q Supported on all major JVMs
q Widely used: Java (nio, concurrent),JSR166, Google
Guava, objenesis (which is used in Kyro, which is used
in Twitter Storm), Apache DirectMemory,Lightning,
Hazelcast, snappy, gson, …
q Being considered for Java 9

Off-Heap Filters
50M docs
(3.8 GB index)
8GB RAM
20K requests
8 req threads
500 filters
JVM Options:
-Xmx4G (solr)

Off-Heap Filters Test
Observed max process sizes:
Solr: 3.8GB – 4.3GB
Heliosearch: 3.6GB – 3.7GB

Off-Heap FieldCache
Normal (on-heap) FieldCache
• Typically the largest data structures kept on the heap
• Used for sorting, function query values, single-valued faceting, grouping
• Uses weak references
Heliosearch nCache (n is for “native”)
• Allocated off-heap
• First-class managed Solr cache
– Configure size, warming policies
– View statistics
• Per-segment (NRT friendly)
• No weak references

nCache admin stats
item_id:{
"field":"id",
"uses":8,
"class":"StrTopValues",
"refcount":2,
"numSegments":7,
"carriedOver":6,
"size":612}
item_popularity:{
"field":"popularity",
"uses":5,
"class":"IntTopValues",
"refcount":2,
"numSegments":7,
"carriedOver":6,
"size":106}
item_price:{
"field":"price",
"uses":0,
-- the number of top-level uses for searcher
"class":"FloatTopValues",
"refcount":2,
"numSegments":5,
-- number of segments populated
"carriedOver":5,
-- number of segments carried over from last searcher
"size":272
-- size in bytes for all populated segments
}

Off-Heap Integer Field
q 50M document index
q Sorting on 6 different integer fields (10,100,1000,10000,1M unique values)
q 4 request threads
Results
q 42% faster sorting
q 73% faster functions

String Field Sorting
q 10M document index
q 10 different string fields, each field 80% populated
q Median latency

String Field Sorting Throughput
q Concurrent throughput sorting on random fields in random order (asc/desc)
q ~50% performance gain

Native Code
q The Idea: create native accelerators for CPU hotspots
q Faceting anyone?
q But…. JNI Sucks! (and it’s GC’s fault again)
jint
*buf=
(*env)-‐>GetIntArrayElements(env,
arr,
0);
for
(i=0;
i<len;
i++)
{
sum
+=
buf[i];
q GetArrayElements() – makes a *copy* of the array!
q GetPrimitiveArrayCritical() – blocks garbage collection!
q Tons of other restrictions… it’s a “critical section”
q Defeats the purpose of going to native code in the first place
q But… our data is already off-heap, we’re good!
}

Native Single Valued String Faceting
q Top-Level off-heap String cache
q Improves Sorting and Faceting speed
q Eliminates FieldCache “insanity”
q Native Code
q Written in C++, compiled with GCC 4.7, 4.8
q Currently supports 64 bit Windows, OS-X, Linux (x86)
q static compilation avoids JVM hotspot warmup period,
mis-compilation bugs, and variations between runs

Facet Module Goals
q Replace the aging “SimpleFacets”
q First class JSON support
q Easier programmatic construction of complex nested facet
commands
q Canonical response format that is easier for clients to
parse
q First class analytics support
q Cleaner distributed search support
q Fully pluggable
q Better base for integration of other search features
Heliosearch is a Solr super-set, so you can still chose to
use the old faceting or mix-n-match.

API Comparison
Old Style:
&facet=true
&facet.range={!key=age_ranges}age
&f.age_ranges.facet.range.start=0
&f.age_ranges.facet.range.end=100
&f.age_ranges.facet.range.gap=10
&facet.range={!key=price_ranges}price
&f.price_ranges.facet.range.start=0
&f.price_ranges.facet.range.end=1000
&f.price_ranges.facet.range.gap=50

New JSON API:
{
  age_ranges: {        // facet name
    range: {           // facet type
      field : age,     // facet params
      start : 0,
      end : 100,
      gap : 10
    }
  },
  price_ranges: {
    range: {
      field : price,
      start : 0,
      end : 1000,
      gap : 50
    }
  }
}

Facet Functions
q Sort/Report by things other than “count”
Aggregation Functions / Stats:
count
sum(function)
avg(function)
sumsq(function)
min(function)
max(function)
unique(string_field)
any
“funcKon
query”
that
yields
a
numeric
value!
Example:
sum(mul(num_units,
unit_price))
q Stats are calculated “per bucket”
q Buckets created by Query, Range, or Terms (field) facets

Simple Request + Response
$ curl http://localhost:8983/solr/query -d 'q=widgets&
json.facet=
{ // Comments can help with clarity
  /* traditional C-style comments are also supported */
  x : "avg(price)" ,    // Simple strings can occur unquoted
  y : 'unique(brand)'   // Strings can also use single quotes
}
'

[…]
"facets" : {
  "count" : 314,   // Number of documents in the facet bucket
  "x" : 102.5,
  "y" : 28
}

Terms Facet Example
json.facet={
  shoes:{
    terms:{
      field: shoe_style,
      sort: {x : desc},
      facet:{              // Executed per-bucket
        x : "avg(price)",
        y : "unique(brand)"
      }
    }
  }
}

"facets": {
  "count" : 472,
  "shoes": {
    "buckets" : [
      {
        "val" : "Hiking",
        "count" : 34,
        "x" : 135.25,
        "y" : 17
      },
      {
        "val" : "Running",
        "count" : 45,
        "x" : 110.75,
        "y" : 24
      },
      […]

Sub-Facets
q Any facet that produces buckets can have sub-facets
(terms/field, range, query)
q Sub-facets can have facet functions (stats) or their
own sub-facets (no limit to nesting).
q A subfacet can be any type (field, range, query)
q Multiple subfacets can be added to any given facet
q Subfacets are first-class facets - can be configured
independently like any other facet.
q Different offsets, limits, stats, sorts, etc

Sub-Facet Example
json.facet={
  shoes:{
    terms:{
      field: shoe_style,
      sort: {x : desc},
      facet:{
        x : "avg(price)",
        y : "unique(brand)",
        colors :{terms:color}   // Short-form for terms facet simply specifies the field. Sorts buckets by count descending.
      }
    }
  }
}

"facets": {
  "count" : 472,
  "shoes": {
    "buckets" : [
      {
        "val" : "Hiking",
        "count" : 34,
        "x" : 135.25,
        "y" : 17,
        "colors" : {
          "buckets" : [
            { "val" : "brown", "count" : 12 },
            { "val" : "black", "count" : 10 },
            […]
          ]
        } // end of colors sub-facet
      }, // end of Hiking bucket
      {
        "val" : "Running",
        "count" : 45,
        "x" : 110.75,
        "y" : 24,
        "colors" : {
          "buckets" : […]

Terms Facet
Terms facet creates buckets of docs with the same value in a field
- field – The field name to facet over.
- offset – Used for paging, this skips the first N buckets. Defaults to 0.
- limit – Limits the number of buckets returned. Defaults to 10.
- mincount – Only return buckets with a count of at least this number. Defaults to 1.
- sort – Specifies how to sort the buckets produced. “count” specifies document count,
“index” sorts by the index (natural) order of the bucket value. One can also sort by any
facet function / statistic that occurs in the bucket. The default is “count desc”. This
parameter may also be specified in JSON like sort:{count:desc}. The sort order may
either be “asc” or “desc”
- missing – A boolean that specifies if a special “missing” bucket should be returned that is
defined by documents without a value in the field. Defaults to false.
- numBuckets – A boolean. If true, adds “numBuckets” to the response, an integer
representing the number of buckets for the facet (as opposed to the number of buckets
returned). Defaults to false.
- allBuckets – A boolean. If true, adds an “allBuckets” bucket to the response, representing
the union of all of the buckets. For multi-valued fields, this is different than a bucket for all
of the documents in the domain since a single document can belong to multiple buckets.
Defaults to false.
- prefix – Only produce buckets for terms starting with the specified prefix.
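
How prefix, mincount, sort, offset, and limit interact can be sketched over an in-memory list of field values. This is a simplified illustration under stated assumptions: the class, method, and parameter shapes are made up, only “count desc” and “index” sorts are modeled, and ties in count are assumed to fall back to index order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a tiny in-memory terms facet.
final class TermsFacetSketch {

    static final class Bucket {
        final String val;
        final long count;

        Bucket(String val, long count) {
            this.val = val;
            this.count = count;
        }

        @Override public String toString() { return val + "=" + count; }
    }

    static List<Bucket> termsFacet(List<String> values, String prefix, long mincount,
                                   boolean sortByCount, int offset, int limit) {
        // TreeMap keeps terms in index (natural) order.
        Map<String, Long> counts = new TreeMap<>();
        for (String v : values) {
            if (prefix == null || v.startsWith(prefix)) counts.merge(v, 1L, Long::sum);
        }

        List<Bucket> buckets = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (e.getValue() >= mincount) buckets.add(new Bucket(e.getKey(), e.getValue()));
        }

        if (sortByCount) {
            // List.sort is stable, so equal counts stay in index order (an assumption).
            buckets.sort(Comparator.comparingLong((Bucket b) -> b.count).reversed());
        }

        // Paging is applied last: skip "offset" buckets, return at most "limit".
        int from = Math.min(offset, buckets.size());
        int to = (int) Math.min((long) from + limit, buckets.size());
        return new ArrayList<>(buckets.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> styles = List.of("Hiking", "Running", "Running", "Hiking", "Running", "Sandal");
        System.out.println(termsFacet(styles, null, 1, true, 0, 10));  // [Running=3, Hiking=2, Sandal=1]
        System.out.println(termsFacet(styles, null, 2, true, 0, 10));  // [Running=3, Hiking=2]
        System.out.println(termsFacet(styles, null, 1, false, 1, 1));  // [Running=3]
    }
}
```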

Query Facet
Query facet creates a single bucket of documents matching the
query.
{
  // simple example
  highpop:{
    query:{
      q:"inStock:true AND popularity:[8 TO 10]"
    }
  }
}

{
  // example with multiple sub-facets
  highpop:{
    query:{
      q : "inStock:true AND popularity:[8 TO 10]",
      facet : {
        average_price : "avg(price)",
        available_colors : { terms : color },
        price_ranges : {
          range : {
            field:price,
            start:0,
            end:200,
            gap:10
          }
        }
      }
    }
  }
}

Range Facet
Creates buckets over ranges on a numeric or date field
Parameter names/values "in sync" with Solr range parameters:
field – The numeric field or date field to produce range buckets from
start – Lower bound of the ranges
end – Upper bound of the ranges
gap – Size of each range bucket produced
hardend – A boolean, which if true means that the last bucket will end at “end” even if it is less than “gap” wide. If false,
the last bucket will be “gap” wide, which may extend past “end”.
other – This param indicates that in addition to the counts for each range constraint between facet.range.start and
facet.range.end, counts should also be computed for…
– "before" all records with field values lower than the lower bound of the first range
– "after" all records with field values greater than the upper bound of the last range
– "between" all records with field values between the start and end bounds of all ranges
– "none" compute none of this information
– "all" shortcut for before, between, and after
include – By default, the ranges used to compute range faceting between facet.range.start and facet.range.end are
inclusive of their lower bounds and exclusive of the upper bounds. The “before” range is exclusive and the “after” range is
inclusive. This default, equivalent to “lower” below, will not result in double counting at the boundaries. This behavior can be
modified by the facet.range.include param, which can be any combination of the following options…
– "lower" all gap based ranges include their lower bound
– "upper" all gap based ranges include their upper bound
– "edge" the first and last gap ranges include their edge bounds (ie: lower for the first one, upper for the last one)
even if the corresponding upper/lower option is not specified
– "outer" the “before” and “after” ranges will be inclusive of their bounds, even if the first or last ranges already
include those boundaries.
– "all" shorthand for lower, upper, edge, outer
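
The effect of start, end, gap, and hardend on the buckets produced can be sketched as follows. This is a simplified illustration, not the Solr implementation: it uses long values only, applies the default “lower” inclusion (lower bound inclusive, upper bound exclusive), and ignores the other and include options. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: builds gap-sized range buckets and counts values into them.
final class RangeFacetSketch {

    static final class Range {
        final long low;   // inclusive
        final long high;  // exclusive
        long count;

        Range(long low, long high) {
            this.low = low;
            this.high = high;
        }

        @Override public String toString() { return "[" + low + "," + high + ")=" + count; }
    }

    static List<Range> rangeFacet(long[] values, long start, long end, long gap, boolean hardend) {
        if (gap <= 0) throw new IllegalArgumentException("gap must be positive");

        List<Range> ranges = new ArrayList<>();
        for (long low = start; low < end; low += gap) {
            long high = low + gap;
            // hardend=true clips the last bucket at "end";
            // hardend=false lets it stay "gap" wide and extend past "end".
            if (hardend && high > end) high = end;
            ranges.add(new Range(low, high));
        }

        for (long v : values) {
            for (Range r : ranges) {
                if (v >= r.low && v < r.high) {
                    r.count++;
                    break; // buckets do not overlap, so no double counting
                }
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        long[] prices = {0, 5, 10, 24, 25, 29, 30};
        System.out.println(rangeFacet(prices, 0, 25, 10, false)); // [[0,10)=2, [10,20)=1, [20,30)=3]
        System.out.println(rangeFacet(prices, 0, 25, 10, true));  // [[0,10)=2, [10,20)=1, [20,25)=1]
    }
}
```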

Sub-Facets + Facet-Functions
=
Business Intelligence / Analytics

[Dashboard mock-up]
Fantasy ($1045)
  Top Authors: $423 George R.R. Martin; $347 Brandon Sanderson; $155 JK Rowling
  Top Books: $252 A Game of Thrones; $113 Emperor of Thorns; $101 Nine Princes in Amber; $82 Steel Heart
Sci-Fi ($898)
  Top Authors: $321 Iain M Banks; $218 Neal Asher; $155 Neal Stephenson
  Top Books: $113 Gridlinked; $101 Use of Weapons; $93 Snow Crash; $82 The Skinner
Mystery ($645)
  Top Authors: $191 James Patterson; $145 Patricia Cornwell; $126 John Grisham
  Top Books: $85 One for the Money; $77 Angels & Daemons; $64 Shutter Island; $35 The Firm
Filter By
  State: $852 NJ (14 stores); $658 NY (11 stores); $421 CT (8 stores)
  Chain: $984 Amazoon (14 stores); $734 Houses&Royalty (9 stores); $387 Books-r-us (7 stores)
  Store: $108 Amazoon Branchburg; $93 Books-r-us Bridgewater; $87 H&R NYC
Number of Books
  Chain: 201K Houses&Royalty; 183K Amazoon; 98K Books-r-us
  Store: 193K H&R NYC; 77K Books-r-us Bridgewater; 68K Amazoon Branchburg

Implementation
date_breakout : {
  range: {                        // Creates series of facet buckets based on date
    field: sale_date,
    start : ...,
    end : ...,
    gap : "+1MONTH",
    facet : {
      top_genre : {               // For each date bucket, facet by genre, taking the top 4 by revenue
        terms : {
          field : genre,
          sort : "revenue desc",
          limit : 4,
          facet : {
            revenue : "sum(sales)"   // For each genre bucket, report revenue
          }
        }},
      by_chain: {
        terms : {
          field : chain,
          facet : {
            revenue : "sum(sales)"
          }
        }}
      […]

[Dashboard panel: Fantasy, Sci-Fi, and Mystery genres, each with top authors and top books by revenue]
top_genres:{
  terms:{
    field: genre,
    facet : {
      rev : "sum(sales)",
      top_authors:{
        terms:{
          field : author,
          sort : "rev desc",
          limit : 3,
          facet : {
            rev : "sum(sales)"
          }
        }},
      top_books:{
        terms:{
          field : title,
          sort : "rev desc",
          limit : 4,
          facet : {
            rev : "sum(sales)"
          }
        }}
      […]

[Dashboard panel: “Filter By” revenue and store counts by State, Chain, and Store]
state_breakout:{
  terms:{
    field: state,
    sort: "rev desc",
    facet : {
      rev : "sum(sales)",
      num_stores : "unique(store)"
    }}},
chain_breakout:{
  terms:{
    field: chain,
    sort: "rev desc",
    facet : {
      rev : "sum(sales)",
      num_stores : "unique(store)"
    }}},
store_breakout:{
  terms:{
    field: store,
    sort: "rev desc",
    facet : {
      rev : "sum(sales)"
    }}}

Parameter Substitution
q Parameters / macros substituted across whole request
q Happens before any parsing, so usable in any context
q=price:[ ${low} TO ${high} ]
&low=100
&high=200
q Default values
q=price:[ ${low:0} TO ${high:100} ]
q Nested
q=${price_query}
&price_query=${price_field}:[ ${low} TO ${high} ] AND inStock:true
&price_field=specialPrice
&low=50
&high=100
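
The substitution rules above (plain `${name}`, `${name:default}`, and nesting) can be sketched with a small recursive expander. This is not the Heliosearch implementation: the class name, the nesting depth limit, and the choice to leave unresolved macros untouched are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: expands ${name} and ${name:default} using request parameters.
final class MacroExpanderSketch {
    // group(1) = parameter name, group(2) = optional default value
    private static final Pattern MACRO = Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");
    private static final int MAX_DEPTH = 10; // assumed guard against self-referencing macros

    static String expand(String text, Map<String, String> params) {
        return expand(text, params, 0);
    }

    private static String expand(String text, Map<String, String> params, int depth) {
        if (depth > MAX_DEPTH) throw new IllegalArgumentException("macro nesting too deep");
        Matcher m = MACRO.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            String name = m.group(1);
            String value = params.containsKey(name) ? params.get(name) : m.group(2);
            if (value == null) {
                out.append(m.group()); // assumption: unresolved macros are left as-is
            } else {
                out.append(expand(value, params, depth + 1)); // values may contain macros too
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("price_query", "${price_field}:[ ${low} TO ${high} ] AND inStock:true");
        params.put("price_field", "specialPrice");
        params.put("low", "50");
        params.put("high", "100");
        // prints specialPrice:[ 50 TO 100 ] AND inStock:true
        System.out.println(expand("${price_query}", params));
        // prints price:[ 0 TO 100 ]
        System.out.println(expand("price:[ ${low:0} TO ${high:100} ]", new HashMap<>()));
    }
}
```

Because expansion runs on the raw string before any query parsing, the same macro works inside q, fq, or json.facet alike, which matches the "usable in any context" point.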

New Query Parser Features
q Filters in queries - just like “fq” parameters, but may appear
anywhere in a query
q=(text:elephant –(filter(*:* -price:[ 0 TO 100 ]) OR
filter(date[0 TO 2013]) )
q Constant Score Queries
q=color:(blue OR green)^=1 text:shoes
q Comments in Queries (can nest)
q=+text:elephant /* the main query */ /* boosting part – WIP
{!func}mul(pop,rank)^10 */

Thank You
Help Develop the Next Generation of Solr!
Resources:
q http://heliosearch.org
q https://github.com/Heliosearch/heliosearch
q https://groups.google.com/forum/#!forum/heliosearch
q https://groups.google.com/forum/#!forum/heliosearch-dev
