This document discusses CDH, a company that provides FAST Search for SharePoint services. It provides information on CDH's expertise, partnerships, and consultants. It also summarizes how FAST Search increases insight through better extraction of meaning from queries and content, and the components required for scaling FAST Search deployments across query, crawl, and index layers.
FAST Search for SharePoint
1. CDH
Transform Enterprise Search with FAST Search for SharePoint
2. CDH Quick Facts
About Us
• 22nd Year
• Grand Rapids & Royal Oak
• 30 Staff
Approach
• Vendor Independent
• Non-reseller
• Professional Services Only
Partnerships
• Microsoft Gold
• VMware Enterprise
• Citrix Silver
• Novell Gold
• Cisco Premier
7. CDH Agenda: Insight
• How FAST increases insight
• Insight into how FAST is used to solve
specific business problems
• Insight into what FAST Search high
availability really requires
9. CDH One answer
“Search is the ability to find text strings in
documents”
10. CDH The Problem: Hidden meaning in the searcher’s intent
“What should I know about selling ERP?” - Alan Brewer, Sales Lead
“What should I know about implementing ERP?” - Renee Lo, Consultant
11. CDH Another answer
“Search is the ability to query any document
property”
17. CDH In the box: Dynamic rank algorithms at query time
• Context: Query terms in title vs. body
• Query term proximity: «Bill Gates» vs. «Bill saw the gates»
• «Anchors» match query terms: «...a page about Bill Gates...»
• Click history match: Others clicked a hit for «Bill Gates»
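The query term proximity signal can be illustrated with a small Python sketch. This is not FAST's ranking algorithm; the tokenizer, the function names, and the scoring formula (number of distinct query terms divided by the shortest window that contains them all) are assumptions chosen only to show why «Bill Gates» should outrank «Bill saw the gates».

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_window(query_terms, tokens):
    """Length of the shortest token window containing every query term, or None."""
    needed = set(query_terms)
    last_seen = {}
    best = None
    for i, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = i
            if len(last_seen) == len(needed):
                # Shortest window ending at i starts at the oldest "last seen" position.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def proximity_score(query, text):
    """Hypothetical score in (0, 1]: 1.0 when the terms are adjacent, 0.0 when one is missing."""
    terms = tokenize(query)
    window = min_window(terms, tokenize(text))
    if window is None:
        return 0.0
    return len(set(terms)) / window


if __name__ == "__main__":
    for body in ("Bill Gates", "Bill saw the gates"):
        print(body, "->", proximity_score("Bill Gates", body))
```

A real ranking profile would combine a signal like this with the context, anchor, and click-history signals listed above; the weighting between them is not given in the slides.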
19. CDH Search and the activity feed
Activity feed posts:
• Looking for a knowledge management solution?!?!?
• I love SharePoint
• It’s the best Knowledge Management Solution in the market
• Have you ever built an e-commerce solution on it?
• Our focus is knowledge management, and it just works!
• We use it as a web content management system, and we’re so happy with it
• Great for WCM, Great for KM!
• Just deployed for KM… so good, so far… will get back once the pilot is over!
Topic labels shown alongside the feed: Knowledge Management, Web Content Management, E-Commerce
20. CDH For the geeks…
fql = xrank(string("fast search"),
            or(department:or(string("services"),
                             string("engineering")),
               keywords:string("knowledge management")),
            boost=10000)
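A query like the one above is easier to maintain when it is assembled from small helpers rather than concatenated by hand. The following Python sketch builds the same expression as a string. The helper names are hypothetical, they are not part of any FAST or SharePoint API, and the backslash escaping of quotes is an assumption to verify against the FQL reference.

```python
def fql_string(value):
    """Wrap a phrase in string("..."), escaping backslashes and quotes (assumed escaping rule)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'string("{escaped}")'


def fql_or(*operands):
    """Combine operands with the or() operator."""
    return "or(" + ", ".join(operands) + ")"


def fql_scoped(managed_property, expression):
    """Restrict an expression to one managed property, e.g. department:..."""
    return f"{managed_property}:{expression}"


def fql_xrank(main_expression, rank_expression, boost):
    """Boost hits that also match rank_expression without changing which items match."""
    return f"xrank({main_expression}, {rank_expression}, boost={int(boost)})"


query = fql_xrank(
    fql_string("fast search"),
    fql_or(
        fql_scoped("department", fql_or(fql_string("services"), fql_string("engineering"))),
        fql_scoped("keywords", fql_string("knowledge management")),
    ),
    boost=10000,
)

if __name__ == "__main__":
    print(query)
```

Passing the boost as an integer avoids thousands separators ending up in the generated query text.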
21. CDH In the box: Static rank algorithms at content processing time
• Landing pages: Prefer shallow URLs
• Authority: Links from other pages
• High quality: Boost sites/documents
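Because static rank is computed at content processing time, it depends only on the document, not on the query. The Python sketch below shows one way the three signals could be combined. The formula, the function names, and the `site_boost` parameter are illustrative assumptions, not the product's actual calculation.

```python
import math
from urllib.parse import urlparse


def url_depth(url):
    """Number of non-empty path segments; a site root has depth 0."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def static_rank(url, inbound_links, site_boost=0.0):
    """Hypothetical query-independent score.

    - shallow URLs score higher (landing pages)
    - more inbound links score higher, with diminishing returns (authority)
    - site_boost is a manual bonus for known high-quality sites/documents
    """
    depth_score = 1.0 / (1 + url_depth(url))
    authority_score = math.log1p(inbound_links)
    return depth_score + authority_score + site_boost


if __name__ == "__main__":
    print(static_rank("http://www.microsoft.com", inbound_links=10))
    print(static_rank("http://www.microsoft.com/a/b/c.aspx", inbound_links=10))
```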
22. CDH Customizable content processing
How to Index Content by Location?
• Address, intersection, zip code, names, etc.
– One Microsoft Way, Redmond, WA
• Geodetic coordinates (latitude & longitude)
– 47.639767, -122.129755
– Degrees, minutes, seconds
• 47° 38’ 23.16” N, 122° 7’ 47.1” W
• Universal Transverse Mercator (UTM)
– 10N 565367 5276630
• Military Grid Reference System (MGRS)
– 10T ET 65367 76630
Index Schema (Managed Properties)
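Whichever notation the content uses, it helps to normalize coordinates to decimal degrees before storing them in the index. As a minimal sketch, the Python below converts the degrees, minutes, seconds example from this slide; the regular expression and function names are assumptions, and UTM or MGRS input would need a dedicated projection library that is not shown here.

```python
import re

# Degrees, minutes, seconds, hemisphere. Accepts straight, curly, or prime marks.
DMS_PATTERN = re.compile(
    r"""(\d+)\s*°\s*(\d+)\s*['’′]\s*(\d+(?:\.\d+)?)\s*["”″]\s*([NSEW])"""
)


def dms_to_decimal(text):
    """Convert one DMS value such as 47° 38’ 23.16” N to signed decimal degrees."""
    match = DMS_PATTERN.search(text)
    if match is None:
        raise ValueError(f"not a DMS coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative by convention.
    return -value if hemisphere in "SW" else value


def parse_dms_pair(text):
    """Split 'lat, lon' and convert both halves."""
    lat_text, lon_text = text.split(",")
    return dms_to_decimal(lat_text), dms_to_decimal(lon_text)


if __name__ == "__main__":
    lat, lon = parse_dms_pair("47° 38’ 23.16” N, 122° 7’ 47.1” W")
    print(round(lat, 6), round(lon, 6))
```

The result agrees with the decimal pair on the slide to within the rounding of the seconds value.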
23. CDH Geographic entity extraction
• Requirement
– Parse elements from text
– Tag documents with the individual values
{ name: 'Microsoft',
  address: 'One Microsoft Way, Redmond, WA 98052',
  phone: '1-800-Microsoft (642-7676)',
  path: 'http://www.microsoft.com',
  latitude: '47.639767',
  longitude: '-122.129755' }
• Solution
– Custom regular expression extraction
– Call Bing Maps API
– Return latitude and longitude and store as crawled property
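The solution steps above can be sketched in Python as a regular expression extractor followed by a geocoding call. Everything here is illustrative: the address pattern is deliberately simplistic, the function names are hypothetical, and `fake_geocode` is a stand-in that returns the coordinates from the slide instead of calling the Bing Maps API, whose request format is not shown in the source.

```python
import re

# Simplistic US-style address pattern: capitalized street words, city, state, ZIP.
ADDRESS_PATTERN = re.compile(
    r"(?P<street>(?:[A-Z0-9]\w*\s)+(?:Way|Street|Avenue|Road|Drive)),\s*"
    r"(?P<city>[A-Z][A-Za-z ]+),\s*"
    r"(?P<state>[A-Z]{2})\s+"
    r"(?P<zip>\d{5}(?:-\d{4})?)"
)


def extract_address(text):
    """Return the first address found as a dict of parts, or None."""
    match = ADDRESS_PATTERN.search(text)
    return match.groupdict() if match else None


def tag_document(text, geocode):
    """Extract an address, geocode it, and return values to store as crawled properties."""
    fields = extract_address(text)
    if fields is None:
        return {}
    query = "{street}, {city}, {state} {zip}".format(**fields)
    latitude, longitude = geocode(query)
    return {**fields, "latitude": latitude, "longitude": longitude}


def fake_geocode(address):
    """Stand-in for a real geocoding service call (assumption, no network access)."""
    return (47.639767, -122.129755)


if __name__ == "__main__":
    text = "Visit us at One Microsoft Way, Redmond, WA 98052 for details."
    print(tag_document(text, fake_geocode))
```

Passing the geocoder in as a function keeps the extraction step testable without network access and lets the real service call be swapped in later.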
24. CDH How they did it
Geo-coding with Bing Maps API
[Diagram: FAST Search architecture with a geo-coding stage in content processing. Labels: End Users, Search Center, OpenSearch, Federation, Query Processor, Index Partition, Indexer, Content Processor, Feeder, Data Sources. Content processing stages: Format Conversion, Source Language Detection, Lemmatization, Entity Extraction, Mapper.]
26. CDH Takeaways
• Search ain’t beanbag
• http://www.well.com/~doctorow/metacrap.htm
• FAST Search for SharePoint provides tools
to extract MEANING from content and
queries
28. CDH FAST Search for SharePoint scaleout
Scale-out multiple Search and Indexing “dimensions”
• Query Volume
• Content Volume
• Indexing freshness
• Redundancy options
Performance targets*
• 15M Docs/node
• 25 QPS/node
• 50 docs/sec
No theoretical upper bounds!
*Depends on content and hardware specifics
[Diagram labels: Query Volume, Content Volume, Query and Result Processing, Search, Indexing, Crawling and Content Processing]
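The performance targets lend themselves to a rough sizing calculation. The Python sketch below turns them into node counts and an initial feed time. It assumes the three targets can be applied independently and that 50 docs/sec is a single overall feeding rate; neither assumption is stated on the slide, and real sizing depends on content and hardware.

```python
import math

# Targets quoted on the slide.
DOCS_PER_NODE = 15_000_000
QPS_PER_NODE = 25
FEED_DOCS_PER_SEC = 50


def estimate_sizing(total_docs, peak_qps):
    """Back-of-the-envelope estimate from the quoted targets (illustrative only)."""
    content_nodes = max(1, math.ceil(total_docs / DOCS_PER_NODE))
    query_nodes = max(1, math.ceil(peak_qps / QPS_PER_NODE))
    feed_hours = total_docs / FEED_DOCS_PER_SEC / 3600
    return {
        "content_nodes": content_nodes,
        "query_nodes": query_nodes,
        "initial_feed_hours": round(feed_hours, 1),
    }


if __name__ == "__main__":
    # Example: 40 million documents and a peak of 60 queries per second.
    print(estimate_sizing(40_000_000, 60))
```

Redundancy and indexing freshness, the other two dimensions on the slide, would add nodes on top of these minimums.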
29. CDH Don’t forget SharePoint!
[Diagram: crawl request flow from SharePoint to FAST Search. Components: Web, FAST Content SSA, Admin DB, Admin component, FAST Content SSA Crawl DB, Master Crawl comp., three Crawl comp. instances, FAST Search. Flow labels: Request crawl, Log request, Poll request, Distribute work, Crawl data, Crawl history, Crawl queue additions, Document batches.]
30. CDH SharePoint Search components
[Diagram: single-server topology]
SharePoint Server (all components on one server): Admin, Query, Index P1, Crawl
Database Server (all databases on one instance): Admin, Crawl, Props
31. CDH Search deployment: Query layer build out
[Diagram: query components on multiple servers, index re-partitioned]
SharePoint Servers: Query components with index partitions P1 and P2, plus Admin and Crawl
Database Server (all databases on one instance): Admin, Crawl, Props
32. CDH Search deployment: Crawl layer build out
[Diagram: query components on multiple servers, index re-partitioned, crawl components on multiple servers]
SharePoint Servers (query): four Query components across index partitions P1 and P2
SharePoint Servers (crawl): Admin and Crawl components
Database Server (all databases on one instance): Admin, Crawl, Props
33. CDH Thank You
Royal Oak
306 S. Washington Ave.
Suite 212
Royal Oak, MI 48067
p: (248) 546-1800
Grand Rapids
15 Ionia SW
Suite 270
Grand Rapids, MI 49503
p: (616) 776-1600
www.cdh.com
(c) C/D/H 2007. All rights reserved