Dennis Chaney, Director of Global Product Management, LexisNexis. Presentation from ACTIVATE 2019, the Search and AI Conference hosted by Lucidworks. http://www.activate-conf.com
Solr Migration at Scale: A LexisNexis Journey
1.
2. STAY CONNECTED
Twitter @activate_conf
Facebook @activateconf
#Activate19
Log in to wifi, follow Activate on social media,
and download the event app where you can
submit an evaluation after the session
WIFI NETWORK: Activate2019
PASSWORD: Lucidworks
DOWNLOAD THE ACTIVATE 2019 MOBILE APP
Search Activate2019 in the App/Play store
Or visit: http://crowd.cc/activate19
3. DENNIS CHANEY
Director Global Product Management
Solr Migration at Scale: A LexisNexis Journey
Dennis is the Director of Global Product Management at LexisNexis. He
is based in Raleigh, North Carolina and is responsible for the migration of
Lexis Advance to a new search platform. Dennis has been modernizing
enterprise technology systems for over 25 years, with leadership
experience across architecture, design, engineering development. His
repeated success in delivering enterprise-wide systems using advanced
and emerging technology solutions will position LexisNexis for success
on their search migration journey.
4. Agenda
• Rule of Law
• Business Value of Search
• Scaling Search
• An Integrated Effort
• Innovation
5. ADVANCING THE RULE
OF LAW AROUND THE
WORLD
6. ACCESS A VAST RANGE
OF CONTENT
60,000 trusted legal, news,
business and public records
sources.
81 Billion documents
7. REVEAL THE MEANING
AND ACT ON IT
Smart content combined with
advanced technology helps
you dig deeper, spot hidden
connections and analyze
better so you make sound,
data-driven decisions.
8. COUNT ON DEEP LEGAL
EXPERTISE
Shepard’s® editors and
respected, practicing
attorneys enrich Lexis
Advance content with
analysis and in-depth
commentary to further
enhance your
understanding.
9. CONDUCT RESEARCH
YOUR WAY
Enter keywords, use a
Boolean approach or
choose advanced, form-
based searching. Flexible,
cutting-edge technology
powers your online legal
research.
10. ONLINE SEARCH
ADVANCES ARE KEY TO
THE PRACTICE OF LAW
Legal Research is now
synonymous with online
search.
12. Vision, Goals, and Approach
GOALS
Build a best-in-class search experience by migrating
Lexis Advance search to an open source engine.
Deliver dramatically faster search performance for
Lexis Advance products.
Enable a much faster path to delivery of
search improvements.
APPROACH
Move to a modern open source search engine that
enables faster innovation, is cloud-friendly, and has
lower operational cost.
Implement architecture that ensures search response
time under a second and is noticeably faster than
competition.
Continuous deployment with higher quality and
frequency, enable faster A/B testing experimentation
and automation.
Increase search performance and release frequency to quickly and continuously provide value to New
LexisNexis customers while providing development teams with an easily configurable search platform
meeting market and product needs.
15. Cost of Ingestion Content
Shift from Lambda to Spark
Lambda: Compute Service runs your code in
response to events and messages. Underlying
compute resources, scalability, and security managed
by AWS.
Estimated Avg Docs ingested Per Dollar ($)
Lambda: 150K docs / $1 >> Spark: 750K docs / $1
$$$$$$$ >> $
Spark: Fast and general processing engine
compatible with S3,Hadoop. Designed to
process both batch processing and streaming.
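The per-dollar figures on this slide can be turned into a rough cost comparison. The sketch below is only arithmetic on the two slide numbers; the 2 billion document load is an assumption chosen for illustration (roughly the size of the News corpus mentioned in the talk), and the function name is hypothetical.

```python
# Figures from the slide: estimated average documents ingested per dollar.
LAMBDA_DOCS_PER_DOLLAR = 150_000
SPARK_DOCS_PER_DOLLAR = 750_000

# Assumption for illustration: a bulk load of 2 billion documents.
CORPUS_SIZE = 2_000_000_000


def ingestion_cost(doc_count: int, docs_per_dollar: int) -> float:
    """Estimated cost in dollars to ingest doc_count documents."""
    return doc_count / docs_per_dollar


lambda_cost = ingestion_cost(CORPUS_SIZE, LAMBDA_DOCS_PER_DOLLAR)
spark_cost = ingestion_cost(CORPUS_SIZE, SPARK_DOCS_PER_DOLLAR)
improvement = SPARK_DOCS_PER_DOLLAR / LAMBDA_DOCS_PER_DOLLAR

print(f"Lambda: ${lambda_cost:,.0f}")
print(f"Spark:  ${spark_cost:,.0f}")
print(f"Spark ingests {improvement:.0f}x more documents per dollar")
```

Under these assumptions the same load costs about $13,333 on Lambda and about $2,667 on Spark, a fivefold difference.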
17. Consumers and Partners
Search
Platform
Partner Teams
RESULTS
GENERATION
QUERY
ENGINE
CONTENT
INGESTION
DELIVERY
INFRASTRUCTURE
NARS V3 (JSON)
gNS RADIX
Data Lake
Platform
Engineering
HRT
STF
Consumers
18. SRP CI/CD
Pipeline
Search RE Platform
API
Content
Pipeline
Platform Setup
SOLR Cloud
API Pipeline
Content
Ingestion
Pipeline
E2E Functional Test
Open STFxPerformance
19. INNOVATION WITH
SEARCH
New Relevance Algorithms
Machine Learning & AI
Data and Analytics
20. Measuring Relevance for News
Search Test Framework (GUI and API)
Jaccard Index + doc count
RBO + DCG
Ranking
Recall
Precision
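Three of the measures named on this slide have compact textbook definitions. The sketch below shows them in plain Python; it is not the Search Test Framework's implementation, and the document IDs and relevance judgments are made up for illustration.

```python
import math


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty result sets are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against known relevant IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


# Hypothetical result lists for one query on the old and new engines.
legacy_results = ["d1", "d2", "d3", "d4"]
solr_results = ["d1", "d3", "d5", "d4"]
relevant_docs = {"d1", "d2", "d3"}  # made-up judgments

overlap = jaccard(legacy_results, solr_results)
precision, recall = precision_recall(solr_results, relevant_docs)
print(overlap, precision, recall)
```

Jaccard ignores order, so it pairs naturally with a document count check; DCG and RBO are the rank-sensitive measures.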
22. BE PART OF SOMETHING BIGGER
AROUND THE GLOBE, LEXISNEXIS® EMPLOYEES ARE CONNECTED BY THE DESIRE TO
SHAPE A BETTER WORLD WHERE THE RULE OF LAW INCREASES PEACE, PROSPERITY
AND JUSTICE. LEXISNEXIS TAKES COMMUNITY ACTION TO MAKE THE LAW MORE
ACCESSIBLE. AND LEXISNEXIS SOLUTIONS GIVE PROFESSIONALS MORE INSIGHT. WE
HELP DOCTORS SAVE LIVES, INVESTIGATORS FIND LOST CHILDREN, LAWYERS FIND
JUSTICE FOR THEIR CLIENTS AND STUDENTS GAIN DEEPER KNOWLEDGE. A CAREER
WITH LEXISNEXIS CAN HELP YOU MAKE A DIFFERENCE, IN THE COMMUNITY AND AT
WORK.
https://www.lexisnexis.com/en-us/about-us/careers.page
I hope everyone is having a good time. A special thanks to Lucidworks for bringing us all together for Activate 2019.
Just in case you haven’t found all the ways to stay connected, here is a quick reminder.
If you’re interested, here is a little about my background. In general, I have been working in technology, architecture, and product for over 25 years and still love every minute of it.
Today I will be explaining a little bit about LexisNexis and then jumping into the different aspects of search that we have to deal with.
The Rule of Law
Our people and core business operations are helping to advance the rule of law around the world. We accomplish this by operating in over 130 countries around the globe.
Search is at the core of almost every product offering. The content that underpins those searches comes from 60,000 trusted legal, news, business, and public records sources and equates to approximately 81 billion documents. We started our migration journey with our News sources, which consist of approximately 2 billion articles and are growing rapidly every year.
As we migrate the underlying engine, we have an opportunity to unwind years of development and determine what features are needed to aid our customers in the future. Observability and logging are fundamentals at the core of every development activity.
Count on deep legal expertise
Shepard’s® editors and respected, practicing attorneys enrich Lexis Advance content with analysis and in-depth commentary to further enhance your understanding. Providing rapid access to this data is part of the key to providing relevant and on-point search results. By migrating to Solr, we expect to be able to leverage and expand upon our machine learning and artificial intelligence capabilities as they continue to emerge.
Such a simple statement, “Conduct research your way”, means extreme complexity under the covers, as many in the room will attest.
The team has successfully migrated the Boolean syntax and is now tackling the complex queries that our customers are using. To provide early problem detection, these queries are tested with every build.
Enter keywords, use a Boolean approach or choose advanced, form-based searching. Flexible, cutting-edge technology powers your online legal research.
At the end of the day, our customers are expecting, actually demanding, more from search. Their search experiences continue to expand in areas outside of the legal space, such as online sales, simple browser searches (Google), and even social media. But their expectation is that we continue to provide results that are relevant and on point for the work they are doing.
Even though the law may move slowly, the legal industry is expecting faster performance and improved relevance, and is demanding that we integrate their usage data to help them navigate the complex mazes that make up the legal systems around the world.
Scaling search is more than just the ability to have a fast search engine. It requires the right placement of functionality in the architectural stack. To meet our overall objectives, we focused on using data to determine what our customers' behaviors were, looking at the types of search, the complexity of those searches, and the performance metrics associated with their activities. One focal area is around the features and functionality that the LexisNexis customer base uses. In many areas we found that we could get close but as
As part of our effort to experiment and move forward from our learnings, we established focal areas such as Infrastructure, Content Ingestion, and Core Search Functionality.
To meet the needs of our customers and the competitive nature of content, we set forth to provide a better search experience for our customers.
We established a vision and high-level goals that we were trying to achieve. To address these goals, we established a set of overarching approaches to help guide us along the way.
Everyone who has participated in a migration effort knows of the Parity Challenge. LexisNexis is no different. What we have done is use data to drive decisions. By analyzing production data, we were able to determine the queries used by our customers and focus development appropriately.
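One way to picture this kind of analysis is a simple count of which query features appear in a log of production searches. The sketch below is illustrative only: the feature names, the regular expressions, and the sample queries are assumptions, not the actual syntax inventory or data used by the team.

```python
import re
from collections import Counter

# Illustrative patterns only. AND/OR are matched in uppercase so that
# ordinary words in keyword queries are not counted in this toy example.
FEATURE_PATTERNS = {
    "boolean_and": re.compile(r"\bAND\b"),
    "boolean_or": re.compile(r"\bOR\b"),
    "proximity": re.compile(r"\bw/\d+\b", re.IGNORECASE),
    "phrase": re.compile(r'"[^"]+"'),
    "wildcard": re.compile(r"\*"),
}


def count_features(queries):
    """Count how many queries use each feature; a query can use several."""
    counts = Counter()
    for query in queries:
        matched = False
        for name, pattern in FEATURE_PATTERNS.items():
            if pattern.search(query):
                counts[name] += 1
                matched = True
        if not matched:
            counts["plain_keywords"] += 1
    return counts


# Made-up sample of logged queries.
sample_log = [
    'negligence AND "duty of care"',
    "contract w/5 breach",
    "patent OR trademark",
    "liab*",
    "merger announcement",
]

usage = count_features(sample_log)
for feature, count in usage.most_common():
    print(feature, count)
```

Ranking features by how often customers actually use them gives a data-driven order in which to tackle parity work.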
The sheer volume of data that is needed to support our customers is challenging, and being able to index that data in a timely fashion has always been difficult. As part of our efforts, we experimented with a variety of technical options to determine their limitations and our implementation challenges. In the end, we found that for bulk load Spark would scale to the levels that we targeted, which is approximately 2 billion documents a day.
The team continues to evolve the platform and develop new features for the consumers of the platform. Throughout the journey the team is focused not only on functional testing but performance, relevance, cost, and consumer experience. Because the re-platforming team cannot be experts in all of these areas we have integrated with partner teams to leverage their tools and expertise.
We are partnering with the platform consumers to ensure that they are able to meet the external customers' needs. The team has partnered with the Search Test Framework team to integrate STF as part of the CI/CD pathway.
One of the major reasons for migrating to Solr is to give us the ability to create new and innovative ways to help our customers. Along the way, we are also looking at our own tools and practices to ensure that we have the best chance of meeting the innovation challenges of the future.
One such area is the use of the Search Test Framework that LexisNexis developed to help evaluate algorithms for improving relevancy. The replatforming team has pushed the framework beyond its original intent and is now working with the STF team to integrate it into the development process. This provides real-time feedback and detects functionality changes that inadvertently affect relevance.
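A relevance check in a build pipeline could look something like the sketch below, which compares a baseline ranking with a candidate ranking per query and flags queries whose agreement drops below a threshold. This is not the STF implementation. The score is a truncated rank-biased overlap (RBO), normalised here so that identical lists score 1.0; that normalisation, the persistence value p = 0.9, the 0.8 threshold, and all function names and data are assumptions for illustration.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9, depth=None):
    """Truncated RBO over the top `depth` ranks, scaled to the range [0, 1].

    Assumes neither ranking contains duplicate IDs. This is the simple
    truncated sum, not the extrapolated form of RBO.
    """
    limit = min(len(ranking_a), len(ranking_b))
    depth = limit if depth is None else min(depth, limit)
    if depth == 0:
        # Two empty rankings agree; one empty ranking agrees with nothing.
        return 1.0 if not ranking_a and not ranking_b else 0.0

    seen_a, seen_b = set(), set()
    weighted_agreement = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two top-d sets
        weighted_agreement += (p ** (d - 1)) * agreement

    raw = (1 - p) * weighted_agreement
    best_possible = 1 - p ** depth  # raw score of two identical lists
    return raw / best_possible


def relevance_gate(baseline, candidate, threshold=0.8):
    """Return the query IDs whose candidate ranking drifts below threshold."""
    failures = []
    for query_id, expected in baseline.items():
        actual = candidate.get(query_id, [])
        if rbo_truncated(expected, actual) < threshold:
            failures.append(query_id)
    return failures


# Made-up rankings: q1 is unchanged, q2 returns entirely different documents.
baseline_runs = {"q1": ["a", "b", "c"], "q2": ["a", "b", "c"]}
candidate_runs = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}

failed_queries = relevance_gate(baseline_runs, candidate_runs)
print(failed_queries)
```

A build step could fail when the returned list is non-empty, which surfaces relevance regressions at the same point as functional test failures.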
Another innovative way that the team is approaching development and the use of data is through the managed stack. This allows a developer to stand up a production-like instance, load the data and test sets required for their algorithm development, and immediately see how their changes compare to the production system.
Thanks to all of our service and tool providers. Without you, we would not be able to accomplish everything we set out to do.