October 11-14, 2016 • Boston, MA
Solr JDBC
Kevin Risden
Apache Lucene/Solr Committer; Avalon Consulting, LLC
About me
• Consultant with Avalon Consulting, LLC
• ~4 years working with Hadoop and Search
• Contributed patches to Ambari, HBase, Knox, Solr, Storm
• Installation, security, performance tuning, development, administration
• Kevin Risden
• Apache Lucene/Solr Committer
• YCSB Contributor
Overview
• Background
• Use Case
• Solr JDBC
• Demo
• Future Development/Improvements
Background - What is JDBC?
The JDBC API is a Java API that can access any kind of tabular data, especially
data stored in a Relational Database.
Source: https://docs.oracle.com/javase/tutorial/jdbc/overview/
JDBC drivers convert SQL into a backend query.
Background - Why should you care about Solr JDBC?
• SQL skills are widespread.
• JDBC drivers exist for most relational databases.
• Existing reporting tools work with JDBC/ODBC drivers.
Solr 6 works with SQL and existing JDBC tools!
Use Case – Analytics – Utility Rates
Data set: 2011 Utility Rates
Questions:
• How many utility companies serve the state of Maryland?
• Which Maryland utility has the cheapest residential rates?
• What are the minimum and maximum residential power rates excluding missing data elements?
• What is the state and zip code with the highest residential rate?
How could you answer those questions with Solr?
• Facets
• Filter Queries
• Filters
• Grouping
• Sorting
• Stats
• String queries together
Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
Use Case – Analytics – Utility Rates
Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
Method: Lucene syntax
Questions:
• How many utility companies serve the state of Maryland?
http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=10&group.limit=1
• Which Maryland utility has the cheapest residential rates?
http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=1&group.limit=1&sort=res_rate+asc
• What are the minimum and maximum residential power rates excluding missing data elements?
http://solr:8983/solr/rates/select?q=*:*&fq=%7b!frange+l%3D0.0+incl%3Dfalse%7dres_rate&wt=json&indent=true&rows=0&stats=true&stats.field=res_rate
• What is the state and zip code with the highest residential rate?
http://solr:8983/solr/rates/select?q=res_rate:0.849872773537&wt=json&indent=true&rows=1
Is there a better way?
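The request URLs on this slide are easier to read and maintain when built from a parameter dictionary rather than typed by hand. The sketch below is only an illustration: the `select_url` helper and the variable names are hypothetical, and the host, collection, and field names are copied from the slide. Python's `urlencode` escapes a few more characters than the slide does (for example `!` and `*`), which Solr decodes to the same parameter values.

```python
from urllib.parse import urlencode

# Base endpoint copied from the slide.
BASE = "http://solr:8983/solr/rates/select"


def select_url(params):
    """Hypothetical helper: build a /select URL from a dict of Solr parameters."""
    merged = {**params, "wt": "json", "indent": "true"}
    return BASE + "?" + urlencode(merged)


# How many utility companies serve the state of Maryland?
md_utilities = select_url({
    "q": 'state:"MD"',
    "group": "true",
    "group.field": "utility_name",
    "rows": 10,
    "group.limit": 1,
})

# Minimum and maximum residential rates, excluding missing (zero) values.
min_max = select_url({
    "q": "*:*",
    "fq": "{!frange l=0.0 incl=false}res_rate",
    "rows": 0,
    "stats": "true",
    "stats.field": "res_rate",
})

if __name__ == "__main__":
    print(md_utilities)
    print(min_max)
```

Even with a helper like this, each question still needs its own combination of grouping, filtering, and stats parameters.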
Solr JDBC
Highlights
• JDBC Driver for Solr
• Powered by Streaming Expressions and Parallel SQL
• Thursday - Parallel SQL and Analytics with Solr – Yonik Seeley
• Thursday - Creating New Streaming Expressions – Dennis Gove
• Integrates with any* JDBC client
* Tested with the JDBC clients in this presentation
Usage
jdbc:solr://SOLR_ZK_CONNECTION_STRING?collection=COLLECTION_NAME
Apache Solr Reference Guide - Parallel SQL Interface
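As a small illustration of the connection string format above, the following sketch assembles the URL from its two parts. The function name `solr_jdbc_url` and the `extra` keyword arguments are hypothetical; only the `jdbc:solr://...?collection=...` shape comes from the slide.

```python
from urllib.parse import urlencode


def solr_jdbc_url(zk_connection_string, collection, **extra):
    """Hypothetical helper: build a Solr JDBC URL.

    zk_connection_string is the ZooKeeper connection string (host:port, optionally
    several hosts and a chroot); collection is the Solr collection to query.
    Any extra keyword arguments are appended as additional URL parameters.
    """
    params = {"collection": collection, **extra}
    return "jdbc:solr://" + zk_connection_string + "?" + urlencode(params)


if __name__ == "__main__":
    # Same URL that the demo slides use.
    print(solr_jdbc_url("solr:9983", "test"))
```

Note that the host and port in the URL point at ZooKeeper, not at a Solr node.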
Solr JDBC - Architecture
Demo
Programming Languages
• Java
• Python/Jython
• R
• Apache Spark
Web
• Apache Zeppelin
• RStudio
GUI – JDBC
• DbVisualizer
• SQuirreL SQL
GUI – ODBC
• Microsoft Excel
• Tableau*
https://github.com/risdenk/solrj-jdbc-testing
Demo – Java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.*;

public class SolrJJDBCTestingJava {
  private static final Logger LOGGER = LoggerFactory.getLogger(SolrJJDBCTestingJava.class);

  public static void main(String[] args) throws Exception {
    // The SQL statement to run is passed as the first command-line argument.
    String sql = args[0];
    // try-with-resources closes the result set, statement, and connection in reverse order.
    try (Connection con = DriverManager.getConnection("jdbc:solr://solr:9983?collection=test")) {
      try (Statement stmt = con.createStatement()) {
        try (ResultSet rs = stmt.executeQuery(sql)) {
          ResultSetMetaData rsMetaData = rs.getMetaData();
          int columns = rsMetaData.getColumnCount();

          // Log the column labels as a comma-separated header (JDBC columns are 1-based).
          StringBuilder header = new StringBuilder();
          for (int i = 1; i < columns + 1; i++) {
            header.append(rsMetaData.getColumnLabel(i)).append(",");
          }
          LOGGER.info(header.toString());

          // Log each row as comma-separated values.
          while (rs.next()) {
            StringBuilder row = new StringBuilder();
            for (int i = 1; i < columns + 1; i++) {
              row.append(rs.getObject(i)).append(",");
            }
            LOGGER.info(row.toString());
          }
        }
      }
    }
  }
}
Apache Solr Reference Guide - Generic
Demo – Python
#!/usr/bin/env python
# https://pypi.python.org/pypi/JayDeBeApi/
import jaydebeapi

if __name__ == '__main__':
    jdbc_url = "jdbc:solr://solr:9983?collection=test"
    driverName = "org.apache.solr.client.solrj.io.sql.DriverImpl"
    statement = "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10"

    conn = jaydebeapi.connect(driverName, jdbc_url)
    curs = conn.cursor()
    curs.execute(statement)
    print(curs.fetchall())
    conn.close()
Apache Solr Reference Guide - Python/Jython
Demo – Jython
#!/usr/bin/env jython
# http://www.jython.org/jythonbook/en/1.0/DatabasesAndJython.html
# https://wiki.python.org/jython/DatabaseExamples#SQLite_using_JDBC
import sys
from java.lang import Class
from java.sql import DriverManager, SQLException

if __name__ == '__main__':
    jdbc_url = "jdbc:solr://solr:9983?collection=test"
    driverName = "org.apache.solr.client.solrj.io.sql.DriverImpl"
    statement = "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10"

    dbConn = DriverManager.getConnection(jdbc_url)
    stmt = dbConn.createStatement()
    resultSet = stmt.executeQuery(statement)
    while resultSet.next():
        print(resultSet.getString("fielda"))
    resultSet.close()
    stmt.close()
    dbConn.close()
Apache Solr Reference Guide - Python/Jython
Demo – R
# https://www.rforge.net/RJDBC/
library("RJDBC")
solrCP <- c(list.files('/opt/solr/dist/solrj-lib', full.names=TRUE),
            list.files('/opt/solr/dist', pattern='solrj', full.names=TRUE, recursive = TRUE))
drv <- JDBC("org.apache.solr.client.solrj.io.sql.DriverImpl",
            solrCP,
            identifier.quote="`")
conn <- dbConnect(drv, "jdbc:solr://solr:9983?collection=test", "user", "pwd")
dbGetQuery(conn, "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10")
dbDisconnect(conn)
Apache Solr Reference Guide - R
Demo – Apache Zeppelin
Apache Solr Reference Guide - Apache Zeppelin
Demo – RStudio
Demo – DbVisualizer
Apache Solr Reference Guide - DbVisualizer
Demo – SQuirreL SQL
Apache Solr Reference Guide - SQuirreL SQL
Demo – Microsoft Excel
Use Case – Analytics – Utility Rates
Method: SQL
Questions:
• How many utility companies serve the state of Maryland?
select distinct utility_name from rates where state='MD';
• Which Maryland utility has the cheapest residential rates?
select utility_name,min(res_rate) from rates where state='MD' group by utility_name order by min(res_rate) asc limit 1;
• What are the minimum and maximum residential power rates excluding missing data elements?
select min(res_rate),max(res_rate) from rates where not res_rate = 0;
• What is the state and zip code with the highest residential rate?
select state,zip,max(res_rate) from rates group by state,zip order by max(res_rate) desc limit 1;
How should you answer those questions with Solr? – Using SQL!
Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
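To try the four SQL statements without a Solr cluster, the sketch below runs them unchanged against an in-memory SQLite table. This is only a stand-in under stated assumptions: SQLite is not Solr, the sample rows and the table schema are invented for illustration, and a rate of 0 is assumed to mark a missing value, as the third query implies.

```python
import sqlite3

# Hypothetical sample rows (zip, state, utility_name, res_rate); not the real data set.
ROWS = [
    ("21201", "MD", "Utility A", 0.12),
    ("21401", "MD", "Utility B", 0.10),
    ("21201", "MD", "Utility B", 0.11),
    ("02108", "MA", "Utility C", 0.15),
    ("99501", "AK", "Utility D", 0.0),  # 0 stands for a missing rate
]

# The SQL text is taken from the slide.
QUERIES = {
    "md_utilities": "select distinct utility_name from rates where state='MD'",
    "cheapest_md": "select utility_name,min(res_rate) from rates where state='MD' "
                   "group by utility_name order by min(res_rate) asc limit 1",
    "min_max": "select min(res_rate),max(res_rate) from rates where not res_rate = 0",
    "highest": "select state,zip,max(res_rate) from rates "
               "group by state,zip order by max(res_rate) desc limit 1",
}


def run_all():
    """Load the sample rows into SQLite and run every query, returning all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table rates (zip text, state text, utility_name text, res_rate real)"
    )
    conn.executemany("insert into rates values (?, ?, ?, ?)", ROWS)
    results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in run_all().items():
        print(name, rows)
```

The first query returns the distinct utility names, so the answer to "how many" is the number of rows returned.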
Future Development/Improvements
• Replace Presto with Apache Calcite - SOLR-8593
• Improve SQL compatibility
• Ability to specify optimization rules (push downs, joins, etc)
• Potentially use Avatica JDBC/ODBC drivers
• Streaming Expressions/Parallel SQL improvements - SOLR-8125
• JDBC driver improvements - SOLR-8659
Info on how to get involved
Future Development/Improvements
SQL Join
SELECT
  movie_title,character_name,line
FROM
  movie_dialogs_movie_titles_metadata a
JOIN
  movie_dialogs_movie_lines b
ON
  a.movieID=b.movieID;
Streaming Expression Join
select(
  innerJoin(
    search(movie_dialogs_movie_titles_metadata,
      q=*:*,
      fl="movieID,movie_title",
      sort="movieID asc"),
    search(movie_dialogs_movie_lines,
      q=*:*,
      fl="movieID,character_name,line",
      sort="movieID asc"),
    on="movieID"
  ),
  movie_title,character_name,line
)
Info on how to get involved
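Both searches in the streaming expression sort by movieID, the join key. A merge-style join depends on that ordering, because it can then walk both streams once in step. The sketch below illustrates that idea on plain Python lists; it is not Solr's implementation, and the function names and sample tuples are invented for the example.

```python
def inner_join(left, right, key):
    """Merge-join two lists of dicts that are both sorted ascending by `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-hand tuples sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left tuple with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**left[i], **r})
                i += 1
            j = j_end
    return out


def select(rows, fields):
    """Keep only the named fields, like the outer select(...) in the expression."""
    return [{f: row[f] for f in fields} for row in rows]


# Hypothetical tuples, already sorted by movieID as the searches request.
titles = [
    {"movieID": "m0", "movie_title": "Title A"},
    {"movieID": "m1", "movie_title": "Title B"},
]
lines = [
    {"movieID": "m0", "character_name": "X", "line": "first"},
    {"movieID": "m0", "character_name": "Y", "line": "second"},
    {"movieID": "m2", "character_name": "Z", "line": "unmatched"},
]

joined = select(inner_join(titles, lines, "movieID"),
                ["movie_title", "character_name", "line"])

if __name__ == "__main__":
    for row in joined:
        print(row)
```

If either input were unsorted, the single pass would miss matches, which is why the sort parameter appears in both search calls.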
Questions?

 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Solr JDBC - Lucene/Solr Revolution 2016

  • 1. October 11-14, 2016 • Boston, MA
  • 2. Solr JDBC Kevin Risden Apache Lucene/Solr Committer; Avalon Consulting, LLC
  • 3. About me • Consultant with Avalon Consulting, LLC • ~4 years working with Hadoop and Search • Contributed patches to Ambari, HBase, Knox, Solr, Storm • Installation, security, performance tuning, development, administration • Kevin Risden • Apache Lucene/Solr Committer • YCSB Contributor
  • 4. Overview • Background • Use Case • Solr JDBC • Demo • Future Development/Improvements
  • 5. Background - What is JDBC? "The JDBC API is a Java API that can access any kind of tabular data, especially data stored in a Relational Database." Source: https://docs.oracle.com/javase/tutorial/jdbc/overview/ JDBC drivers convert SQL into a backend query.
  • 6. Background - Why should you care about Solr JDBC? • SQL skills are widespread. • JDBC drivers exist for most relational databases. • Existing reporting tools work with JDBC/ODBC drivers. Solr 6 works with SQL and existing JDBC tools!
  • 7. Use Case – Analytics – Utility Rates
    Data set: 2011 Utility Rates
    Questions:
    • How many utility companies serve the state of Maryland?
    • Which Maryland utility has the cheapest residential rates?
    • What are the minimum and maximum residential power rates excluding missing data elements?
    • What is the state and zip code with the highest residential rate?
    How could you answer those questions with Solr? • Facets • Filter Queries • Filters • Grouping • Sorting • Stats • String queries together
    Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
  • 8. Use Case – Analytics – Utility Rates
    Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
    Method: Lucene syntax
    Questions:
    • How many utility companies serve the state of Maryland?
      http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=10&group.limit=1
    • Which Maryland utility has the cheapest residential rates?
      http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=1&group.limit=1&sort=res_rate+asc
    • What are the minimum and maximum residential power rates excluding missing data elements?
      http://solr:8983/solr/rates/select?q=*:*&fq=%7b!frange+l%3D0.0+incl%3Dfalse%7dres_rate&wt=json&indent=true&rows=0&stats=true&stats.field=res_rate
    • What is the state and zip code with the highest residential rate?
      http://solr:8983/solr/rates/select?q=res_rate:0.849872773537&wt=json&indent=true&rows=1
    Is there a better way?
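The percent-encoded query strings above can be generated rather than typed by hand. The following is a minimal Python sketch using only the standard library. The host `solr:8983` and the collection `rates` are copied from the slide's URLs, and `select_url` is a hypothetical helper name, not part of any Solr client.

```python
from urllib.parse import urlencode

# Host and collection as they appear in the slide's example URLs.
SOLR_SELECT = "http://solr:8983/solr/rates/select"


def select_url(params):
    """Return a /select URL with every parameter percent-encoded."""
    return SOLR_SELECT + "?" + urlencode(params)


# "How many utility companies serve the state of Maryland?"
maryland_utilities = select_url({
    "q": 'state:"MD"',
    "wt": "json",
    "indent": "true",
    "group": "true",
    "group.field": "utility_name",
    "rows": 10,
    "group.limit": 1,
})

# "What are the minimum and maximum residential power rates
# excluding missing data elements?"
rate_stats = select_url({
    "q": "*:*",
    "fq": "{!frange l=0.0 incl=false}res_rate",
    "wt": "json",
    "indent": "true",
    "rows": 0,
    "stats": "true",
    "stats.field": "res_rate",
})

print(maryland_utilities)
print(rate_stats)
```

`urlencode` escapes a few more characters than the hand-written URLs do (for example `*` and `!`), so the second URL differs in spelling from the slide's version while decoding to the same parameters.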
  • 9. Solr JDBC Highlights
    • JDBC Driver for Solr
    • Powered by Streaming Expressions and Parallel SQL
      • Thursday - Parallel SQL and Analytics with Solr – Yonik Seeley
      • Thursday - Creating New Streaming Expressions – Dennis Gove
    • Integrates with any* JDBC client (* tested with the JDBC clients in this presentation)
    Usage: jdbc:solr://SOLR_ZK_CONNECTION_STRING?collection=COLLECTION_NAME
    Apache Solr Reference Guide - Parallel SQL Interface
  • 10. Solr JDBC - Architecture
  • 11. Demo
    Programming Languages: • Java • Python/Jython • R • Apache Spark
    Web: • Apache Zeppelin • RStudio
    GUI – JDBC: • DbVisualizer • SQuirreL SQL
    GUI – ODBC: • Microsoft Excel • Tableau*
    https://github.com/risdenk/solrj-jdbc-testing
  • 12. Demo – Java
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;
      import java.sql.*;
      public class SolrJJDBCTestingJava {
        private static final Logger LOGGER = LoggerFactory.getLogger(SolrJJDBCTestingJava.class);
        public static void main(String[] args) throws Exception {
          // The SQL statement to run is passed as the first command-line argument.
          String sql = args[0];
          // try-with-resources closes the result set, statement, and connection automatically.
          try (Connection con = DriverManager.getConnection("jdbc:solr://solr:9983?collection=test")) {
            try (Statement stmt = con.createStatement()) {
              try (ResultSet rs = stmt.executeQuery(sql)) {
                ResultSetMetaData rsMetaData = rs.getMetaData();
                int columns = rsMetaData.getColumnCount();
                // Log the column labels as a header row (JDBC columns are 1-indexed).
                StringBuilder header = new StringBuilder();
                for(int i = 1; i < columns + 1; i++) {
                  header.append(rsMetaData.getColumnLabel(i)).append(",");
                }
                LOGGER.info(header.toString());
                // Log each result row as comma-separated values.
                while (rs.next()) {
                  StringBuilder row = new StringBuilder();
                  for(int i = 1; i < columns + 1; i++) {
                    row.append(rs.getObject(i)).append(",");
                  }
                  LOGGER.info(row.toString());
                }
              }
            }
          }
        }
      }
    Apache Solr Reference Guide - Generic
  • 13. Demo – Python
      #!/usr/bin/env python
      # https://pypi.python.org/pypi/JayDeBeApi/
      import jaydebeapi
      import sys
      if __name__ == '__main__':
          jdbc_url = "jdbc:solr://solr:9983?collection=test"
          driverName = "org.apache.solr.client.solrj.io.sql.DriverImpl"
          statement = "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10"
          # Connect through the Solr JDBC driver, run the query, and print all rows.
          conn = jaydebeapi.connect(driverName, jdbc_url)
          curs = conn.cursor()
          curs.execute(statement)
          print(curs.fetchall())
          conn.close()
    Apache Solr Reference Guide - Python/Jython
  • 14. Demo – Jython
      #!/usr/bin/env jython
      # http://www.jython.org/jythonbook/en/1.0/DatabasesAndJython.html
      # https://wiki.python.org/jython/DatabaseExamples#SQLite_using_JDBC
      import sys
      from java.lang import Class
      from java.sql import DriverManager, SQLException
      if __name__ == '__main__':
          jdbc_url = "jdbc:solr://solr:9983?collection=test"
          driverName = "org.apache.solr.client.solrj.io.sql.DriverImpl"
          statement = "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10"
          dbConn = DriverManager.getConnection(jdbc_url)
          stmt = dbConn.createStatement()
          resultSet = stmt.executeQuery(statement)
          # Print the first selected column of every row, then release the JDBC resources.
          while resultSet.next():
              print(resultSet.getString("fielda"))
          resultSet.close()
          stmt.close()
          dbConn.close()
    Apache Solr Reference Guide - Python/Jython
  • 15. Demo – R
      # https://www.rforge.net/RJDBC/
      library("RJDBC")
      solrCP <- c(list.files('/opt/solr/dist/solrj-lib', full.names=TRUE),
                  list.files('/opt/solr/dist', pattern='solrj', full.names=TRUE, recursive = TRUE))
      drv <- JDBC("org.apache.solr.client.solrj.io.sql.DriverImpl", solrCP, identifier.quote="`")
      conn <- dbConnect(drv, "jdbc:solr://solr:9983?collection=test", "user", "pwd")
      dbGetQuery(conn, "select fielda, fieldb, fieldc, fieldd_s, fielde_i from test limit 10")
      dbDisconnect(conn)
    Apache Solr Reference Guide - R
  • 16. Demo – Apache Zeppelin. Apache Solr Reference Guide - Apache Zeppelin
  • 18. Demo – DbVisualizer. Apache Solr Reference Guide - DbVisualizer
  • 19. Demo – SQuirreL SQL. Apache Solr Reference Guide - SQuirreL SQL
  • 21. Use Case – Analytics – Utility Rates
    Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
    Method: Lucene syntax
    Questions:
    • How many utility companies serve the state of Maryland?
      http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=10&group.limit=1
    • Which Maryland utility has the cheapest residential rates?
      http://solr:8983/solr/rates/select?q=state%3A%22MD%22&wt=json&indent=true&group=true&group.field=utility_name&rows=1&group.limit=1&sort=res_rate+asc
    • What are the minimum and maximum residential power rates excluding missing data elements?
      http://solr:8983/solr/rates/select?q=*:*&fq=%7b!frange+l%3D0.0+incl%3Dfalse%7dres_rate&wt=json&indent=true&rows=0&stats=true&stats.field=res_rate
    • What is the state and zip code with the highest residential rate?
      http://solr:8983/solr/rates/select?q=res_rate:0.849872773537&wt=json&indent=true&rows=1
    Is there a better way?
  • 22. Use Case – Analytics – Utility Rates
    Method: SQL
    Questions:
    • How many utility companies serve the state of Maryland?
      select distinct utility_name from rates where state='MD';
    • Which Maryland utility has the cheapest residential rates?
      select utility_name,min(res_rate) from rates where state='MD' group by utility_name order by min(res_rate) asc limit 1;
    • What are the minimum and maximum residential power rates excluding missing data elements?
      select min(res_rate),max(res_rate) from rates where not res_rate = 0;
    • What is the state and zip code with the highest residential rate?
      select state,zip,max(res_rate) from rates group by state,zip order by max(res_rate) desc limit 1;
    How should you answer those questions with Solr? – Using SQL!
    Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
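To see what the four SQL statements return without a running Solr cluster, they can be tried against a small in-memory table. This is a sketch under loudly stated assumptions: SQLite stands in for Solr's SQL interface (the two do not support identical syntax), and the rows and utility names are invented for illustration, not taken from the 2011 utility-rates data set.

```python
import sqlite3

# Hypothetical sample rows: (state, zip, utility_name, res_rate).
# A res_rate of 0 plays the role of a missing data element.
rows = [
    ("MD", "21201", "Utility A", 0.12),
    ("MD", "21202", "Utility B", 0.09),
    ("MD", "21203", "Utility B", 0.10),
    ("VA", "22030", "Utility C", 0.0),
    ("AK", "99501", "Utility D", 0.85),
]

# SQLite is only a stand-in here; the deck runs these statements through Solr JDBC.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table rates (state text, zip text, utility_name text, res_rate real)"
)
conn.executemany("insert into rates values (?, ?, ?, ?)", rows)

# The four statements from the slide, unchanged apart from the trailing semicolon.
md_utilities = conn.execute(
    "select distinct utility_name from rates where state='MD'"
).fetchall()
cheapest_md = conn.execute(
    "select utility_name,min(res_rate) from rates where state='MD' "
    "group by utility_name order by min(res_rate) asc limit 1"
).fetchone()
rate_range = conn.execute(
    "select min(res_rate),max(res_rate) from rates where not res_rate = 0"
).fetchone()
highest = conn.execute(
    "select state,zip,max(res_rate) from rates "
    "group by state,zip order by max(res_rate) desc limit 1"
).fetchone()

print(len(md_utilities), cheapest_md, rate_range, highest)
conn.close()
```

The first statement returns one row per distinct utility, so the answer to "how many" is the number of rows returned.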
  • 23. Use Case – Analytics – Utility Rates. How should you answer those questions with Solr? – Using SQL! Inspired By: http://blog.cloudera.com/blog/2015/10/how-to-use-apache-solr-to-query-indexed-data-for-analytics/
  • 24. Future Development/Improvements
    • Replace Presto with Apache Calcite - SOLR-8593
      • Improve SQL compatibility
      • Ability to specify optimization rules (push downs, joins, etc.)
      • Potentially use Avatica JDBC/ODBC drivers
    • Streaming Expressions/Parallel SQL improvements - SOLR-8125
    • JDBC driver improvements - SOLR-8659
    Info on how to get involved
  • 25. Future Development/Improvements
    SQL Join:
      SELECT movie_title,character_name,line
      FROM movie_dialogs_movie_titles_metadata a
      JOIN movie_dialogs_movie_lines b
      ON a.movieID=b.movieID;
    Streaming Expression Join:
      select(
        innerJoin(
          search(movie_dialogs_movie_titles_metadata, q=*:*, fl="movieID,movie_title", sort="movieID asc"),
          search(movie_dialogs_movie_lines, q=*:*, fl="movieID,character_name,line", sort="movieID asc"),
          on="movieID"
        ),
        movie_title,character_name,line
      )
    Info on how to get involved
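Both `search(...)` calls in the streaming expression sort by `movieID asc`, which suggests that `innerJoin` pairs tuples by walking two streams that are already ordered on the join key. The Python sketch below illustrates that merge-join idea only. It is an assumption about the general technique, not Solr's implementation, and the function names and sample tuples are hypothetical.

```python
def inner_join(left, right, on):
    """Merge-join two iterables of dicts, both sorted ascending by `on`."""
    left, right = iter(left), iter(right)
    l, r = next(left, None), next(right, None)
    while l is not None and r is not None:
        if l[on] < r[on]:
            l = next(left, None)       # left key is behind: advance left
        elif l[on] > r[on]:
            r = next(right, None)      # right key is behind: advance right
        else:
            key = l[on]
            # Buffer every right tuple that shares this key.
            group = []
            while r is not None and r[on] == key:
                group.append(r)
                r = next(right, None)
            # Pair each left tuple with that key against the buffered group.
            while l is not None and l[on] == key:
                for match in group:
                    yield {**l, **match}
                l = next(left, None)


def select(tuples, *fields):
    """Keep only the named fields, like the outer select(...) in the expression."""
    for t in tuples:
        yield {f: t[f] for f in fields}


# Hypothetical tuples, already sorted by movieID as the sort parameters require.
titles = [
    {"movieID": "m0", "movie_title": "first film"},
    {"movieID": "m1", "movie_title": "second film"},
]
lines = [
    {"movieID": "m0", "character_name": "ANNA", "line": "Hello."},
    {"movieID": "m0", "character_name": "BEN", "line": "Hi."},
    {"movieID": "m2", "character_name": "CARL", "line": "Bye."},
]

result = list(
    select(inner_join(titles, lines, "movieID"),
           "movie_title", "character_name", "line")
)
print(result)
```

Because each input is consumed once and in order, only the tuples sharing the current key are held in memory, which is the reason for requiring sorted inputs in this kind of join.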