P4 Search is a tool built internally and open-sourced in the Perforce Workshop. It creates and uses an external search index to allow users to search the content of a Perforce Server. This talk will explain the inner workings of P4 Search, its setup and applications, and explore ideas on how to extend this great and essential tool.
6. #
• p4 grep: built-in command, since Perforce 2010.1
• Search files stored in P4D based on content
– Case sensitive and insensitive searches
– Can use regular expressions
– Can search through all revisions
– Can show context lines around each match
• Returns depot paths
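A minimal sketch of how these options map onto a command line, assuming the built-in command on this slide is `p4 grep` and using its `-i` (case-insensitive), `-a` (all revisions), `-C` (context lines) and `-e` (pattern) flags. The helper name `build_p4_grep` and the depot path are hypothetical; the function only builds the argument list and does not run anything.

```python
from typing import List, Optional


def build_p4_grep(pattern: str, depot_path: str, *,
                  ignore_case: bool = False,
                  all_revisions: bool = False,
                  context_lines: Optional[int] = None) -> List[str]:
    """Return an argv list for `p4 grep`; nothing is executed here."""
    argv = ["p4", "grep"]
    if ignore_case:
        argv.append("-i")                   # case-insensitive match
    if all_revisions:
        argv.append("-a")                   # search all revisions, not only head
    if context_lines is not None:
        argv += ["-C", str(context_lines)]  # lines of context around a match
    argv += ["-e", pattern, depot_path]     # pattern may be a regular expression
    return argv


# Hypothetical pattern and depot path, for illustration only.
cmd = build_p4_grep("error.*timeout", "//depot/main/...",
                    ignore_case=True, context_lines=2)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run` on a machine with a configured `p4` client.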
7. #
• A few drawbacks:
– Text search only, lines limited to 4K characters
– No search for Metadata such as attributes
• Performance concerns:
– Limited to 10,000 revisions by default
– Memory and CPU consumption
– But: lockless with peeking since 2013.3
8. #
(Diagram: p4 files/p4 fstat, index, store, search)
• Search engine indexes content
• Stores it in its own database
• Users search the index first
• Index returns a depot path
• Index and Perforce Server can live on separate hosts
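The index/store/search flow on this slide can be illustrated with a toy inverted index. This is only a sketch of the idea, not how Lucene or P4 Search store data; the class name `ToyIndex` and the sample depot paths are made up for the example.

```python
import re
from collections import defaultdict
from typing import Dict, Set


class ToyIndex:
    """Maps each term to the set of depot paths whose content contains it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def index(self, depot_path: str, content: str) -> None:
        # "Index" and "store": split content into terms, remember the path per term.
        for term in re.findall(r"\w+", content.lower()):
            self._postings[term].add(depot_path)

    def search(self, query: str) -> Set[str]:
        # "Search": return depot paths that contain every term of the query.
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


idx = ToyIndex()
idx.index("//depot/main/README.txt", "Install the search server")
idx.index("//depot/main/build.xml", "Build and install targets")
print(sorted(idx.search("install server")))
```

The search never touches the file content again; it only returns depot paths, which the user can then fetch from the Perforce Server.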
9. #
• Lucene
– Scalable, high performance indexing
– Search Algorithms
• Solr
– Stand-alone enterprise search server
– HTML Administration interface
– Extensible
• Tika
– Content analysis tool
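Because Solr is a stand-alone server, it is queried over HTTP. The sketch below builds a URL for Solr's standard `select` handler. The host, port 8983 (Solr's usual default), and the field name `content` are assumptions for illustration; the port and fields that P4 Search actually configures may differ.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Build a query URL for a Solr core's select handler (no request is sent)."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/solr/{core}/select?{params}"


# Hypothetical host and field name; "collection1" is the example core name.
url = solr_select_url("http://localhost:8983", "collection1", "content:install")
print(url)
```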
10. #
• P4Search
– Index queue (processing indexing requests)
– Search controller (security)
– RESTful API (integration into other tools)
– UI (simple searches)
• Runs in Jetty
24. #
• How can an external index respect the protection table?
• Solution:
– Use a programmable search engine
– Use Perforce protections to filter results
Users need read access to files to be able to search
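The filtering idea can be sketched as a post-processing step on the search hits. This is a heavily simplified model: real Perforce protections also involve users, groups, hosts, access levels and wildcards, and the slides do not show how P4 Search implements the check. The rule format and function names here are hypothetical.

```python
from typing import Iterable, List, Tuple

# (grant, depot path prefix): a simplified stand-in for one protections line.
Rule = Tuple[bool, str]


def can_read(path: str, rules: Iterable[Rule]) -> bool:
    """Return True if the last rule matching the path grants access."""
    allowed = False
    for grant, prefix in rules:
        if path.startswith(prefix):
            allowed = grant  # later matching rules override earlier ones
    return allowed


def filter_hits(hits: Iterable[str], rules: List[Rule]) -> List[str]:
    """Keep only the search hits the user is allowed to read."""
    return [path for path in hits if can_read(path, rules)]


rules = [(True, "//depot/"), (False, "//depot/secret/")]
hits = ["//depot/main/README.txt", "//depot/secret/keys.txt", "//other/file.txt"]
print(filter_hits(hits, rules))
```

Only the first hit survives: the second is excluded by the later rule, and the third matches no rule at all.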
27. #
• Download from the Workshop
• Follow the provided instructions to install
• Run two services
– p4search-solr
– p4search-jetty
28. #
• On first run: index your entire depot
– You probably don’t want to do this
• On submit: index new file revisions
– change-commit trigger on depot location
• At any time: index any given change
– curl -X POST --data "commit,<change#>" http://p4search:8080/api/queue/{token}
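The queue request above could also be issued from a script, for example from a change-commit trigger. A minimal sketch, assuming the request body is literally `commit,` followed by the change number as shown on the slide; the function name, the token value and the change number are placeholders. The request is only built here, not sent.

```python
import urllib.request


def build_queue_request(base_url: str, token: str, change: int) -> urllib.request.Request:
    """Build the POST that asks P4 Search to index one submitted change."""
    url = f"{base_url}/api/queue/{token}"
    body = f"commit,{change}".encode("ascii")  # body format taken from the slide
    return urllib.request.Request(url, data=body, method="POST")


# "TOKEN" and 12345 are placeholders, not real values.
req = build_queue_request("http://p4search:8080", "TOKEN", 12345)
print(req.get_method(), req.full_url, req.data)
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
```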
29. #
• Indexing
– Via a trigger in P4D, so ultimately from any given client and user
• Searching
– P4Search UI
– Piper
– Commons
– Custom through P4Search API
30. #
• Deep dive after learning Lucene/Solr
• Starting point: p4search/solr/example/solr/collection1/conf
– schema.xml
– solrconfig.xml
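A first step when exploring `schema.xml` is to list the fields it defines. The sketch below parses an inline sample in the classic Solr schema layout; the sample fields are invented for illustration and are not the fields P4 Search ships with.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Invented sample in the classic Solr schema.xml layout, not the P4 Search schema.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="content" type="text_general" indexed="true" stored="false"/>
  </fields>
</schema>
"""


def list_fields(schema_xml: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Return (name, type) for every <field> element in a schema document."""
    root = ET.fromstring(schema_xml)
    return [(f.get("name"), f.get("type")) for f in root.iter("field")]


for name, field_type in list_fields(SAMPLE_SCHEMA):
    print(name, field_type)
```

To inspect a real installation, the same function could be given the text of the `schema.xml` file in the `conf` directory.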