Getting Started with Solr

A beginner's guide to getting started with Solr.

What is Solr?

Solr (pronounced "solar") is a document-based NoSQL search platform written in Java. It is open source under the Apache License. Solr runs as a standalone server within a servlet container such as Tomcat or Jetty. Here are some of its main features:

  • Near real time indexing/search.
  • Powerful full text search and hit highlighting
  • Scalable and fault tolerant (no single point of failure)
  • Rich document (e.g. RTF, Word, PDF) support
  • Distributed
  • Many specialized queries: faceted search, grouping, pseudo-join, spatial search, functions

History of Solr

In 2004, a CNET developer named Yonik Seeley created a search platform to add search capabilities to the company's website. A company named LucidWorks was then launched to provide consulting, commercial support, and training for the search platform.

In 2006 the platform was donated to the Apache Software Foundation as an open source project.

It is now widely used as a search platform for large websites.

Get started

  • Download the Solr zip file from the Solr website and unzip the binary distribution (.zip). No installation is required.
  • Run the following command to start Solr:

    $ java -jar start.jar

Open http://localhost:8983/solr in a browser to view the admin interface.

Solr is now up and running, and we can start adding, updating, and retrieving documents from any application. You can use any programming language to add, update, retrieve, or remove documents, since Solr works over HTTP and supports JSON, XML, PHP, Ruby, Python, XSLT, Velocity, and custom Java binary output formats.
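As a rough illustration of choosing an output format, the sketch below builds query URLs that differ only in Solr's `wt` (response writer) parameter. It assumes the default local server from this guide and the standard `/select` handler; `select_url` is a made-up helper name, and nothing is sent over the network.

```python
import urllib.parse

SOLR = "http://localhost:8983/solr"  # assumption: default local Solr from this guide


def select_url(query, wt="json"):
    """Build a /select URL; wt names the response writer (json, xml, python, ...)."""
    params = urllib.parse.urlencode({"q": query, "wt": wt})
    return f"{SOLR}/select?{params}"


# Same query, three different response formats.
for fmt in ("json", "xml", "python"):
    print(select_url("*:*", wt=fmt))
```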

Let's use curl to send some JSON to add a document, and then get the real-time search result back as JSON.

Add and retrieve document

$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[ { "id" : "book1",
    "title" : "American Gods",
    "author" : "Neil Gaiman" } ]'

$ curl http://localhost:8983/solr/get?id=book1
{ "doc" : {
    "id" : "book1",
    "author" : "Neil Gaiman",
    "title" : "American Gods",
    "_version_" : 1410390803582287872 }}

Note: no commit of any kind is necessary to retrieve documents via /get (real-time get).
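The same add-then-get round trip can be sketched from Python using only the standard library. The helper names (`build_add_request`, `build_get_url`, `send`) are made up for this illustration, and the base URL assumes the default local server used above. Nothing is sent over the network until `send` is called.

```python
import json
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"  # assumption: default local Solr from this guide


def build_add_request(docs):
    """Build a POST to /update whose body is a JSON array of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        SOLR + "/update",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


def build_get_url(doc_id):
    """Build the real-time get URL for a single document id."""
    return SOLR + "/get?" + urllib.parse.urlencode({"id": doc_id})


def send(request_or_url):
    """Send the request and return the response text (needs a running Solr)."""
    with urllib.request.urlopen(request_or_url) as resp:
        return resp.read().decode("utf-8")


book = {"id": "book1", "title": "American Gods", "author": "Neil Gaiman"}
add_req = build_add_request([book])
print(add_req.get_method(), add_req.full_url)
print(build_get_url("book1"))
```

With a server running, `send(add_req)` followed by `send(build_get_url("book1"))` would mirror the two curl commands.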

Now we want to update our document. Let's add two new fields to the book document.

$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[ { "id" : "book1",
    "pubyear_i" : {"add":2001},
    "isbn_s" : {"add":"112-12321-12-1"} } ]'

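We don't need to define the schema first or predefine the columns. We can use a feature called dynamic fields to add any number of fields on the fly, as long as the field names match a dynamic field pattern in the schema (here, names ending in _i or _s).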
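One way to picture dynamic fields is as a naming convention in which the suffix of the field name selects the type. The sketch below assumes the suffix patterns found in Solr's example schema (`*_i` integer, `*_s` string, `*_d` double, `*_b` boolean); the `dynamic_name` helper is hypothetical, and your own schema may define different patterns.

```python
# Assumption: these suffixes match dynamicField patterns in your schema
# (they do in Solr's example schema; check your own before relying on them).
SUFFIX_BY_TYPE = {bool: "_b", int: "_i", float: "_d", str: "_s"}


def dynamic_name(base, value):
    """Return base plus the dynamic-field suffix for the value's Python type."""
    suffix = SUFFIX_BY_TYPE.get(type(value))
    if suffix is None:
        raise TypeError(f"no dynamic field suffix for {type(value).__name__}")
    return base + suffix


fields = {"pubyear": 2001, "isbn": "112-12321-12-1"}
doc = {"id": "book1"}
for base, value in fields.items():
    doc[dynamic_name(base, value)] = value
print(doc)
```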

The example below shows how to increment an integer field, replace a field value, or remove a field from the document.

$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[ { "id" : "book1",
    "copies_i" : {"inc":1},
    "isbn_s" : {"set":"112-12321-12-2"},
    "remove_s" : {"set":null} } ]'

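Atomic update commands share one shape: each field maps to a single-key object naming the operation. The sketch below builds such a payload; `atomic_update` is a hypothetical helper, and `add`, `set`, and `inc` are Solr's atomic update modifiers.

```python
import json

ALLOWED_OPS = {"add", "set", "inc"}  # Solr atomic update modifiers


def atomic_update(doc_id, **changes):
    """Build one atomic-update document.

    Each keyword maps a field name to an (operation, value) pair,
    e.g. copies_i=("inc", 1). A ("set", None) pair removes the field.
    """
    doc = {"id": doc_id}
    for field, (op, value) in changes.items():
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operation: {op}")
        doc[field] = {op: value}
    return doc


update = atomic_update(
    "book1",
    copies_i=("inc", 1),
    isbn_s=("set", "112-12321-12-2"),
    remove_s=("set", None),
)
# The /update handler expects a JSON array of documents.
print(json.dumps([update], indent=1))
```

Python's `None` serializes to JSON `null`, which is how a field removal is expressed.
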
Optimistic Concurrency

Conditional update based on document version.

You get the document from Solr, modify it while retaining the document version, and send the update back to Solr. The update can fail if someone else has updated the document in the meantime. If it fails with a 409 status code, repeat the process.

Optimistic Concurrency Example

Get the document

$ curl http://localhost:8983/solr/get?id=book2
{ "doc" : {
    "author" : "William Gibson",
    "_version_" : 123456789 }}

Modify the document, retaining the _version_

$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[ { "id" : "book2",
    "author" : "William Gibson",
    "_version_" : 123456789 } ]'

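The get, modify, retry cycle can be sketched as a loop. To keep the example runnable without a server, the two network steps are passed in as plain functions: `fetch`, `store`, `VersionConflict`, and `update_with_retry` are all hypothetical names, and a real `store` would raise `VersionConflict` when Solr answers with HTTP 409. The in-memory `db` only imitates version checking; it does not reproduce how Solr assigns version numbers.

```python
class VersionConflict(Exception):
    """Stand-in for an HTTP 409 response from the update handler."""


def update_with_retry(fetch, store, doc_id, modify, max_attempts=5):
    """Fetch a document, apply modify(doc), and store it with its _version_.

    On a version conflict the document is fetched again and the change
    is reapplied, up to max_attempts times.
    """
    for _ in range(max_attempts):
        doc = fetch(doc_id)            # includes the current _version_
        changed = modify(dict(doc))    # work on a copy, keep _version_
        try:
            return store(changed)
        except VersionConflict:
            continue                   # someone else updated first; start over
    raise RuntimeError(f"gave up after {max_attempts} conflicting updates")


# In-memory stand-in for Solr so the loop can be exercised locally.
db = {"book2": {"id": "book2", "author": "William Gibson", "_version_": 1}}


def fetch(doc_id):
    return dict(db[doc_id])


def store(doc):
    current = db[doc["id"]]
    if doc["_version_"] != current["_version_"]:
        raise VersionConflict(doc["id"])
    db[doc["id"]] = {**doc, "_version_": current["_version_"] + 1}
    return db[doc["id"]]


def add_copy(doc):
    doc["copies_i"] = doc.get("copies_i", 0) + 1
    return doc


result = update_with_retry(fetch, store, "book2", add_copy)
print(result)
```

Capping the number of attempts is a design choice made here so that a permanently contended document cannot cause an endless loop.
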
Simplified JSON delete syntax

  • Single delete by id {"delete":"book1"}
  • Multiple delete by ids {"delete":["book1","book2",....]}
  • Delete with optimistic concurrency {"delete":{"id":"book1","_version_":"12121212"}}
  • Delete by query {"delete":{"query":"tag:category1"}}
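
The four delete forms differ only in the value under the `delete` key. The sketch below builds each one as a Python dict and serializes it to JSON; the function names are made up for illustration, and the payloads would be sent to the same /update handler as the earlier examples.

```python
import json


def delete_by_id(*ids):
    """One id gives {"delete": "id"}; several give {"delete": [ids...]}."""
    if not ids:
        raise ValueError("at least one id is required")
    return {"delete": ids[0] if len(ids) == 1 else list(ids)}


def delete_if_version(doc_id, version):
    """Delete only if the stored _version_ still matches."""
    return {"delete": {"id": doc_id, "_version_": version}}


def delete_by_query(query):
    """Delete every document matching a query string."""
    return {"delete": {"query": query}}


for payload in (
    delete_by_id("book1"),
    delete_by_id("book1", "book2"),
    delete_if_version("book1", 12121212),
    delete_by_query("tag:category1"),
):
    print(json.dumps(payload))
```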