How-to install Solr 4.0 with Tomcat 7 on Webfaction [Tutorial]


For one of my startups-still-in-stealth-mode I’m working on a professional and scalable solution for search suggestions. Considering the current low number of daily visits, I first thought I could easily get away with one of the many tutorials on search suggestions with MySQL. However, the complexity would force me to combine JOINs with UNIONs, which in the end would result in crappy performance. Also, I shoot for the stars, so anything I build needs to be super scalable.

So I chose a setup with a text-based search engine. These days there are many text-based search engines out there, but since I have some experience with the Lucene-based Solr, and since Solr is just a bit more suited for dummies, I chose Solr. Another popular alternative is Elasticsearch, but I seriously doubt that my website or yours would come to a point where you can’t do it with Solr. The Lucene engine is suitable for an extremely high number of concurrent requests, is super stable and serves many high-traffic websites out there.

So, this will be the first time I actually install Solr on a Webfaction server, and for the few out there struggling with the same I’m sharing the complete process step by step.

In order to run, Solr needs a servlet container like Tomcat or Jetty. Solr actually comes with Jetty built in, but for several reasons (security, doing-difficult-just-for-the-heck-of-it and others) I’ve decided to go with Tomcat 7.

OMG, this was a total nightmare and cost me more than 2 days to get done, so kudos and other forms of appreciation are very welcome!

Step 1: Installing Tomcat 7

Just follow the steps as described in the post How-to install Tomcat 7 on Webfaction for Dummies. Next, stop the Tomcat service:
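A minimal sketch of stopping Tomcat, assuming it was installed under $HOME/webapps/tomcat. That default path and the stop_tomcat helper name are assumptions for illustration; shutdown.sh is the stop script that ships in Tomcat's bin directory.

```shell
# Assumption: Tomcat lives in $HOME/webapps/tomcat; adjust to your own app name.
CATALINA_HOME="${CATALINA_HOME:-$HOME/webapps/tomcat}"

# Hypothetical helper: stop Tomcat via the shutdown.sh script in its bin/ directory.
stop_tomcat() {
  if [ -x "$CATALINA_HOME/bin/shutdown.sh" ]; then
    "$CATALINA_HOME/bin/shutdown.sh"
  else
    echo "No executable shutdown.sh found under $CATALINA_HOME/bin" >&2
    return 1
  fi
}

# Usage: stop_tomcat
```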

Step 2: Download the Solr Distribution

First go to the root folder of your newly added website, which is probably webapps/tomcat under your home directory.

Next, find the link to the latest production version of Solr. Go to http://apache.spinellicreations.com/lucene/solr/ and click on the latest production version, which in this case is 4.0.0. Click on the folder link and then copy the link location of the tgz file. Watch out that you don’t copy the source code, which is identified by having -src- in the filename. Once you’ve copied the download location, go back to your ssh session and type:

Then wait till the download is finished, which took quite a while in my case, probably because I chose a mirror that’s quite far from my web server. Next, extract the downloaded file:
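The extraction step can be sketched as below. The extract_solr helper and its arguments are hypothetical; pass whatever tgz file you actually downloaded.

```shell
# Hypothetical helper: unpack a Solr .tgz into a target directory.
extract_solr() {
  tarball="$1"     # the .tgz file you just downloaded
  dest="${2:-.}"   # where to unpack; defaults to the current directory
  mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Usage: extract_solr <downloaded-file>.tgz
```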

Basically here I followed the instructions from the Apache Wiki on installing Solr on Tomcat:

Copy the example/solr directory from the source to the installation directory like /opt/solr/example/solr, hereafter $SOLR_HOME. Copy the .war file dist/apache-solr-*.war into $SOLR_HOME as solr.war.
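A rough sketch of those two copy steps. The install_solr_home helper is hypothetical, and it assumes the unpacked distribution contains example/solr and exactly one apache-solr-*.war under dist/.

```shell
# Hypothetical helper: populate $SOLR_HOME from an unpacked Solr distribution.
install_solr_home() {
  dist_root="$1"   # unpacked distribution (contains example/ and dist/)
  solr_home="$2"   # target directory, e.g. /opt/solr/example/solr
  mkdir -p "$solr_home" || return 1
  # Copy the contents of example/solr into the Solr home
  cp -R "$dist_root/example/solr/." "$solr_home/" || return 1
  # Copy the .war and rename it to solr.war (assumes the glob matches one file)
  cp "$dist_root"/dist/apache-solr-*.war "$solr_home/solr.war"
}

# Usage: install_solr_home <unpacked-distribution-dir> /opt/solr/example/solr
```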

The configuration file $SOLR_HOME/conf/solrconfig.xml in the example sets dataDir for the index to be ./solr/data relative to the current directory – which is true for running the Jetty server provided with the example, but incorrect for Tomcat running as a service. Modify the dataDir to specify the full path to $SOLR_HOME/data:

The dataDir can also be temporarily overridden with the JAVA_OPTS environment variable prior to starting Tomcat:
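As a sketch of that temporary override: the example solrconfig.xml reads the data directory from the solr.data.dir system property, so it can be passed through JAVA_OPTS. The SOLR_HOME default below is simply the example path from the instructions above, so adjust it to your own layout.

```shell
# Assumption: SOLR_HOME defaults to the example path used in the instructions.
SOLR_HOME="${SOLR_HOME:-/opt/solr/example/solr}"

# Append the override to any JAVA_OPTS already set, then start Tomcat from this shell.
export JAVA_OPTS="${JAVA_OPTS:-} -Dsolr.data.dir=$SOLR_HOME/data"
echo "JAVA_OPTS is now:$JAVA_OPTS"
```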

Create a Tomcat Context fragment to point docBase to the $SOLR_HOME/solr.war file and solr/home to $SOLR_HOME:

Symlink or place the file in $CATALINA_HOME/conf/Catalina/localhost/solr-example.xml, where Tomcat will automatically pick it up. Tomcat deletes the file on undeploy (which happens automatically if the configuration is invalid).

It all seemed pretty straightforward and I was hopeful it would work, but it didn’t... Great, and all the articles I could find on this topic were either way outdated or not relevant. This is where a 2-day quest began, during which I also installed Solr using the included Jetty just to see Solr in action. But since securing Jetty is a drag, I didn’t want to give up.

The errors I got varied from “Corrupt war file” to:

Nov 27, 2012 10:10:43 AM org.apache.catalina.startup.ContextConfig init
SEVERE: Exception fixing docBase for context [/solr]
java.util.zip.ZipException: error in opening zip file

Some posts pointed to changing permissions on files and folders, but in the end I found it had to do with the paths. Somehow relative paths seemed to fail, even though the paths logged were exactly the right paths. So I changed all paths in any config file to full absolute paths:

So, first the most important one, the Tomcat Context fragment, which I saved as solr.xml instead of solr-example.xml just because it gives a nicer folder name.

So this is the new Tomcat Context fragment, which you have to save as /conf/Catalina/localhost/solr.xml, and don’t forget to remove any other existing context fragment:
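A sketch of such a fragment, written out by a small shell function. This is not necessarily the exact file used in this post: the XML follows the Context layout from the Apache wiki instructions, and the write_solr_context helper and both path arguments are hypothetical. The function refuses relative paths, since those were what failed here.

```shell
# Hypothetical helper: write conf/Catalina/localhost/solr.xml using absolute paths only.
write_solr_context() {
  catalina_home="$1"   # absolute path to the Tomcat root
  solr_home="$2"       # absolute path to the Solr home that holds solr.war
  case "$catalina_home" in /*) ;; *) echo "catalina_home must be absolute" >&2; return 1 ;; esac
  case "$solr_home" in /*) ;; *) echo "solr_home must be absolute" >&2; return 1 ;; esac
  mkdir -p "$catalina_home/conf/Catalina/localhost" || return 1
  # docBase points at the war file, solr/home at the Solr home directory
  cat > "$catalina_home/conf/Catalina/localhost/solr.xml" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="$solr_home/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true"/>
</Context>
EOF
}

# Usage (paths are assumptions based on a Tomcat app in ~/webapps/tomcat):
# write_solr_context "$HOME/webapps/tomcat" "$HOME/webapps/tomcat/opt/solr/example/solr"
```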

Next I also had to change the dataDir setting in /opt/solr/example/solr/collection1/conf/solrconfig.xml to a full absolute path:
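One way to make that change from the shell, as a sketch. It assumes the dataDir element sits on a single line, as in the example solrconfig.xml, and that the path contains no | or & characters, which would confuse sed. The set_data_dir helper is hypothetical.

```shell
# Hypothetical helper: point <dataDir> in solrconfig.xml at an absolute directory.
set_data_dir() {
  config="$1"     # path to solrconfig.xml
  data_dir="$2"   # absolute path to the index data directory
  case "$data_dir" in /*) ;; *) echo "data_dir must be absolute" >&2; return 1 ;; esac
  # Replace the whole single-line element; '|' is used as the sed delimiter
  sed "s|<dataDir>.*</dataDir>|<dataDir>$data_dir</dataDir>|" "$config" > "$config.tmp" &&
    mv "$config.tmp" "$config"
}

# Usage: set_data_dir /opt/solr/example/solr/collection1/conf/solrconfig.xml <absolute-data-dir>
```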

And that was when my two-day quest was over: after restarting Tomcat I could access Solr, the core was started and I could add data to it. Now I’m pretty sure there must be a more graceful way to do this, but it did the trick. Any suggestions are obviously always welcome.

Some More Tips

When I was trying to add another core I kept running into errors about fieldtypes, even though I copied that whole section from the example in my new schema. The solution was to add a version field to your schema:

Don’t fill this field, it will be handled by Solr itself.
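As a sketch, a quick check for whether a schema already declares the field. The declaration printed below is the _version_ field as it appears in the Solr 4 example schema and assumes a fieldType named long is defined in your schema. Both helper names are hypothetical.

```shell
# Hypothetical helper: succeeds if the schema file declares a field named _version_.
has_version_field() {
  grep -q '<field[^>]*name="_version_"' "$1"
}

# Hypothetical helper: print the declaration to paste into the <fields> section.
print_version_field() {
  echo '<field name="_version_" type="long" indexed="true" stored="true"/>'
}

# Usage: has_version_field path/to/schema.xml || print_version_field
```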

  • anonymous

    What do you mean your newly added website is named /webapps/solr? You named it webapps/tomcat?

    • Thomas

      I’d appreciate an answer to this as well.

    • http://sangatpedas.com Sangat Pedas

      Sorry, this was a typo, it should have been /webapps/tomcat. I also did a Solr installation outside of Tomcat, so I mixed up the two locations.

  • anonymous

    there’s no /opt/ directory anywhere. you lost me when you started with the copy/paste of apache directions

    • http://sangatpedas.com Sangat Pedas

      the opt dir is under the directory ../username/webapps/tomcat/

  • Jony Kwa

    Thank you, you saved me days of work!

    • http://sangatpedas.com Sangat Pedas

      You’re welcome mate!

  • Craig Sander

    Hi, first of all, thanks a lot for this tut. It’s incredibly helpful. However, I’m getting lost on the file-path copies you outline. Could you possibly post a more detailed explorer image that shows the full “username/webapps/tomcat/opt/…” directories?

    I think that might help clarify what the final structure should look like.

    At the moment, I have Tomcat running and everything and it recognizes that /solr is present, but I get “FAIL – Application at context path /solr could not be started” errors when I try to start it up in the Tomcat homepage.

    • http://sangatpedas.com Sangat Pedas

      Hi Craig, I hope this helps

      • Craig Sander

        That’s exactly what I needed! Thanks a lot for the quick reply!

        • http://sangatpedas.com Sangat Pedas

          You’re welcome. It seems more people have trouble with the whole folder structure so maybe you can point out where my post becomes unclear so I can change it a bit?

          • Craig Sander

            Sorry, I just saw this! I’m back to do it again and I’m getting stuck where you essentially follow Apache’s guidelines. Right where the italics start is where I get stuck.