Sangat Pedas

Implementing Solr4 in WordPress – No Plugin


I’m currently working on a WordPress-based travel site…

 

I left some space here where the experts can chuckle, LOL, ROFL and whatever else they think is appropriate. To those cynics I say good luck with your home-made platform. WordPress is so much more than a blog platform; it’s a great framework for almost every project. It’s known for its near-perfect and solid database design, it’s equipped with a great admin interface, and it offers great caching plugins and much more. At first I thought I would have to migrate to another platform, but along the way I found out that, at least for me, the possibilities are unlimited. And scalability? Well, read this post and make up your mind again.

When and why Solr in WordPress?

For me there were three solid reasons, besides the fact that using a text-based search engine is just really cool. That alone could be reason enough if you run a simple blog like this one, but in that case it’s definitely not a necessity: caching plugins and a CDN will do the trick for you.

Scalability

If you plan to become a high-traffic site, your database might become a bottleneck. In my case, my theme has a lot of custom fields. As a result, a category page showing 10 posts per page triggered more than 100 database reads. With the current traffic and server capacity that’s not a problem, but you don’t need to be an expert to see that one day it will become a serious one.

Flexibility

In my opinion, MySQL is in principle for writing, and a text-based search engine is for searching and reading. Especially when you use a lot of custom fields, your MySQL-based search will become complicated and resource intensive, which results either in extremely complex code to generate the right queries or in you deciding to drop some functionality. I don’t want either. Solr and other text-based search engines allow you to “flatten” your data and run any kind of query you want in a very easy way, all much faster than MySQL.

Speed

It follows more or less from the two points above, but Solr is simply much faster than MySQL at querying your data, period.

Why not use a standard Solr plugin?

First of all, if you use WordPress as a framework for a travel site or anything else for which you could say WordPress is not intended, I would advise staying away from all plugins that deliver front-end functionality. There’s nothing wrong with using a plugin for caching or SEO, but for most other things I would rely on my own code, for many reasons. You don’t know what you get, how well it’s been coded or what the implications are when you update a plugin. Besides that, building your own code for widgets makes things much more flexible, and you’re not limited by the functionality the plugin offers.

Second, I tried some of those plugins and wasn’t impressed or pleased. The text output was distorted and, again, they didn’t allow for much customisation, because any update of the plugin would instantly kill all custom code.

And last, it’s really not that difficult to code it yourself.

Designing your schema

Before you can add a new core to Solr you need to design a schema defining the fields of your core. Here’s a short example:

Once you’ve created your schema you need to save it in the conf directory of your new core. Assuming that you’ve completed all steps in my previous post on how to install Solr 4, your next step is filling your core with data.

Adding data to your core

The first method I considered was the integrated Data Import Handler with its MySQL handler. The problem I ran into was that the queries would be extremely complex because of the many custom fields and other post-related info I wanted to add to Solr. Basically it cannot be done this way in my situation, and any attempt would result in slow queries and a choking database.

So I had to come up with another method. After doing some reading I realised that the Data Import Handler is not the way to go and that I would go for the update handler. The update handler accepts JSON input, and I figured I could easily and quickly create the JSON input required for my schema.
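To make the “flattening” idea concrete, here is a minimal sketch of turning a post plus its custom fields into the flat documents the update handler takes as a JSON array. It is written in Python purely for illustration, not in the PHP the site itself uses. The `ID`, `post_title` and `post_content` keys mirror WordPress post columns, but the Solr field names (`id`, `title`, `content`, `country`, `activities`) and both function names are assumptions, since the actual schema is not shown here.

```python
import json


def flatten_post(post, custom_fields):
    """Merge a post's core columns and its custom fields into one flat document."""
    doc = {
        "id": post["ID"],
        "title": post["post_title"],
        "content": post["post_content"],
    }
    # Custom fields become top-level keys. A field with several values stays a
    # list, which would need a multiValued field in the (assumed) schema.
    for key, values in custom_fields.items():
        doc[key] = values[0] if len(values) == 1 else list(values)
    return doc


def build_update_payload(posts_with_fields):
    """Serialise the documents as a JSON array for Solr's update handler."""
    docs = [flatten_post(post, fields) for post, fields in posts_with_fields]
    return json.dumps(docs, ensure_ascii=False)


# Hypothetical sample data: one post with two custom fields.
payload = build_update_payload([
    (
        {"ID": 42, "post_title": "Hello Bali", "post_content": "Sun and sea."},
        {"country": ["Indonesia"], "activities": ["surfing", "diving"]},
    ),
])
print(payload)
```

Because every document is already flat, Solr never has to join anything at query time, which is where the many reads per page came from on the MySQL side.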

Building the API

I chose to build an API that generates JSON output, not on screen but in a file, which I immediately move outside the web-accessible environment. I don’t want people to be able to download my complete database or fragments of it.

In this API I leverage everything WordPress has to offer and all the functions I had already created to get data in an efficient way. In order to do that you need to tell the API that it’s running in a WordPress environment by adding the following lines:
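A variation on this idea is to never create the file inside the web root at all and to write it straight into a private directory. The sketch below is in Python for illustration only, with a hypothetical `write_export` helper and file name. It writes to a temporary file first and then renames it, so a cron job reading the export never sees a half-written file.

```python
import json
import os
import tempfile


def write_export(docs, private_dir, filename="solr_export.json"):
    """Write docs as JSON into private_dir and return the final path."""
    os.makedirs(private_dir, exist_ok=True)
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=private_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(docs, handle, ensure_ascii=False)
    final_path = os.path.join(private_dir, filename)
    # Renaming within one filesystem replaces the old export in a single step.
    os.replace(tmp_path, final_path)
    return final_path
```

The private directory only needs to be readable by the user that runs the import script, not by the web server’s public document root.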

Now your API is connected to your WordPress environment. Next I queried my post table:

So now you’ve created a proper JSON file that holds all your WordPress data. Obviously you can make it leaner and meaner by incorporating “last_modified” in your query, so that it only holds new and changed records.

If Solr and WordPress run on different servers, I guess you have to copy the file to the other server using FTP or something similar. Google that shit yourself!
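The “only new and changed records” idea could look roughly like the following. This is a Python sketch under assumptions: it filters on WordPress’s `post_modified_gmt` column, and the `changed_since` helper and the stored timestamp of the previous export are hypothetical. In practice the same condition would more likely go into the query itself.

```python
from datetime import datetime

# Format WordPress uses for its DATETIME columns such as post_modified_gmt.
WP_DATETIME = "%Y-%m-%d %H:%M:%S"


def changed_since(posts, last_run):
    """Return only the posts modified after the previous export."""
    return [
        post for post in posts
        if datetime.strptime(post["post_modified_gmt"], WP_DATETIME) > last_run
    ]


# Hypothetical sample: only the second post changed after the last export.
posts = [
    {"ID": 1, "post_modified_gmt": "2013-01-10 08:00:00"},
    {"ID": 2, "post_modified_gmt": "2013-02-01 12:30:00"},
]
recent = changed_since(posts, datetime(2013, 1, 15))
print([post["ID"] for post in recent])
```

Note that a filter like this only picks up new and edited posts. Deleted posts would still need to be removed from the core separately.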

Committing your data to Solr

The last step is pretty simple and I’ve created a simple script for that so I can easily assign it to a cron job:

The first line puts me in the folder where the JSON file is stored; the second line commits the JSON file to the specified core on your Solr server. If all goes well the response would be something like this:

If you get an error, check the field types and whether you’ve defined a field as required that has no value in your JSON file.
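Both steps, checking for missing required fields and posting the file to the update handler, can be sketched as follows. This is Python for illustration rather than the shell script used for the cron job. The host, port, core name `travel` and the list of required fields are assumptions; the `/update` path with `commit=true` and a JSON content type is how Solr 4’s update handler is addressed.

```python
import json
import urllib.request


def missing_required(docs, required_fields):
    """Return (doc id, field) pairs where a required field is absent or empty."""
    problems = []
    for doc in docs:
        for field in required_fields:
            if doc.get(field) in (None, "", []):
                problems.append((doc.get("id"), field))
    return problems


def build_update_request(body, core, solr_base="http://localhost:8983/solr"):
    """Build (but do not send) the POST that indexes and commits the JSON body."""
    url = f"{solr_base}/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


docs = [{"id": 1, "title": "Hello Bali"}, {"id": 2, "title": ""}]
print(missing_required(docs, ["id", "title"]))

request = build_update_request(json.dumps(docs).encode("utf-8"), "travel")
# Sending it requires a running Solr instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Running the check before posting gives a clearer message than Solr’s error response, because it names the exact document and field.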

Using JSON in WordPress

At the time of writing I’ve only integrated it into my category page, but all other pages will be quite similar, and especially for a custom search Solr will make your life so much easier.
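For a category page, the request to Solr boils down to one select query with a filter on the category plus the paging parameters. The sketch below is in Python for illustration; the `category` field name, the core name `travel`, the host and the `category_query_url` helper are assumptions, while `q`, `start`, `rows` and `wt=json` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def category_query_url(category, page, per_page=10, core="travel",
                       solr_base="http://localhost:8983/solr"):
    """Build the select URL for one page of a category listing."""
    params = {
        # Quoted so that multi-word category names are matched as a phrase;
        # a category containing a double quote would need escaping first.
        "q": f'category:"{category}"',
        "start": (page - 1) * per_page,  # Solr offsets are zero-based
        "rows": per_page,
        "wt": "json",
    }
    return f"{solr_base}/{core}/select?{urlencode(params)}"


url = category_query_url("bali", page=2)
print(url)
```

The JSON response to such a request carries the matching documents in `response.docs` and the total number of matches in `response.numFound`, which is all a template needs to render the list.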

Here is the beginning of the code of your category.php in your template:

Basically you now have everything you need to set up paging. You could stick with the default WordPress paging, but Solr may lag slightly behind the MySQL database due to synchronisation delays, so I would suggest writing your own function for the links to the previous and next page, which can be something like this:
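As a rough illustration of the idea (in Python rather than the theme’s PHP, with a hypothetical `paging_links` helper and a made-up `?page=` URL format), the previous and next links can be derived from Solr’s `numFound` together with the `start` and `rows` values of the request:

```python
def paging_links(base_url, num_found, start, rows):
    """Return previous/next page URLs based on Solr's own result count."""
    links = {"prev": None, "next": None}
    current_page = start // rows + 1
    if start > 0:
        links["prev"] = f"{base_url}?page={current_page - 1}"
    # Only offer a next page if Solr actually has more documents to show.
    if start + rows < num_found:
        links["next"] = f"{base_url}?page={current_page + 1}"
    return links


# 25 matches, 10 per page, currently on the second page (start=10).
print(paging_links("/category/bali", num_found=25, start=10, rows=10))
```

Because the links depend only on what Solr reports, a post that exists in MySQL but has not been indexed yet cannot produce a link to an empty page.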

And then replace the standard links in your category page with the new ones:

There you go, you just turbocharged WordPress with Solr, no plugin needed and super fast.

  • http://navinot.com/ neofreko

    Get yourself ready for score tweaking and whatnot. Search is blazingly fast, but document scoring can be difficult sometimes. Many end up with a catch-all field and ignore other field weighting.

    Good luck with wordpress :D