Starting with Sphinx – Installation, Configuration

I recently used Sphinx in one of my projects, so I want to share my experience with it.

Installation

Installing Sphinx is quite simple. On Ubuntu:

sudo add-apt-repository ppa:builds/sphinxsearch-daily
sudo apt-get update
sudo apt-get install sphinxsearch

More details given here

On CentOS, download the latest RPM from here and install it using

rpm -Uhv sphinx-2.2.1-1.rhel6.x86_64.rpm

More details given here

If during installation you get an error like

Error: Package: sphinx-2.1.7-1.rhel6.x86_64 (/sphinx-2.1.7-1.rhel6.x86_64)
           Requires: libmysqlclient.so.16()(64bit)
Error: Package: sphinx-2.1.7-1.rhel6.x86_64 (/sphinx-2.1.7-1.rhel6.x86_64)
           Requires: libmysqlclient.so.16(libmysqlclient_16)(64bit)

In that case, you need to install Sphinx directly from source. I tried many ways to fix the above problem, but couldn’t.

To install from source, download the source tarball of the latest stable release from here and follow the steps below

wget http://sphinxsearch.com/files/sphinx-2.1.8-release.tar.gz
tar -xzf sphinx-2.1.8-release.tar.gz
cd sphinx-2.1.8-release
sudo mkdir -p /usr/local/sphinx
./configure --prefix=/usr/local/sphinx --with-mysql
make
sudo make install

In case of any issues you can refer here
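
Before running configure, it can help to check that the build tools are actually on your PATH. Below is a minimal sketch; the helper name missing_tools is my own, and the list of tools in the usage line (g++, make, mysql_config, wget) is an assumption about what a --with-mysql build typically needs, not something taken from the Sphinx docs.

```shell
# Hypothetical pre-flight helper: print each named command that is NOT on PATH.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, running missing_tools g++ make mysql_config wget should print nothing on a machine that is ready to build; any name it prints is a package you probably still need to install.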

After you're done with the above, go to the folder /usr/local/sphinx, where you will see your Sphinx installation.

cd /usr/local/sphinx
ls

You should see the folders
bin/
etc/
share/
var/

Inside the bin/ folder you should find all the executables, such as “indexer” and “searchd”.

“searchd” = This is the main Sphinx service: it acts as the server and does the searching. Find more details here
To start searchd, run the command

/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx-custom.conf

“indexer” = This is the main tool used to index data for Sphinx. You would use it to index data from MySQL or any other data source into Sphinx.
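
To confirm that both tools landed where expected, something like the following sketch can be used. The helper name check_sphinx_bins is hypothetical, and the default prefix assumes the source install under /usr/local/sphinx; pass a different prefix if yours differs.

```shell
# Hypothetical helper: report whether indexer and searchd are present and
# executable under an install prefix (default: /usr/local/sphinx).
# Returns 0 when both are found, 1 when at least one is missing.
check_sphinx_bins() {
    prefix="${1:-/usr/local/sphinx}"
    missing=0
    for tool in indexer searchd; do
        if [ -x "$prefix/bin/$tool" ]; then
            echo "ok: $prefix/bin/$tool"
        else
            echo "missing: $prefix/bin/$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_sphinx_bins with no arguments checks the default prefix and prints one line per tool.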

Sphinx Configuration

The next important step is configuring Sphinx.
By default, Sphinx provides two configuration files in the folder ‘/usr/local/sphinx/etc’:
1. sphinx.conf.dist: This config file lists all possible configuration options, with comments explaining each.
2. sphinx-min.conf.dist: This is a minimal config file with a single data source and a single index. We will use it as a starting point and edit it as needed.

First, copy the file (run this from inside /usr/local/sphinx/etc)

sudo cp sphinx-min.conf.dist sphinx-custom.conf

Next, open the copy for editing

sudo nano sphinx-custom.conf

Data Source Configuration

Below I explain a MySQL data source configuration.
Suppose we have a table named products_indexed with the columns full_name, author, category_name, full_spec and brand that we need indexed, along with the other columns shown below.
[Screenshot: columns of the products_indexed table]

Below is the data source configuration for this table

source products_indexed
{
	type			= mysql

	sql_host		= localhost
	sql_user		= etech_pricegenie
	sql_pass		= [email protected]
	sql_db			= etech_pricegenie
	sql_port		= 3306	# optional, default is 3306

	sql_query		= SELECT * FROM products_indexed WHERE id>=$start AND id<=$end

	sql_field_string = full_name
	sql_field_string = author
	sql_field_string = category_name
	sql_field_string = full_spec
	sql_field_string = brand
	sql_attr_uint = cat_id
	sql_attr_uint = sub_cat_id
	sql_attr_uint = cat_priority
	sql_attr_uint = prod_order
	sql_attr_uint = isbn
	sql_attr_uint = sellers
	sql_attr_uint = product_id
	sql_attr_uint = search_update
	sql_attr_string = url
	sql_attr_string = model_no
	sql_attr_string = image
	sql_attr_string = query_id
	sql_attr_float = price
	sql_attr_float = lowest_price

	#sql_attr_timestamp	= date_added

	sql_query_range  = SELECT MIN(id), MAX(id) FROM products_indexed
	sql_range_step  = 1000
	sql_query_info		= SELECT * FROM products_indexed WHERE id=$id
}

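sql_field_string: Declares a column that Sphinx both indexes for full-text search and stores as a string attribute.
sql_attr_uint, sql_attr_string, sql_attr_float: Unsigned integer, string and float attributes that Sphinx stores with each document and returns in our search results. Numeric attributes can also be used for filtering and sorting.
sql_query_range, sql_range_step: Together these make Sphinx fetch the table in ranges of 1000 IDs at a time (substituting $start and $end in sql_query) instead of the entire table in one go, which is important for big tables.
There are a host of other configuration options, which you can read about here or in the sphinx.conf.dist file.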
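
To make the ranged fetching concrete, here is a small illustration of how a minimum ID, maximum ID and step break down into the WHERE conditions that end up in sql_query. It is only a sketch of the arithmetic as I understand it, not Sphinx's actual code, and the function name print_ranges is made up for this post.

```shell
# Illustration only: print the ID condition for each range that a ranged
# query would cover, given the minimum ID, maximum ID and step size.
print_ranges() {
    start="$1"
    max="$2"
    step="$3"
    # Guard against a zero or negative step, which would never terminate.
    if [ "$step" -lt 1 ]; then
        echo "step must be at least 1" >&2
        return 1
    fi
    while [ "$start" -le "$max" ]; do
        end=$((start + step - 1))
        if [ "$end" -gt "$max" ]; then
            end="$max"
        fi
        echo "id>=$start AND id<=$end"
        start=$((end + 1))
    done
}
```

With IDs from 1 to 2345 and a step of 1000, print_ranges 1 2345 1000 prints three conditions: 1 to 1000, 1001 to 2000, and 2001 to 2345, so the table is read in three smaller queries.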

Indexer Configuration

index products_indexed
{
	source			= products_indexed
	path			= /usr/local/sphinx/var/data/products_indexed
	morphology		= stem_en
}

There are a host of other configuration options, which you can read about here or in the sphinx.conf.dist file.

Once you're done with the above, save the file and run the following to index the data.

/usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx-custom.conf --all

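You might also need to start searchd. As far as I can tell, searchd has no --restart option, so to restart it, stop it with --stopwait and start it again.

/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx-custom.conf
#/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx-custom.conf --stop
#/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx-custom.conf --stopwait
#/usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx-custom.conf --status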

After this, you can test your search using

/usr/local/sphinx/bin/search -c /usr/local/sphinx/etc/sphinx-custom.conf --index products_indexed samsung galaxy s4

This prints the documents matching the query along with their attributes.

To pick up new data later, re-run the indexer with --rotate, which rebuilds the indexes while searchd is running and tells searchd to switch to the new files

/usr/local/sphinx/bin/indexer --rotate --config /usr/local/sphinx/etc/sphinx-custom.conf --all

You can also add the above command to a cron job to re-index automatically.
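
For cron, I find it easier to wrap the command in a small script so the output ends up in a log file. This is only a sketch: the function name reindex_all, the three variable names and the log location are my own choices, and the default paths assume the source install under /usr/local/sphinx.

```shell
# Hypothetical cron wrapper: rebuild all indexes with --rotate and append the
# indexer output to a log file. All three paths are assumptions and can be
# overridden through the environment.
INDEXER="${INDEXER:-/usr/local/sphinx/bin/indexer}"
SPHINX_CONF="${SPHINX_CONF:-/usr/local/sphinx/etc/sphinx-custom.conf}"
REINDEX_LOG="${REINDEX_LOG:-/tmp/sphinx-reindex.log}"

reindex_all() {
    # Fail early with a clear message if the indexer binary is not there.
    if [ ! -x "$INDEXER" ]; then
        echo "indexer not found at $INDEXER" >&2
        return 1
    fi
    "$INDEXER" --rotate --config "$SPHINX_CONF" --all >> "$REINDEX_LOG" 2>&1
}
```

Saved as a script that ends with a call to reindex_all, it could be scheduled with a crontab line such as 0 * * * * /usr/local/sphinx/reindex.sh (the script path is hypothetical), which would re-index once an hour.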

Auto Start Sphinx On Reboot

sudo vi /etc/init.d/searchd

Copy the following code into it

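#!/bin/bash
# Paths assume the source install under /usr/local/sphinx described above.
SEARCHD=/usr/local/sphinx/bin/searchd
CONFIG=/usr/local/sphinx/etc/sphinx-custom.conf

case "${1:-}" in
'start')
"$SEARCHD" --config "$CONFIG"
;;
'stop')
"$SEARCHD" --config "$CONFIG" --stop
;;
'restart')
# --stopwait blocks until searchd has exited, so the start does not race the shutdown
"$SEARCHD" --config "$CONFIG" --stopwait && "$SEARCHD" --config "$CONFIG"
;;
*)
echo "Usage: $0 start|stop|restart"
exit 1
;;
esac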

To save and exit vi, press ESC, type :wq and press ENTER

Then

sudo chmod +x /etc/init.d/searchd
sudo update-rc.d searchd defaults

Hope you enjoyed my blog on sphinx and found it useful.
http://excellencemagentoblog.com/sphinx-installation-configuration
http://excellencemagentoblog.com/sphinx-sphinxql-php