Magento URL Rewrite Process

URL rewriting is a core feature of Magento and is used throughout the platform. It is mainly used to generate SEO-friendly URLs for products, categories, CMS pages etc. Rewrites can also be defined in the config.xml file when you create modules. We will look at both mechanisms in detail and see how they work.

Database Table URL Rewrites

Magento stores URL rewrites in the table ‘core_url_rewrite’. It has the following columns:
url_rewrite_id: primary key
store_id: id of the store the rewrite applies to
id_path: internal unique identifier. It is usually ‘product’ + product_id or ‘category’ + category_id etc.
request_path: the URL which is opened in the browser
target_path: the path which should open if the request path is matched
is_system: whether the rewrite is system generated or manually created
options: redirect option: ‘RP’ for a permanent redirect (301), ‘R’ for a temporary redirect (302), or null/empty
description: a description, if one was entered
category_id: the category id if the rewrite is for a category
product_id: the product id if the rewrite is for a product

We can also add custom rewrites to this table from Admin -> Catalog -> URL Rewrite Management.
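To make the ‘options’ column concrete, here is a minimal sketch in plain PHP of how its values relate to HTTP status codes. The function name redirectStatusForOptions is hypothetical and not part of Magento. It only mirrors the column description above, and it assumes that an empty value means the target path is served without a redirect.

```php
<?php
// Hypothetical helper (not a Magento API): maps the value of the
// `options` column to the HTTP status code of the resulting redirect.
function redirectStatusForOptions(?string $options): ?int
{
    switch ($options) {
        case 'RP':
            return 301; // permanent redirect
        case 'R':
            return 302; // temporary redirect
        default:
            return null; // null/empty: assumed to mean "no redirect"
    }
}

var_dump(redirectStatusForOptions('RP')); // int(301)
var_dump(redirectStatusForOptions('R'));  // int(302)
var_dump(redirectStatusForOptions(null)); // NULL
```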

DB Rewrite Implementation

Let’s see in detail how Magento applies these rewrites when we open a URL in the browser.

When we open a URL, control passes to Mage_Core_Controller_Varien_Front::dispatch(), which contains this call:

    $this->_getRequestRewriteController()->rewrite();

This creates an object of class ‘core/url_rewrite_request’ and calls its rewrite() function.
This is what the rewrite() function looks like:

    public function rewrite()
    {
        
        if (!$this->_request->isStraight()) {
            $this->_rewriteDb();
        }
        $this->_rewriteConfig();
        return true;
    }

As we can see, rewrite() calls _rewriteDb() and _rewriteConfig(), which handle the database and config rewrites respectively. Note that the database rewrite is skipped when the request is flagged as straight. Let’s look at the database rewrite first.

    protected function _rewriteDb()
    {
        if (null === $this->_rewrite->getStoreId() || false === $this->_rewrite->getStoreId()) {
            $this->_rewrite->setStoreId($this->_app->getStore()->getId());
        }

        $requestCases = $this->_getRequestCases();
        $this->_rewrite->loadByRequestPath($requestCases);

        $fromStore = $this->_request->getQuery('___from_store');
        if (!$this->_rewrite->getId() && $fromStore) {
            $stores = $this->_app->getStores(false, true);
            if (!empty($stores[$fromStore])) {
                /** @var $store Mage_Core_Model_Store */
                $store = $stores[$fromStore];
                $fromStoreId = $store->getId();
            } else {
                return false;
            }

            $this->_rewrite->setStoreId($fromStoreId)->loadByRequestPath($requestCases);
            if (!$this->_rewrite->getId()) {
                return false;
            }

            // Load rewrite by id_path
            $currentStore = $this->_app->getStore();
            $this->_rewrite->setStoreId($currentStore->getId())->loadByIdPath($this->_rewrite->getIdPath());

            $this->_setStoreCodeCookie($currentStore->getCode());

            $targetUrl = $currentStore->getBaseUrl() . $this->_rewrite->getRequestPath();
            $this->_sendRedirectHeaders($targetUrl, true);
        }

        if (!$this->_rewrite->getId()) {
            return false;
        }

        $this->_request->setAlias(Mage_Core_Model_Url_Rewrite::REWRITE_REQUEST_PATH_ALIAS,
            $this->_rewrite->getRequestPath());
        $this->_processRedirectOptions();

        return true;
    }

The first thing that happens here is the creation of request cases. Request cases are an array holding different combinations of the request URL, ordered by priority. For example, if the input URL is “http://yourmagento.com/abc.html?test=1”,
the request cases generated are
1. abc.html?test=1
2. abc.html/?test=1
3. abc.html
4. abc.html/

Similarly, if the request URL is “http://yourmagento.com/abc.html/?test=1”, the request cases are
1. abc.html/?test=1
2. abc.html?test=1
3. abc.html/
4. abc.html

Note how the order of the slash variants changes depending on whether the requested URL ends with a slash.
Here is the code Magento uses to build them:

    protected function _getRequestCases()
    {
        $pathInfo = $this->_request->getPathInfo();
        $requestPath = trim($pathInfo, '/');
        $origSlash = (substr($pathInfo, -1) == '/') ? '/' : '';
        // If there were final slash - add nothing to less priority paths. And vice versa.
        $altSlash = $origSlash ? '' : '/';

        $requestCases = array();
        // Query params in request, matching "path + query" has more priority
        $queryString = $this->_getQueryString();
        if ($queryString) {
            $requestCases[] = $requestPath . $origSlash . '?' . $queryString;
            $requestCases[] = $requestPath . $altSlash . '?' . $queryString;
        }
        $requestCases[] = $requestPath . $origSlash;
        $requestCases[] = $requestPath . $altSlash;
        return $requestCases;
    }
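To try this outside Magento, the same logic can be restated as a standalone function. This is only a sketch: buildRequestCases is a hypothetical name, and the path info and query string are passed in directly instead of being read from the request object.

```php
<?php
// Standalone restatement of the request-case logic shown above.
// $pathInfo is the path part of the URL, $queryString the part after "?".
function buildRequestCases(string $pathInfo, string $queryString = ''): array
{
    $requestPath = trim($pathInfo, '/');
    // Keep the slash style the visitor used as the higher-priority variant.
    $origSlash = (substr($pathInfo, -1) === '/') ? '/' : '';
    $altSlash = $origSlash ? '' : '/';

    $cases = [];
    if ($queryString !== '') {
        // "path + query" variants are tried before the bare path.
        $cases[] = $requestPath . $origSlash . '?' . $queryString;
        $cases[] = $requestPath . $altSlash . '?' . $queryString;
    }
    $cases[] = $requestPath . $origSlash;
    $cases[] = $requestPath . $altSlash;
    return $cases;
}

print_r(buildRequestCases('/abc.html', 'test=1'));
print_r(buildRequestCases('/abc.html/', 'test=1'));
```

The two calls print the two lists of four request cases given earlier, in the same order.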

Once the request cases are generated, Magento checks the database for matches using

    $this->_rewrite->loadByRequestPath($requestCases);

This is implemented in the resource model ‘Mage_Core_Model_Resource_Url_Rewrite’:

    public function loadByRequestPath(Mage_Core_Model_Url_Rewrite $object, $path)
    {
        if (!is_array($path)) {
            $path = array($path);
        }

        $pathBind = array();
        foreach ($path as $key => $url) {
            $pathBind['path' . $key] = $url;
        }
        // Form select
        $adapter = $this->_getReadAdapter();
        $select  = $adapter->select()
            ->from($this->getMainTable())
            ->where('request_path IN (:' . implode(', :', array_flip($pathBind)) . ')')
            ->where('store_id IN(?)', array(Mage_Core_Model_App::ADMIN_STORE_ID, (int)$object->getStoreId()));
        $items = $adapter->fetchAll($select, $pathBind);

        // Go through all found records and choose one with lowest penalty - earlier path in array, concrete store
        $mapPenalty = array_flip(array_values($path)); // we got mapping array(path => index), lower index - better
        $currentPenalty = null;
        $foundItem = null;
        foreach ($items as $item) {
            if (!array_key_exists($item['request_path'], $mapPenalty)) {
                continue;
            }
            $penalty = $mapPenalty[$item['request_path']] << 1 + ($item['store_id'] ? 0 : 1);
            if (!$foundItem || $currentPenalty > $penalty) {
                $foundItem = $item;
                $currentPenalty = $penalty;
                if (!$currentPenalty) {
                    break; // Found best matching item with zero penalty, no reason to continue
                }
            }
        }

        // Set data and finish loading
        if ($foundItem) {
            $object->setData($foundItem);
        }

        // Finish
        $this->unserializeFields($object);
        $this->_afterLoad($object);

        return $this;
    }

This code creates an SQL query with one named placeholder per request case, like

    SELECT `core_url_rewrite`.* FROM `core_url_rewrite` WHERE (request_path IN (:path0, :path1)) AND (store_id IN(0, 1))

and binds it to an array like

Array
(
    [path0] => 
    [path1] => 
)
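As a rough illustration of how the placeholders and the bind array line up, here is a sketch in plain PHP. The function names are hypothetical, the sample paths are taken from the abc.html example earlier, and array_keys() is used here in place of the array_flip() call in the original code.

```php
<?php
// Sketch: build the bind array (placeholder name => request path).
function buildPathBind(array $paths): array
{
    $bind = [];
    foreach (array_values($paths) as $index => $path) {
        $bind['path' . $index] = $path;
    }
    return $bind;
}

// Sketch: build the IN (...) condition from the placeholder names.
function buildInClause(array $bind): string
{
    return 'request_path IN (:' . implode(', :', array_keys($bind)) . ')';
}

$bind = buildPathBind(['abc.html', 'abc.html/']);
echo buildInClause($bind), PHP_EOL; // request_path IN (:path0, :path1)
print_r($bind);
```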

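Next comes the code that calculates a penalty for each row found, using a bit shift. In short, it returns the row with the
“earlier path in array and concrete store”: a match on an earlier request case wins, and a row for the current store is preferred over a row for the admin store (store_id 0).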
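The ranking rule can be sketched in plain PHP as follows. pickBestRewrite is a hypothetical name and the rows are plain arrays standing in for the fetched records. One assumption is labeled in the code: in PHP, `+` binds more tightly than `<<`, so the quoted line parses as index << (1 + flag). The sketch uses explicit parentheses to express the ordering the comment describes, which is a reading of the intent rather than a copy of the core behavior.

```php
<?php
// Sketch of "earlier path in array, then concrete store".
// Rows are arrays with 'request_path' and 'store_id' keys.
function pickBestRewrite(array $rows, array $requestCases): ?array
{
    // path => index in the request cases; a lower index is a better match
    $indexByPath = array_flip(array_values($requestCases));
    $best = null;
    $bestPenalty = null;
    foreach ($rows as $row) {
        if (!array_key_exists($row['request_path'], $indexByPath)) {
            continue;
        }
        // Assumption: path position dominates, admin store (id 0) adds 1.
        $penalty = ($indexByPath[$row['request_path']] << 1)
            + ($row['store_id'] ? 0 : 1);
        if ($best === null || $penalty < $bestPenalty) {
            $best = $row;
            $bestPenalty = $penalty;
        }
    }
    return $best;
}

$cases = ['abc.html', 'abc.html/'];
$rows = [
    ['request_path' => 'abc.html', 'store_id' => 0, 'target_path' => 'admin-scope'],
    ['request_path' => 'abc.html', 'store_id' => 1, 'target_path' => 'store-scope'],
];
$best = pickBestRewrite($rows, $cases);
echo $best['target_path'], PHP_EOL; // store-scope
```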

Once a matching rewrite has been found, Magento applies it, including any redirect, using

    $this->_processRedirectOptions();

The rest of the code in between is largely self-explanatory.

Another thing to note is when Magento creates these URL rewrites.
This is done in ‘Mage_Catalog_Model_Indexer_Url’ by the “reindexAll()” function.

That is how database rewrites work. Let’s now see how config rewrites work.

Configuration Based Rewrite

Here is the code for configuration-based rewrites:

    protected function _rewriteConfig()
    {
        $config = $this->_config->getNode('global/rewrite');
        if (!$config) {
            return false;
        }
        foreach ($config->children() as $rewrite) {
            $from = (string)$rewrite->from;
            $to = (string)$rewrite->to;
            if (empty($from) || empty($to)) {
                continue;
            }
            $from = $this->_processRewriteUrl($from);
            $to   = $this->_processRewriteUrl($to);

            $pathInfo = preg_replace($from, $to, $this->_request->getPathInfo());
            if (isset($rewrite->complete)) {
                $this->_request->setPathInfo($pathInfo);
            } else {
                $this->_request->rewritePathInfo($pathInfo);
            }
        }
        return true;
    }

    protected function _processRewriteUrl($url)
    {
        $startPos = strpos($url, '{');
        if ($startPos !== false) {
            $endPos = strpos($url, '}');
            $routeName = substr($url, $startPos + 1, $endPos - $startPos - 1);
            $router = $this->_getRouterByRoute($routeName);
            if ($router) {
                $frontName = $router->getFrontNameByRoute($routeName);
                $url = str_replace('{' . $routeName . '}', $frontName, $url);
            }
        }
        return $url;
    }

This is the XML we write in our module’s config.xml file:

    <global>
        <rewrite>
            <designer_url>
                <from><![CDATA[#^/author/id/#]]></from>
                <to><![CDATA[/designer/index/index/id/]]></to>
                <complete>1</complete>
            </designer_url>
        </rewrite>
    </global>

The code above is fairly self-explanatory. One thing to note is that Magento uses ‘preg_replace’ to match the URL, so the ‘from’ value is a regular expression (delimiters included) and can use any pattern syntax that ‘preg_replace’ supports.
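The effect of such a rule can be tried in plain PHP. This is a sketch under assumptions: applyConfigRewrite and resolvePlaceholder are hypothetical names, and the $frontNames array is a made-up stand-in for the router lookup that maps a route name in braces to its front name.

```php
<?php
// Sketch: replace a "{routeName}" placeholder with its front name.
// $frontNames (route => front name) stands in for the router lookup.
function resolvePlaceholder(string $url, array $frontNames): string
{
    $start = strpos($url, '{');
    $end = strpos($url, '}');
    if ($start === false || $end === false || $end < $start) {
        return $url;
    }
    $routeName = substr($url, $start + 1, $end - $start - 1);
    if (isset($frontNames[$routeName])) {
        $url = str_replace('{' . $routeName . '}', $frontNames[$routeName], $url);
    }
    return $url;
}

// Sketch: apply one <from>/<to> pair to a path with preg_replace.
function applyConfigRewrite(string $pathInfo, string $from, string $to, array $frontNames = []): ?string
{
    $from = resolvePlaceholder($from, $frontNames);
    $to = resolvePlaceholder($to, $frontNames);
    return preg_replace($from, $to, $pathInfo);
}

// The rule from the XML above:
echo applyConfigRewrite('/author/id/5', '#^/author/id/#', '/designer/index/index/id/'), PHP_EOL;
// /designer/index/index/id/5

// A path that does not match is returned unchanged:
echo applyConfigRewrite('/other/page', '#^/author/id/#', '/designer/index/index/id/'), PHP_EOL;
// /other/page
```

Because the pattern is anchored with ^, only paths that start with /author/id/ are rewritten, and the rest of the path (here the id value) is carried over.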