INDEXES: REASONS NOT TO REFRESH

A presentation by John Hughes on indexes and why refreshing them is not best practice or, as he put it, “Stop refreshing the @#$%&! indexes”.

The presentation takes an irreverent look at indexes, laying out the aims, what indexes are, their impact on performance and what should be done. We also have knights, kings, dancing and dragons (GIFs are as much a pastime for Hughes as Magento).

Aims:

Give you a solid understanding of Magento’s indexes

Outline the impact indexes can have on store performance

Give you insight into how data is indexed, to help with debugging common issues

An index is responsible for collecting, parsing and storing data to facilitate fast and accurate information retrieval. Hughes describes an index as a form of caching, but one where the data is transformed during the process. Indexes prevent the server from repeatedly making complex calculations, in the form of database queries, to retrieve frequently needed information. The larger your site, the greater the performance impact of going without indexes.
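To make the "caching with transformation" idea concrete, here is a minimal sketch in Python. It is not Magento code: the row layout, the function names and the category example are all assumptions made for illustration. It contrasts filtering the raw rows on every request with building a transformed lookup once and reading from it.

```python
from collections import defaultdict

# Hypothetical raw catalogue rows: (product_id, category_id, is_enabled).
RAW_ROWS = [
    (1, 10, True),
    (2, 10, False),
    (3, 10, True),
    (3, 20, True),
]


def products_in_category_slow(category_id):
    """Answer the question by scanning and filtering the raw rows every time."""
    return sorted(
        pid for pid, cid, enabled in RAW_ROWS
        if cid == category_id and enabled
    )


def build_category_index(rows):
    """Do the filtering and grouping once and keep the transformed result."""
    index = defaultdict(list)
    for pid, cid, enabled in rows:
        if enabled:
            index[cid].append(pid)
    return {cid: sorted(pids) for cid, pids in index.items()}


# Built once (the "index"); every later read is a plain lookup.
CATEGORY_INDEX = build_category_index(RAW_ROWS)


def products_in_category_fast(category_id):
    """Read the precomputed answer; no per-request calculation."""
    return CATEGORY_INDEX.get(category_id, [])
```

Both functions return the same answer. The difference is where the work happens: the slow version repeats it on every call, while the indexed version pays once and must be rebuilt when the underlying rows change.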

See below for John Hughes's simple terminology.

We then learn what Magento indexes:

Product prices

Product inventory (stock) status

Product attribute data

Product category associations, and more

Take pricing as an example:

The final price displayed to a customer on the frontend can be impacted by:

Base price

Special price (and from / to dates)

Tiered / customer group pricing

Catalogue price rules

All of which can be per website!

Or as Hughes puts it:

Then consider the complex product types:

Configurable / grouped: min / max price (e.g. cheapest / most expensive product)

Bundle: min / max price (e.g. cheapest / most expensive combination of items), plus dynamic / fixed pricing of all child bundle items
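The pricing inputs and product types above can be sketched roughly as follows. This is a simplified illustration in Python, not Magento's actual price logic: the `PriceData` class, every function name, the "lowest applicable price wins" rule and the "exactly one item per bundle option" rule are assumptions made to keep the example short.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class PriceData:
    """Hypothetical container for the pricing inputs of one simple product."""
    base_price: float
    special_price: Optional[float] = None
    special_from: Optional[date] = None
    special_to: Optional[date] = None
    # Customer group id -> price for that group (assumed shape).
    group_prices: Dict[int, float] = field(default_factory=dict)
    # Percentage discounts standing in for catalogue price rules (assumed shape).
    rule_discounts_percent: List[float] = field(default_factory=list)


def special_price_active(p: PriceData, today: date) -> bool:
    """A special price only counts inside its optional from / to window."""
    if p.special_price is None:
        return False
    if p.special_from is not None and today < p.special_from:
        return False
    if p.special_to is not None and today > p.special_to:
        return False
    return True


def final_price(p: PriceData, customer_group: int, today: date) -> float:
    """Assumption: the lowest applicable candidate price is the final price."""
    candidates = [p.base_price]
    if special_price_active(p, today):
        candidates.append(p.special_price)
    if customer_group in p.group_prices:
        candidates.append(p.group_prices[customer_group])
    for pct in p.rule_discounts_percent:
        candidates.append(round(p.base_price * (1 - pct / 100), 2))
    return min(candidates)


def configurable_price_range(child_prices: List[float]) -> Tuple[float, float]:
    """Min / max across the children of a configurable or grouped product."""
    return min(child_prices), max(child_prices)


def bundle_price_range(options: Dict[str, List[float]]) -> Tuple[float, float]:
    """Cheapest and most expensive combination, assuming one item per option."""
    cheapest = sum(min(prices) for prices in options.values())
    dearest = sum(max(prices) for prices in options.values())
    return cheapest, dearest
```

Even in this cut-down form, one displayed price depends on dates, customer group and rules, and a composite product needs that calculation for every child. Repeat it for every product, customer group and website and it becomes clear why the result is worth calculating once and storing.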

Hughes goes on to look at the two indexing modes: Update on Save and Update on Schedule.

Here you see how, with Update on Save, saving just one product clears the full page cache for every single product. And if all products are affected, then all categories are too, so the full page cache is also cleared for every single category!

Having established that this is an incredibly bad way of doing things, we now look at Update on Schedule, which is essentially a cron task (running every minute by default).

This feature is available in Magento 1 Commerce Edition 1.13.0 and newer, and in the Open Source Edition of Magento 2. Indexes are run on a schedule rather than immediately after each save. Things get even better: only the data that has changed is indexed, and only pages relevant to that record are removed from the full page cache.
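The difference between the two modes can be modelled with a toy simulation. The Python below is an illustrative sketch only: the class and method names are hypothetical, and it mirrors the behaviour described in the talk rather than Magento's internals.

```python
class IndexerSimulation:
    """Toy model of the two indexing modes; not Magento code."""

    def __init__(self, product_ids):
        self.product_ids = sorted(product_ids)
        self.pending = set()     # changed ids waiting for the next scheduled run
        self.pages_cleared = []  # product pages removed from the full page cache

    def save_with_update_on_save(self, product_id):
        # As described in the talk: one save clears the cached page of
        # every product, whichever product was actually saved.
        self.pages_cleared.extend(self.product_ids)

    def save_with_update_on_schedule(self, product_id):
        # Saving only records the change; nothing is indexed or cleared yet.
        self.pending.add(product_id)

    def run_scheduled_job(self):
        # Stands in for the cron task: handle only the ids that changed
        # and clear only the pages for those records.
        changed = sorted(self.pending)
        self.pending.clear()
        self.pages_cleared.extend(changed)
        return changed
```

With 1,000 products, a single call to `save_with_update_on_save` records 1,000 cleared pages. Saving products 42, 42 again and 7 in schedule mode clears nothing until `run_scheduled_job` runs, and then only the pages for 7 and 42.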

From here Hughes goes into more detail on the cascading effects of refreshing indexes, what you need to look at and best practice going forward. And dragons: don’t forget the dragons.

For the full presentation CLICK HERE