A presentation by John Hughes on indexes and why refreshing them is not best practice or, as he put it, “Stop refreshing the @#$%&! indexes”.
The presentation takes an irreverent look at indexes, laying out the aims, what indexes are, their impact on performance and what should be done about them. We also have knights, kings, dancing and dragons (GIFs are as much a pastime for Hughes as Magento).
Aims:
Give you a solid understanding of Magento’s indexes
Outline the impact indexes can have on store performance
Show you how to follow the process by which data is indexed, to help with debugging common issues.
An index is responsible for collecting, parsing and storing data to facilitate fast and accurate information retrieval. Hughes describes an index as a form of caching, but one where the data is transformed during the process. Indexes prevent the server from repeatedly performing complex calculations, in the form of database queries, to retrieve frequently needed information. The larger your site, the greater the performance impact of going without indexes.
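To make the "cache with a transformation" idea concrete, here is a minimal sketch in Python. It is not Magento code: the product rows, category IDs and function names are all made up for illustration. It only shows the general principle of reshaping data once so that later lookups are cheap.

```python
# Illustrative only: the data and names below are hypothetical,
# not Magento's actual schema or API.
from collections import defaultdict

# "Source of truth" rows, as they might look after an admin save.
products = [
    {"id": 1, "name": "Shield", "category_ids": [10, 20]},
    {"id": 2, "name": "Sword", "category_ids": [10]},
    {"id": 3, "name": "Crown", "category_ids": [20]},
]


def products_in_category_slow(category_id):
    """Without an index: scan every product on every request."""
    return [p["id"] for p in products if category_id in p["category_ids"]]


def build_category_index(rows):
    """The 'index': transform the data once into a lookup-friendly shape."""
    index = defaultdict(list)
    for row in rows:
        for category_id in row["category_ids"]:
            index[category_id].append(row["id"])
    return dict(index)


category_index = build_category_index(products)


def products_in_category_fast(category_id):
    """With an index: a single dictionary lookup."""
    return category_index.get(category_id, [])
```

Both functions return the same answer; the difference is that the slow version redoes the work on every call, while the fast version pays the cost once when the index is built and must be rebuilt (or updated) when the source data changes.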
See below for John Hughes’s simple terminology.
We then learn what Magento indexes:
Product prices
Product inventory (stock) status
Product attribute data
Product category associations and more….
Take pricing as an example:
The final price displayed to a customer on the frontend can be impacted by:
Base Price
Special price (and from / to dates)
Tiered / customer group pricing
Catalogue price rules
All of which can be per website!
Then consider the complex product types:
Configurable / grouped
Min / max price (e.g. cheapest / most expensive product)
Bundle
Min / max price (e.g. cheapest / most expensive items combination)
Dynamic / fixed pricing of all child bundle items
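The pricing inputs listed above can be sketched roughly as follows. This is a deliberately simplified model, not Magento's price indexer: the field names, the "lowest applicable price wins" rule and the percentage-style catalogue rule are all assumptions made for the example, and the per-website dimension is left out entirely.

```python
# Simplified sketch; field names and the "lowest price wins" rule are
# assumptions for illustration, not Magento's actual implementation.
from datetime import date


def final_price(product, customer_group, qty, today, rule_discount_pct=0.0):
    """Return the lowest applicable price for a simple product."""
    candidates = [product["base_price"]]

    # Special price only counts inside its from / to date window.
    special = product.get("special")
    if special and special["from"] <= today <= special["to"]:
        candidates.append(special["price"])

    # Tiered / customer group pricing.
    for tier in product.get("tiers", []):
        if tier["group"] == customer_group and qty >= tier["min_qty"]:
            candidates.append(tier["price"])

    # Catalogue price rule, modelled here as a percentage off the base price.
    if rule_discount_pct:
        discounted = product["base_price"] * (1 - rule_discount_pct / 100)
        candidates.append(round(discounted, 2))

    return min(candidates)


def configurable_price_range(children, **context):
    """Min / max price across the child products of a configurable."""
    prices = [final_price(child, **context) for child in children]
    return min(prices), max(prices)


# Hypothetical child products of one configurable product.
small = {
    "base_price": 100.0,
    "special": {"price": 80.0, "from": date(2024, 1, 1), "to": date(2024, 1, 31)},
    "tiers": [{"group": "wholesale", "min_qty": 10, "price": 70.0}],
}
large = {"base_price": 120.0}

price_range = configurable_price_range(
    [small, large], customer_group="general", qty=1, today=date(2024, 1, 15)
)
```

Even in this toy version, answering "what is the cheapest price for this configurable product?" means evaluating every child against every rule. Doing that on each page load is exactly the work an index is meant to avoid.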
Hughes goes on to look at the two indexing modes: Update on Save and Update on Schedule.
Here you see how saving just one product with Update on Save clears the full page cache for every single product. And if all products are affected then all categories are too, so the full page cache is also cleared for every single category!
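The cascade described above can be modelled in a few lines. This is a toy sketch of the behaviour as presented in the talk, not Magento's code: the catalogue, the cache keys and the function names are hypothetical.

```python
# Toy model of the cascade described in the talk; all names are hypothetical.

# Product id -> ids of the categories it belongs to.
catalogue = {1: [10], 2: [10, 20], 3: [20]}


def warm_cache():
    """Full page cache modelled as a set of cached page keys."""
    pages = {f"product:{pid}" for pid in catalogue}
    pages |= {f"category:{cid}" for cids in catalogue.values() for cid in cids}
    return pages


def save_product_update_on_save(cache, saved_product_id):
    """Saving one product marks every product as affected, so every
    product page and every category page is evicted."""
    affected_products = set(catalogue)  # everything, not just saved_product_id
    affected_categories = {
        cid for pid in affected_products for cid in catalogue[pid]
    }
    evicted = {f"product:{pid}" for pid in affected_products}
    evicted |= {f"category:{cid}" for cid in affected_categories}
    return cache - evicted, evicted


remaining, evicted = save_product_update_on_save(warm_cache(), saved_product_id=1)
```

In this model `remaining` ends up empty after a single save: one edit to one product leaves the whole store serving uncached pages.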
Having established that this is an incredibly bad way of doing things, we now look at Update on Schedule, which is essentially a cron task (running every minute by default).
This feature is available in Magento 1 Commerce Edition 1.13.0 and newer, and in Open Source Edition for M2. Indexes are run on a schedule, so no longer immediately after save. Things get even better, as only the data that has been changed will be indexed and only pages relevant to that record are removed from the full page cache.
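A rough sketch of the scheduled approach: saves only record which IDs changed, and a periodic job processes that list. Again, this is an illustrative model rather than Magento's implementation. The `changelog` list, the cache keys and the assumption that a product's own category pages count as "relevant" are all mine.

```python
# Toy model of Update on Schedule; names and cache keys are hypothetical.

# Product id -> ids of the categories it belongs to.
catalogue = {1: [10], 2: [10, 20], 3: [20]}

# Ids of changed products, waiting for the next scheduled run.
changelog = []


def save_product(product_id):
    """Saving only records the change; nothing is reindexed yet."""
    changelog.append(product_id)


def cron_run(cache):
    """Process only the changed ids and evict only their related pages."""
    changed = set(changelog)  # duplicates collapse: two saves, one reindex
    changelog.clear()
    evicted = {f"product:{pid}" for pid in changed}
    # Assumption: the pages "relevant" to a product include its categories.
    evicted |= {f"category:{cid}" for pid in changed for cid in catalogue[pid]}
    return cache - evicted, changed


cache = {"product:1", "product:2", "product:3", "category:10", "category:20"}
save_product(1)
save_product(1)  # saved twice before the cron job runs
remaining, changed = cron_run(cache)
```

Here only product 1 and its category page are evicted; the pages for products 2 and 3 and for category 20 stay cached, and the two saves before the cron run result in a single pass over product 1.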
From here Hughes goes into more detail on the cascading effects of refreshing indexes, what you need to look at and best practice going forward. And dragons, don’t forget the dragons.
For the full presentation CLICK HERE