Releases: Smile-SA/elasticsuite
2.10.19
2.11.4.3
Main new feature
Tools to check and fix invalid behavioral data
When the Elasticsuite tracker is enabled on a site with a custom theme or on a PWA frontend, data collection may not be performed correctly. In particular, some events may be registered with an undefined or 'null' user session identifier (tracker session.uid parameter).
In turn, those events, although coming from different visitors (tracker session.vid parameter), would still be collected as events of a single navigation session spanning several weeks or months, generating a document in the behavioral data session index with several hundred or thousands of search terms, products added to cart, products ordered, etc.
In the long run, this slows down requests performed on the behavioral data indices, to the point of generating 429 errors on your Elasticsearch/OpenSearch server and preventing you from using, for instance, the Elasticsuite Search Usage analytics dashboard.
This release contains both a fix to prevent the collection of those ill-formed events and two Magento commands to be able to check and fix your already indexed behavioral data.
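The failure mode described above can be illustrated with a minimal Python sketch. It is not Elasticsuite's actual code: the event layout, the list of invalid identifier values and the helper names (is_valid_event, group_by_session) are assumptions made for illustration only.

```python
from collections import defaultdict

# Identifier values treated as invalid in this sketch (an assumption).
INVALID_UIDS = (None, "", "null", "undefined")


def is_valid_event(event):
    """Return True when the event carries a usable session.uid."""
    uid = event.get("session", {}).get("uid")
    return uid not in INVALID_UIDS


def group_by_session(events):
    """Naively group events by session.uid, as a session index would."""
    sessions = defaultdict(list)
    for event in events:
        uid = event.get("session", {}).get("uid")
        # A missing uid lands under the same "null" key as the literal string.
        key = "null" if uid is None else str(uid)
        sessions[key].append(event)
    return dict(sessions)


events = [
    {"session": {"uid": "s1", "vid": "v1"}, "search": "grill"},
    {"session": {"uid": "null", "vid": "v2"}, "search": "hammer"},
    {"session": {"vid": "v3"}, "search": "screen"},
]

# Without filtering, two different visitors share one "null" session.
merged = group_by_session(events)
# Rejecting ill-formed events up front keeps only genuine sessions.
clean = group_by_session([e for e in events if is_valid_event(e)])
```

In this sketch, merged["null"] mixes events from visitors v2 and v3, which is how a single session document can keep growing over months, while clean only contains the session s1.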
Check the presence of invalid behavioral data
You can run the elasticsuite:tracker:check-data Magento command to scan your behavioral indices for invalid data.
It will scan the behavioral data indices of all the active Magento store views and report if there are any errors.
If there are errors for a given store, they will be reported, as seen below:
Fix the invalid behavioral data
If the elasticsuite:tracker:check-data command reports errors, you can run the elasticsuite:tracker:fix-data command to fix the invalid data in your behavioral indices.
It will report what has been fixed.
For instance, on the same example as seen above, here are the results:
If there was nothing to fix, the command has no effect:
Future
As with the system message that can pop up in your Magento admin interface when you have "Ghost Indices", we might, in a future release, make sure that admin users are made aware of existing invalid behavioral data without having to launch the elasticsuite:tracker:check-data Magento command.
📦 Features
- [Configuration] Added configuration for ssl verification by @gabrielLumao in #3127
- [Tracker] Tools to check and fix invalid behavioral data by @rbayet in #3122
- [Tracker] Prevent partial/invalid events to be indexed by @rbayet in #3111
🐛 Fixes
- Fixes #3137 [Analytics] PHP8 compatibility popular search terms w/ re… by @rbayet in #3138
- Fixes #3119 [GraphQL] allow drill-down in categories agg. of products… by @rbayet in #3120
- Fixes #3123 [Search Merchandizing] Show back hidden products in preview by @rbayet in #3124
- Fixes #3134 [Catalog] Typo in decimal layered navigation filter by @rbayet in #3141
- Fixes #3132 [Thesaurus] limit thesaurus rewriting loops by @rbayet in #3133
- Fixes #3132 [Thesaurus] Rewriting loops avoidance unit tests by @rbayet in #3145
- Fixes #2913 [Tracker] undefined array key date for elasticsuite tracker event index by @rbayet in #3112
- [Tracker] Remove 'domain' parameter from tracker by @romainruaud in 7bc0b3c
New Contributors
- @gabrielLumao made their first contribution in #3127
Full Changelog: 2.11.4.2...2.11.4.3
2.10.18.3
Main new feature
Tools to check and fix invalid behavioral data
When the Elasticsuite tracker is enabled on a site with a custom theme or on a PWA frontend, data collection may not be performed correctly. In particular, some events may be registered with an undefined or 'null' user session identifier (tracker session.uid parameter).
In turn, those events, although coming from different visitors (tracker session.vid parameter), would still be collected as events of a single navigation session spanning several weeks or months, generating a document in the behavioral data session index with several hundred or thousands of search terms, products added to cart, products ordered, etc.
In the long run, this slows down requests performed on the behavioral data indices, to the point of generating 429 errors on your Elasticsearch/OpenSearch server and preventing you from using, for instance, the Elasticsuite Search Usage analytics dashboard.
This release contains both a fix to prevent the collection of those ill-formed events and two Magento commands to be able to check and fix your already indexed behavioral data.
Check the presence of invalid behavioral data
You can run the elasticsuite:tracker:check-data Magento command to scan your behavioral indices for invalid data.
It will scan the behavioral data indices of all the active Magento store views and report if there are any errors.
If there are errors for a given store, they will be reported, as seen below:
Fix the invalid behavioral data
If the elasticsuite:tracker:check-data command reports errors, you can run the elasticsuite:tracker:fix-data command to fix the invalid data in your behavioral indices.
It will report what has been fixed.
For instance, on the same example as seen above, here are the results:
If there was nothing to fix, the command has no effect:
Future
As with the system message that can pop up in your Magento admin interface when you have "Ghost Indices", we might, in a future release, make sure that admin users are made aware of existing invalid behavioral data without having to launch the elasticsuite:tracker:check-data Magento command.
📦 Features
- [Tracker] Tools to check and fix invalid behavioral data by @rbayet in #3122
- [Tracker] Prevent partial/invalid events to be indexed by @rbayet in #3111
🐛 Fixes
- Fixes #3137 [Analytics] PHP8 compatibility popular search terms w/ re… by @rbayet in #3138
- Fixes #3119 [GraphQL] allow drill-down in categories agg. of products… by @rbayet in #3120
- Fixes #3123 [Search Merchandizing] Show back hidden products in preview by @rbayet in #3124
- Fixes #3132 [Thesaurus] limit thesaurus rewriting loops by @rbayet in #3133
- Fixes #3132 [Thesaurus] Rewriting loops avoidance unit tests by @rbayet in #3145
- Fixes #2913 [Tracker] undefined array key date for elasticsuite tracker event index by @rbayet in #3112
- [Tracker] Remove 'domain' parameter from tracker by @romainruaud in 7bc0b3c
Full Changelog: 2.10.18.2...2.10.18.3
2.11.4.3-rc1
What's Changed
- [Tracker] Prevent partial/invalid events to be indexed by @rbayet in #3111
- Fix #2913, undefined array key date for elasticsuite tracker event index by @rbayet in #3112
- Fixes #3119 [GraphQL] allow drill-down in categories agg. of products… by @rbayet in #3120
- Fixes #3123 [Search Merchandizing] Show back hidden products in preview by @rbayet in #3124
- Added configuration for ssl verification by @gabrielLumao in #3127
- Fixes #3137 [Analytics] PHP8 compatibility popular search terms w/ re… by @rbayet in #3138
- Fix 3132 limit thesaurus rewriting loops by @rbayet in #3133
New Contributors
- @gabrielLumao made their first contribution in #3127
Full Changelog: 2.11.4.2...2.11.4.3-rc1
2.10.18.3-rc1
What's Changed
- [Tracker] Prevent partial/invalid events to be indexed by @rbayet in #3111
- Fix #2913, undefined array key date for elasticsuite tracker event index by @rbayet in #3112
- Fixes #3119 [GraphQL] allow drill-down in categories agg. of products… by @rbayet in #3120
- Fixes #3123 [Search Merchandizing] Show back hidden products in preview by @rbayet in #3124
- Fixes #3137 [Analytics] PHP8 compatibility popular search terms w/ re… by @rbayet in #3138
- Fix 3132 limit thesaurus rewriting loops by @rbayet in #3133
Full Changelog: 2.10.18.2...2.10.18.3-rc1
2.11.4.2
Main new features
Ignore manual positions for out of stock products
When you display out of stock products in the frontend, it can be annoying that a product manually positioned in a category or for a search query is still displayed at the configured position when it is no longer in stock.
A new setting named "Ignore manual positions of out of stock products" available in "Stores > Configuration > Elasticsuite > Catalog Search > Catalog Search Configuration" can now prevent this situation.
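As a rough illustration of what this setting changes, here is a simplified Python sketch. It is not the Elasticsuite implementation: the product dictionaries, the score field and the order_products helper are hypothetical, and the sketch assumes that a product losing its manual position simply falls back to the natural (score-based) ordering.

```python
def order_products(products, manual_positions, ignore_out_of_stock_positions):
    """Return product ids, manually positioned products first, then by score."""

    def sort_key(product):
        position = manual_positions.get(product["id"])
        # With the setting enabled, an out of stock product loses its position.
        if ignore_out_of_stock_positions and not product["in_stock"]:
            position = None
        if position is not None:
            return (0, position, 0.0)
        # Products without a position are ranked by descending score.
        return (1, 0, -product["score"])

    return [product["id"] for product in sorted(products, key=sort_key)]


products = [
    {"id": "A", "in_stock": False, "score": 1.0},
    {"id": "B", "in_stock": True, "score": 3.0},
    {"id": "C", "in_stock": True, "score": 2.0},
]
manual_positions = {"A": 1, "C": 2}

setting_off = order_products(products, manual_positions, False)
setting_on = order_products(products, manual_positions, True)
```

With the setting off, the out of stock product A keeps the first slot; with the setting on, it is ranked like any unpositioned product and C becomes the only pinned product.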
Category products preview area in Category edit screen
Category products in the frontend
Configure the Elasticsuite Tracker to use the REST API endpoint
By default, the Elasticsuite tracking script which collects anonymized behavioral data (search queries, product views, product sales) uses an invisible pixel to push its data to Magento.
It can happen that frontend caches block image URLs with parameters for safety reasons, preventing the collection of tracking data.
While our FAQ contains a workaround for Fastly, it is now possible to easily make the Elasticsuite tracker use the dedicated REST API endpoint which was developed several releases ago for headless themes integration.
The new setting is named "Use the API to collect data" and is available in "Stores > Configuration > Elasticsuite > Tracking > Global Configuration".
Alphabetical sort of attribute options in the rule engine
When you have a very long list of options for a select or multiselect attribute, for example a "Brand" attribute with more than 100 values, it can sometimes be difficult to pick the desired option in the rule engine of the Search Optimizers or the Virtual Categories. This is because, by default, the options are listed in the order they were created.
A new setting named "Alphabetical sorting of attribute options in the rule engine" available in "Stores > Configuration > Elasticsuite > Catalog Search > Catalog Rules Configuration" allows you to force an alphabetical sorting of the options.
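The difference between the two orderings can be sketched in a few lines of Python. The option structure and the sort_options helper are hypothetical, and the case-insensitive comparison is an assumption of this sketch rather than a documented behavior of the setting.

```python
def sort_options(options, alphabetical):
    """Return attribute options in creation order or sorted by label."""
    if not alphabetical:
        # Default behavior: keep the order in which options were created.
        return list(options)
    # Case-insensitive sort on the label (an assumption of this sketch).
    return sorted(options, key=lambda option: option["label"].casefold())


# Options listed in the order they were created.
brand_options = [
    {"value": 12, "label": "Zeta"},
    {"value": 13, "label": "acme"},
    {"value": 14, "label": "Borealis"},
]

default_labels = [o["label"] for o in sort_options(brand_options, False)]
sorted_labels = [o["label"] for o in sort_options(brand_options, True)]
```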
📦 Features
- [Optimizers, Virtual Categories] Allow forcing alphabetical sorting of product attributes options by @rbayet in #3067
- [Optimizers, Virtual Categories] Ability to create rules based on stock qty and searchable content by @rbayet in #3089
- [Categories merchandising] Feature #3099 Ignore manual positions for out of stock products by @rbayet in #3100
- [Thesaurus] Feature #3063, allow mass enabling / disabling of Thesaurus by @vahonc in #3091
- [Tracker] Officially adding REST API option to tracking script by @rbayet in #3105
- [Analysis] Adding 'untouched' normalizer to 'untouched' fields by @rbayet in #3080
🐛 Fixes
- [Optimizers] Support new spellchecker settings in optimizers preview by @rbayet in #3088
- [Optimizers, Virtual Categories] Removing unsafe/unstable 'search' special attribute by @rbayet in #3097
- [German Translation] fix typo by @Morgy93 in #3102
- [Catalog Navigation] In sync with MSI 1.2.6, remove undeclared plugin by @Bashev in #3066
🔨 Quality enhancements
- [Quality] Add REST schema generation step by @rbayet in #3071
- [Quality] Add specific PHPStan workflow by @rbayet in #3077
- [Quality] Fixing unit test namespace by @rbayet in #3083
- [Quality] #3061 elasticsuite phpstan level zero errors 2.10 fix by @rbayet in #3072
- [Quality] Fixing specific PHPStan workflow sequencing by @rbayet in #3078
- [Quality] Fixing/simplifying PHPStan workflow by @rbayet in #3084
- [Quality] Fixing credentials settings by @rbayet in #3085
- [Quality] Spellchecker constructor settings unit tests by @rbayet in #3093
- [Quality] Accurate coverage of spellchecker request unit tests by @rbayet in #3095
Full Changelog: 2.11.4.1...2.11.4.2
2.10.18.2
Main new features
Ignore manual positions for out of stock products
When you display out of stock products in the frontend, it can be annoying that a product manually positioned in a category or for a search query is still displayed at the configured position when it is no longer in stock.
A new setting named "Ignore manual positions of out of stock products" available in "Stores > Configuration > Elasticsuite > Catalog Search > Catalog Search Configuration" can now prevent this situation.
Category products preview area in Category edit screen
Category products in the frontend
Configure the Elasticsuite Tracker to use the REST API endpoint
By default, the Elasticsuite tracking script which collects anonymized behavioral data (search queries, product views, product sales) uses an invisible pixel to push its data to Magento.
It can happen that frontend caches block image URLs with parameters for safety reasons, preventing the collection of tracking data.
While our FAQ contains a workaround for Fastly, it is now possible to easily make the Elasticsuite tracker use the dedicated REST API endpoint which was developed several releases ago for headless themes integration.
The new setting is named "Use the API to collect data" and is available in "Stores > Configuration > Elasticsuite > Tracking > Global Configuration".
Alphabetical sort of attribute options in the rule engine
When you have a very long list of options for a select or multiselect attribute, for example a "Brand" attribute with more than 100 values, it can sometimes be difficult to pick the desired option in the rule engine of the Search Optimizers or the Virtual Categories. This is because, by default, the options are listed in the order they were created.
A new setting named "Alphabetical sorting of attribute options in the rule engine" available in "Stores > Configuration > Elasticsuite > Catalog Search > Catalog Rules Configuration" allows you to force an alphabetical sorting of the options.
📦 Features
- [Optimizers, Virtual Categories] Allow forcing alphabetical sorting of product attributes options by @rbayet in #3067
- [Optimizers, Virtual Categories] Ability to create rules based on stock qty and searchable content by @rbayet in #3089
- [Categories merchandising] Feature #3099 Ignore manual positions for out of stock products by @rbayet in #3100
- [Thesaurus] Feature #3063, allow mass enabling / disabling of Thesaurus by @vahonc in #3091
- [Tracker] Officially adding REST API option to tracking script by @rbayet in #3105
- [Analysis] Adding 'untouched' normalizer to 'untouched' fields by @rbayet in #3080
🐛 Fixes
- [Optimizers] Support new spellchecker settings in optimizers preview by @rbayet in #3088
- [Optimizers, Virtual Categories] Removing unsafe/unstable 'search' special attribute by @rbayet in #3097
🔨 Quality enhancements
- [Quality] Add REST schema generation step by @rbayet in #3071
- [Quality] Add specific PHPStan workflow by @rbayet in #3077
- [Quality] Fixing unit test namespace by @rbayet in #3083
- [Quality] #3061 elasticsuite phpstan level zero errors 2.10 fix by @rbayet in #3072
- [Quality] Fixing specific PHPStan workflow sequencing by @rbayet in #3078
- [Quality] Fixing/simplifying PHPStan workflow by @rbayet in #3084
- [Quality] Fixing credentials settings by @rbayet in #3085
- [Quality] Spellchecker constructor settings unit tests by @rbayet in #3093
- [Quality] Accurate coverage of spellchecker request unit tests by @rbayet in #3095
Full Changelog: 2.10.18.1...2.10.18.2
2.11.4.1
📦 Feature
[Experimental] Ability to trigger exact matching on ngrams / partial words
Release 2.11.3 introduced the ability to select standard_edge_ngram as search analyzer for product attributes.
By default, though, all the ngrams generated are invisible to the pre-request analysis step which determines if Elasticsuite should perform a fuzzy or an exact matching query.
So if a user searches for a partial word, for instance "scre" for "screen" or "screwdriver":
- if "scre" is present exactly in any of the searchable attributes (for example as part of a sku "SCRE001AB")
  - then an exact matching query will be performed
  - and it's likely products with standard_edge_ngram analyzed name containing "screen" or "screwdriver" will be displayed at the top of the search results
- if that is not the case
  - then a fuzzy query will be performed
  - and products with standard_edge_ngram analyzed name containing "screen" or "screwdriver" will compete with
    - products containing "sure" or "sore" or any valid fuzzy variation of "scre"
    - products containing a word whose phonetic analysis matches "scre", for instance "secure"
The new setting Elasticsuite > Search Relevance > Spellchecking configuration > Term Vectors Configuration > [Experimental] Use edge ngram analyzer in term vectors, when set to "Yes", allows you
- to enable the detection of ngrams in the pre-request analysis step to ensure exact matching
- even if the partial word searched by the user is only contained in a standard_edge_ngram analyzed product name.
It is also recommended to set to "Yes" the other experimental setting located above it, named "[Experimental] Use all tokens from term vectors", especially if you previously set to "Yes" the experimental setting "[Experimental] Use reference analyzer in term vectors", also located in that section.
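The effect of making edge ngrams visible to the pre-request analysis step can be sketched in Python. This is a deliberately simplified model, not the Elasticsuite spellchecker: the vocabulary lookup stands in for the term vectors analysis, and the helper names and the min_gram/max_gram values are assumptions.

```python
def edge_ngrams(word, min_gram=1, max_gram=20):
    """Return the leading substrings of a word, as an edge ngram filter would."""
    word = word.lower()
    return {word[:size] for size in range(min_gram, min(len(word), max_gram) + 1)}


def build_vocabulary(indexed_values, use_edge_ngrams):
    """Collect the tokens the pre-request analysis step is able to see."""
    vocabulary = set()
    for value in indexed_values:
        for token in value.lower().split():
            vocabulary.add(token)
            if use_edge_ngrams:
                vocabulary |= edge_ngrams(token)
    return vocabulary


def choose_query_type(search_term, vocabulary):
    """Pick an exact query when the term is known, a fuzzy query otherwise."""
    return "exact" if search_term.lower() in vocabulary else "fuzzy"


product_names = ["Folding screen", "Cordless screwdriver"]

without_ngrams = choose_query_type("scre", build_vocabulary(product_names, False))
with_ngrams = choose_query_type("scre", build_vocabulary(product_names, True))
```

In this model, "scre" is unknown when only whole words are visible, so a fuzzy query is chosen; once the edge ngrams are part of the vocabulary, the same search triggers an exact query.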
What's Changed
- [Spellcheck] Experimental settings for edge ngram exact matching by @rbayet in #3056
- [Spellchecker] Re-enable spelling type cache by @rbayet in #3057
Full Changelog: 2.11.4...2.11.4.1
2.10.18.1
📦 Feature
[Experimental] Ability to trigger exact matching on ngrams / partial words
Release 2.10.17 introduced the ability to select standard_edge_ngram as search analyzer for product attributes.
By default, though, all the ngrams generated are invisible to the pre-request analysis step which determines if Elasticsuite should perform a fuzzy or an exact matching query.
So if a user searches for a partial word, for instance "scre" for "screen" or "screwdriver":
- if "scre" is present exactly in any of the searchable attributes (for example as part of a sku "SCRE001AB")
  - then an exact matching query will be performed
  - and it's likely products with standard_edge_ngram analyzed name containing "screen" or "screwdriver" will be displayed at the top of the search results
- if that is not the case
  - then a fuzzy query will be performed
  - and products with standard_edge_ngram analyzed name containing "screen" or "screwdriver" will compete with
    - products containing "sure" or "sore" or any valid fuzzy variation of "scre"
    - products containing a word whose phonetic analysis matches "scre", for instance "secure"
The new setting Elasticsuite > Search Relevance > Spellchecking configuration > Term Vectors Configuration > [Experimental] Use edge ngram analyzer in term vectors, when set to "Yes", allows you
- to enable the detection of ngrams in the pre-request analysis step to ensure exact matching
- even if the partial word searched by the user is only contained in a standard_edge_ngram analyzed product name.
It is also recommended to set to "Yes" the other experimental setting located above it, named "[Experimental] Use all tokens from term vectors", especially if you previously set to "Yes" the experimental setting "[Experimental] Use reference analyzer in term vectors", also located in that section.
What's Changed
- [Spellcheck] Experimental settings for edge ngram exact matching by @rbayet in #3056
- [Spellchecker] Re-enable spelling type cache by @rbayet in #3057
Full Changelog: 2.10.18...2.10.18.1
2.11.4
📦 Feature
[Experimental] Ability to fine-tune single term queries exact matching
When an exact match query is performed with multiple search terms, the targeted attributes/fields are:
1. the searchable attributes/fields in their 'standard' analyzed version with their own search weight
2. the searchable attributes/fields in their 'shingle'/phrase matching version with their own search weight multiplied by the phrase match boost value
3. the searchable AND sortable attributes/fields in their 'sortable' version with their own search weight multiplied by twice the phrase match boost value
The logic behind that query construction is to
- collect all indexed documents (products, categories) containing the individual search terms in a strictly exact or "stemmed" version (i.e. not differentiating between the singular and plural form of a word, grill vs grills for instance, or between a noun/verb and its present participle form, hammer vs hammering for instance)
- boost (possibly heavily) the documents that contain the search terms in a strictly exact or "stemmed" version close to one another
- boost (possibly quite heavily) the documents that have an attribute/field containing strictly exactly the search terms and only the search terms in their indexed value (for instance, when a user searches for a product name without any word missing or any typo/approximation)
Now, when an exact match query is performed with a single term, there is a slight change with regard to rule 2: the searchable attributes/fields are targeted in their 'whitespace' version, but still with their own search weight multiplied by the phrase match boost value.
Considering that the 'whitespace' version of attributes/fields does not have any stemming component, this means that exact matching for single term searches puts a huge boost on products or categories that contain exactly the word the user typed, compared to products or categories that contain only the "stemmed" version of that word.
This is not an issue in most cases, but you could find yourself in a situation where you would like this emphasis on strictly exact matching to be a bit more relaxed, particularly in the context of matching categories in the autocomplete.
For instance, let's imagine you're selling grills, grill tools and grill accessories. You might want the same list of categories to appear in the exact same order whether a user searches for "grill" or "grills" in the autocomplete: "Grills", "Grill Tools", "Grill Accessories".
It is now possible to achieve that behavior with new experimental settings located in "Elasticsuite > Search Relevance > Relevance Configuration > Exact Matching Configuration":
- "[Experimental] Enable single term custom boost values": Yes/No, enables customizing the boost value for rules 2 and 3 when a single term is searched. Defaults to No.
- "[Experimental] Single term phrase match boost value": the boost value that will replace the phrase match boost value applied to the 'whitespace' version of attributes/fields. Defaults to 10. It's possible to set it to 0 to disable rule 2 entirely.
- "[Experimental] Single term sortable matches boost value": the boost value that will replace "twice the phrase match boost value" applied to the 'sortable' version of attributes/fields. Defaults to 20. It's possible to set it to 0 to disable rule 3 entirely.
Those three new settings can be changed for a specific scope only, i.e. a specific search query type on a given store view.
You can thus change the native behavior for category results in the autocomplete box without affecting the products search.
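The boost rules described above can be summarized with a small Python sketch. It only models the multipliers applied to each field's own search weight and is not the query builder used by Elasticsuite; the exact_match_multipliers function and its parameter names are hypothetical, while the defaults of 10 and 20 are the ones stated above.

```python
def exact_match_multipliers(term_count, phrase_match_boost,
                            single_term_custom=False,
                            single_term_phrase_boost=10,
                            single_term_sortable_boost=20):
    """Return, per field version, the multiplier applied to the search weight."""
    if term_count > 1:
        # Multiple terms: 'shingle' fields carry the phrase match boost.
        multipliers = {
            "standard": 1,
            "shingle": phrase_match_boost,
            "sortable": 2 * phrase_match_boost,
        }
    elif single_term_custom:
        # Single term with the experimental custom boost values enabled.
        multipliers = {
            "standard": 1,
            "whitespace": single_term_phrase_boost,
            "sortable": single_term_sortable_boost,
        }
    else:
        # Single term, native behavior: 'whitespace' replaces 'shingle'.
        multipliers = {
            "standard": 1,
            "whitespace": phrase_match_boost,
            "sortable": 2 * phrase_match_boost,
        }
    # A boost value of 0 disables the corresponding rule.
    return {field: value for field, value in multipliers.items() if value != 0}


multi_term = exact_match_multipliers(term_count=2, phrase_match_boost=10)
single_term = exact_match_multipliers(term_count=1, phrase_match_boost=10)
relaxed = exact_match_multipliers(1, 10, single_term_custom=True,
                                  single_term_phrase_boost=0,
                                  single_term_sortable_boost=5)
```

In the last call, the 'whitespace' rule is disabled and the 'sortable' rule is toned down, so a match on the stemmed 'standard' version weighs comparatively more, which is the kind of relaxation sought in the "grill" vs "grills" example.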
What's Changed
- Add observer to load mandatory attributes on category collections by @nige-one in #3044
- Experimental single term exact matching relevance by @rbayet in #3050
- Only Add Frontend Tracker Blocks If Config Enabled by @pykettk in #3051
Full Changelog: 2.11.3.3...2.11.4