feat(elastic-search): improved default search #3284
@monrostar I know you guys have done a lot of work on this plugin, so perhaps you can check whether this conflicts with any of your use cases?
    { productId: 'T_3', enabled: false },
]);
const t3 = result.search.items.find(i => i.productId === 'T_3');
expect(t3?.enabled).toEqual(false);
Fuzzy matching now returns multiple results, but this test only cares whether T_3 is disabled, so we ignore the other results.
ok
    'Camera Lens',
    'Instant Camera',
'Camera Lens' is now the first result because the name is more important. In most cases this is desired, but this test case is debatable... WDYT?
I think this is fine.
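For readers following the thread, the two behaviours discussed above (fuzzy matching returning extra hits, and name matches outranking other fields) can be illustrated with a query of roughly this shape. This is a minimal sketch and not the PR's actual query: `buildTermQuery`, the `productName` and `description` field names, and the default boost value are assumptions made for illustration.

```typescript
// Hypothetical options; the plugin's real defaults may differ.
interface TermQueryOptions {
  nameBoost: number; // weight of name matches relative to description matches
  fuzziness: 'AUTO' | number;
}

// Builds an Elasticsearch `multi_match` query body for a search term.
function buildTermQuery(
  term: string,
  opts: TermQueryOptions = { nameBoost: 5, fuzziness: 'AUTO' },
) {
  return {
    multi_match: {
      query: term,
      // "field^N" multiplies that field's score contribution by N,
      // so a hit in the name ranks above a hit in the description.
      fields: [`productName^${opts.nameBoost}`, 'description'],
      // Tolerates small typos, which also means more documents match.
      fuzziness: opts.fuzziness,
    },
  };
}

console.log(JSON.stringify(buildTermQuery('camera'), null, 2));
```

With a boost like this, a product named 'Camera Lens' would be expected to rank above one that only mentions a camera in its description, which matches the reordering seen in the test.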
Hi, sorry for the late reply. You can do what you want. We are currently using our own plugin for Elasticsearch. We use one document per variant, covering all channels, translations, and currencies. Unfortunately I had to completely rewrite the original plugin. I'd like to contribute this code, but we don't have plans for that yet... Here's a small example of the new structure:
const defaultAvailableLanguages = [LanguageCode.en]
const languageAnalyzerMap: Partial<Record<LanguageCode, string>> & { default: string } = {
  [LanguageCode.ar]: 'arabic',
  [LanguageCode.hy]: 'armenian',
  [LanguageCode.eu]: 'basque',
  [LanguageCode.bn]: 'bengali',
  [LanguageCode.pt_BR]: 'brazilian',
  [LanguageCode.bg]: 'bulgarian',
  [LanguageCode.ca]: 'catalan',
  [LanguageCode.cs]: 'czech',
  [LanguageCode.da]: 'danish',
  [LanguageCode.nl]: 'dutch',
  [LanguageCode.en]: 'english',
  [LanguageCode.en_AU]: 'english',
  [LanguageCode.en_CA]: 'english',
  [LanguageCode.en_GB]: 'english',
  [LanguageCode.en_US]: 'english',
  [LanguageCode.et]: 'estonian',
  [LanguageCode.fi]: 'finnish',
  [LanguageCode.fr]: 'french',
  [LanguageCode.gl]: 'galician',
  [LanguageCode.de]: 'german',
  [LanguageCode.el]: 'greek',
  [LanguageCode.hu]: 'hungarian',
  [LanguageCode.id]: 'indonesian',
  [LanguageCode.ga]: 'irish',
  [LanguageCode.it]: 'italian',
  [LanguageCode.lv]: 'latvian',
  [LanguageCode.lt]: 'lithuanian',
  [LanguageCode.nb]: 'norwegian',
  [LanguageCode.nn]: 'norwegian',
  [LanguageCode.pt]: 'portuguese',
  [LanguageCode.ro]: 'romanian',
  [LanguageCode.ru]: 'russian',
  [LanguageCode.sr]: 'serbian',
  [LanguageCode.es]: 'spanish',
  [LanguageCode.sv]: 'swedish',
  default: 'standard',
}
function getAnalyzerForLanguage(languageCode: LanguageCode): string {
  return languageAnalyzerMap[languageCode] || languageAnalyzerMap.default
}
export const buildIndexName = (prefix: string, name: string, postfix = ''): estypes.IndexName => `${prefix}${name}${postfix}`
export const buildAliasName = (prefix: string, name: string, postfix = ''): estypes.IndexAlias => `${prefix}${name}${postfix}`
export function TranslatedTextKeywordMappingField(): estypes.MappingObjectProperty {
  return {
    type: 'object',
    properties: defaultAvailableLanguages.reduce((acc, lang) => {
      acc[lang] = {
        type: 'text',
        analyzer: `${getAnalyzerForLanguage(lang)}_analyzer`,
        fields: {
          keyword: {
            type: 'keyword',
          },
        },
      }
      return acc
    }, {} as Record<LanguageCode, estypes.MappingProperty>),
  }
}
export function TranslatedTextMappingField(): estypes.MappingObjectProperty {
  return {
    type: 'object',
    properties: defaultAvailableLanguages.reduce((acc, lang) => {
      acc[lang] = {
        type: 'text',
        analyzer: `${getAnalyzerForLanguage(lang)}_analyzer`,
      }
      return acc
    }, {} as Record<LanguageCode, estypes.MappingProperty>),
  }
}
const priceMappingField: estypes.MappingProperty = {
  type: 'nested',
  properties: {
    id: { type: 'keyword' },
    channelId: { type: 'keyword' },
    currencyCode: { type: 'keyword' },
    price: { type: 'integer' },
  },
}
function generateDynamicTemplatesAndAnalyzers() {
  const dynamicTemplates: Record<string, MappingDynamicTemplate> | Record<string, MappingDynamicTemplate>[] = []
  const analyzers: Record<string, AnalysisAnalyzer> = {
    standard_analyzer: {
      type: 'custom',
      tokenizer: 'standard',
      filter: ['lowercase', 'asciifolding'],
    },
  }
  const filters: Record<string, AnalysisTokenFilter> = {}
  for (const langCode of Object.values(LanguageCode)) {
    const analyzerName = getAnalyzerForLanguage(langCode)
    const effectiveAnalyzer = analyzerName ? `${analyzerName}_analyzer` : 'standard_analyzer'
    dynamicTemplates.push({
      [`language_analyzer_${langCode}_productName`]: {
        match_mapping_type: 'string',
        path_match: `productName.${langCode}`,
        mapping: {
          type: 'text',
          analyzer: effectiveAnalyzer,
          fields: {
            keyword: {
              type: 'keyword',
              ignore_above: 256,
            },
          },
        },
      },
    })
    dynamicTemplates.push({
      [`language_analyzer_${langCode}_variantName`]: {
        match_mapping_type: 'string',
        path_match: `variantName.${langCode}`,
        mapping: {
          type: 'text',
          analyzer: effectiveAnalyzer,
          fields: {
            keyword: {
              type: 'keyword',
              ignore_above: 256,
            },
          },
        },
      },
    })
    dynamicTemplates.push({
      [`language_analyzer_${langCode}_productDescription`]: {
        match_mapping_type: 'string',
        path_match: `productDescription.${langCode}`,
        mapping: {
          type: 'text',
          analyzer: effectiveAnalyzer,
        },
      },
    })
    if (analyzerName && analyzerName !== 'standard') {
      analyzers[`${analyzerName}_analyzer`] = {
        type: 'custom',
        tokenizer: 'standard',
        filter: ['lowercase', 'asciifolding', `${analyzerName}_stemmer`],
      }
      filters[`${analyzerName}_stemmer`] = {
        type: 'stemmer',
        language: analyzerName,
      }
    }
  }
  return { dynamicTemplates, analyzers, filters }
}
const ProductVariantIndexMappingProperties: { [key in keyof VariantIndexItem]: estypes.MappingProperty } = {
  // index date
  lastSyncedAt: { type: 'date' },
  productUpdatedAt: { type: 'date' },
  productCreatedAt: { type: 'date' },
  // product fields
  productId: { type: 'keyword' },
  productChannelIds: { type: 'keyword' },
  productCollectionIds: { type: 'keyword' },
  productFacetValueIds: { type: 'keyword' },
  productFacetIds: { type: 'keyword' },
  productOptions: { type: 'flattened' },
  productOptionsGroups: {
    type: 'nested',
    properties: {
      code: { type: 'keyword' },
      id: { type: 'keyword' },
      name: TranslatedTextKeywordMappingField(),
      options: {
        type: 'nested',
        properties: {
          id: { type: 'keyword' },
          name: TranslatedTextKeywordMappingField(),
          code: { type: 'keyword' },
        },
      },
    },
  },
  productEnabled: { type: 'boolean' },
  productInStock: { type: 'boolean' },
  productName: TranslatedTextKeywordMappingField(),
  productSlug: TranslatedTextKeywordMappingField(),
  productDescription: TranslatedTextMappingField(),
  productPriceMax: priceMappingField,
  productPriceMin: priceMappingField,
  productAssetId: { type: 'keyword' },
  productPreview: { type: 'keyword' },
  productPreviewFocalPoint: { type: 'flattened' },
  productAssets: { type: 'flattened' },
  // variant fields
  variantUpdatedAt: { type: 'date' },
  variantCreatedAt: { type: 'date' },
  variantId: { type: 'keyword' },
  variantChannelIds: { type: 'keyword' },
  variantCollectionIds: { type: 'keyword' },
  variantFacetIds: { type: 'keyword' },
  variantFacetValueIds: { type: 'keyword' },
  variantEnabled: { type: 'boolean' },
  variantInStock: { type: 'boolean' },
  variantDisplayStockLevel: { type: 'keyword' },
  variantName: TranslatedTextKeywordMappingField(),
  variantSku: { type: 'keyword' },
  variantOptions: {
    type: 'nested',
    properties: {
      code: { type: 'keyword' },
      id: { type: 'keyword' },
      name: TranslatedTextKeywordMappingField(),
      group: {
        type: 'object',
        properties: {
          id: { type: 'keyword' },
          name: TranslatedTextKeywordMappingField(),
          code: { type: 'keyword' },
        },
      },
    },
  },
  variantPrice: priceMappingField,
  variantAssetId: { type: 'keyword' },
  variantPreview: { type: 'keyword' },
  variantPreviewFocalPoint: { type: 'flattened' },
  variantAssets: { type: 'flattened' },
}
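To make the "one document per variant for all channels, translations, and currencies" layout more concrete, here is a hedged sketch of how such a document might be queried. The field names (`productName.<lang>`, `variantChannelIds`, `variantEnabled`, `variantPrice.*`) are taken from the mapping above; the function `buildVariantQuery`, its input type, and the choice of filters are hypothetical and not part of the commenter's plugin.

```typescript
// Hypothetical input type, for illustration only.
interface VariantSearchInput {
  term: string;
  languageCode: string; // e.g. 'en'; selects the per-language sub-field
  channelId: string;
  currencyCode: string; // e.g. 'USD'
  maxPrice?: number;    // assumed to use the same unit as the indexed price
}

// Builds an Elasticsearch query body against the variant index sketched above.
function buildVariantQuery(input: VariantSearchInput) {
  // All price conditions must hold for the SAME nested price entry,
  // which is why the price field is mapped as `nested`.
  const priceFilters: object[] = [
    { term: { 'variantPrice.channelId': input.channelId } },
    { term: { 'variantPrice.currencyCode': input.currencyCode } },
  ];
  if (input.maxPrice !== undefined) {
    priceFilters.push({ range: { 'variantPrice.price': { lte: input.maxPrice } } });
  }

  return {
    bool: {
      // Full-text match on the translation for the requested language.
      must: [{ match: { [`productName.${input.languageCode}`]: input.term } }],
      filter: [
        { term: { variantChannelIds: input.channelId } },
        { term: { variantEnabled: true } },
        { nested: { path: 'variantPrice', query: { bool: { filter: priceFilters } } } },
      ],
    },
  };
}

console.log(
  JSON.stringify(
    buildVariantQuery({ term: 'lens', languageCode: 'en', channelId: '1', currencyCode: 'USD', maxPrice: 5000 }),
    null,
    2,
  ),
);
```

The nested query matters here: with a plain `object` mapping, a price in one currency and a channel from a different price entry could satisfy the filters together, which is presumably why the snippet maps prices as `nested`.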
@monrostar Thanks for your reply! In that case, I might also take a look at the indexing process: I see that it was optimized quite a bit over the past years for products with a very large number of variants. This does seem to slow down the indexing process for 'normal' stores. It takes almost 10 minutes to index a store with ~600 variants and ~200 products. I am no ES expert, but doesn't that seem a bit long?
Looks good, thanks! Potential optimizations to the overall architecture and indexing performance can be tackled separately.
Description
Minor tweaks to improve the out-of-the-box search results from Elasticsearch.
It was a bit demotivating to see that my search results were worse than with the default plugin, while ES is such a powerful engine. In my case this was due to:
Most consumers probably define their own queries, but for those starting with the defaults this gives them a better experience.
Breaking changes
No