Intermediate stages of a text analyzer using _analyze + a possible UI in OSD #7602
Open
Description
Is your feature request related to a problem? Please describe.
To test a text analyzer we use _analyze. However, it returns only the final tokens, not the output of the intermediate stages.
Describe the solution you'd like
The _analyze API is decent, but it pales in comparison to what we have in Apache Solr, where the analysis page shows the tokens after each stage of the analyzer chain. A feature like that would be great. To start with, returning the intermediate stages of an analyzer in raw JSON format would be good enough. Eventually, a page in OpenSearch Dashboards for this would be awesome (like in Apache Solr).
Describe alternatives you've considered
- Use _analyze as it is, with its limitation
- https://github.com/o19s/elyzer/
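As a rough illustration of the workaround, one could call _analyze once per stage, each time with the tokenizer plus one more token filter, and compare the outputs. The sketch below only builds the request bodies and does not send them. The helper name `stage_requests` and the sample filter names are hypothetical and chosen for illustration. The `tokenizer`, `filter`, and `text` keys are the ones the _analyze request body accepts.

```python
import json
from typing import Any


def stage_requests(text: str, tokenizer: str, filters: list[str]) -> list[dict[str, Any]]:
    """Build one _analyze request body per analyzer stage.

    Stage 0 uses the tokenizer alone; stage k adds the first k token filters.
    """
    bodies: list[dict[str, Any]] = []
    for k in range(len(filters) + 1):
        body: dict[str, Any] = {"tokenizer": tokenizer, "text": text}
        if k > 0:
            # Only include "filter" once at least one filter is in play.
            body["filter"] = filters[:k]
        bodies.append(body)
    return bodies


if __name__ == "__main__":
    # Each printed line could be sent as the body of a POST to /_analyze.
    for body in stage_requests("The Quick Foxes", "standard", ["lowercase", "stop", "porter_stem"]):
        print(json.dumps(body))
```

This costs one request per stage and has to be kept in sync with the analyzer definition by hand, which is why a single response with all intermediate stages would be preferable.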
Additional context
The screenshot below shows this for the following Solr field type:
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer name="standard"/>
    <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter name="lowercase"/>
    <filter name="englishPossessive"/>
    <filter protected="protwords.txt" name="keywordMarker"/>
    <filter name="porterStem"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer name="standard"/>
    <filter name="synonymGraph" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter name="lowercase"/>
    <filter name="englishPossessive"/>
    <filter protected="protwords.txt" name="keywordMarker"/>
    <filter name="porterStem"/>
  </analyzer>
</fieldType>
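To make the expected stages concrete, here is a small sketch that lists the ordered chain for each analyzer in the field type above. These are the rows a stage-by-stage view would show. It uses only the Python standard library. The function name `analyzer_stages` and the `tag:name` label format are made up for illustration.

```python
import xml.etree.ElementTree as ET

FIELD_TYPE_XML = """
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer name="standard"/>
    <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter name="lowercase"/>
    <filter name="englishPossessive"/>
    <filter protected="protwords.txt" name="keywordMarker"/>
    <filter name="porterStem"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer name="standard"/>
    <filter name="synonymGraph" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter name="lowercase"/>
    <filter name="englishPossessive"/>
    <filter protected="protwords.txt" name="keywordMarker"/>
    <filter name="porterStem"/>
  </analyzer>
</fieldType>
"""


def analyzer_stages(xml_text: str) -> dict[str, list[str]]:
    """Return the ordered stage labels for each <analyzer> in a fieldType."""
    root = ET.fromstring(xml_text.strip())
    chains: dict[str, list[str]] = {}
    for analyzer in root.findall("analyzer"):
        # Child order in the XML is the order in which the stages run.
        chains[analyzer.get("type", "")] = [
            f"{el.tag}:{el.get('name')}"
            for el in analyzer
            if el.tag in ("charFilter", "tokenizer", "filter")
        ]
    return chains


if __name__ == "__main__":
    for kind, stages in analyzer_stages(FIELD_TYPE_XML).items():
        print(kind, "->", " | ".join(stages))
```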
Status: Later (6 months plus)