Intermediate stages of a text analyzer using _analyze + a possible UI in OSD #7602

Open
@aswath86

Description

Is your feature request related to a problem? Please describe.
To test a text analyzer we use _analyze, but it returns only the final tokens and not the output of the intermediate stages.

Describe the solution you'd like
The _analyze API is decent, but it pales in comparison to what Apache Solr offers, and an equivalent feature would be great. To start with, returning the intermediate stages of an analyzer as raw JSON would be good enough. Eventually, a page in OpenSearch Dashboards for this would be awesome (like the one in Apache Solr).
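
To make the request concrete, here is a minimal Python sketch of what "intermediate stages in raw JSON" could look like. It is a toy pipeline, not OpenSearch code: the regex tokenizer, the stop-word list, and every function name (`trace_analysis` and the filters) are assumptions made purely for illustration.

```python
import json
import re

# Toy stand-ins for analyzer components; real tokenizers and filters behave differently.
STOP_WORDS = {"the", "a", "an", "of"}  # assumed, illustrative list


def tokenize(text):
    """Split on runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)


def stop_filter(tokens):
    """Drop stop words, ignoring case."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def possessive_filter(tokens):
    """Strip a trailing 's from each token."""
    return [t[:-2] if t.endswith("'s") else t for t in tokens]


def trace_analysis(text):
    """Return the tokens after the tokenizer and after each filter, in order."""
    tokens = tokenize(text)
    stages = [{"stage": "tokenizer", "tokens": tokens}]
    chain = [
        ("stop", stop_filter),
        ("lowercase", lowercase_filter),
        ("possessive", possessive_filter),
    ]
    for name, apply_filter in chain:
        tokens = apply_filter(tokens)
        stages.append({"stage": name, "tokens": tokens})
    return stages


if __name__ == "__main__":
    print(json.dumps(trace_analysis("The Fox's den"), indent=2))
```

Each entry in the returned list pairs a stage name with the tokens that survive it, which is the kind of per-stage view a Dashboards page could later render as a table.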

Describe alternatives you've considered

  1. Use _analyze despite its limitations
  2. https://github.com/o19s/elyzer/
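
One way to approximate the intermediate stages with the first alternative is to call _analyze repeatedly, adding one token filter per call. The sketch below only builds the request bodies and does not send them. The helper name `stepwise_bodies` is hypothetical, and the filter names in the usage example are assumptions that would need to match filters available in the target cluster.

```python
import json


def stepwise_bodies(text, tokenizer, filters):
    """Build one _analyze request body per stage.

    The first body uses the tokenizer alone; each later body adds one more
    filter from `filters`, so consecutive responses can be compared.
    """
    bodies = []
    for n in range(len(filters) + 1):
        body = {"tokenizer": tokenizer, "text": text}
        if n:
            body["filter"] = filters[:n]
        bodies.append(body)
    return bodies


if __name__ == "__main__":
    chain = ["lowercase", "stop", "porter_stem"]  # assumed filter names
    for body in stepwise_bodies("The Fox's den", "standard", chain):
        # Each body would be sent as the JSON payload of an _analyze request.
        print(json.dumps(body))
```

Comparing the token lists returned by consecutive requests shows what each filter changed, at the cost of one request per stage.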

Additional context

The screenshot below shows the Solr analysis output for this field type:

  <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer name="standard"/>
      <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
      <filter name="lowercase"/>
      <filter name="englishPossessive"/>
      <filter protected="protwords.txt" name="keywordMarker"/>
      <filter name="porterStem"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer name="standard"/>
      <filter name="synonymGraph" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter name="stop" ignoreCase="true" words="lang/stopwords_en.txt"/>
      <filter name="lowercase"/>
      <filter name="englishPossessive"/>
      <filter protected="protwords.txt" name="keywordMarker"/>
      <filter name="porterStem"/>
    </analyzer>
  </fieldType>

[Screenshot: Solr_Analyzer_Tester]
