Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Distinct * returns empty result #860

Open
dai-chen opened this issue Nov 26, 2020 · 4 comments
Open

Distinct * returns empty result #860

dai-chen opened this issue Nov 26, 2020 · 4 comments
Labels
bug Something isn't working SQL

Comments

@dai-chen
Copy link
Member

In SQL, DISTINCT * is used to get distinct row in result set. Currently an empty result is returned for queries as below:

POST _opendistro/_sql
{
  "query": """
    SELECT DISTINCT * FROM accounts
  """
}
{
  "schema": [
   ...
  ],
  "total": 0,
  "datarows": [],
  "size": 0,
  "status": 200
}
@dai-chen dai-chen added bug Something isn't working SQL labels Nov 26, 2020
@mj84
Copy link

mj84 commented Jan 6, 2021

I ran into this today as well, it seems like a browser-side JavaScript issue to me.
When trying to execute a query with DISTINCT I get this in my browser console:
TypeError: Cannot read property 'values' of undefined

The failing code is:
var _iterator = main_createForOfIteratorHelper(schema.values()), _step;

@mj84
Copy link

mj84 commented Jan 6, 2021

After investigating a bit more, i found that in the above code schema was undefined, because this call failed:
var schema = external_kbnSharedDeps_Lodash_default.a.get(responseObj, "schema");

Elasticsearch returned this error:
details: "Shard[0]: java.lang.IllegalArgumentException: Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [pcs_username] in order to load field data by uninverting the inverted index. Note that this can use significant memory.For more details, please send request for Json format to see the raw response from elasticsearch engine." reason: "Error occurred in Elasticsearch engine: all shards failed" type: "SearchPhaseExecutionException"

After changing my query to use the ".keyword" field of the desired field, DISTINCT now works correctly, so this is probably not a bug.
So you could try to use SELECT DISTINCT * FROM accounts.keyword

@dai-chen
Copy link
Member Author

@mj84 Thanks for your info! But I didn't get why FROM accounts.keyword worked. Could you elaborate or share your index? Thanks!

@mj84
Copy link

mj84 commented Jan 12, 2021

My index is nothing special, just some syslog messages with specific parts of the message extracted into separate fields using logstash and grok.

The reason is actually explained in the error message from Elasticsearch :)
Since DISTINCT is an aggregation operation, Elasticsearch requires keywords when aggregating, for performance reasons.

Nonetheless, it would be quite helpful to pass on errors from Elasticsearch like this one to the user.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working SQL
Projects
None yet
Development

No branches or pull requests

2 participants