As we know , unindexed field will store in _source field,so we can get those unindexed fields in _source.But Elasticsearch-sql can't get unindexed field.
GET /_sql/translate
{
"query":"""
select * from "view-laohu-sdk" limit 1
"""
}
look the picture blow,“_source”:false lead to we can't get unindexed field
Please look at the problem carefully,I unindexed these fields and these fields stored in _source
default. But I can't retrieve these fields when I use elasticsearch-sql.
Could you provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.
A full reproduction script is something anyone can copy and paste in Kibana dev console, click on the run button to reproduce your use case. It will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.
Reading unindexed fields has a significant performance impact and can be very slow and scale badly. I wonder if it is therefore disabled within the SQL interface.
Maybe,but this situation is very widely used .If I use SQL ,I must index all fields although there are some fields That I don't want to do search. I think index all fields has a more significant performance impact.
Elasticsearch is not a relational database and does not store non-indexed data in a columnar format. In order to access data that is not indexed it needs to retrieve and parse the full source document, which is a lot slower than access indexed fields. This is why I suspect it might have been disabled on purpose, but someone from Elastic can probably shed some light on this.
I don't think so .If you retrieve many fields, from “_source” is faster than from docvalue.This is why Elasticsearch stored field in “_source” default.
However, SQL Language can control what fields are returned ,such as “select field1” or “select *”,I think SQL interface should be flexible, not banned altogether
As detailed in SQL Limitations, most columns will now always be retrieved from _source, with some exceptions (such as the @timestamp field of type date, in your example) .
And in accordance to that, as an example: if you run something like SELECT "@timestamp", bytes FROM kibana_sample_data_logs, you will get a translation like:
ES/SQL queries will also not run through all stored docs of an index to match what non searchable but _source stored fields might be relevant. Besides performance, since this field is not indexed, how could SQL know the type of this field and how to handle it; ex.: should field2+1=223 be allowed, or rather CHAR_LENGTH(field2)=3?
I see.
But if a index has 50 fields(only 10 fields need search,so we just index these 10 fields).
When I use SQL, I can't get the other 30 unindexed fields back,I think this is unreasonable.
I must index all fields,this is a waste of storage.
"how could SQL know the type of this field and how to handle it; ex.: should field2+1=223 be allowed, or rather CHAR_LENGTH(field2)=3 ?"
First,if a field is unindexed,user will not do that like field2+1=223. If they do that ,SQL can throw Exception to prompt them the wrong usage.
In addition, fetching unindexed fields is a very common use case,if SQL don't support this,That would be a great pity.
How much?
Is that a real problem? I mean that in some corner use cases that could be a problem but I have often seen engineers (like myself) trying to solve problems they don't have actually.
In my usage scenario,a index have many fields that don't need to search.If these fields set "index": false,"store": true ,when I use "select * from xx",many fields retrieve from 'doc_value' not '_source', this should have a significant impact on performance.
If an index has thousands of fields, do you think it is faster to fetch all the fields from _source or from DocValue.
Obviously, from _source is faster
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.