Elasticsearch 6 and the disappearing _all field


(Roeland Van Heddegem) #1

Hello,

I've read the "Breaking changes in 6.0" page about the disappearing _all field, but I am looking for more information on it.

Why are you guys doing this? Is there documentation so we can understand the motives behind this move?
How should I exclude fields from the query_string and simple_query_string queries?
If these queries are searching over all the fields, how do they cope with the different analyzers that I use in those fields? What about multi-fields?

I can of course test all this, but I want to know the reasons and the direction you are heading in, so that I can make better use of this great piece of software!

Thanks


(Mark Walkom) #2

We're replacing it with a query that does the same thing, which means less storage use.

There's more to it and hopefully someone with a deeper understanding can comment. If not, expect a blog post on it in the near future :smiley:


(Roeland Van Heddegem) #3

I would like a blog post ;-). But I'm worried, because it seems that I need to add fields to the default list instead of excluding them, which seems at odds with the dynamic mapping that I use.
I guess I still see a use for the include_in_all property, even if there's no physical _all field created.


(Jörg Prante) #4

In the meantime, you can read old discussion at

While I agree some properties of the _all catch-all field are bad, like

  • space issues
  • highlighting issues
  • analyzer issues

it can save a tremendous amount of effort when dealing with hundreds of tiny string fields that have to be searchable through a single field.

I have been using Elasticsearch since 0.5.1, back in March 2010, and the _all field was a must-have. It has been very simple and fast to push hundreds of fields into an index, have all the strings concatenated, and execute a simple_query_string against the _all field.

The new all query is considered a special multi-match query, but it has some drawbacks:

  • slower
  • no scoring

I'm not sure how this will work out in the end.


(Lee Hinman) #5

It's worth mentioning that you can also use copy_to to create your own field similar to _all. That helps especially in the case Jörg mentioned, where you have many small string fields.
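
To make the copy_to suggestion concrete, here is a minimal sketch in Python that only builds the request bodies (it does not talk to a cluster). The field names, the `catch_all` field and the `doc` mapping type are assumptions made up for illustration; adjust them to your own index.

```python
import json

# Hypothetical name for the custom catch-all field (not an Elasticsearch built-in).
CATCH_ALL = "catch_all"


def build_mapping(string_fields):
    """Return an index body in which every listed text field is copied into CATCH_ALL."""
    properties = {
        name: {"type": "text", "copy_to": CATCH_ALL} for name in string_fields
    }
    # The target of copy_to is an ordinary text field with its own analyzer.
    properties[CATCH_ALL] = {"type": "text"}
    # "doc" is an assumed mapping type name for a 6.x index.
    return {"mappings": {"doc": {"properties": properties}}}


def build_search(user_input):
    """Return a simple_query_string search restricted to the catch-all field."""
    return {
        "query": {
            "simple_query_string": {"query": user_input, "fields": [CATCH_ALL]}
        }
    }


index_body = build_mapping(["title", "author", "summary"])
search_body = build_search("elasticsearch +mapping")

if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
    print(json.dumps(search_body, indent=2))
```

Leaving a field out of the list passed to `build_mapping` keeps it out of the catch-all field, which is roughly what include_in_all: false used to do.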


(Lee Hinman) #6

The queries will use the per-field analyzer when they analyze. So if you have a string multi-field with a non-analyzed and an analyzed version, the all_fields query will generate two queries, one that is non-analyzed and one that is analyzed.
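
As a toy illustration of that per-field expansion (not Elasticsearch's actual implementation), the sketch below walks a mapping and lists every string field together with the analyzer that would apply to it. The mapping, the `title.raw` sub-field and the helper name are hypothetical.

```python
def queryable_fields(properties, prefix=""):
    """Yield (full_path, analyzer) for every text/keyword field, sub-fields included."""
    for name, spec in properties.items():
        path = prefix + name
        field_type = spec.get("type")
        if field_type == "keyword":
            # keyword fields are matched as-is, without analysis.
            yield path, None
        elif field_type == "text":
            # text fields use their own analyzer, "standard" if none is set.
            yield path, spec.get("analyzer", "standard")
        # Multi-field variants live under "fields", object members under "properties".
        yield from queryable_fields(spec.get("fields", {}), path + ".")
        yield from queryable_fields(spec.get("properties", {}), path + ".")


# Hypothetical mapping: an analyzed field with a non-analyzed multi-field variant.
example_properties = {
    "title": {
        "type": "text",
        "analyzer": "english",
        "fields": {"raw": {"type": "keyword"}},
    },
    "views": {"type": "integer"},
}

expanded = list(queryable_fields(example_properties))

if __name__ == "__main__":
    for path, analyzer in expanded:
        print(path, analyzer)
```

Here `title` and `title.raw` each produce their own entry, which mirrors the "two queries" in the multi-field case described above.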

I also gave a talk covering this at Elasticon 2017, you can watch the video here - https://www.elastic.co/elasticon/conf/2017/sf/elasticsearch-search-improvements


(Roeland Van Heddegem) #7

I want to come back to this, because I didn't find an answer:

In my use case the customers all have a dedicated mapping with some fields excluded from the _all field. If they want to add documents with extra parts, elasticsearch currently generates the mapping for this, and they can search on it.
But how am I going to exclude fields from the all_fields query in ES 6.0?
copy_to is not the solution, because it does not seem to work with dynamically mapped fields. See "copy_to cannot be used on object mappings".

So, do I have to create the mapping beforehand? That can't work, because I don't know the fields in advance.
Changing the _default_field then? Is this the solution?

PS: It seems that the new 'all_fields' query also doesn't search in nested documents? (see _default_field)

If the _all field is disabled, the query_string query will automatically attempt to determine the existing fields in the index’s mapping that are queryable, and perform the search on those fields. Note that this will not include nested documents, use a nested query to search those documents.
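
For the nested case mentioned in that quote, a request body could look roughly like the sketch below. It combines a top-level query_string with a nested query in a bool/should, so either part may match. The `comments` path and the field names are made up for illustration; whether this fits your mapping is an assumption.

```python
import json


def build_search_with_nested(user_input, nested_path, nested_fields):
    """Search the top-level fields and, separately, the fields of one nested path."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Top-level fields: no explicit field list, so the defaults apply.
                    {"query_string": {"query": user_input}},
                    # Nested documents have to be queried through a nested query.
                    {
                        "nested": {
                            "path": nested_path,
                            "query": {
                                "query_string": {
                                    "query": user_input,
                                    "fields": nested_fields,
                                }
                            },
                        }
                    },
                ]
            }
        }
    }


# Hypothetical nested path and field names.
nested_search = build_search_with_nested(
    "disappearing field", "comments", ["comments.text", "comments.author"]
)

if __name__ == "__main__":
    print(json.dumps(nested_search, indent=2))
```

Passing an explicit `fields` list like this is also one way to keep unwanted fields out of a query_string search, at the cost of having to know the field names when building the query.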

