Exclude particular fields from [all_field] searches


#1

Hello,

I want to use the "all_fields" option in my queries to search all indexed fields except a few. I have Elasticsearch 5.2.1 installed and sent the commands below to it from Kibana.

PUT test_index
{
  "mappings": {
    "user": {
      "_all": { "enabled": false },
      "properties": {
        "title": { "type": "text" },
        "name": { "type": "text" },
        "age": { "type": "integer" }
      }
    }
  }
}

POST test_index/_bulk
{ "index" : { "_type" : "user", "_id" : "1" } }
{ "name" : "niaz", "title" : "test", "age" : 40, "tags" : "toyota, bmw" }
{ "index" : { "_type" : "user", "_id" : "2" } }
{ "name" : "john", "title" : "toyota", "age" : 30 }
{ "index" : { "_type" : "user", "_id" : "3" } }
{ "name" : "bell", "title" : "mercedes", "age" : 35 }
{ "index" : { "_type" : "user", "_id" : "4" } }
{ "name" : "akram", "title" : "bmw", "age" : 42 }

GET test_index/_search
{
  "query": {
    "query_string": {
      "query": "toyota"
    }
  }
}

The query returns 2 documents, which is correct. My question is how to change the all_fields configuration so that the query excludes the "tags" field from the search.

In general, I could configure a custom _all field or send the query with an explicit list of fields. But then my fields are either indexed twice, or I have to send a large list of fields to Elasticsearch with every query.
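One way to avoid hand-maintaining the long field list is to derive it from the mapping on the client side. The following is a minimal sketch, assuming the mapping properties from the example above (with "tags" added as a text field) are available as a Python dict. The helper names `searchable_fields` and `query_string_body` are hypothetical, and only text fields are kept because a text query such as "toyota" may be rejected by numeric fields unless the query is marked lenient.

```python
import json

# Mapping properties taken from the thread's test_index example, plus "tags".
PROPERTIES = {
    "title": {"type": "text"},
    "name": {"type": "text"},
    "age": {"type": "integer"},
    "tags": {"type": "text"},
}


def searchable_fields(properties, exclude):
    """Return the sorted names of text fields that are not excluded."""
    return sorted(
        name
        for name, spec in properties.items()
        if spec.get("type") == "text" and name not in exclude
    )


def query_string_body(text, fields):
    """Build a query_string search body restricted to the given fields."""
    return {"query": {"query_string": {"query": text, "fields": fields}}}


fields = searchable_fields(PROPERTIES, exclude={"tags"})
body = query_string_body("toyota", fields)

# The printed JSON can be sent as the body of GET test_index/_search.
print(json.dumps(body, indent=2))
```

The list still travels with every request, so this only removes the manual bookkeeping, not the request size.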

Thanks a lot in advance for your help.

Greetings,
Niaz


(Mark Walkom) #2

You may want to disable _all and use copy_to to create your own version and then pick that as the default search field.
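As a rough illustration of this suggestion, the sketch below builds an index-creation body in which every text field except the excluded ones copies its value into a catch-all field, and that field is set as the index's default search field through `index.query.default_field`. The field name `search_all` and the helper `with_catch_all` are hypothetical, and leaving non-text fields out of the catch-all is a simplification of this sketch. Note that the copied values are indexed a second time in the catch-all field.

```python
import json


def with_catch_all(properties, target, exclude):
    """Return a copy of the properties where text fields copy into `target`."""
    result = {}
    for name, spec in properties.items():
        spec = dict(spec)  # do not mutate the caller's mapping
        if spec.get("type") == "text" and name not in exclude:
            spec["copy_to"] = target
        result[name] = spec
    # The catch-all itself is an ordinary text field.
    result[target] = {"type": "text"}
    return result


properties = {
    "title": {"type": "text"},
    "name": {"type": "text"},
    "age": {"type": "integer"},
    "tags": {"type": "text"},
}

index_body = {
    # Assumption: queries without an explicit field should hit "search_all".
    "settings": {"index.query.default_field": "search_all"},
    "mappings": {
        "user": {
            "_all": {"enabled": False},
            "properties": with_catch_all(properties, "search_all", exclude={"tags"}),
        }
    },
}

# The printed JSON can be sent as the body of PUT test_index.
print(json.dumps(index_body, indent=2))
```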


#3

Hello,

copy_to is a possibility, but as I mentioned above, the data is then indexed twice: once in the field itself and again in the copied field. Is there any other possibility? Can we somehow create a virtual field or an alias in the mapping, so that when the virtual field or alias is referenced in a query, the fields configured for it are used?

Or can [all_fields] be configured with one or more fields? In theory, this should be the same mechanism that [all_fields] already uses. Currently [all_fields] is automatically expanded to all fields in the mapping, and I want to influence that expansion, e.g. for the example above, to all fields except "tags".

Greetings,
Niaz


(Mark Walkom) #4

Nope.

You could look at https://www.elastic.co/guide/en/elasticsearch/reference/5.2/include-in-all.html


#5

It seems "include_in_all" solves what I need. But I want to confirm one thing:

1 - Does "include_in_all" configure the old "_all" field in the background, so that all fields with "include_in_all=true" are indexed twice?

OR

2 - Does this setting only affect queries with no default search field, so that only the fields configured with "include_in_all=true" are searched? This is the behavior I want.

I will build an example for (2) and post it later today.

Thanks
Niaz


#6

Hello Mark,

I have executed the following code, and it seems the _all field is used under the hood. That means the Elasticsearch 5.1 feature that lets us disable the _all field completely cannot be used here.

DELETE test_index

PUT test_index
{
  "mappings": {
    "user": {
      "include_in_all": false,
      "properties": {
        "title": { "type": "text", "include_in_all": true },
        "name": { "type": "text", "include_in_all": true },
        "age": { "type": "integer" },
        "tags": { "type": "text", "include_in_all": false }
      }
    }
  }
}

POST test_index/_bulk
{ "index" : { "_type" : "user", "_id" : "1" } }
{ "name" : "niaz", "title" : "test", "age" : 40, "tags" : "toyota, bmw" }
{ "index" : { "_type" : "user", "_id" : "2" } }
{ "name" : "john", "title" : "toyota", "age" : 30 }
{ "index" : { "_type" : "user", "_id" : "3" } }
{ "name" : "bell", "title" : "mercedes", "age" : 35 }
{ "index" : { "_type" : "user", "_id" : "4" } }
{ "name" : "akram", "title" : "bmw", "age" : 42 }

GET test_index/_search
{
  "query": {
    "query_string": {
      "query": "toyota"
    }
  }
}

The search query above returns ONE document, which is correct, since we disabled "include_in_all" for the "tags" field.

Now update the mapping for the "tags" field as below:

PUT test_index/_mapping/user
{
  "properties": {
    "tags": { "type": "text", "include_in_all": true }
  }
}

Execute the search query again and it still returns 1 document. That means a change to "include_in_all" is not applied retroactively: documents that are already indexed are not affected when this setting is changed from false to true or vice versa.

I then index one more document using the command below:
POST test_index/_bulk
{ "index" : { "_type" : "user", "_id" : "5" } }
{ "name" : "Mark Walkom", "title" : "elastic", "age" : 35, "tags" : "toyota, tesla" }

Execute the search query again and this time 2 documents are returned, because the "tags" field of the newly indexed document is included in _all.
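The behavior observed in these steps is what one would expect if the catch-all content is assembled once, at index time. The sketch below is a toy model in plain Python, not Elasticsearch code: the class `ToyIndex` and its crude tokenizer are assumptions made purely to illustrate why flipping the flag leaves already indexed documents untouched.

```python
class ToyIndex:
    """Toy model of an index-time catch-all field; not Elasticsearch code."""

    def __init__(self, include_in_all):
        # field name -> bool, standing in for the "include_in_all" mapping flag
        self.include_in_all = dict(include_in_all)
        # doc id -> set of tokens captured in the catch-all at index time
        self.all_tokens = {}

    def index(self, doc_id, source):
        tokens = set()
        for field, value in source.items():
            if self.include_in_all.get(field, False):
                # Crude analyzer: lowercase and treat commas as whitespace.
                tokens.update(str(value).lower().replace(",", " ").split())
        self.all_tokens[doc_id] = tokens

    def update_mapping(self, field, flag):
        # Only the flag changes; tokens of indexed documents stay as they are.
        self.include_in_all[field] = flag

    def search(self, term):
        term = term.lower()
        return sorted(d for d, toks in self.all_tokens.items() if term in toks)


idx = ToyIndex({"title": True, "name": True, "age": False, "tags": False})
idx.index("1", {"name": "niaz", "title": "test", "age": 40, "tags": "toyota, bmw"})
idx.index("2", {"name": "john", "title": "toyota", "age": 30})
idx.index("3", {"name": "bell", "title": "mercedes", "age": 35})
idx.index("4", {"name": "akram", "title": "bmw", "age": 42})

before = idx.search("toyota")         # only document 2 matches
idx.update_mapping("tags", True)
after_update = idx.search("toyota")   # document 1 was indexed earlier: unchanged
idx.index("5", {"name": "Mark Walkom", "title": "elastic", "age": 35,
                "tags": "toyota, tesla"})
after_new_doc = idx.search("toyota")  # document 5 was indexed under the new flag

print(before, after_update, after_new_doc)
```

In this model, document 1 would only start matching after being indexed again under the updated flag.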

My point: I want to use the Elasticsearch 5.1 feature where the _all field can be disabled and all fields are indexed only once. At search time, when no default field is provided, a user-configured set of fields should be searched in place of _all. Please accept this as a feature request for the next version of Elasticsearch.

Thanks
Niaz


Option for excluding fields in "all_fields" mode query for ES 6.0
(Aaron XImm) #7

We will want the same thing: to be able to control which fields are excluded from all_fields in 6.x.

For our case it is critical to be able to exclude specific fields from all_fields as it would not be feasible to enumerate only included fields.

