Bulk Update operation on multi field type and search

Sagar_Viradiya · November 17, 2022, 5:47am

Previously we had the following mapping in the index_1 index.

{
    "properties": {
        "offer_id": {
            "type": "long",
            "fields": {
                "keyword": {
                    "type": "text"
                }
            }
        },
        "seller_account_id": {
            "type": "long"
        }
    }
}

and all the data was already indexed.
We were not able to search seller_account_id when it was included in a multi_match query for full-text search,
so we decided to update the mapping to make seller_account_id a multi-field.

{
    "properties": {
        "seller_account_id": {
            "type": "long",
            "fields": {
                "keyword": {
                    "type": "text"
                }
            }
        }
    }
}

I believe no reindexing is required to update the mapping, but how should we make the existing data available in search? After we updated the mapping, search did not work for the existing data. We tried using the bulk API to update all documents, but old documents are still not searchable; only new documents show up in search. I also tried the _bulk API with _refresh=true, but still no luck.

Do I need to create a new index and reindex all documents? (I would prefer not to.)
Our multi_match request:

{
    "query": {
         "multi_match": {
             "fields": [
                 "offer_id.keyword",
                 "seller_account_id.keyword"
             ],
             "operator": "and",
             "query": "9333 8029425",
             "type": "cross_fields"
         }
    }
}

This is not working.

Christian_Dahlqvist · November 17, 2022, 6:55am

Have you tried using the update by query API to process old documents so the new mapping takes effect?

Sagar_Viradiya · November 17, 2022, 6:59am

I was just checking the examples from that documentation:

POST test/_update_by_query?refresh&conflicts=proceed
POST test/_search?filter_path=hits.total
{
  "query": {
    "match": {
      "flag": "foo"
    }
  }
}

Are these two separate requests or a single one?
Do I just need to use POST test/_update_by_query?refresh&conflicts=proceed without a body?

Actually, I did not understand the example. Could you help me with it?

Christian_Dahlqvist · November 17, 2022, 7:05am

I would identify a few old documents that need to be updated and run an update by query with a filter to target just a few documents in order to verify that it works and resolves the issue. Once that is done you should be able to run the task without a body and process all documents. To be more selective you might also be able to write a query clause to select only documents that do not have the seller_account_id.keyword field defined.
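A minimal sketch of the "try it on a few documents first" step, written as a Python helper that only builds the request and does not send it. The document IDs are hypothetical placeholders, and selecting by `ids` is just one way to target a handful of old documents.

```python
import json


def build_trial_update(index, doc_ids):
    """Return (method, path, body) for an update-by-query limited to a few documents."""
    # conflicts=proceed skips documents that changed while the task runs;
    # refresh=true makes the rewritten documents searchable when it finishes.
    path = f"/{index}/_update_by_query?conflicts=proceed&refresh=true"
    # The ids query restricts the update to the listed document IDs.
    body = {"query": {"ids": {"values": list(doc_ids)}}}
    return "POST", path, body


# Hypothetical IDs of a few old documents that are currently not searchable.
method, path, body = build_trial_update("index_1", ["1", "2", "3"])
print(method, path)
print(json.dumps(body, indent=2))
```

If the trial documents become searchable through seller_account_id.keyword afterwards, the same request without the body should process the whole index.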

Sagar_Viradiya · November 17, 2022, 7:22am

How can I check whether the seller_account_id.keyword field is defined or not? Can you share an example?
Because once the mapping is updated, all documents have seller_account_id available with seller_account_id.keyword, right?

Christian_Dahlqvist · November 17, 2022, 7:28am

You can add a boolean query with a must_not exists clause.

Elasticsearch stores data in immutable segments. The new field will therefore not be added for existing documents unless you update them, causing them to be written to a new segment.

Sagar_Viradiya · November 17, 2022, 8:50am

So the _reindex API is not required in the overall solution? Generally, what steps should be followed when changing a field to a multi-field in an existing index? And should existing data also be available for searching afterwards?

If we use update_by_query, how can we resolve conflicts in existing documents?

Christian_Dahlqvist · November 17, 2022, 8:58am

I believe that is correct.

Yes. Run a refresh before starting the update as the query clause relies on all old data being searchable.

If you have conflicts, I believe you should skip those documents as it indicates that a new version was indexed after you started the operation and that means it will automatically get the new mappings.
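To see how many documents were skipped because of conflicts, the counters in the update-by-query response can be inspected. The following is a rough sketch: the sample response is made up for illustration, and only the `updated`, `version_conflicts`, `timed_out` and `failures` fields are read.

```python
def summarize_update_by_query(response):
    """Pull out the counters that matter when conflicts=proceed is used."""
    failures = response.get("failures", [])
    return {
        "updated": response.get("updated", 0),
        # Documents skipped because a newer version was indexed meanwhile.
        "skipped_conflicts": response.get("version_conflicts", 0),
        "failed": len(failures),
        "ok": not failures and not response.get("timed_out", False),
    }


# Hypothetical response; the numbers are invented.
sample = {
    "took": 147,
    "timed_out": False,
    "updated": 118,
    "version_conflicts": 2,
    "failures": [],
}
print(summarize_update_by_query(sample))
```

A non-zero `skipped_conflicts` count is expected on a live index and, per the reasoning above, does not need any follow-up.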

Sagar_Viradiya · November 17, 2022, 8:58am

When I used

{
    "query": {
        "bool": {
            "must_not": {
                "exists": {
                    "field": "seller_account_id.keyword"
                }
            }
        }
    }
}

It returns only results where seller_account_id is null, and there are only a few of those. But the issue affects many records.

Christian_Dahlqvist · November 17, 2022, 8:59am

Then you may need to update all documents.

Sagar_Viradiya · November 17, 2022, 9:01am

I am thinking of another solution:

adding a temporary field such as temp_id and bulk updating all documents with this field, so the document version would be updated and the documents would become available for search, right?

After that, I would remove the field by updating the mapping.

Let me know your thoughts.

Christian_Dahlqvist · November 17, 2022, 9:03am

I would just run an update by query on all documents and skip on conflicts.

Adding a field requires an update to every document, and you cannot remove fields from mappings without reindexing. That sounds a lot more complicated...

Sagar_Viradiya · November 17, 2022, 9:04am

OK, thanks. Running update_by_query for all documents means there will be no criteria to search on. I mean, update_by_query would require some body, right?

Sagar_Viradiya · November 17, 2022, 9:10am

Is this correct for updating all documents?

POST test/_update_by_query?refresh&conflicts=proceed
{
  "query": {
    "match_all": {}
  }
}

Christian_Dahlqvist · November 17, 2022, 9:21am

That should work but the first example in the docs I linked to seems to do exactly what you want.

Sagar_Viradiya · November 17, 2022, 9:22am

This one?

POST my-index-000001/_update_by_query?conflicts=proceed

Christian_Dahlqvist · November 17, 2022, 9:35am

Yes.
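Putting the thread together, the whole sequence could look roughly like the sketch below. It only builds the three requests as (method, path, body) tuples and does not send them. The index name index_1 and the multi_match body come from the first post; adding refresh=true to the update is an assumption made here so that the final search sees the rewritten documents.

```python
import json


def in_place_refresh_steps(index):
    """Return the requests, in order, for picking up the new sub-field."""
    search_body = {
        "query": {
            "multi_match": {
                "fields": ["offer_id.keyword", "seller_account_id.keyword"],
                "operator": "and",
                "query": "9333 8029425",
                "type": "cross_fields",
            }
        }
    }
    return [
        # 1. Make all existing documents visible to the update's query.
        ("POST", f"/{index}/_refresh", None),
        # 2. Rewrite every document (no body) and skip version conflicts.
        ("POST", f"/{index}/_update_by_query?conflicts=proceed&refresh=true", None),
        # 3. Re-run the original search to verify old documents now match.
        ("POST", f"/{index}/_search", search_body),
    ]


for method, path, body in in_place_refresh_steps("index_1"):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```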

Sunile_Manjee · November 29, 2022, 5:51am

Curious whether you tried a runtime field? This would avoid updates to existing docs.
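For illustration, a runtime field defined in the search request might look something like the sketch below, which builds the request body in Python without sending it. The runtime field name seller_account_id_str is made up, and this assumes a version that supports runtime_mappings in the search request. It uses a simple term query; whether such a field behaves as expected inside the cross_fields multi_match from the first post has not been checked here.

```python
import json


def runtime_field_search(value, source_field="seller_account_id",
                         runtime_field="seller_account_id_str"):
    """Build a search body that exposes a long field as a keyword at query time."""
    # Painless script: emit the long value as a string when the field has a value.
    script = (
        f"if (doc['{source_field}'].size() != 0) "
        f"{{ emit(Long.toString(doc['{source_field}'].value)); }}"
    )
    return {
        "runtime_mappings": {
            runtime_field: {"type": "keyword", "script": {"source": script}}
        },
        "query": {"term": {runtime_field: str(value)}},
    }


print(json.dumps(runtime_field_search(8029425), indent=2))
```

The trade-off is that the script runs at query time for each candidate document, so this is usually slower than searching an indexed sub-field.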

system · December 27, 2022, 5:52am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
