Elasticsearch RestHighLevelClient bulk call not updating field reliably

Description of the problem including expected versus actual behavior:

We noticed strange behavior with one of our Elasticsearch bulk refresh calls.

When we try to update the stock field in our item index (we use one item index per language, so we call updateBulkModel for each of them), the call goes through smoothly, with no errors and no warnings.

But the stock field is not updated for some of the items; the update silently fails for them. If we call the method again and again, it gradually updates all the stocks.

The current batch size is 500. If we use a much lower batch size of 5 or 10, the stock field gets updated more reliably, but it takes way too long this way.
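
To make the batch-size comparison concrete, here is a minimal sketch of how a list of models could be split into batches of a configurable size before each bulk call. The class and method names (BatchPartitioner, partition) are hypothetical and only serve as an illustration; the sketch uses the JDK alone and does not touch Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Elasticsearch API.
final class BatchPartitioner {
    private BatchPartitioner() {}

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(final List<T> items, final int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        final List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            final int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(final String[] args) {
        final List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            ids.add(i);
        }
        // Each resulting batch would be sent as one bulk request.
        for (final List<Integer> batch : partition(ids, 500)) {
            System.out.println("batch of " + batch.size() + " items");
        }
    }
}
```

With a helper like this, the batch size becomes a single parameter that can be varied between runs (for example 10, 100, 500) to see where updates start to go missing.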

What we tried so far:

  • compared the document versions of the items before and after the bulk call; none of them reported the same version
  • introduced logging with the hasFailures and isFailed methods; neither of them ever reported an error
  • changed thread_pool.write.queue_size to 500 in elasticsearch.yml, which didn't help
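
The version comparison from the first point can be sketched roughly as follows. This assumes the versions were collected into two maps (document ID to version) before and after the bulk call; how they are fetched is not shown, and VersionDiff and unchangedIds are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any Elasticsearch API.
final class VersionDiff {
    private VersionDiff() {}

    /** Returns the IDs whose version is identical before and after the bulk call. */
    static Set<String> unchangedIds(final Map<String, Long> before, final Map<String, Long> after) {
        final Set<String> unchanged = new TreeSet<>();
        for (final Map.Entry<String, Long> entry : before.entrySet()) {
            final Long newVersion = after.get(entry.getKey());
            // IDs missing from the "after" map are skipped rather than reported.
            if (newVersion != null && newVersion.equals(entry.getValue())) {
                unchanged.add(entry.getKey());
            }
        }
        return unchanged;
    }

    public static void main(final String[] args) {
        final Map<String, Long> before = new HashMap<>();
        final Map<String, Long> after = new HashMap<>();
        before.put("item-1", 3L);
        after.put("item-1", 4L);
        before.put("item-2", 7L);
        after.put("item-2", 7L);
        System.out.println("unchanged: " + unchangedIds(before, after));
    }
}
```

An empty result means every document received a new version, which is what we observed even for items whose stock value stayed stale.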

Here is the (shortened) code we use to update the items. Is there anything obvious that we're doing wrong here?

    private Set<String> updateEsItems(final List<Item> argItemList, final List<Pricelist> argPricelists, final Connection argConnection) throws IOException, SQLException
    {
        for (final Language language : LanguageUtils.getSupportedLanguages())
        {
            final List<ElasticsearchItem> esItemUpdateList = new ArrayList<>();
            for (final Item item : argItemList)
            {
                // Load the current document for this item, then overwrite its stock.
                final ElasticsearchItem esItem = documentApi.getById(item.getId(), null);
                esItem.setStock(ItemUtils.getStock(item));
                esItemUpdateList.add(esItem);
            }
            documentApi.updateBulkModel(esItemUpdateList);
        }
        // (rest of the method, including the return value, omitted for brevity)
    }

    public List<String> updateBulkModel(final List<T> argModels) throws IOException
    {
        final List<String> retVal = new ArrayList<>();
        final RestHighLevelClient client = ElasticsearchClientFactory.getInstance().getClient();
        final BulkRequest request = new BulkRequest();
        if (!argModels.isEmpty())
        {
            for (final T model : argModels)
            {
                if (model.getId() != null)
                {
                    request.add(new UpdateRequest().index(getIndexAlias()).id(String.valueOf(model.getId())).doc(new Gson().toJson(model), XContentType.JSON));
                }
            }
            final BulkResponse response = client.bulk(request, ElasticsearchUtils.requestOptions);
            if (response.hasFailures())
            {
                LOG.error("Error in updateBulkModel: " + response.buildFailureMessage(), null);
            }
            for (final BulkItemResponse res : response.getItems())
            {
                if (res.isFailed())
                {
                    // getResponse() is null for failed items, so read the ID from the item response itself.
                    LOG.error("Error in updateBulkModel ItemResponse - ItemId: " + res.getId() + " - Message: " + res.getFailureMessage(), null);
                }
                else
                {
                    retVal.add(res.getResponse().getId());
                }
            }
        }
        return retVal;
    }

Java High Level REST Client version: '7.7.0'

Elasticsearch version: '7.7.1'

Java Version: openjdk 11.0.9.1 2020-11-04

Os Version: Linux 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux

Index settings and mapping (shortened):

    {
      "settings": {
        "index": {
          "number_of_replicas": 0,
          "number_of_shards": 5,
          "max_ngram_diff": 50,
          "max_inner_result_window": 2147483647,
          "mapping": {
            "nested_objects": {
              "limit": 500000
            }
          }
        },
        "analysis": {
          "analyzer": {
            "search_analyzer": {
              "tokenizer": "standard",
              "filter": [
                "lowercase",
                "synonym_filter"
              ]
            },
            "normalize_analyzer": {
              "tokenizer": "standard",
              "char_filter": [
                "normalize_char_filter"
              ]
            }
          },
          "filter": {
            "synonym_filter": {
              "type": "synonym",
              "synonyms": []
            }
          },
          "char_filter": {
            "normalize_char_filter": {
              "type": "mapping",
              "mappings": [
                ": => ",
                ". => ",
                "- => ",
                "/ => "
              ]
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "Stock": {
            "type": "integer"
          }
        }
      }
    }
