Elasticsearch Java API 8.5.0: null fields are missing from the indexed document

Hello. We've been migrating from Elasticsearch 7 to 8.5 and discovered this odd behavior.
In Kibana I can index a test document with a null field value, and the field is present in the stored document.

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status": {
        "type":       "text"
      },
      "docType": {
        "type":       "text"
      }
    }
  }
}
PUT my-index-000001/_doc/1
{
  "status": null,
  "docType":"Contract"
}
GET my-index-000001/_doc/1

Field "status" is present in document 1 and has a null value:

{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "status": null,
    "docType": "Contract"
  }
}

When I index a second document with a null docType using the Java API, the document is indexed but the null field is left out:

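import co.elastic.clients.elasticsearch.ElasticsearchClient;
import lombok.Data;

// Lombok's @Data generates the getters and setters used below.
@Data
public class Document {
    private String status;
    private String docType;
}

// docType is never set, so it stays null.
Document doc = new Document();
doc.setStatus("new");

ElasticsearchClient esClient = getClient();
esClient.index(i -> i.index("my-index-000001").id("2").document(doc));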

This is what the document looks like in the index:

{
  "_index": "my-index-000001",
  "_id": "2",
  "_version": 1,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "status": "new"
  }
}

docType is missing.
I understand that null fields can't be searched, but can't they even be indexed and kept in the JSON? If that were the case, I would have expected the same behavior from Kibana and the Java API. Am I missing something in my Java code? Is it possible to include null fields in the indexed documents? I need a uniform JSON structure for all my indexed documents; searching on nullable fields is not needed.

I don't know Java or the client, but it sounds like the value isn't being included in the document that gets sent. Generally, Elasticsearch can't index a field, even a null one, if it isn't told about it.

Thank you for your quick response. That makes perfect sense, but it also means that the default object serializer used by the Elasticsearch Java API doesn't serialize nulls. That is a problem in our case, so for now our solution is to index JSON like so:

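import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import java.io.StringReader;

// serializeNulls() makes Gson write null fields instead of dropping them.
Gson gson = new GsonBuilder().serializeNulls().create();
// Gson already emits double-quoted JSON, so no quote replacement is needed.
String json = gson.toJson(object);
esClient.index(i -> i
        .index(indexName)
        .id(docId)
        .withJson(new StringReader(json)));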

It would be nice to have the option to tell Elasticsearch to serialize nulls.
Maybe there is such an option but I couldn't find it in the Java API documentation.
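
To make the difference concrete, here is a minimal sketch that does not touch the client at all. It is not how the Elasticsearch Java API serializes documents internally; NullFieldSketch and toJson are hypothetical names made up for illustration, and the helper only handles flat string fields without any escaping. It shows the two JSON bodies that can result from the same object, depending on whether the serializer keeps or drops nulls:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a hypothetical, hand-rolled serializer for flat string fields.
class NullFieldSketch {

    // Renders the map as a JSON object. When includeNulls is false,
    // null-valued entries are skipped, so the field never reaches the index.
    // Keys and values are not escaped; this is not a general JSON writer.
    static String toJson(Map<String, String> fields, boolean includeNulls) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (e.getValue() == null && !includeNulls) {
                continue; // field is dropped from the request body
            }
            if (!first) {
                sb.append(",");
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            if (e.getValue() == null) {
                sb.append("null");
            } else {
                sb.append('"').append(e.getValue()).append('"');
            }
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("status", "new");
        doc.put("docType", null);

        System.out.println(toJson(doc, false)); // {"status":"new"}
        System.out.println(toJson(doc, true));  // {"status":"new","docType":null}
    }
}
```

The first output matches the _source we see for document 2; the second is the shape we need, and it is what the Gson workaround above sends.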

Why is it a problem? Could you explain? I'm curious.

We have two systems that depend on each other. One indexes documents and uses the index for search (how it's supposed to be used). That system would have no problem with null values being left out of the index, but the other system uses the same index as a REST service to get documents by id. It expects a certain JSON document structure and fails if some fields are missing. This is the only reason we need the null fields. I can't really justify this kind of setup, but sadly we have to live with it and its limitations.

There was indeed an issue that caused null values to be filtered out. This has been fixed recently and will be part of version 8.5.1 that will be released in the next few days.

Thank you, good to know, we'll wait for v8.5.1.

Got it. Thanks for the explanation.
