Coerce object to String

I'm currently ingesting documents that for the most part are very structured. However, there are two fields in the document that contain JSON documents. With dynamic mapping, ES is treating each key of the JSON document as a field. As each JSON document has different keys, this is leading to a mapping explosion that affects Kibana and ES search performance. To give you some idea of the scale, the mapping JSON for this index is 1.5 MB.

I'm trying to write a template that will treat the entire JSON document as a string:

{
  "type": "string",
  "coerce": true
}

but ES doesn't like this. I could use {"type": "object", "enabled": false} but that's not exactly what I'm looking for. That keeps the field as part of the document, but it's not searchable. I want full-text search on the JSON document as a string. I don't need to be able to search by JSON.someKey.someNestedKey. Is this possible?

Nope, it's not unfortunately. =( ES will always try to treat the JSON as an actual JSON object.

You'll have to preprocess your documents somehow and serialize the JSON into a string, either in your application or perhaps with something like Logstash.
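
For the application-side route, here is a minimal sketch in Python of what that preprocessing could look like. The function name `stringify_json_fields` and the field name `payload` are made up for illustration, and it assumes your documents are plain dicts before you send them to ES. It only rewrites values that are objects or arrays and leaves everything else alone.

```python
import json


def stringify_json_fields(doc, fields):
    """Return a shallow copy of doc with the named fields serialized to JSON strings.

    `fields` is a hypothetical list of the field names that hold free-form JSON.
    """
    out = dict(doc)
    for name in fields:
        value = out.get(name)
        # Only serialize objects and arrays; strings and missing fields are left as-is.
        if isinstance(value, (dict, list)):
            out[name] = json.dumps(value, sort_keys=True)
    return out


# Example document; "payload" stands in for one of the free-form JSON fields.
doc = {
    "user": "alice",
    "payload": {"someKey": {"someNestedKey": "value"}},
}

flat = stringify_json_fields(doc, ["payload"])
print(flat["payload"])
```

After this step the field reaches ES as a single string value, so it can be mapped as an ordinary string field and the nested keys never show up in the mapping.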

Would there be interest in a PR? I would think others have this same problem. Perusing XContentParser, it doesn't look that complicated.

Maybe? To be honest, I'm not super familiar with the parsing code and the roadmap there. I know a lot of work has gone into simplifying it and making it more consistent. This may take it in the opposite direction, as it makes the parsing more ambiguous.

I'd open an issue first, instead of a PR, to gauge interest from devs more involved in that part of the code. That way you won't waste time on a PR if there is strong resistance.

Assuming it's a desired feature though, I'm sure a PR would be very appreciated! :)

Created issue: https://github.com/elastic/elasticsearch/issues/19691

IMO it seems more consistent to support coerce on as many data types as possible. Why limit it only to String -> Numbers?