"Discover: Unable to parse/seralize body" due to truncated HTTP response

We currently have an Elastic Stack release 6.2.4 installed on our server. It was upgraded from 5.2 (via 5.6). I've been trying to search all entries where the field "session" begins with a specific UUID, using filters such as the following:

{
  "query": {
    "regexp": {
      "session": "767c0e43-70bd-4603-b1c7-ff719dd2f5b6.*"
    }
  }
}

But with this particular regular expression, Kibana throws the following exception:

Discover: Unable to parse/serialize body

Error: Unable to parse/serialize body
ErrorAbstract@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:111:151667
errors.Serialization@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:111:152992
respond@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:111:161112
checkRespForFailure@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:111:160796
AngularConnector.prototype.request/<@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:105:285482
processQueue@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:132456
scheduleProcessQueue/<@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:133349
$digest@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:144239
$apply@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:147007
done@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:100015
completeRequest@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:104697
createHttpBackend/</xhr.onload@http://[redacted]:5601/bundles/vendors.bundle.js?v=16627:58:105435

The filter is obviously well-formed JSON, and I've checked with a POST _search that the data it gets from Elasticsearch is well-formed too. The weird thing is that with another (expected) value in the regexp, Kibana displays the search result without any issue.

I've also looked in the Elasticsearch logs and used journalctl -u kibana.service, but I am at a loss as to why this particular filter fails.

EDIT: Maybe there is an alternative way to achieve what I want without triggering any "Unable to parse" exception?
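For reference, one possible alternative for "begins with" matching is a prefix query, which avoids regular-expression syntax entirely. This is only a sketch; whether it behaves identically to the regexp depends on how the session field is mapped (keyword vs. analyzed text):

```json
{
  "query": {
    "prefix": {
      "session": "767c0e43-70bd-4603-b1c7-ff719dd2f5b6"
    }
  }
}
```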

EDIT2: The HTTP response for the search seems to be truncated, and this is why the parsing fails. Now to find out why that is the case and how to fix it.

EDIT3: I found evidence of another use case where the JSON response was truncated in the same way, without using the regexp filter.

As a side note, this kind of exception should maybe indicate which part of the body is problematic. I'm not even sure whether the body is the query or the results or something else, for that matter.
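To illustrate the truncation theory from EDIT2 (a minimal sketch, not specific to Elasticsearch): a JSON body that is cut off mid-stream is no longer valid JSON, so any parser rejects it, which matches the "Unable to parse/serialize body" symptom.

```shell
# Simulate a truncated HTTP body: the full JSON parses, a cut-off copy does not.
printf '{"hits": {"total": 1, "hits": [{"_id": "abc"}]}}' > full.json
head -c 20 full.json > truncated.json   # cut the body off mid-stream
python3 -m json.tool full.json > /dev/null && echo "full body: parses"
python3 -m json.tool truncated.json > /dev/null 2>&1 || echo "truncated body: parse error"
```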

Hi @EldrosKandar,

I think the problem is with the body itself. Is there any chance I could take a look at the body that causes this error (redacted if needed)? It's really hard to guess otherwise.

Best,
Oleg

That's the thing I'm not sure about: what is the body in this case? Where can I find it?

I assume you use Kibana's Dev Tools -> Console, so ideally you'd run Wireshark and see what goes to and comes from Elasticsearch when you send this request from Kibana.

If that's not possible, can you open the Network tab in your browser's developer tools and see what's sent and returned for the api/console/proxy?path=_search.... requests?

I can do that, but as I said (without mentioning that it was in the Dev Console), when I did the POST call I didn't get any exception. It only happens in Discover mode with the regexp filter specified above.

That said, I will provide the information (redacted as needed), as it could be useful.

EDIT: This will take some time, as the HTTP response is quite big, so I need to be thorough with my redaction. But the first thing I quickly noticed is that the response sent to Kibana is truncated for some reason...

You mean you use session: 767c0e43-70bd-4603-b1c7-ff719dd2f5b6.* in the search bar when in the Discover tab (a regexp query)?

I mean I created a filter and edited the Query DSL of that filter to use "regexp" instead of "match".

So here is the typical structure of a search hit when I'm doing my search with the filter (redacted to remove any sensitive information):

{
	"_index": "index-2018.07.10",
	"_type": "doc",
	"_id": "QYgLg2QBCgLLC9RGZG1s",
	"_version": 1,
	"_score": null,
	"_source": {
		"session": "767c0e43-70bd-4603-b1c7-ff719dd2f5b6.p1531206912015",
		"pid": "JB5021",
		"source": "/opt/path/to/log/file.lgw",
		"sessionid": "Thread-1 (HornetQ-client-global-threads-36893813)",
		"beat": {
			"name": "testhost28867.novalocal",
			"version": "6.2.4"
		},
		"_parsehost": "testhost28867.novalocal",
		"class": "com.dummy.java.classpath2",
		"offset": 1577307,
		"module": "modulename",
		"prospector": {
			"type": "log"
		},
		"contextId": "767c0e43-70bd-4603-b1c7-ff719dd2f5b6.p1531206912015",
		"thread": "Thread-1 (HornetQ-client-global-threads-36893813)",
		"message": "This is my message.",
		"machine_host": "b6aio128867",
		"env": ["000", "000"],
		"tags": ["28867", "jboss"],
		"@timestamp": "2018-07-10T07:15:14.042Z",
		"instanceid": "TESTHOST",
		"context_rest": "contextId=767c0e43-70bd-4603-b1c7-ff719dd2f5b6.p1531206912015, sessionid=Thread-1 (HornetQ-client-global-threads-36893813), Module=modulename, env=000",
		"loglevel": "INFO",
		"_logformat": "@LGW800",
		"fields": {
			"pipeline": "lgw",
			"hostname": "testhost28867",
			"product": "OURPRODUCT",
			"provider": "openstack",
			"jenkins": {
				"name": "com.dummy.test.name",
				"buildno": 94
			},
			"instance": {
				"buildno": 28867
			}
		},
		"Module": "modulename",
		"user": ""
	},
	"fields": {
		"@timestamp": ["2018-07-10T07:15:14.042Z"]
	},
	"highlight": {
		"session": ["@kibana-highlighted-field@767c0e43-70bd-4603-b1c7-ff719dd2f5b6.p1531206912015@/kibana-highlighted-field@"],
		"fields.instance.buildno": ["@kibana-highlighted-field@28867@/kibana-highlighted-field@"]
	},
	"sort": [1531206914042]
}

When doing the same request directly, there are a few differences: the _version, fields.@timestamp, highlight, and sort fields are absent, and _score has a value instead of null.

But the most interesting thing is that when I'm doing the search in Kibana, the HTTP response is truncated, which is probably why it can't be parsed. But why is it truncated? Is there a configuration I need to change so this doesn't happen?

Okay, that's interesting, let's try to figure that out.

  • At what stage is the response truncated: when returned from Elasticsearch, or when returned from the Kibana server?
  • Is Kibana/Elasticsearch hosted behind a proxy that may truncate the response?
  • Do you have an approximate size of the truncated response? An approximate number could give us a hint.

That's a good question. With the browser's developer tools I can only know for sure that the response is truncated when it arrives in Kibana, but I don't know whether it is truncated by Elasticsearch beforehand or whether Kibana does the truncating upon receiving Elasticsearch's response.

Not to my knowledge. Those are hosted on Openstack instances (as it is a cluster), if that is relevant.

The order of the hits changes, but the size of the JSON answer is always the same: 31598 B, not counting any headers, as I simply copied the answer from the developer tools into a text file multiple times and compared the sizes.

Hey @EldrosKandar,

There is no news for now; I'm still trying to understand where the issue may come from. In the meantime, can you query Elasticsearch directly with the same request via curl or Dev Tools -> Console and see what the real size of the response should be?

Kibana doesn't limit the response size explicitly; maybe there is something on the Openstack side, maybe not.
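A sketch of such a direct check (the host, port, and query file are placeholders; adjust them to your environment). Piping through wc -c reports the exact byte count of the body Elasticsearch returns, bypassing Kibana entirely:

```shell
# Query Elasticsearch directly and measure the raw response body size.
curl -s -H 'Content-Type: application/json' \
     'http://localhost:9200/_search' \
     --data-binary @query.json | wc -c
```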

Thanks,
Oleg

So one thing I should have known: formatting the response JSON and saving it in a text file takes more disk space than saving it in a text file unformatted.

Anyway, the JSON response takes only 21,799 bytes, and the same each time. Also, I took the parameters generated by Kibana when it does its search and used them in the Dev Console, and there I also got a truncated answer at 21,799 bytes. (Previously I had only used a query with a match and a regexp clause within.)
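The size difference mentioned above is easy to verify offline (a sketch with a toy document): pretty-printing the same JSON adds indentation and newlines, so the formatted copy is always larger than the compact original.

```shell
# The same JSON document, compact vs. pretty-printed: only the byte count differs.
printf '{"a": 1, "b": [1, 2, 3], "c": "x"}' > compact.json
python3 -m json.tool compact.json > pretty.json
wc -c compact.json pretty.json   # the pretty-printed copy is larger
```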

For reference, here are the parameters:

POST _search
  {
	"highlight": {
		"pre_tags": ["@kibana-highlighted-field@"],
		"post_tags": ["@/kibana-highlighted-field@"],
		"fields": {
			"*": {}
		},
		"fragment_size": 2147483647
	},
	"version": true,
	"size": 5000,
	"sort": [{
		"@timestamp": {
			"order": "asc",
			"unmapped_type": "boolean"
		}
	}],
	"_source": {
		"excludes": []
	},
	"aggs": {
		"2": {
			"date_histogram": {
				"field": "@timestamp",
				"interval": "3h",
				"time_zone": "Europe/Berlin",
				"min_doc_count": 1
			}
		}
	},
	"stored_fields": ["*"],
	"script_fields": {},
	"docvalue_fields": ["@timestamp"],
	"query": {
		"bool": {
			"must": [{
				"query_string": {
					"analyze_wildcard": true,
					"default_field": "*",
					"query": "*"
				}
			}, {
				"bool": {
					"must": [{
						"match": {
							"fields.instance.buildno": "28867"
						}
					}, {
						"regexp": {
							"session": "767c0e43-70bd-4603-b1c7-ff719dd2f5b6.*"
						}
					}]
				}
			}, {
				"range": {
					"@timestamp": {
						"gte": 1530602114883,
						"lte": 1531293314883,
						"format": "epoch_millis"
					}
				}
			}],
			"filter": [],
			"should": [],
			"must_not": []
		}
	}
}

Hey @azasypkin,

Since the truncation of the response also happens during a Dev Console call, isn't the issue on the Elasticsearch side? Should we move this discussion there, or should I create a new topic? Or is it fine to keep the discussion here, since the search query is as Kibana generates it?

Also, it might be relevant that, as I said, our Elastic Stack is 6.2.4, but we only recently made the upgrade from 5.2 via 5.6 to 6.2.4; maybe the issue is due to a remnant of the old system.

Hi @EldrosKandar,

Since the truncation of the response also happens during a Dev Console call, is the issue not on the ElasticSearch side?

Can you please issue that request with curl to Elasticsearch directly to see whether it returns a truncated response as well? If it does, then we'll transfer this topic to the Elasticsearch category and maybe they can suggest something. If not, and it's a Kibana-only problem, I'd like to ask you to file a Kibana bug report so that we can triage and investigate it.

It really feels like "something" between browser and Kibana or Kibana and Elasticsearch truncates part of the response.

Thanks!
Oleg

Hi @azasypkin

So curling gave me an untruncated response, so I've begun to file a bug, but I'm stuck trying to document how to reproduce the issue.

I've saved the output of the curl call, and I've created a test index with the same mappings and number of shards as our production index, but I'm not sure how I could feed it the data while preserving its "integrity", i.e. being sure that it gives me the same output.
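As a sketch (the destination index name is a placeholder), the _reindex API can copy documents from the production index into the test index without altering their _source, which may help keep the data intact for reproduction:

```json
POST _reindex
{
  "source": { "index": "index-2018.07.10" },
  "dest": { "index": "test-reproduction-index" }
}
```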

I've opened an issue: https://github.com/elastic/kibana/issues/20820

Hi @azasypkin

I've documented the issue with steps on how to reproduce it, if you want to have a look. Otherwise, let's wait and see what the answer to the issue will be.

Hi @EldrosKandar,

Thanks for filing the issue! Let's wait until it gets triaged.

Hi @azasypkin

Thank you for the labelling. That said, is there anything I can do to speed things up? Don't get me wrong, I'm aware that the team has a lot of work, but if there is anything I can do to draw attention to my issue in a constructive way, or maybe take on a bit of the workload for the issue, I would be happy to follow any pointers.

For example, would it be useful to use Wireshark to gather more information, or would it not capture anything useful, given that the issue seems to be in the communication between Kibana and Elasticsearch?