jakarta.json.stream.JsonParsingException when deserializing data retrieved from Elasticsearch

For some time now I have been trying to incorporate the Elastic Java 8.10 client into my code. I have a big ELK stack with a lot of data and I am trying to continuously fetch data from it, but I am encountering some inconsistent behavior.

My code follows the Elastic tutorial on their website, and I have a simple function to get the data:

    /**
     * Method for querying a search in ELK using search_after.
     * @param search_size number of hits to retrieve per page
     * @param index index to search in
     * @param sort_options sort options used for search_after pagination
     * @param search_after sort values of the last hit of the previous page, or null for the first page
     * @param field field to filter on
     * @param value value the field has to match
     * @return SearchResponse
     * @throws IOException on failed search
     */
    public final SearchResponse<JsonData> querySearch(final int search_size,
            final String index, final List<SortOptions> sort_options,
            final List<FieldValue> search_after, final String field,
            final String value) throws IOException {
        SearchResponse<JsonData> result = this.el_client.search(s -> {
            s.index(index)
                .size(search_size)
                .sort(sort_options)
                .query(q -> q
                    .term(t -> t
                        .field(field)
                        .value(v -> v.stringValue(value))
                    ));
            // only add search_after once the sort values of a previous page are available
            if (search_after != null && !search_after.isEmpty()) {
                s.searchAfter(search_after);
            }
            return s;
        },
        JsonData.class);

        return result;
    }
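
For context, this is roughly how I drive the retrieval loop around that method (the index name, field, value and sortOptions below are placeholders, not my real ones):

        // simplified retrieval loop: page through the index with search_after
        // (Hit is co.elastic.clients.elasticsearch.core.search.Hit)
        List<FieldValue> searchAfter = null;
        while (true) {
            SearchResponse<JsonData> page = querySearch(
                    100, "my-index", sortOptions, searchAfter, "some.field", "some-value");
            List<Hit<JsonData>> hits = page.hits().hits();
            if (hits.isEmpty()) {
                break; // no more results
            }
            // ... process the hits ...
            // the sort values of the last hit become the search_after of the next page
            searchAfter = hits.get(hits.size() - 1).sort();
        }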

The strange thing is that it works fine for some time, but then I always receive the following error:

Error deserializing co.elastic.clients.elasticsearch.core.search.Hit: jakarta.json.stream.JsonParsingException: Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in name

It is inconsistent, as it happens at random entries retrieved from Elasticsearch. I looked at the previous entries that were retrieved and it is never the same time window (I am using the @timestamp field to cross-reference which entries might provoke the error).

Here is a GitHub gist with the stack trace: gist:a28bd2e4b4732ed25837b4fdddf2468d

At this point I have the feeling that the error does not come from my code, but from how the Elastic API sends the data and how the raw JSON is transformed into JsonData. I tried using Object instead of JsonData, but the same thing happens.
Is there a way to sanitize the JSON at the moment of fetching it, or any other way to correct this strange behavior?

This is indeed rather strange: the error indicates that the JSON content sent by the server is malformed.

Can you log the SearchRequest that causes these errors, so that you can replay them in the Kibana console and examine the JSON returned by Elasticsearch? JsonpUtils.toJsonString() can be used for that.

Thank you for the reply.

How should I log the SearchRequest and the response? Is there a special method on the SearchRequest that I can call, providing the specific stream to output to?

You have to refactor the code a bit:

        SearchResponse<JsonData> result = this.el_client.search(s -> s
            .index(index)
            .size(search_size)
            ...
        return result;

has to become something like:

        SearchRequest request = SearchRequest.of(s -> s
            .index(index)
            .size(search_size)
            ...

        try {
            SearchResponse<JsonData> result = this.el_client.search(request, JsonData.class);
            return result;
        } catch (JsonpMappingException e) {
            // serialize the request that failed, so it can be replayed in the Kibana console
            JsonpMapper mapper = this.el_client._transport().jsonpMapper();
            String requestJson = JsonpUtils.toJsonString(request, mapper);
            logger.error("Decoding error. Request is: " + requestJson, e);
            throw e;
        }

This should log the requests that caused a failure, which you can then copy/paste in the Kibana developer console to see what the response looks like.
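
If replaying the request in the Kibana console turns out to be awkward, another option is to fetch the raw response body with the low-level REST client and log it without going through the JSON mapping at all. This is just a rough sketch: it assumes you still have a reference to the RestClient instance the transport was built from (called lowLevelClient here) and reuses the requestJson string from the catch block above:

        // uses org.elasticsearch.client.Request/Response and org.apache.http.util.EntityUtils
        // send the same search as a plain HTTP request and log the unparsed body
        Request rawRequest = new Request("POST", "/" + index + "/_search");
        rawRequest.setJsonEntity(requestJson);
        Response rawResponse = lowLevelClient.performRequest(rawRequest);
        String rawBody = EntityUtils.toString(rawResponse.getEntity());
        logger.error("Raw response body: " + rawBody);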

So I adapted my code as you suggested and ran the program. As stated previously, Elastic is holding a large quantity of records, so I will probably need to let it run for multiple hours.
The strange thing is that the point at which the mapping error occurs varies wildly, so I don't know exactly when it might crash.
I tried to use a range query to search for entries in a time window where the errors previously occurred, but nothing so far. Once I have the failing request and have tested it in the Kibana Developer Console, I will come back to you with the results.
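
For reference, this is roughly what that range query attempt looks like (the @timestamp bounds below are just an example window, not the real one):

        // same search as above, but restricted to a suspect @timestamp window
        SearchResponse<JsonData> window = this.el_client.search(s -> s
            .index(index)
            .size(search_size)
            .sort(sort_options)
            .query(q -> q
                .range(r -> r
                    .field("@timestamp")
                    .gte(JsonData.of("2023-10-17T10:00:00Z"))  // example lower bound
                    .lte(JsonData.of("2023-10-17T11:00:00Z"))  // example upper bound
                )),
            JsonData.class);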

Also, while I am staring at the screen wondering when the program will crash, do you think the number of records retrieved at once might have an influence? Originally I was retrieving the maximum number of records (100) and then using searchAfter to continue the retrieval. Now I have switched to 5, for legibility reasons, and I haven't encountered any problems so far.