How to handle JSON arrays so that they're searchable in Kibana?

We’ve JSON documents that contain arrays.

# Doc 1
{
    "test": {
        "steps": [{
            "response_time": "100"
        }, {
            "response_time": "101"
        }]
    }
}

# Doc 2
{
    "test": {
        "steps": [{
            "response_time": "101"
        }, {
            "response_time": "100"
        }]
    }
}

Now, I want to search for documents which have response_time=101 in the second element of test.steps.
If I write a query like this:- (query_string type is what Kibana uses)

{
    "query": {
        "query_string": {
           "query": "test.steps.response_time:101"
        }
    }
}

It will match both documents and there is no way to specify which element I should search on. This is because Elasticsearch flattens the json internally. I wish ES had this feature but I can’t write a query like test.steps[1].response_time:101 and expect it to work.

A way to solve this is writing a filter in logstash that converts all arrays into flat documents while preserving the ordering. For ex,

Convert:

{
    "test": {
        "steps": [{
            "response_time": "100"
        }, {
            "response_time": "101"
        }]
    }
}

to this:

{
    "test": {
        "steps1": { "response_time": "100" },
        "steps2": { "response_time": "100" }
    }
}

This seems like a common problem that other people might have faced. Do we have a filter already built to handle this?

I came up with a solution myself. Slightly different than what I initially thought but it works better.

ruby {
    init => "
        def arrays_to_hash(h)
          h.each do |k,v|
            # If v is nil, an array is being iterated and the value is k.
            # If v is not nil, a hash is being iterated and the value is v.
            value = v || k
            if value.is_a?(Array)
                # "value" is replaced with "value_hash" later.
                value_hash = {}
                value.each_with_index do |v, i|
                    value_hash[i.to_s] = v
                end
                h[k] = value_hash
            end

            if value.is_a?(Hash) || value.is_a?(Array)
              arrays_to_hash(value)
            end
          end
        end
      "
      code => "arrays_to_hash(event.to_hash)"
}

More details: http://blog.abhijeetr.com/2016/11/logstashelasticsearch-best-way-to.html

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.