Split filter on an array containing only one element

Hello,

I'm trying to split the result of an Elasticsearch aggregation into several events. Everything works fine when there are multiple buckets in the result. However, when there is only one bucket, the split filter no longer works.

Here is my configuration:

input {
  stdin {
    type => "indexMetric"
  }
}
filter {
  json {
    source => "message"
  }

  split {
    field => "aggregations[count_by_idx][buckets]"
    target => "metric"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

Given the following initial message (which corresponds to the result of a simple aggregation):

{
  "aggregations": {
    "count_by_idx": {
      "buckets": [
        {
          "key": "email",
          "doc_count": 512
        },
        {
          "key": "chat",
          "doc_count": 3
        }
      ]
    }
  }
}

I get the two expected events, each with a "metric" field:

"metric"=>{"key"=>"email", "doc_count"=>512}
"metric"=>{"key"=>"chat", "doc_count"=>3}

But with the following message (which has only one bucket):

{
  "aggregations": {
    "count_by_idx": {
      "buckets": [
        {
          "key": "email",
          "doc_count": 512
        }
      ]
    }
  }
}

In this case there is no split event, meaning that the split filter doesn't generate any event.

Am I facing a bug? Is this the desired behaviour? Do you know of any workaround that would let me handle this case?

(I am using Logstash 2.0.0)

Thanks in advance.

Pierre

I'm replying to myself to share the workaround I have found so far. It uses a ruby filter to get the number of buckets. If that number is one, a second ruby filter copies the single bucket into the event's metric field; otherwise the split filter is used as before.

input {
  stdin {
    type => "indexMetric"
  }
}
filter {
  json {
    source => "message"
  }

  # get the number of buckets
  ruby {
    code => "event['nb_buckets'] = event['aggregations']['count_by_idx']['buckets'].length"
  }

  if [nb_buckets] == 1 {
    # copy first bucket into metric field using ruby filter
    ruby {
      code => "event['metric'] = event['aggregations']['count_by_idx']['buckets'][0]"
    }
  } else {
    # otherwise use split filter
    split {
      field => "aggregations[count_by_idx][buckets]"
      target => "metric"
    }
  }

  # remove the temporary nb_buckets field
  mutate {
    remove_field => "nb_buckets"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
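
For anyone who wants to check the intended result outside Logstash, here is a minimal plain-Ruby sketch of what the workaround is meant to produce: one event per bucket, with the bucket copied into "metric", whether there is one bucket or several. The helper name split_buckets is hypothetical and not part of Logstash, and the aggregation name count_by_idx is assumed from the configuration above.

```ruby
require 'json'

# Hypothetical helper (not a Logstash API): parse the aggregation result and
# return one event hash per bucket, with that bucket stored under "metric".
# A single bucket is handled the same way as several buckets.
def split_buckets(message, agg_name = 'count_by_idx')
  doc = JSON.parse(message)
  # dig returns nil if any level is missing; fall back to an empty list.
  buckets = doc.dig('aggregations', agg_name, 'buckets') || []
  buckets.map { |bucket| doc.merge('metric' => bucket) }
end

single = '{"aggregations":{"count_by_idx":{"buckets":[{"key":"email","doc_count":512}]}}}'
split_buckets(single).each { |event| puts event['metric'].inspect }
```

Each returned hash still contains the original "aggregations" field, just as the events in the configuration above do.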

Let me know if you know a more appropriate way to do this.

Pierre

I have just run into a similar problem. Feels like a bug in split to me.

I think so... Same problem. Thanks for your solution.