Mapping http_poller JSON to an index


(A_Shelby) #1

This is a representation of a JSON array (widgets) I'm getting with the http_poller plugin...

 {
 "ABC_123":
      {"id": 7, "cost": 1.02, "sales_volume": 6}, 
 "3BT_8RT":
      {"id": 84, "cost": 0.025, "sales_volume": 7800}
 }

There are about 90 items (widgets) in the JSON array.
The IDs don't change, and there are actually 11 fields per widget; I shortened the example for brevity.
I also need to index the name (e.g. ABC_123) of each item. The names are unique and stay paired with the ID property.
Lastly, I want a datetime on every widget entry.

My PUT mapping...

PUT /widgetdata
{
  "mappings": {
    "widgets" : {
      "properties" : {
        "id" :       {"type" : "keyword"},
        "cost" :     {"type" : "float"},
        "sales_volume" :   {"type" : "float"},
        "datetime" : {"type" : "date", "format": "yyyyMMdd'T'HHmmss.SSSZ"}
       }
    }
  }
}

And finally, my logstash config file...

input {
    http_poller {
        urls => {
          ticker => "https://widgetsite.com/public?command=returnWidgets"
        }
        request_timeout => 60
        # Supports "cron", "every", "at" and "in" schedules by rufus scheduler
        schedule => { every => "5s" }
        codec => "json"
        # A hash of request metadata info (timing, response headers, etc.) will be sent here
        metadata_target => "http_poller_metadata"
     }
}
filter {
  grok { match => [ "message", "%{HTTPDATE:[@metadata][timestamp]}" ] }
  date { match => [ "[@metadata][timestamp]", "yyyyMMdd'T'HHmmss.SSSZ" ] }
}
output {
    stdout {
        codec => "rubydebug"
    }
    elasticsearch {
        hosts => "localhost"
        index => "widgets-%{+YYYY.MM.dd}"
        user => "elastic"
        password => "********"
    }
}

The error in the Logstash output says I'm exceeding the 1000-field limit.

04:52:39.324 [[main]>worker3] WARN logstash.outputs.elasticsearch - Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"widgets-2017.05.23", :_type=>"logs", :_routing=>nil}, 2017-05-23T04:52:38.707Z %{host} %{message}], :response=>{"index"=>{"_index"=>"widgets-2017.05.23", "_type"=>"logs", "_id"=>"AVwzplqg3CbsVytXtm70", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Limit of total fields [1000] in index [widgets-2017.05.23] has been exceeded"}}}}

I'm not sure why it won't use my ID; it appears to be assigning a new one.

My guess is that my mapping is incorrect, but after digging through the mapping API docs I can't figure out where I'm going wrong. I say guess because I have about 90 items in the JSON and 11 fields per item. 90 × 11 is 990, which would pass 1000 once any other fields are counted, if Elasticsearch or Logstash is creating a new field for every item parameter in the array. That would explain the error, but I'm very unsure how to fix it.
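As a rough check on that guess, here is a minimal Python sketch that counts the fields a dynamically mapped document of this shape would produce. It assumes that each widget name becomes an object field and that object fields count toward the limit alongside leaf fields. It ignores multi-fields and the http_poller_metadata fields, so the real total is likely higher. The widget and field names are hypothetical placeholders.

```python
def count_mapped_fields(doc):
    """Count every key in a nested dict: object fields plus leaf fields."""
    total = 0
    for value in doc.values():
        total += 1  # the key itself becomes a mapped field (object or leaf)
        if isinstance(value, dict):
            total += count_mapped_fields(value)  # descend into the object
    return total


# Hypothetical response shape: 90 widgets keyed by name, 11 fields each
response = {
    f"WIDGET_{i}": {f"field_{j}": 0 for j in range(11)}
    for i in range(90)
}

field_total = count_mapped_fields(response)
print(field_total)  # 90 object fields + 990 leaf fields = 1080
```

Under those assumptions a single polled response already needs more than 1000 mapped fields, which is consistent with the error above.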

Thanks for any help or suggestions!


(Jai Prakash Bhardwaj) #2

Hi,

You need to add

document_id => "%{_id}" 

to your elasticsearch output plugin so that your ID is used as the Elasticsearch document ID.

Hope this helps

Best,
J


(Christian Dahlqvist) #3

That looks more like a single JSON object where the ID is used as a key than an array. If you look at the stdout output, is this being indexed as a single and quite complex document?
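To illustrate the difference, here is a minimal Python sketch of the reshaping this implies: turning the one keyed object into one flat document per widget, with the key kept as a name field and a timestamp added. It is an illustration outside Logstash, not a pipeline config. The function names, the "name" field, and the fixed poll time are assumptions; the timestamp layout follows the "yyyyMMdd'T'HHmmss.SSSZ" format from the mapping in the first post.

```python
from datetime import datetime, timezone


def format_timestamp(moment):
    """Render a timezone-aware datetime as yyyyMMdd'T'HHmmss.SSSZ."""
    millis = moment.microsecond // 1000
    return moment.strftime("%Y%m%dT%H%M%S") + f".{millis:03d}" + moment.strftime("%z")


def split_widgets(response, polled_at):
    """Turn {name: {fields}} into a list of flat documents, one per widget."""
    docs = []
    for name, fields in response.items():
        doc = dict(fields)           # copy id, cost, sales_volume, ...
        doc["name"] = name           # keep the key (e.g. ABC_123) as a field
        doc["datetime"] = format_timestamp(polled_at)
        docs.append(doc)
    return docs


# Sample taken from the first post
response = {
    "ABC_123": {"id": 7, "cost": 1.02, "sales_volume": 6},
    "3BT_8RT": {"id": 84, "cost": 0.025, "sales_volume": 7800},
}
polled_at = datetime(2017, 5, 23, 4, 52, 38, 707000, tzinfo=timezone.utc)

docs = split_widgets(response, polled_at)
print(docs[0])
# {'id': 7, 'cost': 1.02, 'sales_volume': 6, 'name': 'ABC_123', 'datetime': '20170523T045238.707+0000'}
```

Each resulting document has the same handful of fields no matter how many widgets there are, so the number of mapped fields stays constant instead of growing with every widget name.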


(A_Shelby) #4

Thanks for the suggestion, I changed it to this...

output {
    stdout {
        codec => "rubydebug"
     }
     elasticsearch {
         hosts => "localhost"
         index => "widgets-%{+YYYY.MM.dd}"
         user => "elastic"
         password => "*********"
         document_id => "%{_id}"
     }
 }

But I get the same "Limit of total fields [1000] in index..." error.

