Mapper parsing exception periods in field name

(Mike Sparr) #1

I have the latest version of packetbeat running and sending beats to Kafka. I then consume topics with Logstash and index in Elasticsearch 2.3.x

I keep seeing mapper_parsing_exception warnings in the logs, and since Beats itself creates the fields and field names, I would not expect them to conflict with ES. It looks like the request headers for the 'http' monitoring are causing the issue.

    :response=>{"create"=>{"_index"=>"packetbeat-2016.06.02", "_type"=>"http", "_id"=>"AVUS8RpdQfS6Ytb3qgow", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Field name [_hp2_id.1646711747] cannot contain '.'"}}}, :level=>:warn}

Here is my config in packetbeat.yml file:

    # Configure the ports where to listen for HTTP traffic. You can disable
    # the HTTP protocol by commenting out the list of ports.
    ports: [80, 443, 8080, 8000, 5000, 8002]

    # Uncomment the following to hide certain parameters in URL or forms attached
    # to HTTP requests. The names of the parameters are case insensitive.
    # The value of the parameters will be replaced with the 'xxxxx' string.
    # This is generally useful for avoiding storing user passwords or other
    # sensitive information.
    # Only query parameters and top level form parameters are replaced.
    # hide_keywords: ['pass', 'password', 'passwd']
    send_headers: ["User-Agent"]
    #split_cookie: true
    real_ip_header: "X-Forwarded-For"

The error seems to occur when tracking HTTP requests to a /_cluster/state URI, and I'm unsure whether the offending field is stored by an old Spring Framework app we have running or is a data point generated by Beats itself. It looks like a cookie: perhaps Beats parses cookies into their own fields, so any cookie it captures with a "." in its name will fail on ES import? This seems very fragile, as you often don't control cookie field names.

    {:timestamp=>"2016-06-02T15:07:19.865000-0600", :message=>"Failed action. ", :status=>400, :action=>["index", {:_id=>nil, :_index=>"packetbeat-2016.06.02", :_type=>"http", :_routing=>nil}, #<LogStash::Event:0xafac6e1 @metadata_accessors=#<LogStash::Util::Accessors:0x3c55466 @store={}, @lut={}>, @cancelled=false, @data={"status"=>"OK", "responsetime"=>26, "path"=>"/_cluster/state", "query"=>"GET /_cluster/state", "bytes_out"=>545153, "bytes_in"=>1189, "client_ip"=>"<SNIP />", "client_proc"=>"", "port"=>80, "proc"=>"", "direction"=>"in", "type"=>"http", "beat"=>{"hostname"=>"app1", "name"=>"app1"}, "@timestamp"=>"2016-06-02T21:07:17.977Z", "method"=>"GET", "params"=>"", "http"=>{"code"=>200, "content_length"=>544980, "phrase"=>"OK", "request_headers"=>{"cookie"=>{"__utma"=>"7540132.187868129.1427997271.1459374207.1462899186.33", "__utmz"=>"7540132.1459199967.30.6.utmcsr=<SNIP />|utmccn=(referral)|utmcmd=referral|utmcct=/<SNIP />/index.html", "_hp2_id.1646711747"=>"8905883439154284.1499744626.1651169981", "_hp2_props.1646711747"=>"%7B%22account_type_atm%22%3A%22customer%22%2C%22plan_type_atm%22%3A3%2C%22plan_name_atm%22%3A%22Plus%22%2C%22is_trial_atm%22%3Afalse%2C%22max_agents_atm%22%3A3%2C%22bc_type_atm%22%3A1%2C%22is_mobile%22%3Afalse%2C%22tta_in_days%22%3A1232.09%2C%22tta_in_hours%22%3A29570.16%2C%22tta_in_minutes%22%3A1774209.9%2C%22tta_user_in_days%22%3A196.22%2C%22tta_user_in_hours%22%3A4709.22%2C%22tta_user_in_minutes%22%3A282553.07%7D"}
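One possible workaround before anything changes in Beats itself: rename the offending keys in the Logstash pipeline before indexing (the `de_dot` filter plugin does something like this, or a `ruby` filter can). Below is a minimal sketch of the de-dotting logic in plain Ruby; the `de_dot` function name and the sample cookie keys are just illustrative, not part of Beats or Logstash:

```ruby
# Recursively replace '.' with '_' in hash keys, since
# Elasticsearch 2.x rejects field names containing dots.
def de_dot(obj)
  return obj unless obj.is_a?(Hash)
  obj.each_with_object({}) do |(k, v), out|
    out[k.to_s.tr('.', '_')] = de_dot(v)
  end
end

cookies = {
  "__utma"             => "7540132.187868129",
  "_hp2_id.1646711747" => "8905883439154284.1499744626"
}

puts de_dot(cookies).keys.inspect
# → ["__utma", "_hp2_id_1646711747"]
```

Note this only renames keys; the dotted values (which are fine in ES) are left untouched.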

(ruflin) #2

@mikesparr It seems to be an issue with a dot in the field name. Elasticsearch 5.0.0-alpha3 actually brought support for dots in field names back. Looking at the content above, it looks like a special case of header/cookie information.

@monica What do you suggest in such cases? Should we escape on our side?

(Mike Sparr) #3

Thanks @ruflin for your response. Yes, when you enable cookie and header capture and do not control 3rd-party systems, there is a chance they use dot notation in their cookie or header fields, which then breaks Beats. It's nice to be able to reference them, but I'm unsure how to address the scenario we ran into.

My only resort for 2.3.x appears to be to disable the headers, which is a bummer because there is valuable data in there, or to somehow store them as a text blob and parse them later.
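For the text-blob idea, one hedged sketch (plain Ruby, as the same logic could live in a Logstash `ruby` filter; the `cookies_to_blob` helper and the event layout are assumptions based on the log above, not an official API): serialize the parsed cookie hash into a single JSON string, so Elasticsearch maps it as one text value instead of many dotted sub-fields, and it can still be re-parsed later.

```ruby
require 'json'

# Collapse the parsed cookie hash into one JSON string field so that
# Elasticsearch 2.x never sees the dotted cookie names as field names.
def cookies_to_blob(event)
  cookies = event.dig("http", "request_headers", "cookie")
  return event unless cookies.is_a?(Hash)
  event["http"]["request_headers"]["cookie"] = cookies.to_json
  event
end

event = {
  "http" => {
    "request_headers" => {
      "cookie" => { "_hp2_id.1646711747" => "8905883439154284" }
    }
  }
}

blob = cookies_to_blob(event)["http"]["request_headers"]["cookie"]
puts blob.class  # → String
```

The trade-off is that the cookie keys are no longer directly searchable in ES, but nothing is dropped: `JSON.parse` recovers the original hash, dots and all.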

(ruflin) #4

As this could also happen in other cases, I think we should do something about it, especially as it prevents the whole event from being indexed. Could you open a GitHub issue for this here?

(Mike Sparr) #5

Opened a GitHub issue and referenced this discussion. If anything more is needed in the issue, please add whatever you think is relevant for your team. Thanks for attending to this!

(ruflin) #6

Thanks, we will continue the discussion on the Github issue.

(system) #7

This topic was automatically closed after 21 days. New replies are no longer allowed.