Elasticsearch errors on data containing escape characters


(sun_changlong) #1

The data is sent to Elasticsearch via Logstash. The data contains escape characters, and Elasticsearch reports an error after receiving it.

This is the event sent by Logstash:

{
"message" => "id="ngtos" version="1.0" time="2017-05-08 01:07:53" dev="1235668" pri="6" type="ddos_clean" recorder="ads" vsid="0" sub_type=attacklog dst_addr=1.1.1.1 zonename=1504152157558 grpname=test attack_status=begin src_addr=128.18.74.44;128.19.75.33 service= protocol_4=TCP dst_port=5060 attack_type="http-flood" defense_method="http-source-auth" cur_cfg_value=1 cfg_value_unit=pps total_packets=1500 attack_packets=1000 total_bytes=12590 attack_bytes=1240 action=drop attack_msgs="http-flood" backup1=0 backup2= backup3= client_addr=1.1.1.1 ip_flow_b=2000 cut_ip_flow_b=500 tcp_flow_b=2000 cut_tcp_flow_b=200 dns_flow_b=1000 cut_dns_flow_b=100 http_flow_b=2000 cut_http_flow_b=200 sip_flow_b=3000 cut_sip_flow_b=300 ipfrag_flow_b=2000 cut_ipfrag_flow_b=100",
"offset" => 751,
"prospector" => {
"type" => "log"
},
"@version" => "1",
"tags" => [
[0] "beats_input_codec_plain_applied",
[1] "_grokparsefailure"
],
"source" => "/home/filebeat/testlog/aads_new.log",
"@timestamp" => 2018-07-02T08:55:35.974Z,
"input" => {
"type" => "log"
},
"beat" => {
"hostname" => "n",
"name" => "n",
"version" => "6.3.0"
},
"host" => {
"name" => "n"
}
}

This is the error message reported by Elasticsearch after receiving it:

[2018-07-02T16:57:45,620][DEBUG][o.e.a.b.TransportShardBulkAction] [logstash-2018.07.02][2] failed to execute bulk item (index) BulkShardRequest [[logstash-2018.07.02][2]] containing [index {[logstash-2018.07.02][doc][GL42WmQBBKeqWE53TJHN], source[{"message":"id=\"ngtos\" version=\"1.0\" time=\"2017-05-08 01:07:53\" dev=\"1235668\" pri=\"6\" type=\"ddos_clean\" recorder=\"ads\" vsid=\"0\" sub_type=attacklog dst_addr=1.1.1.1 zonename=1504152157558 grpname=test attack_status=begin src_addr=128.18.74.44;128.19.75.33 service= protocol_4=TCP dst_port=5060 attack_type=\"http-flood\" defense_method=\"http-source-auth\" cur_cfg_value=1 cfg_value_unit=pps total_packets=1500 attack_packets=1000 total_bytes=12590 attack_bytes=1240 action=drop attack_msgs=\"http-flood\" backup1=0 backup2= backup3= client_addr=1.1.1.1 ip_flow_b=2000 cut_ip_flow_b=500 tcp_flow_b=2000 cut_tcp_flow_b=200 dns_flow_b=1000 cut_dns_flow_b=100 http_flow_b=2000 cut_http_flow_b=200 sip_flow_b=3000 cut_sip_flow_b=300 ipfrag_flow_b=2000 cut_ipfrag_flow_b=100","offset":751,"prospector":{"type":"log"},"@version":"1","tags":["beats_input_codec_plain_applied","_grokparsefailure"],"source":"/home/filebeat/testlog/aads_new.log","@timestamp":"2018-07-02T08:55:35.974Z","input":{"type":"log"},"beat":{"hostname":"n","name":"n","version":"6.3.0"},"host":{"name":"n"}}]}]

org.elasticsearch.index.mapper.MapperParsingException: failed to parse [host]
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:302) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:481) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:496) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:390) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:380) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:95) ~[elasticsearch-6.3.0.jar:6.3.0]
.......


(David Pilato) #2

Please don't post images of text as they are hardly readable and not searchable.

Instead, paste the text and format it with the </> icon. Check the preview window.


(sun_changlong) #3

Thank you for your suggestion; I followed it and replaced the images with text. Could you help me look into the problem?


(David Pilato) #4

I can't see any host field in your message.
I'm not sure the sample you shared is exactly what is sent to Elasticsearch. Is it?


(sun_changlong) #5

I only sent one line of log data; the log content does not necessarily include a host field, yet the error says the host field is the problem, and I don't understand why.
Now the data shown by Logstash is exactly what Elasticsearch receives, and the error still appears.

This is the Logstash pipeline configuration:

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["http://192.168.33.85:9200"] }
  stdout { codec => rubydebug }
}

(sun_changlong) #6

Hi dadoonet. The information about the host was missing because the data I uploaded earlier was incomplete. I have now uploaded the complete error message. I hope you can help me find the problem. Thank you very much.


(David Pilato) #7

What is the mapping for this index?


(sun_changlong) #8

I just read a log file through Filebeat and send it to Logstash. Logstash automatically adds host and some other fields, and sends the assembled event to Elasticsearch, which reports an error when it receives the message.

There is no additional processing in between. The configuration is as follows:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /home/filebeat/testlog/*.log
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3
setup.kibana:
output.logstash:
  hosts: ["192.168.33.85:5044"]

Logstash pipeline configuration:

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["http://192.168.33.85:9200"] }
  stdout { codec => rubydebug }
}

elasticsearch.yml:

cluster.name: my-application
node.name: node-1
node.attr.rack: r1
path.data: /home/elasticsearch/data
path.logs: /home/elasticsearch/logs
network.host: 192.168.33.85
http.port: 9200

Everything runs on the same host. I'm sorry, I don't understand what the mapping of this index is.


(David Pilato) #9
GET YOURINDEXNAME/_mapping

YOURINDEXNAME is probably something like logstash-....


(sun_changlong) #10

I just sent the log again to check the index and found a warning message in Logstash. The content is as follows:

 [2018-07-03T15:45:31,060][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2018.07.03", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x4a377894>], :response=>{"index"=>{"_index"=>"logstash-2018.07.03", "_type"=>"doc", "_id"=>"cL0aX2QBE8Yn7rwIhJjF", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [host]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:9"}}}}}
    {
              "host" => {
            "name" => "server2"
        },
            "source" => "/home/testlog/aads_new.log",
              "beat" => {
                "name" => "server2",
            "hostname" => "server2",
             "version" => "6.3.0"
        },
              "tags" => [
            [0] "beats_input_codec_plain_applied",
            [1] "_grokparsefailure"
        ],
             "input" => {
            "type" => "log"
        },
        "prospector" => {
            "type" => "log"
        },
            "offset" => 576,
           "message" => "grpname=test attack_status=begin src_addr=128.18.74.44;128.19.75.33 service= protocol_4=TCP dst_port=5060 attack_type=\"http-flood\" defense_method=\"http-source-auth\" cur_cfg_value=1 cfg_value_unit=pps total_packets=1500 attack_packets=1000 total_bytes=12590 attack_bytes=1240 action=drop attack_msgs=\"http-flood\" backup1=0 backup2= backup3= client_addr=1.1.1.1 ip_flow_b=2000 cut_ip_flow_b=500 tcp_flow_b=2000 cut_tcp_flow_b=200 dns_flow_b=1000 cut_dns_flow_b=100 http_flow_b=2000 cut_http_flow_b=200 sip_flow_b=3000 cut_sip_flow_b=300 ipfrag_flow_b=2000 cut_ipfrag_flow_b=100",
          "@version" => "1",
        "@timestamp" => 2018-07-03T07:43:14.688Z
    }

Using curl 192.168.33.85:9200/_mapping, the result is as follows. I am sorry, I did not see any helpful information in it.

{"logstash-2018.07.02":{"mappings":{"doc":{"dynamic_templates":[{"message_field":{"path_match":"message","match_mapping_type":"string","mapping":{"norms":false,"type":"text"}}},{"string_fields":{"match":"*","match_mapping_type":"string","mapping":{"fields":{"keyword":{"ignore_above":256,"type":"keyword"}},"norms":false,"type":"text"}}}],"properties":{"@timestamp":{"type":"date"},"@version":{"type":"keyword"},"geoip":{"dynamic":"true","properties":{"ip":{"type":"ip"},"latitude":{"type":"half_float"},"location":{"type":"geo_point"},"longitude":{"type":"half_float"}}},"host":{"type":"text","norms":false,"fields":{"keyword":{"type":"keyword","ignore_above":256}}},"message":{"type":"text","norms":false},"tags":{"type":"text","norms":false,"fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"_default_":{"dynamic_templates":[{"message_field":{"path_match":"message","match_mapping_type":"string","mapping":{"norms":false,"type":"text"}}},{"string_fields":{"match":"*","match_mapping_type":"string","mapping":{"fields":{"keyword":{"ignore_above":256,"type":"keyword"}},"norms":false,"type":"text"}}}],"properties":{"@timestamp":{"type":"date"},"@version":{"type":"keyword"},"geoip":{"dynamic":"true","properties":{"ip":{"type":"ip"},"latitude":{"type":"half_float"},"location":{"type":"geo_point"},"longitude":{"type":"half_float"}}}}}}},
"filebeat-6.3.0-2018.07.03":{"mappings":{"doc":{"properties":{"@timestamp":{"type":"date"},"@version":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"beat":{"properties":{"hostname":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"version":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"host":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"input":{"properties":{"type":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"message":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"offset":{"type":"long"},"prospector":{"properties":{"type":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"source":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"tags":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}},
"logstash-2018.07.03":{"mappings":{"_default_":{"dynamic_templates":[{"message_field":{"path_match":"message","match_mapping_type":"string","mapping":{"norms":false,"type":"text"}}},{"string_fields":{"match":"*","match_mapping_type":"string","mapping":{"fields":{"keyword":{"ignore_above":256,"type":"keyword"}},"norms":false,"type":"text"}}}],"properties":{"@timestamp":{"type":"date"},"@version":{"type":"keyword"},"geoip":{"dynamic":"true","properties":{"ip":{"type":"ip"},"latitude":{"type":"half_float"},"location":{"type":"geo_point"},"longitude":{"type":"half_float"}}}}},"doc":{"dynamic_templates":[{"message_field":{"path_match":"message","match_mapping_type":"string","mapping":{"norms":false,"type":"text"}}},{"string_fields":{"match":"*","match_mapping_type":"string","mapping":{"fields":{"keyword":{"ignore_above":256,"type":"keyword"}},"norms":false,"type":"text"}}}],"properties":{"@timestamp":{"type":"date"},"@version":{"type":"keyword"},"geoip":{"dynamic":"true","properties":{"ip":{"type":"ip"},"latitude":{"type":"half_float"},"location":{"type":"geo_point"},"longitude":{"type":"half_float"}}},"host":{"type":"text","norms":false,"fields":{"keyword":{"type":"keyword","ignore_above":256}}},"message":{"type":"text","norms":false},"tags":{"type":"text","norms":false,"fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}},
"%{[@metadata][beat]}-%{[@metadata][version]}-2018.07.03":{"mappings":{"doc":{"properties":{"@timestamp":{"type":"date"},"@version":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"host":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"message":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}



(sun_changlong) #11
GET logstash-2018.07.03/_mapping

The result is as follows:

{
  "logstash-2018.07.03": {
    "mappings": {
      "_default_": {
        "dynamic_templates": [
          {
            "message_field": {
              "path_match": "message",
              "match_mapping_type": "string",
              "mapping": {
                "norms": false,
                "type": "text"
              }
            }
          },
          {
            "string_fields": {
              "match": "*",
              "match_mapping_type": "string",
              "mapping": {
                "fields": {
                  "keyword": {
                    "ignore_above": 256,
                    "type": "keyword"
                  }
                },
                "norms": false,
                "type": "text"
              }
            }
          }
        ],
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "@version": {
            "type": "keyword"
          },
          "geoip": {
            "dynamic": "true",
            "properties": {
              "ip": {
                "type": "ip"
              },
              "latitude": {
                "type": "half_float"
              },
              "location": {
                "type": "geo_point"
              },
              "longitude": {
                "type": "half_float"
              }
            }
          }
        }
      },
      "doc": {
        "dynamic_templates": [
          {
            "message_field": {
              "path_match": "message",
              "match_mapping_type": "string",
              "mapping": {
                "norms": false,
                "type": "text"
              }
            }
          },
          {
            "string_fields": {
              "match": "*",
              "match_mapping_type": "string",
              "mapping": {
                "fields": {
                  "keyword": {
                    "ignore_above": 256,
                    "type": "keyword"
                  }
                },
                "norms": false,
                "type": "text"
              }
            }
          }
        ],
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "@version": {
            "type": "keyword"
          },
          "beat": {
            "properties": {
              "hostname": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "name": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "version": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "geoip": {
            "dynamic": "true",
            "properties": {
              "ip": {
                "type": "ip"
              },
              "latitude": {
                "type": "half_float"
              },
              "location": {
                "type": "geo_point"
              },
              "longitude": {
                "type": "half_float"
              }
            }
          },
          "host": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "input": {
            "properties": {
              "type": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "message": {
            "type": "text",
            "norms": false
          },
          "offset": {
            "type": "long"
          },
          "prospector": {
            "properties": {
              "type": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "source": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "tags": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

(David Pilato) #12

To get a mapping run:

GET logstash-2018.07.03/_mapping

(sun_changlong) #13

Hi dadoonet.
I re-ran the command you recommended and updated the results above, but I still could not see where the problem occurs.
For now, I worked around the problem with a temporary configuration:

mutate {
  rename => { "[host][name]" => "host" }
}

I don't know whether this is a problem in version 6.3; for now I can only adjust the configuration.

If you have a better way, I will adopt it. Thanks.


(David Pilato) #14

The problem is that your mapping does not correspond to your documents.

Mapping says for host:

          "host": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },

But your documents are like:

{
  "host": {
    "name": "server2"
  }
}

This cannot work.
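As a sketch, this mismatch can be reproduced in the Kibana Dev Tools console (test-conflict is a throwaway index name used only for illustration):

```
PUT test-conflict
{
  "mappings": {
    "doc": {
      "properties": {
        "host": { "type": "text" }
      }
    }
  }
}

POST test-conflict/doc
{
  "host": { "name": "server2" }
}
```

The second request should fail with the same mapper_parsing_exception ("failed to parse [host]"), because once a field is mapped as text, a document cannot supply an object for it.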

Indeed, the default template that ships with Logstash has a generic configuration, but your data comes from Beats, which requires a specific mapping.

It sounds like you are parsing Apache logs here, so I'd say that Logstash is useless in that case.

If you install Filebeat 6.3.0, it can communicate with Elasticsearch directly: it creates the right mappings in Elasticsearch, creates the ingest pipelines, and creates the dashboards in Kibana. You just have to tell Filebeat that you want to use the apache2 module.
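As a rough sketch of that setup (the Kibana host and enabled filesets here are assumptions to adapt, not taken from this thread), the Filebeat-only path could look like:

```yaml
# filebeat.yml — sketch: ship Apache logs straight to Elasticsearch,
# letting the apache2 module provide mappings, pipelines and dashboards.
filebeat.modules:
  - module: apache2
    access:
      enabled: true
    error:
      enabled: true

output.elasticsearch:
  hosts: ["192.168.33.85:9200"]

setup.kibana:
  host: "192.168.33.85:5601"
```

Running `filebeat setup` once before starting Filebeat loads the index template, ingest pipelines, and Kibana dashboards.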


(system) #15

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.