Failed json lines being ignored


(Daniel Maiochi) #1

Hello guys!
Recently I discovered the json processor and it made my life much easier!
The only problem I'm encountering with it is that if a line fails to be pushed, the "on_failure" handler is not executed.
example:
pipeline (simplified)

{
  "description": "Pipeline for parsing Sendible logs",
  "processors": [
    {
      "json": {
        "field": "message",
        "target_field": "sendible"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}

so I'm pushing one line:

{"level":"fatal","integer_field":0}

and it's all good, but since the following can happen, I just don't want to lose this line:

{"level":"fatal","integer_field":"string value"}

Since integer_field is mapped as a number, pushing a string value fails (as expected), but the failure isn't written to error.message as I specified in "on_failure".

The exception is this:

{"type":"mapper_parsing_exception","reason":"failed to parse [sendible.integer_field]","caused_by":{"type":"illegal_argument_exception","reason":"For input string: \"string value\""}}

Can someone tell me what I am doing wrong?

Thanks :slight_smile:


(Noémi Ványi) #2

Where do you see the exception you pasted? Could you show me the full document returned by Ingest? Could you share your full pipeline?

{{ _ingest.on_failure_message }} shows up in error.message of your document. Based on the pipeline config you shared, message should also be present in the same document. You don't drop it anywhere in on_failure, so it's not lost.


(Noémi Ványi) #3

Also, you can read about handling failures during processing here: https://www.elastic.co/guide/en/elasticsearch/reference/current/handling-failure-in-pipelines.html


(Daniel Maiochi) #4

Hi! Yes, I was just trying not to clutter things here.
the full pipeline:

{
  "description": "Pipeline for parsing Sendible logs",
  "processors": [
    {
      "json": {
        "field": "message",
        "target_field": "sendible"
      }
    },
    {
      "date": {
        "field": "sendible.date",
        "target_field": "@timestamp",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
          "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSZ"
        ],
        "ignore_failure": true
      }
    },
    {
      "rename": {
        "field": "sendible.content.exception",
        "target_field": "sendible.exception",
        "ignore_failure": true
      }
    },
    {
      "rename": {
        "field": "sendible.content.status",
        "target_field": "sendible.status",
        "ignore_failure": true
      }
    },
    {
      "rename": {
        "field": "sendible.content.uri",
        "target_field": "sendible.uri",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": [
          "message",
          "sendible.date"
        ]
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}

the exception on the filebeat console:

2018-06-27T18:05:29.269+0100 WARN elasticsearch/client.go:502 Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbec50ed56ee61954, ext:2667218601, loc:(*time.Location)(0x16f1580)}, Meta:common.MapStr{"pipeline":"filebeat-6.3.0-sendible-default-pipeline-json"}, Fields:common.MapStr{"fileset":common.MapStr{"module":"sendible", "name":"default"}, "prospector":common.MapStr{"type":"log"}, "input":common.MapStr{"type":"log"}, "beat":common.MapStr{"hostname":"DANIELM-LAPTOP", "version":"6.3.0", "name":"DANIELM-LAPTOP"}, "host":common.MapStr{"name":"DANIELM-LAPTOP"}, "source":"C:\temp\Logs\test.log", "offset":65282, "message":"{"level":"fatal","name":"Test daniel with string pid","message":"objs","source":"IntelliMail.Server.Receiver.Console","id":"3a5017f7-14df-4cf4-ad3f-ca7a20cfeee3","pid":"dadadadad","thread":1,"date":"2018-06-27T11:57:31.3807761+01:00","content":{"FacebookFansReceiver":{"AccountStatus":2,"PostAgeInHours":48,"Properties":{},"AgeInHours":120.0,"AccountId":1580286,"ReceiverType":"FacebookPageFansCustomFeed","ServiceName":"facebook(custompage)","Threads":1}}}"}, Private:file.State{Id:"", Finished:false, Fileinfo:(*os.fileStat)(0xc0424b7ec0), Source:"C:\temp\Logs\test.log", Offset:65282, Timestamp:time.Time{wall:0xbec50ed5143555d4, ext:1219426701, loc:(*time.Location)(0x16f1580)}, TTL:-1, Type:"log", FileStateOS:file.StateOS{IdxHi:0x3a20000, IdxLo:0x7bee, Vol:0x70408343}}}, Flags:0x1} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [sendible.pid]","caused_by":{"type":"illegal_argument_exception","reason":"For input string: "dadadadad""}}

the exception in the Elasticsearch log:

[2018-06-27T18:05:29,245][DEBUG][o.e.a.b.TransportShardBulkAction] [filebeat-6.3.0-2018.06.27][2] failed to execute bulk item (index) BulkShardRequest [[filebeat-6.3.0-2018.06.27][2]] containing [16] requests
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [sendible.pid]
	at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:302) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:481) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:603) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:403) ~[elasticsearch-6.3.0.jar:6.3.0]
...
Caused by: java.lang.IllegalArgumentException: For input string: "dadadadad"
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.toLong(AbstractXContentParser.java:199) ~[elasticsearch-x-content-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:220) ~[elasticsearch-x-content-6.3.0.jar:6.3.0]

the good line:

{"level":"fatal","name":"Test daniel","message":"objs","source":"Receiver.Console","id":"3a5017f7-14df-4cf4-ad3f-ca7a20cfeee3","pid":14212,"thread":1,"date":"2018-06-27T11:57:31.3807761+01:00","content":{"FacebookFansReceiver":{"AccountStatus":2,"PostAgeInHours":48,"Properties":{},"AgeInHours":120.0,"AccountId":1580286,"ReceiverType":"FacebookPageFansCustomFeed","ServiceName":"facebook(custompage)","Threads":1}}}

the bad line:

{"level":"fatal","name":"Test daniel with string pid","message":"objs","source":"Receiver.Console","id":"3a5017f7-14df-4cf4-ad3f-ca7a20cfeee3","pid":"dadadadad","thread":1,"date":"2018-06-27T11:57:31.3807761+01:00","content":{"FacebookFansReceiver":{"AccountStatus":2,"PostAgeInHours":48,"Properties":{},"AgeInHours":120.0,"AccountId":1580286,"ReceiverType":"FacebookPageFansCustomFeed","ServiceName":"facebook(custompage)","Threads":1}}}

Thanks again :slight_smile:


(Daniel Maiochi) #6

@kvch do you have any idea what I'm doing wrong?
I'm sorry but I still haven't figured it out :cry:


(Noémi Ványi) #7

I think you could try adding a new field (in your fields.yml file) that stores the string instead of the number, e.g. sendible.pid_str, so that a bad line can still be saved. However, you need to determine somehow in the Ingest pipeline whether sendible.pid is an integer or a string. If it's a string, the field could be renamed to sendible.pid_str. If it's an integer, there is nothing to do.
Do you mind posting the question to the Elasticsearch forum? It's more of an Ingest node question, so it's possible that what I'm suggesting is suboptimal.
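
A minimal sketch of the type check described above, assuming a Painless script processor placed right after the json processor. The pid_str field name comes from the suggestion above and would still need to be added to fields.yml. This is only one possible way to do the check, not a tested or recommended configuration:

```json
{
  "script": {
    "lang": "painless",
    "source": "if (ctx.sendible instanceof Map && ctx.sendible.pid instanceof String) { ctx.sendible.pid_str = ctx.sendible.remove('pid'); }"
  }
}
```

If sendible.pid is a number, the script leaves the document unchanged. If it is a string, the value moves to sendible.pid_str, so the numeric mapping of sendible.pid is never hit. This only covers the one field it names.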


(Daniel Maiochi) #8

Thanks @kvch, but the pid was just an example. I have no idea what fields and values our users can send, so I just don't want to lose these lines if they make a mess.
I found someone else with the same problem in the topic Ingest pipeline errors and "on_failure", and it turns out that on_failure is not triggered because the problem is not on the processor side.
I'll create a topic there later, thanks anyway :slight_smile:
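
One way to check that the failure is not on the processor side is the ingest simulate API. The request body below is a sketch, assuming the simplified pipeline and the bad line from the first post. It would be sent as POST _ingest/pipeline/_simulate, and the _index and _id values are placeholders:

```json
{
  "pipeline": {
    "description": "Pipeline for parsing Sendible logs",
    "processors": [
      { "json": { "field": "message", "target_field": "sendible" } }
    ],
    "on_failure": [
      { "set": { "field": "error.message", "value": "{{ _ingest.on_failure_message }}" } }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "message": "{\"level\":\"fatal\",\"integer_field\":\"string value\"}"
      }
    }
  ]
}
```

The json processor only parses the string into an object and does not know about the index mapping, so the simulated document should come back without an error. The mapper_parsing_exception is raised later, when the document is indexed, which is after the pipeline and its on_failure handlers have already run.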

