Multiline problem with python stack trace


(Skubiszewski) #1

Hello,

I'm using an ELK stack. I have a log file from a Python program, and some log entries include a stack trace. I'm trying to use the multiline option in Filebeat to ship the whole stack trace together with the error line. However, only the error line shows up, without the stack trace.
Perhaps my pattern is bad, but I don't think so.

A log entry that I want to keep as a single event:

2018-04-24 13:38:55 [scrapy.core.scraper] ERROR: Spider error processing <GET http://clubmonaco.borderfree.com/product/index.jsp?productId=133770236 via http://35.205.126.55:8050/execute> (referer: None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/usr/local/lib/python3.6/dist-packages/scrapy_splash/middleware.py", line 156, in process_spider_output
    for el in result:
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/tmp/clothes-1524570832-w21yv00v.egg/clothes/spiders/clubmonaco.py", line 221, in parse_item
    if image_to_locate_script in script:
TypeError: 'in <string>' requires string as left operand, not NoneType

My filebeat.yml

 filebeat:
  prospectors:
    [{"paths": ["/data/logs/*/*/*.log"], "type": "log", "fields_under_root": true, "fields": {"index_type": "scrapy_log"}}]

  
  multiline.pattern: "^[a-zA-Z]+Error.*"
  multiline.negate: true
  multiline.match: before

logstash:

filter {

    if [index_type] == "scrapy_log" {

        grok {
            match => {
                "message" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} \[%{NOTSPACE:user}] %{WORD:severity}\: %{GREEDYDATA:message}"
            }

        }
      
        grok {
            match => {
                "source" => "%{GREEDYDATA:folder}/%{NOTSPACE:crawler}\_%{NOTSPACE:country}/%{GREEDYDATA:filename}\.log"
            }
        }
    }
}

(Tim Ward) #2

For Java exceptions, which are essentially similar, I just use the "2" at the start of the timestamp. Tacky, sure, but it works for the whole of this millennium.

  multiline:
    pattern: ^2
    negate: true
    match: after
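
To see what this combination does, here is a rough Python sketch of the grouping rule as I understand it from the Filebeat documentation: with `negate: true` and `match: after`, a line that matches the pattern starts a new event, and every line that does not match is appended to the event before it. The function name `group_events` is made up for this illustration, the request URLs in the sample are shortened, and Filebeat's own limits (such as `max_lines` and the multiline timeout) are not modelled.

```python
import re

def group_events(lines, pattern):
    """Simplified model of `negate: true` combined with `match: after`.

    A line that matches `pattern` starts a new event; any other line is
    appended to the event before it.
    """
    starts_event = re.compile(pattern)
    events = []
    for line in lines:
        if starts_event.search(line) or not events:
            events.append(line)          # start a new event
        else:
            events[-1] += "\n" + line    # continuation of the previous event
    return events

# Lines taken from the log in this thread (request URLs shortened here).
sample = [
    "2018-04-24 13:38:55 [scrapy.core.scraper] ERROR: Spider error processing <GET ...> (referer: None)",
    "Traceback (most recent call last):",
    '  File "/tmp/clothes-1524570832-w21yv00v.egg/clothes/spiders/clubmonaco.py", line 221, in parse_item',
    "    if image_to_locate_script in script:",
    "TypeError: 'in <string>' requires string as left operand, not NoneType",
    "2018-04-24 15:43:24 [root] INFO: start parse_item for url: ...",
]

events = group_events(sample, r"^2")
for event in events:
    print(event)
    print("-----")
```

In this sketch the five lines from the ERROR line through the TypeError line end up in one event, and the INFO line becomes a second event.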

(Skubiszewski) #3

Unfortunately, I tried this:

multiline.pattern : "^[0-9]{4}-[0-9]{2}-[0-9]{2}"
multiline.negate: true
multiline.match: after

It follows the same idea as yours, but nothing changed.


(Tim Ward) #4

Which suggests that it's the grok pattern that's the problem then, not the multiline stuff.


(Skubiszewski) #5

I do get logs in Kibana. For example:

2018-04-24 15:43:24 [root] INFO: start parse_item for url: http://clubmonaco.borderfree.com/product/index.jsp?productId=137926546

gave me

So I imagine that if the multiline option worked, I would get the whole stack trace, no?


(Skubiszewski) #6

multiline is a parameter of each prospector, so it has to be set inside the prospector definition. Moving it there solved my problem:

filebeat_prospectors:
      - type: log
        paths:
          - "/data/logs/*/*/*.log"
        fields:
          index_type: scrapy_log
        fields_under_root: true
        multiline.pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2}
        multiline.negate: true
        multiline.match: after
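
A quick way to sanity-check a pattern like this before deploying it is to run it against a few real log lines. The sketch below does that in Python. It assumes the pattern behaves the same in Python's `re` as in Filebeat's regular expression engine, which should hold here since it only uses character classes, counted repetition, and a start anchor. The helper name `starts_new_event` is invented for the example.

```python
import re

# Same pattern as in the prospector config above.
EVENT_START = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}")

def starts_new_event(line):
    """Return True if the line would begin a new multiline event."""
    return EVENT_START.match(line) is not None

# Lines taken from the logs quoted in this thread.
lines = [
    "2018-04-24 15:43:24 [root] INFO: start parse_item for url: http://clubmonaco.borderfree.com/product/index.jsp?productId=137926546",
    "Traceback (most recent call last):",
    "    yield next(it)",
    "TypeError: 'in <string>' requires string as left operand, not NoneType",
]

for line in lines:
    label = "new event   " if starts_new_event(line) else "continuation"
    print(label, "|", line[:60])
```

Only the timestamped line is reported as a new event; the traceback lines are continuations, which is what `negate: true` with `match: after` needs.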
