Multiline problem with Python stack trace


I'm using the ELK stack. I have a log file from a Python program, and some log entries include a stack trace. I'm trying to use the multiline option in Filebeat to capture the whole stack trace together with the error line. However, only the error line shows up, without the stack trace.
Perhaps my pattern is wrong, but I don't think so.

A log entry that I want to have as a single event:

2018-04-24 13:38:55 [scrapy.core.scraper] ERROR: Spider error processing <GET via> (referer: None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/scrapy/utils/", line 102, in iter_errback
    yield next(it)
  File "/usr/local/lib/python3.6/dist-packages/scrapy_splash/", line 156, in process_spider_output
    for el in result:
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/", line 30, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.6/dist-packages/scrapy/spidermiddlewares/", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/tmp/clothes-1524570832-w21yv00v.egg/clothes/spiders/", line 221, in parse_item
    if image_to_locate_script in script:
TypeError: 'in <string>' requires string as left operand, not NoneType

My filebeat.yml:

    [{"paths": ["/data/logs/*/*/*.log"], "type": "log", "fields_under_root": true, "fields": {"index_type": "scrapy_log"}}]

  multiline.pattern: "^[a-zA-Z]+Error.*"
  multiline.negate: true
  multiline.match: before


filter {

    if [index_type] == "scrapy_log" {

        grok {
            match => {
                "message" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} \[%{NOTSPACE:user}] %{WORD:severity}\: %{GREEDYDATA:message}"
            }
        }

        grok {
            match => {
                "source" => "%{GREEDYDATA:folder}/%{NOTSPACE:crawler}\_%{NOTSPACE:country}/%{GREEDYDATA:filename}\.log"
            }
        }
    }
}
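For anyone who wants to check offline what the first grok pattern should pull out of a log line, here is a rough Python approximation. It is only a sketch: the regex is a simplified stand-in for the grok patterns (the real YEAR, MONTHNUM, TIME, etc. definitions are more permissive), and the name `LINE_RE` is made up for this example.

```python
import re

# Simplified stand-in for:
#   %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} \[%{NOTSPACE:user}] %{WORD:severity}\: %{GREEDYDATA:message}
# Assumption: timestamps always look like "YYYY-MM-DD HH:MM:SS".
LINE_RE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} "
    r"\[(?P<user>\S+)\] (?P<severity>\w+): (?P<message>.*)"
)

line = ("2018-04-24 13:38:55 [scrapy.core.scraper] ERROR: "
        "Spider error processing <GET via> (referer: None)")

match = LINE_RE.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

On the sample error line this yields `user`, `severity` and `message` fields, which matches what the grok filter is meant to produce for the first line of an event.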

For Java exceptions, which are essentially similar, I just use the "2" at the start of the timestamp. Tacky, sure, but it works for the whole of this millennium.

    pattern: ^2
    negate: true
    match: after

Unfortunately, I tried this:

multiline.pattern : "^[0-9]{4}-[0-9]{2}-[0-9]{2}"
multiline.negate: true
multiline.match: after

which follows the same idea as yours, but nothing changed.

Which suggests that it's the grok pattern that's the problem then, not the multiline stuff.

I do get logs in Kibana. For example:

2018-04-24 15:43:24 [root] INFO: start parse_item for url:

gave me

So I imagine that if the multiline option worked, I would get the whole stack trace, no?

multiline is a parameter of the prospectors, so it has to be set inside the prospector definition. Moving it there solved my problem:

      - type: log
        paths:
          - "/data/logs/*/*/*.log"
        fields:
          index_type: scrapy_log
        fields_under_root: true
        multiline.pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2}
        multiline.negate: true
        multiline.match: after
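To illustrate what `negate: true` with `match: after` asks for, here is a minimal Python sketch of the grouping rule as the Filebeat documentation describes it: a line that matches the pattern starts a new event, and every following line that does not match is appended to that event. This is not Filebeat's code; `group_multiline` is a hypothetical helper and the sample log is shortened from the one in the question.

```python
import re

def group_multiline(lines, pattern):
    """Mimic `negate: true` + `match: after`: matching lines start an
    event, non-matching lines are appended to the current event."""
    start = re.compile(pattern)
    events = []
    for line in lines:
        # A matching line (or the very first line) opens a new event.
        if start.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

log = [
    "2018-04-24 13:38:55 [scrapy.core.scraper] ERROR: Spider error processing <GET via> (referer: None)",
    "Traceback (most recent call last):",
    "    if image_to_locate_script in script:",
    "TypeError: 'in <string>' requires string as left operand, not NoneType",
    "2018-04-24 15:43:24 [root] INFO: start parse_item for url:",
]

events = group_multiline(log, r"^[0-9]{4}-[0-9]{2}-[0-9]{2}")
for event in events:
    print(repr(event))
```

With the date pattern, the ERROR line and its traceback end up in one event and the following INFO line starts a new one, which is the behaviour expected in Kibana once the multiline settings are actually picked up by the prospector.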
