Incorrect path being indexed


(Malcolm Burlington) #1

Hi,

I’m looking for help with an issue I have with Logstash 1.5.4. I’m using Logstash to read a number of log files from various Java services. The filenames look like this:

AvailabilityAndFulfilment.log
DocumentGeneration.log
IdentityManagement.log

I’m using the following input:
input {
  file {
    type => "mos-service"
    path => [ "/opt/mos/logs/*.log" ]
    codec => multiline {
      pattern => "(^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
      what => "previous"
    }
  }
}

The problem is that when I query the data using Kibana, the "path" field does not relate to the path of the input file. For example, I get messages relating to DocumentGeneration with a path of “AvailabilityAndFulfilment.log”. I’ve checked the logs and all entries are in the correct files. I need to use multiline as some entries contain Java stack traces.

I read about a bug in an older version of Logstash (https://logstash.jira.com/browse/LOGSTASH-1979) and wonder if something similar is happening with Logstash 1.5.4.
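
One way to narrow this down is to give each log file its own file input, so that each file gets its own multiline codec instance. This is only a sketch: it assumes (not confirmed anywhere in this thread) that the wrong "path" comes from multiline state being shared between files, and it uses a shortened version of the pattern above purely for illustration.

```
input {
  # One file input per service log, so no codec instance is shared.
  file {
    type => "mos-service"
    path => [ "/opt/mos/logs/AvailabilityAndFulfilment.log" ]
    codec => multiline {
      # Shortened illustrative pattern; substitute the full pattern in practice.
      pattern => "(^.+Exception: .+)|(^\s+at .+)"
      what => "previous"
    }
  }
  file {
    type => "mos-service"
    path => [ "/opt/mos/logs/DocumentGeneration.log" ]
    codec => multiline {
      pattern => "(^.+Exception: .+)|(^\s+at .+)"
      what => "previous"
    }
  }
}
```

If the "path" field is correct with this layout but wrong with the single wildcard input, that would point at the combination of one input, several files, and the multiline codec.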


(Jay Greenberg) #2

Can you post the output of:

bin/plugin list --verbose

Thanks


(Malcolm Burlington) #3

Hi Jay,

Thanks for getting back. Here's the output of bin/plugin list --verbose:

logstash-codec-collectd (1.0.1)
logstash-codec-dots (1.0.0)
logstash-codec-edn (1.0.0)
logstash-codec-edn_lines (1.0.0)
logstash-codec-es_bulk (1.0.0)
logstash-codec-fluent (1.0.0)
logstash-codec-graphite (1.0.0)
logstash-codec-json (1.0.1)
logstash-codec-json_lines (1.0.1)
logstash-codec-line (1.0.0)
logstash-codec-msgpack (1.0.0)
logstash-codec-multiline (1.0.0)
logstash-codec-netflow (1.0.0)
logstash-codec-oldlogstashjson (1.0.0)
logstash-codec-plain (1.0.0)
logstash-codec-rubydebug (1.0.0)
logstash-filter-anonymize (1.0.0)
logstash-filter-checksum (1.0.1)
logstash-filter-clone (1.0.0)
logstash-filter-csv (1.0.0)
logstash-filter-date (1.0.0)
logstash-filter-dns (1.0.0)
logstash-filter-drop (1.0.0)
logstash-filter-fingerprint (1.0.0)
logstash-filter-geoip (1.0.2)
logstash-filter-grok (1.0.0)
logstash-filter-json (1.0.1)
logstash-filter-kv (1.0.0)
logstash-filter-metrics (1.0.0)
logstash-filter-multiline (1.0.0)
logstash-filter-mutate (1.0.1)
logstash-filter-ruby (1.0.0)
logstash-filter-sleep (1.0.0)
logstash-filter-split (1.0.0)
logstash-filter-syslog_pri (1.0.0)
logstash-filter-throttle (1.0.0)
logstash-filter-urldecode (1.0.0)
logstash-filter-useragent (1.0.1)
logstash-filter-uuid (1.0.0)
logstash-filter-xml (1.0.0)
logstash-input-couchdb_changes (1.0.0)
logstash-input-elasticsearch (1.0.0)
logstash-input-eventlog (1.0.0)
logstash-input-exec (1.0.0)
logstash-input-file (1.0.1)
logstash-input-ganglia (1.0.0)
logstash-input-gelf (1.0.0)
logstash-input-generator (1.0.0)
logstash-input-graphite (1.0.0)
logstash-input-heartbeat (1.0.0)
logstash-input-http (1.0.2)
logstash-input-imap (1.0.0)
logstash-input-irc (1.0.0)
logstash-input-kafka (1.0.0)
logstash-input-log4j (1.0.0)
logstash-input-lumberjack (1.0.4)
logstash-input-pipe (1.0.0)
logstash-input-rabbitmq (1.1.0)
logstash-input-redis (1.0.3)
logstash-input-s3 (1.0.0)
logstash-input-snmptrap (1.0.0)
logstash-input-sqs (1.0.0)
logstash-input-stdin (1.0.0)
logstash-input-syslog (1.0.1)
logstash-input-tcp (1.0.0)
logstash-input-twitter (1.0.1)
logstash-input-udp (1.0.0)
logstash-input-unix (1.0.0)
logstash-input-xmpp (1.0.0)
logstash-input-zeromq (1.0.0)
logstash-output-cloudwatch (1.0.0)
logstash-output-csv (1.0.0)
logstash-output-elasticsearch (1.0.7)
logstash-output-elasticsearch_http (1.0.0)
logstash-output-email (1.0.0)
logstash-output-exec (1.0.0)
logstash-output-file (1.0.0)
logstash-output-ganglia (1.0.0)
logstash-output-gelf (1.0.0)
logstash-output-graphite (1.0.2)
logstash-output-hipchat (1.0.0)
logstash-output-http (1.0.0)
logstash-output-irc (1.0.0)
logstash-output-juggernaut (1.0.0)
logstash-output-kafka (1.0.0)
logstash-output-lumberjack (1.0.2)
logstash-output-nagios (1.0.0)
logstash-output-nagios_nsca (1.0.0)
logstash-output-null (1.0.0)
logstash-output-opentsdb (1.0.0)
logstash-output-pagerduty (1.0.0)
logstash-output-pipe (1.0.0)
logstash-output-rabbitmq (1.1.1)
logstash-output-redis (1.0.0)
logstash-output-s3 (1.0.0)
logstash-output-sns (2.0.1)
logstash-output-sqs (1.0.0)
logstash-output-statsd (1.1.0)
logstash-output-stdout (1.0.0)
logstash-output-tcp (1.0.0)
logstash-output-udp (1.0.0)
logstash-output-xmpp (1.0.0)
logstash-output-zeromq (1.0.0)
logstash-patterns-core (0.3.0)

-- Malcolm


(Jay Greenberg) #4

@mburlington,

I would like to attempt to reproduce this issue, as this is a fairly common configuration. Is it always reproducible? How often is the path incorrect? Does it happen only on stack traces / multiline events?

I performed a test using 3 separate catalina.out files, and the path was correct in each of the ~7000 events (was not able to reproduce).
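
For anyone who wants to run a similar check, here is a minimal sketch of a reproduction config. It is not the exact config used in the test above. The directory /tmp/repro is a hypothetical location for copies of the log files, and the pattern is a shortened stand-in for the one in the original post.

```
input {
  file {
    type => "mos-service"
    # Hypothetical directory holding copies of the service logs.
    path => [ "/tmp/repro/*.log" ]
    # Read each file from the start and keep no read position between runs,
    # so every run processes the same events.
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # Shortened illustrative pattern; substitute the full pattern in practice.
      pattern => "(^.+Exception: .+)|(^\s+at .+)"
      what => "previous"
    }
  }
}

output {
  # Print every event with all fields, so "path" can be compared
  # against the content of "message".
  stdout { codec => rubydebug }
}
```

Running this with something like bin/logstash -f repro.conf and searching the output for events whose "message" clearly belongs to one service while "path" names another file should show whether the mismatch occurs outside Elasticsearch and Kibana.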

Are you able to provide me with some sample input via S3 or Dropbox? If so, you can private message me the URL.

Thanks,
Jay

