Observability Overview - Logs not shown as log source

Hi @stephenb

Sorry for the delay; I didn't find time to run the requested test yesterday.

Test
I ran the migration again for the specific data stream (Logstash pipeline below; it fetches data from Elasticsearch 7.16 and loads it into 8.14.3):

Same result:

event.dataset now contains a . instead of a -. A sample document is below.
PS: a colleague of yours used - in this sample: "Unknown" logs in observability overview - #2 by felixbarny. That's where I got it from.

All fields below are also visible in the screenshot above, as before (if you would prefer a different, more readable format, please ask):

    "data_stream": {
      "namespace": "jboss",
      "type": "logs",
      "dataset": "info"
    },
...
    "event.dataset": "jboss.fat"

I don't think we are facing timezone issues:

{
  "_index": ".ds-logs-info-jboss-2024.10.16-000001",
  "_id": "GbtLlJIBP4HuDLbvLht9",
  "_version": 1,
  "_score": 0,
  "_source": {
...
    "@timestamp": "2024-09-28T22:00:05.852Z",

To rule out timezone issues, I also loaded bigger chunks of consecutive days, so that even if the hours were shifted there would still be enough data available (as you can see below): 28/9 to 1/10, and we are looking at 29/9:
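As a quick sanity check on the timestamp above, here is a minimal sketch (Python standard library only, not part of my pipeline) that converts the sample @timestamp from UTC to Europe/Brussels, the zone configured in the date filter. The variable names are just illustrative.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+, needs tz data on the system

# @timestamp from the sample document (stored in UTC)
utc_ts = datetime.fromisoformat("2024-09-28T22:00:05.852Z".replace("Z", "+00:00"))

# Same instant expressed in the zone used by the date filter
local = utc_ts.astimezone(ZoneInfo("Europe/Brussels"))

print(local.isoformat())
```

On 28/9 Brussels is still on summer time (UTC+2), so this document falls on 29/9 just after midnight local time, which is inside the range I loaded.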

Migration pipeline

Logstash file 1:

#Only pipeline size 500 & scroll 5m
#Other running pipeline size 200 & scroll 5m
input {
 elasticsearch {
    hosts => "localhost:9200"
    index => "jboss-fat-2024.09*"
    query => '{  }'
    size => 200
    scroll => "5m"
    docinfo => true
  }
}

filter {
#Parse data via new logic (remove derived fields)
      mutate {
        remove_field => [ "loglevel", "thread", "logtime", "class", "logmessage", "context" ]
      }


#ID is generated below, old tags are removed first
      mutate {
        remove_tag => [ "idParsed", "idParsingFailed", "dateparsed", "idParsed" ]
      }

#key is required for bug: https://github.com/logstash-plugins/logstash-filter-fingerprint/issues/46
    fingerprint {
      source => "message"
      target => "[@metadata][fingerprint]"
      method => "MD5"
      key => "XXX"
    }
    ruby {
      code => "event.set('[@metadata][tsEpochMilliPrefix]', (1000*event.get('@timestamp').to_f).round(0))" 
    }

    if [@metadata][tsEpochMilliPrefix] and [@metadata][fingerprint] {
        mutate {
#Document ID is set in the elasticsearch output plugin
#            add_field => { document_id => "%{[@metadata][tsEpochMilliPrefix]}%{[@metadata][fingerprint]}"}
            add_tag => [ "idParsed" ]
        }
    } else {
        mutate {
            add_tag => [ "idParsingFailed" ]
        }
    }
}


output {
	if [fields][type] == "jboss" {
	  pipeline { send_to => "jboss-input" }
	} else if [fields][type] == "cassandra" {
	  pipeline { send_to => "cassandra-input" }
	} else if [fields][type] == "kpi" {
	  pipeline { send_to => kpi }
	} else if [fields][type] == "monitoring" {
	  pipeline { send_to => monitoring }
	}
}
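To make the ID logic in the filter above easier to follow, here is a rough Python sketch of how the two metadata values would combine. It is only an illustration under assumptions: I assume the fingerprint filter computes an HMAC-MD5 hex digest when a key is set, the concatenation order is taken from the commented-out add_field line, and the function names, the sample message and the "XXX" key are placeholders.

```python
import hashlib
import hmac
from datetime import datetime

def epoch_millis(ts_iso: str) -> int:
    """Mirror the ruby filter: @timestamp as epoch milliseconds, rounded."""
    dt = datetime.fromisoformat(ts_iso.replace("Z", "+00:00"))
    return round(dt.timestamp() * 1000)

def fingerprint(message: str, key: str) -> str:
    """Keyed MD5 of the message (assumed to match method => "MD5" with a key)."""
    return hmac.new(key.encode(), message.encode(), hashlib.md5).hexdigest()

def document_id(ts_iso: str, message: str, key: str) -> str:
    """Millisecond prefix followed by the fingerprint, as in the commented line."""
    return f"{epoch_millis(ts_iso)}{fingerprint(message, key)}"

# Hypothetical log line, only for illustration
sample = "2024-09-29 00:00:05,852 INFO [ctx] (thread-1) sample message"
print(document_id("2024-09-28T22:00:05.852Z", sample, "XXX"))
```

The idea is that reloading the same source document yields the same ID, so a repeated migration run overwrites documents instead of duplicating them.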

Logstash file 2:

input { pipeline { address => "jboss-input" } }

filter {
       grok {
          patterns_dir => ["/etc/logstash/patterns"]
          match => [ "message", "^%{TIMESTAMP_ISO8601:[log][time]}%{SPACE}%{SLOGLEVEL:[log][level]}%{SPACE}\[%{ENDCONTEXT:[log][context]}\]%{SPACE}\(%{NOTBRACKET:[log][thread]}\)%{SPACE}%{GREEDYDATA:[log][content]}$"] 
        }
        mutate {
            convert => [ "pid", "integer"]
            remove_field => ["offset", "[prospector][type]"]
        }
        date {
            match => [ "[log][time]" , "yyyy-MM-dd HH:mm:ss,SSS" ]
            timezone => "Europe/Brussels"
            add_tag => [ "dateparsed" ]
        }
	#https://www.elastic.co/guide/en/observability/current/logs-app-fields.html
	#https://discuss.elastic.co/t/log-source-unknown-in-observability-overview/262568
	#Required to have the source in Observability - Logs view
      mutate {
        add_field => { "event.dataset" => "%{[fields][type]}.%{[fields][env]}" }
        add_field => { "service.name" => "jboss" }
        add_field => { "host.hostname" => "%{[host][name]}" }
        add_field => { "container.id" => "jboss-%{[host][name]}" }
        add_field => { "log.file.path" => "%{[source]}" }
        #rename => { "[host][name]" => "[host][hostname]" }
      }
}
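For readers less familiar with the %{[field][subfield]} references used in the mutate block above, this is a small Python sketch of how such a template resolves against an event. The sprintf helper is hypothetical and simplified (real Logstash leaves an unresolved reference as literal text, while this version raises a KeyError), and the host name is made up.

```python
import re

def sprintf(template: str, event: dict) -> str:
    """Resolve Logstash-style %{[a][b]} references against a nested dict."""
    def lookup(match: re.Match) -> str:
        value = event
        # Walk the nested dict one bracketed key at a time
        for key in re.findall(r"\[([^\]]+)\]", match.group(1)):
            value = value[key]
        return str(value)
    return re.sub(r"%\{((?:\[[^\]]+\])+)\}", lookup, template)

# Hypothetical event; only the type and env values match my data
event = {"fields": {"type": "jboss", "env": "fat"}, "host": {"name": "app01"}}

print(sprintf("%{[fields][type]}.%{[fields][env]}", event))
print(sprintf("jboss-%{[host][name]}", event))
```

With fields.type "jboss" and fields.env "fat", the first template gives the "jboss.fat" value shown in the sample document.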

I hope I answered all your requests. :smiling_face:

Best regards
Christof

PS:

I assume you will now advise upgrading to the latest version. I'm still in the development phase, so that would be perfectly feasible.