chouben (Christof H), October 17, 2024, 8:53am, #21
Hi Stephen
We upgraded to 8.15.2, with the same result. Could it be related to the typing of event.dataset?
What I noticed now is that event.dataset is a String field with a keyword subfield. Looking at the grouping options via Discover: although it is still shown in Discover as a keyword field, the subfield breakdown reveals the String/keyword split, and it is not possible to group on the String field.
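For anyone hitting the same symptom, the effective mapping of the field can be inspected directly in the Dev Tools console (the index pattern logs-* is an assumption; adjust it to your own indices or data stream):

```json
GET logs-*/_mapping/field/event.dataset
```

A dynamically mapped string typically comes back as "type": "text" with a "keyword" multi-field; aggregations then only work on event.dataset.keyword, while the Observability views expect event.dataset itself to be mapped as keyword.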
I'll get back to you once I've tested with correctly typed fields.
Best regards
Christof
chouben (Christof H), October 17, 2024, 9:39am, #22
Hi @stephenb
It seems your last suggestion to debug via Discover did point me in the right direction!
Checking an item further in the past, the dataset finally appears; in the timeframe we were originally checking, the dataset is now shown together with the other sources.
For completeness, my updated Component Template (I'm disabling all_strings_to_keywords because keyword fields make searches harder):
PUT _component_template/logs@custom
{
  "template": {
    "settings": {
      "index": {
        "number_of_replicas": "0",
        "default_pipeline": "calculate_lag",
        "codec": "best_compression"
      }
    },
    "mappings": {
      "dynamic": true,
      "date_detection": true,
      "dynamic_date_formats": [
        "strict_date_optional_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z",
        "yyyy-MM-dd HH:mm:ss,SSS"
      ],
      "dynamic_templates": [
        {
          "all_strings_to_keywords": {
            "unmatch": "*",
            "match_mapping_type": "string",
            "mapping": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "data_stream.namespace": { "type": "constant_keyword" },
        "data_stream.dataset": { "type": "constant_keyword" },
        "data_stream.type": { "type": "constant_keyword", "value": "logs" },
        "log.level": { "type": "keyword" },
        "service.name": { "type": "keyword" },
        "event.dataset": { "type": "keyword" },
        "container.id": { "type": "keyword" },
        "host.hostname": { "type": "keyword" },
        "host.name": { "type": "keyword" }
      }
    }
  },
  "_meta": {
    "managed": true,
    "description": "default mappings for the logs index template installed by x-pack"
  }
}
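Before reindexing, the resolved mappings can be sanity-checked with the simulate index API (the index name logs-generic-default is an assumption; use a name that matches your data stream's backing indices):

```json
POST _index_template/_simulate_index/logs-generic-default
```

The response shows the merged result of all matching templates, so you can confirm that event.dataset resolves to keyword before any documents are written.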
CONCLUSION
- The event.dataset field was shown via Discover as a Keyword field, although on my system keyword was only a subfield (primary type String).
- Due to the incorrect (primary) type, the Log rate per minute tile on the Observability Overview screen did not pick up the event.dataset field.
- The same solution would probably have worked on 8.14.3, but the system had already been upgraded to 8.15.2.
Thanks a lot for your persistence!
Christof
Hi, I am just scanning this issue quickly, but I noticed that your file has beat.version: 6.5.1, which would imply that it's almost 6 years out of date and would have very limited support for the Elastic Common Schema. That could be the cause of some of the field type conflicts you're experiencing. Can you confirm that your Beats version is up to date? If not, you can download current Beats from: Download Beats: Data Shippers for Elasticsearch | Elastic
chouben (Christof H), October 17, 2024, 12:18pm, #24
Hi @Mike_Paquette
Thanks for stopping by!
I'm currently in the process of upgrading an old standalone 7.16 stack to a new RPM-installed 8.15.2 stack. The logs are old logs, which have to be migrated.
The issue has been resolved since this morning. Below are my conclusions:
chouben:
CONCLUSION
- The event.dataset field was shown via Discover as a Keyword field, although on my system keyword was only a subfield (primary type String).
- Due to the incorrect (primary) type, the Log rate per minute tile on the Observability Overview screen did not pick up the event.dataset field.
- The same solution would probably have worked on 8.14.3, but the system had already been upgraded to 8.15.2.
With an important remark on why we didn't have the keyword mapping:
chouben:
(I'm disabling all_strings_to_keywords because keyword fields make searches harder)
PUT _component_template/logs@custom
{
  "template": {
    ...
    "mappings": {
      ...
      "dynamic_templates": [
        {
          "all_strings_to_keywords": {
            "unmatch": "*",
            "match_mapping_type": "string",
            "mapping": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        }
      ...
Those logs for the migration are fetched, cleaned (file 1), and enriched (file 2) with new fields, cf. my migration pipeline:
chouben:
Migration pipeline
Logstash file 1:
#Only pipeline size 500 & scroll 5m
#Other running pipeline size 200 & scroll 5m
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "jboss-fat-2024.09*"
    query => '{ }'
    size => 200
    scroll => "5m"
    docinfo => true
  }
}

filter {
  # Parse data via new logic (remove deducted fields)
  mutate {
    remove_field => [ "loglevel", "thread", "logtime", "class", "logmessage", "context" ]
  }
  # ID is generated below, old tags are removed first
  mutate {
    remove_tag => [ "idParsed", "idParsingFailed", "dateparsed" ]
  }
  # key is required for bug: https://github.com/logstash-plugins/logstash-filter-fingerprint/issues/46
  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "MD5"
    key => "XXX"
  }
  ruby {
    code => "event.set('[@metadata][tsEpochMilliPrefix]', (1000*event.get('@timestamp').to_f).round(0))"
  }
  if [@metadata][tsEpochMilliPrefix] and [@metadata][fingerprint] {
    mutate {
      # Document ID is set in the elasticsearch output plugin
      # add_field => { document_id => "%{[@metadata][tsEpochMilliPrefix]}%{[@metadata][fingerprint]}" }
      add_tag => [ "idParsed" ]
    }
  } else {
    mutate {
      add_tag => [ "idParsingFailed" ]
    }
  }
}

output {
  if [fields][type] == "jboss" {
    pipeline { send_to => "jboss-input" }
  } else if [fields][type] == "cassandra" {
    pipeline { send_to => "cassandra-input" }
  } else if [fields][type] == "kpi" {
    pipeline { send_to => "kpi" }
  } else if [fields][type] == "monitoring" {
    pipeline { send_to => "monitoring" }
  }
}
Logstash file 2:
input { pipeline { address => "jboss-input" } }

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => [ "message", "^%{TIMESTAMP_ISO8601:[log][time]}%{SPACE}%{SLOGLEVEL:[log][level]}%{SPACE}\[%{ENDCONTEXT:[log][context]}\]%{SPACE}\(%{NOTBRACKET:[log][thread]}\)%{SPACE}%{GREEDYDATA:[log][content]}$" ]
  }
  mutate {
    convert => [ "pid", "integer" ]
    remove_field => [ "offset", "[prospector][type]" ]
  }
  date {
    match => [ "[log][time]", "yyyy-MM-dd HH:mm:ss,SSS" ]
    timezone => "Europe/Brussels"
    add_tag => [ "dateparsed" ]
  }
  # https://www.elastic.co/guide/en/observability/current/logs-app-fields.html
  # https://discuss.elastic.co/t/log-source-unknown-in-observability-overview/262568
  # Required to have the source in Observability - Logs view
  mutate {
    add_field => { "event.dataset" => "%{[fields][type]}.%{[fields][env]}" }
    add_field => { "service.name" => "jboss" }
    add_field => { "host.hostname" => "%{[host][name]}" }
    add_field => { "container.id" => "jboss-%{[host][name]}" }
    add_field => { "log.file.path" => "%{[source]}" }
    # rename => { "[host][name]" => "[host][hostname]" }
  }
}
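The output block of file 2 is not shown; based on the comment in file 1 ("Document ID is set in the elasticsearch output plugin"), it presumably resembles the sketch below. The hosts and index name are assumptions; only the document_id format is taken from the commented add_field in file 1:

```conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "jboss-migrated-%{+YYYY.MM.dd}"
    # Deterministic ID built from the fingerprint in file 1, so re-running
    # the migration updates existing documents instead of duplicating them
    document_id => "%{[@metadata][tsEpochMilliPrefix]}%{[@metadata][fingerprint]}"
  }
}
```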
FYI:
I was planning to move all Beats to Elastic Agent. In the process, however, I noted that support for custom log formats is not yet that great on Elastic Agent, so it was still advised to use Filebeat.
All components of the new Elastic Stack are in sync and have been updated to 8.15.2 as of this morning: Elastic Agent, Filebeat, Logstash, Elasticsearch with Fleet Server, and Kibana.
Best regards
Christof