Hello team,
I am facing a problem where Logstash sends data to an unexpected Elasticsearch index.
My architecture is outlined below.
Filebeats -> Logstash -> Elasticsearch/AWS Elasticsearch
My machines run a Filebeat agent that collects application logs and sends them to Logstash. I add logappname and logapphostname as custom fields under "fields" in the Filebeat configuration file.
- type: log
  enabled: true
  paths:
    - /home/user/app/current/var/log/p2c2p123.log
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}\ [0-9]{2}:[0-9]{2}:[0-9]{2}\]'
  multiline.negate: true
  multiline.match: after
  multiline.flush_pattern: '\[\]'
  fields:
    logappname: p2c2p123
    logapphostname: host1.testhostclub.local
  fields_under_root: true
On my Logstash, I match the timestamp in the message. My code is outlined below.
filter {
  grok {
    match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]%{SPACE}(:%{WORD:logtype}:)?%{SPACE}%{WORD:CLASS}\.%{WORD:loglevel}:%{DATA}" }
  }
}
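As an aside, the grok above only extracts the timestamp into a field; @timestamp still reflects ingest time (visible in the sample documents later in the thread, where `timestamp` and `@timestamp` differ). If the intent is to index events under the log's own time, a date filter can map the captured field onto @timestamp. A minimal sketch, assuming the `timestamp` field captured by the grok pattern above:

```
filter {
  date {
    # Parse the captured "timestamp" field (e.g. "2019-03-22 12:57:57")
    # and overwrite @timestamp with it; on failure the event is tagged
    # with _dateparsefailure instead.
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
  }
}
```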
And I have another configuration file, "send-to-es.conf", that sets up the Logstash output. My code is outlined below.
output {
  amazon_es {
    hosts => ["<ES Host>"]
    region => "ap-southeast-1"
    index => "%{logappname}-%{logapphostname}-%{+YYYY.MM.dd}"
  }
  #stdout { codec => rubydebug }
}
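When the sprintf references in the index option point at fields that are absent from an event, Logstash emits the placeholder text literally, which is exactly how an index named %{logappname}-%{logapphostname}-2019.04.03 gets created. One defensive pattern (a sketch, not necessarily the root-cause fix; the "unknown-*" values are illustrative) is to supply defaults in the filter stage so such events land in a known fallback index:

```
filter {
  # If Filebeat did not attach the custom fields, substitute
  # placeholder values so the output index name stays valid.
  if ![logappname] {
    mutate { add_field => { "logappname" => "unknown-app" } }
  }
  if ![logapphostname] {
    mutate { add_field => { "logapphostname" => "unknown-host" } }
  }
}
```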
At this point the problem occurs:
On my Elasticsearch, I see my data in p2c2p123-host1.testhostclub.local-2019.04.03,
but sometimes I also see my data stored in %{logappname}-%{logapphostname}-2019.04.03.
I do not understand why this %{logappname}-%{logapphostname}-2019.04.03 index appears on my ES.
Please help me resolve this problem.
Regards
This typically means that you have data coming in for which these fields are not defined. Look at the data in the strange index to identify where it is coming from.
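For example (a sketch only; the field names are taken from the sample documents later in this thread, and the `.keyword` sub-fields assume default dynamic mapping), a terms aggregation on the unexpected index can show which log files and hosts the stray events came from. Note the literal %{...} characters in the index name may need URL-encoding depending on the client:

```
POST /%{logappname}-%{logapphostname}-2019.04.03/_search
{
  "size": 0,
  "aggs": {
    "by_source": { "terms": { "field": "source.keyword" } },
    "by_host":   { "terms": { "field": "beat.hostname.keyword" } }
  }
}
```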
Hello Christian,
First of all, thank you so much for replying to my issue. Looking at %{logappname}-%{logapphostname}-2019.04.03, it contains data from p2c2p123.log, just like p2c2p123-host1.testhostclub.local-2019.04.03 does.
Can you show a sample record? Does it have the fields used to create the index name parsed out?
{
"_index": "%{custom_field_logfilename}-%{custom_field_hostname}-2019.04.04-mxq1",
"_type": "doc",
"_id": "CxMX52kBND7ihUH8q7wL",
"_version": 1,
"_score": null,
"_source": {
"host": {
"os": {
"platform": "amzn",
"version": "2018.03",
"name": "Amazon Linux AMI",
"family": ""
},
"architecture": "x86_64",
"name": "ip-xxx-xxx-xxx-xxx",
"containerized": true
},
"timestamp": "2019-03-22 12:57:57",
"prospector": {
"type": "log"
},
"@timestamp": "2019-04-04T06:44:34.596Z",
"tags": [
"p2c2p123",
"beats_input_codec_plain_applied"
],
"meta": {
"cloud": {
"provider": "ec2",
"instance_id": "<instance_id>",
"region": "<region>",
"machine_type": "c5.2xlarge",
"availability_zone": "<availability_zone>"
}
},
"@version": "1",
"message": "[2019-03-22 12:57:57] report.WARNING:",
"input": {
"type": "log"
},
"offset": 43071153,
"beat": {
"version": "6.6.1",
"name": "ip-xxx-xxx-xxx-xxx",
"hostname": "ip-xxx-xxx-xxx-xxx"
},
"source": "/home/ec2-user/app/current/var/log/p2c2p123.log",
"log": {
"file": {
"path": "/home/ec2-user/app/current/var/log/p2c2p123.log"
}
},
"CLASS": "report",
"loglevel": "WARNING"
},
"fields": {
"@timestamp": [
"2019-04-04T06:44:34.596Z"
]
},
"sort": [
1554360274596
]
}
{
"_index": "p2c2p123-host1.testhostclub.local-2019.04.04-mxq1",
"_type": "doc",
"_id": "7Y4R52kB8zAPqq0q7qMM",
"_version": 1,
"_score": null,
"_source": {
"host": {
"os": {
"version": "2018.03",
"platform": "amzn",
"name": "Amazon Linux AMI",
"family": ""
},
"architecture": "x86_64",
"name": "ip-xxx-xxx-xxx-xxx",
"containerized": true
},
"timestamp": "2019-04-04 06:38:07",
"prospector": {
"type": "log"
},
"@timestamp": "2019-04-04T06:38:08.853Z",
"tags": [
"p2c2p123",
"beats_input_codec_plain_applied"
],
"meta": {
"cloud": {
"provider": "ec2",
"instance_id": "<instance_id>",
"region": "<region>",
"machine_type": "c5.2xlarge",
"availability_zone": "<availability_zone>"
}
},
"@version": "1",
"message": "[2019-04-04 06:38:07] report.DEBUG:",
"offset": 384412217,
"custom_field_hostname": "host1.testhostclub.local",
"input": {
"type": "log"
},
"custom_field_logfilename": "p2c2p123",
"log": {
"file": {
"path": "/home/ec2-user/app/current/var/log/p2c2p123.log"
}
},
"beat": {
"version": "6.6.1",
"name": "ip-xxx-xxx-xxx-xxx",
"hostname": "ip-xxx-xxx-xxx-xxx"
},
"source": "/home/ec2-user/app/current/var/log/p2c2p123.log",
"CLASS": "report",
"loglevel": "DEBUG"
},
"fields": {
"@timestamp": [
"2019-04-04T06:38:08.853Z"
]
},
"sort": [
1554359888853
]
}
That document does not have those fields defined, which is why you are seeing this issue.
It does look like this approach could result in a very large number of shards, which is very inefficient and can cause problems down the line. I would recommend reading and following the guidelines outlined in this blog post.
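As a sketch of the kind of consolidation that advice points toward (the naming here is illustrative, not prescribed): fold the per-host dimension into a queryable field and use a coarser date pattern, so far fewer indices (and therefore shards) are created:

```
output {
  amazon_es {
    hosts  => ["<ES Host>"]
    region => "ap-southeast-1"
    # One monthly index per application; the host remains
    # queryable via the logapphostname field instead of
    # being baked into the index name.
    index  => "%{logappname}-%{+YYYY.MM}"
  }
}
```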
Hello Christian,
I have finished updating the post with the two sample records.
It looks like your Filebeat configs are not consistently setting the additional fields correctly. I would, however, recommend against this type of naming convention, as it is likely to cause you problems with too many small shards, as per the blog I linked to.
Hello Christian,
I reduced my master nodes from 3 to 2, and I have 4 data nodes, but I still face the problem.
Looking at the Logstash output (output { stdout { codec => rubydebug } }),
I found that the events from my Logstash do not contain
the logappname and logapphostname fields.
Please let me know how I can resolve this problem.
Hello Christian,
I was able to resolve my problem by merging all the configuration files into one file.
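For anyone hitting the same issue: Logstash concatenates every file in a pipeline's configuration directory into a single pipeline, so events from every input pass through all filters and outputs, which is how events lacking the custom fields can reach an output that expects them. A sketch of how multiple config files can coexist safely, guarding the output with a conditional (here reusing the "p2c2p123" tag visible in the sample documents above):

```
# send-to-es.conf
output {
  # Only events carrying this tag are routed to this output;
  # events from other inputs in the same pipeline are skipped.
  if "p2c2p123" in [tags] {
    amazon_es {
      hosts  => ["<ES Host>"]
      region => "ap-southeast-1"
      index  => "%{logappname}-%{logapphostname}-%{+YYYY.MM.dd}"
    }
  }
}
```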