Please help, Logstash cannot parse field values to create a dynamic index

Hello team,
I am facing a problem where Logstash sends data to an unexpected Elasticsearch index.

My architecture is outlined below.
Filebeats -> Logstash -> Elasticsearch/AWS Elasticsearch

My machine has a Filebeat agent that collects application logs and sends them to Logstash. I add logappname and logapphostname as custom fields under "fields" in the Filebeat configuration file.

- type: log
  enabled: true
  paths:
    - /home/user/app/current/var/log/p2c2p123.log
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\]'
  multiline.negate: true
  multiline.match: after
  multiline.flush_pattern: '\[\]'
  fields:
    logappname: p2c2p123
    logapphostname: host1.testhostclub.local
  fields_under_root: true

On my Logstash, I match the timestamp in the message. My code is outlined below.

filter {
  grok {
    match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]%{SPACE}(:%{WORD:logtype}:)?%{SPACE}%{WORD:CLASS}\.%{WORD:loglevel}:%{DATA}" }
  }
}
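
For reference, given a log line such as [2019-03-22 12:57:57] report.WARNING:, the pattern should extract fields like these (a sketch of the expected result, not captured from a real run):

# input line:  [2019-03-22 12:57:57] report.WARNING:
#
# fields added by grok:
#   "timestamp" => "2019-03-22 12:57:57"
#   "CLASS"     => "report"
#   "loglevel"  => "WARNING"

Note that the grok filter does not set logappname or logapphostname; those must arrive on the event from Filebeat.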

And I have another configuration file, "send-to-es.conf", for setting up the Logstash output. My code is outlined below.

output {
  amazon_es {
    hosts  => ["<ES Host>"]
    region => "ap-southeast-1"
    index  => "%{logappname}-%{logapphostname}-%{+YYYY.MM.dd}"
  }
  #stdout { codec => rubydebug }
}

----> at this point:

On my Elasticsearch, I see my data in the index p2c2p123-host1.testhostclub.local-2019.04.03,
and sometimes I see my data stored in %{logappname}-%{logapphostname}-2019.04.03.

I do not understand why this %{logappname}-%{logapphostname}-2019.04.03 index occurs on my ES.
Please help me to resolve my problem.

Regards

Any update?

This typically means that you have data coming in for which these fields are not defined. Look at the data in the strange index to identify where it is coming from.
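
When the fields referenced in the index option are missing from an event, Logstash leaves the %{...} references as literal text, which is exactly the index name you are seeing. One way to guard against this is to give the fields a default when they are absent; a minimal sketch using the standard mutate filter (the fallback values here are illustrative):

filter {
  # If Filebeat did not attach the custom fields, add placeholder values so
  # the index name never ends up as the literal "%{logappname}-...".
  if ![logappname] {
    mutate { add_field => { "logappname" => "unknown-app" } }
  }
  if ![logapphostname] {
    mutate { add_field => { "logapphostname" => "unknown-host" } }
  }
}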

Hello Christian,
First of all, thank you so much for replying to my issue. As I see it, %{logappname}-%{logapphostname}-2019.04.03 contains data from p2c2p123.log, just like p2c2p123-host1.testhostclub.local-2019.04.03 does.

Can you show a sample record? Does it have the fields used to create the index name parsed out?

{
  "_index": "%{custom_field_logfilename}-%{custom_field_hostname}-2019.04.04-mxq1",
  "_type": "doc",
  "_id": "CxMX52kBND7ihUH8q7wL",
  "_version": 1,
  "_score": null,
  "_source": {
    "host": {
      "os": {
        "platform": "amzn",
        "version": "2018.03",
        "name": "Amazon Linux AMI",
        "family": ""
      },
      "architecture": "x86_64",
      "name": "ip-xxx-xxx-xxx-xxx",
      "containerized": true
    },
    "timestamp": "2019-03-22 12:57:57",
    "prospector": {
      "type": "log"
    },
    "@timestamp": "2019-04-04T06:44:34.596Z",
    "tags": [
      "p2c2p123",
      "beats_input_codec_plain_applied"
    ],
    "meta": {
      "cloud": {
        "provider": "ec2",
        "instance_id": "<instance_id>",
        "region": "<region>",
        "machine_type": "c5.2xlarge",
        "availability_zone": "<availability_zone>"
      }
    },
    "@version": "1",
    "message": "[2019-03-22 12:57:57] report.WARNING:",
    "input": {
      "type": "log"
    },
    "offset": 43071153,
    "beat": {
      "version": "6.6.1",
      "name": "ip-xxx-xxx-xxx-xxx",
      "hostname": "ip-xxx-xxx-xxx-xxx"
    },
    "source": "/home/ec2-user/app/current/var/log/p2c2p123.log",
    "log": {
      "file": {
        "path": "/home/ec2-user/app/current/var/log/p2c2p123.log"
      }
    },
    "CLASS": "report",
    "loglevel": "WARNING"
  },
  "fields": {
    "@timestamp": [
      "2019-04-04T06:44:34.596Z"
    ]
  },
  "sort": [
    1554360274596
  ]
}
{
  "_index": "p2c2p123-host1.testhostclub.local-2019.04.04-mxq1",
  "_type": "doc",
  "_id": "7Y4R52kB8zAPqq0q7qMM",
  "_version": 1,
  "_score": null,
  "_source": {
    "host": {
      "os": {
        "version": "2018.03",
        "platform": "amzn",
        "name": "Amazon Linux AMI",
        "family": ""
      },
      "architecture": "x86_64",
      "name": "ip-xxx-xxx-xxx-xxx",
      "containerized": true
    },
    "timestamp": "2019-04-04 06:38:07",
    "prospector": {
      "type": "log"
    },
    "@timestamp": "2019-04-04T06:38:08.853Z",
    "tags": [
      "p2c2p123",
      "beats_input_codec_plain_applied"
    ],
    "meta": {
      "cloud": {
        "provider": "ec2",
        "instance_id": "<instance_id>",
        "region": "<region>",
        "machine_type": "c5.2xlarge",
        "availability_zone": "<availability_zone>"
      }
    },
    "@version": "1",
    "message": "[2019-04-04 06:38:07] report.DEBUG:",
    "offset": 384412217,
    "custom_field_hostname": "host1.testhostclub.local",
    "input": {
      "type": "log"
    },
    "custom_field_logfilename": "p2c2p123",
    "log": {
      "file": {
        "path": "/home/ec2-user/app/current/var/log/p2c2p123.log"
      }
    },
    "beat": {
      "version": "6.6.1",
      "name": "ip-xxx-xxx-xxx-xxx",
      "hostname": "ip-xxx-xxx-xxx-xxx"
    },
    "source": "/home/ec2-user/app/current/var/log/p2c2p123.log",
    "CLASS": "report",
    "loglevel": "DEBUG"
  },
  "fields": {
    "@timestamp": [
      "2019-04-04T06:38:08.853Z"
    ]
  },
  "sort": [
    1554359888853
  ]
}

That document does not have those fields defined, which is why you are seeing this issue.

It does look like this approach could result in a very large number of shards, which is very inefficient and can cause problems down the line. I would recommend reading and following the guidelines outlined in this blog post.

Hello Christian,
I have finished updating the two samples above.

It looks like your Filebeat configs are not consistently setting the additional fields correctly. I would, however, recommend against this type of naming convention, as it is likely to cause you problems with too many small shards, as per the blog I linked to.
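
For the fields to appear on every event, every input section in every Filebeat config needs the same fields block. A minimal sketch, reusing the path and values from the config you posted (everything else is illustrative):

filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /home/user/app/current/var/log/p2c2p123.log
    fields:
      logappname: p2c2p123
      logapphostname: host1.testhostclub.local
    fields_under_root: true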

Ok let me try :slight_smile:

Hello Christian,
I reduced my master nodes from 3 to 2, and I have 4 data nodes. I still face the problem.
As I see in the Logstash output (output { stdout { codec => rubydebug } }), the output of my Logstash does not contain the logappname and logapphostname fields.

Please let me know how I can resolve this problem.

Hello Christian,
I resolved my problem by merging all of the files into one file.
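
For anyone who hits this later: Logstash concatenates all config files in a pipeline into a single config, so every event passes through every filter and output unless conditionals guard them, which plausibly explains why merging the files helped here. If you want to keep separate files, the output can also be guarded so events that lack the custom fields go to a catch-all index instead of a literal %{logappname}-... one. A sketch (the unparsed-* index name is illustrative):

output {
  if [logappname] and [logapphostname] {
    amazon_es {
      hosts  => ["<ES Host>"]
      region => "ap-southeast-1"
      index  => "%{logappname}-%{logapphostname}-%{+YYYY.MM.dd}"
    }
  } else {
    # Events without the custom fields land in a dated catch-all index.
    amazon_es {
      hosts  => ["<ES Host>"]
      region => "ap-southeast-1"
      index  => "unparsed-%{+YYYY.MM.dd}"
    }
  }
}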

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.