Data stream logs not showing in discover

So I have a setup where my RabbitMQ sends logs to Logstash and then on to Elasticsearch and Kibana.
This is what I have set up in my /etc/logstash/conf.d/rabbitmq.conf:

input {
    rabbitmq {
        host => "0.0.0.0"
        port => 5672
        durable => true
        exchange => "logs"
        exchange_type => "fanout"
        user => "rabbit"
        password => "password"
        queue => "application"
        tags => ["rabbitmq"]
    }
}

output {
    if "rabbitmq" in [tags] {
        elasticsearch {
            hosts => "https://localhost:9200"
            user => "elastic"
            password => "password"
            ssl_certificate_verification => false
            index => "applog-%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM}"
        }
    }
}

With this setup everything is fine, I can see my logs in Discover and everything is as it should be. However, things break when I start working on my ILM. I create my policy and my index template, but as soon as I enable data stream I don't see logs in Discover and the data stream's size is 225 bytes. I have no idea why it does that. If I remove the data stream then I can see logs, but the policy doesn't work.

Hope someone can help me.

Hi @metalaarif Welcome to the community

What is the name of your data stream?

Your index in filebeat needs to point to the data stream write alias... so the data is written to the stream, not a concrete index.

index => "my-data-stream-name"

See example here

Then ILM will work, etc... you can even test rollover with

POST my-data-stream/_rollover

It looks like you may be trying to create a data stream for each client... if so... you will need to create a data stream for each client before you start indexing the data... not sure if that is what you are trying to do...

Hi Stephenb,

Thanks for replying, and glad to be in this community. I am not using Filebeat; my RabbitMQ generates all the logs in this applog-%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM} format and pushes them to Logstash, and I can clearly see the logs.

I am not sure what you mean by data stream name, but I created it when I was creating the index template. Assuming the index template name is the data stream name (which I don't think it is), should it look like this?

index => "applog-template%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM}"

Or are you saying that this is how I need to create a data stream, and I should ignore clicking the button in the index template?

PUT applog-data-stream/_bulk

Our logs are generated in applog-* format, so in this scenario do I still need to create one for each client?

Apologies yes logstash... not filebeat.

Not sure what that means...

Please share your complete index template and ILM policy

GET _index_template/my-index-template
GET _ilm/policy/my-ilm-policy 

I think you need to look at all the data stream settings in the logstash elasticsearch output... see here

The index needs to be the data stream name.

"index => my-data-stream-name"

If you put the client name in the index name you are going to add a lot of complexity...
Perhaps you want to use a namespace in the data stream... I think you are mixing concepts a bit; I would need to think about that...

Perhaps you could use the data_stream_namespace... here
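For reference, and this is a sketch of the convention rather than your exact names: a data stream set up through the Logstash data stream options is named <type>-<dataset>-<namespace>, so a per-client namespace would give you streams like

logs-applogs-acme-corp
logs-applogs-beta-corp

which can all be matched by a single logs-applogs-* pattern.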

I would probably start without the client name... get it working... (as long as you have a client field you will always be able to filter)

THEN I would probably work towards using the namespace...

A data stream for each client will probably be difficult unless you are saying you have a very small number of clients... you would need to automate a lot, I suspect.

Or you can just go back to your daily indices... that you have working, but that may create many small indices...

This is my index_template, and I have temporarily removed data stream from it because the policy was working but I couldn't see the logs in Discover.

{
  "index_templates": [
    {
      "name": "applog-temp",
      "index_template": {
        "index_patterns": [
          "applog-*"
        ],
        "template": {
          "settings": {
            "index": {
              "lifecycle": {
                "name": "applog-policy",
                "rollover_alias": "applog"
              },
              "mapping": {
                "total_fields": {
                  "limit": "10000"
                }
              },
              "refresh_interval": "5s",
              "number_of_shards": "1",
              "number_of_replicas": "0"
            }
          }
        },
        "composed_of": []
      }
    }
  ]
}

And my _ilm/policy; please ignore the min_age values, those are just for testing.

{
  "applog-policy": {
    "version": 11,
    "modified_date": "2022-07-26T17:06:50.644Z",
    "policy": {
      "phases": {
        "warm": {
          "min_age": "10h",
          "actions": {
            "set_priority": {
              "priority": 50
            }
          }
        },
        "cold": {
          "min_age": "15h",
          "actions": {
            "set_priority": {
              "priority": 0
            }
          }
        },
        "hot": {
          "min_age": "0ms",
          "actions": {
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "1d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
      "data_streams": [],
      "composable_templates": [
        "applog-temp"
      ]
    }
  }
}

Did you create a Data View... you need that to see logs in Discover... you may have already had it working.

Kibana -> Stack Management -> Data View -> Create Data View

if you did

GET _cat/indices

Did you see the data stream indices... they start with .ds-...
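For example, assuming your data stream ended up named applog, these Dev Tools requests would show the stream and its backing indices (the backing indices are hidden, hence expand_wildcards):

GET _data_stream/applog

GET _cat/indices/.ds-applog*?v&expand_wildcards=all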

Yes, like you mentioned, whenever I hit that command in Dev Tools

GET _cat/indices

I can see all of them listed. The policy seems to be working fine, as I can see it go from Hot to Warm to Delete. The rollover worked fine too, and I could clearly see .ds-xxxxxx-000001, -000002, -000003, etc. being generated. However, I could never see any logs in Kibana -> Discover.

I don't know why that was happening, but as soon as I enable data stream in the index_template, the total size of the indices is just 225 bytes with 0 docs.

By definition, those are backing indices for a data stream.

You mean this line

  "data_stream": { },

Sorry, I have lost track... too many moving parts...

If you just want to try indices (not a data stream):

If you take what you have above without the data stream

Clean up

Then bootstrap the write alias index

PUT <applog-{now/d}-000001>
{
  "aliases": {
    "applog": {
      "is_write_index": true
    }
  }
}

then set in logstash output

index => "applog"

It should all work

You will still need to create a Data View in Kibana to see that data
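If you prefer an API over the UI, and assuming Kibana 8.x with a data view title of applog* (the host and credentials here are placeholders), something like this from a shell should work:

curl -X POST "http://localhost:5601/api/data_views/data_view" \
  -H "kbn-xsrf: true" -H "Content-Type: application/json" \
  -u elastic:password \
  -d '{"data_view": {"title": "applog*", "timeFieldName": "@timestamp"}}'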

Yes, I did create a data view, and I couldn't see them at all.

But I will try what you've mentioned and I will change my index => applog

Then bootstrap the write alias index

PUT <applog-{now/d}-000001>
{
  "aliases": {
    "applog": {
      "is_write_index": true
    }
  }
}

This is something I have not tried. I'll try this and get back to you.

Thank you so much for swift reply.

1 Like

That ^^^^ is straight up Elastic Magic :slight_smile:

@metalaarif

I found the problem with the logstash and data streams. DARN... I should have recognized it, as I have written about it before... DOH!! Apologies... and this is a bit of a bug right now...

Everything you did in the beginning was right (except the client id), but when using the index directive you need to add

action => "create"

in the elasticsearch output section. Basically, Logstash was failing to write the data because data streams only allow create ops, and you have to set that manually.

output {
    if "rabbitmq" in [tags] {
        elasticsearch {
            hosts => "https://localhost:9200"
            user => "elastic"
            password => "password"
            ssl_certificate_verification => false
            index => "applog"
            action => "create"  # <-- YOU NEED THIS... long story... there is a bug
        }
    }
}

If you look at the logstash logs from before, you probably had errors like

[2022-07-26T14:50:45,209][WARN ][logstash.outputs.elasticsearch][main][48d6c242f2e45e8e134251754210be2b1b6290a5d6780c9b6a1230dd822ca880] Could not index event to Elasticsearch. ....
 "reason"=>"only write ops with an op_type of create are allowed in data streams"}}}}

This should work... now

So I did that, and now I have a data stream with an index with documents in it.

And Now for the Complete Solution with Client Names and Data Streams

PUT _index_template/applog
{
  "index_patterns": [
    "applog-*"
  ],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "applog"
        }
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "message": {
          "type": "text"
        }
      }
    }
  },
  "composed_of": [],
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": true // <-- NOTE this for later
  }
}

And the Logstash: this is my stub; the filter and output are the important parts...

input {
  stdin {
  }
}

filter {

  # Assume you have the fields you want... 
  mutate {
    add_field => {
      "client_name" => "beta-corp"
    }
  }

  # Assume you have a client name
  # Set the datastream namespace name to your client 
  mutate {
    add_field => {
      "[data_stream][namespace]" => "%{client_name}"
    }
  }

}
output {
  elasticsearch {
    hosts => "localhost:9200"
    data_stream => true
    data_stream_auto_routing => true
    data_stream_dataset => "applogs"
    # Some reason this does not work I think it should... 
    # data_stream_namespace => "%{client_name}"
  }

  stdout {
    codec => rubydebug
  }
}

And the output when I set the client to a different name. Note the 2 data streams with the common prefix and then the client name... the logs prefix is pretty much hard coded...

And a single Data View for all: in Discover, one Data View (or you could create one per client) shows all the data.

Even test rollover:

POST logs-applogs-acme-corp/_rollover

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : ".ds-logs-applogs-acme-corp-2022.07.26-000001",
  "new_index" : ".ds-logs-applogs-acme-corp-2022.07.26-000002",
  "rolled_over" : true,
  "dry_run" : false,
  "conditions" : { }
}

This was good for me to go through all of it too!!

1 Like

Hi stephenb,

Thank you so much for your help; it makes much more sense now. However, this is the only thing I am confused about:

I apologize in advance if this is something everyone knows, but where does this go? Do I add it in /etc/logstash/conf.d/ or /etc/logstash/logstash.conf, or is it something I add in Dev Tools?

input {
  stdin {
  }
}

filter {

  # Assume you have the fields you want... 
  mutate {
    add_field => {
      "client_name" => "beta-corp"
    }
  }

  # Assume you have a client name
  # Set the datastream namespace name to your client 
  mutate {
    add_field => {
      "[data_stream][namespace]" => "%{client_name}"
    }
  }

}
output {
  elasticsearch {
    hosts => "localhost:9200"
    data_stream => true
    data_stream_auto_routing => true
    data_stream_dataset => "applogs"
    # Some reason this does not work I think it should... 
    # data_stream_namespace => "%{client_name}"
  }

  stdout {
    codec => rubydebug
  }
}

That is your Logstash pipeline configuration; it is NOT loaded in Kibana -> Dev Tools.

See here and here

You can create a my-logstash.conf file and put it in here, and it will automatically get invoked

or use the pipelines.yml file to specify the location see here
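A minimal pipelines.yml sketch, with the pipeline id and path as assumptions to adjust for your install:

- pipeline.id: rabbitmq
  path.config: "/etc/logstash/conf.d/rabbitmq.conf"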

Thank you so much that's what I needed to know. I somewhat know what I need to do now.
Will be doing that soon.

Once again thanks for your support.

1 Like