So I have a setup where my RabbitMQ sends logs to Logstash and then on to Elasticsearch and Kibana.
This is what I have set up in my /etc/logstash/conf.d/rabbitmq.conf:
input {
  rabbitmq {
    host => "0.0.0.0"
    port => 5672
    durable => true
    exchange => "logs"
    exchange_type => "fanout"
    user => "rabbit"
    password => "password"
    queue => "application"
    tags => ["rabbitmq"]
  }
}
output {
  if "rabbitmq" in [tags] {
    elasticsearch {
      hosts => "https://localhost:9200"
      user => "elastic"
      password => "password"
      ssl_certificate_verification => false
      index => "applog-%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM}"
    }
  }
}
With this setup everything is fine: I can see my logs in Discover and everything is as it should be. However, when I start working on my ILM, I create my policy and my index template, and as soon as I enable data stream on the template I no longer see logs in Discover and the data stream's size is 225 bytes. I have no idea why it does that. If I remove the data stream setting, I can see logs again, but the policy doesn't work.
then ILM will work, etc... You can even test a rollover with:
POST my-data-stream/_rollover
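For reference, the general shape is roughly this; the policy and template names match the GETs below, but the rollover/delete thresholds are only placeholders, not a recommendation:

PUT _ilm/policy/my-ilm-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb", "max_age": "30d" }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": { "delete": {} }
      }
    }
  }
}

PUT _index_template/my-index-template
{
  "index_patterns": ["my-data-stream*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.lifecycle.name": "my-ilm-policy"
    }
  }
}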
It looks like you may be trying to create a data stream for each client... if so, you will need to create a data stream for each client before you start indexing the data... not sure if that is what you are trying to do...
Thanks for replying, and glad to be in this community. I am not using Filebeat; my RabbitMQ generates all the logs in this applog-%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM} format and pushes them to Logstash, and I can clearly see the logs.
I am not sure what you mean by data stream name, but I created it when I was creating the index template. Assuming the index template name is the data stream name (which I don't think it is), then should the name be like this:
index => "applog-template%{[extra][tags][client_name]}-%{channel}-%{+yyyy.MM}"
or are you saying that this is how I need to create a data stream, and I should ignore clicking the button in the index template?
PUT applog-data-stream/_bulk
Our logs are generated in the applog-* format, so in this scenario do I still need to create one for each client?
Please share your complete index template and ILM policy
GET _index_template/my-index-template
GET _ilm/policy/my-ilm-policy
I think you need to look at all the data stream settings in the Logstash elasticsearch output... see here.
The index needs to be the data stream name.
"index => my-data-stream-name"
If you put the client name in the index name you are going to add a lot of complexity...
Perhaps you want to use a namespace in the data stream... I think you are mixing concepts a bit; I would need to think about that...
Perhaps you could use the data_stream_namespace... here
I would probably start without the client name... get it working (as long as you have a client field you will always be able to filter) - see the sketch below.
THEN I would probably work towards using the namespace...
A data stream for each client will probably be difficult unless you have a very small number of clients... you would need to automate a lot, I suspect.
Or you can just go back to your daily indices... that you have working, but that may create many small indices...
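To make "start without the client name" concrete, a minimal output could look something like this (just a sketch; the hosts/credentials mirror your existing config, and applogs as the dataset name is only an example):

output {
  if "rabbitmq" in [tags] {
    elasticsearch {
      hosts => "https://localhost:9200"
      user => "elastic"
      password => "password"
      ssl_certificate_verification => false
      data_stream => true
      data_stream_type => "logs"
      data_stream_dataset => "applogs"
      data_stream_namespace => "default"
      # writes into the data stream logs-applogs-default;
      # keep a client field on each event so you can still filter in Discover
    }
  }
}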
This is my index_template. I have removed data stream from the index_template temporarily, because with it the policy was working but I couldn't see the logs in Discover.
Yes, like you mentioned, whenever I hit that command in Dev Tools
GET _cat/indices
I can see all of them listed. The policy seems to be working fine, as I can see it go from Hot to Warm to Delete. The rollover worked fine too, and I could clearly see .ds-xxxxxx-000001, -000002, -000003, etc. being generated; however, I could never see any logs in Kibana ==> Discover.
I don't know why that was happening, but as soon as I enable data stream in the index_template the total size of the indices is just 225 bytes with 0 docs.
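For reference, requests along these lines will show that state directly (the data stream name applog is an assumption here; substitute whatever your template resolves to):

GET _data_stream/applog
GET _cat/indices/.ds-applog-*?v&h=index,docs.count,store.size
GET applog/_ilm/explain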
I found the problem with the Logstash and data streams. DARN... I should have recognized it, as I have written on it before... DOH!!! Apologies... and this is a bit of a bug right now.
Everything you did in the beginning was right (except the client id), but when using the index directive you need to add
action => "create"
in the elasticsearch output section. Basically Logstash was failing to write the data... because data streams only allow create, and you have to set it manually.
output {
  if "rabbitmq" in [tags] {
    elasticsearch {
      hosts => "https://localhost:9200"
      user => "elastic"
      password => "password"
      ssl_certificate_verification => false
      index => "applog"
      action => "create"   # <---- YOU NEED THIS... long story... there is a bug
    }
  }
}
If you look at the Logstash logs from before, you probably had errors like:
[2022-07-26T14:50:45,209][WARN ][logstash.outputs.elasticsearch][main][48d6c242f2e45e8e134251754210be2b1b6290a5d6780c9b6a1230dd822ca880] Could not index event to Elasticsearch. ....
"reason"=>"only write ops with an op_type of create are allowed in data streams"}}}}
And this is my Logstash stub; the filter and output are the important parts...
input {
  stdin {
  }
}
filter {
  # Assume you have the fields you want...
  mutate {
    add_field => {
      "client_name" => "beta-corp"
    }
  }
  # Assume you have a client name
  # Set the data stream namespace to your client
  mutate {
    add_field => {
      "[data_stream][namespace]" => "%{client_name}"
    }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    data_stream => true
    data_stream_auto_routing => true
    data_stream_dataset => "applogs"
    # For some reason this does not work; I think it should...
    # data_stream_namespace => "%{client_name}"
  }
  stdout {
    codec => rubydebug
  }
}
And this is the output when I set the client to a different name. Note the two data streams with the common prefix and then the client name... the logs prefix is pretty much hard coded...
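Roughly what that looks like, for illustration (beta-corp is the name from the stub; acme-corp just stands in for whatever the second client was set to):

GET _data_stream/logs-applogs-*

# returns two data streams along the lines of:
#   logs-applogs-beta-corp
#   logs-applogs-acme-corp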
Thank you so much for your help; it makes much more sense now. However, this is the only thing I am confused about:
I apologize in advance if this is something everyone knows, but where does this go? Do I add the stub above in /etc/logstash/conf.d/, or in /etc/logstash/logstash.conf, or is it something I am adding in Dev Tools?
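For what it's worth, a minimal sketch of how the stub's filter and output could be folded into the existing /etc/logstash/conf.d/rabbitmq.conf (the values are just the ones used earlier in this thread, the client field path is taken from the original index name, and the stdin/stdout parts of the stub were only for testing):

input {
  rabbitmq {
    host => "0.0.0.0"
    port => 5672
    durable => true
    exchange => "logs"
    exchange_type => "fanout"
    user => "rabbit"
    password => "password"
    queue => "application"
    tags => ["rabbitmq"]
  }
}
filter {
  # Set the data stream namespace from the client name on the event
  mutate {
    add_field => {
      "[data_stream][namespace]" => "%{[extra][tags][client_name]}"
    }
  }
}
output {
  if "rabbitmq" in [tags] {
    elasticsearch {
      hosts => "https://localhost:9200"
      user => "elastic"
      password => "password"
      ssl_certificate_verification => false
      data_stream => true
      data_stream_auto_routing => true
      data_stream_dataset => "applogs"
    }
  }
}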