Hi everyone,
I need to move data from Elasticsearch (input) to an S3 bucket (output), which should not be complicated.
We are using the following Logstash configuration.
Scenario: a Logstash file that exports data from Elasticsearch and writes it to an S3 bucket.
logstash.conf
input {
  elasticsearch {
    index => "servers"
    hosts => [ "myhostinformation.test:443" ]
    ssl => true
    query => '{"query":{"range": {"@timestamp":{"gt":"04/02/2018 00:00:00.000","lt": "04/02/2018 23:59:59.999","format": "dd/MM/yyyy HH:mm:ss.SSS||dd/MM/yyyy HH:mm:ss.SSS","time_zone": "-02:00"}}}}'
    size => 500
    scroll => "5m"
  }
}
output {
  s3 {
    region => "sa-east-1"
    time_file => 1
    bucket => "mybucket.com"
    prefix => "%{+YYYY}/%{+MM}/%{+dd}"
    codec => json
  }
}
This is a second Logstash file, used to read the files from the S3 bucket and load them into an Elasticsearch index:
input {
  s3 {
    bucket => "mybucket.com"
    prefix => "PREFIX_BUCKET"
    region => "sa-east-1"
    add_field => {
      inf => "s3"
    }
  }
}
filter {
  mutate { gsub => ["message" , "}{", "}|\n|{"] }
  split { terminator => "|\n|" }
  json {
    source => "message"
  }
  json {
    source => "raw_message"
  }
}
output {
  elasticsearch {
    index => "historical-information"
    hosts => [ "myhostinformation.test:443" ]
  }
}
===================
Problem: For some reason the data between 22:00:00 and 23:59:59 is not there as I expected, so the number of documents that arrives at the destination is lower than it should be.
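To make the time window concrete, here is a minimal Python sketch (it is not part of our pipeline). It converts the query bounds from the -02:00 offset to UTC and shows which date-based prefix an event would get if a pattern like %{+YYYY}/%{+MM}/%{+dd} is rendered from @timestamp in UTC. The rendering in UTC is an assumption here, and the helper names to_utc and utc_date_prefix are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Offset taken from the "time_zone": "-02:00" setting in the query.
LOCAL = timezone(timedelta(hours=-2))
# Python equivalent of the dd/MM/yyyy HH:mm:ss.SSS format used in the query.
FMT = "%d/%m/%Y %H:%M:%S.%f"


def to_utc(local_text):
    """Parse a timestamp written in the -02:00 offset and return it in UTC."""
    parsed = datetime.strptime(local_text, FMT).replace(tzinfo=LOCAL)
    return parsed.astimezone(timezone.utc)


def utc_date_prefix(local_text):
    """Hypothetical helper: the YYYY/MM/dd prefix, ASSUMING it is rendered in UTC."""
    return to_utc(local_text).strftime("%Y/%m/%d")


# Bounds of the range query, expressed in UTC.
print(to_utc("04/02/2018 00:00:00.000"))
print(to_utc("04/02/2018 23:59:59.999"))

# Events just before and just after 22:00 local time.
print(utc_date_prefix("04/02/2018 21:59:59.999"))  # 2018/02/04
print(utc_date_prefix("04/02/2018 22:00:00.000"))  # 2018/02/05
```

Under that assumption, events from 22:00:00 onward in local time carry a UTC date of 05/02/2018. This is only a way to check the boundaries, not a confirmed diagnosis.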
Important information:
- We tried removing the time_zone setting from the query in the elasticsearch input, but that did not help.
- We tried changing the lt bound to 05/02/2018 02:00:00, but that did not help either.
- We created the following logstash.conf, which migrates data directly between two indices, and with it everything works well:
input {
  elasticsearch {
    index => "servers"
    hosts => [ "myhostinformation.test:443" ]
    ssl => true
    query => '{"query":{"range": {"@timestamp":{"gt":"04/02/2018 00:00:00.000","lt": "04/02/2018 23:59:59.999","format": "dd/MM/yyyy HH:mm:ss.SSS||dd/MM/yyyy HH:mm:ss.SSS","time_zone": "-02:00"}}}}'
    size => 500
    scroll => "5m"
  }
}
output {
  elasticsearch {
    index => "historical-information"
    hosts => [ "myhostinformation.test:443" ]
  }
}
====
- LOGSTASH VERSION = 5.6.9
- ELASTICSEARCH VERSION = 5.3
===
When we execute the query directly in Elasticsearch:
Correct information (source index servers):
GET servers/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gt": "04/02/2018 00:00:00.000",
        "lt": "04/02/2018 23:59:59.999",
        "format": "dd/MM/yyyy HH:mm:ss.SSS||dd/MM/yyyy HH:mm:ss.SSS",
        "time_zone": "-02:00"
      }
    }
  }
}
"hits": {
"total": 720817
Incorrect data (new index called historical-information):
GET historical-information/_search
"hits": {
"total": 639783
We don't know whether we are on the right track or whether this is a limitation. Please help us!
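For reference, the gap between the two totals reported above can be worked out with a few lines of Python. This is just arithmetic on the two hits.total values from this post; the variable names are ours.

```python
source_total = 720817  # hits.total from GET servers/_search
copied_total = 639783  # hits.total from GET historical-information/_search

# Number of documents that did not make it into the new index.
missing = source_total - copied_total
# The same gap as a fraction of the source total.
missing_share = missing / source_total

print(missing)                 # 81034
print(f"{missing_share:.1%}")  # 11.2%
```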