We're facing multiple issues while using the ELK stack, and we suspect they stem from our Logstash configuration. The issues are as follows:
- Logstash connected to DynamoDB Streams isn't showing real-time changes, even though we explicitly set perform_stream => true in our Logstash configuration. Note: we do get the latest data if we restart Logstash (which is running in a Docker container). Could this be a cross-region issue, since DynamoDB is in us-east-1 while Logstash and Elasticsearch are in us-west-1?
- Upon restarting Logstash, the entire DynamoDB table appears to be duplicated in Elasticsearch: DynamoDB has an Item Count of 70K+, while Elasticsearch reports more than double that number of Searchable Documents. Could it be because we have perform_stream => true in our config?
- Intermittently, the latest data can be seen, but it is sandwiched between older records, as if the data were fetched in a random order. Could this be due to multiple workers trying to log at the same time?
- We need the JSON message contents from DynamoDB as is. However, when we run Logstash, the output shows the data as "Stream Records". When we use log_format => "json_binary_as_text", we can see the JSON message as we require. Is this sufficient?
Following is our Logstash configuration:
input {
  dynamodb {
    endpoint => "dynamodb.us-east-1.amazonaws.com"
    streams_endpoint => "streams.dynamodb.us-east-1.amazonaws.com"
    view_type => "new_image"
    perform_scan => true
    perform_stream => true
    publish_metrics => true
    table_name => "here-we-have-dynamodb-table-name"
    log_format => "json_binary_as_text"
  }
}
output {
  elasticsearch {
    hosts => "here-we-have-our-elasticsearch-endpoint-which-is-in-us-west-1"
  }
}
NOTE: There are no errors in the logs (docker logs --follow container-name).
Any help on these issues is really appreciated.