First-time poster, long-time lurker.
I am using the Elastic stack to process and analyse XML documents that are sent to me over HTTP. I currently have the following pipeline set up to realise this behaviour:
1. Node.js receives the document over HTTP and does some processing
2. Logstash converts the XML to JSON and does further processing on some fields
3. Elasticsearch indexes the documents
4. (Kibana for visualisation)
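For context, the Node.js stage is roughly this shape (the port and the Logstash URL below are placeholders rather than our real config, and it assumes Logstash is listening on an http input):

```js
// Rough shape of the Node.js receiver. The port and Logstash URL are
// placeholders, and forwarding over HTTP to Logstash is an assumption.
const http = require('http');

const LOGSTASH_URL = 'http://logstash:5043';

http.createServer((req, res) => {
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    // The real code does some processing on the XML before forwarding.
    const fwd = http.request(LOGSTASH_URL, { method: 'POST' }, (lsRes) => {
      lsRes.resume(); // drain Logstash's response
      res.writeHead(lsRes.statusCode);
      res.end();
    });
    fwd.on('error', (err) => {
      // A silent failure here would lose the event, so at least log it.
      console.error('forward to Logstash failed:', err.message);
      res.writeHead(502);
      res.end();
    });
    fwd.end(body);
  });
}).listen(8080);
```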
This works great on our live system (a Windows server plus a three-node CentOS Elasticsearch cluster); however, I am migrating to a containerised solution on our test system, which will eventually be rolled out to live.
Many events (approximately 55%, or 5,135 events over a 15-minute period) are being lost on the test system, and I do not know where. I know this because I can find event IDs that exist on the live system but not on the test system (the two systems share a data feed).

Does anyone have any ideas on how I could identify which part of the pipeline is the bottleneck and where events are being dropped?
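So far, the only approach I can think of is to compare counts at each stage over the same window: the document count on both Elasticsearch clusters, plus the in/filtered/out counters from Logstash's monitoring API. A sketch of what I mean (hostnames, the index name, and the timestamp field are placeholders, not our real config):

```js
// Stage-by-stage count check using Node 18+'s built-in fetch.
// Hostnames, index name, and timestamp field are placeholders.
async function esCount(host, index) {
  // Count documents indexed in the last 15 minutes.
  const res = await fetch(`${host}/${index}/_count`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      query: { range: { '@timestamp': { gte: 'now-15m' } } },
    }),
  });
  return (await res.json()).count;
}

async function logstashEvents(host) {
  // Logstash's monitoring API reports events in/filtered/out, which
  // should show whether events are dying inside Logstash itself.
  const res = await fetch(`${host}/_node/stats/events`);
  return (await res.json()).events;
}

(async () => {
  console.log('live ES count:', await esCount('http://es-live:9200', 'events'));
  console.log('test ES count:', await esCount('http://es-test:9200', 'events'));
  console.log('test Logstash:', await logstashEvents('http://logstash-test:9600'));
})();
```

If the Logstash in/out counters match, I would at least know the loss is happening before or after Logstash rather than inside it. Is that a sensible approach, or is there a better way? Any help would be much appreciated.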