The persistent queue uses memory-mapped files, which are read into off-heap memory. A queue that big will have thousands of page and checkpoint files, but only a few of them are mapped at any one time: the 'head' page, where writes go, and a few 'tail' pages, where reads and acks happen.
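For a sense of scale, here is a rough back-of-envelope sketch in Python. The 64 MB page size is the default `queue.page_capacity`; if you have changed that setting, swap in your own value.

```python
# Rough estimate of how many page files a large persistent queue holds.
# Assumes the default queue.page_capacity of 64 MB.
queue_bytes = 200 * 1024**3       # ~200 GB queue on disk (your reported size)
page_capacity = 64 * 1024**2      # 64 MB per page file (default)

page_files = queue_bytes / page_capacity
print(f"~{page_files:.0f} page files")  # ~3200 pages, only a handful mapped at once
```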
There is also the number of 'events in flight' to consider. Each worker thread reads a batch of 125 events (the default), so with, say, 24 workers that is 125 x 24 = 3000 events in LS at any one time. When a batch is JSON-serialized for a bulk index request to Elasticsearch, the memory held for that batch roughly doubles, and because the Elasticsearch output is multithreaded, each worker's batch can be serialized into memory at the same time.
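A minimal sketch of that arithmetic, assuming the default `pipeline.batch.size` of 125, 24 workers, and a hypothetical 1 MB average event (use your own figures):

```python
# In-flight memory estimate: batch size x workers x average event size,
# doubled while each worker's batch is JSON-serialized for the bulk request.
batch_size = 125              # pipeline.batch.size (default)
workers = 24                  # pipeline.workers
avg_event_bytes = 1_000_000   # example: 1 MB average event; adjust to your data

in_flight = batch_size * workers              # 3000 events in LS at once
heap_bytes = in_flight * avg_event_bytes * 2  # x2 for serialization overhead
print(in_flight, heap_bytes / 1e6, "MB")      # 3000 events, ~6000 MB
```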
To work out how much memory to give LS, you need to know the average and maximum size of your events.
For a 1 MB average event size, 3000 in-flight events can mean around 6000 MB of memory consumption once the serialization doubling is included, though in practice each batch is a mixed bag of event sizes.
However, from your reported queue size of 200 GB at 130 million events we get 2e11 / 130e6 ≈ 1538 bytes per event, or roughly 1.5 KB, and at that size 3000 in-flight events would consume only about 9 MB of heap.
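Plugging your reported queue numbers into the same back-of-envelope formula (the figures come from your post; the x2 is the serialization assumption above):

```python
# Derive the average event size from the reported queue stats and
# estimate the in-flight heap usage for 3000 events.
queue_bytes = 200e9       # ~200 GB persistent queue on disk
queued_events = 130e6     # ~130 million events
in_flight = 3000          # 125 batch size x 24 workers, as above

avg_event_bytes = queue_bytes / queued_events    # ~1538 bytes (~1.5 KB)
heap_estimate = in_flight * avg_event_bytes * 2  # x2 for bulk serialization
print(avg_event_bytes, heap_estimate / 1e6)      # ~1538 B per event, ~9 MB of heap
```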
Have you changed the batch size? And do you know whether you have any extremely large events, say 100 MB+?