We are currently running four droplets at DigitalOcean:
An Elasticsearch droplet (8GB RAM, 4 CPUs)
A deepstream droplet (512MB RAM, 1 CPU)
A Redis droplet (512MB RAM, 1 CPU)
A Java Tomcat droplet (1GB RAM, 1 CPU)
The flow is as follows: a request reaches deepstream from either the Java Tomcat droplet or our Android app, and data is then stored in or retrieved from Redis and Elasticsearch.
Saving and retrieving data this way is easy and fast, especially when using deepstream's subscribe methods to sync data to the client instantly.
Now to our problem: with a single deepstream client connection, we receive the message "No ACK message received in time for eventPrefix/." after a while. The connection no longer receives updates on the subscribed path, but we can still make requests with getRecord or has.
We saw great improvements after expanding the Elasticsearch droplet's resources. We went from a 512MB Elasticsearch heap to a 2GB heap and (now) a 4GB heap. With the 512MB heap we received the "No ACK" message instantly, with the 2GB heap after about 30 seconds, and with the 4GB heap after about 5 minutes.
Also, the Java Tomcat droplet gets stuck on a deepstream "has" command. The server simply doesn't respond and the script hangs on that call.
We disabled swapping on the Elasticsearch server by setting the following:
MAX_LOCKED_MEMORY=unlimited
bootstrap.mlockall: true
And of course, with our 8GB of system memory, ES_HEAP_SIZE=4g
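One thing that could be checked here is whether the memory lock actually took effect, since the setting can be ignored without any visible failure if the process limits do not allow it. The sketch below queries the Elasticsearch nodes info API (`/_nodes/process`) and looks at the `mlockall` field in the response. The class name, the helper names, and the `http://localhost:9200` address are assumptions for illustration, and the JSON is checked with a simple pattern match rather than a real parser.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Pattern;

public class MlockallCheck {
    private static final Pattern LOCKED = Pattern.compile("\"mlockall\"\\s*:\\s*true");
    private static final Pattern UNLOCKED = Pattern.compile("\"mlockall\"\\s*:\\s*false");

    // True only if at least one node reports mlockall=true and none reports false.
    static boolean allNodesLocked(String json) {
        return LOCKED.matcher(json).find() && !UNLOCKED.matcher(json).find();
    }

    // Fetches the raw JSON from the nodes info API (process section).
    static String fetchNodeProcessInfo(String baseUrl) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_nodes/process").openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed address; pass the real one as the first argument.
        String base = args.length > 0 ? args[0] : "http://localhost:9200";
        String json = fetchNodeProcessInfo(base);
        System.out.println(allNodesLocked(json)
                ? "mlockall is active on all nodes"
                : "mlockall is NOT active on at least one node");
    }
}
```

If any node reports `false`, the heap can still be swapped out despite the settings above.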
Question: Will expanding the Elasticsearch resources (especially more RAM) help fix the issues we are facing right now? Is there anything else that comes to mind that we might try?
We have a Tomcat server running a script that loops forever and updates about 50 records per minute (which shouldn't be too much). Each iteration of the script starts with the "has" command. After about 24 hours on average, the script gets stuck on that "has" command. We have about 8 threads running, and all of them get stuck on that command at the same time. It seems like the storage no longer answers. While the Tomcat script is stuck, we can still make calls from our Android app.
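As a stopgap while the root cause is unknown, the blocking call could be wrapped in a timeout so that the worker threads fail fast instead of hanging forever. The sketch below uses only `java.util.concurrent`. `TimedCall` and `callWithTimeout` are hypothetical names, and the sleeping lambda in `main` is just a stand-in for the real deepstream "has" request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedCall {
    // Daemon threads, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-call");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs a blocking call on a separate thread and gives up after timeoutMs.
     * Returns the fallback on timeout so the caller can log, reconnect, or retry.
     */
    public static <T> T callWithTimeout(Callable<T> call, long timeoutMs, T fallback)
            throws ExecutionException, InterruptedException {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; a call that ignores interrupts may keep running.
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real blocking request (assumption, not the deepstream API).
        Callable<Boolean> hangingHas = () -> {
            Thread.sleep(60_000);
            return true;
        };
        Boolean result = callWithTimeout(hangingHas, 500, null);
        System.out.println(result == null ? "has timed out" : "has returned " + result);
    }
}
```

This does not fix whatever stops the storage from answering, but a `null` result gives the script a point at which it can log the stall and recreate the client connection rather than blocking all 8 threads.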
Now to our Android app: after subscribing to a path, when a record update is written to that path, we receive the following messages:
No ACK message received in time for SUBSCRIBE {RECORD_NAME}
No message received in time for READ {RECORD_NAME}
After this, none of the subscribed records receives updates anymore, and we can no longer use getRecord, has, or any other call on the same deepstream client object. The connection seems dead.
I think there is a connection between the "No ack in time" and our back-end getting stuck on the "has" command.