I have set up an Elasticsearch cluster on AWS. The cluster has 2 shards, each with 1 replica, so I have 4 VMs in total. My goal is to use mongo-connector to dump 1 million documents from MongoDB on another VM to the Elasticsearch cluster. mongo-connector uses the Elasticsearch Python client, which uses the bulk API to index 500 documents per call. During the test, I found an interesting issue:
The first 600 bulk requests take ~0.25 seconds each; however, starting from the 601st API call, each bulk request takes ~1.25 seconds, a sudden jump of a full second. So the first 300,000 (600 × 500) documents were indexed very quickly, while the rest of the documents take much longer.
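For reference, each bulk call is roughly equivalent to the following minimal sketch (the host, the index name `mydb`, and the type `mycollection` are placeholders I made up for illustration, not the connector's actual code):

```python
from elasticsearch import Elasticsearch, helpers

# Placeholder host/index/type -- the real values come from the
# mongo-connector configuration, not from this sketch.
es = Elasticsearch(["172.31.1.254:9200"])

def index_batch(docs):
    # One _bulk call per 500 documents, mirroring the connector's behaviour.
    actions = [
        {
            "_op_type": "index",
            "_index": "mydb",
            "_type": "mycollection",
            "_id": str(doc["_id"]),
            "_source": doc,
        }
        for doc in docs
    ]
    helpers.bulk(es, actions, chunk_size=500, request_timeout=60)
```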
Below are the mongo-connector logs:
2015-09-24 09:06:00,369 [INFO] mongo_connector.connector:1030 - Beginning Mongo Connector
2015-09-24 09:06:00,369 [WARNING] mongo_connector.connector:116 - MongoConnector: Can't find oplog.timestamp, attempting to create an empty progress log
2015-09-24 09:06:00,374 [INFO] mongo_connector.connector:223 - MongoConnector: Empty oplog progress file.
2015-09-24 09:06:00,377 [INFO] mongo_connector.oplog_manager:86 - OplogThread: Initializing oplog thread
2015-09-24 09:06:00,392 [INFO] mongo_connector.connector:295 - MongoConnector: Starting connection thread MongoClient('localhost', 27017)
2015-09-24 09:06:02,963 [INFO] urllib3.connectionpool:205 - Starting new HTTP connection (1): 172.31.1.254
2015-09-24 09:06:03,387 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.259s]
2015-09-24 09:06:06,078 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.233s]
2015-09-24 09:06:07,719 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.244s]
2015-09-24 09:06:11,288 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.228s]
..........594 lines omitted
2015-09-24 09:29:12,213 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.232s]
2015-09-24 09:29:13,688 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:0.260s] the 600th bulk call
2015-09-24 09:29:25,096 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:1.211s] the 601st bulk call
2015-09-24 09:29:36,250 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:1.244s]
2015-09-24 09:29:46,413 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:1.211s]
2015-09-24 09:29:56,851 [INFO] elasticsearch:64 - POST http://172.31.1.254:9200/_bulk?timeout=60&refresh=false [status:200 request:1.259s]
......... more logs ....
As we can see from the log, for the first 600 requests each log entry appears about 1 second after the previous one, and the actual server processing time is around 0.25 seconds. Starting from the 601st request, the interval between log entries suddenly jumps to ~10 seconds per request, and the server processing time becomes ~1.25 seconds.
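Note that the ~10-second interval between log entries is much larger than the ~1.25 seconds the server reports, so a good part of the delay seems to be spent on the client side rather than inside Elasticsearch. A minimal sketch (hypothetical helper name `timed_bulk`, placeholder host) of how the two could be measured separately:

```python
import time
from elasticsearch import Elasticsearch, helpers

# Placeholder host, matching the endpoint in the logs above.
es = Elasticsearch(["172.31.1.254:9200"])

last_end = None

def timed_bulk(actions):
    # Separate the gap since the previous call (time spent outside
    # Elasticsearch, e.g. reading the next batch from MongoDB) from the
    # duration of the bulk call itself.
    global last_end
    start = time.time()
    if last_end is not None:
        print("gap since previous bulk: %.3fs" % (start - last_end))
    helpers.bulk(es, actions, request_timeout=60)
    last_end = time.time()
    print("bulk call took: %.3fs" % (last_end - start))
```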
What could be the problem? Can anyone help?