Scroll starts fast but finishes slow

javadevmtl · October 22, 2015, 4:24pm

Hi, I'm running 1.7.3.

I have a 4-node cluster with 32 cores and 30GB per node, on SSD drives. I have an index with 30,000,000 records split across 32 shards, so 8 shards per machine.

I then run a sorted search with fields.

I set the scroll size to 100 (this doesn't seem to matter; I tested various sizes). With each iteration of the scroll it gets slower and slower.

For the first few scrolls it runs at a "decent" speed, and after that you can just see it getting slower and slower...

Any thoughts on this?

Thanks

javadevmtl · October 22, 2015, 6:07pm

Ok I have pinpointed the bottleneck.

It seems that when specifying fields it's orders of magnitude slower than just getting back the full document. I'm pretty sure this used to be faster in 1.5.x.

Here is the code.

http://pastebin.com/aA2dVgbE

The same report takes under a minute when just writing the source doc line by line, while doing it by field takes 30 minutes.

javadevmtl · October 22, 2015, 6:16pm

Ok, so it was me building the header. I moved the header loop inside the if block that checks whether the header was already built, and performance is back where it's supposed to be... I still don't see how that could make it slower?

This is the good way:

if(!headerWritten) {
	for(Entry<String, SearchHitField> e : hit.fields().entrySet()){
		header += "\"" + e.getKey() + "\",";
	}
	header += "\r\n";
	//System.out.println(header);
	fop.write(header.getBytes());
	headerWritten = true;
}

for(Entry<String, SearchHitField> e : hit.fields().entrySet()){
	line += "\"" + e.getValue().value().toString() + "\",";
}

line += "\r\n";

Is it because it has to go all the way to the top of the results to pull the header all the time?
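
A possible explanation, offered as a guess since the pastebin code isn't reproduced here: if header is declared outside the per-hit loop and appended to on every hit without being reset, it grows by one full header per hit, and each += copies the whole string again. That would give exactly the "starts fast, finishes slow" pattern, and it would come from the client-side string building rather than from Elasticsearch going back to the top of the results. Below is a minimal sketch of building each line from scratch instead. A plain Map stands in for hit.fields(), and the names CsvExportSketch, csvLine and export are made up for illustration. Unlike the snippet above, it also drops the trailing comma.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class CsvExportSketch {

    // Builds one quoted CSV line ending in CRLF from a fresh joiner,
    // so nothing accumulates from one hit to the next.
    static String csvLine(Iterable<?> cells) {
        StringJoiner joiner = new StringJoiner(",", "", "\r\n");
        for (Object cell : cells) {
            joiner.add("\"" + cell + "\"");
        }
        return joiner.toString();
    }

    // Writes the header once (from the first hit's field names),
    // then one value line per hit.
    // Assumption: each Map is a stand-in for hit.fields() with values already unwrapped.
    static void export(List<Map<String, Object>> hits, OutputStream out) throws IOException {
        boolean headerWritten = false;
        for (Map<String, Object> hit : hits) {
            if (!headerWritten) {
                out.write(csvLine(hit.keySet()).getBytes(StandardCharsets.UTF_8));
                headerWritten = true;
            }
            out.write(csvLine(hit.values()).getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Object> hit = new LinkedHashMap<>();
        hit.put("id", 1);
        hit.put("name", "a");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        export(Arrays.asList(hit, hit), out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

The same idea applies to line in the snippet above: it should start empty for every hit, otherwise each written row would carry all the previous rows with it.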
