Loading performance - what should I expect?

This is day 1 of me using Elasticsearch, so apologies for the basic question.

Line-by-line loading seems really slow, but I've no idea what to expect. I have 80-character text strings (lines of English words from books) and want to load about 400 million of them as documents. At 100 strings per second, this is nowhere near the speed I was expecting. However, searching the first 5 million loaded documents performs well.

I have a single-node dev machine with an SSD and 16 GB of RAM, running Ubuntu 18.04 and Elasticsearch 7.6.2. No other major jobs are running. The load is being done from Python.

I previously loaded the data into PostgreSQL in a few hours, which set my speed expectations. But, as you'd expect, the search performance wasn't good enough - hence looking at Elasticsearch.

I'll persevere if people say "yes, that's normal".

Have you followed these guidelines, e.g. around using bulk requests of a suitable size?

Not yet - thanks for the pointer. I knew of bulk loading but wanted to get something simple working. I'll try it out.

It generally makes a huge difference, so it's well worth using.
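
For anyone finding this thread later, here is a minimal sketch of what bulk loading from Python might look like. It assumes the official `elasticsearch` Python client and its `helpers.bulk` function; the index name `book_lines`, the field name `text`, the local URL, and the chunk size of 5000 are hypothetical choices for illustration, not details from the original poster's code.

```python
from typing import Iterable, Iterator


def generate_actions(lines: Iterable[str], index_name: str) -> Iterator[dict]:
    """Yield one bulk index action per non-empty line of text."""
    for line in lines:
        text = line.rstrip("\n")
        if text:
            # "text" is a hypothetical field name for this sketch.
            yield {"_index": index_name, "_source": {"text": text}}


def bulk_load(path: str, index_name: str = "book_lines", chunk_size: int = 5000):
    """Stream a text file into Elasticsearch using bulk requests."""
    # Imported here so generate_actions can be used without the client installed.
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch("http://localhost:9200")  # assumed local dev node
    with open(path, encoding="utf-8") as handle:
        # helpers.bulk groups the actions into requests of chunk_size documents
        # and returns a (success_count, errors) tuple.
        return helpers.bulk(
            client, generate_actions(handle, index_name), chunk_size=chunk_size
        )
```

Because the actions come from a generator, the file is never held in memory all at once. The best chunk size depends on document size and hardware, so it is worth experimenting with a few values.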

I've rewritten my code to use bulk insertion and achieved a 150-fold speed-up: about 15,000 documents per second. So that was good advice - thanks.
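
To put the two rates from this thread in perspective, here is a rough estimate of the total load time for 400 million documents. It assumes the indexing rate stays constant for the whole load, which may not hold in practice; the function name is made up for this sketch.

```python
def load_time_hours(total_docs: int, docs_per_second: float) -> float:
    """Estimated load time in hours, assuming a constant indexing rate."""
    return total_docs / docs_per_second / 3600


TOTAL_DOCS = 400_000_000

slow_hours = load_time_hours(TOTAL_DOCS, 100)     # line-by-line rate from the thread
fast_hours = load_time_hours(TOTAL_DOCS, 15_000)  # bulk rate from the thread

print(f"line-by-line: about {slow_hours / 24:.1f} days")  # about 46.3 days
print(f"bulk:         about {fast_hours:.1f} hours")      # about 7.4 hours
```

At the bulk rate the full load drops from weeks to a matter of hours, the same order of magnitude as the PostgreSQL load mentioned in the question.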
