Problem inserting data into elasticsearch


(Jakub) #1

Hi,

I'm using the latest elasticsearch release, 0.12.0.
I have a problem inserting data into elasticsearch.
I'm using the "elasticsearch" PHP client to insert the data.

The inserts are done in a loop.
The PHP script inserts entries 0 - 10000 and reloads itself.
Next it inserts entries 10000 - 20000, then it reloads again and starts
from 20000.
The script should keep working like this to the end, about 190000 entries.

Inserting a single entry works fine. The problem is that the script stops
every time after inserting 28233 entries.
I'm 100% sure the data of entry 28233 is not the problem, because when
starting from 10000 the script goes through entry 28233 without any
problem and stops at entry 38233. So it always inserts only 28233
entries.
The PHP client returns this error:
Couldnt connect to host [123.123.123.123], ElasticSearch down?

(123.123.123.123 is a fake IP address.)

The problem doesn't generate any entry in the log file.
ElasticSearch is not down after the problem happens.

Has anyone had the same problem?
Can anyone help me find the cause of this problem?

Thank You
Regards
Jakub


(Michiel) #2

This could be a memory problem in PHP itself, so you could check whether
that's the case. Putting that many items into a single array, for example,
also needs a correspondingly large amount of memory.

(In reply to Jakub, Mon, Oct 25, 2010 at 12:18 PM.)


(Jakub) #3

Thanks for the answer.

I'm 100% sure it's not a PHP memory problem.
The PHP setting "memory_limit" is set to 128M in the php.ini file.
The script that breaks with the error "Couldnt connect to host
[123.123.123.123], ElasticSearch down?" uses only 20.75 megabytes,
so that's not the cause of the problem.

Regards
Jakub

(In reply to Michiel Eghuizen, 25 Oct, 12:26.)


(Shay Banon) #4

Is there a chance that it ends up opening and closing connections for each
index request? You might run out of file descriptors (sockets) that way.
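
One observation fits this hypothesis: the script always fails after exactly 28233 inserts. Assuming the PHP client runs on Linux with the older default net.ipv4.ip_local_port_range of 32768 to 61000 (an assumption; the thread does not state the client's OS or settings), that is exactly the number of local ports available for outgoing connections. If every index request opens a new connection, each closed connection lingers in TIME_WAIT and keeps its local port busy for a while, so a fast loop could use up the whole range. A minimal Python sketch of the arithmetic:

```python
# Assumed values: the older Linux default for net.ipv4.ip_local_port_range.
# The thread does not confirm the client machine's actual settings.
EPHEMERAL_LOW = 32768
EPHEMERAL_HIGH = 61000


def ephemeral_port_count(low: int, high: int) -> int:
    """Number of local ports usable for outgoing connections (inclusive range)."""
    return high - low + 1


if __name__ == "__main__":
    # One port is consumed per new outgoing connection to the same host and port.
    print(ephemeral_port_count(EPHEMERAL_LOW, EPHEMERAL_HIGH))  # prints 28233
```

If this is the cause, reusing one connection for many requests should avoid the limit, whereas pausing between batches only gives old connections time to expire.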

(In reply to Jakub, Mon, Oct 25, 2010 at 4:02 PM.)


(Jakub) #5

Hi,

It looks like curl, as used in the elasticsearch PHP client, doesn't
like being executed in such a big loop.
I tried to modify the elasticsearch PHP client to use curl_multi_exec
instead of curl_exec, but that killed elasticsearch.
Finally I changed the number of entries per loop to 1000, and now the
script runs longer but finishes without any errors.

Thank you for the help.

Regards.
Jakub
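
The workaround described above (smaller batches, combined with the 2-second pause mentioned later in the thread) can be sketched roughly as follows. This is a Python illustration, not the PHP client's code: `send_one` is a hypothetical callable standing in for whatever indexes a single entry, and the batch size and pause are parameters to tune. Pausing only works around the symptom; it does not by itself explain why connections fail.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter batch


def index_in_batches(
    docs: Iterable[T],
    send_one: Callable[[T], None],  # hypothetical: indexes one entry
    batch_size: int = 1000,
    pause_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send every document, pausing between batches; return how many were sent."""
    sent = 0
    for number, batch in enumerate(batches(docs, batch_size)):
        if number > 0:
            sleep(pause_seconds)  # pause before every batch except the first
        for doc in batch:
            send_one(doc)
            sent += 1
    return sent
```

For example, `index_in_batches(range(2500), send_one=my_index_call)` would send 2500 entries in three batches with two pauses, where `my_index_call` is whatever function performs a single insert.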

(In reply to Shay Banon, 25 Oct, 16:12.)


(Jakub) #6

Hi

I'm reviving this thread because the problem came back after I migrated the application to a new server.
My fix on the old server was to split the big insert loop into pieces of 1000 entries and pause for 2 seconds between pieces.
So the insert loop runs against elasticsearch and pauses for 2 seconds after every 1000 entries.

Now, after migrating the site to the new "faster" server, I run the same script and I see the error
"Couldnt connect to host [123.123.123.123], ElasticSearch down? "
When I pause for 3 seconds instead of 2, everything works.

I looked a little into the "elasticsearch PHP client" and found the same "sleep(2)" call that I was using.
It's used in the "elasticsearch PHP client" when 2 actions use the same entry.
For example, the function testStringSearch() adds an entry to elasticsearch, waits 2 seconds, and then searches for that entry.

So to me it looks like elasticsearch needs about 2 seconds to insert/update an entry in storage before it becomes visible.

Back to my problem. I think that when so much data is inserted in a loop, elasticsearch can't keep up with the loop, and this temporarily kills or hangs elasticsearch ("Couldnt connect to host [123.123.123.123], ElasticSearch down? ").

Can anyone confirm that inserting data into elasticsearch needs some time, about 2 seconds as in the example?
My next question is about the new server's settings. How is it possible that the old server runs the loop with a 2-second pause while the new server needs a 3-second pause? The new server is faster, has more memory, and so on. Even the PHP memory limit is higher: 128 MB on the old server, 256 MB on the new one. So is there any setting in elasticsearch to speed up saving data?


(Shay Banon) #7

By default, an index is refreshed every 1 second (plus refresh overhead) to make indexed documents visible to search. Usually, in tests, there isn't really a need to add waits; simply use the refresh API to make sure that what was indexed is visible to the tests.
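
As a rough illustration of the refresh API mentioned above, the following Python sketch builds a POST request to an index's _refresh endpoint. The base URL http://localhost:9200 and the index name myindex are placeholders, not values from this thread, and the sketch is in Python rather than PHP purely for illustration.

```python
import urllib.request


def build_refresh_request(base_url: str, index: str) -> urllib.request.Request:
    """Build a POST request for the index's _refresh endpoint (no request body)."""
    url = f"{base_url.rstrip('/')}/{index}/_refresh"
    return urllib.request.Request(url, method="POST")


def refresh_index(base_url: str, index: str, timeout: float = 10.0) -> int:
    """Send the refresh request and return the HTTP status code."""
    request = build_refresh_request(base_url, index)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder host and index name; only builds the request, sends nothing.
    request = build_refresh_request("http://localhost:9200", "myindex")
    print(request.get_method(), request.full_url)
```

A test would index its documents, call something like `refresh_index(...)` once, and then search, instead of sleeping for a fixed number of seconds.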

Regarding the failure, I am not sure where it's coming from. What is the actual failure? Is that a connect timeout?
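
To help answer this question, a client could report which kind of connection failure occurred instead of a generic message. The sketch below, in Python for illustration, sorts the exception from a failed connect into a few cases. The category names are made up for this example, and the PHP client in the thread would need its own equivalent, for instance by looking at curl's error code.

```python
import errno
import socket


def classify_connect_error(exc: OSError) -> str:
    """Map a connection exception to a short label (labels are illustrative)."""
    if isinstance(exc, socket.timeout):
        return "connect timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"
    if exc.errno == errno.EADDRNOTAVAIL:
        # Typically seen when no local (ephemeral) port is free.
        return "no local port available"
    if exc.errno == errno.EMFILE:
        # The process has hit its file descriptor limit.
        return "too many open files"
    return "other"


def try_connect(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a TCP connect and return "ok" or a failure label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)
```

Calling `try_connect` against the elasticsearch host and HTTP port at the moment the script fails would show whether the client timed out, was refused, or could not open a socket at all.
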

(In reply to Jakub, Thursday, April 21, 2011 at 12:06 PM.)

