We faced the same problem recently (indexing was slow via curl but fast via
another library), and found this option:
libcurl makes all POST and PUT requests (except for POST requests with a
very tiny request body) use the "Expect: 100-continue" header. This header
allows the server to deny the operation early so that libcurl can bail out
already before having to send any data. This is useful in authentication
cases and others.
However, many servers don't implement the Expect: stuff properly and if the
server doesn't respond (positively) within 1 second libcurl will continue
and send off the data anyway.
You can disable libcurl's use of the Expect: header the same way you disable
any header, using -H / CURLOPT_HTTPHEADER, or by forcing it to use HTTP 1.0.
Indexing is faster now with curl.
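As a rough illustration of the -H approach quoted above, here is a minimal Python sketch that builds a curl command line with the Expect: header suppressed. The function names, the URL and the file name are hypothetical and only for illustration; the curl flags themselves (-s, -X, -H, --data-binary) are standard ones.

```python
import subprocess


def build_curl_command(url, body_path):
    """Return a curl argv that PUTs a file without 'Expect: 100-continue'."""
    return [
        "curl",
        "-s",                             # no progress meter
        "-X", "PUT",
        "-H", "Expect:",                  # empty value removes the header
        "--data-binary", "@" + body_path, # send the file contents unmodified
        url,
    ]


def put_document(url, body_path):
    """Run curl and return the response body (assumes curl is on PATH)."""
    result = subprocess.run(
        build_curl_command(url, body_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical index URL and document file, adjust to your setup.
    print(put_document("http://localhost:9200/myindex/doc/1", "doc.json"))
```

In PHP the same idea would go through CURLOPT_HTTPHEADER, as the quoted text says.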
On Tue, Sep 28, 2010 at 3:20 PM, Eugene firstname.lastname@example.org wrote:
PHP client does indeed use curl, so i think i'll just rewrite the
import script to use raw socket (fsockopen()). This will cover my
needs so far.
Thrift on the other hand looks really promising, i'll look into it when
the system i am prototyping goes into production.
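For reference, the raw-socket idea mentioned above could look roughly like the following Python sketch (the PHP version would use fsockopen() instead). It sends a plain HTTP/1.0 PUT, so no Expect: header is involved at all. The host, port, path and function names are assumptions for illustration, not anything taken from the actual import script.

```python
import socket


def build_put_request(host, path, body):
    """Build a raw HTTP/1.0 PUT request (bytes) with no Expect header."""
    head = (
        f"PUT {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def put_document(host, port, path, body):
    """Send the request and return the raw response, read until EOF."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_put_request(host, path, body))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Hypothetical local node and document.
    print(put_document("localhost", 9200, "/myindex/doc/1", b'{"title": "test"}'))
```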
On Sep 28, 2:38 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
Another option is the new REST Thrift client at master: http://github.com/elasticsearch/elasticsearch/issues/closed#issue/354. There
is a PHP client in the works, maybe add it there? http://github.com/nervetattoo/elasticsearch (I am not sure if it uses curl
On Tue, Sep 28, 2010 at 2:26 PM, Eugene glum...@gmail.com wrote:
I just tried with HTTPClient and it does indeed seem to get my big
document through to ES fast! Thanks a million mate, i've been banging
my head against the wall all day today.
My original plan was to use curl php extension in a php script (turns
out it is as slow as the command line curl) and iterate through my
mongodb documents. I will now look into a different php client...
On Sep 28, 2:11 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
It's curl acting up. It takes about 2-8 milliseconds on my laptop to index
it using a different HTTP client (I used the Mac one called HTTPClient).
How do you plan to load the docs?
On Tue, Sep 28, 2010 at 1:20 PM, Eugene glum...@gmail.com wrote:
I'm struggling with a strange ElasticSearch insert problem (tried both
0.10 and git master versions). On small ~800 chars documents the
insert takes 0.015s (http://tinypaste.com/6f186) but on bigger
~4000 chars documents the insert time jumps to over 2 seconds:
I have one local es instance and i've tried with different number of
shards and store options. I've also tried to tweak 'direct',
'buffer_size', 'warm_cache' etc. Plus i've tried starting
elasticsearch with different -Des.index.gateway.snapshot_interval and
-Des.index.engine.robin.refresh_interval arguments. Nothing seems to help.
I hope you can point me in the right direction so i can index all of
my 200000 documents (both small and big).