Really Slow Indexing on Win7

Hi ES Friends,

I'm trying to PUT documents into my Elasticsearch index from an ASP.NET
app, and it takes on average three seconds to index one small document! Way
too slow! Can someone tell me what I'm doing wrong?

I am running Windows 7 in my test environment on a quad-core with
Hyper-Threading (8 logical CPUs) at 3.33 GHz
I have 12 GB of RAM
I am running 64-bit Windows
I am running jdk1.7.0_04 64-bit

ES_HEAP_SIZE: 4g
ES_JAVA_OPTS: -Xms2g -Xmx4g

.NET code (using the samples in the installation docs):

    [TestMethod]
    public void PutTest()
    {
        string url = "http://localhost:9200/twitter/user/kimchy"; // index/type/id

        string postData = new JavaScriptSerializer().Serialize(new {
            user = "kimchy",
            postDate = "2009-11-15T13:12:00",
            message = "Trying out Elastic Search, so far so good?"
        });

        Console.WriteLine("postData is:");
        Console.WriteLine(postData);

        WebClient client = new WebClient();
        //client.Headers.Add("Content-Type", "application/x-www-form-urlencoded");

        Stopwatch sw = new Stopwatch();
        sw.Start();
        byte[] responseArray = client.UploadData(url, "PUT", Encoding.ASCII.GetBytes(postData));
        sw.Stop();
        Console.WriteLine();
        Console.WriteLine("Elapsed: {0}", sw.Elapsed);
        Console.WriteLine("Response:");
        Console.WriteLine(Encoding.ASCII.GetString(responseArray));
    }

And last but not least, here is the serialized document that .NET is
posting:

{"user":"kimchy","postDate":"2009-11-15T13:12:00","message":"Trying out Elastic Search, so far so good?"}

Why is my indexing taking three seconds?!

Cheers,

Allison A.

--

Hi Allison,
Have you tried indexing the document directly using elasticsearch-head or
any other REST client?
Does it take this long there too,
or is it a problem only when you index from the .NET code?

regards,
-Parag

(In reply to Allison A.'s message of Tuesday, 14 August 2012 23:28:12 UTC+2.)

--

Hi, Allison --

Depending on the storage configuration you have, e.g., a FAT32 filesystem on an external USB drive, Windows can have impressively poor performance managing file handles. I'd rule that out before looking elsewhere.

-- Paul

(In reply to Allison A.'s message of Aug 14, 2012, 2:28 PM.)

--

Hi,

> I'm trying to PUT documents into my Elasticsearch index from an ASP.NET app, and it takes on average three seconds to index one small document! Way too slow! Can someone tell me what I'm doing wrong?
>
> Stopwatch sw = new Stopwatch();
> sw.Start();
> byte[] responseArray = client.UploadData(url, "PUT", Encoding.ASCII.GetBytes(postData));
> sw.Stop();

Note here that you're measuring the time taken to upload the data to ES, not index it. That call will return before your document has been indexed. My gut instinct would be that it's not ES that's slow, it's something in your network infrastructure. Perhaps a slow DNS lookup, or whatever Windows uses these days? (It's been a few years...)

Try:

  • Repeating the experiment with curl rather than your .NET code
  • Using your .NET code to upload a small document to a web server rather than ES

If the second example is slow, then it rules out ES performance specifically. If the first is fast, then it's something in your .NET code.
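
One way to run both of these checks from .NET is to time several consecutive requests rather than a single one, so that any one-time cost (connection setup, name resolution, proxy detection) shows up separately from the per-request cost. The following is a minimal sketch, not a diagnosis: `PutTiming`, `TimeRepeated`, and `Run` are hypothetical names, and setting `client.Proxy = null` is only one hypothesis worth testing, since it makes `WebClient` skip automatic proxy detection.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Text;

public static class PutTiming
{
    // Runs `request` `count` times and returns the elapsed time of each run,
    // in order, so the first call can be compared with the later ones.
    public static List<TimeSpan> TimeRepeated(Action request, int count)
    {
        var timings = new List<TimeSpan>(count);
        for (int i = 0; i < count; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            request();
            sw.Stop();
            timings.Add(sw.Elapsed);
        }
        return timings;
    }

    // Hypothetical driver: point `url` at ES, or at any other web server
    // that accepts a PUT, to compare the two cases.
    public static void Run()
    {
        string url = "http://localhost:9200/twitter/user/kimchy"; // index/type/id
        byte[] body = Encoding.UTF8.GetBytes(
            "{\"user\":\"kimchy\",\"postDate\":\"2009-11-15T13:12:00\"," +
            "\"message\":\"Trying out Elastic Search, so far so good?\"}");

        using (var client = new WebClient())
        {
            // Assumption to test, not a confirmed cause: skip automatic
            // proxy detection. Comment this line out to compare.
            client.Proxy = null;

            List<TimeSpan> timings = TimeRepeated(() =>
            {
                // Set per request; newer Elasticsearch versions reject
                // bodies without a JSON content type.
                client.Headers[HttpRequestHeader.ContentType] = "application/json";
                client.UploadData(url, "PUT", body);
            }, 5);

            for (int i = 0; i < timings.Count; i++)
            {
                Console.WriteLine("Request {0}: {1}", i + 1, timings[i]);
            }
        }
    }
}
```

If only the first request is slow, the cost is probably in client or network setup rather than in ES. If every request is slow against both ES and another web server, the client code or the network path is the more likely suspect.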

For comparison, I have a small, single-threaded Python loader (using pyes) that does a geocoding lookup for each document before indexing it, talking to ES over HTTP. It manages to get well over 100 documents of about 2.5 KB each per second into ES, using the bulk index interface.
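
For .NET readers, the bulk interface mentioned here takes a newline-delimited body: one action line, then one document line, repeated, with a trailing newline. Below is a rough sketch of building such a body. `BulkBody` and `Build` are hypothetical names, the `_type` field follows the index/type/id scheme used in this thread, and the code assumes the index name, type, and ids contain no characters that need JSON escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BulkBody
{
    // Builds a bulk request body from (id, already-serialized JSON) pairs.
    // Each document becomes two lines: an "index" action, then its source.
    public static string Build(string index, string type,
                               IEnumerable<KeyValuePair<string, string>> docs)
    {
        var sb = new StringBuilder();
        foreach (KeyValuePair<string, string> doc in docs)
        {
            // Action line: where this document should be indexed.
            sb.Append("{\"index\":{\"_index\":\"").Append(index)
              .Append("\",\"_type\":\"").Append(type)
              .Append("\",\"_id\":\"").Append(doc.Key).Append("\"}}\n");

            // Source line: the document itself, on a single line.
            sb.Append(doc.Value).Append('\n');
        }
        return sb.ToString();
    }
}
```

The resulting string could then be sent in one request, for example with `client.UploadData("http://localhost:9200/_bulk", "POST", Encoding.UTF8.GetBytes(body))`, so that many documents share one HTTP round trip.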

Cheers,
Dan

--
Dan Fairs | dan.fairs@gmail.com | @danfairs | www.fezconsulting.com

--

And if you have any antivirus software, exclude the ES data directory from its scans.

Are subsequent indexing requests also slow?

I use ES under Windows 7 and I can index about 200-300 docs per second.

David

(In reply to Dan Fairs's message of 15 August 2012, 10:34.)

--