Slow Indexing

Hi,

While testing a possible move from Lucene to Elasticsearch, I wrote
Elasticsearch code with the NEST client that imitates the Lucene indexing I'm
doing in my code.

The result was that the Elasticsearch indexing took 10 times longer than the
direct Lucene indexing of the same data.

This is the NEST code I use:

Call once from the constructor –

    private void CreateEsIndex()
    {
        MapAndAnalyze(
            a => a.Analyzers(an => an.Add(snowball, new SnowballAnalyzer { Language = language })),
            m => m
                .Properties(p => p
                    .String(sm => sm
                        .Name(f => f.Description)
                        .IndexAnalyzer(snowball)
                        .SearchAnalyzer(snowball)
                        .Store(false)
                    )
                    .String(sm => sm.Name(f => f.Board).Store(false))
                    .String(sm => sm.Name(f => f.User).Store(false))
                    .String(sm => sm.Name(f => f.IDPin).Store())
                    .String(sm => sm.Name(f => f.IDPicture).Store())
                )
                .DisableAllField()
        );
    }

    private void MapAndAnalyze(
        Func<AnalysisDescriptor, AnalysisDescriptor> analysisSelector,
        Func<RootObjectMappingDescriptor<Document>, RootObjectMappingDescriptor<Document>> typeMappingDescriptor
    )
    {
        if (!client.IndexExists(index).Exists)
        {
            var result = client.CreateIndex(
                index, c => c
                    .NumberOfReplicas(0)
                    .NumberOfShards(5)
                    .Analysis(analysisSelector)
                    .AddMapping(typeMappingDescriptor)
            );
            Log.InfoFormat("Index Created: {0}", result.OK);
        }
    }

Call for each indexing –

    var indexResult = client.Index(
        new Document(
            pin.Id.ToString(), pin.InnerData.PictureID.ToString(), pin.InnerData.Board, pin.InnerData.User,
            pin.InnerData.Description.ToString(), id),
        index,
        new IndexParameters { Refresh = true });

Thanks,

Ophir

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Could you try with Bulk instead of single insertion?
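
To make the Bulk suggestion concrete, here is a minimal sketch of the request body that the Elasticsearch `_bulk` endpoint accepts: newline-delimited JSON, with one action line followed by one source line per document, and a trailing newline. It uses only `System.Text.Json` rather than NEST, because NEST's bulk helpers differ between versions. The names `BulkSketch` and `BuildBulkBody`, the index name `pins`, the type name `document`, and the sample field values are all assumptions made for illustration. The `_type` entry matches Elasticsearch versions of this thread's era; newer versions drop mapping types.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class BulkSketch
{
    // Builds a newline-delimited JSON body for the _bulk endpoint.
    // Each document contributes an "index" action line and a source line.
    // All names here are hypothetical, not taken from the original code.
    public static string BuildBulkBody(
        string index, string type, IEnumerable<KeyValuePair<string, object>> docs)
    {
        var sb = new StringBuilder();
        foreach (var doc in docs)
        {
            // Action line: which index/type/id the next line belongs to.
            var action = new { index = new { _index = index, _type = type, _id = doc.Key } };
            sb.Append(JsonSerializer.Serialize(action)).Append('\n');

            // Source line: the document itself, serialized by its runtime type.
            sb.Append(JsonSerializer.Serialize(doc.Value)).Append('\n');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Two made-up documents standing in for the pins indexed in the thread.
        var docs = new List<KeyValuePair<string, object>>
        {
            new KeyValuePair<string, object>("1", new { Board = "travel", User = "alice" }),
            new KeyValuePair<string, object>("2", new { Board = "food", User = "bob" })
        };
        Console.Write(BuildBulkBody("pins", "document", docs));
    }
}
```

A single POST of this body to `/_bulk` replaces one HTTP round trip per document with one round trip per batch, which is usually where most of the per-document overhead goes.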

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 6 Feb 2013, at 12:47, Ophir Michaeli ophirmichaeli@gmail.com wrote:

Hi,

Testing a possible move from lucene to elasticsearch I created elasticsearch code with NEST client that imitates the lucene indexing I'm doing in my code.

The results were that the elasticsearch indexing took 10 times longer than the direct lucene indexing of the same data.


Thanks,

Ophir


I see that you perform a refresh with the index operation. Are you doing
this for each request?
Setting refresh=true on every index request is not recommended. By default
ES performs a refresh periodically (every 1 second). The automatic refresh
interval is configurable.
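
As a sketch of what "configurable" means here: the refresh period is controlled by the index setting `index.refresh_interval`, which can be changed with a `PUT /{index}/_settings` request. The snippet below only builds the JSON bodies for that request. The class and method names are hypothetical, and the pattern of setting `"-1"` during a large load and restoring `"1s"` afterwards is a common approach, not something prescribed in this thread.

```csharp
using System;
using System.Text.Json;

public static class RefreshSettingsSketch
{
    // Builds the JSON body for PUT /{index}/_settings that changes
    // index.refresh_interval. The helper name is hypothetical.
    public static string RefreshIntervalBody(string interval)
    {
        return JsonSerializer.Serialize(new { index = new { refresh_interval = interval } });
    }

    public static void Main()
    {
        // "-1" turns the periodic refresh off, e.g. while loading data.
        Console.WriteLine(RefreshIntervalBody("-1"));

        // "1s" restores the default period mentioned above.
        Console.WriteLine(RefreshIntervalBody("1s"));
    }
}
```

With the periodic refresh left to Elasticsearch, the `Refresh = true` parameter can be dropped from the per-document index call, so documents become searchable at the next scheduled refresh rather than forcing one per request.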

Martijn

On 6 February 2013 13:27, David Pilato david@pilato.fr wrote:

Could you try with Bulk instead of single insertion?


--
Kind regards,

Martijn van Groningen
