Integration tests and consistency

Hi.

I've created integration tests for the ES integration in our product. We use
version 0.18.6.

For all tests, the flow is like this:

Index data -> this.client.index( request ).actionGet();
Flush index -> client.admin().indices().flush( Requests.flushRequest( INDEX_NAME ).refresh( true ) ).actionGet();
Query -> this.client.search( req ).actionGet();

When running a lot of tests, usually one or two tests fail with unexpected
results from the query, which suggests that the query is executed before the
index is properly updated. When running all the tests again, another test may
fail while the previous one succeeds.
What can I do to ensure that the tests run properly each time? I
thought the flush would make sure that everything was flushed into
the index?

The settings for the test environment are like this:

protected static final String NUMBER_OF_SHARDS = "5";

protected static final String NUMBER_OF_REPLICAS = "1";

public ImmutableSettings.Builder buildSettings()
{
    ImmutableSettings.Builder settings = ImmutableSettings.settingsBuilder();
    settings.loadFromSource( buildDistributionSettings() );
    settings.loadFromSource( buildStorageSettings() );
    //settings.loadFromSource( buildAnalyserSettings() );

    return settings;
}

private String buildStorageSettings()
{
    try
    {
        return jsonBuilder().startObject()
            .startObject( "store" )
            .field( "type", "memory" )
            .endObject()
            .endObject()
            .string();
    }
    catch ( IOException e )
    {
        throw new ContentIndexException( "Not able to create settings for index", e );
    }
}

private String buildDistributionSettings()
{
    try
    {
        return jsonBuilder().startObject()
            .field( "number_of_shards", NUMBER_OF_SHARDS )
            .field( "number_of_replicas", NUMBER_OF_REPLICAS )
            .endObject()
            .string();
    }
    catch ( IOException e )
    {
        throw new ContentIndexException( "Not able to create settings for index", e );
    }
}

Regards,

Runar Myklebust

Hi,

I'm not sure what the problem is here, but a few things struck me:

  1. You use a memory index, and flush writes the index into its store (which
    in this case is memory). I have never seen this use case before :) Flush is
    used to make sure that your index data is persistently written into a store
    (and won't be lost if the machine crashes). I am not sure what flush does
    on memory-based indices...

  2. All you need to do to make indexed data searchable is refresh (not
    flush). Although I see you request a refresh through the flush API as well,
    I would try only refresh first.
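
To illustrate point 2, here is a toy model of why a search can miss documents that were just indexed. It is not Elasticsearch code: the class and its methods are hypothetical stand-ins, and the real engine is far more involved. With the Java client, the refresh-only call would, as far as I know, look like client.admin().indices().refresh( Requests.refreshRequest( INDEX_NAME ) ).actionGet();

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT Elasticsearch code. It mimics how indexed documents
// stay invisible to searches until a refresh publishes them.
class RefreshSketch {
    private final List<String> buffered = new ArrayList<>();   // indexed, not yet searchable
    private final List<String> searchable = new ArrayList<>(); // visible to search()

    // Stands in for client.index( request ).actionGet()
    void index(String doc) {
        buffered.add(doc);
    }

    // Stands in for a refresh: makes all buffered documents searchable.
    void refresh() {
        searchable.addAll(buffered);
        buffered.clear();
    }

    // Stands in for client.search( req ).actionGet(): counts visible documents
    // that contain the given term.
    int search(String term) {
        int hits = 0;
        for (String doc : searchable) {
            if (doc.contains(term)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RefreshSketch index = new RefreshSketch();
        index.index("integration tests");
        System.out.println("hits before refresh: " + index.search("tests"));
        index.refresh();
        System.out.println("hits after refresh: " + index.search("tests"));
    }
}
```

In a test, the equivalent of refresh() would go between the index step and the query step.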

I did some basic ES test implementation some time ago here
https://github.com/lukas-vlcek/elasticsearch.demo/blob/master/src/test/java/org/elasticsearch/demo/BasicTest.java
Maybe it can help you.

If the above does not help, then I think you might want to share more code
(how you get the client and how you configure and start the nodes).

Regards,
Lukas

Hi Runar,

You should take a look at this class
https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/test/integration/AbstractNodesTests.java

It will give you some hints on how to correctly isolate the nodes and clients
you create.
This is important because you don't want a node to join your cluster without
you knowing it.
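
One way to get that isolation is to give every test run its own cluster name. A minimal sketch under assumptions: the class name is hypothetical, the plain map stands in for the settings builder, and cluster.name and node.local are the setting keys as I understand them for this version. It is not taken from AbstractNodesTests.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper: builds settings for a throwaway test node.
class IsolatedNodeSettings {

    // A unique cluster name per call keeps unrelated nodes on the network
    // from joining the test cluster, and the test node from joining theirs.
    static Map<String, String> build() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("cluster.name", "test-cluster-" + UUID.randomUUID());
        // Assumption: restricts discovery to nodes inside the same JVM.
        settings.put("node.local", "true");
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

The resulting pairs would then be copied into the settings used to start the node.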

Hope this helps,

--
Cordialement/Regards,

Louis GUEYE
linkedin http://fr.linkedin.com/in/louisgueye |
blog http://deepintojee.wordpress.com/ |
twitter http://twitter.com/#!/lgueye

As was suggested, there is no need to flush in order to refresh the index; a refresh alone will do. Can you try running your tests with the default store type or the ram type and see if it makes a difference?

Hi, thanks a lot. Setting "gateway.type" to "none" did the trick.
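
For anyone landing here later, a small sketch of adding that setting on top of whatever settings a test node already uses. The helper name is hypothetical and the map stands in for ImmutableSettings.Builder, whose put( key, value ) method should accept the same pair. The thread does not explain why this setting helped, so treat it as something to try rather than a guaranteed fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: returns a copy of the given settings with the
// gateway disabled, leaving the original map untouched.
class GatewayNoneSettings {

    static Map<String, String> withoutGateway(Map<String, String> base) {
        Map<String, String> copy = new LinkedHashMap<>(base);
        copy.put("gateway.type", "none");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("index.store.type", "memory");
        System.out.println(withoutGateway(base));
    }
}
```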

--
Regards,

Runar Myklebust