Correct way to configure index settings from Java client

Hi guys.

I run a single-node ES application with an embedded node (i.e. a 1-node cluster;
the node is a data node, and the client is obtained as
NodeBuilder.nodeBuilder().build().client()).

I have an elasticsearch.yml file in my resources, where refresh_interval is
set to "5s".
I also have an index_settings.json file and some xxx_mapping.json files in my
resources, applied via

public void ensureIndexWithMappings(String... mappings) {
    try {
        final AdminClient admin = esClient.admin();

        // create the index only if it does not exist yet
        if (!admin.indices().prepareExists(indexName).execute().actionGet().isExists()) {
            CreateIndexRequestBuilder indexBuilder = admin.indices().prepareCreate("index").setSettings(
                    ImmutableSettings.settingsBuilder().loadFromClasspath("index_settings.json"));

            for (String mapping : mappings) {
                // "\\A" matches the start of input, so next() returns the whole file
                indexBuilder.addMapping(mapping,
                        new Scanner(new ClassPathResource(String.format(MAPPING_FILENAME_PATTERN, mapping))
                                .getInputStream(), "UTF-8").useDelimiter("\\A").next());
            }

            indexBuilder.execute().actionGet();
        }

        // wait until we're done
        admin.cluster().prepareHealth().setWaitForGreenStatus().execute().actionGet();
    } catch (ElasticSearchException ese) {
        LOGGER.error("Failed to initialize index {}", indexName, ese);
        throw ese;
    } catch (IOException ioe) {
        LOGGER.error("Failed to read mapping definitions for index {}. Definitions are: {}", indexName, mappings, ioe);
        throw new RuntimeException(ioe);
    }
}
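
The Scanner call in the loop above reads each mapping file in one go. The delimiter is the regex anchor for "start of input", which in Java source has to be written as "\\A"; it never matches inside the content, so next() returns everything up to the end of the stream. A minimal stand-alone sketch of that idiom, using an in-memory stream instead of a classpath resource (the class and method names are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class StreamSlurper {

    // Reads the whole stream as a single UTF-8 string.
    static String readAll(InputStream in) {
        try (Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // An empty stream has no token at all, so guard against
            // NoSuchElementException instead of calling next() blindly.
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        String json = "{\n  \"properties\" : {}\n}\n";
        InputStream in = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```

The hasNext() guard matters for empty mapping files: the one-liner in the method above would throw NoSuchElementException on an empty resource.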

elasticsearch.yml contains

# A time setting controlling how often the refresh operation will be executed.
# Defaults to 1s. Can be set to -1 in order to disable it.
index.refresh_interval: 5s

index_settings.json contains the same refresh_interval property, just as a
test.

{
    "index" : {
        "refresh_interval" : "6s",
        "analysis" : {
            "analyzer" : {
                "ngram_analyzer" : {
                    "type" : "custom",
                    "tokenizer" : "standard",
                    "filter" : ["lowercase", "stop", "substrings"]
                }
            },
            "filter" : {
                "substrings" : {
                    "type" : "edgeNGram",
                    "min_gram" : 2,
                    "max_gram" : 10
                }
            }
        }
    }
}
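
For comparison with the .yml entry: the nested JSON above names the same setting as the flat key index.refresh_interval, because nested objects are flattened into dotted keys. A minimal sketch of that flattening with plain Java maps, not the Elasticsearch settings loader itself (the class and method names are made up for illustration, and arrays such as the "filter" list are left out):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SettingsFlattener {

    // Turns nested maps into a flat map whose keys are joined with dots.
    static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto("", nested, flat);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(String prefix, Map<String, Object> node, Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String key = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map) {
                // Descend into nested objects, carrying the dotted prefix along.
                flattenInto(key, (Map<String, Object>) entry.getValue(), out);
            } else {
                out.put(key, entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> substrings = new LinkedHashMap<>();
        substrings.put("type", "edgeNGram");
        substrings.put("min_gram", 2);
        substrings.put("max_gram", 10);

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("substrings", substrings);

        Map<String, Object> analysis = new LinkedHashMap<>();
        analysis.put("filter", filter);

        Map<String, Object> index = new LinkedHashMap<>();
        index.put("refresh_interval", "6s");
        index.put("analysis", analysis);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("index", index);

        // One line per flat key, e.g. index.refresh_interval = 6s
        for (Map.Entry<String, Object> entry : flatten(root).entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```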

My question is quite simple: what is the correct (preferred) way to push
settings, via the global .yml file or via
admin.indices().prepareCreate("index").setSettings(...)?
In my case, the analysis section is applied correctly, but the refresh_interval
of "6s" is never read from this JSON, even if I comment out the "5s" setting in
the .yml file.

-Max

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I prefer doing this using the API.
That way, you don't need to have your index_settings file on each node.

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs


In general, there are three different options:

  • config file for cluster-wide and indices-wide options
  • settings/mappings for index/type creation (cluster settings are
    silently ignored)
  • updateable cluster settings via the cluster update API, for temporary use

Note that you tried to set refresh_interval through the index
settings/mappings, but it is a cluster update API setting.

I think the point is there is no feedback about
unused/ignored/unrecognized options - I'm confident this will be
improved in the future.
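
Until such feedback exists, one workaround is to compare the settings that were sent with the settings that are actually in effect afterwards. A minimal sketch of that comparison on flat key/value maps; it does not call Elasticsearch, and how the effective settings are read back is left to the application (the class name, method name, and sample values are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

final class SettingsAudit {

    // Returns, in sorted order, every requested key whose value is missing
    // from, or different in, the effective settings.
    static Set<String> notApplied(Map<String, String> requested, Map<String, String> effective) {
        Set<String> ignored = new TreeSet<>();
        for (Map.Entry<String, String> entry : requested.entrySet()) {
            if (!Objects.equals(entry.getValue(), effective.get(entry.getKey()))) {
                ignored.add(entry.getKey());
            }
        }
        return ignored;
    }

    public static void main(String[] args) {
        Map<String, String> requested = new LinkedHashMap<>();
        requested.put("index.refresh_interval", "6s");
        requested.put("index.analysis.filter.substrings.type", "edgeNGram");

        // Hypothetical read-back: the analysis entry arrived, the interval did not.
        Map<String, String> effective = new LinkedHashMap<>();
        effective.put("index.analysis.filter.substrings.type", "edgeNGram");

        // prints "Not applied: [index.refresh_interval]"
        System.out.println("Not applied: " + notApplied(requested, effective));
    }
}
```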

Jörg

On 04.03.13 at 17:24, Max Alexejev wrote:

My question is quite simple - what is the correct (preferred) way to
push settings - via global .yml file or via
admin.indices().prepareCreate("index").setSettings(...)?
In my case, analysis section is correctly applied but refresh_interval
"6s" is never read from this JSON - even if I comment out "5s" setting
in .yml file.

Got it. Thank you guys.

-Max

On Tuesday, 5 March 2013 at 0:11:30 UTC+4, Jörg Prante wrote:
