What's the advantage of setting up an index manually vs. letting the client handle it?

If I use the transport client for Elasticsearch and index a document into a non-existent index, it will automatically create the new index and infer the mapping from the first document. Please correct me if I got that wrong.

So my question is: do we still need to set up an index manually using a PUT request (-XPUT)? Is there any advantage to doing so?

It is definitely better to control index settings and mappings yourself.

That said, an index template can do that for you any time you index a new document into a non-existent index.
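
To make the index template idea concrete, here is a minimal sketch of a template body built in Python. It assumes the composable template format (PUT _index_template/<name>, available from Elasticsearch 7.8); older versions use PUT _template/<name> with a different body layout. The "logs-*" pattern and the "day" field are hypothetical names chosen only for illustration, and the snippet only builds and prints the JSON; it does not contact a cluster.

```python
import json

# Hypothetical example: the "logs-*" pattern and the "day" field are
# illustrative, not taken from any real cluster.
# Assumption: composable index template format (Elasticsearch 7.8+).
index_template = {
    "index_patterns": ["logs-*"],  # applies to any new index matching this pattern
    "template": {
        "settings": {"number_of_shards": 1},
        "mappings": {
            "properties": {
                "day": {"type": "integer"},
            }
        },
    },
}

# Serialize the body that would be sent with the PUT request.
template_json = json.dumps(index_template, indent=2)
print(template_json)
```

With a template like this in place, the first document indexed into a matching new index would get these settings and mappings instead of purely inferred ones.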

But is it true that the mapping is what really matters when opting for manual setup? In other words, just creating an index doesn't help at all, since a dynamic template can easily figure out the index name, whereas the mapping is more complicated. If the first document is malformed, the resulting mapping could be problematic.

And would it cause any problems to index a document that doesn't match the preset mapping? That is, is controlling index settings and mappings more reliable, and does it prevent unmapped documents?

Index settings are also important, IMO. You may not want 5 shards per index if 1 is enough.
There are also custom analyzers...

But yes mapping matters a lot.
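
As a rough sketch of what controlling settings and mappings looks like, the body below could be sent when creating an index explicitly with a PUT request. It assumes the typeless mapping format of Elasticsearch 7 and later; the analyzer name "folded_text" and the fields "title" and "day" are hypothetical. The code only builds and prints the JSON body.

```python
import json

# Hypothetical create-index body; names are illustrative only.
# Assumption: typeless mappings (Elasticsearch 7+).
create_index_body = {
    "settings": {
        "number_of_shards": 1,  # one shard instead of the older default of 5
        "analysis": {
            "analyzer": {
                # Custom analyzer: standard tokenizer, then lowercase and ASCII folding.
                "folded_text": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        },
    },
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "folded_text"},
            "day": {"type": "integer"},
        }
    },
}

body_json = json.dumps(create_index_body, indent=2)
print(body_json)
```

Neither the shard count nor the custom analyzer can be inferred from a first document, which is one reason to define them up front.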

Indexing a doc that has fields other than the ones defined in the mapping will trigger a mapping update by default.


https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html#_updating_existing_mappings

But the official documentation says:

Other than where documented, existing type and field mappings cannot be updated. Changing the mapping would mean invalidating already indexed documents. Instead, you should create a new index with the correct mappings and reindex your data into that index.

So are you saying that if a document has more fields than the mapping, the mapping would be updated by adding the new fields? But the existing documents don't have those fields. Would that be a problem?

Also, what if a document has a field that doesn't match the type defined in the mapping? For instance, the mapping defines a field called "day" as a number, but I index a document in which "day" is a string.

As long as all fields that are defined have the same type, it does not matter if not all fields are present in all documents.

Indexing a "day" value whose type doesn't match the mapping would result in a mapping conflict that would prevent the new document from being indexed.
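
The two rules above (new fields extend the mapping, mismatched types are rejected) can be illustrated with a toy model. This is not Elasticsearch code or its API; the function names and the simplified type names are made up for the example. Real Elasticsearch is also more lenient in some cases: by default it may coerce values such as the numeric string "5" into a numeric field.

```python
# Toy model only: this is NOT how Elasticsearch is implemented, just an
# illustration of the behaviour described in the replies above.

def infer_type(value):
    """Map a Python value to a simplified field type name."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def index_document(mapping, doc):
    """Return the mapping after indexing doc, or raise ValueError on a type conflict."""
    updated = dict(mapping)
    for field, value in doc.items():
        value_type = infer_type(value)
        if field not in updated:
            # Unknown field: the mapping is extended (a "mapping update").
            updated[field] = value_type
        elif updated[field] != value_type:
            # Known field with a different type: the document is rejected.
            raise ValueError(
                f"mapping conflict on field '{field}': "
                f"mapped as {updated[field]}, got {value_type}"
            )
    return updated


mapping = {"day": "number"}

# A document with an extra field extends the mapping.
mapping = index_document(mapping, {"day": 3, "note": "ok"})
print(mapping)

# A document that omits "note" is fine; missing fields are not a conflict.
mapping = index_document(mapping, {"day": 4})

# A string in the numeric "day" field is rejected in this toy model.
try:
    index_document(mapping, {"day": "Monday"})
except ValueError as err:
    print(err)
```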

