I spent the last few hours studying the ElasticSearch documentation, reading
posts in this group, and searching Stack Overflow and the Web in general.
I managed to get substring search working as follows:
created an empty index
defined index analysis settings with a custom analyzer that uses
an nGram filter
defined mappings for an index and type, specifying the above analyzer
for each property
posted some documents to that index and type
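For readers following along, here is a minimal sketch of what the settings and mappings for such a setup might look like, built in Python only so the JSON can be printed and checked. The names "my_ngram" and "my_type" and the min_gram/max_gram values are illustrative assumptions; "my_analyzer" and "title" are taken from the error message quoted further down. This is not the exact JSON used in the original post.

```python
import json

# Analysis settings: a custom analyzer that applies an nGram token filter.
# "my_ngram" and the gram sizes are hypothetical values chosen for illustration.
settings = {
    "analysis": {
        "filter": {
            "my_ngram": {"type": "nGram", "min_gram": 2, "max_gram": 10}
        },
        "analyzer": {
            "my_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "my_ngram"],
            }
        },
    }
}

# Mapping for one type ("my_type" is hypothetical) that points the
# "title" property at the custom analyzer defined above.
mapping = {
    "my_type": {
        "properties": {
            "title": {"type": "string", "analyzer": "my_analyzer"}
        }
    }
}

# Request body that creates the index with settings and mappings together.
create_index_body = json.dumps({"settings": settings, "mappings": mapping}, indent=2)
print(create_index_body)
```

The printed document could then be sent as the body of the index creation request, for example with curl.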
Everything works as expected, but I still have some doubts:
the elasticsearch documentation states that a mapping definition
is not needed in general
(http://www.elasticsearch.org/guide/reference/mapping/), but in my tests
substring search did not work without mappings. Is this expected, or is
there a mistake in my code, and should substring search work without
mappings?
is there a way to define mappings that apply to all the types in an
index?
I created the index and specified settings and mappings using CURL.
When using the Java API is it correct to use the exact same JSON below as
the input of ImmutableSettings.Builder.loadFromSource() ?
I split the settings and mappings into two JSON files, because I would
like the mappings to be applied to several index types whose names are not
known in advance.
So I create an index specifying only the settings, and then, the first
time a new type is used, I create a PutMappingRequest to pass the
mapping for that index type.
However, this approach fails: the analyzer is defined in the
settings, but when I try to provide mappings for the index type the parser
cannot resolve the analyzer:
org.elasticsearch.index.mapper.MapperParsingException: Analyzer [my_analyzer] not found for field [title]
    at org.elasticsearch.index.mapper.core.TypeParsers.parseField(TypeParsers.java:74) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.index.mapper.core.StringFieldMapper$TypeParser.parse(StringFieldMapper.java:116) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:261) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parse(ObjectMapper.java:217) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:161) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:271) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:317) ~[elasticsearch-0.19.8.jar:na]
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:211) ~[elasticsearch-0.19.8.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) ~[na:1.7.0_07]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) ~[na:1.7.0_07]
the elasticsearch documentation states that a mapping definition is not
needed in general (http://www.elasticsearch.org/guide/reference/mapping/),
but in my tests substring search did not work without mappings. Is this
expected, or is there a mistake in my code, and should substring search
work without mappings?
Elasticsearch will work without an explicit mapping, but only with the
default behavior. That same webpage states: "Only when the defaults
need to be overridden must a mapping definition be provided."
Searching by substring is definitely not the default.
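To illustrate why, here is a small self-contained sketch that only imitates the idea behind the two kinds of analysis; it does not call Elasticsearch. The function names, the sample title, and the gram sizes are assumptions made for the example. With whole-word tokens a substring has nothing to match, while n-gram tokens include it.

```python
def word_tokens(text):
    """Roughly what default analysis indexes: lowercased whole words."""
    return set(text.lower().split())


def ngram_tokens(text, min_gram=2, max_gram=10):
    """All character n-grams of each lowercased word, within the size bounds."""
    grams = set()
    for word in text.lower().split():
        for size in range(min_gram, max_gram + 1):
            for start in range(len(word) - size + 1):
                grams.add(word[start:start + size])
    return grams


title = "Elasticsearch Guide"
print("asti" in word_tokens(title))   # False: only whole words are indexed
print("asti" in ngram_tokens(title))  # True: "asti" is one of the 4-grams
```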
is there a way to define mappings that apply to all the types in an index?
Not sure, but take a look at index templates. Perhaps the type name
can be a wildcard.
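As a rough, unverified sketch of that idea: an index template body could carry both the analysis settings and a mapping under the special "_default_" type, which is meant to act as the base mapping for new types. This assumes that "_default_" mappings and index templates behave this way in the version in use; it has not been tested against 0.19.8. The pattern "myindex*" and the filter name "my_ngram" are hypothetical.

```python
import json

# Hypothetical index template: applies to any index whose name matches the
# pattern, and supplies a "_default_" mapping intended as the base for every
# new type created in such an index.
template_body = {
    "template": "myindex*",
    "settings": {
        "analysis": {
            "filter": {
                "my_ngram": {"type": "nGram", "min_gram": 2, "max_gram": 10}
            },
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_ngram"],
                }
            },
        }
    },
    "mappings": {
        "_default_": {
            "properties": {
                "title": {"type": "string", "analyzer": "my_analyzer"}
            }
        }
    },
}

template_json = json.dumps(template_body, indent=2)
print(template_json)
```

Keeping the analyzer definition and the mapping in the same template means the analyzer name is defined alongside the mapping that refers to it.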
Queries with no field specified will use the _all field, which uses
the default analyzer.
I created the index and specified settings and mappings using CURL. When
using the Java API is it correct to use the exact same JSON below as the
input of ImmutableSettings.Builder.loadFromSource() ?
Thank you very much for your helpful reply; it clarified some of my doubts.
I did not try your suggestion of using templates because I managed to
create the mapping on demand using the Java API and a template JSON
document.
In fact, with respect to my last question about JSON, I can confirm that it
is possible to set the type mapping or the index settings using a JSON
document via the Java API. However, due to the lack of good examples, I
found it a bit difficult to create correct JSON documents. So after a couple
of unsuccessful tries, I decided to first create an index and a type with
default settings and an auto-generated mapping. Then I used the _settings and
_mapping REST APIs to query the configuration, and the response
messages helped me figure out how to write a correct JSON document for the
index settings and a second JSON document for the type mapping.
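For anyone who wants to repeat that inspection step, here is a minimal sketch. It assumes a node reachable at http://localhost:9200 and uses the hypothetical names "myindex" and "my_type"; the helper functions are illustrative and not part of any Elasticsearch client.

```python
import json
from urllib.request import urlopen


def settings_url(base_url, index):
    """URL of the _settings endpoint for an index."""
    return "%s/%s/_settings" % (base_url.rstrip("/"), index)


def mapping_url(base_url, index, doc_type):
    """URL of the _mapping endpoint for one type of an index."""
    return "%s/%s/%s/_mapping" % (base_url.rstrip("/"), index, doc_type)


def fetch_json(url):
    """GET a URL and decode the JSON response (needs a running node)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    base = "http://localhost:9200"  # assumed local node
    print(settings_url(base, "myindex"))
    print(mapping_url(base, "myindex", "my_type"))
    # With a node running, the current configuration could be printed with:
    # print(json.dumps(fetch_json(settings_url(base, "myindex")), indent=2))
```

The returned documents can then serve as a starting point for hand-written settings and mapping files.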