Hi Shay,
It is worth spending some time discussing mapping. Far too many "on top
of Lucene" projects map Lucene analyzers 1:1; it looks to me like ES
goes further in that sense.
Another example:
"conditional analysis", mapping M to N fields:
Map(STREET, HNO)->{street_name, hno}
Example:
STREET = "Sleepy street 21"
HNO = "null"
should be mapped to
street_name = "sleepy street"
hno = "21"
OR
STREET = "22nd street"
HNO = "10"
should be mapped to
street_name = "22nd street"
hno = "10"
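To make the rule concrete, here is a minimal sketch in Python of the two cases above. The function name map_address and the regular expression are my own assumptions for illustration, not anything ES provides; real address parsing would need far more care.

```python
import re

# Assumption: a house number is a trailing run of digits, optionally
# followed by a single letter (e.g. "21" or "21a").
_TRAILING_HNO = re.compile(r"^(?P<street>.*\S)\s+(?P<hno>\d+[a-zA-Z]?)$")


def map_address(street, hno):
    """Map (STREET, HNO) to {street_name, hno} (hypothetical helper)."""
    street = street.strip()
    # Treat None, "" and the literal string "null" as "no house number given".
    if hno is None or hno.strip().lower() in ("", "null"):
        match = _TRAILING_HNO.match(street)
        if match:
            return {"street_name": match.group("street").lower(),
                    "hno": match.group("hno")}
        return {"street_name": street.lower(), "hno": None}
    # HNO was supplied explicitly: leave the street text alone.
    return {"street_name": street.lower(), "hno": hno.strip()}


print(map_address("Sleepy street 21", "null"))
print(map_address("22nd street", "10"))
```

Note that the house number is only pulled out of STREET when HNO is empty, which is why "22nd street" keeps its leading number.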
If we were to support something like this, we would have to stretch ES
config capabilities quite a lot; maybe some "request mutators" would
be a better fit.
Something like a Transform(Document) -> NEW_Document extensibility
plug-in, so we are not pushing ES to provide all possible
transformations; we just make it possible for users to "do whatever
they want" (something they could do in client code anyway).
The problem with this is that NEW_Document must conform to minimum
standards (e.g. have the same ID as the original Document, so as not to
screw up routing...)
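A rough sketch of what such an extension point could look like, again in Python. Everything here is hypothetical: apply_transformer, the "_id" key and the dict-based document are assumptions for illustration, not an existing ES API. The point is only that the hook runs user code and then checks the minimum contract.

```python
def apply_transformer(doc, transformer, id_field="_id"):
    """Run a user-supplied transformer and enforce the minimum contract.

    Hypothetical extension point: `doc` is a plain dict and `id_field`
    names the key holding the document ID (both are assumptions).
    """
    original_id = doc.get(id_field)
    # Shallow copy, so the caller's top-level dict is not mutated.
    new_doc = transformer(dict(doc))
    if not isinstance(new_doc, dict):
        raise TypeError("transformer must return a dict")
    if new_doc.get(id_field) != original_id:
        # A changed ID could send the document to the wrong shard.
        raise ValueError("transformer must keep the original document ID")
    return new_doc


def lowercase_title(doc):
    # Example user transformer: adds a derived field, leaves the ID alone.
    doc["title_lc"] = doc.get("title", "").lower()
    return doc


print(apply_transformer({"_id": "1", "title": "Hello"}, lowercase_title))
```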
Thanks to the great dynamic mapping in ES, I think it is not a problem
to keep it relatively simple. At the end of the day, I think of it as
opening up the possibility to "push client code to ES", with all its
gotchas (the user should not be surprised when original fields get
transformed by a "transformer")....
What I am trying to say is: instead of stretching ES mapping features
for this, maybe it would be better to provide "extension points" for
users to reformulate documents with their own "transformers".
Benefit:
My example with A1, A2 and FULL, and anything similar, would then be my
own concern. I could indeed write a transformer that maps
Transform({A1, A2}) -> {A1, A2, FULL, NGramFULL, whatever} as an
intercepted operation...
And it would be my responsibility to know which fields I can use to
search.
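As an illustration, such a transformer might look like the Python sketch below. I am assuming here that FULL is A1 and A2 joined by a space and that NGramFULL holds lowercase character trigrams of FULL; both choices, and the function names, are made up for the example.

```python
def char_ngrams(text, n=3):
    """Return all character n-grams of `text` (empty list if too short)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def add_full_fields(doc, n=3):
    """Transform({A1, A2}) -> {A1, A2, FULL, NGramFULL} (hypothetical)."""
    out = dict(doc)  # keep the original fields untouched
    # Assumption: FULL is the non-empty parts joined by a single space.
    full = " ".join(v for v in (doc.get("A1"), doc.get("A2")) if v)
    out["FULL"] = full
    out["NGramFULL"] = char_ngrams(full.lower(), n)
    return out


print(add_full_fields({"A1": "ab", "A2": "cd"}))
```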
Sometimes these operations are quite complicated. Imagine extraction
from unstructured documents, where I need to index more than one index
document, like:
TransformBulkText("some longish text") -> should get transformed into
two or more documents (named entities, e.g.)
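A toy sketch of the one-to-many case, in Python. The "entity extraction" is only a stand-in (runs of capitalized words), not a real named-entity recognizer, and the function name, field names and derived ID scheme are all assumptions for illustration.

```python
import re

# Stand-in for real named-entity recognition: runs of capitalized words.
_CAP_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def transform_bulk_text(text, parent_id):
    """Turn one text into a source document plus one document per entity."""
    docs = [{"_id": parent_id, "type": "source", "text": text}]
    for i, match in enumerate(_CAP_RUN.finditer(text)):
        docs.append({
            "_id": f"{parent_id}-e{i}",  # assumed scheme for derived IDs
            "type": "entity",
            "parent": parent_id,
            "name": match.group(0),
            "offset": match.start(),
        })
    return docs


for d in transform_bulk_text("we met Shay Banon in Berlin", "d1"):
    print(d)
```

The first document keeps the original ID; how the derived documents should be identified and routed is exactly the kind of contract such an extension point would have to spell out.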
I do not think it is too hard to make such "API extensions" and make
ES even more usable... but take it with a grain of salt, I still have
to dig into ES. I am just analyzing whether it could help with problems
I have already run into.
Thanks,
Eks
On 22 Mar., 12:09, Shay Banon shay.ba...@elasticsearch.com wrote:
The index_name is not meant to allow concatenating several fields. It just controls the name of the field created (whether or not it will be prefixed with the path to the object). And, most of the time, you don't have to change it.
Having the ability to create a field that is an aggregation of several other fields is valid, I agree (aside from _all). You can open an issue for that, though it's a bit tricky to implement (as always).