Multivalued field

Hi,

How does Elasticsearch handle a multivalued field, such as an authors field?
The content is initially a comma-separated list, which I would like to
split into individual values and then use in a facet.

--

It's an array type:

(In reply to Rogerio Pereira's message of Monday, November 5, 2012, 20:23:14 UTC-2.)
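
To make the array suggestion concrete, here is a minimal sketch in Python of preparing such a document on the client side. It assumes the raw value is a plain comma-separated string; the helper name `split_authors`, the field name `authors`, and the sample title are hypothetical. Elasticsearch accepts a JSON array as the value of a field, so each author becomes a separate value.

```python
import json


def split_authors(raw):
    """Split a comma-separated string into a list of trimmed, non-empty values."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Hypothetical document: the field names and values are assumptions for illustration.
doc = {"title": "Some book", "authors": split_authors("Author 1, Author 2")}

# JSON body that would be sent when indexing; "authors" is serialized as a JSON array.
body = json.dumps(doc)
print(body)
```

For the facet to return whole names rather than individual words, the field would probably also need to be indexed without word-level analysis.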

--

I'm not sure I understood your concern.

One option is to use the _analyze API to see how ES breaks your field into tokens.

Does that help?
If not, could you elaborate a bit more?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
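
As a rough illustration of the _analyze suggestion, the sketch below builds (but does not send) a request to that endpoint using only the Python standard library. The host `http://localhost:9200`, the helper name `build_analyze_request`, and the choice of the `standard` tokenizer are assumptions. The JSON body with `tokenizer` and `text` keys matches recent Elasticsearch releases as far as I know; older releases took these as query-string parameters, so check the documentation for your version.

```python
import json
import urllib.request


def build_analyze_request(text, tokenizer="standard", host="http://localhost:9200"):
    """Build (without sending) a POST request for the _analyze endpoint."""
    payload = json.dumps({"tokenizer": tokenizer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=host + "/_analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_analyze_request("Author 1, Author 2")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running node (assumed to be reachable at the host above):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```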

(In reply to Rogerio Pereira's message of 5 Nov 2012, 23:41: "It's an array type.")

--

I think I know what he wants. If you can get the frequency (not just
presence) of the words within the document and you can get the total
frequency of words within the corpus, then you can use that to get the
weight of each word and use that to do a statistical comparison of two
documents to see if they match. It's very fast and can be used for things
like near-deduplication. You can even do things like set up a medical
corpus and a legal corpus and figure out reliably where a new document
belongs.

...Ken
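
The idea described above can be sketched in a few lines of Python. This is only one possible reading: the post does not name a weighting scheme, so the sketch swaps in TF-IDF weights (using document frequency rather than total corpus frequency) and cosine similarity as the statistical comparison. All function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer; a real setup would use a proper analyzer."""
    return text.lower().split()


def idf_table(corpus_tokens):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter(term for tokens in corpus_tokens for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def weight(tokens, idf):
    """Weight each term by its in-document frequency times its corpus rarity."""
    counts = Counter(tokens)
    return {term: count * idf.get(term, 0.0) for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample documents: two medical, one legal.
docs = [
    "the patient was given a dose of aspirin",
    "the patient was given a dose of ibuprofen",
    "the court ruled that the contract was void",
]
tokens = [tokenize(d) for d in docs]
idf = idf_table(tokens)
vectors = [weight(t, idf) for t in tokens]

print("medical vs medical:", cosine(vectors[0], vectors[1]))
print("medical vs legal:  ", cosine(vectors[0], vectors[2]))
```

A score close to 1 suggests near-duplicates; comparing a new document against a typical vector for each corpus is one way to decide where it belongs.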

(In reply to David Pilato's message of Mon, Nov 5, 2012 at 9:38 PM.)

--

I think you've pointed me in the right direction, David. I just need to split
something like "Author 1, Author 2" into several terms to use in my facet.

As far as I can see, the pattern tokenizer can help me do that, instead of
setting an array on my author field.
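
A minimal sketch of the pattern tokenizer approach follows, in Python. It assumes the separator is a comma followed by optional whitespace; the names `comma_tokenizer` and `comma_analyzer` are hypothetical. The `re.split` call is only a local approximation of what the tokenizer would emit, and the settings dictionary shows how such a tokenizer could be declared when creating the index.

```python
import json
import re

# Assumed separator: a comma followed by optional whitespace.
SEPARATOR = r",\s*"


def emulate_pattern_tokenizer(text, pattern=SEPARATOR):
    """Local approximation: split on the pattern and drop empty tokens."""
    return [token for token in re.split(pattern, text) if token]


# Index settings sketch; the tokenizer and analyzer names are made up for illustration.
settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "comma_tokenizer": {"type": "pattern", "pattern": SEPARATOR}
            },
            "analyzer": {
                "comma_analyzer": {"type": "custom", "tokenizer": "comma_tokenizer"}
            },
        }
    }
}

print(emulate_pattern_tokenizer("Author 1, Author 2"))
print(json.dumps(settings, indent=2))
```

The author field's mapping would then reference the analyzer by name. The mapping syntax differs between Elasticsearch versions, so it is left out here.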

(In reply to David Pilato's message of Tuesday, November 6, 2012, 01:39:11 UTC-2.)

--