ghoumard
(Gildas Houmard)
April 26, 2013, 3:23pm
1
Hi,
We have a document like this in ES:
{ "ID":"1111", "Text":"This my text", "Concept":["concept1","concept2","concept3"] }
We are trying to detect duplicates based on the field "concept", with a
rule like: if at least 85% of the concepts are the same, then the document
is a duplicate.
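To make the rule concrete, here is a minimal Python sketch of the 85% check done outside of ES. The function names are hypothetical, and it assumes the percentage is measured against the larger of the two concept lists; the rule as stated does not say which list is the reference.

```python
def concept_overlap(a, b):
    """Fraction of shared concepts, relative to the larger of the two sets.

    Assumption: the overlap is measured against the larger set.
    """
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / max(len(set_a), len(set_b))


def is_duplicate(a, b, threshold=0.85):
    """True when the overlap reaches the threshold (85% by default)."""
    return concept_overlap(a, b) >= threshold
```

With the document above, a second document carrying the same three concepts counts as a duplicate, while one sharing only two of the three does not.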
But the following query does not give good results:
$ curl -XGET 'http://localhost:9200/twitter/tweet/1/_mlt?mlt_fields=concept&min_doc_freq=1'
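For anyone scripting the same request, the URL can be assembled from its parts. This is only a sketch: `build_mlt_url` is a hypothetical helper name, it builds the string without contacting the cluster, and the parameter values are the ones from the curl command above.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, index, doc_type, doc_id, mlt_fields, min_doc_freq):
    """Build the _mlt URL for one document (string only, no HTTP call)."""
    query = urlencode({"mlt_fields": mlt_fields, "min_doc_freq": min_doc_freq})
    return f"{base_url}/{index}/{doc_type}/{doc_id}/_mlt?{query}"


# Same values as in the curl command above.
url = build_mlt_url("http://localhost:9200", "twitter", "tweet", "1", "concept", 1)
```

Note that `urlencode` percent-encodes commas, so a comma-separated list of fields would come out escaped in the query string.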
We had to change the JSON format (a string instead of an array) for MLT to
work as expected:
{ "ID":"1111", "Text":"This my text", "Concept":"concept1 concept2 concept3" }
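The conversion itself is simple. A minimal Python sketch, where `flatten_concepts` is a hypothetical helper name and it is assumed that no concept contains whitespace (otherwise the joined string would split differently):

```python
def flatten_concepts(doc, field="Concept", sep=" "):
    """Return a copy of doc with the list in `field` joined into one string.

    Assumption: individual concepts contain no whitespace.
    """
    flat = dict(doc)  # shallow copy, the input document is left untouched
    value = flat.get(field)
    if isinstance(value, list):
        flat[field] = sep.join(value)
    return flat


doc = {"ID": "1111", "Text": "This my text",
       "Concept": ["concept1", "concept2", "concept3"]}
flat_doc = flatten_concepts(doc)
```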
Are we missing something? Is this a bug or a design limitation?
Thanks,
Gildas
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
For more options, visit https://groups.google.com/groups/opt_out .
ghoumard
(Gildas Houmard)
April 29, 2013, 12:57pm
2
Up,
We would really prefer to store an array rather than a string.
Does anyone have an idea?
Gildas
On 29/04/13 13:57, Gildas Houmard wrote:
> Up,
> We really like the fact of storing array and not string.
> Any idea someone ?
What does your mapping look like? I've seen this happen when the field type
is inferred. I had to manually set the mapping of that type to be an
array type, and then all was well.
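The exact mapping is not given in this thread, so the following is only a sketch of what an explicit mapping body could look like, built in Python. It assumes the `tweet` type and `Concept` field from the original post, and it assumes the field is declared with the core type of its elements (a `string`, here marked `not_analyzed`); those settings are illustrative choices, not a confirmed fix. `build_concept_mapping` is a hypothetical helper name.

```python
import json


def build_concept_mapping(doc_type="tweet", field="Concept"):
    """Build an explicit mapping body for the concept field.

    Assumption: the field is mapped as a not_analyzed string, so each
    array element would be indexed as a single term.
    """
    return {
        doc_type: {
            "properties": {
                field: {"type": "string", "index": "not_analyzed"}
            }
        }
    }


# JSON body that could be sent when setting the mapping on the index.
mapping_body = json.dumps(build_concept_mapping())
```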
--
Cheers,
James Harrison