Hello,
I tried to collapse on a multivalued field and it doesn't work,
e.g.:
"field": [id1, id2, id3]
Is there a workaround for this issue?
Any suggestions, please!
Hi @arcturuscom
Just to help us understand.
Can you show the input and the result you want (what do you want it to look like)?
What did you try that failed?
How did it fail, and what was the error?
This is what I get:
{
"shard" : 0,
"index" : "whitespace-autres-1",
"node" : "-pvtzg-jSm2tXGNwadcRlw",
"reason" : {
"type" : "illegal_state_exception",
"reason" : "failed to collapse 120067, the collapse field must be single valued"
}
},
My query is simple:
POST legan_test/_search?explain
{
"from": 0,
"size": 10,
"collapse": {
"field": "combiningIds.keyword"
}
}
The combiningIds mapping is:
"combiningIds" : {
"type" : "text",
"norms" : false,
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
This field is multivalued:
"combiningIds" : [
"1234567",
"09747",
"119999888"
]
We process some RDF files. Each document contains a list of combiningIds; we retrieve them and index them in Elasticsearch.
We consider that 2 documents should be collapsed if they share at least one element.
OK, now I understand... exactly as your title stated.
Unfortunately, as I understand it, and as the documentation here states:
The field used for collapsing must be a single valued keyword or numeric field with doc_values activated
Collapse does not work with arrays of keyword values. It cannot choose a value out of the array.
That is exactly the error you are getting, which makes sense because your field combiningIds.keyword
is an array.
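For comparison, collapse does accept a single-valued keyword field, and it can return the collapsed documents through inner_hits. Below is a minimal Python sketch of such a request body. The field name `groupId` and the inner hits name `related` are hypothetical (nothing in this thread defines them), and the resulting dict would still have to be sent to the `_search` endpoint with whatever client you use.

```python
import json


def build_collapse_request(field, size=10, inner_hits_size=5):
    """Build a _search body that collapses on a single-valued field.

    `field` must be a single-valued keyword or numeric field with
    doc_values; a multivalued field triggers the error shown above.
    """
    return {
        "from": 0,
        "size": size,
        "collapse": {
            "field": field,
            # "related" is a hypothetical name for the inner hits section.
            "inner_hits": {"name": "related", "size": inner_hits_size},
        },
    }


# "groupId" is a hypothetical single-valued field, not part of the mapping above.
body = build_collapse_request("groupId")
print(json.dumps(body, indent=2))
```

The open question in this thread is how to derive such a single-valued field from the multivalued combiningIds.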
Thanks @stephenb
Do you have any suggestion or workaround?
I'm thinking about putting each element of the array in a separate document, but I don't think that's a good idea in terms of disk space.
Any hints, please?
Can you provide 5 or 6 sample docs and the desired results? Perhaps we can take a look...
The array might work with a different approach, and yes, de-normalization can be an option, but there could be many docs.
How many documents do you have?
I am more interested in the result you are looking for. Even if collapse "worked", I am unclear what the expected result would be.
de-normalization can be an option but there could be many docs.
This is true. We have millions of documents.
As I mentioned, my main use case is to collapse documents that are related to each other instead of listing them all flat in the search results. We want to leave room for other relevant documents.
For example, a document could have 5 versions, and I want to display only the latest version (I know how to do this) and collapse the older versions instead of showing them in the search results. We want to avoid displaying 5 different versions of the same document, because that would hurt relevancy and take room from other relevant documents.
Let's consider in this example that the combiningIds field is like a versions field that contains something like
versions: [1,2,3,4,5]
Apologies but I do not have a solution for you ... I cannot tell what you want the output to be.
Perhaps someone else will.
I think that without some simplified sample docs and the output you want, it will be difficult for anyone to suggest a solution.
We consider a document to be related to another if they share at least one combiningId.
The screenshot above visualizes the relationships between the documents.
Let's say that I want to collapse documents id:2 and id:3 with id:1, or document 2 with 4.
Could you accomplish what you want with the Elasticsearch Graph API?
Thanks @Jason_Slater, it sounds interesting.
If you have some data, such as this example:
{"some_id":123,"some_field":["id1","id2","id3"]}
{"some_id":456,"some_field":["id2","id3","id4"]}
{"some_id":789,"some_field":["id3","id4","id5"]}
where you want to find connections by the values in "some_field"
I ingested this sample as NLJSON into "some-index" and created an index pattern for it.
In Graph, if I query for some_id: 123, the connections are apparent:
For reference, my settings are here:
Clicking on "Inspect" will reveal the API query and response behind the graph.
Hope this helps, or at least provides some food for thought.
@Jason_Slater very interesting. I'm curious to know if it's possible to display the response in this format:
hits : [
{
some_id: 123,
inner_hits: [
{some_id: 456}, {some_id: 789}
]
}
]
I want to display something similar to this:
PS: documents 456 and 789 shouldn't appear in the other search cards, as they are already collapsed into the first.
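One possible way to get this shape, sketched here in Python outside Elasticsearch, is to compute a single group id per document before indexing, using a union-find over the shared values. This is only an illustration on the three sample documents above, not something Elasticsearch does for you. The function names are hypothetical, and the grouping is transitive: if A shares a value with B and B shares one with C, all three land in one group even when A and C share nothing, which may or may not be what you want.

```python
from collections import defaultdict

# Sample documents from earlier in this thread.
DOCS = [
    {"some_id": 123, "some_field": ["id1", "id2", "id3"]},
    {"some_id": 456, "some_field": ["id2", "id3", "id4"]},
    {"some_id": 789, "some_field": ["id3", "id4", "id5"]},
]


def assign_group_ids(docs, id_field="some_id", values_field="some_field"):
    """Map each document id to a group id (hypothetical helper).

    Documents that share a value, directly or through a chain of other
    documents, receive the same group id.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller value as root so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for doc in docs:
        values = doc[values_field]
        if values:
            find(values[0])  # register documents that have a single value
        for value in values[1:]:
            union(values[0], value)

    return {
        doc[id_field]: find(doc[values_field][0])
        for doc in docs
        if doc[values_field]
    }


def group_hits(docs, group_ids, id_field="some_id"):
    """Fold documents into one card per group, first document on top."""
    groups = defaultdict(list)
    for doc in docs:
        groups[group_ids[doc[id_field]]].append(doc[id_field])
    return [
        {"some_id": ids[0], "inner_hits": [{"some_id": i} for i in ids[1:]]}
        for ids in groups.values()
    ]


group_ids = assign_group_ids(DOCS)
cards = group_hits(DOCS, group_ids)
print(cards)
```

If the group id were stored on each document as a single-valued keyword field, collapse could be applied to that field directly. The cost is that group ids have to be recomputed whenever a new document links two existing groups, which is not trivial with millions of documents.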
Just a random thought: what if you create a hash based on the multi-valued field and then apply collapse on that hashed field?
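A minimal Python sketch of that hashing idea, assuming the hash is computed at index time and stored in a separate single-valued keyword field (the function name and the choice of SHA-1 are assumptions, not anything from this thread). Note the limitation: two documents get the same hash only when their value sets are identical, so this does not cover the "at least one shared element" case.

```python
import hashlib


def values_fingerprint(values):
    """Return an order-independent hash of a list of string values."""
    # Sort and de-duplicate so ["a", "b"] and ["b", "a", "a"] hash alike.
    canonical = "\x1f".join(sorted(set(values)))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


same_a = values_fingerprint(["1234567", "09747", "119999888"])
same_b = values_fingerprint(["119999888", "1234567", "09747"])
overlap = values_fingerprint(["1234567", "09747"])

print(same_a == same_b)   # identical sets produce the same hash
print(same_a == overlap)  # overlapping but different sets do not
```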