Here are a sample mapping, the requests that register the documents, the search query, and its result.
mapping
curl -X PUT "es:9200/english" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_doc": {
      "properties": {
        "title" : {
          "type" : "text"
        },
        "contents": {
          "type": "nested"
        }
      }
    }
  }
}
'
register
curl -X PUT "es:9200/english/_doc/1?refresh" -H 'Content-Type: application/json' -d'
{
  "title": "Test title",
  "contents": [
    {
      "header": "something special",
      "body": "I am John."
    },
    {
      "header": "anything hot",
      "body": "This is a cup."
    }
  ]
}
'
curl -X PUT "es:9200/english/_doc/2?refresh" -H 'Content-Type: application/json' -d'
{
  "title": "Test title",
  "contents": [
    {
      "header": "something special",
      "body": "I am John."
    },
    {
      "header": "anything hot",
      "body": "That is a glass."
    }
  ]
}
'
search
curl -XGET "es:9200/english/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "_source": false,
  "size": 20,
  "query": {
    "nested": {
      "path": "contents",
      "score_mode": "max",
      "query": {
        "simple_query_string": {
          "query": "I am",
          "fields": ["contents.header", "contents.body"],
          "auto_generate_synonyms_phrase_query": true
        }
      },
      "inner_hits": {
        "size": 1
      }
    }
  }
}
'
result
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.4723401,
    "hits" : [
      {
        "_index" : "english",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.4723401,
        "inner_hits" : {
          "contents" : {
            "hits" : {
              "total" : 1,
              "max_score" : 1.4723401,
              "hits" : [
                {
                  "_index" : "english",
                  "_type" : "_doc",
                  "_id" : "2",
                  "_nested" : {
                    "field" : "contents",
                    "offset" : 0
                  },
                  "_score" : 1.4723401,
                  "_source" : {
                    "header" : "something special",
                    "body" : "I am John."
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index" : "english",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.4723401,
        "inner_hits" : {
          "contents" : {
            "hits" : {
              "total" : 1,
              "max_score" : 1.4723401,
              "hits" : [
                {
                  "_index" : "english",
                  "_type" : "_doc",
                  "_id" : "1",
                  "_nested" : {
                    "field" : "contents",
                    "offset" : 0
                  },
                  "_score" : 1.4723401,
                  "_source" : {
                    "header" : "something special",
                    "body" : "I am John."
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
The search response contains two hits, and their matched nested content is identical; only "_id" differs.
I would like to remove a hit when its matched content duplicates that of another hit.
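To make the goal concrete, here is a minimal client-side sketch in Python of the outcome I am after. It is only an illustration under my own assumptions, not a solution I am satisfied with: the function name dedupe_hits and the trimmed response dict are made up for this example, and it assumes "inner_hits" has "size": 1 as in the query above, so each document is compared by its single best inner hit.

```python
import json


def dedupe_hits(response):
    """Keep the first top-level hit for each distinct best inner hit."""
    seen = set()
    unique = []
    for hit in response["hits"]["hits"]:
        inner = hit["inner_hits"]["contents"]["hits"]["hits"]
        # With "inner_hits": {"size": 1} there is at most one inner hit.
        source = inner[0]["_source"] if inner else None
        # Serialize with sorted keys so equal objects give equal keys.
        key = json.dumps(source, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


def make_hit(doc_id, header, body):
    """Build a hit trimmed down to the fields the sketch reads."""
    inner = {"_source": {"header": header, "body": body}}
    return {
        "_id": doc_id,
        "inner_hits": {"contents": {"hits": {"hits": [inner]}}},
    }


# Trimmed copy of the search response above (hypothetical helper data).
response = {
    "hits": {
        "hits": [
            make_hit("2", "something special", "I am John."),
            make_hit("1", "something special", "I am John."),
        ]
    }
}

print([hit["_id"] for hit in dedupe_hits(response)])  # prints ['2']
```

The drawback is that this runs after the search, so "size" and "hits.total" still count the duplicates, and a page can end up with fewer than 20 hits. That is why I am looking for a way to do it inside Elasticsearch.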
If anyone has a good solution for this, please help!
The following duplicate-elimination feature, "Field Collapsing", does not seem to fit a nested query.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html