Search by a weighted tag - deadend


(sparky-2) #1

Hi guys, I am struggling with a setup to be able to search using weighted
tags. The score comes out wrong and some objects (that should match) are
ignored completely.

My docs:

[
{"title": "1", "tags": ["a", "b", "c"]},
{"title": "2", "tags": ["a", "b"]},
{"title": "3", "tags": ["c", "b"]},
{"title": "4", "tags": ["b"]}
]

My query:

{
    "query": {
        "custom_filters_score": {
            "query": {
                "terms": {
                    "tags": ["a", "c"]
                }
            },
            "filters": [
                {"filter": {"term": {"tags": "a"}}, "script": "1.0"},
                {"filter": {"term": {"tags": "c"}}, "script": "1.5"}
            ],
            "score_mode": "total"
        }
    }
}

Response:

{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "3",
                "_index": "test",
                "_score": 0.23837921,
                "_source": {
                    "tags": [
                        "c",
                        "b"
                    ],
                    "title": "3"
                },
                "_type": "bit"
            },
            {
                "_id": "1",
                "_index": "test",
                "_score": 0.042195037,
                "_source": {
                    "tags": [
                        "a",
                        "b",
                        "c"
                    ],
                    "title": "1"
                },
                "_type": "bit"
            }
        ],
        "max_score": 0.23837921,
        "total": 2
    },
    "timed_out": false,
    "took": 3
}

However, I expected the following result:

  1. Document 1 (score: 2.5 because "a" and "c")
  2. Document 3 (score: 1.5 because "c")
  3. Document 2 (score: 1.0 because "a")

That is, document #2 is missing completely, and both the order and the
scores are wrong.
Can you give me a hint about what I am doing wrong?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Britta Weber) #2

Hi,

There are two issues here. First, "a" is a stopword, so the analyzer
drops it and a search for ["a","c"] only returns results for ["c"].
Try a different mapping, for example with "index" set to "not_analyzed".
Second, custom_filters_score multiplies the score of the query by the
value from the script. If you want to avoid this, you can wrap the query
in a constant_score like this:

{
    "query": {
        "custom_filters_score": {
            "filters": [
                {
                    "filter": {
                        "term": {
                            "tags": "a"
                        }
                    },
                    "script": "1.0"
                },
                {
                    "filter": {
                        "term": {
                            "tags": "c"
                        }
                    },
                    "script": "1.5"
                }
            ],
            "query": {
                "constant_score": {
                    "query": {
                        "terms": {
                            "tags": [
                                "a",
                                "c"
                            ]
                        }
                    }
                }
            },
            "score_mode": "total"
        }
    }
}
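
To make the two points above concrete, here is a rough toy model in Python. It does not talk to Elasticsearch at all. The stopword set, the helper names (index_terms, search) and the MAPPING body are assumptions made for illustration; the index name "test" and type "bit" are taken from the response in the question. It only mimics the idea that an analyzed field loses "a", and that "total" adds up the values of the matching filters and multiplies the sum by the query score (held at 1.0 here, as with constant_score and its default boost).

```python
import json

# Toy model of the scenario in this thread; no Elasticsearch involved.
STOPWORDS = {"a"}  # assumption: only the stopword that matters here

DOCS = {
    "1": ["a", "b", "c"],
    "2": ["a", "b"],
    "3": ["c", "b"],
    "4": ["b"],
}

# Filter term -> value returned by the filter's script.
WEIGHTS = {"a": 1.0, "c": 1.5}

# Assumed mapping body for type "bit" so that tags are indexed verbatim
# (string field with "index" set to "not_analyzed", as suggested above).
MAPPING = {
    "bit": {
        "properties": {
            "tags": {"type": "string", "index": "not_analyzed"}
        }
    }
}


def index_terms(tags, analyzed):
    """Return the terms that end up in the index for one document."""
    if analyzed:
        # An analyzer with stopword removal silently drops "a".
        return {t for t in tags if t not in STOPWORDS}
    return set(tags)


def search(analyzed, base_score=1.0):
    """Return (doc_id, score) pairs, best first.

    A document is a hit if at least one of the weighted terms is indexed.
    Its score is base_score multiplied by the sum ("total") of the values
    of all filters that match it.
    """
    hits = {}
    for doc_id, tags in DOCS.items():
        terms = index_terms(tags, analyzed)
        matched = [w for term, w in WEIGHTS.items() if term in terms]
        if matched:
            hits[doc_id] = base_score * sum(matched)
    return sorted(hits.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    print("analyzed:    ", search(analyzed=True))
    print("not_analyzed:", search(analyzed=False))
    print(json.dumps(MAPPING, indent=4))
```

With analyzed=True only documents 1 and 3 come back, because "a" never reaches the index. With analyzed=False the model gives 2.5, 1.5 and 1.0 for documents 1, 3 and 2, which is the ranking asked for in the question. In the real index the analyzed case would also show uneven scores, since the plain terms query score is not constant.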

Cheers,
Britta


(sparky-2) #3

Didn't think about the stopwords - that was it! Thanks Britta and have a
nice weekend! :)


