Custom Filters Score Help

I'm using a custom_filters_score query to boost results that have
fb_connections terms in common. I'm finding that results with 4 terms in common
in the fb_connections field aren't necessarily weighted higher than those
with 1 common fb_connection. Is there a way with this query to apply the boost
once per matching term, so the score scales with the number of common terms?

{
  "query": {
    "filtered": {
      "query": {
        "custom_filters_score": {
          "query": {
            "query_string": {
              "query": "Apple"
            }
          },
          "filters": [
            {
              "filter": {
                "terms": {
                  "fb_connections": [
                    "100",
                    "101"
                  ]
                }
              },
              "boost": 5
            }
          ],
          "score_mode": "total"
        }
      },
      "filter": {
        "and": [
          {
            "term": {
              "is_active": true
            }
          },
          {
            "not": {
              "terms": {
                "id": [
                  1
                ]
              }
            }
          },
          {
            "not": {
              "terms": {
                "fb_user_id": [
                  "100",
                  "101"
                ]
              }
            }
          }
        ]
      }
    }
  },
  "size": 8,
  "fields": [
    "id",
    "fb_user_id",
    "linkedin_id"
  ]
}

--

hey,

When you say terms in common, do you mean the terms in your filter? I.e.,
you filter for '100', '101', '102', so if a document has all 3 of them you
want a higher boost than for documents that have only 2 of those terms?

simon

On Wednesday, November 14, 2012 2:54:48 PM UTC+1, Brandon Hilkert wrote:


--

Exactly.

On Nov 15, 2012, at 4:23 AM, simonw simon.willnauer@elasticsearch.com
wrote:


--

The filter in your custom_filters_score is "binary": it either matches, and then
the document gets a boost of 5, or it doesn't, and the document gets no
boost. To get the behavior that Simon described, you need
multiple filters (one for each term) and the "total" or "multiply"
score mode to accumulate boosts from all matching filters.
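
A minimal sketch of the "one filter per term" idea, written in Python only to
generate the request body. The helper names build_connection_filters and
build_query are hypothetical, and the outer "filtered" wrapper from the
original query is left out for brevity. Each connection id gets its own
"term" filter with its own boost, so with "score_mode": "total" the boosts of
all matching filters are summed.

```python
import json


def build_connection_filters(connection_ids, boost=5):
    """Return one custom_filters_score entry per connection id (hypothetical helper)."""
    return [
        {"filter": {"term": {"fb_connections": cid}}, "boost": boost}
        for cid in connection_ids
    ]


def build_query(text, connection_ids, boost=5):
    """Build the request body; field and query names are taken from the original post."""
    return {
        "query": {
            "custom_filters_score": {
                "query": {"query_string": {"query": text}},
                "filters": build_connection_filters(connection_ids, boost),
                # "total" accumulates the boosts of every filter that matches
                "score_mode": "total",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("Apple", ["100", "101"]), indent=2))
```

Note that every connection id becomes a separate filter, so the request grows
linearly with the number of connections.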

On Thursday, November 15, 2012 5:58:22 AM UTC-5, Brandon Hilkert wrote:


--

Ahh....ok. That makes sense. I'll give that a shot. Thanks!

On Nov 15, 2012, at 2:45 PM, Igor Motov imotov@gmail.com wrote:


--

As a follow-up to my initial question: I now iterate through the
connections and add one filter per connection, so the boost is applied once
for each matching term. Everything works, but these are social connections
(LinkedIn, Facebook, Twitter, ...). I found today that some of our production
users have quite a few connections, so I'm getting the following error:

nested: TooManyClauses[maxClauseCount is set to 1024];

Is there a better way to search for a string and boost
results by the number of terms they have in common?

On Thu, Nov 15, 2012 at 2:53 PM, Brandon Hilkert brandon@meeteor.com wrote:


--

First of all, you can increase maxClauseCount using the
indices.query.bool.max_clause_count setting. But to answer your question:
depending on the number of connections and the number of search results, it
might make sense to switch to custom_score and boost connections
by comparing each connection id against a hashmap of the connections that you
want to boost. For a few connections, it will be slower than
custom_filters_score, but for users with a very large number of connections it
might be a faster solution.
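
The lookup idea can be sketched in plain Python. This is not an Elasticsearch
script, only an illustration of the logic such a script would carry out: for
each hit, count how many of the document's connection ids appear in a hash set
of the searcher's connections. The function name connection_boost, the sample
ids, and the boost of 5 per match are assumptions for illustration; how the
result is combined with the text score is not shown.

```python
def connection_boost(doc_connections, wanted, boost=5.0):
    """Return `boost` times the number of document connections found in `wanted`.

    `wanted` should be a set so that each membership test is a hash lookup.
    """
    return boost * sum(1 for cid in doc_connections if cid in wanted)


# Hypothetical data: the searcher's connections and two candidate documents.
wanted = {"100", "101", "102"}
docs = {
    "doc_a": ["100", "101", "102", "555"],  # three connections in common
    "doc_b": ["101", "777"],                # one connection in common
}

# Rank the documents by their accumulated connection boost, highest first.
ranked = sorted(docs, key=lambda d: connection_boost(docs[d], wanted), reverse=True)

if __name__ == "__main__":
    for name in ranked:
        print(name, connection_boost(docs[name], wanted))
```

The work per document depends on how many connections that document has, not
on how many connections the searcher has, which is why this approach avoids
the clause limit.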

On Tuesday, November 27, 2012 2:33:31 PM UTC-5, Brandon Hilkert wrote:


--
