Filtering results after search

Hi guys,

I've run into a bit of a technical issue. I'm doing paged searches of 10
results per page, and I filter those results after the search so that they
match the user's region. The region filtering rules are a bit complex, and I
can't really see them being moved into the query as a filter (details below).

Is it possible to plug a programmatic filtering mechanism into
Elasticsearch without throwing off the total I get back from the search
response (response.getHits().getTotalHits()), and without forcing me to run
subsequent searches to fill a page of 10 results?

The current region filtering is quite complex as it checks for the
existence of certain values before doing a check:

    public static boolean allowedToSell(String collectionName, String serverName, String userCountryCode,
                                        String rightsCountry, String rightsTerritory, String notForSale,
                                        String salesRightsType, List serverNames) {
        if (collectionName.equals("col1")) {
            return randomHouseServerNames.contains(serverName);
        } else if (collectionName.equals("col2")) {
            if (!userCountryCode.equals("AU")) {
                return false;
            }
        } else if (collectionName.equals("col3")) {
            if (userCountryCode.equals("NZ")) {
                return false;
            }
        }
        return isValidForPurchase(userCountryCode, rightsCountry, rightsTerritory, notForSale, salesRightsType);
    }

    private static boolean isValidForPurchase(String userCountryCode, String rightsCountry, String rightsTerritory,
                                              String notForSale, String salesRightsType) {
        if (salesRightsType == null) {
            return false;
        }

        if (notForSale != null && notForSale.contains(userCountryCode)) {
            return false;
        }

        if (rightsTerritory != null && (rightsTerritory.equals("WORLD") || rightsTerritory.equals("ROW"))) {
            return true;
        }

        if (rightsCountry.contains(userCountryCode)) {
            return true;
        }
        return false;
    }

The following fields come from the source of an Elasticsearch document:

  • collectionName
  • rightsCountry
  • rightsTerritory
  • notForSale
  • salesRightsType

The following field values get passed in:

  • userCountryCode
  • serverName (hostname of where request comes from)
  • serverNames (static list of server hostnames)

As you can see, it's not as simple as just moving this logic into the
existing Elasticsearch query as an additional filter.
Could someone please point me in the right direction? I would prefer to move
this logic into a query filter, but I'm not sure that's possible given how
complex it is. I still want to retain the total results count and avoid
running subsequent searches to fill a page when results are filtered out
after the search.

Thanks in advance!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

You can always request more documents than necessary if you post-filter the
documents. However, pagination will be more difficult since the next offset
will also need to be calculated.
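
A rough sketch of the over-fetch approach, to show why the offset needs
separate bookkeeping. It is not tied to the Elasticsearch client: Searcher is
a hypothetical stand-in for whatever issues the search with a given from and
size, and FilteredPage, PostFilterPager and batchSize are illustrative names,
not anything from the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** One page of post-filtered hits plus the raw offset where the next page should start. */
final class FilteredPage<T> {
    final List<T> hits;
    final int nextFrom;

    FilteredPage(List<T> hits, int nextFrom) {
        this.hits = hits;
        this.nextFrom = nextFrom;
    }
}

final class PostFilterPager {

    /** Hypothetical stand-in for a search call: up to `size` raw hits starting at `from`. */
    interface Searcher<T> {
        List<T> search(int from, int size);
    }

    static <T> FilteredPage<T> fetchPage(Searcher<T> searcher, Predicate<T> allowed,
                                         int from, int pageSize, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<T> page = new ArrayList<>();
        int cursor = from;          // raw offset into the unfiltered result list
        boolean exhausted = false;  // true once a batch comes back short
        while (page.size() < pageSize && !exhausted) {
            List<T> batch = searcher.search(cursor, batchSize);
            exhausted = batch.size() < batchSize;
            for (T hit : batch) {
                cursor++;           // every raw hit we look at advances the offset
                if (allowed.test(hit)) {
                    page.add(hit);
                    if (page.size() == pageSize) {
                        break;      // page is full; the rest of the batch is re-read next time
                    }
                }
            }
        }
        return new FilteredPage<>(page, cursor);
    }
}
```

The caller passes nextFrom back as the from value for the following page.
The total hit count reported by the search still includes documents the
post-filter would reject, so this does not fix the count.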

Glancing quickly at your code, the logic does not seem too difficult to
implement as a script. It can be written and compiled in Java, so that
performance will not be impacted. For the values that need to be passed in,
the terms filter lookup can potentially be used.

http://www.elasticsearch.org/guide/reference/modules/scripting/
http://www.elasticsearch.org/guide/reference/query-dsl/terms-filter/ (see
the lookup section)
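
To make that concrete, here is a minimal sketch of the posted rules reshaped
so that document values come from one map and request values from another,
which is roughly the shape a script has to work with. It uses no
Elasticsearch classes; RegionRightsRule and the two map arguments are
hypothetical. It also makes two assumptions the original code does not: the
col1 check uses the passed-in serverNames list rather than
randomHouseServerNames, and missing values yield false instead of throwing.

```java
import java.util.List;
import java.util.Map;

final class RegionRightsRule {

    /** Returns true when the document may be sold to the requesting user. */
    @SuppressWarnings("unchecked")
    static boolean allowedToSell(Map<String, Object> doc, Map<String, Object> params) {
        // Values stored on the document.
        String collectionName = (String) doc.get("collectionName");
        String rightsCountry = (String) doc.get("rightsCountry");
        String rightsTerritory = (String) doc.get("rightsTerritory");
        String notForSale = (String) doc.get("notForSale");
        String salesRightsType = (String) doc.get("salesRightsType");

        // Values supplied with the request.
        String userCountryCode = (String) params.get("userCountryCode");
        String serverName = (String) params.get("serverName");
        List<String> serverNames = (List<String>) params.get("serverNames");

        // Collection-specific rules.
        if ("col1".equals(collectionName)) {
            // Assumption: serverNames plays the role of randomHouseServerNames.
            return serverNames != null && serverNames.contains(serverName);
        }
        if ("col2".equals(collectionName) && !"AU".equals(userCountryCode)) {
            return false;
        }
        if ("col3".equals(collectionName) && "NZ".equals(userCountryCode)) {
            return false;
        }

        // General purchase rules (isValidForPurchase in the original).
        if (salesRightsType == null || userCountryCode == null) {
            return false; // assumption: a missing country code means "not allowed"
        }
        if (notForSale != null && notForSale.contains(userCountryCode)) {
            return false;
        }
        if ("WORLD".equals(rightsTerritory) || "ROW".equals(rightsTerritory)) {
            return true;
        }
        return rightsCountry != null && rightsCountry.contains(userCountryCode);
    }
}
```

Because the method returns a plain boolean per document, the same logic can
be used as a post-filter predicate or moved into a script body.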

Cheers,

Ivan

On Sun, Aug 18, 2013 at 4:24 PM, Wesley Archbell <wesleyarchbell@gmail.com> wrote:

[...]

Yes, that is what I had to do in the past: keep searching until the page
is filled, then pass back the new from offset. It works OK, but it's
not clean.
I will give the scripting feature a go. Do you know if you can access
request parameters from within the script module?

Cheerz,
Wez

On Monday, August 19, 2013 2:48:17 PM UTC+10, Ivan Brusic wrote:

[...]

Yes, you can pass parameters. The script factory object is given a
Map<String, Object> of parameters that can be used by the script instance.
See the custom parameters section
of http://www.elasticsearch.org/guide/reference/query-dsl/script-filter/
for what this looks like in the query JSON.
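
As a rough illustration only, the sketch below builds a script filter body
carrying the two request values from this thread as params. The script name
"region_rights" and the ScriptFilterJson class are hypothetical, the
"lang": "native" entry is an assumption about how a registered native script
is selected, and the key names should be checked against the script-filter
page for the version in use.

```java
final class ScriptFilterJson {

    /** Wraps a value in double quotes, escaping backslashes and quotes only. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the filter part of the query; scriptName is whatever name the script was registered under. */
    static String regionFilter(String scriptName, String userCountryCode, String serverName) {
        return "{\"script\":{"
                + "\"script\":" + quote(scriptName) + ","
                + "\"lang\":\"native\","   // assumption, see the note above
                + "\"params\":{"
                + "\"userCountryCode\":" + quote(userCountryCode) + ","
                + "\"serverName\":" + quote(serverName)
                + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(regionFilter("region_rights", "AU", "host-a"));
    }
}
```

Everything under "params" is what ends up in the Map<String, Object> handed
to the script factory.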

Cheers,
Dan

Yeah, cool, thanks. I managed to figure it out once I created an
implementation of NativeScriptFactory, which creates the script. Do you know
if the intent of the native script is to filter out a result? If so, would
you just return null to filter a document out of the results, and what object
would you pass back to include it?

Cheerz,
Wez

On Monday, August 19, 2013 3:41:15 PM UTC+10, Dan Everton wrote:

[...]