How to Discard Empty Fields When Searching?

Hi,

I am running the following query. Basically, I want to exclude all documents
where lastname starts with a number. The first documents in the result set
are the ones where the lastname field is empty, and I simply want to discard
those. How do I do that?
I tried "missing" : "_last" in the sort, but as I expected it seems to work
only when the field is missing from the document, not when it is present but
empty. I am using a custom PHP class to generate the query, and therefore I
have to use the "query_string" query.

Thank you.

{
  "filter": {
    "and": [
      {
        "query": {
          "query_string": {
            "query": "active",
            "default_field": "status"
          }
        }
      },
      {
        "query": {
          "query_string": {
            "query": "+lastname:(-0* -1* -2* -3* -4* -5* -6* -7* -8* -9*)",
            "default_field": "lastname"
          }
        }
      }
    ]
  },
  "sort": [
    {
      "lastname_not_analyzed": {
        "order": "asc",
        "missing": "_last"
      }
    }
  ],
  "size": 10,
  "from": 0
}

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

The easiest thing would be to filter out documents with an empty lastname,
or a lastname beginning with a number:

POST /_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "status": "active"
              }
            },
            {
              "exists": {
                "field": "lastname"
              }
            }
          ],
          "must_not": [
            {
              "regexp": {
                "lastname": "[0-9].+"
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "lastname_not_analyzed": {
        "order": "asc"
      }
    }
  ]
}
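
To make the intent of the three clauses concrete, here is a rough Python sketch of the same logic applied to plain dicts. It only illustrates the filter semantics and is not how Elasticsearch evaluates the request. The `keep` function and the sample documents are made up for illustration, and the sketch assumes that an empty lastname yields no indexed terms (so `exists` drops it) and that lastname is a single unanalyzed string.

```python
import re

# Same pattern as the regexp filter above. Elasticsearch regexps must match
# the whole term, which re.fullmatch mirrors.
STARTS_WITH_DIGIT = re.compile(r"[0-9].+")

def keep(doc):
    """Hypothetical helper mirroring the bool filter's must / must_not clauses."""
    lastname = doc.get("lastname")
    if doc.get("status") != "active":          # must: term status=active
        return False
    if not lastname:                           # must: exists (assumed to drop None and "")
        return False
    if STARTS_WITH_DIGIT.fullmatch(lastname):  # must_not: regexp
        return False
    return True

# Made-up sample documents
docs = [
    {"status": "active", "lastname": "Zhang"},
    {"status": "active", "lastname": ""},
    {"status": "active", "lastname": "42nd"},
    {"status": "inactive", "lastname": "Baker"},
    {"status": "active"},
    {"status": "active", "lastname": "Adams"},
]

# Filter, then sort ascending on lastname as the "sort" clause does
kept = sorted((d for d in docs if keep(d)), key=lambda d: d["lastname"])
print([d["lastname"] for d in kept])
```

Only the active documents with a non-empty, non-numeric lastname survive, so nothing with an empty lastname is left to sort to the top.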

Notes:

  1. Instead of having a separate field called lastname_not_analyzed, look at
    using multi_fields:
    http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-multi-field-type.html

  2. Running the regexp against lastname will match against any word in
    the lastname, not just the first.
    Perhaps you want to match against lastname_not_analyzed instead?
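
The second note can be illustrated with a small Python sketch. It assumes the analyzed lastname is lowercased and split on whitespace; the `tokens` helper is a simplified stand-in for the real analyzer, and both `matches_*` functions are hypothetical names. The regexp is tested against each indexed term as a whole, so on the analyzed field a digit-leading word anywhere in the name matches, while on the not_analyzed field only the start of the full value counts.

```python
import re

# Same pattern as the regexp filter; fullmatch mimics whole-term matching
PATTERN = re.compile(r"[0-9].+")

def tokens(value):
    # Simplified stand-in for analysis: lowercase and split on whitespace
    return value.lower().split()

def matches_analyzed(value):
    # Analyzed field: the pattern is tried against every term separately
    return any(PATTERN.fullmatch(t) for t in tokens(value))

def matches_not_analyzed(value):
    # not_analyzed field: the whole value is a single term
    return PATTERN.fullmatch(value) is not None

name = "Van 3rd"
print(matches_analyzed(name))      # True: the term "3rd" starts with a digit
print(matches_not_analyzed(name))  # False: the full value starts with "V"
```

With the filter on the analyzed field, a document like this would be excluded even though its lastname does not start with a number.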

Hi Clinton,

Thanks for the answer. I will try the suggestions.

Is there any particular reason for this behavior? When sorting on lastname,
why are the documents where lastname is empty included and listed first?

Thank you.

On Friday, October 18, 2013 6:22:16 PM UTC+8, Eric wrote the original message at the top of this thread.