Query for empty field, add new field later possible?

Hi,

I'm having trouble finding the correct query to get all documents
where a specific field is not set at all.

My test index has 3 text fields: field1, field2, field3. Some
documents have all fields filled, some only one or two of them.
What would be the query to get all documents with an empty field2?

Let's say time goes by, the index grows, and I would like to add another
field "field4". Do I have to reindex all documents in the index again
for such a schema change?

Thank you,

Jean

Hi,

I found a way to filter for an unset field, but I don't know if it's
the most elegant way to do it. I guess the wildcard query has some
computing overhead.

{
  "query": {
    "bool": {
      "must_not": {
        "wildcard": {
          "field2": "*"
        }
      }
    }
  }
}

Regarding adding a new field, it's no problem at all, but ES still
does some of its magic and sets the type by itself. If a value looks
like 2012-02-02, the field automatically becomes a date field. I
tried to set dynamic to false, with no luck.

Since I don't know the field names and contents in advance,
I want to just save everything as a string for now.

On Tue, 2012-02-07 at 02:30 -0800, jeangld@yahoo.com wrote:

Hi,

I'm having troubles to find the correct query to get all documents
where a specific field is not set at all.

My test index has 3 text fields: field1, field2, field3. Some
documents, have all fields filled, some only one or two of the fields.
What would be the query to get all documents with an empty field2?

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "missing" : {
          "field" : "field2"
        }
      }
    }
  }
}
'
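
For anyone building the request body in code rather than by hand, here is a minimal Python sketch of the same body. The function names are made up for illustration, and `has_no_value` is only a simplified local model of what the filter selects (field absent or null), not how Elasticsearch evaluates it. The second builder shows the `bool` / `must_not` / `exists` form, which later Elasticsearch releases use because the `missing` filter was removed in 5.0; that is a note about newer versions, not part of the request above.

```python
import json


def missing_field_query(field):
    """Same body as the curl request: documents with no value for `field`."""
    return {"query": {"constant_score": {"filter": {"missing": {"field": field}}}}}


def not_exists_query(field):
    """Form used by later releases: must_not + exists."""
    return {"query": {"bool": {"must_not": {"exists": {"field": field}}}}}


def has_no_value(doc, field):
    """Simplified local model: an absent or null field counts as missing."""
    return doc.get(field) is None


# Sample documents (invented for this sketch).
docs = [
    {"field1": "a", "field2": "b", "field3": "c"},
    {"field1": "a", "field3": "c"},
    {"field1": "a", "field2": None},
]
matching = [d for d in docs if has_no_value(d, "field2")]

print(json.dumps(missing_field_query("field2"), indent=2))
print(len(matching))
```

Either body can be passed to curl with -d exactly as in the request above.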

Let's say time goes by, the index grew, and I like to add another
field "field4". Do I have to reindex all documents in the index again
for such a schema change?

No - adding a new field is not a problem. Changing an existing field
(eg field type or analyzer) would require reindexing.
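
As a rough illustration of that rule, the check below compares the field definitions of two mappings. `needs_reindex` is a hypothetical helper, not an Elasticsearch API, and the dicts only mimic the `properties` section of a mapping under the assumption that every existing field is a plain string field.

```python
def needs_reindex(old_props, new_props):
    """True if a field that already exists is changed or missing in new_props.

    Fields that appear only in new_props are additions and do not count.
    """
    for name, definition in old_props.items():
        if new_props.get(name) != definition:
            return True
    return False


# Assumed current field definitions for the test index.
current = {
    "field1": {"type": "string"},
    "field2": {"type": "string"},
    "field3": {"type": "string"},
}

# Adding field4 leaves the existing fields untouched.
with_field4 = dict(current, field4={"type": "string"})

# Changing the type of field2 alters an existing field.
retyped = dict(current, field2={"type": "date"})

print(needs_reindex(current, with_field4))
print(needs_reindex(current, retyped))
```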

clint

Hi clint,

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "missing" : {
          "field" : "field2"
        }
      }
    }
  }
}
'

Your code works like a charm :) I also measured the speed, and it is
significantly faster than my wildcard nonsense.

No - adding a new field is not a problem. Changing an existing field
(eg field type or analyzer) would require reindexing.

Right. The only thing I'm trying now with new fields is to bypass the
dynamic type detection altogether. To begin with, I just want everything
to be a string field, even if it looks like a date (2012-02-21...).
Later on, I can decide what the correct field type is, change it, and
reindex the related stuff.
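
One possible way to get there, as an untested sketch: a mapping that turns off date detection and uses a catch-all dynamic template to map every new field to a string. The type name "doc" and the template name "everything_as_string" are placeholders. It assumes flat documents without nested objects and the "string" field type of Elasticsearch releases from this period (later versions split it into "text" and "keyword").

```python
import json


def all_strings_mapping(type_name):
    """Build a mapping body that treats every dynamically added field as a string."""
    return {
        type_name: {
            # Stop date-looking strings such as 2012-02-21 from becoming date fields.
            "date_detection": False,
            "dynamic_templates": [
                {
                    "everything_as_string": {  # placeholder template name
                        "match": "*",
                        "mapping": {"type": "string"},
                    }
                }
            ],
        }
    }


body = all_strings_mapping("doc")
print(json.dumps(body, indent=2))
```

The printed JSON would be supplied as the type mapping when the index is created, so it is in place before the first document arrives.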