Problems with "filtered"


(Pascal P. Pochet) #1

ES 0.14.3
The following syntax "works" (no syntax error but result is wrong, 0
when it should be 3) on _count
curl -s -XGET 'http://localhost:9200/INDEX/TYPE/_search?pretty=true' -d '{
"filtered" : {
"query" : {
"matchAll" : {}
},
"filter" : {
"exists" : { "field" : "geoloc" }
}
}
}'
output is :
{
"count" : 0,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}

but fails with _search: (furthermore doesn't output a pretty result as
specified in the GET request)
{
"error" : "SearchPhaseExecutionException[Failed to execute phase
[query], total failure; shardFailures {[pb8b2EvRTSmQX8numgtNFg]
[swissalive][0]: SearchParseException[[swissalive][0]:
from[-1],size[-1]: Parse Failure [Failed to parse source [\n{\n\t
"filtered" : {\n\t "query" : {\n\t\t\t\t"matchAll" : {}\n
\t },\n\t "filter" : {\n\t "exists" :
{ "field" : "geoloc" }\n\t }\n\t }\n}]]]; nested:
SearchParseException[[swissalive][0]: from[-1],size[-1]: Parse Failure
[No parser for element [filtered]]]; }{[pb8b2EvRTSmQX8numgtNFg]
[swissalive][1]: SearchParseException[[swissalive][1]:
from[-1],size[-1]: Parse Failure [Failed to parse source [\n{\n\t
"filtered" : {\n\t "query" : {\n\t\t\t\t"matchAll" : {}\n
\t },\n\t "filter" : {\n\t "exists" :
{ "field" : "geoloc" }\n\t }\n\t }\n}]]]; nested:
SearchParseException[[swissalive][1]: from[-1],size[-1]: Parse Failure
[No parser for element [filtered]]]; }{[pb8b2EvRTSmQX8numgtNFg]
[swissalive][2]: SearchParseException[[swissalive][2]:
from[-1],size[-1]: Parse Failure [Failed to parse source [\n{\n\t
"filtered" : {\n\t "query" : {\n\t\t\t\t"matchAll" : {}\n
\t },\n\t "filter" : {\n\t "exists" :
{ "field" : "geoloc" }\n\t }\n\t }\n}]]]; nested:
SearchParseException[[swissalive][2]: from[-1],size[-1]: Parse Failure
[No parser for element [filtered]]]; }{[pb8b2EvRTSmQX8numgtNFg]
[swissalive][3]: SearchParseException[[swissalive][3]:
from[-1],size[-1]: Parse Failure [Failed to parse source [\n{\n\t
"filtered" : {\n\t "query" : {\n\t\t\t\t"matchAll" : {}\n
\t },\n\t "filter" : {\n\t "exists" :
{ "field" : "geoloc" }\n\t }\n\t }\n}]]]; nested:
SearchParseException[[swissalive][3]: from[-1],size[-1]: Parse Failure
[No parser for element [filtered]]]; }{[pb8b2EvRTSmQX8numgtNFg]
[swissalive][4]: SearchParseException[[swissalive][4]:
from[-1],size[-1]: Parse Failure [Failed to parse source [\n{\n\t
"filtered" : {\n\t "query" : {\n\t\t\t\t"matchAll" : {}\n
\t },\n\t "filter" : {\n\t "exists" :
{ "field" : "geoloc" }\n\t }\n\t }\n}]]]; nested:
SearchParseException[[swissalive][4]: from[-1],size[-1]: Parse Failure
[No parser for element [filtered]]]; }]"
}

(mapping of TYPE is
{"properties":{"zip":{"type":"string"},"geoloc":
{"type":"geo_point"},"phone":{"type":"string"},"presentations":
{"properties":{"de":{"type":"string"},"it":{"type":"string"},"fr":
{"type":"string"},"en":{"type":"string"},"es":
{"type":"string"}}},"street":{"type":"string"},"localch_id":
{"type":"string"},"city":{"type":"string"},"contacts":{"properties":
{"faxes":{"type":"string"},"mobiles":{"type":"string"},"emails":
{"type":"string"},"phones":{"type":"string"}}},"updated":
{"format":"dateOptionalTime","type":"date"},"name":
{"type":"string"},"categories":{"type":"string"},"checked":
{"format":"dateOptionalTime","type":"date"}}}
)

Is it me, or is it a bug?

(the output of
{
"fields" : [ "geoloc" ],
"query" : {
"matchAll" : {}
}
}'
being (simplified) :
{
"took" : 5,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 8,
"max_score" : 1.0,
"hits" : [ {
...
"fields" : {
"geoloc" : {
"lon" : 8.969346,
"lat" : 45.87649
}
}
}, {
...
"fields" : {
"geoloc" : null
}
}, {
...
"fields" : {
"geoloc" : {
"lon" : 6.146914,
"lat" : 46.202218
}
}
}, {
...
}, {
...
"fields" : {
"geoloc" : null
}
}, {
...
}, {
...
"fields" : {
"geoloc" : null
}
}, {
...
"fields" : {
"geoloc" : {
"lon" : 6.933529,
"lat" : 46.992651
}
}
} ]
}
}
)
3 entries out of 8 have a geoloc field. Of the 5 without geoloc, 3
result in "geoloc" : null while the other 2 have no output at all
apart from the (_index, _type, _id, _score) tuple, of course
(skipped here because it is not meaningful for the question).

Any reason to look for a difference there ?


(Clinton Gormley) #2

Hi P3

The following syntax "works" (no syntax error but result is wrong, 0
when it should be 3) on _count
curl -s -XGET 'http://localhost:9200/INDEX/TYPE/_search?pretty=true' -

You mean _count above

d '{
"filtered" : {

This is correct - count accepts the query type at the top level, because
count doesn't do facets etc.

However, _search needs the query to be wrapped in "query": {....}
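
To illustrate the difference between the two endpoints, here is a minimal Python sketch (not part of the original thread) that builds both request bodies from the same filtered query. The helper names `count_body` and `search_body` are hypothetical, nothing is sent to a server, and the sketch assumes the 0.14-era behaviour described above.

```python
import json

# The query from the original post: match everything, then keep only
# documents in which the "geoloc" field exists.
filtered_query = {
    "filtered": {
        "query": {"matchAll": {}},
        "filter": {"exists": {"field": "geoloc"}},
    }
}


def count_body(query):
    """Body for _count: the query type sits at the top level."""
    return query


def search_body(query, fields=None):
    """Body for _search: the query is wrapped in a "query" element."""
    body = {"query": query}
    if fields is not None:
        body["fields"] = fields
    return body


if __name__ == "__main__":
    print(json.dumps(count_body(filtered_query), indent=2))
    print(json.dumps(search_body(filtered_query, fields=["geoloc"]), indent=2))
```

Sending the `_count` body to `_search` is what produces the "No parser for element [filtered]" failure quoted in the first post, because `_search` expects elements such as "query" or "fields" at the top level.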

Here's a working example on master:

[Tue Feb 8 10:54:16 2011] Protocol: http, Server: 127.0.0.1:9200

curl -XPUT 'http://127.0.0.1:9200/foo/'

[Tue Feb 8 10:54:17 2011] Response:

{
"ok" : true,
"acknowledged" : true
}

[Tue Feb 8 10:54:18 2011] Protocol: http, Server: 127.0.0.1:9200

curl -XPUT 'http://127.0.0.1:9200/foo/bar/_mapping' -d '
{
"bar" : {
"properties" : {
"contacts" : {
"properties" : {
"emails" : {
"type" : "string"
},
"phones" : {
"type" : "string"
},
"mobiles" : {
"type" : "string"
},
"faxes" : {
"type" : "string"
}
}
},
"localch_id" : {
"type" : "string"
},
"street" : {
"type" : "string"
},
"name" : {
"type" : "string"
},
"categories" : {
"type" : "string"
},
"phone" : {
"type" : "string"
},
"checked" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"zip" : {
"type" : "string"
},
"city" : {
"type" : "string"
},
"updated" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"geoloc" : {
"type" : "geo_point"
},
"presentations" : {
"properties" : {
"en" : {
"type" : "string"
},
"fr" : {
"type" : "string"
},
"it" : {
"type" : "string"
},
"de" : {
"type" : "string"
},
"es" : {
"type" : "string"
}
}
}
}
}
}
'

[Tue Feb 8 10:54:18 2011] Response:

{
"ok" : true,
"acknowledged" : true
}

[Tue Feb 8 10:54:21 2011] Protocol: http, Server: 127.0.0.1:9200

curl -XPOST 'http://127.0.0.1:9200/_bulk' -d '
{"index" : {"_index" : "foo", "_type" : "bar"}}
{"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}}
{"index" : {"_index" : "foo", "_type" : "bar"}}
{"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}}
{"index" : {"_index" : "foo", "_type" : "bar"}}
{"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}}
{"index" : {"_index" : "foo", "_type" : "bar"}}
{"geoloc" : null}
{"index" : {"_index" : "foo", "_type" : "bar"}}
{"geoloc" : null}
'

[Tue Feb 8 10:54:21 2011] Response:

{
"items" : [
{
"create" : {
"ok" : true,
"_index" : "foo",
"_id" : "XOHsgqcwShiTU-HG9RlX_w",
"_type" : "bar",
"_version" : 1
}
},
{
"create" : {
"ok" : true,
"_index" : "foo",
"_id" : "44CqdtQ6RVWkFK_J4pMjug",
"_type" : "bar",
"_version" : 1
}
},
{
"create" : {
"ok" : true,
"_index" : "foo",
"_id" : "7s5q4ELnS-_xaXOFn5VZQ",
"_type" : "bar",
"_version" : 1
}
},
{
"create" : {
"ok" : true,
"_index" : "foo",
"_id" : "JIJ2lFOiRNOqsdd67v4Inw",
"_type" : "bar",
"_version" : 1
}
},
{
"create" : {
"ok" : true,
"_index" : "foo",
"_id" : "MmfXLWg5Q0mmzLALWznjuA",
"_type" : "bar",
"_version" : 1
}
}
],
"took" : 6
}

[Tue Feb 8 10:54:24 2011] Protocol: http, Server: 127.0.0.1:9200

curl -XGET 'http://127.0.0.1:9200/_all/_search' -d '
{
"fields" : [
"geoloc"
],
"query" : {
"filtered" : {
"filter" : {
"exists" : {
"field" : "geoloc"
}
},
"query" : {
"matchAll" : {}
}
}
}
}
'

[Tue Feb 8 10:54:24 2011] Response:

{
"hits" : {
"hits" : [
{
"_score" : 1,
"fields" : {
"geoloc" : {
"lat" : 46.202218,
"lon" : 6.146914
}
},
"_index" : "foo",
"_id" : "XOHsgqcwShiTU-HG9RlX_w",
"_type" : "bar",
"_version" : 1
},
{
"_score" : 1,
"fields" : {
"geoloc" : {
"lat" : 46.202218,
"lon" : 6.146914
}
},
"_index" : "foo",
"_id" : "7s5q4ELnS-_xaXOFn5VZQ",
"_type" : "bar",
"_version" : 1
},
{
"_score" : 1,
"fields" : {
"geoloc" : {
"lat" : 46.202218,
"lon" : 6.146914
}
},
"_index" : "foo",
"_id" : "44CqdtQ6RVWkFK_J4pMjug",
"_type" : "bar",
"_version" : 1
}
],
"max_score" : 1,
"total" : 3
},
"timed_out" : false,
"_shards" : {
"failed" : 0,
"successful" : 5,
"total" : 5
},
"took" : 3
}


(Pascal P. Pochet) #3

Ok, thank you for the reminder regarding the _search syntax, this one
was my fault…

But your example still doesn't work here :
The create with _bulk works:
{
"items": [{
"create": {
"_index": "foo",
"_type": "bar",
"_id": "z54W8QoIT1-ojVL05XMr3w",
"ok": true
}
},
{
"create": {
"_index": "foo",
"_type": "bar",
"_id": "UkOS2_FkQd2WJwEBDqc96w",
"ok": true
}
},
{
"create": {
"_index": "foo",
"_type": "bar",
"_id": "4PYPyRhUR8ubnroSX9gRng",
"ok": true
}
},
{
"create": {
"_index": "foo",
"_type": "bar",
"_id": "YuS-8Te4RxWRRxdu_5hmQQ",
"ok": true
}
},
{
"create": {
"_index": "foo",
"_type": "bar",
"_id": "8FdnpXaATv-JOnKSs947Jg",
"ok": true
}
}]
}

The search (the one you posted above) still returns 0 hits:
{
"took": 7,
"_shards": {
"total": 35,
"successful": 35,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}

while the unfiltered returns the 8 entries


(Pascal P. Pochet) #4

FYI

After stopping ES, deleting "data" and restarting,
your code works.

So I suspect the implicit mapping generated by the data import is
the source of the problem.
I will continue testing to get closer to the source of the problem.


(Pascal P. Pochet) #5

As soon as I add one entry of my INDEX/TYPE, the results are broken:

SEARCH EXISTS GEOLOC
{
"took" : 3,
"_shards" : {
"total" : 10,
"successful" : 10,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
SEARCH _ALL
{
"took" : 2,
"_shards" : {
"total" : 10,
"successful" : 10,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 1.0,
"hits" : [ {
"_index" : "INDEX",
"_type" : "TYPE",
"_id" : "3245a07a-7943-454d-abdd-535703996c9f",
"_score" : 1.0, "_source" : { "entry" : { "checked" :
"2011-02-09T09:04:24+01:00", "updated" :
"2011-02-09T09:04:24+01:00", "localch_id" : "-26Nm97xRgaXa-
gxb6mLIA", "name" : "Administration Cantonale Genevoise Département de
l'instruction publique Service des ressources humaines (engagement de
nouveaux enseignants)", "phone" : "", "street" : "rue Jean-Calvin
11", "zip" : "1204", "city" : "Genève", "contacts" : { "phones" :
[ "022 546 75 50" ], "faxes" : [ ], "mobiles" : [ ], "emails" :
[ ] } , "categories" : { } , "geoloc" : { "lat" : 46.202218,
"lon" : 6.146914 } }}
}, {
"_index" : "foo",
"_type" : "bar",
"_id" : "UZadGMreSfK1gA8ZCUXBFA",
"_score" : 1.0, "_source" : {"geoloc" : null}
}, {
"_index" : "foo",
"_type" : "bar",
"_id" : "hdjMbnV7SWW0Ed0sikZ0tg",
"_score" : 1.0, "_source" : {"geoloc" : {"lat" : 46.202218,
"lon" : 6.146914}}
}, {
"_index" : "foo",
"_type" : "bar",
"_id" : "Adn0wrI3RgqJUuZvnwtpug",
"_score" : 1.0, "_source" : {"geoloc" : {"lat" : 46.202218,
"lon" : 6.146914}}
}, {
"_index" : "foo",
"_type" : "bar",
"_id" : "kA7EyGMxSXeNyAajhGMORQ",
"_score" : 1.0, "_source" : {"geoloc" : null}
}, {
"_index" : "foo",
"_type" : "bar",
"_id" : "d_Uv2ZITQXal47ebIlIwKg",
"_score" : 1.0, "_source" : {"geoloc" : {"lat" : 46.202218,
"lon" : 6.146914}}
} ]
}
}
INDEX/TYPE _mapping
{"INDEX":{"TYPE":{"properties":{"zip":{"type":"string"},"phone":
{"type":"string"},"geoloc":{"properties":{"lon":
{"type":"double"},"lat":{"type":"double"}}},"updated":
{"format":"dateOptionalTime","type":"date"},"street":
{"type":"string"},"name":{"type":"string"},"localch_id":
{"type":"string"},"categories":{"type":"object"},"checked":
{"format":"dateOptionalTime","type":"date"},"contacts":{"properties":
{"phones":{"type":"string"}}},"city":{"type":"string"}}}}}
FOO/BAR _mapping
{"foo":{}}

I stopped ES, deleted the data, restarted, inserted your bulk example,
ran the test and it was OK, then added 1 entry using:

cat <<EOF > /tmp/entry
{
"TYPE" : {
"checked" : "2011-02-09T09:04:24+01:00",
"updated" : "2011-02-09T09:04:24+01:00",
"localch_id" : "-26Nm97xRgaXa-gxb6mLIA",
"name" : "Administration Cantonale Genevoise Département de l'instruction publique Service des ressources humaines (engagement de nouveaux enseignants)",
"phone" : "",
"street" : "rue Jean-Calvin 11",
"zip" : "1204",
"city" : "Genève",
"contacts" : {
"phones" : [ "022 546 75 50" ],
"faxes" : [ ],
"mobiles" : [ ],
"emails" : [ ]
},
"categories" : { },
"geoloc" : { "lat" : 46.202218, "lon" : 6.146914 }
}
}
EOF
curl -s -XPUT 'http://localhost:9200/INDEX/TYPE/3245a07a-7943-454d-abdd-535703996c9f' -d @/tmp/entry
(Note that the use of a temp file with curl was required because of
the presence of a single quote in the "name" value: bash doesn't like
it...)

and got the result above.
The filter on geoloc doesn't work anymore
(note that I have also tried without the TYPE level in the PUT and
the results were identical).

And the first visible difference between the two create modes seems to
be the implicit creation of the _mapping.
Could that be the culprit ?

Second remark:
the default mapping generated for geoloc is not a geo_point but a
record containing 2 double values, which also means that
we are forced to use explicit PUT of the mapping before the first
insert to be able to use geo location functions later.
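
The two mappings quoted in this thread can be told apart with a small Python sketch (not from the original posts). `field_kind` is a hypothetical helper that only inspects the JSON returned by `_mapping`; the two sample dicts are trimmed copies of the explicit mapping from the first post and the implicit one shown above.

```python
def field_kind(type_mapping, field):
    """Classify one field of a type mapping.

    Returns the declared "type" if there is one, "object" if the field
    only has nested "properties", and "unmapped" if it is absent.
    """
    spec = type_mapping.get("properties", {}).get(field)
    if spec is None:
        return "unmapped"
    if "type" in spec:
        return spec["type"]
    if "properties" in spec:
        return "object"
    return "unknown"


# Explicit mapping (PUT before the first insert): geoloc is a geo_point.
explicit = {"properties": {"geoloc": {"type": "geo_point"}}}

# Implicit mapping generated on first insert: an object with two doubles.
implicit = {
    "properties": {
        "geoloc": {
            "properties": {
                "lon": {"type": "double"},
                "lat": {"type": "double"},
            }
        }
    }
}

if __name__ == "__main__":
    print(field_kind(explicit, "geoloc"))
    print(field_kind(implicit, "geoloc"))
```

A check like this, run against the `_mapping` output before loading data, would show early whether geo functions can be expected to work on the field.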


(Shay Banon) #6

If you can gist a curl recreation, I can have a look.


(Pascal P. Pochet) #7

Shay,

I got the solution:

As soon as you create the data with PUT INDEX/TYPE/KEY instead of
_bulk, the problem appears (I suspect because an implicit mapping is
created…).
BUT
specifying "geoloc.lat" instead of "geoloc" in the filter solves the
problem:

curl -s -XPUT 'http://127.0.0.1:9200/foo/bar/1' -d '
{
"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}
}'
echo ""

curl -s -XPUT 'http://127.0.0.1:9200/foo/bar/2' -d '
{"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}}'
echo ""

curl -s -XPUT 'http://127.0.0.1:9200/foo/bar/3' -d '
{"geoloc" : {"lat" : 46.202218, "lon" : 6.146914}}'
echo ""

curl -s -XPUT 'http://127.0.0.1:9200/foo/bar/4' -d '
{"geoloc" : null}'
echo ""

curl -s -XPUT 'http://127.0.0.1:9200/foo/bar/5' -d '
{"geoloc" : null}'
echo ""

echo "This one will NOT return geoloc fields when present"
curl -s -XGET 'http://127.0.0.1:9200/_all/_search?pretty=true' -d '
{
"fields" : [
"geoloc"
],
"query" : {
"matchAll" : {}
}
}'
echo ""

echo "This one will return geoloc.lat and .lon fields when geoloc present"
curl -s -XGET 'http://127.0.0.1:9200/_all/_search?pretty=true' -d '
{
"fields" : [
"geoloc.lat", "geoloc.lon"
],
"query" : {
"matchAll" : {}
}
}'
echo ""

echo "This one works (returns 3):"
curl -s -XGET 'http://localhost:9200/foo/bar/_count?pretty=true' -d '
{
"filtered" : {
"query" : {
"matchAll" : {}
},
"filter" : {
"exists" : { "field" : "geoloc.lat" }
}
}
}'
echo ""

echo "This one fails (returns 0):"
curl -s -XGET 'http://localhost:9200/foo/bar/_count?pretty=true' -d '
{
"filtered" : {
"query" : {
"matchAll" : {}
},
"filter" : {
"exists" : { "field" : "geoloc" }
}
}
}'
echo ""



(Pascal P. Pochet) #8

However, there is still a problem:

I really need "geo_point" type for "geoloc" field,
so I PUT an explicit mapping at the beginning of the data loading.

Then there are no longer any explicit/accessible/… "lat" and "lon" fields in
"geoloc"…
thus the trick of filtering on "geoloc.lat" instead of "geoloc" doesn't
work anymore…
and since filtering on "geoloc" doesn't work either,
we are back to the starting point.


(Shay Banon) #9

Please stop putting code in the mails, it's unreadable. I don't understand the problem; gist an example that recreates it, and we will have a look.


(Pascal P. Pochet) #10

The problem is simple: with _bulk no implicit mapping is generated, while
adding entries one by one generates one.
"exists" on "geoloc" works when there is no mapping and doesn't
work when there is one.
And since you have to create the mapping explicitly before loading data
to have the "geoloc" entry treated as a "geo_point" (and to be able to use
the geo functionality), you are bitten by the problem with "exists",
with no way to avoid it…



(Clinton Gormley) #11

P3 - as Shay said - please gist a curl recreation

That way it is easy for him to replicate the problem, and to test
whether it has been fixed or not.

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.


(Shay Banon) #12

It's hard to understand your description of the problem. What is "exists"? If you want lat/lon to be indexed as well (taking a long shot here that maybe that's what you are after), then you can configure the geo_point type to enable it by setting "lat_lon" on it to true.

-shay.banon
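
As a rough illustration of that suggestion, the Python sketch below (not from the original message) builds a mapping body for the `foo`/`bar` example used earlier in the thread, with "lat_lon" set to true on the geo_point field. The option name is the one given above; its availability and exact effect depend on the Elasticsearch version, so treat that as an assumption to verify. The body would be PUT to `/foo/bar/_mapping` before indexing any document.

```python
import json

# Mapping for type "bar": geoloc is a geo_point, and "lat_lon" asks for
# the latitude and longitude to be indexed as well (assumed behaviour,
# see the note above).
mapping_body = {
    "bar": {
        "properties": {
            "geoloc": {
                "type": "geo_point",
                "lat_lon": True,
            }
        }
    }
}

if __name__ == "__main__":
    # Print the JSON that would be sent as the request body.
    print(json.dumps(mapping_body, indent=2))
```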


(Pascal P. Pochet) #13

"exists" is the filter function…
http://www.elasticsearch.org/guide/reference/query-dsl/exists-filter.html

I don't see what you don't understand in the fact that a "_bulk"
insert doesn't create an implicit mapping while a "normal" (INDEX/TYPE/
ID) one does…
the strange behavior of "exists" function in the filter comes from
that simple fact: filtering "exists" on the "record" (here above
"geoloc") doesn't work once an implicit mapping exists, but "exists"
on a "field of the record" (e.g. geoloc.lat) does.



(Clinton Gormley) #14

I don't see what you don't understand

P3 - you have been asked several times now: Please provide a gist with a
curl recreation of the problem.

It is difficult to read code in emails, and it may be a bug, or it may
be a problem with your syntax.

Kimchy has a long list of things he is working on. He, of course, wants
to fix any bugs and to make ElasticSearch better.

If you want this problem fixed, then please provide (via a gist) a
recreation of the problem so that:
a) it is easy to see where the problem is
b) it is easy to test

that way, you are likely to get the issue fixed quickly

clint


(Administrator-2) #15

Clinton,

For those that might be new or do not know, would you please provide details about what a "gist" is or point to a resource that details the process?

For example, I know that the most efficient way to have a problem looked at or get help is to provide a gist, but I have no clue what it is or how to create one.

Thanks in advance.

	- Nick



(Clinton Gormley) #16

Hi Nick

For those that might be new or do not know, would you please provide
details about what a "gist" is or point to a resource that details the
process?

Sorry - assumed knowledge - mea culpa :)

Gists are a bit like pastebin, but are versioned. They are provided by
github: https://gist.github.com/

to quote github:

    "Gist is a simple way to share snippets and pastes with others.
    All gists are git repositories, so they are automatically
    versioned, forkable and usable as a git repository."

For example, I know that the most efficient way to have a problem
looked at or get help is to provide a gist, but I have no clue what it
is or how to create one.

I will write a page on the docs about: how to report a problem, which
should make it clearer for all

thanks

clint


(Administrator-2) #17

Clinton,

Excellent, much appreciated!

	- Nick



(Pascal P. Pochet) #18

Here is the gist: https://gist.github.com/823820


(Shay Banon) #19

This happens because geoloc is not really a field in the indexed document; geoloc.lat and geoloc.lon are (and they are treated as simple numeric values, with no geo features enabled). This in turn happens because you did not specify in a mapping that geoloc is of type "geo_point" (the mapping needs to be specified before indexing a doc). Once you do that, it will work.
On Saturday, February 12, 2011 at 5:37 PM, P3 wrote:

Here is the gist

https://gist.github.com/823820
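
The explanation above can be pictured with a short Python sketch. It is a simplified model, not Elasticsearch's indexing code: it assumes that for a dynamically mapped object only the leaf values become indexed field names and that null values are not indexed. `indexed_field_names` and `exists` are hypothetical helpers.

```python
def indexed_field_names(doc, prefix=""):
    """Collect dotted names of the leaf values in a JSON-like document."""
    names = set()
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            # An object contributes its children, not itself.
            names |= indexed_field_names(value, path + ".")
        elif value is not None:
            names.add(path)
    return names


def exists(doc, field):
    """Toy version of the exists filter over the modelled field names."""
    return field in indexed_field_names(doc)


with_point = {"geoloc": {"lat": 46.202218, "lon": 6.146914}}
without_point = {"geoloc": None}

if __name__ == "__main__":
    print(sorted(indexed_field_names(with_point)))
    print(exists(with_point, "geoloc"), exists(with_point, "geoloc.lat"))
```

Under this model "geoloc" itself never appears as a field name, which matches the observation earlier in the thread that filtering on "geoloc.lat" returns 3 while filtering on "geoloc" returns 0 when no geo_point mapping was defined.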


(Clinton Gormley) #20

For those that might be new or do not know, would you please provide
details about what a "gist" is or point to a resource that details the
process?

I have just added a page to the docs which should help new users to ask
for help:

http://www.elasticsearch.org/help/

ta

clint