An ES newbie here.
I've got documents in the following format. For each user, I want to find up to 2000 other users within 100 km of that user's location, excluding any users listed in that user's blocked_ids.
    {
        "name": "user1",
        "active": true,
        "location": { "lat": 21, "lon": 80 },
        "blocked_ids": [ "19", "83", "486", ... ]
    },
    {
        "name": "user2",
        "active": true,
        "location": { "lat": 22, "lon": 81 },
        "blocked_ids": [ "7", "8", "148", ... ]
    },
    :
So, first off, in my PHP application, I retrieve the users in chunks of 200 with the scroll API. Then I loop through each chunk and run the following query for every user in it:
    return [
        "index" => "users",
        "type" => "doc",
        "body" => [
            "size" => 2000,
            "query" => [
                "bool" => [
                    "must" => [
                        [ "match" => [ "active" => true ] ],
                    ],
                    "must_not" => [
                        "ids" => [
                            "values" => $excludedIds
                        ],
                    ],
                    "filter" => [
                        "geo_distance" => [
                            "distance" => "100km",
                            "distance_type" => "plane",
                            "location" => [
                                "lat" => $lat,
                                "lon" => $lng
                            ]
                        ]
                    ]
                ]
            ]
        ]
    ];
Here, $excludedIds, $lat, and $lng are PHP variables extracted from the chunk. However, the execution time I measured is too long. I observed that the size of the query result (currently 2000) greatly affects the execution time. How can I get faster query results? Any tips or insights would be greatly appreciated.
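For reference, this is roughly how the three variables are pulled out of each hit in a chunk. It is a simplified sketch, not the exact application code: the function name extractQueryInputs and the "_id" value are hypothetical, and it assumes each scroll hit carries the document under "_source" and that $excludedIds is simply the user's blocked_ids.

```php
<?php
// Hypothetical helper: map one scroll hit to the inputs used by the query.
// Assumes the hit has the usual search-response shape, with the document
// stored under "_source" and fields named as in the sample documents.
function extractQueryInputs(array $hit): array
{
    $source = $hit["_source"];

    return [
        // Fall back to an empty list when a user has blocked nobody.
        "excludedIds" => $source["blocked_ids"] ?? [],
        "lat" => $source["location"]["lat"],
        "lng" => $source["location"]["lon"],
    ];
}

// Example hit shaped like the sample documents (the "_id" is made up).
$hit = [
    "_id" => "1",
    "_source" => [
        "name" => "user1",
        "active" => true,
        "location" => ["lat" => 21, "lon" => 80],
        "blocked_ids" => ["19", "83", "486"],
    ],
];

$inputs = extractQueryInputs($hit);
```

The values in $inputs are then substituted for $excludedIds, $lat, and $lng in the query, once per user in the chunk.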
Thanks!
Chuck