Before SCAN I got into the habit of faceting on _id whenever I needed the
_ids from a query. It was much faster than iterating through the query
results (I often have several hundred thousand records in my
results). Using scan to extract the ids still seems a bit slower than
faceting (I imagine because there is more overhead per item
returned). However, in the long run I realize I need to be careful
with faceting because of the potential for excessive memory usage.
I suppose my question is: what is the best workflow for what (in my
case at least) is a common operation, i.e. performing queries on
parents and joining them to their children? My current process is:
- Iterate through a scan query on the parent making sure to set
fields = 
- Extract the id field from the results
- For each _id, perform a term query on the _parent field to
retrieve the proper child.
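
For anyone following along, the id-extraction and per-parent query steps above can be sketched roughly as below. This is only an illustration, not a client implementation: it makes no network calls, `extract_ids` and `child_query` are hypothetical helper names, and `page` is a hand-written stand-in for one page of scan results. The exact value a term query on _parent expects may depend on the version and mapping, so treat that part as an assumption.

```python
def extract_ids(response):
    """Pull the _id of every hit out of one page of search results."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def child_query(parent_id):
    """Build the body of a term query on the _parent field for one parent id.

    Assumption: the raw parent id is an acceptable term value here.
    """
    return {"query": {"term": {"_parent": parent_id}}}


# Hand-written stand-in for one page returned while scanning the parent query.
page = {"hits": {"hits": [{"_id": "p1"}, {"_id": "p2"}]}}

parent_ids = extract_ids(page)
# One query body per parent id, mirroring the "for each _id" step.
queries = [child_query(pid) for pid in parent_ids]
```

Each entry in `queries` would then be sent as its own search request against the child type, which is where the per-item overhead adds up with several hundred thousand parents.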
a) Is this the best approach using currently available methods?
b) Has any more thought been given to having a simple "join" or "has_parent" method?
On Apr 28, 4:37 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
You get the _id for each document back in the search result, or am I missing something else?
Note, you can always enable indexing the _id as well. But let's see if you really need to.
On Thursday, April 28, 2011 at 11:32 PM, merrellb wrote:
I often need to get the _ids matching a query so I can retrieve the
appropriate children (still waiting on that has_parent query). I've
found that faceting _id is faster than iterating/scanning through the
query results and extracting the _id. The new changes to _id seem to
make this difficult. Am I missing an alternative way to do what seems
to be a common task?