Refresh + child documents

Hi Alexander,

The test can be found at [1], but it's not that straightforward to set up, as
the app requires quite a few dependencies.

First some background info:

Our app has lots of content that can be shared with lots of users. Content
can be marked as private to a set of users.
We've modelled that in ES by having top-level content documents which have
a child document per user that has access to the content item.

When we do a general search we add the user id of the current user and run
a has_child filter as well.
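
As a rough illustration of that model (a sketch, not our actual application code), the Python below builds a parent/child mapping and the per-user has_child filter. The type names `resource` and `resource_members` and the `direct_members` field are taken from the query at [2]; the function names are made up for this example, and the `_parent` mapping is my assumption of how the child type would be declared.

```python
def members_mapping():
    """Mapping declaring resource_members as a child type of resource (assumed)."""
    return {"resource_members": {"_parent": {"type": "resource"}}}


def access_filter(user_id):
    """has_child filter matching content that has a child document listing user_id."""
    return {
        "has_child": {
            "type": "resource_members",
            "query": {"terms": {"direct_members": [user_id]}},
        }
    }
```

Calling `access_filter("u:camtest:gJ-kkTf-2W")` yields the same has_child clause that appears in the full query.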

I'll try my best to explain what the test is doing and what happens in the
background.

  1. The first couple of lines create a couple of users
  2. A piece of content gets created by user A
  3. That content item gets shared with user B
  4. The content item gets put in the index
  5. All the users who have access to that piece of content are added as
    child documents
  6. Refresh the search index (consistency = all)
  7. Search for the content item
  8. Assert we get the content item
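
Steps 4 to 7 roughly correspond to the sequence of HTTP calls sketched below. This is only an approximation under stated assumptions: the index name, the child document id scheme and the function name are hypothetical, nothing is actually sent to ES, and the consistency option from step 6 is left out.

```python
from urllib.parse import quote

INDEX = "oae-test"  # hypothetical index name, not the one used by the real test


def build_requests(content_id, member_ids, searcher_id):
    """Return the (method, path, body) calls for steps 4 to 7, in order."""
    cid = quote(content_id, safe="")
    # Step 4: index the parent content document.
    calls = [("PUT", f"/{INDEX}/resource/{cid}", {"resourceType": "content"})]
    # Step 5: index one child document per user, routed to its parent.
    for user_id in member_ids:
        child_id = quote(f"{content_id}#{user_id}", safe="")  # hypothetical id scheme
        calls.append((
            "PUT",
            f"/{INDEX}/resource_members/{child_id}?parent={cid}",
            {"direct_members": [user_id]},
        ))
    # Step 6: refresh the index so the new documents become searchable.
    calls.append(("POST", f"/{INDEX}/_refresh", None))
    # Step 7: search for content the given user can access via a child document.
    search = {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "has_child": {
                        "type": "resource_members",
                        "query": {"terms": {"direct_members": [searcher_id]}},
                    }
                },
            }
        }
    }
    calls.append(("POST", f"/{INDEX}/_search", search))
    return calls
```

The point of the ordering is that the search is only issued after the refresh call has returned, which is exactly where we see the intermittent failure.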

Occasionally (25% of the time maybe) the assertion in step 8 fails.

Is it possible that when our application gets a response from the refresh
request (6), ES hasn't actually fully re-indexed everything?

FWIW, we've set the number of shards to 1 and the number of replicas to 0,
as per the elasticsearch.yml recommendation for dev environments, and that
seems to help somewhat (presumably because there are fewer shards and
replicas to process, thus causing less IO).

The full query can be found at [2]

Kind regards,

Simon

[1] https://github.com/oaeproject/Hilary/blob/master/node_modules/oae-content/tests/test-library-search.js#L247
[2]
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "fields": [
            "q_high^2.0",
            "q_low^0.75"
          ],
          "query": "*"
        }
      },
      "filter": {
        "and": [
          {
            "term": {
              "_type": "resource"
            }
          },
          {
            "term": {
              "resourceType": "content"
            }
          },
          {
            "has_child": {
              "type": "resource_members",
              "query": {
                "terms": {
                  "direct_members": [
                    "u:camtest:gJ-kkTf-2W"
                  ]
                }
              }
            }
          }
        ]
      }
    }
  },
  "from": 1,
  "size": 25,
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "sort": "asc"
    }
  ],
  "min_score": 0.2
}

On Thursday, September 12, 2013 4:28:06 PM UTC+1, Alexander Reelsen wrote:

Hey,

can you share the test maybe?

--Alex

On Thu, Sep 12, 2013 at 4:50 PM, Simon Gaeremynck <gaere...@gmail.com> wrote:

Hi,

In our unit tests for our app, we're seeing a couple of intermittent
search failures when doing the following:

  1. Create / Update a bunch of resources
  2. Trigger an index refresh (with consistency == 'all')
  3. Search
    -> Failure because of missing expected results

The search in step 3 is a query that searches through child documents.
Is it possible that when the refresh from step 2 returns, the child
documents haven't been fully re-indexed yet?

We're only seeing the failure intermittently on a low-spec box under
heavy load.

Kind regards,

Simon

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
