Thanks for this answer, as I've just been sitting here puzzling over
the same thing: I click refresh and the first 25 hits I get back
(size=25) alternate between two different sets of 25.
Based on your advice I've tried using the session ID as the preference,
but I have no idea whether I've done it right. I put it in the query DSL:
{
  "query": {
    "filtered": {
      "query": {"query_string": {"query": "ProductionSchool:british", "default_operator": "OR"}},
      "filter": {"and": [{"exists": {"field": "ObjectNumber"}}, {"term": {"Category": "painting"}}]}
    },
    "preference": "b8164e5587009099581f89faa5bde211"
  },
  "from": 0,
  "size": "25",
  "facets": {
    "Dept": {"terms": {"field": "Dept"}},
    "Category": {"terms": {"field": "Category", "size": "50"}},
    "Name": {"terms": {"field": "Name", "size": "50"}},
    "Material": {"terms": {"field": "Material", "size": "50"}},
    "Technique": {"terms": {"field": "Technique", "size": "50"}},
    "ProductionPlaceName": {"terms": {"field": "ProductionPlaceName", "size": "50"}},
    "RRFlag": {"terms": {"field": "RRFlag"}},
    "Maker": {"terms": {"field": "Maker.asterm", "size": "50"}}
  }
}
I have two ES instances running, and the only config I have done is to
name the group and set the master; ES is looking after everything else.
Thanks for any help,
Shaun
On Mar 28, 11:58 am, Shay Banon kim...@gmail.com wrote:
It might be that some documents have the same sort value. In that case,
when one search hits one set of shards and another search hits a
different set of shards (copies of the same data), you can get different
results (though both are correctly sorted).
You have the option to specify a "preference" when searching (see the
search API documentation; specifically, check the "custom string
value"). This can ensure two searches will use the same shards.
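A minimal sketch of passing a custom preference string with a search, assuming the preference is sent as a URL query-string parameter on the `_search` endpoint rather than inside the JSON request body. The host, index name, and the helper `search_url` are hypothetical and only for illustration; the preference value is the session ID quoted earlier in the thread.

```python
from urllib.parse import urlencode


def search_url(base_url, index, preference):
    """Build a _search URL carrying `preference` as a query-string parameter.

    `base_url` and `index` are placeholders; substitute your own cluster.
    """
    return f"{base_url}/{index}/_search?" + urlencode({"preference": preference})


# Hypothetical host and index; the same string should be reused for every
# request belonging to one user session.
url = search_url("http://localhost:9200", "my_index",
                 "b8164e5587009099581f89faa5bde211")
print(url)
```

The query body itself stays unchanged and is sent as the request payload; only the URL carries the preference.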
On Wed, Mar 28, 2012 at 8:11 AM, Byakuya mukhin.vladi...@googlemail.com wrote:
Hello!
Here is our problem. We have an Elasticsearch cluster with two nodes (20
shards, 2 replicas). There is an index fed from a CouchDB data source
(approx. 2 million documents) via a river. When we run the same search
request, limited to the first 100 results, several times, we receive two
different sets of results, alternating one after the other. Some
elements are dropped and others are added. They are sorted correctly.
The total number of results is the same, but some elements differ. When
we disable one node so that only one is working, the set of search
results we get is stable.
Query example (the query is built from data received from an HTML
form):
curl -XGET 'http://192.168.0.248:9200/tenderinfo_index/_search' -d '{
  "sort": {
    "publishDate.value": "desc"
  },
  "from": 0,
  "fields": [
    "_id",
    "orderName.value",
    ... and other fields we need to retrieve
  ],
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "match_all": {}
            }
          ]
        }
      },
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "query_string": {
                      "query": "разработка сайта",
                      "default_operator": "AND",
                      "default_field": "orderName.value",
                      "analyzer": "russian"
                    }
                  },
                  {
                    "nested": {
                      "path": "lots",
                      "score_mode": "avg",
                      "query": {
                        "query_string": {
                          "query": "разработка сайта",
                          "default_operator": "AND",
                          "default_field": "lots.subject.value",
                          "analyzer": "russian"
                        }
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  },
  "size": 100
}'
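Since the query sorts on `publishDate.value` alone, documents sharing the same publish date have no defined relative order. One possible way to remove that ambiguity, sketched here as an assumption and not something proposed in the thread, is to append a second sort key that is unique per document. The helper `with_tiebreaker` and the field name `tenderId.value` are hypothetical; whether a suitable unique, sortable field exists depends on your mapping.

```python
import copy


def with_tiebreaker(body, unique_field, order="asc"):
    """Return a copy of a search body whose sort ends with a unique field."""
    new_body = copy.deepcopy(body)
    sort = new_body.get("sort", [])
    # The query above passes "sort" as a single object; convert it to the
    # list form so the order of the sort keys is explicit.
    if isinstance(sort, dict):
        sort = [{field: direction} for field, direction in sort.items()]
    elif not isinstance(sort, list):
        sort = [sort]
    sort.append({unique_field: order})
    new_body["sort"] = sort
    return new_body


body = {"sort": {"publishDate.value": "desc"}, "from": 0, "size": 100}
# "tenderId.value" is a hypothetical unique field, used only for illustration.
stable = with_tiebreaker(body, "tenderId.value")
print(stable["sort"])
```

With a total order, every copy of a shard should return the same top hits for the same data, independent of which node answers.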
Our goal is to compute a hash of the search results so that we can
detect new data and act on it. But with the result sets alternating, the
hash differs for the same query.
How can this problem be solved? We appreciate any suggestions.
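For illustration, a hash over the results could be computed roughly as follows. This is only a sketch: it assumes the usual search response layout with hits under `hits.hits`, each carrying an `_id`, and the function name `results_fingerprint` is made up. Sorting the IDs first makes the hash independent of the order in which hits arrive, but it does not help if the two result sets contain different documents at the cut-off.

```python
import hashlib
import json


def results_fingerprint(response):
    """Hash the set of document IDs in a search response, ignoring order."""
    ids = sorted(hit["_id"] for hit in response["hits"]["hits"])
    payload = json.dumps(ids, separators=(",", ":")).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Two made-up responses containing the same documents in a different order.
first = {"hits": {"hits": [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}]}}
second = {"hits": {"hits": [{"_id": "c"}, {"_id": "a"}, {"_id": "b"}]}}
changed = {"hits": {"hits": [{"_id": "a"}, {"_id": "b"}, {"_id": "d"}]}}

print(results_fingerprint(first) == results_fingerprint(second))
print(results_fingerprint(first) == results_fingerprint(changed))
```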