Elasticsearch Search data

anand1 · October 24, 2019, 8:25pm

Hi All,

I am new to Elasticsearch. I created the two indexes below and indexed the same documents into both. The only difference is in the index settings: the first index has five shards and the second has only one. When I search both indexes, the results come back in a different order.

Index1

Five shards and one replica.
Total documents: 250K. Each document has 40 to 50 fields.

Index2

One shard and two replicas.
Total documents: 250K. Each document has 40 to 50 fields.

For example, I ran a match query on the title field in both indexes:

GET index1/_search
{
 "query": {
  "match": {
   "title": "Oracle security"
  }
 }
}

The results come back in the following order, with the exact title match in third position:

Oracle Advanced security

Oracle Database security

Oracle security

I executed the same query on index2. There, the exact title match comes first:

Oracle security

Oracle Advanced security

Oracle Database security

Why am I seeing a different result order between the two indexes? It looks as if the results come back in random order when the shard count is higher.

dadoonet · October 25, 2019, 3:00pm

Yeah. When you have multiple shards and a very small number of documents, the score is biased by how the documents are distributed across the shards.
It's better to use a single shard or, for testing, you can add the search_type=dfs_query_then_fetch parameter.
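
To make the bias concrete, here is a minimal Python sketch (not Elasticsearch code). It assumes the default BM25 similarity and uses made-up per-shard counts for the term "security". With the default search type, each shard scores with its own term statistics, so the same term gets a different IDF weight depending on which shard a document landed in. dfs_query_then_fetch first collects the statistics from all shards, so every shard scores with the same global value.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF component of BM25 as used by Lucene: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers, for illustration only:
# (documents in the shard, documents in the shard that contain "security")
shards = [(50_000, 40), (50_000, 900), (50_000, 120), (50_000, 15), (50_000, 300)]

# Default behaviour: every shard computes IDF from its own local statistics.
local_idfs = [bm25_idf(n, df) for n, df in shards]

# dfs_query_then_fetch behaviour: statistics are summed across shards first.
total_docs = sum(n for n, _ in shards)
total_df = sum(df for _, df in shards)
global_idf = bm25_idf(total_docs, total_df)

for i, idf in enumerate(local_idfs):
    print(f"shard {i}: local idf = {idf:.3f}")
print(f"all shards: global idf = {global_idf:.3f}")
```

With a single shard the local and global statistics are the same thing, which is why index2 does not show the effect.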

If you want the exact match to always be on top, you can combine multiple queries within a bool query as should clauses.
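
As a rough sketch of that idea (not necessarily what the gist does), the Python snippet below builds a request body with two should clauses: a boosted match_phrase and a plain match. The helper name exact_first_query and the boost value 5.0 are assumptions chosen for illustration. Note that match_phrase only checks that the words appear next to each other in order, so it approximates an exact title match rather than guaranteeing one.

```python
import json


def exact_first_query(field: str, text: str, phrase_boost: float = 5.0) -> dict:
    """Build a bool query whose should clauses favour phrase matches.

    Hypothetical helper: the name and the default boost are illustrative only.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Documents containing the words as an adjacent phrase get extra score.
                    {"match_phrase": {field: {"query": text, "boost": phrase_boost}}},
                    # Documents containing any of the words still match, with a normal score.
                    {"match": {field: {"query": text}}},
                ]
            }
        }
    }


body = exact_first_query("title", "Oracle security")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a GET index1/_search request in the Kibana console. A document that matches both clauses gets the sum of both scores, so it ranks above documents that only match the plain match clause.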

I wrote an example here:

https://gist.github.com/dadoonet/5179ee72ecbf08f12f53d4bda1b76bab

search_kibana_console.txt

### REINIT
DELETE user
PUT user
{
  "settings": {
    "number_of_shards": 1
  }, 
  "mappings": {
    "_doc": {
      "properties": {

(This file has been truncated; the full script is in the gist linked above.)

HTH

anand1 · November 11, 2019, 8:58pm

Hi, thank you for the clarification. It's really helpful.
One more question: is there any way to set a limit for each should condition in an Elasticsearch query? My index has millions of documents and I have to show results near the top for each should condition.

GET index/_search
{
	"size": 10,
	"query": {
		"bool": {
			"should": [
				{
					"bool": {
						"should": [
							{
								"match_phrase": {
									"title": {
										"query": "Oracle and science"
									}
								}
							},
							{
								"match": {
									"title": {
										"query": "Oracle and science",
										"fuzziness": 2,
										"operator": "or"
									}
								}
							},
							{
								"match": {
									"tit": {
										"query": "Oracle and science",
										"fuzziness": 2,
										"operator": "or"
									}
								}
							}
						]
					}
				}
			]
		}
	}
}

For example, the above query returns a large number of results, but the top 100 are all based on the first condition (match_phrase), and I see no results that match the "Science" keyword.

It would be really helpful if you could share any suggestions for achieving this.

Thanks in advance.

dadoonet · November 12, 2019, 1:34am

I don't understand.

Could you provide a full recreation script as described in About the Elasticsearch category? It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem.

system · December 10, 2019, 1:34am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
