Elasticsearch Search data

Hi All,

I am new to Elasticsearch. I created the two indices below and indexed the same documents into both. The only difference is the index settings: the first index has five shards and the second has only one. When I run the same search against both indices, the results come back in a different order.

Index1

  • Five shards and one replica.
  • Total documents: 250K. Each document has 40 to 50 fields.

Index2

  • One shard and two replicas.
  • Total documents: 250K. Each document has 40 to 50 fields.

For example, I ran a match query on the title field in both indices:

GET index1/_search
{
 "query": {
  "match": {
   "title": "Oracle security"
  }
 }
}

The results come back in the following order. The exact title match is in third position.

Oracle Advanced security

Oracle Database security

Oracle security

I ran the same query on index2. There, the exact title match comes first.

Oracle security

Oracle Advanced security

Oracle Database security

Why am I seeing a different result order for the two indices? With the higher shard count, the results seem to come back in a random order.

Yeah. By default, each shard computes relevance scores from its own local term statistics, so when you have multiple shards and a very small number of documents, the score is biased by how the documents are distributed across the shards.
It's better to use a single shard or, for testing, you can add the search_type=dfs_query_then_fetch parameter.
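
To see why the distribution matters, here is a rough Python sketch. It assumes the default BM25 similarity, whose IDF term is ln(1 + (N - n + 0.5) / (n + 0.5)), and it uses made-up, deliberately uneven per-shard counts for the term "security". The names and numbers are illustrative only and are not taken from your index.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """BM25 inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics: five shards of 50,000 documents each (250K total).
shard_doc_counts = [50_000] * 5
# Hypothetical number of documents containing "security" on each shard.
security_doc_freqs = [40, 900, 35, 1200, 25]

# What each shard uses when it scores with its own local statistics.
local_idfs = [
    bm25_idf(n_docs, doc_freq)
    for n_docs, doc_freq in zip(shard_doc_counts, security_doc_freqs)
]

# What a single shard (or globally collected statistics) would use.
global_idf = bm25_idf(sum(shard_doc_counts), sum(security_doc_freqs))

for shard, idf in enumerate(local_idfs):
    print(f"shard {shard}: idf(security) = {idf:.3f}")
print(f"global:  idf(security) = {global_idf:.3f}")
```

The same term gets a different weight on every shard, so two equally good documents can end up with different scores depending on where they live. With search_type=dfs_query_then_fetch, the term statistics are collected from all shards before scoring, which corresponds to the single global value in the sketch.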

If you want the exact match to always be on top, you can combine multiple queries within a bool query as should clauses.
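
A minimal sketch of that idea, written as a Python helper that only builds the request body. The function name and the boost value of 5.0 are assumptions for illustration, so tune the boost against your own data.

```python
import json


def exact_first_query(field, text, phrase_boost=5.0):
    """Build a bool query whose should clauses favour phrase matches.

    Both clauses are optional (should); a document matching both gets the
    sum of the two scores, so phrase matches tend to rank higher.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Boosted clause: the terms must appear as a phrase.
                    {"match_phrase": {field: {"query": text, "boost": phrase_boost}}},
                    # Fallback clause: any of the terms may match.
                    {"match": {field: {"query": text}}},
                ]
            }
        }
    }


body = exact_first_query("title", "Oracle security")
# Paste the printed JSON as the body of GET index1/_search.
print(json.dumps(body, indent=2))
```

Note that match_phrase also matches longer titles that contain the phrase. If only a title that is exactly "Oracle security" should win, an additional boosted clause on a keyword sub-field is one option, assuming your mapping has one.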

I wrote an example here:

HTH

Hi, thank you for the clarification. It's really helpful.
One more question: is there any way to set a limit for each should condition in an Elasticsearch query? My index has millions of documents, and I need to show the top results for each should condition.

GET index/_search
{
	"size": 10,
	"query": {
		"bool": {
			"should": [
				{
					"bool": {
						"should": [
							{
								"match_phrase": {
									"title": {
										"query": "Oracle and science"
									}
								}
							},
							{
								"match": {
									"title": {
										"query": "Oracle and science",
										"fuzziness": 2,
										"operator": "or"
									}
								}
							},
							{
								"match": {
									"tit": {
										"query": "Oracle and science",
										"fuzziness": 2,
										"operator": "or"
									}
								}
							}
						]
					}
				}
			]
		}
	}
}

For example, the above query returns a large number of results, and the top 100 results are all based on the first condition only (match_phrase). No results appear that match the "Science" keyword.

It would be really helpful if you could share any suggestions for achieving this.

Thanks in advance.

I don't understand.

Could you provide a full recreation script as described in About the Elasticsearch category? It will help to better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem.
