Elasticsearch 'search' not returning all data


(R_C) #1

Hi,

I have inserted about 1000 documents using this script:
i=1
while [ $i -le 998 ]
do
  curl "localhost:9200/test_index1/test_type1/$i" -d'{"name":"Hi , This is XYZ from location 10.0.4.135"}'
  i=$(expr $i + 1)
done
Before this, I had created the index with 1 shard and 0 replicas:
curl -X PUT "localhost:9200/test_index1" -H 'Content-Type: application/json' -d'
{
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 0
}
}
}
'

Now I am performing a simple search using:
curl -X GET "localhost:9200/test_index1/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"query_string" : {
"default_field" : "name",
"query" : "10.0.4.135"
}
}
}
'

It does not show all the documents containing the string 10.0.4.135, only a few:
{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":549,"max_score":0.59587663,"hits":[
{"_index":"test_index1","_type":"test_type1","_id":"521","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"522","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"523","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"524","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"525","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"526","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"527","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"528","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"529","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}},
{"_index":"test_index1","_type":"test_type1","_id":"530","_score":0.59587663,"_source":{"name":"Hi , This is XYZ from location 10.0.4.135"}}
]}}
Wed Jun 6 12:42:34 IST 2018

Can anyone please help me understand this behavior?

Thanks :slight_smile:


#2

Does the loop work properly? Have you tried decreasing the number of iterations, say, indexing just 10 documents?


(R_C) #3

Hi, this works for 10 iterations, i.e. it fetches all 10 documents. Can you please explain its behavior for 1000 documents?

I have also tried with 100 iterations, and it returns only 10 matches.

Thanks :slight_smile:


#4

All right, then your loop is working fine.
It would be good to see the responses returned by the curl requests.
I don't know exactly what's wrong, so you need to gradually increase the number of iterations and check the behavior.
Maybe add a sleep to pause after each curl request?
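
One way to look at those responses, as a rough sketch only: print the HTTP status code of each indexing request and count the ones that are not 2xx. The function names `index_doc` and `count_failures` are made up for illustration, and the `Content-Type` header is an assumption that is not in the original script (Elasticsearch 6.x and later require it).

```shell
# Index one document with the given id and print only the HTTP status code.
# (Hypothetical helper; index, type and body are taken from the original post.)
index_doc() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'Content-Type: application/json' \
    "localhost:9200/test_index1/test_type1/$1" \
    -d '{"name":"Hi , This is XYZ from location 10.0.4.135"}'
}

# Read status codes from stdin (one per line) and print how many are not 2xx.
count_failures() {
  failures=0
  while read -r code; do
    case "$code" in
      2??) ;;                              # success, nothing to count
      *) failures=$((failures + 1)) ;;     # anything else is treated as a failure
    esac
  done
  echo "$failures"
}

# Example usage (needs a running node on localhost:9200):
#   i=1
#   while [ "$i" -le 10 ]; do index_doc "$i"; i=$((i + 1)); done | count_failures
```

If the count is 0, every request was accepted, and the problem is more likely on the search side than in the loop.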


(R_C) #5

Hi,

[root@node1 ~]# curl -X GET "localhost:9200/_cat/indices?v"

health status index       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test_index1 gKOIJDmhSxuQGEqz3MGLcg   1   0        100            0     26.9kb         26.9kb

This shows docs.count = 100, but the search returns only the first 10 matching documents.
I have checked this behavior with docs.count = 10, 20, 100, 500, and 1000.

Any leads would be highly appreciated.

Thanks


#6

Sorry, are you aware that by default ES returns only the first 10 results? The hits.total field still gives you the total number of matches.

Use the size parameter to fetch more documents.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html#search-request-from-size
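
A minimal sketch of that suggestion, assuming the same index and field as in the original post. The helper names `build_search_body` and `search_with_size` are hypothetical, and the query text is interpolated without JSON escaping, so this only suits simple inputs like the one here.

```shell
# Print a query_string search body that asks for up to $1 hits
# instead of the default 10. $2 is the query text (not JSON-escaped).
build_search_body() {
  printf '{"size": %s, "query": {"query_string": {"default_field": "name", "query": "%s"}}}\n' "$1" "$2"
}

# Send the search to the index from the original post.
# (Assumes a node listening on localhost:9200.)
search_with_size() {
  curl -s -X GET "localhost:9200/test_index1/_search" \
    -H 'Content-Type: application/json' \
    -d "$(build_search_body "$1" "$2")"
}

# Example usage:
#   search_with_size 1000 "10.0.4.135"
```

Note that from + size is capped by the index.max_result_window setting (10000 by default), so this works for the 1000 documents here but is not a way to page through arbitrarily large result sets.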


(R_C) #7

Thanks :slight_smile: that was a big help.

