Searched for a single item => ES returns nearly the whole index


#1

I'm using ES 6.1.1.

I want to select a single item by its message.

My query in Kibana is:

GET MY_INDEX/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "message": "|cir=C3|date=DATE|acronym=ACRONYM|user=USER|type=2|transactionHostDepartment=01|membIdentificNumb=MEMBERIDENTIFICNUMB|patronId=PATRONID|note=SuperPartner|patronCategory=004|gender=2|district=009|lastVisitDate=LASTVISITDATE|libraryCode=LIBRARYCODE|libraryDept=01|firstsignUpDate=FIRSTSINGUPDATE|visitValid=0|visitTypeValid=0|"
              }
            }
          ]
        }
      }
    }
  }
}

I've replaced all non-public parameters with uppercase placeholders.

My index contains 810 391 111 items. This query should return only one item, because each message is unique, but it returns 810 391 109 items.

Could anyone help me further?

Thanks,
Mojster


(David Pilato) #2

It depends on the analyzer you are using.
I guess you are using the default one. * at the end is retrieving all docs
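
To see what the analyzer does with such a message, the _analyze API can be run against the field. This is only a sketch: MY_INDEX is the placeholder from the question, and the text is a shortened sample of the message, not a real value.

```console
# Assumption: MY_INDEX is the placeholder index name from the question.
# The text below is a shortened sample of the message.
GET MY_INDEX/_analyze
{
  "field": "message",
  "text": "|cir=C3|date=DATE|type=2|transactionHostDepartment=01|"
}
```

With the default standard analyzer on a text field, this should return separate lowercase tokens such as cir, c3, date, type and 2 instead of one token for the whole string, because "|" and "=" are dropped. A match query combines its terms with OR by default, so any document that contains at least one of those terms is a hit, which would explain why nearly the whole index comes back.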


#3

I did not set any analyzers, so it should be the default one.
The "|" characters are part of my input message, so I need to search by them.

How can I tell ES to search for this string as one whole item?

edit:
I'm reading the docs, and I've stumbled upon the keyword analyzer. Can I set it for this one particular field only?

edit2:
So if I'm right, there is nothing to be done without reindexing.
So I should fix my mapping and rerun all data again, so that next time, I could search over this?
Could you give me some directions on how to fix my mapping for the message field only?

Thanks,
Mojster


(David Pilato) #4

Yes, that's it. You need to change the mapping of your field (for example to keyword) and then reindex.
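
As a possible stopgap until the reindex is done, a match_phrase query on the existing text field requires all analyzed terms to appear in the same order, which should narrow the result far more than match does. This is a sketch, not a tested recommendation: MY_INDEX is the placeholder from the question, and the phrase is a shortened sample, so in practice the full message would be passed.

```console
# Assumption: MY_INDEX is the placeholder index name from the question.
# The phrase is a shortened sample; pass the full message in practice.
GET MY_INDEX/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "match_phrase": {
          "message": "|cir=C3|date=DATE|acronym=ACRONYM|user=USER|type=2|"
        }
      }
    }
  }
}
```

The phrase is still analyzed, so punctuation and case are ignored, and a shorter phrase also matches longer messages that contain it. It is therefore not a true exact match the way a keyword field is.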

IMHO though, it would be better to split your message field into subfields and structure your documents more. My 2 cents.


#5

My mapping is here:

PUT transakcije
{
    "settings" : {
        "index" : {
            "number_of_shards" : 10,
            "number_of_replicas" : 1
        }
    },
	"mappings": {
		"log_transakcije": {
			"properties": {
				"@timestamp": {
					"type": "date"
				},
				"@version": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				},
				"acronym": {
					"type": "keyword",
					"eager_global_ordinals": true
				},
				"beat": {
					"properties": {
						"hostname": {
							"type": "keyword"
						},
						"name": {
							"type": "keyword"
						},
						"version": {
							"type": "keyword"
						}
					}
				},
				"bibl001c": {
					"type": "keyword"
				},
				"biblLanguage101a": {
					"type": "keyword"
				},
				"biblTargetAudienceCode100e": {
					"type": "keyword"
				},
				"biblType001b": {
					"type": "keyword"
				},
				"biblUDK675s": {
					"type": "text"
				},
				"busStopId": {
					"type": "keyword"
				},
				"cir": {
					"type": "keyword"
				},
				"cobissId": {
					"type": "keyword",
					"eager_global_ordinals": true
				},
				"country": {
					"type": "keyword"
				},
				"date": {
					"type": "date"
				},
				"district": {
					"type": "keyword"
				},
				"firstsignUpDate": {
					"type": "date"
				},
				"gender": {
					"type": "byte"
				},
				"holdStatus": {
					"type": "keyword"
				},
				"host": {
					"type": "keyword"
				},
				"input_type": {
					"type": "keyword"
				},
				"inventoryNo": {
					"type": "keyword"
				},
				"lastProlongDate": {
					"type": "date"
				},
				"lastVisitDate": {
					"type": "date"
				},
				"libraryCode": {
					"type": "keyword"
				},
				"libraryDept": {
					"type": "keyword"
				},
				"loanDate": {
					"type": "date"
				},
				"materialType": {
					"type": "keyword"
				},
				"membIdentificNumb": {
					"type": "keyword"
				},
				"message": {
					"type": "text"
				},
				"note": {
					"type": "text"
				},
				"offset": {
					"type": "long"
				},
				"parentDepartment": {
					"type": "keyword"
				},
				"patronCategory": {
					"type": "keyword"
				},
				"patronEducation": {
					"type": "short"
				},
				"patronId": {
					"type": "keyword"
				},
				"rPSPriority": {
					"type": "byte"
				},
				"rPSStatus": {
					"type": "byte"
				},
				"reservationDate": {
					"type": "date"
				},
				"returnDate": {
					"type": "date"
				},
				"rptPackageStatus": {
					"type": "long"
				},
				"schoolDept": {
					"type": "keyword"
				},
				"schoolName": {
					"type": "keyword"
				},
				"schoolProgram": {
					"type": "keyword"
				},
				"schoolType": {
					"type": "byte"
				},
				"source": {
					"type": "text"
				},
				"tags": {
					"type": "keyword"
				},
				"transactionHostDepartment": {
					"type": "keyword"
				},
				"type": {
					"type": "long"
				},
				"user": {
					"type": "text"
				},
				"visitTypeValid": {
					"type": "byte"
				},
				"visitValid": {
					"type": "byte"
				}
			}
		}
	}
}

Thanks for the advice, but I'm already splitting the message into those fields and storing them separately.
I'm just keeping the incoming message as well, because in some cases I need the whole thing: a post-processing step was written earlier and depends on it.

So to fix my issue, I just need to change the field type from text to keyword.
Got it, thanks.
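
The change could be sketched like this. The target index name transakcije_v2 is hypothetical, and only the message field is shown. The rest of the mapping above would need to be copied into the new index as well, otherwise the other fields get mapped dynamically. The mapping type is kept as log_transakcije because a 6.x index can hold only one type and the reindexed documents keep their type.

```console
# Assumption: transakcije_v2 is a hypothetical name for the new index.
# Only the changed field is shown; copy the remaining properties from
# the existing mapping.
PUT transakcije_v2
{
  "mappings": {
    "log_transakcije": {
      "properties": {
        "message": {
          "type": "keyword"
        }
      }
    }
  }
}

# Copy all documents from the old index into the new one.
POST _reindex
{
  "source": {
    "index": "transakcije"
  },
  "dest": {
    "index": "transakcije_v2"
  }
}
```

With this many documents the reindex will run for a long time, so it may be worth starting it with wait_for_completion=false and following its progress through the tasks API.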


(David Pilato) #6

Yeah. So the string won't be analyzed.
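
Once message is a keyword field, the whole string is indexed as a single term, so an exact lookup can use a term query. A sketch, reusing the MY_INDEX placeholder and the anonymized message from the original question:

```console
# Assumption: MY_INDEX stands for the index rebuilt with message
# mapped as keyword.
GET MY_INDEX/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "message": "|cir=C3|date=DATE|acronym=ACRONYM|user=USER|type=2|transactionHostDepartment=01|membIdentificNumb=MEMBERIDENTIFICNUMB|patronId=PATRONID|note=SuperPartner|patronCategory=004|gender=2|district=009|lastVisitDate=LASTVISITDATE|libraryCode=LIBRARYCODE|libraryDept=01|firstsignUpDate=FIRSTSINGUPDATE|visitValid=0|visitTypeValid=0|"
        }
      }
    }
  }
}
```

The term query does not analyze its input, so the value has to match the stored message character for character, including case and the "|" separators.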


#7

Thanks.


#8

I've changed those two fields from text to keyword.
Now, after building the index again, its size has more than doubled: from 450 GB to 1010 GB.

Both indexes were built on ES 6.x.
The new one was built on 6.1.1; the old one was built on some earlier version, probably 6.0.1 or similar.
I've got another index built on the same version as the first one. Can I find this version somewhere? And does it matter?
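
One way to look into both questions, as far as I can tell: the index settings include the version the index was created with (index.version.created, stored as an internal version id), and the _cat APIs show document counts and store sizes. This is a sketch using the index name from the mapping above.

```console
# The response contains index.version.created for the index.
GET transakcije/_settings

# Document counts and on-disk size (primaries only vs. including replicas).
GET _cat/indices/transakcije?v&h=index,docs.count,docs.deleted,pri.store.size,store.size

# Per-segment view, useful to see whether merges are still pending.
GET _cat/segments/transakcije?v
```

When comparing the two sizes, check that both figures are of the same kind (primaries only, or including replicas). A freshly built index can also shrink once its segments have been merged. Beyond that, keyword fields have doc values enabled by default and text fields do not, so storing hundreds of millions of unique long strings as keyword may account for part of the growth, but that is a guess without measurements.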

