How to group related logs in Elasticsearch based on a sentence?

I want to group similar logs based on a sentence. In other words, I want to group logs into different buckets based on certain similarities. Is there a built-in feature, or the closest thing to one, that achieves this?


What is your definition of 'certain similarities'? It sounds pretty broad, but might actually be very specific to your concrete implementation. As it does not make a lot of sense to create a bucket for each sentence (you might have a lot of those), could you create a grouping field based on the sentence before you index your data, as a preprocessing step? I could also imagine an ingest node processor doing that.
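A minimal sketch of such a preprocessing step, in case it helps. Everything here is an assumption for illustration: the masking rules, the `message` field, and the `log_group` field name are hypothetical, and you would tune them to what 'similar' means for your logs.

```python
import hashlib
import re

# Hypothetical masking rules: replace the variable parts of a log sentence
# so that similar sentences collapse into the same template.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),  # IPv4-looking values
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),           # hex literals
    (re.compile(r"\d+"), "<num>"),                          # any remaining digits
]


def log_template(message: str) -> str:
    """Lowercase the sentence, mask variable tokens, collapse whitespace."""
    template = message.lower().strip()
    for pattern, placeholder in _MASKS:
        template = pattern.sub(placeholder, template)
    return re.sub(r"\s+", " ", template)


def group_key(message: str) -> str:
    """Short, stable key derived from the template (first 12 hex chars of SHA-1)."""
    return hashlib.sha1(log_template(message).encode("utf-8")).hexdigest()[:12]


def with_group_field(doc: dict) -> dict:
    """Return a copy of the document with an added 'log_group' field (assumed name)."""
    enriched = dict(doc)
    enriched["log_group"] = group_key(doc["message"])
    return enriched


if __name__ == "__main__":
    print(with_group_field({"message": "Connection from 10.0.0.1 failed after 3 retries"}))
```

If the added field is indexed as a keyword, a terms aggregation on it would then give one bucket per template rather than one per sentence.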

Hope this helps.


Thanks, Alex, for your quick response. I am quite new to log analytics and Elasticsearch.

Yeah, I was being pretty broad :slight_smile:. Let me rephrase a bit.

When I search for a specific string, say "Dog", I get the logs containing the string "Dog".

However, I would like Elasticsearch to return those results even if I search for "Animal" or "Doog" (a spelling mistake).

Can I achieve this? If yes, kindly provide some pointers.

Appreciate your support!


The first case (searching for "Animal" and finding "Dog") can be achieved with synonyms. The second (the "Doog" typo) is a fuzzy search, which allows for typos.

See the chapter on synonyms and the chapter on typos and misspellings in the Definitive Guide.
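As a rough sketch of what both look like as request bodies, built here as plain Python dicts: the filter name `log_synonyms`, the analyzer name `log_text`, and the `message` field are assumptions for illustration, and the synonym rule is only an example.

```python
import json


def synonym_index_settings(synonym_rules):
    """Index settings with a synonym token filter and an analyzer that uses it.

    'log_synonyms' and 'log_text' are made-up names for this example.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "log_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    "log_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "log_synonyms"],
                    }
                },
            }
        }
    }


def fuzzy_match_query(field, text):
    """Match query that tolerates typos by setting fuzziness to AUTO."""
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}


if __name__ == "__main__":
    # Treat "animal" and "dog" as equivalent terms (example rule only).
    print(json.dumps(synonym_index_settings(["animal, dog"]), indent=2))
    # "Doog" is one edit away from "dog", which AUTO allows for a 4-letter term.
    print(json.dumps(fuzzy_match_query("message", "Doog"), indent=2))
```

The analyzer still has to be assigned to the text field in the mapping; that part is left out here because the mapping syntax differs between Elasticsearch versions.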


Fantastic, will check this out. Thanks! :smile:

Dear Shrikant,

You can also look into clustering. Your requirement is to build keyword clusters from the log description.
This can be further enhanced with supervised and unsupervised learning techniques.

You can look at integrating Carrot2 or Mahout with Elasticsearch.
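To give a feel for what unsupervised clustering of descriptions means, here is a toy sketch. It is not how Carrot2 or Mahout work internally; it is a simple greedy grouping by word overlap (Jaccard similarity), and the function names and the 0.5 threshold are assumptions made up for the example.

```python
import re


def tokens(text):
    """Lowercased alphabetic words of a description, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_cluster(messages, threshold=0.5):
    """Put each message into the first cluster whose first member is similar enough.

    Each cluster is represented by the tokens of its first message; a message
    that matches no existing cluster starts a new one.
    """
    clusters = []  # list of (representative_tokens, member_messages)
    for message in messages:
        toks = tokens(message)
        for representative, members in clusters:
            if jaccard(toks, representative) >= threshold:
                members.append(message)
                break
        else:
            clusters.append((toks, [message]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    logs = [
        "Disk full on node 1",
        "User login failed",
        "Disk full on node 2",
        "User login failed for admin",
    ]
    for group in greedy_cluster(logs):
        print(group)
```

A real clustering library would also produce labels for the clusters and handle much larger volumes, which this sketch does not attempt.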


Oh okay, makes sense, will check this too. Thank you :slight_smile: