I index my emails in Kibana, and I would like to know how many times the word "timeout" appears in the "Emailbody" field.
{
"From":"string",
"To" : "string",
"Date": "date",
"Emailbody": "string"
}
This is more of an Elasticsearch question. Assuming the field is named emailbody, is it configured to be analyzed in ES? If so, you can use the Lucene query syntax to find all documents that contain that word in the field: emailbody:timeout. Putting that in the Kibana query bar filters the documents returned, and you can then get the count of matching documents.
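If you want the same count outside Kibana, a rough sketch is to send the query to Elasticsearch's _count endpoint. The Python below only builds the request body. The helper name build_count_body and the index name "emails" are assumptions for illustration, and emailbody is the field name assumed above.

```python
import json


def build_count_body(field, term):
    """Build a request body for the _count API using a Lucene query_string query."""
    return {"query": {"query_string": {"query": f"{field}:{term}"}}}


body = build_count_body("emailbody", "timeout")
# POST this JSON to /emails/_count ("emails" is a hypothetical index name).
print(json.dumps(body))
```

Note that this counts matching documents, not occurrences of the word inside each document.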
Hello Joe,
It works perfectly, and I get the count of the documents.
But I would like to know the number of times the word "timeout" is repeated
in the emailbody field of each document.
Thank you in advance for your help.
Oh, I misunderstood what you were trying to see, sorry about that.
As you can see, Elasticsearch will give you the documents that contain the term pretty easily, but I don't know if there's a way to get the count of times a word was used in a field. Looking at the API, I don't see anything that looks like it'll do that, at least not directly.
I think you can do this with a scripted field, though. I haven't tried it, but the idea would be to keep the term filter you are using now and add a scripted field that splits the field value on spaces, somehow filters the pieces down to the word you want, and then counts the remaining items. It would probably be crazy slow, but it should work.
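To make the split, filter, and count idea concrete, here is a minimal sketch of the logic in plain Python. This is not a scripted field itself: a real one would have to be written in a scripting language your Elasticsearch version supports. The function name count_term is hypothetical, and it splits on non-word characters rather than only on spaces so that "timeout." and "Timeout," are still counted.

```python
import re


def count_term(text, term):
    """Count case-insensitive occurrences of a whole word in a block of text."""
    # Split the text into lowercase word tokens, dropping punctuation.
    tokens = re.findall(r"\w+", text.lower())
    # Keep only the tokens equal to the wanted word and count them.
    return sum(1 for token in tokens if token == term.lower())


sample = "Request timeout. Retrying after Timeout, then a second timeout"
print(count_term(sample, "timeout"))
```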
The better way to do this would be to enrich the data when you index it, effectively doing the term count just the one time, and indexing the result along with everything else. This only works if you know ahead of time what term or terms you care about, but it sounds like maybe you do in this case. For existing documents, you can pretty easily reindex them.
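A minimal sketch of that enrichment step, assuming the documents pass through Python code before they are indexed. The helper enrich and the added field name term_counts are made up for illustration; Emailbody is the field from the original question.

```python
import re
from collections import Counter


def enrich(doc, terms):
    """Return a copy of doc with per-term occurrence counts for Emailbody."""
    tokens = re.findall(r"\w+", doc.get("Emailbody", "").lower())
    counts = Counter(tokens)
    enriched = dict(doc)
    # "term_counts" is a hypothetical field name; pick whatever suits your mapping.
    enriched["term_counts"] = {term: counts[term.lower()] for term in terms}
    return enriched


email = {"From": "a@example.com", "Emailbody": "Timeout on host A. Retry hit another timeout."}
print(enrich(email, ["timeout"]))
```

Since the count is stored as an ordinary field, Kibana can then display it per document without running any script at query time.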