How to extract found terms into keyword fields

David9 · March 29, 2019, 8:02pm

We're relatively new to Elastic and need to better understand how often particular hashtags and Twitter handles occur in paragraphs stored in full-text fields. For example, I'd like a way to locate all the hashtags that appear in a document and place them in a new array in that document. I think that would enable, or at least simplify, processes such as aggregations on hashtags: I could easily generate a report showing the most popular hashtags, each with a count. Is this possible? Is it the best way to solve the problem? Thanks

John_Guzman · April 25, 2019, 4:09pm

In my opinion, the best way to handle your problem is to collect all the data in one index and process it into a new one with Logstash.

This may help.
https://www.elastic.co/es/elasticon/2015/sf/building-entity-centric-indexes

Mark_Harwood · April 25, 2019, 4:18pm

This sounds like an entity extraction problem and, fortunately, Twitter handles and hashtags are easily identified using a simple regular expression.
I tend to use Python code to prepare docs, but that is a personal choice; Logstash and ingest pipelines are other document-enrichment tools. This question explores the same problem.
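
As a rough illustration of the regular-expression approach, here is a minimal Python sketch that prepares a doc before indexing. The patterns, the helper names (`enrich`, `unique_in_order`) and the field names (`body`, `hashtags`, `handles`) are assumptions made up for this example, not anything prescribed by Elasticsearch. The patterns are deliberately simple and will not match Twitter's exact rules for valid tags and handles.

```python
import re

# Assumed patterns: "#" or "@" followed by word characters, and not
# preceded by a word character (so "a@b.com" is not treated as a handle).
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
HANDLE_RE = re.compile(r"(?<!\w)@(\w+)")


def unique_in_order(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def enrich(doc, text_field="body"):
    """Return a copy of doc with extracted tags and handles added as arrays.

    The field names are hypothetical; values are lowercased so that
    "#Elastic" and "#elastic" are counted as the same term.
    """
    text = doc.get(text_field, "")
    enriched = dict(doc)
    enriched["hashtags"] = unique_in_order(
        tag.lower() for tag in HASHTAG_RE.findall(text)
    )
    enriched["handles"] = unique_in_order(
        handle.lower() for handle in HANDLE_RE.findall(text)
    )
    return enriched
```

Calling `enrich({"body": "Thanks @elastic for #Kibana"})` returns the original doc plus two new array fields, which can then be sent to the index with whatever client you already use.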
Either way, you should be fine with a plain doc that keeps your original text field and adds a structured field of type keyword holding an array of the extracted handles or tags. If you want to remember where in the text these handles were extracted from, it might be an idea to use an annotated_text field instead.
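
To make the doc layout and the "most popular hashtags" report concrete, here is a hedged sketch of the two request bodies involved, written as Python dicts. The field names (`body`, `hashtags`, `handles`) and the aggregation name `top_hashtags` are hypothetical, and the mapping assumes the typeless format used from Elasticsearch 7.x onward; older versions nest `properties` under a type name. The small `top_hashtags_locally` helper is only there to show what a terms aggregation counts (docs per term), without needing a cluster.

```python
from collections import Counter

# Assumed index body: original text stays full-text searchable, while the
# extracted values go into keyword fields (an array needs no special mapping).
INDEX_BODY = {
    "mappings": {
        "properties": {
            "body": {"type": "text"},
            "hashtags": {"type": "keyword"},
            "handles": {"type": "keyword"},
        }
    }
}

# Terms aggregation on the keyword field: the ten most common hashtags,
# each with a doc count. "size": 0 suppresses the search hits themselves.
TOP_HASHTAGS_QUERY = {
    "size": 0,
    "aggs": {
        "top_hashtags": {"terms": {"field": "hashtags", "size": 10}}
    },
}


def top_hashtags_locally(docs, size=10):
    """Local stand-in for the aggregation: number of docs containing each tag."""
    counts = Counter()
    for doc in docs:
        # set() so a tag repeated inside one doc is counted once for that doc
        counts.update(set(doc.get("hashtags", [])))
    return counts.most_common(size)
```

`INDEX_BODY` would be sent when creating the index and `TOP_HASHTAGS_QUERY` as the body of a search request; the buckets returned under `top_hashtags` are the report described in the original question.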

