Finding count by distinct text row using Lens

Chetan_Madaan · February 15, 2021, 7:09am

Hello,

I am very new to this; up until 12 hours ago I hadn't even heard of Elasticsearch or Kibana.

I do, however, have basic knowledge of SQL and other data visualization tools (including Data Studio and others).

I am using this with Twint (GitHub - twintproject/twint: an advanced Twitter scraping and OSINT tool written in Python that doesn't use Twitter's API) and am trying to analyze some data.

I want to find similar tweets based on text (and record the count of occurrences), but it looks like I can't use text-based fields in Lens.

I can use text-based fields in Discover, but it doesn't aggregate them to give me a count.

Am I looking at this incorrectly?

Please assist if you can.

flash1293 · February 15, 2021, 9:34am

I'm not sure I'm fully understanding what you want to do.

"I want to find similar tweets based on text"

What's "similar" here? Do you mean exactly the same text? If that's the case you can use the "Top values" function for the tweet field together with the "Count" function in Lens.
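
To make "exactly the same text" concrete, here is a minimal local sketch in Python of what counting top values by exact match amounts to. It does not talk to Elasticsearch; the function name `top_values` and the sample tweets are made up for illustration.

```python
from collections import Counter


def top_values(texts, size=5):
    """Return the `size` most frequent exact strings with their counts."""
    return Counter(texts).most_common(size)


# Hypothetical sample data, not taken from the thread.
tweets = [
    "hello world",
    "hello world",
    "Hello world",  # differs by case, so it counts as a different value
    "good morning",
]

result = top_values(tweets, size=2)
print(result)
```

Note that exact matching treats any difference in case, whitespace, or punctuation as a separate value.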

Chetan_Madaan · February 16, 2021, 8:02pm

Hi,

Yes, "exact text match". But the problem is that the tweet field is not visible in Lens.

It is, however, visible in Discover. Do I need to create some sort of index or something?

Chetan_Madaan · February 16, 2021, 8:05pm

Adding more to it.

There are 37 fields available in Discover, but only 19 in Lens (see screenshots below).

Thanks,
Chetan

flash1293 · February 17, 2021, 8:10am

OK, so the difference between Discover and Lens is that Discover works on individual documents while Lens works with aggregated data only.

Depending on how your index is configured, not all fields are aggregatable. For Discover this doesn't matter - it can still show all of them. But in Lens, the list of fields is pre-filtered.

If you go to Management > Index patterns you can see this:

agent.keyword is usable in Lens and will show up in the list, while agent is not. In Discover, you can use both fields.
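
As a rough illustration of that pre-filtering, the sketch below keeps only the fields flagged as aggregatable. It assumes input shaped like the response of Elasticsearch's field capabilities API (`GET <index>/_field_caps?fields=*`); the `sample` dictionary is hand-written, not real output, and the helper name `aggregatable_fields` is hypothetical. How Kibana builds the Lens field list internally may differ.

```python
def aggregatable_fields(field_caps):
    """List field names whose every reported type is aggregatable."""
    fields = field_caps.get("fields", {})
    return sorted(
        name
        for name, by_type in fields.items()
        if by_type and all(caps.get("aggregatable", False) for caps in by_type.values())
    )


# Hand-written sample in the shape of a _field_caps response (assumption).
sample = {
    "fields": {
        "agent": {
            "text": {"type": "text", "searchable": True, "aggregatable": False}
        },
        "agent.keyword": {
            "keyword": {"type": "keyword", "searchable": True, "aggregatable": True}
        },
    }
}

lens_fields = aggregatable_fields(sample)
print(lens_fields)
```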

If the tweet message field shows up in Discover but not in Lens it means it's not aggregatable. To fix this, you have to change your mapping and make sure it's indexed as "keyword" (not only text). Then you need to re-index your existing data so the aggregatable index is built within Elasticsearch.

A common mapping to use (like in the screenshot above) is to have the original field indexed as "text" for full text search with a second field suffixed with ".keyword" for the keyword indexed version of the same data for aggregations. This gives you the best of both worlds:

        "agent" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
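
Once a keyword sub-field exists, "Top values" plus "Count" corresponds roughly to a terms aggregation on it. The following is a sketch, not what Lens literally sends: it builds a search request body for `agent.keyword` and reads the buckets from a hand-written sample response. The aggregation name `top_values` and the sample bucket values are assumptions for illustration.

```python
import json


def top_values_request(field, size=5):
    """Build a search body that returns only the top terms of `field`."""
    return {
        "size": 0,  # skip the individual hits, return aggregations only
        "aggs": {"top_values": {"terms": {"field": field, "size": size}}},
    }


def read_buckets(response):
    """Turn the terms buckets into (value, count) pairs."""
    buckets = response["aggregations"]["top_values"]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]


request_body = top_values_request("agent.keyword", size=3)
print(json.dumps(request_body, indent=2))

# Hand-written sample response fragment (assumption, not real data).
sample_response = {
    "aggregations": {
        "top_values": {
            "buckets": [
                {"key": "example agent A", "doc_count": 12},
                {"key": "example agent B", "doc_count": 7},
            ]
        }
    }
}
rows = read_buckets(sample_response)
print(rows)
```

The printed request body can be pasted into Dev Tools after a `GET <your-index>/_search` line.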

Chetan_Madaan · February 17, 2021, 2:41pm

Thanks @flash1293, this helps me understand the problem. Yes, the field is not aggregatable in the index.

I am trying to do the above based on the code you provided but that doesn't seem to be working. I am positive I am missing something here.

flash1293 · February 17, 2021, 2:53pm

This tries to set the mapping for the tweet type, not the tweet field. I think the best approach is to create a completely new index (maybe twinttweets_fixed) with a mapping for all fields, including the keyword version of the tweet field, and then use the Reindex API to copy the data from twinttweets to twinttweets_fixed.
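
A minimal sketch of those two steps, written as Python that only prints the requests so they can be pasted into Dev Tools. It assumes Elasticsearch 7 or later (typeless mappings) and that the field is named `tweet`. Only that one field is mapped here; the other fields would need to be added to the mapping, or they will be mapped dynamically. The `ignore_above` value of 1024 is my assumption: values longer than `ignore_above` are skipped in the keyword sub-field, and tweets can be longer than 256 characters.

```python
import json

SOURCE_INDEX = "twinttweets"
DEST_INDEX = "twinttweets_fixed"


def create_index_body(ignore_above=1024):
    """Mapping with `tweet` as text plus an aggregatable keyword sub-field."""
    return {
        "mappings": {
            "properties": {
                "tweet": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            # assumption: raised from 256 so long tweets are kept
                            "ignore_above": ignore_above,
                        }
                    },
                }
            }
        }
    }


def reindex_body(source, dest):
    """Body for POST /_reindex copying all documents from source to dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


requests_to_send = [
    ("PUT", f"/{DEST_INDEX}", create_index_body()),
    ("POST", "/_reindex", reindex_body(SOURCE_INDEX, DEST_INDEX)),
]

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```

After the reindex finishes, an index pattern for twinttweets_fixed should list `tweet.keyword` as aggregatable, and it should then show up in Lens.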

Chetan_Madaan · February 17, 2021, 3:09pm

That's a lot of learning for me right there, but I'll go about doing that. Thank you @flash1293

system · March 17, 2021, 3:09pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
