Use ES as a backend for a Dash application

yasvanth · October 21, 2020, 4:16pm

Hi Team,

ES Version: 6.5.4
Python Version : 3

I am writing a Dash app to generate reports that uses Elasticsearch as its backend/data source. To connect the app to Elasticsearch I use the elasticsearch-dsl Python module. When I query the data for a single day, which consists of 13322 docs, it takes a long time to display the results, even though I am using the scroll API.

The Python script queries the data and converts it from JSON to a pandas DataFrame.

The Elasticsearch query was structured using Kibana's Dev Tools.

Query:-

realm_query = {
    'aggs': {
        '2': {
            'date_histogram': {
                'field': 'EventTime',
                'interval': '1h',
                'time_zone': 'Europe/London',
                'min_doc_count': 1
            },
            'aggs': {
                '3': {
                    'significant_terms': {
                        'field': 'AuthStatus.keyword',
                        'size': 2
                    }
                }
            }
        }
    },
    'size': 0,
    '_source': {
        'excludes': []
    },
    'stored_fields': ['*'],
    'script_fields': {},
    'docvalue_fields': [
        {'field': '@timestamp', 'format': 'date_time'},
        {'field': 'EventTime', 'format': 'date_time'}
    ],
    'query': {
        'bool': {
            'must': [
                {'terms': {'server': ['server_name']}}
            ],
            'filter': [
                {
                    'range': {
                        'EventTime': {
                            'gte': 1601506800000,
                            'lte': 1601593199999,
                            'format': 'epoch_millis'
                        }
                    }
                }
            ]
        }
    }
}
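The `gte`/`lte` values in the range filter are epoch milliseconds for the first and last millisecond of 1 October 2020 in UK local time. A minimal sketch of how such bounds could be computed rather than hard-coded is below. The helper names `day_bounds_ms` and `_to_ms` are illustrative, and the fixed UTC+1 offset is an assumption that only holds while British Summer Time is in effect.

```python
from datetime import date, datetime, time, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _to_ms(moment):
    """Convert an aware datetime to whole milliseconds since the Unix epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def day_bounds_ms(day, tz):
    """Return (gte, lte) epoch milliseconds covering `day` in timezone `tz`."""
    start = datetime.combine(day, time.min, tzinfo=tz)
    next_start = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    # lte is the last millisecond before the next local midnight.
    return _to_ms(start), _to_ms(next_start) - 1


# Assumption: BST (UTC+1) applied on this date, so a fixed offset is enough here.
BST = timezone(timedelta(hours=1))
gte, lte = day_bounds_ms(date(2020, 10, 1), BST)
print(gte, lte)  # 1601506800000 1601593199999
```

On Python 3.9 or later, `zoneinfo.ZoneInfo("Europe/London")` can be passed as `tz` instead, so that daylight saving changes are handled automatically.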

python script:-

import numpy as np
import pandas as pd

# es_client is an Elasticsearch client instance created elsewhere in the app.
response = es_client.search(
    index="server-*",
    scroll="10s",
    size=500,
    body=realm_query
)

# Counter values
counter = 0
sid = response['_scroll_id']

# In ES 6.x, hits.total is a plain integer.
scroll_size = response['hits']['total']
print("Scroll size : ", scroll_size)

# Store the Elasticsearch index's fields in a dict
fields = {}

# The first page of hits comes back with the search call itself.
elastic_docs = response['hits']['hits']

while len(elastic_docs) > 0:

    # Iterate the data into fields.
    for doc in elastic_docs:
        # The data is contained in the _source field
        source_data = doc["_source"]

        # _source is a dictionary, so iterate through it
        for key, val in source_data.items():
            try:
                fields[key] = np.append(fields[key], val)
            except KeyError:
                fields[key] = np.array([val])

    print("Scroll Size {} ".format(len(elastic_docs)))
    counter = counter + 1

    # Fetch the next page and carry the scroll id forward
    page = es_client.scroll(scroll_id=sid, scroll='10s')
    sid = page['_scroll_id']
    elastic_docs = page['hits']['hits']

# Transform dictionary to pandas dataframe
data_es_df = pd.DataFrame(fields)

print("Total Pages : {}".format(counter))
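One likely contributor to the slow run is `np.append`, which copies the whole array each time it is called, here once per field per document. A minimal sketch of an alternative, assuming the same `es_client` and query as above, collects the `_source` dicts in a plain list and builds the DataFrame once at the end. The function names `iter_scroll_hits` and `collect_sources` are illustrative and not part of the Elasticsearch client.

```python
def iter_scroll_hits(client, index, body, page_size=500, keep_alive="1m"):
    """Yield every hit of a scrolled search, one document at a time."""
    page = client.search(index=index, body=body,
                         scroll=keep_alive, size=page_size)
    scroll_id = page["_scroll_id"]
    try:
        # The first page comes from search(); later pages come from scroll().
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            page = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page["_scroll_id"]
    finally:
        # Free the server-side scroll context once iteration stops.
        client.clear_scroll(scroll_id=scroll_id)


def collect_sources(hits):
    """Gather the _source dict of each hit into a plain list of rows."""
    return [hit["_source"] for hit in hits]
```

With these in place, `rows = collect_sources(iter_scroll_hits(es_client, "server-*", realm_query))` followed by `pd.DataFrame(rows)` builds the frame in one step, and pandas fills fields missing from some documents with NaN. The `elasticsearch.helpers.scan` helper that ships with the Python client wraps a similar loop.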

Output Time:-

May I know how Kibana queries the data from Elasticsearch, and whether the same is doable from a Python script?
Any references or documentation would be helpful.

Best,
Yash

yasvanth · October 24, 2020, 10:06am

Hi Team,

I used aggregations to speed up the query instead of retrieving the whole documents.
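As a rough sketch of what reading an aggregation-only response can look like, the snippet below flattens the buckets of a `date_histogram` with a nested `significant_terms` aggregation into rows. It assumes the aggregation names '2' and '3' from the query in the first post and the usual bucket layout Elasticsearch returns; the function name `flatten_histogram`, the row keys, and the sample data are illustrative only.

```python
def flatten_histogram(response, hist_name="2", terms_name="3"):
    """Turn date_histogram -> significant_terms buckets into flat row dicts."""
    rows = []
    for hour in response["aggregations"][hist_name]["buckets"]:
        for term in hour[terms_name]["buckets"]:
            rows.append({
                "time": hour["key_as_string"],
                "auth_status": term["key"],
                "doc_count": term["doc_count"],
            })
    return rows


# Made-up sample response, trimmed to the keys the function reads.
sample = {
    "aggregations": {
        "2": {
            "buckets": [
                {
                    "key_as_string": "2020-10-01T00:00:00.000+01:00",
                    "key": 1601506800000,
                    "doc_count": 7,
                    "3": {"buckets": [
                        {"key": "Accept", "doc_count": 5},
                        {"key": "Reject", "doc_count": 2},
                    ]},
                }
            ]
        }
    }
}
rows = flatten_histogram(sample)
```

The resulting list of dicts can be passed straight to `pd.DataFrame(rows)`, and because the query uses `'size': 0`, only these buckets travel over the network rather than every document.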

This ticket can be closed.

Best,
Yash

system · November 21, 2020, 10:06am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
