Use ES as a backend for a Dash application

Hi Team,

ES Version: 6.5.4
Python Version: 3

I am writing a Dash app to generate reports that uses Elasticsearch as its backend/data source. To connect the app to Elasticsearch I use the elasticsearch-dsl Python module. When I query the data for a single day, which consists of 13322 docs, it takes a long time to display the results, even though I am using the scroll API.

The Python script queries the data and converts the JSON response into a pandas DataFrame.
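A minimal sketch of what this scroll-and-flatten step could look like with the scan helper from the elasticsearch package (assuming the same es_client and realm_query that appear further down; doc_query here is a hypothetical stripped-down body that keeps only the bool/range query, so that hits are actually returned):

from elasticsearch.helpers import scan
import pandas as pd

# Hypothetical: reuse only the query part of realm_query, since a body with
# 'size': 0 and aggregations returns no hits to scroll through.
doc_query = {'query': realm_query['query']}

# scan() drives the scroll API internally and yields one hit at a time
rows = [hit['_source'] for hit in scan(
    es_client,
    index='server-*',
    query=doc_query,
    size=500,        # documents per scroll page
    scroll='10s'
)]

data_es_df = pd.DataFrame(rows)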

The Elasticsearch query was structured using Kibana's Dev Tools.

Query:-

realm_query = {
    'aggs': {
        '2': {
            'date_histogram': {
                'field': 'EventTime',
                'interval': '1h',
                'time_zone': 'Europe/London',
                'min_doc_count': 1
            },
            'aggs': {
                '3': {
                    'significant_terms': {
                        'field': 'AuthStatus.keyword',
                        'size': 2
                    }
                }
            }
        }
    },
    'size': 0,
    '_source': {
        'excludes': []
    },
    'stored_fields': ['*'],
    'script_fields': {},
    'docvalue_fields': [
        {'field': '@timestamp', 'format': 'date_time'},
        {'field': 'EventTime', 'format': 'date_time'}
    ],
    'query': {
        'bool': {
            'must': [
                {'terms': {'server': ['server_name']}}
            ],
            'filter': [
                {
                    'range': {
                        'EventTime': {
                            'gte': 1601506800000,
                            'lte': 1601593199999,
                            'format': 'epoch_millis'
                        }
                    }
                }
            ]
        }
    }
}

Python script:-

import numpy as np
import pandas as pd

# es_client is assumed to be an already-connected low-level
# elasticsearch.Elasticsearch client.

response = es_client.search(
    index="server-*",
    scroll="10s",
    size=500,
    body=realm_query  # note: realm_query sets 'size': 0, i.e. an aggregation-only body
)

# Counter values
counter = 0
sid = response['_scroll_id']

# In ES 6.x 'total' is the integer count of matching documents
scroll_size = response['hits']['total']
print("Scroll size : ", scroll_size)

# Store the Elasticsearch index's fields in a dict of numpy arrays
fields = {}


def collect_hits(hits):
    """Flatten the _source of each hit into the fields dict."""
    for doc in hits:
        # the document data lives in the _source field
        source_data = doc["_source"]
        # _source is a dictionary, so iterate through it
        for key, val in source_data.items():
            try:
                fields[key] = np.append(fields[key], val)
            except KeyError:
                fields[key] = np.array([val])


# Process the first page returned by the initial search call
collect_hits(response["hits"]["hits"])

while scroll_size > 0:
    page = es_client.scroll(scroll_id=sid, scroll='10s')
    sid = page['_scroll_id']

    # Number of results returned by the last scroll request
    scroll_size = len(page['hits']['hits'])

    # Flatten the hits of this scroll page
    collect_hits(page["hits"]["hits"])

    print("Scroll Size {} ".format(scroll_size))
    counter = counter + 1

# Release the scroll context on the cluster
es_client.clear_scroll(scroll_id=sid)

# Transform the dictionary into a pandas DataFrame
data_es_df = pd.DataFrame(fields)

print("Total Pages : {}".format(counter))

Output Time:-

May I know how Kibana queries the data from Elasticsearch, and whether the same thing is doable from a Python script?
Any reference or documentation would be helpful.
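
My understanding so far is that Kibana sends essentially the same 'size': 0 aggregation request to the _search API and just renders the buckets it gets back, so no documents travel over the wire. If that is right, reading those buckets in Python would look roughly like this (a sketch only, using the response from the search call above):

# The aggregation results are already on the first response; nothing to scroll.
# '2' is the date_histogram aggregation named in realm_query.
for hour_bucket in response['aggregations']['2']['buckets']:
    print(hour_bucket['key_as_string'], hour_bucket['doc_count'])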

Best,
Yash

Hi Team,

I used an aggregation to speed up the data querying instead of pulling back the whole set of documents.
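
In case it helps someone else, a rough sketch of what I mean (assuming the realm_query body and es_client from the first post): the nested date_histogram / significant_terms buckets are flattened straight into a DataFrame, so only bucket counts are transferred instead of all 13322 documents.

import pandas as pd

# One request, no scrolling: 'size': 0 returns only the aggregation buckets.
response = es_client.search(index='server-*', body=realm_query)

rows = []
for hour_bucket in response['aggregations']['2']['buckets']:      # date_histogram
    for status_bucket in hour_bucket['3']['buckets']:             # significant_terms
        rows.append({
            'EventTime': hour_bucket['key_as_string'],
            'AuthStatus': status_bucket['key'],
            'doc_count': status_bucket['doc_count'],
        })

data_es_df = pd.DataFrame(rows)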

This ticket can be closed.

Best,
Yash
