Greetings.
I need to fetch the oldest or newest document saved into a beat.  Someone asked this before and this was the suggested course of action:
  
  
    Hello, is there a solution to get this information?
   
 
This approach will NOT work for metricbeat data, because you cannot retrieve more than 10,000 records at a time and metricbeat produces almost 9,000 in one day.
Any suggestions?  Thank you!
             
            
               
               
               
            
                
            
           
          
            
            
              One of the indices I'm looking at has 57,150,950 documents.  That's just one of them.  : /
             
            
               
               
               
            
            
           
          
            
              
              spinscale (Alexander Reelsen), December 18, 2019, 8:54am, post 3
               
             
            
              What is the problem with sorting by timestamp and returning only a single document? Execute two searches, one with sort asc and the other with sort desc.
Would that work, or is there an issue with that approach that I might have missed in your post?
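
A minimal sketch of that suggestion, assuming the official `elasticsearch` Python client with the 7.x-style `body=` argument, an index pattern of `metricbeat-*`, and a date field named `@timestamp`. The helper names `edge_query` and `oldest_and_newest` are hypothetical. Each request asks for `size: 1`, so the 10,000-hit result window should not come into play regardless of how many documents the index holds.

```python
def edge_query(order):
    """Build a search body that returns only the first hit in the given sort order."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "size": 1,  # only the single document at the edge of the sort
        "query": {"match_all": {}},
        "sort": [{"@timestamp": {"order": order}}],
    }


def oldest_and_newest(es, index="metricbeat-*"):
    """Run two searches: ascending for the oldest hit, descending for the newest."""
    results = {}
    for label, order in (("oldest", "asc"), ("newest", "desc")):
        resp = es.search(index=index, body=edge_query(order))
        hits = resp["hits"]["hits"]
        # An empty index (or pattern with no matches) yields no hits at all.
        results[label] = hits[0]["_source"] if hits else None
    return results
```

With a real cluster this might be called as `oldest_and_newest(Elasticsearch("http://localhost:9200"))`, where the URL is a placeholder. The `match_all` query can be swapped for a `bool` filter to restrict the search to one host or ID.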
             
            
               
               
               
            
            
           
          
            
            
              There are way too many documents.  I'm querying with Python - which has a maximum document size of 10k.  With metricbeat data I can go back about a day but that's it.
             
            
               
               
               
            
            
           
          
            
              
              spinscale (Alexander Reelsen), December 19, 2019, 9:06am, post 5
               
             
            
              We are now talking about something different, it seems. Your initial ask, from what I read, was to retrieve only a single document.
If you need to get more than 10k documents, then use a scroll search.
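
A rough sketch of a scroll search, assuming the 7.x `elasticsearch` Python client and its `search(scroll=...)`, `scroll(...)`, and `clear_scroll(...)` methods. The function name `scroll_all`, the page size, and the keep-alive value are illustrative choices, not anything specified in this thread.

```python
def scroll_all(es, index, query, page_size=1000, keep_alive="2m"):
    """Yield every matching hit by paging through a scroll context."""
    resp = es.search(
        index=index,
        scroll=keep_alive,   # how long the server keeps the scroll context alive
        size=page_size,      # hits per page, not the total number returned
        body={"query": query},
    )
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            # Fetch the next page; the scroll id may change between calls.
            resp = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release server-side resources once iteration ends or is abandoned.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

The client library also ships `elasticsearch.helpers.scan`, which wraps the same loop. Scrolling is meant for reading many documents; for a single oldest or newest document, a sorted search with `size: 1` is far cheaper.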
             
            
               
               
               
            
            
           
          
            
            
              Okay...  I need to get the oldest of 15 million records.  I don't think I can do it.
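
If only the timestamps of the oldest and newest records are needed, rather than the documents themselves, a `min`/`max` aggregation is another option, since it returns no hits at all. This is a sketch under the assumption that `@timestamp` is mapped as a date field; `timestamp_bounds_body` and `timestamp_bounds` are hypothetical helper names.

```python
def timestamp_bounds_body(field="@timestamp"):
    """Build a body that asks only for the min and max of a date field."""
    return {
        "size": 0,  # no documents returned, only aggregation results
        "aggs": {
            "oldest": {"min": {"field": field}},
            "newest": {"max": {"field": field}},
        },
    }


def timestamp_bounds(es, index="metricbeat-*"):
    """Return (oldest, newest) timestamps as strings, or None where missing."""
    resp = es.search(index=index, body=timestamp_bounds_body())
    aggs = resp["aggregations"]
    return (
        aggs["oldest"].get("value_as_string"),
        aggs["newest"].get("value_as_string"),
    )
```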
             
            
               
               
               
            
            
           
          
            
            
              For instance, if I do this query, I can go back a few days.  But that's all.
resr = esr.search(index="metricbeat-*", size=documentSize, body={
    "query": {
        "bool": {
            "must": [
                {"term": {"fields": thisId}},
                {"exists": {"field": "host"}}
            ]
        }
    },
    "sort": [
        {
            "@timestamp": {
                "order": "desc"
            }
        }
    ]
})
return resr["hits"]["hits"][0]["_source"]
 
I'll look into the scroll functionality and see if I can get it to work.  Thank you for the suggestion.
             
            
               
               
               
            
            
           
          
            
            
              To get around this problem I am changing my approach.  Of course, that created a new issue...  Here's the follow-up question:
  
  
    Hi there. 
I'm trying to aggregate host configuration data out of metricbeat.  I need to look at the documents for a single day, bucketing by ID, and return the host info.  What I'm trying to do is watch for configuration changes on a daily basis. 
If I use this code: 
"query": {
		'bool': {
			'must': [
				{"exists": {"field": "host"}},
				{'term': {'@timestamp': date}}
			]
		}
	},
	"aggs": {
		"by_id": {
			"terms": {"field": "fields.id"}
		}
	}
I get an array out containing our fields.id …
   
 
Thank you!
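
The quoted follow-up is truncated, so the problem it goes on to describe is not visible here. As general background, a `term` query on `@timestamp` matches one exact instant, so "the documents for a single day" is more commonly expressed as a `range` query. The sketch below assumes `fields.id` is a keyword field and that returning the latest `host` object per ID is what is wanted; the name `daily_host_body` and the bucket limit of 1000 are illustrative.

```python
from datetime import date, timedelta


def daily_host_body(day, max_ids=1000):
    """Build a body that buckets one day's documents by fields.id
    and keeps the most recent host object for each bucket."""
    next_day = day + timedelta(days=1)
    return {
        "size": 0,  # only the aggregation is of interest
        "query": {
            "bool": {
                "filter": [
                    {"exists": {"field": "host"}},
                    # Half-open interval: [day 00:00, next day 00:00)
                    {"range": {"@timestamp": {
                        "gte": day.isoformat(),
                        "lt": next_day.isoformat(),
                    }}},
                ]
            }
        },
        "aggs": {
            "by_id": {
                "terms": {"field": "fields.id", "size": max_ids},
                "aggs": {
                    "latest_host": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{"@timestamp": {"order": "desc"}}],
                            "_source": {"includes": ["host"]},
                        }
                    }
                },
            }
        },
    }
```

Each bucket in `aggregations.by_id.buckets` would then carry one hit under `latest_host.hits.hits`, which can be compared against the previous day's result to spot configuration changes.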
             
            
               
               
               
            
            
           
          
            
              