I am searching for data in the time range "2024-09-20".
I made sure that the user "administrador" has permission to generate CSV reports, and the sentence is omitted.
Hi @RobertC1 I'm glad you solved the problem here. Just FYI: I see a missing comma at the end of the line starting with "from": "2023-09-20", turning this into invalid JSON. This is likely the cause.
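For anyone hitting the same error, here is a minimal sketch of how a missing comma can be caught before the request is sent. Only the "from" line is quoted in this thread, so the surrounding body and the "to" field below are made up for illustration, and `check_json` is a hypothetical helper name.

```python
import json

# Hypothetical request body: only the "from" line comes from the thread,
# the "to" field and its value are invented for this example.
broken = '''{
  "from": "2023-09-20"
  "to": "2023-09-21"
}'''

# Same body with the comma restored after the "from" line.
fixed = '''{
  "from": "2023-09-20",
  "to": "2023-09-21"
}'''


def check_json(text):
    """Return None if text parses as JSON, otherwise a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


broken_error = check_json(broken)
fixed_error = check_json(fixed)
print("broken:", broken_error)
print("fixed:", fixed_error)
```

Running the body through a JSON parser like this points at the line where parsing stops, which is usually the line right after the missing comma.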
from elasticsearch import Elasticsearch
from datetime import datetime
# Elasticsearch URL
ELASTICSEARCH_URL = "http://localhost:9200"
# Elasticsearch username and password
USERNAME = "elastic"
PASSWORD = "xxxx"
# Create Elasticsearch client with basic authentication
es = Elasticsearch([ELASTICSEARCH_URL], basic_auth=(USERNAME, PASSWORD))
# Check if the connection is successful
if not es.ping():
    print("Could not connect to Elasticsearch")
    exit()
# Define the date range
start_date = "2023-09-05T00:00:00"
end_date = "2023-09-05T23:59:59"
# Initial search query with date range
query = {
    "size": 10000,  # Number of documents to retrieve per scroll
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "@timestamp": {
                            "gte": start_date,
                            "lte": end_date
                        }
                    }
                }
            ]
        }
    },
    "sort": [
        {"nodo": {"order": "asc"}}
    ]
}
# Execute initial search request with scroll
response = es.search(index="trx_hours_history", body=query, scroll="1s")
# Store unique document identifiers to avoid duplicates
seen_ids = set()
#print(f'{fecha},"{nodo_str}",{base},{sp},{segundo_div:.4f},"{cant_format}"')
# Process initial search response
scroll_id = response["_scroll_id"]
hits = response["hits"]["hits"]
total_hits = response["hits"]["total"]["value"]
# Process the first batch of results
for hit in hits:
    source = hit["_source"]
    doc_id = hit["_id"]  # Get document ID
    # Check if the document ID has been seen before
    if doc_id not in seen_ids:
        seen_ids.add(doc_id)
        fecha = source.get("@timestamp")
        nodo = source.get("nodo")
        nodo_str = str(nodo).zfill(2)  # Zero-pad the node number to two digits
        base = source.get("base")
        sp = source.get("sp")
        cant_trn = source.get("cant_trn")
        cant_format = "{:,}".format(cant_trn)  # Thousands separators
        segundo_prom = source.get("segundo_prom")
        # Print only rows that have an average time; segundo_div is undefined otherwise
        if segundo_prom is not None:
            segundo_div = segundo_prom / 1000
            print(f'{fecha},"{nodo_str}",{base},{sp},{segundo_div:.4f},"{cant_format}"')
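The script as posted stops after the first batch, so any results beyond the first 10000 hits would never be printed. Below is a minimal sketch of how the remaining scroll pages might be fetched. `iter_scroll_hits` is a hypothetical helper name, the "1m" keep-alive is an assumed value, and `es` is assumed to be a client exposing `search`, `scroll` and `clear_scroll` like the one created above.

```python
def iter_scroll_hits(es, index, query, keep_alive="1m"):
    """Yield every hit for `query`, following the scroll cursor page by page.

    `es` is assumed to be an Elasticsearch client (or any object with the
    same search / scroll / clear_scroll methods). `keep_alive` is how long
    the server keeps the scroll context open between requests.
    """
    response = es.search(index=index, body=query, scroll=keep_alive)
    scroll_id = response["_scroll_id"]
    try:
        # An empty page means the scroll is exhausted.
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            response = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response["_scroll_id"]
    finally:
        # Release the scroll context on the server once we are done.
        es.clear_scroll(scroll_id=scroll_id)
```

With this in place, the loop over the first batch could instead be written as `for hit in iter_scroll_hits(es, "trx_hours_history", query):` and the body of the loop would stay the same.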
Thanks for sharing your code. I think it could be super useful for the community. If you don't mind, I'd mark it as a solution for your post so people can get inspiration from it.
dadoonet
I don't really remember; I believe it was a mix of a YouTube video and ChatGPT.
It was late on a Sunday, and once I found the Python solution I didn't pursue the curl command any further.
Best Regards.