How to create mass amounts of test data in Elasticsearch / Kibana


I need to test some logic I have written in Kibana (Vega scripts). For this I require quite a bit of data to generate the graphs. I am currently doing this manually with statements such as:

POST /metrics-hardware-xxx/_doc
{ "@timestamp": "2023-03-11T15:00", "system_Status": 1 }

POST /metrics-hardware-xxx/_doc
{ "@timestamp": "2023-03-11T15:01", "system_Status": 1 }

POST /metrics-hardware-xxx/_doc
{ "@timestamp": "2023-03-11T15:02", "system_Status": 0 }

But I would like to have some kind of for loop where I can loop through timestamps, one per minute over several days, and assign a random system status of either 0, 1, or 2.

Is this possible in Elastic?

It is straightforward with a stored procedure in most SQL DBs, but I am not sure where to start, and there seems to be little info on the web.

Hi @shelby,

There's no capability to add a large volume of test data directly in Elasticsearch aside from the _bulk upload endpoint with generated data. Since you want data in a specific format, the sample data sets available on the Kibana home page will not be what you're looking for.

But there are several ways you could automate the ingestion of that volume of data from other sources. Have you considered one of the following:

  1. Writing a script in a programming language you're comfortable with, such as Python, and then pushing the data using the Elasticsearch client for that language? (There is a minimal sketch after this list.)
  2. Using Logstash to ingest data from a file or DB (if you have another source) into Elasticsearch?
  3. Making use of a community project such as elasticsearch-sample-data-generator. Just a warning: that project hasn't been updated since 2021, so you need to check whether the version of Elasticsearch you are running is compatible with it.
  4. Using a JSON generator and uploading the data, similar to the steps in this post: How to create sample data to perform searches using elastic search? (DevOpsSchool.com)
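
To make option 1 concrete, here is a minimal sketch of the generation loop. It is shown in C# with the Elastic.Clients.Elasticsearch client, but the same shape works in Python or any other language with an official client. The endpoint, credentials, index name, and field names are placeholders you would adapt to your own setup:

using System;
using System.Collections.Generic;
using Elastic.Clients.Elasticsearch;
using Elastic.Transport;

// Placeholders: replace with your own endpoint and credentials.
var settings = new ElasticsearchClientSettings(new Uri("https://YOUR_DEPLOYMENT_URL:9243"))
    .Authentication(new BasicAuthentication("USER", "PWD"));
var client = new ElasticsearchClient(settings);

var random = new Random();
var start = new DateTime(2023, 3, 11, 15, 0, 0, DateTimeKind.Utc);

// One document per minute over three days, with a random status of 0, 1 or 2.
for (var minute = 0; minute < 3 * 24 * 60; minute++)
{
    var doc = new Dictionary<string, object>
    {
        ["@timestamp"] = start.AddMinutes(minute),
        ["system_Status"] = random.Next(0, 3)   // upper bound is exclusive
    };
    await client.IndexAsync(doc, i => i.Index("metrics-hardware-xxx"));
}

Indexing one document at a time like this is fine for a few thousand test documents; beyond that, batching requests through the _bulk endpoint is the better fit.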

Do any of those options work for you?

2 Likes

Hello, many thanks for the follow-up. Yes, I am currently experimenting with _bulk using curl. This is an excellent option for my case.
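
In case it helps anyone else: the _bulk endpoint takes newline-delimited JSON, one action line followed by one document line per entry (with curl, send it with the Content-Type: application/x-ndjson header). For my index that looks something like:

POST /metrics-hardware-xxx/_bulk
{ "index": {} }
{ "@timestamp": "2023-03-11T15:00", "system_Status": 1 }
{ "index": {} }
{ "@timestamp": "2023-03-11T15:01", "system_Status": 2 }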

1 Like

Hello, I have some follow-up questions. I am still trying to get bulk insertion to work, and I am now testing the Elasticsearch C# client.

I copied the very simple example in the doc:

var settings = new ElasticsearchClientSettings(new Uri(ELASTIC_URI))
    .Authentication(new BasicAuthentication(USER, PWD));
var client = new ElasticsearchClient(settings);
var response = await client.CountAsync();
Console.WriteLine($"success={response.IsSuccess()} : {response.DebugInformation}");

However I am having connectivity issues.

  1. Whether or not I use the correct USER/PWD combination, I always get the same error message:

Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on TransportConfiguration to force it to be set on the response.

Response:

{"statusCode":404,"error":"Not Found","message":"Not Found"}

  2. I am not sure what I should use for ELASTIC_URI, particularly the port setting:

I have xxxx.yy.zz.aws.elastic-cloud.com:9243

Is the port setting correct? Where can I find the URI? Is it the same one that I use in the Kibana Dev Console? Do I need a "/" at the end?

Most of the info that I found on the web shows examples where people are working with a local Elastic installation, where the port seems to always be 9200, so I am not sure of the URL:port combination for a cloud setup.

  3. I also spent quite a bit of time trying to connect using the certificate fingerprint, which I generated with openssl according to the online documentation; I ran into similar issues there (I can try again and post more details on that as well).

Any help is much appreciated; it is quite important for us to be able to manipulate the data remotely in bulk.

Hi @shelby ,

Your URL should be a combination of your Elasticsearch URL and port, separated by a colon. You can copy the base URL from the Cloud home screen, as shown in the screenshot below:

Just like local installations, port 9200 should be used for API calls on a cloud deployment. So, for example, with the copied URL https://my-deployment.com, you would specify https://my-deployment.com:9200.

Hope that helps! Do let me know how you get on.

Hello,
thanks again for your follow-up.

Indeed, it seems the port and URI were the issue; however, I was not able to get that endpoint to work. I did get authentication working, and I am now able to connect using the CLOUD_ID and user/pwd.

If anyone else has this issue, here is my code:

var authentication = new BasicAuthentication(USER, PWD);
var client = new ElasticsearchClient(CLOUD_ID, authentication);  // connect by Cloud ID instead of a URI
var response = await client.CountAsync();

I will start experimenting with the API, which looks very powerful; I expect to easily be able to bulk insert in a flexible way.
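
For anyone else heading the same way, here is the rough shape I am planning for the bulk insert (a sketch only, not yet tested on my end): generate one document per minute with a random status, then push in chunks via the client's BulkAsync/IndexMany bulk helpers. It assumes .NET 6+ and the same CLOUD_ID/USER/PWD placeholders as in the snippet above:

using System;
using System.Linq;
using System.Text.Json.Serialization;
using Elastic.Clients.Elasticsearch;
using Elastic.Transport;

// CLOUD_ID, USER and PWD are the same placeholders as in the snippet above.
var client = new ElasticsearchClient(CLOUD_ID, new BasicAuthentication(USER, PWD));

var random = new Random();
var start = new DateTime(2023, 3, 11, 15, 0, 0, DateTimeKind.Utc);

// One document per minute over three days, with a random status of 0, 1 or 2.
var docs = Enumerable.Range(0, 3 * 24 * 60)
    .Select(minute => new HardwareMetric
    {
        Timestamp = start.AddMinutes(minute),
        SystemStatus = random.Next(0, 3)   // upper bound is exclusive
    })
    .ToList();

// Send in chunks so no single bulk request grows too large.
foreach (var chunk in docs.Chunk(1000))
{
    var response = await client.BulkAsync(b => b
        .Index("metrics-hardware-xxx")
        .IndexMany(chunk));
    Console.WriteLine($"success={response.IsSuccess()}");
}

public class HardwareMetric
{
    [JsonPropertyName("@timestamp")]
    public DateTime Timestamp { get; set; }

    [JsonPropertyName("system_Status")]
    public int SystemStatus { get; set; }
}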

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.