Best practice for a newbie

Rhh · April 15, 2020, 4:37am

Hi

I am totally new to Elasticsearch and am trying to figure out how to use it with Serilog as the logging provider.

I have a lot of different log sources (100+), and each of them generates a JSON document that I need to log. I know I can store
the different JSON documents in a single field, but I think that will not work well for searching. Most of the JSON documents
have little in common: only about 5-10% of them share any fields.

So my questions are:

Should I save the complete JSON in one field, or should I split it into separate fields, one for each value inside it?

If I create separate fields for every value in the JSON documents, should they all go in the same index, given their different structures?
If not, should I create one index per JSON type? Will that not hurt performance?

Regards

Sample 1:

{
    "message": "my custom data 1"
}

Sample 2:

{
    "message": "my custom data 2",
    "data": {
        "A": "valueA",
        "B": {
            "B1": "valueB1"
        },
        "C": {
            "C1": "valueC1",
            "C2": {
                "C2_1": "valueC2_1",
                "C2_2": "valueC2_2"
            }
        }
    }
}

Sample 3:

{
    "message": "my custom data 3",
    "data1": {
        "D": {
            "D1": {
                "D1_1": "valueC1_1",
                "D1_2": "valueC1_2"
            }
        },
        "data2": {
            "X": {
                "X1_1": "valueX1_1",
                "X1_2": "valueX1_2"
            }
        }
    }
}
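
To make the overlap between the three samples concrete, here is a minimal Python sketch that flattens each one into dotted field paths, which is roughly how nested object fields end up being named when every value is indexed as its own field. The helper name field_paths is made up for this illustration and is not part of any Elastic or Serilog API.

```python
import json

def field_paths(doc, prefix=""):
    """Return the set of dotted leaf-field paths in a nested JSON object."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths |= field_paths(value, path)  # recurse into nested objects
        else:
            paths.add(path)
    return paths

# The three samples from the post, unchanged.
sample_1 = json.loads('{"message": "my custom data 1"}')
sample_2 = json.loads("""
{"message": "my custom data 2",
 "data": {"A": "valueA",
          "B": {"B1": "valueB1"},
          "C": {"C1": "valueC1",
                "C2": {"C2_1": "valueC2_1", "C2_2": "valueC2_2"}}}}
""")
sample_3 = json.loads("""
{"message": "my custom data 3",
 "data1": {"D": {"D1": {"D1_1": "valueC1_1", "D1_2": "valueC1_2"}},
           "data2": {"X": {"X1_1": "valueX1_1", "X1_2": "valueX1_2"}}}}
""")

paths = [field_paths(s) for s in (sample_1, sample_2, sample_3)]
common = set.intersection(*paths)  # fields present in every sample
all_fields = set.union(*paths)     # fields a shared index would need

print(sorted(common))
print(len(all_fields))
```

Only "message" is shared by all three samples, while a single shared index would have to carry every distinct path from every source, so the field count grows with each new log source.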

forloop · April 15, 2020, 11:56am

Hey @Rhh,

You may want to check out the Elastic.CommonSchema.Serilog NuGet package, which is part of the suite of integrations for the Elastic Common Schema with .NET.

Elastic Common Schema, as the name implies, is a specification for a common set of fields to use when storing logs and metrics in Elasticsearch. Elastic.CommonSchema.Serilog contains an ITextFormatter implementation that formats a Serilog event into a JSON representation that adheres to Elastic Common Schema. You can read more about it in the GitHub repository and blog post.

To answer your questions:

Ideally, each log line in the JSON file is a separate log event that would be indexed as a separate document in Elasticsearch.

A typical approach is to define a unified logging format, similar to what Elastic Common Schema does, that all log events adhere to, which makes it easier to analyze and correlate log events. If the logs already have different structures, it may make sense to put them into separate, time-based indices.
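
As a rough illustration of that idea, the Python sketch below wraps a source-specific payload in a small shared envelope and derives a time-based index name per source. Everything here is an assumption made for the example: the helper names, the "log_source" and "payload" field names, the "logs" prefix, the daily date pattern, and the "billing" source. Only "@timestamp" and "message" are taken from Elastic Common Schema's base fields.

```python
from datetime import datetime, timezone

def to_document(source, payload, when):
    """Wrap a source-specific payload in a small shared envelope."""
    body = dict(payload)              # copy so the caller's dict is untouched
    message = body.pop("message", "")
    return {
        "@timestamp": when.isoformat(),
        "message": message,
        "log_source": source,         # hypothetical field name
        "payload": {source: body},    # source-specific fields stay namespaced
    }

def index_name(source, when, prefix="logs"):
    """Build a daily index name such as logs-billing-2020.04.15."""
    return f"{prefix}-{source}-{when:%Y.%m.%d}"

when = datetime(2020, 4, 15, 11, 56, tzinfo=timezone.utc)
event = {"message": "my custom data 2", "data": {"A": "valueA"}}

doc = to_document("billing", event, when)   # "billing" is a made-up source
target = index_name("billing", when)
print(target, doc)
```

The shared fields can then be searched across every source, while the nested payload keeps each source's own structure without clashing with the others.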

An index is made up of one or more shards. It is the number of shards, rather than the number of indices, that really has an impact: one index with five shards has about the same impact as five indices with one shard each.
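
The arithmetic behind that can be sketched in a few lines of Python. The function name, the zero-replica default, and the 30-day retention figure are assumptions chosen for the example; the 100 sources come from the original question.

```python
def total_shards(index_count, primaries_per_index, replicas=0):
    """Total shard copies in the cluster for a set of identical indices."""
    return index_count * primaries_per_index * (1 + replicas)

# The two layouts from the reply end up with the same shard count.
one_index = total_shards(index_count=1, primaries_per_index=5)
five_indices = total_shards(index_count=5, primaries_per_index=1)

# Assumption: 100 sources, one daily index each, kept for 30 days.
per_source_daily = total_shards(index_count=100 * 30, primaries_per_index=1)

print(one_index, five_indices, per_source_daily)
```

Under those assumptions, one daily index per source adds up quickly, so it may be worth grouping similar sources or using a coarser time period for each index.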

Rhh · April 16, 2020, 10:03am

OK, I will try to use Elastic.CommonSchema.Serilog.

I tried to use it this way:

LoggerConfiguration.WriteTo.Elasticsearch(
    new ElasticsearchSinkOptions(new Uri(TmpURL))
    {
        AutoRegisterTemplate = true,
        CustomFormatter = new EcsTextFormatter()
    });

But I get:

Failed to create the template

Any idea why this does not work?

Regards

Rhh · April 17, 2020, 6:34am

After some more work on this I have decided to create my own custom template.

When I request the data with the curl -X GET "localhost:9200/_search?pretty" command from a cmd window on Windows 10, the Norwegian characters are displayed incorrectly.

It should have been

"message" : "{"Message":"5 forsøk", …

I am quite sure the text is added correctly to ES.

Why is this?

Regards
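
The thread never shows the garbled output or confirms a cause, so the following is only one plausible explanation: Elasticsearch returns UTF-8, and a Windows cmd window often decodes bytes with an OEM code page instead. The Python sketch below reproduces that kind of mismatch. The choice of code page 850 is an assumption, since the actual console code page is not stated in the post.

```python
text = "5 forsøk"

# What Elasticsearch would send over the wire: "ø" becomes two bytes in UTF-8.
utf8_bytes = text.encode("utf-8")

# What a console does if it interprets those bytes with code page 850
# (assumed here; the post does not say which code page was active).
garbled = utf8_bytes.decode("cp850")

# Reversing the wrong decoding recovers the text, so the stored data is intact.
repaired = garbled.encode("cp850").decode("utf-8")

print(repr(garbled))
print(repr(repaired))
```

If this is what is happening, the data in Elasticsearch is fine and only the display is wrong. Switching the console to UTF-8 with chcp 65001 before running curl, or writing the response to a file and opening it in a UTF-8 aware editor, would be ways to check.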
