My logs have a "message" field that contains, among other strings, the term "UserID:" followed by a value.
The value usually differs between documents, but sometimes the same UserID is logged more than once.
I am trying to find a way to count the number of times each UserID is logged.
I have been researching but can't find a way to isolate the value after "UserID:" and then count based on this value.
Any help would be greatly appreciated.
RabBit_BR
(andre.coelho)
May 12, 2022, 1:05am
2
Hi @MobileOne.
Could you give an example of the indexed doc?
Hi @RabBit_BR
You will see below that the "message" field has "response={ UserId:" followed by the ID that I want to count across all the results.
{
  "_index" : ".sample-2022.05.10-001311",
  "_type" : "_doc",
  "_id" : "ECaIr4ABpdjiMbYDoZXE",
  "_score" : 49.102036,
  "_ignored" : [
    "message.keyword"
  ],
  "_source" : {
    "container" : {
      "image" : {
        "name" : "xyz.amazonaws.com/user:3.471ae2"
      }
    },
    "cluster" : "qa",
    "kubernetes" : {
      "container" : {
        "name" : "user"
      },
      "pod" : {
        "ip" : "70.71.245.322",
        "name" : "user-6768fdff5-v33k"
      },
      "namespace" : "qa",
      "replicaset" : {
        "name" : "user-6734573255"
      },
      "labels" : {
        "service" : "user",
        "pod-template-hash" : "2554339581",
        "track" : "stable"
      }
    },
    "level" : "INFO",
    "project" : "test",
    "message" : "[2022-05-10 19:53:27.228] [INFO] response={ UserId: '977dfe3fd034c8609b6ec63cafd0d14f57caa24536593c7f2f3df6f2a6b4c1236e1147fb44a1eb6', registrationId: '9D702A7B-B676-E74E0799', rawRequest: undefined, rawResponse: undefined, token: 'f80b7a786cc49a29d03ed9a1c06954dd521c70fef445aada4eb8511eb12170f677b7ba68bbe56f595a575f1', timestamp: '1652212407227' } responseTime=987 service=user traceId=7dd7a96b8ce051f1",
    "market" : "us",
    "input" : { },
    "environment" : "staging",
    "@timestamp" : "2022-05-10T19:53:27.228Z",
    "ecs" : { },
    "stream" : "stdout",
    "service" : "user",
    "host" : {
      "name" : "ip-70-71-245-322"
    },
    "region" : "us",
    "event" : "authorize"
  }
}
RabBit_BR
(andre.coelho)
May 12, 2022, 5:45am
4
I don't see a way to do this.
Maybe another user can help you.
Sorry.
Thanks @RabBit_BR. I think the way forward is via an analyzer, but my lack of experience with this feature is an issue.
Tomo_M
(Tomohiro Mitani)
May 13, 2022, 6:19am
6
Hi,
If performance is not a concern, one possible way is to use a runtime field for UserId. With a runtime field, you can use a script to process the message field almost freely and extract the UserId. However, because the script runs over every matching document for each query, it may cause serious performance problems.
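As a rough sketch of the runtime field idea, the Python below only builds the body of a search request; it does not contact a cluster. The field name `user_id`, the aggregation name `user_ids`, and the Painless script are my assumptions, not something tested against your index. The script reads `params._source.message` because your sample shows `message.keyword` in `_ignored`, so doc values may not hold the full text.

```python
import json

# Painless source for a hypothetical runtime field. It takes the raw text
# from _source and emits the quoted value that follows "UserId: ".
PAINLESS_SOURCE = """
String msg = params._source.message;
if (msg != null) {
  def m = /UserId: '([^']+)'/.matcher(msg);
  if (m.find()) {
    emit(m.group(1));
  }
}
"""


def build_user_id_count_request(size=50):
    """Return a search body that counts documents per extracted UserId."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "runtime_mappings": {
            "user_id": {
                "type": "keyword",
                "script": {"source": PAINLESS_SOURCE},
            }
        },
        "aggs": {
            "user_ids": {"terms": {"field": "user_id", "size": size}}
        },
    }


body = build_user_id_count_request()
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `_search` request on the index. Each bucket under `user_ids` would then carry one UserId and its `doc_count`. Narrowing the query with a time range first should reduce how many documents the script has to process.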
In my opinion, this should be addressed before indexing. With Logstash, for example, you can parse the message field and create a dedicated UserId field that Elasticsearch can aggregate on.
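To illustrate the parsing step itself, here is a small Python sketch of the same logic: pull out the quoted value after "UserId:" and count occurrences. It is not Logstash configuration (in Logstash this would typically be a grok or dissect filter), and the sample messages are shortened, made-up versions of the one you posted.

```python
import re
from collections import Counter

# Matches the quoted value that follows "UserId:" in the log line.
USER_ID_RE = re.compile(r"UserId:\s*'([^']+)'")


def extract_user_id(message):
    """Return the UserId found in a log message, or None if there is none."""
    match = USER_ID_RE.search(message)
    return match.group(1) if match else None


def count_user_ids(messages):
    """Count how many messages were logged for each UserId."""
    ids = (extract_user_id(m) for m in messages)
    return Counter(i for i in ids if i is not None)


# Hypothetical, shortened sample messages for illustration only.
sample_messages = [
    "[INFO] response={ UserId: 'abc123', registrationId: 'r-1' } responseTime=987",
    "[INFO] response={ UserId: 'abc123', registrationId: 'r-2' } responseTime=120",
    "[INFO] response={ UserId: 'def456', registrationId: 'r-3' } responseTime=45",
    "[INFO] health check ok",
]

counts = count_user_ids(sample_messages)
print(counts.most_common())
```

Once the value is stored in its own keyword field at index time, the counting can be left to a plain terms aggregation on that field, with no script running at query time.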
Tomo_M
(Tomohiro Mitani)
May 13, 2022, 6:24am
7
An ingest pipeline is another solution.
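A minimal sketch of what such a pipeline might look like, written as Python that only builds the request bodies (nothing is sent to a cluster). The target field `user_id` and the grok pattern are my assumptions and have not been tested against your data. The second function builds a body for the pipeline simulate API so the pattern can be tried on a sample message before it is used for real.

```python
import json


def build_user_id_pipeline():
    """Return an ingest pipeline definition with one grok processor."""
    return {
        "description": "Extract UserId from the message field (illustrative)",
        "processors": [
            {
                "grok": {
                    "field": "message",
                    # Capture the text between the quotes after "UserId: "
                    # into a hypothetical new field named user_id.
                    "patterns": ["UserId: '%{DATA:user_id}'"],
                    # Let documents without a UserId pass through unchanged.
                    "ignore_failure": True,
                }
            }
        ],
    }


def build_simulate_request(message):
    """Return a body for the _ingest/pipeline/_simulate endpoint."""
    return {
        "pipeline": build_user_id_pipeline(),
        "docs": [{"_source": {"message": message}}],
    }


simulate_body = build_simulate_request(
    "[INFO] response={ UserId: 'abc123', registrationId: 'r-1' }"
)
print(json.dumps(simulate_body, indent=2))
```

If the simulation returns the expected `user_id`, the pipeline definition could be stored and applied to new documents at index time. With default dynamic mapping the new field would probably be aggregated via `user_id.keyword`, unless it is mapped explicitly as `keyword`.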