Hello, everyone,
I would like to know, in general, how a nested JSON data structure can be imported into Elasticsearch most effectively.
I have read several posts on this topic and have tried the approaches with parent/child fields, nested fields, a partially denormalized data structure, and separate indexes. Unfortunately, none of them has produced the desired result so far.
For example, I have the following structure:
{
"CVE_Items" : [ {
"cve" : {
"data_type" : "CVE",
"CVE_data_meta" : {
"ID" : "CVE-2006-1174"
},
"references" : {
"reference_data" : [ {
"url" : "",
"name" : "",
"refsource" : "",
"tags" : [ "Patch", "Vendor Advisory" ]
}, {
"url" : "",
"name" : "",
"refsource" : "",
"tags" : [ "Vendor Advisory" ]
}, {
"url" : "",
"name" : "",
"refsource" : "",
"tags" : [ "Vendor Advisory" ]
}, {
"url" : "",
"name" : "",
"refsource" : "",
"tags" : [ "Exploit" ]
} ]
},
"description" : {
"description_data" : [ {
"lang" : "en",
"value" : ""
} ]
}
},
"configurations" : {
"CVE_data_version" : "4.0",
"nodes" : [ {
"operator" : "OR",
"cpe_match" : [ {
"vulnerable" : true,
"cpe23Uri" : "cpe:2.3:a:debian:shadow:4.0.0:*:*:*:*:*:*:*",
"vendor" : "debian",
"version" : "4.0.0"
}, {
"vulnerable" : true,
"cpe23Uri" : "cpe:2.3:a:debian:shadow:4.0.1:*:*:*:*:*:*:*",
"vendor" : "debian",
"version" : "4.0.1"
}, {
"vulnerable" : true,
"cpe23Uri" : "cpe:2.3:a:debian:shadow:4.0.2:*:*:*:*:*:*:*",
"vendor" : "debian",
"version" : "4.0.2"
} ]
} ]
},
"publishedDate" : "2006-05-28T23:02Z",
"lastModifiedDate" : "2020-08-11T17:09Z"
}, {
"cve" : {
[...]
} ]
}
The arrays cpe_match and references can contain about 50 entries.
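To make the partially denormalized variant concrete, here is a minimal Python sketch of one way the structure above could be split into one document per CVE entry and serialized for the Elasticsearch bulk API. The index name "cve", the helper names flatten_item and to_bulk_lines, and the flattened field layout (id, references, cpe_match at the top level) are assumptions made for illustration, not something taken from my current setup.

```python
import json


def flatten_item(item):
    """Turn one CVE_Items entry into a flatter document (hypothetical layout)."""
    cve = item.get("cve", {})
    nodes = item.get("configurations", {}).get("nodes", [])
    return {
        "id": cve.get("CVE_data_meta", {}).get("ID"),
        "publishedDate": item.get("publishedDate"),
        "lastModifiedDate": item.get("lastModifiedDate"),
        # Keep each reference object intact so url/name/tags stay together.
        "references": cve.get("references", {}).get("reference_data", []),
        "description": [
            d.get("value")
            for d in cve.get("description", {}).get("description_data", [])
        ],
        # Collect cpe_match entries from all top-level nodes into one list.
        "cpe_match": [m for node in nodes for m in node.get("cpe_match", [])],
    }


def to_bulk_lines(feed, index_name="cve"):
    """Build an NDJSON body for the _bulk endpoint: action line, then document."""
    lines = []
    for item in feed.get("CVE_Items", []):
        doc = flatten_item(item)
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"
```

The returned string could then be sent to the _bulk endpoint with a Content-Type of application/x-ndjson. Using the CVE ID as the document ID is also an assumption; it would make re-imports overwrite existing entries instead of duplicating them.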
The plan is a search/visualization on properties in the cpe_match array (e.g. vendor), together with an output/aggregation of the related information from the top-level JSON structure or from the references array. Further filtering by properties (tags) should also be possible here, e.g. for CVE_ID: XXX an Exploit and a Patch are given and lastModifiedDate was ...
The search should also be possible in reverse.
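To illustrate the kind of request I have in mind, here is a rough Python sketch of a mapping and a search body. It assumes each CVE entry is indexed as its own document, with cpe_match and references as top-level fields mapped as nested; that layout, the field names below, and the helper build_search are assumptions for illustration only.

```python
# Hypothetical mapping: both arrays are "nested" so that the properties of one
# array element are matched together rather than across elements.
MAPPING = {
    "mappings": {
        "properties": {
            "id": {"type": "keyword"},
            "cpe_match": {
                "type": "nested",
                "properties": {
                    "vendor": {"type": "keyword"},
                    "version": {"type": "keyword"},
                    "cpe23Uri": {"type": "keyword"},
                    "vulnerable": {"type": "boolean"},
                },
            },
            "references": {
                "type": "nested",
                "properties": {
                    "refsource": {"type": "keyword"},
                    "tags": {"type": "keyword"},
                },
            },
        }
    }
}


def build_search(vendor, tag, size=10):
    """Search body: CVEs affecting `vendor` that have a reference tagged `tag`."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {
                        "nested": {
                            "path": "cpe_match",
                            "query": {"term": {"cpe_match.vendor": vendor}},
                        }
                    },
                    {
                        "nested": {
                            "path": "references",
                            "query": {"term": {"references.tags": tag}},
                        }
                    },
                ]
            }
        },
        # Count reference tags across the matching CVEs.
        "aggs": {
            "refs": {
                "nested": {"path": "references"},
                "aggs": {"tags": {"terms": {"field": "references.tags"}}},
            }
        },
    }
```

For example, build_search("debian", "Exploit") would describe "all CVEs for vendor debian that have an Exploit reference". The reverse direction (start from tags, aggregate vendors) would swap the roles of the two nested paths in the aggregation.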