Mapping Shodan.io query JSON results: nested and dynamic mapping

Hi, I'm using some Python to query shodan.io. It returns a reasonably complex JSON document that I'd like to push into Elasticsearch. I've got most of it mapped and working, but there is one field I just can't map correctly.

the rough format:

{
    ... #bunch of fields i have mapped

    'vulns': {
        'CVE-2022-01-02': {
            'verified': false,
            'references': [ <<bunch of web links>> ],
            'summary': <<text>>
        },
        'CVE-2021-02-12': {
            'verified': false,
            'references': [ <<bunch of web links>> ],
            'summary': <<text>>
        },
        'CVE-2019-04-11': {
            'verified': false,
            'references': [ <<bunch of web links>> ],
            'summary': <<text>>
        }
    }
    ...
}

The CVE field name changes from document to document, so should that be a dynamic field? Is this the right approach?
How do I fit the verified, references, and summary fields into this?

[
  {
    "cve-objects": {
      "mapping": {
        "include_in_parent": true,
        "type": "nested"
      },
      "match_mapping_type": "object",
      "match": "CVE-*"
    }
  }
]

Having keys that are created dynamically is not recommended in Elasticsearch: every new CVE ID would add its own set of fields to the mapping, which can lead to mapping explosion.

I would recommend restructuring the document as follows instead to avoid this:

'vulns': [
  {
    'cve_id': 'CVE-2022-01-02',
    'verified' : false,
    'references': [ <<bunch of web links>>],
    'summary': <<text>>
  },
  {
    'cve_id': 'CVE-2021-02-12',
    'verified' : false,
    'references': [ <<bunch of web links>>],
    'summary': <<text>>
  },
  {
    'cve_id': 'CVE-2019-04-11',
    'verified' : false,
    'references': [ <<bunch of web links>>],
    'summary': <<text>>
  }
]
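Since the query is already done in Python, the restructuring above could be applied before indexing. This is only a minimal sketch: the helper name vulns_to_list and the sample values are made up for illustration, and it assumes vulns is a plain dict keyed by CVE ID as in the question.

```python
def vulns_to_list(vulns):
    """Turn {'CVE-...': {...}} into [{'cve_id': 'CVE-...', ...}, ...].

    Hypothetical helper; accepts None or a missing field and returns [].
    """
    return [
        {"cve_id": cve_id, **details}
        for cve_id, details in (vulns or {}).items()
    ]


# Sample input shaped like the question's 'vulns' field (placeholder values).
sample_vulns = {
    "CVE-2022-01-02": {
        "verified": False,
        "references": ["https://example.com/advisory-1"],
        "summary": "placeholder summary",
    },
    "CVE-2021-02-12": {
        "verified": False,
        "references": ["https://example.com/advisory-2"],
        "summary": "placeholder summary",
    },
}

restructured = vulns_to_list(sample_vulns)
for entry in restructured:
    print(entry["cve_id"], entry["verified"])
```

The result would replace the original vulns value in the document before it is sent to Elasticsearch, so the field names stay fixed no matter which CVEs appear.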

That's perfect, but how would I map that? That's the part I haven't figured out.

vulns would be mapped as a nested field, with the fields in the subdocuments mapped based on their content and how you will query them.
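As a rough sketch of what such a mapping could look like, written as Python dicts: the field types below are assumptions (keyword for exact-match IDs and links, text for the free-text summary, boolean for verified), so adjust them to how you actually query. The helper nested_cve_query is a made-up name for illustration.

```python
# Mapping for the restructured 'vulns' field (field types are assumptions).
vulns_mapping = {
    "properties": {
        "vulns": {
            "type": "nested",
            "properties": {
                "cve_id": {"type": "keyword"},      # exact-match lookups
                "verified": {"type": "boolean"},
                "references": {"type": "keyword"},  # array of links
                "summary": {"type": "text"},        # full-text search
            },
        }
    }
}


def nested_cve_query(cve_id):
    """Build a nested query that matches documents containing the given CVE."""
    return {
        "nested": {
            "path": "vulns",
            "query": {"term": {"vulns.cve_id": cve_id}},
        }
    }


# The mapping dict would go into the index-creation request and the query
# dict into a search request, using whichever client you already have.
print(nested_cve_query("CVE-2022-01-02"))
```

Elasticsearch arrays need no special mapping, so references is mapped by the type of its elements. Nested fields have to be searched with a nested query like the one above; a plain term query on vulns.cve_id would not match.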
