Querying an Array of Strings for the 1st, 2nd and 3rd Element


(Martin Wright) #1

Hello elastic community,

I have an array of strings from a spam filter log:

"spam_rules": [
   "KNOWN_SPAM_CONTENT",
   "KNOWN_SPAM_ZIPSIG",
   "KNOWN_SPAM_ZIPSIG_EXT",
   "RDNS_SUSP_SPAM",
   "ZIP_ATTACHED_WSF",
   "RDNS_SUSP_ZIP_ATTACHED",
   "INVOICE_ATTACHMENT",
   "HTML_00_01",
   "HTML_00_10",
   "ARCHIVE_ATTACHED",
   "BODYTEXTP_SIZE_3000_LESS",
   "FROM_SAME_AS_TO_DOMAIN",
   "NO_REAL_NAME",
   "NO_URI_HTTPS",
   "RDNS_GENERIC_POOLED",
   "RDNS_SUSP",
   "RDNS_SUSP_GENERIC",
   "ZIP_ATTACHED"
]

In this example "KNOWN_SPAM_CONTENT" has more weight than "KNOWN_SPAM_ZIPSIG", and so on down the array with the last value having the least weight.

I would like to query the first, second, and third values and visualize each one individually, to see which spam rule was seen most often in the first, second, and third positions.
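To make the goal concrete, it amounts to a separate tally for each array position. Here is a minimal Python sketch of that tally, assuming the log events are already available as dictionaries with a `spam_rules` list. The `docs` sample data and the `top_rules_by_position` helper are made up for illustration and are not part of Elasticsearch or Kibana.

```python
from collections import Counter


def top_rules_by_position(docs, positions=3):
    """Count how often each rule appears at index 0, 1, 2, ... of spam_rules."""
    counters = [Counter() for _ in range(positions)]
    for doc in docs:
        rules = doc.get("spam_rules", [])
        # zip stops at the shorter sequence, so arrays with fewer
        # than `positions` entries are handled without an IndexError.
        for counter, rule in zip(counters, rules):
            counter[rule] += 1
    return counters


# Hypothetical sample events; only the order inside spam_rules matters here.
docs = [
    {"spam_rules": ["KNOWN_SPAM_CONTENT", "KNOWN_SPAM_ZIPSIG", "RDNS_SUSP_SPAM"]},
    {"spam_rules": ["KNOWN_SPAM_CONTENT", "RDNS_SUSP_SPAM"]},
    {"spam_rules": ["INVOICE_ATTACHMENT", "KNOWN_SPAM_ZIPSIG", "ZIP_ATTACHED"]},
]

first, second, third = top_rules_by_position(docs)
print(first.most_common(1))   # most frequent rule in the first position
print(second.most_common(1))  # most frequent rule in the second position
```

Each counter corresponds to one of the three visualizations I am after.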

Upon reading the documentation I saw that multivalue field arrays are "a bag of values".
"However, arrays are indexed—made searchable—as multivalue fields, which are unordered. At search time, you can’t refer to “the first element” or “the last element.” Rather, think of an array as a bag of values."
https://www.elastic.co/guide/en/elasticsearch/guide/master/complex-core-fields.html
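As I understand that passage, two documents whose arrays hold the same rules in a different order look identical at search time. A rough Python model of the idea follows, assuming a "bag of values" can be approximated by a set. That is a simplification which also ignores duplicates, and `as_bag_of_values` is a hypothetical helper, not an Elasticsearch API.

```python
def as_bag_of_values(values):
    """Rough model of a multivalue field: the values survive, positions do not."""
    return frozenset(values)


# Same three rules, different weight order.
doc_a = ["KNOWN_SPAM_CONTENT", "RDNS_SUSP", "ZIP_ATTACHED"]
doc_b = ["ZIP_ATTACHED", "KNOWN_SPAM_CONTENT", "RDNS_SUSP"]

print(doc_a == doc_b)                                      # False: lists keep order
print(as_bag_of_values(doc_a) == as_bag_of_values(doc_b))  # True: order is gone
```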

I have created the following mapping for spam_rules:

>   "type": "text",
>   "index_options": "positions",
>   "fields": {
>     "keyword": {
>       "type": "keyword"

I had considered creating three new fields called spam_rule1, spam_rule2, and spam_rule3 within Logstash. This would be pretty simple to accomplish; however, I would rather pull the data from the array if possible.
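For reference, the logic of that fallback is sketched below in Python. This is not Logstash configuration, only an illustration of the transformation, and it assumes each event is a dictionary with a `spam_rules` list. The `add_positional_rule_fields` helper and the sample `event` are hypothetical.

```python
def add_positional_rule_fields(event, count=3, prefix="spam_rule"):
    """Return a copy of the event with spam_rule1..spam_ruleN taken from the array."""
    enriched = dict(event)
    rules = event.get("spam_rules") or []
    # Slicing caps the loop at `count` and copes with shorter arrays.
    for index, rule in enumerate(rules[:count], start=1):
        enriched[f"{prefix}{index}"] = rule
    return enriched


# Hypothetical event using the first four rules from the array above.
event = {
    "spam_rules": [
        "KNOWN_SPAM_CONTENT",
        "KNOWN_SPAM_ZIPSIG",
        "KNOWN_SPAM_ZIPSIG_EXT",
        "RDNS_SUSP_SPAM",
    ]
}

enriched = add_positional_rule_fields(event)
print(enriched["spam_rule1"], enriched["spam_rule2"], enriched["spam_rule3"])
```

Events with fewer than three rules would simply lack the later fields, which would need to be accounted for in any aggregation on them.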

I am pretty sure I need to write a Painless script to visualize this data. However, I'd first like to know whether this is possible at all.

Logstash, Elasticsearch, and Kibana are all running version 5.1.2-1.

Any suggestions or tips would be awesome!

