Logstash Field Types Not Working Intermittently

Hi guys, I have seen this type of issue once or twice before, but I still can't seem to get it sorted on my side.

I have a field defined as an int in my patterns, which today has thrown the following error (in Kibana):

Visualize: Expected numeric type on field [MS], but got [string]

I understand that Kibana is, for some reason, seeing this value as a string instead of the expected int, hence the error, but without a single change to my patterns or Logstash filters I don't see why this would suddenly change.

Here is the Logstash config the events pass through:

filter {
  if [type] == "swAuditLog" {
    grok {
      patterns_dir => ["/opt/logstash/patterns"]
      match => { "message" => "%{SW_AUDIT}" }
    }

    # Drop any event the pattern failed to parse
    if "_grokparsefailure" in [tags] {
      drop { }
    }

    # Parse the logged timestamp into @timestamp
    date {
      timezone => "Africa/Harare"
      match => ["LoggedDateTime", "YYYY/MM/dd HH:mm:ss"]
      target => "@timestamp"
    }

    mutate {
      remove_tag => [ "beats_input_codec_plain_applied" ]
    }
  }
}

... and the specific (custom) patterns themselves:

SW_MS (?:[0-9]{1,4})
SW_DUR_MS (?:%{SW_MS:MS:int}%{SPACE}(ms))

SW_SEC (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
SW_MIN (?:[0-5]?[0-9])
SW_DUR_SEC (?:%{SW_SEC:SEC:int}%{SPACE}(sec))
SW_DUR_MIN (?:%{SW_MIN:MIN:int}%{SPACE}(min))
SW_DURATION (?:(\[Duration:%{SPACE}%{SW_DUR_MS}\])|(\[Duration:%{SPACE}%{SW_DUR_SEC}(,)%{SW_DUR_MS}\])|(\[Duration:%{SPACE}%{SW_DUR_MIN}(,)%{SW_DUR_MS}\])|(\[Duration:%{SPACE}%{SW_DUR_MIN}(,)%{SW_DUR_SEC}(,)%{SW_DUR_MS}\]))
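
For reference, a sample line from the log that SW_DURATION is meant to match:

    [2016/07/27 12:17:36][AuditLog] [INFO]  [Duration: 113 ms]

which should yield "Duration" => "[Duration: 113 ms]" and "MS" => 113, with MS coming out as an integer thanks to the :int suffix in %{SW_MS:MS:int}.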

I am running Logstash v2.3.4, Elasticsearch v2.3.4, Kibana v4.5.3 and Filebeat v1.2.3, with only this single server logging to my Elastic Stack. Logging levels are at their defaults, not debug.

How can I better find out why this started happening without any config changes?

What does the actual document stored in ES look like? How is that field mapped? Use ES's get mapping API.
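
For example, in Sense, something along the lines of (index and field names adjusted as needed):

    GET filebeat-2016.07.27/_mapping/field/MS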

Thank you for your response, Magnus.

Using the following query in Sense (adjusting the date for each day):

    GET filebeat-2016.07.26/_all/_mapping/field/MS,SEC,MIN

I get the following output for the day before and the day after:

    "MS": {
      "full_name": "MS",
      "mapping": {
        "MS": {
          "type": "long"
        }
      }
    }

... and the following output from the day with the reported issue:

    "MS": {
      "full_name": "MS",
      "mapping": {
        "MS": {
          "type": "string",
          "index": "not_analyzed",
          "ignore_above": 1024
        }
      }
    }

An example of the document with this MS field in it is as follows:

{
  "_index": "filebeat-2016.07.27",
  "_type": "swAuditLog",
  "_id": "AVYr3M9K91aMV7UCrCTG",
  "_version": 1,
  "found": true,
  "_source": {
    "message": "[2016/07/27 12:17:36][AuditLog] [INFO]  [Duration: 113 ms]",
    "@version": "1",
    "@timestamp": "2016-07-27T10:17:36.000Z",
    "beat": {
      "hostname": "Swimmer",
      "name": "Swimmer"
    },
    "type": "swAuditLog",
    "offset": 50347146,
    "input_type": "log",
    "count": 1,
    "fields": null,
    "source": "c:\\Test\\Log\\test.log",
    "host": "Swimmer",
    "LoggedDateTime": "2016/07/27 12:17:36",
    "Date": "2016/07/27",
    "Time": "12:17:36",
    "Duration": "[Duration: 113 ms]",
    "MS": 113,
    "Bytes": 882
  }
}

Please let me know if I have missed any information you asked for.

Hmm. I'm not sure why, but on that problematic day the MS field was apparently mapped as a string. Without inspecting all documents with an MS field in that index it's impossible to tell why, but with dynamic mapping the first document to contain a field in a new daily index determines its type, so a single event where MS came through as a string would have been enough. I suggest you use an index template to explicitly assign mappings to your fields instead of relying on the automatic mapping to do the right thing.
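
As a minimal sketch in Sense, assuming the default filebeat-* index naming (the template name filebeat_durations here is arbitrary; note that a template only applies to indices created after it is put in place, so existing indices keep their current mappings):

    PUT _template/filebeat_durations
    {
      "template": "filebeat-*",
      "mappings": {
        "swAuditLog": {
          "properties": {
            "MS":  { "type": "long" },
            "SEC": { "type": "long" },
            "MIN": { "type": "long" }
          }
        }
      }
    }

You can verify it afterwards with GET _template/filebeat_durations.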

That's fair enough, and thanks again, I appreciate the help.
I will look at getting a predefined template in place to avoid this issue in the future.