How can I make the string field not_analyzed?

sharon.c · November 21, 2015, 7:31pm

I am using Logstash 1.5.1 and Elasticsearch 1.7.3.0. I used the Logstash elasticsearch output to index records from a bunch of CSV files, with my own mapping document in which string fields are set to not_analyzed. I also changed the default Logstash template so that match "*" maps strings as not_analyzed. In Kibana I verified that the string fields are shown as not_analyzed. However, when I use a Kibana bar chart to create buckets on a string field, the string is broken down into tokens.

For example, the Kibana mapping details show the "path" field as not_analyzed, which is what I set.

The value of the "path" field is the path of the CSV file.

Then, when I use a bar chart to bucket on the "path" field, the legend shows the "path" values broken down into tokens. Instead of "/.../testcsvimport/record220k-100_3.csv", I get "testcsvimport", "record220k", "100", "csv"...

I don't want it to be analyzed; I want to keep the whole path as one string. How can I do that?
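The symptom described above is what a terms bucket looks like on an analyzed string field. As a rough illustration only, the following Python sketch uses a plain regex split as a stand-in for the analyzer (the real Lucene standard analyzer tokenizes differently in detail), and the directory prefix `/data` is made up because the real one is elided in the post. It shows why an analyzed field yields fragments while a not_analyzed field yields the whole path:

```python
import re


def index_terms(value, analyzed):
    """Return the terms a string field would contribute to the index.

    analyzed=True  -> crude stand-in for an analyzer: lowercase, then split
                      on anything that is not a letter or digit. This is NOT
                      the real Lucene standard analyzer.
    analyzed=False -> the whole value is kept as a single term.
    """
    if analyzed:
        return re.findall(r"[a-z0-9]+", value.lower())
    return [value]


# Hypothetical path: only the last two segments come from the post above.
path = "/data/testcsvimport/record220k-100_3.csv"

analyzed_terms = index_terms(path, analyzed=True)
exact_terms = index_terms(path, analyzed=False)

print(analyzed_terms)  # fragments, one bucket per fragment
print(exact_terms)     # a single term, one bucket per file
```

A terms aggregation buckets on indexed terms, so it can only return the full path if the full path was indexed as one term.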

I have attached the Logstash .conf file that I used to export to the Elasticsearch index. Please help me.

input {
  file {
    path => "/home/myfolder/testcsvimport/record*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    columns => ["some_column_names"] # note: the "path" field is added by the file input, not by the csv filter
  }

  grok {
    match => { "record_IP" => "%{IP:clientip}" }
  }

  geoip {
    source => "clientip"
  }

  mutate {
    remove_field => [ "message", "host" ]
  }
}

output {
  elasticsearch {
    host => "dev-elkstack:9200"
    protocol => "http"
    index => "mt_joined_record_index"
    template_name => "mt_joined_record_type"
    manage_template => false
  }
}

warkolm · November 21, 2015, 11:07pm

Can you paste/link to your mapping?

sharon.c · November 21, 2015, 11:10pm

PUT /mt_joined_record_index
{
  "mappings": {
    "mt_joined_record_type": {
      "_all": {
        "enabled": false,
        "omit_norms": true
      },
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "@version": {
          "type": "string",
          "index": "not_analyzed"
        },
        "path": {
          "type": "string",
          "index": "not_analyzed"
        },
        "record_time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss || yyyy-MM-dd hh:mm:ss Z"
        },
        "tags": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

sharon.c · November 22, 2015, 12:38am

Kibana already shows the "path" field with analyzed set to "false", as in the picture I posted here.

How come the field is still analyzed in the bar chart? It is not consistent.

tinle · November 22, 2015, 1:36am

Did you reload your field list after making the change? It could be cached.

Tin

sharon.c · November 22, 2015, 1:39am

Every time I run a new experiment, I change the index name and the mapping name to new ones, so I believe each experiment is clean and not affected by the previous ones. I also know that the mapping I created is in effect, because the date type I defined in it is correctly recognized by Kibana.

tinle · November 22, 2015, 2:04am

Sure. Just for grins, would you mind trying the reload fields anyway?

Let's eliminate that.

sharon.c · November 22, 2015, 2:11am

Yes I did that, it still doesn't work.

sharon.c · November 22, 2015, 2:13am

Can you give me an example or a link showing how other people succeeded in making string fields not_analyzed? Thank you very much!

tinle · November 22, 2015, 2:55am

The index mapping you showed above seems to come from a PUT. Could you please post the mapping of the current index you are having problems with, from a GET?

Something like the following command:

curl 'localhost:9200/logstash-YYYY.MM.DD/_mapping?pretty'
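Once the GET output is available, it can be checked mechanically. This is a minimal Python sketch, assuming the Elasticsearch 1.x response shape (index, then "mappings", then type, then "properties") and the 1.x rule that a string field without an explicit "index" setting is analyzed. The sample response is invented for illustration, and the second type name "logs" is hypothetical; it stands for documents that ended up under a different type than the one the mapping was written for.

```python
def analyzed_string_fields(mapping_response):
    """List (index, type, field) for top-level string fields that are analyzed.

    Assumes the Elasticsearch 1.x GET _mapping response shape:
    {index: {"mappings": {type: {"properties": {field: {...}}}}}}
    Nested object fields are not descended into in this sketch.
    """
    hits = []
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            for field, spec in type_body.get("properties", {}).items():
                is_string = spec.get("type") == "string"
                # A missing "index" setting means "analyzed" for 1.x strings.
                if is_string and spec.get("index", "analyzed") == "analyzed":
                    hits.append((index_name, type_name, field))
    return sorted(hits)


# Invented sample response; "logs" is a hypothetical second type.
sample = {
    "mt_joined_record_index": {
        "mappings": {
            "mt_joined_record_type": {
                "properties": {
                    "path": {"type": "string", "index": "not_analyzed"},
                }
            },
            "logs": {
                "properties": {
                    "path": {"type": "string"},
                    "record_time": {"type": "date"},
                }
            },
        }
    }
}

result = analyzed_string_fields(sample)
print(result)
```

If the same field name shows up as analyzed under one type and not_analyzed under another, the documents were probably not indexed under the type the explicit mapping belongs to.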

Marcin_Kubica · November 22, 2015, 2:55am

Sorry, I can't tell what's wrong in your case, @sharon.c. However, I'm using non-analysed fields a lot and have never had this issue.

Deploy your ELK from scratch and try again?

sharon.c · November 23, 2015, 12:07am

I think I solved it by using the raw field. Because I need to do aggregations on that field, I think it is better to just use the raw field instead of making the field itself not_analyzed.
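For readers wondering what "using the raw field" looks like in a request, here is a minimal Python sketch that builds a terms aggregation body. It assumes a not_analyzed sub-field named `path.raw` exists in the index (the post does not show the mapping that created it), and the aggregation name `by_value` is arbitrary.

```python
import json


def terms_agg_body(field, size=10):
    """Build a search body that returns only terms buckets for one field."""
    return {
        "size": 0,  # no hits, only the aggregation result
        "aggs": {
            "by_value": {  # arbitrary aggregation name
                "terms": {"field": field, "size": size}
            }
        },
    }


# "path.raw" is assumed to be a not_analyzed sub-field of "path".
body = terms_agg_body("path.raw")
print(json.dumps(body, indent=2))
```

In Kibana the equivalent is to pick `path.raw` rather than `path` as the field of the Terms bucket.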

chrisribe · September 30, 2016, 3:12pm

Could you post your solution?
I'm having the same issue...
Thanks

chrisribe · September 30, 2016, 6:09pm

Found it. Here is my mapping solution for mysql-* indices.
It creates a raw sub-field for each string field; values longer than 256 characters are not indexed in the raw sub-field.

{
  "template" : "mysql-*",
  "settings" : {
    "index.refresh_interval" : "5s",
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "standard",
          "stopwords" : "_none_"
        }
      }
    }
  },
  "mappings" : {
    "_default_" : {
       "_all" : {"enabled" : true},
       "dynamic_templates" : [ {
         "string_fields" : {
           "match" : "*",
           "match_mapping_type" : "string",
           "mapping" : {
             "type" : "multi_field",
               "fields" : {
                 "{name}" : {"type": "string", "index" : "analyzed", "omit_norms" : true, "index_options" : "docs"},
                 "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
               }
           }
         }
       } ],
       "properties" : {
         "@version": { "type": "string", "index": "not_analyzed" }
       }
    }
  }
}

Reference: see the Logstash index template.
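To make the effect of the template's two sub-fields concrete, here is a small Python sketch. It mimics, very loosely, what would be indexed for one value: the analyzed sub-field is approximated by a lowercase whitespace split (not the real standard analyzer), and the raw sub-field keeps the whole value unless it is longer than `ignore_above` characters. The function name and sample values are illustrative only.

```python
def multi_field_terms(name, value, ignore_above=256):
    """Return the terms indexed under `name` and `name.raw` for one value.

    name      -> crude analyzed stand-in: lowercase and split on whitespace.
    name.raw  -> the whole value as one term, or nothing at all when the
                 value is longer than ignore_above characters.
    """
    return {
        name: value.lower().split(),
        name + ".raw": [value] if len(value) <= ignore_above else [],
    }


short = multi_field_terms("status", "Connection Refused")
too_long = multi_field_terms("query", "x" * 257)

print(short)
print(len(too_long["query.raw"]))
```

The practical consequence is that very long strings silently drop out of aggregations on the raw sub-field, while the analyzed field still contains them.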

Sami_Bensmida · December 13, 2016, 2:31pm

Hello,

I'm using the Talend telasticsearch component for ETL, converting CSV data to JSON in order to load it into Elasticsearch, which means the mapping is generated automatically in the background and I can't see the code.

By default, all string fields are ANALYZED; I want to change them to NOT ANALYZED.

Any ideas, please?

Thanks in advance,
Sami BENSMIDA

Black-Star · January 3, 2017, 6:46am

Here is an answer from Stack Overflow to a similar question:

curl -XPUT localhost:9200/_template/global -d '{
  "template": "*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}'
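The same template can also be registered from a script. The sketch below builds the PUT request with Python's standard library but does not send it; the host `http://localhost:9200` and the template name `global` are taken from the curl command above, and the helper name is made up. The "string" / "not_analyzed" mapping is the syntax of Elasticsearch 1.x and 2.x; from 5.x on, the "keyword" type plays this role.

```python
import json
import urllib.request

# Same body as the curl command above.
template = {
    "template": "*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {
                    "strings": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "string", "index": "not_analyzed"},
                    }
                }
            ]
        }
    },
}


def build_put_template_request(base_url, name, body):
    """Build (but do not send) the PUT /_template/<name> request."""
    return urllib.request.Request(
        url=f"{base_url}/_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_put_template_request("http://localhost:9200", "global", template)
print(req.get_method(), req.full_url)

# To actually send it (needs a reachable cluster):
# urllib.request.urlopen(req)
```

Index templates are applied only when an index is created, so an existing index keeps its old mapping until it is recreated and the data is indexed again.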

venkat_venkat · March 23, 2017, 5:17pm

Hi,

I am facing the same issue. I am trying to change the hostname string field from analyzed to not analyzed, but every time it still comes out as analyzed. Please help me make the hostname field not analyzed.

Please find the template I am using below.

curl -XPUT 'http://localhost:9200/_template/elasticsearchstats' -d '{
  "template": "elasticsearchstats",
  "order": 10,
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true,
        "omit_norms": true
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "float_fields": {
            "match": "*",
            "match_mapping_type": "float",
            "mapping": {
              "type": "float",
              "doc_values": true
            }
          }
        },
        {
          "double_fields": {
            "match": "*",
            "match_mapping_type": "double",
            "mapping": {
              "type": "double",
              "doc_values": true
            }
          }
        },
        {
          "byte_fields": {
            "match": "*",
            "match_mapping_type": "byte",
            "mapping": {
              "type": "byte",
              "doc_values": true
            }
          }
        },
        {
          "short_fields": {
            "match": "*",
            "match_mapping_type": "short",
            "mapping": {
              "type": "short",
              "doc_values": true
            }
          }
        },
        {
          "integer_fields": {
            "match": "*",
            "match_mapping_type": "integer",
            "mapping": {
              "type": "integer",
              "doc_values": true
            }
          }
        },
        {
          "long_fields": {
            "match": "*",
            "match_mapping_type": "long",
            "mapping": {
              "type": "long",
              "doc_values": true
            }
          }
        },
        {
          "date_fields": {
            "match": "*",
            "match_mapping_type": "date",
            "mapping": {
              "type": "date",
              "doc_values": true
            }
          }
        },
        {
          "geo_point_fields": {
            "match": "*",
            "match_mapping_type": "geo_point",
            "mapping": {
              "type": "geo_point",
              "doc_values": true
            }
          }
        }
      ],
      "properties": {
        "@timestamp": {
          "type": "date",
          "doc_values": true
        },
        "@version": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        },
        "clusterstatus": {
          "type": "long"
        },
        "cpupercent": {
          "type": "long"
        },
        "fielddataestimated": {
          "type": "long"
        },
        "fielddatalimit": {
          "type": "long"
        },
        "freedisk": {
          "type": "long"
        },
        "currentstatus": {
          "type": "string",
          "index": "not_analyzed"
        },
        "hostname": {
          "type": "keyword",
          "index": "no"
        },
        "testname": {
          "type": "keyword"
        },
        "freemem": {
          "type": "long"
        },
        "heapold": {
          "type": "long"
        },
        "heapsurvior": {
          "type": "long"
        },
        "heapused": {
          "type": "long"
        },
        "heapyoung": {
          "type": "long"
        },
        "loadaverage": {
          "type": "long"
        },
        "hostname": {
          "type": "string"
        },
        "openfiles": {
          "type": "float"
        },
        "threadcount": {
          "type": "float"
        },
        "type": {
          "type": "string"
        },
        "geoip": {
          "type": "object",
          "dynamic": true,
          "properties": {
            "ip": {
              "type": "ip",
              "doc_values": true
            },
            "location": {
              "type": "geo_point",
              "doc_values": true
            },
            "latitude": {
              "type": "float",
              "doc_values": true
            },
            "longitude": {
              "type": "float",
              "doc_values": true
            }
          }
        }
      }
    }
  }
}'
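One thing that stands out in the template above is that "hostname" appears twice under "properties", once as "keyword" and once as a plain "string". The thread does not confirm that this is the cause, and how Elasticsearch treats duplicate keys depends on the version, so the following is only a sketch of how such duplicates can be spotted. It uses Python's `json` module, which keeps the last occurrence of a repeated key; the JSON fragment is a shortened copy of the relevant entries.

```python
import json


def find_duplicate_keys(json_text):
    """Return keys that occur more than once within the same JSON object."""
    duplicates = []

    def check_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(json_text, object_pairs_hook=check_pairs)
    return duplicates


# Shortened copy of the relevant part of the template above.
fragment = """
{
  "properties": {
    "hostname": {"type": "keyword", "index": "no"},
    "testname": {"type": "keyword"},
    "hostname": {"type": "string"}
  }
}
"""

dups = find_duplicate_keys(fragment)
effective = json.loads(fragment)["properties"]["hostname"]

print(dups)
print(effective)
```

If the parser in use also keeps the last occurrence, the effective mapping for "hostname" would be the plain "string" one, which is analyzed by default in versions that support that type. Removing the second "hostname" entry is a cheap thing to try.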
