Using Pattern Replace to update field

Matthew_Bullock · October 11, 2017, 7:39pm

I am trying to use pattern replace to change a number reference to a domain:

curl -XPUT "http://localhost:9200/timeordered?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.mapping.ignore_malformed": false,
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "keyword",
            "char_filter": [
              "my_char_filter"
            ]
          }
        },
        "char_filter": {
          "my_char_filter": {
            "type": "mapping",
            "mappings": [
              "45941455 => domain.de",
              "45941671 => domain.es",
              "45941287 => domain.com",
              "45941554 => domain-domain.fr",
              "48031837 => domain1.com",
              "13042264 => domain-cloud.com",
              "13042207 => domain3.com",
              "13157590 => domain4.com",
              "13057180 => domain5.com",
              "15076396 => domain6.cl",
              "13133866 => domain7.com",
              "15076060 => domain8.com.ar",
              "15076411 => domain9.com.au",
              "15076303 => domain10.com.co",
              "15076393 => domain11.com.mx",
              "15076408 => domain12.es",
              "15076405 => domain13.fr",
              "15076402 => domain14.jp",
              "13040731 => domain15.com"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "logs": {
      "properties": {
        "EdgeStartTimestamp": {
          "type":   "date",
          "format": "epoch_millis"
        },
        "geoip.location": {
          "type": "geo_point"
        }
      }
    }
  }
}'

However, when I push data via the bulk API, the field is not being converted.

I am guessing I either have to add this to my bulk API format or reference it in the mapping of the field I need converted, which is:

curl -s -XPUT 'http://localhost:9200/timeordered/_mapping/ZoneID?pretty' -d'
{
      "properties": {
        "ZoneID": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
}'

The bulk API action line I have (note: this is a jq filter that processes the file for bulk import):

{"index": {"_index": "timeordered", "_type": "logs", "_id": .ID, "pipeline": "geoip-timeordered"}},

Any advice would be great!

Thanks

abdon · October 13, 2017, 11:45am

I'm assuming it's the ZoneID field you wish to apply this pattern replace filter to?

For starters, your ZoneID is mapped as not analyzed, so it will not get any analyzer applied to it. What you need to do is change "index": "not_analyzed" into "index": "analyzed" in the mapping for ZoneID (or omit the "index" line completely, as "analyzed" is the default).

Next, you need to tell Elasticsearch that you want to apply this my_analyzer analyzer to this field, instead of the default standard analyzer. So, in your mapping for ZoneID you need to add a line "analyzer": "my_analyzer".
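
As a rough sketch of the two steps above, the ZoneID mapping could look something like the following. This assumes Elasticsearch 5.x, where an analyzed string is declared as "text", and it assumes the field belongs to the logs type from your index definition; the variable and function names are only illustrative. Note that the mapping of an existing field cannot be changed in place, so if ZoneID is already mapped as not_analyzed, the index would have to be recreated and the data reindexed.

```shell
# Assumptions: host and index name are taken from the original post;
# ES_URL, ZONEID_MAPPING and put_zoneid_mapping are illustrative names.
ES_URL="http://localhost:9200"

# Mapping body: an analyzed ("text") field that uses my_analyzer.
ZONEID_MAPPING='{
  "properties": {
    "ZoneID": {
      "type": "text",
      "analyzer": "my_analyzer"
    }
  }
}'

# Sends the mapping for the "logs" type to the timeordered index.
put_zoneid_mapping() {
  curl -s -XPUT "$ES_URL/timeordered/_mapping/logs?pretty" \
    -H 'Content-Type: application/json' \
    -d "$ZONEID_MAPPING"
}
```

Calling put_zoneid_mapping once the index exists sends the request; nothing is sent just by defining it.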

Be aware though, by making ZoneID analyzed you may run into memory issues when you aggregate or sort on this field.

Also, be aware that any analysis will not change the _source of your documents. So, Elasticsearch will return you the original number references with the search results. Analysis will only influence the internal values that Elasticsearch will use for queries and aggregations.
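
One way to see what the analyzer does to the indexed value, without touching _source, is the _analyze API. A minimal sketch, assuming the index was created with the my_analyzer definition from your post; the variable and function names are just illustrative:

```shell
# Assumption: same host and index as in the original post.
ES_URL="http://localhost:9200"

# Ask the timeordered index to run my_analyzer over one number reference.
ANALYZE_BODY='{
  "analyzer": "my_analyzer",
  "text": "45941455"
}'

# Returns the tokens that would be indexed for the given text.
analyze_zoneid() {
  curl -s -XPOST "$ES_URL/timeordered/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$ANALYZE_BODY"
}
```

If the character filter is applied, the returned token should be the domain rather than the number, while a search hit for the same document would still show the number in _source.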

If you'd like to change the _source documents themselves, you'd have to use the ingest node instead of a character filter. It looks like you're already doing that, with the "pipeline": "geoip-timeordered" directive in your bulk request. You could do the transformation in that pipeline, if you'd like to change the ZoneID in the _source itself.
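
For the ingest route, one option is a gsub processor per number reference. The sketch below is an assumption about how that could look rather than a tested pipeline: it uses the _simulate API so nothing is stored, includes only two of your replacements, and converts ZoneID to a string first because gsub only works on string fields. The variable and function names are illustrative.

```shell
# Assumption: same host as in the original post.
ES_URL="http://localhost:9200"

# Inline pipeline plus one sample document for the simulate API.
# Only two of the replacements from the question are shown.
SIMULATE_BODY='{
  "pipeline": {
    "description": "Sketch: replace ZoneID number references with domains",
    "processors": [
      { "convert": { "field": "ZoneID", "type": "string" } },
      { "gsub": { "field": "ZoneID", "pattern": "^45941455$", "replacement": "domain.de" } },
      { "gsub": { "field": "ZoneID", "pattern": "^45941671$", "replacement": "domain.es" } }
    ]
  },
  "docs": [
    { "_index": "timeordered", "_id": "1", "_source": { "ZoneID": "45941455" } }
  ]
}'

# Runs the sample document through the inline pipeline without indexing it.
simulate_zoneid_pipeline() {
  curl -s -XPOST "$ES_URL/_ingest/pipeline/_simulate?pretty" \
    -H 'Content-Type: application/json' \
    -d "$SIMULATE_BODY"
}
```

If the simulated document comes back with the domain in ZoneID, the same processors could be added to the processors list of your existing geoip-timeordered pipeline.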

system · November 10, 2017, 11:45am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
