[SOLVED] Question about custom analyzer

Okay, here's an example. It may not be perfect yet, but I think it shows some of the possibilities you have:

I created a test index with two analyzers, one for indexing (using the path_hierarchy tokenizer) and one for querying, plus a mapping for a document type that contains just the ip field:

PUT /ip4test
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "ip_4_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "."
        }
      },
      "filter": {
        "remove_trailing_dot": {
          "type": "pattern_replace",
          "pattern": "\\.$",
          "replacement": ""
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "ip_4_tokenizer"
        },
        "dedot_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "remove_trailing_dot"
          ]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "ip": {
          "type": "string",
          "analyzer": "my_analyzer",
          "search_analyzer": "dedot_keyword"
        }
      }
    }
  }
}

If you now use the _analyze endpoint, you can see how the IP address gets broken up at index time:

curl -XGET 'localhost:9200/ip4test/_analyze?pretty&analyzer=my_analyzer' -d 11.22.33.44

{
  "tokens" : [ {
    "token" : "11",
    "start_offset" : 0,
    "end_offset" : 2,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "11.22",
    "start_offset" : 0,
    "end_offset" : 5,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "11.22.33",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "11.22.33.44",
    "start_offset" : 0,
    "end_offset" : 11,
    "type" : "word",
    "position" : 0
  } ]
}
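
If it helps to reason about the token list above, here is a rough Python sketch of what a path_hierarchy tokenizer with "." as the delimiter produces. It only imitates the output shown above and is not how Elasticsearch implements it; the function name path_hierarchy_tokens is made up for this illustration.

```python
def path_hierarchy_tokens(text, delimiter="."):
    """Return every leading prefix of `text` that ends at a delimiter boundary."""
    parts = text.split(delimiter)
    # Join the first 1, 2, ..., n parts back together with the delimiter.
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


print(path_hierarchy_tokens("11.22.33.44"))
# ['11', '11.22', '11.22.33', '11.22.33.44']
```

So every "octet prefix" of the address ends up in the index as its own term, which is what makes the prefix-style matching further down possible.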

So now enter some docs:

PUT /ip4test/my_type/1
{
  "ip" : "11.4.76.03"
}

PUT /ip4test/my_type/2
{
  "ip" : "11.4.71.04"
}

PUT /ip4test/my_type/3
{
  "ip" : "11.41.71.04"
}

And do some querying:

GET /ip4test/my_type/_search
{
  "query": {
    "match": {
      "ip": "11.4"
    }
  },
  "highlight": {
    "fields": {
      "ip": {}
    }
  }
}

"hits": [
      {
        "_index": "ip4test",
        "_type": "my_type",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "ip": "11.4.71.04"
        },
        "highlight": {
          "ip": [
            "<em>11.4</em>.71.04"
          ]
        }
      },
      {
        "_index": "ip4test",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "ip": "11.4.76.03"
        },
        "highlight": {
          "ip": [
            "<em>11.4</em>.76.03"
          ]
        }
      }
    ]

Note that this query didn't match 11.41.71.04. I'm not sure whether that was your intention. If not, you would have to use prefix queries.
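
To see why 11.41.71.04 is left out, here is a small Python sketch of the matching logic, under the assumption that the query term stays a single keyword token and has to be equal to one of the indexed tokens. The helper names index_tokens and matching_ids are hypothetical, and this is a simulation rather than anything Elasticsearch actually runs.

```python
def index_tokens(ip, delimiter="."):
    """Imitate the index-time analyzer: all prefixes cut at the dots."""
    parts = ip.split(delimiter)
    return {delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)}


# The three test documents from above, keyed by their ids.
docs = {"1": "11.4.76.03", "2": "11.4.71.04", "3": "11.41.71.04"}


def matching_ids(query_term):
    """Return ids of docs whose token set contains the whole query term."""
    return sorted(doc_id for doc_id, ip in docs.items()
                  if query_term in index_tokens(ip))


print(matching_ids("11.4"))  # ['1', '2']
print(matching_ids("11"))    # ['1', '2', '3']
```

Doc 3 is indexed with the tokens 11, 11.41, 11.41.71 and 11.41.71.04, so the term 11.4 never matches any of them exactly.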

Without removing the dot at the end of the query term, the next example would return no results, but thanks to the pattern_replace filter it does:

GET /ip4test/my_type/_search
{
  "query": {
    "match": {
      "ip": "11.4."
    }
  },
  "highlight": {
    "fields": {
      "ip": {}
    }
  }
}

"hits": [
      {
        "_index": "ip4test",
        "_type": "my_type",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "ip": "11.4.71.04"
        },
        "highlight": {
          "ip": [
            "<em>11.4</em>.71.04"
          ]
        }
      },
      {
        "_index": "ip4test",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "ip": "11.4.76.03"
        },
        "highlight": {
          "ip": [
            "<em>11.4</em>.76.03"
          ]
        }
      }
    ]
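
For what it's worth, the effect of the search-side analyzer on the query term can be sketched in Python like this. It assumes the keyword tokenizer keeps the whole input as one token and that the filter simply applies the regex \.$ with an empty replacement; the function name dedot is made up.

```python
import re


def dedot(term):
    """Strip a single trailing dot, like the remove_trailing_dot filter's pattern."""
    return re.sub(r"\.$", "", term)


print(dedot("11.4."))  # 11.4
print(dedot("11.4"))   # 11.4
```

Both spellings of the query end up as the same term 11.4, which is why the two searches above return the same hits.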

Depending on your exact use case, you might have to modify this a bit. Hope that helps.