Limit ngram tokenizer

POST _analyze
{
  "tokenizer": {
    "type": "ngram",
    "min_gram": 3,
    "max_gram": 50
  },
  "text": "12345678"
}

This returns:

{
  "tokens": [
    {
      "token": "123",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 0
    },
    {
      "token": "1234",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 1
    },
    {
      "token": "12345",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 2
    },
    {
      "token": "123456",
      "start_offset": 0,
      "end_offset": 6,
      "type": "word",
      "position": 3
    },
    {
      "token": "1234567",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 4
    },
    {
      "token": "12345678",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 5
    },
    {
      "token": "234",
      "start_offset": 1,
      "end_offset": 4,
      "type": "word",
      "position": 6
    },
    {
      "token": "2345",
      "start_offset": 1,
      "end_offset": 5,
      "type": "word",
      "position": 7
    },
    {
      "token": "23456",
      "start_offset": 1,
      "end_offset": 6,
      "type": "word",
      "position": 8
    },
    {
      "token": "234567",
      "start_offset": 1,
      "end_offset": 7,
      "type": "word",
      "position": 9
    },
    {
      "token": "2345678",
      "start_offset": 1,
      "end_offset": 8,
      "type": "word",
      "position": 10
    },
    {
      "token": "345",
      "start_offset": 2,
      "end_offset": 5,
      "type": "word",
      "position": 11
    },
    {
      "token": "3456",
      "start_offset": 2,
      "end_offset": 6,
      "type": "word",
      "position": 12
    },
    {
      "token": "34567",
      "start_offset": 2,
      "end_offset": 7,
      "type": "word",
      "position": 13
    },
    {
      "token": "345678",
      "start_offset": 2,
      "end_offset": 8,
      "type": "word",
      "position": 14
    },
    {
      "token": "456",
      "start_offset": 3,
      "end_offset": 6,
      "type": "word",
      "position": 15
    },
    {
      "token": "4567",
      "start_offset": 3,
      "end_offset": 7,
      "type": "word",
      "position": 16
    },
    {
      "token": "45678",
      "start_offset": 3,
      "end_offset": 8,
      "type": "word",
      "position": 17
    },
    {
      "token": "567",
      "start_offset": 4,
      "end_offset": 7,
      "type": "word",
      "position": 18
    },
    {
      "token": "5678",
      "start_offset": 4,
      "end_offset": 8,
      "type": "word",
      "position": 19
    },
    {
      "token": "678",
      "start_offset": 5,
      "end_offset": 8,
      "type": "word",
      "position": 20
    }
  ]
}

I would like to stop the ngram tokenizer once it has produced the last n-gram starting at "4" (start_offset 3). How can I achieve that?

That is, I would like the tokenizer to stop after emitting this token:

{
  "token": "45678",
  "start_offset": 3,
  "end_offset": 8,
  "type": "word",
  "position": 17
}
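
I was wondering whether a predicate_token_filter could at least drop the unwanted n-grams after the fact. A sketch of what I mean (untested; the cutoff 3 is hard-coded from the start_offset of "45678", and all n-grams would still be generated before being filtered, so this doesn't truly stop the tokenizer early):

POST _analyze
{
  "tokenizer": {
    "type": "ngram",
    "min_gram": 3,
    "max_gram": 50
  },
  "filter": [
    {
      "type": "predicate_token_filter",
      "script": {
        "source": "token.startOffset <= 3"
      }
    }
  ],
  "text": "12345678"
}

Is there a way to do this in the tokenizer itself, or is post-filtering the only option?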
