Highlight the result of tokenization when viewing full text

I have documents whose 'notes' field contains duplicate sentences. Using a custom analyzer, I was able to tokenize this field into sentences and keep only the first occurrence of each sentence, together with its offsets.
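To make the intent concrete, here is a rough Python approximation of what I believe the analysis chain in this post does: split on the `mimic_tokenizer` pattern, then drop repeated sentences the way the `unique` filter would. This is not Elasticsearch code; `unique_sentences` is a made-up helper for illustration, and the real analyzer may differ in edge cases.

```python
import re

# Same separator pattern as mimic_tokenizer: a period followed by whitespace, or newlines.
SEPARATOR = re.compile(r"(\.\s|\n+)")


def unique_sentences(text):
    """Return (sentence, start_offset, end_offset) for the first occurrence of each sentence.

    Approximates a pattern tokenizer in split mode (group -1) followed by a
    unique filter with only_on_same_position=false. Hypothetical helper.
    """
    seen = set()
    result = []
    start = 0
    boundaries = [(m.start(), m.end()) for m in SEPARATOR.finditer(text)]
    # Sentinel boundary so the text after the last separator is emitted too.
    boundaries.append((len(text), len(text)))
    for sep_start, sep_end in boundaries:
        sentence = text[start:sep_start]
        # Skip empty tokens and sentences that were already seen.
        if sentence and sentence not in seen:
            seen.add(sentence)
            result.append((sentence, start, sep_start))
        start = sep_end
    return result


if __name__ == "__main__":
    for sentence, start, end in unique_sentences("A b. C d.\nA b. E"):
        print(start, end, repr(sentence))
```

The second "A b" is dropped, so only the first occurrence and its offsets survive, which is the set of spans I want to highlight.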

When the user views the full "notes" field, I would like to highlight these original sentences. It seems this should be possible since the offsets are stored, but I can't figure out how to implement it.
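One client-side approach I can imagine, as a minimal sketch only: fetch the term vectors for the document (for example with the `_termvectors` API on the `notes.my_hash` field) and wrap each stored span in highlight tags. The function name `highlight_by_offsets` and the tag defaults are my own inventions, and I am assuming the offsets can be used directly as Python string indices, which should hold as long as the text has no characters outside the Basic Multilingual Plane.

```python
def highlight_by_offsets(text, termvectors_response, field="notes.my_hash",
                         pre="<em>", post="</em>"):
    """Wrap every token span stored for `field` in pre/post tags.

    `termvectors_response` is assumed to have the shape of a term vectors
    response: term_vectors -> field -> terms -> term -> tokens, where each
    token carries start_offset and end_offset. Hypothetical helper.
    """
    terms = termvectors_response["term_vectors"][field]["terms"]
    # Collect all (start, end) spans and process them left to right.
    spans = sorted(
        (tok["start_offset"], tok["end_offset"])
        for info in terms.values()
        for tok in info["tokens"]
    )
    pieces = []
    cursor = 0
    for start, end in spans:
        if start < cursor:
            # Skip spans that overlap one already emitted.
            continue
        pieces.append(text[cursor:start])
        pieces.append(pre + text[start:end] + post)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    sample_text = "A b. C d.\nA b. E"
    sample_response = {
        "term_vectors": {
            "notes.my_hash": {
                "terms": {
                    "A b": {"tokens": [{"position": 0, "start_offset": 0, "end_offset": 3}]},
                    "C d": {"tokens": [{"position": 1, "start_offset": 5, "end_offset": 8}]},
                }
            }
        }
    }
    print(highlight_by_offsets(sample_text, sample_response))
```

If the result is rendered as HTML, the text between the tags would also need escaping, which this sketch leaves out.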

Any input on this matter is greatly appreciated. Thank you.

// PUT mimic_dat
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "tokenizer": {
        "mimic_tokenizer": {
          "type": "pattern",
          "pattern": """(\.\s|\n+)""",
          "group": -1
        }
      },
      "filter": {
        "unique_mimic": {
          "type": "unique",
          "only_on_same_position": false
        }
      },
      "analyzer": {
        "mimic_hash_analyzer": {
          "type": "custom",
          "tokenizer": "mimic_tokenizer",
          "filter": [
            "unique_mimic"
          ]
        }
      }
    }
  },
  "mappings": {
    "mimic_type": {
      "properties": {
        "subject_id": {
          "type": "keyword"
        },
        "notes": {
          "type": "text",
          "fielddata": true,
          "fields": {
            "my_hash": {
              "type": "text",
              "analyzer": "mimic_hash_analyzer",
              "fielddata": true,
              "term_vector": "with_positions_offsets",
              "store": true
            }
          }
        }
      }
    }
  }
}

// PUT mimic_dat/mimic_type/4
{
  "notes": """
Past History: Chronic xx which lead to; Ca.

Review of systems:    Cardiac,   SR.
O2: sats on room air 100%.  

ID:  No active issues, temp 99.3 PO.

Review of systems:    Cardiac,   SR.

ID:  No active issues, temp 99.3 PO. 
"""
}


© 2018. All Rights Reserved - Elasticsearch

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.