These are the mapping and settings of my index:
{
  "blogs_fixed2": {
    "aliases": {},
    "mappings": {
      "_meta": {
        "created_by": "Sheereen Hamza KV"
      },
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "authors": {
          "properties": {
            "company": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "first_name": {
              "type": "keyword"
            },
            "full_name": {
              "type": "text"
            },
            "job_title": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "last_name": {
              "type": "keyword"
            },
            "uid": {
              "type": "object",
              "enabled": false
            }
          }
        },
        "category": {
          "type": "keyword"
        },
        "category_title": {
          "properties": {
            "title": {
              "type": "keyword"
            },
            "uid": {
              "type": "keyword"
            }
          }
        },
        "content": {
          "type": "text",
          "analyzer": "content_analyzer"
        },
        "locale": {
          "type": "keyword"
        },
        "publish_date": {
          "type": "date",
          "format": "iso8601"
        },
        "search_tags": {
          "type": "keyword",
          "doc_values": false
        },
        "tags": {
          "properties": {
            "elastic_stack": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "industry": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "level": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "product": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "tags": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "topic": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "use_case": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            },
            "use_cases": {
              "type": "keyword",
              "copy_to": [
                "search_tags"
              ]
            }
          }
        },
        "title": {
          "type": "text"
        },
        "url": {
          "type": "keyword"
        }
      }
    },
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "blogs_fixed2",
        "creation_date": "1684088852379",
        "analysis": {
          "analyzer": {
            "content_analyzer": {
              "filter": [
                "lowercase"
              ],
              "char_filter": [
                "html_strip"
              ],
              "tokenizer": "standard"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "hnN5EsyaRR6TLa0oUJHQYg",
        "version": {
          "created": "8070099"
        }
      }
    }
  }
}
When I run a search against this index:
GET blogs_fixed2/_search
{
  "size": 1,
  "_source": false,
  "fields": [
    "content"
  ]
}
I get the result below, in which the content value does not have the analyzer applied to it (the HTML tags are still present):
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4666,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "blogs_fixed2",
        "_id": "aieZmIcBbskhfV7CHLs1",
        "_score": 1,
        "fields": {
          "content": [
            """<p>Pour capitalisfaudra probablement veiller à ce qu'Elasticsearch reste synchronisé avec les données porte quel SGBDR.
</p>
<h2>Configuration système</h2>
<p>Pour les besoins de cet articlents :
</p>
<ul>
<li><a href="https://dev.mysql.com/">MySQL</a> : 8.0.16.</li>
<li><a href="/guide/en/elasticsearch/reference/7.1/index.html">Elasticsearch</a> : 7.1.1</li>
<li><a href="/guide/en/logstash/7.1/introduction.html">Logstash</a> : 7.1.1</li>
<li><a href="https://www.java.com/en/">Java</a> : 1.8.0_162-b12</li>
<li><a href="/guide/en/logstash/7.1/plugins-inputs-jdbc.html">Plug-in d'entrée JDBC</a> : v4.3.13</li>
<li><a href="https://dev.mysql.com/downloads/connector/j/">Connecteur JDBC</a> : Connector/J 8.0.16</li>
</ul>"""
          ]
        }
      }
    ]
  }
}
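For comparison, the tokens that content_analyzer produces can be inspected directly with the _analyze API. This is only a minimal sketch: the text value is a short fragment copied from the hit above, used purely as sample input.

```
GET blogs_fixed2/_analyze
{
  "analyzer": "content_analyzer",
  "text": "<h2>Configuration système</h2>"
}
```

This request reports what the analyzer emits for the given input, independently of what a search returns in fields.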
I believe the analyzer should be applied by default, right? Also, if I run a terms aggregation on the content field, I would expect it to bucket the individual tokens in the field.
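The terms aggregation I have in mind would look roughly like the sketch below. The aggregation name content_tokens is a hypothetical label chosen for illustration. As far as I understand, text fields normally need fielddata enabled in the mapping before they can be aggregated on, so this request may be rejected against the mapping shown above.

```
GET blogs_fixed2/_search
{
  "size": 0,
  "aggs": {
    "content_tokens": {
      "terms": {
        "field": "content"
      }
    }
  }
}
```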