We are using ManifoldCF to crawl web pages and then index them in
Elasticsearch.
Is there a way to get only the few lines that contain the searched keyword in
the response to an Elasticsearch query, instead of the whole content, the way
Google search shows snippets?
The solution we are trying is based on these references:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-termvectors.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html
We are trying a mapping like this:
{
  "mappings": {
    "test": {
      "properties": {
        "file": {
          "type": "attachment",
          "path": "full",
          "fields": {
            "_content_type": {
              "type": "string",
              "store": true
            },
            "_name": {
              "type": "string",
              "store": true
            },
            "content": {
              "type": "string",
              "term_vector": "with_positions_offsets_payloads",
              "store": true,
              "index_analyzer": "fulltext_analyzer"
            }
          },
          "store": true,
          "term_vector": "with_positions_offsets_payloads"
        }
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    },
    "analysis": {
      "analyzer": {
        "fulltext_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "type_as_payload"
          ]
        }
      }
    }
  }
}
and then query like:
{
  "query": {
    "match": {
      "file": "CROWLEY"
    }
  },
  "highlight": {
    "fields": {
      "file": {"fragment_size": 150, "number_of_fragments": 3}
    }
  }
}
But we don't get any highlight section in the response; instead we get the
whole content back.
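To make clear what we are hoping for, here is a minimal Python sketch of how we would like to read the snippets out of a search response. It assumes the usual response layout in which each hit carries a `highlight` object mapping a field name to a list of fragments, with matches wrapped in the default `<em>` tags. The helper names `build_highlight_query` and `extract_snippets` and the sample response are hypothetical, made up for illustration; this is not output from our cluster.

```python
def build_highlight_query(keyword, field="file",
                          fragment_size=150, number_of_fragments=3):
    """Build the same request body as the query in this post."""
    return {
        "query": {"match": {field: keyword}},
        "highlight": {
            "fields": {
                field: {
                    "fragment_size": fragment_size,
                    "number_of_fragments": number_of_fragments,
                }
            }
        },
    }


def extract_snippets(response, field="file"):
    """Map each hit's _id to its highlight fragments for `field`.

    A hit with no "highlight" section (which is what we currently see)
    maps to an empty list.
    """
    snippets = {}
    for hit in response.get("hits", {}).get("hits", []):
        snippets[hit["_id"]] = hit.get("highlight", {}).get(field, [])
    return snippets


# Hypothetical response: hit "1" shows what we would like to get back,
# hit "2" shows a hit that comes back without any highlight section.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"file": ["... <em>CROWLEY</em> ..."]}},
            {"_id": "2"},
        ]
    }
}

if __name__ == "__main__":
    print(build_highlight_query("CROWLEY"))
    print(extract_snippets(sample_response))
```

At the moment every hit we get back looks like hit "2" in the sample, so the helper would return only empty lists.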
Any help is appreciated.