Hi all,
I'm using elasticsearch-1.2.1 to index the content of .xml and .jsp
files.
This is how my index is analyzed:
"settings": {
  "analysis": {
    "filter": {
      "word_delimiter": {
        "type": "word_delimiter",
        "preserve_original": true,
        "split_on_case_change": false,
        "stem_english_possessive": false,
        "type_table": [
          "# => ALPHA",
          "@ => ALPHA",
          "$ => ALPHA",
          "& => ALPHA",
          "? => ALPHA",
          "= => ALPHA"
        ]
      }
    },
    "analyzer": {
      "custom_analyzer": {
        "type": "custom",
        "tokenizer": "whitespace",
        "filter": ["word_delimiter", "lowercase"]
      }
    }
  }
}
One of the lines in the file is:
<%@ page import="java.util.Vector" %>
When I search the index for import="java.* I get the expected result.
But when I search for the keyword "import", I don't get any results.
Analyzing with the kopf plugin, I found that my content
import="java.util.Vector" was indexed as the tokens
import="java.util.Vector", import=, java, util and vector.
What I want is for import= to be indexed as it is and also as import,
so that both of my searches match.
I've also tried the other option, i.e. without the type_table. In that case
my search results are reversed: the keyword search for "import" works, but
the search for import="java.* doesn't.
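
To make the two token lists above easier to compare, here is a rough Python sketch that mimics what I believe the filter chain is doing to that one whitespace token. It is only a toy model under my own assumptions, not Lucene's actual word_delimiter implementation: it treats letters, digits and any characters remapped to ALPHA as word characters, splits on everything else, keeps the original token first, and lowercases at the end. The names simulate_word_delimiter and extra_alpha are made up for illustration.

```python
import re

# Characters that the type_table in my settings remaps to ALPHA.
TYPE_TABLE_ALPHA = "#@$&?="


def simulate_word_delimiter(token, extra_alpha=""):
    """Toy approximation of word_delimiter (preserve_original=true) + lowercase.

    Assumption: every character that is not a letter, a digit, or listed in
    extra_alpha acts as a delimiter. Case-change and numeric splitting are
    ignored because they do not matter for this example.
    """
    word_chars = "A-Za-z0-9" + re.escape(extra_alpha)
    parts = re.findall("[" + word_chars + "]+", token)
    # preserve_original: keep the unsplit token, then the sub-words.
    tokens = [token] + [p for p in parts if p != token]
    return [t.lower() for t in tokens]


line_token = 'import="java.util.Vector"'

# With the type_table, "=" counts as a letter, so "import=" stays in one piece.
print(simulate_word_delimiter(line_token, TYPE_TABLE_ALPHA))
# ['import="java.util.vector"', 'import=', 'java', 'util', 'vector']

# Without the type_table, "=" is a delimiter, so plain "import" is produced.
print(simulate_word_delimiter(line_token))
# ['import="java.util.vector"', 'import', 'java', 'util', 'vector']
```

In this model neither configuration yields both import= and import for the same input, which is the combination I'm after.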
Does anybody have any idea how the index should be analyzed to get the
tokens I want?