Searching with punctuation


(Caroline) #1

Hi,

Our search works well until it comes to punctuation. If a query doesn't use the same punctuation as our data, it returns zero results. For example, our data contains "Corinna's Cause", so searching for exactly that (with the apostrophe) returns good, relevant results. However, searching without the apostrophe, "Corinnas Cause", returns zero results. What do we need to do to get around this?

Thank you for any help.


(Xavier Facq) #2

Hi,

Can you give us the mapping of your fields and a sample query?

bye,
Xavier


(Caroline) #3

Hi Xavierfacq,

Here is a query sample for the Corinna's Cause search: localhost:9200/ab/_search?q=Title:corrina's cause*&size=500

The title field (book title) is mapped to our alias field that is used to find search keywords. After searching for the keyword in the title field, we search these fields in order: Author, EAN, ISBN and Publisher.
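
The sample query contains an apostrophe, a space and a wildcard, which are safer to send percent-encoded. A minimal Python sketch of building that URL follows; the helper name `build_search_url` is hypothetical, and the host and index are simply copied from the query above.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Host and index as they appear in the sample query above.
BASE_URL = "http://localhost:9200/ab/_search"

def build_search_url(query, size=500):
    """Return a _search URL with the query text percent-encoded."""
    return BASE_URL + "?" + urlencode({"q": query, "size": size})

# Query text exactly as typed in the sample above.
url = build_search_url("Title:corrina's cause*")

# Decoding the URL again gives back the text that was typed.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["q"][0])
```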

Thank you.


(Xavier Facq) #4

Hi, can you provide the result of:

localhost:9200/ab/_mapping?pretty=

(Jörg Prante) #5

Maybe a typo? You searched for "corrina's cause" but the indexed title is "corinna's cause".

To handle these situations, you should add spellcheck suggestions: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html
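
As a rough illustration of the suggester idea, the sketch below builds a term suggester request body in Python. The suggestion name `title-suggestion` is hypothetical, and `Title` is assumed to be the field to draw suggestions from; how well this copes with punctuation depends on how that field is analyzed.

```python
import json

# Hypothetical suggestion name; "Title" is the field from the sample query.
suggest_body = {
    "suggest": {
        "title-suggestion": {
            "text": "corrina's cause",      # the misspelled text as typed
            "term": {"field": "Title"},     # term suggester on the Title field
        }
    }
}

# This JSON would be sent as the body of a request to the index's _search endpoint.
print(json.dumps(suggest_body, indent=2))
```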


(Caroline) #6

Hi Xavier,

Sure, here are the results:

{
  "ab" : {
    "mappings" : {
      "items" : {
        "properties" : {
          "AliasTitle" : {
            "type" : "string"
          },
          "Author" : {
            "type" : "string"
          },
          "AuthorLinks" : {
            "type" : "string"
          },
          "DiscoveryID" : {
            "type" : "string"
          },
          "DiscoveryName" : {
            "type" : "string"
          },
          "EAN" : {
            "type" : "string"
          },
          "ISBN" : {
            "type" : "string"
          },
          "ImageUrl" : {
            "type" : "string"
          },
          "ProductID" : {
            "type" : "string"
          },
          "ProductType" : {
            "type" : "integer"
          },
          "Publisher" : {
            "type" : "string"
          },
          "SalesCount" : {
            "type" : "integer"
          },
          "StockCount" : {
            "type" : "integer"
          },
          "Title" : {
            "type" : "string"
          }
        }
      },
      "item" : {
        "properties" : {
          "AliasTitle" : {
            "type" : "string"
          },
          "Author" : {
            "type" : "string"
          },
          "AuthorLinks" : {
            "type" : "string"
          },
          "DiscoveryID" : {
            "type" : "string"
          },
          "DiscoveryName" : {
            "type" : "string"
          },
          "EAN" : {
            "type" : "string"
          },
          "ISBN" : {
            "type" : "string"
          },
          "ImageURL" : {
            "type" : "string"
          },
          "ProductID" : {
            "type" : "string"
          },
          "ProductType" : {
            "type" : "integer"
          },
          "Publisher" : {
            "type" : "string"
          },
          "SalesCount" : {
            "type" : "integer"
          },
          "StockCount" : {
            "type" : "integer"
          },
          "Title" : {
            "type" : "string"
          }
        }
      }
    }
  }
}


(Caroline) #7

Hi jprante, yes, the suggester could do the job if it works with punctuation. I'll check it out.


(Xavier Facq) #8

"Corinna's Cause" cannot be equal to "Corinnas Cause", unless you have a special analyzer.

Does it work with the term "Corinna Cause", without the "s" and the apostrophe?

localhost:9200/ab/_search?q=Title:corrina+cause&size=500
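
To illustrate why the two spellings do not match, here is a simplified Python simulation. It is not Elasticsearch code: the regular expression is only a rough stand-in for a standard-style tokenizer that keeps an apostrophe inside a word, and all function names are hypothetical.

```python
import re

# Rough stand-in for a standard-style tokenizer: runs of letters and digits,
# with apostrophes kept when they sit inside a word.
TOKEN = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")

def analyze_default(text):
    """Lowercase and tokenize, keeping in-word apostrophes."""
    return TOKEN.findall(text.lower())

def analyze_no_apostrophe(text):
    """Same as above, but delete apostrophes before tokenizing."""
    return TOKEN.findall(text.lower().replace("'", ""))

def all_terms_match(analyze, indexed_text, query_text):
    """True if every query token also appears among the indexed tokens."""
    return set(analyze(query_text)) <= set(analyze(indexed_text))

indexed = "Corinna's Cause"
query = "Corinnas Cause"

print(all_terms_match(analyze_default, indexed, query))
print(all_terms_match(analyze_no_apostrophe, indexed, query))
```

In this simulation the default analysis keeps "corinna's" as a single token, so "corinnas" finds nothing, while deleting the apostrophe on both sides makes the tokens identical. Note that deleting the apostrophe alone would still not make "Corinna" match; removing the possessive "s" is a separate step.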

(Caroline) #9

Sorry for the late response, Christmas got in the way!

No, it doesn't work if I search "Corinna Cause". It has to be exactly the same as our data, so it only works when I search with the correct punctuation. If I search "Cause Joanna", however, it works: it identifies "cause" in our title field and "Joanna" in our author field and shows the result. If I search "corinna joanna" I get nothing. The punctuation is causing the problem with the search results. Ideally we need "Corinna's Cause" to be equal to "Corinnas Cause", but you say this can't be done without a special analyzer? What do you mean by this, please?

Thank you.


(Xavier Facq) #10

What I understand is that your query is applied to the field "Title" of the document types "items" and "item" in your index "ab". Those fields seem to be plain strings, and no specific analyzer seems to be applied.

So, the questions are:

1°/ Does a search with the word "cause" work?

2°/ Does a search with the word "Joanna" work?

3°/ Does a search with the word "corinna" work?

4°/ Does a search with the word "corrina" work?
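
For context on what a "special analyzer" could look like, the sketch below builds index settings in Python with a custom analyzer that deletes apostrophes before tokenizing. The names `strip_apostrophes` and `title_no_apostrophe` are hypothetical, and the mapping keeps the "string" type and the "item" document type shown in the mapping above (newer Elasticsearch versions use "text" and have no mapping types). It assumes a new index is created with these settings and the data is reindexed, since the analyzer of an existing field cannot be changed in place.

```python
import json

# Hypothetical names: "strip_apostrophes" and "title_no_apostrophe" are
# illustrative and do not come from the thread.
index_body = {
    "settings": {
        "analysis": {
            "char_filter": {
                # Deletes straight apostrophes before the text is tokenized.
                "strip_apostrophes": {
                    "type": "pattern_replace",
                    "pattern": "'",
                    "replacement": "",
                }
            },
            "analyzer": {
                "title_no_apostrophe": {
                    "type": "custom",
                    "char_filter": ["strip_apostrophes"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "item": {
            "properties": {
                "Title": {"type": "string", "analyzer": "title_no_apostrophe"}
            }
        }
    },
}

# This JSON would be the body of the request that creates the new index.
print(json.dumps(index_body, indent=2))
```

Because the same analyzer would run at index time and at search time, both "Corinna's Cause" and "Corinnas Cause" would be reduced to the tokens "corinnas" and "cause".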

