Search text wrapped in double quotes


(Hector Sanchez-Pajares) #1

Hello, I would like to know the best way to search for hits
that contain some text wrapped in double quotes.

As an example, this is one of the items I have:

    "_index": "posts",
    "_type": "post",
    "_id": "2915129",
    "_score": 1,
    "fields": {
      "text": [
        "\"Some times I'm not even sure I believe the things I can see.\"-John"
      ]
    }

Right now I'm trying to use this query:

    POST /posts/_search
    {
      "query": {
        "query_string": {
          "query": "text:/(?:.*(\"[^\"]+\")).+/"
        }
      }
    }

But I get the following error: IllegalArgumentException[expected '"' at
position 17]

I have tried different solutions with no success. I know it is tricky
because of the double quotes: 1. they are reserved characters inside a
regex; 2. they have to be escaped because the query is JSON; 3. are the
double quotes escaped in Elasticsearch's storage? 4. should I look for an
escaped quote instead of just "?
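The escaping layers listed above can be separated by building the request body in code rather than by hand. The following is only a sketch: it assumes that an unescaped " starts a string literal in Lucene's regex syntax (which would explain the "expected '"'" error), so each literal quote is written as \" before JSON encoding adds its own escapes. The simplified pattern is illustrative, and it can only match if the indexed terms still contain the quotes.

```python
import json

# Layer 1: the regex as the Lucene regex parser should receive it.
# Assumption: an unescaped " opens a string literal in Lucene's regex
# syntax, so every literal quote is written as \" here.
lucene_regex = r'.*\"[^\"]+\".*'

# Layer 2: in query_string, a regex sits between slashes after the field name.
query_string = "text:/" + lucene_regex + "/"

# Layer 3: json.dumps applies the JSON escaping (\ becomes \\, " becomes \").
body = {"query": {"query_string": {"query": query_string}}}
encoded = json.dumps(body)
print(encoded)
```

Printing the encoded body shows how many backslashes end up in front of each quote once both layers are applied, which is the part that is easy to get wrong when typing the request manually.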

Actually I'm kind of lost, and I think I have tried almost all
possibilities. Before this I was doing it against a MySQL database, using
this regex:

    text.match(/"(?:[^"\\]|\\.)*"/)

but it doesn't work either.
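A common form of this quoted-string pattern, with its backslash escapes written out, can be checked outside Elasticsearch first. The sketch below uses Python's re module on the sample text from this post; the capture group is an addition for illustration, so that only the text between the quotes is returned.

```python
import re

# A double-quoted span; backslash-escaped characters are allowed inside it.
# The outer parentheses (added here for illustration) capture the inner text.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')

text = '"Some times I\'m not even sure I believe the things I can see."-John'
matches = QUOTED.findall(text)
print(matches)
```

This only confirms the pattern itself; whether Elasticsearch can match it depends on how the field was analyzed at index time.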

Thanks for your help!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ab192a94-736f-42e4-8eae-118d9656321d%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly-2) #2

You'd probably first need to index that field as not_analyzed, or with an
analyzer that does not strip the quotes. Then use the match query, for
example:

    {
      "query": {
        "match": {
          "text": "\"hello world\""
        }
      }
    }
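As a rough sketch of that suggestion, the mapping and the query can be built as Python dictionaries and serialized with json.dumps, which takes care of escaping the quotes. The mapping shape is an assumption based on the string/not_analyzed syntax of Elasticsearch at the time of this thread; the index and type names come from the original post.

```python
import json

# Assumption: Elasticsearch 1.x-style mapping for index "posts", type "post".
# "not_analyzed" keeps the whole field value, quotes included, as one term.
mapping = {
    "mappings": {
        "post": {
            "properties": {
                "text": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}

# The quotes are part of the search text; json.dumps escapes them as \".
query = {"query": {"match": {"text": '"hello world"'}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(query))
```

Note that with a not_analyzed field the entire value is a single term, so this match query only finds documents whose text is exactly "hello world" including the quotes.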



(Jörg Prante) #3

Do not index quotation marks, do not use query_string, do not use regexes.
Just use match_phrase to search phrases.

Jörg
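A minimal sketch of that advice: strip the surrounding quotation marks from the user's input and send the rest as a match_phrase query. The helper name phrase_query is hypothetical and only illustrates the shape of the request body.

```python
def phrase_query(field, user_input):
    """Build a match_phrase body, dropping any surrounding double quotes."""
    # Hypothetical helper: the quotes are treated as query syntax, not content.
    phrase = user_input.strip().strip('"')
    return {"query": {"match_phrase": {field: phrase}}}


body = phrase_query("text", '"believe the things"')
print(body)
```

This finds documents containing a known phrase; it does not by itself find every document that contains some quoted text.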

On Thu, Mar 6, 2014 at 6:25 PM, Hector Sanchez-Pajares <
hectorlovestodevelop@gmail.com> wrote:

Right now I'm trying to use this query

    POST /posts/_search
    {
      "query": {
        "query_string": {
          "query": "text:/(?:.*(\"[^\"]+\")).+/"
        }
      }
    }



(Konstantina Lazaridou) #4

Hey Jörg,

How can you use match_phrase to find all available phrases? The goal is not to find specific phrases but any text inside quotation marks. Could you be more specific, please? I know it's been a long time since the last answer in this topic, but I could use some help :)

This:

    .setQuery(QueryBuilders.matchPhraseQuery("body", "\"(.*?)\""))

returns nothing for me.

Thank you in advance for your time,

Konstantina

