PHP client: Check input before indexing

syiannop · February 12, 2018, 4:29pm

I am trying to use the PHP client to check if the user's input already exists before indexing it. Here is the code I tried.

My idea was:

Check if the variables are set (I have four required fields: titel, thema, type and description)
If yes, search in the ES index if they already exist
If they exist, show a message (should I use do/while instead of if/else?)
If they don't, index the input and show a message
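
The four steps above could be sketched roughly as follows. This is only an illustration, not the original poster's code: `missingFields`, `addIfNew`, and the two callables `$exists` and `$index` are hypothetical names, and the callables stand in for whatever wraps the PHP client's search and index calls. Since each step runs only once per submission, plain `if` checks with early returns are enough here and no do/while loop is needed.

```php
<?php
// The four required fields named in the post; all other names are hypothetical.
const REQUIRED_FIELDS = ['titel', 'thema', 'type', 'description'];

// Step 1: return the required fields that are unset or blank.
function missingFields(array $input): array
{
    $missing = [];
    foreach (REQUIRED_FIELDS as $field) {
        if (!isset($input[$field]) || trim((string) $input[$field]) === '') {
            $missing[] = $field;
        }
    }
    return $missing;
}

// Steps 2-4: $exists and $index are injected so the flow can be tested
// without a running cluster. In a real script they would wrap the
// client's search and index calls.
function addIfNew(array $input, callable $exists, callable $index): string
{
    $missing = missingFields($input);
    if ($missing !== []) {
        return 'Missing fields: ' . implode(', ', $missing);
    }
    if ($exists($input)) {
        return 'This document already exists.';
    }
    $id = $index($input);
    return 'Document added with ID ' . $id;
}
```

In a real script, `$exists` would run the search and inspect the returned hits, and `$index` would index the document and return its new ID.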

polyfractal · February 15, 2018, 3:16pm

The flow seems entirely reasonable. You may run into trouble with exact vs. partial matches in the search request, though.

E.g. if someone searches for "foo bar baz" in the description field, that will match any document containing "foo", "bar", or "baz", since you are using a match query. You may have multiple matches to deal with.

Instead, you may want to use a term query, since that performs an exact match. But that also means that case sensitivity matters, as well as spacing and special characters. You could also try phrase searches, but they have edge cases too.
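
To make the difference concrete, here is a rough sketch of the three query bodies as PHP arrays. The function names are hypothetical. The term example assumes a `.keyword` sub-field exists (default dynamic mapping in recent Elasticsearch versions creates one for strings, but your mapping may differ). A term query against an analyzed text field is compared with individual tokens, so a multi-word input would usually not match there.

```php
<?php
// Partial match: the input is analyzed, and a document matching ANY token is a hit.
function matchQuery(string $field, string $text): array
{
    return ['query' => ['match' => [$field => $text]]];
}

// Exact match: the input is NOT analyzed, so case, spacing and
// punctuation must be identical to the stored value.
// Assumes "<field>.keyword" exists in the mapping.
function termQuery(string $field, string $text): array
{
    return ['query' => ['term' => [$field . '.keyword' => $text]]];
}

// Phrase match: all tokens must appear, adjacent and in the same order.
function phraseQuery(string $field, string $text): array
{
    return ['query' => ['match_phrase' => [$field => $text]]];
}
```

Each returned array would be passed as the `body` of a search request, for example `$client->search(['index' => 'my_index', 'body' => termQuery('description', $input)])`, where `my_index` is a placeholder.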

Basically, search is a bit more nuanced than just a database table lookup, since you need to deal with partial matches.

Perhaps show the results to the user and ask if one of the matching docs was their input? If not they can add it to the index. That saves you a lot of trouble of trying to determine how well partial matches match the input.
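
A minimal sketch of that idea: turn the search response into a short list the user can pick from. `summarizeHits` is a hypothetical helper. It assumes the usual response shape (`hits.hits[]` entries with `_id`, `_score` and `_source`) and the `titel` field from the original post.

```php
<?php
// Build one display line per hit so the user can say
// "yes, that is my document" or "no, add mine as new".
function summarizeHits(array $response): array
{
    $lines = [];
    foreach ($response['hits']['hits'] ?? [] as $hit) {
        $source = $hit['_source'] ?? [];
        $lines[] = sprintf(
            '[%s] %s (score %.2f)',
            $hit['_id'],
            $source['titel'] ?? '(no title)',
            $hit['_score'] ?? 0.0
        );
    }
    return $lines;
}
```

An empty result means nothing similar was found, so the input can be indexed directly.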

syiannop · February 15, 2018, 4:03pm

Thank you so much for your answer! I am still at the beginning of the whole process, so I will take note of your hints and work with them. The problem right now is that the search part doesn't notice when a document already exists.

E.g.: I used the word "test" for title and description, plus type1/thema1, and added the document to the index. I got the message that the document had been added, along with its ID. When I used exactly the same input and pressed add again, I got the same response.

I made a few changes to the code as I am trying to figure it out.

EDIT: I managed to make it work! I will try to improve the search query as you suggested. Any further suggestions are much appreciated.

polyfractal · February 16, 2018, 4:34pm

Ah, good to hear you got it working.

No particular advice, just play around with analyzers and get a feel for how they tokenize/transform the text. You may want to do a combination of exact match (term queries), phrase matching (match_phrase or phrase queries) and partial matching (match) to suit your needs.
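
One possible way to combine the three, sketched as a single bool query. The function name and the boost values are arbitrary assumptions, and the `.keyword` sub-field is assumed to exist in the mapping. An exact hit scores highest, a phrase hit next, and a loose match still shows up but ranks lower.

```php
<?php
// Combine exact, phrase and partial matching in one request.
// With only "should" clauses, a document must match at least one of them.
function buildCombinedQuery(string $field, string $text): array
{
    return [
        'query' => [
            'bool' => [
                'should' => [
                    // Exact match on the un-analyzed value (assumed sub-field).
                    ['term' => [$field . '.keyword' => ['value' => $text, 'boost' => 3.0]]],
                    // Same tokens, same order.
                    ['match_phrase' => [$field => ['query' => $text, 'boost' => 2.0]]],
                    // Any token matches.
                    ['match' => [$field => ['query' => $text]]],
                ],
            ],
        ],
    ];
}
```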

You could also structure your document IDs so that they are deterministic, and use simple GETs instead of searches to fetch the documents by ID. May or may not be a possibility for your system.
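
A rough sketch of the deterministic-ID idea: hash the normalized field values so the same input always maps to the same ID. `deterministicId` is a hypothetical helper, and the normalization (trim, collapse whitespace, ASCII lowercase) is just one possible choice. Pick whatever "same document" should mean for your data.

```php
<?php
// Derive a stable document ID from the fields that define "the same document".
function deterministicId(array $doc, array $fields): string
{
    $parts = [];
    foreach ($fields as $field) {
        $value = isset($doc[$field]) ? (string) $doc[$field] : '';
        // Collapse runs of whitespace, trim, and lowercase (ASCII only).
        $parts[] = strtolower(trim(preg_replace('/\s+/', ' ', $value)));
    }
    // The unit-separator byte keeps ["ab", "c"] distinct from ["a", "bc"].
    return sha1(implode("\x1f", $parts));
}
```

With such an ID, the duplicate check could become a lookup by ID (the PHP client has `get()` and `exists()` calls for this) rather than a search. Depending on your version, indexing with `op_type` set to `create` should also make Elasticsearch itself reject a second document with the same ID.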

syiannop · February 16, 2018, 4:49pm

Will do! Thank you for your help

system · March 16, 2018, 4:49pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
