Words vs Guids, what is faster?

Baygon · May 24, 2021, 12:18pm

I'm facing a design choice and would appreciate some help picking the most efficient option.

I'm currently indexing data tagged with a string of comma-separated GUIDs, for example:
\"e659959d-392f-44c5-83a5-fb959cdbaccc\",\"ab2975e3-b9ca-4b1a-a93e-fb61a5d5c3a4\",\"c48e0bf4-7a1b-4ffd-893c-12e46e664f7f\",\"0074af4d-eb56-4e57-89b6-07c39c63c9c4\"

This is exactly how the string looks. It sits in a single field (a serialized list stored in the MySQL database, ingested through a Logstash SQL query).

I can modify the Logstash query to translate the GUIDs into English words, for example:
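For reference, here is a minimal sketch of how the serialized field above could be split into individual tag values, which is the step either option (GUIDs or words) would start from. It assumes the backslash-escaped quotes are part of the stored string; `parse_tag_ids` is a hypothetical helper name, not something from Logstash or Elasticsearch.

```python
import csv


def parse_tag_ids(raw: str) -> list[str]:
    """Split a serialized, comma-separated list of quoted GUIDs into a list."""
    # Assumption: the \" sequences are escaped quotes from serialization.
    cleaned = raw.replace('\\"', '"')
    # csv.reader handles the double-quoted, comma-separated values.
    row = next(csv.reader([cleaned], skipinitialspace=True))
    return [value.strip() for value in row if value.strip()]


raw = r'\"e659959d-392f-44c5-83a5-fb959cdbaccc\",\"ab2975e3-b9ca-4b1a-a93e-fb61a5d5c3a4\"'
tags = parse_tag_ids(raw)
print(tags)
```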
english, cute, irl

Which of the two solutions above would yield the best search performance when executing the following:
GET myindex/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "match": { "tag_ids": { "query": "ab2975e3-b9ca-4b1a-a93e-fb61a5d5c3a4", "operator": "AND" } } }
            ]
          }
        }
      ]
    }
  }
}

or, of course:

GET myindex/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "match": { "tag_ids": { "query": "english", "operator": "AND" } } }
            ]
          }
        }
      ]
    }
  }
}
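To make the comparison concrete, the two requests differ only in the string passed to `match`. The sketch below builds the same request body for either kind of tag; `build_tag_query` is a hypothetical helper, and the body would still need to be sent to `myindex/_search` with whatever client is in use.

```python
import json


def build_tag_query(tag: str, field: str = "tag_ids") -> dict:
    """Build the search body from the post for a single tag value."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "bool": {
                            "should": [
                                # operator AND: every analyzed term of `tag` must match
                                {"match": {field: {"query": tag, "operator": "AND"}}}
                            ]
                        }
                    }
                ]
            }
        }
    }


guid_body = build_tag_query("ab2975e3-b9ca-4b1a-a93e-fb61a5d5c3a4")
word_body = build_tag_query("english")
print(json.dumps(word_body, indent=2))
```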

Ingestion complexity is a minor concern, while search performance is critical. Thanks for your input.

warkolm · May 24, 2021, 11:56pm

The best approach here is to test each and see what gets you the results you want.

Baygon · May 25, 2021, 1:53am

Not really. It would take about 2 weeks to ingest the data, i.e. 4 weeks to reach a conclusion. So if someone knows the logic behind the fuzzy search, how words and GUIDs are indexed, and how that affects performance, that would be valuable, not only to me but to anyone trying to design a performant ES index.
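As a rough illustration of the indexing side of this question: if `tag_ids` is a `text` field using the default standard analyzer (an assumption, since the thread does not show the mapping), a GUID is split at its hyphens into several terms, while a plain word stays a single term. The sketch below only approximates that behaviour with a regular expression. It is not the real analyzer, and running the `_analyze` API against an actual index would be the authoritative check.

```python
import re


def approx_standard_tokens(text: str) -> list[str]:
    """Rough stand-in for standard analysis: lowercase, split on non-alphanumerics."""
    # Assumption: this only mimics the analyzer for simple ASCII input like GUIDs.
    return re.findall(r"[a-z0-9]+", text.lower())


guid_terms = approx_standard_tokens("ab2975e3-b9ca-4b1a-a93e-fb61a5d5c3a4")
word_terms = approx_standard_tokens("english")

print(len(guid_terms), guid_terms)
print(len(word_terms), word_terms)
```

Under that assumption, the GUID query with `"operator": "AND"` needs all five terms to match, while the word query needs one. Whether the difference is measurable on a given data set is something this sketch cannot show.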

system · June 22, 2021, 1:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
