More search time

Hi,
We have a case where we have indexed 1 billion vectors (each 512 dims) using dense_vector (Elasticsearch version 7.6).
We are using cosine similarity to find the best hits.
But every query takes ~6.7 sec to return hits, which is quite annoying to the user. We want to reduce this time.
We installed ES 7.6 on m5.4xlarge.elasticsearch (16 vCPU and 64 GiB).

I have the following questions:

  1. What are possible options to improve it?
  2. If we use ANN or k-means and search only a particular centroid we might improve the timing, but we might lose some relevant hits. What are the ways to improve this approach?

Thanks for your time.

It sounds like you have used the explain API to identify that fetching the results is the slowest part, is this correct? If so, how much data does each node hold, and what size and type of storage are you using?

Number of nodes = 1
Instance type = m5a.4xlarge (AWS EC2)
Storage: HVM, 64-bit, SSD-backed

    "title_vector": {
      "type": "dense_vector",
      "dims": 512
    }

Around 1 billion vectors were indexed, and we are using the below piece of code to query:
    {
      "script_score": {
        "query": { "match_all": {} },
        "script": {
          "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
          "params": { "query_vector": query_vector }
        }
      }
    }

Every query takes ~7 sec to respond.

We would like to know how this timing can be improved.

Thanks for your time

Can you show the output from running this query through the explain API? Are you using any other conditions to narrow down the result or will all documents need to be scored?
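For reference, an explain request for a single document could be issued along these lines (the index name and document id here are placeholders, not taken from the thread):

```
GET /my_index/_explain/some-doc-id
{
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
        "params": { "query_vector": query_vector }
      }
    }
  }
}
```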

How large is the index? How many shards does it have?

How large is the index?
We have indexed 8 million vectors as of now. Every vector is 512 dims. We plan to index at-least 30 million in future.

How many shards does it have?
Shards = 1

Are you using any other conditions to narrow down the result or will all documents need to be scored?
No. I am using the below script to query:
    script_query = {
      "script_score": {
        "query": { "match_all": {} },
        "script": {
          "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
          "params": { "query_vector": query_vector }
        }
      }
    }

Thanks

Also:
    {
      "settings": {
        "index": {
          "number_of_shards": 1,
          "number_of_replicas": 0
        },
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "standard",
              "filter": ["lowercase", "snowball", "stop_words_filter"]
            }
          },
          "filter": {
            "stop_words_filter": {
              "type": "stop",
              "ignore_case": true,
              "stopwords": ["a", "about", "above", "after", "again", "against", "ain", "all", "am", "an", "and", "any", "are", "aren", "aren't", "as", "at", "be", "because", "been", "before", "being", "below", "between", "both", "but", "by", "can", "couldn", "couldn't", "d", "did", "didn", "didn't", "do", "does", "doesn", "doesn't", "doing", "don", "don't", "down", "during", "each", "few", "for", "from", "further", "had", "hadn", "hadn't", "has", "hasn", "hasn't", "have", "haven", "haven't", "having", "he", "her", "here", "hers", "herself", "him", "himself", "his", "how", "i", "if", "in", "into", "is", "isn", "isn't", "it", "it's", "its", "itself", "just", "ll", "m", "ma", "me", "mightn", "mightn't", "more", "most", "mustn", "mustn't", "my", "myself", "needn", "needn't", "no", "nor", "not", "now", "o", "of", "off", "on", "once", "only", "or", "other", "our", "ours", "ourselves", "out", "over", "own", "re", "s", "same", "shan", "shan't", "she", "she's", "should", "should've", "shouldn", "shouldn't", "so", "some", "such", "t", "than", "that", "that'll", "the", "their", "theirs", "them", "themselves", "then", "there", "these", "they", "this", "those", "through", "to", "too", "under", "until", "up", "ve", "very", "was", "wasn", "wasn't", "we", "were", "weren", "weren't", "what", "when", "where", "which", "while", "who", "whom", "why", "will", "with", "won", "won't", "wouldn", "wouldn't", "y", "you", "you'd", "you'll", "you're", "you've", "your", "yours", "yourself", "yourselves", "could", "he'd", "he'll", "he's", "here's", "how's", "i'd", "i'll", "i'm", "i've", "let's", "ought", "she'd", "she'll", "that's", "there's", "they'd", "they'll", "they're", "they've", "we'd", "we'll", "we're", "we've", "what's", "when's", "where's", "who's", "why's", "would"]
            }
          }
        }
      },
      "mappings": {
        "dynamic": "true",
        "_source": {
          "enabled": "true"
        },
        "properties": {
          "id": {
            "type": "keyword"
          },
          "sentence": {
            "type": "text"
          },
          "paperId": {
            "type": "text"
          },
          "title_vector": {
            "type": "dense_vector",
            "dims": 512
          }
        }
      }
    }

This is the index config file.

Given that you are doing a lot of processing in a script, you may get better performance with a larger number of primary shards, as more work can be done in parallel. Use the split index API to create a new index with e.g. 16 primary shards and see if this performs better.
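A split along these lines might look like the following sketch (the index names are placeholders; the source index must be made read-only first, and the target shard count must be a multiple of the source's):

```
PUT /my_index/_settings
{
  "index.blocks.write": true
}

POST /my_index/_split/my_index_16_shards
{
  "settings": {
    "index.number_of_shards": 16
  }
}
```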

Thank you for the recommendation. We will try that.

What about using an ANN or k-NN approach? Will these approaches degrade accuracy (I mean relevance)?

Does Elasticsearch provide any of the below approaches to improve the timing behavior?

(https://issues.apache.org/jira/browse/LUCENE-9136)

  1. Tree-based algorithms, such as KD-tree;
  2. Hashing methods, such as LSH (Locality-Sensitive Hashing);
  3. Product quantization based algorithms, such as IVFFlat;
  4. Graph-based algorithms, such as HNSW, SSG, NSG;

Thanks

I do not know, so I will leave that for someone with better knowledge of the internals.

@hegebharat ANN is still a work in progress and is not yet available in Elasticsearch. One way to improve the performance of your query:

{
  "script_score": {
    "query": {
      "match_all": {}
    },
    "script": {
      "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
      "params": {"query_vector": query_vector}
    }
  }
}

is, instead of using the "match_all": {} query, to use a more specific filter that targets a much smaller number of documents. cosineSimilarity is an expensive operation; a filter lets you limit the number of docs it runs on.

We also recommend setting "_source": false on the search request.
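Putting both suggestions together, a request body might look like this sketch (the filter field and value, and the stand-in query vector, are made up for illustration; in a real index you would filter on a keyword-mapped field that narrows the candidate set):

```python
# Sketch: replace match_all with a narrowing filter and disable _source.
# "paperId" and "some-paper-id" are placeholder filter field/value.
query_vector = [0.1] * 512  # stand-in for a real 512-dim embedding

search_body = {
    "_source": False,  # skip fetching the stored _source for each hit
    "query": {
        "script_score": {
            # Only documents matching this inner query are scored, so
            # cosineSimilarity runs on far fewer docs than match_all.
            "query": {"term": {"paperId": "some-paper-id"}},
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    },
}
```

The body can then be passed as-is to a search call, e.g. via the Python client's `es.search(index=..., body=search_body)`.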
