Hi there,
We're attempting to index about 10,000,000 1536-dim vectors. Per the documentation, we've calculated this should take up 10M*(4*1536+4) = approximately 61.5GB. In reality, this index is taking up about 250GB. (There are several other keyword/text fields, a mix of indexed/unindexed, but we don't believe these are the reason the size is exploding.)
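To make the arithmetic easy to check, here is a minimal Python sketch of the estimate above. The constant and function names are just illustrative, the "+4" is the per-vector term from the formula as we read it in the documentation, and GB here means decimal gigabytes (10^9 bytes).

```python
# Back-of-the-envelope size estimate using the figures from this post.
NUM_VECTORS = 10_000_000      # ~10M vectors
DIMS = 1536                   # dimensions per vector
BYTES_PER_FLOAT = 4           # float32 components
PER_VECTOR_OVERHEAD = 4       # the "+4" term in the formula we used


def estimated_index_bytes(num_vectors: int, dims: int) -> int:
    """Estimated bytes for the raw vectors: n * (4 * dims + 4)."""
    return num_vectors * (BYTES_PER_FLOAT * dims + PER_VECTOR_OVERHEAD)


expected = estimated_index_bytes(NUM_VECTORS, DIMS)
observed = 250 * 10**9        # ~250GB as reported for the index

print(f"expected: {expected / 1e9:.2f} GB")               # expected: 61.48 GB
print(f"observed/expected: {observed / expected:.1f}x")   # observed/expected: 4.1x
```

So the index on disk is roughly four times larger than the estimate.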
What are we doing wrong/where have we gone wrong in our calculations? Alternatively, is there anything we can do to reduce this disk size?
We also tried to set index_type to int4_hnsw, which unexpectedly led to a larger rather than smaller size.
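For clarity, the mapping we tried looks roughly like the sketch below, written as a Python dict that serializes to the JSON request body. The field name `embedding` is a placeholder rather than our real field name, and we are assuming that the "index_type" mentioned above corresponds to the `index_options.type` setting on the `dense_vector` field.

```python
import json

# Sketch of the index mapping body; "embedding" is a placeholder field name.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 1536,
                # Assumption: this is the setting we referred to as "index_type".
                "index_options": {"type": "int4_hnsw"},
            }
        }
    }
}

# Print the JSON that would be sent as the body when creating the index.
print(json.dumps(mapping, indent=2))
```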
For reference, we are on version 8.15.0.