Adjust Dense Vector Unit Vector Epsilon #110240

Merged
Mikep86 merged 1 commit into elastic:main from Mikep86:adjust-epsilon
Jun 28, 2024

Mikep86 (Contributor) commented Jun 27, 2024

Change dense vector unit vector epsilon to 1e-3. This is required because Cohere sometimes generates unit vectors with magnitude slightly outside our current epsilon. For example, we have observed Cohere unit vectors with magnitude 1.0001829.
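
To illustrate the effect of the change, here is a minimal sketch of a unit vector tolerance check. It is not the Elasticsearch implementation: the class name `UnitVectorCheck`, its method names, and the comparison of the plain magnitude against 1 are all assumptions made for illustration. The tighter tolerance of 1e-4 is a hypothetical comparison value only, since the thread does not state the previous epsilon.

```java
// Illustrative sketch only. All names here are hypothetical and do not
// come from the Elasticsearch code base.
final class UnitVectorCheck {
    // Tolerance named in the PR description.
    static final float ADJUSTED_EPSILON = 1e-3f;
    // Hypothetical tighter tolerance, used only for comparison.
    // The thread does not state the previous value.
    static final float TIGHTER_EPSILON = 1e-4f;

    // Euclidean length of the vector, accumulated in double precision.
    static float magnitude(float[] vector) {
        double sumOfSquares = 0.0;
        for (float component : vector) {
            sumOfSquares += (double) component * component;
        }
        return (float) Math.sqrt(sumOfSquares);
    }

    // A magnitude counts as "unit" when it is within epsilon of 1.
    static boolean isUnitVector(float magnitude, float epsilon) {
        return Math.abs(magnitude - 1.0f) <= epsilon;
    }

    public static void main(String[] args) {
        // Magnitude reported in the PR description for a Cohere vector.
        float observed = 1.0001829f;
        System.out.println("within 1e-3: " + isUnitVector(observed, ADJUSTED_EPSILON)); // true
        System.out.println("within 1e-4: " + isUnitVector(observed, TIGHTER_EPSILON));  // false
    }
}
```

The observed magnitude deviates from 1 by about 1.8e-4, so it passes a 1e-3 tolerance but would fail a tolerance of 1e-4.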

elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-search (Team:Search)

Mikep86 (Contributor Author) commented Jun 27, 2024

What tests, if any, should we add for this change?

carlosdelest (Member) commented

> What tests, if any, should we add for this change?

Would the Cohere vectors you found be good candidates for a test?

Mikep86 (Contributor Author) commented Jun 27, 2024

@elasticsearchmachine run elasticsearch-ci/part-3

Mikep86 (Contributor Author) commented Jun 27, 2024

@elasticsearchmachine run elasticsearch-ci/8.15.0 / bwc-snapshots

Mikep86 (Contributor Author) commented Jun 27, 2024

@elasticsearchmachine run "elasticsearch-ci/8.15.0 / bwc-snapshots"

Mikep86 (Contributor Author) commented Jun 27, 2024

@elasticsearchmachine run elasticsearch-ci/bwc-snapshots

Mikep86 (Contributor Author) commented Jun 28, 2024

@elasticsearchmachine run elasticsearch-ci

Mikep86 (Contributor Author) commented Jun 28, 2024

> Would the Cohere vectors you found be good candidates for a test?

Not really. While they are an example of unit vectors that violate the current epsilon, there is no way to know whether they are representative of the global maximum magnitude variance that Cohere will return.

We could add a test demonstrating that some set of hard-coded vectors are within the adjusted epsilon, but given that both the vectors and the epsilon are (or would be) hard-coded, I don't know what value that adds.

I will merge without tests and we can add some in a follow-up PR if we want to revisit.
