Ian Coe’s Post

While you probably don't have to worry about aliens invading on the Fourth of July, you should be aware that text embeddings may expose your private information. Fortunately, it's an addressable risk. Our Senior AI Scientist, Joseph Ferrara, PhD, walks through how to mitigate the problem for a PDF. Broadly, the steps are:
- Extract the text
- De-identify the sensitive info
- Chunk the resulting text
- Embed it in a Pinecone database
- Perform some test queries
If you want to see more detail and sample code, check out the how-to guide in the comments.
#RAG #AI #embeddings
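To make the steps above concrete, here is a minimal sketch of the flow using only the Python standard library. It is not the code from the how-to guide. Everything named here is an assumption for illustration: the regex-based `deidentify` stands in for a real de-identification tool, `toy_embed` stands in for a real embedding model, and `ToyIndex` is an in-memory stand-in for a Pinecone index. It also assumes the text has already been extracted from the PDF.

```python
import hashlib
import math
import re

# Hypothetical, deliberately simple patterns; a real de-identification
# tool covers far more entity types (names, addresses, IDs, ...).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def deidentify(text):
    """Replace emails and US-style phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, sharing `overlap` words."""
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks


def toy_embed(text, dim=64):
    """Stand-in embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9\[\]]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyIndex:
    """In-memory stand-in for a vector database index."""

    def __init__(self):
        self.items = []

    def upsert(self, chunk_id, vector, text):
        self.items.append((chunk_id, vector, text))

    def query(self, vector, top_k=3):
        # Vectors are unit length, so the dot product is cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(vector, v)), chunk_id, text)
            for chunk_id, v, text in self.items
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


def build_index(raw_text, size=50, overlap=10):
    """De-identify first, then chunk and embed, so no raw PII is stored."""
    index = ToyIndex()
    clean = deidentify(raw_text)
    for i, chunk in enumerate(chunk_words(clean, size, overlap)):
        index.upsert(f"chunk-{i}", toy_embed(chunk), chunk)
    return index
```

The ordering is the point of the sketch: de-identification runs before chunking and embedding, so neither the stored vectors nor the stored chunk text are derived from the original sensitive values. A test query is then just `index.query(toy_embed("your question"))`, followed by a check that nothing sensitive comes back.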


Interesting point, Ian! How does Tonic ensure the de-identified data remains secure?
