While you probably don't have to worry about aliens invading on the fourth of July, you should be aware that text embeddings may expose your private information.
Fortunately, it's an addressable risk. Our Senior AI Scientist, Joseph Ferrara, PhD, covers how to mitigate the problem in a PDF guide.
Broadly, the steps are:
- Extract the text
- De-identify the sensitive info
- Chunk the resulting text
- Embed it in a Pinecone database
- Perform some test queries
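The steps above can be sketched in a few lines of Python. The regex patterns, chunk sizes, and sample text are illustrative assumptions (a production de-identifier such as Tonic Textual uses NER models, not regexes), and the actual embed/upsert calls to a Pinecone index are only indicated in comments since they need credentials:

```python
import re

# Hypothetical PII patterns for illustration only; real pipelines use
# trained NER models to find names, addresses, etc.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def deidentify(text: str) -> str:
    """Replace detected PII spans with typed placeholder tokens."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows ready for embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "Contact Jane at jane.doe@example.com or 555-123-4567 for the report."
clean = deidentify(doc)
chunks = chunk(clean)
# Each chunk would then be embedded and upserted into a Pinecone index,
# after which you run test queries to confirm retrieval still works.
```

Because de-identification happens before embedding, the raw PII never reaches the vector database.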
If you want to see more detail and sample code, check out the how-to guide linked in the comments.
#RAG #AI #embeddings
Reinforcement Learning Team Leader & BO Tech Expert @ Huawei Research London - Advisor @ Sanome - Honorary Assistant Professor at UCL. Ex-@Princeton, Ex-@Upenn. All opinions are my own.
As you know, I am trying to cover the MCTS-for-LLMs literature and found 86 papers between 2023 and 2024. To be frank, I don't know how to grasp all that knowledge.
So I thought of building what I call paper cards. Making one seemed to help me summarise the paper, and I also added points that were unclear and that I wanted to come back to.
In any case, I thought I'd share one with you; maybe you'll find it helpful too. If so, I will keep sharing them as I read through the papers.
#AIart #AI #MachineLearning
As a follow-up to Haitham’s review of MCTS papers…
I want to remind everyone that MCTS is a kind of policy in my class of direct lookahead approximations (DLAs) for solving sequential decision problems. DLAs are just one of four classes of policies.
MCTS is one example of a stochastic DLA, which means it solves a stochastic optimization problem (said differently, an approximate sequential decision problem that I call the lookahead model) to make decisions that are then implemented in a sequential decision problem called the base model.
See chapter 19 of
https://lnkd.in/dB99tHtM
for a complete discussion of DLA policies, where I offer notation that distinguishes between base models and lookahead models. Since the lookahead model is another (usually simplified) sequential decision problem, we have to choose a policy for making decisions within it. I sometimes call this the “lookahead policy” or the “policy-within-a-policy”.
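To make the base-model/lookahead-model split concrete, here is a hypothetical toy in Python: a rollout-based stochastic lookahead policy (a much simpler cousin of MCTS) choosing actions on a number line. All names, rewards, and parameters are invented for illustration; the random rollout policy inside `rollout_value` is the "policy-within-a-policy":

```python
import random

# Toy base model: walk on a number line; reward is negative distance to a goal.
GOAL, HORIZON, ROLLOUTS = 5, 4, 200

def step(state: int, action: int) -> tuple[int, float]:
    nxt = state + action
    return nxt, -abs(GOAL - nxt)

def rollout_value(state: int, depth: int, rng: random.Random) -> float:
    """Estimate a state's value by simulating a random lookahead policy."""
    total = 0.0
    for _ in range(depth):
        state, r = step(state, rng.choice((-1, 1)))
        total += r
    return total

def dla_policy(state: int, rng: random.Random) -> int:
    """Direct lookahead: approximately solve the lookahead model per action."""
    def q(a: int) -> float:
        nxt, r = step(state, a)
        sims = [rollout_value(nxt, HORIZON - 1, rng) for _ in range(ROLLOUTS)]
        return r + sum(sims) / ROLLOUTS
    return max((-1, 1), key=q)

rng = random.Random(0)
state = 0
for _ in range(8):                     # act in the base model
    state, _ = step(state, dla_policy(state, rng))
print(state)                           # should have drifted toward the goal
```

MCTS replaces the uniform rollouts with a tree built via selection, expansion, simulation, and backpropagation, but the structure, simulate a lookahead model to pick each base-model decision, is the same.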
Those who use GLRT classifiers in their work know that the GLRT, specifically with nested hypothesis testing, suffers from always choosing the model with more parameters (the more complex model, if you will). This is a disadvantage in cases of model order selection and anomaly detection.
Algorithms that "correct" this have been employed by many of us, namely BIC, MDL, AIC, etc.
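A quick numerical illustration (with made-up data) of why such corrections are needed: for nested least-squares models, the larger model's residual sum of squares can never exceed the smaller model's, so an unpenalized likelihood comparison always picks the bigger model, and criteria like AIC/BIC restore fairness by charging for each extra parameter:

```python
# Made-up data that is roughly constant, so any fitted slope is spurious.
x = [0, 1, 2, 3, 4, 5]
y = [2.1, 1.9, 2.2, 2.0, 1.8, 2.3]
n = len(x)
mean_y = sum(y) / n

# H0: constant-mean model.
rss0 = sum((yi - mean_y) ** 2 for yi in y)

# H1: line y = a + b*x (ordinary least squares, closed form).
mean_x = sum(x) / n
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
    sum((xi - mean_x) ** 2 for xi in x)
a = mean_y - b * mean_x
rss1 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# The nested larger model always fits at least as well, even on noise,
# which is exactly the bias BIC/MDL/AIC-style penalties compensate for.
print(rss1 <= rss0)  # True
```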
In the paper below, Steven Kay proposes a transformation that "levels the playing field" for GLRT applications. The authors prove that, under certain conditions, just as the probability integral transform guarantees that random variables passed through their own CDF come out Uniform, a similar transform yields N(0,1) random variables asymptotically.
This transformation is built on the convex conjugate of the Cumulant Generating Function (CGF) of the arbitrary i.i.d. random variables to be transformed. The transform is also referred to as the Legendre Transform (LT) in the paper.
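The probability-integral-transform analogy is easy to check numerically. The sketch below verifies the classical Uniform result for exponential samples; Kay's CGF/Legendre-based transform itself is developed in the paper and not reproduced here:

```python
import math
import random

# If X ~ Exponential(1), its CDF is F(x) = 1 - exp(-x), and U = F(X)
# should be Uniform(0, 1): mean 0.5, variance 1/12.
rng = random.Random(42)
xs = [rng.expovariate(1.0) for _ in range(100_000)]
us = [1.0 - math.exp(-x) for x in xs]

mean_u = sum(us) / len(us)
var_u = sum((u - mean_u) ** 2 for u in us) / len(us)
print(round(mean_u, 2), round(var_u, 3))  # close to 0.5 and 1/12 ≈ 0.083
```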
Read the full paper summary on my website: https://lnkd.in/g8RB2pQf
#Statistics #GLRT #HypothesesTesting #classification #anomalydetection #ML #AI #datascience
The University of Pennsylvania's CSSLab has developed an AI-powered Media Bias Detector to reveal subtle biases in news coverage. By classifying articles by topic and analyzing tone and political leanings, it provides a detailed visual representation of how various news outlets report differently on the same issues. This tool helps users critically evaluate news sources and understand media bias. #MediaBias #AI #NewsAnalysis #AI4Journalism https://lnkd.in/gMjPpAdu
Building AI Solutions | AI and Machine Learning | Data Science | Management Consulting | Strategy & Operations | Digital Transformation | Project Management |
Have you ever wondered how AI can seem so intelligent? The secret lies in Bayes' theorem, a fundamental concept in probability theory.
Bayes’ theorem tells us the probability of something happening given the evidence we have: P(A|B) = P(A) * P(B|A) / P(B)
Where A is 'something' and B is 'evidence'.
Think of Bayes' theorem as a detective, using evidence to update its beliefs. For example, an image classifier "detects" whether a photo is of a cat or a dog by comparing it to past examples. Similarly, advanced AI like GPT-4 and Midjourney predict what humans might create, based on their training data.
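Here is the formula applied to a small spam-filter example, with all probabilities invented purely for illustration:

```python
# A = "email is spam", B = "email contains the word 'free'".
p_spam = 0.2                    # P(A): prior belief before seeing evidence
p_free_given_spam = 0.6         # P(B|A)
p_free_given_ham = 0.05         # P(B|not A)

# P(B) via the law of total probability.
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Bayes' theorem: P(A|B) = P(A) * P(B|A) / P(B).
p_spam_given_free = p_spam * p_free_given_spam / p_free
print(round(p_spam_given_free, 2))  # 0.75
```

Seeing the word lifts the spam probability from 20% to 75%: exactly the detective-style belief update described above.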
But Bayes' theorem isn't just for AI. Our brains use it too! That's why optical illusions trick us and why psychedelics can create mind-bending experiences. It even explains why people can have vastly different interpretations of the same evidence.
So, next time you encounter AI or even everyday life, remember Bayes' theorem. It's the hidden force shaping everything from technology to human perception.
#AI #BayesTheorem #MachineLearning #ArtificialIntelligence
**In the picture**: A Jakarta Governor candidate claimed during a debate that the pandemic was a hidden foreign agenda and that AI was a spy tool (from the word 'intelligence').
The Future of RAG: More Questions, More Complexity
In the realm of Retrieval-Augmented Generation (RAG), the journey is far from straightforward. As we shift from simple keyword searches to sophisticated question-answering systems, the challenges multiply.
The retrieval component remains a significant hurdle. There's no 'one-size-fits-all' solution, and as user expectations evolve, so does the complexity of the systems we develop.
Data suggests that as we improve our models, the intricacies of retrieval will only deepen. This highlights the necessity for ongoing innovation and research in the field of AI.
Join the conversation on how we can tackle these challenges head-on! What are your thoughts on the future of AI in RAG? Let's discuss it!
#AI #MachineLearning #GenerativeAI #RetrievalAugmentedGeneration #DataScience #TechInnovation #FutureOfWork
Techy, Tacky & Witty! The one & only Stephen Fry!
Taking a cue from Stephen Fry, I also hereby choose to render Artificial Intelligence as Ai instead of AI from now on, in order to make life easier for people called Alok, Alpesh, Alisha, Alvira, et al. I cannot imagine the Blinkit founder being too pleased when he reads that Al is an existential threat to all Kirana stores!
#Ai #stephenfry https://lnkd.in/g9MYVm7N
$221K vs. $715 is a meaningful difference in a comparison of Yurts and Anthropic.
Enterprises and the DOD need to move beyond basic open-source tools and proprietary API vendors that are charging based on tokens.
Blog Post: https://lnkd.in/d__xrtJA
> Retrieval-augmented generation (RAG) systems use many different algorithms to chunk, embed, and rank textual content for enabling natural language based question-answering on private knowledge bases. In a recent article, Anthropic introduced Contextual Retrieval, a new chunking algorithm that surpasses state-of-the-art (SOTA) methods and released the “Codebases dataset” for its benchmarking. In this article, we evaluate Yurts’ RAG pipeline on this new benchmarking dataset. This evaluation highlights that Yurts’ RAG system matches the performance of Contextual Retrieval, while operating at only 1/300th of the cost, underscoring our commitment to providing high performance solutions that are also cost-effective for our customers. Through our analysis, we also highlight some key challenges that arise when using new benchmarking datasets to evaluate RAG platforms, offering insights for those navigating this evolving space.
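For intuition, here is a minimal hypothetical sketch of the contextual-chunking idea the excerpt describes: prepend situating context to each chunk before it is embedded, so the chunk remains interpretable in isolation. In a real pipeline the context string would come from an LLM call; here `describe_context` is just a stub, and all names are invented:

```python
def chunk_text(text: str, size: int = 80) -> list[str]:
    """Naive fixed-size character chunking, for illustration only."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def describe_context(doc_title: str, chunk: str) -> str:
    # Stand-in for "ask an LLM to situate this chunk within the document".
    return f"From the document '{doc_title}': "

def contextualize(doc_title: str, text: str) -> list[str]:
    """Prepend document-level context to every chunk before embedding."""
    return [describe_context(doc_title, c) + c for c in chunk_text(text)]

chunks = contextualize("RAG benchmark notes", "Contextual Retrieval prepends "
                       "situating text to every chunk before it is embedded.")
print(chunks[0])
```

Each contextualized chunk, not the bare chunk, is what gets embedded and indexed, which is why retrieval quality improves on questions whose answers depend on document-level context.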
Frank Kane brought up some interesting views on the “AI war” between the US and China. I always think of this confrontation as a friendlier Cold War. However, there are a few issues:
* The warmest part of the war is about hardware (design, manufacturing, etc.). Both the US and China playing the restrictions game is not good for either.
* The war for talent probably favors the US - we still attract the best and brightest, certainly produce the more innovative and leading brands, and capitalism will drive what works. Hopefully, we sort out the frictions around student visas and immigration. This is a relatively simple fix if the politicians want it. Enough said about that.
* While numbers (where China wins) are not the same as quality (where the US is probably leading, but China is catching up quickly), we should pay attention to both.
* Taiwan, in my opinion, and the noise around it on both sides, is a bogeyman. In a decade or so, by which time hopefully both sides have caught up at least to some degree, Taiwan may become a non-issue. Hopefully that will also reduce or even eliminate the sabre-rattling on both sides. Good for all!
* I believe the US still lags China in terms of regulation and safety around AI. While the all-powerful government can push its way through in China, we are still just trying to understand this.
If you have a different opinion, let me know in the comments.
Note: Keep it civil please!
I'm busy doing some research and writing some examples to update my AI & Machine Learning course. Advanced RAG techniques, ways to measure RAG, LLM Agents, Swarms of Agents, etc.
What keeps surprising me is how many influential AI papers and new systems come from China lately. Remember Devin? ChatDev did it first, in China. Prompt compression techniques for RAG? China. Semantic chunking? China again.
I'm not a fan of any sort of tribalism, so I don't see this as good or bad. But it's interesting.
https://www.tonic.ai/blog/how-to-create-de-identified-embeddings-with-tonic-textual-pinecone