Giskard’s cover photo
Giskard

Giskard

Software Development

𝗗𝗲𝗽𝗹𝗼𝘆 𝗔𝗜 𝗮𝗴𝗲𝗻𝘁𝘀 𝘄𝗶𝘁𝗵𝗼𝘂𝘁 𝘁𝗵𝗲 𝗳𝗲𝗮𝗿. Your safety net for LLMs agents.

About us

Giskard provides the essential security layer to run AI agents safely. Our Red Teaming engine automates LLM vulnerability scanning during development and continuously after deployment. Architected for critical GenAI systems and proven through our work with customers including AXA, BNP Paribas and Google DeepMind, our platform helps enterprises deploy GenAI agents that are not only powerful but actively defended against evolving threats.

Website
https://www.giskard.ai/
Industry
Software Development
Company size
11-50 employees
Headquarters
Paris
Type
Privately Held
Founded
2021
Specialties
LLMOps, AI Security, AI Quality, AI Safety, AI Evaluation, and AI Testing

Products

Locations

Employees at Giskard

Updates

  • In November 2025, ChatGPT, Copilot, Gemini, and Meta AI were all providing incorrect financial advice to UK users: recommending to exceed ISA contribution limits, giving wrong tax guidance, and directing them to paid services when free government alternatives exist. Users simply asked normal questions about taxes and investments, and the chatbots hallucinated plausible-sounding answers that could cost people real money in penalties and lost benefits. We analyzed what went wrong and how to prevent it. Our latest article covers: - Why general-purpose LLMs hallucinate in regulated domains - What are the consequences for users and AI providers - How to test AI agents using adversarial probes and checks to prevent these failures Full analysis here: https://lnkd.in/e3P3uUNP

    • No alternative text description for this image
  • 🌊🤥 Phare LLM benchmark Hallucination Module: Reasoning models don't fix hallucinations. In short, High-reasoning models are just as prone to sycophancy and misinformation as older models. We have released Phare v2, an expanded evaluation including Gemini 3 Pro, GPT-5, and DeepSeek R1. While these models excel at logic, our analysis of the Hallucination submodule shows that factuality has hit a ceiling. - Factuality is stagnating: Newer flagship models do not statistically outperform predecessors from 18 months ago in factuality. While performance benchmarks rise, the ability to resist the spread of misinformation has not improved proportionally. - The "Yes-Man" problem: Reasoning models show no advantage in resisting misinformation when faced with leading questions. They are just as likely to be sycophantic and "play along" with a user's false premises as non-reasoning models. - Language gaps are fundamental: We observed significant language gaps in misinformation, with French and Spanish models often proving more vulnerable than their English counterparts, although this gap is narrowing. Phare is an open science project developed by Giskard with research & funding support from Google DeepMind, the European Union, and Bpifrance. 👉 Full analysis & results: https://gisk.ar/4lxZlmF or read our blog in the comments 👇 https://gisk.ar/4936xCN

    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
  • The OWASP GenAI Security Project has recently released its Top 10 for Agentic Applications 2026. It lists the highest-impact threats to autonomous AI agentic applications (systems that plan, decide, and act) building directly on prior OWASP work to spotlight agent-specific amplifiers like delegation and multi-step execution. In our latest article, we analyze these new vulnerabilities with concrete attack scenarios, from Goal Hijacking to Memory & Context Poisoning, providing a practical reference for hardening your environment. Read the article here: https://lnkd.in/e-8nTrpv

    • No alternative text description for this image
  • 🌊🔓 Phare LLM benchmark Jailbreak Module: Large disparities in Jailbreak resistance. In short, security is not uniform across providers, and larger models are not automatically safer. As part of Phare v2, we tested the resilience of top-tier LLMs against sophisticated attacks. Our Jailbreaking analysis reveals that safety engineering priorities differ vastly between major AI providers. - Huge disparity between providers: We observed substantial differences in robustness. Anthropic models consistently scored >75% resistance, whereas most Google models (excluding the new Gemini 3.0 Pro) scored below 50%. - Model size ≠ Security: There is no meaningful correlation between model size and resistance to jailbreak attacks. - The "Encoding" Paradox: For encoding-based jailbreaks, we observed the opposite trend. Less capable models often exhibit better resistance, which might be because they struggle to decode the adversarial representations that trick more capable models. Phare is an open science project developed by Giskard with research & funding support from Google DeepMind, the European Union, and Bpifrance. 👉 Full analysis & results: https://gisk.ar/4lxZlmF or read our blog in the comments 👇 https://gisk.ar/4936xCN

    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
  • 🌊⚖️ Phare LLM benchmark Bias Module: Smarter models are not less biased. In short: A model’s reasoning capability does not directly correlate with its ability to recognize or reduce bias. In our Phare v2 evaluation, we compared the "performance" of models (as measured by ELO rating) against their scores in our Bias submodule. The results challenge the assumption that capable models are naturally fairer. - No correlation with Intelligence: We found no statistical link between a model's ELO rating from LM Arena and its ability to produce and recognize biases. - Reasoning doesn't solve unfairness: The new wave of "reasoning" models (such as DeepSeek R1 or GPT-5) has not consistently outperformed non-reasoning models in reducing stereotypes. - Alignment is a choice: Fairness appears to be a result of specific fine-tuning and alignment strategies rather than a byproduct of raw computational power or reasoning depth. Phare is an open science project developed by Giskard with research & funding support from Google DeepMind, the European Union, and Bpifrance. 👉 Full analysis & results: https://gisk.ar/4lxZlmF or read our blog in the comments 👇 https://gisk.ar/4936xCN

    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
  • Securing the new generation of banking assistants 🏦🔒 During the Adopt AI event by Artefact, it was great to hear Catherine Mathon (COO at BNPP/BCEF) discuss the deployment of their generative AI use cases, specifically the "virtual assistant" serving Hello bank! clients. 🐢 We are proud to support this initiative by providing Giskard’s red teaming platform to evaluate and secure the assistant, ensuring it meets the high standards of safety and reliability required in banking. Thank you to the BNP Paribas team for referencing our work and for your trust in building secure AI 🤝 Watch the talk: https://lnkd.in/efEhETU3

  • 🌊⚠️ Phare LLM benchmark Harmfulness Modules: Reasoning helps (slightly) with Harmfulness. In short, reasoning models provide a more effective defense against specific "framing attacks," but overall safety remains a challenge. The Harmfulness dimension of Phare v2 tests how easily models can be coerced into generating dangerous content. This is one of the few areas where reasoning capabilities provided a measurable advantage. - Reasoning aids defense: Reasoning models perform better against "framing attacks", which are attempts to disguise harmful queries as hypothetical scenarios, compared to non-reasoning models. - Not statistically superior overall: Despite the advantage in framing attacks, reasoning models are not statistically superior overall in preventing harmful outputs compared to standard models. - Language vulnerabilities: While English safety features are robust, we observed that models can be more vulnerable to providing harmful misguidance when prompted in French or Spanish. Phare is an open science project developed by Giskard with research & funding support from Google DeepMind, the European Union, and Bpifrance. 👉 Full analysis & results: https://gisk.ar/4lxZlmF or read our blog in the comments 👇 https://gisk.ar/4936xCN

    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
  • 🪄 Generate real-world test scenarios & Phare v2 security insights 🌊 We have updated the Giskard Hub with a new feature to help you simulate realistic user interactions. Scenario-based generation allows you to create explicit scenarios and business rules (without altering the agent description) ensuring your agent is prepared for real users. 💥 New vulnerability probes added to our LLM vulnerability scanner: - ChatInject: Tests whether agents can be manipulated through malicious instructions that exploit structured chat templates (system/user tags). - CoT Forgery: Appends compliant-looking reasoning traces to harmful requests to bypass safety filters. 🌊 Phare v2 findings: Our LLM benchmark reveals that reasoning models don’t guarantee better security, and newer models aren't automatically less biased. Read the full updates here 👇

  • 🌊🛠️ Phare LLM benchmark Tool Module: Reasoning models aren't automatically better tool callers. In short, despite the hype, reliability in tool calling is stagnating. As we transition from LLMs to Agents, the ability to reliably use tools, such as querying a database or calculating a refund, is non-negotiable. In Phare v2, our Tools submodule tested leading models on their ability to accurately extract parameters and execute function calls. The results show that models with a higher ELO rating don't always make better agents: - Performance is Stagnating: Surprisingly, newer flagship models (including reasoning models) do not show a statistically significant leap in tool-calling reliability compared to predecessors from 18 months ago. - The Conversion Trap: Models struggle with "conversion tool usage", which are scenarios where they must transform user data (e.g., converting currency or units) before calling the tool. High reasoning capabilities did not solve this fragility. - Reliability > IQ: For agentic workflows, consistency is more valuable than raw intelligence. A model that hallucinates a file path or an API parameter renders the entire agent useless, regardless of its reasoning score. Phare is an open science project developed by Giskard with research & funding support from Google DeepMind, the European Union, and Bpifrance. 👉 Full analysis & results: https://gisk.ar/4lxZlmF or read our blog in the comments 👇 https://gisk.ar/4936xCN

    • No alternative text description for this image
    • No alternative text description for this image
    • No alternative text description for this image
  • Giskard reposted this

    Agentic Cross-Session Leaks in 60 seconds. AI models can accidentally bleed sensitive data from one user's chat into another's. ⚠️ The Danger: How private data (like medical records or credit cards) gets exposed to strangers. ⚙️ The Cause: Why shared memory caches and poor session isolation are to blame. 🛡️ The Fix: Why developers need strict "Session Isolation" to keep your data safe. Full blog: https://lnkd.in/eVASXtcn Video on Giskard YouTube: https://lnkd.in/e9ZzWqXK

Similar pages

Browse jobs

Funding

Giskard 6 total rounds

Last Round

Grant

US$ 466.3K

See more info on crunchbase