ChatGPT for Google Sheets exfiltrates workbooks

(promptarmor.com)

166 points | by hackerBanana 9 hours ago

16 comments

  • maxburkhardt 4 hours ago
    Hi, I’m Max from the OpenAI security team. We appreciate the security research here, and it’s unfortunate this one slipped through a crack in our disclosure pipeline. As we’re now aware of this report, we’ve taken immediate steps to protect users against potential attacks in this area by removing the model’s ability to generate Apps Script code, which should eliminate the risk to users of ChatGPT for Google Sheets. We’re taking a close look at how this feature interacts with Google Sheets APIs and re-evaluating our sandboxing approach to make sure this product is as resistant as possible against prompt injection attacks. More broadly, we’ll be doing a re-review of similar functionality in other surfaces to make sure that our defenses are consistent and effective across the board.
    • da_grift_shift 10 minutes ago
      >We appreciate the security research here

      >it’s unfortunate this one slipped through a crack in our disclosure pipeline

      >As we’re now aware of this report

      This isn't the first time. https://x.com/PhilipTsukerman/status/1988634162773778501 https://x.com/_xpn_/status/1986382527817564437

      What very likely happened here is you received good faith security research by email and you forced the researcher to submit through HackerOne or Bugcrowd or whatever, which mandates their compliance with Platform Terms and Disclosure Terms and Codes of Conduct and whatnot.

      The SECURITY.md files in your GitHub repos only mention the email address. Can researchers like this one report issues via email and get a response, or not?

          May 08, 2026    PromptArmor discloses to OpenAI via email
          May 08, 2026    OpenAI sends an automated reply, confirming the intended reporting channel
          May 08, 2026    PromptArmor confirms email preference
          May 12, 2026    PromptArmor follows up
          May 18, 2026    PromptArmor follows up
    • blitzar 42 minutes ago
      Oops I did it again ...

      We're Sorry

    • user3939382 49 minutes ago
      > removing the model’s ability to generate Apps Script code

      I use this feature with my agents on a daily basis so hopefully you develop a more surgical approach to security here and restore this

  • dvt 8 hours ago
    LLMs can live in the cloud, but all tools need to be (1) local, and (2) containerized. It's clear to me that just willy-nilly "running stuff" is going to blow things up eventually. Maybe folks don't know this, but even Codex installs random binaries on your PC. "Read this PDF" installs a pdf reader executable. Is it vetted? Where's it from? Is it a virus? Who knows, who cares. Model goes brrrr.

    I'm working on a project that includes WASI containerization for local LLM workflows (which is a pretty tough problem), and I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.

    • piker 7 hours ago
      > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors

      Yep. We tricked them both trivially with malicious fonts in Docx files. Documented it here: https://tritium.legal/blog/noroboto

      I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable. Discussing it may be existential to the business model.

      • SlinkyOnStairs 7 hours ago
        > I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable.

        YES?!

        This is not a secret. ALL context/prompt is instructions, there is no data. It is just unsolvable, period.

        This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material.

        Defense against prompt injection is little more than running a regex to filter out "IGNORE PREVIOUS INSTRUCTIONS", which is fundamentally a hopeless approach because you cannot enumerate all possible prompt injections nor anticipate all glitch tokens.

        • dragonwriter 5 hours ago
          > This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material.

          No, its even more fundamental than that: the entire goal of broad reasoning over input data makes it impossible to have a sharp instruction/data division.

          The structured input that every modern chat-focussed model expects makes it very clear that they can be trained to distinguish different kinds of input, and some of those patterns now include different priority levels of instruction.

        • emodendroket 14 minutes ago
          I presume this is the reason you have setups like Claude Code's where it is essentially running a separate judge to determine if commands are safe.
        • black_knight 1 hour ago
          I don’t think we have the right mental models of LMM security yet. The lethal trifecta identifies many of the dangerous situations, but only describes the negative space of a solution.

          Speculation: I think we must accept that prompt injection happens, and structure the security of the rest of the system around that. Data given to an LLM becomes an agent, so maybe we must give permissions to this data, instead of to the LLM. Not sure exactly how this would look like in practice!

        • literalAardvark 57 minutes ago
          I believe it's likely that you could train an auditor model. Might even be doable in RL.

          As in real life it wouldn't be any good at doing anything but it'd be able to see fault in others and deny actions.

        • ethin 6 hours ago
          If only there was a language which allowed one to express instructions for a computer to execute which was nearly unambiguous, precise, deterministic, and containerized such that the computer would do exactly what you told it to.

          ...

          Oh wait.

          Yes, the above was referring to programming languages. Which is what prompts are, essentially. It's just a different (and more verbose) way of instructing the computer on what to do. It also has a solution space of infinity and is ambiguous enough that there is no way to secure it because there are infinite combinations of saying anything imaginable. All prompt injections do is prove this point, over and over and over again, and "prompting" an LLM is just reverse-engineering programming languages in the worst possible way. I suspect that we will eventually have no other choice but to revert to using programming languages because they are the only way to get the kind of protections that people are trying to come up with with all these containerization and virtualization systems (which inevitably fail).

          • onion2k 49 minutes ago
            You make a fair and valid point about prompts, but you're ignoring the fact that writing code that's truly secure is also virtually impossible. The stack of layers that an attacker can target range from your own code, to library code (Heartbleed), container escape (maskedPaths abuse), OS (Dark Sword, Ghost Tap), hardware (Spectre, Rowhammer), etc. Security is really hard. Fortunately exploiting these things is also hard.

            The belief that something is more likely to be secure because it's code instead of a prompt is likely only avoiding one particular type of attack. That's a win, but you probably shouldn't think of it as meaning your code is actually secure.

        • bnjemian 6 hours ago
          It’s a huge problem, but I’d caution against this absolutism — there may well be structure that can be created around and between LLMs and their outputs to enable the necessary segregation.

          As a loose comparison, hardware bit errors happen probabilistically, yet they’re so rare that we can effectively ignore them in day-to-day use assuming no specialized application (e.g. defense, space, critical infrastructure).

          LLMs aren’t there yet, but it’s entirely plausible that structures may can be developed to solve the problem, and those structures aren’t known or commonly conceived of in the present.

          • dmoy 5 hours ago
            > As a loose comparison, hardware bit errors happen probabilistically, yet they’re so rare that we can effectively ignore them in day-to-day use assuming no specialized application (e.g. defense, space, critical infrastructure)

            The better comparison on bit errors would be e.g. rowhammer, an adversarial bit error. Which you absolutely can't ignore.

      • busssard 7 hours ago
        lakera is trying to solve it, but its going to be a battle similar to virus and antivirus in the past.
    • CoastalCoder 7 hours ago
      I share your worries.

      Unfortunately, this may be akin to the situation of "The market can stay irrational longer than you can stay solvent."

    • zmmmmm 6 hours ago
      > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour

      I share your concern but it's not a correct characterisation to say they are not taking it seriously:

      https://www.anthropic.com/engineering/how-we-contain-claude

      My concern is people aren't even addressing this at the right level. People are currently thinking at the level of "how do I build a VM to contain this one agent" when this is actually a "design a whole new OS" level problem.

    • int3trap 5 hours ago
      Got a link to your project? I'm working on something that could make use of something like this.
    • osigurdson 6 hours ago
      Does containerization help much here? If it's a code tool then presumably it needs access to your code files (read / write). Maybe there are use cases for it of course.
      • dvt 6 hours ago
        WASI provides a very nice mental model where you can mount, e.g., /input, as read-only, and where every mutation is saved in /output or what-not. At least that's my favorite contract: input files remain untouched, but we can copy them and do whatever we want with them in /scratch or /output (which the user can later investigate and make sure nothing went horribly wrong while still having backups).
    • torben-friis 8 hours ago
      >"Read this PDF" installs a pdf reader executable.

      How does this work regarding Macos notarization btw?

      • dvt 7 hours ago
        I was actually curious, on my Mac, it uses `gs -q -sDEVICE=txtwrite -o output.txt input.pdf` (not sure why I have Ghostscript installed, maybe Adobe?) to read a PDF, and on my PC it just rawdogs `pdftotext`.
      • fragmede 7 hours ago
        What does notarization have to do with that? You or ChatGPT or whatever download a signed and already notarized binary.
        • torben-friis 7 hours ago
          That was kind of my question, whether it was restricted to downloading notarized apps (which is at least something) or whether they were circumventing that somehow.
          • fragmede 7 hours ago
            Locally compiled code doesn't need to be notarized, if that's what you're asking. Or a dose of xattr -d.
    • HPsquared 7 hours ago
      Local and containerised, without internet access.
      • zmmmmm 6 hours ago
        effectively, that means it's a VM not a container

        because sharing the kernel ultimately means all the devices come along for the ride which give all kinds of fancy ways to communicate with the outside world - network is just the start

        I think micro-VMs are the future here, but they need heavy adaptation from their current usage.

    • bossyTeacher 7 hours ago
      > I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.

      "Move fast. Break things." on steroids.

  • xmcp123 8 hours ago
    >This vulnerability was responsibly disclosed to OpenAI. Despite multiple follow-ups, we received no communication beyond an automated reply to our initial disclosure.

    Well, that’s not cute.

    • system2 48 minutes ago
      Someone in the comments claims to be from OpenAI and is giving some updates. This also proves that until social media puts pressure on companies, they won't care. Nothing new to see here.
      • replwoacause 29 minutes ago
        Just embarrassing behavior from OpenAI. Is it laziness? Why does it take public ridicule for these companies to get a shit.
  • airstrike 8 hours ago
    As it turns out, we do need some proper application layer to do real, secure work with AI, and just plugging in LLMs into confidential or critical infrastructure willy nilly doesn't work.
  • bandrami 1 hour ago
    Exfil remains the big worry for my company and the main blocker from adopting agents in general. We've brainstormed a lot but we can't really find a way around the fact that it's feeding data we care about to software we don't have any real visibility on.

    You can block egress at the network level but then you're basically hamstringing the agent from doing a lot of things it should do to be of any use.

  • simonw 8 hours ago
    > This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.

    Yeah, I don't like the sound of that at all.

    • milkshakes 8 hours ago
      it looks like the key to this working is the user explicitly directing the model to run those instructions. in this case it is the user, not the model that is being manipulated

      > Please follow the step-by-step workflow in the comp sheet to update my model with data thru F29

  • elliotbnvl 8 hours ago
    The lethal trifecta strikes again.
  • Groxx 7 hours ago
    >This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.

    So... does this imply "requires permission to run scripts without approval"? Or is that something that it can always do?

    >Note: ChatGPT for Google Sheets has a setting called ‘Apply edits automatically’ that determines when human approvals are required before an agentic action completes. However, this attack succeeds even when the user has explicitly disabled automatic edits.

    Yeah, that makes sense, it's not editing the sheet. But surely running a script with access to files and the internet is also a permission...?

    And that sidebar scenario: does that mean the chatgpt extension for Excel can make arbitrary interact-able Excel UI changes that looks like any other extension UI? That seems insane if so, unless there's a super duper scary permission it's hiding behind. And it's still insane after that.

    I mean, this is all par for the course for "AI" "security", but what

  • e12e 6 hours ago
    How long did it take from the first macro virus until the industry accepted that "we can't have nice things (at this cost to security)" - macros were defaulted to off everywhere?

    How long until the industry accept the risk LLMs pose with "prompt injection"?

  • rvz 8 hours ago
    Turns out that some of the people building the software with AI have no clue how to secure them or even know it is riddled with security holes added by the AI.

    Pure vibes.

    • grim_io 8 hours ago
      I don't think anyone is surprised by it. People are not vibe-coding zombies... yet.

      It's a matter of one trillion-dollar company not falling behind another trillion-dollar company. They know what they are doing and are OK with it.

      • cheschire 8 hours ago
        moving all of the fast and breaking all of the things
    • dakolli 8 hours ago
      Even the people that do know better are so lazy now because of LLMs these things are happening at a rapid clip.The only thing that matters now is speed and chasing the dopamine dragon of pseudo productivity.
  • Songjinhao 59 minutes ago
    [flagged]
  • hanzeweiasa 1 hour ago
    [flagged]
  • davidjw89 1 hour ago
    [dead]
  • ashahin 6 hours ago
    [dead]
  • jonplackett 8 hours ago
    So is your business model to expose AI security issues and then sell the solution?
    • nkrisc 7 hours ago
      Isn’t that what anyone does who is selling a solution to a problem that already exists?
    • fg137 8 hours ago
      What would be the alternative business model?
    • fragmede 7 hours ago
      AI is creating jobs!
    • dakolli 8 hours ago
      Is that not every cyber consultancy? What's wrong with that?