Claude Flags Hantavirus Vaccine Questions as Security Risk

Asking Claude how it would develop a vaccine for hantavirus apparently triggers a safety filter:

Prompt: How would you develop a vaccine for the hanta virus?

No response, instead this modal: “Chat paused Opus 4.7's safety filters flagged this chat. Due to its advanced capabilities, Opus 4.7 has additional safety measures that occasionally pause normal, safe chats. We're working to improve this. Continue your chat with Sonnet 4, send feedback, or learn more.”

11 points | by pell 22 hours ago

8 comments

late_night_fix 21 hours ago
The weird thing is that public health researchers openly discuss vaccine design methods in papers every day. Blocking broad educational discussion mostly hurts normal users.
uyzstvqs 18 hours ago
"AI safety" is not actually about any form of safety. It's about corporate liability, because for some insanely dumb reason, tech companies can get sued if a user uses their service to do something illegal or stupid. This precedent is why tech companies surveil and nanny their users, and broadly ban anything that's potentially sensitive.
kristjank 21 hours ago
"Nothing to see here, please disperse"
But for real now, people asking health-related questions is a huge trigger for AI safety measures. Does it only care about the vaccine part, or does it care about the hantavirus part? Maybe ask about the virus in general first, then ask about development...
- pell 21 hours ago
  I tried that afterwards in a new session. Asking about the virus itself was fine but as soon as I asked about developing a vaccine, the chat got flagged again.
  - dmazhukov 20 hours ago
    Does resuming with Sonnet help? I wonder if it is an Opus-specific limitation.
frangonf 20 hours ago
You will have to use Claude Mythos Bio Premium for this, it's a very very dangerous and scary model so we limited only to Big Pharma that can use this to patch biology before it gets in the wrong hands.
GRCcyber7 18 hours ago
In Claude I created a group of experts from the several fields needed for COVID models for the US from 2019–2022, then asked "use the above to create predictive modeling for Hantavirus in the US from 2025-2027". Claude's flagged response was:
Chat paused Sonnet 4.6's safety filters flagged this chat. Due to its advanced capabilities, Sonnet 4.6 has additional safety measures that occasionally pause normal, safe chats. We're working to improve this. Continue your chat with Sonnet 4, send feedback, or learn more.
--- Do they not want people to know how serious or unserious hanta is?
- altairprime 13 hours ago
  The difference between an armchair disease researcher and a home-grown bioterrorist is too fine a line for anyone to evaluate accurately without an interview, so they’re correct in erring on the side of false negative rejections here (and as their message indicates, they accepted that outcome). Creating disease spread maps and evaluating virus function are two of the ways I’m seeing people in this post try to armchair this problem; neither is necessary. I don’t have any recommendations other than “take a basic infectious disease college course” so that y’all can learn to assess these things without resorting to asking an AI to model epidemics.
adampunk 19 hours ago
Verified with "how would you develop a vaccine for the hanta virus, specifically the Andes virus?" just now.
GRCcyber7 18 hours ago
[flagged]