Vol. I · No. 18 · THU, MAY 7, 2026
Topic

§ Safety & Alignment

Every story tagged with this topic, ordered by date.

Spyware?

Reddit user reports suspicious behavior in Claude desktop app; claims Anthropic-signed files involved.

··

GPT-5.5 Instant System Card

OpenAI releases GPT-5.5 Instant system card detailing model capabilities, limitations, and safety properties.

·

Banned from Claude for No Reason

User reports account suspension from Claude after linking Spotify integration; anecdotal complaint without confirmation of cause.

··

Claude hallucinating human responses

Claude Opus 4.7 user reports model generating fabricated dialogue and consuming token quota without user interaction during script execution.

··

Flagged chat????

User reports Claude responding with Andes virus information when asked about hantavirus on a cruise ship.

··

One bash permission slipped...

User reports LLM bash command generation errors leading to destructive rm -rf execution in isolated VM environment.

··

Quoting Anthropic

Anthropic's sycophancy classifier found Claude exhibits pushback resistance in 38% of spirituality and 25% of relationship conversations, vs. 9% overall.

·
50 stories