Topic

§ Coding

Every story tagged with this topic, ordered by date.

the part of using claude code nobody talks about

Engineer reflects on cognitive load and knowledge retention challenges when using Claude for rapid feature development.

u/Consistent-Arm-875·9 hours ago·23 pts / 48 comm

Three browser games built with Claude (25M plays). Two of them are 8,000-line HTML files.

Developer built 3 browser games with Claude/Cursor in 3 months (no prior coding), reaching 25M+ plays; documents rapid prototyping and user adoption.

u/gteehan·16 hours ago·25 pts / 29 comm

r/singularity· COMMUNITY

Three key areas Anthropic is working on for their next models

Anthropic product lead outlines three near-term model focus areas: improved autonomous coding capability, extended context windows with memory, and multi-agent coordination.

u/Outside-Iron-8242·19 hours ago·119 pts / 20 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Executable World Models for ARC-AGI-3 in the Era of Coding Agents

Coding agent with executable Python world models, verification, and simplicity-bias refactoring solves 25 public ARC-AGI-3 games without task-specific logic.

Sergey Rodionov·23 hours ago

Simon Willison· ANALYST

Live blog: Code w/ Claude 2026

Live coverage of Anthropic's Code w/ Claude 2026 event keynote and announcements from Simon Willison.

Simon Willison·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CuBridge: An LLM-Based Framework for Understanding and Reconstructing High-Performance Attention Kernels

CuBridge: LLM-based framework for generating and reconstructing high-performance CUDA attention kernels with improved correctness and efficiency.

Xing Ma·1 day ago

Simon Willison· ANALYST

Vibe coding and agentic engineering are getting closer than I'd like

Simon Willison observes convergence between vibe coding and agentic engineering in practical AI-assisted development workflows.

Simon Willison·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

KernelBench-X evaluates LLM-generated Triton GPU kernels across 176 tasks; finds task structure explains 3x more correctness variance than method design.

Han Wang·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

UFAL-CUNI at SemEval-2026 Task 11: An Efficient Modular Neuro-symbolic Method for Syllogistic Reasoning

Neuro-symbolic system combining LLM parser with automated theorem prover for syllogistic reasoning in SemEval-2026 Task 11.

Ivan Kartáč·1 day ago

r/Anthropic· COMMUNITY

1 msg 70% usage on PRO with Sonnet

User reports Claude Code consuming 70% of 5-hour PRO token limit on single Sonnet 4.6 interaction with poor output quality.

u/Dredyltd·1 day ago·11 pts / 15 comm

r/ClaudeAI· COMMUNITY

Claude Code hooks are the feature most people skip. Spoiler: they're really useful

Claude Code hooks enable automated test/format workflows by running shell commands at workflow checkpoints, improving iteration cycles.

u/EastMove5163·1 day ago·23 pts / 24 comm

r/LocalLLaMA· COMMUNITY

Qwen 3.6 27B MTP on v100 32GB: 54 t/s

Qwen 27B achieves 54 t/s on V100 GPU with MTP optimization in llama.cpp, nearly 2x baseline speed for code review and tool use tasks.

u/m94301·2 days ago·41 pts / 10 comm

OpenAI· FRONTIER

Singular Bank helps bankers move fast with ChatGPT and Codex

Singular Bank deployed ChatGPT and Codex in internal assistant Singularity to reduce banker meeting prep time by 60–90 minutes daily.

OpenAI·2 days ago

r/ClaudeAI· COMMUNITY

Why do a lot of programmers and technical people hate AI, vibecoding AI assisted coding?

Reddit discussion on programmer skepticism toward AI-assisted coding, arguing resistance stems from fear of disruption rather than technical merit.

u/Gullible-Angle4206·2 days ago·20 pts / 68 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents

MOSAIC-Bench evaluates coding agents' vulnerability to multi-stage attack chains that decompose malicious goals into innocuous sequential tasks, exposing alignment gaps in deployed systems.

Jonathan Steinberg·2 days ago

r/LocalLLaMA· COMMUNITY

Use Qwen3.6 right way -> send it to pi coding agent and forget

Reddit post on using Qwen3.6 with pi.dev harness and agent tooling for local coding and admin tasks.

u/Willing-Toe1942·2 days ago·40 pts / 45 comm

r/ClaudeAI· COMMUNITY

I tested Kimi K2.6 vs Claude Opus 4.7 on a weird game coding task

User benchmarks Claude Opus 4.7 vs Kimi K2.6 on complex game mod coding task with TypeScript/Composio integration.

u/shricodev·2 days ago·20 pts / 15 comm

r/ClaudeAI· COMMUNITY

I hope this doesn't affect my usage ...

Reddit user reports that Claude got stuck in a repetitive output loop during a code implementation task.

u/Azsde·2 days ago·26 pts / 12 comm

r/Anthropic· COMMUNITY

Loops are the future - Boris Cherny creator of claude code in podcast

Boris Cherny, creator of Claude Code, discusses loops as a future direction in a podcast appearance.

u/shanraisshan·2 days ago·10 pts / 32 comm

r/LocalLLaMA· COMMUNITY

Qwen3.6 27B FP8 runs with 200k tokens of BF16 KV cache at 80 TPS on a single RTX 5000 PRO 48GB

Qwen 27B FP8 achieves 80 TPS with 200k token BF16 KV cache on RTX 5000 PRO 48GB, reducing quantization artifacts vs. 24GB quantized baselines.

u/__JockY__·2 days ago·47 pts / 40 comm

r/ClaudeAI· COMMUNITY

Anthropic: AI will fully replace software engineering by 2027. Also Anthropic: Currently hiring for 122 SWE openings.

Analysis of contradiction between Anthropic leadership's AI-replacing-engineering rhetoric and 184% increase in SWE hiring since Jan 2025.

u/ImaginaryRea1ity·3 days ago·57 pts / 21 comm

r/Anthropic· COMMUNITY

Casually beating every other deep research agent out there with a simple Claude Code harness

Open-source research agent built on Claude Code outperforms OpenAI and NVIDIA systems in deep research benchmarking.

u/heisdancingdancing·3 days ago·11 pts / 11 comm

r/OpenAI· COMMUNITY

Is Codex the best right now?

Reddit discussion questioning Codex's current competitive position and download trends; lacks substantive analysis or new information.

u/LeTanLoc98·3 days ago·72 pts / 33 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection

SpecKV adapts speculative decoding's speculation length dynamically based on target model compression, improving LLM inference throughput.

Shikhar Shukla·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross-Language Code Clone Detection

Knowledge distillation from LLMs to compact open-source models for cross-language code clone detection without black-box inference costs.

Mohamad Khajezade·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

FlexSQL agent flexibly explores schemas and data during text-to-SQL generation, enabling recovery from early mistakes.

Quang Hieu Pham·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FunFuzz: An LLM-Powered Evolutionary Fuzzing Framework

FunFuzz evolutionary fuzzing framework uses LLMs with multi-island search and feedback-driven prompt adaptation for structured input generation.

Mario Rodríguez Béjar·3 days ago

r/ClaudeAI· COMMUNITY

Real-time competitive multiplayer .io game built with Claude (4.6 & 4.7), live at nodecontrol.gg

Developer built real-time multiplayer .io game Node Control using Claude 4.6 and 4.7, now live at nodecontrol.gg with multi-region deployment.

u/soxpqn·3 days ago·23 pts / 12 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development

Systematic audit reveals AI-generated code exhibits distinct machine-signature defects and reasoning-complexity trade-off in maintainability.

Yuecai Zhu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

mdok-style at SemEval-2026 Task 10: Finetuning LLMs for Conspiracy Detection

mdok-style system finetuned Qwen3-32B using data augmentation for SemEval-2026 conspiracy detection task, ranking 8th of 52 submissions.

Dominik Macko·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

mdok-style system applied QLoRA finetuning on mid-size LLMs for SemEval-2026 multilingual polarization detection across detection, type, and manifestation subtasks.

Dominik Macko·3 days ago

r/LocalLLaMA· COMMUNITY

The more I use it, the more I'm impressed

Reddit user reports Qwen 3.6 27B found a bug that GPT 5.5 and Claude Opus 4.7 missed, attributing success to extended reasoning.

u/ComfyUser48·3 days ago·41 pts / 40 comm

r/ClaudeAI· COMMUNITY

Your Claude Code agent is always working from stale context. I built it a fix it can rewind, replay, and stay ahead of every edit.

Developer releases Memtrace, a codebase context manager for Claude Code that maintains persistent state across sessions to reduce token waste and stale context issues.

u/WEEZIEDEEZIE·3 days ago·25 pts / 16 comm

r/ClaudeAI· COMMUNITY

Vibe Coding vs. Production reality

Reddit discussion on gap between AI-assisted prototyping speed and production-ready deployment, highlighting auth, compliance, and vendor lock-in risks.

u/External_Bobcat8183·3 days ago·111 pts / 15 comm

r/ClaudeAI· COMMUNITY

My coworker and I planning a feature with our two Claude Codes in the same chat room. All four of us, talking.

User describes collaborative workflow using two Claude Code instances in shared chat for feature planning with human supervision.

u/croovies·3 days ago·22 pts / 23 comm

r/LocalLLaMA· COMMUNITY

"Second Thoughts" Been playing with adding a small transformer that reads output near the end of generation, and feeds it back near the top as a refinement loop. A quick test of 1.7B model showed drastic improvement in focused tasks (like coding)

Researcher demonstrates iterative refinement loop using small auxiliary transformer to improve 1.7B model code generation; scaling to 9B for HumanEval validation.

u/bigattichouse·4 days ago·43 pts / 10 comm

r/Anthropic· COMMUNITY

Is it getting dumb again?

Reddit user reports perceived degradation in Claude Opus 4.7 coding performance since mid-May, correlating with prior model quality issues acknowledged by Anthropic.

u/jelenajansson·4 days ago·10 pts / 14 comm

r/LocalLLaMA· COMMUNITY

One bash permission slipped...

User reports LLM bash command generation errors leading to destructive rm -rf execution in isolated VM environment.

u/TheQuantumPhysicist·4 days ago·92 pts / 26 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Selector-Guided Autonomous Curriculum for One-Shot Reinforcement Learning from Verifiable Rewards

SGAC approach uses learned selector model for autonomous curriculum in one-shot RLVR to improve LLM math reasoning over variance-based heuristics.

Rudray Dave·4 days ago

r/LocalLLaMA· COMMUNITY

If you've been waiting to try local AI development, please try it

Developer reports local Qwen 27B setup with llama-server now competitive with Claude Code and Cursor for coding tasks, driven by cloud provider cost increases.

u/Imaginary_Belt4976·4 days ago·48 pts / 30 comm

r/LocalLLaMA· COMMUNITY

Solidity

Developer discusses building a local Solidity LM with chain-of-thought and tool-calling; seeks alternatives to SOTA models for smart contract security and vulnerability analysis.

u/swingbear·4 days ago·40 pts / 14 comm

r/ClaudeAI· COMMUNITY

Let's not rename powershell.exe

User documents Claude Code CLI behavior on Windows 11 with Opus 4.7 when system dependencies are missing.

u/Ketonite·4 days ago·91 pts / 5 comm

r/LocalLLaMA· COMMUNITY

Qwen3.6-27B vs Coder-Next

Empirical comparison of Qwen3.6-27B and Coder-Next models across 40 test cases shows statistical parity with task-dependent tradeoffs.

u/Signal_Ad657·5 days ago·48 pts / 13 comm

r/ClaudeAI· COMMUNITY

I left my Agent OS running overnight and it built 4 new tools I didn't even ask for

Engineer describes autonomous agent system with self-generating tools that can write, test, and register new capabilities without user intervention.

u/TheOnlyVibemaster·5 days ago·25 pts / 27 comm

r/singularity· COMMUNITY

Software engineering jobs hit their highest posting since november 2023

Software engineering job postings reach their highest level since Nov 2023; Reddit commentary suggests continued demand for prompt engineering and model operation roles.

u/artemisgarden·5 days ago·100 pts / 50 comm

r/LocalLLaMA· COMMUNITY

Qwen3.6-27B vs 35B, I prefer 35B but more people here post about 27B...

User reports preferring Qwen 35B over 27B for coding/research pipelines on local hardware despite 27B popularity.

u/Snoo_27681·5 days ago·53 pts / 53 comm

r/ClaudeAI· COMMUNITY

spent way too long manually steering claude code every session until i stopped doing that

Developer describes using persistent configuration to reduce Claude setup overhead across sessions, improving workflow efficiency and code quality.

u/CodinDev·5 days ago·21 pts / 17 comm

r/LocalLLaMA· COMMUNITY

Kv cache quantization: ignorance, or malice?

Technical practitioner questions conventional wisdom on KV cache quantization for Qwen 27B inference on consumer GPUs in agentic workloads.

u/wombweed·5 days ago·40 pts / 80 comm

r/ClaudeAI· COMMUNITY

I gave Claude Code a $0.02/call coworker and stopped hitting Pro limits — here's the full setup

User shares cost-optimization pattern: delegate boilerplate tasks to cheaper models (Kimi K2.5) via Claude's bash tool to reduce API spend and hit rate limits less frequently.

u/More-Hunter-3457·5 days ago·38 pts / 10 comm

r/ClaudeAI· COMMUNITY

I built /graphify, 26 days, 450k+ downloads, ~40k stars. Here’s what I didn’t expect.

Graphify, a Claude Code skill using Leiden community detection for codebase knowledge graphs, reached 450k+ PyPI downloads and ~40k GitHub stars in 26 days; author reports 71x token efficiency vs. raw file input.

u/captainkink07·6 days ago·62 pts / 24 comm
