TL;DR – It’s the usual 4chan-style noise, but the signal underneath is:
- OpenAI’s “open-source” GPT-OSS (120B & 20B) landed yesterday and is being roasted non-stop.
  - Benchmarks are mediocre (it gets trounced by GLM-4.5-Air, Qwen3-4B-Thinking, etc.).
  - Heavy refusals unless you jailbreak it; people are already sharing prompt hacks.
  - Consensus: Sam dropped a distilled, over-quantized PR stunt.
- New Chinese toys just shipped and are the current hotness.
  - Qwen3-4B-Thinking-2507 – tiny, uncensored-ish, strong at reasoning; everyone’s testing it.
  - GLM-4.5-Air/Fire – the new MoE king; 12B active / 106B total. Needs ~50 GB+ combined RAM+VRAM but gives great quality.
  - KoboldCpp v1.97 and llama.cpp already support GLM-4.5.
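The ~50 GB figure is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, where the bits-per-weight and overhead numbers are assumptions for a typical Q4-class quant, not measured GGUF sizes:

```python
# Rough memory estimate for a 106B-total-parameter MoE at ~4-bit quantization.
# Assumed: ~4.5 bits/weight average (Q4-class quant) plus a few GB of
# KV cache and compute buffers. Real file sizes vary by quant recipe.
total_params = 106e9
bits_per_weight = 4.5                      # assumption for a Q4-class quant
weights_gb = total_params * bits_per_weight / 8 / 1e9
overhead_gb = 4                            # KV cache + buffers (assumption)
print(f"~{weights_gb + overhead_gb:.0f} GB total")
```

A Q4-class quant lands around ~60 GB for weights alone; a Q3-class quant (~3.5 bits/weight) is what gets you near the ~50 GB floor quoted in the thread.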
- Hardware talk
  - “How do I fit GLM-Air on my 32 GB RAM / 4 GB VRAM toaster?” – spilling to system RAM is fine on Linux; expect 4-7 t/s.
  - One mad-lad bought twelve Radeon Pro V340 cards (24 GPUs, 384 GB HBM2) for $600; the thread is half envy, half roasting him over power draw and ROCm issues.
  - The 12 GB VRAM + 32 GB RAM crowd is debating whether to stay pure-VRAM (Mistral-Small) or spill to RAM (Qwen-30B-MoE).
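The fit-it-on-a-toaster question above mostly comes down to how many layers you can offload to the GPU. A sketch of the arithmetic, where the model size, layer count, and reserve are all illustrative assumptions:

```python
# How many transformer layers fit in a small VRAM budget?
# All numbers below are illustrative assumptions, not measured values.
model_gb   = 60    # quantized weight file on disk (assumed Q4-ish GLM-Air)
n_layers   = 46    # assumed layer count
vram_gb    = 4     # the "toaster" GPU from the thread
reserve_gb = 1     # context + compute buffers (assumption)
per_layer_gb = model_gb / n_layers
offload = int((vram_gb - reserve_gb) / per_layer_gb)
print(offload)     # the value you'd hand to llama.cpp's -ngl flag
```

The remaining layers stream from system RAM, which is where the 4-7 t/s figure comes from; MoE models tolerate the spill better because only ~12B params are active per token.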
- ERP/RP stuff
  - Anchor post: “post the model you gooned to most this week” – answers are GLM-4.5, Qwen3-Thinking, DeepSeek-V3, Cydonia-24B, Rocinante, etc.
  - Everyone’s trading jailbreaks and GGUF links for 16k+ context role-play quants.
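The reason 16k+ context is a squeeze on consumer cards is that the KV cache grows linearly with context length. A rough sketch, where the layer count, GQA head count, and head dim are assumed, typical-for-a-mid-size-dense-model values:

```python
# Approximate fp16 KV-cache size at 16k context.
# Shapes below are assumptions for a mid-size dense model with GQA.
n_layers   = 40      # assumed
n_kv_heads = 8       # grouped-query-attention KV heads (assumed)
head_dim   = 128     # assumed
ctx        = 16384   # 16k context
bytes_per  = 2       # fp16
# factor of 2 = one K and one V tensor per layer
kv_gb = 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per / 1e9
print(f"{kv_gb:.1f} GB")
```

A few extra GB on top of the weights is exactly why the thread swaps quants and samplers tuned for long-context RP rather than just maxing the context slider.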
- Misc
  - People are archiving models & LoRAs “before we get rug-pulled”.
  - GPT-5 rumors (Cerebras wafer-scale inference, Copilot launch tomorrow) are being treated as pure hopium/astroturf.
  - CivitAI is half-down, so lots of HF scraping links are being swapped.
Bottom line: the thread is 90% shitposting, 10% genuinely useful links and configs for the newest Chinese models that actually run on consumer boxes and beat GPT-OSS.