OpenAI GPT-OSS Launch and Chinese AI Models Update

TL;DR – It’s the usual 4chan-style noise, but the signal underneath is:
  1. OpenAI’s “open-source” GPT-OSS (120B & 20B) landed yesterday and is being roasted non-stop.
    • Benchmarks are mediocre (it gets trounced by GLM-4.5-Air, Qwen3-4B-Thinking, etc.).
    • Heavy refusals unless you jail-break it; people are already sharing prompt hacks.
    • Consensus: Sam dropped a distilled, over-quantized PR stunt.
  2. New Chinese toys just shipped and are the current hotness.
    • Qwen3-4B-Thinking-2507 – tiny, uncensored-ish, strong at reasoning; everyone is testing it.
    • GLM-4.5-Air/Fire – the new MoE king; 12B active / 106B total. Needs 50 GB+ of combined RAM+VRAM but gives great quality.
    • KoboldCpp v1.97 & llama.cpp already have GLM-4.5 support.
  3. Hardware talk
    • “How do I fit GLM-Air on my 32 GB RAM / 4 GB VRAM toaster?” – spilling to RAM is fine on Linux; expect 4-7 t/s (see the offload sketch after this list).
    • One mad-lad bought twelve Radeon Pro V340 cards (24 GPUs, 384 GB HBM2) for $600; thread is half envy, half roasting him for power/ROCm issues.
    • The 12 GB VRAM + 32 GB RAM crowd is debating whether to stay pure-VRAM (Mistral-Small) or spill (Qwen-30B MoE).
  4. ERP/RP stuff
    • Anchor post: “post the model you gooned to most this week” – answers are GLM-4.5, Qwen3-Thinking, DeepSeek-V3, Cydonia-24B, Rocinante, etc.
    • Everyone trading jailbreaks and GGUF links for 16k+ context role-play quants.
  5. Misc
    • People archiving models & LoRAs “before we get rug-pulled” (see the snapshot_download sketch at the end).
    • GPT-5 rumors (Cerebras wafer-scale inference, Copilot launch tomorrow) being treated as pure hopium/astroturf.
    • CivitAI is half-down, so lots of HF scraping links are being swapped.
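
For anyone actually trying the RAM/VRAM split from item 3, here is a minimal sketch using the llama-cpp-python bindings (the thread itself uses the llama.cpp/KoboldCpp binaries, where the same knobs are -ngl and -c/--ctx-size). The GGUF filename, layer count and thread count are placeholders rather than tested settings, and it assumes the bindings have already picked up llama.cpp's GLM-4.5 support.

```python
# Minimal sketch: GLM-4.5-Air GGUF with only a few layers on a small GPU,
# the rest spilling into system RAM. All paths/numbers below are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.5-Air-Q4_K_M.gguf",  # hypothetical local quant file
    n_gpu_layers=8,      # small offload for a ~4 GB card; the rest stays in RAM
    n_ctx=16384,         # the 16k context the role-play crowd keeps asking for
    n_threads=8,         # physical cores; tune for your CPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise today's /lmg/ thread."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

On a 32 GB RAM / 4 GB VRAM box this should land somewhere around the 4-7 t/s the thread quotes; the pure-VRAM crowd just raises n_gpu_layers until the model stops fitting.
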
Bottom line: the thread is 90% shitposting, 10% genuinely useful links and configs for the newest Chinese models that actually run on consumer boxes and beat GPT-OSS.
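
On the “archive it before the rug-pull” point in item 5, here is a minimal sketch of mirroring whole repos with huggingface_hub's snapshot_download. The repo IDs and output directory are placeholders; check the exact org/name on HF before running.

```python
# Minimal sketch: hoard model/LoRA repos locally before they disappear.
# Repo IDs are examples only; swap in whatever you actually want to keep.
from huggingface_hub import snapshot_download

REPOS = [
    "zai-org/GLM-4.5-Air",            # placeholder; verify the exact repo ID
    "Qwen/Qwen3-4B-Thinking-2507",    # placeholder; verify the exact repo ID
]

for repo in REPOS:
    snapshot_download(
        repo_id=repo,
        local_dir=f"./hoard/{repo.replace('/', '__')}",
        allow_patterns=["*.safetensors", "*.json", "*.gguf", "*.txt"],  # skip the rest
    )
```

Re-runs should skip files that already finished downloading, so topping the archive up as new quants land is cheap.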