The Trellis framework now integrates RadixAttention to optimize KV cache management. By sharing the cached keys and values of common prefixes across sequences, the update reduces memory overhead and avoids recomputing those prefixes, which speeds up inference for long-context tasks. Developers can handle complex prompt structures with lower latency and less VRAM consumption during generation.
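
To make the prefix-sharing idea concrete, here is a minimal sketch in Python. It is not Trellis's API: `PrefixCache`, `match_prefix`, `insert`, and the token ids are all hypothetical names and values chosen for illustration. It also assumes a plain token-level prefix tree, whereas a radix tree would compress runs of tokens into single edges, and it counts cached positions instead of storing real key/value tensors.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class Node:
    # One node per token. A real radix tree would merge single-child
    # chains into one edge; this sketch keeps them separate for clarity.
    children: Dict[int, "Node"] = field(default_factory=dict)


class PrefixCache:
    """Hypothetical token-level prefix cache (illustration only)."""

    def __init__(self) -> None:
        self.root = Node()
        # Number of cached token positions, a stand-in for KV cache entries.
        self.num_entries = 0

    def match_prefix(self, tokens: Sequence[int]) -> int:
        """Return how many leading tokens already have cached entries."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens: Sequence[int]) -> int:
        """Cache a sequence and return how many new entries were created."""
        node, created = self.root, 0
        for tok in tokens:
            if tok not in node.children:
                node.children[tok] = Node()
                created += 1
            node = node.children[tok]
        self.num_entries += created
        return created


cache = PrefixCache()
system_prompt = [101, 7, 7, 42]  # made-up token ids for a shared prompt
request_a = system_prompt + [5, 6]
request_b = system_prompt + [9]

new_a = cache.insert(request_a)           # nothing cached yet
reused_b = cache.match_prefix(request_b)  # shared prompt is already cached
new_b = cache.insert(request_b)           # only the differing suffix is new

print(new_a, reused_b, new_b, cache.num_entries)  # prints: 6 4 1 7
```

In this toy run the two requests contain 11 tokens in total, but only 7 entries are stored because the four-token prompt is cached once and reused. A production cache would also need eviction and reference counting, which this sketch leaves out.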