The Trellis framework now integrates RadixAttention to manage its key-value (KV) cache. The update reduces memory overhead during inference by sharing the cached entries for common prefixes across requests, and it targets efficiency in large-scale deployments. Because a shared prefix is cached once rather than once per request, developers can handle higher concurrency without memory growing linearly with the number of requests. For small batches, the performance gains remain incremental.
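
To make the prefix-sharing idea concrete, here is a minimal sketch in Python. It is not Trellis's API or its actual implementation: the `PrefixCache` class, its `insert` method, and the token ids are all hypothetical, and the structure is a plain token-level trie rather than a compressed radix tree. It only counts how many tokens would need cache entries with and without sharing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # One node per cached token; children are keyed by the next token id.
    children: dict = field(default_factory=dict)


class PrefixCache:
    """Toy token-level trie standing in for a shared KV cache (hypothetical)."""

    def __init__(self):
        self.root = Node()
        self.stored_tokens = 0  # tokens whose KV entries would be held in memory

    def insert(self, tokens):
        """Insert one request's tokens and return how many were already cached."""
        node = self.root
        reused = 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                # First time this prefix is seen: a new cache entry is needed.
                child = Node()
                node.children[tok] = child
                self.stored_tokens += 1
            else:
                # Prefix already cached by an earlier request: reuse it.
                reused += 1
            node = child
        return reused


# Three requests that share a four-token prefix (made-up token ids).
system_prompt = [1, 2, 3, 4]
requests = [
    system_prompt + [10, 11],
    system_prompt + [20],
    system_prompt + [10, 12],
]

cache = PrefixCache()
reused = [cache.insert(r) for r in requests]
unshared_tokens = sum(len(r) for r in requests)

print("tokens stored with sharing:   ", cache.stored_tokens)
print("tokens stored without sharing:", unshared_tokens)
print("tokens reused per request:    ", reused)
```

In this toy case the shared cache holds 8 tokens where per-request caches would hold 17. With a single request there is nothing to share, which is consistent with the gains being incremental for small batches.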