New token-level analysis compares standard Transformers against hybrid architectures to measure computational overhead. The data reveals specific bottlenecks in attention mechanisms during long-sequence processing. These findings suggest that hybrid models reduce memory footprints without sacrificing perplexity. Practitioners can use these benchmarks to optimize inference costs for specialized, long-context deployments.