The ToolSense framework exposes a gap between tool retrieval performance and actual model understanding. While parametric retrieval uses virtual tokens to index tool catalogs, standard benchmarks rely on constrained decoding to hide failures. This diagnostic tool reveals that LLMs often fail to grasp tool semantics. Developers must now validate internal knowledge beyond simple retrieval.