The torch.profiler module helps identify execution bottlenecks by recording CPU and GPU activity. This guide explains how to record operator durations and analyze memory consumption in PyTorch workloads, and it serves as a starting point for developers. Reducing operator time and memory use can lower latency and compute costs for large-scale model inference.
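
As a rough illustration of recording operator durations and memory consumption, here is a minimal sketch. The toy model, the input shape, and the "model_inference" label are assumptions made for this example, not part of any particular workflow, and GPU activity is only profiled when CUDA is available.

```python
import torch
from torch.profiler import ProfilerActivity, profile, record_function

# Hypothetical toy model and input, chosen only to give the profiler work to record.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
inputs = torch.randn(32, 128)

# Always profile the CPU; add CUDA only when a GPU is present.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)
    model, inputs = model.cuda(), inputs.cuda()

with profile(activities=activities, profile_memory=True, record_shapes=True) as prof:
    # "model_inference" is an arbitrary label that marks this region in the results.
    with record_function("model_inference"):
        with torch.no_grad():
            model(inputs)

events = prof.key_averages()

# Operator durations, sorted by total CPU time.
print(events.table(sort_by="cpu_time_total", row_limit=10))

# Memory consumption, sorted by memory allocated by each operator itself.
print(events.table(sort_by="self_cpu_memory_usage", row_limit=10))
```

The two tables aggregate events by operator name, so a single slow or memory-heavy operator tends to stand out near the top. On a GPU machine, sorting by a CUDA time column may be more informative than CPU time.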