The DiScoFormer architecture uses a single transformer to estimate both the probability density and the score function (the gradient of the log-density) of a data distribution. Because one model covers both tasks across diverse data distributions, separate density and score models are no longer needed, which reduces computational overhead. Practitioners can apply this more streamlined framework to generative modeling and anomaly detection tasks.
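
To make the two estimated quantities concrete, the sketch below computes a log-density and its score for a 1-D Gaussian and checks that the score equals the derivative of the log-density. This is only an illustration of the general relationship, not DiScoFormer itself: the Gaussian stands in for a learned model, and the function names `log_density`, `score`, and `numerical_score` are hypothetical.

```python
import math


def log_density(x, mu=0.0, sigma=1.0):
    """Log of the 1-D Gaussian density N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def score(x, mu=0.0, sigma=1.0):
    """Analytic score: d/dx log p(x) = -(x - mu) / sigma^2."""
    return -(x - mu) / (sigma ** 2)


def numerical_score(x, mu=0.0, sigma=1.0, eps=1e-5):
    """Central finite-difference approximation of d/dx log p(x)."""
    upper = log_density(x + eps, mu, sigma)
    lower = log_density(x - eps, mu, sigma)
    return (upper - lower) / (2.0 * eps)


if __name__ == "__main__":
    x, mu, sigma = 1.5, 0.5, 2.0  # arbitrary illustrative values
    print("log-density:    ", log_density(x, mu, sigma))
    print("analytic score: ", score(x, mu, sigma))
    print("numerical score:", numerical_score(x, mu, sigma))
```

In common practice, a density estimate of this kind can flag anomalies as points with low log-density, while a score estimate is what score-based generative samplers consume. A model that provides both would serve either use.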