A new encoder-decoder architecture based on multi-head attention addresses the open shop scheduling problem. Trained on Taillard benchmark instances, the model produces schedules whose makespans are within 15-30% of the best-known values, using only the processing-time matrix of each instance as input. For practitioners applying deep reinforcement learning in industrial settings, the approach offers a scalable alternative to manual tuning.
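
To make the quantities above concrete, the following minimal sketch shows how a makespan and its relative gap to a reference value can be computed from a processing-time matrix. It assumes the schedule is given as an ordering of (job, machine) operations, which is one plausible reading of a decoder's output and is not confirmed by the text. The function names (`makespan`, `lower_bound`, `relative_gap`) and the 2x2 instance are hypothetical illustrations, not Taillard data or the model's actual interface.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]  # p[j][m] = processing time of job j on machine m


def makespan(p: Matrix, order: Sequence[Tuple[int, int]]) -> int:
    """Schedule operations in the given order, each as early as possible.

    An operation (j, m) starts once both job j and machine m are free,
    which enforces the open shop constraints: a job is on at most one
    machine at a time, and a machine handles at most one job at a time.
    """
    job_free = [0] * len(p)
    machine_free = [0] * len(p[0])
    for j, m in order:
        start = max(job_free[j], machine_free[m])
        end = start + p[j][m]
        job_free[j] = end
        machine_free[m] = end
    return max(job_free)


def lower_bound(p: Matrix) -> int:
    """Trivial bound: the largest total load of any single job or machine."""
    max_job_load = max(sum(row) for row in p)
    max_machine_load = max(sum(col) for col in zip(*p))
    return max(max_job_load, max_machine_load)


def relative_gap(value: int, reference: int) -> float:
    """Gap of `value` over `reference`, as a fraction of `reference`."""
    return (value - reference) / reference


# Hypothetical toy instance: 2 jobs x 2 machines.
p = [[3, 2],
     [2, 4]]

good_order = [(0, 0), (1, 1), (0, 1), (1, 0)]
poor_order = [(0, 0), (1, 0), (0, 1), (1, 1)]

reference = lower_bound(p)              # stands in for a best-known makespan
good = makespan(p, good_order)
poor = makespan(p, poor_order)
print(reference, good, poor)            # 6 6 9
print(relative_gap(poor, reference))    # 0.5, i.e. 50% above the reference
```

Under this convention, a reported gap of 15-30% corresponds to `relative_gap` values between 0.15 and 0.30 when the reference is the best-known makespan of the instance. The lower bound is used here only because the toy instance has no published best-known value.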