BioMysteryBench evaluates whether Claude can solve complex bioinformatics puzzles. Results suggest the model matches human expert performance on specific tasks. However, these claims rest on a narrow set of problems, so practitioners should treat the findings as a preliminary signal rather than as evidence of a general capability for biological research.