The new BioMysteryBench evaluates whether Claude can solve complex biological puzzles. Results suggest the model matches human expert performance on specific bioinformatics tasks. However, that finding carries significant caveats regarding real-world applicability. Practitioners should treat the benchmark results as early indicators rather than proof of professional-grade biological expertise.