The Syntax of Matter: Synthesis Planning as the Foundation of Generative Chemistry
#472

ChemRxiv
2026
Authors

Anton Morgunov, Yu Shee, Alexander V Soudackov, Victor S Batista

Abstract

Recent advances in deep learning have improved benchmark performance for chemical property prediction, yet reliable transfer to new chemical domains remains limited. A contributing factor is that many models treat molecules primarily as static graphs, ignoring the causal logic of how they are constructed. This review surveys multistep synthesis planning (2020-2026) and argues that the field is undergoing a fundamental transition: from an Era of Navigability (2018-2023), focused on the computational feasibility of finding any route through combinatorial search space, to an Era of Validity (2024-Present), focused on the chemical correctness of those routes. We organize the literature around two dominant paradigms, search-based planning and direct sequence generation, and analyze how their design choices relate to different notions of validity. To resolve the ambiguity of the current "solvability" metric, which frequently exceeds 99% by measuring only topological connectivity, we introduce a formalized Hierarchy of Chemical Validity (Solv-N). This framework distinguishes between syntactic (Solv-0) and topological (Solv-1) success, which are largely solved, and the higher-order constraints of selectivity (Solv-2) and executability (Solv-3), which remain open challenges. We critically examine how legacy benchmarks and inflated virtual inventories obscure this distinction, and we conclude with a roadmap for synthesis-aware foundation models evaluated under explicit Tier 2-3 constraints.