Chemistry is a sort of applied physics, with the behavior of electrons and their orbitals dictating a set of rules for which reactions can take place and what products will remain stable. At a very rough level, the basics of these rules are simple enough that experienced chemists can keep them all in their brain and intuit how to fit together pieces in a way that ultimately produces the starting material they want. Unfortunately, there are some parts of the chemical landscape that we don’t have much experience with, and strange things sometimes happen when intuition meets a reaction flask. This is why some critical drugs still have to be purified from biological sources.
It’s possible to get more precise than intuition, but that generally requires full quantum-level simulations run on a cluster, and even these don’t always capture some of the quirks that come about because of things like choice of solvents and reaction temperatures or the presence of minor contaminants.
But improvements in AI have led to a number of impressive demonstrations of its use in chemistry. And it’s easy to see why this works; AIs can figure out their own rules, without the same constraints traditionally imparted by a chemistry education. Now, a team at Glasgow University has paired a machine-learning system with a robot that can run and analyze its own chemical reaction. The result is a system that can figure out every reaction that’s possible from a given set of starting materials.
Chemist in a fume hood
Lee Cronin, the researcher who organized the work, was kind enough to send along an image of the setup, which looks nothing like our typical conception of a robot (the researchers refer to it as “bespoke”). Most of its parts are dispersed through a fume hood, which ensures safe ventilation of any products that somehow escape the system. The upper right is a collection of tanks containing starting materials and pumps that send them into one of six reaction chambers, which can be operated in parallel.
The outcomes of these reactions can then be sent on for analysis. Pumps can feed samples into an IR spectrometer, a mass spectrometer, and a compact NMR machine—the latter being the only bit of equipment that didn’t fit in the fume hood. Collectively, these can create a fingerprint of the molecules that occupy a reaction chamber. By comparing this to the fingerprint of the starting materials, it’s possible to determine whether a chemical reaction took place and infer some things about its products.
All of that is a substitute for a chemist’s hands, but it doesn’t replace the brains that evaluate potential reactions. That’s where a machine-learning algorithm comes in. The system was given a set of 72 reactions with known products and used those to generate predictions of the outcomes of further reactions. From there, it started choosing reactions at random from the remaining list of options and determining whether they, too, produced products. By the time the algorithm had sampled 10 percent of the total possible reactions, it was able to predict the outcome of untested reactions with more than 80-percent accuracy.
And, since the earlier reactions it tested were chosen at random, the system wasn’t biased by human expectations of what reactions would or wouldn’t work.
Once it had built a model, the system was set up to evaluate which of the remaining possible reactions was most likely to produce products and prioritize testing those. The system could continue on until it reached a set number of reactions, stop after a certain number of tests no longer produced products, or simply go until it tested every possible reaction.
Not content with this degree of success, the research team went on to add a neural network that was provided with data from the research literature on the yield of a class of reactions that links two hydrocarbon chains. After training on nearly 3,500 reactions, the system had an error of only 11 percent when predicting the yield on another 1,700 reactions from the literature.
This system was then integrated with the existing test setup and set loose on reactions that hadn’t been reported in the literature. This allowed the system to prioritize not only by whether the reaction was likely to make a product but also how much of the product would be produced by the reaction.
All this, on its own, is pretty impressive. As the authors put it, “by realizing only 10 percent of the total number of reactions, we can predict the outcomes of the remaining 90 percent without needing to carry out the experiments.” But the system also helped them identify a few surprises—cases where the fingerprint of the reaction mix suggested that the product was something more than a simple combination of starting materials. These reactions were explored further by actual human chemists, who identified both ring-breaking and ring-forming reactions this way.
That last aspect really goes a long way toward explaining how this sort of capability will fit into future chemistry labs. People tend to think of robots as replacing humans. But in this context, the robots are simply taking some of the drudgery away from humans. No sane human would ever consider trying every possible combination of reactants to see what they’d do, and humans couldn’t perform the testing 24 hours a day without dangerous levels of caffeine anyway. The robots will also be good at identifying the rare cases where highly trained intuitions turn out to lead us astray about the utility of trying some reactions.
But for now, humans will be necessary to integrate this knowledge into useful chemistry, recognizing when deploying any new or efficient reactions may make a previously inefficient process or unobtainable product easier to work with. There may come a time where AI helps out with that, as well, but we don’t seem to be quite there yet.