Researchers from ICIQ’s López group present a **new method** that allows for the **rational design of heterogeneous catalysts**. After applying Principal Component Analysis (PCA) and regression to the adsorption energies of 71 different C_{1} and C_{2} species on 12 close-packed (transition) metal surfaces, the scientists elucidated, for the first time, an **interpretable model in heterogeneous catalysis**.

The team’s new method, published in *Nature Communications*, will facilitate the discovery of heterogeneous catalysts able to transform the **non-edible fraction of biomass** into valuable chemical products. The procedure **reduces the number of calculations by a factor of 20** while retaining error bars comparable to those of Density Functional Theory (DFT).

**As simple as possible, but not simpler**

Biomass molecules are big. Because their molecular structures are complex, many reaction sites have to be considered when biomass interacts with a catalyst. Even a relatively small molecule, such as a C_{6} species, could present a reaction network of about 500 000 reactions, making it too time- and resource-demanding to study using current models. In contrast, by studying C_{1} and C_{2} species, the scientists can now extrapolate the behavior to the bigger molecules commonly found in biomass.

The researchers from the López group applied PCA, a simple and **unsupervised Machine Learning technique**, to reduce the dimensionality of the problem. After analysing the formation energies of 71 adsorbates on 12 close-packed metal surfaces, the researchers obtained a two-term linear expression that allows for the **rapid and accurate survey of whole reaction networks** on metallic surfaces while drastically **slashing the number of DFT calculations needed**.
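To give a feel for how PCA can compress such a table of energies into a two-term linear expression, here is a minimal sketch in Python. It is not the published procedure: the energies are synthetic random numbers built from two hidden descriptors, and every variable name is an illustrative assumption. Only the table size (71 adsorbates, 12 metals) comes from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes taken from the article; all numerical values below are SYNTHETIC.
n_metals, n_adsorbates = 12, 71

# Assumption for illustration: each energy is an adsorbate-specific offset
# plus two (metal descriptor x adsorbate coefficient) products and small noise.
metal_descriptors = rng.normal(size=(n_metals, 2))
adsorbate_coeffs = rng.normal(size=(2, n_adsorbates))
offsets = rng.normal(size=(1, n_adsorbates))
noise = 0.05 * rng.normal(size=(n_metals, n_adsorbates))
energies = offsets + metal_descriptors @ adsorbate_coeffs + noise

# PCA via SVD: metals are the samples (rows), adsorbates the features (columns).
column_means = energies.mean(axis=0, keepdims=True)
centered = energies - column_means
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Fraction of the variance captured by each principal component.
explained = s**2 / np.sum(s**2)

# Keep two components: E(metal, adsorbate) ~ mean + t1*w1 + t2*w2.
k = 2
scores = U[:, :k] * s[:k]   # t1, t2 for each metal
loadings = Vt[:k]           # w1, w2 for each adsorbate
reconstructed = column_means + scores @ loadings

mae = np.mean(np.abs(reconstructed - energies))
print(f"variance captured by two components: {explained[:k].sum():.3f}")
print(f"mean absolute reconstruction error: {mae:.3f}")
```

In this toy setting two components capture almost all of the variance by construction, because the data were generated from two descriptors. With real adsorption energies, how many terms are needed is an empirical question.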

The simplicity of the resulting model has allowed the ICIQ researchers to go one step beyond the state of the art and **interpret the results** – a first in heterogeneous catalysis.

**From black to green**

Interpreting the results that machine learning tools yield can be a challenge. Luckily, the ICIQ scientists have been able to open the black box **and interpret in physical terms the parameters** of the equation. “If you know what each of the terms in the equation means, you can go further and expand your model,” explains Rodrigo García-Muelas, first author of the paper. “The equation’s first term describes bond covalency while the second one is related to the bond’s ionicity.” The mathematical procedure used to find the models is publicly available so that other researchers can reproduce the results.

The new model applies to a wide range of scenarios and makes it possible to predict how a system will react, and what it will yield, just from the characteristics of the bonds. Machine Learning approaches applied to catalysis are expanding our understanding of the inner workings of molecules, bringing the transformation of biomass into biofuels closer and thus providing a greener alternative to fossil fuels.