Solvents are one of the largest sources of waste in the chemical industry, especially in multi-step processes for manufacturing pharmaceuticals, flavors, fragrances, and other complex molecules. Traditionally, each step is optimized for a different solvent, and every kg of product can generate 100 kg of solvent waste. Many solvents are toxic, and their fugitive emissions and inadvertent release lead to environmental degradation. Solvent recovery and reuse require energy input, thereby driving CO2 emissions.
New data-science methods are needed that leverage machine learning to select solvents and reaction pathways that achieve the desired properties while minimizing the potential for harm.
One challenge is our incomplete understanding of solubility itself, which seems to have reached the limits of what reductionist approaches can deliver. Complementing the reductionist approach with a data-driven approach has the potential to transform our ability to predict the solubility of complex molecules.
Machine learning, transfer learning, and deep learning techniques can be applied to the abundant solubility data that chemists have accumulated over the past decades to develop data-driven solubility prediction models. Data science techniques, such as artificial neural networks, will be investigated for predicting molecular solubility, beginning with simpler yet related molecular-level phenomena such as melting temperature and vapor pressure. The resulting models will then be evaluated on systems not included in the study to assess their robustness. Subsequent rounds of refinement will investigate solubility in different classes of solvents and will leverage machine learning to discover and optimize synthesis routes under constraints that minimize solvent waste.
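As an illustration only, the transfer-learning idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the researchers' method: the data are synthetic, the eight "descriptors" and the shared hidden representation are invented for the example, and every name (`pretrain`, `fit_head`, `X_aux`, `X_sol`, and so on) is hypothetical. A small neural network is first trained on a plentiful auxiliary property (a stand-in for melting temperature), its hidden layer is then reused to fit a new output layer on scarce solubility data, and the result is scored on held-out molecules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: synthetic stand-ins for molecular descriptors and for a hidden
# nonlinear representation shared by the auxiliary and target properties.
n_desc, n_hidden = 8, 16
W_true = rng.normal(scale=0.5, size=(n_desc, n_hidden))
a_true = rng.normal(size=n_hidden) / 4.0  # auxiliary property (e.g. melting temperature)
b_true = rng.normal(size=n_hidden) / 4.0  # target property (e.g. solubility)

def make_data(n, head):
    """Draw n synthetic molecules and a noisy property value for each."""
    X = rng.normal(size=(n, n_desc))
    y = np.tanh(X @ W_true) @ head + rng.normal(scale=0.05, size=n)
    return X, y

X_aux, y_aux = make_data(500, a_true)    # plentiful auxiliary data
X_sol, y_sol = make_data(40, b_true)     # scarce solubility data
X_test, y_test = make_data(200, b_true)  # molecules held out of all fitting

def pretrain(X, y, steps=2000, lr=0.02):
    """Train a one-hidden-layer tanh network by full-batch gradient descent."""
    W1 = rng.normal(scale=0.3, size=(n_desc, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.3, size=n_hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)
        err = H @ w2 + b2 - y
        grad_out = 2.0 * err / n                        # d(MSE)/d(prediction)
        grad_H = np.outer(grad_out, w2) * (1.0 - H**2)  # backprop through tanh
        grad_w2, grad_b2 = H.T @ grad_out, grad_out.sum()
        grad_W1, grad_b1 = X.T @ grad_H, grad_H.sum(axis=0)
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
    return W1, b1

def fit_head(F, y, ridge=1e-3):
    """Ridge-regularized linear output layer (with bias) on features F."""
    A = np.column_stack([F, np.ones(len(F))])
    return np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)

def predict(F, coef):
    return F @ coef[:-1] + coef[-1]

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Transfer: reuse the hidden layer learned on the auxiliary property.
W1, b1 = pretrain(X_aux, y_aux)
F_sol = np.tanh(X_sol @ W1 + b1)
F_test = np.tanh(X_test @ W1 + b1)
rmse_transfer = rmse(y_test, predict(F_test, fit_head(F_sol, y_sol)))

# Baseline: a linear model fitted directly on the raw descriptors.
rmse_baseline = rmse(y_test, predict(X_test, fit_head(X_sol, y_sol)))

print(f"held-out RMSE with transferred features: {rmse_transfer:.3f}")
print(f"held-out RMSE with linear baseline:      {rmse_baseline:.3f}")
```

Scoring both models only on molecules that were excluded from every fitting step mirrors the robustness test described above. Whether transferred features actually help on real solubility data depends on how much structure the auxiliary and target properties share, which this synthetic setup simply assumes.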
Researchers: Dr. Timko, Chem. Engineering and Dr. Paffenroth, Math Sciences & Data Science