DATA-DRIVEN DISCOVERY IN CHEMICAL SCIENCES: INTEGRATING AI WITH EXPERIMENTAL AND COMPUTATIONAL CHEMISTRY
Abstract
The rapid growth of experimental and computational data in the chemical sciences has created both opportunities and challenges for scientific discovery. Traditional hypothesis-driven approaches often struggle to explore complex chemical spaces efficiently, because these spaces are high-dimensional, uncertain, and costly to sample. Data-driven discovery supported by artificial intelligence offers an alternative paradigm: it integrates experimental observations and computational insights into adaptive, scalable research workflows. This study examines how artificial intelligence can be systematically integrated with experimental and computational chemistry to improve discovery efficiency, predictive accuracy, and scientific interpretability. A mixed-methods research design was employed, combining curated experimental datasets, computational chemistry simulations, and machine learning models within an iterative feedback framework. Quantitative performance analysis and qualitative case studies were used to evaluate model accuracy, robustness, and practical utility. The results show that integrated AI models significantly outperform single-source approaches, with lower prediction errors, improved generalization, and stronger alignment with chemical theory. Case-based evidence further indicates reductions in the number of experimental trials and in computational screening costs. The study concludes that data-driven discovery frameworks that tightly integrate artificial intelligence with experimental and computational chemistry are a robust and sustainable approach for accelerating chemical innovation, supporting better-informed decision-making, and advancing next-generation research methodologies in the chemical sciences.
Copyright (c) 2026 Fitriani Fitriani, Wang Jun, Max Weber

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.