One of the most promising applications of machine learning (ML) in computational physics is the accelerated solution of partial differential equations (PDEs). The objective of an ML-based PDE solver is to output a sufficiently accurate solution faster than standard numerical methods, which serve as baseline comparisons. We first perform a systematic review of the ML-for-PDE-solving literature. Of articles that use ML to solve a fluid-related PDE and claim to outperform a standard numerical method, we find that 79% (60/76) compared against a weak baseline. Second, we find evidence of widespread reporting biases, especially outcome reporting bias and publication bias. We conclude that ML-for-PDE-solving research is overoptimistic: weak baselines lead to overly positive results, and reporting biases lead to under-reporting of negative results. In large part, these issues appear to be caused by factors similar to those of past reproducibility crises: researcher degrees of freedom and a bias towards positive results. We call for bottom-up cultural changes to minimize biased reporting, as well as top-down structural reforms to reduce perverse incentives for doing so.
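The baseline comparison described above can be made concrete with a small sketch. This is our own illustration, not code from the paper: the function names (`solve_heat_ftcs`, `baseline_curve`) and the choice of a 1D heat equation with an explicit finite-difference scheme are assumptions for demonstration. The idea is that a standard numerical method defines a cost-accuracy curve (runtime versus error at a range of grid resolutions), and an ML solver's claimed speedup is only meaningful relative to that curve, not to a single, possibly under-resolved or under-optimized, run.

```python
import time
import numpy as np

def solve_heat_ftcs(n, t_final=0.1):
    """Solve u_t = u_xx on [0, 1] with u(x, 0) = sin(pi x) and u = 0 at the
    boundaries, using the explicit FTCS finite-difference scheme."""
    dx = 1.0 / n
    dt = 0.4 * dx**2                       # stability requires dt <= 0.5 * dx^2
    steps = int(np.ceil(t_final / dt))
    dt = t_final / steps                   # land exactly on t_final
    r = dt / dx**2
    x = np.linspace(0.0, 1.0, n + 1)
    u = np.sin(np.pi * x)
    for _ in range(steps):
        u[1:-1] += r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return x, u

def baseline_curve(resolutions, t_final=0.1):
    """Cost-accuracy curve of the baseline: (runtime, RMS error) per resolution,
    using the known analytical solution exp(-pi^2 t) sin(pi x) as ground truth."""
    curve = []
    for n in resolutions:
        start = time.perf_counter()
        x, u = solve_heat_ftcs(n, t_final)
        runtime = time.perf_counter() - start
        exact = np.exp(-np.pi**2 * t_final) * np.sin(np.pi * x)
        error = np.sqrt(np.mean((u - exact) ** 2))
        curve.append((runtime, error))
    return curve

curve = baseline_curve([32, 64, 128])
```

A fair "accuracy-matched" comparison would locate the point on this curve whose error equals that of the ML solver and compare runtimes there; a weak baseline, in contrast, compares against a single resolution chosen without such matching.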
The list of articles generated by the systematic review, as well as the classification of each article in the random sample, is publicly available at https://doi.org/10.17605/OSF.IO/GQ5B3 (ref. 124).
The code needed to reproduce the results in Table 2 can be found on GitHub at https://github.com/nickmcgreivy/WeakBaselinesMLPDE/ (ref. 125) and on Code Ocean at https://codeocean.com/capsule/9605539/tree/v1 (ref. 126) and https://codeocean.com/capsule/0799002/tree/v1 (ref. 127).
Randall, D. & Welser, C. The Irreproducibility Crisis of Modern Science: Causes, Consequences, and the Road to Reform (National Association of Scholars, 2018).
Ritchie, S. Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth (Vintage, 2020).
Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).
Prinz, F., Schlange, T. & Asadullah, K. Believe it or not: how much can we rely on published data on potential drug targets? Nat. Rev. Drug Discov. 10, 712 (2011).
Begley, C. G. & Ellis, L. M. Raise standards for preclinical cancer research. Nature 483, 531–533 (2012).
Gelman, A. & Loken, E. The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No 'Fishing Expedition' or 'p-Hacking' and the Research Hypothesis Was Posited Ahead of Time Vol. 348, 1–17 (Department of Statistics, Columbia University, 2013).
Karagiorgi, G., Kasieczka, G., Kravitz, S., Nachman, B. & Shih, D. Machine learning in the search for new fundamental physics. Nat. Rev. Phys. 4, 399–412 (2022).
Dara, S., Dhamercherla, S., Jadhav, S. S., Babu, C. M. & Ahsan, M. J. Machine learning in drug discovery: a review. Artif. Intell. Rev. 55, 1947–1999 (2022).
Mater, A. C. & Coote, M. L. Deep learning in chemistry. J. Chem. Inf. Model. 59, 2545–2559 (2019).
Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380, 1347–1358 (2019).
Grimmer, J., Roberts, M. E. & Stewart, B. M. Machine learning for social science: an agnostic approach. Annu. Rev. Polit. Sci. 24, 395–419 (2021).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Gundersen, O. E., Coakley, K., Kirkpatrick, C. & Gil, Y. Sources of irreproducibility in machine learning: a review. Preprint at https://arxiv.org/abs/2204.07610 (2022).
Sculley, D., Snoek, J., Wiltschko, A. & Rahimi, A. Winner's curse? On pace, progress, and empirical rigor (ICLR, 2018).
Armstrong, T. G., Moffat, A., Webber, W. & Zobel, J. Improvements that don't add up: ad-hoc retrieval results since 1998. In Proc. 18th ACM Conference on Information and Knowledge Management 601–610 (ACM, 2009).
Kapoor, S. & Narayanan, A. Leakage and the reproducibility crisis in machine-learning-based science. Patterns 4, 100804 (2023).
Kapoor, S. et al. REFORMS: reporting standards for machine-learning-based science. Preprint at https://arxiv.org/abs/2308.07832 (2023).
DeMasi, O., Kording, K. & Recht, B. Meaningless comparisons lead to false optimism in medical machine learning. PLoS ONE 12, e0184604 (2017).
Roberts, M. et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat. Mach. Intell. 3, 199–217 (2021).
Wynants, L. et al. Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal. BMJ 369, m1328 (2020).
Whalen, S., Schreiber, J., Noble, W. S. & Pollard, K. S. Navigating the pitfalls of applying machine learning in genomics. Nat. Rev. Genet. 23, 169–181 (2022).
Artrith, N. et al. Best practices in machine learning for chemistry. Nat. Chem. 13, 505–508 (2021).
Brunton, S. L. & Kutz, J. N. Promising directions of machine learning for partial differential equations. Nat. Comput. Sci. 4, 483–494 (2024).
Vinuesa, R. & Brunton, S. L. Enhancing computational fluid dynamics with machine learning. Nat. Comput. Sci. 2, 358–366 (2022).
Cuomo, S. et al. Scientific machine learning through physics-informed neural networks: where we are and what's next. J. Sci. Comput. 92, 88 (2022).
Duraisamy, K., Iaccarino, G. & Xiao, H. Turbulence modeling in the age of data. Annu. Rev. Fluid Mech. 51, 357–377 (2019).
Durran, D. R. Numerical Methods for Wave Equations in Geophysical Fluid Dynamics Vol. 32 (Springer, 2013).
Mishra, S. A machine learning framework for data-driven acceleration of computations of differential equations. Math. Eng. https://doi.org/10.3934/Mine.2018.1.118 (2018).
Kochkov, D. et al. Machine learning-accelerated computational fluid dynamics. Proc. Natl Acad. Sci. USA 118, e2101784118 (2021).
Kadapa, C. Machine learning for computational science and engineering: a brief introduction and some critical issues. Preprint at https://arxiv.org/abs/2112.12054 (2021).
Ross, A., Li, Z., Perezhogin, P., Fernandez-Granda, C. & Zanna, L. Benchmarking of machine learning ocean subgrid parameterizations in an idealized model. J. Adv. Model. Earth Syst. 15, e2022MS003258 (2023).
Lippe, P., Veeling, B., Perdikaris, P., Turner, R. & Brandstetter, J. PDE-Refiner: achieving accurate long rollouts with neural PDE solvers. In 37th Conference on Neural Information Processing Systems (NeurIPS, 2023).
Vlachas, P. R. et al. Backpropagation algorithms and reservoir computing in recurrent neural networks for the forecasting of complex spatiotemporal dynamics. Neural Netw. 126, 191–217 (2020).
Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).
Grossmann, T. G., Komorowska, U. J., Latz, J. & Schönlieb, C.-B. Can physics-informed neural networks beat the finite element method? IMA J. Appl. Math. 89, 143–174 (2024).
de la Mata, F. F., Gijón, A., Molina-Solana, M. & Gómez-Romero, J. Physics-informed neural networks for data-driven simulation: advantages, limitations, and opportunities. Physica A 610, 128415 (2023).
Chuang, P.-Y. & Barba, L. A. Experience report of physics-informed neural networks in fluid simulations: pitfalls and frustration. Preprint at https://arxiv.org/abs/2205.14249 (2022).
Chuang, P.-Y. & Barba, L. A. Predictive limitations of physics-informed neural networks in vortex shedding. Preprint at https://arxiv.org/abs/2306.00230 (2023).
Wang, S., Yu, X. & Perdikaris, P. When and why PINNs fail to train: a neural tangent kernel perspective. J. Comput. Phys. 449, 110768 (2022).
Krishnapriyan, A., Gholami, A., Zhe, S., Kirby, R. & Mahoney, M. W. Characterizing possible failure modes in physics-informed neural networks. In 35th Conference on Neural Information Processing Systems Vol. 34, 26548–26560 (NeurIPS, 2021).
Basir, S. & Senocak, I. Critical investigation of failure modes in physics-informed neural networks. In AIAA SCITECH 2022 Forum 2353 (AIAA, 2022).
Karnakov, P., Litvinov, S. & Koumoutsakos, P. Solving inverse problems in physics by optimizing a discrete loss: fast and accurate learning without neural networks. PNAS Nexus 3, pgae005 (2024).
Gundersen, O. E. The fundamental principles of reproducibility. Phil. Trans. R. Soc. A 379, 20200210 (2021).
Aromataris, E. & Pearson, A. The systematic review: an overview. Am. J. Nurs. 114, 53–58 (2014).
Magiera, J., Ray, D., Hesthaven, J. S. & Rohde, C. Constraint-aware neural networks for Riemann problems. J. Comput. Phys. 409, 109345 (2020).
Bezgin, D. A., Schmidt, S. J. & Adams, N. A. A data-driven physics-informed finite-volume scheme for nonclassical undercompressive shocks. J. Comput. Phys. 437, 110324 (2021).