
Reuse and Permissions

Reuse of this article or of its components does not require permission, as it is made available under the terms of the Creative Commons Attribution 4.0 International license. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and to the published article's title, journal citation, and DOI is maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission for these figures directly from the rights holders.

Solving physics-based initial value problems with unsupervised machine learning

Jack Griffiths*, Steven A. Wrathmall†, and Simon A. Gardiner‡


Phys. Rev. E 111, 055302 – Published 15 May 2025. DOI: https://doi.org/10.1103/PhysRevE.111.055302

Abstract

Initial value problems, meaning systems of ordinary differential equations together with corresponding initial conditions, can be used to describe many physical phenomena, including those arising in classical mechanics. We develop a method for solving physics-based initial value problems using unsupervised machine learning. We propose a deep learning framework that models the dynamics of a variety of mechanical systems through neural networks. Our framework is highly flexible, allowing us to solve nonlinear, coupled, and chaotic dynamical systems. We demonstrate the effectiveness of our approach on systems including the free particle, the particle in a gravitational field, the classical pendulum, and the Hénon-Heiles system (a pair of coupled harmonic oscillators with a nonlinear perturbation, used in celestial mechanics). Our results show that deep neural networks can successfully approximate solutions to these problems, producing trajectories that preserve physical properties such as energy and that have stationary action. We note that the probabilistic activation functions defined in this paper are required to learn any solution of an initial value problem in the strictest sense, and we introduce coupled neural networks to learn the solutions of coupled systems.
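
As a rough illustration of what "unsupervised" means here, the sketch below trains a small network to satisfy the pendulum equation of motion and its initial conditions directly, with no labelled trajectory data. This is a generic physics-informed sketch written against PyTorch, not the authors' framework (which, per the abstract, relies on probabilistic activation functions and coupled networks; their code is at https://github.com/0jg/nivp): the network size, GELU activations, optimiser settings, time window, and pendulum parameters are all illustrative assumptions.

```python
import torch

torch.manual_seed(0)

# Classical pendulum: d^2(theta)/dt^2 = -(g/l) sin(theta),
# with theta(0) = theta_init and theta'(0) = omega_init (illustrative values).
g_over_l, theta_init, omega_init = 1.0, 0.5, 0.0

# theta(t) is represented by a small fully connected network of t alone.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.GELU(),
    torch.nn.Linear(64, 64), torch.nn.GELU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    # Random collocation times on [0, 4]; no reference solution is ever used.
    t = (4.0 * torch.rand(256, 1)).requires_grad_(True)
    theta = net(t)
    dtheta = torch.autograd.grad(theta.sum(), t, create_graph=True)[0]
    d2theta = torch.autograd.grad(dtheta.sum(), t, create_graph=True)[0]

    # Unsupervised loss: equation-of-motion residual plus penalties
    # enforcing the initial conditions at t = 0.
    residual = d2theta + g_over_l * torch.sin(theta)
    t0 = torch.zeros(1, 1, requires_grad=True)
    theta0 = net(t0)
    dtheta0 = torch.autograd.grad(theta0.sum(), t0, create_graph=True)[0]
    loss = (
        (residual ** 2).mean()
        + (theta0 - theta_init).pow(2).mean()
        + (dtheta0 - omega_init).pow(2).mean()
    )

    opt.zero_grad()
    loss.backward()
    opt.step()
```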


References (50)

  1. I. Lagaris, A. Likas, and D. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Netw. 9, 987 (1998).
  2. G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, and L. Yang, Physics-informed machine learning, Nat. Rev. Phys. 3 , 422 (2021).
  3. J. Han, A. Jentzen, and W. E, Solving high-dimensional partial differential equations using deep learning, Proc. Natl. Acad. Sci. USA 115 , 8505 (2018).
  4. J. Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys. 375 , 1339 (2018).
  5. L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis, DeepXDE: A deep learning library for solving differential equations, SIAM Rev. 63 , 208 (2021).
  6. Y. Xu, H. Zhang, Y. Li, K. Zhou, Q. Liu, and J. Kurths, Solving Fokker–Planck equation using deep learning, Chaos 30 , 013133 (2020).
  7. Y. Zhang and K.-V. Yuen, Physically guided deep learning solver for time-dependent Fokker–Planck equation, Int. J. Non Linear Mech. 147 , 104202 (2022).
  8. J. Bongard and H. Lipson, Automated reverse engineering of nonlinear dynamical systems, Proc. Natl. Acad. Sci. USA 104 , 9943 (2007).
  9. M. Schmidt and H. Lipson, Distilling free-form natural laws from experimental data, Science 324 , 81 (2009).
  10. S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci. USA 113 , 3932 (2016).
  11. M. Raissi, A. Yazdani, and G. E. Karniadakis, Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations, Science 367 , 1026 (2020).
  12. X. Meng and G. E. Karniadakis, A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems, J. Comput. Phys. 401 , 109020 (2020).
  13. R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud, Neural ordinary differential equations, in Advances in Neural Information Processing Systems , edited by S. Bengio (Curran Associates, Red Hook, NY, 2019), p. 6571.
  14. S. Greydanus, M. Dzamba, and J. Yosinski, Hamiltonian neural networks, in Advances in Neural Information Processing Systems , edited by H. Wallach (Curran Associates, Red Hook, NY, 2019), p. 15379.
  15. M. Cranmer, S. Greydanus, S. Hoyer, P. Battaglia, D. Spergel, and S. Ho, Lagrangian neural networks, arXiv:2003.04630 (2020).
  16. Y. LeCun, L. Jackel, L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, U. Muller, E. Sackinger, P. Simard, and V. Vapnik, Learning algorithms for classification: A comparison on handwritten digit recognition, in Neural Networks: The Statistical Mechanics Perspective, edited by J.-H. Oh, C. Kwon, and S. Cho (World Scientific, Singapore, 1995), pp. 261–276.
  17. J. Ling, A. Kurzawski, and J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, J. Fluid Mech. 807 , 155 (2016).
  18. R. Wang and R. Yu, Physics-guided deep learning for dynamical systems: A survey, arXiv:2103.14954 (2021).
  19. M. Raissi, P. Perdikaris, and G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 , 686 (2019).
  20. V. N. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 1995).
  21. V. N. Vapnik, Estimation of Dependences Based on Empirical Data (Springer-Verlag, Berlin, 1982).
  22. W. Ji, W. Qiu, Z. Shi, S. Pan, and S. Deng, Stiff-PINN: Physics informed neural network for stiff chemical kinetics, J. Phys. Chem. A 125 , 8098 (2021).
  23. G. Fabiani, E. Galaris, L. Russo, and C. Siettos, Parsimonious physics informed random projection neural networks for initial value problems of ODEs and index-1 DAEs, Chaos 33 , 043128 (2023).
  24. M. Hénon and C. Heiles, The applicability of the third integral of motion: Some numerical experiments, Astron. J. 69 , 73 (1964).
  25. J. Bastos de Figueiredo, C. Grotta Ragazzo, and C. Malta, Two important numbers in the Hénon-Heiles dynamics, Phys. Lett. A 241 , 35 (1998).
  26. K. He, X. Zhang, S. Ren, and J. Sun, Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, in Proceedings of IEEE International Conference on Computer Vision (Piscataway, NJ, USA, 2015), p. 1026.
  27. A. Griewank and A. Walther, Evaluating Derivatives , 2nd ed. (Society for Industrial and Applied Mathematics, 2008).
  28. https://github.com/0jg/nivp (2024).
  29. The initial velocity of the first coordinate is constrained by , where we have set either or . Given these constraints, the initial position and initial velocity for the second coordinate can be chosen arbitrarily, provided that remains real-valued. To ensure an appropriate choice of the initial conditions for the second coordinate, we find by using root-finding algorithms on the constraint , setting the corresponding initial condition to 10% of this value. We find by evaluating , again setting the corresponding initial condition to 10% of this value. (An illustrative sketch of this procedure appears after this reference list.)
  30. K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural Netw. 2 , 359 (1989).
  31. A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numerica 8 , 143 (1999).
  32. M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Netw. 6 , 861 (1993).
  33. The universal approximation theorems do not tell us how to find the optimal weights and biases, only that they exist.
  34. D. Wolpert and W. Macready, No free lunch theorems for optimization, IEEE Trans. Evol. Computat. 1 , 67 (1997).
  35. L. Breiman, Bagging predictors, Mach. Learn. 24 , 123 (1996).
  36. T. G. Dietterich, in Multiple Classifier Systems , edited by J. Kittler and F. Roli, Lecture Notes in Computer Science Vol. 1857 (Springer, Berlin, 2000), pp. 1–15.
  37. L. Hansen and P. Salamon, Neural network ensembles, IEEE Trans. Pattern Anal. Machine Intell. 12 , 993 (1990).
  38. A. Krogh and J. Vedelsby, in Advances in Neural Information Processing Systems , edited by G. Tesauro, D. S. Touretzky, and T. K. Leen (MIT Press, Cambridge, MA, 1995), p. 231.
  39. D. Opitz and R. Maclin, Popular ensemble methods: An empirical study, J. Artif. Int. Res. 11 , 169 (1999).
  40. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, J. Mach. Learn. Res. 15 , 1929 (2014).
  41. D. Hendrycks and K. Gimpel, Gaussian error linear units (GELUs), arXiv:1606.08415 (2016).
  42. Probability distributions with support only on the domain are not conducive to learning, possibly due to the lack of symmetry. It is our empirical observation that some distributions which are not perfectly symmetric about the origin—such as that used in Mish [see Eq. (B9b)]—are still conducive to learning.
  43. D. Misra, Mish: A self regularized non-monotonic activation function, in Proceedings of the British Machine Vision Conference (London, UK, 2020), Vol. 31.
  44. X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) (2010).
  45. D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, in Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015) (San Diego, CA, 2015).
  46. J. Duchi, E. Hazan, and Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12, 2121 (2011).
  47. G. Hinton, Lecture 6e — RMSProp: Divide the gradient by a running average of its recent magnitude, Neural Networks for Machine Learning, Coursera, Mountain View, CA (2012), https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf.
  48. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, Cambridge, MA, 2016), p. 311.
  49. Automatic differentiation is explored in literature such as Baydin et al. [50] and Griewank and Walther [27].
  50. A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, Automatic differentiation in machine learning: A survey, J. Mach. Learn. Res. 18 , 5595 (2017).
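
The root-finding procedure described in note [29] can be illustrated in code. The snippet below is only a guess at the intent, since the footnote's inline equations were lost in extraction: it assumes the standard Hénon-Heiles potential V(x, y) = (x^2 + y^2)/2 + x^2 y - y^3/3 (cf. Refs. [24, 25]), a total energy E = 1/8, and an initial x(0) = 0, all of which are illustrative assumptions rather than the paper's actual values.

```python
import numpy as np
from scipy.optimize import brentq

# Standard Henon-Heiles potential (assumed here; cf. Refs. [24, 25]).
def potential(x, y):
    return 0.5 * (x**2 + y**2) + x**2 * y - y**3 / 3.0

E = 1.0 / 8.0   # assumed total energy; the paper's values are not recoverable here
x0 = 0.0        # assumed initial position of the first coordinate

# Largest y reachable at rest along x = x0: solve V(x0, y) = E,
# then take 10% of it as the initial position of the second coordinate.
y_max = brentq(lambda y: potential(x0, y) - E, 0.0, 1.0)
y0 = 0.1 * y_max

# Largest velocity available to the second coordinate at (x0, y0); again take 10%.
ydot_max = np.sqrt(2.0 * (E - potential(x0, y0)))
ydot0 = 0.1 * ydot_max

# The first coordinate's initial velocity is then fixed so that the total energy
# equals E; the choices above keep the square-root argument non-negative.
xdot0 = np.sqrt(2.0 * (E - potential(x0, y0)) - ydot0**2)

print(f"y(0) = {y0:.4f}, ydot(0) = {ydot0:.4f}, xdot(0) = {xdot0:.4f}")
```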
