Two accepted ICLR 2024 papers: particle simulations & stabilized BPTT

We’re happy to report two accepted papers at ICLR 2024! Congrats Patrick and Rene 😀 👍 They’re on particle-based learning and stabilized backprop through time; additional details, code, etc. will follow soon. For now, here are the two abstracts in full:

Symmetric Basis Convolutions for Learning Lagrangian Fluid Mechanics: Learning physical simulations has been an essential and central aspect of many recent research efforts in machine learning, particularly for Navier-Stokes-based fluid mechanics. Classic numerical solvers have traditionally been computationally expensive and challenging to use in inverse problems, whereas Neural solvers aim to address both concerns through machine learning. We propose a general formulation for continuous convolutions using separable basis functions as a superset of existing methods and evaluate a large set of basis functions in the context of (a) a compressible 1D SPH simulation, (b) a weakly compressible 2D SPH simulation, and (c) an incompressible 2D SPH Simulation. We demonstrate that even and odd symmetries included in the basis functions are key aspects of stability and accuracy. Our broad evaluation shows that Fourier-based continuous convolutions outperform all other architectures regarding accuracy and generalization. Finally, using these Fourier-based networks, we show that prior inductive biases, such as window functions, are no longer necessary. 
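To make the idea of a basis-function convolution with even and odd terms a bit more concrete, here is a minimal 1D sketch in NumPy. It is not the paper's implementation: the function names, the ordering of the Fourier terms, and the hard cutoff at the support radius are assumptions made for illustration. In 2D one would presumably take products of such 1D bases per axis, which is how we read "separable" here.

```python
import numpy as np

def fourier_basis(q, n_terms):
    """Evaluate n_terms Fourier basis functions at relative offsets q in [-1, 1].

    Assumed ordering: 1, sin(pi q), cos(pi q), sin(2 pi q), cos(2 pi q), ...
    Cosine terms are even in q, sine terms are odd.
    """
    q = np.asarray(q, dtype=float)
    terms = [np.ones_like(q)]
    m = 1
    while len(terms) < n_terms:
        terms.append(np.sin(m * np.pi * q))      # odd term
        if len(terms) < n_terms:
            terms.append(np.cos(m * np.pi * q))  # even term
        m += 1
    return np.stack(terms, axis=-1)

def continuous_conv_1d(x, features, weights, support):
    """out_i = sum_j features_j * g((x_j - x_i) / support) over neighbors within the support."""
    q = (x[None, :] - x[:, None]) / support                 # relative offsets, shape (N, N)
    mask = np.abs(q) <= 1.0                                 # hard cutoff (illustrative choice)
    kernel = fourier_basis(q, weights.shape[0]) @ weights   # kernel values, shape (N, N)
    return (kernel * mask) @ features

# Toy setup: evenly spaced particles, constant feature, purely odd kernel.
x = np.linspace(0.0, 1.0, 20)
features = np.ones(20)
weights = np.array([0.0, 1.0, 0.0])
out = continuous_conv_1d(x, features, weights, support=0.2)
print(out[0], out[10])
```

With a purely odd kernel and a constant feature, contributions from symmetric neighbors cancel in the interior and only particles near the boundary see a non-zero response, which gives a feel for why the symmetry of the basis matters.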

Stabilizing Backpropagation Through Time to Learn Complex Physics: Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability. We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior. The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow and certain modifications to the backpropagation pass leave the positions of the original minima unchanged. As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost. Therefore, we discuss the negative implications of using such a rotational vector field for optimization and how to counteract them. Our final procedure is easily implementable via a sequence of gradient stopping and component-wise comparison operations, which do not negatively affect scalability. Our experiments on three control problems show that especially as we increase the complexity of each task, the unbalanced updates from the gradient can no longer provide the precise control signals necessary while our method still solves the tasks.

Uncertainty-aware Surrogate Models for Airfoil Flow Simulations with Denoising Diffusion Probabilistic Models

Our paper & source code on using diffusion models to infer RANS solutions for flows around airfoils is online now. It shows that diffusion models finally provide a reliable way to learn full distributions of solutions!


Here’s an example result, shown in terms of the standard deviation over 100 samples given one set of initial free stream conditions and a fixed airfoil shape:

Footnote: the heteroscedastic version (in blue) is not a competitor, it learns mean and standard deviation well, but can’t produce samples.
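For readers who want to reproduce this kind of plot with their own sampler, the statistic itself is simple: draw many samples for one fixed condition and take the per-cell mean and standard deviation. The sketch below uses random arrays as a stand-in for model output; the array layout (samples, channels, height, width) and the resolution are assumptions, not taken from our code.

```python
import numpy as np

def sample_statistics(samples):
    """Per-cell mean and (unbiased) standard deviation over the leading sample axis."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)

# Hypothetical stand-in for 100 samples drawn from a trained model for one
# fixed condition: 3 channels on a 32x32 grid.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.1, size=(100, 3, 32, 32))

mean_field, std_field = sample_statistics(samples)
print(mean_field.shape, std_field.shape)
```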

Here’s the full paper abstract for completeness: Leveraging neural networks as surrogate models for turbulence simulation is a topic of growing interest. At the same time, embodying the inherent uncertainty of simulations in the predictions of surrogate models remains very challenging. The present study makes a first attempt to use denoising diffusion probabilistic models (DDPMs) to train an uncertainty-aware surrogate model for turbulence simulations. Due to its prevalence, the simulation of flows around airfoils with various shapes, Reynolds numbers, and angles of attack is chosen as the learning objective. Our results show that DDPMs can successfully capture the whole distribution of solutions and, as a consequence, accurately estimate the uncertainty of the simulations. The performance of DDPMs is also compared with varying baselines in the form of Bayesian neural networks and heteroscedastic models. Experiments demonstrate that DDPMs outperform the other methods regarding a variety of accuracy metrics. Besides, it offers the advantage of providing access to the complete distributions of uncertainties rather than providing a set of parameters. As such, it can yield realistic and detailed samples from the distribution of solutions.

Diffusion models and score matching via differentiable physics: paper and source code

The final version of our NeurIPS paper merging physics simulations into the diffusion modeling process (SMDP) is on arXiv now:

Maybe even more importantly, the SMDP source code is online now at: , let us know how it works for you!

Here’s an overview of the algorithm:

Here’s a preview of one of the examples, diffusing a very simple decaying “physics” function:

Full paper abstract: Our work proposes a novel approach to solve inverse problems involving the temporal evolution of physics systems by leveraging the idea of score matching. The system’s current state is moved backward in time step by step by combining an approximate inverse physics simulator and a learned correction function. A central insight of our work is that training the learned correction with a single-step loss is equivalent to a score matching objective, while recursively predicting longer parts of the trajectory during training relates to maximum likelihood training of a corresponding probability flow. In the paper, we highlight the advantages of our algorithm compared to standard denoising score matching and implicit score matching, as well as fully learned baselines for a wide range of inverse physics problems. The resulting inverse solver has excellent accuracy and temporal stability and, in contrast to other learned inverse solvers, allows for sampling the posterior of the solutions.
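The backward stepping described in the abstract can be sketched for a toy decay process as follows. This is a simplified illustration under stated assumptions, not the SMDP code: the explicit Euler inverse step, the additive form of the correction, the optional noise term, and all function names are hypothetical, and `correction` is a placeholder for a trained network.

```python
import numpy as np

def inverse_decay_step(x, dt, rate=1.0):
    """Approximate inverse of the toy physics dx/dt = -rate * x (one explicit Euler step backward)."""
    return x + dt * rate * x

def reverse_step(x, dt, correction, noise_scale=0.0, rng=None):
    """One backward step: approximate inverse physics plus a correction, optionally with noise."""
    x_prev = inverse_decay_step(x, dt) + dt * correction(x)
    if noise_scale > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        x_prev = x_prev + noise_scale * np.sqrt(dt) * rng.standard_normal(np.shape(x))
    return x_prev

def solve_backward(x_end, n_steps, dt, correction, noise_scale=0.0, rng=None):
    """Move an end state backward in time step by step; returns states ordered earliest to latest."""
    trajectory = [np.asarray(x_end, dtype=float)]
    for _ in range(n_steps):
        trajectory.append(reverse_step(trajectory[-1], dt, correction, noise_scale, rng))
    return np.stack(trajectory[::-1])

# Placeholder correction that does nothing; a trained network would go here.
zero_correction = lambda x: np.zeros_like(x)
trajectory = solve_backward(np.array([0.5]), n_steps=10, dt=0.1, correction=zero_correction)
print(trajectory[0], trajectory[-1])
```

Running the loop several times with `noise_scale > 0` yields different trajectories that end in the same state, which is the mechanism behind sampling multiple plausible solutions.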

Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models in Action

Interested in trying out diffusion-based “neural simulators” for fluids yourselves? We’ve just added a notebook that lets you get started with training and probabilistic inference right away:

The image above shows a few generated posterior samples for the (tough) transonic flow dataset. Alternatively, you can also directly run it in colab via this link:

Note that this model loads a pre-trained diffusion model, and runs fine-tuning for 10 epochs. The full training would require ca. one day of runtime. Here’s also the temporal evaluation from the notebook:

Learning via Differentiable Physics for Plasma Turbulence

We’re also happy to announce the preprint of our paper on “Physics-Preserving AI-Accelerated Simulations of Plasma Turbulence”: , it’s great to see that training with a differentiable physics solver also yields accurate drift-wave turbulence!

The corresponding source code of Robin Greif’s implementation is also online at: , it contains a fully differentiable Hasegawa-Wakatani solver implemented with PhiFlow. (And a lot of tools for evaluation on top!)

Full paper abstract: Turbulence in fluids, gases, and plasmas remains an open problem of both practical and fundamental importance. Its irreducible complexity usually cannot be tackled computationally in a brute-force style. Here, we combine Large Eddy Simulation (LES) techniques with Machine Learning (ML) to retain only the largest dynamics explicitly, while small-scale dynamics are described by an ML-based sub-grid-scale model. Applying this novel approach to self-driven plasma turbulence allows us to remove large parts of the inertial range, reducing the computational effort by about three orders of magnitude, while retaining the statistical physical properties of the turbulent system.

Diffusion Models for Temporal Predictions of PDEs

Here’s another interesting result from the diffusion-based temporal predictions with ACDM: the diffusion training inherently works with losses computed on single timesteps, but is as stable as a model trained with many steps of unrolling; 16 are needed here:

We were also glad to find out that the diffusion sampling in the strongly conditioned regime of temporal forecasting works very well with few steps. Instead of 1000 (or so), 50 steps or fewer already work very well:

The corresponding project page is this one, and the source code can be found at:

Control of Two-way Coupled Fluid Systems with Differentiable Solvers

We finally also have the source code for our RB-control paper online: , the paper being available here:

A differentiable flow solver is used to train a controller that steers a rigid body to reach a goal position and orientation. Interestingly, the differentiable solver learns much faster and more reliably than the reinforcement learning variants we tried, and it clearly outperforms simpler baselines:

Paper abstract: We investigate the use of deep neural networks to control complex nonlinear dynamical systems, specifically the movement of a rigid body immersed in a fluid. We solve the Navier Stokes equations with two way coupling, which gives rise to nonlinear perturbations that make the control task very challenging. Neural networks are trained in an unsupervised way to act as controllers with desired characteristics through a process of learning from a differentiable simulator. Here we introduce a set of physically interpretable loss terms to let the networks learn robust and stable interactions. We demonstrate that controllers trained in a canonical setting with quiescent initial conditions reliably generalize to varied and challenging environments such as previously unseen inflow conditions and forcing, although they do not have any fluid information as input. Further, we show that controllers trained with our approach outperform a variety of classical and learned alternatives in terms of evaluation metrics and generalization capabilities.

ACDM Source Code on Github

The source code for our turbulent flow simulations using Autoregressive Conditional Diffusion Models (ACDMs) is online now at , let us know how it works!

Project summary: Our work targets the prediction of turbulent flow fields from an initial condition using autoregressive conditional diffusion models (ACDMs). Our method relies on the DDPM approach, a class of generative models based on a parameterized Markov chain. They can be trained to learn the conditional distribution of a target variable given a conditioning. In our case, the target variable is the flow field at the next time step, and the conditioning is the flow field at the current time step, i.e., the simulation trajectory is created via autoregressive unrolling of the model. We showed that ACDMs can accurately and probabilistically predict turbulent flow fields, and that the resulting trajectories align with the statistics of the underlying physics. Furthermore, ACDMs can generalize to flow parameters beyond the training regime, and exhibit high temporal rollout stability, without compromising the quality of generated samples.
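The autoregressive unrolling from the summary boils down to a short loop: sample the next state conditioned on the current one, then feed it back in. The sketch below shows only that loop structure. The sampler is heavily simplified (a real DDPM update involves noise-schedule terms that are omitted here), and `toy_denoiser` is a hypothetical stand-in for a trained network, not part of the ACDM code.

```python
import numpy as np

def sample_next_state(denoiser, condition, n_diffusion_steps, rng):
    """Simplified conditional sampling: start from noise and refine it step by step."""
    x = rng.standard_normal(condition.shape)
    for k in reversed(range(n_diffusion_steps)):
        x = denoiser(x, condition, k)
    return x

def autoregressive_rollout(denoiser, initial_state, n_time_steps, n_diffusion_steps=50, seed=0):
    """Build a trajectory by repeatedly sampling the next state given the current one."""
    rng = np.random.default_rng(seed)
    states = [np.asarray(initial_state, dtype=float)]
    for _ in range(n_time_steps):
        states.append(sample_next_state(denoiser, states[-1], n_diffusion_steps, rng))
    return np.stack(states)

def toy_denoiser(x, condition, k):
    """Hypothetical stand-in: pulls the sample toward a decayed copy of the condition."""
    return 0.5 * x + 0.5 * (0.9 * condition)

trajectory = autoregressive_rollout(toy_denoiser, np.ones(4), n_time_steps=3)
print(trajectory[:, 0])
```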

More details can also be found on the project website.

Hybrid Solver for Reactive Flows Source-Code and Paper accepted at DCE

We’re happy to report that our paper on learning hybrid solvers for reactive flows has now been accepted at the Data-Centric Engineering journal! The source code is also available now:

Full paper abstract: Modeling complex dynamical systems with only partial knowledge of their physical mechanisms is a crucial problem across all scientific and engineering disciplines. Purely data-driven approaches, which only make use of an artificial neural network and data, often fail to accurately simulate the evolution of the system dynamics over a sufficiently long time and in a physically consistent manner. Therefore, we propose a hybrid approach that uses a neural network model in combination with an incomplete partial differential equations (PDE) solver that provides known, but incomplete physical information. In this study, we demonstrate that the results obtained from the incomplete PDEs can be efficiently corrected at every time step by the proposed hybrid neural network – PDE solver model, so that the effect of the unknown physics present in the system is correctly accounted for. For validation purposes, the obtained simulations of the hybrid model are successfully compared against results coming from the complete set of PDEs describing the full physics of the considered system. We demonstrate the validity of the proposed approach on a reactive flow, an archetypal multi-physics system that combines fluid mechanics and chemistry, the latter being the physics considered unknown. Experiments are made on planar and Bunsen-type flames at various operating conditions. The hybrid neural network – PDE approach correctly models the flame evolution of the cases under study for significantly long time windows, yields improved generalization, and allows for larger simulation time steps.
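The "correct the incomplete solver at every time step" idea can be illustrated with a scalar toy problem. Everything below is an assumption made for illustration: the ODE, the constants, and in particular the linear least-squares fit, which stands in for the neural network and is fitted on single-step residuals rather than with the training procedure from the paper.

```python
import numpy as np

DT, DECAY, SOURCE = 0.01, 2.0, 1.5

def full_step(x):
    """Reference solver with the complete physics: dx/dt = -DECAY * x + SOURCE."""
    return x + DT * (-DECAY * x + SOURCE)

def incomplete_step(x):
    """Incomplete solver: the source term plays the role of the unknown physics."""
    return x + DT * (-DECAY * x)

# Fit a linear correction c(x) = w * x + b to the one-step residual of the incomplete solver.
x_train = np.linspace(0.0, 2.0, 50)
residual = full_step(x_train) - incomplete_step(x_train)
design = np.stack([x_train, np.ones_like(x_train)], axis=1)
coeffs, *_ = np.linalg.lstsq(design, residual, rcond=None)

def hybrid_step(x):
    """Incomplete solver step followed by the fitted correction."""
    return incomplete_step(x) + coeffs[0] * x + coeffs[1]

def rollout(step, x0, n_steps):
    """Apply a step function repeatedly and collect all states."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return np.array(states)

reference = rollout(full_step, 1.0, 200)
hybrid = rollout(hybrid_step, 1.0, 200)
uncorrected = rollout(incomplete_step, 1.0, 200)
print(reference[-1], hybrid[-1], uncorrected[-1])
```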

PDE-based Simulations and Turbulent Flows using Autoregressive Conditional Diffusion Models

Our paper on autoregressive diffusion models for improved temporal predictions of complex PDE-based simulations is online now at (ACDM). Above you can see a preview output from a transonic turbulent flow case. Especially the shock waves around the obstacle are very tough here.

To summarize the results of ACDM: it’s highly accurate and turns a regular Neural PDE solver into a probabilistic method, i.e. it can compute different versions of the solution via posterior sampling. This is an example:

A key insight is also that the diffusion training (ACDM) yields excellent temporal stability, despite being trained without any unrolling. Here’s a preview, with ACDM in orange versus a few baselines (GT in black):

Full abstract: Simulating turbulent flows is crucial for a wide range of applications, and machine learning-based solvers are gaining increasing relevance. However, achieving stability when generalizing to longer rollout horizons remains a persistent challenge for learned PDE solvers. We address this challenge by introducing a fully data-driven fluid solver that utilizes an autoregressive rollout based on conditional diffusion models. We show that this approach offers clear advantages in terms of rollout stability compared to other learned baselines. Remarkably, these improvements in stability are achieved without compromising the quality of generated samples, and our model successfully generalizes to flow parameters beyond the training regime. Additionally, the probabilistic nature of the diffusion approach allows for inferring predictions that align with the statistics of the underlying physics. We quantitatively and qualitatively evaluate the performance of our method on a range of challenging scenarios, including incompressible Navier-Stokes and transonic flows, as well as isotropic turbulence.

The project page can be found here.

Preprint for Learning Unsteady Cylinder Wakes

Our paper on learning accurate boundary conditions for cylinder wake flows with a differentiable physics NN is online now: , the full title being “Unsteady Cylinder Wakes from Arbitrary Bodies with Differentiable Physics-Assisted Neural Network”

Not too surprisingly, NNs can learn very accurate boundary conditions, but the neat thing is that Shuvayan has shown they’re very useful, e.g., for studying the behavior of new arrangements of cylinders.

Full paper abstract: This work delineates a hybrid predictive framework configured as a coarse-grained surrogate for reconstructing unsteady fluid flows around multiple cylinders of diverse configurations. The presence of cylinders of arbitrary nature causes abrupt changes in the local flow profile while globally exhibiting a wide spectrum of dynamical wakes fluctuating in either a periodic or chaotic manner. Consequently, the focal point of the present study is to establish predictive frameworks that accurately reconstruct the overall fluid velocity flowfield such that the local boundary layer profile, as well as the wake dynamics, are both preserved for long time horizons. The hybrid framework is realized using a base differentiable flow solver combined with a neural network, yielding a differentiable physics-assisted neural network (DPNN). The framework is trained using bodies with arbitrary shapes, and then it is tested and further assessed on out-of-distribution samples. Our results indicate that the neural network acts as a forcing function to correct the local boundary layer profile while also remarkably improving the dissipative nature of the flowfields. It is found that the DPNN framework clearly outperforms the supervised learning approach while respecting the reduced feature space dynamics. The model predictions for arbitrary bodies indicate that the Strouhal number distribution with respect to spacing ratio exhibits similar patterns with existing literature. In addition, our model predictions also enable us to discover similar wake categories for flow past arbitrary bodies. For the chaotic wakes, the present approach predicts the chaotic switch in gap flows up to the mid-time range.

Phiflow Validation Case Jupyter Notebooks

Here’s a nice collection of PhiFlow validation cases set up by Shuvayan, a post-doc in our lab. Best of all, they can be run in your browser; give it a try:

Summary: This repository is meant for the validation test-cases performed using Phiflow for benchmark fluid flow problems. The test cases include:

  • Lid-driven cavity (shear-driven)
  • Backward facing step
  • Taylor vortex decay problem

The Jupyter notebooks work well in Google Colab. The code is commented, and markdown cells with instructions have been added.
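As an example of what such a validation case compares against, the classical 2D Taylor-Green vortex has a closed-form solution that a solver's output can be checked against. The sketch below evaluates it with plain NumPy on a periodic [0, 2π)² grid; the domain, resolution, and viscosity are assumptions, and the notebook's exact setup may differ.

```python
import numpy as np

def taylor_green_velocity(x, y, t, nu):
    """Analytic velocity of the decaying 2D Taylor-Green vortex on a periodic domain."""
    decay = np.exp(-2.0 * nu * t)
    u = np.cos(x) * np.sin(y) * decay
    v = -np.sin(x) * np.cos(y) * decay
    return u, v

def kinetic_energy(u, v):
    """Mean kinetic energy per unit mass."""
    return 0.5 * np.mean(u ** 2 + v ** 2)

def divergence(u, v, dx):
    """Central-difference divergence on a periodic grid with indexing (x, y)."""
    du_dx = (np.roll(u, -1, axis=0) - np.roll(u, 1, axis=0)) / (2.0 * dx)
    dv_dy = (np.roll(v, -1, axis=1) - np.roll(v, 1, axis=1)) / (2.0 * dx)
    return du_dx + dv_dy

n, nu = 64, 0.1
coords = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
x, y = np.meshgrid(coords, coords, indexing="ij")

u0, v0 = taylor_green_velocity(x, y, t=0.0, nu=nu)
u1, v1 = taylor_green_velocity(x, y, t=1.0, nu=nu)

# The kinetic energy of the analytic solution decays as exp(-4 * nu * t).
energy_ratio = kinetic_energy(u1, v1) / kinetic_energy(u0, v0)
max_divergence = np.max(np.abs(divergence(u0, v0, coords[1] - coords[0])))
print(energy_ratio, max_divergence)
```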

PhiFlow Version 2.4 released

We’re happy to announce version 2.4 of PhiFlow, our differentiable simulation framework for machine learning. Among others, it now has improved support for sparse matrices, preconditioners and plotting:

The new features include:

  • Improved plots with additional recipes for bar charts and error bars.
  • Improved learning curve visualization with vis.load_scalars()
  • Decorator @math.broadcast to make functions compatible with Tensors.
  • Preconditioned linear solves (experimental) with ilu and cluster preconditioners
  • Improved support for sparse matrices
  • Additional math functions, such as soft_plus, factorial, log_gamma, safe_div, primal, is_inf, is_nan
  • Explicit device management with math.to_device()
  • Tensor unstacking to dict using **tensor.dim
  • Jit-compilable sparse neighbor search
  • Improved support for Φ-trees, added math.slice().
  • Broadcast string formatter using -f-f”…”

‘Neural Global Transport’ at ICLR 2023

We’re happy to report that our ‘Neural Global Transport’ paper titled “Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision” has been accepted to ICLR’23. We train a neural network to replace a difficult inverse solver (computing the 3D motion of a transparent volume) without having ground truth data. With the help of a differentiable flow solver, this works surprisingly well.

You can find the full paper on arXiv:

And the source code will be available here soon:

Full abstract: We address the challenging problem of jointly inferring the 3D flow and volumetric densities moving in a fluid from a monocular input video with a deep neural network. Despite the complexity of this task, we show that it is possible to train the corresponding networks without requiring any 3D ground truth for training. In the absence of ground truth data we can train our model with observations from real-world capture setups instead of relying on synthetic reconstructions. We make this unsupervised training approach possible by first generating an initial prototype volume which is then moved and transported over time without the need for volumetric supervision. Our approach relies purely on image-based losses, an adversarial discriminator network, and regularization. Our method can estimate long-term sequences in a stable manner, while achieving closely matching targets for inputs such as rising smoke plumes.

SMDP Video and the PBS group on Mastodon

A smaller update: Benjamin just gave a nice overview of our “Score Matching via Differentiable Physics” (SMDP) approach in the LOG2 reading group, you can check it out here: The talk gives a nice overview of the core method and the results.

Also, our “physics-based simulation” (PBS) group is online on Mastodon now (and has been for a while, actually) at , enjoy!

Diffusion Models and Score Matching for Physics Simulations

We’re happy to announce that our paper titled “Score Matching via Differentiable Physics” on employing diffusion models for physical systems is available on arXiv now:

The key idea is to use a physics operator (a simulator) as “drift” term of an SDE. Among others, we give a derivation that connects the learned corrections to the score of the underlying data distribution. Here’s a visual overview:

The solutions produced by our SMDP method are (at least) on-par with regular learned methods, while providing the fundamental advantage that they allow sampling the posterior. Here’s an example from a heat equation case:

Here’s the full paper abstract:

Diffusion models based on stochastic differential equations (SDEs) gradually perturb a data distribution p(x) over time by adding noise to it. A neural network is trained to approximate the score $\nabla_x \log p_t(x)$ at time t, which can be used to reverse the corruption process. In this paper, we focus on learning the score field that is associated with the time evolution according to a physics operator in the presence of natural non-deterministic physical processes like diffusion. A decisive difference to previous methods is that the SDE underlying our approach transforms the state of a physical system to another state at a later time. For that purpose, we replace the drift of the underlying SDE formulation with a differentiable simulator or a neural network approximation of the physics. We propose different training strategies based on the so-called probability flow ODE to fit a training set of simulation trajectories and discuss their relation to the score matching objective. For inference, we sample plausible trajectories that evolve towards a given end state using the reverse-time SDE and demonstrate the competitiveness of our approach for different challenging inverse problems.

Video for “Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics”

The video for our NeurIPS’22 paper of the same name is finally online, enjoy:

As mentioned before, the source code can be found on github: , the full paper abstract is:

We present a novel method for guaranteeing linear momentum in learned physics simulations. Unlike existing methods, we enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers. We combine these strict constraints with a hierarchical network architecture, a carefully constructed resampling scheme, and a training approach for temporal coherence. In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially. In addition, the induced physical bias leads to significantly better generalization performance and makes our method more reliable in unseen test cases. We evaluate our method on a range of different, challenging fluid scenarios. Among others, we demonstrate that our approach generalizes to new scenarios with up to one million particles. Our results show that the proposed algorithm can learn complex dynamics while outperforming existing approaches in generalization and training performance.
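Why antisymmetry gives momentum conservation as a hard constraint can be seen in a few lines: if the contribution of particle j to particle i is exactly the negative of the contribution of i to j, all pairwise terms cancel in the sum. The sketch below uses a fixed, hand-written radial function in place of learned convolution weights and assumes equal particle masses; it illustrates the principle and is not the paper's layer.

```python
import numpy as np

def pairwise_updates(positions, radial_weight, support):
    """Per-particle update from an antisymmetric pairwise kernel k(r) = w(|r|) * r."""
    r = positions[None, :, :] - positions[:, None, :]   # r[i, j] = x_j - x_i
    dist = np.linalg.norm(r, axis=-1)
    inside = (dist < support) & (dist > 0.0)
    magnitude = np.where(inside, radial_weight(dist / support), 0.0)
    # k(r_ij) = -k(r_ji), since the magnitude depends only on |r| and r flips sign.
    return (magnitude[..., None] * r).sum(axis=1)

# Arbitrary radial profile standing in for learned weights.
radial_weight = lambda q: (1.0 - q) ** 2 * np.cos(3.0 * q)

rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 1.0, size=(200, 2))
updates = pairwise_updates(positions, radial_weight, support=0.2)

# With equal masses, the summed update is the change in total linear momentum.
total = updates.sum(axis=0)
print(total)
```

The individual updates are non-zero, but their sum vanishes up to floating-point round-off, independent of the particle configuration or the radial profile.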

Untangled Layered Neural Fields for Mix-and-Match Virtual Try-On at NeurIPS

We’re happy to report that our paper “ULNeF: Untangled Layered Neural Fields for Mix-and-Match Virtual Try-On” will be presented at NeurIPS next week. This approach solves the problem of multi-layer cloth collision with a learned untangling operator.

The project website is , and the corresponding video is highly recommended.

Full abstract: Recent advances in neural models have shown great results for virtual try-on (VTO) problems, where a 3D representation of a garment is deformed to fit a target body shape. However, current solutions are limited to a single garment layer, and cannot address the combinatorial complexity of mixing different garments. Motivated by this limitation, we investigate the use of neural fields for mix-and-match VTO, and identify and solve a fundamental challenge that existing neural-field methods cannot address: the interaction between layered neural fields. To this end, we propose a neural model that untangles layered neural fields to represent collision-free garment surfaces. The key ingredient is a neural untangling projection operator that works directly on the layered neural fields, not on explicit surface representations. Algorithms to resolve object-object interaction are inherently limited by the use of explicit geometric representations, and we show how methods that work directly on neural implicit representations could bring a change of paradigm and open the door to radically different approaches.

“Learning Similarity Metrics for Volumetric Simulations with Multiscale CNNs” at AAAI Conference

We’re happy to announce that our paper “Learning Similarity Metrics for Volumetric Simulations with Multiscale CNNs” was just accepted at the AAAI Conference on Artificial Intelligence.

It targets the similarity assessment of complex simulated datasets via an entropy-based CNN for 3D multi-channel data. The preprint can be found here, or try it out yourself via a pretrained model at

Full abstract: Simulations that produce three-dimensional data are ubiquitous in science, ranging from fluid flows to plasma physics. We propose a similarity model based on entropy, which allows for the creation of physically meaningful ground truth distances for the similarity assessment of scalar and vectorial data, produced from transport and motion-based simulations. Utilizing two data acquisition methods derived from this model, we create collections of fields from numerical PDE solvers and existing simulation data repositories, and highlight the importance of an appropriate data distribution for an effective training process. Furthermore, a multiscale CNN architecture that computes a volumetric similarity metric (VolSiM) is proposed. To the best of our knowledge this is the first learning method inherently designed to address the challenges arising for the similarity assessment of high-dimensional simulation data. Additionally, the tradeoff between a large batch size and an accurate correlation computation for correlation-based loss functions is investigated, and the metric’s invariance with respect to rotation and scale operations is analyzed. Finally, the robustness and generalization of VolSiM is evaluated on a large range of test data, as well as a particularly challenging turbulence case study, that is close to potential real-world applications.

“Reviving Autoencoder Pretraining” published in Neural Computing and Applications Journal

We’re happy to report that our paper “Reviving Autoencoder Pretraining” was finally published in Neural Computing and Applications Journal, and is available online now at

In short, the paper targets learning features via a forward-backward pass that was inspired by the ping-pong loss from our TecoGAN paper. There we used it to self-supervise video sequences in terms of forward and backward dynamics, while the new paper uses it to train networks that are “as-invertible-as-possible”. Originally, we tried to coin the term “racecar” loss, the word “racecar” being a nice palindrome to highlight the bidirectional nature.

Abstract: The pressing need for pretraining algorithms has been diminished by numerous advances in terms of regularization, architectures, and optimizers. Despite this trend, we re-visit the classic idea of unsupervised autoencoder pretraining and propose a modified variant that relies on a full reverse pass trained in conjunction with a given training task. This yields networks that are as-invertible-as-possible, and share mutual information across all constrained layers. We additionally establish links between singular value decomposition and pretraining and show how it can be leveraged for gaining insights about the learned structures. Most importantly, we demonstrate that our approach yields an improved performance for a wide variety of relevant learning and transfer tasks ranging from fully connected networks over residual neural networks to generative adversarial networks. Our results demonstrate that unsupervised pretraining has not lost its practical relevance in today’s deep learning environment.
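As a rough illustration of a forward-backward pass, consider a single linear layer whose reverse pass reuses the transposed weights, with a loss that penalizes the mismatch between the input and its reconstruction. This is one plausible reading for a toy case, assuming no bias or nonlinearity; the function name and setup are hypothetical and the actual formulation in the paper is more general.

```python
import numpy as np

def reverse_pass_loss(weight, inputs):
    """Mean squared error between inputs and their reconstruction via the transposed weights."""
    hidden = inputs @ weight.T         # forward pass: (batch, n_out)
    reconstructed = hidden @ weight    # reverse pass with transposed weights: (batch, n_in)
    return np.mean((reconstructed - inputs) ** 2)

rng = np.random.default_rng(0)
inputs = rng.standard_normal((16, 4))
random_weight = rng.standard_normal((8, 4))

# A weight matrix with orthonormal columns is exactly invertible by its transpose.
orthonormal_weight, _ = np.linalg.qr(random_weight)

loss_random = reverse_pass_loss(random_weight, inputs)
loss_orthonormal = reverse_pass_loss(orthonormal_weight, inputs)
print(loss_random, loss_orthonormal)
```

Adding such a term to a regular task loss pushes the weights toward the orthonormal case, which also hints at the link to singular value decomposition mentioned in the abstract.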

Further info can be found here.