racecar Training for Improved Generalization

Our paper on improving neural network generalization via a forward-backward pass is also finally online, together with a first code example. A common question that we get about this project: “why racecar”? This is worth explaining here in a bit more detail: it’s not about the speed of the method, but rather that racecar is a nice palindrome. If you reverse the word, you still have “racecar”. Our training approach also makes use of a reversed neural network architecture, re-using all existing building blocks of the network and their weights, somewhat similar to a palindrome. Hence the name. Interestingly, this reverse structure yields an embedding of singular vectors into the weight matrices, and improves performance for new tasks, as we show for a variety of classification and generation tasks in our paper.

Paper Abstract:
We propose a novel training approach for improving the generalization in neural networks. We show that in contrast to regular constraints for orthogonality, our approach represents a data-dependent orthogonality constraint, and is closely related to singular value decompositions of the weight matrices. We also show how our formulation is easy to realize in practical network architectures via a reverse pass, which aims for reconstructing the full sequence of internal states of the network. Despite being a surprisingly simple change, we demonstrate that this forward-backward training approach, which we refer to as racecar training, leads to significantly more generic features being extracted from a given data set. Networks trained with our approach show more balanced mutual information between input and output throughout all layers, yield improved explainability, and exhibit improved performance for a variety of tasks and task transfers.
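
To make the reverse-pass idea a bit more tangible, here is a minimal sketch. It is not the implementation from the paper: it assumes a purely linear network without activations or biases, and the function names (racecar_style_loss, rotation, ...) are made up for illustration. The reverse pass reuses the transposed weights of the forward pass and is penalized for deviating from the stored internal states.

```python
import math

def matvec(W, x):
    """Multiply a matrix W (list of rows) with a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def racecar_style_loss(weights, x):
    """Forward pass through all layers, then a reverse pass that reuses the
    transposed weights and tries to reconstruct every internal state."""
    states = [x]
    for W in weights:                       # forward pass, store all states
        states.append(matvec(W, states[-1]))
    loss = 0.0
    h = states[-1]
    for W, target in zip(reversed(weights), reversed(states[:-1])):
        h = matvec(transpose(W), h)         # reverse pass with the same weights
        loss += sum((a - b) ** 2 for a, b in zip(h, target))
    return loss

def rotation(angle):
    """2x2 rotation matrix, used here as an example of orthogonal weights."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]
```

In this simplified linear setting, orthogonal weights such as rotations reconstruct every state exactly and give a zero loss, while a non-orthogonal scaling matrix does not. This is only meant to hint at the connection to orthogonality mentioned in the abstract.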

Medium-range weather forecasting with deep learning

Our new paper “Purely data-driven medium-range weather forecasting achieves comparable skill to physical models at similar resolution” is available now on arXiv: https://arxiv.org/abs/2008.08626

We show that with enough data, a deep-learning based model can actually compete with and in some cases outperform established physical models (e.g., IFS forecasts for 210 km resolution). We show how such models can be trained based on the WeatherBench data set, that they contain plausible learned structures, and that they also fare well for challenging fields such as precipitation. At the same time, our results illustrate that it will be very difficult to increase the performance only with the data that is currently available.

Full abstract: Numerical weather prediction has traditionally been based on physical models of the atmosphere. Recently, however, the rise of deep learning has created increased interest in purely data-driven medium-range weather forecasting with first studies exploring the feasibility of such an approach. Here, we train a significantly larger model than in previous studies to predict geopotential, temperature and precipitation up to 5 days ahead and achieve comparable skill to a physical model run at similar horizontal resolution. Crucially, we pretrain our models on historical climate model output before fine-tuning them on the reanalysis data. We also analyze how the neural network creates its predictions and find that, with some exceptions, it is compatible with physical reasoning. Our results indicate that, given enough training data, data-driven models can compete with physical models. At the same time, there is likely not enough data to scale this approach to the resolutions of current operational models.

Differentiable Physics Simulations for Deep Learning: Paper & Overview Talk online

We’re happy to report that our paper on using differentiable physics to reduce errors in PDEs is online now, and a corresponding overview talk is also available now:
– Solver-in-the-Loop: Learning from Differentiable Physics to Interact with Iterative PDE-Solvers
– Differentiable Physics Simulations for Deep Learning, Talk by Nils Thuerey

Our results demonstrate that differentiable physics simulations are a powerful tool, and they neatly fit into the current larger deep learning trend of generic “Differentiable Programming”. They not only yield very good minimizers in terms of well-trained neural networks: a nice side effect is that they allow for leveraging the powerful numerical methods that already exist for physical simulations, and employing them to improve the training of deep neural nets.

Solver-in-the-Loop Paper Abstract: Finding accurate solutions to partial differential equations (PDEs) is a crucial task in all scientific and engineering disciplines. It has recently been shown that machine learning methods can improve the solution accuracy by correcting for effects not captured by the discretized PDE. We target the problem of reducing numerical errors of iterative PDE solvers and compare different learning approaches for finding complex correction functions. We find that previously used learning approaches are significantly outperformed by methods that integrate the solver into the training loop and thereby allow the model to interact with the PDE during training. This provides the model with realistic input distributions that take previous corrections into account, yielding improvements in accuracy with stable rollouts of several hundred recurrent evaluation steps and surpassing even tailored supervised variants. We highlight the performance of the differentiable physics networks for a wide variety of PDEs, from non-linear advection-diffusion systems to three-dimensional Navier-Stokes flows.
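
The core idea of integrating the solver into the training loop can be sketched with a deliberately tiny example. The following is an illustration under strong assumptions, not the setup from the paper: the "PDE solver" is a scalar decay step, the "network" is a single parameter theta, and all names and constants are made up. The gradient is propagated through the whole unrolled rollout, so the correction is trained on states that already contain its own earlier corrections.

```python
def rollout(theta, a_coarse, x0, n_steps):
    """Unroll the corrected solver and track dx/dtheta alongside the state."""
    x, dx = x0, 0.0
    states, grads = [], []
    for _ in range(n_steps):
        # coarse solver step plus learned correction theta * x; the
        # sensitivity dx is pushed through the very same step
        x, dx = (a_coarse + theta) * x, x + (a_coarse + theta) * dx
        states.append(x)
        grads.append(dx)
    return states, grads

def train(a_ref=0.9, a_coarse=0.8, x0=1.0, n_steps=8, lr=0.005, iters=300):
    """Fit theta so that the corrected coarse rollout matches the reference."""
    reference = [x0 * a_ref ** (t + 1) for t in range(n_steps)]
    theta = 0.0
    for _ in range(iters):
        states, grads = rollout(theta, a_coarse, x0, n_steps)
        grad = sum(2.0 * (s - r) * g
                   for s, r, g in zip(states, reference, grads))
        theta -= lr * grad                  # plain gradient descent
    return theta
```

With these toy settings the learned correction approaches the gap between the reference and the coarse decay factor. A real setup would replace the scalar step with a differentiable PDE solver and the parameter with a neural network.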

Final Version of “Learning Similarity Metrics for Numerical Simulations” Online

We’re happy to report that the final version of our paper on “Learning Similarity Metrics for Numerical Simulations” to be presented at the International Conference on Machine Learning (ICML) is online now. We propose learning a metric for data produced by numerical simulations, i.e. PDEs such as Navier-Stokes, and a way to train Siamese networks with a correlation-based loss to improve the inference of similarities. The resulting deep learning based metric outperforms simpler metrics and other learned metrics such as LPIPS.

Assessing similarity for complex data is a fundamental problem in all computational disciplines ranging from simulations of blood flow to aircraft design. Many practical problems rely on highly complex PDEs, where small perturbations in the input drastically alter the solutions. Regular vector space metrics like the L² distance are unreliable as they perform an element-wise comparison, and thus cannot compare contextual information or structures on different scales. Our approach, dubbed LSiM, employs convolutional neural networks (CNNs) as a method to extract and compare more meaningful features from a set of two simulation frames.
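
A correlation-based loss of the kind mentioned above can be sketched as follows. This is a simplified illustration and not the full LSiM loss: it only uses a Pearson correlation term, and the plain L2 distance serves as a stand-in for the distance that a Siamese network would compute on learned features. All function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_loss(predicted, ground_truth):
    """Small when the predicted distances are linearly correlated with the
    known distances of a controlled data generation setup."""
    return 1.0 - pearson(predicted, ground_truth)

def l2_distance(a, b):
    """Stand-in for the distance a learned metric would predict."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
```

For example, predicted distances [1, 2, 3] against ground-truth distances [10, 20, 30] are perfectly correlated and give a loss close to zero, whereas a reversed ordering gives a loss close to two.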

You can check out:
– the updated pre-print on arXiv 2002.07863,
– our website with further details, code, data etc.,
– or the full list of accepted ICML 2020 papers.

GANs for Temporal Self-Supervision of Videos

We typically focus on deep-learning methods for physical data, with a particular emphasis on Navier-Stokes & fluids. However, beyond latent-space simulation algorithms and learning with differentiable solvers, generative models have also been a central theme of our work.

Motivated by time-dependent problems from the physics area, we especially focus on spatio-temporal data such as videos. Here, self-supervision in space, as well as time, has shown lots of promise, e.g., in the form of the TecoGAN model, which can handle video super-resolution and unpaired video translation, among others. This is the video of a talk given at the CLIP workshop at CVPR 2020, where we demonstrate generative adversarial networks for video super-resolution, as well as unpaired video translation. In addition, we’ve targeted improved evaluation metrics for video content. In particular, Nils highlights our choice of a perceptual metric (such as LPIPS), in addition to a temporal perceptual evaluation (tLP) and a motion estimate (tOF). We’ve tested these across a range of examples and verified their rankings with user studies.

Here you can see a part of the perceptual evaluation from our user studies:

Further details:
Talk on YouTube
Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation (TecoGAN)
TecoGAN source code

Source Code and Video for Learning Convolutions in Point-based Representations online

The video and full implementation for our ICLR 2020 paper “Lagrangian Fluid Simulation with Continuous Convolutions” are online now. It enables flexible and efficient learning of dynamics, e.g., for Lagrangian Navier-Stokes solves similar to Smoothed Particle Hydrodynamics (SPH). But instead of relying on analytic kernel formulations as used in SPH, our method learns the dynamics from data.

Full Abstract: We present an approach to Lagrangian fluid simulation with a new type of convolutional network. Our networks process sets of moving particles, which describe fluids in space and time. Unlike previous approaches, we do not build an explicit graph structure to connect the particles but use spatial convolutions as the main differentiable operation that relates particles to their neighbors. To this end we present a simple, novel, and effective extension of N-D convolutions to the continuous domain. We show that our network architecture can simulate different materials, generalizes to arbitrary collision geometries, and can be used for inverse problems. In addition, we demonstrate that our continuous convolutions outperform prior formulations in terms of accuracy and speed.
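
To give a rough idea of what a convolution over particles in a continuous domain can look like, here is a one-dimensional sketch. It makes several assumptions that are not taken from the paper: the kernel weights are stored at regular points and interpolated linearly, a polynomial window limits the support, and no normalization is applied. The function names are hypothetical.

```python
def window(r, radius):
    """Smooth falloff that reaches zero at the support radius."""
    q = (r / radius) ** 2
    return (1.0 - q) ** 3 if q < 1.0 else 0.0

def kernel_value(kernel, offset, radius):
    """Linearly interpolate kernel weights stored at regular points
    in [-radius, radius]; kernel needs at least two entries."""
    u = (offset + radius) / (2.0 * radius) * (len(kernel) - 1)
    i = min(int(u), len(kernel) - 2)
    t = u - i
    return (1.0 - t) * kernel[i] + t * kernel[i + 1]

def continuous_conv(positions, features, kernel, radius):
    """For every particle, sum the neighbor features weighted by the
    window and by the kernel evaluated at the relative position."""
    out = []
    for xi in positions:
        acc = 0.0
        for xj, fj in zip(positions, features):
            d = xj - xi
            if abs(d) < radius:             # only neighbors inside the support
                acc += window(abs(d), radius) * kernel_value(kernel, d, radius) * fj
        out.append(acc)
    return out
```

Note that no graph of particle connections is built: neighbors are found purely by distance, and the kernel entries would be the trainable parameters. The brute-force neighbor loop is only for readability; a practical version would use a spatial search structure.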

WeatherBench: Benchmark dataset for data-driven weather forecasting available online

It is worth highlighting that our benchmark dataset for data-driven weather forecasting, i.e. WeatherBench, is fully available online now. You can

Here’s the full abstract: Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used for numerical weather prediction. First studies show promise but the lack of a common dataset and evaluation metrics make inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate the use in machine learning models. We propose a simple and clear evaluation metric which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models as well as purely physical forecasting models. We hope that this dataset will accelerate research in data-driven weather forecasting.
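
As an illustration of what an evaluation metric for gridded global forecasts can look like, the sketch below computes a latitude-weighted root mean squared error for a single forecast field. This is an assumption for illustration: please check the paper for the exact definition used in the benchmark. The function name and the [lat][lon] list layout are made up here.

```python
import math

def lat_weighted_rmse(forecast, truth, lats_deg):
    """RMSE over a regular lat-lon grid where each row is weighted by the
    cosine of its latitude, normalized so that the weights average to one.
    forecast and truth are 2D lists indexed as [lat][lon]."""
    cos_lat = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cos_lat) / len(cos_lat)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(cos_lat, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += (w / mean_cos) * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)
```

The weighting accounts for the fact that grid cells near the poles cover a smaller area than cells near the equator, so errors there should count less.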

CVPR Paper on Physics-based Reconstructions of 3D Scans of Deformable Objects online now

Our CVPR paper titled “Correspondence-Free Material Reconstruction using Sparse Surface Constraints” is online now with a pre-print, source code, and the corresponding video! Enjoy. The paper proposes a method to optimize for solutions of a finite-element elastodynamics solver that match a set of given observations (in the form of a depth-video). This method does not employ any neural networks or deep learning methods for a change, but is nonetheless closely related due to its gradient-based optimization scheme.

Full abstract: We address the problem to infer physical material parameters and boundary conditions from the observed motion of a homogeneous deformable object via the solution of an inverse problem. Parameters are estimated from potentially unreliable real-world data sources such as sparse observations without correspondences. We introduce a novel Lagrangian-Eulerian optimization formulation, including a cost function that penalizes differences to observations during an optimization run. This formulation matches correspondence-free, sparse observations from a single-view depth sequence with a finite element simulation of deformable bodies. In conjunction with an efficient hexahedral discretization and a stable, implicit formulation of collisions, our method can be used in demanding situation to recover a variety of material parameters, ranging from Young’s modulus and Poisson ratio to gravity and stiffness damping, and even external boundaries. In a number of tests using synthetic datasets and real-world measurements, we analyse the robustness of our approach and the convergence behavior of the numerical optimization scheme.
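
The general principle of recovering a material parameter by penalizing differences to observations can be illustrated with a toy problem. The sketch below is far simpler than the method in the paper and makes its own assumptions: the "simulation" is a static spring under a hanging weight, the observations are noise-free displacements, and a Gauss-Newton update is used as one possible gradient-based scheme. All names and values are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2

def simulate_displacement(stiffness, mass):
    """Static displacement of a spring loaded with a hanging mass."""
    return mass * G / stiffness

def estimate_stiffness(masses, observed, k_init=20.0, iters=30):
    """Gauss-Newton iterations on the squared difference between the
    simulated and the observed displacements."""
    k = k_init
    for _ in range(iters):
        residuals = [simulate_displacement(k, m) - u
                     for m, u in zip(masses, observed)]
        jac = [-m * G / k ** 2 for m in masses]   # d(displacement)/dk
        jtr = sum(j * r for j, r in zip(jac, residuals))
        jtj = sum(j * j for j in jac)
        k -= jtr / jtj
    return k
```

Given displacements generated with a stiffness of 50 and the default initial guess of 20, the estimate converges back to 50. Like other Newton-type schemes, this toy iteration depends on a reasonable initial guess.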

Research updates – deep learning for simulation metrics, and reduced representations

We’ve recently posted first versions of our works that target deep learning for numerical simulations in the context of metrics and reduced spaces.

The first one, the Learned Simulation Metric (LSiM), employs a Siamese network architecture that is motivated by the mathematical properties of a metric. We leverage a controllable data generation setup with partial differential equation (PDE) solvers to create increasingly different outputs from a reference simulation in a controlled environment. A central component of our learned metric is a specialized loss function that introduces knowledge about the correlation between single data samples into the training process. We demonstrate its usefulness with a wide range of data sets, from fluid flow with Navier-Stokes, to weather data from real-world measurements.

The second paper targets controlled latent space mappings, i.e., a “subdivision” of the latent space vector for different physical fields. Here, we focus on single-phase smoke simulations in 2D and 3D based on the incompressible Navier-Stokes (NS) equations. To achieve stable predictions for long-term flow sequences, a convolutional neural network (CNN) is trained for spatial compression in combination with a temporal prediction network that consists of stacked Long Short-Term Memory (LSTM) layers. The central idea is a novel latent space subdivision (LSS) to separate the respective input quantities into individual parts of the encoded latent space domain. This makes it possible to distinctively alter the encoded quantities without interfering with the remaining latent space values, and hence maximizes external control.
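
The effect of such a subdivision can be sketched in a few lines. The layout below is purely hypothetical (the sizes and the choice of fields are assumptions for illustration, not the ones from the paper), and the encoder and decoder networks are left out entirely.

```python
# Hypothetical layout: which slice of the latent vector belongs to which field.
LAYOUT = {"velocity": slice(0, 4), "density": slice(4, 6)}

def alter_part(latent, name, new_values):
    """Return a copy of the latent vector in which only the part reserved
    for the given quantity is replaced."""
    part = LAYOUT[name]
    if len(new_values) != part.stop - part.start:
        raise ValueError("new values must match the size of the latent part")
    altered = list(latent)
    altered[part] = new_values
    return altered
```

Because each quantity owns a fixed range of the latent vector, replacing the density part leaves the velocity entries untouched before the vector is handed to the temporal prediction or the decoder.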

For details you can check out our webpages for Learning Similarity Metrics for Numerical Simulations and Latent Space Subdivision: Stable and Controllable Time Predictions for Fluid Flow.

ERC Consolidator Grant “SpaTe”

All signals are GO for the upcoming ERC Consolidator Grant award to N. Thuerey – we’re happy to report that the official grant agreement between TUM and the ERC has been signed. The project is titled “Spatio-Temporal Methods for Data-driven Computer Animation and Simulation”, or in short SpaTe (as in “spatio-temporal” methods). It will build on and extend our previous work on spatio-temporal learning and GANs (e.g. for fluids or videos), as well as differentiable physics simulations, such as phiflow, and data capturing ventures (cf. ScalarFlow). In terms of phenomena, we will not only target fluids or Navier-Stokes solutions, but also elastic and non-Newtonian materials. We believe that deep learning methods for differentiable numerical simulations represent an exciting area of research and will be of fundamental impact for a wide variety of scientific fields.

The following press release of TUM’s CS department gives a good summary: Prof. Nils Thuerey receives funding in the form of a Consolidator Grant from the European Research Council (ERC). The goal of his “SpaTe” project is to teach physics to computers – using data and examples rather than the conventional equation-based method. Software that predicts how gases, liquids and solids change shape under certain conditions and over a certain period is used in an enormous number of industries and research projects. So far, these programs have required enormous processing power. Machine learning has the potential for flexible and realistic modelling of temporal processes. Very little research has taken place in this area, however. In his project, Prof. Nils Thuerey wants to develop new algorithms to put machine learning to use in physical simulations. In the future this might make it possible, for example, to detect automatically from video data which physical materials are present and analyze their behavior. For his research he has previously received an ERC Starting Grant and Proof of Concept Grant.

The official press release of the European Research Council can be found here.

Physics-based Deep Learning at ICLR’20: 2 Spotlights and 1 Poster coming up

We are happy to announce two upcoming spotlight presentations, and one poster at ICLR 2020:
Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds
Lagrangian Fluid Simulation with Continuous Convolutions, and
Learning to Control PDEs with Differentiable Physics.
Not surprisingly, all three focus on physics-based deep learning techniques and Navier-Stokes problems – the first two of them employ Lagrangian methods, while the third one focuses on differentiable simulations, with a particular focus on long-term temporal stability. Abstracts follow below…

Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds: Point clouds, as a form of Lagrangian representation, allow for powerful and flexible applications in a large number of computational disciplines. We propose a novel deep-learning method to learn stable and temporally coherent feature spaces for point clouds that change over time. We identify a set of inherent problems with these approaches: without knowledge of the time dimension, the inferred solutions can exhibit strong flickering, and easy solutions to suppress this flickering can result in undesirable local minima that manifest themselves as halo structures. We propose a novel temporal loss function that takes into account higher time derivatives of the point positions, and encourages mingling, i.e., to prevent the aforementioned halos. We combine these techniques in a super-resolution method with a truncation approach to flexibly adapt the size of the generated positions. We show that our method works for large, deforming point sets from different sources to demonstrate the flexibility of our approach.

Lagrangian Fluid Simulation with Continuous Convolutions: We present an approach to Lagrangian fluid simulation with a new type of convolutional network. Our networks process sets of moving particles, which describe fluids in space and time. Unlike previous approaches, we do not build an explicit graph structure to connect the particles but use spatial convolutions as the main differentiable operation that relates particles to their neighbors. To this end we present a simple, novel, and effective extension of N-D convolutions to the continuous domain. We show that our network architecture can simulate different materials, generalizes to arbitrary collision geometries, and can be used for inverse problems. In addition, we demonstrate that our continuous convolutions outperform prior formulations in terms of accuracy and speed.

Learning to Control PDEs with Differentiable Physics: Predicting outcomes and planning interactions with the physical world are long-standing goals for machine learning. A variety of such tasks involves continuous physical systems, which can be described by partial differential equations (PDEs) with many degrees of freedom. Existing methods that aim to control the dynamics of such systems are typically limited to relatively short time frames or a small number of interaction parameters. We show that by using a differentiable PDE solver in conjunction with a novel predictor-corrector scheme, we can train neural networks to understand and control complex nonlinear physical systems over long time frames. We demonstrate that our method successfully develops an understanding of complex physical systems and learns to control them for tasks involving multiple PDEs, including the incompressible Navier-Stokes equations.

Differentiable solving framework PhiFlow is online now

Our fully differentiable physics-solving framework is online now at https://github.com/tum-pbs/PhiFlow. Having all functionality of, e.g., a fluid simulation running in TensorFlow opens up the possibility of back-propagating gradients through the simulation as well as running the simulation on GPUs.

PhiFlow (among others) supports the following things:

  • Support for a variety of differentiable simulation types, from Burgers over Navier-Stokes to the Schrödinger equation.
  • Tight integration with TensorFlow allowing for straightforward network training with fully differentiable simulations that run on the GPU.
  • Object-oriented architecture enabling concise and expressive code, designed for ease of use and extensibility.
  • Reusable simulation code, independent of backend and dimensionality, i.e. the exact same code can run a 2D fluid sim using NumPy and a 3D fluid sim on the GPU using TensorFlow.
  • Flexible, easy-to-use web interface featuring live visualizations and interactive controls that can affect simulations or network training on the fly.
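
To illustrate what back-propagating gradients through a simulation means, here is a small self-contained sketch in plain Python. It does not use the PhiFlow API. It assumes a 1D periodic diffusion step as the "simulation", and all function names and the diffusion coefficient are made up. The gradient of a loss on the final state with respect to the initial state is obtained by running the adjoint of every step.

```python
def diffuse(u, nu=0.2):
    """One explicit diffusion step on a periodic 1D grid."""
    n = len(u)
    return [u[i] + nu * (u[i - 1] - 2.0 * u[i] + u[(i + 1) % n])
            for i in range(n)]

def simulate(u0, steps):
    u = list(u0)
    for _ in range(steps):
        u = diffuse(u)
    return u

def loss(u0, target, steps):
    """Half the squared distance between the final state and a target."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(simulate(u0, steps), target))

def loss_gradient(u0, target, steps):
    """Gradient of the loss w.r.t. the initial state via the adjoint.
    The periodic diffusion step is a symmetric linear operator, so its
    transpose (the backward pass) is the step itself."""
    residual = [a - b for a, b in zip(simulate(u0, steps), target)]
    return simulate(residual, steps)
```

A framework with automatic differentiation derives this backward pass for arbitrary, non-linear solvers; the hand-written adjoint only works here because the toy step is linear and symmetric. The result can be checked against finite differences of the loss.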

Full scalarFlow Data-set online

Our full scalarFlow data set, with almost half a terabyte size, is online now! You can request a download link via the following sign-up form. It contains more than 100 sequences of fluid flows reconstructed with a Navier-Stokes-based reconstruction algorithm. Also check out our data-set page which describes the full hardware setup. You “only” need a few Raspberry-Pi computers with cameras: https://ge.in.tum.de/publications/2019-scalarflow-eckert/

Paper Abstract: We present ScalarFlow, a first large-scale data set of reconstructions of real-world smoke plumes. In addition, we propose a framework for accurate physics-based reconstructions from a small number of video streams. Central components of our framework are a novel estimation of unseen inflow regions and an efficient optimization scheme constrained by a simulation to capture real-world fluids. Our data set includes a large number of complex natural buoyancy-driven flows. The flows transition to turbulence and contain observable scalar transport processes. As such, the ScalarFlow data set is tailored towards computer graphics, vision, and learning applications. The published data set will contain volumetric reconstructions of velocity and density as well as the corresponding input image sequences with calibration data, code, and instructions how to reproduce the commodity hardware capture setup. We further demonstrate one of the many potential applications: a first perceptual evaluation study, which reveals that the complexity of the reconstructed flows would require large simulation resolutions for regular solvers in order to recreate at least parts of the natural complexity contained in the captured data.

Generating Game / Video & Fluid Sequences Using Spatio-Temporal GANs

Focusing on temporally coherent detail synthesis, we present a range of new and existing research projects that all employ spatio-temporal self-supervision with GANs. This concept is beneficial for a variety of challenging tasks: from video translations, to super-resolution for games, videos and for fluid flow effects (i.e. Navier-Stokes solutions).

  • Video translation for unpaired data from different domains (Ours is shown as TecoGAN):

  • Game super-resolution using strongly aliased input (Ours is shown as DRR):

  • For video super-resolution, the TecoGAN model (Ours) yields coherent and realistic details:

  • We also proposed an algorithm for 3D fluid super-resolution with a factor of eight in each dimension:

In all tasks, we found that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. Compared to adding traditional temporal losses (e.g., L1 or L2 distances of warped frames) to normal spatial GANs, spatio-temporal discriminators are able to deal with more challenging learning objectives such as sharp detail over time.

Besides spatio-temporal GANs, we have developed different technologies to tackle individual challenges across these tasks. In unpaired video translation, we use curriculum learning for discriminators to achieve better adversarial equilibriums. For strongly aliased renderings in games, we propose depth-recurrent residual connections to learn stable temporal states. In video super-resolution, a bi-directional Ping-Pong loss is proposed to improve long-term temporal coherence. When processing large volumetric fluid data, a multi-pass GAN is used to break down the data relationships from 3D+t to lower dimensions. Preprints of the papers and the code will be available soon.
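
The bi-directional Ping-Pong idea can be sketched with a toy recurrent generator. This is an illustration under assumptions rather than the TecoGAN code: the generator is a simple blend of the current frame and the previous output, frames are flat lists of numbers, and the function names are hypothetical. The input sequence is played forward and then backward, and the two outputs produced for the same input frame are asked to agree.

```python
def generate(frames, blend=0.5):
    """Toy recurrent generator: each output mixes the current input frame
    with the previous output (a stand-in for a recurrent network)."""
    outputs, prev = [], frames[0]
    for frame in frames:
        prev = [blend * f + (1.0 - blend) * p for f, p in zip(frame, prev)]
        outputs.append(prev)
    return outputs

def ping_pong_loss(frames):
    """Run the generator on a_1 .. a_n .. a_1 and penalize differences
    between the two outputs generated for the same input frame."""
    n = len(frames)
    sequence = frames + frames[-2::-1]      # forward, then back again
    outputs = generate(sequence)
    loss = 0.0
    for t in range(n - 1):
        forward, backward = outputs[t], outputs[2 * (n - 1) - t]
        loss += sum((a - b) ** 2 for a, b in zip(forward, backward)) ** 0.5
    return loss
```

A generator that accumulates drift over time produces different results on the way back, which this loss penalizes. For a static input sequence the toy generator is perfectly consistent and the loss is zero.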

Further reading:
TecoGAN
Multi-Pass GAN
tempoGAN

Collaboration with Beijing Film Academy

We are happy to announce a tighter research collaboration between the “Advanced Innovation Center for Future Visual Entertainment” of the Beijing Film Academy (BFA) and the Physics-Based Simulation group at TUM. Our joint goal is to develop the next generation of deep-learning based visual effects tools.

The BFA itself is a highly renowned international film school whose alumni include, among others, the director Zhang Yimou. He directed movies like “Hero” (2002), and the ceremonies for the Beijing Olympics (2008). The AICFVE research group is headed by Prof. Baoquan Chen from Peking University.

Further reading:
http://fve.bfa.edu.cn/English.htm
https://www.china-admissions.com/beijing-film-academy/
http://english.pku.edu.cn
https://cfcs.pku.edu.cn/baoquan/

ScalarFlow data-set paper to be presented at SIGGRAPH Asia

Our paper on volumetric reconstructions of real world smoke flows (“fog”, to be precise) got accepted to SIGGRAPH Asia. Yay! This ScalarFlow dataset is a first one to collect a large number of space-time volumes of complex fluid flow effects. We hope it will be very useful for Navier-Stokes and CFD solvers as well as deep learning methods. We’re still busy preparing the final data set and source code, but in the meantime you can enjoy the paper preprint and the video.

Full Paper Abstract:
In this paper, we present ScalarFlow, a first large-scale data set of reconstructions of real-world smoke plumes. In addition, we propose a framework for accurate physics-based reconstructions from a small number of video streams. Central components of our framework are a novel estimation of unseen inflow regions and an efficient optimization scheme constrained by a simulation to capture real-world fluids. Our data set includes a large number of complex natural buoyancy-driven flows. The flows transition to turbulence and contain observable scalar transport processes. As such, the ScalarFlow data set is tailored towards computer graphics, vision, and learning applications. The published data set will contain volumetric reconstructions of velocity and density as well as the corresponding input image sequences with calibration data, code, and instructions how to reproduce the commodity hardware capture setup. We further demonstrate one of the many potential applications: a first perceptual evaluation study, which reveals that the complexity of the reconstructed flows would require large simulation resolutions for regular solvers in order to recreate at least parts of the natural complexity contained in the captured data.

Deep Learning for Graphics Course

We’re happy to announce an updated version of our “Deep Learning for Graphics” course, codename CreativeAI, which we successfully held at SIGGRAPH 2019 this year with colleagues from UCL and Stanford.

Course summary: In computer graphics, many traditional problems are now better handled by deep-learning based data-driven methods. In an increasing variety of problem settings, deep networks are state-of-the-art, beating dedicated hand-crafted methods by significant margins. This tutorial gives an organized overview of core theory, practice, and graphics-related applications of deep learning.

Course materials: https://geometry.cs.ucl.ac.uk/creativeai/
Code: http://github.com/smartgeometry-ucl/dl4g

Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds

Our paper on learning temporally stable features in point clouds is online now. A pre-print can be found here: https://arxiv.org/pdf/1907.05279.pdf, and the accompanying video is this one: https://www.youtube.com/watch?v=6OoRZrqfSJ4

Full Abstract: Point clouds, as a form of Lagrangian representation, allow for powerful and flexible applications in a large number of computational disciplines. We propose a novel deep-learning method to learn stable and temporally coherent feature spaces for point clouds that change over time. We identify a set of inherent problems with these approaches: without knowledge of the time dimension, the inferred solutions can exhibit strong flickering, and easy solutions to suppress this flickering can result in undesirable local minima that manifest themselves as halo structures. We propose a novel temporal loss function that takes into account higher time derivatives of the point positions, and encourages mingling, i.e., to prevent the aforementioned halos. We combine these techniques in a super-resolution method with a truncation approach to flexibly adapt the size of the generated positions. We show that our method works for large, deforming point sets from different sources to demonstrate the flexibility of our approach.

And this is the project website. Enjoy!
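
The idea of a temporal loss that takes time derivatives of the point positions into account can be sketched as follows. This is a simplification under assumptions, not the loss from the paper: it presumes a one-to-one correspondence between generated and reference points, uses simple finite differences over three consecutive frames, and the function name is made up.

```python
def temporal_loss(gen, ref, dt=1.0):
    """gen and ref each hold three consecutive frames, and every frame is a
    list of points (tuples of coordinates) in matching order. The loss
    compares finite-difference velocities and accelerations."""
    loss = 0.0
    for g0, g1, g2, r0, r1, r2 in zip(*gen, *ref):
        for d in range(len(g1)):
            v_gen = (g1[d] - g0[d]) / dt
            v_ref = (r1[d] - r0[d]) / dt
            a_gen = (g2[d] - 2.0 * g1[d] + g0[d]) / dt ** 2
            a_ref = (r2[d] - 2.0 * r1[d] + r0[d]) / dt ** 2
            loss += (v_gen - v_ref) ** 2 + (a_gen - a_ref) ** 2
    return loss
```

A generated point that jumps sideways for a single frame is penalized through both terms, while a constant positional offset is not, since only the motion over time is compared.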

Spot the Difference: Paper on Perceptual Evaluations of Simulation Data Sets online now

Our paper on perceptual evaluations of arbitrary simulation data sets, or more specifically field data, is online now. We demonstrate that user studies can be employed to evaluate complex data sets, e.g., those arising from fluid flow simulations, even in situations where traditional norms such as L2 fail to yield conclusive answers. The preprint is available here: https://arxiv.org/abs/1907.04179, and this is the corresponding project page.

Abstract
Comparative evaluation lies at the heart of science, and determining the accuracy of a computational method is crucial for evaluating its potential as well as for guiding future efforts. However, metrics that are typically used have inherent shortcomings when faced with the under-resolved solutions of real-world simulation problems. We show how to leverage crowd-sourced user studies in order to address the fundamental problems of widely used classical evaluation metrics. We demonstrate that such user studies, which inherently rely on the human visual system, yield a very robust metric and consistent answers for complex phenomena without any requirements for proficiency regarding the physics at hand. This holds even for cases away from convergence where traditional metrics often end up inconclusive results. More specifically, we evaluate results of different essentially non-oscillatory (ENO) schemes in different fluid flow settings. Our methodology represents a novel and practical approach for scientific evaluations that can give answers for previously unsolved problems.

TecoGAN training code online

We have finally uploaded code for training new TecoGAN models in our github repository: https://github.com/thunil/TecoGAN

While inference mode and pre-trained models have been up for a while now, the new version also contains the code necessary to train new models, and a script to download a suitable amount of training data.

A properly trained TecoGAN model can generate fine details that persist over the course of long generated video sequences. The github repo contains a few samples: the mesh structures of the armor, the scale patterns of the lizard, and the dots on the back of the spider highlight the capabilities of our method. A spatio-temporal discriminator plays a key role in guiding the generator network towards producing coherent detail.

Video: https://www.youtube.com/watch?v=pZXFXtfd-Ak , and
Paper: https://arxiv.org/pdf/1811.09393.pdf.