Generating Game / Video & Fluid Sequences Using Spatio-Temporal GANs

Focusing on temporally coherent detail synthesis, we present a range of new and existing research projects that all employ spatio-temporal self-supervision with GANs. This concept is beneficial for a variety of challenging tasks: from video translation to super-resolution for games, videos, and fluid flow effects (i.e., Navier-Stokes solutions).

  • Video translation for unpaired data from different domains (Ours is shown as TecoGAN):

  • Game super-resolution using strongly aliased input (Ours is shown as DRR):

  • For video super-resolution, the TecoGAN model (Ours) yields coherent and realistic details:

  • We also proposed an algorithm for 3D fluid super-resolution with a factor of eight in each dimension:

In all tasks, we found that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. Compared to adding traditional temporal losses (e.g., L1 or L2 distances of warped frames) to normal spatial GANs, spatio-temporal discriminators are able to deal with more challenging learning objectives such as sharp detail over time.
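
To make the comparison concrete, here is a minimal sketch of such a traditional temporal loss: the L2 distance between the current frame and the motion-compensated previous frame. It is an illustration only, not code from any of the projects above. The helper names `warp` and `warped_l2_loss` are hypothetical, and the single integer motion vector per frame is a simplifying assumption (real pipelines use dense, sub-pixel motion fields).

```python
def warp(frame, motion):
    """Backward-warp a 2D frame by one integer motion vector (dx, dy).

    Out-of-range lookups are clamped to the image border.
    """
    h, w = len(frame), len(frame[0])
    dx, dy = motion
    return [
        [frame[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
         for x in range(w)]
        for y in range(h)
    ]


def warped_l2_loss(prev_frame, cur_frame, motion):
    """Mean squared difference between cur_frame and the warped prev_frame."""
    warped = warp(prev_frame, motion)
    h, w = len(cur_frame), len(cur_frame[0])
    total = sum((cur_frame[y][x] - warped[y][x]) ** 2
                for y in range(h) for x in range(w))
    return total / (h * w)


# A single bright pixel that moves one step to the right between two frames.
prev_frame = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
blank = [[0] * 4 for _ in range(4)]

sharp_loss = warped_l2_loss(prev_frame, cur_frame, (1, 0))   # correct motion
static_loss = warped_l2_loss(prev_frame, cur_frame, (0, 0))  # motion ignored
blank_loss = warped_l2_loss(blank, blank, (1, 0))            # no detail at all
print(sharp_loss, static_loss, blank_loss)
```

Note that the detail-free blank sequence scores exactly as well as the correctly moving sharp one. A loss of this kind only penalizes change over time and never rewards detail, which is one way to see why a learned spatio-temporal discriminator can be the stronger supervisor.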

Besides spatio-temporal GANs, we have developed different technologies to tackle individual challenges across these tasks. In unpaired video translation, we use curriculum learning for discriminators to achieve better adversarial equilibria. For strongly aliased renderings in games, we propose depth-recurrent residual connections to learn stable temporal states. In video super-resolution, a bi-directional Ping-Pong loss is proposed to improve long-term temporal coherence. When processing large volumetric fluid data, a multi-pass GAN is used to break down the data relationships from 3D+t to lower dimensions. Preprints of the papers and code will be available soon.

Further reading:
TecoGAN
Multi-Pass GAN
tempoGAN

Collaboration with Beijing Film Academy

We are happy to announce a tighter research collaboration between the “Advanced Innovation Center for Future Visual Entertainment” of the Beijing Film Academy (BFA) and the Physics-Based Simulation group at TUM. Our joint goal is to develop the next generation of deep-learning based visual effects tools.

The BFA itself is an internationally renowned film school whose alumni include the director Zhang Yimou, who directed movies like “Hero” (2002) as well as the ceremonies for the Beijing Olympics (2008). The AICFVE research group is headed by Prof. Baoquan Chen from Peking University.

Further reading:
http://fve.bfa.edu.cn/English.htm
https://www.china-admissions.com/beijing-film-academy/   
http://english.pku.edu.cn
https://cfcs.pku.edu.cn/baoquan/

   

ScalarFlow data-set paper to be presented at SIGGRAPH Asia

Our paper on volumetric reconstructions of real world smoke flows (“fog”, to be precise) got accepted to SIGGRAPH Asia. Yay! This ScalarFlow dataset is a first one to collect a large number of space-time volumes of complex fluid flow effects. We hope it will be very useful for Navier-Stokes and CFD solvers as well as for deep learning methods. We’re still busy preparing the final data set and source code, but in the meantime you can enjoy the paper preprint and the video.

Full Paper Abstract:
In this paper, we present ScalarFlow, a first large-scale data set of reconstructions of real-world smoke plumes. In addition, we propose a framework for accurate physics-based reconstructions from a small number of video streams. Central components of our framework are a novel estimation of unseen inflow regions and an efficient optimization scheme constrained by a simulation to capture real-world fluids. Our data set includes a large number of complex natural buoyancy-driven flows. The flows transition to turbulence and contain observable scalar transport processes. As such, the ScalarFlow data set is tailored towards computer graphics, vision, and learning applications. The published data set will contain volumetric reconstructions of velocity and density as well as the corresponding input image sequences with calibration data, code, and instructions how to reproduce the commodity hardware capture setup. We further demonstrate one of the many potential applications: a first perceptual evaluation study, which reveals that the complexity of the reconstructed flows would require large simulation resolutions for regular solvers in order to recreate at least parts of the natural complexity contained in the captured data.

Deep Learning for Graphics Course

We’re happy to announce an updated version of our “Deep Learning for Graphics” course, codename CreativeAI, which we successfully held at SIGGRAPH 2019 this year with colleagues from UCL and Stanford.

Course summary: In computer graphics, many traditional problems are now better handled by deep-learning based data-driven methods. In an increasing variety of problem settings, deep networks are state-of-the-art, beating dedicated hand-crafted methods by significant margins. This tutorial gives an organized overview of core theory, practice, and graphics-related applications of deep learning.

Course materials: https://geometry.cs.ucl.ac.uk/creativeai/
Code: http://github.com/smartgeometry-ucl/dl4g

Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds

Our paper on learning temporally stable features in point clouds is now online. A pre-print can be found here: https://arxiv.org/pdf/1907.05279.pdf , and the accompanying video is this one: https://www.youtube.com/watch?v=6OoRZrqfSJ4

Full Abstract: Point clouds, as a form of Lagrangian representation, allow for powerful and flexible applications in a large number of computational disciplines. We propose a novel deep-learning method to learn stable and temporally coherent feature spaces for point clouds that change over time. We identify a set of inherent problems with these approaches: without knowledge of the time dimension, the inferred solutions can exhibit strong flickering, and easy solutions to suppress this flickering can result in undesirable local minima that manifest themselves as halo structures. We propose a novel temporal loss function that takes into account higher time derivatives of the point positions, and encourages mingling, i.e., to prevent the aforementioned halos. We combine these techniques in a super-resolution method with a truncation approach to flexibly adapt the size of the generated positions. We show that our method works for large, deforming point sets from different sources to demonstrate the flexibility of our approach.
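
The abstract does not give the formula of the temporal loss, so the following is only a rough sketch of the general idea of penalizing time derivatives of point positions. It assumes that points are in correspondence across three consecutive frames and compares first and second finite differences (velocity and acceleration) of generated and reference positions. The function name `temporal_loss` and this exact form are assumptions for illustration, not the loss from the paper.

```python
def temporal_loss(gen_frames, ref_frames):
    """Mismatch of velocity and acceleration between two point sequences.

    Each argument holds three consecutive frames; a frame is a list of
    points, and a point is a list of coordinates. Points are assumed to be
    in correspondence across frames (a simplification for this sketch).
    """
    def derivatives(frames):
        p0, p1, p2 = frames
        # First finite difference: velocity at the latest step.
        vel = [[c2 - c1 for c1, c2 in zip(a1, a2)]
               for a1, a2 in zip(p1, p2)]
        # Second finite difference: acceleration over the three frames.
        acc = [[c2 - 2 * c1 + c0 for c0, c1, c2 in zip(a0, a1, a2)]
               for a0, a1, a2 in zip(p0, p1, p2)]
        return vel, acc

    def mse(a, b):
        count = sum(len(point) for point in a)
        return sum((x - y) ** 2
                   for pa, pb in zip(a, b)
                   for x, y in zip(pa, pb)) / count

    gen_vel, gen_acc = derivatives(gen_frames)
    ref_vel, ref_acc = derivatives(ref_frames)
    return mse(gen_vel, ref_vel) + mse(gen_acc, ref_acc)


# One point moving at constant speed, and a flickering reconstruction of it.
reference = [[[0.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]], [[2.0, 0.0, 0.0]]]
flickering = [[[0.0, 0.0, 0.0]], [[1.5, 0.0, 0.0]], [[2.0, 0.0, 0.0]]]

print(temporal_loss(reference, reference))
print(temporal_loss(flickering, reference))
```

In this toy case the acceleration term contributes most of the penalty for the flickering sequence, which hints at why higher derivatives can be useful for suppressing jitter.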

And this is the project website. Enjoy!

Spot the Difference: Paper on Perceptual Evaluations of Simulation Data Sets online now

Our paper on perceptual evaluations of arbitrary simulation data sets, or more specifically field data, is online now. We demonstrate that user studies can be employed to evaluate complex data sets, e.g., those arising from fluid flow simulations, even in situations where traditional norms such as L2 fail to yield conclusive answers. The preprint is available here: https://arxiv.org/abs/1907.04179 , and this is the corresponding project page.

Abstract
Comparative evaluation lies at the heart of science, and determining the accuracy of a computational method is crucial for evaluating its potential as well as for guiding future efforts. However, metrics that are typically used have inherent shortcomings when faced with the under-resolved solutions of real-world simulation problems. We show how to leverage crowd-sourced user studies in order to address the fundamental problems of widely used classical evaluation metrics. We demonstrate that such user studies, which inherently rely on the human visual system, yield a very robust metric and consistent answers for complex phenomena without any requirements for proficiency regarding the physics at hand. This holds even for cases away from convergence where traditional metrics often end up with inconclusive results. More specifically, we evaluate results of different essentially non-oscillatory (ENO) schemes in different fluid flow settings. Our methodology represents a novel and practical approach for scientific evaluations that can give answers for previously unsolved problems.

TecoGAN training code online

We have finally uploaded code for training new TecoGAN models in our github repository: https://github.com/thunil/TecoGAN

While inference mode and pre-trained models have been up for a while now, the new version also contains the code necessary to train new models, and a script to download a suitable amount of training data.

A properly trained TecoGAN model can generate fine details that persist over the course of long generated video sequences. The github repo contains a few samples: the mesh structures of the armor, the scale patterns of the lizard, and the dots on the back of the spider highlight the capabilities of our method. A spatio-temporal discriminator plays a key role in guiding the generator network towards producing coherent detail.

Video: https://www.youtube.com/watch?v=pZXFXtfd-Ak , and
Paper: https://arxiv.org/pdf/1811.09393.pdf.

Multi-Pass GAN Paper to appear at SCA 2019 now online

Our “physics-based deep learning” paper on generating very high resolution fluid flows based on generative adversarial neural networks is online now. The key idea is to split the problem into multiple orthogonal passes, which works nicely in conjunction with progressive growing techniques. We demonstrate this for several Navier-Stokes flow problems.

You can check out the video here:

The arXiv preprint can be found here.

Abstract: We propose a novel method to up-sample volumetric functions with generative neural networks using several orthogonal passes. Our method decomposes generative problems on Cartesian field functions into multiple smaller sub-problems that can be learned more efficiently. Specifically, we utilize two separate generative adversarial networks: the first one up-scales slices which are parallel to the XY-plane, whereas the second one refines the whole volume along the Z-axis working on slices in the YZ-plane. In this way, we obtain full coverage for the 3D target function and can leverage spatio-temporal supervision with a set of discriminators. Additionally, we demonstrate that our method can be combined with curriculum learning and progressive growing approaches. We arrive at a first method that can up-sample volumes by a factor of eight along each dimension, i.e., increasing the number of degrees of freedom by 512. Large volumetric up-scaling factors such as this one have previously not been attainable as the required number of weights in the neural networks renders adversarial training runs prohibitively difficult. We demonstrate the generality of our trained networks with a series of comparisons to previous work, a variety of complex 3D results, and an analysis of the resulting performance.
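
The data flow of the two passes can be sketched without any neural network. In the snippet below, both generators are replaced by plain nearest-neighbour repetition, which is purely a stand-in assumption; only the slicing pattern and the resulting sizes follow the description above. The function names and the `[z][y][x]` index order are hypothetical.

```python
FACTOR = 8  # up-scaling factor along each dimension


def upscale_xy_slice(xy_slice, f=FACTOR):
    """Stand-in for the first generator: up-scale one XY slice in x and y."""
    return [[value for value in row for _ in range(f)]
            for row in xy_slice for _ in range(f)]


def refine_yz_slice(yz_slice, f=FACTOR):
    """Stand-in for the second generator: up-scale one YZ slice along z.

    The slice is indexed [z][y]; every z row is repeated f times.
    """
    return [row[:] for row in yz_slice for _ in range(f)]


def shape(volume):
    return (len(volume), len(volume[0]), len(volume[0][0]))


def multi_pass_upsample(volume):
    """Two orthogonal passes over a volume indexed as volume[z][y][x]."""
    # Pass 1: every XY slice is up-scaled independently.
    pass1 = [upscale_xy_slice(layer) for layer in volume]
    nz, ny, nx = shape(pass1)
    # Pass 2: every YZ slice (fixed x) is up-scaled along the Z-axis.
    yz_slices = [
        refine_yz_slice([[pass1[z][y][x] for y in range(ny)]
                         for z in range(nz)])
        for x in range(nx)
    ]
    # Reassemble the [x][z][y] slices into a [z][y][x] volume.
    return [[[yz_slices[x][z][y] for x in range(nx)]
             for y in range(ny)]
            for z in range(nz * FACTOR)]


low = [[[z * 4 + y * 2 + x for x in range(2)] for y in range(2)]
       for z in range(2)]
high = multi_pass_upsample(low)

cells_low = 2 * 2 * 2
cells_high = shape(high)[0] * shape(high)[1] * shape(high)[2]
print(shape(low), shape(high), cells_high // cells_low)
```

A factor of eight per axis multiplies the number of cells by 8 × 8 × 8 = 512, which is where the figure in the abstract comes from. Each pass only ever sees 2D slices, so the networks that would replace the stand-ins can stay much smaller than a single full 3D generator.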

Deformation-aware Neural Networks for Liquid Simulations at ICLR 2019

Last week, Lukas Prantl successfully presented our paper on deformation learning for capturing solution spaces of Navier-Stokes (liquids in particular) at the International Conference on Learning Representations (ICLR). The full paper and video can be found here.

Our proof-of-concept Android app is still available for free in the app store: https://play.google.com/store/apps/details?id=fluidsim.de.interactivedrop

Full abstract: We propose a novel approach for deformation-aware neural networks that learn the weighting and synthesis of dense volumetric deformation fields. Our method specifically targets the space-time representation of physical surfaces from liquid simulations. Liquids exhibit highly complex, non-linear behavior under changing simulation conditions such as different initial conditions. Our algorithm captures these complex phenomena in two stages: a first neural network computes a weighting function for a set of pre-computed deformations, while a second network directly generates a deformation field for refining the surface. Key for successful training runs in this setting is a suitable loss function that encodes the effect of the deformations, and a robust calculation of the corresponding gradients. To demonstrate the effectiveness of our approach, we showcase our method with several complex examples of flowing liquids with topology changes. Our representation makes it possible to rapidly generate the desired implicit surfaces. We have implemented a mobile application to demonstrate that real-time interactions with complex liquid effects are possible with our approach.

Code for deep-learning based subgrid flow online now

We’ve just (i.e. finally) released the code for our SCA 2018 paper on learning sub-grid detail for Navier-Stokes (liquid) simulations with a stochastic deep-learning model. Our approach learns to predict the probability and a Gaussian distribution for under-resolved splash formations. It’s a good example from the larger field of “physics-based deep learning” techniques to enhance physics simulations with the help of neural network techniques. The code comes with a data generator based on our mantaflow framework, and TensorFlow code to train the neural network predictor.

You can check it out on github: https://github.com/kiwonum/mlflip

The corresponding paper and video can be found here.

Full abstract: This paper proposes a new data-driven approach to model detailed splashes for liquid simulations with neural networks. Our model learns to generate small-scale splash detail for the fluid-implicit-particle method using training data acquired from physically parameterized, high resolution simulations. We use neural networks to model the regression of splash formation using a classifier together with a velocity modifier. For the velocity modification, we employ a heteroscedastic model. We evaluate our method for different spatial scales, simulation setups, and solvers. Our simulation results demonstrate that our model significantly improves visual fidelity with a large amount of realistic droplet formation and yields splash detail much more efficiently than finer discretizations.
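
For readers unfamiliar with the term: a heteroscedastic model predicts not only a value but also an input-dependent variance. A common training objective for such a model is the Gaussian negative log-likelihood, sketched below for scalar targets. This is a generic textbook form given as an assumption; it is not taken from the mlflip code, and the function names are hypothetical.

```python
import math


def gaussian_nll(target, mean, variance):
    """Negative log-likelihood of target under N(mean, variance)."""
    return 0.5 * (math.log(2.0 * math.pi * variance)
                  + (target - mean) ** 2 / variance)


def heteroscedastic_loss(targets, means, variances):
    """Average NLL when every sample has its own predicted variance."""
    losses = [gaussian_nll(t, m, v)
              for t, m, v in zip(targets, means, variances)]
    return sum(losses) / len(losses)


# A prediction that misses the target by 2: admitting uncertainty helps.
confident_miss = gaussian_nll(2.0, 0.0, 1.0)
uncertain_miss = gaussian_nll(2.0, 0.0, 4.0)

# A prediction that hits the target: a small variance is rewarded.
confident_hit = gaussian_nll(0.0, 0.0, 0.25)
uncertain_hit = gaussian_nll(0.0, 0.0, 1.0)

print(confident_miss > uncertain_miss, confident_hit < uncertain_hit)
```

The objective therefore lets a network express that some velocity modifications are far less predictable than others, instead of averaging them into one fixed noise level.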

New Results from our Spatio-temporal Super-Resolution GAN (TecoGAN)

We’ve been working on new examples with our deep-learning based video super-resolution method (TecoGAN), which employs a novel spatio-temporal discriminator. Enjoy! These examples nicely highlight the huge amount of coherent detail that our method generates via GAN-based training of the generator. And we’re of course still working on publishing the source code and trained models, coming up soon…

 

If you’re interested in the details, you can read the full pre-print here: https://ge.in.tum.de/publications/2019-tecogan-chu/

Or you can check out the accompanying paper video here:

Latent-space Physics Paper Video Finally Online on YouTube

The video for our latent-space physics paper is finally online! It’s been a while: the first paper version was online on arXiv in February 2018 🙂 The paper will now be presented at Eurographics 2019 in Genoa.

Abstract: Our work explores methods for the data-driven inference of temporal evolutions of physical functions with deep learning techniques. More specifically, we target Navier-Stokes / fluid flow problems, and we propose a novel network architecture to predict the changes of the pressure field over time. The central challenge in this context is the high dimensionality of Eulerian space-time data sets. Key for arriving at a feasible algorithm is a technique for dimensionality reduction based on convolutional neural networks, as well as a special architecture for temporal prediction. We demonstrate that dense 3D+time functions of physics systems can be predicted with neural networks, and we arrive at a neural-network based simulation algorithm with practical speed-ups. We demonstrate the capabilities of our method with a series of complex liquid simulations, and with a set of single-phase simulations. Our method predicts pressure fields very efficiently. It is more than two orders of magnitude faster than a regular solver. Additionally, we present and discuss a series of detailed evaluations for the different components of our algorithm.

More detailed information can be found here.

ERC Proof of Concept Grant for a Data-driven Fluid Flow Solving Platform

We’re very happy to report that the Thuerey research group has very recently been awarded a so-called “Proof of Concept” grant by the European Research Council (ERC).

We will leverage deep convolutional neural networks (CNNs) with physically-based architectures and loss functions for a first deep learning based flow solver. From the ERC Starting Grant realFlow a first algorithmic realization exists, which provides the core technology that will be taken to the next level within this PoC. Specifically, we plan to employ this technology for a prediction of Reynolds-averaged turbulence flows in order to achieve interactive runtimes for complex simulations that previously took long computing times. However, instead of aiming for general purpose solvers, we will target specific application areas with targeted trained models. This technology has the potential to fundamentally change the way designers and engineers can work with physics simulations to get feedback for their designs. It will also make these simulations available to smaller companies in the value chain that previously were not able to fund and maintain complex simulators.

In parallel, our goal is to establish an open platform for exchanging data and trained models for physics simulations. We believe that open standards will on the one hand support the adoption of the new technology, while at the same time providing publicity and marketing opportunities for products to be developed alongside this platform. In particular, the deep learning based turbulence solver will make use of the open data and model formats. In the long run, this will make it possible to incorporate the trained model into new applications, e.g., for solving inverse problems in the context of flow simulations.

Pre-publication: Deep Learning Methods for Reynolds-Averaged Navier-Stokes Simulations

Details: https://erc.europa.eu/news/proof-concept-erc-awards-60-grants-innovation

 

Magic Fluid Control Animation – A Classic…

This animation is more than 10 years old, and one of our first works on fluid control (back then using Lattice-Boltzmann and SPH to simulate the fluid with a free surface). It does not include any deep learning or conv-nets – despite this, it’s still fun and worth a look 🙂 Enjoy!

On YouTube:

 

Fluid Simulations and Deep Learning at ICLR’19

Our work on “Generating Liquid Simulations with Deformation-aware Neural Networks” has been conditionally accepted at the International Conference on Learning Representations (ICLR), and will be presented there in May.

It focuses on an approach to pre-compute solution spaces for free-surface Navier-Stokes with deformation learning. The first version of our work appeared in April 2017, so the video, paper and accompanying demo Android app can all already be found online. More information here.

Abstract
Liquids exhibit complex non-linear behavior under changing simulation conditions such as user interactions. We propose a method to map this complex behavior over a parameter range onto a reduced representation based on space-time deformations. In order to represent the complexity of the full space of inputs, we leverage the power of generative neural networks to learn a reduced representation. We introduce a novel deformation-aware loss function, which enables optimization in the highly non-linear space of multiple deformations. To demonstrate the effectiveness of our approach, we showcase the method with several complex examples in two and four dimensions. Our representation makes it possible to generate implicit surfaces of liquids very efficiently, which makes it possible to display the scene from any angle, and to add secondary effects such as particle systems. We have implemented a mobile application for our full output pipeline to demonstrate that real-time interaction is possible with our approach.

Learning temporal predictions and reduced representations at EG’19

Our two papers on learning temporal predictions and reduced representations for fluids have been accepted to the CGF Journal and will be presented at Eurographics 2019 in Genoa! Congratulations to Steffen, Moritz, Byungsoo and Vinicius!
Our work explores methods for the data-driven inference of temporal evolutions of physical functions with deep learning techniques. More specifically, we target fluid flow problems, and we propose a novel network architecture to predict the changes of the pressure field over time. The central challenge in this context is the high dimensionality of Eulerian space-time data sets. Key for arriving at a feasible algorithm is a technique for dimensionality reduction based on convolutional neural networks, as well as a special architecture for temporal prediction. We demonstrate that dense 3D+time functions of physics systems can be predicted with neural networks, and we arrive at a neural-network based simulation algorithm with practical speed-ups. We demonstrate the capabilities of our method with a series of complex liquid simulations, and with a set of single-phase simulations. Our method predicts pressure fields very efficiently. It is more than two orders of magnitude faster than a regular solver. Additionally, we present and discuss a series of detailed evaluations for the different components of our algorithm.
This paper presents a novel generative model to synthesize fluid simulations from a set of reduced parameters. A convolutional neural network is trained on a collection of discrete, parameterizable fluid simulation velocity fields. Due to the capability of deep learning architectures to learn representative features of the data, our generative model is able to accurately approximate the training data set, while providing plausible interpolated in-betweens. The proposed generative model is optimized for fluids by a novel loss function that guarantees divergence-free velocity fields at all times. In addition, we demonstrate that we can handle complex parameterizations in reduced spaces, and advance simulations in time by integrating in the latent space with a second network. Our method models a wide variety of fluid behaviors, thus enabling applications such as fast construction of simulations, interpolation of fluids with different parameters, time re-sampling, latent space simulations, and compression of fluid simulation data. Reconstructed velocity fields are generated up to 700x faster than traditional CPU solvers, while achieving compression rates of over 1300x.
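
The abstract above does not spell out how the divergence-free guarantee is obtained. One standard construction, shown here purely as an assumed illustration, is to let a model output a stream function and take its discrete curl: on a staggered grid the resulting velocity field has zero discrete divergence by construction. The 2D setup, grid layout and function names below are hypothetical.

```python
import math


def velocity_from_stream_function(psi, h=1.0):
    """Discrete 2D curl of a node-based stream function psi[i][j].

    i runs along x and j along y. Returns u on vertical cell faces and
    v on horizontal cell faces of a staggered grid with spacing h.
    """
    nx, ny = len(psi) - 1, len(psi[0]) - 1
    u = [[(psi[i][j + 1] - psi[i][j]) / h for j in range(ny)]
         for i in range(nx + 1)]
    v = [[-(psi[i + 1][j] - psi[i][j]) / h for j in range(ny + 1)]
         for i in range(nx)]
    return u, v


def divergence(u, v, h=1.0):
    """Cell-centered divergence of a staggered velocity field."""
    nx, ny = len(v), len(u[0])
    return [[(u[i + 1][j] - u[i][j]) / h + (v[i][j + 1] - v[i][j]) / h
             for j in range(ny)]
            for i in range(nx)]


# Any stream function works; this one is an arbitrary smooth example.
psi = [[math.sin(0.7 * i) * math.cos(0.3 * j) for j in range(6)]
       for i in range(5)]
u, v = velocity_from_stream_function(psi, h=0.5)
div = divergence(u, v, h=0.5)
max_abs_div = max(abs(d) for row in div for d in row)
print(max_abs_div < 1e-12)
```

The four node values around each cell cancel exactly in the divergence stencil, so the result is zero up to floating-point rounding no matter what the stream function looks like.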

Deep Learning for Graphics Course at SIGGRAPH Asia 2018

Yesterday we held our Deep Learning for Graphics course at SIGGRAPH Asia 2018 in Tokyo. The slides are now available online at:

http://geometry.cs.ucl.ac.uk/creativeai/

Abstract: In computer graphics, many traditional problems are now better handled by deep-learning based data-driven methods. In an increasing variety of problem settings, deep networks are state-of-the-art, beating dedicated hand-crafted methods by significant margins. This tutorial gives an organized overview of core theory, practice, and graphics-related applications of deep learning.

 

Video Super-resolution with Deep Learning

Our work on video super-resolution with GANs is online now as a preview. The main trick is a special discriminator CNN that learns to supervise in terms of detail as well as temporal coherence. In addition, we propose a novel set of metrics for quantifying temporal coherence in videos. Enjoy 🙂 !

Abstract: Adversarial training has been highly successful in the context of image super-resolution. It was demonstrated to yield realistic and highly detailed results. Despite this success, many state-of-the-art methods for video super-resolution still favor simpler norms such as L_2 over adversarial loss functions. This is caused by the fact that the averaging nature of direct vector norms as loss functions leads to temporal smoothness. The lack of spatial detail means temporal coherence is easily established. In our work, we instead propose an adversarial training for video super-resolution that leads to temporally coherent solutions without sacrificing spatial detail.

In our generator, we use a recurrent, residual framework that naturally encourages temporal consistency. For adversarial training, we propose a novel spatio-temporal discriminator in combination with motion compensation to guarantee photo-realistic and temporally coherent details in the results. We additionally identify a class of temporal artifacts in these recurrent networks, and propose a novel Ping-Pong loss to remove them. Quantifying the temporal coherence for image super-resolution tasks has also not been addressed previously. We propose a first set of metrics to evaluate the accuracy as well as the perceptual quality of the temporal evolution, and we demonstrate that our method outperforms previous work by yielding realistic and detailed images with natural temporal changes.
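
The idea behind the Ping-Pong loss can be sketched in a few lines. A sequence is extended by its own reversal, the recurrent generator is run over the extended sequence, and the outputs for the same frame from the forward and backward halves are required to agree. The toy scalar "frames", the toy generators and the squared-difference form below are assumptions for illustration and not the TecoGAN implementation.

```python
def ping_pong_sequence(frames):
    """a_1 .. a_n followed by a_(n-1) .. a_1."""
    return frames + frames[-2::-1]


def run_recurrent(generator, inputs):
    """Apply generator(frame, previous_output) along the sequence."""
    outputs, previous = [], None
    for frame in inputs:
        previous = generator(frame, previous)
        outputs.append(previous)
    return outputs


def ping_pong_loss(generator, frames):
    """Sum of squared differences between forward and backward outputs."""
    n = len(frames)
    outputs = run_recurrent(generator, ping_pong_sequence(frames))
    forward = outputs[:n]             # g_1 .. g_n
    backward = outputs[n - 1:][::-1]  # g'_1 .. g'_(n-1), then g_n itself
    # The turning point g_n is shared by both halves, so it is skipped.
    return sum((f - b) ** 2 for f, b in zip(forward[:-1], backward[:-1]))


def stateless(frame, previous):
    """Toy generator without recurrence: cannot accumulate anything."""
    return 2.0 * frame


def drifting(frame, previous):
    """Toy recurrent generator that keeps adding part of its last output."""
    return frame + (0.0 if previous is None else 0.5 * previous)


frames = [1.0, 1.0, 1.0]
print(ping_pong_loss(stateless, frames))
print(ping_pong_loss(drifting, frames))
```

The drifting generator produces different outputs for identical input frames depending on how long it has been running, and the loss picks this up. That accumulation over time is the kind of recurrent artifact the abstract refers to.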

Physics-based Deep Learning at NIPS 2018

We will be presenting our recent works on physics-based deep learning for fluid flow at the NIPS 2018 workshop on “Modeling the Physical World: Learning, Perception, and Control”, organized by Jiajun Wu, Kelsey Allen, Kevin Smith, Jessica Hamrick, Emmanuel Dupoux, Marc Toussaint, and Joshua Tenenbaum.

NIPS Conference: https://nips.cc

NIPS 2018 Workshop “Modeling the Physical World: Learning, Perception, and Control”: https://nips.cc/Conferences/2018/Schedule?showEvent=10931

Workshop homepage: http://phys2018.csail.mit.edu/submission.html

In particular we will discuss our works on:

Detailed abstracts:

Latent-space Physics: Towards Learning the Temporal Evolution of Fluid Flow: Our work explores methods for the data-driven inference of temporal evolutions of physical functions with deep learning techniques. More specifically, we target fluid flow problems, and we propose a novel network architecture to predict the changes of the pressure field over time. The central challenge in this context is the high dimensionality of Eulerian space-time data sets. Key for arriving at a feasible algorithm is a technique for dimensionality reduction based on convolutional neural networks, as well as a special architecture for temporal prediction. We demonstrate that dense 3D+time functions of physics systems can be predicted with neural networks, and we arrive at a neural-network based simulation algorithm with practical speed-ups. We demonstrate the capabilities of our method with a series of complex liquid simulations, and with a set of single-phase simulations. Our method predicts pressure fields very efficiently. It is more than two orders of magnitude faster than a regular solver. Additionally, we present and discuss a series of detailed evaluations for the different components of our algorithm.

Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow: We propose a temporally coherent generative model addressing the super-resolution problem for fluid flows. Our work represents the first approach to synthesize four-dimensional physics fields with neural networks. Based on a conditional generative adversarial network that is designed for the inference of three-dimensional volumetric data, our model generates consistent and detailed results by using a novel temporal discriminator, in addition to the commonly used spatial one. Our experiments show that the generator is able to infer more realistic high-resolution details by using additional physical quantities, such as low-resolution velocities or vorticities. Besides improvements in the training process and in the generated outputs, these inputs offer means for artistic control as well. We additionally employ a physics-aware data augmentation step, which is crucial to avoid overfitting and to reduce memory requirements. In this way, our network learns to generate advected quantities with highly detailed, realistic, and temporally coherent features. Our method works instantaneously, using only a single time-step of low-resolution fluid data. We demonstrate the abilities of our method using a variety of complex inputs and applications in two and three dimensions.

Coupled Fluid Density and Motion from Single Views: We present a novel method to reconstruct a fluid’s 3D density and motion based on just a single sequence of images. This is rendered possible by using powerful physical priors for this strongly under-determined problem. More specifically, we propose a novel strategy to infer density updates strongly coupled to previous and current estimates of the flow motion. Additionally, we employ an accurate discretization and depth-based regularizers to compute stable solutions. Using only one view for the reconstruction reduces the complexity of the capturing setup drastically and could even allow for online video databases or smart-phone videos as inputs. The reconstructed 3D velocity can then be flexibly utilized, e.g., for re-simulation, domain modification or guiding purposes. We will demonstrate the capacity of our method with a series of synthetic test cases and the reconstruction of real smoke plumes captured with a Raspberry Pi camera.