What if pretraining scientific foundation models didn’t require massive datasets at all?

We show that this is not only possible but also has a range of neat benefits: our “Tadpole” models learn from canonical PDE data that is generated on the fly by efficient and accurate spectral solvers. This effectively provides unlimited training data and circumvents storage and I/O bottlenecks.

Three key ingredients make the Tadpole approach work:

  • Autoencoding instead of dynamics pretraining: we learn transferable spatial representations rather than system-specific dynamics. This improves generalization across heterogeneous PDE systems.
  • Custom parameter-efficient fine-tuning (PEFT): we combine low-rank adaptation (LoRA), latent-space transformations, and reintroduced skip connections. This yields accurate temporal predictions with few trainable parameters, e.g., outperforming the Walrus model, which has orders of magnitude more parameters.
  • Online pretraining at scale: as outlined above, Tadpole generates PDE data online via GPU-based exponential time differencing Runge-Kutta (ETDRK) solvers. This scales to the equivalent of hundreds of terabytes of training data.
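
To make the LoRA part of the second ingredient concrete, here is a minimal NumPy sketch of a low-rank adapted linear layer. It is an illustration under stated assumptions, not Tadpole's implementation: the class name LoRALinear, the rank, the scaling factor alpha, and the layer size are all hypothetical, and the latent-space transformations and skip connections are not shown.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (hypothetical sketch)."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                   # frozen pretrained weight
        self.a = rng.normal(scale=0.01, size=(rank, in_dim))   # trainable down-projection
        self.b = np.zeros((out_dim, rank))                     # trainable up-projection, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # y = x W^T + (alpha / r) * x A^T B^T
        return x @ self.weight.T + self.scale * (x @ self.a.T) @ self.b.T

    def trainable_params(self):
        return self.a.size + self.b.size


w = np.random.default_rng(1).normal(size=(256, 256))  # stand-in for a pretrained weight
layer = LoRALinear(w, rank=4)
x = np.ones((2, 256))

print(layer.trainable_params(), w.size)  # 2048 trainable vs 65536 frozen
print(np.allclose(layer(x), x @ w.T))    # True: zero-initialized B leaves the output unchanged
```

Because the up-projection starts at zero, fine-tuning begins exactly at the pretrained model, and only rank * (in_dim + out_dim) parameters are updated per adapted layer.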

Most scientific foundation models are bottlenecked by data generation and storage. Tadpole flips this paradigm: data is no longer a static, pre-computed asset. Instead, it is generated online and becomes part of the training loop. I think this is a key step toward scalable foundation models for applications in science and engineering.
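
As a rough illustration of data becoming part of the training loop, the sketch below wraps a small spectral solver in a Python generator that produces a fresh batch on every call. It is a toy under loud assumptions: it uses a first-order exponential time-differencing step (ETD1, a simpler relative of ETDRK schemes) on the 1D viscous Burgers equation, runs on the CPU with NumPy, and applies no dealiasing. The function names, the equation, and all parameter values are hypothetical and are not taken from Tadpole's 3D GPU solvers.

```python
import numpy as np

def etd1_burgers_step(u_hat, k, nu, dt):
    """One ETD1 step for u_t = nu * u_xx - u * u_x in Fourier space."""
    lin = -nu * k**2                                  # linear (diffusion) operator
    exp_l = np.exp(lin * dt)
    safe_lin = np.where(lin == 0.0, 1.0, lin)         # avoid 0/0 at the mean mode
    phi = np.where(lin == 0.0, dt, (exp_l - 1.0) / safe_lin)
    u = np.fft.ifft(u_hat).real
    nonlin = -0.5j * k * np.fft.fft(u * u)            # transform of -(u^2 / 2)_x
    return exp_l * u_hat + phi * nonlin

def online_batches(batch_size=4, n=64, nu=0.05, dt=1e-3, steps=50, seed=0):
    """Yield endless batches of Burgers snapshots; nothing is stored on disk."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=2.0 * np.pi / n)  # integer wavenumbers
    while True:
        batch = np.empty((batch_size, n))
        for b in range(batch_size):
            # Random smooth initial condition built from four low-frequency modes.
            amps = 0.5 * rng.normal(size=4)
            phases = rng.uniform(0.0, 2.0 * np.pi, size=4)
            u = sum(a * np.sin((m + 1) * x + p)
                    for m, (a, p) in enumerate(zip(amps, phases)))
            u_hat = np.fft.fft(u)
            for _ in range(steps):
                u_hat = etd1_burgers_step(u_hat, k, nu, dt)
            batch[b] = np.fft.ifft(u_hat).real
        yield batch

# Hypothetical training loop: each iteration sees data that never existed before.
for step, batch in zip(range(3), online_batches()):
    print(step, batch.shape)
```

The generator keeps only the random-number state between batches, so the amount of data seen grows with training time rather than with disk space.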

Full abstract: We introduce Tadpole, a novel foundation model for three-dimensional partial differential equations (PDEs) that addresses key challenges in transferability, scalability to high dimensionality, and multi-functionality. Tadpole is pre-trained as an autoencoder on synthetic 3D PDE data generated by an efficient online data-generation framework. This enables large-scale, diverse training without storage or I/O overhead, demonstrated by scaling to an equivalent of hundreds of terabytes of training data. By autoencoding single-channel spatial crops, Tadpole learns rich and transferable representations across heterogeneous physical systems with varying numbers of state variables and spatial resolutions. Although pre-trained solely as an autoencoder, Tadpole can be efficiently applied for multiple downstream tasks beyond reconstruction, including dynamics learning and generative modeling. For dynamics learning, we propose a novel parameter-efficient fine-tuning strategy that integrates low-rank adaptation, latent-space transformations, and reintroduced skip connections, achieving accurate temporal modeling with a minimal number of trainable parameters. Tadpole demonstrates strong fine-tuning performance across various downstream tasks, highlighting its versatility and effectiveness as a foundation model for 3D PDE learning.
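
The idea of autoencoding single-channel spatial crops can be sketched as follows. Each training sample is one cubic patch taken from one state variable, so fields with different channel counts and resolutions all reduce to inputs of the same shape. This is a minimal sketch, assuming uniform random sampling and a crop size of 16; the function name and all defaults are hypothetical, and the paper's actual sampling strategy may differ.

```python
import numpy as np

def random_single_channel_crops(field, crop=16, n_crops=8, seed=0):
    """Sample cubic crops from a field of shape (channels, depth, height, width).

    Returns an array of shape (n_crops, crop, crop, crop); each crop comes
    from a single, randomly chosen channel.
    """
    rng = np.random.default_rng(seed)
    c, d, h, w = field.shape
    if min(d, h, w) < crop:
        raise ValueError("crop size exceeds the spatial extent of the field")
    out = np.empty((n_crops, crop, crop, crop), dtype=field.dtype)
    for i in range(n_crops):
        ch = rng.integers(c)                  # pick one state variable
        z = rng.integers(d - crop + 1)        # pick a random corner
        y = rng.integers(h - crop + 1)
        x = rng.integers(w - crop + 1)
        out[i] = field[ch, z:z + crop, y:y + crop, x:x + crop]
    return out

# Two hypothetical systems with different channel counts and resolutions.
rng = np.random.default_rng(0)
flow = rng.normal(size=(3, 32, 32, 32))       # e.g. three velocity components
plasma = rng.normal(size=(5, 24, 40, 48))     # e.g. five state variables

print(random_single_channel_crops(flow).shape)              # (8, 16, 16, 16)
print(random_single_channel_crops(plasma, n_crops=4).shape) # (4, 16, 16, 16)
```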