The SuperWing dataset is a large-scale, open dataset of transonic swept-wing aerodynamics, combining thousands of richly parameterized 3D wing geometries with high-fidelity RANS simulations across the operational flight envelope (paper: https://arxiv.org/abs/2512.14397). It offers more complex geometry (“kinked” wings) and greater parametric variety than all existing 3D wing datasets!

Created primarily by Yunjia and collaborators from Tsinghua University, the HuggingFace dataset (https://huggingface.co/datasets/yunplus/SuperWing) captures strong and challenging spanwise variations in airfoil shape, twist, and dihedral, generated without relying on perturbations of a single baseline wing. Transformer models trained on SuperWing reach a drag-prediction error of 2.5 drag counts on held-out samples and show strong zero-shot generalization to benchmark wings such as DLR-F6 and NASA CRM, positioning SuperWing as a key resource for data-driven aerodynamic modeling.
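For readers who want the files locally, here is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The repository ID comes from the dataset URL above. The file layout inside the repository is not described in this post, so the sketch only downloads a snapshot and lists what it finds. The helper names `fetch_superwing` and `list_files`, and the `patterns` argument, are illustrative assumptions, not part of the dataset's tooling.

```python
from pathlib import Path

REPO_ID = "yunplus/SuperWing"  # taken from the dataset URL above


def fetch_superwing(patterns=None):
    """Download a snapshot of the dataset repo and return its local path.

    `patterns` is a hypothetical convenience argument, forwarded as
    `allow_patterns` to restrict which files are fetched (e.g. ["*.md"]).
    """
    # Imported lazily so this module still loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=patterns,
    )
    return Path(local_dir)


def list_files(root):
    """Return the sorted relative paths of all files under `root`."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

Calling `list_files(fetch_superwing(["*.md"]))` would fetch only the Markdown files first, which is a cheap way to inspect the repository's documentation before committing to the full download.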

Full abstract: Machine-learning surrogate models have shown promise in accelerating aerodynamic design, yet progress toward generalizable predictors for three-dimensional wings has been limited by the scarcity and restricted diversity of existing datasets. Here, we present SuperWing, a comprehensive open dataset of transonic swept-wing aerodynamics comprising 4,239 parameterized wing geometries and 28,856 Reynolds-averaged Navier-Stokes flow field solutions. The wing shapes in the dataset are generated using a simplified yet expressive geometry parameterization that incorporates spanwise variations in airfoil shape, twist, and dihedral, allowing for an enhanced diversity without relying on perturbations of a baseline wing. All shapes are simulated under a broad range of Mach numbers and angles of attack covering the typical flight envelope. To demonstrate the dataset’s utility, we benchmark two state-of-the-art Transformers that accurately predict surface flow and achieve a 2.5 drag-count error on held-out samples. Models pretrained on SuperWing further exhibit strong zero-shot generalization to complex benchmark wings such as DLR-F6 and NASA CRM, underscoring the dataset’s diversity and potential for practical usage.
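To put the 2.5 drag-count figure from the abstract in context: one drag count is 0.0001 in drag coefficient C_D. The sketch below converts a set of C_D prediction errors into drag counts. It assumes a mean absolute error, since the abstract does not name the exact metric, and the sample values are made up for illustration, not taken from SuperWing.

```python
DRAG_COUNT = 1e-4  # one drag count corresponds to 0.0001 in C_D


def drag_count_mae(cd_pred, cd_true):
    """Mean absolute C_D error, expressed in drag counts.

    Assumption: mean absolute error is used here for illustration;
    the paper's exact error definition may differ.
    """
    if len(cd_pred) != len(cd_true) or not cd_pred:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_errors = [abs(p - t) for p, t in zip(cd_pred, cd_true)]
    return sum(abs_errors) / len(abs_errors) / DRAG_COUNT


# Illustrative values only (not from the dataset).
cd_true = [0.0200, 0.0150]
cd_pred = [0.0201, 0.0152]
error_in_counts = drag_count_mae(cd_pred, cd_true)  # about 1.5 drag counts
```

An error of 2.5 drag counts therefore corresponds to a C_D discrepancy of roughly 0.00025.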