What can ML tools and simulations do for (radio) cosmology?

(Or what keeps me busy in Switzerland)









Nicolas Cerardi











  • Physics-Informed Neural Networks for continuous CDM simulations
  • Constraining the reionization history with higher-order statistics and simulation-based inference
  • Satellite mega-constellations and the Square Kilometre Array Observatory











  • Physics-Informed Neural Networks for continuous CDM simulations


With Emma Tolley & Ashutosh Mishra

The dynamics of CDM
  • Cold Dark Matter obeys the cosmological Vlasov-Poisson equations:
  • $$ \frac{\partial f}{\partial \tau} + \vec{v} \cdot \vec{\nabla}_x f - \vec{\nabla}_x \Phi \cdot \frac{\partial f}{\partial \vec{v}} = 0 $$ $$ \nabla^2_x \Phi = 4\pi G a^2 \bar{\rho}(\tau) \delta(\vec{x}, \tau) $$

1D CDM simulation

Phase-space distribution

  • Cold DM occupies a 1D submanifold in the 1+1D phase-space
  • Multi-stream flow after shell-crossing
Density

  • Singularities arise at shell-crossing locations
  • They affect the CDM dynamics (Poisson equation)
Routes to CDM simulations
  • N-body codes employ particles instead of a fluid
  • This leads to discreteness errors (2-body interactions)


Can we find an alternative to N-body simulations?

Since CDM occupies a low-dimensional submanifold, we can completely describe the dynamics using the displacement field $\zeta(q,\tau)$: $$ x(q,\tau) = q + \zeta(q,\tau) $$
With the equation of motion: $$ \frac{\partial^2 \zeta}{\partial \tau^2} + \frac{3}{2\tau} \frac{\partial \zeta}{\partial \tau} = - \frac{3}{2\tau} \frac{\partial \phi}{\partial x}$$
Can a neural network learn the solution?

Physics Informed Neural Networks (PINNs)


PINNs: simulations seen as an optimisation problem
  • Use a neural network to learn the solution of a PDE
  • Network optimisation is guided by the physical constraints (PDE + boundary)
  • Fully unsupervised $\Longrightarrow$ independent of N-body simulations
  • Provides a continuous representation of the solution, not just particles!
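To make the optimisation concrete, here is a minimal PyTorch sketch of such a PINN objective: PDE residual plus an initial-condition penalty. The network `zeta_net`, the force callable `grad_phi` and all signatures are illustrative, not the paper's implementation.

```python
import torch

def pinn_loss(zeta_net, q, tau, q_ic, tau_ic, zeta_za, dzeta_za, grad_phi):
    """Sketch of the PINN objective. zeta_net(q, tau) -> displacement;
    grad_phi(x, tau) returns the force term as a torch tensor (in practice
    obtained by numerical integration, see the sketch two slides down)."""
    tau = tau.clone().requires_grad_(True)
    zeta = zeta_net(q, tau)

    # velocity and acceleration of the displacement via auto-diff
    vel = torch.autograd.grad(zeta.sum(), tau, create_graph=True)[0]
    acc = torch.autograd.grad(vel.sum(), tau, create_graph=True)[0]

    # residual of: zeta'' + (3/2tau) zeta' = -(3/2tau) dphi/dx
    res = acc + 1.5 / tau * vel + 1.5 / tau * grad_phi(q + zeta, tau)
    loss_pde = (res ** 2).mean()

    # boundary: match the Zel'dovich displacement and velocity at tau_ic
    tau_ic = tau_ic.clone().requires_grad_(True)
    zeta_i = zeta_net(q_ic, tau_ic)
    vel_i = torch.autograd.grad(zeta_i.sum(), tau_ic, create_graph=True)[0]
    loss_ic = ((zeta_i - zeta_za) ** 2).mean() + ((vel_i - dzeta_za) ** 2).mean()

    return loss_pde + loss_ic
```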

Physics Informed Neural Networks (PINNs)

PDE
  • Numerical integration lets us compute $\nabla_x \phi$ (sketched after this slide)
  • Auto-diff provides the remaining terms of the PDE (velocity and acceleration)

Boundary
  • Initial conditions given by the Zel'dovich approximation
  • Boundary conditions set to 0
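In 1D the force term needs no full Poisson solve: integrating the 1D Poisson equation once gives the force from the enclosed overdensity. A numpy sketch; the normalisation and units here are schematic, not the paper's exact convention.

```python
import numpy as np

def force_1d(x_stream, x_eval, box=1.0, n_grid=512):
    """Illustrative 1D gravity: dphi/dx at x is proportional to the
    overdensity enclosed to the left (one integration of Poisson)."""
    edges = np.linspace(0.0, box, n_grid + 1)
    counts, _ = np.histogram(x_stream % box, bins=edges)
    w = np.diff(edges)
    rho = counts / w / x_stream.size * box     # density normalised to mean 1
    delta = rho - 1.0
    centers = 0.5 * (edges[1:] + edges[:-1])

    # cumulative integral of delta -> enclosed overdensity -> dphi/dx
    grad_phi = np.cumsum(delta) * w[0]
    grad_phi -= grad_phi.mean()                # zero mean force (periodic box)
    return np.interp(x_eval % box, centers, grad_phi)
```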

Which neural architecture can we use?


  • Sharp features in acceleration due to the shell-crossings.
  • Our first attempts with MLPs failed to capture these.
  • We also tested more complex activation functions.
  • We need an architecture that provides high flexibility down to the 2nd derivative of the network output

Kolmogorov-Arnold Networks (KANs)


  • The Kolmogorov-Arnold representation theorem: any continuous multivariate function can be written as a composition of sums of univariate functions; in practice, these are approximated with (many) flexible learned 1D functions
  • Allows an efficient parameterisation of the target function

[Figure: a regular MLP with fixed node activations vs. a Kolmogorov-Arnold Network, whose edges are combinations of 3rd-order B-splines; adapted from Liu+25]
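To illustrate what "edges are splines" means, here is a from-scratch toy of a single KAN edge: a univariate function parameterised by learnable cubic B-spline coefficients. The actual implementation follows [Liu+25]; knot handling and initialisation here are simplified.

```python
import torch

def bspline_basis(x, t, k=3):
    """Cox-de Boor recursion. x: (N,) inputs, t: (G,) uniform knot vector.
    Returns (N, G-k-1) B-spline basis values of order k."""
    B = ((x[:, None] >= t[None, :-1]) & (x[:, None] < t[None, 1:])).float()
    for d in range(1, k + 1):
        left = (x[:, None] - t[None, :-d-1]) / (t[d:-1] - t[:-d-1]) * B[:, :-1]
        right = (t[None, d+1:] - x[:, None]) / (t[d+1:] - t[1:-d]) * B[:, 1:]
        B = left + right
    return B

class KANEdge(torch.nn.Module):
    """One KAN edge: a learnable univariate function whose trainable
    parameters are the coefficients of cubic B-splines."""
    def __init__(self, n_knots=12, k=3, lo=-1.0, hi=1.0):
        super().__init__()
        self.k = k
        self.register_buffer("t", torch.linspace(lo, hi, n_knots))
        self.coef = torch.nn.Parameter(0.1 * torch.randn(n_knots - k - 1))

    def forward(self, x):
        # weighted sum of basis functions: (N, n_basis) @ (n_basis,) -> (N,)
        return bspline_basis(x, self.t, self.k) @ self.coef
```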

$\Rightarrow$ Physics-Informed KANs (PIKANs!)

PIKAN implementation
  • We use a [8, 12, 8, 1] KAN architecture (~4500 parameters)
  • We employ independent KANs on consecutive time chunks
$$ x(q,\tau) = q + \textcolor{cyan}{\zeta_0} + \sum_{i=1}^{k-1} \textcolor{orange}{\zeta_i}(q, \tau_{end,i}) + \textcolor{green}{\zeta_k}(q,\tau)$$
  • $ \zeta_0 $ is the Zel'dovich approximation
  • $ \zeta_i $ are the KANs trained on previous chunks
  • $ \zeta_k $ is the KAN operating at $\tau$
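Schematically, evaluating the chunked solution amounts to freezing the KANs of earlier chunks at their end times and adding the live one; function signatures below are illustrative.

```python
import torch

def displacement(q, tau, zeta_0, frozen_kans, live_kan, tau_ends):
    """x(q, tau) = q + zeta_0(q) + sum_i zeta_i(q, tau_end_i) + zeta_k(q, tau).
    frozen_kans: KANs trained on earlier chunks, each evaluated at its
    chunk's end time; live_kan: the KAN being trained on the current chunk."""
    x = q + zeta_0(q)                      # Zel'dovich term
    for kan, t_end in zip(frozen_kans, tau_ends):
        with torch.no_grad():              # earlier chunks stay fixed
            x = x + kan(q, torch.full_like(q, t_end))
    return x + live_kan(q, tau)            # trainable contribution at tau
```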

Results

Results: Phase-space


A very good visual agreement, up to 6 shell-crossings!

Results: displacement, velocity and acceleration


Errors w.r.t. a high-resolution N-body simulation
The error on the displacement is below 2%

Results: density profiles

  • Density profiles are in good agreement
  • In the core, N-body profiles are noisy
  • The PIKAN provides a smoother solution (perhaps too smooth?)

Caveats
  • Training time is quite long (several hours per time chunk on a single GPU)
  • While the learnt representation is defined continuously, we still need to sample points for the numerical integration of $ \nabla_x\phi $
  • (for now a 1D single halo collapse)

What else can we do with such a tool?


  • Unlike an N-body simulation that advances timestep by timestep, the PINN optimisation is performed over all times simultaneously.
  • $\Rightarrow$ We can set the "initial conditions" at any $\tau$

Results: Backward simulation

  • Test case: reverse time through 2 shell crossings.
  • Start from high-resolution N-body data at $\tau_{end}$, optimise to retrieve the underlying initial conditions.

  • Input: $(\zeta, \dot{\zeta}, \ddot{\zeta})$
  • Input: $(\zeta, \dot{\zeta}, \delta(x))$

CDM PIKAN - Takeaways
  • We presented a new method to simulate CDM, independent from N-body simulations.
  • PIKANs allow a continuous and differentiable modelling of the dynamics.
  • It is possible to run the simulation backward to retrieve the initial conditions.
  • Future work: higher dimensions, complex ICs...
[Cerardi, Tolley and Mishra, MNRAS, 2025]












Implicit inference of the reionization history with higher-order statistics of the 21-cm signal

With the SEarCH team: Sambit Giri, Michele Bianco, Davide Piras, Emmanuel de Salis, Massimo de Santis, Merve Selcuk-Simsek, Philipp Denzel, Kelley Hess, Carmen Toribio, Franz Kirsten & Hatem Ghorbel
If you missed the last semester of SEarCH...

  • Participated in the SKA Data Challenge 3b: Inference
  • Goal: infer the neutral fraction $ \bar{x}_{\rm HI} $ in SKA-Low mocks at three different $ z $ to constrain the Epoch of Reionisation
How did we tackle this problem?

Our strategy
  1. Created a large dataset of ~16000 SKA-Low mocks
    • Included instrumental noise but no foregrounds

  2. Trained 2 independent DL approaches
    • Regression task on $ \bar{x}_{\rm HI} $ with deep CNNs
      [De Salis+25]
    • Bayesian inference with Simulation-based Inference (SBI)
"I have a complex and noisy model which doesn't provide an explicit likelihood"
$ \Rightarrow $ learn the posterior directly from simulations!
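For orientation, this is what single-round neural posterior estimation looks like with the off-the-shelf sbi package. A uniform proxy prior is used purely for illustration; in our problem the prior on $ \bar{x}_{\rm HI} $ is implicit in the simulator (see below), and this is not necessarily the exact pipeline we ran.

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# theta: (N, 1) neutral fractions from the simulations; x: (N, D) summaries.
prior = BoxUniform(low=torch.zeros(1), high=torch.ones(1))  # proxy prior
inference = SNPE(prior=prior)
inference.append_simulations(theta, x)
posterior = inference.build_posterior(inference.train())
samples = posterior.sample((10_000,), x=x_obs)  # posterior draws for one mock
```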
Our results at SDC3b
  • Obtained great results on the PS2 data 🎉
Our results at SDC3b
  • But not ranked on the PS1 data
Going beyond the data challenge
  • Computing higher-order statistics from our mock simulations
  • Combining them within our SBI framework
  • Provide statistically robust results
Extending our modelling
  • Rejecting extreme reionisation cases
  • Dataset expanded x2, with additional noise realisations applied to the mocks
  • Test the noise-level dependence: 100h and 1000h SKA-Low noise considered
  • More statistics computed from the mocks
More statistics!

We compute several kinds of statistics $ t $ from the 21-cm cubes (the 1D power spectrum is sketched below)

2-pt statistics
  • 1D power spectrum
  • 2D power spectrum

Higher-order statistics
  • Bispectrum (equilateral & squeezed)
  • Betti numbers (topological invariants)

+ combinations of these metrics
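As an example of the simplest of these summaries, a spherically averaged power spectrum of a brightness cube can be computed as follows; binning and normalisation conventions here are schematic.

```python
import numpy as np

def ps_1d(cube, box_size):
    """Spherically averaged power spectrum of an (n, n, n) 21-cm cube."""
    n = cube.shape[0]
    dk = 2 * np.pi / box_size                        # fundamental mode
    power = np.abs(np.fft.fftn(cube)) ** 2 * box_size**3 / n**6
    k = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    bins = np.arange(dk, np.abs(k).max(), dk)        # up to the Nyquist mode
    idx = np.digitize(kmag, bins)
    pk = np.array([power.ravel()[idx == i].mean() for i in range(1, len(bins))])
    return 0.5 * (bins[:-1] + bins[1:]), pk          # bin centres, P(k)
```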
The true SBI situation here
  • $ \bar{x}_{\rm HI} $ is an output of the simulation: we do not directly sample it
  • Prior on $ \bar{x}_{\rm HI} $ is implicit and non-trivial
  • Direct neural posterior estimation is the way to go!
\[\begin{aligned} p(\bar{x}_{\rm HI} | t) & \propto p(\bar{x}_{\rm HI})\, p(t | \bar{x}_{\rm HI}) \\ & \approx q_\phi (\bar{x}_{\rm HI} | t) \end{aligned} \]
  • We are model-dependent on both the prior and the likelihood
SBI methodology

We apply Variational Mutual Information Maximization (VMIM, Jeffrey+20) to learn $ p(\bar{x}_{\rm HI} | t) $ by:
  • Compressing the stacked statistics $ [t_1, t_2, t_3] $ into a 3D vector $ y $
  • Maximizing the mutual information between $ y $ and $ \bar{x}_{\rm HI} $
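A schematic PyTorch version of the VMIM objective: a compressor network and a small mixture-density head trained jointly by minimising $ -\log q_\phi(\bar{x}_{\rm HI} | y) $, which maximises a variational lower bound on the mutual information. Layer sizes and the mixture head are illustrative choices, not those of our pipeline.

```python
import torch
from torch import nn

class VMIM(nn.Module):
    """Toy VMIM (Jeffrey+20): jointly learn y = F([t1, t2, t3]) and a
    Gaussian-mixture posterior q_phi(x_HI | y)."""
    def __init__(self, dim_t, dim_y=3, n_comp=4):
        super().__init__()
        self.compress = nn.Sequential(
            nn.Linear(dim_t, 128), nn.ReLU(), nn.Linear(128, dim_y))
        # per mixture component: a weight logit, a mean, a log-sigma
        self.head = nn.Sequential(
            nn.Linear(dim_y, 64), nn.ReLU(), nn.Linear(64, 3 * n_comp))

    def loss(self, t, x_hi):
        """t: (N, dim_t) stacked statistics; x_hi: (N, 1) neutral fractions."""
        y = self.compress(t)
        logits, mu, log_sig = self.head(y).chunk(3, dim=-1)
        comp = torch.distributions.Normal(mu, log_sig.exp())
        log_q = torch.logsumexp(
            torch.log_softmax(logits, dim=-1) + comp.log_prob(x_hi), dim=-1)
        return -log_q.mean()   # negative variational bound on I(y; x_HI)
```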

Can you trust me?
  • For each case we train 20+ models to check for the stability of our approach
  • We systematically perform coverage tests and select the best model for each case
Tests of Accuracy with Random Points (TARP), run on the validation set for the different cases
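The TARP test (after Lemos+23) only needs posterior samples and the true parameters; here is a from-scratch illustration of the expected-coverage computation.

```python
import numpy as np

def tarp_ecp(samples, theta_true, n_alpha=20, seed=None):
    """Minimal TARP-style coverage test. samples: (n_post, n_sims, d)
    posterior draws; theta_true: (n_sims, d). Returns (alpha, ECP)."""
    rng = np.random.default_rng(seed)
    n_post, n_sims, d = samples.shape
    # random reference points drawn over the samples' bounding box
    lo, hi = samples.min(axis=(0, 1)), samples.max(axis=(0, 1))
    refs = rng.uniform(lo, hi, size=(n_sims, d))
    # credibility of the truth: fraction of samples closer to the reference
    d_samp = np.linalg.norm(samples - refs, axis=-1)      # (n_post, n_sims)
    d_true = np.linalg.norm(theta_true - refs, axis=-1)   # (n_sims,)
    f = (d_samp < d_true).mean(axis=0)                    # (n_sims,)
    alpha = np.linspace(0.0, 1.0, n_alpha)
    ecp = np.array([(f < a).mean() for a in alpha])
    return alpha, ecp   # a well-calibrated posterior gives ecp ~ alpha
```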
Results: Individual test scenarios
Good, but these are individual posteriors. What about the big picture?
Results: global trends
  • $ \sigma(\bar{x}_{\rm HI}) $ decreases by ~40% from 100h to 1000h (but this is strongly statistic-dependent)
  • PS2D + higher order stats: $ \sigma(\bar{x}_{\rm HI}) $ decreases by ~60% at 100h and ~40% at 1000h
Results: the relative strength of each statistic

  • Betti numbers are more informative towards the end of EoR
  • Power spectrum is more constraining in the early EoR
  • Bispectrum mildly contributes to early EoR constraints

SBI for EoR - Takeaways
  • Based on our work for SDC3b, we investigated the reionisation history through several summary statistics
  • We employed 2-pt statistics (1D and 2D power spectra) together with higher-order statistics (bispectrum and Betti numbers)
  • We applied an SBI framework to infer parameters from these statistics
  • Betti numbers (alone and combined with others) are particularly promising
  • We also identify different regimes of relevance for each statistic

[Cerardi et al., submitted to MNRAS, 2025]












Satellite mega-constellations and the Square Kilometre Array Observatory

With Emma Tolley, Federico di Vruno and Chris Finlay
We are not alone... on the radio spectrum!
The night sky now... (image credit: Alan Dyer / amazingsky.com)
Radio signals from satellites
  • Intended emissions (in ITU-R allocated bands)
  • Reflection from sunlight and terrestrial sources
  • Unintended Electromagnetic Radiation (UEMR)
Starlink satellites show both spectral lines and broad-band features at low frequencies [Di Vruno+23]
Can we forecast the future observing conditions of the SKAO?

Today
  • Starlink: 9000+
  • OneWeb: ~700
In 5-10 years
  • Starlink: ~40000
  • OneWeb: 720
  • GuoWang: ~12000
  • QianFan: ~1300
  • Leo: ~3000

  • Propagating orbits for 50,000 satellites is computationally expensive...
  • Instead, we can model the density of satellites in the sky with the analytical model of [Bassa+2021] (sketched below)
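The geometric core of such analytic models is simple: for a circular shell at inclination $i$, the sub-satellite latitude distribution follows from uniform motion along the orbit, and the density diverges at latitude $\pm i$, producing the sharp shell edges seen below. A hedged sketch; the exact conventions in [Bassa+2021] may differ.

```python
import numpy as np

def shell_density(lat_deg, incl_deg, n_sat):
    """Surface density (per steradian, on the unit sphere) of satellites in
    one orbital shell as a function of geographic latitude. From
    sin(lat) = sin(incl) * sin(u) with u uniform along the orbit."""
    lat, incl = np.radians(lat_deg), np.radians(incl_deg)
    dens = np.zeros_like(lat)
    inside = np.abs(lat) < incl          # a shell only covers |lat| < incl
    dens[inside] = n_sat / (
        2 * np.pi**2 * np.sqrt(np.sin(incl)**2 - np.sin(lat[inside])**2))
    return dens                          # diverges at lat = +/- incl (edges)

# e.g. a hypothetical 53-degree shell of 1584 satellites, over all latitudes
lat = np.linspace(-90.0, 90.0, 1801)
rho = shell_density(lat, incl_deg=53.0, n_sat=1584)
```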
Satellite density maps

  • Sharp features correspond to the edges of satellite shells
  • The SKAO will be covered by almost all shells
The exposure of SKA-Low to mega-constellations

  • Significantly exposed to satellites at all frequencies, up to 100% below 100 MHz
  • Flagging will not be an option $\Rightarrow$ RFI subtraction
Trajectory-based Subtraction and Calibration

TABASCAL (Chris Finlay): forward model of telescope + RFI + astronomical sources. Joint fit at the visibility level.
  • Improves gain calibration
  • Enables RFI modelling and subtraction

RFI subtraction vs. flagging on simulated MeerKAT observations, from [Finlay+2025].
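The idea behind trajectory-based subtraction in one toy equation: a known satellite path fixes the rapidly varying fringe phase on each baseline, so the RFI contribution is predictable up to an amplitude and can be fit jointly with the gains. A heavily simplified numpy sketch (plane-wave far field, no gains or primary beam; tabascal itself models the near-field geometry and the full measurement equation).

```python
import numpy as np

def rfi_fringe(t, ant_p, ant_q, sat_dir, sat_amp, freq):
    """Toy RFI visibility of one satellite on baseline (p, q).
    ant_p, ant_q: (3,) antenna positions [m]; sat_dir(t): (n_t, 3) unit
    vectors towards the satellite along its known trajectory;
    sat_amp(t): (n_t,) complex transmitter amplitude at the array."""
    c = 299_792_458.0
    # geometric delay between the two antennas towards the moving satellite
    delay = (ant_p - ant_q) @ sat_dir(t).T / c        # (n_t,), seconds
    return sat_amp(t) * np.exp(-2j * np.pi * freq * delay)
```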
Satellite constellations - Takeaways
  • Satellites will be ubiquitous in SKAO observations
  • A serious threat for low frequency science
  • Future work: forecast the RFI contamination from these constellations


Trajectory-based RFI subtraction
  • Requires priors on satellite orbits, plus HPC resources
  • Currently being tested on real observations
Thank you for your attention!



(What also keeps me busy in Switzerland)