What I learned co-writing an applied-ML paper on cerebral aneurysms
We replaced an expensive CFD simulation with a neural network surrogate. The trick wasn't the network. It was throwing 99% of the dimensions away first.
Most of my code goes to production, not to journals. This project was the exception. We had a problem: predicting wall shear stress fields on the inner walls of cerebral aneurysms. The ground-truth method is computational fluid dynamics (CFD). CFD is accurate, but it's slow. A single high-fidelity simulation can take hours. If you want to scan ten variations of a patient's vessel geometry, you're cooking for a day.
The paper, published in Boletim da Sociedade Paranaense de Matemática, asks: can we trade some accuracy for orders of magnitude in speed? Yes, but only if you respect the math.
Step 1, throw away 99% of the dimensions
The naive move is to throw a deep network at the raw mesh. A CFD result on a typical aneurysm mesh has hundreds of thousands of nodes. A direct regression has hundreds of thousands of outputs. Nothing trains on that.
The classical move is Proper Orthogonal Decomposition (POD). You take a stack of simulation snapshots, treat each as a vector, and find the orthogonal basis that captures the most variance. Keep the top-k basis vectors. Now every snapshot is just k coefficients instead of hundreds of thousands of numbers.
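A minimal sketch of that decomposition, assuming the snapshots are stacked as columns of a NumPy array and that the mean field is subtracted first (a common convention, not something stated about the paper). The function name `pod_basis`, the `energy` threshold, and the synthetic data are illustrative only.

```python
import numpy as np


def pod_basis(snapshots: np.ndarray, energy: float = 0.99):
    """Compute a truncated POD basis from a snapshot matrix.

    snapshots: (n_nodes, n_snapshots), one flattened field per column.
    energy: fraction of total variance the kept modes must capture.
    Returns (mean, basis, coeffs); basis has shape (n_nodes, k).
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    centered = snapshots - mean
    # Thin SVD: the columns of U are the orthonormal POD modes.
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Squared singular values are proportional to the variance per mode.
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    basis = U[:, :k]
    coeffs = basis.T @ centered  # (k, n_snapshots)
    return mean, basis, coeffs


# Synthetic stand-in for CFD output: 40 snapshots built from 3 hidden modes.
rng = np.random.default_rng(0)
n_nodes, n_snapshots, true_rank = 2000, 40, 3
modes = rng.normal(size=(n_nodes, true_rank))
weights = rng.normal(size=(true_rank, n_snapshots))
snapshots = modes @ weights + 5.0  # constant offset is removed by centering

mean, basis, coeffs = pod_basis(snapshots, energy=0.999)
reconstruction = mean + basis @ coeffs
print(basis.shape, np.abs(reconstruction - snapshots).max())
```

On real simulation data the singular values decay gradually instead of dropping to zero, so the choice of k is a trade-off between reconstruction error and the size of the regression target.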
Step 2, learn the coefficients, not the field
Once you have a low-dimensional representation, the regression problem is small enough to fit a network on. We used a feedforward net with architecture 2 → 10 → 5: two geometric input parameters describing the aneurysm shape, a single hidden layer of 10 units, and 5 POD coefficients out.
```python
import torch.nn as nn


class Surrogate(nn.Module):
    def __init__(self, n_geom=2, n_modes=5, hidden=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_geom, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_modes),
        )

    def forward(self, x):
        # x: (B, 2) geometric params --> (B, 5) POD coefficients
        return self.net(x)
```

The whole network is tiny. Inference is microseconds. Reconstructing the full field is one matrix multiply against the POD basis. End to end: about 5 ms per query, on a CPU, for what CFD does in hours.
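The reconstruction step can be sketched as follows, assuming the POD basis is stored as an (n_nodes, k) matrix with orthonormal columns and that a mean field was subtracted before the decomposition. Neither detail is spelled out here, and `reconstruct_field`, `basis`, and `mean_field` are hypothetical names with random placeholder data.

```python
import numpy as np


def reconstruct_field(coeffs: np.ndarray, basis: np.ndarray,
                      mean_field: np.ndarray) -> np.ndarray:
    """Map POD coefficients back to full fields.

    coeffs: (B, k) predicted coefficients, one row per query.
    basis: (n_nodes, k) POD modes with orthonormal columns.
    mean_field: (n_nodes,) mean subtracted before the decomposition.
    Returns an array of shape (B, n_nodes).
    """
    return mean_field + coeffs @ basis.T


# Placeholder data: a random orthonormal basis instead of real POD modes.
rng = np.random.default_rng(0)
n_nodes, k, batch = 50_000, 5, 4
basis, _ = np.linalg.qr(rng.normal(size=(n_nodes, k)))
mean_field = rng.normal(size=n_nodes)
coeffs = rng.normal(size=(batch, k))  # stand-in for network outputs

fields = reconstruct_field(coeffs, basis, mean_field)
# Projecting back onto the basis recovers the coefficients.
recovered = (fields - mean_field) @ basis
print(fields.shape, np.abs(recovered - coeffs).max())
```

If the coefficients were standardised for training, they would need to be mapped back to their original scale before this multiply.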
What I expected to be hard wasn't
I expected the network architecture to matter. It didn't, really. I tried bigger nets, deeper nets, ReLU vs Tanh. Differences were noise. The expressive power of the network was never the bottleneck.
What did matter:
- Sampling the input geometry distribution well. Bad sampling means basis vectors that don't span the test cases.
- Standardising coefficients before regression. The coefficients of different POD modes can have wildly different magnitudes.
- Holding out by shape family, not random splits. Random splits leak structure across the train/test boundary.
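The last two points can be sketched together. This assumes each sample carries a shape-family label and that the scaling statistics are computed on the training rows only. The family labels, the helper names `split_by_family` and `fit_standardiser`, and the random coefficients are all made up for illustration.

```python
import numpy as np


def split_by_family(families, test_families):
    """Return (train_idx, test_idx) so that no family appears in both."""
    families = np.asarray(families)
    test_mask = np.isin(families, list(test_families))
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def fit_standardiser(coeffs: np.ndarray):
    """Per-mode mean and standard deviation, fitted on training rows only."""
    mu = coeffs.mean(axis=0)
    sigma = coeffs.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard against constant modes
    return mu, sigma


# Hypothetical data: 12 samples from 3 families, 5 coefficients per sample
# whose scales differ by orders of magnitude.
rng = np.random.default_rng(0)
families = np.array(["A"] * 6 + ["B"] * 4 + ["C"] * 2)
coeffs = rng.normal(size=(12, 5)) * np.array([100.0, 10.0, 1.0, 0.1, 0.01])

train_idx, test_idx = split_by_family(families, {"C"})
mu, sigma = fit_standardiser(coeffs[train_idx])
z_train = (coeffs[train_idx] - mu) / sigma  # regression targets
z_test = (coeffs[test_idx] - mu) / sigma    # scaled with training statistics


def unscale(z: np.ndarray) -> np.ndarray:
    """Undo the scaling on predictions before reconstructing a field."""
    return z * sigma + mu
```

Fitting `mu` and `sigma` on the training rows alone keeps the held-out family from influencing the scaling.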
Why I keep thinking about this project
Most of my work is in the web stack. APIs, queues, dashboards. Different rhythm. But this taught me something I keep applying: when the problem looks too big, ask whether you're working in the wrong basis. A regression with hundreds of thousands of outputs is intractable. A 5-dimensional one is easy. The hard part was the change of basis, not the learning.
I think about that every time a backend problem feels intractable. Usually it's because I'm trying to operate on the raw data, when there's a cheaper representation hiding behind one mathematical move.