Simulation, Modeling and Synthetic Data Lab

I run the Simulation, Modeling and Synthetic Data Lab. Our research focuses on the creation of high-quality, realistic-looking synthetic data.

I am interested in hydrodynamical models in astrophysics, specifically using the smoothed particle hydrodynamics method, and in using deep-learning generative models to create synthetic data in a variety of areas, such as financial data.

If you are a student looking to work with me at the undergraduate or graduate level, please check out my current student opportunities.

Astrophysical Smoothed Particle Hydrodynamics

I am broadly interested in the numerical details of smoothed particle hydrodynamics (SPH) simulations.

Phantom

I am a development lead for Phantom, a high-performance astrophysical SPH code for simulating a wide array of astrophysical problems.

Parallel and Distributed Computing

Simulations are computationally expensive, and computing architectures and supercomputing clusters continue to increase in scale. Making efficient use of these resources requires highly effective parallel schemes, so that memory and communication overhead do not bottleneck a computation. I am interested in developing shared-memory threaded schemes that scale to large core counts, and distributed computing algorithms for SPH codes that scale to hundreds or thousands of cluster nodes.

Numerical Analysis and Convergence

All simulations are underpinned by the quality of their methods. I conduct experimental studies to understand and prove the numerical properties of SPH, to identify, among other things, the root cause of numerical errors and the physical regimes in which a numerical method is or is not suitable. For example, I have studied the convergence properties of SPH on the linear and non-linear evolution of the Kelvin-Helmholtz instability, demonstrating the numerical conditions under which convergent solutions may be obtained.
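As a rough illustration of what a convergence measurement involves, the sketch below estimates the observed order of convergence from errors measured at successive resolutions. The function name and the error values are hypothetical: the errors are generated from an idealized second-order scaling purely for demonstration and are not results from any SPH study.

```python
import math


def observed_orders(resolutions, errors):
    """Estimate the convergence order between each pair of successive resolutions.

    Assumes error ~ C * n**(-p), so p = log(e_lo / e_hi) / log(n_hi / n_lo).
    """
    orders = []
    for i in range(len(resolutions) - 1):
        n_lo, n_hi = resolutions[i], resolutions[i + 1]
        e_lo, e_hi = errors[i], errors[i + 1]
        orders.append(math.log(e_lo / e_hi) / math.log(n_hi / n_lo))
    return orders


# Made-up data: errors that fall off exactly as n**-2 (idealized, for illustration only).
resolutions = [64, 128, 256, 512]
errors = [1.0 / n**2 for n in resolutions]

orders = observed_orders(resolutions, errors)
print(orders)
```

In a real study the errors would come from comparing simulation output against a reference solution, and an observed order that drifts or stalls as resolution increases is one sign that a numerical error source is limiting convergence.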

Multi-physics SPH

Astrophysics is a soup combining multiple types of physics. My research involves creating and testing multi-physics SPH algorithms, that is, SPH methods that simultaneously solve hydrodynamics coupled to other physical processes. A key line of research in this area has been the development of magnetohydrodynamics methods for SPH.

Data Science Visualization and Analysis Tools

Sarracen is our Python library for SPH visualization and analysis. Our goal is to provide support for visualization and analysis of SPH data in an SPH-specific way using the smoothing kernel. Sarracen can be used interactively in a Jupyter notebook environment, and supports multi-threaded and GPU (through CUDA) visualization.
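To give a sense of what using the smoothing kernel means, here is a minimal pure-Python sketch of kernel-weighted interpolation. It assumes the widely used 3D cubic spline kernel and the standard SPH summation A(x) = sum_j (m_j / rho_j) A_j W(|x - x_j|, h_j). The function names and the particle tuple layout are hypothetical and are not Sarracen's API.

```python
import math


def cubic_spline_kernel(r, h):
    """3D cubic spline (M4) kernel with compact support radius 2h."""
    q = r / h
    sigma = 1.0 / (math.pi * h**3)  # 3D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def sph_interpolate(point, particles):
    """Kernel-weighted estimate of a quantity at `point`.

    `particles` is a list of (position, mass, density, smoothing_length, value)
    tuples; this layout is an assumption made for the sketch.
    """
    total = 0.0
    for position, mass, density, h, value in particles:
        r = math.dist(point, position)
        total += mass / density * value * cubic_spline_kernel(r, h)
    return total


# A single hypothetical particle at the origin carrying the value 3.0.
particles = [((0.0, 0.0, 0.0), 1.0, 1.0, 1.0, 3.0)]
estimate = sph_interpolate((0.0, 0.0, 0.0), particles)
print(estimate)
```

Rendering an image amounts to evaluating a sum of this kind at every pixel, which is why multi-threaded and GPU implementations matter for large particle counts.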

Synthetic Data Generation

Synthetic data is artificially created data that looks like real data at the level of individual samples, in its statistical properties, and in its complex non-linear relationships. I partner closely with Verafin, a world-leading company in the detection and prevention of money laundering and financial crime.

Financial Transaction Sequences

Banksformer is our transformer-based machine learning model for the generation of synthetic financial transaction sequences. Financial transaction sequences are a complex series of multivariate, multi-modal temporal events that occur at irregular intervals. Banksformer is designed for general banking data, such as personal bank accounts. It supports multiple classes of transactions, and contains specific optimizations to robustly capture time and date patterns.
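To illustrate why irregular timing and date patterns need special handling, the sketch below shows one common way to turn raw timestamps into model-friendly features: the elapsed time since the previous event, plus a cyclic sine/cosine encoding of the day of the week. This is a generic technique with hypothetical names, and it is an assumption for illustration only, not a description of how Banksformer encodes time.

```python
import math
from datetime import datetime


def encode_sequence(timestamps):
    """Return (days_since_previous, sin_weekday, cos_weekday) for each event."""
    features = []
    previous = None
    for ts in timestamps:
        # The first event has no predecessor, so its gap is set to zero.
        delta_days = 0.0 if previous is None else (ts - previous).total_seconds() / 86400.0
        # Cyclic encoding keeps Sunday and Monday close together in feature space.
        angle = 2.0 * math.pi * ts.weekday() / 7.0
        features.append((delta_days, math.sin(angle), math.cos(angle)))
        previous = ts
    return features


# Made-up timestamps at irregular intervals.
timestamps = [
    datetime(2023, 1, 2, 9, 0),
    datetime(2023, 1, 2, 21, 0),
    datetime(2023, 1, 6, 9, 0),
]
features = encode_sequence(timestamps)
print(features)
```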

Agent-based Modelling of Financial Data

Our agent-based approach to generating financial transaction sequences simulates the behaviour of individuals in order to create a realistic history of financial transactions.
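As a toy illustration of the agent-based idea, the sketch below simulates a single hypothetical individual who is paid every 30 days and makes a purchase on a random subset of days. Every rule, parameter, and name here is an assumption invented for the example; the actual approach models far richer behaviour.

```python
import random


def simulate_agent(monthly_salary, daily_purchase_prob, days, seed=0):
    """Generate a (day, kind, amount) transaction history for one toy agent."""
    rng = random.Random(seed)  # seeded so the history is reproducible
    balance = 0.0
    history = []
    for day in range(days):
        if day % 30 == 0:  # payday, simplified to every 30 days
            balance += monthly_salary
            history.append((day, "salary", monthly_salary))
        if rng.random() < daily_purchase_prob:
            amount = round(rng.uniform(5.0, 100.0), 2)
            if amount <= balance:  # the agent never overdraws in this sketch
                balance -= amount
                history.append((day, "purchase", -amount))
    return history


history = simulate_agent(monthly_salary=3000.0, daily_purchase_prob=0.6, days=90)
for transaction in history[:5]:
    print(transaction)
```

Because behaviour is expressed as explicit rules, a history like this can be inspected and adjusted directly, which is one design trade-off relative to learned generative models.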

Generative Deep-learning Models

We use generative deep-learning techniques, such as generative adversarial networks (GANs), for the creation of a variety of data. DG-GAN is our model for generating demographic data.