Available PhD Projects 

ExaGEO equips students with the skills, knowledge, and principles of exascale computing — drawing from geoscience, computer science, statistics, and computational engineering — to tackle some of the most pressing challenges in Earth and environmental sciences and computational research. Students will work under expert supervision in the following fields:

  • Atmosphere, hydrosphere, cryosphere, and ecosystem processes and evolution
  • Geodynamics, geoscience and environmental change
  • Geologic hazard analysis, prediction and digital twinning
  • Sustainability solutions in engineering, environmental, and social sciences

Each student will be positioned within a multidisciplinary supervisory team: one computational supervisor, one domain expert, and one supervisor from an Earth, environmental, and/or social science research background. This ‘team-based’ supervisory approach is designed to enhance multidisciplinary training.

Please note that some projects may currently have incomplete supervisory teams; however, the full teams will be finalised before the start of the PhD.

Overview of the ExaGEO Student Experience

Project Selection and Information

  • You must apply for three projects. Each project has two variations (‘teaser projects’). During your first year, after working on both teaser projects (under the same supervisory team), you will select the one that best aligns with your interests. For further information on how this process works, please see the FAQs section on our Apply page.
  • Your PhD institution will be determined by the Principal Supervisor’s institutional affiliation.
  • You can apply for projects at different institutions. 
  • Projects are grouped by research field.
  • If you have any queries regarding a specific project, please contact the supervisor listed first (this will be the Principal Supervisor). 
  • Projects are funded via ExaGEO; this includes fees, stipends and a Research Training Support Grant. For further information, please see our Apply page.

 

Projects with a focus on Atmosphere, Hydrosphere, Cryosphere, and Ecosystem Processes and Evolution:

 

  • A Dangerous Duo: Exploring the Impact of Heatwaves on Air Pollution

    Project institution:
    Project supervisor(s):
    Prof Ryan Hossaini (Lancaster University), Dr Andrea Mazzeo (Lancaster University), Mr Michael Thomas (Reliable Insights), Dr Emma Eastoe (Lancaster University), Dr James Keeble (Lancaster University) and Dr Helen Macintyre (UK Health Security Agency)

    Overview and Background

    Heatwaves (i.e. sustained periods of exceptionally hot weather) are a well-recognised public health hazard. Strong evidence shows that the co-occurrence of extreme air pollution events during heatwaves amplifies health risks [1,2]. The 2022 European heatwave, when the UK recorded its first ever temperature >40°C, was accompanied by a widespread deterioration in air quality, with air pollutants at ground level exceeding safe limits across much of the continent [3]. Causal relationships between extreme temperature and air pollution are complex, involving weather patterns that promote air stagnation, pollutant emissions (e.g. from wildfires) and atmospheric photochemistry [4,5]. These factors are not yet well understood but are important to disentangle, as the frequency and intensity of summer heatwaves are expected to rise due to climate change [6].

    This project’s overarching goals are to (1) characterise the response of air pollutants to heatwaves across Europe and to assess their combined health impacts, (2) provide new process-level insight into the causal relationship between extreme temperature and air pollution, and (3) evaluate and improve current systems for forecasting extreme air pollution events. This will be achieved through ultra-high-resolution air quality model simulations, analysis of air pollutant measurement data, and by exploring data-driven approaches to air quality forecasting and inference, including machine learning.

    The successful candidate will join LEC’s vibrant atmospheric science research group (AtMOS) and benefit from a diverse supervisory team. This includes a placement with a UK industry partner (Reliable Insights Ltd www.reliable-insights.com / https://www.tangentworks.co.uk/) and partnership with the UK Health Security Agency. 

    Methodology and Objectives

    Teaser Project 1: What drives adverse air quality during heatwaves?
    In Year 1, the goal will be to characterise the observed response of ground-level ozone (an important air pollutant) during European heatwaves. Using the severe summer 2022 heatwave as a case study, an analysis of surface temperature and ozone measurements from hundreds of sites across Europe will be performed (e.g. exploiting the extensive TOAR-II measurement database). Output from the UCI chemical transport model (CTM), a global model developed and maintained by the Lancaster atmospheric science group, will also be analysed. The ability of the model to simulate the behaviour of ozone during the heatwave, including the observed ozone-temperature relationship, will be determined. This work will provide a strong grounding in the observational datasets and atmospheric model used in subsequent years.

    In Year 2, emphasis will be placed on understanding the various processes (meteorological, chemical, physical) responsible for elevating ozone during heatwaves. This will be achieved through carefully designed model sensitivity experiments that allow these factors to be disentangled and quantified. One line of enquiry will be to assess the importance of temperature-dependent ozone precursor emissions, such as emissions from wildfires (CO, NOx) and volatile organic compound emissions from stressed vegetation. Another will be to assess long-range transport of ozone into continental Europe from other world regions. As the project progresses in Years 2 and 3, the scope will expand to consider other notable heatwave years and air pollutants. The ability of regional-scale models to forecast extreme air pollutant events will be explored, as will innovative data-driven approaches (e.g. machine learning).

    Teaser Project 1 Objectives:

    • Characterise the ozone-heatwave response across multiple European summers using surface and satellite measurements.
    • Assess the ability of atmospheric models to capture extreme ozone events and the observed ozone-temperature relationship.
    • Interpret the observed ozone-heatwave responses using high resolution model simulations and assess the roles of meteorological, chemical and physical factors.
    • Explore data-driven approaches to air quality forecasting.

    Teaser Project 2: Optimising air quality forecasts: exploring data-driven methods

    Process-based air quality models are frequently used to simulate past (i.e. ‘hindcast’) air quality. This provides information needed to quantify how air pollutant levels have changed over time (e.g. due to policy interventions) and the implications for human exposure and health. Such models are also increasingly used to alert the public in advance of upcoming air pollution episodes (i.e. ‘forecast’). Like weather forecasts, air quality forecasts are provided up to several days ahead, though confidence generally decreases as the forecast range increases. As the skill of air quality models is often inadequate, particularly for the most ‘extreme’ episodes, various approaches to ‘bias correct’ model forecasts (before they are issued) have emerged [7,8].

    In Year 1, the goal will be to examine the ability of WRF-Chem to simulate UK air quality in hindcast mode, with an emphasis on quantifying the model’s skill during recent heatwaves. WRF-Chem is a well-evaluated and widely adopted process model suitable for national-scale high resolution simulations. The model’s skill in simulating extreme air pollutant levels will be assessed using the UK’s extensive network of air pollutant measurements and by considering a range of key performance metrics (e.g. hit rate, false alarm rate etc.). This will provide a solid grounding on the strengths and weaknesses of air quality models and approaches to evaluate them.

    In Year 2, the effectiveness of a range of bias correction techniques applied to the WRF-Chem simulations will be explored, including machine learning-based approaches [9]. Methods that improve the simulated tail of the ozone distribution (e.g. ‘quantile mapping’) will be examined. Bias-corrected hindcasts will be produced, and the annual mortality burden attributable to long-term air pollutant exposure, along with the health impacts of elevated air pollution in conjunction with heatwaves, will be quantified [10].
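
    To make the quantile-mapping idea concrete, below is a minimal sketch of empirical quantile mapping in Python: new model values are mapped through the empirical distributions of a paired model-observation training period, pulling the simulated tail towards the observed one. The synthetic ozone data and variable names are illustrative assumptions, not project code.

        import numpy as np

        def quantile_map(model_train, obs_train, model_new):
            """Map model values onto the observed distribution via paired empirical CDFs."""
            q = np.linspace(0, 1, 101)
            model_q = np.quantile(model_train, q)   # model quantiles
            obs_q = np.quantile(obs_train, q)       # observed quantiles
            # Find each new value's position in the model distribution,
            # then read off the observed value at that same quantile.
            return np.interp(model_new, model_q, obs_q)

        # Example: correct a synthetic low-biased ozone hindcast (units: ug/m3).
        rng = np.random.default_rng(1)
        obs = rng.gamma(shape=4.0, scale=15.0, size=5000)           # "observed" ozone
        model = 0.8 * rng.gamma(shape=4.0, scale=15.0, size=5000)   # biased "model" ozone
        corrected = quantile_map(model, obs, model)
        print(np.percentile(obs, 99), np.percentile(model, 99), np.percentile(corrected, 99))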

    In Year 3, the focus of the project will be to explore how data science approaches can improve the skill of air quality forecasts across a range of forecast lead times (24 to 96 hours). Important predictor variables for adverse UK air quality will be assessed and ranked, including the potential for their near real-time assimilation (e.g. surface and satellite data). These data will be used to train machine learning models, and the resulting data-driven forecasts will be evaluated against process models. A key question will be whether a data-driven approach outperforms a traditional process model forecast and, if so, under what conditions. Whether the two may be combined to produce optimal results will also be examined.
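
    As a simple, hedged illustration of this comparison, the sketch below trains an off-the-shelf machine learning regressor on lagged predictors to forecast ozone at a fixed lead time and scores it against a persistence baseline (standing in here for a reference forecast; a process model would play that role in practice). The synthetic temperature-driven ozone series, the 48-hour feature window and the 24-hour lead time are illustrative assumptions.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.metrics import mean_absolute_error

        rng = np.random.default_rng(0)
        n = 2000                                   # hourly records
        temp = 20 + 8 * np.sin(2 * np.pi * np.arange(n) / 24) + rng.normal(0, 1, n)
        ozone = 40 + 2.5 * (temp - 20) + rng.normal(0, 5, n)   # temperature-driven ozone

        lead, lags = 24, 48                        # forecast lead time and history window (hours)
        X = np.column_stack(
            [temp[i:n - lead - lags + i] for i in range(lags)]
            + [ozone[i:n - lead - lags + i] for i in range(lags)]
        )
        y = ozone[lags + lead:]                    # target: ozone ~24 h after the feature window

        split = int(0.8 * len(y))                  # chronological train/test split
        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X[:split], y[:split])

        pred = model.predict(X[split:])
        persistence = ozone[lags - 1:n - lead - 1][split:]   # "no change" baseline forecast
        print("ML MAE:         ", mean_absolute_error(y[split:], pred))
        print("Persistence MAE:", mean_absolute_error(y[split:], persistence))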

    The development of data-driven solutions to air quality forecasting is a rapidly evolving field with strong opportunities for impactful science. During Year 3, a placement with Reliable Insights (a leading Lancaster-based time-series data specialist) will be undertaken. This placement will provide an excellent opportunity to develop industry links and to implement the project’s data-driven forecasting approaches in an applied setting.

    Teaser Project 2 Objectives:

    • Evaluate the skill of the WRF-Chem model in simulating air pollutants during recent UK heatwaves employing a range of metrics.
    • Test and apply bias correction techniques to produce optimised WRF-Chem hindcasts and use these to quantify the human health effects of UK air pollution over time.
    • Explore a data-driven air quality forecasting system for the UK, exploiting machine learning and other innovative data science approaches.

    References & Further Reading

    [1] Schnell, J.L., and Prather, M.J. (2017). Co-occurrence of extremes in surface ozone, particulate matter, and temperature over eastern North America. Proc. Natl. Acad. Sci., 114, 2854-2859, https://doi.org/10.1073/pnas.1614453114
    [2] Gouldsbrough, L., Hossaini, R., Eastoe, E., & Young, P.J.Y. (2022). A temperature-dependent extreme value analysis of UK surface ozone, 1980-2019. Atmos. Env., 273, 118975.
    [3] https://atmosphere.copernicus.eu/copernicus-scientists-warn-very-high-ozone-pollution-heatwave-continues-across-europe
    [4] Pope, R. J., et al. (2023). Investigation of the summer 2018 European ozone air pollution episodes using novel satellite data and modelling, Atmos. Chem. Phys., 23, 13235-13253, https://doi.org/10.5194/acp-23-13235-2023.
    [5] Otero, N., Jurado, O. E., Butler, T., and Rust, H. W. (2022). The impact of atmospheric blocking on the compounding effect of ozone pollution and temperature: a copula-based approach, Atmos. Chem. Phys., 22, 1905-1919, https://doi.org/10.5194/acp-22-1905-2022.
    [6] Doherty, R.M., Heal, M.R., and O’Connor, F.M. (2017). Climate change impacts on human health over Europe through its effect on air quality, Environ. Health, 16, https://doi.org/10.1186/s12940-017-0325-2.
    [7] Staehle, C., et al. (2024). Technical note: An assessment of the performance of statistical bias correction techniques for global chemistry–climate model surface ozone fields, Atmos. Chem. Phys., 24, 5953-5969, https://doi.org/10.5194/acp-24-5953-2024.
    [8] Neal, L.S., Agnew, P., Moseley, S., Ordóñez, C., Savage, N.H. and Tilbee, M. (2014). Application of a statistical post-processing technique to a gridded, operational, air quality forecast, Atmos. Env., 98, 385-393, https://doi.org/10.1016/j.atmosenv.2014.09.004.
    [9] Gouldsbrough, L., Hossaini, R., Eastoe, E., Young, P.J.Y. & Vieno, M. (2024). A machine learning approach to downscale EMEP4UK: analysis of UK ozone variability and trends. Atmos. Chem. Phys., 24, 3163-3196, https://doi.org/10.5194/acp-24-3163-2024.
    [10] Macintyre, H.L., et al. (2023). Impacts of emissions policies on future UK mortality burdens associated with air pollution. Environ. Int., 174, 107862, https://doi.org/10.1016/j.envint.2023.107862.

  • Antarctic Ice Loss in High Definition: Analysing novel high-resolution satellite data streams for quantifying 21st century change

    Project institution:
    Project supervisor(s):
    Prof Mal McMillan (Lancaster University), Dr Dave McKay (University of Edinburgh), Dr Jenny Maddalena (Lancaster University) and Dr Israel Martinez Hernandez (Lancaster University)

    Overview and Background

    This project offers an exciting opportunity to be at the forefront of research exploiting the potential of exascale computing for satellite monitoring of Earth’s polar ice sheets, at scale.

    The polar regions are among the most rapidly warming regions on Earth, with ongoing melting of ice sheets and ice caps making a significant contribution to global sea level rise. As Earth’s climate continues to warm throughout the 21st Century, ice melt is expected to accelerate, leading to large-scale social and economic disruption.

    Satellites provide a unique tool for monitoring the impact of climate change upon the polar regions, and are key to tracking the ongoing contribution that ice masses make to sea level rise. With recent increases in data volumes, computing power and the use of data science comes huge potential to rapidly advance our ability to monitor and predict changes across this vast and inaccessible region. However, this potential is not yet fully realised.

    This project will place you at the forefront of this research, working to advance our current capabilities towards exascale computing, through a combination of state-of-the-art satellite datasets, high performance compute, and innovative data science methods. You will be supported by a multidisciplinary supervisory team of statisticians, computer scientists and environmental scientists, with opportunities to contribute to projects run by the European Space Agency. Specifically, this project aims to develop new large-scale, high-resolution estimates of 21st century Antarctic ice loss and, in doing so, better constrain our estimates of past and future sea level rise. 

    Methodology and Objectives

    Project Aim: This project aims to utilise new streams of satellite data, alongside advanced statistical algorithms and compute, to transform our ability to monitor Antarctic Ice Sheet mass loss at high spatial resolution and at the pan-Antarctic scale. More specifically, the successful candidate will develop new estimates of ice sheet mass loss from high-volume, high-resolution satellite-derived Digital Elevation Models (DEMs), using efficient GPU-enabled processing flows. These will be used to determine unique, large-scale estimates of 21st century ice sheet mass loss and glacier evolution.

    Methods Used:

    This project will build upon recent proof-of-concept work that has developed a novel pipeline for processing high-volume (hundreds of terabytes), extremely high-resolution (metre-scale) time series of Digital Elevation Models. The focus of this PhD will be to adapt these methods so that – for the first time – it is computationally feasible to apply them at the ice sheet scale, and then to develop a comprehensive, ice-sheet-wide assessment, which will ultimately improve our understanding of the impact of climate change on polar ice loss and sea level rise. This will necessitate the use of Graphics Processing Units (GPUs) on High Performance Computing (HPC) clusters. As such, developing the code to work on this high-level computing architecture will be a key element of the project.
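
    As a hedged indication of the GPU-friendly core of such a pipeline, the sketch below computes a per-pixel elevation-change rate (dh/dt) via a vectorised linear fit through a stack of co-registered DEMs, using CuPy as a drop-in NumPy replacement where a GPU is available. The stack dimensions and synthetic thinning signal are illustrative assumptions, not the project’s actual code.

        try:
            import cupy as xp          # runs on the GPU if CuPy + CUDA are available
        except ImportError:
            import numpy as xp         # otherwise fall back to NumPy on the CPU

        n_t, ny, nx = 12, 512, 512                     # 12 DEM epochs, 512 x 512 pixels
        t = xp.linspace(0.0, 11.0, n_t)                # acquisition times in years
        dems = (100.0 - 0.5 * t[:, None, None]         # synthetic thinning at 0.5 m/yr
                + 0.1 * xp.random.standard_normal((n_t, ny, nx)))

        # Closed-form least-squares slope per pixel, fully vectorised over the stack:
        # slope = cov(t, h) / var(t), computed along the time axis.
        t_mean = t.mean()
        h_mean = dems.mean(axis=0)
        cov_th = ((t - t_mean)[:, None, None] * (dems - h_mean)).sum(axis=0)
        var_t = ((t - t_mean) ** 2).sum()
        dhdt = cov_th / var_t                          # metres per year, per pixel

        print(float(dhdt.mean()))                      # ~ -0.5 m/yr for this synthetic stack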

    Within the first year of the PhD, the successful candidate will have the opportunity to explore two teaser projects, one of which will then be taken forward into subsequent years.

    Teaser Project 1: Towards pan-Antarctic, high-resolution monitoring of ice loss

    This teaser project will work to translate the current proof-of-concept DEM pipeline into a system which can be run efficiently at scale, and then to test its use at a number of key Antarctic study sites. Specifically, state-of-the-art satellite altimetry will be combined with high-resolution DEMs from the Reference Elevation Model of Antarctica (REMA) project, to generate estimates of glacier elevation change covering the period 2010-present. Study sites will be selected to cover those of high scientific interest (e.g. Pine Island Glacier, Totten Glacier), and to explore performance within diverse glaciological settings (e.g. large ice streams, narrow outlet glaciers with nearby nunataks). A key component of this initial project will be to refactor the current code for GPU-enabled systems, alongside developing better approaches to memory management, data infrastructure, etc. If continued beyond Year 1, the ultimate ambition of this teaser project will be to exploit the full, pan-Antarctic archive of REMA strips, to (1) generate new ultra-high-resolution Antarctic mass balance estimates, and (2) improve process understanding of the physical drivers of current ice loss.

    Teaser Project 2: Multi-sensor integration for multi-decadal monitoring

    The second teaser project will again aim to establish efficient, scalable DEM-processing pipelines for resolving Antarctic Ice Sheet mass loss, but this time focusing on extending the observational record to generate time series spanning a quarter of a century (2000-2025). This necessitates adapting the proof-of-concept pipeline which forms the basis of Teaser Project 1, to add functionality to process data from the ASTER mission (2000-present) over Antarctica, thus enabling a long-term record to be derived. Initial work has already been performed to process ASTER data over Greenland using conventional (CPU) systems. Hence, the purpose of this teaser project will be (1) to adapt this existing code so that it runs over Antarctica, and (2) to refactor it for deployment on GPUs, with a view to future scale-up. If continued beyond Year 1, the ultimate ambition of this teaser project would be to (1) generate new long-term records of Antarctic glacier evolution, and (2) improve process understanding of the physical drivers of current ice loss.

    References & Further Reading

    Here are some tasters of our work and its impact: 

    https://www.newscientist.com/article/2490250-meltwater-bursts-through-greenland-ice-in-first-of-a-kind-eruption/ 

    https://www.weforum.org/agenda/2019/05/antarctica-s-ice-is-melting-5-times-faster-than-in-the-90s/ 

    https://www.bbc.co.uk/news/science-environment-47461199 

    https://www.washingtonpost.com/news/energy-environment/wp/2016/07/19/greenland-lost-a-trillion-tons-of-ice-in-just-four-years/ 

    The candidate will also join the UK Centre for Polar Observation and Modelling, and the Centre of Excellence in Environmental Data Science: 

    https://cpom.org.uk/ 

    https://ceeds.ac.uk/ 


  • Computationally scalable data fusion for real-time water quantity and quality forecasting

    Project institution:
    Project supervisor(s):
    Dr Abdollah Jalilian (University of Glasgow), Dr Faiza Samreen (UKCEH), Dr Andrew Elliott (University of Glasgow), Prof Claire Miller (University of Glasgow), Prof Andrew Tyler (University of Stirling) and Prof Peter Hunter (University of Stirling)

    Overview and Background

    This project will develop machine learning and statistical methods for real-time forecasting via data fusion with uncertainty quantification for water catchments. It will use and develop advanced AI models (building on and expanding work such as Allen et al. (2025) and relevant foundation models) to fuse in situ sensor and satellite data (optical and/or radar) for hydrology and surface water quality in river catchments. The project will also assess the generalisability of the developed methods across multiple river catchments.

    Data demands can be very large, with, for example, data collected approximately every 15 minutes from multiple in situ sensors and satellite imagery at 10 m resolution. To obtain fast (real-time or near-real-time), reliable and computationally efficient (and therefore environmentally friendly) models, this project specifically targets the development of GPU programming for scalable analytics, and will consider the advantages of cloud GPUs and other related platforms, for example EDITO, to support transferability and impact.

    This project is a collaboration with domain expert colleagues at University of Stirling and Scottish Water and will link to NERC projects such as SenseH20 and MOT4Rivers and the Forth-ERA digital observatory of the Forth Catchment. 

    Methodology and Objectives

    The AI-based models will be developed using the latest generation of deep learning approaches (such as transformer models, physics-informed losses, etc.). As detailed in the background, this project would leverage existing model frameworks (like Aardvark) and foundation models that can be specialised, and would construct domain-specific pipelines as appropriate. The fusion methodologies will be tested on a number of downstream tasks, most notably predicting values of water quantity/quality far from the sensor locations (validated by cross-validation) and forecasting. While the frameworks generated will be tested on the dataset in question, the assumption is that the models can be transferred to other localities, and testing the portability of these approaches will be part of the later stages of this project.

    Teaser Project 1: 

    The first teaser project will focus on developing an initial methodology for data fusion, based on taking an off-the-shelf deep learning approach and applying it to the dataset in question to explore its performance. This will be compared with standard statistical approaches (e.g. Kriging and hierarchical Bayesian spatiotemporal models) to understand the relative advantages and disadvantages of each approach, in terms of both prediction accuracy and computational time; a minimal example of such a baseline is sketched below. Based on these initial findings and the limitations of the approach, we will consider additional changes to the architecture, including software optimisations (such as GPU programming), or indeed the development of a different approach, which will naturally extend into a full PhD project should the student decide to pursue this.
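
    As a minimal sketch of the kind of statistical baseline referred to above, the following uses a Gaussian process (Kriging-style) model, via scikit-learn, to predict a water-quality variable, with uncertainty, at an unmonitored location from a handful of in situ sensors. The sensor layout and synthetic field are illustrative assumptions.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(42)
        sensors = rng.uniform(0, 10, size=(30, 2))            # 30 in situ sensor locations
        values = np.sin(sensors[:, 0]) + 0.1 * sensors[:, 1] + rng.normal(0, 0.05, 30)

        # RBF kernel = smooth spatial correlation; WhiteKernel = sensor noise (nugget).
        kernel = 1.0 * RBF(length_scale=2.0) + WhiteKernel(noise_level=0.01)
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        gp.fit(sensors, values)

        # Predict (with uncertainty) at an unmonitored location, as in the
        # "far from the sensor locations" task described above.
        query = np.array([[5.0, 5.0]])
        mean, std = gp.predict(query, return_std=True)
        print(f"prediction {mean[0]:.2f} +/- {std[0]:.2f}")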

    Teaser Project 2:  

    The second teaser project is strongly rooted in uncertainty quantification, i.e. understanding how certain we should be about the model’s predictions. AI-based approaches, while regularly delivering high accuracy, often lack the strong probabilistic frameworks needed for good uncertainty quantification. To address this, we will employ a mixture of approaches, from Monte Carlo-based and variational inference methods that leverage the fast inference time of AI models, to emulation-based approaches; one such Monte Carlo-style route is sketched below. Given the data fusion-based pipelines we will be developing, this will require understanding both the uncertainty induced by the model and the uncertainty in the observations themselves. Computationally this is quite intensive, and therefore part of this teaser project will be understanding this complexity and optimising it, both through computational means (e.g. GPU coding, HPC) and through statistical techniques that use computational resources most efficiently (and limit their environmental impact).
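
    A minimal sketch of the Monte Carlo-style route mentioned above: a bootstrap ensemble of small neural networks whose spread provides a predictive uncertainty band. The data, network size and ensemble size are illustrative assumptions; the project would apply analogous ideas to far larger fusion models.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(300, 1))
        y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 300)

        ensemble = []
        for seed in range(10):                      # 10 bootstrap-resampled members
            idx = rng.integers(0, len(X), len(X))   # resample training data with replacement
            m = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=seed)
            ensemble.append(m.fit(X[idx], y[idx]))

        X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
        preds = np.stack([m.predict(X_test) for m in ensemble])    # (members, points)
        mean, std = preds.mean(axis=0), preds.std(axis=0)          # spread = model uncertainty
        print(mean[:3], std[:3])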

    References & Further Reading

    Allen, A., Markou, S., Tebbutt, W., Requeima, J., Bruinsma, W. P., Andersson, T. R., … & Turner, R. E. (2025). End-to-end data-driven weather prediction. Nature, 641(8065), 1172-1179. 10.1038/s41586-025-08897-0 

    Andersson, T. R. et al. (2021) Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nature Communications, 12, 5124. (doi: 10.1038/s41467-021-25257-4) (PMID:34446701) (PMCID:PMC8390499) 

    Colombo, P., Miller, C., Yang, X., O’Donnell, R., & Maranzano, P. (2025). Warped multifidelity Gaussian processes for data fusion of skewed environmental data. Journal of the Royal Statistical Society Series C: Applied Statistics, 74(3), 844-865. 10.1093/jrsssc/qlaf003 

    Wilkie, C. J., Miller, C. A., Scott, E. M., O’Donnell, R. A., Hunter, P. D., Spyrakos, E., & Tyler, A. N. (2019). Nonparametric statistical downscaling for the fusion of data of different spatiotemporal support. Environmetrics, 30(3), e2549. 10.1002/env.2549 

    Forth Environmental Resilience Array (Forth-ERA) project and digital observatory, University of Stirling 

    MOT4Rivers: Monitoring, Modelling and Mitigating Pollution Impacts in a Changing World: Science and Tools for Tomorrow’s Rivers 

    SenseH2O: a scalable, integrated systems-based approach to monitoring water quality from headwaters to river outlets 

  • Detecting hotspots of water pollution in complex constrained domains and networks

    Project institution:
    Project supervisor(s):
    Dr Mu Niu (University of Glasgow), Dr Craig Wilkie (University of Glasgow), Prof Cathy Yi-Hsuan Chen (University of Glasgow) and Dr Michael Tso (UKCEH)

    Overview and Background

    Technological developments with smart sensors are changing the way that the environment is monitored. Many such smart systems are under development, with small, energy efficient, mobile sensors being trialled. Such systems offer opportunities to change how we monitor the environment, but this requires additional statistical development in the optimisation of the location of the sensors. 

    The aim of this project is to develop a mathematical and computational inferential framework to identify the best locations to deploy sensors in a complex constrained domain or network, to enable improved detection of water contamination. The proposed methods can also be applied to regression, classification and optimisation problems on a latent manifold embedded in a higher-dimensional space.

    Figure 1. Examples of complex constrained domains: chlorophyll concentrations in the Aral Sea (Wood et al., 2008).

    Methodology and Objectives

    The idea of using on-site sensors to detect water contaminants has a rich history. Since water flows at finite speed, placing sensors strategically reduces the time until detection. The mathematical analysis is often made difficult by the need to model the nonlinear dynamical systems of hydraulics within a non-Euclidean space, such as a constrained domain (a lake or river; Wood et al., 2008) or a network (a pipe network; Oluwaseye et al., 2018). It requires solving large nonlinear systems of differential equations in the complex domain and is difficult to apply to even moderate-sized problems.

    This proposed PhD project aims to develop new methods to improve environmental sampling, enabling improved estimation of water pollution and associated uncertainty that appropriately accounts for the geometry and topology of the water body. 

    Methods Used:

    Intrinsic Bayesian Optimisation (BO) on complex constrained domains and networks allows the predictions and uncertainty quantification of intrinsic Gaussian processes (GPs) (Niu et al., 2019, 2023) to direct the search for water pollution. Once a new detection is observed, the search for a hotspot can be sequentially updated.

    A key ingredient of BO is the Gaussian process (GP) prior, which captures beliefs about the behaviour of the unknown black-box function on the complex domain. The student will develop intrinsic BO on non-Euclidean spaces, such as complex constrained domains and networks, using state-of-the-art GPs on manifolds and GPs on graphs. Extending the idea of estimating covariance functions on manifolds, the project aims to estimate the heat kernel of the point cloud, allowing the incorporation of the intrinsic geometry of the data and a potentially complex interior structure.
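
    A minimal sketch of the graph analogue of this idea, under illustrative assumptions: on a network, the heat kernel exp(-tL) of the graph Laplacian L is a valid GP covariance that respects the network topology, and the resulting posterior could drive the acquisition step of BO. The six-node path graph below stands in for a river or sewage network.

        import numpy as np
        from scipy.linalg import expm

        # Adjacency of a 6-node path graph: 0-1-2-3-4-5
        A = np.zeros((6, 6))
        for i in range(5):
            A[i, i + 1] = A[i + 1, i] = 1.0
        L = np.diag(A.sum(axis=1)) - A            # graph Laplacian

        t = 0.5                                   # diffusion time (plays the length-scale role)
        K = expm(-t * L)                          # heat kernel = GP covariance matrix

        # GP posterior mean at all nodes given noisy pollution readings at nodes 0 and 3.
        obs_idx, y = np.array([0, 3]), np.array([1.2, 0.4])
        noise = 1e-2
        K_oo = K[np.ix_(obs_idx, obs_idx)] + noise * np.eye(2)
        K_ao = K[:, obs_idx]
        posterior_mean = K_ao @ np.linalg.solve(K_oo, y)
        print(posterior_mean)                     # smooth along the network, not Euclidean space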

    The application areas are water quality in lakes with complex domains (such as the Aral Sea) and pollution sources in a city’s sewage network. The methods would have the potential to inform about emergent water pollution events like algal blooms, providing an early warning system, and help to identify pollution sources. 

    Teaser Project 1 Objectives: In the first teaser project, the student will apply intrinsic GPs to water quality data, seeking to understand the complex patterns of water quality in non-Euclidean spaces (both continuous domains with complex boundaries and network domains). The student will apply existing methods to small-scale datasets, getting a feel for the methodology used in this area. This work could evolve into a PhD with a focus on developing computationally demanding methods for modelling water quality and detecting hotspots over complex domains. Parallelisation over GPUs would enable modelling across large areas, with the high data volumes typical of high spatial resolution water quality data.

    Teaser Project 2 Objectives: In the second teaser project, the student will expand their work to the spatio-temporal (or manifold-temporal) setting, incorporating both complex spatial and temporal structures to fully explain the changing nature of the water quality patterns. Again, this teaser project will involve applying existing methods to small-scale datasets. Due to the high computational complexity of spatio-temporal models, this project has the potential to evolve into a PhD with a focus on developing highly computationally efficient methods, with a focus on parallelisation on GPUs.

    The student will benefit from the extensive expertise of the supervisory team. Dr Niu specialises in statistical inference on non-Euclidean spaces, with applications in ecology and environmental science. Dr Wilkie has a background in developing spatiotemporal data fusion approaches for environmental data, focussing on satellite and in-lake water quality data. Prof Chen specialises in network modelling, statistical inference, data science, machine learning and economics. Dr Tso is an environmental data scientist with a strong computational background and a portfolio of work on water quality monitoring, including adaptive sampling.

    References & Further Reading

    1. Niu, et al. (2019). “Intrinsic Gaussian processes on complex constrained domain”, J. Roy. Statist. Soc. Series B, 81(3).
    2. Niu, et al. (2023). “Intrinsic Gaussian processes on unknown manifold with probabilistic geometry”, Journal of Machine Learning Research, 24(104).
    3. Oluwaseye, et al. (2018). “A state-of-the-art review of an optimal sensor placement for contaminant warning system in a water distribution network”, Urban Water Journal, 15(10), 985–1000.
    4. Giudicianni, et al. (2020). “Topological placement of quality sensors in water-distribution networks without the recourse to hydraulic modeling”, Journal of Water Resources Planning and Management, 146(6).
    5. Wood, S. N., Bravington, M. V. and Hedley, S. L. (2008). “Soap film smoothing”, J. Roy. Statist. Soc. Series B, 70, 931–955.

  • Firn Futures: Examining Antarctic Ice Shelf Stability with GPU-Accelerated Firn Modelling

    Project institution:
    Project supervisor(s):
    Dr Amber Leeson (Lancaster University), Dr Matt Speers (Lancaster University), Dr Katie Miles (Lancaster University) and Dr Vincent Verjans (Barcelona Supercomputing Centre)

    Overview and Background

    Antarctic ice shelves regulate the discharge of grounded ice into the ocean, and their collapse can accelerate sea level rise (Berthier et al., 2012). The Larsen C Ice Shelf (LCIS) in particular is thought to be vulnerable to surface melt and hydrofracture, processes which may lead to eventual collapse and which are controlled by the firn layer’s ability to store and refreeze meltwater. Current firn models simplify these processes to reduce computational demand (e.g. Verjans et al., 2019), but this introduces uncertainty and risks misrepresenting thresholds for saturation and collapse. This project will exploit GPU acceleration to test and improve meltwater physics in the Community Firn Model (CFM, Stevens et al., 2020) against field and satellite observations and use the developed model to simulate the evolution of the firn layer on the LCIS under predicted climate warming. By resolving these processes at scale, the project will deliver new predictions of LCIS stability under future climate forcing. 

    Methodology and Objectives

    This PhD project integrates novel field observations (Hubbard et al., 2016), satellite data, and firn modelling to address a key uncertainty in Antarctic climate science: when will the firn layer of the Larsen C Ice Shelf (LCIS) saturate, and what are the consequences for its stability? The programme begins with two exploratory projects of approximately six months each, both centred on the Community Firn Model (CFM). These projects will provide complementary experience in model physics and high-performance computation. Following this training year, the student will select one pathway to pursue in depth from Year 2 onwards, ultimately delivering a substantive contribution to predicting ice-shelf vulnerability. 

    Teaser Project 1: Advancing meltwater physics in the Community Firn Model 
    Firn modelling in high-melt environments is fundamentally limited by how liquid water processes are represented. Meltwater percolation, refreezing, and the possible development of firn aquifers are central to whether surface melt is buffered or contributes directly to destabilisation. Current CFM schemes necessarily simplify these processes, often assuming one-dimensional percolation and instantaneous refreezing, which can underestimate storage depth and persistence.

    This project will benchmark the CFM’s existing meltwater parameterisations against borehole-derived density profiles and refrozen ice layers sampled during field campaigns on LCIS, together with satellite observations of meltwater ponding (e.g. Corr et al., 2022) and surface elevation change. The student will then develop an inventory of physically more complete options, such as multi-phase flow and coupled thermodynamics, which could be implemented within the CFM framework, using GPU acceleration to reduce the computational penalty of these enhancements and enable higher spatial and temporal resolution.
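
    For orientation, a minimal sketch of the kind of simplified ‘bucket’ scheme that such benchmarking would interrogate is given below: meltwater is routed down a one-dimensional firn column and refreezes instantly wherever cold content and pore space allow. All layer properties and constants are illustrative assumptions, not CFM code.

        import numpy as np

        RHO_ICE = 917.0              # density of ice (kg m-3)
        LF, CI = 3.34e5, 2108.0      # latent heat of fusion (J kg-1), heat capacity of ice (J kg-1 K-1)

        def bucket_percolation(rho, temp, dz, melt):
            """Route surface melt (kg m-2) down the layers; refreeze instantly."""
            for k in range(len(rho)):
                if melt <= 0:
                    break
                # Refreezing is limited by cold content and by available pore space.
                cold_limit = CI * rho[k] * dz[k] * max(0.0, -temp[k]) / LF
                pore_limit = (RHO_ICE - rho[k]) * dz[k]
                refreeze = min(melt, cold_limit, pore_limit)
                rho[k] += refreeze / dz[k]                          # densify the layer
                temp[k] += refreeze * LF / (CI * rho[k] * dz[k])    # latent heat warms it
                melt -= refreeze
            return rho, temp, melt               # leftover melt = runoff or deeper storage

        rho = np.full(10, 500.0)     # layer densities (kg m-3)
        temp = np.full(10, -8.0)     # layer temperatures (deg C)
        dz = np.full(10, 0.5)        # layer thicknesses (m)
        print(bucket_percolation(rho, temp, dz, melt=20.0))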

    Teaser Project 2: GPU-accelerated simulations of firn evolution under extreme melt 
    The second project develops expertise in scalable Earth system modelling by implementing the CFM on GPU-enabled high-performance clusters. Starting from borehole-constrained initial conditions, the model will be forced with outputs from regional climate models covering the past two decades. Particular emphasis will be placed on extreme melt events associated with atmospheric rivers, which have played a central role in past ice-shelf collapse (Wille et al., 2022).

    The student will design sensitivity experiments to explore firn response to a range of forcing scenarios, including anomalously warm summers, rainfall events, and multi-year accumulation variability. By exploiting GPU acceleration, simulations will be scaled to high spatial resolution across LCIS, and large ensembles will be run to explore uncertainty. These experiments will provide new insight into the thresholds governing firn saturation and the role of rare but intense events in destabilising ice shelves. 

    Beyond Year 1: 
    At the end of Year 1, the student will select a pathway for further development. If Project 1 is chosen, the PhD will concentrate on improving the physics of meltwater flow in the CFM and integrating field and satellite observations to constrain model predictions. If Project 2 is chosen, the emphasis will be on producing GPU-accelerated projections of firn evolution and collapse risk to 2100 under multiple climate scenarios and will include an assessment of uncertainty in predictions.

    Long-term objectives 
    Whichever pathway is selected, the overarching aims of the PhD are to:

    • Advance the representation of firn processes in high-melt Antarctic environments. 
    • Develop innovative methods for combining borehole and satellite data with physically based modelling. 
    • Exploit exascale computing to enable continent-scale, ensemble firn simulations with improved physics. 
    • Provide new projections of Antarctic ice shelf vulnerability, ultimately contributing to sea-level rise assessments.

    References & Further Reading

    Berthier, E., Scambos, T. A., & Shuman, C. A. (2012). Mass loss of Larsen B tributary glaciers (Antarctic Peninsula) unabated since 2002. Geophysical Research Letters, 39, 6.

    Corr, D., Leeson, A., McMillan, M., Zhang, C., and Barnes, T.: An inventory of supraglacial lakes and channels across the West Antarctic Ice Sheet, Earth Syst. Sci. Data, 14, 209–228, https://doi.org/10.5194/essd-14-209-2022, 2022. 

    Wille, J.D., Favier, V., Jourdain, N.C. et al. (2022) Intense atmospheric rivers can weaken ice shelf stability at the Antarctic Peninsula. Commun Earth Environ 3, 90. https://doi.org/10.1038/s43247-022-00422-9 

    Hubbard, B., Luckman, A., Ashmore, D. et al. Massive subsurface ice formed by refreezing of ice-shelf melt ponds. Nat Commun 7, 11897 (2016). https://doi.org/10.1038/ncomms11897 

    Stevens, C. M., Verjans, V., Lundin, J. M. D., Kahle, E. C., Horlings, A. N., Horlings, B. I., and Waddington, E. D.: The Community Firn Model (CFM) v1.0, Geosci. Model Dev., 13, 4355–4377, https://doi.org/10.5194/gmd-13-4355-2020, 2020. 

    Verjans, V., Leeson, A. A., Stevens, C. M., MacFerrin, M., Noël, B., and van den Broeke, M. R.: Development of physically based liquid water schemes for Greenland firn-densification models, The Cryosphere, 13, 1819–1842, https://doi.org/10.5194/tc-13-1819-2019, 2019. 

  • Forests in the Exascale Era: High-resolution Modelling of Global Biomass Drivers, Loss and Recovery

    Project institution:
    Project supervisor(s):
    Dr Wenxin Zhang (University of Glasgow), Prof Peter Atkinson (Lancaster University), Dr Dave McKay (University of Edinburgh), Dr Vasilis Myrgiotis (UKCEH) and Dr Emma Robinson (UKCEH)

    Overview and Background

    Forest carbon sinks and sources are central to estimating the global carbon budget, but quantifying their biomass dynamics, especially how losses and recoveries are associated with natural and anthropogenic drivers, is a major challenge. The recent dataset of global drivers of forest loss at 1 km resolution (2001–2022) classifies loss into agriculture, logging, wildfire, infrastructure, and natural disturbances (with ~90.5% accuracy), providing a new, spatially explicit basis for attribution studies. However, pre-2001 biomass dynamics and drivers remain largely unexplored. By combining the drivers dataset with exascale-enabled simulations using JULES and Earth observation products (e.g. L-VOD, NDVI), this project aims to extend driver reconstruction back to 1991, validate biomass changes, and simulate fine-resolution biomass dynamics under forest loss and recovery globally.

    Methodology and Objectives

    This PhD proposal combines exascale computation, GPU acceleration, land surface modelling, and satellite Earth observation (EO) datasets to quantify the impacts of forest loss and recovery on above- and below-ground biomass (AGB and BGB) over the longest possible satellite-informed time series. The central modelling framework is the Joint UK Land Environment Simulator (JULES), advanced into an exascale-ready version (ExaJULES) to resolve biomass processes globally at 1 km resolution. ExaJULES will simulate carbon allocation, disturbance, and regrowth processes across AGB and BGB pools. 

    Satellite datasets provide the foundation for both driver attribution and biomass validation. The Global Drivers of Forest Loss dataset (2001–2022, Sims et al. 2024) classifies disturbances—including agriculture, logging, wildfire, infrastructure, and natural causes—at 1 km resolution. This will be combined with L-band Vegetation Optical Depth (L-VOD) for global AGB trajectories. Additional inputs such as Landsat and Sentinel-2 NDVI, MODIS and GFED fire records, FAO and ESA-CCI cropland expansion data, and ERA5-Land reanalysis will allow back-extrapolation of disturbance drivers to 1991. Handling these datasets is non-trivial: together they are estimated to require hundreds of terabytes of storage, and pre-processing steps (temporal aggregation, harmonisation of spatial resolutions, multiband image processing, and data cube construction) will at least double storage needs and incur substantial compute costs. 
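
    As a small, hedged illustration of two of these pre-processing steps (temporal aggregation and spatial harmonisation), the sketch below uses xarray on a synthetic NDVI cube; the cube dimensions and the ten-cell coarsening factor are illustrative assumptions.

        import numpy as np
        import pandas as pd
        import xarray as xr

        # A synthetic daily NDVI cube on a fine 100 x 100 grid for one year.
        time = pd.date_range("2001-01-01", periods=365, freq="D")
        ndvi = xr.DataArray(
            np.random.default_rng(0).uniform(0.2, 0.9, size=(365, 100, 100)),
            dims=("time", "y", "x"),
            coords={"time": time},
            name="ndvi",
        )

        monthly = ndvi.resample(time="1MS").mean()                    # temporal aggregation
        coarse = monthly.coarsen(y=10, x=10, boundary="trim").mean()  # e.g. 100 m -> 1 km cells
        print(coarse.shape)                                           # (12, 10, 10)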

    Computationally, the project exploits GPU-enabled exascale architectures to achieve kilometre-scale simulations with ExaJULES. Validation will involve cross-comparing satellite observations (L-VOD, GEDI canopy structure) with modelled biomass trajectories and, where possible, ground-based forest inventory data. Statistical attribution techniques (e.g., random forest classification, causal inference methods) will be used to disentangle the relative contributions of natural and anthropogenic drivers to observed biomass changes. 

    These approaches provide the foundation for two complementary teaser projects, each expandable into a full PhD pathway. One focuses on reconstructing disturbance drivers and validating biomass change, while the other advances exascale simulation of coupled AGB–BGB dynamics. 

    Teaser Project 1. Reconstructing Global Forest Loss Drivers and Validating Biomass Change 

    This project will extend knowledge of disturbance drivers beyond the satellite record by reconstructing global driver datasets from 1991–present. By integrating fire, agriculture, climate extremes, and governance proxies with the Sims et al. (2024) dataset (2001–2022), and using modern data pipeline techniques, it will deliver a continuous three-decade record of forest loss drivers. 

    The second objective is to validate AGB change. L-VOD (2002–present) and GEDI canopy structure will provide independent benchmarks for biomass loss and recovery. JULES simulations will be forced with observed disturbance regimes, and performance evaluated against EO-based benchmarks. 

    Key scientific questions include: How can driver attribution disentangle natural versus anthropogenic causes across continents? How does biomass allocation vary across regions dominated by different drivers (e.g., Amazon, Southeast Asia, boreal forests)? Can reconstructed driver datasets improve JULES/ExaJULES simulations of biomass dynamics? How do uncertainties in EO-based biomass propagate into carbon budget assessments? 

    This project provides a strong foundation for a PhD centred on historical reconstruction, EO–model fusion, and disturbance attribution. 

    Teaser Project 2. ExaJULES at 1 km – Simulating Above- and Below-ground Biomass with Exascale Computing 

    The second project develops and applies ExaJULES, a GPU-enabled exascale version of JULES, to simulate global biomass stocks and fluxes at 1 km resolution. The objective is to resolve how forest disturbance and recovery propagate from canopy to root-zone carbon pools. 

    Workflows will be developed for kilometre-scale global simulations, with opportunities to contribute to ExaJULES benchmarking and code development. Outputs will be benchmarked against L-VOD and ground inventory datasets. Particular emphasis will be placed on quantifying root–shoot allocation shifts under disturbance and recovery, which remain poorly represented in current models. 

    Key scientific questions include: How resilient is AGB–BGB coupling under diverse disturbance regimes (fire, drought, pests, land-use change)? What are the regional and global implications of biomass dynamics for the carbon cycle? How can new satellite missions (e.g., NISAR, BIOMASS) refine BGB representation? How does kilometre-scale modelling alter global carbon budget projections compared with coarse-scale runs? 

    This project naturally scales into a PhD centred on exascale modelling, GPU acceleration, and root–shoot dynamics, generating novel insights into biomass resilience and carbon–climate feedbacks. 

    References & Further Reading

    1. Curtis, P. G., Slay, C. M., Harris, N. L., Tyukavina, A., & Hansen, M. C. (2018). Classifying drivers of global forest loss. Science, 361(6407), 1108-1111. 
    2. Sims, N.C. et al. (2024). Global drivers of forest loss at 1 km resolution, 2001–2022. Environmental Research Letters. DOI: 10.1088/1748-9326/add606 
    3. Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., … & Townshend, J. R. (2013). High-resolution global maps of 21st-century forest cover change. science, 342(6160), 850-853. 
    4. Chen, Y., Feng, X., Fu, B., Ma, H., Zohner, C. M., Crowther, T. W., … & Wei, F. (2023). Maps with 1 km resolution reveal increases in above-and belowground forest biomass carbon pools in China over the past 20 years. Earth System Science Data, 15(2), 897-910. 
    5. Mo, L., Zohner, C. M., Reich, P. B., Liang, J., De Miguel, S., Nabuurs, G. J., … & Ortiz-Malavasi, E. (2023). Integrated global assessment of the natural forest carbon potential. Nature, 624(7990), 92-101. 
    6. ExaJULES model, https://excalibur.ac.uk/projects/exajules/ 
    7. Global Forest Watch: https://www.globalforestwatch.org. 
    8. Zhang, Y., Ling, F., Wang, X., Foody, G.M., Boyd, D.S., Li, X., Du, Y. and Atkinson, P.M. (2021). Tracking small-scale tropical forest disturbances: fusing the Landsat and Sentinel-2 data record. Remote Sensing of Environment, 261, 112470.

  • Mechanisms for and predictions of occurrence of ocean rogue waves

    Project institution:
    Project supervisor(s):
    Dr Suzana Ilic (Lancaster University), Prof Aneta Stefanovska (Lancaster University), Mr Michael Thomas (Reliable Insights) and Dr Bryan Michael Williams (Lancaster University)

    Overview and Background

    Rogue waves, exceptionally high ocean waves whose height exceeds twice the significant wave height, are rare, short-lived events that pose serious risks to shipping, fishing, and maritime infrastructure, including offshore platforms and wind turbines. Understanding their formation and improving their prediction are essential for safe marine operations.

    Despite advances in theoretical and experimental studies, the physical mechanisms driving rogue wave formation in real seas remain poorly understood, making prediction challenging. This PhD project aims to address these gaps by analysing extensive field data, developing advanced non-linear dynamics techniques, and utilising high-performance computing. The aim is to improve understanding of rogue wave dynamics and enhance forecasting capabilities, thereby contributing to the safety and resilience of marine operations.
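
    For concreteness, the sketch below applies the rogue-wave criterion stated above (individual wave height exceeding twice the significant wave height) to a surface-elevation record using zero-upcrossing wave heights. The synthetic sea state is an illustrative assumption.

        import numpy as np

        rng = np.random.default_rng(7)
        t = np.linspace(0, 3600, 36000)                       # 1 h of record at 10 Hz
        eta = sum(a * np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
                  for a, f in [(0.5, 0.08), (0.3, 0.11), (0.2, 0.15)])
        eta += 0.1 * rng.standard_normal(t.size)              # surface elevation (m)

        # Zero-upcrossing analysis: split the record into individual waves.
        up = np.where((eta[:-1] < 0) & (eta[1:] >= 0))[0]
        heights = np.array([eta[a:b].max() - eta[a:b].min()
                            for a, b in zip(up[:-1], up[1:])])

        hs = 4.0 * eta.std()                                  # spectral estimate of Hs
        rogues = heights[heights > 2.0 * hs]                  # the rogue-wave criterion
        print(f"Hs = {hs:.2f} m, waves = {len(heights)}, rogue candidates = {len(rogues)}")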


    Methodology and Objectives

    The PhD project will address the following questions: How can data processing and modelling be accelerated to enable non-linear analysis and modelling with higher spatial and temporal resolution? Which of the sea state parameters predicted by existing operational wave models are useful for detecting the formation of rogue waves? How do the formation and predictability of rogue waves depend on physical conditions?  

    Teaser 1: 

    This data-intensive project aims to accelerate novel time-localised analysis methods to investigate physical mechanisms underlying rogue waves and predict their occurrence.  

    O1: Exploit GPU-accelerated computing to parallelise algorithms for time-localised phase coherence and couplings between waves recorded at many spatial points, enabling scaling to higher-resolution and near-real-time analysis.

    O2: Isolate the mechanisms leading to the formation of rogue waves using algorithms developed in O1. 

    O3: Develop in-situ feature detection for automated analyses exploiting GPUs, and assess the relationship between the occurrence of rogue waves and their characteristics from time series measured under different physical conditions.

    O4: Develop a time-series-based prediction modelling approach, using the relationships identified in O2-3 and assess its ability to predict the occurrence of rogue waves. 

    Methods: 

    The numerical modelling and algorithms for time-series analysis will exploit GPU-accelerated computing; exascale will then allow near real-time practical applications. The Multiscale Oscillatory Dynamics Analysis (MODA) toolbox for non-linear and time-localised phenomena in time series (e.g. phase coherence, coupling and wave energy exchange [3,4]) will be parallelised and used to identify rogue wave mechanisms. Automated pattern analysis and feature engineering will be applied using Tangent to detect anomalous sea surface elevations and enable an easily deployable, computationally light forecasting solution using processed data. The methods will first be applied to laboratory data (e.g. [1]) and then to publicly available field measurements (e.g. the Free Ocean Wave Dataset, with more than 1.4 billion wave measurements). The newly developed prediction modelling approach will be systematically validated against measured data.
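
    As a hedged, simplified analogue of the time-localised phase coherence computed by MODA (which is wavelet-based), the sketch below estimates windowed phase coherence between two wave records from the phases of band-passed analytic signals. The signals, frequency band and window length are illustrative assumptions.

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        fs = 10.0                                       # sampling rate (Hz)
        t = np.arange(0, 600, 1 / fs)
        rng = np.random.default_rng(3)
        common = np.sin(2 * np.pi * 0.1 * t)            # shared 0.1 Hz wave component
        x = common + 0.5 * rng.standard_normal(t.size)  # records at two spatial points
        y = np.roll(common, 20) + 0.5 * rng.standard_normal(t.size)

        b, a = butter(4, [0.05, 0.2], btype="band", fs=fs)   # isolate the wave band
        phi_x = np.angle(hilbert(filtfilt(b, a, x)))         # instantaneous phases
        phi_y = np.angle(hilbert(filtfilt(b, a, y)))

        # Time-localised phase coherence in sliding windows: |mean unit phasor of the
        # phase difference|; 1 = perfectly locked, 0 = no phase relationship.
        win = int(60 * fs)
        dphi = np.exp(1j * (phi_x - phi_y))
        coherence = np.abs([dphi[i:i + win].mean()
                            for i in range(0, dphi.size - win, win)])
        print(coherence.round(2))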

    Teaser 2: 

    This is a data-intensive project focused on the computational optimisation of time series analyses for dynamic systems and the relationship between rogue wave properties and environmental conditions. 

    O1: Assess the current performance of the numerical tools included in MODA and Tangent in terms of their relevance for detecting the mechanisms of rogue waves and their computational efficiency. 

    O2: Optimise the algorithms of the tools identified in O1 on multiple GPUs to improve computation time and experimental throughput, enabling large-scale ensemble time-series analyses.

    O3: Develop and apply a GPU version of MODA to field measured data to isolate mechanisms that lead to the formation of rogue waves.   

    O4: Assess the relationship between the occurrence of rogue waves and concurrent ocean and atmospheric data.  

    Methods: 

    The Multiscale Oscillatory Dynamics Analysis (MODA) toolbox offers several high-order methods for time-series analysis, some based on wavelets. The high computational demands of uncertainty evaluation methods limit their use for operational purposes. Optimised algorithms, GPU acceleration and exascale facilities will enable higher resolution and practical applications. MODA will identify the mechanisms underlying rogue wave formation using field-measured time series of surface elevations (e.g. the Free Ocean Wave Dataset). Concurrent environmental data (e.g. surface ocean currents, wind and atmospheric pressure) will be collated either from field measurements or from the operational forecast models provided by meteorological offices. The correlation between the occurrence of rogue waves and environmental parameters, as well as ‘causal’ relationships between the identified mechanisms and the environmental conditions, will be investigated using the Tangent modelling engine, which can be incorporated into predictions in the future.


    References & Further Reading

    1. Luxmoore, J.F., Ilic, S. and Mori, N., 2019. On kurtosis and extreme waves in crossing directional seas: a laboratory experiment. Journal of Fluid Mechanics, 876, pp.792-817. 
    2. Mori N., Waseda, T., Chabchoub A.(eds.) (2023) Science and Engineering of Freak Waves, Elsevier (https://doi.org/10.1016/C2021-0-01205-0). 
    3. Newman, J., Pidde, A. and Stefanovska, A., 2021. Defining the wavelet bispectrum. Applied and Computational Harmonic Analysis, 51, pp.171-224. 
    4. Stankovski, T., Pereira, T., McClintock, P.V. and Stefanovska, A., 2017. Coupling functions: universal insights into dynamical interaction mechanisms. Reviews of Modern Physics, 89(4), p.045001. 
    5. Yang X., Rahmani H., Black S., Williams B. M. Weakly supervised co-training with swapping assignments for semantic segmentation. In European Conference on Computer Vision 2025 (pp. 459-478). Springer, Cham. 
    6. Jiang Z., Rahmani H., Black S., Williams B. M. A probabilistic attention model with occlusion-aware texture regression for 3D hand reconstruction from a single RGB image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 758-767). 
    7. Jiang Z., Rahmani H., Angelov P., Black S., Williams B. M. Graph-context attention networks for size-varied deep graph matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 2343-2352).

  • Mixed-precision multigrid for weather and climate applications

    Project institution:
    Project supervisor(s):
    Prof Michèle Weiland (University of Edinburgh), Dr Eike Mueller (University of Bath) and Dr Thomas Melvin (Met Office)

    Overview and Background

    Modern hardware (primarily GPUs) is evolving to make extensive use of floating-point precisions lower than 64-bit (historically the norm for simulation algorithms). This is partly because machine learning remains efficient at lower precision, but also because lower-precision computational units require less silicon area. Lower precision can deliver improved performance through better utilisation of vector units, coupled with lower demand on memory and network bandwidth. This project will investigate applying low-precision computation to weather and climate simulation codes, such as the Met Office’s new forecasting model LFRic. The focus will be on exploring performance gains in the multigrid solver through mixed-precision approaches.

    Methodology and Objectives

    Codes such as the Met Office’s next-generation numerical weather prediction application LFRic already use a geometric multigrid preconditioner in the linear solver (to improve convergence and reduce the number of communication calls between processing units) and mixed precision: the linear solver and transport scheme are routinely run as 32-bit ‘bubbles’ within the code, and it is planned that the model will run fully at 32-bit. However, such codes do not yet take advantage of the smallest precisions (e.g. 16-bit, 8-bit and 4-bit). Geometric multigrid utilises successively coarser meshes, allowing error modes to be selectively resolved by scale. The student will investigate approaches that combine mesh coarsening/refinement with precision coarsening/refinement. The student would be expected to devise approaches for this, initially with simplified model setups (e.g. the shallow-water equations), and evaluate them from both computational and accuracy perspectives before ultimately considering how this should be approached in LFRic.

    Methodology: The student will develop mixed- and low-precision multigrid implementations and perform a rigorous evaluation of both the numerical implications and the performance benefits, in particular focusing on exascale hardware architectures.

    Teaser Project 1: Numerical implications of mixed/low precision multigrid 

    Reducing the floating-point precision in computation has implications for numerical accuracy. However, for multigrid, with its coarsening and refining steps, a reduction in accuracy during the coarsening steps of the algorithm may be acceptable. The student will investigate, using a simple 2D problem, what impact lowering precision has on the overall numerical stability and quality of the computation, and how far the precision can be lowered before the numerics break down. The objectives of Teaser Project 1 are to evaluate how far numerical precision can be lowered safely in the context of multigrid (looking at both geometric and precision-based refinement) before there is a trade-off with the quality of the results.

    Teaser Project 2: Performance impact of mixed/low precision multigrid

    Modern GPUs can accelerate lower-precision floating-point operations in hardware, though the exact performance impact is hardware-specific and problem-dependent. In addition, the use of lower-precision floating-point numbers should reduce demands on the memory subsystem and improve the utilisation of vector units. The student will start by using iterative refinement in the current setup (a Richardson iteration preconditioned with a multigrid V-cycle), evaluate the performance improvements that can be achieved through the gradual lowering of floating-point precision, and quantify these improvements (e.g. runtime and memory bandwidth) as a fraction of the theoretical peak. The objectives of Teaser Project 2 are to define the upper and lower bounds of the performance improvements that can be expected from mixing levels of precision, from 64-bit down to 8- or even 4-bit.
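
    A minimal sketch of the iterative-refinement pattern described above: a cheap low-precision solve acts as the preconditioner inside a double-precision correction loop. Here a float32 LU factorisation stands in for the multigrid V-cycle, an illustrative simplification rather than the LFRic setup.

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve

        rng = np.random.default_rng(0)
        n = 200
        A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test system
        b = rng.standard_normal(n)

        lu32 = lu_factor(A.astype(np.float32))            # factorise once, in low precision

        x = np.zeros(n)
        for it in range(10):
            r = b - A @ x                                 # residual computed in float64
            if np.linalg.norm(r) < 1e-12 * np.linalg.norm(b):
                break
            dx = lu_solve(lu32, r.astype(np.float32))     # low-precision correction solve
            x += dx.astype(np.float64)                    # accumulate the solution in float64

        print(it, np.linalg.norm(b - A @ x))              # converges to ~float64 accuracy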

  • Modelling threatened biodiversity at national, continental and planetary scales

    Project institution:
    Project supervisor(s):
    Dr Vinny Davies (University of Glasgow), Dr Mark Bull (University of Edinburgh), Prof Richard Reeve (University of Glasgow), Prof Christina Cobbold (University of Glasgow) and Dr Neil Brummitt (Natural History Museum)

    Overview and Background

    Understanding the stability of ecosystems and how they are impacted by climate and land use change can allow us to identify sites where biodiversity loss will occur and help to direct policymakers in mitigation efforts. Our current digital twin of plant biodiversity – https://github.com/EcoJulia/EcoSISTEM.jl – provides functionality for simulating species through processes of competition, reproduction, dispersal and death, as well as environmental changes in climate and habitat, but it would benefit from enhancement in several areas. This project will target improving the speed of model runs and the feasible scale of simulations, enabling stochastic modelling to quantify uncertainty, scalable inference of missing parameters, and more complex models. It would do this through two approaches: on the one hand, porting the code to run on GPUs for higher computational efficiency; on the other, applying techniques such as mesh refinement and partitioning so that models run at high resolution only where required by the ecosystem complexity. 

    Methodology & Objectives

    ​Teaser Project 1 Objectives: GPU: Port core EcoSISTEM code to GPU 

    This project will analyse the core CPU routines in EcoSISTEM and port them to GPU. This will use packages from the JuliaGPU ecosystem, which provide a relatively easy user interface to the NVIDIA and AMD GPUs available on EPCC's HPC system, to which the student will have access. The main branch of the EcoSISTEM code is already efficiently parallelised for CPUs, and a preliminary assessment has suggested that the porting task should be feasible within a teaser project. This teaser project can be extended in a variety of ways to a full PhD (an illustrative sketch of the porting pattern follows the two options below): 

    On the one hand, once the GPU port speed-ups have been realised, major new components can be added to EcoSISTEM. For instance, the student could investigate uncertainty quantification and parameter inference techniques within the framework. 

    On the other hand, there is a more sophisticated development (dev) branch of EcoSISTEM that is not currently well optimised but allows greater flexibility in how interactions can occur between components of the model. Porting this to GPUs will be a significantly harder task, but will allow richer interactions between ecosystem components to be modelled more easily. 
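
    Although the port itself will use the JuliaGPU ecosystem, the underlying pattern can be illustrated in Python, where CuPy acts as a drop-in replacement for NumPy: a per-cell update is expressed as whole-array operations so the same code runs on CPU or GPU. The dispersal kernel below is a toy stand-in, not EcoSISTEM code.

    import numpy as np
    try:
        import cupy as xp      # GPU path, if CuPy and a CUDA device are available
    except ImportError:
        xp = np                # otherwise fall back to the CPU

    def dispersal_step(pop, rate=0.1):
        # One explicit step of nearest-neighbour dispersal on a 2D abundance grid
        # (absorbing boundaries: individuals dispersing off-grid are lost).
        out = (1.0 - rate) * pop
        out[1:, :]  += 0.25 * rate * pop[:-1, :]
        out[:-1, :] += 0.25 * rate * pop[1:, :]
        out[:, 1:]  += 0.25 * rate * pop[:, :-1]
        out[:, :-1] += 0.25 * rate * pop[:, 1:]
        return out

    pop = xp.zeros((1024, 1024))
    pop[512, 512] = 1000.0     # seed a population in one cell
    for _ in range(100):
        pop = dispersal_step(pop)
    print("total population remaining:", float(pop.sum()))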

    Teaser Project 2 Objectives: Scalability improvements: mesh refinement and partitioning 

    Different types of ecosystem require different scales of spatial resolution to model them adequately. For example, hedgerows and rivers are linear features that may be very species-dense compared to the surrounding terrain. Currently EcoSISTEM uses a uniform mesh size, which means the whole model has the same spatial resolution. To increase the fidelity of the model in a scalable way, the spatial resolution must be increased only where needed. This teaser project would begin by capturing the requirements for non-uniform meshing and prototyping an implementation, possibly in a simple proxy code.  

    With a non-uniform mesh, scaling to many thousands of processes requires load-balancing techniques, either static or dynamic, depending on the use case. The teaser project will investigate the suitability of existing mesh partitioners and adaptive meshing techniques, and how they can interface with the existing Julia code.  
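
    As a toy illustration of the non-uniform meshing idea (Python; a synthetic density map stands in for real species data), the quadtree-style routine below subdivides a cell only where local variation is high, so a linear feature such as a river ends up finely resolved while uniform terrain stays coarse.

    import numpy as np

    def refine(density, x0, y0, size, tol, min_size, cells):
        # Recursively split a square cell while its density range exceeds tol.
        block = density[y0:y0 + size, x0:x0 + size]
        if size <= min_size or block.max() - block.min() <= tol:
            cells.append((x0, y0, size))          # keep as a single coarse cell
            return
        half = size // 2
        for dx, dy in [(0, 0), (half, 0), (0, half), (half, half)]:
            refine(density, x0 + dx, y0 + dy, half, tol, min_size, cells)

    n = 256
    density = np.zeros((n, n))
    density[:, 100:103] = 1.0                     # a "river": species-dense linear feature
    cells = []
    refine(density, 0, 0, n, tol=0.1, min_size=4, cells=cells)
    print(len(cells), "cells instead of", (n // 4) ** 2, "uniform fine cells")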

    References & Further Reading

    1. Digital twins of the natural environment  
    2. Dynamic virtual ecosystems as a tool for detecting large-scale responses of biodiversity to environmental and land-use change 
    3. Effective extensible programming: Unleashing Julia on GPUs 
    4. Strong phylogenetic signals in global plant bioclimatic envelopes 
  • Multi-scale modelling of volcanoes and their deep magmatic roots: Constitutive model development using data-driven methods

    Project institution:
    Project supervisor(s):
    Dr Ankush Aggarwal (University of Glasgow), Dr Tobias Keller (University of Glasgow) and Prof Andrew McBride (University of Glasgow)

    Overview and Background

    This PhD studentship focuses on developing GPU-accelerated models of magmatic processes that underpin volcanic hazards and magmatic resource formation. These processes span sub-millimetre mineral-fluid-melt interactions up to kilometre-scale magma dynamics and crustal deformation. Magma is a multi-phase mixture of solids, silicate melts, and volatile-rich fluids, interacting in complex thermo-chemical-mechanical ways.  

    This is a standalone PhD project that is part of a larger framework of magmatic systems research by the wider team. The project will contribute one component of a hierarchical, multi-scale modelling framework using advanced GPU-based techniques. Specifically, in this project, the PhD student will develop constitutive relationships between stresses and strains/strain-rates of various phases at the magmatic system-scale based on granular-scale mechanical simulations (available through an existing collaboration). The result will enable accurate, large-scale simulations of magma dynamics that capture the complexity of micro-scale constituents and their interactions. 

    Your work will include software development, integrating and interpreting field and experimental data sets, attending regular seminars, collaborating within the wider research team, and receiving training through ExaGEO workshops. 

    Volcanic eruptions originate from shallow crustal magma reservoirs built up over long periods. As magma cools and crystallizes, it releases fluid phases—aqueous, briny, or containing carbonates, metal oxides, or sulfides—whose low viscosity and density contrasts drive fluid segregation. This fluid migration can trigger volcanic unrest or concentrate metals into economically valuable deposits. The distribution of fluids—discrete droplets versus interconnected drainage networks—crucially depends on crystal and melt properties. Direct observations are challenging, so high-resolution, GPU-accelerated simulations provide a way to understand these complex and dynamic systems. 

    Methodology and Objectives

    [Figures: a Gaussian-process-based simulation result; a neural-network-based constitutive modelling framework.]

    Modelling volcanic systems is challenging due to the multi-scale nature of their underlying physical and chemical processes. System-scale dynamics (100 m to 100 km) emerge from interactions involving crystals, melt films, and fluid droplets or channels on micro- to centimetre scales. To link these scales, this project uses a hierarchical approach: (i) direct numerical simulations of granular-scale phase interactions, (ii) deep learning-based computational homogenisation to extract effective constitutive relations, and (iii) system-scale mixture continuum models applying these relations to problems. All components leverage GPU-accelerated computing and deep learning to handle direct simulations at local scales, train effective constitutive models, and achieve sufficient resolution at the system scale. 

    In this project the candidate will extract effective constitutive relations by computationally homogenising the micro-scale mechanical simulations (available through an existing collaboration). The effective constitutive properties will then be used in the macro-scale models to accurately capture the multi-scale effects. The project will leverage recent advances in the use of neural networks [1,2] and Gaussian processes [3,4] for constitutive model development. A range of micro-scale simulation results have already been generated to produce the data covering the different deformation regimes. These results will be used to train a deep-learning-based constitutive model. Approaches based on neural networks and Gaussian processes will be explored and compared. The trained model will then be used in macro-scale simulations, and its results will be compared to those using the constitutive relations currently assumed in the literature. Lastly, the variability resulting from this homogenisation process will be quantified and its propagation into macro-scale simulations will be assessed to ensure confidence in the results. The focus of model applications will be the proposed regime transition from disconnected bubble migration to interconnected channelised seepage of fluids from crystallising magma bodies [5]. 

    Within this project, the student will start by working on two “teaser” sub-projects to gain familiarity with different techniques and data, then choose how to further develop and focus their research. 

    Teaser Project 1 Objectives: This sub-project, conducted over the first year, will focus on neural networks for constitutive modelling. Neural networks (NNs) are the most popular choice of deep learning model. Recent works have used NNs for constitutive model development, identification, and discovery [1,2]. Alongside their flexibility in modelling wide-ranging phenomena, NNs bring a large number of tunable parameters (weights), associated uncertainty, and a requirement for large training datasets. This teaser project will explore the use of NNs for constitutive model development based on simplified one- and two-phase micro-scale systems. This will include finding a suitable architecture, training hyperparameters, and the required training dataset. A GPU-based implementation will be developed to make the training of high-dimensional neural networks feasible. This teaser project will pave the path towards a neural-network-based approach for the overall project over the next three years, wherein the initial implementation will be extended to complex micro-scale simulations modelling four phases. Additionally, in the full project, the uncertainty related to neural networks will be quantified, and the required training data will be optimised. These additions will further increase the computational cost, thus necessitating a GPU-accelerated framework. 
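
    A minimal sketch of the idea follows (Python/PyTorch; a synthetic one-dimensional stress-strain curve stands in for the micro-scale simulation outputs): a small network is fitted as a constitutive relation mapping strain to stress, and the identical training loop moves to GPU unchanged when the data grow.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    strain = torch.linspace(0.0, 0.5, 200).unsqueeze(1)
    stress = 2.0 * strain + 5.0 * strain**3 + 0.01 * torch.randn_like(strain)  # toy "data"

    model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                          nn.Linear(32, 32), nn.Tanh(),
                          nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(strain), stress)
        loss.backward()
        opt.step()
    # The trained network is now a queryable effective constitutive law;
    # model.to("cuda") moves the same loop to GPU for larger datasets.
    print("final training loss:", float(loss))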

    Teaser Project 2 Objectives: This sub-project, conducted over the first year, will focus on Gaussian processes for constitutive modelling. Gaussian processes (GPs) are rigorous statistical tools that are an attractive alternative to neural networks [3]. The main advantage of GPs is that, in addition to the mean, they also capture the variation/confidence in the results, which can in turn inform which micro-scale simulations must be run to improve their accuracy. Recently, GPs have been used for constitutive model development for hyperelastic solids [4]. This teaser project will explore GPs for modelling the effective constitutive relationships of simplified one- and two-phase micro-scale systems, using the results to also select the required micro-scale simulations. Thermodynamic constraints on the constitutive model will be added by extending the GP framework [4], which will increase the training cost; thus, a GPU-based implementation will be required to make the computation feasible. If this approach is selected for the rest of the PhD, it will be extended to the fully complex micro-scale model over the next three years. Moreover, the GP approach will be leveraged to develop a robust framework for design of experiments, such that there is high confidence in the resulting constitutive properties. The design-of-experiments step brings an exponentially higher computational cost, thus necessitating a GPU-accelerated framework. 
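
    The contrast with the neural-network route can be sketched in the same toy setting (Python/scikit-learn, synthetic data): the GP's predictive standard deviation indicates where a new micro-scale simulation would most reduce uncertainty, which is the basis of the design-of-experiments idea described above.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    strain = np.linspace(0.0, 0.5, 20)[:, None]            # sparse training points
    stress = 2.0 * strain + 5.0 * strain**3 + 0.01 * rng.standard_normal(strain.shape)

    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(strain, stress.ravel())

    query = np.linspace(0.0, 0.6, 100)[:, None]            # includes an extrapolation region
    mean, std = gp.predict(query, return_std=True)
    next_sim = float(query[np.argmax(std)])                # where a new simulation helps most
    print("run the next micro-scale simulation near strain =", round(next_sim, 3))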

    References & Further Reading

    [1] Linka, K., Hillgärtner, M., Abdolazizi, K. P., Aydin, R. C., Itskov, M., & Cyron, C. J. (2021). Constitutive artificial neural networks: A fast and general approach to predictive data-driven constitutive modeling by deep learning. Journal of Computational Physics, 429, 110010. 

    [2] Liu, X., Tian, S., Tao, F., & Yu, W. (2021). A review of artificial neural networks in the constitutive modeling of composite materials. Composites Part B: Engineering, 224, 109152. 

    [3] Williams, C. K., & Rasmussen, C. E. (2006). Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press. 

    [4] Aggarwal, A., Jensen, B. S., Pant, S., & Lee, C. H. (2023). Strain energy density as a Gaussian process and its utilization in stochastic finite element analysis: Application to planar soft tissues. Computer methods in applied mechanics and engineering, 404, 115812. 

    [5] Degruyter, W., Parmigiani, A., Huber, C. and Bachmann, O., 2019. How do volatiles escape their shallow magmatic hearth?. Philosophical Transactions of the Royal Society A, 377(2139), p.20180017. 

  • Near-real-time monitoring of supraglacial lake drainage events across the Greenland Ice Sheet

    Project institution:
    Project supervisor(s):
    Dr Katie Miles (Lancaster University), Dr Henry Moss (Lancaster University), Prof Philipp Otto (University of Glasgow) and Dr Amber Leeson (Lancaster University)

    Overview and Background

    The drainage of supraglacial lakes plays an important role in modulating ice velocity, and thus the mass balance, of the Greenland Ice Sheet. To date, research has primarily focused on drainage events and their impacts on ice motion during the summer melt season, but recent work has shown that drainage events during winter can also affect ice dynamics. However, current approaches are limited to a single season and/or one or two satellite sensors, limiting observations, whereas year-round, multi-sensor monitoring is required to fully understand lake processes and their impacts. This PhD will utilise petabytes of available Earth Observation data and exascale computing to perform ice-sheet-scale analysis of year-round supraglacial lake drainage events on the Greenland Ice Sheet and produce scalable workflows that can be used to assess the impact of lake drainage events in other glaciological environments. 

    Methodology and Objectives

    This PhD project will advance capabilities in the detection of supraglacial lake drainage events on the Greenland Ice Sheet (GrIS) and assess the impact of these drainage events on ice dynamics. The PhD will commence with two exploratory projects (~6 months each), providing complementary experience in deploying exascale compute and performing big-data analysis. 

    Teaser Project 1: Near-real-time, automated, multi-sensor, year-round monitoring of supraglacial lake drainage 

    Current assessments of supraglacial lake drainage on the GrIS are largely restricted to a single season and/or satellite sensor, particularly during winter, where existing approaches remain constrained to low-volume pipelines that analyse single orbits and provide limited temporal and spatial coverage. However, the plethora of remotely sensed imagery now available provides the opportunity to detect supraglacial lake drainage events at high temporal and spatial resolution across the entire ice sheet in near real-time through exascale big-data analysis. This project will scale a SAR-based methodology for supraglacial lake drainage detection to ice-sheet-wide monitoring, using a high-data-volume approach to achieve near-daily temporal resolution by leveraging all available orbits and both C- and L-band SAR. The project will exploit access to exascale compute to deploy GPU-accelerated machine learning methods (e.g., convolutional networks or U-Nets) able to extract spatiotemporal patterns from large and complex volumes of multi-frequency inputs, supporting robust, scalable detection of drainage events across diverse glaciological settings. Validation and training will draw on timestamped ArcticDEM strips and ICESat-2 altimetry, ensuring reliable accuracy assessments at ice-sheet scale, using methods for (cross-)validation across space and time (e.g., Otto et al. 2024). 
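
    As an illustration of the intended classifier pattern (Python/PyTorch; random tensors stand in for real C- and L-band SAR patches and their labels), a small convolutional network of this shape can be trained and then applied patch-by-patch at scale on GPUs.

    import torch
    import torch.nn as nn

    net = nn.Sequential(
        nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),       # 2 channels: C- and L-band
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 2),                                # classes: drained / not drained
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    net = net.to(device)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    patches = torch.randn(64, 2, 64, 64, device=device)  # stand-in for SAR patches
    labels = torch.randint(0, 2, (64,), device=device)   # stand-in for drainage labels
    for step in range(100):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(net(patches), labels)
        loss.backward()
        opt.step()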

    Teaser Project 2: Near-real-time evaluation of the impact of supraglacial lake drainage on the GrIS 

    As supraglacial lakes form increasingly far inland under rising atmospheric temperatures, the year-round impact of their drainage events on ice dynamics is not yet well understood. Recent research has shown that winter drainage events are numerous, often occur as “cascade events”, and can result in short-term increases in ice velocity (Dean et al., under review). However, a systematic, year-round analysis of the impact of supraglacial lake drainage events on ice dynamics across the GrIS has not yet been undertaken. This project will set up a pipeline on a sub-region of the ice sheet to analyse the impact of drainage events on ice dynamics in real time, allowing later scaling up to ice-sheet scale. Access to exascale compute will enable comparison of the database of drainage events created for Project 1 with climate and glaciological data (e.g., temperature, precipitation, surface energy balance, ice surface velocity, ice thickness, and bed elevation). By applying scalable statistical methods such as changepoint analysis (offline detection), statistical process monitoring (online surveillance), and anomaly detection using deep learning, the impact of lake drainage events will be evaluated in real time and assessed over a range of timescales. Additionally, the trained DNNs from Project 1 can be used for dimensionality reduction and process monitoring based on data depths, which can indicate further sources/reasons for detected changes (Malinovskaya et al. 2024). 
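
    A minimal sketch of the online-surveillance idea (Python/NumPy; a synthetic ice-velocity series with a step change stands in for real velocity products): a one-sided CUSUM statistic, one of the simplest statistical process monitoring tools, flags a post-drainage speed-up as soon as the accumulated evidence crosses a threshold.

    import numpy as np

    rng = np.random.default_rng(1)
    v = np.concatenate([rng.normal(100, 2, 200),   # in-control velocity (m/yr)
                        rng.normal(108, 2, 100)])  # speed-up after a drainage event

    mu0, sigma = 100.0, 2.0            # in-control mean and noise level
    k, h = 0.5 * sigma, 5.0 * sigma    # reference value and decision threshold
    s = 0.0
    for t, vt in enumerate(v):
        s = max(0.0, s + (vt - mu0 - k))   # one-sided CUSUM recursion
        if s > h:
            print("speed-up detected at time step", t)
            break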

    Long-term pathway and objectives 

    At the end of Year 1, the student will select a pathway for further development. If Project 1 is chosen, the PhD will focus on leveraging additional compute resources and machine learning methods to scale the analysis to include additional, larger, and multi-modal data sources in our drainage detection methodology (such as optical imagery to enhance summer detection) and to deploy our method over other regions, such as Antarctic ice shelves, where real-time monitoring of supraglacial lake drainage events and cascades could be a useful precursor for forecasting ice-shelf disintegration (Banwell et al., 2013). If Project 2 is chosen, the PhD will focus on exploiting access to exascale compute and machine learning methods to scale the analysis to an ice-sheet-wide scale, enabling near-real-time assessment of the impact of supraglacial lake drainage events on the GrIS and potentially other glaciological environments.  

    For either pathway, the aims of the PhD are to: 

    • Advance understanding of year-round supraglacial lake drainage events on the Greenland Ice Sheet. 
    • Exploit exascale compute resources and machine learning to enable real-time detection of supraglacial lake drainage events at an unprecedented scale and assess their impacts. 
    • Produce scalable workflows that can be applied to supraglacial lake drainage events in other glaciological regions. 

    References & Further Reading

    Banwell, A. et al., (2013), Breakup of the Larsen B Ice Shelf triggered by chain reaction drainage of supraglacial lakes, Geophysical Research Letters, 40, 22, 5872-5876, doi.org/10.1002/2013GL057694 

    Christoffersen, P. et al., (2018), Cascading lake drainage on the Greenland Ice Sheet triggered by tensile shock and fracture, Nature Communications, 9, 1064, doi.org/10.1038/s41467-018-03420-8 

    Dean, C. et al. (under review), A decade of winter supraglacial lake drainage across Northeast Greenland using C-band SAR, The Cryosphere Discussions 

    Dunmire, D. et al., (2025), Greenland Ice Sheet wide supraglacial lake evolution and dynamics: insights from the 2018 and 2019 melt seasons, Earth and Space Science, 12, 2, doi.org/10.1029/2024EA003793 

    Leeson, A. et al., (2015), Supraglacial lakes on the Greenland ice sheet advance inland under warming climate, Nature Climate Change, 5, 51–55, doi.org/10.1038/nclimate2463 

    Malinovskaya, A., Mozharovskyi, P., & Otto, P. (2024). Statistical process monitoring of artificial neural networks. Technometrics, 66(1), 104-117, doi.org/10.1080/00401706.2023.2239886 

    Miles, K., et al., (2017), Toward monitoring surface and subsurface lakes on the Greenland Ice Sheet using Sentinel-1 SAR and Landsat-8 OLI imagery, Frontiers in Earth Science, 5, doi.org/10.3389/feart.2017.00058 

    Otto, P., Fassò, A., & Maranzano, P. (2024). A review of regularised estimation methods and cross-validation in spatiotemporal statistics. Statistics Surveys, 18, 299-340, doi.org/10.1214/24-SS150 

  • Scalable Deep Learning for Biodiversity Monitoring under Real-World Constraints

    Project institution:
    Project supervisor(s):
    Dr Tiffany Vlaar (University of Glasgow), Prof Colin Torney (University of Glasgow), Prof Rachel McCrea (Lancaster University), Dr Thomas Morrison (University of Glasgow) and Dr Paul Eizenhöfer (University of Glasgow)

    Overview and Background

    Technological advances have ushered in the era of big data in ecology (McCrea et al., 2023). The use of deep learning and GPUs shows promise for more effective biodiversity monitoring, which is vital for monitoring and mitigating the effects of climate change. However, many open questions remain on the biases and behaviour of deep neural networks under real-world constraints, such as unbalanced data and uncertain labels. There is a pressing need for better benchmarks to train, test, and understand these models. Further, Kaplan et al. (2020) found that the performance of deep neural networks improves with model and data size. Training large models on vast ecological datasets requires substantial GPU time, and reliable performance will greatly benefit from the potential of exascale computing.

    Methodology and Objectives

    Project 1: Scalable Biodiversity Monitoring with Deep Learning by Understanding What Data Matters
    The big data era in ecology offers incredible potential for biodiversity monitoring, but is constrained by the need to manually process vast amounts of data. A combination of deep learning and citizen science approaches offers a promising avenue for reliable and accelerated biodiversity monitoring, e.g. for counting wildlife in aerial survey images (Torney et al., 2019) and for species classification in camera-trap data (Sharpe et al., 2025). Neural networks are typically evaluated on their ability to generalise to new, unseen data (Zhang et al., 2017). In this project we will investigate which data is actually important for obtaining good generalisation performance. A potential proxy for data sample “importance” is an example difficulty score. Although we can consider different metrics and types of data later in the PhD, for the initial teaser project we will consider example difficulty through the lens of citizen science classifications on camera-trap data: if there is substantial disagreement amongst human volunteers on the classification of an image, we consider this example to be more difficult. We will investigate the role that these difficult examples play during training of deep neural networks. We will then study which, when, and how many examples can be removed from training without affecting generalisation performance, and how this is affected by the choice of model architecture and its corresponding inductive bias (or learning bias). The iterative retraining cycles will benefit greatly from GPU compute and the potential of exascale computing. The outcome of this project will offer not only routes towards increased efficiency and more sustainable AI, but also important insights into the training dynamics of deep learning models on real-world, complex ecological datasets, providing a strong foundation for the rest of the PhD.
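
    A minimal sketch of the difficulty-scoring step (Python/NumPy; toy per-image vote counts stand in for real citizen-science classifications): example difficulty is taken as the entropy of the volunteer label distribution, and the easiest fraction is then dropped from the training set.

    import numpy as np

    votes = np.array([[12, 0, 0],     # rows: images; columns: candidate species
                      [6, 5, 1],
                      [4, 4, 4],
                      [11, 1, 0]])
    p = votes / votes.sum(axis=1, keepdims=True)
    logp = np.log(p, where=p > 0, out=np.zeros_like(p))
    entropy = -(p * logp).sum(axis=1)              # high entropy = high disagreement

    order = np.argsort(entropy)                    # low entropy = easy example
    keep = order[len(order) // 4:]                 # e.g. drop the easiest 25%
    print("difficulty scores:", entropy.round(2), "-> train on images", sorted(keep))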

    Project 2: Enhancing Rare Species Classification with Generative AI
    Real-world ecological datasets often feature large class imbalances, meaning that certain species are significantly underrepresented. While deep neural networks offer potential for successful biodiversity monitoring, accurately classifying these scarcely represented species remains particularly challenging. Although deep learning models can be pre-trained to enhance performance, available pre-training data can feature strong biases, inaccuracies, and limited diversity, which can affect downstream performance (Luccioni et al., 2022). Generative models, such as diffusion models (Sohl-Dickstein et al., 2015; Song et al., 2019) and Generative Adversarial Networks (GANs) (Goodfellow et al., 2014), have the potential to generate realistic, high-quality, diverse synthetic images. In this project we will investigate whether such synthetic data can complement real ecological data to benefit training. Successful training of large generative models will greatly benefit from exascale computing resources. In this teaser project, the student will work with an existing camera-trap dataset and adapt existing deep learning techniques (e.g., Rombach et al., 2022) to generate synthetic data for this specific setting. They will then test how training on the combined dataset enhances test-time performance for rare species. Through this the student will deepen their understanding of camera-trap data, gain relevant ecological knowledge, and work towards enhancing the performance of deep learning models for biodiversity monitoring under the widespread challenge of dataset imbalance. It would be valuable to analyse whether findings generalise to other settings, including remote sensing data and different models, with support from the supervisor team.
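
    The rebalancing step can be sketched as follows (Python/PyTorch; random tensors stand in for real and diffusion-generated camera-trap images): synthetic images of the rare class are pooled with the real data, and batches are rebalanced with a weighted sampler.

    import torch
    from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

    real_x = torch.randn(1000, 3, 64, 64); real_y = torch.zeros(1000, dtype=torch.long)
    rare_x = torch.randn(20, 3, 64, 64);   rare_y = torch.ones(20, dtype=torch.long)
    synth_x = torch.randn(200, 3, 64, 64); synth_y = torch.ones(200, dtype=torch.long)  # generated

    x = torch.cat([real_x, rare_x, synth_x]); y = torch.cat([real_y, rare_y, synth_y])
    weights = (1.0 / torch.bincount(y).float())[y]       # inverse-frequency sample weights
    sampler = WeightedRandomSampler(weights, num_samples=len(y), replacement=True)
    loader = DataLoader(TensorDataset(x, y), batch_size=64, sampler=sampler)

    xb, yb = next(iter(loader))
    print("batch class counts:", torch.bincount(yb))     # roughly balanced despite imbalance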

    References & Further Reading

    1. McCrea et al. (2023). Realising the promise of large data and complex models. Methods in Ecology and Evolution. 14, 4-11.
    2. Kaplan et al. (2020). Scaling Laws for Neural Language Models. CoRR.
    3. Torney et al. (2019). A comparison of deep learning and citizen science techniques for counting wildlife in aerial survey images. Methods in Ecology and Evolution. 10, 779-787.
    4. Sharpe et al. (2025). Increasing citizen scientist accuracy with artificial intelligence on UK camera-trap data. Remote Sensing in Ecology and Conservation.
    5. Zhang et al. (2017). Understanding deep learning requires rethinking generalization. ICLR.
    6. Luccioni et al. (2022). Bugs in the Data: How ImageNet Misrepresents Biodiversity. AAAI.
    7. Sohl-Dickstein et al. (2015). Deep Unsupervised Learning using Nonequilibrium Thermodynamics. International Conference on Machine Learning.
    8. Song et al. (2019). Generative Modeling by Estimating Gradients of the Data Distribution. Advances in Neural Information Processing Systems.
    9. Goodfellow et al. (2014). Generative adversarial nets. NeurIPS.
    10. Rombach et al. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR.
  • Scotland landscape response to past abrupt climate change: GPU-accelerated numerical simulations and model-data integration

    Project institution:
    Project supervisor(s):
    Dr Jingtao Lai (University of Glasgow), Prof Todd Ehlers (University of Glasgow), Dr Sebastian Mutz (University of Glasgow) and Dr Katie Miles (Lancaster University)

    Overview and Background

    Around 12,900 years ago, a rapid cooling of the climate triggered widespread glaciation across the Northern Hemisphere. This cold phase, known as the Younger Dryas, ended abruptly with a dramatic warming. This event provides a valuable natural experiment for understanding how Earth systems respond to sudden shifts in climate, a question highly relevant to today’s ongoing climate change. 

    Scotland preserves one of the most detailed geological and geomorphological records of Younger Dryas glaciation anywhere in the world. This project will leverage these records by developing a GPU-accelerated computational workflow that integrates numerical simulations of glaciation with glaciological, geological, and geomorphological datasets. The aim is to reconstruct and test interactions among glaciation, climate, and topography during the Younger Dryas, improving our understanding of Earth’s response to abrupt climate change. 

    Methodology and Objectives

    Although previous research has produced a wealth of observations on the Younger Dryas glaciation in Scotland, efforts to integrate them with numerical modelling remain limited. This is largely because most model-data integration methods require running a large ensemble of simulations, while traditional glaciation simulations are computationally demanding. A recently developed GPU-based ice flow model offers an opportunity to address this challenge. By applying physics-informed machine learning techniques, the model leverages the power of modern GPU hardware to efficiently simulate glaciation. This project will build on this model and focus on exploring model-data integration methods on exascale computers. 

    The overarching goal of this PhD is to develop a robust and scalable model-data integration workflow that can integrate the GPU-based model with glaciological, geomorphological, and geological datasets in Scotland. Depending on their interests, the student may also pursue further model developments, such as optimizing the workflow for efficient multi-GPU simulations or coupling it with other Earth system models, including climate and landscape evolution models. 

    Teaser Project 1: Numerical inversion of past climate conditions in Scotland 

    Although the glaciation event in Scotland broadly coincides with the global Younger Dryas period, evidence is emerging that the exact timing and magnitude of the climate shift in Scotland differed from those in other parts of northwestern Europe. A better understanding of regional climate conditions in Scotland therefore has important implications for understanding the interactions between different Earth systems during abrupt climate changes. The objective of this Teaser Project is to use numerical inversion to infer past climate conditions from the mapped ice extent of the Younger Dryas glaciation in Scotland. The project will integrate GPU-based simulations of glaciation with ice extent constraints through the Markov chain Monte Carlo (MCMC) method, a computational approach that explores many possible scenarios to determine which ones best fit the evidence. The goal is to constrain the climate conditions needed to produce glaciation consistent with field observations in Scotland. 
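
    A minimal sketch of the inversion loop (Python/NumPy; a toy scalar relation between a temperature anomaly and ice extent stands in for the GPU ice-flow model): the Metropolis variant of MCMC samples the climate parameter most consistent with a mapped extent and its uncertainty.

    import numpy as np

    rng = np.random.default_rng(0)

    def ice_extent(dT):
        # Toy forward model: a colder climate (more negative dT) gives a larger extent (km^2).
        return 4000.0 - 900.0 * dT

    observed, obs_sigma = 6700.0, 200.0            # "mapped" extent and its uncertainty

    def log_post(dT):
        # Gaussian likelihood, flat prior on dT.
        return -0.5 * ((ice_extent(dT) - observed) / obs_sigma) ** 2

    dT, samples = 0.0, []
    for _ in range(20000):
        prop = dT + 0.2 * rng.standard_normal()    # random-walk proposal
        if np.log(rng.uniform()) < log_post(prop) - log_post(dT):
            dT = prop                              # accept
        samples.append(dT)
    post = np.array(samples[5000:])                # discard burn-in
    print("posterior dT: %.2f +/- %.2f C" % (post.mean(), post.std()))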

    The long-term pathway of this Teaser Project will focus on optimizing the workflow by 1) integrating additional geomorphological and geological observations (e.g., moraine positions, sedimentary records, and chronology data), and 2) combining glaciation simulations with climate simulations to capture the interaction between ice growth/decay and climate variations.  

    Teaser Project 2: Using data assimilation to reveal the dynamic evolution of glacier systems 

    The evolution of Younger Dryas glaciation in Scotland provides a valuable natural experiment for understanding how glacier systems respond to rapid climate fluctuations. Field evidence has helped reconstruct the general pattern and timing of ice advance and retreat in some areas, but these records remain spatially patchy and only partially time-resolved. This Teaser Project aims to combine such sparse chronological observations with physically based ice flow models by using data assimilation techniques (e.g., Ensemble Kalman Filter). Data assimilation is a method that updates a glacier simulation over time by combining the model’s predictions with observations, such as exposure dating, to produce the most likely evolution of the glacier. By applying this approach, the project aims to develop a framework capable of reconstructing the timing, extent, and dynamics of glaciation in unprecedented detail.  
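
    One analysis step of the method can be sketched directly (Python/NumPy; a toy scalar glacier-length model with an uncertain melt parameter stands in for the GPU ice-flow model): the Ensemble Kalman Filter update pulls both the observed quantity and the unobserved parameter towards the data.

    import numpy as np

    rng = np.random.default_rng(2)
    n_ens = 100
    melt = rng.normal(1.0, 0.3, n_ens)                          # uncertain melt factors
    length = 50.0 - 10.0 * melt + rng.normal(0.0, 0.5, n_ens)   # forecast lengths (km)

    obs, obs_err = 38.0, 1.0                  # a dated length constraint (e.g. a moraine)
    state = np.vstack([length, melt])         # joint state: observable plus parameter
    H = np.array([[1.0, 0.0]])                # observation operator: length only

    P = np.cov(state)                                                    # 2x2 covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + np.array([[obs_err**2]]))  # Kalman gain
    innovation = obs + rng.normal(0.0, obs_err, n_ens) - (H @ state).ravel()
    state = state + K @ innovation[None, :]                              # analysis ensemble
    print("melt factor: prior mean %.2f -> posterior mean %.2f"
          % (melt.mean(), state[1].mean()))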

    The long-term pathway of this Teaser Project is to create a transferable data–model integration workflow that can assimilate diverse glaciological, geological, and geomorphological datasets, ultimately improving our understanding of the dynamic response of ice flow in Scotland during the Younger Dryas. 

    ​References & Further Reading

    https://www.antarcticglaciers.org/glacial-geology/british-irish-ice-sheet/younger-dryas-loch-lomond-stadial/ 

    https://www.antarcticglaciers.org/glacial-geology/british-irish-ice-sheet/younger-dryas-loch-lomond-stadial/the-loch-lomond-stadial/ 

    Leger, T. P. M., Jouvet, G., Kamleitner, S., Mey, J., Herman, F., Finley, B. D., Ivy-Ochs, S., Vieli, A., Henz, A., & Nussbaumer, S. U. (2025). A data-consistent model of the last glaciation in the Alps achieved with physics-driven AI. Nature Communications, 16(1), 848. https://doi.org/10.1038/s41467-025-56168-3 

  • Unlocking understanding of floods and droughts through data assimilation and exascale computing

    Project institution:
    Project supervisor(s):
    Prof Jess Davies (Lancaster University), Prof Lindsay Beevers (University of Edinburgh), Dr Simon Moulds (University of Edinburgh) and Prof Gordon Blair (UKCEH)

    Overview and Background

    Floods and droughts are increasingly impacting society, ecosystems, and the environment, yet predicting when they will occur and what their effects will be remains a major scientific challenge. Soil-water interactions are at the heart of this challenge, as they are pivotal in storing and releasing water in landscapes, determining plant growth, and cycling nutrients. However, these interactions are highly complex, and we currently rely on computationally intensive process-based models to understand them and predict their influence on ecosystem services. With recent advances in satellite imagery and sensing, a wealth of soil moisture data and other relevant data products are now available that could transform these models and our understanding of the risks and impacts of floods and droughts. This studentship focuses on taking advantage of new exascale computing approaches to facilitate data assimilation, exploring how the fusion of big data with hydrological and biogeochemical soil-water process models can help unlock new insights and understanding. 

    Methodology and Objectives 

    Teaser Project 1: The role of soil water storage in drought risk 

    Objective: Estimate the contribution of soil water storage to mitigating or increasing drought risk in a case study catchment by combining remotely sensed soil moisture data, along with meteorological, hydrological, and hydrogeological data, with hydrological models. 

    Soil water holding capacity can play a significant role in buffering droughts by storing moisture and supporting groundwater recharge. However, the interactions among precipitation, soil processes, surface flow, and groundwater are complex. Using data-driven methods to explore these relationships could improve our understanding of drought propagation. 

    Soil moisture is an important component in semi-distributed or distributed hydrological models. However, it is often poorly represented, and it is not routinely updated dynamically during a simulation. If we can build data-driven models that relate precipitation to groundwater through soil-water interactions during droughts, we could improve our hydrological models by including hybrid process representations.

    In this teaser, the student will begin to explore different approaches to assimilating remote sensing and in-situ monitoring data into hydrological models, focusing on the Tweed catchment, to enhance the models' representation of soil-water-groundwater interactions during droughts.  

    Teaser Project 2: The effects of droughts on long-term soil carbon cycling   

    Objective: Improve process-based model representation of the long-term effects of droughts on plant growth and soil carbon through remote sensing data assimilation. 

    A lack of water can have large effects on plants, especially on annual crops where water conditions can severely affect the plant’s growth and survival. With changing water patterns and increasing frequency of prolonged dry periods, the effects on plant productivity are expected to be large, and there will be knock-on effects for soil carbon storage in the longer-term. 

    Remote sensing offers many data products that can provide us with data-based insights into plant productivity and soil moisture conditions. However, remote sensing of soil carbon is much more difficult, and understanding of the long-term response to changes in plant productivity still requires process-based models. 

    In this teaser, the process-based model N14CP, which simulates plant-soil carbon cycling, will be adapted to assimilate (Gross or Net) Primary Productivity (GPP/NPP) and soil moisture remote sensing data products during a known period of drought in the UK. Freely available datasets, for example from MODIS and SMAP, that match the spatial resolution of the model will provide a starting point. This model will be used to explore the long-term effects of droughts on soil carbon. 

    Shared methods and the pathway to PhD 

    • Both teasers have a common focus on droughts and involve data assimilation into process-based models. The student will explore a range of approaches, working up from direct insertion, through traditional data assimilation (e.g. Kalman filtering or particle filtering), to ML-supported approaches (e.g. combining ensemble Kalman filtering with machine learning to reduce compute times) and ML-based surrogate modelling to speed up process-based model simulation, using, for example, Recurrent Neural Network (RNN) methods such as Long Short-Term Memory (LSTM) networks, which have been shown to be promising for emulating hydrological systems (a minimal surrogate sketch follows this list).      
    • The two teasers can be developed into two full chapters focused on the use of data assimilation in determining drought risk and knock-on impacts for carbon cycling. 
    • The PhD can be further developed in a number of other directions, depending on the student's interests, by: i) expanding the focus to floods; ii) developing scaling approaches to move up to catchment and national scales; iii) exploring two-way learning between data and models; and iv) trialling real-time assimilation approaches that help move towards a digital twin.  
    • Exascale/GPU computing will be fundamental in supporting ML data assimilation approaches and hybrid model simulations. For instance, the development of a generalisable ML surrogate model capable of simulating hydrological fluxes and storage processes at the land surface requires training on large spatially and temporally explicit datasets comprising satellite imagery, model outputs, and other relevant data sources. During training, the model must be exposed to as much information as possible to accurately learn the system’s responses to various inputs. A limited dataset reduces the likelihood that the model will capture the full spectrum of system behaviours. This limitation is particularly significant in non-linear systems, such as hydrological systems, where extrapolation beyond the training range becomes unreliable. GPUs will be vital to handling data volumes needed to achieve this, enabling parallelisation of matrix operations on large training data. Greater computational capacity permits the use of larger datasets during training, thereby improving the robustness and generalisability of the surrogate model. 
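
    As a minimal sketch of the surrogate idea mentioned in the first bullet above (Python/PyTorch; a toy linear-reservoir model stands in for the process-based hydrological model), an LSTM can be trained to emulate the model's rainfall-to-runoff response.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    rain = torch.rand(32, 100, 1)                   # (batch, time steps, features)
    store, flows = torch.zeros(32, 1), []
    for t in range(100):                            # toy process model to emulate
        store = store + rain[:, t] - 0.1 * store
        flows.append(0.1 * store)
    flow = torch.stack(flows, dim=1)                # (batch, time steps, 1)

    class Surrogate(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(1, 32, batch_first=True)
            self.head = nn.Linear(32, 1)
        def forward(self, x):
            h, _ = self.lstm(x)
            return self.head(h)

    model = Surrogate()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for epoch in range(300):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(rain), flow)
        loss.backward()
        opt.step()
    print("surrogate MSE:", float(loss))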

    The student’s research will be connected to the Floods and Droughts Research Infrastructure at UK Centre for Ecology & Hydrology, helping connect the student with relevant research and data resources: https://www.ceh.ac.uk/our-science/projects/floods-and-droughts-research-infrastructure-fdri  


Projects with a focus on Geodynamics, Geosciences and Environmental Change:

 

  • AI-Driven Satellite Embeddings for Fine-resolution Mapping and Tracking Invasive Species on Global Reclaimed Lands

    Project institution:
    Project supervisor(s):
    Dr Meiliu Wu (University of Glasgow), Dr Alex Bush (Lancaster University), Dr Wenxin Zhang (University of Glasgow) and Prof Brian Barrett (University of Glasgow)

    Overview and Background

    Reclaimed lands (e.g., post-mining and post-industrial sites) are expanding globally and are particularly susceptible to colonisation by invasive plant species, which can negatively affect restoration, biodiversity, and ecosystem functions. We propose to evaluate and extend AlphaEarth Foundations, i.e., Google's Satellite Embedding V1 (annual, 10 m by 10 m, 64-dimensional), for fine-resolution identification of invasive plant species in reclaimed areas worldwide and for tracking temporal change in species composition and spread. AlphaEarth encodes multi-sensor Earth Observation (EO) time series into consistent, analysis-ready embeddings for 2017–2024, supporting scalable classification and efficient change detection. We will combine embedding-based models with climate and reclamation histories to identify key drivers, quantify uncertainty, and align the results with ExaGEO's “exascale model & big-data coupling” platform.  

    Methodology and Objectives

    Data & Methods Used: 

    We will use Google Earth Engine's Satellite Embedding V1 as the core feature space; the embeddings are unit-length, consistent over years, and produced by AlphaEarth Foundations (Brown et al., 2025) via multi-modal assimilation across optical, radar, and LiDAR, facilitating both classification and dot-product/angle-based change metrics. We will (i) assemble global reclaimed-area masks, beginning with open global-scale mining polygons (v2) (Maus et al., 2022) and complementary sources; (ii) compile invasive plant labels from the Global Invasive Species Database (GISD) linked to Global Biodiversity Information Facility (GBIF) occurrences; (iii) develop label-efficient methods (e.g., positive-unlabelled, semi-supervised, and weak supervision with quality controls); (iv) build temporal transformers over annual embeddings for trend/change analysis; and (v) perform GPU-accelerated distributed training/inference with rigorous uncertainty quantification (e.g., deep ensembles) and domain-shift tests across regions and biomes.  
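
    Because the embeddings are unit-length, the change metric reduces to a per-pixel dot product. A minimal sketch (Python/NumPy; random unit vectors stand in for two annual Satellite Embedding V1 tiles):

    import numpy as np

    rng = np.random.default_rng(0)

    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    year_a = unit(rng.standard_normal((100, 100, 64)))     # one 64-dim embedding per pixel
    year_b = year_a.copy()
    year_b[40:60, 40:60] = unit(rng.standard_normal((20, 20, 64)))  # simulated change

    similarity = np.sum(year_a * year_b, axis=-1)          # cosine = dot for unit vectors
    changed = similarity < 0.5                             # threshold is illustrative only
    print("flagged pixels:", int(changed.sum()), "(true change region: 400)")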

    Teaser Project 1  

    Suggested headings: Fine-resolution invasive species mapping on global reclaimed lands  

    Objectives:  

    1. Benchmark AlphaEarth embeddings vs. standard EO features for species-level classification on reclaimed sites (start with mining areas, then generalise), measuring macro-F1, calibration, and cross-region transfer. 
    2. Build a label-efficient pipeline that fuses GISD species lists with spatially filtered GBIF occurrences and negative sampling within reclaimed buffers; evaluate sensitivity to sampling bias. 
    3. Develop multi-scale embedding aggregation (tile/patch pooling and linear composability) for species distinguishability; compare tree-based methods vs. shallow nets on the 64-dimensional space for compute efficiency.
    4. Produce regional probability maps (e.g., Scotland, Ruhr, Appalachia, Hong Kong, Shantou, and Jiangsu) with uncertainty maps to guide validation and management.  

    How it becomes a full PhD: Global expansion across reclaimed typologies (e.g., mines, landfills, and brownfields), deeper label curation, and active learning with end-user feedback; deliver a reproducible mapping stack and evaluation framework.  

    Teaser Project 2  

    Suggested headings: Temporal dynamics and early detection of invasive spread post-reclamation 

    Objectives: 

    1. Use annual embeddings (2017–2024) to compute intra-site temporal similarity (e.g., cosine/dot-product shifts) and detect emergence or range expansion of invasive species; and integrate climate covariates with Dr Zhang’s modelling to attribute underlying drivers.  
    2. Train temporal sequence models (temporal transformers on annual 64-dimensional vectors) to predict next-year probabilities and time-to-detection, with quantified uncertainty. 
    3. Quantify management impacts by comparing trajectories across reclamation strategies (where metadata exist) and by disentangling the influence of climate anomalies vs. anthropogenic disturbance signals.
    4. Deliver global change products and a watchlist of high-risk sites for early intervention. 

    How it becomes a full PhD: Scale temporal modelling globally, generalise to additional taxa, and operationalise early-warning thresholds with end-users. 

    References & Further Reading

    Brown, Christopher F., et al. “Alphaearth foundations: An embedding field model for accurate and efficient global mapping from sparse label data.” arXiv preprint arXiv:2507.22291 (2025). 

    Maus, Victor; da Silva, Dieison M; Gutschlhofer, Jakob; da Rosa, Robson; Giljum, Stefan; Gass, Sidnei L B; Luckeneder, Sebastian; Lieber, Mirko; McCallum, Ian (2022): Global-scale mining polygons (Version 2) [dataset]. PANGAEA, https://doi.org/10.1594/PANGAEA.942325 

  • Chasing fluid pathways: GPU-enabled multiscale subduction models to unravel how subduction-driven melt dynamics determine surface deformation and topography

    Project institution:
    Project supervisor(s):
    Dr Antoniette Greta Grima (University of Glasgow), Dr Tobias Keller (University of Glasgow) and Dr Luca Parisi (University of Edinburgh)

    Overview and Background

    Subduction zones are the primary gateways through which water, carbon, and other volatiles are transported into the Earth's mantle. These fluids are central to Earth's evolution: they trigger partial melting in the mantle wedge, sustain the deep water and carbon cycles, drive arc volcanism, and ultimately help maintain Earth's long-term habitability (Tian et al., 2019). At shallower depths, fluids and melts profoundly modify the strength of the lithosphere and continental crust. They reduce viscosity, promote faulting and deformation, and localize magmatic pathways (Nakao et al., 2016). These processes not only shape surface landscapes but also govern volcanic hazards and the emplacement of economically critical mineral deposits (Faccenda, 2014). 

    Despite their importance, the mechanisms of reactive fluid transport in subduction systems remain poorly constrained. Fundamental open questions include: 

    • How do transient pulses of fluid release alter the rheology of the overriding plate and guide surface deformation? 
    • To what extent do fluid–rock interactions control the focusing of melts and the distribution of arc magmatism? 
    • Can slab dehydration events leave observable topographic or geophysical signals that serve as precursors to volcanic unrest or continental rifting? 

    Answering these questions is a formidable challenge, because the governing processes span scales from grain boundaries to tectonic plates, and from seconds to millions of years. Current CPU-based models cannot capture this range: resolving fluid pathways requires kilometre- to metre-scale resolution, while system-scale simulations demand computational domains hundreds of kilometres across. Bridging these scales dynamically has remained beyond reach. 

    This PhD project will break this barrier by developing GPU-accelerated, multi-scale models of subduction zone dynamics that explicitly couple fluid release, volatile transport, melt migration, and surface deformation. By exploiting exascale computing architectures, the project will integrate fine-scale reactive flow models with large-scale geodynamic simulations in ways not previously possible. Adaptive mesh refinement and GPU-enabled solvers will allow kilometre-scale fluid processes to be embedded directly within tectonic-scale models of subduction and topographic evolution. 

    The scientific goal is to establish how fluid transport interacts with subduction dynamics to reshape continental surfaces. By linking volatile release to lithospheric weakening, melt focusing, and measurable topographic responses, this research will provide new insight into the origins of volcanic hazards, the localisation of critical resources, and the long-term evolution of continents. The project will also deliver community-relevant GPU software and HPC workflows, contributing to ExaGEO’s mission to prepare Earth science for the exascale era. 

    Methodology and Objectives

    This project will use GPU-accelerated numerical modelling to directly couple reactive fluid transport with thermo-mechanical subduction dynamics. The approach is designed to bridge processes from the scale of fluid migration pathways within the crust to the scale of plate interactions and topographic evolution.  

    The student will: 

    • Develop new computational tools in Julia and Python to implement GPU-enabled solvers for two-phase flow and thermo-mechanical coupling. 
    • Extend the open-source ASPECT code to run efficiently on GPU architectures, e.g. using GPU-accelerated finite-element matrix-free methods and adaptive mesh refinement (AMR).
    • Integrate multi-scale models by embedding high-resolution, crustal-scale simulations of fluid migration within regional 2D/3D subduction models. 
    • Conduct systematic numerical experiments to test how fluid release influences deformation patterns, melt focusing, and surface uplift/subsidence. 
    • Benchmark and validate results against laboratory experiments, field observations, and published numerical benchmarks to ensure robustness. 

    The novelty lies in the computational design: instead of treating fluid migration and lithospheric deformation as separate problems, the project will couple them dynamically within the same simulation framework. GPU acceleration and exascale platforms make this coupling computationally feasible, enabling parameter sweeps and real-time tracking of fluid–rock interactions across scales. 

    The research will begin with two focused “teaser projects” that provide distinct skill sets and scientific insights, before converging into an integrated PhD focus.

    Teaser Project 1: GPU-Optimized Two-Phase Flow Model 

    In this project the student will implement a simplified two-phase flow model, based on Darcy–Stokes coupling, in Julia with GPU acceleration. The initial focus will be on ensuring computational performance and numerical stability, followed by validation of the solver against analytical benchmarks and published test cases to establish robustness. With this foundation in place, the accelerated framework will then be used to carry out systematic parameter studies, exploring how permeability, viscosity contrasts, and fluid mobility influence fluid transport. These experiments will identify the conditions under which reactive channelization and focused flow emerge, and the results will be used to assess how such migration behaviours modify the bulk rheology of the overriding plate. In turn, these outcomes will provide valuable boundary conditions and insights that can be transferred into larger-scale subduction models. 
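
    As a much-reduced illustration of the physics involved (Python/NumPy; a toy 2D heterogeneous Darcy problem, not the full Darcy-Stokes coupling), the sketch below relaxes div(k grad p) = 0 around a high-permeability channel and shows the fluid flux focusing into it.

    import numpy as np

    n = 64
    k = np.ones((n, n))
    k[:, n // 2 - 2:n // 2 + 2] = 100.0                        # high-permeability channel
    p = np.linspace(1.0, 0.0, n)[:, None] * np.ones((1, n))    # p=1 at top, p=0 at bottom

    for it in range(5000):                         # Jacobi relaxation of div(k grad p) = 0
        kN = 0.5 * (k[1:-1, 1:-1] + k[:-2, 1:-1])  # face permeabilities (arithmetic mean)
        kS = 0.5 * (k[1:-1, 1:-1] + k[2:, 1:-1])
        kW = 0.5 * (k[1:-1, 1:-1] + k[1:-1, :-2])
        kE = 0.5 * (k[1:-1, 1:-1] + k[1:-1, 2:])
        p[1:-1, 1:-1] = (kN * p[:-2, 1:-1] + kS * p[2:, 1:-1]
                         + kW * p[1:-1, :-2] + kE * p[1:-1, 2:]) / (kN + kS + kW + kE)

    qy = -k[1:-1, :] * (p[2:, :] - p[:-2, :]) / 2.0            # vertical Darcy flux
    print("channel flux / background flux:",
          float(qy[:, n // 2].mean() / qy[:, 5].mean()))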

    Teaser Project 2: GPU-Enabled Thermo-Mechanical Subduction Modelling 

    The second project turns to the thermo-mechanical evolution of subduction systems and will involve extending existing finite-element tools to incorporate GPU acceleration. The student will run two- and three-dimensional subduction models that incorporate visco-plastic rheology, free-surface deformation, and slab dehydration parameterised via phase diagrams in ASPECT. These simulations will be used to investigate how episodic dehydration alters stress fields, deformation patterns, and topographic response at scales from a few kilometres to hundreds of kilometres. While ASPECT is not yet GPU-enabled, it leverages the deal.II library, which does support GPUs. This allows the student to investigate different solvers, identify performance bottlenecks, and determine which aspects of the code need to be adapted to use accelerators efficiently (such as matrix-free implementations and managing data transfers between CPU and GPU), without extensively changing ASPECT's framework. If time allows, the student will have the opportunity to implement those changes in ASPECT.

    References & Further Reading

    1. Faccenda, M. (2014). Water in the slab: A trilogy. Tectonophysics, 614, 1–30. https://doi.org/10.1016/j.tecto.2013.12.020 
    2. Heister, T., Dannberg, J., Gassmöller, R., & Bangerth, W. (2017). High accuracy mantle convection simulation through modern numerical methods – II: Realistic models and problems. Geophysical Journal International, 210(2), 833–851. https://doi.org/10.1093/gji/ggx195 
    3. Keller, T. and Suckale, J., 2019. A continuum model of multi-phase reactive transport in igneous systems. Geophysical Journal International, 219(1), pp.185-222.  
    4. Nakao, A., Iwamori, H., & Nakakuki, T. (2016). Effects of water transportation on subduction dynamics: Roles of viscosity and density reduction. Earth and Planetary Science Letters, 454, 178–191. https://doi.org/10.1016/j.epsl.2016.08.016 
  • Data-Driven and Physics-Informed Hybrid Modelling of Landslide Dynamics

    Project institution:
    Project supervisor(s):
    Prof Jin Sun (University of Glasgow), Prof Andrew McBride (University of Glasgow), Dr Jingtao Lai (University of Glasgow), Prof Todd Ehlers (University of Glasgow) and Dr Eric Breard (University of Edinburgh)

    Overview and Background

    Landslides and debris flows threaten human life, infrastructure, and economies worldwide, especially as climate change drives more extreme rainfall events. Timely prediction of such mass movements is therefore crucial for disaster risk reduction. Achieving this requires combining accurate physics-based modelling of granular flows with rich observational data: for example, remote-sensing change-detection of slope deformation. Recent advances in data-driven modelling and physics-informed machine learning offer new opportunities to integrate physical laws with observational and simulation data. This project aims to establish a unified hybrid modelling framework that bridges high-fidelity particle-scale simulations and large-scale field data for improved prediction of landslide initiation and runout, thereby laying the foundation for future digital twinning of landslides.  

    Methodology and Objectives

    The overarching goal of this PhD project is to develop a hybrid modelling framework that combines data-driven learning with physics-based constitutive modelling to describe the transition between the solid-like and fluid-like behaviour, which is critical for predicting slope failure and runout, and to bridge the gap between microscale simulations and macroscale landslide observations.  

    Teaser Project (TP) 1: Discrete Element Simulations of Slope Flows

    This project focuses on performing high-fidelity discrete element method (DEM) simulations to capture the failure, flow, and deposition processes in granular slopes under different inclinations. Simulations will be conducted using the open-source LAMMPS software, with GPU acceleration implemented to enhance computational efficiency. The objective is to generate a comprehensive dataset describing particle-scale kinematics, contact forces, and evolving stress fields during slope instability. By varying slope angles and material parameters, the study will investigate the onset of failure, the transition from solid-like to fluid-like flow, and the subsequent deposition patterns. These DEM results will provide detailed micro-mechanical insights and serve as training data for subsequent data-driven model development.

    A GPU-accelerated solver will be developed to optimize LAMMPS for large-scale granular simulations. The solver will utilize domain decomposition and parallel computation to handle millions of particles efficiently. The output data—velocity, stress, strain, and microstructure—will be analysed to characterize flow regime transitions. This will enable formulation of rheological indicators linking particle-scale dynamics to continuum measures of deformation.
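
    The kind of rheological indicator meant here can be sketched briefly (Python/SciPy; for illustration the "measurements" are generated from the mu(I) law itself rather than from LAMMPS output): the inertial number I and the effective friction mu = tau/P are computed, and the mu(I) parameters are recovered by a fit.

    import numpy as np
    from scipy.optimize import curve_fit

    d, rho = 1e-3, 2500.0                          # grain diameter (m), density (kg/m^3)
    shear_rate = np.logspace(-1, 3, 12)            # s^-1, as if measured from DEM
    pressure = 1000.0                              # Pa, confining pressure
    I = shear_rate * d / np.sqrt(pressure / rho)   # inertial number

    def mu_of_I(I, mu_s, mu_2, I0):                # the mu(I) law to be recovered
        return mu_s + (mu_2 - mu_s) / (1.0 + I0 / I)

    rng = np.random.default_rng(0)
    mu_meas = mu_of_I(I, 0.38, 0.64, 0.3) + 0.005 * rng.standard_normal(I.size)

    params, _ = curve_fit(mu_of_I, I, mu_meas, p0=[0.3, 0.7, 0.1])
    print("recovered mu_s, mu_2, I0:", np.round(params, 3))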

    Teaser Project 2: Physics-Informed Machine Learning for Granular Rheology

    The second project will apply physics-informed neural networks (PINNs) or sparse regression techniques to learn rheological models for granular flow where constitutive relations are already known, such as the μ(I) rheology for steady-state flow. The objective is to test and validate the physics-informed learning methodology by comparing the discovered models against the analytical forms of these known relationships.

    Synthetic datasets will be generated using controlled numerical experiments from TP1 to train the PINNs or sparse regression models. By incorporating physical constraints such as positive energy dissipation, the learned models are expected to exhibit improved generalization and interpretability. This project will demonstrate how hybrid modelling can faithfully recover known constitutive relationships while providing a robust foundation for future discovery of new rheological forms from experimental and DEM data.
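    To make the validation idea concrete, the hedged sketch below fits the standard mu(I) law to synthetic, noisy "DEM" data under physically motivated parameter bounds. It is a minimal stand-in for the PINN or sparse-regression machinery described above; all parameter values are illustrative.

```python
# Hedged sketch: validating a learned rheology against the known mu(I) law.
# Synthetic data stand in for TP1 DEM output; parameter values are illustrative.
import numpy as np
from scipy.optimize import curve_fit

def mu_of_I(I, mu_s, mu_2, I0):
    """Standard mu(I) rheology for dense granular flow."""
    return mu_s + (mu_2 - mu_s) / (1.0 + I0 / I)

rng = np.random.default_rng(0)
I = np.logspace(-3, 0, 50)
mu_true = mu_of_I(I, 0.38, 0.64, 0.28)           # "ground truth" parameters
mu_obs = mu_true + rng.normal(0, 0.005, I.size)  # noisy observations

# Bounds encode physical constraints (e.g., mu_2 > mu_s > 0, I0 > 0)
popt, _ = curve_fit(mu_of_I, I, mu_obs, p0=[0.3, 0.7, 0.3],
                    bounds=([0, 0, 0], [1.0, 2.0, 10.0]))
print("recovered (mu_s, mu_2, I0):", np.round(popt, 3))
```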

    Together, these teaser projects establish a methodological framework for subsequent years of research, which will extend toward modelling large-scale landslides by learning with data from both DEM simulations and field observations, coupling learned rheologies with continuum-scale solvers and validating against field data. The integration of physics-based and data-driven models will ultimately enable prediction of landslide initiation and runout with improved accuracy and computational efficiency. 

    References & Further Reading

    1. Iverson, R. M. The physics of debris flows. Reviews of Geophysics 35, 245–296 (1997). 
    2. Forterre, Y. & Pouliquen, O. Flows of Dense Granular Media. Annual Review of Fluid Mechanics 40 (2008). 
    3. Chialvo, S., Sun, J. & Sundaresan, S. Bridging the rheology of granular flows in three regimes. Phys. Rev. E 85, 021305 (2012). 
    4. Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019). 
  • Towards exa-scale simulations of slabs, core-mantle heterogeneities and the geodynamo

    Project institution:
    Project supervisor(s):
    Prof Radostin Simitev (University of Glasgow), Dr Antoniette Greta Grima (University of Glasgow) and Dr Kevin Stratford (University of Edinburgh)

    Overview and Background

    Magnetic field lines in a geodynamo simulation by Silva et al. (2020), using the code of Silva and Simitev (2018).

    Scientific computing is crucial for understanding geophysical fluid flows, such as the geodynamo that sustains Earth’s magnetic field. This project will adapt an existing pseudospectral geodynamo code for magnetohydrodynamic simulations in rotating spherical geometries to GPU architectures, improving efficiency on modern computing systems and enabling simulations of more realistic regimes. This will advance our understanding of Earth’s geomagnetic field and its broader interactions, such as those with mantle heterogeneities.

    Evidence from seismology and geodynamics shows that the core-mantle boundary (CMB) is highly heterogeneous, influencing heat transport and geodynamo dynamics. By combining compressible, thermochemical convection with geodynamo simulations, this project will further investigate how deep slab properties affect the CMB heat flux, mantle heterogeneity, and the geodynamo.

    Methodology and Objectives

    Teaser project 1: What is the impact of ancient slabs on core-mantle boundary heterogeneities and the geodynamo?

    Evidence from seismology and geodynamics reveals that the lowermost mantle and the core–mantle boundary (CMB) are highly heterogeneous due to the presence of post-perovskite, large low-shear-wave-velocity provinces, and ancient, subducted slab material. CMB heterogeneity results in variable heat transport from the core and plays a key role in core and mantle dynamics, the geodynamo, and ultimately the Earth’s habitability. Previous work shows that the spatiotemporal evolution of CMB heterogeneity is closely linked to deep slab dynamics (e.g., Heron et al., 2024, 2025); however, these dynamics remain poorly understood. This teaser project will investigate the role of deep slab properties in the temporal evolution of deep mantle heterogeneity, the CMB heat flux, and the geodynamo. This will involve modelling compressible, multiphase, thermochemical convection in a 3D spherical shell, following the approach of Dannberg et al. (2024) and Heron et al. (2024, 2025), using the state-of-the-art, open-source, adaptive-mesh-refinement finite element software ASPECT (Heister et al., 2017). These models will include the subduction history of the last 1 billion years from Merdith et al. (2021) and will be supported by high-resolution 3D regional models investigating the role of end-member slab properties (e.g., weak vs. strong slabs) on CMB heterogeneity. Temporal variations in CMB heat flux from these models will then be analysed using spherical harmonics across the first 4 harmonic degrees, similar to the approach of Dannberg et al. (2024), and used as a thermal boundary condition for the geodynamo simulations. The goal is to expand teaser project 1 to investigate the influence of the deep slab on core–mantle dynamics and the implications this has for magnetic field generation and the strength and frequency of polarity reversals.
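    As a minimal, hedged illustration of the harmonic analysis step, the Python sketch below projects a synthetic heat-flux field onto spherical harmonics up to degree 4 by quadrature. The grid sizes and the field itself are placeholders, not project data.

```python
# Hedged sketch: projecting a CMB heat-flux field onto spherical harmonics
# up to degree 4, in the spirit of the analysis approach described above.
import numpy as np
from scipy.special import sph_harm  # newer SciPy exposes this as sph_harm_y

nlat, nlon = 90, 180
phi = np.linspace(0, np.pi, nlat)                          # colatitude
theta = np.linspace(0, 2 * np.pi, nlon, endpoint=False)    # longitude
TH, PH = np.meshgrid(theta, phi)

# Synthetic heat-flux anomaly: a degree-2 pattern plus noise
q = np.real(sph_harm(0, 2, TH, PH)) \
    + 0.05 * np.random.default_rng(1).normal(size=TH.shape)

dphi, dtheta = np.pi / nlat, 2 * np.pi / nlon
weights = np.sin(PH) * dphi * dtheta                       # quadrature weights

for l in range(5):                                         # degrees 0 to 4
    power = 0.0
    for m in range(-l, l + 1):
        c_lm = np.sum(q * np.conj(sph_harm(m, l, TH, PH)) * weights)
        power += abs(c_lm) ** 2
    print(f"degree {l}: power = {power:.4f}")
```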

    Teaser Project 1 Objectives:

    • Use global convection models to calculate the temporal evolution of heat flux at the CMB
    • Investigate the influence of end member slab rheologies and geometries on the heat flux heterogeneity at the CMB
    • Apply the calculated heat flux across the CMB from geodynamic models as a boundary condition to geodynamo simulations to investigate heterogeneity in magnetic field strength and the timing and frequency of magnetic field reversals
    • Use GPU architecture to couple finite element mantle convection with geodynamo simulations

    Teaser Project 2: Spectral expansion transforms in spherical geometry

    Modelling the geodynamo involves solving the coupled 3D, time-dependent, nonlinear Navier–Stokes equations, pre-Maxwell electrodynamics, and heat transfer equations for a rotating fluid. At present, the pseudo-spectral method is the most accurate and widely used numerical discretisation method in this context. The method requires applying physical-to-spectral-space transforms, which are generally in integral form and have been difficult to adapt to GPU architectures. With GPUs becoming increasingly powerful and accessible, this sub-project aims to port an existing, versatile pseudo-spectral code for magnetohydrodynamic simulations in rotating spherical geometries to GPU systems.
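    The sketch below shows, in miniature, the kind of integral transform involved: a discrete Legendre transform built from Gauss–Legendre quadrature. On a GPU this reduces to a dense matrix multiply (a CuPy array could replace the NumPy one with the same API). This is an illustrative sketch under those assumptions, not the project code.

```python
# Hedged sketch: a 1D discrete Legendre transform, the integral-form building
# block of pseudo-spectral methods. Resolution N is illustrative.
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

N = 32
x, w = leggauss(N)                     # quadrature nodes and weights on [-1, 1]
f = np.exp(-x**2)                      # sample field in physical space

# Forward transform: c_n = (2n+1)/2 * sum_k w_k P_n(x_k) f(x_k)
P = np.stack([legval(x, np.eye(N)[n]) for n in range(N)])  # P[n, k] = P_n(x_k)
c = (2 * np.arange(N) + 1) / 2 * ((P * w) @ f)

# Inverse transform (synthesis) recovers the field at the nodes
f_rec = legval(x, c)
print("max reconstruction error:", np.max(np.abs(f - f_rec)))
```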

    Teaser Project 2 Objectives:

    • Investigate alternative orthogonal polynomial basis function families that can be used to expand fields in spherical geometry, including Legendre, Jones-Worland, Jacobi and Galerkin.
    • Implement alternatives in and assess/compare convergence, stability and consistency of the resulting discretisations as well as their efficiency for GPU acceleration.

    References & Further Reading

    Dannberg, J., Gassmoeller, R., Thallner, D., LaCombe, F., & Sprain, C. (2023). Changes in core-mantle boundary heat flux patterns throughout the supercontinent cycle. arXiv preprint arXiv:2310.03229.

    Roberts, P.H. & King, E.M. (2013). On the genesis of the Earth’s magnetism. Rep. Prog. Phys., 76, 096801. http://dx.doi.org/10.1088/0034-4885/76/9/096801

    Glatzmaier, G.A. (2014). Introduction to Modeling Convection in Planets and Stars: Magnetic Field, Density Stratification, Rotation. Princeton University Press. https://press.princeton.edu/books/hardcover/9780691141725/introduction-to-modeling-convection-in-planets-and-stars

    Heister, T., Dannberg, J., Gassmöller, R., & Bangerth, W. (2017). High accuracy mantle convection simulation through modern numerical methods – II: Realistic models and problems. Geophysical Journal International, 210(2), 833–851. https://doi.org/10.1093/gji/ggx195

    Heron, P.J., Dannberg, J., Gassmöller, R., Shephard, G.E., & Pysklywec, R. N. (2025). The impact of Pangaean subducted oceans on mantle dynamics: passive piles and the positioning of deep mantle plumes. Gondwana Research.

    Heron, P.J., Gün, E., Shephard, G.E., Dannberg, J., Gassmöller, R., Martin, E., Sharif, A., Pysklywec, R. N., Nance, R.D., & Murphy, J.B. (2025). The role of subduction in the formation of Pangaean oceanic large igneous provinces. Geological Society London, Special Publications, 542(1).

    Merdith, A.S., Williams, S.E., Brune, S., Collins, A.S., & Müller, R.D. (2021). Extending full-plate tectonic models into deep time: linking the Neoproterozoic and the Phanerozoic. Earth-Sci. Rev., 214, 103477. https://doi.org/10.1016/j.earscirev.2020.103477

    Silva, L. & Simitev, R. (2018). Pseudo-spectral code for numerical simulation of nonlinear thermocompositional convection and dynamos in rotating spherical shells. Zenodo, record 1311203. https://doi.org/10.5281/zenodo.1311203

  • When Mountains Meet the Sea: Simulating Landslide-Generated Tsunamis

    Project institution:
    Project supervisor(s):
    Dr Kevin Stratford (University of Edinburgh), Dr Eric Breard (University of Edinburgh), Prof Jin Sun (University of Glasgow) and Dr Arianna Gea Pagano (University of Glasgow)
    Three-phase flow simulation of a gas–particle granular collapse into water, performed with a CPU-based DEM–CFD–VoF solver. The method is limited to small-scale cases because of its high computational cost. (Image: Breard and Desjardins)

    Overview and Background

    Climate change is having tangible consequences for our environment. As glaciers retreat and rainfall intensifies, the frequency and scale of landslides and debris flows are rising worldwide. Their destructive power extends beyond the areas where these phenomena initiate: unstable masses may travel several kilometres, depending on the evolving characteristics of the solids involved and their interaction with water, and when masses plunge into large water bodies, they can unleash catastrophic tsunamis. Yet our understanding and prediction of granular flows are limited by knowledge gaps in how fluids and solids interact. Key challenges include unravelling how grain shape and breakage affect flow mobility, and how mass, momentum, and energy are transferred during violent impacts. Using the GPU-accelerated multiphase solver MFIX-Exa, this project will pioneer next-generation simulations and physics-based laws to transform landslide hazard forecasts in a changing climate. 

    Methodology and Objectives

    Teaser Project 1: When Earth Hits Water — Geophysical Flows Triggering Tsunamis 

    Objective: During the initial six-month project, the focus will be on developing and validating a simplified GPU-accelerated Volume of Fluid (VoF) module within the MFIX-Exa framework to represent two-phase air–water interactions during a solid-body impact. This will establish the numerical infrastructure and performance benchmarks needed to later include granular particles. Using canonical test cases (e.g., water entry of wedges, deformable intrusions), we will assess how impact geometry and velocity control wave generation and energy transfer. 

    This short-term work will lay the foundation for the full PhD, which will extend the solver to fully three-phase (solid–gas–liquid) conditions, include granular rheology and pore-pressure coupling, and simulate natural examples such as pyroclastic flows entering the sea. The long-term goal is to derive physics-based coupling laws that can inform exascale tsunami forecasting and hazard models. 

    Teaser Project 2: Evolving Grain Size and Shape in Geophysical Granular Flows 

    Objective: The first six months will focus on implementing and testing a basic bonded-sphere representation of irregular grains in MFIX-Exa, without breakage. The aim is to quantify how initial particle shape (aspect ratio, angularity) modifies packing density and stress transmission under simple shear. Benchmark simulations and comparisons with existing experimental datasets will be used to verify the new contact model and establish computational efficiency on GPUs. 

    This work forms the foundation for a PhD that would progressively incorporate fracture and breakage physics, enabling the grain size and shape to evolve dynamically. Later stages would explore how fragmentation alters permeability, pore-fluid pressure response, and bulk rheology in flows such as landslides, debris avalanches, and pyroclastic currents, ultimately yielding improved continuum closures for natural-hazard prediction. 
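    As a small, hedged illustration of the bonded-sphere (glued-sphere) representation, the sketch below computes the volume and aspect ratio of a chain-of-spheres grain, subtracting the lens-shaped overlaps between neighbouring spheres. The radii and spacing are illustrative assumptions.

```python
# Hedged sketch: volume and aspect ratio of a "glued-sphere" grain of the kind
# TP2 would implement in MFIX-Exa. All numbers are illustrative.
import numpy as np

r = 1.0e-3            # constituent sphere radius (m)
n = 4                 # spheres per grain, glued along a line
d = 1.2 * r           # centre spacing (< 2r so neighbours overlap; > r so
                      # non-adjacent spheres, at distance >= 2.4r, do not)

v_sphere = 4 / 3 * np.pi * r**3
# Lens (intersection) volume of two equal spheres whose centres are d apart
v_lens = np.pi * (4 * r + d) * (2 * r - d) ** 2 / 12

# Union volume of the chain: subtract the pairwise overlaps of neighbours
v_grain = n * v_sphere - (n - 1) * v_lens

length = 2 * r + (n - 1) * d                   # end-to-end grain length
aspect_ratio = length / (2 * r)
print(f"grain volume = {v_grain:.3e} m^3, aspect ratio = {aspect_ratio:.2f}")
```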

    References & Further Reading

    Svennevig, Kristian, et al. “A rockslide-generated tsunami in a Greenland fjord rang Earth for 9 days.” Science 385.6714 (2024): 1196-1205. 

    https://www.exascaleproject.org/research-project/mfix-exa/ 

    https://github.com/NREL/BDEM.git 

    Musser, J., Almgren, A. S., Fullmer, W. D., Antepara, O., Bell, J. B., Blaschke, J., … & Syamlal, M. (2022). MFIX-Exa: A path toward exascale CFD-DEM simulations. The International Journal of High Performance Computing Applications, 36(1), 40-58. 

    Lu, L., Gao, X., Shahnam, M., & Rogers, W. A. (2021). Simulations of biomass pyrolysis using glued-sphere CFD-DEM with 3-D intra-particle models. Chemical Engineering Journal, 419, 129564. 

 

Projects with a focus on Geologic Hazard Analysis, Prediction and Digital Twinning:

 

  • Are we learning the weather right? Climate-based flood catastrophe analysis using AI and exascale compute

    Project institution:
    Project supervisor(s):
    Prof Peter M. Atkinson (Lancaster University), Dr Carolina Euan (Lancaster University), Prof Simon Tett (University of Edinburgh), Prof Rob Lamb (Lancaster University), Prof Paul Young (JBA Risk Management) and Dr Niall Robinson (NVIDIA)

    Overview and Background

    Machine learning (ML) lets us generate large ensemble simulations of extreme weather events quickly and efficiently. This can help assess the risk of floods, which constitute almost half of the world’s weather-related disasters. Rather than being governed by physical laws, ML models are determined by their training data, which are at coarser resolutions than flood impact models require. Scale transformations through the climate-to-impacts processing pipeline add uncertainties and highlight the challenge of representing extreme events. Given these challenges, are ML models realistic enough for vital applications in disaster risk reduction in present and future climates? You will address this with state-of-the-art global ML models and key industry partners JBA and NVIDIA, testing the physical and statistical fidelity of extreme weather simulations for flood and climate risk. 

    Methodology and Objectives

    This PhD will use statistical machine learning to interrogate the data produced by an AI-driven weather simulation that generates large ensembles of synthetic events for flood impact analysis under different climate conditions. The full pipeline, which has global capability, uses NVIDIA’s ‘Earth-2’ ML platform to transform (i) a combination of 75 single- and pressure-level atmospheric variables at 0.25° resolution into (ii) precipitation fields, which are aggregated over river basins to (iii) drive hydrological models of flood flows at points spaced between ~1 and ~10 km apart on the river network, and finally (iv) flood impact analysis over variable-resolution spatial grids. Despite this cutting-edge risk analysis capability, there are important uncertainties and, hence, potential to improve the processing chain through increased understanding of the character of the various data layers and their inter-relations, specifically measurement processes, scaling processes, the statistical validity of distributions for extremes, and semantic interpretations and their fit to real-world phenomena. This is crucial since misrepresented processes and states can lead, for example, to prediction biases or omissions of consequential key events, which, in the global flood impact context, can mean loss of life or assets.  

    The PhD will develop and use a range of statistical and machine learning methods with which to interrogate the data in the climate-to-flood risk processing chain and suggest improvements in data representation. The student will work with JBA with input from NVIDIA to fully understand the Earth-2-platform. The PhD will then employ a combination of methods, including latent Gaussian processes (GPs) within a Bayesian inference framework to diagnose data support (that is, the spatial and temporal scales at which quantities are measured or modelled) and scaling relationships; machine learning approaches to characterise and represent dependencies between datasets; and natural language processing (NLP) coupled with ontological hierarchies to explore the meaning of, and relationships between, data layers. The representation of key atmospheric processes and large-scale phenomena – such as storm development, blocking, and energy balance – will be considered to ensure that the statistical and machine-learning analyses remain grounded in physical realism. 

    You will work with data and tools from the Earth-2 suite – the SFNO large ensembles model (Mahesh et al.), downscaling (Mardani et al.), and AFNO ‘diagnostic’ precipitation model (Kurth et al.) – alongside finer scale data, potentially including CEH-GEAR 1km UK rainfall, rain gauges, river flow gauges, and high-resolution flood maps produced by JBA. 

    Teaser Project 1: Data sampling, measurement processes and scale transformation to enhance realism of weather and flood risk simulations 

    Flood risk analysis depends crucially on the earlier parts of the processing chain that transform climate ensembles at coarse resolution into reliable flood simulations, and on their interpretation relative to asset and population distributions. You will use the power of latent GP models, coupled with GPU-accelerated compute that scales to the exascale, to disentangle measurement processes (such as spatial discretisation, data model, spatial support, and uncertainty) from the signal of interest, a key requirement for improved interoperability between datasets. You will focus on climate ensembles and their transformation through to precipitation fields and flood extents and impacts, using GPs to diagnose the measurement processes at each stage, augmented with machine learning to represent the transforms and enable their improvement. This will improve interoperability between data layers and reduce artefactual loss of information, including on extremes, during scaling and transformation.  

    Specific goals include: 

    • Specification of the practical measurement support for each data layer in the processing chain, allowing its exploitation in subsequent processing.
    • Increased understanding of why extreme events are under-represented and how to reinstate lost structure so that they can be recovered more completely.  
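    As a minimal illustration of the latent-GP change-of-support idea in Teaser Project 1, the hedged sketch below places a GP prior on a fine-scale field, observes only noisy block averages, and recovers the fine scale via the linear-Gaussian posterior. All data are synthetic and the model is deliberately simplified.

```python
# Hedged sketch: a linear-Gaussian "change of support" calculation. A fine-scale
# latent field f has a GP prior; observations y are coarse block averages
# (y = A f + noise). The posterior mean downscales y back to the fine grid.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 100)                     # fine grid

# Squared-exponential GP prior covariance on the fine grid
K = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 0.05**2)
f = np.linalg.cholesky(K + 1e-9 * np.eye(100)) @ rng.normal(size=100)  # truth

# Averaging operator: 10 coarse cells, each averaging 10 fine cells
A = np.kron(np.eye(10), np.full((1, 10), 0.1))
y = A @ f + rng.normal(0, 0.01, size=10)       # coarse, noisy observations

# GP posterior mean for f given the block-averaged data
S = A @ K @ A.T + 0.01**2 * np.eye(10)
f_post = K @ A.T @ np.linalg.solve(S, y)
print("fine-scale RMSE:", np.sqrt(np.mean((f_post - f)**2)))
```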

    Teaser Project 2: Extracting representations from weather simulations to test against physical phenomena 

    Flood risk analysis depends on the feeding of ensemble datasets through a processing chain to decision-makers who are responsible for interpreting the risk information provided by the Earth-2 platform. You will take the climate ensemble datasets ‘as read’ and specify and fit models to characterise the information, and to minimise the losses of information that occur, in the processing chain. You will use the tools of ML and large language models (LLMs) to learn latent processes and represent optimal data-interoperability pathways that are both internally ‘coherent’ and maximise the utility of the information sets provided. This may require interaction with decision-makers, facilitated through JBA, and inversion of the forward chain through a learning framework (e.g., reinforcement learning to optimise data transformations for maximal information retention).    

    Specific goals include: 

    • Greater understanding of the semantic meaning within datasets that are of most interest to decision-makers.  
    • More useful characterisations of the data (e.g., extremes) that can be inverted to optimise the parameters in the processing chain.  

     

    References & Further Reading

    Context and links to technical details of the modelling stack: 

    Ashcroft, J. et al., 2025, AI weather forecasting: Can it reveal unseen flood risks? JBA Risk Management blog. 

    Descriptions of the underlying ML models and example methods: 

    Chácon-Montalván, E.A., Atkinson, P.M., Nemeth, C., Taylor, B.M., Moraga, P., 2024, Spatial latent Gaussian modelling with change of support, arXiv, https://arxiv.org/abs/2403.08514. 

    Kurth, T. et al., 2023, FourCastNet: accelerating global high-resolution weather forecasting using adaptive Fourier neural operators. In Platform for Advanced Scientific Computing Conference (PASC ’23), 2023, Davos, Switzerland. ACM, https://doi.org/10.1145/3592979.3593412. 

    Mahesh, A. et al., 2024, Huge ensembles Part I: design of ensemble weather forecasts using spherical Fourier neural operators, arXiv, https://doi.org/10.48550/arXiv.2408.03100.

    Mardani, M. et al., 2025, Residual corrective diffusion modeling for km-scale atmospheric downscaling. Commun. Earth Environ. 6, 124. https://doi.org/10.1038/s43247-025-02042-5. 

  • Developing large-scale hydrodynamic flood forecasting models for exascale GPU systems

    Project institution:
    Project supervisor(s):
    Dr Mark Bull (University of Edinburgh), Dr Maggie Creed (University of Glasgow), Prof Simon Mudd (University of Edinburgh) and Dr Declan Valters (British Geological Survey)

    Overview and Background

    Flood forecasting at regional and national scales is imperative for predicting the scale and distribution of floodwaters during extreme weather events and for mitigating the impact on the communities most at risk from flooding. The LISFLOOD family of surface water models has proved amenable to parallelisation at scale, allowing research and forecasting communities to take advantage of the previous generation of supercomputers, such as ARCHER.

    The increasing availability of high-resolution topographic and meteorological data provides an opportunity to extend the capability of the LISFLOOD modelling framework to produce large-scale or high-resolution flood forecasts at operational timescales, i.e., producing model runs with sufficient lead times to alert communities to impending flood risk from forecast extreme weather events. GPU-based exascale HPC systems provide the technological basis to develop forecast models delivering at operational timescales.

    Methodology and Objectives

    LISFLOOD is a family of hydrological models based on a 2D grid simulating rainfall runoff. The water routing across a flood basin/river catchment is based on a simplified version of the shallow water (St Venant) equations. The model is process (physics) based, and there have been several implementations (see below), usually in C or C++, using a cellular automaton approach. These have been parallelised for CPU using OpenMP and, in one spin-off project, MPI (see https://web.jrc.ec.europa.eu/policy-model-inventory/explore/models/model-lisflood/). 
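    For orientation, the hedged sketch below performs one-dimensional updates of the simplified shallow-water scheme with semi-implicit friction used by LISFLOOD-FP-type codes. The real models are 2D and implemented in C or C++; the grid, roughness, and depth values here are illustrative only.

```python
# Hedged sketch: explicit updates of an inertial-formulation shallow-water
# scheme in 1D. All values are illustrative, not model settings.
import numpy as np

dx, dt, g, n_mann = 10.0, 1.0, 9.81, 0.03
z = np.linspace(1.0, 0.0, 50)        # bed elevation (m), sloping channel
h = np.full(50, 0.1)                 # water depth (m)
q = np.zeros(49)                     # unit discharge at cell interfaces (m^2/s)

for _ in range(100):
    eta = z + h                                      # free-surface elevation
    h_flow = np.maximum(np.maximum(eta[:-1], eta[1:])
                        - np.maximum(z[:-1], z[1:]), 1e-6)
    slope = (eta[1:] - eta[:-1]) / dx
    # Momentum update with semi-implicit Manning friction
    q = (q - g * h_flow * dt * slope) / \
        (1.0 + g * dt * n_mann**2 * np.abs(q) / h_flow**(7 / 3))
    h[1:-1] += dt * (q[:-1] - q[1:]) / dx            # mass conservation

print("final mean depth:", h.mean())
```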

    The stencil-code library used in the previous CSE project, LibGeoDecomp, purports to have support for NVIDIA GPUs and CUDA. https://github.com/STEllAR-GROUP/libgeodecomp 

    Teaser Project 1 Objectives:

    • Implement the hydrodynamic core of the LISFLOOD model on GPU hardware to demonstrate proof-of-concept that the current CPU parallelised code is portable to GPU hardware. 
    • Methods for GPU parallelisation would include OpenMP offloading as an initial approach to verify the proof of concept. The project could then be extended to investigate the CUDA bindings available in the LibGeoDecomp library.
    • Profiling of the GPU ported code and identification of optimisation strategies. 

    Teaser Project 2 Objectives:

    • Gather requirements for the full workflow from data acquisition to forecast product dissemination.  
    • Prototype the workflow for small scale models and conduct experiments to identify potential bottlenecks when scaling up to full resolution.

    Development into a full PhD would involve further profiling and optimisation of the GPU code using either the LibGeoDecomp library or another suitable GPU parallelisation framework. Delivering a proof of concept for a working flood forecast model at regional scale would be a key aim of this project, demonstrating the potential for use in operational flood forecasting systems. The full PhD may therefore look at workflow tools to integrate the various stages of forecast production, such as ingestion and pre-processing of data (e.g., from rainfall forecast/nowcasting data products), model scheduling on HPC systems, and post-processing of the outputs. 

    Rendering of total flood induced erosion and deposition of riverbed material during a flash flood event in the Rye catchment, North Yorkshire, UK, used as a case-study when testing the development of the LISFLOOD model. Image source: Declan Valters, British Geological Survey

    References and Further Reading

    LISFLOOD model high-level overview: https://web.jrc.ec.europa.eu/policy-model-inventory/explore/models/model-lisflood/  

    Stencil Code for LibGeoDecomp: https://github.com/STEllAR-GROUP/libgeodecomp 

    Open Source version of the C++ code developed by Declan Valters: https://github.com/dvalters/HAIL-CAESAR 

    Overview of an earlier project that developed an experimental version of the code for multi (CPU) node using stencil code: http://www.archer.ac.uk/training/virtual/2019-12-04-lisflood/lisflood.pdf 

    Reference for the hydrodynamic model core: Coulthard, T.J., Neal, J.C., Bates, P.D., Ramirez, J., de Almeida, G.A. and Hancock, G.R., 2013. Integrating the LISFLOOD-FP 2D hydrodynamic model with the CAESAR model: implications for modelling landscape evolution. Earth Surface Processes and Landforms, 38(15), pp.1897-1906. 

  • Development of landscape evolution models and monitoring in anthropogenically influenced tropical regions

    Project institution:
    Project supervisor(s):
    Dr Amanda Owen (University of Glasgow), Dr Paul R Eizenhöfer (University of Glasgow) and Dr Mark Bull (University of Edinburgh)

    Overview and Background

    The Philippines is one of the most densely populated and economically fastest-growing regions in Southeast Asia, yet it constantly faces the threat of natural hazards from strong climatic (e.g., monsoon) and geological (e.g., earthquake and volcanic) forces, in parallel with anthropogenic alteration of the natural landscape (e.g., mining and river management). Any signals emerging from these will be communicated across the landscape primarily through its fluvial systems.  

    This project aims to develop novel live-monitoring techniques that bridge process-based modelling of the fluvial and anthropogenic systems with AI-driven data analysis to model fluvial morphology and change across Luzon. These efforts will provide state-of-the-art computational tools to inform policy makers in the Philippines to better adapt to climate change.

    Methodology and Objectives

    Digital shadow of landscape evolution of Luzon (Teaser Project 1); AI-driven prediction of geomorphic change in the Philippines (Teaser Project 2) 

    Methods Used: landscape evolution modelling and AI-enhanced remote sensing. The surface process models employed in this effort will reach unprecedented metre-scale resolution, and hence will have data volume throughputs at the terabyte level. This requires GPU capacity that takes advantage of the latest flow routing routines, adapted to parallelised computational architectures, reducing model completion times by up to two orders of magnitude relative to the current generation of landscape evolution models. This enables the use of more efficient inversion schemes and lays the groundwork for potential exascale applications in the future. 

    Teaser Project 1 Objectives: Landscape evolution models have been essential in developing our insights into the interconnectivity of Earth systems and surface processes. However, humans now have an unprecedented impact on our landscapes, moving vast quantities of material across the Earth’s surface (e.g., through mining). Teaser Project 1 will establish the necessary boundary conditions to produce a mid-resolution (500 m/cell) representation of the Quaternary landscape evolution of the island of Luzon. A particular focus will be placed on predicting, to first order, the major present-day fluvial erosional and depositional systems, including offshore basin stratigraphic records, in tropical environments. Novel ensemble Kalman inversion schemes that leverage the GPU capacity built in the project will be employed, together with parallelised flow routing routines, to achieve computationally efficient, fast parameter convergence towards present-day geomorphological conditions.     
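    To illustrate the ensemble Kalman inversion loop in miniature, the hedged sketch below calibrates a toy two-parameter forward model. In the project, the forward model G would be a GPU-accelerated landscape evolution run; all names and values here are placeholders.

```python
# Hedged sketch: iterative ensemble Kalman inversion (EKI) for model
# parameters, with a cheap stand-in forward model. Values are illustrative.
import numpy as np

rng = np.random.default_rng(4)

def G(theta):
    """Toy forward model mapping parameters to 'observed' elevations."""
    x = np.linspace(0, 1, 20)
    return theta[0] * np.exp(-theta[1] * x)

theta_true = np.array([2.0, 3.0])
y_obs = G(theta_true) + rng.normal(0, 0.05, 20)
Gamma = 0.05**2 * np.eye(20)                     # observation noise covariance

ensemble = rng.normal([1.0, 1.0], 0.5, size=(100, 2))   # prior ensemble
for _ in range(10):                                      # EKI iterations
    preds = np.array([G(t) for t in ensemble])
    t_mean, p_mean = ensemble.mean(0), preds.mean(0)
    C_tp = (ensemble - t_mean).T @ (preds - p_mean) / (len(ensemble) - 1)
    C_pp = (preds - p_mean).T @ (preds - p_mean) / (len(ensemble) - 1)
    K = C_tp @ np.linalg.inv(C_pp + Gamma)               # Kalman gain
    perturbed = y_obs + rng.multivariate_normal(np.zeros(20), Gamma, 100)
    ensemble = ensemble + (perturbed - preds) @ K.T

print("posterior mean parameters:", ensemble.mean(0).round(2))
```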

    Teaser Project 2 Objectives: A key challenge when working with remote sensing (satellite) datasets is the availability of ‘cloud-free’ images for high-temporal-resolution analysis of geomorphic change in river catchments. As a result, analyses of geomorphic change are biased towards regions with climates more conducive to cloud-free weather. Teaser Project 2 will employ an AI approach to fill spatial and temporal gaps in satellite remote sensing data caused by dense cloud coverage. The Philippines has been selected as the location to develop the method because of the dense cloud coverage associated with its tropical monsoonal climate, common in regions near the equator. This AI-curated dataset will then be used to train artificial neural networks (ANNs) to make near-future (up to 500 yr) geomorphological predictions for river catchments in the Philippine archipelago. GPU capacity will be essential for training the models.
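    As a baseline for the AI gap-filling idea, the hedged sketch below cloud-masks a synthetic per-pixel NDVI time series and fills the gaps by simple temporal interpolation, the kind of baseline a learned in-painting model would need to beat. The data and cloud fraction are illustrative.

```python
# Hedged sketch: filling cloud-masked gaps in a per-pixel NDVI time series
# by linear temporal interpolation. Data are synthetic and illustrative.
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(0, 365, 5)                           # acquisition days
ndvi = 0.5 + 0.3 * np.sin(2 * np.pi * t / 365)     # seasonal vegetation signal
cloudy = rng.random(t.size) < 0.6                  # ~60% cloud cover (tropics)

observed = np.where(cloudy, np.nan, ndvi)
valid = ~np.isnan(observed)
filled = np.interp(t, t[valid], observed[valid])   # gap-filling baseline

print(f"kept {valid.sum()}/{t.size} scenes; max fill error = "
      f"{np.max(np.abs(filled - ndvi)):.3f}")
```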

    References & Further Reading

    Lachowycz, S. (2024). Utility of artificial intelligence in geoscience. Nature Geoscience, 17(10), 953-955. 

    Braun, J., & Deal, E. (2023). Implicit algorithm for threshold stream power incision model. Journal of Geophysical Research: Earth Surface, 128(10), e2023JF007140. 

    Evensen, G. (2009). The ensemble Kalman filter for combined state and parameter estimation. IEEE Control Systems Magazine, 29(3), 83-104. 

  • Earth system twin for coastal erosion in the UK

    Project institution:
    Project supervisor(s):
    Dr Zhiwei Gao (University of Glasgow), Dr Christos Anagnostopoulos (University of Glasgow), Dr Martin Hurst (University of Glasgow) and Dr Hassan Al-Budairi (QTS Group Ltd)

    Overview and Background

    Coastal erosion is one of the UK’s most pressing environmental challenges, threatening homes, critical transportation infrastructure, and natural habitats. Around 1,800 km of coastline, nearly 30% of the total, is actively eroding. Along some stretches, such as the Holderness coast in East Yorkshire, the coastline is retreating at rates exceeding 2 metres per year. In Happisburgh, Norfolk, approximately 35 homes have been lost to erosion over the past 20 years. Climate change is expected to worsen these risks, with UK sea levels projected to rise by up to 0.8 m by 2100. Under this scenario the economic costs are substantial, with damages and adaptation needs estimated at over £1 billion. Coastal communities, transport infrastructure, and iconic landscapes face increasing vulnerability, making erosion not just a scientific concern but a social and economic priority for the UK. 

    Methodology and Objectives

    Methods Used: We aim to develop an AI-driven digital twin for coastal erosion in the UK, creating a high-fidelity virtual representation of coastal change that integrates past observations with predictive modelling. With this digital twin, we aim to learn and capture both the historical record of erosion and plausible future scenarios, providing a powerful tool for scientific research, policymaking, infrastructure planning and construction, and community resilience. The foundation of the digital twin will be comprehensive data covering the past 30 years. We will draw primarily on the Copernicus programme and Sentinel satellite archives, which provide consistent Earth observation data at regional to national scales. These datasets allow detailed tracking of shoreline position, cliff retreat, and beach morphology. When combined with local aerial imagery, LiDAR surveys, and historical maps, the digital twin will offer a unique, multi-decadal record of erosion across the UK coastline. 

    Our core innovation lies in the application of machine learning and deep learning image processing techniques to extract and interpret this wealth of data. We will design algorithms for the automatic detection of coastlines, enabling the rapid identification of shoreline positions from large archives of satellite images. The results will be presented through an interactive digital map, which will be publicly accessible. In addition to documenting the past, the digital twin will incorporate predictive and explainability capabilities. A separate machine learning framework, informed by the physical processes that drive erosion, will be developed to forecast future shoreline change with uncertainty quantification. This model will use past erosion trends, climate change, storm frequency, and projected sea-level rise as inputs. Embedding the physics will make the predictive model interpretable.  

    The result will be a decision-support platform that combines interactive visualisation with predictive insight and analyses. The digital twin for coastal erosion will represent not only a scientific advance but also a practical resource for building resilience to climate change and safeguarding vulnerable coastlines. 

    Teaser Project 1 Objectives: Automatic coastline detection using machine learning. In this teaser project, we will develop a machine learning model for automatic detection of coastlines based on the recent work using VedgeSat (Muir et al., 2024), which uses vegetation edges derived from Sentinel data to identify coastline positions. The VedgeSat algorithm will be improved, trained and validated on a labelled dataset of UK coastlines. The outputs will be processed into a continuous time series of coastal change, visualised through an interactive digital map similar to the DEA Coastlines platform (https://maps.dea.ga.gov.au/story/DEACoastlines).  
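    A minimal sketch of the vegetation-edge idea behind VedgeSat-style detection is given below: compute NDVI from red and near-infrared bands and locate, per image row, the first vegetated pixel. Real workflows use georeferenced Sentinel-2 scenes and subpixel refinement; the arrays here are synthetic and the threshold is an illustrative assumption.

```python
# Hedged sketch: NDVI-based vegetation-edge extraction on a synthetic scene.
import numpy as np

rng = np.random.default_rng(6)
ny, nx = 100, 200
sea_width = np.linspace(60, 90, ny).astype(int)    # synthetic diagonal shoreline

nir = np.full((ny, nx), 0.4) + 0.02 * rng.normal(size=(ny, nx))
red = np.full((ny, nx), 0.1) + 0.02 * rng.normal(size=(ny, nx))
for i, w in enumerate(sea_width):                  # water: low NIR reflectance
    nir[i, :w], red[i, :w] = 0.05, 0.08

ndvi = (nir - red) / (nir + red + 1e-9)
edge_cols = np.argmax(ndvi > 0.2, axis=1)          # first vegetated pixel per row
print("detected edge columns (first 5 rows):", edge_cols[:5])
```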

    Teaser Project 2 Objectives: Coastal erosion prediction using machine learning. We will develop a machine learning model for coastal erosion prediction in the UK, integrating both observational datasets and the underlying physics, including climate and geological conditions and extreme weather events. The model will draw on records from the Copernicus Data Space Ecosystem (https://dataspace.copernicus.eu/explore-data) over the past 30 years. The model will be trained, tested, and improved at a site near Aberdeen, which has been studied by Dr Martin Hurst and his team for many years. 

    Each teaser project will contribute to one aspect of the bigger PhD project, in which a digital twin for coastal erosion in the UK will be established. Within this digital twin, we will have the history of past coastal erosion (Teaser Project 1) and a machine learning model for predicting future changes (Teaser Project 2).  

    References & Further Reading

    • Muir, F.M., Hurst, M.D., Richardson-Foulger, L., Rennie, A.F., & Naylor, L.A. (2024). VedgeSat: An automated, open-source toolkit for coastal change monitoring using satellite-derived vegetation edges. Earth Surface Processes and Landforms, 49(8), 2405-2423. 
    • Ju, L. Y., Xiao, T., He, J., Xu, W. F., Xiao, S. H., & Zhang, L. M. (2025). A simulation-enabled slope digital twin for real-time assessment of rain-induced landslides. Engineering Geology, 108116. 
    • Forstenhäusler, M., Külzer, D., Anagnostopoulos, C., Parambath, S., & Weber, N. (2025). STaRFormer: Semi-supervised task-informed representation learning via dynamic attention-based regional masking for sequential data. The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025), Dec 2-7, San Diego, US. https://star-former.github.io/ 
  • Exascale modelling for resilient woodland expansion

    Project institution:
    Project supervisor(s):
    Prof Andrew Baggaley (Lancaster University), Prof Jason Matthiopoulos (University of Glasgow), Prof Peter M. Atkinson (Lancaster University), Dr Nathan Brown (Forest Research), Dr Suzanne Robinson (Forest Research) and Annabel Narayanan (Action Oak)
    Observations of Oak Processionary Moth (a key invasive plant pest), coloured by observation year, overlaid on a map of Greater London.

    Overview and Background

    The UK has set ambitious targets to expand woodland cover to nearly 20% by 2050, recognising the multiple benefits of forests for carbon storage, biodiversity, and climate resilience. Delivering these targets requires more than simply planting trees: new and existing woodlands must be resilient to climate stress and safeguarded against invasive pests and pathogens. At the same time, globalisation and climate change are accelerating the spread of invasive species, threatening the long-term success of woodland expansion strategies. This project will develop next-generation, GPU-accelerated environmental models that couple tree growth dynamics, pest spread, and climate forcing within an Earth system framework. By integrating large-scale observation data, the project will generate fine-grained risk forecasts to inform national woodland policy and sustainable land-use planning. 

    Methodology and Objectives

    The project will develop a new open-source spatial modelling framework that couples a biological growth model for tree stands with a spatially explicit SIR-type epidemiological model for invasive pests and pathogens, using a fully differentiable approach. Specifically, we will adapt existing 3PG models (to represent stand-level woodland growth) into differentiable, GPU-based code. This model accounts for temperature, drought, CO₂, and management. At present, however, a major limitation is explicitly linking canopy openness and stand structure to pest and disease vulnerability. Addressing this shortcoming is one of the main aims of the project. 

    Climate projections will provide environmental forcing, while observation data from project partners at Forest Research will constrain key model parameters in both the 3PG and SIR models and their coupling. Data integration will account for changes in resolution and modality, enabling high-resolution simulations of coupled pest–climate–tree interactions at scales relevant for national policy assessments. Differentiability allows the direct use of gradient-based optimisation and Hamiltonian Monte Carlo (HMC), making it feasible to perform efficient parameter calibration and full Bayesian uncertainty quantification in a high-dimensional, spatially explicit model. It also facilitates the development of risk forecasts and scenario analyses for woodland expansion strategies.  
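    To show why differentiability matters, the hedged JAX sketch below implements a toy logistic stand-growth model and obtains exact gradients of a calibration loss with respect to its parameters, which is what enables gradient-based optimisation and HMC at scale. The model is a placeholder, not the 3PG equations.

```python
# Hedged sketch: a differentiable toy growth model in JAX. Exact parameter
# gradients come "for free" from autodiff; values are illustrative.
import jax
import jax.numpy as jnp

def simulate(params, n_years=50):
    r, K = params                       # growth rate, carrying capacity
    def step(biomass, _):
        biomass = biomass + r * biomass * (1 - biomass / K)
        return biomass, biomass
    _, trajectory = jax.lax.scan(step, 1.0, None, length=n_years)
    return trajectory

def loss(params, obs):
    return jnp.mean((simulate(params) - obs) ** 2)

obs = simulate(jnp.array([0.3, 120.0]))           # synthetic "observations"
grad_fn = jax.grad(loss)
print(grad_fn(jnp.array([0.25, 100.0]), obs))     # exact dLoss/d(r, K)
```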

    Teaser Project 1 Objectives: Woodland Expansion and Policy Support 

    This project strand focuses on developing and then applying the coupled pest-climate-tree code to key case studies of woodland management and expansion in the UK.  

    The core goal of TP1 is to fully rewrite the existing 3PG model in a vectorised, GPU-compatible format with automatic differentiation for gradient computation through the physiological sub-models. This will then be coupled to the existing pest/disease code, using profiling tools to ensure computational efficiency and scalability. Finally, we will incorporate parameter inference using HMC and train the model on pre-existing datasets for representative tree–pest systems under current and future climates. Key focuses will be a) Sitka spruce and the Elatobium aphid (using aphid monitoring data) and b) oak and multiple pests and diseases (e.g. mildew, Acute Oak Decline, and Oak Processionary Moth).  

    We will then be able to: 

    • Quantify the vulnerability of new and existing woodlands to pest/pathogen outbreaks under climate stress. 
    • Identify data-informed planting strategies that balance carbon storage, biodiversity, and woodland resistance. 
    • Generate fine-resolution maps of pest risk and climate suitability to guide woodland expansion planning at national and regional levels.

    This project offers direct policy relevance, producing modelling tools to support expansion strategies that are resilient to compound risks.

    Teaser Project 2 Objectives: Digital twinning 

    This project strand emphasises methodological advances in GPU-accelerated modelling and remote sensing technology to create a digital twin of woodland expansion. The objective is to integrate real time data streams from observations and sensors with different resolutions and modalities with the pest–climate–tree model, enabling real-time or near-real-time assimilation of key data sources, including citizen science initiatives. Specific goals include: 

    • Scaling the software developed in TP1 to harness exascale facilities for high-resolution environmental modelling. 
    • Coupling pest–climate–tree interactions to real-time multi-modal observation streams. 
    • Integrating GPU-accelerated inference schemes to constrain free parameters. 
    • Developing an accessible user interface to maximise the impact of the project. 

    References & Further Reading

    [1] J.J. Landsberg, R.H. Waring, Forest Ecology and Management, 95, 3 (1997) 
    [2] K. Reed, et al. Forest Ecology and Management, 476, 118441 (2020) 

  • Extreme weather event impacts from global warming in the UK

    Project institution:
    Project supervisor(s):
    Prof Todd Ehlers (University of Glasgow), Dr Jingtao Lai (University of Glasgow), Dr Adam Smith (University of Glasgow) and Prof Michèle Weiland (University of Edinburgh)

    Overview and Background

    Global climate and environmental change increasingly result in weather extremes that impact nature and society. These extremes include stormier climates with high-intensity precipitation events, drought, flooding, heat waves, and increased wind speeds. Such events have large economic impacts on the UK and are expected to increase in frequency in the future. Extreme weather events have diverse societal impacts, including disruption of transportation and infrastructure (e.g., energy, telecommunications), agricultural and natural ecosystem loss, and loss of human life. The primary aim of this project is to develop GPU-based, exascale-oriented software for real-time analysis of how extreme weather events impact natural systems (e.g., via floods and landslides, heat waves, wind storms, and ecosystem disturbance) and then propagate into society. You’ll start by investigating these interactions for Scotland, and then extend to the broader UK.

    You will work with a team of dynamic researchers at the University of Glasgow and the EPCC supercomputing centre to develop a new exascale-oriented application for daily to weekly prediction and analysis of extreme weather event impacts. In addition, the software you develop will form the core platform for linking with other ExaGEO projects developed by your student peers.

    The University of Glasgow and EPCC at the University of Edinburgh are in close proximity, with a regular train service between them, so you’ll have ample opportunity to interact across both institutions.

    Methodology and Objectives

    Methods Used:

    As part of this project, you will learn skills for linking weather forecast and satellite data streams onto high-resolution topography to investigate how natural systems respond to extreme forcings. From this, you will learn how to develop routines to identify areas at high risk of damage associated with extreme events. You will do this by learning about precipitation distributions related to floods and landslides, and about how temperature and wind extremes impact agriculture and natural systems. As part of these tasks, you’ll learn about the physics of surface processes and how surface water (e.g., rivers) and hillslopes evolve in response to extreme weather.

    The primary programming skills you will learn (in either Julia or Fortran) include the decomposition and parallelisation of large domains and physics-based processes onto GPU and exascale architectures, as well as how to integrate large data streams into process-based models. You’ll be joining an active research group at the University of Glasgow that works on related problems. You’ll be able to attend regular research seminars where you can discuss each other’s research and also receive assistance in software development and in learning about geomorphic processes.

    Each teaser project below is intended to take about six months and is suited to development into a publication for inclusion in your dissertation.

    Teaser Project 1: This teaser project sets the stage for the rest of your research. You will focus on the development of GPU-based software to store high-resolution (~5 m) digital topography of the UK in a way that is optimal for future extensive calculations. From this, you will work with different data streams (e.g., Met Office weather predictions and climate reanalysis data; satellite observations of vegetation and land surface change) to downscale them to the resolution of the topography. Lastly, you will integrate different ‘static’ datasets such as soil type and thickness, land use, and transportation networks and infrastructure. These datasets form the basis for understanding how hydrologic processes act across the topography and for identifying regions of extreme rainfall, heat waves, and wind speeds.

    In the process of working on this teaser project, you will interact not only with the team members, but also with other ExaGEO students working on the statistics of extreme weather event identification and on coastal erosion.

    Teaser Project 2: Extreme rainfall events result in flooding and increased landsliding. The frequency and magnitude of flooding and landsliding depend on where the precipitation falls (topography), soil type and thickness, and vegetation. The first step to understanding flood risk is the collection and flow of water across the surface. In this project you will develop parallelised algorithms to route water across the landscape and predict river discharge magnitudes and threats. The results from this project form the basis for future work on understanding how the hydrologic budget influences not only rural and urban flooding, but also the stability of hillslopes and threats to transportation and infrastructure.
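    For orientation, the hedged sketch below performs serial D8 flow accumulation on a tiny synthetic DEM; the PhD version would parallelise this routing step for ~5 m national-scale grids on GPUs. The algorithmic structure is standard, and all grid values are illustrative.

```python
# Hedged sketch: serial D8 flow accumulation on a synthetic DEM.
import numpy as np

rng = np.random.default_rng(7)
ny, nx = 50, 50
dem = np.add.outer(np.linspace(5, 0, ny), np.linspace(2, 0, nx)) \
      + 0.01 * rng.random((ny, nx))

offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
acc = np.ones((ny, nx))                         # each cell contributes itself

# Visit cells from highest to lowest so upstream area is complete first
order = np.dstack(np.unravel_index(np.argsort(-dem, axis=None), dem.shape))[0]
for i, j in order:
    drops = [(dem[i, j] - dem[i + di, j + dj], di, dj) for di, dj in offsets
             if 0 <= i + di < ny and 0 <= j + dj < nx]
    drop, di, dj = max(drops)                   # steepest-descent neighbour
    if drop > 0:
        acc[i + di, j + dj] += acc[i, j]        # route all upstream area there

print("max upstream cell count:", int(acc.max()))
```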

    Following these teaser projects, and depending on your interests, the research will evolve towards adding new components such as landslide and flood analysis, sediment transport processes and soil loss, and forecasting where and what hazards might emerge in the near future based on weather forecasts.

    Diverse career prospects are possible from this project.  Your skill set will be of interest to Earth science related government and non-governmental agencies, private sector consulting, or continued research in academia. Your skills in software development and high-performance computing also provide ample opportunities for work in other sectors.

    References & Further Reading

    To get a flavour for the type of research we do, check out:

    https://www.youtube.com/watch?v=HG20l9UBOxI

    Some related publications include (to give you a taste):

    Sharma, H. and Ehlers, T. A.: Effects of seasonal variations in vegetation and precipitation on catchment erosion rates along a climate and ecological gradient: insights from numerical modeling, Earth Surf. Dynam., 11, 1161–1181, https://doi.org/10.5194/esurf-11-1161-2023, 2023.

    Schmid, M., Ehlers, T. A., Werner, C., Hickler, T., and Fuentes-Espoz, J.-P.: Effect of changing vegetation and precipitation on denudation – Part 2: Predicted landscape response to transient climate and vegetation cover over millennial to million-year timescales, Earth Surface Dynamics, 6, 859–881, https://doi.org/10.5194/esurf-6-859-2018, 2018.

    Hobley, D. E. J., Adams, J. M., Nudurupati, S. S., Hutton, E. W. H., Gasparini, N. M., Istanbulluoglu, E., and Tucker, G. E.: Creative computing with Landlab: an open-source toolkit for building, coupling, and exploring two-dimensional numerical models of Earth-surface dynamics, Earth Surface Dynamics, 5, 21–46, https://doi.org/10.5194/esurf-5-21-2017, 2017.

  • NAME as a digital twin for explosive volcanic eruptions for streamlined response and real-time impact assessment

    Project institution:
    Project supervisor(s):
    Dr Thomas Jones (Lancaster University), Dr Frances Beckett (Met Office), Dr Sebastian Mutz (University of Glasgow) and Prof Mike James (Lancaster University)

    Overview and Background

    The Met Office is home to the London Volcanic Ash Advisory Centre (VAAC). The role of the London VAAC is to provide advice and guidance to the aviation authorities on the presence of volcanic ash in the atmosphere. Ash forecasts are generated using the Met Office’s Numerical Atmospheric-dispersion Modelling Environment (NAME), initialised with eruption source parameters and driven by meteorological data. Currently, model simulations are conducted for specific individual volcanic events, each with unique eruption source parameters (i.e., the model inputs). This is necessary for real-time eruption response but does not allow broader questions surrounding the interconnectivity between eruption characteristics, meteorological conditions, and the associated impact on airspace to be addressed. The lack of these insights hinders long-term and strategic planning for the presence of volcanic ash in the atmosphere. 

    Methodology and Objectives

    The work is organised under two themes:  

    1. Harnessing machine learning to inform risk-based decision-making for volcanic ash clouds 
    2. Using NAME as a digital twin to support impact assessments. 

    Methods Used: ensemble forecasting; Monte Carlo analysis; numerical weather prediction models; Lagrangian and Eulerian particle dispersion models (NAME); parallel computing; probability mapping; machine learning; the ORCHID GPU cluster on JASMIN; high-performance computing.  

    Teaser Project 1 Objectives: In this teaser project, you will perform probabilistic model simulations to construct a large, queryable dataset. On the order of 100,000 model runs will be conducted, varying the meteorological conditions, the source/volcano location, and the eruption source parameters (e.g., mass eruption rate, plume height, particle size distribution). You will devise an accessible visualisation of this big dataset to enable end-users to explore the impact that specific source terms have on the transport and dispersal of a volcanic ash cloud, both as total column mass loadings (appropriate for comparison with satellite data) and as ash concentrations within each flight level. Furthermore, these data will serve as a training dataset for a machine-learning model designed to run on GPUs in a parallel setting to address the time-sensitive nature of eruption response. Testing will occur on the ORCHID GPU cluster on JASMIN, and this will build towards support for real-time eruption response conducted by the London VAAC, wherein observables (e.g., satellite retrievals, plume height) can be used to query the database for likely corresponding eruption source parameters (e.g., particle size distribution), which are difficult to measure in real time. Without the carefully constructed training dataset proposed here, machine learning techniques have limited applicability to real-time London VAAC forecasts.   
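    As a hedged sketch of how such a run matrix might be constructed, the code below uses Latin hypercube sampling to spread simulations evenly across a few eruption source parameters. The parameter names and ranges are illustrative assumptions, not VAAC settings.

```python
# Hedged sketch: building a probabilistic run matrix with Latin hypercube
# sampling. Ranges below are illustrative placeholders.
import numpy as np
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=3, seed=8)
unit = sampler.random(n=100_000)

# Columns: plume height (km), log10 mass eruption rate (kg/s), median grain size (phi)
lower, upper = [2.0, 3.0, -2.0], [20.0, 8.0, 6.0]
runs = qmc.scale(unit, lower, upper)

print("first run:",
      dict(zip(["plume_km", "log10_MER", "grain_phi"], runs[0].round(2))))
```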

    Teaser Project 2 Objectives: In this teaser project, you will couple NAME model outputs (potentially the large suite of NAME outputs generated in Teaser Project 1) to several existing models/databases of critical infrastructure and services. Examples of coupling include: 

    • airspace use models to determine the number of flights affected 
    • building type and roof structure databases to determine the likelihood of roof collapse due to ash loading 
    • road and public transport network layout and usage models to determine the number of people affected and delay times due to ash ground accumulation.  

    Coupling NAME with these other models, forming a ‘digital twin’, and ensuring that the result can be parallelised and run on GPUs will enable rapid, quantitative impact assessments to be made for explosive volcanic eruptions. Then, when exploiting exascale resources, the digital twin could be used to provide evidence for informed decision making at speed. Example questions that could be addressed include: is there a relationship between the eruption start time and the area of airspace affected? Do specific weather regimes cause a more severe disruption to air traffic/ public transport/ roads? How does impact scale with mass eruption rate? 

    References & Further Reading

    https://www.metoffice.gov.uk/research/approach/modelling-systems/dispersion-model 

    https://jasmin.ac.uk  

    Madankan, R., Pouget, S., Singla, P., Bursik, M., Dehn, J., Jones, M., Patra, A., Pavolonis, M., Pitman, E.B., Singh, T. and Webley, P., 2014. Computation of probabilistic hazard maps and source parameter estimation for volcanic ash transport and dispersion. Journal of Computational Physics, 271, pp.39-59. 

    Leadbetter, S.J., Jones, A.R. and Hort, M.C., 2022. Assessing the value meteorological ensembles add to dispersion modelling using hypothetical releases. Atmospheric Chemistry and Physics, 22(1), pp.577-596. 

    Capponi, A., Harvey, N.J., Dacre, H.F., Beven, K., Saint, C., Wells, C. and James, M.R., 2022. Refining an ensemble of volcanic ash forecasts using satellite retrievals: Raikoke 2019. Atmospheric Chemistry and Physics, 22(9), pp.6115-6134. 

    Beckett, F., Barsotti, S., Burton, R., Dioguardi, F., Engwell, S., Hort, M., Kristiansen, N., Loughlin, S., Muscat, A., Osborne, M. and Saint, C., 2024. Conducting volcanic ash cloud exercises: practising forecast evaluation procedures and the pull-through of scientific advice to the London VAAC. Bulletin of Volcanology, 86(7), p.63. 

    Hayes, J.L., Wilson, T.M. and Magill, C., 2015. Tephra fall clean-up in urban environments. Journal of Volcanology and Geothermal Research, 304, pp.359-377. 

  • Smart sensing for ecological catchments

    Project institution:
    Project supervisor(s):
    Dr Craig Wilkie (University of Glasgow), Dr Lawrence Bull (University of Glasgow), Dr Stephen Thackeray (Lancaster University), Prof Claire Miller (University of Glasgow), Prof Amy Pickard (UKCEH), Dr Liam Godwin (UHI) and Prof Roxane Andersen (UHI)

    Overview and Background

    Ecological catchments are vital for sustaining the environment, agriculture, and urban development, yet in the UK only 33% of rivers and canals meet ‘good ecological status’ (JNCC, 2024). Furthermore, around 80% of the UK’s peatlands are in a dry and degraded state. We must maintain these carbon-rich ecosystems, which cover just 3% of the world’s surface while holding nearly 30% of its soil carbon (Forestry England, 2025). 

    Both ecosystems are affected by agriculture, waste, and urban and infrastructure development; therefore, monitoring across these environments is essential to mitigate degradation. While individual sensors are increasingly affordable, sensing at scale remains limited, particularly in the remote and spatially complex environments of ecological catchments.  

    This project investigates edge processing and sensing, and how they can be designed specifically to enable more effective exascale computing given distributed telemetry from ecological catchments. Methods for combining these data into systems-level sensing will be investigated, capturing interactions between ecological processes. The project combines embedded computation and machine learning with data and model interoperability across ecological systems. 

    Methodology and Objectives

    Methods used 

    Initially, the student will consider how computation at the edge and smart data collection can be designed to aid exascale computation. While exascale methods present many opportunities, they come at a high cost in terms of analytics and data storage, especially when relying on cloud services. In response, the project will design edge computation, especially preprocessing, to significantly reduce data loads (as raw data are typically high-resolution), enabling analytics in near real-time. Some skills that will be developed: 

    • Machine learning (ML): embedded/TinyML, from simple novelty detection to more complex models using embedded GPU/TPUs, programming (Python). 
    • Statistics: focus on statistical and interpretable ML with uncertainty quantification, to aid decision making. 
    • Uncertainty quantification (UQ): scalable UQ methods suitable for integration into large-scale or exascale modelling, including surrogate representations where appropriate. 
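
    As a flavour of the kind of edge preprocessing this involves, the sketch below flags anomalous windows in a sensor stream so that only informative segments need be transmitted upstream. It is a minimal illustration only: the window length, threshold and synthetic signal are placeholder choices, not project specifications.

    import numpy as np

    def novel_windows(stream, window=60, threshold=4.0):
        """Yield (start, window) pairs whose window mean is anomalous relative
        to a running baseline built from earlier, non-anomalous windows."""
        means = []                                   # history of window means
        for start in range(0, len(stream) - window + 1, window):
            w = stream[start:start + window]
            m = w.mean()
            if len(means) > 10:                      # wait for a baseline first
                mu, sd = np.mean(means), np.std(means) + 1e-9
                if abs(m - mu) / sd > threshold:
                    yield start, w                   # transmit only this window
                    continue                         # keep events out of baseline
            means.append(m)

    # Example: a 1 Hz signal with one injected event; ~1 of 100 windows is sent.
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, 6000)
    x[3000:3060] += 5.0                              # simulated pollution pulse
    print(len(list(novel_windows(x))), "of", len(x) // 60, "windows transmitted")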

    [Teaser Project 1] Edge and cloud computation  

    The project will lay the groundwork for a smart monitoring system, designed to be embedded within sensing devices (such as NVIDIA Jetson or Google’s Coral AI) by the end of the PhD. These tools will be specifically designed to aid data assimilation for coupled models that simulate catchment interactions over large areas and long timescales, which require exascale computation. Tools will include signal processing, monitoring algorithms, or more advanced machine-learning techniques. The required data collection and analytics will be scoped with project partners, and the student will investigate models/software for edge implementation and propose developments for assimilation into exascale models. This would likely involve building models in Python (TensorFlow, Keras, or JAX) and then converting them into an edge-AI device format (e.g. LiteRT), as sketched below. 
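
    A minimal sketch of that conversion workflow, assuming TensorFlow’s LiteRT (formerly TensorFlow Lite) converter; the model architecture, input shape and quantisation choice are placeholders.

    import tensorflow as tf

    # Build a small placeholder model (e.g. mapping 32 summary features from a
    # sensor window to a novelty score).
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(32,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # Convert to the LiteRT flatbuffer format for on-device inference;
    # Optimize.DEFAULT enables post-training quantisation to shrink the model.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    with open("novelty_model.tflite", "wb") as f:
        f.write(converter.convert())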

    Areas of focus: 

    • Development of more scalable sensing systems 
    • Monitoring of water and peatland quality indicators 
    • Underpinned by considerations of coupled systems monitoring at the exascale 

    [Teaser Project 2] Systems-level analysis of aggregate models and data 

    This project considers how one aggregates information from smart sensors to inform a whole catchment-systems analysis. It considers interconnected sensors within a catchment, and how distributed information can inform systems-level decision-making. For example, as smart sensors allow for active control, the study might consider how data collection activities, power schedules, and maintenance can be modified given the ‘bigger picture’, with developments taken forward for implementation and assessment throughout the PhD. Some relevant topics include: 

    • Adaptive experimental design 
    • Model fusion and federation 
    • Policy learning 
    • Decision analysis 

    This initial teaser project stands alone as a development from current sensor systems. However, depending on the interests of the student, there is scope beyond the teaser-project phase to combine developments from both teaser projects. 

  • Statistical Emulation Development for Landscape Evolution Models

    Project institution:
    Project supervisor(s):
    Dr Benn Macdonald (University of Glasgow), Dr Mu Niu (University of Glasgow), Dr Paul Eizenhöfer (University of Glasgow), Dr Eky Febrianto (University of Glasgow) and Dr Mark Bull (University of Edinburgh)
    Figure: Landscape evolution model of Central Nepal, including its range of input parameter types.

    Overview and Background

    Many real-world processes, including those governing landscape evolution, can be effectively described mathematically via differential equations. These equations describe how processes, e.g. the physiography of mountainous landscapes, change with respect to other variables, e.g. time and space. Conventional approaches to statistical inference involve repeated numerical solving of the equations: every time the parameters of the equations are changed in a statistical optimisation or sampling procedure, the equations must be re-solved numerically. The associated large computational cost limits advancements when scaling to more complex systems, the application of statistical inference and machine learning approaches, and the implementation of more holistic approaches to Earth System science. This leads to the need for an accelerated computing paradigm involving highly parallelised GPUs for evaluating the forward problem. 

    Beyond advanced computing hardware, emulation is becoming a popular way to tackle this issue. The idea is that the differential equations are first solved at as many parameter settings as is feasible, and the output is then interpolated using statistical techniques. When inference is subsequently carried out, the emulator’s predictions replace the differential equation solutions. Since prediction from an emulator is very fast, this avoids the computational bottleneck, and if the emulator is a good representation of the differential equation output, parameter inference remains accurate. 
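
    The sketch below illustrates the emulation idea on a toy one-parameter problem, assuming scikit-learn’s Gaussian process regressor; the ‘expensive model’ is a stand-in for a numerical landscape evolution solve.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def expensive_model(theta):
        # stand-in for a numerical solve of the differential equations
        return np.sin(3 * theta) + 0.5 * theta

    theta_train = np.linspace(0, 2, 15).reshape(-1, 1)   # design points
    y_train = expensive_model(theta_train).ravel()       # "solver" output

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-8)
    gp.fit(theta_train, y_train)

    # At inference time, candidate parameters are scored with the emulator
    # (microseconds per call) instead of a fresh numerical solve; return_std
    # also gives the emulator's own uncertainty about its interpolation.
    mean, sd = gp.predict(np.array([[0.42], [1.7]]), return_std=True)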

    This work is highly relevant to Earth Science research. By facilitating the practical use of these systems through emulation, we can gain insights into Earth’s processes, for example, predicting potential triggers for natural hazards such as landslides. Other insights include: tracing locations of erosion and/or deposition for potential critical resource identification, determining principal and potentially hidden natural or anthropogenic drivers of landscape evolution globally, and gaining further understanding of how future climate change will affect Earth’s surface.

    There are a number of applied and methodological directions to which this work can be generalised. For example, since a main focus of this project is making these systems faster and more efficient for use in practice, a natural extension would be to upscale model resolution to the metre scale, allowing for finer analysis and direct assessment of natural hazard risks. Once the principal work is complete, it also provides a framework for scaling to systems with additional modules, such as ecological factors, advanced drainage flow routing components, and elements that model high-intensity/high-frequency storm events. Future work could also explore and adapt the improved emulation strategies we develop, making digital twinning more viable for use in Earth Science research, as well as in other fields of study. 

    Methodology and Objectives

    Methods Used: Gaussian process interpolation (for building the emulator), Bayesian inference (for parameter inference), geomorphological analyses, surface processes modelling. 

    Teaser Project 1 Objectives: GPU-accelerated differential equation solver. Geodynamic models in Earth Science are used to simulate a range of natural processes. Landscape evolution models specifically contain, amongst others, equations that describe surface processes such as erosion and sediment deposition, as well as rock/surface uplift and aspects of climate change. However, the numerical solver executes sequentially, rather than generating solutions in parallel. This first teaser project will commence at the beginning of the PhD project (semester 1) and will focus on familiarising the student with parallel computing via GPUs, including the optimisation of existing landscape evolution models for GPU use. At the same time, the student will take training from ExaGEO, equivalent to 20 UoG credits, in GPU programming and exascale principles. This teaser project will support the PhD project in developing robust, reliable and efficient emulators for landscape evolution models, utilising GPU power, which will allow for a denser training set and the inclusion of a broader variety of geomorphological scenarios. It will also give insight into possible GPU acceleration of the emulation process itself. 
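
    To give a flavour of how grid-based solvers map onto GPUs, the toy sketch below advances a deliberately simple uplift-plus-diffusion surface model using JAX, whose jit-compiled array code runs unchanged on CPU or GPU; the physics and parameter values are illustrative stand-ins, not the project’s actual model.

    import jax
    import jax.numpy as jnp

    @jax.jit
    def step(z, dt=100.0, dx=50.0, D=0.01, U=1e-4):
        """One explicit step of dz/dt = U + D * laplacian(z) with periodic
        boundaries; every grid cell is updated in parallel on the device."""
        lap = (jnp.roll(z, 1, 0) + jnp.roll(z, -1, 0) +
               jnp.roll(z, 1, 1) + jnp.roll(z, -1, 1) - 4 * z) / dx**2
        return z + dt * (U + D * lap)

    z = jnp.zeros((1024, 1024))          # elevation grid [m]
    for _ in range(1000):                # runs on a GPU if one is available
        z = step(z)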

    Teaser Project 2 Objectives: Emulator development. The second teaser project will look at creating an emulator for a simple mathematical model describing elevation change as a function of spatial and temporal variations in surface uplift and the efficiency of erosion. This will take place in semester 2, and the student will simultaneously undergo training from ExaGEO in statistical and numerical methods in computing, complementing the student’s research aims at this stage. The skills developed during this teaser project, in combination with those attained from Teaser Project 1, will set the student up well to develop efficient emulators for more complex landscape evolution models as the PhD project evolves.  

    The student will be well supported by the supervisory team. Dr Eizenhöfer has expertise in landscape evolution modelling and Earth System science, Dr Macdonald and Dr Niu have expertise in developing statistical methodology in the area of statistical emulation and Dr Febrianto has expertise in highly parallelised architecture for scientific computing and will be able to advise on software development and design with open-source vision, as well as aspects of the GPU software development. 

    References & Further Reading

    Rasmussen, C.E. & Williams, C.K.I. (2006). Gaussian Processes for Machine Learning. The MIT Press. ISBN 0-262-18253-X. 

    Donnelly, J., Abolfathi, S., Pearson, J., Chatrabgoun, O., & Daneshkhah, A. (2022). Gaussian process emulation of spatio-temporal outputs of a 2D inland flood model. Water Research. Volume 225. ISSN 0043-1354. 

    Clark, M. K., Royden, L. H., Whipple, K. X., Burchfiel, B. C., Zhang, X., & Tang, W. (2006). Use of a regional, relict landscape to measure vertical deformation of the eastern Tibetan Plateau. Journal of Geophysical Research: Earth Surface, 111(F3). 

    Eizenhöfer, P. R., McQuarrie, N., Shelef, E., & Ehlers, T. A. (2019). Landscape response to lateral advection in convergent orogens over geologic time scales. Journal of Geophysical Research: Earth Surface, 124(8), 2056-2078. 

    Mutz, S. G., & Ehlers, T. A. (2019). Detection and explanation of spatiotemporal patterns in Late Cenozoic palaeoclimate change relevant to Earth surface processes. Earth Surface Dynamics, 7(3), 663-679. 

    Whipple, K. X., Forte, A. M., DiBiase, R. A., Gasparini, N. M., & Ouimet, W. B. (2017). Timescales of landscape response to divide migration and drainage capture: Implications for the role of divide mobility in landscape evolution. Journal of Geophysical Research: Earth Surface, 122(1), 248-273. 

    Whittaker, A. C., & Boulton, S. J. (2012). Tectonic and climatic controls on knickpoint retreat rates and landscape response times. Journal of Geophysical Research: Earth Surface, 117(F2). 

    Yang, R., Willett, S. D., & Goren, L. (2015). In situ low-relief landscape formation as a result of river network disruption. Nature, 520(7548), 526-529. 

    Zachos, J. C., Dickens, G. R., & Zeebe, R. E. (2008). An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature, 451(7176), 279-283. 

  • Towards a volcano digital twin: Coupled models of shallow conduit processes at basaltic volcanoes

    Project institution:
    Project supervisor(s):
    Prof Mike James (Lancaster University), Dr Tobias Keller (University of Glasgow) and Dr Thomas Jones (Lancaster University)

    Overview and Background

    Basaltic volcanic systems present a wide range of hazards, ranging from the persistent release of toxic gases from lava lakes to eruptions that can inject volcanic plumes kilometres into the atmosphere. To support next-generation hazard forecasting and response, a volcano digital twin model will need to assimilate real-time monitoring data with numerical models of sub-surface magma processes. However, near the Earth’s surface, rising magma is a complex and rapidly evolving multi-phase compressible fluid, which makes modelling magma flow highly challenging and computationally expensive. This project aims to develop large-scale conduit models that are capable of being integrated into real-time decision support systems, such as a digital twin. Simulations will need to be sufficiently fast for ensemble modelling to be a viable approach for forecasting hazard change and quantifying associated uncertainties. 

    Methodology and Objectives

    Figure: A three-phase model of bubble-driven and crystal-hindered convection in a basaltic lava lake, as a simpler precursor to the simulations envisaged for this project.

    Methods Used:  

    The project will use Julia-based software packages (Chmy.jl, ParallelStencil.jl, ImplicitGlobalGrid.jl) to develop staggered-grid finite-difference models of the thermo-chemical-mechanical evolution of volcanic plumbing systems. The advantage of these Julia-based solutions is that they allow simulation codes to be written in a backend-agnostic manner, meaning the same code can run efficiently on CPUs and all major GPU architectures. Models will be custom-built on the basis of the multi-phase reaction-transport model framework of Keller & Suckale (2019). Experimental and observational constraints, as well as state-of-the-art thermodynamic models for gas-melt and melt-solid phase equilibria (e.g., VolFe, MAGEMin), will be used to calibrate the relevant mechanical and thermodynamic properties which govern flow, gas exsolution, and crystallisation reactions. 

    Teaser Project 1: Melt-gas coupling and magma convection in open basaltic conduits

    For this teaser project, you will develop a numerical model of gas loss (degassing) and magma flow within the subsurface plumbing system (i.e., conduit) of a basaltic volcano. As is true for many basaltic volcanoes worldwide, you will impose gas influx but no net magma throughput (i.e. an open, degassing system that is not generating ash plumes or lava flows). Magma density changes from gas influx at the base, gas exsolution and expansion, and gas loss at the surface will drive convection within the conduit. Simultaneously, heat loss and dehydration will drive partial crystallisation of the magma, which affects its flow properties. Model outputs will include time series of magma pressure, surface height, and gas and magma chemistry, to enable comparison with field-measurable geophysical signals (e.g. seismics), remote sensing data (e.g. gas flux and composition), and petrological observations (crystal and glass chemical compositions). You will use this coupled model to map the generation and evolution of flow instabilities and variability across the parameter space of conduit geometry, gas influx, and melt composition (affecting density, volatile solubility, and rheology). This will result in a large output database of time series, from which you will extract identifiable patterns and characteristics using machine learning methods. Over the course of the PhD, this project could be extended by additionally coupling the fluid dynamics model with its surrounding solid edifice, to enable simulation of seismic datasets that would be measurable at the surface. Together, the large dataset of model outputs across the full range of the natural parameter space could act as training data to support the automated inversion of real-time observables (e.g., seismics, gas fluxes) to the subsurface flow processes. This would provide an invaluable ‘window’ into the subsurface and support real-time eruption forecasting and response. 

    Teaser Project 2: Dynamics of basaltic magma fragmentation and gas decoupling during eruption 

    This teaser project focuses on modelling the gas-driven fragmentation process that can occur in erupting basaltic conduits, and on exploring its wider influence on conduit flow. Fragmentation evolves the flow from liquid melt as the connected phase (i.e. gas bubbles within magma) to gas as the connected phase (i.e. fluidal clots of magma within a gas stream). Magma fragmentation results in strong flow viscosity gradients that couple pressure changes into the bulk flow. You will use the model to explore how such changes may influence magma ascent at depth and could drive variability and transitions in eruption style (i.e., lava effusion vs explosive degassing), potentially resulting in large-scale, paroxysmal explosive activity. As for Teaser Project 1, a large database of model simulation outputs will enable sensitivities to be explored and any tipping points to be identified. The project can be expanded by extending the model to encompass above-ground observations, for example, to quantify the expected size and velocity distributions of magma clots or to simulate infrasound signals. Ultimately, this will enable geophysical and remote sensing observations to be inverted for dynamic hazard assessment based on near real-time conduit and eruption conditions. 

    References & Further Reading

    Birnbaum, J., Keller, T., Suckale, J. & Lev, E. (2020) Periodic outgassing as a result of unsteady convection in Ray lava lake, Mount Erebus, Antarctica, Earth Planet. Sci. Letts, 530, 115903. https://doi.org/10.1016/j.epsl.2019.115903 

    Keller, T. & Suckale, J. (2019) A continuum model of multi-phase reactive transport in igneous systems, Geophys. J. Int., 219, 185–222. https://doi.org/10.1093/gji/ggz287 

    Jones, T.J., Reynolds, C.D. & Boothroyd, S.C. (2019) Fluid dynamic induced break-up during volcanic eruptions, Nat. Commun., 10, 3828. https://doi.org/10.1038/s41467-019-11750-4 

    Pering, T.D., McGonigle, A.J.S., James, M.R., Capponi, A., Lane, S.J., Tamburello, G. & Aiuppa, A. (2017) The dynamics of slug trains in volcanic conduits: Evidence for expansion driven slug coalescence, J. Volcanol. Geotherm. Res., 348, 26–35. https://doi.org/10.1016/j.jvolgeores.2017.10.009  

    Wong, Y-Q & Keller, T. (2023) A unified numerical model for two-phase porous, mush and suspension flow dynamics in magmatic systems, Geophys. J. Int., 233, 769–795. https://doi.org/10.1093/gji/ggac481 

 

Projects with a focus on Sustainability Solutions in Engineering, Environmental, and Social Sciences:

 

  • AI Weather Prediction for Renewable Energy Forecasting

    Project institution:
    Project supervisor(s):
    Dr Xiaochen Yang (University of Glasgow), Prof Jethro Browell (University of Glasgow), Mr Dan Travers (Open Climate Fix) and Mr Jack Kelly (Open Climate Fix)

    Overview and Background

    AI weather models replace components of traditional numerical weather prediction systems with deep neural networks underpinned by GPUs. They have developed rapidly over recent years, offering potential gains in forecast skill and reduced computational demand, with leading weather centres now running AI weather models operationally. However, many research questions remain open, and validation in key application domains, including the energy sector, is lacking. This project will deploy and develop AI weather models to optimise performance for renewable energy forecasting, e.g. by focusing on relevant atmospheric parameters (near-surface winds, clouds, radiation) and directly forecasting and/or assimilating power production from wind and solar farms. This project is partnered with Open Climate Fix, an award-winning AI-first, renewable power forecasting company.

    Methodology and Objectives

    For both Teaser Projects:

    ECMWF recently proposed an AI weather model, the Artificial Intelligence Forecasting System (AIFS) [1]. The model is trained on ERA5 reanalysis data or ECMWF’s operational numerical weather prediction (NWP) analyses to produce forecasts for upper-air variables, surface weather parameters and tropical cyclone tracks. AIFS employs an encoder–processor–decoder architecture inspired by recent advances in graph and transformer networks, offering substantial improvement in computational efficiency compared with traditional NWP approaches.

    GPU requirement: According to [1], full AIFS training requires 64 NVIDIA A100 (40 GB) GPUs, while inference (forecast generation) is achievable on a single A100 GPU. Fine-tuning, which updates only a small subset of parameters, is expected to demand 4-8 GPUs; the specific GPU requirements depend on model size and precision settings.

    Teaser Project 1 Objectives: Wind speed forecasting
    Background: Wind speeds at heights corresponding to wind turbine rotors are key inputs to wind power forecasts; however, computing wind speeds at heights between 10m and 300m is challenging due to atmospheric stability and boundary layer effects.
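
    For context, the sketch below shows two standard physics baselines for extrapolating a 10 m wind speed to rotor heights: the power law and the logarithmic profile. The shear exponent and roughness length are illustrative values, and both baselines ignore exactly the stability and boundary-layer effects that make this problem hard.

    import numpy as np

    def power_law(u_ref, z_ref, z, alpha=0.14):   # alpha ~ neutral offshore shear
        return u_ref * (z / z_ref) ** alpha

    def log_law(u_ref, z_ref, z, z0=0.0002):      # z0 ~ open-sea roughness [m]
        return u_ref * np.log(z / z0) / np.log(z_ref / z0)

    u10 = 8.0                                     # measured 10 m wind speed [m/s]
    print(power_law(u10, 10.0, 120.0))            # ~11.3 m/s at 120 m
    print(log_law(u10, 10.0, 120.0))              # ~9.8 m/s at 120 m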

    Description of teaser project: This teaser project will implement the AIFS to forecast wind speeds at these heights and to benchmark its performance against physics-based models. Over a six-month period, the PhD student will undertake the following three tasks.

    1. Download the AIFS model, set it up within a GPU-enabled computing environment and reproduce the forecasting results reported in [1]. This task will require developing a thorough understanding of the model’s architecture and its efficient parallel implementation on GPUs.
    2. Acquire the UKV dataset (Met Office high-resolution data) and the MIDAS data (land surface station data). These datasets are essential for validating the AIFS forecasts and for the subsequent PhD research. This task will train the student in data handling, pre-processing and quality control.
    3. Implement AIFS to forecast wind speeds at heights between 10m and 300m. Model outputs will be compared against NWP forecasts to identify strengths and weaknesses in AIFS.

    Development into a full PhD: The subsequent PhD will adapt AIFS for forecasting atmospheric variables critical to renewable energy. The focus will be on improving spatial resolution from the current 0.25° (~27.75 km) grid spacing to 1.5 km, matching the UKV dataset, and producing location-specific forecasts. Additionally, the adaptation should incorporate physical constraints relevant to the UK’s orography and coastline. To achieve this, the student will investigate parameter-efficient fine-tuning techniques, such as adapter layers for the graph-transformer encoder and decoder and LoRA for the transformer-based processor (sketched below). There is also scope for the student to investigate probabilistic forecasting to quantify forecast uncertainty.
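
    For readers unfamiliar with LoRA, the sketch below shows the core idea on a single frozen linear layer: train only a low-rank update (alpha/r)·B·A alongside the frozen weight matrix. The rank, dimensions and framework (PyTorch) are illustrative choices; this is not the AIFS implementation.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen linear layer plus a trainable low-rank update (alpha/r)*B*A."""
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False            # freeze pretrained weights
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            # y = W x + (alpha/r) * B A x ; only A and B receive gradients
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    layer = LoRALinear(nn.Linear(512, 512))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(trainable)   # 8,192 trainable parameters vs 262,656 in the full layer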

    Teaser Project 2 Objectives: Solar PV power forecasting

    Background: Accurate solar PV power forecasting is vital for maintaining grid stability and optimising renewable energy integration. However, one of the most significant challenges arises from predicting the formation, evolution and dissipation of clouds, which strongly influence solar irradiance at the Earth’s surface. Traditional forecasting pipelines often rely on predicting a suite of intermediate meteorological variables, such as temperature, humidity and cloud fraction, on a regular spatial grid before converting these forecasts into estimates of solar generation. While this multi-step approach is physically interpretable, it introduces additional uncertainty and accumulates errors at each stage. This project proposes a more direct approach that decodes the embeddings from AIFS into estimates of solar power generation, bypassing intermediate meteorological forecasting. Validation will be conducted using real solar generation data across Great Britain’s more than 300 “Grid Supply Point” regions.

    Description of teaser project: Over a six-month period, the PhD student will explore the potential of using embeddings from AIFS to predict solar energy output by completing the following three tasks.

    1. Download the pre-trained AIFS model and set it up within a GPU-enabled computing environment. This task will require developing a thorough understanding of the model’s architecture and its efficient parallel implementation on GPUs.
    2. Obtain historical solar power generation data from Great Britain’s over 300 “Grid Supply Point” regions.
    3. Train a simple fully connected neural network, with the embeddings extracted from AIFS as input, to forecast solar energy generation (see the sketch after this list). The results will be benchmarked against observed data, establishing a baseline for subsequent methodological innovation.
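
    A minimal sketch of task 3 under stated assumptions: the embeddings are treated as fixed feature vectors of an assumed dimension, the data are placeholders, and PyTorch is used as the framework.

    import torch
    import torch.nn as nn

    emb_dim, n_regions = 1024, 300        # assumed embedding size / GSP count
    model = nn.Sequential(
        nn.Linear(emb_dim, 256), nn.ReLU(),
        nn.Linear(256, n_regions),        # one output per Grid Supply Point
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    X = torch.randn(4096, emb_dim)        # placeholder AIFS embeddings
    y = torch.rand(4096, n_regions)       # placeholder normalised generation
    for epoch in range(10):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()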

    Development into a full PhD: The subsequent PhD will aim to develop end-to-end AI frameworks that forecast renewable energy generation directly from learned weather representations. This constitutes a transfer learning task, as the embeddings originally learned for meteorological forecasting will be repurposed for energy prediction. The research will progress along three directions. First, a custom, data-driven decoder will be designed to transform the embeddings produced by the AIFS model directly into solar power forecasts. Second, if the pre-trained embeddings are found to be suboptimal for energy forecasting, the quality of the embeddings will be refined by fine-tuning the encoder and/or processor of the pre-trained AIFS. Finally, the research will develop a physics-informed decoder that integrates established physical relationships between solar irradiance and meteorological conditions, thereby improving both accuracy and interpretability.

    [1] Chantry, Matthew, et al. “AIFS-ECMWF’s Data-Driven Forecasting System.” 105th Annual AMS Meeting 2025. Vol. 105. 2025.

  • Changing Ecological Role of Coral Reef Marine Protected Areas

    Project institution:
    Project supervisor(s):
    Prof Nick Graham (Lancaster University), Prof Rachel McCrea (Lancaster University), Dr David Bailey (University of Glasgow), Dr James Robinson (Lancaster University) and Prof M Aaron MacNeil (Dalhousie University)

    Overview and Background

    No-take marine protected areas (MPAs) are a key management approach for coral reef ecosystems, with decades of research, including global meta-analyses, establishing the expected ecological outcomes: higher coral cover, greater species richness of fish, and more fish biomass dominated by higher trophic levels. However, climate disturbance and human pressures are fundamentally changing the ecological foundation of coral reefs, with evidence that the ecological outcomes of MPAs are shifting as a result. This project will integrate diverse datasets to assess how the response of coral reefs to MPAs is changing at a global scale, drawing on remote sensing of ocean conditions, human pressures, reef habitats, species phylogeny, and a global coral reef database of underwater surveys from over 2,000 reef sites.

    Methodology and Objectives

    Coral reef ecosystems are rapidly transforming due to climate change and direct human pressure, leading to bottom-up, habitat-mediated shifts in community composition that interact with MPAs aimed at controlling top-down fishing pressure. This PhD will draw on diverse and complex data comprising remotely sensed global coral reef habitat (Allen Coral Atlas), climate-induced heat stress, ocean conditions (NOAA), proxies for human pressure, and species phylogeny. These datasets will be combined with benthic and fish community surveys spanning over 2,000 coral reef sites throughout the tropics, containing up to 30 years of repeated sampling. Using these data, the student will apply advances in machine learning, hierarchical modelling of species communities, and Bayesian modelling to determine and project the changing role of MPAs across the tropics.

    Teaser Project 1: Uncovering compositional shifts in coral reefs within Marine Protected Areas

    This teaser project will identify potential and realised drivers of community composition shifts in coral reef MPAs, using both a space-for-time and a temporal perspective.

    Objective 1 – quantify long-term trends in ocean temperatures, primary productivity and anthropogenic run-off in protected coral reefs, and use these trends to uncover temporal drivers of benthic community composition. The analyses will use Google Earth Engine to process global oceanographic and reef habitat datasets, and draw on machine learning approaches, such as spatial random forest, and on hierarchical Bayesian modelling.
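
    As a toy illustration of this modelling step, the sketch below fits a random forest relating synthetic gridded drivers to a benthic response, with site-grouped cross-validation as a simple guard against spatial leakage; all variables, values and the site structure are placeholders, not project data.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GroupKFold, cross_val_score

    rng = np.random.default_rng(1)
    n = 2000
    X = np.column_stack([
        rng.normal(28, 1.5, n),     # mean SST [deg C]
        rng.gamma(2, 2, n),         # cumulative heat stress (DHW-like)
        rng.normal(0.3, 0.1, n),    # chlorophyll-a proxy
    ])
    y = 60 - 4 * X[:, 1] + rng.normal(0, 5, n)   # synthetic coral cover [%]
    site = rng.integers(0, 50, n)                # reef site identifiers

    rf = RandomForestRegressor(n_estimators=300, random_state=0)
    # group folds by site so spatially clustered samples do not leak across folds
    scores = cross_val_score(rf, X, y, cv=GroupKFold(5), groups=site, scoring="r2")
    print(scores.mean())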

    Objective 2 – determine how reef fish community structure in MPAs is responding to changes in benthic condition (Obj 1). This objective will leverage coral reef fish phylogeny and underwater fish surveys to conduct Hierarchical Modelling of Species Communities (HMSC), incorporating species associations to ensure poorly sampled species are appropriately modelled.

    Further PhD development would investigate the underlying mechanisms of community composition change. This would involve compiling additional datasets on remotely sensed environmental stress variables and key social drivers. These large and interacting databases will require machine learning to handle nonlinearities, and the latest causal discovery methods to uncover the key underlying mechanisms leading to altered community composition and trophic structures in MPAs.

    Teaser Project 2: Projecting Marine Protected Area outcomes for coral reefs globally.

    This teaser project will draw on downscaled projections of key environmental and social drivers that influence coral reef ecology, coupled with contemporary analyses of changing MPA ecological outcomes, to project future MPA performance across scales.

    Objective 1 – Build and compile down-scaled biophysical model projections (e.g. CMIP6) of key environmental (e.g. sea surface temperature anomalies, site species temperature variation, wind and wave energy) and social (human gravity, land use) parameters for coral reef cells globally. By pairing each driver with expert expectations for reef ecological variables (e.g. coral cover), these data will be used to describe different SSP outcomes to 2100 for current coral reef MPAs.

    Objective 2 – for the environmental and social variables showing most change into the future in Objective 1, hindcasts will determine environmental and social changes experienced by >2,000 coral reef sites. Drivers will then be linked to temporal shifts in benthic composition, fish trophic structure, and key community-level processes (e.g. fish productivity), with careful exploration of uncertainty quantification, helping infer how these variables will likely change into the future.

    Further PhD development would model ecological outcomes under different SSP scenarios through to 2100, under different MPA characteristics and for different climate models (e.g. multi-model ensembles). The candidate will determine the trophic and community-scale outcomes under different scenarios, how these will vary spatially, and how uncertain they are. Finally, they will explore optimal configurations of future MPA designations to reach 30% national targets, in the context of optimising for multiple ecological outcomes.

    References & Further Reading

    Lester SE, et al. (2009) Biological effects within no-take marine reserves: a global synthesis. Mar Ecol Prog Ser 384: 33-46 https://www.int-res.com/abstracts/meps/v384/meps08029

    Graham NAJ, Robinson JPW, Smith SE, Govinden R, Gendron G, Wilson SK (2020) Changing role of coral reef marine reserves in a warming climate. Nature Communications 11: 2000 https://www.nature.com/articles/s41467-020-15863-z

    Hadj-Hammou J, et al. (2024) Global patterns and drivers of fish reproductive potential on coral reefs. Nature Communications 15: 6105 https://www.nature.com/articles/s41467-024-50367-0

    Hughes TP, et al. (2018) Spatial and temporal patterns of mass bleaching of corals in the Anthropocene. Science 359: 80-83 https://www.science.org/doi/full/10.1126/science.aan8048

    Mellin C, Brown S, Heron SF, Fordham DA (2025) CoralBleachRisk—Global Projections of Coral Bleaching Risk in the 21st Century. Global Ecology and Biogeography 34: e13955 https://onlinelibrary.wiley.com/doi/full/10.1111/geb.13955

    Cinner JE, et al. (2018) The gravity of human impacts mediates coral reef conservation gains. Proceedings of the National Academy of Sciences of the USA 115: E6116-E6125 https://www.pnas.org/doi/abs/10.1073/pnas.1708001115

  • High-fidelity exascale-enabled infrastructure for analysing the impact of wind farm wakes on wind/sea interactions

    Project institution:
    Project supervisor(s):
    Dr M. Sergio Campobasso (Lancaster University), Prof Adrian Jackson (University of Edinburgh), Dr Evgenij Belikov (University of Edinburgh), Dr Wenxin Zhang (University of Glasgow), Dr Andrea Mazzeo (Lancaster University), Dr Stefano Federico (Institute of Atmospheric Sciences and Climate) and Dr Miriam Marchante Jiménez (Orsted)
    Figure: Wind farm wake at Horns Rev (from Hasager, C.B.; Rasmussen, L.; Peña, A.; Jensen, L.E.; Réthoré, P.-E. Wind Farm Wake: The Horns Rev Photo Case. Energies 2013, 6, 696-716. https://doi.org/10.3390/en6020696).

    Overview and Background

    The extraction of energy from the wind leads to the formation of low-speed regions (wakes) behind wind farms (WFs). Wakes are particularly persistent offshore [2], and were recently shown to affect the heat exchange between sea and atmosphere, due to reduced convective heat transfer close to the sea surface [1]. With worldwide offshore wind capacity on course to exceed 2,000 GW by 2050, WF wakes may alter ocean dynamics and marine ecosystems to extents comparable to anthropogenic climate change [2]. Credibly evaluating the environmental impact of wakes requires regional- to mesoscale climate simulations with high-fidelity WF parametrisations at temporal and spatial resolutions beyond present supercomputers’ capabilities. Using Graphics Processing Unit (GPU) computing [3], this project will develop the code infrastructure to support these simulations on exascale machines, demonstrating prototype physical investigations using the developed technology.

    Methodology and Objectives

    METHODOLOGY
    Two community codes for short-to-long term climate modelling are considered: the Weather Research and Forecasting (WRF) model [4], and the Model for Prediction Across Scales (MPAS) [5]. The codes feature similar models of atmospheric physics, but use different numerical methods. WRF uses structured grids with nested domains to increase resolution in WF wake regions, whereas MPAS uses a single unstructured Voronoi grid with controllable local refinement. WRF has state-of-the-art WF parametrisations [6,7] but little reported GPU work; MPAS uses GPU acceleration but has little reported work on WF parametrisation.
    This research aims to combine the strengths of both codes to develop a reliable, exascale-scalable code for the considered problem. The choice of the baseline code for the project’s core development and demonstrations will follow the teaser projects (TPs) below, which offer hands-on training in climate modelling, wind farm aerodynamics, and distributed-memory and GPU parallel computing, and assess the codes’ strengths. Following the TPs, the student will focus on specific topics, e.g. improving the code’s overall GPU framework or optimising the parallelised WF model within an existing GPU framework, depending on the code selected.

    The TPs will share one test case, to compare the two codes’ predictive capabilities and computational performance (execution speed) without GPU acceleration. The GPU development work will be performed on Lancaster University’s HEC cluster and the Bede supercomputer [9].

    Teaser project 1 (TP1): WRF-based. To investigate and optimise the predictive capabilities of the two WF parametrisations [6,7] in WRF, analyses (TC1) of a North Sea area containing two real WFs [10] will be performed. The capabilities of both models to predict wind turbine (WT) and WF wakes will be optimised using regression methods for the models’ parameters, with lidar and satellite wind speed measurements used to steer the optimisation. Measured WT power will also be used in the process, as this parameter is affected by wakes.
    A second test-case (TC2) without WFs will be used to perform parallel profiling studies of WRF, identifying the code’s computationally most intensive parts and familiarising with its structure. These analyses will identify the code sections that would benefit most from GPU acceleration.
    TC2 will also be used to cross-compare the predictive capability of WRF and MPAS, assessed by comparing predicted near-sea-surface wind speed maps to measurements from satellites and lidars. Boundary and initial conditions for TC1 and TC2 will be taken from the ERA5 global climate reanalysis [8].

    Teaser project 2 (TP2): MPAS-based. First, TC2 will be set up and analysed without GPUs to cross-compare the computational speed and wind-speed prediction capabilities of MPAS and WRF. Then, more comprehensive TC2-based parametric analyses of MPAS performance using different numbers of CPUs and GPUs will be undertaken to study how the performance of the hybrid parallelisation depends on the CPU and GPU counts, and to determine the largest achievable acceleration and the corresponding optimal ratio of GPU to CPU counts – information paramount for exascale porting. These analyses will also familiarise the student with the MPAS structure, knowledge needed to optimally merge wind farm models with the MPAS GPU infrastructure.

    Teaser Project 1 Objectives:

    1. Familiarise with WRF: assess predictive capabilities for 3D wind fields with/without WFs; analyse/optimise the best-suited WF parametrisation.
    2. Assess computational performance and estimate potential of GPU acceleration.

    Teaser Project 2 Objectives: 

    1. Familiarise with MPAS: assess predictive capabilities of 3D wind fields; investigate performance of hybrid CPU/GPU parallelisation.
    2. Investigate optimal integration of WF model into GPU framework.

    References & Further Reading

    1) Akhtar, N. et al., Impacts of accelerating deployment of offshore wind farms on near-surface climate. Sci Rep 12, 18307 (2022). https://doi.org/10.1038/s41598-022-22868-9.
    2) Platis A. et al., First in situ evidence of wakes in the far field behind offshore wind farms. Sci Rep. 2018;8(1):2163. https://www.nature.com/articles/s41598-018-20389-y
    3) Hijma, P., et al. Optimization techniques for GPU programming. ACM Computing Surveys 55.11 (2023): 1-81, https://dl.acm.org/doi/10.1145/3570638.
    4) Powers, J. G., et al. “The weather research and forecasting model: Overview, system efforts, and future directions.” Bulletin of the American Meteorological Society 98.8 (2017): 1717-1737. (see also: Weather Research and Forecasting (WRF) model, https://www.mmm.ucar.edu/models/wrf).
    5) Skamarock, W. C., et al. “A multiscale nonhydrostatic atmospheric model using centroidal Voronoi tesselations and C-grid staggering.” Monthly Weather Review 140.9 (2012): 3090-3105. (see also: Model for prediction across scales (MPAS), https://www.mmm.ucar.edu/models/mpas).
    6) Fitch, A. C. et al., Local and Mesoscale Impacts of Wind Farms as Parameterized in a Mesoscale NWP Model, Mon. Weather Rev., 140, 3017–3038, https://doi.org/10.1175/MWR-D-11-00352.1, 2012.
    7) Volker, P. et al., The Explicit Wake Parametrisation V1.0: A Wind Farm Parametrisation in the Mesoscale Model WRF, Geosci. Model Dev., 8, 3715–3731, https://doi.org/10.5194/gmd-8-3715-2015, 2015.
    8) ERA5 Global Climate Reanalysis. https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5.
    9) N8 CIR, The Bede supercomputer. https://n8cir.org.uk/bede/.
    10) Orsted, Offshore wind measurement and operation data. https://orsted.com/en/what-we-do/renewable-energy-solutions/offshore-wind/offshore-wind-data.

  • High-resolution nowcasting of wind speed and power generation

    Project institution:
    Project supervisor(s):
    Prof Jethro Browell (University of Glasgow), Dr Joe O'Connor (University of Edinburgh) and Dr Tiffany Vlaar (University of Glasgow)

    Overview and Background

    Operating energy systems with a high penetration of wind power challenges conventional approaches to power system operation. Variability of the wind resource and the resulting power generation must be actively managed, underpinned by predictive analytics. There is an emerging need to forecast not only energy (average power over some time period, typically 15 to 60 minutes) but also the variability of instantaneous power production within these periods. This is challenging, as conventional weather forecasts only predict atmospheric variables, such as wind speed, at hourly resolution. This project will develop novel methods for weather and power forecasting exploiting high-performance computing for high-resolution numerical weather prediction and weather-to-power modelling, including uncertainty quantification. 

    Methodology and Objectives

    Methods Used: Numerical Weather Prediction, Neural Networks, Gradient Boosting, WRF, Generative Modelling 

    Teaser Project 1 Objectives: Generative modelling of sub-hourly wind power generation 

    This project will develop methods for within-day forecasting of wind power variability on sub-hourly time scales based on generative modelling conditioned on conventional Numerical Weather Prediction. Considerations include model architecture and GPU implementation, representation of relevant atmospheric processes, uncertainty quantification and generalisability/transferability between wind farms. GPU software development will be required for GPU-native training of the generative models, with distributed data pipelines and efficient parallelism to fully exploit large GPU clusters. 

    Steps will include: gathering and processing data from two offshore wind farms, Anholt and Westermost Rough, each comprising two years of 10-minute resolution SCADA data; gathering and processing historic NWP data from the ECMWF HRES model and/or UK Met Office UKV model (via the BADC); developing GPU software to implement one or more generative models (e.g. a variational auto-encoder, generative adversarial network, or similar) to produce high-resolution wind power forecasts conditioned on NWP (a compact sketch follows); and establishing an evaluation framework for this type of forecast information, including naïve and competitive benchmark methods. 
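
    As one concrete possibility, the compact sketch below trains a conditional VAE that samples a 10-minute-resolution power trajectory for an hour given NWP features for that hour; the dimensions, data and hyperparameters are placeholders rather than a proposed design.

    import torch
    import torch.nn as nn

    class CondVAE(nn.Module):
        def __init__(self, cond_dim=8, out_dim=6, latent=4):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(out_dim + cond_dim, 64), nn.ReLU(),
                                     nn.Linear(64, 2 * latent))
            self.dec = nn.Sequential(nn.Linear(latent + cond_dim, 64), nn.ReLU(),
                                     nn.Linear(64, out_dim), nn.Sigmoid())
            self.latent = latent

        def forward(self, y, c):
            mu, logvar = self.enc(torch.cat([y, c], -1)).chunk(2, -1)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterise
            return self.dec(torch.cat([z, c], -1)), mu, logvar

    model = CondVAE()
    y = torch.rand(256, 6)      # six 10-min capacity factors per hour (placeholder)
    c = torch.randn(256, 8)     # NWP features for the same hour (placeholder)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):        # loss = reconstruction + KL regularisation
        y_hat, mu, logvar = model(y, c)
        recon = ((y_hat - y) ** 2).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1).mean()
        opt.zero_grad(); (recon + 0.1 * kl).backward(); opt.step()

    # At forecast time, draw many trajectories for new NWP features:
    with torch.no_grad():
        z = torch.randn(100, model.latent)
        c_new = torch.randn(1, 8).expand(100, -1)
        scenarios = model.dec(torch.cat([z, c_new], -1))   # 100 sampled hours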

    This project may develop into a PhD extending these ideas alone, for instance through development of novel neural network architectures and GPU software implementations, and/or by modelling intra-wind farm effects such as wakes, and/or in combination with high-resolution weather modelling from Teaser Project 2. In all cases, scalability to fleets (100s-1000s) of wind farms and exascale compute will be a necessary component. 

    Teaser Project 2 Objectives: High-resolution weather and wind power forecasting 

    High-resolution (in space and time) numerical weather prediction aims to resolve small and fast atmospheric processes to better describe atmospheric conditions. This project will establish a high-resolution NWP set-up targeting near-surface winds, with boundary conditions coming from conventional NWP. Methods will be developed to convert high-resolution NWP output to wind power production and variability forecasts, including uncertainty quantification. GPU software development will enable offloading of compute-intensive NWP kernels using performance-portable approaches (e.g. OpenMP, OpenACC), guided by profiling, and exascale-ready in-situ post-processing (e.g. via ADIOS2) to handle the extreme data volumes from sub-hourly NWP ensembles. 

    Steps will include: gathering and processing historic NWP data from the ECMWF HRES model and/or UK Met Office UKV model (via the BADC) and from meteorological stations; configuring a WRF model for high-resolution wind speed and direction forecasting at multiple heights, leveraging GPU acceleration; generating a new dataset of re-forecasts using the high-resolution set-up; and validating against observations from meteorological stations and wind farm data from Teaser Project 1. 

    This project may develop into a PhD extending these ideas including novel GPU implementations of WRF/similar, AI weather models, or in combination with ideas from Teaser Project 1. In all cases, scalability to fleets (100s-1000s) of wind farms and exascale compute will be a necessary component. 

     

  • Sufficiency and Carbon Efficiency of exascale computing for environmental modelling and AI

    Project institution:
    Project supervisor(s):
    Prof Adrian Friday (Lancaster University), Dr Carolynne Lord (UKCEH), Dr Kelly Widdicks (UKCEH), Dr Kirsty Pringle (University of Edinburgh) and Prof Mike Berners-Lee (Small World Consulting)

    Overview and Background

    Core research question: Exascale computing exemplifies environmental science’s growing array of ‘digital research infrastructure’ (DRI), offering the exciting promise of exploring new frontiers of knowledge via larger-scale data analyses, models, and, increasingly, the application of AI at a scale not previously possible. As ‘exascale’ implies, vast compute has vast implications for the environment, due to the operational emissions, water use, and embodied material footprints associated with its creation, its embedding into research methods, and its end of life. This raises core challenges for researchers and scientific organisations: how can we transform research software engineering practice to make optimal and sufficient use of the latest hardware (e.g. GPUs) given the legacy and complexity of scientific software; how can we engage practitioners with better tools and feedback to evaluate the implications of digital software and hardware on the environment; and to what extent can this support exploration of gains in new environmental science knowledge against competing digital environmental costs? We are specifically interested in the energy and carbon footprint of exascale computing, in how to radically lower this footprint at and beyond the point of use through better software, and in wider systemic thinking around this, such as notions of ‘sufficiency’ as exemplified by ‘Green AI’, to understand and communicate its impacts and find the right balance of environmental modelling and AI for Earth Systems.

    Why is this relevant to ExaGEO? In the time of the Anthropocene, and given the urgency of the climate crisis, it is imperative that future environmental scientists are equipped with new understandings and principles to support the responsible use of computational methods. Our prior work (ARINZRIT) has found significant gaps in software engineering practice and training, and with legacy scientific software, which have led to inefficient use of state-of-the-art HPC and accelerated hardware (e.g. GPUs), due not least to a lack of feedback on hardware use, the marginal carbon intensity of energy, apportioned embodied emissions, tools and training. This PhD, aligned with the goals of NetDrive and SDRI, will help bolster the exascale cohort with exactly this critical lens, making a positive contribution to the overarching project in terms of better ‘energy and carbon aware’ practice. This will help strengthen the cohort as environmentally responsible scientists, practitioners and green software engineers.

    Methodology and Objectives

    Methods Used:
    The PhD project will advance knowledge in this domain by extending current understanding of the lifecycle impacts of digital research infrastructure (particularly large-scale HPC) for environmental science, and by developing technical tools and practical solutions that support the environmental science community in using these infrastructures sustainably. We will explore both the efficiency and the utilisation of exascale hardware features by scientific software, especially the use of GPU vector operations, AI acceleration, and ‘carbon-efficient methods’ for achieving scientific results at the lowest environmental cost. We also wish to explore to what extent we can improve carbon literacy in green software engineering practice for scientific outcomes, the trade-offs between the environmental impacts of computation and scientific results (for instance, ‘sufficiency’), and how to support scientists to embrace these methods.

    To focus the PhD, and to align most closely with the ExaGEO ambitions, the student will explore the environmental impacts of exascale computing in relation to integrative Earth System modelling and AI. Two transdisciplinary teaser projects in this domain are outlined below, each drawing together methods from computer science, environmental data science, and qualitative social science.

    Teaser Project 1: ‘Scientific software and sustainable exascale Earth System modelling’
    Objectives:

    • Examine the energy and carbon performance of Earth and environmental science software, especially the exploitation (or lack thereof) of GPU acceleration and the underutilisation of concurrency;
    • Work with green software engineering and scientific software communities to improve software, toolchains, feedback and practices, to enable lower-carbon Earth System models and data science/AI methods;
    • Develop technical mechanisms to reduce computational waste and lower the footprint of large-scale HPC (e.g., development of sustainable job queuing systems) for different exascale computing facilities, testing and evaluating these;
    • Produce practical guidance, training and policies to accelerate better and more sustainable use of exascale computing. 

    Teaser Project 2: ‘Lean/green data science for the future of exascale Earth System modelling’
    Objectives:

    • Evaluate the purpose, outcomes and accuracy of common Earth System models, exploring the potential of AI emulation to reduce model complexity and run-time;
    • Build a benchmark framework for evaluating the use-phase performance, marginal emissions (scope 2) and embodied costs (scope 3) of Earth System modelling and AI in exascale high-performance computing (see the sketch after this list);
    • Drive more responsible and leaner use of exascale computing and digital research infrastructure with users and developers, e.g. using AI emulation of models or more efficient or frugal ML and data science alternatives to existing models (cf. ‘Green AI’; Schwartz et al., 2020);
    • Develop technical solutions (e.g., sustainable environmental software in shared repositories), offering environmental scientists more efficient and less impactful models and methods, and testing and evaluating these;
    • Produce practical guidance, training and policies for exascale-ready model and method development that embody sustainability and reduce waste and rebound effects at all stages of innovation and use.
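
    As a small illustration of use-phase feedback, the sketch below instruments a placeholder workload with the open-source codecarbon package, one existing tool for estimating energy use and operational (scope 2) emissions at the point of use; embodied (scope 3) costs would need separate accounting, and the workload and project name are placeholders.

    import numpy as np
    from codecarbon import EmissionsTracker

    tracker = EmissionsTracker(project_name="toy_earth_system_run")
    tracker.start()
    try:
        for _ in range(50):                 # placeholder model workload
            a = np.random.rand(2000, 2000)
            np.linalg.svd(a, compute_uv=False)
    finally:
        kg_co2e = tracker.stop()            # estimated operational emissions
        print(f"estimated emissions: {kg_co2e:.6f} kg CO2e")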

    References & Further Reading

    1. Pringle, K. It’s time to decarbonise digital research, Research Professional News, April 2025,
      https://www.researchprofessionalnews.com/rr-news-uk-views-of-the-uk-2025-april-it-s-time-todecarbonise-digital-research/
    2. Lord, C., Friday, A., Jackson, A., Bird, C., Preist, C., Lambert, S., Kayumbi, G. and Widdicks, K., 2025, April. The world is not enough: growing waste in HPC-enabled academic practice. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (pp. 1-14).
    3. Freitag, C., Berners-Lee, M., Widdicks, K., Knowles, B., Blair, G.S. and Friday, A., 2021. The real climate and transformative impact of ICT: A critique of estimates, trends, and regulations. Patterns, 2(9).
    4. Widdicks, K., Lucivero, F., Samuel, G., Croxatto, L.S., Smith, M.T., Ten Holter, C., Berners-Lee, M., Blair, G.S., Jirotka, M., Knowles, B. and Sorrell, S., 2023. Systems thinking and efficiency under emissions constraints: Addressing rebound effects in digital innovation and policy. Patterns, 4(2).
    5. Mytton, D. and Ashtine, M., 2022. Sources of data center energy estimates: A comprehensive review. Joule, 6(9), pp.2032-2056.
    6. Juckes, M., Bane, M., Bulpett, J., Cartmell, K., MacFarlane, M., MacRae, M., Owen, A., Pascoe, C. and Townsend, P., 2023. Sustainability in Digital Research Infrastructure: UKRI Net Zero DRI Scoping Project final technical report.
    7. Lannelongue, L., Aronson, H.E.G., Bateman, A., Birney, E., Caplan, T., Juckes, M., McEntyre, J., Morris, A.D., Reilly, G. and Inouye, M., 2023. GREENER principles for environmentally sustainable computational science. Nature Computational Science, 3(6), pp.514-521.
    8. Schwartz, R., Dodge, J., Smith, N. A., and Etzioni, O. Green AI. Commun. ACM 63, 12 (Nov. 2020), 54–63.

  • VINE — Value of Information for Nature & Economics

    Project institution:
    Project supervisor(s):
    Dr Alex Bush (Lancaster University), Dr Katherine Simpson (University of Glasgow), Prof Richard Reeve (University of Glasgow) and Dr Ben Payne (Natural England)

    Overview and Background

    To transform societies towards a sustainable footing we must reposition how environmental impacts are considered within standard economic models. Nature Markets are emerging as a means of mobilising finance and incentivising sustainable land management and ecosystem restoration, but also risk becoming greenwash. A key challenge is how biodiversity units are measured and traded, as this underpins both ecological success and economic efficiency. The recently developed irreplaceability metric offers a promising solution by capturing ecological complexity while supporting cost-effective investments. However, uncertainties in data, climate impacts, and socioeconomic factors remain critical concerns. Addressing these challenges through integrated risk management and advanced data science offers an opportunity to design resilient and credible Nature markets that align economic and conservation objectives. 

    Methodology and Objectives

    In the last 5 years, the UK has passed some progressive environmental policies that other nations are hoping to learn from and emulate. Increasing private investment through nature markets is a core goal, and Biodiversity Net Gain (England) defines a mandatory obligation for developers to engage with ecosystem restoration. Gaps in the current legislation and policy have been widely publicised, but it remains a step in the right direction. This project will build upon the irreplaceability metric proposed by Bush et al. (2023), drawn from the literature on systematic conservation planning (SCP), to define the strategic value of landscapes. If successful, a method for systematically scoring Nature will offer a robust and transparent framework for achieving Nature-positive futures that serve everyone – and change how society perceives Nature and the services we gain from it. 

    While the original concepts could be demonstrated with basic simulations, and could adopt several simplifying assumptions, operationalising this new form of nature market will require the development of new tools that suit the volume, velocity and variability of Big Data, are flexible to users’ demands, and face the realities of sparse and incomplete environmental data. The first teaser therefore focuses on strengthening our ability to reassure stakeholders that the recommendations are robust to uncertainty. Users’ concerns over uncertainty also underpin the second teaser, which seeks to show how irreplaceability markets would create an opportunity to optimise new monitoring and data collection and reduce costs. Each will require new method development, as well as adaptation to maximise efficiency within CPU/GPU environments. 

    Methods: The project will combine simulation modelling with prototype policy scenarios using UK environmental data. Simulations allow testing of strategies under “complete knowledge” conditions, enabling rigorous evaluation of uncertainty management. Conversely, while true ecological and environmental data are incomplete, defining the spatial covariance and distribution of features helps refine the problem space, as well as improve the communication of the project outcomes.  

    Teaser Project 1 Objectives: Promises that environmental sustainability can be achieved through market means will only be trusted if decisions are transparent. Yet uncertainties emerge at many stages, including in our knowledge of ecosystems, how to restore them, profitability for landowners, and of course outcomes over long-term forecasts. Standard approaches to SCP do not integrate objectives for risk arising from future climate uncertainty, but tools exist within the risk management sector for precisely this purpose and provide a rich opportunity for innovation. This project will test how to integrate different sources of uncertainty through existing methodologies like Modern Portfolio Theory (sketched below), alongside those that preserve the benefits of the systematic approach to Nature markets. Subsequent tasks could then focus on the further challenge of scaling those solutions to Big Data, including UK-relevant datasets on land use, biodiversity, land values and climate change scenarios. 
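
    To make the Modern Portfolio Theory analogy concrete, the toy sketch below chooses weights across restoration ‘sites’ by trading expected biodiversity benefit against variance under sampled future scenarios. The data are synthetic and the optimiser is a deliberately simple projected-gradient heuristic, not a proposed method.

    import numpy as np

    rng = np.random.default_rng(7)
    n_sites, n_scenarios = 12, 500
    # biodiversity benefit of each site under sampled future scenarios, with a
    # shared "climate shock" term inducing correlation between sites
    benefits = (rng.normal(1.0, 0.3, (n_scenarios, n_sites))
                + rng.normal(0, 0.2, (n_scenarios, 1)))
    mu, cov = benefits.mean(0), np.cov(benefits.T)

    def portfolio(risk_aversion, steps=5000, lr=0.01):
        # maximise mu.w - (lambda/2) w'Cov w  s.t.  w >= 0, sum(w) = 1,
        # via a crude projected-gradient loop (adequate for a sketch)
        w = np.full(n_sites, 1 / n_sites)
        for _ in range(steps):
            w = np.clip(w + lr * (mu - risk_aversion * cov @ w), 0, None)
            w /= w.sum()
        return w

    for lam in (0.1, 5.0):                  # low vs high risk aversion
        w = portfolio(lam)
        print(lam, round(float(w @ mu), 3), round(float(w @ cov @ w), 4))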

    Teaser Project 2 Objectives: Monitoring is fundamental to any systematic approach, but diverse ecological and environmental surveys cannot be sustained at large scales, posing a potential barrier to the market. However, when sources of uncertainty are identified and a decision criterion is well defined (i.e. irreplaceability), we can act systematically to reduce the uncertainty present by understanding the Value of Information (VoI). VoI methods are commonly used in other fields but have rarely been adopted in conservation settings. This project provides a chance to demonstrate how monitoring and research can become highly organised, taking advantage of a range of novel technologies at different times to minimise regulatory costs. We propose exploring Bayesian Decision Analysis, Information-Gap Decision Theory, and Real Options Analysis to evaluate VoI-based adaptive learning strategies on synthetic environments to improve scalability, before proceeding to UK national datasets. A minimal worked example of the VoI idea follows. 
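
    The worked example below computes the expected value of perfect information (EVPI) for a two-action, two-state decision, the simplest form of the VoI calculation; all probabilities and utilities are illustrative numbers.

    import numpy as np

    p = np.array([0.4, 0.6])              # prior P(declining), P(stable)
    # utility[action, state]: net benefit of each action in each true state
    utility = np.array([[10.0, -2.0],     # intervene
                        [-8.0,  4.0]])    # do nothing

    best_prior = (utility @ p).max()                 # decide now: 2.8
    best_informed = (utility.max(axis=0) * p).sum()  # decide knowing the state: 6.4
    evpi = best_informed - best_prior
    print(f"EVPI = {evpi:.2f}")  # 3.6: an upper bound on what monitoring is worth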

    References & Further Reading

    Bush et al. (2024). Systematic nature positive markets. https://doi.org/10.1111/cobi.14216 

    Hanley and Simpson (2025), Markets in Biodiversity Offsets. https://doi.org/10.1111/1467-8489.70027 

    Bolam et al. (2019), Using the Value of Information to improve conservation decision making. https://doi.org/10.1111/brv.12471 

    Popov et al. (2022). Managing risk and uncertainty in systematic conservation planning with insufficient information. https://doi.org/10.1111/2041-210X.13725