DARE Seminar Series

Join DARE’s Seminar Series to hear from experts about applications of statistical and data science methods to DARE’s core domains – water, minerals and biodiversity. Our Seminars run every fortnight on Tuesdays. Find the program for 2024 below, as well as recordings of previous seminars.

6 February | Second-order Optimization Methods for Machine Learning

First-order optimization methods, particularly stochastic gradient descent, are the primary workhorse in machine learning (ML) for their historically low per-iteration costs. Recent theoretical advancements, including rapid convergence in overparameterized scenarios, implicit regularization, and the emergence of network architectures like skip connections, have solidified their dominance in ML. Nonetheless, sensitivity to hyperparameter tuning, susceptibility to saddle point entrapment, slow convergence in rugged landscapes with smaller networks, and inefficiency in constrained and distributed optimization settings remain significant challenges for these methods.

Second-order methods, on the other hand, can attain superior convergence rates, overcome non-convexity and ill-conditioning, effectively handle constraints, and exploit parallelism and distributed architectures in novel ways. However, their non-trivial sub-problems and high per-iteration costs continue to limit their widespread usage. In light of this, I will provide an overview of ongoing research into efficient, robust, and scalable second-order optimization algorithms for ML. To that end, I will focus on Newton-MR variants, a novel class of Newton-type methods that offer many desirable theoretical and practical properties and have the potential to surpass first-order methods in the next generation of optimization methods for large-scale machine learning.
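As a rough illustration of the ill-conditioning point, the sketch below compares gradient descent with a single classical Newton step on a toy quadratic. This is plain Newton's method, not the Newton-MR variants discussed in the talk, and the objective, curvature values and function names are assumptions chosen only for illustration.

```python
# Toy objective (an assumption for illustration): f(x) = 0.5 * sum(c_i * x_i**2),
# whose Hessian is diagonal with entries c_i. A large spread in c_i means
# the problem is ill-conditioned.

def grad(x, curvatures):
    """Gradient of the toy quadratic at x."""
    return [c * xi for c, xi in zip(curvatures, x)]

def gradient_descent(x, curvatures, steps):
    """Fixed-step gradient descent; the step is limited by the largest curvature."""
    step_size = 1.0 / max(curvatures)
    for _ in range(steps):
        g = grad(x, curvatures)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

def newton_step(x, curvatures):
    """One classical Newton step: x - H^{-1} g, with H diagonal here."""
    g = grad(x, curvatures)
    return [xi - gi / c for xi, gi, c in zip(x, g, curvatures)]

curvatures = [1.0, 1000.0]   # condition number 1000
start = [1.0, 1.0]

gd_result = gradient_descent(start, curvatures, steps=100)
newton_result = newton_step(start, curvatures)

print("gradient descent after 100 steps:", gd_result)
print("Newton after 1 step:", newton_result)
```

On this toy problem the Newton step lands on the minimiser at the origin immediately, while gradient descent is still far from it along the low-curvature direction. The price is that each Newton step needs Hessian information, which is the per-iteration cost the abstract refers to.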

Speaker: A/Prof Fred Roosta-Khorasani

Fred Roosta is an associate professor in the School of Mathematics and Physics at the University of Queensland (UQ). In addition, he is a chief investigator and a theme leader with the ARC Training Centre for Information Resilience (CIRES). Prior to joining UQ, he was a post-doctoral fellow in the Department of Statistics at the University of California, Berkeley. He obtained his PhD from the University of British Columbia in 2015.

Fred’s research interests and prior works span several areas of applied mathematics and computer science, including machine learning, numerical optimization, scientific computing, computational statistics as well as distributed and high performance computing. He is generally interested in studying various theoretical and algorithmic aspects of solving modern data analysis problems. In 2018, he was awarded the Discovery Early Career Researcher Award (DECRA) by the Australian Research Council for his research on second-order optimization for machine learning.

20 February | Data Management Plans: Considering the FAIR and CARE Principles

This presentation will comment on the importance of data management plans in supporting the management of data created by a research project. It will describe how the FAIR and CARE principles can be applied and used within a data management plan, and discuss why FAIR and CARE should be considered when managing data.

A data management plan is a document that allows you to record intended and expected details related to the data that will arise from your research project. A data management plan is a living document for a research project, which outlines data creation, data policies, access and ownership rules, management practices, management facilities and equipment, and who will be responsible for what. Nowadays they are seen as a mandatory component when embarking on a research project.

FAIR is a principles-based approach to data management. It promotes elements that help ensure good data management. Researchers spend considerable time, money and effort collecting and interrogating data. Making your data findable, accessible, interoperable and reusable (FAIR) maximises the impact of that investment, including gaining more citations for your data sets.

The CARE principles describe how data should be treated to ensure that Indigenous governance over the data and its use are respected. The CARE principles reflect the crucial role of data in advancing Indigenous innovation and self-determination. They ensure that data movements like the open data movement respect the people and purpose behind the data. FAIR and CARE are two components of data management that should be considered and mentioned when creating a data management plan.

Speaker: Dr Robin Burgess

Robin Burgess is the Manager of the Engagements Team at the ARDC (Australian Research Data Commons). He has a strong background and real interest in all aspects of research data management, particularly planning, publishing, governance and of course the application of the FAIR and CARE principles.

Robin has a PhD in Biosciences, where he focused on data management techniques, and a love of arts-related data, having worked at the Glasgow School of Art for 6 years. Robin moved to Sydney 8 years ago, taking up roles as Repository and Digitisation Manager at Sydney University and then Senior Research Data Librarian at UNSW. Robin’s passion and interest in data and its management continues to grow in his role at the ARDC.

5 March | Professor Mark Jessell - Centre for Exploration Targeting, UWA

Seminar details to be confirmed.

Speaker: Professor Mark Jessell, DARE Senior Academic Domain Advisor

Mark Jessell is a Professor and Western Australian Fellow at the Centre for Exploration Targeting at The University of Western Australia. Mark studied up to MSc level in the UK, then moved to the USA to do a PhD. He went to Melbourne for a postdoc and took up a teaching and research position at Monash University, then moved to Toulouse, France, for a position with the Institut de Recherche pour le Développement, before joining the University of Western Australia in 2013.

Mark’s scientific interests revolve around microstructure studies (the Elle platform), integration of geology and geophysics in 3D (the Loop/MinEx CRC project), and the tectonics and metallogenesis of the West African and Guyanese Cratons (WAXI & SAXI).

Mark has extensive experience collaborating with the minerals sector through AMIRA International industry consortium funding for the WAXI and SAXI projects and the MinEx CRC, a partnership between industry and the Australian Government. Mark is also involved in an MRIWA industry/state government collaboration on the Paterson Orogen, which is supported by funding from the ARC, MRIWA and industry.

Tuesday 5th March
4pm AEDT

21 February | PhD Candidates from the ARC Training Centre for Transforming Maintenance through Data Science

Reliability Inference with Extended Sequential Order Statistics by Tim Pesch

In this presentation I will address the complexity of non-identical components in multi-component, load-sharing systems. For most technical systems the assumption of heterogeneous components is reasonable, since components are either of different types or vary in their functions within the system. While most reliability-related work resorts to the assumption of homogeneous components, I aim to address the often more realistic assumption of heterogeneous components by extending the model of so-called ‘Extended Sequential Order Statistics’ with two novel inferential methods.

The first is the derivation of Maximum Likelihood Estimates (MLEs) of the underpinning model parameters; the second is the introduction of a likelihood ratio test that can decide whether components can be assumed identical. Both methods are powerful tools in reliability contexts. The former increases our understanding of component behaviour, especially upon failure of other components. This knowledge empowers system operators to make better decisions regarding maintenance schedules and failure time prediction. The latter supports operators in their quest to identify component equivalence.

Speaker: Tim Pesch

Tim is a mathematician with experience in reliability probability estimation. He completed his studies in mathematics with a focus on frequentist statistics at RWTH Aachen University in Germany. His Master’s thesis featured the combination of two well-established models in reliability theory, the Stress-Strength model and the Competing Risk model, in the presence of censored data. His research yielded maximum likelihood estimators for model parameters as well as the reliability probability under an exponential assumption, amongst other inferential results.

Conveyor Belt Wear Forecasting through a Bayesian Hierarchical Modeling Framework using Functional Data Analysis and Gamma Processes by Ryan Leadbetter

Reliability engineers make critical decisions about when and how to maintain conveyor belts, decisions that can significantly impact the production of the mine. The engineers use thickness measurements across the belt’s width to justify these decisions. However, the current approaches to forecasting the wear of conveyor belts are naive and throw away valuable information about the spatial wear characteristics of the conveyor. We have developed a new method for forecasting belt wear that retains the wear profile’s spatial structure and considers the wear rate’s heterogeneity – caused by operation and ore body composition variations.

Speaker: Ryan Leadbetter

Ryan is a mechanical engineer who is now undertaking a PhD in applied statistics through the Centre for Transforming Maintenance Through Data Science. Ryan’s PhD focuses on the predictive maintenance of overland iron ore conveyors. More specifically, he focuses on using condition monitoring and maintenance data to inform decisions on how and when to maintain mining machinery.

7 March | Modelling Deterministic Dynamics from Data

There has been a lot of recent interest in various computational methods that allow one to extract models of the deterministic evolution operator of a dynamical system from time series data. These methods have become increasingly successful as they are able to leverage the increasing computational resources available today. I will start by contrasting these efforts against some earlier attempts to do this (including some of my own) and then move on to describe our recent work with reservoir computers.

Viewed in this setting, reservoir computers are pattern generators that appear particularly appropriate to the task of reconstructing dynamics, as their memory mimics the role of Takens’ theorem in delay reconstruction. I will briefly explore some of these ideas and finish by describing our attempts to quantify the performance of reservoirs and apply them to modelling tasks in industrial settings.

Speaker: Professor Michael Small 

Michael Small is the CSIRO-UWA Chair of Complex Systems and a former Future Fellow. He is the Deputy Editor-In-Chief of the journal Chaos, and Main Editor of Physica A. He is a Chief Investigator of the ARC Industrial Transformation Training Centre for Transforming Maintenance Through Data Science, the Industrial Transformation Research Hub for Transforming energy Infrastructure through Digital Engineering, and the Medical Research Future Fund project for Transforming Indigenous Mental Health and Wellbeing.

When he is not transforming things, his research relates to complex systems, network science, dynamical systems and chaos, and focusses on data-driven approaches to understanding dynamical systems.

21 March | Seven Algorithms for the Same Task (Testing Uniformity)

Suppose you get a set of (independent) data points in some discrete but huge domain {1,2,…,k}, and want to determine if this data is uniformly distributed. This is a basic and fundamental problem in statistics, and has applications in computer science, not all made up: from testing the mixing time of a random walk, to detecting malicious changes in a data stream, to selecting a good algorithm depending on the input distribution.

The goal, of course, is to perform this task efficiently, both time- (time complexity) and data-wise (sample complexity). In this talk, I will survey and discuss seven algorithms for uniformity testing, and explain some of their advantages and disadvantages.
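To make the task concrete, here is a minimal sketch of one classical approach, a collision-based tester. Whether it is among the seven algorithms covered in the talk is an assumption, and the function names, threshold constant and demo parameters are illustrative rather than taken from the speaker. The idea is that the probability of two independent samples colliding is exactly 1/k under the uniform distribution, and at least (1 + 4ε²)/k for any distribution at total variation distance ε or more from uniform, so thresholding the empirical collision rate halfway between the two separates the cases.

```python
from collections import Counter
import random

def collision_rate(samples):
    """Fraction of unordered sample pairs that take the same value."""
    n = len(samples)
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (n * (n - 1) / 2)

def looks_uniform(samples, k, eps):
    """Accept if the empirical collision rate is below the midpoint threshold.

    Under uniformity on {1, ..., k} the expected rate is 1/k; at total
    variation distance >= eps from uniform it is at least (1 + 4*eps**2)/k.
    """
    threshold = (1 + 2 * eps**2) / k
    return collision_rate(samples) <= threshold

# Illustrative demo (parameters are assumptions, chosen so the gap is clear).
rng = random.Random(0)
k, n, eps = 1000, 20000, 0.5

uniform_samples = [rng.randint(1, k) for _ in range(n)]
# Uniform on only half the domain: total variation distance 0.5 from uniform.
skewed_samples = [rng.randint(1, k // 2) for _ in range(n)]

print("uniform data accepted:", looks_uniform(uniform_samples, k, eps))
print("skewed data accepted:", looks_uniform(skewed_samples, k, eps))
```

The demo uses far more samples than the test needs, purely so that the outcome is stable. The sample complexity the abstract mentions is about how small n can be while keeping the answer reliable.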

Speaker: Dr Clément Canonne

Dr Clément Canonne is a Lecturer in the School of Computer Science at the University of Sydney, where he does research in theoretical computer science. His main research interests lie in property testing, learning theory, and, more generally, randomised algorithms and the theory of machine learning.

Prior to joining the University of Sydney, Clément was a Goldstine postdoctoral fellow at IBM Research Almaden, and a Motwani fellow at Stanford University. He obtained his Ph.D. from Columbia University in 2017.

4 April | Maximising the Resilience of Grasslands to Extreme Precipitation, Nutrients and Grazing

Global climate change has altered precipitation patterns and disrupted the characteristics of drought and rainfall events. This, combined with nutrient and grazing practices in grasslands, will likely expose vegetation to conditions beyond its adaptive capacity, altering biodiversity and productivity, and changing ecosystem function. Knowledge of how grasslands respond to these pressures, and their potential to recover, is needed to maintain essential ecosystem services in the future.

In this talk, I will introduce my PhD research and present some of the findings from my project to date. I will describe how we used a new drought tracking technique to characterise the spatiotemporal dynamics for past drought and rainfall events in Australia. I will also present some preliminary results from our field experiment, including how grassland productivity and diversity respond to extreme precipitation, nutrient addition, and cattle grazing.

Speaker: Elise Verhoeven

Elise is a PhD candidate with the School of Life and Environmental Sciences at The University of Sydney. Elise is interested in how plant communities respond to disturbances, and which plants could be important for maintaining ecosystem function under global change conditions. Her PhD research is looking at the interactive effect of extreme precipitation (drought and rain), nutrient addition, and cattle grazing on the structure, productivity, and ecosystem function in grasslands in north-west NSW.

18 April | Statistical Models for Social Networks

In this talk, I describe an approach to modelling social networks that has its origins in models for interactive spatial processes, including in plant ecology. The approach construes global network structure as the outcome of dynamic, potentially realisation-dependent processes occurring within local neighbourhoods of a network. I describe a hierarchy of models implied by the approach and note that they can be estimated from partial network data structures obtained through certain types of network sampling schemes. I illustrate how these models enhance our capacity to model observed human networks and present an example of their application to the transmission of an infectious disease.

Speaker: Professor Philippa “Pip” Pattison AO (Chair of DARE Advisory Board)

A quantitative psychologist by background, Professor Pattison began her academic career at the University of Melbourne. She served in a number of academic leadership roles at the University of Melbourne, including president of its Academic Board from 2007-2008 and Deputy Vice-Chancellor (Academic) from 2011-2014 before taking up the role in 2014 of Deputy Vice-Chancellor Education at the University of Sydney. During her term, Pip led the University’s strategy for learning and teaching, with a major focus on transformation of the undergraduate curriculum, the student experience and new approaches to postgraduate education and microcredentials. Pip retired from the role at the end of 2021.

The primary focus of Professor Pattison’s research is the development and application of mathematical and statistical models for social networks and network processes. Applications have included the transmission of infectious diseases, the evolution of the biotechnology industry in Australia, and community recovery following bushfire.

Professor Pattison was elected a Fellow of the Academy of the Social Sciences in Australia in 1995 and of the Royal Society of NSW in 2017.

Professor Pattison was named on the Queen’s Birthday 2015 Honours List as an Officer of the Order of Australia for distinguished service to higher education, particularly through contributions to the study of social network modelling, analysis and theory, and to university leadership and administration.

2 May | Big Data, Big Dreams: How Remote Sensing and Big Data are Changing Our View of the Coast

Coastal science and engineering is a relatively young field and historically has lacked sufficient data to be able to understand how this complex earth system works at both large temporal and spatial scales. Yet, with a large portion of the world’s population living within 50km of the coastline, we are being asked to provide advice and understanding on how coastlines will change into the future.

This talk will first provide a bit of context on just how data sparse our field is, and how we are now engaging and rapidly trying to catch up to our hydrological colleagues. We will discuss how we are applying basic machine learning techniques to improve our ability to predict coastal change at a variety of timescales of interest to the public, from individual storms, to where the coast might be by 2100.

The talk will be aimed at a broad (non-expert) audience and will discuss the challenges associated with trying to model the coastline and the techniques we have applied so far. We’d love thoughts and ideas from the audience as well.

Speaker: Associate Professor Kristen Splinter

Kristen is an ARC Future Fellow and Deputy Director of the Water Research Laboratory at UNSW Sydney. Her work encompasses a wide range of coastal topics examining sandy beach evolution from storms to multiple decades. She has developed a number of behavioural type numerical models to predict sandbar and shoreline evolution and the focus of her Fellowship will be to develop regional scale models for long-term shoreline prediction, along the embayed coastlines of NSW. She’s been dipping her toes into machine learning since about 2015 but her students are the real experts.

Speaker: Patrick ‘Kit’ Calcraft

Kit is a DARE affiliated PhD candidate in his first year working on machine learning methods for shoreline prediction, including bridging the gap between physics and ML. He is co-supervised by Associate Professor Kristen Splinter, Dr Josh Simmons (DARE) and Professor Lucy Marshall. He will present an overview of what he’s been up to in year 1 of his PhD.

16 May | Control Type Particle Methods for Bayesian Data Assimilation

Ensemble Kalman type methods have seen an explosion in use in data assimilation applications and more recently for a range of learning tasks. Despite their desirable stability properties, they are not consistent with Bayes’ theorem for non-linear, non-Gaussian systems.

Recently, a range of controlled particle filters have been proposed which aim to emulate the structure of Ensemble Kalman type methods whilst simultaneously providing consistent samples in the asymptotic limit. More specifically, such filters involve constructing a control law to steer particles such that the corresponding probability distribution satisfies a variational Bayes formula.

I will provide an overview of this new class of filters and how they can be used for nonlinear ensemble data assimilation and Bayesian inverse problems. I will also explore a framework for deriving these filters, which highlights the main differences among them.

Speaker: Dr Sahani Pathiraja (DARE Chief Investigator)

Sahani Pathiraja is a Lecturer (tenure track assistant professor) in Data Science at the University of New South Wales (UNSW) in Sydney. Sahani received her double Bachelor of Science (mathematics) and Engineering (environmental) in 2011 with Hons (1st Class) and the University Medal, and her PhD in Civil and Environmental Engineering in 2018, all from the University of New South Wales (UNSW Sydney). Her dissertation topic was on improved data assimilation methods for hydrologic applications.

From 2017-2022 she was a postdoctoral researcher in the Institute of Mathematics at the University of Potsdam, Germany as part of the Collaborative Research Centre on Data Assimilation. She worked primarily on the theoretical analysis of modern sequential Monte Carlo methods as well as on new applications of data assimilation in biomedical modelling.

Sahani’s technical expertise spans both the mathematical theory and applications of data science methods, especially in hydrology. Her research is motivated by 1) how applications can inspire new theory and 2) how theory can be developed in a more practically relevant way. Specifically, her research primarily focuses on Bayesian inference, Monte Carlo methods, stochastic analysis of data assimilation methods and uncertainty quantification.

30 May | Hydrological Modelling, Forecasting and Data Post-Processing

The Bureau of Meteorology provides a range of water information products and forecast services to the Australian community. While the Bureau has been providing a flood forecast and warning service for several decades, new water forecasting services have been developed and brought into production over the last 15 years. These forecast services can be categorised as either nation-wide (grid-based) or targeting specific locations (point-based) and cover different temporal scales. This seminar will begin by providing an overview of these forecasting services, with an emphasis on forecasts at the seasonal timescale.

Water forecasting services encompass the Australian Water Outlook (AWO) and seasonal streamflow forecast (SSF) service. The AWO provides historical analysis, seasonal forecasts and decadal projections of key variables of the surface water balance: root-zone soil-moisture, runoff and actual evapotranspiration. AWO is underpinned by the Australian Water Resource Model (AWRA-L), run at a daily time-step and at a 5km resolution. The seasonal streamflow forecast (SSF) service provides point-based seasonal forecasts of river discharge at 341 point-locations across Australia, coincident with selected river gauging stations and major water storages.

The Bureau has embarked on a 10-year research plan focused on Earth System Modelling. A unified modelling system supporting all forecast products and services will drive efficiency gains, improve product consistency and remove the maintenance burden of disparate systems now in operation. The final part of this seminar will cover a scientific evaluation to unite both the AWO and SSF service. This unification has been achieved by applying statistical post-processing to AWO seasonal forecasts to generate seasonal streamflow forecasts.

Speaker: Dr Christopher Pickett-Heaps

Dr Christopher Pickett-Heaps is a hydrologist at the Bureau of Meteorology and is a member of the Hydrological Applications team in the Science and Innovation Group of the Bureau. Christopher has been with the Bureau since 2013. His primary role is as a hydrological modeller, working to extend the capability of current water forecasting models and systems. Currently Christopher is the scientific lead of a project to integrate seasonal streamflow forecasting with seasonal landscape forecasting. Prior to this, Christopher worked on the Australian Water Outlook. Christopher has also contributed to the development of operational systems underpinning different water forecasting services.

Christopher was awarded a PhD from the University of Melbourne in earth-system modelling after studying in both Australia and France. Christopher then continued working in France before moving to Boston for two years as a post-doctoral fellow at Harvard University. Christopher returned to Australia in 2010 to take a 3-year position at The CSIRO before joining the Bureau. Christopher is based in Canberra.

8 August | How Machine Learning Can Cut the Cost of Downscaling Evapotranspiration

Estimating future climate change and its uncertainties relies on the analysis of a range of global climate models (GCMs) and the assessment of their spread. To meet the spatial scales required to study the local impacts of climate change, GCMs are downscaled dynamically using regional climate models or empirically using statistical methods and machine learning techniques. Due to the high computational cost involved in dynamical downscaling (DD), only a few GCMs are considered in this approach, resulting in a limited range of predictions that might not be sufficient to accurately assess the uncertainty in the predicted changes. Statistical methods and machine learning, on the other hand, perform downscaling at a much lower cost, but can perform poorly when extrapolated to future climates.

We introduce a hybrid downscaling framework that leverages the merits of dynamical downscaling and machine learning while overcoming the limitations of a single approach. In the new framework, a machine learning model is developed for each coarse grid cell to predict the subgrid distribution of the variable of interest as a function of the local climate and subgrid land surface characteristics. The fine-scale data needed for training ML is sourced from dynamically downscaling 10 representative years from the entire distribution of the coarse data.

As a proof of concept, we apply the new framework to downscale daily evapotranspiration from the Australian BARRA-R reanalysis dataset over Sydney from 12.5km down to 1.5km. We employ three machine learning algorithms and demonstrate their performance. We also explore spatial transitivity, i.e. the capability of the trained ML models to downscale regions outside the spatial domain they were trained in, and we demonstrate when it is effective.

In the proposed framework, multiple GCMs can be downscaled for the same cost as downscaling a single GCM. Ultimately, this should improve our ability to analyse future changes in local climate, and provide more robust information for impacts adaptation planning.

Speaker: Dr Sanaa Hobeichi

Sanaa Hobeichi is a post-doctoral researcher at the University of New South Wales (UNSW) Climate Change Research Centre and the ARC Centre of Excellence for Climate Extremes (CLEX). Her research spans climate science and machine learning, focusing on developing machine learning methods for downscaling climate data and improving drought predictions. She is also interested in explainable machine learning and physics-informed machine learning.

Sanaa is passionate about advancing climate science education for secondary school students. By leading the Climate Classrooms workshops for teachers, she facilitates the development of teaching resources that effectively incorporate climate science research into the Australian Curriculum.

Sanaa obtained her PhD in Climate Science from UNSW, and she holds a BSc in Computer Science and Applied Mathematics from the Lebanese University, and a MSc in Environmental Remote Sensing from Qatar University.

22 August | Continent-Scale Groundwater Models: Constraining Flow Pathways Across Eastern Australia

Numerical models of groundwater flow play a critical role for water management scenarios under climate extremes. Large‑scale models play a key role in determining long range flow pathways from continental interiors to the oceans, yet struggle to simulate the local flow patterns offered by small‑scale models. We have developed a highly scalable numerical framework to model continental groundwater flow that captures the intricate flow pathways between deep aquifers and the near surface. The coupled thermal‑hydraulic basin structure is inferred from hydraulic head measurements, recharge estimates from geochemical proxies, and borehole temperature data using a Bayesian framework. We use it to model the deep groundwater flow beneath the Sydney–Gunnedah–Bowen Basin, part of Australia’s largest aquifer system. Coastal aquifers have flow rates of up to 0.3 m/day, and a corresponding groundwater residence time of just 2,000 years.

In contrast, our model predicts slow flow rates of 0.005 m/day for inland aquifers, resulting in a groundwater residence time of ∼ 400,000 years. Perturbing the model to account for a drop in borehole water levels since 2000, we find that lengthened inland flow pathways depart significantly from pre‑2000 streamlines as groundwater is drawn further from recharge zones in a drying climate. Our results illustrate that progressively increasing water extraction from inland aquifers may permanently alter long‑range flow pathways. Our open‑source modelling approach can be extended to any basin and may help inform policies on the sustainable management of groundwater.

Speaker: Dr Ben Mather

Dr. Ben Mather is a research fellow in the EarthByte Group within the School of Geosciences at The University of Sydney. He is an expert in fusing multi-disciplinary datasets with Earth evolution models to understand the occurrence of enigmatic volcanoes. Related research interests include the cycling of volatiles within the Earth, probabilistic thermal models of the lithosphere to unravel past tectonic and climatic events, and the response of groundwater flow pathways to tectonic forces.

A firm supporter of open-source software, Dr. Mather develops computational methods and tools that adhere to Findable, Accessible, Interoperable and Reusable (FAIR) standards and which are hosted in public repositories. He is a vocal advocate for the integral role of geoscience in responding to challenges we face in transitioning to the carbon-neutral economy. Dr. Mather has been interviewed in national and international print media, TV, and radio on a wide variety of subjects including earthquakes, volcanoes, groundwater, and critical minerals.

GitHub: github.com/brmather
Twitter: @BenRMather

19 September | Teaching Computers How to See Rocks - Using Computer Vision Models to Extract Visual Datasets from Geological Images

The mining industry is currently going through a significant phase of digital transformation to try and meet the rising global demand for minerals. As part of this digitalisation and modernisation, mining companies are collecting larger volumes of more complex data than ever before. Technologies that can assist in turning data into information and insights are key to prevent geoscientists from drowning in their new sea of data.

In this talk, we will explore how recent advances in computer vision – specifically in the field of deep learning – have provided algorithms and workflows that have the ability to efficiently augment and automate the many observational tasks in geoscience such as drill core logging. Several case studies will be presented to demonstrate how these models are trained and deployed to solve challenging geoscience problems.

Speaker: Brenton Crawford

Brenton Crawford is a geologist, data scientist, entrepreneur and mining technology enthusiast. He studied geology and geophysics at Monash University and began his career in consulting working for PGN Geoscience in a number of geological and geophysical roles in both exploration and mining. Brenton has also worked as a geophysicist and data scientist for MMG Exploration working in nickel, copper and zinc exploration and project generation in Australia, Africa and South America.

In 2015, Brenton co-founded Solve Geosolutions – Australia’s first exploration and mining-focused data science consultancy, which has since been acquired. In 2018, Brenton co-founded Datarock – a computer vision technology company geared at building productionised image and video analysis solutions for exploration and mining, where he has served as both its Head of Business Development and Chief Operating Officer. Brenton currently serves as Datarock’s Chief Geoscientist and Technologist.

17 October | Art and Science of Causal Inference

Speaker: Professor Sally Cripps

Sally Cripps is an internationally recognized scholar and leader in Bayesian Machine Learning (ML) and Artificial Intelligence (AI). In addition to her role as Director of Technology at the Human Technology Institute, she is a Professor of Mathematics and Statistics at the University of Technology Sydney. Sally has held a number of leadership positions in ML and AI. She was cofounder and co-director of the University of Sydney’s Centre for Translational Data Science (CTDS), and founder and Director of the Australian Research Council’s Industrial Transformation Training Centre (ARC ITTC) Data Analytics for Resources and Environments (DARE). Most recently Sally was Research Director of Analytics and Decision Science and Science Director of the Next Gen AI Training Programme in CSIRO’s Data61. She was also chair of the International Society for Bayesian Analysis (ISBA) section on Education and Research in practice. She has served as a board member for Climate Services for Agriculture in the Department of Water and the Environment and as a member of the Data Analytics Centre of NSW Health and Human Services Expert Working Group and the NSW Smart Cities Research & Academic Working Group.

Sally’s research focuses on the development of new foundational methods in AI to address global challenges. Her work has been published in the world’s most prestigious statistical and machine learning journals and conferences, such as the Journal of the Royal Statistical Society, the Journal of the American Statistical Association: Theory and Methods (JASA), Biometrika, the Journal of Computational and Graphical Statistics (JCGS), the Conference on Neural Information Processing Systems (NeurIPS) and the Conference on Artificial Intelligence and Statistics (AISTATS). She has applied these methods to a diverse range of fields including social disadvantage, mental health, climate, minerals and the environment. In recognition of the quality of her research, Sally was awarded an ARC Future Fellowship and a visiting scholar fellowship to the Alan Turing Institute in the UK. Sally has attracted over $25M in industry, government and philanthropic funding.

31 October | Challenges in Annotating Datasets to Quantify Bias

Recent advances in artificial intelligence, including the development of highly sophisticated large language models (LLMs), have proven beneficial in many real-world applications. However, evidence of inherent bias encoded in these LLMs has raised concerns about equity. In response, there has been an increase in research dealing with bias, including studies focusing on quantifying bias and developing debiasing techniques. Benchmark bias datasets have also been developed for binary gender classification and ethical/racial considerations, focusing predominantly on American demographics. However, there is minimal research in understanding and quantifying bias related to under-represented societies.

Motivated by the lack of annotated datasets for quantifying bias in under-represented societies, we endeavoured to create benchmark datasets for the New Zealand (NZ) population. We faced many challenges in this process, despite the availability of three annotators. This research outlines the manual annotation process, provides an overview of the challenges we encountered and lessons learnt, and presents recommendations for future research.

Speaker: Professor Gill Dobbie

Professor Gillian Dobbie is widely recognised for her research in database systems and artificial intelligence. She holds a PhD in Computer Science from the University of Melbourne, where she specialised in database theory and design. Her research interests encompass a wide range of topics, including conceptual modeling, knowledge representation, query optimisation, data privacy, data stream mining, continual learning, and adversarial learning. She has published over 160 papers in top-tier conferences and journals, such as SIGCSE, IJCAI, ICDM, SIGIR, CIKM, ICDE, SIGMOD, TODS, and ACM Computing Surveys. She was awarded the DASFAA 10+ Year Best Paper Award for her research contribution with Prof Ling Tok Wang and Prof Mengchi Liu. Professor Dobbie is a Fellow of the Royal Society of New Zealand and Chair of the Marsden Fund Council.

Throughout her career Professor Dobbie has been a catalyst for collaboration and interdisciplinary work, leading the development of projects such as Precision Driven Health, which received the MinterEllisonRuddWatts Research & Business Partnership Award. She continued to build bridges between academia and industry through her leadership of the Auckland ICT Graduate School.

Beyond her academic pursuits, Professor Dobbie is actively engaged in promoting diversity and inclusivity in STEM fields. She is passionate about encouraging underrepresented groups to pursue careers in computer science, fostering an environment where everyone can thrive.

14 November | The Eratos Platform - A Tool to Assist with Research and Commercialisation

Dr Tom Remenyi will present his view on the research sector and some of the key barriers to transforming research outputs into impact. Tom will then present the Eratos platform, a tool designed to assist with data management and analytics, with a particular focus on helping research teams commercialise their research. Tom will demonstrate how the platform is currently being used, describe some successful projects so far, and point out some of the areas where researchers are finding value.

Speaker: Dr Tom Remenyi

Dr Tomas Remenyi is a climate services professional, expert at translating complex climate science into useful, accessible products, tools or advice. Tom focuses on meaningful engagement with stakeholders to rapidly determine where their needs intersect with what known science can actually deliver. Tom has a decade of experience delivering useful climate services, with more than 50 projects across a range of sectors including natural hazards, emergency services, tourism, energy, and agriculture.

Tom has a dynamic mix of science and commerce training that allows him to view both the research and commercial sectors from a perspective that differs from others. As a ‘systems thinker’, Tom is always trying to figure out what the blockers are to positive change within society across multiple sectors. Tom is currently supervising two PhD students and regularly supports executive leadership teams to better understand how climate change will impact their operations and strategy. After 20 years working as an academic, Tom has now transitioned into the commercial realm to help bridge the research-commerce divide.

12 December | NSW Biodiversity Offsets Scheme

Biodiversity offset credits in New South Wales are transacted within a regulatory environment defined by detailed trading rules and many different types of biodiversity credits that can lead to thin markets and high transaction costs. In this talk, I will present a recent paper I co-authored with Charles Plott (Caltech), Gary Stoneham (CMD), Ingrid Burfurd (CMD), and Mladen Kovac (NSW DPE), which is not only academically valuable but also practically indispensable. It provides the necessary guidance and structure to ensure that the proposed market tool is not just an abstract concept but a practical and effective solution to the challenges faced in biodiversity offset credit trading in NSW.

The paper presents the key elements of this market, including a search algorithm that identifies potential trading partners based on regulatory constraints and an online exchange tool that streamlines the process of price discovery and allocation of offset contracts.

The search algorithm for biodiversity offset rules relies on key comparisons, improving efficiency compared to linear searching. It systematically eliminates records until the target record is found. This algorithm is applied to BOS data structures in a predefined order to update the public register, initiating a new search cycle.
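The abstract does not name the search algorithm. As a rough illustration only, one well-known comparison-based search that discards records at every step is binary search over a sorted register. The sketch below assumes hypothetical integer credit identifiers and is not the method used in the paper:

```python
def find_record(sorted_keys, target):
    """Binary search over a sorted list of keys (hypothetical credit IDs).

    Returns (index, comparisons); index is -1 if the key is absent.
    Each key comparison discards about half of the remaining records,
    whereas a linear scan may have to inspect every record.
    """
    lo, hi = 0, len(sorted_keys) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_keys[mid] == target:
            return mid, comparisons
        if sorted_keys[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return -1, comparisons


# Hypothetical register of 1,000 sorted credit identifiers.
register = list(range(0, 2000, 2))
index, steps = find_record(register, 1234)
```

On a register of 1,000 records this needs at most 10 comparisons, compared with up to 1,000 for a linear scan.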

Speaker: Dr Rogelio Canizales-Perez

Dr Canizales-Perez is an Economist, with a PhD in Environmental Science and several years of experience as a Public Servant managing all environmental, economic, finance, and policy affairs that support the development of Natural Capital Markets in NSW, Australia (specialised in Natural Capital policy programs such as the Biodiversity Offsets Scheme). His professional journey has been characterised by a serious dedication to innovative (and disruptive) solutions and a profound sense of responsibility to contribute significantly to the organisation’s growth while fostering a positive and lasting impact on society.

He worked for more than nine years in the Mexican Government delivering products on nationwide strategic and regulatory water planning functions. Dr Canizales-Perez is a clear and effective communicator and highly skilled at econometrics, natural capital accounting, and environmental markets design with demonstrated experience as a Team Leader in developing innovative and fit-for-purpose market information tools.

1 February | From droughts in the Pacific to algal blooms at Bonnie Doon: can we predict them?

The first part of Floris’s talk will look at drought research in the Pacific Island Countries (PICs). Drought is becoming an increasing concern for food security and production in the PICs. For example, the 1997–98 drought cost Fijian farmers US$63 million in lost revenue, mostly from sugar cane farming, and threatened food security in the region. A recent study on drought impacts in PICs identified five research challenges, one of which is to develop drought early warning systems. The results of initial research into developing a drought forecasting system based on LSTMs will be presented.

The second part of the talk will look at using novel data sources to predict algal blooms. In 2000, it was estimated that algal blooms cost Australia more than A$95 million per year by impacting our waterways and water supplies, fisheries, agriculture, and recreational water use. Continuing research being conducted by UNSW suggests that the true cost is likely to be much higher, but it is yet to be assigned a dollar value. Furthermore, blue-green algal blooms (BGABs) pose a significant health risk to humans and stock. WaterNSW is responsible for the challenging task of monitoring and reporting BGABs. Unfortunately, it is nigh on impossible to monitor the entire water network in NSW as its resources are limited. Therefore, we are looking into ways in which we can use readily available data such as climate records, satellite data, and novel data sets such as dust data, to predict where and when blooms might occur. These predictions can be used to aid in managing blooms, e.g., by directing when and where to erect warning signs, deploying mobile aeration stations, and shunting water in and out of weirs to disrupt blooms. They can also be used to identify hotspots where longer term management strategies may be needed, or to monitor whether management plans have been effective in reducing BGABs.

Finally, a summary of current and future projects will be given, as a key goal of this talk is to identify potential future collaborations with members of DARE.

Speaker: Floris Van Ogtrop

8 February | In a world awash with personal data, how can we empower people to harness and control their data?

As technology pervades our lives, an increasingly rich ecosystem of digital devices can capture huge amounts of long-term personal data. A core theme of my research has been to create systems and interfaces that enable people to harness and control that data and its use. This talk will share key insights from a series of case studies from that work and plans to build upon these. The first case studies explored how to harness data from wearables, such as smart watches, for personal informatics interfaces that help us gain insights about ourselves over the long term, for analysis of a large dataset (over 140,000 people) and for Virtual Reality games for exercise. The second set of case studies is from formal education settings where personal data interfaces, called Open Learner Models (OLMs), can harness learning data. I will share key insights that have emerged for a research agenda: OLMs for life-wide learning; the nature of the different interfaces needed for fast, versus slow and considered, thinking; communicating uncertainty; scaffolding people to really learn about themselves from their data; and how these link to urgent challenges of education in an age of AI, fake news and truth decay.

Speaker: Judy Kay

22 February | Bayesian Computation - Why/when Variational Bayes, not MCMC or SMC?

Bayesian inference has been increasingly used in statistics and related areas as a principled and convenient tool for reasoning with uncertainty. Bayesian computation is often a challenging task, and modern applications of Bayesian inference, such as Bayesian deep learning, have called for more scalable Bayesian computation techniques. In this talk, I will give a quick introduction to Variational Bayes for scalable Bayesian inference. I will then provide a general discussion of its pros and cons, recent advances and applications, and some potential research directions.

Speaker: Minh-Ngoc Tran

5 April | IoT Enabled Sensors to generate data to monitor Health, Home and Environmental conditions

The advancement of sensing technologies, embedded systems, wireless communication technologies, nanomaterials and miniaturisation makes it possible to develop IoT-enabled smart sensing systems. IoT-enabled wearable and non-wearable sensors generate useful data to monitor physiological parameters as well as human activities continuously, to detect any abnormal and/or unforeseen situations which need immediate attention. Necessary help can therefore be provided in times of dire need. IoT-enabled sensors also provide real-time environmental data, which will provide full awareness of weather/climate and can be used to take any strategic/corrective actions to address issues. This seminar will discuss fabrication and developmental works on IoT-enabled sensors based on MEMS as well as flexible materials for home, health, and environmental monitoring.

Speaker: Subhas Mukhopadhyay

19 April | Practical Quantum Sensing to Address Real World Problems

Quantum inertial sensors, quantum sensors that measure vector gravity, the gravity gradient tensor, acceleration, rotation and time, offer unprecedented accuracy and very low in-run and bias offset drift. These properties are critical for many applications including the mapping of underground water resources, mineral exploration, underground structure detection and mapping, inertial navigation in GPS-denied scenarios, satellite navigation, and planetary exploration. When fused with a high bandwidth, high dynamic range classical sensor, we get the best of both worlds.

In the Quantum Sensors Group at ANU, we develop fit-for-purpose sensors based on detailed quantum models, and are just now developing our first sensors in field-deployable SWaP (size, weight and power). We exploit a host of techniques, from Bose-Einstein condensed sources, to large momentum transfer atomic beam splitting, to quantum squeezing and, in collaboration with Sydney company Q-CTRL, optimised composite pulses to provide immunity to environmental noise. This talk will be an introduction to these very promising sensors and a discussion of a variety of applications. I will discuss the strengths and weaknesses of these sensors and identify the most promising applications.

Speaker: John Close

3 May | Native Mammals Disappearing in Northern Australia

Northern Australian savannas hold exceptional biodiversity values within largely intact vegetation complexes, yet many of the 180+ mammal species, and some other taxa, found in the region when Europeans colonised Australia are Endangered. Recently, 10 mammal species were added to the 20 or so already listed in the Australian endangered category, one up-listed to Critically Endangered, one to Extinct, 2 un-listed, 2 down-listed to Vulnerable and so on. Current predictions suggest that 9 species of mammal in northern Australia are in imminent danger of extinction within 20 years. We examined the robustness of the assumptions of status and trends in light of the low levels of monitoring of species and ecosystems across northern Australia, including monitoring the effects of management actions. The causes of the declines include a warming climate, pest species, changed fire regimes, grazing by introduced herbivores, and diseases.

Speaker: Noel Preece

17 May | Quantum Computing for Statisticians and Data Scientists

Quantum computing has emerged as the next computing technology paradigm, which promises to transform many critical fields, such as pharmaceutical and fertilizer design, supply chain and traffic optimisation, or optimisation for machine learning tasks. This is an exciting development for statisticians and data scientists because it will give rise to a new evolutionary branch of statistical and data analytics methodologies. This seminar will introduce quantum computing, explain its power and challenges, and provide a few examples of applications of quantum computing to problems of interest to statisticians.

Speaker: Anna Lopatnikova

31 May | Potential Ecological and Human Health Risks of PFAS Contamination in Alaska

PFAS (per- and poly-fluoroalkyl substances) are a class of putatively toxic chemicals used in firefighting foams and some industrial processes. Escape of these chemicals into waterways has implications for ecological and human health. Their long environmental half-lives, detection difficulty and remediation complexities make them especially problematic. This DARE seminar will introduce relevant findings from the rapidly evolving field of PFAS research, as performed across Alaska.

Speaker: Kristin Nielsen

28 June | Data and Inference for Plant Biodiversity

This seminar will begin with an overview of research at the Gardens, and some of the plant resources that enable it, with a special focus on data. Next, three case studies will be presented. These highlight topics where there is opportunity to develop approaches to unlock data in RBG collections, or improve inference from rich genetic datasets. The case studies focus on understanding plant biodiversity (e.g., links between traits and environment), or directly informing conservation and restoration actions (e.g., managing genetic risks to endangered plants).

Speaker: Jason Bragg

12 July | Quantifying Disturbance in an Age of Rapid Environmental Change

We have entered a new era of the Anthropocene and are facing a biodiversity crisis, with extinction rates of species at least twice that of the background rate. Biodiversity loss is coupled with rapid environmental change and increases in disturbance events, such as extreme weather events and wildfires. Ecologists are now concerned that entire ecosystems are at risk of collapse. To address this urgent challenge, we need to understand how disturbance events affect individual species and communities, and how these changes propagate through ecosystems. The first part of the seminar will include an example of the population dynamics of a threatened species in a highly variable environment, an example of how wildfire operates in arid Australia and, lastly, how arid ecosystems may be modified by climate change. The second part of the seminar will introduce new research programs started since 2019 that expand these concepts into agricultural and forest ecosystems.

Speaker: Aaron Greenville

9 August | Statistical Models to Incorporate Heterogeneity in Spatiotemporal Prediction

Dirichlet processes and their extensions have reached great popularity in Bayesian nonparametric statistics. They have also been introduced for spatial and spatio-temporal data, as a tool to analyse and predict surfaces.

A popular approach to the Dirichlet process in a spatial setting relies on a stick-breaking representation, where the dependence over space is described in the definition of the stick-breaking probabilities. Extensions to include temporal dependence usually introduce a temporal dependence among the atoms of the Dirichlet process; however, this approach does not let us properly test and incorporate a possible interaction between space and time.

In this talk, a Dirichlet process is proposed where the stick-breaking probabilities are defined to incorporate both spatial and temporal dependence. An advantage of the method is that it offers a natural way to test for separability of the two components. The performance of this approach will be tested on simulations and a real-data example from meteorology.
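For readers unfamiliar with the stick-breaking representation mentioned above, a minimal sketch of the standard, non-spatial construction may help: each weight is a Beta(1, alpha) fraction of whatever remains of a unit-length stick. The function name, the truncation level and the value of alpha are illustrative assumptions; the spatial and temporal dependence proposed in the talk is not modelled here.

```python
import random


def stick_breaking_weights(alpha, n_atoms, seed=0):
    """Truncated stick-breaking weights for a Dirichlet process.

    Standard construction: V_k ~ Beta(1, alpha) and
    w_k = V_k * prod_{j<k} (1 - V_j).
    Returns the first n_atoms weights and the leftover stick length.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any atom
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, alpha)  # fraction broken off this step
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


# Illustrative values only: concentration alpha = 2, truncation at 20 atoms.
weights, leftover = stick_breaking_weights(alpha=2.0, n_atoms=20)
```

The weights and the leftover stick always sum to one. In dependent versions of the process, the break fractions would be allowed to vary with location (and, in the proposal described above, with time) rather than being drawn independently.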

Speaker: Clara Grazian

23 August | Digital Soil Spectroscopy

Digital spectroscopy is transforming the way we characterise soils. Spectroscopy across numerous electromagnetic ranges produces various types of digital spectra, which can provide soil information. This presentation will look into mathematical and statistical techniques to extract information from the spectra to predict soil information. We will discuss techniques from machine learning (large p, small n) to deep learning models (large p, large n) and the challenges and opportunities in analysing spectral data from the lab to the field.

Speaker: Budiman Minasny

1 November | Gaussian Processes in Geology and Hydrological Model Selection

Gaussian Processes (GPs) provide a probabilistic method of modelling functions that can be applied to both classification and regression problems. The first part of this talk will present a range of geological problems that GPs have been applied to. The second part of the talk will present the use of Bayesian model selection to evaluate Lower Namoi aquifer water balance models. This method tests the likelihood that different hypothesised components (inflows and outflows) are contributing to the water balance and the impact of other components on these likelihoods.

Speaker: Katie Silversides

15 November | Landscape Dynamics from Catchment to Global Scale: Long Term Sediment Transport & Species Migration

Our capability to reconstruct past landscapes, and the processes that shape them, underpins our comprehension of paleo-Earth, from its tectonic, atmospheric, and oceanic past dynamics to the evolution of life.

First, I will present a global-scale landscape evolution model assimilating paleo-elevation and paleo-climate reconstructions over the past 100 Myr. The simulations track the evolution of geomorphic and sedimentary features, including paleo-physiography maps, sediment fluxes, and stratigraphic architectures. From these simulations, we could reappraise the role surface processes play in controlling sediment delivery to the oceans, and evaluate sedimentation rates and the distinct phases of sediment transfer from terrestrial to marine basins. This advance in global landscape evolution modelling opens new avenues to quantify the role that the constantly evolving physiography of the Earth has played in modulating the transport of sediments from mountain tops down to the ocean basins, ultimately regulating the carbon cycle and Earth’s climate fluctuations through deep time.

The idea that landscapes play a role in biological evolution has a long history that can be traced back to the 19th century with Darwin and Wallace, when faunal and floral boundaries were noted to correspond to physiographic discontinuities and gradients. More recently, species adaptation in response to the motion of continents has been extensively studied. Yet, only a few studies have looked at how the evolution of migration pathways is modulated by Earth’s morphological changes on geological time scales. In this second part of the seminar, I will focus on the Quaternary evolution of Sundaland, the partially inundated shelf separating Java, Sumatra and Borneo from the Malay Peninsula. Building upon recent work on the climatic and tectonic history of the region, I ran a series of landscape evolution simulations to (1) investigate the role that physiographic and climatic changes might have played in regional biological diversification and (2) quantify how landscape dynamics could influence species migration. Specifically, I evaluate the regional geomorphological evolution by characterising the main paleo-rivers’ routing history, associated watershed evolution and multiple morphometrics describing the landscape complexity (slopes, elevational range, erosion/deposition rates). From these simulations, I will present two applications. First, focusing on the past 1 Myr, I will show that physiographic changes have modified the regional connectivity network and remodelled the pathways of species dispersal, supporting the theory that rapidly evolving physiography has fostered Quaternary biodiversification across Southeast Asia. Then, from predicted Sundaland physiography, I reconstruct Homo erectus dispersal routes, coupling ecological movement simulations to a landscape evolution model, and find that the hospitable terra firma conditions of Sundaland facilitated the prior dispersal of hominins to the edge of Java. I then estimate a characteristic dispersal time of Homo erectus across Sundaland. Our comprehensive reconstruction method to unravel the peopling timeline of Southeast Asia provides a novel framework to evaluate the evolution of early humans.

Speaker: Tristan Salles

29 November | Natural Language Processing and Network Representation Learning: Algorithms and Applications

Natural language processing and graph representation learning are trending AI topics. They’ve received increasing attention due to their effectiveness in a wide range of application domains (e.g., healthcare, speech recognition) and downstream machine learning tasks (e.g., clustering, prediction). In this talk, Monica presents some state-of-the-art natural language processing and graph representation learning algorithms and discusses how natural language processing was adapted and integrated into network representation learning and analysis. In particular, Monica talks about existing graph representation learning research on different types of networks and research progress on the large-scale dynamic heterogeneous networks. This talk also showcases some applications of the two topics.

Speaker: Monica Bian

13 December | Understanding Algal Blooms in Shallow Waterbodies

Small, constructed waterbodies are designed to attenuate floods and enhance water quality. Despite a range of guidelines that inform the design of these waterbodies, many still experience harmful algal blooms (HABs). It is vital to improve our understanding of how small waterbodies respond to HABs, considering the increasing number of small waterbodies being built globally and increasing HAB risk with climate change. This seminar will introduce Shuang’s research using a data-driven approach in her PhD studies and current work. These studies include design recommendations for small, constructed waterbodies to limit HABs, and remote sensing detection methods for HABs and water quality in waterbodies on local and global scales.

Speaker: Shuang Liu