Machine Learning

Machine learning is a form of artificial intelligence that enables computers and machines to learn how to do complex tasks autonomously, without being explicitly programmed.

Machine learning involves teaching computers what to do by feeding them example data. This allows the computer to look for patterns and then make better future decisions based on the provided examples. The aim is for the computer to learn automatically, and adjust its actions, without human assistance.
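As a minimal sketch of learning from examples, the snippet below uses a nearest-centroid classifier, chosen here only because it is one of the simplest possible learners. The function names and the toy data are illustrative assumptions, not a method described on this page.

```python
def fit_centroids(examples):
    """Learn one average point (centroid) per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the new data point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Invented training examples: two numeric features per item, plus a label.
examples = [
    ((1.0, 1.0), "small"),
    ((1.5, 0.5), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.0), "large"),
]
centroids = fit_centroids(examples)
print(predict(centroids, (2.0, 1.0)))
```

No rule for telling "small" from "large" is written by hand; the decision comes entirely from the examples supplied.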

Deep learning

Deep learning, sometimes referred to as deep structured learning or hierarchical learning, is a subset of artificial intelligence and machine learning whereby computers are trained to perform human-like tasks (such as making predictions, understanding and identifying images or recognising speech). 

Deep learning involves building and training neural networks using large data sets. By performing the set task repeatedly, the machine finds patterns and learns from experience.
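The idea of learning by repetition can be sketched with a very small neural network. The example below assumes a single hidden layer, far shallower than the networks used in practice, trained by plain gradient descent on the XOR problem. The layer size, learning rate and number of passes are arbitrary illustrative choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_xor(epochs=2000, lr=0.3, hidden=3, seed=0):
    """Train a tiny network on XOR and return the mean squared error per pass."""
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # w1[j] holds [weight for x0, weight for x1, bias] of hidden unit j.
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w2 holds one weight per hidden unit, followed by the output bias.
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        g1 = [[0.0] * 3 for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        loss = 0.0
        for (x0, x1), target in data:
            # Forward pass.
            h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in w1]
            y = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden])
            err = y - target
            loss += err * err / len(data)
            # Backward pass (constant factors are absorbed into the learning rate).
            d_out = err * y * (1.0 - y)
            for j in range(hidden):
                g2[j] += d_out * h[j]
                d_hid = d_out * w2[j] * h[j] * (1.0 - h[j])
                g1[j][0] += d_hid * x0
                g1[j][1] += d_hid * x1
                g1[j][2] += d_hid
            g2[hidden] += d_out
        losses.append(loss)
        # Update every weight a small step against its accumulated gradient.
        for j in range(hidden):
            for k in range(3):
                w1[j][k] -= lr * g1[j][k]
        for j in range(hidden + 1):
            w2[j] -= lr * g2[j]
    return losses


losses = train_xor()
print(f"error on first pass: {losses[0]:.4f}, on last pass: {losses[-1]:.4f}")
```

Each pass over the data nudges the weights slightly, so the error shrinks with repetition. Real deep learning systems follow the same loop with many more layers, far more data and specialised hardware.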

Our researchers have made major contributions to advancing the mathematical tools that underpin deep learning theory. We use this world-class expertise to help organisations better understand their data.


  • Learning the deep structure of images

    This project seeks to develop technologies that will help computer vision interpret the whole visible scene, rather than just some of the objects therein. Existing automated methods for understanding images perform well at recognising specific objects in canonical poses, but the problem of whole image interpretation is far more challenging. Convolutional neural networks (CNNs) have underpinned recent progress in object recognition, but whole-image understanding cannot be tackled similarly because the number of possible combinations of objects is too large. The project thus proposes a graph-based generalisation of the CNN approach which allows scene structure to be learned explicitly. This would represent an important step towards providing computers with robust vision, allowing them to interact with their environment.

    Professor Anton van den Hengel; Dr Anthony Dick; Dr Lingqiao Liu

  • Semantic change detection through large-scale learning

    Identifying whether there has been a significant change in a scene from a set of images is an important practical task, and has received much attention. Although existing statistical techniques perform reasonably well, it has so far been impossible to achieve the high levels of accuracy demanded by most real applications. This is because changes in pixel intensity are not a particularly good indicator of significant change in a scene. We propose a semantic change detection approach which aims to classify the content of an image before attempting to identify change. This technology builds upon recent developments in large-scale classification which have dramatically improved both accuracy and speed.

    Professor Anton van den Hengel; Professor Chunhua Shen; Dr Anders Eriksson; Dr Qinfeng Shi; BAE Systems

  • Scaleable classification for massive datasets: randomised algorithms

    Classification is a fundamental data analysis technology and is applied every day in fields from astronomy to zoology. It is used to identify causes of disease, forms of tax evasion, and sources of oil, but is even more critical to developing data-bound sciences such as genomics, semantic document analysis and precision agriculture. This project will develop classification technologies capable of distinguishing between tens of thousands of classes, which are trained and applied to massive datasets. These technologies will deliver a significant increase in the scale of problem which may be tackled, and in the scale of benefits which may be achieved.

    Professor Anton van den Hengel; Professor Chunhua Shen; Dr Qinfeng Shi; LBT Innovations

  • Combined shape and appearance descriptors for visual object recognition

    The quantity of video generated each year is expanding rapidly. This increasing volume of visual information means that it is more likely that any particular event will be recorded, but that the footage will be harder to find. This applies to a collection of home videos as much as to television and movie footage. The object-recognition method to be developed has the potential to alleviate this situation, in which vast amounts of video data are available but have little value. Such an outcome would be a boon for Australian industry and offer a valuable export opportunity.

    Professor Anton van den Hengel; Associate Professor Anthony Dick

  • Image search for simulator content creation

    3D content creation represents one of the most labour-intensive stages of the process of constructing virtual environments such as simulators and games. In many cases it is possible to capture images or video of the environment to be simulated which may be used to assist the modelling process. This project aims to develop technologies based on search by which such imagery may provide both shape and semantic information to assist in the modelling process. The project builds upon recent developments in bag-of-words methods for image search. In particular, we propose a novel method by which information latent in the image database may be identified and used to improve the generative model underpinning this type of image search.

    Professor Anton van den Hengel; Associate Professor Anthony Dick; Sydac

  • Computational infrastructure for machine learning in computer vision

    Machine learning is responsible for many recent advances in image-based information analysis, from finding minerals in satellite images, to image-based guidance of autonomous vehicles. This progress is due to new methods for learning from the vast volumes of image-based data that are now available. These images present a great opportunity that is only just beginning to be exploited, as automated image analysis methods still lag far behind the human ability to interpret image information. This project will develop the specific infrastructure required to tackle this problem, allowing Australian researchers to carry out the large-scale image-based machine learning required to achieve automated understanding of the world through images.

    A. van den Hengel; I.D. Reid; S. Venkatesh; B. Vo; D. Suter; S. Gould; S.M. Lucey; A.R. Dick; C. Shen; D.Q. Phung
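The semantic change detection project listed above rests on the observation that pixel intensity is a poor indicator of meaningful change. The sketch below illustrates that contrast under simplifying assumptions: images are tiny grids of intensities, and the per-cell class labels are supplied by hand as a stand-in for the output of a classifier. The function names and data are hypothetical and do not represent the project's actual method.

```python
def pixel_change(before, after):
    """Mean absolute intensity difference between two equal-sized grids."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(before, after)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def semantic_change(labels_before, labels_after):
    """Fraction of cells whose class label differs between two label grids."""
    pairs = [(a, b)
             for row_a, row_b in zip(labels_before, labels_after)
             for a, b in zip(row_a, row_b)]
    return sum(a != b for a, b in pairs) / len(pairs)


# The same scene under dim and bright lighting: every pixel changes.
dim = [[50, 52], [48, 200]]
bright = [[110, 112], [108, 255]]

# Hand-written labels standing in for a classifier's output.
labels_dim = [["road", "road"], ["road", "car"]]
labels_bright = [["road", "road"], ["road", "car"]]
labels_later = [["road", "road"], ["road", "road"]]  # the car has left

print(pixel_change(dim, bright))                  # large, yet nothing meaningful changed
print(semantic_change(labels_dim, labels_bright))
print(semantic_change(labels_dim, labels_later))
```

Comparing labels rather than raw intensities ignores the lighting change while still registering that the car has gone.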

Systems optimisation

Traditional optimisation techniques achieve the best possible solutions in a fixed environment.

Applying these techniques in the real world can be challenging, because ever-changing environmental factors (such as electricity prices, the weather, tax and the share market) affect the result. At AIML we develop the theory, algorithms and tools that can predict when these factors will be in a state of flux, so that we can develop solutions that meet today's needs and prepare for the future. The best solutions now may not be appropriate, or even feasible, in the future.

Systems optimisation is the process of updating and modifying software systems so that they work more efficiently and autonomously.

Robust statistics

This area of research looks at developing procedures for analysing data so that the results remain informative and efficient even when some of the data is unreliable. Data analysis by non-robust methods can result in biased answers and conclusions. Robust statistics uses methods that identify patterns in the data by focusing on the homogeneous bulk of the data, without being unduly influenced by outliers or smaller subgroups.
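A small illustration of the difference, using the median and the median absolute deviation (MAD) as common textbook examples of robust estimators. The readings, the threshold of 3 and the function name are assumptions made for this sketch.

```python
import statistics


def mad_outliers(values, threshold=3.0):
    """Flag values lying far from the median, measured in robust MAD units."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # 1.4826 rescales the MAD to match the standard deviation for normal data.
    scale = 1.4826 * mad
    if scale == 0:
        return []
    return [v for v in values if abs(v - med) > threshold * scale]


# Invented sensor readings with one gross error.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 55.0]

print(statistics.mean(readings))    # dragged upwards by the single bad reading
print(statistics.median(readings))  # stays with the bulk of the data
print(mad_outliers(readings))
```

The mean is pulled far from the typical reading by a single bad value, whereas the median and the MAD-based rule stay anchored to the main body of the data.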

Probabilistic graphical models

Probabilistic graphical models are very effective at modelling complex relationships among variables. These might be the relationships between symptoms and diseases, or the relationships between a set of sensor inputs and the state of the system being modelled, or the relationships between cellular metabolic reactions and the genes that encode them, or the relationships between users in a social network about whom we wish to draw inferences. 

Probabilistic graphical models use nodes to represent random variables and graphs to represent joint distributions over variables. By utilising conditional independence, a gigantic joint distribution (over potentially thousands or millions of variables) can be decomposed into local distributions over small subsets of variables, which facilitates efficient inference and learning.
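The decomposition can be sketched with a three-variable chain, exposure → disease → symptom, in which the symptom is assumed to be conditionally independent of exposure given the disease. The joint distribution then factorises as P(E, D, S) = P(E) P(D | E) P(S | D). The variable names and every probability below are invented for illustration.

```python
from itertools import product

# Hypothetical local distributions; 1 means "present", 0 means "absent".
P_EXPOSED = {1: 0.3, 0: 0.7}
P_DISEASE_GIVEN_EXPOSED = {0: {1: 0.2, 0: 0.8}, 1: {1: 0.9, 0: 0.1}}
P_SYMPTOM_GIVEN_DISEASE = {0: {1: 0.1, 0: 0.9}, 1: {1: 0.7, 0: 0.3}}


def joint(exposed, disease, symptom):
    """Joint probability assembled from the three local tables."""
    return (P_EXPOSED[exposed]
            * P_DISEASE_GIVEN_EXPOSED[exposed][disease]
            * P_SYMPTOM_GIVEN_DISEASE[disease][symptom])


def marginal_symptom(symptom):
    """P(symptom), obtained by summing the joint over the other variables."""
    return sum(joint(e, d, symptom) for e, d in product((0, 1), repeat=2))


def posterior_disease(symptom):
    """P(disease = 1 | symptom), a simple inference query."""
    numerator = sum(joint(e, 1, symptom) for e in (0, 1))
    return numerator / marginal_symptom(symptom)


print(marginal_symptom(1))
print(posterior_disease(1))
```

For three binary variables a full joint table needs 7 free parameters, while the chain needs only 5 (1 + 2 + 2). The saving grows rapidly as variables are added, which is what makes inference over very large models tractable.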


  • Probabilistic Graphical Models for interventional queries

    The project intends to develop methods to suggest how to optimally intervene so that the future state of the system will best suit our interests. The power of probabilistic graphical models to model complex relationships and interactions among a large number of variables facilitates many applications. However, such models only aim to understand the underlying environment. What is ultimately needed in many real-world applications is to suggest how we ought to intervene or act, so as to alter the environment to best suit our interests. The proposed project aims to achieve this using probabilistic graphical models on massive real-world data sets, thus facilitating a variety of applications from health care to commerce and the environment.

    A/Prof Qinfeng Shi; Assistant Professor Julian McAuley; Associate Professor Pawan Mudigonda

  • Compressive sensing based Probabilistic Graphical Models

    Probabilistic Graphical Models (PGMs) use graphs to represent the interactions between random variables and provide a formalism by which to represent complex probabilistic relationships. Despite the success of PGMs in many fields, learning in real, large-scale industrial applications is very slow. I will exploit the sparsity and compressibility in PGMs to turn a large-scale PGM into a number of small-scale PGMs. I will then solve these small-scale PGMs and recover the solutions to the original large-scale PGM in the context of compressive sensing. In this way, I can effectively deal with large-scale PGMs at the computational complexity of small-scale PGMs, as well as provide theoretical guarantees on the consistency of the solution.

    A/Prof Qinfeng Shi

VQA: Vision and language

VQA gives computers the ability to see and understand their environment and answer questions. 

It is fundamentally different to traditional computer vision technology in that the system is trained to answer general questions about images, rather than look for a specific image type. 

VQA gives natural language answers to natural language questions about the content of visual images. 

AIML has won numerous global competitions in VQA and has made major contributions to the development of the methodology.

Natural Language Processing (NLP)

NLP is a branch of AI whereby computers are programmed to understand, process and analyse human language, giving them the ability to interact with humans in both text and spoken language. Think of virtual assistants such as Siri on your iPhone, or Amazon Alexa.

NLP can be used to extract data from written documents into databases, to automate the process of writing reports and to develop machines that can interact with a human by spoken word only.