Bioinformatics tool to unearth new pointers to disease

Professor Roger Daly and Dr Jiangning Song.

An international collaboration, led by Monash Biomedicine Discovery Institute (BDI) researchers, has developed a novel bioinformatics tool that can expose hidden knowledge in data and potentially reveal mutations associated with specific diseases, such as cancer.

Named ‘Quokka’, the new bioinformatics tool allows biologists to rapidly and accurately identify human kinase-regulated phosphorylation sites.

Phosphorylation is one of the most important mechanisms by which proteins are modified to perform different biological functions. It occurs when a phosphate group bonds with specific amino acids, helped by enzymes called kinases.

Aberrant phosphorylation is often caused by gene mutations. Such mutations can create novel sites or abolish existing sites, thereby altering kinase-regulated networks and leading to disease phenotypes such as cancer. Quokka leverages rapidly increasing genomic and proteomic data by identifying novel phosphorylation sites and interpreting mutations in kinase signalling networks.
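As a rough illustration of how a mutation can create or abolish a candidate site, the following Python sketch compares a wild-type and a mutant sequence. It assumes only that phosphorylation in human proteins occurs mainly on serine (S), threonine (T) and tyrosine (Y) residues. The sequences and function names are hypothetical, and this is not Quokka's prediction method, which the article does not describe.

```python
# Residues most commonly phosphorylated in human proteins:
# serine (S), threonine (T) and tyrosine (Y).
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def candidate_sites(sequence):
    """Return the 1-based positions of S/T/Y residues in a protein sequence."""
    return {i + 1 for i, aa in enumerate(sequence.upper()) if aa in PHOSPHO_RESIDUES}


def compare_sites(wild_type, mutant):
    """Return (created, abolished) candidate positions for a substitution-only mutant."""
    if len(wild_type) != len(mutant):
        raise ValueError("this sketch only handles substitutions (equal-length sequences)")
    wt, mt = candidate_sites(wild_type), candidate_sites(mutant)
    return sorted(mt - wt), sorted(wt - mt)


# Hypothetical sequences, for illustration only.
wild_type = "MKRSPAGLA"
mutant = "MKRAPAGLT"  # S4A removes a candidate residue, A9T adds one
created, abolished = compare_sites(wild_type, mutant)
print("created:", created, "abolished:", abolished)
```

A residue flagged here is only a candidate: whether it is actually phosphorylated, and by which kinase, is the question a predictor such as Quokka is built to address.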

Professor Roger Daly, Head of the Monash BDI Cancer program, and Group Leader Dr Jiangning Song led a multidisciplinary collaboration of researchers from Australia, the US, Japan and Switzerland to develop the tool. The research was recently published in the journal Bioinformatics.

While other such tools exist, most can identify potential phosphorylation sites for only a limited number of kinases, and they are difficult to apply to high-throughput, proteome-wide analysis, that is, analysing and prioritising thousands of proteins.

Dr Song said recent technical advances in mass spectrometry had significantly helped high-throughput analysis of proteins, but the protein kinases acting on phosphorylation sites were still largely unknown. More than 500 different kinases in the human body are responsible for phosphorylation.

“Quokka can allow the user to perform novel knowledge discoveries in the sense that it can identify novel phosphorylation sites and predict the impact of disease-associated mutations on phosphorylation events, as well as the specific protein kinases that might be involved,” Dr Song said.

Data from normal tissue could be compared to those from tumour samples to identify key mutations that might affect phosphorylation-based signalling.

Quokka has the advantage of offering a better balance between speed and accuracy. On average, the current Quokka server takes about an hour to process 10,000 protein sequences, and this could be shortened to within half an hour, thanks to the support of the National eResearch Collaboration Tools and Resources (NeCTAR) cloud computing facilities.
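For a sense of scale, the figures quoted above can be converted into an average time per sequence. The short Python sketch below is simple arithmetic on the article's numbers and assumes that processing time scales linearly with the number of sequences, which is an assumption rather than a measured property of the server. The function name is hypothetical.

```python
def seconds_per_sequence(num_sequences, hours):
    """Average processing time per sequence, assuming linear scaling."""
    return hours * 3600 / num_sequences


# Figures from the article: about an hour for 10,000 sequences,
# with a possible reduction to within half an hour.
current = seconds_per_sequence(10_000, 1.0)
target = seconds_per_sequence(10_000, 0.5)
print(f"current: {current:.2f} s per sequence")
print(f"target:  {target:.2f} s per sequence")
```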

“The development of Quokka highlights the power of developing multidisciplinary collaborations on the international stage,” Professor Daly said.

“This type of integrative research will be essential to mine the full potential of large genomic and proteomic datasets,” he said.

Available to users around the world without charge, the Quokka website attracted more than 400 hits from 17 countries within a week of being launched.

“I expect there to be a lot of use from the international scientific community,” Dr Song said.

Dr Song and Professor Daly, who have helped design other machine learning-based tools, plan to extend this research to analyse the functional impact or consequences of mutations that occur in the cancer genome.

The first author on the paper is PhD student Fuyi Li, and the last author is Professor Kuo-Chen Chou of the Gordon Life Science Institute in Boston, Massachusetts, US.

“The framework of Quokka is very innovative, it not only predicts phosphorylation sites accurately, but also with a very fast speed. I think Quokka will become a very popular tool and it can assist biologists, complementing experimental efforts validating uncharacterised phosphorylation events,” Mr Li said.

“The interface of Quokka is well designed and user friendly, and it is very easy to use and visualise the results provided,” he said.

And the name?

Quokka, named after the Australian marsupial, stands for ‘Quantitative predictor of kinase family-specific kinome and phosphorylation sites’.

This research was supported by the ARC, NHMRC and an Interdisciplinary Research (IDR) major projects grant awarded by Monash University.

Read the full paper in Bioinformatics, titled ‘Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome’.