Developing expertise in bioinformatics

Data-driven biology is the development of tools and computational approaches to understand the huge volume and diversity of bioscience data. This data includes the genetic blueprints (genomes) of living things and their variations.

This area of research will have a major impact on health, agriculture, and environmental management. It helps researchers to see patterns in huge amounts of data, and helps us to adapt practices and policies in response to these patterns.

COVID-19 is an example of this. Data on the genetic makeup of the virus and on the structure of the viral particle has enabled real-time surveillance, underpinned vaccine development, and informed government health and safety policy.

Analysis of human genome data submitted to public databases shows that not enough people of African and Latin American descent have been included in genomics studies. Greater representation is needed for the development of precision medicine.

A similar scenario is found for biodiversity.

Living organisms represent over 60% of the land-based species on Earth, but there is a gap in our knowledge. We need to understand the impact of human-driven activities such as intensive farming and urbanisation on biodiversity. A better understanding would allow farmers to meet increased demand sustainably.

In Latin America, the production of genomic data and its use in research and development has lagged, despite a clear need. There is a shortage of postgraduate programmes in computational biology and not enough people with the requisite skills.

Researchers from the UK and Latin America (Argentina, Brazil, Colombia, Costa Rica, Mexico, and Peru) have created a capacity-building programme called CABANA.

CABANA provides high-quality tailored training to create a regional network of researchers who will continue to work together and train the next generation of data-driven biologists.

The team behind CABANA are developing a community of scientists into confident bioinformatics trainers delivering advanced training of an internationally recognised standard.

So far, 28 bioinformatics workshops and more than 10 train-the-trainer courses have been delivered. They have provided the opportunity to share bioinformatics resources and knowledge in a cost-effective and community-driven manner. A further 22 research secondments have been arranged.

Participants in the CABANA project now represent the region in several global biodata initiatives. These include the DSI Scientific Network (an international network of scientists informing policy on the use of biological data), the International Barcode of Life Project, and the Human Cell Atlas (a project focused on creating comprehensive reference maps of all human cells).

CABANA has strengthened international cooperation in this area. It has developed new cross-border collaborations and supported improved data sharing, including the open sharing of COVID-19 data. CABANA has also led to quality information management and made associated biological data easier to find, access and annotate.

CABANA: a capacity strengthening project for bioinformatics in Latin America

Project Lead: Catherine Brooksbank, European Bioinformatics Institute (EMBL)

Funder: Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation

Partner organisations: The University of Campinas (Unicamp), Vale Institute of Technology (ITV), The Center for Research and Advanced Studies of the National Polytechnic Institute (Cinvestav – IPN), University of Buenos Aires, University of Costa Rica, National Institute of Agricultural Technology, International Potato Center (CIP), University of San Martin de Porres, University of Los Andes, European Bioinformatics Institute (EMBL).