How to use DAVID for functional annotation in biomarker studies?
The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a bioinformatics resource that combines an integrated biological knowledge base with analytical tools. It provides a high-throughput data-mining environment for extracting biological meaning from large gene or protein lists, such as those produced by high-throughput genomic experiments (Huang, Sherman, & Lempicki, 2009). Users can choose one or more of its tools: gene functional classification, the functional annotation chart, functional annotation clustering and the functional annotation table. Together these help to identify the biological themes that are enriched in a gene list from a genome-scale study.
Challenges prior to DAVID
In a study of potential biomarkers for non-small cell lung cancer, microarray data were collected from the Gene Expression Omnibus (GEO) database at NCBI. Once the data have been normalised and the differentially expressed genes (DEGs) identified, functional enrichment analysis is needed to interpret those genes as candidate biomarkers. This analysis can be performed with Bioconductor packages in R, but that approach requires programming and is not feasible for everyone. DAVID instead offers integrated Gene Ontology (GO) and KEGG tools, which help to explain the biological significance of the DEGs.
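As a rough illustration of what identifying differentially expressed genes involves before any enrichment step, the sketch below filters a table of per-gene statistics by fold change and adjusted p-value and splits the result into up- and down-regulated lists. The `GeneStat` record, the `select_degs` helper, the cut-offs (an absolute log2 fold change of at least 1 and an adjusted p-value below 0.05) and the gene names are all assumptions made for illustration; they are not taken from the study described here.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """Hypothetical per-gene result from a differential expression analysis."""
    symbol: str
    log2_fold_change: float
    adjusted_p: float


def select_degs(stats, min_abs_log2fc=1.0, max_adjusted_p=0.05):
    """Split genes into up- and down-regulated lists.

    The default cut-offs are common conventions, used here only as assumptions.
    """
    up, down = [], []
    for gene in stats:
        # Skip genes that fail either the significance or the effect-size filter.
        if gene.adjusted_p >= max_adjusted_p:
            continue
        if abs(gene.log2_fold_change) < min_abs_log2fc:
            continue
        if gene.log2_fold_change > 0:
            up.append(gene.symbol)
        else:
            down.append(gene.symbol)
    return up, down


# Made-up values, purely to show the shape of the input.
example = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.7, 0.020),
    GeneStat("GENE_C", 0.4, 0.300),
]
up_genes, down_genes = select_degs(example)
```

The two resulting lists can be submitted to DAVID separately, so that enrichment is reported for up-regulated and down-regulated genes on their own.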
Protocol
The gene list is derived from the GEO database at NCBI.

The gene list is then analysed in DAVID using the following steps:
Click on the Functional Annotation option on the DAVID homepage.

The Functional Annotation Tool page opens. In the upload panel on the left-hand side, paste the gene list derived from the GEO database, then select the identifier type for the gene list. Choose 'Gene List' as the list type and click 'Submit List'.

The website then generates the annotation summary results. On this page, analyse the gene list with the tools listed there, such as functional annotation clustering, the functional annotation chart and the functional annotation table. The genes in the list that are associated with a specific disease, together with the annotation supporting that association, can also be examined.
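Before pasting a gene list into the upload box, it can help to clean it so that each identifier appears once, on its own line. The sketch below is one minimal way to do that; `format_gene_list` is a hypothetical helper, and the gene symbols are placeholders rather than results from the study.

```python
def format_gene_list(raw_ids):
    """Return unique, non-empty identifiers, one per line, in original order."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        gene_id = raw.strip()  # drop stray whitespace from spreadsheet exports
        if gene_id and gene_id not in seen:
            seen.add(gene_id)
            cleaned.append(gene_id)
    return "\n".join(cleaned)


# Placeholder symbols with a blank entry and a duplicate.
raw = ["TP53", " EGFR ", "", "KRAS", "TP53"]
upload_text = format_gene_list(raw)
print(upload_text)
```

The printed text can be pasted directly into the upload box; the identifier type still has to be selected by hand on the page.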


Functional annotation clustering in DAVID uses a fuzzy clustering approach that groups annotation terms on the basis of their degree of co-association, that is, how many genes they share. This reduces the burden of reconciling different terms that describe a similar biological process and lets the biological interpretation focus on the 'biological module' level. A 2D view tool is also provided for examining the internal relationships among the clustered terms and genes.
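DAVID's published method describes measuring co-association with kappa statistics. The sketch below computes plain Cohen's kappa for two binary profiles (1 where a gene is annotated with the term, 0 otherwise). It illustrates the idea only and is not DAVID's implementation; the `kappa` function name, the example profiles and the handling of the undefined case are assumptions.

```python
def kappa(profile_a, profile_b):
    """Cohen's kappa for two equal-length binary (0/1) profiles."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(profile_a)
    # Observed agreement: fraction of positions where both profiles match.
    observed = sum(a == b for a, b in zip(profile_a, profile_b)) / n
    # Agreement expected by chance from each profile's share of 1s.
    p_a = sum(profile_a) / n
    p_b = sum(profile_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both profiles are constant and identical; kappa is undefined,
        # so full agreement is assumed here.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical profiles of two terms across six genes.
term_x = [1, 1, 0, 0, 1, 0]
term_y = [1, 1, 0, 0, 0, 0]
similarity = kappa(term_x, term_y)
```

Values near 1 indicate that two terms annotate largely the same genes, which is the kind of signal a clustering step can use to group them.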

The functional annotation chart provides a gene-term enrichment analysis that helps to identify the most relevant biological functions associated with the gene list. It offers broader annotation coverage than other enrichment analysis tools, with over 40 annotation categories: in addition to Gene Ontology (GO) terms, these include protein functional domains, protein-protein interactions, disease associations, sequence features, biological pathways, homology, gene functional summaries, gene tissue expression and literature.
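Gene-term enrichment of this kind is usually scored with a Fisher exact test; DAVID reports an EASE score, a more conservative variant in which one gene is removed from the list hits before the test. The sketch below computes both using only the standard library. The function names and the example counts are assumptions for illustration, and the code is not DAVID's own implementation.

```python
from math import comb


def fisher_upper_tail(list_hits, list_total, pop_hits, pop_total):
    """One-sided Fisher exact p-value: P(X >= list_hits), hypergeometric."""
    max_hits = min(list_total, pop_hits)
    all_draws = comb(pop_total, list_total)
    tail = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / all_draws


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Fisher test with one hit removed, which penalises terms with few genes."""
    return fisher_upper_tail(max(list_hits - 1, 0), list_total, pop_hits, pop_total)


# Hypothetical counts: 3 of 300 listed genes carry a term that
# annotates 40 of 30000 background genes.
p_fisher = fisher_upper_tail(3, 300, 40, 30000)
p_ease = ease_score(3, 300, 40, 30000)
```

Because the EASE score drops one hit, it is never smaller than the plain Fisher p-value, so terms supported by only a handful of genes are less likely to appear significant.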

The functional annotation table is a query engine for the DAVID knowledge base that performs no statistical calculations. It is particularly useful when users want to look closely at the annotations of genes of special interest.

Challenges and benefits of DAVID
| Tool | Example question to ask | Main function | Advantage | Drawbacks |
| --- | --- | --- | --- | --- |
| Gene name batch viewer | What are the genes in my list? | Display all the gene names in a linear tabular text format and search for other functionally related genes. | | |
| Gene functional classification | What are the major gene families on my list? | Functionally related genes are classified into groups. 2D view for related gene-term relationship. | | Some genes that do not have strong association with other groups will be left out from the analysis. |
| Functional annotation chart | Which annotation terms are enriched for my gene list? | Enriched annotation terms are identified in linear tabular text format. Genes are viewed on pathway maps. | | |
| Functional annotation clustering | Which annotation groups are enriched for my gene list? | Cluster functionally related annotations into groups. 2D view for related gene-term relationship. | | Some enriched terms without strong neighbours will be left out from the analysis. |
| Functional annotation table | What are the associated annotations for each of my genes? | Query selected annotations for given genes. | | |
Functional enrichment analysis of the differentially expressed genes was performed with DAVID, which reports the significantly enriched GO terms and KEGG pathways. The analysis reveals the functions of the up-regulated and down-regulated genes. Applying recursive feature extraction to the selected genes can then yield the potential biomarkers for the particular disease (Qu, Li, Li, & Chen, 2016). DAVID made the study easier and more efficient because it is readily available online and requires no programming to perform the analysis.
References
- Huang, D. W., Sherman, B. T., & Lempicki, R. A. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols, 4(1), 44–57. https://doi.org/10.1038/nprot.2008.211.
- Qu, T., Li, Y., Li, X., & Chen, Y. (2016). Identification of potential biomarkers and drugs for papillary thyroid cancer based on gene expression profile analysis. Molecular Medicine Reports, 5041–5048. https://doi.org/10.3892/mmr.2016.5855.