The Annotate from Database tool automatically annotates one or more sequences by transferring annotations from similar sequences in your database. Geneious performs a BLAST-like comparison to detect similarity.
To create a custom database to "annotate from", create a new folder in Geneious and place in it the sequences you want to use as the source. These sequences can be annotated or unannotated* nucleotide or protein sequences, such as reference genomes downloaded from GenBank, lists of peptides, BLAST hits, or your own previously annotated sequences.
To annotate your sequences from your custom database folder:
- Select the sequence(s) you want to annotate, go to the Live Annotate & Predict tab on the right-hand side of the sequence view (or alignment view) window, and tick the box next to Annotate From...
Note that if you want to annotate a sequence list containing more than 100 sequences, you will need to open Annotate from Database via the Annotate and Predict menu instead.
- Click on the folder name next to the label Source:
- In the window that appears, select the folder where you placed the sequences you want to annotate from and click OK.
- Choose Best Match so that only the top-matching "hit" for any particular region is transferred as an annotation.
- If necessary, adjust the Similarity slider until you see a preview of the annotations on your sequence. These will appear faded out when they are previewed.
- Click Apply to add the annotations to your sequences (the annotations should then appear in a darker color). If you only want to apply some of the annotations, select the annotations you want (either directly on the sequence or in the annotations table) before you click Apply.
If you are annotating small genomes with features that are longer than 50 bp, we recommend setting the Index Length to the maximum value of 15 for nucleotides or 6 for proteins. This will speed up the search on larger sequences. Conversely, if your features are very short (less than 20 bp) you may need to reduce the Index Length in order to find matches. To alter the Index Length, click Advanced.
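To illustrate why the Index Length matters, here is a minimal Python sketch of a generic seed-based search. It is not Geneious's actual implementation; the function names `build_kmer_index` and `seed_hits` are hypothetical, and the sketch assumes exact-match seeds of length k. A feature shorter than k cannot produce any seed, so it is never found, while a larger k produces fewer chance matches to check.

```python
from collections import defaultdict


def build_kmer_index(sequence, k):
    """Map every k-mer in the target sequence to its start positions (0-based)."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def seed_hits(feature, index, k):
    """Return (feature_offset, target_position) pairs for exact k-mer seeds."""
    hits = []
    for j in range(len(feature) - k + 1):
        for i in index.get(feature[j:j + k], []):
            hits.append((j, i))
    return hits


# Illustrative data only: a 40 bp target and a 12 bp feature copied from it.
target = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGATTACA"
feature = target[10:22]

# With k = 15 the 12 bp feature is shorter than the index length: no seeds.
print(seed_hits(feature, build_kmer_index(target, 15), 15))
# With k = 6 the feature yields seeds at its true location.
print(seed_hits(feature, build_kmer_index(target, 6), 6)[:3])
```

In this toy model, lowering k is what allows the short feature to be detected at all, at the cost of more seeds to examine.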
The Advanced options also enable you to restrict the operation to particular Annotation types.
*Unannotated sequences can be used in the Source folder from Geneious Prime 2019.2 onwards. To enable this, turn on the Advanced option “create misc feature type annotations from unannotated source sequences”. Geneious will then treat sequences without any annotations as though they have an annotation of type “misc feature” across the full length of the sequence, with the same name as the sequence name, and this will be transferred if there is a match.
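The behaviour described in this note can be sketched in Python as follows. This is an illustration only, not Geneious code: the classes `Annotation` and `SourceSequence` and the function `effective_annotations` are hypothetical, positions are assumed to be 1-based and inclusive, and the type is written as `misc_feature`, the GenBank feature key.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    name: str
    type: str
    start: int  # 1-based, inclusive (assumption)
    end: int


@dataclass
class SourceSequence:
    name: str
    residues: str
    annotations: list = field(default_factory=list)


def effective_annotations(seq, create_for_unannotated=True):
    """Return the annotations a source sequence contributes to the search.

    An unannotated sequence is treated as carrying one full-length
    misc_feature annotation named after the sequence itself.
    """
    if seq.annotations or not create_for_unannotated:
        return list(seq.annotations)
    return [Annotation(seq.name, "misc_feature", 1, len(seq.residues))]


peptide = SourceSequence("peptide_A", "MKTAYIAK")
print(effective_annotations(peptide))
```

With the option switched off (`create_for_unannotated=False`), the same unannotated sequence contributes nothing to transfer.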
Improvements in Geneious Prime 2020
The boundaries of CDS annotations can now be automatically adjusted to fit the closest open reading frame, if this is within a specified distance of the Source CDS. This option can be configured in the Advanced options, under Adjust CDS boundaries by up to x bp to match nearest ORF.
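One way to picture this adjustment is the Python sketch below. It is a simplified model, not the algorithm Geneious uses: the function `adjust_cds_to_orf` is hypothetical, ORFs are assumed to be precomputed as (start, end) pairs, and "distance" is assumed to mean the larger of the two boundary shifts.

```python
def adjust_cds_to_orf(cds, orfs, max_shift):
    """Snap a transferred CDS to the closest ORF within max_shift bp.

    cds and each ORF are (start, end) tuples. If no ORF has both
    boundaries within max_shift of the CDS, the CDS is returned unchanged.
    """
    best = None
    for orf in orfs:
        shift = max(abs(orf[0] - cds[0]), abs(orf[1] - cds[1]))
        if shift <= max_shift and (best is None or shift < best[0]):
            best = (shift, orf)
    return best[1] if best else cds


# Illustrative coordinates only.
orfs = [(100, 402), (10, 90), (150, 500)]
print(adjust_cds_to_orf((105, 400), orfs, max_shift=10))  # snaps to (100, 402)
print(adjust_cds_to_orf((105, 400), orfs, max_shift=3))   # unchanged
```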
Improvements in Geneious Prime 2020.1
Geneious Prime 2020.1 provides significant improvements to the Annotate from Database tool.
1. Geneious Prime 2020.1 replaces the Sample Documents/Plasmapper features folder with a new locked database of Reference Features called Geneious Plasmid Features. This database contains an expanded and revised list of common plasmid features and is now the default database for the Annotate from Database tool.
You can still create and use your own annotation databases. We recommend you place your personal annotation databases in the Reference Features folder so they are easy to find and access. The Reference Features folder will always be located at the bottom of your Local folder.
2. The tool now allows you to transfer only the best match. If multiple annotations of the same type in the source database overlap with each other in the same region on the target sequence, then only the closest match of these is annotated. Primer annotations are the exception: all primer annotations covering the same region are always annotated.
3. The Advanced preferences for the Annotate from Database tool have been rearranged and revised to include an adjustable value that determines the Best match threshold.
4. Transferred annotations now have a more comprehensive selection of Annotation qualifiers, detailing the source of the transferred annotation, and a hyperlink allowing you to view an alignment of the target region and matching annotation.
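The best-match rule in point 2 can be sketched in Python as below. This is an assumption-laden illustration rather than the tool's real logic: the `Hit` class and `best_matches` function are hypothetical, "closest match" is modelled as the highest similarity score, and primers are identified by the GenBank type `primer_bind`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str
    type: str
    start: int
    end: int
    similarity: float  # percent similarity to the source annotation


def overlaps(a, b):
    """True if two hits share at least one position on the target."""
    return a.start <= b.end and b.start <= a.end


def best_matches(hits, always_keep=("primer_bind",)):
    """Keep only the best hit among overlapping hits of the same type.

    Hits whose type is listed in always_keep (primers) are never filtered.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h.similarity, reverse=True):
        if hit.type in always_keep:
            kept.append(hit)
            continue
        clash = any(k.type == hit.type and overlaps(k, hit) for k in kept)
        if not clash:
            kept.append(hit)
    return sorted(kept, key=lambda h: (h.start, h.end, h.name))


# Illustrative hits only.
hits = [
    Hit("AmpR", "CDS", 100, 960, 98.0),
    Hit("bla", "CDS", 110, 950, 91.0),
    Hit("AmpR promoter", "promoter", 1, 99, 95.0),
    Hit("fwd1", "primer_bind", 50, 70, 100.0),
    Hit("fwd2", "primer_bind", 52, 72, 100.0),
]
print([h.name for h in best_matches(hits)])
```

Here the lower-scoring CDS "bla" is dropped because it overlaps the better CDS match "AmpR", while both overlapping primers are kept.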
Annotating nucleotide sequences using a protein sequence database
If you wish to use a set of protein sequences to annotate your nucleotide sequences, open the Advanced options and ensure that the Protein Sequences option is checked. Your nucleotide query sequence(s) will be translated in all six frames for comparison to the protein sequences in the Source folder.
Using BLAST to annotate your sequences
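For readers unfamiliar with six-frame translation, the following self-contained Python sketch shows the idea. It assumes the standard genetic code and an unambiguous DNA alphabet; the function names are hypothetical and this is not how Geneious is implemented internally.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of an unambiguous uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the six translations can then be compared against the protein sequences in the source database, so a match in any frame or on either strand can be found.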
If you wish to annotate your genome by BLASTing previously identified ORFs, you can use Annotate from Database to transfer the results of the BLAST back onto your genome. This procedure can also be used if you are annotating a list of nucleotide sequences by BLASTing to a protein database with blastx.
The screenshot below shows a set of ORFs annotated on a mitochondrial genome. To annotate these via BLAST, select all the ORFs (either from the Annotations table or directly on the sequence), and perform a batch BLAST search, returning the matching region with annotations (see bottom screenshot).
The annotations from the BLAST results cannot be directly transferred back onto the original mitochondrial sequence, as the link between the BLAST result and the original genome is broken by extracting the ORFs during the BLAST process. However, the BLAST result folder can be used as the Source for Annotate from Database.
Select your original genome sequence where your ORFs were annotated, enable Annotate from Database, and select the BLAST result folder as the Source. As the results for each ORF are contained in a subfolder, open the Advanced options and ensure “Include subfolders” is checked. Also, add the Source annotation type to the list of types not to annotate (as this is specific to the BLAST hit and not the query sequence).
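The effect of these two settings can be pictured with the Python sketch below. It is purely illustrative: the nested-dictionary folder layout and the function `collect_annotations` are hypothetical stand-ins, not a Geneious API.

```python
def collect_annotations(folder, include_subfolders=True, excluded_types=("source",)):
    """Gather candidate annotations from a folder, skipping excluded types.

    folder is a dict with optional "documents" and "subfolders" lists;
    each document holds a list of annotation dicts with a "type" key.
    """
    collected = [
        annotation
        for document in folder.get("documents", [])
        for annotation in document["annotations"]
        if annotation["type"] not in excluded_types
    ]
    if include_subfolders:
        for subfolder in folder.get("subfolders", []):
            collected.extend(
                collect_annotations(subfolder, True, excluded_types)
            )
    return collected


# Illustrative layout: one subfolder of BLAST hits per ORF.
blast_results = {
    "documents": [],
    "subfolders": [
        {"documents": [{"annotations": [
            {"type": "source", "name": "Homo sapiens"},
            {"type": "CDS", "name": "COX1"},
        ]}]},
        {"documents": [{"annotations": [
            {"type": "gene", "name": "ND1"},
        ]}]},
    ],
}
print([a["name"] for a in collect_annotations(blast_results)])
```

Without subfolders included, the top-level results folder in this example contributes no annotations at all, which is why the setting matters for batch BLAST results.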
You should then see the annotations from your BLAST hits appear on the sequence. Don’t forget to click Apply to record them on the sequence.