Annotating your sequence from a custom annotation database

The Annotate from Database function allows you to annotate your sequences by transferring features from other sequences in your database. 

To create a custom database to annotate from, create a new folder in Geneious and place the sequences you want to use as the source in it.  These can be annotated or unannotated* nucleotide or protein sequences, such as reference genomes downloaded from GenBank, lists of peptides, BLAST hits, or your own previously annotated sequences.  

To annotate your sequences from your custom database folder:

  1. Select the sequence(s) you want to annotate, go to the Live Annotate & Predict tab on the right-hand side of the sequence view (or alignment view) window, and tick the box next to Annotate From...  
    Note that if you are annotating a list of more than 100 sequences, you will need to open Annotate from Database via the Annotate and Predict menu.  
  2. Click on the folder name next to the label Source: 
  3. In the window that appears, select the folder where you placed the sequences you want to annotate from and click OK.
  4. If necessary, adjust the similarity slider until you see a preview of the annotations on your sequence.  These will appear faded out when they are previewed.  
  5. Click Apply to add the annotations to your sequences (the annotations should then appear in a darker color).  If you only want to apply some of the annotations, select the annotations you want (either directly on the sequence or in the annotations table) before you click Apply.  

Annotate_from_Figure.png

If you are annotating small genomes with features that are longer than 50 bp, we recommend setting the Index Length to the maximum value of 15 for nucleotides or 6 for proteins.  This will speed up the search on larger sequences.  Conversely, if your features are very short (less than 20 bp), you may need to reduce the Index Length in order to find matches.  To alter the Index Length, click Advanced.
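
To illustrate why a long Index Length can miss very short features, here is a minimal Python sketch of a generic word-index (seed) search. It is not Geneious's implementation, which is not described in this article; the function names `build_index` and `seed_hits` and the example sequences are hypothetical. The sketch assumes that a match must be seeded by at least one exact word of the index length, so a feature shorter than that length produces no words at all.

```python
from collections import defaultdict


def build_index(sequence, index_length):
    """Map every word of index_length residues to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - index_length + 1):
        index[sequence[start:start + index_length]].append(start)
    return index


def seed_hits(feature, target_index, index_length):
    """Return target positions where any word of the feature occurs exactly."""
    hits = set()
    # If the feature is shorter than index_length, this range is empty.
    for start in range(len(feature) - index_length + 1):
        word = feature[start:start + index_length]
        hits.update(target_index.get(word, []))
    return sorted(hits)


# Hypothetical target sequence and a 9 bp feature taken from its end.
target = "ATGGCGTACGTTAGCCGATAGGCTTAACGGATCCGTA"
short_feature = "GGATCCGTA"

hits_15 = seed_hits(short_feature, build_index(target, 15), 15)
hits_6 = seed_hits(short_feature, build_index(target, 6), 6)

print("Index length 15:", hits_15)  # no seeds: feature is shorter than 15
print("Index length 6:", hits_6)    # seeds found at the feature's location
```

A longer word also means fewer chance matches to check, which is consistent with the recommendation above to use the maximum value when all features are long.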

The Advanced options also enable you to restrict the operation to particular Annotation types.  

Note: From Geneious Prime 2020 onwards, the boundaries of CDS annotations will be automatically adjusted to fit the closest open reading frame, if this is within a specified distance of the Source CDS.  This option can be configured in the Advanced options, under Adjust CDS boundaries by up to x bp to match nearest ORF. 
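
The idea of snapping a transferred CDS to a nearby open reading frame can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm Geneious uses: it looks at the forward strand only, defines an ORF as an ATG followed by the first in-frame stop codon, and measures distance as the larger of the two boundary shifts. The function names `find_orfs` and `adjust_to_nearest_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """Return forward-strand ORFs as 0-based, half-open (start, end) pairs.

    Assumption: an ORF runs from an ATG to the first in-frame stop codon,
    and the stop codon is included in the interval.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))
                start = None
    return orfs


def adjust_to_nearest_orf(cds, orfs, max_shift):
    """Snap cds to the closest ORF if neither boundary moves more than max_shift.

    Returns the original cds unchanged when no ORF is close enough.
    """
    start, end = cds
    best = None
    for orf_start, orf_end in orfs:
        shift = max(abs(orf_start - start), abs(orf_end - end))
        if shift <= max_shift and (best is None or shift < best[0]):
            best = (shift, (orf_start, orf_end))
    return best[1] if best else cds


# Hypothetical example: the transferred CDS starts 2 bp inside the real ORF.
sequence = "CCATGAAACCCGGGTAACC"
orfs = find_orfs(sequence)
transferred_cds = (4, 17)

print(adjust_to_nearest_orf(transferred_cds, orfs, max_shift=3))  # snapped
print(adjust_to_nearest_orf(transferred_cds, orfs, max_shift=1))  # unchanged
```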

Annotating nucleotide sequences from protein sequences

If you wish to use a set of protein sequences to annotate your nucleotide sequences, open the Advanced options and ensure that the Translation option is checked.  The nucleotide query sequence will be translated in all 6 frames for comparison to the protein sequences in the Source folder. 
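
For readers unfamiliar with six-frame translation, the following Python sketch shows what it means to translate a nucleotide query in all six frames before comparing it with protein sequences. It assumes the standard genetic code (mitochondrial and some other genomes use different codes) and an unambiguous A/C/G/T sequence; the function names are hypothetical and this is not how Geneious itself is implemented.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code only).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

A protein from the Source folder can then be compared against each of the six translated frames rather than against the nucleotide sequence directly.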

Annotate_from_Figure2.png

Using BLAST to annotate your sequences

If you wish to annotate your genome by BLASTing previously identified ORFs, you can use Annotate from Database to transfer the results of the BLAST back onto your genome.  This procedure can also be used if you are annotating a list of nucleotide sequences by BLASTing to a protein database with blastx. 

The screenshot below shows a set of ORFs annotated on a mitochondrial genome.  To annotate these via BLAST, select all the ORFs (either from the Annotations table or directly on the sequence), and perform a batch BLAST search, returning the matching region with annotations (see bottom screenshot).  

Annotate_from_BLAST2.png

Annotate_from_BLAST2.png

The annotations from the BLAST results cannot be directly transferred back onto the original mitochondrial sequence, as the link between the BLAST result and the original genome is broken by extracting the ORFs during the BLAST process.  However, the BLAST result folder can be used as the Source for Annotate from Database.   

Select your original genome sequence where your ORFs were annotated, enable Annotate from Database, and select the BLAST result folder as the Source.  As the results for each ORF are contained in a subfolder, open the Advanced options and ensure “Include subfolders” is checked.  Also check the option to “Merge matches” and add the Source annotation type to the list of types not to annotate (as this is specific to the BLAST hit and not the query sequence).  

Annotate_from_BLAST3.png

You should then see the annotations from your BLAST hits appear on the sequence.  Don’t forget to click Apply to record them on the sequence.  

 

*Unannotated sequences can be used in the Source folder from Geneious Prime 2019.2 onwards. To enable this, turn on the Advanced option “create misc feature type annotations from unannotated source sequences”. Geneious will then treat each sequence without any annotations as though it has a single annotation of type “misc feature” spanning the full length of the sequence, with the same name as the sequence, and this annotation will be transferred if there is a match.  
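
The behaviour described in this footnote can be pictured with a small Python sketch. The classes and the `effective_annotations` function are hypothetical and exist only for illustration; they are not part of Geneious or any Geneious API. The type string `misc_feature` follows the GenBank feature key spelling, which is an assumption about how the type is stored.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    name: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass
class SourceSequence:
    name: str
    residues: str
    annotations: list = field(default_factory=list)


def effective_annotations(source, create_for_unannotated=True):
    """Return the annotations a source sequence contributes.

    An unannotated sequence contributes one full-length misc_feature
    named after the sequence, but only when the option is turned on.
    """
    if source.annotations or not create_for_unannotated:
        return list(source.annotations)
    return [Annotation(source.name, "misc_feature", 1, len(source.residues))]


# Hypothetical unannotated peptide placed in the Source folder.
peptide = SourceSequence(name="signal peptide", residues="MKTLLVAG")

with_option = effective_annotations(peptide)
without_option = effective_annotations(peptide, create_for_unannotated=False)

print(with_option)
print(without_option)
```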

 
