CODEHOP:
COnsensus-DEgenerate Hybrid Oligonucleotide Primers


CODEHOP help


General information

The CODEHOP program designs PCR (Polymerase Chain Reaction) primers from protein multiple-sequence alignments. The program is intended for cases where the protein sequences are distant from each other and degenerate primers are needed.

The input must be a multiple-sequence alignment of the proteins' amino acid sequences in the Blocks Database format. Suitable alignments can be obtained by several methods.

The results of the CODEHOP program are suggested degenerate DNA primer sequences that you can use for PCR. You have to choose appropriate primer pairs, have them synthesized, and perform the PCR.

A CODEHOP primer is degenerate in its 3' core region, which is 11-12 bp long and spans four codons of highly conserved amino acids. It is non-degenerate in its 5' consensus clamp region, whose length depends on the desired annealing temperature and is typically between 20 and 30 bp:

5'                            3'
--------------------===========

non-degenerate      degenerate
consensus clamp     core
#bases from temp    11-12 bases

The hybrid structure (5' consensus and 3' degenerate) of CODEHOP primers allows the PCR amplification to be specific during the early cycles, when priming from the original source DNA, and selective during the late cycles, when priming from the PCR-synthesized products:

CODEHOP diagram
Schematic comparison of standard degenerate PCR (left) with the CODEHOP (right), illustrating regions of mismatch in primer-to-template annealing during early PCR cycles and in primer-to-product annealing during subsequent cycles. Vertical lines indicate nucleotide matches between primer (arrow) and template or synthesized product. The overall degeneracy is the product of degeneracies at each nucleotide position, so that the fraction of precisely hybridizing primers is 1/degeneracy.
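
As a rough illustration of the degeneracy arithmetic in the caption above, the following Python sketch multiplies the per-position degeneracies of a primer written with standard IUPAC nucleotide codes. The primer sequence, the function names and the clamp/core split are hypothetical examples, not output of the CODEHOP program.

```python
# Number of bases that each IUPAC nucleotide code stands for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Overall degeneracy: the product of the degeneracies at each position."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_DEGENERACY[base]
    return total


def exact_match_fraction(primer: str) -> float:
    """Fraction of the primer pool that matches one given target exactly."""
    return 1.0 / primer_degeneracy(primer)


# Hypothetical primer: non-degenerate 5' clamp followed by an 11-base
# degenerate 3' core. Both sequences are made up for illustration.
clamp = "GGCTGATCGACCTGAAGC"
core = "TNGGNTAYATG"
primer = clamp + core

print(primer_degeneracy(primer))     # N * N * Y = 4 * 4 * 2
print(exact_match_fraction(primer))
```

Because the clamp is non-degenerate, it contributes a factor of 1 at every position, so only the core determines the overall degeneracy.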

Obtaining input alignments

Input alignments for the CODEHOP program must be in the Blocks Database format. Block multiple-sequence alignments are ungapped and usually local alignments. Local alignments cover only parts of the protein sequences. The regions between the blocks are the ones with no sequence similarity or where gaps must be inserted to align the sequences.

You can get a multiple alignment from a group of related protein sequences using the Block Maker or other automated methods (such as ClustalW). The alignments can also be made or modified manually according to your knowledge of the proteins (position of the active sites, post-translational modifications, etc.). In any case, the alignments passed to the CODEHOP program must be in the Blocks format. Block Maker blocks need no reformatting. Appropriate parts of Clustal- or FASTA-formatted multiple alignments can be automatically made into blocks by the Blocks multiple alignment processor. Other types of multiple alignment can be semi-manually reformatted with the Blocks formatter. All of the above Blocks programs have links that send the resulting blocks directly to the CODEHOP program, and they also provide other information for evaluating the blocks (logos, trees, searches).

More information on multiple alignments can be found in the notes for the ISMB97 tutorial on Introduction to making and using protein multiple alignments.

Terms and parameters

Basic tips

Once you have one or more blocks in the input window of the CODEHOP page, you can "Look for primers" using the default parameters. You can then adjust the settings according to your intended use of the primers and the results you got with the defaults. If you don't get predictions, or you don't like what you get, we recommend first raising the degeneracy to 256 or higher (if you dare ...) and retrying. Next, you might try raising the strictness of the core region, for example to 0.1 or 0.25.

Your target sequence(s) might be expected to be more similar to some specific sequence(s) in the input blocks. In this case you can bias the primers towards these sequences: rather than raising the degeneracy or strictness, increase the weights of those sequences. The weight is the value following each sequence segment of the block; the highest weight is usually 100. In the CODEHOP input window, increase the weights of your specific sequences (say to 3 or 4 times the original weight, or to 200 or 400). You can also effectively remove individual sequences from the input blocks by down-weighting them to 0 (the minimal weight) if they are too divergent and prevent primers from being found.
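
If you prefer to adjust weights with a script rather than by hand, something like the following Python sketch could be used before pasting the block into the input window. It assumes that each sequence line has the layout "name (start) segment weight" with an integer weight at the end; check this against your own blocks. The sequence names, segments and the `reweight` function are hypothetical.

```python
import re

# Assumed layout of a sequence line: "<name> (<start>) <segment> <weight>".
# Lines that do not match (ID, AC, DE, BL, "//", blanks) are left unchanged.
_SEQ_LINE = re.compile(
    r"^(?P<head>\s*(?P<name>\S+)\s+\(\s*\d+\)\s+\S+\s+)(?P<weight>\d+)\s*$"
)


def reweight(block_text, factors):
    """Multiply the weights of the named sequences by the given factors.

    A factor of 0 down-weights a sequence to 0, the minimal weight.
    """
    out = []
    for line in block_text.splitlines():
        m = _SEQ_LINE.match(line)
        if m and m.group("name") in factors:
            new_weight = round(int(m.group("weight")) * factors[m.group("name")])
            line = f"{m.group('head')}{new_weight}"
        out.append(line)
    return "\n".join(out)


# Made-up sequence lines, for illustration only.
example = """\
SEQ_A (  12) GDSGGPLVCK 100
SEQ_B (  15) GDSGGPLICE  48
SEQ_C (  11) GNSGSALVSN  31
"""

# Bias towards SEQ_B (4 times its weight) and switch off the divergent SEQ_C.
print(reweight(example, {"SEQ_B": 4, "SEQ_C": 0}))
```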

For amplification, we recommend using AmpliTaq Gold with a 9-minute preheat (this provides an automatic hot start; a hot start of some kind is important). We have had success using the time-release feature with the addition of 15-20 extra cycles.

The primer-finding strategy of the CODEHOP program (Rose et al., manuscript accepted by NAR) differs from the usual degenerate PCR strategies. It is desirable to keep annealing temperatures high: 60°C is OK if you have a 60°C clamp. We recommend trying the highest temperature that yields a clean PCR product. We have used "touchdown" PCR down to Tm-3°C or lower, for example from 63°C down to a good clamp annealing temperature in decrements of 0.5 to 1°C, with the remaining cycles carried out at the clamp annealing temperature (53-57°C for a 60°C clamp). The intent of the touchdown is to give the correct product a head start, because it is likely to anneal at a higher temperature than any failure product. Once the clamp "takes over", all primed products, whether correct or not, are on an even footing, so we try to keep the stringency high in all cycles.

With luck, it should not be necessary to gel-purify the product; if a single band of the expected size is obtained, you can try cloning directly from the reaction mix.
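
The touchdown profile described above can be written out as a per-cycle list of annealing temperatures before programming the thermocycler. The Python sketch below is only an illustration: it assumes one decrement per cycle, and the function name and the total of 40 cycles are hypothetical choices, not recommendations from the CODEHOP authors.

```python
def touchdown_schedule(start_c, final_c, step_c, total_cycles):
    """Return the annealing temperature (in degrees C) for each cycle.

    The temperature drops by step_c per cycle (an assumption) from start_c
    until it reaches final_c, and is then held at final_c for the rest.
    """
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    temps = []
    temp = start_c
    for _ in range(total_cycles):
        temps.append(round(temp, 1))
        temp = max(final_c, temp - step_c)
    return temps


# Example for a 60 degree clamp: start at 63, step down by 1 degree per
# cycle to 57 (Tm-3), then hold at 57 for the remaining cycles.
schedule = touchdown_schedule(start_c=63.0, final_c=57.0, step_c=1.0,
                              total_cycles=40)
print(schedule[:8])
```

With a 0.5 degree step the touchdown phase takes twice as many cycles, which leaves fewer cycles at the final clamp annealing temperature unless the total is increased.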

We and a few other users have already used the CODEHOP strategy and program successfully to amplify various sequences from complicated and diverged genomes. Please let us know if you have any tips to pass on based on your experience with CODEHOP-predicted primers.

Publication

Results obtained by this method should cite:
"Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly-related sequences" by T.M. Rose, E.R. Schultz, J.G. Henikoff, S. Pietrokovski, C.M. McCallum and S. Henikoff, Nucleic Acids Research, 26(7):1628-1635.

Genes identified using CODEHOP.



Page last modified Mar 2003