Microsatellite Data

Microsatellites are defined as tandem repeat loci with repeat motifs 2 to 6 base pairs in length. They can be further classified by the length of the repeated motif. For example, a locus with the repeat sequence CA is a dinucleotide, while one composed of CTC repeats is a trinucleotide. Microsatellite loci are densely and widely distributed in the human genome and are found in large numbers on all chromosomes. Estimates of 300,000 tri- and tetranucleotide repeats in the genome translate to an average frequency of one locus every 10 kilobases (kb). Dinucleotide repeats of the form CA are estimated to occur once every 30 kb.
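The classification by motif length and the density arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the data set: the function names classify_motif and mean_spacing_kb are hypothetical, and the genome size of roughly 3 billion base pairs is an assumption chosen because it is the figure implied by 300,000 loci at one per 10 kb.

```python
# Repeat-class names keyed by motif length in base pairs (2 to 6, as in the text).
REPEAT_CLASSES = {
    2: "dinucleotide",
    3: "trinucleotide",
    4: "tetranucleotide",
    5: "pentanucleotide",
    6: "hexanucleotide",
}


def classify_motif(motif: str) -> str:
    """Return the repeat class for a motif such as "CA" or "CTC"."""
    motif = motif.strip().upper()
    if not motif or set(motif) - set("ACGT"):
        raise ValueError(f"not a DNA motif: {motif!r}")
    try:
        return REPEAT_CLASSES[len(motif)]
    except KeyError:
        raise ValueError(f"motif length {len(motif)} is outside 2-6 bp") from None


def mean_spacing_kb(n_loci: int, genome_size_bp: float = 3.0e9) -> float:
    """Average distance between loci in kb, assuming they are spread evenly.

    genome_size_bp is an assumed round figure, not a value from the source.
    """
    return genome_size_bp / n_loci / 1000.0


print(classify_motif("CA"))      # dinucleotide
print(classify_motif("CTC"))     # trinucleotide
print(mean_spacing_kb(300_000))  # 10.0
```

Under the same assumed genome size, one CA repeat every 30 kb would correspond to roughly 100,000 such loci.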


The Genome Database (GDB) contains the largest set of microsatellite loci data available. However, the data contain errors and must be screened before they are used in analysis. The data below have therefore been "cleaned" in a variety of ways.


A table (PostScript or PDF) gives the number of loci on each chromosome, by repeat type, that contain sufficient data to be included in the set of usable data.

Here are some of the data. Chromosomes 15-22 were compiled by Heidi Spratt from May to August 1997. Chromosomes 1-14 were compiled by Leslea Davison and will be added in the future.

