Heterozygosity index
The heterozygosity index (H), which is essentially the frequency of heterozygotes in the population, is another commonly used measure of the degree of polymorphism in a population. For an individual gene, this index is usually calculated from the frequencies of its alleles:

$$H = 1 - \sum_{i} x_i^2$$
where $x_i$ is the frequency of the i-th allele in the population. H is therefore largest in a population containing a large number of alleles with the same frequency: with k equally frequent alleles, H = 1 - 1/k. The average heterozygosity index for a given population is calculated as the arithmetic mean of the heterozygosity indices of the individual genes. If the heterozygosity index is calculated from sequence data, it is sometimes also called the gene diversity index.

The nucleotide (amino acid) diversity index (B) can also be calculated from sequence data; it corresponds to the average number of nucleotide (amino acid) differences between all pairs of alleles in the sample, divided by the length of the sequences of the relevant alleles. The average number of pairwise differences Π can be calculated for the whole population or, to be more precise, for the population sample, as

$$\Pi = \frac{2}{n(n-1)} \sum_{i<j} \Pi_{ij}$$
where n is the number of observed sequences (so that $n(n-1)/2$ is the number of distinct pairs of sequences) and $\Pi_{ij}$ is the number of differences between the i-th and j-th sequences.
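The indices described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names are hypothetical, allele frequencies are assumed to sum to 1, and the sequences are assumed to be already aligned, of equal length, and free of gaps.

```python
from itertools import combinations


def heterozygosity(allele_freqs):
    """Heterozygosity index of one gene: H = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(x * x for x in allele_freqs)


def mean_heterozygosity(freqs_per_gene):
    """Average heterozygosity: arithmetic mean of H over the individual genes."""
    values = [heterozygosity(freqs) for freqs in freqs_per_gene]
    return sum(values) / len(values)


def pairwise_differences(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def average_pairwise_differences(sequences):
    """Average number of differences over all n(n-1)/2 pairs of sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    total = sum(pairwise_differences(a, b) for a, b in combinations(sequences, 2))
    return total / (n * (n - 1) / 2)


def diversity_index(sequences):
    """Average pairwise differences divided by the sequence length."""
    return average_pairwise_differences(sequences) / len(sequences[0])
```

For example, `heterozygosity([0.25, 0.25, 0.25, 0.25])` gives 0.75, whereas two alleles with frequencies 0.9 and 0.1 give approximately 0.18. For the three made-up sequences `"ACGT"`, `"ACGA"` and `"TCGA"`, the pairwise differences are 1, 2 and 1, so the average over the three pairs is 4/3 and the diversity index is 1/3.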