Entropy-based SNP selection for genetic association studies.

Authors:
Jochen Hampe, Stefan Schreiber, Michael Krawczak
Year of publication:
2003
Volume:
114
Issue:
1
ISSN:
0340-6717
Journal title abbreviated:
HUM GENET
Journal title long:
Human genetics
Impact factor:
3.930
Abstract:
Because of their abundance, density, and ease of practical use, single-nucleotide polymorphisms (SNPs) have become the major source of information for association gene mapping in humans. Sensible strategies for selecting practically useful SNPs are therefore required. Among the factors influencing the mapping utility of a given set of SNPs are (1) their individual diversity, (2) their haplotype structure in the population of interest, and (3) their physical distribution. We propose a strategy integrating these aspects into a single mapping utility measure, which is based upon Shannon entropy, and which maximizes the amount of information extracted from a genomic region under a Malecot model of linkage disequilibrium (LD) decay. The same utility measure has also been used to define a criterion guiding SNP discovery and rational decision-making about the continuation or termination of a mapping study. The proposed strategy performs consistently well in a data set comprising 549 German control individuals, genotyped for 136 SNPs from four genomic regions of different LD structure. Adoption of the method in practice is estimated to save up to 30% of genotyping load when compared with equidistant SNP localization or pair-wise LD minimization alone.
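
The two ingredients named in the abstract, Shannon entropy and Malecot-type LD decay, can be illustrated with a minimal Python sketch. This is not the authors' utility measure, which the abstract does not specify. It only shows the textbook entropy of a biallelic SNP and the commonly used Malecot form rho(d) = (1 - L) * M * exp(-epsilon * d) + L. The function names and all parameter values (epsilon, M, L, and the distance unit) are assumptions chosen for illustration.

```python
import math


def snp_entropy(p):
    """Shannon entropy (in bits) of a biallelic SNP with allele frequency p.

    A monomorphic SNP (p = 0 or p = 1) carries no information.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def malecot_ld(d, epsilon=0.01, M=1.0, L=0.0):
    """Expected association at distance d under a Malecot-type decay.

    epsilon, M and L are hypothetical illustrative values, not estimates
    from the study; d is in whatever unit epsilon is expressed per.
    """
    return (1.0 - L) * M * math.exp(-epsilon * d) + L


if __name__ == "__main__":
    # Entropy is highest for allele frequencies near 0.5.
    for p in (0.05, 0.2, 0.5):
        print(f"p={p:.2f}  entropy={snp_entropy(p):.3f} bits")
    # Association decays with increasing distance between markers.
    for d in (0, 50, 200):
        print(f"d={d:>3}  rho={malecot_ld(d):.3f}")
```

Under these assumptions, a SNP with allele frequency 0.5 contributes one full bit of diversity, and the expected association between two markers falls off exponentially with their separation. A selection strategy of the kind described would weigh both effects together.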