plot.info {qtl} | R Documentation |
Plot a measure of the proportion of missing information in the genotype data.
plot.info(x, chr, method=c("both","entropy","variance"), step=1, off.end=0, error.prob=0.001, map.function=c("haldane","kosambi","c-f","morgan"), alternate.chrid=FALSE, ...)
x |
An object of class cross. See read.cross for details. |
chr |
Optional vector indicating the chromosomes to plot.
This should be a vector of character strings referring to chromosomes
by name; numeric values are converted to strings. Refer to
chromosomes with a preceding - to have all chromosomes but
those considered. A logical (TRUE/FALSE) vector may also be used. |
method |
Indicates whether to plot the entropy version of the information, the variance version, or both. |
step |
Maximum distance (in cM) between positions at which the
missing information is calculated; for step=0,
it is calculated only at the marker locations. |
off.end |
Distance (in cM) past the terminal markers on each chromosome to which the genotype probability calculations will be carried. |
error.prob |
Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype). |
map.function |
Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions. |
alternate.chrid |
If TRUE and more than one chromosome is plotted, alternate the placement of chromosome axis labels, so that they may be more easily distinguished. |
... |
Passed to plot.scanone. |
The entropy version of the missing information: for a single individual at a single genomic position, we measure the missing information as H = - sum p[g] log p[g] / log n, where the sum is over genotypes g, p[g] is the probability of the genotype g (given the marker data), and n is the number of possible genotypes, defining 0 log 0 = 0. This takes values between 0 and 1, assuming the value 1 when the genotypes (given the marker data) are equally likely and 0 when the genotypes are completely determined. We calculate the missing information at a particular position as the average of H across individuals. For an intercross, we scale not by log n but by the entropy of the genotype probabilities (1/4, 1/2, 1/4).
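The entropy measure described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the C code used by the package: the function name entropy_missing_info, its arguments, and the small probability matrices are hypothetical, and the genotype probabilities are supplied by hand rather than computed from a cross object.

```r
# Hypothetical helper (not part of R/qtl): entropy-based missing information
# at a single position.
#   probs: matrix of genotype probabilities, one row per individual and one
#          column per genotype (each row sums to 1)
#   scale: normalizing entropy; log(n) by default, where n = ncol(probs)
entropy_missing_info <- function(probs, scale = log(ncol(probs))) {
  # p * log(p), using the convention 0 * log(0) = 0
  plogp <- ifelse(probs > 0, probs * log(probs), 0)
  h <- -rowSums(plogp) / scale  # H for each individual
  mean(h)                       # average across individuals
}

# Backcross-like case (two genotypes): one individual with a fully
# determined genotype, one with equally likely genotypes.
bc <- rbind(c(1, 0), c(0.5, 0.5))
bc_info <- entropy_missing_info(bc)

# Intercross-like case: scale by the entropy of (1/4, 1/2, 1/4)
# instead of log(3).
f2_prior <- c(0.25, 0.5, 0.25)
f2_scale <- -sum(f2_prior * log(f2_prior))
f2 <- rbind(c(0.25, 0.5, 0.25), c(0, 1, 0))
f2_info <- entropy_missing_info(f2, scale = f2_scale)
```

In both toy cases one individual contributes H = 1 and the other H = 0, so the averaged missing information is one half.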
The variance version of the missing information: we calculate the average, across individuals, of the variance of the genotype distribution (conditional on the observed marker data) at a particular locus, and scale by the maximum such variance.
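The variance measure can be sketched in the same spirit. The sketch below makes assumptions that the text above does not state: genotypes are given numeric scores 0, 1, ..., n-1, and the scaling constant max_var is supplied by the caller (for two genotypes scored 0 and 1 the largest possible variance is 0.25). The function name variance_missing_info is hypothetical and this is not the package's implementation.

```r
# Hypothetical helper (not part of R/qtl): variance-based missing information
# at a single position.
#   probs:   matrix of genotype probabilities (individuals x genotypes)
#   scores:  numeric score for each genotype (assumed here: 0, 1, ..., n-1)
#   max_var: variance used for scaling, supplied by the caller
variance_missing_info <- function(probs,
                                  scores = seq_len(ncol(probs)) - 1,
                                  max_var) {
  m1 <- as.vector(probs %*% scores)    # conditional mean, per individual
  m2 <- as.vector(probs %*% scores^2)  # conditional second moment
  mean(m2 - m1^2) / max_var            # average variance, then scale
}

# Backcross-like case: genotypes scored 0 and 1, maximum variance 0.25.
bc <- rbind(c(1, 0), c(0.5, 0.5), c(0.9, 0.1))
bc_var_info <- variance_missing_info(bc, max_var = 0.25)
```

Here the three individuals have conditional variances 0, 0.25, and 0.09, so the scaled average lies between 0 and 1.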
Calculations are done in C (for the sake of speed in the presence of
little thought about programming efficiency) and the plot is created
by a call to plot.scanone.
Note that summary.scanone may be used to display
the maximum missing information on each chromosome.
An object with class scanone: a data.frame whose columns are the
chromosome IDs and cM positions, followed by the entropy and/or
variance version of the missing information.
Karl W Broman, kbroman@biostat.wisc.edu
library(qtl)
data(hyper)
plot.info(hyper, chr=c(1,4))

# save the results and view maximum missing info on each chr
info <- plot.info(hyper)
summary(info)