Weighted regression


		EXPERIMENTAL PLAN
		Principal Investigator/Program Director Williams, Robert W.
Weighted regression

In some experimental situations, particularly those that involve recombinant inbred lines, an estimate of the measurement error may be available for each trait value. In cases where the measurement error is not uniform across trait values, the estimate of QTL effect may be improved by an inverse-variance weighting, that is, by weighting the contribution of each trait value with the inverse of the variance for that value. The user interface for this option can be quite simple. The variance of the trait will be entered as another trait, and an option will allow one trait to be designated as the weight for another.

Missing marker data

Missing data constitute an important practical problem in genetic mapping, including QTL mapping. If a trait value is missing for an individual, that individual must be omitted from the regression. If marker data are missing or ambiguous, however, it is usually possible to calculate an expected value based on the genotypes of flanking markers (Martinez and Curnow 1994; Jiang and Zeng 1997). The NTB will use the method of Jiang and Zeng. For sets of marker data curated by the NTB (recombinant inbred strains and shared intercrosses), the expected values for missing data (which should be rare) can be calculated once and stored with the marker data.

Epistasis testing and search

There is growing awareness of the importance of epistasis (interactions between nonlinked loci) in complex traits and a plea for software which will detect and analyze such effects (Frankel and Schork 1997). Cheverud and Rotman (1995) recently published a method for analysis of epistatic effects, but it has not yet been implemented in mapping software. The NTB will implement that method in two ways. The first will produce a simple report on epistatic effects between any two marker loci with respect to a chosen trait. The second will allow a search for marker loci that show significant epistatic effects with a given locus.
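
As a rough illustration of what a two-locus report might contain, the sketch below tests the interaction between two marker loci in a population with two genotype classes per locus (for example, recombinant inbred lines). It is a generic interaction contrast on the four two-locus class means, not the Cheverud and Rotman parameterization. The function name `epistasis_report`, the -1/+1 genotype coding, and the returned fields are assumptions made for this example only.

```python
import math
from statistics import mean


def epistasis_report(geno_a, geno_b, trait):
    """Interaction contrast between two marker loci (hypothetical helper).

    Genotypes are assumed to be coded -1 or +1; None marks missing data.
    Fits y = b0 + b1*a + b2*b + b3*a*b and reports b3, the interaction term.
    """
    cells = {(a, b): [] for a in (-1, 1) for b in (-1, 1)}
    for a, b, y in zip(geno_a, geno_b, trait):
        if a is None or b is None or y is None:
            continue  # individuals with missing values are omitted
        cells[(a, b)].append(y)
    if any(len(values) == 0 for values in cells.values()):
        raise ValueError("each two-locus genotype class needs at least one individual")

    n = sum(len(values) for values in cells.values())
    df = n - 4  # four parameters in the full two-locus model
    if df < 1:
        raise ValueError("not enough individuals to estimate the residual variance")

    means = {key: mean(values) for key, values in cells.items()}
    # With -1/+1 coding the interaction coefficient is this contrast of class means.
    interaction = (means[(1, 1)] - means[(1, -1)]
                   - means[(-1, 1)] + means[(-1, -1)]) / 4.0

    # Residual variance pooled within the four genotype classes.
    rss = sum((y - means[key]) ** 2 for key, values in cells.items() for y in values)
    s2 = rss / df
    se = math.sqrt(s2 * sum(1.0 / len(values) for values in cells.values()) / 16.0)
    t = interaction / se if se > 0 else float("inf")
    return {"interaction": interaction, "se": se, "t": t, "df": df}
```

The returned `t` would be compared against a t distribution with `df` degrees of freedom, or, given the multiple-testing problem in a search across many loci, against an empirically derived threshold.
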
These methods will not be interval methods; they will be analogous to the single-locus QTL mapping method described above, in which single marker loci are taken as indicators for nearby quantitative trait loci. In the case of epistasis, however, two marker loci are involved, each an indicator for a different QTL. Normally, single-locus QTL mapping will detect epistatic effects between QTLs only if there is also a significant additive or dominance effect. The Cheverud and Rotman method, in contrast, will allow the detection of pairs of loci whose effect on a trait is purely epistatic, that is, pairs of loci for which there is no significant additive or dominance effect.

Empirical significance thresholds

As mentioned above, one of the critical problems in QTL mapping is the establishment of appropriate significance thresholds. The NTB will implement more than one method for establishing significance thresholds, including a new one. There are several aspects to this problem, the most important of which is that mapping involves testing multiple hypotheses for one data set. The importance of sufficiently stringent significance thresholds has been amply demonstrated, and a priori thresholds have been established (Lander and Kruglyak 1995). More important, methods have been described to calculate empirical thresholds, tailored to the idiosyncrasies of the data set (Piepho, personal communication; Churchill and Doerge 1994; Doerge and Churchill 1996). Three of these are permutation methods: one to establish a threshold for SIM and two to do the same for CIM. The fourth is an approximate method that promises to be much faster than permutation. Map Manager QT currently implements the first method; the NTB will implement all of them.

Nonparametric statistics

For traits expressed on an ordinal scale or whose distribution is far from normal, the NTB will provide the option of evaluation with a generalized Wilcoxon rank-sum test (Kruglyak and Lander 1995).
The behavior of this statistic has not been described for the case of composite interval mapping, but in any case users should rely on the permutation tests provided by the NTB to calculate significance thresholds.

Other covariates

The NTB will provide an option by which a trait can be designated as a covariate and included in the analysis as a nongenetic determinant of the trait being analyzed. This option may be useful when environmental or nongenetic conditions (such as age) are known to affect the trait.

Testing

The mapping functions of the NTB will be tested with simulated data generated by a set of functions written for the Mathematica mathematics software. These functions can generate marker data sets with markers at prescribed intervals; in addition, they can generate traits based on any number of quantitative trait loci, using any of the standard dominance models and using any statistical distribution for the environmental effect. Other routines can perform simple and composite interval mapping, generating the same type of figures as will be generated by the NTB. A person not part of the programming team will use these Mathematica functions to generate data sets and compare a Mathematica analysis and an analysis by Map Manager QTX with an analysis by the NTB. The budget for Project 4 includes, in Years 02 to 05, a stipend for a student intern to perform this work.
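
The kind of simulated test data described above can be illustrated with a small sketch. It is written in Python rather than Mathematica and is not the project's own code: it assumes a backcross with a single QTL located exactly at one marker, the Haldane map function, and normally distributed environmental noise. The names `simulate_backcross` and `marker_correlations` and all parameters are hypothetical.

```python
import math
import random


def simulate_backcross(n_individuals, n_markers, spacing_cm, qtl_marker,
                       qtl_effect, noise_sd, seed=0):
    """Simulate one chromosome of backcross marker data and a trait.

    Genotypes are coded 0 or 1. Markers are evenly spaced; recombination
    between neighbors follows the Haldane map function (an assumption).
    """
    rng = random.Random(seed)
    recomb = 0.5 * (1.0 - math.exp(-2.0 * spacing_cm / 100.0))
    genotypes, traits = [], []
    for _individual in range(n_individuals):
        chrom = [rng.randint(0, 1)]
        for _marker in range(n_markers - 1):
            crossover = rng.random() < recomb
            chrom.append(1 - chrom[-1] if crossover else chrom[-1])
        genotypes.append(chrom)
        # Trait = QTL genotype effect plus normally distributed noise.
        traits.append(qtl_effect * chrom[qtl_marker] + rng.gauss(0.0, noise_sd))
    return genotypes, traits


def marker_correlations(genotypes, traits):
    """Correlation of the trait with each marker (a simple single-marker scan)."""
    n = len(traits)
    mean_y = sum(traits) / n
    syy = sum((y - mean_y) ** 2 for y in traits)
    correlations = []
    for m in range(len(genotypes[0])):
        x = [chrom[m] for chrom in genotypes]
        mean_x = sum(x) / n
        sxx = sum((v - mean_x) ** 2 for v in x)
        sxy = sum((v - mean_x) * (y - mean_y) for v, y in zip(x, traits))
        ok = sxx > 0 and syy > 0
        correlations.append(sxy / math.sqrt(sxx * syy) if ok else 0.0)
    return correlations
```

A check in the spirit of the testing plan would simulate a data set with a known QTL position, run the mapping software under test on it, and confirm that the strongest association falls at or near the simulated locus.
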


