Note to the Reader: This is a revised edition of a review published in Mouse Brain Development (Springer, ISBN 3540666648, $169 from Amazon.com). Text additions and modifications are in brackets. [...]. Williams RW (2000) Mapping genes that modulate brain development: a quantitative genetic approach. In: Mouse brain development (Goffinet AF, Rakic P, eds). Springer Verlag, New York, pp 21–49.

Mapping Genes that Modulate Mouse Brain Development: A Quantitative Genetic Approach

Robert W. Williams
Center for Neuroscience and Department of Anatomy and Neurobiology, University of Tennessee, 855 Monroe Avenue, Memphis, Tennessee 38163 USA
Email questions and comments to rwilliam@nb.utmem.edu


 

Contents

1. Why brain weight and neuron number matter
   1.1 Metabolic constraints
   1.2 Functional correlates
   1.3 Insights into CNS development

2. The biometric analysis of the size and structure of the mouse CNS
   2.1 A new opportunity
   2.2 Brain weight is highly variable
   2.3 Sex and age effects on brain weight
   2.4 Large differences between substrains
   2.5 Consistency and inconsistency across studies

3. Mapping brain weight QTLs
   3.1 QTLs versus Mendelian loci
   3.2 Assessing trait variation
   3.3 Estimating heritability
   3.4 Phenotyping and genotyping members of an experimental cross
      3.4.1 Phenotyping and regression analysis
      3.4.2 Genotyping
   3.5 The statistics of mapping QTLs
      3.5.1 Permutation analysis
   3.6 Cloning QTLs
   3.7 The probability of success

4. Neuron and glial cell numbers in adult mice
   4.1 The mouse brain library
   4.2 Numbers of neurons and glial cells in the brain of a mouse

5. Mapping QTLs that modulate neuron number
   5.1 Mapping cell-specific QTLs
   5.2 The Nnc1 locus
   5.3 Mechanisms of QTL action
   5.4 Candidate gene analysis

6. Conclusion



 


 

In my opinion there are only quantitative differences, not qualitative differences, between the brain of a man and that of a mouse.     Ramón y Cajal (1890)


 

The difference in behavioral capacity between man and chimpanzee may be no more than the addition of one cell generation in the segmentation of the neuroblasts which form the cerebral network.    Lashley (1949)

 

Introduction

The complexity of CNS development is staggering. In mice a total of approximately 75 million neurons and 25 million glial cells are generated, moved, connected, and integrated into hundreds of different circuits over a period of one month. The process is coordinated by the expression of a large fraction of the genome—as many as 40,000 genes may be involved (Sutcliffe 1988; Adams et al. 1993). These same genes coordinate the development of the human brain, but a thousand times more neurons are generated (Williams and Herrup 1988) and their integration and training take more than a decade. While 5,000 of these genes have common roles in cellular metabolism, this still leaves a huge complement that have selective, transient, and partially redundant roles in the development of different parts of the brain (Usui et al. 1994; Gautvik et al. 1996). Reductionist approaches that focus on isolated processes and molecules may seem hopelessly inadequate, but they are an absolute necessity at this early stage of analysis and understanding.
     This chapter introduces a comparatively new reductionist approach called complex trait analysis that my research group is using to explore the genetic basis of CNS development. Complex trait analysis is a field that developed rapidly in the 1990s as a result of the hybridization of quantitative and molecular genetics. The suite of techniques associated with complex trait analysis greatly extends the variety of CNS phenotypes that can be subjected to systematic molecular analysis. It is in essence a forward genetic approach that proceeds from phenotypic variation to single genes. This approach has been embraced by behavioral geneticists and neuropharmacologists (Plomin et al. 1991; Johnson et al. 1992; Takahashi et al. 1994; Crabbe et al. 1994; Kanes et al. 1996), and these techniques can now be applied with equal vigor to explore genetic sources of variation in brain structure and development. This chapter begins with a genetic analysis of sources of variation in brain weight and illustrates how we have mapped quantitative trait loci (QTLs) that control brain weight and neuron number in mice.


 

Why brain weight and neuron number matter

Metabolic constraints. There are several reasons why differences in brain size and neuron number are interesting and biologically significant. First, relative to its size, the brain with its large population of neurons consumes a disproportionate amount of energy (Clark 1994). The high cost of making, training, and maintaining this metabolically demanding organ has wide-ranging effects on an animal's development and behavior (Sacher and Staffeldt 1974; Eisenberg and Wilson 1978; Martin 1981; Armstrong 1983; Hofman 1983; Pagel and Harvey 1990; Allman et al. 1993). Humans are an extreme example, with a brain that is 10 times heavier than expected on the basis of body weight. We afford this luxury by developing slowly and by having an efficient diet (Aiello and Wheeler 1995). Given the fact that we are such a large-brained species, it may be a surprise to learn that mice have brains that are proportionally just as large as those of humans. A 22-g adult mouse typically has a 450-mg brain, whereas a 66-kg human typically has a 1350-g brain: about 2% of body weight in both cases.
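The 2% figure follows directly from the weights quoted above. A minimal Python sketch of the arithmetic (the function name is an illustrative choice, not something from the chapter):

```python
def brain_percent_of_body(brain_g: float, body_g: float) -> float:
    """Return brain weight as a percentage of body weight (both in grams)."""
    return 100.0 * brain_g / body_g

# Values quoted in the text: 450-mg brain in a 22-g mouse,
# 1350-g brain in a 66-kg human.
mouse_pct = brain_percent_of_body(0.450, 22.0)
human_pct = brain_percent_of_body(1350.0, 66_000.0)

print(f"mouse: {mouse_pct:.2f}%  human: {human_pct:.2f}%")
```

Both ratios come out at roughly 2%, which is the point of the comparison.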

Functional correlates. A second and almost self-evident reason to be interested in brain weight and neuron number is that variation in these simple parameters is associated with variation in behavior (Lashley 1949; Rensch 1956; Wimer and Prater 1966; Fuller and Herman 1972; Roderick et al. 1979; Fuller, 1979; Crusio et al. 1989; Jacobs et al. 1990; Belknap et al. 1992; Aboitiz 1996; Keverne et al. 1996). This is most clear-cut when specific regions of the brains of different species or individuals are compared. For example, in song birds the volume of song system nuclei and numbers of neurons tend to be positively correlated with different features of song production (e.g., DeVoogd et al. 1993; Ward et al. 1998). Another fine example—although strongly negative in this case—is the correlation in mice between avoidance learning and the size of the infrapyramidal projection from dentate gyrus to CA3 (Schwegler and Lipp 1983; Lipp et al. 1989).

Insights into CNS development. My colleagues and I are interested in brain weight and neuron number for a third reason: as a means to map, clone, and characterize genes that control the proliferation, differentiation, and death of cells in the CNS (Williams and Herrup 1988; Williams et al. 1998a). These genes are entry points into molecular networks that control brain development. Differences in brain weight are proportional to total brain DNA content and consequently to total CNS cell numbers (Zamenhof and von Marthens 1976). This is true even in neonatal mice, before appreciable glial cell production (Zamenhof et al. 1971; Zamenhof and von Marthens 1976). For this reason, brain weight is a surprisingly good surrogate measure for total cell number in mice, as in humans (Pakkenberg and Gundersen 1996).
     The initial tactical and technical problem is how to go about identifying genes that modulate cell proliferation and death either in specific nuclei or in the brain as a whole. Mutants may be useful in some instances, but we need more generic methods that can target any and all CNS regions and cell types. And instead of depending on rare mutations and knockouts, we need methods that provide information about common gene variants—the normal gene polymorphisms that are responsible for the far more pervasive and important natural variation found within typical populations of animals.
     Natural variation can be impressive. Numbers of neurons in the human neocortex vary from 15 to 32 billion (Pakkenberg and Gundersen 1997). The volume of human primary visual cortex varies threefold (Stensaas et al. 1974; Gilissen and Zilles 1996). Numbers of ocular dominance columns within the primary visual cortex of rhesus monkeys vary more than 50% (Horton and Hocking 1996). These robust differences are not caused by mutations but are caused by the cumulative action of many normally variable genes and by the action of numerous developmental and environmental factors. In the long run, normal genetic polymorphisms are the most critical source of variance: they are the substrate for evolutionary and developmental modification of brain size and cellular architecture (Williams and Herrup 1988; Lipp 1989; Williams et al. 1993).


 

The biometric analysis of the size and structure of the mouse CNS

Precedents. In the late 1960s, Thomas Roderick, John Fuller, Douglas Wahlsten, and Richard and Cynthia Wimer began an ambitious program to manipulate neuroanatomical traits in mice by selective breeding (Roderick 1976). Their aim was to explore correlated changes in behavior. They gave the rapidly expanding field of behavioral neurogenetics a rigorous foundation in quantitative and statistical neuroanatomy (Wimer et al. 1969; Fuller and Geils 1972; Wahlsten 1975; Roderick et al. 1976; Fuller 1979; Wimer 1979; Wimer and Wimer 1985). Rather than relying on mutants, they exploited the substantial variation among standard inbred strains of mice. This work led to some important breakthroughs and some brick walls. One of the breakthroughs was successfully selecting for substantial differences in brain weight over less than 20 generations (Fuller 1979). An obvious limitation, highlighted by Roderick (1976), was that it was not possible to map gene loci responsible for the remarkable quantitative variation in CNS size, regional architecture, or behavior.

A new opportunity. The situation has changed radically in the past decade (Lander and Botstein 1989; Plomin et al. 1991; Johnson et al. 1992; Belknap et al. 1992; Tanksley 1993; Frankel 1995; Crawley et al., 1997). Computational methods and molecular reagents, particularly the polymerase chain reaction method, have become so powerful and economical that it is now practical to systematically dissect complex polygenic traits such as brain weight into sets of single well-defined QTLs. Virtually any heritable trait in mice, whether structural, physiological, pharmacological, or behavioral, can be targeted for analysis. Recent examples in mice include epilepsy (Rise et al., 1991), effects of ethanol and haloperidol (Plomin et al. 1993; Belknap et al. 1993; Hitzemann et al. 1994; Kanes et al. 1996; Buck et al. 1997), patterns of sleep and activity (Toth and Williams, 1998), and the mouse equivalent of anxiety (Flint et al. 1995). As illustrated in the work of Belknap and colleagues (1992), it is now feasible to continue the systematic genetic dissection of the mouse CNS begun in the late 1960s and to start identifying genes that underlie heritable variation in CNS size and structure.
     Variation in brain weight is a classic polygenic trait, one that is influenced during development by the activity of hundreds, if not thousands, of genes. Brain weight is also affected by maternal factors and myriad environmental factors (e.g., Collins 1970; Eleftheriou et al. 1975; Wahlsten 1983; Katz and Davies 1983). Finally, many factors that target body size have important pleiotropic or correlated effects on brain size, making the selectivity of action a critical problem (Lande 1979). From the point of view of genetic complexity, it is hard to imagine a morphometric trait that would be more difficult to resolve into individual QTLs.
     We began this biometric analysis by weighing brains of numerous different types of mice. Table 1 is taken from a database that has been assembled over a five-year period with contributions from Drs. Dan Goldowitz, Richelle Strom, and Guomin Zhou. For the great majority of animals, we have information on sex, body weight, age, and type and quality of fixation. For animals born at the University of Tennessee, we also generally know the size of the litter and the mother's parity. Most cases that we have studied were fixed by perfusion with mixed aldehydes (Williams et al. 1996a). This leads to a reduction in brain weight of 3–4%, for which these data have been corrected. Weights include the olfactory bulbs, the paraflocculi, and the entire brainstem, but exclude the dura, the pineal, and the pituitary.

Brain weight is highly variable. Brain weight is highly variable among strains reared in a common environment. For example, both A/J and DBA/2J have average brain weights close to 410 mg, whereas C57BL/6J and BALB/cJ have weights close to 510 mg. The variation within each strain is considerable even after compensating for differences in age, body weight, and sex by multiple regression (Williams et al. 1997). Two animals of the same sex and body weight taken from the same litter often have brain weights that differ by 10–20 mg. The coefficient of variation within isogenic groups shown in Table 1 averages about 5.5%, but when technical errors associated with fixation and dissection are taken into account, true non-genetic variation is close to 4%. In comparison, the retinal ganglion cell population of isogenic mice has a coefficient of variation that averages 3.6% (Williams et al. 1996a). We have explored the possibility that some of these differences in brain weight are due to variation in water content and the volume of the ventricles, and the short answer is that neither factor is important in mice older than 30 days. Wet and dry brain weights are very tightly correlated.
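The relation between the observed 5.5% and the "true" 4% can be illustrated with a short Python sketch. It assumes, and this is an assumption not stated in the text, that technical and biological sources of variation are independent, so that their variances add. The sample weights are hypothetical and are not taken from Table 1.

```python
import math
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation x 100 / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def implied_technical_cv(observed_cv, true_cv):
    """CV attributable to technical error, ASSUMING independent sources
    whose variances (and so squared CVs at a common mean) are additive."""
    if observed_cv < true_cv:
        raise ValueError("observed CV cannot be smaller than the true CV")
    return math.sqrt(observed_cv ** 2 - true_cv ** 2)

# Hypothetical brain weights (mg) for one isogenic group.
weights_mg = [402, 415, 398, 421, 409, 430]
group_cv = coefficient_of_variation(weights_mg)

# Figures quoted in the text: about 5.5% observed, about 4% after
# allowing for fixation and dissection error.
technical_cv = implied_technical_cv(5.5, 4.0)

print(f"group CV: {group_cv:.1f}%  implied technical CV: {technical_cv:.1f}%")
```

Under that additivity assumption, the technical component would itself be a little under 4%, comparable in size to the non-genetic biological variation.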

Sex and age effects on brain weight. Both sexes and a wide range of ages were studied. Surprisingly, in mice sex has no detectable effect on adult brain weight (Williams et al. 1997) and this otherwise important trait can be neglected for most purposes. In some strains, there is a significant age-related increase in brain weight even after sexual maturity is reached. There is also a significant correlation between body weight and brain weight. The correlation across strains listed in Table 1 is merely 0.2, but in some crosses, such as that between CAST/Ei and BALB/cJ, the correlation can rise to 0.8. Information on over 5,000 mice and over 200 genotypes is available online at <http://www.nervenet.org>.

[There are statistically significant mean sex differences in the size of several CNS regions, including the hippocampus (Lu et al., 2000) and the olfactory bulbs (Williams et al., 2000). These differences are relatively modest and certainly should not be thought of as sexual dimorphisms. The overlap in size between the sexes is very substantial, and sex accounts for only a few percentage points of the total variance in either hippocampus or olfactory bulb size. (RW, June 2000)]

Large differences between substrains. Perhaps the most remarkable aspect of the data summarized in Table 1 is the large differences in brain weight between several substrains of mice. For instance, brain weights of BALB/cByJ and BALB/cJ differ by 76 mg; C57L/J and C57BL/6J differ by 88 mg; C3H/HeJ and C3H/HeSnJ also differ by 88 mg. The closely matched and highly significant differences in these three pairs are intriguing. These differences were presumably generated by the recent fixation of variant alleles in a very small number of genes—probably one or two.


 
 
 

Table 1. Brain weights of 28 common inbred strains of laboratory mice with a comparison to two previous studies. Additional data on brain and body weights are available for over 230 genotypes of mice.


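| Inbred strain | Brain SA (mg) a | SD (mg) | CV% | Litters | RWWS 1973 (mg) b | FW 1966 (mg) c |
|---|---|---|---|---|---|---|
| 129/J | 423 | 15 | 3.1 | 4 | 454 | 444 |
| 129/SvJ | 430 | 17 | 3.9 | 4 | | |
| A/J | 408 | 21 | 5.0 | 11 | 455 | 437 |
| AKR/J | 464 | 29 | 4.9 | 5 | 530 | |
| BALB/cByJ | 448 | 26 | 5.1 | 6 | | |
| BALB/cJ | 524 | 28 | 5.2 | 12 | 540 | 502 |
| C3H/HeJ | 416 | 21 | 4.8 | 2 | | |
| C3H/HeSnJ | 504 | 20 | 4.2 | 6 | | |
| C57BL/6J | 499 | 21 | 4.4 | 23 | 489 | 449 |
| C57BL/10J | 459 | 20 | 4.2 | 3 | | |
| C57BLKS/J | 463 | 19 | 4.0 | 8 | | |
| C57L/J | 411 | 18 | 4.1 | 2 | 448 | |
| C58/J | 429 | 19 | 4.2 | 2 | 451 | |
| CBA/J | 462 | 7 | 3.0 | 1 | 508 | |
| CBA/CaJ | 437 | 21 | 4.8 | 3 | | |
| CE/J | 472 | 23 | 5.0 | 7 | 476 | |
| DBA/1J | 403 | 23 | 5.9 | 4 | | 409 |
| DBA/2J | 417 | 27 | 6.4 | 10 | 432 | 413 |
| FVB/NJ | 481 | 11 | 2.2 | 5 | | |
| LG/J | 488 | 25 | 5.2 | 4 | 552 | |
| LP/J | 397 | 29 | 7.1 | 5 | 466 | |
| NOD/LtJ | 524 | 47 | 8.4 | 3 | | |
| NZB/BinJ | 515 | 40 | 7.7 | 5 | | |
| NZW/LacJ | 479 | 38 | 7.9 | 3 | | |
| PL/J | 452 | 27 | 5.9 | 3 | 516 | |
| SJL/J | 419 | 26 | 6.3 | 7 | 450 | 413 |
| SM/J | 469 | 24 | 4.6 | 11 | 496 | 436 |
| SWR/J | 396 | 15 | 3.7 | 2 | 469 | |
| Averages d | 453 | 23 | 5.0 | 4.6 | 483/446 e | 438/446 e |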


a Brain weights are corrected for differences in sex and age. All values normalized to those of 75-day-old females without fixation. SD is the standard deviation computed using individual values; CV is the corresponding coefficient of variation expressed as a percentage (SD × 100/mean).
b Roderick et al. (1973)
c Fuller and Wimer (1966)
d Litter average is geometric mean.
e Paired averages (483/446): first value from original study; second value is average for the same set from our current database. Note the fair agreement with Fuller and Wimer (1966, r = 0.83). Values from Roderick et al. (1973) are consistently higher and the correlation is somewhat lower (r = 0.78). This difference may be due to their use of retired breeders killed by CO2 asphyxiation.

 


 

Mapping brain weight QTLs

QTLs versus Mendelian loci. QTLs are conventional genes that have two or more alleles that contribute to quantitative variation of specific traits (Roff 1997; Lynch and Walsh 1998). A trait may be a concentration or number, a size, weight or density, an activity or behavior, a severity index or an age-of-onset. QTLs are often contrasted with Mendelian loci that have discontinuous effects on phenotypes and predictable segregation patterns. In contrast, individual QTLs usually have more modest effects on a particular phenotype and are associated with phenotypes in a probabilistic way. A QTL might account for as little as 2% or as much as 50% of the total phenotypic variance. QTLs come in sets that collectively define a polygene. For example, at least three QTLs are currently known to control part of the twofold variation in numbers of retinal ganglion cells (Williams et al. 1998a; Strom 1999), and at least 30 QTLs appear to modulate body size (Cheverud et al. 1996; Brockmann et al. 1998). In the next several pages I explain the process of mapping a QTL—in this case, one of the first QTLs demonstrated to modulate brain weight in the mouse. There are four key steps in mapping QTLs.

Figure 1


Figure 1. Variation in brain weight between two inbred strains and their test cross progeny. The two parental strains (BALB/cJ and CAST/Ei) are shown to the far left. Each dot represents the brain weight of an individual mouse; the short horizontal lines through each box indicate group averages; the vertical bars within each box indicate standard deviations; and the horizontal line at 454 mg is the mid-parental value (average of BALB/cJ and CAST/Ei). Box heights are generally ±2 SD. F1 animals were crossed back to both parental strains, giving rise to the two sets of B1 progeny shown to the right. The equation at the bottom of the figure is the Wright-Castle equation (Wright 1978) for estimating the minimum number of effective factors (single or linked QTLs) that contribute to the genetic variance of a trait. Delta P is the difference between parental strain means. VF2 and Viso are the variances of the F2 and isogenic strains, respectively. For these data, we estimate that at least eight polymorphic genes account for the increased variance of the F2 relative to that of the isogenic groups. Data are not corrected for variation in age, sex, or body weight.
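The figure's equation is not reproduced in the text, so the sketch below uses the commonly cited form of the Castle-Wright estimator, n = (Delta P)^2 / [8 (VF2 - Viso)], with the symbols defined in the caption. Treat that form as an assumption about what the figure shows. The input values are hypothetical and chosen only to illustrate the calculation; they are not the measured variances from this cross.

```python
def castle_wright_min_factors(delta_p: float, v_f2: float, v_iso: float) -> float:
    """Minimum number of effective factors, using the standard
    Castle-Wright form n = delta_p**2 / (8 * (v_f2 - v_iso)).

    delta_p : difference between parental strain means
    v_f2    : phenotypic variance of the F2 generation
    v_iso   : average variance of the isogenic groups (parents, F1)
    """
    segregation_variance = v_f2 - v_iso
    if segregation_variance <= 0:
        raise ValueError("F2 variance must exceed the isogenic variance")
    return delta_p ** 2 / (8.0 * segregation_variance)

# Hypothetical inputs (mg and mg^2), for illustration only.
n_factors = castle_wright_min_factors(delta_p=140.0, v_f2=900.0, v_iso=600.0)
print(f"estimated minimum number of effective factors: {n_factors:.1f}")
```

The estimate grows with the squared parental difference and shrinks as the F2 becomes more variable relative to the isogenic groups, which is why a modest excess of F2 variance implies many small-effect factors.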

 


 

Step 1: Assessing trait variation. The first step is to identify significant variation in phenotypes among individuals, or, in the case of laboratory mice, among inbred strains. Variation is an absolute necessity. It is the signal we are trying to pinpoint on a map of the genome. The greater the heritable variation, the better the prospects of success.
     Figure 1 illustrates the wide variation in brain weight between two inbred strains (BALB/cJ and CAST/Ei) and among their intercross and backcross progeny. This is a cross that I will use throughout this section as a specific example of mapping a brain weight QTL. Note that brain weight in the F1 generation overlaps that of the BALB/cJ parental strain. Brain weight may be inherited as a dominant trait, but since all of these F1 progeny were born to BALB/cJ mothers, maternal non-genetic factors are also likely to be important. The spread of points among F2 individuals is somewhat greater than that of either parental strain. This increase in variance is due to the segregation and assortment of QTLs that affect brain weight. No fewer than seven QTLs are needed to account for the differences seen among members of this cross (Wright 1978), but using our small sample of F2 animals (n = 98), we have only succeeded in mapping one of these QTLs.

Step 2: Estimating heritability. The second step in QTL mapping is to verify that a substantial fraction of the variability of the trait is heritable (Curcio 1992; Wahlsten 1992; Williams et al. 1996a). In a standard mouse colony, variation in brain weight has a heritability that ranges from 0.35 to 0.7 (Roderick et al., 1973; Roderick et al., 1976; Seyfried and Daniel 1977; Fuller 1979; Henderson 1979; Atchley et al. 1984; Williams et al. 1996b; Strom and Williams 1997; Strom 1999). Heritability estimates can admittedly be problematic (Lewontin 1957; Eleftheriou et al. 1975), and in the context of the heritability of human intelligence, Wahlsten (1994) comments that "I would feel more secure riding a three legged moose over thin ice than relying on a heritability coefficient to help me understand the origins of individual differences or predict future levels of intelligence." But it can still be useful to go through the process of computing heritability. The reason is that we need to have some idea of the approximate fraction of variance in our sample population that is due to heritable genetic factors before we attempt to map QTLs. The heritable variance is what we are trying to assign to a set of QTLs. While heritability estimates may be labile, the QTLs that we map are anchored in the genome itself.

Figure 2

Figure 2. The correlation between brain weights of parents and their offspring estimates heritability. Animals are from a multigenerational cross between C57BL/6J and DBA/2J inbred strains (G. Zhou and R. W. Williams, in progress). Parental values are the average unfixed weights of mothers and fathers without correction for variation in age or body weight. Offspring data are average brain weight per litter. Brain weights are also presented without correction for variation in body weight, sex, or age. Offspring weights tend to be slightly less than those of the parents because offspring are on average about 50 days younger. The correlation between pairs of values is 0.38 and is a direct estimate of the narrow-sense heritability of brain weight in this cross and environment. Correlations between mothers and offspring and fathers and offspring do not differ significantly. Thus, this estimate of heritability is not inflated by maternal effect.


     Heritability is the fraction of the total variance in a trait that is generated by the segregation and assortment of allelic variants at the many gene loci that influence a trait. (New mutations contribute very little to heritability under all but extreme environmental conditions.) A simple way to measure heritability is to compare traits between parents and their offspring. Figure 2 compares the average brain weight of parents to that of their first litters. The correlation between values is a direct estimate of heritability; in this case, what is called the narrow-sense, or additive, heritability (Lynch and Walsh 1998). The correlation for this particular dataset is 0.38. Broad-sense heritability, which includes variance due to dominance effects and non-linear interactions between different genes, is likely to be as high as 0.5. In comparison to these estimates, variation in neuron number has a broad-sense heritability of approximately 0.8 for granule cells in the dentate gyrus (Wimer and Wimer 1989) and between 0.7 and 0.9 for retinal ganglion cells (Williams et al. 1996a, 1998a; Strom 1999). These values are certainly sufficiently high to motivate a QTL analysis.
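The parent-offspring comparison can be sketched in a few lines of Python. The data below are hypothetical mid-parent and litter-mean brain weights, not the values plotted in Figure 2. The correlation is computed as described in the text; the regression slope of litter mean on mid-parent value, which quantitative genetics texts also commonly use as a narrow-sense estimate, is shown alongside for comparison.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data (mg): one mid-parent value and one litter mean per family.
midparent_mg = [440, 455, 462, 470, 448, 481, 466, 452]
litter_mean_mg = [451, 446, 470, 455, 460, 468, 449, 462]

r_parent_offspring = pearson_r(midparent_mg, litter_mean_mg)
slope = regression_slope(midparent_mg, litter_mean_mg)
print(f"r = {r_parent_offspring:.2f}, slope = {slope:.2f}")
```

With real data, either statistic depends on the particular cross and rearing environment, which is why the text treats the estimate as specific to "this cross and environment".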

 

Step 3: Phenotyping and genotyping members of an experimental cross. The third step is to gather phenotype and genotype data from a set of animals appropriate for QTL mapping. Several different types of crosses can be used to map QTLs (Taylor 1978; Groot et al. 1992; Frankel 1995; Darvasi 1998; Vadasz et al. 1998; Williams 1998b). Figure 1 already introduced one of the most common: the F2 intercross. The central idea behind the intercross is to allow high and low alleles of QTLs inherited from the two inbred strains to segregate and assort independently from unlinked marker loci. The only marker loci that will consistently be associated with high, intermediate, and low trait values in the set of F2 progeny are those marker loci that are closely linked to QTLs (Tanksley 1993; Williams 1998b).

Phenotyping and regression analysis. Weighing brains is quick and easy, but before we can use these weights to map QTLs we need to deal with the issue of specificity of gene action. The brain weight data we have considered so far have not been corrected for significant differences in the mean body weight among mice. The heritability that we blithely assigned to brain weight may actually be a consequence of heritable variation in body size. Unless we adjust our brain weight phenotype appropriately, we risk mapping body weight QTLs (Hahn and Haber 1978; Lande 1979). To ensure that we are mapping what we want to map, we need to factor out variation in brain weight that is predictable from variation in body weight, sex, age, and other variables for which we have data.
     A crude way of factoring out body size is to use the ratio of brain to body weight as a phenotype, but a computationally and conceptually far more powerful approach is to use multiple regression analysis to remove predictable variance associated with body size and any other important variables (Williams et al. 1997). The same logic applies when the aim is to map QTLs that modulate the size of particular CNS cell populations (Williams et al. 1998a,b); we do not want to map generic brain weight QTLs inadvertently. In this case, we therefore need to use multiple regression to remove variance in cell number that is actually associated with total brain weight. Whatever types of QTLs we are trying to map, we need to carefully consider the higher-order structures and make sure that we have taken variation in these structures into account.

Figure 3

Figure 3. Regression analysis of body and brain weight. Regression analysis is used to minimize the effects of variance in brain weight due to differences in body weight. Crosses mark males, open circles mark females. Rather than using each animal's actual brain weight as a phenotype, we compute a residual brain weight based upon body weight and sex. Examples of positive and negative residuals are marked on the graph. In this dataset the correlation is 0.81 and r2 (the coefficient of determination) is 0.66. b is the coefficient (slope) of the regression equation.



     Figure 3 provides a graphic explanation of a simple regression analysis run on the set of F2 intercross animals previously illustrated in Figure 1. For every 1-g increase in body weight there is approximately a 7.9 mg increase in brain weight. Sixty-six percent of the variance in brain weight can be predicted by body weight alone. Sex in this case is also a significant predictor, and at a given body weight, females have brain weights that are on average 9.4 mg heavier than those of males. However, in this particular sample, neither age nor the logarithm of age was a useful predictor (P ~ 0.6). Table 2 is a statistical synopsis of a multiple regression analysis that takes both body weight and sex into account. When we use the regression equation and coefficients in Table 2 to compensate for differences in body weight and sex we absorb 67.4% of the variance in brain weight. The residual 32.6% of the variance is generated by technical error, other non-controlled environmental effects, and by the QTLs that we are trying to locate on the map of the mouse genome. Rather than using the original brain weight data to map, we use the residual deviations illustrated in Figure 3. For each animal we compute a derived phenotype that is the difference in milligrams between the predicted weight of that animal given its weight and sex and its actual brain weight. By mapping residuals we improve our ability to detect QTLs that are likely to have selective effects on CNS development.

 

  Table 2. Regression analysis of brain weight in an F2 intercross

 

| Variable | Coef | SE | P |
|---|---|---|---|
| Body (g) | 8.5 | 0.64 | < 0.0001 |
| Sex (1 = F) | 9.4 | 4.7 | 0.047 |

r2 = 67.4%
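The residual phenotype described above can be computed with an ordinary least-squares fit of brain weight on body weight and sex. The following Python sketch uses a small hypothetical dataset (eight animals, not data from this cross) and a hand-written solver for the normal equations so that it needs no external libraries. The function names are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_residuals(body_g, sex_f, brain_mg):
    """Regress brain weight on body weight and sex (1 = female).
    Returns ([intercept, body coef, sex coef], residuals in mg)."""
    rows = [[1.0, b, s] for b, s in zip(body_g, sex_f)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, brain_mg)) for i in range(3)]
    beta = solve_linear(xtx, xty)
    residuals = [y - sum(bi * ri for bi, ri in zip(beta, r))
                 for r, y in zip(rows, brain_mg)]
    return beta, residuals

# Hypothetical animals: body weight (g), sex (1 = F, 0 = M), brain weight (mg).
body_g = [18, 20, 22, 24, 19, 21, 23, 25]
sex_f = [1, 1, 1, 1, 0, 0, 0, 0]
brain_mg = [470, 480, 505, 515, 455, 478, 490, 512]

beta, residuals = fit_residuals(body_g, sex_f, brain_mg)
print("coefficients:", [round(v, 2) for v in beta])
print("residual phenotypes (mg):", [round(v, 1) for v in residuals])
```

The residuals, rather than the raw weights, would then serve as the phenotype column for mapping. Because the model includes an intercept, the residuals sum to zero by construction.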


Genotyping. In a typical analysis of F2 progeny, three to five marker loci spaced about 15 to 25 centimorgans (cM) apart are genotyped on each of the 20 chromosome pairs. These marker loci are usually repetitive microsatellite DNA sequences that consist of variable numbers of cytosine-adenine (CA) dinucleotide repeats. One strain of mouse may have a microsatellite with 30 CA repeats, whereas another strain may have a microsatellite with 40 CA repeats. The 5' and 3' flanking sequences of each microsatellite are unique to that part of the genome, but they are also highly conserved among strains of mice. This makes it possible to design PCR primers that selectively amplify a polymorphic microsatellite located at a precisely defined chromosomal position (Dietrich et al. 1994).
     To map QTLs responsible for a part of the variation illustrated among the F2 progeny, genomic DNA from each animal is extracted and genotyped using the polymerase chain reaction. There are three possible genotypes at each polymorphic microsatellite locus: BB, BC, and CC. Approximately 110 microsatellite loci that effectively sample the entire genome of each animal were genotyped. Table 3 illustrates the organization of phenotype and genotype data for 96 animals as entered into a spreadsheet. The first two columns are case identifiers. The third and fourth columns list phenotypes in milligrams, while the fifth column lists genotypes for each animal at a particular microsatellite locus on chromosome (Chr) 6 called D6Mit327. The three genotypes are listed as C (corresponding to CC), H (the heterozygote CB), and B (corresponding to BB). As shown in the sixth column, these three genotypes can be converted into values of –1, 0, and +1. The seventh and eighth columns are values assigned to each genotype assuming either that the B allele or the C allele is dominant. For example, if the C allele is dominant then all of the heterozygous animals are assigned the low trait value of the CAST/Ei parent; –1 in this case.
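The three numerical codings of a genotype can be written as simple lookup tables. This Python sketch follows the rules given in the paragraph above; the dictionary and function names are illustrative, and the genotype list is taken from the first nine cases of Table 3.

```python
# Genotype codes at a marker: B = BB, H = heterozygote, C = CC.
ADDITIVE = {"B": 1, "H": 0, "C": -1}
B_DOMINANT = {"B": 1, "H": 1, "C": -1}   # heterozygotes score like BB
C_DOMINANT = {"B": 1, "H": -1, "C": -1}  # heterozygotes score like CC

def encode(genotypes, model):
    """Convert a list of genotype letters into numerical scores."""
    return [model[g] for g in genotypes]

# D6Mit327 genotypes for cases 1 through 9 of Table 3.
d6mit327 = ["B", "H", "H", "H", "H", "C", "H", "B", "H"]

print("Add:  ", encode(d6mit327, ADDITIVE))
print("B Dom:", encode(d6mit327, B_DOMINANT))
print("C Dom:", encode(d6mit327, C_DOMINANT))
```

Keeping each model as its own mapping makes it easy to test several inheritance models against the same phenotype column.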


  Table 3. Quantitative comparison between phenotypes and genotypes

 

   

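Column groups: Phenotypes (Brain, Brain Res), Genotype (D6Mit327), Models (Add, B Dom, C Dom), Sorted (brain residuals grouped by additive score).

| Case | PCR Plate Order | Brain (mg) | Brain Res (mg) | D6Mit327 | Add | B Dom | C Dom | Sorted: Add 1 | Sorted: Add 0 | Sorted: Add -1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 090894F | 1 | 469 | 38 | B | 1 | 1 | 1 | 38 | | |
| 041195K | 2 | 496 | 26 | H | 0 | 1 | -1 | | 26 | |
| 071095A | 3 | 502 | 8 | H | 0 | 1 | -1 | | 8 | |
| 040695M | 4 | 489 | 21 | H | 0 | 1 | -1 | | 21 | |
| 051295G | 5 | 475 | -1 | H | 0 | 1 | -1 | | -1 | |
| 041195I | 6 | 489 | 18 | C | -1 | -1 | -1 | | | 18 |
| 090894I | 7 | 436 | -23 | H | 0 | 1 | -1 | | -23 | |
| 081595V | 8 | 550 | 52 | B | 1 | 1 | 1 | 52 | | |
| 041195M | 9 | 463 | -27 | H | 0 | 1 | -1 | | -27 | |
| cases | 10-89 | | | | | | | | | |
| 081595M | 90 | 496 | -7 | B | 1 | 1 | 1 | -7 | | |
| 101295I | 91 | 501 | -1 | H | 0 | 1 | -1 | | -1 | |
| 040695S | 92 | 477 | 8 | H | 0 | 1 | -1 | | 8 | |
| 091895L | 93 | 468 | -11 | H | 0 | 1 | -1 | | -11 | |
| 071095D | 94 | 443 | -31 | C | -1 | -1 | -1 | | | -31 |
| 080394Z | 95 | 481 | -3 | C | -1 | -1 | -1 | | | -3 |
| 072195G | 96 | 496 | 3 | C | -1 | -1 | -1 | | | 3 |
| r with brain residuals: | | | | | 0.39 | 0.23 | 0.40 | | | |
| mean: | | | | | | | | 17.3 | -1.7 | -7.8 |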

 


 

Step 4: The statistics of mapping QTLs. We now have all the necessary data and we are poised to assess whether QTLs have been discovered, and if so, with what precision and confidence (Lander and Schork 1994; Churchill and Doerge 1994). Mapping QTLs involves finding marker loci for which the three genotypes match up well with variation in the phenotype. BALB/cJ has a much larger brain than does CAST/Ei. If a QTL modulating brain weight is located near one of the microsatellites then F2 animals that are homozygous for B alleles at that marker should have heavier brains than those homozygous for C alleles. Referring to Table 3, we test whether or not there is a significant correlation (or regression coefficient) between the numerical values (–1, 0, and +1) in the sixth through eighth columns and brain weight residuals in the fourth column. These correlations are listed at the bottom of Table 3.
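As a rough illustration of this test, the following Python sketch computes the correlation between the additive genotype codes and the brain weight residuals for the 16 cases that are actually printed in Table 3. Cases 10 to 89 are not listed, so the result is not expected to reproduce the value of 0.39 obtained from all 96 animals. The function name `pearson_r` is a hypothetical helper.

```python
import math

# Brain weight residuals (mg) and D6Mit327 calls for the 16 cases printed
# in Table 3 (cases 1-9 and 90-96). Cases 10-89 are not shown, so this
# subset will not reproduce the r = 0.39 reported for all 96 animals.
RESIDUALS = [38, 26, 8, 21, -1, 18, -23, 52, -27, -7, -1, 8, -11, -31, -3, 3]
CALLS = "BHHHHCHBHBHHHCCC"
ADDITIVE = {"B": 1, "H": 0, "C": -1}


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


codes = [ADDITIVE[c] for c in CALLS]
r = pearson_r(codes, RESIDUALS)
print(f"r (16 listed cases only) = {r:.2f}")
```

Swapping `ADDITIVE` for a B-dominant or C-dominant coding gives the corresponding correlations for the other two models.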
     A complementary way to explore these data is to determine whether brain weight residuals of animals with the BB genotype are greater than those of groups of animals with the other two genotypes. This type of categorization is shown on the right side of Table 3. The average residual of individuals with the BB genotype is 17.3 ± 4.8 mg (bottom right), whereas that of CC individuals is –7.8 ± 3.4 mg. Half of the difference between these means is an estimate of the additive effect of substituting a low C allele with a high B allele, a value of 12.6 mg in this case. The heterozygotes in this sample have an average phenotype that is 6.4 mg lower than the midpoint between the BB and CC means, which is the value predicted if the two alleles acted additively. This deviation estimates the dominance of the C allele.
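The arithmetic behind these two estimates can be written out as a short Python sketch. The function name is hypothetical. The inputs are the rounded genotype means from Table 3, so the results (about 12.55 mg and -6.45 mg) differ slightly from the 12.59 and -6.42 listed in Table 4, presumably because those were computed from unrounded data.

```python
def additive_and_dominance(mean_bb, mean_het, mean_cc):
    """Estimate additive effect and dominance deviation from genotype means."""
    additive = (mean_bb - mean_cc) / 2.0   # half the homozygote difference
    midpoint = (mean_bb + mean_cc) / 2.0   # heterozygote mean if purely additive
    dominance = mean_het - midpoint        # negative when hets fall toward CC
    return additive, dominance


# Mean residuals (mg) from the bottom right of Table 3.
a, d = additive_and_dominance(17.3, -1.7, -7.8)
print(f"additive = {a:.2f} mg, dominance deviation = {d:.2f} mg")
```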
     In this analysis we have tested whether a single microsatellite marker, D6Mit327, is located close to a QTL that influences brain weight. But we would like to scan the entire genome in the same way. Is the correlation between the three genotypes and variation in brain weight of 0.39 the highest that we can find? If we do this analysis at each of 110 marker loci we discover that genotypes at D6Mit327 match variation in phenotypes much better than any other marker (Table 4). In fact, the probability of getting such a good match by chance alone, if only a single test were performed, is about 1 in 10,000. This is referred to as the point-wise, or nominal, probability of linkage. In addition to listing the nominal probabilities, Table 4 lists several other interesting statistics and coefficients. One of these is the likelihood ratio statistic (LRS), a value that, like the logarithm of the odds ratio (the LOD score), is used to assess whether or not a QTL is present close to the marker locus (Haley and Knott 1992). The next two columns list the additive effects of allele substitutions and the predicted dominance deviation. The last column lists the fraction of the variance that can be accounted for by genotypes at the marker locus. This latter value is just the square of the correlation coefficient that we already computed in Table 3. For example, at D6Mit327 the estimate is 16%.


  Table 4. Statistical summary of a genome-wide search for a brain weight QTL

Locus       Chr   P (a)     LRS (b)  Add (c)  Dom (c)  % (d)
D2Mit295    2     0.02326   7.5      7.02     -3.29    5
D3Mit23     3     0.02901   7.1      6.56     -5.71    5
D4Mit172    4     0.01964   7.9      5.73     7.60     6
D4Mit151    4     0.02791   7.2      6.95     5.57     5
D6Mit273    6     0.04654   6.1      -2.84    -9.19    4
D6Mit327    6     0.00011   18.3     12.59    -6.42    16
D7Mit193    7     0.03257   6.8      8.68     3.24     5
D7Mit120    7     0.00747   9.8      12.13    0.78     9
D7Mit31     7     0.02386   7.5      7.85     1.92     5
D12Mit158   12    0.01966   7.9      6.68     6.99     6
D16Mit65    16    0.00891   9.4      7.05     -5.61    7
DXMit54     X     0.03059   7.0      5.73     4.05     5

a. P is the point-wise or nominal probability of achieving an LRS value by chance.
b. LRS is the likelihood ratio statistic (4.61 times the LOD score).
c. Add and Dom are the additive effects and dominance deviations in milligrams.
d. % is the percentage of variance that can be explained by a QTL tightly linked to the marker locus.
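The relation between the LRS and the LOD score in footnote b, and the nominal P values in Table 4, can be checked with a short Python sketch. The factor 4.61 is 2 ln(10). The P calculation assumes that the LRS is referred to a chi-square distribution with 2 degrees of freedom (one additive and one dominance term). The source does not state this, although the tabulated P values appear consistent with it. The function names are hypothetical.

```python
import math

LRS_PER_LOD = 2.0 * math.log(10.0)  # about 4.61, as in footnote b


def lrs_to_lod(lrs):
    """Convert a likelihood ratio statistic to a LOD score."""
    return lrs / LRS_PER_LOD


def nominal_p(lrs):
    """Point-wise P for an LRS, ASSUMING a chi-square with 2 df.

    For 2 df the chi-square survival function is simply exp(-x / 2).
    """
    return math.exp(-lrs / 2.0)


# D6Mit327 row of Table 4.
print(f"LOD = {lrs_to_lod(18.3):.2f}, nominal P = {nominal_p(18.3):.5f}")
```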

 


 
 

     To refine the analysis of this QTL near D6Mit327 we could genotype neighboring markers to determine whether any have even stronger association with variation in brain weight. This additional genotyping is usually not necessary because we can infer the genotypes that are likely to be present between neighboring marker loci. For example, if a mouse has a BB genotype at one marker and a CC genotype at a flanking marker then halfway between these markers the genotype will most probably split the difference and be BC. Comparing predicted genotypes with actual phenotypes in the interval between marker loci is referred to as interval mapping (Lander and Botstein 1989). This refinement can significantly improve the statistical power of a QTL search and makes it possible to distinguish between a weak QTL that is near a marker and a strong QTL that is located farther away. In other words, interval mapping improves the ability to locate a QTL and to estimate the effects that it is likely to have on the phenotype.
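The inference of genotypes between markers can be illustrated with a minimal Python sketch for the BB-to-CC example given above. It assumes the Haldane map function (no crossover interference) and treats the two chromosomes of an F2 animal as independent. These are standard simplifications rather than details taken from the original analysis, and the function names are hypothetical.

```python
import math


def haldane_r(cm):
    """Recombination fraction for a map distance in cM (Haldane function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))


def prob_left_allele(cm_left, cm_right, flanks_match):
    """P(a chromosome carries the left marker's allele at an interior point).

    cm_left / cm_right: distance (cM) from the point to the left / right marker.
    flanks_match: True if the chromosome has the same allele at both markers.
    """
    r1, r2 = haldane_r(cm_left), haldane_r(cm_right)
    r = haldane_r(cm_left + cm_right)  # recombination between the two markers
    if flanks_match:
        return (1 - r1) * (1 - r2) / (1 - r)
    return (1 - r1) * r2 / r


def genotype_probs_bb_to_cc(cm_left, cm_right):
    """Genotype probabilities at a point for a mouse that is BB at the left
    marker and CC at the right marker (both chromosomes switch from B to C)."""
    p_b = prob_left_allele(cm_left, cm_right, flanks_match=False)
    return {"BB": p_b ** 2, "BC": 2 * p_b * (1 - p_b), "CC": (1 - p_b) ** 2}


# Halfway between markers 20 cM apart, BC is the single most likely genotype.
print(genotype_probs_bb_to_cc(10.0, 10.0))
```

At the midpoint the three genotypes come out at roughly one quarter, one half, and one quarter, which is the sense in which BC is the most probable genotype. Closer to either marker the probabilities shift toward that marker's genotype.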

 

Figure 4. Linkage of the QTL Bsc5 to chromosome 6 in a cross between BALB/cJ and CAST/Ei. The x-axis represents position along Chr 6. The most proximal marker that we typed (D6Mit273) maps at 19 centimorgans (cM), whereas the most distal maps at 70 cM. The Bsc5 locus is most likely to map about 1 cM proximal to the microsatellite marker D6Mit327. The confidence interval (CI) of this estimate (bold black lines) is wide: from 37 to 61 cM for a two-LOD CI (95% probability), and from 41 to 56 cM for a one-LOD CI. Genome-wide probability thresholds (Fig. 5) are marked by fine horizontal lines. The right scale and the two lower curves indicate the approximate additive effect and dominance deviations generated by Bsc5. The substitution of a single BALB/cJ allele for a CAST/Ei allele at Bsc5 may be responsible for a 15-mg gain in brain weight.

 



     The results of the more fine-grained interval mapping analysis of Chr 6 are illustrated in Figure 4. The horizontal line at the top represents most of Chr 6 (from 19 cM to 70 cM). Only four marker loci on Chr 6 were genotyped (D6Mit273, D6Mit71, D6Mit327, and D6Mit113). Using the genotype data and the program Map Manager QT (Manly and Olson, 1999; http://mapmgr.roswellpark.org/mmQT.html), the LRS was computed at 1-cM intervals. These values were then used to generate the shaded likelihood profile. As we suspected on the basis of our initial analysis, there appears to be a QTL influencing brain weight near D6Mit327.

 

Permutation analysis. The process of mapping QTLs involves computing hundreds of linkage statistics across the entire set of chromosomes. Given the large number of statistical tests there is a strong probability of getting a "significant" association by chance alone. The nominal probabilities listed in Table 4 tell us little about the genome-wide probability that we have discovered a QTL (Lander and Kruglyak 1997). We need to compensate for these multiple tests. The appropriate correction factor depends on the particular distribution of trait values and the quality and quantity of genotype data.


 

Figure 5. Permutation analysis of the Bsc5 locus. Genome-wide thresholds for estimating the strength of linkage are estimated by randomly permuting data such as those listed in Table 3. This histogram tallies single best LRS scores for each of 10,000 permutations. The two-tailed probability of a random dataset having a peak LRS score better than 18.4 is 0.0215 ± 0.0015.


 

A conceptually simple but computationally tedious permutation test can be used to estimate the distribution of best LRS scores that one might expect to get by chance with a given dataset (Churchill and Doerge 1994). This procedure randomly reassigns the phenotype values listed in Table 3 among the animals, and then remaps the jumbled dataset to get a new version of Table 4. For each permutation the program keeps track of the single highest LRS score. The process is carried out another 9,999 times. Figure 5 shows a histogram of the peak LRS scores that resulted from these permutations of the data in Table 3. The peak LRS score was typically near 10. This non-parametric distribution of peak LRS scores can now be used to gauge the probability of obtaining an LRS of 18.3 by chance alone. Only 2% of permutations do this well or better. We can therefore be reasonably confident that we have mapped a QTL modulating brain weight to Chr 6. This is the fifth QTL that Richelle Strom and I have mapped (Strom 1999, R. C. Strom and R. W. Williams, in progress), and we have named it brain size control 5 (Bsc5). Bsc5 maps on Chr 6, approximately 1 cM proximal to D6Mit327. Bsc5 has not been mapped with much precision: the 95% confidence interval is bounded by the points on either side of the peak at which the map profile drops by 2 LOD units (or 9.2 LRS units), in this case 37 and 61 cM. This 24-cM interval contains approximately 1,200 genes, and perusing a list of candidates at this point is little more than an entertaining exercise in optimism. A quick scan of this region using the Mouse Genome Database reveals one interesting candidate: the thyrotropin releasing hormone gene that maps at 43 cM.
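The permutation procedure can be sketched in Python as follows. This is a simplified illustration, not the Map Manager QT implementation. It uses a single-marker additive regression, for which the LRS is -n ln(1 - r²), and it runs on synthetic, unlinked markers generated inside the script. All names and the toy data are hypothetical.

```python
import math
import random


def lrs_additive(codes, phenotypes):
    """Single-marker LRS for a 1-df additive regression: -n * ln(1 - r^2)."""
    n = len(codes)
    mx, my = sum(codes) / n, sum(phenotypes) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(codes, phenotypes))
    sxx = sum((x - mx) ** 2 for x in codes)
    syy = sum((y - my) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)  # guard against log(0)
    return -n * math.log(1.0 - r2)


def peak_lrs(markers, phenotypes):
    """Highest LRS across all markers for one assignment of phenotypes."""
    return max(lrs_additive(codes, phenotypes) for codes in markers)


def genome_wide_p(markers, phenotypes, n_perm=1000, seed=0):
    """Fraction of permutations whose peak LRS matches or beats the observed peak."""
    rng = random.Random(seed)
    observed = peak_lrs(markers, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        if peak_lrs(markers, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Synthetic toy data (NOT the data of Table 3): 96 animals, 20 unlinked
# markers coded -1/0/+1 in roughly 1:2:1 proportions.
rng = random.Random(42)
N_ANIMALS, N_MARKERS = 96, 20
markers = [[rng.choice((-1, 0, 0, 1)) for _ in range(N_ANIMALS)]
           for _ in range(N_MARKERS)]
# Marker 5 adds 12 units per allele on top of Gaussian noise.
phenotypes = [12.0 * markers[5][i] + rng.gauss(0.0, 25.0)
              for i in range(N_ANIMALS)]

observed, p_genome = genome_wide_p(markers, phenotypes, n_perm=500)
print(f"peak LRS = {observed:.1f}, genome-wide P = {p_genome:.3f}")
```

Because only the single best score is kept from each permutation, the resulting P value already accounts for the number of markers tested. The analysis in the text used 10,000 permutations; the smaller number here only keeps the sketch fast.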

Cloning QTLs. Mapping QTLs is the initial reconnaissance stage in a systematic effort to explore mechanisms that modulate the development of the CNS. The next step is to match each QTL with a single gene and its alternative alleles. QTLs will generally need to be mapped with a precision of 1 to 2 cM, a chromosomal interval that will typically harbor 50–100 genes. Achieving this level of accuracy is not impractical, although it will often require an analysis of 1000 or more animals (Darvasi 1997, 1998). A small subset of positional candidate genes can then be chosen for further analysis on the basis of expression patterns, known function, and differences in DNA sequence among strains. The efficiency of the candidate gene approach will improve greatly in the next decade. The genome of C57BL/6J will have been sequenced within five years, and it is also likely that the utility of this sequence will be enhanced with sequence data from other major inbred strains such as 129, A, BALB/c, C3H, DBA/2, CAST/Ei, and SPRET/Ei. Once sequence data have been combined with expression maps for different parts of the mouse brain, it should be possible to winnow a set of candidate genes to a very short list. If the thyrotropin releasing hormone gene survives this filtration, then we may be justified in comparing its sequence among strains with different phenotypes. The conversion of quantitative phenotypes (e.g., low to high) by substituting alleles of one strain with those of another strain will provide the final and most compelling support that a QTL has been correctly identified with a particular sequence variant (Frankel 1995).
     [A new method called recombinant inbred intercross (RIX) mapping promises to make it significantly more practical to fine-map QTLs within intervals of less than 1 cM (Williams et al. 2000). RIX mapping relies on the generation of a large number of F1 hybrids by crossing fully genotyped recombinant inbred strains. For example, the set of 35 BXD RI strains can be used to generate as many as 595 unique RIX F1 hybrids. Like an F1 between any two inbred parental strains, the genotype of each of these RIX F1s is defined precisely and no genotyping is required to make use of the RIX set for mapping QTLs. For purposes of QTL mapping the set of RIX lines resembles an F2 intercross (all three genotypes are represented at each locus) more than it does an RI set. However, unlike an F2, each genotype is represented by a potentially unlimited number of individuals. Thus, the mean phenotype associated with each genotype can be determined with as much precision as the study demands. The availability of very large numbers of novel genotypes greatly improves the power of detecting QTLs that have modest effects on CNS structure and behavior. Not all of the huge number of available RIX lines need to be generated and tested to fine-map or confirm a QTL: one can simply analyze subsets of RIX lines that have defined genotypes on intervals that harbor putative QTLs. Contrasting genotypes can be synthesized on adjacent intervals to determine the true position of a QTL. In this way RIX lines can be used to map QTLs with nearly as much precision as one could map a Mendelian locus on the same set of lines. Drs. Lu, Airey, Kulkarni, and I have recently completed generating the set of CXB RIX lines. From 13 CXB lines we have generated a set of 76 (13 x 12/2) RIX lines. These lines are now being used to map numerous CNS and eye morphometric QTLs. (RW, July 2000)]
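The count of possible RIX hybrids is the number of unordered pairs of RI strains, n(n - 1)/2. A minimal Python sketch follows, with placeholder strain labels that are not the actual BXD strain numbers and with reciprocal crosses counted once.

```python
from itertools import combinations


def rix_pairs(strains):
    """All unordered pairs of distinct RI strains (reciprocals counted once)."""
    return list(combinations(strains, 2))


# Placeholder labels for 35 strains; real BXD strain numbers differ.
bxd = [f"BXD_{i}" for i in range(1, 36)]
pairs = rix_pairs(bxd)
print(len(pairs))  # 35 * 34 / 2 = 595
```

Because both parents of each pair are fully genotyped, the genotype of every hybrid in `pairs` could in principle be written down from the parental strain genotypes without any further genotyping.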
     It is important to realize that QTLs are not invariant across different populations of mice. A QTL can be identified because it is polymorphic in a particular population or cross. The same gene may not necessarily be polymorphic in another cross. If a gene is not polymorphic it cannot generate phenotypic va