GENEHUNTER-MODSCORE (GHM) is a further extension of GENEHUNTER-IMPRINTING. The program is based on the original GENEHUNTER version 2.1 release 6 (Kruglyak et al. 1996; Kruglyak and Lander 1998; Markianos et al. 2001); it can handle autosomal or pseudoautosomal loci. GENEHUNTER-MODSCORE allows for a MOD-score analysis, in which parametric LOD scores are maximized over the parameters of the trait model, i.e., the penetrances and the disease-allele frequency. In this way, the disease-model parameter space is explored efficiently, so researchers do not have to rely on a single trait model when performing a parametric linkage analysis. This can be of great help in the context of genetically complex traits, for which the disease-model parameters are usually unknown prior to the analysis. Please note that, because of the additional maximization, MOD scores are inflated when compared to LOD scores that were calculated under a single trait model. Therefore, in the context of a MOD-score analysis, significance criteria for LOD scores cannot be applied without correction. For details regarding this issue, please see the references (Strauch et al. 2000; 2005) mentioned below. The latest version 3.1.1 of GENEHUNTER-MODSCORE includes a permutation procedure to obtain empirical p values for the MOD-score-based test on genomic imprinting (MOBIT), as well as a small script ('run_mobit_permutations.sh') to enable parallel calculation of p values on multiple CPUs. The MOBIT is explained in detail in Brugger et al. (2019), which is also listed below.
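The maximization behind a MOD score can be illustrated with a brute-force grid search. The following Python sketch is not GHM's actual routine: `lod_fn` is a hypothetical stand-in for a parametric LOD-score calculation at one genetic position, and the grids and the toy LOD surface are arbitrary assumptions chosen only to make the example run.

```python
from itertools import product

def mod_score(lod_fn, penetrance_grid, allele_freq_grid):
    """Return (best LOD, best model) over a grid of trait models.

    lod_fn(f0, f1, f2, p) is a hypothetical stand-in for a parametric
    LOD-score calculation; f0, f1, f2 are the penetrances for 0, 1, and 2
    copies of the disease allele, and p is the disease-allele frequency.
    """
    best_lod, best_model = float("-inf"), None
    for f0, f1, f2 in product(penetrance_grid, repeat=3):
        for p in allele_freq_grid:
            lod = lod_fn(f0, f1, f2, p)
            if lod > best_lod:
                best_lod, best_model = lod, (f0, f1, f2, p)
    return best_lod, best_model


# Toy LOD surface with a single peak, used only to exercise the search.
def toy_lod(f0, f1, f2, p):
    return 3.0 - f0 ** 2 - (f1 - 0.5) ** 2 - (f2 - 1.0) ** 2 - (p - 0.1) ** 2


best_lod, best_model = mod_score(toy_lod, [0.0, 0.5, 1.0], [0.01, 0.1])
```

Because the result is the maximum over all tested models, it can never be smaller than the LOD score under any single one of them, which is why LOD-score significance criteria cannot be carried over unchanged.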
For version 3.1, we have developed a completely new algorithm that substantially speeds up MOD-score calculation. In particular, our new algorithm reduces the effective number of inheritance vectors by collapsing them into classes with identical disease-locus-likelihood contribution. To this end, the disease-locus-likelihood contribution of each inheritance vector is stored in its algebraic form as a sum of products of penetrances and disease-allele frequencies. Inheritance vectors with the same disease-locus-likelihood contribution are joined together in an inheritance-vector class. This concept is even extended across pedigrees, such that the disease-locus-likelihood contributions of all inheritance vector classes can be numerically calculated in a single step for the entire dataset rather than having to recompute them many times. Unlike the previous algorithm, the new algebraic algorithm neither requires peeling of nuclear families nor loop-breaking. For details about the new algorithm, please see Brugger & Strauch (2014).
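The idea of collapsing inheritance vectors into classes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation described in Brugger & Strauch (2014): the polynomial representation (a dictionary mapping monomials to integer coefficients), the symbol names, and the toy input are all hypothetical.

```python
from collections import defaultdict

def canonical_key(polynomial):
    """Turn {monomial: coefficient} into a hashable, order-independent key.

    A monomial is a tuple of symbol names, e.g. ("f1", "p", "q").
    """
    merged = defaultdict(int)
    for monomial, coeff in polynomial.items():
        merged[tuple(sorted(monomial))] += coeff
    return tuple(sorted((m, c) for m, c in merged.items() if c != 0))

def collapse(contributions):
    """Group inheritance vectors whose algebraic contributions are identical."""
    classes = defaultdict(list)
    for vector, polynomial in contributions.items():
        classes[canonical_key(polynomial)].append(vector)
    return dict(classes)

def evaluate(key, values):
    """Numerically evaluate one class's polynomial for given parameter values."""
    total = 0.0
    for monomial, coeff in key:
        term = coeff
        for symbol in monomial:
            term *= values[symbol]
        total += term
    return total

# Hypothetical toy input: four inheritance vectors, two distinct polynomials.
contributions = {
    0b00: {("f1", "p", "q"): 2},
    0b01: {("q", "p", "f1"): 2},  # same polynomial, different symbol order
    0b10: {("f2", "p", "p"): 1, ("f0", "q", "q"): 1},
    0b11: {("f0", "q", "q"): 1, ("p", "f2", "p"): 1},
}
classes = collapse(contributions)

# One numerical evaluation per class instead of one per inheritance vector.
values = {"f0": 0.0, "f1": 0.5, "f2": 1.0, "p": 0.25, "q": 0.75}
class_values = {key: evaluate(key, values) for key in classes}
```

In this toy case, four inheritance vectors collapse into two classes, so each new set of trait-model parameters requires two evaluations rather than four. The same grouping could be applied to vectors pooled from several pedigrees.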
In order to determine the significance of MOD scores, the new algebraic algorithm is also used in the simulation routine of GENEHUNTER-MODSCORE, which allows researchers to determine empirical p values by performing simulations under the null hypothesis of no linkage. Hence, the evaluation of many sets of tested trait-model parameters during a MOD-score analysis in conjunction with empirical p-value calculation is now feasible within a reasonable amount of time. It is of note, however, that the computational speed-up of the new algebraic algorithm over the old algorithm depends on the number of different types of pedigrees in the dataset. It can hence be advisable to also try the old algorithm for MOD-score analysis, especially if the dataset contains many larger pedigrees and/or many different types of pedigrees.
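The step from simulated replicates to an empirical p value can be sketched as follows. This is a minimal Python illustration, assuming the simple proportion of null replicates whose MOD score reaches the observed one; which estimator GHM itself reports is not stated here, and the replicate scores are made up.

```python
def empirical_p_value(observed_mod, simulated_mods):
    """Fraction of null replicates whose MOD score reaches the observed one."""
    if not simulated_mods:
        raise ValueError("at least one simulated replicate is required")
    exceed = sum(1 for score in simulated_mods if score >= observed_mod)
    return exceed / len(simulated_mods)

# Hypothetical MOD scores from replicates simulated under no linkage.
null_mods = [0.4, 1.1, 0.0, 2.6, 0.9, 0.3, 1.7, 0.2]
p_value = empirical_p_value(2.5, null_mods)
```

The resolution of such a p value is limited by the number of replicates. Each replicate requires a full MOD-score maximization, so the speed of a single analysis determines how many replicates are practical.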
For further information regarding the simulation routine for the calculation of empirical p values and the option to use sex-specific recombination frequencies, please see Mattheisen et al. (2008) and Dietter et al. (2007), respectively.
A Perl script, GH_modview (written by Franz Rüschendorf), is provided with GENEHUNTER-MODSCORE. It creates a Gnuplot graph of the LOD or MOD score, broken down into the contributions of the individual families. An example of such a plot obtained for a sample with three pedigrees is shown below. Each family is represented by a different color. For every genetic position, the contribution of a family that yields a score above zero is added to the positive side of the y-axis, and the contribution of a family that yields a score below zero is added to the negative side of the y-axis. The overall score at a genetic position equals the total positive score (i.e., the sum over all families with positive contribution) minus the total negative score. This type of diagram is useful for both Mendelian and complex traits, since it identifies families with positive versus negative contribution to the linkage signal at a particular genetic position.
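The arithmetic behind such a stacked plot can be sketched in Python. This is only an illustration of the calculation at a single genetic position, not a port of the Perl script; the family names and scores are hypothetical.

```python
def stack_contributions(family_scores):
    """Split per-family scores at one position into positive and negative totals."""
    positive = sum(s for s in family_scores.values() if s > 0)
    negative = -sum(s for s in family_scores.values() if s < 0)
    return positive, negative, positive - negative

# Hypothetical scores of three families at a single genetic position.
scores = {"family_1": 1.5, "family_2": -0.5, "family_3": 0.25}
positive, negative, overall = stack_contributions(scores)
```

Here two families contribute 1.75 in total on the positive side, one family contributes 0.5 on the negative side, and the overall score is 1.25.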
GENEHUNTER-MODSCORE can perform separate maximizations over the penetrances of several liability classes, e.g., for males and females, individuals of different ages, or different levels of risk due to environmental factors. In this way, it is also possible to study gene-environment interactions.
If a genome scan for a certain trait yields at least two linkage peaks, it is reasonable to perform a linkage analysis that explicitly models two trait loci. Such an analysis can be done with the program GENEHUNTER-TWOLOCUS. In the parametric context, the best-fitting trait models at the two loci obtained by a MOD-score analysis with GENEHUNTER-MODSCORE can be used to derive the underlying two-locus trait model. Please see the GENEHUNTER-TWOLOCUS subpage for details regarding this issue.
More information about GENEHUNTER-MODSCORE can be found in the file INSTALL.ghm that is included in the archives provided below. Please also see the online help for details, e.g. by typing 'help modcalc' or 'help modscore' at the GHM prompt, or refer to the PDF or PostScript version of the online help (files ghm.pdf and ghm.ps, respectively).
The archives contain sample input files for a MOD-score analysis (sample.in and sample_use_map.in), in conjunction with the files linkloci.dat, linkloci.dat.sxp, linkloci.imp, linkloci.imp.sxp, and linkped.pre. Furthermore, map files with the genetic positions of markers according to the Duffy, Marshfield, Nievergelt-Schork, and Rutgers maps are included (with kind permission of David Duffy, Karl Broman, Nicholas Schork, and Tara Matise). A sample run can be executed, e.g., by typing 'run sample_use_map.in' at the GHM prompt, or by calling 'ghm < sample_use_map.in' from the command shell.