Software Projects

I began working on software in the 1970s as an undergraduate, both at Caltech and during summer employment at UC-Berkeley Entomology (with Bland Ewing). I played around with the APL, Pascal, Fortran and Assembly languages. In one project, I redesigned the Pascal compiler so that it could insert Assembly code for fast computation. UC-Berkeley had a CDC 6600 and a CDC 7600 computer, which required users to stand in line with decks of computer cards (think one line of code per piece of cardboard, each fitting in a legal envelope). Nevertheless, our team got access through early “high-speed” telephone wires to two remote computers, one in San Francisco and one at UCLA.

Later, as a graduate student at UC-Berkeley, my thesis involved a substantial computing part, along with high-end theory. I happened to be in the same building as some of the designers of 4BSD Unix and some of the popular early Unix tools. Networking was rather rudimentary, with a colleague rigging a 2-wire connection between terminals to transfer code between machines. There were still only a few computers on campus, with the new generation being PDP-11 minicomputers. The UC-Berkeley CS and Stat departments shared a machine with 11 MB for each on a disk the size of an airport-friendly suitcase. One day, I watched a colleague trash the superblock of this computer, which lost all the pointers to the components of its files. I helped him design tools (functions) to recover most of the files from the major crash.

My professor-track employment at UW-Madison involved a careful balance of theoretical work (to establish my bona fides) and computational projects (to explore tools and ground ideas in data-driven stories). That is, I needed to write theory papers to justify my tenure case, but managed to back up most of these with computer tools to justify the methodology.

I was actually stretched in a third, important way, through my career-long interest in collaboration. This has involved developing professional relationships with colleagues across campus, and around the world. While some of this work extends existing, or develops new, stats theory, many of my collaborations have involved more attention to the practical aspects of addressing challenging research questions through data analysis and visualization. This work led to my book, Practical Data Analysis for Designed Experiments, along with a companion package, pda.

Software Releases

  • GCVPACK: Routines for Generalized Cross Validation (free release in 1986; now part of base of R; Bates, Lindstrom, Wahba and Yandell 1987)
  • MCMC-QTL: Markov chain Monte Carlo inference for Quantitative Trait Loci. (free release in 1998; Satagopan, Yandell, Newton and Osborn 1996).
  • RevJump-QTL: Bayesian Model Determination of the Number of QTLs using Reversible Jump MCMC. (free release in 1999; Satagopan and Yandell 1998).
  • Splus/QDA: Quality Data Attributes Analysis. (proprietary release in 1999; Yandell and Tragon Corporation).
  • Practical Data Analysis: library(pda) for Splus and R. (free release in 1997; revised in 2000)
  • Microarray Data Analysis: library(pickgene) for R. (Bioconductor release in 2001; Lin et al. 2001)
  • Quantitative Population Ethology: library(ewing) for R. (free release in 2001; Ewing et al. 2001)
  • Bmapqtl: Bayesian QTL mapping module for QTL Cartographer. (public domain release in 2001; Gaffney 2001)
  • R/bim: Bayesian interval mapping R library. (free release in 2002; CRAN in 2003; Bioconductor in 2004). [deprecated]
  • R/qtlbim: QTL Bayesian Interval Mapping. Improved and totally revamped R library for model selection with Bayesian interval mapping, allowing for covariates and epistasis; CRAN in 2006. [deprecated]
  • R/qdg: QTL-driven dependent graphs R library (CRAN 2008); see Chaibub Neto E, Ferrara C, Attie AD, Yandell BS (2008) Inferring causal phenotype networks from segregating populations. Genetics 179: 1089-1100. doi:10.1534/genetics.107.085167. [deprecated]
  • intermediate: Mediation analysis building on work of Gary Churchill team and Elias Chaibub Neto.
  • R/qtl2 extensions
  • R/foundr: Package to analyze and visualize Diversity Outbred (DO) founder lines by sex and condition.