Charles Brenner - There's DNA Everywhere

I've made recent and rapid progress in my long-delayed venture to convert my forensic DNA interpretation work into the modern world of Dyalog APL.

The impetus is solving a particular new problem (as opposed to merely converting legacy code that implements old solutions). The problem is to calculate whether, and how strongly, a suspect can be inculpated as a likely contributor to a DNA mixture, meaning a combination of several people's DNA. For example, rape evidence typically contains the victim's and the assailant's DNA commingled. Likewise, DNA detection technology has become so sensitive that DNA can be detected on the grip of a gun from mere touch, but as a consequent complication it is common to find that several other people have overlaid their DNA as well.

In my experience, computer folks enjoy the scientific aspects of the story, and I would enjoy relating how, in my view, this project exemplifies the thesis that APL is a tool of thought. I credit APL with leading me to elegant, and therefore very fast and flexible, solutions to a problem for which the competing solutions are lumbered by complicated statistical and Monte Carlo methods.

Elegance and simplicity lead to several concrete benefits. The first is conceptual development. From January through September of 2013, I made notes in APL as one after another solution to various aspects of the problem occurred to me. The brief APL notes ensured that the ideas really make sense, that they work together, and that I would not forget them. The brevity also revealed a simple but important point that others had overlooked: nesting the computation loops in the right order saves orders of magnitude in computation time.

That the program runs fast makes it much easier to see the forest in many ways -- testing, developing, designing. The leading competing programs take minutes or hours to find the answer for the "maximum likelihood (ML)" contributor proportions. (The meaning of those technical words doesn't matter for the story.) Having worked that hard, both of those programs then succumb to a natural tendency to call it a day.

The APL "DNA-VIEW Mixture Solution" program finds the ML answer in a fraction of a second, which makes it much easier to think through to the fact that there's a lot more work to do: dozens or even thousands of times more computation before the result is logically defensible. Another example: with DNA forensic calculations it's usual, if a background population is necessary, to compute assuming one particular racial population, and in a multi-racial country like the US to repeat the calculation for each race. After crafting a careful APL-flavored implementation -- element-wise inner products and suchlike -- it became obvious from the code itself that at little cost, essentially by replacing a scalar with a vector, I could not only calculate several races simultaneously but also cater to the possibility of different races for two different contributors, as for example with a gang rape. The idea of accounting for the possibility of one white and one black rapist is thus a bonus suggested by the concise APL formulation.
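
To make the scalar-to-vector step concrete, here is a minimal sketch in Python rather than APL. It is not the Mixture Solution code. It assumes simple Hardy-Weinberg genotype probabilities at a single locus, and the population names, allele labels, frequencies, and function names are all invented for illustration. Computing a genotype probability once per population replaces a single number with a vector, and pairing the two contributors' vectors gives a table covering every combination of populations.

```python
from itertools import product

# Hypothetical allele frequencies at one locus (made-up numbers, not real data).
FREQS = {
    "pop_A": {"a": 0.2, "b": 0.8},
    "pop_B": {"a": 0.5, "b": 0.5},
}

def genotype_prob(freqs, genotype):
    """Hardy-Weinberg probability of an unordered genotype (x, y)."""
    x, y = genotype
    p, q = freqs[x], freqs[y]
    return p * p if x == y else 2 * p * q

def pair_prob_table(populations, g1, g2):
    """Joint genotype probability of two unrelated contributors for
    every (population of contributor 1, population of contributor 2)."""
    # One value per population: the "scalar replaced by a vector".
    v1 = {name: genotype_prob(f, g1) for name, f in populations.items()}
    v2 = {name: genotype_prob(f, g2) for name, f in populations.items()}
    # Outer product of the two vectors covers mixed-population pairs too.
    return {(n1, n2): v1[n1] * v2[n2]
            for n1, n2 in product(populations, repeat=2)}

table = pair_prob_table(FREQS, ("a", "a"), ("a", "b"))
for populations, prob in table.items():
    print(populations, round(prob, 4))
```

The single-population calculation is the diagonal of this table; the off-diagonal entries are the mixed cases that come almost for free once the frequencies are held as a vector.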

In three months at the end of 2013 I found enough time to implement the Mixture Solution, gradually added some convenience features, shared it with some beta testers, and tried the program on a variety of examples, including a set of 5 proficiency exercises created at the National Institute of Standards and Technology. At the beginning of April I visited NIST to discuss my results. One hundred plus entrants, including the leading competition, had contributed analyses. One of the competitors (supported by years and millions of dollars of government grants) got four problems right and came close on the fifth, viewing it as a three-person mixture when in fact it was a four-person one. Mixture Solution alone correctly analyzed all five, and as a bonus it correctly diagnosed one suspect as a mixed-race person. We APLers try to be modest but it's not always easy. Sometimes there's just not much to be modest about.

Bio: Charles Brenner, forensic mathematician

Following his undergraduate work at Stanford, Charles Brenner began his career in practical mathematics as a professional bridge player in London. He followed this with a Ph.D. in number theory at UCLA in 1984. The focus of his work is forensic mathematics. His continually evolving DNA-View software has been used by labs around the world for the last 30 years, and he maintains the well-known "DNA-View" website as a resource on forensic mathematics.