by Ian Webber Evett CBE
ONE OF THE MOST CHALLENGING TASKS for today’s forensic scientists is the
interpretation of low-level, degraded, or mixed DNA profiles from evidentiary
The theory for assigning probabilities to DNA genotypes was established
to a broad consensus in the 1990s. The task of deciding on the combinations of
genotypes and the relative weightings to be included in a calculation of evidential
weight, though, was much slower to evolve because of the complexities and
consequent computing power demands.
During this time, the forensic interpretation of profiles from DNA
mixtures was by and large a manual, time-consuming process that
analysts tackled by using heuristics to determine those genotype combinations
that could reasonably explain a recovered profile. Now, all of that has been
changed radically by the development of what has come to be known as probabilistic
genotyping (PG) software.
The heuristic methodology, which is still being used in many forensic
laboratories, meant that analysts relied on simplified interpretation
strategies to deal with mixed DNA profiles. Applying various fixed thresholds
and other biological parameters (such as heterozygote balance, mixture
component ratios, and stutter ratios), they based their interpretations on
genetic data being predominantly above a given threshold, on the basis of which a prospective
contributor to the mixture could be included or excluded.
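To make the threshold-based approach concrete, the sketch below applies three fixed rules to the peaks at a single locus. It is an illustration only: the threshold values, the function names, and the simplification that stutter appears exactly one repeat below its parent allele are all assumptions made for this example, not values or procedures taken from any laboratory protocol.

```python
from typing import Dict

# Illustrative values only (assumptions, not from any validated protocol).
ANALYTICAL_THRESHOLD_RFU = 50.0  # peaks below this height are ignored
MAX_STUTTER_RATIO = 0.15         # small peak / parent peak at or below this is treated as stutter
MIN_HET_BALANCE = 0.60           # smaller / larger peak height for a plausible heterozygote pair


def filter_peaks(peaks: Dict[int, float]) -> Dict[int, float]:
    """Return the peaks that survive the fixed-threshold rules.

    `peaks` maps an allele (whole repeat number) to its peak height in RFU.
    A peak is dropped if it is below the analytical threshold, or if it sits
    one repeat below a taller peak and is small enough to be read as stutter.
    """
    kept = {}
    for allele, height in peaks.items():
        if height < ANALYTICAL_THRESHOLD_RFU:
            continue
        parent_height = peaks.get(allele + 1, 0.0)
        if parent_height > 0 and height / parent_height <= MAX_STUTTER_RATIO:
            continue
        kept[allele] = height
    return kept


def is_balanced_pair(height_a: float, height_b: float) -> bool:
    """Check whether two peaks are balanced enough to pair as one heterozygote."""
    larger = max(height_a, height_b)
    if larger <= 0:
        return False
    return min(height_a, height_b) / larger >= MIN_HET_BALANCE


if __name__ == "__main__":
    observed = {12: 60.0, 13: 1000.0, 14: 40.0, 15: 800.0}
    print(filter_peaks(observed))
    print(is_balanced_pair(1000.0, 800.0))
```

Rules of this kind are binary: a peak is either kept or discarded, and a pairing is either allowed or not. That is what makes the approach workable by hand and, at the same time, wasteful of information when a profile sits close to the thresholds.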
While this manual approach to mixture interpretation worked fairly well
for most two-person mixtures, it was unwieldy at best and questionable at worst
for more-complex mixtures. As a result, good data that could potentially have provided
reliable weight on the issue of whether DNA from a person of interest (POI) was
present was often classed as inconclusive and ultimately discarded.
Further complicating this situation was a rise in the
complexity of DNA interpretation because crime samples were from an increasing
proportion of lower-quality and more-complex mixtures.
Faced with a growing volume of less-than-ideal profiles, forensic labs
recognized that the traditional interpretation process would no longer suffice.
Fortunately, the rise in more complicated DNA mixtures was accompanied by
increased availability and subsequent use of PG software as the method of choice
for interpreting DNA profiling evidence.
PG software allows
forensic labs to assess literally thousands of proposed profiles with respect
to how closely they resemble or can explain an observed DNA mixture profile.
Analysts can then calculate the probabilities of the observed DNA evidence, given
propositions that might represent prosecution and defense positions at a future
trial. These two probabilities, in turn, can then be presented as a likelihood
ratio (LR), which expresses the evidential weight of the findings and the strength of
support for one proposition over the other.
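The arithmetic of the LR itself is simple: it is the probability of the findings given one proposition divided by the probability of the same findings given the other. The sketch below shows that ratio, together with a deliberately simplified worked case: a clean single-source heterozygous profile that matches the POI, with the alternative contributor's genotype probability taken as 2 × p_a × p_b under Hardy-Weinberg assumptions. The function names and allele frequencies are hypothetical, and the example ignores mixtures, drop-out, stutter, and population-structure corrections, which are exactly what PG software exists to handle.

```python
import math


def likelihood_ratio(p_evidence_given_hp: float, p_evidence_given_hd: float) -> float:
    """LR = Pr(E | Hp) / Pr(E | Hd), for prosecution and defense propositions."""
    if not 0.0 <= p_evidence_given_hp <= 1.0:
        raise ValueError("Pr(E | Hp) must lie in [0, 1]")
    if not 0.0 < p_evidence_given_hd <= 1.0:
        raise ValueError("Pr(E | Hd) must lie in (0, 1]")
    return p_evidence_given_hp / p_evidence_given_hd


def log10_lr(p_evidence_given_hp: float, p_evidence_given_hd: float) -> float:
    """Base-10 logarithm of the LR (undefined when Pr(E | Hp) is zero)."""
    return math.log10(likelihood_ratio(p_evidence_given_hp, p_evidence_given_hd))


def single_source_het_lr(freq_a: float, freq_b: float) -> float:
    """Toy single-locus case: the profile shows alleles a and b, as does the POI.

    Under Hp the POI is the source, so the findings are taken as certain.
    Under Hd an unknown, unrelated person is the source, with genotype
    probability 2 * p_a * p_b.
    """
    return likelihood_ratio(1.0, 2.0 * freq_a * freq_b)


if __name__ == "__main__":
    # Hypothetical allele frequencies of 0.1 and 0.2 give an LR of about 25.
    print(single_source_het_lr(0.1, 0.2))
```

An LR above 1 supports the first proposition, an LR below 1 supports the second, and an LR near 1 means the findings do little to distinguish between them.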
This approach has proven
to be highly effective in allowing PG software to interpret possible components
of highly complex DNA mixtures, far better than what was ever possible using
the traditional manual process alone. This, in turn, has enabled PG software to
produce usable, interpretable, and reliable DNA results that have contributed
to the successful resolution of both criminal and civil investigations and stood
up to scrutiny in subsequent judicial proceedings.
PG software has been particularly
productive in contributing to the resolution of violent crime and
sexual assault cases. It has been instrumental in excluding individuals who have been wrongly associated as
the source of crime scene evidence and, in post-conviction cases, in exonerating persons who were wrongly
convicted. It has also been useful in cracking cold
cases in which low-grade or mixture evidence that originally had to be dismissed
as inconclusive could be examined again and used to develop other investigative leads.
Because of the
scientific underpinnings, validation studies, and peer-reviewed publications
supporting its use, PG software has garnered wide scientific support for its reliability.
While PG software has been in use for less than a decade, it is based on
standard mathematics. The probability models and Markov Chain Monte Carlo
(MCMC) methods used by PG software originated in Los Alamos, New Mexico during
World War II and were then brought closer to statistical practicality by a
number of workers in the 1970s. Widely employed outside of forensic science,
MCMC is at the heart of a huge range of applications, from computational
biology and weather prediction to physics, engineering, and the stock market.
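For readers unfamiliar with MCMC, the sketch below shows the core idea in its simplest form: a random-walk Metropolis sampler that explores a probability distribution by proposing small moves and accepting or rejecting them according to how well each proposal is supported. The target here is a toy Beta(8, 4) distribution on a single quantity between 0 and 1, which could be read as one contributor's share of a two-person mixture. That reading, the function names, and all tuning values are assumptions for illustration; this is not the algorithm of any particular PG package.

```python
import math
import random
from typing import List


def log_target(x: float) -> float:
    """Unnormalised log density of a toy Beta(8, 4) target on (0, 1)."""
    if not 0.0 < x < 1.0:
        return float("-inf")
    return 7.0 * math.log(x) + 3.0 * math.log(1.0 - x)


def metropolis(n_samples: int, step: float = 0.2, start: float = 0.5,
               seed: int = 0) -> List[float]:
    """Random-walk Metropolis sampler for `log_target`."""
    rng = random.Random(seed)
    current = start
    current_log_p = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_log_p = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_log_p - current_log_p))
        if rng.random() < accept_prob:
            current, current_log_p = proposal, proposal_log_p
        samples.append(current)
    return samples


if __name__ == "__main__":
    draws = metropolis(20000)[2000:]  # discard early draws as burn-in
    # The sample mean should sit close to the Beta(8, 4) mean of 8 / 12.
    print(sum(draws) / len(draws))
```

The sampler never needs the normalising constant of the target, only ratios of densities, which is why the method scales to problems where the full distribution cannot be written down directly.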
literature contains numerous peer-reviewed papers that support the validity of
PG software. While some have pointed out that the developers of PG software are
the authors of many of these papers, it is important to recognize that
publication of a paper represents only the initial step in the peer-review
process. Such published papers are meant not simply to inform, but to provoke
discussion, promote improvements, and ultimately advance the science. To date,
the overall peer-review process supports a consensus that PG software generates
reliable results when it is used properly.
To that end, it is
incumbent on forensic organizations to ensure that the analysts who regularly use
PG software are properly trained in:
In addition, labs must regularly
review validation studies that define the limitations of PG software and properly
validate their own PG software with in-house studies in order to have a better understanding
of the data being produced. They should also regularly review the peer-reviewed
literature to know how others are working with PG software and any issues they
are experiencing. With this knowledge in hand, labs are well-placed to put
effective protocols in place that do not overstate the weight of evidence
assigned from calculations carried out by PG software, and analysts will be better
prepared to recognize when data provided by PG software cannot be supported.
All of this has led to PG software being recognized as the de
facto “go to” method for interpreting DNA profiling evidence. It enables
users to interpret DNA results faster, compare profiles against a POI,
calculate an LR, use more of the
information in a DNA profile, and ultimately, resolve previously unresolvable
and highly complex DNA mixtures. As a result, PG software is being used in
hundreds of thousands of cases worldwide. For example, one of the currently
available software packages, STRmix, has been used to interpret DNA evidence in
more than 220,000 cases in the past eight years.
The use of PG software internationally
has risen to a level at which practices are now being codified. The International
Society for Forensic Genetics and the UK Forensic Science Regulator, for
example, have published guidelines for validating software. The Scientific
Working Group on DNA Analysis Methods (SWGDAM) has produced guidelines for
validating PG tools. The Organization of Scientific Area Committees’ DNA
Analysis 2 Sub-Committee is developing standards for assessing PG software
While PG software use is
now widespread, it is not without its critics and, like virtually all other technology,
it has limitations. It should not come as a surprise that no matter how good it
is, PG software cannot interpret every DNA profile. There must be sufficient
DNA signal in the profile to move forward with an analysis using PG software. Some
profiles are simply too degraded, too complex, or have too little information to
be meaningfully interpreted.
With that in mind, it is
essential for forensic analysts to be trained on the proper use of the PG
software their lab employs, while recognizing that no matter how good, it will
not always be able to provide a useful result. As for the lab, proper validation
studies must be conducted so that everyone is aware of the limitations of the
PG tools in use. That step, in combination with development and implementation
of effective protocols designed to represent—reliably and robustly—the strength
of the PG results, will reduce the chances of interpreting software output that
is not supportable.
Limitations aside, critics
have argued that DNA analysis results generated by PG tools are unreliable
because miscodes have been discovered in the software, casting doubt on whether
the output realistically can be trusted to produce error-free results. Attorneys,
in particular, have been quick to argue that PG software represents a flawed “black
box” approach to DNA analysis in which data is fed into a computer and a result
is generated, while little is known about how that result has been derived or
the computer algorithms that produced it. This has typically resulted in
attorneys demanding to have access to the source code—and any potential
miscodes it might contain—in cases where PG software has played a key role in
conviction or exoneration.
Developers of PG
software have countered such charges by pointing out that miscodes are present
in virtually every software package. Moreover, they claim those miscodes which
have been identified to date have tended to be on the fringe of DNA typing
results. As such, they are difficult to define, not typically encountered, and
have negligible impact on the reliability of the output.
There is a view among some
developers that PG software can be effectively scrutinized by examining its
extended output, which embodies the intermediate steps of the interpretation
process. They also suggest that proper testing and comprehensive training can
help in identifying miscodes.
Requests from prosecutors
and defense attorneys to grant access to source code appear to have been met by
various responses, ranging from permission to refusal to comply (citing issues
of intellectual property rights). One developer has adopted policies which
grant attorneys, scientists, expert witnesses, and others access not only to their
source code, but also their developmental validation records, user’s manuals,
and extended output. It seems that each case needs to be considered on its own merits.
It appears that U.S.
courts have, in the main, denied motions to exclude evidence produced by PG
software, citing general acceptance in the scientific community and scientific
validity. It is worth noting, however, that it can be extremely challenging for
judges, attorneys, and juries to fathom the intricacies of DNA evidence. The
responsibility to understand and validate PG software and its applications,
however, needs to be accepted by those who regularly use forensic DNA typing
methodologies. Doing so will provide confidence in its use and proper
significance in casework, while simultaneously meeting challenges in the legal
Despite these and other challenges
that are likely to continue in the foreseeable future, even some of the most ardent
critics of PG software have been forced to admit that it represents a major
advance in DNA profile interpretation when used properly. By employing sound
science, PG software enables the scientist to provide reliable and robust
assignments of evidential value from a significantly broader range of DNA
profiles than ever before possible. And while many other developments will occur
in the future, PG software will continue to have a profound impact on the
evaluation of DNA profiling evidence in both criminal and civil investigations.
About the Author
Ian Webber Evett CBE, DSc, Hon FCSFS is a statistician with Principal Forensic Services Ltd whose career-long specialty has been the
evaluation of evidence. Evett has worked with many different evidence types,
from fingerprints to handwriting to DNA profiling. He worked in the Home Office
Forensic Science Service until its closure in 2012, then joined a group of
senior colleagues from a range of forensic disciplines in forming Principal
Forensic Services Ltd, which offers casework, consultancy, and training
services internationally. He has doctorates from the universities of
Strathclyde and Lausanne, is an honorary life fellow of the Chartered Society
of Forensic Sciences, and was invested as a Commander of the Order of the British
Empire in 2016 in recognition of his services to forensic science.