Post contributed by Tom Gilbert, Natural History Museum of Denmark. Member of the Editorial Board for Open Quaternary. More about Tom Gilbert’s research can be found here.
Ancient DNA studies often draw fantastic levels of press coverage. Rarely does a month pass these days without a highly accessed website such as BBC News having at least one ancient DNA-related study on its front page. Whether relating to new discoveries about our relationship to now-extinct hominids, refining our understanding of the peopling of our planet, reconstructing the origin and spread of pathogens, identifying the resting place of long-dead monarchs or weighing in on the debate as to whether Yetis are figments of the imagination or not, ancient DNA-based stories are clearly easy to sell. This is great for people embedded in the field (like me) – most scientists like nothing more than reading about themselves online. Egotism aside, the fact that the public want to hear all about your particular research area provides concrete evidence that it is of wide-scale interest and relevant for 21st century society.
Researchers are doing amazing work exploiting ancient DNA to understand the past. And I really do mean this – the amount of information we can glean from typical sources of material, including degraded archaeological bone, soils, ice and even coprolites, coupled with some of the state-of-the-art analyses that can be applied to the data, is revealing findings that were beyond our wildest dreams only 10 years ago. For example, who would have thought that we would be able to use DNA recovered from a single, tiny finger bone to identify a whole new hominid species, as with the Denisovans? Or that the genetic information contained in a ca. 24,000-year-old skeleton from Mal’ta in central Siberia would reveal that there had been contact between populations spanning the northern hemisphere by the Upper Paleolithic, and that this contact ensured that the original Native Americans carried a component of West European DNA in their genome?
There is one catch, however, and it is the central point of my post. The sample sizes of almost all of these high-profile studies are n = small. Actually, n = very small. Like 1. Now don’t get me wrong; there are a number of very good reasons for this.
The first problem is cost. At the core of most leading ancient DNA studies these days is palaeogenomics – that is, the recovery of not just snapshots of DNA, but the majority of the genome within any specimen. Genomes from modern specimens are not cheap, falling in the range of thousands of dollars. Genomes from ancient specimens are even more expensive, since the DNA is highly degraded, mixed with microbial DNA, and so on. Study costs can easily get into the range of hundreds of thousands of dollars per genome.
Secondly, there is simply a lack of material to work on. Either samples simply don’t exist (the Denisovan finger bone was the only bone discovered at that time), or if they do, they don’t contain DNA due to age or poor preservation (e.g. the Flores hominids). Those samples that do exist are often (rightly or wrongly) buried under too much red tape to enable the destructive sampling required to exploit them.
And thus, we come to my central point. Small sample sizes, when combined with wide press coverage looking for a sensationalist angle, are dangerous. Yes, single samples can provide incredible, newsworthy insights. But a key question is, how much can we actually extrapolate from these findings? This is not just a problem in palaeogenomics – many other disciplines face the challenge of reconstructing scenarios based on limited data. But not all disciplines routinely get quite as much press, and thus find their implications spreading rapidly into secondary educational resources such as blogs, textbooks and popular science articles, without the primary literature being consulted. Simple example – I still come across articles today discussing some incredible results from the early 1990s (Woodward et al. 1994) that reported the recovery of DNA from dinosaur remains (http://www.unmuseum.org/dnadino.htm). The problem is that, shortly after the initial publication, it was rapidly demonstrated that the results derived from human contamination (Hedges and Schweitzer 1995), and that for sound biochemical reasons the chance of dino DNA really being there is essentially zero (Lindahl 1993). Ancient DNA results, in particular the first ones on whatever the latest sensational story is, spread far and wide, and later voices arguing alternate viewpoints have a much harder time being heard.
And so back to my point. How many of the recent palaeogenomic-based findings will hold up with time, as larger datasets are analysed? Were there really three founder sources to the modern European population, as recently claimed using palaeogenomic data from three differentiated ancient populations (BBC Story; Lazaridis et al. 2014)? Or is this observation simply an artifact of the fact that only material from three ancient populations was analysed? If we add a fourth or fifth distinct group, will the number of founders increase? One thing we have definitely learnt from ancient DNA in general is that the past was far more complex than we can imagine. Given that, my answer to this question is: I expect so!
Hedges SB, Schweitzer MH. 1995. Detecting dinosaur DNA. Science 268:1191-1192.
Lazaridis I, et al. 2014. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409-413.
Lindahl T. 1993. Recovery of antediluvian DNA. Nature 365:700.
Woodward SR, Weyand NJ, Bunnell M. 1994. DNA sequence from Cretaceous period bone fragments. Science 266:1229-1232.