Tuesday, December 15, 2015

How (Not) to Use a Computer-Generated Stemma

Barbara Bordalejo has been heavily involved in the use of computer-aided stemmatics since at least the time of her NYU doctoral thesis (2003; found here).

In the newest issue of Digital Scholarship in the Humanities she has an important essay on how to understand the differences (and similarities) between the genealogy of texts and the genealogy of manuscripts, a distinction central to the CBGM.

Put another way, this is an article on how to avoid misusing computer-generated genealogies. Since this is a concern that comes up again and again when I talk to people about the CBGM, I thought it would be worth quoting Bordalejo’s conclusion in full:
Phylogenetic analysis and other computer-assisted stemmatological approaches can be used productively when studying large textual traditions, despite the difficulties presented by contamination, changes in order, major alterations, and significant losses. The stemmata produced using computer-assisted methods are working hypotheses which serve as a starting point of investigation. These stemmata, whether they correspond to a textual tradition or a manuscript tradition, are one of the tools that we can use to further our understanding of how texts are transmitted and how variants are inherited. What they do not do is to present us with a one size-fits-all solution that could answer all of our queries. In the end, we are still subject to the remarks of A. E. Housman who said that knowledge and method were important, but that besides those a scholar was required to make use of her brain (Housman, 1921).

The interpretation of the stemmata generated by the use of phylogenetic software is fundamentally changed when we understand the difference between textual and manuscript traditions. Although the search for meaning in each of these follows a similar pattern, the recognition of the differences between the data sets will have an impact on our expectations.

A stemma, computer-generated or made by hand, is only a graphic representation of a hypothesis (machine or human or a combination of both) created following a specific model and has to be treated as such. The historical reality that underlies our hypotheses cannot be recovered in its totality, whether this reality corresponds with the textual tradition or with the manuscript tradition. However, combining computer-assisted stemmatic analysis, database searches and historical knowledge of the production history of a particular text can help us build increasingly convincing hypotheses about it. Once we recognize this, we will be better equipped to use the tools at our disposal more efficiently and interpret the results of our research more accurately.*
Notice that Bordalejo is saying that regardless of whether we are after manuscript relations or textual relations, our stemmata are only partial representations of the historical reality. This is important because some of the literature on classical manuscript stemmatics can leave one with the impression that what they provide is a complete history. But this is not the case (cf. M. West, Textual Criticism, p. 35 and P. Trovato, Lachmann’s Method, p. 144 [quoted here]).

It’s also worth noting that Bordalejo does not set textual relations against manuscript relations but considers them to be mutually informing. The bulk of her essay helps us think through how they can be used in this way. But for that, you’ll have to read the full article (it was free).

*Barbara Bordalejo, “The Genealogy of Texts: Manuscript Traditions and Textual Traditions,” Digital Scholarship in the Humanities 30.4 (2015): 1–15 (12–13); italics mine.


I see that Peter Robinson also has an essay out in DSH which comes to many of the same conclusions. For example: “In our reconstructions, we are making a wager about history, not a statement of fact.... These strictures still leave substantial space where quantitative methods can, in combination with traditional scholarly methods and knowledge, achieve valuable results.” He goes on to cite the example of Prue Shaw’s work on Dante. His article is “Four Rules for the Application of Phylogenetics in the Analysis of Textual Traditions” (not free).


  1. Just a note: I had to be logged in through my university in order to access Bordalejo's article. Thanks for the alert to the article in any case.

  2. Ah, sorry. It must have only been free temporarily.

    1. I have noticed recently that some articles are free for a seven-day period, and after that you need paid access! Thanks for the information though, I was able to read several other articles by the author.
      I also wonder how her observations apply to the CBGM. Speaking as a non-specialist in TC: scholars seem to indicate in the many articles on the CBGM that it isn't a perfect system; yet, as this author suggests, it seems to be treated as such in reality! I admit it is possible that my limited understanding may be the problem!

      Thanks again!