Wednesday, January 11, 2017

The CBGM in One (!) Sentence

The discussion from last week’s post on Greg Lanier’s thoughts on the CBGM was helpful, I thought. In particular, it made me think again about how best to present the CBGM to those who find themselves mystified or even frustrated by its complexity.

In my experience, the most common reaction to the method is still mistrust and a kind of anxiety. Some of that is not surprising. We don’t like people messing with our New Testament text in ways we don’t understand. I get that. I get it because it was a major motivation for my own research. I wanted to know why my Greek New Testament was changing and whether the changes were any good. At its most basic, that was the reason for my dissertation.

In light of that, and in light of some of the feedback I’ve received from my JETS article, I wanted to follow up with a new attempt to explain the method. In particular, I want to take a stab at defining it in a way that is not only accurate and clear, but also somewhat less intimidating.

So here is my one-sentence description: the CBGM is a new set of computer-based tools for studying a new set of text-critical evidence based on a new principle for relating texts. (Notice that I have not used any of the words in the name “Coherence-Based Genealogical Method” to define it. My English teachers would be proud.)

I think that does a good job of covering all the bases. But it is still a mouthful, so let me try to explain each part in turn. I’ll start at the end and work my way back.

...based on a new principle for relating texts

[Photo caption: This is not the CBGM. (Although I kind of wish it was.)]
First, the CBGM really is based on a new principle for relating texts. I say this because there has been some reasonable doubt (mostly from David Parker) about whether the CBGM is truly a new method or just a new application of an old one, namely Lachmannian stemmatics. This was an early question in my viva, in fact. And for good reason, because right at the beginning of my thesis I had referred to the CBGM simply as a “tool,” a description I’ll come back to. But I am not convinced that the word “tool” does enough to cover what the CBGM is. So my answer to the question was (and is) that the CBGM is based on a genuinely new way to relate texts to each other. In particular, whereas Lachmannian methods deduce witness relationships from shared agreement in “error,” the CBGM aggregates relationships from both agreement and disagreement. So far as I know the history of TC, that is a genuinely new development. Certainly it has never been applied to the NT in this way before. I won’t go into the specifics here, because I’ve outlined the differences more fully in my JETS article. But suffice it to say that, at the heart of the CBGM, there is a genuinely new methodological principle for relating texts.

...for studying a new set of text-critical evidence

Second, along with a new method, the CBGM also provides us with a new set of evidence. That new evidence comes in the form of “coherence,” and it has two types or “flavors,” if you will. The first type is pre-genealogical coherence, and it is new only in how it is used and how much data it uses. It’s not really new in terms of what it is, because it is nothing more than a quantitative analysis of witness agreement, something we’ve been doing in TC for about one hundred years. What’s new is the amount of agreement used and, more importantly, how it’s used: to help us find more (or less) plausible relationships between variants. I won’t go into detail here, but a helpful example of how it works in practice is given in Tommy’s TC article on Mark 1.1 (PDF).

The second type of evidence is very similar in function but much broader in terms of the data used. This type is called genealogical coherence (no pre- here), and this is a genuinely new type of evidence. At its simplest, genealogical coherence is used by taking all the data in the CBGM about witness relationships and then applying these data to specific textual variations. Looking at how the witnesses relate to each other overall can hopefully tell us something about how the variants themselves relate (or don’t relate) at the specific point we’re studying. In particular, it may suggest cases of “multiple emergence” or “coincidental agreement” of a reading, where scribes have independently created the same variant. For examples of using genealogical coherence, see the cases I cite in the appendix of my JETS article or read Tommy’s NovT article.
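To make the quantitative step behind pre-genealogical coherence concrete, here is a minimal sketch in Python. The witness names, variation units, and readings below are invented for illustration; the real data sets are vastly larger.

```python
# Toy illustration of pre-genealogical coherence: the percentage of
# agreement between witnesses across a set of variation units.
# Witnesses W1-W3 and units u1-u4 are invented for the example.
from itertools import combinations

# Each witness maps variation units to the reading it attests there.
witnesses = {
    "W1": {"u1": "a", "u2": "a", "u3": "b", "u4": "a"},
    "W2": {"u1": "a", "u2": "b", "u3": "b", "u4": "a"},
    "W3": {"u1": "b", "u2": "b", "u3": "a", "u4": "b"},
}

def agreement(w1, w2):
    """Percentage of shared variation units where two witnesses agree."""
    shared = set(w1) & set(w2)
    same = sum(1 for u in shared if w1[u] == w2[u])
    return 100.0 * same / len(shared)

for a, b in combinations(witnesses, 2):
    print(f"{a}-{b}: {agreement(witnesses[a], witnesses[b]):.1f}%")
```

Pre-genealogical coherence simply asks which witnesses show high percentages of agreement like these; it says nothing yet about the direction of any relationship.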

A new set of computer-based tools...

Third, this brings us to the CBGM as a set of tools. The reason my definition above leads with the word “tools” rather than, say, “method” is that this is how most of us will actually interact with and use it. That is, most of us will only ever know the CBGM through the set of online tools known as “Genealogical Queries.” Right now, these are available for the Catholic Epistles but soon, we hope, for Acts too. The first version for the Catholic Epistles has actually been online since 2008, so these are “new,” but only relatively so.

There are five tools in particular. Two are for viewing information about possible witness relations: the “Potential Ancestors and Descendants” tool and the “Comparison of Witnesses” tool. Two more are for studying the new type of evidence I mentioned above, and these are aptly if not obviously named “Coherence in Attestations” and “Coherence at Variant Passages.” You can use both of these for studying the genealogical coherence at a specific point of variation. Unfortunately, there is currently no easy way to study pre-genealogical coherence at specific variations, except in the Gospels. But explaining that would take us too far afield. The last tool is simply called “Local Stemmata,” and this will show you how the editors of the ECM/NA28/UBS5 have related nearly all the variants in the Catholic Epistles to each other. It’s quite amazing, actually, that we have this level of detail available. You won’t get any commentary, of course, but you will learn more about the editors’ judgment from these stemmata than you ever could just by looking at your NA or UBS apparatus.

So that’s it in a nutshell. The CBGM is (1) a new set of computer-based tools for studying (2) a new set of text-critical evidence based on (3) a new principle for relating texts.

That’s it.

Okay, that’s not really it. There’s more to explain, especially about how to actually use these new tools. But hopefully this three-part way of defining the CBGM can go some way toward de-mystifying what I know can be an intimidating new development in our discipline. If it does that, then it will have done what it needs to do.

As an addendum, I should say that I don’t know whether the teams in Münster or Birmingham or Wuppertal who are or will be using the CBGM would agree with all I’ve said. I obviously don’t speak for them. But this is my attempt after three years of thinking about it (almost!) every day to distill it for the “rest of us.” If others can do better than I have here, that’s great.


  1. Thank you Peter. Your one-sentence summary is very helpful. Please keep the explanations coming! Maybe a "how-to" series on the blog might be good. In other words, a step-by-step simple tutorial on how to use these new tools and how to interpret the data.
    Perhaps that is too tall an order. And I do understand that some of these articles you have referenced do attempt to do just this.

  2. Perhaps the photo should not have been Robbie the Robot, but rather "Pay no attention to the man behind the curtain"?

    1. Just curious if Maurice Robinson has any written works or opinions on this topic. I greatly respect you as a scholar, and I'd love to hear your thoughts. :)

  3. Can you explain this sentence or is there a more detailed explanation in your article: "Lachmannian methods deduce witness relationships based on shared agreement in “error,” the CBGM rather aggregates relationships based on both agreement and disagreement."

    Colwell and Tune deduced relations based on overall agreement (and some cut offs). What does "aggregat[ing] relationships" mean? It's not a particularly clear term.

    1. Stephen, yes, I explain the CBGM's distinctive in relation to Lachmannianism a bit more in the article. Whether "deducing" and "aggregating" are the best terms to get at the difference is probably debatable. They were the best I could find.

      As for the question about Colwell and Tune, see my response to Mike Holmes below. The key is the addition of directed variant relationships in the aggregate. That is genuinely new to the CBGM.

    2. Peter,
      It is these 'directed' variant relationships, so elusive in the articles and other materials available on the CBGM, that make some, indeed many, of us skeptical. Dr. Wasserman's article you referenced was extremely clear, but unlike Münster, he provided the basis for his decisions.

    3. Well, you can see all the relationships online. But, yes, the rationale for more would be helpful. I document all I've found in my article. But I don't know that anyone would want to read the rationale for all 3,000+ places of variation.

    4. Thanks, Peter. The term "directed" wasn't in your original post (haven't read the article, but I hope your post was meant to stand on its own). Is this a replacement now for "both agreement and disagreement"? How does "directed variant relationships in the aggregate" then differ from the old Lachmannian "agreement in error"?

      I have a rough idea of how the CBGM works and my impression is that it tends to resist simple, succinct characterizations.

    5. Stephen, directed is only a qualification of disagreement. Where two witnesses agree, Lachmannian methods and the CBGM both work from the principle that agreement implies relationship. (Lachmannianism goes further in assuming agreement implies shared origin.) But where two witnesses disagree, Lachmannianism is silent whereas the CBGM is not. In cases of disagreement between two witnesses, the CBGM asks the editor to relate the disagreements and then the CBGM aggregates those disagreements to determine the directed relationship of the witnesses.

      As I say in the article, "This is fundamentally different from the common error principle which, as Maas noted, can never directly demonstrate the dependence of one witness upon another but can only do so indirectly by excluding the possibility of independence" (see Maas, p. 42).
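      To put that aggregation in concrete terms, here is a minimal sketch in Python. The local judgments are invented for illustration; this is not the actual CBGM code.

```python
# Sketch of aggregating editor-judged variant directions into a directed
# witness relationship. The local judgments below are invented examples.

# At each unit where witnesses A and B disagree, the editor has recorded
# which witness attests the prior reading ("A" or "B").
local_judgments = {"u2": "A", "u5": "A", "u7": "B", "u9": "A"}

a_prior = sum(1 for v in local_judgments.values() if v == "A")
b_prior = sum(1 for v in local_judgments.values() if v == "B")

# The witness with more prior readings at the points of disagreement
# is treated as the potential ancestor of the other.
if a_prior > b_prior:
    relation = "A is a potential ancestor of B"
elif b_prior > a_prior:
    relation = "B is a potential ancestor of A"
else:
    relation = "undirected (equal priority counts)"

print(relation)  # here A has the prior reading at 3 of 4 units
```

      The real method aggregates thousands of such local judgments across the whole tradition, but the directional step is this simple in principle.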

    6. As a neo-Lachmannian, I wouldn't be doing my job without objecting to your statement that "Lachmannian methods ... work from the principle that agreement implies relationship." This is imprecise and, in my experience, misleading because it leaves off the key qualification of "agreement in error." If you are not measuring agreements in error, e.g., by measuring overall agreements (including agreements in true readings), then you are not doing Lachmannian stemmatics. It's really a key distinguishing feature of the method.

      It is not clear to me, though it is of interest, whether the CBGM also somehow, under the hood, manages to count agreements in error somewhere along the line in its complicated processes. The fact that they input local genealogies into the process shows that this information is available. Whether and how it's used is another question.

      As for where two witnesses disagree, I don't understand your point. All editors, even Lachmannian ones, look at disagreements.

    7. Yes, agreement in error is a very important qualification for the Lachmannian principle. The CBGM has an analogous distinction in "connective" and "non-connective" readings, but since this distinction isn't used everywhere in the CBGM (it is not used in pre-genealogical coherence) like it is in Lachmannianism, I didn't want to apply it to the CBGM across the board. (So the imprecision was my way of being more accurate to what both methods share.)

      But where the CBGM is most unique is that disagreements determine the direction of witness relationships. Where witnesses A and B have variants in a -> b relationships, those variant relationships determine the directed relationship between A and B themselves. Lachmannianism does not do that because its genealogical principle is different. Lachmannianism works by finding shared ancestry on the principle that "shared agreement in error implies shared ancestry." The CBGM, on the other hand, works by finding direct (i.e., not necessarily shared) ancestry on the principle that the relationship of variants is the relationship of their attesting witnesses.

      I'm not sure grasping this distinction is terribly important to understanding the CBGM. I only brought it up because I think it shows the novelty of the CBGM's basic principle for determining genealogy. It's not the only valid principle. But it is new. That was my only real point in bringing up the comparison.

    8. Thanks, Peter. That's very helpful. It even makes sense of your Maas quotation.

      It's clear that there's a live distinction between the two methods: finding direct ancestry between two extant texts (CBGM) versus indirect or shared ancestry between two extant texts (stemmatics). I'll have to think about why doing it one way or the other is important and what the side effects of that decision are.

  4. I think I would take issue with Peter's sentence on a couple of points.
    1) That it is computer-based is totally irrelevant. Computers are handy for producing and playing with large amounts of data, but the best way to learn the CBGM is by doing a restricted set of variants and witnesses by pen and paper. And the computer scripts that have been used so far are not public and are known to contain some errors (which are not part of the method).
    2) It seems as if the goal of the method is radically diminished in this description as well. We know from the publications on the method up to 2011 that its ambitions were much higher than bringing a 'new kind of evidence' to the table.
    3) To say that the tools are available is as yet quite an overstatement. The theory behind the tools is available, and anyone could write (and publish!) their own scripts. Yet what is available on the website is only the interpreted data seen through the tools as used in the preparation of ECM1 and ECM2. No one could, with the currently available tools, create a text on different text-critical criteria whilst accepting the quantitative data. This is no criticism of the INTF; they made perfectly acceptable choices as to their priorities. Ironically, though, this has led to a situation in which the most 'scientific' text falls foul of the scientific criterion of absolute perspicuity of method, data, and procedures.

    1. Dirk,

      1) I agree to a degree. The basic principles can be done by hand, but it is hard to imagine those principles being of any practical value for editing 3,000+ variations without the computer. But pen and paper is essential to learning the method, in my experience. I went through a lot of Post-it notes over the last three years.

      2) Are you thinking of the global stemma here?

      3) I think the current tools can be used to challenge INTF's own results from using them, so I think it's fair to say the tools are available. At the very least, my point is that the tools are available for learning the method.

  5. I would add that it is only a new kind of evidence if you accept the premise that one can establish the priority of a witness based on a subjective method of determining which texts within a group of witnesses are earlier.

  6. A great post, Peter, and the idea of a "one-sentence summary" as a way of focusing analysis is a great one. I would like to second Stephen Carlson's mention of the work of Colwell & Tune; I hear Colwell's ideas floating in the background of your analysis. I would also add, esp. re the role of the Byzantine tradition, the work of Zuntz (in Text of the Epistles). Over against the Westcott-Hortian view of the history of the NT text that fundamentally shaped the editorial work on UBS 1-2-3-4/NA26-27, Zuntz championed the importance of the Byz. textual tradition for understanding the early history of the NT text. One could suggest that the CBGM makes it possible to carry out systematically and completely the kind of analyses of textual relationships that Colwell envisioned (but lacked the means to carry out), and that the view of the history of the NT text that has emerged from the work on the ECM was foreshadowed by Zuntz.
    Mike Holmes

    1. Mike, your last sentence is interesting and one I will have to consider some more.

      As for Colwell, I think it is fair to say that he greatly sharpened the use of quantitative analysis for determining manuscript relations, but I don't think we could say he invented it. In any case, if the basic principle that he worked from is that manuscript agreement implies manuscript relationship, others had done that long before him. And, as I say, the CBGM has not innovated on this point. Or, if it has, it has only innovated in how it uses it and how much data it uses. What the CBGM does that is genuinely new is that it adds a second principle, namely, that directed variant relationships (NB: not just agreement) imply witness relationship. Did Colwell do that? Not that I'm aware.

  7. Peter Gurry,
    Of course more information is valuable, and the rationale for all 3,000+ places of variation is not necessary. Yet, like Dr. Wasserman's article, a detailed explanation (commentary) for any time a decision was made that a variant occurred multiple times independently would go a long way toward removing the air of mystery around the CBGM.
    Maybe this information is available and I missed it?

    1. Yes, a commentary would be great. Hopefully for Acts. As for finding cases of possible multiple emergence, you can do that already with the coherence tools that are online.

  8. The software is available on GitHub with absolutely no documentation on how to set it up. I see nothing that resembles a test suite. As someone who worked on a C++ team at Microsoft for 20 years, I really get the sense that Gerd Mink led a small team (like 2-3 guys) to write a set of Python scripts that pull from a database, then transform the data with graph theory and their TC rules.

    I cannot sign off on this method yet. I need to see real documentation, including design and functional specifications with exit criteria, written by the engineers along with acquisition and usage manuals and samples written by professional technical writers. 

    And to all those who scoff at the idea that you still need to master Greek and Hebrew and not just know how to run the tools: you had better apply the same benchmark to this game-changing text-critical method. You would all have to go back to college and get computer science degrees so that you could definitively defend the tools. Oh, and by the way, software is buggy. Again, just putting undocumented code on GitHub does not really make it open source. Until they develop a community of users who can dogfood the software, contribute to the source, run and contribute to tests, file and fix bugs, do code reviews, etc., it IS just a black box. That 500-page deck that Gerd wrote does do a nice job of explaining the heuristic, but I bet that deck was mostly for the committee, and I suspect that even their eyes glossed over. I suspect that there are like 2-3 guys (or gals) who know how the code works, and that if they got hit by a bus, the whole thing would be on the floor.

    While I suspect that the software probably does what they say it does, I personally refuse to totally delegate to a magisterium of elites until I can check their work like a good Berean. And sorry, but just running queries to evaluate the results is not enough. If we are supposed to master Greek and Hebrew and not just be able to run Accordance, BibleWorks, Logos, etc., then on that same principle, we need to understand how the software works and be able to defend each query on our own, not depend on a Metzger-like commentary that does not scale with this methodology.

    Stephen MacKenzie