Friday, July 05, 2013

On-line article by T. Finney, "How to Discover Textual Groups"

Tim Finney has made an article available online: "How to Discover Textual Groups".

This article describes how to discover groups among New Testament
witnesses using three different multivariate analysis (MVA) methods,
focussing on textual variation in the Gospel of Mark. It applies
statistical reasoning to the question of what level of agreement or
disagreement is required for one to be confident that two witnesses are
somehow related, whether in an adjacent or opposite sense. One MVA
method, PAM analysis, partitions witnesses into a specified number of
groups, identifies the most central member of each group, and identifies which numbers of groups are more natural for a data set. The MVA methods manage to identify many groups noticed by prior generations of scholars. One conclusion of the article is that Colwell and Tune's definition of a group (70% agreement plus 10% gap) should be abandoned.
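
To make the PAM step concrete, here is a minimal sketch of partitioning around medoids on a toy distance matrix. It is not the article's code: the function names, the naive choice of starting medoids, and the six-witness matrix are all assumptions made for illustration. Real PAM implementations use a dedicated BUILD phase to pick starting medoids; only the swap search is shown here.

```python
from itertools import product

def total_cost(dist, medoids):
    """Sum of each item's distance to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def pam(dist, k):
    """Greedy swap search for k medoids over a symmetric distance matrix."""
    n = len(dist)
    medoids = list(range(k))  # naive start (assumption; stands in for PAM's BUILD phase)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid; keep any swap that lowers the cost.
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < best:
                medoids, best, improved = trial, cost, True
    # Assign every item to its nearest medoid (the group's most central member).
    groups = {m: [] for m in medoids}
    for i in range(n):
        nearest = min(medoids, key=lambda m: dist[i][m])
        groups[nearest].append(i)
    return groups

# Hypothetical distances: witnesses 0-2 are close to one another, as are 3-5.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.9, 0.9],
    [0.1, 0.0, 0.1, 0.9, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.0, 0.1, 0.2],
    [0.9, 0.9, 0.9, 0.1, 0.0, 0.1],
    [0.9, 0.9, 0.9, 0.2, 0.1, 0.0],
]
groups = pam(dist, 2)
print(groups)
```

The keys of the returned dictionary are the medoids, i.e. the most central member of each group, and the values are the members assigned to that medoid. The number of groups `k` has to be supplied by the caller, which is why a separate criterion is needed to judge which values of `k` suit a data set.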


  1. Hmm.

    I have a different idea.

    How to discover textual groups:
    (1) Collate manuscripts.
    (2) Isolate manuscripts which share distinctive readings.
    (3) For versional witnesses, reconstruct their hypothetical base-texts.
    (4) Isolate versional witnesses which share distinctive readings.
    The End.

    Was this supposed to be a paper about how to find groups, or a paper about two dozen ways to not find groups?

    TF, how do you avoid grouping MSS on the basis of percentages of agreement instead of on actual content? Picture a class of 30 students who have taken a history test. The teacher suspects that some of them have worked together in groups. To isolate groups of students, the teacher compares their grades. The teacher initially thinks that Fred and George worked together, because they both scored 55%. But when the teacher compares their test-papers, it is clear that they missed different questions; the percentage does not reliably indicate relationships. Only content-to-content comparisons can do that.

    Also: regarding your statement, "The UBS4 map for the Gospel of Mark shows that Jerome’s revision (vg) lies close to a trajectory which runs between a cluster of Old Latin texts such as Vercellensis (it-a), Veronensis (it-b), Colbertinus (it-c), and Bezae (it-d) at one end and “Byzantine” texts at the other. If these Old Latin texts represent the Latin exemplars used by Jerome, it seems that the “early” Greek manuscripts he used to revise the Latin text of Mark were of the Byzantine variety." -- Can I quote you on that?

    Why is 2427 in the analysis?

    Why does 892 jump around from one group to another so much, no matter how PAM slices up the groups?

    And what is Table 24 telling us, besides that non-Byzantine MSS agree with themselves?

    And, if you're going to restrict comparisons with P45 to points where it is legible, the same approach should be taken with OL witnesses such as k; this was done, right?

    Figure 10 is not just cluttered; it's abstract art. Some other way of illustrating the results must be used if you want to communicate clearly whatever you wanted to convey there.

    Interesting work, but I'm not sure if it shows that this sort of analysis is worthwhile, if it has to always be tweaked in order to not produce phantom-results.

    Yours in Christ,

    James Snapp, Jr.

  2. I wonder what our friends at Münster are saying about this kind of analysis, especially the claims that Jerome's revisions are of the Byzantine type, and that P45 goes with f-1 and so on ...

    Interesting work, though.

  3. Hi James,

    The distance matrices that are analysed to discover textual groups indicate relationship to precisely the same degree as the tables of percentage agreement used in quantitative analysis.

    Yes, you can quote me on anything in the Groups article.

    2427 is there because it is in the UBS4 apparatus.

    The article explains why a witness can jump from one group to another in PAM analysis. Borderline cases will shift given small perturbations in the distance matrix.

    Table 24 tells us that if the witnesses in the INTF data set are split into four groups then one group contains A-like (i.e. Ausgangstext-like), another 1339-like, another 209-like, and another 826-like texts.

    Concerning fragmentary witnesses, analysis is restricted to places where their readings are indicated in the source data sets (e.g. the UBS4 apparatus). The same is true of all witnesses analysed.

    On figure 10, the paragraph of discussion which follows says what to do if the figure is too cluttered to interpret directly.

    I have not tweaked the data anywhere (apart from the artificial data set constructed to show four well defined groups). I have instead analysed the UBS4 and INTF data to discover inherent groups. Please note the agreement between PAM analysis and Wisse's classifications.


    Tim Finney

  4. Thank you. Your blog was very helpful and efficient for me. Thanks for sharing the information.

  5. Every book from antiquity which has survived in multiple copies exhibits textual variation. Sites where extant witnesses differ can be identified by manual or computer-assisted comparison. Once the state of every witness has been recorded at every variation site where its text is discernible, a text to text distance can be calculated for each pair of witnesses. Multivariate analysis of these distances provides a way to discover textual groups among the witnesses.
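
The distance calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the article's implementation: it assumes a simple matching distance (the share of disagreements among variation sites where both witnesses are discernible), encodes an indiscernible reading as `None`, and uses made-up witnesses `W1` to `W3`.

```python
from itertools import combinations

def simple_matching_distance(a, b):
    """Share of disagreements over sites where both readings are discernible."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None  # no overlapping sites, so the distance is undefined
    return sum(x != y for x, y in shared) / len(shared)

def distance_table(witnesses):
    """Distance for every unordered pair of witnesses."""
    return {
        (p, q): simple_matching_distance(witnesses[p], witnesses[q])
        for p, q in combinations(sorted(witnesses), 2)
    }

# Hypothetical data: one list entry per variation site; numbers label the readings.
witnesses = {
    "W1": [1, 1, 2, 1, None],
    "W2": [1, 1, 2, 2, 1],
    "W3": [2, 2, 1, None, 1],
}
table = distance_table(witnesses)
for pair, d in table.items():
    print(pair, d)
```

Because each pair is compared only where both texts are discernible, a fragmentary witness contributes fewer sites to its distances, so those distances rest on less evidence than distances between complete witnesses.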

  6. This comment has been removed by the author.

  7. My Groups article has been published at last:

    Finney, T. (2018). How to Discover Textual Groups. Digital Studies/Le champ numérique, 8(1), 7. DOI: