Friday, October 06, 2006

Statistical Question

There are some useful statistics for the Greek NT listed here by Felix Just, including the number of chapters, the number of verses in each chapter, the total number of verses, and the total number of words in each book of the Greek New Testament.
But what I would like to know is the number of letters in each book of the Greek New Testament (any critical edition accepted, or even the TR for comparison). Any offers?

Update:
Rhalfs (Textual Criticism of the GNT [ET, 1901], 48f) reports that Zahn (Geschichte des N.T. Kanons, i.76) gives figures from work done by Graux (Revue de Philologie, ii). Rhalfs only provides some figures (letters first, then stichoi [I can't figure out how to do a table]):

Matthew: 89,295 (2,480)
Mark: 55,550 (1,543)
Luke: 97,714 (2,714)
John: 70,210 (1,950)
Acts: 94,000 (2,610)

3 John: 1,100 (31)
Apocalypse: 46,500 (1,292)
Philemon: 1,567 (44)

Looks like a trip to the library to get all the figures; unless someone has Zahn on their shelves.

16 comments:

  1. Matthew: 90225 letters

    Exported the Matthean text from Accordance (NA27 module) with strip accents function. Copied the RTF textfile, pasted into Word. Removed all punctuation with the search/replace facility. Used count words-function.

    Perhaps you can do something similar?

  2. No need to make a trip to the library. A quick Perl script will tell the story. Just a few minutes please...

    Casey Perkins

  3. 'There are more variations among our manuscripts than there are words in the New Testament' (Misquoting Jesus, p. 90)—but probably fewer than there are letters :-)

  4. Sorry to return to a favourite subject.

  5. Below is the output from my program for each book of NA/UBS. The counts include no spaces, brackets, accents, breathings, or punctuation.

    01 90051
    02 56533
    03 95957
    04 71346
    05 95808
    06 34428
    07 32754
    08 22273
    09 11079
    10 11995
    11 7992
    12 7878
    13 7413
    14 4042
    15 8848
    16 6519
    17 3717
    18 1556
    19 26374
    20 8820
    21 9056
    22 6073
    23 9458
    24 1129
    25 1106
    26 2571
    27 46028

    Casey Perkins
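
A minimal sketch of the kind of count described above, for readers who want to reproduce it. This is not Casey's Perl script, which is not shown in the thread. It assumes the Greek text is available as Unicode (rather than a transliterated or vertical-bar-coded file), and the function name `count_greek_letters` is made up for illustration. Accents, breathings, and iota subscript are dropped by decomposing each character and skipping combining marks.

```python
import unicodedata


def count_greek_letters(text: str) -> int:
    """Count Greek base letters only.

    Ignores accents, breathings, iota subscript, punctuation,
    brackets, digits, and whitespace. Assumes Unicode Greek input.
    """
    # NFD splits precomposed characters into a base letter
    # followed by its combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    count = 0
    for ch in decomposed:
        # Accents, breathings, and iota subscript are combining marks (Mn).
        if unicodedata.category(ch) == "Mn":
            continue
        # Keep only alphabetic characters whose Unicode name is Greek.
        if ch.isalpha() and unicodedata.name(ch, "").startswith("GREEK"):
            count += 1
    return count


if __name__ == "__main__":
    # Opening words of John 1:1, used here only as a small check.
    print(count_greek_letters("Ἐν ἀρχῇ ἦν ὁ λόγος,"))
```

Counting a short book such as 3 John by hand and comparing it with the output is an easy way to check that nothing extraneous is being counted.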

  6. Thanks Casey,

    That looks helpful.

    Some interesting differences from Rhalfs' figures, esp. for Luke, but that probably reflects some significant textual decisions in Luke.

    Tommy came up with a different number using, presumably, a similar technique. Any thoughts on this?

    These figures would have all nomina sacra fully spelled out and would presumably not count iota subscript (unlike, say, a manuscript, which might have fewer letters by using NS and a few more by using iotas).

  7. Hi Peter,
    "Tommy came up with a different number using, presumably, a similar technique. Any thoughts on this?"

    Tommy's figure for Matthew was only 174 characters different from mine (90225 against 90051) out of about 90000. I'm not sure how thorough he was in his search-and-replace operation, but I reviewed my output to make sure there were no extraneous characters like brackets or punctuation. (Not to say that I viewed all the data by eye, but I used Vim, my text editor, to search for non-word characters and found none.)

    I guess you could verify how precise my figures are, if you wanted to make the effort, by counting out the characters in 3 John, the shortest of the books, and seeing how close your count and my computer count are. I'm guessing you'll find the computer count very accurate. In any event, the figures are at least good enough for relative comparison.

    Casey

  8. "These figures would have all nomina sacra fully spelled out and would presumably not count iota subscript..."

    I forgot about iota subscript. (The text file I'm using represents iota subscript as a vertical bar character, so it was removed along with all the other non-letter characters.) I tweaked two lines of my program and ran it again, this time retaining iota subscript:

    01 91005
    02 57094
    03 97039
    04 72131
    05 96761
    06 34907
    07 33159
    08 22544
    09 11172
    10 12178
    11 8090
    12 8010
    13 7500
    14 4099
    15 8938
    16 6609
    17 3753
    18 1584
    19 26562
    20 8888
    21 9142
    22 6142
    23 9581
    24 1143
    25 1123
    26 2594
    27 46523
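
As a rough illustration of the adjustment, here is a sketch that counts iota subscripts separately so they can be added to a plain letter count. It assumes Unicode Greek input, where the subscript is the combining character U+0345, rather than the vertical-bar coding of the text file described above. The function name is hypothetical.

```python
import unicodedata

# COMBINING GREEK YPOGEGRAMMENI, the Unicode iota subscript.
YPOGEGRAMMENI = "\u0345"


def count_iota_subscripts(text: str) -> int:
    """Count iota subscripts in Unicode Greek text."""
    # NFD exposes the subscript as a separate combining character,
    # even when the input uses precomposed letters such as U+1FC7.
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.count(YPOGEGRAMMENI)


if __name__ == "__main__":
    print(count_iota_subscripts("τῷ λόγῳ"))
```

An iota written out on the line as an adscript is an ordinary letter and is not counted by this function. For scale, the two Matthew totals above differ by 954 (91005 - 90051).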

  9. Interesting. My figures are closer to Casey's (90057 letters in NA27 Matthew).

    More info on my blog, ricoblog

    I also did counts for Robinson's 2005 edition of the Byzantine and for Scrivener's 1881 edition. Text files with word and letter counts broken out by book are available in the aforementioned blog article.

    Hope it helps!

    Rick Brannan
    ricoblog

  10. PH: "Tommy came up with a different number using, presumably, a similar technique. Any thoughts on this?"

    Tommy forgot to remove the question marks which were 167 in number.

    90225 - 167 = 90058

    Now, there is still a one-letter difference between my corrected figure and Rick Brannan's. On the other hand, I am using an old Accordance database, which probably has some error or other. In fact, I remember noticing an error once. These very small differences (1-10 letters) are explicable if we use different text releases. At least you now know that there are around 90000 letters in Matthew (with nomina sacra written out).

  11. Thanks a lot everyone for this.
    Very helpful.

  12. Does it make sense to discount iota subscripts, but to leave an adscript like John 18:2 in NA27?

  13. Rick is quite explicit that he counted words within brackets, but I suspect that some of these counts may still need some tweaking in order to distinguish between single square brackets [in which the bracketed words are considered to be part of the NA27 text] and double square brackets [[in which the bracketed words are NOT considered to be part of the text]]. This may require a little human intervention rather than just programming a delete for all bracket characters.
    This could significantly impact counts for Mark and Luke.

  14. "...I suspect that some of these counts may still need some tweaking in order to distinguish between single square brackets [in which the bracketed words are considered to be part of the NA27 text] and double square brackets [[in which the bracketed words are NOT considered to be part of the text]]."

    That would certainly give us a better idea of the true difference between the NA and the Byzantine text. The current NA counts give us a misleading view of that.

  15. Hi Peter,
    I found the discrepancy between my figures and Rico's. It was a bug in my program. Our figures are now identical, with the exception of a 1 character difference in Acts, and a 2 character difference in both 2 Cor and Hebrews. It's no doubt attributable to a difference in our source files.

    I did a quick alteration to my program to account for double brackets (which were only relevant in Mark, Luke, and John). Below are the new figures. The first column of numbers gives the figures without iota subscript; the second column includes it. For Mark, Luke, and John, each column gives the total with and then without the words in double brackets.

    Matthew, 90057, 91011
    Mark, 56537/55365, 57098/55915
    Luke, 95966/95772, 97048/96852
    John, 71348/70526, 72133/71301
    Acts, 95811, 96764
    Romans, 34434, 34913
    1 Corinthians, 32760, 33165
    2 Corinthians, 22279, 22550
    Galatians, 11085, 11178
    Ephesians, 12001, 12184
    Philippians, 7998, 8096
    Colossians, 7884, 8016
    1 Thessalonians, 7419, 7506
    2 Thessalonians, 4048, 4105
    1 Timothy, 8854, 8944
    2 Timothy, 6525, 6615
    Titus, 3723, 3759
    Philemon, 1562, 1590
    Hebrews, 26383, 26571
    James, 8827, 8895
    1 Peter, 9062, 9148
    2 Peter, 6079, 6148
    1 John, 9459, 9582
    2 John, 1130, 1144
    3 John, 1107, 1124
    Jude, 2577, 2600
    Revelation, 46032, 46527

    Regards,

    Casey Perkins
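
A sketch of one way to handle the bracket distinction before counting. Casey's actual change is not shown, so this is only an assumption about the approach. It also assumes the text file marks these passages with literal `[[` and `]]` characters, as the printed NA27 does, and the function name is hypothetical.

```python
import re

# Non-greedy match of a [[ ... ]] span, allowed to cross line breaks.
DOUBLE_BRACKETED = re.compile(r"\[\[.*?\]\]", re.DOTALL)


def strip_double_bracketed(text: str) -> str:
    """Drop [[...]] passages entirely; keep words in single brackets."""
    # First remove the double-bracketed passages, words and all.
    without_double = DOUBLE_BRACKETED.sub("", text)
    # Then remove the remaining single bracket characters,
    # leaving the words they enclosed in place.
    return without_double.replace("[", "").replace("]", "")


if __name__ == "__main__":
    print(strip_double_bracketed("a [b] c [[d e]] f"))
```

Running the letter count on the text before and after this step would give the paired "with/without" totals. The pattern does not cope with a single-bracketed word that ends exactly where a double-bracketed passage closes, so such cases would still need checking by hand.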
