Friday, February 17, 2006

Full collation Wikipedia style

An obvious desideratum is a full collation of all Greek manuscripts, freely available on the Web.

Can this be achieved and, if so, how?

It is possible for individuals and institutions to obtain images of the majority of known manuscripts (witness the collection of microfilms at Münster). There are usually no permission/copyright problems with producing an edition of the text contained within these manuscripts.

It should therefore be possible to obtain all the texts. The barriers to full collation are lack of personnel and finance. Here I have a suggestion.

It appears to me that we need a textual resource that is built up somewhat like Wikipedia. Perhaps using a wiki, or something similar, or even going through Wikimedia, contributors could gradually add material, for which they would receive acknowledgement, but which would be freely available. I guess this would need some senior figure(s) within textual criticism to oversee it, and some people who know something about technology. There would need to be editorial guidelines and the aim of achieving at least triple collation. Of course, without a burdensome system of checking contributions, it would not be possible to achieve the same level of accuracy as within the IGNTP and INTF collations. However, though accuracy is vital, a free tool like this can be constantly corrected, and often something is better than nothing. Moreover, if a certain collator's collations were found to be unacceptably untrustworthy they could be removed, but if they were just occasionally unreliable then they could be supplemented by further collations. Each collation would come with the name of the collator (which should provide some incentive for accuracy), and if you spotted an error in a collation you could add a note in your name marking the error.
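To make the idea of triple collation with named collators a little more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: each collation is taken to be a list of readings, one per variation unit, and the function name `reconcile`, the collator labels, and the sample readings are all hypothetical. It accepts a reading where a strict majority of collators agree and flags every place where they do not all agree, so that an editor knows where a further look is needed.

```python
from collections import Counter


def reconcile(collations):
    """Compare several named collations of the same manuscript.

    collations: dict mapping a collator's name to a list of readings,
    one reading per variation unit (an assumed, simplified data model).
    Returns (agreed, flagged): the majority reading per unit (None if
    there is no strict majority) and a list of (unit index, names)
    marking where collators disagree.
    """
    if not collations:
        raise ValueError("need at least one collation")
    lengths = {len(readings) for readings in collations.values()}
    if len(lengths) != 1:
        raise ValueError("collations must cover the same units")
    n_units = lengths.pop()

    agreed, flagged = [], []
    for i in range(n_units):
        readings = {name: r[i] for name, r in collations.items()}
        top, votes = Counter(readings.values()).most_common(1)[0]
        if votes * 2 > len(readings):
            # Strict majority: accept it, but record any dissenters by name.
            agreed.append(top)
            dissenters = sorted(n for n, r in readings.items() if r != top)
            if dissenters:
                flagged.append((i, dissenters))
        else:
            # No majority: leave the unit open and flag every collator.
            agreed.append(None)
            flagged.append((i, sorted(readings)))
    return agreed, flagged


# Hypothetical sample: three collators, one slip in the second unit.
sample = {
    "collator_a": ["en", "arche", "en", "ho", "logos"],
    "collator_b": ["en", "arche", "en", "ho", "logos"],
    "collator_c": ["en", "arxe", "en", "ho", "logos"],
}
agreed, flagged = reconcile(sample)
print(agreed)   # ['en', 'arche', 'en', 'ho', 'logos']
print(flagged)  # [(1, ['collator_c'])]
```

A real project would of course need a much richer format than a flat list of readings, but the principle of recording who dissents, and where, is the same.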

Clearly there would be a lot of work to do, but Wikipedia is surely a testimony to the fact that a useful tool can be created by creating a format into which contributions can be slotted. The editors would provide support by directing people to particular gaps (as here) and also by providing information on the whereabouts of manuscripts and addresses to contact. Otherwise contributors would be free to focus on what interested them.

I'm not going to begin this, so who is?


  1. I've thought about this before. It's at the top of my list if I ever happen to win the lottery. ;-)
    Unfortunately, I can't see a piecemeal attempt like Wikipedia working for this, though I'd be happy to be proven wrong.

  2. Wiki will never work for text criticism. It is better to have 4% of the information with 100% accuracy than to have 95% of the information with 95% accuracy.

  3. There is a project for getting texts of the Pseudepigrapha on the web that runs along similar lines to PJW's suggestion. If you click on the "Getting Involved" bar on the left hand side, you can see the basic workings of the thing.

  4. "It is better to have 4% of the information with 100% accuracy than to have 95% of the information with 95% accuracy."

    But no first edition is ever 100% accurate. Rarely is even a second or third. So to have a 1st edition at 95% accuracy with 25 times the material of a 27th edition is a great improvement in my view.

  5. Peter,

    I've thought before about the problem of accuracy, but now think that the timescale between spotting of error and correction can be sufficiently small that the problem is not great. If the data are being used then we'll quickly be noticing things that are suspicious. When data of a manuscript are based on a single collation then that is exactly an area a project editor would draw attention to as an area requiring further work.

    Obviously there comes a point when a faulty collation is useless. However, I'm not talking about collations that are that problematic or about accepting onto the database just anything.

    IGNTP reckon that triple collation can bring error down to one character in 5,000. I would set the aims of a project like this as seeking to achieve an error rate of under one character per 1,000. There are plenty of intelligent Greek scholars out there who will never hold academic posts that would allow them to get involved in the production of a printed edition, but who could collate to a reasonably high degree of accuracy. After all, accuracy is largely a question of care. Well educated amateurs (a term I am not using negatively) may well be able to dedicate time to this, provided the project has suitably prominent and dedicated intellectual leadership.

    There are still text-critical resources, like Sperber's edition of the Targums, that fall far short of what is desirable. This does not render them completely useless.

    Early editions of Swanson or of Comfort and Barrett showed the need for correction, but subsequent editions have allowed corrections to be made. These works had the disadvantage of being print editions. An electronic system allows corrections to be posted as soon as they are spotted.

    We could even allow a voting system whereby users vote on the reliability of a particular named individual collator. This would be rather like the ratings of book suppliers on Amazon. That way we'd quickly find out whom to ignore. If ratings got too low the collator's work would not appear on the database again.

  6. I will be happy to contribute to such a project with collations of the Epistle of Jude in all available Greek continuous text MSS + dozens of lectionaries (559 MSS), as soon as my printed edition with critical apparatus, commentary and various other stuff leaves the press, hopefully sometime next year.

  7. Thanks, Tommy. One book down, only 26 to go now. :-)

  8. A quick question to all of you:

    Are you happy with "just" collations or would you prefer to have transcriptions, which would include page layout and, perhaps, other features as well?

  9. That would be a question for editors to decide, though ultimately it is better to have more information. I think that you'd make a great general editor for this project. :-) Let's chat in Muenster.

  10. P.J. Williams:
    "Let's chat in Muenster."

    Sure, Pete, WE chat in Muenster. But what do others think about the level of information they would want to have (basics - advanced - ideal world)?

  11. I personally think that for the most important MSS, full transcriptions with features such as those planned for the future edition of Codex Sinaiticus are a very desirable goal. But such transcriptions for all MSS would be completely unrealistic (and unnecessary). There will have to be two or three different levels of transcription/collation, with many minuscules transcribed in a basic mode... and under all circumstances lots of photos or links to photos.

    Concerning my own work, I have collated all line (and page) breaks on my paper sheets, but I did not enter them in the digital collate files (though that could be done later). However, I did add comments at those places where I found that the presence of a line break was relevant (e.g. in cases of dittography). I added similar comments on punctuation, accentuation, and features in the margin (like "Moses' apocryphon" and "Enoch's apocryphon" in Jude).

  12. "Collated all line (and page-)breaks"

    This could finally bring about the correction of an error of long standing.
    I think that every eclectic text editor since Tischendorf has omitted the first person plural pronoun in Revelation 5:9, without ever having to note in the CA that out of c. 300 mss of the Apocalypse, this reading is found only in 02 A, and at a column break to boot. No translation has been able to transmit this verse as it stands in A; they all read as if the preferred reading were a third person plural pronoun (a reading that, were it found in even a single manuscript, would undoubtedly become the preferred one).

  13. Obviously the editors would need to maintain a standard form to collations/transcriptions. It would also be necessary to ensure that these could be encoded in software available freely for Mac and PC.

    Although, naturally, images are desirable, we should not let the best slay the good. Many manuscript-holding institutions will not let images of entire mss be available on the web.

    The general editors would also need to allow data to come in different formats. For instance, if Hoskier's collations of the Apocalypse are now out of copyright (or will be in 3 more years with the 75-year rule) then a lot would be gained through having a format that could accept the collations in the format presented. After all, Hoskier is said to have been accurate.
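
The figures traded in the comments above (one character in 5,000 for triple collation, a target of under one in 1,000 for this project, and "4% at 100%" against "95% at 95%") are easy to put side by side. The following Python sketch does only that arithmetic. The function names are hypothetical, the text length of 10,000 characters is an arbitrary assumption chosen for round numbers, and "correct share" is a deliberately crude measure of my own (coverage multiplied by accuracy), not anything the commenters defined.

```python
def expected_errors(characters, one_error_per):
    """Expected number of wrong characters at a rate of 1 per `one_error_per`."""
    return characters / one_error_per


def correct_share(coverage_pct, accuracy_pct):
    """Percentage of all the information that is both present and correct."""
    return coverage_pct * accuracy_pct / 100


text_length = 10_000  # hypothetical length in characters, for illustration only

print(expected_errors(text_length, 5_000))  # 2.0  (the IGNTP triple-collation figure)
print(expected_errors(text_length, 1_000))  # 10.0 (the looser target proposed here)

print(correct_share(4, 100))   # 4.0   ("4% of the information with 100% accuracy")
print(correct_share(95, 95))   # 90.25 ("95% of the information with 95% accuracy")
```

On this crude measure the larger, less accurate resource contains far more correct information, though it says nothing about the cost of not knowing which 5% is wrong, which is the real point at issue between the two positions.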