Interconnecting Textual Layers
Søren Kierkegaard Forskningscenteret ved Københavns Universitet
Keywords: Text encoding, genesis
FAX: +45 3532 3710
Phone: +45 3532 3778
In 1887 Henrik Pontoppidan1 wrote a short story called Jack Of Clubs, and in 1890 a feuilleton, Youth. Both narratives were slightly modified and incorporated into his 1891 novel Soil, which was followed by two further novels, The Promised Land and The Day Of Doom2. Two of these appeared in a second printing, in which the author took the opportunity to correct and amend his work, but in 1898 the complete trilogy was collected and issued in one volume as an elaborated edition. The work was now somewhat abridged, and a vast number of smaller and greater alterations were made, which also involved the restructuring of several parts. This now rather comprehensive novel went through yet another revision in 1903 and again in 1918. It has long been a wish among scholars to compile the ultimate text-critical edition of this important work, which would include the variants of all five editions. Until now the sheer weight of the matter has made it impossible to fulfill such a task on paper. Computer techniques have, however, made it possible to collect the text on a single CD-ROM and to make the relevant comparisons on a computer3.
From 1838 until his death in 1855 Søren Kierkegaard4 wrote and published 35 volumes of novels, treatises, and discourses on aesthetical, ethical, or theological issues5. After his death, several ready-to-print manuscripts as well as his journals and papers were published wholly or in part6. An important part of Kierkegaard's posthumous papers consisted of the preparatory studies for his works: drafts, notes, exercises and manuscripts. Some of these have been published, namely those which differ substantially and interestingly from the final product. Even those which bear a very close resemblance to the final version, however, carry important information on the working methods of the author. It is the ambition of the new edition Søren Kierkegaards Skrifter, which is scheduled to appear in the following years, to deliver the preparatory studies in electronic form and in their full length, thus creating a positive text-critical apparatus, which shows variant texts in their contexts, rather than a negative one, which focuses on the differences between variants7.
The Concept of Textual Layers
Generally, layered texts are texts that either consist to a large extent of the same words and phrases, or texts that purport to serve the same objective. A classic example is the synoptical reading of the three (or four) gospels of The Bible's New Testament. This is an uncomplicated example because of its relatively short length and the fact that the interrelations are well established. Another example, which fulfills only the last condition and lies outside the scope of this paper, is translations. We shall focus on some principal and practical issues of the two examples initially mentioned: the revised editions and the preparatory studies.
A revised edition by definition has the same objective as an earlier one and hence constitutes another textual layer. This applies only to the work as a whole, however. Chapters or large parts may have been rearranged, and their identity can only be established by inspection of words and phrases. This leaves the principal problem of identifying such rearranged parts and the practical one of conveying such parts to the reader. Instead of inspecting two closely related streams of text looking for differences, we find ourselves occupied with the task of finding sufficiently strong evidence of equality. The latter becomes an important task in the case of the preparatory studies. In some cases the author may have left papers entitled »Draft of work XX« that bear only an extremely vague resemblance to the eventual XX. In other cases, large passages of text recur as paragraphs in a work with no evidence whatsoever that they had been intended to appear there at the time of conception. This raises the question of what the genesis of a text is. Having come so far, we are left with a number of booklets and loose papers defining fractions of what have become different published texts. So the straightforward idea of layers as well-defined streams of interrelated texts does not necessarily hold true.
Not even the idea of a text as a one-dimensional stream of characters and words holds true. Draft material will usually take the form of a main text including deletions plus insertions, alternatives, and inserted as well as orphan notes. Such a text would usually be regarded and presented to the reader as a hypertext, possibly equipped with some text-critical apparatus revealing the amendments. For the sake of comparison with adjacent layers we may wish to flatten the structure, leaving out deletions and substituting alterations. We should also note that one single piece8 may contain several layers: e.g. large passages of text wiped out and rephrased in subsequent paragraphs.
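The flattening mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a draft has already been transcribed as a sequence of segments marked as plain text, deletion, insertion, or alteration; the Segment type, its field names, and the sample wording are hypothetical and are not taken from any edition's encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    kind: str                      # "text", "deletion", "insertion" or "alteration"
    text: str                      # wording as first written (empty for an insertion)
    revised: Optional[str] = None  # later wording, for insertions and alterations


def flatten(draft: List[Segment]) -> str:
    """Return the final reading of a draft as a single stream of words."""
    parts = []
    for seg in draft:
        if seg.kind == "text":
            parts.append(seg.text)
        elif seg.kind == "deletion":
            continue                          # deleted wording is left out
        elif seg.kind in ("insertion", "alteration"):
            parts.append(seg.revised or "")   # the later wording is substituted
        else:
            raise ValueError(f"unknown segment kind: {seg.kind}")
    return " ".join(" ".join(parts).split())  # normalise whitespace


# Invented sample wording, for illustration only.
draft = [
    Segment("text", "The editor"),
    Segment("deletion", "certainly"),
    Segment("alteration", "found", "discovered"),
    Segment("insertion", "", "by chance"),
    Segment("text", "the papers."),
]
print(flatten(draft))
```

The flattened stream loses the history of the piece, so it would serve only for comparison with adjacent layers, not as a replacement for the hypertext presentation.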
The Concept of a Shaft
Imagining the text, however inaccurately, as horizontal, we term the vertical, historical interconnection of texts a shaft, i.e. the gateway to the text below, the preceding text, and to the text above, the succeeding text. Technically speaking, the shaft is a data structure mapping identifiers onto sets of positions of relevant, comparable texts.
Consider as an example the preface of Kierkegaard's novel Either-Or9. In addition to the copy text, which is the first printing of 1843 (which we shall term A), there exist a second printing of 1849 (B), a fair copy (ms 25.2), and an exhaustive draft (ms 23)10. These four texts vary at a very detailed level, e.g. in orthography, punctuation, or corrections of words or phrases. In addition, there are two drafts headed Preface, which consist of fragments of the preface (mss 17 and 21), a draft differently headed, but consisting of fragments, all showing resemblance with fragments of the preface (ms 22), a scrap (ms 16.1), and notes on the board of a notebook (ms 5). Finally, a draft for the chapter called The Seducer's Diary holds a note in the margin: NB. The editor's foreword would mirror this with some ironical reflections on the comical in A's circumstances, [...] (ms 4.1). The first four texts (A, B, ms 23, ms 25.2) call for no detailed interconnection other than the one identifying them as variants of the preface. The scrap and the others, however, must be inserted at the proper positions of these four. Thus, we can use this example to identify two types of shafts: (1) Each of the four texts (A, B, ms 23, ms 25.2) is encoded at the beginning of the text with an identifying tag, say, ee preface begin. (2) Likewise the note of ms 4.1 is tagged ee preface NB. The difficult part of the latter exercise is now to properly identify and tag the mentioned ironical reflections in the preface, with the possible complication that they may not be there.
Most important, this task is a philological one. The level of accuracy must be determined with regard to the text in question, and responsibility must be taken for the principal question: which text fractions should be interpreted as corresponding? The shaft merely offers a tool, which may be boiled down to this: identify those positions in all layers with the same tag.
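As a minimal sketch of the data structure described above, the shaft can be modelled as a mapping from tags onto sets of positions. The class and method names below are hypothetical, and the character offset used as a locator is an assumption; the tags and text names are those of the Either-Or example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Position(NamedTuple):
    layer: str   # name of the text, e.g. "A" or "ms 23"
    offset: int  # hypothetical locator inside that text (here a character offset)


class Shaft:
    """Maps each identifying tag onto the set of positions that carry it."""

    def __init__(self) -> None:
        self._by_tag: Dict[str, Set[Position]] = defaultdict(set)

    def add(self, tag: str, layer: str, offset: int) -> None:
        self._by_tag[tag].add(Position(layer, offset))

    def positions(self, tag: str) -> List[Position]:
        """All positions, in all layers, that share the tag."""
        return sorted(self._by_tag.get(tag, ()))

    def elsewhere(self, tag: str, layer: str) -> List[Position]:
        """Positions with the same tag in every layer except the given one."""
        return [p for p in self.positions(tag) if p.layer != layer]


shaft = Shaft()
for name in ("A", "B", "ms 23", "ms 25.2"):
    shaft.add("ee preface begin", name, 0)   # offsets are placeholders
shaft.add("ee preface NB", "ms 4.1", 0)

# From the first printing, the shaft leads to the three other variants.
print([p.layer for p in shaft.elsewhere("ee preface begin", "A")])
```

The structure records only which positions correspond; deciding that they correspond remains the editor's task.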
From this encoding we could mechanically deduce a kwic concordance11 referring to the position of each tag. Now every text connects to the kwic concordance and the kwic concordance connects to every comparable text. The kwic concordance implements a shaft.
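The mechanical deduction of such a concordance might look like the following sketch. It assumes that tags are embedded in the running text in a made-up inline syntax, [[identifier]], which is not the encoding of any actual edition. The wording of ms 4.1 is quoted from the example above, while the line for A is a placeholder rather than a quotation.

```python
import re
from typing import Dict, List, Tuple

TAG = re.compile(r"\[\[([^\]]+)\]\]")   # hypothetical inline tag syntax: [[identifier]]


def kwic(layers: Dict[str, str], width: int = 30) -> Dict[str, List[Tuple[str, str]]]:
    """Map each tag to (layer, quotation) pairs.

    The quotation is the text around the tag, with "|" marking the spot
    where the tag stands in place of the key word.
    """
    concordance: Dict[str, List[Tuple[str, str]]] = {}
    for layer, text in layers.items():
        for m in TAG.finditer(text):
            # Strip other tags from the context before cutting it to width.
            left = TAG.sub("", text[:m.start()])[-width:]
            right = TAG.sub("", text[m.end():])[:width]
            concordance.setdefault(m.group(1), []).append((layer, f"{left}|{right}".strip()))
    return concordance


layers = {
    "A": "(placeholder for the wording of the preface) [[ee-preface-NB]](placeholder continues)",
    "ms 4.1": "[[ee-preface-NB]]NB. The editor's foreword would mirror this with some ironical reflections",
}
for layer, quote in kwic(layers)["ee-preface-NB"]:
    print(f"{layer:8} {quote}")
```

Every entry of the resulting table points back to one position in one text, so the table as a whole can play the role of the shaft.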
The Shaft Implemented as Hypertext
In a hypertext mark-up language like HTML, identifiers and references such as those mentioned above would take the form of NAME and HREF attributes respectively of an anchor tag12. In the previously mentioned example (2), the following anchor should appear at the note of ms 4.1 as well as at the ironical reflection of the final preface:
<A NAME="ee-preface-NB" HREF="shaft.html">
referring to a meta text shaft.html with the following content:
The shaft.html would furthermore contain clickable signs for the layered texts and possibly the proposed kwic quotation.
The previously mentioned critical Pontoppidan edition (ref. note 3) includes an electronic version comprising five editions of the text, which actually uses HTML-encoded shafts to interconnect them. Here the chapter is the unit of reference. This represents a simplification, since the editor does not undertake to pinpoint any phrase for its similarity to other editions. Usually it leads to the uncomplicated entry of each chapter, say, book I, chapter 1 of any edition (see figure).
In the event of a chapter being fragmented and split into different chapters of the preceding edition, however, the shaft is complicated somewhat as a consequence of the simplification. Book IV, chapter 6 of the 1st and 2nd editions does not exist as such in the 3rd and following editions. The content recurs, though, in book V, partly in chapter 1, partly in chapter 3. The 2nd edition would naturally refer to a shaft leading on to alternative 3rd edition chapters. The 3rd edition of V-1, however, (and similarly V-3) would refer to alternative 2nd edition chapters, showing an example of shafts which are not independent of the vertical direction: a different path is to be taken when searching upward than when searching downward. This is a consequence of choosing the chapter as the basic unit, which is again a compromise between the detail of the text and the complexity of the shaft. The shaft would have been simple and bidirectional if the editor had encoded text fractions of a finer granularity. The choice is a philological one.
Having pointed out the general similarity by chapter, this edition offers a simple word-by-word comparison computer program for studying the detailed variations. So the problem of identity is solved by the editor, whereas the differences are displayed by the computer.
We have defined the concept of (horizontal) textual layers and introduced a (vertical) interconnection mechanism, suggesting the term shaft for this. We have shown how an encoding with relevant tags can be done, how the shaft is consistent with the tags, and how a hypertext system could implement the shaft. Finally, it has been shown how the mechanism can be modified for the text in question, revealing that the art of shafting is after all a philological discipline.
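The direction dependence can be made concrete with a small sketch that derives one lookup table per direction from a list of chapter correspondences. Only the two correspondences named above are entered; the other 2nd edition chapters that V-1 and V-3 draw on are not named in the text and are therefore left out. The function and variable names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Chapter correspondences as (2nd edition, 3rd edition) pairs, from the text above.
pairs = [("IV-6", "V-1"), ("IV-6", "V-3")]


def build_shafts(
    pairs: Iterable[Tuple[str, str]]
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Return one table per direction: earlier-to-later and later-to-earlier."""
    to_later: Dict[str, Set[str]] = defaultdict(set)
    to_earlier: Dict[str, Set[str]] = defaultdict(set)
    for earlier, later in pairs:
        to_later[earlier].add(later)
        to_earlier[later].add(earlier)
    return dict(to_later), dict(to_earlier)


to_later, to_earlier = build_shafts(pairs)
print(sorted(to_later["IV-6"]))    # chapters offered when leaving the 2nd edition
print(sorted(to_earlier["V-1"]))   # chapters offered when leaving the 3rd edition
```

With the chapter as the unit, the two tables are not inverses of a one-to-one mapping, which is why the reader is offered alternatives that depend on the direction of travel.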
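A word-by-word comparison of two corresponding chapters could be sketched as follows. This is not the program shipped with the edition; it is an illustration that uses the difflib module of the Python standard library, and the sample sentences are invented.

```python
import difflib
from typing import List, Tuple


def word_diff(old: str, new: str) -> List[Tuple[str, str, str]]:
    """List the differences between two texts, compared word by word.

    Each entry is (operation, old words, new words), where the operation
    is "replace", "delete" or "insert".
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


# Invented sample sentences, for illustration only.
print(word_diff("the old priest spoke", "the young priest spoke"))
```

The comparison is meaningful only once the editor has established which chapters are to be compared.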
An earlier version of this paper appears in Kierkegaard Studies, Yearbook 1998, Walter de Gruyter, Berlin, New York, 1998.
1Henrik Pontoppidan, 1857-1943, Danish novelist and Nobel Prize winner 1917.
2The first two of these were translated into English by Mrs. Edgar Lucas under the titles Emanuel, or Children of the Soil and The Promised Land, 1896.
3Henrik Pontoppidan, Muld, Det forjættede Land, and Dommens Dag, in >DET FORJÆTTEDE LAND< ed. by Esther Kielberg and Lars Peter Rømhild, computerised by Karsten Kynde, Det Danske Sprog- og Litteraturselskab and Gyldendal Publishers, Copenhagen, 1997.
4Søren Kierkegaard, 1813-1855, Danish author, philosopher, and theologian.
5Collected in Søren Kierkegaards Samlede Værker, ed. by A. B. Drachmann, J. L. Heiberg and H. O. Lange, vols. 1-14, Gyldendal Publishers, Copenhagen, 1901, 1920, and 1962.
6Søren Kierkegaards Papirer, ed. by P. A. Heiberg, V. Kuhr and E. Torsting, vols. I-XI, Gyldendal Publishers, Copenhagen, 1909-1948. Second, supplemented edition, vols. I-XIII, by N. Thulstrup, 1968-1970, with an index by N. J. Cappelørn, vols. XIV-XVI, 1975-1978.
7Søren Kierkegaards Skrifter, ed. by Niels Jørgen Cappelørn, Joachim Garff, Johnny Kondrup, Alastair McKinnon, and Finn Hauberg Mortensen, G.E.C. Gad Publishers, Copenhagen, 1997-, hereafter referred to as SKS.
8By piece is meant a sheet, booklet, scrap or whatever physical entity which holds the message, as opposed to the logical abstraction of the textual layer.
10Ms numbers as defined in SKS K2-3, p. 10.
11Key Words In Context concordance, i.e. a short quotation of the text at the relevant spot. In this case the identifying tag takes the place of the key word.
12An HTML anchor is, according to the rules, tagged <A> and may furthermore include attributes such as NAME, identifying the position of the anchor within the text, and HREF, referring to other anchors. The reference consists of the name of a text and the name of the anchor within the text, delimited by a hash (#). T. Berners-Lee and D. Connolly, Hypertext Markup Language-2.0, IETF RFC 1866, 1995, ftp://ds.internic.net/rfc/rfc1866.txt.