Predication Transformation: A parallel corpus-based study of Russian verbless sentences and their English translations

The paper presents a corpus-based contrastive analysis of the predication involved in the translation of verbless sentences from Russian to English, based on a pilot parallel corpus consisting of Dostoyevsky’s dialogue-based Russian novel Brat’ja Karamazovy (1880) and the Pevear and Volokhonsky English translation The Brothers Karamazov (1990). In contrast to English, known for its dependency on the finite verb phrase, Russian permits the use of verbless sentences more productively than any other Indo-European language (McShane, 2000; Kopotev, 2007). Combining the parallel-text approach to contrastive linguistics developed by Guillemin-Flescher (2003) with a new method of automatic verbless sentence extraction, the present study examines recurring patterns in the way that predication is gained or lost in translation. Following automatic segmentation, morphosyntactic annotation and extraction, verbless sentences and their translation correspondences are manually annotated for verbal and non-verbal predication in accordance with Hengeveld’s (1992) definitions. The results present a typology of a phenomenon we call ‘predication transformation’, in which translation correspondences are transformed in terms of predication type. Quantitative results reveal the rate at which verbs are gained in the translation of verbless sentences from Russian to English, as well as the predication verbalization rate. We argue that a verb-centric notion of semantic predication is not cross-linguistically stable.


INTRODUCTION
Based on a pilot parallel corpus, this paper examines the way that predication is rendered in the translation of verbless sentences from Russian to English. The definitions of predication and predicate, and the verb's role in these notions, have been highly debated in linguistics. Special issues of Faits de Langues: La Prédication (2009) and Revue de Linguistique et de Didactique des Langues: Syntaxe et Sémantique des Prédicats (2008) trace the notions throughout history. In this paper, we apply Kees Hengeveld's (1992) definitions to a series of verbless sentences and their translations which have been automatically extracted from a small literary parallel corpus. Following Jacqueline Guillemin-Flescher's (2003) approach to contrastive linguistics, we explore recurring patterns in the way that predication is gained or lost in translation.
The comparison of Russian verbless sentences with English translations is particularly relevant due to the profound cross-linguistic differences between the two languages. Sentences in which the grammatical category of verb is absent exist in many languages, but they are known for being particularly frequent in Russian. Among the Indo-European languages, Russian is reported to allow the most liberal use of verbless sentences (Kopotev, 2007b). Characteristic features of Russian include a highly developed morphological case system, flexibility of word order and intonation, absence of articles, an extraordinary capacity for verbal ellipsis, the possibility of a zero-copula construction in the present tense, such as (1) "Я Алексей" (ja Alexej, lit. 'I Alexei'), and many other non-elliptical verbless constructions, such as (2) "Я в монастырь" (ja v monastyr', lit. 'I to monastery'), uttered in a context without a linguistically explicit verbal antecedent (Stassen, 2013). In contrast, English is known for its dependence on the finite verb phrase, the lack of a zero-copula construction and formal register restrictions on certain types of verbal ellipsis (McShane, 2000). Nonetheless, verbless sentences are also found in English across all sentence types, including typical exclamatives, e.g. (3) "What a picture!", imperatives, e.g. (4) "And now to business.", questions, e.g. (5) "What about my parental blessing?", and assertions, e.g. (6) "So much, then, for the introduction".
The paper starts with a description of the corpus and the methodology, which involves the automatic extraction of verbless sentences. In Part 4, we discuss the definitions used in automatic and manual annotation. This section includes a sketch of Hengeveld's (1992) semantic notion of verbal and non-verbal predication, which we develop to include a distinction for antecedent-based ellipsis. The results in the final part present descriptive statistics with particular attention to the sentences that gain a verb in translation, characterize the phenomenon of predication transformation according to five different types, and analyze the implications of the observed translation patterns for the semantic notion of predication. We conclude by summarizing the limits of the current pilot study and suggesting perspectives for further research.

PARALLEL CORPUS
The corpus consists of Fyodor Dostoyevsky's Russian dialogue-centered Братья Карамазовы (Brat'ja Karamazovy, 1880) and the Richard Pevear and Larissa Volokhonsky English translation The Brothers Karamazov (1990). This Russian novel has inspired many studies in literature and philosophy, but Dostoyevsky's language has also been praised as particularly suitable for the study of spoken dialogue. George Thomas (1982: 672) stresses that the important role Dostoyevsky gives to dialogue makes this novel of particular interest for linguists, especially those studying speech acts. Targeting the key features of reliable parallel corpora described by Thomas Stolz (2007), we selected this work due to the frequent passages of direct speech, the everyday language register of the original, its realistic prose, and the existence of sixteen translations of the novel. The translation was chosen for its recency, its basis in the original text, and its literal style (Vasil'čenko, 2007), which has earned it critical acclaim for being most true to the original. Although for reasons of feasibility we examined only one translation in the pilot study, the large number of competing translations makes it possible to compare the patterns across multiple translators in future work.
The scope of the corpus is limited to the first fourteen chapters of the novel and contains a total of 76,500 words. The manageable size allowed us to develop a new approach to extracting verbless sentences automatically, verify the accuracy of the extraction, and manually annotate the verbless sentences and their translation correspondences in accordance with the definitions of predication and the categories described in Part 4.

VERBLESS SENTENCE EXTRACTION
While much progress has been made in natural language processing with regard to the search for a particular word or element in a corpus, automatically finding the absence of an element in a sentence remains a challenge. Studies of existing parsed corpora show that verbless sentence extraction is very often blocked by syntactic modeling that is based on verb-centric definitions of a clause and by the typically fixed morphosyntactic annotation (Landolfi et al., 2010).
We address the challenges of automatic processing of verbless sentences by developing an alternative method of automatic extraction. By customizing automatic sentence segmentation, semi-automatically correcting morphosyntactic annotation by means of the Trameur annotation, alignment and statistical text analysis software (Fleury and Zimina, 2014), and using the latter to classify the sentences into those with a verb and those without, we achieved a recall of 94% for the automatic extraction of verbless sentences as compared with manually extracted results.
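The classification step can be illustrated with a minimal sketch. It does not reproduce the Trameur workflow: the input format (sentences as lists of token and tag pairs), the Universal Dependencies-style tag names and the helper names `is_verbless`, `split_by_verb` and `recall` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Assumption: coarse Universal Dependencies-style tags. The tagset actually
# used in the study is not specified here, so these names are illustrative.
VERBAL_TAGS = {"VERB", "AUX"}

TaggedSentence = List[Tuple[str, str]]  # (token, part-of-speech tag)


def is_verbless(sentence: TaggedSentence) -> bool:
    """True if no token in the sentence carries a verbal tag."""
    return not any(tag in VERBAL_TAGS for _, tag in sentence)


def split_by_verb(sentences: List[TaggedSentence]):
    """Partition sentences into (with_verb, verbless)."""
    with_verb, verbless = [], []
    for sentence in sentences:
        (verbless if is_verbless(sentence) else with_verb).append(sentence)
    return with_verb, verbless


def recall(auto_ids: Set[int], manual_ids: Set[int]) -> float:
    """Share of manually identified verbless sentences found automatically."""
    return len(auto_ids & manual_ids) / len(manual_ids) if manual_ids else 0.0


# Toy input: example (2) from the introduction and an invented verbal sentence.
sample = [
    [("Я", "PRON"), ("в", "ADP"), ("монастырь", "NOUN"), (".", "PUNCT")],
    [("Я", "PRON"), ("иду", "VERB"), (".", "PUNCT")],
]
with_verb, verbless = split_by_verb(sample)
print(len(with_verb), len(verbless))
```

Because the test is the absence of a tag, its reliability depends entirely on the quality of the morphosyntactic annotation, which is why the annotation was corrected before classification.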

ANNOTATION DEFINITIONS
Automatically extracted verbless sentences include any structure that ends with a major punctuation mark and does not contain a verb, or verb form (participles, infinitives), in any of its parts. Direct speech sentences were separated from embedded narration in automatic segmentation. Following extraction, verbless sentences were aligned by paragraph in order to visualize the context and were then separated into utterances. Each verbless utterance and its translation were manually annotated for the presence of a verb, the presence of an antecedent-based verbal ellipsis following Marjorie McShane (2000) and Mikhail Kopotev (2007a), and whether the predication involved was verbal or non-verbal in accordance with Hengeveld (1992).
A key element of Hengeveld's (1992) definitions is that the notion of non-verbal predication is wider than the notion of a verbless sentence. He stresses that the notion of non-verbal predication is a semantic notion that may be morphosyntactically expressed by both verbless and verbal sentences. Non-verbal predication is defined as taking place in all constructions where a non-verbal predicate is applied to arguments (Hengeveld, 1992: 26). The non-verbal predicate "should be considered the main predicate of a non-verbal predication, even in those cases in which it is accompanied by a copula" (Hengeveld, 1992: 26). Therefore, on Hengeveld's conception, sentences containing a verbal, but semantically empty, copula are treated as instances of non-verbal predication. These distinctions are illustrated in Figure 1, which is based on Hengeveld's (1992: 27) diagram and includes a modification for ellipsis and examples from the present corpus. In the same vein as Emile Benveniste (1966: 163) and Rodney Huddleston and Geoffrey Pullum (2002: 218), Hengeveld (1992) treats the copula verb "be" as semantically empty. However, he argues that the semantically empty copula cannot constitute the main predicate of the sentence. Therefore, in Hengeveld's (1992) terms, the main predicate in example (7) would be "so wonderful", not "is" or "is so wonderful", and the English verbal sentence would be treated as a case of non-verbal predication. Since the ellipted antecedent is the semantically meaningful finite verb "значит" (značit, lit. 'means'), the predication involved in the Russian verbless sentence is annotated verbal. In English, the antecedent found in the previous clause is the finite form of the copula verb "be" and the predication is annotated non-verbal. Therefore, a transformation from verbal Russian predication to non-verbal English predication has occurred. The results in the following part analyze the predication transformation phenomenon from a quantitative and qualitative perspective.

Descriptive statistics
Automatic extraction revealed 315 Russian verbless sentences out of a total of 2,325 sentences, thus establishing a verbless sentence rate of 13.5% for the Russian corpus. Since a sentence may consist of several utterances and contain several verbs, the manual annotation was performed on utterances. A total of 419 Russian verbless utterances and their translations were examined. The translation results show an utterance verbalization rate of 49%; that is, 207 of the 419 Russian verbless utterances gained a verb in English translation.
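The two rates follow directly from the reported counts. The short check below only restates that arithmetic; the helper name `percentage` is introduced here for illustration.

```python
def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Counts as reported for the pilot corpus.
verbless_sentence_rate = percentage(315, 2325)         # verbless sentences / all sentences
utterance_verbalization_rate = percentage(207, 419)    # utterances gaining a verb / verbless utterances

print(f"{verbless_sentence_rate:.1f}%")        # 13.5%
print(f"{utterance_verbalization_rate:.0f}%")  # 49%
```

Note that the two rates have different denominators: the first is computed over sentences, the second over utterances.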
The loss or gain of a finite verb in the translation of an utterance often resulted in a change of the predication type of the utterance. While the utterance verbalization rate concerns the change in the realization of the grammatical category of the verb, the predication verbalization rate addresses the change in the predication type involved in the verbless utterances as compared to their translations. The definitions summarized in Figure 1 above were used to determine the type of predication, i.e. (a) verbal predication, involving a non-copula verb (or a non-copula verbal antecedent), or (b) non-verbal predication, which either involves a copula verb (or copula verb antecedent) or involves no verb (or verbal antecedent) at all.
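The decision rule in (a) and (b) can be written out as a small function. This is a sketch of the rule as stated, not of the manual annotation procedure: the lemma-based input, the `COPULAS` set and the function names are assumptions made for illustration.

```python
from typing import Optional

# Assumption: copulas are identified by lemma; this small set is illustrative.
COPULAS = {"be", "быть"}


def predication_type(verb: Optional[str], antecedent: Optional[str] = None) -> str:
    """Classify an utterance from its overt verb lemma or, failing that,
    from the lemma of its ellipted verbal antecedent."""
    governing = verb if verb is not None else antecedent
    if governing is None or governing in COPULAS:
        return "non-verbal"   # (b): copula, copula antecedent, or no verb at all
    return "verbal"           # (a): non-copula verb or non-copula antecedent


def is_transformation(source_type: str, target_type: str) -> bool:
    """A predication transformation is a change of type between source and translation."""
    return source_type != target_type


# Verbless Russian utterance with the ellipted antecedent "значит" versus an
# English utterance whose antecedent is the copula "be".
russian = predication_type(None, antecedent="значит")
english = predication_type(None, antecedent="be")
print(russian, english, is_transformation(russian, english))
```

Under this rule an English sentence with an overt copula and a Russian zero-copula sentence receive the same label, so the gain of a verb alone does not count as a transformation.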
Transformation from Russian verbal predication to English non-verbal predication, such as (8) above, was a rare phenomenon. Only six cases were identified, all of which involved an ellipsis of the copula "be" in English. Predication transformation occurred predominantly in the opposite direction, that is, from Russian non-verbal to English verbal. Predication verbalization, the result of the gain of a non-copula verb in the translation of a non-elliptical verbless utterance, was observed in 19% of the English translations of Russian verbless utterances. The verbs involved in predication transformation include: bow, can, care, come, damn, serve, do, drive, fast, follow, get, go, happen, have, hold, know, leave, like, look like, make, mean, need, say, think, treat, want.

Typology
A closer examination of the cases where non-verbal predication became verbal in translation from Russian to English reveals five patterns in which the transformation occurred.

Idiomatic expressions
The first involves the introduction of verbs as part of the translation of idiomatic expressions. These are fixed expressions that are semantically noncompositional, that is, their lexical elements do not make the meaning of the sentence transparent (Kopotev, 2015: 226). For instance, the gain of the verb, and the consequent predication transformation, in the exclamative in (9) is explained by the fact that this is a fixed Russian expression for which the translator resorts to a fixed English expression in order to scold the interlocutor for his actions. Although the introduction of the verb "like" in translation transforms the predication type, its lexical status is uncertain due to the idiomatization of the English expression of which it is a part. Moreover, the fact that the English utterance is a rhetorical question, used to indirectly assert disapproval of the previous turn, further minimizes the verb's semantic contribution to this instance of predication.

Untranslatable part of speech
In other cases, verbs were used in the translation of words that do not seem to have a non-verbal semantic equivalent in the target language. In (10), non-verbal predication becomes verbal in translation due to the fact that a single noun for "people who fast" does not seem to exist in the English colloquial register.

Emphasis
The third predication transformation pattern concerns emphasis, as illustrated in (11). The communicative function of the verbal predication in the utterance "I mean" is to emphasize "Mitenka", the antecedent of the pronoun "he". The corresponding focus is created in the Russian utterance by means of the particle "-то" (-to, lit. 'that') added onto the proper noun. The introduction of the verb, resulting in predication transformation, is a means of creating emphasis and keeping the information structure intact.

The topic-subject
In other instances, predication transformation is associated with evoking the subject for topic activation. This pattern is illustrated in (12), where in the English translation both a subject and a verbal predicate are gained as compared with the Russian source.
(12) {The woman tells a story about herself. The elder asks her a question.}
Издалека?
ADV from-far-away
"Have you come from far away?"
The introduction of the English subject pronoun "you" makes linguistically explicit the topic of the sentence, which, following Knud Lambrecht (1994), is the woman. The activation of the topic by means of evoking the subject results in the introduction of the predication-transforming verb in the English utterance.

Contextually implied
Finally, the verbalization of predication in translation of Russian verbless utterances also resulted from the explicit activation of verbs referring to extralinguistically accessible activities. For example, in (13) only a verb of speaking or meaning would be appropriate for the context. ["They'll proclaim it, they'll remember: 'He foresaw the crime and marked the criminal.' It's always like that with holy fools: they cross themselves before a tavern and cast stones at the temple. Your elder is just the same: he drives the just man out with a stick and bows at the murderer's feet."] By introducing the verb "say", the English translation linguistically activates the relationship between "what" and "you" which is already contextually accessible. The identical Russian structure may be felicitously used in another context, for example, physical movement (e.g. "jump") or emotion (e.g. "laugh"). The semantic value of the predication-transforming verb in the English sentence seems to overlap with activities that are contextually salient for the interlocutors.

CONCLUSION
The corpus-based analysis of the predication involved in the translation of verbless utterances from Russian to English in accordance with Hengeveld's (1992) definitions has revealed a phenomenon of predication transformation. The transformation of predication type occurs when a non-copular verb is gained or lost in the translation of a verbless utterance. The present pilot corpus showed that Russian verbless utterances gained a verb in 49% of cases, and that a transformation from Russian non-verbal predication to English verbal predication occurred in 19% of utterances. A typology of the transformation phenomenon suggests that predication-transforming verbs are introduced in translation as part of an idiomatic expression, to preserve the semantic import of another part of speech, to create emphasis, as part of topic-subject activation, or to activate a salient element of the extra-linguistic context. As a result, it appears that the semantic contribution of the verb in establishing a division within the semantic notion of predication is questionable. The results of the present study imply that a verbal versus non-verbal dichotomy is inadequate for a cross-linguistically stable definition of the semantic notion of predication.
The parallel corpus of the present study was limited in order to permit the development of an accurate automatic method of verbless sentence extraction and the manual annotation of predication type. A larger and more diverse corpus that is representative of the English and Russian languages would make it possible to extend the conclusions beyond the present pilot corpus. Furthermore, the results of the present study need to be verified bi-directionally, that is, in a corpus of Russian translations from an English source. It remains to be seen to what extent such a corpus would confirm or contradict the present conclusions concerning the verbs that are gained or lost in translation. We expect that an English-to-Russian parallel corpus would reveal de-verbalization rates similar to the present rates of utterance verbalization, and a suppression of non-copular verbs in Russian translations of English verbal utterances in accordance with the presented typology of predication transformation.

Figure 1. Predication and sentence distinctions based on Hengeveld (1992), extended to include antecedent-based ellipsis and illustrated with examples from the present corpus. Source: author's own elaboration.