
Linguistics in Amsterdam 4-2 (September 2011)

Transparency in Sri Lanka Malay

Sebastian Nordhoff

Max Planck Institute for Evolutionary Anthropology.


Sri Lanka Malay is a very transparent language. The domains which can be considered non-transparent are limited to apposition, repetition of elements, two portmanteaus, and right extraposition of heavy elements. Sri Lanka Malay does not show any of the more salient cases of opacity, such as gender, agreement, conjugation, declension, grammatical relations, or expletives.

1 Introduction

Sri Lanka Malay (SLM) is the language spoken by the descendants of soldiers, exiles, convicts and slaves brought by the Dutch and British colonial powers from Indonesia and Malaysia to Sri Lanka. The first immigrants arrived in the middle of the seventeenth century. Sri Lanka Malay is by no means a dialect of Standard Malay, or Indonesian, but a ‘language in its own right’ (Adelaar 1991). The grammatical differences between Standard Malay and Sri Lanka Malay are far greater than the differences between Dutch and Afrikaans, for instance, and are more like the differences between Dutch and Hindi as far as grammar is concerned. This is due to the convergence of SLM towards Sinhala and Tamil, the dominant languages of Sri Lanka. This convergence took place in record time: the first immigrants arrived in the 17th century and spoke Malay dialects with SVO word order, prepositions and little morphology. Today, SLM has SOV word order, postpositions, and comparatively copious morphology as far as Malayic languages go. The lexicon, however, was almost unaffected by language contact: 90% of the vocabulary is of Malay origin (Paauw 2004). There are currently around 60,000 ethnic Malays in Sri Lanka, but the number of speakers is much smaller due to the economic need to learn English and Sinhala.

Sri Lanka Malay is a very transparent language in the sense of Hengeveld (this volume). There are few to no morphosyntactic accretions, no noteworthy irregularities (allomorphs etc.), and a close connection between semantics and morphosyntax. This transparency is common in varieties of Malay. It is a retention of a historic feature and not due to language contact. The transparency of SLM has probably not increased since the arrival of the language on the island; if it has changed, it has probably become less transparent. This should not obscure the fact that the language is the most transparent one presented in this volume; it just so happens that its ancestors were at least as transparent.

In the following, I will chart the relation between the different levels of Functional Discourse Grammar (Hengeveld & Mackenzie 2008) in SLM as outlined in the introduction to this volume.

2 Interpersonal-Representational

No cross-reference

Hengeveld & Mackenzie (2008:350) speak of cross-reference when ‘person marking on the verb is sufficient by itself and may optionally be expanded by a lexically realized argument.’ There is no person marking on the verb in Sri Lanka Malay, so that this criterion does not apply. The following example shows that the form of the verb is invariant no matter the reference of the argument (represented by X). Whatever the person, number or gender of X, the verb will remain the same, su-dhaathang in (1).

No apposition

In the mapping of the interpersonal level to the representational level, apposition represents a one-to-many relation. This is not expected under transparency. Sri Lanka Malay is not transparent in this regard since it allows apposition. The following example shows an instance of apposition. The referents Mr Sebastian and see ‘I’ are introduced and subsequently referred to by the personal pronoun kitham ‘we’. Kitham alone is already sufficient on the representational level. However, duuva ‘two’ again refers to the same two referents in the world as does kitham; we are dealing with apposition here.

No limitations on which semantic units can be chosen as predicates

In transparent languages, we do not expect restrictions on which semantic units can be chosen as predicates.

In Sri Lanka Malay, verbs, adjectives, modals and locatives can be used as predicates without further measures being taken (3-7). Only verbs can take TAM-prefixes though (3). TAM for other predicates must be expressed lexically.[1]

While further measures are not necessary, SLM has an optional copula, which can be used for nominal predicates (Nordhoff 2011).

3 Representational-Morphosyntactic

No grammatical relations (but semantic or pragmatic alignment)

In the mapping between the representational level and the morphosyntactic level, some languages use an intermediate level of grammatical relations, which can change the direct mapping of semantic function on morphosyntactic expression. In Sri Lanka Malay, there are no grammatical relations; instead, the semantic function (agent, patient, recipient, etc.) is directly mapped onto morphosyntax, where it is expressed by cliticized postpositions. The following example shows the use of the postpositions =yang ‘acc’, =dering ‘abl’ and =nang ‘goal’. The agent incayang ‘he’ is not marked by a postposition.

Crucially, there is no way of changing the case postposition an argument takes. While the English Passive Alternation as in she beat him/he was beaten by her changes the marking of the patient from accusative to nominative and the marking of the agent from nominative to oblique by-agent, in SLM such operations do not exist. There is no possibility to promote or demote an argument morphosyntactically.

Nordhoff (2009) has applied an array of tests for subjecthood to Sri Lanka Malay; none of the tests yielded evidence for the category “subject” (or “object”). For reasons of space, the tests will not be repeated here. However, the three traditional coding properties of subjects can briefly be discussed. These are agreement, word order and case marking. As discussed in Section 2.1, there is no agreement in SLM. Word order in SLM is generally verb-final, but the arguments of the verb can occur in any order to the left of the verb. This can already be seen from example (9), where we find the order pat ag src goal v. If word order were an important parameter, we would either expect the subject in initial position (SOV) or in direct vicinity of the verb (OSV). In (9), neither is the case. The best candidate for subjecthood, the pronoun incayang, is neither in initial position nor adjacent to the verb. (9) thus already casts doubt on word order as a criterion for grammatical relations. Nordhoff (2009) discusses variations in word order in more detail and concludes that word order in the preverbal field is purely conditioned by pragmatics and cannot be used to establish grammatical relations. The last criterion for subjecthood is case marking. If we find that the only argument of an intransitive predicate (S) is always case-marked in the same way as either the actor (A) or the undergoer (P) of a transitive sentence, we have good arguments for subjecthood. This test also fails in SLM. The only argument of an intransitive predicate in SLM can be marked with either zero, =yang ‘acc’, =nang/=dang ‘dat’, or =dering ‘abl’.

For transitive sentences, the most typical case combinations include

There is no space here to illustrate all these patterns, but the following three examples should suffice to illustrate the diversity.

Given this diversity of case marking, it is obvious that there is no clear mapping between the cases found in intransitive sentences and in transitive sentences. This means that the third coding property for subjects, case marking, fails as well.

No discontinuity

Tearing apart in morphosyntax something which belongs together in semantics is a non-transparent operation, leading to discontinuous constituents. Such constituents are not found in Sri Lanka Malay.

Function marking and derivational processes not sensitive to nature of input

The marking of a semantic function is the more transparent the fewer parameters it depends on. The most transparent relation is found in cases where the only parameter is the semantic function itself, and other parameters from the realm of morphosyntax or phonology do not play a role. That is, it should make no difference to the expression of a function whether the referent is encoded as a noun, a pronoun, an adjective, a clause, or anything else. This is what we find in Sri Lanka Malay: function marking is indeed indifferent to morphosyntactic properties. Semantic roles are marked by enclitic postpositions, e.g. the dative marker =nang, which can mark recipient, experiencer, and manner, among other roles. These postpositions can attach to any type of argument. The following examples show =nang attached to a noun, a pronoun, a deictic, an adjective and a clause. There is thus no morphosyntactic restriction on the combinatorial properties of =nang with this set. The same is true for the other postpositions, like =yang ‘acc’, but these are more difficult to combine with clauses for semantic reasons. The accusative marker =yang normally occurs with affected participants, but clausal participants normally refer to states-of-affairs, which are difficult to affect due to their intangible nature.

Maximally transparent derivational markers should also not show selectivity with regard to their host. This is true for the (derivational) plural marker pada. The following examples show the use of the plural marker on a noun, a pronoun, a deictic, and a relative clause.

While the plural marker is not selective about the nature of its host, the same is not true of the nominalizer -an and the causativizer -king. The nominalizer -an can attach to verbs, adjectives and nouns, but not to pronouns, deictics or clauses. This is seen in (26), where it attaches to a verb, and in (27), where it attaches to an adjective.

In rare cases, -an can be found on nouns, like raja-han ‘king’+‘nmlzr’=‘government’. This is another use of -an, also found in Standard Malay, and indicates ‘collectivity’ or ‘similarity’ when attached to nouns, according to Adelaar (1985:193). This meaning seems to be at hand here as well, where a government can be seen as a collection of kings, or as performing a function similar to that of a king. This second use of -an seems to be no longer productive and is only found in a couple of forms.

The selectivity of -an is quite clear in SLM; nevertheless, there is one instance of a nominalization after inflection, i.e. nominalization of a phrase rather than of a stem. This is thradahan ‘deprivation’, which is composed of the negative prefix thàrà-, the existential aada, and the nominalizer. The non-negated form adahan ‘possession’ also exists. One could argue that the negation takes place after derivation; however, th(à)rà- is not a morpheme which can attach to nouns, so that th(à)rà- must have been joined with a(a)da before the derivation. This makes the selectional restrictions of -an less narrow, but it is still true that in the great majority of cases, -an cannot be used to nominalize phrases; it can only be used to nominalize stems.

The causativizer -king is another derivational morpheme. It can attach to verbs (28) and adjectives (29), and marginally to nouns (30), but not to pronouns, deictics or clauses either.

It can be noted that -an is a suffix on prosodic grounds while pada is a clitic. This is mirrored by their morphosyntactic behaviour: pada is indifferent to the nature of its host, while -an is not. The causativizer -king, on the other hand, has an intermediate position with regard to prosody: there are reasons to treat it as a suffix, but also reasons to treat it as a clitic. This is not mirrored by its morphosyntactic properties, which pattern like those of a suffix.

4 Morphosyntactic

No expletive elements

Transparent languages should not have elements in morphosyntax that correspond to nothing on the representational level. SLM meets this expectation: it has no dummy subjects. The non-existence of items is always difficult to demonstrate. Nonetheless, here I use a meteorological verb, where no expletive element is present, and none can be present.

No duplicate elements

Transparent languages should encode information from the representational level exactly once, and not several times. This principle is violated by the SLM indefinite article hatthu. One occurrence would be enough to signal the unidentifiable status of the referent to the hearer, yet it is often found twice, as in (32). This is a non-transparent mapping between the representational and the morphosyntactic level.

No tense copying

Transparent languages are expected always to show the ‘real’ tense value from the representational level in morphosyntax. Changing tenses (tense copying, consecutio temporum) are not expected. Thus, constructions like English He said that he had two brothers, where the past tense in the dependent clause is used despite the present tense meaning, are not expected to occur in transparent languages. Indeed, SLM does not show tense copying.

While no past tense form is ever used for present contexts, it is possible to find the non-past form arà in past contexts. As argued in Nordhoff (2009:289f), this is due to the polysemy of this form. Besides the more common meaning as ‘non-past’, this form can also be used as ‘simultaneous’, and this is what we find in examples like (34).

While polysemy is not exactly transparent either, the non-transparency we find here is due to the lexical entry of arà-, and not to a morphosyntactic rule of tense copying.

No raising

In a transparent language, we would expect every argument to surface in the clause where it semantically belongs. Raising constructions like John seems to be intelligent are not transparent in the sense that John semantically is an argument of the lower clause but shows up in the higher clause in morphosyntax. Such structures are not found in Sri Lanka Malay. The meanings found in English raising constructions would be expressed in SLM by the evidential marker kiyang or the enclitics =ke and =so, which attenuate the assertive force of a speech act similar to seems to in English.

No grammatical gender, declension, conjugation

A transparent language is not expected to have elements in morphosyntax which are not motivated on semantic grounds, i.e. elements whose form depends on arbitrary criteria like membership in a certain declension or conjugation class, or grammatical gender. Sri Lanka Malay has no declension or conjugation classes. There is no arbitrary gender assignment (as in German or French) either; even natural gender (sex-based classification) is very marginal. The only instance I am aware of is the pair puthra ‘prince’, puthri ‘princess’.

No agreement (but pronominal arguments)

As stated above in Section 2.1, there is no cross-reference, and thus no agreement.

Phrase marking through clitics rather than head marking through affixes

This is also what we find in SLM, as discussed in Section 3.1.

No fusional morphology

Transparent languages are not expected to express more than one meaning per morpheme. This is to say that we do not expect any fused portmanteau forms. While, broadly speaking, SLM does not have fusional morphology, some allomorphs of case markers could be analyzed as portmanteau forms. This is the case for the allomorph =dang of the dative marker =nang. This allomorph is obligatory for the monosyllabic pronouns see ‘1s’, goo ‘1s’, luu ‘2s’, dee ‘3s’. Another instance is the allomorph =ppe of the possessive marker =pe, which attaches to the same four items. One can then postulate that both =dang and =ppe carry a meaning of singular and pronoun besides their normal case semantics. This means that more than one meaning is expressed in one form, an instance of fusional morphology.
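The allomorph selection just described can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name attach_case, the two tables and the morphemic host=marker output format are hypothetical and not part of the original analysis. Only the four pronouns and the two allomorph pairs come from the text.

```python
# Hypothetical sketch; names and output format are illustrative assumptions.

# The four monosyllabic pronouns listed in the text.
MONOSYLLABIC_PRONOUNS = {"see", "goo", "luu", "dee"}

# Default marker -> allomorph required after the monosyllabic pronouns.
ALLOMORPHS = {
    "nang": "dang",  # dative
    "pe": "ppe",     # possessive
}


def attach_case(host: str, marker: str) -> str:
    """Return a morphemic segmentation host=marker with the allomorph chosen.

    The result is a segmentation, not a surface pronunciation: no sandhi
    is applied here.
    """
    if host in MONOSYLLABIC_PRONOUNS and marker in ALLOMORPHS:
        marker = ALLOMORPHS[marker]
    return f"{host}={marker}"


if __name__ == "__main__":
    for host, marker in [("see", "nang"), ("luu", "pe"), ("oorang", "nang")]:
        print(attach_case(host, marker))
```

The point of the sketch is that the choice of marker depends on two parameters (semantic function and the identity of the host), which is what makes =dang and =ppe candidates for a portmanteau analysis.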

5 Morphosyntactic/Phonological

Phonological phrasing and morphosyntactic phrasing run parallel

In the interface between morphosyntax and phonology, a transparent mapping is found if there is a 1:1 correspondence between morphosyntactic phrasing and phonological phrasing. The major morphosyntactic constituents in Sri Lanka Malay are the clause CLS, the predicate PRED and the noun phrase NP. NPs are not restricted to nominal heads. Indeed, numerals, adjectives, pronouns, deictics, quantifiers, and even sentences and utterances can head NPs without further measures being taken. This means that the normal structure of the clause can be represented as NP* PRED. On the phonological level, Nordhoff (2009) distinguishes Presuppositive Phrases with an LH boundary tone from Assertive Phrases with an L boundary tone. There seems to be a 1:1 mapping of NPs on Presuppositive Phrases, and PRED on Assertive Phrases, so that the phrasing runs parallel on both levels and is therefore transparent.
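The parallel phrasing can be illustrated with a minimal sketch that maps a clause of the form NP* PRED onto boundary tones. The function name boundary_tones and the list-of-labels encoding are assumptions made for illustration; only the NP/PRED constituents and the LH/L boundary tones come from the text.

```python
# Hypothetical sketch; the encoding of clauses as label lists is an assumption.

# NP -> Presuppositive Phrase (LH boundary tone),
# PRED -> Assertive Phrase (L boundary tone).
BOUNDARY_TONE = {"NP": "LH", "PRED": "L"}


def boundary_tones(constituents):
    """Map a clause of the normal shape NP* PRED onto one tone per phrase."""
    well_formed = (
        len(constituents) > 0
        and constituents[-1] == "PRED"
        and all(c == "NP" for c in constituents[:-1])
    )
    if not well_formed:
        raise ValueError("expected the normal clause structure NP* PRED")
    # One phonological phrase per morphosyntactic constituent (1:1 mapping).
    return [BOUNDARY_TONE[c] for c in constituents]


if __name__ == "__main__":
    print(boundary_tones(["NP", "NP", "PRED"]))
```

Because each constituent yields exactly one phrase, the output list always has the same length as the input, which is the 1:1 correspondence the section describes.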

Phonological weight does not influence morphosyntactic placement

A maximally transparent language would completely ignore phonological weight when determining the order of constituents. Constituent order would be solely determined by semantics. This is not the case for SLM. While normally all arguments are preverbal, very heavy constituents can be shifted to postverbal position. This is frequently found for reported utterances (35), but other constituents can also be shifted, for instance the complement of a modal in (36).

Right extraposition of heavy constituents is one way to facilitate parsing; the other is shifting heavy nominal modifiers to the left. This is what can be found with relative clauses as in (37), where the relative clause consisting of eleven morphemes is shifted to the left, and the short indefinite marker hatthu is now found between the relative clause and the head noun. Semantically, hatthu should have scope over the relative clause as well, but this is not mirrored in morphosyntax; the phonological weight has taken precedence over semantic considerations when placing the constituents in this sentence.

6 Phonology

No sandhi rules

In a transparent language, there should be no word-external sandhi. In SLM, we can distinguish combinations of base+affix, base+clitic, compounds, and combinations of independent words as candidate domains for sandhi. As for affixes, we find that the numeral suffixes -blas ‘-teen’ and -pulu ‘-ty’ cause a labial articulation of the final nasal in the word dhlaapan ‘eight’ (dhlapamblas, dhlapampulu) as well as the dropping of the final consonant in ùmpath ‘four’ at least for some realizations of ùmpa(th)pulu ‘forty’. Combinations of base+clitic also often show assimilation of nasals to the following consonant, e.g. oorang ‘man’+=pe ‘poss’=oorampe ‘of the man’. Compounds and strings of independent words generally do not undergo sandhi.
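The labial assimilation of final nasals can be sketched as a simple string rewrite. This is a rough illustration on the orthographic forms given above, not a phonological analysis: the function name assimilate_nasal and the restriction to the triggers p and b are assumptions, and the vowel shortening visible in dhlapamblas is deliberately not modelled.

```python
# Hypothetical sketch operating on orthography; not a phonological analysis.

NASALS = ("ng", "n", "m")  # the digraph "ng" is listed first so it wins over "n"
LABIALS = ("p", "b")       # assumed triggers, based on =pe, -pulu and -blas


def assimilate_nasal(base: str, bound_form: str) -> str:
    """Join base and a following affix or clitic; a base-final nasal
    becomes "m" when the bound form starts with a labial stop."""
    if bound_form.startswith(LABIALS):
        for nasal in NASALS:
            if base.endswith(nasal):
                return base[: -len(nasal)] + "m" + bound_form
    # No labial trigger or no final nasal: plain concatenation.
    return base + bound_form


if __name__ == "__main__":
    print(assimilate_nasal("oorang", "pe"))  # base + possessive clitic
    print(assimilate_nasal("aanak", "ka"))   # no nasal, no change
```

The sketch treats affixes and clitics alike; it does not cover the optional loss of the final consonant in ùmpath, nor does it apply across compounds or independent words, where the text reports no sandhi.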

No degemination

In a transparent mapping of morphology onto phonology, we would not expect the reduction of geminates caused by the collision of a coda with an identical onset of an affix. SLM shows non-transparent features in this regard. The word baalek is an intransitive verb meaning ‘to return’. When the causativizer -king is attached to it to yield a transitive verb, the form is not balekking, but baleking, so that we are dealing with degemination. This is true for affixation. With enclitic postpositions, degemination is not found, so the combination of aanak ‘child’ with the locative enclitic =ka is pronounced [a:nakka] and not [a:naka].
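The contrast between affixes and clitics can be sketched as follows. The function name attach and the kind parameter are hypothetical, and the sketch works on orthographic strings only; it takes the stem in the shortened form balek that appears in the derived word, since vowel length alternation is not modelled.

```python
# Hypothetical sketch on orthographic strings; names are illustrative.

def attach(base: str, morph: str, kind: str) -> str:
    """Join base and morph. With an affix, an identical consonant at the
    boundary is reduced to one (degemination); with a clitic it is kept."""
    if kind not in ("affix", "clitic"):
        raise ValueError("kind must be 'affix' or 'clitic'")
    if kind == "affix" and base and morph and base[-1] == morph[0]:
        # Degemination: drop the first segment of the affix.
        return base + morph[1:]
    return base + morph


if __name__ == "__main__":
    print(attach("balek", "king", "affix"))  # causativizer -king
    print(attach("aanak", "ka", "clitic"))   # locative enclitic =ka
```

The same boundary configuration (k followed by k) thus yields two different outcomes depending on the morphological status of the second element, which is the non-transparent part of the mapping.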

No diphthongization

In a transparent language, the pronunciation of a phoneme should remain the same (as far as this is phonetically possible at all) in any environment. A vowel should always be pronounced as a vowel, and never as a semivowel. In some languages, chance meetings of two vowels cause one vowel to be pronounced as a semivowel, yielding a diphthong. This phenomenon is not found in Sri Lanka Malay.

No nasalization

In line with what has been said above about diphthongization, an oral vowel should always be pronounced as oral, regardless of whether there are nasal consonants in its environment. This has not been investigated for Sri Lanka Malay. Given the mechanics of the articulatory tract and basic principles of economy, it is unlikely that speakers make efforts to keep all their vowels oral in a nasal environment. While it might be possible to rapidly move the velum back and forth in a word like maangga=nang ‘for the mango’ to switch between oral vowels and nasal consonants, it is much more likely that the speakers will avoid the effort involved and pronounce most if not all of the vowels as nasal.


1. Adjectives can convert to verbs and then take the full range of verbal morphology, but they are no longer adjectives then.


Adelaar, K. A. (1985). Proto-Malayic; The reconstruction of its phonology and parts of its morphology and lexicon. Ph.D. thesis, Leiden University.

Hengeveld, K. & L. Mackenzie (2008). Functional Discourse Grammar. Oxford: Oxford University Press.

Nordhoff, S. (2009). A Grammar of Upcountry Sri Lanka Malay. Ph.D. thesis, University of Amsterdam.

Nordhoff, S. (2011). “Having come to be a copula in Sri Lanka Malay -- an unusual grammaticalization path”. Folia Linguistica 45, pp. 103-126.

Paauw, S. H. (2004). A Historical Analysis of the Lexical Sources of Sri Lanka Malay. Unpublished M.A. thesis, York University.