Assistant Professor Elsie Marie Or will be co-presenting a paper titled “The Challenges of Symmetrical Voice Languages for Universal Dependencies” with Maria Bardají (University of Barcelona), Nikolaus Himmelmann (University of Cologne), and Angelina Aquino (Charles Darwin University) at the 15th International Conference of the Association for Linguistic Typology (ALT XV), hosted this year by Nanyang Technological University, Singapore. The conference runs from December 4 to 6, 2024.
The study that they will be presenting at ALT XV is based on the Tagalog Universal Dependencies Treebank project, which is locally headed by Or and is part of the German Research Foundation-funded project titled “Information distribution and language structure – correlation of grammatical expressions of the noun/verb distinction and lexical information content in Tagalog, Indonesian and German.” The Tagalog NewsCrawl treebank, which is still undergoing verification, is currently the largest UD treebank of any Philippine language and can be accessed on the UD website or on HuggingFace.
Below is a copy of the abstract of their study.
The Challenges of Symmetrical Voice Languages for Universal Dependencies
Tagalog (an Austronesian language from the Philippines), as well as other Western Austronesian languages, has been analyzed as having a symmetrical voice system (Foley 1998; Himmelmann 2005; Riesberg 2014; Chen & McDonnell 2019, among others). This means that there are at least two transitive constructions, called here actor voice and undergoer voice, neither of which is more basic than the other. Compare the example in (1). In the actor voice in (1)a, the subject argument, marked by nominative ang, is an agent, and the non-subject argument is an undergoer, marked by genitive ng. In the undergoer voice in (1)b, the alignment is reversed: the subject is the undergoer, the non-subject argument the agent. Importantly, the grammatical properties of the core arguments in both clause types are identical: the non-subject argument in both actor and undergoer voice, for example, is marked by ng, cannot be relativized or topicalized, etc. It is not the case that the agentive argument loses its core argument role in the same way it does in European-style passive alternations.
The structural differences between symmetrical voice languages such as Tagalog and languages with asymmetrical voice alternations such as European languages pose a challenge for a universal annotation framework like Universal Dependencies (UD). To be sure, pragmatically the UD annotation scheme allows for language-specific workarounds to overcome the various annotation problems resulting from symmetrical voice alternations, as illustrated by the existing (smallish) UD treebanks for Tagalog and Cebuano. However, even if different annotators agree that the symmetrical voice analysis is the most plausible analysis for Tagalog clause structure (which is not unanimously agreed), there are so many different ways of applying the basic UD scheme that it is difficult to conceive of an annotation scheme for Tagalog which all researchers are happy to work with.
Consequently, we want to explore in our talk the question of which assumptions built into the basic UD framework are particularly problematic and whether there are solutions that are theoretically more satisfying and practically more robust than the ad hoc solutions we ourselves have used. Specifically, we will discuss the following issues, which mostly concern the PoS and syntactic dependency layers of the UD system:
(a) the assumption that languages have only one basic or unmarked transitive clause.
(b) the fact that the syntactic functions of subject and object are defined in terms of semantic roles.
(c) the fact that in Tagalog (and other Philippine languages) arguments are preceded by phrase markers (like ang and ng in (1)) which show characteristics of both determiners and prepositions: they encode specificity, definiteness or deictic distinctions but, like case markers, also indicate the syntactic role of the argument.
(d) the fact that Austronesian symmetrical voice languages tend to have non-verbal existential constructions. In these constructions, an existential operator is immediately followed by a noun phrase (if denoting existence, as in (2)a) or by two noun phrases (one denoting a possessor and the other the possessum, as in (2)b).
The full abstract complete with sample sentences and list of references can be viewed in the ALT XV conference program booklet.
Published by UP Department of Linguistics