[LSLS] A corpus-based approach to explore how children acquire Batangas Tagalog in their natural environment

  • Date: 24 Mar 2026 | 8:30 AM - 10:00 AM

How do children acquire language in linguistically and culturally diverse contexts across the globe? Naturalistic recordings of child speech and surrounding speech provide rich datasets to study this question. In particular, such recordings enable us to study what children say, what they hear, and how they learn to communicate in interactions with others.
In this talk, I will present the Batangas Tagalog (BaTa) corpus, which comprises recordings of preschool children in their natural environment in rural Batangas. First, I will discuss how the corpus was built, focusing on the methodological approach and our experiences throughout the different stages of the project. Then, I will present ongoing case studies, including the use of kinship terms in child-directed speech and the acquisition of verbs. These will showcase the diversity of questions that can be pursued with such a corpus. At the same time, I will discuss the constraints of naturalistic data and important considerations for planning the data collection and analysing the data.