Research projects
Prof. Silvia Luraghi - Project Lead
The Ancient Greek WordNet is an ongoing collaboration between the Center for Hellenic Studies, the University of Exeter, and the University of Pavia, under the joint direction of William Michael Short, Alexander Forte, James Tauber, and Silvia Luraghi, to create a comprehensive lexico-semantic database of the Ancient Greek language. It is intended to model Greek's semantic system as fully and accurately as possible, in a form that is machine-interpretable and machine-actionable, and thus suitable for NLP applications of different kinds, especially in the area of natural language understanding.
Adopting methodology from cognitive linguistics, this project also seeks to provide the lexicographical basis for the wider study of metaphor and metonymy across languages. The diachronic orientation of the database begins with the language of Mycenaean and of the Homeric epics, with planned chronological extension into the Classical, Hellenistic, and Roman periods.
This project builds on, and extends, original work carried out by Federico Boschetti, Riccardo Del Gratta and others at the Institute for Computational Linguistics “A. Zampolli”, CNR of Pisa, Italy, as well as by Harry Diakoff, under the auspices of the Alpheios Project, with Chiara Zanchi serving as technical coordinator.
Main researchers of the Pavia Unit:
Silvia Luraghi, Chiara Zanchi
Prof. Silvia Luraghi - Project Lead
The Sanskrit WordNet is an ongoing collaboration between the University of Pavia and the University of Exeter, under the joint direction of William Michael Short and Silvia Luraghi, to create a comprehensive lexico-semantic database of the Sanskrit language. It is intended to model Sanskrit's semantic system as fully and accurately as possible, in a form that is machine-interpretable and machine-actionable, and thus suitable for NLP applications of different kinds, especially in the area of natural language understanding.
This project builds on, and extends, original work by Oliver Hellwig for the Digital Corpus of Sanskrit, with Erica Biagetti serving as the project's technical coordinator.
Main researchers of the Pavia Unit:
Silvia Luraghi, Erica Biagetti, Lucrezia Carnesale
The informalisation of English language learning through the media: language input, learning outcomes and sociolinguistic attitudes from an Italian perspective
PRIN - MUR funded national project (2022-2025)
Prof. Maria Pavesi - Principal Investigator
Owing to contemporary globalisation, multilingualism and media saturation, the availability of English in traditional and new media has increased at an unprecedented rate. Extensive access to the language outside institutional settings is leading to a growing informalisation of L2 learning and use. No longer restricted to institutional sites, the learning of English as an additional language (L2) emerges naturally from user-initiated extramural engagement with popular media, web technologies and social encounters, for reasons not primarily related to language learning. Concurrently, the informalisation of L2 learning and use is changing students’ language attitudes toward English. These multifaceted trends have been investigated in many European countries. By contrast, although Italy appears to be experiencing a similar radical change, little research has been carried out on the acquisitional and sociolinguistic impact of media-induced contact with English in this country.
This project probes Italian university students’ private worlds and the undetected processes which are shaping English informal learning and use. It studies the degree, type and modality of media exposure, while engaging with the outcomes of informal language learning and learner-users’ beliefs and orientations to English as a native language (ENL), foreign language (EFL) and lingua franca (ELF). The project draws on functional, interaction-based and cognitive approaches to second language acquisition (SLA) with a focus on media genres.
The investigation involves four territorially differentiated universities: Pavia (the lead university), Pisa, Salento and Catania. It employs a mixed research design that combines cross-sectional and longitudinal data collection with an array of empirical tools and both quantitative and qualitative approaches to data analysis. It is organised in three phases:
1) By means of 2,000 questionnaires, information will be collected on students’ personal data and their extra-linguistic and social characteristics. The main variables to be tapped include frequency and type of media exposure (e.g., TV series, songs, social networks, blogs/vlogs, YouTube, videogames, websites, (online) press, radio, podcasts), modality of access (English-only vs subtitled/multilingual input, receptive-only vs interactional use), motivation and goals, instruction, and non-media contact with English (e.g., travelling and mobility programmes). The survey will comprise self-evaluations of linguistic competence and an assessment of general lexical knowledge to allow correlations between the factors investigated and language learning outcomes.
2) Due to the private nature of most informal learning, ethnographic studies through semi-structured interviews will be carried out on purposefully selected respondents to gain an in-depth picture of behavioural patterns across time, beliefs and motivations associated with English-language media.
3) Longitudinal studies of presently untutored high-exposure respondents will be conducted within the Complexity, Accuracy, Fluency (CAF) framework and will test specific late-acquired areas of the L2, including spoken language grammar and pragmatics.
Corpus-based descriptions of relevant media genres will inform hypotheses on the impact of different input types on the acquisition of L2 English.
The four research units will follow the same protocols in the first stages of the project to guarantee the comparability of results, while developing specific components in later stages of research. The Pavia team will coordinate the development and implementation of the research design. It will contextualise the main issues in informal learning of L2 English while providing corpus-based descriptions of spoken language in audiovisual (AV) dialogue. In the interviews, the Pavia team will focus on participants’ comprehension of AV input and other cognitive processes that may lead to second language acquisition. It will also address participants’ attitudes towards English (ENL, EFL, ELF), their identities as learner-users, their perception of L2-self/selves and the multilingual repertoire developed via the media. In the longitudinal studies, trajectories of CAF will be investigated, focusing on advanced spoken morpho-syntax.
The present PRIN project is the first one in Italy to chart this highly dynamic and largely unexplored landscape. It has applications and implications pertaining to English second language acquisition, applied linguistics, sociolinguistics, and educational and translation policies. The questionnaire built for the project qualifies as a scientific instrument available for future large-scale investigations in Italy and other countries to monitor the evolving role of English (inter)nationally and observe the correlation patterns between the different factors, including instruction, involved in contemporary language learning. The investigation will advance our understanding of how learners acquire English informally and deploy specific actions that empower them linguistically. It will show how the ubiquity of English has implications for language learning landscapes and the conceptualisation of L2 users, with an effect on speakers’ orientations towards native, learner and ELF varieties.
Main researchers of the Pavia Unit:
Maria Pavesi, Maicol Formentelli, Silvia Monti, Camilla De Riso (Dipartimento di Studi umanistici); Elisa Ghia, Cristina Mariotti (Dipartimento di Studi politici e sociali); Erik Castello (Università di Padova)
Prof. Ilaria Fiorentini - Principal Investigator
WhAP! is an Italian linguistic corpus which, following the example of other corpora, collects data on the written and spoken language of WhatsApp users. It is precisely WhatsApp's distinctive combination of voice and written messages that makes it interesting for researchers, since it gives rise to an entirely original form of interaction within Computer-Mediated Communication (CMC). The data will be accompanied by metadata describing the characteristics of the individuals, such as gender, age group and geographical origin, so that variants can be contextualised sociolinguistically.
The creators of the corpus reconstructed the chats with their own contacts, covering both written messages and voice messages, the latter transcribed with ELAN, a software tool created specifically for linguistic annotation. From our portal, users will be able to access these chats directly, for private or outreach purposes, searching for particular phenomena or filtering the chats by geographical area or age group. Although metadata are available for every chat, our annotators have anonymised every name, street or contextual reference that could reveal the identity of the speaker or writer, in order to protect their privacy.
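As an illustration only, the kind of metadata filtering and phenomenon search described above could be sketched as follows. This is not the portal's actual implementation: the class names, field names, region and age-group labels, and sample messages are all hypothetical, and attaching metadata at chat level is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str  # anonymised label (hypothetical scheme), never a real name
    kind: str    # "text" or "voice"; voice messages are stored as transcripts
    text: str


@dataclass
class Chat:
    chat_id: str
    region: str     # hypothetical metadata field
    age_group: str  # hypothetical metadata field
    messages: list[Message] = field(default_factory=list)


def filter_chats(chats, region=None, age_group=None):
    """Return the chats matching every metadata filter that is not None."""
    return [
        c for c in chats
        if (region is None or c.region == region)
        and (age_group is None or c.age_group == age_group)
    ]


def search(chats, term, kind=None):
    """Return (chat_id, message) pairs whose text contains `term`."""
    term = term.lower()
    return [
        (c.chat_id, m)
        for c in chats
        for m in c.messages
        if (kind is None or m.kind == kind) and term in m.text.lower()
    ]


# Invented toy data, for illustration only.
CHATS = [
    Chat("chat_001", "North", "18-25", [
        Message("SPK1", "text", "ok see you later"),
        Message("SPK2", "voice", "well, I mean, let me rephrase that"),
    ]),
    Chat("chat_002", "South", "26-35", [
        Message("SPK3", "text", "let me know"),
    ]),
]

northern = filter_chats(CHATS, region="North")
voice_hits = search(CHATS, "let me", kind="voice")
```

Keeping the filters optional lets the same function serve both a geographical and an age-based query, while the `kind` argument separates written messages from transcribed voice messages.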
WhatsApp chats are a field of study untouched by user self-monitoring: with their own contacts, users employ register varieties that are not influenced by being observed by third parties. In addition, voice messages can be studied; they contain features of spontaneous speech that are very difficult to obtain in other ways, precisely for privacy reasons. Sociolinguistics can thus obtain a wealth of data on phenomena such as code mixing, reformulations and others, in a medium that sits exactly halfway between prototypical writing and spontaneous speech.
People
Ilaria Fiorentini, Marco Forlano, Nicholas Nese, Chiara Zanchi, and the students of the Laurea Magistrale (Master's degree) in Linguistica Teorica, Applicata e delle Lingue Moderne