Monday, March 6, 2017

Linguistic typology conference and workshops coming up! #lingtypconf

This year, Canberra will host the 12th biennial meeting of the Association for Linguistic Typology. This is the big event for researchers of cross-linguistic diversity from all over the world. As is well known, this is a very cool and challenging research field. Grasping the diversity of the world's languages is an ambitious and worthwhile enterprise that can sometimes leave you feeling a bit floored.

I'm happy to be able to welcome you here in Canberra for this conference. It's going to be great! If you're floored, I'll pick you up!

The deadline for abstract submission is coming up, so you're all hereby reminded that if you want to take part, get that abstract in by the 31st of March. This goes for the general session as well as the workshops.

We (Martin Haspelmath, Hannah Haynie, Robert Forkel and myself) are organising a workshop on Design principles and comparisons of typological databases. If you are interested, do get in touch and submit an abstract. Below in this post is a longer description of our workshop.

Interested people should submit abstracts for both the general session and the workshops in the same form: . You don't have to be an ALT member to send in an abstract, but you'll have to become one if you're coming here and giving a talk. But, worry about that later.

Please remember that, for those with funding problems, a limited number of scholarships for researchers are available. Applications for these are also due 31 March 2017.

There are actually two workshops this year with similar topics: ours and one by Round, Macklin-Cordes and Quinn titled: quantitative analysis in typology: the logic of choice among methods. Since they overlap, we'll see if we can arrange for some time together or some sort of linking.

If you can't come, but are keen on following what's going on in the world of linguistic typology, then subscribe to this mailing list.

Longer description of the workshop: Design principles and comparisons of typological databases

What are the shared challenges and opportunities facing databases of language diversity? What kinds of databases are out there, and what can they be used for? These are the questions we would like to address in this workshop, bringing together researchers who compile this kind of data and those who use it.

There are quite a few existing databases of grammatical features of languages, and several more are under construction. They differ in their design and in the kinds of research questions they aim to answer. Some are created to investigate the particular history of a certain region or family (e.g. van Gijn 2014), others a particular set of traits in a global set of languages (Stassen 1997), and so on. Despite these differences, there is often the possibility of sharing data or design between different typological surveys. 

We would like to take this opportunity during the ALT to bring together scholars who are designing typological databases and end users of such databases, to discuss comparisons and possible opportunities for co-ordination. We’re interested in the design decisions that go into the construction of a database, what consequences those decisions have for what it can be used for, and whether it can be linked to other similar databases.

Within MPI-SHH's Glottobank project, there have been discussions of how different typological databases relate to each other and what their different aims and uses are. We would like to engage the broader typology community in these discussions and hear viewpoints from other database designers and end-users.

We are also interested in discussing design principles in relation to end-users of the data. There are many different kinds of end-users of this data, and the methods with which they approach the material carry with them certain assumptions and prerequisites. For phylogenetic studies, for example, it is best if the features are logically independent of each other and associated with a confidence value. What does the data that is available today look like, and what should future surveys look like?

This is not only a question of adjusting to certain end-users' preferences, but also a matter of clearly communicating what the data looks like, how it was designed and why. This will make it clear which research questions the data is suited for, and which questions it should not be applied to.

For example, WALS (Dryer & Haspelmath 2013) was constructed using already existing data from a number of well-known typologists. There was also a core sample of languages (100 and 200) that all/most of the chapters covered, but there were still significant gaps in the database coverage of features per language. This renders certain kinds of analysis impossible. In WALS, there was most likely greater consistency per feature as opposed to per language, since that was how labour was divided. This can be contrasted with APiCS (Michaelis et al. 2013), where each language was represented by experts who corresponded with the APiCS editorial team to answer a typological questionnaire. In the case of APiCS, we expect greater consistency over each language instead of over each feature. APiCS also allows for languages to be represented with several values for one feature, whereas WALS only allows for one. These design choices have consequences for the nature of the data and are interesting to discuss in relation to databases under construction, end users and comparison.
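To make the contrast concrete, here is a minimal Python sketch of how one might measure coverage gaps and spot multi-valued cells. Everything in it is hypothetical: the toy language names, features and values are made up for illustration, and the layout (language, then feature, then a set of values) is just one assumed way to hold such data, not the actual format of WALS or APiCS.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: language -> feature -> set of coded values.
# A single-valued design holds at most one value per cell;
# a multi-valued design may hold several.
Dataset = Dict[str, Dict[str, Set[str]]]

toy: Dataset = {
    "lang_a": {"word_order": {"SOV"}, "tone": {"yes"}},
    "lang_b": {"word_order": {"SVO", "SOV"}},  # two values for one feature
    "lang_c": {"tone": {"no"}},                # word_order is a gap
}


def all_features(data: Dataset) -> List[str]:
    """Every feature coded for at least one language, sorted by name."""
    return sorted({f for feats in data.values() for f in feats})


def coverage_per_language(data: Dataset) -> Dict[str, float]:
    """Share of all features that are coded for each language."""
    total = len(all_features(data))
    return {lang: len(feats) / total for lang, feats in data.items()}


def coverage_per_feature(data: Dataset) -> Dict[str, float]:
    """Share of all languages that have a value for each feature."""
    n_langs = len(data)
    return {
        f: sum(1 for feats in data.values() if f in feats) / n_langs
        for f in all_features(data)
    }


def multi_valued_cells(data: Dataset) -> List[Tuple[str, str]]:
    """(language, feature) pairs that carry more than one value."""
    return [
        (lang, f)
        for lang, feats in sorted(data.items())
        for f, vals in sorted(feats.items())
        if len(vals) > 1
    ]
```

A user who needs a complete language-by-feature matrix could check `coverage_per_language(toy)` before choosing a method, and `multi_valued_cells(toy)` shows which cells would need a decision before the data could be merged into a single-valued design.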

We would like to take this opportunity to invite researchers who are working on constructing typological databases of structural/grammatical features to discuss the questions below and related ones. We would also like to invite end-users who are engaging with this kind of data to present findings and engage in discussions on what the limitations and possibilities of the databases are.

The workshop aims at discussing these questions, but is also open to other related questions:
  • What kind of questions do we want to answer with our data, and which questions do we need to admit we cannot answer?
  • What does it mean if we are comparing doculects instead of languages?
  • What do linguistic descriptions, globally, enable us to research, and what do they not?
  • What other feasible sources of information besides descriptions can we use?
  • What do we gain and lose by designing our features to be logically independent from each other (or conversely by including non-independent items in questionnaires)?
  • How do the circumstances of data collection (e.g. coding by feature or by language) affect the use and comparability of data from different surveys?
  • Can data from regionally oriented questionnaires be coordinated with globally oriented surveys to fruitfully build better sets of information on the world's languages? How do data design limitations impact this enterprise?
  • What elements need to be considered and what information needs to be documented when mapping between grammatical/typological datasets? (i.e. setting the stage for the grammaticon/getting input from other database designers on this concept)
  • How do we implement measures of inter-coder reliability into more databases and into comparison of them?
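As one illustration of the last question, agreement between two coders on the same set of cells can be summarised with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. This is only a sketch of one common measure, not something the workshop prescribes. The function name and the example codings are made up, and it assumes both coders have coded exactly the same cells with one value each.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who coded the same items in the same order."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must code the same, non-empty set of items")
    n = len(coder_a)

    # Observed agreement: share of items given the same value by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: based on how often each coder uses each value.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[v] * counts_b[v] for v in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical value throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codings of one feature for four languages.
coder_1 = ["yes", "yes", "no", "no"]
coder_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(coder_1, coder_2)  # observed 0.75, chance 0.5, kappa 0.5
```

A database could store such a score per feature alongside the values themselves, so that end-users can see where coders tend to diverge.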
