tag:blogger.com,1999:blog-13006802529970072512024-03-16T18:08:18.501+11:00Humans Who Read GrammarsA blog by a group of young linguists interested in diversity and description of the 7 000+ languages of the world.Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.comBlogger206125tag:blogger.com,1999:blog-1300680252997007251.post-27401578846164000902023-09-22T23:08:00.003+10:002023-09-22T23:15:09.450+10:00CLDF for dummies (v1.0)<p><span style="font-family: arial;"> </span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlqM02ccC3mAZbImsSMWJYOZ19ClO5VvZS8UnQZ_KWfa651H08-BJQjbW5R6HPCkSsRrnkiDrpfhOJ8nDVgyDarn2PRKpth9lp3rs7H7SYtJ72zRlynz-n0YBeQFuNomHbRQXZTn0v7dULMcY6-r3B1Oj_gwcH2vAK4uXENrg_Fn4tZ-2RudNr7tJBo44d/s960/cldf%20for%20dummies.png" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: arial;"><img border="0" data-original-height="720" data-original-width="960" height="455" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlqM02ccC3mAZbImsSMWJYOZ19ClO5VvZS8UnQZ_KWfa651H08-BJQjbW5R6HPCkSsRrnkiDrpfhOJ8nDVgyDarn2PRKpth9lp3rs7H7SYtJ72zRlynz-n0YBeQFuNomHbRQXZTn0v7dULMcY6-r3B1Oj_gwcH2vAK4uXENrg_Fn4tZ-2RudNr7tJBo44d/w608-h455/cldf%20for%20dummies.png" width="608" /></span></a></div><span style="font-family: arial;">I wrote a little document called "CLDF for dummies" based on what I know about CLDF that I think may be helpful to other researchers in language and cultural diversity and evolution. I am NOT a CLDF-developer or editor, this is all from an end-user perspective.</span><div><span style="font-family: arial;"><br /></span></div><div><span style="font-family: arial;">I'll keep a full and updated version <a href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md">here</a>. 
Here is version 1.0:</span></div><div><span style="font-family: arial;"><br /></span><h1 dir="auto" id="user-content-cldf-for-dummies" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-left: 0px; margin-right: 0px; margin-top: 0px !important; margin: 0px 0px 16px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#cldf-for-dummies-1" style="background-color: transparent; box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); text-decoration-line: none;"><span style="font-family: arial;">CLDF for dummies</span></a></h1>
<p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">This document outlines the very basics of the Cross-Linguistic Data Formats (CLDF) for researchers who want to use the data sets for analysis, comparison or plotting. CLDF is a way of organizing language data, in particular data sets that cover many different languages. The basic organisation is a set of tables, usually in csv-sheets (languages.csv, forms.csv etc). These files are linked to each other in a specific way, which makes it possible to combine them into an interlinked database. The files are all governed by standards, and there are sanity-checks to make sure everything lines up right. Because they are often just plain csv-sheets, they can easily be read by most data analysis tools like Python, R, Julia etc., or by regular spreadsheet programs like LibreOffice or Microsoft Excel. It is not necessary to use FileMakerPro, Microsoft Access or similar programs.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">It’s plain, flat and simpler than you might think. 
In this document, you will learn the very basics on how it works.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The data format was first published in 2018 [1] and has since then expanded to include a large amount of different data sets.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">CLDF is well-documented. This document is a very basic intro, for more advanced queries go to <a href="https://github.com/cldf/cldf/#readme" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://github.com/cldf/cldf/#readme</a> and <a href="https://cldf.clld.org/" rel="nofollow" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://cldf.clld.org/</a></span></p><h2 dir="auto" id="user-content-before-we-start" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#before-we-start" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Before we start<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 
1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Good things to keep in mind:</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">the absolute best way to learn how CLDF works is to poke around in an existing dataset. Open the files, check what’s in there, form assumptions and then check if the assumptions are always true. Below are two recommended starter-datasets:</span><ul dir="auto" style="box-sizing: border-box; margin-bottom: 0px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">Wordlist: NorthEuraLex v4.0 <a href="https://github.com/lexibank/northeuralex/tree/v4.0/cldf" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://github.com/lexibank/northeuralex/tree/v4.0/cldf</a></span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Structure: Grambank v1.0.3 <a href="https://github.com/grambank/grambank/tree/v1.0.3/cldf" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://github.com/grambank/grambank/tree/v1.0.3/cldf</a></span></li></ul></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">after having an initial poke around, read the spec for the dataset. 
They are the ones that'll tell you what the columns really are etc.</span><ul dir="auto" style="box-sizing: border-box; margin-bottom: 0px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><a href="https://github.com/lexibank/northeuralex/blob/v4.0/cldf/README.md" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">https://github.com/lexibank/northeuralex/blob/v4.0/cldf/README.md</span></a></li><li style="box-sizing: border-box; margin-top: 0.25em;"><a href="https://github.com/grambank/grambank/blob/v1.0.3/cldf/README.md" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">https://github.com/grambank/grambank/blob/v1.0.3/cldf/README.md</span></a></li></ul></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">many CLDF-datasets are continuously released, so make sure to keep track of which <span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);">version</span> you are using</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">if you use python, make sure to check out pycldf and cldfbench</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">if you use R, keep an eye out for rcldf which is in development</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">this document is about how to navigate existing CLDF-datasets as an end-user, not how to make one.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">there already exists a lot of documentation on how CLDF works, this document is not meant to be exhaustive but just a gentle entry to get you going. 
For more, see:</span><ul dir="auto" style="box-sizing: border-box; margin-bottom: 0px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><a href="https://github.com/cldf/cldf/#readme" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">https://github.com/cldf/cldf/#readme</span></a></li><li style="box-sizing: border-box; margin-top: 0.25em;"><a href="https://cldf.clld.org/" rel="nofollow" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">https://cldf.clld.org/</span></a></li></ul></li></ul><h2 dir="auto" id="user-content-how-to-know-if-youre-dealing-with-a-cldf-dataset" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#how-to-know-if-youre-dealing-with-a-cldf-dataset" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">How to know if you’re dealing with a CLDF-dataset<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 
1.998 0 0 0 0 2.83Z"></path></svg></span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">You are dealing with a CLDF-dataset if there is a file with the extension “json” that identifies a CLDF-dataset type near the top. For example, it could contain <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">"dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#StructureDataset"</code>. (There is one exception, see “Good to know” below.)</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Typically, there is a folder called “cldf” with files like “languages.csv”, “values.csv” and “StructureDataset-metadata.json” in it. 
The last file will be different depending on the type of data set.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Here are some examples of data sets that are available in CLDF that you may have encountered:</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">WALS (World Atlas of Language Structures)</span></p></li><li style="box-sizing: border-box; margin-top: 0.25em;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">PHOIBLE (Phonetics Information Base and Lexicon)</span></p></li><li style="box-sizing: border-box; margin-top: 0.25em;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">D-PLACE (Database of Places, Language, Culture and Environment)</span></p></li><li style="box-sizing: border-box; margin-top: 0.25em;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">Glottolog</span></p></li><li style="box-sizing: border-box; margin-top: 0.25em;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">Lexibank</span></p></li><li style="box-sizing: border-box; margin-top: 0.25em;"><p dir="auto" style="box-sizing: border-box; margin-bottom: 16px; margin-top: 16px;"><span style="font-family: arial;">Grambank</span></p></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: 
arial;">Good to know: <a href="https://github.com/cldf/cldf#metadata-free-conformance" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">It is possible for a CLDF-dataset to only consist of one file</a>. No json, no set of csvs. Just one file, for example values.csv. In such cases, the file doesn’t have any meta-data specified and just conforms to all the default settings. You can’t tell by a json that it’s a CLDF-dataset because there isn’t one. This type of CLDF-data set is rare, and will not be dealt with further here.</span></p><h3 dir="auto" id="user-content-types-of-cldf-datasets" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 1.25em; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#types-of-cldf-datasets" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Types of CLDF-datasets<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h3><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">There are five types of 
CLDF-datasets. They are also known as “modules”.</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">Wordlist (lexicon, has Forms and often Cognates)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Structure dataset (grammar or other types of information with one value per Language and Parameter (feature), has Values)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Dictionary (particular kind of lexicon, has Entries and Senses)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Parallel text (collections of paragraphs of the same text in different languages, has Forms, Segments and FunctionalEquivalents)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">generic (no specifics)</span></li></ul><h2 dir="auto" id="user-content-contents" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#contents" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Contents</span></a></h2>
<p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Each CLDF-dataset (except the metadata-free ones) consists minimally of:</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">a set of tables (usually in csv-sheets)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">a json-file</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The tables are usually in csv-format and contain the data itself. The json file has information <em style="box-sizing: border-box;">about</em> the dataset, for example what type of dataset it is, what the contents are, what the filenames are etc.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Many CLDF-datasets also contain a BibTeX-file with bibliographic references for the data. In such cases, each data-point is tied to a reference by the key of the BibTeX entry. Usually the key is in a column called “Source” in the ValueTable or FormTable. The BibTeX file is usually called “sources.bib”. 
If it’s called something else, it’ll say so in the meta-data json file.</span></p><h2 dir="auto" id="user-content-tables-inside-the-datasets" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#tables-inside-the-datasets" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Tables inside the datasets<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">There are some tables that occur in most CLDF-datasets, and some that occur only in certain types. 
For example, there is no table with word forms for Structure data sets - that’s for wordlists and Dictionaries.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The tables have specific names in the CLDF-world and have pre-defined specifics. The names are different from their filenames. You can see which name is tied to which file in the json. “LanguageTable” is usually found in the file languages.csv, “CodeTable” in codes.csv, “ValueTable” in values.csv, “CognateTable” in cognates.csv etc.</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">LanguageTable -> languages.csv (contains minimally ID)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">FormTable -> forms.csv (contains minimally ID, Form, Language_ID, Parameter_ID)</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">ParameterTable -> parameters.csv (contains minimally ID, Name) etc.</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The json-meta data file says which table is in which file, it’s specified as the <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">url</code> of the table which conforms to a certain CLDF-standard, for example for <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; 
font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">LanguageTable</code>. <span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);">You can’t always bank on LanguageTable being in languages.csv</span>. <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">pycldf</code> and <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">rcldf</code> can handle this for you, i.e. look up in the json what table is where and set all that up.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Each table is usually tied to several pre-defined CLDF standards for the content. 
For example, FormTables need to have the columns “ID”, “Form”, “Language_ID” and “Parameter_ID”, and the values in these columns in turn need to look a certain way.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Tables can have more columns than the minimal requirement and can have columns that don’t map onto CLDF-standards at all.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">For more specifics, see this file for CLDF v1.0 <a href="http://cldf.clld.org/v1.0/terms.rdf" rel="nofollow" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">http://cldf.clld.org/v1.0/terms.rdf</a>.</span></p><h4 dir="auto" id="user-content-tables-in-most-cldf-dataset" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#tables-in-most-cldf-dataset" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Tables in most CLDF-datasets</span></a></h4>
<p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Here are CLDF-tables that occur in most CLDF-datasets.</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">LanguageTable - list of all of the languages in the dataset. May also include things classified by Glottolog as dialects or proto-languages. Includes meta-information like longitude, language family etc.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">ParameterTable - contains definitions of the variables. For lexicon, these are the concepts; for grammar, these are the features.</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Wordlists also contain</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">FormTable - the forms for each concept for each language</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">CognateTable (not obligatory) - the cognate classification per form per concept per language</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Structure data-sets also contain</span></p><ul dir="auto" style="background-color: 
white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">ValueTable - the value for each parameter and language. Usually also Comment and Source.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">CodeTable - The list of possible values for each parameter. For example, GB020 in Grambank is a binary feature and can take 0, 1 and ? whereas EA016 in the Ethnographic Atlas (D-PLACE) can take 1, 2 or 9. The options are exclusive of each other for each data-point.</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Good to know: for the CLDF-dataset of D-PLACE, the LanguageTable contains a row per <em style="box-sizing: border-box;">society</em>. There is a column for the Glottocode of the language associated with that society.</span></p><h4 dir="auto" id="user-content-columns-in-tables" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#columns-in-tables" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Columns in tables<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 
2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h4><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Each table consists of a set of columns. These columns are often named, for example, "ID", "Longitude" or "Value", but the names can vary. The meta-data records which column name maps onto which property in the CLDF-universe. For example, there is the property "source", which has the propertyURL <a href="http://cldf.clld.org/v1.0/terms.rdf#source" rel="nofollow" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">http://cldf.clld.org/v1.0/terms.rdf#source</a> and is often mapped onto a column called "Source". However, if a CLDF-creator wanted to name this column "Reference" instead, that's all well and good. The json-metadata-file would tell users that the column "Reference" corresponds to the standardised property "source" and point to the property-url. As with filenames of tables, you can often get by with assuming that bibliographic references are in a column called "Source" and that the LanguageTable is in languages.csv, but this needn't always be true!
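</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">To make that lookup concrete, here is a minimal Python sketch. It assumes the json-metadata has the usual layout of a "tables" list in which each table has a "url", a "dc:conformsTo" and a "tableSchema" with "columns". The METADATA snippet, the file name forms.csv and the helper find_column are made up for illustration and are not part of any CLDF library. The sketch describes a FormTable whose source column is named "Reference" and asks the metadata where the sources live:</span></p>

```python
import json

# A cut-down, made-up stand-in for a CLDF json-metadata file (assumption:
# real files follow this layout but contain many more tables and columns).
METADATA = json.loads("""
{
  "tables": [
    {
      "url": "forms.csv",
      "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#FormTable",
      "tableSchema": {
        "columns": [
          {"name": "ID",
           "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id"},
          {"name": "Reference",
           "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#source"}
        ]
      }
    }
  ]
}
""")

TERMS = "http://cldf.clld.org/v1.0/terms.rdf#"


def find_column(metadata, table_type, prop):
    """Return (file name, column name) for a CLDF property, or None.

    Hypothetical helper: table_type is e.g. "FormTable", prop is e.g. "source".
    """
    for table in metadata.get("tables", []):
        # Skip tables that are not of the requested type.
        if table.get("dc:conformsTo") != TERMS + table_type:
            continue
        for column in table.get("tableSchema", {}).get("columns", []):
            if column.get("propertyUrl") == TERMS + prop:
                return table["url"], column["name"]
    return None


# Which file and column hold the sources of the forms?
print(find_column(METADATA, "FormTable", "source"))
```

<p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">For real datasets, a dedicated package such as pycldf is probably a safer choice than a hand-rolled lookup like this one.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">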
All glory to the json-metadata file.</span></p><h1 dir="auto" id="user-content-example-wordlist" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin: 24px 0px 16px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#example-wordlist" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Example: Wordlist<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h1><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Below is a tiny Wordlist CLDF-dataset. This dataset contains 3 words in 2 languages. The first two tables, LanguageTable and ParameterTable, contain information about the languages and parameters (in this case, concepts). The FormTable contains the actual forms.
For one of the concepts, one of the languages has two words and both are listed.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The meta-data json is not included here. You can see an example of a Wordlist-metadata json file here: <a href="https://github.com/lexibank/abvd/blob/master/cldf/cldf-metadata.json" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://github.com/lexibank/abvd/blob/master/cldf/cldf-metadata.json</a>.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);"><span style="font-family: arial;">LanguageTable</span></span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">One row = one language (or sometimes a dialect or a proto-language, i.e. a node above the language level in a tree). The ID column uniquely identifies each language in the dataset.
In other tables, the column that links to the ID column here is called “Language_ID”.</span></p><table style="background-color: white; border-collapse: collapse; border-spacing: 0px; color: #1f2328; display: block; font-size: 16px; margin-bottom: 16px; margin-top: 0px; max-width: 100%; overflow: auto; width: max-content;"><thead style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Name</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Glottocode</span></th></tr></thead><tbody style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">Bintulu</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: 
arial;">bint1246</span></td></tr><tr style="background-color: var(--bgColor-muted, var(--color-canvas-subtle)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">18</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">CHamorro</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">cham1312</span></td></tr></tbody></table><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Good to know: Sometimes the IDs in LanguageTable are Glottocodes or ISO 639-3 codes, but they don’t have to be. They just have to be unique within that dataset. In Grambank, the IDs are Glottocodes, but WALS has its own code-system, different from both Glottocodes and ISO 639-3. If you want Glottocodes, look for a column called Glottocode in the LanguageTable and don’t rely on the ID column.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Good to know 2: Glottocodes consist of 4 letters or digits followed by 4 digits. The first 4 characters are not always letters.
For example, <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">ww2p1234</code> and <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">3adt1234</code> are existing glottocodes.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);"><span style="font-family: arial;">ParameterTable</span></span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">One row = one parameter. The ID column uniquely identifies each parameter in the dataset. 
In other tables, the column that links to the ID column here is called “Parameter_ID”.</span></p><table style="background-color: white; border-collapse: collapse; border-spacing: 0px; color: #1f2328; display: block; font-size: 16px; margin-bottom: 16px; margin-top: 0px; max-width: 100%; overflow: auto; width: max-content;"><thead style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Name</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Concepticon_ID</span></th></tr></thead><tbody style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">144_toburn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">to burn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span 
style="font-family: arial;">2102</span></td></tr><tr style="background-color: var(--bgColor-muted, var(--color-canvas-subtle)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">2_left</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">left</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">244</span></td></tr></tbody></table><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);"><span style="font-family: arial;">FormTable</span></span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">One row = one form. The ID column uniquely identifies each form in the dataset. In other tables, the column that links to the ID column here is called “Form_ID”. 
Here we also see Parameter_ID, which links to the column ID in the ParameterTable and Language_ID which links to the column ID in the LanguageTable.</span></p><table style="background-color: white; border-collapse: collapse; border-spacing: 0px; color: #1f2328; display: block; font-size: 16px; margin-bottom: 16px; margin-top: 0px; max-width: 100%; overflow: auto; width: max-content;"><thead style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Parameter_ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Language_ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Form</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Source</span></th></tr></thead><tbody style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); 
box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15-144_toburn-1</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">144_toburn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">pegew</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">Blust-15-2005</span></td></tr><tr style="background-color: var(--bgColor-muted, var(--color-canvas-subtle)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15-144_toburn-2</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">144_toburn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">tinew</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 
13px;"><span style="font-family: arial;">Blust-15-2005</span></td></tr><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">18-2_left-1</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">2_left</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">18</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">akague</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">38174</span></td></tr></tbody></table><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The ID column here combines the Language_ID, the Parameter_ID and finally a number that distinguishes the forms when there is more than one. For example, because Bintulu has two words for “to burn”, there are two rows with different Forms but the same Parameter_ID (they both mean “to burn”). The ID column, which identifies each form, ends with a number that tells the forms apart.
If there is only one form, the string ends with “-1”, but as you can see for “to burn” there is first “-1” and then “-2”.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);"><span style="font-family: arial;">Source</span></span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Optional file, but often present in the form of a bibTeX-file. One entry = one source. The bibTeX file is usually called “sources.bib”, but not necessarily (check metadata.json as usual). The bibTeX key (the first string after <code style="background-color: var(--bgColor-neutral-muted, var(--color-neutral-muted)); border-radius: 6px; box-sizing: border-box; font-size: 13.6px; margin: 0px; padding: 0.2em 0.4em; white-space-collapse: break-spaces;">@BIBTEXENTRYTYPE{</code>) maps onto the Source column in the FormTable above.</span></p><div class="snippet-clipboard-content notranslate position-relative overflow-auto" style="box-sizing: border-box; color: #1f2328; display: flex; font-size: 16px; justify-content: space-between; margin-bottom: 16px; overflow: auto; position: relative;"><pre class="notranslate" style="border-radius: 6px; box-sizing: border-box; color: var(--fgColor-default, var(--color-fg-default)); font-size: 13.6px; line-height: 1.45; margin-bottom: 0px; margin-top: 0px; overflow-wrap: normal; overflow: auto; padding: 16px;"><code style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border-radius: 6px; border: 0px; box-sizing: border-box; display: inline; font-size: 13.6px; line-height: inherit; margin: 0px; overflow-wrap: 
normal; overflow: visible; padding: 0px; word-break: normal;"><span style="font-family: arial;">@misc{Blust-15-2005,
author = {Blust},
date = {2005},
howpublished = {personal communication}
}
@book{38174,
author = {Topping, Donald M. and Ogo, Pedro M. and Dungca, Bernadita C.},
address = {Honolulu},
publisher = {The University Press of Hawaii},
title = {Chamorro-English dictionary},
year = {1975}
}
</span></code></pre></div><h2 dir="auto" id="user-content-example-wordlist---linking-together" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#example-wordlist---linking-together" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">example: Wordlist - linking together<svg aria-hidden="true" class="octicon octicon-link" height="16" version="1.1" viewbox="0 0 16 16" width="16"><path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 
2.5a1.998 1.998 0 0 0 0 2.83Z"></path></svg></span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Each of the tables has a column called “ID”. This column allows us to link the tables together. The column “Language_ID” in the FormTable maps onto the column “ID” in the LanguageTable, and so on.</span></p><ul dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">Language_ID -> ID column in LanguageTable</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Parameter_ID -> ID column in ParameterTable</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Form_ID -> ID column in FormTable</span></li></ul><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">There is no column “Form_ID” inside the FormTable; it’s just called ID there. The same goes for Parameter_ID and the ParameterTable, and so on.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);">WARNING</span> Some LanguageTables contain a column called “Language_ID” which is <span style="box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600);">not</span> the same as the ID column. For dialects, this column contains the Glottocode of the language that they are a dialect of. For example, Eastern Low Navarrese is a dialect of Basque.
The glottocode of this dialect is east1470. The glottocode of the language Basque is basq1248. If a LanguageTable has the column Language_ID, it would contain basq1248 for the dialect. This helps when you want to match at the language-level rather than the dialect-level. The LanguageTable in Glottolog contains a column of this kind called “Language_ID”. In Grambank, there is a similar column, but it is called “Language_level_ID”.</span></p><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">With the above information, we can now combine the tables if we want. For example, we can rename the ID column in each of the tables to “Language_ID”, “Parameter_ID” and “Form_ID” respectively and then join them together into one new table. In the example below, not all columns are shown for reasons of space. Note that both ParameterTable and LanguageTable contain the column “Name”, so these columns would have to be dropped or otherwise handled (for example renamed to “Parameter_name” and “Language_name”); otherwise the joining would not work correctly.</span></p><table style="background-color: white; border-collapse: collapse; border-spacing: 0px; color: #1f2328; display: block; font-size: 16px; margin-bottom: 16px; margin-top: 0px; max-width: 100%; overflow: auto; width: max-content;"><thead style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Form_ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: 
var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Parameter_ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Language_ID</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Form</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Source</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Glottocode</span></th><th style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; font-weight: var(--base-text-weight-semibold, 600); padding: 6px 13px;"><span style="font-family: arial;">Concepticon_ID</span></th></tr></thead><tbody style="box-sizing: border-box;"><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15-144_toburn-1</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">144_toburn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); 
box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">pegew</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">Blust-15-2005</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">bint1246</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">2102</span></td></tr><tr style="background-color: var(--bgColor-muted, var(--color-canvas-subtle)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15-144_toburn-2</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">144_toburn</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">15</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">tinew</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">Blust-15-2005</span></td><td style="border: 1px solid var(--borderColor-default, 
var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">bint1246</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">2102</span></td></tr><tr style="background-color: var(--bgColor-default, var(--color-canvas-default)); border-top: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box;"><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">18-2_left-1</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">2_left</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">18</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">akague</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">38174</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">cham1312</span></td><td style="border: 1px solid var(--borderColor-default, var(--color-border-default)); box-sizing: border-box; padding: 6px 13px;"><span style="font-family: arial;">244</span></td></tr></tbody></table><h1 dir="auto" id="user-content-clld-and-cldf" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: 
var(--base-text-weight-semibold, 600); line-height: 1.25; margin: 24px 0px 16px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#clld-and-cldf" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">CLLD and CLDF</span></a></h1><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">CLDF (Cross-Linguistic Data Formats) is a data format: a set of tables and their metadata. CLLD (Cross-Linguistic Linked Data) is a larger project, of which CLDF is one part. CLLD also builds web applications, for example <a href="https://clics.clld.org/" rel="nofollow" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://clics.clld.org/</a>. 
CLDF data interfaces smoothly with CLLD web applications.</span></p><h2 dir="auto" id="user-content-advanced" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#advanced" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">Advanced</span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">This document is only a very basic intro. 
If you want to learn more, go to: <a href="https://github.com/cldf/cldf/#readme" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;">https://github.com/cldf/cldf/#readme</a>.</span></p><h2 dir="auto" id="user-content-references" style="background-color: white; border-bottom: 1px solid var(--borderColor-muted, var(--color-border-muted)); box-sizing: border-box; color: #1f2328; font-weight: var(--base-text-weight-semibold, 600); line-height: 1.25; margin-bottom: 16px; margin-top: 24px; padding-bottom: 0.3em;" tabindex="-1"><a class="heading-link" href="https://github.com/HedvigS/personal-cookbook/blob/main/R/cldf_for_dummies.md#references" style="background-color: transparent; box-sizing: border-box; text-decoration-line: none;"><span style="font-family: arial;">References</span></a></h2><p dir="auto" style="background-color: white; box-sizing: border-box; color: #1f2328; font-size: 16px; margin-bottom: 0px; margin-top: 0px;"><span style="font-family: arial;">[1] Forkel, R., List, J. M., Greenhill, S. J., Rzymski, C., Bank, S., Cysouw, M., Hammarström, H., Haspelmath, M., Kaiping, G. A. and Gray, R. D. (2018). Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics. 
Scientific data, 5(1), 1-10.</span></p></div>Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-51517601477666138942021-02-02T06:13:00.010+11:002021-02-02T06:46:58.136+11:00A racist map of the world's languages<div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2HbTm4vLpKzn05FmNihaujjLrFRAaIRbYZuBEAIfA5Dq55eGgJx63-on-jJSpASxGi9IaMqHkqPWlcaFw2fGFfLKtvLa1A81TMVUi8HP0sK0VEOAnaU-VYwQcXOS89D8ReEq57TWvkCOI/s1600/blown+up+gula+rasens+spra%25CC%258Ak.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="392" data-original-width="661" height="236" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2HbTm4vLpKzn05FmNihaujjLrFRAaIRbYZuBEAIfA5Dq55eGgJx63-on-jJSpASxGi9IaMqHkqPWlcaFw2fGFfLKtvLa1A81TMVUi8HP0sK0VEOAnaU-VYwQcXOS89D8ReEq57TWvkCOI/s400/blown+up+gula+rasens+spra%25CC%258Ak.png" width="400" /></a></td></tr><tr><td class="tr-caption"><i><span style="font-size: small;">Detail of world map of languages and races from 1924. <br />The legend outlines language groups of the "yellow race".<br /><br /></span></i></td></tr></tbody></table><div>In order to move forward towards racial and social justice as a discipline we must become familiar with our history and the ways in which racist, colonialist, sexist and classist ideas are still present in our academic spaces. I would like to present to you a concrete piece of evidence of our racist past in particular - a world-map where languages are grouped into three races: white, yellow and black. 
</div><div><br /></div>I started writing this blog post several years ago, but ended up not posting it because I felt there was so much to say and I wasn't sure I was the right person to say it, nor able to cover all related content in a fair and accessible manner. With the recent publication of the article <a href="https://muse.jhu.edu/article/775377" target="_blank"><i>Toward racial justice in linguistics: Interdisciplinary insights into theorizing race in the discipline and diversifying the profession</i> by Charity Hudley, Mallinson & Bucholtz in Language</a> (and the responses and commentary), I felt inspired to share this map now - even if the post will be shorter than I had originally imagined. I think it is essential that we as linguists fully internalise that racism was not confined to a few fringe individuals; it was <i>and is</i> an ideology which permeates everything - including science and research. </div><div><br /></div><div>Charity Hudley, Mallinson and Bucholtz outline several steps which they believe linguistics as a discipline needs to take in order to advance towards racial justice. 
One of them is:</div><div><ul style="text-align: left;"><li><span style="font-family: inherit;"><i><span style="background-color: white; caret-color: rgb(10, 10, 10); color: #0a0a0a;"> Fully acknowledge the ongoing legacy of the field's history of racism and colonialism (</span><a class="rid-text animated_focus_ref ref-hover" data-hasqtip="true" href="https://muse.jhu.edu/article/775377#b42" name="b42-text" style="box-sizing: inherit; color: #284f84; cursor: pointer; line-height: inherit; text-decoration-skip: objects; text-decoration: none; word-wrap: break-word;">Bolton & Hutton 2000</a><span style="background-color: white; caret-color: rgb(10, 10, 10); color: #0a0a0a;">, </span><a class="rid-text animated_focus_ref ref-hover" data-hasqtip="true" href="https://muse.jhu.edu/article/775377#b88" name="b88-text" style="box-sizing: inherit; color: #284f84; cursor: pointer; line-height: inherit; text-decoration-skip: objects; text-decoration: none; word-wrap: break-word;">Errington 2001</a><span style="background-color: white; caret-color: rgb(10, 10, 10); color: #0a0a0a;">, </span><a class="rid-text animated_focus_ref ref-hover" data-hasqtip="true" href="https://muse.jhu.edu/article/775377#b135" name="b135-text" style="box-sizing: inherit; color: #284f84; cursor: pointer; line-height: inherit; text-decoration-skip: objects; text-decoration: none; word-wrap: break-word;">Leonard 2018</a><span style="background-color: white; caret-color: rgb(10, 10, 10); color: #0a0a0a;">)</span></i></span></li></ul><div><span style="color: #0a0a0a;"><span style="caret-color: rgb(10, 10, 10);">In the spirit of that, I'd like to talk about a </span></span>Swedish map from 1924 depicting the languages of the world, coloured by language families and grouped into races. </div><div><br /></div><div><span style="font-family: inherit;">This map is part of a set of maps which was made by the Swede <a href="http://nl.wikipedia.org/wiki/Sten_de_Geer">Docenten Friherren Sten De Geer</a>. 
De Geer was docent in geography, which is a higher academic degree than doctor. He specialised in economic geography and ethnographic/cultural geography. These maps were published in an atlas by Åhlen & Holm AB for the Swedish Red Cross. The atlas was made to educate the general public of Sweden and Scandinavia about the world's countries and people.</span></div><div><br /></div><div>The map in question is found on the lower part of the image below. The image contains two maps, the upper one depicts religions and the lower languages. All languages of the world are grouped into linguistic groups (such as Germanic, Bantu or Malayic) and subsumed under a super-category of race. For example, Germanic languages are of the "white race", Bantu "black" and Malayic are of the "yellow race".</div><div><br /></div>I've included both a colour and black-and-white version for easier readability, and below I've translated the Swedish text into English. If you want to see a higher resolved version, <a href="https://drive.google.com/drive/folders/1_E_XZfODhs-awY-2MlHQRaqM7f6KtkI0?usp=sharing" target="_blank">go here</a>.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhopDlCceeaEHVOagWyWbJ7tHCtY_h7mHzAHUjhS8DlEpdlq96Dnc1BCoR-cw2maas6jxZEnnteYctHLo_hKIaQ8aEHQwZJRvD3J9xSzZv60cjJgTK8f1bOjMzhiDWr9QuTCfTS_ue2wEWJ/s1600/varldskarta_sprak_de_geer_1924_Page_3.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhopDlCceeaEHVOagWyWbJ7tHCtY_h7mHzAHUjhS8DlEpdlq96Dnc1BCoR-cw2maas6jxZEnnteYctHLo_hKIaQ8aEHQwZJRvD3J9xSzZv60cjJgTK8f1bOjMzhiDWr9QuTCfTS_ue2wEWJ/s640/varldskarta_sprak_de_geer_1924_Page_3.jpg" width="640" /></a></div><a href="http://odtmaps.com/images/peters6.gif" style="clear: left; float: left; margin-bottom: 1em; margin-left: 1em;"></a><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIv8R9ZLCrI5PhDQ7XjOaqr0mUurGQWkU44sYwwc2-zRMgdD8HfQNsoixTzB4HEQvsLrH30hJgjLvo7-F7dZztD3UhWdS_gT85om2qqx_cVylmDYdAc1sBmNKVj4NiZfG86_G5HkkMGblL/s1600/varldskarta_sprak_de_geer_1924_Page_1.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIv8R9ZLCrI5PhDQ7XjOaqr0mUurGQWkU44sYwwc2-zRMgdD8HfQNsoixTzB4HEQvsLrH30hJgjLvo7-F7dZztD3UhWdS_gT85om2qqx_cVylmDYdAc1sBmNKVj4NiZfG86_G5HkkMGblL/s1600/varldskarta_sprak_de_geer_1924_Page_1.png" width="640" /></a>________________________________________<br />Below is the text from the lower map of languages and races translated from Swedish to English by me. I've stayed as close to the original as possible, except in some cases where I use a more well-known English label for a category, for example "syrjänska" = "Komi".<br />________________________________________<br /><br /><b>Translation:</b><div><i><u><br /></u></i><i><u>The white race's languages (Inflectional languages)</u></i><br /><i>Germanic</i><br /><i>Romance</i><br /><i>- French, Spanish, Catalan, Italian and Romanian</i><br /><i>- Portuguese</i><br /><i>Slavic languages</i><br /><i>Greek and Albanian languages</i><br /><i>Iranian languages, together with Armenian and Caucasian</i><br /><i>Hindu languages</i><br /><i>Singalese (Hindi-Dravidian)</i><br /><i>Latvian & Lithuanian</i><br /><i>Celtic</i><br /><i>Basque</i><br /><i>Semitic</i><i> languages</i><br /><i>Hamitic</i><i> languages</i><br /><i><br /></i><i><u>The yellow race's languages (Agglutinating languages)</u></i><br /><i>Uralic languages </i><br /><i> - Hungarian and Eastern Khanty</i><br /><i> - Finnish, Estonian, Volga-Finnish and Sami</i><br /><i> - Komi and Samoyedic</i><br /><br /><i>Altaic languages</i><br /><i> - Turkic languages</i><br /><i> - Mongolian </i><i>languages</i><br /><i> - Tungusic</i><i> </i><i>languages</i><br /><i> - Japanese and 
Korean</i><br /><i> - Chukchi </i><i>languages</i><br /><i> - Aino</i><br /><br /><i>Dravidian languages (spoken by negroid people)</i><br /><i>Munda languages (spoken by negroid people)</i><br /><br /><i>Incorporation languages</i><br /><i>- Indian languages</i> <br /><i>- Eskimo and Aleut</i><br /><i><br /></i><i>Stem-isolating languages </i><br /><i> - Malay languages (including Polynesian)</i><br /><i>Isolating languages</i><br /><i>- Isolating</i><br /><i><br /></i><i><u>The black race's languages (prefixing languages)</u></i><br /><i>Bantu (spoken by negroid people)</i><br /><i>Other negroid languages</i><br /><i> - Sudanese languages</i><br /><i> - Hottentot and Bushman languages</i><br /><i> - Papuan</i><br /><i> - West-Papuan (Negrito language)</i><br /><i> - Australnegro languages</i><br />________________________________________<div><br /></div><div>There is a lot to say here, and I am sure I will not be able to cover it all. I do not have the accompanying information about the construction or design of the map (if it exists). </div><div><br /></div><div>Let us start by making a few specific observations:</div><div><br /></div><div><div>- Just so it is said: as far as I am aware, historical linguists at this time (1924) did not propose that they had linguistic evidence of the three race groups as meaningful linguistic units. If they did propose such a thing, it must have been with skin colour as the sole evidence - not the comparative method and vocabulary.</div></div><div><br /></div><div>- The division into three races that the map makes, "white", "yellow" and "black", is probably influenced by the <a href="http://en.wikipedia.org/wiki/Georges_Cuvier#Racial_studies">religious and racist researcher Georges Cuvier.</a> He proposed that humans should be divided into three races, the Caucasian (white), Mongolian (yellow), and the Ethiopian (black). From Wikipedia: <i>Cuvier claimed that Adam and Eve were Caucasian, the original race of mankind. 
The other two races originated by survivors escaping in different directions after a major catastrophe hit the earth 5,000 years ago, with those survivors then living in complete isolation from each other. Cuvier categorized these divisions he identified into races according to his perception of the beauty or ugliness of their skulls and the quality of their civilizations.</i></div><div><span> <span> </span></span>Just.. sit with that for a while before we go on. Read it one more time.</div><div><br /></div><div>- The creator of the map is himself struggling with the exclusive division of the languages into the three races, placing Dravidian and Munda languages under "yellow", but noting that they are spoken by "negroid people". I don't know what to make of that really, but it is noticeable.</div><div><br /></div><div>- The level of granularity alone betrays a racist attitude. There is much more detail to the various sub-groupings of Indo-European languages, while the rest of the world is represented in far, far less detail. For example, most East Asian languages are included under the label "Isolating". "Isolating" is a description of the grammar of a language, not a genealogical group. Even though our understanding of the relationship between East Asian languages was not as advanced in 1924 as it is now, the category "Tibeto-Burman" did in fact exist. The map was made for a Scandinavian audience which may mean they expected finer detail closer to home, but even so the coarseness of categorisation outside of Europe is extreme. The lack of attention betrays a lack of care.</div><div><br /></div><div>- Indigenous languages of the Americas and Australia are lumped together into two respective groups even though they each consist of very large numbers of different language families. For example, the category labelled "Indian" on the map encompasses 1,000+ languages of 160+ families. 
Contrast this with Europe and our 230-ish languages where we're even getting distinctions of lower genealogical subgroups. To call "Indian" a shit category is the understatement of the century, especially considering the level of detail in other areas. Perhaps you think they just didn't know better? The "grandfather of modern anthropology" Franz Boas published his first volume of the Handbook of American Indian Languages in 1911 where he, among other things, described the diversity of the Americas and listed 55 families north of Mexico. So, knowledge and information existed. If in 1924 you were to get the assignment to make a map of the world's languages and you specialised in ethnography, information existed which could be used. (You can read the introduction <a href="http://etnolinguistica.wdfiles.com/local--files/biblio%3Aboas-1911-introduction/boas_1911_introduction.pdf">of that book here</a>. It's a worthwhile read, Boas discusses race at great length.) Again, lack of attention shows a lack of care from the creator.</div><div><br /></div><div><div>- The depiction of the distribution of indigenous languages of the Americas, Australia and Aotearoa (New Zealand). I take it that the intention of the mapmaker was to only show the most common language in a given place and that this is why most of these regions are depicted only with the colonial European languages, with pockets of "Indian languages" in the Americas, "Australnegro" languages in Australia and a small section of "Malay" languages in Aotearoa. These red and pinkish places which appear entirely conquered by European colonisers were in many cases multilingual. These are places where indigenous languages were spoken and are still spoken today. To make the choice to not depict these languages at all.. it is a choice that perpetuates the erasure of indigenous people.</div><div> </div></div><div>Okay. I think I'll stop there for now lest my blood pressure gets too high. 
Feel free to leave more observations of this map in the comment section.</div><div><br /></div><div>I think that there is a common misconception that racism was essentially something unique to Germany in the 1930s. Depictions of non-German Europe during the Second World War often seem to entirely erase the racism <a href="https://www.independent.co.uk/news/uk/politics/not-his-finest-hour-dark-side-winston-churchill-2118317.html" target="_blank">we know was there</a> (<a href="https://en.wikipedia.org/wiki/Trollhättan_school_attack" target="_blank">and still is</a>). For science in particular, there is a great tendency to ignore the racist legacy of our disciplines and to fail to address its continuation and the impact it has on scholars and research today.</div><div><br /></div><div>I am a Swedish white woman and I grew up in the university town of Uppsala. Uppsala was the Swedish centre of research into "race-biology" from 1922 to 1958. This involved the measurement of skulls and the production of research which aimed at providing the state with scientific justifications for "exact racial hygiene" and "rational population policy". I didn't know exactly what this entailed until I became an adult, but growing up I did know the location of "rasbiologiska institutet" - it's a local landmark and reference point still.</div><div><br /></div><div>The welfare state of Sweden in the 1950s and 1960s was racist and supported by scientists. Antiziganism was particularly strong. In Scandinavia there is a group of Romani people commonly known as <a href="http://en.wikipedia.org/wiki/Norwegian_and_Swedish_Travellers">'travellers'</a>. They have lived there for over 500 years. We sterilised these people, as well as people who were deemed "slow of the mind" or otherwise genetically undesirable, without their consent. This was explicitly done for the sake of "eugenic hygiene". This was legal and encouraged between 1934 and 1975. 
In total, approximately 63,000 people were sterilised without consent. The peak was in 1947-48, i.e. <i>after</i> the war had ended.</div><div><br /></div><div>The Swedish nation state has also abused the indigenous people of the land - the Sami. This involved among other things taking children from their families, hurting them physically and emotionally - robbing them of their past and ruining their future. </div><div><br /></div><div>I can highly recommend watching the movie "<a href="https://en.wikipedia.org/wiki/Sami_Blood" target="_blank">Sami Blood</a>" which depicts this abuse. It features a particular scene, linked below, where the northern regional school for Sami children gets a visit from intellectual men from down south, from Uppsala. As the event was announced to the children, I distinctly remember thinking "oh fun that's where I'm from!" before half a second later it hit me: there is only one reason Uppsala men are visiting a school like this in the 1930s. They are there to measure these children, to collect data on their racial inferiority and justify their continued mistreatment.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="BLOG_video_class" height="343" src="https://www.youtube.com/embed/Dz-iJI9QpiE" width="636" youtube-src-id="Dz-iJI9QpiE"></iframe></div><div style="text-align: center;"><i>Scene from the movie Sami Blood where the main character, </i><i>Elle-Marja, is examined </i></div><div style="text-align: center;"><i>and measured by visitors </i><i>from the race-biology research centre in Uppsala.</i></div><div><br /></div><div>There is another scene, which is perhaps even more relevant to us as linguists, where Elle-Marja meets a group of "well-meaning" white anthropology students in Uppsala. They pressure her to perform jojk, a Sami style of singing. 
It becomes clear that to them she is an ethnological curiosity - not a fellow teenager.</div><div><br /></div><div>Abusive treatment of "undesirables", ethnic minorities and indigenous people is by no means unique to Sweden, but I believe many Swedes and people outside of Sweden know little of this. These events are not isolated, old or forgotten - they are recent and reverberate through to today. Forced sterilisations for eugenic reasons were performed in Sweden all the way into the 1960s and were legal until 1975. The Canadian Indian Residential School system which abused Indigenous children existed for over 150 years (1840s to 1996). The Australian government took indigenous children from their families from 1909 until 1969. The list goes on. Read more and let it all sink in. Learn whose land you live on and the history of the place and the nation state there. These are stolen generations, this can never be undone. This is not old, this is recent. Let it become part of your world - because it already is. Whether you like it or not.</div><div><br /></div><div>The map which prompted this post was published in 1924, a decade before the eugenic forced sterilisation laws came into effect in Sweden. Extremely racist ideas were included in educational material for the general public without question. It was obvious, this is the way things are and this is what we should tell our children. There are three races, and here are their languages.</div><div><br /></div><div>Not only are these events more recent than you may like to think, racism is still around - also in academia. For this post, I focussed on events decades past but which have consequences for modern science as well. We are still a discipline which neglects languages and people outside of Europe, both as fellow scholars and as topics worthy of honest and respectful research. 
If you haven't already, go read the paper by <a href="https://muse.jhu.edu/article/775377" target="_blank">Charity Hudley, Mallinson and Bucholtz in Language</a> and the following responses to learn more about race in linguistics as a discipline today.</div></div>Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com1tag:blogger.com,1999:blog-1300680252997007251.post-40955473301784295842020-04-08T01:09:00.001+10:002020-04-14T06:12:50.695+10:00Online resources on linguistic typology and beyond<span style="font-family: inherit;">Many Humans Who Read Grammars are also teachers of some kind, myself included. With the world-wide outbreak of COVID-19, most of this teaching has been forced out of the classroom and now happens remotely. This comes with one benefit: if someone can tell it better than you, and a video of it happens to be on YouTube, get your students to watch that lecture! So, please find some resources on linguistic typology & co below.</span><br />
<span style="font-family: inherit;"><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgP2aJkLpMddopwBbtF5pVRlMmKbqFxpxnbxdjUHycOZJpANjvn17ge2LH9amKIczQZ4rkU0C6E4jwTD3_BccM4wOYiEjzhvZ2MO3INlRqoiu6ZJzvaK8MHw7pq6rfZf-N54YgsLvaFQvFP/s1600/matrix.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="color: black;"><img border="0" data-original-height="619" data-original-width="1100" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgP2aJkLpMddopwBbtF5pVRlMmKbqFxpxnbxdjUHycOZJpANjvn17ge2LH9amKIczQZ4rkU0C6E4jwTD3_BccM4wOYiEjzhvZ2MO3INlRqoiu6ZJzvaK8MHw7pq6rfZf-N54YgsLvaFQvFP/s640/matrix.jpg" width="640" /></span></a></span><br />
<span style="font-family: inherit;">First up is a short list of YouTube videos on linguistic typology and some related topics.</span><br />
<span lang="EN-US" style="font-family: inherit;"><br /></span>
<span lang="EN-US" style="font-family: inherit;">There is <a href="https://www.youtube.com/watch?v=a6tbdf4YKgs&list=PL853CF986474D4193" target="_blank">an entire MA course called 'Language Typology' from the Virtual Linguistic Campus</a> (Uni Marburg). Here is <a href="https://www.youtube.com/channel/UCaMpov1PPVXGcKYgwHjXB3g" target="_blank">a link to the Virtual Linguistic Campus</a>, featuring many more lecture series on topics in linguistics. The same goes for <a href="https://www.youtube.com/watch?v=QhkmVNhHVTo&list=PL3GSOTyyo3Idi9C7S8vxDhRjftd_iyg2L" target="_blank">this course here</a>, a full course from the <a href="https://www.youtube.com/channel/UCYa1WtI-vb_bx-anHdmpNfA/playlists" target="_blank">NPTEL-NOC IITM channel</a>, which hosts a lot of courses, including some on linguistics. A <a href="https://www.youtube.com/watch?v=8ql8wLTME3w&list=PLsqvR-P4ezrN4FjZ4ApxDC2t-PwwXeAUl" target="_blank">set of mini-lessons in linguistic typology by Isabel Cooke McKay</a>, including topics such as phonological typology, 'typology of force' (imperatives and interrogatives), and subordinate clauses, can also be found on YouTube. Tom Scott has made some fun relevant videos, including on <a href="https://www.youtube.com/watch?v=bxARj07jFp0" target="_blank">morphological typology</a> and <a href="https://www.youtube.com/watch?v=QYlVJlmjLEc" target="_blank">on some very useful grammar features that English lacks</a>; see all of them <a href="https://www.youtube.com/playlist?list=PL96C35uN7xGLDEnHuhD7CTZES3KXFnwm0" target="_blank">here</a>. 
Lastly, here </span><span style="font-family: inherit;">is </span><a href="https://youtu.be/af2T3nTsGFI" style="font-family: inherit;" target="_blank">a short intro to linguistic typology</a><span style="font-family: inherit;"> made by yours truly, because none of the above really covered what I wanted to convey. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Five years ago, <a href="https://humans-who-read-grammars.blogspot.com/2015/05/excellent-educational-video-about.html" target="_blank">Hedvig wrote a post</a> about two excellent </span><span style="font-family: inherit;">educational videos about language history/evolution, one on <a href="https://youtu.be/iWDKsHm6gTA" target="_blank">Peter Whiteley and Ward Wheeler's project to map the evolution of Uto-Aztecan languages</a> and one on <a href="https://youtu.be/iWDKsHm6gTA" target="_blank">How languages evolve by Alex Gendler</a>. Here is <a href="https://www.youtube.com/watch?v=nd5cklw6d6Q" target="_blank">another lecture on that topic by Michael Corballis</a>. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Another cool video deals with </span><a href="https://www.youtube.com/watch?v=gMqZR3pqMjg" style="font-family: inherit;" target="_blank">Berlin and Kay's (1969) implicational hierarchy of basic color terms</a><span style="font-family: inherit;">. The famous TEDx talk by Lera Boroditsky on linguistic relativity is </span><a href="https://www.youtube.com/watch?v=RKK7wGAYP6k" style="font-family: inherit;" target="_blank">here</a><span style="font-family: inherit;">, and there is </span><a href="https://www.youtube.com/watch?v=sgXbq1iTrHo" style="font-family: inherit;" target="_blank">another TEDx talk on the same topic by Petrina Nomiko</a><span style="font-family: inherit;">. There are a lot of other TEDx talks on linguistics, on a variety of topics, including </span><a href="https://www.youtube.com/watch?v=D7HZOsQYx_U" style="font-family: inherit;" target="_blank">language endangerment</a><span style="font-family: inherit;"> and </span><a href="https://www.youtube.com/watch?v=g2HiPW_qSrs" style="font-family: inherit;" target="_blank">the importance of linguistic diversity</a><span style="font-family: inherit;">. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Hedvig also posted earlier about the lecture sets from the ARC Centre of Excellence for the Dynamics of Language (CoEDL) <a href="https://itunes.apple.com/au/institution/arc-centre-excellence-for/id990808391" target="_blank">that can be found on iTunes Uni</a>, among which are a set of lectures called 'Language shape', dealing with multilinguality, diversity, and language documentation and description, and a lecture series on language evolution. <a href="https://humans-who-read-grammars.blogspot.com/2017/03/podcasts-of-linguistic-seminars-from.html" target="_blank">Her original post describes</a> how non-iTunes Uni users can get access to them. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Another series of lectures can be found </span><a href="https://www.youtube.com/channel/UCtOYOn_LK7g4ki3fkCuAzhA" style="font-family: inherit;" target="_blank">on this channel (mostly in Russian)</a><span style="font-family: inherit;">. A tonne of lectures on typological topics can be found </span><a href="https://www.youtube.com/channel/UCCUr096WDp86n62CXBeHlQw" style="font-family: inherit;" target="_blank">on the Vidya-mitra channel</a><span style="font-family: inherit;">, in a variety of languages, but I haven't really figured out how these are structured yet. Here is <a href="https://www.youtube.com/watch?v=JP-1GGUMeqM&list=PL_a1TI5CC9RFWj6mV0DPV8LWx0NXL8l_e" target="_blank">a playlist of lectures on various typological topics</a>, but I don't think it is set up as an online course. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Here are some documentaries on language revitalisation:</span><br />
<a href="https://voicesontherise.ca/" style="font-family: inherit;" target="_blank">Voices on the Rise</a><span style="font-family: inherit;">: Indigenous Language Revitalization in Alberta:</span><br />
<a href="https://www.youtube.com/watch?v=-dtEujiPUE0" style="font-family: inherit;">https://www.youtube.com/watch?v=-dtEujiPUE0</a><span style="font-family: inherit;"> (Voices on the Rise, episode 1)</span><br />
<a href="https://www.youtube.com/watch?v=g0UH1IhBnNk" style="font-family: inherit;">https://www.youtube.com/watch?v=g0UH1IhBnNk</a><span style="font-family: inherit;"> (Voices on the Rise, episode 2)</span><br />
<a href="https://www.youtube.com/watch?v=YZgJ8TZ0Zs0" style="font-family: inherit;">https://www.youtube.com/watch?v=YZgJ8TZ0Zs0</a><span style="font-family: inherit;"> (Voices on the Rise, episode 3)</span><br />
<a href="https://www.youtube.com/watch?v=vqldHZUaF-c" style="font-family: inherit;" target="_blank">Karihwanoron: Precious Things (with Kanien'kéha/Mohawk subtitles)</a><br />
<a href="https://www.youtube.com/watch?v=Wr-jackHWCw" style="font-family: inherit;" target="_blank">Rising Voices / Hótȟaŋiŋpi - Revitalizing the Lakota Language</a><span style="font-family: inherit;">; see also </span><a href="https://risingvoicesfilm.com/" style="font-family: inherit;" target="_blank">the official website</a><span style="font-family: inherit;">. </span><br />
<span style="font-family: inherit;">And there is a lot more to be found on youtube, also older stuff, like <a href="https://www.youtube.com/watch?v=x7BLBUS1IXc" target="_blank">'Why save a language' by Sally Thompson (2006)</a>.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">This is some of the stuff that I will be assigning to my students - hit me with your best videos, podcasts, and other online resources below!</span><br />
<br />
Posted by Annemarie Verkerk<br />
<br />
A decade of state-of-the-art quantitative methods in linguistic typology (2020-02-20)<br />
<span style="font-family: inherit;">Some turning points in linguistic typology are easily recognised, such as the ground-breaking work by Joseph Greenberg on implicational universals entitled "Some universals of grammar with particular reference to the order of meaningful elements" (Greenberg 1963). Other turning points are less well-defined, less commonly associated with a single paper, or a specific typologist, team, or place. But there was definitely something in the water during, let's say, a period centred around 2010 – a change that we could call the quantitative turn in linguistic typology. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Linguistic typologists have long recognised that the languages of the world are related in various ways, most importantly in nested arrays of hierarchical descent (genealogy) as well as in so-called <i>Sprachbunds</i> or linguistic areas (geography). For a long time, in work reaching from Bell (1978) all the way to Bakker (2010), these interdependencies were viewed as something of a nuisance, something to get rid of by sampling: only languages from as many different and independent genealogical and geographical units as possible are included in one's study. Correlations and distributions are then assessed using contingency tables and statistical tests that evaluate differences in counts, most importantly Fisher's Exact test and Pearson's chi-squared test.</span><br />
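<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">To make the sampling-era workflow a bit more concrete, here is a minimal sketch of a two-sided Fisher's Exact test on a 2x2 contingency table in plain Python. The counts are invented purely for illustration and come from no real sample, and the function name <code>fisher_exact_2x2</code> is hypothetical; in practice one would normally use an existing statistics package.</span><br />

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's Exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Add up every possible table that is no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )


# Invented counts for a sample of independent languages (illustration only):
# rows = verb-final vs. verb-initial, columns = postpositions vs. prepositions.
hypothetical_table = [[12, 3], [2, 13]]
p_value = fisher_exact_2x2(12, 3, 2, 13)
print(f"two-sided p = {p_value:.5f}")
```

<span style="font-family: inherit;">Note that the test treats every language as an independent observation, which is exactly why this approach depends on careful sampling.</span><br />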
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Between 2008 and 2010 or so, something changed. Linguistic typologists started using methods that no longer relied on sampling techniques. (At least) three more advanced quantitative methods for linguistic typology emerged: </span><br />
<br />
<ol>
<li><span style="font-family: inherit;">The Family Bias method, which estimates statistical biases in distributions of typological variables across and within language families (big families as well as small families, including isolates). The method is set out in Bickel (2013), but also in Bickel (2011) and (2015), and has a longer history: an earlier manuscript (Bickel 2008) mentions talks on the topic given as early as 2006. </span></li>
<li><span style="font-family: inherit;">Generalized linear mixed-effect models (GLMMs) and other regression models. These model a dependent or response variable in terms of independent or predictor variables and are widely used both in linguistics and outside it (Coupé 2018). The benefit for typologists is that there are various ways to include information on genealogy and geography in the analysis, so that apparent associations between variables are not merely artefacts of shared history or areality. </span></li>
<li><span style="font-family: inherit;">Phylogenetic comparative methods are a set of tools adopted from evolutionary biology, another discipline which studies the characteristics of entities with long and pertinent histories. <span style="font-family: inherit;">These methods model the evolution and distribution of cross-linguistic data on phylogenetic trees. Their first application was Dunn et al. (2011), the famous paper on lineage-specific trends in word order universals. </span></span></li>
</ol>
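<span style="font-family: inherit;">As a very rough illustration of the first idea in the list above, the sketch below asks, family by family, whether one value of a binary typological variable is over-represented, using an exact two-sided binomial test. This is a simplification under stated assumptions, not the Family Bias method as published: the family names, the counts and the 0.1 threshold are all invented, the function name is hypothetical, and the method described in Bickel (2013) does considerably more, notably in how it handles small families and isolates.</span><br />

```python
from math import comb


def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: total probability of all outcomes
    that are no more likely than observing k successes out of n."""

    def pmf(i):
        return comb(n, i) * p**i * (1 - p) ** (n - i)

    p_obs = pmf(k)
    return sum(pmf(i) for i in range(n + 1) if pmf(i) <= p_obs * (1 + 1e-9))


# Invented data: family -> (languages with value A, languages sampled).
families = {"FamilyX": (9, 10), "FamilyY": (5, 10)}

# Assumed threshold for calling a family "biased"; chosen for illustration only.
THRESHOLD = 0.1

biased = {
    name: binomial_two_sided(k, n) < THRESHOLD
    for name, (k, n) in families.items()
}
print(biased)
```

<span style="font-family: inherit;">The point of the sketch is only that the family, rather than the individual language, becomes the unit over which evidence for a bias is counted.</span><br />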
<span style="font-family: inherit;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgV41mmMAAnmugM3pcHRbvmu05bqDnbGP_SosdWslr83u5fAb8AzLuw0eOpT4gD4Gp8bmplDcbAP7fuhfBIvAKqdAPScMxwkjnbLHjShFbyf9zNUbovziQNTUlznvXriQLkYt5r0Dd0cMoc/s1600/46487965_1887844931304096_9110826761075032064_o.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="543" data-original-width="1241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgV41mmMAAnmugM3pcHRbvmu05bqDnbGP_SosdWslr83u5fAb8AzLuw0eOpT4gD4Gp8bmplDcbAP7fuhfBIvAKqdAPScMxwkjnbLHjShFbyf9zNUbovziQNTUlznvXriQLkYt5r0Dd0cMoc/s1600/46487965_1887844931304096_9110826761075032064_o.jpg" /></a></div>
<div style="text-align: center;">
<span style="font-family: inherit;">(c) <a href="https://xkcd.com/2048/" target="_blank">xkcd</a></span></div>
</span><span style="font-family: inherit;"><div>
<span style="font-family: inherit;"><br /></span></div>
I'm not quite sure who first introduced regression models and, later, generalized linear mixed-effect models to linguistic typology. Sinnemäki (2010) is certainly one of the first; in his 2011 dissertation he describes how he got the idea of regression modelling from a lecture by Balthasar Bickel in March 2008. Other studies employing regression models that came out around the same time are Cysouw (2010) and Bakker et al. (2011). Good examples of fruitful usage of these methods are Moran et al. (2012), who falsify the idea, first proposed by Hay and Bauer (2007), that there is a positive relationship between population size and phoneme inventory size, and Atkinson (2011), who proposes a negative relationship between phoneme inventory size and distance from West Africa. The latter paper triggered a lot of replies on the typological and sociolinguistic measures used in linguistic typology and the appropriate statistics for evaluating interactions between them.</span><br />
<span style="text-indent: -32px;"><span style="font-family: inherit;"><br /></span></span>
<span style="font-family: inherit;">Out of these three methods, it seems that generalized linear mixed-effect models are now fast becoming the most widely used statistical tool in the typologist's belt, with a flurry of recent papers (recent meaning 2018/19) such as Gast and Koptjevskaja-Tamm (2018), Lester et al. (2018), Schmidtke-Bode and Levshina (2018), Sinnemäki and Di Garbo (2018), Schmidtke-Bode (2019), and Sinnemäki (2019). The most important recent paper from a methodological perspective is Coupé (2018), who reports on the recent usage of generalized linear mixed-effect models but goes much further by introducing generalized additive models for location, scale, and shape to linguistic typology. </span><br />
<br />
<span style="font-family: inherit;">It's 2020! Palindrome Day has come and gone, and anyway – this year marks the first full decade after the quantitative turn in linguistic typology. That is, if we want to put down the year 2010 for that – it had certainly been coming on for a few years by 2010, so feel free to argue with me on that. I think that it's absolutely wonderful that these methods are being used and developed further, and I hope to see this particular element of doing typology flourish in years to come.</span><br />
<br />
<span style="font-family: inherit;">I also hope to have made the teeny-tiniest contribution to this flourishing by teaching a one-week course on quantitative methods in typology for </span><a href="https://lotschool.nl/events/lot-winter-school-2020/" style="font-family: inherit;" target="_blank">LOT</a><span style="font-family: inherit;"> last month. We made an overview of the pros and cons of these three methods (plus good old-fashioned sampling) in class – so see below, there's your guide to your method of choice, depending on how hard-core you want to go (Ben Bolker's <a href="https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#other-sources-of-help" target="_blank">GLMM disclaimers</a> are always a nice way to come back down to earth).</span><br />
<span style="font-family: inherit;"><br /></span>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEAsLXDNcAr1z1yL7Iq9p3wGKgjr9VeW1YWIuLVSk1wHyczjvCeiw6txvY7QM1U3iP9QmZK1WO1_bWb85gS11nIK4TUpHEm7nFKwQs7La34-NLptG4RyHVeGQ-_cKKsUE8nDD4TXEz9sxi/s1600/IMG_0957.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: inherit;"><img border="0" data-original-height="1200" data-original-width="1600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEAsLXDNcAr1z1yL7Iq9p3wGKgjr9VeW1YWIuLVSk1wHyczjvCeiw6txvY7QM1U3iP9QmZK1WO1_bWb85gS11nIK4TUpHEm7nFKwQs7La34-NLptG4RyHVeGQ-_cKKsUE8nDD4TXEz9sxi/s1600/IMG_0957.jpg" /></span></a></div>
<br />
<span style="font-family: inherit;"><b>References</b></span><br />
<span style="font-family: inherit;"><br /></span>
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1126%2Fscience.1199295&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Phonemic%20Diversity%20Supports%20a%20Serial%20Founder%20Effect%20Model%20of%20Language%20Expansion%20from%20Africa&rft.volume=332&rft.issue=6027&rft.aufirst=Q%20D&rft.aulast=Atkinson&rft.au=Q%20D%20Atkinson&rft.date=2011-04&rft.pages=346%E2%80%93349&rft.spage=346&rft.epage=349"></span></span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Atkinson, Quentin D. 2011. ‘Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa’. <i>Science</i> 332 (6027): 346–349.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bakker, Dik. 2010. ‘Language Sampling’. In <i>The Oxford Handbook of Linguistic Typology</i>, edited by Jae Jung Song. Oxford: Oxford University Press.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bakker, Peter, Aymeric Daval-Markussen, Mikael Parkvall, and Ingo Plag. 2011. ‘Creoles Are Typologically Distinct from Non-Creoles’. <i>Journal of Pidgin and Creole Languages</i> 26 (1): 5–42.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bell, Alan. 1978. ‘Language Samples’. In <i>Universals of Human Languages, Volume 1: Method - Theory</i>, edited by Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, 123–56. Stanford: Stanford University Press.</span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1075%2Fjpcl.26.1.02bak&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Creoles%20are%20typologically%20distinct%20from%20non-creoles&rft.jtitle=Journal%20of%20Pidgin%20and%20Creole%20Languages&rft.stitle=JPCL&rft.volume=26&rft.issue=1&rft.aufirst=Peter&rft.aulast=Bakker&rft.au=Peter%20Bakker&rft.au=Aymeric%20Daval-Markussen&rft.au=Mikael%20Parkvall&rft.au=Ingo%20Plag&rft.date=2011-02-17&rft.pages=5-42&rft.spage=5&rft.epage=42&rft.issn=0920-9034%2C%201569-9870&rft.language=en"></span></span></div>
</div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=Language%20sampling&rft.place=Oxford&rft.publisher=Oxford%20University%20Press&rft.aulast=Dik%20Bakker&rft.au=Dik%20Bakker&rft.au=Jae%20Jung%20Song&rft.date=2010"></span></span></div>
</div>
<div class="csl-entry">
<span style="font-family: inherit;">Bickel, Balthasar. 2008. ‘A General Method for the Statistical Evaluation of Typological Distributions’. Manuscript, University of Leipzig.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bickel, Balthasar. 2011. ‘Statistical Modeling of Language Universals’. <i>Linguistic Typology</i> 15: 401–414.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bickel, Balthasar. 2013. ‘Distributional Biases in Language Families’. In <i>Language Typology and Historical Contingency: In Honor of Johanna Nichols</i>, edited by Balthasar Bickel, Lenore A. Grenoble, David A. Peterson, and Alan Timberlake, 415–44. Amsterdam: John Benjamins.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Bickel, Balthasar. 2015. ‘Distributional Typology: Statistical Inquiries into the Dynamics of Linguistic Diversity’. In <i>The Oxford Handbook of Linguistic Analysis, 2nd Edition</i>, edited by Bernd Heine and Heiko Narrog, 901–23. Oxford: Oxford University Press.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Coupé, Christophe. 2018. ‘Modeling Linguistic Variables with Regression Models: Addressing Non-Gaussian Distributions, Non-Independent Observations, and Non-Linear Predictors with Random Effects and Generalized Additive Models for Location, Scale and Shape’. </span><i style="text-indent: -2em;">Frontiers in Psychology </i><span style="text-indent: -2em;">9: 513.</span></span></div>
<div class="csl-entry">
<span style="font-family: inherit;">Cysouw, Michael. 2010. ‘Dealing with Diversity: Towards an Explanation of NP-Internal Word Order Frequencies’. <i>Linguistic Typology</i> 14 (2–3): 221–34.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Dunn, Michael, Simon J. Greenhill, Stephen C. Levinson, and Russell D. Gray. 2011. ‘Evolved Structure of Language Shows Lineage-Specific Trends in Word-Order Universals’. </span><i style="text-indent: -2em;">Nature </i><span style="text-indent: -2em;">473 (7345): 79–82.</span><span style="text-indent: -2em;"> </span></span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1038%2Fnature09923&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Evolved%20structure%20of%20language%20shows%20lineage-specific%20trends%20in%20word-order%20universals&rft.jtitle=Nature&rft.volume=473&rft.issue=7345&rft.aufirst=Michael&rft.aulast=Dunn&rft.au=Michael%20Dunn&rft.au=Simon%20J%20Greenhill&rft.au=Stephen%20C%20Levinson&rft.au=Russell%20D%20Gray&rft.date=2011-05&rft.pages=79%E2%80%9382&rft.spage=79&rft.epage=82"></span></span></div>
</div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Gast, Volker, and Maria Koptjevskaja-Tamm. 2018. ‘The Areal Factor in Lexical Typology’. In </span><i style="text-indent: -2em;">Aspects of Linguistic Variation</i><span style="text-indent: -2em;">, edited by Daniël Olmen, Tanja Mortelmans, and Frank Brisard, 43–82. Berlin, Boston: De Gruyter.</span></span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=urn%3Aisbn%3A978-3-11-060796-3&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=The%20areal%20factor%20in%20lexical%20typology&rft.place=Berlin%2C%20Boston&rft.publisher=De%20Gruyter&rft.aufirst=Volker&rft.aulast=Gast&rft.au=Dani%C3%ABl%20Olmen&rft.au=Tanja%20Mortelmans&rft.au=Frank%20Brisard&rft.au=Volker%20Gast&rft.au=Maria%20Koptjevskaja-Tamm&rft.date=2018-09-10&rft.pages=43-82&rft.spage=43&rft.epage=82&rft.isbn=978-3-11-060796-3&rft.language=en"></span></span></div>
</div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1515%2Flity.2010.010&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Dealing%20with%20diversity%3A%20Towards%20an%20explanation%20of%20NP-internal%20word%20order%20frequencies&rft.jtitle=Linguistic%20Typology&rft.volume=14&rft.issue=2-3&rft.aufirst=Michael&rft.aulast=Cysouw&rft.au=Michael%20Cysouw&rft.date=2010-10&rft.pages=221%E2%80%9334&rft.spage=221&rft.epage=34"></span></span></div>
</div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Modeling%20linguistic%20variables%20with%20regression%20models%3A%20Addressing%20non-gaussian%20distributions%2C%20non-independent%20observations%2C%20and%20non-linear%20predictors%20with%20random%20effects%20and%20generalized%20additive%20models%20for%20location%2C%20scale%20and%20shape&rft.jtitle=Frontiers%20in%20Psychology&rft.volume=9&rft.aufirst=Christophe&rft.aulast=Coup%C3%A9&rft.au=Christophe%20Coup%C3%A9&rft.date=2018&rft.pages=513"></span></span></div>
</div>
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=Language%20samples&rft.place=Stanford&rft.publisher=Stanford%20University%20Press&rft.aufirst=Alan&rft.aulast=Bell&rft.au=Alan%20Bell&rft.au=Joseph%20H.%20Greenberg&rft.au=Charles%20A.%20Ferguson&rft.au=Edith%20A.%20Moravcsik&rft.date=1978&rft.pages=123-156&rft.spage=123&rft.epage=156"></span></span></div>
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<div class="csl-entry">
<span style="font-family: inherit;">Greenberg, Joseph H. 1963. ‘Some Universals of Grammar with Particular Reference to the Order of Meaningful Elements’. In <i>Universals of Language</i>, edited by Joseph H. Greenberg, 73–113. London: MIT Press.</span></div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Hay, Jennifer, and Laurie Bauer. 2007. ‘Phoneme Inventory Size and Population Size’. </span><i style="text-indent: -2em;">Language </i><span style="text-indent: -2em;">83 (2): 388–400.</span></span></div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Lester, Nicholas A, Sandra Auderset, and Phillip G. Rogers. 2018. ‘Case Inflection and the Functional Indeterminacy of Nouns: A Cross-Linguistic Analysis’. In </span><i style="text-indent: -2em;">Proceedings of the 40th Annual Meeting of the Cognitive Science Society</i><span style="text-indent: -2em;">.</span></span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.atitle=Case%20inflection%20and%20the%20functional%20indeterminacy%20of%20nouns%3A%20A%20cross-linguistic%20analysis&rft.btitle=Proceedings%20of%20the%2040th%20Annual%20Meeting%20of%20the%20Cognitive%20Science%20Society&rft.aufirst=Nicholas%20A&rft.aulast=Lester&rft.au=Nicholas%20A%20Lester&rft.au=Sandra%20Auderset&rft.au=Phillip%20G%20Rogers&rft.date=2018&rft.language=en"></span></span></div>
</div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Phoneme%20inventory%20size%20and%20population%20size&rft.jtitle=Language&rft.volume=83&rft.issue=2&rft.aufirst=Jennifer&rft.aulast=Hay&rft.au=Jennifer%20Hay&rft.au=Laurie%20Bauer&rft.date=2007&rft.pages=388%E2%80%93400&rft.spage=388&rft.epage=400"></span></span></div>
</div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Moran, Steven, Daniel McCloy, and Richard Wright. 2012. ‘Revisiting Population Size vs. Phoneme Inventory Size’. </span><i style="text-indent: -2em;">Language </i><span style="text-indent: -2em;">88 (4): 877–893.</span><span style="text-indent: -2em;"> </span></span></div>
<div class="csl-entry">
<span style="text-indent: -2em;"><span style="font-family: inherit;"><div class="csl-entry">
Schmidtke-Bode, Karsten. 2019. ‘Attractor States and Diachronic Change in Hawkins’s “Processing Typology”’. In <i>Explanation in Typology: Diachronic Sources, Functional Motivations and the Nature of the Evidence</i>, edited by Karsten Schmidtke-Bode, Natalia Levshina, Susanne Maria Michaelis, and Ilja A. Seržant, 123–48. Berlin: Language Science Press.</div>
<div class="csl-entry">
<span style="text-indent: -2em;">Schmidtke-Bode, Karsten, and Natalia Levshina. 2018. ‘Reassessing Scale Effects On Differential Case Marking: Methodological, Conceptual And Theoretical Issues In The Quest For A Universal’. In </span><i style="text-indent: -2em;">Diachrony of Differential Argument Marking</i><span style="text-indent: -2em;">, edited by Ilja A. Seržant and Alena Witzlack-Makarevich, 509–37. Berlin: Language Science Press.</span></div>
</span></span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1353%2Flan.2012.0087&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Revisiting%20population%20size%20vs.%20phoneme%20inventory%20size&rft.jtitle=Language&rft.volume=88&rft.issue=4&rft.aufirst=Steven&rft.aulast=Moran&rft.au=Steven%20Moran&rft.au=Daniel%20McCloy&rft.au=Richard%20Wright&rft.date=2012&rft.pages=877%E2%80%93893&rft.spage=877&rft.epage=893&rft.language=English"></span></span></div>
</div>
<div class="csl-entry">
<span style="font-family: inherit;"><span style="text-indent: -2em;">Sinnemäki, Kaius. 2010. ‘Word Order in Zero-Marking Languages’. </span><i style="text-indent: -2em;">Studies in Language </i><span style="text-indent: -2em;">34 (4): 869–912.</span><span style="text-indent: -2em;"> </span></span></div>
<div class="csl-entry">
<span style="text-indent: -2em;"><div class="csl-entry">
<span style="font-family: inherit;">Sinnemäki, Kaius. 2019. ‘On the Distribution and Complexity of Gender and Numeral Classifiers’. In <i>Grammatical Gender and Linguistic Complexity: Volume II: World-Wide Comparative Studies</i>, edited by Francesca Di Garbo, Bruno Olsson, and Bernhard Wälchli, 133–200. Berlin: Language Science Press.</span></div>
</span></div>
<div class="csl-entry">
<span style="text-indent: -2em;"><span style="font-family: inherit;"><div class="csl-entry">
<span style="font-family: inherit;">Sinnemäki, Kaius, and Francesca Di Garbo. 2018. ‘Language Structures May Adapt to the Sociolinguistic Environment, but It Matters What and How You Count: A Typological Study of Verbal and Nominal Complexity’. <i>Frontiers in Psychology</i> 9: 1141.</span></div>
<div>
<br /></div>
</span></span></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span style="font-family: inherit;"><span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1075%2Fsl.34.4.04sin&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Word%20order%20in%20zero-marking%20languages&rft.jtitle=Studies%20in%20Language&rft.volume=34&rft.issue=4&rft.aufirst=Kaius&rft.aulast=Sinnem%C3%A4ki&rft.au=Kaius%20Sinnem%C3%A4ki&rft.date=2010&rft.pages=869%E2%80%93912&rft.spage=869&rft.epage=912&rft.language=English"></span></span></div>
</div>
<div class="csl-entry">
<br /></div>
<div class="csl-entry">
<div class="csl-bib-body" style="line-height: 1.35; margin-left: 2em; text-indent: -2em;">
<span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_id=info%3Adoi%2F10.1515%2Flingty-2016-0006&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Sampling%20for%20variety&rft.jtitle=Linguistic%20Typology&rft.volume=20&rft.issue=2&rft.aufirst=Matti&rft.aulast=Miestamo&rft.au=Matti%20Miestamo&rft.au=Dik%20Bakker&rft.au=Antti%20Arppe&rft.date=2016-09&rft.pages=1%E2%80%9364&rft.spage=1&rft.epage=64"></span></div>
</div>
<span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fzotero.org%3A2&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=Some%20universals%20of%20grammar%20with%20particular%20reference%20to%20the%20order%20of%20meaningful%20elements&rft.place=London&rft.publisher=MIT%20Press&rft.aufirst=Joseph%20H&rft.aulast=Greenberg&rft.au=Joseph%20H%20Greenberg&rft.au=Joseph%20H.%20Greenberg&rft.date=1963-07&rft.pages=73-113&rft.spage=73&rft.epage=113"></span></div>
<div>Annemarie Verkerk</div>
<div><b>Ethnologue changes access, again! Clarifying points</b> (published 2019-11-04, updated 2021-03-12)</div>
<div dir="ltr" trbidi="on">
<div style="clear: both; text-align: left;">
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">News in brief.</span></div>
<div style="clear: both; text-align: left;">
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></div>
<div style="clear: both; text-align: left;">
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;">Ethnologue, <a href="https://www.ethnologue.com/ethnoblog/rob-hess/changes-ethnologuecom">as of October 26,</a> have changed their access conditions on the site. Instead of getting 3 free page views per month, users can now see all pages on the website but not all information on them. To the right</span> are examples of what the views look like for Country and Language pages. <a href="https://twitter.com/hashtag/Ethnologue?src=hashtag_click">This has sparked negative emotions.</a></span></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ-ans1dREtG-3BnYZUC_GNCZlINBh882bn7dbqVOwP7Nk6Rwe_f3k5wbUvDCUnGR_nUx5PXoe-3Z9bnhOgykJOTTtaXpW7YkThDgPYTtromFYWNhXlnTWxTcsKcerlEYjqrj8AiP5Y-uV/s1600/ZA.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><span face=""arial" , "helvetica" , sans-serif"><img border="0" data-original-height="565" data-original-width="616" height="293" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ-ans1dREtG-3BnYZUC_GNCZlINBh882bn7dbqVOwP7Nk6Rwe_f3k5wbUvDCUnGR_nUx5PXoe-3Z9bnhOgykJOTTtaXpW7YkThDgPYTtromFYWNhXlnTWxTcsKcerlEYjqrj8AiP5Y-uV/s320/ZA.png" width="320" /></span></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK3RI1Zbti7hT7mIvrpDpU6od-Bs-ZLGZfq19i8gCDEFxnXFdSm2kh5tnvT2fHnUWmEoOIJrohKq36u_WC4au0vbECUHzugRGxbQai2BrqMcMHwJNWrz0zyFEBLdcSOTL3rWyfBsjTSxIK/s1600/akolet.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><span face=""arial" , "helvetica" , sans-serif"><img border="0" data-original-height="681" data-original-width="617" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK3RI1Zbti7hT7mIvrpDpU6od-Bs-ZLGZfq19i8gCDEFxnXFdSm2kh5tnvT2fHnUWmEoOIJrohKq36u_WC4au0vbECUHzugRGxbQai2BrqMcMHwJNWrz0zyFEBLdcSOTL3rWyfBsjTSxIK/s320/akolet.png" width="289" /></span></a><br />
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span>
<span style="font-size: normal;">They are also promoting <a href="https://www.ethnologue.com/guides">their guide pages</a> more heavily. Long-time users may notice that these are very similar to the "Statistics" pages of older editions, but with less information. The guide pages seem directed more at educators than academics.</span></span><br />
<span face=""arial" , "helvetica" , sans-serif"><span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span>
<span style="font-size: normal;">As with the previous access restrictions, these do not apply to users in certain countries with low mean incomes. </span></span><span face=""arial" , "helvetica" , sans-serif">They have also launched </span><a href="https://www.ethnologue.com/contributor-program">a contributor program</a><span face=""arial" , "helvetica" , sans-serif">, which will give people who contribute free access to Ethnologue.</span></span><br />
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span>
<span style="font-size: normal;">SIL International is the publisher of Ethnologue. They are a "faith-based" organisation, and while they claim not to be missionary, they work closely with and are funded by their sister organisation <a href="https://www.wycliffe.org/">Wycliffe Bible Translators</a>, who are explicitly Christian missionaries. You can read more about the finances of SIL International <a href="https://www.sil.org/about/financial-information">here</a>. They also publish other resources besides the Ethnologue.</span></span><br />
<span style="color: #660000;"><i><br /></i></span></div><div dir="ltr" trbidi="on"><span style="color: #660000;"><i>Edit [March 2021]: Just to clarify the aim of SIL International further: they state themselves that their aim is the holistic flourishing of all peoples through Scripture access using the languages they value most and compassionate services. This needn't always be in conflict with scholarly process, but it is a different aim from that of academic research, and it explicitly includes value judgements about how people should live their lives, which can be harmful.</i></span></div><div dir="ltr" style="text-align: left;" trbidi="on"><span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span>
<span style="font-size: normal;">SIL International are also the official registration authority for ISO 639-3, the most popular of the ISO standards for language names and codes. <a href="https://iso639-3.sil.org/">It is hosted on a website separate from the Ethnologue, and that website is not under any access restrictions</a>. You can see all old change requests and updates to the classifications there. Even though the ISO 639-3 information is technically stored separately, it is not very usable on its own, since </span></span><span face="arial, helvetica, sans-serif">it lacks information on geography, genealogy, or alternative names (as </span><a href="http://humans-who-read-grammars.blogspot.com/2016/01/clarifying-points-on-ethnologue-pay.html?showComment=1451821360948#c3957995276714796745" style="font-family: arial, helvetica, sans-serif;">Cysouw pointed out last time around)</a><span face="arial, helvetica, sans-serif">. One would have to swap back and forth between Ethnologue and ISO 639-3 and hope there's enough information in the limited view to figure out what is what.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span>
<span style="font-size: normal;"><a href="http://humans-who-read-grammars.blogspot.com/2016/01/clarifying-points-on-ethnologue-pay.html">See also our old blog post about the 2016 change</a>.</span></span><br />
<span face=""arial" , "helvetica" , sans-serif"><b><span style="font-size: normal;"><br /></span></b>
<b><span style="font-size: normal;">Alternatives to Ethnologue</span></b></span><br />
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Ethnologue is a great resource that has served academics well for a long time, and the ISO 639-3 code standard is very practical. However, perhaps the time has come for Ethnologue to redefine their target audience and for academics to go elsewhere. The information provided to non-subscribers is minimal, and it is not clear that Ethnologue offers enough added value compared to other resources to warrant asking your local university library to subscribe.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif">There are several other resources that provide similar services to Ethnologue for free, and of these Glottolog is the most comprehensive. <a href="http://glottolog.org/">Glottolog.org</a> offers many of the same functionalities as Ethnologue and ISO 639-3. You can find the following information there:</span></span><br />
<div style="text-align: left;">
</div>
<ol style="text-align: left;">
<li><span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;">Language classification (what counts as language versus dialect, by </span><a href="https://glottolog.org/glottolog/glottologinformation">their standards</a><span style="font-size: normal;">)</span></span></li>
<li><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Language codes for languages, dialects and families and all nodes in between (handy if you disagree with their classification in (1))</span></li>
<li><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Language locations (points, not polygons)</span></li>
<li><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><a href="https://scholarspace.manoa.hawaii.edu/bitstream/10125/24792/1/hammarstrom_et_al.pdf">Endangerment status and descriptive status per language</a></span></li>
<ul>
<li><span face=""arial" , "helvetica" , sans-serif">see <a href="https://glottolog.org/langdoc/status">Glottoscope</a> and <a href="http://glammap.net/glottovis/about">Glottovis</a> for interactive visualisations</span></li>
</ul>
<li><span face=""arial" , "helvetica" , sans-serif">References per language</span></li>
<li><span face=""arial" , "helvetica" , sans-serif">Alternative names</span></li>
</ol>
<span face=""arial" , "helvetica" , sans-serif"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><b>Glottolog's codes are also mapped to ISO 639-3, so you can quite easily convert your old data to Glottocodes.</b></span></span><br />
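<span style="font-size: normal;">A minimal sketch of such a conversion in Python, assuming you have a copy of a Glottolog languages table with the columns Glottocode and ISO639P3code. The column names and the three inline sample rows are assumptions for illustration only; check them against the release you actually download.</span><br />

```python
import csv
import io

# Hypothetical excerpt of a Glottolog languages table. The column names
# "Glottocode" and "ISO639P3code" are assumed, not guaranteed, to match
# the release you are working with.
GLOTTOLOG_SAMPLE = """ID,Name,Glottocode,ISO639P3code
stan1293,English,stan1293,eng
stan1295,German,stan1295,deu
swed1254,Swedish,swed1254,swe
"""


def build_iso_to_glottocode(table_file):
    """Return a dict mapping ISO 639-3 codes to Glottocodes."""
    mapping = {}
    for row in csv.DictReader(table_file):
        iso = row["ISO639P3code"].strip()
        if iso:  # rows for dialects and families often have no ISO 639-3 code
            mapping[iso] = row["Glottocode"]
    return mapping


def convert_codes(iso_codes, mapping):
    """Convert a list of ISO 639-3 codes; codes with no match become None."""
    return [mapping.get(code) for code in iso_codes]


iso_to_glottocode = build_iso_to_glottocode(io.StringIO(GLOTTOLOG_SAMPLE))
old_data = ["eng", "swe", "xxx"]  # "xxx" is a made-up code with no match
print(convert_codes(old_data, iso_to_glottocode))
```

<span style="font-size: normal;">Codes that come back as None are worth checking by hand, since the two catalogues do not always agree on what counts as a language.</span><br />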
<span face=""arial" , "helvetica" , sans-serif"><span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span></span>
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;">Below is a table comparing Glottolog and products by SIL International on more points:</span>
<span style="font-size: normal;"><br /></span>
</span></span><br />
<table border="0" cellspacing="0">
<colgroup width="223"></colgroup>
<colgroup width="310"></colgroup>
<colgroup width="274"></colgroup>
<colgroup width="236"></colgroup>
<tbody>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">SIL International</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Glottolog</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Other resources</span></b></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Language codes</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes (also for families and dialects)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Open Access?</span></b></td>
<td align="left" bgcolor="#FFCCCC" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">No, mostly behind paywall</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, Open Access (CC-BY)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="58" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Alternative language names</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, including names from Ethnologue, OLAC, MultiTree, AIATSIS etc</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">OLAC, MultiTree, WALS, AIATSIS</span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Population stats</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#FFCCCC" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">No</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Language bibliography</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, <a href="https://www.sil.org/resources/language-culture-archives">42,000+ references</a></span></td>
<td align="left" bgcolor="#99FF33" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, 180,000+ references</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">OLAC</span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Endangerment information</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, but derived from Ethnologue and other resources (→)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">UNESCO Atlas of Languages in Danger, ELCat</span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Descriptive status</span></b></td>
<td align="left" bgcolor="#FFCCCC" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">No</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Genealogies</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, but not referenced</span></td>
<td align="left" bgcolor="#99FF33" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, and referenced</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">MultiTree, D-PLACE phylogenies</span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Language area polygons</span></b></td>
<td align="left" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, but not freely available (estimated cost 5,000 USD)</span></td>
<td align="left" bgcolor="#FFCCCC" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">No</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Partial: <a href="https://native-land.ca/">https://native-land.ca/</a> and others</span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Countries per language</span></b></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Long/lat point per language</span></b></td>
<td align="left" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Derivable from polygons</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Genealogical classification tendencies</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Tends to merge</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Tends to split</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="58" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Handling of contact languages</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Creoles, pidgins and mixed languages are each placed in their own family (3 separate families)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Creoles appear within their lexifier's family; pidgins and mixed languages are each placed in their own family (2 separate families)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Handling of sign languages</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">In their own family with no hierarchy</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">In their own family with some hierarchy based on history and type</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Handling of isolates</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">All in one family</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Separated out (a language with no Family_ID is an isolate)</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Requests for changes</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Form at <a href="http://iso639-3.sil.org/">iso639-3.sil.org</a></span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><a href="https://github.com/glottolog/glottolog/issues">GitHub Issues</a></span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="58" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Transparency in decisions</span></b></td>
<td align="left" bgcolor="#FFFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Changes in ISO 639-3 are mostly well described, most other information per language is not referenced.</span></td>
<td align="left" bgcolor="#CCFF99" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Almost everything is tied to a published reference</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="41" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Dialects</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, listed but not as meticulously managed as “languages”</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes, listed but not as meticulously managed as “languages”</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="40" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Criteria for being a language</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Mutual intelligibility, shared cultural identity, shared literature</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Mutual intelligibility, lexical similarity</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
<tr>
<td align="left" bgcolor="#E6E6FF" height="25" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><b><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">“Faith-based”</span></b></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Yes</span></td>
<td align="left" bgcolor="#E6E6FF" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">No</span></td>
<td align="left" style="border-bottom: 1px solid #000000; border-left: 1px solid #000000; border-right: 1px solid #000000; border-top: 1px solid #000000; border: 1px solid rgb(0, 0, 0);" valign="top"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span></td>
</tr>
</tbody></table>
<span face=""arial" , "helvetica" , sans-serif"><span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br /></span>
<span style="font-size: normal;"><br /></span>
</span><br />
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><b>Problems Ethnologue and Glottolog share</b></span><br />
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;">It can be tricky for users of both catalogues to easily understand the reasoning behind certain decisions and lodge requests. For genealogy, <a href="https://www.ethnologue.com/about/language-info#Class">Ethnologue does state that the sources used are available on request, </a>but they are not provided for each tree and language up front (they are for Glottolog). Furthermore, changes to Ethnologue and ISO 639-3 should be submitted in different places (<a href="https://www.ethnologue.com/about/updates-corrections">here</a> and <a href="https://iso639-3.sil.org/">here</a>, respectively). </span>For Glottolog, one must have basic GitHub skills to navigate the backlog of decisions and submit new requests. For example, you shouldn't go to the <a href="https://github.com/clld/glottolog3">clld/glottolog3</a> repo for data decisions, which is where one of the links on the site takes you, but to <a href="https://github.com/glottolog/glottolog">glottolog/glottolog</a>.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif">These obstacles are by no means insurmountable, but they are there and they will most likely result in certain changes not being lodged and certain users not being involved.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><br /></span></span>
<span face=""arial" , "helvetica" , sans-serif"><span style="font-size: normal;"><b>Usage and aims</b></span> </span><br />
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif">W</span><span face=""arial" , "helvetica" , sans-serif">hen providing a comprehensive resource, as Glottolog and Ethnologue do, it is key to be entirely clear on what the aims and target audience are (and what they are not). The Ethnologue user audience is currently changing, whether SIL International wants it to or not. Glottolog will be a good resource for some of those lost users within academia, but probably not all.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif">ISO 639-3 is not just used by academics; it is also used in NLP, Wikimedia, HTML, Unicode, libraries and <a href="https://en.wikipedia.org/wiki/ISO_639-3#Usage">more</a>.</span><br />
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif">Wikipedia pages on languages now list both the ISO 639-3 code and the Glottocode (and Linguasphere codes).</span><br />
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<br />
<div style="text-align: center;">
<span face=""arial" , "helvetica" , sans-serif">***</span></div>
<span face=""arial" , "helvetica" , sans-serif"><br /></span>
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Hopefully this will clear things up for many disappointed Ethnologue users and clarify if Glottolog is the right choice for you in your future research. </span><br />
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;"><br />All the best, </span><br />
<span face=""arial" , "helvetica" , sans-serif" style="font-size: normal;">Hedders.</span><br />
<span style="font-size: normal;"><br /></span></div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com2tag:blogger.com,1999:blog-1300680252997007251.post-11973372898004401072019-10-30T08:14:00.000+11:002019-10-30T22:41:45.718+11:00<br />
<b>Language family maps</b><br />
Last week, I assigned Bernard Comrie's chapter 'The Languages of the World' (from The Handbook of Linguistics, 2017) to a class. It's a basic overview of the world's language families, which is what I wanted them to read, but for one thing: there are no maps in it. I overcompensated in class by presenting a 30-item list of maps, because some things are just so much easier to understand using visual representations. I decided to post some of the best ones I could find here, for future reference and in order to invite you to post better ones in the comments.<br />
<br />
This blog has featured posts on maps before, by <a href="http://humans-who-read-grammars.blogspot.com/2014/12/linguistic-diveristy-important-things.html" target="_blank">Hedvig on how to best represent linguistic diversity on maps</a> and by <a href="http://humans-who-read-grammars.blogspot.com/2017/06/new-approaches-to-ethno-linguistic-maps.html" target="_blank">Matt on new approaches to ethno-linguistic maps</a>. It's clear that the kind of maps that are typically used to depict the spatial distribution of the languages of a single language family are fraught with difficulties. Typically they deal with multilingualism very poorly; the data they display is usually from different sources that could be decades if not centuries apart; some of the maps below are based on ethnography and not on linguistics, and how these line up is often not straightforward. The list goes on and on. <br />
<br />
That being said, classification in terms of family membership is one of the primary means of classifying languages, and only through the history of language families can we understand how some languages have spread and others have died. Hence, the geographical perspective on language families is an important one. Here I am mostly after polygon maps of language families, and not maps per country (big on <a href="https://www.ethnologue.com/" target="_blank">Ethnologue</a>) or maps using points (to be found on <a href="https://glottolog.org/" target="_blank">Glottolog</a> and <a href="http://llmap.org/" target="_blank">LL-MAP</a>).<br />
<br />
During my search, I found that many handbooks do not feature maps (just one example: The Tai-Kadai Languages by Diller, Edmondson and Luo), which I found odd as it seems such an obvious thing to include. There is a lot of stuff to be found on the web, though. There are maps on the <a href="https://www.britannica.com/" target="_blank">Encyclopaedia Britannica</a>; they are unfortunately behind a paywall, but many can be found online in a reasonable format. One of them, on the Khoisan families, is featured below. <a href="http://www.muturzikin.com/" target="_blank">Muturzikin</a> has polygon maps of continents and countries, so not specific to families, but of course in certain cases this doesn't really matter (like Australia). Then there are some sites with collections of maps, such as <a href="http://www.nativeweb.org/resources/reference_materials/maps/" target="_blank">Native Web</a> and <a href="http://ccat.sas.upenn.edu/plc/clpp/images/langmaps/index.html" target="_blank">this older website</a>. There is also <a href="https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=11&ved=2ahUKEwjdhYe68sPlAhUGyKQKHSirCVQQFjAKegQIARAC&url=http%3A%2F%2Fccat.sas.upenn.edu%2Fplc%2Fclpp%2Fimages%2Flangmaps%2FSummary%2520of%2520Responses--LangMaps.docx&usg=AOvVaw29uTsYd3go00rIgB0F8Z-j" target="_blank">this collection of resources</a>, compiled by Candace Luebbering.<br />
<br />
Let's start with a journey around the world in language family maps, starting with Steve Huffman's map of Africa:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiisyKHZVgqg3rTOx-jax9rDNIeHPXpDUk2pWzxEMpLfvq7Vm7dmKRC8P6P37l9FgdxssTRDFMhfwg5wivwvg89-geLR3zFiVFivMeFOoyzsxBB2ITS1-WRFCnpVsXS6Jw6flMthEdpUkqN/s1600/Huffman-Africa_Langs-wlms16+3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1132" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiisyKHZVgqg3rTOx-jax9rDNIeHPXpDUk2pWzxEMpLfvq7Vm7dmKRC8P6P37l9FgdxssTRDFMhfwg5wivwvg89-geLR3zFiVFivMeFOoyzsxBB2ITS1-WRFCnpVsXS6Jw6flMthEdpUkqN/s640/Huffman-Africa_Langs-wlms16+3.png" width="452" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Steve Huffman's map of African languages using WLMS 16</i></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Africa is home to the biggest language family in the world (in number of languages, at least currently): Atlantic-Congo, marked on Huffman's map in shades of purple (non-Bantu branches) and green (Bantu). To the south of the Atlantic-Congo languages, we find the Khoisan languages (marked in shades of red on Huffman's map):</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCYWFKmAALuhlH_gGr3UfK5JKXPDphgYPOaqNXb9XBX4YNYAL3kyGHgtbYiRGicLMYVpuL7nIHA8_nFu2EbjD1pbri4uXccpttaC9G-f3Gghjc-1kAF1PCH3abCnhqJraY4l-5vKJrv0uc/s1600/Khoisan.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1166" data-original-width="1600" height="466" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCYWFKmAALuhlH_gGr3UfK5JKXPDphgYPOaqNXb9XBX4YNYAL3kyGHgtbYiRGicLMYVpuL7nIHA8_nFu2EbjD1pbri4uXccpttaC9G-f3Gghjc-1kAF1PCH3abCnhqJraY4l-5vKJrv0uc/s640/Khoisan.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i><a href="https://www.britannica.com/" target="_blank">Encyclopaedia Britannica</a>'s map of the Khoisan language families</i></div>
<br />
And to the north, we find the languages that were once subsumed under the family name 'Nilo-Saharan' (marked in shades of pink on Huffman's map), but which are now considered to form smaller, separate families:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgaRY5yW_bxj4moeASVbvA9EX7Dsr-ZuIxtLGLd-amxwtpGyul9b3C0bvYZuhAu3_yWR0_k5Q8nsYF7RPvOPbyCfmBdrFTWbzM4SKOZanZVHmss_VZ6wpCbGZ96qTgXkOtdcb2m4_tW7KD/s1600/Nilo-Saharan-languages.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1198" data-original-width="1599" height="478" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgaRY5yW_bxj4moeASVbvA9EX7Dsr-ZuIxtLGLd-amxwtpGyul9b3C0bvYZuhAu3_yWR0_k5Q8nsYF7RPvOPbyCfmBdrFTWbzM4SKOZanZVHmss_VZ6wpCbGZ96qTgXkOtdcb2m4_tW7KD/s640/Nilo-Saharan-languages.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i><a href="https://www.britannica.com/" target="_blank">Encyclopaedia Britannica</a>'s map of the Nilo-Saharan language families</i></div>
<br />
Furthest to the north and also evident in the Middle East is Afroasiatic (marked in shades of light blue on Huffman's map):<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbaPekWHsycaCiyVAqzVtPxv__T80vsOIZcSCa6fg6OiyWUDkPYDKEP1TlssfphM2XY2fuM-vC1NTxInq5Tyvj05VJwXVIoKD2mycmEHVxtjgKTYBR_HbXfx-Zh_UdLhFxdA2cDqY9naKV/s1600/Afro-Asiatic-languages.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbaPekWHsycaCiyVAqzVtPxv__T80vsOIZcSCa6fg6OiyWUDkPYDKEP1TlssfphM2XY2fuM-vC1NTxInq5Tyvj05VJwXVIoKD2mycmEHVxtjgKTYBR_HbXfx-Zh_UdLhFxdA2cDqY9naKV/s640/Afro-Asiatic-languages.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i><a href="https://www.britannica.com/" target="_blank">Encyclopaedia Britannica</a>'s map of the Afroasiatic language family</i></div>
<div style="text-align: center;">
<br /></div>
With these languages of Northern Africa, we arrive in Eurasia. One of the most widespread families of Eurasia is Indo-European. Unfortunately, the map below does not depict the eastern part of Eurasia, including India, very well.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4iIplIlGLF292evb6bhV5xOo7yfoGp1lgnIOg7oXdu6EsDyvVaMskHYqDWI6yXzYlL8qnrDIxL0QCLLTmKAVrKt_Coj3FqOpi8uFockBTryI-9LJixuQKZeDQgQen2Bwn7qnIgn8tDNsi/s1600/Indo-European-Eurasia.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1516" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4iIplIlGLF292evb6bhV5xOo7yfoGp1lgnIOg7oXdu6EsDyvVaMskHYqDWI6yXzYlL8qnrDIxL0QCLLTmKAVrKt_Coj3FqOpi8uFockBTryI-9LJixuQKZeDQgQen2Bwn7qnIgn8tDNsi/s640/Indo-European-Eurasia.gif" width="606" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i><a href="https://www.britannica.com/" target="_blank">Encyclopaedia Britannica</a>'s map of the Indo-European language family</i></div>
<div style="text-align: center;">
<br /></div>
The Caucasus Mountains and surrounding valleys are home to the Caucasian language families:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiksECBiuSxjW9gaV0m6SrVaJDg55hzZRgyPnYX53GhPHiKE1HxFVIp13EfO2A0X-gAPy54ILMFEV6juhEzfovUJzL5bSHahvLBKjeepazvYC2AR7vJDoStfgcU-cr_fN9H4O57brV6YDRo/s1600/caucassian.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1421" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiksECBiuSxjW9gaV0m6SrVaJDg55hzZRgyPnYX53GhPHiKE1HxFVIp13EfO2A0X-gAPy54ILMFEV6juhEzfovUJzL5bSHahvLBKjeepazvYC2AR7vJDoStfgcU-cr_fN9H4O57brV6YDRo/s640/caucassian.png" width="568" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of Caucasian peoples, source: https://commons.wikimedia.org/wiki/File:Caucasus-ethnic_en.svg</i></div>
<br />
Even wider across the Eurasian continent than Indo-European stretches the Altaic language family, containing the Turkic, Mongolic, and Tungusic families.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhOSwmK8JNQWtx3e9fXJgopqfMdOi45l26QxxbQg9RPCfQ_DUhl0mvgCpLeeWQCQQPabEDccamXXTpzX_qIs3NDVl2g1ewQopH-lst060XwxOifcHYCXFXgcZZw_RliEOknZ004m8i5OfJ/s1600/Bellwood_2013_164.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1061" data-original-width="1600" height="424" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhOSwmK8JNQWtx3e9fXJgopqfMdOi45l26QxxbQg9RPCfQ_DUhl0mvgCpLeeWQCQQPabEDccamXXTpzX_qIs3NDVl2g1ewQopH-lst060XwxOifcHYCXFXgcZZw_RliEOknZ004m8i5OfJ/s640/Bellwood_2013_164.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map on prior distribution of Altaic languages, from Bellwood (2013: 164)</i></div>
<br />
Altaic has long been contested, but is now included in proposals on a language family termed Transeurasian, which includes Altaic as well as Korean and Japonic:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglwjNnb-tDq0gEVK3LYRiGnvrHkciV_3ru-2kDAnXTUwuhtk4UMmbvLwREPYJ8JeqykR6aMr2GT6OHnTS5CPq2TsGplO9W8PtG_EXk6Z6lv9FN52snH2NQ6I7OEg0sMpvGVunBBQvxcf84/s1600/Trans-Eurasian.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1100" data-original-width="1600" height="438" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglwjNnb-tDq0gEVK3LYRiGnvrHkciV_3ru-2kDAnXTUwuhtk4UMmbvLwREPYJ8JeqykR6aMr2GT6OHnTS5CPq2TsGplO9W8PtG_EXk6Z6lv9FN52snH2NQ6I7OEg0sMpvGVunBBQvxcf84/s640/Trans-Eurasian.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>The distribution of the Transeurasian languages (Robbeets and Bouckaert 2018: 146)</i></div>
<br />
In northern Eurasia, we find Uralic, which includes several big European languages, such as Finnish, Hungarian, and Estonian:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieVh7_BL4646FsPyxz5ZNlA_B37BETriv8-czpa-qsPIzwCuaXajOne4y7u04kZwMhB4DcMLhp8fGNdgR68i_1JXIVjAUjYDAZQlS9C4xicfAY8tBoNKofxBzALen5sGb5yqZGdXUhI8gV/s1600/Uralic_languages_%2528en%2529.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1199" data-original-width="1600" height="478" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieVh7_BL4646FsPyxz5ZNlA_B37BETriv8-czpa-qsPIzwCuaXajOne4y7u04kZwMhB4DcMLhp8fGNdgR68i_1JXIVjAUjYDAZQlS9C4xicfAY8tBoNKofxBzALen5sGb5yqZGdXUhI8gV/s640/Uralic_languages_%2528en%2529.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>source: https://commons.wikimedia.org/wiki/File:Linguistic_map_of_the_Uralic_languages_(en).png</i></div>
<br />
<br />
At the very eastern end of the Eurasian continent, we find the small family of Chukotko-Kamchatkan languages:<br />
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgyzxGXlcm2OI6JxFrHeZE0kD-4jVIKHg5wOTSPQumPZhL66WXy5tQyuYWN3f3EmADDBzrt2GJx3CURmAfQuYPqD9pCOlgwWeaZ1Dc1KteSBiW1mhd3lzLWatqUyi_f3ztbLn9s4oGSvxs/s1600/Fortesque_2011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1599" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgyzxGXlcm2OI6JxFrHeZE0kD-4jVIKHg5wOTSPQumPZhL66WXy5tQyuYWN3f3EmADDBzrt2GJx3CURmAfQuYPqD9pCOlgwWeaZ1Dc1KteSBiW1mhd3lzLWatqUyi_f3ztbLn9s4oGSvxs/s640/Fortesque_2011.png" width="638" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Chukotko-Kamchatkan languages and their neighbours, Fortescue (2011)</i></div>
<div>
<br /></div>
<div>
Here is a second, more colourful map showing the neighbours of the Chukotko-Kamchatkan family across the Bering Strait:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTNx-Pqkbst5saY0PkTsFI3bmIplepErZ54DqHkDgU9JLo4aZ1jXEV2mF7e_DsQsUPSjWUbmUnEj-XSALRgIcMyHKvOu0St4e7pz2_S-vgbg0JnUO1-u3GguOxquo3VNYJaPqiUqhLsW3S/s1600/Languages+of+the+Greater+North+Pacific+Region2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="965" data-original-width="1600" height="386" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTNx-Pqkbst5saY0PkTsFI3bmIplepErZ54DqHkDgU9JLo4aZ1jXEV2mF7e_DsQsUPSjWUbmUnEj-XSALRgIcMyHKvOu0St4e7pz2_S-vgbg0JnUO1-u3GguOxquo3VNYJaPqiUqhLsW3S/s640/Languages+of+the+Greater+North+Pacific+Region2.png" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLiNpwbgY_X2NrIFNi5-wYeQQ55DtabAbquZIilANfw6_ht_LKT5ML9VRKr0vSqT1JmRpLF_nOOHiEoxyA7voWtI-E0KOxcq1PZ7BKuekJ-Z8qPziBuFldHLCJCyaf_Tu9i-VNXkQ98bol/s1600/Languages+of+the+Greater+North+Pacific+Region.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1041" data-original-width="1600" height="416" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLiNpwbgY_X2NrIFNi5-wYeQQ55DtabAbquZIilANfw6_ht_LKT5ML9VRKr0vSqT1JmRpLF_nOOHiEoxyA7voWtI-E0KOxcq1PZ7BKuekJ-Z8qPziBuFldHLCJCyaf_Tu9i-VNXkQ98bol/s640/Languages+of+the+Greater+North+Pacific+Region.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Chukotko-Kamchatkan languages and their neighbours, Krauss (1988)</i></div>
<div>
<br /></div>
<div>
<div>
Then we start slowly moving from Eurasia to South, East, and Southeast Asia. The Dravidian language family is located in the south-east of the Indian subcontinent, as well as in Afghanistan, Pakistan and Nepal:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAwSq1nLGdqSGs5a-qYZ6BSNx0CN9_h7hn9ZboBvvMgN1VKIA7nc7QRyS70XnRLOG4edYLdkPOT31Jn6yw8PdSJ8UqMGAU8jNfSLVG4cRkDjDtSlSBiLb3pr-YUgyChMYPAa0P-xXZKN0y/s1600/Dravidian.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1447" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAwSq1nLGdqSGs5a-qYZ6BSNx0CN9_h7hn9ZboBvvMgN1VKIA7nc7QRyS70XnRLOG4edYLdkPOT31Jn6yw8PdSJ8UqMGAU8jNfSLVG4cRkDjDtSlSBiLb3pr-YUgyChMYPAa0P-xXZKN0y/s640/Dravidian.png" width="578" /></a> </div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Dravidian languages, Kolipakam et al. (2018)</i></div>
<div>
<br /></div>
<div>
To the north but very close to the Dravidian languages are the Munda languages, one of the subfamilies of Austroasiatic, which spreads all the way from the eastern Indian subcontinent to Malaysia:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMVTMK6NHvmcIIURomQdRHsiDk5_PP2MsVMJfAKa-Xu60NPS0lgIByOopkPljvoO1Npwr3HeeuvBERjiKfmD01rqRA3V6G7KnNAp17XnU83Cg3UOgTQtzf6E5Cmz6q49nqeOvjxxGOKG8H/s1600/Austroasiatic.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1119" data-original-width="1600" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMVTMK6NHvmcIIURomQdRHsiDk5_PP2MsVMJfAKa-Xu60NPS0lgIByOopkPljvoO1Npwr3HeeuvBERjiKfmD01rqRA3V6G7KnNAp17XnU83Cg3UOgTQtzf6E5Cmz6q49nqeOvjxxGOKG8H/s640/Austroasiatic.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Austro-Asiatic languages by Pinnow (1959), as cited by Sidwell (2009)</i></div>
<div>
<br /></div>
<div>
The Austroasiatic languages (also known as Mon-Khmer) are intermingled with a whole range of other families, including Indo-European and Dravidian in India, and Tai-Kadai, Hmong-Mien, and Sino-Tibetan in Southeast Asia. Sidwell (2009: 3-4) comments on how little is known regarding the internal relationships of the Austroasiatic languages, which must be very interesting given their dispersal and interaction with languages from other families. The following is a map of the Hmong-Mien language family from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a>:</div>
<div>
<br /></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6radnqVI30kazDAR3-Qr6n1pKT91FUOlSk9HxiChoizKA8s5XhVAeFpmn-Ten0bNZSSKnpnn1O_66yGfgmWbjuoOpeP3BTDG66cVcW3lPOWlJFHZyQvS95Kpz_g3Vr4wYMAr4wc4H-Evo/s1600/Hmong+Mien.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="537" data-original-width="792" height="432" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6radnqVI30kazDAR3-Qr6n1pKT91FUOlSk9HxiChoizKA8s5XhVAeFpmn-Ten0bNZSSKnpnn1O_66yGfgmWbjuoOpeP3BTDG66cVcW3lPOWlJFHZyQvS95Kpz_g3Vr4wYMAr4wc4H-Evo/s640/Hmong+Mien.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Hmong-Mien language family from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a></i></div>
<div>
<br /></div>
<div>
The following are two maps of the Tai-Kadai family, one from <span style="text-align: center;"><a href="https://www.britannica.com/" style="font-style: italic;" target="_blank">Encyclopaedia Britannica</a> and one from Wikipedia:</span></div>
<div>
<div style="text-align: center;">
<i><br /></i></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTQ8SBkRjqpSiNP9P_HXSL735gyb7kGtvW8PWwtytyHF8atYco0eavgZZtJH2Xa3wSpySBvuOpbVzwvLp_zqTuObOMZdei9QJ4kx7cht9SQsEgixtKlA5SA1U5mi1iSCpnFRvMl9PIHOcm/s1600/Thai_langs.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1171" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTQ8SBkRjqpSiNP9P_HXSL735gyb7kGtvW8PWwtytyHF8atYco0eavgZZtJH2Xa3wSpySBvuOpbVzwvLp_zqTuObOMZdei9QJ4kx7cht9SQsEgixtKlA5SA1U5mi1iSCpnFRvMl9PIHOcm/s640/Thai_langs.jpg" width="468" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://www.britannica.com/" style="font-style: italic;" target="_blank">Encyclopaedia Britannica</a><i>'s map of the Tai-Kadai language family</i></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV5Vj0DVVStO9edV0GP3IlU8NNnjTVCUg-PSrCmemoLkJdJWksuk6V7qG7PeanxdGfxZ5sqYwarmGkMX0zplJstC0zn9DINIVc-nY7_ywkKCZ1AzHFbF-ajRb2y79mx7qDF056NOBx0x_z/s1600/Taikadai-en.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1283" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV5Vj0DVVStO9edV0GP3IlU8NNnjTVCUg-PSrCmemoLkJdJWksuk6V7qG7PeanxdGfxZ5sqYwarmGkMX0zplJstC0zn9DINIVc-nY7_ywkKCZ1AzHFbF-ajRb2y79mx7qDF056NOBx0x_z/s640/Taikadai-en.png" width="512" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>source: https://commons.wikimedia.org/wiki/File:Taikadai-en.svg</i></div>
</div>
<div>
<br /></div>
<div>
This leaves us with the last huge family of the Eurasian continent, Sino-Tibetan. The first map is another map from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a>:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHeTlMUzmsE3PXGiUQboKGFnQFjpGtx8rhqb1w4qSPh4G5YyQPeMgkOtumpMnHZPZauLP_bb1KISpR6PPYopriAVtoYEOcvu1QGlHOpdCBwZtdIPIeCzorwcegKCwwymmXRIbpIxXQWnNU/s1600/Sino-Tibetan.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="791" data-original-width="792" height="638" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHeTlMUzmsE3PXGiUQboKGFnQFjpGtx8rhqb1w4qSPh4G5YyQPeMgkOtumpMnHZPZauLP_bb1KISpR6PPYopriAVtoYEOcvu1QGlHOpdCBwZtdIPIeCzorwcegKCwwymmXRIbpIxXQWnNU/s640/Sino-Tibetan.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Sino-Tibetan language family from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a></i></div>
<br />
The second is a really nice map from Sagart et al. (2019), a recent study on the age of origin and the homeland of Sino-Tibetan:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4qSUVnnC9Aopmt5ci6zLxhkDNx_dfaKUnVqLhMPOO0aDd9j_mbmynZBUs1NCw-gs7yCwr7MAo_Iq18Ng9DHnTQ_EmUUPEMaUozfc28ITRvkCrcJQlYbVtPCk6RlNST5jbl_GjxCvgnPf0/s1600/Sino-Tibetan_Sagart.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1150" data-original-width="1600" height="458" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4qSUVnnC9Aopmt5ci6zLxhkDNx_dfaKUnVqLhMPOO0aDd9j_mbmynZBUs1NCw-gs7yCwr7MAo_Iq18Ng9DHnTQ_EmUUPEMaUozfc28ITRvkCrcJQlYbVtPCk6RlNST5jbl_GjxCvgnPf0/s640/Sino-Tibetan_Sagart.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Sino-Tibetan language family from Sagart et al. (2019)</i></div>
<div>
<br /></div>
<div>
For Austronesian, the second-largest language family of the world, there are various maps below. The first is from Bellwood (2013), displaying migration patterns. The second is a points map giving internal classifications on the basis of the phylogenetic analyses by Gray et al. (2009). The third I got from Hedvig and is a map of Oceania including Australia:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieWXajeUjtbzyofIKP02X81Peo1wQVy87_L63oIulzY4pCUzDl7noE29f7aCVvriVBUjZZPibg5VmXN9x5pxDgs7-s2R0wTd8U6YMsV74N8GMdnVJKgo-6uYJRju_VsefhZYyobA5aWlTN/s1600/Bellwood_2013_192.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="924" data-original-width="1600" height="368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieWXajeUjtbzyofIKP02X81Peo1wQVy87_L63oIulzY4pCUzDl7noE29f7aCVvriVBUjZZPibg5VmXN9x5pxDgs7-s2R0wTd8U6YMsV74N8GMdnVJKgo-6uYJRju_VsefhZYyobA5aWlTN/s640/Bellwood_2013_192.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of population movements part of the Austronesian expansion from Bellwood (2013: 192)</i></div>
</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBCZOcczApDdXuS6N7XJRBOn313I0zOu8b_uw9MtT-pxN8jWQAjeKmMJe7evVABwdxTDzneKLuTCkDJoTdg7xKAQaZdX1auSLSHXK6n5BjFHuiRWGCYIC3THLsGnGRuh3jyty15Jhp2lGa/s1600/Austro_Gray_et_al.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="789" data-original-width="1600" height="314" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBCZOcczApDdXuS6N7XJRBOn313I0zOu8b_uw9MtT-pxN8jWQAjeKmMJe7evVABwdxTDzneKLuTCkDJoTdg7xKAQaZdX1auSLSHXK6n5BjFHuiRWGCYIC3THLsGnGRuh3jyty15Jhp2lGa/s640/Austro_Gray_et_al.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map on the Austronesian expansion from Gray et al. (2009)</i></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgINZuiG5zE0gAG09U_YgQbreeIy_OJ5Ru_JnUSe_8SRBLmrN9k4gT6KakSyfEcVO78WnerWtf7IeG7kbGhP8lNqnDIf1kTnYwcvR6uT2ylImWO-mfvlKlw-_6j4vT-o_LbQs3bgE5EhsG2/s1600/19-220d_Oceanic_languages-01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1083" data-original-width="1600" height="432" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgINZuiG5zE0gAG09U_YgQbreeIy_OJ5Ru_JnUSe_8SRBLmrN9k4gT6KakSyfEcVO78WnerWtf7IeG7kbGhP8lNqnDIf1kTnYwcvR6uT2ylImWO-mfvlKlw-_6j4vT-o_LbQs3bgE5EhsG2/s640/19-220d_Oceanic_languages-01.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Languages in Oceania by language family, courtesy of Hedvig Skirgård</i></div>
<div>
<br /></div>
<div>
As can already be seen in the map just above - the island of New Guinea is incredibly diverse, home to many Austronesian languages but also to the many Papuan language families. <a href="https://www.muturzikin.com/cartesoceanie/oceanie2.htm" target="_blank">Muturzikin</a> has a great map of these which is too big for me to copy here; what does fit is this map of the biggest of the Papuan language families, Trans-New-Guinea:</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjejW-egifN58fA9EBipOAJsguI-28Esf-Lo6WmR9ggyoyUnBdYFrJtgLD8mY2KLyU6R1iQjhdIzGc7b9bmz06aVOC84u4giYaTHa15DdoLEzO9YiPCC_CFGlRgI2dTmAYIoDe0hrhqKOYg/s1600/Papua_Trans_New_Guinea.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="859" data-original-width="1600" height="342" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjejW-egifN58fA9EBipOAJsguI-28Esf-Lo6WmR9ggyoyUnBdYFrJtgLD8mY2KLyU6R1iQjhdIzGc7b9bmz06aVOC84u4giYaTHa15DdoLEzO9YiPCC_CFGlRgI2dTmAYIoDe0hrhqKOYg/s640/Papua_Trans_New_Guinea.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of the Trans-New-Guinea language family by <a href="https://www.muturzikin.com/cartesoceanie/oceanie2.htm" target="_blank">Muturzikin</a> </i></div>
<div>
<br /></div>
<div>
The same goes for <a href="http://www.muturzikin.com/cartesoceanie/oceanie.htm" target="_blank">Muturzikin's map of Australia</a>, it's too big to copy here. Below is the beautiful map compiled by David Horton:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimZHye5AISPWwh-PXjj1dTY6rTuyO27E50Fdf04ARPFIcILgjUYBIBiGSuzA-ZXUr6I1Gef5BYKxF2QWlzIUwHrD1YIzp4gdB3c9se9EwIHm3UUMpcEWyF384dz-f2y4k33DkmtC3SGUIe/s1600/australia-aboriginal-map.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1298" data-original-width="1600" height="518" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimZHye5AISPWwh-PXjj1dTY6rTuyO27E50Fdf04ARPFIcILgjUYBIBiGSuzA-ZXUr6I1Gef5BYKxF2QWlzIUwHrD1YIzp4gdB3c9se9EwIHm3UUMpcEWyF384dz-f2y4k33DkmtC3SGUIe/s640/australia-aboriginal-map.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i> <a href="https://aiatsis.gov.au/explore/articles/aiatsis-map-indigenous-australia" target="_blank">AIATSIS map of Indigenous Australia</a>, compiled by David Horton</i></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<a href="http://www.muturzikin.com/cartesoceanie/oceanie.htm" target="_blank">Muturzikin's map of Australia</a> is a contemporary map, in other words it shows a patchy distribution of indigenous languages, surrounded by English. The same is true for some maps of the Americas. The following is a map of North America:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4umalwUKqFkiN1-3ktMRxFWy50AxUntQbXjzhi06qTp_Clhx06H_QvTetFZGLoYAPLY2tSjAFjP35s08xaIbPzkp-GzrU8q-e_nwLvZSD0ZfS_-uBpzYcW8Za7WpC0UCfDzboSgSKvX3U/s1600/NorthAmerica.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1447" data-original-width="1600" height="578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4umalwUKqFkiN1-3ktMRxFWy50AxUntQbXjzhi06qTp_Clhx06H_QvTetFZGLoYAPLY2tSjAFjP35s08xaIbPzkp-GzrU8q-e_nwLvZSD0ZfS_-uBpzYcW8Za7WpC0UCfDzboSgSKvX3U/s640/NorthAmerica.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map based on two maps by cartographer Roberta Bloom appearing in Mithun (1999:xviii–xxi), https://en.wikipedia.org/wiki/File:Langs_N.Amer.png</i></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
And the next two are two maps of Meso-America, the first displaying the linguistic situation at the time when Europeans first arrived in the area, the second (more or less) contemporary:</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYIlRgmMlPognHlmoelP1TBioUi1TgfO3XvG0D63gPiWR9_M2bJc6XVc_7ND0LB_WepnrlKWnlr292ATqCJkpukAWsrKkYBJwnkUavt3FeD-kgre4ETyHzwJBwPZKJHGqRP3T8rdkL6VRG/s1600/meso_america_before.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="547" data-original-width="784" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYIlRgmMlPognHlmoelP1TBioUi1TgfO3XvG0D63gPiWR9_M2bJc6XVc_7ND0LB_WepnrlKWnlr292ATqCJkpukAWsrKkYBJwnkUavt3FeD-kgre4ETyHzwJBwPZKJHGqRP3T8rdkL6VRG/s640/meso_america_before.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of Meso-American indigenous languages, source unknown</i></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtC_5zBHbkoyR-NNishCuh0TEs3BTc-Wgl1BsyMFggOG8QM_yGKQfcJv8ZWlCCZdRMferLBvTo9pwweaGuAWw0ZvlWhYPB62lSjTcb4nZTbmgmigfvJYznO-rnMihSkHfOmnCpeJ7USlGb/s1600/Mesoamarica.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="922" data-original-width="1433" height="410" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtC_5zBHbkoyR-NNishCuh0TEs3BTc-Wgl1BsyMFggOG8QM_yGKQfcJv8ZWlCCZdRMferLBvTo9pwweaGuAWw0ZvlWhYPB62lSjTcb4nZTbmgmigfvJYznO-rnMihSkHfOmnCpeJ7USlGb/s640/Mesoamarica.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of Meso-American indigenous languages </i><i>from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a></i><i> </i></div>
<div>
<br /></div>
</div>
<div>
<div>
The same applies to these next maps of South-America. The first map (kindly brought to my attention by Olga Krasnoukhova) presents the situation at some point in the past, the second presents a more contemporary view, showing the rate at which minority languages are dwindling and dying out.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdNuMd04DMqZUE6xcsTTsY4Z8HjABzbAbBziwloo2yBckY7kwnnCIpkcd74P_fScoZfMvjT07gG5qWQ9AVe9j8q7sF7ZabQW0p2HLSue39oc58foM4469aXIXL6DdbvcjmcG3R1eV-lXba/s1600/south+america+Loukotka+map+1967.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1237" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdNuMd04DMqZUE6xcsTTsY4Z8HjABzbAbBziwloo2yBckY7kwnnCIpkcd74P_fScoZfMvjT07gG5qWQ9AVe9j8q7sF7ZabQW0p2HLSue39oc58foM4469aXIXL6DdbvcjmcG3R1eV-lXba/s640/south+america+Loukotka+map+1967.png" width="494" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Ethnolinguistic map of South America by Loukotka (1968)</i></div>
</div>
</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdz4-AN-RMSRqtCvSXudUkGaJn7rPyo-0y4Nh9JkTNWsymGEf7guo1W5soSmKzH9rKooSjGAIyTpNnDQnjVEF7ItL8OnkR2Gjt5fY676yIWNRpbwC35DGiIhyEjiDe3PupTgqHrNtW29rv/s1600/Southamer+present.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1059" data-original-width="792" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdz4-AN-RMSRqtCvSXudUkGaJn7rPyo-0y4Nh9JkTNWsymGEf7guo1W5soSmKzH9rKooSjGAIyTpNnDQnjVEF7ItL8OnkR2Gjt5fY676yIWNRpbwC35DGiIhyEjiDe3PupTgqHrNtW29rv/s640/Southamer+present.jpg" width="478" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<i>Map of South American indigenous languages </i><i>from <a href="https://mail.languagesgulper.com/eng/Home.html" target="_blank">The Language Gulper</a></i><i> </i></div>
<br />
I hope you enjoyed this trip around the world, limited though these maps may be. Please post better maps in the comments. If this post is a success, I will devote my next blog to isolates, languages with no known relatives. Enjoy!<br />
<br />
<b>References</b><br />
Bellwood, Peter. (2013). <i>First migrants: Ancient migration in global perspective. </i>Chichester: Wiley Blackwell.<br />
<br />
Gray, RD, A J Drummond, and S J Greenhill. 2009. “Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement.” Science 323 (5913): 479–483. https://doi.org/10.1126/science.1166858.<br />
<br />
Fortescue, Michael (2011). "The relationship of Nivkh to Chukotko-Kamchatkan revisited". Lingua. 121 (8): 1359–1376. doi:10.1016/j.lingua.2011.03.001.<br />
<br />
Kolipakam, Vishnupriya, Fiona M. Jordan, Michael Dunn, Simon J. Greenhill, Remco Bouckaert, Russell D. Gray, and Annemarie Verkerk. 2018. “A Bayesian Phylogenetic Study of the Dravidian Language Family.” Royal Society Open Science 5: 171504.<br />
<br />
Krauss, Michael E. (1988). Many Tongues - Ancient Tales, in William W. Fitzhugh and Aron Crowell (eds.) Crossroads of Continents: Cultures of Siberia and Alaska (pp. 144-150 ). Smithsonian Institution.<br />
<br />
Loukotka, Čestmír. (1968). Johannes Wilbert, ed. Classification of South American Indian languages. Los Angeles: Latin American Center, University of California.<br />
<br />
Mithun, Marianne. (1999). The languages of Native North America. Cambridge: Cambridge University Press.<br />
<br />
Robbeets, Martine, and Remco Bouckaert. (2018). “Bayesian Phylolinguistics Reveals the Internal Structure of the Transeurasian Family.” Journal of Language Evolution 3 (2): 145–62. https://doi.org/10.1093/jole/lzy007.<br />
<br />
Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill, and Johann-Mattis List. 2019. “Dated Language Phylogenies Shed Light on the Ancestry of Sino-Tibetan.” Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.1817972116.<br />
<br />
Sidwell, Paul. (2009). Classifying the Austroasiatic languages: History and state of the art. München: LINCOM.<br />
<div>
<br /></div>
<div>
<br /></div>
<br />
<br />
<br />
<br />
<br />
<br />
<i>Posted by <a href="http://www.blogger.com/profile/14747297526182358435">Annemarie Verkerk</a></i><br />
<br />
<b><span style="font-size: large;">ALT2019 conference report</span></b><br />
<i>Published 2019-09-13, updated 2019-09-26</i><br />
<br />
Two weeks ago, the <a href="https://sites.google.com/universitadipavia.it/alt2019/home" target="_blank">13th Conference of the Association for Linguistic Typology</a> (ALT) took place in Pavia, Italy. As the name says, this is the main gathering for members of the <a href="https://linguistic-typology.org/" target="_blank">Association for Linguistic Typology</a>, and it's on a different continent every two years. It just happened to be in Europe as I was ready to go conferencing again (now dragging two kids in tow) so that was lucky.<br />
<br />
I like ALT a lot because I can go to basically any talk and find myself interested in it. There are hardly any talks or posters where I am disappointed because it isn't really my cup of tea - it's all typology so everything is my cup of tea :)! It is where the humans who read grammars gather. This year, ALT was paired with a <a href="https://lude.lakecomoschool.org/" target="_blank">summer school</a> on 'Language universals and language diversity in an evolutionary perspective', which I would have loved to attend (but, kids).<br />
<br />
For the first time in history (as far as we could find), ALT offered child care. About 5 attendees made use of this (and so the next generation of linguists are already networking ;)), in my case it really helped to attend some talks and give our own. Unfortunately I couldn't attend as many talks as I wanted, but as a logistic experiment it was mostly a success. Below I'll feature some talks I attended and others I wanted to attend but didn't, so you can read a bit about the latest & upcoming work in typology.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE-ILx6pWZj5i6pU364wfd_aOifqSTl90ozsB8PwVJDGZaiKqj0vMHyZ7sPVJQ5tqj6e5R_9iVS_3_2PcUMFZ-ChZLM4eWkGe6SprWH_gwsmQJCmYcVK8oj7e9lluuNy5yUv4ksWFg0aGg/s1600/IMG_8836.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE-ILx6pWZj5i6pU364wfd_aOifqSTl90ozsB8PwVJDGZaiKqj0vMHyZ7sPVJQ5tqj6e5R_9iVS_3_2PcUMFZ-ChZLM4eWkGe6SprWH_gwsmQJCmYcVK8oj7e9lluuNy5yUv4ksWFg0aGg/s400/IMG_8836.jpg" width="400" /></a></div>
<br />
<br />
The first talks I managed to attend were those by Kirsten Culhane ('A typology of consonant/zero alternations') and Erich Round ('Canonical phonology'), both part of the workshop on 'Current research in phonological typology'. Culhane's talk argued for a more typologically informed analysis of consonant insertions and deletions, especially considering phonological and morphological conditions. Round's talk explained in detail why phonologists invariably diverge in their analyses of particular aspects of phonology, and how this can be avoided using a canonical approach.<br />
<br />
Later that day, I wanted to attend Denis Creissels' talk on 'Cross-linguistic tendencies in the encoding of experiencers in the languages of Sub-Saharan Africa, and possible typological correlations' but I had to leave because the older kid wouldn't shut up - which several people found very funny.<br />
<br />
The final day of the conference I was finally able to see some more talks. First, Kilu von Prince et al. ('Realis and irrealis in Oceanic'), who argued how the realis vs. irrealis distinction is relevant in Oceanic and probably also outside it (see <a href="http://kiluvonprince.de/realis-and-irrealis-at-alt-pavia/" target="_blank">here</a> for the slides). Then, Jeff Good et al. with a more methodological talk on 'Individual-based socio-spatial networks as a tool for areal typology'. They presented extremely fine-grained data on language competence of individuals in a highly multilingual region, integrating linguistic, social, and geographic data (see picture below). Then, Dmitry Idiatov and Mark van de Velde ('Single feature approach to linguistic areas: labial-velars and the prehistory of the Macro-Sudan Belt') spoke about how labial-velar stops might be a characteristic of the now disappeared indigenous languages of West and Central Africa, whose speakers have shifted to various Niger-Congo languages.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhm9cAt-ghQPRxJwZpKOQNXiddZtdGIuF40mtCWCc8dgqOlJsViR-WobHaXbUtOUKM_Zo9asUkSg8L6nHUz7rhqHN0jmKOpEKVbT7GqjC6B54vrpG58owQ12K5m3hDROMWdO9o447PzT1rD/s1600/IMG_8882.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhm9cAt-ghQPRxJwZpKOQNXiddZtdGIuF40mtCWCc8dgqOlJsViR-WobHaXbUtOUKM_Zo9asUkSg8L6nHUz7rhqHN0jmKOpEKVbT7GqjC6B54vrpG58owQ12K5m3hDROMWdO9o447PzT1rD/s400/IMG_8882.jpg" width="300" /></a></div>
<br />
Then it was time for our own talk ('Testing Greenberg’s universals on a global scale'), which was suffering a bit in attendance because in one of the parallel sessions, Nikolaus Himmelmann was speaking about 'Against trivialising linguistic description, and comparison'. In the abstract he had written '<i>In fact, Haspelmath’s approach to comparative concepts trivialises crosslinguistic comparison by elevating the pragmatic approach to grammatical comparison apparently required when compiling resources such as the WALS (Dryer & Haspelmath 2013) to the only proper methodology in crosslinguistic comparison. There are other, more rigorous and methodologically superior approaches to comparison</i>, ...' so I guess people went to see what would happen during question time as Martin Haspelmath was attending. I am told there was some interesting discussion.<br />
<br />
I missed a lot of cool new talks :(, in alphabetical order:<br />
<br />
Chundra Cathcart et al.'s talk on numeral classifiers and plural marking in Indo-Iranian, showing that there is some evidence for the hypothesis that numeral classifiers develop more often in languages without plural marking;<br />
Francesca Di Garbo's talk showing that in Cushitic and beyond, plural agreement can be dependent on lexical-semantic properties of the noun;<br />
Jessica Ivani & Taras Zakharko's presentation of <a href="https://github.com/jkivani/tymber" target="_blank">Tymber</a>;<br />
Gerhard Jäger's talk on Differential Object Marking and Differential Subject Marking investigated using hierarchical Bayesian modelling. This can be seen as a follow-up to work by <a href="https://www.degruyter.com/viewbooktoc/product/247734" target="_blank">Balthasar Bickel et al.</a> and <a href="https://www.oapen.org/download?type=document&docid=1001675#page=517" target="_blank">Karsten Schmidtke-Bode & Natalia Levshina</a> that is interesting to follow because all three author sets use different methods and have different outcomes;<br />
Olga Krasnoukhova & Johan van der Auwera's talk on the diachrony of a rather curious source of standard negation in certain languages;<br />
Natalia Levshina's talk on the range (narrow vs. wide) that basic grammatical relations have and how this range can be investigated using corpora, showing that Finnish is the most extreme 'tight-fit' language, while Chinese and English are the most extreme 'loose-fit';<br />
Ilja Seržant's talk on the lengths of person-number affixes of verbs, finding no evidence for Givón's cycle (where indexes erode through phonological attrition and new indexes are formed from free personal pronouns);<br />
Manuel Widmer et al.'s talk on the evolution of hierarchical person-marking systems in Tupian and Sino-Tibetan, showcasing the differences and commonalities of these systems in the two families.<br />
<br />
Another thing I missed was the business meeting, which was sad because they are usually quite enjoyable - so now I don't know where ALT will be in two years' time. If you do, please post a comment! Thanks to all involved for hosting a great conference.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtkZ8zU_aa5Em8ZVEotgziic36kVkme0M6L5NWOu9-XVMGAbN5d7CAZ1SufJRmVqMh2585LMiMK9xtKIt2nyg1ou6lStnScJ8S44VDkN5NNjlR2nWi31Ss8vqyFIy-ShyRy_XXTAhZ3TW7/s1600/IMG_8867.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtkZ8zU_aa5Em8ZVEotgziic36kVkme0M6L5NWOu9-XVMGAbN5d7CAZ1SufJRmVqMh2585LMiMK9xtKIt2nyg1ou6lStnScJ8S44VDkN5NNjlR2nWi31Ss8vqyFIy-ShyRy_XXTAhZ3TW7/s400/IMG_8867.jpg" width="300" /></a></div>
<br />
<br />
<br />
<i>Posted by <a href="http://www.blogger.com/profile/14747297526182358435">Annemarie Verkerk</a></i><br />
<br />
<b><span style="font-size: large;">My ELAN workflow for segmenting and transcription</span></b><br />
<i>Published 2019-07-26</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
Hello everyone,<br />
<br />
Hedvig here. I'm currently writing up my PhD thesis, hence the lack of writing here. Hopefully I'll be able to pick it up after submission; there are a lot of drafts lying on Blogger waiting for completion. If you really, really miss <i>me</i> in particular, you could listen to me ramble at <a href="https://talkthetalkpodcast.com/">Talk the Talk - a weekly show about linguistics.</a><br />
<br />
Now that the shameless plug and excuses are done with, let's get down and talk about:<br />
<br />
<div style="text-align: center;">
<span style="color: #660000; font-family: inherit; font-size: x-large;"><b><u><i>THE TRANSCRIPTION CHALLENGE!</i></u></b></span><br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy8HUZxuxVyZmf5QbX3Wophu0XMJ7pHMVN0RzS5lbH_O1wnuXgdYqKffMf0v8AVhdolp9yt8gE6cUKsyepd8Lg7WNXDzmqVvO-OHRmkl43HgEilzz2ottZt1Aho_LI38_WPSUITpbd4uiL/s1600/IMG_20170503_170932.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="300" data-original-width="400" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy8HUZxuxVyZmf5QbX3Wophu0XMJ7pHMVN0RzS5lbH_O1wnuXgdYqKffMf0v8AVhdolp9yt8gE6cUKsyepd8Lg7WNXDzmqVvO-OHRmkl43HgEilzz2ottZt1Aho_LI38_WPSUITpbd4uiL/s320/IMG_20170503_170932.jpg" width="320" /></a></div>
<br /></div>
</div>
In this blog post, I will focus on a part of this challenge¹ - the workflow for segmenting and transcribing audio material. This is a rough guide, if it turns out people appreciate something like this I'll re-write it more thoroughly. This is a bit sloppily written in places, but trust me - if I do this "properly" right now I will lose days of work time that I should be spending on my thesis... so, I'll only do it if people really want it - and I might wait a while until I do. Sorry, but it is what it is.<br />
<br />
Anyone who has done fieldwork that involves interviews, be they video or audio, will know how time consuming it can be to segment and transcribe data.<br />
<br />
<div class="page" title="Page 2">
<div class="layoutArea">
<div class="column">
<div style="text-align: right;">
<span style="font-family: inherit;"><i>Estimates of the factor involved here vary, depending on recording quality, the number of speakers involved, etc. Factors smaller than 10 (i.e. ten minutes are necessary to transcribe and translate one minute of recording) are rarely mentioned, and factors as high as 150 and higher are not unrealistic in the case of complex multiparty conversations. </i>(Himmelmann 2018:34)</span></div>
</div>
</div>
</div>
<br />
That's a lot of time, and often there is no way around it, in particular if you're dealing with a language that has little description.<br />
<br />
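To put numbers on the factors quoted above, here is a small Python sketch. The function name and the 60-minute recording are only illustrative assumptions; the factors 10 and 150 are the ones from the Himmelmann quote.<br />

```python
def transcription_hours(recording_minutes: float, factor: float) -> float:
    """Hours of work if each recorded minute takes `factor` minutes to process."""
    return recording_minutes * factor / 60


# Illustrative case: a single one-hour recording at the low and high factors
# mentioned in the quote.
for factor in (10, 150):
    hours = transcription_hours(60, factor)
    print(f"factor {factor}: about {hours:.0f} hours of work for 60 minutes of audio")
```

Even at the low end, a single hour of recording is more than a full working day of transcription.<br />
<br />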
This challenge isn't only relevant to linguists; it also pertains to anthropologists, historians, journalists and others who need transcription. Journalists and historians often interview people in major languages like English or Spanish, for which there's a tonne of automatic transcription software out there. There's so much that Adobe has even developed what they call "<a href="https://www.youtube.com/watch?v=eGs11gujRjE">Photoshop for Audio</a>" alongside their transcription services.<br />
<br />
There even exist initiatives to bring this kind of automatic transcription technology to smaller languages. <a href="https://www.dynamicsoflanguage.edu.au/news-and-media/latest-headlines/article/?id=transcription-acceleration-project-tap-update">Check out the Transcription Acceleration Project and their tool Elpis here</a>. But even Elpis needs to start with some manually transcribed audio, some training data. So, how do we get nice transcribed data in a timely fashion?<br />
<br />
Most linguists who do fieldwork start out using <a href="https://tla.mpi.nl/tools/tla-tools/elan/">ELAN</a> for transcription. ELAN is free software from The Language Archive that's fairly easy to use and provides a large number of functions for segmenting and transcribing your data, both audio and video. ELAN is great, don't get me wrong, and this guide will be based on using ELAN. However, the program has a lot of different options and people use it very differently - this can be overwhelming for beginners and it can be difficult to figure out how to optimise it for what you need to do.<br />
<br />
Different linguists often develop their own "ELAN-style", and since the workflow (and often also the transcription data itself) isn't shared with people outside of your project, there is little dissemination of these different ELAN-styles. Some people have even described learning ELAN as an apprentice-type system, where you may learn the ins and outs by working for someone else first before you start on your own data. If you're attending a linguistic fieldwork class that teaches ELAN, you'll probably be introduced either to your instructor's personal ELAN-style, or one of the styles that TLA suggests <a href="https://tla.mpi.nl/wp-content/uploads/2017/01/How-to-pages_9.pdf">in their manuals</a>. That can be great and if it's working well for you, awesome! However, it may be that there is some fat to trim off your current ELAN-workflow. I'll share a basic outline of my workflow here, and perhaps you'll find some trick that can improve your workflow too!<br />
<br />
<b><span style="font-size: large;">My ELAN workflow</span></b><br />
<b>Main take-away: </b>you don't need to segment by hand and you <u>don't need to listen through the recording several times for each speaker in order to get speaker separated tiers</u>. The fact that you can export (and import) your ELAN transcription into regular tsv-files can save you a LOT of time and energy.<br />
<b><br /></b>
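As a rough illustration of why plain tsv-files are so handy, here is a minimal Python sketch that reads tab-separated annotation rows with the standard library. The column names (tier, begin_ms, end_ms, text) and the sample rows are invented for this example; the real layout of an ELAN export depends on the options you pick when exporting, so treat this as a sketch to adapt rather than a description of ELAN's output.<br />

```python
import csv
import io

# Invented sample standing in for a tab-delimited export. The column names
# are assumptions made for this example, not a fixed ELAN layout.
SAMPLE_EXPORT = (
    "tier\tbegin_ms\tend_ms\ttext\n"
    "segmentation by utterance\t0\t1850\tfirst utterance\n"
    "segmentation by utterance\t1850\t4200\tsecond utterance\n"
    "segmentation by utterance\t4200\t6000\tthird utterance\n"
)


def read_annotations(tsv_text):
    """Parse tab-separated annotation rows into a list of dictionaries."""
    # QUOTE_NONE keeps quotation marks inside transcriptions as plain text.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        {
            "tier": row["tier"],
            "begin_ms": int(row["begin_ms"]),
            "end_ms": int(row["end_ms"]),
            "text": row["text"],
        }
        for row in reader
    ]


annotations = read_annotations(SAMPLE_EXPORT)
total_ms = sum(a["end_ms"] - a["begin_ms"] for a in annotations)
print(len(annotations), "annotations covering", total_ms / 1000, "seconds")
```

Once the annotations are ordinary rows like this, you can sort, filter and edit them in a spreadsheet or a script instead of clicking through them one by one.<br />
<br />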
<b>Caveat: </b>This guide will be rather schematic; if it turns out that this is useful for people I can develop it in more detail later. <u><span style="color: #741b47;">If you want that to happen, drop a comment on the blogger-blogpost.</span></u> I have actually basically already described this workflow in two separate blogposts, I'm just bringing them together here for a start-to-finish-flow.<br />
<br />
<b>Assumptions: </b>you have audio and/or video files of semi-natural conversations where most of the time one person is talking at a time, even if there is some overlap. You want to have it segmented into intonational units, transcribed, translated and you want to separate out who is speaking when. <a href="https://tla.mpi.nl/wp-content/uploads/2017/01/How-to-pages_9.pdf">You have downloaded and installed ELAN and mastered how to create ELAN files and associate them with audio/video-files.</a><br />
<br />
<b>Don't worry about separate speaker/signer tiers: </b>In this workflow, we're going to start out with transcribing all speakers/signers on one tier. If you want them separated out into different tiers, we have an option for that later. Don't worry, it'll be fine. If you have a large amount of overlapping speech or sign utterances and you want them all transcribed separately, you can still use this guide but you'll have to go over the steps for each speaker/signer/articulator. If that is the case, this guide may not be that much more effective than what you're already doing, but let me know if it is.<br />
<br />
<b>Caveat 2: </b>I don't make use of "tier types" and their attributes at all in my ELAN-use. I just use the basic time-aligned default tier type. I haven't yet encountered a situation where I really need tier types. It may be that the project you're in cares about tier types. If so, do make sure that you obey those policies. If not, don't worry about it.<br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit; font-size: large;"><b>The steps</b></span><br />
<b>1) Create two tiers, call them:</b><br />
<br />
<ul style="text-align: left;">
<li>segmentation by utterance</li>
<li>larger segments (optional)</li>
</ul>
<br />
<b>2) Make sure you know how to switch between <a href="https://www.mpi.nl/corpus/html/elan/ch03.html">different modes in your version of ELAN on your OS</a>. </b>We're going to be using the annotation, segmentation and transcription modes.<br />
<br />
<b>3) Segment the "segmentation by utterance" tier into empty annotations (I'll call this the "empty chunks" tier). Either:</b><br />
a) Automatic segmenting via PRAAT (<a href="https://yammeringon.wordpress.com/2017/05/01/elanpraat-machine-segmenting/">see blogpost here</a>)<br />
<br />
b) the "<a href="https://www.mpi.nl/corpus/html/elan/ch03s04.html#id358694">one keystroke per annotation; the end time of one annotation is the begin time of the next. This creates a chain of adjacent annotations</a>" segmenting option in ELAN.<br />
<br />
Tap whenever you think an intonational utterance has reached its end. If there are pauses, just tap it into smaller chunks. Annotations with silences aren't a big problem, they will just have no transcription in them later so we can remove them automatically then if need be. They can be a bit annoying, but they're not a major problem really.<br />
<br />
<a href="https://www.mpi.nl/corpus/html/elan/ch01s05s16.html">You may want to adjust the playback speed while segmenting or transcribing.</a> If someone is talking very slowly and going through an elicitation task with clear pauses, you may be able to segment at a higher speed.<br />
<br />
Trivia: it seems like intonational units are quite easy for humans to detect, so much so that <a href="https://www.cambridge.org/core/journals/phonology/article/on-the-universality-of-intonational-phrases-a-crosslinguistic-interrater-study/D0E198FAEA5B7D172F49320F58D0E74F">speakers of German were able to fairly successfully segment Papuan Malay despite not knowing any Malay</a>.<br />
<br />
<b>4) Larger segments-tier </b><br />
If you have several events happening in one recording (say a consent confirmation, a wordlist and a narrative), then you may want to keep track of this during step 3. Either select to only chunk the events you need, or at least make note separately on a piece of paper when an event started and ended if you're using 3b. Use that information to create really long annotations in the larger segments tier for each of the events. Alternatively, use the information in the transcription tier later to generate annotations in the larger segments-tier, for example if you know the first and last word of the wordlist you're using.<br />
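<br />
The "first and last word" idea could be sketched in Python roughly like this. It is only a minimal sketch under assumptions: the rows are taken from the example table in step 8a, the keys <code>Begin</code>, <code>End</code> and <code>Transcription</code> are made-up column names, and the words "tane" and "namu" are just picked for illustration as the first and last item of a wordlist.<br />

```python
# Hypothetical rows, in time order, as they might look once the transcription
# tier is filled in. The keys "Begin", "End" and "Transcription" are assumptions.
rows = [
    {"Begin": "2.449", "End": "4.072", "Transcription": ""},
    {"Begin": "1078.76", "End": "1083.703",
     "Transcription": "o se tane lelei e fa'a fa'aaloalo le a:va"},
    {"Begin": "1083.703", "End": "1086.663", "Transcription": ".. tane tane tane"},
    {"Begin": "1098.695", "End": "1104.038",
     "Transcription": "... o namu e pepesi ai fa'ama'i na"},
]


def find_event_span(rows, first_word, last_word):
    """Return (begin, end) in seconds, from the first annotation containing
    first_word to the last annotation containing last_word, or None."""
    begin = None
    end = None
    for row in rows:
        words = row["Transcription"].split()
        if begin is None and first_word in words:
            begin = float(row["Begin"])
        if begin is not None and last_word in words:
            end = float(row["End"])
    if begin is None or end is None:
        return None
    return begin, end


span = find_event_span(rows, "tane", "namu")
print(span)
```

The two numbers that come out are the begin and end time for one long annotation on the larger segments tier. Matching on whole words is a simplification; a real wordlist may need something smarter.<br />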
<br />
<b>5) Make copies of the segmentation by utterance-tier with empty annotations and call them</b><br />
<ul style="text-align: left;" type="disc">
<li>Transcription</li>
<li>Translation</li>
<li>Speaker/signer/articulator<span style="color: #191e3f; text-align: center;">²</span></li>
<li>Comment</li>
</ul>
<div>
These will be exactly time aligned with each other, and <b><i><u>this is important.</u></i></b> Make sure that any obvious goofs in the empty chunks tier are taken care of before you duplicate it.</div>
<div>
<br /></div>
<div>
Keep the empty tier around, you might need it later.</div>
<div>
<br /></div>
<b>6) Transcription. </b>Switch to transcription-mode. <a href="https://www.mpi.nl/corpus/html/elan/ch03s03s02.html">Show only the 4 tiers from step </a>5.<br />
<br />
If you have different people transcribing and translating, select only the tiers that are relevant for each person. <a href="https://www.mpi.nl/corpus/html/elan/ch03s03s07.html">Turn on automatic playback and loop mode.</a> Make sure that each person has their own comment tier, and encourage them to write things there while they're transcribing if there is something they want to quickly note.<br />
<br />
Make sure you have set clear rules for how to deal with false starts, humming, laughter, <a href="https://en.wikipedia.org/wiki/Backchannel_(linguistics)">backchanneling</a> noises etc. Do you want all of those transcribed? If so, do you have <a href="http://ca-tutorials.lboro.ac.uk/notation.htm">shorthand symbols</a> for them? Make sure you're clear about this early on, especially if you have multiple people working on transcribing the data.<br />
<br />
In the speaker/signer/articulator tier put down the appropriate initials of the person/articulator.<br />
<br />
Since I don't use tier-types, I can't use the column mode. I don't really mind, but if you prefer using the column set-up then you need to assign the 4 different tiers to different tier types.<br />
<br />
If you only want to transcribe a certain event, either chunk only that event in step 3b and not the others, or go back to annotation mode, write "blubbi" in the first segment on the transcription tier within that event, switch to transcription mode and scroll down until you see "blubbi". Not the most elegant solution, but hey it works.<br />
<br />
Leave the silence annotations entirely blank.<br />
<br />
<b>7) Overlaps!</b><br />
Now, you may have overlapping speech/gesture/sign at times. The first thing you need to do is ask yourself this question: do you really need to have all of the overlaps separately transcribed? For example, if it's very hard to make out what one person is saying in the overlapping speech, how valuable is it to you to attempt to transcribe it? It may very well be that the answer is "yes" and "very valuable", and that's all good. Just make sure that this is indeed the case before you go on.<br />
<br />
It is entirely possible that you don't want to transcribe instances of overlapping utterances. If that is the case you can stop here and just leave your file in the state it is in. You can still tease out who is speaking when. The main reason to separate out speakers into separate tiers is to handle overlap, and if you don't care about that you can actually just stick with having all speakers merged on one tier. It will actually probably be easier for you in the long run. I don't do step 8 and 9 normally, but I have figured out how to do them so that if I ever wanted to/was made to - I can separate out speakers.<br />
<br />
If you do want to tease them out here's what you do. During step 6 put down the initials of all the people talking at the overlapping annotation in the speaker/signer/articulator tier, write "overlap" in comment tier and leave the other 2 tiers blank. That's it, for now.<br />
<br />
<b>8) Separating out the tiers into separate for speaker/signer/articulator</b><br />
<b><br /></b>
Now we should have an eaf file with transcription & translation for all of the non-overlapping annotations, including information about which person is associated with which annotations and where there is overlap (and who speaks/signs in that overlap).<br />
<b><br /></b>
What we're going to do now is basically make a <a href="http://humans-who-read-grammars.blogspot.com/2017/08/elan-making-tiers-out-of-search-results.html">slimmed-down version of what we did here</a>. In that guide, we did a clever search within ELAN, exported the results of exactly that search only and imported those results as a new tier. The new tier was merged into the old transcription document, and voila we've got an extra new tier with only the search results. This is useful for example if you want to listen through only words transcribed with [ts] clusters to see if they are indeed realised as [ts] or sometimes as [t]. The same principle also works here where we want to separate out annotations associated with certain people.<br />
<br />
We're going to<br />
a) export all of the tiers and all annotations<br />
b) make copies of the exported files and prune each of them to only the annotations that pertain to a certain speaker<br />
c) import those files as new transcription documents<br />
d) merge those with the original file<br />
<br />
<b>a) export transcription</b><br />
Within ELAN, export the entire transcription document as a tab-delimited text file. You do this under File > Export as... > Tab-delimited text file. Tick "separate columns for each tier".<br />
<br />
<br />
<div>
</div>
<br />
<div>
<div style="margin: 0px;">
Name your file something sensible, and put it in a good place. The file will have the file-extension ".txt", but it is a tab-separated file (".tsv"). Rename the file so that the suffix is ".tsv". Open the file in some spreadsheet program (Excel, Numbers, LibreOffice, Google Sheets, etc). I recommend LibreOffice, because it lets you explicitly set what the delimiters and encoding are, whereas Excel makes a bunch of decisions for you that may not be ideal.</div>
<div style="margin: 0px;">
<br /></div>
<div style="margin: 0px;">
Now, since your annotations are time aligned we get them all on the same row. Here's a little example of what it looks like in my data:</div>
<div style="margin: 0px;">
<br /></div>
<table border="0" cellspacing="0" style="color: black; font-family: "Liberation Sans"; font-size: x-small;"><colgroup span="12" width="101"></colgroup><tbody>
<tr><td align="left" height="20">Begin Time - hh:mm:ss.ms</td><td align="left">Begin Time - ss.msec</td><td align="left">End Time - hh:mm:ss.ms</td><td align="left">End Time - ss.msec</td><td align="left">Duration - hh:mm:ss.ms</td><td align="left">Duration - ss.msec</td><td align="left">Larger segments</td><td align="left">Segmentation by utterance</td><td align="left">Speaker</td><td align="left">Transcription</td><td align="left">Translation</td><td align="left">Comments</td></tr>
<tr><td align="left" height="20">00:17:56.450</td><td align="left">1076.45</td><td align="left">00:20:56.785</td><td align="left">1256.785</td><td align="left">00:03:00.335</td><td align="left">180.335</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:22:28.400</td><td align="left">1348.4</td><td align="left">00:27:54.900</td><td align="left">1674.9</td><td align="left">00:05:26.500</td><td align="left">326.5</td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:00:02.449</td><td align="left">2.449</td><td align="left">00:00:04.072</td><td align="left">4.072</td><td align="left">00:00:01.623</td><td align="left">1.623</td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:17:58.760</td><td align="left">1078.76</td><td align="left">00:18:03.703</td><td align="left">1083.703</td><td align="left">00:00:04.943</td><td align="left">4.943</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">M</td><td align="left">o se tane lelei e fa'a fa'aaloalo le a:va</td><td align="left">a good husband repect his wife</td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:18:03.703</td><td align="left">1083.703</td><td align="left">00:18:06.663</td><td align="left">1086.663</td><td align="left">00:00:02.960</td><td align="left">2.96</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">T</td><td align="left">.. tane tane tane</td><td align="left">husband</td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:18:06.663</td><td align="left">1086.663</td><td align="left">00:18:09.055</td><td align="left">1089.055</td><td align="left">00:00:02.392</td><td align="left">2.392</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">M</td><td align="left">o le ga . koe faikau uma a</td><td align="left">is that it . read all of this</td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:18:09.055</td><td align="left">1089.055</td><td align="left">00:18:16.263</td><td align="left">1096.263</td><td align="left">00:00:07.208</td><td align="left">7.208</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">M</td><td align="left">o fea le le manu .. na ou va'ai ai ananafi</td><td align="left">where is the bird i see yesterday</td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:18:16.263</td><td align="left">1096.263</td><td align="left">00:18:18.695</td><td align="left">1098.695</td><td align="left">00:00:02.432</td><td align="left">2.432</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">M</td><td align="left">manu manu manu</td><td align="left">bird</td><td align="left"><br /></td></tr>
<tr><td align="left" height="20">00:18:18.695</td><td align="left">1098.695</td><td align="left">00:18:24.038</td><td align="left">1104.038</td><td align="left">00:00:05.343</td><td align="left">5.343</td><td align="left">Heti's spectial wordlist</td><td align="left"><br /></td><td align="left">M</td><td align="left">... o namu e pepesi ai fa'ama'i na</td><td align="left">mosquito spread disease</td><td align="left"><br /></td></tr>
</tbody></table>
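<div style="margin: 0px;">
If you'd rather poke at the export with a script than a spreadsheet, here is a minimal Python sketch of reading such a file and dropping the silent rows (the ones we left entirely blank). It rests on assumptions: the short column names <code>Begin</code> and <code>End</code> are stand-ins for the time columns, the sample text stands in for a real export, and I'm assuming the export doesn't wrap cells in quotation marks. Check all of this against your own file.</div>

```python
import csv
import io

# Stand-in for an exported file. "Begin" and "End" are simplified, assumed
# column names; the tier columns follow the example table above.
SAMPLE_EXPORT = (
    "Begin\tEnd\tSpeaker\tTranscription\tTranslation\tComments\n"
    "2.449\t4.072\t\t\t\t\n"
    "1078.76\t1083.703\tM\to se tane lelei e fa'a fa'aaloalo le a:va"
    "\ta good husband repect his wife\t\n"
    "1083.703\t1086.663\tT\t.. tane tane tane\thusband\t\n"
)

TIER_COLUMNS = ["Speaker", "Transcription", "Translation", "Comments"]


def read_rows(handle):
    """Parse tab-separated text into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def is_silence(row):
    """A row counts as silence when every tier column is blank."""
    return all(not (row.get(column) or "").strip() for column in TIER_COLUMNS)


rows = read_rows(io.StringIO(SAMPLE_EXPORT))
kept = [row for row in rows if not is_silence(row)]
print(len(rows), "rows read,", len(kept), "kept")
```

<div style="margin: 0px;">
For a real file you would pass <code>open(path, newline="", encoding="utf-8")</code> instead of the <code>io.StringIO</code> stand-in. Rows marked "overlap" in the comments survive this filter even though their transcription is blank, which is what we want for step 9.</div>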
<div style="margin: 0px;">
<br /></div>
<div style="margin: 0px;">
<b>b) filter the rows</b></div>
<div style="margin: 0px;">
Now, by just using the simple filter functions in most spreadsheet programs, we can make new files that only contain the rows with certain speakers in them. Make a few copies of your tsv file, call them "speaker x", "speaker y" etc. In each of those, filter for all of the rows you want to delete, and delete them - leaving only the rows with the relevant speaker. In the example below, I'm filtering for all the rows where the speaker isn't "M" and deleting those.</div>
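<div style="margin: 0px;">
The same filtering can also be sketched in Python instead of a spreadsheet. This is a minimal sketch, not a tested pipeline: the column names <code>Begin</code> and <code>End</code> are assumed stand-ins for the time columns, and the helper names are made up for this example.</div>

```python
import csv
import io

# Stand-in for the exported table; "Begin" and "End" are assumed column names.
SAMPLE_EXPORT = (
    "Begin\tEnd\tSpeaker\tTranscription\tTranslation\n"
    "1083.703\t1086.663\tT\t.. tane tane tane\thusband\n"
    "1086.663\t1089.055\tM\to le ga . koe faikau uma a\tis that it . read all of this\n"
    "1096.263\t1098.695\tM\tmanu manu manu\tbird\n"
)


def rows_for_speaker(rows, speaker, speaker_col="Speaker"):
    """Keep only the rows whose speaker cell is exactly this speaker."""
    return [row for row in rows if row[speaker_col].strip() == speaker]


def write_tsv(rows, fieldnames, handle):
    """Write rows back out as tab-separated text with a header line."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


reader = csv.DictReader(io.StringIO(SAMPLE_EXPORT), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
all_rows = list(reader)

speaker_m = io.StringIO()  # swap for open("speaker_M.txt", "w", ...) on real data
write_tsv(rows_for_speaker(all_rows, "M"), reader.fieldnames, speaker_m)
print(speaker_m.getvalue())
```

<div style="margin: 0px;">
Run it once per speaker to get one pruned file each. Exact matching on the speaker cell means rows with several initials (the overlaps) are left out here; those are dealt with separately in step 9.</div>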
<div style="margin: 0px;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHpVOHObfxQQPMZUTNOPdFDeb0vB-YZnJiWKTB2bObYvv_OuzjybK4FUpBevpyUaiAPCPhv91dY8Y40QUEzDWv3qCE3JTToPZCfwWc9ZqRn5jSvPRijvRqynH_RdEu8MhXjT8j8srPn7Mu/s1600/dfgadfg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="540" data-original-width="643" height="268" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHpVOHObfxQQPMZUTNOPdFDeb0vB-YZnJiWKTB2bObYvv_OuzjybK4FUpBevpyUaiAPCPhv91dY8Y40QUEzDWv3qCE3JTToPZCfwWc9ZqRn5jSvPRijvRqynH_RdEu8MhXjT8j8srPn7Mu/s320/dfgadfg.png" width="320" /></a></div>
<div style="margin: 0px;">
<br /></div>
<div style="margin: 0px;">
<br /></div>
<div style="margin: 0px;">
<br /></div>
<div>
<b>c) import filtered tiers into ELAN</b></div>
<div>
Now we go back to ELAN and we import the files as tiers. What will happen here is that an entirely new .eaf-file will be created; the tier will not actually be imported directly into whichever file you currently have open. This means that it doesn't matter which .eaf-file you currently have open when you import (or indeed if any is open). Counterintuitive, I know, but don't worry - I've figured it out. It's not that complicated, just stay with me.<br />
<br />
For this to work, the file needs to have the ".txt" suffix again.</div>
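<div>
If you have a handful of pruned files, switching the suffix back can be done in one go. A minimal Python sketch, where the file name <code>speaker_M.tsv</code> and the helper <code>back_to_txt</code> are made up for illustration and the demo runs in a throwaway folder:</div>

```python
import tempfile
from pathlib import Path


def back_to_txt(path):
    """Rename e.g. speaker_M.tsv to speaker_M.txt and return the new path."""
    path = Path(path)
    return path.rename(path.with_suffix(".txt"))


# Demo in a temporary folder so nothing on disk is touched for real.
with tempfile.TemporaryDirectory() as folder:
    pruned = Path(folder) / "speaker_M.tsv"
    pruned.write_text("Begin\tEnd\tSpeaker\n", encoding="utf-8")
    renamed = back_to_txt(pruned)
    renamed_name = renamed.name
    old_still_there = pruned.exists()
    new_is_there = renamed.exists()

print(renamed_name)
```

<div>
On real data you could loop over something like <code>Path("my_folder").glob("speaker_*.tsv")</code>, with the folder name being whatever you chose.</div>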
<div>
<br /></div>
<div>
File>Import> CSV/Tab-delimited Text file</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuASXWyK55e5JFo3Yh9GV3UzFm5MYZNJkU_Vp-Bljfj3s79x6r2YENhlQI7AwxSTnaG07nSYA_TykZqB3JEF2SnEDUiGP4lYKk36-RC-exFYkXv3W00pxA7esCB8yJ9jCIuyTXSJKFuBye/s1600/import.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="490" data-original-width="430" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuASXWyK55e5JFo3Yh9GV3UzFm5MYZNJkU_Vp-Bljfj3s79x6r2YENhlQI7AwxSTnaG07nSYA_TykZqB3JEF2SnEDUiGP4lYKk36-RC-exFYkXv3W00pxA7esCB8yJ9jCIuyTXSJKFuBye/s320/import.tiff" width="280" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">Importing CSV/Tab-delimited Text file</td></tr>
</tbody></table>
<div>
Next up you will get a window asking you questions about the file you're trying to import, make sure that it lines up with the little preview you get.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSRc8BiyDNX96ofckXWxtAlYSPNf4e28U3LgitiYVn3_p1lnwPJWYFFS7WfZKCN33X71WnfgvYZJW4K_15lUZUL5jRJT6bRBy0_pnNyVBSkFkrKqypOTYhy5Nl4Lu6xMEw1CdTnAd_A34/s1600/specify.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="498" data-original-width="1282" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSRc8BiyDNX96ofckXWxtAlYSPNf4e28U3LgitiYVn3_p1lnwPJWYFFS7WfZKCN33X71WnfgvYZJW4K_15lUZUL5jRJT6bRBy0_pnNyVBSkFkrKqypOTYhy5Nl4Lu6xMEw1CdTnAd_A34/s640/specify.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">Import CSV/Tab-delimited Text file dialogue window.</td></tr>
</tbody></table>
<div>
I wish that ELAN had a way of automatically recognising its own txt-output, but it doesn't. No need to specify the other options, just leave them unchecked.</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipH5I1jwdDihIbIHPoX5gtLK3LEK-Tn8setQ1dn3YZ-Q1P6b9k6tetT3-0bXom4i7wlPrN28NTBoL7qXIXDr0ghuIL_A8Gkfwly6e0u0i9u0xO3L9QJNa8W9IFBEjDdrsT7JOoirmzRfD3/s1600/ghost.gif" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="350" data-original-width="498" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipH5I1jwdDihIbIHPoX5gtLK3LEK-Tn8setQ1dn3YZ-Q1P6b9k6tetT3-0bXom4i7wlPrN28NTBoL7qXIXDr0ghuIL_A8Gkfwly6e0u0i9u0xO3L9QJNa8W9IFBEjDdrsT7JOoirmzRfD3/s320/ghost.gif" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px; text-align: center;">An actual ghost</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
Now you will have a new .eaf-file with the same name as the file with the pruned results.<br />
This file will only contain the annotations that matched your filterings. There's no audio file and no other tiers. It's like a ghost tier, haunting the void of empty silence of this lonely .eaf-file.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS2sQm3_sLP5gDTM4Gu0GBeqcyA5eISi14YZ1y1Kk9-eeHW3zBnExFHP9273hiehGp3oUibXC9lB2JWPTLFbdJbMlvGg9_6m7zfIYKydj5Ow6tj6r56plqjXduTVZBSSJMVYlOmhqyg8rp/s1600/just+search+as+tier.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="539" data-original-width="1445" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS2sQm3_sLP5gDTM4Gu0GBeqcyA5eISi14YZ1y1Kk9-eeHW3zBnExFHP9273hiehGp3oUibXC9lB2JWPTLFbdJbMlvGg9_6m7zfIYKydj5Ow6tj6r56plqjXduTVZBSSJMVYlOmhqyg8rp/s640/just+search+as+tier.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">A lonely ghost tier in an otherwise empty .eaf-file</td></tr>
</tbody></table>
<div>
Save this file and other files currently open in a good place, quit ELAN and then restart ELAN. Sometimes there seems to be a problem for ELAN to accurately see files later on in this process unless you do this. I don't know why this is, but saving, closing and restarting seems to help, so let's just do that :)!</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLpTEf9I7efnre3bgY5z1j20Xe1GVn-rQE-SWCHSQ2VScZsnwXmlsTPLMZUrmfPuFaJVOQsMtUznZDit3FMHQ4Dz9lrZGzP1bORp-PrwZptKnT4jBTfPmU2Ju_Yp-83-5O6Mf3WlHQEIEB/s1600/giphy+IT+crowd.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="281" data-original-width="500" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLpTEf9I7efnre3bgY5z1j20Xe1GVn-rQE-SWCHSQ2VScZsnwXmlsTPLMZUrmfPuFaJVOQsMtUznZDit3FMHQ4Dz9lrZGzP1bORp-PrwZptKnT4jBTfPmU2Ju_Yp-83-5O6Mf3WlHQEIEB/s320/giphy+IT+crowd.gif" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">Chris O'Dowd as Roy Trenneman in The IT Crowd</td></tr>
</tbody></table>
<div>
<b>d) importing the search results tier into the original file</b></div>
<div>
Now here's where I slightly lied to you: we're not going to import the tier into your file. We're going to merge the pruned speaker-only-file with the other .eaf-file that has all the audio and other tiers and the result is going to be a <i>new</i> .eaf-file. So you'll have three files by the end of this:</div>
<div>
<ul style="text-align: left;">
<li>a) your original .eaf-file with audio and lotsa annotations</li>
<li>b) your .eaf-file with only the search results-tier and no audio etc (ghost-tier)</li>
<li>c) a new merged file consisting of the two above combined</li>
</ul>
</div>
<div>
Don't worry, I've got this. I'm henceforth going to call these files (a), (b) and (c) as indicated above.</div>
<div>
<br /></div>
<div>
Open file (a). Select "Merge Transcriptions..."</div>
<div>
<br /></div>
<div>
File>Merge >Transcriptions...</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3L3KrRbjLk4t1wYTO_37IrW0BjaWg9bFm5QJgjELHyC3K8aKTzWFTeVKXGbT5W-_h4wkoSk57ERO3xpusVkpXptN_4eEoQ4EYELNy3FaMZB00qPXGM39CzElWtnH3D-VqnFUcTNcpxfkP/s1600/select+merge.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="494" data-original-width="284" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3L3KrRbjLk4t1wYTO_37IrW0BjaWg9bFm5QJgjELHyC3K8aKTzWFTeVKXGbT5W-_h4wkoSk57ERO3xpusVkpXptN_4eEoQ4EYELNy3FaMZB00qPXGM39CzElWtnH3D-VqnFUcTNcpxfkP/s320/select+merge.tiff" width="183" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">Select Merge transcriptions</td></tr>
</tbody></table>
<div>
Now, select file (a) as the current transcription (this is default anyway), file (b) as the second source and choose a name and location for the new file, file (c), in the "Destination" window. You can think of "Destination" as "Save as.." for file (c) - our new file.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi15RB9hdzmH_SMgY5fRvxRCEf4dcrXFJLzNQkxTsnGrPqnMaatVsoLLW3gr0B35elyvAYUbZJS82uG7Wi3LZi3-uJwy6yLl4RGHljGZtkJAWvBRkeTmUEISpFRazWExRi6M-9GilSP88QM/s1600/merge+dialogue.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="557" data-original-width="1079" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi15RB9hdzmH_SMgY5fRvxRCEf4dcrXFJLzNQkxTsnGrPqnMaatVsoLLW3gr0B35elyvAYUbZJS82uG7Wi3LZi3-uJwy6yLl4RGHljGZtkJAWvBRkeTmUEISpFRazWExRi6M-9GilSP88QM/s640/merge+dialogue.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;">Specifying what should be merged and how</td></tr>
</tbody></table>
<div>
Do not, I repeat, do <i style="font-weight: bold;">not </i>append. And no need to worry about linked media, because (b) doesn't have any audio or anything (remember, it's a ghost). Just leave all those boxes unchecked.</div>
<div>
<br /></div>
<div>
Let ELAN chug away with the merging, and then you're done! You've now got an .eaf file with separate tiers for separate speakers.<br />
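<br />
If you want to double-check the merged file without clicking through ELAN, one option is to peek inside it: .eaf files are XML, with one <code>TIER</code> element per tier (named by its <code>TIER_ID</code> attribute) and the text of each annotation in an <code>ANNOTATION_VALUE</code> element. Here is a minimal Python sketch that counts the filled-in annotations per tier. The demo document is a tiny stand-in built in code (no time slots), and the tier name "Transcription-M" is made up; your merged file will have whatever tier names ELAN gave it.<br />

```python
import xml.etree.ElementTree as ET


def tier_summary(root):
    """Map each TIER_ID to the number of non-empty annotations on that tier."""
    summary = {}
    for tier in root.iter("TIER"):
        texts = [value.text for value in tier.iter("ANNOTATION_VALUE")]
        summary[tier.get("TIER_ID")] = sum(
            1 for text in texts if text and text.strip()
        )
    return summary


def build_demo_document():
    """Build a tiny stand-in for an .eaf tree (made-up tiers, no time slots)."""
    root = ET.Element("ANNOTATION_DOCUMENT")
    demo_tiers = {
        "Transcription": [".. tane tane tane", "manu manu manu"],
        "Transcription-M": ["manu manu manu"],
    }
    for tier_id, texts in demo_tiers.items():
        tier = ET.SubElement(root, "TIER", TIER_ID=tier_id)
        for text in texts:
            annotation = ET.SubElement(tier, "ANNOTATION")
            alignable = ET.SubElement(annotation, "ALIGNABLE_ANNOTATION")
            ET.SubElement(alignable, "ANNOTATION_VALUE").text = text
    return root


summary = tier_summary(build_demo_document())
print(summary)
```

For a real file, something like <code>tier_summary(ET.parse("merged.eaf").getroot())</code> should do, with "merged.eaf" standing in for the name you gave file (c).<br />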
<br />
<b>9) dealing with the overlap</b><br />
Now, when you're at step 8b and you're filtering for people, make sure you include the overlapping speech for that person in that file. You're going to have to go back to that tier and search for the instances where you have "overlap" written in the comments and manually sort things out. There's no automatic way of dealing with this I'm afraid, you're going to have to delete the annotations and make new ones that line up across the tiers for that speaker. Go to annotation mode, hide all the other tiers, keep only the ones for that speaker. Navigate to the overlap by searching, delete the existing annotations in that region, highlight new appropriate time intervals and right click each tier and select "new annotation here". This will give you new aligned annotation intervals that you can now associate with just one speaker.</div>
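<div>
Fixing the overlaps is manual, but finding them in the exported table can be scripted. A minimal Python sketch, with made-up rows and the assumption that several initials in the speaker cell are separated by spaces, commas or plus signs (use whatever convention you actually typed in step 7):</div>

```python
import re

# Made-up rows for illustration only; the keys are assumed column names.
rows = [
    {"Begin": "10.0", "End": "12.5", "Speaker": "M T",
     "Transcription": "", "Comments": "overlap"},
    {"Begin": "12.5", "End": "14.0", "Speaker": "M",
     "Transcription": "manu manu manu", "Comments": ""},
    {"Begin": "14.0", "End": "15.2", "Speaker": "T+S",
     "Transcription": "", "Comments": "overlap"},
]


def overlap_rows(rows, speaker, speaker_col="Speaker", comment_col="Comments"):
    """Return the rows marked 'overlap' in which this speaker takes part."""
    hits = []
    for row in rows:
        initials = re.split(r"\W+", row[speaker_col].strip())
        is_overlap = row[comment_col].strip().lower() == "overlap"
        if is_overlap and speaker in initials:
            hits.append(row)
    return hits


for row in overlap_rows(rows, "T"):
    print(row["Begin"], row["End"], row["Speaker"])
```

<div>
The begin and end times it prints are the places to jump to in annotation mode when you redo the annotations for that speaker by hand.</div>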
<div>
<div style="text-align: center;">
<br /></div>
<div style="text-align: center;">
<span style="font-size: large;">DONE!</span></div>
<div style="text-align: center;">
<br /></div>
</div>
</div>
If you're curious how to use this technique but for matching particular searches, <a href="http://humans-who-read-grammars.blogspot.com/2017/08/elan-making-tiers-out-of-search-results.html">read this blogpost</a>.<br />
<br />
If you found this useful and want me to write it up a bit more neatly and with more screenshots etc, let me know in the comments. There should be a way of making this work better with python, but I haven't figured that out just yet.<br />
<br />
<div style="text-align: center;">
<b><span style="font-size: large;">Good bye!</span></b></div>
<div style="text-align: center;">
<b><br /></b></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQMgOyjWu_H7xAr2_CeciDM4OtUTNTO6tKCiqf1r6Od3rvwUQvITi6_vE6XuRfEzaSFWy6_AAdRUQ1Y13zZdAwwQq_Ug-AuyPBu8-b9IOsndlqzyvyTj-6xVBKXsffafScp2UV91RPgQlp/s1600/IMG_20170801_145802.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="286" data-original-width="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQMgOyjWu_H7xAr2_CeciDM4OtUTNTO6tKCiqf1r6Od3rvwUQvITi6_vE6XuRfEzaSFWy6_AAdRUQ1Y13zZdAwwQq_Ug-AuyPBu8-b9IOsndlqzyvyTj-6xVBKXsffafScp2UV91RPgQlp/s1600/IMG_20170801_145802.jpg" /></a></div>
<br />
¹<span style="font-family: inherit;"> Himmelmann wrote <a href="https://scholarspace.manoa.hawaii.edu/bitstream/10125/24806/1/ldc-sp15-himmelmann.pdf">a paper about this challenge,</a> and he says that the actual challenge is "reaching a better understanding of the transcription process itself and its relevance for linguistic theory". We're not going to be doing that here, but please read his paper if this challenge is something that interests you. </span><br />
<div style="text-align: left;">
<span style="font-family: inherit;"><span style="color: #191e3f; font-weight: 500; text-align: center;">² Articulators are relevant for sign languages and gesture transcription, and this guide actually can fit transcription of speech as well as sign and gesture, including transcribing different articulators on different tiers. </span></span></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-15908125922822060532018-11-06T17:44:00.002+11:002018-11-06T17:44:44.560+11:00That infographic, again ;)<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: left;">
<a href="http://humans-who-read-grammars.blogspot.com/2015/06/that-infographic-on-languages-of-world.html">In 2015 I wrote a blogpost about Alberto Lucas López's visualisations of the world's languages.</a> I answered some frequently asked questions in relation to that visualisation, mostly to do with Ethnologue's definitions of languages, macro-languages and speakers. There's a lot more context needed to fully understand that infographic, and every time I see it re-shared I see the same questions pop up. It's a good infographic, so I understand that it goes viral - but when the same questions come every time it means that more context is needed. </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="" style="clear: both; text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVpfyNvnA3zoMORKzEgWt0jK5tgLQGfDQco2fIvTSbkKaL0eo19oTygN_Jg-upu0FoTjMsbEVSrz6tQ38G5sbwTrbGa1kmhjq8REdbRADibFKQvh_hG99BnnduYGjPFMzDF9h0AA5bcmMU/s1600/languagesHQ_1000.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="1588" data-original-width="1000" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVpfyNvnA3zoMORKzEgWt0jK5tgLQGfDQco2fIvTSbkKaL0eo19oTygN_Jg-upu0FoTjMsbEVSrz6tQ38G5sbwTrbGa1kmhjq8REdbRADibFKQvh_hG99BnnduYGjPFMzDF9h0AA5bcmMU/s640/languagesHQ_1000.png" width="401" /></a><a href="https://www.lucasinfografia.com/Mother-tongues">Since then, Alberto (who is now Senior Graphics Editor at the National Geographic) has released an updated version, which among other things fixes the color of Mexico.</a> I haven't gone through to check what else has been adjusted, but many of the same questions will remain. This is because Ethnologue's classification of what is and what is not a language (which still underlies the visualisation) is still controversial at times and the general public will not know what Ethnologue and the Library of Congress mean by "macro-language". </div>
<div class="" style="clear: both; text-align: left;">
I might do another blog post going through the visualisation, if there are enough new questions. Post them here or at the old post if you have any :)!</div>
<div class="" style="clear: both; text-align: left;">
<br /></div>
<div class="" style="clear: both; text-align: left;">
Over and out, </div>
<div class="" style="clear: both; text-align: left;">
Hedders</div>
<div>
<br /></div>
<div>
<br /></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com1tag:blogger.com,1999:blog-1300680252997007251.post-29541045679900880522018-09-10T18:08:00.002+10:002018-09-10T18:09:50.331+10:00Brust, breast, borst: an encounter with r-metathesisTwo months ago I gave birth to our second daughter. To prepare for this joyous event, I tried to get some of the local (German) vocabulary on labour & babies into my head. One of the words I had some trouble with was <i>Brust</i> 'breast'. Basically, my German reading is pretty decent, but speaking and writing are another matter: I just don't have enough vocabulary at the ready, hence my quest. Until now I could get away with blaming my high school education, where I suffered from a then-new policy of splitting second language education into a compulsory reading module and an optional speaking & writing module that I did not take. Having lived in Germany for over two years now, it's getting rather embarrassing though. <br />
<br />
Anyway, back to <i>Brust</i>. The reason I found it confusing is that compared to my native Dutch, the <i>r</i> is in the wrong place: in Dutch it's <i>borst</i> 'breast'. Hmm. English <i>breast</i> has the <i>r</i> in the same place as German though. Then when I was brushing my teeth one night, I realised <i>borstel</i> 'brush' in Dutch looks a lot like <i>Bürste</i> 'brush' in German, and also like the English form <i>brush</i>, now with English having the <i>r</i> in the 'wrong' place. What's going on here?<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsnMhngAEVnh-5flEKwmTnVUqwmvu_MCoPIP_AHxdcj4qQcoEEs6Ev9gIcWVtTYc2r_rqc_oERB_MOKfYhyphenhyphen2b2C3I7uwhH2ks9D6tk0Cx6TOWrAspIjn2uIa1RSGlDuUW8SpPASmknk8Ns/s1600/1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="700" data-original-width="1120" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsnMhngAEVnh-5flEKwmTnVUqwmvu_MCoPIP_AHxdcj4qQcoEEs6Ev9gIcWVtTYc2r_rqc_oERB_MOKfYhyphenhyphen2b2C3I7uwhH2ks9D6tk0Cx6TOWrAspIjn2uIa1RSGlDuUW8SpPASmknk8Ns/s320/1.jpg" width="320" /></a></div>
<div style="text-align: center;">
<i>Food items that look like breasts, no. 1</i></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
This phenomenon is called metathesis: sounds or syllables in a word (or, when dealing with sentences, words in a sentence) change their order. Specifically, we are dealing with r-metathesis here: the position of <i>r</i> has changed so that it comes after the vowel in Dutch <i>borst</i> 'breast' - as for <i>brush</i>, we'll come to that. This is a really common type of sound change, and many examples are given on a Wikipedia page <a href="https://en.wikipedia.org/wiki/Metathesis_(linguistics)" target="_blank">here</a>.<br />
<br />
In the words for 'breast', it's Dutch <i>borst</i> that's the odd one out: the other Germanic forms have the <i>br</i> cluster and go back to Proto-Germanic *<i>breust</i>-, *<i>brusti</i>- 'breast', which ultimately goes back to Proto-Indo-European *<i>bhreus</i>-, *<i>bhreu</i>-, with meanings ranging from 'breast' to 'belly' (see <a href="http://www.etymologiebank.nl/trefwoord/borst1" target="_blank">here</a>). The only other <a href="https://en.wiktionary.org/wiki/boarst#West_Frisian" target="_blank">exception</a> is West Frisian <i>boarst</i> 'breast', from Old Frisian <i>briast</i>, <i>brast</i> - one can only imagine this instance of r-metathesis must have occurred under pressure from Dutch.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc83r5AAyE5VnUzW55VZuradWKqw9TPHEaE8ka-N5b1-M49sWt9NFVPyYlohKl3LXb-s2uyivwwEBzNeucASooQiYNHZARApgiOO1N63lBnb0vKXTOUlp3P27ME1d65ADOTnh9nwGk_1xS/s1600/2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="352" data-original-width="576" height="195" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc83r5AAyE5VnUzW55VZuradWKqw9TPHEaE8ka-N5b1-M49sWt9NFVPyYlohKl3LXb-s2uyivwwEBzNeucASooQiYNHZARApgiOO1N63lBnb0vKXTOUlp3P27ME1d65ADOTnh9nwGk_1xS/s320/2.jpg" width="320" /></a></div>
<div style="text-align: center;">
<i>Food items that look like breasts, no. 2</i></div>
<br />
As for the words for 'brush', it turns out they are not all related. English <i>brush</i> is a loanword from Old French <a href="https://en.wiktionary.org/wiki/brush" target="_blank"><i>broisse</i></a>, while Dutch <i>borstel</i> and German <i>Bürste</i> come from the Proto-Germanic form *<i>burstila</i>-, a diminutive of *<i>burst</i> 'bristle'. These forms come from a word meaning something like the hair that sticks up on certain animals, as brushes used to be made from wiry animal hair. Actually, the Dutch and German words are related to English <i>bristle</i>, where it's English that has undergone the r-metathesis (see <a href="http://www.etymologiebank.nl/trefwoord/borstel" target="_blank">here</a> and <a href="https://en.wiktionary.org/wiki/bristle" target="_blank">here</a>).<br />
<br />
Just as a bonus, another word I learned that was confusing too, but for very different reasons, is <i>Kreißsaal</i>. <i>kreißen</i> is a verb meaning 'to labor', and <i>Saal</i> means 'room', so together it's 'delivery room'. My immediate association though is with the cognate Dutch verb <i>krijsen</i> 'scream, screech', so for me <i>Kreißsaal</i> translates to 'screaming room'! When I explained this to native German speakers they seemed rather surprised. It turns out the verb <i>kreißen</i> is rather uncommon in contemporary German ('only' 36,700 hits on Google). One of my friends told me her only association with the first part of the word was <i>Kreis</i> 'circle', and well into adulthood she thought that, for some reason, all delivery rooms were circular.<br />
<br />
So much for baby-related adventures in German; I promise to talk about grammar next time.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1Vrx6b8As0btxS_Cw_iujhSL4qsagKT4lS8YIrusExweLnWVZt-jFwy-8u-94m-z-E7aGOnjW5sABW-gizUGsdkw4jBWCdIbN3CDRQZjNAY0JosxU40woIPYHJszH8W0g0Kv5xdQlCgPC/s1600/3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="450" data-original-width="600" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1Vrx6b8As0btxS_Cw_iujhSL4qsagKT4lS8YIrusExweLnWVZt-jFwy-8u-94m-z-E7aGOnjW5sABW-gizUGsdkw4jBWCdIbN3CDRQZjNAY0JosxU40woIPYHJszH8W0g0Kv5xdQlCgPC/s320/3.jpg" width="320" /></a></div>
<div style="text-align: center;">
<i>Food items that look like breasts, no. 3</i></div>
<br />Annemarie Verkerkhttp://www.blogger.com/profile/14747297526182358435noreply@blogger.com2tag:blogger.com,1999:blog-1300680252997007251.post-78197858708102173102018-04-30T19:28:00.003+10:002018-05-11T13:28:04.754+10:00Having fun with phrase structure grammars: Midsomer Murders and Beatles<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">This post is about phrase-structure grammars, which can be both entertaining and educational.
If you're a linguistics student, you will be interested in this. We’re going to learn how to define
a little set of rules for a made-up language, and then generate possible sentences in that
language based on the rules. We can also use the rules to test whether something is
grammatical in our made-up language.</span><br />
<span style="font-family: "arial"; font-size: 12pt; white-space: pre;"><br /></span>
<span style="font-family: "arial"; font-size: 12pt; white-space: pre;">You may already be familiar with phrase structure from linguistics class, or parsing in </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">programming. Regardless, this introduction is accessible for everyone - including novices.</span><br />
<span style="font-family: "arial"; font-size: 12pt; white-space: pre;"><br /></span>
<span style="font-family: "arial"; font-size: 12pt; white-space: pre;">We will first learn the basics of these little rules, and then illustrate by generating </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">random plot summaries for possible episodes of the TV show Midsomer Murders
(à la </span><a href="https://twitter.com/midsomerplots?lang=en" style="text-decoration: none;"><span style="background-color: transparent; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre;">the Midsomer Murders Bot on twitter</span></a><span style="background-color: transparent; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">) and also Beatles lyrics.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><img height="240" src="https://lh5.googleusercontent.com/pvgBaHN1YY4kBebqHy3GvzcxQGQqeADn78huhytP5rYOj4YwYA_FlREede3ENMEHcv8fA7B3F8cYqet9Y61A9_FSNPKG5FtNcVnjKFLJYFygPas2wREKFZ85-jNfwZCRVKtDp1-G" style="-webkit-transform: rotate(0.00rad); border: none; transform: rotate(0.00rad);" width="380" /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Even Barnaby can see the templatic nature of the show.</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><img height="210" src="https://lh3.googleusercontent.com/z81vlrHtwdvNEWQbJwKlQqZwbxIEvLwft5wkTZ6fYQ8rMbNhn2CGmpP1pHDylkYD1a_XKl8ZOEUbyh8yq4GQElF00Jwz1B1L-Azyhpcu5aWfOJ6GaMALt_64ujiKga7yayH00STu" style="-webkit-transform: rotate(0.00rad); border: none; transform: rotate(0.00rad);" width="500" /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 12pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"> </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
</div>
<div style="text-align: center;">
<span style="font-family: "arial"; white-space: pre-wrap;"><span style="font-size: x-small;">How many nas do we need to generate this song?</span></span></div>
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Nearley parser</span><br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We will be using the Nearley parser, a computer program that parses sentences according to pre-defined context-free grammar rules. Nearley was written by Kartik Chandra at Stanford and is released under the MIT license. Guillermo Webster from MIT made an online interactive environment for web browsers where we can play with this parser, construct little context-free grammars and see the results right away. You can play with this parser here:</span><br />
<br />
<span style="font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;"><a href="http://omrelli.ug/nearley-playground/" style="text-decoration-line: none;">http://omrelli.ug/nearley-playground/</a></span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "arial"; white-space: pre-wrap;">Since the Nearley parser can run in a web browser, you’re not going to have to install anything or download anything, just follow the links and copy-paste from this post. It’ll be real simple and easy, and then at the end we’re going to discuss some more real-world applications.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Please note that the interactive environment, the Nearley playground, works best in Google Chrome. If it crashes, it may be because your grammar is too complicated for your web browser to handle (ambiguities or recursion). In that case, delete your cookies and restart the browser. The playground remembers what you did last time, and if whatever you did before caused it to crash, it will unfortunately crash again. The best thing is to set your browser to never save cookies for this domain, or to use private browsing/incognito mode. Hopefully this won't happen, but if it does you know what to do.</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<img height="162" src="https://lh3.googleusercontent.com/RfH3zBPnptsKqr2OodNDzPibh1czcs1S5P0ncERzfI0nKaUL_eKj0lNISHkwG9p7cbSjhVwwwynpWKonu1zIvpgW0KSRxnRlDhbux83ZQ_-NhvpmPXJZpCJMVTdHX9bY_cdUEsqc" style="border: none; font-family: Arial; transform: rotate(0rad); white-space: pre-wrap;" width="254" /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Nearley as a parser exists outside of this specific web browser environment. </span><a href="https://nearley.js.org/" style="text-decoration: none;"><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">You can read more about it here.</span></a></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Basics</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="background-color: transparent; color: black; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We will be writing rules that take an item on the left and put its definition/content on the right. </span><span style="background-color: transparent; color: black; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Let’s just start with something really simple so that we get the hang of it, and then we can go through more details on how to write these little grammars. Rules are defined by using an arrow, "->". Here’s a very simple example of a set of rules that will just generate one simple sentence:</span></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SENTENCE -> SUBJ VERB OBJECT</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SUBJ -> "I "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">VERB -> "love "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">OBJECT -> "chocolate"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">This little grammar will generate one sentence, and one only. Perhaps that’s all that needs to be said, after all? </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 47.25pt; margin-top: 0pt; padding: 0pt 0pt 0pt 1.5pt; text-indent: -1.5pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I love chocolate</span></div>
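<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">How the arrows turn into that one sentence can be sketched in a few lines of Python. This is only an illustration of the idea, not how Nearley itself works: the dictionary GRAMMAR and the function expand are names made up for this sketch.</span></div>

```python
# A hypothetical Python stand-in for the four rules above (not Nearley itself).
# Each label on the left of an arrow maps to the list of parts on its right.
GRAMMAR = {
    "SENTENCE": ["SUBJ", "VERB", "OBJECT"],
    "SUBJ": ["I "],
    "VERB": ["love "],
    "OBJECT": ["chocolate"],
}


def expand(symbol):
    """Replace a label by its definition until only actual words are left."""
    if symbol not in GRAMMAR:
        # Not a label, so it is a word from the lexicon: keep it as it is.
        return symbol
    return "".join(expand(part) for part in GRAMMAR[symbol])


print(expand("SENTENCE"))  # I love chocolate
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">Note that the spaces are part of the words themselves ("I " and "love "), which is why the output comes out with spaces in the right places.</span></div>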
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It literally only knows these three words and it only knows to string them together in this particular order. It can't even say </span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"</span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Chocolate I love".</span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "arial"; white-space: pre-wrap;">This grammar says that a string in this little language needs to consist of a SUBJ, then a VERB and then an OBJECT (in that exact order). It then defines what those things are. Words in this system need to be within quotation marks. We can think of the things in CAPITALS as meta-categories or labels, and the things between quotation marks as the actual lexicon. The lexicon is what will actually make up the language, and the labels in CAPITALS are used to write the rules that govern how the words are combined. </span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "arial"; white-space: pre-wrap;">In this post, we will use capital letters for labels, but this is not strictly necessary for the grammar to work. You can also name things whatever you want. For example, this grammar would generate exactly the same sentence:</span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "courier new"; white-space: pre-wrap;">BIGGESTBLOB -> BLAB BLOB BLEB</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">BLAB -> "I "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">BLOB -> "love "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">BLEB -> "chocolate"</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Output:</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 47.25pt; margin-top: 0pt; padding: 0pt 0pt 0pt 1.5pt; text-indent: -1.5pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I love chocolate</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It’s helpful to you and to others who might read your code if you name things sensibly, but it’s not technically required by the system itself. The above little BLOB-grammar is just as formally good as the other.</span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span>
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">If you are doing this as a linguist or linguistics student, you'll of course need to conform to whatever notational conventions your framework has or your teacher requires of you. Don't hand in homework with "BLOB" instead of "VP".</span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "arial"; white-space: pre-wrap;">The items that are the actual lexicon are known as "terminals" and the others as "non-terminals". The terminals can be thought of as the "end-stations": they don’t themselves refer to anything else, they are the actual final output. The non-terminals, on the other hand, do not end up in the output; they just define other things. Here are the terminals and non-terminals of the examples above:</span><br />
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Non-terminals:</span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> SENTENCE, SUBJ, VERB & OBJECT (or BIGGESTBLOB, BLAB, BLOB & BLEB)</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Terminals:</span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> "I", "love" & "chocolate"</span></div>
<span style="font-family: "arial"; white-space: pre-wrap;">When using a context-free grammar to generate language, the terminals are going to be words, phonemes and/or morphemes.</span><br />
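<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The split between terminals and non-terminals can also be worked out mechanically: anything that appears on the left of an arrow is a non-terminal, and anything that only ever appears on the right is a terminal. Here is a small Python sketch of that idea; the dictionary RULES is a made-up representation of the first grammar, not something Nearley provides.</span></div>

```python
# Hypothetical representation of the first grammar: label -> parts on the right.
RULES = {
    "SENTENCE": ["SUBJ", "VERB", "OBJECT"],
    "SUBJ": ["I "],
    "VERB": ["love "],
    "OBJECT": ["chocolate"],
}

# Non-terminals are exactly the labels that get defined on the left of an arrow.
non_terminals = set(RULES)

# Terminals appear on the right-hand side but are never defined themselves.
terminals = {
    part
    for right_hand_side in RULES.values()
    for part in right_hand_side
    if part not in RULES
}

print(sorted(non_terminals))
print(sorted(terminals))
```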
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Basics of online interface</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">You can either tell the set of rules to </span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">generate </span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">language, or you can </span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">ask it if a particular string is correct or not</span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> (according to the rules you just defined).</span></div>
<ul style="text-align: left;">
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">you can write a little set of rules and items under "Basic grammar" to the left</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">you generate strings by clicking "generate" on the lower right</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">you can test if strings are correct or not by clicking "add test" to the right</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">remember to delete cookies and restart the browser if it crashes.</span></li>
</ul>
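<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The difference between generating and testing can be sketched in Python as well. This is a deliberately naive illustration, not what the playground does internally: it simply lists every string the grammar allows, which only works for small grammars without recursion. The extra words "You " and "tea", and the names all_strings, generate and is_grammatical, are made up for this sketch. Each label here maps to a list of alternatives, so that the grammar allows more than one sentence.</span></div>

```python
import random
from itertools import product

# Hypothetical toy grammar: each label maps to a list of alternatives,
# and each alternative is a list of parts (labels or actual words).
GRAMMAR = {
    "SENTENCE": [["SUBJ", "VERB", "OBJECT"]],
    "SUBJ": [["I "], ["You "]],
    "VERB": [["love "]],
    "OBJECT": [["chocolate"], ["tea"]],
}


def all_strings(symbol):
    """List every string a symbol can produce (the grammar must not be recursive)."""
    if symbol not in GRAMMAR:
        return [symbol]  # an actual word
    results = []
    for alternative in GRAMMAR[symbol]:
        choices = [all_strings(part) for part in alternative]
        results.extend("".join(combo) for combo in product(*choices))
    return results


def generate():
    """'Generate': pick one of the allowed sentences at random."""
    return random.choice(all_strings("SENTENCE"))


def is_grammatical(string):
    """'Test': check whether a string is one of the allowed sentences."""
    return string in all_strings("SENTENCE")


print(generate())
print(is_grammatical("I love tea"))        # True
print(is_grammatical("Chocolate I love"))  # False
```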
<span style="font-family: "arial"; font-weight: 700; white-space: pre-wrap;">Meta-comments in the code</span><br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We can make our little grammar a bit clearer and separate out the rules (non-terminals) from the lexicon (terminals) using comment lines, marked with the hash symbol (#). Everything on a line after the hash is ignored by the program when it runs the code; it is meta-commentary for the reader of the script and will not end up in the output.</span></div>
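<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">To get a feel for what "ignored by the program" means, here is a rough Python sketch that throws away everything from the hash to the end of a line. The function strip_comment is a made-up helper for illustration, not Nearley's actual implementation, and it is naive on purpose: it would also cut off a hash that appears inside a quoted word.</span></div>

```python
def strip_comment(line):
    """Drop everything from the first hash to the end of the line (naive sketch)."""
    return line.split("#", 1)[0].rstrip()


print(strip_comment('SUBJ -> "I " # the subject of the sentence'))
print(repr(strip_comment("#RULES")))  # a comment-only line leaves nothing behind
```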
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Here is an example of a slightly longer grammar with meta-commentary. In this language, we have separated out the kinds of pronouns that can be objects from those that are subjects, and we are using | to mean "or".</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#RULES</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SENTENCE -> SUBJ PREDICATE OBJECT</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SUBJ -> NP_S</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PREDICATE -> VERB</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">OBJECT -> NP_O</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NP_S -> PRONOUN_S | N</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NP_O -> PRONOUN_O | N</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#LEXICON</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">N -> "chocolate" | "pandas"</span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br class="kix-line-break" /></span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PRONOUN_S -> "They " | "You "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PRONOUN_O -> "me" | "you"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">VERB -> "love "</span></div>
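<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">If you want to see every sentence this grammar can produce without clicking "generate" many times, the grammar above can be imitated in a few lines of Python. This is only a sketch: the dictionary layout and the function name expand are made up for this example, and it assumes that terminal strings are glued together exactly as written, with no extra spaces added.</span></div>

```python
from itertools import product

# The grammar from above, retyped as a Python dict (an illustration, not Nearley's format).
# Each rule maps to a list of alternatives ("|"); each alternative is a list of symbols.
# Any symbol that is not a key of the dict is treated as a terminal string.
GRAMMAR = {
    "SENTENCE": [["SUBJ", "PREDICATE", "OBJECT"]],
    "SUBJ": [["NP_S"]],
    "PREDICATE": [["VERB"]],
    "OBJECT": [["NP_O"]],
    "NP_S": [["PRONOUN_S"], ["N"]],
    "NP_O": [["PRONOUN_O"], ["N"]],
    "N": [["chocolate"], ["pandas"]],
    "PRONOUN_S": [["They "], ["You "]],
    "PRONOUN_O": [["me"], ["you"]],
    "VERB": [["love "]],
}


def expand(symbol):
    """Return every string a symbol can produce (only safe for grammars without recursion)."""
    if symbol not in GRAMMAR:
        return [symbol]  # a terminal: the string itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine all the options.
        parts = [expand(s) for s in alternative]
        results.extend("".join(combo) for combo in product(*parts))
    return results


sentences = expand("SENTENCE")
print(len(sentences))
for sentence in sentences:
    print(sentence)
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The sketch lists 16 sentences, among them "They love me". Under the no-extra-spaces assumption, a noun subject comes out as "pandaslove me", because "pandas" has no trailing space in the lexicon. Spaces have to be part of the strings themselves.</span></div>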
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Starting a line with # like this is called "commenting out" the line, i.e. making sure it will not be read as actual code. Basically all programming languages have a way to do this. R and Python even use #, just like Nearley. LaTeX, on the other hand, uses "%".
</span><span style="font-family: "arial"; white-space: pre-wrap;">
Remember not to leave too much "crap" in your comments. If you share the code or make a publication out of it, you don't want unflattering comments lying around.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
<b>Some more notation</b></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">When we list alternatives, we use "|" to mean "or". When we want to denote that something is a terminal, i.e. an actual string, we put it in quotation marks (straight ones, not smart/curly quotation marks). When we want to instruct the grammar to repeat an item, we use regex-style quantifiers. See the table below for the basic symbols you'll need.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
<b><span style="font-size: large;">Nearley cheat sheet symbols</span></b></span></div>
<div dir="ltr" style="margin-left: 0pt;">
<table style="border-collapse: collapse; border: none; width: 451.27559055118115pt;"><colgroup><col width="*"></col><col width="*"></col><col width="*"></col><col width="*"></col></colgroup><tbody>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Quantifier</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Description</span></div>
</td><td colspan="2" style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">example</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">one or more of whatever comes before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake ":+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake", "cake cake cake", "cake cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:?</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">zero or one of whatever is before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake":?</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"", "cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:*</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">zero or more of whatever comes before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake ":*</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"", "cake", "cake cake cake cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">|</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">or</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake" | "muffin"</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake", "muffin"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">""</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">whatever is in between is a string</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake"</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">()</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">grouping of non-terminals</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Line -> (H H):?</span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">H -> "hey" | "ho"</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"" or "heyho" or "heyhey" or…</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">\n</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">line break</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"Hello \n Hi"</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"Hello </span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> Hi"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Any single character (except a line break)</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"N" or "<" or "x"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">comment</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NP -> "word"</span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#this is a comment</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"word"</span></div>
</td></tr>
</tbody></table>
</div>
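<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The quantifiers in the cheat sheet behave much like the operators of regular expressions, so you can try them out in Python's re module. The pairing below (":+" with "+", ":?" with "?", ":*" with "*") is an analogy for illustration only. It is not how Nearley works internally, and the helper name accepts is made up for this sketch.</span></div>

```python
import re

# Cheat-sheet example -> closest Python regular expression (an analogy only).
PATTERNS = {
    '"cake ":+': r"(cake )+",    # one or more
    '"cake":?': r"(cake)?",      # zero or one
    '"cake ":*': r"(cake )*",    # zero or more
    '"cake" | "muffin"': r"cake|muffin",               # or
    '(H H):?  with  H -> "hey" | "ho"': r"((hey|ho)(hey|ho))?",  # grouping
}


def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Strings to try against every pattern; "" is the empty string.
candidates = ["", "cake", "cake ", "cake cake ", "muffin", "heyho", "hey"]

for rule, pattern in PATTERNS.items():
    matching = [c for c in candidates if accepts(pattern, c)]
    print(rule, "accepts", matching)
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">Notice that the empty string is accepted by ":?" and ":*" but not by ":+", and that a single "hey" is rejected by the grouped example because the group needs two H items or none at all.</span></div>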
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Differences between traditional ways of writing phrase structure rules and Nearley</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">If you have learned about phrase structure rules before, this notation may look a bit unfamiliar. That is because Nearley is closer to programming, whereas the conventions that linguists traditionally use are closer to ordinary writing. You have probably seen notation that looks something like this:
</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="296" src="https://lh5.googleusercontent.com/NMPnIaa2pPrnfeKaZHE9YrQ3m1fT8-6sh8xhmBKDVEavbPkR4NJJBsUITxmzQuKblmVhuZB1fuoVFBti4fy1m4KVItVzvoRm4yu__yVLIfJV5WEYpFUaqAUD2gikNI8qNcJ9kbQC" style="border: none; transform: rotate(0rad);" width="400" />
<span style="text-align: left;">Don't worry! The differences between the conventional linguist way and Nearley are few. Let's go through it.
</span></span><br />
<div style="text-align: left;">
</div>
</div>
<a href="https://lh6.googleusercontent.com/syraXf7upvIrypKx07wY5IBsxaueuC-p6CAyfxBGKtT4m__vwK2CB0W8ukbzZbGKu5PSQ9vfKe8ccSsNTPe7AwE6r3nc0w39BD3k5tMptjuK6m93NhJ9RTK6SuF0A7oirlP6YchN" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="220" src="https://lh6.googleusercontent.com/syraXf7upvIrypKx07wY5IBsxaueuC-p6CAyfxBGKtT4m__vwK2CB0W8ukbzZbGKu5PSQ9vfKe8ccSsNTPe7AwE6r3nc0w39BD3k5tMptjuK6m93NhJ9RTK6SuF0A7oirlP6YchN" style="border: none; text-align: center; transform: rotate(0rad);" width="392" /></a><b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The main differences lie in optionality and listing alternatives. </span><span style="font-family: "arial"; white-space: pre-wrap;">Alternatives in Nearley are denoted by the pipe symbol "|". This symbol means "or".
</span><span style="color: black; font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;">
Rule: </span><span style="color: black; font-family: "courier new"; vertical-align: baseline; white-space: pre-wrap;">NP -> "bird" | "book"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Output 1: </span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span class="Apple-tab-span" style="white-space: pre;"> </span></span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"bird"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Output 2:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span class="Apple-tab-span" style="white-space: pre;"> </span></span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"book"
</span><span style="white-space: pre-wrap;"><span style="font-family: "courier new";">
</span><span style="font-family: "arial";">In this parser, if something is optional ("occurs once or never"), we mark it with ":?". ":?" is a quantifier meaning that whatever comes immediately before it occurs 0 times or 1 time. There are other quantifiers too; the table below lists all of them.</span></span></div>
<div dir="ltr" style="margin-left: 0pt;">
<table style="border-collapse: collapse; border: none;"><colgroup><col width="149"></col><col width="149"></col><col width="149"></col><col width="148"></col></colgroup><tbody>
<tr style="height: 21pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Quantifier</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Description</span></div>
</td><td colspan="2" style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Example</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">one or more of whatever is before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake ":+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake", "cake cake cake", "cake cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:?</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">zero or one of whatever is before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake":?</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"", "cake"</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">:*</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">zero or more of whatever is before</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"cake":*</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"", "cake", "cakecakecakecake"</span></div>
</td></tr>
</tbody></table>
</div>
<span style="font-family: "arial"; white-space: pre-wrap;">The quantifier applies to whatever unit comes immediately before it. If that unit is a terminal string, the string needs to be surrounded by quotation marks. If it’s a non-terminal, this isn’t necessary.</span><br />
<div dir="ltr" style="margin-left: 0pt;">
<table style="border-collapse: collapse; border: none; width: 451.27559055118115pt;"><colgroup><col width="*"></col><col width="*"></col></colgroup><tbody>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Example</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Output</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">lyrics -> line:+</span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">line -> "na" </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"na", "nananana", "nana", etc</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">lyrics -> "na":+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"na", "nananana", "nana", etc</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">lyrics -> "n" "a":+</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"na", "naaaaa", "naaa", "naaaaaa" etc</span></div>
</td></tr>
</tbody></table>
</div>
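<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The way a quantifier attaches to the unit right before it can also be tried out in Python's re module, where +, ?, * and | play roughly the role of Nearley's :+, :?, :* and |. This is only a comparison sketch: regular expressions match text rather than generate it, and the helper name fully_matches is made up for illustration.</span></div>

```python
import re


def fully_matches(pattern, text):
    """Return True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None


# "(na)+" is the analogue of  "na":+   (the whole unit "na" repeats)
print(fully_matches(r"(na)+", "nananana"))  # True
# "na+" is the analogue of  "n" "a":+  (only the "a" repeats)
print(fully_matches(r"na+", "naaaa"))       # True
print(fully_matches(r"na+", "nana"))        # False
# "(cake)?" is the analogue of  "cake":?  (zero or one)
print(fully_matches(r"(cake)?", ""))        # True
# "bird|book" is the analogue of  "bird" | "book"
print(fully_matches(r"bird|book", "book"))  # True
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">The parentheses in the regular expression do the job that the quotation marks or a non-terminal name do in Nearley: they mark out the unit that the quantifier repeats.</span></div>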
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">These conventions, the pipe symbol for "or" ("|") and the quantifiers (*, +, ?), are very common in regular expressions, which many programming languages use for text searches. They work much the same in Python, R, etc. It’s good to learn these conventions if you want to do anything quantitative with text, regardless of discipline. They're very handy in linguistics when you want to search through corpora, databases or texts.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
The last thing that is different is that Nearley will not put spaces between things unless you explicitly tell it to. You can solve that either by putting spaces inside your terminals, or by defining a non-terminal as a space. More on this later.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
Here is a summary of the differences between the conventional way and Nearley:</span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0Lf5HHPcMncoMQcZ_b_vrcEwg8VisLov9qY3U704o6rWgc11Tfy_sS-6LN0tQv0DlV6n0dzNRvuk88jX_C-nPRfKPv9dtEUC_vqYw97SvV1VUuFtwgg166iQoMmGzy0jJPJlxusiQy_t7/s1600/p2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="color: black;"><img border="0" data-original-height="538" data-original-width="874" height="392" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0Lf5HHPcMncoMQcZ_b_vrcEwg8VisLov9qY3U704o6rWgc11Tfy_sS-6LN0tQv0DlV6n0dzNRvuk88jX_C-nPRfKPv9dtEUC_vqYw97SvV1VUuFtwgg166iQoMmGzy0jJPJlxusiQy_t7/s640/p2.png" width="640" /></span></a></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">Here is the same grammar and lexicon that we saw above, written in the conventional linguistic notation and in Nearley.</span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcFOGPM6ezpUs9Fzsv0QajDZY1pzXipCDBOeX3Tpk4L9VdJDjPWSGYPnizYHcpcuZPsEgVAcrPcrnGyUa-kAd8mWIAtKEkRllN82k-GvzvihL53sM-s45u9kapCFR8mgC7brFB9KTUwWMP/s1600/p3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="color: black;"><img border="0" data-original-height="594" data-original-width="935" height="406" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcFOGPM6ezpUs9Fzsv0QajDZY1pzXipCDBOeX3Tpk4L9VdJDjPWSGYPnizYHcpcuZPsEgVAcrPcrnGyUa-kAd8mWIAtKEkRllN82k-GvzvihL53sM-s45u9kapCFR8mgC7brFB9KTUwWMP/s640/p3.png" width="640" /></span></a></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; font-weight: 700; white-space: pre-wrap;">Spaces </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Spaces will not be inserted into the output unless you specify them, either with a non-terminal or inside the terminal strings. In the example above, we’ve put spaces into the terminals so that the output becomes readable. If they weren’t there, we’d get: "</span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Marysleptalittlelamb".
</span><span style="white-space: pre-wrap;"><span style="font-family: "courier new";">
</span><span style="font-family: "arial";">Another way of doing this is to define a non-terminal that expands to a space. In these examples, we will use the underscore (_) for that non-terminal.</span></span><br />
<span style="font-family: "courier new"; text-indent: 36pt; white-space: pre-wrap;"><br /></span>
<span style="font-family: "courier new"; text-indent: 36pt; white-space: pre-wrap;">SENTENCE -> NP _ V _ NP</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-indent: 36pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span class="Apple-tab-span" style="white-space: pre;"> </span></span><span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NP -> "Girls"|"pandas"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-indent: 36pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> V -> "love"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-indent: 36pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> _ -> " "
</span><span style="text-indent: 36pt; white-space: pre-wrap;"><span style="font-family: "courier new";">
</span><span style="font-family: "arial";">Generates, for example:</span></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-indent: 36pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Girls love pandas
</span><span style="white-space: pre-wrap;"><span style="font-family: "courier new";">
</span><span style="font-family: "arial";"><b>Advanced: recursion and ambiguity</b></span></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It is easiest if you make your grammars unambiguous and non-recursive. That will be kinder to your browser and processor, and it will probably make for a more readable grammar. Nearley can handle recursion and ambiguity, but they can cause trouble, so it's easier to avoid them if possible, particularly in the beginning. </span><a href="https://nearley.js.org/" style="text-decoration: none;"><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">If you want to read more about that, go here</span></a><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
<b>Examples</b></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">These are the basics you need. Now we can get going and create some fun things!
</span><span style="font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;">
Did you ever play "Mad Libs" in school? It's a game where you take a template and fill it with some words, and have lots of fun (apparently). </span><a href="http://www.sundhagen.com/babbooks/adlib.cgi" style="text-decoration-line: none;"><span style="font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;">There's a website here for kids where you can do this.</span></a><span style="font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;"> It's a trick teachers use to teach word classes to their students. Here's an example:</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="448" src="https://lh5.googleusercontent.com/iQAXI4Y_puxDoCTcEUzg-iMU4wlbjD8o0m0F0V04q_yPrhabFPVrqcyusk7ZXfWo8xzWyjBd_jRW7hWCMsrVIYSl4NrcGbIYCVmBQXA8C45pkI3WsiiP7iIYA9nO90BG3Es4wFSD" style="-webkit-transform: rotate(0.00rad); border: none; transform: rotate(0.00rad);" width="346" /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">What we're going to do with the Nearley grammar, Midsomer Murders and the Beatles is actually not that different. We're going to keep some things fixed, and then vary the other items to create new, unique lines.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
<b>Example: generating Beatles-lyrics</b></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Let’s consider the song "Hey Jude" by the Beatles. It was written by Paul McCartney in 1968 and </span><span style="background-color: white; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">was at the time the longest single ever to top the British charts. The single has sold approximately eight million copies and is frequently included on professional critics' lists of the greatest songs of all time. In 2013, </span><a href="https://en.wikipedia.org/wiki/Billboard_(magazine)" style="text-decoration: none;"><span style="background-color: white; font-family: "arial"; font-style: italic; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">Billboard</span></a><span style="background-color: white; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> named it the 10th "biggest" song of all time. </span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://lh3.googleusercontent.com/tLq8GbUxuhMF9HBxHKdrOibR6Lcn8Dz9-JJrcMs-2862djXW_ju8WjBF7moWVBlL5JbPeCDEQqJnZDzIjPXf4C7SH78npxVYZEuptVK_XpxsPTGwQc6uNMOkRJWMNiAMyrDpk-D4" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><span style="color: black;"><br /><br /><img alt="hey-jude-flow-chart.png" border="0" height="393" src="https://lh3.googleusercontent.com/tLq8GbUxuhMF9HBxHKdrOibR6Lcn8Dz9-JJrcMs-2862djXW_ju8WjBF7moWVBlL5JbPeCDEQqJnZDzIjPXf4C7SH78npxVYZEuptVK_XpxsPTGwQc6uNMOkRJWMNiAMyrDpk-D4" style="border: none; transform: rotate(0rad);" width="295" /></span></a></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: white; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span>
<span style="background-color: white; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The song's original title was "Hey Jules", and it was intended to comfort Julian Lennon during the stress of his parents' divorce. McCartney later said, "I knew it was not going to be easy for him", and that he changed the name to "Jude" "because I thought that sounded a bit better".</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The song features quite simple lyrics; someone even drew a flowchart to represent them.
</span><span style="font-family: "arial"; white-space: pre-wrap;">
If we wanted to render the "na" of this song, we’d want either one or more than one (potentially an unlimited number). This would be expressed like this:</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 48.75pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"na":+</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In this example we're going to mix terminals and non-terminals. The terminals will be the fixed elements, and the non-terminals will determine what we fill the "gaps" with. We can generate any line of the song, according to this flowchart, like this:</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SONGVERSE -> "Hey Jude, don’t " LINE1 ". Remember to " LINE2 " then you " LINE3 " to make it better. Better better better better better waaaaaa" NA</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LINE1 -> "make it bad, take a sad song and make it better" | "be afraid, you were made to go out and get her" | "let me down, you have found her, now go and get her"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LINE2 -> "let her into your heart" | "let her under your skin"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LINE3 -> "can start" | "begin"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NA -> " na":+</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Copy and paste this bit of code into </span><a href="http://omrelli.ug/nearley-playground/" style="text-decoration: none;"><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">the Nearley parser playground</span></a><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, and click "generate" and off we go!</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="187" src="https://lh3.googleusercontent.com/RRIIarIXlsjHJheuFf6F8LmgA5wURTFb-ygLqUKLlOCVUAnYkhxtK5z2LwTEm4cFQCcbfuVox73PzjqlDQIeCcSmVjo4-8Fnso4rssF7o1HxDfKWXzdzNvIqLzlBUWiDGWjDJuME" style="-webkit-transform: rotate(0.00rad); border: none; transform: rotate(0.00rad);" width="245" /></span></div>
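<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">If you would like to see what "generate" roughly amounts to, here is a minimal Python sketch of random expansion over the same rules. It is not how Nearley works internally. The dictionary layout, the function name expand and the cap MAX_REPEATS on ":+" are all assumptions made for illustration.</span></div>

```python
import random

MAX_REPEATS = 8  # assumption: ":+" is capped so that generation always stops

# Non-terminals are the dictionary keys; every other string is a terminal.
# A tuple (unit, "+") stands for  unit:+  (one or more repetitions).
GRAMMAR = {
    "SONGVERSE": [["Hey Jude, don't ", "LINE1", ". Remember to ", "LINE2",
                   " then you ", "LINE3", " to make it better. "
                   "Better better better better better waaaaaa", "NA"]],
    "LINE1": [["make it bad, take a sad song and make it better"],
              ["be afraid, you were made to go out and get her"],
              ["let me down, you have found her, now go and get her"]],
    "LINE2": [["let her into your heart"], ["let her under your skin"]],
    "LINE3": [["can start"], ["begin"]],
    "NA": [[(" na", "+")]],
}


def expand(symbol, rng):
    """Expand one symbol into text, picking alternatives at random."""
    if isinstance(symbol, tuple):  # quantified unit: repeat it 1..MAX_REPEATS times
        unit, _ = symbol
        return "".join(expand(unit, rng) for _ in range(rng.randint(1, MAX_REPEATS)))
    if symbol not in GRAMMAR:  # terminal: emitted as-is, no spaces are added
        return symbol
    return "".join(expand(part, rng) for part in rng.choice(GRAMMAR[symbol]))


print(expand("SONGVERSE", random.Random()))
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="font-family: "arial"; white-space: pre-wrap;">As in Nearley, the pieces are simply glued together, so every space in the output comes from inside one of the terminal strings.</span></div>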
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Example: generating the summary of a Midsomer Murders plot</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Let’s do another example. This time, we’re going to try to impersonate </span><a href="https://twitter.com/midsomerplots" style="text-decoration: none;"><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">the hilarious "Midsomer Murders Bot" on Twitter.</span></a><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> Midsomer Murders is a popular British murder mystery TV show that has been running since 1997, telling the stories of a seemingly never-ending stream of murders in the English countryside. The show is known for being a bit silly sometimes, and also a bit templatic. This little grammar right here will produce funny plot summaries of the show, à la the bot on Twitter.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">MIDSOMER_MURDERS_PLOT -> LOCAL PROFESSION FOUND DEAD PLACE SUSPICION SUSPECT ANGRY BLAMED THREAT THREATENED</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LOCAL -> "A local "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PROFESSION -> "linguist" | "philosopher" | "novelist"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">FOUND -> " is found "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">DEAD -> "drowned" | "strangled" | "dead" | "hanged" | "battered" | "suffocated" | "shot" </span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PLACE -> " in the coffee shop." |" in the swimming pool." | " after band practice."| " behind the primary school."</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SUSPICION -> " Suspicion falls on the village "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">SUSPECT -> "baker" | "pastor" | "mailman" | "florist" | "nerd" | "twins"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">ANGRY -> ", angry that the "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">BLAMED -> "new wind farm" | "pig" | "pub" | "decline in newspaper reading" | "metric system"</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">THREAT -> " might threaten "</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">THREATENED -> "the village fabric." | "the Old Inn." | "the cow farm." | "the annual Full Moon party." | "what little sexual tension the village has left."</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">This gives us, for example:</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"A local linguist is found battered in the coffee shop. Suspicion falls on the village nerd, angry that the pub might threaten what little sexual tension the village has left."</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">or</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"A local novelist is found shot after band practice. Suspicion falls on the village mailman, angry that the pub might threaten the Old Inn."</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">or</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">"A local philosopher is found dead after band practice. Suspicion falls on the village florist, angry that the decline in newspaper reading might threaten the village fabric."</span></div>
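<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">If you are curious what "generate" is doing, here is a minimal sketch in Python. It is not how the Nearley playground is implemented, just the same idea under simple assumptions: the grammar is stored as a dictionary, anything that is not a key in that dictionary counts as a terminal, and one alternative is picked at random each time. The names GRAMMAR, lexicon, generate and count_outputs are made up for this illustration.</span></div>

```python
import random


def lexicon(*strings):
    """Turn a list of terminal strings into one-symbol alternatives."""
    return [[s] for s in strings]


# The Midsomer grammar from above: non-terminal -> list of alternatives.
# Any symbol that is not a key of GRAMMAR is treated as a terminal string.
GRAMMAR = {
    "MIDSOMER_MURDERS_PLOT": [[
        "LOCAL", "PROFESSION", "FOUND", "DEAD", "PLACE", "SUSPICION",
        "SUSPECT", "ANGRY", "BLAMED", "THREAT", "THREATENED",
    ]],
    "LOCAL": lexicon("A local "),
    "PROFESSION": lexicon("linguist", "philosopher", "novelist"),
    "FOUND": lexicon(" is found "),
    "DEAD": lexicon("drowned", "strangled", "dead", "hanged", "battered",
                    "suffocated", "shot"),
    "PLACE": lexicon(" in the coffee shop.", " in the swimming pool.",
                     " after band practice.", " behind the primary school."),
    "SUSPICION": lexicon(" Suspicion falls on the village "),
    "SUSPECT": lexicon("baker", "pastor", "mailman", "florist", "nerd", "twins"),
    "ANGRY": lexicon(", angry that the "),
    "BLAMED": lexicon("new wind farm", "pig", "pub",
                      "decline in newspaper reading", "metric system"),
    "THREAT": lexicon(" might threaten "),
    "THREATENED": lexicon("the village fabric.", "the Old Inn.", "the cow farm.",
                          "the annual Full Moon party.",
                          "what little sexual tension the village has left."),
}


def generate(symbol, rng=random):
    """Expand a symbol by picking one alternative at random, recursively."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit the string as-is
    alternative = rng.choice(GRAMMAR[symbol])
    return "".join(generate(part, rng) for part in alternative)


def count_outputs(symbol):
    """Count the derivations of a symbol (this grammar has no recursion)."""
    if symbol not in GRAMMAR:
        return 1
    total = 0
    for alternative in GRAMMAR[symbol]:
        product = 1
        for part in alternative:
            product *= count_outputs(part)
        total += product
    return total


print(generate("MIDSOMER_MURDERS_PLOT"))       # a random plot summary
print(count_outputs("MIDSOMER_MURDERS_PLOT"))  # how many plots are possible
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The count is simply the product of the number of choices at each slot: 3 × 7 × 4 × 6 × 5 × 5 = 12,600 different plot summaries from twelve short rules.</span></div>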
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Just in case you think I'm being silly, here are some more scenes from the show:</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="180" src="https://lh3.googleusercontent.com/GCiCcUrqiurZQ3KGNCgO27TNpSQJnT7sFC2sF9dKx9xtu-vNLut8-ZvG9dtrK-d7JiHl4vO5ZKjYDTIMyL4kkZKkifSUEdKwbldtDptawx-DwKdqtUy7bBW_iXQ34LsQP0cftnjp" style="border: none; transform: rotate(0rad);" width="320" /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="180" src="https://lh3.googleusercontent.com/-4GNoKwwzmcZBzg5CP0zhOlclCkF0wm1OYbnaIHAK6CkqcyB-ZF3oiqhDdHQy-CTVsbmUWyhm55VVXB4Pmut6pLyTJfDKDcjm651z9_EIUtF7GunkJGbHFa96TMFwTh17Qje1GHR" style="border: none; transform: rotate(0rad);" width="320" /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Yes, this is a real show and it is brilliant.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<a href="https://twitter.com/Laserhedvig/status/913773977398476800" style="text-decoration: none;"><span style="color: black;"><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">Thanks to Twitter bot coders 沈马修 and </span><span style="background-color: white; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">Matthew Berryman </span><span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">from the University of Wollongong for indulging me when I first wrote this little script.</span></span></a></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><b style="font-weight: normal;"><br /></b>
</span><br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial" , "helvetica" , sans-serif; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Final notes</span></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Finally, there are just a few more things that are good to know before continuing.</span><br />
<div style="text-align: left;">
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<ul style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
<li style="background-color: transparent; font-style: normal; font-variant: normal; font-weight: 400; list-style-type: disc; text-decoration: none; vertical-align: baseline; white-space: pre;"><span style="font-family: "arial" , "helvetica" , sans-serif;">The web interface where you write Nearley colors non-terminals and terminals differently.
This is just to be helpful. Note that if you put a modifier after a non-terminal, it may
get the same color as a terminal.</span></li>
</ul>
<ul style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
<li style="background-color: transparent; font-style: normal; font-variant: normal; font-weight: 400; list-style-type: disc; text-decoration: none; vertical-align: baseline; white-space: pre;"><span style="font-family: "arial" , "helvetica" , sans-serif;">Terminals are strings, and should hence be surrounded by "". Make sure you get the
right kind of quotation marks: sometimes they get changed to so-called "smart/curly"
quotation marks *shudders*</span></li>
</ul>
<ul style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
</ul>
<ul style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
<li style="background-color: transparent; font-style: normal; font-variant: normal; font-weight: 400; list-style-type: disc; text-decoration: none; vertical-align: baseline; white-space: pre;"><span style="font-family: "arial" , "helvetica" , sans-serif;">Rules need to be written top-down. Nearley treats the first rule in the file as the
starting point, so the most abstract rule has to come first, followed by the rules
for the non-terminals it refers to. It’s sort of like a hierarchy from more abstract
units all the way down to more concrete ones. In our earlier example, the rules need
to be in this order:</span></li>
</ul>
<span style="font-family: "arial"; white-space: pre;"></span><br />
<ul style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
</ul>
<ol style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
<li><span style="font-family: "courier new" , "courier" , monospace;">SENTENCE -> SUBJ VERB OBJECT</span></li>
<li><span style="font-family: "courier new" , "courier" , monospace;">SUBJ -> "I "</span></li>
<li><span style="font-family: "courier new" , "courier" , monospace;">VERB -> " love "</span></li>
<li style="background-color: transparent; font-style: normal; font-variant: normal; font-weight: 400; list-style-type: decimal; text-decoration: none; vertical-align: baseline; white-space: pre;"><span style="font-family: "courier new" , "courier" , monospace;">OBJECT -> "chocolate"</span></li>
</ol>
<ol style="margin-bottom: 0pt; margin-top: 0pt;">
</ol>
<span style="font-family: "arial"; white-space: pre-wrap;"><br /></span>
<span style="font-family: "arial"; white-space: pre-wrap;">This order wouldn’t work:
</span><br />
<ol style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
<li><span style="font-family: "courier new" , "courier" , monospace;">SUBJ -> "I "</span></li>
<li><span style="font-family: "courier new" , "courier" , monospace;">VERB -> " love "</span></li>
<li><span style="font-family: "courier new" , "courier" , monospace;">OBJECT -> "chocolate"</span></li>
<li style="background-color: transparent; font-style: normal; font-variant: normal; font-weight: 400; list-style-type: decimal; text-decoration: none; vertical-align: baseline; white-space: pre;"><span style="font-family: "courier new" , "courier" , monospace;">SENTENCE -> SUBJ VERB OBJECT</span></li>
</ol>
<ol style="margin-bottom: 0pt; margin-top: 0pt; text-align: left;">
</ol>
<span style="font-family: "courier new" , "courier" , monospace;"><b style="font-weight: normal;"></b>
</span><br />
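<div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Two of the notes above are easy to check automatically before you paste a grammar into the playground. The sketch below is a rough helper in Python, not part of Nearley: it swaps curly quotes for straight ones, and it checks that no rule refers back to the first rule, which is one simple way to read "top-down". It assumes one rule per line and no "->" inside a terminal; the function names are made up for this illustration.</span></div>

```python
import re

# Curly ("smart") quotes mapped to their straight equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})


def straighten_quotes(grammar_text):
    """Replace curly quotation marks with plain straight ones."""
    return grammar_text.translate(CURLY_TO_STRAIGHT)


def parse_rules(grammar_text):
    """Return (name, referenced non-terminals) for each 'NAME -> ...' line.

    Assumption: one rule per line, and terminals are "double-quoted".
    """
    rules = []
    for line in grammar_text.splitlines():
        if "->" not in line:
            continue  # blank lines and #comments
        name, body = line.split("->", 1)
        body_without_strings = re.sub(r'"[^"]*"', " ", body)
        referenced = set(re.findall(r"\w+", body_without_strings))
        rules.append((name.strip(), referenced))
    return rules


def starts_with_top_rule(grammar_text):
    """True if the first rule is never referred to by any rule."""
    rules = parse_rules(grammar_text)
    if not rules:
        return False
    first_name = rules[0][0]
    return all(first_name not in referenced for _, referenced in rules)


good_order = '''SENTENCE -> SUBJ VERB OBJECT
SUBJ -> "I "
VERB -> " love "
OBJECT -> "chocolate"'''

bad_order = '''SUBJ -> "I "
VERB -> " love "
OBJECT -> "chocolate"
SENTENCE -> SUBJ VERB OBJECT'''

print(starts_with_top_rule(good_order))
print(starts_with_top_rule(bad_order))
```

<div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The check is deliberately crude. It only looks at the first rule, so it will not complain about the order of the remaining rules.</span></div>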
<div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">
These are the basics you need to get started. You’re now able to write some grammars of your own and fool around. </span><span style="color: black; font-family: "arial"; vertical-align: baseline; white-space: pre-wrap;">Thanks to</span><span style="font-family: "arial";"><span style="white-space: pre-wrap;"> Kartik Chandra for writing the parser and Guillermo Webster for making the playground.
</span></span><span style="font-family: "arial"; white-space: pre-wrap;">
Next time, we’re going to continue with some real-world examples from two actual languages, Samoan and Arabana!</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><img height="207" src="https://lh5.googleusercontent.com/IojTAaOzNNqLmTQFskQcbkU4P95whn7gmUJdI9C-5Qh95EXD02g3esOWrGAZdRmHiR8pIuuf9vHSgPFARd6Vex_8Sgxi9jvZhm_3_QoqGZtADlrqdikxFMGyu3pS2oa-W_LsDdZ3" style="-webkit-transform: rotate(0.00rad); border: none; transform: rotate(0.00rad);" width="368" /></span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Test</span></div>
<div style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">You said you wanted a test, did you? OK, sure. That's unusual... but luckily I had one prepared!</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Consider this little grammar and lexicon:</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#Grammar</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">DISNEY_MOVIE_TITLE -> NP CONJ NP PREP PLACE</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">NP -> ADJ N</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">#Lexicon</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">ADJ -> "Pretty "| "Little " | "Sweet " | "Sad " |"Honest "</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">N -> "Prince" | "Princess" | "Frog" | "Dog" | "Pumpkin"</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PREP -> " in" | " outside of" | " behind" | " near" | " at"</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">CONJ -> " or " | " and "</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">PLACE -> " Colorado" | " Sea World" | " Disneyland" | " Target"</span></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<br /></div>
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">Are the strings below grammatical or not?</span></div>
<div style="margin-left: 27pt;">
<table style="border-collapse: collapse; border: none; width: 451.27559055118115pt;"><colgroup><col width="64"></col><col width="312"></col><col width="*"></col></colgroup><tbody>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><br /></td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">String</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Grammatical according to rules above?</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">A</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Pretty Prince or Honest Frog in Colorado</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><br /></td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">B</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Honest Frog near Sea world</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><br /></td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">C</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Sweet and Sad Dog at Target</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">D</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Honest Frog and Sweet Pumpkin near Sea World</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">E</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Prince or Frog in Colorado</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">F</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Sweet Dog and Sad Dog at Target</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">G</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 7.086614173228355pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "courier new"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Frog in Colorado</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><br /></td></tr>
</tbody></table>
</div>
<br />
<div style="line-height: 1.38; margin-bottom: 0pt; margin-left: 27pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 8pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre;">Answers:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 8pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br class="kix-line-break" /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 8pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">A = T, B = F, C = F, D = T, E = F, F = T, G = F.</span><br />
<span style="background-color: transparent; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span>
<span style="background-color: transparent; color: #990000; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">EDIT
That was fun, making rules and a lexicon. But, you might want to draw trees of your sentences too, no?
Unfortunately, I don't know of an app that does both. Gothenburg University used to have one, but it is no longer active.
But! We can go<a href="http://ironcreek.net/phpsyntaxtree/"> here</a> and draw specific trees nicely. That's a bit helpful, if not a full solution.</span><br />
<span style="background-color: transparent; color: #990000; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span>
<span style="background-color: transparent; color: black; font-family: "arial"; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span>
</div>
</div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-35602940160967962202018-04-26T18:45:00.001+10:002018-05-08T20:26:24.880+10:00X-rays of vocal tracts<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: left;">
Dear everyone,<br />
<br />
The vocal tract is a fabulous thing, and most of it is hidden from our eyes. That's why x-rays are so awesome. In this post, we're going to look at the vocal tract in profile through x-rays. It's not about grammar, but it is entertaining and educational.<br />
<br />
Here are 3 gifs of people using their vocal tracts. These images are from Christine Ericsdotter's research (Ericsdotter 1999, Stark et al. 1999).<br />
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixPr-w-WGLUrrYbBWkb0rdpe4m0BGnolsQGKD4vvL9fcBLD01LZ42LTOsJWbT4CV8yCCVRSE3REtCJRJPFKQCx6879OWVh6nn58lWU8EkqJG0VhlNHf2hdXq2b3ZCDiq-jdPQCsuvnXfU/s1600/bothred.gif" style="margin-left: auto; margin-right: auto;"><span style="color: black;"><img border="0" height="397" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixPr-w-WGLUrrYbBWkb0rdpe4m0BGnolsQGKD4vvL9fcBLD01LZ42LTOsJWbT4CV8yCCVRSE3REtCJRJPFKQCx6879OWVh6nn58lWU8EkqJG0VhlNHf2hdXq2b3ZCDiq-jdPQCsuvnXfU/s400/bothred.gif" width="400" /></span></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">Female subject saying Swedish ''både'' (English: both)<br />
Pictures from <a href="http://www2.ling.su.se/staff/ericsdotter/projects/xray_info.html">Christine Ericsdotter's website</a></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhytTNPTygi-KX7dcLGxrmjm-_eSy20HWZDl2xunXht2UhQ6riRM6msB5Sr4NiHDPlQARHUWjckaRpWNVEKCI8yUMX_-geeEhi-G5KXh5RDI12G7ksyfqKBOKP3D5TIUIqXgtRsdmSXfvE/s1600/cebm.gif" style="margin-left: auto; margin-right: auto;"><span style="color: black;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhytTNPTygi-KX7dcLGxrmjm-_eSy20HWZDl2xunXht2UhQ6riRM6msB5Sr4NiHDPlQARHUWjckaRpWNVEKCI8yUMX_-geeEhi-G5KXh5RDI12G7ksyfqKBOKP3D5TIUIqXgtRsdmSXfvE/s400/cebm.gif" width="370" /></span></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">Female subject exploring her vocal tract<br />
Pictures from <a href="http://www2.ling.su.se/staff/ericsdotter/projects/big_mouth.htm">Christine Ericsdotter's website</a></td></tr>
</tbody></table>
<div style="text-align: left;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><span style="font-size: xx-small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxzeEV6mh7DlryQpz7XFilZW0WrdZzTtfD88W2Ga5hIA1SxXRzOVhXguHIA9E63kHPUyai0oj33Y4KWpifuY2RQox-cjmJyLkXFGAEzJR1GHmw8V0WIDxYGJhN8IQZMESWkck6MHu7uqw/s1600/pion_fp11.gif" style="margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxzeEV6mh7DlryQpz7XFilZW0WrdZzTtfD88W2Ga5hIA1SxXRzOVhXguHIA9E63kHPUyai0oj33Y4KWpifuY2RQox-cjmJyLkXFGAEzJR1GHmw8V0WIDxYGJhN8IQZMESWkck6MHu7uqw/s400/pion_fp11.gif" width="400" /></a></span></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;"><span style="font-size: xx-small;">Male subject saying Swedish</span><span style="font-size: xx-small;"> ''pion''</span><span style="font-size: xx-small;"> [piun]<br />Pictures from <a href="http://www2.ling.su.se/staff/ericsdotter/projects/pion_fp11.htm">Christine Ericsdotter's website</a></span></td></tr>
</tbody></table>
</div>
<div style="text-align: left;">
<b>The study behind the images above</b><br />
Christine and many others from the <a href="http://www.ling.su.se/english/research/research-areas/research-in-phonetics-1.14251">phonetics section of the Department of Linguistics at Stockholm University</a> have been involved in a study that took place at the Danderyd hospital. They took x-rays of people while they spoke. The aim was to learn more about dental and retroflex stops (different types of consonants) and their variation in different vowel contexts.<br />
<br />
The subjects were exposed to as little radiation as possible, and all appropriate safety requirements were met. The radiation dose was smaller than that of a typical visit to the dentist.<br />
<br />
<b>The vocal tract</b><br />
Here is a more detailed diagram of the vocal tract with labels.<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="color: black; margin-left: 1em; margin-right: 1em;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8Y_uz8nDGAeJErHhuHQxL_fTXD0dnF0WFRJ9MFW6rpYx_zoxYRMnuGbltUiHW2PBV6mnwfZkBY3fpJ3AINeWZooypYrAm2hKMRRvzIjSfACZysotWAAKDnbev9s0Nek7vDobFgVurGQg/s1600/image03.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8Y_uz8nDGAeJErHhuHQxL_fTXD0dnF0WFRJ9MFW6rpYx_zoxYRMnuGbltUiHW2PBV6mnwfZkBY3fpJ3AINeWZooypYrAm2hKMRRvzIjSfACZysotWAAKDnbev9s0Nek7vDobFgVurGQg/s400/image03.jpg" width="400" /></a></span></div>
<br /></div>
<div style="text-align: left;">
<b>Vowel chart</b><br />
The standard way of representing vowels is through a chart like so:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkkqmvHgFw81sDP9L0Zw4Yq3cHZ9s46uThVve9QABf2M9GSzdytZqb2dnm2PY1DjCgsa5zO28Z824hBTmUdd5nINaLrmyE6XvjqKuR2zhn83tQgoxHnbikj9PYs5r7yf8kpRIWQ-Y6itE/s1600/ipa_vowel_chart_2005s.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="321" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkkqmvHgFw81sDP9L0Zw4Yq3cHZ9s46uThVve9QABf2M9GSzdytZqb2dnm2PY1DjCgsa5zO28Z824hBTmUdd5nINaLrmyE6XvjqKuR2zhn83tQgoxHnbikj9PYs5r7yf8kpRIWQ-Y6itE/s400/ipa_vowel_chart_2005s.png" width="400" /></a></div>
<br />
<br />
The vowel chart is meant to show how the phonetic symbols representing vowels relate to each other. The vowel chart actually maps onto the vocal tract, like so:<br />
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoBs1Y4-4LLLAgN1b6ihrHu3H0wDbof8pGsZiahVS-Ww1VSgplTTgNeOJ6vZalPVsDVO-grMMIQt9oYNOfP6zeX5aVe9bjuyxIhrr4n2RCLUekIuO7rhhuX8VXlCBk_nzIOnXW-neyWGI/s1600/vocal+tract.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="color: black;"><img border="0" height="255" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoBs1Y4-4LLLAgN1b6ihrHu3H0wDbof8pGsZiahVS-Ww1VSgplTTgNeOJ6vZalPVsDVO-grMMIQt9oYNOfP6zeX5aVe9bjuyxIhrr4n2RCLUekIuO7rhhuX8VXlCBk_nzIOnXW-neyWGI/s320/vocal+tract.tiff" width="320" /></span></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">Vowel chart and its relation to the vocal tract. Credit: Mikael Parkvall</td></tr>
</tbody></table>
<br />
<b>One of the things Christine taught me</b><br />
Christine Ericsdotter was my tutor in phonetics, and she taught me a little phrase in Swedish that can help one remember how the four front unrounded vowels and the four back rounded are pronounced: "v<b>i</b> b<b>e</b>r f<b>e</b>m m<b>a</b>n dr<b>a</b> b<b>o</b>rt v<b>å</b>rt b<b>o</b>rd"<br />
<div style="text-align: left;">
<div class="separator" style="clear: both; text-align: center;">
</div>
</div>
<br />
<div style="text-align: left;">
<div class="separator" style="clear: both; text-align: center;">
<span style="color: black;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkS6FoXGUDRKip3OEtzcp0i1wSJ2OwouZsLVd53w5r9BtEsyOCPld6KWlwMf2fp0oD4HE7CKY-cBDOeHLG-FHqDfqRkVKKAv79CaoQrbxMiC46cDCvKrNW60o5s954CcUrBN6zubkQVjg/s1600/viberfem.tiff" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="217" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkS6FoXGUDRKip3OEtzcp0i1wSJ2OwouZsLVd53w5r9BtEsyOCPld6KWlwMf2fp0oD4HE7CKY-cBDOeHLG-FHqDfqRkVKKAv79CaoQrbxMiC46cDCvKrNW60o5s954CcUrBN6zubkQVjg/s400/viberfem.tiff" width="400" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkkqmvHgFw81sDP9L0Zw4Yq3cHZ9s46uThVve9QABf2M9GSzdytZqb2dnm2PY1DjCgsa5zO28Z824hBTmUdd5nINaLrmyE6XvjqKuR2zhn83tQgoxHnbikj9PYs5r7yf8kpRIWQ-Y6itE/s1600/ipa_vowel_chart_2005s.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"></a></span></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
See? Isn't it clever? I think it is<b> </b>terribly clever.<br />
<br />
<b><a href="https://www.facebook.com/maxplanckgesellschaft/videos/1896407567037041/?hc_ref=ARTmV-3qc8FRUU5InATMgxrC_yyWWA_6Q5kw1TIjZs-aPV8AP98Bq5zLLZ8KALDFuNk&fref=gs&dti=174954463067345&hc_location=group">Award-nominated x-ray video</a></b><br />
<a href="https://www.facebook.com/maxplanckgesellschaft/videos/1896407567037041/?hc_ref=ARTmV-3qc8FRUU5InATMgxrC_yyWWA_6Q5kw1TIjZs-aPV8AP98Bq5zLLZ8KALDFuNk&fref=gs&dti=174954463067345&hc_location=group">The Max Planck Society recently released an x-ray video of a man speaking German that's doing the rounds on social media and being nominated for awards. Have a look at it here.</a><b><i><span style="color: #660000;"><br /><br />EDIT: The video from the MPG by Jens Frahm is MRI (MRT in German), not x-rays. </span></i></b><br />
<br />
<b>Trumpet blowing in x-ray!</b><br />
For funsies, here are some x-rays of trumpet blowing. Compare them to the vowel chart for maximal education.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/tpOwuAMqFTA/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/tpOwuAMqFTA?feature=player_embedded" width="320"></iframe></div>
<div style="text-align: center;">
<b><br /></b>
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/pjSBQl38ksk/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/pjSBQl38ksk?feature=player_embedded" width="320"></iframe></div>
</div>
<div style="text-align: center;">
<div class="separator" style="clear: both;">
<br /></div>
</div>
</div>
<div style="text-align: center;">
<div style="text-align: left;">
That's it, we were just doing some vocal tract ogling. I hope you enjoyed it!</div>
<div style="text-align: left;">
<b><br /></b></div>
<div style="text-align: left;">
<b>Over and out</b></div>
<div style="text-align: left;">
<b><br /></b></div>
</div>
<b>References:</b><br />
Ericsdotter, C. (1999) Modeling lingual coarticulation in coronal stops. Master's thesis in Phonetics, Department of Linguistics, Stockholm University & Department of Speech, Music and Hearing, Royal Institute of Technology, Stockholm, Spring 1999.<br />
<br />
Stark, Ericsdotter, Branderud, Sundberg, Lundberg & Lander (1999) The Apex Model As A Tool In The Specification Of Speakerspecific Articulatory Behavior. PDF:<br />
<a href="https://www.researchgate.net/publication/2359061_The_Apex_Model_As_A_Tool_In_The_Specification_Of_Speakerspecific_Articulatory_Behavior?ev=auth_pub"><span style="color: black;">https://www.researchgate.net/publication/2359061_The_Apex_Model_As_A_Tool_In_The_Specification_Of_Speakerspecific_Articulatory_Behavior?ev=auth_pub</span></a></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com2tag:blogger.com,1999:blog-1300680252997007251.post-53486583411753875162018-02-27T07:43:00.000+11:002018-02-27T07:44:46.444+11:00A visit to the Royal Museum for Central AfricaTwo weeks ago I went to visit the <a href="http://www.africamuseum.be/home" target="_blank">Royal Museum for Central Africa</a> in Tervuren, Belgium. The reason was straightforward: in the ongoing endeavour to study Bantu gender systems, there are at least 10 north-west Bantu languages for which the only easily available information is located there. That is not an insignificant number, given that the total sample is about 250 languages, of which 50 or so have no information available anywhere, so it seemed worthwhile to go there and collect as much data as possible.<br />
<br />
The museum itself has been closed since 2013 and, bad timing on my part, is supposed to re-open late this year. The main building is spectacular, surrounded by a beautiful park, and the immense collections make the RMCA one of the world's centers for the study of Central Africa. Once the museum opens again, there will be a permanent exhibition on the languages and linguistics of Central Africa, so if you have a chance to go there you should totally do that.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUC03jiT70vszWTPyefFAVBzUeQ0VtOdtW8ll-yAsKB-hvamyBrAscFfhtxlMgbutn2HFDgqA7ed1otjbdsR5EJv8Fa4qjq9levJODxmXNWrLS1faAUbouvTjmCgnnOWp9HkPKRSfHhLM2/s1600/IMG_0154.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="993" data-original-width="1600" height="395" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUC03jiT70vszWTPyefFAVBzUeQ0VtOdtW8ll-yAsKB-hvamyBrAscFfhtxlMgbutn2HFDgqA7ed1otjbdsR5EJv8Fa4qjq9levJODxmXNWrLS1faAUbouvTjmCgnnOWp9HkPKRSfHhLM2/s640/IMG_0154.JPG" width="640" /></a></div>
<br />
<br />
Unfortunately, I did not set foot in the beautiful main building. The RMCA library is located in an ugly building shared with, for whatever reason, the Belgian Federal Agency for the Safety of the Food Chain. This building also houses researchers in linguistics and musicology, the main library, and, most importantly, the linguistics library. I was allowed to sit in the kitchen and go about my business, which I thought would mostly involve coding languages as fast as I could, but which quickly evolved into photographing as many documents as I could, as I realized that I would never manage to go through everything they had on the spot.<br />
<br />
I came across some really cool things, including Meeussen's handwritten 'Notes on Buyu', which is the only document (in the world?) on <a href="http://glottolog.org/resource/languoid/id/buyu1239" target="_blank">Buyu</a>, a Bantu D language spoken in the eastern Democratic Republic of Congo. It was apparently lost for a while, and found again only months before I arrived, so it was pretty amazing to see it. Because it is handwritten, I had to code it on the spot, as the photographs I took (see below) are at least in part unreadable.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuo_F12U_yvu7o3ucrmYxMtuPLOQ6jUv7taA8X9VaOVkitbpiTF52HhvXWhseoVBUVf7NYYUnS2-Wqp0TWRx-dLYmyoZ-PabQn7oTSptZnoih7Dim7_kfsShhtq6-xi_bZxXGYW0S_maxk/s1600/meeussen_buyu1951_01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuo_F12U_yvu7o3ucrmYxMtuPLOQ6jUv7taA8X9VaOVkitbpiTF52HhvXWhseoVBUVf7NYYUnS2-Wqp0TWRx-dLYmyoZ-PabQn7oTSptZnoih7Dim7_kfsShhtq6-xi_bZxXGYW0S_maxk/s640/meeussen_buyu1951_01.png" width="480" /></a></div>
<br />
<br />
The document included several pages that were not intended to appear in the ultimate version of this work, which never materialized, including the page below where the author seemed to be preoccupied with drawing angles:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0Kwa65DaRi9OavHKVHLvjqT_zcQAvyRyMD0je26_utYhVadNH9TqxBEuPjYeKa9Hg_PmmQP7mXsg2wivSlMmfonUK-mplRwleM-6XAUp8wlbBz9vmG6Oe_IZ1uMxGiyBkscYvnzUJnaZ-/s1600/meeussen_buyu1951_p80.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0Kwa65DaRi9OavHKVHLvjqT_zcQAvyRyMD0je26_utYhVadNH9TqxBEuPjYeKa9Hg_PmmQP7mXsg2wivSlMmfonUK-mplRwleM-6XAUp8wlbBz9vmG6Oe_IZ1uMxGiyBkscYvnzUJnaZ-/s640/meeussen_buyu1951_p80.png" width="480" /></a></div>
<br />
<br />
As it turned out, I could find no evidence that Buyu behaves any differently from its better-studied sister Holoholo. Buyu and Holoholo are part of a group of languages that seems to have migrated into the forest from the south but, unlike another group of Bantu D languages that traveled through the forest (including Bila, Amba, Komo, Kari, Homa, Bodo, Bera, and Beeke), has kept a more or less traditional noun class system. Bila and its sisters all have (heavily) restructured noun class systems, with some no longer having gender in any meaningful sense.<br />
<br />
Other very cool finds at the museum included mostly hand drawn maps or maps I had never seen before, but which would be great to have in a large format:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUPA0tNNluxQVjkgeMNdOxd3Ui_iRi8LxOh0MJ9ce0ycCHxCUCcQRRQbt5rOWE_FUatS9QgMbtbOnu9Y7R4uSDMsuCNxTWa-Z4LXwp-slvU5V1BwsOF0vccJrmGU1sTz1I5VAC2EEZEZuw/s1600/makila_mbala1981_map.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUPA0tNNluxQVjkgeMNdOxd3Ui_iRi8LxOh0MJ9ce0ycCHxCUCcQRRQbt5rOWE_FUatS9QgMbtbOnu9Y7R4uSDMsuCNxTWa-Z4LXwp-slvU5V1BwsOF0vccJrmGU1sTz1I5VAC2EEZEZuw/s640/makila_mbala1981_map.png" width="480" /></a></div>
<br />
<div style="text-align: center;">
Makila, Moyo-Kayita. (1981) Esquisse de grammaire mbala: morphologie et syntaxe. Lubumbashi: Université Nationale du Zaïre (UNAZA) MA thesis.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDK5OGgGNv_jiOvORDVfQrAxyb0zQ-t3sgCq5dqj8J3_nTE20o8xe0FV-pDWCOAHFcTlVk2BaXAy7d8kthTlk12d61z1Qm9-5vX9ab2WEcsP-sTjSkTpiDBxhx3V1XdAy8zCFbZBsoR2Vj/s1600/nkabuwakabili_boa1986_map.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDK5OGgGNv_jiOvORDVfQrAxyb0zQ-t3sgCq5dqj8J3_nTE20o8xe0FV-pDWCOAHFcTlVk2BaXAy7d8kthTlk12d61z1Qm9-5vX9ab2WEcsP-sTjSkTpiDBxhx3V1XdAy8zCFbZBsoR2Vj/s640/nkabuwakabili_boa1986_map.png" width="480" /></a></div>
<br />
<div style="text-align: center;">
Nkabuwakabili, A. (1986) Esquisse de la langue Boa (C44). Université Libre de Bruxelles MA thesis.</div>
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6KDkGb2dcDgjqs3aS-xsFcmTW9zXdnUSFjZafubtQclPQ688zYxYRlD8QEIVKv3upL3YBqYcnl9X3wtXT7JzGGNT_kWgbNRGX2zIV3qGLvPKy34Sh6MfQ-IvwEgMKotyzINDYlSJEh3dp/s1600/nzang-bie_mmala1989_map.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1600" data-original-width="1200" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6KDkGb2dcDgjqs3aS-xsFcmTW9zXdnUSFjZafubtQclPQ688zYxYRlD8QEIVKv3upL3YBqYcnl9X3wtXT7JzGGNT_kWgbNRGX2zIV3qGLvPKy34Sh6MfQ-IvwEgMKotyzINDYlSJEh3dp/s640/nzang-bie_mmala1989_map.png" width="480" /></a></div>
<br />
<div style="text-align: center;">
Nzang-Bie, Yolande. (1989) Éléments du description du mmǎlá: langue bantu de zone A. Université Libre de Bruxelles MA thesis.</div>
<br />
All in all, it was great to visit, even though it was really cold and I didn't have a chance to see the museum. The long-term plan is to continue with the rest of the Bantu family once the study of the north-west languages is finished, so who knows, maybe I'll get to visit again someday.<br />
<div>
<br /></div>
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />Annemarie Verkerkhttp://www.blogger.com/profile/14747297526182358435noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-37709239904386034292018-02-18T00:52:00.000+11:002018-02-27T19:23:09.782+11:00Testing Grammarly's Grammaticality Judgements<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin-bottom: 9px;">
<span style="-webkit-font-kerning: none; font-size: large;">I was looking back fondly over my past rejected job applications, when I came across a cover letter that I wrote in 2017 to <a href="https://app.grammarly.com/docs/267526709">Grammarly</a>, a grammar-checking app that came to prominence through its aggressive YouTube advertising campaigns. The company was advertising a job for linguists, aimed in particular at people working on natural language processing. </span><span style="font-size: large;">I used Grammarly for a few days, and had some fun</span><span style="font-size: large;"> devising tortuous variants of ungrammatical sentences to test what it could do. It does many things well, but it also makes bizarre mistakes which reveal that the app is not attempting to parse the sentences, or is doing so only very superficially, and on which it is outperformed by existing parsers such as the <a href="https://cloud.google.com/natural-language/">Google Natural Language API</a> or the <a href="http://nlp.stanford.edu:8080/parser/index.jsp">Stanford Parser</a>. I wrote these points up in a cover letter and sent it to them, which got me invited to an uninformative Skype interview and a polite rejection some time later. What follows is a brief summary of these points.</span><br />
<span style="-webkit-font-kerning: none; font-size: large;"><br /></span>
<span style="-webkit-font-kerning: none; font-size: large;">Grammarly corrects ‘I am disappointed of course’ to ‘I am disappointed in course’ (treating ‘disappointed of’ as a collocation error). At the time it also corrected ‘Last year we have been to Thailand’ to ‘Last year we were to Thailand’ (this issue has since been fixed). By themselves these problems are minor, but they may be part of a large class of ambiguities that could be tackled by having a probabilistic parse of the sentence, which would recognise, for example, that ‘of course’ normally occurs as an adverbial phrase, or that ‘been’ is the past participle of ‘go’ if its complement is a prepositional phrase.</span></div>
<div style="line-height: normal; margin-bottom: 9px;">
<span style="font-kerning: none; font-size: large;"><span style="font-family: "baskerville";">I constructed the four sentences below, </span><span style="font-family: "baskerville";">all of which are ungrammatical in that they have a plural verb (‘are’) where the subject (‘impact’) is singular. Grammarly detects the agreement error only in sentences 2 and 3, which are marked with an asterisk. Its judgement is affected by irrelevant features such as the choice of verb in the subordinate clause: ‘predict’ in (1) compared with ‘expect’ in (3). It is also thrown by</span></span><span style="font-family: "baskerville"; font-size: large;"> the irrelevant subject of the verb in the subordinate clause, such as ‘theories of economics’ in (4). </span><br />
<br /></div>
<ol>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;"> The impact, which people predict are bound to be felt sooner or later, could be enormous.</span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">*The impact, which they predict are bound to be felt sooner or later, could be enormous.</span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">*The impact, which people expect are bound to be felt sooner or later, could be enormous.</span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">The impact, which theories of economics expect are bound to be felt sooner or later, could be enormous.</span></span></li>
</ol>
<div>
<span style="font-family: "baskerville"; font-size: large;">I used another four sentences to show that it gets confused by different adjuncts, such as ‘maneuvering in markets for oil’ in (1) and ‘maneuvering in markets for oil and coffee’ in (2). Again, all of these sentences are ungrammatical ('have' should be 'has', to agree with 'maneuvering'), but Grammarly only recognises (1) and (4) as ungrammatical.</span></div>
<ol>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">*The maneuvering in markets for oil have brought billions in profits to banks. </span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">The maneuvering in markets for oil and coffee have brought billions in profits to banks.</span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">The maneuvering in markets for oils have brought billions in profits to banks.</span></span></li>
<li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin: 0px 0px 9px;"><span style="font-size: large;"><span style="font-family: "helvetica"; line-height: normal;"></span><span style="font-kerning: none;">*The maneuvering for oils have brought billions in profits to banks.</span></span></li>
</ol>
<div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin-bottom: 9px;">
<span style="font-size: large;"><span style="font-kerning: none;">The problem seems to be that Grammarly is not attempting to parse the sentences. Parsers such as the Google Natural Language API</span><span style="-webkit-font-kerning: none; line-height: normal;"><sup></sup></span><span style="font-kerning: none;"> and Stanford Parser</span><span style="-webkit-font-kerning: none; line-height: normal;"><sup></sup></span><span style="font-kerning: none;"> are both able to correctly detect the dependency between 'impact' and 'could', as in their respective trees below.</span></span><br />
<span style="font-size: large;"><span style="font-kerning: none;"><br /></span></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXzqPsmQXzs2mPNjf47K1-45GJwgcyFCDjprN1F5EdsQgyWLMITrI_DK8AQxybeET28IyIB0CBCNf_FF76YGfVQ40linXp8KIdZRf00xlERVhpcLnIBaQ4dsRhIhVa7_fd-HGirdGUNg4/s1600/Screen+Shot+2018-02-17+at+21.24.59.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="460" data-original-width="1600" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXzqPsmQXzs2mPNjf47K1-45GJwgcyFCDjprN1F5EdsQgyWLMITrI_DK8AQxybeET28IyIB0CBCNf_FF76YGfVQ40linXp8KIdZRf00xlERVhpcLnIBaQ4dsRhIhVa7_fd-HGirdGUNg4/s640/Screen+Shot+2018-02-17+at+21.24.59.png" width="640" /></a></div>
<span style="font-size: large;"><span style="font-kerning: none;"><br /></span></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn6I9frSqL8vcRBzDoBN8YKJ3mK67vnjNYcfAzhyW1jZlmNfr2uV3XkGwFm5m6HD6p4oMuBJdlGiUdnxgXGG41ulusITtTfuKiazuEhkEeJelwxcCQASrx16O69EU-zTOeSbZT0uQeI5s/s1600/Screen+Shot+2018-02-17+at+20.51.27.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="884" data-original-width="894" height="395" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn6I9frSqL8vcRBzDoBN8YKJ3mK67vnjNYcfAzhyW1jZlmNfr2uV3XkGwFm5m6HD6p4oMuBJdlGiUdnxgXGG41ulusITtTfuKiazuEhkEeJelwxcCQASrx16O69EU-zTOeSbZT0uQeI5s/s400/Screen+Shot+2018-02-17+at+20.51.27.png" width="400" /></a></div>
<br />
<span style="font-size: large;"><span style="font-kerning: none;">Parsing each sentence of a text before suggesting corrections may be computationally intensive, but is desirable for detecting errors of agreement in long-distance dependencies, precisely the type of error that writers are most likely to make: the above sentences are variants of two errors that have appeared in print, from Steven Pinker's <i>The Sense of Style </i>(91-92). Grammarly in fact happens to be slower anyway, </span></span><span style="font-size: large;">taking five seconds to produce an incorrect answer compared with</span><span style="font-size: large;"> the Stanford Parser and the Google Natural Language API, </span><span style="font-size: large;">each of which gets the correct answer in under two seconds.</span></div>
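<div style="font-family: Baskerville; line-height: normal; margin-bottom: 9px;">
<span style="font-size: large;">To make the point about long-distance agreement concrete, here is a minimal sketch in Python of how an agreement check could work once a dependency parse is available. Everything in it is an assumption for illustration: the <code>Token</code> class, the helper functions, the label names and the hand-annotated parse of the start of sentence (2) are hypothetical, and this is not how Grammarly, the Stanford Parser or the Google Natural Language API work internally.</span></div>

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    head: int    # index of the syntactic head; -1 marks the root
    dep: str     # dependency label, e.g. "nsubj" or "aux"
    number: str  # "sg", "pl", or "" when the word carries no number


def finite_verb(tokens, head_index):
    """Return the word that must agree with the subject of tokens[head_index].

    If the predicate has a number-marked auxiliary ("have brought"), agreement
    is checked on the auxiliary; otherwise on the predicate itself.
    """
    for tok in tokens:
        if tok.head == head_index and tok.dep == "aux" and tok.number:
            return tok
    return tokens[head_index]


def agreement_errors(tokens):
    """List (subject, verb) pairs whose grammatical number disagrees."""
    errors = []
    for tok in tokens:
        if tok.dep != "nsubj":
            continue
        verb = finite_verb(tokens, tok.head)
        if tok.number and verb.number and tok.number != verb.number:
            errors.append((tok.text, verb.text))
    return errors


# Hand-annotated (hypothetical) parse of the start of sentence (2).
# A real system would obtain heads and labels from a parser.
sentence_2 = [
    Token("The", 1, "det", ""),
    Token("maneuvering", 9, "nsubj", "sg"),
    Token("in", 3, "case", ""),
    Token("markets", 1, "nmod", "pl"),
    Token("for", 5, "case", ""),
    Token("oil", 3, "nmod", "sg"),
    Token("and", 7, "cc", ""),
    Token("coffee", 5, "conj", "sg"),
    Token("have", 9, "aux", "pl"),
    Token("brought", -1, "root", ""),
    Token("billions", 9, "obj", "pl"),
]

print(agreement_errors(sentence_2))  # [('maneuvering', 'have')]
```

<div style="font-family: Baskerville; line-height: normal; margin-bottom: 9px;">
<span style="font-size: large;">Because the check follows the subject link rather than looking at the words next to the verb, the nouns inside the adjunct ('markets', 'oil', 'coffee') play no role, however many of them intervene.</span></div>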
<div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: Baskerville; line-height: normal; margin-bottom: 9px;">
<span style="font-size: large;"><span style="font-kerning: none;">Other mistakes that Grammarly does not pick up on include incorrect uses of definite articles for generic nouns (‘Some of us consider the money as the force which controls our lives’</span><span style="-webkit-font-kerning: none; line-height: normal;"><sup></sup></span><span style="font-kerning: none;">), for indefinite nouns (‘I live in the small apartment in the suburbs’</span><span style="-webkit-font-kerning: none; line-height: normal;"><sup></sup></span><span style="font-kerning: none;">), or in particular idioms (‘to put it in the nutshell’</span><span style="-webkit-font-kerning: none; line-height: normal;"><sup></sup></span><span style="font-kerning: none;">). Mistakes involving tense are also not picked up on (‘It is raining for two days already’, ‘I go to France tomorrow’). Problems such as these may have to be tackled by having not only a probabilistic parse of the sentence, but some attempt at probabilistic semantics, such as the knowledge that ‘money’ is often a generic noun; that ‘in a nutshell' is an idiom; and estimating the intended tense and aspect of a sentence given the adverbs used (‘tomorrow’, ‘already’) and semantic properties of the verb (‘raining' cannot be scheduled for a duration, unlike ‘going’ in ‘He is going for two days’). </span></span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;">Grammarly seems to be a useful tool in some cases, but it lacks the dependency parsing, knowledge of collocations, and rudimentary semantics that would be needed to detect the mistakes that people are most likely to make. </span><span style="font-size: large;">I was reminded of these issues by Douglas Hofstadter's recent article </span><a href="https://www.theatlantic.com/technology/archive/2018/01/the-shallowness-of-google-translate/551570/"><span style="font-size: large;">here</span></a><span style="font-size: large;"> in which he plays around with Google Translate and makes similar points about its relative shallowness, despite its impressive advances (e.g. see this <a href="https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html">essay</a>). </span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;">I deregistered from Grammarly soon afterwards, partly because of the </span><span style="font-size: large;">weekly spam emails which border on desperate if you do not use it (</span><span style="font-size: large;"> </span><span style="font-size: large;">"Anybody home?" "This is awkward. You seem to have stopped using Grammarly on your browser", "We miss you already"), and which give you patronising compliments if you do, one of them given below to finish on just because it suggests a comical lack of semantics in their language software more generally.</span><br />
<span style="font-size: large;"><br /></span></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVlMbAgl_OSYFnkwEmCsayyliF10ijpkGpI1o5KKG4cy0fdRoAIGLG8E8glbPN-WYM0vDqEwmAbFVMUmuF4LSiB1nAj8xZ2JGLLfAWzw4KRIhJlxY_hf5QqfWZac8WcRDTmkB8u745yM8/s1600/Screen+Shot+2018-02-17+at+20.59.48.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="560" data-original-width="1422" height="252" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVlMbAgl_OSYFnkwEmCsayyliF10ijpkGpI1o5KKG4cy0fdRoAIGLG8E8glbPN-WYM0vDqEwmAbFVMUmuF4LSiB1nAj8xZ2JGLLfAWzw4KRIhJlxY_hf5QqfWZac8WcRDTmkB8u745yM8/s640/Screen+Shot+2018-02-17+at+20.59.48.png" width="640" /></a></div>
<span style="font-kerning: none; font-size: large;"><br /></span></div>
<br /></div>
Jeremy Collinshttp://www.blogger.com/profile/02949376439100679223noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-57680365515230173312018-02-15T23:18:00.001+11:002018-02-16T11:29:23.807+11:00Thoughts and results from the generative side of linguistics<div dir="ltr" style="text-align: left;" trbidi="on">
Hello everyone,<br />
<br />
This post serves to show non-generativist readers of this blog things going on in the generative sphere that they might like to know about. Mainly:<br />
<br />
<ul style="text-align: left;">
<li>grand challenges, the future of the field</li>
<li>comparisons between genetic research into human history and historical linguistics</li>
<li>large cross-linguistic databases</li>
</ul>
<br />
Some time ago, <a href="http://humans-who-read-grammars.blogspot.com.au/2015/12/generative-and-non-generative-ideas-of.html">we wrote about </a>the conference that was held in Athens in 2015 titled:<br />
<div style="text-align: center;">
<i><br /></i></div>
<span style="background-color: white; color: #333333; font-family: "georgia" , "bitstream charter" , serif; font-size: 16px;"></span><br />
<div style="text-align: center;">
<span style="background-color: white; color: #333333; font-family: "georgia" , "bitstream charter" , serif; font-size: 16px;"><i>Generative Syntax in the Twenty-First Century: The Road Ahead</i></span></div>
<span style="background-color: white; color: #333333; font-family: "georgia" , "bitstream charter" , serif; font-size: 16px;">
</span><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIWhy6UAH-4NdWK8lHOffP1J1mmPjinRVP0lDDmKxDzSi1rnNWxg__AxRLhZ3FZ2bD7TLURjZpcko01YGUM_NQu4X-ZXNykEgmWI-h4PkKKC50ooMhMfwwlX5cbj9Puks5Rj2vpX9DSOvV/s1600/road.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="264" data-original-width="352" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIWhy6UAH-4NdWK8lHOffP1J1mmPjinRVP0lDDmKxDzSi1rnNWxg__AxRLhZ3FZ2bD7TLURjZpcko01YGUM_NQu4X-ZXNykEgmWI-h4PkKKC50ooMhMfwwlX5cbj9Puks5Rj2vpX9DSOvV/s320/road.gif" width="320" /></a></div>
<br />
It was a conference that gathered many people and where very important matters were discussed, matters that are relevant both to those who describe themselves as "generativists" and those of us who don't.<br />
<br />
They had a round-table discussion, and the discussants also submitted written statements prior to the event. These can still be accessed and read, and provide insights into what people consider to be crucial issues. Some of the links to these files have since broken, so I contacted one of the organisers, Peter Svenonius, and he directed me to pages where the material can still be found.<br />
<br />
Voilà: <a href="http://site.uit.no/castl/events/road-ahead/">summaries of the conference and statements from discussants</a><br />
<br />
<a href="http://blogg.uit.no/psv000/2016/08/30/significant-mid-level-results-of-generative-linguistics/">He also kindly directed me to this blog post where they have written up significant mid-level results of generative linguistics</a>. I must confess, I struggle to understand the exact implications of these results because I am not familiar enough with some of the technical jargon.<br />
<br />
I'd like to draw your attention to certain excerpts from the round-table discussion that I found particularly intriguing.<br />
<br />
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">***</span></div>
<div class="page" style="background-color: white;" title="Page 2">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">there remains an Indo-European bias in the field, which privileges certain data sets as being inherently more theoretically interesting than others. </span></i></div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">[...]</span></div>
<div class="page" title="Page 2">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">it [developing long-term interdisciplinary collaborative research teams that leverage the insights of formal syntactic theorizing] needs to be integrated into graduate training programs so that junior scholars are socialized at the outset to take a broader view of the field. </span></i></div>
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;"><br /></span></i></div>
</div>
</div>
</div>
<div>
<div style="text-align: right;">
<span style="background: transparent; text-decoration-line: none;"><a href="https://castl.uit.no/phocadownload/Road_Ahead/dechaine.pdf" style="background: transparent; text-decoration-line: none;"><span style="color: #444444; font-family: inherit;">Rose-Marie Déchaine (University of British Columbia)</span></a></span></div>
</div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;"><br /></span></div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">***</span></div>
<div class="page" title="Page 2">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;"><i>Syntacticians are often highly selective in the way they read and cite, and they adopt main stream proposals without questioning their basic assumptions. At the same time, interesting theoretical work is ignored if it is not fashionable or produced at the right places. This imbalance does not encourage free thinking. Success measures are often one-sided and the pressure for increased productivity does not always outweigh the cost of decrease in depth</i> </span></div>
</div>
</div>
</div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">[...]</span></div>
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">Research on the interfaces requires formal knowledge of more subdisciplines than just syntax. Undergraduate and PhD programs that take this into account are more successful than those that don’t.</span></i></div>
</div>
</div>
</div>
<div style="background-color: white; text-align: right;">
<span style="color: #444444; font-family: inherit;"><a href="https://castl.uit.no/phocadownload/Road_Ahead/anagnostopoulou.pdf" style="background: transparent; text-decoration-line: none;">Elena Anagnostopoulou (University of Crete</a>)</span></div>
<div style="background-color: white;">
<div style="text-align: center;">
<div>
<span style="color: #444444; font-family: inherit;">***</span></div>
</div>
</div>
<div style="background-color: white;">
</div>
<div class="page" style="background-color: white;" title="Page 1">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">I think the field is generally in a good shape, better than it has ever been before. There has been substiantial progress in all relevent domains: More data from many more languages have been investigated, and there have been spectacular theoretical developments over the last few decades, mostly triggered by the move to come up with minimalist accounts. In addition, I take it to be fairly obvious that there is simply no viable alternative to generative grammar (where the concept is understood in a broad sense, as a formal approach that systematically predicts the wellformedness or illformedness of linguistic expressions and is prepared to envisage abstract concepts in doing so); it would seem to me to be the case, for instance, that any potential challenge from pure usage-based construction grammar approaches has by now all but disappeared, due to an absence of well-defined theoretical concepts (e.g., no ontology of theoretical primitives) and an almost complete lack of interesting results. </span></i></div>
<div style="text-align: right;">
<a href="https://castl.uit.no/phocadownload/Road_Ahead/mueller.pdf" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; line-height: 19px; text-decoration-line: none;"><span style="color: #444444; font-family: inherit;">Gereon Müller (University of Leipzig)</span></a></div>
</div>
</div>
</div>
<div style="background-color: white;">
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">***</span></div>
</div>
<div style="background-color: white;">
</div>
<div class="page" style="background-color: white;" title="Page 1">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">It helps to start by asking what the subject matter of linguistics is. There are two related enterprises; descriptions of native speaker’s particular Gs [Language-specific Grammars] and descriptions of human capacity to acquire Gs. The latter aims, in effect, to describe FL [Faculty of Language, i.e. Universal Grammar]. </span></i></div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">[...]</span></div>
<div style="text-align: center;">
</div>
<div class="page" title="Page 1">
<div class="layoutArea">
<div class="column">
<div style="text-align: center;">
<i><span style="color: #444444; font-family: inherit;">Moreover, I don’t believe that using route-1 [Inferring properties of FL from G’s up ] is sufficient to get a decent account of FL, as there is an inherent limitation to scaling up from Gs to FL. The problem is akin to confusing Chomsky and Greenberg universals. A design feature of FL need not leave overt footprints in every G (e.g. island effects will be absent in Gs without movement) so the idea that one can determine the basic properties of FL by taking the intersection of features present in every G likely is a failing strategy. </span></i></div>
</div>
</div>
</div>
</div>
</div>
</div>
<div style="background-color: white;">
<span style="color: #444444; font-family: inherit;"><br /></span></div>
<div style="background-color: white; text-align: right;">
<span style="line-height: 19px;"><a href="https://castl.uit.no/phocadownload/Road_Ahead/hornstein.pdf" style="background: transparent; text-decoration-line: none;"><span style="color: #444444; font-family: inherit;">Norbert Hornstein (University of Maryland) </span></a></span></div>
<div style="text-align: center;">
<div>
<span style="color: #444444; font-family: inherit;">***</span></div>
</div>
<div style="text-align: center;">
<i><span style="color: #444444;">– there is a disconnect between theory and description. A
great deal of theoretical work makes no empirical proposals at all and sometimes, beyond seeing
what can’t be said using a new theoretical vocabulary, it’s hard to see what the empirical
consequences are. There’s nothing wrong with purely theoretical work whose goals are simply
not empirical, but the connection between the strands is often lost sight of. Conversely a great
deal of very good comparative work is really descriptive in nature, using theoretical ideas purely
as analytical tools. Again, there’s nothing wrong with this in itself, but the trend has become too
dominant. In particular, too little attention is paid to the mechanism of cross-linguistic variation
(the theory of parameters, or whatever might correspond to, or replace, that) in comparative
work, too little attention is paid to the kind of higher-order typological questions that our data
and theory now permit us to address and, in diachronic work, too little attention is paid to
mechanisms of change.</span></i></div>
<div style="text-align: right;">
<span style="background-color: white; color: #333333; font-family: "georgia" , "bitstream charter" , serif; font-size: 16px; text-align: left;"><a href="http://site.uit.no/castl/files/2017/03/roberts.pdf">Ian Roberts (University of Cambridge) </a></span></div>
<div style="text-align: center;">
<span style="color: #444444; font-family: inherit;">***</span></div>
<div>
<span style="color: #444444; font-family: inherit;"><br /></span></div>
<div style="text-align: center;">
<span style="color: #444444;"><i>current discussions about biological foundations of linguistics and issues of
evolution of language strike me more as part wishful thinking, part promissory notes. It
would be good to make some progress on that front, but in order to do that we need a
more serious conversation with evolutionary biologists and geneticists. We have to make
up our minds as to what the features are that we think of as crucial in evolution and bring
those to the table in our discussion with people from outer fields.</i></span></div>
<div style="text-align: right;">
<span style="background-color: white; color: #333333; font-family: "georgia" , "bitstream charter" , serif; font-size: 16px; text-align: left;"><a href="http://site.uit.no/castl/files/2017/03/polinsky.pdf">Maria Polinsky (Harvard University) </a></span></div>
<div style="text-align: right;">
<span style="text-align: center;"><br /></span></div>
<div style="text-align: right;">
<div style="text-align: center;">
<span style="font-family: inherit;">***</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;">I'm not going to provide extensive commentary on these excerpts; I think they are very </span>enlightening<span style="font-family: inherit;"> on their own. Do follow the links and read more.</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<span style="font-family: inherit;">I'd also like to share with you two other generative enterprises that you may not be aware of. <a href="https://www.york.ac.uk/language/research/projects/langelin/project/">Firstly, there is a project led by Longobardi in York where they are looking at comparing language history to genetic research on human population history</a>:</span></div>
<div style="text-align: center;">
<i><span style="font-family: inherit; font-size: large;"><br /></span></i></div>
<div style="text-align: left;">
<h1 style="background-color: white; font-weight: 400; line-height: 1.11; margin: 0px 0px 12px; text-align: center;">
<i><span style="font-family: inherit; font-size: large;"><a href="https://www.york.ac.uk/language/research/projects/langelin/project/">Meeting Darwin's last challenge: <br />toward a global tree of human languages and genes (LanGeLin)</a></span></i></h1>
</div>
<div style="text-align: left;">
<span style="font-family: inherit;">They wrote a paper in 2009 in Lingua (Longobardi & G</span><span style="font-family: "charissil"; font-size: 12pt; text-align: right;">uardiano 2009) where they explore this in relation to the Indo-European family. They have since published more papers, but this one happens to be online and in the <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.706.1280&rep=rep1&type=pdf">appendices you can see exactly which linguistic features they have used and how they are coded</a>. For the other publications, I couldn't easily find the coding - that's why I'm linking to the 2009 paper. (The PDF is hosted by Penn State Uni, I don't know why.) </span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
If you're curious about new methods in historical linguistics and what generativists are up to, you should check out their <a href="https://www.york.ac.uk/language/research/projects/langelin/project/" style="font-family: CharisSIL; font-size: 12pt; text-align: right;">project here.</a> For full context, I should say that I am not a believer in this methodology or framework, but I do like to know what's going on, and I believe it's good for the field at large to have some insight into other schools of thought. I do work on related topics, but in a different way.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Secondly, there is also a cross-linguistic database with more of a generative tint:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<div style="text-align: center;">
<span style="color: black; font-size: large;"><i><a href="http://test.terraling.com/groups/7">Syntactic Structures of the World's Languages</a> (SSWL)</i></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
SSWL contains 281 languages and 150 features. It also contains a lot of examples submitted by experts, which is very nice. The site that hosts SSWL, Terraling, also hosts Cinque's Universal 20 Database which contains 1687 languages coded for 32 features. Go check it out!</div>
</div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
</div>
<div>
<span style="font-family: inherit;">I hope this has helped some of our readers to connect a bit with our </span>generativist<span style="font-family: inherit;"> </span>colleagues. <span style="font-family: inherit;">You can also see our</span><a href="http://humans-who-read-grammars.blogspot.com.au/search/label/challenges" style="font-family: inherit;"> "challenges" tag </a><span style="font-family: inherit;">for more posts on these kinds of things, and</span><a href="http://humans-who-read-grammars.blogspot.com.au/2015/12/generative-and-non-generative-ideas-of.html" style="font-family: inherit;"> this post in particular for the non-generativist round table that was held in Poznan on similar topics.</a></div>
<div>
<br /></div>
<div>
Good night!</div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-21767908766260198002017-12-01T10:08:00.001+11:002017-12-01T10:08:16.769+11:00New positions in Jena and general about looking for jobs as a linguist<div dir="ltr" style="text-align: left;" trbidi="on">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBFsk8EvSzNqdyrc2fAGDW_sx1Eqo0VkH1nTQjvkEtHW9xeMy2hwN2i3FSO74HAXP8_mLdboSO_VfKcvw3RttFoxJG_wvLE3sY__TxbkgjmL1mb446UnboftsyPvxUPrsrTpvvrBEYte_x/s1600/20160909_130754.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="900" data-original-width="1600" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBFsk8EvSzNqdyrc2fAGDW_sx1Eqo0VkH1nTQjvkEtHW9xeMy2hwN2i3FSO74HAXP8_mLdboSO_VfKcvw3RttFoxJG_wvLE3sY__TxbkgjmL1mb446UnboftsyPvxUPrsrTpvvrBEYte_x/s400/20160909_130754.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Jena from afar (Jena is in the valley of the plateau)</td></tr>
</tbody></table>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfPhVgEWitM3-GR3oDEA5smlap0Qv6SuXITLrJ-S60MvPcrxVjkQg1hpzksRM1IugEvpkTDgRL3etbEWMY1OWeuEmiQr34Poz3hrsxzN8Hrhy4_4BhKS1cPqngtmK0GBkfi7sOpQC3SwB6/s1600/20160909_112838.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="900" data-original-width="1600" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfPhVgEWitM3-GR3oDEA5smlap0Qv6SuXITLrJ-S60MvPcrxVjkQg1hpzksRM1IugEvpkTDgRL3etbEWMY1OWeuEmiQr34Poz3hrsxzN8Hrhy4_4BhKS1cPqngtmK0GBkfi7sOpQC3SwB6/s320/20160909_112838.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Central Jena</td></tr>
</tbody></table>
The Max Planck Institute for the Science of Human History in Jena has just announced<a href="http://listserv.linguistlist.org/pipermail/lingtyp/2017-November/005984.html"> 2 PhD positions and 1 postdoc,</a> and the <a href="http://listserv.linguistlist.org/pipermail/lingtyp/2017-November/005986.html">Friedrich-Schiller-Universität Jena has also announced 1 PhD position</a>. Follow the links for more info.<br />
<br />
In light of that, I wanted to highlight some other relevant resources for those of us who are looking for a PhD position, or considering what to do after finishing a PhD or master's degree.<br />
<b><br /></b>
<b>Linguists Outside of Academia</b><br />
<a href="https://linguistsoutsideacademia.com/">There's a mailing list for linguists who are working outside of academia, or who want to work outside of academia. </a><br />
<span style="background-color: white;"><i><span style="color: #444444;"><br /></span></i></span>
<div class="separator" style="clear: both; text-align: center;">
<span style="background-color: white; font-family: "Roboto Slab", serif; text-align: start;"><i><span style="color: #444444;">Welcome along to Linguists Outside Academia! We are a motley crew of self-identifying linguists with tenuous connections to the groves of academe. This includes trained linguists who are currently out of work, as well as people on shaky fixed-term academic contracts, and others who have 'linguistic' type jobs in non-academic settings. We're here to share ideas about professional life, rejoice in success, commiserate in failure, and generally steer our way through the rolling swells of a tumultuous job market.</span></i></span></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="background-color: white; font-family: "Roboto Slab", serif; text-align: start;"><i><span style="color: #444444;"><br /></span></i></span></div>
<b>Specific jobs outside of academia in linguistics - Appen</b><br />
There are several different possible workplaces for linguists outside of academia; one in particular that I want to highlight is the tech company Appen. <a href="https://join.appen.com/">They have a large variety of different kinds of positions available</a>, and several of them can be combined with studying part-time or working remotely.<br />
<br />
<b>What does a linguist do?</b><br />
Annemarie had the great idea to start a series of blogposts on the theme <a href="http://humans-who-read-grammars.blogspot.com.au/search/label/%23whatdoesalinguistdo%3F">"What does a linguist do?"</a>; <a href="http://humans-who-read-grammars.blogspot.com/2016/02/what-does-linguist-do-re-annemarie.html">hers is here.</a> <a href="http://humans-who-read-grammars.blogspot.com.au/p/authors.html">We other Humans </a>here haven't caught up and made our own posts yet, but we will!<br />
<br />
<b>Tips for people looking for positions</b><br />
<a href="http://humans-who-read-grammars.blogspot.com.au/2015/12/tips-when-looking-for-phd-positions-or.html">We made a longer post with more advice on applying for PhD positions,</a> and there's also the tag <a href="http://humans-who-read-grammars.blogspot.com.au/2016/03/hopefully-helpful-for-linguistics.html">"Hopefully helpful for linguistics students".</a></div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-43213481148073353072017-10-16T07:07:00.002+11:002017-10-16T19:20:06.161+11:00A week of Bantu grammar reading: The good, the bad, and the ugly<!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<br />
<div class="MsoNormal">
In my week of grammar reading, there were some truly good,
truly bad, and truly ugly aspects. It was entirely devoted to
<a href="https://en.wikipedia.org/wiki/Bantu_languages" target="_blank">Bantu</a>, as it has been for some time, because of a project on Bantu gender systems I am doing together with <a href="http://www.su.se/english/profiles/fdiga-1.187638" target="_blank">Francesca Di Garbo</a> of Stockholm University. Bantu languages are best known for their complex morphology, both involving the noun (<a href="https://books.google.de/books?id=lvy1va7QcuEC&printsec=frontcover&hl=nl&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false" target="_blank">gender</a>) and the verb (Tense-Mood-Aspect). The noun typically is classified in up to around 12 different genders, signified by prefixes that attach to the noun root, and prefixes that mark agreement with the noun throughout the clause. To start with the good, below is an overview of nominal prefixes in Ding (diz), spoken in the Democratic Republic of Congo. The Roman numerals in the first column signify each gender, the two subsequent columns relate the singular and plural prefixes used for nouns, including different forms, and the final column gives examples of the type of noun found in each gender. </div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpYpZCFdLZFKI7OorirdgM186y9uvQK5r0iVFAqL06OUtpwLEKsuam1kG84Kw4rp6khvWnP-TGdJhC6qV8IRDmsoapclkJOlC34ONUIOTJsY6tCNO4YARBKtVTsB9VhMT2uxE91198FzrC/s1600/Mertens_1935_1939_v2_26.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1254" data-original-width="1600" height="311" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpYpZCFdLZFKI7OorirdgM186y9uvQK5r0iVFAqL06OUtpwLEKsuam1kG84Kw4rp6khvWnP-TGdJhC6qV8IRDmsoapclkJOlC34ONUIOTJsY6tCNO4YARBKtVTsB9VhMT2uxE91198FzrC/s400/Mertens_1935_1939_v2_26.png" width="400" /></a></div>
<div class="MsoNormal" style="text-align: center;">
<i>Mertens (1938: 26)</i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The large majority of Bantu languages have systems like that of Ding. What we are looking for in this project is not the presence of gender systems per se, but rather variation in the number of genders, the places where gender marking is found aside from on the noun, and occasions when agreement is based not on gender but on animacy - more on the latter below. We look at over 20 different word classes that may carry agreement marking: attributive adjectives, demonstratives, genitives, predicative nominals, possessive pronouns, and so on. Many of these do have gender agreement in many Bantu languages, but unfortunately this is not always very well described. </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The Ding grammar by Joseph Mertens is, in one word, AWESOME. It is part of a three-volume description of ethnography, grammar, and lexicon, totalling over 1000 pages. The grammar is very extensive, features many examples, and is written clearly and honestly - the author did not find out exactly how relative clauses work in Ding, and just says so. Yay! And, as it turns out, agreement is not always assigned on the basis of gender. Below is a full overview of pages 64 and 65, where demonstratives are described. In Table 1, the demonstratives are given - for each gender (I-VII) and number (SG/PL), there is a separate demonstrative, which bears some similarity to the nominal markers above. </div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNiEI3FE3vbz1LRbqarLF2oMmsNTRbMi0dbA0NJo0LOR20V1bYVDKjYSgxbkOvl94xs_vHNs98m2j8QVpMS9s3T4Xc3r-fxe6N4l2GVGlRtSDIuyR48w-IQz45Q_M6ug3yJiRxh-K6OkfX/s1600/Mertens_1935_1939_v2_64_65.png.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1307" data-original-width="1600" height="521" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNiEI3FE3vbz1LRbqarLF2oMmsNTRbMi0dbA0NJo0LOR20V1bYVDKjYSgxbkOvl94xs_vHNs98m2j8QVpMS9s3T4Xc3r-fxe6N4l2GVGlRtSDIuyR48w-IQz45Q_M6ug3yJiRxh-K6OkfX/s640/Mertens_1935_1939_v2_64_65.png.png" width="640" /></a></div>
<div class="MsoNormal" style="text-align: center;">
<i>Mertens (1938: 64-65)</i></div>
<div class="MsoNormal" style="text-align: center;">
<br /></div>
<div class="MsoNormal">
The first example on this page shows how gender agreement works: <i>muur wu tɛɛn ntɛɛn</i> 'cet homme bavarde' - <i>muur</i> means man, it starts with the <i>mu</i>- prefix for singular gender I. <i>wu</i> means something like 'this', i.e. it is a demonstrative that indicates something is located close to the speaker. It agrees with the noun <i>muur</i>, and is also singular gender I. This is the status quo: different parts of the clause, such as the demonstrative, are marked for the same gender as the (subject) noun. However, many Bantu languages diverge from this by allowing agreement to not reflect gender assignment, but rather to reflect animacy. Mertens describes this just below the first set of examples: "a) Des noms de personnes, appartenant à d’autres classes que la première, en adoptent pourtant le démonstratif." Nouns that designate humans, even if they belong to a different gender, take the demonstrative of the singular gender I. The first gender typically contains nouns designating persons, such as kinship terms, and thus by assigning human nouns agreement in the first gender, the animacy status of these nouns is appropriately flagged. As Mertens continues, in Ding, the same is true for animals: "b) Même remarque pour les noms d’animaux", which is not unreasonable, as many animals in stories have human-like qualities. In Ding, animacy-based agreement is also found on verbs, but not for any of the other parts of the clause we looked at. </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Francesca also had a great find this week on Bangi (bni), spoken in Congo, the Democratic Republic of Congo, and the Central African Republic. It is worth reading in full: </div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg320wcyKlByJLzV4dpHM5IsKO4SfZItoTZeUJcVm0xoBGH-zVV-hyB5Txu-XT2pCmXmN9OgRJQDhJq8Z21NUR5LjVgt5V5k0Ht5jOQE966sWPck6waFlAIGmNr6UAGNW76rQzHRDxt6HJ/s1600/whitehead_1899_8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1573" data-original-width="1254" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg320wcyKlByJLzV4dpHM5IsKO4SfZItoTZeUJcVm0xoBGH-zVV-hyB5Txu-XT2pCmXmN9OgRJQDhJq8Z21NUR5LjVgt5V5k0Ht5jOQE966sWPck6waFlAIGmNr6UAGNW76rQzHRDxt6HJ/s640/whitehead_1899_8.png" width="510" /></a></div>
<div style="text-align: center;">
<i> Whitehead (1899: 8)</i></div>
<br />
<div class="MsoNormal">
This passage contains so many (implicit) claims about language use, second language acquisition, contact-induced change, simplification, prescriptivism, and prestige that it is really something special. The processes of noun class reduction through contact-induced change are one of the things we are trying to study in this project. The passage also points to the problem we face in trying to uncover animacy-based agreement, or other types of agreement - some grammar writers will not report on these deviations from 'regular' gender-based agreement. </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
To continue with the bad. This week I was also looking for information on Tsaangi (tsa), spoken in Congo and Gabon. I went looking for the following two references: </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<i>Loubelo, Fidèle. (1987) Description phonologique du itsaangi, parler de Madouma-Mossendjo. Brazzaville: Université Marien Ngouabi MA thesis.</i></div>
<div class="MsoNormal">
<i><br /></i></div>
<div class="MsoNormal">
<i>Loubelo, Fidèle. (1990) Le nom en kitsa:ngi, langue bantoue du Congo. Dakar: Université Cheikh Anta Diop doctoral dissertation. </i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Since one of these references focuses especially on the noun, I tried to look around on the internet to see if I could find the author and ask them whether they might be able to share their work. I found their name mentioned in at least three places (<a href="https://www.fidh.org/IMG/pdf/rap-braz.pdf" target="_blank">here</a> and <a href="http://equmeniakyrkan.se/ateruppbyggnaden-av-kongo/" target="_blank">here</a> and <a href="http://www.salasambila.com/files/JOURNAL-LE-CHEMIN-N-272-FEVRIER-2015.pdf" target="_blank">here</a>), mentioning that this person was murdered in late 1998, during the Republic of Congo Civil War. This could be another person with the same name, but the sources mention this person was a minister, and being a minister and a linguist may very well go hand in hand, with many people studying Bantu languages that are ultimately employed by Christian organizations to translate the Bible. This was such a saddening finding. The wars fought in Africa and those still being fought today are horrible. </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
This leaves us at the ugly. I usually don't like to speak ill of other people's work, and this classification is partly in jest - normally I would classify the following simply as 'bad'. But this week I read a grammar of a Bantu language that did not discuss the gender system at all - despite there in all likelihood being one. This was a grammar of Teke-Eboo (ebo), spoken in Congo and the Democratic Republic of Congo, by Edouard Etsio. The grammar consists of four parts: preliminaries, history, and ethnography, p. 1-108; grammar, p. 109-168; texts with commentary, p. 169-268; and a lexicon, p. 269-312. The emphasis on ethnography suggests that the author was probably not a linguist, so I can forgive him - however, as gender systems are so widespread in Bantu languages, I don't think I have ever before seen a grammar of one that does not talk about gender, even if it is mostly absent. On pages 131-132, the author discusses sex-based gender, of the type we find in much of Europe, with a female/male or female/male/neuter distinction. However, this is not relevant for any Bantu language we have come across so far (when Bantu languages restructure their gender system so far as to end up with a 2-way or 3-way distinction, they do this on the basis of animacy, not sex). Then, on page 137 below, the author discusses the sentence <i>Onké wu ombi o yaya </i>"Une vilaine femme arrive", about mid-page.</div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlJjljK5ZIqh2le93u-WGKiKwY8DFNoyZFfRW5EPNc6DVgufzT3-P1rusYit_3gRxMu4So7b8-T4G8Ijg3OUQANPHd1__kunU1WdpFWBrQ3n_nM6o-tHjqcbHP_pfGr9NAXwJSSmmUHoqB/s1600/etsio_1999_136_137.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1176" data-original-width="1600" height="470" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlJjljK5ZIqh2le93u-WGKiKwY8DFNoyZFfRW5EPNc6DVgufzT3-P1rusYit_3gRxMu4So7b8-T4G8Ijg3OUQANPHd1__kunU1WdpFWBrQ3n_nM6o-tHjqcbHP_pfGr9NAXwJSSmmUHoqB/s640/etsio_1999_136_137.png" width="640" /></a></div>
<div class="MsoNormal" style="text-align: center;">
<i>Etsio (1999: 137)</i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
At the bottom of that paragraph, the author writes that attributive adjectives agree with the noun they modify, <i>ombi</i> being the singular form, and <i>abi </i>the plural. These are in all likelihood prefixed adjectives, the root being -<i>bi</i>, the singular gender I agreement prefix being <i>om</i>-, and the plural gender I agreement prefix being <i>a</i>-. (To go into unnecessary detail, since <i>wu </i>is inserted between the noun <i>onké</i> and the 'adjective' <i>ombi</i>, it is likely that <i>ombi</i> is a noun meaning something like 'ugliness', and <i>wu</i> is a so-called associative marker, also marked for gender. Most Bantu languages do not have a large class of adjectives, and use nouns (in genitive constructions) and verbs (in relative clauses) instead.) However, despite the author noticing that these adjectives agree with the noun, there is no description of the gender system, or further description of other parts of the clause that also show agreement. He seems to have completely missed this feature - something which could only have happened if he didn't know that this is a relevant feature of Bantu languages, and if no proofreader of the grammar had this knowledge either. Weirdly enough, Etsio has also published a grammar of Lingala, also without describing <a href="https://en.wikipedia.org/wiki/Lingala#Noun_class_system" target="_blank">its gender system</a>. </div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Of course I may be entirely wrong here, but so far further investigations suggest Teke-Eboo has at least something resembling a traditional Bantu gender system that Etsio completely missed. For now, goodnight, and I hope the next week of grammar reading brings only good.</div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisVAwjKboghW8qKqPGzz-83FywE5st1gZMN7ir0IAenyc4xx_h-KKKapQ3ePSupAb4guqziR0GBULu6-Wi4F0FCnrV-RrTnWkw7oe0t_tZ47Npqpeolc1FBWgE1Qms4ozlnf1ejIWIhW-6/s1600/good-bad-ugly-the-end.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="272" data-original-width="640" height="272" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisVAwjKboghW8qKqPGzz-83FywE5st1gZMN7ir0IAenyc4xx_h-KKKapQ3ePSupAb4guqziR0GBULu6-Wi4F0FCnrV-RrTnWkw7oe0t_tZ47Npqpeolc1FBWgE1Qms4ozlnf1ejIWIhW-6/s640/good-bad-ugly-the-end.jpg" width="640" /></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>References</b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Etsio, Edouard. 1999. Parlons téké: langue et culture. (Collection Parlons.) Paris: L'Harmattan.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Etsio, Edouard. 2003. Parlons Lingala. Paris: L'Harmattan. 240pp.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Mertens, Joseph. 1935, 1938, 1939. Les Ba Dzing de la Kamtsha. (Mémoires / Institut Royal Colonial Belge, Section des Sciences Morales et politiques). Bruxelles: Campenhout. (3 vols.)</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Whitehead, John. 1899. Grammar and Dictionary of the Bobangi Language. London: Baptist Missionary Society.</div>
<div class="MsoNormal">
<br /></div>
<div>
<br /></div>
Annemarie Verkerkhttp://www.blogger.com/profile/14747297526182358435noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-33747793003695400812017-09-27T22:58:00.001+10:002019-08-02T11:24:24.603+10:00Linguistic map making: Drawing polygons<div dir="ltr" style="text-align: left;" trbidi="on">
Hedvig has written <a href="http://humans-who-read-grammars.blogspot.de/2017/08/ethnologue-more-restricted.html" target="_blank">on how Ethnologue has become even more restricted than it already was</a>, and on what resources are out there that could be used instead. One of the things I miss from Ethnologue is its maps, although at least until recently it was still possible to access most of them by downloading them instead of viewing them in your browser. In her post, Hedvig points out that <a href="http://www.langscape.umd.edu/" target="_blank">Langscape</a> can be used instead, and that's all great.<br />
<br />
But what if you wanted to draw a map yourself? Especially one which you intend to publish? Some institutions may have access to the <a href="http://www.worldgeodatasets.com/language/" target="_blank">World Language Mapping System</a> (WLMS), which lies at the core of Ethnologue's (and Langscape's) maps, and was made by Global Mapping International (<a href="https://www.gmi.org/gis-transition" target="_blank">which recently closed, so the WLMS is now formally back with SIL</a>). I'm not sure about the details (and the user agreement parts of the WLMS website are down), but paying a lot of money for the WLMS presumably allows users to draw their own maps and publish them, as long as they cite the source.<br />
<br />
Not everyone has access to the World Language Mapping System, and even if you do, it is very likely that your specific needs are not covered by it. For example, have a look at the following map of the border between the Central African Republic, South Sudan, and the Democratic Republic of the Congo. As you can see, where these three countries meet in the center of the map, Zande, one of the biggest Ubangi languages, is spoken on all three sides of the border.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicfKmgr05zR6RFQ3UPairtKNstzL5iDzwCk646tv2W1dyet1VgtFRp5QGgXzYfLBKPwrewS30GbpkKyWNA7lbXZ-5uze3fUm8LNBRvd67xkRoeXrMgr4fsHZxyJX1tNtRXZQySDgSbwZJ0/s1600/langscape_Zande.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="778" data-original-width="1600" height="310" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicfKmgr05zR6RFQ3UPairtKNstzL5iDzwCk646tv2W1dyet1VgtFRp5QGgXzYfLBKPwrewS30GbpkKyWNA7lbXZ-5uze3fUm8LNBRvd67xkRoeXrMgr4fsHZxyJX1tNtRXZQySDgSbwZJ0/s640/langscape_Zande.png" width="640" /></a></div>
<div style="text-align: center;">
source: http://www.langscape.umd.edu</div>
<br />
However, there are also some Bantu languages spoken in this border area. One of them is Homa (hom), and as you can see by comparing its location on Glottolog with the Langscape map, it simply does not have a polygon in the World Language Mapping System / Ethnologue maps.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzM_GW1deEc6m6WF5s8fVgVxMmeWeAbDR5pp3Ms8d1FFj315teInb0ql3tGzRHTPDqXTIk9zNXi_G9SrfnJk7wr6_cF-mPueemYaZhSGWkoRbj1GqBgBBFbjhbaZxkF0gTyClO6nV29JKr/s1600/Homa_glottolog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="944" data-original-width="1600" height="376" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzM_GW1deEc6m6WF5s8fVgVxMmeWeAbDR5pp3Ms8d1FFj315teInb0ql3tGzRHTPDqXTIk9zNXi_G9SrfnJk7wr6_cF-mPueemYaZhSGWkoRbj1GqBgBBFbjhbaZxkF0gTyClO6nV29JKr/s640/Homa_glottolog.png" width="640" /></a></div>
<div style="text-align: center;">
source: http://glottolog.org, http://glottolog.org/resource/languoid/id/homa1239</div>
<br />
Homa is a small, underdescribed Bantu language, which according to Sommer (1992: 352) may be threatened with extinction. The only description of it, and an extremely sketchy one, is Santandrea (1963), who describes animacy-based agreement on adjectives and suggests a heavily attrited gender system - something rather unusual for Bantu languages, with their generally extensive and healthy gender systems. This is the immediate reason for this post: together with <a href="http://www.su.se/english/profiles/fdiga-1.187638" target="_blank">Francesca Di Garbo</a> I am looking at gender systems in Bantu, and I would very much like to be able to plot Homa on a map, not just with a point as in Glottolog, but with a polygon that I can color to indicate its special characteristics. A polygon rather than a point also makes it far clearer that this language is spoken in Zande country, far away from most of the other Bantu languages.<br />
<br />
Turns out, there is an extremely easy way to do this. One can use <a href="https://en.wikipedia.org/wiki/Google_Earth" target="_blank">Google Earth</a> to draw polygons anywhere on the world's surface, save them, and load them into R to make nice maps. The browser version of Google Earth is <a href="https://www.google.com/earth/" target="_blank">here</a> (it wants Chrome), and you can download the application <a href="https://www.google.com/earth/download/gep/agree.html" target="_blank">here</a>.<br />
<br />
Once you open the Google Earth application, you can draw a polygon with the 'draw polygon' tool in the toolbar above the map. While the window is open, you can make the polygon by clicking on the map. Then you name it and save it as a kml file - described in much more detail <a href="http://www.instructables.com/id/Creating-KML-Files-For-Your-Custom-Google-Maps/" target="_blank">here</a>. This is the polygon I drew for Homa, see explanation below:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBPVAZaQwhEKfktz_9QnreuynhE9TnNFiu15GMryQrNAFNmRHKNdpZ23h6IgDLy-2Kf9aZ6tn7cJSQVcwC8-ecmY72BZdVAawdDAUjw5rUHJOzYOaWU0Laz60MLq3KTelSmg4g4tgqEYlS/s1600/Google_Earth.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="989" data-original-width="1600" height="394" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBPVAZaQwhEKfktz_9QnreuynhE9TnNFiu15GMryQrNAFNmRHKNdpZ23h6IgDLy-2Kf9aZ6tn7cJSQVcwC8-ecmY72BZdVAawdDAUjw5rUHJOzYOaWU0Laz60MLq3KTelSmg4g4tgqEYlS/s640/Google_Earth.png" width="640" /></a></div>
<div style="text-align: center;">
source: Google Earth</div>
<br />
The location of Homa speakers according to Glottolog is close to Nagasi. Santandrea (1948: 81) states that speakers of the language can be found in Tombora, and Sommer (1992: 352) puts their location "around towns of Mopoi and Tambura". As you can see on the Glottolog map, this is an area just north of where Glottolog puts the centroid for Homa. So, using Google Earth, I drew a kind of oblong shape around these towns, the northwestern one being Tambura, and saved the polygon as Homa.kml. I do not know why there is a discrepancy between these sources and the Glottolog point; that is a story for another time.<br />
<br />
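To make the discrepancy with the Glottolog point concrete, you could check whether a given point falls inside the polygon you drew. What follows is only a minimal base-R sketch of the standard ray-casting test. The function name <code>point_in_polygon</code> and all coordinates are made up for illustration; they are not the real Homa polygon or the real Glottolog point.<br />

```r
# Ray-casting test: is the point (px, py) inside the polygon
# whose vertices are given by the vectors vx (longitude) and vy (latitude)?
# point_in_polygon is a hypothetical helper, not part of any package.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n  # index of the previous vertex; starts at the last one to close the ring
  for (i in seq_len(n)) {
    # does the edge between vertex j and vertex i straddle the latitude py?
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      # longitude at which that edge reaches latitude py
      x_at_py <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      # every crossing to the east of the point flips inside/outside
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Made-up rectangle standing in for a hand-drawn polygon (NOT the real Homa polygon)
poly_lon <- c(27.0, 27.8, 27.8, 27.0)
poly_lat <- c(5.0, 5.0, 5.6, 5.6)

# Two made-up points: one within the rectangle, one just south of it
in_area  <- point_in_polygon(27.4, 5.3, poly_lon, poly_lat)
south_of <- point_in_polygon(27.4, 4.6, poly_lon, poly_lat)
print(c(in_area = in_area, south_of = south_of))
```

With real data, you would paste in the vertices from your own .kml file and the coordinates listed on the Glottolog page.<br />
<br />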
Next, we can read the .kml file into <a href="https://www.r-project.org/" target="_blank">R</a>, and place it on a map. Please see code below.<br />
<br />
<div class="MsoNormal">
<span style="font-family: "andale mono";"># a library you need to read in .kml files</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">library(rgdal)</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">Homa <- readOGR(dsn="Homa.kml")</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"># a library you need to make the map</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">library(mapdata)</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"># plotting the map</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">map("world2Hires", xlim=c(23, 31), ylim=c(1, 8), boundary=TRUE)</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">map.axes()</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">map.scale(cex=0.8)</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"># putting in country names so we can situate the polygon</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">text(x = 30, y = 7.5, "South Sudan")</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">text(x = 26, y = 4.5, "Democratic Republic of Congo")</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">text(x = 24.5, y = 7, "Central African Republic")</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";"># plot the Homa polygon</span></div>
<div class="MsoNormal">
<span style="font-family: "andale mono";">plot(Homa, col="magenta", add=TRUE)</span></div>
<br />
The resulting plot looks like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjB9izM9ZdDKE6acS9vPUHMvIVhxnrtCjjhdRQYm8LkTBtLlgXmNYXwqWxIAaMJufvB4C6d3N-R9ujWe-prYW9RCgaYD5hmDeWiD_H5_bPY2FtFylvusF1DXc54rbfw-Rpn1XuSh6LYuRb1/s1600/polygon.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="619" data-original-width="1089" height="362" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjB9izM9ZdDKE6acS9vPUHMvIVhxnrtCjjhdRQYm8LkTBtLlgXmNYXwqWxIAaMJufvB4C6d3N-R9ujWe-prYW9RCgaYD5hmDeWiD_H5_bPY2FtFylvusF1DXc54rbfw-Rpn1XuSh6LYuRb1/s640/polygon.png" width="640" /></a></div>
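<br />
The rgdal package used above has since been retired from CRAN. Below is a sketch of the same steps with the sf package instead, assuming that sf, maps and mapdata are installed. The stand-in polygon and the temporary file are there only so that the example runs on its own; its corner points are made up and are not the real Homa polygon.<br />

```r
# Sketch: reading a .kml polygon with sf instead of rgdal.
# Assumes the sf, maps and mapdata packages are installed.
library(sf)
library(mapdata)  # attaches the maps package as well

# Stand-in polygon with made-up corner points (NOT the real Homa polygon),
# written to a temporary .kml file so the example runs without Google Earth.
ring <- rbind(c(27.0, 5.0), c(27.8, 5.0), c(27.8, 5.6), c(27.0, 5.6), c(27.0, 5.0))
example_poly <- st_sf(Name = "Example",
                      geometry = st_sfc(st_polygon(list(ring)), crs = 4326))
kml_path <- tempfile(fileext = ".kml")
st_write(example_poly, kml_path, driver = "KML", quiet = TRUE)

# With your own file, point kml_path at it instead, e.g. kml_path <- "Homa.kml"
Homa <- st_read(kml_path, quiet = TRUE)
Homa <- st_zm(Homa)  # drop the altitude dimension if the file carries one

# the same map as before, with the polygon drawn on top
map("world2Hires", xlim = c(23, 31), ylim = c(1, 8), boundary = TRUE)
map.axes()
plot(st_geometry(Homa), col = "magenta", add = TRUE)
```

The <code>text()</code> calls for the country names work unchanged after this, since the base map is drawn the same way.<br />
<br />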
This is only a very partial solution. If you wanted to draw a big map with many languages on it, it would be an enormous amount of work to go through the literature and surveys on where different languages are spoken. This work was already done, at least in part, by Ethnologue / the World Language Mapping System, and it is rather sad to do such work twice, or thrice, etc. However, as the Homa case shows, data on where languages are spoken may be missing in Ethnologue, or may be incomplete, or no longer correct. Especially when you know a particular area in detail, it may be worth drawing your own map, and Google Earth + R makes this very easy. Of course, it would be even better <a href="http://humans-who-read-grammars.blogspot.de/2017/06/new-approaches-to-ethno-linguistic-maps.html" target="_blank">to use actual data to draw ethno-linguistic maps,</a> and not a 70-year-old description, but for some areas of the world, that is something only for the very far future.<br />
<br />
<b>References</b><br />
<br />
Santandrea, Stefano. 1948. Little Known Tribes of the Bahr El Ghazal. Sudan Notes and Records XXIX. 78-106.<br />
<br />
Santandrea, Stefano. 1963. Short Notes on the Bɔdɔ, Huma and Kare Languages. Sudan Notes and Records 44. 82-99.<br />
<br />
Sommer, Gabriele. 1992. A Survey on Language Death in Africa. In Brenzinger, Matthias (ed.), Language Death: Factual and Theoretical Explorations with Special Reference to East Africa, 301-413. Berlin/New York: Mouton de Gruyter.<br />
<br /></div>
Annemarie Verkerkhttp://www.blogger.com/profile/14747297526182358435noreply@blogger.com2tag:blogger.com,1999:blog-1300680252997007251.post-3093408970311577042017-09-19T18:27:00.000+10:002018-02-08T14:30:14.440+11:00Public service announcement: list of databases and more<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzAk29w8-reQK1ZymxTlalB9Ma6EhVQmM226THKpLJgZspfj4yRNBVRxc2eFrqMo0pN2VjArB6pY3tK9MTjTmyV1CNRCEazG_-RP3BGt23kYwUJ1O19vA5uqePBx7-pzV5gRgww9zfMsJM/s1600/giphy+%25281%2529.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="419" data-original-width="496" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzAk29w8-reQK1ZymxTlalB9Ma6EhVQmM226THKpLJgZspfj4yRNBVRxc2eFrqMo0pN2VjArB6pY3tK9MTjTmyV1CNRCEazG_-RP3BGt23kYwUJ1O19vA5uqePBx7-pzV5gRgww9zfMsJM/s320/giphy+%25281%2529.gif" width="320" /></a></div>
<br />
Public service announcement: there are websites that keep well-curated lists of things that are useful to linguistics researchers and students, including the following:<br />
<ul style="text-align: left;">
<li><a href="http://languagegoldmine.com/">Language Goldmine - catalogue of cross-linguistic databases</a></li>
<li><a href="http://humans-who-read-grammars.blogspot.com.au/p/help-linguistics-is-hard.html">Linguistic terminology (us here at HWRG)</a></li>
<li><a href="https://oaling.wordpress.com/2017/07/21/list-of-platinum-open-access-linguistics-journals/">Platinum Open Access publishing in linguistics</a></li>
<li><a href="http://humans-who-read-grammars.blogspot.com.au/p/open-access-publishing-in.html">More Open Access Linguistics</a> (us here at HWRG)</li>
<li><a href="http://humans-who-read-grammars.blogspot.com.au/p/online-list-of-free-online-tutorials.html">Online tutorials for linguistic tools (ELAN, R etc)</a> (also HWRG)</li>
<li><a href="http://www.linguistic-typology.org/resources.html">free online PDFs of publicly available grammars</a> (ALT)</li>
<li><a href="https://t.co/iwt5tbrVJU">Language universals</a></li>
<li><a href="https://typo.uni-konstanz.de/rara/intro/">Weird things in language (Rara & Rarissima)</a></li>
<li><a href="http://cals.conlang.org/">CALS (Conlang WALS)</a></li>
</ul>
It would appear that some don't know about these lists, so now you know/are reminded :).<br />
<br />
Lists are good, and instead of reinventing them you can look through these and add to them. For more hopefully useful stuff like this, <a href="http://humans-who-read-grammars.blogspot.com.au/search/label/hopefully%20helpful%20to%20linguistics%20students">go here</a>.</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-48778830865784482792017-08-28T00:22:00.002+10:002017-08-30T20:49:31.567+10:00Ethnologue more restricted<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQheBlr_mM45fQsBaT4ydcLVdpTNfFbygde4Jr0Q3-5JLnrjwm7KeDvG6VU3j_r1p-GQjCRnqSVWSjXUHQaXNynpLdzPDqP0SrnSMLhbGLDwN_KWr8N8Ni0Mrfr9gu_hovY1VGZKqBqVLt/s1600/ethnologue_new_site_screen_shot_shadow.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="300" data-original-width="473" height="202" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQheBlr_mM45fQsBaT4ydcLVdpTNfFbygde4Jr0Q3-5JLnrjwm7KeDvG6VU3j_r1p-GQjCRnqSVWSjXUHQaXNynpLdzPDqP0SrnSMLhbGLDwN_KWr8N8Ni0Mrfr9gu_hovY1VGZKqBqVLt/s320/ethnologue_new_site_screen_shot_shadow.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
In April this year, Ethnologue changed access restrictions to their website again. Now, non-paying users from high-income countries can only access 1 page per month before they are banned; previously it was 7. In light of this change, we will go through some basics regarding the paywall <a href="http://humans-who-read-grammars.blogspot.com.au/2016/01/clarifying-points-on-ethnologue-pay.html">again (old post here)</a> and where you can go instead. Finally, I list some questions should any SIL International/Ethnologue staff see this post.<br />
<br />
<b>Basics on the pay-wall</b><br />
We haven't received much detailed information on this change, but if it's the same as last time it means that users with IP-addresses in countries that are classified by the World Bank as "high-income" will be restricted. Cloudflare would appear to be the service provider managing this for Ethnologue. <a href="https://twitter.com/SteveMoitozo2/status/674102529278480384">Previously, we've learned that only 5% of users look at more than 7 pages per month.</a> We don't know how many go to more than 1 page (probably a lot more though!).<br />
<br />
<a href="http://www-01.sil.org/iso639-3/">SIL International also maintains the ISO 639-3 codes for language names (one of 6 language ISO-codes). Those pages are NOT affected by this restriction.</a> Ethnologue and SIL International are not the same thing; SIL International produces more than just Ethnologue.<br />
<br />
<a href="https://www.ethnologue.com/archive">Old editions of Ethnologue have different restrictions</a> than the current edition.<br />
<br />
Ethnologue is mainly funded by <a href="http://www.wycliffe.net/en/">Wycliffe Global Alliance</a> (an explicitly Christian organisation), and not by any state or academic institution. This information is based on what I understand from financial statements; I may be mistaken. Clarification from Ethnologue/SIL International staff is highly appreciated here. Please note that there are many other ISO industry standards that are pay-access only; the fact that SIL International provides 639-3 openly is fortunate.<br />
<br />
It appears to us, the users, that SIL International have made these decisions to remedy a financial situation. It is not clear at this time if SIL International is seeking other ways of bringing in funds, like more traditional grants from research councils.<br />
<br />
<b>Where to go instead?</b><br />
Much of the information that Ethnologue provides is actually available elsewhere. Here is a table displaying some of the places you could go to instead of Ethnologue.<br />
<br />
<style>
<!--table
{mso-displayed-decimal-separator:"\.";
mso-displayed-thousand-separator:"\,";}
@page
{margin:.75in .7in .75in .7in;
mso-header-margin:.3in;
mso-footer-margin:.3in;}
td
{padding-top:1px;
padding-right:1px;
padding-left:1px;
mso-ignore:padding;
color:black;
font-size:14.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Calibri, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;}
-->
</style>
<br />
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; width: 345px;">
<!--StartFragment-->
<colgroup><col style="mso-width-alt: 6729; mso-width-source: userset; width: 184pt;" width="245"></col>
<col style="width: 175pt;" width="175"></col>
</colgroup><tbody>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt; width: 184pt;" width="245"><span style="font-family: inherit; font-size: small;">Family trees</span></td>
<td style="width: 175pt;" width="200"><span style="font-family: inherit; font-size: small;"><a href="http://multitree.org/">MultiTree</a>, <a href="http://glottolog.org/glottolog/family">Glottolog</a></span></td>
</tr>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt;"><span style="font-family: inherit; font-size: small;">Codes</span></td>
<td><span style="font-family: inherit; font-size: small;"><a href="http://glottolog.org/glottolog/language">Glottolog</a>, <a href="http://www-01.sil.org/iso639-3/">ISO 639-3 repository </a></span></td>
</tr>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt;"><span style="font-family: inherit; font-size: small;">Alternative names</span></td>
<td><span style="font-family: inherit; font-size: small;"><a href="http://glottolog.org/glottolog">Glottolog</a></span></td>
</tr>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt;"><span style="font-family: inherit; font-size: small;">Endangerment level</span></td>
<td><span style="font-family: inherit; font-size: small;"><a href="http://www.unesco.org/languages-atlas/">UNESCO Atlas of the World's Languages in Danger</a></span></td>
</tr>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt;"><span style="font-family: inherit; font-size: small;">Maps of language areas (polygon data)</span></td>
<td><span style="font-family: inherit; font-size: small;"><a href="http://www.langscape.umd.edu/">Langscape</a></span></td>
</tr>
<tr height="25" style="height: 19.0pt;">
<td height="25" style="height: 19.0pt;"><span style="font-family: inherit; font-size: small;">Population stats</span></td>
<td><span style="font-family: inherit; font-size: small;">(<a href="https://www.ethnologue.com/archive">Old Ethnologue editions</a>), <a href="https://www.cia.gov/library/publications/the-world-factbook/">CIA World Factbook,</a> <a href="https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers">Wikipedia</a></span></td>
</tr>
<!--EndFragment-->
</tbody></table>
<br />
<span style="font-family: inherit;">The information that Ethnologue provides that is the hardest to replace is population stats. The pages I most regret not being able to access are <a href="https://www.ethnologue.com/statistics">the overall summary stat pages</a>; they're nice for showing the size of language families and the power law of speaker populations.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Here are some more details on some of the resources listed above.</span><br />
<span style="font-family: inherit;"><br /></span>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlU9oRpfx3NRN7WbMi9MNNwcn-ZV0iB09374JMwU1RD2yjymVsMrIKwzZmB6T0SfywumI-zgbaNnngcTFfb1tEpDK4iTtHa80atsbJy5IamQAHkE0XlPDL_JwSmjcQcu79oaWOe80RZm89/s1600/mt-tree80.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="71" data-original-width="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlU9oRpfx3NRN7WbMi9MNNwcn-ZV0iB09374JMwU1RD2yjymVsMrIKwzZmB6T0SfywumI-zgbaNnngcTFfb1tEpDK4iTtHa80atsbJy5IamQAHkE0XlPDL_JwSmjcQcu79oaWOe80RZm89/s1600/mt-tree80.png" /></a><span style="font-family: inherit;"><br /><b>MultiTree</b></span><br />
<span style="font-family: inherit;">MultiTree is by Linguist List and is a catalogue of lots of different language trees. You can search through the database for lots of different trees and compare them, very cool!</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><b>Glottolog</b></span><br />
<span style="font-family: inherit;">Glottolog is provided by the Max Planck Society, and edited by Harald Hammarström, Robert Forkel and Martin Haspelmath. Most of the detailed curation of the data is managed by Harald. Glottolog provides a lot of information, mainly language codes, trees, location (dots, not polygons) and references. Each tree in Glottolog has a clear reference to a published source, which is very handy.<a href="http://glottolog.org/glottolog/glottologinformation"> There is also clear information on how the classification is handled.</a></span><br />
<br />
If you disagree with information you find on Glottolog, or want to add information, <a href="https://github.com/clld/glottolog/issues">you can file a GitHub-issue</a> or click the little alarm bell symbol on the relevant page.<br />
<br />
<b>UNESCO Atlas of Languages in Danger</b><br />
This atlas is the complementary online version of the 2010 print edition, edited by Christopher Moseley. It contains information on 2,464 languages. This is the scale, and the number of languages at each level:<br />
<ul style="text-align: left;">
<li>Vulnerable (592)</li>
<li>Definitely endangered (640)</li>
<li>Severely endangered (537)</li>
<li>Critically endangered (577)</li>
<li>Extinct (228)</li>
</ul>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYbeo58qZG8zW07U5394o2ACRshjRIixkcXzytqkaCDW5NdxANJa1rK5thyphenhyphenH7ErLmFMFaGERKduUULgRoP6giEqxSAXkRPrCPYMmCcjpgRlUwc4IKfF7wo6iGwl8Rb3UTneYzbYnYQh9iK/s1600/langscape-R01.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="77" data-original-width="50" height="50" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYbeo58qZG8zW07U5394o2ACRshjRIixkcXzytqkaCDW5NdxANJa1rK5thyphenhyphenH7ErLmFMFaGERKduUULgRoP6giEqxSAXkRPrCPYMmCcjpgRlUwc4IKfF7wo6iGwl8Rb3UTneYzbYnYQh9iK/s200/langscape-R01.png" width="60" /></a></div>
<b>Langscape</b><br />
<span style="font-family: inherit;">Langscape is a website by the Maryland Language Science Center. They provide games, lesson material for teachers and - most interestingly to us - maps. <a href="http://langscape.umd.edu/map.php">These interactive maps</a> are actually based on the polygon set of SIL International, and they're not available for download freely. They are however accessible in the interactive web browser interface. </span><span style="font-family: inherit;">One way you can see that these are Ethnologue polygons, is that the genealogical classification is the same. For example: Mande languages are marked as the same family as Bantu languages (not the case in Glottolog).</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><a href="http://langscape-gis.umd.edu/LFT/">One of the games that Langscape has is an identification game,</a> not that different from the Great Language Game that <a href="http://humans-who-read-grammars.blogspot.com.au/2017/04/why-are-some-languages-often-confused.html">we wrote a paper about</a>! <a href="http://humans-who-read-grammars.blogspot.com.au/2017/04/lingquest-new-language-guessing-game.html">We also made a new game, LingQuest, that you can play</a>.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Alright, now you know. Best of luck with whatever research you have that is dependent on this kind of information.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><b>Questions for Ethnologue/SIL International staff (should they be reading)</b></span><br />
<br />
<ol style="text-align: left;">
<li>will the ISO 639-3 codes ever be behind a pay-wall?</li>
<li>are there other products of SIL International that may become like Ethnologue?</li>
<li>is it still only in effect in high-income countries (according to the World Bank)?</li>
<li>is it intentional that the Ethnoblog and the summary statistics pages are included under the pay-wall?</li>
<li>how are Ethnologue and SIL International funded?</li>
<li>have you considered other funding options?</li>
<li>how do Ethnologue and SIL International see their own roles in modern academia and some academics' dependence on the data, despite these resources not being traditionally funded by academic institutions?</li>
<li>why was the change made?</li>
<li>was the change announced anywhere publicly?</li>
<li>how many users access more than 1 page per month?</li>
<li>how many users access more than 7 pages per month?</li>
<li>how have the user stats changed over the past 2 years?</li>
<li>how many of your users are commercial and how many are academic, by estimation?</li>
</ol>
<div>
<b><span style="color: #990000; font-size: x-small;">***<br />EDIT<br />Note that Ethnologue is not only used by the academic research community, but also by commercial and governmental institutions <a href="http://humans-who-read-grammars.blogspot.com.au/2014/10/scandal-linguistics-used-horribly-wrong.html">(for example in this scandal)</a>. In fact, considering the new restrictions on access and problems with the basic data (opaque decisions and sources), perhaps academics shouldn't really use Ethnologue much at all.</span></b></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-73173899737710973052017-08-03T16:19:00.004+10:002017-09-28T11:22:53.237+10:00When someone calls for the study of language to become more integrated<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: inherit;"><a href="https://www.nature.com/articles/s41562-017-0163">Read this piece by Christiansen & Chater on a new emergent field of study that integrates different sub-disciplines of language sciences.</a></span></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: inherit;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitYpq9Nf_j-nJg0R2a0rIMJNOkzuVaWZM3vPv3DNZ7fDNpSWDBGnWZ6T0iGk48xxNLwNGIVJHMLC_lC-bwVRGYCEuEsiOR4PlFBnAM7-hxv9LyOBR6N0C5HNql9DevfZqg35H_lA04CxLD/s1600/handy+gifs.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: inherit;"><img border="0" data-original-height="607" data-original-width="640" height="303" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitYpq9Nf_j-nJg0R2a0rIMJNOkzuVaWZM3vPv3DNZ7fDNpSWDBGnWZ6T0iGk48xxNLwNGIVJHMLC_lC-bwVRGYCEuEsiOR4PlFBnAM7-hxv9LyOBR6N0C5HNql9DevfZqg35H_lA04CxLD/s320/handy+gifs.gif" width="320" /></span></a></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<span style="font-family: inherit;">See also: </span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="background-color: white; font-stretch: normal; font-weight: normal; line-height: normal; margin: 0px; position: relative; text-align: left;">
<span style="font-family: inherit; font-size: large;"><a href="http://humans-who-read-grammars.blogspot.com.au/2015/04/can-we-get-to-post-generativist-vs.html">Can we get to the post-generativist-vs-functionalist-war generation yet?</a> ( - I'm bored with the conflicts of my elders)</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<span style="font-family: inherit;">and</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<a href="http://humans-who-read-grammars.blogspot.com.au/search/label/challenges"><span style="font-family: inherit; font-size: large;">Grand challenges of current linguistics</span></a></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-32430422765987804712017-08-01T16:54:00.004+10:002017-09-29T12:28:44.677+10:00ELAN: making tier(s) out of search results<div dir="ltr" style="text-align: left;" trbidi="on">
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzEmNynDokEPy6pYe3tp6Baqnq6y3oumdATaiL3KdnQqvfizExkmG6tAxNfoBmriNJK9kul5mtYhXTHysL9T9FwX0Ce2YJESNQcIiI3kH0IVNeAgWHWjxNvLkxESHlFDDj-vPnSlexp8AC/s1600/IMG_20170801_145802.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1430" data-original-width="1600" height="286" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzEmNynDokEPy6pYe3tp6Baqnq6y3oumdATaiL3KdnQqvfizExkmG6tAxNfoBmriNJK9kul5mtYhXTHysL9T9FwX0Ce2YJESNQcIiI3kH0IVNeAgWHWjxNvLkxESHlFDDj-vPnSlexp8AC/s320/IMG_20170801_145802.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Hedvig in her office in Canberra figuring this out<br />
and writing this guide.</td></tr>
</tbody></table>
Here is another guide for how to do something practical in ELAN. Previously, <a href="https://yammeringon.wordpress.com/2017/05/01/elanpraat-machine-segmenting/">we relayed Eri Kashima's guide for sensible auto-segmentation with PRAAT and ELAN </a>(time saver!). (For all posts about fieldwork on this blog, see <a href="http://humans-who-read-grammars.blogspot.com.au/search/label/fieldwork">this tag.</a>)<br />
<br />
<b>This time:</b> how to take your search results and make the matching annotations into new separate tier(s). This is useful if, for example, you want to cycle through only the annotations that match a certain search query in transcription mode. This post has a longer guide, and a short guide at the end.<br />
<br />
You can also use this guide if you want to compare several different transcriptions with each other, for example older and newer versions or if you are collaborating with different people. In that case, start from <b>step (4).</b><br />
<br />
<b>For those who don't do a lot of transcription</b>: ELAN (<b>E</b>UDICO <b>L</b>inguistic <b>An</b>notator) is a program from TLA at MPI-Nijmegen. This program allows us to easily annotate audio and/or video files with lots of relevant data. We can use ELAN to count things, but we can also export as CSV-files for analysis later (Excel, R, LibreOffice etc). ELAN is free and great. If you ever need to do transcription, do it in ELAN. Do not create long text-documents with no linking to the audio; it is just ridiculous. <a href="https://tla.mpi.nl/tools/tla-tools/elan/">Download ELAN here.</a><br />
<br />
<b>Version of ELAN:</b> 4.8.1 (to my knowledge though this should work the same for other versions)<br />
<br />
<b>We're going to:</b><br />
<ul style="text-align: left;">
<li>search in a clever way</li>
<li>export those results</li>
<li>import them as new tier(s) into the .eaf-file you're working on</li>
<li>thus creating a tier with a defined subset of other existing tiers, making work speedier on targeted parts of your corpus</li>
</ul>
<div>
You can click the images for larger versions.</div>
<div>
<br /></div>
<b>Example case</b><br />
I've got a transcribed file where I've noticed some different pronunciation of a certain word. I'd like to pick out only the annotations containing that word, make a new tier with only them, and write down some clever things about this word in that tier. I don't want to have to scroll through all annotations to get to only these.<br />
<br />
I work on Samoan, and the word I'm looking at means "to tell/explain": <i>fa'amatala</i>. "<i>Fa'amatala</i>" is the dictionary entry for this word, but it varies in pronunciation in actual speech. I've asked my transcription assistant to mark down vowel length and the presence or absence of glottal stops (as opposed to a more orthographic transcription). She has done this pretty consistently (as far as I can tell, it's hard to hear glottal stops sometimes), and since I know what kind of variations to expect I can easily find the instances of this word. Due to <i>t</i>- and <i>k</i>-style (lects in Samoan) and speed, these are the variations we can expect:<br />
<ul style="text-align: left;">
<li>fa'amatala</li>
<li>fa:matala</li>
<li>famatala</li>
<li>fa'amakala</li>
<li>fa:makala</li>
<li>famakala</li>
</ul>
<div>
Besides the obvious difference in pronunciation, I've noticed something unusual going on in the realisation of t/k, sort of like an affricate. So, I'd like to listen to all instances of this word with all these spellings and make notes of that.<br />
<br />
Here are the steps. At the end is a <b>short guide</b> for when you've started to get the hang of this but need basic guidance.</div>
<div>
<br /></div>
<div>
<b>Step 1) clever searching</b></div>
<div>
In ELAN we can search for simple words, but we can also do something a bit more clever: we can search using regular expressions. Now, you don't need to have a complicated query or know all regex magic to make use of this. In this case, we're simply going to use the 'OR'-function. 'OR' in regular expressions is expressed by the vertical line/pipe character: "|" .</div>
<div>
<br /></div>
<div>
So, I'm searching for "fa'amakala|fa:makala|famakala|fa'amatala|fa:matala|famatala" in the tier marked "transcription". No need for bracketing, asterisks or anything like that in this case. If you want to do more complicated things with regular expressions, I highly recommend <a href="https://tla.mpi.nl/wp-content/uploads/2011/12/Searches_in_ELAN_with_regular_expressions.pdf">this guide and cheat sheet for regular expressions in ELAN by Ulrike Mosel*.</a></div>
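<div>
If you want to try out a pattern like this outside ELAN before searching, here is a minimal Python sketch. It is not part of ELAN: ELAN uses Java regular expressions, and the sketch assumes that plain alternation with "|" behaves the same way in Python's <code>re</code> module. The names <code>VARIANTS</code>, <code>explicit</code>, <code>compact</code> and <code>matches</code> are made up for illustration, and the compact pattern is only one possible shorthand for the same six spellings.</div>

```python
import re

# The six spellings from the post, joined with the regex OR operator "|".
VARIANTS = [
    "fa'amakala", "fa:makala", "famakala",
    "fa'amatala", "fa:matala", "famatala",
]
explicit = re.compile("|".join(VARIANTS))

# A more compact pattern intended to accept the same six spellings:
# "fa", then optionally "'a" or ":", then "ma", then t or k, then "ala".
compact = re.compile(r"fa(?:'a|:)?ma[tk]ala")


def matches(pattern, annotation):
    """Return True if the pattern occurs anywhere in the annotation text."""
    return pattern.search(annotation) is not None


if __name__ == "__main__":
    samples = [
        "fa:makala loa le!",
        "mafai ona e fa'amatala i a'u",
        "i le tala o le video",
    ]
    for text in samples:
        print(text, matches(explicit, text), matches(compact, text))
```

<div>
The explicit list is easier to read and to extend with a spelling you did not foresee; the compact form is shorter but easier to get subtly wrong, so it is worth checking it against every spelling you expect.</div>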
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigKBEOXo7R3PmoqrkvJ2e4yQEQ04dUKCLoO0CiZQDHaWaozuz3SJNL_pcrhCdqcQVIyglDyMUJbuCsuJbDT9N17m_P5P4uSeIbcnOToQ9RRLAkhHPV8hjoeYLRy4Sq6UuTisGjBQaVREen/s1600/search+window.tiff" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="729" data-original-width="1390" height="334" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigKBEOXo7R3PmoqrkvJ2e4yQEQ04dUKCLoO0CiZQDHaWaozuz3SJNL_pcrhCdqcQVIyglDyMUJbuCsuJbDT9N17m_P5P4uSeIbcnOToQ9RRLAkhHPV8hjoeYLRy4Sq6UuTisGjBQaVREen/s640/search+window.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Search query results</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
Here are our search results:</div>
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; width: 465px;"><colgroup><col style="width: 465pt;" width="465"></col></colgroup><tbody>
<tr height="5" style="height: 5.0pt;"><td height="5" style="height: 5.0pt; width: 465pt;" width="465"><ul>
<li>uma fa'amatala i a'u
i le tala o le video<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>fa:makala loa le!</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>fa:makala?</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>fa:makala ka:maloa lale e<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>ma: e mafai ona e fa:matala mai fapefea
le vaitaimi na'e tuputupu 'ae i: falealupo</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>mafai ona e fa'amatala i a'u<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>fa'amatala?</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>i e mafai ona e fa:matala i le ese'esega
o gagana sa:moa<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>e mafai ona e fa'amatala i le tala le
lenei<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>i fasa:moa, fa'amolemole fa'amatala i le
a</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>le kusi la ga ae kago famakala aka<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>o: mai o le se famakala aku le mea<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>fa:makala uma ?</li>
</ul>
</td></tr>
<tr height="15" style="height: 15.0pt;"><td height="15" style="height: 15.0pt;"><ul>
<li>e ke kago famakala le aka<span style="mso-spacerun: yes;"> </span></li>
</ul>
</td></tr>
</tbody></table>
<div>
That looks good! Not all variations we thought might exist occurred (we didn't get "famatala"), but that's normal. (In fact, specifically not getting that form is expected. Shortening of vowel + the <i>t</i>-lect should not co-occur often, if we believe what Mayer, Ochs and others have said about Samoan variation.)<br />
<br />
If you want to edit your search query, you don't need to start all over. Just click the search window again, right there over your results, and it'll be editable again. (This took me a while to realize.)</div>
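<div>
To see at a glance which spellings actually turned up, a tally can be sketched in Python as below. This is an illustration outside ELAN: the fourteen annotations are the search results listed above, typed in by hand, and the pattern and the names <code>RESULTS</code> and <code>tally_variants</code> are assumptions made for this sketch, not anything ELAN provides.</div>

```python
import re
from collections import Counter

# Shorthand for the six expected spellings (an assumption of this sketch).
PATTERN = re.compile(r"fa(?:'a|:)?ma[tk]ala")

# The search results from the post, one string per annotation.
RESULTS = [
    "uma fa'amatala i a'u i le tala o le video",
    "fa:makala loa le!",
    "fa:makala?",
    "fa:makala ka:maloa lale e",
    "ma: e mafai ona e fa:matala mai fapefea le vaitaimi na'e tuputupu 'ae i: falealupo",
    "mafai ona e fa'amatala i a'u",
    "fa'amatala?",
    "i e mafai ona e fa:matala i le ese'esega o gagana sa:moa",
    "e mafai ona e fa'amatala i le tala le lenei",
    "i fasa:moa, fa'amolemole fa'amatala i le a",
    "le kusi la ga ae kago famakala aka",
    "o: mai o le se famakala aku le mea",
    "fa:makala uma ?",
    "e ke kago famakala le aka",
]


def tally_variants(annotations):
    """Count how often each spelling of the word occurs in the annotations."""
    counts = Counter()
    for annotation in annotations:
        counts.update(PATTERN.findall(annotation))
    return counts


if __name__ == "__main__":
    for spelling, count in tally_variants(RESULTS).most_common():
        print(spelling, count)
```

<div>
Spellings that never occur simply do not show up in the tally, which makes gaps like the missing "famatala" easy to spot.</div>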
<div>
<br /></div>
<div>
<b>Step 2) exporting the search results</b></div>
<div>
This is very easy: in the search window you have open, go to "Query>Export" and choose to export as tab-delimited text.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisZyQV9ctzwNW0-ZvANP5tisG8frdp0csidMj0WDj240cKuqXlgcKX0WfhY-l7eNVyxSFsQxFDCIolKI0lC5c1jQ0DDFlUcIjjcWLrwIjGajQSeG15OzEYA0waGn45C8L5aeOaoap4T-do/s1600/export+search.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="176" data-original-width="378" height="148" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisZyQV9ctzwNW0-ZvANP5tisG8frdp0csidMj0WDj240cKuqXlgcKX0WfhY-l7eNVyxSFsQxFDCIolKI0lC5c1jQ0DDFlUcIjjcWLrwIjGajQSeG15OzEYA0waGn45C8L5aeOaoap4T-do/s320/export+search.tiff" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Export search query results</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQE_n9OREOHRK3bvitI8UaHVyrh2bC-L-6aI2wCmY1CkKVsepmBBgzCj1ksWnUt5oo3ud0RSD7jyHWn8GkQOUi70gxpDLN4I6plv2VMWNrqvNmguhckqQFIq7fWryeDmPZfrdycnabL8CF/s1600/export+dialog.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="423" data-original-width="572" height="236" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQE_n9OREOHRK3bvitI8UaHVyrh2bC-L-6aI2wCmY1CkKVsepmBBgzCj1ksWnUt5oo3ud0RSD7jyHWn8GkQOUi70gxpDLN4I6plv2VMWNrqvNmguhckqQFIq7fWryeDmPZfrdycnabL8CF/s320/export+dialog.tiff" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Exporting search results dialogue window</td></tr>
</tbody></table>
<div>
Name your file something sensible, and put it in a good place. Now let's have a look at said file outside of ELAN, shall we? The file will have the file-extension ".txt", but it is a tab-separated file (".tsv"). Open it in some spreadsheet program (Excel, Numbers, LibreOffice, Google Sheets, what have you) and it should look a little something like this:</div>
<div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwX3MPA3FpU5BZ1oONqi_LH4XcDt_XtbEqWOBKtU6MRyoRPxB7GwPvF6GDGIdG6W_Es4zrNmi30UZrBEfJkLxoeo1bS0Ncy3s9MrfMKnDLmwoVikOsR9Y1TQKAlyQTWobOPM5RgtDyMXY7/s1600/file+table.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="281" data-original-width="811" height="219" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwX3MPA3FpU5BZ1oONqi_LH4XcDt_XtbEqWOBKtU6MRyoRPxB7GwPvF6GDGIdG6W_Es4zrNmi30UZrBEfJkLxoeo1bS0Ncy3s9MrfMKnDLmwoVikOsR9Y1TQKAlyQTWobOPM5RgtDyMXY7/s640/file+table.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Search results file opened in Excel, specifying tab as delimiter.</td></tr>
</tbody></table>
</div>
<div>
That looks kinda alright, doesn't it? There are no headings, but we can figure this out. There are some things in there that we didn't ask to have, for example the first column is the file location. That's not needed for what we're doing, and I'll show you how to handle that in the next step. Don't worry.</div>
<div>
<b><br /></b></div>
<div>
<b>Step 3) creating tier(s) out of the search results</b></div>
<div>
Now we go back to ELAN and we import this file as a tier. What will happen here is that an entirely new .eaf-file will be created; the tier will not actually be imported directly into whichever file you currently have open. This means that it doesn't matter which .eaf-file you have open when you import (or indeed whether any is open). Counterintuitive, I know, but don't worry - I've figured it out. It's not that complicated, just stay with me.</div>
<div>
<br /></div>
<div>
File>Import> CSV/Tab-delimited Text file</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuASXWyK55e5JFo3Yh9GV3UzFm5MYZNJkU_Vp-Bljfj3s79x6r2YENhlQI7AwxSTnaG07nSYA_TykZqB3JEF2SnEDUiGP4lYKk36-RC-exFYkXv3W00pxA7esCB8yJ9jCIuyTXSJKFuBye/s1600/import.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="490" data-original-width="430" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuASXWyK55e5JFo3Yh9GV3UzFm5MYZNJkU_Vp-Bljfj3s79x6r2YENhlQI7AwxSTnaG07nSYA_TykZqB3JEF2SnEDUiGP4lYKk36-RC-exFYkXv3W00pxA7esCB8yJ9jCIuyTXSJKFuBye/s320/import.tiff" width="280" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Importing CSV/Tab-delimited Text file</td></tr>
</tbody></table>
<div>
Next up you will get a window asking you questions about the file you're trying to import. Remember how the file didn't have headings for the columns? How will we figure out what is what? Not to worry, it's like this:</div>
<div>
<br /></div>
<div>
1 col: <i>ignore (uncheck)</i></div>
<div>
2 col: Tier</div>
<div>
3 col: Begin time</div>
<div>
4 col: <i>ignore </i><i>(uncheck)</i></div>
<div>
5 col: End time</div>
<div>
6 col: <i>ignore </i><i>(uncheck)</i></div>
<div>
7 col: Duration (not sure why this is needed but oh well)</div>
<div>
8 col:<i> ignore </i><i>(uncheck)</i></div>
<div>
9 col: Annotation</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSRc8BiyDNX96ofckXWxtAlYSPNf4e28U3LgitiYVn3_p1lnwPJWYFFS7WfZKCN33X71WnfgvYZJW4K_15lUZUL5jRJT6bRBy0_pnNyVBSkFkrKqypOTYhy5Nl4Lu6xMEw1CdTnAd_A34/s1600/specify.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="498" data-original-width="1282" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOSRc8BiyDNX96ofckXWxtAlYSPNf4e28U3LgitiYVn3_p1lnwPJWYFFS7WfZKCN33X71WnfgvYZJW4K_15lUZUL5jRJT6bRBy0_pnNyVBSkFkrKqypOTYhy5Nl4Lu6xMEw1CdTnAd_A34/s640/specify.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Import CSV/Tab-delimited Text file dialogue window.</td></tr>
</tbody></table>
<div>
I wish that ELAN had a way of automatically recognizing its own search output, but it doesn't and we know how to do this anyway so it's all good. No need to specify the other options, just leave them unchecked.</div>
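<div>
If you'd rather peek at the export outside of Excel, the column list above can be sketched in a few lines of Python. This is only a rough illustration: it assumes the nine-column, tab-delimited, headerless layout described in this post, the names <i>COLUMN_ROLES</i> and <i>read_search_export</i> are made up for the example, and the sample row (including its time values) is invented rather than taken from a real ELAN export.</div>

```python
import io

# Column roles as listed in this post (1-based position: role).
# None means "ignore (uncheck)" in the ELAN import dialogue.
COLUMN_ROLES = {
    1: None,  # file location, not needed
    2: "tier",
    3: "begin_time",
    4: None,
    5: "end_time",
    6: None,
    7: "duration",
    8: None,
    9: "annotation",
}


def read_search_export(handle):
    """Return one dict per row, keeping only the columns that are not ignored."""
    rows = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        row = {}
        for position, value in enumerate(line.split("\t"), start=1):
            role = COLUMN_ROLES.get(position)
            if role is not None:
                row[role] = value
        rows.append(row)
    return rows


# Invented sample row in the assumed layout; values are treated as plain text.
SAMPLE = (
    "/path/to/file.eaf\ttranscription\t00:00:01.000\t1.0\t"
    "00:00:02.500\t2.5\t00:00:01.500\t1.5\tfamakala\n"
)

rows = read_search_export(io.StringIO(SAMPLE))
print(rows[0])
```

<div>
The values are deliberately left as text, since all we want here is to see which column plays which role before telling ELAN the same thing in the import window.</div>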
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipH5I1jwdDihIbIHPoX5gtLK3LEK-Tn8setQ1dn3YZ-Q1P6b9k6tetT3-0bXom4i7wlPrN28NTBoL7qXIXDr0ghuIL_A8Gkfwly6e0u0i9u0xO3L9QJNa8W9IFBEjDdrsT7JOoirmzRfD3/s1600/ghost.gif" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="350" data-original-width="498" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipH5I1jwdDihIbIHPoX5gtLK3LEK-Tn8setQ1dn3YZ-Q1P6b9k6tetT3-0bXom4i7wlPrN28NTBoL7qXIXDr0ghuIL_A8Gkfwly6e0u0i9u0xO3L9QJNa8W9IFBEjDdrsT7JOoirmzRfD3/s320/ghost.gif" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An actual ghost</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
Now you will have a new .eaf-file with the same name as the file with the search results. This file will contain only the tier(s) you had searched within and only the annotations matching the search query. There's no audio file and no other tiers. It's like a ghost tier, haunting the void of empty silence of this lonely .eaf-file.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS2sQm3_sLP5gDTM4Gu0GBeqcyA5eISi14YZ1y1Kk9-eeHW3zBnExFHP9273hiehGp3oUibXC9lB2JWPTLFbdJbMlvGg9_6m7zfIYKydj5Ow6tj6r56plqjXduTVZBSSJMVYlOmhqyg8rp/s1600/just+search+as+tier.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="539" data-original-width="1445" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS2sQm3_sLP5gDTM4Gu0GBeqcyA5eISi14YZ1y1Kk9-eeHW3zBnExFHP9273hiehGp3oUibXC9lB2JWPTLFbdJbMlvGg9_6m7zfIYKydj5Ow6tj6r56plqjXduTVZBSSJMVYlOmhqyg8rp/s640/just+search+as+tier.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A lonely ghost tier in an otherwise empty .eaf-file</td></tr>
</tbody></table>
<div>
Save this file and any other files currently open in some clever place(s), then quit ELAN and restart it. Sometimes ELAN seems to have trouble seeing files later on in this process unless you do this. I don't know why this is, but saving, closing and restarting seems to help, so let's just do that :)!</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLpTEf9I7efnre3bgY5z1j20Xe1GVn-rQE-SWCHSQ2VScZsnwXmlsTPLMZUrmfPuFaJVOQsMtUznZDit3FMHQ4Dz9lrZGzP1bORp-PrwZptKnT4jBTfPmU2Ju_Yp-83-5O6Mf3WlHQEIEB/s1600/giphy+IT+crowd.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="281" data-original-width="500" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLpTEf9I7efnre3bgY5z1j20Xe1GVn-rQE-SWCHSQ2VScZsnwXmlsTPLMZUrmfPuFaJVOQsMtUznZDit3FMHQ4Dz9lrZGzP1bORp-PrwZptKnT4jBTfPmU2Ju_Yp-83-5O6Mf3WlHQEIEB/s320/giphy+IT+crowd.gif" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Chris O'Dowd as Roy Trenneman in The IT Crowd</td></tr>
</tbody></table>
<div>
<b>Step 4) importing the search results tier into the original file</b></div>
<div>
Now here's where I slightly lied to you: we're not going to import the tier into your file. We're going to merge the search-results-tier-only-file with the other .eaf -file that has all the audio and other tiers and the result is going to be a new .eaf-file. So you'll have three files by the end of this:</div>
<div>
<ul style="text-align: left;">
<li>a) your original .eaf-file with audio and lotsa tiers</li>
<li>b) your .eaf-file with only the search results-tier and no audio etc (ghost-tier)</li>
<li>c) a new merged file consisting of the two above listed</li>
</ul>
</div>
<div>
Don't worry, I've got this. I'm henceforth going to call these files (a), (b) and (c) as indicated above.</div>
<div>
<br /></div>
<div>
Open file (a). Select "Merge Transcriptions..."</div>
<div>
<br /></div>
<div>
File>Merge >Transcriptions...</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3L3KrRbjLk4t1wYTO_37IrW0BjaWg9bFm5QJgjELHyC3K8aKTzWFTeVKXGbT5W-_h4wkoSk57ERO3xpusVkpXptN_4eEoQ4EYELNy3FaMZB00qPXGM39CzElWtnH3D-VqnFUcTNcpxfkP/s1600/select+merge.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="494" data-original-width="284" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3L3KrRbjLk4t1wYTO_37IrW0BjaWg9bFm5QJgjELHyC3K8aKTzWFTeVKXGbT5W-_h4wkoSk57ERO3xpusVkpXptN_4eEoQ4EYELNy3FaMZB00qPXGM39CzElWtnH3D-VqnFUcTNcpxfkP/s320/select+merge.tiff" width="183" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Select Merge transcriptions</td></tr>
</tbody></table>
<div>
Now, select file (a) as the current transcription (this is default anyway), file (b) as the second source and choose a name and location for the new file, file (c), in the "Destination" window. You can think of "Destination" as "Save as.." for file (c) - our new file.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi15RB9hdzmH_SMgY5fRvxRCEf4dcrXFJLzNQkxTsnGrPqnMaatVsoLLW3gr0B35elyvAYUbZJS82uG7Wi3LZi3-uJwy6yLl4RGHljGZtkJAWvBRkeTmUEISpFRazWExRi6M-9GilSP88QM/s1600/merge+dialogue.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="557" data-original-width="1079" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi15RB9hdzmH_SMgY5fRvxRCEf4dcrXFJLzNQkxTsnGrPqnMaatVsoLLW3gr0B35elyvAYUbZJS82uG7Wi3LZi3-uJwy6yLl4RGHljGZtkJAWvBRkeTmUEISpFRazWExRi6M-9GilSP88QM/s640/merge+dialogue.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Specifying what should be merged and how</td></tr>
</tbody></table>
<div>
Do not, I repeat, do <i style="font-weight: bold;">not </i>append. And no need to worry about linked media, because (b) doesn't have any audio or anything (remember, it's a ghost). Just leave all those boxes unchecked.</div>
<div>
<br /></div>
<div>
Let ELAN chug away with the merging, and then you're done!</div>
<div>
<br /></div>
<div>
<b>Step 5) finished!</b></div>
<div>
Tadaaa! We're done! That wasn't so bad, was it? And look at what we've created!</div>
<div>
<b><br /></b></div>
<div>
Here's my merged file - file (c). I've taken the search-results tier and renamed it ("famakala"). I also copied it and renamed that one ("famakala - comments"). That way, I have a tier for making comments about the transcription annotation that has the exact same annotation distributions, but different values.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjelOPueXpLu08mSBhfjiSRRL6ovmomvDQASebxtztj12tnkOqRUZPtcAIZWYH6xFfQX8VzI_w9-7jfAh-gUhP0Nd08jSXVMCtMTQQU57gBkzfu5mQPWGuJBudyDTyMqTHGPuAedJTiZjr1/s1600/results%252C+annotation.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="520" data-original-width="1059" height="314" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjelOPueXpLu08mSBhfjiSRRL6ovmomvDQASebxtztj12tnkOqRUZPtcAIZWYH6xFfQX8VzI_w9-7jfAh-gUhP0Nd08jSXVMCtMTQQU57gBkzfu5mQPWGuJBudyDTyMqTHGPuAedJTiZjr1/s640/results%252C+annotation.tiff" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Final merged file in annotation mode, with the search results tier renamed and copied.</td></tr>
</tbody></table>
<div>
Here's the same file in the transcription mode, configured to only show the two tiers targeting the search query:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIydDmxLgxiHlN75NN_JBmdfEoDX_O2p8HD8y_s5OyBnoSk4c9OEAdBOajDaWvJ15_K-FlRqwSxRbGRQdWZM9DpR9Zyg-DIuxb4dYuFgjYF4at2vHDSrJVkwFZqPdjgZxslq8UTrgr4h2e/s1600/results+trans.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="621" data-original-width="713" height="278" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIydDmxLgxiHlN75NN_JBmdfEoDX_O2p8HD8y_s5OyBnoSk4c9OEAdBOajDaWvJ15_K-FlRqwSxRbGRQdWZM9DpR9Zyg-DIuxb4dYuFgjYF4at2vHDSrJVkwFZqPdjgZxslq8UTrgr4h2e/s320/results+trans.tiff" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Final merged file in transcription mode, showing only the search results tiers.</td></tr>
</tbody></table>
<div>
Now, some final notes:</div>
<div>
<ul style="text-align: left;">
<li>You might want to rename file (c) and delete files (a) and (b), for your own sanity later when managing the files, if for nothing else</li>
<li>Don't know how to get to transcription mode? Go to "Options>Transcription Mode".</li>
<li>Your tiers aren't showing up properly in transcription mode? Check that the "linguistic types" of the tiers are what you think they are and that those are the types you've configured transcription mode to show. Transcription mode can only show you tiers of one linguistic type at once (unless you use columns, but that's complex). I don't really get it either, but then again I barely get "linguistic types" at all</li>
<li>Transcription mode getting clogged up with lots of irrelevant tiers? Go to "Configure..." on the left in the transcription mode window, select the right linguistic type and click "Select tiers.." in the bottom left. Tick only the tiers you want to see at that moment</li>
<li>You can import several tiers at once by this method, you don't have to merge one search result at a time, see below</li>
<li>You might want to do something complicated related to speakers, see below</li>
</ul>
</div>
<div>
<b>Several tiers at once</b></div>
<div>
You can either search several tiers at once in the search mode and hence have several tiers in the search query output, or you could do several searches separately and then append the resulting tsv-files together afterwards in your spreadsheet-program. If there is a different value in the "Tier" column, ELAN will make several tiers when importing back as an .eaf-file. So, you can do several tiers at once.</div>
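<div>
If you'd rather not do the appending by hand in a spreadsheet, here is a minimal Python sketch of the same idea. It assumes each export is a headerless tab-delimited file with the tier name in the second column, as described above; the function names <i>combine_exports</i> and <i>tier_names</i> and the two sample exports are made up for illustration.</div>

```python
import io

TIER_COLUMN = 1  # zero-based index of the second column, mapped to "Tier" above


def combine_exports(handles):
    """Collect the rows of several tab-delimited exports, in the order given."""
    rows = []
    for handle in handles:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                rows.append(line.split("\t"))
    return rows


def tier_names(rows):
    """Distinct values in the Tier column, in first-seen order."""
    names = []
    for fields in rows:
        name = fields[TIER_COLUMN]
        if name not in names:
            names.append(name)
    return names


# Two invented exports, one row each, with different tier names.
EXPORT_A = "/path/a.eaf\ttranscription\t0.0\t0\t1.0\t1\t1.0\t1\tfamakala\n"
EXPORT_B = "/path/a.eaf\tcomments\t2.0\t2\t3.0\t3\t1.0\t1\tcheck this\n"

combined = combine_exports([io.StringIO(EXPORT_A), io.StringIO(EXPORT_B)])

# Text of the combined file, ready to be written out and imported into ELAN.
combined_text = "".join("\t".join(fields) + "\n" for fields in combined)

print(tier_names(combined))
```

<div>
Each distinct name that <i>tier_names</i> reports corresponds to one tier you should expect to see after importing the combined file.</div>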
<div>
<br /></div>
<div>
<b>Speaker tiers</b></div>
<div>
Everyone organises their ELAN-files differently. I have a separate tier where I indicate who the speaker is in the annotation (see above screenshots). This is in contrast to how a lot of other people do it, with different tiers for different speakers. This means that I can search many speakers at the same time, or condition the search for "when X is indicated in speaker-ID-tier". </div>
<div>
<br /></div>
<div>
If you're doing different tiers for different speakers, you might have to figure out something a bit different from me in order to search many speakers at the same time. It's not that difficult though, you just have to meddle a bit with the search query (or just search one speaker at a time). Contact me if you want help.</div>
<div>
<br /></div>
<div>
On a related note, if someone were ever to ask me to put separate speakers in different tiers, I could use the above process to pull out only the annotations with a certain value in the speaker-tier and then import them back as one tier per speaker. I'd rather not, I like it this way. But I like making sure that the way I set things up can be reconfigured to please others as well. Flexibility is good: don't lock yourself into a set-up so narrow that you can't change it without losing data.<br />
<br />
That said, I need to do manual fidgety things for overlapping speech given this model. That's inconvenient, but I'm ok with it.<br />
<br />
<b>Short guide</b><br />
Step 1) Clever searching<br />
<div>
Step 2) export search results</div>
<ul style="text-align: left;">
<li>Query>Export (Save as tab-delimited text file)</li>
</ul>
<div>
Step 3) create new tier</div>
<div>
<ul style="text-align: left;">
<li>File>Import> CSV/Tab-delimited Text file</li>
<li>Specify columns (1 col: <i>ignore, </i>2 col: Tier, 3 col: Begin time, 4 col: <i>ignore, </i>5 col: end time, 6 col: <i>ignore, </i>7 col: Duration, 8 col:<i> ignore , </i>9 col: Annotation)<div>
</div>
</li>
<li>Save new .eaf-file. </li>
<li>Quit and restart ELAN</li>
</ul>
<div>
Step 4) Creating merged file</div>
</div>
<div>
<ul style="text-align: left;">
<li>Open original file with audio and other tiers</li>
<li>File>Merge transcriptions...</li>
<li>Select .eaf-file with search results as second source (do not append)</li>
<li>Save new merged file</li>
<li>Delete superfluous files</li>
</ul>
<div>
Step 5) done</div>
</div>
<div>
<ul style="text-align: left;">
<li>rename and copy tiers if necessary</li>
</ul>
</div>
</div>
<div>
<b>Questions/comments</b></div>
<div>
<a href="http://humans-who-read-grammars.blogspot.com.au/p/contact.html">Let 'em loose here.</a></div>
<div>
<br /></div>
<div>
I'm sure there's other ways of doing this, but this is what has worked well for me. I'd like this to be easier in ELAN, but in the meantime this works so I'm gonna do it like this.<br />
<br />
I find, in general, that I learn more about ELAN and other similar tools by just trying lots of different things and probing the system. Sure, there are manuals, but they often envisage a different usage from the one I'm after. For example, I'm not clear on what I actually gain from "linguistic types" for what I want to do. Never mind, probing, searching and sharing seem to be the best way to go for tailored functions. Usually, what you can conceptually imagine as a useful thing exists somewhere (it's like rule 34 but for software). I didn't know how this worked until I thought to myself: "there must be a way of importing search results". And lo and behold, there is. Now here's something I've learned and that you now can do too! Good luck!</div>
<div style="text-align: center;">
<b><span style="font-size: large;">Good bye!</span></b></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggC9KdV3Z7vPzxaL5iP3F8wBH7EPuePHvckWlVDUzkH5OmALfquUqAdcPbyBW7fQSNcWyanfOLFqiqmpBCloBQt-6eYFgsIEwHekqfupqQjccyq-zDP_j2TNjx8Xya4xBKdU_L4LKbg4ht/s1600/finished.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="281" data-original-width="500" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggC9KdV3Z7vPzxaL5iP3F8wBH7EPuePHvckWlVDUzkH5OmALfquUqAdcPbyBW7fQSNcWyanfOLFqiqmpBCloBQt-6eYFgsIEwHekqfupqQjccyq-zDP_j2TNjx8Xya4xBKdU_L4LKbg4ht/s320/finished.gif" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Richard Ayoade as Maurice Moss in The IT Crowd</td></tr>
</tbody></table>
<div>
<span style="font-size: x-small;">* No, I don't know why it is that two linguists who are working/worked on specifically Samoan are trying to teach other linguists to use regular expressions in ELAN. Must be something in the water.</span></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-left: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSTut-9N51yuFw7yaub_wE6zXWaQD1-CVtkoSKUwrc0Yy8DJO855e-fhzkZNAgKSIwX2zAAWd_XjRHH765Xyt4L-Olk54r3tMvA-gdlj-3WZET167y3f3Ty1p1AHmD4uHqIva1UzmBN55T/s1600/20160423_152304.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="900" data-original-width="1600" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSTut-9N51yuFw7yaub_wE6zXWaQD1-CVtkoSKUwrc0Yy8DJO855e-fhzkZNAgKSIwX2zAAWd_XjRHH765Xyt4L-Olk54r3tMvA-gdlj-3WZET167y3f3Ty1p1AHmD4uHqIva1UzmBN55T/s320/20160423_152304.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Ulrike Mosel and Hedvig Skirgård (yours truly) in Canberra</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMPsc-wJcRXu8H-aMp6A_OTV-7cm5g_6PIec81Gilkjw6JBNSM5Pj87fuSvnvPem7MMqkYkgN39QiqRHujMoUrCOc-3_UWLDdJHRUeTOkd3iDfR4EdnJhi4CwlhIE4rB34JxPi0VUbwM2B/s1600/14633353_10157620701815029_5379839874800686480_o.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1080" data-original-width="1080" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMPsc-wJcRXu8H-aMp6A_OTV-7cm5g_6PIec81Gilkjw6JBNSM5Pj87fuSvnvPem7MMqkYkgN39QiqRHujMoUrCOc-3_UWLDdJHRUeTOkd3iDfR4EdnJhi4CwlhIE4rB34JxPi0VUbwM2B/s200/14633353_10157620701815029_5379839874800686480_o.jpg" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Samoan water, Neiafu-Tai village</td></tr>
</tbody></table>
<div>
<span style="font-size: x-small;"><br /></span></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<b>References</b><br />
<ul style="background-color: white; text-align: left;"><ul>
<li>Sloetjes, H., & Wittenburg, P. (2008).<br />Annotation by category – ELAN and ISO DCR.<br />In: Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).</li>
<li>Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., Sloetjes, H. (2006).<br />ELAN: a Professional Framework for Multimodality Research.<br />In: Proceedings of LREC 2006, Fifth International Conference on Language Resources and Evaluation.</li>
<li>Brugman, H., Russel, A. (2004).<br />Annotating Multimedia/ Multi-modal resources with ELAN.<br />In: Proceedings of LREC 2004, Fourth International Conference on Language Resources and Evaluation.</li>
<li>Crasborn, O., Sloetjes, H. (2008).<br />Enhanced ELAN functionality for sign language corpora.<br />In: Proceedings of LREC 2008, Sixth International Conference on Language Resources and Evaluation.</li>
<li>Lausberg, H., & Sloetjes, H. (2009).<br />Coding gestural behavior with the NEUROGES-ELAN system.<br />Behavior Research Methods, Instruments, & Computers, 41(3), 841-849. doi:10.3758/BRM.41.3.591.</li>
</ul>
</ul>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com2tag:blogger.com,1999:blog-1300680252997007251.post-30861379703084809192017-07-29T14:22:00.001+10:002017-07-29T14:37:50.799+10:00Speakers per language diagram & International Linguistics Olympiad memes<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
Hello readers of Humans Who Read Grammars,<br />
<br />
As well as writing on this blog, I also work with the International Linguistics Olympiad (IOL*). The IOL is a contest for secondary school students from all over the world where they get to compete in solving linguistic puzzles. Normally, in order to explain what the contest is all about, I send people to <a href="http://www.ioling.org/problems/">the page with old problem sets,</a> but there's <a href="https://www.facebook.com/iolmemes/?fref=ts">a hip IOL-meme page</a> that's produced some very apt memes that may do a better job of explaining the contest to linguists. I'll paste them in below. (<a href="http://humans-who-read-grammars.blogspot.com.au/search/label/meme">Remember how we started as a meme-based blog for typologists?</a>)<br />
<br />
<a href="http://international-linguistics-olympiad.blogspot.com.au/2017/07/non-european-countries-in-iol-more.html">I recently made a post on our blog over there about the dominance of European countries in the contest and language diversity</a>. For that post, I derived a little data visualisation of speaker populations per language (based on the 19th edition of Ethnologue) with <a href="https://infogram.com/">infogram</a>. I thought y'all might like it as well, so I'm sharing it here too.<br />
<br />
By the way, if you're a linguist who'd like to help keep the contest strong and encourage clever youngsters to get into linguistics, <a href="https://international-linguistics-olympiad.blogspot.com.au/p/contact.html">get in touch</a>! There's a lot of countries where there is no contest, or where the contest could well do with some help in thinking of clever problems based on small languages, lecturing etc. Talk to us and we'll figure something out.<br />
<br />
<br /></div>
<div style="border-top: 1px solid #dadada; font-family: "arial important"; font-size: 13px; line-height: 15px; margin: 0 30px; padding: 8px 0; text-align: center;">
<a href="https://infogram.com/speakers_per_language" style="color: #989898!important; text-decoration: none!important;" target="_blank">Speakers per language</a><br />
<a href="https://infogram.com/create/pie-chart?utm_source=embed_bottom&utm_medium=seo&utm_campaign=pie_chart" rel="nofollow" style="color: #989898!important; text-decoration: none!important;" target="_blank">Infogram</a></div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Here is a table from Ethnologue that tries to explain this as well, a bit niftier but perhaps less pretty.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEje5CnW8HuYhA6oQhYIsROe-GwOJ7rVt8ZLg9r_vQZ6WGhUhbcwE04KTLcEKnFDh8bnUF8pcpzSPPoltGo-DHjga-U5g57xuRU_sBtNYU0GytSkDT9sq_OGssvc6j10AVmrW_bjFElvR8Rn/s1600/ehtnolanguage+size.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: inherit;"><img border="0" data-original-height="310" data-original-width="913" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEje5CnW8HuYhA6oQhYIsROe-GwOJ7rVt8ZLg9r_vQZ6WGhUhbcwE04KTLcEKnFDh8bnUF8pcpzSPPoltGo-DHjga-U5g57xuRU_sBtNYU0GytSkDT9sq_OGssvc6j10AVmrW_bjFElvR8Rn/s1600/ehtnolanguage+size.tiff" /></span></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;"><div style="font-size: 12.8px;">
<span style="font-family: inherit;">Table from Ethnologue summarising the number of speakers per language.</span></div>
</td></tr>
</tbody></table>
<div>
<span style="font-size: large;"><b><br /></b></span></div>
<div>
<span style="font-family: inherit; font-size: large;"><b>Memes from <a href="https://www.facebook.com/iolmemes/?fref=ts">IOL Memes for Ergative-Absolutive Teens</a></b></span></div>
<div>
<span style="font-size: x-small;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzNXedTyiKLi6uP4-oVTujCogD5YF1x6gyDKo6uZbUOqnZUjlzSLxn2UnLpZnWOydfPrZzaqa3JrnMGZlIX70phmTU6WPRHv2DIlL97sQ6HvQH8i81uAlPZ2VjZ_TaXGv4lNzcYxoeAE4l/s1600/17991832_165699470624263_8230412309732004805_n.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="521" data-original-width="540" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzNXedTyiKLi6uP4-oVTujCogD5YF1x6gyDKo6uZbUOqnZUjlzSLxn2UnLpZnWOydfPrZzaqa3JrnMGZlIX70phmTU6WPRHv2DIlL97sQ6HvQH8i81uAlPZ2VjZ_TaXGv4lNzcYxoeAE4l/s1600/17991832_165699470624263_8230412309732004805_n.png" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3hxMj7QeTdBgT45j64jAL8z0uqve2I6cvA_hO4FNpRzHd40PXmH1UHSQiKWvi26XQ-k_05Sd5NYrjlxH2-ozBI19zinrnkNif3uZKcc6NO7AIpsvRm3WNKI8CZ4iJ08WfOv8ghZynbre8/s1600/18118487_166965707164306_7953965985573845161_n.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="401" data-original-width="576" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3hxMj7QeTdBgT45j64jAL8z0uqve2I6cvA_hO4FNpRzHd40PXmH1UHSQiKWvi26XQ-k_05Sd5NYrjlxH2-ozBI19zinrnkNif3uZKcc6NO7AIpsvRm3WNKI8CZ4iJ08WfOv8ghZynbre8/s1600/18118487_166965707164306_7953965985573845161_n.jpg" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhy5Gh8Arc-KRBWNCnYyt-YHvA54twTg7-ahujFAVc0lw5T6_ooBcflpMri74vuC4ZQaS2McEgMlzay4TLiliRYAkt1QUQE0Y4V10x_6Hrlb5jdUr6ZSDS6-da_YY-1oQRPGRdxDhlMZSv9/s1600/20258510_216814325512777_2167443426154043402_n.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="586" data-original-width="658" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhy5Gh8Arc-KRBWNCnYyt-YHvA54twTg7-ahujFAVc0lw5T6_ooBcflpMri74vuC4ZQaS2McEgMlzay4TLiliRYAkt1QUQE0Y4V10x_6Hrlb5jdUr6ZSDS6-da_YY-1oQRPGRdxDhlMZSv9/s1600/20258510_216814325512777_2167443426154043402_n.png" /></a></div>
<div>
<span style="font-size: x-small;">* Yes, the International Linguistics Olympiad is abbreviated "IOL". It's a thing about neutrality, don't worry about it.</span></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com0tag:blogger.com,1999:blog-1300680252997007251.post-15906626005230645512017-06-27T16:50:00.000+10:002020-07-06T04:35:23.488+10:00New Approaches to Ethno-Linguistic Maps<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I’m excited to give a guest blog post here at Humans Who Read Grammars on new methods in language geography. I’m a geographer by trade, and I am currently a PhD student at the University of Maryland. I also work for an environmental nonprofit - Conservation International - doing data science on agriculture and environmental change in East Africa. Before ending up where I am now, I lived for some time in West Africa and the Philippines. During my time in both of those linguistically-rich areas, I became quite interested in language geographies and linguistics more generally. Spurred on by curiosity and my disappointment in available resources, I’ve done some side projects mapping languages and language groups, which I’ll talk about here.</span></div>
<h2 dir="ltr" style="line-height: 1.38; margin-bottom: 6pt; margin-top: 18pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 16pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Problems with Current Language Maps</span></h2>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLpEiZ_0ZFRRYt5NE6fD66vPObZbo5k_FGxoiTlkFvJ6UUEvZEFWV0xvbC6VJKRWCQhWtYcynrj8DDSGr0ioGuFMRpfTsLRBVNJldgVo34AkE1_jwlr8NIszoKiYH6MVuRgsx7D6xDkpeH/s1600/image+%25281%2529.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="501" data-original-width="999" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLpEiZ_0ZFRRYt5NE6fD66vPObZbo5k_FGxoiTlkFvJ6UUEvZEFWV0xvbC6VJKRWCQhWtYcynrj8DDSGr0ioGuFMRpfTsLRBVNJldgVo34AkE1_jwlr8NIszoKiYH6MVuRgsx7D6xDkpeH/s640/image+%25281%2529.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: 13.6px;">A map of tonal languages from WALS. Fascinating at a global scale, but unsatisfying if you zoom in to smaller regions.</span></td></tr>
</tbody></table>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">One major issue with most modern maps of languages is that they often consist of just a single point for each language - this is the approach that WALS and Glottolog take. This works pretty well for global-scale analyses, but simple points are quite uninformative for regional-scale studies of languages. Points also have a hard time spatially describing languages that have disjoint distributions, like English, or languages that overlap spatially. See </span><a href="http://humans-who-read-grammars.blogspot.se/2014/12/linguistic-diveristy-important-things.html" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> for a more in-depth discussion of these issues from Humans Who Read Grammars.</span></div>
<b id="docs-internal-guid-28b73285-e849-672e-ff67-2cdc6de57844" style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">One reason that most language geographers go for the one-point-per-language approach is that using a single point is simple, while mapping languages across regions and areas is very difficult. An expert must decide where exactly one language ends and another begins. The problem with relying on experts, however, is that no expert has uniform experience across an entire region, and thus will have to rely on other accounts of which language is prevalent where. This is how, for example, </span><a href="https://upload.wikimedia.org/wikipedia/commons/4/49/Africa_ethnic_groups_1996.jpg" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">the Murdock Map</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> of African ethno-linguistic groups was created. As a continental-scale map, it is rich and fascinating. However, looking more closely at specific regions, the map seems to have problems - how did Murdock know exactly the shape of each little wiggle identifying the boundary between two groups? What about areas where two different groups overlap? Other issues arise when trying to separate distinct groups, because the on-the-ground reality is often that a language exists as a dialect continuum, something that subjectively drawn polygons do not readily account for.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">These maps can have real import when they form the foundation of other analyses. Researchers have examined whether ethnic diversity in developing countries, and in Africa in particular, can hamper economic development and lead to conflict. Scientists disagree, although many analyses use the Murdock map. See some of this research </span><a href="http://www.jstor.org/stable/2951270?seq=1#page_scan_tab_contents" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, </span><a href="http://jae.oxfordjournals.org/content/9/3/244.short" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> and </span><a href="http://www.sciencedirect.com/science/article/pii/S0305750X14000138" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: 
normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Another study, recently published in </span><a href="https://www.ncbi.nlm.nih.gov/pubmed/27609892" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">Science</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, looked at Internet penetration in areas where politically excluded ethnic groups live. They found that groups without political power were often marginalized in terms of internet service provision. However, their data for West Africa, which came from the </span><a href="http://www.epr.ucla.edu/" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">Ethnic Power Relations</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> database, was quite rough: all of southern Mali was one ethnic group labeled "blacks" while the north was labeled as "Tuaregs" or "Arabs", while there was no data at all for Burkina Faso. 
While their findings were important and they did the best that they could with available datasets, a less informed analysis from the same data could end up looking like </span><a href="http://humans-who-read-grammars.blogspot.se/2014/10/scandal-linguistics-used-horribly-wrong.html" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">linguistics done horribly wrong</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. We need better ethno-linguistic maps simply to do good social science and address these critical questions. </span></div>
<h2 dir="ltr" style="line-height: 1.38; margin-bottom: 6pt; margin-top: 18pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 16pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">New Methods and Datasets</span></h2>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I believe that, thanks to the greater computing power of modern computers and new datasets available from social media, it is increasingly possible to develop better maps of language distributions using geotagged text data rather than an expert’s opinion. In this blog post, I’ll cover two projects I’ve done to map languages - one using data from Twitter in the Philippines, and another using computationally intensive algorithms to classify toponyms in West Africa.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I should note that for all its hype, big data can be pretty useless without real-world experience. The Philippines and West Africa are two parts of the world where I have spent a good amount of time and have some on-the-ground familiarity with the languages. Thus, I was able to use my local knowledge to inform how I conducted the analyses, as well as to evaluate their issues and shortcomings.</span></div>
<h2 dir="ltr" style="line-height: 1.38; margin-bottom: 6pt; margin-top: 18pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 16pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Case Study 1: Social Media From The Philippines</span></h2>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Many fascinating language maps from Twitter have been created at global scales - see </span><a href="https://www.mapbox.com/labs/twitter-gnip/languages/#6/7.929/110.984" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, and </span><a href="https://www.flickr.com/photos/walkingsf/6277163176/in/photostream" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. However, to explore the distribution of understudied languages that don’t show up in maps of global languages, one must use more bespoke methods. This is especially true of Austronesian languages like those found in the Philippines, which don’t have a lot of phonemic variability, and therefore aren’t easily classified using the methods that Google Translate uses. 
These methods, which rely on slices of the sample text, often confuse Austronesian languages like Tagalog and Bahasa - just look at the maps I mentioned above. Thus, I had to use a word-list method, and created word lists from corpora offered by </span><a href="http://www.sealang.net/" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">SEAlang</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, and by scraping from local-language Wikipedia articles. The resulting maps show where minority languages are used in comparison with English and Tagalog in the Philippines, though they likely underestimate the prevalence of minority languages because the corpora used (Wikipedia and the Bible) are quite different from the Twitter data that was classified.</span></div>
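<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
To make the word-list idea concrete, here is a minimal Python sketch. It is not the code behind the maps in this post: the word lists are tiny hypothetical samples, and the function name, the <code>min_hits</code> threshold and the tie rule are all assumptions made for illustration.</div>

```python
import re
from collections import Counter

# Hypothetical, very small word lists for illustration only.
# Real lists would be drawn from much larger corpora.
WORD_LISTS = {
    "english": {"the", "and", "is", "you", "with", "my"},
    "tagalog": {"ang", "ng", "sa", "ako", "hindi", "mga"},
    "cebuano": {"ug", "kay", "dili", "nako", "unsa", "naa"},
}


def classify(text, min_hits=2):
    """Return the language whose word list matches the most tokens, or None."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = Counter()
    for language, words in WORD_LISTS.items():
        hits[language] = sum(1 for token in tokens if token in words)
    (best, best_count), (_, runner_up_count) = hits.most_common(2)
    # Refuse to guess when there are too few matches or the top two tie.
    if min_hits > best_count or best_count == runner_up_count:
        return None
    return best


print(classify("Hindi ako pupunta sa mall"))
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
A classifier like this only sees words that are in its lists, which is one way the mismatch between the source corpora and informal tweets could lead to minority languages being undercounted.</div>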
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL6P1nyp4IG-JWj52glieayNC-5W8WPZUEDsVT-275AGispeacvC2W77sDf3jfnzuJSxZXMqddFN13QaHWAVfpc81rDLnWM373xPdIejRvtPN2KmE-TRxHoukdk33d024Z3E1bfaJYi7rH/s1600/TweetsMap_MinorityBig.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1056" data-original-width="816" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL6P1nyp4IG-JWj52glieayNC-5W8WPZUEDsVT-275AGispeacvC2W77sDf3jfnzuJSxZXMqddFN13QaHWAVfpc81rDLnWM373xPdIejRvtPN2KmE-TRxHoukdk33d024Z3E1bfaJYi7rH/s640/TweetsMap_MinorityBig.png" width="494" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: 13.6px;">Languages of Tweets in the Philippines.</span></td></tr>
</tbody></table>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt; text-align: center;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The resulting map shows about 125,000 tweets in English, Tagalog, Taglish (using Tagalog and English in the same tweet), and the local languages Cebuano, Ilocano, Hiligaynon, Kapampangan, Bikol, and Waray. This map offers more nuance than traditional language maps of the Philippines. For example, most maps would show Ilocano over the entire northern part of Luzon, but this map shows that the use of Ilocano is much more robust on the northwest coast than in the rest of the north. This analysis also allowed me to test a hypothesis that I frequently heard locals assert when in the Philippines - that English is more common in the south, because southerners would rather use English than Tagalog, which is seen as a northern language. I found this to be the case, and I was only able to confirm it because I had such a large sample size. Without newer datasets like those offered by social media, this hypothesis would be untestable.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">For a more in-depth description of this analysis, see my original blog post </span><a href="https://mcooper.github.io/share/philippines-twitter.html" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">here</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<h2 dir="ltr" style="line-height: 1.38; margin-bottom: 6pt; margin-top: 18pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 16pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Case Study 2: West African Toponyms</span></h2>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Another project I did used toponyms, or place names, from West Africa. Toponym databases like </span><a href="http://geonames.org/" style="text-decoration: none;"><span style="background-color: transparent; color: #1155cc; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">geonames.org</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> have relatively high spatial resolution - with a name for every populated place in an area. And while a place name is not as long as a tweet or other linguistic dataset, toponyms do encode ethno-linguistic information. It would be easy for someone familiar with Europe to distinguish whether a toponym is associated with the French or German linguistic group - a French name would likely begin with “Les” and end with “-elle”, while a German name could begin with “Der” and end with “-berg”. Similar differences exist between toponyms from different ethnic groups all over the world, and are quite evident to locals. What if you could train an algorithm to detect these differences, and then had it classify every single toponym throughout a region? That is what I tried to do in this analysis.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I used toponyms for six countries in French West Africa. I decided to focus on French West Africa for several reasons. For one, I have worked there, and have some familiarity with the ethnic groups of the region and their distributions, and it is an area I am very curious about. For another thing, this is a relatively poorly documented part of the world as far as ethno-linguistic groups go, and it is an area with significant region-scale ethnic diversity. Finally, the countries I selected were colonized by one group, meaning that all of the toponyms were transliterated the same way and could be compared even across national borders. In all, I used 35,785 toponyms.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">First, I got a list of every sequence of three consecutive letters (called a 3-gram) that occurs in the toponyms. Then, I tested for spatial autocorrelation in the locations that contained each 3-gram using a Moran's I test, and selected only those 3-grams that had significant clustering. </span></div>
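<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
As a rough illustration of the first step, the Python sketch below extracts 3-grams and records where each one occurs. The function names and the sample records (with approximate coordinates) are hypothetical, and lowercasing the names and dropping spaces and punctuation are assumptions rather than a description of the original analysis.</div>

```python
from collections import defaultdict


def trigrams(name):
    """Return the set of 3-letter sequences in a place name (letters only)."""
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    return {letters[i:i + 3] for i in range(len(letters) - 2)}


def index_by_trigram(toponyms):
    """Map each 3-gram to the (lat, lon) points whose name contains it."""
    index = defaultdict(list)
    for name, lat, lon in toponyms:
        for gram in trigrams(name):
            index[gram].append((lat, lon))
    return index


# Hypothetical sample records: (name, latitude, longitude).
sample = [
    ("Yamoussoukro", 6.82, -5.28),
    ("Kouassikro", 7.10, -4.20),
    ("Bamako", 12.65, -8.00),
]
index = index_by_trigram(sample)
print(sorted(trigrams("Bamako")))
print(index["kro"])
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The resulting index gives, for each 3-gram, the set of locations that a spatial autocorrelation test can then be run on.</div>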
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">To give an illustration of why this was necessary, here are two examples of the spatial distribution of 3-grams. One 3-gram - "ama" - occurs roughly evenly throughout the regions in this study. The other 3-gram - "kro" - is very common in toponyms in south-east Côte d'Ivoire, and virtually nonexistent in other areas. Thus, "kro" has significant spatial autocorrelation whereas "ama" does not. </span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj49g6b_iTRejgIOyrk9fy4HxRe1-NHZf4UeAKJ7inlrE0NtJlwmE5UIBb9KznQhX6rtPsS_OfbYcnkIbn4i1-xMJNvtpbw7mKomQuA5pbhW0yOavtVyASozNjPw0sEBjqF0FU-afx4TQ6W/s1600/Clustered_3-gram.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="504" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj49g6b_iTRejgIOyrk9fy4HxRe1-NHZf4UeAKJ7inlrE0NtJlwmE5UIBb9KznQhX6rtPsS_OfbYcnkIbn4i1-xMJNvtpbw7mKomQuA5pbhW0yOavtVyASozNjPw0sEBjqF0FU-afx4TQ6W/s320/Clustered_3-gram.jpeg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Here are all of the toponyms that contain the 3-gram "kro"<br />
<div>
<br /></div>
</td></tr>
</tbody></table>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqiZdZfnBFoHvGpWNlEPJlRjUHi7PJSsoMvup5tkO-DpIRsnITDl764TYFe29U3rnemy3xyLNLo3EXYJoJ4YdPb4jpTjzBc8RlN02KWjuLJeWcV3c3_VwLB2o9QVgSguIi2AltBXoBoj2o/s1600/Random_3-gram.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="504" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqiZdZfnBFoHvGpWNlEPJlRjUHi7PJSsoMvup5tkO-DpIRsnITDl764TYFe29U3rnemy3xyLNLo3EXYJoJ4YdPb4jpTjzBc8RlN02KWjuLJeWcV3c3_VwLB2o9QVgSguIi2AltBXoBoj2o/s320/Random_3-gram.jpeg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">And here are all of the toponyms that contain the 3-gram "ama"<br />
<div>
<br /></div>
</td></tr>
</tbody></table>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thus, the 3-gram "ama" doesn't tell us much about which ethnic group a toponym belongs to, because that 3-gram is found evenly distributed throughout West Africa - it is just noise. The 3-gram "kro", on the other hand, carries information about which ethnic group a toponym belongs to, because it is clearly clustered in a group in Southeast Côte d'Ivoire. </span></div>
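<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The contrast between a clustered and an evenly spread 3-gram can be sketched with a small Moran's I calculation in Python. The post does not say which spatial weights or significance test were used, so this sketch assumes binary weights within a distance band (100 km, chosen arbitrarily) and a simple permutation test; the function names and the toy data are hypothetical. Each value is 1 if a toponym contains the 3-gram and 0 if it does not.</div>

```python
import math
import random


def distance_km(p, q):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def morans_i(points, values, band_km=100.0):
    """Moran's I with binary weights: 1 if two points lie within band_km."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num = 0.0
    w_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and band_km >= distance_km(points[i], points[j]):
                num += dev[i] * dev[j]
                w_sum += 1.0
    if denom == 0 or w_sum == 0:
        return 0.0
    return (n / w_sum) * (num / denom)


def permutation_p_value(points, values, trials=199, seed=0):
    """Share of random relabelings with Moran's I at least as high as observed."""
    rng = random.Random(seed)
    observed = morans_i(points, values)
    shuffled = list(values)
    at_least = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if morans_i(points, shuffled) >= observed:
            at_least += 1
    return (at_least + 1) / (trials + 1)


# Toy data: ten places in a north-south line, about 56 km apart.
points = [(0.5 * k, 0.0) for k in range(10)]
clustered = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]    # 3-gram found only in the south
alternating = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]  # 3-gram spread along the line
print(morans_i(points, clustered), morans_i(points, alternating))
print(permutation_p_value(points, clustered))
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
A 3-gram like "kro" would behave like the clustered example, with a clearly positive Moran's I and a small p-value, and would be kept; a 3-gram like "ama" would not pass the test and would be dropped.</div>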
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">I then calculated the lexical distance between all of the toponyms based on the number of shared 3-grams that had significant spatial autocorrelation. To add a spatial component, I also linked any two toponyms that were less than 25 kilometers apart. Thus, I had a graph where every toponym was a vertex, and undirected edges connected toponyms that had spatial or lexical affinity. Finally, I used a fast greedy modularity-optimizing algorithm to detect communities, or clusters, in this graph. </span></div>
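<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The graph construction and clustering can be sketched in Python as follows. This is a simplified stand-in, not the code used for the map: it uses unweighted edges, a hypothetical <code>min_shared</code> threshold for lexical links, and a naive greedy merge that optimizes the same modularity objective as the fast greedy algorithm but without its efficient data structures. The toponyms and their coordinates are invented for the example.</div>

```python
import math
from itertools import combinations


def trigrams(name):
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    return {letters[i:i + 3] for i in range(len(letters) - 2)}


def distance_km(p, q):
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def build_edges(toponyms, informative, max_km=25.0, min_shared=1):
    """Link toponyms that share informative 3-grams or lie within max_km."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(toponyms), 2):
        shared = trigrams(a[0]).intersection(trigrams(b[0]), informative)
        near = max_km > distance_km(a[1:], b[1:])
        if len(shared) >= min_shared or near:
            edges.append((i, j))
    return edges


def greedy_modularity(n_nodes, edges):
    """Merge connected communities while a merge still raises modularity."""
    m = len(edges)
    community = {i: {i} for i in range(n_nodes)}  # label -> member nodes
    degree = {i: 0 for i in range(n_nodes)}       # label -> total degree
    between = {}                                  # pair of labels -> edge count
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        between[frozenset((i, j))] = 1
    while True:
        best_gain, best_pair = 0.0, None
        for pair, e_ab in between.items():
            a, b = tuple(pair)
            # Change in modularity if communities a and b were merged.
            gain = e_ab / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break
        a, b = best_pair
        community[a] |= community.pop(b)
        degree[a] += degree.pop(b)
        del between[frozenset((a, b))]
        # Re-point every remaining edge count of b at the merged community a.
        for pair in [p for p in between if b in p]:
            (other,) = pair - {b}
            key = frozenset((a, other))
            between[key] = between.get(key, 0) + between.pop(pair)
    return list(community.values())


# Invented records: (name, latitude, longitude).
toponyms = [
    ("Akro", 6.0, -4.0), ("Bekro", 6.0, -3.0), ("Tikro", 7.0, -3.5),
    ("Sandougou", 12.0, -8.0), ("Kadougou", 12.5, -7.0),
    ("Madougou", 13.0, -8.5), ("Zanzan", 6.05, -4.05),
]
edges = build_edges(toponyms, informative={"kro", "dou"})
communities = greedy_modularity(len(toponyms), edges)
print(communities)
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
In this toy example the last place shares no informative 3-gram with any other name, so it can only join a cluster through the 25 km spatial link to its nearest neighbor, which is the role the spatial edges play in the graph.</div>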
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 13pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Results</span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The algorithm found seven distinct communities, which largely correspond to ethnic groups and ethnic macro-groups in West Africa. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTrNT_JytE9HrYWkI5_J7anjmzQD7I4DBoMrHgRddPpd3PY7aTfYXqLVkgVsaI8smc2RjNDcLKj4BqvTee8El1OgqbLYKHbsOISwRGjjZEuvdw_X5FDvykgEGsIQcVDn2oj635dQDCX5oQ/s1600/villnames.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1192" data-original-width="1600" height="476" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTrNT_JytE9HrYWkI5_J7anjmzQD7I4DBoMrHgRddPpd3PY7aTfYXqLVkgVsaI8smc2RjNDcLKj4BqvTee8El1OgqbLYKHbsOISwRGjjZEuvdw_X5FDvykgEGsIQcVDn2oj635dQDCX5oQ/s640/villnames.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">More info at: <a data-saferedirecturl="https://www.google.com/url?q=https://mcooper.github.io/share/vill-names.html&source=gmail&ust=1594060327726000&usg=AFQjCNEBM7Na7AmvkOsJ7zGjaumJaaEj9w" href="https://mcooper.github.io/share/vill-names.html" style="background-color: white; color: #1155cc; font-family: Arial, Helvetica, sans-serif; font-size: small; text-align: start;" target="_blank">https://mcooper.github.io/<wbr></wbr>share/vill-names.html</a></td></tr>
</tbody></table>
</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The red cluster includes Wolof, Serer, and Fulfulde place names, which makes sense, as all of these groups are Senegambian languages. This group of languages is the primary group in Senegal and Mauritania, which my classification picked up on. It also caught the large Fulfulde presence in central Guinea, throughout an area known as the Fouta-Djallon. This cluster also has a significant presence throughout the Sahel, stretching into Burkina Faso and dotted throughout the rest of West Africa, much like the migrant Fulfulde people.</span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The green cluster captures most of the area where Mandé languages are spoken, including most of Mali, where the Bambara are found, as well as Eastern Guinea and Northern Côte d'Ivoire, where Malinké is found. Interestingly, most of the toponyms in Western Mali fell into the Senegambian/Fulfulde cluster, and were not in the Mandé cluster, even though there are Mandé groups like the Soninké and Khassonké in Western Mali. Southern Guinea is densely green, representing the presence of Mandé groups there, like the Kuranko. Surprisingly, much of central and southern Côte d'Ivoire also fell into the green cluster, even though there are a couple of different groups there which are not in any way related to the Mandé groups that were most represented in the green cluster. This is also true of areas in Western Burkina Faso and Eastern Mali, where there are many languages unrelated to the broader Mandé group, such as Dogon, Bobo, Minianka, and Senufo/Syempire. However, I know that Dyula, a Mandé language closely related to Bambara, is spoken as a trade language in both of these areas (Côte d'Ivoire and Western Burkina Faso). It could be that Dyula has had a long enough presence in these areas to leave an imprint on the toponyms there. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The purple group pretty clearly captured two different disjoint groups that are both in the broader Mandé group - the Susu, in far Western Guinea, and the Dan, in Western Côte d'Ivoire. These groups are normally classified as being on quite separate branches of the Mandé language family, with the Susu being Northern Mandé and Dan being Eastern Mandé. However, the fact that the algorithm put them in the same group, even though they were too far apart to have edges/connections based on spatial affinity, shows that Dan and Susu toponyms have several 3-grams in common. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The yellow cluster seems to have caught two sub-groups within the broader green/Mandé cluster. Many of the yellow toponyms in central Mali are in what you could call the Bambara homeland, between Bamako and Segou. However, a second cluster stands out quite distinctly in southern Guinea. It's unclear to me what group this could represent and why it would have toponymic features distinct enough from its neighbors that the algorithm put it in a different cluster. Some maps say that a group called the Konyanka lives here and speaks a language closely related to Malinké. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The turquoise cluster quite clearly captures the Mossi people and their toponyms, as well as the Gurunsi, a related group (both Mossi and Gurunsi are classified as Gur languages). </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The black cluster in southern Burkina Faso captured a group that most national ethno-linguistic maps call the Lobi, although this part of West Africa is known for its significant ethno-linguistic heterogeneity. Another group of villages in Eastern Burkina Faso also fell into the black cluster, although I could not find any significant ethnic group found there. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Finally, the blue cluster captured both the Baoulé/Akan languages as well as the Senufo. It captured the Senufo especially in Côte d'Ivoire and somewhat in Burkina Faso, but not much in Mali, where I know the Senufo have a significant presence. This could represent a Bambarization of previously Senufo toponyms due to the fact that the government of Mali is predominantly Bambara, or it could pre-date the Malian state, as this area was part of Samori Toure's Wassoulou Empire, in which the Malinké language was strongly enforced. The classification of the Senufo languages has always been controversial, but this toponymic analysis suggests that Senufo toponyms have more in common with Kwa toponyms to the south than with Gur toponyms to the northeast. </span></div>
<h3 dir="ltr" style="line-height: 1.38; margin-bottom: 4pt; margin-top: 14pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 13pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Caveats</span></h3>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Some caveats apply to this work and its interpretation. For one, this only shows toponymic affinities. Those affinities usually correspond to ethnic distributions, but not always. There is a lot of migration in West Africa today, and place names don't usually change as quickly as the distributions of people. Toponyms can therefore encode historic ethnic distributions: for example, many toponyms in the United States come from Native American languages, and there are many toponym suffixes in England that reflect </span><span style="background-color: transparent; color: #0066ff; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://www.viking.no/e/england/danelaw/epl-danelaw.htm">a historic Nordic presence</a></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. Thus, this and similar maps are most informative when interpreted in combination with on-the-ground information and knowledge. </span></div>
<b style="font-weight: normal;"><br /></b>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Another issue with classifying toponyms in West Africa in particular is that West African toponyms are transcribed using the Latin alphabet, which definitely does not capture all of the sounds that exist in West African languages. Different extensions of the Latin alphabet, as well as </span><span style="background-color: transparent; color: #0066ff; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="https://en.wikipedia.org/wiki/N'Ko_alphabet">an indigenous alphabet</a></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, are often used to transcribe these languages; however, these language-specific ways of writing are not used in the GeoNames dataset. Thus, the Fulfulde bilabial implosive (/ɓ/ in IPA) is written the same way as a pulmonic bilabial plosive - as a "b", so this distinction is lost in our dataset, even though it adds a lot of information about what ethnic group a given toponym belongs to. However, some other sounds and sound combinations that are very indicative of specific languages are captured by the Latin alphabet, for example prenasalized consonants (/mb/) common in Senegambian languages, labial velars (/gb/ and /kp/) common in coastal languages, or the lack of a 'v' in Mandé languages. 
Issues also arise with how different colonizers transcribe sounds differently, for example 'ny' and 'kwa' in English would be 'gn' and 'coua' in French. However, this didn't apply in this analysis, which only used Francophone countries, and I believe it could be dealt with if I tried to do a larger analysis. </span></div>
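<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">As a rough idea of how the colonial-spelling issue might be handled in a larger analysis, here is a small Python sketch. It is an assumption about one possible approach, not something I ran for this post: the rule list contains only the two correspondences mentioned above, the function name is hypothetical, and a real rule set would need to take context into account. </span></div>

```python
# Hypothetical rewrite rules based only on the two examples in the text
# (French 'gn' ~ English 'ny', French 'coua' ~ English 'kwa').
FRENCH_TO_COMMON = [("coua", "kwa"), ("gn", "ny")]


def normalize_spelling(name, rules=FRENCH_TO_COMMON):
    """Lower-case a toponym and apply each spelling rule in order."""
    result = name.lower()
    for french, common in rules:
        result = result.replace(french, common)
    return result


# Made-up string for illustration only, not a real toponym from the dataset.
print(normalize_spelling("Couagna"))
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Normalizing spellings before extracting three-grams would keep the same sound from being counted as two different features just because two colonial orthographies wrote it differently. </span></div>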
<h1 dir="ltr" style="line-height: 1.38; margin-bottom: 6pt; margin-top: 20pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 20pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Conclusion</span></h1>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">This is an exciting time to be at the intersection of geography and linguistics! New datasets and computational methods are giving researchers the ability to ask newer and better questions about who belongs to what group, and where. I hope new developments in this research can yield new linguistic results about phylogeny, migration, and the spread of linguistic phenomena. Outside of the field of linguistics, better language maps could have broad applications, from improving disaster response planning to helping to answer critical questions about the origins of ethnic conflict.</span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span>
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thanks for reading! You can check out my <a href="http://mcooper.github.io/">personal website</a> for more detailed descriptions of these two projects, as well as other side projects I've done.</span></div>
<div>
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><br /></span></div>
</div>
Matthttp://www.blogger.com/profile/17581909151763050200noreply@blogger.com1tag:blogger.com,1999:blog-1300680252997007251.post-58640326898867180312017-06-01T17:56:00.002+10:002017-12-01T09:18:02.795+11:00World map of language families from Glottolog<div dir="ltr" style="text-align: left;" trbidi="on">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhJi-unMosWi-jgAHKBnX9Xr2LUarpN9PkqdFjBh7IUOUAnHk7dUT94VzWzbZ4w11arTq8rhqo85aP-sBPfVOHNkIHGyamuCIrXYQT3DnuAe15c6vvdiFfgW09GnOpxL_CDK6bTeL6KX0Z/s1600/world.tiff" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" height="311" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhJi-unMosWi-jgAHKBnX9Xr2LUarpN9PkqdFjBh7IUOUAnHk7dUT94VzWzbZ4w11arTq8rhqo85aP-sBPfVOHNkIHGyamuCIrXYQT3DnuAe15c6vvdiFfgW09GnOpxL_CDK6bTeL6KX0Z/s640/world.tiff" width="640" /></span></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">World map from Glottolog. Each language is one dot, coloured by language family (or other top-genetic unit).</span></td></tr>
</tbody></table>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Language families are the main way we categorise and understand the language diversity of the world. A language family is a group of languages that have been analysed as having one ancestor, one great-great-great-and-yet-greater-grand-mother language. Indo-European is a language family, with the sub-groups of Romance, Germanic, Slavic etc.</span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;"><span style="font-family: "times" , "times new roman" , serif;">Maps are great tools for visualising information, and </span><a href="http://humans-who-read-grammars.blogspot.com.au/search/label/maps" style="font-family: Times, "Times New Roman", serif;">we're pretty map-nerdy on this blog.</a> <span style="font-family: "times" , "times new roman" , serif;">Robert Forkel, one of the editors of Glottolog, kindly shared with me an interactive map of the world, with languages plotted and coloured by language family. The map is rendered in a web browser from an HTML file and a JSON file.</span></span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">This map is not available on the Glottolog site, but will later be implemented in the command-line interface. </span><span style="font-family: "times" , "times new roman" , serif;">You can see language families on the website by either selecting a country or a specific family. This tool is the only way to see all language families in all countries on Glottolog. </span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">I will let you know when this is implemented and you can play with it yourself. In the meantime, I thought I'd share this screenshot and talk a little bit about language</span><span style="font-family: "times" , "times new roman" , serif;"> families.</span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">A</span><span style="font-family: "times" , "times new roman" , serif;">lso:</span></span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
</span><br />
<ul style="text-align: left;">
<li><a href="http://glottolog.org/meta/downloads" style="font-family: Times, "Times New Roman", serif;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Here is a link to files you can download from Glottolog today, including geo-location </span></a></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><a href="https://github.com/clld/glottolog/blob/master/README.md#cli">Here is a link to the command line interface description</a> (you'll need git and python)</span></li>
</ul>
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<b><span style="font-family: "times" , "times new roman" , serif;">Some notes on language families, and in particular Glottolog language families and this map</span></b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;">When we look at the collected wisdom of linguistic scholars, we actually find a lot of disagreement. For example, Ethnologue counts 135 language families and Glottolog counts 239!* </span><a href="http://humans-who-read-grammars.blogspot.com.au/2015/11/the-other-languages-in-ethnologue.html" style="font-family: Times, "Times New Roman", serif;">To read more about this, please go to this post on the "other" languages of Glottolog and Ethnologue, and how the two catalogues define these categories.</a></span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">Due to lack of data and disagreements, we also have very different estimates for language family depth, i.e. how long ago the greatest-grand-mother language was spoken. Here are some examples:</span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<style>
<!--table
{mso-displayed-decimal-separator:"\.";
mso-displayed-thousand-separator:"\,";}
@page
{margin:1.0in .75in 1.0in .75in;
mso-header-margin:.5in;
mso-footer-margin:.5in;}
td
{padding-top:1px;
padding-right:1px;
padding-left:1px;
mso-ignore:padding;
color:black;
font-size:12.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Calibri, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;}
.xl63
{font-weight:700;
font-family:"Times New Roman";
mso-generic-font-family:auto;
mso-font-charset:0;
text-align:left;}
.xl64
{font-family:"Times New Roman";
mso-generic-font-family:auto;
mso-font-charset:0;
text-align:left;}
.xl65
{color:#222222;
font-family:"Times New Roman";
mso-generic-font-family:auto;
mso-font-charset:0;
text-align:left;}
.xl66
{color:#222222;
font-family:"Times New Roman";
mso-generic-font-family:auto;
mso-font-charset:0;
mso-number-format:"\#\,\#\#0";
text-align:left;}
.xl67
{font-family:"Times New Roman";
mso-generic-font-family:auto;
mso-font-charset:0;
mso-number-format:"\#\,\#\#0";
text-align:left;}
-->
</style>
</span><br />
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; width: 168px;">
<!--StartFragment-->
<colgroup><col style="mso-width-alt: 4394; mso-width-source: userset; width: 103pt;" width="103"></col>
<col style="width: 65pt;" width="65"></col>
</colgroup><tbody>
<tr height="15" style="height: 15.0pt;">
<td class="xl63" height="15" style="height: 15.0pt; width: 103pt;" width="103"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Language
family</span></td>
<td class="xl63" style="width: 65pt;" width="65"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Proposed age (years ago)</span></td>
</tr>
<tr height="15" style="height: 15.0pt;">
<td class="xl64" height="15" style="height: 15.0pt;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Afro-Asiatic</span></td>
<td class="xl65"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">9,500 - 18,000</span></td>
</tr>
<tr height="15" style="height: 15.0pt;">
<td class="xl64" height="15" style="height: 15.0pt;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Algic</span></td>
<td class="xl66"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">7,000</span></td>
</tr>
<tr height="15" style="height: 15.0pt;">
<td class="xl64" height="15" style="height: 15.0pt;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Austronesian</span></td>
<td class="xl64"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">6,000-8,000</span></td>
</tr>
<tr height="15" style="height: 15.0pt;">
<td class="xl64" height="15" style="height: 15.0pt;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Dravidian</span></td>
<td class="xl67"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">6,000</span></td>
</tr>
<tr height="15" style="height: 15.0pt;">
<td class="xl64" height="15" style="height: 15.0pt;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Indo-European</span></td>
<td class="xl67"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">5,500</span></td>
</tr>
<!--EndFragment-->
</tbody></table>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">In this case, we're using the language families (and other top-genetic units) from Glottolog. Glottolog is a carefully curated catalogue of languages, and for each grouping there is always a reference provided to where in the academic literature we can find support for exactly how the tree is structured. This is very helpful. With this said, it's worth noting that Glottolog often tends to be more "splitting" (not lumping languages into very large families) than other similar resources, like Ethnologue. In general, Glottolog often represents a more conservative view of language history.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">Glottolog also contains other kinds of groupings besides what we commonly think of as "families", for example: unattested, sign languages, isolates, pidgins, artificial etc. More on this <a href="http://humans-who-read-grammars.blogspot.com.au/2015/11/the-other-languages-in-ethnologue.html">here</a>.</span></span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><span style="font-family: "times" , "times new roman" , serif;"></span><br /></span>
<span style="font-family: "times" , "times new roman" , serif;"><span style="font-family: "times" , "times new roman" , serif;"></span>
<span style="font-family: "times" , "times new roman" , serif;">Please remember when you look at these maps that:</span></span></span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
</span><br />
<ul style="text-align: left;">
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">the stacking of dots matters: Nigeria, for example, looks more full of Atlantic-Congo languages than it is (see the images below). Zoom in on denser areas</span></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">the colours on this map were not picked manually, but assigned automatically</span></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Creoles are in the family of their lexifier</span></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">there are other groupings besides traditional language families in the dataset</span></li>
<li><a href="http://humans-who-read-grammars.blogspot.com.au/2014/12/linguistic-diveristy-important-things.html"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">these are dots, not polygons</span></a></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">this will be implemented as a command line tool, so you should get your git and python on in order to make these yourself.</span></li>
</ul>
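<span style="font-family: "times" , "times new roman" , serif; font-size: large;">To illustrate the point about stacked dots, here is a small Python sketch that counts how many points share a grid cell at a given resolution. It is only an illustration and not part of Glottolog or its command-line interface: the function name, the grid approach and the coordinates are all made up for this example.</span><br />

```python
from collections import Counter


def stacked_cells(points, cell_size=1.0):
    """Count how many points fall into each cell of a regular lat/lon grid.

    Returns only the cells holding more than one point, i.e. the places
    where dots would be drawn on top of each other at that resolution.
    """
    counts = Counter(
        (int(lat // cell_size), int(lon // cell_size)) for lat, lon in points
    )
    return {cell: n for cell, n in counts.items() if n > 1}


# Hypothetical coordinates for illustration, not taken from Glottolog.
points = [(9.1, 8.2), (9.4, 8.7), (9.9, 8.1), (12.5, 4.0)]
print(stacked_cells(points))                 # coarse grid: some dots overlap
print(stacked_cells(points, cell_size=0.5))  # finer grid, like zooming in
```

<span style="font-family: "times" , "times new roman" , serif; font-size: large;">A coarse grid behaves like the zoomed-out world map, where nearby languages collapse onto one spot and only the top dot's colour is visible. A finer grid behaves like zooming in.</span><br />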
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
</span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsCVuvApC8s1LmtqYoxt2gVY5UXJ1czJI8MghJSqva0nTk-LMhQ9UQ4UWVm38_vZ4BicAw5wmQPLXkye37R60kRW0Kj4FKdRGd0fPdOI8KI2fCvhgPo3hB2irOwt6e83Ez-57B_9XRYOYF/s1600/nigeria_bad.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="60" data-original-width="66" height="181" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsCVuvApC8s1LmtqYoxt2gVY5UXJ1czJI8MghJSqva0nTk-LMhQ9UQ4UWVm38_vZ4BicAw5wmQPLXkye37R60kRW0Kj4FKdRGd0fPdOI8KI2fCvhgPo3hB2irOwt6e83Ez-57B_9XRYOYF/s200/nigeria_bad.png" width="200" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Nigeria in the world map at the top of the post</span></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixHkaETydQB305TBmsA8mWEvoItCuOkPV3I2WTetwXj4vmMNpxyInMikYHAZAmHQY8qtyHpeANuvCdWAXUXGgdEwq2KkT3UZFCHDOsznk_dmtdfPfSKssshzqKRS8ryrKmQjcY7fqA8Ldk/s1600/nigeria.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="840" data-original-width="1083" height="155" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixHkaETydQB305TBmsA8mWEvoItCuOkPV3I2WTetwXj4vmMNpxyInMikYHAZAmHQY8qtyHpeANuvCdWAXUXGgdEwq2KkT3UZFCHDOsznk_dmtdfPfSKssshzqKRS8ryrKmQjcY7fqA8Ldk/s200/nigeria.png" width="200" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Nigeria zoomed in</span></td></tr>
</tbody></table>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;">Here are some more zoomed-in areas for your enjoyment:</span></span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2JXynXshkBzfFGSEmuRzdDkuu7-zBdY7egZ-YlYPXQNWBub2v5rN0rWsDp6Qoji7NtpeO-kDR-OTxNCOR8ZxfILhy-QHtb52uI5wOqmB8vxMQUuNjGLp4NNvWI132Gp6Ixb_SpPUVAZL2/s640/new+guinea.tiff" style="margin-left: auto; margin-right: auto;" width="640" /></span></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">The island of New Guinea</span></td></tr>
</tbody></table>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPcWuH7LJKPPegsFkecoxOqDYui7wmdVOZT5mrafFb3zhAw0IBCIeEHc-mIhXB2h6vzwuabuJHULErzcGZhP9wW2q10MovfkQ7ZN-3BsMBwAsKupOmX_1I7cHl_TUhVFtQonKob_qVrfyJ/s1600/msea.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="610" data-original-width="704" height="346" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPcWuH7LJKPPegsFkecoxOqDYui7wmdVOZT5mrafFb3zhAw0IBCIeEHc-mIhXB2h6vzwuabuJHULErzcGZhP9wW2q10MovfkQ7ZN-3BsMBwAsKupOmX_1I7cHl_TUhVFtQonKob_qVrfyJ/s400/msea.png" width="400" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Mainland South East Asia</span></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0IAo-joXCcvpbYlTttcH96FUtrVJg71poDCAU7CnZbAnOWtjDgDpxVOs-isaOm_FzsQQ-4CMYih4RmhIsfpj3gN_cUB19weQkU-gKvwy4HfBmJlVq0ado7bAB-OrsT3oUOYqhXobrGqjG/s1600/south+america+top.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="484" data-original-width="630" height="306" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0IAo-joXCcvpbYlTttcH96FUtrVJg71poDCAU7CnZbAnOWtjDgDpxVOs-isaOm_FzsQQ-4CMYih4RmhIsfpj3gN_cUB19weQkU-gKvwy4HfBmJlVq0ado7bAB-OrsT3oUOYqhXobrGqjG/s400/south+america+top.png" width="400" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Northern South America</span></td></tr>
</tbody></table>
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<b><span style="font-family: "times" , "times new roman" , serif;">Language Family Tournament</span></b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">On a sillier note, the Facebook page <a href="https://www.facebook.com/etymologymemes/?fref=ts" style="font-family: Times, "Times New Roman", serif;">Etymology Memes for Reconstructed Phonemes</a> recently ran a tournament where followers could vote for which was their favourite language family from a set of 24. Since this is related to the content of this blog post, I'll share those results as well!</span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="901" data-original-width="1600" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxkdITntZCCoHvh6e2ctZ4S2kfjY-M8IyX7d46maSPoUo9gHdxwj8Mcl9q1E_MpggMpeW4nsgi_NrSbSuAPeHYpdb_kaFOXGOsHrzA3PPDxiIJiBUOcEOGhi40xl_-rVGI0XQp6sA4fam1/s640/18814524_1565690420108204_2938862255269251953_o.png" style="margin-left: auto; margin-right: auto;" width="640" /></span></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">A tournament on Facebook where followers of the page </span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><a href="https://www.facebook.com/etymologymemes/?fref=ts">"Etymology Memes for Reconstructed Phonemes"</a> could vote for which was their favourite language family.</span></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_jckZqFSVT2Z-iKvcThhnTpWzmwaBN-um6ETv07RIpk_q4MQ8D1QiHQF-bQBTzd6aDOmCpEYTFAQS1tVHMPEBszbtD1NUWKEeg-g41vMQtnyzw3FxBoksjIEMLOwzd9zaINMGZY5xXwuX/s1600/18838986_1565715916772321_167580056530580179_n.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;"><img border="0" data-original-height="540" data-original-width="960" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_jckZqFSVT2Z-iKvcThhnTpWzmwaBN-um6ETv07RIpk_q4MQ8D1QiHQF-bQBTzd6aDOmCpEYTFAQS1tVHMPEBszbtD1NUWKEeg-g41vMQtnyzw3FxBoksjIEMLOwzd9zaINMGZY5xXwuX/s400/18838986_1565715916772321_167580056530580179_n.jpg" width="400" /></span></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-family: "times" , "times new roman" , serif; font-size: large;">The winner of said contest, Basque</span></td></tr>
</tbody></table>
<b><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Other ways of categorising languages besides language families</span></b><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;">There are other ways of categorising languages than into language families, most notably into <u>geographic areas</u>. It seems that languages that are in contact influence each other. Furthermore, it is not necessarily true that all parts of a language (sound system, vocabulary, grammar, syntax, etc.) have one and only one shared ancestry: <u>there could be multiple underlying trees for different parts of a language. </u></span><span style="font-family: "times" , "times new roman" , serif;">It may be that the counting system was borrowed from neighbour x and some phonemes were imported from neighbour y. Another reason for multiple trees is dialect chains breaking up and coming together again, which becomes hard to detect once enough time has passed.</span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">Besides these approaches, we can also categorise languages into types (suffixing, tonal, CVCV, VSO, isolating, etc.). This is what typologists do. Knowing the distribution of various traits in the world's languages, we can not only investigate language history, but also ask questions such as:</span></span><br />
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
</span><br />
<ul style="text-align: left;">
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">are certain traits correlated with each other?</span></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">are there trade-offs between traits, for example to minimize complexity?</span></li>
<li><span style="font-family: "times" , "times new roman" , serif; font-size: large;">are there cognitive constraints on combinations of traits?</span></li>
</ul>
<span style="font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">Ok, that's it for now. Hope you enjoyed this!</span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif;">Bye, </span></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Hedvig</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif; font-size: xx-small;">* In order to make a fair comparison, I've excluded some special cases that the two catalogues deal with in very different ways or that we have very little data on. For Ethnologue, I've excluded:<span style="background-color: white; color: #323232; vertical-align: baseline; white-space: pre-wrap;"> </span><span style="background-color: white; color: #323232; vertical-align: baseline; white-space: pre-wrap;">constructed languages (1), creoles (88), deaf sign languages (137), language isolates, mixed languages (21), pidgins (13), and unclassified languages (51). For Glottolog I've </span><span style="color: #323232;"><span style="white-space: pre-wrap;">excluded</span></span><span style="background-color: white; color: #323232; white-space: pre-wrap;"> pidgins (79), isolates (198), mixed languages (23), artificial (9), speech registers (6), “unattested” (61), “unclassifiable” (117) and sign languages (166). Creoles in Glottolog are classified under their lexifier family, making them hard to count, but they don’t increase the number of families. There are 37 languages with "creole" or "kriol" in their name in Glottolog, but I didn't subtract these since they belong to families that also contain non-contact languages.</span></span></span></div>
</div>
Hedvig Skirgårdhttp://www.blogger.com/profile/03689179680848604827noreply@blogger.com1