Showing posts from October, 2024

Language population data from Google (FREE!)

Google's linguistics research team have put together a great big table of speaker population sizes and other language metadata, with population data for over 5,000 languages. It's free, it includes sources, and you can get it here: https://github.com/google-research/url-nlp/tree/main/linguameta

The proper citation is:

@InProceedings{ritchie-etal-2024-linguameta-unified,
  author = {Ritchie, Sandy and van Esch, Daan and Okonkwo, Uche and Vashishth, Shikhar and Drummond, Emily},
  title = {LinguaMeta: Unified metadata for thousands of languages},
  booktitle = {Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation},
  month = {May},
  year = {2024},
  address = {Torino, Italy},
  publisher = {European Language Resources Association},
  pages = {10530--10538},
  abstract = {We introduce LinguaMeta, a unified resource for language metadata for thousands of languages, includi...