The validity of mixed-effects regression for analysing linguistic distance matrices: a simulation study | Amsterdam University Press Journals Online
2004
Volume 75, Issue 1
  • ISSN: 0039-8691
  • E-ISSN: 2215-1214

Abstract

Recent work in dialectometry has proposed the use of linear mixed-effects regression (LMER) for analysing full distance matrices. While the outcomes are promising, further work is needed to confirm that they are valid, given that the analysis of distance matrices with this method is not yet established. The current contribution provides a supporting framework for this approach by testing its validity on a series of simulated datasets. We analysed the generated data using LMER and compared its performance to that of the well-established multiple regression on distance matrices (MRM) approach. We find that the LMER results are on par with, and sometimes even exceed, those obtained from MRM. The potential to include random effects makes LMER a more powerful tool than MRM for examining a linguistic area as a whole, with all pairwise comparisons included. This makes it an ideal candidate for the big data analyses that are becoming more prevalent with the ongoing digitisation of large dialect databases.
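
To make the MRM baseline mentioned above more concrete, the following is a minimal sketch of a single-predictor regression on distance matrices with a permutation test, run on one simulated dataset. It is an illustration under stated assumptions, not the simulation design or software used in the study: the site coordinates, the slope of 0.5, the noise level, and all function names (`upper_triangle`, `ols_slope`, `permute_matrix`, `mrm_single_predictor`) are hypothetical choices made for this example.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal as a flat list (one per site pair)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permute_matrix(matrix, order):
    """Reorder rows and columns together, so whole sites are shuffled, not single pairs."""
    return [[matrix[i][j] for j in order] for i in order]


def mrm_single_predictor(response, predictor, n_perm=199, seed=1):
    """Return the observed slope and a two-sided permutation p-value."""
    x = upper_triangle(predictor)
    observed = ols_slope(x, upper_triangle(response))
    rng = random.Random(seed)
    order = list(range(len(response)))
    hits = 1  # the observed statistic counts as one of the permutations
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted = ols_slope(x, upper_triangle(permute_matrix(response, order)))
        if abs(permuted) >= abs(observed):
            hits += 1
    return observed, hits / (n_perm + 1)


# Simulated data (assumed for illustration): 20 sites on a 100 x 100 plane.
rng = random.Random(42)
sites = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(20)]
n = len(sites)
geo = [[math.dist(a, b) for b in sites] for a in sites]

# Linguistic distance = 0.5 * geographic distance + Gaussian noise, clamped at zero.
ling = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        value = max(0.0, 0.5 * geo[i][j] + rng.gauss(0, 5))
        ling[i][j] = ling[j][i] = value

slope, p_value = mrm_single_predictor(ling, geo)
print(f"estimated slope: {slope:.3f}, permutation p-value: {p_value:.3f}")
```

Permuting rows and columns jointly is what separates this test from an ordinary regression on the vectorised distances: the pairwise observations sharing a site are not independent, so whole sites are shuffled instead. A mixed-effects analysis would address the same dependence differently, for instance through random effects for the sites involved in each pair, which this sketch does not attempt.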

/content/journals/10.5117/TET2023.1.004.HUIS
2023-09-01
2023-12-07
