Consistency and variability in acceptability judgments from naive native speakers

Gert-Jan Schoenmakers; Roeland van Hout

doi:10.5117/NEDTAA2024.1.005.SCHO

ISSN: 1384-5845
E-ISSN: 2352-1171

Authors: Gert-Jan Schoenmakers¹ & Roeland van Hout²

¹ Utrecht University ² Radboud University
Publisher: Amsterdam University Press
Source: Nederlandse Taalkunde, Volume 29, Issue 1, Jun 2024, p. 49 - 75
DOI: https://doi.org/10.5117/NEDTAA2024.1.005.SCHO
Language: English
Published online: 01 Jun 2024

Abstract

Syntactic theories are typically constructed on the basis of acceptability judgments. These judgments are increasingly collected experimentally, from larger sets of linguistically naive participants. An important assumption is that participants have a very clear understanding of what they are asked to do, which can be assessed by establishing their internal consistency. The question we address in this paper is whether ‘human measuring instruments’ are consistent in their judgments. To this end, we re-examined the judgment data from Schoenmakers (2023), where three types of violations of the prescriptive norm and object scrambling sentences were evaluated. We used Generalizability Theory to investigate the degree of covariation in the judgments and found that the internal consistency was poor in the norm violation item sets, but excellent in the scrambling item set. A difference between the data patterns is that the former item sets led to ‘sledgehammer’ effects between the stigmatized and non-stigmatized variants, which left little room for participant variation. Our analyses show that judgments from naive native speakers can adequately serve linguistic theorizing, both in the case of stigmatized and non-stigmatized variation. Furthermore, we performed cluster analyses to identify subgroups of participants and so get a better grasp of the variation in the data set. We conclude that specific statistical analyses can help to understand data and advance linguistic theory building.
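
As a rough illustration of what an internal-consistency estimate in Generalizability Theory involves, the sketch below computes a relative G coefficient for a toy, fully crossed design with one score per cell. The layout (rows as objects of measurement, columns as conditions of a single facet), the function name `g_coefficient`, and the ratings are all assumptions made for illustration. They are not the design, code, or data of the study; the reference list cites R and the gtheory package, so the original analyses were presumably run there.

```python
def g_coefficient(scores):
    """Relative G coefficient for a fully crossed rows x columns design.

    Assumption: rows are the objects of measurement, columns are the
    conditions of one facet, and each cell holds exactly one score.
    """
    n_r = len(scores)
    n_c = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_r * n_c)
    row_means = [sum(row) / n_c for row in scores]
    col_means = [sum(row[j] for row in scores) / n_r for j in range(n_c)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = n_c * sum((m - grand) ** 2 for m in row_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_r)
        for j in range(n_c)
    )
    ms_res = ss_res / ((n_r - 1) * (n_c - 1))

    # Estimated variance components; negative estimates are set to zero.
    var_rows = max((ms_rows - ms_res) / n_c, 0.0)
    var_res = ms_res

    denom = var_rows + var_res / n_c
    if denom == 0:
        raise ValueError("no variance in the scores")
    return var_rows / denom


# Hypothetical ratings on a 1-5 scale: 5 rows x 4 columns.
toy_ratings = [
    [5, 4, 5, 3],
    [2, 3, 1, 2],
    [4, 4, 5, 5],
    [1, 2, 2, 1],
    [3, 5, 4, 4],
]
print(round(g_coefficient(toy_ratings), 3))
```

For this single-facet layout, and as long as the row variance estimate is not truncated at zero, the coefficient coincides with Cronbach's alpha computed on the same matrix. Designs with more facets need additional variance components and are better handled by dedicated software.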


References

Bates, Douglas, Martin Mächler, Ben Bolker & Steve Walker (2015). Fitting linear mixed effects models using lme4. Journal of Statistical Software 67(1), 1-48. https://doi.org/10.18637/jss.v067.i01
Birdsong, David (1989). Metalinguistic performance and interlinguistic competence. Berlin: Springer. https://doi.org/10.1007/978-3-642-74124-1
Blevins, Juliette (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486357
Brennan, Robert (2001). Generalizability Theory. Berlin: Springer. https://doi.org/10.1007/978-1-4757-3456-0_6
van Bree, Cor (2012). Hun als subject in een grammaticaal en dialectologisch kader. Nederlandse Taalkunde 17(2), 229-249. https://doi.org/10.5117/nedtaa2012.2.hun_527
Briesch, Amy, Hariharan Swaminathan, Megan Welsh & Sandra Chafouleas (2014). Generalizability Theory: A practical guide to study design, implementation, and interpretation. Journal of School Psychology 52(1), 13-35. https://doi.org/10.1016/j.jsp.2013.11.008
Broekhuis, Hans (2008). Derivations and evaluations: Object shift in the Germanic languages. Berlin: De Gruyter. https://doi.org/10.1515/9783110207200
Broekhuis, Hans (2023). Scrambling of definite object NPs in Dutch: Formal theories, corpus data and experimental research. Nederlandse Taalkunde 28(2), 145-179. https://doi.org/10.5117/NEDTAA2023.2.001.BROE
Bybee, Joan (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change 14(3), 261-290. https://doi.org/10.1017/s0954394502143018
Chen, Zhong, Yuhang Xu & Zhiguo Xie (2020). Assessing introspective linguistic judgments quantitatively: The case of The Syntax of Chinese. Journal of East Asian Linguistics 29(3), 311-336. https://doi.org/10.1007/s10831-020-09210-y
Cowart, Wayne (1997). Experimental syntax: Applying objective methods to sentence judgments. Thousand Oaks: SAGE.
Cronbach, Lee, Goldine Gleser, Harinder Nanda & Nageswari Rajaratnam (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: John Wiley.
Edelman, Shimon & Morten Christiansen (2003). How seriously should we take Minimalist syntax? Trends in Cognitive Sciences 7(2), 60-61. https://doi.org/10.1016/s1364-6613(02)00045-1
Featherston, Sam (2020). Can we build a grammar on the basis of judgements? In: Samuel Schindler, Anna Drożdżowicz & Karen Brøcker (eds.), Linguistic intuitions: Evidence and method. Oxford: Oxford University Press, 165-188. https://doi.org/10.1093/oso/9780198840558.003.0010
Francis, Elaine (2022). Gradient acceptability and linguistic theory. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780192898944.001.0001
Gibson, Edward & Ev Fedorenko (2010). Weak quantitative standards in linguistics research. Trends in Cognitive Sciences 14(6), 233-234. https://doi.org/10.1016/j.tics.2010.03.005
Gibson, Edward & Ev Fedorenko (2013). The need for quantitative methods in syntax and semantics research. Language and Cognitive Processes 28(1/2), 88-124. https://doi.org/10.1080/01690965.2010.515080
Gibson, Edward, Steven Piantadosi & Ev Fedorenko (2013). Quantitative methods in syntax/semantics research: A response to Sprouse & Almeida (2013). Language and Cognitive Processes 28(3), 229-240. https://doi.org/10.1080/01690965.2012.704385
Giesbers, Herman (1983/1984). Doe jij lief spelen? Notities over het perifrastisch doen. Mededelingen van de Nijmeegse Centrale voor Dialect- en Naamkunde 19, 57-64.
Giles, Howard (1973). Accent mobility: A model and some data. Anthropological Linguistics 15(2), 87-105.
Gussenhoven, Carlos (2000). On the origin and development of the central Franconian tone contrast. In: Aditi Lahiri (ed.), Analogy, levelling, markedness: Principles of change in phonology and morphology. Berlin: Mouton de Gruyter, 215-260. https://doi.org/10.1515/9783110808933.215
Hartshorne, Joshua, Joshua Tenenbaum & Steven Pinker (2018). A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition 177, 263-277. https://doi.org/10.1016/j.cognition.2018.04.007
Hennig, Christian, Marina Meila, Fionn Murtagh & Roberto Rocci (2016). Handbook of cluster analysis. New York: Chapman & Hall/CRC Press. https://doi.org/10.1201/b19706
van Hout, Roeland (2003). Hun zijn jongens: Ontstaan en verspreiding van het onderwerp ‘hun’. In: Jan Stroop (ed.), Waar gaat het Nederlands naartoe? Panorama van een taal. Amsterdam: Uitgeverij Bert Bakker, 277-286.
van Hout, Roeland (2006). Onstuitbaar en onuitstaanbaar: de toekomst van een omstreden taalverandering. In: Nicoline van der Sijs, Jan Stroop & Fred Weerman (eds.), Wat iedereen van het Nederlands moet weten en waarom. Amsterdam: Uitgeverij Bert Bakker, 42-54.
van Hout, Roeland & Pieter Muysken (2016). Taming chaos: Change and variability in the language sciences. In: Klaas Landsman & Ellen van Wolde (eds.), The challenge of chance: A multidisciplinary approach from science and the humanities. Berlin: Springer, 249-266. https://doi.org/10.1007/978-3-319-26300-7_14
Häussler, Jana & Tom Juzek (2021). Data convergence in syntactic theory and the role of sentence pairs. In: Samuel Schindler, Anna Drożdżowicz & Karen Brøcker (eds.), Linguistic intuitions: Evidence and method. Oxford: Oxford University Press, 233-254. https://doi.org/10.1093/oso/9780198840558.003.0013
Koo, Terry & Mae Li (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine 15(2), 155-163. https://doi.org/10.1016/j.jcm.2016.02.012
Kovač, Iva & Gert-Jan Schoenmakers (submitted). An experimental-syntactic take on long passive in Dutch: Unraveling the patterns underlying its (un)acceptability. Unpublished manuscript.
Krippendorff, Klaus (2013). Content analysis: An introduction to its methodology (3rd ed.). Thousand Oaks: SAGE. https://doi.org/10.4135/9781071878781
Langsford, Steven, Amy Perfors, Andrew Hendrickson, Lauren Kennedy & Danielle Navarro (2018). Quantifying sentence acceptability measures: Reliability, bias, and variability. Glossa: A Journal of General Linguistics 3(1), 37. https://doi.org/10.5334/gjgl.396
Lüdecke, Daniel, Alexander Bartel, Carsten Schwemmer, Chuck Powell, Amir Djalovski & Johannes Titz (2023). sjPlot: Data visualization for statistics in social science. R package version 2.8.14. Retrieved from https://CRAN.R-project.org/package=sjPlot
Mahowald, Kyle, Peter Graff, Jeremy Hartman & Edward Gibson (2016). SNAP judgments: A small N acceptability paradigm (SNAP) for linguistic acceptability judgments. Language 92(3), 619-635. https://doi.org/10.1353/lan.2016.0052
Matuschek, Hannes, Reinhold Kliegl, Shravan Vasishth, Harald Baayen & Douglas Bates (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language 94, 305-315. https://doi.org/10.1016/j.jml.2017.01.001
van der Meulen, Marten (2018). Do we want more or less variation? The comparative markers als and dan in Dutch prescriptivism since 1900. Linguistics in the Netherlands 35, 79-96. https://doi.org/10.1075/avt.00006.meu
Moore, Christopher (2016). gtheory: Apply Generalizability Theory with R. R package version 0.1.2. Retrieved from https://CRAN.R-project.org/package=gtheory
Neeleman, Ad & Hans van de Koot (2008). Dutch scrambling and the nature of discourse templates. Journal of Comparative Germanic Linguistics 11(2), 137-189. https://doi.org/10.1007/s10828-008-9018-0
Newmeyer, Frederick (2020). The relevance of introspective data. In: Samuel Schindler, Anna Drożdżowicz & Karen Brøcker (eds.), Linguistic intuitions: Evidence and method. Oxford: Oxford University Press, 149-164. https://doi.org/10.1093/oso/9780198840558.003.0009
Ohala, John (1981). The listener as a source of sound change. In: Carrie Masek, Roberta Hendrick & Mary Frances Miller (eds.), Proceedings of the Chicago Linguistics Society 17: Papers from the parasession on language and behavior. Chicago: Chicago Linguistics Society, 178-203. https://doi.org/10.1075/cilt.323.05oha
Pierrehumbert, Janet (2001). Exemplar dynamics: Word frequency, lenition, and contrast. In: Joan Bybee & Paul Hopper (eds.), Frequency effects and the emergence of lexical structure. Amsterdam: John Benjamins, 137-157. https://doi.org/10.1075/tsl.45.08pie
Phillips, Colin (2010). Should we impeach armchair linguists? In: Shoichi Iwasaki, Hajime Hoji, Patricia Clancy & Sung-Ock Sohn (eds.), Japanese/Korean linguistics 17. Stanford: CSLI Publications, 49-64.
Phillips, Colin, Phoebe Gaston, Nick Huang & Hanna Muller (2021). Theories all the way down: Remarks on “theoretical” and “experimental” linguistics. In: Grant Goodall (ed.), The Cambridge handbook of experimental syntax. Cambridge: Cambridge University Press, 587-616. https://doi.org/10.1017/9781108569620.023
Preston, Alvin (2021). Grammaticality judgements: A linguistic perspective. New York: States Academic Press.
R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Rietveld, Toni (2021). Human measurement techniques in speech and language pathology. New York: Routledge. https://doi.org/10.4324/9781003053118-2
Rietveld, Toni & Roeland van Hout (1993). Statistical techniques for the study of language and language behaviour. Berlin: Mouton de Gruyter.
Schaeffer, Jeannette (2000). The acquisition of direct object scrambling and clitic placement: Syntax and pragmatics. Amsterdam: John Benjamins. https://doi.org/10.1075/lald.22
Schoenmakers, Gert-Jan (2023). Linguistic judgments in 3D: The aesthetic quality, linguistic acceptability, and surface probability of stigmatized and non-stigmatized variation. Linguistics 61(3), 779-824. https://doi.org/10.1515/ling-2021-0179
Schoenmakers, Gert-Jan, Marjolein Poortvliet & Jeannette Schaeffer (2022). Topicality and anaphoricity in Dutch scrambling. Natural Language & Linguistic Theory 40(2), 541-571. https://doi.org/10.1007/s11049-021-09516-z
Schoenmakers, Gert-Jan & Peter de Swart (2019). Adverbial hurdles in Dutch scrambling. In: Anja Gattnar, Robin Hörnig, Melanie Störzer & Sam Featherston (eds.), Proceedings of Linguistic Evidence 2018: Experimental data drives linguistic theory. Tübingen: University of Tübingen, 124-145. http://doi.org/10.15496/publikation-32627
Schütze, Carson (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press. Reprinted in 2016 by Language Science Press. https://doi.org/10.26530/OAPEN_603356
Sert, Cansel, Ferdy Hubers, Theresa Redl & Helen de Hoop (2023). On the acceptability of the not so dummy auxiliary ‘do’ in Dutch. Linguistics in the Netherlands 40, 210-229.
Shavelson, Richard & Noreen Webb (1991). Generalizability Theory: A primer. Thousand Oaks: SAGE. https://doi.org/10.1016/0886-1633(93)90019-l
Shrout, Patrick & Joseph Fleiss (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86(2), 420-428. https://doi.org/10.1037/0033-2909.86.2.420
Sprouse, Jon (2020). A user’s view of the validity of acceptability judgments as evidence for syntactic theories. In: Samuel Schindler, Anna Drożdżowicz & Karen Brøcker (eds.), Linguistic intuitions: Evidence and method. Oxford: Oxford University Press, 215-232. https://doi.org/10.1093/oso/9780198840558.003.0012
Sprouse, Jon, Carson Schütze & Diogo Almeida (2013). A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010. Lingua 134, 219-248. https://doi.org/10.1016/j.lingua.2013.07.002
Sprouse, Jon & Diogo Almeida (2012). Assessing the reliability of textbook data in syntax: Adger’s ‘Core syntax’. Journal of Linguistics 48(3), 609-652. https://doi.org/10.1017/s0022226712000011
Sprouse, Jon & Diogo Almeida (2017). Design sensitivity and statistical power in acceptability judgment experiments. Glossa: A Journal of General Linguistics 2(1), Article 14. https://doi.org/10.5334/gjgl.236


Keyword(s): cluster analysis; Generalizability Theory; prescriptive norm violations; reliability; scrambling
