A Character Recognition Tool for Automatic Detection of Social Characters in Visual Media Content

Joshua Baldwin; Ralf Schmälzle

doi:10.5117/CCR2022.1.010.BALD

E-ISSN: 2665-9085

Authors: Joshua Baldwin¹, Ralf Schmälzle²

Affiliations: ¹ Department of Communication, Michigan State University ² Department of Communication, Michigan State University
Publisher: Amsterdam University Press
Source: Computational Communication Research, Volume 4, Issue 1, Feb 2022
DOI: https://doi.org/10.5117/CCR2022.1.010.BALD
Language: English

Abstract

Content analysis is the go-to method for understanding how social characters, such as public figures or movie characters, are portrayed in media messages, and it is indispensable for investigating character-related media processes and effects. However, large-scale content-analytic studies are taxing and expensive: they require hours of coder training and incur substantial costs. The problem is particularly acute for video-based media, where coders must spend extensive time and energy watching and interpreting dynamic content. Here we present a Character-Recognition-Tool (CRT) that enables communication scholars to quickly process large amounts of video data and identify occurrences of specific predefined characters using facial recognition and matching. This paper presents the CRT and provides evidence for its validity. The CRT can automate the coding of on-screen characters while following recommendations that computational tools be scalable, adaptable for novice programmers, and open source to allow for replication.
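
The abstract describes the approach only as "facial recognition and matching", so the following is a minimal sketch of how such matching could work in principle, not the CRT's actual implementation. It assumes each detected face has already been converted to a numeric embedding and that a face counts as a predefined character when its Euclidean distance to that character's reference embedding falls within a tolerance. The function names, the toy 3-dimensional vectors, and the tolerance of 0.6 are all hypothetical choices made for illustration.

```python
import math

# Hypothetical tolerance, chosen for illustration only; the value the CRT
# actually uses is not stated in the abstract.
TOLERANCE = 0.6


def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_face(face, references, tolerance=TOLERANCE):
    """Return the name of the closest reference within tolerance, else None."""
    best_name, best_dist = None, tolerance
    for name, ref in references.items():
        dist = euclidean(face, ref)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


def code_frames(frames, references, tolerance=TOLERANCE):
    """Map each predefined character to the frame indices where it appears.

    `frames` is a list in which each item holds the face embeddings
    detected in one video frame (possibly none).
    """
    appearances = {name: [] for name in references}
    for index, faces in enumerate(frames):
        seen = {match_face(face, references, tolerance) for face in faces}
        for name in seen - {None}:
            appearances[name].append(index)
    return appearances


# Toy data: real face embeddings typically have many more dimensions.
references = {
    "character_a": [0.0, 0.0, 0.0],
    "character_b": [1.0, 1.0, 1.0],
}
frames = [
    [[0.1, 0.0, 0.1]],                   # frame 0: close to character_a
    [[0.9, 1.0, 1.1], [0.0, 0.1, 0.0]],  # frame 1: both characters
    [[5.0, 5.0, 5.0]],                   # frame 2: an unknown face
    [],                                  # frame 3: no faces detected
]

appearances = code_frames(frames, references)
print(appearances)
```

In this sketch the per-character frame lists play the role of the codes a human coder would otherwise record by hand, which is what would make a comparison against manual coding possible when assessing validity.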

2022-02-01


Keyword(s): computational communication; computer vision; content analysis; face recognition; media
