class: center, middle, inverse, title-slide

# Semi-automated Content Analysis of Media Frames
## How to Analyse Media Reports (of protest)
### Johannes B. Gruber
### University of Glasgow
### 2019-06-22

---
class: center, middle

## How can we find systematic patterns in media reports
## if the reported topics differ?

???
I started my PhD with this question in mind. The reason was this example.

--

Application: Coverage of protest events over time

--

Idea: **Analyse the framing of a story instead of its content**

???
I collected coverage containing reports about domestic protest from 26 UK newspapers. The topics of the protests differ wildly: from fox-hunting and anti-war protests to protests against high fuel prices and pro- and anti-Brexit protests.

---

# What is framing?

--

<center>
.Large[A frame is *"a central organizing idea or story line that provides meaning to an unfolding strip of events"* (Gamson & Modigliani, 1987, p. 143)]
</center>

???
We all use framing to make sense of everyday events and issues when telling others about them.

Example: the campus in the West End. We can either tell people about the modern architecture blending in with the style of the old buildings, the bright rooms, the modern teaching facilities, that the campus is adjacent to a beautiful park and that there is nice food available on campus.

Or we can tell people that it is almost impossible to live close to campus since it is located in one of the most expensive neighbourhoods in the city, that there are no cheap places nearby where you can eat or drink (tying students to the expensive and sometimes low-quality food in the Mensa) and that there are not enough power outlets in the offices.

None of these facts are untrue, but based on the selection of information, you can tell completely different stories.

---

# How can we detect and code frames?
- .Large[Most widely used definition of framing in media studies:]

<center>
.Large["To frame is to *select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation* for the item described" (Entman, 1993, original emphasis).]
</center>

---

# Available Approaches

.pull-left[
- Qualitative<br><br>
- Manual-Holistic<br><br>
- Manual-Clustering<br><br>
- Automated Content Analysis (ACA)<br>
]

--

.pull-right[
→ focus only on in-depth description<br>
→ hard to ensure validity and reliability<br>
→ easier to code = more validity and reliability<br>
→ make analysis scalable but same concept?<br>
]

???
Qualitative approaches
- rooted in qualitative research traditions
- proceed inductively
- more about in-depth description
- little or no quantification of elements or of the distribution of frames within a discourse is provided by the researcher

Manual-holistic approaches
- frames are holistic variables
- usually quantitative content analyses
- frames can be either derived from the literature or identified inductively in a pilot study of a small sample
- validity and reliability depend on the transparency with which the study communicates the coding decisions

Manual-clustering approaches
- split up frames into sub-variables which are easier to code in content analysis
- frames are operationalised as a set of yes/no indicator questions (coders are asked if a certain aspect is mentioned in the text or not)

---
count: false

# Available Approaches

.pull-left[
- Qualitative<br><br>
- Manual-Holistic<br><br>
- Manual-Clustering<br><br>
- Automated Content Analysis (ACA)<br>
  + Dictionary Methods (deductive)
  + Fully Automated Classification (inductive)
  + Supervised Machine Learning (SML)
]

.pull-right[
→ focus only on in-depth description<br>
→ hard to ensure validity and reliability<br>
→ easier to code = more validity and
reliability<br>
→ make analysis scalable but same concept?<br>
]

???
Qualitative approaches
- rooted in qualitative research traditions
- proceed inductively
- more about in-depth description
- little or no quantification of elements or of the distribution of frames within a discourse is provided by the researcher

Manual-holistic approaches
- frames are holistic variables
- usually quantitative content analyses
- frames can be either derived from the literature or identified inductively in a pilot study of a small sample
- validity and reliability depend on the transparency with which the study communicates the coding decisions

Manual-clustering approaches
- split up frames into sub-variables which are easier to code in content analysis
- frames are operationalised as a set of yes/no indicator questions (coders are asked if a certain aspect is mentioned in the text or not)

---

## How can we detect and code frames (better)?

- .Large[Most widely used definition of framing in media studies:]

<center>
.Large["To frame is to *select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular <span style='background-color: #51A8A8'>problem definition</span>, <span style='background-color: #AFC131'>causal interpretation</span>, <span style='background-color: #D35126'>moral evaluation</span>, and/or <span style='background-color: #FFDE77'>treatment recommendation</span>* for the item described" (Entman, 1993, original emphasis).]
</center>

--

- Instead of coding frames, I code **frame elements**

---

# Method (1): Finding Frames

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-1-1.svg" style="display: block; margin: auto;" />

---

# Method (2): Replicating Frames

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-2-1.svg" style="display: block; margin: auto;" />

---

# Method Alternative

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-3-1.svg" style="display: block; margin: auto;" />

---
class: duke-softblue
background-image: url('https://www.dropbox.com/s/o5byzx7jjygsblg/Iesha_background.png?dl=1')
background-size: cover

.content-box-blue[
# Application: Background

- Case/population: Mainstream news media articles about protests in the UK (1992-2017)
- Time-series design: it is expected that the patterns have changed substantially since the first seminal studies, not least due to the arrival of the internet (Cottle, 2008)
- Data: Population-scale sample of protest reports in newspapers (n > 27,000)
- State of knowledge: Journalists use a default theme (the so-called *protest paradigm*) to report about protest: details about the event (clashes with police, the appearance of protesters, nuisance caused or reactions of bystanders) are highlighted while the message of the protesters is undermined or not even mentioned.
]

---

# Application: Codebook

Frame elements are further divided into coding variables:

- Problem Definition
  + Topic
  + Actor
- Causal Attribution
  + Benefit Attribution
  + Risk Attribution
- Moral Evaluation
  + Benefit
  + Risk
- Treatment
  + Judgement

---

# Application: Codebook
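A minimal sketch of what one coded paragraph looks like as data, assuming illustrative variable names (the actual codebook variables differ):

```r
# Illustrative only: one row per coded paragraph, one 0/1 indicator
# column per frame-element value; the names are made up for this sketch
coded <- data.frame(
  par_id         = 14900405,
  topic_violence = 1,  # Problem Definition: Topic
  actor_police   = 1,  # Problem Definition: Actor
  benefit_order  = 1,  # Moral Evaluation: Benefit
  risk_safety    = 1,  # Moral Evaluation: Risk
  judgement_pos  = 0   # Treatment: Judgement
)
```

Stacking these rows over all coded paragraphs yields the binary indicator matrix used later for clustering.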
---

# Application: Coding example

<center>
<span style='background-color: #51A8A8'>*«Police worked very hard with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers. We were there to ensure that did not take place.»*</span>
</center>

<br>

- Topic: Violence/Crime

---
count: false

# Application: Coding example

<center>
<span style='background-color: #51A8A8'>*«Police worked very hard</span> with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers. We were there to ensure that did not take place.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police

---
count: false

# Application: Coding example

<center>
*«Police worked very hard with the organisers to <span style='background-color: #AFC131'>ensure a peaceful protest</span> and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers. <span style='background-color: #AFC131'>We were there to ensure that did not take place</span>.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police
- Benefit: Reinstating public order

---
count: false

# Application: Coding example

<center>
*«Police worked very hard with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the <span style='background-color: #AFC131'>violence.
There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers</span>. We were there to ensure that did not take place.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police
- Benefit: Reinstating public order
- Risk: Public Safety

---
count: false

# Application: Coding example

<center>
*«<span style='background-color: #D35126'>Police worked very hard</span> with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers. We were there to ensure that did not take place.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police
- Benefit: Reinstating public order
- Risk: Public Safety
- Benefit Attribution: Police

---
count: false

# Application: Coding example

<center>
*«Police worked very hard with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and <span style='background-color: #D35126'>some people were intent on breaking through the barriers</span>. We were there to ensure that did not take place.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police
- Benefit: Reinstating public order
- Risk: Public Safety
- Benefit Attribution: Police
- Risk Attribution: Protesters

---
count: false

# Application: Coding example

<center>
*«Police worked very hard with the organisers to ensure a peaceful protest and it was a small core of determined troublemakers bent on conflict with the police who I believe were responsible for the violence. There were fireworks and missiles thrown at the police and some people were intent on breaking through the barriers.
We were there to ensure that did not take place.»*
</center>

<br>

- Topic: Violence/Crime
- Actor: Police
- Benefit: Reinstating public order
- Risk: Public Safety
- Benefit Attribution: Police
- Risk Attribution: Protesters
- Judgement: None

---

# Application: Coding example (2)

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:250px; overflow-x: scroll; width:100%; "><table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Par_ID </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Problem Definition: Topic: Violence/Crime </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Problem Definition: Actor: Police </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Moral Evaluation: Benefit: Reinstating public order </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Moral Evaluation: Risk: Public safety </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Causal Attribution: Risk_Attribution: Protesters </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Causal Attribution: Benefit_Attribution: Police </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Treatment: Judgement_Positive: 0 </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Treatment: Judgement_Positive: 1 </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Problem Definition: Topic: Nuisance </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Problem Definition: Topic: Protesters </th>
<th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> … </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:right;"> 14900405 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 1 </td>
<td style="text-align:right;"> 0 </td>
<td style="text-align:right;"> 0 </td>
<td style="text-align:right;"> 0 </td>
<td style="text-align:right;"> 0 </td>
<td style="text-align:right;"> 0 </td>
</tr>
</tbody>
</table></div>

---

# Application: Clustering Frame Elements

The R package NbClust (Charrad et al., 2014) combines many indices to determine optimal cluster solutions:

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-6-1.png" width="2800" />

???
- "ch" (Calinski and Harabasz 1974)
- "duda" (Duda and Hart 1973)
- "pseudot2" (Duda and Hart 1973)
- "cindex" (Hubert and Levin 1976)
- "gamma" (Baker and Hubert 1975)
- "beale" (Beale 1969)
- "ccc" (Sarle 1983)
- "ptbiserial" (Milligan 1980, 1981)
- "gplus" (Rohlf 1974; Milligan 1981)
- "db" (Davies and Bouldin 1979)

---

# Application: Interpreting Clusters as Frames

Heatmap showing cluster means for codes:

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-7-1.png" width="3200" />

---

# Application: SML replicating classification

<table>
<thead>
<tr>
<th style="text-align:left;"> model </th>
<th style="text-align:right;"> Accuracy </th>
<th style="text-align:right;"> AccuracyLower </th>
<th style="text-align:right;"> AccuracyUpper </th>
<th style="text-align:left;"> package </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"> Maximum Entropy </td>
<td style="text-align:right;"> 0.64 </td>
<td style="text-align:right;"> 0.44 </td>
<td style="text-align:right;"> 0.81 </td>
<td style="text-align:left;"> RTextTools </td>
</tr>
<tr>
<td style="text-align:left;"> SVM </td>
<td style="text-align:right;"> 0.59 </td>
<td style="text-align:right;"> 0.39 </td>
<td style="text-align:right;"> 0.78 </td>
<td
style="text-align:left;"> quanteda.classifiers </td>
</tr>
<tr>
<td style="text-align:left;"> LogitBoost </td>
<td style="text-align:right;"> 0.59 </td>
<td style="text-align:right;"> 0.36 </td>
<td style="text-align:right;"> 0.79 </td>
<td style="text-align:left;"> caret/caTools </td>
</tr>
<tr>
<td style="text-align:left;"> bagging </td>
<td style="text-align:right;"> 0.50 </td>
<td style="text-align:right;"> 0.31 </td>
<td style="text-align:right;"> 0.69 </td>
<td style="text-align:left;"> RTextTools </td>
</tr>
<tr>
<td style="text-align:left;"> Naive Bayes </td>
<td style="text-align:right;"> 0.48 </td>
<td style="text-align:right;"> 0.29 </td>
<td style="text-align:right;"> 0.68 </td>
<td style="text-align:left;"> quanteda </td>
</tr>
<tr>
<td style="text-align:left;"> Random Forest </td>
<td style="text-align:right;"> 0.48 </td>
<td style="text-align:right;"> 0.29 </td>
<td style="text-align:right;"> 0.68 </td>
<td style="text-align:left;"> caret/ranger </td>
</tr>
<tr>
<td style="text-align:left;"> NNSEQ </td>
<td style="text-align:right;"> 0.44 </td>
<td style="text-align:right;"> 0.25 </td>
<td style="text-align:right;"> 0.65 </td>
<td style="text-align:left;"> quanteda.classifiers </td>
</tr>
<tr>
<td style="text-align:left;"> Penalised Multinomial Regression </td>
<td style="text-align:right;"> 0.44 </td>
<td style="text-align:right;"> 0.25 </td>
<td style="text-align:right;"> 0.65 </td>
<td style="text-align:left;"> glmnet </td>
</tr>
</tbody>
</table>

.full-width[.content-box-red[Work in progress (training/test sample n = 270/30)!]]

<!-- - Support Vector Machine (SVM) (Meyer et al., 2011) -->
<!-- - Linear SVM -->
<!-- - multinomial Naïve Bayes from quanteda -->
<!-- - sequential neural network from quanteda.classifiers -->
<!-- - Random Forest (Liaw and Wiener, 2002) -->
<!-- - boosting (Tuszynski, 2012) from caTools -->
<!-- - bagging (Peters et al., 2002) from ipred -->
<!-- - scaled linear discriminant analysis (slda) -->
<!-- - glmnet (Friedman et
al., 2010) -->
<!-- - maximum entropy (Jurka, 2012) -->

<!-- Usually it is best to use an ensemble of classifiers together! -->

???
Once this is done, we can show change over time and do some analysis of why certain reports are the way they are (right-wing protest, more positive reports from right-wing media?)

---

# Application: Next Steps

.large[
- Finish training sample
- Agreement between coders on training set
- Agreement between coders and clustering
- Agreement between clustering and SML
- Outlook:
  + Explain the framing of protest with event data (size, tactics, time after event, ideological stance, etc.)
]

---

# Conclusion

- Detected frames make sense

--

- Classification better than chance already

--

- More control over categories than topic models
- Less abstract coding and category building than dictionary methods

---
background-image: url(https://live.staticflickr.com/8467/8129232704_c408251a34_k_d.jpg)
count: false

# Thank you for your attention!

.pull-left[.full-width[.content-box-blue[
## Working Paper
**bit.ly/JBGruber_framing_paper**
]]]
.pull-right[.content-box-green[
## Contact
Johannes B. Gruber

- Mail: j.gruber.1@research.gla.ac.uk
- Web: johannesbgruber.eu/
- GitHub: github.com/JBGruber
- Twitter: @JohannesBGruber
]]

---
count: false

# Method

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" />

---
count: false

# Dataset construction

Data downloaded from LexisNexis using the search terms "protest" and "demonstration" (plus several variations), before cleaning:

<img src="JBGruber_framing_files/figure-html/unnamed-chunk-10-1.png" width="2800" />

---
count: false

# Frames in newspapers

![](JBGruber_framing_files/figure-html/unnamed-chunk-11-1.png)<!-- -->
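---
count: false

# Backup: Clustering Sketch

A minimal sketch of an NbClust call for picking the number of clusters, using toy data and illustrative settings rather than the actual analysis code:

```r
library(NbClust)  # Charrad et al., 2014

# toy stand-in for the indicator matrix:
# 50 "paragraphs" x 10 binary frame-element codes
set.seed(42)
codes <- matrix(rbinom(500, 1, 0.3), nrow = 50)

# evaluate 2-10 cluster solutions with Ward linkage; the single
# "ch" index is used here for speed, index = "all" combines many
nb <- NbClust(codes, distance = "euclidean", min.nc = 2, max.nc = 10,
              method = "ward.D2", index = "ch")
nb$Best.nc  # suggested number of clusters and the index value
```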