Time Does Tell: An Analysis of Observable Audience Responses From the 2016 American Presidential Campaigns

In this study a microanalysis of OAR (Observable Audience Responses) in the 2016 U.S. Presidential Election was conducted. OAR were coded into dimensions including response rate (frequency per minute), response type, and categorised as either a unitary (a single response), composite (two or more simultaneous response types) or sequential (a unitary or composite response that is followed by a different response type) response form. It was found that U.S. audiences made use of all three response forms (unitary, composite, and sequential) and that certain response forms had been under-represented when contrasted with findings from previous research. This study was also the first to measure the duration of OAR in the context of an election, and it was observed that response form significantly affected the duration of response. It was inferred from this that the audiences might select different responses as a means to control the force of reply. This study failed to replicate previous research that had found a correlation between response rate (affiliative OAR per minute) and voter share on polling day, but instead found a stronger, significant correlation between the duration of OAR and voter share. It was interpreted that duration of OAR may be a superior indicator of wider voter enthusiasm as it captures the length of response as well as the incidence.

ported in mass media as a gauge of support and enthusiasm towards speakers, as well as being a feedback device for which pundits can evaluate their performance (West, 1984).
Microanalysis of political speeches has revealed that politicians use a number of so-called "rhetorical devices" (or "claptraps") when delivering speeches to evoke responses from their audience (Bull, 2012).
One such device identified by Atkinson (1984) is the contrast, which involves the sequential juxtaposition of an item with its opposite; the latter part allows the audience to identify the completion point (and thus when to respond). The use of these devices has been noted not only in British political oratory, but also in U.S. and Japanese political speeches (Bull & Feldman, 2011; Bull & Miskinis, 2015). Early work by Heritage and Greatbatch (1986) identified that the use of these devices preceded 68% of collective applause incidents in speeches made to British party political conferences. Thus, not only are audience responses in political oratory reported in mass media as a means to evaluate support for politicians, but politicians also make use of rhetorical devices to invite audience responses. More recently, Bull and Miskinis (2015) found a significant, positive relationship between affiliative response rate (affiliative responses per minute) and voter share in the 2012 U.S. Presidential Election, indicating that the study of OAR in political oratory could potentially reveal insights into wider voter enthusiasm and intent. Stewart et al. (2018) argued that OAR in political oratory may be a superior indicator of shared individual and emergent group attitudes towards budding candidates than what might arguably be considered self-report techniques, such as polling, dial testing focus groups, and analysis of social media such as Twitter. Audience vocalisations such as cheering, laughter and booing are closer to reflecting automatic and spontaneous response than such self-report techniques, which are subject to cognitive control and prone to social conformity. Whilst such measures have a strength over OAR in that they can account for the numerous factors that influence viewer perceptions (e.g. media visual presentation style, coverage bias/slant, spin doctoring), a social media user, for example, has the prerogative of consciously crafting their "response" before displaying it to the world, whereas audience responses such as applause and laughter may be more spontaneous and automatic, better reflecting the intensity of response.
The efficacy of these techniques was also heavily scrutinised in the aftermath of the 2016 US Presidential Election.
Some pre-election forecasts put Hillary Clinton's likelihood of winning the presidency as high as 99 percent, and in the aftermath of her rival's election victory there was a widespread sentiment that polling had "got it wrong" (Kennedy et al., 2018). Large-scale social analysis may also be skewed by the presence of online bots, which contributed nearly a fifth of political discussion concerning the 2016 US Presidential Election on social media (Bessi & Ferrara, 2016). Of the OAR reported during the election campaign, Donald Trump notably surpassed fellow Republican candidates in primary debates in his propensity to generate laughter from the audience (Stewart, Eubanks, & Miller, 2016), as well as in the presidential debates with Democratic rival Clinton (Stewart et al., 2018). The study of OAR in political oratory can then provide valuable insight into emergent voter attitudes towards prospective presidential candidates.

Characteristic Features of OAR
Audience responses can be affiliative in that they align themselves with the speaker, or disaffiliative in that they align themselves against the speaker. Responses are not intrinsically one or the other: for example, booing can be regarded as disaffiliative when audience members are responding to a perceived attack on their preferred contender (Clayman, 1993), or affiliative when the audience aligns itself with a speaker who is denigrating their opponent (Bull & Miskinis, 2015). Arguably, affiliative responses might also go against speakers, in that applause, for example, might be delayed or half-hearted.
To produce a collective response, individual listeners coordinate with one another in an audience to respond as a whole, or as a significant proportion using these forms. Isolated responses by one or two listeners may also occur. Isolated responses are a highly regular feature of U.S. presidential campaign speeches (Bull), but have seldom or never been observed in Korean or Japanese political speeches (Bull & Feldman, 2011; Choi, Bull, & Reed, 2016; Feldman & Bull, 2012). Bull and Miskinis (2015) characterised these differences in the context of Hofstede's distinction between individualist and collectivist societies. Individualist societies can be seen as cultures that privilege a loosely knit social framework in which individuals are expected to take care of only themselves and their immediate family, whereas collectivist cultures privilege the needs of groups and communities over individuals (Hofstede, Hofstede, & Minkov, 2010). On this dimension, Hofstede ranks the USA (91, individualist) as being almost diametrically opposed to Korea (18, collectivist) and, to a lesser extent, Japan (41, collectivist).
Consequently, Bull and Miskinis (2015) argued that individualistic cultures such as the USA allow for greater freedom in responding to a speaker through the use of disaffiliative and isolated responses, whereas collectivist cultures such as Japan only allow affiliative and collectivist responses which maintain group harmony.
Recently, Choi et al. (2016) identified three types of audience response termed unitary, composite, and sequential.
A unitary response refers to a singular form of response such as laughter, or cheering. A composite response refers to a combined response in which two forms of response co-occur within one audience turn, such as cheering and applause. A sequential response then refers to one form of response that is unitary or composite, after which the audience shifts their turn to another form of response. For example, they might begin with booing (first form) before collectively moving onto chanting (second form), or they might begin with cheering and applause to then move onto chanting, and so on. Notably, in the 2012 Korean presidential speeches, sequential forms made up 34.74% of all audience responses.
As such, a major limitation of previous research is that prior to Choi et al.'s (2016) paper, coding systems have been simplistic. For example, Bull and Miskinis (2015) did not distinguish between co-occurring audience responses, rather only the predominant form was recorded. It is then entirely possible that certain responses are under-represented because they are subordinate to another form of response occurring at the same time. The advanced coding system put forward by Choi et al. also provides another way of comparing cross cultural differences as it increases the scope of dimensions in which OAR can be observed, though it should be noted that this study was not originally designed with this comparison in mind.
A research question for this study was then set: How did U.S. audiences respond in terms of unitary, composite and sequential responses towards speakers in 2016 U.S. Presidential Election campaign speeches?

Duration of OAR
The duration of audience responses in electoral campaign speeches generally has not been well researched.
There is some, albeit limited, evidence to suggest that different response forms may differ in length from one another. Clayman (1993) analysed disaffiliative booing and found that it rarely exceeded 3 seconds, and Atkinson (1984) found that applause incidents typically last 8 seconds. Otherwise, the duration of response forms in political campaign speeches has never been directly compared.
If the mean lengths of certain response forms differ from one another, this might indicate that audiences use different responses in order to communicate different levels of enthusiasm towards a speaker. For example, if composite incidents of cheering and applause are longer in duration than unitary incidents of cheering, this would indicate that audiences use the former response to present a more forceful reaction to a speaker. A second research question for this study was then posed: How did the duration of responses used by American audiences vary in 2016 U.S. Presidential Election campaign speeches?

OAR as a Predictor of Voter Share
Recent studies of audience-speaker interactions in political contexts have also investigated whether there is a link between affiliative response rate (OAR per minute) and voter share. Bull and Miskinis' (2015) study was the first to observe a significant relationship between audience responses and electoral success. The researchers found a significant positive correlation between affiliative response rate and regional voter share in the 2012 U.S. Presidential Election. In swing states where Barack Obama evoked a greater affiliative response rate than his opponent (Mitt Romney), he also received a higher proportion of votes in those respective states (Iowa, Ohio, Florida, and Wisconsin). Likewise, Romney evoked a higher affiliative response rate in North Carolina and received a higher proportion of votes. Therefore, response rate could additionally provide critical feedback to electoral campaigns looking to increase their voter share at the ballot box.
However, no such relationship was observed between response rate and electoral success in Japanese speeches made by candidates in the 2012 Japanese general election campaign (Feldman & Bull, 2012). Bull and Miskinis (2015) critically noted though that although the U.S. and Japanese speeches were both election campaign speeches, there were differences between the speech contexts. The Japanese speeches (Feldman & Bull, 2012) were delivered only to the supporters of the speakers, rather than to win over uncommitted voters. In contrast, the U.S. speeches (Bull & Miskinis, 2015) were delivered at informal public meetings without a pre-selected audience. There are then important distinctions between these contexts: the Japanese speeches could be regarded as "rallies of the faithful" where audiences are expected to show their appreciation and support. Compared with so-called "swing states" in the U.S. presidential election (Bull & Miskinis, 2015), audience responses delivered in the Japanese context would arguably not be reflective of wider voter enthusiasm towards the speakers, due to the composition of the audience and purpose of the speech (Feldman & Bull, 2012).
In the South Korean study, Choi et al. (2016) considered that there might be a significant relationship between response rate and electoral success, as the context was more similar to that of the US speeches, given that both were presidential election campaigns. However, no such significant relationship was found. This may have been because the Korean audiences were more partisan than those in the U.S. swing states. Notably, the Korean presidential election of 2012 was regarded not only as a battle between the progressive (Moon) and the conservative (Park), but also between younger and older generations. Moon was supported more by progressive younger generations (20's, 30's, and 40's), Park more by the conservative older generations (50's and over 60's), as shown by the results of the exit poll. It was also observed that the composition of the audiences differed, with a predominance of the younger generation at Moon's rallies, and of the older generation at Park's rallies. Hence, given this greater partisanship, OAR may have been less indicative of votes cast.
It was therefore hypothesised that there would be a significant relationship between affiliative response rate and electoral success in the 2016 U.S. presidential election.

Duration of OAR as a Predictor of Voter Share
It was further proposed that the duration of these responses might be a superior indicator of audience enthusiasm than affiliative response rate, as it contains information not only about the occurrence of a response, but also about its length. Thereby it adds a second dimension where an audience can signal their support for a speaker.
In contrast, one drawback of affiliative response rate is that it may simply reflect the ability of a speaker to incorporate multiple rhetorical devices into their speech, rather than being indicative of audience enthusiasm as such.
Accordingly, it was hypothesised that the duration of audience responses would be a better predictor of electoral success than affiliative response rate alone.
Thus, in summary, despite recent and significant developments in the study of OAR, it is still a nascent field of research with yet-to-be explored considerations and only a handful of attempts at replication. To address some of these questions and replicate recent findings, two research questions and two hypotheses were set:

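Research Question 1: How did U.S. audiences respond in terms of unitary, composite and sequential responses towards speakers in 2016 U.S. Presidential Election campaign speeches?
Research Question 2: How did the duration of responses used by American audiences vary in 2016 U.S. Presidential Election campaign speeches?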
Hypothesis 1: There would be a significant, positive correlation between affiliative response rate and electoral success in the 2016 U.S. Presidential Election.

Hypothesis 2: The duration of audience responses would be a better predictor of electoral success than affiliative response rate alone.

Method

Data Corpus
As in Bull and Miskinis' (2015) study of 2012 U.S. presidential election campaign speeches, so-called "swing state" speeches from the 2016 presidential election campaign were selected for analysis. Swing states are those in which no campaign or party has historically overwhelming or consistent support (Mahtesian, 2016). In the American political system, it is the Electoral College that decides the winner rather than the nationwide popular vote, and in nearly all states whoever wins the state takes all of its electoral votes. As such, winning swing states is essential to winning the election; they are the states in which candidates and their respective parties can make the most significant gains over their opponents. Swing state speeches were then selected from the date of the candidates' respective nominations (Clinton, July 26th, 2016; Trump, July 19th, 2016) to Election Day (November 8th, 2016). Audiences attended meetings which required prior bookings, but did not require a declaration of party affiliation; in this regard they can be interpreted as unselected, but more restrictive to the general public than the 2012 presidential election campaign speeches that took place in venues allowing free attendance (Bull & Miskinis, 2015).

Materials
We analysed ten speeches in total: two per state; five by Hillary Clinton and five by Donald Trump. Where multiple speeches in a state were given by a candidate, a speech was chosen for its geographical and date proximity to the other candidate's selected speech. Some swing states such as Wisconsin had to be excluded from selection altogether, as Clinton, for example, did not campaign there. The speeches were delivered in indoor and outdoor locations (auditoriums, gymnasiums, halls, and parks) in the following swing states: Michigan, Pennsylvania, Florida, New Hampshire, and Colorado. Together they lasted a total of 6 hours 26 minutes and 37 seconds. The full list of ten speeches is given in Table 1. Video recordings and transcripts of the ten selected swing state speeches were available on C-SPAN, a U.S. cable and satellite network which televises public affairs. Transcripts were downloaded and amended in the course of the analysis if the coder encountered any discrepancies. A stopwatch was used to measure the duration of responses.

Procedure
Videos and transcripts of the ten speeches were downloaded from the Cable-Satellite Public Affairs Network (C-SPAN) website. Transcripts were checked against delivery from the video recordings to ensure an accurate verbatim record, with amendments made where appropriate. The collected videos were also compared with other recordings of the speeches on YouTube to confirm that the obtained videos contained the full duration of the speeches, and to ensure that there had been no editing after the event.
Qualitative content analysis was undertaken by the lead author in terms of three dimensions: 1) forms of response (e.g., cheers, applause or booing), 2) response type (unitary, composite and/or sequential), and 3) duration (total time from the onset of a collective response to its end). Responses on the first two dimensions were identified and marked on the transcripts, which had been imported into a word processing package, and were later collated into one coding system sheet for statistical analysis. Responses on the third dimension were marked separately on another document, using a stopwatch to measure the duration of collective responses from beginning to end.
An interrater reliability analysis (reported below) was also conducted using a sub-sample of speeches taken from the main study to check on the efficacy of the coding system used by the main coder.

The Criteria of the Coding System
Following Choi et al. (2016), audience responses were coded as unitary, composite or sequential. Composite responses were denoted with a "+", and sequential responses were denoted with a "→".
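As an illustration of how this notation maps onto the three response types, a minimal sketch is given below. The function name `classify_response` and the example annotations are hypothetical; in the study itself, transcripts were annotated manually, so this is only one way the "+" and "→" conventions could be read programmatically.

```python
def classify_response(notation: str) -> str:
    """Classify a transcript annotation as unitary, composite or sequential.

    Assumes the conventions described in the text: "→" marks a shift to a
    different response form and "+" marks co-occurring response forms.
    """
    if "→" in notation:
        # A shift to another form makes the whole audience turn sequential,
        # even when its first part is itself composite (e.g. "c + x → Chant:").
        return "sequential"
    if "+" in notation:
        return "composite"
    return "unitary"


# Hypothetical annotations, for illustration only.
examples = ["x", "c + x", "c + x → Chant: USA!", "h → x"]
labels = [classify_response(e) for e in examples]
```

The sequential check comes first because a sequential response may begin with a composite form.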
From a pilot study of two speeches utilising the rhetorical device categories as listed by Bull and Miskinis (2015), three new response forms were identified: cheering and applause, yes/no-verbal responses, and slogans. In Choi et al.'s (2016) coding system, applause and cheering co-occurring were always denoted as "applause + cheers".
However, it was found necessary to distinguish when applause or cheering preceded one another, as there were several incidents where applause preceded the onset of cheering by several seconds before they both co-occurred.
"Yes" and "No" were also previously coded under an umbrella term, "verbal responses"; however, it was found necessary to distinguish these responses from other verbal responses, as these responses occurred in response to the use of both implicit and explicit devices. This is demonstrated in Extract 1, where the audience replies "YES" to both the use of an implicit device (speaker: Lines 1-2; audience responds Line 3) and an explicit device (speaker: Line 7, audience responds Line 8). Other verbal responses typically occurred in response to explicit invitations; hence, they were recorded as "verbal others".
Note. Quotation marks indicate verbal responses by the audience. "ccccc+xxxx" represents cheering and applause (see Table 2 for more information).

Finally, the response category of slogans was introduced, whereby the audience repeated the words of the speaker in synchrony with the speaker.
There was no usage of disaffiliative responses towards the speakers noted in the pilot study, so only affiliative responses towards the speakers were coded in the main study.
Identified audience responses were then marked on the transcripts manually according to the following notation guidelines, based on those of Bull and Miskinis (2015) and Choi et al. (2016), as shown in Table 2. Isolated responses occurred so consistently throughout all speeches that they were treated as unscoreable, so were not annotated on the transcripts.

Response form | Description | Notation
Chanting | Single form | "Chant:" with the inclusion of the chanted message (e.g. "Lock her up!", "Hillary!")
5. Booing | Single form | "b"
6. Verbal response: yes/no | "Yes" or "No" | "VR: yes/no"
7. Verbal other | Verbal other (e.g. unison verbal remarks such as "that's right", "great") | "VR other:" with the inclusion of the response
8. Applause + cheers | Applause & cheers co-occurred, with applause occurring first | "x + c"
9. Laughter → various | Laughter moved to other forms | "h →" followed by any form of response 1-7
10. Cheers + applause | Cheers & applause co-occurred, with cheers occurring first | "c + x"
11. Cheers + applause → chanting | Cheers & applause co-occurred then moved to chanting | "c + x → Chant:" followed by the inclusion of the chanted message
12. Booing + various | Booing co-occurring with various forms | "b +" any form of response 1-7
13. Verbal response → various | Verbal response yes/no or others then moved to other forms | "VR: yes/no → " any form of response 1-7
 | Slogan then moved to other forms | "S:" any phrase or utterance iterated in time with the speaker → followed by any form of response 1-7

Duration of Replies
Specific parameters defined how long a response was recorded: a response was only to be recorded if several audience members were participating. In this way, a response could be distinguished as a collective audience behaviour from an isolated one. Extract 2 illustrates an example of an applause incident lasting 7.63 seconds in the context of these parameters: preceding the collective response was a lone clapper (2.80 seconds; not counted) then several audience members clapped (7.63 seconds; counted), and a lone clapper that continued after the cessation of several members clapping (1.41 seconds; not counted).
Responses that were not documented in Choi, Bull, and Reed's (2016) coding system were readily subsumed into the unitary, composite and sequential dimensions, such as booing (unitary). Composite responses made up a majority of responses towards speakers, with the predominant response within this category being the co-occurrence of cheering and applause. Sequential responses had the lowest occurrence at the category level.
The speakers differed particularly across several response forms. Trump, for example, evoked more incidences of booing, cheering, and cheering co-occurring with applause followed by chanting (as a sequential response). Clinton, notably, evoked no booing that co-occurred with other forms, but evoked more incidents of applause, and of applause co-occurring with cheering. A chi-square test of independence was conducted on the data shown in Table 3, which revealed a significant difference in the overall pattern of responses received by the two speakers, χ²(13) = 135.49, p < .001. This means that the type of response received depended on the speaker, and that there was a significant difference between Trump and Clinton in the pattern of responses, with Trump receiving more incidences of OAR than Clinton.
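For readers unfamiliar with the chi-square test of independence, the statistic can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `chi_square_independence` is hypothetical, the counts are invented for the example and are not the Table 3 data, and the p-value lookup is omitted.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts, e.g. one row per speaker
    and one column per response form.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if response form were independent of speaker.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts for 2 speakers x 3 response forms (NOT the study's data).
counts = [
    [30, 10, 20],
    [10, 30, 20],
]
stat, dof = chi_square_independence(counts)
```

With two speakers and 14 response forms, the degrees of freedom are (2 - 1)(14 - 1) = 13, which is consistent with the reported χ²(13).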

Duration of Audience Responses
The total duration of all 14 forms of response was calculated. Table 4 displays the relative proportions of each form of response by their total duration for each speaker, across the ten swing-state speeches. Laughter, cheering, booing, and "yes/no" verbal responses appear to be overrepresented when contrasted with incidence totals (see Table 3), and the sequential response "cheers and applause followed by chanting" and the composite response "cheers and applause" appear to be underrepresented. A chi-square test of independence then revealed that the patterns of responses by incidence total and by duration total were significantly different, χ²(13) = 240.249, p < .001. This suggests that when different types of response are tallied by their total duration, there is a different pattern of frequency compared with the pattern of frequency as tallied by their incidence. Another chi-square test of independence was also conducted on the data shown in Table 4, which revealed a significant difference in the overall pattern of responses received by the two speakers as tallied by their duration, χ²(13) = 1060.63, p < .001. This means that the type of response received as tallied by its total duration also depended on the speaker, and that there was a significant difference between Trump and Clinton in the pattern of response, with Trump receiving more OAR than Clinton.
Response forms accounting for a minimum of 5% of all responses were then selected for further analyses. As the data were not normally distributed and there were unequal sample sizes between the conditions, nonparametric tests were used. A Kruskal-Wallis H-test revealed a significant effect of response type on duration, H(5) = 397.74, p < .001. This finding suggests that the different response types coded differed significantly from each other in terms of duration.
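The Kruskal-Wallis H statistic compares groups on the ranks of the pooled observations rather than on their raw values, which is why it suits non-normal durations with unequal group sizes. A minimal sketch follows; the function name `kruskal_wallis_h` and the durations are hypothetical, the standard tie correction is assumed, and the p-value lookup is omitted.

```python
def kruskal_wallis_h(groups):
    """Return (H statistic, degrees of freedom) for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign average ranks so that tied values share the same rank.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this group of ties
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    h = 0.0
    for group in groups:
        rank_sum = sum(rank_of[v] for v in group)
        h += rank_sum ** 2 / len(group)
    h = 12 / (n * (n + 1)) * h - 3 * (n + 1)

    # Tie correction; this would be zero only if every value were identical.
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction, len(groups) - 1


# Hypothetical durations in seconds for three response forms (NOT study data).
durations = [[1.2, 1.9, 2.4], [3.1, 4.0, 4.8], [6.5, 7.2, 8.9]]
h_stat, h_dof = kruskal_wallis_h(durations)
```

With six response forms the degrees of freedom are 6 - 1 = 5, consistent with the reported H(5).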
Dunn-Bonferroni pairwise comparisons were then conducted on the six response forms, revealing that sequential response forms were all significantly longer than unitary response forms, as were composite response forms except for applause. Notably, the composite response "applause + cheers" was significantly longer than the composite response "cheers + applause", though there was no significant difference between this response and the sequential response "cheers + applause → chanting". A full summary of the Dunn-Bonferroni pairwise comparisons between the six response forms is given in

Electoral Success
The affiliative response rate (number of responses per minute) and total audience response times are shown in Table 6 and Table 7, respectively. As speeches varied in length, the total audience response time was then calculated as a proportion of overall speech time. The affiliative response rate and total audience response time were then correlated with election results (percentage of votes received by each speaker) for the ten swing-state speeches.
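The two measures can be written out explicitly. The sketch below is illustrative only: the function names and the example speech are hypothetical, and the durations are not taken from the study's data.

```python
def affiliative_response_rate(n_responses: int, speech_seconds: float) -> float:
    """Number of affiliative responses per minute of speech."""
    return n_responses / (speech_seconds / 60)


def response_time_proportion(durations, speech_seconds: float) -> float:
    """Total collective response time as a proportion of overall speech time."""
    return sum(durations) / speech_seconds


# Hypothetical speech: 30 minutes long with four collective responses.
speech_seconds = 30 * 60
durations = [7.63, 4.2, 12.0, 3.17]  # seconds; illustrative values only
rate = affiliative_response_rate(len(durations), speech_seconds)
proportion = response_time_proportion(durations, speech_seconds)
```

Dividing by speech length in both cases keeps speeches of different lengths comparable; only the second measure is sensitive to how long each response lasted.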
Journal of Social and Political Psychology 2020, Vol. 8(1), 368-387 https://doi.org/10.5964/jspp.v8i1.953
Remarkably, affiliative response rate and total audience response time were greater for a state's winning candidate 4 out of 5 times. As the data were non-normal, Kendall's Tau correlations were used to test the relationships between affiliative response rate, total audience response time, and voter share.
Contrary to our first hypothesis, Kendall's Tau correlation revealed a non-significant negative relationship between affiliative response rate and electoral success at the state level, rτ = -.20, p = .210, 1-tailed.
In line with our second hypothesis, Kendall's Tau correlation revealed a significant positive relationship between the total audience response time as a proportion of overall speech time and electoral success at the state level, rτ = .73, p < .01, 1-tailed.
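Kendall's Tau is a rank correlation based on counting concordant and discordant pairs of observations. A minimal sketch is given below. It assumes the tau-b variant, which adjusts for ties; the text does not state which variant was used. The function name `kendall_tau_b` and the data are hypothetical, and the 1-tailed significance test is omitted.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Return Kendall's tau-b for two equal-length sequences."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied on both variables: not counted
            elif dx == 0:
                ties_x += 1       # tied on x only
            elif dy == 0:
                ties_y += 1       # tied on y only
            elif dx * dy > 0:
                concordant += 1   # both variables order the pair the same way
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt((untied + ties_x) * (untied + ties_y))


# Hypothetical values (NOT the study's data): response time as a proportion
# of speech time, and percentage vote share, for five speeches.
response_proportion = [0.10, 0.12, 0.08, 0.15, 0.11]
vote_share = [47.5, 48.2, 46.1, 49.0, 47.0]
tau = kendall_tau_b(response_proportion, vote_share)
```

In this invented example nine of the ten pairs are concordant and one is discordant, giving a tau of 0.8.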

Discussion
The results show that unitary, composite and sequential audience responses all occur in American political speeches, as in Korean political speeches (Choi et al., 2016), and moreover responses not observed in Korean speeches were readily subsumed into these response categories. Thereby, the advanced coding system introduced by Choi et al. (2016) was readily adaptable to the context of U.S. presidential campaign speeches, and is recommended as a template for coding OAR in political speech contexts henceforth. It was also found that the overall pattern of response received by the speakers was significantly different, with Trump receiving more responses than Clinton. Moreover, this study was the first to measure the duration of OAR in the context of an election campaign, and it was found that the overall pattern of response by duration total was significantly different to the pattern of response by incidence total. As the duration of a response was considered to be a more advanced measure of OAR than its incidence count, it is recommended that future researchers in this field adopt this measure too. The researchers also found that response type affected the duration of a response, suggesting that audiences might use different responses (and subsequent lengths) to control the force of reply. The first hypothesis that there would be a significant, positive correlation between affiliative response rate and voter share in swing states was rejected, and so this study did not replicate the findings of Bull and Miskinis (2015), who found such a relationship from their study of 2012 U.S. Presidential Election campaign speeches. The second hypothesis that there would be a stronger, positive correlation between the duration of OAR (calculated as a proportion of speech time) and voter share, however, was supported. 
Thus, the study of OAR in political speech contexts (such as those studied in this paper) could potentially provide insight into emergent group attitudes towards candidates.

Characteristic Features of OAR
From these results, it can be seen that unitary, composite and sequential audience response forms all occur in U.S. political speeches, as in Korean political speeches (Choi et al., 2016). Composite, unitary, and sequential responses accounted for 52.56%, 39.32%, and 8.12% of all response forms across both speakers, respectively.
The most frequent response for speakers Trump and Clinton was the composite response of cheering and applause, which accounted for 44.67% and 50.80% of all responses, respectively. Bull and Miskinis (2015), however, found that applause evoked by speakers Obama and Romney in their 2012 U.S. Presidential campaign speeches accounted for substantially less (12.94% and 2.24%, respectively). This finding is likely due to the simpler coding system used by Bull and Miskinis, which only recorded the predominant form of response. It is then possible that applause occurring at the same time as other responses may not have been as audibly predominant, or that it took longer to initiate than other response forms; hence, it can be seen as substantially underrepresented when compared to the findings of this study. Future studies on political oratory then would be wise to characterise audience responses using this study's coding system to better reflect the range of responses with which an audience may reply to a speaker. U.S. audiences analysed in this study replied to speakers with sequential response forms 8.12% of the time, which is notably lower than what was reported of Korean audiences (34.74%) by Choi et al. (2016). This difference could potentially be interpreted in the context of Hofstede's distinction between individualist and collectivist societies (e.g. Hofstede, Hofstede, & Minkov, 2010). As an individualist culture, the U.S. allows for greater freedom in responding to a speaker, such as through the use of disaffiliative and isolated responses (Bull & Miskinis, 2015), whereas cultures such as Japan and Korea typically use affiliative and collectivist responses which are seen to sustain group harmony as the preponderant types of response (Bull & Feldman, 2011; Choi et al., 2016). Notably, composite responses only made up 21.34% of responses in Korean presidential campaign speeches (Choi et al., 2016), whereas in this study composite responses made up 52.56% of responses.
Collectivist societies might be less inclined towards using composite responses as they incorporate several co-occurring responses, such that they might be seen as disruptive to group harmony as they are not coordinated, hence they may be avoided. As an individualist society, U.S. audiences may make greater use of composite responses as the use of several cooccurring response forms which may not regarded so much as being discordant, rather as showing a greater appreciation by each individual towards the speaker.
On the other hand, this difference might simply reflect the kind of rhetoric used by speakers. Although Choi et al.'s (2016) study did not report on the usage of explicit and implicit devices used by speakers, the highest response form was that of verbal responses (37.39% responded with yes, no, or other). These might arguably be prompted more by explicit questions to the audiences rather than by implicit devices embedded in speech. A future study might better answer this question by investigating cross-cultural differences in audience-speaker behaviour in such speech contexts with a uniform coding system, as well as modelling the relationship between devices employed by speakers and the responses they generate. Such a study might elucidate information that helps to identify cross-cultural differences (and indeed, similarities) between speakers in campaign speech contexts, as well as cross-cultural norms and differences in audience behaviour.
This study also noted a departure from previous work (particularly that of Bull & Miskinis, 2015) in that no disaffiliative responses were recorded in either a pilot study, or indeed the main study. This might be explained by the arguably more self-selecting nature of the audience. Whilst the researchers noted that events did not require party affiliation, they did require prior bookings. In this regard the speech contexts could be seen as different from those of the 2012 U.S. presidential speeches which were delivered in informal locations, such that any member of the public could participate, whereas the requirement to book in advance might possibly be a disincentive for the politically undecided. If so, the phrase "rally of the faithful" would be more applicable than ever, making dissenting behaviours such as disaffiliative responses less likely than before.
Finally, it was also observed that the overall patterns of responses received by the two speakers were significantly different from each other, with Trump receiving more incidences of OAR than Clinton. The fact that Trump received more incidences of OAR might in itself suggest that Trump was more popular than Clinton, though this would assume that the speeches were of the same length, which they were not; in each state Trump spoke for longer than Clinton. The affiliative response rate accounts for this variable, though it did not correlate with voter share, and so more incidences of OAR do not appear to reflect popularity. Nevertheless, the finding is a notable contrast to Bull and Miskinis' (2015) study, which noted that the overall patterns of responses received by speakers in the 2012 U.S. Presidential Election campaign were in fact extraordinarily similar. Their finding could potentially be explained by the comparable oratory stylings of the speakers; the overall pattern of rhetorical devices used by the speakers was also extraordinarily similar. This study did not analyse the rhetorical devices used by the speakers and so no similar comparison can be made, though it is worth noting that in the 2016 Republican primary debates Donald Trump diverged from the other speakers in that he received less applause and more laughter (Stewart, Eubanks, & Miller, 2016). A more direct comparison of the pattern of rhetorical devices used and the pattern of OAR received by the speakers may then be an interesting avenue for future research; such a comparison would not only elucidate the extent of the difference between candidates' oratory stylings, but may also reveal correlations between certain rhetorical devices and response types.

Duration of OAR
From the results it can be seen that the pattern of responses by incidence total and duration total were significantly different from each other, suggesting that there is a different pattern of frequency between response types when they are tallied by their total duration as opposed to their incidence total. This is a novel finding as it suggests that certain response forms may be over- or underrepresented by the measure of incidence. For example, the sequential response "cheers + applause → chanting" made up 2.90% of total incidences from both candidates, but 8.66% of the duration total. Arguably, the measure of duration is more advanced given that it records not only the occurrence of a response itself but its duration too, which is potentially another barometer for measuring audience enthusiasm given that there was also a strong correlation between duration of OAR and voter share on polling day.
Future research could then benefit from continuing to measure OAR across this dimension if it is considered to be an advanced metric that produces a different pattern of results than incidence count.
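The contrast between tallying by incidence and tallying by duration can be illustrated with a minimal sketch. The function name `shares` and the records below are assumptions introduced purely for illustration; the durations are made up and are not data from this study.

```python
from collections import defaultdict


def shares(responses):
    """Return {response_type: (incidence_share, duration_share)} in percent.

    `responses` is a list of (response_type, duration_in_seconds) pairs.
    """
    if not responses:
        raise ValueError("at least one response is required")
    counts = defaultdict(int)
    durations = defaultdict(float)
    for response_type, seconds in responses:
        counts[response_type] += 1
        durations[response_type] += seconds
    total_count = sum(counts.values())
    total_duration = sum(durations.values())
    return {
        t: (100 * counts[t] / total_count, 100 * durations[t] / total_duration)
        for t in counts
    }


# Made-up records for illustration only; NOT data from this study.
example = [
    ("cheers + applause", 6.0),
    ("cheers + applause", 5.0),
    ("applause", 4.0),
    ("cheers + applause -> chanting", 15.0),
]
result = shares(example)
print(result["cheers + applause -> chanting"])  # -> (25.0, 50.0)
```

In this hypothetical sample the sequential response accounts for a quarter of incidences but half of the total response time, which is the kind of divergence between the two tallies described above.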
Additionally, the results showed that the overall patterns of responses received by the two speakers differed significantly when tallied by duration, with Trump receiving longer OAR than Clinton. Given that there was also a significant, positive correlation between the proportion of speech time occupied by OAR and voter share on election day, there may be some scope to argue that longer incidences of OAR may be reflective of candidate popularity whereas the incidence in and of itself is not. This would then add further weight to the suggestion that future research would stand to benefit from measuring the duration of OAR as opposed to just the incidence.
The results also show that the form of response used significantly affected its duration. It could be inferred from this that audiences might select different forms of response as a means to control the force of their reply. For example, a posteriori tests revealed that composite and sequential response forms were all significantly longer than unitary response forms, except for applause. By virtue of containing multiple responses within one turn, composite and sequential responses could be considered as being more reactive than unitary responses, and this is reflected by the fact that they last longer. Thus, audiences may use several responses within a turn as a means to show greater appreciation towards a speaker. However, this appears to be contradicted by the fact that applause was significantly longer than other unitary forms, and did not differ either from the composite response "cheering and applause", or the sequential response of "cheering and applause followed by chanting". One explanation for this finding might be due to the nature of the responses: cheering and booing are both vocal responses, whereas applause is a non-vocal response. As such, cheering and booing may be similar to speech in that they are all vocal, and an audience that intends to use these forms may wait until the speaker has finished their turn so as not to interrupt. Applause might then not be seen as interruptive to the speaker by virtue of it being non-vocal, and so applause incidents may have a larger window of time in which they can occur, if they typically begin before the end of a speaker's turn. However, further research is needed to test this theory, as this study did not record where responses began in relation to the end of the speaker's turn.
Another notable finding was that the duration of the composite response "applause and cheering" (wherein applause preceded the cheering) was significantly longer than the composite reply "cheering and applause" (wherein cheering preceded applause). This justifies the distinction made in terms of the order in which forms of response first occur, as it suggests that despite both responses being comprised of cheering and applause, the audience may be altering their order to further control the impact of their response.
Future research might take these novel findings further by comparing OAR (and their duration) with the rhetorical devices used to prompt them. One expected finding based on the results of this study might be that the longer forms of response such as composite and sequential responses might tend to be evoked by the combination of several rhetorical devices delivered by the speakers.

Audience Responses as a Predictor of Electoral Success
Our first hypothesis, according to which affiliative audience response rate would be predictive of electoral success, was not supported. Notably, this represents a failure to replicate the findings of Bull and Miskinis (2015), who found a significant positive correlation between affiliative response rate and electoral success in the 2012 U.S. Presidential Election. However, the second hypothesis, that there would be a stronger significant positive correlation between the total audience response time and voter share than between affiliative response rate and voter share, was supported. These contrasting findings suggest that the total audience response time is a superior predictor of electoral success, compared to affiliative response rate. Not only did the former correlation reach significance when the latter did not, but the former correlation was also greater (r = +.73) than the significant correlation between affiliative response rate and electoral success found by Bull and Miskinis in the 2012 U.S. presidential election (r = +.67).
An explanation for these contrasting findings could be that the duration of a response contains more information about genuine enthusiasm towards a speaker; it might be that how long an audience responds is proportional to their support for the speaker. This information is unaccounted for by affiliative response rate, as it only records the occurrence of a response, not its duration. Thus, the proposed relationship between affiliative response rate and electoral success is predicated on the assumption that the more responses a speaker receives, the more enthusiastic the audience is towards them. Indeed, whilst Trump had a higher mean response rate than Clinton across the sample of speeches, a greater proportion of total speech time was spent by the audiences responding to Clinton. There is then a possibility that a high affiliative response rate is more reflective of the speaker's ability to incorporate several rhetorical devices that prompt responses from the audience, rather than reflecting genuine enthusiasm. Hence, the duration of a response might be controlling for this potential confound, a view which is supported by the results of this study demonstrating that when the duration of all responses was combined (the total audience response time) and calculated as a proportion of overall speech time, there was a stronger positive correlation with voter share. If the duration of OAR is considered to be a more robust metric of audience enthusiasm than affiliative response rate, then the results do not necessarily contradict Bull and Miskinis' (2015) conclusions that OAR might serve as a proxy for public opinion intensity and subsequent voting intention on election day. Rather, they refine and strengthen these conclusions.
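The difference between the two metrics can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the two hypothetical speakers are introduced here for the example, and the numbers are made up rather than taken from the speeches analysed.

```python
def response_rate(n_responses, speech_seconds):
    """Responses per minute of speech (counts occurrences only)."""
    return n_responses / (speech_seconds / 60)


def response_time_proportion(response_durations, speech_seconds):
    """Proportion of total speech time occupied by audience responses."""
    return sum(response_durations) / speech_seconds


# Hypothetical speakers, each giving a 10-minute (600 s) speech.
# Speaker A: many short responses; Speaker B: fewer but longer responses.
speaker_a = {"speech_seconds": 600, "durations": [3.0] * 30}
speaker_b = {"speech_seconds": 600, "durations": [8.0] * 15}

rate_a = response_rate(len(speaker_a["durations"]), speaker_a["speech_seconds"])
rate_b = response_rate(len(speaker_b["durations"]), speaker_b["speech_seconds"])
prop_a = response_time_proportion(speaker_a["durations"], speaker_a["speech_seconds"])
prop_b = response_time_proportion(speaker_b["durations"], speaker_b["speech_seconds"])

# A leads on rate, B leads on the proportion of time spent responding.
print(rate_a > rate_b, prop_a < prop_b)  # -> True True
```

In this made-up case the two metrics rank the speakers in opposite order, which is why a measure based on occurrence alone and a measure based on duration need not lead to the same conclusion.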
There are, however, some considerations that limit the strength of the argument to be made for OAR as a corollary of voter intent, let alone a viable alternative to measures such as polling and dial testing. Firstly, the somewhat self-selecting audiences of the 2016 speeches might be seen as somewhat contradictory to the notion that OAR could reveal insight into state-wide voting intent. The 2016 speeches were markedly different from the 2012 speeches in that they required pre-bookings, which might be a disincentive to swing voters who could otherwise arrive freely at the event. It is somewhat perplexing that OAR in these so-called "rallies of the faithful" could be informative of state-wide voter participation when the samples are not necessarily reflective of the politically undecided who ultimately decide the winner. On the other hand, there is evidence to support the view that intra-audience effects such as crowd noise and behaviours can facilitate a greater sense of excitement and increased immersion in the televised spectation of sports (Cummins & Gong, 2017). Furthermore, these speeches are televised and reported on by the mass media, and so there is potentially a case to be made that OAR could be influencing voter intention of television viewers too.
There are other confounds that limit the strength of the argument regarding the relationship between OAR and voter intention, such as proximity. In this study, the speeches in a given state were mostly delivered within days of each other, although it is interesting to note that in Michigan (where Trump's speech was delivered nearly two months before Clinton's), Trump won despite having a lower duration of OAR. Future studies could compare multiple speeches in a state delivered by a candidate, if applicable, and observe whether the proximity of the speech to polling day affects the relationship with electoral success, and additionally investigate the relationship with pre-election polling too. Another consideration is the geographical proximity of speeches; the states in themselves are very large, with Colorado and Michigan covering a greater area than the United Kingdom. It is not unreasonable to point out that voting intention or enthusiasm varies not just at a state level but also at a county level, or even a city/rural level too. If possible, a novel design might account for these differences by isolating campaign speeches within states at either of these levels, and comparing the relationship between duration of OAR and corresponding voter share on polling day.
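A comparison of this kind, whether at state, county or city level, could be quantified along the following lines. This sketch assumes a Pearson product-moment correlation (the r values reported above suggest this, but the choice is an assumption here), and the paired values are hypothetical placeholders rather than results from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, one pair per speech; NOT data from this study.
# x: proportion of speech time occupied by OAR; y: voter share (%) in that area.
oar_proportion = [0.10, 0.15, 0.20, 0.25]
voter_share = [44.0, 47.0, 46.0, 51.0]

r = pearson_r(oar_proportion, voter_share)
print(round(r, 2))
```

With so few observations any such coefficient would be unstable, so a design at county or city level would need enough speeches per unit for the estimate to be meaningful.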

Limitations
One improvement that the researchers would like to suggest for future study of OAR is to improve the precision of time measurement through the use of content analysis software (which can control recordings at the millisecond level). Although Vicente-Rodríguez et al. (2011) argue that manual measurements with a stopwatch by a "trained rater" show only minimal differences in time estimation compared with gold standard measurement tools, their account of what experience makes a rater "trained" was vague, stipulating only that prior experience with making manual measurements would be sufficient. The volume of content analysed in this study would likely have made the present researchers "trained raters" by this criterion, but for precision's sake future research would benefit from utilising such software.
Vicente-Rodríguez et al. (2011) also provide a methodology for testing inter-rater reliability between raters at the time measurement level which, if adopted, would improve the reliability of results as well.

Conclusion
In conclusion, a microanalysis was performed in this study of OAR during campaign speeches made in the 2016 U.S. presidential election. Composite, unitary and sequential response forms were all observed as in Choi et al.'s (2016) analysis of Korean campaign speeches, although notably American audiences made much greater use of composite responses than sequential responses. This was interpreted in terms of Hofstede's distinction between individualist and collectivist societies (e.g., Hofstede et al., 2010). However, in contrast to Bull and Miskinis' (2015) study of OAR in the context of the 2012 U.S. presidential campaign speeches, there were no disaffiliative responses towards the speakers, and there was also a significant difference in the overall pattern of responses received by speakers (favouring Trump), which was not found in their study. Notably, this study was the first to characterise audience responses in terms of their duration, and observed that there was a significant difference in the overall pattern of responses received by speakers when tallied by duration total, compared with incidence total. If the duration of OAR is interpreted to be a more advanced measure than simply incidence total, then it is recommended that future researchers adopt this measure within the field. Significant differences were also found in mean length between several forms of response, suggesting that audiences may utilise different forms in order to modulate their impact. It was suggested that future research could investigate the relationship between rhetorical devices embedded in the speech that prompt OAR, which might elucidate further how speakers deploy certain devices (or indeed, combinations of them) to whip up support amongst their audiences. 
This study failed to replicate Bull and Miskinis' (2015) finding of a significant relationship between affiliative response rate and electoral success, although a significant relationship was observed when the duration of response was taken into account.
This finding was subsequently considered to support and build upon the suggestion that OAR might be considered to be an indicator of voter enthusiasm and intent, but that future research would be needed to control for potential confounds that might have influenced the supposed relationship with electoral success.