Oratory has always been an important form of political communication. Its study dates back to the times of classical civilisations, in ancient Rome through the writings of Cicero (55BCE/2001) and Quintilian (c.95CE/2015), in ancient Greece through the writings of Aristotle (4th century BCE/2006). In the modern era, significant insights have been gained into how politicians interact with live audiences through the detailed microanalysis of video and audio recordings, focussed especially on rhetorical techniques used by politicians to invite applause (e.g., Atkinson, 1984). Whilst oratory has traditionally been regarded as monologic, these studies show how political speeches can be conceptualized as a form of dialogic interaction between speaker and audience, akin to the way in which people take turns in conversation (Atkinson, 1984).
The aim of this paper is to review this research, with a view to formulating a new theoretical model of how speakers interact with audiences in set-piece political speeches, based on the concept of dialogue. To identify relevant articles, a literature search was conducted, utilising the electronic databases Google Scholar, Web of Science and PsycINFO; the search terms were rhetorical devices and applause in political speeches, and booing in political speeches. In conducting this review, consideration is given to other audience responses besides applause, namely, laughter, cheering, chanting, and booing. Consideration is given also to other factors that affect speaker-audience interaction besides rhetorical devices, in particular, delivery, speech content, and uninvited applause. Although this review is based primarily on studies of British politicians, it also includes recent research on political speeches delivered in both Japan and the USA. This cross-cultural perspective, it is proposed, provides significant insights into the role of political rhetoric in speaker-audience interaction, which may be usefully conceptualized in terms of broader cross-cultural differences between collectivist and individualist societies (e.g., Hofstede, 2001; Hofstede, Hofstede, & Minkov, 2010).
In the first section of the paper, a description is given of the seminal work carried out by Atkinson (e.g., 1984) on rhetorical devices used to invite applause (based on a review by Bull and Feldman, 2011). These analyses have proved remarkably enduring, and provide some compelling insights into the stage management of political speeches. However, Atkinson’s original studies were published in the 1980s, and since then, a great deal of further research has been conducted. These studies are reviewed in the second section of this paper, which is focussed on a number of factors that affect speaker-audience interaction in set-piece political speeches. In the third and final section of the paper, a new model is proposed of speaker-audience interaction in political oratory, based on the concept of dialogue between speakers and audience.
Claptraps: Techniques for Inviting Applause
Atkinson’s (e.g., 1984) key insight was to compare political speech-making with how people take turns in conversation, where for example the end of a list can signal the end of an utterance – a point at which another person can or should take over the speaking turn (Jefferson, 1990). Such lists in conversation typically consist of three items, so that once the listener recognizes that a list is under way, it is possible to anticipate when the speaker is about to complete the utterance, referred to as a completion point (Jefferson, 1990).
In the context of political speeches, the three-part list may signal to the audience not when to start talking but when to applaud, as in the following example from a speech by Tony Blair (at that time Leader of the British Labour Opposition): “Ask me my three main priorities for government, and I tell you: education, education and education” (Labour Party Conference, October 1, 1996). In this example, the word “and” coming before the third and final mention of “education” acts as a signal that Blair is about to reach a completion point; the audience then responded with tumultuous applause. Thus, just as conversationalists take it in turn to speak, so speaker and audience may also take turns, although audience “turns” are essentially limited to gross displays of approval or disapproval (such as applause, cheering or booing).
Another device identified by Atkinson (e.g., 1984) is the contrast, which juxtaposes a word, phrase or sentence with its opposite. Thus, “need” and “wealth” are contrasted in the following example from Tony Blair (at that time British Labour Prime Minister) (Labour Party Conference, September 30, 1997): “I will never countenance an NHS [National Health Service] that departs from its fundamental principle of health care based on need not wealth” (square brackets are used here and subsequently to provide contextual information to clarify the quotations).
The contrast operates in a slightly different way from the three-part list. Essentially, it comprises a word, phrase or sentence that is followed by a word, phrase, or sentence with the opposite meaning. To be effective, the second part of the contrast should closely resemble the first in the details of its construction and duration, so that the audience can more easily anticipate the point of completion. If the contrast is too brief, people may have insufficient time to recognize that a completion point is about to be reached, let alone to produce an appropriate response. According to Atkinson (1984), the contrast is by far the most frequently used device for inviting applause. He also proposed that the skilled use of both contrasts and three-part lists is characteristic of “charismatic” speakers (Atkinson, 1984, pp. 86-123), and that such devices are often to be found in those passages of political speeches that are selected for presentation in the news media (Atkinson, 1984, pp. 124-163).
An important feature of both the three-part list and the contrast is that the speaker does not explicitly ask the audience to applaud. For example, the speaker does not say “I am asking you for support”, or even “Please put your hands together to give a round of applause”. Rather, the invitation is implicit in the structure of speech: it is embedded in the construction of talk itself, which indicates to the audience when applause is appropriate.
How these features work can be seen most clearly in ritualized messages, such as introductions and commendations, which involve what Atkinson (1984) calls naming. In inviting the audience to show their appreciation for a particular individual, the speaker may start by giving some clues to the person’s identity, then continue with some appreciative comments, and finally reveal the person’s name. To help the whole process along, the speaker may even pause a little just before the actual naming. The audience is thus given ample time to realize that applause is expected and to anticipate who is to be identified, so that they are fully prepared when the name is finally announced (Atkinson, 1984, pp. 49-57). Namings may often be combined with gratitude, in which the speaker thanks a named person in the audience.
Another five rhetorical devices for inviting applause were identified by Heritage and Greatbatch (1986); these were termed puzzle-solution, headline-punchline, position taking, combination, and pursuit. In a puzzle-solution device, the speaker begins by establishing some kind of puzzle or problem, and then, shortly afterwards, offers the solution – the important and applaudable part of the message. The headline-punchline device is structurally similar to the puzzle-solution, although somewhat simpler. Here, the speaker proposes to make a declaration, pledge or announcement and then proceeds to make it. Thus, the speaker might use headline phrases such as “I’ll tell you what makes it worthwhile...”, “And I’ll say why...”. The applaudable part of the message is emphasised by the speaker’s calling attention in advance to what s/he is about to say. In a position taking, the speaker first describes a state of affairs towards which s/he could be expected to take a strongly evaluative stance. The description itself contains little or no evaluation. However, at the end of the description, the speaker overtly and unequivocally either praises or condemns the state of affairs described. All these devices may be combined with one another, with the result that the completion point of the message is further emphasised (combination). If an audience fails to respond to a particular message, speakers may actively pursue applause (pursuit).
A number of these devices can be illustrated in the following extract from a speech delivered by Ed Miliband (at that time Leader of the British Labour Opposition) to the Labour Party annual conference (Liverpool, 27 September, 2011). Early in the speech, Miliband said: “Ask me the three most important things I’ve done this year and I’ll tell you; being at the birth of my second son, Sam.” Thus, Miliband set a puzzle (“Ask me the three most important things I’ve done this year”), and then a headline (“I’ll tell you”); this is followed by the solution/punchline (“being at the birth of my second son, Sam”), which should be the applaudable part of the message. But the audience did not applaud; instead, there was a pause. Presumably, the audience was still waiting for items two and three, because Miliband had said “Ask me the three most important things”. Miliband then nodded his head during the pause, and the audience applauded. Arguably, the head nod might be understood as a nonverbal form of pursuit, indicating to the audience that Miliband was inviting applause at this point (Bull, 2015). This extract can also be seen as an example of a combination, because Miliband combines several devices together (headline/punchline, puzzle/solution, three-part list, pursuit).
The same speech also contains the following example of a position taking, combined with a three-part list and a contrast: “(1) You need to know there is an alternative, (2) you need to know it is credible, (3) so people need to know where I stand. (A) The Labour Party lost trust on the economy. (B) I am determined we restore your trust in us on the economy.” The elements of the three-part list are indicated 1, 2 and 3, the two elements of the contrast A and B (“lost trust” is contrasted with “restore your trust”). The state of affairs that Miliband describes is that “The Labour Party lost trust on the economy”. The position taking is “I am determined we restore your trust in us on the economy” (Bull, 2015).
Two further devices have been identified by Bull and Wells (2002). They argued for the inclusion of jokes, since jokes often receive applause as well as laughter, and also for another device which they termed negative naming. Whereas in naming, the audience are invited to show their appreciation for a particular individual (e.g., Atkinson, 1984), in negative naming, the audience are invited to applaud the abuse or ridicule of a named person. Typically, this is a politician of an opposing political party, although negative naming may also be used to castigate a social group, such as another political party. So, for example, Gordon Brown (British Labour Prime Minister 2007-2010) received tumultuous applause for his denunciation of the far right-wing British National Party: “And we will back you in the second task you’ve taken on: to ensure there is no place for the British National Party in the democratic politics of our country” (speech to the Labour Party Conference, 29 September, 2009).
Atkinson’s (e.g., 1984) original observations have made an enormous contribution to our understanding of political rhetoric. In summary, his key theoretical insight was the analogy between audience applause and conversational turn-taking. Just as people take turns in conversation by anticipating when the speaker will reach the end of an utterance (e.g., Duncan & Fiske, 1985; Walker, 1982), so audience members are able to anticipate when the speaker will reach a completion point through rhetorical devices embedded in the structure of talk. This enables them to applaud at appropriate moments, and is reflected in the close synchronization between speech and applause.
However, given that Atkinson’s (e.g., 1984) research was based on the analysis of selected extracts, it is possible that his examples are not necessarily representative of political speech-making as a whole. The only effective answer to this criticism is comprehensive sampling. This was the intention of Heritage and Greatbatch (1986), who analysed all the 476 speeches televised from the British Conservative, Labour and Liberal Party conferences in 1981. They found that contrasts were associated with no less than 33.2% of the incidents of collective applause during speeches, lists with 12.6%; hence, almost half the collective applause was associated with the two rhetorical devices originally identified by Atkinson.
Heritage and Greatbatch (1986) also analysed collective applause in response to all seven rhetorical devices outlined above (contrasts, lists, puzzle-solutions, headline-punchlines, position takings, combinations, and pursuits). They found that over two-thirds of collective applause (68%) was associated with these seven rhetorical devices. Most effective were contrasts and lists, the two devices originally identified by Atkinson as significant in evoking applause. Thus, the results of this comprehensive survey provided impressive support for Atkinson’s original observations.
However, if over two-thirds of collective applause occurs in response to the seven rhetorical devices, what about the other one-third of applause incidents? Furthermore, audiences produce other responses than applause; they may, for example, laugh, cheer, chant, and even boo. Thus, for a truly comprehensive analysis of speaker-audience interaction in political speeches, all these other audience responses need to be considered, including isolated as well as collective responses, uninvited as well as invited responses. Notably, although Atkinson (e.g., 1984) gave pride of place in his analysis to the role of rhetorical devices, he also discussed the role of speech content and speech delivery (nonverbal and vocal cues used to accompany rhetorical devices). The relative importance of all these factors needs to be considered in any model of speaker-audience interaction; the factors are reviewed below.
In addition, consideration is given to cross-cultural differences in speaker-audience interaction. Two studies of political speeches have been conducted in Japan, based on 36 speeches from the general election of 2005 (Bull & Feldman, 2011), and 38 speeches from the general election of 2009 (Feldman & Bull, 2012). In addition, a study also has been conducted of the United States presidential election of 2012, based on 11 speeches from the Democrat and Republican candidates, Barack Obama and Mitt Romney (Bull & Miskinis, 2015). On the basis of all these studies, an analysis of cross-cultural differences between British, US and Japanese speeches is proposed in terms of differences between collectivist and individualist societies, following Hofstede (e.g., Hofstede, 2001; Hofstede et al., 2010).
Factors That Affect Speaker-Audience Interaction
Delivery
The delivery of a speech can refer to various forms of body language, such as the use of hand gesture, posture, gaze and facial expression. It can also include the vocal delivery of a speech, for example, tone of voice, loudness, pitch and speech rate. It has long been recognized that delivery plays an important role in oratory. In ancient Rome, the treatises of both Cicero (55BCE/2001) and Quintilian (c.95CE/2015) included a number of observations on the use of gesture. Notably, the term “gestus” for Quintilian referred not only to actions of the hands and arms, but also to movements of the rest of the body.
But can we be more specific about the role of delivery in speech-making? Atkinson (1984, p. 84) argued that audiences are much more likely to applaud if a rhetorical device is accompanied by appropriate delivery. Heritage and Greatbatch (1986) coded a sample of speeches formulated in one of the seven basic rhetorical devices in terms of the degree of “stress”. Stress was evaluated in terms of whether the speaker was gazing at the audience at or near the completion point of the message, whether the message was delivered more loudly than surrounding speech passages, or with greater pitch or stress variation, or with some kind of rhythmic shift or accompanied by the use of gestures. In the absence of any of these features, the message was coded “no stress”. One of these features was treated as sufficient for a coding of “intermediate stress”, while the presence of two or more features was categorized as “full stress”. Over a half of the “fully stressed” messages were applauded, whereas only a quarter of the “intermediate” messages attracted a similar response, and this figure fell to less than 5% in the case of the “unstressed” messages. Thus, Heritage and Greatbatch supported Atkinson’s view that delivery increases the chance of a rhetorical device receiving applause (see also Bull, 1986).
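The three-level stress coding described above can be sketched as a simple counting rule. This is only an illustrative sketch, not Heritage and Greatbatch's own procedure: the five boolean field names are hypothetical labels for the delivery features listed in the text, and treating them as five separately counted features is an assumption.

```python
from dataclasses import dataclass, astuple


@dataclass
class DeliveryFeatures:
    """Hypothetical labels for the delivery features named in the text."""
    gaze_at_audience: bool = False           # gazing at audience at or near completion point
    louder_than_surrounding: bool = False    # louder than surrounding speech passages
    pitch_or_stress_variation: bool = False  # greater pitch or stress variation
    rhythmic_shift: bool = False             # some kind of rhythmic shift
    gesture: bool = False                    # accompanied by the use of gestures


def code_stress(features: DeliveryFeatures) -> str:
    """Return the stress category from the number of features present."""
    present = sum(astuple(features))  # True counts as 1, False as 0
    if present == 0:
        return "no stress"
    if present == 1:
        return "intermediate stress"
    return "full stress"  # two or more features


if __name__ == "__main__":
    print(code_stress(DeliveryFeatures()))
    print(code_stress(DeliveryFeatures(gesture=True)))
    print(code_stress(DeliveryFeatures(gaze_at_audience=True, gesture=True)))
```

Under this rule the category depends only on how many features are present, not on which ones, which matches the description of one feature as sufficient for “intermediate stress” and two or more for “full stress”.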
An alternative viewpoint has been put forward by Bull and Wells (2002). These authors proposed that delivery tells us whether or not the rhetorical device is an applause invitation. So, for example, a speaker may deliver a three-part list, each item accompanied by a hand gesture, and receive tumultuous applause. But if the speaker continued to gesture after the third item, and/or took a visible and/or audible intake of breath, this would suggest that the list was not intended as an applause invitation. Notably, not every rhetorical device receives applause, something for which Atkinson’s analysis never provided an account. In contrast, Bull and Wells’ (2002) proposal that delivery is integral to applause invitation can account for why some rhetorical devices receive applause and others do not.
The Content of Speech
Of course, audiences do not simply applaud rhetoric; they also respond to the content of a political speech. Atkinson (1984) never sought to deny this. He conducted an analysis of the kind of content that received applause, and found that predominantly it took the form of what he called ingroup praise (praise of your own party), and outgroup derogation (criticism of others). In a sample of applauded statements using rhetorical devices, 95% made favourable references to specific individuals, favourable references to “us” and unfavourable references to “them” (Atkinson, 1984, p. 44). Atkinson took the view that audiences are much more likely to applaud if applaudable speech content is expressed in an appropriate rhetorical device.
A content analysis was also conducted by Heritage and Greatbatch (1986). They too found that applause was reserved for a relatively narrow range of message types. Specifically, these were external attacks (statements critical of outgroups such as other political parties); general statements of support or approval for the speaker’s own party; internal attacks (criticisms of individuals or factions within the speaker’s own party); advocacy of particular policy positions; commendations of particular individuals or groups; and various combinations of these message types. In total, these categories of political message made up over 81% of all the applauded messages in their sample (see also Bull & Wells, 2002).
Heritage and Greatbatch (1986) also analysed external attacks in further detail. Whereas 71% of external attacks expressed in one of the seven rhetorical devices were applauded, only 29% of external attacks without rhetorical devices received applause. Thus, while Heritage and Greatbatch acknowledged that applause is clearly related to certain types of speech content, they concurred with Atkinson’s view that such content is much more likely to receive applause if couched in appropriate rhetorical devices. Thereby, speakers may also be seen to facilitate their interaction with the audience, given that there are strong normative expectations that audience members should applaud at party political conferences.
However, what this analysis does not address is the role played by speech content in the absence of applause invitations. In one study, instances were identified from leader speeches at British party political conferences where collective applause occurred in the absence of any of the seven rhetorical devices described above (Bull, 2000). In every case, the applause occurred in response to statements of political policy, that is, what the leader proposed to do if returned to power. Thus, for some messages, speech content may be so significant that it will receive applause in the absence of rhetorical devices.
The following example comes from a speech by Tony Blair (1 October, 1996), his last party conference speech before he became Prime Minister. Applause is indicated by lower case crosses, louder applause by upper case crosses (following Atkinson, e.g., 1984).
There was nothing in Blair’s use of the phrase “…to join a trade union…” to suggest that this was a completion point. Indeed, given that he followed it with “…and if…”, it seems likely that his intention was to continue. Nor did his delivery suggest that he had reached a completion point; he was not gesturing, and he continued to look straight ahead at the audience. It is of course possible that the audience mistakenly anticipated a completion point after “…join a trade union”, but given the strong traditional association of the Labour Party with trade unionism, it is much more likely that audience members were interrupting in order to endorse Blair’s support of the right to join a trade union; hence, their applause seemed a direct response to the content of speech.
Uninvited Applause
In the example above, the applause for “…to join a trade union” seemed not only to be a direct response to the content of speech, it also seemed to be uninvited. That is to say, Blair was not making use of any of the rhetorical devices described above, nor did his delivery suggest an applause invitation.
Both invited and uninvited applause were analysed in 15 speeches delivered at annual British party conferences (Bull & Wells, 2002). Identifying uninvited applause was relatively unproblematic; reliability between the two raters was 0.94. Most of the applause was judged as invited (86% of all applause incidents); the remaining 14% was uninvited. However, in contrast to the previous study (Bull, 2000), there were incidents of uninvited applause associated with rhetorical devices, but where the delivery did not suggest the device was intended as an applause invitation (75% of all incidents of uninvited applause).
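Because the 75% figure is a proportion of uninvited applause rather than of all applause, it may help to re-express it on the same base as the 86% and 14% figures. The following sketch does only that arithmetic; the derived values are not reported in the source and assume the published percentages can be multiplied as they stand, ignoring rounding. The variable names are illustrative.

```python
# Percentages as reported in the text (Bull & Wells, 2002).
UNINVITED_PCT = 14           # uninvited applause, as % of all applause incidents
DEVICE_WITHIN_UNINVITED = 75 # % of uninvited incidents associated with a rhetorical device

# Derived, illustrative figures: shares of ALL applause incidents.
uninvited_with_device = UNINVITED_PCT * DEVICE_WITHIN_UNINVITED / 100
uninvited_without_device = UNINVITED_PCT - uninvited_with_device

print(f"Uninvited, with device:    {uninvited_with_device}% of all applause")
print(f"Uninvited, without device: {uninvited_without_device}% of all applause")
```

On these assumptions, roughly one applause incident in ten was uninvited yet still coincided with a rhetorical device.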
For example, the following extract comes from a speech by William Hague (7 October, 1999), at that time Leader of the British Conservative Opposition. “What annoys me most about today’s Labour politicians is not their (A) beliefs – they’re entitled to those – (B) but their sheer, unadulterated hypocrisy. They (A) say one thing and (B) they do another”. In this extract, Hague used the rhetorical device of a contrast twice in quick succession (“beliefs” are contrasted with “hypocrisy,” “saying one thing” is contrasted with “doing another”). However, Hague also showed a very clear and visible intake of breath following the phrase “they do another,” which suggested that his intention had been to continue, and that he had not been seeking applause at that point. Hence, the applause which occurred after “…they do another” was judged to have been uninvited and interruptive.
Thus, from this perspective, uninvited applause may occur not only as a direct response to the content of the speech, but also through a misreading of rhetorical devices as applause invitations, when the associated delivery suggested that the politician intended to continue with his speech.
Rhetorical Devices
The seven rhetorical devices originally identified by Atkinson (1984) and Heritage and Greatbatch (1986) (referred to subsequently as the seven traditional devices) were all identified from British political speeches. But these devices may not be characteristic of political oratory worldwide; they may be specific to British political culture.
In the two studies of Japanese elections (Bull & Feldman, 2011; Feldman & Bull, 2012), the seven traditional devices could all readily be identified, but only accounted for a small proportion of the collective applause (29% in the 2005 election, 26% in the 2009 election). Hence, it was found necessary to devise seven new categories of rhetorical device. These additional categories are listed and defined below, together with illustrative examples (based on Bull and Feldman, 2011).
New Categories of Rhetorical Device
Greetings/salutations
Opening utterance in which the candidate introduces him/herself by name, and requests the audience’s support. Following an introduction by the master of ceremonies, the candidate will usually appear from behind the audience, and walk through the room while shaking hands with several of the people attending the meeting. Eventually the candidate moves forward, takes his/her position in front of the audience (it can be also on stage behind a podium or desk), bows deeply and then briefly introduces him/herself. For example, “As I was just introduced, I am Shimizu Koichiro and in this election for the Lower House I will take part in the campaign serving as the head of the [campaign] office in the [Kyoto] third constituency. I would like to ask for your support” (Shimizu Koichiro, August 26, 2005). The audience always responds to this statement with applause. Following this introduction by the master of ceremonies, the first utterance of the candidate will be, “Hello, good evening everyone”. The audience responds almost immediately with a collective “Good evening”. In other cases, the candidate after bowing will greet the audience with “Good evening. Are you all well?” The audience will usually respond to this with a collective “Yes, we are fine.” Then the candidate continues with “Good…” and starts his/her speech.
Expressing appreciation
This is the next utterance after the greetings/salutations in which the speaker expresses thanks or gratitude to the audience for attending the meeting. To this utterance, the audience will respond with applause. For example, “Today was a hot day. I am really thankful to so many of you for joining me here and staying until such a late hour” (Ishimura Kazuko, September 3, 2005). Again, “Now, I would like to thank all of you who gathered here to listen to my speech, for your support. I want to express from my heart my feelings of gratitude to all of you who came here today. Thank you very much” (Tanaka Hideo, September 2, 2005).
Request agreement/asking for confirmation
A statement in which audience agreement/confirmation is requested explicitly in response to what the speaker has just said, through expressions such as “Don’t you think so?”, “Wouldn’t you agree with me?”, “Don’t you think this is the truth?” To such requests, the audience will always respond, either with applause or, most often, with phrases such as “Yes, it is true,” “Naturally,” and “This is correct.” For example, “In other words, entrusting it to the private sector, it is the right thing to do when the enterprise is no more profitable. And that is why, trying to keep this familiar financial window safe, I oppose the privatization of posts and telecommunication. It is common sense, isn’t it?” (Kokuta Keiji, August 30, 2005). Again, “And the thing we are striving for, it is not destruction. We will show that there is security and stability for our lives at the end of these reforms. It’s our job as politicians, isn’t it?” (Kitagami Keiro, September 5, 2005).
Jokes/humorous expressions
Witty or amusing remarks intended to invite laughter from the audience. In the following example, the audience respond with laughter to this joke from Shimizu Koichiro about his personal appearance (August 26, 2005). “And yet, however, politics, er… of course I’m not saying that the younger the better, but my opponent is taller than me and a little bit more handsome. But, we are not choosing here a film actor, and I don’t want you to choose only the most “photogenic”. To do my best I also came here wearing platform shoes, but I don’t think it’s enough to outdo [my opponent]”.
Asking for support
The speaker explicitly requests the audience’s support for his/her candidature. These requests may be quite direct and straightforward. For example, “Please, stay with me until the end [of the election campaign]. Dear all, I sincerely thank you! [for your support]” (Hara Toshifumi, August 31, 2005). Or such appeals may be more elaborate, detailed, and emotional. For example, “My mother called me Kyoko – the child of Kyoto, and to be worthy of the name of Kyoko from Kyoto, I want earnestly, with all my heart to represent our region. I will put all my energies to it and I, Izawa Kyoto, will win the seat in the Diet. And for this I am asking for your support. Please, do assist me” (Izawa Kyoko, September 4, 2005). To both types of appeals, the audience will typically respond with cries of encouragement, such as “Do it!” “Go for it!” “Do your best!” “Give it your best!” and “You can do it”.
Description of campaign activities
The speaker relates details of his/her campaigning activities: of their travels, of people they met, of talks with voters and supporters in other parts of the constituency. By doing so, candidates may demonstrate their commitment, their communication skills, and their ability to work hard and sincerely. Thereby, they may seek to persuade as many voters as possible to support their candidature. For example, the following extract comes from a speech by Izumi Kenta (September 6, 2005). “I jumped off the election car and walked on the streets, coming across various people, who were standing in front of their houses, walking or running on the street, waving their hands. I distributed the manifesto [of his political party] to all of them. In the afternoon I rode the bicycle and if I saw people waving their hands to me, I turned the bicycle around and even though I knew that the car, the election car has already passed there, I shook their hands and gave them the manifesto”. Such comments may be greeted with applause and/or cries of encouragement.
Other
Miscellaneous statements from the candidates that receive an audience response, not included in any of the categories listed above. For example, Izumi Kenta (August 26, 2005) made the following slip of the tongue by saying “I worked as a representative in the Diet for only one month, for only one year and nine months”. The audience respond with cries of “Pull yourself together!”
Explicit and Implicit Devices
A notable feature of the traditional seven devices is that they are all implicit, embedded in the structure of speech. In contrast, the seven new devices are predominantly explicit, in the sense that the speaker is overtly asking for an audience response.
Thus, requesting agreement/asking for confirmation refers to statements in which audience agreement/confirmation is requested explicitly in response to what the speaker has just said, through expressions such as “Don’t you think so?”, “Wouldn’t you agree with me?”, “Don’t you think this is the truth?” Again, in asking for support, the speaker explicitly requests the audience’s support for his/her candidature, for example, “I am asking for your support. Please, do assist me”. By the same token, jokes/humorous expressions may be seen as explicitly inviting audience laughter.
Two further devices may be seen as formulaic. Greetings/salutations refers to the opening utterance in which the candidate introduces him/herself by name, and requests the audience’s support; this was always greeted with applause. After the greetings/salutations, the speaker expresses thanks or gratitude to the audience for attending the meeting (expressing appreciation). To this utterance, the audience also responds with applause.
All the above devices can be seen as ways in which the speaker is explicitly inviting an audience response. Of the remaining two categories, description of campaign activities is more similar to the seven traditional devices, in that it can be construed as an implicit response invitation. Rather than explicitly requesting support, the speaker relates details of his/her campaigning activities: of their travels, of people they met, of talks with voters and supporters in other parts of the constituency. Such comments may be greeted with applause and/or cries of encouragement. The final category (Other) comprised miscellaneous statements from the candidates that receive an audience response, not included in any of the other six categories listed above, nor treated as either explicit or implicit response invitations.
The results showed that the distribution of rhetorical devices used by speakers between the two elections of 2005 and 2009 was highly similar, hence that the findings of the first study were not just confined to one general election, but were arguably more typical of Japanese political speech-making in general. Furthermore, the majority of applause incidents occurred in response to explicit invitations from the speaker: 68% in the study of the 2005 election, 70% in the study of the 2009 election. In contrast, Atkinson’s (e.g., 1984) analysis is based on the proposal that applause invitations from British politicians are implicit, built into the construction of talk to indicate to the audience when applause is appropriate.
The third cross-cultural study was based on 11 speeches from the 2012 presidential election in the United States (Bull & Miskinis, 2015). In addition to the 14 rhetorical devices analysed in the two Japanese studies, two further devices were included, those of naming (Atkinson, 1984) and negative naming (Bull & Wells, 2002), as described above. The overall distribution of rhetorical devices as used by both Obama and Romney was highly similar, thus indicating a distinctive style of US political rhetoric. The seven traditional devices altogether accounted for most of the techniques used by both Obama (82%) and Romney (81%), in particular, contrasts and lists (Obama 33%; Romney 35%). Overall, the total proportion of implicit devices (namings, negative namings, description of campaign activities, the seven traditional devices) was high for both candidates (Obama 82%, Romney 81%) (Bull & Miskinis, 2015).
Thus, the results of this analysis were strikingly similar to those found in British speeches. Both candidates predominantly made use of the seven traditional devices, and most of the techniques were implicit. Arguably, it is thus possible to speak of an Anglo-American style of speech-making, which contrasts markedly with that of Japanese politicians. However, there were also noticeable cultural differences between the UK and the US, not in rhetorical devices, but in audience responses, as described below.
Audience Responses
All the early interactional research on political speeches was focussed principally on applause (e.g., Atkinson, 1984). But of course audiences do other things besides applaud. They may, for example, cheer or laugh. In the two studies of Japanese politicians, laughter and cheering were analysed, as well as applause. Although applause was the predominant form of audience response in the 2005 election (59% of responses), there was also a substantial proportion of laughter (25%) and cheering (16%) (Bull & Feldman, 2011). In the 2009 election, there was almost as much laughter (39%) as applause (40%), while cheering was just 9% (Feldman & Bull, 2012).
In addition, analyses were conducted of what are termed aizuchi (Bull & Feldman, 2011; Feldman & Bull, 2012). In Japanese, this term refers to what in English has been called listener responses (e.g., Dittmann & Llewellyn, 1967), signals used to indicate continued listener attention and interest. Aizuchi are considered reassuring to the speaker, showing that the listener is active and involved in the discussion. Common aizuchi are “hai”, “ee”, or “un” (yes, with varying degrees of formality), “sō desu ne” (that’s how it is, I think), “sō desu ka” (is that so?), “hontō”, “hontō ni” or “honma” (really). In these Japanese speeches, the use of phrases such as “Don’t you think so?”, “Wouldn’t you agree with me?” is directly comparable to the way in which speakers request aizuchi in ordinary conversation. However, audiences typically responded to such phrases not with aizuchi but with applause. Actual aizuchi responses were relatively infrequent (only 3.3% of all affiliative responses in the study of the 2009 election; Feldman & Bull, 2012), but those that did occur were typically in response to the speaker requesting agreement (75% of all aizuchi responses). Specifically, these took the form of hai, tadashii desu and honto desu (“yes, this is true”), machigai nai and sono ori desu (“you are correct”), atarimae and tozen (“naturally, obviously”), and tashika ni (“certainly”).
Another notable feature of Japanese audience responses was the total absence of what is termed isolated applause. This occurs when only one or two people clap, in contrast to collective applause, either from the audience as a whole, or from a substantial proportion of it. Isolated applause has been noted in several studies of British political speeches (e.g., Heritage & Greatbatch, 1986; Bull, 1986). For example, in one study, 5% of all the applause in six party political conference speeches was judged to be isolated (Bull & Noordhuizen, 2000). In the US, individualized responses are even more pronounced. In the study of 11 speeches from the 2012 American presidential election, individualized audience responses were observed throughout, with a constant flurry of isolated applause and encouraging individual verbal remarks, mostly interruptive (Bull & Miskinis, 2015). Such responses were never observed in the analyses of Japanese political meetings. In Japan, all the audience responses were collective, that is to say, people applaud, laugh, cheer or produce aizuchi together (Bull & Feldman, 2011; Feldman & Bull, 2012).
The collective US audience responses were coded into applause, cheering, laughter, chanting and booing. There was also a noticeable proportion of responses that could not be assigned to any category because of their infrequency and variance, hence there was a sixth category of others. These include unison verbal remarks, for example “Yes!”, “Amen”, and empathetic sighs. Overall, in marked contrast to both Japan and the UK, it was found that cheering was by far and away the most frequent response (63% of total audience responses); applause accounted for only 12% of all responses.
Another distinctive feature of audience responses in the US was the occurrence of booing. In one study of 178 speeches from the US presidential campaign of 1980, the occurrence of booing was noted as well as that of applause (West, 1984). In another study, Clayman (1993) gathered data on booing from a wide variety of public speaking environments, including US Presidential debates, Congressional floor debates, TV talk shows, and British party conference speeches. Clayman observed that booing occurs quite differently from applause. Booing is typically preceded by a substantial delay or by some other audience behaviour (such as heckling, jeering, clapping or shouting), or by both of these in combination. It was further observed that booing can follow affiliative responses (such as applause and appreciative laughter) as often as it follows disaffiliative responses. In this context, booing can be seen as a reaction to such affiliative responses, indicating that support for the speaker is not universal; furthermore, this reaction can be seen as decidedly competitive.
From this analysis, Clayman (1993) proposed that there are two principal ways in which an audience can coordinate its behaviour, referred to as independent decision-making, and mutual monitoring. In independent decision-making, individual audience members may act independently of one another yet still manage to coordinate their actions, for example through applause in response to rhetorical devices. In mutual monitoring, individual response decisions may be guided, at least in part, by reference to the behaviour of other members.
Thus, once it becomes evident that some members of the audience are starting to applaud, this drastically alters the expected payoff for other audience members: the fear of responding in isolation will be reduced, while conversely not applauding can also increasingly become an isolating experience. Responses organized primarily by independent decision-making should begin with a “burst” that quickly builds to maximum intensity, as many audience members begin to respond together, whereas mutual monitoring in contrast should result in a “staggered” onset as the initial reactions of a few audience members prompt others to respond. “Clappers usually act promptly and independently, while booers tend to wait until other audience behaviours are underway” (Clayman, 1993, p. 124).
In contrast, in the study of speeches from the 2012 US presidential election, Bull and Miskinis (2015) identified two distinctive types of booing: disaffiliative (the audience boo the speaker), and affiliative (the audience align with the speaker to boo a political opponent). Overall, in the 11 speeches, there were 48 instances of booing (45 affiliative, 3 disaffiliative). All 48 instances of booing were preceded by rhetorical devices characteristically used to invite applause. Hence, this raises the question as to whether booing (like applause) can be regarded as an invited response.
An example of affiliative booing can be seen in the following statement from a speech by Obama in Ames, Iowa (29 August, 2012): “Last week my opponent’s [i.e., Romney’s] campaign went so far as to write you off as a lost generation. That’s you according to them”. When the audience booed this statement, they could be seen not as attacking Obama, but as aligning themselves with Obama against Romney.
An example of disaffiliative booing comes from a speech delivered by Romney to a predominantly hostile audience at a conference of the National Association for the Advancement of Colored People (NAACP) in Houston, Texas (11 July, 2012). “If our goal is jobs, we have to stop spending a trillion dollars more than we take in every year. And so, I am gonna eliminate every non-essential programme I can find. [HEADLINE] That includes Obamacare. [PUNCHLINE] And I’m gonna work to reform…” In this example, the booing started when Romney stated his opposition to a focal election campaign topic, that of Obamacare [which required everyone to buy health insurance from January 2014]. The booing can be clearly understood as disaffiliative, because of the accompanying shouts directed at Romney of “Get off the stage”, “no” and “shame” (Bull & Miskinis, 2015).
In this example, the booing is clearly associated with a headline-punchline device, customarily used to invite applause. Given that the booing was clearly disaffiliative (it was directed at Romney), the audience response might be regarded as uninvited (i.e., Romney was inviting applause, but instead got booed). However, this interpretation is seriously open to question. Notably, Romney was subsequently widely criticized by Democrats for staging a political stunt in which he deliberately invited booing to make himself look tough in the eyes of hard-line Republicans – as the man who was standing up to Obamacare, who was prepared to put himself on the line even to a predominantly hostile audience (see: “Romney Says He Knew He’d Be Booed at NAACP”; “Mitt Romney On NAACP Obamacare Booing: If People Want More Free Stuff They Should Vote For Obama”). From this perspective, this example of disaffiliative booing may be regarded as an invited response. In a sense, Romney appears rather like a stage villain in a pantomime, inviting the audience’s disapproval. Thereby, he might enhance his standing not with the audience in the conference hall but with an audience elsewhere, namely, that of hard-line Republicans (Bull & Miskinis, 2015).
Overall, in the other 10 speeches by Obama and Romney, 7% of audience responses took the form of affiliative booing, most frequently associated with the rhetorical device of negative naming (55% of devices associated with affiliative booing) (Bull & Miskinis, 2015). Negative naming does also occur in British political speeches, but as a form of applause invitation (Bull & Wells, 2002). However, affiliative booing has never been observed in any of the analyses of British political speeches (e.g., Atkinson, 1984; Bull, 2006; Heritage & Greatbatch, 1986); it seems to be distinctive to US political culture. Notably, neither negative naming nor booing (affiliative or disaffiliative) was observed in either of the studies of Japanese speeches (Bull & Feldman, 2011; Feldman & Bull, 2012).
Most of the audience responses discussed above were affiliative, that is to say, the audience are invited to align with the speaker. Even booing can be affiliative, as shown in the analysis of US speeches. Of course, booing can also be disaffiliative, for example, when Romney was booed in his speech in Texas (Bull & Miskinis, 2015). But so too can applause, cheering and laughter. For example, an audience may slow hand clap, or cheer or laugh at a pratfall by the speaker. From this perspective, it is not the responses themselves that are intrinsically affiliative or disaffiliative, but how they are used and in what context.
Audience Responses and Voting
Applause, laughter, cheering, chanting, even booing the opposition, may all be signs of popularity, but are they important in relation to how people vote? Although success at the ballot box is arguably the most important indicator of popularity for a democratically elected politician, interactional research on political speeches has until recently been essentially focussed on audience responses to rhetorical devices, not on voting. However, their relationship is important. Does vociferous support from the audience mean the politician is on course to win the election, or merely that s/he knows how to work an audience? Recently, two analyses have been conducted to test the relationship between affiliative response rates and electoral success, one of Japanese speeches, the other of US speeches.
In the study of the 2009 Japanese general election (Feldman & Bull, 2012), affiliative response rates for 18 politicians were compared with both the percentage of the vote received, and whether the candidate was elected. There were no significant correlations between the percentage of the vote and either the overall rate of affiliative responses (-.105), or rates for applause (.372), laughter (.060) and cheering (.073). Neither were there any significant correlations between whether the candidate was elected and either the overall rate of affiliative responses (.075), or rates for applause (.181), laughter (.028) or cheering (-.035).
All the Japanese speeches were delivered in the evening at indoor locations (such as school classes and gymnasia). It is important to distinguish these meetings from outdoor street speeches that candidates make in central locations (such as railway stations). There, candidates direct their talk to random pedestrians, many of whom are undecided voters, or perhaps supporters of rival politicians. In contrast, indoor meetings are essentially “rallies of the faithful”, attended principally by individuals who are already supporters of the speaking candidates and their political group, and will most likely vote for them. Those who gather at these meetings do so more to encourage the candidates and show loyalty to the candidates and their political party, rather than to consider their political beliefs and views on issues before deciding on how to vote.
Candidates manage these meetings by conveying their appreciation to supporters who came to listen to their speeches, and by mixing their discourse with humorous expressions followed by frequent requests for agreement and support in the campaign. Rather than being occasions for discussing complicated political issues or seeking to win the support of uncommitted voters (who are rarely present), these meetings are more like community social events. As such, audience reactions to the speakers’ affiliative response invitations are not necessarily indicators of the candidates’ popularity or support amongst the wider electorate.
In contrast, the study of the 2012 US presidential election (Bull & Miskinis, 2015) included analyses of ten speeches delivered to informal public meetings without a pre-selected audience, in the swing states of North Carolina, Iowa, Ohio, Florida and Wisconsin (swing states are those in which no single candidate or party has overwhelming support in securing the electoral college votes for that state). Given that it is not the popular vote but the electoral college that selects the US president, such states are major targets for the main political parties, since winning swing states is the best means of winning electoral college votes; hence, undecided voters are critical. The results showed a significant positive correlation between affiliative response rate (per minute) and electoral success for Obama and Romney (r = .67, p = .033). Thus, Obama had a higher percentage of the vote and a higher affiliative response rate in Wisconsin, Florida, Ohio and Iowa, whereas Romney had a higher percentage of the vote and a higher affiliative response rate in North Carolina. From this perspective, affiliative responses would seem to be a useful source of feedback in political speeches, as previously proposed by West (1984) in an analysis of applause in response to speeches from the US presidential campaign of 1980.
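As a rough check on the correlations reported in this section, the following sketch converts a Pearson coefficient into a t statistic and a two-tailed p-value. It assumes n = 10 data points for the US analysis (one per speech) and n = 18 for the Japanese analysis (one per politician); the first figure is inferred from the description above rather than stated by the original study, and the function names are purely illustrative. It is not a reconstruction of the authors' own analysis.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r (n pairs) against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, integrating the density over [0, |t|] by Simpson's rule."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area

# US study: r = .67; n = 10 is an ASSUMPTION (ten speeches).
t_us = t_from_r(0.67, 10)
print("US:", round(t_us, 2), round(two_tailed_p(t_us, 8), 3))

# Japanese study: largest reported coefficient (.372, applause vs. vote share), n = 18.
t_jp = t_from_r(0.372, 18)
print("Japan:", round(t_jp, 2), round(two_tailed_p(t_jp, 16), 3))
```

Under these assumptions, the US coefficient corresponds to a t of about 2.55 on 8 degrees of freedom, with a p-value in the region of .03, which is in line with the reported value. The largest Japanese coefficient, by contrast, does not reach the conventional .05 threshold, consistent with the absence of significant correlations in that study.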
Individualism and Collectivism
In what way can these cross-cultural differences in speaker-audience interaction between the US, UK and Japan be conceptualized? In his highly influential cultural dimensions theory, the Dutch social psychologist Geert Hofstede (e.g., Hofstede, 2001; Hofstede et al., 2010) famously distinguished between what he called collectivist and individualist cultures. A collectivist society is defined as one in which “people from birth onwards are integrated into strong, cohesive in-groups, which throughout people’s lifetime continue to protect them in exchange for unquestioning loyalty” (Hofstede et al., 2010, p. 92). In a collectivist culture, people tend to view themselves as members of groups (e.g., families, tribes or nations), and usually consider the needs of the group to be more important than the needs of individuals. Most Asian cultures, such as Japan, tend to be collectivist, according to Hofstede (e.g., Hofstede, 2001; Hofstede et al., 2010).
Conversely, an individualist society is defined as a culture in which “the ties between individuals are loose; everyone is expected to look after him or herself and his or her immediate family” (Hofstede et al., 2010, p. 92). Thus, the emphasis is on personal freedom and achievement at the possible expense of group goals, resulting in a strong sense of competition. Social status is awarded to personal accomplishments, and all actions that make an individual stand out.
Individualism and collectivism have been conceptualized in terms of self-construals by Markus and Kitayama (1991). Thus, many Asian cultures have concepts of self that insist on the fundamental relatedness of individuals to each other, on harmonious interdependence. In contrast, Americans neither assume nor value such overt connectedness. Individuals seek to maintain their independence from others, by discovering and expressing their own unique attributes. On the basis of a range of large-scale questionnaire studies, the Hofstede Centre (https://www.geert-hofstede.com) has provided ratings of different cultures on individualism and collectivism. Whereas the US and the UK have among the highest rated levels of individualism in the world (91 and 89 respectively), Japan scores just 46 (scores range from 0 to 100).
Hofstede’s cultural dimensions theory (e.g., Hofstede, 2001; Hofstede et al., 2010) has been highly influential, but is also highly controversial. A number of criticisms have been summarized by Shaiq, Khalid, Akram, and Ali (2011) as follows. For example, one common objection is that Hofstede overgeneralises to whole communities from individual assessments, in particular, that he based his analyses on the data from only one commercial company (IBM). Another criticism is that whereas his unit of analysis is the nation state, cultures may be fragmented across groups and national boundaries. A further criticism is that the research work is too old, and cannot be effectively implemented in an era of rapidly changing environment, international convergence and globalization. One of Hofstede’s most vocal critics has been McSweeney (2002), who regards Hofstede’s research methodology as fundamentally flawed, dismissing it as “a triumph of faith – a failure of analysis”.
Despite these criticisms, it is notable that cross-cultural differences in speaker-audience interaction can be readily subsumed within Hofstede’s dimension of individualism-collectivism. Thus, in the two studies of general elections in Japan (Bull & Feldman, 2011; Feldman & Bull, 2012), all audience responses were collective; there were no incidents of either isolated or uninvited applause. Nor were there any incidents of either negative naming or booing, which might be regarded as disruptive to group harmony, and hence to the interconnectedness regarded as such a distinctive feature of collectivist societies (e.g., Markus & Kitayama, 1991). Furthermore, rhetorical devices were predominantly explicit, thereby making it clear to the audience when it would be appropriate to respond. Arguably, if people in a collectivist society want to respond together en masse, it is helpful if speakers provide the audience with clear guidance as to what is expected (Bull & Miskinis, 2015). Implicit devices can be confusing, and result in applause which may be uninvited (Bull & Wells, 2002), or asynchronous, delayed or interruptive (Bull & Noordhuizen, 2000).
In the United Kingdom, audience responses are typically collective, but incidents of isolated applause have also been observed (e.g., Heritage & Greatbatch, 1986; Bull & Noordhuizen, 2000). So too have incidents of uninvited applause, which can occur either through a misreading of rhetorical devices (Bull & Wells, 2002), or as a direct response to speech content (Bull, 2000). The occurrence of both isolated and uninvited applause suggests that British audiences are more individualistic than those in Japan. Furthermore, rhetorical devices are predominantly implicit, which arguably shows greater respect for individual autonomy, thereby allowing audience members greater freedom of action as to whether or not to respond (Bull & Miskinis, 2015). Again, this would be more consistent with an individualistic society.
In the US, rhetorical devices are also predominantly implicit; arguably it is thus possible to speak of an Anglo-American rhetorical style. However, audience responses are much more varied than in the UK and Japan. Not only do US audiences applaud, cheer and laugh, they also chant and boo. Furthermore, although they respond collectively, they also respond on an individual basis, and there is a constant flurry of encouraging individual verbal remarks audible throughout all the speeches, which are both uninvited and interruptive (Bull & Miskinis, 2015). In that respect, US audiences are highly individualistic, in marked contrast to the pronounced collectivist audience behaviour observed in the two studies of general elections in Japan (Bull & Feldman, 2011; Feldman & Bull, 2012). However, it should perhaps be noted that this apparent individualism might also reflect response norms from another context, namely, that of Evangelical/Pentecostal religious responses to sermons and prayers during services, where people are believed to be “moved by the Spirit” to express utterances about the proceedings. Although they respond individually, they might thereby be seen as following group norms.
A Model of Speaker-Audience Interaction in Political Speeches
The focus of this article has been on how and why audiences respond to political speeches. It draws its initial inspiration from Atkinson’s (e.g., 1984) pioneering analysis of how politicians use rhetorical devices (or “claptraps”) to invite audience applause. From subsequent research in the UK, the US and Japan, Atkinson’s analysis has been refined and elaborated, so that it is now possible to propose a model of speaker-audience interaction in political speeches. There are two principal sections to the model: (1) political speeches as dialogue; (2) the cross-cultural context of speaker-audience interaction. Each of these major sections is further analysed below:
Political Speech-Making as Dialogue
- Political speech-making has traditionally been regarded as monologue, but the studies reported in this paper show how political speeches can be regarded as a form of dialogue between speakers and their audiences, akin to the way in which people take turns in conversation.
- However, in comparison to a conversation, audience responses are typically quite limited; for example, audiences may applaud, laugh, cheer, chant, shout out comments, or even boo.
- Audience responses may be collective (from the audience as a whole, or a substantial proportion of it), or isolated (from one or two people).
- Audience responses may be affiliative (the audience align with the speaker) or disaffiliative (the audience align against the speaker).
- Audience responses are not in themselves either intrinsically affiliative or disaffiliative. Thus, although applause is typically regarded as affiliative, it may also be delayed, isolated, spasmodic, unenthusiastic, or even take the form of a slow handclap. Likewise, although booing is typically regarded as disaffiliative (the audience boo the speaker), it may also be affiliative (the audience align with the speaker to boo a political opponent). Laughter is typically affiliative, but the audience may laugh at the speaker (disaffiliative). Cheering is typically affiliative, but may also be ironic (disaffiliative). Chanting is typically affiliative, but may also be disaffiliative, if the content of the chant is hostile.
- Audience responses may occur in response to speaker invitations through rhetorical devices, or may be uninvited, initiated by the audience in response to speech content, or through a misreading of rhetorical devices.
- Rhetorical devices may be implicit (embedded in the structure of speech), or explicit (the speaker overtly invites an audience response).
- Delivery may indicate whether or not a rhetorical device is intended as an affiliative response invitation. Delivery may be particularly important in the case of implicit devices, because response invitations are less overt.
The Cross-Cultural Context of Speaker-Audience Interaction
- Speaker-audience interaction needs to be understood in a cross-cultural context.
- Whereas Japanese general election audiences typically responded together (Bull & Feldman, 2011; Feldman & Bull, 2012), in US presidential speeches, there was a constant flurry of asynchronous and uninvited individual remarks, typically expressing support, attentiveness or encouragement to the candidate (Bull & Miskinis, 2015).
- In Anglo-American speeches, implicit rhetorical devices are the norm (e.g., Atkinson, 1984; Bull & Wells, 2002; Heritage & Greatbatch, 1986), whereas in Japanese general election speeches, rhetorical devices are typically explicit (Bull & Feldman, 2011; Feldman & Bull, 2012).
- Audience responses are culturally variable. In the study of the US 2012 presidential election (Bull & Miskinis, 2015), the most frequent response was cheering, whereas in Japanese general election speeches, it was applause (Bull & Feldman, 2011; Feldman & Bull, 2012).
- Invited booing. Another distinctive feature of US presidential speeches was invited booing (Bull & Miskinis, 2015). This was never observed in the two analyses of Japanese general election speeches (Bull & Feldman, 2011; Feldman & Bull, 2012), nor has it been observed in any previous analyses of British speeches (e.g., Atkinson, 1984; Bull, 2006; Heritage & Greatbatch, 1986).
- Individualism and collectivism. Cross-cultural differences in speaker-audience interaction can readily be subsumed within Hofstede’s dimension of individualism-collectivism (Hofstede, 2001; Hofstede et al., 2010).
- In our two studies of Japanese general election speeches (Bull & Feldman, 2011; Feldman & Bull, 2012), audiences were found to respond together to explicit invitations from the speaker; there were no isolated responses, nor were there any incidents of invited booing (which might be disruptive to group harmony). In studies of speeches in the US and the UK, audience responses were invited predominantly through implicit rhetorical devices, allowing audiences greater latitude as to whether or not to respond. Responses may be uninvited as well as invited, and they are not always collective. Isolated and individual responses may also occur, particularly in the context of American presidential speeches (Bull & Miskinis, 2015).
Conclusions: Speech-Making as Dialogue
Speech-making has traditionally been regarded as monologue, but the studies reported in this paper show how political speeches can be regarded as a form of dialogue, akin to the way in which people take turns in conversation. According to Weigand (2000, 2010), all language should be regarded as dialogue. Weigand rejects the traditional distinction between monologue and dialogue, arguing that it fails adequately to capture the nature of language as a form of communication. Her theory is in fact based on two premises: language is used primarily for communicative purposes, and communication is always performed dialogically. She further proposes that rhetoric is inherent to dialogue; hence it is not necessary to divide language into rhetorical and non-rhetorical language use. From this perspective, the rhetorical techniques reviewed in this paper may be construed not as unique to political speech-making, but rather as specific manifestations of dialogic interaction in one particular social context.
In spite of all this research, there are still those who like to deny the importance of rhetorical techniques in political oratory. Thus, Atkinson (2004, p. 239) reports how the first elected Mayor of London, Ken Livingstone, was asked in a radio interview what he thought of the rhetorical techniques that Atkinson had identified. Livingstone dismissed their importance. He replied: “Public speakers are born, not made. People shouldn’t worry about all these techniques; they should just be themselves”. Notably, in making this response, Livingstone used two rhetorical devices (consecutive contrasts): (1) Public speakers are (A) born (B) not made. (2) People (A) shouldn’t worry about all these techniques (B) they should just be themselves. Thus, even in denying the importance of rhetorical devices, Livingstone used exactly the kind of rhetorical techniques identified by Atkinson!