Tuesday, 5 August 2014



Dear Digest readers,

The Digest team will be taking a break during August.

This year has been especially busy for the Digest team and we have sometimes been a bit erratic in the postings, but we hope to resume regular weekly postings in September.

There has been a huge growth in the number of people reading the Digest and the number of followers also continues to grow. Please tell your colleagues/students/friends about us so that we can reach as many people as possible.

We are also interested in hearing from our readers with comments about the Digest and suggestions for how we might improve.

Thank you for your continued interest in the Digest.

Best wishes from all at the Linguistics Research Digest team!

Monday, 28 July 2014

(stereo) type your questions into Google


While searching for information on the web, Paul Baker and Amanda Potts noticed that Google’s auto-completion algorithm was inadvertently reproducing stereotypes. Troubled by this phenomenon, they set out to investigate which social groups elicited more stereotyping questions than others and how these questions differed in nature.

In 2010 Google added auto-completion algorithms to offer a list of suggestions when users type words into the search box. The predicted text shown in a drop-down list has either been entered by previous users or appears on the web. While this process can save time, it also has some unintended consequences. For example, when one types ‘why do gay’ into the search box, the information in the picture above appears.

These suggestions are stereotypical; they ascribe characteristics to people on the basis of their group membership, reducing them to certain (often exaggerated) traits.

To carry out their study, the researchers first created a list of identity groups to investigate. They found that the terms which produced the most questions were related to ethnicity, gender, sexuality and religion. The groups eventually chosen were: black, Asian, white, Muslim, Jewish, Christian, men, women, gay, lesbian and straight. In each category, a selection of similar terms was used, so for example the male category included men, boys and guys. The researchers also chose people as a control group to show how humans are characterised when they are not associated with particular identities.

Next, they paired these terms with question forms in order to elicit auto-suggestions. The question forms included items such as why do, how do, where do as well as questions beginning with do, should and are. Each question was entered individually and the top suggestions were recorded.
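
The pairing step described above can be illustrated with a short sketch. This is only an assumption about how such a list of search prefixes might be generated, not the researchers' own procedure; the term and question lists are small illustrative subsets, and `build_prefixes` is a hypothetical helper name.

```python
from itertools import product

# Illustrative subsets only: the study used more identity terms and
# question forms than are listed here.
IDENTITY_TERMS = ["men", "boys", "guys", "people"]
QUESTION_FORMS = ["why do", "how do", "where do", "do", "should", "are"]


def build_prefixes(question_forms, identity_terms):
    """Pair every question form with every identity term, e.g. 'why do men'."""
    return [f"{form} {term}" for form, term in product(question_forms, identity_terms)]


prefixes = build_prefixes(QUESTION_FORMS, IDENTITY_TERMS)
print(len(prefixes))   # number of prefixes to type into the search box
print(prefixes[:3])    # first few prefixes
```

Each prefix would then be typed into the search box by hand and the suggestions recorded, as the summary describes.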

Some of the elicited questions did not refer to social groups and were excluded. Finally, 2690 queries were analysed. The groups which elicited the most queries were Jewish, Black and Muslim, with over 300 results each, whereas People elicited only 70 queries and Lesbian a mere 41.

The questions were then divided into the following categories:

Each question was also rated with regard to evaluation, and was classed as positive, negative or neutral. While the majority of the questions were classed as neutral, most groups tended to have more of their questions categorised as negative than positive. Surprisingly, the control category people elicited proportionally the most negative questions, which tended to be about why people engage in hurtful behaviours such as bullying and self-harming.

The relatively high proportion of negative questions for three groups was particularly concerning. For black people, these involved constructions of them as lazy, criminal, cheating, under-achieving and suffering from various conditions such as fibroids. Gay people were constructed as contracting AIDS, going to hell, not deserving equal rights or talking like girls. The negative questions for males positioned them as catching thrush, under-achieving and treating females poorly.

Conversely, all ethnic groups also elicited many positive questions. Black people were constructed as stronger, more attractive and virile. Asians were viewed as smart, slim and attractive, while white people were viewed as attractive and ‘ruling’ other groups.

In general, race and gender searches elicited questions concerned with the level of interest of one group for another. Indeed, top results for both genders featured the opposite sex. However, sexual fulfilment and references to orgasm appeared more frequently for the female questions.

The straight category included fewer questions (perhaps as this is seen as the ‘norm’). These tended to be about whether straight men enjoyed stereotypically gay entertainment such as TV show Glee or singer Cher, and whether straight men could ‘turn’ gay or have homosexual thoughts.

Whereas the Gay results included questions about whether gay people should be allowed to marry, adopt, join the military or give blood, Lesbian questions included negative stereotypes such as acting/looking like men, questions about sexual and emotional behaviour towards men and the mechanics of lesbian relationships.

While the researchers do not claim that people who are exposed to such questions will be influenced by the stereotypes they encounter, it is important to acknowledge that these do exist. In addition, this paper raises the moral question of whether content-providers should ‘protect’ their users and remove offensive auto-suggestions (and indeed who decides what is inappropriate?), or should they simply reflect the phenomena that people are interested in?

Baker and Potts warn that auto-complete suggestions could perpetuate stereotypes, and that not all people would view them with a critical eye. They therefore recommend that there should be a facility to flag certain auto-completion suggestions as problematic, and that Google should consider removing those that are consistently flagged.

Baker, Paul & Amanda Potts (2013) ‘Why do white people have thin lips?’ Google and the perpetuation of stereotypes via auto-complete search forms. Critical Discourse Studies 10(2): 187-204.


This summary was written by Danniella Samos

Tuesday, 8 July 2014

Accent on accent

image of Language Diversity from Tobias Mikkelson

"It is impossible for an Englishman to open his mouth without making some other Englishman hate or despise him" wrote George Bernard Shaw in his preface to Pygmalion. He was, of course, referring to the way people evaluate accents and make (usually negative) judgements about speakers.

When we talk about accent, it is important to remember that this relates only to pronunciation and intonation rather than grammar or vocabulary. Thus, two people who speak the same language and use the same grammar and word choices can still give different cues about their social and regional origins, ethnic group membership or class. While we, as listeners, naturally pick up these cues about people’s ethnic, socioeconomic and geographical background, experimental research has shown that listeners can also make judgements on others’ intelligence, warmth and even height just by listening to recorded accented speech.

Although Shaw was referring primarily to British English accents in a class-oriented society, many studies from the UK, USA and Australia from the past six decades all show that foreign accented speech is negatively evaluated by native speakers of a language. People who view their own group or culture as the centre of everything, and who scale and rate all other groups with reference to it, can be said to be ethnocentric. Ethnocentric people tend to strongly identify with people in their own group and are biased against outsiders.

James Neuliep and Kendall Speten-Hansen hypothesised that there would be significant negative correlations between ethnocentrism and the way speakers with non-native accents are socially perceived. To test this, they recruited 93 male and female undergraduate students and randomly assigned 46 of them to an experimental group and 47 to a control group. All participants were native speakers of English.

In the experiment, participants in both groups completed a Generalized Ethnocentrism scale test to see how ethnocentric they were. This included rating, on a 5-point scale, items such as “My culture should be the role model for the world” and “I have little respect for the values and customs of other cultures”.

Then, both groups watched a video of the same male speaker talking for 12 minutes about a non-controversial topic – the benefits of exercise. The videos were identical in every way except for the accent of the speaker. In the film viewed by the experimental group, the speaker had a non-native accent, and in order to try and reduce stereotypical judgments, this accent was left ambiguous, with no detectable regional, ethnic, or national associations. The speaker viewed by the control group spoke with a standard American accent. In both videos, the speaker wore the same clothes, spoke in the same location, at the same pace, and used the same number of gestures.

After viewing the video, participants were asked to complete tasks designed to assess how attractive and credible they found the speaker, and how similar to themselves they judged him to be.

To rate speaker credibility, participants used 7-point semantic scales, including scales asking about expertise (for example, a scale between “Qualified” and “Unqualified”) and character (for example, scales such as “Reliable-Unreliable” and “Honest-Dishonest”).

Speaker attractiveness was assessed using 7-point scales testing for social, physical, and task attraction. The latter refers to the perception that someone is competent, trained, and qualified to perform a job. To assess attractiveness, participants had to rate items such as “I think he could be a friend of mine”, “I find him physically attractive” and “I have confidence in his ability to get the job done”.

Perceived homophily (how like the participant the speaker was judged to be) was also assessed by a 7-point semantic scale, where participants were asked to rate items such as “The Speaker is: Like me-Unlike me” and “The Speaker is: Of similar status to me-Of different status to me”.

As predicted, for the experimental group ethnocentrism was negatively and significantly correlated with perceptions of the speaker’s physical, social, and task attractiveness, his credibility, and perceived homophily. Moreover, this bias is not binary but continuous; the more ethnocentric a participant was, the lower their ratings of the non-native accented speaker’s attractiveness, credibility, and homophily. However, when presented with a speaker with a standard American accent, ethnocentrism played little to no role in the way that the speaker was socially perceived.
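
To make the idea of a negative correlation concrete, here is a minimal sketch that computes a Pearson correlation coefficient. The summary does not say which statistic the researchers used, so Pearson's r is an assumption, and the scores below are invented purely for illustration; they are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up scores for five hypothetical participants (not from the study):
# ethnocentrism on a 5-point scale, credibility ratings on a 7-point scale.
ethnocentrism = [1.0, 2.0, 3.0, 4.0, 5.0]
credibility = [6.5, 6.0, 4.5, 4.0, 2.5]

r = pearson_r(ethnocentrism, credibility)
print(round(r, 3))  # a value close to -1 means higher ethnocentrism goes with lower ratings
```

A coefficient near zero, by contrast, would correspond to the pattern reported for the control group, where ethnocentrism played little to no role.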

The findings are important because previous research has shown that credibility, attractiveness and homophily are three of the most significant social perceptions we make about others, affecting the interpretation of what a person says. What a non-native speaker is interpreted as saying, then, can depend on the ethnocentricity of their interlocutor.

Neuliep, James W. & Kendall M. Speten-Hansen (2013) The influence of ethnocentrism on social perceptions of nonnative accents. Language & Communication 33: 167–176.

This summary was written by Danniella Samos

Monday, 9 June 2014

The Jenny Cheshire Lecture in Sociolinguistics 2014

Professor David Britain: "English in Paradise? A new variety in the Pacific"

Friday 13th June 2014, 6.30pm, Maths Lecture Theatre, School of Mathematical Sciences, Mile End Campus, Queen Mary, University of London.

For more details and to book a free place at this event click here

Friday, 23 May 2014

Do birds of a feather tweet together?

Do you think you would be able to guess the gender of the author of an anonymous tweet? David Bamman, Jacob Eisenstein and Tyler Schnoebelen found some distinct but complex differences in the way men and women use language on the Twitter microblogging site.

The researchers amassed a corpus of over 9 million tweets from more than 14,000 American users. They then assigned gender to each account using historical census information on given names. Using computational methods and statistical tests, they found that:

·      pronouns are used more frequently by women. These include alternative spellings such as u and yr;

·      emotion terms (sad, love, etc.) and emoticons are also associated with female authors;

·      although previous research has found that kinship terms are used more often by women, in this study it really is a mixed bag. Most kinship words, including mom, sister, daughter, child, dad and husband, are used more by women. However, a few words, including wife, bro and brotha, are associated with male authors;

·      some abbreviations such as lol and omg are used more by women, as are ellipses, expressive lengthening (e.g. coooooool), emoticons, exclamation marks, question marks, representations of sounds like ah, hmmm and grr, as well as hesitation words such as um;

·      assent terms such as okay and yess are used more by women, but yessir is used more by men. Similarly, the negation terms nooo and cannot are associated with women, while nah, nobody and ain’t suggest the author is likely to be a man;

·      swearwords and taboo words are mostly used by male writers, whereas women choose milder terms such as darn.
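
The name-based gender assignment mentioned before the list of findings can be sketched roughly as follows. This is an illustration under stated assumptions, not the researchers' implementation: the name counts are invented, and the 0.95 threshold and the `assign_gender` helper are hypothetical choices made for the example.

```python
# Hypothetical counts of people recorded with each given name.
# Real census tables are far larger; these numbers are invented.
NAME_COUNTS = {
    "jennifer": {"female": 9950, "male": 50},
    "michael": {"female": 100, "male": 9900},
    "jordan": {"female": 4000, "male": 6000},
}


def assign_gender(first_name, counts=NAME_COUNTS, threshold=0.95):
    """Return 'female' or 'male' if the name is strongly associated with one
    gender in the counts table, otherwise None (unknown or ambiguous name)."""
    entry = counts.get(first_name.strip().lower())
    if entry is None:
        return None
    total = entry["female"] + entry["male"]
    if total == 0:
        return None
    female_share = entry["female"] / total
    if female_share >= threshold:
        return "female"
    if 1 - female_share >= threshold:
        return "male"
    return None  # name is too evenly split to label with confidence


for name in ["Jennifer", "Michael", "Jordan", "Zyx"]:
    print(name, assign_gender(name))
```

Leaving ambiguous or unlisted names unlabelled is one possible design choice; it trades coverage for fewer mislabelled accounts.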

The researchers suggest that the male/female distinction used in much previous research is too simplistic. For example, some linguists claim that women use language in a more expressive way than men by lengthening words like yess and noo. However, using swearwords may also be seen as expressive, and this is done more frequently by men. Similarly, the fact that women use more abbreviations such as omg and lol goes against the common view that women prefer to use more standard language. And although men mention named 'entities' such as Apple or Steve about 30% more often than women, this does not support previous claims that men use language mainly to convey information while women tend to engage with others. When one looks more closely at the data it becomes clear that many of the named entities are sports figures and teams, and are  used by men to engage with others with similar sports interests.

The researchers then identified groups of tweeters who used similar sets of words, regardless of their gender. Many groups turned out to have a substantial majority of either men or women. While some of these clusters matched the linguistic expectations for their gender, others didn’t. For example, although swearwords are generally preferred by men, some of the male-associated clusters used taboo terms far less often than women. On a closer look, many of these messages turned out to be work-related, where taboo language would be discouraged.

Finally, the researchers wanted to find out whether individuals with a greater proportion of same-gender people in their social networks use more linguistic items associated with their gender. In other words, do birds of a feather tweet together?

Well, sort of. There was a strong correlation between the use of gendered language and the composition of people’s social networks. The women in the dataset had networks which were on average 58% female. However, women whose tweets contained the most strongly marked female characteristics had social networks which were 77% female. Conversely, women who displayed the least gender-marked language had social networks that were on average only 40% female. The results for the men followed a similar pattern.

This fits with previous work showing that people change the way they communicate to match their addressees. People can use language to position themselves in relation to others, and they can do this by either conforming to or defying gendered expectations. So, it seems that it is not so straightforward to match language use with gender after all.


Bamman, David, Jacob Eisenstein & Tyler Schnoebelen (2014) Gender identity and lexical variation in social media. Journal of Sociolinguistics 18(2): 135–160.

doi: 10.1111/josl.12080

This summary was written by Danniella Samos

Monday, 21 April 2014

Please excuse me while I use the Bathroom Formula.....

Everyday language is usually the most interesting to study.  A perfect example of this is how we excuse ourselves when we need to use the bathroom/go to the toilet/powder our nose/visit the little boys’ room etc.  As you can see, the possibilities are endless!  In a fascinating study Magnus Levin investigated how and why speakers use this ‘Bathroom Formula’.

The ‘Bathroom Formula’ refers to the phrases speakers use to express their need to leave an ongoing activity in order to go to the bathroom.  It is a highly complex formula as in most situations it would be inappropriate to just disappear without giving an explanation and yet the explanation itself in this instance could cause embarrassment or be deemed impolite.  Therefore the speaker needs to be resourceful and draw on predictable expressions to negotiate this potential difficulty.

In his data, taken from British (BE) and American (AE) English, Levin identified six different ways of using the Bathroom Formula:

1)     Going to a place: ‘I’ll have to go to the loo.’ (BE)
2)     Specifying the activity: ‘I’m gonna go pee.’ (AE)
3)     Asking for directions: ‘Where’s the little boys’ room at?’ (AE)
4)     Asking permission: ‘Please may I go to the toilet?’ (BE)
5)     Promising to be back: ‘I’ll be back in a minute.’ (AE)
6)     Using a metaphor: ‘I’ve got to wash my face.’ (AE)

In both British and American English (1) was the preferred way of using the Bathroom Formula, regardless of gender or age of the speaker.  We would expect children to use (2) the most; however, interestingly, Levin found that adults also quite often specify the activity they intend to perform with words like pee and tinkle.  Also surprising is that 86% of these speakers are women, who describe what they are going to do just as often in conversations with men as with other women.  It seems that verbs like tinkle are seen as polite and inoffensive enough to use in any company. 

Many other uses of the Bathroom Formula were found to be intermixed; for example, (5) was used with at least one of the other categories and served to make the interruption in the conversation seem less impolite.  The metaphors used in (6) were nearly always quite conventional (wash my hands/face, powder my nose or spend a penny, for example) and the following, rather bizarre, conversation between two American men helps to illustrate why this may be:

A:              Thanks for getting bags and stuff
B:              Oh, no problem.  They were two for one so
A:              Alright.  I’ll be right back….I’ve got to go deliver something
       around the corner OK. I just smelled gun powder
B:              Really
A:              It’s somebody…still lighting off fireworks
B:              (laughs) I wouldn’t doubt that

The only way that the speaker manages to convey his real need to go to the bathroom through his peculiar metaphor is by using the highly conventionalised phrase I’ll be right back to introduce it, which gives his friend a clue about what’s really going on.  So, using a tried and tested formula that everyone recognises, like wash my hands, is more accessible for the listener.  This is why new metaphors for going to the bathroom tend to fall out of use so quickly – they’re just too much work to use! In fact, Levin found that people rarely use metaphors at all when excusing themselves; it’s just that, when they do, they ‘stick out’ in the conversation and so are more memorable.

The same happens with potentially more offensive expressions like take a piss and have a dump.  Levin found that they are very rarely used and because of this are more noticeable when they occur, hence their force.  Generally people choose a ‘safe’ and inoffensive conversational path, sticking to tried and tested formulas that everyone knows.  As Levin writes, ‘things which are heard often tend to be noticed less.’

Levin found that, overall, women used the Bathroom Formula more than men, but he is unsure why.  It may be because women in general use the toilet more often than men, or because women are generally more polite than men.  More studies are needed to investigate this.  However, more interesting than the differences between speakers is the lack of variation when it comes to using the Bathroom Formula – we all generally stick to the same phrases.  Levin puts this down to our desire to be as unobtrusive and discreet as possible.

Now, if you’ll excuse me …

Levin, Magnus (2013) The Bathroom Formula: A corpus-based study of a speech act in American and British English. Journal of Pragmatics 64: 1-16.


This summary was written by Gemma Stoyle 

Sunday, 30 March 2014

It's her personality man's looking at

I don't mind how my girl looks.. it's her personality man's looking at 

London English has a new pronoun. Young people living in multicultural areas of the inner city use man as an alternative to I. Sometimes the meaning could be indefinite: in the caption to the picture, Alex’s man pronoun could perhaps be replaced by you (in its general sense of ‘anyone’) or even one; but in other examples, like (1) below, man refers quite unambiguously to the speaker. Here Alex is telling his friend what he’d said to his girlfriend, who had annoyed him by bringing along her friends when he had arranged to meet her.

(1) didn’t I tell you man wanna come see you  . I don’t date your friends I date you (Alex)

How has this new pronoun developed? One relevant factor is that young people in multicultural areas of London now use man as a plural noun as well as a singular noun. Look, for example, at (2) and (3), where the number thirty-six and the adjective bare, ‘many’, show clearly that the noun is plural.

(2) what am I doing with over thirty-six man chasing me blud (Alex)

(3) and I ended up hanging around with bare bare man (Roshan)

Man is not the only new plural form of the noun: mens, mans and mandem are also heard in London, as well as the expected men. Mandem seems a straightforward borrowing from Jamaican Creole. The other forms result from the way that children acquire English in linguistically diverse inner city areas – in an unguided, informal fashion, in their friendship groups.  Many different varieties of English are used in these groups, resulting in much linguistic variation and linguistic flexibility (click on ‘Multicultural London English’ in the list of terms on the right to see our other posts on this new variety of English).

As a plural noun, man always refers to a group of individuals: either to people who are there with the speaker (e.g. you man are all batty boys, said by a young speaker to his friends) or to a group of people that the speaker has just been talking about. This paves the way for the development of the pronoun, since this is exactly how pronouns are used: I refers to a person who is there (the speaker), while he or she refer either to another person who is there or to a person the speaker has just mentioned. Since the plural noun man refers to a group of people, speakers can present themselves as symbolically belonging to that group. So when Alex uses man to refer to himself, as in the caption to the picture, he presents himself as a member of the group of people who think that personality is more important than looks. This gives his opinion more authority, by implying that there are others who feel the same way he does. In the same way, in (1), above, Alex refers to himself as man and by doing so portrays himself as one of a group of like-minded people who would also feel this way.

Another factor that helps explain the emergence of man as a pronoun is that the discourse-pragmatic form man is very frequent indeed in multicultural inner city London. Like other discourse markers, man has many functions, but the chief one seems to be to express emotion (as in (4)) and to construct solidarity between speakers.

(4) aah man that’s long that’s kind of long (Roshan)

Because man is used so often this way, the connotations of solidarity may spread over into its other uses – including the new use as a pronoun. So, in (5), below, Dexter is telling his friends how upset he was at not being able to use the plane ticket he had bought, because the police had arrested him. He uses you know to involve the other speakers, reinforces the fact that he had paid for the ticket himself by saying paid for my own ticket (rather than simply I’d bought a ticket), highlights the amount of money (a big three hundred and fifty pounds) and says explicitly that he was so upset. Here, using man to refer to himself is just one of many ways to emphasise the experience and look for solidarity and support from the listeners.

(5) before I got arrested man paid for my own ticket to go Jamaica you know . but I’ve never paid to go on no holiday before this time  I paid... a big three hundred and fifty pound .. I was so upset (Dexter)

In the data analysed in this paper it is almost exclusively male speakers who use the new pronoun, suggesting that it retains the meaning of the noun man. It has not yet, then, become a fully-fledged pronoun like I: only when both males and females refer to themselves as man will this have happened.

Cheshire, Jenny (2013) Grammaticalisation in social context: The emergence of a new English pronoun. Journal of Sociolinguistics 17 (5): 608-633.

doi: 10.1111/josl.12053

This summary was written by Jenny Cheshire