In the time that it takes you to read this article, millions
of users will have sent a Snapchat, uploaded an Insta Story and updated their
Twitter profile. The age of digital culture is very much upon us. For
linguists, the contemporary networked society offers a way to explore language
use beyond the traditional method of recording and interviewing speakers. This
includes those studies which examine the dialectal distribution of words and
features across different parts of the country. One such paper is Grieve and
colleagues’ recent Twitter-based analysis of lexical variation in British
English.
Traditionally, linguists interested in researching dialectal
variation (i.e., linguistic features specific to a particular geographic region
or group) have set about researching this topic by conducting surveys and
interviews with speakers of a particular variety. For instance, a linguist
might ask someone to name “a narrow passageway between or behind buildings”.
If you’re from the south, you might say ‘alleyway’ but northern speakers might
call it a ‘snicket’ or a ‘ginnel’.
With the advent of social media, however, linguists no
longer have to elicit these words directly. Rather, they can extract massive
datasets of social media data to examine where in the country these words are
used most.
In their 2019 paper, Grieve and colleagues used a corpus
(i.e., dataset) of 180 million tweets to examine lexical variation in British
English. Helpfully, since tweets include what is known as ‘metadata’ that
relates to the location in which the tweet was sent, Grieve and colleagues were
able to plot these tweets on maps to identify where these words were most frequent.
They compared their analysis with the more traditional approach taken in the BBC Voices project.
Their analysis very convincingly shows that the lexical
variation observed in the Twitter data mirrors that identified in more
traditional analyses! This finding is shown in the graphic below, where for all
of the 8 words, the Twitter maps look comparable to those created for the BBC
Voices project. For instance, consider the maps for the word ‘bairn’, a word that means
‘child’ and is typically heard in northern UK dialects (second row, right). The BBC Voices project map
and the Twitter map are virtually indistinguishable. Across both maps, this
word appears largely confined to the north/north-east of the UK – as expected.
Whilst, for the most part, the traditional dialect maps and the Twitter dialect maps look very similar, Grieve and colleagues note some differences. For instance, in the Twitter dataset, ‘bairn’
accounts for a maximum of 7.2% of the instances of words for ‘child’, even in the areas whose dialects are stereotypically associated with it. This is in comparison to the BBC Voices dataset, which reports a
maximum of 100% ‘bairn’ for ‘child’ in some areas. Discussing the reasons for this difference, Grieve and colleagues explore several possibilities. First, they suggest that the differences may be related to a decline in usage of this word. It is possible that ‘bairn’ has simply become less popular over time. However, the lower figure might also have something to do with the type of data we get from Twitter and the way it’s analysed in large-scale studies such as this. In particular, the authors note that it is impossible to examine the conversational context of the tweet. As such, it’s possible that there are some
contexts where users would write ‘child’ rather than ‘bairn’ even if they use the
dialectal term ‘bairn’ in speech, for instance if a user is reporting someone
else’s speech.
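To make these percentages concrete, here is a minimal Python sketch of how the share of a dialect word could be worked out per region. It is not the authors' method: the tweets, the region labels and the `variant_share` function are all invented for illustration, and it simplifies matters by comparing ‘bairn’ only with ‘child’ and by splitting each tweet on spaces.

```python
from collections import Counter

# Hypothetical toy tweets as (region, text) pairs, invented for illustration.
TWEETS = [
    ("North East", "the bairn is asleep at last"),
    ("North East", "every child deserves a good school"),
    ("North East", "taking my child to the park"),
    ("London", "my child starts school today"),
    ("London", "a child ran past the bus stop"),
]


def variant_share(tweets, variant, standard):
    """Return {region: percentage of `variant` among variant + standard tokens}."""
    variant_counts = Counter()
    total_counts = Counter()
    for region, text in tweets:
        tokens = text.lower().split()  # naive tokenisation (an assumption)
        n_variant = tokens.count(variant)
        n_standard = tokens.count(standard)
        variant_counts[region] += n_variant
        total_counts[region] += n_variant + n_standard
    # Skip regions where neither word occurs, to avoid dividing by zero.
    return {
        region: 100 * variant_counts[region] / total
        for region, total in total_counts.items()
        if total > 0
    }


shares = variant_share(TWEETS, "bairn", "child")
for region, pct in sorted(shares.items()):
    print(f"{region}: {pct:.1f}% 'bairn'")
```

In this toy data, ‘bairn’ makes up one of the three relevant words in the North East tweets and none of those from London. A real study would need far more data and a more careful way of matching words, but the arithmetic behind each map is of this kind.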
Nevertheless, these issues aside, Grieve and
colleagues’ analysis suggests that the findings observed in large-scale
dialectal surveys are largely mirrored in the Twitter data. As such, we can
expect more and more sociolinguistic research to examine data from social media
sites such as Twitter in the future! So, it seems, you really are what you tweet!
------------------------------------------------------------
Grieve, Jack; Chris Montgomery; Andrea Nini; Akira Murakami & Diansheng Guo (2019) Mapping Lexical Dialect Variation in British English Using Twitter. Frontiers in Artificial Intelligence. https://doi.org/10.3389/frai.2019.00011
This summary was written by Christian Ilbury.