This post was authored by Megha Arora, a third-year undergraduate student majoring in Math who is also an aspiring sociologist, and co-authored by Eric Borja, a third-year graduate student in the Sociology Department. Eric was Megha’s mentor for the Intellectual Entrepreneurship program, and this post came out of their many discussions regarding social media as a data source.
According to the World Bank, there are nearly 2.5 billion Internet users worldwide, and according to Facebook’s Investor Relations site there are more than a billion monthly active Facebook users. With more researchers mining social media for data, it is important to explore the scope of such a data source. On November 8, Dr. Shamus Khan of Columbia University visited the Ethnography Lab to deliver a talk on an article he co-authored with Colin Jerolmack, “Talk Is Cheap: Ethnography and the Attitudinal Fallacy.”
The premise of the article is simple but has important implications for the field of sociology: talk is cheap. Jerolmack and Khan demonstrate that “sociologists routinely proceed to draw conclusions about people’s behaviors based on what they tell us,” committing what the authors call the attitudinal fallacy. Given this fallacy, can social behavior be deduced from data pulled from the Internet, specifically from social media? In other words, if talk is cheap, are tweets cheaper?
This post is divided into three parts, each answering one question. First, what is social media as a source of data? It is important to think through what kind of data is pulled from social media: is it qualitative or quantitative in nature? Second, are the methods used to analyze social media, even quantitatively, more like ethnography or demography? A large amount of data can be pulled from social media, but does that mean we must use quantitative methods to analyze it? Finally, what can social researchers discern from data pulled from social media, and in particular, can they discern social behavior?
What is social media as a source of data?
In an instant, a researcher can collect a large number of tweets through social analytics sites such as Topsy and then analyze the data with statistical or computational models, such as agent-based models. A number of demographic characteristics can be pulled from public accounts. For instance, a tweet or post announcing a major life event is recorded and can later be pulled. These methods can be used to see who is participating in social media, and how. With geocoding, the researcher can understand social networks and trends spatially by pinpointing the location of a specific tweet or hashtag.
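The quantitative side described above (aggregating tweets and locating hashtags in space) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, their field names (`text`, `lat`, `lon`), and the one-degree grid are all hypothetical, and no real collection service or platform schema is implied.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-written records. The field names are assumptions
# made for this sketch, not the schema of any real platform or service.
tweets = [
    {"text": "Graduated today! #ClassOf2013", "lat": 30.28, "lon": -97.73},
    {"text": "So proud of you #classof2013 #family", "lat": 30.29, "lon": -97.74},
    {"text": "First day at the new job #newjob", "lat": 40.71, "lon": -74.01},
]

HASHTAG = re.compile(r"#(\w+)")


def hashtags(text):
    """Return the lowercased hashtags found in a piece of text."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


def grid_cell(lat, lon, size=1.0):
    """Map a coordinate to the south-west corner of its grid cell."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def hashtags_by_cell(records, size=1.0):
    """Count hashtags separately for each coarse geographic cell."""
    counts = defaultdict(Counter)
    for record in records:
        cell = grid_cell(record["lat"], record["lon"], size)
        counts[cell].update(hashtags(record["text"]))
    return counts


cell_counts = hashtags_by_cell(tweets)
for cell, counter in sorted(cell_counts.items()):
    print(cell, counter.most_common(2))
```

A coarse grid is the simplest possible stand-in for geocoding; a real study would more likely map coordinates to named places, which is beyond this sketch.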
Qualitative methods can be used to analyze a small subset of tweets. This could involve comparing tweets posted before and after a specific event, or observing discourse between users. Directly observing people and the interactions between them is a form of qualitative research in a new field: the Internet.
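As a rough illustration of the before-and-after comparison mentioned above, the sketch below splits a small set of tweets at an event time so that each half can be read closely. The records, the field names (`time`, `text`), the event time, and the helper functions are all hypothetical; the term count is only a crude first pass, and the interpretive reading itself is not something the code performs.

```python
from datetime import datetime

# Hypothetical records and event time, invented for this sketch.
event_time = datetime(2013, 11, 8, 12, 0)
tweets = [
    {"time": datetime(2013, 11, 8, 9, 30), "text": "Heading to the lab this morning"},
    {"time": datetime(2013, 11, 8, 11, 0), "text": "Looking forward to the talk"},
    {"time": datetime(2013, 11, 8, 13, 15), "text": "That talk changed how I see surveys"},
    {"time": datetime(2013, 11, 8, 14, 0), "text": "Still thinking about the talk"},
]


def split_by_event(records, when):
    """Split records into those posted before and at-or-after the event."""
    before = [r for r in records if r["time"] < when]
    after = [r for r in records if r["time"] >= when]
    return before, after


def term_share(records, term):
    """Fraction of records whose text mentions the term (case-insensitive)."""
    if not records:
        return 0.0
    hits = sum(term.lower() in r["text"].lower() for r in records)
    return hits / len(records)


before, after = split_by_event(tweets, event_time)

# A crude quantitative summary of the shift in what people talk about.
print("share mentioning 'talk' before:", term_share(before, "talk"))
print("share mentioning 'talk' after:", term_share(after, "talk"))

# The texts themselves, laid out for close qualitative reading.
for label, group in (("BEFORE", before), ("AFTER", after)):
    for r in group:
        print(label, r["time"].isoformat(), r["text"])
```

Keeping the raw texts next to the summary number reflects the point of the paragraph: the subset is small enough to read, so the count supports the reading rather than replacing it.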
Are methods utilized to analyze social media, even quantitatively, more like ethnography or demography?
These methods, though they mix elements of quantitative and qualitative research, are more similar to ethnography than to demography. Demography reveals shifting trends in a given population by analyzing data collected through surveys and censuses. The gaze of the researcher is present whenever a respondent answers a question in a survey or census. In ethnography, people are examined over time in a field. Instead of recording a respondent’s answer at a single point in time, the ethnographer has the advantage of placing what people say in the context of what they do. In the field, the ethnographer can see people “do things” over time and across a multiplicity of contexts. The Internet, then, is a new sort of field.
The Internet as a field is, of course, not physical. But like an ethnographic site, the Internet (specifically, its users) can be observed from many perspectives and in many different contexts over time. The amount of time a researcher can spend “in the field” is indefinite because whenever someone uses the Internet, whether by tweeting or posting, that activity is recorded.
On the Internet, people can be observed for as short or as long a period as necessary, both retroactively and in real time. It is a field in which observations can be made at any time, and because of the technology now available, data is being collected faster than ever before. Collecting observations of people who are not being prompted by surveys or interviews, and who are behaving without awareness of a researcher, is much more similar to ethnography than to demography. Essentially, the researcher can place what someone says (i.e., what they tweet or post) in the context of what they do.
What can social researchers discern from data pulled from social media?
Within the field of the Internet, data is collected and behavior is observed with mixed methods. Aggregating a large number of tweets and analyzing them statistically is quantitative work; but because the researcher is observing real people, whose attitudes and behaviors can differ, interpreting and using that data is also qualitative work. Observations of people and of the interactions between them capture behavior rather than attitude: attitudes are expressed when prompted, whereas behaviors are observed. The Internet allows researchers to observe people and interactions both in real time and retrospectively, within different contexts and from different perspectives, and to use qualitative, ethnographic methods to extract behaviors from quantitatively collected and analyzed data.
In conclusion, we claim that social behavior can be deduced from Facebook posts and tweets because what people post or tweet is a close proxy for what they do. Because the users observed are not prompted to answer a series of questions and are instead observed from a relatively outside perspective, the collected data allows the researcher to observe discrepancies between what people say and what they do, and provides a more holistic view of social behavior, one similar to that of ethnography.