Surveys do not always answer our questions. Instead of revealing people’s attitudes, we learn how they use language.
BI RESEARCH: Management
We are constantly being invited to answer surveys where we are asked to state our opinions about everything under the sun. We often have to tick boxes indicating to what degree we agree with various statements.
Surveys are widely used in the social sciences, perhaps especially within management research. But they are also used in many other contexts to map people’s attitudes regarding various topics.
The researchers are not always interested in your actual answer to the questions, whether you give a low or high score or something in the middle. They are looking for patterns by examining correlations between your responses to multiple questions.
Not sure we can trust the results
Among other things, the researchers want to find out if different types of leadership are correlated with motivation in employees. Or whether different human resource management practices are correlated with how you behave at work. Or learning what managers can do to enhance employee performance.
Can we then trust the management researchers’ findings?
Professor and management researcher Jan Ketil Arnulf at BI Norwegian Business School is not at all sure that we can. He recently published a study that challenges the method used by management researchers.
“The correlations in such surveys are often determined by language. Paradoxically, the statistics do not depend on how strongly people agree or disagree with the questions they have answered,” he claims.
Can calculate results in advance
The BI researcher has previously shown that it is possible to guess how people will respond in surveys before they are even asked, with nearly 90 per cent accuracy. How on earth is this possible?
In his research, Arnulf has collaborated with colleagues in the US who have developed digital language algorithms. This technology is similar to what is used in search engines on the internet.
The algorithms compare the questions in the surveys to see to what extent the questions have an overlapping meaning. For example, the algorithm will show that “today is Tuesday” means the same thing as “tomorrow will be Wednesday”, although the sentences do not share a single word between them.
Often, two questions will be enough to “guess” the answer to a third question. The example using Tuesday and Wednesday is obvious, but the algorithms will find such correlations in far more ambiguous sentences.
In practice, this means that questions with seemingly different wording often ask about the same thing. When we answer questions that are very similar, our answers will also tend to be very similar.
The algorithms detect more nuanced and systematic differences in sentences than we humans are able to. And they use the information to “guess” what a person will answer.
In surveys with multiple questions on the same topic, patterns of similarities occur which will determine the statistics in the answers. Based on the similarity in the questions, it is thus possible to guess what people will answer in reality, sometimes with a very high accuracy.
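The guessing logic described above can be caricatured in a few lines of Python. This is a toy illustration, not the authors' actual method: the three survey items and their "semantic" vectors below are invented for the example, whereas a real system would derive such vectors from a digital language algorithm trained on large text corpora. The idea is that a respondent's answer to a new item can be predicted as a similarity-weighted combination of their known answers.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical "semantic" vectors for three survey items (invented numbers;
# a real system would compute these from text with a language algorithm).
ITEMS = {
    "I am satisfied with my job":   [0.90, 0.10, 0.20],
    "I enjoy the work I do":        [0.85, 0.15, 0.25],
    "I often think about quitting": [-0.70, 0.60, 0.10],
}

def predict(target, known_answers, vectors, midpoint=4.0):
    """Guess a 1-7 Likert answer to `target` from answers to other items.

    Each known answer is centred on the scale midpoint and weighted by its
    item's semantic similarity to the target; a negative similarity flips
    the contribution, as with reverse-worded items.
    """
    num = den = 0.0
    for item, answer in known_answers.items():
        sim = cosine(vectors[target], vectors[item])
        num += sim * (answer - midpoint)
        den += abs(sim)
    return midpoint + num / den

known = {"I am satisfied with my job": 6, "I often think about quitting": 2}
print(predict("I enjoy the work I do", known, ITEMS))  # close to 6.0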
Together with researchers Kai Rune Larsen and Øyvind L. Martinsen, Jan Ketil Arnulf has demonstrated this possibility in a research article that was recently published in the periodical Sage Open. In the article, they show that knowledge about the first three answers in a survey makes it possible to quite accurately guess how the next 43 questions will be answered.
People’s attitudes disappear
Surveys capture both the respondents’ degree of agreement – what is referred to as attitude strength – and how they understand and use language. Together with colleagues Kai Rune Larsen, Øyvind L. Martinsen and Thore Egeland, Arnulf has also developed a method to measure how “correctly” each person behaves with regard to semantics.
“We can now differentiate between the degree of agreement between the people – their so-called attitude strength – and their degree of language systematics, in other words, how close they get to the algorithms’ guesses based on how similar the questions are,” says the BI Professor. According to Arnulf, this will make it possible to track which elements are included in the researchers’ models.
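The separation Arnulf describes can be sketched very loosely in code. This is a caricature under assumed definitions, not the published method: here "language systematics" is taken as the correlation between a respondent's answers and the algorithm's semantic predictions, and "attitude strength" as the average distance of the answers from the neutral midpoint.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def profile_respondent(answers, predictions, midpoint=4.0):
    """Loose caricature of the separation described in the article:
    how closely the answers track the semantic predictions, and how far
    the answers sit from the neutral scale midpoint."""
    systematics = pearson(answers, predictions)
    strength = sum(abs(a - midpoint) for a in answers) / len(answers)
    return systematics, strength
```

A respondent whose answers closely follow the algorithm's predictions scores high on systematics even if their attitude strength is modest.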
Jan Ketil Arnulf and his researcher colleagues have examined four sets of data from surveys that comprise nearly 7800 respondents and more than 27 million observed answer combinations. The results of the study were recently published in the scientific periodical Behavior Research Methods.
“Unfortunately, it turns out that the participants’ attitude strength is filtered out. All you are left with is semantic relationships,” according to Arnulf. In other words, people give the same answers to questions that ask the same thing in different wording.
The surveys show that people usually provide quite a lot of information about their attitudes towards management and work. But when researchers ask people what they think about their boss, their working conditions and their perception of their job, this information is frequently filtered out by the analysis methods most commonly used in management research.
Research and reality
In this way, it appears that quite a lot of research on management and work is not about management and work, according to Arnulf.
The topic being researched is quite simply lost in the processing of the information. Instead, you are left with a few numbers that mainly indicate how people use language. And those results are often not very surprising.
This can for example tell us that people who like their job do not want to quit as often as people who do not like their job.
“You hardly need research to prove this, because this just becomes definitions of terms that everyone uses.”
In the same way, it is not surprising that people like a boss who is interested in them more than a boss who is not interested in them.
“We actually need to ask ourselves whether the numerous employee surveys being conducted regularly actually measure what they claim to,” says Arnulf.
Back to the drawing board
The findings from the study conducted by Arnulf and his colleagues touch on a question that, while known, has not been widely discussed: What is actually the correlation between what could appear to be abstract research findings and real behaviour at work?
The researchers’ statistics appear clearer and more strongly correlated than real-life behaviour. The correlations are stronger in research than in reality. Now, Arnulf is claiming that the management researchers’ use of surveys simply does not measure up.
“Only for personality tests, which carry a heritage from IQ tests, did we find that the method still holds up,” says Jan Ketil Arnulf.
After throwing down this gauntlet, the BI Professor believes that there is only one thing to do:
“Management research needs to return to the drawing board. We need different data, other methods, and not least, a better basic philosophical understanding of what management research actually involves.”
Arnulf, J. K., Larsen, K. R., Martinsen, Ø. L., & Egeland, T. (2018). The failing measurement of attitudes: How semantic determinants of individual survey responses come to replace measures of attitude strength. Behavior Research Methods. doi:10.3758/s13428-017-0999-y
Arnulf, J. K., Larsen, K. R., & Martinsen, Ø. L. (2018). Respondent robotics: Simulating responses to Likert-scale survey items. Sage Open, January–March, 1–18. doi:10.1177/2158244018764803
This article was first published in the online paper forskning.no 29 May 2018.