How to measure behaviour change?

Human behaviour exists in a system and individuals make decisions based on their environment. Within this complex environment we try to distil a limited number of forces that would have maximum impact on a certain type of decision, almost like a recipe for behaviour. While oversimplified frameworks do serve the purpose of recognising where maximum impact could take place, when we test and measure, the friction in the system and all its confounding variables come into play. This article explores some open questions regarding measurement in my beloved field of study.

By: Isha Jain

Image Credit: McKinsey Quarterly 2010

As an applied behavioural scientist, I’ve always struggled with reliable measurement. Early in my career I was enamoured of small nudges leading to big changes; the simplicity and elegance of behavioural science appealed to me. The idea that one carefully crafted text message could increase uptake of medication, or reduce credit card delinquency, or improve engagement on learning portals was fascinating. However, as time progressed and the complexity of the problems went from improving tech engagement rates to improving judicial processes for survivors of sexual assault, the glamour faded and questions about the effectiveness of these interventions grew.

For all the fascinating, low-cost interventions in this field, there seems to be a glaring gap when it comes to measurement. Because behavioural science combines many fields, ranging from economics to social psychology to neuroscience, it is difficult to identify how exactly measurement should take place. Social psychology is rife with reliability problems that stem from self-reported tests. Measures in economics are contested too, because they assume human rationality. The set-up for measuring brain activity is far too elaborate to be sustainable in an applied setting. Behavioural science is often criticised or assumed to be inferior because there is no standard framework to successfully predict and explain the efficacy of interventions. The question is fundamental: when it comes to behaviour change, what are reliable, scientific measures?

Before exploring the difficulties in evaluation, let’s consider the process of designing interventions. Human behaviour exists in a system and individuals make decisions based on their environment. Within this complex environment we try to distil a limited number of forces that would have maximum impact on a certain type of decision, almost like a recipe for behaviour. For example, the lack of uptake for tuberculosis medication could be a combination of social stigma, discounting future outcomes and a fear of side effects. While our research may indicate that these are the most important drivers, they could be related to a variety of other internal drivers as well. For example, discounting future outcomes and making poor health decisions in the present could be related to the beliefs of the people currently around the patient. While oversimplified frameworks do serve the purpose of recognising where maximum impact could take place, when we test and measure, the friction in the system and all its confounding variables come into play.

Another layer of complexity is that most of our decision-making is unconscious. The problem then becomes linking latent, unobserved drivers of behaviour to something more universal and easily measurable. The obvious answer is to find proxies for the unobserved drivers. Let’s take the case of a study regarding a girl’s sexual health. One could spend hours crafting the perfect survey, ensuring the words are in a local language and relatable. However, the feelings associated with the words could change on a daily basis; how confident she says she feels about saying no to unprotected sex could be highly influenced by the experience she had that very day. Therefore, using only self-reported changes to measure improvement in a girl’s attitude towards sexual health may not always be accurate. The issue lies in isolating variables that are prone to change over time. Of course, these issues are not unique to behavioural science, but since behaviour is a function of complex and often unidentifiable causes, the number of extraneous factors is greater than usual. Additionally, since behavioural science aims to link cause and effect so closely, the process by which an outcome is obtained becomes important, i.e. measuring the driver of behaviour and not just the changed behaviour.

Furthermore, in my experience it is far easier to isolate these variables and test them in settings that afford it. This is especially true when applying behavioural science to contained problems where there is ample data and clearly defined boundaries, as is often the case in urban settings with financial services, tech products and retail. These sectors have immense amounts of historical customer data, flexibility in doing A/B testing and a fairly controlled environment. A bank, for example, will have the credit history, frequency of transactions, nature of transactions and so on for each customer. Therefore, when a form is changed for a loan application, it is easier to use a range of correlated variables to identify, even vaguely, what could have predicted the change in behaviour. However, when we apply behavioural science to the fields that need it the most, severely resource-constrained areas with huge capacity for improvement, other issues start creeping in. The lack of data in the social sector is well known. When an ASHA worker (accredited social health activist) is executing a plan to improve commitment to iron tablets among anaemic women in rural India, the number of confounding variables increases with each woman’s husband’s role, her role in society and the village’s beliefs around medication.
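To make the contained case concrete, here is a minimal sketch of how an A/B test on a redesigned loan application form might be evaluated. It uses a standard two-proportion z-test, which is my choice of illustration rather than a method prescribed above, and the function name and all counts are hypothetical.

```python
import math

def two_proportion_z_test(done_a, n_a, done_b, n_b):
    """Two-sided z-test for a difference between two completion rates.

    done_a / n_a: completions and applicants shown the old form (A).
    done_b / n_b: completions and applicants shown the new form (B).
    """
    rate_a = done_a / n_a
    rate_b = done_b / n_b
    # Pooled rate under the null hypothesis that the form makes no difference
    pooled = (done_a + done_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value

# Hypothetical counts, for illustration only
lift, z, p_value = two_proportion_z_test(done_a=120, n_a=1000, done_b=150, n_b=1000)
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.3f}")
```

A test like this only works because the bank can randomise who sees which form and can count completions cleanly. In the ASHA worker's setting neither assumption is likely to hold, which is exactly the difficulty described above.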

Fig 1: Indicators mapped to problem type

We try to seek uniformity in testing, but behavioural science interventions are extremely contextual. We make decisions based on our genetics, the way we learn and our social networks; we’re dynamic in our preferences and are driven by what is arguably the most complex component to measure: emotions. For applied behavioural scientists, here are some principles that might help:

1. Keep in mind the timeline for change. Implementing choice architecture, such as changing the order of products in an aisle, works on a very different timescale from creating long-term behaviour change, such as changing social norms about child marriage. The former is trying to close an intent-action gap in a controlled environment while the latter is trying to influence factors that are far less malleable. Having intermediate measures of success for long-term change will help maintain a sense of progress and keep an eye on the efficacy of implemented strategies.

2. Identify incremental change in behaviour, even if it does not lead to a dramatic improvement in the outcome. For instance, when trying to reduce domestic violence, even if the survey results don’t explicitly show a reduction, measuring smaller shifts in attitude such as increased hesitation from the abuser can indicate that the intervention is moving in the right direction.

3. Ensure a strong focus on evaluation right from the beginning by deciding the measurement criteria and methodology before the project starts and by capturing baseline measurements. Also set aside time to pilot ideas, since even the most well-founded theoretical suggestions will not serve the purpose we’re trying to achieve unless a prototype is executed and learned from.

4. Maintain a balance between outcome and process measures. Measuring outcomes such as the amount of time spent on an application is an easy and straightforward way of measuring results. While these universal indicators are important, subtle process-based factors, such as where the user is spending the most time and how they got to that section, will give insight into whether the implemented change was in the intended direction.

5. Experiment with mixed methods. Having a healthy mix of both qualitative and quantitative tools helps validate research, catch discrepancies that would arise from using just one method and improve the external validity of the interventions as well.
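As one illustration of why the baseline measurements in principle 3 matter, here is a minimal sketch of a difference-in-differences comparison. This is a technique I am bringing in for the example, not one the principles above prescribe. It assumes that a comparison group exists and that both groups would have followed the same trend without the intervention. The function name and all figures are hypothetical.

```python
from statistics import mean

def difference_in_differences(treat_base, treat_end, comp_base, comp_end):
    """Change in the intervention group minus change in the comparison group.

    Each argument is a list of per-person measurements, for example the
    number of iron tablets taken in a week (a hypothetical indicator).
    """
    treat_change = mean(treat_end) - mean(treat_base)
    comp_change = mean(comp_end) - mean(comp_base)
    return treat_change - comp_change

# Hypothetical weekly tablet counts, for illustration only
treat_base = [2, 3, 2, 3]   # intervention villages, before
treat_end = [5, 4, 5, 6]    # intervention villages, after
comp_base = [2, 3, 3, 2]    # comparison villages, before
comp_end = [3, 3, 4, 2]     # comparison villages, after

effect = difference_in_differences(treat_base, treat_end, comp_base, comp_end)
print(f"Estimated effect: {effect:.1f} tablets per week")
```

Without the two baseline lists, the intervention group's entire before-and-after change would be credited to the intervention, including whatever part of it happened in the comparison villages too.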

No framework is perfect, and as the field evolves so will the rigour with which it is measured. However, even as we behavioural scientists strive to create an empirical science, it is important to remember that human beings are not lines on a graph or percentage changes that wait patiently for interventions to be applied and reactions to be assessed.

Final Mile brings unique and proven capabilities in addressing complex behavioral challenges. As one of the first Behavioral Science & Design consultancies, Final Mile has had the opportunity to bring these to practice in a wide variety of sectors and contexts. We have executed highly complex behavior change projects in areas including Global Health (HIV, TB, Maternal Health, WASH), Financial Inclusion and Safety, across Africa, Asia, Europe, and the US.

Final Mile is also building a pandemic playbook that policymakers and implementers can use as a toolkit for mitigating Covid-19 and future pandemics.

Reach out to us at contact@thefinalmile.com.