What Is A Randomized Controlled Trial?
It is nearly impossible to do proper scientific experiments in economics. Other branches of science don’t have this problem. In medicine, if you want to know whether a new drug is more effective than the current standard, you can find some sick people, randomly give half of them the new medicine, give the other half the standard medicine, wait a bit, and see which group dies less. Since which medicine a patient gets is randomly determined, the observed differences can be attributed to the treatment rather than to some underlying characteristic of those treated.
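To make the logic of random assignment concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the hidden "frailty" trait, the risk formula, and the true effect of -0.10 are not taken from any real trial. The point is only that a coin-flip assignment leaves the hidden trait roughly balanced across the two groups, so the gap in death rates lands close to the true effect.

```python
import random


def simulate_trial(n_patients=50_000, true_effect=-0.10, seed=0):
    """Simulate a two-arm drug trial with a hidden patient characteristic.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    arms = {True: [], False: []}  # True = new drug, False = standard drug
    for _ in range(n_patients):
        frailty = rng.random()        # hidden trait the researcher never observes
        treated = rng.random() < 0.5  # coin flip, independent of frailty
        # Death risk rises with frailty; the new drug shifts it by true_effect.
        risk = 0.2 + 0.5 * frailty + (true_effect if treated else 0.0)
        died = rng.random() < risk
        arms[treated].append((frailty, died))

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    # How different are the two groups on the hidden trait?
    frailty_gap = mean(f for f, _ in arms[True]) - mean(f for f, _ in arms[False])
    # Difference in death rates between the two groups.
    estimated_effect = mean(d for _, d in arms[True]) - mean(d for _, d in arms[False])
    return frailty_gap, estimated_effect


frailty_gap, estimated_effect = simulate_trial()
print(f"gap in hidden frailty between groups: {frailty_gap:+.4f}")
print(f"estimated effect on death rate:       {estimated_effect:+.4f}")
```

If assignment instead depended on frailty (say, doctors giving the new drug to healthier patients), the same comparison would mix the drug's effect with the difference in frailty.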
When They Work
This is not so easily done in the social sciences. Researchers cannot alter the results of a presidential election, economic policy, or the demographic mix of a city just to see what happens. Because of this, researchers typically rely on mathematical models, case studies, natural experiments, or statistical analysis. While sometimes convincing, these methods never quite have the same rigor as a proper experiment.
Fortunately, there are some situations where researchers can randomly assign subjects to treatment and control groups. In economics, randomized controlled trials (RCTs) are most common in development economics. Organizations like the Poverty Action Lab set up experiments to test all manner of interventions. Since the treatments are randomly assigned, the conclusions should be robust and not subject to various selection biases.
Of course, there are issues, principally issues of external validity. By their nature, RCTs only provide direct results for the specific group of people involved in the experiment. That group might be large or small, but it is always limited.
External validity is the extent to which a result applies to other contexts. Does a result found in one set of circumstances apply to others? Pretend there is a study showing that providing free meals to school children in a village in Côte d’Ivoire increases school attendance. Does it mean that the same policy would work in a different village, or a different country? What about for secondary school children? Does it matter if the food provided is different? Maybe attendance increased, but did any additional learning take place?
External validity is the headline problem of RCTs. It is a serious issue. The argument in favor of RCTs has always been that they have very strong internal validity. While the conclusions may not apply elsewhere, at least they are valid for the group you studied.
In medicine, RCTs are nearly always double blind. Neither those receiving treatment nor those administering the experiment know who is getting the real medicine and who is not. The reason for this excess of mystery is the placebo effect. Receiving some treatment, even if that treatment is completely worthless, tends to improve outcomes.
RCT studies in economics are essentially never double blind. The participants typically know exactly what they are getting or not getting. This might be a problem. A 2014 study tested whether it would matter if participants did not know whether they were receiving a treatment.
How to increase output?
In studies of agricultural productivity the central question might be something along the lines of “how does using a different sort of seed increase crop output?”
There are at least two components of this sort of question. One is the physical impact of the seeds themselves. Some seeds simply produce higher yields than others.
The second component is behavioral: what the farmers do with the seeds. If a farmer receives a more productive variety of seed, they will do things differently. This includes decisions like the amount of care and effort they put into their crops. Better seeds could result in people doing more work (because that extra work is more valuable) or less work (because they don’t need to work as hard to get the same result).
Bulte et al., the folks who did this study, also surmise a third component: how behavior changes in ways that are unrelated to the treatment. The example given is that participants might be overly optimistic about the potency of the seeds they receive and put extra effort into raising their crops, even if this extra effort is not merited by the actual quality of the seeds. When the outcome is measured, this extra effort will make the intervention look more effective than it was.
The distinction between the second and third components is that the second is directly related to the actual mechanics of the intervention while the third is not. Sort of like a placebo effect, the third component shows up even if the actual treatment has no impact at all.
Cowpea seeds in Tanzania
Here’s how the study went down.
First, farmers were randomly divided into four groups:
- Those who received traditional cowpea seeds and were explicitly told they had received traditional seeds
- Those who received modern cowpea seeds and were explicitly told they had received modern seeds
- Those who received traditional cowpea seeds but did not know which kind of seeds they had received
- Those who received modern cowpea seeds but did not know which kind of seeds they had received
The seeds appeared identical, so it was not possible to tell which one they had received by observing them.
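The four-group design described above can be sketched in a few lines. This is only an illustration: the farmer IDs, the sample of 200, the equal group sizes, and the `assign_arms` helper are all hypothetical, not details from the paper.

```python
import random
from collections import Counter

# The four arms: (type of seed received, whether the farmer is told which type).
ARMS = [
    ("traditional", "told"),
    ("modern", "told"),
    ("traditional", "not told"),
    ("modern", "not told"),
]


def assign_arms(farmer_ids, seed=0):
    """Randomly split farmers across the four arms (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(farmer_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled farmers out round-robin so arm sizes differ by at most one.
    return {farmer: ARMS[i % len(ARMS)] for i, farmer in enumerate(shuffled)}


assignment = assign_arms(range(1, 201))  # 200 made-up farmer IDs
arm_sizes = Counter(assignment.values())
for arm, size in arm_sizes.items():
    print(arm, size)
```

Because both the seed type and the information are randomized, comparing modern against traditional seeds within the "told" arms and within the "not told" arms gives two separate estimates.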
When the harvest was gathered, the results showed that the modern seeds outperformed the traditional seeds by 27%. This is where most studies like this would stop. Researchers would then go off and make policy recommendations, telling governments and NGOs to give people better cowpea seeds, or something.
The wrinkle in this case was that the 27% difference held only for those farmers who knew which type of seeds they had received. Among farmers who did not know which seeds they had, the difference between modern and traditional seeds was zero. The seeds themselves had done nothing. Since the seeds were ineffective, any behavioral changes brought on by the actual impact of the seeds were also negligible. The 27% increase in harvest was due to the farmers’ own response to being told they had received the new seeds.
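The arithmetic behind that conclusion can be written out as a small sketch. The harvest figures below are made up; they are chosen only so that the percentage gaps match the 27% and zero reported above, and the `decompose` helper and its key names are hypothetical.

```python
def decompose(mean_harvest):
    """Split the measured gap into a seed part and a response-to-knowing part.

    mean_harvest maps (seed type, farmer informed?) to an average harvest.
    """
    informed_gap = (
        mean_harvest[("modern", True)] / mean_harvest[("traditional", True)] - 1
    )
    blind_gap = (
        mean_harvest[("modern", False)] / mean_harvest[("traditional", False)] - 1
    )
    return {
        "informed_gap": informed_gap,  # what an ordinary, non-blind RCT would report
        "blind_gap": blind_gap,        # the seeds alone, with no one knowing what they got
        "response_to_knowing": informed_gap - blind_gap,  # the remainder
    }


# Hypothetical average harvests per arm (arbitrary units).
harvests = {
    ("traditional", True): 100.0,
    ("modern", True): 127.0,
    ("traditional", False): 100.0,
    ("modern", False): 100.0,
}

effects = decompose(harvests)
for name, value in effects.items():
    print(f"{name}: {value:.0%}")
```

Subtracting one percentage gap from another is a rough approximation, but it is enough to show how the two comparisons separate what the seeds do from what the farmers do once they know.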
Effort, Expectations, and External Validity
Seeds do not know they are being experimented on. Proper scientific trials can be done on seeds, and it can be determined how they grow in different conditions. The whole point of giving them to actual farmers is to see what those farmers do with them. From the standpoint of evaluating an intervention for policy purposes, the behavioral effect is a necessary factor. How people respond to an intervention is not some confounding variable; it is a key part of the intervention.
Crucially, Bulte et al. make a distinction between changes in behavior that are driven by the genuine impact of the treatment and those that are not. The changes in behavior that are not due to the treatment bias the estimate upward, because even if the treatment does nothing, you still observe a positive impact.
I am not convinced this distinction is a particularly important one. While it is important to think about how much of an outcome is due to participants’ effort, it seems that expectations (even wrong ones) are part of the intervention, not separate from it. Differences in expectations across time and space would be a question of external, not internal, validity.
Part of the measured outcome will be driven by the actual impact of the intervention. Part will be driven by expectations that may be unrelated to the actual effectiveness of the intervention. The measured impact will include both things. It probably should.
<This post was slightly edited for clarity on Jan 23 >
Sources, References, And Further Reading
Brought to my attention by this recent blog post
 An increase of about 5%. Not statistically significant at traditional levels.
 Unfortunately, this is a deeply imperfect paper. There are issues with attrition and with the way the experiment was carried out, regardless of the validity of the methods used.
2 thoughts on “Do randomized econ studies suffer the placebo effect?”
It is here! I am not going to read it tonight. I will read it tomorrow when my brain is more awake.
Julie Neitz Wielga
Director of Partners in Literacy
(3) Is it OK that the paper is deeply flawed? It still makes your point?
Not sure about “differences in time and space, would be a question of external not internal validity.” Why do you say time and space there?
Julie Neitz Wielga
Director of Partners in Literacy