Field Experiments versus Laboratory Experiments: A Question of Purpose Rather than Preference
This paper explores two worlds, field experiments and laboratory experiments, without explicitly choosing one over the other. We realize, of course, that choosing would make our task easier, since we could simply become advocates for one side. As we shall see, however, that would not be the prudent course of action. In science, field and laboratory experiments each serve specific purposes. We will delineate some of those purposes so that the reader can see why this is not simply a matter of preference.
First, it should be noted that the distinction between field and lab work is a fundamental one, so much so that it has spawned two starkly different sets of research designs, namely qualitative and quantitative studies. And, much like the debate over field versus lab experiments, the division between qualitative and quantitative research has spawned fierce battles among the interested parties, each side advocating the superiority of its own research design when in fact both sides have a valid method, albeit for starkly different purposes.
As Trochim (2006) explains, both traditions are about gathering and deciphering data, be it words in qualitative research or numbers in quantitative research. Furthermore, Trochim claims that “All qualitative data can be coded quantitatively” because “[a]nything that is qualitative can be assigned meaningful numerical values”. When qualitative researchers collect verbal data, such as answers to “tell me what you think”, they code those answers and, in doing so, sort the verbal data into numeric categories. Trochim also addresses the other side of the coin, arguing that “[a]ll quantitative data is based on qualitative judgment.” When a quantitative researcher assigns a number, he or she also assigns meaning. For example, if the heart rate of subject A is 78 beats per minute, that figure has to be interpreted before it has meaning. Is 78 beats per minute too high? Is it too low? Is it appropriate given the subject’s age, health and circumstances? Similarly, when a qualitative researcher designs a survey whose answers range from “excellent” to “poor”, those terms need to be defined in order for the data to have meaning. Hence, Trochim argues, quantitative and qualitative methods are merely different ways of grappling with the same set of issues.
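To make the idea of coding verbal answers concrete, the following is a minimal sketch in Python. The rating scale, the numeric codes and the sample answers are our own hypothetical choices for illustration and are not taken from Trochim.

```python
from collections import Counter

# Hypothetical coding scheme: the researcher decides, as a matter of
# qualitative judgment, which number each verbal answer receives.
RATING_CODES = {"excellent": 4, "good": 3, "fair": 2, "poor": 1}


def code_responses(responses, codes=RATING_CODES):
    """Translate verbal survey answers into their numeric codes."""
    return [codes[answer.strip().lower()] for answer in responses]


# Hypothetical survey answers, deliberately inconsistent in case and spacing.
answers = ["Excellent", "poor", "Good", "good "]
coded = code_responses(answers)

# Once coded, the answers can be counted like any other numeric data.
frequencies = Counter(coded)
```

The numbers only carry meaning because someone first defined what “excellent” and “poor” stand for, which is precisely the judgment Trochim describes.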
Likewise, field and laboratory experiments are merely two ways of approaching the same subject: science. As Harrison and List (2004) make clear, “[F]ield experiments differ from laboratory experiments in many ways. Although it is tempting to view field experiments as simply less controlled variants of laboratory experiments, we argue that to do so would be to seriously mischaracterize them. What passes for “control” in laboratory experiments might in fact be precisely the opposite if it is artificial to the subject or context of the task. In the end, we see field experiments as being methodologically complementary to traditional laboratory experiments.”
Before discussing how these two methods of experimentation complement one another, as Harrison and List plausibly claim they do, it is necessary to define each method. A field experiment is, broadly, a method of investigation which involves “studies with regular people in “real world” settings-studies designed to determine how well a new intervention or program works in the real world (i.e., relative effectiveness) rather than how well it works under ideal circumstances (i.e., efficacy)” (Dennis, 1990). As one might imagine, this real-world approach is no easy task. Indeed, Dennis goes so far as to claim that field studies are “often logistical failures”. Granted, Dennis comes to this view from a medical and drug treatment background, but his warning, and the reasons he gives for it, are widely applicable.
Dennis gives six reasons why field experiments so often fail. First, in a field experiment, variability is almost inevitable. The researcher is dealing with human beings in their natural (as opposed to a controlled) environment. This variation is extremely difficult to control, and that lack of control affects whether the experiment is effective or statistically reliable. Secondly, the treatment plan devised in the laboratory can easily be compromised in the field, be it through simple error or through personal intervention. For instance, if two groups in need of medical treatment are out in the field, and one is a control group that is not receiving treatment, their counselor may try to give them the treatment out of a sense of fairness, thus sabotaging the value of the field experiment. A third trouble spot is “being unable to estimate the expected effect size and, consequently, being unable to estimate the number of units necessary to achieve a reasonable level of statistical power because there are no direct data to estimate expected caseflow.” Fourthly, many experiments depend on randomly assigning subjects. Random assignment is what lends such experiments their credibility: it makes the treatment and control groups comparable, so that differences in outcomes can be attributed to the treatment rather than to who happened to receive it. In the field, random assignment may be breached, intentionally or otherwise. A fifth issue, particularly for field experiments that last several years (which is not unheard of), is that changes in the environment, such as “staff turnover, changes in local funding, changes in federal regulations”, can affect the experiment negatively. Imagine a study which employs two groups and four trained counselors who have built up both credibility and an emotional bond with the participants. If just one of those four counselors leaves, which is reasonably likely in the course of a long-running field experiment, that departure can have a significant impact on the findings. Lastly, and connected with this issue, Dennis notes that “few programs are static; most are continuously evolving, and researchers find it difficult to maintain rigid experimental regimens over a long period of time.”
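Dennis’s third reason, the link between expected effect size and the number of units needed, can be illustrated with a rough calculation. The sketch below is our own and is not drawn from Dennis. It assumes a simple two-group comparison of means, a standardized effect size, and the usual normal approximation; the default significance level (0.05) and power (0.80) are conventional choices rather than values from the source.

```python
import math
from statistics import NormalDist


def units_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate units needed per group for a two-sided comparison
    of two group means, using the normal approximation.

    effect_size is the expected difference between the group means,
    expressed in standard deviations (an assumption of this sketch).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# The required number of units grows quickly as the expected effect shrinks.
medium_effect = units_per_group(0.5)
small_effect = units_per_group(0.2)
```

Under these assumptions, a researcher who cannot say whether the expected effect is closer to 0.5 or 0.2 standard deviations cannot say whether the study needs dozens or hundreds of units per group, which is the planning problem Dennis describes.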
Nevertheless, Dennis agrees that field experiments are a necessity, for they bring out aspects of a product or hypothesis which are not readily apparent in the course of a laboratory experiment. Two major arguments are made for laboratory experiments. The first, as has already been mentioned, is that field experiments are logistically difficult and hard to control. This, however, is an argument against field work, not one in favor of lab work. The second, positive argument, made specifically in the field of economics, is that economics is an inherently theoretical profession and that its theories can best be tested in a strictly controlled environment, one that tests the effects of economic theories, ideas and stratagems in a “small-scale microeconomic environment in the laboratory where adequate control can be maintained” (Pitt, 1981, p. 138).
This is a crucial step in the research process. Before proceeding to field work, no matter what area is being studied, ideas must stand up to controlled scrutiny. Because the field of economics is itself so flexible and far-reaching, affecting everything from environmental protection laws to crime control policies, control becomes even more important. As Pitt explains, “control is crucial because it is necessary for measurement and thus replicability. Replicability, in turn, allows the experimenter to identify systematic relationships between preferences, institutional parameters and outcomes” (Pitt, 1981, p. 142). Pitt goes on to give three primary uses for lab experiments in economics: choosing between competing economic theories, exposing economic theories and models which lack validity, and exposing theory to new models of organization. All of these are valid reasons for conducting laboratory experiments. After all, committing a beta error, that is, allowing a theory to be adopted widely without adequate laboratory testing, is flirting with disaster. Furthermore, even if an economic theory withstands scrutiny in a given situation, that theory may not necessarily have broad applicability. Given the nature of the marketplace of ideas, where individuals and institutions often adapt ideas, concepts and theories to suit new purposes, it is important to test theories that have withstood scrutiny in new environments and under new circumstances, to ensure as far as possible that they can withstand the challenges they will inevitably face once they are out of the laboratory and in the general consciousness.
If laboratory experiments in economics are a necessary filter between raw theory and irresponsible application, then field work must be viewed as another such necessary filter. As stated in the introduction, we are not making a choice between two approaches, but rather designating each approach as welcome and appropriate under certain circumstances. As Harrison and List (2004) put it, “By examining the nature of field experiments, we seek to make it a common ground between researchers. We approach field experiments from the perspective of the sterility of the laboratory experimental environment. We do not see the notion of a “sterile environment” as a negative, provided one recognizes its role in the research discovery process. In one sense, that sterility allows us to see in crisp relief the effects of exogenous treatments on behavior. However, lab experiments in isolation are necessarily limited in relevance for predicting field behavior, unless one wants to insist a priori that those aspects of economic behavior under study are perfectly general in a sense that we will explain. Rather, we see the beauty of lab experiments within a broader context - when they are combined with field data, they permit sharper and more convincing inference.”
How does field work complement laboratory work, as Harrison and List insist that it does? Thus far, we have seen the role of field work in broad terms of necessity, as a concession to the brutal reality of the greater world, where elements do not conform to our expectations. But field work is far too valuable to regard as a mere safeguard. Indeed, “to view field experiments as simply less controlled variants of laboratory experiments… would be to seriously mischaracterize them” (Harrison and List, 2004). Harrison and List make the point that the controls imposed in the lab are in fact undesirable in the field, since imposing them would alter the behavior of the subjects and the relationship between variables in a way not found in the “real” world. Hence, any attempt at such control in the field would do a serious disservice to the viability of the theory under examination.
Field work in economics is astonishingly varied. A single economic theory can be tested in a variety of settings, but as those settings diverge, one must be prepared to see different results under different conditions. Indeed, one major reason for field work is to expose laboratory notions to different conditions and to see how well they hold up, how many exceptions there are and how much those exceptions affect the theory overall, since a theory with a multitude of exceptions may not remain valid and viable for very much longer. One example is the revenue equivalence result, which William Vickrey derived in 1961 from a Nash equilibrium model of bidding behavior. It holds that no matter which of four auction formats (English, Dutch, first-price or second-price) an auction house uses, the expected revenue collected will be essentially the same. More than thirty years later, David Lucking-Reiley decided to test this prediction on the internet using Magic: The Gathering, a collectible card game in which players become “dueling wizards, each with their own libraries of magic spells (represented by decks of cards) that may potentially be used against the player’s opponent.” These cards, which represent the magic spells in question, “are sold in random assortments, just like baseball cards, at retail stores ranging from small game and hobby shops to large chains such as Toys “R” Us and Waldenbooks.” The game’s online following and the value of the cards combined to produce a thriving economy in which cards were bought, sold and even auctioned among players over the internet, an online version of the real-life auctions the theory describes. Where the theory predicts that the auction format makes no difference to the bottom line, Lucking-Reiley, using a medium that did not exist in 1961, spent two years first observing and then participating in such online card auctions. He concluded that the Dutch auction format, in which the price starts at an inflated level and declines until the first, winning bid, produced revenues about thirty percent higher than the first-price format, to which theory says it should be equivalent (Lucking-Reiley, 1999).
Lucking-Reiley’s work shows how essential field work is to economics and other sciences. It is through his field work, with real participants bidding for the game’s cards online, that he was able to contradict, within a narrow set of circumstances, the prediction of revenue equivalence. Does this mean that the theory is no longer valid? Not necessarily. Lucking-Reiley confined his field work to a specific medium, the internet, and to a specific group within that medium, namely game players. Not only that, he used a specific game which attracts specific players. Lucking-Reiley is not claiming that all internet auctions would produce the best prices through the Dutch auction format. Nor is he saying that all gamers would respond the way that those who play Magic: The Gathering did. However, he is challenging a theory (that all auction formats produce roughly the same economic results) which was developed and tested over time. Field work like that of Lucking-Reiley shows how a theory adapts to a new framework or medium, or how it doesn’t.
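The theoretical prediction that auction formats yield the same expected revenue can be illustrated with a small simulation. This is a sketch under textbook assumptions of our own choosing, not a reconstruction of Lucking-Reiley’s procedure: bidders are risk-neutral, each has an independent private value drawn uniformly between 0 and 1, and everyone bids according to the standard equilibrium strategies. The number of bidders, the number of simulated auctions and the random seed are arbitrary.

```python
import random


def simulate_revenues(n_bidders=4, n_auctions=20000, seed=0):
    """Average seller revenue under two pricing rules, assuming
    independent private values drawn uniformly from [0, 1]."""
    if n_bidders < 2:
        raise ValueError("need at least two bidders")
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price (and, in theory, Dutch): the winner pays his own bid,
        # which in equilibrium is his value shaded by (n - 1) / n.
        first_price_total += values[-1] * (n_bidders - 1) / n_bidders
        # Second-price (and, in theory, English): the winner pays the
        # second-highest value.
        second_price_total += values[-2]
    return first_price_total / n_auctions, second_price_total / n_auctions


first_price_avg, second_price_avg = simulate_revenues()
# Under these assumptions both averages should lie close to (n - 1) / (n + 1).
theoretical = (4 - 1) / (4 + 1)
```

In this idealized setting the two averages converge on the same figure. The value of a field experiment lies in checking whether real bidders, who need not be risk-neutral or follow equilibrium strategies, behave as the simulation assumes.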
At the start of this paper we explained that we choose not to choose, that we do not have a clear preference for either laboratory work or field work. In truth, both are absolutely essential to economic theory as well as to other sciences. The control provided by a lab allows a theory to be tested in a way that shows how it does, or doesn’t, hold up to strict scientific scrutiny. If the theory in question fails in the lab, the matter can be regarded as settled, at least until a new testing procedure is devised. If the theory succeeds, however, the next step is not simply to foist it upon the world but to test it in the field, allowing it to stand or fail on its own in an environment which is not as sterile and has far fewer safeguards. Thus, lab and field work are two sides of a necessary filter between ideas and reality.
References:
Dennis, M. L. (1990) “Assessing the Validity of Randomized Field Experiments: An Example of Drug Abuse Treatment Research,” Center for Social Research and Policy Analysis, Research Triangle Institute. Evaluation Review, Vol. 14 No. 4 (Aug. 1990), 347-373.
Harrison, G. W. and List, J. A. (2004) “Field Experiments,” Journal of Economic Literature, Vol. XLII (Dec. 2004), 1009-1055.
Lucking-Reiley, D. (1999) “Using Field Experiments to Test Equivalence between Auction Formats: Magic on the Internet,” The American Economic Review, Vol. 89 No. 5 (Dec. 1999), 1063-1080.
Pitt, J. (1981) Philosophy in Economics, Springer-Verlag New York LLC: New York.
Trochim, W. M. K. (2006) “The Qualitative Debate,” Research Methods Knowledge Base: Web Center for Social Research Methods. Accessed via http://www.socialresearchmethods.net/kb/qualdeb.htm on 15 April 2008.