Simulation (sim' yoo la' shen): an imitation or counterfeit. This definition, from Webster's Dictionary, implies a replication so well constructed that the product can pass for the real thing. When applied to the study of research design, simulations can serve as a suitable substitute for constructing and understanding field research. Trochim and Davis (1986) posit that simulations are useful for (1) improving student understanding of basic research principles and analytic techniques; (2) investigating the effects of problems that arise in the implementation of research; and (3) exploring the accuracy and utility of novel analytic techniques applied to problematic data structures.

As applied to the study of research design, simulations can serve as a tool to help the teacher, evaluator, and methodologist address the complex interaction of data construction and analysis, statistical theory, and the violation of key assumptions. In a simulation, the analyst first creates data according to a known model and then examines how well the model can be detected through data analysis. Teachers can show students that measurement, sampling, design, and analysis issues are dependent on the model that is assessed. Students can directly manipulate the simulation model and try things out to see immediately how results change and how analyses are affected. The evaluator can construct models of evaluation problems -- making assumptions about the pretest or type of attrition, group nonequivalence, or program implementation -- and see whether the results of any data analyses are seriously distorted. The methodologist can systematically violate assumptions of statistical procedures and immediately assess the degree to which the estimates of program effect are biased (Trochim and Davis, 1986, p. 611).
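The create-then-detect cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the workbook's own procedure: the function name `simulate_known_effect`, the model (score = ability + random error, plus a constant effect for the program group), and all numeric values are assumptions chosen for the example.

```python
import random
import statistics

def simulate_known_effect(n=500, true_effect=5.0, seed=1):
    """Create data from a known model, then estimate the effect from the data."""
    rng = random.Random(seed)
    program, comparison = [], []
    for _ in range(n):
        # Assumed model: observed score = true ability + random error,
        # plus a constant program effect for people in the program group.
        ability = rng.gauss(50, 5)
        error = rng.gauss(0, 5)
        in_program = rng.random() < 0.5  # coin-flip assignment
        score = ability + error + (true_effect if in_program else 0.0)
        (program if in_program else comparison).append(score)
    # The analysis step: how well does a simple mean difference recover the effect?
    return statistics.mean(program) - statistics.mean(comparison)

# The estimate can be compared with the effect that was built into the data.
print(simulate_known_effect(n=5000, true_effect=5.0))
```

Because the analyst chose `true_effect`, any gap between it and the printed estimate is attributable to sampling error or to whatever problem was deliberately built into the model.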

Simulations are better for some purposes than is the analysis of real data. With real data, the analyst never perfectly knows the real-world processes that caused the particular measured values to occur. In a simulation, the analyst controls all of the factors making up the data and can manipulate these systematically to see directly how specific problems and assumptions affect the analysis. Simulations also have some advantages over abstract theorizing about research issues. They enable the analyst to come into direct contact with the assumptions that are made and to develop a concrete "feel" for their implications for different techniques.

Simulations have been widely used in contemporary social research (Guetzkow, 1962; Bradley, 1977; Heckman, 1981). They have been used in program evaluation contexts, but to a much lesser degree (Mandeville, 1978; Raffeld et al., 1979; Mandell and Blair, 1980). Most of this work has been confined to the more technical literature in these fields.

Although the simulations described here can certainly be accomplished on mainframe computers, this workbook will illustrate their use in manual and microcomputer contexts. There are several advantages to using simulations in these two contexts. The major advantage of manual simulations is that they cost almost nothing to implement. The only materials needed are dice, paper, and pencils. Computer simulations are also relatively low in cost. Once you have purchased the microcomputer and necessary software, there are virtually no additional costs for running as many simulations as are desired. By contrast, because it is often advantageous to have a large number of runs of any simulation problem, the costs in mainframe computer time can become prohibitive. A second advantage is portability and accessibility. Manual simulations can be conducted anywhere there is a flat surface on which to roll dice. Microcomputers are also portable in that one can easily move from home to office to classroom or into an agency either to conduct the simulations or to illustrate their use. Students increasingly arrive at colleges and universities with microcomputers that enable them to conduct simulations on their own.

This workbook illustrates some basic principles of manual and computer simulations and shows how they may be used to improve the work of teachers, evaluators, and methodologists. The exercises contained in this manual are designed to illuminate a number of concepts that are important in contemporary social research methodology, including:

- simulations and their role in research
- basic measurement theory concepts
- the elements of pretest/posttest group designs, including nonequivalent, regression-discontinuity and randomized experimental designs
- some major threats to internal validity, especially regression artifacts and selection threats

The basic model for research design presented in this simulation workbook is the program or outcome evaluation. In program evaluation the goal is to assess the effect or impact of some program on the participants. Typically, two groups are studied. One group (the program group) receives the program while the other does not (the comparison group). Measurements of both groups are gathered before and after the program. The effect of the program is determined by looking at whether the program group gains more than the comparison group from pretest to posttest. The exercises in this workbook describe how to simulate the three most commonly used program evaluation designs, the Randomized Experiment, the pretest/posttest Nonequivalent Group Design, and the Regression-Discontinuity design. Additional exercises are presented on regression artifacts, which can pose serious threats to internal validity in research designs that involve within-subject treatment comparisons.
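The comparison of gains described above amounts to a small piece of arithmetic. As a rough sketch, with a hypothetical helper name `gain_difference` and made-up scores that are not drawn from the workbook:

```python
import statistics

def gain_difference(program_pre, program_post, comparison_pre, comparison_post):
    """Return how much more the program group gained than the comparison group."""
    program_gain = statistics.mean(program_post) - statistics.mean(program_pre)
    comparison_gain = statistics.mean(comparison_post) - statistics.mean(comparison_pre)
    return program_gain - comparison_gain

# Made-up scores for illustration only.
estimate = gain_difference(
    program_pre=[48, 52, 50], program_post=[58, 60, 59],
    comparison_pre=[49, 51, 50], comparison_post=[53, 55, 54],
)
print(estimate)  # program gain of 9 minus comparison gain of 4
```

A positive value suggests the program group gained more than the comparison group. Whether that difference can be read as a program effect depends on the design, which is the subject of the exercises.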

We can differentiate between these research designs by considering how units are assigned to treatment conditions; in other words, what rule has determined treatment assignment. In the randomized experimental (RE) design, persons are randomly assigned to either the program or comparison group. In the regression-discontinuity (RD) design (Trochim, 1984), all persons who score on one side of a chosen preprogram measure cutoff value are assigned to one group, with the remaining persons being assigned to the other. In the nonequivalent group design (NEGD) (Cook and Campbell, 1979; Reichardt, 1979), persons or intact groups (classes, wards, jails) are "arbitrarily" assigned to either the program or comparison condition. These designs have been used extensively in program evaluations where one is interested in determining whether the program had an effect on one or more outcome measures. The technical literature on these designs is extensive (see, for instance, Cook and Campbell, 1979; Trochim, 1986). The general wisdom is that if one is interested in establishing a causal relationship (that is, in internal validity), RE designs are most preferred, the RD design (because of its clear assignment-by-cutoff rule) is next in order of preference, and the NEGD is least preferable.
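The three assignment rules can be contrasted in code. The sketch below is illustrative only. The function names are hypothetical, the choice to give the program to those scoring below the cutoff is an assumption (the text says only "one side"), and the NEGD rule is represented simply as a set of groups that someone has picked for the program.

```python
import random

def assign_randomized(pretest, rng):
    """RE: a coin flip for each person, independent of the pretest score."""
    return [rng.random() < 0.5 for _ in pretest]

def assign_by_cutoff(pretest, cutoff):
    """RD: everyone on one side of the cutoff gets the program.
    Assumption: scores below the cutoff receive the program."""
    return [score < cutoff for score in pretest]

def assign_intact_groups(group_labels, program_groups):
    """NEGD: whole groups (for example, classes) are placed in a condition."""
    return [label in program_groups for label in group_labels]

# Made-up pretest scores and class labels for illustration.
pretest = [42, 55, 47, 61, 50, 38]
classes = ["A", "A", "B", "B", "C", "C"]

print(assign_randomized(pretest, random.Random(0)))
print(assign_by_cutoff(pretest, cutoff=50))
print(assign_intact_groups(classes, program_groups={"A"}))
```

Each function returns one True/False flag per person, where True means the person is in the program group. Only the cutoff rule depends on the pretest score, and only the randomized rule depends on chance.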

All three of the program evaluation designs (RE, RD, and NEGD) have a similar structure, which can be described using the notation:

    O  X  O
    O     O

where the Os indicate measures and the X indicates that a program is administered. Each line represents a different group; the first line depicts the program participants, whereas the second shows the comparison group. The passage of time is indicated by movement from left to right on a line. Thus, the program group is given a preprogram measure (indicated by the first O), is then given the program (X), and afterward is given the postprogram measure (the last O). The vertical alignment of the measures implies that both the pre- and postmeasures are given to both groups at the same time. Model-building considerations will be discussed separately for each design.

The simulations are presented in two parts. Part I contains the manual simulations, including the basic randomized experiment, nonequivalent group, and regression-discontinuity research designs, with an additional exercise on regression artifacts. Part II contains the computer simulation equivalents of the research designs presented in Part I, along with a computer analog to the regression artifacts simulation.

Both Parts I and II begin with an exercise called Generating Data. This exercise describes how to construct the data that will be used in subsequent exercises. Because this exercise lays the foundation on which subsequent simulations are based, it is extremely important that you do it first and follow the instructions very carefully.
