Identifying and Implementing Educational Practices Supported By Rigorous Evidence




The same is true of "meta-analyses" that combine the results of individual studies which do not themselves meet the threshold for "possible" evidence.


Meta-analysis is a quantitative technique for combining the results of individual studies, a full discussion of which is beyond the scope of this Guide. We merely note that when meta-analysis is used to combine studies that themselves may generate erroneous results - such as randomized controlled trials with significant flaws, poorly-matched comparison group studies, and pre-post studies - it will often produce erroneous results as well.
Example. A meta-analysis combining the results of many nonrandomized studies of hormone replacement therapy found that such therapy significantly lowered the risk of coronary heart disease.22 But, as noted earlier, when hormone therapy was subsequently evaluated in two large-scale randomized controlled trials, it was actually found to do the opposite - namely, it increased the risk of coronary disease. The meta-analysis merely reflected the inaccurate results of the individual studies, producing more precise, but still erroneous, estimates of the therapy's effect.
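To make the pooling step concrete, the following minimal Python sketch shows the most basic fixed-effect, inverse-variance way of combining study results. The effect sizes and standard errors are invented for illustration and are not drawn from the hormone-therapy studies or any study cited in this Guide; the point to notice is that pooling shrinks the standard error but cannot remove bias present in the underlying studies.

    # Minimal fixed-effect (inverse-variance) pooling sketch.
    # The effect sizes and standard errors below are invented for illustration.
    import math

    # (effect size, standard error) reported by each hypothetical study
    studies = [(0.45, 0.20), (0.30, 0.15), (0.50, 0.25)]

    weights = [1.0 / (se ** 2) for _, se in studies]  # precision weights
    pooled_effect = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))  # always smaller than any single study's SE

    print(f"pooled effect = {pooled_effect:.3f}, pooled SE = {pooled_se:.3f}")
    # Note: if each study's estimate is biased (e.g., by poor matching),
    # the pooled estimate inherits that bias - it is merely a more precise
    # version of the same wrong answer.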

IV. IMPORTANT FACTORS TO CONSIDER WHEN IMPLEMENTING AN EVIDENCE-BASED INTERVENTION IN YOUR SCHOOLS OR CLASSROOMS.


A. Whether an evidence-based intervention will have a positive effect in your schools or classrooms may depend critically on your adhering closely to the details of its implementation.

The importance of adhering to the details of an evidence-based intervention when implementing it in your schools or classrooms is often not fully appreciated. Details of implementation can sometimes make a major difference in the intervention's effects, as the following examples illustrate.


Example. The Tennessee Class-Size Experiment - a large, multi-site randomized controlled trial involving 12,000 students - showed that a state program that significantly reduced class size for public school students in grades K-3 had positive effects on educational outcomes. For example, the average student in the small classes scored higher on the Stanford Achievement Test in reading and math than about 60 percent of the students in the regular-sized classes, and this effect diminished only slightly at the fifth-grade follow-up.23
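Statements about standardized effect sizes and percentile comparisons, such as the one above and those in endnotes 1 and 16, can be translated into one another using the standard normal distribution. The short Python sketch below is purely illustrative: it assumes roughly normally distributed test scores and uses no data from the Tennessee experiment itself.

    # Convert a standardized effect size to the approximate percentage of the
    # control-group distribution that the average treatment-group member exceeds,
    # assuming roughly normal outcome distributions. Illustrative only.
    from math import erf, sqrt

    def percentile_exceeded(effect_size: float) -> float:
        """Standard normal CDF evaluated at the effect size."""
        return 0.5 * (1.0 + erf(effect_size / sqrt(2.0)))

    print(percentile_exceeded(0.7))   # ~0.76 - cf. endnote 1 ("about 75 percent")
    print(percentile_exceeded(0.25))  # ~0.60 - an effect size of roughly this size
                                      # would correspond to the "about 60 percent"
                                      # comparison described above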


Based largely on these results, in 1996 the state of California launched a much larger, state-wide class-size reduction effort for students in grades K-3. But to implement this effort, California schools hired 25,000 new K-3 teachers, many with low qualifications. Thus the proportion of fully-credentialed K-3 teachers fell in most California schools, with the largest drop (16 percent) occurring in the schools serving the lowest-income students. By contrast, all the teachers in the Tennessee study were fully qualified. This difference in implementation may account for the fact that, according to preliminary comparison-group data, class-size reduction in California may not be having as large an impact as in Tennessee.24


Example. Three well-designed randomized controlled trials have established the effectiveness of the Nurse-Family Partnership - a nurse visitation program provided to low-income, mostly single women during pregnancy and their children's infancy. One of these studies included a 15-year follow-up, which found that the program reduced the children's arrests, convictions, number of sexual partners, and alcohol use by 50-80 percent.25


Fidelity of implementation appears to be extremely important for this program. Specifically, one of the randomized controlled trials of the program showed that when the home visits are carried out by paraprofessionals rather than nurses - holding all other details the same - the program is only marginally effective. Furthermore, a number of other home visitation programs for low-income families, designed for different purposes and using different protocols, have been shown in randomized controlled trials to be ineffective.26


B. When implementing an evidence-based intervention, it may be important to collect outcome data to check whether its effects in your schools differ greatly from what the evidence predicts.

Collecting outcome data is important because it is always possible that slight differences in implementation or setting between your schools or classrooms and those in the studies could lead to substantially different outcomes. So, for example, if you implement an evidence-based reading program in a particular group of schools or classrooms, you may wish to identify a comparison group of schools or classrooms, roughly matched in reading skills and demographic characteristics, that is not using the program. Tracking reading test scores for the two groups over time, while perhaps not fully meeting the guidelines for "possible" evidence described above, may still give you a sense of whether the program is having effects that are markedly different from what the evidence predicts.
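As a rough illustration of the kind of tracking just described, the following Python sketch compares mean reading scores for intervention schools and a matched comparison group at each testing point. The scores are invented for illustration; a comparison like this is an informal check on whether effects look roughly as the evidence predicts, not a rigorous evaluation.

    # Rough check of an evidence-based reading program: compare mean test scores
    # for implementation schools vs. a roughly matched comparison group over time.
    # All scores below are invented for illustration.
    from statistics import mean

    scores = {
        # testing point -> (intervention-school scores, comparison-school scores)
        "fall":   ([41, 38, 44, 40], [42, 39, 43, 40]),
        "spring": ([52, 47, 55, 50], [46, 44, 49, 45]),
    }

    for period, (intervention, comparison) in scores.items():
        gap = mean(intervention) - mean(comparison)
        print(f"{period}: intervention mean = {mean(intervention):.1f}, "
              f"comparison mean = {mean(comparison):.1f}, gap = {gap:+.1f}")
    # A gap far smaller (or larger) than the studies would predict is a signal
    # to look closely at how the program is being implemented locally.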

APPENDIX A: WHERE TO FIND EVIDENCE-BASED INTERVENTIONS

The following web sites can be useful in finding evidence-based educational interventions. These sites use varying criteria for determining which interventions are supported by evidence, but all distinguish between randomized controlled trials and other types of supporting evidence. We recommend that, in navigating these web sites, you use this Guide to help you make independent judgments about whether the listed interventions are supported by "strong" evidence, "possible" evidence, or neither.
The What Works Clearinghouse (http://www.w-w-c.org/) was established by the U.S. Department of Education's Institute of Education Sciences to provide educators, policymakers, and the public with a central, independent, and trusted source of scientific evidence of what works in education.


The Promising Practices Network (http://www.promisingpractices.net/) web site highlights programs and practices that credible research indicates are effective in improving outcomes for children, youth, and families.

Blueprints for Violence Prevention (http://www.colorado.edu/cspv/blueprints/index.html) is a national violence prevention initiative to identify programs that are effective in reducing adolescent violent crime, aggression, delinquency, and substance abuse.


The International Campbell Collaboration (http://www.campbellcollaboration.org/Fralibrary.html) offers a registry of systematic reviews of evidence on the effects of interventions in the social, behavioral, and educational arenas.

Social Programs That Work (http://www.excelgov.org/displayContent.asp?Keyword=prppcSocial) offers a series of papers developed by the Coalition for Evidence-Based Policy on social programs that are backed by rigorous evidence of effectiveness.

APPENDIX B: CHECKLIST TO USE IN EVALUATING WHETHER AN INTERVENTION IS BACKED BY RIGOROUS EVIDENCE

Step 1. Is the intervention supported by "strong" evidence of effectiveness?

A. The quality of evidence needed to establish "strong" evidence: randomized controlled trials that are well-designed and implemented. The following are key items to look for in assessing whether a trial is well-designed and implemented.


Key items to look for in the study's description of the intervention and the random assignment process

a. The study should clearly describe the intervention, including: (i) who administered it, who received it, and what it cost; (ii) how the intervention differed from what the control group received; and (iii) the logic of how the intervention is supposed to affect outcomes (p. 5).

b. Be alert to any indication that the random assignment process may have been compromised (pp. 5-6).

c. The study should provide data showing that there are no systematic differences between the intervention and control groups prior to the intervention (p. 6).
Key items to look for in the study's collection of outcome data

a. The study should use outcome measures that are "valid" - i.e., that accurately measure the true outcomes that the intervention is designed to affect (pp. 6-7).

b. The percent of study participants that the study has lost track of when collecting outcome data should be small, and should not differ between the intervention and control groups (p. 7).

c. The study should collect and report outcome data even for those members of the intervention group who do not participate in or complete the intervention (p. 7).

d. The study should preferably obtain data on long-term outcomes of the intervention, so that you can judge whether the intervention's effects were sustained over time (pp. 7-8).
Key items to look for in the study's reporting of results

a. If the study makes a claim that the intervention is effective, it should report (i) the size of the effect, and (ii) statistical tests showing the effect is unlikely to be the result of chance (pp. 8-9).

b. A study's claim that the intervention's effect on a subgroup (e.g., Hispanic students) is different from its effect on the overall population in the study should be treated with caution (p. 9).

c. The study should report the intervention's effects on all the outcomes that the study measured, not just those for which there is a positive effect (p. 9).

B. Quantity of evidence needed to establish "strong" evidence of effectiveness (p. 10).


a. The intervention should be demonstrated effective, through well-designed randomized controlled trials, in more than one site of implementation;

b. These sites should be typical school or community settings, such as public school classrooms taught by regular teachers; and

c. The trials should demonstrate the intervention's effectiveness in school settings similar to yours, before you can be confident it will work in your schools/classrooms.
Step 2. If the intervention is not supported by "strong" evidence, is it nevertheless supported by "possible" evidence of effectiveness? This is a judgment call that depends, for example, on the extent of the flaws in the randomized trials of the intervention and the quality of any nonrandomized studies that have been done. The following are a few factors to consider in making these judgments.


A. Circumstances in which a comparison-group study can constitute "possible" evidence:

a. The study's intervention and comparison groups should be very closely matched in academic achievement levels, demographics, and other characteristics prior to the intervention (pp. 11-12).

b. The comparison group should not be comprised of individuals who had the option to participate in the intervention but declined (p. 12).

c. The study should preferably choose the intervention/comparison groups and outcome measures "prospectively" - i.e., before the intervention is administered (p. 12).
d. The study should meet the checklist items listed above for a well-designed randomized controlled trial (other than the item concerning the random assignment process). That is, the study should use valid outcome measures, report tests for statistical significance, and so on (pp. 16-17).

B. Studies that do not meet the threshold for "possible" evidence of effectiveness include: (i) pre-post studies (p. 2); (ii) comparison-group studies in which the intervention and comparison groups are not well-matched; and (iii) "meta-analyses" that combine the results of individual studies which do not themselves meet the threshold for "possible" evidence (p. 13).


Step 3. If the intervention is backed by neither "strong" nor "possible" evidence, one may conclude that it is not supported by meaningful evidence of effectiveness.

REFERENCES



1 Evidence from randomized controlled trials, discussed in the following journal articles, suggests that one-on-one tutoring of at-risk readers by a well-trained tutor yields an effect size of about 0.7. This means that the average tutored student reads more proficiently than approximately 75 percent of the untutored students in the control group. Barbara A. Wasik and Robert E. Slavin, "Preventing Early Reading Failure With One-To-One Tutoring: A Review of Five Programs," Reading Research Quarterly, vol. 28, no. 2, April/May/June 1993, pp. 178-200 (the three programs evaluated in randomized controlled trials produced effect sizes falling mostly between 0.5 and 1.0). Barbara A. Wasik, "Volunteer Tutoring Programs in Reading: A Review," Reading Research Quarterly, vol. 33, no. 3, July/August/September 1998, pp. 266-292 (the two programs using well-trained volunteer tutors that were evaluated in randomized controlled trials produced effect sizes of 0.5 to 1.0, and 0.50, respectively). Patricia F. Vadasy, Joseph R. Jenkins, and Kathleen Pool, "Effects of Tutoring in Phonological and Early Reading Skills on Students at Risk for Reading Disabilities," Journal of Learning Disabilities, vol. 33, no. 4, July/August 2000, pp. 579-590 (randomized controlled trial of a program using well-trained nonprofessional tutors showed effect sizes of 0.4 to 1.2).
2 Gilbert J. Botvin et al., "Long-Term Follow-up Results of a Randomized Drug Abuse Prevention Trial in a White, Middle-class Population," Journal of the American Medical Association, vol. 273, no. 14, April 12, 1995, pp. 1106-1112. Gilbert J. Botvin with Lori Wolfgang Kantor, "Preventing Alcohol and Tobacco Use Through Life Skills Training: Theory, Methods, and Empirical Findings," Alcohol Research and Health, vol. 24, no. 4, 2000, pp. 250-257.
3 Frederick Mosteller, Richard J. Light, and Jason A. Sachs, "Sustained Inquiry in Education: Lessons from Skill Grouping and Class Size," Harvard Educational Review, vol. 66, no. 4, winter 1996, pp. 797-842. The small classes averaged 15 students; the regular-sized classes averaged 23 students.
4 These are the findings specifically of the randomized controlled trials reviewed in "Teaching Children To Read: An Evidence-Based Assessment of the Scientific Research Literature on Reading and Its Implications for Reading Instruction," Report of the National Reading Panel, 2000.
5 Frances A. Campbell et al., "Early Childhood Education: Young Adult Outcomes From the Abecedarian Project," Applied Developmental Science, vol. 6, no. 1, 2002, pp. 42-57. Craig T. Ramey, Frances A. Campbell, and Clancy Blair, "Enhancing the Life Course for High-Risk Children: Results from the Abecedarian Project," in Social Programs That Work, edited by Jonathan Crane (Russell Sage Foundation, 1998), pp. 163-183.
6 For example, randomized controlled trials showed that (i) welfare reform programs that emphasized short-term job-search assistance and encouraged participants to find work quickly had larger effects on employment, earnings, and welfare dependence than programs that emphasized basic education; (ii) the work-focused programs were also much less costly to operate; and (iii) welfare-to-work programs often reduced net government expenditures. The trials also identified a few approaches that were particularly successful. See, for example, Manpower Demonstration Research Corporation, National Evaluation of Welfare-to-Work Strategies: How Effective Are Different Welfare-to-Work Approaches? Five-Year Adult and Child Impacts for Eleven Programs (U.S. Department of Health and Human Services and U.S. Department of Education, November 2001). These valuable findings were a key to the political consensus behind the 1996 federal welfare reform legislation and its strong work requirements, according to leading policymakers - including Ron Haskins, who in 1996 was the staff director of the House Ways and Means Subcommittee with jurisdiction over the bill.
7 See, for example, the Food and Drug Administration's standard for assessing the effectiveness of pharmaceutical drugs and medical devices, at 21 C.F.R. § 314.126. See also, "The Urgent Need to Improve Health Care Quality," Consensus statement of the Institute of Medicine National Roundtable on Health Care Quality, Journal of the American Medical Association, vol. 280, no. 11, September 16, 1998, p. 1003; and Gary Burtless, "The Case for Randomized Field Trials in Economic and Policy Research," Journal of Economic Perspectives, vol. 9, no. 2, spring 1995, pp. 63-84.
8 Robert G. St. Pierre et al., "Improving Family Literacy: Findings From the National Even Start Evaluation," Abt Associates, September 1996.
9 Jean Baldwin Grossman, "Evaluating Social Policies: Principles and U.S. Experience," The World Bank Research Observer, vol. 9, no. 2, July 1994, pp. 159-181.
10 Roberto Agodini and Mark Dynarski, "Are Experiments the Only Option? A Look at Dropout Prevention Programs," Mathematica Policy Research, Inc., August 2001, at http://www.mathematica-mpr.com/PDFs/redirect.asp?strSite=experonly.pdf.
11 Elizabeth Ty Wilde and Rob Hollister, "How Close Is Close Enough? Testing Nonexperimental Estimates of Impact against Experimental Estimates of Impact with Education Test Scores as Outcomes," Institute for Research on Poverty Discussion paper, no. 1242-02, 2002, at http://www.ssc.wisc.edu/irp/.
12 Howard S. Bloom et al., "Can Nonexperimental Comparison Group Methods Match the Findings from a Random Assignment Evaluation of Mandatory Welfare-to-Work Programs?" MDRC Working Paper on Research Methodology, June 2002, at http://www.mdrc.org/ResearchMethodologyPprs.htm. James J. Heckman, Hidehiko Ichimura, and Petra E. Todd, "Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme," Review of Economic Studies, vol. 64, no. 4, 1997, pp. 605-654. Daniel Friedlander and Philip K. Robins, "Evaluating Program Evaluations: New Evidence on Commonly Used Nonexperimental Methods," American Economic Review, vol. 85, no. 4, September 1995, pp. 923-937; Thomas Fraker and Rebecca Maynard, "The Adequacy of Comparison Group Designs for Evaluations of Employment-Related Programs," Journal of Human Resources, vol. 22, no. 2, spring 1987, pp. 194-227; Robert J. LaLonde, "Evaluating the Econometric Evaluations of Training Programs With Experimental Data," American Economic Review, vol. 76, no. 4, September 1986, pp. 604-620.
13 This literature, including the studies listed in the three preceding endnotes, is systematically reviewed in Steve Glazerman, Dan M. Levy, and David Myers, "Nonexperimental Replications of Social Experiments: A Systematic Review," Mathematica Policy Research discussion paper, no. 8813-300, September 2002. The portion of this review addressing labor market interventions is published in "Nonexperimental versus Experimental Estimates of Earnings Impact," The American Annals of Political and Social Science, vol. 589, September 2003.
14 J.E. Manson et al., "Estrogen Plus Progestin and the Risk of Coronary Heart Disease," New England Journal of Medicine, August 7, 2003, vol. 349, no. 6, pp. 519-522. International Position Paper on Women's Health and Menopause: A Comprehensive Approach, National Heart, Lung, and Blood Institute of the National Institutes of Health, and Giovanni Lorenzini Medical Science Foundation, NIH Publication No. 02-3284, July 2002, pp. 159-160. Stephen MacMahon and Rory Collins, "Reliable Assessment of the Effects of Treatment on Mortality and Major Morbidity, II: Observational Studies," The Lancet, vol. 357, February 10, 2001, p. 458. Sylvia Wassertheil-Smoller et al., "Effect of Estrogen Plus Progestin on Stroke in Postmenopausal Women - The Women's Health Initiative: A Randomized Controlled Trial," Journal of the American Medical Association, May 28, 2003, vol. 289, no. 20, pp. 2673-2684.
15 Howard S. Bloom, "Sample Design for an Evaluation of the Reading First Program," an MDRC paper prepared for the U.S. Department of Education, March 14, 2003. Robert E. Slavin, "Practical Research Designs for Randomized Evaluations of Large-Scale Educational Interventions: Seven Desiderata," paper presented at the annual meeting of the American Educational Research Association, Chicago, April 2003.
16 The "standardized effect size" is calculated as the difference in the mean outcome between the treatment and control groups, divided by the pooled standard deviation.
17 Rory Collins and Stephen MacMahon, "Reliable Assessment of the Effects of Treatment on Mortality and Major Morbidity, I: Clinical Trials," The Lancet, vol. 357, February 3, 2001, p. 375.
18 Robinson G. Hollister, "The Growth of After-School Programs and Their Impact," paper commissioned by the Brookings Institution's Roundtable on Children, February 2003, at http://www.brook.edu/dybdocroot/views/papers/sawhill/20030225.pdf. Myles Maxfield, Allen Schirm, and Nuria Rodriguez-Planas, "The Quantum Opportunity Program Demonstration: Implementation and Short-Term Impacts," Mathematica Policy Research (no. 8279-093), August 2003.
19 Guidance for Industry: Providing Clinical Evidence of Effectiveness for Human Drugs and Biological Products, Food and Drug Administration, May 1998, pp. 2-5.
20 Robert J. Temple, Director of the Office of Medical Policy, Center for Drug Evaluation and Research, Food and Drug Administration, quoted in Gary Taubes, "Epidemiology Faces Its Limits," Science, vol. 269, issue 5221, p. 169.
21 Debra Viadero, "Researchers Debate Impact of Tests," Education Week, vol. 22, no. 21, February 5, 2003, p. 1.


22 E. Barrett-Connor and D. Grady, "Hormone Replacement Therapy, Heart Disease, and Other Considerations," Annual Review of Public Health, vol. 19, 1998, pp. 55-72.
23 Frederick Mosteller, Richard J. Light, and Jason A. Sachs, op. cit., no. 3.
24 Brian Stecher et al., "Class-Size Reduction in California: A Story of Hope, Promise, and Unintended Consequences," Phi Delta Kappan, vol. 82, no. 9, May 2001, pp. 670-674.
25 David L. Olds et al., "Long-term Effects of Nurse Home Visitation on Children's Criminal and Antisocial Behavior: 15-Year Follow-up of a Randomized Controlled Trial," Journal of the American Medical Association, vol. 280, no. 14, October 14, 1998, pp. 1238-1244. David L. Olds et al., "Long-term Effects of Home Visitation on Maternal Life Course and Child Abuse and Neglect: 15-Year Follow-up of a Randomized Trial," Journal of the American Medical Association, vol. 278, no. 8, August 27, 1997, pp. 637-643. David L. Olds et al., "Home Visiting By Paraprofessionals and By Nurses: A Randomized, Controlled Trial," Pediatrics, vol. 110, no. 3, September 2002, pp. 486-496. Harriet Kitzman et al., "Effect of Prenatal and Infancy Home Visitation by Nurses on Pregnancy Outcomes, Childhood Injuries, and Repeated Childbearing," Journal of the American Medical Association, vol. 278, no. 8, August 27, 1997, pp. 644-652.
26 For example, see Robert G. St. Pierre et al., op. cit., no. 8; Karen McCurdy, "Can Home Visitation Enhance Maternal Social Support?" American Journal of Community Psychology, vol. 29, no. 1, 2001, pp. 97-112.