April 1, 2001
Should we attempt to assess or evaluate arts experiences? Can we assess or evaluate aesthetic experience? What do numbers, measurement, and accountability have to do with arts education? Should research in the arts or education follow the established dictates of experimental science: control groups, statistical probability measures, researcher distance or "objectivity"? Can arts experiences be "generalized"? The answers to these questions are, of course, both "yes" and "no," or "nothing and a great deal."
Educators in the United States often wish for an educational system in which the curriculum is sequential, comprehensive, and standardized across school districts or even across the nation. Such systems exist in other, more homogeneously populated nations, but no such system has ever existed in this nation, and one probably won't in our lifetime. Whether that is a good thing or not is beside the point. The reality of education in the American context is, in Gardner's terms, "highly dispersed, with each of the 50 states and many of the 16,000 school districts having their own programs." Gardner reports that "'Context' has not been my favorite concept, but I have gained a new respect for its importance."1 Although he is referring to in-school arts curricula, the contextual nature of the work takes on new and more complex features when partnership arts-in-education programs are developed jointly by schools and cultural organizations. Even in more traditional school arts curriculum projects, the more innovative programs cannot assume "a familiar and supporting context, [they] must in part create a new context."2 It has become increasingly important, as partnership programs have expanded with renewed funding, to account for contextual elements in our assessments of student learning and our evaluation of instructional programs. The ways in which contextual variables are incorporated into instructional designs and evaluated by researchers have become the defining elements in measures of success. Those measures of achievement, impact, or operational implementation that do not account for complex sets of variables are judged to be incomplete or inadequately designed. Just as it is important to design arts education instruction around those characteristics of the arts and arts experiences that are necessary for their definition, so it is important to evaluate arts education programs according to those contextual variables that are necessarily part of their definition.
If such programs "must create a new context," then our research and evaluation efforts must attempt to document and account for the ways in which the new contexts are shaped by the programs. Such research should, as Winner and Hetland say, "explore the ways in which the arts may change the entire atmosphere of a school. This way we can begin to understand how the arts affect the 'culture of learning' in a school. We can then develop rich, qualitative measures to evaluate whether the arts lead to deepened understanding of, and engagement in, non-arts areas."3 The evaluation work described in this paper is aimed at creating this kind of rich documentation: of the context variables in an elaborate partnership arts-in-education program, and of the ways that students, schools, and communities change in response to these new combinations of variables.
Many attempts to evaluate the impact of arts education on students have focused on student achievement, development of vocabulary, or reading comprehension scores (DuPont, 1992; Gourgey, Bosseau and Delgado, 1985; Hudspeth, 1986; Kardash and Wright, 1987). These studies, which have found positive effects on test scores, are based on well-integrated curriculum or artist-in-residence programs. Many other studies, however, have not found positive effects (Lauder, 1976; Miller, Rynders and Schleien, 1993; Trusty and Oliva, 1994). All of these studies focus on transfer effects and measure change in terms of the receiving subject area: reading, math, sometimes science. Few studies attempt to define transfer in terms common to the arts. Music is something of an exception, in that mathematics and music have some commonalities, and research on the physiology of the brain has identified some physical changes that result from the practice of music.
In any case, all these studies have attempted to find ways of quantifying the data collected. That means, of course, that the researchers must either start with data that is clearly quantifiable or discover new ways to quantify information that has not been seen as quantifiable. This researcher has received two recent reports that talk about the "countless" ways that certain social or human characteristics are reported in the studies. The word "countless" is a difficult one for researchers to accommodate, yet that is what much recent educational research has had to do: accommodate variables almost beyond number. Once researchers move beyond easily observable, countable phenomena, such as the number of subjects who spell a word correctly, the amount of time spent on a task, or the quantity of paint used to cover a given area, they enter scarcely charted arenas.
There is one major reason to enter these arenas: the phenomena researchers have been able to count turn out not to count for much in the overall scheme of things. As the dictum famously attributed to Einstein has it, "Not everything that can be counted, counts." The kinds of things that we have been able to count in education (scores on reading and math tests, attendance, responses on attitude surveys) have some, but limited, importance in the overall definition of human beings. And, in the history of education in this country, the sum total of all the counting has been that researchers generally conclude that our educational system is failing our children and our nation. Is it possible that by insisting on counting-house answers, on bottom-line accountability, we are asking the wrong questions, focusing on the wrong topics, and "counting" in the wrong ways? Where in all those measures of education are our measures of emotional responses, sensory discrimination, or the creation of new meaning? How, in all our measures of frequencies, do we measure countervailing forces of sensation or understanding? How often do we suspend our calculation of standard deviations to measure the impact of long-term involvement in the arts on lives, emotions, and relationships?