

Cultural Validity in Assessment:
A Cross-Cultural Study on the Interpretation of Mathematics and Science Test Items

Ursula Sexton & Guillermo Solano-Flores

WestEd

Paper presented at the 2002 meeting of the American Educational Research Association. New Orleans, LA, April 1-5.

Introduction

The increasing demand for educational accountability is reinforcing long-standing assessment development and testing practices. Academic achievement test scores not only drive curriculum decisions but also allow comparisons of program- and school-level achievement, inform policy makers, and reflect trends on a larger scale. However, the conditions that allow students to achieve are not the same for all students (Delpit, 1995; Lipka, 1998; Nelson-Barber, 1999, forthcoming), nor do all students have equal access to adequate instructional content. As such, the consequences of testing can play out differently for gender, ethnic, racial, language, and social class groups (College Board, 1999; Knapp, Turnbull & Shields, 1990; Oakes, 1985; Steele & Aronson, 1995; Steele, 1997). This is of particular concern in mathematics and science education (American Association for the Advancement of Science, 1990, 1993; American Association of University Women, 1998; Kusimo, Ritter, Busick, Ferguson, Trumbull & Solano-Flores, 2000; National Research Council, 1996).

This paper is part of a larger National Science Foundation-funded study, "Assessing the Cultural Validity of Science and Mathematics Assessments," which examines whether cultural validity should be considered in assessment development and testing practices (Solano-Flores & Nelson-Barber, 2001). Our conceptual framework is based on the Vygotskian notion that sociocultural setting shapes mental functioning and that different groups socialize their members to create meaning from experience in culturally determined ways. We are exploring how the thinking, communication, and learning styles inherent to a culture, as well as sociocultural activity in informal settings, influence how students interpret and respond to science and mathematics tests, and how culture influences the inferred cognitive activity elicited by those exercises.


Method

In this paper we report results for three cultural groups: indigenous Chamorro and Carolinian (Micronesian) students from the Commonwealth of the Northern Mariana Islands (CNMI) (n = 14), rural Alaskan Yup'ik ("Eskimo") students (n = 18), and immigrant Latino students from rural central Washington (n = 32). Each student was given one item from a set of two mathematics and two science items selected from the released items of the National Assessment of Educational Progress (NAEP) issued in 1996.

The mathematics items involved basic computational and problem solving skills. Gumball involves understanding of probability and estimation, while Lunch Money involves a three-step solution to a money problem.

The science items came from two distinct disciplines, Earth Science and Physical Science. Mountains asks students to identify the effects of erosion on landforms, as compared through an illustration. Metals expects students to give at least two reasons why metals are used to make different things, identifying properties of metals.

After they responded to the items, students were interviewed individually to elicit information on how they related the item's content to various contexts of their personal experiences and daily lives, as well as how these may have influenced the reasoning and strategies used to complete the item.

The interviews centered on two questions, which can be characterized as Item Focus and Item Referent (see Solano-Flores, 2001). The former explores how the student perceives the content and nature of an item: What is this item about? (What do you think it is asking from you?) The latter explores the knowledge and skills base perceived by the student as needed to respond to the item.

The students' responses to these two questions were coded from the interviews according to three types of interpretation categories (see Solano-Flores, 2001). Text/Basic Skills means that the student relates the item to the same information provided by the item or to general academic skills. Context/Personal Experience means that the student relates the item to contextual information provided by the item or to their own first-hand, everyday life experience. Content/Concepts means that the student relates the item to formal learning that usually takes place at school.

Even though the interviews were structured, some latitude was needed to adapt the interview to the answers given by the student. Therefore, in some cases, the Focus and Referent questions were asked (or responded to by the student) as a whole. The coding and analyses were based on the compounded number of observations from the instances in which the two questions were asked either independently or as a whole. Two readers coded the student responses with the categories described above. Inter-coder agreement of 97.5, 90, and 95 percent was reached for Focus, Referent, and Focus-and-Referent, respectively.
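The paper does not state how inter-coder agreement was calculated. A minimal sketch, assuming simple percent agreement (the share of responses that both readers assigned to the same category), is shown below. The function name and the ten example codes are hypothetical and are not study data.

```python
def percent_agreement(codes_a, codes_b):
    """Percent of responses that two coders labelled with the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same responses")
    if not codes_a:
        raise ValueError("no responses to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codes for ten responses (illustrative only, not study data).
coder_1 = ["Text", "Text", "Context", "Content", "Text",
           "Text", "Content", "Context", "Text", "Text"]
coder_2 = ["Text", "Text", "Context", "Content", "Text",
           "Context", "Content", "Context", "Text", "Text"]

agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.1f}")  # 90.0 (the coders differ on one of ten responses)
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected index would be a different calculation from the one sketched here.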


Results

Tables 1 to 3 show the results based on effective observations (number and percent). Cases in which the questions were not asked, the student responses were not clear, or the student did not respond were not taken into account. Tables 1 and 2 show the results by site and by item, respectively. Table 3 breaks the results down by item within each site.

The results show that students in CNMI relied most heavily on Text/Basic Skills as a type of interpretation (75% of observations), compared with 52% and 67% for Central Washington and Alaska, respectively. This was the most frequent type of interpretation at all three sites.

Table 1. Type of interpretation percentages by site. (Number of observations in parentheses).

                              CNMI       Central Washington   Alaska
                              (20)       (29)                 (30)
Text/Basic Skills             75 (15)    52 (15)              67 (20)
Context/Personal Experience   10 (2)     24 (7)               13 (4)
Content/Concepts              15 (3)     24 (7)               20 (6)
Total %                       100        100                  100
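The percentages in Table 1 follow directly from the observation counts in parentheses. As a short illustration, the sketch below recomputes them by dividing each count by its site total and rounding to the nearest whole percent. The variable and function names are ours, chosen for illustration; the counts are those reported in Table 1.

```python
# Observation counts reported in Table 1, keyed by site and interpretation type.
counts = {
    "CNMI": {
        "Text/Basic Skills": 15,
        "Context/Personal Experience": 2,
        "Content/Concepts": 3,
    },
    "Central Washington": {
        "Text/Basic Skills": 15,
        "Context/Personal Experience": 7,
        "Content/Concepts": 7,
    },
    "Alaska": {
        "Text/Basic Skills": 20,
        "Context/Personal Experience": 4,
        "Content/Concepts": 6,
    },
}


def to_percentages(category_counts):
    """Convert one site's counts to whole-number percentages of its total."""
    total = sum(category_counts.values())
    return {cat: round(100 * n / total) for cat, n in category_counts.items()}


percentages = {site: to_percentages(c) for site, c in counts.items()}
for site, pct in percentages.items():
    print(site, sum(counts[site].values()), pct)
```

Because each cell is rounded separately, a column of rounded percentages is not guaranteed to sum to exactly 100 in general, although it does for the three sites here.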


Close to one fourth (24%) of the observations for the Latino students in Central Washington showed interpretation via personal experience, the same share as interpretation based on conceptual content. This contrasts with the 10 and 13 percent of observations in CNMI and Alaska, respectively, in which students interpreted the items via personal experience.

Interpretation via content and concepts was most frequent in Central Washington (24%), followed by Alaska, where 20% of the observations showed students interpreting the items using learned content or academic concepts. For the CNMI students this was the second most frequent type of interpretation (15%), leaving personal experience as their least frequent.

Mountains was the item most often interpreted with the Text/Basic Skills type (91%), followed by Metals (74%). This contrasts with 44 and 41 percent for Gumball and Lunch Money, respectively.

The next most frequent type of interpretation observed was Content/Concepts for Gumball (44%). This type of interpretation was evident in 36% of the observations for Lunch Money. For Mountains, the Content/Concepts and Context/Personal Experience interpretation types were each evident in only 5 percent of the observations.

Table 2. Type of interpretation percentages by item. (Number of observations in parentheses).

                              Gumball    Lunch Money   Metals     Mountains
                              (16)       (22)          (19)       (22)
Text/Basic Skills             44 (7)     41 (9)        74 (14)    91 (20)
Context/Personal Experience   13 (2)     23 (5)        26 (5)     5 (1)
Content/Concepts              44 (7)     36 (8)        0          5 (1)
Total %                       100        100           100        100

Note: Numbers may not add to 100 due to rounding.

The Content/Concepts type of interpretation was not observed at all for the Metals item.

The two most frequently observed types of interpretation for Gumball were Text/Basic Skills (44%) and Content/Concepts (44%), while the Context/Personal Experience type was observed in 26 percent of the observations for Metals and in 23 percent of the observations for Lunch Money.

Our observations reinforce the notion that an item's contextual connection to students' personal experiences can affect how they interpret and respond to it. Knowledge acquired through formal instruction at school and through informal, first-hand experiences at home and within their communities shapes the way students make sense of test items. Where such connections are missing, students must depend largely on text and basic skills to relate to the items in a test. This is evident in the Metals item, for which some students drew upon personal experiences to interpret the content of the item, while those who did not depended solely upon textual cues and basic skills. Conversely, for Gumball, personal and contextual interpretations were less frequent, and conceptual and content-related interpretations were as frequent as those based on textual cues and basic skills.

These results reinforce the notion that students from different cultural groups exhibit different patterns in the ways they interpret science and mathematics exercises, and that context and personal experience play a significant role in the interpretation of the items.


Conclusions

Assessments should avoid inaccurate assumptions about students' experiences that penalize cultural groups (Kopriva & Sexton, 1999). They should take into account how culture influences both the ways in which people construct knowledge (Greenfield, 1998) and the ways in which individuals learn and teach in both informal and school settings (Lipka, 1998).

The implications of this study shed new light on how science and mathematics test items should be developed to honor cultural diversity while measuring the same high standards desired for all students. We may learn, for instance, that new and improved methods for developing assessments should be created to address item wording and illustration selection more effectively. These new methods would ensure that imaginary situations and stories used with the intent of making an item meaningful are not based on inaccurate assumptions about students' experiences, lives, and values, and that well-intended illustrations are not misleading.

Thus far, the results speak to the relevance of culture as a factor that must be considered in the development of science and mathematics tests.


References

American Association for the Advancement of Science. (1990). Science for all Americans: A Project 2061 report on literacy goals in science, mathematics, and technology. New York: Oxford University Press.

American Association for the Advancement of Science. (1993). Benchmarks for science literacy. New York: Oxford University Press.

American Association of University Women. (1998). How schools shortchange girls: The AAUW Report. A study of major findings on girls, education. Washington, DC: AAUW Foundation: NEA.

American Association of University Women. (1998). Gender gaps: Where schools still fail our children. Washington, DC: AAUW Foundation.

College Board. (1999). Reaching the top: A report of the national task force on minority high achievement. New York: College Board Publications.

Delpit, L. (1995). Other people's children. New York: The New Press.

Kopriva, R., & Sexton, U. (1999). Guide for linguistic issues in science assessment scoring by monolingual teachers. Washington, DC: Council of Chief State School Officers.

Knapp, M., Turnbull, B. & Shields, P. (1990, September). New directions for educating the children of poverty. Educational Leadership.

Kusimo, P., Ritter, M.G., Busick, K., Ferguson, C., Trumbull, E., & Solano-Flores, G. (2000). Making assessment work for everyone: How to build on student strengths. San Francisco, CA: Regional Educational Laboratories, WestEd.

Lipka, J. (Ed.). (1998). Transforming the Culture of Schools: Yup'ik Eskimo Examples. Hillsdale, NJ: Lawrence Erlbaum.

Nelson-Barber, S. (1999). A better education for every child: The dilemma for teachers of culturally and linguistically diverse students. In N. Simms & A. Peralez (Eds.), Roundtable on Culturally and Linguistically Diverse Student Populations. Aurora, CO: Mid-continent Regional Educational Laboratory.

Nelson-Barber, S. (Forthcoming). Exploring Pacific knowledge and classroom learning in Micronesia: The promise of "cultural considerations." Research on the Education of Asian and Pacific Americans.

Oakes, J. (1985). Keeping track: How schools structure inequality. New Haven: Yale University.

Solano-Flores, G. (2001). Worldviews and testviews: the relevance of cultural validity. Paper presented at the European Association of Research in Learning and Instruction. Fribourg, Switzerland, August 28-September 1.

Solano-Flores, G. & Nelson-Barber, S. (2001). On the Cultural Validity of Science Assessments. Journal of Research in Science Teaching, 38(5), 553-573.

Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African-Americans. Journal of Personality and Social Psychology, 69, 797-811.

Steele, C. M. (1997). A threat in the air: How stereotypes shape the intellectual identities and performance of women and African-Americans. American Psychologist, 52, 613-629.

Trumbull, E., Rothstein-Fisch, C., Greenfield, P. M., & Quiroz, B. (2001). Bridging cultures between home and school: A guide for teachers. Mahwah, NJ: Lawrence Erlbaum Associates.


Note

This paper reports some of the results from a project funded by the National Science Foundation, grant number REC-9909729. We are grateful to Rebeca Díaz, Jo Ann Izu and Rachel Lagunoff for their participation in the data collection stage of the project, and to Sharon Nelson-Barber for her collegial support.


Table 3. Type of interpretation percentages by item and site. (Number of observations in parentheses).

CNMI
                              Gumball    Lunch Money   Metals     Mountains
                              (5)        (6)           (4)        (5)
Text/Basic Skills             40 (2)     67 (4)        100 (4)    100 (5)
Context/Personal Experience   20 (1)     17 (1)        0          0
Content/Concepts              40 (2)     17 (1)        0          0
Total %                       100        100           100        100

Central Washington
                              Gumball    Lunch Money   Metals     Mountains
                              (7)        (6)           (7)        (9)
Text/Basic Skills             29 (2)     33 (2)        43 (3)     89 (8)
Context/Personal Experience   14 (1)     33 (2)        57 (4)     0
Content/Concepts              57 (4)     33 (2)        0          11 (1)
Total %                       100        100           100        100

Alaska
                              Gumball    Lunch Money   Metals     Mountains
                              (4)        (10)          (8)        (8)
Text/Basic Skills             75 (3)     30 (3)        87 (7)     88 (7)
Context/Personal Experience   0          20 (2)        13 (1)     13 (1)
Content/Concepts              25 (1)     50 (5)        0          0
Total %                       100        100           100        100

Note: Numbers may not add to 100 due to rounding.
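The counts in Table 3 can be summed across items to recover the site counts in Table 1, and across sites to recover the item counts in Table 2. The sketch below performs that aggregation as a consistency check. The data structure and names are ours, chosen for illustration; the counts are those reported in Table 3.

```python
SITES = ("CNMI", "Central Washington", "Alaska")
ITEMS = ("Gumball", "Lunch Money", "Metals", "Mountains")

# counts[site][item] = (Text/Basic Skills, Context/Personal Experience,
#                       Content/Concepts), as reported in Table 3.
counts = {
    "CNMI": {
        "Gumball": (2, 1, 2), "Lunch Money": (4, 1, 1),
        "Metals": (4, 0, 0), "Mountains": (5, 0, 0),
    },
    "Central Washington": {
        "Gumball": (2, 1, 4), "Lunch Money": (2, 2, 2),
        "Metals": (3, 4, 0), "Mountains": (8, 0, 1),
    },
    "Alaska": {
        "Gumball": (3, 0, 1), "Lunch Money": (3, 2, 5),
        "Metals": (7, 1, 0), "Mountains": (7, 1, 0),
    },
}

# Sum over items within each site (compare with the counts in Table 1).
by_site = {
    site: tuple(sum(counts[site][item][k] for item in ITEMS) for k in range(3))
    for site in SITES
}

# Sum over sites for each item (compare with the counts in Table 2).
by_item = {
    item: tuple(sum(counts[site][item][k] for site in SITES) for k in range(3))
    for item in ITEMS
}

print(by_site)
print(by_item)
```

Both sets of totals match the counts reported in Tables 1 and 2, and each accounts for the same 79 effective observations.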
