How does the Learning Record Model Compare with Existing Methods of Measurement in Assessing Student Literacy Learning?

M. A. Syverson and Mary Barr
There are three dominant models of literacy assessment besides the LRO/CLR/PLR in use today: grades, standardized tests, and portfolios. The first model is the classroom-based assignment of grades.
Grades indicate the degree to which learners meet teachers' expectations for students in a particular course. This system allows for great diversity in classroom learning situations, since it does not dictate the use of standardized tasks and activities. However, it is extremely difficult to use grades as a basis for comparison across student populations, even between two classes at the same grade level and subject in the same school. Therefore, grades alone are not sufficient for determining the relative success of programs, departments, schools, or districts, for example.
The system of classroom grades does reflect the teacher's proximity to the learner and the learning situation, but there are no assurances against bias, no substantive accountability for what is taught or learned, and no real connections to current learning theory. Individual grades tend to obscure or discount students' and teachers' collaborative activities, to privilege products over activities, and to reinforce behaviorist assumptions about learning long discredited among learning theorists. Grades become the reward intended to motivate students to behave in certain ways or to punish them for their inability or unwillingness to perform as expected. Students and teachers alike gravitate toward safe school assignments and responses.
As a counterpoint to the grading system, and to compensate for some of its limitations, the system of standardized assessment has emerged, including such tests as the PSAT, the SAT, the ACT, various state-mandated tests, and other test packages marketed by assessment publishers. The popularity and influence of standardized tests rest on the claims of test providers for their objectivity, reliability, and generalizability. Such claims grow out of a misconceived analogy with scientific research. However, these claims have for some time now been challenged by a large number of researchers and theorists in the field of learning and development (Ball, 1993; Calfee, 1992; Freedman, 1993; Moss, 1994; Sadler, 1987). For example, claims of objectivity, they argue, are seriously compromised by precisely what standardized test providers have regarded as a strong point: the removal of the test situation from authentic contexts for learning.
There is also a great deal of criticism from conservative factions among the public, academics, and politicians about the steps test-makers have taken to account for the learning of students from non-mainstream cultures. The inclusion of reading passages by writers of color in the California state test of reading, for example, has been loudly challenged. In San Diego, for instance, school district administrators' shift to performance assessment has been met with distrust by African-Americans, who believe that multiple-choice testing avoids teacher bias and is thus fairer to their children.
The claims for reliability and generalizability of standardized tests have presumably assured generations of test-takers that the tests themselves are fair. But are they? A thermometer provides an objective measure of body temperature, but taken by itself this measure reveals little about the patient's physical condition, even to the doctor, who must conduct an examination to determine the cause of an abnormal reading. Similarly, standardized test scores may take students' academic temperatures while indicating very little of what they actually know and know how to do, leaving teachers and parents to try to decide what to make of the test results. This limitation in standardized testing is true not only for measuring understanding of complex concepts but even for students' grasp of the so-called basics.
The psychometric approach to assessment relies on inferences about achievement drawn from a single performance, which may or may not represent what the individual student actually knows or can do. The inferences are drawn without knowing anything about the learning situation, the students, or their teachers. Further, and most at odds with current learning theory, these forms of assessment rely on given tasks with single answers. The tasks are, therefore, often opaque to students because they are inauthentic and because they represent only fragments of the domain of knowledge being sampled.
A major claim for this kind of assessment is that comparability of the levels of performance can be shown across student populations--across classrooms, regions, ethnic groups, gender, socioeconomic classes. But, as Moss (1994) points out, these generalizations rest on human judgments of performance every bit as much as those of performance-based approaches do. The question in her mind, and in ours, is "whether those generalizations are best made by limiting human judgment to single performances, the results of which are then aggregated and compared with performance standards, or by expanding the role of human judgment to develop integrative interpretations based on all the relevant evidence" (p. 8). Certainly it is important to know how well schools and their programs are working, and which students are failing to thrive in a particular environment, so that school staffs, district offices, state departments of education, and legislators can direct resources appropriately. However, as Moss argues, reliability without validity is a meaningless concept (p. 10).
When test scores are compared across populations, a great deal is dropped out of the picture or significantly misrepresented: How well can we represent the larger populations of recently-arrived immigrants who are not yet proficient in English? Where are the children in migrant families whose education may be repeatedly interrupted, and who may not even be in school on the day the assessment is given? What about children with disabilities, or with unconventional or unrecognized capabilities, or even the students with inspired or nontraditional teachers? As a result of these difficulties, many critics have become seriously concerned with the distorting effect of standardized assessments on teaching and learning situations. Standardized norm-referenced tests can provide some meager indication of the learning of students but, alone, they provide a poor measure that turns classrooms into places where teachers must prepare their students to be sorted and graded like so many cattle.
These concerns with grades and standardized assessment have fueled a search for more authentic and flexible methods of assessment. A major recent development is the portfolio assessment movement. Portfolios are collections of student work, often accompanied by some student reflections on what the collection represents. The flexibility in providing materials for a portfolio allows this form of assessment to accommodate a wider range of work, as well as the potential to account for the work of students from diverse cultural and linguistic backgrounds.
Proponents say portfolios can capture more of the learner's progress over time than either grades or tests can. Since the materials are typically produced in the course of normal classroom activities, the portfolios are believed to provide more "authentic" assessment, that is, assessment closely linked to, and reflective of, actual learning situations. Portfolios present some challenges when it comes to assessment beyond the classroom, however. Because they are diverse compilations of materials, it is difficult to make any comparison across student populations, or to make well-supported interpretations about the effectiveness of programs, schools, or districts. It is very difficult to achieve consistent results with different readers. And it is difficult to make informed interpretations about the student's development over time. It can be expensive and time-consuming to train readers and to conduct the assessment itself, because of the volume of materials that must be sifted through.
The portfolio assessment movement has attempted to address these difficulties by establishing some standard requirements for portfolios, either in terms of tasks and activities or products. The further we push in this direction, however, the closer we come to replicating the worst features of standardized assessment.
The Learning Record model
The Learning Record model incorporates the strengths of all three of these kinds of student assessment while also addressing their weaknesses. At the classroom level, the Records compile evidence of students' learning from multiple sources, including student writing in response to class activities, students' observations and interpretations of their own learning, interviews with family members or other adults, and teachers' observations and interpretations of student learning. The Record is accumulated over time, yet it is not merely additive; materials are selected for inclusion where they provide evidence in support of interpretations about learning. In this respect, it is a more sophisticated form of portfolio evaluation, and as such, can inform grades at the classroom level.
However, to be useful as an assessment of learning, we must be able to take this richly documented perspective on student learning to a larger scale. We are able to accomplish this through a unique process of moderation readings, which serve to guard against subjective bias, establish comparability, and assure validity, without sacrificing any of the benefits of authenticity. (See, for example, the technical reports on the moderation readings.) As a large scale assessment of literacy learning, the multiple perspectives on student achievement can be used profitably without losing their contributions to understanding how students are making progress.
While a major goal of both standardized testing and portfolio assessment is to eliminate, as much as possible, any evidence of learning supplied by the teacher, the teacher's professional judgment is central to the Learning Record. The assumption is that the adult professional in the classroom is especially situated, educated, and prepared to observe and interpret student progress and achievement in terms of our shared expectations for and understanding of literacy development. Students, too, are positioned to contribute significantly to the evaluation of their own learning. The Learning Record approach validates teachers' and students' unique perspectives on student learning through a process of faculty seminars and moderation readings. In the process, teachers have the opportunity to deepen and enrich their understanding of their students as well as their awareness of effective teaching and learning practices.
Portfolios of student work have been a major trend in student evaluation and large-scale assessment for at least twenty years. Portfolios are wonderfully rich sources of data about student learning—if you know where and how to look. Eportfolios in general have improved the collection of data and made it easier for students to gather data in electronic format, including types of data that cannot be represented in print. Some of these systems are far more technologically advanced than the Learning Record. But there is a significant problem with portfolios, including eportfolios: while they continue to make data collection easier for writers, they do little or nothing to aid readers. Confronted with a bewildering variety of data, now scattered in blogs, multimedia presentations, email messages, collaborative group work, and so on, how do teachers make sense of students' development? And where do they find the time to manage even a cursory browsing of such diverse and random samples?
This is where the Learning Record shines. Because of its structure, information about student learning, no matter how diverse, is organized in consistent, meaningful sections that can be quickly accessed and understood by readers across all disciplines. Because the five dimensions of learning ground the Learning Record, students and teachers can talk together about how development has occurred. Because students provide an analysis of their learning based on these dimensions, teachers can quickly determine whether students have grasped the important teaching and learning concepts, skills, and activities of the course. Because students are responsible for evaluating their work by matching grade criteria against evidence in the Learning Record, teachers can see how well students connect evidence of learning with interpretation and judgment. This shared structure and underlying theoretical framework provide teachers and students with a rapid and precise way of evaluating learning and discussing it together.