Assessing Students' Learning
Designing and Administering Tests
For testing to be as effective and worthwhile for you and your students as possible, consider the exams you’ll implement when you’re designing a course. If evaluation is considered only in hindsight, it’s likely your time will be used ineffectively and students will be discontent with how their learning was assessed.
Design tests that will measure the goals you set out to achieve in the course and be clear in your instructions. Walvoord and Anderson recommend teachers ask themselves the following question: “By the end of the course, I want my students to be able to (fill in the blank).” Use your responses to guide assessment design.
It’s often advantageous to mix types of items (multiple choice, essay, short answer) on a written exam or to mix assessments throughout the course (e.g., a performance component with a written component). Mixing formats minimizes the weaknesses connected with any one type of item or with a particular aspect of students’ test-taking skills. It’s also useful to ask how students are likely to use what they are learning in your course in the future. If they’ll be expected to recognize an example of a phenomenon or category, then give them opportunities to attempt such recognition in your course. If they’ll be asked to evaluate the evidence for a claim relevant to your field, then your assignments should give them practice in such evaluation and graded feedback on their skill at it. Be sure that your assignments (both for practice and for grading) engage students in the kind of knowing or understanding that will be useful to them in future courses and in application to real life.
The process of placing a category judgment such as a grade on student work is rarely easy. In some cases, you can simply count the number of factual or simple items done correctly, but understanding measured by a more complex performance will need to be judged. Walvoord and Anderson (1998) outline strategies for grading in a variety of fields, with plenty of examples. They claim that establishing a set of clear criteria ahead of time will make grading easier for the teacher, more consistent across students, and even faster to complete. The key is to think through the range of feedback you want to give (e.g., points from 1 to 10 or letters from A to F) and identify how you would recognize or characterize a performance in each category. What are the strengths of an answer at each level, and what might be missing that would keep it from being in a higher category? What are the habits of mind or the kinds of knowledge demonstrated that characterize various levels of understanding?
When you engage in this kind of thinking, your work giving feedback will be less challenging and more efficient. If you then share those criteria with your students, they can learn more clearly what you mean by understanding, and there will be fewer occasions for disagreement about feedback. Ambiguous or unstated criteria are a common cause of conflict and frustration for students. Investing time up front to think through your grading criteria will pay dividends in saved time and hassle later.
Time-limited assessments such as tests or presentations can be very stressful for all concerned. Especially in large classes that play a role in sorting out students’ future careers, there can be tension and challenges to academic honesty. Whenever possible, it’s best to create testing occasions that avoid some of the tension and potential for abuse. If your tests are mostly at the rote end of the Bloom framework of understanding, students will perceive that their primary job is to memorize and regurgitate bits of knowledge; these are the kinds of tests that are most amenable to various forms of unacceptable collaboration or information transfer. Whenever possible, include items that ask students to do more than merely memorize. You can even provide the basic information in the question, but ask students to demonstrate their ability to use intellectual skills to analyze the information given. Items that involve written answers present fewer issues than items with multiple choice formats. Exam items that are more complex in the Bloom framework are not as amenable to academic misconduct. That will relieve your testing situation of some tension due to mistrust and avoid the necessity for maximum security procedures.
If you decide to use test performances that lend themselves to various forms of misconduct, then you’ll need to adopt a more skeptical attitude. There are many sources of practical advice, such as alternating forms and mixing bluebooks. See Davis’ (2001) guidelines in Tools for Teaching for more suggestions.
Ben Eggleston redesigned his introductory ethics tests to avoid simply testing memorization while still making his exams easy to grade. His tests retained their multiple-choice format but required students to apply knowledge and definitions instead of simply restating them.
The new questions were set up as conversations in which students choose the statements that reflect particular ethical positions. Unlike questions that test only memorization of definitions, they require students to apply a deeper understanding of concepts to novel situations. The conversational format has two advantages: students have to grasp the content rather than merely recall a phrase or expression from the book or class notes, and the questions better test the kind of understanding that will serve students well outside the classroom.
Old question: What is the main idea of cultural relativism?
- Moral beliefs vary from one culture to another.
- Morality itself (not just moral beliefs) varies from one culture to another.
New question: In the following dialogue, which of the following statements is incompatible with cultural relativism?
- Some countries rely heavily on child labor, and would suffer devastating economic consequences if they were forced to give it up.
- Despite these consequences, the harms to children are too great to ignore. It is wrong of those cultures to force children to work.
For more information, see Professor Eggleston’s Course Portfolio.
Robert Magnan (1990) suggests taking your students on a “test drive” to help them prepare for your exams. When you design a test, save items you decide not to use. Make a practice test with these items along with instructions for the exam, including the percentage or points for each section or exercise, and have students complete this practice test in class.
This technique has two advantages: You can test your exams and expose students to instructions. If an exam structure is weak, you can improve it before the exam. If instructions are unclear, you can clarify them.
The test drive should include only a sample of test items. Correct and discuss them as a group. If there are several possible answers, indicate which are better and why. If you’ve included essays, ask students to list the essential points they think should be included when they answer the essay question, and then evaluate their responses.
The key is to use the minimum amount of time to get the maximum benefit for you and your students.
What does it mean to grade? Grading is a context-dependent, complex process that is at its best when teachers recognize the opportunity it offers to enhance student learning. Walvoord and Anderson (1998) identify four major roles of the grading process:
- It works as a means of evaluating student learning in relation to course material and goals.
- It can communicate the level of learning to the students, as well as to employers and others.
- It functions as a motivation device in that it affects what students focus on in their studies.
- It helps organize course components by marking transitions between topics and by bringing closure to particular segments of the class.
For grading to be as effective and worthwhile for you and your students as possible, consider the tests you will implement when you are designing the course (see Course Design for more information). Design tests that will measure the concepts and learning that you set out to achieve in the course, allow student input when designing course goals, and be clear in your instructions. Walvoord and Anderson recommend that teachers ask themselves the following question: “By the end of the course, I want my students to be able to (fill in the blank).” Use your responses to guide the design of your assessments.
Walvoord and Anderson provide examples from professors of several disciplines:
At the end of Western Civilization [a 100-level general education course for first-year students], I want my students to be able to:
- Describe basic historical events and people.
- Argue as a historian does: Take a position on a debatable historical issue, use historical data as evidence for the position, raise and answer counterarguments.
At the end of this math course, I want my students to be able to:
- Solve [certain kinds] of mathematics problems.
- Explain what they’re doing as they solve a problem and why they are doing it.
If grading is considered only in hindsight, it is likely that your time will be ineffectively used and students will be discontent with how their learning was assessed.
For more recommendations for grading tests, see the information on rubrics and Primary Traits Analysis under Grading Writing Assignments.
Walvoord, B. & Anderson, V.J. (1998). Effective Grading. San Francisco: Jossey-Bass.
Designing Writing Assignments
John C. Bean (2011) states that writing assignments, particularly essay exams, can help students exhibit their mastery of material, synthesize course material, and better understand the goals and direction of the overall course, thus increasing overall retention and understanding of material. He states, “Essay exams send the important pedagogical message that mastering a field means joining its discourse, that is, demonstrating one’s ability to mount effective arguments in response to disciplinary problems.”
In order for students’ writing in assignments and exams to improve, students need to be taught how to write essays. One strategy is to provide students with copies of essays from previous years’ classes, without any instructor comments. Have students rank the essays from best to worst, and ask the class to list which factors they think distinguish an A paper from a B, C, and so on. After that, explain your grading criteria and discuss them with the class. In that way, students are more likely to internalize these criteria and apply them to their own work.
Having students assess previous writing assignments also works well with a rubric designed through Primary Trait Analysis (PTA). With PTA, the teacher determines the criteria for each score within the rubric and describes them in a handout given with the assignment or included in the syllabus. Having students work with the rubric to assess another student’s work will help them understand the assignment and may aid them in their own work.
Other ideas for teaching students how to write essay exams include allowing students to practice writing cogent thesis statements in small groups, thus gaining insight and guidance from others, and allowing students to revise an essay, so they receive guidance and learn strategies for future writing assignments.
Another way to encourage deeper processing of information in in-class essays is to build in time for pre-writing and synthesis before the exam. Some ways to achieve this include providing students with a list of all potential essay questions before the day of the exam; requiring students to create and bring to the exam a crib sheet for each essay question, which they can use when answering; or assigning take-home essay exams. All these methods allow students time for deeper critical thinking and organization of their arguments.
When Ruth Ann Atchley began teaching a history of psychology course, she decided to use writing as the primary means of learning. She believed that making the course writing-centered would serve two purposes: It would be an active way for students to encounter the material, and it would give psychology students a chance to improve their writing skills.
The nature of the course’s subject matter requires students to process abstract ideas. Atchley focused on helping students learn to write concise answers to relatively broad questions. She found that if students didn’t understand concepts, their writing was vague, flowery or imprecise. Students who deeply understood concepts could write in a clear, comprehensive manner.
For more information about her work, see Atchley’s course portfolio in the KU Portfolio Gallery.
Grading Writing Assignments
When you’re grading a stack of papers, it’s easy to mark mistakes or note negative points and give a grade—nothing more. But a positive word or two might make a big difference to students. When you need to point out an error, telling students to “Clarify this” may be like telling them to “Be tall”; they might not know how to do what you ask. Consider how you can help students see why they might have made the error, to help them focus their thinking on areas where they need the most work.
Bean (2011) offers four recommendations for grading essay exams. First, don’t look at students’ names when you read the exams, or have students write an ID number (not a Social Security Number) on the test instead. This way, you’ll reduce grader bias. Second, grade the exam one question at a time, rather than reading the whole exam of each student. Applying the same standard to every answer to a question improves grading reliability.
The third recommendation Bean provides is to shuffle the exams after you complete each question so that you read them in a different order. Record scores in such a way that you don’t know what a student received on Question 1 when you grade Question 2. Finally, if time permits, you should skim a random sample of exams before you make initial decisions about grades. Your goal is to establish anchor papers that represent prototype A, B, and C grades. Then, when you come to a difficult essay, ask yourself, “Is this better or worse than my prototype B or C?”
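If exams are tracked electronically, the reading order Bean describes (one question at a time, with the stack reshuffled between questions and exams identified only by ID number) can be sketched in a few lines of Python. This is a minimal illustration, not part of Bean's text: the function name `grading_order`, the sample ID strings, and the `seed` parameter are all hypothetical.

```python
import random

def grading_order(exam_ids, num_questions, seed=None):
    """Return a list of (question, exam_id) pairs in reading order.

    All answers to Question 1 come first, then all answers to Question 2,
    and so on. The exams are reshuffled before each question so they are
    read in a different order each time. (Hypothetical helper.)
    """
    rng = random.Random(seed)      # seed is optional, only for repeatability
    order = list(exam_ids)         # copy so the caller's list is untouched
    schedule = []
    for question in range(1, num_questions + 1):
        rng.shuffle(order)         # new reading order for this question
        schedule.extend((question, exam_id) for exam_id in order)
    return schedule

# Usage: exams are identified by anonymous ID numbers, never by name.
for question, exam_id in grading_order(["id-07", "id-12", "id-31"], num_questions=2):
    print(f"Question {question}: read exam {exam_id}")
```

Recording each question's scores on a separate sheet or file keeps the score for Question 1 out of sight while you grade Question 2.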
Instead of using anchor papers to determine grades, you may find it beneficial to use a scoring rubric to grade essays and papers through Primary Trait Analysis (PTA). Developing a PTA scale requires four steps (below).
The advantage of using rubrics or PTA is that, rather than writing out extensive comments, you score the essay or assignment using the rubric, making this an efficient way of grading. Students can refer to the rubric when writing the assignment, as well as use their scored rubric to examine their work’s strengths and weaknesses. This method also increases inter-grader reliability when multiple individuals grade assignments. See Walvoord and Anderson’s Effective Grading (1998) for an in-depth discussion of PTA.
Four steps to creating a rubric
- Choose a test, assignment or group of assignments that you’ll evaluate. Clarify your objectives.
- Identify the criteria or traits that will count in this evaluation. These are usually words or phrases such as “thesis,” “use of color,” or “use of relevant examples.”
- For each trait, construct a two- to five-point scale. Each point relates to a descriptive statement; e.g. “A 5 thesis is clear and appropriate for the scope of the essay; it neither repeats sources nor states the obvious.”
- Try out the scale with a sample of student work and revise as needed. CTE also has samples of rubrics available.
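For instructors who keep scores in a spreadsheet or script, the four steps above can be sketched as a small data structure: each trait maps scale points to descriptive statements, and scoring an essay means recording one point per trait. This is only an illustrative sketch. The name `score_essay` is hypothetical, and every descriptor except the "5 thesis" statement quoted above is invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical rubric: trait -> {scale point: descriptive statement}.
# Only the level-5 thesis descriptor comes from the text; the rest are
# invented placeholders that an instructor would replace.
RUBRIC: Dict[str, Dict[int, str]] = {
    "thesis": {
        5: "Clear and appropriate for the scope of the essay; "
           "neither repeats sources nor states the obvious.",
        4: "Clear, but slightly too broad or too narrow.",
        3: "Present, but largely restates the sources.",
        2: "Vague or only implied.",
        1: "Missing.",
    },
    "use of relevant examples": {   # a three-point scale is also allowed
        3: "Examples are specific and directly support the thesis.",
        2: "Examples are present but only loosely connected.",
        1: "Few or no examples.",
    },
}

def score_essay(rubric: Dict[str, Dict[int, str]],
                scores: Dict[str, int]) -> Tuple[int, int, List[str]]:
    """Return (total, maximum, feedback lines) for one scored essay."""
    total = 0
    feedback = []
    for trait, scale in rubric.items():
        if trait not in scores:
            raise KeyError(f"No score recorded for trait: {trait}")
        point = scores[trait]
        if point not in scale:
            raise ValueError(f"{point} is not on the scale for {trait}")
        total += point
        feedback.append(f"{trait}: {point} - {scale[point]}")
    maximum = sum(max(scale) for scale in rubric.values())
    return total, maximum, feedback

# Usage: the feedback lines double as the scored rubric returned to the student.
total, maximum, feedback = score_essay(
    RUBRIC, {"thesis": 5, "use of relevant examples": 2})
print(f"Score: {total} of {maximum}")
for line in feedback:
    print(line)
```

Because every grader draws on the same descriptors, a structure like this also supports the consistency across graders mentioned above.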
Jorge Pérez’s course portfolio contains an excellent example of both a means for developing a rubric and ways to use it effectively. Kim Warren’s course portfolio also provides an excellent example of a rubric. You can find them both in the CTE Portfolio Gallery.