Drew On Grading - Ben Slavic

This is from Drew in response to Grant and Nathan’s desire to know more about the grading system he helped develop based on “10 learning scales for foreign language to measure kids based on their ability to use language, not to conjugate.”
When I present the scales at training sessions, I always start by saying that we are bringing back the old check/check plus/check minus system. We are communicating to our kids what they need to do to get that check mark, that is, to meet standard. The scale is a 4-point scale, so to speak. A 3.0 means that the kid has met standard, a 2.0 is approaching standard, and a 4.0 means that the kid has exceeded the standard. It must be said that I am not teaching for a 4.0; I am teaching for a 3.0. Thus, I am not giving out 4.0s all the time. In fact, I can probably count on both hands the number of 4.0s I’ve given this semester. Yes, there is a 1.0, far below standard, and I give very few of those. There is not, however, a 0.0. Even when a student doesn’t turn in an assignment, it is our job as teachers to see what they know. Even if it’s just an informal conversation with the kid, we can see where the kid lies between 0.5 and 4.0.
My teaching has not changed very much since I started implementing the learning scales. What has changed is the type of work I collect from kids, how I assess their work, and how they interpret my evaluation of their work.
Let’s take the comprehension scale. The language of the scale is:
Score 3.0: Student describes the critical or essential elements of the text or audio source (e.g.: main idea, characters, plot). The student exhibits no major errors or omissions.
Score 2.0: There are no major errors or omissions regarding the simpler details and processes as the student:
recognizes or recalls the critical elements such as main idea, character, plot.
performs basic processes such as recalls or recognizes accurate statements about critical or essential elements of the text or audio source.
So a comprehension assignment would be exactly what we all probably do in a CI class. We would ask a story or personalize the vocab structures, and a student would be creating the quiz. The twist now is that the student is creating the 2.0 questions for the quiz. By definition the 2.0 questions are multiple choice, fill in the blank, or true/false. So if the students can answer 5 true/false questions correctly for that day of CI, they get a 2.0. To get a 3.0, the students need to be able to describe the story, plot, details, or whatever else was addressed in class that day. All they do is write a few sentences or a paragraph underneath their T/F questions. It’s super fast to grade.
As the students read a Blaine Ray reader, one of the things we might do as a class is come up with ten 2.0 questions and a couple of 3.0 questions for a chapter.
We are reading Mi propio auto in Spanish 2. Some 2.0 questions would be:
True or false: Ben is going to Costa Rica to help people build houses.
True or false: Ben only goes because he wants a car.
A 3.0 question would be: Describe Ben’s motive for going to El Salvador for the summer.
If I am doing my job as a CI teacher, then all of my students should be able to hit that 2.0 for comprehension. Now my grade book is set up in such a way that I can see which students are meeting standard in comprehension. I can see which students are now my target students based on who isn’t meeting standard. I know that in my 3rd period Spanish 2 class the average for comprehension is 2.9 (my class understands Spanish!) and the lowest score for comprehension is 2.39. So Lauren, with her 2.39, is able to answer all T/F questions about what she hears and is sometimes able to use her language to partially answer 3.0 questions.
Students keep track of their grades on a graph and can observe their learning trends. They have the same data that I have and they use it for goal setting and focus on their independent practice.
The interesting thing is that the 0 and the Super F don’t exist anymore. Regardless of how the kid did at the beginning of the semester, it is the very last assignment that carries the most weight. At the beginning of the semester, Lauren got a 1.5 on the story “Nicole no sabe aplaudir”, then a 2.0 on “Katie hace fila”, and a 2.5 on “Arin compra un ruso”. That is what makes her current comprehension score a 2.39. The average of the three is a 2.0, but the Power Law took her positive learning trend into account and projected her score to be 2.39. She sees herself getting better too as she tracks her progress, which is an ego boost in itself. Think about how a 0 would factor into an average grade. It doesn’t reflect what the kid knows or can do. It just pulls the kid so far down into that D or F range that he doesn’t want to work anymore.
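To make the trend idea concrete, here is a minimal sketch of one common way a power-law score can be computed: fit score = a * n^b to the scores in the order they were earned, then read off the fitted value at the latest assessment. The function name `power_law_score` is hypothetical, and this is an assumption about the general technique, not the formula my grade book actually uses. For Lauren's three scores this version comes out near 2.46, not the 2.39 my grade book reports, so the grade book clearly calculates it somewhat differently. The point is only that a rising trend lands above the plain average of 2.0.

```python
import math


def power_law_score(scores):
    """Fit score = a * n**b (n = 1, 2, 3, ... in chronological order)
    by least squares on the logs, and return the fitted value at the
    most recent assessment. Hypothetical helper, for illustration only."""
    if not scores:
        raise ValueError("need at least one score")
    if len(scores) == 1:
        return float(scores[0])
    # Scores must be positive; the scale described here has no 0.0.
    xs = [math.log(n) for n in range(1, len(scores) + 1)]
    ys = [math.log(s) for s in scores]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                  # exponent: how fast the scores are growing
    log_a = mean_y - b * mean_x    # log of the starting level
    return math.exp(log_a + b * xs[-1])


lauren = [1.5, 2.0, 2.5]  # the three comprehension scores, oldest first
average = sum(lauren) / len(lauren)
trend = power_law_score(lauren)
print(f"average: {average:.2f}, power-law estimate: {trend:.2f}")
```

A student whose scores stay flat gets the same number either way; the two only diverge when there is a trend up or down.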
The comprehension quizzes are still unannounced, and so are the vocabulary quizzes. Everything just happens.
When we do vocabulary quizzes, they have ten 2.0 questions (cómo se dice) and four 3.0 questions (for example: use numbers 2, 4, 5, and 8 in a sentence or paragraph). That score goes into written vocab. Any and all words from the semester are fair game.
The fluency scales are easy to use as well. At 2.0 a teacher can understand what you are saying. At 3.0 a sympathetic native speaker could understand what you are saying.
For the most part, the scales are not an arbitrary assigning of points. Students know exactly what they have to do to meet standard. How do they exceed standard? I don’t tell them that; they have to come up with it on their own. The language for a 4.0 on every scale reads: “In addition to Score 3.0, in-depth inference and applications that go beyond what was taught.” Some kids come up with some great vocab words and structures we haven’t learned in class. Some kids are using the two past tenses pretty accurately… It’s pretty cool. If I told them what a 4.0 is, the grade-grubbers would come out of the woodwork.
The power law grades from all 9 scales are averaged together to create the overall grade for the course. The average score for my 3rd period is 2.53. Now we have to be arbitrary again, because I have to give a letter grade, not “approaching standard with a 2.53.” I have been translating grades like this: A 3.0+-2.41; B 2.4-1.78; C 1.76-1.14; D 1.13-1.03; F 1.02-0. For semester grades, each range will be shifted up by 0.2 points. Notice how small the D range is.
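The translation above can be sketched as a small lookup. This is only an illustration: the names `CUTOFFS`, `overall_score`, and `letter_grade` are hypothetical, and I am assuming each range is defined by its lower bound, so a score that falls in a gap between the listed ranges (say 1.77) drops to the lower letter. The `shift` argument models the 0.2 bump for semester grades.

```python
# Lower bound of each letter range, taken from the ranges listed above.
# Assumption: anything below the D floor is an F.
CUTOFFS = [("A", 2.41), ("B", 1.78), ("C", 1.14), ("D", 1.03)]


def overall_score(scale_scores):
    """Average the per-scale power-law scores into one course score."""
    return sum(scale_scores.values()) / len(scale_scores)


def letter_grade(score, shift=0.0):
    """Translate a 4-point score into a letter.
    shift=0.2 moves every range up, as planned for semester grades."""
    for letter, floor in CUTOFFS:
        if score >= floor + shift:
            return letter
    return "F"


print(letter_grade(2.53))             # progress-report translation
print(letter_grade(2.53, shift=0.2))  # same score under the semester shift
```

With the cutoffs as listed, the 3rd period average of 2.53 translates to an A now and to a B once every range moves up by 0.2.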
Their grade printouts are really cool. They list each of the 9 scales with the assignments underneath to show the kids exactly how they have been performing. Now when we have those grade pow-wows and kids ask, “Why do I have a C?”, I can say, “Well, you’re not meeting standard here, here, and here, but look at you on comprehension. Let’s start focusing your outside study time on vocabulary…”
We created 12 scales as a district. I use 7 of the ones we created, and I created 2 additional ones for my CI class. I left out pronunciation, directed response, register, and some other ones that I thought were silly.
The 9 scales I am using are: oral vocabulary (3.0=wide range of appropriate vocabulary), oral grammar (3.0=high frequency of grammatical accuracy), oral fluency (3.0=a native speaker can understand you), written vocabulary (3.0=wide range of appropriate vocabulary and 3.0 vocab quizzes [next semester I am adding curriculum vocabulary to use for the vocabulary quizzes]), written grammar (3.0=high frequency of grammatical accuracy), written fluency (3.0=a native speaker can understand you), comprehension, completion (3.0=student addresses all parts of the prompt adequately), free write (0.5=0-50 words, 1.0=50-99 words, 2.0=100-149 words, 3.0=150-175 words, 4.0=175+ words), independent practice (3.0=student does required work).
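The free write scale is the most mechanical of the bunch, so it is easy to sketch as a word-count lookup. The function name `free_write_score` is hypothetical. The listed ranges overlap at 50 and at 175 words, so as an assumption I give the higher score at those boundaries.

```python
def free_write_score(word_count):
    """Map a free-write word count onto the scale listed above.
    Assumption: at the overlapping boundaries (50 and 175 words)
    the student gets the higher of the two scores."""
    if word_count < 0:
        raise ValueError("word count cannot be negative")
    if word_count >= 175:
        return 4.0
    if word_count >= 150:
        return 3.0
    if word_count >= 100:
        return 2.0
    if word_count >= 50:
        return 1.0
    return 0.5


for words in (30, 75, 120, 160, 200):
    print(words, "words ->", free_write_score(words))
```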
The one essential component is an electronic grade book that has a standards-based grading option with the power law feature built in.
I’m a weird one who tries new things. Keeps my professional life interesting. I tried TPRS thinking what the heck, it might work; if not, it’s just Spanish. I tried this scale business thinking that I could mess up grades. Well whatever, I’ll just give everyone As; it’s just Spanish.
Was it worth the change? Ask me in 5 weeks at semester. So far I am seeing better data, 0 Fs, most kids meeting or approaching the standard, and fewer inflated grades (As for kids who don’t deserve them). I’m not giving a kid a grade for breathing and keeping a seat warm in my class. Now they have to do something.