So what we are doing is testing an assessment instrument. That is most often done these days by gathering hard data. To test the validity of an assessment instrument, people usually isolate individual responses to a question after the test, count how many kids got it right, and then keep the question or toss it and replace it with a new question depending on all kinds of little numbers.
We actually do that with our district-wide writing assessment here in DPS. (The math is incredibly complex, and the money spent in our district alone on this somewhat questionable process is in the high millions of dollars every year.) A core group of TCI teachers has met with Diana each June for two weeks to do exactly this kind of minutiae work, and we (me anyway) are still not convinced that the instrument, because it is summative, has the kind of value that people think it does. Why?
Well, we are teaching to standards now. The Three Modes of Communication, and in particular the Interpersonal Skill of ACTFL, are the meat of the standard, and the Three Modes are all about observable behaviors and the negotiation of meaning, all of which happen in a formative way in the classroom every day. I can predict that jGR is going to be a rubric that we use here, but that few teachers outside of this group will ever use, because what is our little rubric when placed next to the mammoth summative data-gathering instruments that are in use now?
jGR is just different and so we will have to test it differently. We have to test it ourselves, since we created it, since nobody else understands it, and since we can’t evaluate it after the fact anyway. This formative kind of assessment of formative work is far superior in language classes to the gathering of summative data at the end of the year and – I am going to repeat this just to be obnoxious – is in far closer alignment to the national standards than any summative assessment could be, because language by its nature involves observable behaviors and the ability to negotiate meaning. I have always felt that field testing is the way to go in education anyway.
As I mentioned here a few days ago, the honeymoon period is over on jGR and, although those of us who are using it are blown away by its effectiveness in supporting the Classroom Rules and in flat out in-your-face making our kids accountable, we still need to go through a period, which is naturally starting about now, here at the end of September for most of us, in which we test this bad boy and find the warts. I use the term warts because it is so far from sounding like an official data kind of word that it makes me happy.
Now, the first wart to appear was when Chris said that he wants to see how the numbers are set up before running his own tests of it. Typically cautious Chris. Just kidding – he is the Wild Man of Ohio (see how names emerge in groups organically?). But he is right in that we don’t really yet know what the best percentage weight for jGR should be in relation to other assessments, mainly the quick quizzes for most of us.
The kids have snapped to, as Brigitte said, for most of us. That is a stunning victory and a first in my own experience. Without this instrument, I myself would have eventually snapped, because many of us know that, in spite of all the great stuff, the big bags of gold, that we have discovered in comprehension-based instruction, we have always needed something like this with fangs to make the kids show up for our instruction in a completely new way.
But what do we do if what we call “attentive” (the jGR “2” which puts the kid in the C/D range) in a kid is her total effort? What if all she can give us at this time in her academic career deserves, if for mere effort, to be called “responsive” (the jGR “3” which critically places the kid in the B/C range)? Considering that, and based on my own classroom experience over the past five weeks, I am already thinking that jGR at 50% is too severe.
I know that we have given ourselves license to evaluate the kid mainly in terms of those two critical key indicator words “attentive” (2) or “responsive” (3, 4), with the 2/3 split being the tipping point for the kid’s grade. But, as stated in the above paragraph, when a kid has all her life been taught by teachers that merely staring in class and being attentive is enough for the A, and then here we are demanding a complete change in that behavior, isn’t that a bit harsh? Problem, class? Big problem? Big problem or little problem? Problem for us or for everybody? Problem for everybody? Professeur 2? Problem for everybody. I think so.
And here we are moving towards posting the crucial first batch of grades and, since grades have always been presented to kids as the only thing that counts, all of the F’s we have now are going to have to be explained. Yes, many of our kids ARE working at the level of a bona fide 2 on the rubric, but, back to Chris’ concern, this is interpreted by the computer as a 40% in a category that counts for half their grade (do the math for what that does to the grade), and those kids’ grades are going to involve a ton of explaining and phone calls in which parents say how hard the child is trying but that the child can’t succeed in our class. I’m expressing it clumsily, because this is all so new, but you get the idea.
So what do we do? I have two solutions:
1. I am going to experiment with lowering jGR down to less than 50% (say 30%) and see what happens. Or find the right weight – one not so heavy that I end up with half of my kids flunking but one that (so importantly!) keeps them snapped to as they have been. I ain’t ever gonna lose that snap-to focus again, not after all these years without it.
2. I am going to give a pop translation quiz in the middle of class very frequently. I will put it in the gradebook next to the jGR grade in that category and not as a quiz grade. The kids will have to, with no notice, write a five-minute mid-class translation of the CI so far – whether it is in the form of PQA or a story doesn’t matter. This will help me more accurately assess what I see. I like it! Data in support of observable behavior in class, and not “all hail the summative data gods!”
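For anyone who wants to "do the math" on the weight question, here is a rough sketch, not anybody's actual gradebook setup. It assumes a gradebook with only two categories (jGR and quick quizzes) combined as a plain weighted average, and it takes the jGR "2" as the 40% the computer reports. The quiz averages and the function name `overall_grade` are made up for illustration.

```python
def overall_grade(jgr_percent, quiz_percent, jgr_weight):
    """Weighted average of two gradebook categories, as a percentage.

    Assumption: jGR and quizzes are the only two categories, so the
    quiz category gets whatever weight jGR does not use.
    """
    return jgr_weight * jgr_percent + (1 - jgr_weight) * quiz_percent


JGR_TWO = 40.0  # a jGR "2" as the gradebook reads it (40%)

# Hypothetical quiz averages, chosen only to show the effect of the weight.
for quiz in (70.0, 85.0, 100.0):
    at_50 = overall_grade(JGR_TWO, quiz, 0.50)
    at_30 = overall_grade(JGR_TWO, quiz, 0.30)
    print(f"quiz avg {quiz:5.1f}: jGR at 50% -> {at_50:5.1f}, jGR at 30% -> {at_30:5.1f}")
```

Under these assumptions, a kid sitting at a 2 with perfect quizzes tops out at 70% when jGR is half the grade and at 82% when jGR is 30%. A kid with a 70% quiz average lands at 55% versus 61%. Whether 30% still keeps the kids snapped to is exactly what the experiment has to show.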
The grade distribution wart may go away if those of us who are testing jGR right now keep putting our heads together here every day. We can fix it and get the warts to go away together.
