Data – 2

Since that last article (renamed Data – 1) was so long, I wrote a kind of summary of its points here, with a few more added in. In my own view, these things are true about gathering data:

1. It wastes time. In my district, two full weeks of instructional time are lost each year to district-level data gathering, not to mention all the time spent on testing within the normal curriculum. Studies have shown some startling numbers on this. One study I believe I read put the figure at up to a full third of the school year given over to testing.
2. It is expensive. School districts are, metaphorically speaking, building entire wings onto their buildings and filling them with data gatherers while teachers are desperately needed in school buildings.
3. It shames kids. Students who work hard all year in good faith to memorize verbs, etc. are often shocked at how little they know when the results come back. This causes them to question their ability to learn a language – when it is proven that they can, since they are already fluent in their first language – and many drop out after the first or second year. Besides causing students to think that they are stupid at learning languages (since it can't be the teacher's fault in their minds), this hurts classroom retention rates: teachers lose the potential students who would boost their enrollment numbers, which diminishes their job security and makes them teach in a kind of nervous fog.
4. It shames teachers. This seems not to concern anyone, but it concerns me. At the end of the year in some districts, teachers are given bar graphs showing where they rank, in spite of the flawed nature of the testing. Thinking that the results mean something, some teachers lick their wounds over the summer and suffer thinking about the coming year. This is inhuman. We should not hate our jobs. Many potentially great teachers are driven from the profession by the flawed data because they think they aren't any good at the work when they are. Angie Dodd and Greg Stout are perfect examples of teachers who did not let the system destroy them and instead rose far above it in just a few short years of extreme stress. There is no way a set of tests can reflect what a teacher really does in a classroom with their children, and yet we act as if such tests can in fact do that. They can't. Teaching is about so much more than testing. People say it, but they don't mean it, apparently.
5. The assessment instruments currently in use are seriously flawed and resemble instruments created decades ago. Things have changed in the classroom regarding standards: there is a visible push toward more and more communicative competence, even among teachers who don't understand comprehension-based instruction. Yet the testing hasn't really changed from twenty or thirty years ago. This is inexcusable.
6. No one can agree on what a good test is, since we are now in the middle of the biggest shift in foreign language instructional techniques in the history of education. How much each section should be weighted is another topic. In my view – and I disagree with Diana on this – writing and speaking should count as zero in the aggregate score. Writing is impossible to grade accurately – we have proven that in Denver Public Schools – and speaking is non-emergent to any meaningful degree in the first three or four years, and yet teachers continue to evaluate writing and speaking as if this were not true. The only things, if anything, that should be measured are listening and reading. But there are too many factors that skew those two as well. In listening we must ask: Who gives the test? How fast do they speak? What is the nature of the spoken text? Who writes the questions? In reading, which may be the only justifiable area in which to try to gather data, which text do we choose? What is the base vocabulary? Since language is a rich flow of words, how can we target certain words that we would expect a student to even know? Do we all have the same discussions in class? Given the nature of the art of conversation (see https://benslavic.com/blog/lart-de-la-conversation-and-tprs/ ), that is impossible.
7. Teachers who lean toward teaching to the conscious mind and memorization, even though language acquisition simply does not happen that way, still have the upper hand in test design. They do not understand that such results can therefore mean nothing. Since those old memorization-style teachers are still largely in power (this is changing fast), they produce a kind of schism between what they are testing for and the national standards – a schism which, once fully grasped by district organizers, will come to a crashing halt in favor of comprehensible input instruction, which in fact does align with all state and national standards.
8. It is done on large populations of students who are not motivated. Failure to recognize this glaring fact has always puzzled me. Would we gather data on the ability of a fighter jet to move through the air if it were held together with duct tape? Many of our students have duct-tape attitudes and just want the credit. How can the results of tests taken by them mean anything? Attendance is yet another complication in data gathering.
9. As stated, it does not take into consideration the fact that thousands of hours of input are needed before any meaningful or authentic output can occur. This calls into question results gathered before those thousands of hours have been completed – which they never are in secondary schools, because we run out of time. (If you are new to this group and this issue of the time required to gain competence in a language is new to you, you may want to read some of the Primers in the hard link bar above to get a handle on this very important point.) It is like trying to measure the growth of a seed below the ground before the flower appears, or measuring the growth of a fetus before it is born. It is excessive, invasive and unnatural.
10. In the case of the AP exam, results come back so late – in July – that the seniors who took them couldn't care less about their scores. They took the course only because they wanted the AP class on their transcript for college. When a large percentage of the scores indicate failure, those results are shoved aside for the new year, and the dismal cycle of smoke and mirrors presented to the public and to the largely complicit school district repeats itself the next year.
11. There will always be an unfair and undemocratic discrepancy between scores gathered in urban schools and those gathered in suburban schools. The scores do not in any way indicate that there is more intelligence in the suburbs, but this is the received idea. It is a big fake. Suburbs were designed to separate races and to require people to own a car (see Life Inc.: How the World Became a Corporation and How to Take It Back by Douglas Rushkoff), and now that process is reversing itself in the form of gentrification. Relieved of having to invest in taking care of the poor, suburban communities had large amounts of saved tax dollars to invest in whites-only education. Of course the scores are going to be higher in the suburbs, and those privileged kids get into the privileged colleges, and it perpetuates the separation of the classes and the destruction of our democratic values. When data is used to condemn urban education as being of poor quality, it supports the elitist views of the rich. This separation of the country into haves and have-nots is being done consciously.