As a psychologist I recommend abandoning all tests based on Classical and Modern “test theories.” But I am not sure whether my fellow psychologists will agree. Many make a living from applying traditional tests. Even those who critically examine test usage do not question the validity of these tests or their use in principle. They not only have vested interests but also have not heard of alternatives to which they could switch. Critical scholars like Alan Schoenfeld, professor of mathematics education and former president of AERA, warn us against the use of psychometric methods, but all they suggest is a moratorium on tests. I think we can do better.
I am a retired German professor of psychology who specialized in experimental and psychometric methods, alongside my work on moral-democratic competence and its application in education. Already during my university studies I grew suspicious of Classical Test Theory and its modern variations (IRT, Rasch scaling), on which nearly all tests are based. The better I understood these “theories,” the more I discovered that they have nothing to do with scientific psychology. Prevailing test theories are a modern form of voodooism, with sacred rituals meant to make people believe that our sorting and evaluating of people is something rational and scientific. It is not.
Prevailing test theories fail an important standard of sound science: they cannot be falsified by data; they are immune to reality. If a test yields anomalies, its items are replaced until the data fit the statistical dogma of reliability, regardless of the damage this “item analysis” does to the overall validity of the test. Because test makers have no real understanding of what they measure, they cannot answer the basic question of validity: Does the test really measure what we intend to measure? Instead they invent all kinds of “validities” in order to save their assumptions.
No wonder these tests have all failed. They have little, if any, “prognostic validity.” Even much-criticized teacher grading is a better predictor of college success. Moreover, no support can be found for the allegation that their use improves teaching and learning. I have analyzed many studies of the effects of the high-stakes testing which began with the Head Start program in 1965, the year I was an exchange student in the US. I could not find any support for this allegation. Some small, short-term increases in test scores occurred, but they could be fully explained by growing test-wiseness and cheating. This is why tests have to be replaced by new versions at an ever faster rate.
That year was the first time I had to take a test as a school student. In Germany we had no multiple-choice tests in school until PISA started. I was surprised how easy it was to get an A. A 90-minute test took me just ten minutes to answer. I did not know many of the answers; I just made guesses. Only much later did I understand why my schoolmates worked harder but got lower test scores. It was BECAUSE they worked harder. For me tests were just fun, like crossword puzzles. I was not required to earn credits. For them the tests were high-stakes. The tests scared the hell out of them and confused them. Peter Sacks has shown how test anxiety, students’ background, and test scores are connected. Tests cannot compensate for student poverty, bad teacher education, and poor curriculum. On the contrary, they even seem to deepen these disadvantages.
But if tests are based on well-elaborated teaching goals and on sound psychology, and if they are used anonymously, they can be a great help for improving curriculum and teaching methods. If tests are used not for evaluating people (which I believe is a human rights issue) but for evaluating teaching methods and content, and for improving teacher education programs, they can be a real blessing. I have shown how a valid test can help to multiply the effect size of methods for teaching moral competence. Just search for the experimentally designed Moral Competence Test. Its construction principle, the Experimental Questionnaire, can easily be adapted to other fields of teaching.