I am subjected to stories and articles about testing on a nearly weekly basis. Given my work, this might not be particularly surprising. What is surprising is that I run across the subject in non-specialist media. It is not like I get into my car and tune in to the Journal of Applied Psychology radio station (from a public safety standpoint, I am glad such a thing does not exist; drowsy driving is dangerous driving). The point of all this is to say that testing, particularly of the standardized variety, is very much a part of public discourse. More to the point, nobody seems to have anything positive to say about it.
I am not a legislator, charged with developing public policy. I am not a standardized test publisher, risking unemployment if the industry changes. What I am is a person with many years of formal training in psychology, much of it in psychometrics specifically. So, I know a thing or twelve about testing, and after reading an article in the most recent Chronicle of Higher Education, I felt the need to clear the air about the topic. In all fairness, I should point out that the article in question mostly addresses how these tests are used, but it pulls readers in with an image of a standardized test score form (sneaky).
Nature of the Test
Psychologists and other professionals seeking to evaluate different aspects of being human find themselves at a disadvantage relative to other scientists. Where a chemist can directly observe the volume of a liquid, or a biologist can directly observe the number of organisms in a specific habitat, a psychologist is almost never able to directly measure the objects of her study. We cannot weigh happiness, nor can we count the units of extraversion in someone’s brain. Even ignoring the ethical problems involved in rooting around in someone’s skull, the physical essence of psychological phenomena still eludes us. We know it has something to do with neural connections and can even point to certain areas of the central nervous system that work harder in certain situations, but we have yet to isolate the physical components of a memory or an attitude. This inability to directly observe the phenomena of our primary interest can be a bit of a problem, but one that we train hard to overcome.
Much like not knowing about deoxyribonucleic acid (DNA) did not stop geneticists from making discoveries, psychologists continue to advance human understanding in spite of the very real limitations we face. We do this by operationalizing, which is a fancy way of saying we find a way to estimate. For example, while we can’t directly measure how angry someone might be, we can count the number of times that person yells, gauge how hard they punch, or even measure the amount of cortisol in their bloodstream. We then combine these direct observations to make inferences about the angry person’s emotional state. Even if I can’t say anything about the person’s amount of angry in some absolute sense, I can absolutely say something about that person’s anger levels compared to some other person’s. When you think about it, the whole process mirrors what humans do intuitively on a daily basis, only psychologists tend to be more systematic and objective (at least when conducting research).
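For readers who like to see the arithmetic, the idea of combining observations can be sketched in a few lines of Python. This is only a minimal illustration, not a validated instrument: the three indicators, their units, the people, and every number are invented for the example, and averaging standardized scores is just one simple way to build a composite.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def composite_scores(observations):
    """Average each person's standardized scores across all indicators."""
    names = list(observations)
    indicators = list(next(iter(observations.values())))
    # Standardize each indicator separately so different units become comparable.
    standardized = {
        ind: z_scores([observations[name][ind] for name in names])
        for ind in indicators
    }
    return {
        name: mean(standardized[ind][i] for ind in indicators)
        for i, name in enumerate(names)
    }


# Hypothetical observations, invented purely for illustration.
observations = {
    "person_a": {"yells_per_hour": 6, "punch_force_newtons": 900, "cortisol_ug_dl": 22},
    "person_b": {"yells_per_hour": 1, "punch_force_newtons": 400, "cortisol_ug_dl": 11},
    "person_c": {"yells_per_hour": 3, "punch_force_newtons": 650, "cortisol_ug_dl": 15},
}

scores = composite_scores(observations)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

The composite has no meaning in an absolute sense; a score of zero simply marks the average of this particular group. What it supports is exactly the relative statement made above: one person sits higher on the estimated trait than another.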
What does any of this have to do with standardized testing? Each of these instruments is designed to estimate something. In a broad sense, they attempt to estimate the amount of an individual’s knowledge about something. As with the previous examples, we are not able to measure knowledge in any direct way, but we can operationalize it and measure phenomena associated with it. In this case, we might give two people the same math test. One of these participants is a math expert and the other is a math novice. It is reasonable to expect that these two test takers will vary in their performance on this test. Maybe the expert will answer more items correctly, or perhaps the novice will quit trying after a certain number of questions. Either way, we can observe the difference in their response patterns and make inferences about their knowledge in that particular domain. This is the basic premise of standardized testing.
Why Use Standardized Testing?
Once we accept that the basic premise behind testing is a sound one (and it is accepted, isn’t it?), we have to understand why these tests have become so prevalent in the world of education. At least in higher education, I can think of a few reasons, but at a fundamental level it is an issue of scarcity. As the educational system currently exists, we cannot educate everyone who wants an education. Education takes time and money. It is a cost borne by both the individual seeking education and the society providing it. Since scarcity exists, we need some method of determining who gets an education and who does not. Standardized tests give universities a tool to identify individuals who are better prepared for the rigors they will face, so that resources may be better allocated towards them. Standardized test scores give a university an objective basis for selection decisions; every dollar we spend on a student who does not graduate is a dollar we did not spend on one who might have (see also opportunity cost).
One might reasonably argue that everyone should have the opportunity to get a higher education. I am the last person to disagree with such a notion. Unfortunately, universities do not make those decisions. As a society, we have decided that we cannot or will not spend the resources needed to provide a universal education at the post-secondary level. Whether or not we agree with our national and societal priorities is beside the point. Universities operate in the real system, not the ideal one.
In the interest of a balanced discussion on the use of standardized testing in the higher education context, I should point out that it is not a given that these tests predict college success. It is what my graduate school professors would call “an empirical question.” In other words, it is not something we necessarily know, but we can find the answer with some research. In my own experience, I have found that standardized tests do predict college success, but they aren’t particularly good at it. The specifics are rather too technical for the current piece, but they revolve around the difference between statistical and practical significance. Universities currently use the best instruments available, even if the best aren’t particularly good. I have heard arguments that a more holistic approach would be better, but I vehemently disagree. Holistic approaches have a number of serious disadvantages: they are subjective, they are resource-intensive, and quite frankly they do not lend themselves to empirical validation. In other words, we have no way of knowing whether or not they work. A holistic approach would have universities replace a system that we know works, if not particularly well, with one we can’t tell works at all.
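The gap between statistical and practical significance mentioned above can be sketched with a little arithmetic. The numbers below are hypothetical, chosen only to make the point, and are not findings from any actual study: assume a correlation of 0.10 between test scores and some measure of college success, observed in a sample of 10,000 students. The t statistic uses the standard formula for testing whether a Pearson correlation differs from zero.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def variance_explained(r):
    """Share of variance in the outcome accounted for by the predictor (r squared)."""
    return r ** 2


r = 0.10    # hypothetical correlation, assumed for illustration
n = 10_000  # hypothetical sample size, assumed for illustration

t = t_statistic(r, n)
share = variance_explained(r)
print(f"t = {t:.2f}, variance explained = {share:.1%}")
```

With these assumed inputs the t statistic comes out at roughly 10, far beyond the usual cutoff of about 1.96 for significance at the 5 percent level, yet the predictor accounts for only about 1 percent of the variance in the outcome. A relationship can be real, in the sense of being very unlikely to be a fluke, and still be weak.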
So where does all this leave us? Well, we know that standardized testing is a reliable and valid tool for differentiating between prospective university students. Further, we know that alternatives to these instruments are impractical, at least as they are likely to be implemented. Standardized test scores are verifiably better than random selection, if not by much, more egalitarian than subjective approaches, and much more practical to use. Personally, I find it useless to freak out about these tools when we could put that energy into either improving their utility or overhauling our educational system so that scarcity no longer plays a role.