I worry a lot about the difficulties of administering standardized tests. That might sound a little strange, but assessment data are essential to my work with children, and it concerns me when tests are invalidated, usually by mistake. A standardized test compares a student to a "norm," or the average performance of similar students, generally in a national sample. Part of the process of producing such a test involves "norming," or administering it to a sample of children considered representative of the national population. During norming, the test is administered under specific conditions with very specific instructions to the students in the normative group, and the publishers expect users to later replicate those same conditions and use those same instructions. Otherwise, the results are invalid. It's as simple as that. If test administrators allow extra time, ask leading questions that are not in the test manual, give students advance preparation, permit more attempts than the manual allows, or in any other way give students advantages that the children in the normative sample did not have--or, conversely, make the test more difficult--they are invalidating the test results.
I often need to read other professionals' evaluation reports, and sometimes the test scores appear to be extreme overestimates of the student's ability. I can only guess at the reasons for this because there's no way to know what occurred during test administration. Still, it makes me wonder how carefully the tests were administered. Admittedly, this can be confusing because tests can have different administration and scoring rules even when they measure the same task. For example, some oral reading tests count all self-corrected errors as errors, whereas others suggest that we note these corrections but not count them in the scoring. Some tests have time limits per item administered and some do not. And so on. Yet while this is indeed confusing, it is also the evaluator's responsibility to understand and apply the rules appropriately. I frequently review the test manuals before giving some tests even though I've administered them dozens of times. I just consider it part of the job.
But there are other ways to invalidate a standardized test. Some evaluators' reports provide examples of items that students answered incorrectly. At first glance this might seem to make sense; after all, it can be part of an in-depth error analysis. The problem is that this practice can weaken the security and integrity of the test items. I sometimes describe the type of item with sample words that are not part of the actual tests. However, when real test items are shared, there is the possibility that they will become known by teachers or parents, or both, and ultimately by students, which invalidates the test. Parents or teachers may even see the missed items and teach them to students--which makes the test useless for re-evaluation at a future time. If the items are taught directly to a class, the test can be invalidated for all the students in that class. Now you may assume that in the course of a school year, some of these items would naturally be part of the curriculum anyway, and you are certainly correct. Tests are meant to sample an entire domain, such as word meanings, high-frequency words, or spelling. But inadvertently teaching some of the items is not the same thing as purposely teaching specific test items.
Ultimately, what's important here is to carefully guard standardized tests so they can remain useful indicators of student performance. While I believe that informal tests that have not been standardized are also useful, and I include them in my test battery, there's no substitute for good norm-referenced tests. We use standardized tests to compare students to similar students in the national sample; informal tests can flesh out that information to inform instruction. Both are necessary sources of data.
Dr. Andrea Winokur Kotula is an educational consultant for families, advocates, attorneys, schools, and hospitals. She has conducted hundreds of comprehensive educational evaluations for children, adolescents, and adults.