14 August 2013

On too-precise assessment

A correspondent (author of this very useful site) got in touch today to let me know about yet another link down. Maintaining links is the single biggest hidden chore when you manage a web-site, or three. (But approaches from essay-mills and cheat sites to advertise run a close second. Yet another today who did not seem to realise that it would be a conflict of interest to accept advertising for a "service" precisely about subverting everything the sites stand for...)

Fixing the link took me back to this page, about Bruner's (and Dale's) "cone of knowledge". And that reminded me of some thoughts while revising pages on the principles of assessment a week or two ago. I discussed "assessment drift" (or better, "assessment creep"!) and of course arrived at the trite conclusion that--particularly in vocational areas--assessment needs to be as close as possible to real-world practice. Big deal! But...

Nate Silver (2013) discusses the issue of "over-fitting" accounts of events. He calls it "The most important scientific problem you've never heard of" (p.163). Kahneman touches on it, too, of course (2011, ch.20). The problem comes down to the construction of over-precise models which fit particular situations beautifully, but to the extent that they cannot be generalised. They describe all the sufficient conditions for this particular occurrence, without identifying which are necessary. (I am reminded of Lamb's "Dissertation upon Roast Pig".)
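For anyone who likes to see the point in miniature, here is a rough sketch in Python. It is my own toy illustration, not anything taken from Silver or Kahneman: the data are invented, and the names (`train`, `new_case`, `fit_line`, `fit_exact`, `worst_error`) are hypothetical. It assumes a "true" rule of y = x, observed with a small incidental wobble, and compares a deliberately simple model with one that accounts for every observed case exactly.

```python
# Invented data (an assumption for illustration): the "true" rule is y = x,
# and each observed case carries an incidental wobble of +/- 0.5.
train = [(0, 0.5), (1, 0.5), (2, 2.5), (3, 2.5), (4, 4.5)]
new_case = (5, 5.0)  # an unseen case that follows the true rule exactly


def fit_line(points):
    """Ordinary least-squares straight line: the deliberately simple model."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_exact(points):
    """Lagrange polynomial through every point: the over-precise model."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def worst_error(model, points):
    """Largest absolute gap between the model's prediction and the observation."""
    return max(abs(model(x) - y) for x, y in points)


simple = fit_line(train)
overfitted = fit_exact(train)

print("worst error on the observed cases:",
      worst_error(simple, train), worst_error(overfitted, train))
print("error on the new case:",
      worst_error(simple, [new_case]), worst_error(overfitted, [new_case]))
```

The exact-fit model reproduces every observed case, wobble and all, and then misses the new case badly; the straight line is slightly wrong on every observed case and close on the new one.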

The simple point, of course, is that it is impossible to generate a sound assessment scheme from such an over-specified account. Any scheme built on one is bound to generate far too many Type II errors (people who fail when they should not, because of irrelevant assessment requirements). And of course, as far as the assessees are concerned, they may spend a disproportionate amount of time and effort on meeting merely contingent requirements.

And that is the danger lurking behind the concern to assess as closely as possible to practice. In Bruner's terms, this is about working at the experiential or enactive level. When NVQ trade and skill-based qualifications are assessed, the default method is direct observation of real-life practice. That is fine, as long as the circumstances of the practice correspond exactly to the requirements. Say that I am being assessed on my ability to weigh ingredients, for a catering task, for example. I may be trained and assessed using an electronic, digital scale which can be reset for each additional ingredient added; that does not mean that I can do the job with older and less sophisticated equipment such as a balance with weights. In practice, the assessment will be complemented by verbal questioning about what to do under different circumstances--the interesting thing about that is that it is moving up the Bruner cone to the iconic level.

And that is about vaguer and softer skills where other considerations come in, such as verbal competence and fluency (as in the case of a person who normally speaks another language).

In the case of vocational teaching such as I discussed here and here, there is ever greater pressure to push up "achievement" levels, and thus to teach to the test, thereby ironically moving further away from the realities of practice.

The issue is not confined to FE; it applies in higher education, too, as I discussed here a couple of years ago. It is actually worth quoting part of that posting:
Graham Gibbs' short but magisterial report on educational achievement in HE appeared in August (2010) and I blogged about his presentation based on it here. Among his observations (p.24) is:
"High levels of detail in course specifications, of learning outcomes and assessment criteria, in response in part to QAA codes of practice, allow students to identify what they ought to pay attention to, but also what they can safely ignore. A recent study has found that in such courses students may narrow their focus to attention to the specified assessed components at the expense of everything else (Gibbs and Dunbar-Goddet, 2007). Students have become highly strategic in their use of time and a diary study has found students to progressively abandon studying anything that is not assessed as they work their way through three years of their degree (Innis and Shaw, 1997)."
It is time to consider properly how to re-instate the broader, even vaguer, elements of assessment; otherwise we may be imposing a self-limiting cap on learning.

Gibbs G (2010) Dimensions of Quality York: Higher Education Academy [On-line] available: http://www.heacademy.ac.uk/assets/York/documents/ourwork/evidence_informed_practice/Dimensions_of_Quality.pdf Accessed: 19 November 2010

Gibbs G and Dunbar-Goddet H (2007) The effects of programme assessment environments on student
York: Higher Education Academy

Innis K and Shaw M (1997) "How do students spend their time?" Quality Assurance in Education 5 (2), pp. 85–89.

Kahneman D (2011) Thinking, Fast and Slow London: Penguin Books

Silver N (2013) The Signal and the Noise: the art and science of prediction London: Penguin Books
