12 June 2012

On "assessment for learning"; the petrifaction of a process.

I've been reading with interest a detailed blog on ESOL by Sam Shepherd, and in particular a point from a (fairly) recent post:
My personal experiment here was around learning outcomes, the value of “sharing” them I have always wondered about. And I have to be honest and say that for this lesson  I might as well have just whistled the learning outcomes at the beginning, for all it meant to the learners. [...] But the whole learning outcomes thing, really, not impressed. For this group it just didn’t make sense. [...] But I want to emphasise that it was just one lesson: and in the spirit of research and experimentation, I am going to persist and see how we get on.
This is an issue which crops up regularly in my own classes with people already teaching in post-compulsory education while taking a part-time teaching qualification. I usually come in for some stick from "students" who tell me they are obliged by college policy to announce the learning outcomes of every session at the start; one very upset man told me that the only reason Ofsted had not deemed his class "outstanding" was that he had been too nervous to recite them. My usual response is to say that the only decent reason I can see for doing it is that it's a bit of ritual which you can use to communicate, "this is where the class really starts"--but there are lots of ways of doing that. (I used to illustrate this by reciting at least one of the module outcomes in Anglican chant. I gave up on that when it became clear that it meant nothing to most students, and I don't have the voice for it... Pity.)

So what is my objection? It is certainly not about wanting to confuse the students (although sometimes that is legitimate and effective; surprise can be an effective teaching tool). I've no objection to outlining what we are going to be doing, or "looking at", in this session, or giving an idea of how much time we are likely to be spending on something; especially when it is a two-hour session, they like to know when the comfort-break is likely to come up.

Students don't understand them--why should they?

My first objection is precisely what Shepherd identifies. Learning Outcomes (LOs) are written in teacher-speak gobbledygook. Even my students frequently don't understand them--and at the beginning of the module we do actually study not only what the outcomes say but also how they are written. They were difficult enough in the days when my colleagues and I wrote them, but since they have been laid down centrally by the now-defunct and unlamented "Lifelong Learning UK" they are well-nigh impossible. Granted that these are at the module level rather than the session level, here are two examples from the 2007 LLUK specification:
Understand the application of theories and principles of learning and communication to inclusive practice.
Understand how to apply theories and principles of learning and communication in planning and enabling inclusive learning.
(Shepherd's outcomes are SMART; one thing anyone formulating LOs learns from the outset is that "understand" is not an acceptable verb for an LO. See here and here for my heretical take on that. But the guardians of the Gradgrindian fog at LLUK used it all the time.) It's not clear what these outcomes mean. It's not clear how you would show they have been met. And there are two of them, and it's not clear what the difference is between them! They add nothing. Time spent on introducing them could be spent more profitably on doing practically anything else.
Such as using an Advance Organiser. OK, the evidence for its effectiveness as a teaching device is not very strong, but it's a respectable tool to have in the box, and it takes very little effort to use. AOs (let's carry on with this silly game) refer to the students' experience, so they engage them and set them up for the session. LOs, on the other hand, distance them, and assure them that teaching is about to be done to them, so that they will emerge at the end of the session having been reliably processed through the sausage-machine.

Where's the evidence? Substitution of the sign for the signified.

My second objection is that there is no evidence that this is good practice. (This is a more convoluted and longer argument.)

A few months ago a correspondent got in touch about my Heterodoxy pages:
Very interested in your article re reflective practice, much of which can, in my opinion, be attributed to today's idea which cannot be questioned - 'Assessment for Learning'.

It seems to me that the impact of this seemingly effective form of practice has not born fruit [...] and that opportunities for practicing a particular skill or trying something more difficult have been replaced by pointless navel gazing. 

I'd like to see you take a swipe at AfL.
(Link above inserted). I'm not "taking a swipe" just because CJ requested it (but apologies to him/her for taking nine months to get into this), but because this kind of issue is a classic case of the ritual ossification of what he/she describes as "this seemingly effective form of practice"--concentrating on a relatively arbitrary sign rather than on the substance towards which it is supposed to point. And to explore it is also to explore the fate of many other educational innovations...

So I've done a lot of reading and talked to a lot of people--no, I didn't "do a literature review" and "interview respondents"--I live in the real world nowadays! And I'll spare you (most of) the references.

I am a fan of assessment for learning. The real McCoy, though, not the institutionalised tick-box version being peddled by Ofsted and their intimidated toadies. But how did we get the version we have now?

How did the principle of maximising feedback to learners, and thereby getting feedback from them, become this sterile ritual observance of reciting LOs, testing in every class, recapitulating the pre-determined learning points, and then writing it all up as if it constituted evidence of something?

There are several complementary perspectives on the process, and they all tend to push in the same direction...

The (proximate) origins of AfL


To cut through a lot of the origins of the idea, we can say that while it was enshrined in a Department of Education and Science report in 1988, it was championed by Black and Wiliam (1998), although they of course acknowledged that it was what good teachers had been doing for decades or even centuries--certainly at least long enough for them to amass 250 studies for a meta-analysis. They note:
"Typical effect sizes of the formative assessment experiments were between 0.4 and 0.7. These
effect sizes are larger than most of those found for educational interventions." (1998 p.3)
And Hattie's much larger meta-analysis, published 2010, confirms the figures (my take on it is here, including a note on what "effect-sizes" are).
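(For anyone who doesn't follow the link: an effect size in this context is just a standardised difference between group means. Black and Wiliam don't spell out the formula at that point, so the version below is the common Cohen's d formulation, offered as a sketch rather than necessarily the exact statistic behind their figures:)

$$
d \;=\; \frac{\bar{x}_{\text{intervention}} - \bar{x}_{\text{control}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} \;=\; \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
$$

On that reading, an effect size of 0.4 to 0.7 means the average student in the formative-assessment classes ends up roughly half a standard deviation ahead of the average student in the comparison classes.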

But were they talking about what is commonly understood today as "Assessment for Learning"? It's clear from re-reading Black and Wiliam that they envisage a long-term cultural shift in classrooms. They repeat several times that it will be a slow process. It will be characterised by introducing measures which enable teachers to get a clearer picture of where their students are, in relation to particular topics and skills within a subject area, drawing on, inter alia, student self-assessment. Open communication is the key.

But Black and Wiliam are too experienced to believe that teachers can re-adjust simply on the basis of principles--they need concrete practices to follow:
"Teachers will not take up ideas that sound attractive, no matter how extensive the research base, if the ideas are presented as general principles that leave the task of translating them into everyday practice entirely up to the teachers. Their classroom lives are too busy and too fragile for all but an outstanding few to undertake such work. What teachers need is a variety of living examples of implementation, as practiced by teachers with whom they can identify and from whom they can derive the confidence that they can do better. They need to see examples of what doing better means in practice." (1998 p.10)
That is entirely understandable. I don't want to over-simplify the convoluted processes of change, particularly in education, but I suspect that nevertheless one unintended consequence of such thinking, exemplified in professional development programmes at one end of the scale, and in Ofsted inspection frameworks at the other, is to concentrate on the outward and visible signs, rather than the more elusive and protean processes of cultural shift. (We also need to see this process in a political and historical context, of obsessional micro-management of public services under the Labour government...)

We are also now looking at a second or even third generation of teachers since the desirability of AfL became the conventional wisdom--and quite possibly because of the way these teachers have themselves been taught--they have simply accepted all these signs and rituals as given, with no realisation of the rationale or principles underlying them. Reflective practice (from ten years earlier) has similarly become a matter of going through the motions without knowing why.

The managerial argument has been: If we are looking for a more open flow of information about understanding and achievement within the classroom, how do we know that it is happening? We need it to be reported. We need ILPs (Individual Learning Plans) to be set up and recorded on forms, and in order to ensure that the information is hard enough, that needs to be based on testing.

But! Given the time it takes to set up and implement and record such tests, their inherent bias towards summative assessment, and Black and Wiliam's point above that "[teachers'] classroom lives are too busy and too fragile for all but an outstanding few to undertake such work"--it's not surprising that their point that:
For assessment to function formatively, the results have to be used to adjust teaching and learning; thus a significant aspect of any program will be the ways in which teachers make these adjustments.  (1998 p.4, emphasis added.)
has in many cases not been met. It's just too much effort to complete the circle, and as with any cyclical model, there is nothing more disappointing than trying again and again to start something which continually peters out. (Think of trying to start an engine with a pull-cord--or even a starting-handle--when it won't "catch".)

The social construction of learning and its institutions.


Nevertheless, there is a venerable descriptive cyclical model in play here--see Berger and Luckmann (1967). (It is consistent with Wenger (1998): the cycle can start at any point.)
Put very crudely, it represents how we put ideas out into the world (externalise them). We do that by talking, writing, making objects. Ideas become social "things" (reified)--institutions, laws, languages, art and so on--but of course they are changed in the process; and we in turn internalise them, which changes us and the next ideas we try to externalise. It is the reification of AfL which has lost the point (or if you like, the spirit) of the whole thing, and this cycle carries on independently of the will of the teachers. (There are fairly clear parallels with this model.)

What happens in the process of reification is that the idea, for want of a better word, has to accommodate to all the other reified ideas out there if it is to survive, and it is this adapted version which we (and the next generation) internalise. Thus, any social institution (and any approach to teaching is a social "institution" in a broad sense) gets knocked into shape by the dominant political, economic, technological and other powerful cultural factors of the day.

This is not merely an "academic" digression. What counts as "learning"--especially in the sense of "what is supposed to go on in education institutions"--is socially constructed. And so, therefore, is its assessment. At the level of universities, Stefan Collini writes (2012 ch.5) on the substitution of metrics and superficial proxies for valid evaluative tools, because the rhetoric of "a public good" is no longer recognised as a reason for doing anything unless it can also be justified in primarily economic or other instrumental terms. (He writes as an apologist for the humanities.)

And he notes that inspection and evaluation of processes of delivery have taken over from any engagement with disciplines and content themselves, because of the value-questions which are inevitably encountered through such an engagement.

So, much as Ofsted and other quality assurance institutions would wish us to believe that their work is value-neutral, and that an "outstanding" class is better than a merely "good" one on any terms, that is not the case. (I grant there are some practices clearly to be avoided in teaching, so there is likely to be a more substantial consensus on distinguishing between "outstanding" and "failing".)

In other words, what kind of learning is "assessment for learning" about? The choice of what kind of assessment to focus on is critical in determining what kind of learning is promoted. In assessing a piece of written work, what counts most? Is it spelling and grammar? Elegance of expression? Originality of ideas? Arguments based on evidence? Underlying research? All of them, of course, but emphasising some more than others depending on the age and stage of the learner, the subject matter of the essay, and so on. Choosing the "correct" criteria--choosing what to pay attention to and what to ignore--rests on those socially constructed values.

(The issues are of course generally much more straightforward in STEM* subjects.)

Accept only substitutes!


Frankly, all that is too hard to measure, so the system contents itself with these relatively trivial, perhaps harmless (other than their cost in time, effort and morale, and their distortion of priorities) proxies, which at one time might have pointed towards the quality of classroom communication but have long since lost the connection. And perhaps that is all one can do in an inspection of just a few days. But it's not surprising if institutions focus on compliance with the letter rather than the spirit of the approach. (I'm giving inspectors, principals and other managers the benefit of the doubt here; my suspicion is that they no longer realise that there is a problem.)

Just to round off, though, there is of course a quis custodiet question about these guardians of teaching standards: here's an interesting take on that, and more generally here is one on all those who presume to diagnose, predict and treat within ill-defined systems, exhibiting "deluded self-limiting prescriptivism". Thanks to David Stone for the link.

But see here for a post on an LSE blog which rather grudgingly concedes some validity to Ofsted's judgements.

Update 27 July: Dylan Wiliam himself is now pointing out how misunderstood AfL is, and how little implemented (TES, 13 July).



*Science, Technology, Engineering and Mathematics

Berger P L and Luckmann T (1967) The Social Construction of Reality. London: Penguin. (And Peter Berger is still blogging--largely about the social practice of religion--here.)

Collini S (2012) What Are Universities For? London: Penguin. (Review here)
