
Thursday, April 18, 2013

Economics research is better than education research, after all!

It's been interesting to follow the responses to the revelation that Reinhart and Rogoff's influential 2010 paper about debt and growth was fundamentally flawed (their paper implied, and their own public advocacy made more explicit, that once a country's debt/GDP ratio got above 90%, economic growth fell off sharply, but it turns out that there is no such falling off).

I've already written about the first thing that interested me, the similarity between the shoddiness of economics research and the shoddiness of education research.  Neither field is particularly scientific, and both are subject to what looks to me like heavy cultural and political bias or influence. One problem is that it's difficult in both fields, and virtually impossible in macroeconomics, to do controlled experiments.  A related problem is that it's often hard to tell what's cause and what's effect. And both education and economics are central to culture and politics, so both are more subject to cultural and political pressures than, say, chemistry.

Another remarkable phenomenon has been the chutzpah of Reinhart and Rogoff. In their response yesterday, they wrote that they didn't believe that their mistakes affected in any significant way the "central message" of the 2010 paper or their subsequent work. That is an eye-poppingly nervy assertion.  You would never know from their response that the supposed 90% tipping point had become central to the public debate, nor that they themselves, in testimony to Congress and in prominent opinion articles, like this one, had claimed that the 90% line was "an important marker," with "little association" between debt and growth below 90%. Nor, from their response, would you know that they had continually, in their public statements, implied that it was the debt that was causing the slow growth, rather than the other way around.

The other interesting thing about the R&R debate this week has been the amount of attention, discussion, and thoughtful analysis their work has prompted.  And it is here that the economics research community has shown itself to be vastly superior to the education research community.  Thoughtful and fundamental questions had been raised about the Reinhart/Rogoff paper from the very beginning, with people like Dean Baker and Paul Krugman and many others suggesting that the 90% cut-off was bogus, that the causality was likely reversed (with slow growth causing the debt, rather than vice versa), and that R&R should release their data set so that other researchers could analyze it.  Then this week, once the data was finally made available to the public, other scholars did immediately start to analyze it. One of them found that the causality did indeed seem to run the other way, since high debt was correlated more strongly with low growth in prior years than with low growth in succeeding years; others noted that the negative relationship between debt and growth was more significant at the low levels of debt (<30%) that R&R had claimed showed "little association" than at the >90% levels that R&R had been telling everyone were so dangerous.

The level of debate, just over the past couple of days, has been impressive, and puts education debates to shame.  For instance, when I looked into vocabulary research last summer, what I found was virtually no scientific debate at all, and an apparently general innumeracy that bore out Paul Krugman's contention that advanced mathematics is usually not what matters; what you need instead is a basic comfort with numbers and a sense of how they relate to the real world.  The same was true when I looked closely at the most prominent statistical study of education research, John Hattie's meta-analysis of education studies; there were significant problems with his analysis that seemed to have been publicly noted, in the years since publication, only by a guy in Norway and by me, a high school English teacher.

This is not to say that education research is never thoughtfully debated.  When the Chetty, Friedman and Rockoff paper about the long-term effects of "high-VA" teachers came out, it was very carefully responded to by Bruce Baker, among others.  Overall, however, education research strikes me as basically a backwater, and especially those areas of research that have to do with actual pedagogy.  Perhaps this is partly because pedagogy, despite its central importance, seems more obscure to non-teachers than less important but larger-scale issues like school funding and class sizes.  These large-scale issues seem more like economics problems, and so tend to be studied, not by Ed. School professors, but by economists.  Some of the best work on class sizes was done by Alan Krueger, the chairman of Obama's Council of Economic Advisers, and the study on the long-term effects of "high-VA" teachers was done by Raj Chetty, who just won the John Bates Clark Medal.

So, how can education research get better?  I'm not sure.  I guess I hope people become more aware that it's lousy and that when scholarship is brought into public debate it is almost inevitably turned into propaganda, even by the scholars themselves. In particular I hope it becomes clear that some of the great received ideas of the ed. world are actually urban legends (vocabulary increases comprehension; schools can overcome poverty; etc.)--but that's another post.

Thursday, December 20, 2012

Can we trust educational research? ("Visible Learning": Problems with the evidence)

I've been reading several books about education, trying to figure out what education research can tell me about how to teach high school English. I was initially impressed by the thoroughness and thoughtfulness of John Hattie's book, Visible Learning, and I can understand why the view of Hattie and others has been so influential in recent years.  That said, I'm not ready to say, as Hattie does, that we must make all learning visible, and in particular that "practice at reading" is "minimally" associated with reading gains.  I discussed a couple of conceptual issues I have with Hattie's take in an earlier post--I worry that Visible Learning might be too short-term, too simplistic, and less well-suited to English than to other disciplines.  Those arguments, however, are not aimed at Hattie's apparent strength, which is the sweep and heft of his empirical data.  Today, then, I want to address a couple of the statistical weaknesses in Hattie's work.  These weaknesses, and the fact that they seem to have been largely unnoticed by the many educational researchers around the world who have read Hattie's book, only strengthen my doubts about the trustworthiness of educational research.  I agree with Hattie that education is an unscientific field, perhaps analogous to what medicine was like a hundred and fifty years ago, but while Hattie blames this on teachers, whom he characterizes as "the devil in this story" because we ignore the great scientific work of people like him, I would ask him to look in the mirror first.  Visible Learning is just not good science.

Hattie's data
Visible Learning attempts to be both encyclopedia and synthesis. The book categorizes and describes over 800 meta-analyses of educational research (altogether, those 800 meta-analyses included over 50,000 separate studies), and it puts the results of those meta-analyses onto a single scale, so that we can compare the effectiveness of very different approaches.  After categorizing the meta-analyses into groups like "Vocabulary Programs", "Exposure to Reading", "Outdoor Programs", or "Use of Calculators", Hattie then determines the average effect that the constituent meta-analyses show for that educational approach.  By these measures, exposure to reading seems to make more of a difference than the use of calculators, but less of a difference than outdoor programs, and much less of a difference than vocabulary programs. (There are some odd results: "Direct Instruction," according to Hattie's rank-ordering, makes more of a difference than "Socioeconomic Status.")
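To make the book's basic move concrete, here is a minimal sketch in Python of how such a synthesis works: pool each meta-analysis's average effect size under a category label, then average within each category and rank the categories on one scale. (The effect sizes below are illustrative stand-ins, not Hattie's exact figures.)

# A rough sketch of the meta-meta-analytic move, with illustrative effect sizes.
# Each tuple is (category, average effect size d reported by one meta-analysis).
from collections import defaultdict
from statistics import mean

meta_analyses = [
    ("Vocabulary Programs", 0.69), ("Vocabulary Programs", 0.64),
    ("Outdoor Programs",    0.52),
    ("Exposure to Reading", 0.40), ("Exposure to Reading", 0.32),
    ("Use of Calculators",  0.27),
]

by_category = defaultdict(list)
for category, d in meta_analyses:
    by_category[category].append(d)

# Average within each category, then rank the categories against one another.
for category, ds in sorted(by_category.items(), key=lambda kv: -mean(kv[1])):
    print(f"{category:20s} average d = {mean(ds):.2f}")

Averaging away the differences between the underlying studies is exactly the step that makes the single scale possible--and everything that follows depends on whether those averaged numbers are really measuring the same thing.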

Like other teaching gurus and meta-meta-analyzers (for instance, Robert Marzano, whose 2000 monograph, A New Era of School Reform, makes the case very explicitly), Hattie believes that good teaching can be codified and taught (that sounds partly true to me), that good teaching involves having very clear and specific learning objectives (I'm somewhat doubtful about that), and that good teaching can overcome, at the school level, the effects of poverty and inequality (I don't believe that).  Hattie uses a fair amount of data to back up his argument, but the data and his use of it are somewhat problematic.

First, questions about the statistical competence of Hattie in particular
I am not sure whether we can trust education research, and I am not alone.  John Hattie seems to be a leading figure in the field, and while he seems to be a decent fellow, and while most of his recommendations seem somewhat reasonable, his magnum opus, Visible Learning, has such significant problems that the one friend of mine who is a professional statistician concluded, after reading my copy of the book, that Hattie is incompetent.

The most blatant errors in Hattie's book have to do with something called "CLE" (Common Language Effect size), which is the probability that a random kid in a "treatment group" will outperform a random kid in a control group.  The CLEs in Hattie's book are wrong pretty much throughout.  He seems to have written a computer program to calculate them, and the program was poorly written.  That might be understandable (all programming has bugs), and it would not by itself mean that Hattie was statistically incompetent, except that the CLEs he cites are dramatically wrong and he never noticed.  For instance, the CLE for homework, which Hattie uses prominently (page 9) as an example to explain what CLE means, is given as .21.  That would imply that a student who did not have homework was much more likely to do well than a student who did.  This is ridiculous, and Hattie should have noticed it.  Even more egregious, Hattie sometimes reports CLEs that are less than 0.  He has defined the CLE as a probability, and a probability cannot be less than 0; there cannot be a less-than-zero chance of something happening (except perhaps in the language of hyperbolic seventh graders).

As my statistician friend wrote me in an email, "People who think probabilities can be negative shouldn't write books about statistics."
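For what it's worth, the standard CLE calculation is easy to check. Here is a minimal sketch in Python using the conventional formula (McGraw and Wong's common language effect size, which assumes normally distributed scores with equal variances); this is my own check, not a reconstruction of Hattie's program.

# Standard common language effect size: CLE = Phi(d / sqrt(2)),
# the probability that a random treatment-group student outscores
# a random control-group student, given an effect size d (Cohen's d).
from math import erf

def common_language_effect_size(d):
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), so Phi(d / sqrt(2)) simplifies to:
    return 0.5 * (1 + erf(d / 2))

print(common_language_effect_size(0.29))  # roughly 0.58 for homework's reported d of .29 -- not .21
print(common_language_effect_size(0.0))   # 0.50 when d = 0: a coin flip
# Being a probability, the result can never fall below 0 (or above 1).

By this formula, homework's effect size of .29 corresponds to a CLE of about 58%, and nothing in the calculation can ever produce a negative probability.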

Second, doubts about the trustworthiness of educational researchers in general
My statistician friend is not the first to have noticed the probabilities of less than zero. A year and a half ago a Norwegian researcher wrote an article called "Can We Trust the Use of Statistics in Educational Research?" in which he raised questions about Hattie's statistical competence, and in follow-up correspondence with Hattie the Norwegian was not reassured.  (Hattie seems, understandably, not to want to admit that his errors were anything more than minor technical details.  In an exchange of comments on an earlier post on this blog, as well, Hattie seems to ignore the CLE/negative probability problem.)

For me, the really interesting thing about Hattie's exchange with the Norwegians was that he seemed genuinely surprised, two years after his book had come out, by the fact that his calculations of CLE were wrong.  In his correspondence with the Norwegians, Hattie wrote, "Thanks for Arne Kåre Topphol for noting this error and it will be corrected in any update of Visible Learning."  This seems to imply that Hattie hadn't realized that there was any error in his calculations of CLE until it was pointed out by the Norwegians--which means, if I'm right, that no one in the world of education research noticed the CLE errors between 2009 and 2011.

If it is true that the most prominent book on education to use statistical analysis (when I google "book meta-analysis education", Hattie's book accounts for the first three results) was in print for two years, and not a single education researcher looked at it closely enough, and had enough basic statistical sense, to notice that a prominent example on page 9 of the book didn't make sense, or that the book was apparently proposing negative probabilities, then education research is in a sorry state.  Hattie suggests that the "devil" in education is the "average" teacher, who has "no idea of the damage he or she is doing," and he approvingly quotes someone who calls teaching "an immature profession, one that lacks a solid scientific base and has less respect for evidence than for opinions and ideology" (258).  He essentially blames teachers for the fact that teaching is not more evidence-based, implying that if we hidebound practitioners would only do what data gurus like him suggest, then schools could educate all students to a very high standard.  There is no doubt that there is room for improvement in the practice of many teachers, as there is in the practice of just about everyone, but it is pretty galling to get preachy advice about science from a man and a field that can't get their own house in order.

Another potential problem with Hattie's data
Aside from the CLE issue, I am troubled by the way Hattie presents his data.  He uses a "barometer" that is supposed to show the effectiveness of whatever curricular program or pedagogical practice he is considering.  This is the central graphic tool in Hattie's book, the gauge by which he measures every curricular program, pedagogical practice and administrative shift:

[Image: Hattie's "barometer" graphic, which marks developmental effects and teacher effects above zero]

Note that developmental and teacher effects are both above zero. What this implies is that the effect size represented by the arrow is not the effect as compared to a control group of students that got traditional schooling, nor even the effect size as compared to students who got no schooling but simply grew their brains over the course of the study, but the effect size as compared to the same students before the study began.

This would imply that offering homework, with a reported effect size of .29, is actually worse than having students just do normal school, or that multi-grade classes, with an effect size of .04, make kids learn nothing.

Now, that is obviously not what Hattie means.  The truth is that Hattie sometimes uses "effect size" to mean "as compared to a control group" and other times uses it to mean "as compared to the same students before the study started." He seems comfortable with this ambiguity, but I am not.  The "barometer" is very confusing in cases like homework and multi-grade classrooms, where the graphic seems clearly to imply that those practices are less effective than just doing the regular thing (especially confusing in the case of homework, which is the regular thing); worse, the ambiguity makes me very, very skeptical of the way Hattie compares these different effect sizes.  The comparison of these "effect sizes" is absolutely central to the book.  Comparing effect sizes (and he rank-orders them in an appendix) is just not acceptable if the effects are being measured against dramatically different comparison groups.

Hattie, in a comment on an earlier post in which I expressed annoyance at this confusion, suggested that we should think of effect sizes as "yardsticks"--but in the same comment he says that effect size is the effect as compared to two different things.  In his words: "An effect size of 0 means that the experimental group didn't learn more than the control group and that neither group learned anything."  Now, I am an English teacher, so I know that words can mean different things in different contexts.  But that is exactly what a yardstick is not supposed to do!
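To see concretely why this matters, here is a toy example in Python (the scores are invented, not drawn from any study): the very same end-of-year results yield a completely different "effect size" depending on whether the comparison is a control group or the same students' own pre-test.

# Toy illustration (invented scores): one intervention, two yardsticks.
from math import sqrt
from statistics import mean, stdev

pre     = [50, 52, 48, 51, 49, 50]   # treatment group, start of year
post    = [52, 54, 50, 53, 51, 52]   # treatment group, end of year
control = [51, 54, 50, 53, 52, 52]   # control group, end of year (normal schooling)

def cohens_d(a, b):
    # Standardized mean difference between two score lists, using a pooled SD.
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

print(cohens_d(post, pre))      # about 1.4: huge, measured against the students' own pre-test
print(cohens_d(post, control))  # 0.0: nothing, measured against a control group

A reader of the rank-ordered appendix has no way of knowing which of these two yardsticks produced any given number, which is exactly the problem.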

Of course, it is possible that many of Hattie's conclusions are correct.  Some of them (like the idea that if you explicitly teach something and have kids practice it under your close observation, then they will get better at it more quickly than if you just ask them to try it out for themselves) are pretty obvious.  But it is very hard to have much confidence in the book as a whole as a "solid scientific base" when it contains so much slipperiness, confusion and error.

Beyond these broad issues with Hattie's work, I also have some deep qualms about the way he handles reading in particular.  Maybe one day I'll address those in another post.

Monday, December 10, 2012

"Visible Learning," School Reform, and In-school Reading

(I wrote this really quickly; I apologize for typos.  I wanted to get it written while it was still fresh in my mind, but I have too much schoolwork to take a long time and write it well.)

In-school reading makes parents uncomfortable
At my son's middle school, and indeed throughout the district in the city my family lives in (slightly less leafstrewn than Leafstrewn, but graced with the Perfect University and the Institute of Machine Perfection), English teachers are having their students do a lot of in-class reading.  On Parents' Night, I told my son's teacher I was thrilled about this.  Last week, at a pot luck dinner for my daughter's class, a mom who also has a seventh grader told me she had wondered about what I said.  "You said, 'I think in-class reading is a great idea,'" she said. "I couldn't tell if you were serious,  or being sarcastic."

It turns out that she, and many other parents, see reading in class as a strange idea.  That same week, on a parent email forum, a couple of parents raised the issue: why, they asked, does the school have kids doing silent reading in class, an activity they could easily do at home?

I wrote a response in which I made the obvious arguments: that reading is (as even such a champion of whole-class direct instruction as Doug Lemov argues) always a productive use of class time; that it is, in fact (as Lemov says), the (very high) baseline against which any other use of class time should be measured; that many kids don't read at home and indeed hardly read at all; that in-class reading allows the teacher to do more one-on-one instruction, which is the best kind; and so on.

Of the parents who responded, most backed me up.  Some parents, however, were still not convinced. Most of these parents were willing to admit that reading in class might be good for other people's children--the implication I think was that good parents would have kids who read--but not for their own children, because their own children didn't need it.  At the pot luck dinner I talked to a couple of parents whose kids read a lot at home; they were confused as to why their kids would spend time in school reading when they could be doing something else.

What the Research Says
I spent several hours over the weekend immersed in John Hattie's magisterial 2009 overview, Visible Learning.  I've also spent some time looking back at Marzano's interesting 2000 monograph, A New Era of School Reform, which, while less impressively encyclopedic, is very much along the lines of Hattie's approach. Hattie and Marzano stress very clear and well-defined "learning objectives," very clear teacher control over where the lesson is going, and lots of "feedback" going both ways between teacher and student.  Most education reformers over the past twenty years or so have justified their efforts by pointing to research like that which Hattie and Marzano synthesize. For a different perspective, I also looked back at Stephen Krashen's summary of studies of in-class reading, The Power of Reading.  I've learned a lot, but I come away with more questions than clear answers.

Hattie and Marzano are very clear on what does work.  Hattie calls it "visible learning"--he says that it's very important that both teachers and students be able to see what is working and what is not working.  That requires very clear learning objectives, something Marzano stresses and something that seems to have been taken to heart by all of today's reformers.  It also requires a lot of feedback, from teachers to students, but also, and according to Hattie even more importantly, from students to teachers. Teachers need to know what their students are "getting" and what they are not getting.  Teachers should be "activators" and not "facilitators."  Hattie argues strongly and explicitly against the classic "constructivist" conception of teaching, in which the teacher facilitates "authentic" learning that is open-ended, "student-centered," and aimed at "discovery"--that is, learning whose end points or goals neither the teacher nor the student knows beforehand.  According to Hattie, this kind of "hands-off" teaching and "intrinsically motivated" learning is "almost directly opposite" to the kind of learning that the experimental data says is most successful.

Hattie and Marzano back up their arguments with vast amounts of experimental data.  Hattie's book is particularly sweeping: his book is a meta-meta-analysis--that is, a meta-analysis of over 800 meta-analyses, each of which included several individual studies.  Hattie's book synthesizes research on over a hundred million students, and with this vast body of evidence, a logical organization, a thoughtful, placid style but a forceful argument, it is certainly the most impressive analysis of educational data I've ever seen.  That said, I came away not fully convinced.

Two Ways to Critique Hattie: theoretically and empirically
There are two kinds of potential problems with the "recipe" that Hattie (along with Marzano and all the other proponents of "clear learning objectives") is selling.

One kind of problem is theoretical: this kind of learning risks being dry, uninspiring, and limited--and perhaps especially boring to students who get the idea quickly; also, it's not clear to me that, in the case of reading, the "learning goals" can be, to use Hattie's words, "challenging, specific and visible" (25).

A second class of problems, which I'll deal with in a post later this week, has to do with the empirical evidence Hattie is using: (a) the evidence for this recipe's efficacy seems to be quite thin in the case of reading; (b) the evidence for this recipe's efficacy seems to be mostly short-term, not long-term; (c) Hattie's evidence is mostly from controlled experiments, not from natural experiments; (d) Hattie and especially Marzano argue loudly that their evidence shows that poverty can be overcome by good schools and good teaching, but as far as I know this has never actually been done by an actual school in the real world--or at least not in the United States--despite the fact that thousands of schools across the country are trying to put this kind of "visible learning" into practice.

A theoretical critique: Is "Visible Learning" possible in English class--and if so, is it doomed to be dry and boring?
Today I wrote the following on the board in my "Honors" American Lit. class: "Learning Targets:  (1) I will be able to write a good question about a sophisticated text and find a specific passage through which I can explore the question; (2) I will be able to discuss coherently the relationship between Hawthorne and Transcendentalism."

The cluster of A students who sit near the front of the room noticed what I had written and hooted derisively.  One of them asked, "Is that a joke?"  Another said, "No teacher in my entire school career has ever written learning targets on the board."  Another one said it reminded him of a description of a typical day at a Fall River charter school that was posted a few days ago on Edushyster, an anti-corporate Ed reform blog.  This led to a spirited discussion for a minute or so, before I shut it down so that we could get to our learning targets.

One student said that she liked the idea of learning targets, even in English class, but the most common knee-jerk response from the kids was that putting up a "learning target" every day in English class was laughable and ridiculous. Again, we didn't spend much time on this, but I believe their reaction was based on two ideas that I've written about before: (1) that, on the micro level, the skills we work on in English class are very complex and interdependent, so that isolating one of them each day is either absurd (today we're learning about commas in Hemingway) or obvious (the one I put up); and (2) that, on the macro level, daily learning goals promote short-term thinking, while English-class skills are gained over the very long term--learned through repeated practice over years.

What are the learning goals in the case of upper-grade reading?
It is worth asking, then, what people who support "visible learning" would say should be our learning goals.  The obvious place to look for these learning goals is in the Common Core.

The Common Core has two reading strands for grades 6-12, one covering "Literature" and one covering "Informational Text."  I looked at the Literature strand.  The Common Core specifies 9 standards for literature in the ninth and tenth grade.  Two of them are below:

CCSS.ELA-Literacy.RL.9-10.2 Determine a theme or central idea of a text and analyze in detail its development over the course of the text, including how it emerges and is shaped and refined by specific details; provide an objective summary of the text.

CCSS.ELA-Literacy.RL.9-10.4 Determine the meaning of words and phrases as they are used in the text, including figurative and connotative meanings; analyze the cumulative impact of specific word choices on meaning and tone (e.g., how the language evokes a sense of time and place; how it sets a formal or informal tone).

These standards seem fairly unhelpful. They are essentially the same as standards from earlier grades (1).  That would be fine in itself; as I've written, English class is a matter of practicing the same skills over and over.  The problem is that these standards don't tell us anything we don't already know.  We know that every English student in the world will be required to "determine the meaning of words and phrases as they are used in the text." This is quite a low-level goal. 

In the end, I don't think putting a "learning target" up on the board is a particularly bad idea.  I don't think it's necessary, but in the hands of a decent teacher it won't necessarily hurt.  The danger with the practice is that it may lead to reductive, simplistic, boring teaching, in which kids are taught a lesson focused entirely on, say, the simple structure of Hemingway's sentences.  On the other hand, the danger with the opposite approach, the let's-just-read-and-talk-about-it approach, is that it may lead to a class that is way too loose.  Either way is dangerous, but it seems unlikely that telling kids specifically what they are going to learn in every lesson is going to lead to more learning overall.  Learning in English class is, as the Common Core standards admit both implicitly and (sometimes) explicitly, cumulative and long-term.  To pretend that students are going to be noticeably better at "determining the meaning of words and phrases as they are used in the text" after a day-long lesson, or even after a month-long unit, is silly.  Determining the meaning of words and phrases is a life-long task--and it is one at which the learning curve becomes flatter and flatter.  I am probably better in some ways at this now than I was when I was twenty-five, but I'm not sure anyone would notice the improvement.  Does this mean that reading a new author (as when I discovered Edward St. Aubyn last year) is useless?  No, of course not.  But it would be silly to set targets for that learning ("I will be able to describe Patrick Melrose's family dysfunction...").

That may mean that more advanced students will be the most dismissive of "visible learning"; perhaps, but I think it's not necessarily great for any students, and I'll talk more about that in a post on the empirical critique of Hattie's work later this week.

********************************************************************************
Footnote:
(1)
The fourth grade standards include the following:


CCSS.ELA-Literacy.RL.4.2 Determine a theme of a story, drama, or poem from details in the text; summarize the text.

CCSS.ELA-Literacy.RL.4.4 Determine the meaning of words and phrases as they are used in a text, including those that allude to significant characters found in mythology (e.g., Herculean).