Friday, June 14, 2013
Over the past month or so, Tim Shanahan and I have been having an interesting discussion in the comments section of one of his blog posts. I've been pleased that he's taken the time to respond thoughtfully, but he's not convincing me. I'm writing this post, even more than some others, to clarify my own thinking--apologies for getting too far into the boring weeds here.
Initially I asked for evidence that reading more informational text led to better comprehension of such text. He said there was lots of evidence; I asked for specifics; and after a few back-and-forths he finally admitted that "You are correct that there is no study showing that increasing the amount of the reading of informational text has had a clear positive unambiguous impact on reading achievement or student knowledge."
Shanahan did not, however, address why he had written in his blog post: "CCSS is emphasizing the reading of literary and informational text to ensure that students are proficient with a wide variety of text." Nor did he address why, when I asked for evidence that reading more informational text led to greater proficiency with informational text, he responded by saying, "Actually there is quite a bit of research showing that if you want students to be able to read expository text, you have to have them read (or write) expository text."
Instead of explaining why he had made incorrect statements about the evidence for reading informational text, Shanahan asked me to show the evidence for reading literary text. He doesn't seem to get it: my whole point is that there is not strong evidence either way, and it is dishonest to pretend that there is. He, and many other scholars who engage in education discourse aimed at teachers and the general public, are continually pretending that there is strong scientific evidence for their pet curriculum ideas. Very often there is no such evidence.
When I suggested that a lot of "evidence-based" educational policies are not founded on particularly strong evidence, Shanahan made an interesting move: he essentially said that I was demanding too much. As he put it, "the basic problem here is with your understanding of research and how causal claims are put forward." He said that what he and others do is look at the available evidence and come up with a "logic model" that fits the facts. Not every question gets researched, he noted, because some, like "Is third grade really necessary?", are never going to be studied.
So he seems to think that if you have a story that is not inconsistent with some empirically established facts, you have the right to say that "there is quite a bit of research showing" that your story is true.
Maybe. But it seems to me that if a question is genuinely in dispute--like whether it is worthwhile to make young children read more informational text--then saying there is "quite a bit of research showing" that your side is true requires evidence that is not only consistent with your side of the debate but also inconsistent with the other side.
And it's not like we couldn't do some studies! Nell Duke, a prominent proponent of more informational text in the early grades, has gotten millions of dollars in grant money and has spent over a decade studying the issue of how much informational text children "are exposed to" in school. Couldn't she have taken some of that large amount of time and money and done a controlled experiment? Surely some district would have been happy to have a huge library of informational text provided to half of their K-4 schools, so that Duke could check whether students at those schools would actually do better, a few years down the line, at understanding informational texts? But she didn't do it, and Shanahan didn't do it, and now Shanahan is implicitly suggesting that such research would be as silly as a controlled experiment in which we got rid of third grade.
I'm still trying to figure out what I think about "research-based" arguments. I guess my position now is: research can be useful and informative, but it is only rarely, to use a legal term that has been cropping up a lot lately, dispositive; and we should have a lot more of it before we take the kind of authoritative tone that Tim Shanahan and a lot of educational experts take when they are writing for a popular audience. In their scholarly papers, and when pressed in debate, these experts are circumspect and honest about the limitations of their certainty; I'd like to see more of that circumspection in the advice given to us teachers and to the public.
Wednesday, May 22, 2013
Evidence shows that reading informational text more frequently is correlated with lower reading scores
I have another little story about non-evidence-based BS. I'm getting kind of tired of this topic, but I'm going to write it up anyway, just for the record, while my students are writing an in-class essay on Song of Solomon.
Is there evidence that reading more informational text is important?
Because it's being pushed by the Common Core, "informational text" is all the rage these days. Lesson plans for high school English classes are looking more and more like SAT prep--read a brief passage and answer some factual questions about it--except that the passages and questions I've seen in lesson plans have been less interesting than the ones I used to see on the SAT, back when I used to work as a tutor. One of the people promoting the Common Core these days is literacy titan Tim Shanahan. Some of Shanahan's work on CCSS matters is pretty good--he has a decent take on how to handle close reading in the classroom that is much better than a lot of the dreck I have seen--but like David Coleman he has, I think, too little to say about reading volume, and he has jumped on the informational text bandwagon too wholeheartedly. In his most recent blog post, Shanahan writes, "CCSS is emphasizing the reading of literary and informational text to ensure that students are proficient with a wide variety of text."
I am skeptical of this claim, since my working hypothesis is that what's really important is overall reading ability, which is increased by reading a lot of whatever kind of text interests you. So I wrote a comment on Shanahan's blog post asking if he knew of any evidence for his assertion. I wrote, "I have not seen any evidence that trying to make students read more informational text will lead to greater proficiency with informational text."
Shanahan quickly replied to my comment, saying that there was lots of evidence: "Actually there is quite a bit of research showing that if you want students to be able to read expository text, you have to have them read (or write) expository text."
I wrote back asking for specifics, which he didn't give (I understand--he's a busy guy), and then I spent a bit of time poking around. What I found shouldn't have surprised me. Here's the upshot: not only does there seem to be no hard evidence that reading informational text makes you a better reader of informational text, there is actually, oddly, some hard evidence that the very opposite is true: that the more regularly students read informational text, the worse they do on reading tests.
A leading scholar makes the case for informational reading, but has no evidence
Nell Duke is a Michigan professor who has spent much of her career pushing to get more informational text in U.S. classrooms; she also edits the "Research-Informed Classroom" book series. Duke has tried to make the case for more informational text in many articles over many years, and her efforts may be paying off: both of my children have been exposed to more informational text in the course of their schooling than I was. This is not necessarily bad, but it's not necessarily good, either.
For what Nell Duke has not done is provide empirical evidence that reading more informational text will make you better at reading informational text. She is upfront about this: "While there is a great deal of agreement about the necessity of substantial or ongoing experience with a genre (e.g., New London Group, 1996), there is currently no empirical research available to speak to the question of how much experience with a given form of written text is necessary for a particular level of acquisition" (Duke, "3.6 Minutes a Day," RRQ, 2000, p. 207). In other words, there is "agreement" among some researchers, but they don't have any hard evidence.
Do U.S. children "need" to read informational text?
In 2010 Nell Duke published an article in the Phi Delta Kappan called "The Real World Writing U.S. Children Need." The article begins by citing an international test that shows U.S. children doing slightly better on standardized test questions about literary text than on questions about informational text. It goes on to make Duke's usual argument that students need to read more informational text.
Because I am skeptical of this claim, I looked up the international test Duke mentions, the PIRLS. As Duke reported, U.S. children, like those in many other countries, did a bit better on questions about literary text than informational text--but the scores were not very far apart. What Duke did not report, however, was that the 2006 PIRLS study had actually done a bit of empirical research on the very question of whether more exposure to informational text is associated with higher scores on informational text.
The PIRLS study asked students how frequently they read literary texts, and how frequently they read informational texts. It turns out, counterintuitively perhaps, that students who reported reading informational texts more frequently actually did worse on the reading test than students who reported reading informational texts less frequently. Here's the relevant section of the US Government report on the 2006 PIRLS:
"The average score on the combined reading literacy scale for U.S. students who read stories or novels every day or almost every day (558) was higher than the average score for students who read stories or novels once or twice a week (541), once or twice a month (539), and never or almost never (509). In contrast, the average score for students who read for information every day or almost every day (519) was lower than the average score for students who read for information once or twice a week (538), once or twice a month (553), and never or almost never (546).
"The higher performance of U.S. students who read for information less frequently relative to U.S. students who read for information more frequently was also observed internationally." (http://nces.ed.gov/pubs2008/2008017.pdf, page 16-17))
So, to clarify, the very study that was cited as evidence of U.S. students not reading enough informational text turns out to show that frequent reading of informational text is associated with lower reading scores.
What to conclude?
First, while those PIRLS data are weird and counterintuitive, and almost certainly don't mean that reading informational text actually harms one's reading level, one thing is clear: this is not a study that offers any support for the idea that U.S. students "need" to read more informational text. The evidence for this assertion--like the evidence for explicit vocabulary instruction, for charter schools, for VAM teacher evaluation, for larger class sizes, and for explicitly teaching reading "strategies" rather than focusing on meaningful, content-based reading and discussion--is simply very weak, if not outright negative.
Second, we are again confronted with the spectacle of very eminent scholars (Shanahan is a real bigwig, and Duke is a professor at a very good university who is quite well-established) making strong assertions in the practical and policy realms that don't seem backed up by evidence in the scholarship realm. There is a striking contrast between the careful language ("may," "currently no empirical research available," etc.) used in scholarly papers and the bold, authoritative tone of articles aimed at teachers and the public about what children "need" to be doing, and what practices will "ensure" a particular result.
The takeaway for me, once again, is that we simply cannot trust any assertion that we have not ourselves looked into carefully--even, or perhaps especially, if it is accompanied by the label "research-based", or as Nell Duke's book series has it, "Research-Informed." Instead, we must rely mostly on our own common sense and our sense of humanity. At the heart of our work should be: meaningful reading, meaningful writing, and meaningful discussion.
Friday, May 17, 2013
Salt is fine for you! (poor logic in public health and educational pseudo-science)
I'm busy, but this caught my eye, so I'll do it quickly. It turns out that low-salt diets are actually bad for you.
This is another in a long line of reversals for overly simplistic public health guidelines (butter is bad, mammograms are good, breastfeeding is bad, eat lots of carbs, etc.), and it should make us very wary of educational pseudo-science, which often uses the exact same simplistic logic.
Salt was considered bad because salt consumption was associated with slightly higher blood pressure, and slightly higher blood pressure was associated with slightly more heart attacks. So A was associated with B, and B was associated with C--but no one had actually checked that A led to C--that increased salt consumption meant more heart attacks or earlier death. It turns out that it doesn't--quite the opposite.
This isn't so surprising--the human body is a very complicated system, and there are a lot of other billiard balls on the table besides just salt and blood pressure--but it is worth noting as a cautionary tale, because so much "evidence-based" discourse in the education world is highly dubious and likely to be disproven in the future.
Some examples
I have seen this kind of logic--A is correlated with B, and B is correlated with C, so A must cause C--in arguments about all sorts of questions. Here's an example that I wrote about a long time ago:
A. Explicit vocabulary instruction can lead to some increase in vocabulary.
B. Good readers tend to have larger vocabularies.
C. Therefore, one of the most evidence-based ways to increase reading comprehension is explicit vocabulary instruction.
Sometimes the arguments are even weaker:
A. The texts assigned in schools are less complex than they were 40 years ago.
B. Students are marginally less good at reading complex texts than they were 40 years ago (I think the evidence for this is very weak, but I'll accept it for the sake of argument).
C. Therefore, we should assign more difficult texts in school.
And sometimes they're painfully comical:
A. Schools are spending more money.
B. Test scores are flat.
C. Therefore, we should get rid of teachers' unions.
Why not more logical arguments?
These arguments would seem to be self-evidently silly, and it might not be worth taking the time to respond to them, except that they are so widely accepted and so central to the major education debates of our time. So we might ask, why not the following arguments?
A. Many of our students go whole years without reading a single book.
B. No one has ever become a good reader without reading a lot.
C. Therefore, we should spend a lot of time and money and thought on getting kids to read more.
A. In an appropriately leveled book, 1% of the words will be new, and readers will learn on average about 15% of those new words.
B. If you read 100 pages a week, you will learn about 45 new words a week, which is far more than the ten to fifteen that kids are given in typical vocabulary instruction.
C. Therefore, teachers should stop spending time on explicit vocabulary instruction and should instead devote more time to independent reading.
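For what it's worth, the arithmetic behind premise B checks out only once you fill in an assumption the syllogism leaves implicit: how many words are on a page. A minimal sketch, assuming roughly 300 words per page (a common ballpark for a trade paperback):

```python
pages_per_week = 100
words_per_page = 300    # assumed; not stated in the argument above
new_word_rate = 0.01    # premise A: ~1% of running words are new to the reader
learn_rate = 0.15       # premise A: ~15% of new words encountered are learned

words_read = pages_per_week * words_per_page   # 30,000 words per week
new_words_met = words_read * new_word_rate     # 300 unfamiliar words
words_learned = new_words_met * learn_rate     # words learned per week
print(words_learned)                           # 45.0
```

Change the words-per-page figure and the total moves proportionally, but even a conservative 150 words per page still yields over 20 words a week from reading alone.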
A. If you read books that are too hard, you don't improve as much as if you read books that are at the appropriate level.
B. If you read books that are too hard, you will read less.
C. Therefore, students should read books at their reading level.
A. The educational achievement of poor kids is much, much worse than that of rich kids.
B. No school has ever succeeded in educating poor kids up to the level of rich kids.
C. Therefore, if we want all kids to have the same opportunity, we should eliminate poverty.
But that last argument is its own refutation, because of course it is unthinkable anymore to seriously consider attacking poverty directly. So instead we get ridiculous logic.
Thursday, May 16, 2013
Krugman on Ed Reform
Paul Krugman has a good piece in the NYRB discussing the ill-judged move, over the past few years in the West, to "austerity"--that is, cutting government spending and raising taxes. The article is worth reading even for people who aren't usually terribly interested in economics, both because this is one of the major stories of our time and because Krugman's analysis of how the opinions of the elites and the policymakers could be so wrong is more broadly applicable. In particular, I think the analysis applies pretty well to the Ed Reform movement of the past decade.
According to Krugman, the move to austerity has a few notable features:
* It found support from scholarly studies that are, despite fancy pedigrees (Harvard!), shaky and dubious
* It had a simple moral and psychological appeal
* It did not demand anything difficult from the elites themselves
All of these fit the Ed Reform movement as well:
Support from dubious but Ivy League scholarship
Educational research is, like economics research, anything but conclusive. A lot of pretty basic questions are surprisingly unclear: whether homework is worthwhile, whether class size makes a big difference, how reliable or valid standardized test scores are as measures of teacher effectiveness, whether vocabulary instruction is useful, and many, many more. Nevertheless, you would never know this from the self-assured pronouncements of people like Bill Gates, who can move with dizzying fickleness from demanding small schools to demanding Common Core Standards to demanding that teachers be evaluated by test scores to demanding larger class sizes, citing studies for each new "evidence-based" proposal despite the fact that none of these proposals has more than the slimmest of empirical evidence in its favor.
In fact, the Ed reformers' emphasis on standardized tests is striking in its radical departure from what has long been understood: that what matters is student engagement with the material in as authentic a way as possible. But just as radical proponents of austerity economics have, on the basis of thin scholarship and simplistic moralizing ("We must tighten our belts!"), left behind the accepted wisdom of John Maynard Keynes--just so have the Ed reformers radically left behind the legacy of John Dewey on the basis of thin scholarship and similarly simplistic moralizing ("We need to get tough!").
Crude moral and psychological appeal
Blame and punishment have an eternal appeal. Just as the Germans blame the Greeks, and demand cuts and austerity from the Greeks even though cutting the Greek economy off at the knees means it won't ever get back on its feet, American Ed reformers blame teachers and schools, and demand punishment. Never mind changing the larger system of inequality and poverty, never mind the fact that punitive measures never work, blame and punishment are appealing--especially if you can put them onto other people, not yourself or people you know.
Few demands on the elites themselves
Because cutting taxes on the rich helps the rich, cutting government spending doesn't hurt them directly, and failing to tackle high unemployment keeps wages down and corporate profits high, the wealthy proponents of government austerity are remarkably insulated from the bad effects of the policies they propose. In the same way, the Ed reformers are very far from personally connected to the reforms they propose. Not only have none of these people (Gates, Broad, Duncan, Obama, Bloomberg, Klein, Coleman, Emanuel, etc.) ever actually been a teacher, not a single one of them, as far as I know, has kids in public school. The fact that all of the ed reformers send their kids to private school is significant for two reasons: (1) because private schools like Lakeside, Sidwell Friends or the Lab School do not follow an ed reform model now, and (2) because they will never have to. New standards, new testing regimes, increased class sizes, teacher evaluation based on test scores--all of these dubious reforms will be imposed on those of us in public schools, but their architects and proponents are sheltered and insulated from them. It is infuriating.
Conclusion
What Krugman says about proponents of austerity economics is, I think, appropriate to Ed Reform as well (and reminds me why the wonk is an invasive species). Here's Krugman:
"It’s a terrible story, mainly because of the immense suffering that has resulted from these policy errors. It’s also deeply worrying for those who like to believe that knowledge can make a positive difference in the world. To the extent that policymakers and elite opinion in general have made use of economic analysis at all, they have, as the saying goes, done so the way a drunkard uses a lamppost: for support, not illumination. Papers and economists who told the elite what it wanted to hear were celebrated, despite plenty of evidence that they were wrong; critics were ignored, no matter how often they got it right."
Tuesday, April 16, 2013
And I thought economics research was slightly more trustworthy than Ed. research?
I have often been amazed at how sloppy and thin education research is. An item in today's news reminds me that all human endeavor is subject to the same human error.
For years now, the public debate over governmental spending and governmental debt, which has led to the truly bizarre situation of a Democratic U.S. President proposing to cut Social Security and Medicare, has been heavily influenced by a 2010 study by two eminent economists, Reinhart and Rogoff. The R/R study claimed to show, by rigorous analysis of the historical record, that countries with high government debt levels (above 90% of GDP, I think was the number) had very low economic growth--actually, economic growth of -0.1%, so not growth at all, but contraction. This paper was used to argue that the correct response to the economic crisis of recent years was not stimulus spending, as per Keynes and much of standard textbook economics, because stimulus spending that increased debt levels too high would not be stimulative, but contractionary. Instead of doing stimulus spending, governments were supposed to respond to the deep recession by... well, at any rate by doing something else (all too often education reform was dragged into the debate).
Now, with the publication of a paper by UMass researchers, it turns out that the Reinhart/Rogoff study was flawed, partly due to massaging the data in unconventional ways (picking and choosing, and weighting it weirdly), and partly--get this--due to a typo in the Excel spreadsheet they used to work with their data. The typo (44 instead of 49) led to the exclusion of several key countries. Mike Konczal covers it here, but the key result is that if you handle the data normally and don't have the typo, countries with 90% debt/GDP ratios actually had average economic growth of 2.2%, not -0.1%. 2.2% is not great, but it's not negative, and it destroys the argument that most mainstream economists and pundits have been using to argue for cuts in government spending.
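The mechanics of that kind of error are depressingly mundane. Here's a minimal sketch, with invented growth figures rather than the actual Reinhart/Rogoff data, of how an averaging range that stops five rows short silently drops countries:

```python
# Invented per-country growth rates for high-debt countries; these are
# NOT the actual Reinhart/Rogoff data, just an illustration of the bug.
growth_rates = [-7.6, 0.3, 1.0, 1.9, 2.4, 2.7, 3.0, 3.2, 3.9, 4.6]

full_mean = sum(growth_rates) / len(growth_rates)   # all ten countries

# The spreadsheet equivalent of =AVERAGE(L30:L44) when the data actually
# run through row 49: five countries never enter the calculation, and
# the spreadsheet raises no error or warning.
truncated = growth_rates[:-5]
typo_mean = sum(truncated) / len(truncated)

print(round(full_mean, 2))   # 1.54
print(round(typo_mean, 2))   # -0.4
```

With these made-up numbers, the truncated range turns modest average growth into contraction--the shape of the real mistake, if not its exact magnitude.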
This is really remarkable news for anyone who's been following the public economics discourse recently. The Reinhart/Rogoff study has been cited more than anything else in the debates these past few years over government spending (for example, just last week Paul Ryan's response to Obama's budget was entirely based on the R/R study) by people who needed an apparent scientific basis for cuts in government spending like those that have brought England back into recession and Europe to the first stages of dissolution, and yet the study seems to have been simply shoddy and wrong.
What to take away from this? One, we can usually trust Dean Baker and Paul Krugman. Two, it behooves us to be modest, compassionate and natural, and to beware of dodgy dossiers.
Saturday, February 23, 2013
For vocabulary, input volume matters
A recent issue of Edweek has another lame article purporting to connect research to the Common Core--again, most of the "research" is incredibly weak and the experts say silly things. Nevertheless, the article does mention one classic piece of research, and looking back at that study leads, as usual, to the conclusion that reading volume matters a lot.
Unnecessary new research...
According to the first paragraph of the article:
Children who enter kindergarten with a small vocabulary don't get taught enough words—particularly, sophisticated academic words—to close the gap, according to the latest in a series of studies by Michigan early-learning experts.
Well, duh. Anyone who had looked carefully at the vocabulary research, or indeed simply thought logically about the matter, would know that no scientific studies are necessary to conclude that kids with small vocabularies can't possibly be "taught" enough words to close the gap, since the only truly significant way kids learn words is through reading them and hearing them used. I have never yet read a study purporting to show that any class of students, anywhere, has been "taught" enough words to make a significant increase in their vocabularies. The main study discussed in the article found limited vocabulary instruction across the board, and less instruction in "academically challenging words" at high-poverty schools. Neither of these findings is necessarily significant, because vocabulary instruction just doesn't make much of a difference.
...and humorless experts!
Throughout the article, ostensible experts are quoted saying silly things. For instance, one scholar says that Kindergarteners should be taught academic words like "predict." That might be reasonable, but then she goes on, "Why would you choose to emphasize the word 'platypus'? It makes no sense." Hm. What makes no sense to me is that someone who can't imagine a reason to emphasize a really interesting, cool, loveable word like "platypus" would have anything to do with children's education, let alone be on the faculty at the University of Michigan.
What we should be thinking about
The article spends a fair amount of space, and a cool decision tree sidebar graphic, on which words to teach. Thinking a lot about this is probably a waste of time, since teaching words doesn't make much of a difference, except, perhaps, insofar as kids enjoy them. What then should we be thinking about? Well, the only decent piece of research cited by the article is the classic 1995 study by Hart and Risley that reveals the remarkable disparities in the numbers of words heard by kids from different socioeconomic backgrounds. Upper-middle class kids hear 11 million words per year, while poor kids hear 3 million; by the age of three, upper-middle class kids know twice as many words. The two "key conclusions" of the Hart and Risley study are the following:
• The most important aspect of children’s language experience is quantity.
• The most important aspect to evaluate in child care settings for very young children is the amount of talk actually going on, moment by moment, between children and their caregivers.
These conclusions do NOT say that we should be spending our time deciding which words to "emphasize" or "prioritize" or teach; instead, what matters is how many words kids are hearing or reading. In fact, I see no reason not to transfer the second conclusion of the Hart and Risley study to schools of older children, too, with only slight modifications. As children get older, we need to add reading to the model, since reading becomes essential for experiencing high volumes of sophisticated language. The quality of the input may also become more important, since you do want kids to hear or read words that they don't actually know. But the quantity is still much more important than the Edweek article acknowledges. So I would extrapolate thus:
• The most important aspect to evaluate in child-care settings for older children (i.e., schools) is the amount of sophisticated language actually experienced by the children, whether from a caregiver (i.e. teacher) or by reading.
Of course, I suspect reading is probably more important than teacher-talk. The Hart and Risley study focused on talk within families, which is primarily one-on-one, and the best way to simulate that in a classroom with a student-teacher ratio of at least 20:1 is by having each child read a book. So I'll conclude where I always do, with another form of my usual hypothesis:
• The most important aspect to evaluate in child-care settings for older children (i.e., schools) is the amount of reading actually going on.
Thursday, December 20, 2012
Can we trust educational research? ("Visible Learning": Problems with the evidence)
I've been reading several books about education, trying to figure out what education research can tell me about how to teach high school English. I was initially impressed by the thoroughness and thoughtfulness of John Hattie's book, Visible Learning, and I can understand why the view of Hattie and others has been so influential in recent years. That said, I'm not ready to say, as Hattie does, that we must make all learning visible, and in particular that "practice at reading" is "minimally" associated with reading gains. I discussed a couple of conceptual issues I have with Hattie's take in an earlier post--I worry that Visible Learning might be too short-term, too simplistic, and less well-suited to English than to other disciplines. Those arguments, however, are not aimed at Hattie's apparent strength, which is the sweep and heft of his empirical data. Today, then, I want to address a couple of the statistical weaknesses in Hattie's work. These weaknesses, and the fact that they seem to have been largely unnoticed by the many educational researchers around the world who have read Hattie's book, only strengthen my doubts about the trustworthiness of educational research. I agree with Hattie that education is an unscientific field, perhaps analogous to what medicine was like a hundred and fifty years ago, but while Hattie blames this on teachers, whom he characterizes as "the devil in this story" because we ignore the great scientific work of people like him, I would ask him to look in the mirror first. Visible Learning is just not good science.
Hattie's data
Visible Learning attempts to be both encyclopedia and synthesis. The book categorizes and describes over 800 meta-analyses of educational research (altogether, those 800 meta-analyses included over 50,000 separate studies), and it puts the results of those meta-analyses onto a single scale, so that we can compare the effectiveness of the very different approaches. After categorizing the meta-analyses, into, for instance, "Vocabulary Programs", "Exposure to Reading", "Outdoor Programs", or "Use of Calculator", Hattie then determines the average effect that the constituent meta-analyses show for that educational approach. By these measures, exposure to reading seems to make more of a difference than the use of calculators, but less of a difference than outdoor programs, and much less of a difference than vocabulary programs. (There are some odd results: "Direct Instruction," according to Hattie's rank-ordering, makes more of a difference than "Socioeconomic Status.")
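To make the method concrete: each category's headline figure is, in essence, an average of the mean effect sizes reported by the meta-analyses filed under it (exactly how Hattie weights them is one of the things critics have questioned). A toy sketch with invented d values, not Hattie's actual numbers:

```python
# Invented mean effect sizes (Cohen's d) from hypothetical meta-analyses,
# grouped by category the way Hattie groups them.
categories = {
    "Vocabulary Programs": [0.62, 0.71, 0.69],
    "Exposure to Reading": [0.42, 0.31, 0.40],
    "Use of Calculator":   [0.22, 0.32],
}
for name, effect_sizes in categories.items():
    avg = sum(effect_sizes) / len(effect_sizes)   # unweighted mean
    print(f"{name}: {avg:.2f}")
```

Everything downstream in the book--the barometers, the rank-ordered appendix--rests on treating these averaged averages as directly comparable.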
Like other teaching gurus and meta-meta-analyzers (for instance, Robert Marzano, whose 2000 monograph, A New Era of School Reform, makes the case very explicitly), Hattie believes that good teaching can be codified and taught (that sounds partly true to me), that good teaching involves having very clear and specific learning objectives (I'm somewhat doubtful about that), and that good teaching can overcome, at the school level, the effects of poverty and inequality (I don't believe that). Hattie uses a fair amount of data to back up his argument, but the data and his use of it are somewhat problematic.
First, questions about the statistical competence of Hattie in particular
I am not sure whether we can trust education research, and I am not alone. John Hattie seems to be a leading figure in the field, and while he seems to be a decent fellow, and while most of his recommendations seem somewhat reasonable, his magnum opus, Visible Learning, has such significant issues that my one friend who's a professional statistician believes, after reading my copy of the book, that Hattie is incompetent.
The most blatant errors in Hattie's book have to do with something called "CLE" (Common Language Effect size), which is the probability that a random kid in a "treatment group" will outperform a random kid in a control group. The CLEs in Hattie's book are wrong pretty much throughout. He seems to have written a computer program to calculate them, and the program was poorly written. That might be understandable (all programming has bugs), and it might not by itself mean that Hattie is statistically incompetent, except that the CLEs he cites are dramatically, visibly wrong. For instance, the CLE for homework, which Hattie uses prominently (page 9) as an example to explain what CLE means, is given as .21. That would imply that a student who did not have homework was considerably more likely to do well than a student who did. This is ridiculous, and Hattie should have noticed it. Even more egregious, Hattie proposes CLEs that are less than 0. Hattie has defined the CLE as a probability, and a probability cannot be less than 0. There cannot be a less-than-zero chance of something happening (except perhaps in the language of hyperbolic seventh graders).
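Under the standard definition (McGraw and Wong's, which appears to be the one Hattie intends), the CLE is computed from the effect size d via the normal CDF, so it is easy to check what homework's reported effect size of .29 should yield. A minimal sketch, assuming normal distributions with equal variances:

```python
from math import sqrt
from statistics import NormalDist

def cle(d: float) -> float:
    """Common Language Effect size for Cohen's d (McGraw & Wong, 1992):
    the probability that a randomly chosen treatment-group student
    outscores a randomly chosen control-group student."""
    return NormalDist().cdf(d / sqrt(2))

print(round(cle(0.29), 2))   # ~0.58: what d = .29 should produce

# Hattie's reported CLE of .21 would require a strongly negative effect:
print(round(NormalDist().inv_cdf(0.21) * sqrt(2), 2))   # ~ -1.14
```

A CLE of .21 for homework isn't a rounding quirk: it corresponds to an effect size of about -1.14, wildly inconsistent with the .29 the book reports.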
As my statistician friend wrote me in an email, "People who think probabilities can be negative shouldn't write books about statistics."
Second, doubts about the trustworthiness of educational researchers in general
My statistician friend is not the first to have noticed the probabilities of less than zero. A year and a half ago a Norwegian researcher wrote an article called "Can We Trust the Use of Statistics in Educational Research?" in which he raised questions about Hattie's statistical competence, and in follow-up correspondence with Hattie the Norwegian was not reassured. (Hattie seems, understandably, not to want to admit that his errors were anything more than minor technical details. In an exchange of comments on an earlier post on this blog, as well, Hattie seems to ignore the CLE/negative-probability problem.)
For me, the really interesting thing about Hattie's exchange with the Norwegians was that he seemed genuinely surprised, two years after his book had come out, by the fact that his calculations of CLE were wrong. In his correspondence with the Norwegians, Hattie wrote, "Thanks for Arne Kåre Topphol for noting this error and it will be corrected in any update of Visible Learning." This seems to imply that Hattie hadn't realized there was any error in his calculations of CLE until the Norwegians pointed it out--which means, if I'm right, that no one in the world of education research noticed the CLE errors between 2009 and 2011.
If it is true that the most prominent book on education to use statistical analysis (when I google "book meta-analysis education", Hattie's book is the first three results) was in print for two years, and not a single education researcher looked at it closely enough and had enough basic statistical sense to notice that a prominent example on page 9 of the book didn't make sense, or that the book was apparently proposing negative probabilities, then education research is in a sorry state. Hattie suggests that the "devil" in education is the "average" teacher, who has "no idea of the damage he or she is doing," and Hattie approvingly quotes someone who calls teaching "an immature profession, one that lacks a solid scientific base and has less respect for evidence than for opinions and ideology" (258). He essentially blames teachers for the fact that teaching is not more evidence-based, implying that if we hidebound practitioners would only do what the data-gurus like him suggest, then schools could educate all students to a very high standard. There is no doubt that there is room for improvement in the practice of many teachers, as there is in the practice of just about everyone, but it is pretty galling to get preachy advice about science from a guy and a field who can't get their own house in order.
Another potential problem with Hattie's data
Aside from the CLE issue, I am troubled by the way Hattie presents his data. He uses a "barometer" that is supposed to show the effectiveness of whatever curricular program or pedagogical practice he is considering. This is the central graphic tool in Hattie's book, the gauge by which he measures every curricular program, pedagogical practice and administrative shift:

[Image: Hattie's "barometer," a dial with zones for reverse effects, developmental effects, teacher effects, and the zone of desired effects, with an arrow marking the measured effect size.]
Note that developmental and teacher effects are both above zero. What this implies is that the effect size represented by the arrow is not the effect as compared to a control group of students that got traditional schooling, nor even the effect size as compared to students who got no schooling but simply grew their brains over the course of the study, but the effect size as compared to the same students before the study began.
This would imply that offering homework, with a reported effect size of .29, is actually worse than having students just do normal school, or that multi-grade classes, with an effect size of .04, make kids learn nothing.
Now, that is obviously not what Hattie means. The truth is that Hattie sometimes uses "effect size" to mean "as compared to a control group" and other times uses it to mean "as compared to the same students before the study started." He seems comfortable with this ambiguity, but I am not. Not only is the "barometer" very confusing in cases like homework and multi-grade classrooms, where the graphic seems clearly to imply that those practices are less effective than just doing the regular thing (especially confusing in the case of homework, which is the regular thing), but the ambiguity also makes me very, very skeptical of the way Hattie compares these different effect sizes. The comparison of these "effect sizes" is absolutely central to the book. Comparing effect sizes (and he rank-orders them in an appendix) is just not acceptable if the effects are being measured against dramatically different comparison groups.
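To see how much the choice of baseline matters, here is a toy simulation with invented numbers (mine, not Hattie's data): the same simulated intervention, on the same simulated students, yields one "effect size" against a control group and a very different one against the students' own pre-test scores.

```python
# Toy illustration, invented numbers: one intervention, two "effect sizes",
# depending entirely on the comparison group.
import numpy as np

rng = np.random.default_rng(0)
n, sd = 10_000, 15.0
pre = rng.normal(100, sd, n)      # everyone starts here
growth, boost = 10.0, 3.0         # a normal year of school; the intervention's extra
control_post = pre + growth + rng.normal(0, 5, n)
treated_post = pre + growth + boost + rng.normal(0, 5, n)

def cohens_d(a, b):
    """Cohen's d using a pooled standard deviation."""
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled

print(f"vs. control group: d = {cohens_d(treated_post, control_post):.2f}")  # ~0.19
print(f"vs. pre-test:      d = {cohens_d(treated_post, pre):.2f}")           # ~0.84
```

Same students, same intervention, and the second number is more than four times the first. Rank-ordering hundreds of such numbers, when some were computed the first way and some the second, is exactly the apples-to-oranges problem I'm worried about.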
Hattie, in a comment on an earlier post in which I expressed annoyance at this confusion, suggested that we should think of effect sizes as "yardsticks"--but in the same comment he says that effect size is the effect as compared to two different things. In his words: "An effect size of 0 means that the experimental group didn't learn more than the control group and that neither group learned anything." Now, I am an English teacher, so I know that words can mean different things in different contexts. But that is exactly what a yardstick is not supposed to do!
Of course, it is possible that many of Hattie's conclusions are correct. Some of them (like the idea that if you explicitly teach something and have kids practice it under your close observation, then they will get better at it more quickly than if you just ask them to try it out for themselves) are pretty obvious. But it is very hard to have much confidence in the book as a whole as a "solid scientific base" when it contains so much slipperiness, confusion and error.
Beyond these broad issues with Hattie's work, I also have some deep qualms about the way he handles reading in particular. Maybe one day I'll address those in another post.
Friday, November 30, 2012
The Common Core is not "evidence-based"--but maybe that's okay!
My Curriculum Coordinator just got a subscription to the Marshall Memo, which may be bad for my mental health--I'll be reading a lot more Ed. research. This post came from reading an article in EdWeek that Marshall refers to in this week's memo. I'm sorry the post is so long. I think it served to clarify my thinking...
The Common Core is Not Evidence-Based
There's an article in a recent EdWeek with the remarkable headline: "New Literacy Research Infuses Common Core." The article's subtitle reads, "In the 15 years since the National Reading Panel convened, the knowledge base on literacy has grown." As far as I can tell, both the headline and the subhead are essentially false. The Common Core standards are not really evidence-based, and the knowledge base on literacy has not grown much in the 15 years since the now-discredited National Reading Panel--except perhaps in the Socratic sense of knowing how much it doesn't know.
This is interesting only because it points up the farcical nature of so much of today's educational discourse. While most of what happens in schools today is worthwhile, the way people talk about it is just ridiculous. One of my colleagues suggested that our schools would be better off if all the Graduate Schools of Education disappeared from the face of the earth (he used stronger words), and I think he might be right. Education research is like Hollywood: NOBODY KNOWS ANYTHING. Of course, this isn't true, either of schools or of Hollywood, but it's partly true.
The article itself is somewhat better than its headlines, and if you read it carefully you can see that the Ed. professors are basically just as in the dark as we teachers are--if not more so. What is most amazing about an article that claims to be about "new literacy research" is that it describes very little actual research. The article quotes many academics, but often they say things as questionable and non-evidence-based as the following paragraph, which manages to express the same simple idea, an idea that has been a truism for many decades now, over and over:
"In our knowledge-based economy, students are not only going to have to read, but develop knowledge-based capital. We need to help children use literacy to develop critical-thinking skills, problem-solving skills, making distinctions among different types of evidence," said Susan B. Neuman, a professor in educational studies specializing in early-literacy development at the University of Michigan in Ann Arbor. "The Common Core State Standards is privileging knowledge for the first time. To ensure they are career-and-college ready, we have to see students as lifelong learners and help them develop the knowledge-gathering skills they will use for the rest of their lives. That's the reality."
Privileging knowledge is a new idea? Helping kids become lifelong learners is a new idea? Critical thinking in literacy is a new idea? What? Not only are these old ideas, there is not the slightest bit of research or data to be found in that paragraph.
The recent history of "evidence-based" BS
The EdWeek article goes on to discuss the National Reading Panel of 2000, which was a much-ballyhooed effort to establish the most advanced and scientific thinking about how children learn to read and how we can help them. The Panel's report came down decisively on the side of explicit instruction in skills: phonemic awareness, vocabulary, comprehension strategies, and so on. The panel's recommendations formed the basis of "Reading First," a $1-billion-a-year federal effort to improve reading in the early grades. Eight years later, there was a comprehensive assessment of the program, to find out how much difference this explicit instruction in skills had made. The answer: zero difference.
The assessment reported three key findings: (1) Reading First did indeed result in students spending more time on reading "instruction" (phonemic awareness, vocab, etc.); (2) Reading First did indeed result in more professional development in "scientifically based reading instruction (SBRI)"; (3) however, "Reading First did not produce a statistically significant impact on student reading comprehension test scores in grades one, two or three" (page v).
Another finding, which the assessment did not flag as "key" but which may help explain the flat results, was that the increased instructional time and the emphasis on skills did not lead to students doing any more actual reading. As the assessment puts it: "Reading First had no statistically significant impacts on student engagement with print" (page xii).
This is remarkable: in 2000, only twelve years ago, the state of the research (the "knowledge-based capital," in the vapid phrase of the Michigan professor), which the panel of eminent experts claimed to hold to the "highest standards of scientific evidence," was utterly and completely wrong.
After being so wrong, the education experts tried to reposition themselves--but not very clearly. One of them is quoted in the EdWeek article as saying that after the National Reading Panel, "comprehension became the 'next great frontier of reading research.'" This is odd, since "comprehension" was one of the central topics of the NRP itself. (1)
Reading Next:
One of the ways the experts tried to reposition themselves was in a report called "Reading Next," which according to the EdWeek article "helped spark the common core's approach. Education professor Catherine A. Snow and then-doctoral student Gina Biancarosa of the Harvard Graduate School of Education found that explicit comprehension instruction, intensive writing, and the use of texts in a wide array of difficulty levels, subjects, and disciplines all helped improve literacy for struggling adolescent readers."
Reading Next focused on an array of fifteen "powerful tools" for improving literacy. In an improvement on the NRP's exclusive focus on skills instruction, many of Reading Next's recommendations were so vague that no one could object ("Effective Instructional Principles Embedded in Content"), and many sounded fairly old-fashioned (Strategic Tutoring; Motivation and Self-Directed Learning; Extended Time for Literacy). But when you actually looked more deeply into what the specific recommendations were, it became clear that the report was, like the NRP, trying as hard as possible to avoid mentioning the very simple strategy of having students actually read.
Avoiding all mention of actual reading:
Here is the passage from the Reading Next report that discusses "Extended Time for Literacy," which I had thought from its title might mean more time for students to actually read. That may be what is meant, but the authors seem to twist themselves into jargony knots so as to avoid discussing actual "reading":
Extended Time for Literacy
None of the above-mentioned elements are likely to effect much change if instruction is limited to thirty or forty-five minutes per day. The panel strongly argued the need for two to four hours of literacy-connected learning daily. This time is to be spent with texts and a focus on reading and writing effectively. Although some of this time should be spent with a language arts teacher, instruction in science, history, and other subject areas qualifies as fulfilling the requirements of this element if the instruction is text centered and informed by instructional principles designed to convey content and also to practice and improve literacy skills.
To leverage time for increased interaction with texts across subject areas, teachers will need to reconceptualize their understanding of what it means to teach in a subject area. In other words, teachers need to realize they are not just teaching content knowledge but also ways of reading and writing specific to a subject area. This reconceptualization, in turn, will require rearticulation of standards and revision of preservice training.
This passage is amazing. Despite the fact that it seems intended to promote spending more time having students actually reading, the language in this passage and the whole report seems to avoid saying that straight out. Instead we hear about "instruction" (four times), "literacy-connected learning," "interaction with texts," and "instructional principles designed to convey content." The word "reading" appears twice, but never on its own, never with the implication that the students might be actually reading; instead, we read that time should be spent with "a focus on reading" and in "teaching... ways of reading."
This passage, like the whole report and indeed like so much of the discourse of reading experts, makes me think of Pearson and Gallagher's "Gradual Release of Responsibility Model." These experts, perhaps because they are so far removed from teaching actual children, are not willing to release responsibility...
Much of the data that does exist is obvious
Much of the "research" that the EdWeek article mentions is super-obvious. For instance, here is some expert wisdom:
"research showing that there is no bright line for when students start to read to learn"
"Kids have to read across texts, evaluate them, respond to them all at the same time. In office work of any sort, people are doing this sort of thing all the time."
"a student's depth and complexity of vocabulary knowledge predicts his or her academic achievement better than other early-reading indicators, such as phonemic awareness."
Didn't we all know these things already? But here is my personal favorite piece of obvious data:
"students who practiced reading, even when it was difficult, were significantly better 20 weeks later at reading rate, word recognition, and comprehension, in comparison with the control group."
Wow--if you read more, you get better. Who knew?!
Education is like medicine, circa 1850
I have read that at the end of John Hattie's 2009 magnum opus, Visible Learning (I've ordered the book, but it hasn't come yet), Hattie compares the state of research in education to the state of medical research in the nineteenth century. In other words, we teachers might be better off with home remedies or folk wisdom. And in a sense this makes me feel a bit better about the Common Core. The Common Core is in no real sense, as far as I can tell, evidence-based (saying that students will one day have to write non-fiction is not scientific evidence for making them read it a lot when they are eight), but given the state of education research, maybe that's okay. What matters is that students read a lot, think and talk about what they read, and look carefully at their own writing. We English teachers can facilitate this process, but we shouldn't worry too much about the standards, which are, as Tim Shanahan says, in an expert opinion I can agree with completely, "a little goofy."
***************************************************
Footnotes:
(1) In fact, one of my favorite passages from the NRP report is the following piece of meaningless verbiage: "Comprehension is critically important to the development of children's reading skills and therefore to the ability to obtain an education. Indeed, reading comprehension has come to be the 'essence of reading.'" This is as absurd as if one were to say, "Movement is critically important to the development of children's running skills and therefore to the ability to compete in many team sports. Indeed, movement has come to be the 'essence of running.'"
Friday, October 12, 2012
The Writing Counterrevolution
I. "The Writing Revolution"
There's an interesting but insidious article about writing instruction, "The Writing Revolution," in this month's Atlantic. The article tells the story of a high school on Staten Island that changed the way it taught writing and saw its test scores and graduation rates improve significantly.
The changes in the writing instruction don't seem unreasonable--an increased focus on argument and grammar, along with a heavy use of sentence stubs and frameworks (e.g. "I agree/disagree that_____, because _____")--and it seems possible that instituting a coherent writing and thinking curriculum as a big part of a schoolwide overhaul could be a big improvement in a bad school. Why, then, does the article so raise my hackles?
I think it's mainly because the article takes this one curricular shift and weaves it, with a lot of dangerously simplistic received ideas, into a standard narrative of recovering a lost golden age--in this case, the golden age of the 1950s. According to the article, the school's shift to "formal lessons in grammar, sentence structure and essay-writing" was a return to the ways that "would not be unfamiliar to nuns who taught in Catholic schools circa 1950." This counterrevolution (the article's headline is misleading) was necessary, according to the article's narrative, because misguided educational movements of the 60s, 70s and 80s had led schools away from teaching "the fundamentals" and toward a weak, pointless curriculum of "creative-writing" in a "fun, social context."
This long-term narrative is annoyingly untethered to any hard data. Were students better writers in the 1950s? I doubt it very much. The best data we have on long-term trends comes from the National Assessment of Educational Progress, which shows little change from 1971 to 2008--if anything, a gradual upward trend. If the story this Atlantic article is telling has much truth to it--if there was a shift in the 70s and 80s to a more fun and creative writing curriculum that ruined academic achievement across the country--then we should see NAEP scores going down. But they don't go down; they go (slightly) up! Here are the national NAEP reading scores for 13-year-olds, with a quick trend calculation after the numbers:
1971 255
1975 256
1980 258
1984 257
1988 257
1990 257
1992 260
1996 258
1999 259
2004 259
2008 260
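Here is that quick calculation--just a least-squares line through the table above:

```python
# Least-squares trend through the NAEP scores listed above.
import numpy as np

years  = np.array([1971, 1975, 1980, 1984, 1988, 1990, 1992, 1996, 1999, 2004, 2008])
scores = np.array([ 255,  256,  258,  257,  257,  257,  260,  258,  259,  259,  260])

slope, intercept = np.polyfit(years, scores, 1)
print(f"trend: {slope:+.3f} points per year")          # about +0.12
print(f"total change: {scores[-1] - scores[0]} points "
      f"over {years[-1] - years[0]} years")            # +5 over 37
```

About a tenth of a point a year--roughly a point per decade on NAEP's 500-point scale. Flat, with a faint upward drift. Not a collapse.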
The reading scores for 9-year-olds, who would have had less schooling, and 17-year-olds, who had more, tell basically the same story. Students in 1971 had not had much time to be ruined, as the article implies they were, by teachers who had studied Paulo Freire in Ed school, and yet they seem to have been no more literate than students in the nineties, or in 2008.
The particular story the Atlantic article tells, about one high school that changed (among other things) its writing instruction, is an interesting anecdote, one whose facts could be probed further (what else was changed at the school?) and whose meaning can be debated (even if the shift in writing curriculum was responsible for a dramatic improvement in academic achievement, is it possible that any coherent writing curriculum, even one that focused on personal or creative writing, could have had the same effect?). The interesting anecdote, however, is put in the context of a larger narrative that seems to be clearly and demonstrably wrong. It is simply not true that because misguided 60s and 70s pinkos, in the name of freedom, stopped teaching anything, student achievement plummeted. Whatever teachers were doing in the 70s, 80s and 90s, student achievement did not plummet.
II. "How Self-Expression Damaged My Students"
The main Atlantic article is accompanied by a shorter piece by a former "teacher" (the guy seems to have used a brief stint in the New York public schools as a stepping stone from a career in magazine publishing to a career in the Ed Reform industry), entitled "How Self-Expression Damaged My Students." This bizarre article likens the "Reader's and Writer's Workshop" approach (one that this guy used in his classroom) to a "cargo cult." In other words, the reading and writing his students did was, as he sees it, as totally pointless as the building of runways by primitive peoples who hoped that by imitating the form of an airfield they could bring back the airdrops of supplies and food that had come during the war. This comparison is so insane on so many levels that I am not going to take the time to analyze it.
Later in his article, apparently realizing that he has gone off the deep end, the author tries to reel himself back, writing, "Let me hasten to add that there should be no war between expressive writing and explicit teaching of grammar and mechanics," but he goes on to argue that "at present, we expend too much effort trying to get children to 'live the writerly life' and 'develop a lifelong love of reading.'" And he concludes by implying that it is ten times more important to teach grammar and mechanics than to try to get kids to love reading and writing by having them actually do it.
These people are all about data, but where is the data showing that grammar and mechanics "instruction" works better than just reading a lot? It sounds like the author of this second piece had his students spend way too much time on the writing process and not nearly enough time reading, but just because he was bad at workshop teaching does not mean that getting kids to develop a lifelong love of reading won't help them read and write better. It almost certainly will. Grammar and mechanics instruction, on the other hand, should be a small part of the curriculum.
III. What, then, to think? (Besides that the Atlantic is owned by right-wing crazies...)
I'm not sure what my own overarching narrative is (maybe that we're in the middle of a decades-long counterrevolution in which we are making the poor poorer, blaming them for the results of their poverty, and then telling them they ought to act more like they did in the 50s, when people respected their betters?), but I am pretty sure that these people in the Atlantic don't have the right one.
Friday, September 14, 2012
Research shows that "research-based" pedagogy doesn't work! (but discussion does!)
I. Excellent scholarship is multivalent
Great education scholarship does many things at once. McKeown, Beck and Blake's excellent paper about content pedagogy vs. skills pedagogy is a good example: they manage to show that content pedagogy can actually do a better job of teaching skills than a pedagogy that aims at direct instruction in those skills; but the paper also contains some remarkable data implying that what seems "natural" to teachers may be both unnatural and ineffective.
Another wonderful paper on reading is a 2006 classic by Martin Nystrand, called "Research on the Role of Classroom Discourse As It Affects Reading Comprehension". Nystrand's paper, a magisterial review of 150 years of research into American classroom discourse, argues powerfully that discussion-based teaching can produce significant gains in reading comprehension. The paper is worth reading for its main point, but it also sheds light on our era's focus on misguided reforms that claim to be "data-driven"--and as a bonus, Nystrand is a sharp, funny writer. In fact, I laughed aloud more than once while I was reading the paper last summer. At one point, my wife asked what I was laughing at, and I tried to explain:
In his paper, Nystrand says that the National Research Council (the NRC) and the Department of Education have declared that education is woefully unfounded in data-based research, being stuck instead, according to these august bodies, in "a 'folk wisdom' of education based on the experience of human beings over the millennia in passing information and skills from one generation to the next." Denigrating "the experience of human beings over the millennia" is kind of funny--education is a deep human enterprise, and most of the deepest human experiences (love, child-rearing) have rightly held modern "data-based research" at a skeptical arm's length--but it was Nystrand's next sentence that really cracked me up:
"While considerable recent work supports the NRC's contention that the education research base is little used in schools, research also strongly suggests that this conception of education research is, in the case of the pedagogical effects of classroom discourse, inappropriate, and even counterproductive." (393).
My wife didn't think it was funny. I tried to explain: in other words, I said, research does show that education is not often based on research, but research also shows that education that is based on research is inappropriate and counterproductive.
Ha!
II. The limitations of prescripted lesson plans; the limitations of rigidly "scientific" social research
Nystrand's whole paper, like that sentence, is excellent, both witty and wise. In fact he isn't really saying that all research is counterproductive, only research of the type privileged by the National Research Council, that is, research in which the relevant variables are strictly defined and rigidly controlled. For classroom practice to be strictly defined and rigidly controlled, it must follow prewritten scripts. The use of prewritten scripts, however, precludes a more flexible pedagogy, and more particularly precludes rich conversation of the kind Nystrand calls "dialogic" (after Bakhtin and Volosinov, and informed by Vygotsky--what is it with Russians?).
I'm very interested in this "dialogic" mode of classroom discourse, and how it relates to what my colleagues and I are trying to do this year with our focus on close reading of brief passages, but first I had to think about the implications of Nystrand's point about research. I have been curious about, and confounded by, the inadequacy of a lot of "research-based" pedagogy, and Nystrand's point about education research gave me a new perspective on why the research is so weak.
If a process is standardized, it is much easier to reliably measure results. To produce a more scientifically rigorous and reliable conclusion, then, you want to compare one type of highly standardized instruction with another. For instance, if you're comparing a skills approach to a content discussion approach, you don't want your data muddied by the way different teachers implement the two approaches. If you have two teachers doing each approach, you want both skills teachers teaching the skills in exactly the same way, and you want the two content discussion teachers running the content discussions in exactly the same way.
However, Nystrand is arguing, standardization itself may be bad pedagogy. In the example of a study comparing a skills approach to a content approach, neither approach might be as good as one that allows for more open-ended discussion. Any script, he suggests, might be too limiting.
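You can watch this dynamic in a toy power simulation (invented numbers; this sketches the general statistical logic, not anything in Nystrand's paper). Give a tightly scripted program a smaller true effect with little teacher-to-teacher variation, give a flexible discussion-based approach a larger true effect that depends heavily on the teacher, and run many simulated studies of each:

```python
# Toy power simulation, invented numbers: a scripted program with a smaller
# true effect can "reach significance" more reliably than a flexible one
# with a larger but teacher-dependent effect.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

def detection_rate(true_effect, teacher_sd, n_classes=20, n_students=25, trials=2000):
    """Share of simulated studies in which a t-test on classroom means is
    significant at the .05 level. Outcomes are in student-SD units, so each
    classroom mean carries sampling noise of 1/sqrt(n_students)."""
    noise = 1 / np.sqrt(n_students)
    hits = 0
    for _ in range(trials):
        treat = true_effect + rng.normal(0, teacher_sd, n_classes) + rng.normal(0, noise, n_classes)
        ctrl = rng.normal(0, teacher_sd, n_classes) + rng.normal(0, noise, n_classes)
        hits += ttest_ind(treat, ctrl).pvalue < 0.05
    return hits / trials

print(detection_rate(true_effect=0.2, teacher_sd=0.05))  # scripted: ~0.85
print(detection_rate(true_effect=0.4, teacher_sd=0.50))  # flexible: ~0.6
```

The scripted program, with half the true effect, clears the significance bar far more consistently. A research literature built out of whatever is easiest to measure will keep recommending scripts, even when the noisier, more flexible practice is doing more good.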
In some ways, this seems to support my suspicion of Common Core standards and indeed all backwards lesson planning in English class: if you know exactly where you are heading in a lesson from the very beginning, how can you be excited about the lesson, and how can you make it exciting?
III. The (Research-Proven) Benefits of Discussion-Based Pedagogy
His insight into the limitations of much education research is, however, basically an aside; Nystrand's paper is largely about the significant benefits of discussion-based pedagogy. Ironically, he makes his case by citing dozens of empirical studies, including more than one showing that discussion leads to large gains in reading comprehension. He also says that the best discussions are those in which the teacher controls text and topic but students have both time and interpretive control.
According to Nystrand, the virtues of discussion-based pedagogy, rather than lecture or recitation, have been reiterated in American scholarship on teaching since at least 1860, when Morrison complained that "young teachers are very apt to confuse rapid-fire question and answer with effective teaching." One of the great virtues of Nystrand's paper is his review of the history of American classroom discourse. According to Nystrand, it is well-documented that the prevalent form of classroom discourse for over a hundred years has been "recitation," that is, rapid-fire question and answer about the basic facts of what the students have already read. Some writers celebrated this, but many argued that recitation was essentially pointless, and that discussion was a much better practice. Nystrand quotes Bloom (1956) as saying that in his observation over 50% of instructional time in American classrooms was taken up with teachers talking.
This is still true, apparently. In a helpful overview of the state of English instruction, Nystrand reports on research finding that 95% of ELA teachers value discussion, and their students report that discussion helps them understand their readings--yet only 33% of the teachers regularly make room for it. Indeed, Nystrand writes, other researchers found that in the 58 ninth grade classes they observed, only five classes had any group work, and 90% of this group work was merely collaborative seatwork. In an even more disturbing finding, the same study reported that in these 58 ninth grade classrooms, open-ended whole class discussion averaged only 15 seconds a day.
This is disturbing, but perhaps not particularly surprising. Fostering "open-ended discussion" does not come naturally to many teachers, including me at times. But it is incredibly important to do so, partly because a "discussion-based" classroom results in greater recall and comprehension, but also because discussion-based classrooms help build what Nystrand calls "classroom epistemology" and what my colleagues and I have been calling "habits of mind."
First, Nystrand's own research and the research of others, involving thousands of classroom observations, shows that the amount of classroom discussion is strongly correlated with improved recall and comprehension of the reading, and, interestingly, a greater response to the "aesthetic elements" of literature. Second, a "dialogic" classroom also creates a culture of thinking more deeply and helps students develop an "identity" as a reader--which will indirectly help students over the longer term. As Nystrand writes, citing dozens of studies, it is "the conversations teachers lead with their students" that define the way students think about literature.
IV. How we should put "dialogism" into practice in our classrooms
In short, Nystrand says that research shows that we should control text and task (and we should prefer problematic and difficult passages (duh!)), and we should give students time and interpretive responsibility. According to Nystrand, a recent meta-analysis of 49 studies found that the most productive discussions were clearly framed by the teacher, but gave students extended time to elaborate their ideas and allowed for considerable flexibility in what students actually said. Nystrand also suggests that pair discussion is not as good as larger group discussions.
This fall, I am planning to put some of these ideas into practice. In addition to high-volume independent reading, my ninth grade classes this year are going to be largely focused on brief passages. In addition to writing short analytical essays about such passages, and partly to prepare them for the writing of the papers, they will have extended large-group discussions about brief passages more than once a week (as Nystrand suggests, I will mostly retain control of text and topic). My hope is not only that these discussions will help their reading comprehension and their analytical abilities, but that the discussions will also foster a culture of thoughtful discourse and intellectual and aesthetic curiosity. We'll see how it goes.
I was going to stop this section there--but then it occurred to me that "seeing how it goes" is far from a simple concept. In fact, I wish I had a better handle on how to know whether changes I make in my teaching actually make any difference to student learning--but what I see over and over again is that even people who are very interested in data, like Bruce Baker of the excellent blog schoolfinance101, are very suspicious of the validity and reliability of evaluating teachers based on test scores. Is my own classroom a large enough sample for me to have much confidence that any improvement in my students' performance is actually due to my own efforts? Hm... let me try a quick check.
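Suppose my class of 25 improves against a comparison class of 25, and suppose the true effect of my changes is a respectable d of 0.3. How often would an ordinary significance test even notice? A simulation (assuming those made-up but plausible numbers) answers quickly:

```python
# How often does a t-test detect d = 0.3 with one class of 25 per group?
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
trials, n, d = 5000, 25, 0.3
hits = sum(
    ttest_ind(rng.normal(d, 1, n), rng.normal(0, 1, n)).pvalue < 0.05
    for _ in range(trials)
)
print(hits / trials)  # roughly 0.18
```

Less than a fifth of the time. One classroom is far too small a sample to settle much on its own. In the meantime, let me go back to my conversation with my wife, after I laughed aloud at Nystrand's wit.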
Coda: Is research-based reform ever helpful?
I didn't get to talk about discussion-based pedagogy with my wife--and it actually took me a while to get back to the article itself, because after I laughed at Nystrand's sentence about research showing that education based on research is inappropriate and counterproductive, she asked a very good question: Well, when is research-based reform not counterproductive?
Um, maybe in automobile safety, I said.
My wife just looked at me. It was a witty look, and I laughed, because I could see what she was thinking. Automobile safety is a perfect example to use, since decades of research and data-based improvements in safety have resulted in a system that still produces nearly 40,000 violent deaths in the United States every year. It's possible, said her look, that what's needed is not evidence-based tinkering but a paradigm shift.
Okay, obviously I should have offered a medical example--knee-replacement surgery, maybe--in which research-based reform is useful. Still, my wife's response was a wise one: what is necessary in the case of automobile safety, as well as what would be helpful in the case of reading, is to look, not more closely at the component parts of the system, but at the system overall. This is in a way what Nystrand attempts to do in his article about classroom discourse, suggesting that we need to get the sage off the stage. I wonder if we might also look at the overall system by looking at so called "natural experiments"--for instance by trying to look at the differences between the Finnish educational system and that in the US, or the automobile safety system in Holland as compared to that in the US, in something of the way a recent excellent post on the New York Times's economics blog discusses teacher salaries. Another way of looking at the overall system is by comparing it to other systems--by comparing education to health care, say. I hope to explore these comparisons a bit in future posts.
Great education scholarship does many things at once. McKeown, Beck and Blake's excellent paper about content pedagogy vs. skills pedagogy is a good example: they manage to show that content pedagogy can actually do a better job of teaching skills than a pedagogy that aims at direct instruction in those skills; but the paper also contains some remarkable data implying that what seems "natural" to teachers may be both unnatural and ineffective.
Another wonderful paper on reading is a 2006 classic by Martin Nystrand, called "Research on the Role of Classroom Discourse As It Affects Reading Comprehension". Nystrand's paper, a magisterial review of 150 years of research into American classroom discourse, argues powerfully that discussion-based teaching can produce significant gains in reading comprehension. The paper is worth reading for its main point, but it also sheds light on our era's focus on misguided reforms that claim to be "data-driven"--and as a bonus, Nystrand is a sharp, funny writer. In fact, I laughed aloud more than once while I was reading the paper last summer. At one point, my wife asked what I was laughing at, and I tried to explain:
In his paper, Nystrand says that the National Research Council (the NRC) and the Department of Education have declared that education is woefully unfounded in data-based research, being stuck instead, according to these august bodies, in "a 'folk wisdom' of education based on the experience of human beings over the millennia in passing information and skills from one generation to the next." Denigrating "the experience of human beings over the millennia" is kind of funny--education is a deep human enterprise, and most of the deepest human experiences (love, child-rearing) have rightly held modern "data-based research" at a skeptical arm's length--but it was Nystrand's next sentence that really cracked me up:
"While considerable recent work supports the NRC's contention that the education research base is little used in schools, research also strongly suggests that this conception of education research is, in the case of the pedagogical effects of classroom discourse, inappropriate, and even counterproductive." (393).
My wife didn't think it was funny. I tried to explain: in other words, I said, research does show that education is not often based on research, but research also shows that education that is based on research is inappropriate and counterproductive.
Ha!
II. The limitations of prescripted lesson plans; the limitations of rigidly "scientific" social research
Nystrand's whole paper, like that sentence, is excellent, both witty and wise. In fact he isn't really saying that all research is counterproductive, only research of the type privileged by the National Research Council, that is, research in which the relevant variables are strictly defined and rigidly controlled. For classroom practice to be strictly defined and rigidly controlled, it must follow prewritten scripts. The use of prewritten scripts, however, precludes a more flexible pedagogy, and more particularly precludes rich conversation of the kind Nystrand calls "dialogic" (after Bakhtin and Volosinov, and informed by Vygotsky--what is it with Russians?).
I'm very interested in this "dialogic" mode of classroom discourse, and in how it relates to what my colleagues and I are trying to do this year with our focus on close reading of brief passages--but first I had to think about the implications of Nystrand's point about research. I have long been curious about, and confounded by, the inadequacy of a lot of "research-based" pedagogy, and Nystrand's point about education research gave me a new perspective on why the research is so weak.
If a process is standardized, it is much easier to reliably measure results. To produce a more scientifically rigorous and reliable conclusion, then, you want to compare one type of highly standardized instruction with another. For instance, if you're comparing a skills approach to a content discussion approach, you don't want your data muddied by the way different teachers implement the two approaches. If you have two teachers doing each approach, you want both skills teachers teaching the skills in exactly the same way, and you want the two content discussion teachers running the content discussions in exactly the same way.
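To make that intuition concrete, here is a minimal simulation--my own sketch, with invented numbers, nothing drawn from Nystrand's paper. It assumes scores in standard-deviation units, two teachers per approach, twenty-five students per class, and a real but modest advantage for one approach, and it asks how often that advantage even shows up with the right sign once teachers are allowed to vary in how they implement their assigned approach:

```python
import numpy as np

rng = np.random.default_rng(0)

def observed_gap(teacher_sd, n_teachers=2, n_students=25, true_gap=0.3):
    """One simulated study: a skills arm and a content-discussion arm,
    with n_teachers teachers per arm. Student scores are in standard-
    deviation units; each teacher adds a random 'implementation effect'
    with standard deviation teacher_sd."""
    arm_means = []
    for arm_effect in (0.0, true_gap):  # skills arm, then content arm
        scores = []
        for _ in range(n_teachers):
            teacher_effect = rng.normal(0.0, teacher_sd)
            scores.append(rng.normal(arm_effect + teacher_effect, 1.0, n_students))
        arm_means.append(np.concatenate(scores).mean())
    return arm_means[1] - arm_means[0]

for sd in (0.0, 0.5):  # 0.0 = perfectly scripted; 0.5 = teachers improvise
    gaps = [observed_gap(sd) for _ in range(2000)]
    frac = np.mean([g > 0 for g in gaps])
    print(f"teacher variation {sd}: content arm looks better "
          f"in {frac:.0%} of 2000 simulated studies")
```

With scripted teachers, the simulated study gets the direction right about nine times in ten; let each teacher improvise, and it drops to roughly seven in ten. You can see why researchers prefer scripts.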
However, Nystrand is arguing, standardization itself may be bad pedagogy. In the example of a study comparing a skills approach to a content approach, neither approach might be as good as one that allows for more open-ended discussion. Any script, he suggests, might be too limiting.
In some ways, this seems to support my suspicion of Common Core standards and indeed all backwards lesson planning in English class: if you know exactly where you are heading in a lesson from the very beginning, how can you be excited about the lesson, and how can you make it exciting?
III. The (Research-Proven) Benefits of Discussion-Based Pedagogy
His insight into the limitations of much education research is, however, basically an aside; Nystrand's paper is largely about the significant benefits of discussion-based pedagogy. Ironically, he makes his case by citing dozens of empirical studies, including more than one showing that discussion leads to large gains in reading comprehension. He also says that the best discussions are those in which the teacher controls text and topic but students have both time and interpretive control.
According to Nystrand, the virtues of discussion-based pedagogy, rather than lecture or recitation, have been reiterated in American scholarship on teaching since at least 1860, when Morrison complained that "young teachers are very apt to confuse rapid-fire question and answer with effective teaching." One of the great virtues of Nystrand's paper is his review of the history of American classroom discourse. According to Nystrand, it is well-documented that the prevalent form of classroom discourse for over a hundred years has been "recitation," that is, rapid-fire question and answer about the basic facts of what the students have already read. Some writers celebrated this, but many argued that recitation was essentially pointless, and that discussion was a much better practice. Nystrand quotes Bloom (1956), who observed that over 50% of instructional time in American classrooms was taken up with teachers talking.
This is still true, apparently. In a helpful overview of the state of English instruction, Nystrand reports on research finding that 95% of ELA teachers value discussion, and their students report that discussion helps them understand their readings--yet only 33% of the teachers regularly make room for it. Indeed, Nystrand writes, other researchers found that in the 58 ninth grade classes they observed, only five classes had any group work, and 90% of this group work was merely collaborative seatwork. In an even more disturbing finding, the same study reported that in these 58 ninth grade classrooms, open-ended whole class discussion averaged only 15 seconds a day.
This is disturbing, but perhaps not particularly surprising. Fostering "open-ended discussion" does not come naturally to many teachers, including me at times. But it is incredibly important to do so, partly because a "discussion-based" classroom results in greater recall and comprehension, but also because discussion-based classrooms help build what Nystrand calls "classroom epistemology" and what my colleagues and I have been calling "habits of mind."
First, Nystrand's own research and the research of others, involving thousands of classroom observations, show that the amount of classroom discussion is strongly correlated with improved recall and comprehension of the reading and, interestingly, with a greater response to the "aesthetic elements" of literature. Second, a "dialogic" classroom also creates a culture of thinking more deeply and helps each student develop an "identity" as a reader--which will indirectly help students over the longer term. As Nystrand writes, citing dozens of studies, it is "the conversations teachers lead with their students" that define the way students think about literature.
IV. How we should put "dialogism" into practice in our classrooms
In short, Nystrand says the research shows that we should control text and task (preferring problematic, difficult passages--duh!), and that we should give students time and interpretive responsibility. According to Nystrand, a recent meta-analysis of 49 studies found that the most productive discussions were clearly framed by the teacher but gave students extended time to elaborate their ideas and allowed considerable flexibility in what students actually said. Nystrand also suggests that pair discussion is not as good as discussion in larger groups.
This fall, I am planning to put some of these ideas into practice. In addition to high-volume independent reading, my ninth grade classes this year are going to be largely focused on brief passages. In addition to writing short analytical essays about such passages, and partly to prepare them for the writing of the papers, they will have extended large-group discussions about brief passages more than once a week (as Nystrand suggests, I will mostly retain control of text and topic). My hope is not only that these discussions will help their reading comprehension and their analytical abilities, but that the discussions will also foster a culture of thoughtful discourse and intellectual and aesthetic curiosity. We'll see how it goes.
I was going to stop this section there--but then it occurred to me that "seeing how it goes" is far from a simple concept. In fact, I wish I had a better handle on how to know whether changes I make in my teaching actually make any difference to student learning--what I see over and over again is that even people who are very interested in data, like Bruce Baker of the excellent blog schoolfinance101, are deeply suspicious of the validity and reliability of evaluating teachers based on test scores. Is my own classroom a large enough sample for me to have much confidence that any improvement in my students' performance is actually due to my own efforts? Hm... here is a first back-of-the-envelope attempt at an answer.
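The calculation below is my own, with an assumed class size of twenty-five--nothing here comes from Baker or Nystrand. It just asks how big a year-over-year improvement would have to be before it stood out from ordinary sampling noise:

```python
import math

# Compare this year's class to last year's, treating each as a
# sample of n students whose scores are in standard-deviation units.
n = 25  # assumed class size

# Standard error of the difference between the two class means
se_diff = math.sqrt(1 / n + 1 / n)

# A gap of roughly two standard errors is about the smallest
# difference distinguishable from chance.
detectable = 2 * se_diff

print(f"standard error of the difference: {se_diff:.2f} sd")
print(f"smallest reliably detectable gain: ~{detectable:.2f} sd")
```

That comes to roughly 0.57 standard deviations--a huge effect by education-research standards, where a gain of 0.2 or 0.3 is considered substantial. So the honest answer is probably no: one classroom is too small a sample to detect the modest improvements I can realistically hope for. In the meantime, let me go back to my conversation with my wife, after I laughed aloud at Nystrand's wit.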
Coda: Is research-based reform ever helpful?
I didn't get to talk about discussion-based pedagogy with my wife--and it actually took me a while to get back to the article itself, because after I laughed at Nystrand's sentence about research showing that education based on research is inappropriate and counterproductive, she asked a very good question: Well, when is research-based reform not counterproductive?
Um, maybe in automobile safety, I said.
My wife just looked at me. It was a witty look, and I laughed, because I could see what she was thinking. Automobile safety is a perfect example to use, since decades of research and data-based improvements in safety have resulted in a system that still produces nearly 40,000 violent deaths in the United States every year. It's possible, said her look, that what's needed is not evidence-based tinkering but a paradigm shift.
Okay, obviously I should have offered a medical example--knee-replacement surgery, maybe--in which research-based reform is useful. Still, my wife's response was a wise one: what is necessary in the case of automobile safety, and what would be helpful in the case of reading, is to look not more closely at the component parts of the system but at the system overall. This is in a way what Nystrand attempts to do in his article about classroom discourse, suggesting that we need to get the sage off the stage. I wonder if we might also look at the overall system through so-called "natural experiments"--for instance, by examining the differences between the Finnish educational system and that of the US, or the automobile safety system in Holland as compared to that of the US, in something of the way a recent excellent post on the New York Times's economics blog discusses teacher salaries. Another way of looking at the overall system is by comparing it to other systems--by comparing education to health care, say. I hope to explore these comparisons a bit in future posts.