I had hoped (perhaps foolishly) that this report would provide some information about the way Seattle hires, pays, and manages its teachers, and some ideas about how to make the system support good teaching more effectively. I did come away with plenty of information and ideas, for sure -- but also many large grains of salt, a raised eyebrow or two, and a new, depressing understanding of the depth & breadth of the yawning chasm of mistrust between teachers and reform advocates.
The NCTQ report contains many helpful suggestions, such as giving bigger raises early in a teacher’s career when their skills are improving fastest, and eliminating "superseniority" that allows some teachers to choose a position over the objections of the principal. There are ideas to consider, at least: delaying tenure so the decision is not made, as now, before the end of a teacher’s second year, or increasing the length of the school year & day so they’re more in line with neighboring districts. Then there were a couple of gratuitous slaps in the face, like the recommendation that we forbid teachers from taking personal days on Mondays or Fridays. But there were two concepts that really seemed to stir up people’s feelings: restructuring teacher pay to reward effective teaching in the classrooms that need it most, and making student learning a factor in teacher evaluations.
Let me just say before I go any further that I am not opposed to either of these concepts -- they make a lot of sense, as concepts. I also see why teachers tend to react to them with wariness, if not downright hostility (as they did to related proposals in HB2261, which passed the legislature this year). Even the most marvelous concepts can be implemented disastrously (communism, anyone?). And quantifying "teacher effectiveness" -- by using test scores to measure it or money to reward it -- is a particularly tricky business.
Ms. Walsh acknowledged this problem right off the bat. "We know that teachers matter a lot," she said. "But little that makes them effective can be measured or predicted." She accompanied this statement with a weirdly simplistic, vaguely tautological graph.
The graph showed how the effect of good and bad teaching can be compounded over the course of a student’s education. If an average teacher produces, on average, one year’s worth of academic growth per year, and a highly effective teacher produces, on average, a year and a half’s worth of progress in the same time, then a student who has had highly effective teachers throughout his or her schooling will finish 8th grade knowing as much as a 12th grader taught by average teachers. And a student whose teachers have only produced half a year of learning each year will finish 8th grade with a 4th grade education.

You can see the appeal of the quantitative approach, can’t you? Numbers are so easy to work with: 1.5 x 8 = 12. 0.5 x 8 = 4. Just plug them in, plot them out, and presto! Those clean, clear lines can hide all the limitations of your quantitative data, the difference between statistical averages and actual students and teachers, and the myriad factors involved in student learning that are way outside a teacher’s control. (I was reminded of the Ph.D. thesis I once read that calculated "change in neighborhood character" to the second decimal point based on a six-question survey of three Columbia City residents. I so wanted to know: which three?)
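The graph's entire arithmetic can be sketched in a few lines. This is a toy model, assuming a constant per-year learning rate, which is exactly the simplification the graph leans on:

```python
# Toy model of the NCTQ graph's arithmetic: learning accumulates
# linearly at a constant per-year rate. The "rate" values below are
# the ones from the presentation (1.5, 1.0, 0.5 years of growth per year).
def grade_level_after(years, rate):
    """Grade-level equivalent reached after `years` of schooling,
    assuming a constant `rate` of academic growth per year."""
    return years * rate

print(grade_level_after(8, 1.5))  # 12.0 -- "12th grade" knowledge at the end of 8th grade
print(grade_level_after(8, 0.5))  # 4.0  -- a "4th grade" education at the end of 8th grade
```

Which is the whole point: the model is so simple it fits in one multiplication, and everything messy about real classrooms lives outside it.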
I don’t think anyone would dispute the idea that kids learn more with fabulous teachers, less with lousy ones. But there’s this incredible variation that goes into the "average" progress a group of kids might make. And so much of that variation is caused by things entirely unrelated to what the teacher is doing: class size, mandated curriculum, administrative support, parental involvement, you name it. There was one year when Josie went from the lowest reading group in her class to the highest because she got into a Powerful Readers tutoring program. (Or was it because we finally figured out she needed glasses? The fact that she found a best friend that year might have had something to do with it too, now that I think about it.) Another year she had to do 4th grade math over again in 5th grade because there wasn’t an extra teacher around to pull the 5th graders out of her mixed-age classroom for separate math time.
We know that standardized test scores correlate as highly with socioeconomic status and parental education levels as anything else. We can mostly control for the fact that some teachers are teaching small classes of rich white kids whose parents help them with their physics homework, while other teachers are working in overcrowded classrooms full of poor kids of color whose parents struggle to pull off a daily bedtime story. But what about all those other factors: the tutoring, the glasses, the best friend? Can we assume they will get "averaged out" for an individual teacher? Even the most overcrowded classroom is a pretty small sample size, after all.
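The sample-size worry can be made concrete with a quick simulation. This is a sketch, not real data: the class size, the teacher's "true" effect, and the amount of student-level noise are all invented to illustrate the point:

```python
# Why one classroom is a small sample: even if ten classes are taught by
# the same teacher with the same true effect (1.0 year of growth), the
# observed class averages wander because of per-student noise from
# factors outside the teacher's control (tutoring, glasses, best friends...).
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def class_average_gain(n_students, true_effect, noise_sd):
    """Mean observed learning gain for one class: the teacher's true
    effect plus random per-student noise, averaged over the class."""
    gains = [true_effect + random.gauss(0, noise_sd) for _ in range(n_students)]
    return sum(gains) / n_students

# Ten classes of 30, all taught by the same "1.0 year of growth" teacher:
gains = [class_average_gain(30, 1.0, 1.0) for _ in range(10)]
print([round(g, 2) for g in gains])  # class averages scatter around 1.0
```

With 30 students and a noise level comparable to the effect being measured, the class average has a standard error of about 0.18 years, so identical teachers can easily look like "1.3-year" or "0.7-year" teachers in any given year.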
So there are all these very real questions about exactly how we're going to measure student achievement, whether we can find a fair way to link it to teacher effectiveness, and whether we can really quantify either of these things reliably enough to use them as the basis for someone’s paycheck. I don’t blame teachers for wanting answers to these questions before they can embrace a new evaluation rubric or compensation structure.
And yet… It doesn’t make much sense to evaluate teachers and allocate their salaries without looking at student achievement data at all -- which is what we do now, apparently.
The NCTQ found that Seattle has one of the most Byzantine teacher pay scales in the country, with nine "lanes" to navigate on your way to the top. And the way teachers move up the scale is through coursework -- taking classes -- up to 155 credits plus a Masters degree. The NCTQ calculates that Seattle Public Schools spends nearly $48 million a year on this incentive.
It sounds like this is not such a great use of that money. Ms. Walsh presented another graph showing the results of some 17 studies that have concluded that more coursework under a teacher’s belt does not translate into more student achievement in the classroom. This graph seemed to have a little more substance behind it than the previous one, but it still left a lot of questions open. (Like the chart in the book I’m reading about redlining that shows a clear racial disparity in home mortgage denials… These numbers tell us we have a problem, but they don’t tell us exactly where the problem lies -- Real estate agents? Banks? Historic barriers to personal savings in the African American community? -- or what to do to fix it.)
Why isn’t all this continuing education making teachers more effective? Does the link between coursework and pay raises really motivate teachers to choose, as Ms. Walsh put it, the "cheapest, quickest credits they can find"? Is it worth trying to distinguish different kinds of coursework, to see which ones might have some value, or should we really just assume it’s all useless? The professional development teachers do together as a staff, for instance -- work that ties directly to their classroom teaching & instructional goals -- my teacher friends tell me this is immensely valuable for them, and directly affects the kids’ learning too. Can we talk to teachers about what they think they’re getting out of these classes?
Of course we want teachers to continue to learn and grow throughout their careers -- continuing education is a requirement for most professions, after all. But it is not usually tied so directly to pay. And if the coursework teachers are doing isn’t having an effect on their ability to teach the kids, then maybe we should think about doing something else with that money. What if we spent $48 million on mentoring and supporting new or struggling teachers, or providing extra resources to teachers in poor schools in order to reduce turnover, or helping teachers get National Board certification (a process that apparently does result in better teaching -- or, at least, higher test scores)? As Superintendent Maria Goodloe-Johnson said in her remarks after Ms. Walsh’s presentation: "If this isn’t working, we should change it. And we should change it together. This should be the start of a dialogue."
Which brings us back to that yawning chasm of mistrust. It’s really hard to have a fruitful dialogue with someone when you’re both feeling bitter, betrayed, attacked, and overwhelmed. If the school district and the teachers union are going to sit down and completely restructure the teachers’ pay scale and evaluation process, they’re going to have to repair their relationship and build some trust in each other before they can even begin.
I thought of this a few days later when I sat in on a demonstration of Parametrix’s intriguing little tool, the "Neighborhood Cohesion Calculator." (I know, I know, doesn’t it sound ridiculous?) It’s essentially a spreadsheet that allows you to identify and weight various factors that contribute to neighborhood cohesion, then rate your neighborhood on each one and see where you come out. It was developed as a way to take neighborhood cohesion into account when choosing where to put a freeway or a light rail station -- which of the alternatives we’re considering would have the least devastating effect on the affected neighborhoods? But Parametrix is looking for other possible applications.
What struck me about the Neighborhood Cohesion Calculator was how much qualitative discussion has to take place before you can start in on the quantification. Your first step is to convene a group of stakeholders and leaders who can work together to decide exactly what to measure, how to measure it, and what to do with the results -- hopefully people whose opinions/authority on these matters would be respected by most of their neighbors.
Then the qualitative work can begin. Parametrix has identified a dozen or so neighborhood cohesion factors (schools, gathering places, neighborhood organizations, etc.), but the Calculator guides the group through a discussion to refine each one, distinguishing, for instance, between a neighborhood elementary school where local families are likely to run into each other regularly and a magnet high school where this is less likely to occur. There is an option to define your own neighborhood cohesion factors if your group decides that Parametrix left out something important. Then each factor has to be weighted – even if you’ve got the world’s most fabulous park, say, how much does it really contribute to neighborhood cohesion? And then you’ve got to agree on how you’re going to rate these factors for any given neighborhood: What makes a neighborhood newsletter stellar, as opposed to just okay -- Content? Frequency? Circulation?
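At its core, a tool like this presumably reduces to a weighted average of the group's ratings. Here's a minimal sketch of that mechanic, with the factor names, weights, and ratings invented for illustration; the post doesn't describe Parametrix's actual formula:

```python
# Hypothetical sketch of a weighted-factor score like the Neighborhood
# Cohesion Calculator's. The stakeholder group supplies the weights
# (how much each factor matters) and the ratings (how the neighborhood
# scores on each factor, here on a 0-10 scale).
def cohesion_score(weights, ratings):
    """Weighted average of per-factor ratings."""
    total_weight = sum(weights.values())
    return sum(weights[f] * ratings[f] for f in weights) / total_weight

weights = {"schools": 3, "gathering_places": 2, "newsletter": 1}  # invented
ratings = {"schools": 8, "gathering_places": 5, "newsletter": 9}  # invented
print(round(cohesion_score(weights, ratings), 2))  # 7.17
```

The division at the end is trivial; as the consultant pointed out, all the real work is in the arguments over what belongs in those two dictionaries.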
The consultant from Parametrix pointed out that this whole qualitative exercise is often the most useful part of the process. Just having the conversation about what’s important, and what makes those things important, is way more useful than the number the Calculator spits out at the end. He even said that when the number does get spat out, many groups decide to go back and adjust the parameters, or their weighting, or even the geographic boundaries of the neighborhood, to make the results better reflect the reality they know and live in every day.
Of course, this kind of collaborative discussion is time-consuming and potentially contentious. But it seems to me like a plausible model for any process that seeks to quantify something as qualitative as "neighborhood cohesion" or "teacher quality." And I think if you don’t have this kind of conversation before you start grabbing whatever data you’ve got handy and plugging it into a grid, your conclusions will always be suspect, as likely to fan the flames of whatever conflicts exist, as they are to point a way forward that everyone will embrace.
I think a process like this could be fruitful for a collaborative group of teachers and administrators who are trying to develop, for instance, a way to incorporate student achievement into teacher evaluation and compensation. Unfortunately, my experience of the school district’s decision-making process doesn’t lead me to believe they are likely to adopt such an approach, despite Dr. Goodloe-Johnson’s invitation to "dialogue."
Just the other day the District sent home a survey: apparently they want our input about the new attendance area boundaries they are proposing for neighborhood schools. The survey had two questions: How important is it to you that students attend school closer to where they live? And, How important is it to you that attendance boundaries are developed through a data-driven process? (I won’t dwell on the fact that neither of these questions addressed the issue they must know is foremost in the minds of all South Seattle parents: what if the best schools are located at the other end of town, and the ones near you all seem to be struggling?)
I had to laugh. Data-driven? As opposed to what? Astrology? Throwing paintballs at the map?
How about this, folks: Let’s have a process that’s driven by people. Or rather, let’s admit that that’s what’s going on anyway, whether we’re drawing new attendance maps or devising new pay structures for teachers. Let’s do our best to make sure that the people driving the process are knowledgeable, thoughtful, and equipped with the time and resources they need. Let’s make sure they represent, or at least understand, the many perspectives on the topic at hand. Let’s hope they are people who can get us all on board the same bus, headed in a direction we all agree on. People who know that an issue you sidestep at the beginning of the process is going to come roaring back triple strength to bite you in the butt at the end. People who understand the power of data, yes -- but also how destructive numbers can be without a shared context that gives them meaning.
Can we do that? I sure hope so.
2 comments:
Thank you for this thoughtful piece. And thanks for attending that meeting. I was unable to, but I'm not sure I could have stomached it if I had.
And that survey... It was so ridiculous and laughable that I haven't figured out how to respond to it. It's sitting on my kitchen counter mocking me every time I look at it. Obviously they want to use the results as future statistical ammunition ("clearly, 75% of people responded that they wanted the decision made by a data-driven process, which is how we proceeded".) Aaaaaargh.
I do wish you could be in charge.
Wow, another amazing piece.
Glad to see that the Neighborhood Cohesion Calculator experiment was food for thought.
In my schools/PTA world I often wonder naively why things can't simply be made better. Your comment about how innocent questions can raise hackles on long-standing, sensitive issues reminds me that solutions can't simply be thrust on the situation, but need to be established in trust (or some approximation of it).