We recently held a web session on how to evaluate professional learning, hosted by Kirk Vandersall from Arroyo Research Services, Joellen Killion from the National Staff Development Council, and Travis Colton from the American Productivity and Quality Center. Audio will be posted in the near future, but please check out the transcript until then.
Kirk Vandersall – Arroyo Research Services
Concerns and Challenges:
The most common concern is models: how models are used and which model is appropriate for looking at PD. The study will look at how models are used in practice and how they align with day-to-day workings.
- How evaluating professional learning can be operationalized to be used every day in PD.
- Models are meant to identify specifically where you need to measure, to make sure PD is leading to appropriate changes in teacher understanding and practice.
- Every district has different strategic initiatives, so its model will look quite different. Models are meant to identify what the district’s theory of change is and the expected steps that could later lead to classroom changes or student achievement.
Challenges in the study:
Challenge 1. How to figure out recommended ways that evaluation can fit in everyday practice in the district.
Challenge 2. Organizational structure is an issue (also in 1st study). PD may be housed under HR or Curriculum and Instruction or the Asst. Superintendent for Professional Learning.
Challenge 3. Bringing a stronger scientific basis to evaluating professional development while also keeping it simple, useful, and powerful.
Case example: One district attempted to establish seven initiatives in professional development. Problems with multiple initiatives:
1. Keeping track of the multitude of initiatives.
2. Keeping it simple so findings can be useful and powerful.
Challenge 4. Identifying the appropriate unit of analysis.
Case example: One district wanted a unique survey for each initiative, but individual surveys are inadvisable and not feasible. Professional development needs evaluation at the course or experience level, but evaluation should sit more at the level of a strategic initiative; defining the broader change a district desires, and what that means, is a challenge.
Case example: A district wanted surveys for each PD initiative across different schools. It wound up doing a content analysis, identifying what it wanted to accomplish and what that would look like for a teacher. You really have to think of it in terms of “What would those initiatives look like for a teacher? What would teachers be doing, or doing differently?” The district implemented quarterly surveys for teachers that ask about goals rather than about specific initiatives separately, and evaluated the change throughout the year rather than per course.
Two additional things:
1. Many districts are struggling with transforming professional development from a course-based system to broader professional learning across the board. New professional development evaluation models should involve more school-focused initiatives and be self-directed. Traditional models for evaluating professional development don’t do very well at assessing teacher growth in all environments, and they need to identify appropriate data and feed that back into decision making in appropriate and useful ways.
2. Focusing on a teacher continuum of growth. Need a strategy for identifying teacher competencies and their growth on a continuum across their career. It’s difficult finding strategies that can be carried across districts but that’s part of what we’re looking for.
Joellen Killion – National Staff Development Council
Critical Points/Issues
- What is it that we are trying to evaluate? PD is shifting to be less formal and more job-embedded, ongoing and continuous, such that the models we might have used for evaluation need to change along with our approaches. What is it that we’re trying to demonstrate makes an impact on student learning?
- Process vs. Results: We have become savvy in the evaluation arena at validating that processes are in place and that teachers are participating. But the issue isn’t the process alone; it is process plus results. Are teachers engaged in PLCs (professional learning communities), what is the degree of engagement, how are the PLCs influencing classroom practice, and are they paying off in terms of student achievement?
- Issue of “Data Burden”: We feel like we need (and sometimes do need) to collect a great deal of information, which creates a tremendous burden. Rather than create more data burden, we need to figure out what information is naturally available and use it to answer our questions.
- NSDC discovered that the constituents in a school system all had different questions they wanted to answer. The school board wanted an ROI (increases in student achievement). Staff Development wanted to know the degree to which PD was being implemented and whether the processes and practices were effective. Teachers wanted to know whether practices were changing sufficiently, how to get support when they weren’t, and how they could tell what was changing in their classrooms. The questions keep on changing, which puts pressure on school systems trying to enact evaluations of their PD systems.
Evaluating the use of instructional coaches in school systems: Features used as indicators are (1) student achievement; (2) teacher satisfaction with coaches’ services; (3) the culture of the school and how that is shifting. This kind of evaluation suggests that it’s critical to collect baseline data that allow us to evaluate programs within school districts. What are those indicators, defined with a fair degree of clarity, and how can we track evidence over time that supports a conclusion that a PD effort is having an impact?
Two current projects in New Jersey are looking into the use of learning communities and collaborative professional learning communities and how they are impacting schools in terms of student achievement and teacher satisfaction.
NSDC is working on a national effort exploring models of evaluation that will help with this new kind of professional development. The models and theories of change that we’re now using show that old models are falling short. NSDC is looking for a precise intervention and viable models of evaluation that will allow school systems to evaluate professional learning as it shifts from centrally controlled to more school-based and job-embedded models.
Paying attention to the issue of results is hard. Measures are not sensitive enough to see the shift in teacher practice. Often, shifts in teacher practice happen over time; PD programs may not be fully implemented and operational until the next school year. We want to be able to show results that are early indicators, which will tell us with some certainty that if these things are changing, student scores on assessments will change down the road.
Q&A Session
Q: How do you assess the changes in school culture related to PD activities?
JK: Survey instruments, observational instruments, and interview protocols help us assess the changes. An issue of the “Journal of Staff Development” on culture, done several years ago, had a collection of assessment tools by Chris Peterson. Depending on how we define culture, we can create instruments to gather some evidence. We like to look at relationships and professional conversation, whether conversations between teachers are more social or more professional. We can use a survey or observe.
KV: When surveying teachers, I found them to be quite honest but reluctant due to survey burden. If the school culture values professional conversation and attention to student work, then PD should be focused on changing that.
Q: We focus on PD for staff and after-school and other non-school-hour programs. Since PD participants come from various organizations, and since student outcomes include social and emotional changes, do you have any suggestions regarding how to relate evaluation processes in terms of measuring student impact?
JK: There are indicators, there’s a way to measure just about anything as long as we can define it. In after-school programs, the indicators of success are academic and equally other kinds of indicators. So we want to be able to put both of those indicators or multiple indicators in front of us and say kids are better engaged, more socially competent and more academically successful. It probably wouldn’t be easy for us to do a study that says because they are more engaged and more socially competent that they are more academically successful. Proving cause and effect is difficult. But we could conduct studies that allow us to look at correlations among some of those areas. I didn’t talk about the struggle we have with causation and the demand to prove that PD contributes to student achievement. I think a lot of you realize that that level of evaluation is incredibly complex; not that it can’t be done, it’s just incredibly complex.
KV: When it comes to after-school programs or even professional development that’s dealing with a number of different kinds of personnel, I understand Joellen’s point about causation. I think there’s strong research to show that if you can extend engagement and student self-direction, for example, that that clearly has academic implications. You can expect that as a leading indicator of later student outcome, and that’s not that difficult to prove. The real challenge, in my view and experience with the after school programs, is knowing exactly what the intervention was, with some specificity, as well as understanding what each participant’s actual participation in that intervention was. Record keeping tends to be very poor, if it exists, and just saying they were a part of the after-school program doesn’t tell you whether they came once a week for 20 minutes, an hour per day four times a week, or just what the intensity of that engagement was. You therefore wouldn’t know whether the issue was that the program didn’t work or whether the issue was the program didn’t happen for some group of those kids. Being real clear about what the intervention is and what constitutes participation is an important part of that. That’s as true for after-school programs for kids as it is for professional development workshops and weekend experiences that serve a number of different teachers from across the district or from multiple districts.
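The point about knowing each participant's actual participation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attendance records, the participant IDs, and the 2-hour threshold are all hypothetical and do not come from the session or from any district's system. The idea is simply to total attended time per participant so that "enrolled" is not treated as "participated".

```python
from collections import defaultdict

# Hypothetical attendance log: one (participant_id, minutes_attended) row per session.
attendance_log = [
    ("s01", 20),
    ("s02", 60), ("s02", 60), ("s02", 60), ("s02", 60),
    ("s03", 60), ("s03", 45),
]

def total_hours(log):
    """Sum attended minutes per participant and convert to hours."""
    minutes = defaultdict(int)
    for participant_id, mins in log:
        minutes[participant_id] += mins
    return {pid: m / 60 for pid, m in minutes.items()}

def split_by_dosage(hours_by_participant, min_hours):
    """Separate participants who met an assumed minimum dosage from those who did not."""
    met = sorted(p for p, h in hours_by_participant.items() if h >= min_hours)
    not_met = sorted(p for p, h in hours_by_participant.items() if h < min_hours)
    return met, not_met

hours = total_hours(attendance_log)
# The 2-hour cutoff is an arbitrary illustrative value, not a recommended standard.
met, not_met = split_by_dosage(hours, min_hours=2.0)
print(hours)
print("met threshold:", met, "below threshold:", not_met)
```

Analyzing outcomes separately for the two groups is one way to start distinguishing "the program didn't work" from "the program didn't happen" for some participants.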
One additional comment about causation: I had written down here to mention after my comments that we really have been talking about the difficulty in establishing defensible measurement regimes of any kind to show progress in a PD program. But we really haven’t discussed appropriate research design for establishing causation, or talked about participant selection, recruitment, participation and outcomes in ways that let us say that it was participation in this program and not another one that led to these outcomes. A district that we worked with recently asked us to look at a technology program which they weren’t even thinking about as a PD program. We looked at it and said, “You know, there’s nothing really unique about this technology, but the PD you’re offering to administrators really is something special and different.” It absolutely defied us and the program managers to identify an appropriate control group; there simply wasn’t one that we could use that had any related experience so that we could say it really was this program that was making the difference. That is a major challenge above and beyond what we’ve already discussed.
JK: I second that and I think we sometimes try to answer questions related to evaluation without the appropriate design. It’s because we’re trying to do our work and we’re trying to do it quickly and we’re making some errors, just in terms of the field of evaluation.
KV: I think in some cases we are making errors in terms of the leaps that we are being asked to make, going ahead and making them without the evidence. In other cases, though, it’s that different stakeholders require different levels of evidence to prove the point, and so the design and the data have to be appropriate for their purpose.
Q: How accurate is teacher self-reporting of practice changes after PD activities versus classroom data gathered via walk-throughs?
JK: I don’t know that I would say it’s an either/or. I’d love to see both. I’d love to be able to use a triangulated model to have teacher self-reports and also to have data that comes from walk-throughs or classroom observations that would allow us to know with some confidence that what teachers are saying on a self-assessment or self-report is being held up in terms of observation. There’s a huge amount of literature in the field of evaluation and research about self-report data, much of it not very complimentary about the accuracy of that, and yet there’s another body of literature that says who better knows than the person who’s telling you. So I don’t think it’s an either/or, I’m not sure that I would even say one is better than the other. I would love to have both.
JK: One of the components we’ve added to that is putting students as part of the survey. We get their observation combined with classroom observations, teacher data and input. The student component adds a whole different validity to where the teachers think they actually are.
KV: There are cases where you have to use the self-report, and it’s important and useful data; you just have to know where it came from and use it appropriately. In some cases teachers famously under-report if you ask them about particular skills; if they think they need a degree in order to really be certified and successful, they will tend to under-report. I’m thinking of this in relation to technology. If they think you’re asking whether they are implementing the district’s program that everybody knows they’re supposed to be doing, teachers know, like no one else on earth, when they’re being asked something that there’s supposed to be a right answer to, and they’ll give you that answer. But if instead you ask about specific things that they do in their classrooms, or specific ways they relate to other teachers, and if they understand what you’re asking and the language that you’re using, our experience is that they tend to be pretty straightforward about that. It’s when the question is leading, or when it’s confounded by some other self-understanding, that it gets difficult. There is a place for self-report and we use it, but we use it judiciously.
JK: One of the things that I know several of you probably know from your own experience is that when teachers self-report on the early end, they may over-report. On the other end, they may be more honest, and sometimes you see results declining in terms of self reports because in the early stages teachers thought they knew; then when they had an opportunity to study a particular methodology or whatever, they discovered that they didn’t know as much as they thought they knew. Over time, the more they know, the more they realize they don’t know.
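The triangulation idea discussed in this exchange, checking self-reports against walk-through or observation data, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the teacher IDs, the shared 1-4 rating scale, and the scores themselves are hypothetical, and a real comparison would need both instruments to rate the same practice on a comparable scale.

```python
from statistics import mean

# Hypothetical ratings of the same practice on the same 1-4 scale.
self_report = {"t01": 4, "t02": 3, "t03": 4, "t04": 2}
observed = {"t01": 2, "t02": 3, "t03": 3, "t04": 3}

def gaps(self_scores, observed_scores):
    """Self-report minus observed score, for teachers present in both sources."""
    shared = sorted(set(self_scores) & set(observed_scores))
    return {t: self_scores[t] - observed_scores[t] for t in shared}

def summarize(gap_by_teacher):
    """Average gap plus counts of over-reports, under-reports, and matches."""
    values = list(gap_by_teacher.values())
    return {
        "mean_gap": mean(values),
        "over": sum(v > 0 for v in values),
        "under": sum(v < 0 for v in values),
        "match": sum(v == 0 for v in values),
    }

gap_by_teacher = gaps(self_report, observed)
summary = summarize(gap_by_teacher)
print(gap_by_teacher)
print(summary)
```

Running the same comparison at several points in the year would be one way to see the pattern described above, where early over-reporting gives way to more conservative self-assessment as teachers learn more.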
Q: Given the complexity of demonstrating causality, do we have to settle for leading indicators? What then is the incentive to try and connect with long-term indicators?
KV: I would say one incentive is that you first need to have a model that can show that there’s growth, even if you can’t prove definitively that it’s causation by comparison to some other approach or to not participating in that professional development initiative. The kinds of things that you would want to take a look at still involve looking at changes in teacher practice and changes in student achievement. You want to make sure that you have a model that could bear the weight if you were to put in place appropriate selection, recruitment, etc. Typically, even the more rigorous scientifically-based research designs, the kind of things that you find being supported by the Institute of Education Sciences, for example, start off with a study that would never pass muster for scientifically-based research. They start off with a study which says “do we have any reason to believe that there’s going to be growth or change based on participation in this project?” If they think the answer is yes, based on a study that gives them good information, then they structure a more organized, detailed study. I don’t see that it’s necessarily getting us off the hook of looking at student achievement; it just means that we have to be careful about the claims that we make based on that.
JK: It places more weight on the importance of our theories of change and how we’re expecting those changes to take place, and on being willing to modify those theories as we move through the implementation of a program.
Benefits to Benchmarking
JK: The greatest benefit that districts can gain from a benchmarking study is to do a self-assessment and have it compared to other districts’ self-assessments to get a sense of where they stand. I’ve talked with a number of people from districts that participated in the last study, and what they realized is that they have strengths they didn’t know they had, even though they struggled in those areas. They also gain a great network of support among these districts, where together they can talk about some of the challenges they are experiencing and share resources with one another to be able to think about how to address those challenges in a contextually appropriate way. It gives you ideas, strategies, and a picture of reality.
KV: The single most interesting, rewarding part of our work is getting to see dozens of different organizations struggling with similar problems, sitting in slightly different places and coming up with different ways to address them. In the benchmarking study you will have a chance in short order to sit where other people sit and see the world the way they see it, from many different perspectives that are not your own. When you then return to your district, you will see different possible solutions to issues that you didn’t see before, because you now understand there are different ways to do it. That will enrich your ability to do your work.
Closing Remarks
Q: Chris Brown with Pearson. We’ve looked at this area with several of our businesses, Pearson Achievement Solutions in particular, and one of the things we found as a systemic issue in the effectiveness of the various PD efforts that we’ve seen and in fact, that we’ve tried to implement, is leadership. I wondered if leadership is especially important and if that’s looked at by any of the districts that are on the line, if they pay any special attention to getting buy-in and emphasis placed at the very top and how that leadership might impact what they’re doing.
KV: It’s always vitally important. There’s strong literature showing that it’s true, and I have a couple of theories about that. One of which is, and it’s tying back to the value of benchmarking, in education we do a much better job thinking about what the outcomes are and what the initiatives are than we do about the process of actually delivering them at the district level, about how we organize and execute our work. I say that having been in a district position where you could see that it was a challenge, in addition to working in districts all of the time. Process improvement is not something that many of us who have worked in districts think about a lot, and yet it’s very important to our work. One of the reasons why leadership is so vitally important is that it can trump the lack of a process that is well and completely specified. It’s also important because there is so much going on in schools, there is so much competition for attention and so many ways to lose your focus, that leaders are vitally important in helping teachers and folks involved in professional learning stay focused on the right things in the sea of all the other priorities and important things that need to happen. For a lot of reasons, it’s just not possible to avoid the issue of leadership if you’re trying to establish efficacy of a program. You could have a very well functioning program for one district that fails utterly in another simply because leaders were not brought in and didn’t support it in order to keep the focus.
JK: Absolutely, I want to underscore that. The whole notion, I think, of education’s way of implementing programs leaves a lot to be desired from a leadership perspective and from a resource perspective. From the perspective of a long-term focus, we very often are launching or initiating, and we rarely spend time thinking about what it takes to implement a program fully. We almost never think about what it takes to sustain a program. So, if you think about your own work in professional development, we often launch professional development initiatives and invest in the launching and rarely invest in the ongoing implementation or in what is necessary to sustain that. We take all the resources, time, effort and energy and shift it to a new launch, and so there’s constant launching and very little ongoing implementation. I think to some degree that we are at fault. Part of that is a leadership responsibility, and I would urge you to think very deeply about the kind of professional development you are wanting to evaluate and ask yourself very thoughtfully, “Is this really the kind of professional development that has a chance to produce the results I want it to produce?”







