Kirkpatrick’s Four Levels of Training Evaluation

Sometimes, you say a thing and it just catches on. It’s a moment of insight that gets frozen in time like a mosquito in amber, and later you realize just what you have. Kirkpatrick’s Four Levels of Training Evaluation is like this. It’s a simple framework for evaluating the efficacy of your training program. Don Kirkpatrick uttered the words: reaction, learning, behavior, and results. His son and his son’s bride take up these words, refine the meaning that the industry gave them, and adjust them back towards their original intent.

The Levels

Although Don never presented the words as levels, others added that label to the descriptors, and eventually people began calling it the “Kirkpatrick Model.” It stuck. Today, professionals speak about the levels of evaluation like this:

Level 1: Reaction – Did the students report that they enjoyed the learning experience, including the materials, the instructor, the environment, and so on?
Level 2: Learning – Did the students learn the material that was delivered? This is the typical assessment process that we’re used to having to complete to be able to report successful completion of a course, but it’s more than that. The question is whether we learned anything that we can retain after the class and the test are long over.
Level 3: Behavior – Ideally when we’re training, we’ve identified the key behaviors that we want to see changed. Level 3 is the measurement of the change in the behavior.
Level 4: Results – Did the change in behaviors create the desired outcome? Are we able, as training professionals, to demonstrate that what we’re doing has value to the organization in a real and tangible way?

The Process Called ADDIE

Many instructional designers use a design process called ADDIE after the steps in the process:

Analysis – What results do we want, what behaviors need to change to support that, and what skills need to be taught to change the behaviors? (Here, I’d recommend looking at The Ethnographic Interview and Motivational Interviewing for tools you can use.)
Design – What kinds of instructional elements and approaches will be used to create the skills and behaviors that are necessary to accomplish the goal? (Here, Efficiency in Learning, Job Aids and Performance Support, The Art of Explanation, and Infographics are all good resources.)
Development – The long process of developing each of the individual elements of the course.
Implementation – The execution of the training, either instructor-led or in a learning management system.
Evaluation – Assess the efficacy of the program – and, ideally, revise it.

If you’re unfamiliar with the course development process or you’d like to explore it in more detail, our white paper, “The Actors in Training Development,” can help you orient to the roles in the process and what they do.

The untold truth is that, in most cases, the process is rushed, and many of the steps are skipped or given insufficient attention. Rarely does an organization even have someone with instructional design training, much less the time to do the process right. There’s always more training that needs to be developed and never enough time. The Kirkpatricks are driving home an even more telling point. The evaluation process – how you’re going to assess the efficacy – needs to be planned for during the analysis and design phases. The development and implementation phases need to consider the conditions that will be necessary to get good evaluation results. Evaluation isn’t something that can be bolted on at the end with good results.

It’s sort of like Wile E. Coyote strapping a rocket to his back and hoping to catch the Road Runner. It always seems to end badly, because he never seems to think through the whole plan.

Leading and Lagging Indicators

I learned about the horrors of metrics through The Tyranny of Metrics but learned real tools for how to create metrics through How to Measure Anything. However, it was Richard Hackman who really got me thinking about leading and lagging indicators in his book, Collaborative Intelligence. He was focused not just on how to make teams effective in the short term but how to create teams where their performance remains good and keeps getting better. He was talking about the results as a lagging measure, an outcome from the creation of the right kind of team. Influencer picked up and reinforced the concept. We need to look not just at the outputs that we want but the behaviors that we believe will drive those outcomes.

It’s all too easy, as you’re working on developing the metrics for your training, to focus on the lagging metrics and say that you don’t have enough influence on them. After all, you can’t take responsibility for sales improvement. Some of that’s got to be up to the sales manager. And you certainly don’t want to say that your training sucked if sales dropped after salespeople took the course. As a result, training professionals too often shy away from the very metrics that are necessary to keep the training organization intact when there’s a downturn. Instead of being seen as an essential ingredient to success, they’re seen as overhead.

By focusing on a mixture of both leading indicators and lagging indicators, training professionals can get to an appropriate degree of accountability for end performance. Leading indicators are – or at least should be – behaviors. They should be the same behaviors that were identified as a part of the analysis phase as needing to be changed. These should be very highly impacted by the training. The lagging measures are the business outcomes that also should have been a part of the analysis process – but are further from the learning professionals’ control.

Waypower

While it’s not true that we need to hope for good outcomes, there’s a bit we can learn from The Psychology of Hope with regard to training’s role in the process of changing behaviors. In The Psychology of Hope, Snyder explains that hope is made of two components: willpower and waypower. Willpower is what you’d expect. It’s the desire, perseverance, or grit to get things done. (See Willpower or Grit for more.)

Waypower is different. It’s the knowledge of how to get to the other side. It’s the knowledge of the how that learning professionals can help individuals with. It’s waypower that training professionals give to their students every day. This may be used for the purposes of some corporate objective, but in the end, it’s a way of creating hope in the minds of the students that they can get it done if only they try. (Here, a proper mindset is important, too, as explained in Mindset.)

Application

There’s nearly zero research on the relationship between overall performance on the job and well-trained, knowledgeable people. The problem is that we don’t really know how much training matters. What we do know, however, is that the application of the skills and behaviors that are taught in the classroom doesn’t always happen. This is the problem of “far transfer,” and it’s a relative secret that what we teach in classrooms doesn’t always get applied to the real world. (If you’re interested in some other relative secrets in the training industry, check out our white paper, “Measuring Learning Effectiveness.”)

There’s an absolutely essential need to consider how the skills that are being taught in the course can – and will – be applied by the student in the real world. Discussions, case studies, and conversations make for learning experiences that tend to be used more long after the training has been completed.

About the Questions

The book wouldn’t be complete without some guidance on how to write actual evaluation questions, including avoiding superlatives and redundant adjectives when evaluating on a scale – and ensuring that the scale matches the type of question being asked. Question authors are encouraged to keep the questions focused on the learning experience rather than the instructor or environment to get better answers.

The real question for you is: will you read Kirkpatrick’s Four Levels of Training Evaluation and apply it to the way you evaluate your training?