Podcast: Measuring Learning Impact

Welcome to Real Impact!, PDG’s podcast series.

In each episode, we reach out to thought leaders and deep thinkers in the talent development space to talk about the issues that matter to you and your business.

Why is measuring the impact of learning so difficult? In a recent survey, nearly 70% of respondents said the inability to measure learning impact was impeding the achievement of strategic goals. PDG’s VP of Consulting, Rich Mesch, discussed the challenges of learning measurement with thought leader Will Thalheimer, creator of the Learning-Transfer Evaluation Model (LTEM). Listen to additional episodes of our podcast to learn how to measure the impact of leadership development.

Full Transcript

Measuring Learning Impact with Will Thalheimer

Rich Mesch (00:03):

Real Impact! is the podcast of Performance Development Group of Malvern, Pennsylvania. In each episode, we talk with colleagues and experts about the talent development challenges facing business today. My name is Rich Mesch, and welcome to Real Impact! Why is measuring the impact of learning so difficult? In a recent survey, nearly 70% of respondents said the inability to measure learning impact was impeding the achievement of strategic goals. We’ll discuss the challenges of measurement with Will Thalheimer, a thought leader who created his own measurement model.

We are talking today to Dr. Will Thalheimer. Will is a well-known thought leader in the performance improvement space, including quite a bit of focus on the measurement of learning impact. He has made a career of debunking learning myths and focusing on research-based methods, and, not for nothing, he created his own learning measurement model, the Learning-Transfer Evaluation Model (LTEM), which we will no doubt dive into as part of our conversation. Will Thalheimer, welcome to Real Impact!

Will Thalheimer  (01:21):

<laugh> Thank you, Rich. And you know, just a couple of things. One thing you said: not for nothing. No, it actually was for nothing, because I gave it away for free on the website, and people use it and then I hear about it later. You know, I’ve been a bad business person for a long time, and there’s just another example. And the other thing, Rich: in my little introduction there, you didn’t mention the most important contribution I ever made to the learning and development industry.

Mesch (01:51):

I have a feeling I know where this is going, Will.

Thalheimer (01:53):

<laugh> Yes. Ladies and gentlemen, I want you to know the thing I’m most proud of is hiring Rich Mesch into the industry.

Mesch (02:03):

So if anyone is sick to death of these podcasts, you know who to blame. I never would’ve been here were it not for Will. <laugh>

Thalheimer (02:10):

Uh, alright, what are we gonna talk about?

Mesch (02:12):

Well, let me start with a little bit of a softball to get you warmed up here, Will. So learning measurement is just an utter disaster. I mean, sometimes I think it doesn’t work at all. So, easy question: why is learning measurement such a mess?

Thalheimer (02:27):

<laugh> Okay. So, you know, first of all, let’s talk about what’s messy about it. Number one, we tend to measure things like attendance and completion rates, and we use smile sheets that are ineffective and get us data that’s not really relevant to whether the learning’s effective or not. So we’ve got those issues. We’ve got people sharing a lot of fancy-looking dashboards that have really bad data underneath, but it looks beautiful and people buy into it. What am I missing, Rich? Is there anything else you can think of?

Mesch (03:07):

<laugh> I think you hit some good points, Will. One of the things that has always shocked me about learning measurement is that sometimes we measure whether or not people showed up, and that might be great if you’re measuring attendance at a football game, but it doesn’t really mean much when it comes to learning and development. But I think there’s another extreme on the other side, which is what people want to measure. They say, “We sent salespeople to a training course, and now we want to know why sales aren’t higher, because we trained them.” And there seems to be a missing part in that equation.

Thalheimer (03:41):

<laugh> So a lot of people went, “Well, okay, we sent folks to sales training, we expect their sales to go up.” But you know, there’s a little bit of an issue there, and that is: how do we know that it’s the actual training that made a difference? Maybe they got a better manager, maybe they got some coaching, maybe the economy picked up. Or maybe they would’ve done even better if we had done a different type of training. It’s just, you know, human learning is complex, and attaching measurement to it is even more complex. So the answer to your question, why is it messy? One reason is it’s just really complicated, and it’s not easy to get data that’s clean enough that we can make meaningful decisions from it. Number two, organizations don’t seem to want to pay for it. They don’t see the value in it. We as L&D professionals have not been good enough about educating our stakeholders about the value of measurement. And let me be explicit: there are three reasons we measure. One, to prove that we’ve got the results we want. Two, we measure to support learners in learning. And three, we measure to improve what we’re doing. When I talk to senior learning people, most of them are focused on the third one: how do we improve what we’re doing? How do we create the most effective learning interventions that we can? If we don’t have good feedback loops, then we’re gonna be continually creating stuff that’s not as effective as it might be. So number one, we wanna do it to get that information. Now, why so messy? Well, organizations don’t wanna pay for that, and we don’t educate about that. Also, some of us are a little bit shy about getting good results, you know.

Mesch (05:28):

Something I’ll add to that: I think there’s a lack of intellectual curiosity. I think in some cases, people don’t really wanna know what the answer is. They’re afraid that if the answer is that people didn’t really improve as a result of training, that’s going to make the training department look bad, and they don’t wanna see that on the record. And the flip side is true as well. If sales go up and there was training, the training department is perfectly happy to have everyone believe that sales went up because of training, even if there’s no actual evidence that that’s

Thalheimer (05:57):

True. I think it’s worse than lack of intellectual curiosity. I think it’s being afraid of the reality. And, you know, I’ve had people say, “We wanna do this thing.” Okay, great, let’s do some evaluation. “Oh, well, maybe we shouldn’t look at that.” <laugh> You know, the other reason why we don’t go deeper with our evaluation is that we are getting evaluation results, and they look pretty good, right? We get our smile sheets, we use Likert scales, we put ’em on a five-point scale, and we get 4.2s, 4.5s. You know, that looks pretty good. Why rock the boat? And so we feel like we’re doing learning measurement, and because we’re doing that, there’s not that much incentive or emphasis on doing better evaluation. I think that keeps us stuck. The other reason why we’re doing bad evaluation is that we have a dominant evaluation model out there: the Kirkpatrick-Katzell four-level model. Most people call it the Kirkpatrick model, but it turns out that Raymond Katzell was the one who created the idea. So that’s why I try to remember to call it the Kirkpatrick-Katzell model; try saying that really quickly. So it’s been out there, it’s dominant, and people think about learning evaluation in those terms, but that model in and of itself has some weaknesses.

Mesch (07:16):

That’s interesting, because the Kirkpatrick model has been with us for a very, very long time. But even today, when you talk to a lot of learning leaders, they’re still very attached to the Kirkpatrick model. What does the Kirkpatrick model get wrong, in your opinion?

Thalheimer (07:31):

Okay. So maybe your listeners don’t know what it is, so just really quickly: level one is learner reaction, level two is learning, level three is behavior, and level four is results. One of the things you’ll notice in the model is that learning is all in one bucket. Well, learning can be measured along a continuum. We could measure learning by measuring people’s ability to regurgitate trivia, meaningless information, or just recognize information. Or we can go up a little bit, to people’s ability to recall meaningful, important knowledge. But we can go further than that. Are they able to make decisions based on what they’ve learned? Are they able to actually carry out tasks? Can they do the performance that we’re trying to teach them? All of that gets measured in one bucket: learning. So what it means is that we say, “Oh, we need a level two.”

“Oh, okay, let’s do a knowledge check.” <laugh> That pushes us down to sort of the lower levels of learning, and that creates this cycle of incompetence in us as learning professionals. We measure that, and so we build to that. We build it so that people can answer knowledge questions, when we should be building our learning so that people can actually perform, so they can actually make decisions, so they can actually show that they have some task competence. So that’s one of the big things. Now, the Kirkpatrick-Katzell model isn’t all bad. It sends some really good messages: that our smile sheets, our learner perceptions, should be sort of at a lower level, less important. And it sends a message that what’s really important are behaviors and results. That sends a very good message. The other thing it doesn’t do, which I think is a shame, is send a signal to us that measuring attendance and completion rates is not good enough, that measuring just learner participation isn’t good enough. Those things are useful to gather, but you can’t validate your learning based on participation, based on attendance. So, some good things from the model, some bad things, but overall I think it’s been a negative for the field.

Mesch (09:46):

As the old saying goes, you get what you measure. And it sounds like in some cases, maybe not all, but in some cases, the Kirkpatrick model has encouraged people to measure the wrong things, and as a result to build toward that measurement, as opposed to trying to build ultimately for the success and the performance of the learner. A conversation I’ve had quite frequently is this: sometimes I will hear people say that for thousands of years, the seminar, didactic method worked great, and suddenly in the last 30 or 40 years we need artificial intelligence and distributed learning and all these things. Why do we suddenly need those things now? And I think the answer is because those things didn’t exist a thousand years ago, and so it never really occurred to anybody to use them. And I wonder to what extent a model, with no disrespect to Mr. Kirkpatrick, a model that was developed sixty-some years ago, is not really current with what’s possible today.

Thalheimer (10:45):

Yeah. Well, there is some truth to that. You know, the model was built back in the 1950s and then began to be popularized in the sixties. And then, when Donald Kirkpatrick wrote his book in the 1990s, that’s when it really got popular. Just think about what’s happened since the 1950s. We’ve had this amazing revolution in the psychology of learning; we’ve learned so much about it. One of the things the Kirkpatrick model is silent about is whether people understand or whether they can remember. When we design learning, we should be thinking that we want to create this comprehension, this understanding, but we also want to go further. We want to help people be able to remember, and we want them to be able to apply as well. And there’s no sense of remembering in the Kirkpatrick model.

Mesch (11:35):

So let’s talk a little bit about performance, because performance seems to get sort of short shrift, certainly in the Kirkpatrick model and in some other models as well. What should we be looking at in trying to identify whether what someone has learned is actually going to turn into something they can now do?

Thalheimer (11:54):

Let me sort of segue to the LTEM model here, because there are a zillion things we can do in measuring learning, but it’s important to keep it relatively simple. So, in LTEM, Tier 4 is measuring knowledge, Tier 5 is measuring decision-making competence, and Tier 6 is measuring task competence: can you actually do something? And by having that distinction between knowledge, decision making, and full task competence, it gives you a sense of how you might measure that, right? You and I used to build simulations together many, many years ago. What we did was give people very well-crafted, multiple-choice, branching simulations, with a few other elements mixed in, and what we were measuring, essentially, was decision-making competence. And we did a lot of leadership development stuff.

Thalheimer (12:51):

And so if you think about it, we gave people scenarios. We said, okay, what’s the best decision? And they made that decision. One of the things we taught back then was that you as a manager, as a leader, should bring your direct reports into decision making. That probably sounds familiar. <laugh> And we measured that through this simulation. But what were we missing? Well, we were missing: okay, how do you do it? How exactly do you bring people into decision making when you are in a meeting to do that? What body language are you using? What tone of voice are you using? Are you using the right words? Are you picking out the right words? So we focused on that decision-making competence, and we weren’t doing it for assessment purposes, we were doing it for development purposes, but still, we would’ve been at Tier 5, decision-making competence. We were not at the next level, task competence.

Mesch (13:53):

Let me ask you this, Will. One of the things I’ve been focusing on in my own work is what happens to knowledge once it’s imparted. In other words, when someone is actually out there performing, how are they reinforced? How are they encouraged? How are they coached? And the example you just shared is a good one, because it’s one thing to know how to do something, and it’s often quite another thing to actually do it. One of the things I think we often talked about when we were designing leadership simulations is that often people know how to do something, but there are other forces that actually prevent them from doing it, even though they know how. So in a leadership scenario, someone may have learned a model for giving effective feedback, but they still find themselves incapable of giving effective feedback because they lack confidence, because they’re intimidated by their employees, because they’re afraid they’re going to hurt someone’s feelings. And those things are very difficult to train, but I think they’re very key to turning what people know into what they’re able to do. So what role do you see things like coaching, reinforcement, encouragement, and feedback playing in any kind of measurement model?

Thalheimer (15:05):

Well, it really comes down to when you measure. One of the things that got me interested in measurement in the first place: you know, I was out there doing research-based consulting, telling people how to design their classroom training and their e-learning to be effective based on the science. But then I realized that we have some biases in the way we measure learning. We measure right at the end of learning, when everything’s top of mind. This is a major bias, because they’ve gone up the learning curve, there they are, and we measure them right then. Well, if you just wait a few days or a week or a month, they are gonna have slid down the forgetting curve. So part of thinking about measurement is when to do it. Certainly, if we measure in the training program, we have to look at the timing of that, but also what you measure. Have we actually made them make decisions? Have we actually made them practice, role play? You know, some of the best training is the most challenging; we try to simulate as much as possible what people are gonna have to do in their work. Now, I think you’re really right to emphasize these things that come after training: the coaching, the reinforcement. All those things are related to what we know about the science of transfer. I’ll give you a link to a research review I did a couple of years ago on the transfer research; it talks about things like coaching and support, et cetera. So one of the things you can do, and let’s go through the timeframe: at the end of learning, on a learner survey, you can ask the learners to anticipate what kind of support they’re gonna get.

Will your manager support you? Will you have job aids? Will your instructor be available for questions, for support? Will you be coached? You can have them anticipate that. And that’s okay. It’s not perfect, but it can send really strong signals if you do an assessment like that and everybody says, “No, my manager’s not gonna support me.” That can send a signal to the organization to get the managers, the supervisors, to begin thinking, hey, maybe we should be a little bit more supportive after the training. Okay, so now you delay it. You delay it a month, and you don’t wanna delay these assessments too long, because the training’s likely to have less impact, to be less measurable. But then you can ask people, particularly, you know, you and I are old leadership training guys, so you can ask people questions about, well, what I want to emphasize is that there’s a special opportunity.

When you do leadership training, you can ask the direct reports how good a leader somebody is. So let’s say we train somebody to be a better coach. Okay, they’re a manager-coach, and we can go out a month later and ask their direct reports: how well has that person been coaching? And we can ask specific questions. We look at the research on coaching to find out what the most important leverage points are, and we can develop an assessment on that. Then we can get some sense of it. Yeah, we can measure anything, although <laugh> some things are harder to measure than others.
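For readers who want a concrete picture of the kind of delayed, direct-report assessment Will describes, here is a minimal sketch, assuming a hypothetical five-point survey in which direct reports rate their manager on a few coaching behaviors about a month after training. The manager names, behaviors, scores, and follow-up threshold are invented for illustration; they are not from the episode.

from statistics import mean

# Hypothetical delayed survey (about a month after training): direct reports rate
# their manager on a few coaching behaviors, from 1 (never) to 5 (consistently).
responses = {
    "Manager A": {"asks for my input": [4, 5, 4], "gives specific feedback": [4, 4, 5]},
    "Manager B": {"asks for my input": [2, 1, 3], "gives specific feedback": [2, 2, 1]},
}

FOLLOW_UP_THRESHOLD = 3.0  # below this average, flag the manager for extra support

for manager, behaviors in responses.items():
    overall = mean(mean(scores) for scores in behaviors.values())
    status = "needs follow-up" if overall < FOLLOW_UP_THRESHOLD else "on track"
    print(f"{manager}: average {overall:.1f} -> {status}")

The point of the sketch is simply that a delayed, behavior-specific survey produces data you can act on (which managers need more support) rather than a single end-of-course satisfaction score.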

Mesch (18:17):

Well, I’d love to get your opinion on something. Back when we were designing simulations, we talked a lot about a concept called intervening variables, and the whole idea of intervening variables is that in between an action, a decision, and an outcome, a whole lot of things happen, depending on sort of what the overall health of the organization is. And the more I’ve thought about this over the years, one of the things I’ve realized is that when we train, when we ask people to learn, what we really want them to do is adopt a new set of behaviors. And we do that in the belief that that new set of behaviors will drive the metrics that the business cares about, but we don’t necessarily have any evidence that that’s true. So is the ultimate metric of learning not results, but behavior change? I’m gonna throw that out there; feel free to disagree with me.

Thalheimer (19:13):

No, I actually agree with you a lot on this. I had many times when customers would come to me and say, “Well, we need a level four, we need tier eight, we wanna look at ROI, we wanna look at the main results,” and I’d go, great, let me help. Then we’d talk about what it’s gonna take and what questions will be raised, and more often than not, I would talk them out of doing a <laugh> results analysis. Because to do a results analysis, you’ve gotta figure out, well, how much of this is due to the training? And if you do this right, you’re using something like randomized controlled trials: you get a hundred people that are gonna take your training, but you don’t give it to all of them.

You give it to 50 of them, and then you compare the 50 to the 50 that didn’t go through it. Maybe you give the other 50 the training later, but there are a lot of complications there. And if you’re relying on a lot of the subjective data, then there are questions about how valuable that subjective data is. The learners said it was great, they said they got results, but are they telling us the truth? So I am, <laugh> more often than not, encouraging people to focus on sort of the interim thing you were talking about: the behavior change. Every once in a while, yeah, it’s good to get at the results, particularly for a strategically important initiative that you’re undergoing, or just to validate that your general methods are working, but for the most part, the behavior change is more valuable.

But don’t focus only on the behavior change either. Let’s think of it this way: we give a learning program. All right, so what are the causal results there? Okay, maybe the learners understand the concepts, but then they have to be able to remember them later. Then they have to be motivated to apply them. Then they have to apply them. Then they have to overcome obstacles. Then we can say the behavior was successful. So think about that causal chain, going back from the behavior or the results. If we just measure the results, or just measure the behavior, we haven’t measured those intervening things, where we as learning and development professionals have the most leverage.
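To show the shape of the randomized comparison Will sketches, here is a minimal illustration, assuming one hundred hypothetical salespeople, half trained now and half deferred, with fabricated sales figures. The numbers, the random seed, and the use of a simple Welch t statistic are assumptions made for the example, not details from the episode.

import random
import statistics
from math import sqrt

# Hypothetical setup: 100 salespeople, half randomly assigned to training now,
# half deferred (the comparison group), as described above.
random.seed(42)
reps = [f"rep_{i:03d}" for i in range(100)]
random.shuffle(reps)
trained, control = reps[:50], reps[50:]

# Later we collect quarterly sales for each rep (fabricated numbers here).
sales = {rep: random.gauss(100_000, 15_000) for rep in reps}
trained_sales = [sales[r] for r in trained]
control_sales = [sales[r] for r in control]

def welch_t(a, b):
    # Two-sample (Welch) t statistic for the difference in group means.
    mean_diff = statistics.mean(a) - statistics.mean(b)
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return mean_diff / se

print(f"Trained mean sales: {statistics.mean(trained_sales):,.0f}")
print(f"Control mean sales: {statistics.mean(control_sales):,.0f}")
print(f"Welch t statistic:  {welch_t(trained_sales, control_sales):.2f}")

Even this tidy design leaves the practical complications mentioned above: someone has to agree to defer training for half the group, and the outcome data still has to be trustworthy before the comparison means anything.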

Mesch (21:34):

One of the best examples I can think of, and I was just sharing this story with somebody: I designed a selling simulation for a sales organization, and the goal was to help people understand what it looked like when they applied the organization’s selling model correctly. So they were able to go through a number of scenarios and see what the likely outcomes were when it was done as recommended and when it was done not as recommended. As part of the rollout of that simulation, we went to the sales leadership meeting and sat down with all the regional directors and the district managers to sort of get their buy-in, and their first response upon going through it was: “We don’t like this. We don’t agree with this sales model. We’re not going to reinforce it, and we’re gonna keep selling the way we always did.” <laugh> And yeah, I’m glad you laughed at that, because believe me, I wasn’t laughing. And I’ll tell you right now, I’m awfully glad nobody did an ROI assessment on that. Because at the end of the day, you can train people all you like, you can give them as many skills as you like; if the first thing that happens after you do that is their direct manager says, “That’s a lot of nonsense, we’re gonna do it my way instead,” there’s absolutely no learning initiative in the world that can stand up to that.

Thalheimer (22:47):

Think of the ADDIE model, right? There are complaints about it, but in some ways it makes sense: we analyze first, and then we design, we develop, we implement, and we evaluate. Okay, so what’s missing from that model? The human element is missing from the model. You really have to think about the organization, the context, the work context. Oftentimes we focus too much on topics, too much on our learning objectives, and we don’t think about the context that things are in. We don’t think about the political elements, we don’t think about the human aspect of it, and we fall down on the job. I have to say, your story makes me feel pain just wracking through my body now, because I can imagine what that was like.

Mesch (23:32):

Well, sure. And these are mistakes you don’t make twice. <laugh> So, very much along the lines of what you just shared, Will, one of the things we’re doing now: we’re working with a large global pharmaceutical company, and we are in fact working with their sales team, but we don’t make a single learning design decision without having sales leadership in the room, because we need to be aligned. We need to be doing the things that they believe in, because ultimately, as much as I love the learning team, the learning team is not the people who are gonna be out in the field pulling this through. It’s going to be sales leadership. If they don’t believe in it, if they’re not aligned with it, if they’re not willing to reinforce it, it’s ultimately probably not going to happen. And then all the measurement models in the world aren’t going to change the fact that people don’t have the motivation to actually follow through on it. And that, I think, is where the difference is.

Thalheimer (24:25):

And it’s more complicated, right? You and I know this. <laugh> You know, we’re in the learning and development field, but we’re also in the performance field, and we know learning’s not enough, and we know learning measurement is difficult. So let’s make it even more complex: now we’re not just focused on learning, we’re focused on performance improvement in general. I have not even begun to contemplate what it would be like to measure some of the things around performance improvement. I am working on, and I haven’t really told the world about this yet, a new model called the performance activation model. It has learning in there, but it also has a lot of other things, like the behavioral-economic notion of nudging, and context cueing, and memory accessibility, and habits, and a bunch of things that are unrelated to learning per se but are related to performance improvement. In some ways our goal as L&D professionals is to create performance improvement in any way we can, right? Our biggest tool is learning, but there are other tools that we should probably learn about and figure out how to implement. And then our next task is to think about performance measurement in general. If you’ve got the solution to that, Rich, I’d love to hear it. <laugh>

Mesch (25:48):

No, I was just about to say, Will, we may have to have you come back in a little while when this is more fully baked, because I think people will wanna hear about it.

Thalheimer (25:56):

Okay, I’d love to. <laugh>

Mesch (25:58):

As we wrap up today, Will, obviously we’ve talked about a lot of things, but people tend to remember the last thing they heard. So what’s the one message you’d love to send people away with from this conversation?

Thalheimer (26:10):

Number one, it’s hard. It’s hard to measure, so have some humility about it. Number two, beware that we tend to measure what’s easy to measure, not what’s meaningful to measure. Did I say that right? We tend to measure what’s easy to measure, not what’s meaningful to measure. And number three, start wherever you are. I give people the picture of LTEM, the eight tiers, and I say, okay, where are you now? And they point it out: “Oh yeah, we’re measuring completion rates, and we’re doing old-fashioned smile sheets.” I say, okay, well, what’s one improvement you can make in the next six months? “Oh, well, maybe we could use performance-focused smile sheets. Maybe, for this new strategic program, we’ll measure some decision-making competence.” So start where you are. Be honest about it; you know, maybe get an outside person to take a look, and then think about what you wanna improve. You don’t have to do everything. If you think you’re gonna do everything, forget about it: we do not have the budget, we do not have the resources, we do not have permission. But do whatever you can to make it a little bit better, because our goal is to get better information, better feedback loops, so that we can be better professionals and make our learning as effective as possible.

Mesch (27:21):

Well, I think that’s more than one thing, but maybe that’s an illustration of just how complex this topic is.

Thalheimer (27:26):

Yeah, I can’t count, Rich. You know that. <laugh>

Mesch (27:29):

Will Thalheimer, thank you for joining us today.

Thalheimer (27:32):

Rich, it has been my incredible pleasure. Thanks for inviting me.

Rich Mesch

Rich Mesch has been working in the performance improvement space for over 30 years. An ideator and creator, he works with some of the world’s largest companies to solve business challenges by improving human performance. He is the host of the podcast “Real Impact!”, co-author of the ATD/Wiley book “The Gamification of Learning and Instruction Fieldbook,” and a frequent blogger, conference speaker, and contributor to industry publications. Rich is the VP, Consulting for Performance Development Group.
