Archive for the ‘Performance Testing and Qualification’ Category

Are You Competent? (Hint: “Not Really” is a Better Answer than “Yes”)

Friday, April 30th, 2010

You may have heard the expression “the problem is, you don’t know what you don’t know” used to describe how unknowns create risk in decisions. (You may also have heard the expression “too bad stupid doesn’t hurt”…but that’s just funny).

In general though, when we hear someone speak “with authority” we assume they know what they are talking about. This assumes that they have developed confidence based on years of study, hard work, and even being wrong enough times to have “learned the hard way.”

But, that is not always a safe assumption. (Are any assumptions ever safe? Never mind, different topic.) Often people who are not that competent over-estimate their own competence. That’s not too surprising. The real surprise is that people who are very competent often under-estimate their competence. As a result, if you listen to people’s own PR, you run the risk of trusting the less-competent individual!

In 1999, Justin Kruger and David Dunning, then both of Cornell University, published the results of a study in the Journal of Personality and Social Psychology. Their “Dunning-Kruger Effect” noted that, across a range of skill areas (from playing chess to driving to reading), the following are typical.

  1. Incompetent individuals tend to overestimate their own level of skill.
  2. Incompetent individuals fail to recognize genuine skill in others.
  3. Incompetent individuals fail to recognize the extremity of their inadequacy.
  4. If they can be trained to substantially improve their own skill level, these individuals can recognize and acknowledge their own previous lack of skill.

Another psychologist, C.F. Downing, determined that this sort of reverse bias applies to intelligence, with more intelligent people believing they are less intelligent than they are and less intelligent people…well, you get it by now. It leads all the way to “depressive realism,” which argues that people who are depressed actually have a more accurate view of what is going on!

What does it all mean? It just means that the more you know, the more you know there is more to know. And, that we should avoid using self-assessment when getting an accurate assessment is important.

What Gets Measured Gets Counted

Thursday, April 22nd, 2010

On Wednesday, March 03, 2010, the ABC station in New York ran a report about a New York City police officer who went public about quotas. Apparently, the police are given specific targets to meet for arrests and summonses. The complaint was that the quotas were being enforced blindly…so officers had to choose between missing their numbers (and being disciplined) and arresting people indiscriminately to keep their numbers up. (The original story.)

Well, are performance measurements and quotas bad? Large organizations need to manage by the numbers to keep things fair. Don’t they?

From a human performance perspective, there is actually a lot wrong with this approach, whether it is used in law enforcement or other businesses. For one thing, the numbers need to be connected to the desired performance and they need to be under the control of the performer. In this case, the measure doesn’t track the real desired results and it isn’t in the control of the performer because it doesn’t account for situational differences.

First of all, the measure is tracking activity, not results. The number of arrests looks like a result, but it is more like measuring the number of proposals a salesperson generates. Yes, there is a relationship between arrests and crime just as there is a relationship between the number of proposals and sales. But what we are really looking for is a measure that tracks the number of good arrests or, ultimately, the amount of actual crime.

Counting arrests is a problem because it assumes a constant volume of crime. To establish a required number of arrests for all police officers on all shifts in all areas implies that there is a stable amount of crime and the police can reasonably be expected to solve a certain amount of it. This may be approximately true over time (but probably not) but can’t possibly be true on a daily basis. The performer cannot control their performance on this metric. Unless they cheat.

One part of the bad news is that holding performers accountable for measures that they can’t control breeds cynicism and actually harms performance.

Imaginary dialog. (Italics/parens indicate what the individual is thinking but not saying.)

Sergeant: Here is your quota. (I have a pretty good idea about what goes on in his area…I’m glad I don’t have to meet these targets.)

Officer: But we’ve been patrolling heavily and crime is down. I don’t think I can hit those goals, especially during the day. (Surely he knows this isn’t reasonable.)

Sergeant: I don’t want to hear about it…just hit the numbers. (I have to maintain a firm hand as a leader. Besides, the people I answer to are so far removed from the daily problems of the beat officer that they won’t hear anything I say about the quotas either.)

Officer: Yes sir. (My only hope is to cheat.)

The result is wasted time, money, effort, and also injury to innocent people. With the side benefit of misleading statistics on record.

This sounds like lots of businesses actually…the farther away from the actual work you are, the less you will be able to understand the issues behind the numbers and the more likely you are to turn it into a clear-cut, simplistic question. “Did you make the numbers? No? Then start making the numbers. I insist.” It highlights the importance, and the sad lack, of knowledgeable management. Maybe even worse is that there is management training out there that will teach you NOT to “take on your employees’ problems.” Some people translate that into “don’t listen to any explanation or get dragged into troubleshooting,” which is really not helpful and not managing either.

But back to the measurement question. What should they measure then?

I’m not an expert on law enforcement, but if you are trying to measure police performance, how about measures that track what you really want to improve and that are also within the control of the performer? By the way, this isn’t easy, but here are a few ideas.

  • You really don’t need more arrests. You either want more convictions (as an indicator that the right person was arrested) or, ultimately, reduced crime, maybe based on reports or complaints by citizens. Some kind of ratio would be a good place to start.
    • A ratio comparing arrests with convictions (or plea bargains) to indicate the quality of the arrests.
    • Ratio of crimes reported to crimes solved.
    • An index incorporating crime per capita, arrests (or convictions), complaints, and feedback from the public.
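As a rough illustration of the first two ratios, here is a minimal Python sketch. The class name, field names, and the sample figures are all made up for illustration; they are not real police data. Note that the second ratio is computed as solved divided by reported, simply so that a higher number means better performance for both measures.

```python
from dataclasses import dataclass


@dataclass
class AreaStats:
    """Hypothetical counts for one patrol area over one period."""
    arrests: int
    convictions: int        # assumed to include plea bargains
    crimes_reported: int
    crimes_solved: int


def arrest_quality(stats):
    """Share of arrests that ended in a conviction or plea bargain."""
    if stats.arrests == 0:
        return None  # no arrests, so the ratio is undefined
    return stats.convictions / stats.arrests


def solved_share(stats):
    """Share of reported crimes that were solved."""
    if stats.crimes_reported == 0:
        return None  # nothing reported, so the ratio is undefined
    return stats.crimes_solved / stats.crimes_reported


# Invented numbers: one area with many arrests, one with few.
high_volume = AreaStats(arrests=200, convictions=50,
                        crimes_reported=400, crimes_solved=100)
low_volume = AreaStats(arrests=40, convictions=30,
                       crimes_reported=60, crimes_solved=36)

for name, stats in [("high volume", high_volume), ("low volume", low_volume)]:
    print(name, arrest_quality(stats), solved_share(stats))
```

With these invented figures, the area with five times as many arrests scores lower on both ratios, which is exactly the kind of difference a raw arrest count would hide.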

In a business, besides looking at the performance and setting measures based on indicators that the performer can legitimately control, it is also critical to incorporate knowledgeable managers who can understand the context of the performance and make allowances where appropriate. There is a point as you progress up the “food chain,” though, where you lose touch with the day-to-day issues…or where management never had the know-how in the first place. (For example, how many mayors or public commissioners are former police officers?) In those cases, the higher-ups need to listen to the people on the ground and develop enough trust to have rational discussions about performance.

If you want the measures to drive performance, it is critical to define them carefully and consider the possible unintended consequences because people will really try to make the numbers…even if it might be better if they didn’t.

Teach to the Test

Tuesday, March 30th, 2010

One of the biggest complaints in schools is that the “No Child Left Behind” act has set up standard test hurdles to be cleared by all schools and students. The idea seems sound — set a standard and then expect everyone to meet it. It allows teachers the freedom to vary their methods but, ultimately, they would be held accountable for results.

In practice, it doesn’t work that way. In the informal conversations I have had with people in the education business, this is seen as a wrong and short-sighted idea. If there is a test, “teachers will not worry about the kids learning…they will just teach to the test.” At its most absurd, an example of this would be, instead of teaching how to add (the core capability) teachers would teach how to add “5 + 7” because that is on the test.

I would hope that the actual test content is not available to allow teachers to directly teach the test answers. But is there any harm in defining in advance the subjects and type of test questions (i.e., the capabilities) to be tested? In fact, that would seem to be the best way to get standardization.

One thing that muddies the waters, though, is that teachers are accustomed to having almost no oversight. They are pretty much allowed to do whatever they want in their classroom. In a business, managers who run their organization well, meeting their goals and getting good employee feedback, are often left alone as well. But someone is looking at the results. The standardized tests are the results. So, if they are well-designed tests, they would show which teachers are getting the job done and which are not. Many get concerned because results have never been tracked and reported so publicly before…and anytime you introduce testing it is perceived as threatening because the performers are being asked to trust 1) that the tests are fair and 2) that the results will be used constructively. Often, that is a big leap to expect people to take.

As it turns out, it can be argued that the tests are not completely effective. Of course, to my knowledge, we haven’t taken the step to figure out exactly what we want students to be able to do when they get out of school. If you don’t know what you are shooting for, any attempt to measure whether you have hit the mark is a futile effort.

Another problem is more practical. Tests for large numbers of students are built for electronic grading, so every question has to be multiple choice or some other easy-to-grade format. If life were only multiple choice…it would be so much easier. But figuring out which of four (intentionally unambiguous) options is the right answer isn’t the same as having the capability to do something. It doesn’t prove that you know something other than how to use the process of elimination.

Another problem is that the tests are often administered poorly. In some cases, students with learning disabilities and IEPs (individualized education programs…which means they are intentionally NOT following the same sequence and pace as the standard) are still tested. In one case, students who could not read were forced to take the test but their aide was not allowed to read them the questions. So, they looked at the tests and randomly filled in circles…

Finally, it seems that the tests are intended to measure a minimum standard. But, due to the emphasis placed on them, they are in danger of becoming the actual goal. An effective teacher who is focusing on getting real and important learning to happen should produce students that blow through the simple standardized tests like a trained athlete would ace a basic physical. Administrators and parents wouldn’t need to fret so much about hitting the numbers. (Anecdote: We probably all know at least one teacher who has had an irate parent complain about a poor grade on an elementary school test impairing their child’s chance to get into Harvard. Any society that doesn’t see that as absurd should go slap itself.)

The fix though, is not to discard the tests and go back to the good ole days. The first step should be an analysis of the performance, followed by the design of tests that test capability (not the ability to guess multiple choice answers) related to the desired performance. Then, checkpoints should be designed to allow teachers to track how well they are progressing toward the standard. And, somehow, individual differences need to be accommodated so the test really measures, rather than blindly generating meaningless numbers for administrators to gloat over or fume about.

This is not dissimilar to a standard performance-based ISD approach that we (and many others) use to develop custom training programs. It’s frustrating to see a problem continue when the means to fix it is well understood and available.

Without some kind of test, there is no verification of capability. Performance testing is the most accurate but can be difficult to administer. We still have to decide though, is making it easier to process a large number of poor tests really a better solution?

Why Performance Tests are Better Than Knowledge Tests

Tuesday, August 25th, 2009

What is a performance test?

A performance test is essentially a checklist of key performance characteristics that define the criteria for successful performance. The checklist is used during observation of performance (or to review the result or output of performance) to assess whether the performance is acceptable. Performance tests can be used as a “gate” to determine whether performers are ready to “go solo” or simply as a way to verify capability (e.g., in a training course).
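To make the “checklist as a gate” idea concrete, here is a minimal Python sketch. The class names, the pass rule (every criterion must be observed), and the driving criteria are assumptions chosen for illustration, not a description of any particular test instrument.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One observable characteristic of successful performance."""
    description: str


@dataclass
class PerformanceTest:
    """A checklist of criteria for one task (hypothetical structure)."""
    task: str
    criteria: list

    def evaluate(self, observed):
        """Return (passed, missed) for one observation.

        `observed` maps each criterion description to True or False,
        as marked by the evaluator. Anything left unmarked counts as
        not observed. Assumed rule: the learner passes the gate only
        if every criterion was met.
        """
        missed = [c.description for c in self.criteria
                  if not observed.get(c.description, False)]
        return len(missed) == 0, missed


driving_test = PerformanceTest(
    task="Drive in traffic",
    criteria=[
        Criterion("Stops the car appropriately"),
        Criterion("Uses lanes correctly"),
        Criterion("Follows traffic signals and signs"),
    ],
)

passed, missed = driving_test.evaluate({
    "Stops the car appropriately": True,
    "Uses lanes correctly": False,
    "Follows traffic signals and signs": True,
})
print(passed, missed)
```

The list of missed criteria is the useful part: it tells the learner and the coach exactly what to practice before the next attempt.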

By contrast, a knowledge test attempts to assess the learner’s retention and recall of information or, occasionally, the application of rules.

Why I like performance tests.

Performance tests have several advantages over knowledge tests.

1. A performance test measures the right things. 

Assuming that your focus is on performance, a performance test is going to tell you what you should want to know. Specifically, it will tell you what people can do overall, and at a more granular level. It doesn’t tell you the learner “knows” the rules of the road…it tells you the learner can follow the rules of the road, steer and stop the car appropriately, use lanes correctly, follow traffic signals and signs, monitor other drivers’ movements, or any other criteria built into the test. As a benefit, it also tells you if they know the rules of the road based on whether or not they follow the rules in the course of performance!

2. A performance test defines the work and the criteria for performance.

This is not trivial. In almost every case where we have developed performance tests (including work environments where there are detailed SOPs for every task) we have created new knowledge. That is, we have identified or clarified tasks or techniques or sequences that were missing or incorrect in existing documentation. Usually, the criteria for performance we define at the task level have not been previously documented. (In many cases, employees had figured these things out but they had not been communicated or standardized.) This is valuable to the business.

Clarifying performance requirements usually also simplifies the performance. It takes some of the mystique out of “mastery” but makes it easier for all performers to perform effectively.

3. Performance tests connect training to performance.

The actual performance test instrument, as mentioned earlier, is typically a description of the work down to the task level. It includes criteria for successful performance that are as clear and objective as possible; the instrument must be able to yield consistent results when used by multiple evaluators. In many cases, the performance test is used as a job aid by learners and as a training tool by coaches and supervisors (in addition to being used for assessment).
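One simple way to check whether an instrument yields consistent results across evaluators is to have two of them score the same performance and compare their marks item by item. The sketch below uses plain percent agreement, which is my choice for illustration rather than a method prescribed here; the function name and the sample marks are hypothetical.

```python
def percent_agreement(marks_a, marks_b):
    """Fraction of checklist items on which two evaluators gave the same mark.

    Each argument is a list of True/False marks, one per checklist item,
    in the same item order.
    """
    if len(marks_a) != len(marks_b):
        raise ValueError("Both evaluators must score the same checklist items")
    if not marks_a:
        raise ValueError("At least one checklist item is required")
    matches = sum(a == b for a, b in zip(marks_a, marks_b))
    return matches / len(marks_a)


# Invented marks from two evaluators watching the same performance.
evaluator_1 = [True, True, False, True]
evaluator_2 = [True, False, False, True]

print(percent_agreement(evaluator_1, evaluator_2))
```

Items where evaluators regularly disagree are good candidates for rewording, since the disagreement usually means the criterion is not yet clear or objective enough.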

4. If the performance test is done well, it is a more accurate test of capability than a knowledge test.

Actually performing almost always requires more than simple recall of information or even application of rules. It requires putting everything together in a real situation. That includes information, use of tools/resources, situational factors, and even “noise” in the environment. Performance often happens in “real time,” whereas knowledge tests are usually off-line (or “stop time”). For example, knowing traffic laws (the “written test”) is not a good test for whether a teenager can actually drive, that is, can actually navigate through traffic and make good decisions in the moment.

5. You don’t have to hide the answers.

A performance test, including the key performance criteria, can be published to anyone. With a performance test, just because you know what is expected doesn’t mean you can do it. So there is no need to hide or randomize the questions and answers. (This is why it can be used by learners as a job aid.)

6. (There is the potential to) get work done during the testing process.

In a business situation, the performance test can often be administered by a “master performer” (who has been trained/qualified to administer performance tests). So while the learner is being tested, he or she is actually doing real work. Although it may be at a slower rate and may require an additional resource (i.e., the master performer) to evaluate it more closely than normal, it still results in output. And, you would hope that someone would be checking the work of any new (or unqualified) performer anyway, so it is not an incremental increase in resources.

7. Managing learner expectations for “going solo.”

Instead of the learner “watching and learning” with the master performer for an undetermined period of time, a specific gate is identified and the learner, master performer, and supervisor will have a clear point in time for when the learner is ready to solo.