Deming Not DiMaggio

Call center quality is abysmal. And it has been for the entire forty years the call center industry has been in existence. We can make cars with near-perfect quality, but after forty years, call center leaders are high-fiving each other up and down the hallway if call agents do what they are supposed to do 80% of the time.

If this is news to you, you have not had to call for tech support recently. If you need some fresh perspectives, read the comments on this article published in the NYTimes on November 26, 2011.

There is a stack of reasons why call center quality is so bad. Pick the top call driver, ask what the Required Call Components (RCCs) are…what the agents need to do in their systems and what they need to say to the customers…and ask what percentage of the time the RCCs are met just on that one top call driver. (RCCs are as essential to call center quality as specs are to manufacturing quality.)

Here is what you are likely to get back. The RCCs will not have been clearly defined or agreed on by SMEs, training, monitoring, agents and, most important, customers. On the outside chance that the RCCs are defined, you are more likely to find flying pigs than a center tracking RCC performance by call type. As for seeing the data on run or control charts, they won’t even know what those terms mean, let alone have them.

Even if a center had all that data and you produced a Pareto chart of the reasons why agents do not properly execute the RCCs, you would find the same reasons you find behind all human errors: 1) the agents weren’t completely clear on the requirements, 2) they were distracted by something else and/or just forgot, or 3) they didn’t want to (e.g., collectors often skip the required mini-Miranda warnings because they have learned they are more likely to collect if they don’t scare the debtor off at the top of the call…sad, I know).

Now at this point, with RCCs being missed right and left on call after call, a center is likely to spend a lot of time and money on a host of countermeasures, in a kind of “spray and pray” approach. Typical shotgun strategies include posting signs reminding the agents what to do, putting an incentive plan in place, and pulling the agents off the phone for training or coaching. You also might find them trying to make the workplace more enjoyable by hanging yellow smiley balloons or getting the supervisors to cook hot dogs for the agents (I am not kidding about either of these approaches, and the centers that did these things honestly felt this was an effective way to improve quality).

On the other hand, call center leaders could do what Manufacturing leaders do:  look for automation opportunities that can error-proof the process so it’s easy for the agents to do what they are supposed to and impossible to blow it even if they tried.  (Click here to read more about types of agent-assisted automation and results.)

Error-proofing or spray and pray, which do you think would be a better strategy?  Sadly, the number one call center quality improvement strategy is hope. They send an email and they hope the agents read it and remember to do it. They train and they hope. They coach and they hope. They come up with a fancy incentive comp system and they hope.  They cook hot dogs and they hope.  By choosing hope over error-proofing, is it any wonder call center experiences are a favorite whipping boy for late-night comedians?

Some of you are probably howling that I don’t know what I am talking about.  “Call centers don’t rely on hope!  They use scripts to make sure the call is right.”

Fair enough.  Scripts are better than a sharp stick in the eye, but this isn’t the stuff of Six Sigma quality.  There are dozens of failure paths that lead to the script not being executed as designed:

  1. The agents have trouble reading legalese, especially in a second language, so they don’t read it correctly or they skip it.
  2. The agents memorize the script and then don’t even notice when things change.
  3. The agents are texting or surfing, and skip it.
  4. The agents feel reading the script hurts their performance (think sales or collections, where disclosures can result in the customer backing out).
  5. The agents blast through the disclosures to reduce their handle time…if they are speaking in a second language, the accent and speed can make the disclosures almost unintelligible.

You could, of course, just fire the bottom x% of agents who weren’t doing what you wanted them to. But there are at least two problems with this. First, how do you find the agents you want to fire? You have to hire a bunch of monitors (read: inspectors…didn’t manufacturing get rid of the “end of the line” inspectors?) and they have to monitor lots of calls to get a large enough sample for each agent.

Second, focusing on and/or firing the bottom 20% is rather un-Deming-like, no? Ed Deming’s approach to quality is what transformed Japanese manufacturing from a backwater to the juggernaut that it is today.  Central to his approach was the notion that the system is the problem, not the individual workers operating in that system.  The bottom 20%, at any given time, are part of the normal variation of that system’s performance.  (As an aside, a consultant could make a lot of money taking call center leaders through Deming’s Red Bead experiment, showing how counterproductive, demoralizing, and futile it is to focus on the bottom x%.)

Speaking of Deming, I don’t know if there is a required reading list for call center leaders, but if there is one, I do know that Deming’s Out of the Crisis is not on that list.  What he wrote three decades ago in that book about the Quality crisis in American manufacturing and the way out of the wilderness is as true of and applicable to Call Centers today as it was to the automotive industry in the early 80s.  Specs.  Performance tracked over time against those specs.  Make changes to the “System” (error-proofing with automation).  Lather.  Rinse.  Repeat.  This is Quality 101.

In 1968, Simon and Garfunkel released “Mrs. Robinson,” capturing the longing for guidance of a nation in the throes of a controversial war and social unrest with the lyric, “Where have you gone, Joe DiMaggio? A nation turns its lonely eyes to you.”

The call center industry is in the throes of a crisis too, one they have been in for decades that is showing no signs of appreciable improvement and one largely of their own making.  The call center nation does not need to look towards a towering role model of kindness, grace, and dignity like Joe DiMaggio. They need to turn their eyes to the writings of Ed Deming, a results-oriented pragmatist.  Joe DiMaggio entertained the world.  Ed Deming changed it.


Free Webinar about a New Control Chart

David’s p-prime chart is an innovation that is being used in a wide variety of real-world applications. It is now included in many statistical software packages, such as Minitab and SigmaXL.

The Laney p’ Control Chart is an exciting innovation in statistical process control (SPC). The classic control charts for attributes data (p-charts, u-charts, etc.) are based on assumptions about the underlying distribution of their data (binomial or Poisson). Inherent in those assumptions is the further assumption that the “parameter” (mean) of the distribution is constant over time. In real applications, this is not always true (some days it rains and some days it does not). This is especially noticeable when the subgroup sizes are very large. Until now, the solution has been to treat the observations as variables on an individuals chart. Unfortunately, this produces flat control limits even if the subgroup sizes vary. David B. Laney developed an innovative approach to this situation, which has come to be known as the Laney p’ chart (p-prime chart). It is a universal technique that is applicable whether the parameter is stable or not.
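For readers who want to see the arithmetic, here is a minimal Python sketch of the p-prime calculation as it is commonly described (and as packages such as Minitab implement it): standardize each subgroup proportion, estimate the between-subgroup variation sigma_z from the moving ranges of those z-scores, and scale the ordinary p-chart limits by that factor. The data and function name below are made up for illustration; see David’s published work or your statistical package for the authoritative details.

```python
import numpy as np

def laney_p_prime_limits(defectives, subgroup_sizes):
    """Sketch of Laney p' limits: ordinary p-chart limits scaled by sigma_z,
    the short-term variation of the standardized subgroup proportions."""
    x = np.asarray(defectives, dtype=float)
    n = np.asarray(subgroup_sizes, dtype=float)

    p = x / n                                      # subgroup proportions
    p_bar = x.sum() / n.sum()                      # overall proportion
    sigma_p = np.sqrt(p_bar * (1 - p_bar) / n)     # binomial sigma per subgroup

    z = (p - p_bar) / sigma_p                      # standardized proportions
    sigma_z = np.mean(np.abs(np.diff(z))) / 1.128  # average moving range / d2

    ucl = p_bar + 3 * sigma_z * sigma_p            # limits widen when sigma_z > 1
    lcl = np.clip(p_bar - 3 * sigma_z * sigma_p, 0.0, None)
    return p, p_bar, lcl, ucl

# Made-up daily data with large, varying subgroup sizes
rng = np.random.default_rng(1)
n = rng.integers(2000, 5000, size=25)
p_true = np.clip(0.04 + rng.normal(0, 0.01, size=25), 0.01, 0.08)
x = rng.binomial(n, p_true)
p, p_bar, lcl, ucl = laney_p_prime_limits(x, n)
print(round(p_bar, 4), np.round(ucl[:3], 4))
```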

About Your Presenter, David B. Laney


David B. Laney worked for 33 years at BellSouth as Director of Statistical Methodology. He was a pioneer of TQM, DOE, and Six Sigma at BellSouth. David’s p-prime chart is an innovation that is being used in a wide variety of areas. It is now included in many statistical software packages, such as Minitab and SigmaXL. David is enjoying retirement with his family in the Birmingham, Alabama area.

Date: Wednesday, September 28, 2011

Session #1, 1:00 PM Eastern Time. Click here to register.
Session #2, 7:00 PM Eastern Time. Click here to register.

 

Update

Click here to view a video recording of David’s webinar.


The Problem with Swiss Army Knife Control Charting

The skewed distribution problem is exacerbated by using I-charts.

I’m an advocate of using the I-chart as the default control chart. If I am teaching statistical process control (SPC) and can only teach one chart, the I-chart is always the one that I teach. It’s the only control chart I cover in my Lean Six Sigma Green Belt training. It’s the only chart that I teach in Process Excellence Leadership training. It’s the only chart I use if the data I’m looking at are reasonably close to symmetric (note that I didn’t say “normal”), unless I have some compelling need for greater sensitivity. I teach that the I-chart is the “Swiss Army knife” of control charts.

But I still sometimes use other control charts.

The Problem

Organizations don’t do SPC for the fun of it. They do it because it helps them achieve their goals. Organizations exist to produce things of value for the benefit of customers, investors, and employees. They do this by transforming inputs into outputs of higher value via processes. They can do this better if they minimize variability of outcomes, which can best be accomplished by controlling the sources of variation in the inputs and processes. This is where SPC comes in. SPC is a methodology that uses statistical guidelines to help separate “special cause” and “common cause” variation. If a special cause of variation exists, it signals the need to act. Special cause variation is defined as a change of such a large magnitude that its cause can probably be identified if looked for at once. SPC operationally defines such a change as a measurement result more than 3 standard deviations from the process mean for whatever process metric is being monitored.

A problem might exist if the process generates measurements that are highly skewed, even when it is not being influenced by special causes of variation. Such processes are quite common in the real world. For example, nearly all measurements produced by geometric dimensioning and tolerancing are skewed, as are measurements of time-based phenomena such as those encountered in service industries, including healthcare and hospitality. Highly skewed distributions produce a relatively high percentage of results more than 3 standard deviations from the mean even if no special causes exist. In other words, they produce many “false alarms” that will trigger a search for a problem when there is no problem. The false alarms may even lead to tampering, thereby causing a stable process to become unstable.
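To put a number on the false-alarm problem, here is a small simulation (a sketch with made-up data, not a real process): it draws from an exponential distribution, which is perfectly stable but strongly skewed, and counts how often individual values land beyond the mean plus three standard deviations. A normal process puts roughly 0.1 percent of its points out there; the skewed process puts closer to 2 percent there, every one of them a false alarm.

```python
import numpy as np

rng = np.random.default_rng(42)

# A stable but skewed process (exponential) vs. a stable normal process
skewed = rng.exponential(scale=10.0, size=100_000)
normal = rng.normal(loc=10.0, scale=10.0, size=100_000)

def false_alarm_rate(x):
    """Fraction of individual values beyond the mean + 3 standard deviations."""
    return np.mean(x > x.mean() + 3 * x.std(ddof=1))

print(f"normal data: {false_alarm_rate(normal):.3%}")   # roughly 0.13%
print(f"skewed data: {false_alarm_rate(skewed):.3%}")   # roughly 1.8%, all false alarms
```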

I-Charts Don’t Solve the Problem

The skewed distribution problem is exacerbated by using I-charts. I-charts are relatively insensitive to moderate departures from normality, and very insensitive if the non-normality still produces a symmetric distribution. But for the data described above, this is not the case. If you use the I-chart for these data you will experience many false alarms. It’s just that simple.

The problem is to determine if a process is or is not being influenced by special causes of variation. A process distribution might appear as skewed because of special cause outliers, or because it naturally produces skewed data. The I-chart treats all data beyond 3 sigma as outliers; it doesn’t help you separate the natural, common cause process outcomes from special cause outcomes. Is the point beyond 3 sigma an outlying chicken, or a common cause egg? I.e., is the process being influenced by special causes, or only common causes? If the process data are naturally skewed you can’t answer this question using an I-chart.

A Simple Solution

The solution that I recommend is to begin your investigation with averages charts, also known as x-bar charts. Averages tend to have distributions that are approximately normal, even if the individual values are skewed. This means that, for a process with a skewed distribution that is not influenced by special causes, averages are much more likely to produce results that stay within 3 standard deviations of the mean than I-charts. It’s the best of both worlds: few false alarms, but still sensitive to special causes. If you have a nice run of subgroup averages without a special cause, plot a histogram of the data and see if the distribution looks skewed or symmetric. If the latter, you can use I-charts with confidence. If the former, stick with averages charts, or find a statistician or Master Black Belt to help you find a more advanced solution.
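Here is a quick illustration of the idea, again with simulated data rather than anything from a real process: the same skewed individual values are grouped into subgroups of five, and the subgroup averages trip the three-sigma limit noticeably less often than the individuals do. The false-alarm rate keeps shrinking toward the nominal 0.135 percent as the subgroup size grows.

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.exponential(scale=10.0, size=100_000)    # stable but skewed individuals

def beyond_3_sigma(values):
    """Fraction of points above the mean + 3 standard deviations."""
    return np.mean(values > values.mean() + 3 * values.std(ddof=1))

# Individuals chart vs. an averages chart built from subgroups of five
xbar = x.reshape(-1, 5).mean(axis=1)
print(f"individuals beyond 3 sigma:       {beyond_3_sigma(x):.3%}")     # about 1.8%
print(f"subgroup averages beyond 3 sigma: {beyond_3_sigma(xbar):.3%}")  # roughly half that
```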

Stable Does Not Mean Normal

Before ending this article, I’d like to address another pet peeve of mine. I believe that too many teachers of SPC obsess on the need for normality. They confuse normality with the absence of special causes, also known as statistical control. I usually attribute this misunderstanding to a lack of experience with the real world, where normal distributions are so rare as to be virtually non-existent. By insisting on normality we encourage tampering and all of the problems associated with this approach to “process management.”

On the other hand, I am also impatient with people who insist that all non-normality be ignored. These individuals advocate using I-charts in all situations, regardless of the risk of false alarms. This attitude may also be due to a lack of experience. However, I’ve seen SPC lose its credibility when concerned process owners look for special causes over-and-over again without finding them. Like the boy who cried “Wolf!”, out-of-control signals become something to ignore. Eventually so does SPC.

My approach, which favors the I-chart but doesn’t make its use dogma, provides a rational middle ground.


The Six Sigma Knowledge Gap

I will present a way of measuring ignorance; a simple-to-compute statistic which highlights the fact that there is always something to learn about how to improve a given process.

Statistical probability should be used only when we lack knowledge of the situation and cannot obtain it at a reasonable cost.

I recently attended a presentation that used a control chart. The control chart showed a process in statistical control at about an 8-percent reject rate. The presenter noted that the process was stable and went on with her presentation. I barely avoided shouting that, while stability is nice, an 8-percent reject rate is not acceptable. The 8-percent level represents a certain amount of ignorance about the process, a level I find unacceptable. The problem is that the presenter didn’t think of it that way at all. To her, 8 percent represented a considerable accomplishment.

This blog is for those of you who, like me, want to scream that, as long as improvement is economically justified, “It’s never good enough!” I will present a way of measuring ignorance; a simple-to-compute statistic which highlights the fact that there is always something to learn about how to improve a given process.

First, let’s take a look at the philosophy that underlies statistics. In his book, The Art of Thinking, philosopher Leonard Peikoff wrote, “Statistics are applicable only when: 1. You are unavoidably ignorant about a given concrete; 2. Some action is necessary and cannot be deferred.”

In other words, if you’re trying to determine a course of action, your best bet is to acquire knowledge, not to blindly use statistics to guide you. While it’s true that we don’t want to tamper with a stable process, it’s also true that we don’t want to settle for anything other than the best level of quality we can provide. Control charts guide us away from tampering, but they don’t tell us how we can improve the process. Only new knowledge can do that.

Statistical probability should be used only when we lack knowledge of the situation and cannot obtain it at a reasonable cost. If we have direct knowledge about a situation, or can get it through a bit of research or by consulting someone who has it, then we should not blindly follow the statistical probabilities. In other words, if you know something about the situation, you should act on what you know.

Statistics are an expression of ignorance. They should only be used when ignorance is unavoidable, i.e., when knowledge is absent and unobtainable. Statistics are not knowledge. They are a calculation that permits action in the face of ignorance. This is the critical point missed by the presenter. She assumed that if she simply stated the level of ignorance, further improvement was not necessary.

Properly used, statistics measure ignorance or, conversely, knowledge. For example, assume that you want to buy a new piece of production machinery. Think of the important variables in the process as a list of 100 items, all of them unknown. You begin by creating a list of those items you believe to be important and prepare a plan to control as many of these items as possible. Let’s say you start with 75 items. Assuming that every item on your list is actually an important variable, these 75 items are special causes–things that affect your process and must be controlled. The remaining 25 items are common causes of variation, unknown to you but also important causes of process variability even though any one of these causes will have only a small effect.

From this starting point, you conduct a process capability study and, using statistics, quantify your knowledge as explaining all but +/-0.003″ of variation in the process. There are some out-of-control data points. After investigating these, you identify five more important variables. The process stabilizes, i.e., all of the remaining points on the control chart fall within the control limits.

Let’s assume that the control limits for the X-chart are now +/-0.002″. In philosophical terms, this means that you acquired +/-0.001″ of new knowledge, but +/-0.002″ of ignorance still remains. As time goes by and you learn more, the control limits will measure the amount of your learning. If in a year the control limits are at +/-0.001″, then you’ve learned enough to reduce the process variation by 50 percent.

As soon as you acquire this knowledge, the previous statistics become irrelevant. Gaining knowledge is the equivalent of converting special causes into common causes. This is like discovering more and more items on the list of things that cause your process to vary. You may never discover every item on the list, but with statistics to help you keep score, it’s fun to try. One way to make it even more fun is to plot a “knowledge chart.” Here’s how it works:

- Record the process standard deviation from your most recent process control chart, for example, S0 = 10.

- For each subsequent complete control chart, compute the process standard deviation, for example, S1 = 9.

- Compute your relative knowledge, K, as K = 100% x (S0 - S1)/S0.

For our example, K = 100% x (10 - 9)/10 = 10%.

As you reduce your ignorance to zero, the knowledge measure will go to 100 percent. It’s a fun way to keep track of your quality progress!
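If you want to automate the scorekeeping, a few lines of Python will do it. This is just a sketch using the made-up numbers from the example above:

```python
def knowledge_gained(s0, s1):
    """Relative knowledge K = 100% x (S0 - S1) / S0, where S0 comes from the
    baseline control chart and S1 from the most recent chart."""
    return 100.0 * (s0 - s1) / s0

# The running "knowledge chart": one point per completed control chart
sigmas = [10, 9, 7.5, 6]                      # process standard deviations over time
print([knowledge_gained(sigmas[0], s) for s in sigmas])   # [0.0, 10.0, 25.0, 40.0]
```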



Six Sigma, Data Mining and Dead Customer Accounts

The quality profession must focus on things done right.

In the February issue of Quality Digest, Joseph M. Juran pointed out that quality needs to “scale up” if it is to remain a viable force in the next century. In other words, quality must spread beyond its traditional manufacturing base.

A major opportunity to do just that now exists, as Six Sigma enjoys a resurgence of interest among quality professionals and “data mining” is the hot topic among information systems professionals. Six Sigma involves getting 10-fold improvements in quality very quickly; data mining uses the corporate data warehouse, the institutional “memory,” to obtain information that can help improve business performance.

The two approaches complement each other, but have differences as well. Data mining is a vaguely defined approach for extracting information from large amounts of data. This contrasts with the usual small data set analysis performed by quality engineers and statisticians. Data mining also tends to use automatic or semiautomatic means to explore and analyze the data. Again, this contrasts with traditional hands-on quality applications, such as control charts maintained by machine operators. Data mining for quality corresponds more closely with what Taguchi described as “off line” quality analysis. The idea is to tap into the vast warehouses of quality data kept by most businesses to find hidden treasures. The discovered patterns, combined with business process know-how, help find ways to do things better.

Data-mining techniques tend to be more advanced than simple SPC tools. Online analytic processing (OLAP) and data mining complement one another. OLAP is a presentation tool that facilitates ad hoc knowledge discovery, usually from large databases. Whereas data mining often requires a high level of technical training in computers and statistical analysis, OLAP can be applied by just about anyone with a minimal amount of training.

Despite these differences, both data mining and OLAP belong in the quality professional’s tool kit. Many quality tools, such as histograms, Pareto diagrams and scatter plots, already fit under the information systems banner of OLAP. Advanced quality and reliability analysis methods, such as design of experiments and survival analysis, fit nicely under the data mining heading. Quality professionals should take advantage of the opportunity to share ideas with their colleagues in the information systems area.

Example: A bank wants to know more about its customers. It will start by studying how long customers stay with the bank. This is a first step in learning how to provide services that will keep customers longer.

This should be considered a quality study. The quality profession tends to focus too much on things gone wrong: identifying failures, then looking for ways to fail less. An alternative is to examine things done right, then look for new ways to do things that customers will like even better. This proactive, positive approach is a key to quality scale-up and a ticket into an organization’s mainstream operations.

This bank’s baseline study can also be viewed in the traditional failure-focused sense of quality. If customer attrition is viewed as “failure,” a host of quality techniques can be used on the problem at once. In particular, reliability engineering methods would seem to apply. If we look at creating a new account as a “birth” and losing an account as a “death,” the problem becomes a classic birth-and-death process perfectly suited to reliability analysis. Rather than using a traditional reliability engineering method like Weibull analysis, the following example uses a method from health care known as Kaplan-Meier survival analysis.

Survival analysis studies the time to occurrence of a critical event, such as a death or, in our case, a terminated customer account. The time until the customer leaves is the survival time. Kaplan-Meier analysis allows analysis of accounts opened by customers at any time during the period studied and can include accounts that remain open when the analysis is conducted. Accounts that remain open are known as censored because the actual time at which the critical event (closing the account) occurs is unknown or hidden from us.
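For readers who want to try this on their own data, here is a rough sketch of a Kaplan-Meier analysis in Python using the lifelines library. The column names and the handful of account records are invented for illustration; the original study used its own data set and tools.

```python
import pandas as pd
from lifelines import KaplanMeierFitter

# Hypothetical account data: days each account has been open, whether it has
# closed (1) or is still open and therefore censored (0), and Web-banking use.
accounts = pd.DataFrame({
    "days_open":   [120, 340, 1400, 56, 980, 1400, 210, 760, 1300, 445],
    "closed":      [1,   1,   0,    1,  1,   0,    1,   0,   0,    1],
    "web_banking": [0,   1,   1,    0,  1,   1,    0,   1,   1,    0],
})

kmf = KaplanMeierFitter()
ax = None
for uses_web, grp in accounts.groupby("web_banking"):
    kmf.fit(grp["days_open"], event_observed=grp["closed"],
            label="Web users" if uses_web else "Non-Web users")
    ax = kmf.plot_survival_function(ax=ax)   # step curves like Figure 1

ax.set_xlabel("Days since account opened")
ax.set_ylabel("Fraction of accounts still open")
```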

Table 1 shows the first few database records (customer names are coded for confidentiality). The database of 20,000+ records is large by traditional quality standards, but tiny by data-mining standards. The bank manager wishes to evaluate the lifespan of customer accounts. She also suspects that customers who use the bank’s Web banking service are more loyal. Figure 1 shows the survival chart for that data set.

The information confirms the bank manager’s suspicion: Web users clearly stick around much longer than nonusers. However, the chart doesn’t tell us why they do. For example, Web users may be more sophisticated and better able to use the bank’s services without “hand holding.” Or there may be demographic differences between Web and non-Web users. The data were also stratified by the customer’s city, revealing many more interesting patterns and raising more questions.

An obvious question is what the bank might do to improve the overall retention rate; after all, nearly 70 percent of the customers left within 1,400 days of opening their account and, from a quality viewpoint, what “defect” could possibly be more serious than a customer leaving your business?

Quality and reliability professionals already have a full arsenal of tools and techniques for extracting information from data and knowledge from information. Long before information systems became popular, data helped guide continuous improvement and corrective action. Today, organizations are having an extremely difficult time finding qualified people to help them deal with the deluge of information. It’s time we let people in the nontraditional areas of the organization know that they have an underutilized resource right under their noses: the quality and reliability professional.


Thinking Outside the Box

The best way to think outside the box is to understand that there is no box.

A common problem with SPC is that the world appears too complicated for a statistical approach to work. In complex electronics products, for example, circuit boards may have thousands of holes and microchips may have millions of transistors. Plotting control charts of each and every dimension is clearly not feasible. What can be done?

To answer this question, consider a simple product: the box in Figure 1. How many things could we measure on this box? It turns out, a great many. Length, width and height are obvious choices. But we could also measure the diagonals on all six sides, interior diagonals front-to-back and back-to-front, linear combinations of these measurements and a great many more. We could conceivably come up with dozens of measurements on this simple box.

But–and this is critical–we don’t need all of these measurements to control the box process. The “P” in SPC stands for process, not product. When we focus on the product, we lose sight of the fact that we’re not trying to control the product. Controlling the box process may be a great deal simpler than controlling the product. And if we control the process properly, the product will take care of itself.

The statistical technique known as principal components analysis (PCA) can help us determine just what is important and what is not. Most statistical software packages can perform PCA. To illustrate the approach, I measured an assortment of boxes (see Figure 2). The measurements I obtained are shown (in inches) in Table 1.

When these data are crunched through PCA, we find that three principal components explain 99 percent of the variation in the data set: Component No. 1 explains 76.9 percent of the variation, component No. 2 explains 14.1 percent, and component No. 3 explains 8 percent. The PCA clearly shows that these three components are associated with A, B and C, respectively. Thus, the “box process” can be characterized almost entirely by controlling these three characteristics. If we do that, the other dimensions will be OK, too.
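If you want to experiment with the technique yourself, the sketch below runs PCA on a made-up set of box measurements (not the data in Table 1) using scikit-learn. Because every measured dimension is driven by the three underlying dimensions A, B and C, the first three components soak up essentially all of the variation, just as in the example above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up box measurements (inches): length A, width B, height C, plus a few
# redundant dimensions computed from them with a little gauge noise added.
rng = np.random.default_rng(3)
A = rng.uniform(4, 20, 30)
B = rng.uniform(3, 12, 30)
C = rng.uniform(2, 8, 30)
noise = lambda: rng.normal(0, 0.02, 30)
X = np.column_stack([
    A, B, C,
    np.hypot(A, B) + noise(),                # top/bottom face diagonal
    np.hypot(A, C) + noise(),                # front/back face diagonal
    np.hypot(B, C) + noise(),                # side face diagonal
    np.sqrt(A**2 + B**2 + C**2) + noise(),   # interior diagonal
])

pca = PCA()          # covariance-based PCA on the raw measurements
pca.fit(X)
print(np.round(pca.explained_variance_ratio_ * 100, 1))
# The first three components carry essentially all of the variation;
# look at pca.components_[:3] to see how they load on A, B and C.
```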

An example of using this approach in the real world involves CNC machining. A defense plant machined parts for use in guided missiles. The parts were extremely complex, with thousands of holes, cutouts, and other features on each. However, when the data were analyzed using PCA, it was determined that four principal components accounted for nearly all of the process variation. Further study showed which measurements were correlated with each principal component.

From this, the engineers determined that, for all the apparent complexity, the machining process was, in fact, quite simple. The four principal components corresponded to the machining center’s four axes of movement: X, Y and Z movement of the bed, and the rotation of the table on which the parts were mounted. SPC could be accomplished by selecting those features most difficult to position in each axis of movement. Often, a single feature could measure more than one axis; for example, a hole farthest from the “home” position in both the X and Y axes. The result: One or two control charts sufficed for the control of a process placing thousands of features.

Note that the features selected for SPC may be of little or no importance to the product itself. In fact, some parts were designed with “process control features” that were later removed from the part entirely. This makes sense when remembering that P stands for process, not product. If you keep that in mind, the complexity you face might just evaporate before your eyes.


SPC and Global Warming Part I

Global warming is complex, dynamic, important and imperfectly understood. Statistical methods are designed to help us analyze just such processes.

In the world of work, people have a natural tendency to become emotionally involved in their jobs. This is vital if they are to take pride in their accomplishments and do quality work. However, this involvement also makes it difficult for most people to see problems in their work.

SPC benefits users by directing attention toward the facts and thus promoting reason and rationality in problem solving. In this posting, I’ll put that belief to the test by using SPC to explore an issue that has generated much emotion lately: global warming. Global warming is complex, dynamic, important and imperfectly understood. Statistical methods are designed to help us analyze just such processes.

The figure of merit in this case is the Earth’s mean temperature. Figure 1 presents a run chart of the data. The numbers are coded and show the deviation of the average global temperature, in hundredths of a degree C, from the base period mean temperature. The base period is 1951 to 1980. A value of 0 indicates an annual global mean temperature equal to the base period mean, while a value of -20 indicates a temperature 0.20° C below the base period average. The chart includes data from 1866 to 1996. (There is a comment about more recent data.)

Figure 1

Putting the data into a run chart shows 131 years of temperature variation at a glance. Initially, temperatures are cooler, roughly 0.50° C below the base period mean. At the end they are warmer, roughly 0.25° C above the base period mean.

In SPC, the preferred approach to determine potential long-term process performance is to conduct a process capability analysis. In a PCA, changes are carefully controlled to determine how the process behaves under ideal conditions. Control limits are computed from the PCA data and used to identify important changes that occur in the future.

Needless to say, we can’t do this with many of our processes, including the global warming process. Instead, we are forced to deal with things the way they are. A first step is to compute the control limits for these data. To do this, we first must estimate the process average and standard deviation, s. The temptation is to compute the average and s by using a spreadsheet such as Excel, which gives us an average of 10.4 and s = 24.30.

However, computing s in this way only works if the process is in a state of statistical control. When the process’s state is unknown, it’s far better to base our sigma estimate on the “moving range.” A recent article in Quality Engineering shows that s can be estimated accurately by multiplying the median moving range by 1.047. With this method, we get s = 10.47.

The control limits are set at plus-and-minus three standard deviations from the long-term mean, giving 41.8 and -21.1 in coded measurement units. Figure 2 shows the control chart with the average and control limits drawn in. There are points below the lower control limit at the beginning of the chart, followed by points above the upper control limit at the end of the chart. This is an SPC definition of a trend.
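For anyone who wants to reproduce this kind of analysis, here is a short Python sketch of the calculation described above: estimate sigma as 1.047 times the median moving range, then set the limits at the mean plus and minus three of those sigmas. The temperature values below are placeholders; substitute the actual coded 1866-1996 series to reproduce the limits in the text.

```python
import numpy as np

def i_chart_limits(x):
    """Individuals-chart limits using the median moving range, as described
    above: sigma_hat = 1.047 * median(|x_i - x_(i-1)|)."""
    x = np.asarray(x, dtype=float)
    sigma_hat = 1.047 * np.median(np.abs(np.diff(x)))
    center = x.mean()
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

# Placeholder coded anomalies (hundredths of a degree C); not the real series.
coded_temps = [-45, -38, -52, -30, -41, -25, -18, -10, 2, 15, 22, 30]
lcl, center, ucl = i_chart_limits(coded_temps)
print(f"LCL={lcl:.1f}, mean={center:.1f}, UCL={ucl:.1f}")

# Points outside the limits would signal special causes worth investigating
print([(i, t) for i, t in enumerate(coded_temps) if t < lcl or t > ucl])
```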

We’ve now established that between 1866 and 1996, the global mean temperature measurements increased. If we compare the first 20 years on the chart to the last 20, the change is +64.4, or an increase of 0.64° C. The next step is to identify the special cause or causes behind this change. We will explore this issue in a future posting.
