CPIA Module 4, Section 4: Error Points and Failure Mode Analysis
MODULE 4: MEDICATION-USE SYSTEMS & ARCHITECTURE

Section 4.4: Error Points and Failure Mode Analysis

A systematic approach to proactive risk assessment. We will learn the techniques of Failure Mode and Effects Analysis (FMEA) to identify and mitigate potential system-level medication errors.


Moving Beyond Blame: Engineering Safer Systems by Proactively Identifying How They Can Break.

4.4.1 The “Why”: The Critical Shift from Reactive to Proactive Safety

For most of healthcare history, our approach to safety has been fundamentally reactive. A patient is harmed by a medication error, a sentinel event occurs, and a team is assembled to conduct a Root Cause Analysis (RCA). We investigate what happened, why it happened, and what we can do to prevent that specific error from happening again. This is a necessary and valuable process, but it has a profound limitation: a patient must first be harmed for the learning to occur. We are, in effect, driving by looking in the rearview mirror. While we must learn from our mistakes, a truly mature safety culture seeks to anticipate and prevent errors before they can ever reach a patient.

This is the philosophical leap from reactive to proactive safety. It is a shift in mindset from asking “What happened?” to asking “What could happen?” This is not speculation or guesswork; it is a structured, engineering-based discipline designed to systematically deconstruct a process and identify its hidden vulnerabilities. It acknowledges a fundamental truth that you, as a pharmacist, intuitively understand: errors are not the result of “bad people” but symptoms of “bad systems.” Well-intentioned, highly skilled professionals will make mistakes when placed in a poorly designed process with latent hazards and inadequate safeguards. Our job as informatics pharmacists is not to find better people, but to build better, more resilient systems.

Failure Mode and Effects Analysis (FMEA) is the premier tool for this proactive work. It is a systematic, team-based methodology for identifying potential failure points in a process, assessing the potential impact of those failures, and prioritizing them for mitigation. The FMEA forces us to ask a series of uncomfortable but essential questions about our workflows and technologies: How can this step fail? What would be the consequences if it did fail? How would we know it failed? How can we design the system to either prevent the failure or catch it before it causes harm? Engaging in this process is one of the most powerful and impactful activities an informatics pharmacist can lead. It is how we move from being system maintainers to system architects, actively engineering safety into the fabric of the medication-use process.

Retail Pharmacist Analogy: The New High-Risk Drug Launch

Imagine your pharmacy chain is about to launch a new, high-cost oral chemotherapy agent with a complex REMS program, significant drug interactions, and a very narrow therapeutic index. A reactive approach would be to simply launch the drug and wait for the first dispensing error, patient complaint, or insurance audit clawback to occur, and then try to fix the problem.

A proactive, FMEA-style approach is entirely different. Weeks before the launch, you assemble a team: a pharmacist who will be dispensing it, a technician who will handle the inventory and ordering, and a billing specialist. You get a whiteboard and you ask the key FMEA questions, mapping the process from prescription receipt to patient counseling:

  • “How could this fail?” (Failure Mode): “What if a prescription arrives without the required REMS authorization number? What if the technician stores this $10,000 bottle next to a look-alike, sound-alike (LASA) drug? What if the pharmacist forgets to check for interacting P-gp inhibitors during DUR? What if the patient isn’t counseled on the specific food restrictions?”
  • “What would be the consequences?” (Effects): “A missing REMS number means we can’t dispense, delaying therapy. A LASA error could be fatal. Missing the interaction could lead to severe toxicity. Failure to counsel correctly could lead to sub-therapeutic levels or side effects.”
  • “How could we prevent this?” (Mitigation): You decide to create a hard stop in the computer system that won’t allow the prescription to be processed without a value in the “REMS ID” field. You decide to store the new drug in a separate, locked cabinet with a bright red “High-Alert Chemo” sticker. You build a custom alert into the DUR system that specifically flags P-gp inhibitors when this drug is selected. You create a mandatory counseling checklist that must be initialed by both the pharmacist and the patient.

You have not waited for an error to happen. You have used a structured, proactive process to anticipate the potential failures and have built robust defenses into your workflow and technology *before* the first prescription ever arrives. This is the essence of an FMEA. It’s the engineering work of safety that happens before the assembly line starts running.
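
The “hard stop” mitigation from the analogy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any pharmacy system's actual API: the `Prescription` class, its `rems_id` field, and the `HardStopError` exception are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


class HardStopError(Exception):
    """Raised when a prescription must not proceed (hypothetical name)."""


@dataclass
class Prescription:
    drug_name: str
    requires_rems: bool
    rems_id: Optional[str] = None  # hypothetical "REMS ID" field


def check_rems_hard_stop(rx: Prescription) -> None:
    """Block processing when a REMS drug has no authorization number."""
    if rx.requires_rems and not (rx.rems_id or "").strip():
        raise HardStopError(
            f"{rx.drug_name}: REMS ID is required before processing."
        )


# A REMS drug without an ID is stopped; one with an ID passes silently.
try:
    check_rems_hard_stop(Prescription("NewOralChemo", requires_rems=True))
    blocked = False
except HardStopError:
    blocked = True

check_rems_hard_stop(Prescription("NewOralChemo", True, rems_id="ABC123"))
```

The design point is that the check raises rather than warns: a forcing function removes the option of proceeding, which is what separates a hard stop from an alert that can be dismissed.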

4.4.2 The Language of FMEA: Defining the Core Concepts

To effectively conduct an FMEA, you must be fluent in its specific terminology. Each term has a precise meaning that guides the analytical process. Mastering this vocabulary is the first step to leading a successful analysis.

Masterclass Table: The FMEA Lexicon
| Term | Definition | Simple Example (Dispensing Amoxicillin) |
| --- | --- | --- |
| Failure Mode | The specific way in which a process step can fail. It answers the question, “How could this go wrong?” There can be multiple failure modes for any single process step. | At the step “Select drug from shelf,” a failure mode is: “Pharmacist selects Amoxil instead of Augmentin (LASA error).” |
| Effect | The potential consequence(s) of the failure mode, assuming it is not detected. It describes the impact on the patient or the process. It answers, “If this failure happens, what is the worst-case outcome?” | The effect of the LASA error is: “Patient receives wrong medication, leading to treatment failure for their infection.” |
| Cause | The underlying reason or root cause that could lead to the failure mode. It answers the question, “Why would this failure happen?” | Potential causes for the LASA error include: “Bottles are stored next to each other on the shelf,” “Pharmacist is distracted by a phone call,” “Confirmation bias due to similar packaging.” |
| Severity (S) | A numerical rating (typically on a 1-10 scale) of the seriousness of the effect of the failure. A score of 10 represents catastrophic harm (e.g., death), while a 1 is negligible. | For the amoxicillin/Augmentin error, the Severity might be rated a 7 (significant harm, treatment failure, potential for adverse reaction). |
| Occurrence (O) | A numerical rating (1-10) of the likelihood that the cause of the failure will occur. A 10 means it’s almost certain to happen, while a 1 is extremely unlikely. | If the drugs are stored side-by-side and the pharmacy is always chaotic, the Occurrence might be rated a 6 (moderately high). |
| Detection (D) | A numerical rating (1-10) of the likelihood that the failure will be detected before it reaches the patient. This scale is inverted: a 10 means there is absolutely no check in place to catch the error, while a 1 means the error is certain to be caught. | If the only check is the pharmacist’s own self-check, Detection might be rated an 8 (very unlikely to be caught). If barcode scanning is used at the point of dispensing, the rating might drop to a 2. |
| Risk Priority Number (RPN) | The mathematical product of the three scores: $$RPN = S \times O \times D$$. The RPN is a calculated value (ranging from 1 to 1000) that is used to prioritize the identified risks. It is not an absolute measure of risk, but a powerful tool for ranking and focusing improvement efforts. | Without barcode scanning: $$RPN = 7 \times 6 \times 8 = 336$$. With barcode scanning: $$RPN = 7 \times 6 \times 2 = 84$$. This calculation makes the risk-reducing impact of the barcode scanner quantitatively visible. |
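
The RPN arithmetic from the lexicon can be reproduced with a short Python sketch. The function name `rpn` and the range check are illustrative choices rather than part of any FMEA standard; the scores are the ones from the amoxicillin/Augmentin example.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Return the Risk Priority Number, S x O x D, for scores on a 1-10 scale."""
    for name, score in (("S", severity), ("O", occurrence), ("D", detection)):
        if not 1 <= score <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {score}")
    return severity * occurrence * detection


without_barcode = rpn(7, 6, 8)  # only the pharmacist's self-check
with_barcode = rpn(7, 6, 2)     # barcode scan at the point of dispensing
reduction = 1 - with_barcode / without_barcode

print(without_barcode, with_barcode, f"{reduction:.0%}")  # 336 84 75%
```

Because the RPN is a product, improving a single factor (here Detection, from 8 to 2) scales the whole number by the same ratio, which is why the barcode scanner alone cuts the RPN by three quarters.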

4.4.3 The FMEA Process in Action: A Step-by-Step Guide

An FMEA is not just a brainstorming session; it is a structured, disciplined process. Following a consistent methodology ensures that the analysis is thorough, objective, and leads to actionable outcomes. As an informatics pharmacist, you will often be the project manager and facilitator for these initiatives.

Step 1: Assemble the Multidisciplinary Team and Define the Scope

This step mirrors the first two steps of workflow mapping. An FMEA cannot be done in isolation. You need a cross-functional team of frontline experts who live the process every day. You also need a tightly defined scope. Trying to conduct an FMEA on “The Entire Inpatient Medication Process” is doomed to fail. A good scope is “The Process for Dispensing and Delivering First Doses of Oral Medications from the ADC.”

Step 2: Map the Process

You cannot analyze the failure modes of a process you do not understand. The “Current State” workflow map you learned to create in the previous section is the required foundation for any FMEA. Each process step on that map becomes a row in your FMEA worksheet.
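
One way to picture “each process step becomes a row” is a small data structure. The sketch below is an assumption about how a worksheet could be represented, not a prescribed format; `WorksheetRow` is a hypothetical name, and the step names are borrowed from the STAT First Dose IV example later in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorksheetRow:
    phase: str
    process_step: str
    # Left empty here; the team fills this in during Step 3.
    failure_modes: List[str] = field(default_factory=list)


# Steps read off the workflow map, in process order (illustrative subset).
workflow_map = [
    ("Ordering", "Provider enters order in CPOE"),
    ("Verification", "Order appears in pharmacy queue"),
    ("Compounding & Delivery", "Label prints in IV room"),
    ("Compounding & Delivery", "Tech delivers dose to unit"),
    ("Administration", "Nurse performs BCMA"),
]

worksheet = [WorksheetRow(phase, step) for phase, step in workflow_map]
```

Starting from the map guarantees that no step is analyzed out of order or silently skipped.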

Step 3: For Each Process Step, Brainstorm Potential Failure Modes

Go through the map step-by-step. For the step “Nurse selects medication from ADC,” the team brainstorms all the ways it could go wrong: “Nurse selects wrong drug from a list,” “Nurse selects wrong patient,” “Nurse bypasses the system and uses the ‘override’ function inappropriately,” “Nurse selects correct drug but wrong strength.” All of these are documented.

Step 4: For Each Failure Mode, Identify Effects and Causes

For the failure mode “Nurse selects wrong drug,” the team identifies the potential effect (“Patient receives wrong medication, potential for adverse drug event or treatment failure”) and the potential causes (“Look-alike drug names on the screen,” “Distractions at the ADC,” “Multiple alerts leading to alert fatigue”).
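
Steps 3 and 4 together produce a nested structure: one process step, several failure modes, and for each failure mode its effects and causes. A minimal sketch follows, assuming a hypothetical `FailureMode` class; the text comes from the ADC example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureMode:
    description: str
    effects: List[str] = field(default_factory=list)
    causes: List[str] = field(default_factory=list)


step = "Nurse selects medication from ADC"
failure_modes = {
    step: [
        FailureMode(
            "Nurse selects wrong drug from a list",
            effects=["Patient receives wrong medication"],
            causes=[
                "Look-alike drug names on the screen",
                "Distractions at the ADC",
                "Multiple alerts leading to alert fatigue",
            ],
        ),
        FailureMode("Nurse selects wrong patient"),
        FailureMode("Nurse uses the override function inappropriately"),
        FailureMode("Nurse selects correct drug but wrong strength"),
    ]
}

# Failure modes that still need effects and causes before scoring can begin.
incomplete = [
    fm.description
    for fm in failure_modes[step]
    if not (fm.effects and fm.causes)
]
```

A check like `incomplete` gives the facilitator a quick view of which brainstormed items are not yet ready for Step 5.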

Step 5: Score the Severity, Occurrence, and Detection

This is where the team uses a predefined scoring matrix (see example below) to assign a numerical rating to each factor. This is a consensus-driven process. The facilitator’s job is to guide the discussion to an agreed-upon score for S, O, and D for each line item.
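
Consensus scoring is a facilitated conversation, not a calculation, but a facilitator may find it useful to see where the team disagrees most. The sketch below is one possible aid and is entirely an assumption: the idea of collecting individual votes, the `summarize_votes` helper, and the `max_spread` cutoff of 2 are all hypothetical.

```python
from statistics import median
from typing import List, Tuple


def summarize_votes(votes: List[int], max_spread: int = 2) -> Tuple[float, bool]:
    """Return (median vote, needs_discussion) for one factor's 1-10 votes."""
    if not votes:
        raise ValueError("at least one vote is required")
    if any(not 1 <= v <= 10 for v in votes):
        raise ValueError("votes must be between 1 and 10")
    spread = max(votes) - min(votes)
    return median(votes), spread > max_spread


severity_votes = [7, 8, 7, 7]   # team largely agrees
detection_votes = [3, 8, 9, 4]  # wide disagreement worth discussing

print(summarize_votes(severity_votes))   # (7.0, False)
print(summarize_votes(detection_votes))  # (6.0, True)
```

The median is only a starting point for discussion; the recorded score should still be the one the team agrees on.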

Step 6: Calculate the RPN and Prioritize

Multiply S × O × D to get the RPN for each failure mode. Sort the entire list from the highest RPN to the lowest. This immediately draws the team’s attention to the highest-risk parts of the process. While not a strict rule, any RPN over a threshold agreed by the team (e.g., 100 or 150) is typically targeted for action, as is any failure mode with a Severity score of 9 or 10, regardless of its RPN.
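
The prioritization rule can be sketched as a sort plus a two-part filter. The failure modes and scores below are hypothetical, chosen so that one item has a low RPN but a Severity of 9; the threshold of 100 is one of the example values from the text.

```python
RPN_THRESHOLD = 100  # example threshold; set by the organization

# (failure mode, S, O, D) with hypothetical scores
items = [
    ("Wrong strength selected", 4, 5, 4),
    ("Wrong drug selected", 7, 6, 8),
    ("Wrong patient selected", 9, 2, 3),
]

scored = [(name, s, s * o * d) for name, s, o, d in items]
ranked = sorted(scored, key=lambda row: row[2], reverse=True)

# Target anything above the RPN threshold OR with Severity 9-10.
targeted = [
    name for name, s, rpn in ranked if rpn > RPN_THRESHOLD or s >= 9
]

for name, s, rpn in ranked:
    print(f"{rpn:>4}  S={s}  {name}")
```

Here “Wrong patient selected” ranks last by RPN (54) yet is still targeted because of its Severity, which is exactly the case the severity override exists to catch.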

Step 7: Develop and Implement an Action Plan

For each high-priority failure mode, the team brainstorms concrete, actionable interventions to either eliminate the cause or improve detection. For “Nurse selects wrong drug due to LASA names,” the action plan might be: “Implement Tall Man Lettering on all ADC screens” and assign the informatics pharmacist to complete this task by a specific date.
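
An action plan is only useful if each item has an owner and a date. A minimal sketch of such a record follows; the `ActionItem` class and the due date are hypothetical placeholders, while the action and owner come from the Tall Man Lettering example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    failure_mode: str
    action: str
    owner: str
    due: date
    done: bool = False

    def is_overdue(self, today: date) -> bool:
        """An open item is overdue once today is past its due date."""
        return not self.done and today > self.due


item = ActionItem(
    failure_mode="Nurse selects wrong drug due to LASA names",
    action="Implement Tall Man Lettering on all ADC screens",
    owner="Informatics pharmacist",
    due=date(2030, 1, 31),  # placeholder date for illustration
)

print(item.is_overdue(date(2030, 2, 1)))  # True
```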

Step 8: Re-evaluate the Risk and Monitor

After the action plan is implemented, the team goes back and re-scores the process. With Tall Man lettering implemented, the Occurrence of a LASA error might drop from a 6 to a 3. The Detection might also improve. The new, lower RPN demonstrates the quantitative impact of the safety improvement. The process is then monitored over time to ensure the changes are sustained.
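
The re-scoring step can be illustrated with a before-and-after comparison. Only the Occurrence change (6 to 3) comes from the text; the Severity of 8 and the Detection scores of 5 and 4 are hypothetical values chosen for illustration.

```python
def rpn(s: int, o: int, d: int) -> int:
    """Risk Priority Number: S x O x D."""
    return s * o * d


# Severity is held constant: Tall Man lettering changes how often the
# error happens, not how harmful it is when it does.
before = rpn(s=8, o=6, d=5)  # hypothetical S and D
after = rpn(s=8, o=3, d=4)   # O drops 6 -> 3; D assumed to improve 5 -> 4
reduction = 1 - after / before

print(before, after, f"{reduction:.0%}")  # 240 96 60%
```

Recording both numbers in the worksheet gives the team a simple, auditable trail of what each intervention was expected to achieve.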

Example FMEA Scoring Matrix

Having a standardized, agreed-upon scoring matrix is essential for consistency. This should be developed and approved by the organization’s quality and safety leadership.

| Score | Severity (S) | Occurrence (O) | Detection (D) |
| --- | --- | --- | --- |
| 10 | Death | Almost Inevitable (>1/day) | No check exists |
| 8 | Permanent Major Harm | Frequent (1/week) | Low chance of detection |
| 6 | Temporary Major Harm | Occasional (1/month) | Moderate chance of detection |
| 4 | Temporary Minor Harm | Uncommon (1/year) | High chance of detection |
| 2 | No Harm, Monitoring Req. | Remote (1/5 years) | High chance + forcing function |
| 1 | No Harm | Extremely Unlikely | Error is certain to be caught |
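
When the team has an estimated event rate, the Occurrence column of the matrix can be applied as a lookup. The sketch below is an assumption about how the anchors might be used: the matrix only defines anchor points, so the choice to round any in-between rate down to the nearest anchor (and to treat exactly one event per day as a 10) is illustrative, and `occurrence_score` is a hypothetical helper.

```python
# (minimum events per year, Occurrence score), highest rate first.
OCCURRENCE_ANCHORS = [
    (365.0, 10),  # almost inevitable: about daily or more
    (52.0, 8),    # frequent: about weekly
    (12.0, 6),    # occasional: about monthly
    (1.0, 4),     # uncommon: about yearly
    (0.2, 2),     # remote: about once in 5 years
]


def occurrence_score(events_per_year: float) -> int:
    """Map an estimated yearly event rate to an Occurrence score."""
    if events_per_year < 0:
        raise ValueError("event rate cannot be negative")
    for min_rate, score in OCCURRENCE_ANCHORS:
        if events_per_year >= min_rate:
            return score
    return 1  # extremely unlikely


print(occurrence_score(12))    # 6
print(occurrence_score(0.05))  # 1
```

Anchoring scores to observable frequencies in this way is what keeps two different teams from assigning a 4 and an 8 to the same event.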

4.4.4 Masterclass Example: FMEA of the “STAT First Dose IV” Workflow

Let’s now apply the full FMEA methodology to the “STAT First Dose IV” workflow that we mapped in the previous section. The table below represents the kind of detailed worksheet that your multidisciplinary team would create during an FMEA session. This is the core artifact of the process, transforming a workflow map into a prioritized risk register and action plan.

| Phase | Process Step | Potential Failure Mode | Potential Effect(s) | Potential Cause(s) | S | O | D | RPN | Recommended Action(s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ordering | Provider enters order in CPOE | Provider forgets to flag order as “STAT” | Delayed therapy for critically ill patient; potential for clinical deterioration. | Lack of awareness; rushed environment; CPOE interface is not intuitive | 7 | 5 | 9 | 315 | Build CDS alert for antibiotics on septic patients. Add “STAT” as a default for certain order sets. |
| Ordering | Provider enters order in CPOE | Provider orders wrong antibiotic for suspected source | Treatment failure, patient harm, promotion of resistance. | Lack of knowledge of local antibiogram; misdiagnosis of infection source | 8 | 4 | 3 | 96 | Embed links to the hospital antibiogram directly in order sets. Pharmacist verification is the key detection step. |
| Verification | Order appears in pharmacy queue | Order is missed or verification is delayed | Significant delay in therapy, patient harm. | Pharmacist is overwhelmed (bottleneck); no audible/visual alert for new STAT orders; pharmacist distracted with other tasks | 8 | 6 | 7 | 336 | Implement a visual dashboard with timers for STATs. Create a dedicated STAT pharmacist role. Institute an audible alert for new STATs. |
| Compounding & Delivery | Label prints in IV room | Label is not seen or is lost | Significant delay in compounding and delivery. | Printer jam/out of paper; noisy/chaotic environment; label falls behind counter | 6 | 5 | 8 | 240 | Implement IV Workflow Middleware. This provides a digital queue, eliminating the dependency on a single piece of paper. |
| Compounding & Delivery | Tech delivers dose to unit | Dose is delivered to the wrong location/unit | Massive delay, potential for wrong patient administration if picked up by another nurse. | Similar patient names on different units; manual delivery process is error-prone; tech is new or unfamiliar with hospital layout | 9 | 2 | 7 | 126 | Implement a barcode scanning system for medication delivery, where the runner scans the bag and then scans a barcode at the nursing station to confirm correct location. |
| Administration | Nurse performs BCMA | Barcode on IV bag fails to scan | Delay in administration while nurse troubleshoots. Potential for nurse to use a manual override, bypassing the safety check. | Smudged or torn label; poor quality printer; condensation on the bag obscures barcode | 5 | 7 | 3 | 105 | Institute a Quality Assurance process for pharmacy printers. Use synthetic, water-resistant label stock for IV bags. Monitor override reports to identify nurses who bypass scanning frequently. |
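
To see how this worksheet turns into a prioritized risk register, the six rows can be checked and ranked with a short Python sketch. The scores are copied from the worksheet; the threshold of 150 is one of the example values from Step 6, and the tuple layout is simply a convenient assumption.

```python
# (failure mode, S, O, D, RPN as recorded in the worksheet)
rows = [
    ("Provider forgets to flag order as STAT", 7, 5, 9, 315),
    ("Provider orders wrong antibiotic", 8, 4, 3, 96),
    ("Order is missed or verification is delayed", 8, 6, 7, 336),
    ("Label is not seen or is lost", 6, 5, 8, 240),
    ("Dose is delivered to the wrong location/unit", 9, 2, 7, 126),
    ("Barcode on IV bag fails to scan", 5, 7, 3, 105),
]

# Sanity check: every recorded RPN should equal S x O x D.
mismatches = [name for name, s, o, d, rec in rows if s * o * d != rec]

# Rank from highest to lowest RPN.
ranked = sorted(rows, key=lambda r: r[4], reverse=True)

# Flag items above the threshold, plus any with Severity 9 or 10.
RPN_THRESHOLD = 150
flagged = [
    name for name, s, o, d, rpn in ranked
    if rpn > RPN_THRESHOLD or s >= 9
]

for name, s, o, d, rpn in ranked:
    marker = "*" if name in flagged else " "
    print(f"{marker} {rpn:>3}  S={s}  {name}")
```

With a threshold of 150, the missed verification (336), the missing STAT flag (315), and the lost label (240) are flagged on RPN alone, while the wrong-unit delivery (126) is flagged only because its Severity is 9.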