Section 12.3: Predictive Analytics for Risk Stratification
Leveraging data science to proactively identify high-risk patients and enable targeted, preventative interventions.
From Reactive Problem-Solver to Proactive Population Health Manager.
12.3.1 The “Why”: The Clinical & Financial Failure of Reactive Care
In the previous section, we built a powerful dashboard. This tool gives you an exceptional, at-a-glance view of your pharmacy’s performance. It tells you, with data, what your “gut feeling” used to tell you: when you are busy, when quality is slipping, and where your bottlenecks are. This is descriptive analytics—it describes the past and the present. It is a massive leap forward, but it is still, fundamentally, a reactive tool.
You look at your dashboard and see “Patient John Smith, PDC for Eliquis: 45%.” You have successfully identified a non-adherent patient. But what does this mean? It means the patient has already been non-adherent for weeks or months. The harm—be it disease progression or an increased risk of stroke—has already begun. The “late to refill” report, the core of most pharmacy workflows, is a list of failures. You are a highly-skilled clinical detective arriving at the scene after the crime has been committed.
This reactive model is a clinical, financial, and moral failure:
- It’s Clinically Ineffective: You are intervening after the patient has failed therapy. You are trying to reverse months of poor adherence, which is infinitely harder than preventing it from the start.
- It’s Financially Wasteful: This model is the definition of “a day late and a dollar short.” The PBM has already tagged this patient as non-adherent and is already planning to penalize you with DIR fees. The health plan is already exposed to the massive cost of a potential ER visit or hospitalization that you failed to prevent.
- It Wastes Your Resources: Your “late to refill” list is long and full of “noise.” You spend hours calling 100 patients. 80 of them are “low-risk” (e.g., they just forgot and will pick it up tomorrow, or they went on vacation). 20 are “high-risk” (e.g., they stopped the med due to a severe side effect, or they can’t afford the copay). You gave equal effort to all 100, meaning your most valuable, high-touch interventions were diluted.
Predictive Analytics is the solution to this problem. It is the practice of using your rich Real-World Data (RWD) to forecast the future. Instead of looking at a dashboard that tells you “who is non-adherent,” you will build and use a tool that tells you “who is most likely to become non-adherent in the next 30 days.”
This shift moves you from a reactive detective to a proactive population health manager. You can now focus your limited, high-value pharmacist time only on the 20 high-risk patients, and intervene before they ever miss a dose. You stop calling the 80 low-risk patients, saving valuable time. This is Risk Stratification: the process of “bucketing” your patient population by their predicted risk, so you can apply the right intervention, to the right patient, at the right time. This is the single most valuable, and most advanced, application of data in modern pharmacy practice.
12.3.2 Pharmacist Analogy: Your “Clinical Gut Feeling” is Already a Predictive Model
A Deep Dive into the Analogy
You are already a master of predictive analytics. You just do it in your head, one patient at a time, and you call it “clinical intuition” or a “gut feeling.”
Imagine a new patient walks up to your counter. You’ve never met them. They hand you a new, paper prescription for Humira. Your brain instantly and subconsciously runs a predictive model. In less than two seconds, you are scanning “features” and calculating a “risk score.”
Your Internal “Algorithm” is Running:
- Feature 1: The Drug. (Specialty biologic, high cost, injectable, cold-chain, complex initiation). Risk Score: +30 points.
- Feature 2: The Patient. (Looks nervous, asking “how much does this cost?”, is clutching a pamphlet on “How to Inject”). Risk Score: +20 points.
- Feature 3: The Prescriber. (It’s from a “pill mill” doctor you don’t trust… OR… It’s from the top rheumatologist in the city, which is a good sign). Risk Score: +/- 10 points.
- Feature 4: The PBM. (You see their insurance card is a high-deductible plan you know never covers specialty drugs without a fight). Risk Score: +40 points.
Prediction (Your “Gut Feeling”): “This patient has a 90% probability of failing therapy. They are going to face a ‘prior authorization’ barrier, a ‘cost’ barrier, and a ‘side effect/fear’ barrier. They are at extremely high risk of primary non-adherence (never starting) or early discontinuation.”
Your Intervention (Targeted): Because your internal risk score is “HIGH,” you don’t just say, “This will be ready in 3 days.” You immediately switch into “white glove” service. You say, “This is a complex medication. Let’s sit down. I’m going to call your insurance right now and start the benefits investigation. Then, let’s schedule a 1-on-1 injection training with me for when it’s approved.” You just performed a targeted, high-intensity intervention based on a predictive model in your head.
The “Aha!” Moment: Data science and predictive analytics are not magic. They are simply the process of teaching a computer to do what your pharmacist brain already does. You are just codifying your “gut feeling” into a mathematical formula. The computer’s advantage is that it can run this “gut feeling” algorithm on all 10,000 of your patients at once, every single night, and give you a sorted list of who needs your help the most. This is how you scale your clinical intuition.
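To make the analogy concrete, here is a minimal Python sketch of that “codified gut feeling” as a simple additive point score. The feature names and point values are purely illustrative, taken from the scenario above; they are not a validated model:

```python
# A toy additive risk score that codifies the "gut feeling" above.
# All feature names and point values are illustrative, not validated weights.

def gut_feeling_score(patient: dict) -> int:
    """Return a 0-100 risk score for primary non-adherence."""
    score = 0
    if patient.get("is_specialty_biologic"):
        score += 30   # complex, high-cost, injectable, cold-chain therapy
    if patient.get("expressed_cost_concern"):
        score += 20   # asking "how much does this cost?"
    if patient.get("high_deductible_plan"):
        score += 40   # known coverage fights for specialty drugs
    if patient.get("trusted_specialist_prescriber"):
        score -= 10   # a good sign lowers the score
    return max(0, min(score, 100))

new_patient = {
    "is_specialty_biologic": True,
    "expressed_cost_concern": True,
    "high_deductible_plan": True,
    "trusted_specialist_prescriber": True,
}
print(gut_feeling_score(new_patient))  # 80 -> trigger "white glove" service
```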
12.3.3 How a Predictive Model is Built: A 5-Step Guide for Pharmacists
You do not need to be a data scientist or a statistician to use a predictive model. However, as a CASP, you must be an intelligent “consumer” of these models. You need to understand how they are built so you can assess their quality, identify their biases, and, most importantly, trust their outputs. The process can be broken down into five core steps.
Tutorial: The 5-Step Model-Building Lifecycle
1. Define the Target
Ask a specific, measurable, time-bound question. “What exactly are we trying to predict?”
e.g., “Non-adherence (PDC < 80%) in the next 90 days.”
2. Gather & Engineer Features
Collect all the “raw ingredients” (RWD) that might be predictive of your target.
e.g., “PDC, copay, age, diagnosis codes, # of meds.”
3. Train the Model
“Teach” the computer by showing it 100,000 past patients. The algorithm “learns” the patterns.
“Patients with high copays and depression were 4.5x more likely to fail.”
4. Test & Validate
Use the model on a new set of patients it has never seen. Is it still accurate? Is it biased?
“The model correctly predicted 85% of the non-adherent patients.”
5. Deploy & Stratify
The model is live. Run it on your current patients, assign risk scores, and build your new workflow.
“Jane Smith: Risk Score 92 (High). Joe Brown: Risk Score 14 (Low).”
Step 1 Deep Dive: Defining Your “Target Variable”
This is the most important step. “Garbage in, garbage out” starts here. You must be precise. “Predicting non-adherence” is a bad target. It’s too vague.
A good target must be:
- Specific & Binary: The patient either did or did not have the outcome. (e.g., `PDC < 0.80`).
- Time-Bound: What is the “look-forward” window? (e.g., `in the next 90 days`).
- Actionable: If you predict it, is there an intervention you can perform to stop it?
Bad Target Example: “Is this patient sick?” (Too vague, not binary, not actionable).
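To see what a good target looks like in data terms, here is a small sketch of encoding a specific, binary, time-bound target column. The table and column names (`pdc_next_90d`, etc.) are hypothetical stand-ins for whatever your own system produces:

```python
import pandas as pd

# Hypothetical patient-level table; pdc_next_90d would be computed
# from fill dates in the 90-day look-forward window.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "pdc_next_90d": [0.95, 0.62, 0.78],
})

# Specific & binary, time-bound target: PDC < 0.80 in the next 90 days.
patients["target_nonadherent_90d"] = (patients["pdc_next_90d"] < 0.80).astype(int)
print(patients)
```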
Step 2 Deep Dive: “Feature Engineering”
This is where you, the pharmacist, are more valuable than the data scientist. A data scientist might see “LISINOPRIL” and “METFORMIN.” You see “Hypertension” and “Diabetes.” A data scientist sees a 15-day gap in refills. You see a “vacation override” or a “hospitalization.”
Feature Engineering is the art of translating raw, messy RWD into clean, simple “features” (i.e., new columns in your spreadsheet) that the model can understand.
Raw Data: A list of 200 fill dates for 1 patient.
Engineered Feature: A new column called `PDC_Last_180d` with the value `0.72`.
Raw Data: A list of 15 ICD-10 diagnosis codes.
Engineered Feature: A new column called `Charlson_Comorbidity_Index` with the value `8` (a standardized score for disease burden).
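As a concrete example of feature engineering, below is a simplified sketch of turning raw fill dates into a `PDC_Last_180d` feature. It counts each covered calendar day once, but ignores real-world wrinkles like early-refill stockpiling and hospital stays, so treat it as an illustration, not a production adherence engine:

```python
import pandas as pd

def pdc_last_180d(fills: pd.DataFrame, end_date: str) -> float:
    """Simplified PDC: unique covered days / 180-day window.

    `fills` needs 'fill_date' and 'days_supply' columns; overlapping
    supplies are collapsed by counting each calendar day only once.
    """
    window_end = pd.Timestamp(end_date)
    window_start = window_end - pd.Timedelta(days=180)
    covered = set()
    for _, row in fills.iterrows():
        start = pd.Timestamp(row["fill_date"])
        for day in pd.date_range(start, periods=int(row["days_supply"])):
            if window_start <= day < window_end:
                covered.add(day)
    return round(len(covered) / 180, 2)

fills = pd.DataFrame({
    "fill_date": ["2024-01-02", "2024-02-15", "2024-04-01", "2024-05-20"],
    "days_supply": [30, 30, 30, 30],
})
print(pdc_last_180d(fills, "2024-06-30"))  # 0.67 -> 120 covered days / 180
```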
Step 3 & 4 Deep Dive: “Training” and “Validating”
This is where the “machine learning” happens. You take your giant spreadsheet of 100,000 patients (from 2023) and you “feed” it to the algorithm. The algorithm’s job is to find the hidden mathematical patterns that connect the “features” (e.g., copay, # of meds) to the “target” (non-adherence).
The “Gotcha” is Overfitting: A model can get too smart. It can “memorize” your 2023 data perfectly. But when you show it new 2024 data, it fails because it “memorized” the noise instead of “learning” the true, general pattern.
The Solution is “Test/Train Split”: You are smarter than the model. You take your 100,000-patient dataset and “hide” 20,000 of them. You train the model on 80,000 patients. Then, you test the model on the 20,000 it has never seen before. This mimics the real world. If the model is 85% accurate on the “training” data and 84% accurate on the “test” data, you have a good, generalizable model. If it’s 99% accurate on “training” but 60% on “test,” it’s “overfit” and useless.
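Here is what that test/train split looks like in practice, as a minimal scikit-learn sketch. The features and target are random stand-ins; in a real project, `X` and `y` would come from your engineered features and your defined target:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)

# Stand-in data: in practice X/y come from your engineered features.
X = rng.normal(size=(100_000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100_000) > 0).astype(int)

# "Hide" 20% of patients from the model during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

model = LogisticRegression().fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))

# A small train/test gap suggests the model generalizes;
# a large gap (e.g., 99% vs 60%) is the "overfitting" red flag.
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```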
Step 5 Deep Dive: “Deploying”
This is the handoff from “data science” to “pharmacy operations.” The data science team gives you the final, validated algorithm. Your IT team “deploys” it by building it into your pharmacy software. Every night, a process runs that takes all 10,000 of your active patients, runs them through the algorithm, and generates a new “Risk Score” (e.g., 0-100) for each one.
The next morning, you don’t look at your old “late to refill” list. You look at your new “Non-Adherence Risk Dashboard,” which is just a list of your patients, sorted from highest risk score to lowest. Your workflow has just become 100x more efficient.
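A sketch of that nightly scoring job, with toy stand-ins for the validated model and the patient feature table (all column names here are hypothetical):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy stand-ins: in practice, `model` is the validated classifier from
# Step 4, and `features` holds one row per active patient with the same
# engineered columns used in training.
rng = np.random.default_rng(0)
cols = ["pdc_prior_180d", "copay_amount", "prescriber_count"]

X_hist = pd.DataFrame(rng.normal(size=(1_000, 3)), columns=cols)
y_hist = (X_hist["pdc_prior_180d"] < 0).astype(int)  # toy target
model = LogisticRegression().fit(X_hist, y_hist)

features = pd.DataFrame(rng.normal(size=(10, 3)), columns=cols)
features.insert(0, "patient_id", range(1, 11))

# The nightly job: score every active patient, sort highest risk first.
scored = features[["patient_id"]].copy()
scored["risk_score"] = (model.predict_proba(features[cols])[:, 1] * 100).round()
print(scored.sort_values("risk_score", ascending=False))
```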
12.3.4 Masterclass Table: Common “Target Variables” to Predict in Pharmacy
The first step is deciding what to predict. Your choice of “target” will define your entire program. A model built to predict non-adherence is useless for predicting ADEs. Below are the most common and high-value targets for a CASP.
| Target to Predict | Target Definition (Example) | Why It Matters (The “Value”) | Key Predictive Features (The “Clues”) | Pharmacist Intervention |
|---|---|---|---|---|
| Primary Non-Adherence (PNA) | A new chronic prescription is e-prescribed but is never picked up (i.e., not filled within 14-30 days of prescribing). | Financial: Stops “prescription abandonment.” Huge value to Pharma. Clinical: The most dangerous adherence gap. Therapy never even starts. | High Copay, High Deductible, Prior Authorization Required, “Silent” e-Rx (no patient notification), New-to-Class (e.g., first injectable). | Proactive benefits investigation, copay card enrollment, and “Welcome Call” before the patient even knows the Rx is there. |
| Secondary Non-Adherence | A currently-adherent patient (PDC > 80%) who is predicted to drop below 80% in the next 180 days. | Financial: This is the core of all Payer / Star Rating contracts and your DIR Fee protection. Clinical: Prevents disease progression. | Past adherence drops, high # of prescribers, “vacation” overrides (a sign of disorganization), SDoH factors (low income, transport issues), Depression diagnosis. | Targeted MTM, offer Med-Sync, adherence coaching, address SDoH barriers (e.g., switch to 90-day, offer delivery). |
| Early Discontinuation (NTT Drop-off) | A patient on a new-to-therapy (NTT) specialty drug (e.g., Humira, Ozempic) who discontinues (no refill for > 30 days) within the first 90 days. | Financial: Huge value to Pharma; preserving one $5,000/mo patient is worth $60,000/year. Clinical: Prevents “therapy tourism.” | High Copay, Complex Administration (injectable), Known High-Incidence Side Effects (e.g., GI issues for GLP-1s), Lack of Initial Education. | Proactive “NTT Onboarding” program: a pharmacist call at Day 7 (side effects) and Day 21 (adherence check-in). |
| High-Cost Utilization (Hospital/ER) | Any patient predicted to have an all-cause hospitalization or ER visit in the next 30-90 days. | Financial: The “Holy Grail” for Health Plans; this is how you prove ROI on medical cost savings. Clinical: Saves lives. | Previous Hospitalization (the #1 predictor), High Comorbidity Score (e.g., CHF + COPD + CKD), High-Risk Meds (opioids, insulin), Polypharmacy (>10 meds). | High-intensity MTM, transitions of care (TCM) program, coordination with the patient’s PCP, in-home monitoring. |
| Adverse Drug Event (ADE) | A patient predicted to have a specific, high-risk ADE (e.g., a “Bleed Risk” model for patients on anticoagulants). | Clinical: Pure patient safety. Financial: Prevents costly ADE-related hospitalizations. | Polypharmacy (e.g., DOAC + Aspirin + NSAID), Age > 75, CKD diagnosis, History of GI bleed, High HAS-BLED score. | Targeted safety review: recommend discontinuing the NSAID, recommend a PPI, counsel the patient on bleed signs. |
12.3.5 Masterclass Table: The Feature Engineering “Cookbook”
This is your “cookbook” of ingredients. A predictive model is only as smart as the features you feed it. As a pharmacist, your greatest value is recommending these features. A data scientist may not know that a “vacation override” is a sign of chaos, or that a `Depression` diagnosis is one of the single highest predictors of non-adherence. You do. Your clinical insight is what makes the model smart.
| Data Source | Feature Name (Engineered) | Raw Data It Comes From | The “Clinical Insight” (Why It’s Predictive) |
|---|---|---|---|
| Pharmacy Dispensing Data | `PDC_Prior_180d` | List of all fill dates and days supply. | The #1 predictor of future behavior is past behavior. A patient with a PDC of 60% last year is almost guaranteed to be 60% this year without intervention. |
| | `Median_Refill_Gap` | List of all fill dates. | Measures volatility. A patient who refills every 30 days on the dot is stable. A patient who refills at 20 days, then 45, then 32, is chaotic and high-risk. |
| | `Copay_Amount` | Patient_Pay field from the claim. | A massive predictor. Financial toxicity is a primary driver of PNA and non-adherence. A copay > $100 is a huge red flag. |
| | `Is_Med_Sync` | A binary (Yes/No) field from your pharmacy system. | A negative predictor (a “good” feature). Patients in med-sync are organized and adherent. This feature lowers their risk score. |
| | `Vacation_Override_Count` | Count of “vacation override” codes used. | Often a proxy for disorganization or gaming the system. Can also be a proxy for SDoH issues (e.g., unstable housing, travel for work). |
| Medical Claims / EHR Data | `Polypharmacy_Count` | Count of unique medications filled in last 180d. | Measures regimen complexity. A patient on 15 meds is far more likely to be non-adherent, have ADEs, and be hospitalized than a patient on 3. |
| | `Prescriber_Count` | Count of unique NPIs on claims in last 180d. | Measures “fragmentation of care.” A patient with 5 doctors is at high risk for therapeutic duplication, ADEs, and conflicting advice. |
| | `ER_Visit_Count_Prior_6mo` | Count of medical claims with ER “place of service” codes. | A powerful predictor of future hospitalization. It’s a sign of unstable disease or poor primary care access. |
| | `Has_Depression_Dx` | A binary (Yes/No) feature based on presence of ICD-10 codes for depression (e.g., F32.x, F33.x). | One of the single most powerful predictors. Depression is strongly linked to apathy and self-neglect, which directly drives non-adherence. |
| Lab Data (from EHR) | `Last_A1c_Value` | The numeric value from the lab result. | A direct measure of disease control. An A1c > 9% is a sign of current non-adherence and a predictor of future complications (and high cost). |
| | `Last_SCr_Value` | The numeric value from the lab result. | A direct measure of renal function. Highly predictive of ADEs for renally-cleared drugs (e.g., DOACs, gabapentin). |
| SDoH Data (External) | `Median_Income_Zip` | The median income for the patient’s zip code. | A proxy for ability to pay. Low income is strongly correlated with non-adherence, poor nutrition, and high-cost utilization. |
| | `Transport_Barrier_Flag` | A binary (Yes/No) feature based on zip code data or patient surveys. | A direct predictor of non-adherence and missed appointments. If a patient can’t get to the pharmacy, they can’t be adherent. |
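As a second worked example from this cookbook, here is a rough sketch of the `Median_Refill_Gap` feature. It simply measures the spread between consecutive fill dates; a real implementation would likely group by drug and account for days supply:

```python
import pandas as pd

def median_refill_gap(fill_dates: list[str]) -> float:
    """Median days between consecutive fills; higher volatility = higher risk."""
    dates = pd.to_datetime(pd.Series(fill_dates)).sort_values()
    gaps = dates.diff().dt.days.dropna()
    return float(gaps.median())

# Stable patient: refills like clockwork, every 30 days.
print(median_refill_gap(["2024-01-01", "2024-01-31", "2024-03-01"]))  # 30.0
# Chaotic patient: 20-day, 45-day, then 32-day gaps.
print(median_refill_gap(["2024-01-01", "2024-01-21", "2024-03-06", "2024-04-07"]))  # 32.0
```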
12.3.6 A Pharmacist-Friendly Guide to Model Types (Interpreting the “Black Box”)
The algorithm itself can seem like a “black box.” While you don’t need to code the algorithm, you need to understand the logic of how it “thinks.” This helps you trust the output and explain it to stakeholders. The two most common and useful model types in pharmacy are Logistic Regression and Decision Trees.
Model 1: Logistic Regression (The Workhorse)
This is the most common, most trusted, and most interpretable model. It’s used when your “target” is binary (Yes/No). It is simply a formula that calculates the probability (from 0% to 100%) that your patient will have the “Yes” outcome. Its output is a list of “Odds Ratios” for each feature, which are incredibly easy for a clinician to understand.
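For reference, the underlying math is the logistic function, which converts a weighted sum of features into a probability; each feature’s odds ratio is simply $e$ raised to its coefficient:

$$P(\text{non-adherent}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k)}}, \qquad \text{OR}_i = e^{\beta_i}$$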
Tutorial: How to Read Logistic Regression Output (A “Non-Adherence” Model)
Your data science team gives you this report for your “Predict Non-Adherence” model. This is what it means.
| Feature | Odds Ratio (OR) | How You, the Pharmacist, Interpret This |
|---|---|---|
| `Has_Depression_Dx` | 3.5 | “Patients with a depression diagnosis are 3.5 times more likely to become non-adherent than patients without one.” (A very strong predictor). |
| `Copay_High` (> $100) | 2.8 | “Patients with a high copay are 2.8 times more likely to become non-adherent.” (Another strong predictor). |
| `Prescriber_Count` (for each 1-doctor increase) | 1.4 | “For every additional doctor a patient sees, their odds of non-adherence increase by 40%.” (A strong ‘fragmentation’ signal). |
| `Age` (for each 1-year increase) | 0.98 | “An odds ratio < 1.0 is protective. For every 1-year increase in age, the patient is 2% less likely to be non-adherent.” (i.e., older patients are more adherent). |
| `Is_Med_Sync` | 0.30 | “Patients in Med-Sync are 70% less likely to become non-adherent (1.0 – 0.3 = 0.7).” (This proves the value of your Med-Sync program). |
The model’s final “Risk Score” is simply the output of the full equation that combines *all* of these features for a specific patient.
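A hedged sketch of where those odds ratios come from in code: fit a logistic regression and exponentiate its coefficients. The data here is simulated, so the exact numbers will not match the report above:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Simulated training data; columns mirror the report above.
rng = np.random.default_rng(7)
X = pd.DataFrame({
    "Has_Depression_Dx": rng.integers(0, 2, 5000),
    "Copay_High": rng.integers(0, 2, 5000),
    "Is_Med_Sync": rng.integers(0, 2, 5000),
})
logit = 1.25 * X["Has_Depression_Dx"] + 1.0 * X["Copay_High"] - 1.2 * X["Is_Med_Sync"] - 1.0
y = (rng.random(5000) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression().fit(X, y)

# Odds ratio = e^coefficient; OR > 1 raises risk, OR < 1 is protective.
odds_ratios = pd.Series(np.exp(model.coef_[0]), index=X.columns)
print(odds_ratios.round(2))
```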
Model 2: Decision Trees / Random Forests (The Intuitive Flowchart)
This model works exactly how your brain does. It’s a series of “If-Then” questions that splits the population into smaller and smaller buckets. A “Random Forest” is simply a “forest” of hundreds of different “trees” that all vote on the final prediction (this makes it much more accurate and stable).
This is the easiest model to visualize and explain to your staff.
Visual Tutorial: A Simple Decision Tree for Predicting Non-Adherence
Imagine this flowchart is your algorithm. You start at the top with a new patient and follow the “Yes/No” path down to their final risk bucket.
- All Patients (n=10,000) | Avg. Non-Adherence: 20%
  - Split 1: Is `PDC_Prior_180d` < 85%?
    - YES (n=3,000) | Avg. Non-Adherence: 65%
      - Split 2: Is `Copay_High` = YES?
        - YES (n=1,000) | Non-Adherence: 92% → BUCKET 1: HIGH RISK
        - NO (n=2,000) | Non-Adherence: 51% → BUCKET 2: MED RISK
    - NO (n=7,000) | Avg. Non-Adherence: 5%
      - Split 2: Is `Is_Med_Sync` = NO?
        - YES (n=3,000) | Non-Adherence: 8% → BUCKET 3: MED-LOW RISK
        - NO (n=4,000) | Non-Adherence: 1.5% → BUCKET 4: LOW RISK
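For the curious, a minimal scikit-learn sketch of fitting a shallow tree like this one (toy, simulated data; `max_depth=2` limits the tree to the two levels of splits shown above, and a Random Forest would simply be hundreds of these trees voting):

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X = pd.DataFrame({
    "PDC_Prior_180d": rng.uniform(0.3, 1.0, 10_000),
    "Copay_High": rng.integers(0, 2, 10_000),
    "Is_Med_Sync": rng.integers(0, 2, 10_000),
})
# Toy outcome that loosely follows the flowchart's logic.
p = 0.05 + 0.6 * (X["PDC_Prior_180d"] < 0.85) \
    + 0.25 * X["Copay_High"] * (X["PDC_Prior_180d"] < 0.85)
y = (rng.random(10_000) < p.clip(0, 0.95)).astype(int)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))  # the "If-Then" rules
```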
12.3.7 The Grand Finale: Building the Risk-Stratified Workflow
You have a validated model. Every night, it runs and generates a “Risk Score” from 0 to 100 for all 10,000 of your patients. This is the moment of truth. How do you turn this list of 10,000 numbers into an actual, efficient workflow? You do it by creating risk-stratified “buckets” and defining a different intervention playbook for each.
This is the single most important concept: Stop treating all patients the same. Your “white glove” pharmacist-led interventions are expensive and scarce. They must be saved only for the patients who need them the most. Conversely, your “low-touch” automated reminders are cheap and scalable, but they are ineffective for high-risk patients. You must match the intensity of the intervention to the intensity of the risk.
Masterclass Table: The Risk-Stratified Intervention Playbook
| Risk Tier | Risk Score | Patient Profile (Example) | Intervention Channel | Intervention Owner | Goal of Intervention (The “Play”) |
|---|---|---|---|---|---|
| Tier 1: High Risk (“All Hands”) | Score 81-100 (e.g., top 5% of pop.) | Patient with new specialty med, high copay, depression diagnosis, and prior PDC of 60%. | High-Touch: Proactive 1-on-1 Phone Call. | Clinical Pharmacist (CASP) | Prevent Failure. “White glove” onboarding. Perform benefits investigation, enroll in copay card, schedule injection training, and schedule a 7-day follow-up call. |
| Tier 2: Medium Risk (“At Risk”) | Score 51-80 (e.g., next 15% of pop.) | Patient on 3 chronic meds who is normally adherent, but the model flagged a new high-deductible plan and 2 prescribers. | Medium-Touch: Phone Call or Secure 2-Way Text. | Clinical Technician or Student | Maintain Adherence. “Hi, this is [Name] from the pharmacy. I’m just calling to confirm you’re all set for your refills. Would you be interested in our Med-Sync program to get them all on the same day?” |
| Tier 3: Low Risk (“Stable”) | Score 0-50 (e.g., bottom 80% of pop.) | Patient on 1-2 chronic meds, on auto-refill, PDC of 99%, low copay. | Low-Touch: Automated (App/Text/IVR). | The System (Automation) | Maintain & Don’t Annoy. “Your refill is ready.” Do not waste human resources here. Let the automation work. This frees up your staff for Tiers 1 & 2. |
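A minimal sketch of turning nightly risk scores into these three tiers, using the cutoffs from the playbook table:

```python
import pandas as pd

scored = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "risk_score": [92, 67, 14, 81],
})

# Cutoffs from the playbook: 0-50 low, 51-80 medium, 81-100 high.
scored["tier"] = pd.cut(
    scored["risk_score"],
    bins=[-1, 50, 80, 100],
    labels=["Tier 3: Low (automated)", "Tier 2: Medium (tech call)",
            "Tier 1: High (pharmacist)"],
)
print(scored.sort_values("risk_score", ascending=False))
```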
12.3.8 The Ethical Imperative: Bias, Fairness, and How to Trust Your Model
A predictive model is a powerful tool. But like any tool, it can be dangerous. A model is an amplifier. If it is built on a foundation of biased data, it will not just replicate that bias; it will amplify it and give it the false authority of “objective data.” As a CASP, you are the last line of defense. You are the clinical and ethical guardian who must question the “black box.”
The Great Pitfall: Algorithmic Bias
A model is a reflection of our history, not our values. If our healthcare system has historically provided worse care to certain populations (e.g., based on race, income, or language), the data will reflect that. The model will learn this bias as a “pattern.”
The Classic Example: Cost as a Proxy for Risk.
A health system wants to predict “high-risk” patients to enroll them in a special care program. A “lazy” data scientist uses `Past_Healthcare_Cost` as a primary feature, assuming `high cost = high risk`.
The Bias: Minority and low-income populations have historically had less access to care. Therefore, they have lower past healthcare costs.
The Result: The model learns that “low-income patients are low-cost” and therefore predicts they are “low-risk.” It systematically fails to enroll the very patients who need the program the most, while over-enrolling wealthy “worried well” patients.
Your Role: You must interrogate the model. Ask your data team: “What features are in this model? Are we using race, zip code, or cost in a way that could be a proxy for social inequity? Show me the model’s performance broken down by race and income.”
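One concrete form that interrogation can take is a subgroup audit: compute the model’s sensitivity separately for each group and look for gaps. A sketch with a small, hypothetical audit table:

```python
import pandas as pd

# Hypothetical audit table: one row per patient in the test set.
audit = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],
    "actual":    [1, 1, 0, 1, 1, 0],   # truly became non-adherent?
    "predicted": [1, 0, 0, 1, 1, 1],   # flagged "High Risk" by the model?
})

def sensitivity(df: pd.DataFrame) -> float:
    """Of the true non-adherent patients, what share did the model flag?"""
    positives = df[df["actual"] == 1]
    return (positives["predicted"] == 1).mean()

# A large gap between groups is a fairness red flag worth interrogating.
for name, group_df in audit.groupby("group"):
    print(name, sensitivity(group_df))
```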
Correlation is NOT Causation:
Your model might find that patients who use your “auto-refill” service are more adherent. This is a correlation.
The Wrong Conclusion (Causation): “Auto-refill causes adherence! We should just enroll everyone!”
The Right Conclusion (Correlation): “Patients who are already organized, motivated, and adherent are the ones who sign up for auto-refill. They are two related effects of a third, unmeasured variable (the patient’s ‘motivation’).” This is called “confounding,” and it’s why you must be a skeptical, clinical expert.
How Do We Know a Model is “Good”? (A 2-Minute Guide to Validation)
When your data team “validates” the model (Step 4), they will give you a report. You don’t need to understand all the math, but you need to understand the two main “scores,” which are identical to how you think about diagnostic tests.
Tutorial: Model Validation is a Diagnostic Test
Think of your predictive model as a new, fast diagnostic test.
The “Disease”: Patient will become non-adherent.
The “Test”: The model’s prediction (“Positive” = Predicts Non-Adherent, “Negative” = Predicts Adherent).
1. Sensitivity (The “True Positive Rate”)
Definition: “Of all the patients who actually became non-adherent, what percentage did our model correctly flag as ‘High Risk’?”
Example: “Our model has a Sensitivity of 85%.”
Interpretation: “This model is excellent. It will successfully find 85 out of every 100 patients who are going to fail therapy. We will only ‘miss’ 15 of them (False Negatives).”
2. Positive Predictive Value (PPV)
Definition: “Of all the patients our model flagged as ‘High Risk,’ what percentage actually became non-adherent?”
Example: “Our model has a PPV of 40%.”
Interpretation: “This tells us about our workflow. When we call the 100 patients on our ‘High Risk’ dashboard, 40 of them will be ‘true positives’ (a good intervention), but 60 will be ‘false positives.’ This is still a massive win. You have successfully filtered 10,000 patients down to a list of 100, and you know 40% of them are true, high-risk patients. This is a highly efficient use of your time.”
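Both numbers fall directly out of a confusion matrix on the test set. A small sketch with toy labels:

```python
from sklearn.metrics import confusion_matrix

# Toy test-set results: 1 = non-adherent ("disease"), flagged ("positive test").
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # of all true failures, how many did we flag?
ppv = tp / (tp + fp)           # of all flags, how many were true failures?

print(f"Sensitivity: {sensitivity:.0%}, PPV: {ppv:.0%}")
```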
The Takeaway: You are looking for a model with high Sensitivity (it finds the “needles” in the haystack) and a reasonable PPV (your list of needles isn’t 90% hay). This combination gives you an efficient, effective, and data-driven workflow to truly practice at the top of your license.