Section 12.1: The Foundations of Healthcare Data & Analytics
From Dispensing Data to Demonstrating Value: The Pharmacist’s Role as a Data Expert.
Translating your daily workflow into a powerful source of clinical and operational insight.
12.1.1 The “Why”: From Clinical Intuition to Data-Driven Validation
As an experienced pharmacist, you are a master of pattern recognition. You operate on a highly tuned “clinical intuition” built from thousands of patient interactions. You know which patients are at high risk for non-adherence. You know which prescribers are most likely to accept your recommendations. You believe that your MTM interventions are preventing hospitalizations. You can feel the rhythm of your workflow, and you know when your team is operating efficiently and when it is stretched thin.
For decades, this clinical intuition was the primary tool of our profession. It was enough to be a good clinician. This is no longer the case.
We have entered a new era of healthcare defined by one central shift: the transition from fee-for-service (where you are paid for the act of dispensing) to value-based care (where you are paid for the outcome of your actions). In this new world, your intuition is not enough. You must be able to prove it. Payers, health systems, and manufacturers no longer pay for effort; they pay for demonstrable value. Your “feeling” that you are improving care must be translated into a hard-number report that shows a 15% increase in adherence and a 10% reduction in associated medical costs.
This module is not about replacing your intuition. It is about amplifying it, validating it, and scaling it with the power of data. Your pharmacy’s dispensing system is not just a tool for filling prescriptions; it is a rich, longitudinal data warehouse. Every prescription you fill, every clinical note you type, every patient you counsel is a data point. When aggregated, these data points tell a powerful story.
Your role as a Certified Advanced Specialty Pharmacist (CASP) is to become a “translator” — an expert who can speak both the language of clinical practice and the language of data. You are the nexus. You are the only person who can see the claims data from the payer, the clinical data from the prescriber (via the EHR), the dispensing and adherence data from your own system, and the qualitative data from the patient themselves. This unique position makes you the most valuable data expert on the healthcare team.
This foundational section is your boot camp. We will deconstruct this new responsibility into five core competencies that build upon each other:
- Real-World Evidence (RWE) & Data Sources: First, we will map the entire data landscape. You will learn to identify and understand the “raw materials” of healthcare data—from claims and EHRs to your own dispensing logs—and how they are used to generate evidence.
- KPI Dashboard Development: Next, you will learn how to use this data for internal improvement. We will build effective dashboards to monitor your own Key Performance Indicators (KPIs) for both clinical quality and operational efficiency.
- Predictive Analytics for Risk Stratification: Once you can measure the past, you will learn to predict the future. We will explore how to use data science to proactively identify high-risk patients before they fail therapy, allowing for targeted, life-saving interventions.
- Outcomes Reporting for Payers and Manufacturers: Then, you will learn to package your data for external stakeholders. We will master the art of creating reports that prove your clinical and economic value to the entities that pay for your services.
- Data Visualization and Benchmarking: Finally, you will learn to be a master communicator. We will cover how to use clear, compelling charts and graphs to tell your data’s story and how to benchmark your performance against industry standards to prove you are best-in-class.
This is the future of pharmacy. It is how you will secure your role as an indispensable provider, justify new clinical services, and, most importantly, systematically improve the health of your entire patient population. Let’s begin.
12.1.2 Pharmacist Analogy: The Pharmacy Inventory as a Data Warehouse
A Deep Dive into the Analogy
You are already a data manager. You just call it “inventory management.”
Think about your pharmacy’s drug inventory. It is a multi-million dollar physical asset that you manage with meticulous precision. You don’t “feel” like you’re low on lisinopril; you know you are, because your system data tells you. You don’t “guess” what to order for flu season; you project it based on past dispensing trends. The logic you apply to your physical inventory is the exact same logic you must now apply to your data inventory.
Let’s translate your existing skills:
Translation 1: From Physical Shelves to Data Tables
- Your Physical Inventory: Your shelves of bottles, refrigerators, and safes. This is your Data Warehouse — a central repository of information.
- A Single Drug (e.g., Atorvastatin 20mg): This is a Data Table. It’s a collection of like items. You might have a “Patients” table, a “Prescriptions” table, and a “Prescribers” table.
- A Single Pill: This is a Data Point or a Record. It’s the smallest unit. A single row in the “Prescriptions” table, representing one fill for one patient on one day.
Translation 2: From Inventory Workflows to Data Processes
- Receiving an Order (The “Tote”): When you scan in a new drug order, you are “Extracting, Transforming, and Loading” (ETL) data. You extract the data from the invoice, transform it into your system’s format, and load it onto your virtual shelf. This is data ingestion.
- Dispensing a Prescription: This is a “real-time transaction.” You are writing a new record to your data warehouse: `[Patient_ID] [Drug_ID] [Date] [Qty]`.
- Perpetual Inventory: This is your real-time database. It’s constantly being updated with every transaction.
- A Cycle Count: This is a Data Query. You are asking a specific question of your warehouse: `SELECT COUNT(*) FROM atorvastatin_20mg WHERE location = 'Shelf 3B';`.
- Checking for Expired Drugs: This is Data Cleansing and Data Integrity. You are running a query to find and remove “bad” data (records that are no longer valid).
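To make the analogy concrete, here is a minimal sketch of the same workflows using Python's built-in `sqlite3` module. The `inventory` table, its columns, and the sample rows are hypothetical and are not taken from any real pharmacy system; the point is only that a cycle count and an expiry check are ordinary queries.

```python
import sqlite3

# In-memory database standing in for the pharmacy system (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (drug_id TEXT, location TEXT, qty INTEGER, expires TEXT)"
)

# "Receiving an order": load rows onto the virtual shelf.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        ("atorvastatin_20mg", "Shelf 3B", 500, "2026-01-31"),
        ("atorvastatin_20mg", "Shelf 3B", 90, "2023-12-31"),
        ("lisinopril_10mg", "Shelf 2A", 1000, "2026-08-31"),
    ],
)

# "Cycle count": ask one specific question of the warehouse.
(cycle_count,) = conn.execute(
    "SELECT SUM(qty) FROM inventory WHERE drug_id = ? AND location = ?",
    ("atorvastatin_20mg", "Shelf 3B"),
).fetchone()

# "Checking for expired drugs": find records that are no longer valid.
# ISO-formatted dates compare correctly as plain text.
as_of = "2025-01-01"
expired = conn.execute(
    "SELECT drug_id, qty FROM inventory WHERE expires < ?", (as_of,)
).fetchall()

print("Units on Shelf 3B:", cycle_count)
print("Expired lots:", expired)
```

The queries use `?` placeholders rather than string formatting, which is the habit to build before you ever query real patient data.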
Translation 3: From Inventory Analysis to Business Intelligence
- Your “Turns” Report: This is your Key Performance Indicator (KPI) Dashboard. It tells you your operational and financial efficiency. You use it to make business decisions.
- Ordering Extra Tamiflu in October: This is Predictive Analytics. You are using historical dispensing data (RWD) to build a model that predicts future demand.
- A DEA Form 222 or a CSOS Audit: This is Reporting. You are aggregating, formatting, and presenting your data to an external stakeholder (the DEA) to prove your compliance and value. You can’t just hand them your whole inventory; you must give them a specific, validated report.
The “Aha!” Moment: You would never manage your multi-million dollar drug inventory with “intuition” and “anecdotes.” You use a rigorous, data-driven system. Your patient data is an asset of equal, if not greater, value. This module teaches you to be the data-driven “inventory manager” for your patient population. You already have the skills; you just need to learn the new vocabulary.
12.1.3 Deep Dive: The Landscape of Real-World Data (RWD) and Real-World Evidence (RWE)
This is the core of our new discipline. We must first understand the “raw materials” we work with. The entire field of data analytics is built on this foundation.
The Core Definitions: RWD vs. RWE
These two terms are often used interchangeably, but they are critically different. As an expert, you must know the distinction.
- Real-World Data (RWD): The Raw Material. This is the “data” itself. RWD is defined as data relating to patient health status and/or the delivery of health care, collected from a variety of sources outside of traditional, highly controlled Randomized Controlled Trials (RCTs). It is the “messy” data from the real world. Think of RWD as the raw logs, the dispensing records, the lab values, the claims forms. It’s just a collection of facts.
- Real-World Evidence (RWE): The Finished Product. This is the “insight” you generate from the raw data. RWE is defined as the clinical evidence regarding the usage and potential benefits or risks of a medical product or intervention, derived from the analysis of RWD. Think of RWE as the published report, the clinical conclusion, the answer to a question (e.g., “Our analysis of RWD shows that patients in our MTM program had a 15% higher adherence rate.”).
In short: You analyze RWD to generate RWE.
The Great Debate: RWE vs. Randomized Controlled Trials (RCTs)
The “gold standard” of clinical evidence has always been the RCT. So why do we even need RWE? Because RCTs and RWE answer two fundamentally different questions: Efficacy vs. Effectiveness.
- Efficacy (The RCT Question): “Can this drug work under perfect, ideal, laboratory-like conditions?” RCTs use highly selective, “clean” patient populations (e.g., no comorbidities, perfect adherence) to prove a drug is safe and effective for FDA approval.
- Effectiveness (The RWE Question): “Does this drug actually work in the real world?” RWE studies use “messy” patient populations (e.g., 80-year-olds with 10 comorbidities, poor adherence, multiple prescribers) to see how the drug performs in your actual patient population.
As a pharmacist, you live in the “effectiveness” world. RWE is your domain. An RCT tells you a new DOAC is effective; RWE tells you if it’s still effective in your patient who also takes carbamazepine and misses 10 doses a month.
Masterclass Table: RWE vs. RCT
| Metric | Randomized Controlled Trial (RCT) | Real-World Evidence (RWE) Study |
|---|---|---|
| Primary Question | Efficacy: “Can it work?” | Effectiveness: “Does it work?” |
| Study Design | Experimental, Interventional. Always prospective. Randomization and blinding are key. | Observational. Usually retrospective (using past data), but can be prospective (registries). |
| Patient Population | Homogeneous. Very strict inclusion/exclusion criteria (e.g., “non-smoking males, 40-50, no comorbidities”). | Heterogeneous. “All comers.” Represents your actual, “messy” patient population (e.g., all ages, all comorbidities, polypharmacy). |
| Internal Validity | Very High. Randomization minimizes bias and confounding. High confidence that the drug caused the outcome. | Low to Moderate. High risk of bias and confounding (e.g., did the statin save the patient, or were they just more health-conscious?). Requires advanced statistical adjustment. |
| External Validity (Generalizability) | Very Low. Results may not apply to your 80-year-old female patient with renal failure, because she would have been excluded from the trial. | Very High. The results are directly applicable to your real-world patient population because they were derived from it. |
| Data Collection | Proactive, manual, and extremely expensive. Case Report Forms (CRFs) capture specific, pre-defined data points. | Retrospective, often automated. Uses data that was already collected for routine clinical care or billing. Relatively inexpensive. |
| Pharmacist’s Role | As a clinical trial investigator, you ensure protocol adherence. | As a practitioner-analyst, you generate and interpret RWE to guide care for populations. |
A Comprehensive Guide to Real-World Data (RWD) Sources
To generate RWE, you must be a master of the raw materials (RWD). These sources are not mutually exclusive; their true power is unlocked when they are linked. But first, you must understand their individual strengths and, more importantly, their critical weaknesses.
1. Claims and Billing Data (The “Financial” Record)
What It Is: This is the data generated every time a healthcare service is paid for. It is the “financial exhaust” of the healthcare system. It includes pharmacy claims (NCPDP standard) and medical claims (CMS-1500 for professional services, UB-04 for institutional/facility services).
What’s Inside a Pharmacy Claim:
- Identifiers: De-identified Patient ID, Prescriber NPI, Pharmacy NPI.
- Drug Info: National Drug Code (NDC), Quantity Dispensed, Days Supply, Date of Fill, Refill Number.
- Financials: Drug Cost (Ingredient Cost + Dispensing Fee), Copay, Payer Name, BIN/PCN/Group.
What’s Inside a Medical Claim:
- Identifiers: De-identified Patient ID, Provider NPI, Facility ID.
- Encounter Info: Date of Service, Place of Service.
- Billing Codes:
- ICD-10-CM Codes: International Classification of Diseases. These are the diagnosis codes (e.g., `E11.9`: Type 2 diabetes, `I10`: Hypertension).
- CPT/HCPCS Codes: Current Procedural Terminology and the Healthcare Common Procedure Coding System. These are the procedure codes (e.g., `99213`: Office visit, `83036`: HbA1c lab test).
- Financials: Amount Billed, Amount Paid.
Strengths of Claims Data:
- Longitudinal & Comprehensive: This is its superpower. Because it follows the money, you can see every encounter a patient has, even with different doctors in different health systems. You can build a patient journey over years.
- Captures Adherence: Claims data is the standard source for calculating adherence metrics (like PDC/MPR) because it reflects what was actually dispensed and billed, not just what was prescribed. Keep in mind that a paid claim shows the medication was obtained, not that it was taken.
- Captures Cost: It is the source of truth for all pharmacoeconomic analyses (e.g., “Total Cost of Care”).
Critical Pitfalls of Claims Data
As an analyst, you MUST be skeptical of claims data. It is built for billing, not clinical analysis, which leads to huge “data gaps.”
- NO CLINICAL RICHNESS: This is the cardinal sin. There are no lab results (you know the HbA1c test was billed, but you don’t know the result), no vital signs (no BP, no weight), and no “why” (no provider notes).
- BILLING CODES ARE NOT DIAGNOSES: A provider may “rule out” a condition, but the code still appears. Or, they may “up-code” to get higher reimbursement. You must treat ICD-10 codes as “suspected” conditions, not ground truth.
- DATA LAG: It can take 30-90 days for a claim to be processed and appear in a dataset. It is not real-time.
- THE “CASH” BLINDSPOT: Claims data only captures insured events. If your patient pays cash, uses GoodRx, or gets samples, they “disappear” from the dataset. This can make a perfectly adherent patient look 100% non-adherent.
Tutorial: Calculating Proportion of Days Covered (PDC) from Claims Data
PDC is the industry standard for adherence, used in Star Ratings. It measures what percentage of an observation period a patient was covered by a medication.
Formula: $\text{PDC} = \frac{\text{Number of days in period covered by fills}}{\text{Total days in observation period}}$
Example Case: A 90-day (Q1) observation period for a patient on Lisinopril 10mg (30-day fills).
- Fill 1 (30-day supply): Filled on Day 1. (Covers Day 1 – Day 30)
- Fill 2 (30-day supply): Filled on Day 35. (Covers Day 35 – Day 64)
- Fill 3 (30-day supply): Filled on Day 60. (This is an early fill! The patient is stockpiling.)
The “Gotcha” (Overlapping Fills): You do not double-count days. The patient is just “covered.”
- Coverage from Fill 1: Days 1-30 (30 days)
- Coverage from Fill 2: Days 35-64 (30 days)
- Coverage from Fill 3: Starts on Day 65 (immediately after Fill 2 ends, not on Day 60). Covers Days 65-90. (The period ends at Day 90, so this fill provides 26 days of coverage within the period).
Calculation:
Total Days Covered = (Days 1-30) + (Days 35-64) + (Days 65-90)
Total Days Covered = 30 + 30 + 26 = 86 days
PDC = 86 / 90 = 95.6% (This is a 5-Star adherent patient!)
This calculation is the bedrock of how payers measure your pharmacy’s quality.
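The worked example above can be reproduced in a few lines of Python. This is a minimal sketch, not an official measure specification: the function name `days_covered` and the `(fill_day, days_supply)` input format are assumptions for illustration, and it only implements the rule used in the example (an early refill is shifted to start when the prior supply runs out).

```python
def days_covered(fills, period_days):
    """Count covered days within days 1..period_days.

    fills: list of (fill_day, days_supply) tuples, with days numbered from 1.
    An early refill is shifted so it starts the day after the previous
    supply ends, so overlapping days are never double-counted.
    """
    covered = 0
    next_uncovered = 1  # first day not yet covered by an earlier fill
    for fill_day, days_supply in sorted(fills):
        start = max(fill_day, next_uncovered)
        end = start + days_supply - 1
        # Only count the days that fall inside the observation period.
        covered += max(0, min(end, period_days) - start + 1)
        next_uncovered = end + 1
    return covered


# The lisinopril example: fills on Day 1, Day 35, and Day 60 (30 days each).
fills = [(1, 30), (35, 30), (60, 30)]
period = 90
covered = days_covered(fills, period)
pdc = covered / period
print(f"Days covered: {covered}, PDC: {pdc:.1%}")
```

Run on the example fills, this gives 86 covered days and a PDC of 95.6%, matching the hand calculation.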
2. Electronic Health Record (EHR) Data (The “Clinical” Record)
What It Is: This is the data generated from the clinical workflow within a health system (e.g., Epic, Cerner, Allscripts). It is the digital version of the patient’s chart.
What’s Inside an EHR Record:
- Identifiers: Medical Record Number (MRN), Patient Demographics.
- Structured Data (The “Easy” Stuff):
- Problem List: Diagnoses (ICD-10).
- Medication List: e-Prescriptions (what was ordered) and Medication Administration Records (MARs) (what was given inpatient).
- Allergies: Standardized list.
- Vitals: Blood Pressure, Heart Rate, Weight, Height, BMI, Temp.
- Lab Results: The actual numeric values (e.g., `HbA1c = 8.1%`, `SCr = 1.4 mg/dL`).
- Unstructured Data (The “Hard” Stuff):
- Provider Notes, Discharge Summaries, Consult Notes, Radiology Reports. This is free text (e.g., “Patient reports frustration with high copay and is splitting pills.”).
Strengths of EHR Data:
- UNMATCHED CLINICAL RICHNESS: This is its superpower. You have the lab results, the vitals, the diagnoses. You can answer “why” a drug was prescribed and “what happened” as a result.
- Real-Time: The data is generated during the encounter. It is the most up-to-date source of information.
- Captures Prescriber Intent: You see what was ordered, not just what was filled. This is how you find “primary non-adherence” (prescriptions that were ordered but never picked up).
Critical Pitfalls of EHR Data
EHR data is incredibly rich, but it has massive blind spots that you, as a pharmacist, are perfectly positioned to see.
- IT IS SILOED: This is its cardinal sin. An EHR only contains data from its own health system. If your patient sees a cardiologist at System A (with Epic) and an endocrinologist at System B (with Cerner), neither EHR has the full picture.
- THE “ADHERENCE” BLINDSPOT: This is your value proposition. The EHR knows the lisinopril was prescribed, but it has NO IDEA if the patient ever picked it up, if they paid cash at another pharmacy, or if they are taking it.
- MESSY & UNSTRUCTURED: The most valuable insights (e.g., “patient lost job and can’t afford meds”) are buried in free-text notes. Analyzing this requires a complex tool called Natural Language Processing (NLP).
- NO COST DATA: EHRs are for clinical care, not billing, so they typically do not contain reliable information on what the patient or payer paid for services.
The Pharmacist’s Opportunity: You are the only one who can bridge the gap between the EHR (what was prescribed) and the Claims data (what was picked up).
3. Pharmacy Dispensing & Management System Data (YOUR Data)
What It Is: This is the data from your own pharmacy’s software (e.g., PioneerRx, Rx30, EnterpriseRx, QS/1). It is a powerful hybrid dataset that you create every day.
What’s Inside Your System’s Data:
- Claims/Dispensing Data: You have a copy of every prescription claim you’ve ever processed for that patient.
- Pharmacist Clinical Notes: This is your gold mine. MTM platform notes, adherence call logs, intervention documentation (e.g., “Called Dr. Smith, rec’d switch from Xarelto to Eliquis due to cost”).
- Clinical Service Data: Vaccination records, point-of-care test results (Strep, Flu, A1c), blood pressure readings from your in-store cuff.
- Patient-Reported Data: Information you collect directly (e.g., “patient reports taking med 5/7 days,” “patient reports side effect of cough”).
Strengths of Your Data:
- LINKS INTERVENTION TO OUTCOME: This is the only dataset in the world that connects your specific clinical intervention (e.g., an adherence call) to a dispensing outcome (e.g., the patient refilled the next day).
- Captures the “Why”: Your notes are the richest source of “unstructured” clinical insight.
- It is YOURS: You can structure it, control it, and query it in real-time to manage your own practice.
Weaknesses of Your Data:
- HIGHLY FRAGMENTED: It’s siloed to your pharmacy or chain. You don’t know what the patient filled at a competitor.
- NOT STANDARDIZED: This is the biggest challenge. One pharmacist might type “Pt non-adherent,” another “patient missed refill,” and another “Called pt re: late Rx.” These are impossible to query.
Tutorial: Standardizing Your Clinical Intervention Notes for Data Capture
You can fix the “not standardized” problem today. Mandate that every clinical note in your system follows a “structured” template. Instead of (or in addition to) a free-text note, use a “snippet.”
Bad, Unstructured Note: “called pt about late refill, said they forgot, will pick up tmrw.”
Good, Structured Note (that you can query):
`-INTERVENTION-`
`TYPE: Adherence`
`REASON: Refill Overdue (7d)`
`BARRIER: Forgot`
Now, at the end of the month, you can run a report by searching for `TYPE: Adherence` to count all your adherence interventions. You can run a report on `BARRIER:` to find that “Cost” is your #1 barrier, and present this to payers. You have just turned your notes from “text” into “data.”
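Here is a rough sketch of what that month-end report could look like once notes follow the template. The `parse_note` helper and the four sample notes are hypothetical; the only assumption about your notes is that each field sits on its own line as `KEY: value` with an uppercase key.

```python
from collections import Counter


def parse_note(note_text):
    """Turn 'KEY: value' lines into a dict; ignore any other line."""
    fields = {}
    for line in note_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().isupper():
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical notes exported from the pharmacy system.
notes = [
    "-INTERVENTION-\nTYPE: Adherence\nREASON: Refill Overdue (7d)\nBARRIER: Forgot",
    "-INTERVENTION-\nTYPE: Adherence\nREASON: Refill Overdue (10d)\nBARRIER: Cost",
    "-INTERVENTION-\nTYPE: Safety\nREASON: Drug Interaction\nBARRIER: None",
    "-INTERVENTION-\nTYPE: Adherence\nREASON: New Start\nBARRIER: Cost",
]

parsed = [parse_note(n) for n in notes]

# How many interventions of each type did we document?
type_counts = Counter(p["TYPE"] for p in parsed if "TYPE" in p)

# Among adherence interventions, which barrier comes up most often?
barrier_counts = Counter(
    p["BARRIER"] for p in parsed if p.get("TYPE") == "Adherence" and "BARRIER" in p
)

print(type_counts)
print(barrier_counts.most_common(1))
```

The free-text version of the same notes could not be counted this way without reading each one by hand, which is the whole argument for the template.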
4. Patient-Generated Health Data (PGHD) & Other Sources
This is the new frontier of RWD, and it’s exploding.
- Patient Registries: These are organized, observational studies that collect highly specific data on one disease (e.g., a cystic fibrosis registry). They are excellent for studying disease progression and often include Patient-Reported Outcomes (PROs) (e.g., quality-of-life surveys).
- PGHD: Data generated by the patients themselves. This includes data from:
- Smart Devices: Apple Watch (EKG, Afib), smart scales (weight).
- Connected Medical Devices: Continuous Glucose Monitors (CGMs) like Dexcom or FreeStyle Libre, smart inhalers (Propeller Health), smart blood pressure cuffs.
- Apps: PRO surveys, mood trackers, diet/exercise logs.
- Social Determinants of Health (SDoH) Data: This is dataset enrichment. You can buy data (from sources like LexisNexis) that links a patient’s zip code to data on income, education level, transportation access, and “food deserts.” This is highly predictive of health outcomes.
The Holy Grail: Data Linkage. The true power comes when you link these datasets. Imagine a patient record that combines:
1. Claims Data: We see they picked up their insulin (PDC = 90%).
2. EHR Data: Their last HbA1c was 9.5% (a clinical problem).
3. PGHD Data: Their CGM log shows they are adherent, but their glucose is spiking every night after dinner.
4. Pharmacist Data: Your note says, “Patient is adherent but reports confusion about carb-counting for her evening meal.”
Without linkage, you have conflicting data. With linkage, you have a precise, actionable clinical insight. The problem isn’t adherence; it’s diet education. This is the power of advanced data analytics.
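As a toy illustration of linkage, the sketch below joins four small extracts on a shared patient ID and applies a simple triage rule. Everything here is hypothetical: the patient ID `P001`, the field names, and the HbA1c alert threshold of 9.0% are assumptions for illustration, while the 80% PDC cut-off is the adherence threshold used elsewhere in this section. Real linkage also has to solve patient matching across systems, which this sketch skips entirely.

```python
# Hypothetical extracts from four sources, all keyed by the same patient ID.
claims = {"P001": {"pdc": 0.90}}
ehr = {"P001": {"last_hba1c": 9.5}}
pghd = {"P001": {"post_dinner_spikes_per_week": 6}}
pharmacy = {"P001": {"note": "Adherent; confused about carb-counting for evening meal"}}


def link_patient(patient_id, **sources):
    """Merge every source's fields for one patient into a single flat record."""
    linked = {"patient_id": patient_id}
    for source_name, table in sources.items():
        for field, value in table.get(patient_id, {}).items():
            linked[f"{source_name}.{field}"] = value
    return linked


def interpret(record, pdc_goal=0.80, a1c_alert=9.0):
    """Very rough triage rule; the a1c_alert threshold is an assumption."""
    adherent = record.get("claims.pdc", 0.0) >= pdc_goal
    uncontrolled = record.get("ehr.last_hba1c", 0.0) >= a1c_alert
    if adherent and uncontrolled:
        return "adherent but uncontrolled: look beyond adherence"
    if uncontrolled:
        return "uncontrolled and non-adherent: start with adherence"
    return "no flag"


record = link_patient("P001", claims=claims, ehr=ehr, pghd=pghd, pharmacy=pharmacy)
print(record)
print(interpret(record))
```

Looked at separately, the claims extract says "fine" and the EHR extract says "problem"; only the linked record points you toward the pharmacist note and the CGM pattern.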
12.1.4 Application 1: Key Performance Indicator (KPI) Dashboard Development
Now that you understand your “raw materials” (RWD), your first application is to use them for internal improvement. You cannot manage what you do not measure. A Key Performance Indicator (KPI) dashboard is your practice’s “cockpit” — it gives you a real-time view of your performance.
- Key Performance Indicator (KPI): A measurable value that demonstrates how effectively your pharmacy is achieving key clinical or business objectives.
- Dashboard: A visual display of your most important KPIs, consolidated on a single screen so they can be monitored at a glance.
Your Skill Translation: You already use dashboards every day. Your pharmacy system’s “To-Do List” or “Refill Queue” is a basic operational dashboard. It shows you KPIs like `Prescriptions Waiting for Verification` or `Prescriptions Due Today`. We are simply expanding this concept to include clinical and financial KPIs.
Characteristics of a GOOD KPI (The “SMART” Framework)
A KPI is not a vague goal. It must be SMART:
- Specific: It targets a specific area for improvement.
- Measurable: You can quantify it.
- Achievable: It is realistic to accomplish.
- Relevant: It matters to your practice and goals.
- Time-bound: It has a defined timeframe.
Bad “KPI” (Vague Goal): “We should improve our diabetes care.”
Good KPI (SMART): “Increase the percentage of our diabetes patients (age 40-75) who are on a statin from 78% to 85% by the end of Q4.”
A great pharmacy dashboard is balanced. It measures two key areas: Clinical Quality (Are we improving health?) and Operational Efficiency (Are we doing it well?).
Masterclass Table: Designing Your Pharmacy KPI Dashboard
| KPI Category | Key Performance Indicator (KPI) | Data Source(s) Needed | The “So What?” (Why It Matters) |
|---|---|---|---|
| Clinical Quality KPIs (Are we improving health?) | Adherence (PDC) for Star Ratings | Pharmacy Dispensing Data or Payer Claims Data | Directly impacts payer reimbursement and patient outcomes. This is your #1 quality metric. |
| | Statin Use in Persons with Diabetes (SUPD) | Dispensing Data + Payer Claims (for diabetes diagnosis) | A key HEDIS/Star Ratings measure. Shows you are practicing evidence-based medicine. |
| | Comprehensive Medication Review (CMR) Completion Rate | MTM Platform (e.g., OutcomesMTM) | Payer contract compliance and direct revenue generation. |
| | New-to-Therapy (NTT) Discontinuation Rate | Pharmacy Dispensing Data | Measures your effectiveness at counseling and onboarding patients to new meds. High value to manufacturers. |
| | Vaccination Rate (e.g., Flu, Shingrix) | Dispensing/EHR/Registry Data | Public health impact and a primary driver of non-dispensing revenue. |
| Operational Efficiency KPIs (Are we running well?) | Prescription Turnaround Time (Avg) | Pharmacy System (e.g., time from ‘Received’ to ‘Verified’ and ‘Verified’ to ‘Filled’) | Directly impacts patient satisfaction and workflow stress. |
| | Dispensing Accuracy Rate | Pharmacy System + Quality Assurance Log | Measures patient safety. (e.g., `(Total Fills – Dispensing Errors) / Total Fills`). |
| | Call Center Wait Time / Abandon Rate | Phone System Data (if available) | A major driver of patient frustration. Identifies staffing needs. |
| | Inventory Turn Rate | Inventory/Purchasing Data | Measures your financial health. `(Cost of Goods Sold / Avg. Inventory)`. High turns = good cash flow. |
| | Clinical Interventions per Pharmacist FTE | Pharmacy System (Clinical Notes) | Measures the clinical productivity of your team. (Requires structured note-taking!) |
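
Two of the formulas in the table are simple enough to sketch directly. The function names and the monthly and annual figures below are made up for illustration; in practice the inputs would come from your pharmacy system and purchasing data.

```python
def dispensing_accuracy(total_fills, dispensing_errors):
    """(Total Fills - Dispensing Errors) / Total Fills, as a fraction."""
    if total_fills <= 0:
        raise ValueError("total_fills must be positive")
    return (total_fills - dispensing_errors) / total_fills


def inventory_turns(cost_of_goods_sold, avg_inventory):
    """Cost of Goods Sold / Average Inventory for the same time period."""
    if avg_inventory <= 0:
        raise ValueError("avg_inventory must be positive")
    return cost_of_goods_sold / avg_inventory


# Hypothetical figures for one pharmacy.
accuracy = dispensing_accuracy(total_fills=12_000, dispensing_errors=6)
turns = inventory_turns(cost_of_goods_sold=3_600_000, avg_inventory=300_000)

print(f"Dispensing accuracy: {accuracy:.2%}")
print(f"Inventory turns: {turns:.1f}")
```

Whatever tool eventually draws the dashboard, each tile is a small, auditable calculation like these, which is why the data source column in the table matters as much as the KPI itself.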
12.1.5 Application 2: Predictive Analytics for Risk Stratification
Once you have mastered measuring your performance (KPIs), the next evolution is to predict the future. This is the most powerful application of data analytics in healthcare. Instead of reacting to problems, you prevent them.
- Risk Stratification: The process of dividing your patient population into “buckets” of low, medium, and high risk.
- Predictive Analytics: The engine that does this. It uses data science and statistical modeling on your RWD to calculate a “risk score” for every patient for a specific negative outcome.
The Goal: To move your pharmacy from a reactive model to a proactive, data-driven model.
Reactive Model (The Past): “Mr. Smith’s lisinopril is 10 days late in the refill queue. I should call him.” (The non-adherence has already happened).
Proactive Model (The Future): “My dashboard shows Mr. Smith has a 92% probability of becoming non-adherent in the next 30 days. I should call him today to find out why.”
How Does a Predictive Model Work? (A Pharmacist’s Guide)
You don’t need to be a data scientist to use a model, but you must understand how it thinks. Your clinical intuition is already a predictive model. When you see a patient with 12 prescriptions, 5 prescribers, a new depression diagnosis, and a high copay, your brain screams “HIGH RISK!” A predictive model simply formalizes, quantifies, and scales that intuition.
Here is the (simplified) 4-step process:
- 1. Define the Outcome (The “Target”): You must ask a specific question. What, exactly, do you want to predict?
- Example: “Hospitalization in the next 90 days.”
- Example: “Becoming non-adherent (PDC < 80%) in the next 6 months.”
- Example: “Discontinuation of a new specialty drug in the first 30 days.”
- 2. Gather Features (The “Ingredients”): These are the RWD variables you will “feed” the model. The more data sources you have, the more powerful your model.
- Claims Data Features: Past PDC, # of prescribers, # of pharmacies, # of medications (polypharmacy), # of ER visits in past 6 months.
- EHR Data Features: Specific diagnoses (e.g., ‘Depression’, ‘CHF’, ‘Substance Use’), last HbA1c value, BMI, smoking status.
- SDoH Data Features: Zip code-derived income level, transportation access.
- Pharmacy Data Features: Patient’s preferred language, use of auto-refill, use of med-sync.
- 3. Build (Train) the Model: A data scientist “trains” a statistical model (like a logistic regression) on 100,000 past patients. The model “learns” the mathematical relationship between each feature and the outcome. It creates an algorithm.
(Example Algorithm): `Risk Score = [0.3 * (1 – Past PDC)] + [0.5 * (# of Prescribers)] + [0.8 * (Depression Dx)] – [0.2 * (Auto-Refill)] …`
- 4. Deploy & Score: The model is now “live.” It runs on your current patient population and assigns every single patient a “risk score” (e.g., 0-100). You can now sort your entire population from highest risk to lowest risk.
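To show what "deploy and score" means mechanically, here is a minimal sketch of how a trained logistic regression turns features into a 0-100 score. The weights and intercept are invented for illustration and are not a fitted model. The sketch also assumes past adherence enters as the shortfall from full adherence (`1 - past PDC`), so that lower past adherence raises the score.

```python
import math

# Illustrative weights only; a real model learns these from historical data.
COEFFICIENTS = {
    "pdc_gap": 3.0,          # 1 - past PDC (bigger gap = higher risk)
    "num_prescribers": 0.5,
    "depression_dx": 0.8,    # 1 if diagnosed, else 0
    "auto_refill": -0.2,     # 1 if enrolled, else 0 (protective)
}
INTERCEPT = -3.0


def risk_score(features):
    """Return a 0-100 score: 100 times the predicted probability."""
    log_odds = INTERCEPT + sum(
        weight * features[name] for name, weight in COEFFICIENTS.items()
    )
    probability = 1.0 / (1.0 + math.exp(-log_odds))  # logistic function
    return round(100 * probability)


# Two hypothetical patients.
complex_patient = {"pdc_gap": 0.35, "num_prescribers": 4, "depression_dx": 1, "auto_refill": 0}
stable_patient = {"pdc_gap": 0.02, "num_prescribers": 1, "depression_dx": 0, "auto_refill": 1}

scores = {
    "complex_patient": risk_score(complex_patient),
    "stable_patient": risk_score(stable_patient),
}

# Sort the population from highest risk to lowest risk.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

The sign of each weight is the part you can sanity-check clinically: a protective factor such as auto-refill should pull the score down, and a risk factor should push it up.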
Putting the Model into Practice: Targeted Interventions
This risk list is now your new workflow. Instead of just working your refill queue, you are working a “risk queue.” This allows you to allocate your most expensive resource—your time—effectively.
Masterclass Table: Risk-Stratified Intervention Model
| Risk Tier | Risk Score | Example Patient Profile (Predicting Non-Adherence) | The Proactive, Targeted Intervention |
|---|---|---|---|
| High Risk | 71 – 100 | 45 y/o, new to specialty drug (e.g., Humira), 10+ total meds, 4 prescribers, diagnosis of depression, past PDC of 65%, high copay. | High-Touch, Pharmacist-Led: Proactive call before first fill, benefits investigation, copay card enrollment, side-effect counseling at day 7, monthly adherence check-in calls. |
| Medium Risk | 31 – 70 | 55 y/o, stable on 3 meds, but has 2 prescribers and PDC just dropped to 82%. Model flags them as “at risk” of falling. | Tech-Enabled: Targeted adherence call from a pharmacy technician, offer enrollment in med-sync program, send automated text reminders. |
| Low Risk | 0 – 30 | 65 y/o, stable on lisinopril for 10 years, 1 prescriber, PDC of 98%, on auto-refill. | Standard/Automated Care: No manual intervention needed. Let the automated systems (auto-refill, text alerts) handle this patient. |
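
The tier cut-offs in the table translate directly into a small lookup. The sketch below uses the table's score ranges (0-30, 31-70, 71-100); the function names, the shortened intervention labels, and the sample patient IDs and scores are hypothetical.

```python
def risk_tier(score):
    """Map a 0-100 risk score to the tiers used in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 71:
        return "High"
    if score >= 31:
        return "Medium"
    return "Low"


# Shortened labels for the interventions described in the table.
INTERVENTION = {
    "High": "High-touch, pharmacist-led outreach",
    "Medium": "Tech-enabled: technician call, med-sync, text reminders",
    "Low": "Standard/automated care",
}


def build_risk_queue(scores):
    """Return (patient_id, score, tier) tuples, highest risk first."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(pid, score, risk_tier(score)) for pid, score in ordered]


# Hypothetical scores from a deployed model.
queue = build_risk_queue({"P001": 12, "P002": 88, "P003": 45})
for patient_id, score, tier in queue:
    print(patient_id, score, tier, "->", INTERVENTION[tier])
```

Sorting first and tiering second keeps the queue usable even if you later move the cut-offs to match how many high-touch calls your team can actually make.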
The Critical Ethical Pitfall: Algorithmic Bias
Predictive models are powerful, but they are dangerous if built carelessly. A model is only as good as the data it’s trained on. If your historical data is biased, your model will be too.
Example: A health system builds a model to predict “high-risk” patients to enroll them in a care management program. They use “past healthcare costs” as a primary feature, assuming high cost = high risk.
The Bias: This model will systematically under-score minority and low-income populations. Why? Because these populations have historically had less access to care, and therefore have lower past healthcare costs. The model learns this bias and concludes they are “low risk” because they don’t see the doctor, when in reality they are the highest risk.
Your Role as a CASP: You must be the clinical and ethical sanity check. You must ask: “What features are in this model? Is it using ‘cost’ as a proxy for ‘risk’? Is it possible this model is just amplifying our existing health inequities?” You are the guardian of equitable care.
12.1.6 Application 3: Outcomes Reporting for Payers and Manufacturers
You have used RWD to build KPIs (measuring the past) and predictive models (predicting the future). Now, you must complete the cycle by packaging your results to prove your value to external stakeholders. This is how you get paid for your clinical services. This is how you move from a “cost center” (the cost of dispensing) to a “value center” (a generator of savings and quality).
Your Skill Translation: This is no different from preparing your meticulous records for a State Board of Pharmacy inspection, a PBM audit, or a DEA audit. In those cases, you are gathering, organizing, and presenting specific data (dispensing logs, invoices) to prove compliance. For outcomes reporting, you are gathering, organizing, and presenting specific data (adherence rates, clinical notes) to prove value.
Speaking Their Language: What Each Stakeholder Wants
You cannot send the same report to a payer and a manufacturer. They have completely different goals, and you must tailor your message to what they care about.
1. Reporting to Payers (Insurers, PBMs, Health Plans)
- What They Care About: Two things: Total Cost of Care (Medical + Rx) and Quality Metrics (HEDIS / Star Ratings). They are asking: “Did your pharmacy service save me money by preventing ER visits, and did you help me hit my 5-Star quality bonuses?”
- Data You Need: Claims data is king here. You need pharmacy claims (to calculate PDC) and medical claims (to see hospital/ER costs). You also need your own pharmacy data to prove it was your intervention that caused the change.
- Your Report Should Look Like This:
Example Payer Outcomes Report (Executive Summary)
“For Q4, we enrolled 150 of your highest-risk diabetes patients (identified by our predictive model) into our Pharmacist-Led Diabetes Management Program. We compared their outcomes to a matched control group of 150 similar patients.”
- Clinical Quality: Our intervention group’s average PDC for statins increased from 78% to 91% (a 5-Star measure), while the control group’s PDC remained flat at 79%.
- Economic Value: Our intervention group had a 22% reduction in diabetes-related ER visits and a 15% reduction in all-cause hospitalizations compared to the control group.
- The ROI: This resulted in a net medical cost savings of $28,500. The cost of our pharmacist service was $15,000. This represents a 1.9-to-1 Return on Investment (ROI) for your health plan.
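The figures in a summary like this come from simple comparisons between the intervention and control groups. As a rough sketch using only the numbers quoted in the example report: the function names are illustrative, and the control group's baseline PDC is assumed to equal its 79% follow-up value because the report says it “remained flat.”

```python
def pdc_difference_in_differences(int_before, int_after, ctl_before, ctl_after):
    """Change in the intervention group minus change in the control group (points)."""
    return (int_after - int_before) - (ctl_after - ctl_before)


def savings_per_dollar(savings, program_cost):
    """Dollars of savings for each dollar spent on the service."""
    return savings / program_cost


# Numbers from the example payer report above.
did = pdc_difference_in_differences(78, 91, 79, 79)
ratio = savings_per_dollar(28_500, 15_000)

print(f"PDC gain relative to control: {did} points")
print(f"Savings per $1 spent: {ratio:.1f}")
```

Subtracting the control group's change is what lets you argue the improvement came from your program rather than from a trend that affected everyone.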
2. Reporting to Manufacturers (Pharma)
- What They Care About: One thing: Adherence & Persistency for their specific, brand-name drug. They are launching a new, expensive specialty drug and are terrified of “patient drop-off” in the first 90 days due to cost or side effects.
- Data You Need: Your internal pharmacy dispensing data (for their specific NDCs) and your clinical intervention notes.
- Your Report Should Look Like This:
Example Manufacturer Outcomes Report (Executive Summary)
“For Q4, 85 of your patients were prescribed your new biologic, [Drug Name]. We enrolled 100% of them in our ‘New-to-Therapy (NTT)’ Onboarding Program.”
- Access & Onboarding: Our benefits investigation team successfully resolved 32 prior authorization and/or affordability issues, preventing access delays.
- Persistency: Our 90-day discontinuation rate for [Drug Name] was only 8%, compared to the 25% national average RWD benchmark for this drug class.
- Intervention Data: Our pharmacists conducted 112 proactive counseling calls, with 65% of interactions focused on side effect management (e.g., injection site reactions), which is the primary driver of non-persistence.
Tutorial: Building a Simple Return on Investment (ROI) Model for a Payer
This is the “back-of-the-napkin” math that gets you in the door. It must be simple and defensible.
Goal: Prove your adherence program for 1,000 CHF patients is worth paying for.
Step 1: Calculate Your Intervention Cost (The “I” in ROI)
- Pharmacist time per patient per year = 1 hour (4 x 15-min check-ins)
- Pharmacist “loaded” cost (salary + benefits) = $80/hour
- Total Cost: 1,000 patients * 1 hour/pt * $80/hr = $80,000
Step 2: Calculate Your Value (The “R” in ROI)
- Average cost of a CHF-related hospitalization = $20,000
- RWE shows your program reduces hospitalizations by 10% (a relative reduction applied to the baseline below)
- Baseline hospitalization rate for this population = 25% (250 hospitalizations)
- Hospitalizations Prevented: 250 * 10% = 25 hospitalizations
- Total Savings: 25 hospitalizations * $20,000/admission = $500,000
Step 3: Calculate ROI
Formula: $\text{ROI} = \frac{\text{Financial Gain} - \text{Intervention Cost}}{\text{Intervention Cost}}$
Calculation:
$\text{ROI} = \frac{\$500{,}000 - \$80{,}000}{\$80{,}000}$
$\text{ROI} = \frac{\$420{,}000}{\$80{,}000} = 5.25$
Your “Elevator Pitch” to the Payer: “My pharmacist-led program costs $80,000 per year for your 1,000 highest-risk CHF members. Based on RWE, we expect to prevent 25 hospitalizations, saving you $500,000. For every $1.00 you invest in my service, I will return $6.25 in direct medical cost savings, a net gain of $5.25.”
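The three steps above can be wrapped in one small function so the inputs are easy to change when a payer challenges an assumption. This is a minimal sketch of the same back-of-the-napkin math; the function and parameter names are illustrative.

```python
def program_roi(patients, hours_per_patient, hourly_cost,
                baseline_rate, relative_reduction, cost_per_admission):
    """Return (intervention_cost, savings, roi) for a simple adherence program."""
    # Step 1: intervention cost
    intervention_cost = patients * hours_per_patient * hourly_cost
    # Step 2: value = hospitalizations prevented * cost per admission
    baseline_admissions = patients * baseline_rate
    admissions_prevented = baseline_admissions * relative_reduction
    savings = admissions_prevented * cost_per_admission
    # Step 3: ROI = (gain - cost) / cost
    roi = (savings - intervention_cost) / intervention_cost
    return intervention_cost, savings, roi


# Inputs from the CHF tutorial above.
cost, savings, roi = program_roi(
    patients=1000, hours_per_patient=1, hourly_cost=80,
    baseline_rate=0.25, relative_reduction=0.10, cost_per_admission=20_000,
)
print(f"Cost: ${cost:,.0f}  Savings: ${savings:,.0f}  ROI: {roi:.2f}")
```

With the tutorial's inputs this reproduces the $80,000 cost, $500,000 savings, and ROI of 5.25. Re-running it with a more conservative `relative_reduction` is a quick way to show the payer how the case holds up if the program underperforms.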
12.1.7 Application 4: Data Visualization and Benchmarking
You have done the analysis and calculated the ROI. Your final task is to present your findings. A 10,000-row Excel spreadsheet is not insightful; it is a headache. A clear, simple graph is a powerful tool of persuasion. Data visualization is the art and science of communicating complex information clearly, effectively, and persuasively.
Your Skill Translation: You are already a data visualization expert. When you draw a simple graph for a patient showing their HbA1c trend (9.0% -> 8.2% -> 7.5%) to motivate them, you are using data visualization. When you show a prescriber a patient’s BP log, you are using a table (a form of data viz) to make a clinical point. This is the same skill, just applied to a different audience.
Choosing the Right Chart for the Right Question
The #1 mistake in data visualization is choosing the wrong type of chart. The chart you choose depends on the question you are trying to answer.
Masterclass Table: The Pharmacist’s Data Viz Playbook
| Chart Type | What It’s Used For | Pharmacy Example (The Question It Answers) |
|---|---|---|
| Line Chart | Showing a trend over time. | “How has our adherence rate for DOACs changed over the last 12 months?” |
| Bar Chart (Vertical) | Comparing categories or groups. | “What is the average turnaround time for each of our 5 pharmacy locations?” |
| Bar Chart (Horizontal) | Comparing categories with long labels. | “What are the Top 10 most common clinical interventions our pharmacists perform?” |
| Pie Chart / Donut Chart | Showing parts of a whole (sum = 100%). Use sparingly. | “What percentage of our new prescriptions come from each local medical group?” |
| Histogram | Showing the distribution of a single variable. | “What is the distribution of HbA1c values for our 500 diabetes patients? Are they clustered around 7.0 or 9.0?” |
| Scatter Plot | Showing the relationship between two different variables. | “Is there a relationship between a patient’s age (X-axis) and their number of missed refills (Y-axis)?” |
| Heat Map | Showing intensity/density in a grid or map. | “Which zip codes in our delivery area have the highest density of non-adherent patients?” |
Benchmarking: The Power of Context
A number without context is meaningless. If you tell your boss, “Our statin adherence rate is 88%,” their first question will be: “Is that good?”
Benchmarking is the process of comparing your performance (your KPI) against a standard to provide context and demonstrate excellence. Data visualization is how you show this comparison.
Types of Benchmarks You MUST Use:
1. Historical Benchmarking: Comparing yourself to… yourself.
   - Example: A line chart showing your adherence rate was 82% last year and is 88% this year.
   - The Story: “We are improving.”
2. Internal Benchmarking: Comparing yourself to your peers.
   - Example: A bar chart showing Pharmacy A (88%), Pharmacy B (84%), and Pharmacy C (91%).
   - The Story: “Pharmacy C is a top performer, and we should learn from their best practices.”
3. Industry / Payer Benchmarking: Comparing yourself to the “gold standard.”
   - Example: A bar chart showing Your Pharmacy (88%) next to the “CMS 5-Star Goal” (90%).
   - The Story: “We are a high-performing pharmacy, but we are 2 points away from the top-tier quality bonus.”
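The three comparisons above reduce to the same arithmetic: your KPI minus the benchmark, in percentage points. A small sketch, using the adherence figures from the examples and assuming for illustration that “Pharmacy A” is your own pharmacy at 88%:

```python
def benchmark_gaps(current, benchmarks):
    """Percentage-point gap between a KPI and each benchmark (positive = ahead)."""
    return {name: current - value for name, value in benchmarks.items()}


# Adherence rates (%) taken from the benchmarking examples above.
benchmarks = {
    "Last year (historical)": 82,
    "Pharmacy B (internal)": 84,
    "Pharmacy C (internal)": 91,
    "CMS 5-Star goal (payer)": 90,
}

gaps = benchmark_gaps(88, benchmarks)
for name, gap in gaps.items():
    print(f"{name}: {gap:+d} points")
```

Reporting the gap alongside the raw rate answers the “Is that good?” question before it is asked.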
Data Viz Pitfalls: How to Lie with Charts
Your goal is to communicate with clarity and honesty. Bad data visualization can be misleading, even if unintentional. Avoid these common traps:
- The Truncated Y-Axis: Never start the Y-axis of a bar chart at anything other than zero. Starting the axis at 80% to show a jump from 82% to 88% makes the 6-point gain look like a 300% gain, because the second bar is drawn four times as tall as the first. This is dishonest.
- The 3D Pie Chart: Never use a 3D pie chart. The perspective distortion makes the slices at the “front” look larger than the slices at the “back.” Use a simple 2D donut chart instead.
- The “Spaghetti” Line Chart: Do not put 10 different-colored lines on one line chart. It’s unreadable. Limit it to 3-4 key comparisons.
- The “Vanity Metric”: Don’t just show data that looks good; show data that is actionable.
Vanity Metric: “We made 10,000 adherence calls this month.” (So what?)
Actionable Metric: “Our adherence calls to the ‘High Risk’ group resulted in a 40% increase in refills, while calls to the ‘Low Risk’ group had a 0% impact.” (This tells you to stop calling the low-risk group and focus on the high-risk).
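The truncated Y-axis distortion described above is easy to quantify: compare how much taller the second bar is drawn with how much the value actually changed. A minimal sketch, with an illustrative function name:

```python
def apparent_gain(before, after, axis_start=0.0):
    """Relative increase in drawn bar height when the Y-axis starts at axis_start."""
    return (after - before) / (before - axis_start)


honest = apparent_gain(82, 88)          # Y-axis starts at zero
truncated = apparent_gain(82, 88, 80)   # Y-axis starts at 80%

print(f"Axis from 0:  second bar looks {honest:.0%} taller")
print(f"Axis from 80: second bar looks {truncated:.0%} taller")
```

With the axis at zero the second bar is about 7% taller, which matches the real relative change. With the axis at 80 it is 300% taller, even though the underlying data is identical.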
As a CASP, your data presentations must be like your clinical interventions: ethical, clear, and designed to drive an outcome. You have now completed the entire data lifecycle: from understanding the raw data (RWD) to performing internal analysis (KPIs), predicting the future (Analytics), and finally, communicating your value to the world (Reporting & Visualization).