CPAP Module 26, Section 1: Introduction to PBM Data Sources
MODULE 26: DATA ANALYTICS & POPULATION HEALTH IN UTILIZATION MANAGEMENT

Understanding the digital breadcrumbs of healthcare: pharmacy claims, medical claims, and the crucial data from prior authorizations.

SECTION 26.1

From Prescription Slips to Petabytes: Learning to Read the Language of Managed Care.

26.1.1 The “Why”: From Dispensing Data to Population Insights

In your pharmacy practice, you are a master of the transaction. A patient presents a prescription, you verify it, process it through their insurance, dispense the medication, and counsel them. Each of these steps is a critical point of care. But each of these steps also generates a digital footprint—a tiny, coded piece of data. In isolation, a single prescription claim is just a record of one event. But when millions of these events are collected, aggregated, and analyzed, they cease to be mere transactions. They become a story. They tell the story of a patient’s journey through the healthcare system, the story of a physician’s prescribing habits, and the story of a population’s health.

Pharmacy Benefit Managers (PBMs) are, at their core, massive data processing engines. Their primary function—paying claims—is also their primary method of data acquisition. Every “paid” or “rejected” claim is a new entry into a colossal database. The ability to harness this data is what separates a simple claims processor from a sophisticated healthcare management company. The entire edifice of utilization management—from prior authorizations and step therapy to formulary design and drug utilization reviews—is built upon a foundation of data analytics.

As a pharmacist transitioning into this world, one of the most significant mindset shifts is learning to see beyond the individual patient and to think in terms of populations. Your clinical expertise remains paramount, but it must now be applied through the lens of data. You must learn the language that this data speaks. Why? Because a pharmacist who can not only adjudicate a prior authorization based on clinical criteria but can also understand the data that led to the creation of that criteria is an entirely different class of professional. They can identify trends, question assumptions, and contribute to the evolution of clinical strategy. This section is your Rosetta Stone for deciphering the three core languages of PBM data: pharmacy claims, medical claims, and the crucial, context-rich data generated by the prior authorization process itself. Mastering this is the first step from being a user of the system to becoming an architect of it.

Retail Pharmacist Analogy: The Comprehensive Patient Profile Investigation

Imagine a new patient, Mr. Jones, comes to your pharmacy with a prescription for Janumet XR from a new endocrinologist. His insurance rejects the claim with a message you’ve seen a thousand times: “Prior Authorization Required.” Your work as a clinical detective begins, and you start assembling a complete picture from disparate data sources.

1. Your Pharmacy Dispensing System (Pharmacy Claims Data): You look up Mr. Jones in your system. You see he’s been filling metformin 1000 mg BID and glipizide 10 mg daily for the past two years from his primary care physician. You see his adherence is around 85%. You also see a rejected claim from last week for Ozempic. This is your first data set—it tells you what medications he has been taking and trying to take.

2. The Phone Call to the Provider (Medical Claims Data): The “why” is missing. You call the endocrinologist’s office. The nurse pulls up Mr. Jones’s chart. She tells you his most recent A1c was 9.2%, and he has a new diagnosis of chronic kidney disease stage 3b. She faxes you the latest labs and chart notes. This is your second data set—it provides the clinical context, the diagnoses (ICD-10 codes) and lab values that justify the change in therapy.

3. The Insurance Call/Portal (Prior Authorization Data): Now you contact the PBM. The representative tells you the rejection for Ozempic was because the plan requires a trial of a DPP-4 inhibitor after metformin/sulfonylurea failure before approving a GLP-1 agonist. They confirm Janumet is a preferred DPP-4 combo product and just needs a PA to document the A1c and confirm the contraindication to a GLP-1. This is your third data set—it contains the specific, rule-based logic of the health plan and the history of previous authorization attempts.

Without consciously thinking about it, you have just performed a miniature version of what a PBM data analyst does on a massive scale. You integrated three separate streams of information—(1) the pharmacy history, (2) the medical diagnosis, and (3) the insurance plan’s rules—to solve a single problem. A PBM does the exact same thing, but for millions of patients, to manage the health of an entire population. You already have the investigative skills; this module will simply teach you the structure and formal language of how that data is collected and interpreted at scale.

26.1.2 The Three Pillars of PBM Data Analytics

To build a comprehensive understanding of drug utilization, PBMs rely on three distinct but interconnected streams of data. Each pillar provides a unique perspective, and only by integrating all three can a complete picture emerge. Understanding the role, content, and limitations of each is fundamental for any pharmacist in managed care.

Pharmacy Claims

The record of every prescription dispensed. It tells you WHAT drug was given, WHEN, and at WHAT cost.

Medical Claims

The record of every doctor visit, procedure, and diagnosis. It tells you WHY a patient is being treated.

Prior Authorization Data

The record of every utilization management request. It is the BRIDGE that explicitly links a specific drug to a specific diagnosis, providing rich clinical detail.

Longitudinal Patient Record

By linking these data sources via a unique patient identifier, a PBM constructs a comprehensive, chronological record of a member’s healthcare journey, enabling powerful population health analysis.

26.1.3 Deep Dive: Deconstructing Pharmacy Claims Data (The “What”)

The pharmacy claim is the bedrock of PBM data. It is the most standardized, most complete, and most frequently generated data set available. Every time you process a prescription, you are creating a rich data file structured according to the National Council for Prescription Drug Programs (NCPDP) Telecommunication Standard. While you see the immediate output (paid/rejected), a PBM’s system captures dozens of fields from that single transaction. Understanding these fields is like learning the vocabulary of PBM analytics.

Masterclass Table: Anatomy of a Pharmacy Claim

| Data Field Category | Specific Field | Example Value | Analytical Significance & PBM Use Case |
| --- | --- | --- | --- |
| Routing & Identification | Bank Identification Number (BIN) | 012345 | The “phone number” for the PBM. Directs the claim to the correct processor. Used to segment data by PBM client or line of business. |
| Routing & Identification | Processor Control Number (PCN) | DRUGS | A secondary router. Used to direct claims to specific plans within a PBM’s portfolio (e.g., commercial vs. Medicare). |
| Routing & Identification | Cardholder ID | 987654321 | The unique identifier for the member. This is the master key used to link all pharmacy, medical, and PA data for a single individual. |
| Routing & Identification | NCPDP Provider ID (formerly NABP Number) | 1234567 | Identifies the specific pharmacy that dispensed the drug (the pharmacy’s 10-digit NPI serves the same role). Crucial for network analysis, performance metrics (e.g., generic dispense rates), and fraud, waste, and abuse (FWA) monitoring. |
| Routing & Identification | Prescriber NPI | 1122334455 | Identifies the specific provider who wrote the prescription. The key to all provider-level analytics, such as identifying high-cost prescribers or those with unusual patterns. |
| Drug & Dispensing Information | National Drug Code (NDC) | 00002-8215-01 | The 11-digit universal product identifier for the specific drug, strength, and package size. This is the most granular drug identifier, allowing for precise analysis of brand vs. generic, formulation, etc. |
| Drug & Dispensing Information | Date of Service (Fill Date) | 2025-10-15 | The date the prescription was filled. Essential for all time-based analysis: calculating adherence, measuring days between fills, and tracking trends over time. |
| Drug & Dispensing Information | Quantity Dispensed | 30 | The amount of drug dispensed (e.g., 30 tablets, 1 inhaler, 10 mL). Used to calculate cost per unit and monitor for stockpiling or abuse. |
| Drug & Dispensing Information | Days Supply | 30 | The number of days the dispensed quantity is intended to last. A critical field for calculating medication adherence (PDC) and identifying early or late refills. |
| Drug & Dispensing Information | Rx Number | 7654321 | The pharmacy’s internal prescription number. Primarily used for auditing and linking refills to the original prescription. |
| Financial Data | Ingredient Cost Submitted | $550.00 | The amount the pharmacy submitted as the cost of the drug itself. Usually based on a benchmark like Average Wholesale Price (AWP) or Wholesale Acquisition Cost (WAC). |
| Financial Data | Dispensing Fee Submitted | $5.00 | The professional fee the pharmacy requested for the act of dispensing. |
| Financial Data | Ingredient Cost Paid | $485.75 | The actual amount the PBM reimbursed the pharmacy for the drug, based on the contracted rate (e.g., AWP – 18%). The difference between submitted and paid is a key source of PBM savings. |
| Financial Data | Dispensing Fee Paid | $1.50 | The actual, contracted professional fee paid to the pharmacy. |
| Financial Data | Patient Pay Amount (Copay) | $45.00 | The total amount of cost-sharing paid by the member (copay, deductible, coinsurance). Used to analyze member out-of-pocket costs and benefit design effectiveness. |
| Financial Data | Total Amount Paid | $487.25 | The sum of Ingredient Cost Paid and Dispensing Fee Paid, i.e., the total reimbursement for that specific claim. The plan’s own share is this amount less the Patient Pay Amount. Aggregating this field is how total drug spend is calculated. |

From Data Fields to Clinical Metrics: Calculating Adherence

Seemingly simple fields like “Date of Service” and “Days Supply” are the building blocks for one of the most important clinical metrics in population health: medication adherence. The most common measure is the Proportion of Days Covered (PDC).

The PDC calculation is used to determine how consistently a patient is taking a chronic medication over a specific period (the “measurement period”). It’s a foundational metric for Medicare Star Ratings and nearly all value-based contracts.

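$$ \text{PDC} = \frac{\text{Sum of days' supply for all fills in period}}{\text{Number of days in period}} \times 100 $$

This is the simplified form. Strictly, PDC counts each covered day only once, so overlapping supply from an early refill is not double-counted and the result cannot exceed 100%.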

Example: A patient is taking lisinopril. We are looking at a 365-day measurement period.

  • Fill 1: Jan 15 (30 days supply)
  • Fill 2: Feb 13 (30 days supply)
  • Fill 3: Mar 18 (30 days supply) – Late refill
  • … and so on for a total of 11 fills of 30 days each during the year.
Total Days’ Supply = 11 fills * 30 days/fill = 330 days.
PDC = (330 / 365) * 100 = 90.4%. A PDC value ≥ 80% is typically considered “adherent.” By running this calculation across thousands of members, a PBM can identify non-adherent patients and target them for intervention programs. This is a perfect example of translating raw claims data into actionable clinical intelligence.
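The calculation above can be sketched in code. This is a minimal sketch under one assumption worth stating loudly: it counts each calendar day at most once, so overlapping supply from an early refill is not double-counted, which a plain sum of days' supply would do. The function name and the sample fills are hypothetical and are not taken from any PBM system or measure specification.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, period_start, period_end):
    """Return PDC (0-100) for fills given as (fill_date, days_supply) pairs.

    Assumption: each calendar day in the period counts at most once, and
    overlapping supply is NOT shifted forward to later days.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if period_start <= day <= period_end:
                covered.add(day)
    days_in_period = (period_end - period_start).days + 1
    return 100 * len(covered) / days_in_period


# Hypothetical fills, loosely based on the first three lisinopril fills above.
fills = [
    (date(2025, 1, 15), 30),
    (date(2025, 2, 13), 30),  # one day early: overlaps the previous fill
    (date(2025, 3, 18), 30),  # late: leaves a three-day gap
]
pdc = proportion_of_days_covered(fills, date(2025, 1, 15), date(2025, 4, 16))
print(f"PDC = {pdc:.1f}%")
```

Because the overlap day is counted once and the gap days are not covered, this sketch yields a slightly lower value than dividing the summed days' supply by the period length.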

26.1.4 Deep Dive: Deconstructing Medical Claims Data (The “Why”)

While pharmacy claims tell us what drug was dispensed, they are notoriously poor at telling us why. A claim for adalimumab doesn’t specify if it’s for rheumatoid arthritis, Crohn’s disease, or psoriasis. This is where medical claims data becomes indispensable. Every time a patient sees a healthcare provider or undergoes a procedure, a medical claim is generated. This claim contains a wealth of clinical context in the form of standardized codes that paint a picture of the patient’s condition.

For a PBM, gaining access to a client’s medical claims data (often through a data sharing agreement) is a game-changer. It allows them to link a specific drug to a specific diagnosis, enabling far more sophisticated utilization management. Instead of a blunt, one-size-fits-all PA criterion for a drug, they can create diagnosis-specific criteria, which is the entire basis of modern, clinically sound prior authorization.

The Key Medical Coding Systems

ICD-10-CM Codes

International Classification of Diseases, 10th Revision, Clinical Modification. These are alphanumeric codes that represent a patient’s diagnosis. They are the “why” behind the visit. Example: M06.9 for Rheumatoid arthritis, unspecified.

CPT® Codes

Current Procedural Terminology. These are five-digit numeric codes that describe the services and procedures a provider performed. Example: 99214 for an established patient office visit.

HCPCS Level II Codes

Healthcare Common Procedure Coding System. Alphanumeric codes for products, supplies, and services not included in CPT. Critically for PBMs, this includes J-codes, which are used for physician-administered injectable drugs billed under the medical benefit. Example: J0135 for an injection of adalimumab.
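As a rough illustration of how the three code formats differ, the sketch below sorts a code string into one of the systems by its shape alone. The patterns are deliberately simplified assumptions (for example, ICD-10-CM codes are assumed to be stored with their decimal point, and only five-digit numeric CPT codes are recognized); a real system would validate against the official code sets rather than a regular expression. The function name `classify_code` is hypothetical.

```python
import re

# Simplified shape checks only; these do not confirm that a code actually exists.
CPT_NUMERIC = re.compile(r"^\d{5}$")                       # e.g. 99214
HCPCS_LEVEL_II = re.compile(r"^[A-Z]\d{4}$")               # e.g. J0135
ICD10_CM = re.compile(r"^[A-Z]\d[0-9A-Z](\.[0-9A-Z]{1,4})?$")  # e.g. M06.9, G35


def classify_code(code):
    """Guess the coding system of a code string from its format."""
    code = code.strip().upper()
    if CPT_NUMERIC.match(code):
        return "CPT"
    if HCPCS_LEVEL_II.match(code):
        return "HCPCS Level II"
    if ICD10_CM.match(code):
        return "ICD-10-CM"
    return "unknown"


for example in ["M06.9", "99214", "J0135"]:
    print(example, "->", classify_code(example))
```

In claims extracts that strip the decimal point from diagnosis codes, shape alone becomes ambiguous, which is one reason real pipelines carry an explicit code-type field.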

Masterclass Table: High-Relevance ICD-10 Codes for PA Pharmacists

As a PA pharmacist, you will become intimately familiar with the ICD-10 codes for common conditions treated with high-cost specialty medications. Recognizing these codes on a medical claim is often the first clue in building a clinical picture.

| Therapeutic Area | ICD-10 Code | Diagnosis | Commonly Associated High-Cost Drugs |
| --- | --- | --- | --- |
| Rheumatology | M05.79 | Rheumatoid arthritis with rheumatoid factor of multiple sites without organ or systems involvement | Adalimumab (Humira), Etanercept (Enbrel), Infliximab (Remicade), Tofacitinib (Xeljanz) |
| Rheumatology | L40.50 | Psoriatic arthritis (arthropathic psoriasis), unspecified | Secukinumab (Cosentyx), Ustekinumab (Stelara), Apremilast (Otezla) |
| Rheumatology | M32.10 | Systemic lupus erythematosus, organ or system involvement unspecified | Belimumab (Benlysta), Anifrolumab (Saphnelo) |
| Gastroenterology | K50.90 | Crohn’s disease, unspecified, without complications | Adalimumab (Humira), Ustekinumab (Stelara), Vedolizumab (Entyvio) |
| Gastroenterology | K51.90 | Ulcerative colitis, unspecified, without complications | Tofacitinib (Xeljanz), Ozanimod (Zeposia), Infliximab (Remicade) |
| Gastroenterology | K74.60 | Unspecified cirrhosis of liver | Often associated with Hepatitis C treatment regimens (e.g., Epclusa, Mavyret) |
| Neurology | G35 | Multiple sclerosis | Ocrelizumab (Ocrevus), Natalizumab (Tysabri), Fingolimod (Gilenya) |
| Neurology | G43.709 | Chronic migraine without aura, not intractable, without status migrainosus | Erenumab (Aimovig), Galcanezumab (Emgality), Rimegepant (Nurtec) |
| Oncology | C91.10 | Chronic lymphocytic leukemia of B-cell type not having achieved remission | Ibrutinib (Imbruvica), Venetoclax (Venclexta) |
| Oncology | C61 | Malignant neoplasm of prostate | Abiraterone (Zytiga), Enzalutamide (Xtandi) |

The Limitations of Claims Data: Reading Between the Codes

While incredibly powerful, medical claims data is not a perfect source of truth. As a clinical expert, you must be aware of its inherent limitations:

  • Lag Time: Unlike pharmacy claims, which are adjudicated in real time, medical claims can take weeks or even months to be submitted by the provider and processed by the payer. This means your data may not reflect the most recent clinical events.
  • “Rule-Out” Coding: A provider may use a diagnosis code on a claim as part of a workup to “rule out” a condition. The presence of an ICD-10 code is not definitive proof of a confirmed diagnosis.
  • Lack of Clinical Nuance: A claim tells you the patient has rheumatoid arthritis (M05.79), but it doesn’t tell you their disease activity score (DAS28), which specific joints are affected, or what other medications they’ve failed. This level of detail can only come from chart notes, which is why PA data is so vital.
  • Billing vs. Clinical Reality: Claims are generated for the purpose of billing. This can sometimes lead to “upcoding” (using a code for a more severe condition to get higher reimbursement) or other inaccuracies that don’t reflect the true clinical picture.

The Pharmacist’s Role: Your job is to use claims data as a highly effective, but imperfect, starting point for your investigation. It generates hypotheses (“This member may have Crohn’s disease”) that must then be confirmed with more detailed clinical information, often found in the PA submission itself.

26.1.5 Deep Dive: Deconstructing Prior Authorization Data (The “Bridge”)

If pharmacy claims are the “what” and medical claims are the “why,” then prior authorization data is the “how.” It is the rich, contextual data stream that explicitly connects a specific drug to a specific diagnosis and provides the clinical justification required by the health plan. The PA system is more than just a gatekeeper; it is a powerful clinical data collection tool. Every field completed, every document uploaded, and every decision logged creates a data point that can be analyzed.

This data is the PBM’s most valuable asset for understanding and managing high-cost specialty drugs. It provides a level of detail that is simply unavailable in either pharmacy or medical claims alone. For the data-savvy pharmacist, understanding the structure of this data allows you to see the entire utilization management strategy of the PBM laid bare.

Masterclass Table: The Prior Authorization Data Ecosystem

| Data Category | Specific Field / Metric | Example | Analytical Significance & PBM Use Case |
| --- | --- | --- | --- |
| Request & Submission Metrics | Submission Channel | Portal, Fax, Phone | Analyzed to understand provider behavior and drive adoption of more efficient channels like web portals. High fax volume may trigger provider outreach and education. |
| Request & Submission Metrics | Turnaround Time (TAT) | 1.2 days | The time from request receipt to final decision. A critical metric for regulatory compliance (e.g., ERISA, Medicare) and client service level agreements (SLAs). PBMs are constantly monitoring and trying to reduce TAT. |
| Request & Submission Metrics | Request Type | Initial, Renewal, Appeal | Segmenting data by request type helps understand workflow. A high volume of renewals indicates a stable patient population on a given drug. A high appeal rate may suggest the criteria are too restrictive or unclear. |
| Request & Submission Metrics | Completeness of Submission | “Missing Labs” | Internal metric tracking if the initial submission contained all necessary information. High rates of incomplete submissions can pinpoint confusing criteria or provider education opportunities. |
| Clinical & Decision Data | Submitted ICD-10 Code | K50.90 | The diagnosis code provided by the prescriber on the PA form. This is the most direct link between drug and diagnosis and is used to trigger the correct set of clinical criteria in the PA system. |
| Clinical & Decision Data | Submitted Clinical Values | A1c=9.2%, DAS28=5.4 | Structured data fields that capture key lab values, scores, or other metrics required by the criteria. This is pure clinical gold for analysis, allowing the PBM to stratify patients by disease severity. |
| Clinical & Decision Data | Final Decision | Approved, Denied | The ultimate outcome of the review. The approval/denial rate for a given drug is a key performance indicator for a PA program. |
| Clinical & Decision Data | Denial Reason Code | “Step Therapy Required” | A standardized code explaining why a request was denied. This is perhaps the most actionable data point. It tells program managers precisely which criteria are being triggered and whether the program is functioning as intended. |
| Program & Provider Insights | Approval Rate by Provider | Dr. Smith: 95%, Dr. Jones: 35% | Calculating approval rates at the individual provider level can quickly identify outliers. Dr. Smith may be following criteria perfectly, while Dr. Jones may need education on the plan’s requirements. |
| Program & Provider Insights | Denial Reasons by Drug | Drug X: 80% of denials are for “Not FDA-approved indication” | This analysis can reveal widespread off-label prescribing that the PBM may want to address through a new policy or provider education campaign. |

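The two program-level metrics in the table, approval rate by provider and denial reasons by drug, are simple aggregations over a PA decision log. The sketch below shows one way they might be computed. The record layout, field names, and sample rows are all hypothetical and do not reflect any particular PA platform.

```python
from collections import Counter, defaultdict

# Hypothetical PA decision log; every field name and value here is made up.
pa_requests = [
    {"provider": "Dr. Smith", "drug": "Drug X", "decision": "Approved", "denial_reason": None},
    {"provider": "Dr. Smith", "drug": "Drug Y", "decision": "Approved", "denial_reason": None},
    {"provider": "Dr. Jones", "drug": "Drug X", "decision": "Denied", "denial_reason": "Not FDA-approved indication"},
    {"provider": "Dr. Jones", "drug": "Drug X", "decision": "Denied", "denial_reason": "Not FDA-approved indication"},
    {"provider": "Dr. Jones", "drug": "Drug X", "decision": "Denied", "denial_reason": "Step Therapy Required"},
    {"provider": "Dr. Jones", "drug": "Drug Y", "decision": "Approved", "denial_reason": None},
]


def approval_rate_by_provider(requests):
    """Return {provider: fraction of that provider's requests approved}."""
    totals, approved = Counter(), Counter()
    for r in requests:
        totals[r["provider"]] += 1
        if r["decision"] == "Approved":
            approved[r["provider"]] += 1
    return {p: approved[p] / totals[p] for p in totals}


def denial_reasons_by_drug(requests):
    """Return {drug: Counter of denial reasons} for denied requests only."""
    reasons = defaultdict(Counter)
    for r in requests:
        if r["decision"] == "Denied":
            reasons[r["drug"]][r["denial_reason"]] += 1
    return reasons


rates = approval_rate_by_provider(pa_requests)
reasons = denial_reasons_by_drug(pa_requests)
print(rates)
print(reasons["Drug X"].most_common(1))
```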
26.1.6 The Grand Unification: Creating a Longitudinal Patient Record

The true power of PBM data analytics is unleashed when these three distinct data pillars are integrated, or “unified,” into a single, comprehensive record for each member. Using the unique Cardholder ID as the master key, analysts can link a member’s pharmacy fills, doctor visits, and PA requests into a chronological timeline. This creates a longitudinal patient record, a powerful tool for understanding a patient’s entire healthcare journey.

This unified view allows for analyses that would be impossible with any single data set alone. It allows the PBM to move beyond simple questions like “How much did we spend on Humira last year?” to far more sophisticated questions like, “For members with Crohn’s disease who started Humira last year, what was their total cost of care, including hospitalizations, and did it differ from members who started on Stelara?” This is the foundation of true population health management and outcomes-based analysis.
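A question like the one above comes down to joining medical and pharmacy claims on the member identifier. The following is a minimal sketch under several assumptions: the tiny data extracts, the field names, and the function `mean_total_cost_by_drug` are all hypothetical, the pharmacy claims are assumed to be ordered by fill date, and the diagnosis cohort is defined by an ICD-10 prefix match alone.

```python
from collections import defaultdict

# Hypothetical extracts; all IDs, amounts, and field names are invented.
medical_claims = [
    {"member_id": "A1", "icd10": "K50.90", "paid": 1200.00},
    {"member_id": "A1", "icd10": "K50.90", "paid": 18000.00},  # e.g. a hospitalization
    {"member_id": "B2", "icd10": "K50.90", "paid": 900.00},
    {"member_id": "C3", "icd10": "M05.79", "paid": 400.00},
]
pharmacy_claims = [  # assumed sorted by fill date
    {"member_id": "A1", "drug": "Humira", "paid": 6000.00},
    {"member_id": "B2", "drug": "Stelara", "paid": 12000.00},
    {"member_id": "C3", "drug": "Humira", "paid": 6000.00},
]


def mean_total_cost_by_drug(medical, pharmacy, icd10_prefix):
    """Average total (medical + pharmacy) paid per member, grouped by first drug."""
    # Step 1: define the cohort from diagnosis codes on medical claims.
    cohort = {c["member_id"] for c in medical if c["icd10"].startswith(icd10_prefix)}

    # Step 2: accumulate each cohort member's total cost across both sources.
    total_cost = defaultdict(float)
    first_drug = {}
    for c in pharmacy:
        if c["member_id"] in cohort:
            total_cost[c["member_id"]] += c["paid"]
            first_drug.setdefault(c["member_id"], c["drug"])
    for c in medical:
        if c["member_id"] in cohort:
            total_cost[c["member_id"]] += c["paid"]

    # Step 3: average the member totals within each starting drug.
    by_drug = defaultdict(list)
    for member, drug in first_drug.items():
        by_drug[drug].append(total_cost[member])
    return {drug: sum(costs) / len(costs) for drug, costs in by_drug.items()}


result = mean_total_cost_by_drug(medical_claims, pharmacy_claims, "K50")
print(result)
```

A raw comparison like this says nothing about cause: it does not adjust for disease severity or other differences between the two groups, so it is a starting point for questions rather than an outcomes analysis.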

Case Study: The Journey of a Patient with Rheumatoid Arthritis

Let’s trace how the data from a single patient, Mrs. Smith, would be unified to create an analytical record.

Mrs. Smith’s Unified Data Journey

Linking Pharmacy, Medical, and PA data to tell a complete story.

1. Medical Claim – Diagnosis

Date: June 10, 2025

Mrs. Smith visits a rheumatologist for joint pain.

Data Generated:

  • Cardholder ID: 12345ABC
  • Provider NPI: 9988776655 (Rheumatologist)
  • CPT Code: 99204 (New patient office visit)
  • ICD-10 Code: M05.79 (Rheumatoid arthritis)

Analytical Insight: A member has been newly identified with a diagnosis of RA. The system flags the member as potentially needing specialty therapy in the future.

2. Prior Authorization Request – The Bridge

Date: June 12, 2025

The rheumatologist’s office submits a PA request for Enbrel.

Data Generated:

  • Cardholder ID: 12345ABC
  • Drug Requested: Enbrel (Etanercept)
  • Submitted ICD-10: M05.79
  • Clinical Info: Documented failure of methotrexate, DAS28 score of 6.1.
  • Decision: Approved (Meets clinical criteria)
  • Turnaround Time: 0.5 days

Analytical Insight: The link is confirmed. The RA diagnosis is tied directly to the need for Enbrel. The clinical data (methotrexate failure, high DAS28 score) justifies the approval and provides severity context.

3. Pharmacy Claim – The Action

Date: June 14, 2025

Mrs. Smith picks up her first fill of Enbrel from her specialty pharmacy.

Data Generated:

  • Cardholder ID: 12345ABC
  • NDC: 58406-0040-04 (Enbrel SureClick)
  • Days Supply: 28
  • Total Amount Paid: $5,250.00
  • Patient Pay Amount: $50.00

Analytical Insight: The care journey results in a pharmacy spend. The PBM can now track Mrs. Smith’s adherence (PDC) to Enbrel and correlate it with future medical claims (e.g., fewer rheumatologist visits or hospitalizations) to measure the total value of the therapy.
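The three steps of Mrs. Smith's journey can be sketched as a small linking routine: events from each source are filtered by Cardholder ID and sorted by date. This is an illustrative sketch only. The `Event` structure and `build_timeline` function are hypothetical names, the detail strings are abbreviated from the case study, and the second pharmacy record belongs to an invented, unrelated member to show the filter at work.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    member_id: str     # Cardholder ID, the linking key
    event_date: date
    source: str        # "medical", "pa", or "pharmacy"
    detail: str


def build_timeline(member_id, *sources):
    """Merge events for one member from any number of sources, oldest first."""
    events = [e for source in sources for e in source if e.member_id == member_id]
    return sorted(events, key=lambda e: e.event_date)


medical = [Event("12345ABC", date(2025, 6, 10), "medical", "CPT 99204, ICD-10 M05.79")]
pa = [Event("12345ABC", date(2025, 6, 12), "pa", "Enbrel approved, DAS28 6.1")]
pharmacy = [
    Event("12345ABC", date(2025, 6, 14), "pharmacy", "Enbrel, 28 days supply"),
    Event("99999XYZ", date(2025, 6, 1), "pharmacy", "unrelated member"),
]

# Sources are passed out of order on purpose; the sort restores chronology.
timeline = build_timeline("12345ABC", pharmacy, pa, medical)
for e in timeline:
    print(e.event_date.isoformat(), e.source, e.detail)
```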