Section 11.1: SQL and Query Basics for Pharmacists
A foundational deep dive into Structured Query Language (SQL), the universal language of data. Learn how to “speak” directly to your hospital’s databases to select, filter, and aggregate medication data, transforming you from a passive report user into an active data investigator.
SQL and Query Basics for Pharmacists
From Clinical Question to Actionable Data: Your New Superpower.
11.1.1 The “Why”: Beyond Pre-Built Reports
In your daily practice, you are a master of data retrieval, though you may not think of it that way. Every time you pull up a patient’s profile, check their allergies, review their lab results, or look at their medication history, you are querying a database. You are asking the system a specific question—”Show me all active medications for John Smith”—and the Electronic Health Record (EHR) translates your clicks into a query and presents you with the answer. You are, in effect, using a graphical user interface (GUI) to access a vast repository of information.
This system works beautifully for patient-specific, pre-defined questions. But what happens when your clinical question is broader? What if you want to ask:
- “How many of our heart failure patients were discharged on both an ACE inhibitor and an ARB last month?”
- “Which medical service prescribes the most broad-spectrum antibiotics for community-acquired pneumonia?”
- “What is the average time from a STAT vancomycin order to the first trough level being drawn?”
- “Show me a list of all patients over 75 who are currently on three or more anticholinergic medications.”
These are precisely the kinds of questions that drive improvements in medication safety, formulary management, and clinical practice. However, there is no button in the EHR for “Find Potentially Inappropriate Polypharmacy.” To answer these questions, you have two options: you can submit a request to a dedicated data analyst team and wait for them to run a report, or you can learn to ask the database the questions yourself. This section is about teaching you the second option.
Learning Structured Query Language (SQL) is the single most empowering technical skill a pharmacist can develop in the informatics space. It is the universal key that unlocks nearly every major healthcare database in the world. It allows you to bypass the limitations of pre-built reports and GUIs to engage in a direct conversation with your hospital’s data. It transforms you from a passive consumer of information into an active, independent data investigator, capable of formulating a clinical question, retrieving the precise data needed to answer it, and driving evidence-based change. This is not about becoming a programmer; it’s about learning a new language of inquiry that is a direct extension of your clinical curiosity.
Retail Pharmacist Analogy: The Advanced Search Function
Imagine your community pharmacy software. You know how to pull up a patient’s profile and see their history. This is like using the basic functions of the EHR. Now, imagine you need to prepare for a DEA audit. You need a list of every single hydrocodone prescription filled in the last six months, sorted by prescriber. You probably wouldn’t go through every patient profile one by one. Instead, you would go to your system’s reporting or “advanced search” function.
In that search screen, you would specify several criteria:
- Drug Name: LIKE ‘hydrocodone%’
- Fill Date: BETWEEN [6 months ago] AND [Today]
- Sort By: Prescriber Name
When you click “Run Report,” the software takes those criteria and writes a SQL query behind the scenes to pull the exact data you requested. Now, imagine if you could bypass that search screen entirely. What if, instead of being limited to the search fields the software vendor gave you, you could write the instructions yourself? You could ask for that same list, but also include the patient’s age, the quantity dispensed, and the days’ supply. You could then ask to only see results where the days’ supply was greater than 30. Then, you could ask to see only patients who also filled a benzodiazepine within the same month.
That is the power of SQL. It is the language that powers every advanced search you’ve ever used. Learning SQL is like being handed the master key to your pharmacy’s data cabinet, allowing you to ask any question you can think of, not just the ones the pre-built search form allows.
11.1.2 Deconstructing the Database: Your Digital Filing Cabinet
Before you can write a query, you need a mental model of where the data lives. The term “database” can seem abstract, but for our purposes, you can think of a relational database as an exceptionally well-organized digital filing cabinet. This cabinet is designed to store vast amounts of information in a way that prevents duplication and makes retrieval efficient. It’s built on a few simple, powerful concepts.
The Database
This is the entire filing cabinet. It’s a container that holds all the related sets of information for the hospital (e.g., the Clinical Data Warehouse).
Tables
These are the drawers in the cabinet. Each table holds a specific category of information, like Patients, Medication_Orders, or Lab_Results.
Columns (or Fields)
These are the labels on the folders inside each drawer. They define what pieces of information are stored for each record. For the Patients table, columns would be Patient_ID, LastName, FirstName, DateOfBirth.
Rows (or Records)
These are the actual sheets of paper inside the folders. Each row is a single, complete record. In the Patients table, one row would represent one specific patient, containing their ID, Last Name, First Name, and DOB.
A Visual Example: Patients and Medication Orders
Let’s visualize two common tables in a hospital database. Notice how they each store a distinct type of information, but they share a common column—Patient_ID. This shared column is the key that allows us to link them together, which is the “relational” part of a relational database.
dbo.Patients
| Patient_ID | LastName | FirstName | DateOfBirth |
|---|---|---|---|
| 1001 | Smith | John | 1965-05-10 |
| 1002 | Williams | Mary | 1982-11-22 |
| 1003 | Jones | David | 1950-01-15 |
dbo.Medication_Orders
| Order_ID | Patient_ID | Drug_Name | Dose | Order_Date |
|---|---|---|---|---|
| 98551 | 1002 | Lisinopril | 10 mg | 2025-10-18 |
| 98552 | 1001 | Metformin | 500 mg | 2025-10-18 |
| 98553 | 1002 | Atorvastatin | 40 mg | 2025-10-18 |
Primary Keys and Foreign Keys: The Foundation of Relationships
In the example above, Patient_ID in the `dbo.Patients` table is the Primary Key. It is a unique identifier for each row in that table; no two patients can have the same ID. Think of it as a patient’s medical record number (MRN).
The Patient_ID column in the `dbo.Medication_Orders` table is a Foreign Key. It “points back” to the primary key in the `dbo.Patients` table. This is how the database knows that order `98551` for Lisinopril belongs to Mary Williams. This key system is what allows us to perform powerful `JOIN` operations to combine data from multiple tables, which we will cover in depth shortly.
11.1.3 The Language of SQL: Your Core Vocabulary
SQL has a vast vocabulary, but a handful of core commands will handle 95% of the questions you need to ask. The language is designed to be readable, almost like a sentence. We will build up your vocabulary one command at a time, starting with the most fundamental building blocks.
The SELECT and FROM Statements: What and Where
Every query you write will begin with `SELECT` and `FROM`. This is the absolute minimum required to retrieve data.
- The SELECT statement specifies the columns (the folders) you want to see.
- The FROM statement specifies the table (the drawer) you want to get them from.
Syntax and Examples
To select specific columns, you list them after `SELECT`, separated by commas. To select all columns, you can use the asterisk (`*`) as a wildcard, though this is often discouraged in final queries for performance reasons.
Example 1: Get the names of all patients.
FROM dbo.Patients;
Example 2: Get all information for every medication order.
FROM dbo.Medication_Orders;
A Note on Syntax: Semicolons and Case Sensitivity
Most SQL databases use a semicolon (`;`) to signify the end of a command. While not always strictly required, it is a universal best practice and prevents errors when running multiple queries at once. SQL keywords like `SELECT` and `FROM` are generally case-insensitive, but table and column names may or may not be, depending on the database’s configuration. For consistency and readability, the standard convention is to write SQL keywords in UPPERCASE and table/column names as they appear in the database (often PascalCase or snake_case).
The WHERE Clause: Your Clinical Filter
The `WHERE` clause is the pharmacist’s most powerful tool. It allows you to filter the rows in a table to only the ones that meet your specific criteria. This is how you move from seeing all medication orders to seeing only the ones for a specific drug, a specific patient, or a specific date range. The `WHERE` clause always comes after the `FROM` clause.
Masterclass Table: Common `WHERE` Clause Operators
| Operator | Description | Pharmacy Example |
|---|---|---|
= |
Equal to. Used for exact matches. | WHERE Drug_Name = 'Warfarin' |
<> or != |
Not equal to. | WHERE Patient_Status <> 'Discharged' |
> |
Greater than. | WHERE Dose_Value > 500 |
< |
Less than. | WHERE Age < 18 |
>= or <= |
Greater/less than or equal to. | WHERE Order_Date >= '2025-10-01' |
BETWEEN |
Checks for a value within a range (inclusive). | WHERE Order_Date BETWEEN '2025-10-01' AND '2025-10-31' |
LIKE |
Searches for a pattern in a string. The percent sign (%) is a wildcard for any number of characters. |
WHERE Drug_Name LIKE 'Heparin%' (Finds Heparin Flush, Heparin Drip, etc.) |
IN |
Checks if a value is present in a list of options. | WHERE Drug_Name IN ('Apixaban', 'Rivaroxaban', 'Edoxaban') |
IS NULL |
Checks for empty or missing values. | WHERE Stop_Date IS NULL (Finds active, non-discontinued orders) |
Example 3: Find all orders for Lisinopril.
FROM dbo.Medication_Orders
WHERE Drug_Name = 'Lisinopril';
Example 4: Find all medication orders for patient 1002.
FROM dbo.Medication_Orders
WHERE Patient_ID = 1002;
Example 5: Find all cephalosporin orders using a wildcard.
SELECT Drug_Name
FROM dbo.Medication_Orders
WHERE Drug_Name LIKE 'Ceph%'; -- or 'Cef%' depending on naming conventions
AND, OR, NOT: Combining Your Filters
Rarely is a clinical question answered by a single filter. You almost always need to combine criteria. The `AND`, `OR`, and `NOT` logical operators allow you to build complex, precise filters that mirror clinical logic.
- AND: All conditions must be true. (Narrower results)
- OR: At least one of the conditions must be true. (Broader results)
- NOT: Reverses the logic of a condition.
Example 6: Find all Metformin orders with a dose greater than 500 mg.
FROM dbo.Medication_Orders
WHERE Drug_Name = 'Metformin' AND Dose LIKE '1000 mg'; -- Assuming Dose is a string
Example 7: Find all patients who are either over 65 OR have a diagnosis of renal failure. (This requires another table, but illustrates the logic).
FROM dbo.Patients
WHERE Age > 65 OR Primary_Diagnosis = 'Renal Failure';
Use Parentheses to Control the Order of Operations
Just like in mathematics, you can use parentheses `()` to group your `AND` and `OR` conditions. Conditions inside parentheses are evaluated first. This is critical for avoiding logical errors in complex queries.
Scenario: Find all patients on Warfarin who are ALSO on either Amiodarone OR Bactrim (a significant interaction).
Correct Query:
Incorrect Query (without parentheses):
The incorrect query would return all patients on Warfarin and Amiodarone, PLUS all patients on Bactrim, regardless of whether they are on Warfarin. This is a subtle but catastrophic difference.
ORDER BY: Sorting Your Results
The `ORDER BY` clause allows you to sort your results based on one or more columns. This is essential for making sense of large datasets. By default, the sort is ascending (`ASC`). You can specify descending order using the `DESC` keyword. The `ORDER BY` clause is almost always the last clause in a query.
Example 8: Get a list of all medication orders, with the most recent ones first.
FROM dbo.Medication_Orders
ORDER BY Order_Date DESC;
Example 9: Get a list of all patients, alphabetized by last name, then first name.
FROM dbo.Patients
ORDER BY LastName ASC, FirstName ASC;
11.1.4 The Power of Relationships: JOINing Tables
So far, we've only asked questions of a single table at a time. But the real power of SQL comes from its ability to combine information from multiple tables. If you want to find the names of the patients who received Lisinopril, you need data from both the `dbo.Patients` table (for the names) and the `dbo.Medication_Orders` table (for the drug information). The `JOIN` clause is how you do this.
A `JOIN` links rows from two or more tables based on a related column between them—the primary key/foreign key relationship we discussed earlier. The most common type of join is the `INNER JOIN`.
INNER JOIN: Finding the Matches
An `INNER JOIN` returns only the rows where the key exists in both tables. Think of it as a Venn diagram—it only returns the overlapping section. You specify which tables to join and, critically, the `ON` condition that tells the database how to match them up.
Visualizing an INNER JOIN
dbo.Patients
Contains all patients
dbo.Medication_Orders
Contains all orders
Result of INNER JOIN on Patient_ID
A new, temporary table containing only patients who have at least one medication order. The patient and order information are now in the same row.
Example 10: Get the names of patients and the drugs they were ordered.
P.FirstName,
P.LastName,
O.Drug_Name,
O.Order_Date
FROM
dbo.Patients AS P -- 'P' is a temporary alias for the Patients table
INNER JOIN
dbo.Medication_Orders AS O -- 'O' is an alias for the Orders table
ON
P.Patient_ID = O.Patient_ID; -- This is the crucial link
Table Aliases: A Best Practice for Readability
In the query above, we used `AS P` and `AS O`. These are called aliases. They create a temporary, shorthand name for the tables within your query. This is extremely useful for two reasons:
- Readability: It makes your `SELECT` statement much shorter and easier to read (`P.FirstName` instead of `dbo.Patients.FirstName`).
- Necessity: When you join tables that might have columns with the same name (like a 'Create_Date' column in both), you must specify which table you mean (e.g., `P.Create_Date` vs. `O.Create_Date`). Using aliases makes this clean and unambiguous.
LEFT JOIN: Finding All of One, and Some of Another
What if you wanted a list of all patients, and for those who have medication orders, you want to see them? If you used an `INNER JOIN`, you would only get the patients who have orders. A `LEFT JOIN` solves this. It returns all rows from the "left" table (the one listed first) and the matched rows from the "right" table. If there is no match, the columns from the right table will be filled with `NULL` (empty) values.
Example 11: Get a list of ALL patients, and list their medication orders if they have any.
P.LastName,
P.FirstName,
O.Drug_Name
FROM
dbo.Patients AS P
LEFT JOIN
dbo.Medication_Orders AS O ON P.Patient_ID = O.Patient_ID
ORDER BY
P.LastName;
This query is incredibly useful for finding patients who are missing something. For example, to find all diabetic patients who do NOT have an order for an ACE inhibitor, you could `LEFT JOIN` the `Patients` table to the `Medication_Orders` table and then filter with `WHERE Medication_Orders.Drug_Name IS NULL`.
11.1.5 Aggregating Data: From Lists to Insights
A long list of every order is not an analysis; it's just raw data. The real power comes from summarizing, or aggregating, this data into meaningful numbers. This is done using aggregate functions, often paired with the `GROUP BY` clause.
Masterclass Table: Core Aggregate Functions
| Function | Description | Pharmacy Example |
|---|---|---|
COUNT() |
Counts the number of rows. COUNT(*) counts all rows, while COUNT(ColumnName) counts non-NULL values in that column. |
COUNT(*) FROM dbo.Medication_Orders WHERE Drug_Name = 'Vancomycin' (Counts the total number of Vanc orders) |
COUNT(DISTINCT ...) |
Counts the number of unique values in a column. | COUNT(DISTINCT Patient_ID) FROM dbo.Medication_Orders (Counts how many unique patients received any medication) |
SUM() |
Adds up all the numeric values in a column. | SUM(Dispensed_Quantity) (Calculates total tablets dispensed) |
AVG() |
Calculates the average of the numeric values in a column. | AVG(Cost) (Calculates the average cost per dose) |
MIN() / MAX() |
Finds the minimum or maximum value in a column. | MIN(Order_Date) (Finds the earliest order date for a patient) |
GROUP BY: Summarizing by Category
Aggregate functions on their own give you one number for the whole table. The `GROUP BY` clause is the magic ingredient that allows you to see these summaries for each category. For example, instead of the total count of all orders, you can get the count of orders for each drug.
When you use `GROUP BY`, any column in your `SELECT` statement must be either one of the columns you are grouping by, or it must be inside an aggregate function.
Example 12: Count the number of orders for each drug.
Drug_Name,
COUNT(*) AS NumberOfOrders
FROM
dbo.Medication_Orders
GROUP BY
Drug_Name
ORDER BY
NumberOfOrders DESC; -- See the most frequently ordered drugs first
Example 13: Count the number of unique patients who received each DOAC.
Drug_Name,
COUNT(DISTINCT Patient_ID) AS UniquePatientCount
FROM
dbo.Medication_Orders
WHERE
Drug_Name IN ('Apixaban', 'Rivaroxaban', 'Edoxaban', 'Dabigatran')
GROUP BY
Drug_Name
ORDER BY
UniquePatientCount DESC;
11.1.6 The Pharmacist's Query Cookbook: Real-World Scenarios
Theory is one thing; application is another. This section provides practical, step-by-step examples of how to combine these SQL commands to answer complex clinical questions. These are the types of queries you will run constantly in an informatics role.
Playbook Scenario 1: Antimicrobial Stewardship - IV Vancomycin Duration
Clinical Question: "Our stewardship committee wants a list of all current inpatients who have been on IV vancomycin for more than 5 consecutive days to assess for appropriate de-escalation or discontinuation."
Logical Breakdown:
- We need to find orders for IV vancomycin.
- The orders must be for currently admitted patients (not discharged).
- We need to calculate the duration of therapy for each order. A simple way is to find the difference between today's date and the order start date.
- The duration must be greater than 5 days.
- We need patient information (name, location) and order information (start date). This will require a `JOIN`.
Sample Query: (Note: Date functions like `DATEDIFF` can vary slightly between SQL versions like T-SQL, PL/SQL, etc. This uses a common syntax.)
P.LastName, P.FirstName, P.Current_Location,
O.Order_StartDate,
DATEDIFF(day, O.Order_StartDate, GETDATE()) AS DaysOfTherapy
FROM
dbo.Patients AS P
INNER JOIN
dbo.Medication_Orders AS O ON P.Patient_ID = O.Patient_ID
WHERE
P.Patient_Status = 'Admitted'
AND O.Drug_Name LIKE 'Vancomycin%IV%'
AND O.Order_Status = 'Active'
AND DATEDIFF(day, O.Order_StartDate, GETDATE()) > 5
ORDER BY
DaysOfTherapy DESC;
Playbook Scenario 2: Medication Safety - The "Triple Whammy"
Clinical Question: "Identify all currently admitted patients who have active orders for an ACE Inhibitor or ARB, a diuretic, AND an NSAID concurrently, as this 'triple whammy' combination poses a high risk for acute kidney injury."
Logical Breakdown:
- This is a complex query. The easiest way to solve this is to find all patients on at least one drug from each of the three classes.
- We can use `GROUP BY` on the patient's ID.
- Inside the `GROUP BY`, we can use a conditional count (`COUNT` within a `CASE` statement) to see if they have a drug from each category.
- The final `HAVING` clause (a `WHERE` clause for aggregated data) will filter for patients who meet all three conditions.
Sample Query:
P.Patient_ID, P.LastName, P.FirstName
FROM
dbo.Patients AS P
INNER JOIN
dbo.Medication_Orders AS O ON P.Patient_ID = O.Patient_ID
WHERE
P.Patient_Status = 'Admitted' AND O.Order_Status = 'Active'
GROUP BY
P.Patient_ID, P.LastName, P.FirstName
HAVING
-- Condition 1: Has an ACE/ARB
SUM(CASE WHEN O.Drug_Class = 'ACEI' OR O.Drug_Class = 'ARB' THEN 1 ELSE 0 END) > 0
AND
-- Condition 2: Has a Diuretic
SUM(CASE WHEN O.Drug_Class = 'Diuretic' THEN 1 ELSE 0 END) > 0
AND
-- Condition 3: Has an NSAID
SUM(CASE WHEN O.Drug_Class = 'NSAID' THEN 1 ELSE 0 END) > 0;