Most organizations that consume drug pricing data never question it. The numbers arrive from CMS, from pricing compendia, from analytics vendors, and they are ingested into models, dashboards, and decision frameworks with the implicit trust that data from authoritative sources is accurate. This trust is generally warranted for individual data points — the WAC published for a given NDC is almost certainly the WAC the manufacturer reported, and the NADAC for a given product reflects actual survey responses from pharmacies. The problems emerge not in the accuracy of individual data points but in the assumptions required to use those data points together.
When Daily Remedy conducted an internal audit of its analytics platform — a scheduled quality check as part of its data integrity framework — the team discovered something that would be unremarkable to anyone who has worked closely with pharmaceutical data and astonishing to anyone who has not: the same drug, priced in three federal datasets, produced three incomparable numbers because each dataset followed a different unit convention. The budesonide discrepancy was not an error in any single database. Each database was internally consistent. The problem was in the seams between them — in the unstated assumption that a “unit” in one dataset means the same thing as a “unit” in another.
This kind of structural discrepancy is the data integrity equivalent of a silent mutation. It does not produce an error message. It does not flag itself in a validation check. It sits in the data, producing answers that are precisely calculated and potentially wrong, waiting for someone to notice that the inputs were not commensurable. The organizations that find these discrepancies are the ones that audit systematically — that cross-reference datasets as a routine practice rather than consuming them in isolation. Most organizations do not audit at this level, whether because they lack the resources, the expertise, or simply the awareness that the discrepancies exist.
The pharmaceutical data ecosystem compounds the problem through its layered architecture. A downstream analytics platform ingests WAC from a pricing compendium, NADAC from CMS, ASP from CMS’s quarterly files, and utilization data from claims databases. Each source carries its own conventions, its own update cadence, and its own scope limitations. The platform must harmonize these inputs — aligning NDCs across sources, normalizing units, matching formulations, handling discontinued products and NDC changes — before any analysis can begin. Each harmonization step requires assumptions. Each assumption introduces potential error. And because the harmonization is performed internally, the assumptions are rarely documented in a way that allows external validation.
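To make one such assumption concrete, here is a minimal sketch in Python of the unit-normalization step in isolation. The source names, NDC, prices, and the units-per-package figure are hypothetical placeholders; in a real pipeline the conversion factor would come from NCPDP billing-unit data and each source's package-size fields.

```python
# A minimal sketch of unit normalization across sources. All values
# below are hypothetical; real pipelines derive conversion factors
# from NCPDP billing-unit files and package-size data.

# Raw prices as each source reports them, in that source's unit convention.
raw_prices = {
    "compendium_wac": {"ndc": "00093-1234-56", "price": 280.00, "unit": "package"},
    "cms_nadac":      {"ndc": "00093-1234-56", "price": 10.00,  "unit": "each"},
}

# Units per package for this NDC (e.g., 28 doses per inhaler).
UNITS_PER_PACKAGE = {"00093-1234-56": 28}

def normalize_to_billing_unit(record: dict) -> float:
    """Convert a reported price to a per-billing-unit price."""
    if record["unit"] == "each":
        return record["price"]
    if record["unit"] == "package":
        return record["price"] / UNITS_PER_PACKAGE[record["ndc"]]
    raise ValueError(f"Unknown unit convention: {record['unit']}")

for source, record in raw_prices.items():
    print(source, round(normalize_to_billing_unit(record), 2))
# compendium_wac 10.0
# cms_nadac 10.0
```

The output agrees only because the UNITS_PER_PACKAGE table is correct, and that table is exactly the kind of internally maintained, rarely documented assumption the surrounding text describes. Comparing the two raw prices without it would show a spurious twenty-eight-fold gap.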
The quality management approach that the Daily Remedy team described — scheduled internal audits designed to catch structural discrepancies before they propagate into published analysis — represents a level of data governance that is standard practice in financial services, rare in healthcare analytics, and almost unheard of in drug pricing specifically. Financial institutions subject their data pipelines to continuous validation because regulatory requirements and fiduciary obligations demand it. Drug pricing analytics platforms face no comparable regulatory requirement for data quality, and their users — payers, researchers, legislators — generally lack the technical capacity to audit the data they consume.
The NCPDP billing unit mismatch that produces a twenty-eight-fold error in per-unit pricing is an extreme example, but lesser discrepancies pervade the data ecosystem. A packaging change that alters the number of units per NDC can create a discontinuity in a time series if the historical data is not adjusted. A manufacturer’s reclassification of a product from branded to authorized generic can shift it from one pricing benchmark to another, changing its apparent cost without any change in the underlying product. An NDC that expires and is replaced by a new NDC for an identical product creates a gap in the data that must be bridged manually.
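The first of these discrepancies, the packaging-change discontinuity, is easy to see in a toy time series. The figures below are invented for illustration: an NDC moves from a 30-count to a 90-count package while the per-tablet price stays flat.

```python
# Hypothetical quarterly series showing the discontinuity a packaging
# change creates when historical data is not adjusted.

series = [
    ("2022Q3",  45.00, 30),   # (quarter, package price, tablets per package)
    ("2022Q4",  45.00, 30),
    ("2023Q1", 135.00, 90),   # packaging change: per-tablet price unchanged
    ("2023Q2", 135.00, 90),
]

for quarter, package_price, pack_size in series:
    per_tablet = package_price / pack_size
    print(quarter, f"package={package_price:.2f}", f"per_tablet={per_tablet:.2f}")
# The per-package column jumps threefold at 2023Q1 with no real price
# change; the per-tablet column, the adjusted view, stays flat at 1.50.
```

An analyst who trends the per-package column without the adjustment sees a 200 percent price increase that never happened.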
Each of these issues has a known solution. NDC crosswalks, historical adjustment factors, reclassification trackers — the tools exist. The question is whether they are applied consistently, maintained over time, and documented transparently. For proprietary analytics platforms, the answer depends on the vendor’s investment in data governance. For organizations building their own crosswalks, the answer depends on institutional resources and expertise. For academic researchers using publicly available data for a one-time study, the answer is almost certainly that the crosswalk was built ad hoc, used once, and discarded.
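As one illustration, a minimal NDC crosswalk can be expressed as an old-to-new mapping that is followed until a current code is reached. The NDCs below are placeholders, and real crosswalks also carry effective dates, replacement reasons, and unit-conversion factors.

```python
# A minimal sketch of an NDC crosswalk. NDCs are hypothetical; real
# crosswalk tables carry dates, reasons, and conversion factors.

NDC_CROSSWALK = {
    "00093-1234-56": "00093-1234-78",  # product relaunched under a new NDC
}

def resolve_ndc(ndc: str) -> str:
    """Follow crosswalk links until reaching the current NDC."""
    seen = set()
    while ndc in NDC_CROSSWALK and ndc not in seen:
        seen.add(ndc)  # guard against accidental cycles in the mapping
        ndc = NDC_CROSSWALK[ndc]
    return ndc

print(resolve_ndc("00093-1234-56"))  # 00093-1234-78
print(resolve_ndc("99999-0000-01"))  # unchanged: no crosswalk entry
```

The logic is trivial; the hard part, as the paragraph above notes, is keeping the mapping table complete and current year after year.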
The implication for decision-makers is uncomfortable: the drug pricing data that informs formulary decisions, reimbursement policies, cost-effectiveness analyses, and legislative debates carries an unquantified error rate that nobody measures and few acknowledge. The error is not random noise that averages out over large samples. It is systematic — driven by unit mismatches, scope limitations, lag effects, and harmonization assumptions that introduce directional bias. A decision made on systematically biased data is not improved by having more of it. It is improved by understanding the bias.
The Charm Economics response to the Daily Remedy inquiry pointed toward the practical standard: use NCPDP billing units as a normalization foundation, roll data up into course-of-treatment metrics using assumptions documented clearly enough for others to evaluate, and accept that the result is an approximation rather than a ground truth. This is sound advice. It is also an acknowledgment that the current state of drug pricing data infrastructure requires every serious user to become, in effect, their own data governance function — building, maintaining, and validating reconciliation processes that a shared public standard would render unnecessary.
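A rollup of that kind can be sketched as a single function whose inputs are the documented assumptions: a normalized per-billing-unit price, a dosing intensity, and a course length. The regimen figures here are illustrative, not clinical guidance, and the function name is my own.

```python
# A sketch of a course-of-treatment rollup, assuming the per-billing-unit
# price has already been normalized. Regimen parameters are illustrative.

def course_of_treatment_cost(price_per_unit: float,
                             units_per_day: float,
                             days_of_therapy: int) -> float:
    """Roll a per-billing-unit price up to the cost of a full course."""
    return price_per_unit * units_per_day * days_of_therapy

# Example: 10.00 per dose, 2 doses per day, 30-day course.
print(course_of_treatment_cost(10.00, 2, 30))  # 600.0
```

Keeping the assumptions as explicit, named parameters is the point: a reviewer can inspect, challenge, or replace each one, which is what documentation "clear enough to evaluate" requires in practice.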
The audit that catches a budesonide unit discrepancy is not the end of the story. It is the beginning of a question: what other discrepancies exist in the data, across how many drugs, affecting how many downstream analyses? The honest answer is that nobody knows, because systematic cross-dataset audits of drug pricing data are performed by a small number of organizations and disclosed by fewer still. The data looks clean on the surface. The assumptions beneath it are largely untested. And the decisions that rely on those assumptions — covering billions of dollars and affecting millions of patients — proceed with a confidence that the underlying data may not fully support.