The All-Important Endpoint of a Medical Study
Medical studies are often hyped as "positive." Consumers of medical evidence must always be on alert for the endpoint measured. Not all positive results are the same.
We need to stop and think about the endpoints measured in a medical experiment.
In cardiology, we have long tested new treatments in randomized controlled trials. This means that patients with a specific problem, say, heart failure, get randomized to either the new drug or a placebo.
Both groups are treated with the most up-to-date medications. The question is whether the new drug adds benefit to the standard regimen.
To answer this question, scientists must choose an endpoint to measure. You’d like the endpoint to be important. Death, stroke, or quality of life are good endpoints. Blood tests and images on scans, not so much.
Heart Failure Endpoints in the Past
Heart failure used to be a highly lethal disease. When I was in medical school, an injured heart would stretch and dilate to compensate for a weakened area, usually the result of a heart attack. That stretching would eventually lead to a total failure to pump blood to the organs, and patients would die.
Then came the trials of renin-angiotensin system blockers (ACE inhibitors, then ARBs). These drugs worked by reducing the load the heart has to pump against when it ejects blood. Trials measured the ultimate endpoint: death. And boy did they work. Patients who took these drugs had much better survival.
Then came the discovery of beta-blockers—which block the effects of adrenaline. It seemed counter-intuitive, because you’d think a failing heart would need more kick from adrenaline, but it turned out that too much adrenaline led to worse outcomes. The beta-blocker trials again showed dramatic improvements in survival.
The third discovery came with a simple and old diuretic drug called spironolactone. It had special potassium-sparing properties, and in a trial called RALES, spironolactone reduced the death rate by about 30%.
Heart Failure Endpoints Now
The heart failure community is quite excited about a new class of medications for patients with heart failure due to either a weak heart muscle or a normal heart muscle. We call these two types of heart failure HFrEF (heart failure with reduced ejection fraction) and HFpEF (heart failure with preserved ejection fraction).
The problem for innovators now is that heart failure therapy has become so good that it’s hard to show any more benefit.
So… scientists changed the endpoints. And this is the key lesson of this newsletter.
I will use the EMPEROR-PRESERVED trial to illustrate my point. This trial randomized patients with heart failure, mostly with preserved function, to the SGLT2 inhibitor drug empagliflozin or placebo.
SGLT2 inhibitors are a new class of drugs that have been shown to improve cardiac outcomes in patients with diabetes and preserve kidney function in patients with chronic kidney disease. They have a variety of actions but one big one is to cause the kidneys to excrete glucose, and this leads to a diuretic effect.
Back to EMPEROR-PRESERVED. It’s a heart failure trial, so you might ask whether it measured survival as its main endpoint—like the previous heart failure trials.
The answer is no. The scientists who ran EMPEROR-PRESERVED chose as an endpoint a composite of either death due to cardiovascular causes or hospitalization due to heart failure. In fact, this endpoint, abbreviated CVD or HHF, has become the new standard in heart failure trials.
The trial found that empagliflozin reduced the composite endpoint by 21% compared with placebo. This easily met the threshold of statistical significance.
A win, right? Let’s start using the new drug!
Well, not so fast.
Always Look at the Components of a Composite Endpoint
The rate of death due to cardiac causes was nearly the same (7.3% vs 8.2%).
The 21% benefit in the composite primary endpoint was driven by a reduction in hospitalizations for heart failure: 259 patients (8.6%) in the empa group vs 352 (11.8%) in the placebo group. That 29% reduction easily met statistical significance.
That still sounds great, right?
Here is the problem: patients enrolled in this trial averaged 72 years of age and had multiple medical problems, many of which could cause hospitalizations.
A few lines down in the table of results is the rate of total hospitalizations. Here there were 2566 total hospitalizations in the empa arm vs 2769 in the placebo arm. That 7% reduction did not reach significance.
The existential problem with heart failure hospitalization as an endpoint is that it represents a mere fraction of total hospitalizations (10% in EMPEROR-PRESERVED).
Also relevant now is that death from any cause is no different—14.1% vs 14.3%.
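To make the breakdown concrete, here is a minimal sketch that recomputes the differences from the figures quoted above. It uses crude ratios of the reported percentages and counts, which is an assumption: the published estimates presumably come from a time-to-event analysis, so the crude numbers will land close to, but not exactly on, the reported ones. The function and variable names are illustrative only, and the heart failure share mixes the two kinds of counts as quoted in the text, so treat it as a rough ratio.

```python
def crude_relative_reduction(treated, control):
    """Crude relative reduction between two arms: 1 - treated/control."""
    return 1 - treated / control


# Event rates in percent as quoted in the text: (empagliflozin, placebo).
rates_pct = {
    "death due to cardiac causes": (7.3, 8.2),
    "hospitalization for heart failure": (8.6, 11.8),
    "death from any cause": (14.1, 14.3),
}

for name, (treated, control) in rates_pct.items():
    abs_diff = control - treated  # difference in percentage points
    rel = crude_relative_reduction(treated, control)
    print(f"{name}: {abs_diff:.1f} points absolute, {rel:.0%} crude relative")

# Counts as quoted in the text: (empagliflozin, placebo).
hf_hospitalized = (259, 352)
total_hospitalizations = (2566, 2769)

# Assumption: the two arms are roughly equal in size, so raw counts are comparable.
total_reduction = crude_relative_reduction(*total_hospitalizations)

# Rough share of all hospitalizations captured by the heart failure endpoint.
hf_share = sum(hf_hospitalized) / sum(total_hospitalizations)

print(f"total hospitalizations: {total_reduction:.0%} crude relative reduction")
print(f"heart failure share of all hospitalizations: {hf_share:.0%}")
```

Run as written, the sketch shows the pattern described above: a sizable relative drop in heart failure hospitalization, a much smaller one in total hospitalizations, and almost no difference in death from any cause.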
Here is a slide with all the key results:
Conclusion:
Consumers of medical evidence must always be aware of endpoints. In the past, the drugs for heart failure shredded death rates. Patients lived longer. These were (are) disease-altering drugs.
Now, though, the newest drugs on the block, the SGLT2 inhibitors, have statistically positive results in trials of patients with HFpEF. But instead of making people live longer, these drugs merely reduce one small fraction of hospitalizations. They don’t reduce death due to cardiac causes; they don’t extend survival; and they don’t much budge total hospital admissions.
If you are a 75-year-old person, you simply want to avoid any hospital admission.
There is great pressure to use these drugs. They are good drugs for patients with diabetes and chronic kidney disease, but as pure heart failure drugs in patients with preserved heart function (HFpEF), I don’t see a reason for enthusiasm, especially at the current high cost.
The take-home message is: always look at the endpoints. These will help you decide how to translate results to the bedside.
“Positive” trials come in many different varieties.
Super article, thank you.
The following fixes less than 1/2 of the problem but may still be worth pondering. Part of the problem comes from treating all hospitalizations alike, independent of (1) duration and (2) intensity of care needed. Restricting attention to duration, one can think of all-cause mortality and all-cause hospitalization this way: As a function of time since randomization, treatment, and baseline covariates, what is the probability that a patient will either be in the hospital or be dead? This can readily be analyzed with a state transition model that considers death as an absorbing state and (at home, in hospital alive, dead) as 3 ordinal outcome levels. Long or multiple hospitalizations will elevate the probability of being in the hospital on a given day post randomization. More info is at https://www.fharrell.com/talk/cmstat/
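As a rough illustration of the idea in the comment above, here is a minimal sketch of a three-state daily model (at home, in hospital, dead) with death as an absorbing state. It is a deliberate simplification and not the commenter's actual method: it uses a plain discrete-time Markov chain with constant daily transition probabilities, no baseline covariates, and no ordinal regression. Every number in the two matrices is hypothetical and chosen only to show the mechanics.

```python
STATES = ("home", "hospital", "dead")


def occupancy(transition, days, start=(1.0, 0.0, 0.0)):
    """Return state-occupancy probabilities for day 0 through day `days`.

    transition[i][j] is the daily probability of moving from state i to
    state j; each row must sum to 1. Everyone starts at home by default.
    """
    probs = [list(start)]
    for _ in range(days):
        prev = probs[-1]
        nxt = [sum(prev[i] * transition[i][j] for i in range(3)) for j in range(3)]
        probs.append(nxt)
    return probs


# Hypothetical daily transition probabilities (rows: from, columns: to).
placebo = [
    [0.9980, 0.0018, 0.0002],  # from home
    [0.1200, 0.8750, 0.0050],  # from hospital
    [0.0, 0.0, 1.0],           # dead is absorbing
]
treated = [
    [0.9983, 0.0015, 0.0002],  # slightly fewer admissions (hypothetical effect)
    [0.1200, 0.8750, 0.0050],
    [0.0, 0.0, 1.0],
]

for label, matrix in (("placebo", placebo), ("treated", treated)):
    final = occupancy(matrix, days=365)[-1]
    in_hospital_or_dead = final[1] + final[2]
    print(f"{label}: P(in hospital or dead on day 365) = {in_hospital_or_dead:.3f}")
```

Because the outcome is the state a patient occupies on each day, long or repeated hospital stays raise the in-hospital probability automatically, which is the property the comment highlights.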
From your description, John, it seems that recurrent hospitalizations were not counted and only the first hospitalization was used. This is problematic.