Lessons from a Superb Clinical Trial
The LAAOS III trial teaches many important lessons on the interpretation of clinical science
I write often about flawed studies. Mistakes are effective teaching tools.
But good studies can also be instructive. This letter describes one of the best studies I have seen in some time.
What follows is a basic framework for appraising any trial. It requires little content expertise. Do not be intimidated.
The LAAOS III trial studied whether closure of the left atrial appendage at the time of heart surgery reduced the odds of having a future stroke. NEJM published the paper:
Background:
Humans have seemingly useless appendages in the right and left atrium of our hearts. Previous studies have shown that clots can form in the nooks and crannies of the left atrial appendage—especially in patients with atrial fibrillation (AF). Clots can then break off and cause stroke. (Figure C below)
Regular readers of Stop and Think know that efforts to plug the appendage with devices placed via the leg have failed to show benefit.
Surgical closure is different. Surgeons can cut the appendage off and/or clip it with a device that is never exposed to blood. (We call this epicardial closure.)
Before LAAOS III, nonrandomized studies suggested appendage closure at the time of surgery reduced stroke. But comparing groups without randomization is problematic because the purported benefits could have been due to surgeons choosing to close the appendage in healthier patients. (We call this selection bias.)
The LAAOS III trial:
Surgeon Richard Whitlock from McMaster University led the trial of patients with AF who were going to have heart surgery for other reasons, such as bypass or valve replacement. Whitlock recruited centers with experienced surgeons and instructed them on the proper technique of appendage closure, which had been worked out in previous studies.
Before any patient was recruited, trialists chose stroke or systemic embolus (a clot that went somewhere besides the brain) as the primary endpoint. This is crucial: picking the endpoint before the experiment keeps investigators from choosing whichever outcome looks best after the results are in.
About 2400 patients were randomized to the closure group and about 2400 to the no-closure group.
After a follow-up of nearly 4 years, stroke or systemic embolism occurred in 114 participants (4.8%) in the occlusion group and in 168 (7.0%) in the no-occlusion group.
That is an absolute risk reduction of 2.2 percentage points. The hazard ratio was 0.67, and the 95% confidence interval spanned from 0.53 to 0.85. Roughly stated, if the trial were repeated many times, about 95% of the intervals calculated this way would contain the true effect. Here, the interval runs from a 47% reduction to a 15% reduction in stroke.
The incidence of adverse effects, such as bleeding or heart failure, did not differ between the groups. Appendage closure resulted in only 6 extra minutes on the heart-lung machine.
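The arithmetic behind those numbers is simple enough to check yourself. Here is a minimal sketch in Python that uses only the two event rates quoted above. The number needed to treat (NNT) is not reported in this letter; it is a standard quantity derived here for illustration, and the function names are my own.

```python
# Event rates quoted in this letter (percent of each group with
# stroke or systemic embolism over the follow-up period).
rate_closure = 4.8      # occlusion group
rate_no_closure = 7.0   # no-occlusion group


def absolute_risk_reduction(control_pct, treated_pct):
    """Difference in event rates, in percentage points."""
    return control_pct - treated_pct


def number_needed_to_treat(arr_pct):
    """Patients who must be treated to prevent one event (100 / ARR in points)."""
    return 100.0 / arr_pct


def crude_relative_risk_reduction(control_pct, treated_pct):
    """Relative reduction from raw proportions; ignores time-to-event."""
    return (control_pct - treated_pct) / control_pct


arr = absolute_risk_reduction(rate_no_closure, rate_closure)
nnt = number_needed_to_treat(arr)
rrr = crude_relative_risk_reduction(rate_no_closure, rate_closure)

print(f"ARR: {arr:.1f} percentage points")
print(f"NNT: about {nnt:.0f}")
print(f"Crude RRR: {rrr:.0%}")
```

The crude relative reduction from raw proportions (about 31%) is close to, but not the same as, the 33% implied by the hazard ratio, because a hazard ratio accounts for when events happen, not just how many.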
Six Reasons LAAOS III was exemplary:
Internal Validity: (Trial conduct)
LAAOS III used government funding. Industry funding ought not be considered nefarious, but it ought to be considered as a potential source of bias.
LAAOS III measured an important “hard” outcome. Some trials measure surrogate outcomes, such as blood pressure, glucose levels or sizes of a tumor on a CT scan. A stroke is not a surrogate; it is a terrible outcome that we try to prevent.
Patients and caregivers were blinded to the treatment assignment. This is crucial because it reduces the odds of performance bias. In many procedure trials, blinding is impossible, and proponents of a procedure may (unintentionally) give special care to the patients in the procedure arm.
The trial had minimal loss to follow-up. That’s important because patients may not go missing at random.
External Validity:
The external validity of a trial refers to the applicability of the results to real-world practice. Recall that many trials screen oodles of patients to recruit a small fraction of patients. The CASTLE-AF trial of AF ablation vs medical therapy in patients with heart failure screened 3000 patients but ended up with only 360 patients in the trial.
LAAOS III enrolled a typical group of patients having heart surgery. The average age of patients was 71. Enrolled patients were having types of heart surgeries done every day.
Compared with closure of the appendage with a device, surgical closure is simpler and less dependent on training.
LAAOS III, therefore, has extreme generalizability to everyday practice.
Strength of the Evidence:
The strength of the evidence refers to the possibility of the results occurring by chance—like getting 4 heads (or tails) in a row.
I look at two things: the confidence intervals surrounding the effect size and the p-value.
In LAAOS III, there was a 33% reduction in stroke (HR = 0.67), and the 95% confidence interval ran from 0.53 to 0.85. That means appendage closure could plausibly have reduced stroke by as much as 47% or as little as 15%. That’s darn strong.
The way I think about the p-value is as a test of the null hypothesis. In this case, if appendage closure had no effect on stroke, the chance we would have seen this 33% reduction in stroke (or something even more extreme) is 0.001, or about 1 in 1,000.
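Converting a hazard ratio into a percent reduction is just one minus the ratio. A short Python sketch, using only the figures quoted above (the function name is my own), makes the translation explicit:

```python
def percent_reduction(hazard_ratio):
    """Relative reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0


# Hazard ratio and 95% confidence limits quoted in this letter.
hr, ci_low, ci_high = 0.67, 0.53, 0.85

point = percent_reduction(hr)       # central estimate
best = percent_reduction(ci_low)    # most favorable end of the interval
worst = percent_reduction(ci_high)  # least favorable end of the interval

# A ratio of 1.0 means "no effect"; the whole interval sits below it.
excludes_no_effect = ci_high < 1.0

print(f"Point estimate: {point:.0f}% reduction")
print(f"Plausible range: {worst:.0f}% to {best:.0f}% reduction")
print(f"Interval excludes no effect: {excludes_no_effect}")
```

The key check is the last one: even the least favorable end of the interval is still a reduction.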
Translation: these results were statistically robust and not likely to be due to chance.
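For readers who want to see where numbers like these come from, here is a rough Python sketch. It computes the coin-flip example, then runs a crude two-proportion z-test on the event counts. Two loud assumptions: the group size of 2400 is the rounded figure from this letter, not the exact enrollment, and the z-test is a simple stand-in of my own choosing, not the time-to-event analysis the trialists used. Treat the result as a sanity check, not a reproduction.

```python
import math


def prob_same_face_in_a_row(n_flips):
    """Chance that n fair coin flips all land on the same face."""
    return 2 * 0.5 ** n_flips


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    return z, math.erfc(abs(z) / math.sqrt(2))


# ASSUMPTION: 2400 per group is the rounded figure given in this letter.
N_PER_GROUP = 2400

z, p = two_proportion_p_value(114, N_PER_GROUP, 168, N_PER_GROUP)

print(f"4 heads or 4 tails in a row: {prob_same_face_in_a_row(4):.3f}")
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Even this crude check lands in the neighborhood of 1 in 1,000, far rarer than four matching coin flips.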
Clinical Importance:
A statistically strong result does not equate to a clinically important one. Big data studies can find highly significant p-values for meaningless reductions in an outcome.
In LAAOS III, the 33% relative reduction in stroke translated to a 2.2% absolute risk reduction (ARR). There were 114 stroke events in the LAAO arm and 168 in the no LAAO arm.
In the slide below, I contrast that degree of reduction with other big cardiology trials. You can see that it is a little less than some but a lot more than others. For instance, AF ablation is commonly done but confers no risk reduction in hard outcomes.
Notable also in LAAOS III is that the stroke reducing effect seems to increase over time. This is viewed in the next slide.
Adverse Effects and Cost:
Medical/surgical treatments always come with downsides. Cancer chemotherapy, for instance, can be life preserving but it can also be difficult to take because of side effects. Statin drugs reduce future heart attacks but they require taking a pill every day. (We call that pill disutility.)
Left atrial appendage closure as an add-on when the surgeon is already there added only 6 minutes to bypass time and caused no extra bleeding or re-operations.
It was a low-burden, low-cost therapy that allows for 100% compliance and no disutility.
Translation to Practice:
Every trial requires wise application to practice. Two big points to make about LAAOS III:
LAAOS III studied appendage closure in addition to standard care, including anticoagulant drugs. It does not mean that we can substitute appendage closure for these drugs. That would require a second trial.
LAAOS III studied surgical closure from the outside of the atrium. Surgeons leave no foreign body inside the atrium. This differs greatly from the increasingly popular percutaneous technique using devices such as Watchman. This trial should not increase enthusiasm for Watchman-type closure of the appendage.
I hope that this exercise fosters understanding of the approach to clinical evidence. You can use this framework for any clinical trial in any field.
Love this content!
Thanks for this very interesting piece - it spurred me to look at the pdf presentation of the results by the trial team.
However, the pdf's joyfully positive spin on the results made me wish for a more sober analysis, particularly of the post-surgery events in the two groups.
For example, since the 4800 patients were spread out over so many countries, were there any country patterns or skews detected regarding the events? Were there any patterns regarding patients' past medical histories? Since the death rates were identical in the two groups, were the actual causes of death also largely identical? Were the average ages at death the same? Were there any skews regarding deaths in different countries?
Still, I saw a comment by Dr. Gani who wondered whether the identical mortality rates didn't tend to reduce the clinical relevance of the trial, unless it was demonstrated that quality of life was measurably improved. This seems a reasonable question regarding any intervention in elderly patients.
Didn't read the whole NEJM study though, so maybe some of these questions were addressed.