Key Questions We Considered in Creating Surgeon Scorecard

July 14, 2015, 12:10 am

Update, Aug. 19, 2015: This FAQ has been updated to include question 21.

Experts say that a transparent, clear measure of surgeons’ complication rates is a crucial tool for patients and healthcare providers. But building a means of comparison that is both fair and easily understood is not simple. What follows is a detailed look at how we addressed some of the thorny issues that arose as we put together Surgeon Scorecard.

1. You measured surgeon performance by examining eight elective procedures. Why did you pick these operations?

The goal was to identify instances in which patients suffered harm for which surgeons could reasonably be held accountable. We focused on elective operations like knee and hip replacements because they are common and because the patients are typically in stable health. We excluded anyone whose records suggested they were suffering from trauma like a shattered hip. We did not include anyone transferred into the hospital from emergency rooms or other facilities like nursing homes. The procedures we focused on are done every day in hospitals across the country. They are: knee replacements, hip replacements, cervical (neck) spinal fusion, lumbar spinal fusion/posterior (performed on the back portion of the spine), lumbar spinal fusion/anterior (performed on the front portion of the spine), gallbladder removal, prostate resection, and prostate removal.

2. ProPublica’s analysis treats deaths and readmissions to a hospital within 30 days as a signal of surgeon quality. Is that fair?

Measuring readmissions within 30 days of a hospital stay is a widely accepted way to gauge patient safety. Medicare counts any return to the hospital within 30 days, regardless of cause. We found this measure overly broad because it included complications unrelated to the procedure. Instead, we asked two dozen doctors, including surgeons, to identify hospital readmissions that could be reasonably attributed to complications from surgery. We tried to be conservative in assembling our count of complications. For example, surgeons routinely treat minor post-surgical problems in office visits. These are not included in our analysis. As for deaths, we took a conservative approach and only included those that occurred in the hospital within the initial stay.

3. If an operation is the work of a team – surgeon, nurses and an anesthesiologist – how can you conclude from Medicare records that a patient’s readmission to the hospital was solely the surgeon’s fault?

We’re not assigning “fault,” but responsibility. There are undoubtedly complications counted in our data that were the fault of someone other than the surgeon. But experts we interviewed argued that surgeons can and should be held accountable for everything that happens to their patients, much as a captain in the U.S. Navy is treated if a ship runs aground while a junior officer is at the helm. This is also the view of the American College of Surgeons, which says in its statement of principles:

“To enhance patient safety, it is the responsibility of the surgeon to oversee proper preoperative preparation of the patient; obtain informed consent; confirm with the team the diagnosis and agreed-upon operation; perform the operation safely and competently, including planning with the anesthesia professional the optimal anesthesia method for the patient; provide postoperative care of the patient, including personal participation in the direction of this care and management of postoperative complications should they occur; and disclose information to the patient or patient’s representative relative to the conduct of the operation, operative and pathological findings, procedure forms, and the expected outcome.”

4. Why did you leave out complications that surfaced before the patient left the hospital?

The hospital billing codes one would use for such an analysis have been studied by scholars and found to be less reliable than the metrics we used for Surgeon Scorecard.

5. You adjusted the scores of each surgeon. Why did you do that and how does the adjustment work?

We know not all patients are the same. A surgeon who operates on unhealthy patients could have a higher complication and death rate than someone who treats healthier people. After screening out certain categories of high-risk and very sick patients, we adjusted each surgeon’s complication rate for patient health, age and other factors. We also tried to account for differences in hospitals’ quality to further focus on surgeons’ performance. Finally, we tried to incorporate the effects of good – or bad – luck. To do this, we used a well-regarded statistical technique known as a mixed effects model. You can read more about our methodology in this paper.

Our model produced a range of possible values for each surgeon’s complication rate. The number we’re reporting is near the middle of the high and low ends of the range. We designed our online presentation of each surgeon’s scorecard to make clear that while higher and lower values are possible, they are increasingly less probable as the numbers move up or down from the rate we’ve identified as most likely.

6. I looked up a surgeon who has zero actual complications listed but has an “adjusted” complication rate of more than zero. How can that be?

The model adjusts complication rates based in part on the number of operations a surgeon has done. To be as fair as possible, we made smaller adjustments to the complication rates of surgeons who’ve done hundreds of operations. That’s because high-volume surgeons have more extensive track records and random events are likely to play a lesser role in their outcomes. But our model assumes that chance affects every surgeon’s score. For that reason, those with the highest rates of complications are assumed to have had some bad breaks. And those with the lowest raw rates, even zero, are moved to at least 1.1 percent to account for the likelihood that these surgeons benefited from some good luck.
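For readers who want to see the idea in code, here is a minimal sketch of how an adjustment can shrink as a surgeon's case count grows. It is not the mixed effects model behind Surgeon Scorecard; it is a much simpler stand-in that blends a raw rate with an overall average. The function name, the 3 percent average rate and the weight of 50 operations are hypothetical values chosen only for illustration.

```python
def shrunken_rate(complications, operations, average_rate=0.03, prior_weight=50):
    """Blend a surgeon's raw complication rate with an overall average rate.

    average_rate and prior_weight are hypothetical illustration values,
    not figures from the Surgeon Scorecard analysis. The blend behaves as
    if every surgeon started with `prior_weight` imaginary operations
    performed at the average rate.
    """
    if operations < 0 or complications < 0 or complications > operations:
        raise ValueError("counts must satisfy 0 <= complications <= operations")
    return (complications + average_rate * prior_weight) / (operations + prior_weight)


# Two surgeons with zero raw complications but very different volumes.
low_volume = shrunken_rate(0, 25)     # few cases: stays closer to the average
high_volume = shrunken_rate(0, 500)   # many cases: moves closer to the raw rate of zero
print(f"low volume: {low_volume:.4f}, high volume: {high_volume:.4f}")
```

In this sketch both surgeons end up above zero, and the surgeon with the longer track record is adjusted less, which mirrors the behavior described in the answer above.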

7. You only looked at Medicare data. Why didn’t you calculate complication rates for all procedures each surgeon performed regardless of who was paying for them?

Medicare is the only source of nationwide data on readmissions after surgical procedures. A handful of states make public data that covers inpatient hospital stays for both public and private insurers. But those states differ widely in how they handle the data. In particular, some states do not provide data in a way that allows for identification of patients from one hospital visit to the next.

8. Medicare only pays for about 40 percent of hospital stays and mostly covers people 65 and older. Is that fair to surgeons who operate on many patients with private insurance?

This is a limitation of our data. Researchers consider Medicare data a reasonable if imperfect proxy for the health care system. Because they are older, Medicare patients tend to have more health problems. By focusing on elective operations and screening out more complicated cases, our sample should be reasonably comparable to non-Medicare patients. We believe that patients, hospital administrators and doctors are better off having information with limits than having no information at all.

9. Why aren’t you making the underlying data public?

We are prohibited from releasing certain data under the agreement we made with the federal agency that oversees Medicare. The data we used in our analysis includes identifiers for surgeons, but patients are anonymous. The Centers for Medicare and Medicaid Services’ data contract bars ProPublica from publishing information about a surgeon’s specific procedures when the doctor did fewer than 11 of them or had fewer than 11 complications. Agency officials said this is to protect patient privacy. The agency determined that publishing a statistically adjusted complication rate met the terms of our agreement, which also prohibits us from sharing our data with others.

10. As you point out, these complications are pretty rare events. How are you sure you are counting enough of them to have statistically significant results?

This is one reason we gathered five years of data, from 2009 through 2013. We do not report an adjusted complication rate if a surgeon performed fewer than 20 operations. On Surgeon Scorecard, we show a range in which each surgeon’s score is most likely to fall. Statisticians call this range a confidence interval. It’s the same concept you see in political polling, when there’s a margin of error of plus or minus some percent.

The less data we had on a particular doctor, the wider the interval. We explain in our methodology paper why we are 95 percent confident that the adjusted rates in Surgeon Scorecard fall within the intervals we show.
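The link between the amount of data and the width of the interval can be illustrated with a short sketch. It uses the Wilson score interval for a simple proportion, which is a standard textbook formula and an assumption made here for illustration; the intervals in Surgeon Scorecard come from the model described in the methodology paper, not from this calculation. The case counts below are made up.

```python
import math


def wilson_interval(complications, operations, z=1.96):
    """Approximate 95 percent interval for a raw rate (Wilson score interval)."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    p = complications / operations
    denom = 1 + z**2 / operations
    center = (p + z**2 / (2 * operations)) / denom
    half = z * math.sqrt(p * (1 - p) / operations + z**2 / (4 * operations**2)) / denom
    return center - half, center + half


def width(interval):
    """Distance between the low and high ends of an interval."""
    low, high = interval
    return high - low


# Same raw rate (5 percent), very different amounts of data.
small = wilson_interval(2, 40)
large = wilson_interval(20, 400)
print(f"40 operations:  width {width(small):.3f}")
print(f"400 operations: width {width(large):.3f}")
```

With the same raw rate, the surgeon with ten times as many operations gets a noticeably narrower interval.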

11. If the range of possible values for surgeon scores is wide, isn’t there some possibility that some surgeons with different rates in your online representation actually performed similarly?

Yes. We have categorized adjusted complication rates as low, medium and high. When the confidence interval around a surgeon’s rate straddles two categories, our online graphic shows it. There is a possibility that a surgeon whose adjusted complication rate is “high” might be equivalent to a doctor listed in the “medium” category. The further apart the doctors’ rates stand, the less probability there is of an overlap.

12. The surgeons with the highest adjusted complication rates in your data are still doing 90 percent or more of their procedures without serious complications. Isn’t that good?

Given the high stakes for patients, it’s important to look for any possibilities for improvement. And our data shows it’s possible to have lower complication rates. There are hundreds of high-volume surgeons in Medicare who did not have any readmissions for the complications we identified. Our reporting suggests these surgeons may be doing something differently from their colleagues.

13. Your story says that the choice of surgeon is more important than picking a top-rated hospital. Are you saying that the quality of the hospital has no effect on patients’ outcomes?

No. According to our analysis, the quality of hospitals has some effect on adjusted complication rates. It is just not as important as the work of surgeons. Your best choice remains picking an excellent doctor at an excellent hospital.

14. Looking at ProPublica’s staff list, I don’t see anyone with a doctorate in biostatistics. How did you vet this work?

The analysis was designed in conjunction with Sebastien Haneuse, a professor of biostatistics at the Harvard School of Public Health, who worked as a paid consultant for ProPublica on this project. We summarized our findings in a detailed document that we shared with leading scholars in the field, a process similar to the peer review required to publish an academic paper.

We consulted with experts at every step along the way. Three were exceptionally generous with their time and expertise. They were: Dr. Ashish Jha, professor at the Harvard School of Public Health; Dr. Karen Joynt, also a professor at the Harvard School of Public Health; and Dr. Marty Makary, a professor at Johns Hopkins University School of Medicine. We also spoke with dozens of frontline doctors and surgeons.

15. Aren’t you worried the complication rates will be misunderstood by members of the public?

Online ratings have become a fact of modern life. Sites rate everyone from doctors to plumbers to massage therapists. Consumers are becoming increasingly savvy about using the Internet to do research on important decisions. We don’t think it’s beyond their intellectual abilities to understand and use Surgeon Scorecard. This data is a useful starting point for a conversation between a prospective patient and surgeon. We also hope that publishing this information will encourage doctors and hospitals to take steps to lower their complication rates and improve patient safety. You can read our editors’ note for a fuller explanation of this point.

16. In most cases, patients who undergo a laparoscopic gallbladder removal these days don’t stay overnight in a hospital. Your data only includes cases in which patients stayed overnight (inpatient cases). Don’t such patients have a higher risk of complications?

We consulted with experts before deciding to include this procedure in Surgeon Scorecard because we shared similar concerns. They told us that an inpatient procedure did not necessarily mean there was a higher risk. We reviewed medical literature and found that readmission rates for outpatient and inpatient cases are very similar. The overall readmission rate for gallbladder cases in our data is in line with rates in those studies.

17. I’m a surgeon, and I’ve done more operations than are reported in this data. Why is this number so low?

There are a few limitations of the data from Medicare that might make the number of procedures seem low. First, we only have fee-for-service Medicare from 2009-2013. That means it doesn’t include any operations paid for by private insurance. Nor does it include patients in Medicare Advantage, which covers people who join health maintenance organizations, or HMOs. The latter limitation is why there are no entries for hospitals owned by Kaiser Permanente, one of the country’s biggest nonprofit health plans.

18. If the confidence intervals for two surgeons overlap, does it mean their complication rates are essentially the same? If so, how can anyone use this tool to compare them?

When two confidence intervals touch each other at the outer edges, it’s possible that the rates of complications are in fact the same, though this possibility is small. When the overlap is larger, as is frequent in our data, that potential increases. The statistical methods we used are designed to assure that a surgeon’s number is the most likely one for him or her, making such comparisons possible. For more on confidence intervals, see questions 10 and 11.
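A small sketch can make the idea of "touching" versus "larger" overlap concrete. The function and the example intervals below are hypothetical and are not drawn from Surgeon Scorecard data; the code only measures how much two ranges share, and says nothing on its own about how likely two surgeons' true rates are to be equal.

```python
def overlap(a, b):
    """Return the length shared by two (low, high) intervals, or 0.0 if none."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0.0, high - low)


# Hypothetical intervals, expressed as fractions (0.02 means 2 percent).
surgeon_a = (0.020, 0.040)
surgeon_b = (0.035, 0.060)   # overlaps surgeon_a near the edges
surgeon_c = (0.045, 0.070)   # does not overlap surgeon_a at all

print(overlap(surgeon_a, surgeon_b) > 0)
print(overlap(surgeon_a, surgeon_c) > 0)
```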

19. A doctor in your database no longer operates at a hospital you list.

Surgeon Scorecard reports surgeries done from 2009-2013. We report “Related Hospitals” in our interactive database for each surgeon based on where the operations were performed, not where the surgeon now works.

20. A surgeon in my town has the same Adjusted Complication Rate, procedure count and complication count at several hospitals. How can that be?

A surgeon’s complication rate is for all hospitals at which he or she operates and is not unique to a given hospital. We combined a surgeon’s data from all hospitals because having more cases creates greater statistical confidence in the complication rate.

21. What are ‘hospital misattributions’ and why do they matter?

The data in Surgeon Scorecard comes from billing claims that hospitals submit to Medicare. In addition to information about the surgery and a patient’s health condition, the claims include a code that identifies the operating physician. Some hospitals had reported that physicians with inappropriate specialties, and even some non-physicians, were the operating surgeon. We have removed these physicians from Scorecard. We also screened for hospitals with high levels of such misattributions. We removed surgeons from Scorecard if a hospital they worked at misattributed more than 5 percent of claims in one of the procedures we analyzed, or if that hospital misattributed more than 100 claims in our analysis overall. We are working with Medicare and hospitals to resolve misattributions in our next update.
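The screening rule can be written out as a short sketch. The two thresholds (5 percent within a procedure, 100 claims overall) come from the answer above, and the sketch assumes both apply to the hospital. The function name, the data layout and the example counts are hypothetical and are not the code used for Surgeon Scorecard.

```python
def hospital_is_flagged(claims_by_procedure, share_limit=0.05, count_limit=100):
    """Decide whether a hospital crosses either misattribution threshold.

    claims_by_procedure maps a procedure name to a tuple of
    (misattributed_claims, total_claims). This layout is an assumption
    made for illustration.
    """
    total_misattributed = 0
    for misattributed, total in claims_by_procedure.values():
        total_misattributed += misattributed
        # Rule 1: more than 5 percent of claims misattributed in any one procedure.
        if total > 0 and misattributed / total > share_limit:
            return True
    # Rule 2: more than 100 misattributed claims across all procedures.
    return total_misattributed > count_limit


# Hypothetical hospitals.
example_a = {"knee replacement": (3, 200), "hip replacement": (12, 150)}
example_b = {"knee replacement": (3, 200), "hip replacement": (2, 150)}

print(hospital_is_flagged(example_a))  # 12 of 150 is above 5 percent
print(hospital_is_flagged(example_b))  # neither threshold is crossed
```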

ProPublica

Series: Patient Safety: Exploring Quality of Care in the U.S.
