Ear, Nose & Throat Journal, 2023, Vol. 102(8) 498–503. © The Author(s) 2021. Article reuse guidelines: sagepub.com/journals-permissions. DOI: 10.1177/01455613211016702. journals.sagepub.com/home/ear
Objective: To determine whether surgeons can estimate thyroid operative time more accurately than a system-generated average time estimate. Methods: Four otolaryngologists at a single institution with extensive endocrine surgery experience were asked to predict their operative times for all eligible thyroid surgeries. These estimates were compared to system-generated operative time predictions based on averaging the surgeon’s previous 10 cases with the same Current Procedural Terminology code. The surgeon-generated estimations and system-generated estimations were then compared to each other and the actual operative time. Results: A final sample of 73 cases was used for all analyses. Average age was 51 years and the majority of patients were female. Surgeon-generated operative time estimates were significantly more accurate than system-generated estimates based on time averaging (P < .001). These findings were consistent across each surgeon individually and within each procedure type (hemithyroidectomy and total thyroidectomy). These findings had a power of over 99% based on mean differences. Conclusion: As the financial center of modern hospitals, an efficient operating room is integral to economic success. Improving the precision of operative time estimation reduces costly unplanned staff overtime, canceled cases, and underutilization. Our research at a rural tertiary care center shows that experienced thyroid surgeons can substantially reduce the error of estimating thyroid operative times by considering individual patient characteristics. Although no objective variables have so far been identified to correlate with thyroid operative time, surgeon-generated operative time estimation is significantly more accurate than a generic system approach of averaging previous operative times.
Keywords: surgical utilization, thyroid surgery, OR scheduling
As reimbursement and care models gradually switch to value-based models, hospitals and the health care industry are seeking ways to maintain high-quality care while minimizing costs wherever possible. The operating room (OR) is the undeniable financial center of modern hospitals, consuming an estimated 40% of hospital spending and accounting for 60% to 70% of revenue.1-5 It is estimated that more than 60% of admitted patients need some type of surgical intervention.6 The cost of one minute of OR time has been reported as anywhere between $22/min and $133/min, with an average of $36-$62/min.7-9
As such, the OR is a logical place to look for ways to make health care more efficient and cost-effective. Overestimated surgical times can lead to wasted OR time, wages paid to unnecessarily scheduled surgical staff, and fewer cases being scheduled on any given day.4 Consequently, there can be increased delays for patients who have been seen in clinic and are awaiting surgery. Conversely, underestimated surgical times can lead to cancelled or delayed surgery, unplanned overtime for staff, and dissatisfaction of surgeons, staff, and patients. In addition to immediate perioperative cost measures, inaccurate operative scheduling can affect downstream aspects such as underutilized postoperative bed occupancy.2,4,10-12 Optimizing OR scheduling is a high-yield starting point because it has implications for utilization, time management, resource allocation, and patient and staff satisfaction. Accurate operative time prediction is therefore integral to optimizing scheduling, but the process is unfortunately difficult and often inaccurate.2,13-15
Thyroid surgery is an attractive starting point for predicting operative times more accurately. The procedures are relatively standardized, commonly performed, and utilize common preoperative imaging techniques. Ultrasound (US) is widely used in the assessment of the thyroid and is considered standard of care, playing a large role in many thyroid cancer management guidelines.16,17 A recent large multicenter review of 3454 thyroidectomy patients showed that the specific surgeon and hospital had more influence over the operative time than any identified patient characteristics or the type of procedure planned.10 Surprisingly, this same study revealed that unexpected events or technical difficulties during surgery had a negligible impact on the duration of the case.10 Although obesity has been shown to affect surgical time across all common major surgeries, a 2019 study of 469 thyroidectomy patients found no significant difference in operative time between obese (body mass index ≥ 30 kg/m²) and nonobese patients.18,19 Based on this information, it is reasonable to conclude that individual surgeons could accurately predict their operative time.
The aim of this study was to evaluate thyroidectomy surgeries across a hospital system to determine the degree of error involved in an established prediction system based on time averaging and to compare this system to surgeon estimates.
In this prospective operational study, 4 otolaryngologists at a single institution with extensive endocrine surgery experience were asked to predict their operative times within 10 minutes for all eligible hemithyroidectomies and total thyroidectomies. The surgeons were asked to predict both the time from incision to specimen removal and the time from incision to skin closure. Inclusion criteria were total or hemithyroidectomies with preoperative US available. Pathology (including Graves’ disease or thyroiditis) was not directly evaluated but was considered in some cases by the surgeons making their estimates. Exclusion criteria included completion thyroidectomy, subtotal thyroidectomy, inclusion of any concurrent procedure such as neck dissection or parathyroid autotransplantation, and lack of a US completed within 2 years of the date of surgery. System-generated operative time predictions were generated through the established institutional method of averaging the operative times of the same surgeon’s previous 10 cases of the same Current Procedural Terminology (CPT) code. The most common CPT codes used were 60220 (“total thyroid lobectomy unilateral”) for hemithyroidectomies and 60240 (“thyroidectomy complete”) for total thyroidectomies. A resident was present in every case, and in some cases more than one resident was present. At the time estimations were made, attending surgeons were not aware of the number or postgraduate year of the residents who would be with them.
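The time-averaging rule described above can be illustrated with a short sketch. This is not the institution's scheduling software; the function name, the tuple layout of the case history, and the sample times are hypothetical and chosen only to show the calculation (mean of the same surgeon's previous 10 cases with the same CPT code).

```python
from statistics import mean

def system_estimate(history, surgeon, cpt_code, window=10):
    """Average the most recent `window` operative times (in minutes) for a
    given surgeon and CPT code.

    `history` is assumed to be a chronologically ordered list of
    (surgeon, cpt_code, minutes) tuples. Returns None when the surgeon has
    no prior cases with that code.
    """
    prior = [m for s, c, m in history if s == surgeon and c == cpt_code]
    recent = prior[-window:]  # keep only the latest `window` matching cases
    return mean(recent) if recent else None

# Hypothetical case log: 12 hemithyroidectomies by surgeon "A", plus
# unrelated cases that the rule should ignore.
history = [("A", "60220", t) for t in range(60, 180, 10)]
history += [("B", "60220", 300), ("A", "60240", 200)]

estimate = system_estimate(history, "A", "60220")
```

Because the rule looks only at surgeon and CPT code, two patients with very different anatomy receive the same estimate, which is the limitation the surgeon-generated estimates are meant to address.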
Power analyses were conducted by looking at the mean difference and standard deviation for the sample. Gland volume was calculated simply as rectangular volume using the dimensions measured on the patient’s preoperative US. To examine the relationship between estimated times and gland volume, Pearson bivariate correlation coefficients were assessed. To examine the relationship between the accuracy of the estimation and the gland volume, an estimation error variable was computed by looking at the difference between the estimated and actual times. The absolute estimation error was calculated by taking the absolute value of the estimation error term and correlations were run. To determine whether surgeon and system error differed significantly, estimation errors were compared using a paired t test.
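As a rough illustration of the calculations described above, the following sketch computes a rectangular gland volume, absolute estimation errors, a paired t statistic, and a Pearson correlation using only the Python standard library. The function names and the sample values are hypothetical and are not study data. The effect size shown (mean difference divided by the standard deviation of the differences) is one common convention for paired data and is an assumption here, since the exact formula used for the study is not stated.

```python
from math import sqrt
from statistics import mean, stdev

def rectangular_volume(length_cm, width_cm, depth_cm):
    """Gland volume approximated as a rectangular box (cm^3)."""
    return length_cm * width_cm * depth_cm

def absolute_errors(estimated, actual):
    """Absolute estimation error (minutes) for each case."""
    return [abs(e - a) for e, a in zip(estimated, actual)]

def paired_t_and_d(errors_a, errors_b):
    """Paired t statistic and a paired-data effect size (assumed convention:
    mean difference / SD of differences) for two matched error lists."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    t_stat = mean_diff / (sd_diff / sqrt(n))
    effect_size = mean_diff / sd_diff
    return t_stat, effect_size

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical example values (minutes and cm^3), not taken from the study.
actual = [120, 95, 150, 110, 130]
surgeon = [100, 90, 120, 105, 115]
system = [70, 60, 100, 95, 60]
volumes = [40, 25, 90, 35, 60]

surgeon_err = absolute_errors(surgeon, actual)
system_err = absolute_errors(system, actual)
t_stat, effect_size = paired_t_and_d(system_err, surgeon_err)
r_volume_error = pearson_r(volumes, surgeon_err)
```

A P value would additionally require the t distribution with n - 1 degrees of freedom, which is available in statistical packages but is omitted here to keep the sketch dependency-free.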
Surgeons were asked to estimate their operative times for a total of 77 procedures. Four cases were excluded from analysis due to significant and unexpected changes in surgical plan. In one case, the decision was made in the preoperative area to change the procedure from a hemithyroidectomy to a total thyroidectomy. Another case was excluded because the decision was made to proceed with subtotal thyroidectomy though it was scheduled for hemithyroidectomy. A third case was excluded because a frozen specimen was taken at the end of the case, significantly prolonging the operative time. The fourth and final case excluded was a hemithyroidectomy for a large goiter that required a second attending to scrub in. A final sample of 73 patients was used for all analyses. Average patient age was 51 years and the majority were female. The majority of procedures were hemithyroidectomies and were most often performed on the right side. The mean volume on preoperative US was 68 cm³. See Table 1 for full demographic results.
On average, surgeons were able to estimate the time from incision to gland removal within 28.01 minutes of real time. Similarly, surgeon estimations of cut to close had an average error of 28.11 minutes. Conversely, the system-generated estimations of time to gland removal had an average error of 67 minutes. The system-generated estimations of cut to close were more accurate than those of time to gland removal, with an average error of 49.09 minutes, though this was still less accurate than the surgeon-generated estimations (Table 2). Pearson bivariate correlations were examined to assess the relationship between surgical specimen volume and estimated times. There was not a significant correlation between gland volume and the surgeon-estimated times. There was also not a significant correlation between gland volume and the system-estimated total time or the actual surgery times. There was also no significant relationship between the surgeon and system estimation errors and gland volume. This finding remained consistent when examining the relationship within each surgeon and within each procedure type. Both the surgeons and the system were more likely to underestimate than to overestimate procedure times for both gland out and cut-to-close estimates.
Paired t tests were used to examine the difference between surgeon and system accuracy. Specifically, surgeon estimation error means were compared to system estimation error means. Surgeons were more accurate at estimating both gland out and cut-to-close times compared to the system (Table 2). Although the gland out effect size was larger (d = 1.03) than the cut-to-close effect size (d = 0.59), both effects had over 99% power based on the mean differences, the standard deviation of the mean differences, and the sample size of 73. When looking at the differences between surgeon and system estimates within surgeons, the same pattern of results was found. All surgeons were more accurate at estimating gland out and cut-to-close times compared to the system (Figure 1). With the smaller sample sizes within each surgeon, gland out effects had effect sizes ranging from 0.69 to 1.76 with power estimations ranging from 97% to over 99%. See Table 3 for more detailed effect sizes and power estimations.
When examining the differences between surgeon and system accuracy within procedure types, surgeons were more accurate for both hemithyroidectomy and total thyroidectomy procedures compared to the system (Figure 2). In hemithyroidectomy procedures, gland out effects resulted in a large effect size (d = 1.05) and over 99% power and cut-to-close effects were moderate (d = 0.65) with over 99% power estimated. Similar results were observed with total thyroidectomy procedures (gland out d = 0.9, power = 99%; cut-to-close d = 0.51, power = 79%).
Based on our study, surgeons can accurately predict the operative time for both hemithyroidectomies and total thyroidectomies. The results of our study show that surgeon-generated operative time estimates for both hemithyroidectomy and total thyroidectomy are more accurate than a system-generated estimate based on time averaging (Table 2, Figure 1). This finding was consistent for each surgeon individually and within each procedure type. Both surgeon- and system-generated estimations were more likely to underestimate operative times. Interestingly, this trend toward underestimation is the opposite of that observed in a different study of 116 599 cases (across multiple specialties) that found a statistically significant tendency for surgeons to overestimate operative time.4,14 This difference is likely attributable to the much smaller sample size of this study as well as this study’s focus on one specific type of surgery.
As anticipated given the relatively uniform process of closing, the effect size was lower for cut-to-close time (d = 0.59) compared to the effect size for the specimen out time (d = 1.03). This pattern remained consistent when hemithyroidectomies and total thyroidectomies were examined individually as well. The observed variation of closure time is likely attributable to the participation of at least one resident in every case at this teaching institution. Surgeons were not aware of the specific residents who would be assisting them or the residents’ level of experience at the time that they made their estimations. Furthermore, though an exact number was not recorded, more senior residents instructed junior residents on closing in several cases. Predictably, this added variable would make the closure time more difficult to anticipate prior to surgery. These findings had over 99% power (based on mean differences) for our sample size of 73 patients.
Although the surgeon-estimated cut-to-close time had an average error of 28 minutes compared to the true elapsed time, the error of the system-generated cut-to-close estimate was an average of 49 minutes (Table 2). This additional 20 minutes of error compared to the surgeon-generated estimates may initially seem inconsequential, but using the previously cited value of $62/min for OR time,7 this difference accounts for $1240. Although this is a gross estimation, it places a tangible value on the cost of these errors. Twenty minutes multiplied across several cases in one day can make the difference between the ability to schedule another case or extending into unplanned overtime.
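The cost figure above is simple arithmetic, sketched below for the two ends of the average per-minute cost range cited earlier ($36/min and $62/min). The function name is hypothetical, and the calculation is only as good as the per-minute values fed into it.

```python
def cost_of_error(extra_minutes, cost_per_minute):
    """Dollar value of additional estimation error at a given OR cost rate."""
    return extra_minutes * cost_per_minute

extra_minutes = 20  # approximate additional error of the system estimate
for rate in (36, 62):  # low and high ends of the cited average cost per minute
    print(f"${rate}/min -> ${cost_of_error(extra_minutes, rate)}")
# prints:
# $36/min -> $720
# $62/min -> $1240
```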
The system-generated estimate was incorrect by ≥60 minutes in 23% (23/73) of cases compared to 15% (11/73) in surgeon estimates. Although the surgeons were not able to successfully estimate times to within 10 minutes as hypothesized, they were less likely to make substantial errors in estimation. These substantial errors have obvious major effects on surgical scheduling; it only takes one case with such a substantial scheduling error to throw off a tightly booked OR day.
Surprisingly, while we had originally hypothesized that thyroid gland volume on preoperative US could be an objective variable used to predict operative time, we did not find gland volume to correlate with elapsed surgical time or surgeon estimates. As surgeons were not consistent in their estimation of time in relation to gland volume, it stands to reason that the surgeons more heavily considered other patient-specific characteristics for their estimations.
Unfortunately, due to the nature of this study requiring surgeon estimations, blinding was not possible. Although this naturally introduces a risk for bias, the intended real-world application of this information is solely the improvement of operative time estimation. To that end, surgeons will be acutely aware of their operative time and estimation accuracy, so any blinding would actually confound our results and further remove them from real-world applicability. Another potential limitation is the difference in sample size between surgeons. Although the sample size ranged from 8 to 31 cases per surgeon over the length of this study, all surgeons were individually more accurate than the system (Figure 1). Although this could be seen as a limitation, these consistent results across a range of case volumes also show that our results can likely be extended to real-world situations where there is a wide variety of caseloads between different endocrine surgeons. In addition, we did not directly evaluate pathology during our study. Therefore, our study may overlook disparities in OR time between inflammatory thyroid disease such as Graves’ disease and noninflammatory conditions. However, pathology is not considered in time averaging by CPT code, so consideration of pathology would likely only increase the accuracy of surgeon estimates compared to time averaging. Lastly, the characteristics of thyroid surgery that made it attractive for this study, including a standardized stepwise procedure, consistent preoperative workup, and relatively short duration, make its applicability to other fields and more complicated surgeries questionable.
Because the OR is the financial center of modern hospitals, the success of any hospital depends significantly on the efficiency of its OR. Estimating operative times is a difficult and complicated task, but accurate predictions are essential to efficient OR scheduling. By improving the precision of operative time estimates, it is possible to reduce costly unplanned staff overtime, cancelled cases, and underutilization. Although no objective variables have so far been identified to correlate with thyroid operative time, our research shows that by taking time to consider their individual patient characteristics, experienced thyroid surgeons can significantly reduce the error of estimating thyroid operative times. This patient-specific approach to operative time estimation is clearly more accurate than a generic system approach of averaging previous operative times.
Reprint requests may be directed to Kevin Stavrides, MD, at kpstavrides@geisinger.edu.
The authors would like to acknowledge Erin A. Vanenkevort, PhD, for her assistance with statistical analysis.
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Kevin P. Stavrides https://orcid.org/0000-0002-4586-1969
1 Department of Otolaryngology–Head and Neck/Facial Plastic Surgery, Geisinger Medical Center, Danville, PA, USA
Received: April 9, 2021; accepted: April 21, 2021
Corresponding Author: Kevin P. Stavrides, MD, Department of Otolaryngology–Head and Neck/Facial Plastic Surgery, Geisinger Medical Center, 100 North Academy Avenue, Danville, PA 17822, USA. Email: kpstavrides@geisinger.edu