Original research
Health economics evaluation of diagnostic strategies for gastro-oesophageal reflux disease with reflux symptoms in China: a modelling study
  1. Xiaxiao Yan1,
  2. Xiaoqing Li1,
  3. Yang Chen1,
  4. Meiduo Ouzhu2,
  5. Ziqi Guo3,
  6. Chengzhen Lyu1,
  7. Daiyu Yang3,
  8. Hongda Chen4,
  9. Feng Xie5,
  10. Dong Wu1,2
  1. Department of Gastroenterology, Peking Union Medical College Hospital, Beijing, China
  2. Department of Gastroenterology, Tibet Autonomous Region People’s Hospital, Lhasa, China
  3. Peking Union Medical College, Beijing, China
  4. Center for Prevention and Early Intervention, National Infrastructures for Translational Medicine, Institute of Clinical Medicine, Peking Union Medical College Hospital, Beijing, China
  5. Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada

Correspondence to Dr Dong Wu; wudong{at}pumch.cn

Abstract

Objectives The American College of Gastroenterology (ACG) guidelines and the Chinese expert consensus recommend different algorithmic approaches for the diagnosis of gastro-oesophageal reflux disease (GERD), and the preferred approach has not yet been defined. We compared the two recommended diagnostic processes using a Chinese population-based health economics analysis.

Methods Our analysis considered a hypothetical cohort of patients with typical reflux symptoms. We constructed a decision tree model to compare the two recommended diagnostic processes described in ACG clinical guidelines (stratified endoscopy strategy) and Chinese expert consensus (endoscopy-first strategy). The first strategy begins with hazard stratification based on alarm symptoms. Patients with alarm symptoms directly undergo endoscopic examination, while patients without alarm symptoms receive proton pump inhibitors as diagnostic treatment. In the second strategy, all patients with reflux symptoms complete an endoscopic examination. Sensitivity analysis was performed to evaluate the effect of a range of cost and probability estimates on costs and health outcomes over a 1-year time horizon from the healthcare system perspective.

Results The total expected costs were US$122.51 for the stratified endoscopy strategy and US$150.12 for the endoscopy-first strategy. The incremental cost-effectiveness ratio (ICER) comparing the endoscopy-first strategy with the stratified endoscopy strategy was US$440.39 per additional correct case of GERD. The rates of detecting upper gastrointestinal carcinoma of the two strategies were 0.0088 and 0.0120, and the ICER was US$8561.34.

Conclusions Endoscopy for all patients with reflux symptoms was more effective, but more costly, than the strategy recommended in international guidelines.

  • Health economics
  • Endoscopy
  • Clinical Decision-Making

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

  • Nationally representative data sources based on a particular population were used.

  • Sensitivity analysis was done to determine the uncertainty in the estimates.

  • Costs and outcomes related to treatment, survival and disability were not measured.

  • Regional differences among the Chinese population were not considered.

Introduction

Gastro-oesophageal reflux disease (GERD) is a condition in which reflux of gastric contents causes troublesome symptoms and complications. Although heartburn and regurgitation are considered typical symptoms associated with GERD, a broad spectrum of other symptoms includes dysphagia, chest pain, painful swallowing and extraoesophageal symptoms (eg, chronic cough, hoarseness, laryngitis, pharyngitis and pulmonary fibrosis).1 The estimated global prevalence of GERD is 13% and varies considerably by region and population. In mainland China, the overall pooled prevalence of GERD was 8.7% and showed an increasing trend.2 3 Considering the large population size of China, effective screening and management strategies for GERD are needed.

The diagnosis of GERD is commonly based on the combination of symptoms, endoscopic findings, reflux monitoring and therapeutic response.1 American College of Gastroenterology (ACG) clinical guidelines recommend starting with a proton pump inhibitor (PPI) among patients with typical symptoms.4 For patients with alarm symptoms (such as dysphagia, weight loss, bleeding, vomiting and/or anaemia) or risk factors for Barrett’s oesophagus, endoscopy is strongly recommended as the first step for evaluating oesophageal mucosa. In contrast, the Chinese expert consensus recommends endoscopy for all patients with reflux symptoms at the initial diagnosis.5 The rationale is based on the fact that China is a country with a high incidence of upper gastrointestinal tumours and readily available gastroscopy at a low cost.6–8 Early endoscopic examination is beneficial for tumour screening and assessment of disease status. A meta-analysis found that the tumour detection rate of endoscopy in patients with upper gastrointestinal symptoms at the initial consultation in Asia was 1.3%.6 A study in Guangzhou reported a detection rate of 0.8% for oesophageal and gastric cancers in patients with initial heartburn without alarm symptoms.7 However, no study provides economic evidence for this strategy in China. Therefore, we compared the two recommended diagnostic processes described above using a Chinese population-based health economics analysis.

Methods

Our analysis considered a hypothetical cohort of patients with typical reflux symptoms (heartburn and regurgitation) in China. Our decision tree model incorporated base-case estimates of the most likely clinical scenarios and then used sensitivity analysis to evaluate the effect of a range of cost and probability estimates on costs and health outcomes over a 1-year time horizon from the healthcare system perspective. All analyses were performed using TreeAge Pro 2022 software.

Decision model

The decision model considered two strategies representing different diagnostic processes in international or Chinese guideline recommendations (figure 1).

Figure 1

Decision tree model for cost-effectiveness analysis. □, decision nodes; ○, chance nodes; ⊳, terminal nodes; BE, Barrett’s oesophagus; CA, carcinoma; GERD, gastro-oesophageal reflux disease; NERD, non-erosive reflux disease; PPI, proton pump inhibitors; PUD, peptic ulcer disease; RE, reflux oesophagitis.

As recommended in international guidelines, the first strategy begins with hazard stratification based on alarm symptoms, including dysphagia, weight loss, gastrointestinal bleeding and persistent vomiting (stratified endoscopy strategy). Patients without alarm symptoms are considered at low risk of malignancy and receive PPI as diagnostic treatment. Ineffective PPI therapy is an indication for sequential invasive testing using endoscopy and oesophageal reflux monitoring. Patients with alarm symptoms directly undergo endoscopic examination, followed by a biopsy for suspected lesions. If no positive endoscopic results are found, a PPI test and oesophageal reflux monitoring are performed as next-step testing. Patients with reflux oesophagitis (RE) or Barrett’s oesophagus (BE) confirmed by endoscopic biopsy, a positive PPI response or reflux evidence from oesophageal monitoring are diagnosed as having GERD in this strategy. Peptic ulcer disease (PUD) and upper gastrointestinal carcinoma (CA) can also be detected during endoscopic examination. If endoscopy, PPI test and reflux monitoring are all negative, GERD is excluded.

The second strategy, the endoscopy-first strategy, is based on an expert consensus in China. All patients first complete an endoscopic examination. The subsequent assessment algorithm is the same as that in the first strategy.

Biopsy is considered the gold standard for differentiating upper gastrointestinal lesions found at endoscopy, and oesophageal reflux monitoring is considered the gold standard for pathological reflux. We did not consider the potential side effects of PPIs, complications of diagnostic procedures or the impact of the diagnosis on quality of life or the subsequent utilisation of healthcare resources.
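
As an illustration of how a decision tree of this kind is rolled back, the following minimal Python sketch computes the expected cost and expected effectiveness of a heavily simplified tree. It is not the TreeAge model used in this study: the class names, the tree shape and all branch probabilities and costs are hypothetical placeholders, with the exception of the 7.8% share of patients with alarm symptoms, which is reported under Disease prevalence.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Terminal:
    cost: float    # cumulative cost along this path (US$)
    effect: float  # 1 = correct GERD diagnosis, 0 = any other outcome


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]  # (probability, child node)


Node = Union[Terminal, Chance]


def rollback(node: Node) -> Tuple[float, float]:
    """Return (expected cost, expected effect) of a subtree."""
    if isinstance(node, Terminal):
        return node.cost, node.effect
    total = sum(p for p, _ in node.branches)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    cost = effect = 0.0
    for p, child in node.branches:
        child_cost, child_effect = rollback(child)
        cost += p * child_cost
        effect += p * child_effect
    return cost, effect


P_ALARM = 0.078        # share of patients with alarm symptoms (from the text)
COST_ENDOSCOPY = 50.0  # hypothetical unit cost, US$
COST_PPI = 10.0        # hypothetical unit cost, US$

# Hypothetical, heavily simplified stratified endoscopy strategy.
stratified = Chance([
    (P_ALARM, Chance([                 # alarm symptoms: endoscopy first
        (0.2, Terminal(COST_ENDOSCOPY, 1)),
        (0.8, Terminal(COST_ENDOSCOPY, 0)),
    ])),
    (1 - P_ALARM, Chance([             # no alarm symptoms: PPI test first
        (0.6, Terminal(COST_PPI, 1)),
        (0.4, Terminal(COST_PPI + COST_ENDOSCOPY, 0)),
    ])),
])

expected_cost, expected_effect = rollback(stratified)
print(f"expected cost: US${expected_cost:.2f}, expected effect: {expected_effect:.4f}")
```

Each terminal node carries the cumulative cost of its path, so the rollback is a probability-weighted average taken from the leaves back to the root. An endoscopy-first tree would be built the same way and rolled back with the same function.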

Clinical inputs and transition probabilities

Preference was given to the most recent studies based on the Chinese population. When multiple studies reported different values for the same parameter, the maximum and minimum values, the 95% CI or the baseline ±20% (if data were insufficient) were used as the value range. For unavailable parameters, data were obtained through expert consultation or taken from relevant studies in other countries. All input parameters are listed in table 1.

Table 1

Model parameters

Disease prevalence

Bai et al conducted a large-scale retrospective analysis in a single tertiary medical centre and demonstrated the symptomatic profile of patients undergoing upper endoscopy.9 A total of 15 431 patients had regurgitation or heartburn, and 1204 had alarm symptoms (7.8%). Common endoscopic lesions included RE, PUD and BE, while CA was rarely detected. In patients with reflux symptoms but no alarm symptoms, the proportions of RE, PUD and CA were 25.8%, 12.7% and 0.7%, respectively.10 However, no study has separately characterised endoscopic performance in patients with reflux and alarm symptoms. The results from all alarm symptom populations (12.5% RE, 17.9% PUD and 7.7% CA under endoscopy) were used to estimate these parameters in our model.10 For all patients with reflux symptoms, the proportions were calculated using the following formula:

Probability of certain lesion in all patients = probability of certain lesion in patients with alarm symptoms × probability of alarm symptoms + probability of certain lesion in patients without alarm symptoms × (1 - probability of alarm symptoms).

The detection rate of BE has rarely been investigated, and an approximate baseline value was obtained from a meta-analysis (total endoscopic detection rate 1.0%, 95% CI 0.1% to 1.8%).11 The proportion of patients without clinically significant endoscopic findings in this model was calculated as one minus the sum of the proportions of the other lesions.
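
The weighting formula and the "one minus the sum" rule can be sketched in a few lines of Python. The lesion proportions and the alarm-symptom share are those quoted in this section. The variable and function names are illustrative only, and applying the pooled BE detection rate of 1.0% to all patients is an assumption made for this sketch.

```python
P_ALARM = 0.078  # share of reflux patients with alarm symptoms (1204 of 15 431)

# Endoscopic findings quoted in the text: (with alarm symptoms, without alarm symptoms)
LESIONS = {
    "RE": (0.125, 0.258),
    "PUD": (0.179, 0.127),
    "CA": (0.077, 0.007),
}
P_BE = 0.010  # assumption: pooled BE detection rate applied to all patients


def pooled_probability(p_with_alarm, p_without_alarm, p_alarm=P_ALARM):
    """Probability of a lesion in all patients, weighted by alarm-symptom status."""
    return p_with_alarm * p_alarm + p_without_alarm * (1 - p_alarm)


all_patients = {name: pooled_probability(*pair) for name, pair in LESIONS.items()}
all_patients["BE"] = P_BE
# Remainder: patients without clinically significant endoscopic findings.
all_patients["no significant finding"] = 1 - sum(all_patients.values())

for name, p in all_patients.items():
    print(f"{name}: {p:.4f}")
```

Under these inputs the pooled CA proportion is about 0.012, which is in line with the CA detection rate reported for the endoscopy-first strategy.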

Diagnostic test characteristics

The response rate of PPI over 2–8 weeks in patients with reflux symptoms ranged from 54.1% to 63.9%.12–16 We chose the result of an RCT evaluating esomeprazole as the baseline.14 The PPI test’s pooled sensitivity, specificity and positive predictive value from a previous meta-analysis were 0.52, 0.32 and 0.38, respectively.17 Oesophageal reflux monitoring was once considered the ‘gold standard’ in many diagnostic test accuracy studies (DTAs) and guidelines. However, the diagnostic performance in Chinese patients was limited, and results varied widely.18–21 Wang et al retrospectively investigated 177 patients with typical reflux symptoms who received oesophageal function tests, and 122 of them had an acid exposure time (AET) >4%. In patients who did not respond to PPI, 50.0% had AET >4%. In patients without positive endoscopic findings, 65.9% had AET >4%.18
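
For readers less familiar with test characteristics, the sketch below shows how a positive predictive value follows from sensitivity, specificity and disease prevalence by Bayes' rule. The sensitivity and specificity are the pooled values quoted above. The prevalence values are hypothetical and are not taken from the cited meta-analysis, so the output is illustrative only.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.52  # pooled PPI test sensitivity quoted in the text
SPECIFICITY = 0.32  # pooled PPI test specificity quoted in the text

for prevalence in (0.30, 0.45, 0.60):  # hypothetical GERD prevalences
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {ppv:.3f}")
```

The point of the sketch is that the predictive value of a positive PPI test depends strongly on how common GERD is in the tested population.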

Cost

All costs were converted to US$ using published exchange rates. Only direct healthcare costs were considered. Costs for drugs and endoscopic and diagnostic procedures were based on drug and medical service pricing at Peking Union Medical College Hospital. There was no time discounting of future costs and health outcomes as the time horizon of the model did not exceed 1 year.

Base-case analysis

The base-case analysis estimated the incremental cost-effectiveness ratio (ICER) of the endoscopy-first strategy compared with the stratified endoscopy strategy, expressed as the incremental cost per additional correct diagnosis of GERD. As the primary outcome measure for effectiveness, a correct diagnosis of GERD (including biopsy-confirmed RE and BE, NERD confirmed by reflux monitoring and true positive results in the PPI test) was assigned a value of 1. When the final diagnosis was incorrect (false positive) or was determined to be PUD, CA or another disorder, we assigned a value of 0. We also evaluated the incremental cost per additional detection of upper gastrointestinal CA (biopsy-confirmed CA was assigned a value of 1, while other results were assigned 0). The results of the cost-effectiveness analysis are reported descriptively only, since there is no accepted willingness-to-pay (WTP) threshold for these ICERs.
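
The ICER definition can be written as a short function. The sketch below applies it to the rounded expected costs and effectiveness values reported in the Results. The function and variable names are illustrative, and this is not the TreeAge calculation itself.

```python
def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Incremental cost per additional unit of effect (new vs reference)."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ZeroDivisionError("equal effectiveness: ICER is undefined")
    return (cost_new - cost_ref) / delta_effect


# Rounded values as reported in the Results (US$ and proportions).
COST = {"stratified": 122.51, "endoscopy_first": 150.12}
GERD_CORRECT = {"stratified": 0.45, "endoscopy_first": 0.52}
CA_DETECTED = {"stratified": 0.0088, "endoscopy_first": 0.0120}

icer_gerd = icer(COST["endoscopy_first"], COST["stratified"],
                 GERD_CORRECT["endoscopy_first"], GERD_CORRECT["stratified"])
icer_ca = icer(COST["endoscopy_first"], COST["stratified"],
               CA_DETECTED["endoscopy_first"], CA_DETECTED["stratified"])

print(f"ICER per additional correct GERD diagnosis: US${icer_gerd:.2f}")
print(f"ICER per additional CA detected: US${icer_ca:.2f}")
```

With these rounded inputs the sketch gives roughly US$394 and US$8628, close to but not identical with the reported US$440.39 and US$8561.34. The gap is presumably due to rounding of the published effectiveness values, since the model works with unrounded probabilities.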

Sensitivity analysis

To evaluate the robustness of the results of the decision tree analyses, we explored uncertain parameters across broad ranges using one-way sensitivity analysis. Each parameter was varied within its value range to explore the potential factors affecting the optimal strategy, and the results are shown in tornado diagrams.
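
A one-way sensitivity analysis of this kind can be sketched generically: each parameter in turn is set to the low and high end of its range while all others stay at baseline, and the parameters are then ordered by the spread of the result, as in a tornado diagram. The helper below is a minimal illustration. The toy model and every number in it are hypothetical and do not reproduce the study model.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def one_way_sensitivity(
    model: Callable[[Params], float],
    base: Params,
    ranges: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float, float]]:
    """Vary one parameter at a time; return rows sorted by spread (widest first)."""
    rows = []
    for name, (low, high) in ranges.items():
        at_low = model({**base, name: low})    # all other parameters at baseline
        at_high = model({**base, name: high})
        rows.append((name, at_low, at_high))
    rows.sort(key=lambda row: abs(row[2] - row[1]), reverse=True)
    return rows


def toy_icer(p: Params) -> float:
    """Hypothetical toy model: extra endoscopy cost per extra correct diagnosis."""
    return p["extra_endoscopy_share"] * p["cost_endoscopy"] / p["delta_effect"]


base = {"extra_endoscopy_share": 0.5, "cost_endoscopy": 50.0, "delta_effect": 0.05}
ranges = {
    "cost_endoscopy": (40.0, 60.0),  # hypothetical baseline +/- 20%
    "delta_effect": (0.04, 0.10),    # hypothetical range
}

rows = one_way_sensitivity(toy_icer, base, ranges)
for name, at_low, at_high in rows:
    print(f"{name:<16} ICER from {at_low:.2f} to {at_high:.2f}")
```

Sorting by spread puts the most influential parameter at the top, which is the ordering a tornado diagram displays.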

Patient and public involvement

No patients were involved in the development of the research question or its outcome measures, the conduct of the research or the preparation of the manuscript.

Results

Base case analysis

The results of our base case analysis are presented in table 2. The total expected costs were US$122.51 for the stratified endoscopy strategy and US$150.12 for the endoscopy-first strategy. The rate of correct diagnosis of GERD was 0.45 for the stratified endoscopy strategy and 0.52 for the endoscopy-first strategy. The ICER comparing the endoscopy-first strategy with the stratified endoscopy strategy was US$440.39 per additional correct case of GERD. The rates of detecting upper gastrointestinal CA with the two strategies were 0.0088 and 0.0120, respectively, and the corresponding ICER was US$8561.34 per additional CA detected. In the stratified endoscopy strategy, 47.4% of patients underwent endoscopy and 25.8% completed reflux monitoring. In the endoscopy-first strategy, where all patients underwent endoscopy, 25.7% needed reflux monitoring.

Table 2

Base-case analysis

One-way sensitivity analyses

The one-way sensitivity analysis related to the GERD diagnosis is shown in figure 2. The most sensitive parameters were the probability of RE in patients without alarm symptoms, the probability of true positives in the PPI test, the probability of RE in all patients, the cost of endoscopy and the probability of patients with alarm symptoms. When the probability of RE in patients without alarm symptoms varied from 0.206 to 0.410, the ICER would range from 324.78 to 1190.42; when the probability of true positives in the PPI test varied from 0.300 to 0.490, the ICER would range from 348.07 to 693.18; when the probability of RE in all patients varied from 0.227 to 0.298, the ICER would range from 580.22 to 254.93; when the cost of endoscopy varied from 45.103 to 67.654, the ICER would range from 345.72 to 535.06; when the probability of patients with alarm symptoms varied from 0.062 to 0.270, the ICER would range from 455.33 to 291.56 (online supplemental table S1).
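
The ordering of bars in a tornado diagram follows directly from the width of each ICER range. The short sketch below ranks the five ranges quoted above by width. The labels and function name are illustrative, and the numbers are copied from the text rather than recomputed from the model.

```python
BASE_ICER = 440.39  # US$ per additional correct GERD diagnosis (base case)

# ICER at the low and high end of each parameter range, as quoted in the text.
REPORTED_ICER_RANGES = {
    "P(RE | no alarm symptoms)": (324.78, 1190.42),
    "P(true positive in PPI test)": (348.07, 693.18),
    "P(RE | all patients)": (580.22, 254.93),
    "Cost of endoscopy": (345.72, 535.06),
    "P(alarm symptoms)": (455.33, 291.56),
}


def tornado_order(ranges):
    """Return (parameter, range width) pairs, widest range first."""
    widths = [(name, abs(high - low)) for name, (low, high) in ranges.items()]
    return sorted(widths, key=lambda item: item[1], reverse=True)


ordered = tornado_order(REPORTED_ICER_RANGES)
for name, width in ordered:
    print(f"{name:<30} range width US${width:.2f}")
```

Ranked this way, the widest bar belongs to the probability of RE in patients without alarm symptoms, and the order matches the order in which the parameters are listed in the text.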

Figure 2

Tornado diagram of ICER. BE, Barrett’s oesophagus; CA, carcinoma; PPI, proton pump inhibitors; PUD, peptic ulcer disease; RE, reflux oesophagitis.

Discussion

There is an increasing trend of GERD globally, as well as in the Chinese population. However, the diagnostic processes still vary in different regions of the world.2 4 5 The endoscopy-first strategy used in China was more effective but also more expensive than the stratified endoscopy strategy recommended by international guidelines.

The Chinese expert consensus that prioritises the recommendation of endoscopy is based on two main facts, the first of which is the risk of malignant lesions.5 Upper gastrointestinal tract cancer (UGIC), including oesophageal cancer (EC) and gastric cancer (GC), is prevalent in China.22 In 2020, UGIC accounted for 11.38% and 15.97% of all new incident cases and deaths from malignant tumours in China.23 Endoscopic screening can reduce the incidence and mortality associated with UGIC.24–27 Multiple economic evaluation studies from different countries indicated that endoscopic screening was cost-effective compared with no screening.23 28–32 Xia et al constructed a Markov model to evaluate the cost-effectiveness of endoscopic screening strategies for UGIC among people aged 40–69 years in areas of China where the risk of these cancers is high.33 Combined endoscopic screening for EC and GC may be cost-effective, and screening every 2 years would be optimal. The use of endoscopy is common in China. According to the national gastrointestinal endoscopy census in 2020, from 2012 to 2019, the number of medical institutions conducting gastrointestinal endoscopies increased from 6128 to 7470; the number of practitioners had a growth rate of 51.27%, and a total of 38,730,000 cases of gastrointestinal endoscopy were carried out nationwide in 2019, representing an increase of 34.62% from 2012.

When we focused on diagnosing GERD or CA in this model, the endoscopy-first strategy showed increased effectiveness and higher costs. The use of alarm symptom stratification avoided endoscopy in more than half of all patients, while the need for expensive reflux monitoring was comparable between the two strategies. Moreover, we noted that the proportion of CA in the reflux symptomatic population does not correlate with the traditionally high prevalence of upper gastrointestinal CA in China. In addition, chronic inflammation caused by GERD is one of the most critical risk factors for oesophageal adenocarcinoma, while squamous carcinoma accounts for more than 80% of cases in China.34 Therefore, the significance of reflux symptoms alone in suggesting upper gastrointestinal malignancies in the Chinese population still needs to be supported by large-scale studies.

According to the one-way sensitivity analysis, the five factors with the greatest effect on the baseline ICER were the probability of RE in patients without alarm symptoms, the probability of true positives in the PPI test, the probability of RE in all patients, the cost of endoscopy and the probability of patients with alarm symptoms. Based on the literature search results, alarm symptoms are commonly used exclusion criteria when investigating endoscopic manifestations. RE is the most common lesion observed under endoscopy in patients with reflux symptoms, and the range of its probability was obtained from different single-centre research data covering Guangdong, Shanghai, Beijing and Xinjiang.7 10 19 35 36 However, given the differences between regions and age groups in China, the characterisation of upper digestive tract lesions detected by endoscopy still requires more well-planned epidemiological investigations. For the diagnostic accuracy of the PPI test, the pooled results and their 95% CIs from a meta-analysis were used as baseline and range.17 However, these results are not specific to the Chinese population. The price of endoscopy is another critical point. We used the pricing of endoscopy in Beijing hospitals as the basis, with a 20% upward and downward fluctuation as the range according to expert consultation. Real-world pricing is bound to be more complex, influenced by region, hospital grade and health insurance policy. The main difference between the two strategies compared in this decision tree model is risk stratification according to the presence or absence of alarm symptoms. Therefore, the proportion of patients with alarm symptoms in the tested population has a marked impact on the model results. When the probability of alarm symptoms increased, more subjects directly entered the endoscopy arm and the difference between the two strategies decreased, with the ICER showing a decreasing trend. A probabilistic sensitivity analysis would provide a more comprehensive understanding of the uncertainty in our model. However, due to the lack of a recognised WTP threshold, we were unable to conduct a probabilistic sensitivity analysis at this stage. Although one-way sensitivity analyses may not capture the full range of uncertainty, we believe they offer preliminary insights and highlight potential directions for future research. We hope that as more data and methodological support become available, probabilistic sensitivity analysis can be incorporated in subsequent studies.

In 1999, Ofman et al compared the clinical and economic outcomes of an empirical trial of omeprazole and the traditional invasive strategy for diagnosing GERD as the cause of non-cardiac chest pain.37 Results showed that the omeprazole test was associated with reduced costs and improved diagnostic certainty, providing a simple, cost-effective choice for common disorders in primary care settings. However, no cost-effectiveness studies have compared different diagnostic strategies in patients with typical reflux symptoms. Compared with non-cardiac chest pain, reflux symptoms suggest different differential diagnoses and have a different significance in predicting malignancy, thus affecting patient treatment choices and outcomes. The stratified endoscopy strategy in this model used alarm symptoms as the rationale for hazard stratification. Additional factors that identify a high risk of malignant lesions, including region, family history, dietary habits and Helicobacter pylori infection, could potentially be included in further hazard stratification. Accurate risk stratification helps to highlight the value of endoscopy for precise screening and definite diagnosis rather than crude primary screening.

This study had some limitations. One of the significant limitations is the 1-year time horizon. The study did not measure the costs and outcomes related to treatment, survival and disability. Cost-effectiveness was not measured in terms of cost per disability-adjusted life year averted, which is a more robust measure of cost-effectiveness. Moreover, our model is structured on several assumptions and parameter estimates. Parameter estimates were extracted from multiple sources with different evidence quality. Considering that prevalence also varies considerably across regions of China and across age groups, these results are bound to change when prevalence rates from other populations are applied. More epidemiological findings based on Chinese populations are urgently needed as a basis for further health economic analysis. While our decision tree model offers a systematic approach for selecting GERD diagnostic strategies, it is important to acknowledge that this remains a model-based study. Potential gaps may exist between the theoretical framework and real-world clinical practice, including variations in patient populations, healthcare settings and resource availability. Future studies should aim to validate and refine this model using large-scale, real-world data to assess its practicality and generalisability. Such efforts would strengthen the evidence base for optimal strategy selection and provide more robust recommendations tailored to diverse clinical scenarios.

Conclusion

This study provides economic evidence for the expert consensus on GERD in China. Endoscopy for all patients with reflux symptoms was more effective, but more costly, than the strategy recommended in international guidelines. Diagnosing GERD while ruling out malignant lesions in the very large population with reflux symptoms in China still requires more targeted, higher-quality endoscopy strategies that depend on the regional spectrum of diseases and the accessibility of medical resources.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

Footnotes

  • X @fengxie_mac

  • XY and XL contributed equally.

  • Contributors XY and DW designed the study. XY, XL, YC, MO, ZG, CL and DY contributed to the conduct of the study. XY and XL collected the data. XY and XL performed statistical analysis and wrote the draft of the manuscript. HC and FX provided advice on the research protocol, statistical analysis plan and data analyses. HC, FX and DW reviewed and edited the manuscript. All authors read and approved the final manuscript. DW is responsible for the overall content as guarantor.

  • Funding National Natural Science Foundation of China (81970476) and National Key Clinical Specialty Construction Project (ZK108000).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.