Building and validating trend-based multiple sclerosis case definitions: a population-based cohort study for Manitoba, Canada

Naomi C Hamm; Ruth Ann Marrie; Depeng Jiang; Pourang Irani; Lisa Lix

doi:10.1136/bmjopen-2023-083141

Epidemiology

Original research

Naomi C Hamm¹ (ORCID: http://orcid.org/0000-0002-8646-8995),
Ruth Ann Marrie¹,² (ORCID: http://orcid.org/0000-0002-1855-5595),
Depeng Jiang¹ (ORCID: http://orcid.org/0000-0002-7060-4892),
Pourang Irani³,
Lisa Lix¹

¹Department of Community Health Sciences, University of Manitoba, Max Rady College of Medicine, Winnipeg, Manitoba, Canada
²Department of Internal Medicine, University of Manitoba, Max Rady College of Medicine, Winnipeg, Manitoba, Canada
³Department of Computer Science, Mathematics, Physics and Statistics, University of British Columbia, Kelowna, British Columbia, Canada

Correspondence to Naomi C Hamm; lettn{at}myumanitoba.ca

Abstract

Objective This study aims to (1) build and validate model-based case definitions for multiple sclerosis (MS) that use trends (ie, trend-based case definitions) and (2) apply dynamic classification to identify the average number of data years needed for classification (ie, average trend needed).

Design Retrospective cohort study design.

Participants 608 MS cases and 59 620 MS non-cases.

Setting Data from 1 April 2004 to 31 March 2022 were obtained from the Manitoba Population Research Data Repository. MS case status was ascertained from homecare records and linked to health data. Trend-based case definitions were constructed using multivariate generalised linear mixed models applied to annual numbers of general and specialist physician visits, hospitalisations and MS healthcare contacts or medication dispensations. Dynamic classification, which ascertains cases and non-cases annually, was used to estimate mean classification time. Classification accuracy performance measures, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), proportion correctly classified (PCC) and F1-scores, were compared for trend-based case definitions and a deterministic case definition of 3+ MS healthcare contacts or medication dispensations.

Results When applied to the full study period, classification accuracy performance measure estimates for all case definitions exceeded 0.90, except sensitivity and PPV for the trend-based dynamic case definition (0.88, 0.64, respectively). PCC was high for all case definitions (0.94–0.99); F1-scores were lower for the trend-based case definitions compared with the deterministic case definition (0.74–0.93 vs 0.96). Dynamic classification identified 5 years as the average trend needed. When applied to the average trend windows, accuracy estimates for trend-based case definitions were lower than the estimates from the full study period (sensitivity: 0.77–0.89; specificity: 0.90–0.97; PPV: 0.54–0.81; NPV: 0.97–0.99; F1-score: 0.64–0.84). Accuracy estimates for the deterministic case definition remained high, except sensitivity (0.42–0.80). F1-score was variable (0.59–0.89).

Conclusions Trend-based and deterministic case definition classifications were similar to a population-based clinician assessment reference standard for multiple measures of classification accuracy. However, accuracy estimates for both trend-based and deterministic case definitions varied as the years of data used for classification were reduced. Dynamic classification appears to be a viable option for identifying the average trend needed for trend-based case definitions.

Multiple sclerosis
Chronic Disease
EPIDEMIOLOGY
STATISTICS & RESEARCH METHODS

Data availability statement

Data may be obtained from a third party and are not publicly available. Data used in this article were derived from administrative health data as secondary use. The data were provided to the investigators under specific data sharing agreements only for approved use at Manitoba Centre for Health Policy (MCHP). The original source data are not owned by the researchers or MCHP and as such cannot be provided to a public repository. The original data source and approval for use have been noted in the Ethics section of the article. Where necessary, source data specific to this article or project may be reviewed at MCHP with the consent of the original data providers, along with the required privacy and ethical review bodies.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

This population-based cohort study assessed the performance of trend-based case definitions using data from 2004 to 2022 with a near complete capture of healthcare use trends via Manitoba’s universal healthcare system.
Variations in case definition performance are identified based on the number of data years used for classification, providing applicable insights into the future application of these case definitions.
Dynamic classification provides an empirical approach to identify the average number of data years needed for classification.
The reference standard for multiple sclerosis is based on clinical assessments obtained from homecare records, which may limit generalisability.

Introduction

Administrative health data, such as physician billing claims and hospital discharge abstracts, are widely used for population-based chronic disease research and surveillance. Disease cases are identified from the data using case definitions (ie, algorithms) with either deterministic or model-based (eg, probabilistic) rules.1–3 Accurate identification of disease cases can be challenging, as administrative health data were not originally collected for these purposes.4 Episodic diseases, such as multiple sclerosis (MS), that have periods of remission and relapse can be particularly difficult to accurately identify in administrative health data compared to chronic diseases that are not episodic in nature.5 6 MS is a disease of the central nervous system that can lead to physical and cognitive disability over time. Canada has among the highest prevalence of MS in the world,7–9 and MS is, therefore, a disease of significant interest in national surveillance initiatives.

The choice of a model-based or deterministic case definition depends on several factors, including characteristics of the disease of interest. Model-based case definitions, which rely on statistical or machine learning models to estimate case probabilities,10–15 often have better accuracy for disease identification than deterministic case definitions,12 13 15 which are based on a fixed number and type of observations often occurring within a defined time interval to identify cases (eg, one or more hospital separation records, or five or more physician claims within 2 years). Both approaches primarily rely on cross-sectional data and do not take the temporal characteristics of an individual’s health history into account. Identifying episodic disease cases from administrative health data may benefit from using healthcare use trends (ie, longitudinal data), rather than summing an individual’s healthcare history at a single time point. Studies that have used longitudinal data to predict health status have primarily relied on electronic health records or clinical data.16–22 One study used administrative health data to detect juvenile arthritis in children16; however, observation began at birth, which is often not feasible for adult populations.

Case definitions that use healthcare use trends (ie, trend-based case definitions) may have variable performance as the number of data years used for classification or identification changes. Moreover, the number of data years required for accurate classification may vary across individuals. Therefore, identifying the average number of years needed for accurate case ascertainment (ie, average trend needed) is a critical first step when using healthcare use trends to identify episodic disease cases within administrative health data. Dynamic classification aims to minimise observation time19 20 and is a potential approach for identifying the average trend needed. Individuals are classified using probability intervals and estimated interval limits are updated at regular time points.20 If the predetermined classification cut-off value falls outside an individual’s estimated interval, the individual is classified; otherwise, observation continues. Therefore, classification occurs throughout the observation period and only when enough data are present to make a ‘confident classification’ (ie, full probability interval either above or below classification cut-off value), allowing observation time to vary across individuals.

It is unclear how the accuracy of trend-based case definitions for episodic diseases compares to the accuracy of deterministic case definitions currently used in research and surveillance. In addition, the application of dynamic classification to identify the average trend needed for classification has not yet been tested. Our study purpose was to build and validate model-based case definitions for an episodic disease, MS, that use trends (ie, trend-based case definitions) and apply dynamic classification to identify the average number of data years needed for classification (ie, average trend needed). The objectives were to (1) assess classification performance of trend-based case definitions and a previously validated deterministic case definition using MS status obtained from homecare records as a reference standard; (2) identify the average trend needed for MS case ascertainment using dynamic classification and (3) compare the classification performance of trend-based and deterministic case definitions over time using the average trend needed.

Methods

Study design

A retrospective cohort study design was used; the study period was from 1 April 2004 to 31 March 2022. Case definitions were initially applied to data from the full study period. The study period was then split into 5-year windows (fiscal year used) based on the average trend needed as identified by the dynamic classifier (average trend windows; 1 April 2004–31 March 2009; 1 April 2009–31 March 2014; 1 April 2014–31 March 2019; 1 April 2019–31 March 2022).

Patient and public involvement

None.

Data source

Data were obtained from the Manitoba Population Research Data Repository housed at the Manitoba Centre for Health Policy, University of Manitoba. Manitoba has a universal healthcare system and captures all publicly insured healthcare contacts for its 1.3 million residents. The Manitoba Health Insurance Registry, Hospital Discharge Abstracts, Medical Claims/Medical Services, Drug Program Information Network, Canada Census and Home Care Assessment databases were used. Data on health insurance coverage dates, birth date, sex and postal code were obtained from the Manitoba Health Insurance Registry; the registry was also used to conduct individual-level linkage of databases. The Hospital Discharge Abstracts (5-digit International Classification of Diseases Codes (ICD)-10-Canadian version (CA) codes), Medical Claims/Medical Services (3-digit ICD-9-Clinical Modification (CM) codes) and Drug Program Information Network (Anatomical Therapeutic Chemical (ATC) codes) databases were used to obtain information on healthcare visits and prescription medications from community pharmacies. The Canada Census was used to obtain area-level income quintile based on postal code and average household income.23 The Home Care Assessment database was used to construct the study cohort and provide a reference standard for identifying MS cases and non-cases. This database captures data on home care assessments, utilisation and health status for all individuals receiving homecare delivered by the Winnipeg Regional Health Authority. The Winnipeg Regional Health Authority is the largest health authority in the province and serves approximately 60% of Manitoba’s population.

Reference standard

The reference standard for MS status was based on the interRAI assessment obtained from the Homecare Assessment database. The interRAI assessment is an internationally recognised tool that assesses an individual’s health and functioning and has been shown to have high sensitivity (0.90) and specificity (1.00) for identifying individuals with MS within the homecare setting.24 Assessments are completed by a clinician based on patient interviews and review of medical information24; within Manitoba, an assessment is required for homecare access. Indication of MS was determined from a checklist of conditions; those with MS checked were considered cases and those without were considered non-cases. If multiple assessments per individual were available, the status of the majority of assessments was used. Therefore, individuals were assigned the case status that had the strongest evidence of true MS status. Where the numbers of MS case/non-case assessments were equal, individuals were excluded. Overall, the proportion of individuals that had conflicting MS statuses in the data was low (0.002% of cohort).
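
The majority rule described above can be illustrated with a minimal sketch. The function name and the boolean input format are hypothetical and chosen only for illustration; the source does not describe how the rule was implemented.

```python
from collections import Counter
from typing import Optional, Sequence


def reference_status(ms_flags: Sequence[bool]) -> Optional[str]:
    """Assign a reference-standard status from repeated assessments.

    ms_flags holds one boolean per assessment (True = MS checked on the
    condition checklist). Returns "case", "non-case", or None when the
    individual would be excluded (no assessments, or an equal number of
    MS and non-MS assessments).
    """
    if not ms_flags:
        return None
    counts = Counter(bool(flag) for flag in ms_flags)
    n_ms, n_not = counts[True], counts[False]
    if n_ms == n_not:
        return None  # tie: excluded from the cohort
    return "case" if n_ms > n_not else "non-case"
```

For example, an individual with two assessments indicating MS and one not indicating MS would be treated as a case, whereas one assessment of each kind would lead to exclusion.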

Study cohort

Individuals were included in the study if they had one or more assessments in the Homecare Assessment database during the study period. Cohort entry was defined as the start of the study period (1 April 2004) or start of healthcare coverage, whichever was later. Cohort inclusion criteria were a valid MS assessment field (ie, a response from an assessment that had been signed by a physician), linkage to the Manitoba Health Insurance Registry, at least 730 days of continuous healthcare coverage between cohort entry and assessment date and at least 20 years of age at assessment date. 20 years was chosen as the minimum cut-off age as this is the age used by the Public Health Agency of Canada for MS surveillance.25

Study variables

Four healthcare use variables were used to build longitudinal case definitions: the number of general physician (ie, family physician) visits, the number of specialist physician visits, at least one inpatient hospitalisation for any reason and at least one MS healthcare contact (ie, physician visit or hospitalisation with an MS diagnosis code or an MS-specific prescription medication claim). These measures were constructed for each year in the study period. The number of general physician visits and the number of specialist physician visits were capped at 92 visits per year (ie, 1 visit every 4 days; <1% of cohort affected). Specialist physician visits encompassed any specialty because the MS population has a higher co-occurrence of multiple health conditions than the general population.26 27 Neurologist visits in the Manitoba MS Clinic were not captured in administrative data between 2000 and 2010 due to a lack of shadow billing for alternate-funded physicians and were, therefore, excluded from specialist visits. MS diagnosis codes were ICD-9-CM 340 and ICD-10-CA G35.28 ATC codes for MS-specific prescription medications are reported in online supplemental table S1.29 Demographic variables included age at cohort entry, sex, income quintile and years of healthcare coverage during the study period.
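
As a rough illustration of how the four annual measures could be derived for one person-year, consider the sketch below. The record layout (dicts with "type" and "ms" keys) is a hypothetical simplification and not the structure of the Repository data; the exclusion of MS Clinic neurologist visits is not handled here.

```python
from typing import Dict, List

VISIT_CAP = 92  # cap stated in the text: 1 visit every 4 days


def annual_use_variables(records: List[Dict]) -> Dict[str, int]:
    """Summarise one person-year of healthcare records.

    Each record is a hypothetical dict with keys:
      "type": "gp", "specialist", "hospital" or "drug"
      "ms":   True when the record carries an MS diagnosis code or is an
              MS-specific medication dispensation
    """
    gp = sum(1 for r in records if r["type"] == "gp")
    spec = sum(1 for r in records if r["type"] == "specialist")
    hosp = any(r["type"] == "hospital" for r in records)
    ms = any(r.get("ms", False) for r in records)
    return {
        "gp_visits": min(gp, VISIT_CAP),            # count, capped
        "specialist_visits": min(spec, VISIT_CAP),  # count, capped
        "any_hospitalisation": int(hosp),           # binary indicator
        "any_ms_contact": int(ms),                  # binary indicator
    }
```

Counts are capped and the two remaining measures are reduced to yes/no indicators, matching the count and binary outcome types described above.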

Case definitions

Three types of case definitions were applied to the data: Trend-based with dynamic classification, where classification was based on credible intervals (CrI) calculated annually (trend-based dynamic; more details on the dynamic classification scheme are found in the Statistical Analysis section); trend-based with static classification, where classification was based on a single probability point estimate calculated from data over the full study period (trend-based static) and a previously validated deterministic case definition (3 or more MS contacts over the full study period).28 30 The trend-based case definitions were built using group-specific multivariate generalised linear mixed models (ie, separate models for cases and non-cases). A full description of the models and methods used to calculate case probability and corresponding CrIs can be found in online supplemental material and in Hughes et al.20 Outcome variables included the four healthcare use variables (general physician visits, specialist physician visits, hospitalisation and MS contacts). Count outcome variables (general and specialist physician visits) were modelled with a log link function and binary outcome variables (hospitalisation and MS contacts) were modelled with a logit link function. Model covariates included time (time=0 for the first year in the study period), sex, age at cohort entry (continuous) and income quintile (quintiles 1–3=low income; 4–5=high income); all covariates were binary or continuous and no transformations were used. All models included a random intercept. Based on univariate analyses (online supplemental table S2), general visits and specialist visits were modelled with a random time slope; hospitalisations were modelled assuming a fixed time slope. MS-specific contacts were modelled with a random intercept and no covariates due to their sparse numbers in the non-cases. The same outcome variables and covariates were used in all models (ie, case and non-case models).
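
In symbols, the model structure described above might be written as follows. This is only a sketch: the notation (Y, x, β, b and the indices g, r, i, t) is introduced here for illustration and is an assumption, and the authoritative specification is the one given in the online supplemental material and in Hughes et al.20

```latex
% g: group (case or non-case); r: outcome; i: individual; t: year (t = 0 in year 1).
% x_{it} collects the covariates: time, sex, age at cohort entry, income group.

% General and specialist physician visits (counts), log link,
% random intercept and random time slope:
\log \mathrm{E}\left[ Y_{rit} \mid \mathbf{b}_{i}, g \right]
  = \mathbf{x}_{it}^{\top} \boldsymbol{\beta}_{gr} + b_{0ri} + b_{1ri}\, t

% Hospitalisation (binary), logit link, random intercept, fixed time slope:
\operatorname{logit} \Pr\left( Y_{rit} = 1 \mid \mathbf{b}_{i}, g \right)
  = \mathbf{x}_{it}^{\top} \boldsymbol{\beta}_{gr} + b_{0ri}

% MS contact (binary), logit link, random intercept, no covariates:
\operatorname{logit} \Pr\left( Y_{rit} = 1 \mid \mathbf{b}_{i}, g \right)
  = \beta_{0gr} + b_{0ri}
```

Under this reading, each outcome has its own coefficients within each group, and the random effects of one individual are linked across the four outcomes, which is what makes the model multivariate.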

Individual group probabilities were estimated by using Bayesian methods approximated via Markov chain Monte Carlo (MCMC) sampling with a burn-in of 500 iterations and a thinning rate of 100.20 Trace plots were used to determine burn-in rates and autocorrelation plots were used to assess thinning rates.31 Model convergence was assessed using trace plots, Gelman-Rubin-Brooks plots and the Gelman-Rubin diagnostic; when the trace plots for all coefficients had strong overlap and the Gelman-Rubin diagnostic was <1.1, the model was considered sufficiently converged.31 32 Trace, density, Gelman-Rubin-Brooks and autocorrelation plots can be found in online supplemental figures S1 and S2.
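
To make the convergence criterion concrete, the sketch below computes the basic Gelman-Rubin potential scale reduction factor for a single parameter from several chains, after a simple burn-in and thinning step. It is a textbook version written for illustration, not the routine used in the study, and the way the MCMC software applies burn-in and thinning internally may differ.

```python
from statistics import mean, variance
from typing import List, Sequence


def thin_chain(draws: Sequence[float], burn_in: int = 500, thin: int = 100) -> List[float]:
    """Drop the first burn_in draws, then keep every thin-th draw."""
    return list(draws[burn_in::thin])


def gelman_rubin(chains: Sequence[Sequence[float]]) -> float:
    """Potential scale reduction factor for one parameter.

    chains: two or more chains of equal length (at least 2 draws each),
    with burn-in already removed. Assumes the within-chain variance is
    not zero.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)    # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return (pooled / within) ** 0.5
```

Values close to 1 indicate that the chains agree; under the rule stated above, a value below 1.1 (together with overlapping trace plots) would be read as sufficient convergence.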

Statistical analysis

Study cohort characteristics were described using means, SD, medians, IQRs, frequencies and percentages based on variable type. Group (ie, case vs non-case) differences were tested using Student’s t-tests for continuous measures and χ² tests of independence for categorical measures.

Models for the trend-based case definitions were applied to data from the full study period. A classification cut-off point, denoted as c, was determined as the value nearest to the top left corner of the receiver operating characteristic (ROC) curve. For the trend-based static case definition, individual probabilities were estimated using data from the full study period. For the trend-based dynamic case definition, classification was as follows:

1. Calculate individual probabilities and their corresponding 95% CrI limits, P^LOW(t) and P^UPP(t), for year 1 of the study period.
2. If P^LOW(t) > c, classify the individual as a case.
3. If P^UPP(t) < c, classify the individual as a non-case.
4. If P^LOW(t) ≤ c ≤ P^UPP(t), leave the individual unclassified.
5. If the individual remains unclassified, follow to the next year, update the probability and corresponding CrI, and repeat steps 2–5.

Using this classification scheme, individuals had different classification times (ie, the number of data years) depending on the observation year where their CrI was either fully below the cut-off point (non-case) or fully above the cut-off point (case).
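
The classification steps above can be sketched in a few lines, assuming the yearly CrI limits have already been estimated by the model. The function names are hypothetical, and the handling of individuals who are never classified (returned as None and left out of the mean) is an assumption, since the text does not state how they were treated.

```python
from typing import Optional, Sequence, Tuple


def classify_dynamic(
    intervals: Sequence[Tuple[float, float]], cutoff: float
) -> Tuple[Optional[str], Optional[int]]:
    """Apply the annual classification rule to one individual.

    intervals[k] is the (lower, upper) 95% CrI limit pair estimated from
    the first k + 1 years of data. Returns (status, classification_time):
    status is "case", "non-case" or None if never classified, and
    classification_time is the 1-based year of classification.
    """
    for year, (low, upp) in enumerate(intervals, start=1):
        if low > cutoff:
            return "case", year
        if upp < cutoff:
            return "non-case", year
        # low <= cutoff <= upp: stay unclassified and move to the next year
    return None, None


def average_trend_needed(times: Sequence[Optional[int]]) -> float:
    """Mean classification time, in years, over classified individuals."""
    classified = [t for t in times if t is not None]
    return sum(classified) / len(classified)
```

With a cut-off of 0.5, an individual whose intervals are (0.2, 0.8), (0.4, 0.9) and (0.6, 0.95) stays unclassified for 2 years and is classified as a case in year 3.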

Case definition performance was evaluated using the following accuracy measure estimates: area under the curve (AUC), sensitivity, specificity, positive and negative predictive values (PPV, NPV), proportion of individuals correctly classified (PCC) and F1-scores. Trend-based case definitions were evaluated using five-fold cross-validation. Due to the computational intensity of the models, random samples of non-cases were selected for both training and validation models (1:5 case to non-case ratio for training models; 1:10 case to non-case ratio for validation).
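
For reference, the accuracy measures listed above (other than AUC) follow directly from the 2×2 table of reference status against assigned status, and the ROC-based cut-off can be chosen by minimising the distance to the top left corner. The sketch below is illustrative only: the function names are hypothetical, and classifying an individual as a case when the estimated probability is at or above the cut-off is an assumption.

```python
from math import inf, nan
from typing import Dict, Optional, Sequence


def _ratio(num: int, den: int) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else nan


def accuracy_measures(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, PCC and F1 from paired labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "pcc": _ratio(tp + tn, len(pairs)),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


def nearest_top_left_cutoff(truth: Sequence[bool], probs: Sequence[float]) -> Optional[float]:
    """Cut-off whose ROC point is closest to (0, 1).

    Each observed probability is tried as a candidate cut-off; returns
    None if truth does not contain both cases and non-cases.
    """
    best_c, best_d = None, inf
    for c in sorted(set(probs)):
        m = accuracy_measures(truth, [p >= c for p in probs])
        d = (1 - m["sensitivity"]) ** 2 + (1 - m["specificity"]) ** 2
        if d < best_d:
            best_c, best_d = c, d
    return best_c
```

Because cases make up about 1% of the cohort, PCC is dominated by the non-cases, which is one reason PPV and the F1-score are informative alongside it.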

The average trend needed was identified as the mean classification time, in years, when applying the trend-based dynamic case definition. All case definitions were then reapplied to average trend windows and performance was reassessed.

Supplementary analyses were conducted to explore factors that may influence classification time for the trend-based dynamic case definition. After the trend-based dynamic case definition was applied, classification time for the entire cohort was split into quintiles. Quintile means and proportions of cohort characteristics were calculated and stratified by MS case status. Case definitions were also applied to an additional average trend window (1 April 2017–31 March 2022) as a supplementary analysis, as the full study period could not be evenly split into 5-year windows.

All data analyses were performed by using R V.4.1.033 and SAS V.9.4 (SAS Institute). The mixAK package34 was used to build and validate the trend-based case definitions. SAS was used to apply and validate the deterministic case definition.

Results

Between 1 April 2004 and 31 March 2022, 60 228 eligible individuals (608 MS cases (1.0%), 59 620 non-cases (99.0%)) were identified in the Homecare database (figure 1). Cohort characteristics are provided in table 1. Cases comprised a slightly higher percentage of females compared with non-cases (cases 68% female; non-cases 63% female). A larger proportion of cases were in the higher-income quintiles, whereas a larger proportion of non-cases were in the lower-income quintiles. At cohort entry and MS assessment date, cases had a mean age of 54 and 61 years and non-cases had a mean age of 69 and 78 years, respectively. Cases had a slightly greater average total number of years of healthcare coverage than non-cases. Non-cases had more healthcare coverage before the assessment date and less coverage after the assessment date than cases.

Table 1

Description of cohort characteristics

Figure 1

Flow chart for study cohort. MS, multiple sclerosis.

Table 2 reports accuracy estimates for the case definitions applied to the full study period. The trend-based static case definition had the highest sensitivity estimate (0.96, SD: 0.02) and the deterministic case definition had the highest PPV estimate (0.98, SD: 0.005); specificity and NPV estimates were similar for all three case definitions. The AUC estimate was slightly higher for the trend-based static case definition compared with the trend-based dynamic (0.98 vs 0.94). The trend-based dynamic case definition had the lowest PPV, sensitivity and F1-score estimates (0.64, 0.88 and 0.74, respectively). The PCC estimates were similar for all case definitions. The trend-based static and deterministic case definitions had similar F1-scores. Mean classification time for the trend-based dynamic case definition was 5 years; this estimate was used to define the average trend windows.

Table 2

Classification accuracy measure estimates (SEs) for trend-based and deterministic case definitions for the full study period

Accuracy estimates for the trend-based case definitions applied to the average trend windows were slightly lower than the accuracy estimates obtained for trend-based case definitions applied to the full study period (table 3). PPV and F1-score estimates had the lowest values, which ranged from 0.54 to 0.81 and 0.64 to 0.84, respectively. In contrast, the deterministic case definition had similar accuracy measure estimates when applied to the average trend windows and the full study period, except sensitivity and F1-score, which had considerably lower estimates for the average trend windows (sensitivity: 0.41–0.80 vs 0.94; F1-score: 0.59–0.89 vs 0.96). Variability in sensitivity, specificity, PPV and NPV estimates for the three case definitions across the average trend windows can be seen in online supplemental figure S3. Supplementary analyses for the average trend window of 1 April 2017–31 March 2022 can be found in online supplemental table S3. Classification accuracy measures for all cases were slightly higher when applied to the 2017–2022 average trend window compared with classification accuracy measures obtained using data from the 2019–2022 average trend window.

Table 3

Classification accuracy measure estimates and ranges for trend-based and deterministic case definitions for the average trend windows

Mean case probability estimates and 95% CrIs for cases and non-cases over the study period are reported in figure 2. Mean case probability estimates increased over time for cases and decreased for non-cases. Mean 95% CrIs decreased over time for non-cases.

Figure 2

Mean estimated case probability and 95% credible interval (CrI) limits with SE bars at each year during observation period for cases and non-cases.

Results from the supplementary analyses exploring the impact of cohort characteristics on classification time are found in online supplemental tables S4-S6. For cases and non-cases, individuals who were younger at cohort entry were more likely to be classified within the first classification time quintile. For cases, a higher proportion of individuals classified within classification time quintile 1 were in income quintile 5 (Q5; 0.25) compared with the remaining income quintiles. For non-cases, the highest proportion of individuals classified within classification time quintile 1 was in quintiles 1 and 2 (Q1 and Q2; 0.24 and 0.25, respectively). The highest proportion of cases were classified in year 1 (online supplemental figure S4), whereas the highest proportion of non-cases were classified in year 5 (online supplemental figure S5).

Discussion

This study aimed to build and validate trend-based case definitions for MS and assess the use of dynamic classification for identifying the average trend needed for classification. Trend-based case definition performance was compared with the performance of a previously validated deterministic case definition. We found similar accuracy estimates for trend-based dynamic, trend-based static and deterministic case definitions; the trend-based dynamic case definition had lower PPV and sensitivity estimates compared with the other case definitions. Dynamic classification estimated an average trend of 5 years was needed for classification. When the observation period was limited to the average trend needed (ie, 5 years), performance of all case definitions was slightly lower; sensitivity for the deterministic case definition was considerably lower. Poorest performance estimates were observed for all case definitions when they were applied to the most recent trend window (ie, 1 April 2019–31 March 2022).

Previous studies validating MS case definitions for administrative health data have reported variable accuracy estimates.28 30 35–42 The majority of validated case definitions were deterministic. The deterministic case definition used in this study (three or more MS contacts) has been validated in multiple geographical regions and populations (children and adults), with good performance (sensitivity: 0.87–0.99; specificity: 0.56–1.00; PPV: 0.75–1.00; NPV: 0.76–0.98).28 30 36–39

The estimated PPV values obtained when applying the trend-based case definitions to the average trend windows were lower than the PPV estimates obtained when applying trend-based case definitions to the full study period. In contrast, the deterministic case definition had lower sensitivity estimates when applied to the average trend windows compared with when it was applied to the full study period. This indicates trend-based case definitions are more likely to misclassify non-cases, whereas the deterministic case definition is more likely to misclassify cases when the number of data years used for classification is reduced. Understanding how robust case definitions are to changes in the years of data used for classification is important when applying case definitions across multiple jurisdictions, as most jurisdictions do not have the same number of data years available.43 44

As trend-based case definitions rely on trends for classification, changes in disease treatment over time may influence performance. While this applies to the full study period, it is most evident when applying case definitions to the 2019–2022 average trend window, where drastic changes in healthcare were observed due to the COVID-19 pandemic, including increased virtual care, physician departure from clinics and introduction of new MS medications.45 46 Data exploration indicated a drop in the mean number of physician visits (general and specialist), hospitalisations and MS-specific contacts for the 2021 fiscal year due to the COVID-19 pandemic, which likely contributed to lower estimates of case definition performance. Lower estimates were still observed when the average trend window was extended to 1 April 2017–31 March 2022 in supplementary analysis.

As expected, mean case probability estimates for cases increased over the study period, whereas mean case probability estimates decreased for non-cases. This indicates that using a longer trend for classification (ie, more years of data) resulted in a more accurate estimated case probability. Notably, the estimated case probability trend seen in this study was based on a classification approach and obtained under the assumption that MS prevalence remained constant over time, which is not always the case.28 47 Worldwide, MS prevalence is considered to be increasing primarily due to earlier diagnosis and improved survival rates.47 A different trend in estimated case probability may be observed when changes in baseline prevalence are considered or where a prediction, rather than a classification, approach is used.

There were some limitations to this study. The selected reference standard for MS status and study cohort comes from those receiving homecare within the largest health authority in the province of Manitoba, which primarily serves urban populations. Therefore, study findings may not generalise to younger, healthier populations or rural populations. MS status was obtained from the interRAI assessment. This is a validated assessment conducted by a clinician24 48; however, MS cases and non-cases may still be misclassified.24 The interRAI assessment has been previously used as a validation source for MS.36

Strengths of the study include near-complete capture (ie, 99%) of healthcare use via Manitoba’s universal healthcare system.49–51 In addition, applying case definitions to a reduced number of data years as well as the full study period provides a more complete picture of case definition performance. The deterministic case definition chosen for this study as a comparison for the trend-based case definitions has been well validated in Manitoba27 28 52 as well as other jurisdictions.30 38 52 Last, the novel use of dynamic classification provides an empirical and effective approach to identify the average trend needed, which can easily be applied to different episodic diseases in future research.

In conclusion, trend-based case definitions have similar performance to deterministic case definitions when identifying MS cases and non-cases from administrative health data. Performance of both trend-based and deterministic case definitions varies when the number of data years used for classification is limited. When using a trend-based case definition, dynamic classification appears to be a viable option for identifying the average trend needed for classification. Future research should examine how changes in the years of data used for classification at the individual level affect case definition performance, as we only explored changes in the number of data years at the population (ie, marginal) level. The application of trend-based case definitions should also be explored for other episodic chronic diseases, such as rheumatoid arthritis53 54 or inflammatory bowel disease.55 56
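A simplified sketch of how dynamic classification can yield an average trend needed is given below. It assumes a plain probability-cutoff stopping rule with hypothetical cutoffs of 0.9 and 0.1; the stopping rule actually used in this study may differ (eg, rules based on credible intervals), and the trajectories are invented.

```python
from statistics import mean


def years_to_classify(yearly_probs, upper=0.9, lower=0.1):
    """Return (label, years_used) at the first year a cutoff is crossed."""
    for year, prob in enumerate(yearly_probs, start=1):
        if prob >= upper:
            return "case", year
        if prob <= lower:
            return "non-case", year
    # No cutoff crossed within the available data years.
    return "unclassified", len(yearly_probs)


def average_trend_needed(trajectories, upper=0.9, lower=0.1):
    """Mean number of data years used, among individuals who were classified."""
    years = []
    for probs in trajectories:
        label, used = years_to_classify(probs, upper, lower)
        if label != "unclassified":
            years.append(used)
    return mean(years) if years else None


# Invented annual case-probability estimates for three individuals.
toy_trajectories = [
    [0.40, 0.70, 0.95],  # classified as a case in year 3
    [0.30, 0.08],        # classified as a non-case in year 2
    [0.50, 0.50, 0.50],  # never crosses a cutoff
]

print(average_trend_needed(toy_trajectories))
```

Individuals are assessed each year and drop out of further assessment once classified, so the mean of the stopping years estimates how many years of data are needed on average.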

Data availability statement

Ethics statements

Patient consent for publication

Ethics approval

Ethics approval was granted by the University of Manitoba’s Health Research Ethics Board (HREB No. HS23961). Data access approval was provided by the Provincial Health Research Privacy Committee (PHRPC No. 2020/2021-12) and Manitoba Shared Health along with the Winnipeg Regional Health Authority (RAAC2020:026).

References

1. Lix LM, Ayles J, Bartholomew S, et al. The Canadian chronic disease surveillance system: a model for collaborative surveillance. Int J Popul Data Sci 2018;3:433. doi:10.23889/ijpds.v3i3.433
2. Lix LM, Yogendran MS, Shaw SY, et al. Population-based data sources for chronic disease surveillance. Chronic Dis Can 2008;29:31–8.
3. Rector TS, Wickstrom SL, Shah M, et al. Specificity and sensitivity of claims-based algorithms for identifying members of medicare choice health plans that have chronic medical conditions. Health Serv Res 2004;39:1839–57. doi:10.1111/j.1475-6773.2004.00321.x
4. Iezzoni LI. Assessing quality using administrative data. Ann Intern Med 1997;127:666–74. doi:10.7326/0003-4819-127-8_part_2-199710151-00048
5. Marrie RA, McKay K. Administrative data for observational research in multiple sclerosis: opportunities and challenges. Mult Scler 2022;28:3–6. doi:10.1177/13524585211055787
6. Yamout BI, Alroughani R. Multiple sclerosis. Semin Neurol 2018;38:212–25. doi:10.1055/s-0038-1649502
7. Petrin J, Marrie RA, Devonshire V, et al. Good multiple sclerosis (MS) care and how to get there in Canada: perspectives of Canadian healthcare providers working with persons with MS. Front Neurol 2023;14:1101521. doi:10.3389/fneur.2023.1101521
8. GBD 2016 Multiple Sclerosis Collaborators. Global, regional, and national burden of multiple sclerosis 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol 2019;18:269–85. doi:10.1016/S1474-4422(18)30443-5
9. Amankwah N, Marrie RA, Bancej C, et al. Multiple sclerosis in Canada 2011 to 2031: results of a microsimulation modelling study of epidemiological and economic impacts. Health Promot Chronic Dis Prev Can 2017;37:37–48. doi:10.24095/hpcdp.37.2.02
10. Cooke CR, Joo MJ, Anderson SM, et al. The validity of using ICD-9 codes and pharmacy records to identify patients with chronic obstructive pulmonary disease. BMC Health Serv Res 2011;11:37. doi:10.1186/1472-6963-11-37
11. English SW, McIntyre L, Fergusson D, et al. Subarachnoid hemorrhage admissions retrospectively identified using a prediction model. Neurology 2016;87:1557–64. doi:10.1212/WNL.0000000000003204
12. Fan J, Arruda-Olson AM, Leibson CL, et al. Billing code algorithms to identify cases of peripheral artery disease from administrative data. J Am Med Inform Assoc 2013;20:e349–54. doi:10.1136/amiajnl-2013-001827
13. Lix LM, Yogendran MS, Leslie WD, et al. Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases. J Clin Epidemiol 2008;61:1250–60. doi:10.1016/j.jclinepi.2008.02.002
14. Peng M, Chen G, Lix LM, et al. Refining hypertension surveillance to account for potentially misclassified cases. PLoS One 2015;10:e0119186. doi:10.1371/journal.pone.0119186
15. Prosser RJ, Carleton BC, Smith MA. Identifying persons with treated asthma using administrative data via latent class modelling. Health Serv Res 2008;43:733–54. doi:10.1111/j.1475-6773.2007.00775.x
16. Feely A, Lim LS, Jiang D, et al. A population‐based study to develop juvenile arthritis case definitions for administrative health data using model‐based dynamic classification. BMC Med Res Methodol 2021;21:105. doi:10.1186/s12874-021-01296-9
17. Choi E, Schuetz A, Stewart WF, et al. Using recurrent neural network models for early detection of heart failure onset. J Am Med Inform Assoc 2017;24:361–70. doi:10.1093/jamia/ocw112
18. Reddy BK, Delen D. Predicting hospital readmission for lupus patients: an RNN-LSTM-based deep-learning methodology. Comput Biol Med 2018;101:199–209. doi:10.1016/j.compbiomed.2018.08.029
19. Hughes DM, Komárek A, Czanner G, et al. Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types. Stat Methods Med Res 2018;27:2060–80. doi:10.1177/0962280216674496
20. Hughes DM, Komárek A, Bonnett LJ, et al. Dynamic classification using credible intervals in longitudinal discriminant analysis. Stat Med 2017;36:3858–74. doi:10.1002/sim.7397
21. Wang T, Qiu RG, Yu M. Predictive modeling of the progression of Alzheimer’s disease with recurrent neural networks. Sci Rep 2018;8:9161. doi:10.1038/s41598-018-27337-w
22. Maruyama N, Takahashi F, Takeuchi M. Prediction of an outcome using trajectories estimated from a linear mixed model. J Biopharm Stat 2009;19:779–90. doi:10.1080/10543400903105174
23. Martens P, Nickel N, Forget E, et al. The cost of smoking: a Manitoba study. Winnipeg, MB: Manitoba Centre for Health Policy, 2015. Available: http://mchp-appserv.cpe.umanitoba.ca/deliverablesList.html
24. Foebel AD, Hirdes JP, Heckman GA, et al. Diagnostic data for neurological conditions in interRAI assessments in home care, nursing home and mental health care settings: a validity study. BMC Health Serv Res 2013;13:457. doi:10.1186/1472-6963-13-457
25. Government of Canada. Canadian chronic disease surveillance system (CCDSS) data tool. 2023. Available: https://health-infobase.canada.ca/ccdss/data-tool/Index
26. Marrie RA, Patten SB, Tremlett H, et al. Sex differences in comorbidity at diagnosis of multiple sclerosis: a population-based study. Neurology 2016;86:1279–86. doi:10.1212/WNL.0000000000002481
27. Marrie RA, Fisk JD, Yu BN, et al. Mental comorbidity and multiple sclerosis: validating administrative data to support population-based surveillance. BMC Neurol 2013;13:16. doi:10.1186/1471-2377-13-16
28. Marrie RA, Yu N, Blanchard J, et al. The rising prevalence and changing age distribution of multiple sclerosis in Manitoba. Neurology 2010;74:465–71. doi:10.1212/WNL.0b013e3181cf6ec0
29. Marrie RA, Kosowan L, Taylor C, et al. Identifying people with multiple sclerosis in the Canadian primary care sentinel surveillance network. Mult Scler J Exp Transl Clin 2019;5:2055217319894360. doi:10.1177/2055217319894360
30. Marrie RA, Fisk JD, Stadnyk KJ, et al. The incidence and prevalence of multiple sclerosis in Nova Scotia, Canada. Can J Neurol Sci 2013;40:824–31.
31. Plummer M, Best N, Cowles K, et al. CODA: convergence diagnosis and output analysis for MCMC. R News 2006;6:7–11.
32. Brooks SP, Gelman A. General methods for monitoring convergence of iterative simulations. J Comput Graph Stat 1998;7:434–55. doi:10.1080/10618600.1998.10474787
33. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2023.
34. Komárek A, Komárková L. Capabilities of R package mixAK for clustering based on multivariate continuous and discrete longitudinal data. J Stat Softw 2014;59:1–38. doi:10.18637/jss.v059.i12
35. Murley C, Friberg E, Hillert J, et al. Validation of multiple sclerosis diagnoses in the Swedish National Patient Register. Eur J Epidemiol 2019;34:1161–9. doi:10.1007/s10654-019-00558-7
36. Al-Sakran LH, Marrie RA, Blackburn DF, et al. Establishing the incidence and prevalence of multiple sclerosis in Saskatchewan. Can J Neurol Sci 2018;45:295–303. doi:10.1017/cjn.2017.301
37. Teljas C, Boström I, Marrie RA, et al. Validating the diagnosis of multiple sclerosis using Swedish administrative data in Värmland County. Acta Neurol Scand 2021;144:680–6. doi:10.1111/ane.13514
38. Marrie RA, O’Mahony J, Maxwell C, et al. Incidence and prevalence of MS in children: a population-based study in Ontario, Canada. Neurology 2018;91:e1579–90. doi:10.1212/WNL.0000000000006395
39. Iljicsov A, Milanovich D, Ajtay A, et al. Incidence and prevalence of multiple sclerosis in Hungary based on record linkage of nationwide multiple healthcare administrative data. PLoS One 2020;15:e0236432. doi:10.1371/journal.pone.0236432
40. Culpepper WJ, Marrie RA, Langer-Gould A, et al. Validation of an algorithm for identifying MS cases in administrative health claims datasets. Neurology 2019;92:e1016–28. doi:10.1212/WNL.0000000000007043
41. Culpepper WJ, Ehrmantraut M, Wallin MT, et al. Veterans health administration multiple sclerosis surveillance registry: the problem of case-finding from administrative databases. J Rehabil Res Dev 2006;43:17–24. doi:10.1682/jrrd.2004.09.0122
42. Schmedt N, Khil L, Berger K, et al. Incidence of multiple sclerosis in Germany: a cohort study applying different case definitions based on claims data. Neuroepidemiology 2017;49:91–8. doi:10.1159/000481990
43. Lix LM, Walker R, Quan H, et al. Features of physician services databases in Canada. Chronic Dis Inj Can 2012;32:186–93. doi:10.24095/hpcdp.32.4.02
44. Groome PA, McBride ML, Jiang L, et al. Lessons learned: it takes a village to understand inter-sectoral care using administrative data across jurisdictions. Int J Popul Data Sci 2018;3:440. doi:10.23889/ijpds.v3i3.440
45. Canadian Institute for Health Information. Virtual care: a major shift for Canadians receiving physician services. 2022.
46. Canadian Institute for Health Information. Supply, distribution and migration of physicians in Canada, 2022 - data tables. Ottawa ON: CIHI, 2023.
47. Walton C, King R, Rechtman L, et al. Rising prevalence of multiple sclerosis worldwide: insights from the Atlas of MS, third edition. Mult Scler 2020;26:1816–21. doi:10.1177/1352458520970841
48. Danila O, Hirdes JP, Maxwell CJ, et al. Prevalence of neurological conditions across the continuum of care based on interRAI assessments. BMC Health Serv Res 2014;14:29. doi:10.1186/1472-6963-14-29
49. Katz A, Enns J, Smith M, et al. Population data centre profile: the Manitoba centre for health policy. Int J Popul Data Sci 2020;4:1131. doi:10.23889/ijpds.v5i1.1131
50. Roos LL, Mustard CA, Nicol JP, et al. Registries and administrative data: organization and accuracy. Med Care 1993;31:201–12. doi:10.1097/00005650-199303000-00002
51. Manitoba Centre for Health Policy. Concept: Manitoba health insurance registry / MCHP research registry - overview [internet]. 2023. Available: http://mchp-appserv.cpe.umanitoba.ca/viewConcept.php?conceptID=1213
52. Marrie RA, Fisk JD, Tremlett H, et al. Differences in the burden of psychiatric comorbidity in MS vs the general population. Neurology 2015;85:1972–9. doi:10.1212/WNL.0000000000002174
53. Kroeker K, Widdifield J, Muthukumarana S, et al. Model-based methods for case definitions from administrative health data: application to rheumatoid arthritis. BMJ Open 2017;7:e016173. doi:10.1136/bmjopen-2017-016173
54. Widdifield J, Bernatsky S, Paterson JM, et al. Accuracy of Canadian health administrative databases in identifying patients with rheumatoid arthritis: a validation study using the medical records of rheumatologists. Arthritis Care Res (Hoboken) 2013;65:1582–91. doi:10.1002/acr.22031
55. Hutfless S, Jasper RA, Tilak A, et al. A systematic review of Crohn’s disease case definitions in administrative or claims databases. Inflamm Bowel Dis 2023;29:705–15. doi:10.1093/ibd/izac131
56. Lirhus SS, Høivik ML, Moum B, et al. Incidence and prevalence of inflammatory bowel disease in Norway and the impact of different case definitions: a nationwide registry study. Clin Epidemiol 2021;13:287–94. doi:10.2147/CLEP.S303797

Supplementary materials

Supplementary Data

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Data supplement 1
Data supplement 2

Footnotes

Contributors NCH and LL conceived the idea for the study. NCH, LL, DJ, RAM and PI defined the scope of the study and created the analysis plans. NCH conducted the analyses and had access to individual-level data obtained within study period. NCH and LL drafted the manuscript and DJ, RAM and PI contributed to its revisions. NCH, LL, DJ, RAM and PI reviewed and approved the final manuscript for submission. NCH is the guarantor.
Funding This work was supported by the Canadian Institutes of Health Research [FDN-143293]. NCH received funding from the Visual and Automated Disease Analytics Trainee Programme during the time of this study. LL is supported by a Canada Research Chair in Methods for Electronic Health Data Quality (CRC-2017–00186). PI is supported by a Canada Research Chair in Ubiquitous Analytics. RAM is supported by the Waugh Family Chair in Multiple Sclerosis and a Manitoba Research Chair from Research Manitoba.
Competing interests RAM receives research funding from: CIHR, Research Manitoba, Multiple Sclerosis Society of Canada, Multiple Sclerosis Scientific Foundation, Crohn’s and Colitis Canada, National Multiple Sclerosis Society, CMSC and the US Department of Defense, and is a co-investigator on studies receiving funding from Biogen Idec and Roche Canada. LL receives research funding from CIHR, Canada Research Chairs Programme, Natural Sciences and Engineering Research Council of Canada, National Institutes of Health, and the Canadian Agency for Drugs and Technologies in Health.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
