We validated a case definition for combined suspected or confirmed asthma in primary care. The proposed case definitions performed similarly for suspected and confirmed asthma. They could not discriminate between the two because objective measures to confirm an asthma diagnosis were either not completed or not documented. Our finding of a combined prevalence of suspected and confirmed asthma of 9.7% is comparable to current national statistics. However, 75% of the cases in our study were suspected, not confirmed. This highlights the importance of confirming and documenting the status of asthma diagnoses in EMRs. National statistics based on population surveys that rely on self-report of physician diagnosis or billing data may also be subject to considerable misclassification. Until EMR data elements are adopted that allow for the distinction between suspected and confirmed asthma, a single case definition for combined suspected or confirmed asthma is recommended.
Our proposed case definitions had similar operating characteristics to those reported previously. However, in replicating the case definition algorithms from both Xi et al. and Cave et al. (Table 1), we found different results across all metrics calculated. For example, for Case Definition 1, Xi et al. (2015) reported a SN of 78% and SP of 89%, compared to a SN of 35% and SP of 99% in our study. For Case Definition 3, Xi et al. (2015) reported a SN of 7% and SP of 99%, compared to a SN of 4% and SP of 100% in our study. Case Definitions 1 and 3 were attempts to replicate their algorithms and were considered approximations because the original case definition algorithms used information directly from the source EMR in OSCAR. For Cave et al., the metrics were similar, with a reported SN of 83%, SP of 99%, PPV of 74%, NPV of 99%, and a Youden’s Index (YI) of 0.82, compared to a SN of 78%, SP of 97%, PPV of 75%, NPV of 98%, and YI of 0.73 in our study.
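For readers less familiar with these operating characteristics, the following is a minimal sketch of how SN, SP, PPV, NPV, and Youden’s Index are derived from a 2x2 comparison of a case definition against chart review. The class name, function name, and counts are hypothetical and are not data from this study; the sketch also assumes every denominator is non-zero.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a case definition against chart review."""
    tp: int  # flagged by the case definition, asthma on chart review
    fp: int  # flagged by the case definition, no asthma on chart review
    fn: int  # not flagged, asthma on chart review
    tn: int  # not flagged, no asthma on chart review


def operating_characteristics(c: ConfusionCounts) -> Dict[str, float]:
    """Return SN, SP, PPV, NPV and Youden's Index as proportions."""
    sn = c.tp / (c.tp + c.fn)   # sensitivity
    sp = c.tn / (c.tn + c.fp)   # specificity
    ppv = c.tp / (c.tp + c.fp)  # positive predictive value
    npv = c.tn / (c.tn + c.fn)  # negative predictive value
    return {"SN": sn, "SP": sp, "PPV": ppv, "NPV": npv, "YI": sn + sp - 1}


# Hypothetical counts for illustration only.
example = ConfusionCounts(tp=80, fp=20, fn=20, tn=880)
metrics = operating_characteristics(example)
print({name: round(value, 2) for name, value in metrics.items()})
```

Youden’s Index is simply SN + SP - 1, so the value of 0.82 reported by Cave et al. follows directly from their SN of 83% and SP of 99%.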
This variability can likely be attributed to differences in the data sources used for case definition analysis and in charting behaviour between clinical sites. Xi et al. (2015) created a cohort with a high proportion of patients with asthma and COPD for analysis. In contrast, we used a population-based sample with a lower asthma prevalence, reducing SP and PPV while improving SN and NPV. In Cave et al.’s (2020) study, the authors used data from the Southern Alberta Primary Care Research Network of CPCSSN (SAPCReN-CPCSSN) to classify cases of asthma. In this study, reviewers used the source EMR for classification, allowing for a complete review of the patient’s entire medical history.
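The dependence of predictive values on prevalence can be illustrated with a short sketch based on Bayes’ rule. It uses the SN of 78% and SP of 97% reported above for our replication of Cave et al.; the two prevalence values (10% and 50%) and the function name are illustrative assumptions. The sketch holds SN and SP fixed, which is an idealization, since differences in case mix between samples can shift SN and SP as well.

```python
def predictive_values(sn: float, sp: float, prevalence: float):
    """Return (PPV, NPV) for a given sensitivity, specificity and prevalence."""
    tp = sn * prevalence              # true positives per unit population
    fn = (1 - sn) * prevalence        # false negatives
    tn = sp * (1 - prevalence)        # true negatives
    fp = (1 - sp) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative prevalences: a population-based sample vs. an enriched cohort.
for prevalence in (0.10, 0.50):
    ppv, npv = predictive_values(sn=0.78, sp=0.97, prevalence=prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Under these assumptions, the same case definition yields a lower PPV and a higher NPV in the lower-prevalence sample than in the enriched cohort.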
The results of this study highlight the importance of having discrete data elements for asthma diagnostic tests in EMRs, particularly given that there were no searchable data elements that enabled us to differentiate between suspected and confirmed asthma. In addition, EMRs do not require that an asthma diagnosis be confirmed through objective measures such as spirometry or a methacholine challenge test. EMRs should incorporate data elements such as those proposed by the Pan-Canadian Respiratory Standards Initiative for Electronic Health Records (PRESTINE) so that providers are able to document whether asthma is suspected or confirmed and, if confirmed, by what method [7, 20]. Data elements that capture whether asthma has been confirmed would enable case definition search strategies to differentiate between suspected and confirmed asthma. By adopting these data elements, knowledge translation eTools could provide decision support to healthcare providers on cases of suspected asthma that require objective testing, while simultaneously improving asthma surveillance by ensuring that counted cases are confirmed asthma.
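As a rough illustration of what such discrete data elements would make possible, the sketch below models a diagnosis record with a status and a confirmation method, then separates cases for surveillance from cases needing objective testing. All class, field, and function names are hypothetical; this is not the PRESTINE specification or the schema of any EMR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class AsthmaStatus(Enum):
    SUSPECTED = "suspected"
    CONFIRMED = "confirmed"


class ConfirmationMethod(Enum):
    SPIROMETRY = "spirometry"
    METHACHOLINE_CHALLENGE = "methacholine challenge"


@dataclass(frozen=True)
class AsthmaRecord:
    """Hypothetical discrete data element for an asthma diagnosis."""
    patient_id: str
    status: AsthmaStatus
    method: Optional[ConfirmationMethod] = None

    def __post_init__(self) -> None:
        # A confirmed diagnosis must state how it was confirmed.
        if self.status is AsthmaStatus.CONFIRMED and self.method is None:
            raise ValueError("confirmed asthma requires a confirmation method")


def needs_objective_testing(records: List[AsthmaRecord]) -> List[str]:
    """Patients with suspected asthma, e.g. for a decision-support prompt."""
    return [r.patient_id for r in records if r.status is AsthmaStatus.SUSPECTED]


def confirmed_cases(records: List[AsthmaRecord]) -> List[str]:
    """Patients who would count toward confirmed-asthma surveillance."""
    return [r.patient_id for r in records if r.status is AsthmaStatus.CONFIRMED]


# Made-up records for illustration only.
records = [
    AsthmaRecord("p1", AsthmaStatus.SUSPECTED),
    AsthmaRecord("p2", AsthmaStatus.CONFIRMED, ConfirmationMethod.SPIROMETRY),
]
print(needs_objective_testing(records), confirmed_cases(records))
```

With a searchable status field of this kind, separate case definitions for suspected and confirmed asthma reduce to simple filters rather than inference from free text.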
In our study, although we included every medication combination presented in the CTS guidelines for asthma management (Case Definitions M1-M7), medication data did not improve the operating characteristics of the detection algorithms (Table 1). The proposed case definitions that included medication data had a wide sensitivity range, from 0 to 76%. This result differs from previous literature on asthma case definitions, which describes adding medications as an effective way to improve case definitions. We believe that this may be because many medications are now used for both asthma and COPD and, as such, could contribute to misdiagnosis of asthma and COPD if used as part of EMR algorithms. Additionally, this finding suggests that researchers creating asthma case definitions must be very specific in their inclusion or exclusion of medications.
The findings of our study fit well within the existing literature on the validation of asthma diagnoses using EMRs. A recent study from Howell et al. developed a case definition algorithm for asthma using EMR data from a pulmonary specialty clinic. The best-performing case definition in that study had a SN of 94% and a SP of 85%. These results are slightly higher than the results of our study. In this case, the slightly higher SN and SP can be attributed to the use of a specialty clinic, which would be more likely to have confirmed cases of asthma, improving specificity, and a higher relative proportion of patients with asthma, improving sensitivity. A systematic review of the literature on the validation of asthma diagnoses in electronic health records by Nissen et al. described 13 studies on the subject. The authors found that most studies were able to demonstrate a high positive predictive value (PPV > 80%), with a high degree of variation based on the methodology used. Our study builds upon the systematic review by using a national database, which allows the case definition to be applied in primary care practices across Canada.
We were able to directly replicate the case definition proposed by Cave et al., given that it also used CPCSSN data holdings. For Case Definition 13, Cave et al. (2020) reported a SN of 83% (+5%), a SP of 99% (+2%), PPV of 74% (-1%), NPV of 99% (-1%), and a Youden’s Index of 0.82 (+0.09), which are nearly identical to our results. The small remaining discrepancies can be attributed to the data source used for classifying cases of asthma and the data source used for validating the case definition.
The clinical implications of using a combined case definition for suspected and confirmed asthma in primary care EMRs are important to consider. Until EMR data elements that document whether asthma has been confirmed by objective lung function tests are widely adopted, surveillance data based on an asthma EMR case definition that cannot differentiate between suspected and confirmed asthma may overestimate true asthma prevalence. Separate case definitions would provide more accurate information on disease patterns, prevalence, and performance measurement for quality improvement. Future knowledge translation initiatives should focus on the adoption of EMR data elements that would allow separate EMR case definitions for suspected and confirmed asthma.
Strengths of this study include the use of the original EMR source data for chart abstraction and classification. By manually reviewing the patient chart, the abstractor and physicians had the patient’s entire medical record available and could classify each chart based on all available information. Another strength of this study is the use of CPCSSN data holdings for testing and validating case definitions. CPCSSN data are more granular than the health administrative data that have been used for case definitions of asthma in the past, because they are derived from primary care medical records, which contain more specific information. In addition to this added specificity, CPCSSN remains more broadly applicable than data from a single EMR, as it compiles data from multiple EMR platforms. Using CPCSSN also improves the generalizability of the study, as CPCSSN can analyze data from all major EMR providers in Canada. This allows the proposed case definition to be applied across the country in various primary care settings and with various EMR providers. A further strength is the use of a single abstractor and of experts for classification, which ensured consistency in both data collection and the final classification of cases.
Limitations of this study relate to generalizability and the data source. This exercise was conducted at a single academic clinical site that is a member of CPCSSN. It may be difficult to generalize the findings from this academic primary care practice to community practices, as the case mix may differ, and this particular practice may have unique charting, billing, and data entry patterns. Additionally, this study used information from one EMR, OSCAR. As a result, the case definitions developed in this study may perform differently when applied to other EMRs, although the criteria used in the CPCSSN database apply to sites across Canada.