A scoping review of the use of minimally important difference of EQ-5D utility index and EQ-VAS scores in health technology assessment | Health and Quality of Life Outcomes


Literature search

A detailed breakdown of the flow of studies in the HTA review has been described previously [17]. In summary, 1329 HTA decision and supporting documents from 1072 technology appraisals were identified in the literature search. After screening for eligibility, 298 documents from 195 TAs met the inclusion criteria (G-BA n = 60, HAS n = 11, ICER n = 3, IQWiG n = 78, NICE n = 43). However, only 16 of the 60 G-BA TAs meeting the inclusion criteria provided additional EQ-5D data to linked IQWiG TAs and were extracted. Therefore, 151 TAs were considered for MID data.
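The arithmetic behind the 151 TAs can be retraced with a short sketch. The counts are taken from the paragraph above; the variable names are illustrative only and do not come from the review.

```python
# Technology appraisals (TAs) meeting the inclusion criteria, per agency,
# as reported in the text above.
included = {"G-BA": 60, "HAS": 11, "ICER": 3, "IQWiG": 78, "NICE": 43}

# Only these G-BA TAs provided additional EQ-5D data to linked IQWiG TAs.
gba_with_additional_data = 16

total_included = sum(included.values())

# Replace the full G-BA count with the subset that contributed unique data.
unique_tas = total_included - included["G-BA"] + gba_with_additional_data

print(total_included, unique_tas)
```

The first figure is the number of TAs meeting the inclusion criteria; the second is the number considered for MID data.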

Discussion of minimally important difference

Of the 151 TAs included in the HTA review which provided unique data, only 38% (n = 58/151) discussed the MID of EQ-5D data (Table 1). German appraisals most frequently mentioned the MID, in 75% (n = 12/16) of G-BA TAs and 44% (n = 34/78) of IQWiG TAs. ICER mentioned the MID in 33% (n = 1/3) and NICE in 23% (n = 10/43) of appraisals. Discussion of MID occurred less often in French appraisals (n = 1, 9%). Cancer was the most frequently addressed disease, in 91% of appraisals mentioning MID (n = 53/58). All G-BA appraisals were on cancer (n = 12, 100%), all HAS appraisals on blood or immune disease (n = 1, 100%) and all ICER on digestive disease (n = 1, 100%). Among IQWiG appraisals discussing MID, 94% (n = 32) were on cancer and 6% on musculoskeletal issues (n = 2), with 90% of NICE appraisals on cancer (n = 9) and 10% on musculoskeletal issues (n = 1).
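As a cross-check, the per-agency percentages can be recomputed from the reported numerators and denominators. This is a minimal sketch that assumes percentages are rounded to the nearest whole number; the dictionary layout and names are illustrative.

```python
# (appraisals discussing the MID, appraisals providing unique data) per agency,
# as reported in the text above.
mid_discussed = {
    "G-BA": (12, 16),
    "IQWiG": (34, 78),
    "ICER": (1, 3),
    "NICE": (10, 43),
    "HAS": (1, 11),
}

def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

by_agency = {agency: percent(n, d) for agency, (n, d) in mid_discussed.items()}

total_discussed = sum(n for n, _ in mid_discussed.values())
total_tas = sum(d for _, d in mid_discussed.values())
overall = percent(total_discussed, total_tas)

print(by_agency, total_discussed, total_tas, overall)
```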

Although terminology for MID varied (minimal(ly) important difference, minimal clinically important difference, clinically meaningful, clinically meaningful difference, clinically meaningful change, clinically meaningful improvement, clinically relevant improvement, clinically relevant deterioration), the term ‘minimal(ly) important difference’ was most frequently reported, in 72% of TAs mentioning MID (n = 42/58). However, limited explanation of the methodology utilised made it difficult to assess whether the correct terms were employed. In contrast to Terwee et al.’s 2021 definition of MIC as longitudinal and MID as cross-sectional [7], the G-BA considered MID as longitudinal [19,20,21,22,23,24,25,26,27].

Table 1 Discussion of minimally important difference, stratified by HTA agency

Of the appraisals which mentioned the MID, a greater proportion discussed it for the EQ-VAS (86%) than for the EQ-5D utility index (5%) or for the utility index and EQ-VAS in combination (5%; see Table 2). Forty-six of the 53 appraisals (87%) which discussed the MID for the EQ-VAS were German.

Table 2 Discussion of minimally important difference, stratified by EQ-5D measure and HTA agency

EQ-5D MID thresholds reported

Reported MID thresholds stratified by HTA agency are summarised in Table 3. Of the 58 appraisals which mentioned MID, 50 (86%) reported the threshold utilised (thresholds were reported for both the EQ-5D utility index and EQ-VAS in 1 NICE [28] and 1 HAS TA [29]). MID thresholds for the EQ-5D utility index were reported only by NICE and HAS, in 5 TAs [28,29,30,31,32], and no threshold was used more than once. Of the appraisals which specified the MID threshold used for the EQ-VAS, 100% reported MID thresholds of ≥ 7 points (n = 47). A threshold of > 7 or > 10 points was most frequently used for EQ-VAS data (28%), including in 2 G-BA [23, 24] and 11 IQWiG TAs [33,34,35,36,37,38,39,40,41,42,43].

Table 3 EQ-5D MID thresholds reported, stratified by HTA agency

Source of MID thresholds

Of the 58 appraisals which mentioned MID, 40 (69%) reported the source of the thresholds used (1 NICE TA reported the same source for both the EQ-5D utility index and EQ-VAS [28]). As shown in Table 4, only 2 TAs, both published by NICE, reported the source of the MID thresholds utilised for the EQ-5D utility index [28, 30]; Walters & Brazier 2005 [44] and Delaloge et al. 2019 [45] were referenced equally often. For EQ-VAS data, Pickard et al. 2007 [18] was the most frequently reported source (90%), including in 11 G-BA [19,20,21,22,23,24,25,26,27, 46, 47] and 24 IQWiG TAs [33, 34, 36, 39, 41,42,43, 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64].

Table 4 Source of MID thresholds for EQ-5D utility index and EQ-VAS

Of the TAs which reported the source of MID (n = 40), 38 (95%) were for cancer, one ICER TA was for digestive tract conditions, and one IQWiG TA was for musculoskeletal conditions. Almost all applied MID thresholds to patient populations with the same indication as the source (95%, n = 38). The exceptions were 1 IQWiG TA in cancer [68], which utilised a rheumatoid arthritis-specific MID for the EQ-VAS by Hurst et al. (1997) [67], and 1 IQWiG TA for osteoporosis [62], which utilised a cancer-specific MID for the EQ-VAS by Pickard et al. (2007) [18].

Differences in MID between G-BA and IQWiG appraisals of the same product

When MIDs were compared between G-BA and IQWiG appraisals of the same product and indication (linked appraisals), 4 (25%) G-BA appraisals which presented additional EQ-5D data reported different MID usage [21, 22, 26, 47] (Table 5). In all cases, the MID threshold was reported in G-BA and not in IQWiG TAs. For 31% of G-BA TAs, MID thresholds were not reported in either one or both of the linked IQWiG and G-BA documents.

Table 5 Differences between reported MID in linked G-BA and IQWiG appraisals

Acceptability of EQ-5D MID data

In 34 appraisals, HTA agency comments were provided about the acceptability of the MID source and/or thresholds applied by the submitting companies, almost all of which were from Germany (G-BA n = 12, IQWiG n = 19, NICE n = 3). In 2 NICE TAs [30, 69], it was noted that there was a lack of clarity about the MID thresholds applied, and in another it was noted that results should be interpreted cautiously due to small patient sample sizes [32]. In a fourth NICE TA, the Evidence Review Group stated it was “satisfied that the company’s approach to analysing patient-reported outcomes was pre-specified” (including applying a MID of ≥ 0.08 to the EQ-5D-5L utility index) and that the approach was appropriate [31].

However, German HTA agencies were more critical of MID data analyses, particularly in reference to a lack of pre-specification of the MIDs utilised [37, 68, 70] and their source. In 13 TAs, IQWiG criticised the use of Pickard et al. 2007 [18] as the source of MID thresholds for the EQ-VAS, as it was perceived as being unsuitable for assessing the validity of MID [33, 34, 36, 39, 41, 42, 48, 49, 56,57,58, 62, 63]. Consequently, MID analyses were excluded from the benefit assessment. Similarly, in the assessment of daratumumab (Darzalex, Janssen-Cilag International NV) [68], analyses of EQ-VAS data based on MIDs estimated by Hurst et al. 1997 [67] were also considered to be inappropriate and excluded from the benefit assessment, as it was noted that a MID for the EQ-VAS was not examined in Hurst et al. 1997 [67].

The G-BA echoed the opinion of IQWiG that the MID from Pickard et al. 2007 [18] was unsuitable, as the MID was not derived from a longitudinal study [19,20,21,22,23,24,25,26,27]. Furthermore, the G-BA stated that the Eastern Cooperative Oncology Group Performance Scale (ECOG-PS) and Functional Assessment of Cancer Therapy – General (FACT-G) total score anchors used in the study were also not considered by IQWiG to be suitable for deriving a MID; however, the reasoning for this was not provided [19,20,21, 26]. In several cases, IQWiG utilised continuous analyses of EQ-VAS data (e.g., standardised mean differences [a summary statistic where standard deviations are used to standardise results of studies to a single, weighted scale [71]] in EQ-VAS score, expressed as Hedges’ g [an effect size measure representing the standardised difference between means [72]]) instead of responder analyses (the proportion of patients achieving a pre-defined level of improvement [73]) based on a MID [19,20,21,22,23,24, 26, 27, 70]. Nevertheless, the G-BA differed from IQWiG and considered responder analyses using the EQ-VAS in its decision making, citing that responder analyses based on a MID for clinical evaluation of effects have advantages over analyses of standardised mean value differences [19,20,21,22,23,24,25,26, 47, 70].
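To make the contrast between the two analysis types concrete, the sketch below computes both on the same data. It is illustrative only: the EQ-VAS change scores are hypothetical and not taken from any appraisal, the 7-point threshold is simply the lowest EQ-VAS MID reported above, and the Hedges’ g calculation uses the common approximate small-sample correction, which is an assumption rather than the method documented by IQWiG or the G-BA.

```python
from math import sqrt
from statistics import mean, stdev

def hedges_g(treated, control):
    """Standardised mean difference with an approximate small-sample correction."""
    n1, n2 = len(treated), len(control)
    # Pooled standard deviation across both arms.
    pooled_sd = sqrt(
        ((n1 - 1) * stdev(treated) ** 2 + (n2 - 1) * stdev(control) ** 2)
        / (n1 + n2 - 2)
    )
    cohens_d = (mean(treated) - mean(control)) / pooled_sd
    # Approximate correction factor J = 1 - 3 / (4 * (n1 + n2) - 9).
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d * correction

def responder_rate(changes, mid):
    """Proportion of patients whose improvement reaches the MID threshold."""
    return sum(change >= mid for change in changes) / len(changes)

# Hypothetical EQ-VAS changes from baseline in points (assumed data).
treated = [12, 8, 3, 15, 7, -2, 10, 9]
control = [4, 6, -1, 8, 2, 0, 7, 5]
MID = 7  # assumed threshold for illustration

g = hedges_g(treated, control)
treated_rate = responder_rate(treated, MID)
control_rate = responder_rate(control, MID)

print(f"Hedges' g: {g:.2f}")
print(f"Responders: treated {treated_rate:.0%}, control {control_rate:.0%}")
```

The continuous analysis summarises the whole distribution in a single effect size, whereas the responder analysis depends directly on the chosen MID. This is why the suitability of the threshold and its source matters for the latter.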
