
Assessment of validity and reliability of the feedback quality instrument | BMC Research Notes


Study design

The study involved translating the tool following the WHO translation steps; assessing its face, content, and construct validity; and estimating its internal consistency and inter-rater reliability.

Description of the tool

The feedback quality instrument (FQI) was designed by Johnson et al. (2021), who granted permission for its use. The questionnaire includes 25 questions in five main domains: 4 questions for Set the Scene, 7 for Analyze Performance, 4 for Plan Improvement, 4 for Foster Learner Agency, and 5 for Foster Psychological Safety. The first three domains occur sequentially, whereas the last two run throughout a single feedback encounter. The first domain, Set the Scene, ensures a strong start by introducing key factors that will influence the interaction from the outset. Analyze Performance helps the learner gain a better understanding of the desired performance and how their performance is measured. Items in Plan Improvement involve choosing significant learning objectives and creating efficient improvement strategies tailored to the individual. The other two domains develop throughout the feedback encounter; Foster Learner Agency encourages learner autonomy through involvement, motivation, and active learning. Overall, the tool provides a set of explicit descriptions of helpful behaviors to guide educators in clinical workplace feedback [8].

Each item is rated on a 3-point Likert scale: 0 = not done, 1 = sometimes done, 2 = done consistently.

The feedback quality score based on this questionnaire can be between 0 and 50.
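
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `fqi_total_score` and the input format (a flat list of 25 item ratings) are assumptions, not part of the published instrument.

```python
# Minimal sketch of FQI total scoring, assuming one rating per item.
# VALID_RATINGS and N_ITEMS follow the description in the text;
# the function name and input format are hypothetical.
VALID_RATINGS = {0, 1, 2}  # 0 = not done, 1 = sometimes done, 2 = done consistently
N_ITEMS = 25


def fqi_total_score(ratings):
    """Return the total feedback quality score (0 to 50) for one encounter."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        if rating not in VALID_RATINGS:
            raise ValueError(f"invalid rating: {rating!r}")
    return sum(ratings)
```

With every item rated 2 the total is 50, and with every item rated 0 it is 0, which matches the stated range.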

Translation

Since the study was conducted in a Persian-speaking country, the English FQI had to be translated into Persian. The questionnaire was translated using the WHO 4-step translation methodology. A translator with prior medical training experience and proficiency in the field and in interview protocols performed the forward translation, focusing on conveying concepts rather than literal wording and on phrasing the questions directly and succinctly. Following this, a panel of experts reviewed the translation to identify and add any missing words. Subsequently, an independent translator who had not seen the original questionnaire back-translated the tool into English.

In the second phase, the translation was finalized following pre-testing and interviews. As questionnaire advisors, ten participants were asked to familiarize themselves with the tool and describe their response process. They were also asked about any offensive or difficult terms and possible synonyms for them. The final Persian version of the tool was then developed, incorporating this additional information and the pre-test report.

Validity

After translation, a group of specialists in medical education assessed the Persian version of the FQI for content and face validity.

Face validity

Qualitative face validity was assessed based on expert comments on several criteria, including the association of the questions with the intended concepts, appropriate wording, difficulty, ambiguity, syntax, the appropriateness of the question order, and the importance of the questions [14]. The tool was revised accordingly. Quantitative face validity was also assessed using a 3-point Likert scale for each item: respondents rated an item 3 if they completely agreed with its intended concept and 1 if they disagreed. The impact score, based on the percentage of participants who rated an item's importance as 2 or 3, was higher than 1.5 for all questionnaire items, indicating an acceptable level of face validity.
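
As a rough illustration of how an item impact score can exceed 1.5, the sketch below assumes the commonly used formula, impact score = frequency × importance, where frequency is the proportion of respondents rating the item 2 or 3 and importance is the mean rating. The text does not spell out the formula, so this is an assumption, and the function name `impact_score` is hypothetical.

```python
# Sketch of an item impact score under an ASSUMED formula:
#   impact = (proportion of ratings counted as "important") * (mean rating)
# The set of "important" ratings (2 or 3) follows the text; the formula itself
# is the conventional one and is not confirmed by the source.
def impact_score(ratings, important=(2, 3)):
    """Return the impact score for one item from its 1-3 ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    n = len(ratings)
    frequency = sum(1 for r in ratings if r in important) / n
    importance = sum(ratings) / n
    return frequency * importance


# An item would pass the threshold quoted in the text if its score is > 1.5.
passes = impact_score([3, 3, 2, 3, 2]) > 1.5
```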

Content validity

The content validity ratio (CVR) was calculated to assess the agreement of the expert panel on the necessity and usefulness of each item, using the method proposed by Lawshe [15]. The content validity index (CVI) was also calculated to assess the relevance and clarity of each item. Content validity was ascertained by consulting 16 medical education experts at Iran University of Medical Sciences (IUMS). Based on their feedback, problems with difficulty, irrelevance, unclear phrasing, and misunderstood word meanings were identified and addressed through partial modification of the questions. The experts evaluated the questionnaire by rating each question's importance on a three-part scale (never, usually, and always). An item was approved if it received a score of 0.79 or higher [16].
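
The two indices can be sketched as follows. The CVR follows Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating the item essential and N is the panel size. The item-level CVI is taken here as the proportion of experts rating the item relevant, which is the usual definition but an assumption with respect to this study; the function names are hypothetical.

```python
# Sketch of content validity indices for a single item.
# Inputs are expert counts; how ratings map to "essential" or "relevant"
# is an assumption left to the caller.
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 (no expert agrees) to +1 (all agree)."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("counts out of range")
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(n_relevant, n_experts):
    """Item-level CVI: proportion of experts rating the item relevant."""
    if n_experts <= 0 or not 0 <= n_relevant <= n_experts:
        raise ValueError("counts out of range")
    return n_relevant / n_experts


# With a 16-member panel, 13 "relevant" ratings clear the 0.79 cut-off,
# while 12 do not.
accepted = item_cvi(13, 16) >= 0.79
rejected = item_cvi(12, 16) >= 0.79
```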

Construct validity

The pre-final version was tested on 15 medical interns. As a rule of thumb, the number of subjects per variable may vary from 4 to 10. In this study, the sample size was calculated using 5 subjects per variable [17].

The Persian tool was administered to 120 medical students to evaluate its construct validity and reliability. The students were asked to complete the questionnaire based on feedback they had received from teachers. The study took place at Ali Asghar Hospital, an academic medical center affiliated with IUMS. Construct validity was assessed through confirmatory factor analysis and chi-squared values. Chi-squared values should be interpreted with caution because they can be inflated by large sample sizes and increased degrees of freedom. Confirmatory factor analysis was also used to assess the internal consistency between items [18]. AMOS version 26 was used for the statistical analysis.

Reliability

To measure the stability of the tool over time, the intraclass correlation coefficient (ICC) was calculated [19]. Cronbach's alpha was calculated to assess internal consistency. The ICC was computed in SPSS version 16.
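
For readers who want to see what the internal consistency statistic measures, a minimal sketch of Cronbach's alpha follows, using the standard formula alpha = k/(k-1) × (1 - Σ item variances / variance of total scores). The study itself used SPSS; this Python version, its function name, and its input layout (one list of item ratings per respondent) are illustrative assumptions.

```python
from statistics import variance


# Sketch of Cronbach's alpha using sample variances (n - 1 denominator).
# `scores` is a list of respondents, each a list of k item ratings.
def cronbach_alpha(scores):
    """Return Cronbach's alpha for a respondents-by-items score table."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Values closer to 1 indicate that the items vary together across respondents; two perfectly correlated items, for example, give an alpha of 1.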


