Differential item functioning of the HADS and PHQ-9: an investigation of age, gender and educational background in a clinical UK primary care sample.
Background: The Patient Health Questionnaire (PHQ-9) and Hospital Anxiety and Depression Scale (HADS) are commonly used measures in clinical practice and research. It is important that such scales measure the trait they purport to measure and that the impact of measurement artefacts is minimal. This study assesses differential item functioning of these scales by gender, educational background and age.
Methods: Severity of depression and anxiety symptoms was measured in primary care patients referred to mental health workers using the PHQ-9 and HADS. Each scale was assessed for Differential Item Functioning (DIF) and Differential Test Functioning (DTF) by gender, educational background and age. The minimum n per analysis was 895. DIF was assessed with Mantel's χ², the Liu-Agresti cumulative common log odds ratio (LA LOR) and the standardised LA LOR (LA LOR-Z). DTF was assessed in relation to ν².
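As an illustration of the LA LOR named above, the following is a minimal sketch of the Liu-Agresti cumulative common odds ratio estimator for one polytomous item, assuming respondents have already been stratified on a matching score. The function name, the input layout and the example counts are hypothetical and are not taken from the study; the variance estimate needed for LA LOR-Z is omitted.

```python
import math


def liu_agresti_lor(strata):
    """Liu-Agresti cumulative common log odds ratio for one ordinal item.

    strata: list of (reference_counts, focal_counts) pairs, one pair per
    matching-score stratum. Each element of a pair is a list of respondent
    counts per ordered response category (same length in both groups).
    This input layout is an assumption made for illustration.
    """
    numerator = 0.0
    denominator = 0.0
    for ref, foc in strata:
        ref_total, foc_total = sum(ref), sum(foc)
        n_k = ref_total + foc_total
        if n_k == 0:
            continue  # empty stratum contributes nothing
        ref_cum = foc_cum = 0
        # Collapse the 2 x J table into a 2 x 2 table at each of the J-1 cut points.
        for j in range(len(ref) - 1):
            ref_cum += ref[j]
            foc_cum += foc[j]
            a = ref_cum               # reference, response <= category j
            b = ref_total - ref_cum   # reference, response >  category j
            c = foc_cum               # focal, response <= category j
            d = foc_total - foc_cum   # focal, response >  category j
            numerator += a * d / n_k
            denominator += b * c / n_k
    if numerator <= 0 or denominator <= 0:
        raise ValueError("log odds ratio is undefined for these counts")
    return math.log(numerator / denominator)


# Hypothetical counts: two strata, three ordered response categories.
example = [([12, 8, 5], [9, 9, 7]), ([6, 10, 9], [4, 9, 12])]
print(round(liu_agresti_lor(example), 3))
```

With this orientation of the tables, a positive value means the focal group tends towards higher response categories than the reference group at the same matching score; a value near zero indicates little DIF on the item. With only two response categories the estimator reduces to the Mantel-Haenszel common odds ratio.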
Results: PHQ-9, HADS Depression Subscale (HADS-D) and HADS Anxiety Subscale (HADS-A) lacked bias in terms of gender and educational background (ν²<0.07). However, both PHQ-9 and HADS-D exhibited bias with regard to age: PHQ-9 ν²=0.103 (medium effect); HADS-D ν²=0.214 (large effect). PHQ-9 items exhibiting DIF by age covered anhedonia, energy and low mood. HADS-D items exhibiting DIF by age covered psychomotor retardation and interest in appearance.
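The effect-size labels attached to ν² above can be expressed as a simple rule. The sketch below assumes cut-offs of 0.07 and 0.14; only the 0.07 boundary appears in this abstract, and the 0.14 boundary is an assumption based on commonly cited DTF guidelines, chosen to be consistent with the values reported here. The function name is hypothetical.

```python
def classify_dtf(nu_squared, small_cutoff=0.07, large_cutoff=0.14):
    """Label a DTF variance estimate as a small, medium or large effect.

    The 0.14 cut-off is an assumption (not stated in the abstract).
    """
    if nu_squared < small_cutoff:
        return "small"
    if nu_squared <= large_cutoff:
        return "medium"
    return "large"


# Values reported for DTF by age.
for scale, nu2 in [("PHQ-9", 0.103), ("HADS-D", 0.214)]:
    print(scale, nu2, classify_dtf(nu2))
```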
Limitations: No assessment of other potential DIF contributors was made. Conclusions: PHQ-9, HADS-D and HADS-A generally do not exhibit bias for gender and educational background. However, bias was observed in PHQ-9 and HADS-D for age. Caution should be exercised when interpreting scores both in clinical practice and research.