Journal of Pain Research, Volume 19
Psychometric Properties of a Smartphone Application for Measuring Shoulder Active Range of Motion in Individuals with and without Shoulder Pain and Mobility Deficits [Response to the Letter]
Authors: Aafreen A, Khan AR, Ahmad A, Alshehre YM, Alshehri MM, Shaphe MA, Aldhahi MI
Received 16 April 2026
Accepted for publication 23 April 2026
Published 2 May 2026 Volume 2026:19 617431
Aafreen Aafreen,1 Abdur Raheem Khan,2 Ausaf Ahmad,3 Yousef M Alshehre,1 Mohammed M Alshehri,4 Mohammad Abu Shaphe,4 Monira I Aldhahi5
1Department of Health Rehabilitation Sciences, Faculty of Applied Medical Sciences, University of Tabuk, Tabuk, Saudi Arabia; 2Department of Physiotherapy, Integral University, Lucknow, India; 3Department of Community Medicine, Kalyan Singh Government Medical College Bulandshahr, Bulandshahr, UP, India; 4Physical Therapy department, College of Nursing and Health Sciences, Jazan University, Jazan, Saudi Arabia; 5Department of Rehabilitation Sciences, College of Health and Rehabilitation Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Correspondence: Abdur Raheem Khan, Department of Physiotherapy, Integral University, Lucknow, India, Email [email protected]
This is in response to the Letter to the Editor
Dear editor
We sincerely thank Goyal K, Goyal M, and Bathla M for their interest in our article and welcome the opportunity to clarify the methodological choices made in our study.
1. ICC Model Selection: Our reliability analysis used a two-way random-effects, absolute-agreement, single-rater model, commonly denoted ICC(2,1). We selected this model because multiple raters (Raters 1, 2, and 3) evaluated the participants. A two-way random-effects model treats these raters as a representative sample of the broader clinical workforce, so that our inter-rater and intra-rater reliability findings can be generalized to other clinicians in similar professional settings. The single-rater type was chosen to reflect standard clinical practice, in which a patient is typically evaluated by a single physiotherapist.1
2. Criterion Validity and Agreement: Our assessment of criterion validity was not limited to association. As demonstrated in the manuscript, we utilized Bland–Altman analysis as the primary method to evaluate the level of agreement and identify potential systematic bias between the smartphone application and the universal goniometer. Methodological experts emphasize that simple correlation can be highly misleading in agreement studies, as it measures the strength of a relationship rather than the agreement between two clinical methods.2
3. Intra-rater Reliability Interval: The decision to assess intra-rater reliability within a short time interval was a deliberate design choice to isolate the instrument’s stability while minimizing the confounding effects of biological variability and pain fluctuations inherent in symptomatic populations. In clinical measurement research, a short interval is often preferred to ensure the underlying trait being measured remains stable, thereby preventing “true change” from being misidentified as “measurement error”.3
4. Addressing Baseline Confounders: While differences in age and BMI were observed between groups, our study employed a paired-comparison design where each participant served as their own control. In this design, individual characteristics like BMI and age remain constant across both measurement methods, effectively neutralizing their impact on the internal validity of the comparison between the two tools. This within-subject approach is the standard for assessing the degree of agreement between two different instruments.4
5. Sampling and Generalizability: Convenience sampling was used as an efficient and practical strategy to establish the primary psychometric properties of this technology in the demographic most frequently seeking rehabilitation for shoulder mobility deficits. While we targeted the 20–50 age range to provide a robust foundation for this active population, the use of non-probability samples is a valid and widely accepted starting point for psychometric validation before progressing to broader, stratified populations.5
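For readers who want to see the quantities named in points 1 and 2 in concrete form, the following is a minimal Python sketch of how ICC(2,1) and Bland–Altman limits of agreement are conventionally computed. It is not the analysis code used in the study. The function names (`icc_2_1`, `bland_altman`, `pearson_r`) and the angle values are hypothetical and chosen only for illustration; the ICC formula follows the mean-square form given by Shrout and Fleiss,1 and the limits of agreement follow Bland and Altman.2

```python
from math import sqrt
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: one row per participant, one column per rater.
    """
    n, k = len(scores), len(scores[0])  # participants, raters
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)             # between-participants mean square
    jms = ss_cols / (k - 1)             # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation, shown only to contrast it with agreement."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical flexion angles in degrees (NOT study data): the second
# method reads a constant 5 degrees higher than the first.
goniometer = [100.0, 120.0, 140.0, 160.0]
app = [angle + 5.0 for angle in goniometer]

print("Pearson r:", pearson_r(app, goniometer))
print("Bias and limits of agreement:", bland_altman(app, goniometer))
```

In this illustrative case the two methods correlate essentially perfectly, yet the Bland–Altman bias exposes the constant 5-degree offset, which is the distinction between association and agreement drawn in point 2.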
Disclosure
The authors report no conflicts of interest in this communication.
References
1. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420–428. doi:10.1037/0033-2909.86.2.420
2. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310. doi:10.1016/S0140-6736(86)90837-8
3. Polit DF. Getting serious about test-retest reliability: a critique of retest research and some recommendations. Qual Life Res. 2014;23(6):1713–1720. PMID: 24504622. doi:10.1007/s11136-014-0632-9
4. Zaki R, Bulgiba A, Ismail R, Ismail NA. Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: a systematic review. PLoS One. 2012;7(5):e37908. doi:10.1371/journal.pone.0037908
5. Jager J, Putnick DL, Bornstein MH. More than just convenient: the scientific merits of homogeneous convenience samples. Monogr Soc Res Child Dev. 2017;82(2):13–30. doi:10.1111/mono.12296
© 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
