Abstract
Purpose of the Study: Comprehensive measures of disability accommodations have been lacking in national health and aging studies. This article introduces measures of accommodations developed for the National Health and Aging Trends Study, evaluates their reliability, and explores the validity and reliability of hierarchical classification schemes derived from these measures. Design and Methods: We examined test–retest reliability for questions about assistive device use, doing activities less often, and getting help from another person with both percentage agreement and kappa (N = 111). Summary measures across activities and several hierarchical classification schemes (e.g., no accommodation, devices/activity reductions only, help) were developed. For the latter, we also evaluated validity by examining correlations with measures of capacity and demographic characteristics (N = 326). Results: Items about assistive device use and help in the last month were robust (most kappas 0.7–0.9). Activity reduction measures were moderately reliable (around 0.5) but still showed reasonable agreement. Reliabilities for summary measures were good for device use (0.78–0.89) and help (0.62–0.67) but lower, albeit acceptable, for activity reduction (0.53). Hierarchical classifications had acceptable reliability and levels demonstrated hierarchical properties. Implications: National Health and Aging Trends Study’s self-care and mobility accommodation measures offer ample reliability to study adaptation to limitations and can be used to construct a reliable and valid hierarchy.
Key Words: Disability, Measurement, Technology
Broadly construed, disability accommodations include a complex set of behavioral adjustments that individuals make to changes in underlying physical, cognitive, and sensory functioning (Freedman, 2009; Verbrugge & Jette, 1994). Individuals may change how an activity is accomplished by using assistive technology (e.g., a cane or walker) or changing the physical environment in which an activity is performed (e.g., adding a grab bar). At more extreme levels of impairment, accommodation may take the form of assistance from another person. More subtle changes include doing activities more slowly or altering the nature of the activity (e.g., cleaning up at the sink instead of bathing). The frequency of an activity may also be reduced, or an individual may avoid an activity altogether. Individuals who change behavior in this way but report no difficulty are considered to have “preclinical” disability (Fried, Herdman, Kuhn, Rubin, & Turano, 1991).
Although measures of personal assistance are routinely included in national aging and health studies, and questions about devices are common (Cornman, Freedman, & Agree, 2005), comprehensive, task-specific approaches are lacking and no national study has incorporated measures of preclinical disability. Moreover, attempts to assess statistical properties of disability accommodations have been rare (Fried et al., 1996; Miller, Andresen, Malmstrom, Miller, & Wolinsky, 2006). Hence, the reliability of such items remains largely undocumented, and whether different types of accommodations reflect ordered stages (a hierarchy) of response to functional loss remains unexplored.
This study fills this void by examining the reliability of survey questions designed to provide national estimates of accommodations for seven mobility and self-care activities and by exploring the reliability and validity of alternative hierarchical classification schemes. We focus on three types of accommodations: use of assistive devices, changes in the frequency of doing an activity, and help from another person.
Methods
Data
Questions were tested as part of the development of the National Health and Aging Trends Study (NHATS; Kasper & Freedman, 2012), a new national panel study of older adults. During NHATS’s development, an in-person validation study was undertaken to establish the reliability, validity, and statistical properties of its disability protocol. NHATS incorporates a comprehensive model of disability measurement, one that emphasizes the role of accommodations in the disablement process (Freedman, 2009). Previous analysis established the reliability and validity of NHATS’s self-reported measures of physical capacity, activity limitations, and participation restrictions (Freedman et al., 2011). Here, we draw upon the validation study to examine reliability of accommodations measures.
An initial computer-assisted in-person interview (n = 326) was administered to a sample of older adults selected from four locations across the United States and stratified by age to ensure adequate numbers in three age groups: 65–74 (42%), 75–84 (38%), and 85+ (20%). Attention was also paid to recruiting individuals from a list of licensed assisted living facilities (15.6% of the sample was living in residential care). Almost one quarter was recruited based on affirmative responses to questions on the need for help with personal care (eating, bathing, dressing, or getting around inside the home). See Table 1 for additional sample characteristics.
Table 1.
NHATS Validation Study Sample Characteristics
Characteristic | Full interview sample (n = 326), percent/mean | Reinterview sample (n = 111), percent/mean |
---|---|---|
Age | | |
65–74 | 42.0 | 44.1 |
75–84 | 38.3 | 36.9 |
85+ | 19.6 | 18.9 |
Mean age | 77.1 | 76.7 |
Female (vs. male) | 56.4 | 49.6 |
Lives in a residential care setting | 15.6 | 14.4 |
Frail (3–5 frailty criteria met) vs. not frail (0–2 criteria met)a,b | 26.1 | 27.0 |
Mean word recall (0–17)a,c | 7.9 | 8.1 |
Mean mobility score (0–12)a,d | 7.5 | 8.0 |
Note: NHATS = National Health and Aging Trends Study.
aMeasured at initial interview.
bCriteria include unintentional weight loss of 10 or more pounds in the last 12 months; lowest 25th percentile of grip strength (within gender); self-report of low energy; usual walking speed below 0.6 meters per second; and low physical activity as evidenced by self-reports of never walking for exercise and never doing vigorous activities in the last month.
cThe sum of the number of words correctly recalled from a list of 10 nouns immediately after being read and again after a 5-min delay.
dThe Short Physical Performance Battery score was calculated from three physical performance tests: a 4-m usual walking speed test, a rapid chair stand, and several balance tests (Guralnik et al., 1994). Each test was coded from 0 (unable) to 4 (best performance).
An identical reinterview of a randomly selected subsample (n = 111) was conducted 1–4 weeks later; about 70% of follow-up interviews were completed within 2 weeks and all within 4 weeks. The subsample was purposefully sized to provide reasonable precision for estimating reliability (e.g., 95% confidence intervals of width ±0.1 for kappa = 0.9 and ±0.2 for kappa = 0.6, for an outcome with prevalence of 0.20).
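The stated precision can be roughly reproduced with a simple large-sample approximation to the standard error of kappa. The sketch below is illustrative only: the formula, the assumption that both interviews share the same prevalence, and the function name `kappa_ci_half_width` are assumptions made for this illustration, not necessarily the calculation used to size the subsample.

```python
import math


def kappa_ci_half_width(kappa, prevalence, n, z=1.96):
    """Approximate 95% CI half-width for kappa on a yes/no item.

    Assumption: both interviews have the same prevalence, and the simple
    large-sample standard error sqrt(po * (1 - po) / n) / (1 - pe) applies.
    """
    # Agreement expected by chance when both margins equal `prevalence`.
    pe = prevalence ** 2 + (1 - prevalence) ** 2
    # Observed agreement implied by the target kappa.
    po = pe + kappa * (1 - pe)
    se = math.sqrt(po * (1 - po) / n) / (1 - pe)
    return z * se


for target in (0.9, 0.6):
    half_width = kappa_ci_half_width(target, prevalence=0.20, n=111)
    print(f"kappa = {target}: +/- {half_width:.2f}")
```

Under these assumptions, the half-widths come out close to the ±0.1 and ±0.2 quoted above.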
Variables
The interview first identified the presence of assistive environmental features, such as grab bars and a bath/shower seat, in the home. Respondents were also asked if in the last month they used any common mobility devices (a cane, walker, wheelchair, or scooter) and if so, which ones. These items were then used to determine whether questions about the use of particular devices should be asked for specific activities.
The remaining questions were activity specific. Three mobility (getting outside, getting around inside one’s home, getting out of bed) and four self-care (eating, bathing/showering/washing up (hereafter bathing), using the toilet, and dressing) activities were included. For each activity, respondents were asked whether they performed the activity in the last month, how often they used particular devices to accomplish the task (see notes in Table 2 for details), and if anyone helped them. For getting outside, getting around inside, bathing, and dressing, respondents were also asked whether, compared with a year ago, they carried out the activity more often, less often, or about the same. Activity reduction was chosen as a measure of behavior change because of its ease of interpretation and applicability across most activities.
Table 2.
Frequency and Reliability of Mobility Device Use and Presence of Home Modificationsa
Uses/has… | Initial interview (%) | Reinterview (%) | Percent agree | Kappa | 95% confidence interval |
---|---|---|---|---|---|
Mobility devices (used in past month) | | | | | |
Any device | 29.7 | 29.7 | 94.6 | 0.87 | 0.77–0.97 |
Cane | 18.9 | 19.8 | 95.5 | 0.86 | 0.73–0.98 |
Walker | 12.6 | 12.6 | 96.4 | 0.84 | 0.68–0.99 |
Wheelchair | 9.0 | 8.1 | 97.3 | 0.83 | 0.64–1.00 |
Scooter | 3.6 | 3.6 | 100.0 | 1.00 | 1.00–1.00 |
Environmental accommodations (have) | | | | | |
Grab bars in the shower | 45.1 | 44.1 | 90.1 | 0.80 | 0.69–0.91 |
Shower/bath seat | 42.3 | 44.1 | 89.2 | 0.78 | 0.66–0.90 |
Raised toilet seat | 71.2 | 74.8 | 87.4 | 0.68 | 0.53–0.84 |
Grab bars around the toilet | 80.2 | 77.5 | 93.7 | 0.81 | 0.68–0.95 |
aReinterview sample (n = 111).
We created the following measures by summarizing over activity-specific accommodations: (a) use of devices for any self-care activity, any mobility activity, and any self-care or mobility activity; (b) help from another person with any self-care activity, any mobility activity, and any self-care or mobility activity; and (c) whether the respondent performed either a self-care or mobility activity less often compared with a year ago.
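As a minimal sketch of how such summary measures could be derived, the code below collapses activity-specific yes/no reports into "any" indicators. The activity keys, the dictionary layout, and the function names are hypothetical and chosen for illustration; they are not NHATS variable names.

```python
# Hypothetical activity keys; not NHATS variable names.
SELF_CARE = ("eating", "bathing", "toileting", "dressing")
MOBILITY = ("getting_outside", "getting_around_inside", "getting_out_of_bed")


def any_reported(responses, activities):
    """True if the accommodation was reported for at least one listed activity."""
    return any(responses.get(activity, False) for activity in activities)


def summarize(device, help_, less_often):
    """Collapse activity-specific reports into the summary measures (a)-(c)."""
    all_activities = SELF_CARE + MOBILITY
    return {
        "device_self_care": any_reported(device, SELF_CARE),
        "device_mobility": any_reported(device, MOBILITY),
        "device_any": any_reported(device, all_activities),
        "help_self_care": any_reported(help_, SELF_CARE),
        "help_mobility": any_reported(help_, MOBILITY),
        "help_any": any_reported(help_, all_activities),
        # Activity reduction was asked for only four activities; activities
        # that were not asked are simply absent from `less_often`.
        "reduced_any": any_reported(less_often, all_activities),
    }


# Illustrative respondent: uses a device for bathing, gets help going outside.
example = summarize(
    device={"bathing": True},
    help_={"getting_outside": True},
    less_often={"bathing": False, "dressing": False},
)
print(example)
```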
To capture potential stages of adopting accommodations, we used the summary measures for any self-care or mobility activity to create three hierarchical classification schemes. The first includes three mutually exclusive categories commonly used in the literature: no accommodations, uses devices only (but no help), and receives help for any self-care or mobility activity (Scheme I). The second classification broadens the middle category to include activity reduction (Scheme II), and the third treats activity reduction as a distinct adaptation (Scheme III). To validate the classification schemes, we examined key demographic characteristics (age, sex, residential care status) along with physical and cognitive performance constructs of frailty (Fried et al., 2001), mobility (using the Short Physical Performance Battery; Guralnik et al., 1994), and memory (using a 10-word recall; Ofstedal, Fisher, & Herzog, 2005; see Table 1 for details).
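The three schemes can be sketched as a single classification function over the "any self-care or mobility activity" summary flags. This is an illustrative reading, not the study's code: the level labels and function name are hypothetical, and the placement of respondents who report both device use and activity reduction (but no help) in the activity-reduction level of Scheme III is inferred from the category ordering and the Table 4 percentages.

```python
def classify(device_any, help_any, reduced_any):
    """Return a respondent's level under Schemes I, II, and III."""
    if help_any:
        # Help from another person is the highest level in every scheme.
        return {"I": "help", "II": "help", "III": "help"}

    # Scheme I ignores activity reduction entirely.
    scheme_i = "devices only" if device_any else "none"

    # Scheme II widens the middle category to include activity reduction.
    if device_any or reduced_any:
        scheme_ii = "devices or activity reduction only"
    else:
        scheme_ii = "none"

    # Scheme III treats activity reduction as its own level above devices
    # (assumption: reduction takes precedence when both are reported).
    if reduced_any:
        scheme_iii = "activity reduction, no help"
    elif device_any:
        scheme_iii = "devices only"
    else:
        scheme_iii = "none"

    return {"I": scheme_i, "II": scheme_ii, "III": scheme_iii}


# A respondent who reduced activity but uses no devices and gets no help
# has no accommodation under Scheme I and a middle level under II and III.
print(classify(device_any=False, help_any=False, reduced_any=True))
```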
Analysis
We assessed reliability with the percentage agreement between initial and reinterview responses and kappa coefficients (which adjust for the extent of agreement expected by chance alone). Even over the brief period of our study, some true change in the measures being evaluated may occur, so 100% agreement or kappas of 1.0 are not expected. Kappas greater than 0.6 are considered acceptable and those 0.8 or higher are considered excellent, but for rare outcomes that may change in a brief period, kappas of 0.5–0.6 are not unusual (Maclure & Willett, 1987).
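Both statistics can be computed directly from paired responses. The following is a minimal sketch of percentage agreement and Cohen's (unweighted) kappa for any set of nominal categories; the function names and the toy responses are illustrative assumptions, not study data.

```python
from collections import Counter


def percent_agreement(first, second):
    """Percentage of respondents giving the same answer at both interviews."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def cohen_kappa(first, second):
    """Unweighted kappa: (po - pe) / (1 - pe).

    Returns None when chance agreement is 1 (e.g., everyone gives the same
    answer at both interviews), because kappa is then undefined.
    """
    n = len(first)
    po = sum(a == b for a, b in zip(first, second)) / n
    first_counts = Counter(first)
    second_counts = Counter(second)
    # Chance agreement: product of the two marginal proportions, summed
    # over categories.
    pe = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if pe == 1:
        return None
    return (po - pe) / (1 - pe)


# Toy data: 10 respondents, 2 "yes" at each interview, 8 of 10 agree.
initial = ["yes", "yes"] + ["no"] * 8
reinterview = ["yes", "no"] + ["no"] * 7 + ["yes"]
print(percent_agreement(initial, reinterview))
print(cohen_kappa(initial, reinterview))
```

In the toy data, agreement is high while kappa is modest, because with a rare outcome most of the agreement is what chance alone would produce.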
To validate the hierarchical classification schemes, we estimated correlations with key variables (and means/percentages for each level). We expected individuals at higher levels to be older, to be more likely to be women and to live in a residential care setting, and to have poorer physical and cognitive capacity.
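A check of this kind can be sketched as follows: code the hierarchy levels as ordered integers, correlate them with a covariate, and tabulate the covariate mean at each level. The type of correlation is not stated in the text, so the use of Pearson's r here is an assumption, and the data are invented for illustration.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_by_level(levels, values):
    """Mean of `values` within each hierarchy level, in level order."""
    groups = defaultdict(list)
    for level, value in zip(levels, values):
        groups[level].append(value)
    return {level: sum(v) / len(v) for level, v in sorted(groups.items())}


# Invented data: 0 = no accommodation, 1 = devices only, 2 = help.
levels = [0, 0, 1, 1, 2, 2]
ages = [70, 72, 76, 78, 82, 84]

print(mean_by_level(levels, ages))  # means should rise with level
print(pearson_r(levels, ages))      # a positive r supports the hierarchy
```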
Results
Reliability Analysis
Questions about device use that determine whether subsequent activity-specific questions should be asked have very good reliability (Table 2). For instance, 94.6% of respondents provide the same answer to the global mobility device question (kappa = 0.87), and agreement for specific types of devices ranges from 95.5% to 100.0%. Questions regarding the presence of home modifications also show good reliability, ranging from kappa = 0.68 for a raised toilet seat to 0.80–0.81 for grab bars.
Prevalence and reliability for the three types of accommodations (device use, help, and activity reduction) are presented in Table 3. Focusing on activity-specific device use, agreement is 89% or higher. Kappas for bathing, toileting, getting outside, and getting around inside are very good (0.77–0.86). The kappa for using a device to get out of bed is 0.53 and for getting dressed only 0.30, in part because the prevalence of these items is quite low (<10%); kappa could not be computed for eating because no one reported using devices to eat. For activity-specific help, percent agreement is 91% or higher, and kappas for eating, getting dressed, toileting, and getting out of bed are 0.70 or higher. The remaining kappas range from 0.59 (getting around inside) to 0.64 (bathing). Finally, the agreement for activity reduction ranges from 83% to 95%, with kappas in the range of 0.37–0.53.
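The effect of low prevalence on kappa can be illustrated by rebuilding the 2 × 2 agreement tables implied by the percentages reported in Table 3 for device use in dressing and bathing. This is a sketch under the assumption that the reported percentages correspond to whole-number counts out of 111; the function names are hypothetical.

```python
def two_by_two_from_report(n, pct_first, pct_second, pct_agree):
    """Rebuild cell counts (a, b, c, d) from reported percentages.

    a = yes at both interviews, b = yes then no,
    c = no then yes,           d = no at both interviews.
    """
    yes_first = round(n * pct_first / 100)
    yes_second = round(n * pct_second / 100)
    agree = round(n * pct_agree / 100)
    disagree = n - agree  # equals b + c
    a = (yes_first + yes_second - disagree) // 2
    return a, yes_first - a, yes_second - a, agree - a


def kappa_from_cells(a, b, c, d):
    """Cohen's kappa for a 2 x 2 agreement table."""
    n = a + b + c + d
    po = (a + d) / n
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (po - pe) / (1 - pe)


# Percentages taken from Table 3 (device use, reinterview sample n = 111).
reported = {
    "Getting dressed": (9.0, 6.3, 90.1),
    "Bathing": (36.9, 38.7, 89.2),
}
for activity, (first, second, agree) in reported.items():
    cells = two_by_two_from_report(111, first, second, agree)
    print(activity, cells, round(kappa_from_cells(*cells), 2))
```

Under this assumption the two items have nearly identical percent agreement, yet the reconstructed kappas are about 0.30 for dressing and 0.77 for bathing, in line with Table 3: when few respondents report an accommodation, most agreement is on "no" and is largely expected by chance.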
Table 3.
Frequency and Reliability of Accommodations for Self-Care or Mobility Activitiesa
Accommodation / activity | Initial interview (%) | Reinterview (%) | Percent agree | Kappa | 95% confidence interval |
---|---|---|---|---|---|
In the last month ever used a deviceb for… | | | | | |
Eating | 0.0 | 0.0 | 100.0 | — | — |
Bathing/showering/washing up | 36.9 | 38.7 | 89.2 | 0.77 | 0.65–0.89 |
Getting dressed | 9.0 | 6.3 | 90.1 | 0.30 | 0.00–0.61 |
Toileting | 27.0 | 25.2 | 94.6 | 0.86 | 0.75–0.97 |
Going outside | 24.3 | 28.8 | 93.7 | 0.84 | 0.72–0.95 |
Getting around inside home | 22.5 | 21.6 | 93.7 | 0.82 | 0.69–0.95 |
Getting out of bed | 9.0 | 9.9 | 91.9 | 0.53 | 0.26–0.80 |
In the last month ever got help with… | | | | | |
Eating | 5.4 | 5.4 | 98.2 | 0.82 | 0.59–1.00 |
Bathing/showering/washing up | 7.2 | 6.3 | 95.5 | 0.64 | 0.35–0.93 |
Getting dressed | 10.8 | 9.0 | 94.6 | 0.70 | 0.47–0.93 |
Toileting | 2.7 | 1.8 | 99.1 | 0.80 | 0.41–1.00 |
Going outside | 15.3 | 11.7 | 91.0 | 0.62 | 0.40–0.83 |
Getting around inside home | 5.4 | 6.3 | 95.5 | 0.59 | 0.27–0.92 |
Getting out of bed | 2.7 | 2.7 | 100.0 | 1.00 | 1.00–1.00 |
Compared with a year ago, did activity less often | | | | | |
Bathing/showering/washing up | 8.1 | 5.4 | 93.7 | 0.50 | 0.18–0.82 |
Getting dressed | 4.5 | 4.5 | 94.6 | 0.37 | 0.02–0.77 |
Going outside | 19.8 | 20.7 | 82.9 | 0.47 | 0.27–0.67 |
Going around inside home | 8.1 | 10.8 | 91.9 | 0.53 | 0.26–0.80 |
aReinterview sample (n = 111).
bDevices include adapted utensils (eating); grab bars and shower/bath seat (bathing); special items to help get dressed (dressing); grab bars, raised toilet seat (toileting); cane, walker, wheelchair, scooter (getting around inside, getting outside); and cane and walker (getting out of bed).
The reliability of the summary measures and hierarchical classification schemes is also good. Kappas for summary measures of device use range from 0.78 to 0.89 and for help from 0.62 to 0.67. They are somewhat lower, albeit still acceptable, for activity reduction (0.53; see Table 4). Hierarchical Scheme I shows 78% agreement and kappa = 0.65. When activity reduction is combined with device use (Scheme II), agreement and kappa are 73% and 0.58; treating activity reduction separately (Scheme III) results in slightly lower estimates (67% agreement and kappa = 0.53).
Table 4.
Frequency and Reliability of Accommodations for Self-Care or Mobility Activities: Summary Measures and Hierarchical Classification Schemesa
Measure | Initial interview (%) | Reinterview (%) | Percent agree | Kappa | 95% confidence interval |
---|---|---|---|---|---|
Summary measures | | | | | |
In the last month ever used a device for… | | | | | |
Any self-care activity | 46.0 | 46.0 | 89.2 | 0.78 | 0.67–0.90 |
Any mobility activity | 26.1 | 28.8 | 95.5 | 0.89 | 0.79–0.98 |
Any self-care or mobility activity | 51.4 | 53.2 | 89.2 | 0.78 | 0.67–0.90 |
In the last month ever got help with… | | | | | |
Any self-care activity | 13.5 | 11.7 | 92.8 | 0.67 | 0.46–0.88 |
Any mobility activity | 17.1 | 13.5 | 91.0 | 0.65 | 0.46–0.85 |
Any self-care or mobility activity | 23.4 | 18.0 | 87.4 | 0.62 | 0.44–0.80 |
Compared to a year ago, did activity less often | | | | | |
Any self-care or mobility activity | 23.4 | 24.3 | 82.9 | 0.53 | 0.34–0.72 |
Hierarchical classification | | | | | |
Scheme I (all levels) | | | 77.5 | 0.65 | 0.59–0.79 |
No accommodation | 46.0 | 46.9 | | | |
Use of devices only | 30.6 | 35.1 | | | |
Receipt of any assistance | 23.4 | 18.0 | | | |
Scheme II (all levels) | | | 73.0 | 0.58 | 0.52–0.65 |
No accommodation | 37.8 | 43.2 | | | |
Use of devices or activity reduction only | 38.7 | 38.7 | | | |
Receipt of any assistance | 23.4 | 18.0 | | | |
Scheme III (all levels) | | | 66.7 | 0.53 | 0.47–0.66 |
No accommodation | 37.8 | 43.2 | | | |
Use of devices only | 26.1 | 22.5 | | | |
Activity reduction but no assistance | 12.6 | 16.2 | | | |
Receipt of any assistance | 23.4 | 18.0 | | | |
aReinterview sample (n = 111).
Validation of Hierarchical Classifications
Mean age, the percentage living in residential care, and the percent frail all increase with each level of Scheme I (which excludes behavior change), whereas mean word recall and mobility scores decrease (Table 5). Correlations are also in the expected direction and were especially strong for mobility score (−0.53) and age (0.35). Because only a small proportion of cases shift categories between Schemes I and II, correlations and patterns are substantially similar. Scheme III shows similar correlations with mobility score and age, and even stronger correlations with frailty and cognition; however, patterns for gender and residential care are irregular. To explore this last finding, we replicated Table 5 with NHATS (N = 7,609; data not shown) and found no such anomalies; instead, all three schemes conformed to the hierarchy.
Table 5.
Construct and Convergent Validity of Hierarchical Classification Schemes for Self-Care and Mobility Accommodationsa
Hierarchical classification | r, p: Age | r, p: Female | r, p: Facility | r, p: Frailty | r, p: Word recall score | r, p: Mobility score | Mean age | % Female | % Facility | % Frail | Mean word recall score | Mean mobility score | N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Scheme I (all levels) | 0.35, <.001 | 0.17, .002 | 0.33, <.001 | 0.24, <.001 | −0.23, <.001 | −0.53, <.001 | | | | | | | |
No accommodation | | | | | | | 73.7 | 43.1 | 3.5 | 15.5 | 8.7 | 9.2 | 116 |
Use of devices only | | | | | | | 77.8 | 63.9 | 14.6 | 24.6 | 8.1 | 7.8 | 130 |
Receipt of any assistance | | | | | | | 81.0 | 63.8 | 35.0 | 43.8 | 6.6 | 4.8 | 80 |
Scheme II (all levels) | 0.36, <.001 | 0.19, 0.01 | 0.32, <.001 | 0.25, <.001 | −0.26, <.001 | −0.54, <.001 | | | | | | | |
No accommodation | | | | | | | 73.4 | 39.8 | 3.1 | 13.3 | 9.1 | 9.3 | 98 |
Use of devices or reduced activity level only | | | | | | | 77.5 | 63.5 | 13.5 | 25.0 | 7.9 | 7.9 | 148 |
Receipt of any assistance | | | | | | | 81.0 | 63.8 | 35.0 | 43.8 | 6.6 | 4.8 | 80 |
Scheme III (all levels) | 0.36, <.001 | 0.19, .001 | 0.30, <.001 | 0.27, <.001 | −0.29, <.001 | −0.54, <.001 | | | | | | | |
No accommodation | | | | | | | 73.4 | 39.8 | 3.1 | 13.3 | 9.1 | 9.3 | 98 |
Use of devices only | | | | | | | 77.0 | 60.2 | 15.5 | 21.4 | 8.4 | 8.1 | 103 |
Activity reduction but no assistance | | | | | | | 78.8 | 71.1 | 8.9 | 33.3 | 6.7 | 4.8 | 45 |
Receipt of any assistance | | | | | | | 81.0 | 63.8 | 35.0 | 43.8 | 6.6 | 4.8 | 80 |
aFull interview sample (n = 326).
Discussion
We used a test–retest methodology to demonstrate that task-specific measures of the use of help and assistive devices are highly reliable, and self-reported changes in behavior have somewhat lower, but still reasonable, agreement. Summaries across activities also showed acceptable reliability.
Alternative classification schemes that rank individuals into mutually exclusive categories based on types of accommodations offer ample reliability, and proposed levels appear to form a valid hierarchy. Schemes that included activity reduction were slightly less reliable than a hierarchy that considered only device use and help because reports of reduced activity levels were less consistent than other accommodations, but patterns of association with key correlates were similar. Treating activity reduction as distinct from device use resulted in the strongest correlations with cognition and frailty, suggesting preclinical disability is a unique phase between use of devices and receipt of assistance.
This study has several limitations. The sample was relatively small and purposeful by design; however, we confirmed validity analyses with NHATS. In addition, the classic test–retest framework for assessing reliability is necessarily limited because there will always be some degree of true change in accommodations over even a brief time period. Dynamic measurement properties, such as responsiveness, could not be evaluated. Moreover, further research using modern measurement techniques such as item response theory is needed to determine if a scale of adaptation can be developed. Finally, we could not explore with these data why activity reduction items are less reliable than reports of other accommodations, but we speculate the longer reference period may be a factor and worthy of future investigation.
Nevertheless, this analysis suggests that measures available from a new national study of late-life disability trends and dynamics offer ample reliability and validity for studying adaptation to functioning. These measures may be used in new ways to explore the consequences of functioning for older individuals. For instance, when used in combination with measures of difficulty, researchers can distinguish older adults who have successfully accommodated declines in capacity from those who are a target for (further) accommodation. Researchers interested in the disablement process can also use these new measures to study earlier (preclinical) phases than previously possible on a national scale and predictors of progression through the accommodations hierarchy. Finally, scientists will be able to track in a much more comprehensive way late-life trends in the use of accommodations and relationships to late-life disability trends.
Funding
This research was supported by the National Institute on Aging at the National Institutes of Health (Cooperative Agreement 1U01AG032947).
Acknowledgments
The views expressed are those of the authors alone and do not represent those of the funding agency or the authors’ institutions.
References
- Cornman J. C., Freedman V. A., Agree E. M. (2005). Measurement of assistive device use: Implications for estimates of device use and disability in late life. The Gerontologist, 45, 347–358. doi:10.1093/geront/45.3.347
- Freedman V. A. (2009). Adopting the ICF language for studying late-life disability: A field of dreams? Journal of Gerontology: Medical Sciences, 64, M1172–M1174; discussion 1175. doi:10.1093/gerona/glp095
- Freedman V. A., Kasper J. D., Cornman J. C., Agree E. M., Bandeen-Roche K., Mor V., et al. (2011). Validation of new measures of disability and functioning in the National Health and Aging Trends Study. Journal of Gerontology: Medical Sciences, 66, M1013–M1021. doi:10.1093/gerona/glr087
- Fried L. P., Bandeen-Roche K., Williamson J. D., Prasada-Rao P., Chee E., Tepper S., Rubin G. S. (1996). Functional decline in older adults: Expanding methods of ascertainment. Journal of Gerontology: Medical Sciences, 51, M206–M214. doi:10.1093/gerona/51A.5.M206
- Fried L. P., Herdman S. J., Kuhn K. E., Rubin G., Turano K. (1991). Preclinical disability: Hypotheses about the bottom of the iceberg. Journal of Aging and Health, 3, 285–300. doi:10.1177/089826439100300210
- Fried L. P., Tangen C. M., Walston J., Newman A. B., Hirsch C., Gottdiener J., et al.; Cardiovascular Health Study Collaborative Research Group. (2001). Frailty in older adults: Evidence for a phenotype. Journal of Gerontology: Medical Sciences, 56, M146–M156. doi:10.1093/gerona/56.3.M146
- Guralnik J. M., Simonsick E. M., Ferrucci L., Glynn R. J., Berkman L. F., Blazer D. G., et al. (1994). A short physical performance battery assessing lower extremity function: Association with self-reported disability and prediction of mortality and nursing home admission. Journal of Gerontology, 49, M85–M94. doi:10.1093/geronj/49.2.M85
- Kasper J. D., Freedman V. A. (2012). National Health and Aging Trends Study Round 1 User Guide: Final release. Baltimore: Johns Hopkins University School of Public Health. Retrieved from http://www.NHATS.org
- Maclure M., Willett W. C. (1987). Misinterpretation and misuse of the kappa statistic. American Journal of Epidemiology, 126, 161–169. doi:10.1093/aje/126.2.161
- Miller D. K., Andresen E. M., Malmstrom T. K., Miller J. P., Wolinsky F. D. (2006). Test-retest reliability of subclinical status for functional limitation and disability. Journal of Gerontology: Social Sciences, 61, S52–S56.
- Ofstedal M. B., Fisher G. G., Herzog A. R. (2005). Documentation of Cognitive Functioning Measures in the Health and Retirement Study (HRS Documentation Report DR-006). Retrieved from http://hrsonline.isr.umich.edu/sitedocs/userg/dr-006.pdf
- Verbrugge L. M., Jette A. M. (1994). The disablement process. Social Science & Medicine, 38, 1–14.