This example uses the National Study of Long-Term Care Providers (NSLTCP) Residential Care Community (RCC) Services User (SU) 2018 Public Use File (PUF) to replicate the estimates from a report called Residential Care Community Resident Characteristics: United States, 2018. “The survey used a sample of residential care community residents, obtained from a frame that was constructed from lists of licensed residential care communities acquired from the state licensing agencies in each of the 50 states and the District of Columbia.”

The RCC SU 2018 survey comes with the surveytable package, for use in examples, in an object called rccsu2018.

Begin

Begin by loading the surveytable package.

library(surveytable)

Now, specify the survey that you’d like to analyze.

set_survey(rccsu2018)
Survey info {RCC SU 2018 PUF}
Variables Observations Design
81 904 Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1)

Check the survey name, survey design variables, and the number of observations to verify that it all looks correct.

For this example, we turn on certain NCHS-specific options, such as identifying low-precision estimates. If you do not need to identify low-precision estimates, you can skip this command. To turn on the NCHS-specific options:

set_opts(mode = "NCHS")
## * Mode: NCHS.

Alternatively, you can combine these two commands into a single command, like so:

set_survey(rccsu2018, mode = "NCHS")
## * Mode: NCHS.
Survey info {RCC SU 2018 PUF}
Variables Observations Design
81 904 Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1)

Figure 1

This figure shows the percentage of residents by sex, race / ethnicity, and age group.

Sex.

tab("sex")
Resident’s gender {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Male 272 299 24 255 352 32.6 2.5 27.7 37.7
Female 632 619 26 570 673 67.4 2.5 62.3 72.3
N = 904. Checked NCHS presentation standards. Nothing to report.

Race / ethnicity.

var_list("race")
Variables beginning with ‘race’ {RCC SU 2018 PUF}
Variable Class Long name
raceeth2 factor Resident’s race/ethnicity
tab("raceeth2")
Resident’s race/ethnicity {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL Flags
White 816 821 23 776 868 89.3 1.8 85.4 92.6
Black 40 54 14 31 95 5.9 1.5 3.3 9.6
Hispanic 23 18 5 9 34 1.9 0.6 1.0 3.4
Other 25 26 8 12 55 2.8 0.9 1.3 5.3 Cx
N = 904. Checked NCHS presentation standards: Cx: suppress count (and rate).

In the published figure, the Hispanic and Other categories have been merged into a single category called “Another race or ethnicity”. We can do that using the var_collapse() function.

var_collapse("raceeth2"
             , "Another race or ethnicity"
             , c("Hispanic", "Other"))
tab("raceeth2")
Resident’s race/ethnicity {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
White 816 821 23 776 868 89.3 1.8 85.4 92.6
Black 40 54 14 31 95 5.9 1.5 3.3 9.6
Another race or ethnicity 48 44 10 27 70 4.8 1.1 2.9 7.3
N = 904. Checked NCHS presentation standards. Nothing to report.
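
For intuition, collapsing levels amounts to relabeling the levels of a factor. The base R sketch below does this on a small made-up vector. The data are hypothetical, and this is an illustration of the idea rather than a description of how var_collapse() works internally.

```r
# Hypothetical toy data; not taken from the survey.
race <- factor(c("White", "Black", "Hispanic", "Other", "White"),
               levels = c("White", "Black", "Hispanic", "Other"))

# Give both old levels the same new label; R merges levels that share a label.
old <- c("Hispanic", "Other")
levels(race)[levels(race) %in% old] <- "Another race or ethnicity"

collapsed_counts <- table(race)
print(collapsed_counts)
```

Unlike tab(), this plain table() call gives unweighted counts and does not account for the survey design.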

Age group.

var_list("age")
Variables beginning with ‘age’ {RCC SU 2018 PUF}
Variable Class Long name
age2 numeric Resident’s age

age2 is a numeric variable. We need to create a categorical variable based on this numeric variable. This is done using the var_cut() function.

var_cut("Age", "age2"
        , c(-Inf, 64, 74, 84, Inf)
        , c("Under 65", "65-74", "75-84", "85 and over") )
tab("Age")
Age {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Under 65 75 69 11 49 96 7.5 1.2 5.2 10.3
65-74 98 111 17 82 151 12.1 1.8 8.8 16.1
75-84 221 235 22 195 282 25.5 2.2 21.3 30.2
85 and over 510 504 26 456 557 54.9 2.6 49.7 60.0
N = 904. Checked NCHS presentation standards. Nothing to report.
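
To see where the break points fall, here is a minimal base R sketch that uses cut() with the same breaks and labels. It assumes that var_cut() follows the same right-closed interval convention as cut(), which is consistent with the labels above but is not confirmed here. The ages are made up.

```r
# Hypothetical ages chosen to sit on either side of each break.
age <- c(64, 65, 74, 75, 84, 85)

# cut() intervals are right-closed by default:
# (-Inf, 64], (64, 74], (74, 84], (84, Inf).
age_group <- cut(age,
                 breaks = c(-Inf, 64, 74, 84, Inf),
                 labels = c("Under 65", "65-74", "75-84", "85 and over"))

print(data.frame(age, age_group))
```

Under this convention, a break of 64 places age 64 in "Under 65" and age 65 in "65-74".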

Figure 2

This figure shows the percentage of residents with Medicaid, overall and by age group.

tab("medicaid2")
Used Medicaid to pay for services {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 674 674 24 628 723 73.3 2.1 68.9 77.4
TRUE 143 160 18 128 201 17.5 1.9 13.9 21.5
<N/A> 87 85 11 64 111 9.2 1.2 6.9 11.9
N = 904. Checked NCHS presentation standards. Nothing to report.

As we can see, for some observations, the value of this variable is unknown (it is missing, or NA). The above command calculates percentages based on all observations, including the ones with missing (NA) values. However, in the published figure, the percentages are based on the knowns only. To exclude the NAs from the calculation, use the drop_na argument:

tab("medicaid2", drop_na = TRUE)
Used Medicaid to pay for services (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 674 674 24 628 723 80.8 2.1 76.4 84.7
TRUE 143 160 18 128 201 19.2 2.1 15.3 23.6
N = 817. Checked NCHS presentation standards. Nothing to report.

Note that the table title alerts you to the fact that you are using known values only.
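
The change in denominator can be checked by hand. The sketch below recomputes the percentages from the weighted counts shown in the first medicaid2 table. Because those displayed counts are rounded to thousands, the results are approximate and can differ from the tables in the last digit.

```r
# Weighted counts in thousands, copied from the first medicaid2 table above.
counts <- c("FALSE" = 674, "TRUE" = 160, "<N/A>" = 85)

# Shares of all observations, unknowns included.
pct_all <- round(100 * counts / sum(counts), 1)

# Shares of knowns only: remove the unknowns before computing the denominator.
known <- counts[names(counts) != "<N/A>"]
pct_known <- round(100 * known / sum(known), 1)

print(pct_all)
print(pct_known)
```

The numerators are the same in both calculations; only the denominator shrinks, which is why the known-only percentages are larger.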

By age group:

tab_subset("medicaid2", "Age", drop_na = TRUE)
Used Medicaid to pay for services (Age = Under 65) (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL Flags
FALSE 31 30 8 17 56 49.8 9.5 30.3 69.4 Px
TRUE 35 31 8 18 52 50.2 9.5 30.6 69.7 Px
N = 66. Checked NCHS presentation standards: Px: suppress percent.
Used Medicaid to pay for services (Age = 65-74) (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL Flags
FALSE 53 55 11 37 83 62 7.6 45.4 76.7 Px
TRUE 30 34 8 20 58 38 7.6 23.3 54.6 Px
N = 83. Checked NCHS presentation standards: Px: suppress percent.
Used Medicaid to pay for services (Age = 75-84) (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 163 167 18 134 208 79.1 4.5 68.6 87.3
TRUE 33 44 11 26 75 20.9 4.5 12.7 31.4
N = 196. Checked NCHS presentation standards. Nothing to report.
Used Medicaid to pay for services (Age = 85 and over) (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 427 421 23 378 469 89.1 2.2 83.8 93.1
TRUE 45 52 11 33 81 10.9 2.2 6.9 16.2
N = 472. Checked NCHS presentation standards. Nothing to report.

Note that, according to the NCHS presentation standards, the percentages for the Under 65 and 65-74 age groups are flagged Px and should be suppressed.
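
As a rough illustration of what tabulating within subsets on knowns only means, the base R sketch below computes a percentage within each group after dropping unknowns. The data frame and its column names are hypothetical, and the calculation is unweighted: it ignores the survey design and does not check any presentation standards.

```r
# Hypothetical toy data; NA marks an unknown Medicaid status.
d <- data.frame(
  age_group = c("Under 65", "Under 65", "Under 65", "65-74", "65-74", "65-74"),
  medicaid  = c(TRUE, FALSE, NA, TRUE, NA, TRUE)
)

# For each age group, drop the unknowns, then take the share with medicaid == TRUE.
pct_by_group <- sapply(split(d$medicaid, d$age_group), function(x) {
  x <- x[!is.na(x)]
  100 * mean(x)
})

print(pct_by_group)
```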

Figure 4

(Figure 3 is slightly more involved, so we’ll do it next.)

  • This figure shows the percentage of residents who have each of a select set of chronic conditions.
  • In addition, it shows the distribution of residents by the number of conditions.

Here’s a table for high blood pressure.

tab("hbp")
Resident diagnosed with high blood pressure {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 397 404 25 357 457 44.0 2.5 38.9 49.1
TRUE 481 498 26 449 552 54.2 2.6 49.0 59.3
<N/A> 26 17 4 10 28 1.8 0.4 1.1 2.9
N = 904. Checked NCHS presentation standards. Nothing to report.

Once again, unknown values (NA) are present, while the published figure is based on knowns only. Therefore, we again use the drop_na argument:

tab("hbp", "alz", "depress", "arth", "diabetes", "heartdise", "osteo"
    , "copd", "stroke", "cancer"
    , drop_na = TRUE)
Resident diagnosed with high blood pressure (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 397 404 25 357 457 44.8 2.6 39.7 50.0
TRUE 481 498 26 449 552 55.2 2.6 50.0 60.3
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with Alzheimer’s/dementia (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 538 598 26 549 651 66.3 2.1 62.0 70.5
TRUE 340 304 19 268 344 33.7 2.1 29.5 38.0
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with depression (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 629 654 24 609 703 72.5 2.1 68.1 76.6
TRUE 249 248 20 211 292 27.5 2.1 23.4 31.9
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with arthritis (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 683 717 26 668 770 79.5 2 75.3 83.3
TRUE 195 185 18 152 224 20.5 2 16.7 24.7
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with diabetes (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 719 718 23 675 765 79.6 2.1 75.3 83.6
TRUE 159 184 20 148 227 20.4 2.1 16.4 24.7
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with heart disease (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 739 746 25 697 798 82.7 1.8 78.7 86.2
TRUE 139 156 17 126 193 17.3 1.8 13.8 21.3
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with osteoporosis (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 766 794 24 749 842 88 1.4 84.9 90.7
TRUE 112 108 13 85 137 12 1.4 9.3 15.1
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with COPD (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 779 806 24 759 856 89.4 1.6 85.9 92.3
TRUE 99 96 14 71 129 10.6 1.6 7.7 14.1
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with stroke (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 789 807 23 764 853 89.5 1.5 86.1 92.4
TRUE 89 94 14 70 128 10.5 1.5 7.6 13.9
N = 878. Checked NCHS presentation standards. Nothing to report.
Resident diagnosed with cancer (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
FALSE 806 824 23 780 871 91.4 1.6 87.7 94.2
TRUE 72 78 14 53 114 8.6 1.6 5.8 12.3
N = 878. Checked NCHS presentation standards. Nothing to report.

Advanced variable editing

  • surveytable provides a number of functions to create or modify survey variables.

  • We saw a couple of these above: var_collapse() and var_cut().

  • Occasionally, you might need to do advanced variable editing. Here’s how:

  • Every survey object has an element called variables

  • This is a data frame where the survey’s variables are located

class(rccsu2018$variables)
## [1] "data.frame"

  1. Create a new variable in the variables data frame (which is part of the survey object).
  2. Call set_survey() again. Any time you modify the variables data frame, call set_survey().
  3. Tabulate the new variable.

We go through these steps to count how many chronic conditions each resident has.

rccsu2018$variables$num_cc = 0
for (vr in c("hbp", "alz", "depress", "arth", "diabetes", "heartdise", "osteo"
             , "copd", "stroke", "cancer")) {
  idx = which(rccsu2018$variables[,vr])
  rccsu2018$variables$num_cc[idx] = rccsu2018$variables$num_cc[idx] + 1
}
set_survey(rccsu2018, mode = "NCHS")
## * Mode: NCHS.
Survey info {RCC SU 2018 PUF}
Variables Observations Design
82 904 Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1)
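
The counting loop relies on which(), which skips NA values, so an unknown diagnosis adds nothing to a resident's count. The sketch below shows this on a hypothetical toy data frame and compares the loop with a vectorized alternative that uses rowSums(). The alternative is a suggestion for illustration, not part of the original example.

```r
# Hypothetical toy data: three residents, three condition flags, with some NAs.
d <- data.frame(
  hbp  = c(TRUE, FALSE, NA),
  alz  = c(TRUE, NA, NA),
  copd = c(FALSE, TRUE, NA)
)
conds <- c("hbp", "alz", "copd")

# Loop version, mirroring the code above: which() skips NA,
# so unknown values add nothing to the count.
num_loop <- rep(0, nrow(d))
for (vr in conds) {
  idx <- which(d[, vr])
  num_loop[idx] <- num_loop[idx] + 1
}

# Vectorized version: count TRUE values per row, ignoring NA.
num_vec <- rowSums(d[, conds], na.rm = TRUE)

print(cbind(num_loop, num_vec))
```

In this toy data, the third resident has only unknown values and ends up with a count of 0 rather than NA.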

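num_cc is a numeric variable with the number of chronic conditions. The published figure uses a categorical variable which is based on this numeric variable. Use var_cut(), which converts numeric variables to categorical (factor) variables. The last interval, labeled "??", catches any value above 10. Since only 10 conditions are counted, it should be empty, and it does not appear in the table.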

var_cut("Number of chronic conditions", "num_cc"
        , c(-Inf, 0, 1, 3, 10, Inf)
        , c("0", "1", "2-3", "4-10", "??"))
tab("Number of chronic conditions")
Number of chronic conditions {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
0 121 140 19 106 184 15.2 2.0 11.5 19.6
1 189 180 17 148 218 19.5 1.9 16.0 23.5
2-3 446 444 23 400 492 48.3 2.4 43.5 53.1
4-10 148 156 17 125 194 16.9 1.8 13.5 20.8
N = 904. Checked NCHS presentation standards. Nothing to report.
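
The catch-all interval acts as a sanity check: if any value fell outside the expected range, it would land in the "??" level. A minimal base R sketch of the idea, using cut() on made-up counts and assuming var_cut() bins values the same way:

```r
# Hypothetical counts of conditions, all within the expected 0-10 range.
num_cc <- c(0, 1, 2, 3, 4, 10)

grp <- cut(num_cc,
           breaks = c(-Inf, 0, 1, 3, 10, Inf),
           labels = c("0", "1", "2-3", "4-10", "??"))

# The catch-all level exists but is empty, so nothing fell outside 0-10.
level_counts <- table(grp)
print(level_counts)
```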

Figure 3

  • This figure shows the percentage of residents who need help with each of the activities of daily living (ADLs).
  • In addition, it shows the distribution of residents by the number of ADLs with which they need help.

Here’s a table for bathhlp (help with bathing):

tab("bathhlp")
Type of assistance resident needs to bathe {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL Flags
MISSING 22 10 2 6 17 1.1 0.3 0.6 2.1
NEED HELP OR SUPERVISION FROM ANOTHER PERSON 551 581 25 534 633 63.3 2.3 58.7 67.7
USE OF AN ASSISTIVE DEVICE 11 7 2 3 15 0.7 0.3 0.3 1.5 Cx
BOTH 127 113 15 87 148 12.4 1.6 9.4 15.9
NEED NO ASSISTANCE 193 207 18 173 247 22.5 2.0 18.7 26.6
N = 904. Checked NCHS presentation standards: Cx: suppress count (and rate).

This variable has multiple levels.

  • Several of these levels correspond to a resident needing help.
  • One level ("NEED NO ASSISTANCE") means that the resident does not need help.
  • One level ("MISSING") means that the value is unknown.

We want to show residents needing help as a percentage of knowns only (that is, excluding the unknowns).

To do this, convert the variable to having 2 levels (needs help / does not need help) plus NA (for unknown); then use the drop_na argument to base percentages on knowns only.

for (vr in c("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp")) {
  var_collapse(vr
    , "Needs assistance"
    , c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON"
      , "USE OF AN ASSISTIVE DEVICE"
      , "BOTH"))
  var_collapse(vr, NA, "MISSING")
}

tab("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp", drop_na = TRUE)
Type of assistance resident needs to bathe (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 689 702 25 654 752 77.2 2 73.1 81.1
NEED NO ASSISTANCE 193 207 18 173 247 22.8 2 18.9 26.9
N = 882. Checked NCHS presentation standards. Nothing to report.
Type of assistance resident needs for locomotion (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 622 625 24 578 675 68.9 2.3 64.2 73.4
NEED NO ASSISTANCE 253 281 22 241 329 31.1 2.3 26.6 35.8
N = 875. Checked NCHS presentation standards. Nothing to report.
Type of assistance resident needs to dress (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 527 561 25 513 614 61.7 2.3 57.1 66.2
NEED NO ASSISTANCE 355 348 22 308 393 38.3 2.3 33.8 42.9
N = 882. Checked NCHS presentation standards. Nothing to report.
Type of assistance resident needs to transfer in/out of chair (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 463 464 24 418 515 51 2.4 46.1 55.8
NEED NO ASSISTANCE 420 446 24 400 496 49 2.4 44.2 53.9
N = 883. Checked NCHS presentation standards. Nothing to report.
Type of assistance resident needs to use bathroom (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 437 443 24 398 493 48.7 2.4 43.8 53.5
NEED NO ASSISTANCE 447 467 25 421 518 51.3 2.4 46.5 56.2
N = 884. Checked NCHS presentation standards. Nothing to report.
Type of assistance resident needs to eat (knowns only) {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
Needs assistance 257 240 21 200 286 26.3 2.3 21.9 31.1
NEED NO ASSISTANCE 628 671 26 622 724 73.7 2.3 68.9 78.1
N = 885. Checked NCHS presentation standards. Nothing to report.
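
The two-step recoding (merge the "needs help" levels, then turn "MISSING" into NA) can be sketched in base R as follows. The vector is hypothetical, and the code illustrates the idea rather than the internals of var_collapse().

```r
help_levels <- c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON",
                 "USE OF AN ASSISTIVE DEVICE",
                 "BOTH")

# Hypothetical toy data; not taken from the survey.
x <- factor(c("BOTH", "MISSING", "NEED NO ASSISTANCE",
              "USE OF AN ASSISTIVE DEVICE", "NEED NO ASSISTANCE"))

# Step 1: merge every "needs help" level into a single level.
levels(x)[levels(x) %in% help_levels] <- "Needs assistance"

# Step 2: treat "MISSING" as unknown, then drop the now-unused level.
x[x == "MISSING"] <- NA
x <- droplevels(x)

# table() omits NA by default, so these shares are based on knowns only.
pct_known <- 100 * prop.table(table(x))
print(pct_known)
```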

Now, go through the “advanced variable editing” steps, much as for Figure 4, to count the number of ADLs with which each resident needs help.

rccsu2018$variables$num_adl = 0
for (vr in c("bathhlp", "walkhlp", "dreshlp", "transhlp", "toilhlp", "eathlp")) {
  idx = which(rccsu2018$variables[,vr] %in%
    c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON"
      , "USE OF AN ASSISTIVE DEVICE"
      , "BOTH"))
  rccsu2018$variables$num_adl[idx] = rccsu2018$variables$num_adl[idx] + 1
}
set_survey(rccsu2018, mode = "NCHS")
## * Mode: NCHS.
Survey info {RCC SU 2018 PUF}
Variables Observations Design
83 904 Stratified Independent Sampling design svydesign(ids = ~1, strata = ~pufstrata2 + su_facid, fpc = ~pufpopfac2, weights = ~suwt, data = d1)
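
Note that %in% returns FALSE for any value that is not in the list, so a "MISSING" value counts as not needing help for that ADL. The sketch below shows this on a hypothetical two-variable data frame, using a vectorized form of the same count. It is an illustrative alternative, not the original code.

```r
help_levels <- c("NEED HELP OR SUPERVISION FROM ANOTHER PERSON",
                 "USE OF AN ASSISTIVE DEVICE",
                 "BOTH")

# Hypothetical toy data: three residents, two ADL variables.
d <- data.frame(
  bathhlp = c("BOTH", "NEED NO ASSISTANCE", "MISSING"),
  eathlp  = c("USE OF AN ASSISTIVE DEVICE", "NEED NO ASSISTANCE", "BOTH")
)

# One logical column per ADL: TRUE where the resident needs help.
# "MISSING" is not in help_levels, so it yields FALSE.
needs_help <- sapply(d, function(col) col %in% help_levels)

# Count the TRUE values in each row.
num_adl <- rowSums(needs_help)

print(num_adl)
```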

To generate the figure, create a categorical variable based on num_adl, which is numeric. The "??" level catches any value above 6; it should be empty, and it does not appear in the table.

var_cut("Number of ADLs", "num_adl"
        , c(-Inf, 0, 2, 6, Inf)
        , c("0", "1-2", "3-6", "??"))
tab("Number of ADLs")
Number of ADLs {RCC SU 2018 PUF}
Level n Number (000) SE (000) LL (000) UL (000) Percent SE LL UL
0 131 114 12 92 142 12.4 1.3 9.9 15.4
1-2 218 249 22 209 297 27.1 2.2 22.8 31.8
3-6 555 555 25 508 606 60.4 2.4 55.6 65.1
N = 904. Checked NCHS presentation standards. Nothing to report.