Ordinal or interval variable? And which inference test to use?

#1
Hello there,

I'm doing a module in research methods and we are required to analyse some data (provided to us) from a randomised controlled trial.

I'm having some difficulty deciding on what stats analysis to do...

Background: the primary outcome measure is the sum of scores from a 36-item 'disability questionnaire' (the dependent variable). Each item on the questionnaire asks participants to rate the difficulty they have performing an everyday task. They have 3 options, each of which is assigned a score: no difficulty (0), some difficulty (1), unable to do (2). They end up with a score out of 72. This is done at baseline (time 0). Participants are then randomised to an intervention group or a control group (independent variable). After a couple of days, both groups repeat the questionnaire (time 1).

The main research question is, does the intervention reduce disability (which would be evidenced by a significantly greater decrease in questionnaire score in the intervention group than in the control group).

My questions are:

1) what type of variable is the sum of the scores of the questionnaire? Reading around this I can't find any clear guidance. Some sources say Likert-type scales are strictly ordinal data but then could the sum of Likert-type scale scores be classed as interval data?

2) what test should be used to assess for a between-group difference in disability score between time 0 and time 1? My stats textbooks aren't very clear on this!

2 years ago
#2
(Original post by KrisJB)

These are tricky questions, for which the answers given in the literature tend to be pragmatic rather than based on any deep theory.

The sum of variables like this doesn't have any particular type, and interpreting what such a sum means is difficult and context-dependent. One subject's sum score may be greater than another's because the difficulties the first subject has in one area outweigh the difficulties the second has in another, rather than because the first subject is uniformly worse off. In addition, the impact (in terms of quality of life, say) of one item on the list may be greater than that of another, and may vary from subject to subject.
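To make that concrete, here is a small sketch with two made-up respondents on a shortened 6-item version of the questionnaire (the item values are invented purely for illustration). Both get the same total, but their item profiles are quite different:

```python
# Two hypothetical respondents on a shortened 6-item questionnaire.
# Item scores: 0 = no difficulty, 1 = some difficulty, 2 = unable to do.
subject_a = [2, 2, 0, 0, 0, 0]   # unable to do two tasks, fine on the rest
subject_b = [1, 1, 1, 1, 0, 0]   # some difficulty on four tasks

total_a = sum(subject_a)
total_b = sum(subject_b)

# Indices of the items on which the two respondents gave different answers
differing_items = [
    i for i, (a, b) in enumerate(zip(subject_a, subject_b)) if a != b
]

print("totals:", total_a, total_b)
print("items answered differently:", differing_items)
```

The sum score cannot tell these two respondents apart, which is exactly why its interpretation depends on context.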

So, strictly speaking, what you have here is a 36-dimensional multivariate outcome, where each individual dimension is an ordinal (non-interval) variable. What can you do with it?

The first possibility, which is widely used, is to assume that, although the individual items are ordinal, the sum is an interval variable. You then proceed using standard parametric techniques such as linear regression, probably after some transformation of the sum variable. The advantages of this are that it is easy and that it allows adjustment for covariates, especially, here, adjustment for baseline (in a randomised study such adjustment improves precision rather than correcting bias).
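As a rough sketch of what this first approach looks like, the code below fits follow-up score = b0 + b1 × baseline + b2 × group by ordinary least squares, so that b2 is the baseline-adjusted difference between groups. The data, the function names and the 0/1 group coding are all made up for illustration, and no transformation of the sum is applied. It only gives point estimates; for standard errors and p-values you would use a proper statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ancova_fit(baseline, group, follow_up):
    """Least-squares fit of follow_up = b0 + b1*baseline + b2*group.

    Returns [b0, b1, b2]; b2 is the baseline-adjusted group difference.
    """
    rows = [[1.0, float(b), float(g)] for b, g in zip(baseline, group)]
    k = 3
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, follow_up)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up sum scores (0 to 72); group 1 = intervention, 0 = control
baseline = [30, 42, 25, 51, 38, 29, 44, 35]
group = [1, 1, 1, 1, 0, 0, 0, 0]
follow_up = [24, 37, 21, 43, 37, 27, 45, 33]

intercept, baseline_slope, group_effect = ancova_fit(baseline, group, follow_up)
print(f"baseline-adjusted group effect: {group_effect:.2f}")
```

A negative group effect would mean lower (better) follow-up scores in the intervention group than in the control group at the same baseline score.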

A second technique is to apply cluster analysis, to see whether the intervention group naturally falls in a different part of the 36-dimensional space than the control group. This is a very clean approach, but it's difficult to assign a probability model to this setup, and therefore p-values and confidence intervals may be hard to obtain!
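A minimal sketch of the clustering idea, using plain k-means with two clusters on made-up responses to a shortened 4-item questionnaire. K-means is just one of many clustering methods, and using Euclidean distance on the 0/1/2 codes is itself an assumption that treats them as numbers. The starting centroids are picked by a simple deterministic rule (the first point and the point farthest from it), which is my choice for reproducibility, not a standard recommendation.

```python
from collections import Counter


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(points, iterations=20):
    """Plain k-means with k = 2; returns a cluster label (0 or 1) per point."""
    first = points[0]
    farthest = max(points, key=lambda p: squared_distance(p, first))
    centroids = [first, farthest]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (ties go to cluster 0)
        labels = [
            min((0, 1), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of its members
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up time 1 item responses (0, 1 or 2 per item)
responses = [
    [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0],   # intervention
    [2, 2, 1, 2], [2, 1, 2, 2], [1, 2, 2, 2],   # control
]
groups = ["intervention"] * 3 + ["control"] * 3

labels = two_means(responses)
# Cross-tabulate trial arm against the cluster each subject landed in
table = Counter(zip(groups, labels))
for (arm, cluster), count in sorted(table.items()):
    print(arm, "cluster", cluster, ":", count)
```

If the arms split cleanly across the clusters, that suggests they occupy different regions of item space, but as noted above this on its own gives no p-value or confidence interval.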

A third approach is to group the items in the questionnaire into a small number of closely related groups. For example, one group related to difficulties with manual dexterity, another group related to state of mind, and so on. Then assume that the scores in each group are driven by some (unmeasured) latent variable (that may be a latent class, or a latent continuous variable). Then do the final analysis on the inferred values of this small number of latent variables.
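A very crude sketch of the grouping step, using subscale sums as stand-ins for the latent variables. A real latent variable analysis would fit a model (latent class or factor-type) rather than just adding items up. The domain names, the item groupings and the shortened 8-item questionnaire are all hypothetical, since the actual instrument isn't given here.

```python
# Hypothetical item groupings for a shortened 8-item questionnaire.
# Keys are domain names, values are the positions of the items in that domain.
domains = {
    "manual_dexterity": [0, 1, 2],
    "mobility": [3, 4],
    "state_of_mind": [5, 6, 7],
}


def subscale_scores(responses, domains):
    """Sum the item scores within each domain.

    This is only a crude proxy for a latent variable, not a fitted model.
    """
    return {
        name: sum(responses[i] for i in items)
        for name, items in domains.items()
    }


# One made-up respondent: 0 = no difficulty, 1 = some difficulty, 2 = unable
responses = [2, 1, 2, 0, 0, 1, 0, 1]
scores = subscale_scores(responses, domains)
print(scores)
```

The final analysis would then be run on this handful of domain-level values per subject instead of on the single 72-point total.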