1. Properties of the S-LANSS tool for assessment and prognosis in whiplash Ashley Smith PT, PhD(c), FCAMPT; Dave Walton PT, PhD, FCAMPT; Michele Sterling PT, PhD
3. Introduction S-LANSS Self-report version of the Leeds Assessment of Neuropathic Symptoms and Signs 7-item paper-and-pencil form Intended to capture signs and symptoms of "neuropathic" pain For screening: each item is weighted; a score ≥ 12 is the best indicator of Pain of Predominantly Neuropathic Origin (POPNO)
5. Screening Properties Validated (construct, convergent, internal consistency) against expert opinion, with no 'gold standard' available Unaided: 74% sensitivity; 76% specificity (Bennett, 2005) Validated in single body regions of pain (Bouhassira, 2011) Fails to identify 25% of pain with a clinical Dx; not suitable for assessing Rx effects (Bouhassira, 2011)
6. Use in MSK research 'None of the descriptors was pathognomonic or even specific for neuropathic pain' (Bennett, 2005) Sterling & Pedler, 2008: 85 people with acute whiplash (<4 weeks; 54 females; age 36.27 ± 12.69 years) 34% scored S-LANSS ≥ 12, with correspondingly higher pain/disability, cold hyperalgesia, cervical mechanical hyperalgesia, and less elbow extension on the BPPT Pressure pain thresholds (PPTs) at distant sites and psychological distress (GHQ-28) did not differ between the groups
7. Unknown clinimetric properties Do ‘neuropathic signs & symptoms’ constitute one broad domain, or is there more than one factor in the scale? How is the scale (or sub-scales if present) related to other clinical indicators, such as pain threshold or disability? Can the scale, or its subscales, be used to predict short-term outcome after acute whiplash?
8. Objectives To explore the factor structure of the SLANSS in a sample of people with whiplash-associated disorder (WAD) To evaluate the usefulness of the scale, or sub-scales, in predicting current or future WAD-related pain and disability
9. EFA Sample (WAD) NPRS = Numeric Pain Rating Scale, NDI = Neck Disability Index, Cx PPT = pressure pain threshold at the cervical spine, TA PPT = pressure pain threshold at the belly of the tibialis anterior. *: n = 135 subjects for NDI only
10. Factor structure of S-LANSS 3-factor solution optimal, explaining 62.2% of variance in score Factor 1: Superficial symptoms (32.4%) Q2 (skin colour change), Q3 (skin sensitive to touch), Q5 (skin hot) Factor 2: Active tests (17.3%) Q6 (rub the painful area), Q7 (press on the painful area) Factor 3: Deep symptoms (12.5%) Q1 (pins & needles), Q4 (sudden, bursting pain)
11. Concurrent validity Based on EFA, hypotheses were: Active tests would be most strongly associated with local (cervical) PPT None of the factors would be associated with distal (Tib. Ant.) PPT Factor 3 (deep symptoms) would be most strongly associated with NPRS All 3 factors would be independently associated with NDI
12. Concurrent validity Partially supported Hypothesis 1: Of the 3 subscales, association between active tests and local PPT would be strongest “Superficial symptoms” subscale was equally associated with local PPT
14. Concurrent validity Partially supported Hypothesis 2: None of the subscales would be associated with TA PPT “Superficial symptoms” subscale showed a significant association with distal PPT
16. Concurrent validity Not supported Hypothesis 3: Factor 3 (deep symptoms) would be most strongly associated with NPRS While significant, “Deep symptoms” showed the weakest association with NPRS of the 3 subscales.
17. Concurrent validity Supported Hypothesis 4: Each subscale would be independently associated with NDI After stepwise multiple linear regression, all 3 subscales were retained, explaining 35.7% of the variance in NDI
18. Summary of concurrent validity The S-LANSS appears to possess 3 important subscales: superficial symptoms, deep symptoms and active tests The ‘superficial’ and ‘active’ subscales are significantly associated with local PPT ‘Superficial’ is also associated with distal PPT All three are associated with NPRS All three explain unique significant variance in NDI
19. Predictive validity (acute WAD) Hypotheses: Each of the 3 subscales would explain significant unique variance in follow-up NDI scores after controlling for age, sex and baseline pain intensity Each of the 3 subscales would explain significant unique variance in follow-up PTSD scores, after controlling for age, sex and baseline pain intensity
20. Methods Subjects with acute (<60 days) WAD were recruited Data: Demographics (age, sex) NPRS, NDI Local (C-spine) and Distal (Tib. Ant.) PPT S-LANSS 3 months later NDI and PTSD data were collected
21. Sample NPRS = Numeric Pain Rating Scale, NDI = Neck Disability Index. *: n = 72 subjects for NDI only
24. Results: Predictive Validity Multiple linear regression Age and sex entered first, followed by NPRS, Local and Distal PPT, and 3 S-LANSS subscales NDI: 4 variables were retained in the model, explaining 24.2% of the variance in NDI Baseline NPRS, Deep symptoms subscale, Age, Cervical PPT PTSD: 3 variables were retained, explaining 25.9% of the variance in PTSD score Cervical PPT, Deep symptoms subscale, Superficial symptoms subscale
25. Discriminatory Accuracy Ability of the S-LANSS total weighted score to discriminate between those 'at risk' of NDI > 10/100 after 3 months and those not at risk. AUC: 0.68 Best cut-score: 10 or higher: Se 0.52, Sp 0.76, +LR 2.14, -LR 0.63
27. Discriminatory Accuracy Subscales Superficial symptoms: AUC = 0.60 Deep symptoms: AUC = 0.62 Active tests: AUC = 0.63 All of the scales (total and subscales) are more specific than they are sensitive Low scores carry a high risk of false negatives, while high scores are more strongly indicative of a poor outcome
28. Predictive validity: Summary After controlling for baseline pain, age and sex, the deep symptoms subscale contributes significant predictive power to the models for 3-month NDI and PTSD The superficial symptoms subscale also uniquely predicts 3-month PTSD None of the scales (total or subscales) is useful as a screening tool on its own
Editor's Notes
Notes: EFA = exploratory factor analysis. Principal components analysis with varimax rotation. 203 total subjects from the 3 databases. Ash's sample is of course much more chronic than mine or Michele's. I think we're justified in pooling them for this analysis, but could see an argument against it as well. Due to some differences in mean values between the 3 databases, there has been some data transformation for the validity testing that I'll mention as we get there.
Lots to talk about here. All items were entered as either endorsed (1) or not endorsed (0), since the weightings are largely meaningless statistically. The eigenvalues, whether judged by the run-of-the-mill eigenvalue-greater-than-one rule or by Horn's parallel analysis, suggested that 2 factors were present. However, inspection of the scree plot indicated that a third factor could, and probably should, be extracted. If the solution were left at 2 factors, factor 1 would be everything except the active tests (nos. 6 and 7), and factor 2 would be the active tests; the two would explain 49.7% of the variance in total score. Internal consistency of factor 1 would be 0.61 and of factor 2 0.54, though the latter is largely meaningless with only 2 dichotomous items. So the question I had to answer was: which makes more sense conceptually, 2 or 3 factors? Considering that the third factor explained an additional 12.5% of variance, which I considered non-trivial, I extracted 3 factors. There are arguments for and against either decision.
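For reference, the parallel-analysis check described above can be sketched roughly as below. This is a minimal illustration in Python, not the software actually used for the analysis; the function name and the use of random 0/1 endorsements as the comparison data are my assumptions.

```python
import numpy as np

def parallel_analysis(data, n_iters=200, seed=0):
    """Horn's parallel analysis for dichotomous items: retain factors
    whose observed eigenvalues exceed the mean eigenvalues obtained
    from random data of the same shape (n subjects x k items)."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    # Observed eigenvalues of the item correlation matrix, descending.
    obs_eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Average eigenvalues across many random 0/1 datasets.
    rand_mean = np.zeros(k)
    for _ in range(n_iters):
        rand = rng.integers(0, 2, size=(n, k))
        rand_mean += np.linalg.eigvalsh(np.corrcoef(rand, rowvar=False))[::-1]
    rand_mean /= n_iters
    # Retain components sequentially while observed exceeds random.
    n_retain = 0
    for o, r in zip(obs_eig, rand_mean):
        if o > r:
            n_retain += 1
        else:
            break
    return obs_eig, rand_mean, n_retain
```

The scree-plot judgment call described above is exactly the kind of decision this procedure formalizes, which is why the 2- vs 3-factor question remains partly conceptual rather than purely statistical.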
Mean PPT scores were sig. different across all 3 of our databases, and I used a different local site (angle of the traps) than you two did (C5-6 facet). So, in order to make these comparisons meaningful, I z-transformed all of our data [(individual score - mean for that sample)/SD for that sample], which converted each raw score to a distance (in SDs) from the sample mean; this maintains the relative position between subjects while allowing pooling on a standard metric. In order to test concurrent construct validity, hypotheses needed to be formed and tested. I assumed that the subscale that requires actually pushing on the area of pain would be most closely associated with local PPT. It was closely associated, but so was the superficial symptoms subscale. I calculated eta² as a measure of effect size for each subscale (SS between/SS total). The values were: active 0.07, superficial 0.09, deep 0.01. Cohen suggests the following categories: 0.01 is small, 0.06 is medium, 0.14 is large. So, arguably the superficial symptoms subscale was actually more strongly associated with local PPT than was the active tests subscale, though probably not significantly so (I haven't calculated 95% CIs for the eta² values yet).
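The z-transform-then-pool step can be sketched as follows (illustrative only; the function name and sample variables are hypothetical):

```python
import numpy as np

def pool_z(*samples):
    """Z-transform each sample against its own mean and SD, then pool.
    Each score becomes a distance (in SDs) from its sample's mean, so
    samples measured at different sites or on different scales share
    one standard metric while relative position within each sample
    is preserved."""
    return np.concatenate([(s - s.mean()) / s.std(ddof=1) for s in samples])
```

After pooling, each contributing sample has mean ~0 and SD ~1, which is why between-database mean differences (or different PPT test sites) no longer distort the pooled analysis.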
This is the box-and-whiskers plot for mean Cx PPT across the levels of the 'active tests' subscale, showing that those who endorsed both Q6 and Q7 had a significantly lower PPT than those who endorsed neither Q6 nor Q7, or only one of them.
Again, this was a hypothesis I made unilaterally, and maybe we can change these. But from a construct validity standpoint, it could logically be expected that none of the scales that focus specifically on symptoms in the local area would be associated with distal PPT. In this case, the ‘superficial symptoms’ subscale was associated.
Box and whisker plot again showing the relationship between distal (TA PPT) and the superficial symptoms subscale score. On post-hoc analysis, the only sig. diff. was between those who endorsed 2 of the 3 items in this subscale and those who didn’t endorse any. Power was probably too low to identify a difference between the 0 and 3 groups, as that outlier in the 3 group is probably causing some grief for the analysis.
Not really sure about this one. Felt I wanted one more hypothesis to test, and this was the best I could come up with given the data we have. Calculated eta2 again, and here the values were: active tests 0.19, superficial 0.11, deep 0.06. So in fact, active tests (those that require you actually press on the painful area) had the strongest relationship with NPRS (read: explained the greatest variance in NPRS). The deep symptoms subscale had the weakest association/explained the least variance.
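The eta² effect sizes quoted in these notes (SS between / SS total) are simple to compute; a minimal sketch, with hypothetical variable names:

```python
import numpy as np

def eta_squared(scores, groups):
    """eta^2 = SS_between / SS_total for a continuous outcome
    (e.g. PPT or NPRS) across groups defined by subscale score."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    grand = scores.mean()
    ss_total = ((scores - grand) ** 2).sum()
    ss_between = sum(
        (groups == g).sum() * (scores[groups == g].mean() - grand) ** 2
        for g in np.unique(groups)
    )
    return ss_between / ss_total
```

It ranges from 0 (group means identical) to 1 (group membership fully determines the score), which is where Cohen's 0.01 / 0.06 / 0.14 benchmarks sit.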
Ran a regression, with sig. of F to retain set at 0.05 and to exclude at 0.10. Had to do a square root transformation on all data to meet assumptions of normality. Non-collinearity and homoscedasticity were both satisfied. I dumped only these 3 subscales into the regression equation, trying to answer the question: 'Do all 3 explain significant unique variance in concurrent NDI score?' We'd almost certainly get different results if we included age, sex and NPRS first, but that wasn't the question being posed. Not sure if I love this approach though. Either way, at the end of this we have identified 3 subscales, each of which explains unique variance in NDI, which is sort of cool if one wants to determine the mechanisms behind a patient's neck-related disability. The argument here would be that these subscales give clinicians better information on the mechanisms driving the pain and disability than does the aggregate score.
Again, had to pose some hypotheses to test.
These were data from mine and Michele’s datasets only, as Ash’s didn’t fit the inclusion criteria.
No sig. diffs. b/w the data sets save for NPRS, where mine were a bit higher than Michele’s.
Again, this was after a square root transformation to achieve normality. As is usually the case, the baseline value of a measure is the strongest predictor of its future value, shown here for baseline NDI. But we can also see that each of the subscales was significantly predictive of future NDI, as were NPRS and Cx PPT. Interesting that neither sex nor age was predictive.
Same kind of evaluation for PTSD scores. Michele and I used different scales for capturing PTSD symptoms (Michele the PDS, me the PCL), so once again I did a z-transformation, which means what we're really predicting is each participant's location relative to the mean of that sample. In case you're wondering, we had 118 total subjects: 46 from my data and 72 from Michele's. Of interest in these results is that the active tests subscale is no longer predictive, but TA PPT is. I have been seeing the TA PPT relationship somewhat consistently in my data and am trying to sort out what the mechanism is there. Also interesting is that once again sex and age aren't predictive.
Hierarchical multiple linear regression. All data are square root transformed and in some cases z-transformed. Actually, because the PPT data were z-transformed, I first had to add a constant to the whole sample so that there were no negative values. So there is a lot of data transformation going on, but all of it is justifiable and none of it should affect the magnitude of the relationships. Once again, all sorts of decisions were made as I went through this, and I can't list them all here. One thing you'll notice is that, unlike with the concurrent validity, this time I did enter the other variables first, so the question became: 'Do the 3 subscales explain significant variance in future NDI, beyond that explained by age, sex, NPRS and PPT?' I did run a similar type of regression to the concurrent validity analysis, where I entered only the 3 S-LANSS subscales. Somewhat interestingly, in that case only the active tests and superficial symptoms subscales were retained, not the deep symptoms.
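The hierarchical step, i.e. the incremental R² gained by adding the subscales after the covariates, can be sketched as follows. This is an illustration under my own assumptions (OLS via least squares; hypothetical function names), not the actual analysis code:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit with an intercept column added."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def incremental_r2(covariates, subscales, y):
    """Variance in the outcome (e.g. follow-up NDI) explained by the
    S-LANSS subscales beyond covariates such as age, sex, NPRS and PPT:
    R^2 of the full model minus R^2 of the covariates-only model."""
    base = r_squared(covariates, y)
    full = r_squared(np.column_stack([covariates, subscales]), y)
    return full - base
```

Whether that increment is significant is then tested with the usual F-change test, which is what the retain/exclude criteria in the stepwise procedure are doing.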
So finally, since clinicians like this kind of thing more than regression, I constructed an ROC curve with 3-month NDI score of 10/100 or lower indicating a positive outcome, and 11/100 or higher a negative outcome. Again, this is open to discussion and easily changeable. In this case though, the interpretation of the results here is: “how well can the SLANSS score (with item weightings) discriminate between those who will score 10 or less on the NDI 3 months later and those who will score over 10?” In a nutshell the answer is not well, but this is another example of the folly of relying on just one scale to determine prognosis, which is never done clinically.
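The discrimination statistics reported on the slides follow standard definitions; a minimal sketch (illustrative only, with hypothetical names, not the analysis code):

```python
import numpy as np

def auc(scores, at_risk):
    """AUC via the rank (Mann-Whitney) formulation: the probability
    that a randomly chosen at-risk subject outscores a randomly
    chosen not-at-risk subject (ties count half)."""
    pos = scores[at_risk]
    neg = scores[~at_risk]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def cut_stats(scores, at_risk, cut):
    """Se, Sp, +LR and -LR when 'score >= cut' flags risk.
    +LR = Se / (1 - Sp); -LR = (1 - Se) / Sp."""
    flagged = scores >= cut
    se = (flagged & at_risk).sum() / at_risk.sum()
    sp = (~flagged & ~at_risk).sum() / (~at_risk).sum()
    return se, sp, se / (1 - sp), (1 - se) / sp
```

From the rounded Se = 0.52 and Sp = 0.76, these definitions give +LR ≈ 2.17 and -LR ≈ 0.63.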
Here’s the ROC curve for the total SLANSS score. It’s better than chance at identifying risk of high NDI (shown by the diagonal), but not a whole lot so.
Constructed 3 more ROC curves, 1 for each of the subscales. Note that in this case I did NOT use the item weightings, as they introduce essentially meaningless categorizations. The subscales themselves didn't fare any better than the overall S-LANSS score in discriminating between 'at risk' and 'not at risk' individuals. So the take-home message is this: the S-LANSS is NOT unifactorial. It has at least 2 subscales, and probably more appropriately 3. While the subscales are useful tools for determining the influences on a patient's pain experience (concurrently), and the 'deep symptoms' subscale in particular may be valuable for predicting outcome when used in conjunction with other clinical indicators like PPT and NPRS, neither the subscales nor the total scale seems terribly useful for categorizing patients into 'at risk' or 'not at risk' on their own.
There is literally a pile more we could do here: construct ROC curves for the PTSD scores, identify optimal cut-scores for the 3 subscales and then determine predictive validity when all are considered together, determine whether using the weighted item scores adds any predictive capacity to the subscales (it probably won't; it didn't make any difference for the overall S-LANSS score), and I'm sure you can both think of others as well. But this is good for now I think, probably even too much for the 15 minutes we have at CPA.