Psychometric Properties of the ORS and SRS
ISSN-L 1015-5759 • ISSN-Print 1015-5759 • ISSN-Online 2151-2426
Official Organ of the European Association of Psychological Assessment
European Journal of Psychological Assessment
www.hogrefe.com/journals/ejpa
Edited by
Matthias Ziegler
Abstracted/Indexed in: Current Contents/Social & Behavioral Sciences, Social Sciences Citation Index (SSCI), Social Scisearch, PsycINFO, Psychological Abstracts, PSYNDEX, ERIH, Scopus
Original Article
Measuring Feedback From Clients
The Psychometric Properties of the Dutch Outcome
Rating Scale and Session Rating Scale
Pauline Janse,1 Liesbeth Boezen-Hilberdink,2 Maarten K. van Dijk,1 Marc J. P. M. Verbraak,1,3 and Giel J. M. Hutschemaekers3
1 HSK Group, Arnhem, The Netherlands
2 Diaconessenhuis, Zorgcombinatie Noorderboog, Meppel, The Netherlands
3 Behavioral Science Institute, Radboud University Nijmegen, The Netherlands
Abstract. Treatment results can be improved by obtaining feedback from clients concerning their progress during therapy and the quality of the
therapeutic relationship. This feedback can be rated using short instruments such as the Outcome Rating Scale (ORS) and the Session Rating Scale
(SRS), which are being increasingly used in many countries. This study investigates the validity and reliability of the Dutch ORS and SRS in a large
sample of subjects (N = 587) drawn from the clients of an outpatient mental healthcare organization. The results are compared to those of previous
Dutch and American studies. While both the ORS and the SRS exhibited adequate test-retest reliability and internal consistency, their concurrent
validity was limited (more for the SRS than for the ORS). New standards are proposed for the Dutch ORS and SRS. The scores obtained with these
standards are interpreted differently than those obtained using American standards. The clinical implications of the limited validity of the ORS and
the SRS are discussed, as is the use of different standards in conjunction with these instruments.
Keywords: Outcome Rating Scale (ORS), Session Rating Scale (SRS), client feedback, validity, reliability
A promising approach to make therapy more effective is to
explicitly ask clients for feedback on how they view their
progress during treatment and to discuss potential improve-
ments with those who are making insufficient progress
(e.g., Lambert & Shimokawa, 2011). Research shows that,
based on their clinical intuition alone, therapists do not
always correctly predict which clients will drop out or dete-
riorate during therapy (Hannan et al., 2005). Another find-
ing was that the clients’ assessment of the quality of the
therapeutic relationship can differ greatly from that of their
therapists (Hafkenscheid, Duncan, & Miller, 2010; Horvath
& Bedi, 2002).
With that in mind, Scott Miller and Barry Duncan
(2004) developed a system to provide such client-directed
feedback. They made the system as user-friendly as possi-
ble for therapists and clients, in terms of feasibility and
practicality. Their feedback system consists of two short
questionnaires: the Outcome Rating Scale (ORS) and the
Session Rating Scale (SRS). The ORS covers three areas
of client functioning: individual (personal well-being),
interpersonal (family, close relationships), and social (work,
school, friendships). It was developed as a short alternative
to the Outcome Questionnaire (Lambert et al., 1996). The
SRS measures the therapeutic alliance and reflects Bordin’s
(1979) definition of alliance: the relationship between client
and therapist and consensus about goals and approach or
method. Miller and Duncan added a fourth item to each
instrument, which involved global assessments of daily
functioning (for the ORS) and of the treatment session
(for the SRS). The outcomes are discussed during the ses-
sions. If the scores do not show improvement, or do not
reach the designated cut-off scores, the possible reasons
are discussed with the client. As such, these instruments
enhance engagement and participation of both client and
therapist in treatment.
A number of studies have shown that use of the ORS
and SRS during treatment improves outcome (Miller,
Duncan, Brown, Sorrell, & Chalk, 2006; Reese, Norswor-
thy, & Rowlands, 2009). Miller and colleagues (2006)
reported an increase in the overall effect size of treatment
from .39 in the 6-month baseline period (before the feed-
back system was implemented) to an effect size of .79
when feedback was provided by means of the ORS and
SRS. In addition, two studies on couples therapy showed
that feedback enabled four times more clients to achieve
clinically significant change relative to conventional treat-
ment (Anker, Duncan, & Sparks, 2009; Reese, Toland,
Slone, & Norsworthy, 2010).
The ORS and the SRS are now widely used in the
Netherlands (Beljouw & Verhaak, 2010). Until now, how-
ever, the psychometric properties of the Dutch versions of
the ORS and SRS have not been sufficiently verified. Only
two previous studies have examined psychometric aspects
of the Dutch ORS and SRS (Beljouw & Verhaak, 2010;
Hafkenscheid et al., 2010). The study conducted by
Hafkenscheid and colleagues (2010) provided the first data
© 2013 Hogrefe Publishing European Journal of Psychological Assessment 2013
DOI: 10.1027/1015-5759/a000172
Author’s personal copy (e-offprint)
on reliability but their sample was unrepresentative (in general,
little progress was made during treatments). Moreover,
most of the patients in question were treated by the first
author, which limits the potential for generalization to other
settings and other therapists. The study by Beljouw and
Verhaak (2010) focused solely on the convergent validity
of the ORS.
The Dutch ORS and SRS were generally interpreted
using the American standards. The data generated by
Hafkenscheid et al. (2010) suggest that the population of
the Netherlands differs from that of the US, which renders
use of the American standards problematic. For example,
Hafkenscheid et al. (2010) found lower scores for the
SRS than was the case in the American studies. In addition,
the global mean for the SRS (32.4) was found to be well
below the cut-off score derived from the American
studies (36 points). Additional data from different patient
populations are needed to produce reliable Dutch standards
and to determine the extent of any differences between
these standards and their American equivalents.
The purpose of the current study is to examine the psy-
chometric properties and standards of the Dutch versions of
the ORS and SRS in a large sample of outpatients. Further-
more, the Reliable Change Index (RCI; Jacobson & Truax,
1991) was calculated to determine whether a change in a
given individual’s ORS score was clinically significant.
The results of this study are compared to those of previous
American and Dutch studies.
Method
Participants
Clinical Sample
The clinical sample consisted of 587 consecutive clients
who had been referred by their physician to one of the five
participating branches of HSK in the period from 2009 to
the end of 2010. HSK is a Dutch organization providing
outpatient mental healthcare. It operates throughout the
Netherlands, and provides cognitive and behavioral thera-
pies for common mental disorders. The age of the clients
in this sample ranged from 18 to 71 years, with a mean
of 41 (SD = 11.1). They presented with diverse psycholog-
ical disorders (Table 1). Of the total sample, 543 clients
received treatment after intake. The average course of treat-
ment consisted of 16 sessions (SD = 8.7).
Nonclinical Sample
It is important to determine the cut-off point on the ORS
that distinguishes the functional population from the dys-
functional population. To this end, Jacobson and Truax
(1991) recommend using their formula c, which takes
scores from both clinical and nonclinical samples (repre-
senting the scores of a functional population) into account.
Accordingly, a nonclinical sample was also included in this
study. These individuals filled in the ORS and SCL-90 once
only, for the purpose of comparison. This nonclinical sam-
ple consisted of the partners of the clients included in the
study. They received the questionnaires (including informa-
tion about the study) and an informed consent form from
their partners, the clients. Partners who were
undergoing psychological treatment were excluded. The
final, nonclinical sample consisted of 116 volunteers.
Fifty-six percent (n = 65) of these participants were female,
and the average age was 41 years (SD = 11.0).
Procedure
The participants signed an informed consent form at intake.
Clients were asked to fill in the ORS and SRS during each
treatment session. The Outcome Questionnaire (OQ-45;
Lambert et al., 1996) and an alliance questionnaire
(WAV-12, Stinckens, Ulburghs, & Claes, 2009) were com-
pleted at the start of treatment, once every fifth session, and
at the end of the treatment. In order to eliminate any possi-
bility of feedback effects affecting therapy, the therapists
were not allowed to see the completed questionnaires.
The Symptom Checklist (SCL-90-R; Arrindell & Ettema,
2003) was administered at intake and at the end of
treatment.
Measurements
The Outcome Rating Scale (ORS) and the Session
Rating Scale (SRS)
The ORS and SRS each consist of four items, which are
answered using 10-cm visual analog scales (VAS) ranging
from negative (left) to positive (right).
The ORS measures three areas of client functioning:
individual, interpersonal, and social, as well as measuring
the client’s overall view of their personal well-being.
The SRS measures the relationship between the client
and the therapist, consensus about goals and methods,
and the client’s overall view (at the end of a session) con-
cerning the quality of the therapeutic relationship.
Table 1. Characteristics of the clinical sample

                          N      %
Sex
  Male                    281    47.9
  Female                  306    52.1
Diagnosis
  Adjustment disorder     164    28.0
  Work-related distress   163    27.8
  Mood disorders          122    20.9
  Anxiety disorders       102    17.3
  Other                   13     5.6
The marks made by clients on each of the four lines are
measured to the nearest millimeter to derive the score.
These are then combined to obtain a total score. The total
scores range from 0 to 40 on both measures. High scores
on the ORS reflect a good level of well-being and function-
ing, while high scores on the SRS reflect a good therapeutic
relationship. The most recent version of the Dutch transla-
tion of the ORS and SRS was used (translation by Asmus,
Crouzen & van Oenen, 2004).
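The scoring procedure described above can be illustrated with a minimal Python sketch. It assumes that each item is scored in centimeters (the millimeter reading divided by 10), which is consistent with a four-item total ranging from 0 to 40; the function name `vas_total_score` and the example marks are hypothetical and are not taken from the study.

```python
def vas_total_score(marks_mm):
    """Combine four 10-cm VAS items, each read in millimeters, into a 0-40 total."""
    if len(marks_mm) != 4:
        raise ValueError("the ORS and SRS each consist of exactly four items")
    for mark in marks_mm:
        if not 0 <= mark <= 100:
            raise ValueError("each mark must lie on the 100-mm line")
    # Assumption: each item contributes 0-10 points (its position in centimeters),
    # so four items sum to a total between 0 and 40.
    return sum(round(mark) for mark in marks_mm) / 10


# Hypothetical client: marks at 36, 55, 39, and 40 mm from the left end of each line.
print(vas_total_score([36, 55, 39, 40]))  # 17.0
```

Rounding each reading to the nearest millimeter before summing mirrors the measurement rule stated in the text.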
Instruments Used to Validate the ORS and SRS
The concurrent validity of the ORS was tested using the
Outcome Questionnaire (OQ-45; Lambert et al., 1996) and
the Symptom Checklist (SCL-90; Arrindell & Ettema, 2003).
The OQ-45, which consists of 45 items, measures three
domains of functioning: symptom distress (SD), interper-
sonal relations (IR), and social role performance (SR).
The Dutch version of the OQ-45 demonstrated adequate
overall reliability (De Jong et al., 2007), but it was inade-
quate in terms of the Social Role subscale. Its construct
validity proved to be adequate. In the current study, the
OQ-45’s internal consistency (or alpha values) was .88
for the SD domain, .80 for the IR domain, .60 for the SR
domain, and .82 for the OQ-45 total score (N = 483).
The Symptom Checklist Revised (SCL-90-R; Derogatis,
1994) measures a broad range of psychological problems
and symptoms of psychopathology. The 90 items included
in the Dutch SCL-90-R are categorized into eight subscales.
A client’s overall score on the SCL-90-R reflects his gen-
eral psychological and psychosomatic well-being. The
Dutch SCL-90-R has shown good psychometric properties
(Arrindell & Ettema, 2003). Alpha values for the SCL-90-R
in this study ranged from .59 to .90 for the subscales and
.77 for the total score (N = 541).
The ORS was expected to have a reasonably strong relationship
with the OQ-45, and a moderately strong relationship
with the SCL-90, the latter being due to the slightly
different concepts measured by the ORS and the SCL-90.
The SCL-90 focuses on symptoms of psychological prob-
lems, whereas the ORS also measures an individual’s
well-being in relationships and at work. The ‘‘Individual’’
subscale of the ORS and the total score on the ORS are
expected to show the strongest relationship to the SCL-90
total score.
The concurrent validity of the SRS was determined by
comparing it to the Dutch version of the Working Alliance
Inventory, Short Form (WAV-12; Stinckens et al., 2009).
The WAV-12 is based on Bordin’s (1979) definition of
the therapeutic relationship. It consists of 12 items and mea-
sures three domains of the therapeutic relationship, namely
‘‘Goal,’’ ‘‘Task,’’ and ‘‘Bond.’’ As the SRS and the
WAV-12 are both based on Bordin’s theory, their total
scores should show a strong correlation with one another.
As the subscales of the measures are slightly different, they
may show weaker correlations. The internal consistency
(alpha) of the WAV-12 in this study was .84 for the Task
domain, .82 for the Goal domain, .80 for the Bond domain,
and .87 for the total score (N = 285).
Statistical Analysis
First, the normality of the scores was checked. Next, the
internal consistencies of the ORS and SRS were calculated
using Cronbach’s alpha. Test-retest reliability and the con-
current validity of the ORS were calculated using bivariate
correlations. The predictive validity of the SRS for treat-
ment outcome was determined by linear regression analy-
sis, using the difference between the total pretreatment
and posttreatment SCL-90 scores as a measure of outcome.
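As an illustration of the internal-consistency step, Cronbach's alpha can be sketched in a few lines of Python using only the standard library. This is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the item scores below are hypothetical and serve only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ORS item scores (0-10) for five clients; not data from the study.
hypothetical_scores = [
    [3.1, 5.0, 3.6, 3.8],
    [6.2, 7.1, 6.0, 6.5],
    [2.0, 4.4, 2.9, 2.5],
    [7.5, 8.0, 7.2, 7.8],
    [4.8, 5.5, 4.1, 4.9],
]
print(round(cronbach_alpha(hypothetical_scores), 2))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why high values are read as strong cohesion among the four items.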
Independent t-tests (two-tailed, p < .05) were used to
measure differences between males and females in the
scores obtained using these measures, and between the clin-
ical and nonclinical groups.
The standards to be used in conjunction with the ORS
were determined on the basis of cut-off scores and the
RCI. The ORS cut-off score used to distinguish between
the functional and dysfunctional populations was calculated
using Jacobson and Truax’s (1991) formula c:
\[
c = \frac{S_0 M_1 + S_1 M_0}{S_0 + S_1} \qquad (1)
\]
where M1 = the mean of the pretreatment clinical group, M0 = the
mean of the nonclinical sample, and S1 and S0 = the standard
deviations of the clinical and nonclinical samples, respectively.
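The cut-off calculation can be checked with a short Python sketch. It pairs each standard deviation with the sample that shares its index (S0 with the nonclinical sample, S1 with the clinical sample), which is the pairing that reproduces the reported cut-off of 24 from the ORS total scores in Table 2. The function name `cutoff_c` and its parameter names are illustrative.

```python
def cutoff_c(mean_clinical, sd_clinical, mean_nonclinical, sd_nonclinical):
    """Jacobson and Truax's formula c: an SD-weighted point between the two group means."""
    return (sd_nonclinical * mean_clinical + sd_clinical * mean_nonclinical) / (
        sd_nonclinical + sd_clinical
    )


# ORS total scores at intake, taken from Table 2.
c = cutoff_c(mean_clinical=17.0, sd_clinical=7.2, mean_nonclinical=29.6, sd_nonclinical=6.0)
print(round(c, 2))  # 23.87, which rounds to 24
```

Because each mean is weighted by the other group's standard deviation, the cut-off lies closer to the mean of the group with the smaller spread.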
The RCI of the ORS was calculated by multiplying sdiff
by the z value of the requisite significance level (1.96,
p < .05). All statistical analyses were performed using
SPSS version 17.0 (SPSS, Chicago, IL).
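A minimal sketch of the RCI calculation follows. It assumes the usual Jacobson and Truax (1991) definition of sdiff as the standard error of the difference between two scores, derived from the standard deviation and a reliability coefficient. The text does not state which reliability coefficient was entered, so the value of .80 below is purely illustrative and is not taken from the study; the function name is also hypothetical.

```python
import math


def reliable_change_threshold(sd, reliability, z=1.96):
    """Minimum score change that exceeds measurement error at the given z value."""
    # Standard error of measurement.
    se = sd * math.sqrt(1 - reliability)
    # Standard error of the difference between two measurements.
    s_diff = math.sqrt(2 * se ** 2)
    return z * s_diff


# SD of the clinical ORS total score from Table 2; the reliability is an assumption.
print(round(reliable_change_threshold(sd=7.2, reliability=0.80), 1))  # 8.9
```

The threshold grows as reliability falls, so the choice of reliability coefficient has a direct effect on how much change is required before it counts as reliable.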
Results
Outcome Rating Scale
Normative Data
Table 2 shows the mean scores and standard deviations of
the clinical and nonclinical samples for the ORS total
scores obtained at intake. The total score for the ORS
was lower than that of a clinical group reported by Miller
et al. (2003; M = 19.6, SD = 8.7). The clinical group’s
average total score for the OQ-45 was 70.5 (SD = 22.2),
while their average total SCL-90 score was 180.7
Table 2. Means and standard deviations on the ORS total scores of the clinical and nonclinical samples

                    Nonclinical (n = 116)    Clinical (n = 524)
                    M       SD               M       SD
ORS individual      7.3     1.8              3.6     2.1
ORS relational      7.4     1.7              5.5     2.4
ORS social          7.5     1.6              3.9     2.4
ORS overall         7.5     1.6              4.0     2.0
ORS total           29.6    6.0              17.0    7.2
(SD = 47.3), indicating a high level of distress. The nonclinical
group had an average total score for the SCL-90 of
111.0 (SD = 21.8), reflecting a good level of well-being.
At intake, no significant differences were found
between males and females in terms of the ORS total
score (t(522) = 0.58, p > .05), the OQ-45 total score
(t(481) = −4.49, p > .05), or the SCL-90-R total score
(t(545) = −1.10, p > .05).
Cut-Off Scores and Reliable Change
The ORS cut-off score between the nonclinical and clinical
ranges was 24, one point lower than the American cut-off
score. At 9 points, the RCI (which is defined as the mini-
mum amount of change in outcome required to indicate
genuine change, rather than mere error) exceeded the
American RCI of 5 (Miller & Duncan, 2004). This suggests
that, during treatment, Dutch clients need to achieve a
greater degree of change on the ORS for such change to
be considered reliable.
Psychometric Properties of the Outcome
Rating Scale
Reliability
In the clinical sample, internal consistency was determined
at intake and at the first, third, and fifth sessions. The alpha
values of the total score varied from .82 to .96. The
nonclinical sample had an alpha value of .94 (N = 116).
The relationships between the subscales were strong and
in line with the results found both in American studies
(Miller et al., 2003) and in available Dutch data (Hafkensc-
heid et al., 2010).
The test-retest reliability of the ORS was established by
computing correlations between five measurement points in
the clinical sample (Table 3). The decrease in N from intake
(587 clients at intake and 323 at the first and second mea-
surement points) is due to missing data or to clients who
received no further treatment. The correlation between
the ORS total scores at subsequent measurement points
was adequate and slightly higher than that found in the
studies by Miller and colleagues (2003; r ranging from
.49 to .66) and by Hafkenscheid et al. (2010; r ranging
between .16 and .63).
Criterion Validity
As an outcome measure, the ORS must be able to distinguish
between clinical and nonclinical groups. The difference
between the mean scores of these groups (Table 2)
was significant (t(636) = −17.4, p < .05), indicating that
the ORS can indeed effectively distinguish between dys-
functional and functional clients at group level.
Concurrent Validity
In the clinical sample, correlations between the ORS and
OQ-45 were calculated (Table 4) at intake.
The reported correlations between ORS and OQ-45 sub-
scales and total scales were negative, as a good level of
well-being is indicated by high scores on the ORS but by
low scores on the OQ-45 (and the SCL-90). Overall these
correlations were moderately strong (Cohen, 1988),
although they were slightly lower than those found in the
study by Miller and colleagues (2003). However, the corre-
lation found for the ORS and OQ-45 total scores was in line
with their findings (r ranged from −.53 to −.69). The ORS,
as a general measure of treatment outcome, still appeared to
be reasonably valid.
Concurrent validity was also tested by calculating the
correlation between the ORS and SCL-90 total and subscale
scores at intake. In the clinical sample, correlations
ranged from r = −.09 to −.56 (n = 481). The strongest
Table 3. Test-retest reliability of the ORS between five administrations

        1st–2nd       2nd–3rd       3rd–4th       4th–5th
        n     r       n     r       n     r       n     r
ORS     323   .64     341   .57     339   .69     334   .63

Note. All correlations are significant at p < .01 level.
Table 4. Correlations between the ORS and OQ-45 subscales and total scales in the clinical sample at intake

                     OQ-45 SD     OQ-45 IR     OQ-45 SR     OQ-45 Total
                     (n = 493)    (n = 482)    (n = 492)    (n = 455)
ORS individual       −.53         −.40         −.30         −.52
ORS interpersonal    −.36         −.54         −.19         −.45
ORS social role      −.46         −.36         −.46         −.50
ORS overall          −.55         −.45         −.34         −.56
ORS total            −.58         −.54         −.40         −.62

Notes. All correlations are significant at p < .01 level. OQ-45 SD = Symptom Distress; OQ-45 IR = Interpersonal Relation; OQ-45 SR = Social Role.
relationships were between the ORS Overall scale and ORS
total score and the SCL-90 Depression scale (r = −.54 and
−.56, respectively), and between the ORS total score and
the SCL-90 total score (.50). In the nonclinical sample
(n = 111) the correlations were stronger (r ranged from
−.19 to −.70). Here too, the strongest relationships were
between the ORS Overall scale and the SCL-90 Depression
scale (r = −.70) and between ORS and the SCL-90 total
scores (−.66).
Sensitivity to Change
The ORS is used as an instrument to track progress, so it
must be capable of measuring changes in clients’ well-
being during treatment. Of the total sample, 172 clients
filled in the ORS both at intake and at the end of their treat-
ment. The mean ORS total score at intake was 16.9
(SE = .57) and 29.2 (SE = .58) posttreatment. The
posttreatment well-being of clients was significantly better
than it was before treatment commenced (t(171) = −17.72,
p < .05, r = .81).
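If the reported r is read as an effect size derived from the paired t statistic (an assumption, since the text does not say how it was obtained), the common conversion r = sqrt(t² / (t² + df)) can be sketched as follows. The function name `effect_size_r` is illustrative.

```python
import math


def effect_size_r(t, df):
    """Convert a t statistic and its degrees of freedom into an effect size r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# t and df as reported for the pre-post comparison of ORS total scores.
print(round(effect_size_r(t=-17.72, df=171), 2))  # 0.8
```

Under this assumption the conversion gives roughly .80, close to the reported .81; the small gap may reflect rounding of the published t value or a different computation.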
The Psychometric Properties of the Session
Rating Scale
Normative Data
Table 5 shows the mean SRS total scores and standard
deviations of the clinical sample at four measurement
points. The maximum mean achieved on the SRS leveled
off at 34 after 15 sessions. As with the ORS, the decline
in N from the intake value is due to missing data or to cli-
ents who did not receive further treatment.
Reliability
Alpha values ranged from .85 to .95 during the first five
sessions. The test-retest reliability (as measured by Spear-
man’s rho) was slightly less than that reported by Duncan
et al. (2003; an overall r of .64), but still moderately strong
(Table 6). Test-retest reliability was assessed between treat-
ment sessions. The SRS scores changed between sessions
(in general they seemed to improve over time, see Table 5),
and the correlations would probably be stronger if this had
not been the case. Thus, when taking this into account, the
test-retest reliability of the SRS can be considered adequate.
Concurrent Validity
The relationship between the SRS and the WAV-12 (as
measured by Spearman’s rho) was assessed at the beginning
of treatment (Table 7). The correlations were moderately
strong and significant (p < .01), but lower than expected,
indicating that the SRS and the WAV-12 may be measuring
slightly different aspects of the therapeutic relationship.
Predictive Validity
There is evidence that the quality of the therapeutic
relationship influences treatment outcome (e.g., Martin,
Garske, & Davis, 2000). A linear regression analysis was
therefore carried out to test the predictive value of the
SRS on treatment outcome (as measured by the difference
between SCL-90 total score at intake and SCL-90 posttreat-
ment score). The SRS total score for sessions two and three
did indeed predict outcome (p < .05), the SRS score for
session two being the strongest predictor (b1 = −.14,
p < .05). Nonetheless, the SRS had only very limited influence
(R² = .02).
Discussion
The aim of this study was to examine the psychometric
properties of the Dutch ORS and SRS, and to compare
the results with those obtained in American studies and
other Dutch studies.
The results demonstrate that the ORS and SRS have
strong internal consistency, reflecting a strong cohesion
of the items concerned. This is in line with the findings
of other studies. Furthermore, the ORS and SRS exhibited
adequate test-retest reliability, comparable to that found in
the American studies and another Dutch study (Duncan
et al., 2003; Hafkenscheid et al., 2010; Miller et al., 2003).
Table 5. Means and standard deviations on the SRS at sessions 1, 5, 10, and 15

        Session 1          Session 5          Session 10         Session 15
        n    M     SD      n    M     SD      n    M     SD      n    M     SD
SRS     349  30.1  6.1     321  32.0  4.7     208  32.6  4.7     121  33.6  4.4
Table 6. Test-retest reliability of the SRS between five administrations

        1st–2nd       2nd–3rd       3rd–4th       4th–5th
        n     rs      n     rs      n     rs      n     rs
SRS     317   .48     313   .72     315   .61     296   .59
The moderately strong correlations with other outcome
measures (concurrent validity) are somewhat lower than
expected. They are also lower than the correlations found
in other studies (Miller et al., 2003; Campbell & Hemsley,
2009). In particular, a stronger relationship was expected
between the ORS and OQ-45, as the former is based on
the latter. The difference in scaling (VAS and Likert scales)
could be a factor here. The strongest relationships found
were those between the ORS total and OQ-45 and
SCL-90 total scores.
The concurrent validity of the SRS, too, is not as high as
was expected, especially with regard to the subscales of the
SRS. This may indicate the SRS is measuring a somewhat
different construct than the WAV-12. Given the high inter-
nal consistency involved, it follows that it would be better
to use the total scores of the ORS and SRS as general out-
come and alliance scores, rather than interpreting the indi-
vidual items of these measures.
This study was subject to a number of limitations. For
instance, the ORS and SRS are Visual Analog Scales
(VAS), which clients could interpret subjectively. However,
various studies have shown VAS to be reliable and valid
measures, comparable to Likert scales (for an overview, see
Hasson & Arnetz, 2005). Another limitation of this study
was the method used to determine test-retest reliability.
The average interval between measurements was 1 week,
during which time the effect of treatment or external factors
might be expected to produce a change in the ORS, in par-
ticular. Duncan et al. (2003) have stated that instruments
which are sensitive to change can produce lower test-retest
correlations. Accordingly, the correlation should not be
interpreted too strictly. In order to determine test-retest cor-
relations more accurately, future studies should use shorter
intervals between measurements. Furthermore, the participants
in this study included a relatively high percentage
of males, so any future studies should include checks to
determine whether the scores obtained are representative
of the Dutch outpatient population as a whole.
One important aim of this study was to establish Dutch
standards for the ORS/SRS. Based on the data obtained in
this study, the clinical cut-off score of the ORS for Dutch
patients attending outpatient clinics in connection with
common mental disorders can be set at 24. This is one point
lower than the American cut-off score. The present study
gave an RCI for the ORS of 9 points, which differs from
the American RCI of 5 (Miller & Duncan, 2004) but is
more in line with the RCI of 8 found by Hafkenscheid
et al. (2010). This means that, relative to American clients,
Dutch clients need to achieve more change on the ORS in
order to achieve reliable change. This has implications for
the way in which the feedback system is used during ther-
apy, as the standards underpin decisions on whether to
change the approach or interventions used in the course
of treatment. For example, if a Dutch client exhibits a posi-
tive change of 5 points on the ORS, this might result in the
adoption of a different approach to treatment or even a
change of therapist. In the same situation, the American
interpretation would be that reliable change has been
achieved and that no change of therapist or approach is
necessary (given that there is a good therapeutic
relationship).
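The contrast between the two sets of standards can be made concrete with a short Python sketch. The cut-off and RCI values are those reported in the text (Dutch: 24 and 9; American: 25 and 5). The class, the function names, the example scores, and the treatment of a score exactly at the cut-off as nonclinical are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrsStandard:
    name: str
    cutoff: float  # boundary between the clinical and nonclinical ranges
    rci: float     # minimum change treated as reliable


# Values as reported in the text: Dutch (this study) and American (Miller & Duncan, 2004).
DUTCH = OrsStandard("Dutch", cutoff=24, rci=9)
AMERICAN = OrsStandard("American", cutoff=25, rci=5)


def is_reliable_improvement(first_score, latest_score, standard):
    """True when the gain on the ORS total score reaches the standard's RCI."""
    return latest_score - first_score >= standard.rci


def in_clinical_range(score, standard):
    """True when an ORS total score falls below the standard's cut-off (assumption)."""
    return score < standard.cutoff


# Hypothetical client who moves from 17 to 22 points (a 5-point gain).
for standard in (DUTCH, AMERICAN):
    print(standard.name, is_reliable_improvement(17, 22, standard))
# Dutch False
# American True
```

The same 5-point gain is therefore flagged as insufficient under the Dutch standards but as reliable change under the American standards, which is the situation the example in the text describes.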
The average scores on the SRS were lower than the
American cut-off score of 36, and never exceeded 34 points
during treatment. American data show that only 24% of
cases fall below the cut-off score of 36 (Miller & Duncan,
2004), yet the present study found that 73% of cases fall
below the American cut-off score at session 5. This sug-
gests that different standards might apply to the Dutch
cut-off score for the SRS. The low mean scores on the
SRS may be due to cultural differences or to the design
of the study. Unlike the therapists in the American studies,
the therapists in this study did not see the scores. It may be
that, when the SRS is discussed during the session, this
results in more socially desirable answers, which in turn
lead to higher scores. Before determining a cut-off score
for the Dutch SRS, this possibility needs to be investigated
further in the context of an effect study (in which scores are
discussed during treatment). A study of this kind is already
underway.
The predictive validity of the quality of the therapeutic
relationship, as measured by the SRS, was very limited.
Although the SRS scores at sessions two and three were found
to predict treatment outcome, this relationship was rela-
tively weak, suggesting that the therapeutic relationship
has only a marginal effect in this regard. However, further
research is needed to determine whether the predictive
validity of the SRS improves when it is actively used dur-
ing treatment. As the treatments given in this study were
very structured (the therapists used treatment manuals),
the quality of the therapeutic relationships in question
may be less relevant (e.g., Martin et al., 2000) than when
less rigidly structured treatments are used.
In conclusion, this study has shown that while both the
ORS and SRS demonstrate adequate reliability, their valid-
ity is limited. This finding is in line with those of previous
studies. Accordingly, while the ORS and SRS can be very
useful feedback instruments, it is advisable to supplement
them (at intervals of several sessions) with better validated
Table 7. Correlations (rs) between the SRS and the WAV-12 subscales and total scales at the beginning of treatment

                    WAV-12 Bond    WAV-12 Goal    WAV-12 Task    WAV-12 Total
                    (n = 235)      (n = 252)      (n = 248)      (n = 234)
SRS relationship    .32            .36            .37            .37
SRS goal            .38            .41            .40            .43
SRS approach        .31            .41            .46            .43
SRS overall         .37            .40            .45            .44
SRS total           .39            .43            .45            .46

Note. All correlations are significant at p < .01 level (2-tailed).
instruments, to corroborate progress. This study has also
revealed a difference between Dutch and American stan-
dards for the ORS and SRS, which can have major impli-
cations for the way in which the feedback system is used.
Accordingly, further research is needed on how standards
differ from one country to another, as little is known of
the standards used in countries other than the United States.
In using the ORS and the SRS, the main aims are to
help therapists prevent dropout and to make therapy more
efficient, by means of frequent feedback from clients. By
repeatedly measuring the client’s progress and satisfaction
with treatment, the therapist stays alert and the treatment
maintains the right focus. The ORS and SRS are clinical track-and-trace
tools that enhance treatment engagement and participation.
Treatment outcome, however, needs to be corroborated
by other more valid measures.
References
Anker, M., Duncan, B. L., & Sparks, J. A. (2009). Using client
feedback to improve couple outcomes: A randomized
clinical trial in a naturalistic setting. Journal of Consulting
and Clinical Psychology, 77, 693–704.
Arrindell, W. A., & Ettema, J. H. M. (2003). SCL-90, Handleiding
bij een multidimensionele psychopathologie indicator.
[SCL-90, Manual for a multidimensional indicator of
psychopathology]. Lisse, The Netherlands: Swets &
Zeitlinger.
Asmus, F., Crouzen, M., & van Oenen, F. J. (2004). Outcome
Rating Scale. Retrieved from
http://scottdmiller.com/purchase-individual-or-group-licenses
Campbell, A., & Hemsley, S. (2009). Outcome Rating Scale and
Session Rating Scale in psychological practice: Clinical utility
of ultra-brief measures. Clinical Psychologist, 13, 1–9.
Cohen, J. (1988). Statistical power analysis for the behavioral
sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Beljouw van, I. M. J., & Verhaak, P. F. M. (2010). Geschikte
uitkomstmaten voor routinematige registratie door
eerstelijnspsychologen [Appropriate outcome measures for
routine registration by primary care psychologists]. Utrecht,
The Netherlands: Nivel.
Bordin, E. S. (1979). The generalizability of the psychoanalytic
concept of the working alliance. Psychotherapy: Theory,
Research & Practice, 16, 252–260.
De Jong, K., Nugter, M. A., Polak, M. G., Wagenborg, J. E. A.,
Spinhoven, Ph., & Heiser, W. J. (2007). The outcome
questionnaire (OQ-45) in a Dutch population: A cross-
cultural validation. Clinical Psychology & Psychotherapy,
14, 288–301.
Derogatis, L. R. (1994). Symptom Checklist 90–R: Administra-
tion, scoring, and procedures manual (3rd ed.). Minneapolis,
MN: National Computer Systems.
Duncan, B. L., Miller, S. D., Sparks, J. A., Claud, D. A.,
Reynolds, L. R., Brown, J., & Johnson, L. D. (2003). The
session rating scale: Preliminary psychometric properties of
a “working” alliance measure. Journal of Brief Therapy, 3,
3–12.
Hafkenscheid, A., Duncan, B. L., & Miller, S. D. (2010). The
Outcome and Session Rating Scales: A cross-cultural
examination of the psychometric properties of the Dutch
translation. Journal of Brief Therapy, 7, 1–12.
Hannan, C., Lambert, M. J., Harmon, C., Nielsen, S. L., Smart,
D. W., Shimokawa, K., & Sutton, S. (2005). A lab test and
algorithms for identifying clients at risk for treatment
failure. Journal of Clinical Psychology, 61, 155–163.
Hasson, D., & Arnetz, B. B. (2005). Validation and findings
comparing VAS vs. Likert scales for psychosocial measure-
ments. International Electronic Journal of Health Educa-
tion, 8, 178–192.
Horvath, A. O., & Bedi, R. P. (2002). The alliance. In J. Norcross
(Ed.), Psychotherapy relationships that work: Therapist
contributions and responsiveness to patients (pp. 37–70).
New York, NY: Oxford University Press.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A
statistical approach to defining meaningful change in
psychotherapy research. Journal of Consulting and Clinical
Psychology, 59, 12–19.
Lambert, M. J., Hansen, N. B., Umphress, V. J., Lunnen, K.,
Okiishi, J., Burlingame, G., Huefner, J. C., & Reisinger,
C. W. (1996). Administration and scoring manual for the
Outcome Questionnaire (OQ 45.2). Wilmington, DE:
American Professional Credentialing Services.
Lambert, M. J., & Shimokawa, K. (2011). Collecting client
feedback. Psychotherapy, 48, 72–79.
Martin, D. J., Garske, J. P., & Davis, M. K. (2000). Relation of
the therapeutic alliance with outcome and other variables: A
meta-analytic review. Journal of Consulting and Clinical
Psychology, 68, 438–450.
Miller, S. D., Duncan, B. L., Brown, J., Sparks, J., & Claud, D.
(2003). The outcome rating scale: A preliminary study of the
reliability, validity, and feasibility of a brief visual analogue
measure. Journal of Brief Therapy, 2, 91–100.
Miller, S. D., & Duncan, B. L. (2004). The outcome and session
rating scale. Administration and scoring manual. Chicago,
IL: Institute for the Study of Therapeutic Change.
Miller, S. D., Duncan, B. L., Brown, J., Sorrell, R., & Chalk,
M. B. (2006). Using formal client feedback to improve
retention and outcome: Making ongoing, real time assess-
ment feasible. Journal of Brief Therapy, 5, 5–22.
Reese, R. J., Norsworthy, L. A., & Rowlands, S. R. (2009).
Does a continuous feedback system improve psychotherapy
outcome? Psychotherapy: Theory, Research, Practice,
Training, 46, 418–431.
Reese, R. J., Toland, M. D., Slone, N. C., & Norsworthy, L. A.
(2010). Effect of client feedback on couple psychotherapy
outcomes. Psychotherapy: Theory, Research, Practice,
Training, 47, 616–630.
Stinckens, N., Ulburghs, A., & Claes, L. (2009). De
werkalliantievragenlijst als sleutelelement in therapiegebeuren.
Meting met behulp van de WAV-12, de Nederlandstalige
verkorte versie van de Working Alliance Inventory. [The
working alliance questionnaire as a key element in therapy.
Measurement using the WAV-12, the Dutch shortened
version of the Working Alliance Inventory]. Tijdschrift voor
Klinische Psychologie, 39, 44–60.
Date of acceptance: April 22, 2013
Published online: August 23, 2013
Pauline Janse
Department HSK Utrecht
HSK Group
3522 KE Utrecht
The Netherlands
Tel. +31 62 808-8475
E-mail paulinejanse@hotmail.com