W-388B
Piers-Harris 2
Piers-Harris Children’s Self-Concept Scale,
SECOND EDITION
Ellen V. Piers, Ph.D.
David S. Herzberg, Ph.D.
MANUAL
Western Psychological Services • 12031 Wilshire Boulevard, Los Angeles, California 90025-1251
Additional copies of this manual (W-388B) may be purchased from WPS.
Please contact us at 800-648-8857, Fax 310-478-7838, or www.wpspublish.com.
Published by Western Psychological Services
The Piers-Harris Children’s Self-Concept Scale (Piers,
1963) was originally developed in the early 1960s to provide
a brief, self-report instrument for the assessment of self-
concept in children and adolescents. As defined by the scale’s
original authors, self-concept is a relatively stable set of atti-
tudes reflecting both description and evaluation of one’s own
behavior and attributes. Since its introduction, the Piers-
Harris has enjoyed widespread acceptance among clinicians
and researchers, as well as praise from reviewers. The instru-
ment’s stature is reflected in more than 500 citations in pro-
fessional journals and books in psychology, education, and
the health sciences. These numerous references highlight the
Piers-Harris’s vital role in the expansion of knowledge about
self-concept and its relationship to behavior.
The Piers-Harris Children’s Self-Concept Scale,
Second Edition (Piers-Harris 2) represents the culmination
of a careful revision process. The general goals of this pro-
cess were to enhance the ease of use and psychometric foun-
dation of the test, while preserving the many characteristics
of the instrument that have contributed to its success. These
goals have been realized in a set of specific improvements,
including new nationwide normative data, an updated item
set, enhanced interpretive guidelines, and modernized com-
puter assessment tools. Nevertheless, the Piers-Harris 2 re-
tains the familiar response format, self-concept scales, and
excellent psychometric properties of the original edition.
Thus, the revised test should be easily integrated into re-
search projects and clinical assessments that used the origi-
nal Piers-Harris.
General Description
The Piers-Harris 2 is a 60-item self-report question-
naire, subtitled The Way I Feel About Myself. It is designed
for administration to children who are at least 7 years old
and have at least a second-grade reading ability. The mea-
sure can be used with adolescents up to 18 years of age.
The Piers-Harris 2 items are statements that express
how people may feel about themselves. Respondents are
asked to indicate whether each statement applies to them by
choosing yes or no. Several methods of administration are
available: the Piers-Harris 2 AutoScore™ Form (WPS
Product No. W-388A), which is completed by the child and
scored manually by the test administrator; mail-in and fax-in
forms (WPS Product Nos. W-388C and W-388Z), which are
completed by the child and submitted to WPS for computer
scoring and report generation; a PC program (WPS Product
No. W-388Y), which can generate a report based on either
online administration or offline data entry; and the Spanish
Answer Sheet (WPS Product No. W-388E), which is com-
pleted by the child, whose answers are then transcribed onto
an AutoScore™ Form by the examiner. Using any of these
methods of administration, most respondents can complete
the Piers-Harris 2 in 10 to 15 minutes.
The Piers-Harris 2 includes the same Self-Concept
and Validity scales as the original Piers-Harris. The Self-
Concept scales comprise the Piers-Harris 2 Total (TOT)
score, which is a general measure of the respondent’s overall
self-concept, and the six domain scales, which assess spe-
cific components of self-concept. The domain scales include
Behavioral Adjustment (BEH), Intellectual and School
Status (INT), Physical Appearance and Attributes (PHY),
Freedom From Anxiety (FRE), Popularity (POP), and
Happiness and Satisfaction (HAP). (On the original Piers-
Harris, the Freedom From Anxiety scale was labeled
Anxiety and the Behavioral Adjustment scale was labeled
Behavior. All other scale names are unchanged from the
original instrument.) The Self-Concept scales are scored so
that a higher score indicates a more positive self-evaluation
in the domain being measured. The Piers-Harris 2 Validity
scales include the Inconsistent Responding (INC) index,
which is designed to identify random response patterns, and
the Response Bias (RES) index, which measures a child’s
tendency to respond yes or no irrespective of item content.
Piers-Harris 2 Improvements
The most important feature of the Piers-Harris 2 is its
incorporation of new, nationally representative normative
data. The new norms are based on a sample of 1,387 students,
aged 7 to 18 years, who were recruited from school districts
all across the United States. The sample closely approximates
the ethnic composition of the U.S. population (U.S. Bureau
of the Census, 2001a). The new standardization sample is a
significant improvement over the sample used to norm the
original Piers-Harris. That sample was recruited in the early
1960s from a single public school system in rural Pennsylvania,
and was relatively homogeneous in terms of ethnicity
and several other key demographic variables. In addition,
whereas the original Piers-Harris sample consisted of 4th
through 12th graders, the Piers-Harris 2 sample included
2nd and 3rd graders as well.
The second major enhancement in the Piers-Harris 2
is the reduction of the scale from 80 to 60 items. This item
reduction shortens administration time significantly, while
retaining all of the Self-Concept and Validity scales from the
original Piers-Harris. The deleted items included those of
relatively less psychometric value, as well as those written
in outdated language that was difficult for many children to
understand. The revised scales are psychometrically equiva-
lent to their counterparts in the original measure. Table 1
summarizes the changes in item composition and labeling
between the original and revised Self-Concept scales.
A third substantial change in the Piers-Harris 2 in-
volves the microcomputer administration and scoring pro-
gram. WPS offers a variety of computer services for many of
its products. The “Computerized Services for the Piers-
Harris 2” section at the back of this manual provides infor-
mation about the options available for the Piers-Harris 2.
The software has been updated for the latest version of the
Microsoft Windows operating system, with an attractive
new graphical user interface. In addition, the computer re-
port has been streamlined and updated to reflect the new
normative data.
This manual includes several new enhancements, in-
cluding a revised section on interpreting the test that incor-
porates three new case studies. Furthermore, the manual
now includes a topic-by-topic inventory of existing Piers-
Harris studies (see Appendix A), to facilitate further re-
search on the scale.
Principles of Use
The Piers-Harris 2 is appropriate for use in any re-
search, educational, or clinical setting that requires efficient
quantitative assessment of children’s reported self-concept.
The original Piers-Harris gained widespread acceptance
among researchers, as reflected in an extensive scholarly lit-
erature that has accumulated over the past four decades. The
instrument has been used to evaluate psychological and edu-
cational interventions, to investigate the relationship be-
tween self-concept and other traits and behaviors (e.g.,
empathy, teenage pregnancy, drug and alcohol use), and to
monitor changes in self-concept over time, among many
other research projects.
Because it is easily administered to groups, the Piers-
Harris 2 can be employed as a screening device in class-
rooms to identify children who might benefit from further
psychological evaluation. The Piers-Harris 2 can also be
used in individual clinical assessments of children and ado-
lescents. The Self-Concept scales can be used to generate
hypotheses for clinical exploration, as well as to guide clin-
icians in choosing among possible interventions and formu-
lating referral questions for further psychological testing.
The Piers-Harris 2 can be administered and scored by
teachers and other trained paraprofessionals. However, ulti-
mate responsibility for its use and interpretation should be
assumed by a professional with appropriate training in psy-
chological assessment. Before administering the Piers-
Harris 2, potential users should read this manual to become
familiar with the theoretical rationale, development, stan-
dardization, and psychometric properties of the measure.
As with many self-report measures, users should keep
in mind that the intent of the Piers-Harris 2 is readily appar-
ent to most children and adolescents. For this reason, the re-
sponses may be subject to conscious and unconscious
distortion, usually in the direction of greater social desir-
ability. The issue of response validity is addressed in greater
detail in chapter 3 of this manual.
Although the Piers-Harris 2 is a useful instrument, it
cannot by itself provide a comprehensive evaluation of a
child’s self-concept. Such an evaluation is a complex task
requiring clinical sensitivity and familiarity with the appli-
cable research literature. In making clinical judgments con-
cerning Piers-Harris 2 results, users should be prepared to
integrate other sources of data, which may include clinical
Table 1
Self-Concept Scales of the Original Piers-Harris and the Piers-Harris 2
Original Piers-Harris                                    Piers-Harris 2
Scale name                             No. of items      Scale name                             No. of items
Total                                  80                Total (TOT)                            60
Cluster scales                                           Domain scales
  Behavior (BEH)                       16                  Behavioral Adjustment (BEH)          14
  Intellectual and School Status (INT) 17                  Intellectual and School Status (INT) 16
  Physical Appearance and              13                  Physical Appearance and              11
    Attributes (PHY)                                         Attributes (PHY)
  Anxiety (ANX)                        14                  Freedom From Anxiety (FRE)           14
  Popularity (POP)                     12                  Popularity (POP)                     12
  Happiness and Satisfaction (HAP)     10                  Happiness and Satisfaction (HAP)     10
Note. Some items are assigned to more than one scale. See Appendix E for a list of original items that were deleted in developing the Piers-Harris 2.
interviews with the child and other informants, prior history,
school records, classroom observations, and results from other
psychological tests. Users should also be prepared to confer
with outside consultants and referral sources as needed.
Contents of This Manual
Chapter 2 of this manual contains instructions for ad-
ministering and scoring the Piers-Harris 2, and includes a
completed sample of an AutoScore™ Form. Chapter 3 pre-
sents guidelines for interpreting the test results. Technical
aspects of the test are presented in chapters 4 and 5. Chapter
4 reviews the development of the original Piers-Harris and
describes the new standardization sample and item revisions
for the Piers-Harris 2. Chapter 5 discusses the reliability and
validity of the Piers-Harris 2 and presents an overview of re-
search on the technical properties of the original test. This
manual also includes several appendixes that support spe-
cialized applications of the test: Appendix A presents a list
of research studies employing the Piers-Harris, organized by
topic; Appendix B reviews the use of the Piers-Harris with
exceptional children; and Appendixes C and D contain in-
structions and tables for comparing raw scores from the
Piers-Harris 2 with those from the original version of the
test. Appendix E lists the items from the original Piers-
Harris that were omitted from the Piers-Harris 2. Finally, in
the back of the manual is a chapter that provides instructions
for using the Piers-Harris 2 computer-scoring products.
DEVELOPMENT AND RESTANDARDIZATION
The original Piers-Harris Children’s Self-Concept
Scale was developed in the 1960s as a research instrument
and as an aid for clinical and educational evaluation in ap-
plied settings (Piers, 1984). Since its introduction, the Piers-
Harris has functioned well in these roles, forming the basis
for an impressive and growing body of research. The Piers-
Harris 2 is the first major revision and restandardization of
the original Piers-Harris. The new features of the Piers-
Harris 2, which include new normative data and an updated
item set, were implemented with the goal of maintaining as
much “backward compatibility” as possible with the original
Piers-Harris. This chapter begins by reviewing the theoretical
rationale for the Piers-Harris and the development of the
original item set and scoring system. The remainder of the
chapter is devoted to the Piers-Harris 2 revisions. The new
standardization sample is described first, followed by a dis-
cussion of the item and scale changes. The chapter concludes
with a discussion of moderator variables and their effects on
interpretation. Users interested in making direct comparisons
between Piers-Harris 2 scores and scores on the original
measure should consult Appendixes C and D, which present
instructions and tables necessary for comparing the scores.
Original Rationale and
Theoretical Background
The original version of the Piers-Harris was based on
the view that individuals maintain relatively consistent be-
liefs about themselves, beliefs that develop and stabilize
during childhood. This set of beliefs represents a person’s
self-concept, a term which some researchers have used
interchangeably with terms such as self-esteem and self-
regard. The original authors of the Piers-Harris assumed
that children would reveal important aspects of this underly-
ing self-image by agreeing or disagreeing with simple,
self-descriptive statements, and that this assessment of self-
concept would relate meaningfully to other aspects of per-
sonality and to predictions of future behavior.
From a global perspective, the term self-concept refers
to a person’s self-perceptions in relation to important as-
pects of life. Although shaped by biological and cultural fac-
tors, these perceptions are formed primarily through the
interaction of the individual with the environment during
childhood, and by the attitudes and behaviors of others.
These perceptions give rise to self-evaluative attitudes and
feelings that have important organizing functions and that
also motivate behavior. Over time, an individual’s self-
concept may change in response to environmental or develop-
mental changes, or as a result of changes in priorities or
values. However, these changes usually do not occur rapidly
or in response to isolated experiences or interventions.
This definition of self-concept rests on several theo-
retical assumptions:
1. Self-concept is essentially phenomenological in
nature. It is not something that can be observed
directly but must be inferred from either behav-
iors or self-report. Although behaviors are
directly measurable, it is difficult to use behav-
ioral observations to draw inferences about self-
concept that are replicable and consistent across
different situations. Self-report, although subject
to many types of distortions, is closer to the pre-
sent definition of self-concept, because it is a
direct expression of the individual’s experience
of the self. The problem of distortion of self-
report is a methodological issue, not a theoreti-
cal one.
2. Self-concept has both global and specific com-
ponents. Global self-concept reflects how an
individual feels about all the characteristics that
make up his or her person, taking into account,
among other things, skills and abilities, interac-
tions with others, and physical self-image.
Various specific aspects of self-concept result
from an individual’s self-appraisal in particular
areas of functioning. These specific facets of
self-concept differ on several dimensions. Some
are relatively broad (e.g., physical self, moral
and ethical self, academic self); others are nar-
rowly defined (e.g., good at mathematics, not
skilled at baseball). The relative significance to
the individual of each aspect of self-concept de-
termines the degree to which success and failure
affect overall self-evaluation (Dickstein, 1977;
Harter, 1978). In unimportant areas, for example,
perceived failure is not likely to have a strong
impact on the individual’s global self-evaluation.
Similar notions have been proposed by
Shavelson, Hubner, and Stanton (1976), who view
self-concept as being “hierarchically organized.”
3. Self-concept is relatively stable. Although
shaped by experience, it does not change easily
or rapidly. In children, self-concept is initially
more situationally dependent and becomes
increasingly stable over time. Although it may
be possible to enhance children’s self-concept
through lengthy corrective experiences, changes
are not likely to occur as the result of a brief,
single, or superficial intervention. For example,
a weekend camping trip may make a child feel
good but is unlikely to bring about lasting
change in that child’s self-concept. In addition,
certain areas of self-concept may be more diffi-
cult to change than others, and some may be
amenable to change only during certain “critical
periods” (Erikson, 1950; Schonfeld, 1969).
4. Self-concept has an evaluative as well as a
descriptive component. It represents an individ-
ual’s accumulated judgments concerning the
self. Some of these evaluations may reflect in-
ternalized judgments of others (e.g., values,
norms, and notions of what constitutes socially
desirable traits and behaviors). Others may be
unique to the individual. Thus, in evaluating
reported self-concept it is important to consider
both nomothetic (between-person) and idiographic
(within-person) sources of comparison.
The issues to be addressed concern both how
children compare themselves to their peers and
how they evaluate themselves against their own
internal standards.
5. Self-concept is experienced and expressed dif-
ferently by children at various stages of develop-
ment. During infancy, the focus is on differ-
entiating self from others and on establishing a
reciprocal relationship with the primary care-
taker or caretakers (Ainsworth, 1979; Mahler,
Pine, & Bergman, 1975). During the preschool
years, the child becomes more mobile, interacts
socially with other children and adults, and be-
gins to develop a sense of gender identity. Self-
concept during this period is defined primarily
by the child’s experience in each of these areas,
and by parental attitudes and behaviors. The
concepts of school-age children expand to
encompass a larger arena of daily interactions,
especially in the areas of achievement and peer
relationships. With increasing age and experi-
ence, the child’s self-perceptions also become
increasingly differentiated as he or she struggles
to integrate disparate aspects of experience into
a unified conceptual framework (Fahey &
Phillips, 1981). In adolescence, certain aspects
of self-concept may undergo rapid change or
differentiation (e.g., moral and ethical self-
image, physical self-concept), whereas others
develop in a continuous, stable way (Dusek &
Flaherty, 1981). For a more detailed discussion
of developmental issues relating to self-concept,
see Harter (1983).
6. Self-concept serves to organize and motivate
behavior. A stable self-concept maintains a con-
sistent image of a person’s typical reactions
across different situations. This helps to reduce
ambiguity in new situations and structure behav-
ior toward preexisting goals. Action is also
guided by an individual’s judgment of whether
or not a particular behavior is consistent with his
or her self-image. Behaviors that are congruent
with one’s self-concept will tend to be favored
over incongruent behaviors. In this fashion,
judgments concerning the relative success or
failure of particular actions, as well as the emo-
tions (e.g., pride, joy, humiliation) related to
these outcomes, may serve an important moti-
vating function.
Original Piers-Harris Development
Item Development
The original Piers-Harris items were derived from the
work of Jersild (1952), who asked children what they liked
and disliked about themselves. These statements were then
grouped into the following categories: (a) physical charac-
teristics and appearance; (b) clothing and grooming; (c)
health and physical well-being; (d) home and family; (e) en-
joyment of recreation; (f) ability in sports and play; (g) aca-
demic performance and attitudes toward school; (h)
intellectual abilities; (i) special talents (music, arts); (j) “Just
Me, Myself”; and (k) personality characteristics, inner re-
sources, and emotional tendencies.
An initial item set, consisting of 164 items, was written
to reflect these various aspects of children’s self-concept.
The items were written as simple declarative statements
(e.g., “I am a happy person”), with a yes/no response format.
To reduce the possible effects of response biases, approxi-
mately half of the items were negatively worded (e.g., “I be-
have badly at home”) and half were worded in the direction
of positive self-concept (e.g., “I have many friends”). Most
items were written to avoid such problematic features as
double-negative constructions and ambiguous qualifiers
such as many, often, or rarely. Finally, 12 “lie” scale items
were included to assess the tendency to respond in a social-
ly desirable manner. These items were intended to mea-
sure children’s willingness to admit relatively common
weaknesses (e.g., “I am always good” or “Sometimes I act
silly”). However, these “lie” scale items were later dropped
when it was found that they did not contribute significantly
to the validity of the scale.
This preliminary pool of items was then administered
to a sample of 90 children from Grades 3, 4, and 5. To
minimize errors due to differences in reading ability, the
items were read aloud by the examiners while the children
followed along in their test booklets. This pilot study es-
tablished that the children understood the items, and that
the inventory could be completed in approximately 30 to
35 minutes.
The pilot study results were used to reduce the item
pool. Items answered in one direction by less than 10% or
more than 90% of the respondents were inspected and, in
most cases, dropped. However, because the instrument was
designed to identify children with problems in self-concept,
a few items such as “My parents love me” were temporarily
retained even though answered yes by the great majority of
children. This procedure reduced the item pool to 140 items.
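The endorsement-rate screen described above can be sketched as follows. The response data are hypothetical, and per the text, flagged items were inspected rather than deleted automatically:

```python
def endorsement_rate(responses):
    """Proportion of respondents answering 'yes' (1) to an item."""
    return sum(responses) / len(responses)

def flag_extreme_items(item_responses, low=0.10, high=0.90):
    """Return items answered in one direction by fewer than 10% or
    more than 90% of respondents; these were candidates for dropping."""
    flagged = []
    for item, responses in item_responses.items():
        rate = endorsement_rate(responses)
        if rate < low or rate > high:
            flagged.append(item)
    return flagged

# Hypothetical pilot data: 1 = yes, 0 = no.
data = {
    "I am a happy person": [1] * 19 + [0],      # 95% yes -> flagged
    "I have many friends": [1] * 12 + [0] * 8,  # 60% yes -> retained
}
print(flag_extreme_items(data))  # ['I am a happy person']
```

Items like “My parents love me” would be flagged by this rule yet, as the text notes, were temporarily retained on clinical grounds, so the screen was advisory rather than mechanical.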
A second pilot study was conducted with a sixth-grade
sample of 127 students. The 30 highest and 30 lowest scores
were identified, and items were retained only if (a) they dis-
criminated significantly between these high and low groups
(p < .05), and (b) they were answered in the expected direc-
tion by at least half of the high-scoring group. These proce-
dures reduced the item pool to the 80 items that composed
the original Piers-Harris.
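The manual does not specify which significance test was used in these 1960s item analyses. As one plausible instantiation, a two-proportion z-test on high- versus low-group endorsement rates can express both retention criteria:

```python
import math

def two_prop_z(p_hi, p_lo, n_hi, n_lo):
    """Two-proportion z statistic for the difference in endorsement
    rates between high- and low-scoring groups."""
    p = (p_hi * n_hi + p_lo * n_lo) / (n_hi + n_lo)
    se = math.sqrt(p * (1 - p) * (1 / n_hi + 1 / n_lo))
    return (p_hi - p_lo) / se

def keep_item(p_hi, p_lo, n_hi=30, n_lo=30, z_crit=1.96):
    """Both retention criteria from the text: significant high/low
    discrimination (two-tailed p < .05, i.e., |z| > 1.96) and
    endorsement in the expected direction by at least half of the
    high-scoring group."""
    return abs(two_prop_z(p_hi, p_lo, n_hi, n_lo)) > z_crit and p_hi >= 0.5

print(keep_item(p_hi=0.8, p_lo=0.3))   # True: discriminates, majority-endorsed
print(keep_item(p_hi=0.5, p_lo=0.45))  # False: no significant discrimination
```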
Scale Construction
Total score. The Piers-Harris Total score was based
on all 80 items. It was calculated by crediting 1 point for
each item answered in the direction of positive self-concept.
The Total score was designed to measure a general dimen-
sion of self-concept or self-esteem.
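This unit-weighted scoring rule can be sketched in a few lines; the item numbers and scoring key below are hypothetical, since the actual key belongs to the published scoring materials:

```python
def total_score(answers, positive_keyed):
    """Piers-Harris-style Total score: 1 point per item answered in
    the direction of positive self-concept.

    answers: dict mapping item number -> True (yes) / False (no)
    positive_keyed: set of item numbers for which 'yes' is the
    positive direction; 'no' is the positive direction for the rest.
    """
    return sum(
        1 for item, ans in answers.items()
        if ans == (item in positive_keyed)
    )

# Hypothetical 4-item illustration.
answers = {1: True, 2: False, 3: True, 4: True}
positive_keyed = {1, 3}   # 'yes' is positive for items 1 and 3
print(total_score(answers, positive_keyed))  # 3
```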
Cluster scales. Piers (1963) investigated the multi-
dimensional nature of the scale by conducting a principal
components analysis using a sample of 457 sixth graders.
Varimax rotation yielded six interpretable factors that to-
gether accounted for 42% of the variance in item responses.
These factors were labeled Behavior, Intellectual and School
Status, Physical Appearance and Attributes, Anxiety,
Popularity, and Happiness and Satisfaction.
This factor structure was supported by numerous sub-
sequent factor analyses, which are described in more detail
in chapter 5. The factor structure formed the basis for the
subscales of the Piers-Harris, which were called cluster
scales in the 1984 edition of the Piers-Harris Manual. As
with the Total score, the cluster scales were scored by cred-
iting 1 point for each item answered in the direction of pos-
itive self-concept.
Piers-Harris 2 Development
In revising the original Piers-Harris, the primary ob-
jectives were to update and improve the test’s normative data
and item set. The original Piers-Harris normative data were
problematic in several respects. First, the standardization
sample was recruited in the early 1960s from a single
public-school system in rural Pennsylvania. The standardization
sample was relatively homogeneous with respect to
ethnicity and several other key demographic variables. This
made it more difficult to interpret Piers-Harris results for mi-
nority children and those from other groups who differed
substantially from the children in the standardization sample.
Second, when the cluster and validity scales were developed
for the original Piers-Harris, they were normed on different
samples than the standardization sample used to norm
the Total score. Although this procedure was necessary in
order to develop these important new scales, it clearly deviat-
ed from ideal test-standardization procedures, in which all
scales are normed on a single standardization sample.
A second set of concerns addressed in the revision
process involved the test items themselves. Decades of ex-
perience with the original scale have revealed problems with
certain items. Some of these items have become difficult for
younger children to understand because they were written
with outdated language or low-frequency words. In addition,
several researchers have identified opportunities to shorten
the Piers-Harris by deleting items that have relatively limit-
ed psychometric utility.
Standardization Sample
The Piers-Harris 2 restandardization was based on a
large sample of students recruited from elementary, middle,
junior high, and high schools throughout the United States.
Table 6 presents the demographic characteristics of the sam-
ple, along with corresponding percentages from the U.S.
Census (U.S. Bureau of the Census, 2001a, 2001b) for com-
parison. The sample is distributed relatively uniformly with-
in the age range of 7 to 18 years (Grades 2 through 12). The
distribution of the sample among ethnic groups is similar to
the U.S. Census figures, with slight underrepresentation of
Asians and Hispanics. Geographical distribution is ade-
quate, with slight underrepresentation of participants from
the western region of the United States.
Table 6 includes the distribution of a subsample of
participants among various categories of head-of-household
educational level. This variable is used as an index of socio-
economic status (SES), with higher educational level as-
sociated with higher SES. The subsample is fairly close to
the U.S. Census figures for the lowest two SES categories.
The proportion of subsample participants in the top SES cat-
egory is higher than the U.S. Census proportion. The use of
a subsample for the SES comparisons reflects the fact that
education data for heads of household were available for
only 673 participants in the Piers-Harris 2 standardization
study. The remaining 714 participants came from sites
where SES data were not collected as part of the Piers-
Harris 2 standardization study. Fortunately, these sites were
conducting other standardization studies concurrently, and
SES data were collected in these other studies. These other
standardization studies used different participants than the
Piers-Harris 2 study. However, because sampling for the
various studies was random, there was no reason to expect
systematic differences in SES between participants in the
Piers-Harris 2 study and those in other standardization stud-
ies. Examination of the head-of-household education data
for participants in these other studies revealed a distribution
that is almost identical to the distribution for the 673 Piers-
Harris 2 participants (see Table 6). Therefore, it is reason-
able to assume that the distribution of head-of-household
data presented in Table 6 approximates that of the entire
Piers-Harris 2 sample.
Item and Scale Revisions
Item revisions. As noted above, a major goal of the
Piers-Harris 2 revision was to streamline the scales by elim-
inating problematic items. The first items targeted were those
that contributed only to the Piers-Harris Total score and not
to any of the six domain scores (Benson & Rentsch, 1988). A
second set of items tagged for elimination included those
whose wording has become outdated (e.g., “I have lots of
pep”); those that seemed specific to one sex (e.g., “I have a
good figure”); and those containing words that frequently
needed additional explanation, especially for younger chil-
dren (e.g., “I am obedient at home”). These procedures iden-
tified 20 items as candidates for deletion, which left a revised
set of 60 items. Statistical analyses (which are described in
the next two sections) established that deletion of these items
would result in no appreciable loss of reliability in the Total
score or domain scale scores. In addition, the Total and do-
main scores from the 60-item set correlated very highly with
the original scores derived from the 80-item set.
These analyses supported the decision to shorten the
measure by deleting the 20 candidate items. The deleted
items, with their original item numbers, are listed in Appen-
dix E. (The remaining items, which constitute the Piers-
Harris 2, are presented in Table 4.) In addition to the
deletions, one item was slightly altered from its original
wording. Item 37 was changed from “I am among the last to
be chosen for games” to “I am among the last to be chosen
for games and sports.” This change was made to ensure par-
allel wording with two other items (9 and 51) that contain
the phrase “games and sports.” The changes resulted in the
60-item Piers-Harris 2, which yields the same scores as the
original test without sacrificing its psychometric strengths.
The new version also decreases administration time signifi-
cantly, as compared to the original measure. A readability
analysis was conducted on the Piers-Harris 2 item set. The
Flesch Reading Ease score was 91.8, and the Flesch-Kincaid
Grade Level was 2.1, indicating that second-grade readers
should be able to read the items with little difficulty.
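The two readability statistics cited above are computed from word, sentence, and syllable counts using the standard published Flesch formulas. The counts in this sketch are illustrative, not the actual Piers-Harris 2 item statistics:

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (higher = easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level (approximate U.S. school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Illustrative counts: short declarative items (about 5 words per
# sentence) built mostly from one-syllable words score as very easy.
print(round(flesch_reading_ease(300, 60, 390), 1))   # 91.8
print(round(flesch_kincaid_grade(300, 60, 390), 1))  # 1.7
```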
Total (TOT) score. As in the original measure, the
Piers-Harris 2 Total (TOT) score is a measure of general
self-concept or self-esteem. The TOT score is derived by
crediting 1 point for each item answered in the direction of
positive self-concept. Because all 80 items of the original
Piers-Harris were administered to the Piers-Harris 2 standard-
ization sample, the reliability of the original and revised
scales could be compared (see chapter 5 for a more thorough
discussion of reliability and associated terminology).
Coefficient alpha values for the 80-item original Total score
and the 60-item Piers-Harris 2 TOT score were .93 and .91,
respectively. These statistics indicate that both versions
demonstrate robust internal consistency and that the revised
scale shows no significant loss in reliability compared to the
lengthier original version. In addition, the original and re-
vised Total scores correlate at .98, indicating that they are
functionally equivalent.
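Coefficient alpha can be illustrated with a short, self-contained sketch. The response data below are hypothetical, and the function simply implements the standard alpha formula, not any WPS scoring software:

```python
def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of per-respondent rows, where
    each row holds the 0/1 scores for every item on the scale."""
    k = len(item_scores[0])   # number of items
    n = len(item_scores)      # number of respondents

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Tiny hypothetical data set: 4 respondents x 3 dichotomous items.
rows = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
]
print(round(cronbach_alpha(rows), 2))  # 0.75
```

Applied to the full standardization responses, the same statistic yields the .93 and .91 values reported above for the 80- and 60-item Total scores.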
Domain scales. As noted earlier, the original Piers-
Harris included six factor-analytically derived cluster scales
Table 6
Demographic Characteristics of the Piers-Harris 2 Standardization Sample

                                          Sample            U.S. Census
                                          n        %        % (a)
Sex
  Male                                    689      49.7     51.3
  Female                                  698      50.3     48.7
Age in years
  7–8                                     188      13.6
  9–10                                    231      16.7
  11–12                                   277      20.0
  13–14                                   271      19.5
  15–16                                   255      18.4
  17–18                                   165      11.9
Race/Ethnic background
  Asian                                    17       1.2      3.5
  Black                                   255      18.4     14.7
  Hispanic/Latino                         102       7.4     17.1
  White                                   943      68.0     60.9
  Native American                          16       1.2      0.9
  Other                                    50       3.6      2.9
  Not specified                             4       0.2
U.S. geographic region
  Northeast                               316      22.8     19.0
  Midwest                                 424      30.6     22.9
  South                                   463      33.4     35.6
  West                                    184      13.3     22.5
Head of household's educational level (b)
  Less than high school graduate           73      10.8     11.4
  High school graduate                    182      27.0     31.9
  Some college                            100      14.9     28.0
  Four-year college degree or more        318      47.3     28.7

Note. N = 1,387 (except for head of household's educational level).
(a) U.S. Census figures for sex, age, race/ethnicity, and geographic region (U.S. Census Bureau, 2001a) are based on the U.S. population of school-aged children. Census data for head of household's education level (U.S. Census Bureau, 2001b) are based on adults aged 25 to 54 (those most likely to be parents of school-aged children).
(b) n = 673; see text for discussion of missing data in this subtable.
(Behavior, Intellectual and School Status, Physical
Appearance and Attributes, Anxiety, Popularity, and Happi-
ness and Satisfaction). These scales have been retained in
the Piers-Harris 2, with very slight modifications. They are
scored by crediting 1 point for each scale item answered in
the direction of positive self-concept.
Several labeling changes have been instituted in the
Piers-Harris 2. The cluster scales have been relabeled do-
main scales to reflect the fact that they are intended to mea-
sure particular domains of self-concept. This label was
judged more appropriate than cluster scales, which refers to
the statistical procedures used to assign items to the scales,
rather than the clinical utility of the scales themselves. In ad-
dition, the Anxiety scale is labeled Freedom From Anxiety in
the Piers-Harris 2. This was done to correct a confusing as-
pect in the scoring of the original instrument. In the original
Piers-Harris, as in the Piers-Harris 2, all scales are scored so
that higher scores reflect more positive self-concept. On the
original instrument, the scale labeled Anxiety creates confu-
sion because most users intuitively believe that higher scores
indicate more anxiety, but in fact higher scores on this scale
indicate less anxiety. The Piers-Harris 2 name Freedom
From Anxiety removes this source of confusion and is more
consistent with the labeling conventions for the other five
domain scales, which generally refer to positive attributes
(e.g., Popularity, Happiness and Satisfaction). Finally, the
Behavior scale has been given the more specific, descriptive
label of Behavioral Adjustment in the Piers-Harris 2.
The item assignments for the Piers-Harris 2 domain
scales (see Table 3) represent relatively minor changes from
the composition of the original cluster scales. The item re-
visions did not affect the Freedom From Anxiety (FRE),
Popularity (POP), and Happiness and Satisfaction (HAP)
scales, which retain the same items as their counterparts in
the original measure. The Behavioral Adjustment (BEH)
and Physical Appearance and Attributes (PHY) scales each
lost two items in the revision, and the Intellectual and
School Status (INT) scale lost one item. Most of the 20
items dropped from the Piers-Harris 2 contributed only to
the Total score and not to any of the domain scales. Table 7 presents
reliability coefficients for the revised and original versions
of these three scales, demonstrating that the item changes
did not cause a meaningful decrement in the internal consis-
tency of these scales. In addition, the extremely high corre-
lation coefficients demonstrate that the revised and original
versions of these scales are essentially equivalent.
As Table 3 indicates, the Piers-Harris 2 domain scales,
like their counterparts in the original measure, contain nu-
merous overlapping items. The item overlap is an artifact of
the factor-analytic procedures used to derive the original
scales. Cooley and Ayres (1988) advocated eliminating this
item overlap in order to increase the independence of the
cluster scales. They suggested assigning each overlapping
item only to the scale with which it correlated most strongly.
This procedure was considered for the Piers-Harris 2, but re-
liability analyses showed that nonoverlapping domain scales
suffered a significant drop in internal consistency. This loss
of reliability was especially apparent in the youngest chil-
dren in the Piers-Harris 2 standardization sample. For 7- to
8-year-old children, coefficient alpha was less than .60 for
one scale and less than .70 for two others. Because such low
reliability coefficients are undesirable, the authors of the
Piers-Harris 2 decided against the creation of nonover-
lapping domain scales.
Validity scales. The Piers-Harris 2 includes two
Validity scales, the Inconsistent Responding (INC) index
and the Response Bias (RES) index, which help the test user
to detect deviant response sets. The INC scale is designed to
identify random response patterns. It is based on the suppo-
sition that certain pairs of responses are contradictory and/or
statistically improbable. The INC scale was introduced in
the 1984 edition of the Piers-Harris Manual, but has been
revised extensively for the Piers-Harris 2.
Inconsistent Responding index. The INC scale was
constructed using both rational and empirical procedures.
First, the 60 Piers-Harris 2 items were examined to deter-
mine pairs of items on which it is possible to produce logi-
cally inconsistent responses (e.g., Item 3, “It is hard for me
to make friends,” and Item 41, “I have many friends”).
Second, a correlation matrix was formed for all 60 items
using the Piers-Harris 2 standardization data. Item pairs that
correlated at r ≥ .25 were examined, and pairs were retained
if their content created the potential for a logically inconsis-
tent pair of responses, as in the aforementioned example.
Third, frequency tables were constructed for all item pairs
identified in the first two steps. These tables were used to
determine which particular combination of responses (e.g.,
yes on one item, no on another) occurred least frequently in
Chapter 4 Development and Restandardization 41
Table 7
Intercorrelations and Reliability Coefficients for Revised and Original Domain Scales

                                                           Coefficient alpha
Domain scale                                  ra     Piers-Harris 2    Original Piers-Harris
Behavioral Adjustment (BEH)                   .98         .81                  .81
Intellectual and School Status (INT)          .99         .81                  .82
Physical Appearance and Attributes (PHY)      .98         .75                  .79

Note. These are the only domain scales (called “cluster scales” on the original Piers-Harris) that had item changes in the Piers-Harris 2 revision.
aThe correlation coefficient between the revised and original scale raw scores.
the standardization sample. Item pairs were retained if they
produced a pair of logically inconsistent responses that oc-
curred in less than 10% of the sample.
These procedures resulted in an INC scale composed
of 15 item pairs, which are displayed in Table 8. This revised
INC scale features two substantial improvements over the
one described in the 1984 edition of the Piers-Harris
Manual. First, the current revision includes no duplicate
items among its 15 item pairs, whereas the 25 item pairs
in the original version of the INC scale included numerous
duplicate items. This meant that a particular response to a
single item could have a disproportionate effect on the total
INC score, potentially distorting the meaning of the scale.
Second, the items in the current revision of the scale are
distributed relatively evenly among the six domain scales.
The original version of the INC scale was heavily weighted
with items from the Behavior scale, with fewer items repre-
senting the remaining five domain scales. This meant that
the original INC score gave little information about the va-
lidity of responses on these other five scales.
The INC index is scored by crediting 1 point for each
pair of items for which the examinee gives the specified
combination of keyed responses. For example, a yes re-
sponse to Item 1 (“My classmates make fun of me”) coupled
with a no response to Item 47 (“People pick on me”) would
be scored “1” on the INC index. Note that the other combi-
nation of logically inconsistent responses to this item pair
(no to Item 1, yes to Item 47) would not be scored on the
INC index. This scoring system results in a possible INC
index score of 0 to 15.
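This scoring rule can be illustrated directly from the keyed pairs listed in Table 8; for brevity, only the first 3 of the 15 pairs are included in the sketch.

```python
# Sketch of INC index scoring: 1 point per item pair answered with the
# keyed (logically inconsistent) combination. Pairs taken from Table 8;
# only the first three of the 15 pairs are shown.
INC_PAIRS = [
    ((1, "yes"), (47, "no")),   # classmates make fun of me / people pick on me
    ((2, "no"),  (42, "yes")),  # happy person / cheerful
    ((3, "no"),  (41, "no")),   # hard to make friends / many friends
]

def inc_index(responses):
    """Count pairs where BOTH keyed responses were given (range 0-15 in full)."""
    return sum(
        1 for (i, ri), (j, rj) in INC_PAIRS
        if responses.get(i) == ri and responses.get(j) == rj
    )

responses = {1: "yes", 47: "no", 2: "yes", 42: "yes", 3: "no", 41: "yes"}
print(inc_index(responses))  # 1: only the first pair matches its keyed combination
```

Consistent with the text, note that the reverse combination (no to Item 1, yes to Item 47) earns no INC point, because only the keyed combination is counted.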
In the Piers-Harris 2 standardization sample, the INC
score had a mean of .89 and a standard deviation of 1.12. As
would be expected, the distribution of this score was highly
positively skewed (that is, most examinees produced very
low scores on the INC index). The cumulative frequencies
for the INC index are shown in Table 9.
As noted in chapter 3, interpretation of the INC index
is based on a cutoff score of 4. When a child scores 4 or
greater on this index, there is a very high likelihood that the
Table 8
Inconsistent Responding (INC) Index Item Pairs

                                                                                        Inter-item     Frequency of response combination
INC item pair (Keyed response)                                                          correlation    in standardization sample (%)
1. My classmates make fun of me. (Y) 47. People pick on me. (N) .46 7.2
2. I am a happy person. (N) 42. I am cheerful. (Y) .45 4.5
3. It is hard for me to make friends. (N) 41. I have many friends. (N) .40 7.9
4. I am often sad. (N) 40. I am unhappy. (Y) .46 5.3
5. I am smart. (N) 43. I am dumb about most things. (N) .42 4.9
7. I get nervous when the teacher calls on me. (Y) 10. I get worried when we have tests in school. (N) .35 9.6
9. I am a leader in games and sports. (Y) 51. In games and sports, I watch instead of play. (Y) .27 5.3
14. I cause trouble to my family. (N) 20. I behave badly at home. (Y) .41 6.3
18. I am good in my schoolwork. (N) 21. I am slow in finishing my schoolwork. (N) .32 8.1
19. I do many bad things. (Y) 27. I often get into trouble. (N) .47 6.6
26. My friends like my ideas. (N) 39. My classmates in school think I have good ideas. (Y) .59 3.7
29. I worry a lot. (N) 56. I am often afraid. (Y) .36 6.0
31. I like being the way I am. (N) 35. I wish I were different. (N) .48 3.9
44. I am good-looking. (Y) 49. I have a pleasant face. (N) .56 6.5
53. I am easy to get along with. (Y) 60. I am a good person. (N) .38 3.3
Table 9
Cumulative Frequency of
Inconsistent Responding (INC)
Index Scores in the
Standardization Sample
INC score Cumulative frequency (%)
0 47.9
1 77.0
2 91.0
3 96.7
4 99.0
5 99.8
6 99.9
7 99.9
8 99.9
9 100.0
Note. N = 1,387.
Piers-Harris 2 was completed in a random or inconsistent
manner. This cutoff score was determined by first selecting a
score that was approximately 2 standard deviation units
above the mean. In the Piers-Harris 2 standardization sam-
ple, the corresponding score was 3.13 (or 3, when rounded
to the nearest whole number). As Table 9 illustrates, only
3.3% of the children in the Piers-Harris 2 standardization
sample produced an INC index score greater than 3. This
suggested that a cutoff score of 4 would identify those chil-
dren whose INC scores were extremely deviant.
To further assess the interpretive value of this cutoff
score, analyses were undertaken to determine the probabili-
ty that an INC score of 4 represented a random response set.
To accomplish this, the distribution of INC scores from the
Piers-Harris 2 standardization sample (N = 1,387) was com-
pared to the distribution of INC scores from 1,387 random
Piers-Harris 2 response sets. The question of interest was:
For any given INC score, what was the probability that it
was drawn from the random data set as opposed to the data
collected from actual respondents? For each INC score,
the ratio of the frequency counts between the two samples
was calculated and expressed as a percentage indicating
the probability that the score was from the random data.
Table 10 presents these percentages for all possible scores
on the INC scale. Of the 337 cases that scored 4 on the INC
scale, 32 were in the respondent sample and 305 were in
the random sample. Thus, any Piers-Harris 2 response set
that yielded an INC score of 4 had a 91% likelihood of
being a random response set.
These analyses suggest that a cutoff score of 4 is use-
ful for interpreting this index. An INC score of 4 represents
a high probability (91%) that a particular Piers-Harris 2 was
completed at random, or without adequate understanding of
the content of the test items.
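The two-part rationale for the cutoff reduces to simple arithmetic, reproduced here from the figures reported above (the INC mean and standard deviation, and the frequency counts at a score of 4).

```python
# Arithmetic behind the INC cutoff, using the figures reported in the text.

mean_inc, sd_inc = 0.89, 1.12
two_sd_point = mean_inc + 2 * sd_inc
print(round(two_sd_point, 2))     # 3.13, so scores of 4 or more lie beyond ~2 SD

# Probability that an INC score of 4 came from the random data set:
# of the 337 cases scoring 4, 305 were random and 32 were real respondents.
random_cases, respondent_cases = 305, 32
p_random = random_cases / (random_cases + respondent_cases)
print(round(100 * p_random))      # 91 (%)
```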
Response Bias (RES) index. The RES scale is a
straightforward measure of response bias. A positive response
bias represents the tendency to answer yes regardless of item
content, whereas a negative response bias represents the ten-
dency to answer no regardless of item content. Either of these
biases might distort the meaning of Piers-Harris 2 Self-
Concept scores. The RES index allows the examiner to take
response bias into account when interpreting the Piers-Harris
2. This index is a count of the number of items to which the ex-
aminee has responded yes. The RES score has a possible range
of 0 to 60. Very high RES scores indicate a positive response
bias, and very low scores signal a negative response bias.
In the Piers-Harris 2 standardization sample, the RES
score had a mean of 28.39 and a standard deviation of 5.28.
The distribution of the scores approximated a normal distri-
bution. Cutoff scores for interpretation were set at +/– 2
standard deviation units. These cutoffs corresponded to RES
scores of 40 and 18, respectively. Thus, children scoring 40
or above on the RES scale are identified as having given an
Table 10
Probability That Various
Inconsistent Responding (INC) Scores
Represent Random Responding
Probability of random
INC score response (%)
0 2
1 21
2 52
3 81
4 91
5 96
6 99
≥7 >99
Table 11
Descriptive Statistics for Piers-Harris 2 Raw Scores in the Standardization Sample
Scale No. of items M SD
Self-Concept Scales
Total (TOT) 60 44.6 10.2
Domain Scales
Behavioral Adjustment (BEH) 14 11.2 2.9
Intellectual and School Status (INT) 16 11.9 3.4
Physical Appearance and Attributes (PHY) 11 7.8 2.6
Freedom From Anxiety (FRE) 14 10.0 3.4
Popularity (POP) 12 8.4 2.7
Happiness and Satisfaction (HAP) 10 8.1 2.2
Validity Scales
Inconsistent Responding (INC) 15 pairs 0.9 1.1
Response Bias (RES) 60 28.4 5.3
Note. N = 1,387.
unusually large number of yes responses. Conversely, chil-
dren scoring 18 or below on the RES index are flagged as
having given an unusually large number of no responses.
These two cutoff scores were chosen because they are reason-
ably conservative and because moderate deviations in re-
sponse set in either direction usually do not create major
problems for interpretation.
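RES scoring and flagging reduce to a yes-count compared against the two cutoffs. The following minimal sketch uses an invented response protocol; only the cutoffs and the counting rule come from the text.

```python
# Minimal sketch of RES index scoring and flagging.
# `responses` maps item number -> "yes"/"no"; the protocol below is toy data.

RES_HIGH, RES_LOW = 40, 18  # cutoffs at roughly +/- 2 SD (M = 28.39, SD = 5.28)

def res_index(responses):
    """RES is simply the count of yes responses (possible range 0-60)."""
    return sum(1 for r in responses.values() if r == "yes")

def res_flag(score):
    if score >= RES_HIGH:
        return "positive response bias (unusually many yes responses)"
    if score <= RES_LOW:
        return "negative response bias (unusually many no responses)"
    return "within normal limits"

toy = {item: ("yes" if item % 4 else "no") for item in range(1, 61)}  # 45 yes
print(res_index(toy), "->", res_flag(res_index(toy)))
```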
Derivation of Standard Scores
Table 11 presents means and standard deviations for the
raw scores of the Piers-Harris 2 scales in the standardization
sample. To facilitate interpretation of Piers-Harris 2 results,
these raw scores have been converted to normalized T-scores
(see Anastasi, 1988, p. 88). The original distribution of
Piers-Harris 2 raw scores underwent a nonlinear transform-
ation so that it would approximately fit a normal curve. The
normalized raw scores were converted to T-scores, which
have a mean of 50 and a standard deviation of 10. Because
the Piers-Harris 2 uses normalized T-scores, every scale has
an approximately normal distribution of standard scores.
Thus, the frequency of cases in the standardization sample
falling above or below a given T-score is nearly identical for
every scale. This simplifies intra-individual comparisons
among the domain scales of the Piers-Harris 2. For exam-
ple, if a child achieves normalized T-scores of 60 on the
BEH and POP domain scales, the examiner knows that in
those areas of self-concept, the child has scored higher than
approximately 84% of the children in the standardization
sample. The use of normalized T-scores also enables the ex-
aminer to compare a particular child’s test results with those
of the children in the standardization sample.
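The normalization procedure described above (a raw score's percentile rank mapped through the inverse normal curve onto the T metric) can be illustrated with a generic sketch. This is not the actual WPS norming program, and the norm sample shown is arbitrary.

```python
from statistics import NormalDist

def normalized_t(raw, norm_sample):
    """Convert a raw score to a normalized T-score (M = 50, SD = 10) via its
    percentile rank in the norm sample. The midpoint convention keeps the
    percentile strictly between 0 and 1 for any score present in the sample."""
    below = sum(1 for x in norm_sample if x < raw)
    equal = sum(1 for x in norm_sample if x == raw)
    pct = (below + 0.5 * equal) / len(norm_sample)
    z = NormalDist().inv_cdf(pct)   # percentile -> z on the normal curve
    return 50 + 10 * z

# A normalized T-score of 60 (z = 1.0) corresponds to roughly the 84th
# percentile, matching the BEH/POP example in the text:
print(round(NormalDist().cdf(1.0) * 100))  # 84
```

Because every scale receives this transformation, a given T-score cuts off nearly the same proportion of the standardization sample on every scale, which is what makes intra-individual scale comparisons straightforward.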
Moderator Variables
Moderator variables are respondent characteristics, such
as age, sex, or ethnicity, that may affect scores on a psycho-
logical test independently of the construct that the test is
attempting to measure. When groups that differ on moderator
variables also perform differently on a test, it can be difficult
to interpret test results without separate norms for each group.
This section discusses moderator variables that may impact
Piers-Harris 2 results. The section also presents statistical
comparisons that determine whether there are any meaning-
ful differences between groups defined by the moderator
variables in the standardization sample. One of the advantages
of the Piers-Harris 2 standardization sample, as compared
with the original Piers-Harris standardization sample (Piers,
1984), is its demographic representativeness. Whereas the
original sample was recruited from a single school district
in Pennsylvania, the new sample consists of an ethnically
diverse group of children from all regions of the United
States. This allows statistical analysis of the potential
moderating effects of ethnicity on Piers-Harris 2 scores.
With a sample as large as the Piers-Harris 2 standard-
ization sample, it is necessary to set clear guidelines as to
what constitutes a practically meaningful difference between
average group scores. Because the power to detect group
differences is so high in large sample analyses, differences
as small as 1 or 2 T-score points are often statistically signif-
icant. From a practical perspective, however, a 1- or 2-point
change in a test score yields virtually no new information
about the individual who took the test. In addition, such
small differences are usually within the standard error of
measurement of the test, meaning that the difference is not
likely to be reliable across administrations of the test.
An effect size metric can be used to evaluate whether
statistically significant differences between large groups are
also practically meaningful (Cohen, 1992). The relevant
statistic is calculated by dividing the mean difference between
groups on a score by the standard deviation of that score for
the two groups combined. The use of T-scores simplifies
these calculations. Because the pooled standard deviation al-
ways approximates 10 for any T-score comparisons, 1 effect-size
unit equals about 10 T-score points. Effect sizes of 1 to 3
T-score points are considered small and not clinically
meaningful; effect sizes between 3 and 5 T-score points are
considered moderate; and those greater than 5 T-score points
are considered large.
Moderator effects for Piers-Harris 2 scale scores were
evaluated by calculating average T-scores for groups defined
by the moderator variables (e.g., males and females, ethnic
groups). These average group scores were then compared to
the average T-score for the entire standardization sample
(which is 50 by definition). For each moderator, the follow-
ing decision rule was applied: If the deviation of a particular
moderator group from the overall standardization sample
was equivalent to a small effect size, the moderator was con-
sidered meaningless in practical applications. In these cases,
stratification of the normative data was not required. If, on
the other hand, the deviation constituted a moderate to large
effect size, the pattern of T-score differences among the
moderator groups was examined more closely to determine
whether it was consistent with other known characteristics
of the groups under consideration. If robust, replicable pat-
terns of group differences were present, the use of separate
norms for each group was considered.
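The decision rule can be expressed compactly in terms of a group's deviation from the overall mean of 50. The handling of the boundary values (exactly 3 or 5 T-score points) is one reasonable reading; the text does not specify the boundaries precisely.

```python
# Decision rule used for the moderator analyses: deviation of a group's
# mean T-score from the overall mean of 50, in T-score points
# (~ 10 * Cohen's d, since the pooled SD of T-scores is about 10).

def moderator_effect(group_mean_t):
    dev = abs(group_mean_t - 50)
    if dev <= 3:
        return "small: no stratified norms needed"
    if dev <= 5:
        return "moderate: examine the pattern of group differences"
    return "large: consider separate norms if the pattern is replicable"

print(moderator_effect(51.6))  # e.g., males' FRE mean from Table 13 -> small
```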
Age. Many psychological theories predict substantial
changes in self-concept between certain phases of develop-
ment (e.g., between adolescence and adulthood). Certain
measures of self-concept, especially those designed for both
adolescents and adults, have employed age-stratified norms
(e.g., Tennessee Self-Concept Scale, Second Edition; Fitts
& Warren, 1996). However, empirical research has not sup-
ported an association between age and self-concept scores
for individuals between the ages of 8 and 23 (see Wylie,
1979, for a review). Table 12 presents average T-scores for
age groupings in the Piers-Harris 2 standardization sample.
Out of all the possible comparisons, only one domain scale
score (INT) differed from the sample mean by more than 3
T-score points, and this occurred only for a single age group
(7- to 8-year-olds). There was no evidence of a consistent
pattern of clinically significant age-related differences in
Piers-Harris 2 scores. Thus, there appeared to be no need for
age-stratified norms.
Sex. Generally speaking, investigators have failed to
find significant sex differences in self-concept (Hattie, 1992;
Wylie, 1979). Early studies using the Piers-Harris as a mea-
sure of general self-concept (e.g., Piers & Harris, 1964) were
consistent with this notion.
However, more recent studies have suggested sex dif-
ferences in specific aspects of self-concept. In particular,
males tend to report less anxiety and more problematic
behaviors than females. Lewis and Knight (2000) examined
moderator variables for the original Piers-Harris in 368 intellectually
gifted children in Grades 4 to 12. Males and females did not
differ on the Total score, but there were sex differences on
three cluster scales. Females scored significantly higher than
males on the Behavior and Intellectual and School Status
scales, whereas the reverse pattern occurred on the Anxiety
scale. Similar sex differences on the Anxiety scale were re-
ported by Lord (1971) and Osborne and LeGette (1982).
Furthermore, in the normative sample for the cluster scales
described in the 1984 edition of the Piers-Harris Manual,
males rated themselves significantly lower in self-concept on
the Behavior scale than females. Additional studies related to
sex differences are listed by topic in Appendix A.
Average T-scores for males and females in the Piers-
Harris 2 standardization sample are provided in Table 13.
There were slight sex differences on the BEH and FRE
scales, consistent with the previously cited literature. How-
ever, the effect sizes were not large enough to constitute a
practically meaningful difference. These findings do not
support stratification of the Piers-Harris 2 norms by sex.
Ethnicity. The original Piers-Harris norms were based
on an ethnically homogenous sample of public-school chil-
dren from one Pennsylvania school district (Piers, 1984).
Because it was not possible to examine ethnic differences in
self-concept using these data, the 1984 edition of the Piers-
Harris Manual relied on an extensive review of the litera-
ture documenting comparisons among various ethnic groups
on Piers-Harris scores. Those studies, along with more re-
cent literature relating to this issue, are listed by topic in
Appendix A.
Certain themes emerge from this literature. Most im-
portantly, ethnicity per se does not appear to be a significant
determinant of self-concept. Rather, children in certain ethnic
or cultural groups may be at higher risk for the kinds of on-
going stressors that can impact self-esteem over the long run.
These experiences may include racial discrimination, aca-
demic failure due to language deficits or other problems, and
general difficulties adjusting to the mainstream culture. On
the other hand, adequate social support seems to insulate chil-
dren from at least some of these potentially harmful factors.
Cultural differences in response style must also be
taken into account. In some cultures, saying positive things
about oneself is frowned upon because it is considered
boastful. Also, individuals from certain cultures may be less
inclined to provide socially desirable responses on psycho-
logical tests. Either of these factors could cause spuriously
low Piers-Harris 2 scores. Alternatively, spuriously high
scores may be produced by individuals from groups that tend
to be more defensive or reluctant to disclose negative
emotions.
Table 12
Average T-Scores by Age Group in the Piers-Harris 2 Standardization Sample
                                                                  Age
                                             7–8        9–10       11–12      13–14      15–16      17–18
Self-Concept scale                           (n = 188)  (n = 231)  (n = 277)  (n = 271)  (n = 255)  (n = 165)
Total (TOT) 51.2 51.5 51.3 48.4 48.6 49.1
Behavioral Adjustment (BEH) 51.2 50.9 51.4 47.8 47.6 50.1
Intellectual and School Status (INT) 53.3 52.0 50.0 47.5 48.1 49.2
Physical Appearance and Attributes (PHY) 50.4 50.2 50.2 48.4 50.6 49.5
Freedom From Anxiety (FRE) 49.9 50.6 50.7 50.4 48.2 48.8
Popularity (POP) 47.7 50.3 51.2 50.5 50.0 48.6
Happiness and Satisfaction (HAP) 50.7 51.0 50.7 47.9 48.5 48.8
Note. N = 1,387.
Table 13
Average T-Scores by Sex in the
Piers-Harris 2 Standardization Sample
Sex
Male Female
Self-Concept scale (n = 689) (n = 698)
Total (TOT) 49.6 50.4
Behavioral Adjustment (BEH) 48.2 51.3
Intellectual and School Status (INT) 48.7 51.0
Physical Appearance and Attributes (PHY) 49.6 50.1
Freedom From Anxiety (FRE) 51.6 48.1
Popularity (POP) 49.6 50.2
Happiness and Satisfaction (HAP) 49.1 50.0
Note. N = 1,387.
The preceding remarks can be viewed as a caveat to
consideration of ethnic differences in the Piers-Harris 2
norms. Unlike the 1984 manual’s norms, the new norms are
based on an ethnically diverse sample, and so it is possible
to investigate directly the moderating effects of ethnicity.
The Piers-Harris 2 standardization sample included suffi-
ciently large numbers of Black, Hispanic, and White partic-
ipants to permit evaluation for meaningful ethnic
differences. Table 14 presents average T-scores for these
groups. None of the individual scale scores differed by more
than 3 T-score points from the grand mean for the standard-
ization sample. These small effect sizes suggest that Piers-
Harris 2 scores can be interpreted without the need for
separate norms for these groups.
Table 14 also presents average T-scores for Asians,
Native Americans, and those who reported their ethnicity as
“Other.” Because of the small cell sizes, these results are not
reliable enough to aid interpretation of Piers-Harris 2 scores.
Nevertheless, the findings may be of interest to those plan-
ning research on self-concept with the Asian or Native
American populations.
Socioeconomic status. Generally speaking, the stud-
ies that have examined socioeconomic status (SES) and self-
concept have not found significant relationships between
these constructs (Hattie, 1992; Wylie, 1979). There are a few
exceptions to this trend. For example, Osborne and LeGette
(1982) studied a sample of 374 middle school students and
found that lower SES was associated with lower self-con-
cept scores. As noted with respect to ethnicity, the relation-
ships between SES and self-concept are likely to be
complex, in that SES may be a marker for other variables,
such as level of stress within the family, that affect self-
concept more directly.
As noted in Table 6, SES data are available for about
half of the Piers-Harris 2 standardization sample, but there is
good reason to believe that these data provide a good ap-
proximation of the SES distribution in the entire sample.
Table 15 shows average T-scores for groups defined by the
Table 14
Average T-Scores by Race/Ethnicity in the Piers-Harris 2 Standardization Sample
                                                            Race/Ethnic group
                                                                                          Native
                                             Black      Hispanic   White      Asian      American   Other
Self-Concept scale                           (n = 255)  (n = 102)  (n = 943)  (n = 17)   (n = 16)   (n = 50)
Total (TOT) 48.7 47.4 50.8 48.0 50.9 46.8
Behavioral Adjustment (BEH) 48.4 47.7 50.5 48.7 50.6 47.1
Intellectual and School Status (INT) 48.9 47.6 50.6 47.9 50.4 46.8
Physical Appearance and Attributes (PHY) 50.7 48.8 49.8 48.8 49.2 49.0
Freedom From Anxiety (FRE) 48.1 47.5 50.8 46.6 51.1 47.0
Popularity (POP) 48.9 47.9 50.6 49.0 48.2 47.6
Happiness and Satisfaction (HAP) 48.0 48.4 50.2 51.4 51.2 47.5
Note. N = 1,383. Due to the small cell sizes for the Asian, Native American, and Other samples, the results for those groups are not reliable enough to
aid in interpreting the Piers-Harris 2 score.
Table 15
Average T-Scores by Head of Household’s Education Level in the Piers-Harris 2 Standardization Sample
                                                 Head of household’s education level
                                             Not HS     HS         Some       College    Post-
                                             graduate   graduate   college    graduate   graduate
Self-Concept scale                           (n = 73)   (n = 182)  (n = 100)  (n = 157)  (n = 161)
Total (TOT) 47.0 49.8 50.8 51.9 50.3
Behavioral Adjustment (BEH) 46.6 48.2 49.6 51.3 50.0
Intellectual and School Status (INT) 47.2 49.2 50.8 51.5 50.1
Physical Appearance and Attributes (PHY) 48.5 50.7 50.8 50.8 50.9
Freedom From Anxiety (FRE) 47.2 50.5 49.1 51.6 50.1
Popularity (POP) 48.1 50.8 51.4 51.2 50.5
Happiness and Satisfaction (HAP) 47.3 49.6 49.8 51.3 49.9
Note. N = 673. See text for discussion of missing data.
reported education level of the head of household. There is a
trend toward lower self-concept scores in the group that is
lowest on this SES index (the head of the household did not
graduate from high school), which is consistent with Osborne
and LeGette’s (1982) findings. However, the differences be-
tween this group’s scores and the grand means for the entire
standardization sample remain in the realm of small effect
sizes (two scales, BEH and TOT, differ by slightly more than
3 T-score points from the grand mean). Considering all of
the SES groups, there is no consistent pattern of clinically
meaningful score differences associated with increasing
SES. These data do not indicate a need for Piers-Harris 2
norms that are stratified by SES.
U.S. geographic region. Although there is no theoreti-
cal reason to expect regional differences in Piers-Harris
scores, the possibility of such differences was explored in the
current data set. Table 16 presents the average T-scores by
U.S. geographic region for the Piers-Harris 2 standardization
sample. As with the other moderators, there is no consistent
pattern of clinically meaningful differences among regions.
Intelligence and academic achievement. Numerous
investigators have examined the relationships among intelli-
gence, academic achievement, and self-concept (see Ap-
pendix A). Generally speaking, researchers have found
moderate positive correlations between measures of achieve-
ment and self-concept scores. Furthermore, as hypothesized
by Shavelson et al. (1976), the relationship between these two
constructs appears to be due to a specific academic compo-
nent of self-concept, rather than to generalized self-concept.
For example, two studies of elementary-school and middle-
school students demonstrated that specific measures of aca-
demic self-concept were stronger predictors of achievement
than the Piers-Harris Total score (Lyon & MacDonald, 1990;
Schike & Fagan, 1994). In contrast to achievement, re-
searchers have typically found only weak associations be-
tween intelligence test scores and measures of self-concept
(e.g., Black, 1974; McIntire & Drummond, 1977).
With regard to the Piers-Harris 2 normative sample,
the literature suggests that children who perform well on
academic tasks are likely to have higher than average scores
on the INT cluster scale, but not on the other domain scales
or the Total score. However, this must remain a tentative
hypothesis. Academic achievement data are not available for
the Piers-Harris 2 standardization sample, so it is not possi-
ble to study empirically the effects of this moderator.
Summary. The Piers-Harris 2 standardization sample
has been examined for group differences related to the
potential moderating variables of age, sex, ethnicity, socio-
economic status (as indexed by education level of head of
household), and U.S. geographic region. Where effects of
these moderators were present, they were small and applied
only to one or two domain scales. None of the analyses
identified clinically meaningful patterns of differences that
were consistent with other knowledge about the groups in
question. Consequently, it was determined that one set of
nonstratified normative data could be used for interpreting
Piers-Harris 2 scores.
Nevertheless, the relationship among moderators is
complex and deserves additional study. Although the current
analyses do not support stratified norms for the Piers-Harris
2, several prior studies have found group differences on each
of the moderators described in the previous section. These
studies vary greatly in their methodological quality. The au-
thors of the Piers-Harris 2 recognize that particular clinical
and research applications of this measure may raise con-
cerns about the suitability of the nonstratified norms pre-
sented in this manual. For example, special care should be
taken in interpreting Piers-Harris 2 results for children with
mental retardation, children with severe psychiatric dis-
orders, and children from ethnic groups not well represented
in the standardization sample. It is suggested that in these
and similar situations users familiarize themselves with the
appropriate research literature to determine what caveats
should be considered in using the Piers-Harris 2 norms. To
facilitate this process, Appendix B of this manual includes a
brief review of research relating to the use of the Piers-
Harris for children with special needs. Other studies involv-
ing special populations are listed by topic in Appendix A.
Chapter 4 Development and Restandardization 47
Table 16
Average T-Scores by U.S. Geographic Region in the Piers-Harris 2 Standardization Sample
U.S. geographic region
Northeast Midwest South West
Self-Concept scale (n = 316) (n = 424) (n = 463) (n = 184)
Total (TOT) 50.8 50.8 49.4 48.2
Behavioral Adjustment (BEH) 50.2 50.9 48.3 49.9
Intellectual and School Status (INT) 50.5 51.3 48.8 47.9
Physical Appearance and Attributes (PHY) 51.0 49.9 50.3 46.8
Freedom From Anxiety (FRE) 49.9 50.4 49.9 48.4
Popularity (POP) 50.4 49.4 50.7 48.4
Happiness and Satisfaction (HAP) 50.4 50.2 48.6 49.2
Note. N = 1,387.
During the nearly four decades since the introduction
of the Piers-Harris, numerous investigators have studied the
technical characteristics of the instrument. This chapter
reviews selected studies from that literature and presents new
reliability and validity evidence from the Piers-Harris 2
standardization sample. As demonstrated in the previous
chapter, the Piers-Harris 2 is essentially identical to the origi-
nal measure from a psychometric perspective. Researchers
may therefore proceed to use the Piers-Harris 2 with the
confidence that its reliability and validity are strongly upheld
by the extensive database pertaining to the original scale.
Reliability
Reliability concerns the stability of scores on a psycho-
logical test. A reliable test should produce consistent scores
for the same individual when he or she takes the test on
different occasions and under varying conditions of exam-
ination. A reliable test should also yield scores that are
relatively free of measurement error, or variance in a score
due to chance factors rather than true variance in the psycho-
logical construct being assessed. Reliability estimates are
expressed in terms of correlation coefficients that vary from
0 to 1, with higher figures indicating greater reliability.
Reliability is considered the most basic psychometric prop-
erty of a test, because the discussion of whether a test mea-
sures what it is supposed to measure cannot even begin until
the test’s reliability has been established. In other words, re-
liability is a necessary, but not a sufficient, condition for
validity. This section will consider two aspects of reliability:
internal consistency and test-retest reliability. The section
will describe new reliability data collected for the Piers-
Harris 2 revision, and will also review the extensive litera-
ture of reliability studies on the original Piers-Harris.
Internal Consistency of the Original and
Revised Instruments
The reliability of a psychological test is determined in
large part by how well the test items sample the content do-
main being assessed. A set of test items that performs well in
this regard has the quality of internal consistency. In an
internally consistent test, items tend to be highly inter-
correlated, presumably because they are measuring the same
construct. The primary indexes of internal consistency are
coefficient alpha (Cronbach, 1951), or, if the items are
dichotomous, Kuder-Richardson Formula 20 (KR-20; Kuder
& Richardson, 1937). These statistics reflect the average
intercorrelation among items in a test and, under standard
assumptions, provide a lower-bound estimate of a test's reliability. Another
method for assessing the consistency of content sampling
in a test is split-half reliability, in which the test is split
into equivalent halves for each individual, and the relevant
statistic is the Pearson correlation between the two halves.
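For readers who wish to compute these indexes on item-level data of their own, the statistics described above can be sketched in a few lines of Python. The item matrix below is simulated for illustration only; it is not Piers-Harris data.

```python
import numpy as np

def kr20(items):
    """Kuder-Richardson Formula 20 for a respondents-by-items 0/1 matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    p = items.mean(axis=0)                     # proportion passing each item
    q = 1.0 - p
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

def split_half(items):
    """Pearson correlation between odd-item and even-item half scores."""
    items = np.asarray(items, dtype=float)
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    return np.corrcoef(odd, even)[0, 1]

# Illustrative data: 200 simulated respondents, 20 dichotomous items whose
# pass/fail status depends on a shared latent trait, so items intercorrelate.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
data = (trait + rng.normal(size=(200, 20)) > 0).astype(int)

print(round(kr20(data), 2), round(split_half(data), 2))
```

Note that the split-half value shown here is the raw half-to-half correlation; in practice it is usually adjusted upward with the Spearman-Brown formula to estimate full-length reliability.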
Piers-Harris 2. Table 17 presents internal consistency
estimates for the six domain scales and Total (TOT) score
of the Piers-Harris 2. Alphas are reported for the entire
standardization sample and for six age strata. These figures
demonstrate good internal consistency and are comparable
to the values reported for the original Piers-Harris. The age-
stratified values are presented to address concerns regarding
the reliability of self-concept scores with younger children.
These analyses show that the TOT scale and five of the six
domain scales maintain good internal consistency through-
out the six age ranges. The Popularity (POP) scale demon-
strates weaker internal consistency for the youngest children
(7- and 8-year-olds; alpha = .60) and also, unexpectedly, for
the oldest adolescents (17- and 18-year-olds; alpha = .62).
These findings suggest that the POP scale should be inter-
preted cautiously for children in these age ranges.
The standard error of measurement (SEM) statistic
translates the alpha coefficient into practical terms by provid-
ing an index of how close an individual test score is likely to
be to the “true” score that would be obtained if there were no
measurement error. (The formula for calculating SEM for a
given scale is SEM = SD × √(1 − r), where SD is the standard deviation and
r is the reliability coefficient for that scale.) SEM values for
the Piers-Harris 2 scales are listed in Table 17.
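As a worked illustration of that formula, the following uses the T-score metric (SD = 10) and a hypothetical reliability of .91; these are convenient round numbers, not a reproduction of the Table 17 values.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)

# T scores (mean 50, SD 10) with a reliability of .91:
value = sem(10, 0.91)
print(round(value, 1))

# A score of 44 would carry an approximately 68% confidence band of
# one SEM on either side of the observed score:
band = (44 - value, 44 + value)
```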
Original Piers-Harris. Table 18 summarizes several
studies that reported alpha, KR-20, or split-half coefficients
for the 80-item Total score of the original Piers-Harris.
These values, which approach or exceed .90, demonstrate
that the original scale has excellent internal consistency for
both younger and older children.
5
TECHNICAL PROPERTIES
49
Alpha coefficients for the original six cluster scales
are available from several samples. Piers (1984) reported
alpha coefficients from two samples. The first was a combi-
nation of the cluster scale standardization sample (Piers,
1984), which consisted of 485 4th through 10th graders, and
97 additional children referred from psychiatric clinics. In
this sample, cluster scale alpha coefficients ranged from .73
(Happiness and Satisfaction) to .81 (Behavior), with a mean
of .76. The second sample was the 1996 WPS TEST RE-
PORT™ sample, a regionally and ethnically diverse pool of
1,772 individuals whose Piers-Harris responses were scored
by the WPS TEST REPORT™ service. In this sample, clus-
ter scale alpha coefficients ranged from .76 (Happiness and
Satisfaction) to .83 (Physical Appearance and Attributes),
with a mean of .80. These reliability figures indicate that the
cluster scales of the original Piers-Harris have good internal
consistency.
Hattie (1992) reported cluster scale alpha coefficients
for a sample of 135 Australian students in Grades 10 through
12. Alphas ranged from .70 (Happiness and Satisfaction) to
.82 (Physical Appearance and Attributes), with a mean of
.75. Hattie described similar cluster scale alphas for another
sample of 367 children, with the exception that the internal
consistency of the Happiness and Satisfaction scale was
somewhat low (alpha = .64). This scale also had the lowest
alpha coefficient in the three samples described in the previ-
ous paragraph. This finding suggests that the Happiness and
Satisfaction scale may be more multidimensional than the
other Piers-Harris scales, a notion that is supported by sev-
eral factor analyses described in the next section.
Test-Retest Reliability of the Original Piers-Harris
Test-retest reliability measures the extent to which
scores for a single individual are consistent over time and
50 Technical Guide
Table 17
Piers-Harris 2 Internal Consistency Estimates

                                                         Entire standardization samplea    Age group (alpha)
Self-Concept scale                        No. of items     Alpha     SEM      7-8b    9-10c    11-12d    13-14e    15-16f    17-18g
Total (TOT)                                     60          .91      3.07     .89     .92      .92       .91       .93       .89
Behavioral Adjustment (BEH)                     14          .81      1.27     .75     .84      .81       .81       .81       .76
Intellectual and School Status (INT)            16          .81      1.50     .76     .82      .81       .82       .82       .72
Physical Appearance and Attributes (PHY)        11          .75      1.29     .72     .75      .80       .77       .73       .65
Freedom From Anxiety (FRE)                      14          .81      1.46     .77     .82      .82       .82       .84       .80
Popularity (POP)                                12          .74      1.37     .60     .72      .80       .79       .78       .62
Happiness and Satisfaction (HAP)                10          .77      1.05     .71     .82      .78       .77       .78       .71
Note. a N = 1,387. b n = 188. c n = 231. d n = 277. e n = 271. f n = 255. g n = 165.
Table 18
Studies Reporting Internal Consistency Coefficients for Piers-Harris Total Score
Study Sample Age or grade Sex n Index r
Center and Ward (1986) Nonclinical Australian Grade 2 Both 183 Split-half .89
Grade 3 Both 104 Split-half .91
Grades 4, 6, 9 Both 114 Split-half .91
Cooley and Ayres (1988) Mixed nonclinical/special education Grades 6–8 Both 155 Alpha .92
Franklin, Duley, et al. (1981) Nonclinical Grades 4–6 Both 180 Alpha .92
Lefley (1974) Nonclinical Native American 7–14 years Both 53 Split-half .91
Piers (1973) Nonclinical Grade 6 Female 70 KR-20 .88
Male 76 KR-20 .90
Grade 10 Female 84 KR-20 .88
Male 67 KR-20 .93
Smith and Rogers (1978) Learning disabled 6–12 years Both 206 Alpha .89
Winne, Marx, and Taylor (1977) Nonclinical Grades 3–6 Female 42 Alpha .90
Male 61 Alpha .90
WPS TEST REPORT™ (1996)a
Mixed nonclinical/clinic referred 7–19 years Both 1,772 Alpha .93
Yonker, Blixt, and Dinero (1974) Nonclinical Grade 10 Both 208 Alpha .90
a This sample is a regionally and ethnically diverse pool of individuals from clinical and nonclinical samples whose tests were scored by the WPS TEST REPORT™ service.
across settings. Personality measures such as the Piers-
Harris are usually assumed to measure relatively enduring
characteristics of individuals, and so are expected to produce
scores that remain stable across time. However, self-concept
may be less stable among younger children, whose sense of
self is still developing (Harter, 1983). Thus, low test-retest
reliability in younger children may be partially due to the in-
stability of the underlying construct, rather than measure-
ment error per se.
Test-retest reliability data are not available for the Piers-
Harris 2 revision. However, a number of studies have investi-
gated the test-retest reliability of the original Piers-Harris
Total score, in both normal children and special populations.
Most of the test-retest reliability studies of the original Piers-
Harris were completed in the 1960s, 1970s, and early 1980s.
The relevant studies are summarized in Table 19. When re-
viewing these studies, it is important to note that more hetero-
geneous samples are expected to yield higher reliability
coefficients, due to greater variance of scores. If for any rea-
son a small standard deviation is obtained in a given sample,
the test-retest coefficient is expected to be lower. Furthermore,
it is not surprising that shorter test-retest intervals are general-
ly associated with higher reliability estimates, because there is
presumably less chance that environmental or developmental
changes affect children’s self-concepts during these shorter
intervals. In fact, studies with retest intervals of 6 months or
longer are probably best conceptualized as measuring the sta-
bility of the construct of self-concept over time, rather than
test-retest reliability per se. Table 19 is organized to reflect
this distinction.
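The point about sample heterogeneity can be demonstrated with a small simulation (invented numbers, not Piers-Harris data): identical measurement error yields a lower test-retest correlation when the sample's true-score variance is restricted.

```python
import numpy as np

rng = np.random.default_rng(42)

def retest_r(true_sd, error_sd=5.0, n=5000):
    """Correlation between two error-laden measurements of the same true scores."""
    true = rng.normal(50, true_sd, size=n)
    time1 = true + rng.normal(0, error_sd, size=n)
    time2 = true + rng.normal(0, error_sd, size=n)
    return np.corrcoef(time1, time2)[0, 1]

# Heterogeneous sample (true SD = 10) vs. restricted sample (true SD = 4),
# with the same measurement error in both cases. The expected correlation
# is true variance divided by total variance: 100/125 = .80 vs. 16/41 = .39.
print(round(retest_r(10.0), 2))
print(round(retest_r(4.0), 2))
```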
Test-retest reliability in general samples. An early
study by Piers and Harris (1964) investigated the stability of
the Piers-Harris in the original standardization sample, using
a 95-item experimental version of the scale and a retest in-
terval of 4 months. Reliability coefficients from 3rd, 6th,
and 10th graders were .72, .71, and .72, respectively. These
reliability estimates were deemed satisfactory by the au-
thors, especially given the relatively long retest interval and
the fact that the scale was still in the development stage. The
80-item Piers-Harris, though shorter than the experimental
version, was shown to have marginally better stability for
both 2-month (r = .77) and 4-month (r = .77) retest intervals
(Wing, 1966). Additional studies of nonclinical students re-
port reliability coefficients ranging from .65 to .81 over 2- to
5-month retest intervals (McLaughlin, 1970; Platten &
Williams, 1979, 1981; Shavelson & Bolus, 1982).
Hattie (1992) reported a test-retest study for the Piers-
Harris Total score and the six cluster scales. The study used
a sample of 135 Australian students in Grades 10 through
12. The test-retest interval was 4 weeks. The reliability co-
efficients were as follows: Total, .87; Behavior, .80; Intel-
lectual and School Status, .84; Physical Appearance and
Attributes, .88; Anxiety, .80; Popularity, .80; Happiness and
Satisfaction, .65.
Table 19
Studies Reporting on Test-Retest Reliability and Construct Stability for Piers-Harris Total Score
Retest
Study Sample Age or grade Sex n interval r
Test-retest reliabilitya
Alban Metcalfe (1981) British 11–20 years Both 182 2 weeks .69
Lefley (1974) American Indian 7–14 years Both 40 10 weeks .73
McLaughlin (1970) Private school Grade 5 Male 67 5 months .75
Grade 6 Male 98 5 months .72
Grade 7 Male 69 5 months .71
Piers and Harris (1964)b
Public school Grade 3 Both 56 4 months .72
Grade 6 Both 66 4 months .71
Grade 10 Both 60 4 months .72
Platten and Williams (1979) Mixed ethnic groups Grades 4–6 Both 159 10 weeks .65
Platten and Williams (1981) Both 173 10 weeks .75
Querry (1970) Normal speech Grades 3–4 Both 10 3–4 weeks .86
Mild articulation disorders Both 10 3–4 weeks .96
Moderate articulation disorders Both 10 3–4 weeks .83
Shavelson and Bolus (1982) Public school Grades 7–8 Both 99 5 months .81
Tavormina (1975) Chronic medical illness M = 12 years Both 94 3–4 weeks .80
Wing (1966) Public school Grade 5 Both 244 2 months .77
4 months .77
Construct stabilityc
Henggeler and Tavormina (1979) Mexican American M = 10.5 years Both 12 1 year .51
Smith and Rogers (1977) Learning disability 6–12 years Both 89 6 months .62
Wolf (1981) Mental retardation/emotional disturbance 11–16 years Both 39 8 months .42
a Retest intervals of less than 6 months. b Based on 95-item experimental version of the Piers-Harris. c Retest intervals of 6 months or more.
Test-retest reliability in special populations. Many
studies have reported test-retest reliability estimates for
samples of minority and special-needs students. These stud-
ies have differed in terms of the sample characteristics (e.g.,
age, sex, and group membership) and retest intervals. Some
studies have even modified the administration procedures or
the instrument itself. Thus, it is not unexpected that the reli-
ability coefficients for these studies vary considerably.
Minority students. The stability of the Piers-Harris in
populations of diverse ethnic or national backgrounds can be
ascertained from three studies. Lefley (1974) found a reliabil-
ity coefficient of .73 for a sample of Native American students
tested over a 10-week interval. Henggeler and Tavormina
(1979) obtained 1-year retest data from 12 Mexican American
migrant children. The reported reliability coefficient of .51 is
best understood as representing construct stability, as opposed
to test-retest reliability. Viewed in this light, the coefficient is
actually rather high, given the long retest interval and small
sample size. Finally, Alban Metcalfe (1981) reported a
2-week test-retest coefficient of .69 in a sample of students
from Northern England. This study used a modified Piers-
Harris, with “Americanisms” removed and the response for-
mat altered to a 5-point Likert-type scale.
Students with special needs. Studies of students with
special needs have included investigations of children with
mental retardation, children with learning and speech dis-
abilities, and children with chronic medical illness. Querry
(1970) compared the self-concepts of 3rd and 4th graders
with normal speech, mild articulation disorders, and moder-
ate articulation disorders. The reported test-retest reliability
coefficients were .86, .96, and .83, respectively, which is
notable considering the small sample size in each group
(n = 10). Tavormina (1975) obtained a stability coefficient
of .80 with a sample of children with chronic illnesses. These
latter two studies were based on relatively short (3 to 4 week)
retest intervals. In a study of students with learning disabili-
ties (ages 6 to 12), Smith and Rogers (1977) reported a reli-
ability coefficient of .62 over a retest interval of 6 months. In
a study of 39 students identified as having mental retardation
or emotional disturbances, Wolf (1981) reported a stability
coefficient of .42 over an 8-month interval. Again, this study
is probably best construed as an example of construct stabil-
ity rather than test-retest reliability. The long retest interval,
small sample size, and relatively low variance of scores
(compared with the Piers-Harris normative sample) proba-
bly all contributed to the low coefficient in this study.
Unreliability due to random responding. Wylie
(1974) has suggested that low scores on self-concept tests
may be less reliable than higher scores, particularly when the
low scores are generated by younger children. With younger
children, it is possible that inability to read or understand the
items could lead to random responding. A randomly an-
swered questionnaire tends to produce a score at or near the
scale’s midpoint. This is likely to result in a low self-concept
score, in normative terms, simply because most self-concept
scales produce negatively skewed distributions, with scores
accumulating disproportionately above the scale’s midpoint.
Random response patterns need to be considered when
interpreting the results of test-retest reliability studies.
Random responding tends to result in inconsistent item re-
sponses between two testing occasions, which is a hallmark
of poor reliability. However, children who respond randomly
are likely to have low self-concept scores on both occasions,
for the reasons described previously. This creates a situation
that in fact is emblematic of low test-retest reliability, but
nevertheless generates a spuriously high test-retest correla-
tion (because both scores are in the low range). This situation
might be interpreted as reflecting a substantial relationship
between the two testing occasions when actually the only
common element was the randomness of responding.
These speculations raise two empirical questions: (a)
Do Piers-Harris scores below the mean exhibit more item in-
stability across testing occasions than scores above the mean?
and (b) if this differential instability exists, is it due to random
responding? The Smith and Rogers (1977) study provides
data relevant to these questions. Smith and Rogers adminis-
tered the Piers-Harris to 89 children (aged 6 to 12) with sig-
nificant learning deficiencies. Extra care was taken to ensure
that the children understood the items. The sample was split
into three groups based on initial Piers-Harris Total score
(high, middle, or low). The groups did not differ significantly
in age or IQ. The sample was given the Piers-Harris again 6
months later. An index of instability was determined for each
child by calculating the number of item inconsistencies from
the first testing to the second testing. Results indicated that
the index of instability for the high self-concept group was
significantly less than for the middle or low groups, but that
the latter two groups did not differ significantly.
Because those with low and middle scores did not dif-
fer in item-response stability, Smith and Rogers (1977) con-
cluded that random responding alone could not account for
low test-retest reliability among low scorers. If random re-
sponding were the cause of low reliability, then the middle-
scoring group, which would have included fewer random
response sets, would have shown less item instability than
the low-scoring group. As an alternative explanation for
their findings, Smith and Rogers proposed that for younger
children, the point in time at which self-concept stabilizes is
a function of the favorableness of self-concept. Under this
view, younger children with high self-regard demonstrate
stability of self-concept earlier than do same-age children
with low self-regard. Once self-concept has become rela-
tively stable (as in older children and adults), no differential
item instability for those with high, middle, and low self-
concept scores would be expected. According to this theory,
then, greater variability in low-scoring children should be
interpreted as reflecting uncertain and poorly defined self-
image rather than inadequate test-retest reliability. Clearly,
more research is needed to test these hypotheses.
Validity
Validity refers to a test’s ability to measure accurately
those psychological characteristics that it purports to measure.
Validity is a multidimensional concept that can be divided
into several types, each of which plays a different role in es-
tablishing the usefulness and accuracy of a test (Anastasi,
1988). Content validity addresses the question of whether
the test’s item content adequately samples the behavior that
is being measured. A second type of validity, construct va-
lidity, refers to how well the test performs in measuring a
theoretical psychological characteristic (e.g., introversion,
neuroticism). Finally, criterion validity involves how well
the test performs in predicting an individual’s performance
or status in other activities (e.g., school achievement, re-
sponse to psychiatric treatment).
Since the introduction of the original Piers-Harris in
the early 1960s, researchers have produced a large body of
evidence supporting the measure’s validity. The validation
process for the Piers-Harris 2 is based on this existing liter-
ature. In addition, new data concerning construct validity
have been collected as part of the standardization study for
the Piers-Harris 2.
This section begins by briefly examining the content
validity of the Piers-Harris 2, in terms of the evolution and
refinement of the original Piers-Harris item set. Content
validity is the least important aspect of the validity of the
Piers-Harris 2, simply because self-concept is by definition
a theoretical entity, and thus the validity of the Piers-Harris 2
is more appropriately determined by construct validation
methods. In addition, the Piers-Harris 2 is frequently used as
an outcome measure, which highlights the importance of es-
tablishing its criterion validity. The majority of this section
is therefore devoted to construct and criterion validity, de-
scribing first the recent studies contributing to the Piers-
Harris 2 revision, followed by a review of validity research
pertaining to the original measure.
Content Validity of the Original and
Revised Instruments
Piers-Harris 2. The Piers-Harris 2 contains 60 items,
20 fewer than the item set of the original measure. The ques-
tion arose as to whether this item reduction had any impact
on the content validity of the Piers-Harris 2. This issue con-
cerns only the TOT score, as the Piers-Harris 2 domain
scales are almost identical to their counterparts in the origi-
nal measure. To determine if the item reduction would af-
fect content validity, a clinical judge compared the deleted
items (see Appendix E) with the retained items (see Table
4). Sixteen of the deleted items were judged to have ade-
quate content overlap with the retained items, so it was con-
cluded that deleting these items would not result in an
overall loss of content coverage in the Piers-Harris 2. The
judge identified four deleted items that had relatively little
content overlap with the retained items. These four items
were: “I am good at making things with my hands,” “I can
draw well,” “I am good in music,” and “I sleep well at
night.” These items refer to specific abilities and attributes
rather than more general facets of self-concept (e.g., “I am a
good person”). Therefore, it was decided that the effect of
deleting them on the overall content sampling of the Piers-
Harris 2 would be relatively small, and that the content valid-
ity of the measure would not be threatened by going ahead
with the planned item reduction.
Original Piers-Harris. The items of the original
Piers-Harris were written with the goal of maximizing con-
tent validity. The universe of content to be sampled was de-
fined as qualities that children reported liking or disliking
about themselves (Jersild, 1952). Children’s statements
were grouped into the following categories: (a) physical
characteristics and appearance; (b) clothing and grooming;
(c) health and physical well-being; (d) home and family;
(e) enjoyment of recreation; (f) ability in sports and play;
(g) academic performance and attitudes toward school;
(h) intellectual abilities; (i) special talents (music, arts);
(j) “Just Me, Myself”; and (k) personality characteristics,
inner resources, and emotional tendencies. As detailed in
chapter 4, an initial pool of 164 pilot items was developed to
reflect all of these categories. This pool was reduced to 80
items by eliminating items with poor ability to discriminate
between high and low scores on the entire item set. The reduced
item set contained items from all of the original categories.
The original factor analysis (Piers, 1963) identified six
factors that became the cluster scales: Behavior, Intellectual
and School Status, Physical Appearance and Attributes,
Anxiety, Popularity, and Happiness and Satisfaction. These
factors collapsed several of Jersild’s (1952) categories and
emphasized items from the two most general categories
(“Just Me, Myself” and personality characteristics, inner re-
sources, and emotional tendencies). These general cate-
gories presumably are a better reflection of a child’s overall
self-concept than narrower categories (e.g., special talents
or enjoyment of recreation).
Construct Validity of the Piers-Harris 2
The Piers-Harris 2 standardization study provided two
kinds of evidence related to the construct validity of the re-
vised instrument. The study allowed detailed examination of
the instrument’s structural characteristics, which refers to
the intercorrelations and item composition of the Piers-Harris
2 domain scales. Determining the interrelatedness of the do-
main scales helps establish whether they can be viewed as
measuring separate components of overall self-concept. In
addition, concurrent data were collected on other psycho-
logical tests from subsamples of the Piers-Harris 2 standard-
ization sample. These data enable assessment of convergent
validity, or the extent to which the Piers-Harris 2 correlates
with measures of similar psychological constructs.
Structural characteristics. Interscale correlation
analysis and factor analysis are the two methods used to il-
luminate the structural characteristics of the Piers-Harris 2.
Interscale correlations. Table 20 presents interscale
correlation coefficients for the Piers-Harris 2 standardization
sample. Most of the scales exhibit correlations with each
other in the moderate to high moderate range. Interscale cor-
relations of this magnitude are to be expected for several rea-
sons. First of all, each scale shares items with at least two
other scales. The magnitude of the interscale correlation
coefficient partly reflects the number of shared items between
scales. For example, the FRE and HAP scales, which share
four items, correlate at r = .66. In contrast, BEH and POP,
which do not share any items, have a much weaker associa-
tion (r = .30). Furthermore, the theory underlying the Piers-
Harris specifies that a child’s general sense of self-worth
should influence his or her self-appraisals in specific areas
of functioning. This suggests that the factor-analytically de-
rived subscales of the test, which are intended to measure
distinct dimensions of self-concept, should nevertheless
share variance with one another.
Additional findings support the notion that the domain
scales represent separate but interrelated aspects of self-
concept. First, in all cases, the interscale correlations are
lower than the scale reliabilities (see Table 17). This indi-
cates that individual items are related more strongly to other
items on the same domain scale than to items on other scales
(with the exception of overlapping items). Second, as Table
20 shows, each domain scale correlates more strongly with
the TOT score than with any of the other content scales. This
demonstrates that each domain scale is a better index of gen-
eral self-concept than of the particular components of self-
concept measured by the other domain scales.
Factor analysis. An exploratory common factor anal-
ysis with oblimin rotation was conducted using the Piers-
Harris 2 standardization sample data. The common factor
approach was selected because it allows for sources of vari-
ance (e.g., measurement error) other than the extracted fac-
tors. Oblimin rotation was chosen because it assumes
correlated factors, which is a theoretically and empirically
reasonable assumption for this measure (see the discussion
in the previous section). Table 21 presents the factor load-
ings (item-factor correlations), with items organized by do-
main scale.
The factor analysis yielded six factors with eigenval-
ues greater than 1. The first factor was weighted with items
representing feelings of happiness and perceptions that one
is important and valued by others. The second factor reflect-
ed endorsement of troublesome behaviors at school and at
home. The third factor appeared to represent freedom from
anxiety, worry, and nervousness. The fourth factor reflected
perceptions of being good at schoolwork and fitting in well
at school. The fifth factor represented dissatisfaction with
one’s physical appearance and personal attributes. The sixth
factor reflected a perception that one has many friends,
makes new friends easily, and is well liked by others. It
should be noted that the first, third, fourth, and sixth factors
represented qualities that are positively correlated with self-
esteem, and the second and fifth factors represented qualities
that are negatively correlated with self-esteem. Thus, a child
with a high self-concept score would be expected to have
relatively high scores on Factors I, III, IV, and VI, and rela-
tively low scores on Factors II and V.
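The eigenvalue-greater-than-1 retention rule mentioned above can be sketched in a few lines of Python. This is only the retention step, not the full common factor analysis with oblimin rotation used for the Piers-Harris 2, and the data below are simulated for illustration.

```python
import numpy as np

def n_factors_kaiser(data):
    """Number of factors retained by the eigenvalue > 1 (Kaiser) criterion,
    applied to the item intercorrelation matrix."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)
    return int((eigenvalues > 1.0).sum())

# Illustrative data: 1,000 simulated respondents answering 12 items that
# are generated from 3 independent latent factors (4 items each) plus noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(1000, 3))
loadings = np.kron(np.eye(3), np.ones((1, 4)))  # each factor drives 4 items
data = factors @ loadings + 0.8 * rng.normal(size=(1000, 12))

print(n_factors_kaiser(data))  # recovers the 3 simulated factors
```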
There appears to be reasonably good correspondence between the results of the factor analysis and the item assignments for the Piers-Harris 2 domain scales. The BEH, INT, FRE, and POP scales map relatively cleanly onto Factors II, IV, III, and VI, respectively. The situation is more complex for the other two domain scales. On the PHY scale, the items that reflect concerns with physical appearance map cleanly onto Factor V. However, PHY items that represent other personal attributes (e.g., Item 9, “I am a leader in games and sports,” or Item 15, “I am strong”) do not load on Factor V or any other factor. The HAP scale appears to be bifactorial. Items representing feelings of happiness, importance, and being valued by others load strongly on Factor I, whereas other HAP items having to do with satisfaction with one’s appearance load on Factor V, reflecting some of the item overlap between HAP and PHY.
It is worth noting that in this factor analysis, items were recoded to conform to the Piers-Harris 2 scoring system (i.e., “1” is coded for the response in the direction of positive self-concept, “0” for the alternative response). This recoding tends to obliterate the original distinction between positive and negative item phrasing. For example, a positively phrased item (e.g., Item 60, “I am a good person”) is coded “1” for a yes response, and a negatively phrased item (e.g., Item 36, “I hate school”) is coded “1” for a no response. Closer inspection of the factor analysis, however, does reveal some effects of item phrasing on the factor structure of the Piers-Harris 2. In particular, Factors II and III are composed primarily of negatively phrased items; Factors IV and V contain mostly positively phrased items; and Factors I and VI are evenly mixed.
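The recoding convention described above can be sketched as follows. The keyed directions for Items 36 and 60 come from Table 21 (the letter in parentheses marks the response scored as positive self-concept); the scoring function and the sample responses are hypothetical.

```python
# Sketch of the Piers-Harris 2 recoding convention: a response is
# coded 1 when it matches the keyed positive self-concept direction,
# 0 otherwise. Keyed directions for items 36 and 60 follow Table 21;
# the function and sample responses are hypothetical.
SCORING_KEY = {
    36: "no",   # "I hate school." -- no is the positive direction
    60: "yes",  # "I am a good person." -- yes is the positive direction
}

def recode(item: int, response: str) -> int:
    """Return 1 if the response is in the positive self-concept direction."""
    return 1 if response == SCORING_KEY[item] else 0

responses = {36: "no", 60: "yes"}
scores = {item: recode(item, resp) for item, resp in responses.items()}
print(scores)  # both items coded 1: each response matches its keyed direction
```

As the text notes, this recoding removes the surface distinction between positively and negatively phrased items, since both examples above receive the same code.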
54 Technical Guide
Table 20
Interscale Correlations in the Piers-Harris 2 Standardization Sample

Scale                                       TOT   BEH   INT   PHY   FRE   POP   HAP
Total (TOT)                                  -
Behavioral Adjustment (BEH)                 .73    -
Intellectual and School Status (INT)        .84   .64    -
Physical Appearance and Attributes (PHY)    .76   .34   .65    -
Freedom From Anxiety (FRE)                  .79   .42   .52   .50    -
Popularity (POP)                            .75   .30   .50   .66   .64    -
Happiness and Satisfaction (HAP)            .81   .53   .60   .69   .66   .55    -

Note. N = 1,387.
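Interscale correlations like those in Table 20 are Pearson correlations between respondents’ scale scores. A minimal sketch with synthetic data (random scores, not standardization-sample data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scale scores for 100 respondents on two scales;
# the second is built to correlate with the first by construction.
beh = rng.normal(50, 10, size=100)
int_ = 0.6 * beh + rng.normal(0, 8, size=100)

# Pearson correlation between the two score vectors.
r = np.corrcoef(beh, int_)[0, 1]
print(round(r, 2))
```

With real data, repeating this computation over every pair of scale-score columns yields the lower triangle shown in the table.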
Table 21
Factor Loadings for Piers-Harris 2 Item Responses in Standardization Sample
I II III IV V VI
Behavioral Adjustment (BEH)
12. I am well behaved in school. (Y) –.44
13. It is usually my fault when something goes wrong. (N) –.41
14. I cause trouble to my family. (N) –.49
18. I am good in my schoolwork. (Y) .56
19. I do many bad things. (N) –.62
20. I behave badly at home. (N) –.56
27. I often get into trouble. (N) –.62
30. My parents expect too much of me. (N)
36. I hate school. (N) .45
38. I am often mean to other people. (N) –.47
45. I get into a lot of fights. (N) –.50
48. My family is disappointed in me. (N) –.47
58. I think bad thoughts. (N) –.46
60. I am a good person. (Y) .53
Intellectual and School Status (INT)
5. I am smart. (Y) .51
7. I get nervous when the teacher calls on me. (N) .51
12. I am well behaved in school. (Y) –.44
16. I am an important member of my family. (Y) .49
18. I am good in my schoolwork. (Y) .56
21. I am slow in finishing my schoolwork. (N)
22. I am an important member of my class. (Y) .44
24. I can give a good report in front of the class. (Y) .51
25. In school I am a dreamer. (N)
26. My friends like my ideas. (Y) .49
34. I often volunteer in school. (Y) .45
39. My classmates in school think I have good ideas. (Y) .49 .50
43. I am dumb about most things. (N) .41
50. When I grow up, I will be an important person. (Y) .42
52. I forget what I learn. (N) .50
55. I am a good reader. (Y) .46
Physical Appearance and Attributes (PHY)
5. I am smart. (Y) .51
8. My looks bother me. (N) –.58
9. I am a leader in games and sports. (Y)
15. I am strong. (Y)
26. My friends like my ideas. (Y) .49
33. I have nice hair. (Y) –.55
39. My classmates in school think I have good ideas. (Y) .49 .50
44. I am good-looking. (Y) –.65
46. I am popular with boys. (Y)
49. I have a pleasant face. (Y) –.63
54. I am popular with girls. (Y) .42
Freedom From Anxiety (FRE)
4. I am often sad. (N) .44
6. I am shy. (N)
7. I get nervous when the teacher calls on me. (N) .51
8. My looks bother me. (N) –.58
10. I get worried when we have tests in school. (N) .50
17. I give up easily. (N)
23. I am nervous. (N) .55
29. I worry a lot. (N) .56
31. I like being the way I am. (Y) –.54
32. I feel left out of things. (N) .48 –.46
35. I wish I were different. (N) .42 –.58
40. I am unhappy. (N) .54
56. I am often afraid. (N) .54
59. I cry easily. (N) .44
Popularity (POP)
1. My classmates make fun of me. (N) .41
3. It is hard for me to make friends. (N) .56
6. I am shy. (N)
11. I am unpopular. (N) .52
32. I feel left out of things. (N) .48 –.46
37. I am among the last to be chosen for games and sports. (N) .51
39. My classmates in school think I have good ideas. (Y) .49 .50
41. I have many friends. (Y) .43 .63
47. People pick on me. (N) .52
51. In games and sports, I watch instead of play. (N)
54. I am popular with girls. (Y) .42
57. I am different from other people. (N)
Happiness and Satisfaction (HAP)
2. I am a happy person. (Y) .63
8. My looks bother me. (N) –.58
28. I am lucky. (Y)
31. I like being the way I am. (Y) –.54
35. I wish I were different. (N) .42 –.58
40. I am unhappy. (N) .54
42. I am cheerful. (Y) .60
49. I have a pleasant face. (Y) –.63
53. I am easy to get along with. (Y)
60. I am a good person. (Y) .53
Note. N = 1,387. Principal axis extraction method with oblimin rotation. Loadings are item-factor correlations. Loadings of less than .40 are not
displayed. Letter in parentheses indicates response scored as positive self-concept (Y = yes, N = no).
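The display rule in the table note (loadings with absolute value below .40 are suppressed) can be sketched as below; the loadings dictionary is hypothetical.

```python
# Sketch of the table's display convention: suppress factor loadings
# whose absolute value is below .40. The loadings are hypothetical.
loadings = {"item_2": 0.63, "item_28": 0.31, "item_8": -0.58}

displayed = {item: value for item, value in loadings.items()
             if abs(value) >= 0.40}
print(displayed)  # item_28 is suppressed; the negative loading survives
```

Note that the absolute value is taken before thresholding, so strong negative loadings (common here for negatively keyed factors) are still displayed.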
  • 1. Piers-HarrisChildren’sSelf-ConceptScale,SecondEditionPIERS-HARRIS2MANUALEllenV.Piers,Ph.D.,andDavidS.Herzberg,Ph.D. W-388B Piers-Harris 2 Piers-Harris Children’s Self-Concept Scale, SECOND EDITION Ellen V. Piers, Ph.D. David S. Herzberg, Ph.D. MANUAL Western Psychological Services • 12031 Wilshire Boulevard, Los Angeles, California 90025-1251 Additional copies of this manual (W-388B) may be purchased from WPS. Please contact us at 800-648-8857, Fax 310-478-7838, or www.wpspublish.com. wps® wps Publishers Distributors wps® Published by
  • 2. The Piers-Harris Children’s Self-Concept Scale (Piers, 1963) was originally developed in the early 1960s to provide a brief, self-report instrument for the assessment of self- concept in children and adolescents. As defined by the scale’s original authors, self-concept is a relatively stable set of atti- tudes reflecting both description and evaluation of one’s own behavior and attributes. Since its introduction, the Piers- Harris has enjoyed widespread acceptance among clinicians and researchers, as well as praise from reviewers. The instru- ment’s stature is reflected in more than 500 citations in pro- fessional journals and books in psychology, education, and the health sciences. These numerous references highlight the Piers-Harris’s vital role in the expansion of knowledge about self-concept and its relationship to behavior. The Piers-Harris Children’s Self-Concept Scale, Second Edition (Piers-Harris 2) represents the culmination of a careful revision process. The general goals of this pro- cess were to enhance the ease of use and psychometric foun- dation of the test, while preserving the many characteristics of the instrument that have contributed to its success. These goals have been realized in a set of specific improvements, including new nationwide normative data, an updated item set, enhanced interpretive guidelines, and modernized com- puter assessment tools. Nevertheless, the Piers-Harris 2 re- tains the familiar response format, self-concept scales, and excellent psychometric properties of the original edition. Thus, the revised test should be easily integrated into re- search projects and clinical assessments that used the origi- nal Piers-Harris. General Description The Piers-Harris 2 is a 60-item self-report question- naire, subtitled The Way I Feel About Myself. It is designed for administration to children who are at least 7 years old and have at least a second-grade reading ability. 
The mea- sure can be used with adolescents up to 18 years of age. The Piers-Harris 2 items are statements that express how people may feel about themselves. Respondents are asked to indicate whether each statement applies to them by choosing yes or no. Several methods of administration are available: the Piers-Harris 2 AutoScore™ Form (WPS Product No. W-388A), which is completed by the child and scored manually by the test administrator; mail-in and fax-in forms (WPS Product Nos. W-388C and W-388Z), which are completed by the child and submitted to WPS for computer scoring and report generation; a PC program (WPS Product No. W-388Y), which can generate a report based on either online administration or offline data entry; and the Spanish Answer Sheet (WPS Product No. W-388E), which is com- pleted by the child, whose answers are then transcribed onto an AutoScore™ Form by the examiner. Using any of these methods of administration, most respondents can complete the Piers-Harris 2 in 10 to 15 minutes. The Piers-Harris 2 includes the same Self-Concept and Validity scales as the original Piers-Harris. The Self- Concept scales comprise the Piers-Harris 2 Total (TOT) score, which is a general measure of the respondent’s overall self-concept, and the six domain scales, which assess spe- cific components of self-concept. The domain scales include Behavioral Adjustment (BEH), Intellectual and School Status (INT), Physical Appearance and Attributes (PHY), Freedom From Anxiety (FRE), Popularity (POP), and Happiness and Satisfaction (HAP). (On the original Piers- Harris, the Freedom From Anxiety scale was labeled Anxiety and the Behavioral Adjustment scale was labeled Behavior. All other scale names are unchanged from the original instrument.) The Self-Concept scales are scored so that a higher score indicates a more positive self-evaluation in the domain being measured. 
The Piers-Harris 2 Validity scales include the Inconsistent Responding (INC) index, which is designed to identify random response patterns, and the Response Bias (RES) index, which measures a child’s tendency to respond yes or no irrespective of item content. Piers-Harris 2 Improvements The most important feature of the Piers-Harris 2 is its incorporation of new, nationally representative normative data. The new norms are based on a sample of 1,387 students, aged 7 to 18 years, who were recruited from school districts all across the United States. The sample closely approximates the ethnic composition of the U.S. population (U.S. Bureau of the Census, 2001a). The new standardization sample is a 1 INTRODUCTION 3
  • 3. significant improvement over the sample used to norm the original Piers-Harris. That sample was recruited in the early 1960s from a single public school system in rural Pennsyl- vania, and was relatively homogenous in terms of ethnicity and several other key demographic variables. In addition, whereas the original Piers-Harris sample consisted of 4th through 12th graders, the Piers-Harris 2 sample included 2nd and 3rd graders as well. The second major enhancement in the Piers-Harris 2 is the reduction of the scale from 80 to 60 items. This item reduction shortens administration time significantly, while retaining all of the Self-Concept and Validity scales from the original Piers-Harris. The deleted items included those of relatively less psychometric value, as well as those written in outdated language that was difficult for many children to understand. The revised scales are psychometrically equiva- lent to their counterparts in the original measure. Table 1 summarizes the changes in item composition and labeling between the original and revised Self-Concept scales. A third substantial change in the Piers-Harris 2 in- volves the microcomputer administration and scoring pro- gram. WPS offers a variety of computer services for many of its products. The “Computerized Services for the Piers- Harris 2” section at the back of this manual provides infor- mation about the options available for the Piers-Harris 2. The software has been updated for the latest version of the Microsoft Windows operating system, with an attractive new graphical user interface. In addition, the computer re- port has been streamlined and updated to reflect the new normative data. This manual includes several new enhancements, in- cluding a revised section on interpreting the test that incor- porates three new case studies. Furthermore, the manual now includes a topic-by-topic inventory of existing Piers- Harris studies (see Appendix A), to facilitate further re- search on the scale. 
Principles of Use

The Piers-Harris 2 is appropriate for use in any research, educational, or clinical setting that requires efficient quantitative assessment of children's reported self-concept. The original Piers-Harris gained widespread acceptance among researchers, as reflected in an extensive scholarly literature that has accumulated over the past four decades. The instrument has been used to evaluate psychological and educational interventions, to investigate the relationship between self-concept and other traits and behaviors (e.g., empathy, teenage pregnancy, drug and alcohol use), and to monitor changes in self-concept over time, among many other research applications.

Because it is easily administered to groups, the Piers-Harris 2 can be employed as a screening device in classrooms to identify children who might benefit from further psychological evaluation. The Piers-Harris 2 can also be used in individual clinical assessments of children and adolescents. The Self-Concept scales can be used to generate hypotheses for clinical exploration, as well as to guide clinicians in choosing among possible interventions and formulating referral questions for further psychological testing.

The Piers-Harris 2 can be administered and scored by teachers and other trained paraprofessionals. However, ultimate responsibility for its use and interpretation should be assumed by a professional with appropriate training in psychological assessment. Before administering the Piers-Harris 2, potential users should read this manual to become familiar with the theoretical rationale, development, standardization, and psychometric properties of the measure.

As with many self-report measures, users should keep in mind that the intent of the Piers-Harris 2 is readily apparent to most children and adolescents. For this reason, the responses may be subject to conscious and unconscious distortion, usually in the direction of greater social desirability.
The issue of response validity is addressed in greater detail in chapter 3 of this manual.

Although the Piers-Harris 2 is a useful instrument, it cannot by itself provide a comprehensive evaluation of a child's self-concept. Such an evaluation is a complex task requiring clinical sensitivity and familiarity with the applicable research literature. In making clinical judgments concerning Piers-Harris 2 results, users should be prepared to integrate other sources of data, which may include clinical interviews with the child and other informants, prior history, school records, classroom observations, and results from other psychological tests. Users should also be prepared to confer with outside consultants and referral sources as needed.

Table 1
Self-Concept Scales of the Original Piers-Harris and the Piers-Harris 2

  Original Piers-Harris                       No. of items    Piers-Harris 2                          No. of items
  Total                                            80         Total (TOT)                                  60
  Cluster scales:                                             Domain scales:
    Behavior (BEH)                                 16           Behavioral Adjustment (BEH)                14
    Intellectual and School Status (INT)           17           Intellectual and School Status (INT)       16
    Physical Appearance and Attributes (PHY)       13           Physical Appearance and Attributes (PHY)   11
    Anxiety (ANX)                                  14           Freedom From Anxiety (FRE)                 14
    Popularity (POP)                               12           Popularity (POP)                           12
    Happiness and Satisfaction (HAP)               10           Happiness and Satisfaction (HAP)           10

Note. Some items are assigned to more than one scale. See Appendix E for a list of original items that were deleted in developing the Piers-Harris 2.

Contents of This Manual

Chapter 2 of this manual contains instructions for administering and scoring the Piers-Harris 2, and includes a completed sample of an AutoScore™ Form. Chapter 3 presents guidelines for interpreting the test results. Technical aspects of the test are presented in chapters 4 and 5. Chapter 4 reviews the development of the original Piers-Harris and describes the new standardization sample and item revisions for the Piers-Harris 2. Chapter 5 discusses the reliability and validity of the Piers-Harris 2 and presents an overview of research on the technical properties of the original test. This manual also includes several appendixes that support specialized applications of the test: Appendix A presents a list of research studies employing the Piers-Harris, organized by topic; Appendix B reviews the use of the Piers-Harris with exceptional children; and Appendixes C and D contain instructions and tables for comparing raw scores from the Piers-Harris 2 with those from the original version of the test. Appendix E lists the items from the original Piers-Harris that were omitted from the Piers-Harris 2. Finally, in the back of the manual is a chapter that provides instructions for using the Piers-Harris 2 computer-scoring products.
4 DEVELOPMENT AND RESTANDARDIZATION

The original Piers-Harris Children's Self-Concept Scale was developed in the 1960s as a research instrument and as an aid for clinical and educational evaluation in applied settings (Piers, 1984). Since its introduction, the Piers-Harris has functioned well in these roles, forming the basis for an impressive and growing body of research. The Piers-Harris 2 is the first major revision and restandardization of the original Piers-Harris. The new features of the Piers-Harris 2, which include new normative data and an updated item set, were implemented with the goal of maintaining as much "backward compatibility" as possible with the original Piers-Harris.

This chapter begins by reviewing the theoretical rationale for the Piers-Harris and the development of the original item set and scoring system. The remainder of the chapter is devoted to the Piers-Harris 2 revisions. The new standardization sample is described first, followed by a discussion of the item and scale changes. The chapter concludes with a discussion of moderator variables and their effects on interpretation. Users interested in making direct comparisons between Piers-Harris 2 scores and scores on the original measure should consult Appendixes C and D, which present the instructions and tables necessary for comparing the scores.

Original Rationale and Theoretical Background

The original version of the Piers-Harris was based on the view that individuals maintain relatively consistent beliefs about themselves, beliefs that develop and stabilize during childhood. This set of beliefs represents a person's self-concept, a term which some researchers have used interchangeably with terms such as self-esteem and self-regard.
The original authors of the Piers-Harris assumed that children would reveal important aspects of this underlying self-image by agreeing or disagreeing with simple, self-descriptive statements, and that this assessment of self-concept would relate meaningfully to other aspects of personality and to predictions of future behavior.

From a global perspective, the term self-concept refers to a person's self-perceptions in relation to important aspects of life. Although shaped by biological and cultural factors, these perceptions are formed primarily through the interaction of the individual with the environment during childhood, and by the attitudes and behaviors of others. These perceptions give rise to self-evaluative attitudes and feelings that have important organizing functions and that also motivate behavior. Over time, an individual's self-concept may change in response to environmental or developmental changes, or as a result of changes in priorities or values. However, these changes usually do not occur rapidly or in response to isolated experiences or interventions.

This definition of self-concept rests on several theoretical assumptions:

1. Self-concept is essentially phenomenological in nature. It is not something that can be observed directly but must be inferred from either behaviors or self-report. Although behaviors are directly measurable, it is difficult to use behavioral observations to draw inferences about self-concept that are replicable and consistent across different situations. Self-report, although subject to many types of distortions, is closer to the present definition of self-concept, because it is a direct expression of the individual's experience of the self. The problem of distortion of self-report is a methodological issue, not a theoretical one.

2. Self-concept has both global and specific components.
Global self-concept reflects how an individual feels about all the characteristics that make up his or her person, taking into account, among other things, skills and abilities, interactions with others, and physical self-image. Various specific aspects of self-concept result from an individual's self-appraisal in particular areas of functioning. These specific facets of self-concept differ on several dimensions. Some are relatively broad (e.g., physical self, moral and ethical self, academic self); others are narrowly defined (e.g., good at mathematics, not skilled at baseball). The relative significance to the individual of each aspect of self-concept determines the degree to which success and failure affect overall self-evaluation (Dickstein, 1977; Harter, 1978). In unimportant areas, for example, perceived failure is not likely to have a strong impact on the individual's global self-evaluation. Similar notions have been proposed by Shavelson, Hubner, and Stanton (1976), who view self-concept as being "hierarchically organized."

3. Self-concept is relatively stable. Although shaped by experience, it does not change easily or rapidly. In children, self-concept is initially more situationally dependent and becomes increasingly stable over time. Although it may be possible to enhance children's self-concept through lengthy corrective experiences, changes are not likely to occur as the result of a brief, single, or superficial intervention. For example, a weekend camping trip may make a child feel good but is unlikely to bring about lasting change in that child's self-concept. In addition, certain areas of self-concept may be more difficult to change than others, and some may be amenable to change only during certain "critical periods" (Erikson, 1950; Schonfeld, 1969).

4. Self-concept has an evaluative as well as a descriptive component. It represents an individual's accumulated judgments concerning the self. Some of these evaluations may reflect internalized judgments of others (e.g., values, norms, and notions of what constitutes socially desirable traits and behaviors). Others may be unique to the individual. Thus, in evaluating reported self-concept it is important to consider both nomothetic (between-person) and idiographic (within-person) sources of comparison. The issues to be addressed concern both how children compare themselves to their peers and how they evaluate themselves against their own internal standards.

5. Self-concept is experienced and expressed differently by children at various stages of development.
During infancy, the focus is on differentiating self from others and on establishing a reciprocal relationship with the primary caretaker or caretakers (Ainsworth, 1979; Mahler, Pine, & Bergman, 1975). During the preschool years, the child becomes more mobile, interacts socially with other children and adults, and begins to develop a sense of gender identity. Self-concept during this period is defined primarily by the child's experience in each of these areas, and by parental attitudes and behaviors. The self-concepts of school-age children expand to encompass a larger arena of daily interactions, especially in the areas of achievement and peer relationships. With increasing age and experience, the child's self-perceptions also become increasingly differentiated as he or she struggles to integrate disparate aspects of experience into a unified conceptual framework (Fahey & Phillips, 1981). In adolescence, certain aspects of self-concept may undergo rapid change or differentiation (e.g., moral and ethical self-image, physical self-concept), whereas others develop in a continuous, stable way (Dusek & Flaherty, 1981). For a more detailed discussion of developmental issues relating to self-concept, see Harter (1983).

6. Self-concept serves to organize and motivate behavior. A stable self-concept maintains a consistent image of a person's typical reactions across different situations. This helps to reduce ambiguity in new situations and structure behavior toward preexisting goals. Action is also guided by an individual's judgment of whether or not a particular behavior is consistent with his or her self-image. Behaviors that are congruent with one's self-concept will tend to be favored over incongruent behaviors. In this fashion, judgments concerning the relative success or failure of particular actions, as well as the emotions (e.g., pride, joy, humiliation) related to these outcomes, may serve an important motivating function.
Original Piers-Harris Development

Item Development

The original Piers-Harris items were derived from the work of Jersild (1952), who asked children what they liked and disliked about themselves. These statements were then grouped into the following categories: (a) physical characteristics and appearance; (b) clothing and grooming; (c) health and physical well-being; (d) home and family; (e) enjoyment of recreation; (f) ability in sports and play; (g) academic performance and attitudes toward school; (h) intellectual abilities; (i) special talents (music, arts); (j) "Just Me, Myself"; and (k) personality characteristics, inner resources, and emotional tendencies.

An initial item set, consisting of 164 items, was written to reflect these various aspects of children's self-concept. The items were written as simple declarative statements (e.g., "I am a happy person"), with a yes/no response format. To reduce the possible effects of response biases, approximately half of the items were negatively worded (e.g., "I behave badly at home") and half were worded in the direction of positive self-concept (e.g., "I have many friends"). Most items were written to avoid such problematic features as double-negative constructions and ambiguous qualifiers such as many, often, or rarely. Finally, 12 "lie" scale items were included to assess the tendency to respond in a socially desirable manner. These items were intended to measure children's willingness to admit relatively common
weaknesses (e.g., "I am always good" or "Sometimes I act silly"). However, these "lie" scale items were later dropped when it was found that they did not contribute significantly to the validity of the scale.

This preliminary pool of items was then administered to a sample of 90 children from Grades 3, 4, and 5. To minimize errors due to differences in reading ability, the items were read aloud by the examiners while the children followed along in their test booklets. This pilot study established that the children understood the items, and that the inventory could be completed in approximately 30 to 35 minutes.

The pilot study results were used to reduce the item pool. Items answered in one direction by less than 10% or more than 90% of the respondents were inspected and, in most cases, dropped. However, because the instrument was designed to identify children with problems in self-concept, a few items such as "My parents love me" were temporarily retained even though answered yes by the great majority of children. This procedure reduced the item pool to 140 items.

A second pilot study was conducted with a sixth-grade sample of 127 students. The 30 highest and 30 lowest scores were identified, and items were retained only if (a) they discriminated significantly between these high and low groups (p < .05), and (b) they were answered in the expected direction by at least half of the high-scoring group. These procedures reduced the item pool to the 80 items that composed the original Piers-Harris.

Scale Construction

Total score. The Piers-Harris Total score was based on all 80 items. It was calculated by crediting 1 point for each item answered in the direction of positive self-concept. The Total score was designed to measure a general dimension of self-concept or self-esteem.

Cluster scales. Piers (1963) investigated the multidimensional nature of the scale by conducting a principal components analysis using a sample of 457 sixth graders.
Varimax rotation yielded six interpretable factors that together accounted for 42% of the variance in item responses. These factors were labeled Behavior, Intellectual and School Status, Physical Appearance and Attributes, Anxiety, Popularity, and Happiness and Satisfaction. This factor structure was supported by numerous subsequent factor analyses, which are described in more detail in chapter 5. The factor structure formed the basis for the subscales of the Piers-Harris, which were called cluster scales in the 1984 edition of the Piers-Harris Manual. As with the Total score, the cluster scales were scored by crediting 1 point for each item answered in the direction of positive self-concept.

Piers-Harris 2 Development

In revising the original Piers-Harris, the primary objectives were to update and improve the test's normative data and item set. The original Piers-Harris normative data were problematic in several respects. First, the standardization sample was recruited in the early 1960s from a single public-school system in rural Pennsylvania, and was relatively homogeneous with respect to ethnicity and several other key demographic variables. This made it more difficult to interpret Piers-Harris results for minority children and those from other groups who differed substantially from the children in the standardization sample. Second, when the cluster and validity scales were developed for the original Piers-Harris, they were normed on different samples than the standardization sample used to norm the Total score. Although this procedure was necessary in order to develop these important new scales, it clearly deviated from ideal test-standardization procedures, in which all scales are normed on a single standardization sample.

A second set of concerns addressed in the revision process involved the test items themselves. Decades of experience with the original scale have revealed problems with certain items.
Some of these items have become difficult for younger children to understand because they were written with outdated language or low-frequency words. In addition, several researchers have identified opportunities to shorten the Piers-Harris by deleting items that have relatively limited psychometric utility.

Standardization Sample

The Piers-Harris 2 restandardization was based on a large sample of students recruited from elementary, middle, junior high, and high schools throughout the United States. Table 6 presents the demographic characteristics of the sample, along with corresponding percentages from the U.S. Census (U.S. Bureau of the Census, 2001a, 2001b) for comparison. The sample is distributed relatively uniformly within the age range of 7 to 18 years (Grades 2 through 12). The distribution of the sample among ethnic groups is similar to the U.S. Census figures, with slight underrepresentation of Asians and Hispanics. Geographical distribution is adequate, with slight underrepresentation of participants from the western region of the United States.

Table 6 includes the distribution of a subsample of participants among various categories of head-of-household educational level. This variable is used as an index of socioeconomic status (SES), with higher educational level associated with higher SES. The subsample is fairly close to the U.S. Census figures for the lowest two SES categories. The proportion of subsample participants in the top SES category is higher than the U.S. Census proportion. The use of a subsample for the SES comparisons reflects the fact that education data for heads of household were available for only 673 participants in the Piers-Harris 2 standardization study. The remaining 714 participants came from sites where SES data were not collected as part of the Piers-Harris 2 standardization study.
Fortunately, these sites were conducting other standardization studies concurrently, and SES data were collected in those studies. The other studies used different participants from the Piers-Harris 2 study. However, because sampling for the
various studies was random, there was no reason to expect systematic differences in SES between participants in the Piers-Harris 2 study and those in the other standardization studies. Examination of the head-of-household education data for participants in these other studies revealed a distribution that is almost identical to the distribution for the 673 Piers-Harris 2 participants (see Table 6). Therefore, it is reasonable to assume that the distribution of head-of-household data presented in Table 6 approximates that of the entire Piers-Harris 2 sample.

Item and Scale Revisions

Item revisions. As noted above, a major goal of the Piers-Harris 2 revision was to streamline the scales by eliminating problematic items. The first items targeted were those that contributed only to the Piers-Harris Total score and not to any of the six domain scores (Benson & Rentsch, 1988). A second set of items tagged for elimination included those whose wording has become outdated (e.g., "I have lots of pep"); those that seemed specific to one sex (e.g., "I have a good figure"); and those containing words that frequently needed additional explanation, especially for younger children (e.g., "I am obedient at home"). These procedures identified 20 items as candidates for deletion, which left a revised set of 60 items.

Statistical analyses (which are described in the next two sections) established that deletion of these items would result in no appreciable loss of reliability in the Total score or domain scale scores. In addition, the Total and domain scores from the 60-item set correlated very highly with the original scores derived from the 80-item set. These analyses supported the decision to shorten the measure by deleting the 20 candidate items. The deleted items, with their original item numbers, are listed in Appendix E. (The remaining items, which constitute the Piers-Harris 2, are presented in Table 4.)
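The reliability comparisons reported in this chapter rest on coefficient alpha, which for dichotomous yes/no items is equivalent to the Kuder-Richardson Formula 20 (KR-20). The sketch below shows the computation on made-up response data; the `data` matrix is purely illustrative and is not drawn from the standardization sample.

```python
# Coefficient alpha (KR-20 form for dichotomous items): a minimal sketch.
# The response data below are made up for illustration only.

def coefficient_alpha(responses):
    """responses: list of examinee vectors; 1 = answered in positive direction."""
    n_items = len(responses[0])
    n_people = len(responses)
    # Population variance of each item across examinees.
    item_vars = []
    for i in range(n_items):
        scores = [r[i] for r in responses]
        mean = sum(scores) / n_people
        item_vars.append(sum((s - mean) ** 2 for s in scores) / n_people)
    # Population variance of total scores.
    totals = [sum(r) for r in responses]
    t_mean = sum(totals) / n_people
    total_var = sum((t - t_mean) ** 2 for t in totals) / n_people
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 examinees x 4 items.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]
print(round(coefficient_alpha(data), 2))  # → 0.73
```

In this framework, dropping 20 items while alpha falls only from .93 to .91 is the quantitative sense in which the shortened TOT scale "loses no appreciable reliability."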
In addition to the deletions, one item was slightly altered from its original wording: Item 37 was changed from "I am among the last to be chosen for games" to "I am among the last to be chosen for games and sports." This change was made to ensure parallel wording with two other items (9 and 51) that contain the phrase "games and sports." These changes resulted in the 60-item Piers-Harris 2, which yields the same scores as the original test without sacrificing its psychometric strengths. The new version also decreases administration time significantly, as compared to the original measure. A readability analysis was conducted on the Piers-Harris 2 item set. The Flesch Reading Ease score was 91.8, and the Flesch-Kincaid Grade Level was 2.1, indicating that second-grade readers should be able to read the items with little difficulty.

Total (TOT) score. As in the original measure, the Piers-Harris 2 Total (TOT) score is a measure of general self-concept or self-esteem. The TOT score is derived by crediting 1 point for each item answered in the direction of positive self-concept. Because all 80 items of the original Piers-Harris were administered to the Piers-Harris 2 standardization sample, the reliability of the original and revised scales could be compared (see chapter 5 for a more thorough discussion of reliability and associated terminology). Coefficient alpha values for the 80-item original Total score and the 60-item Piers-Harris 2 TOT score were .93 and .91, respectively. These statistics indicate that both versions demonstrate robust internal consistency and that the revised scale shows no significant loss in reliability compared to the lengthier original version. In addition, the original and revised Total scores correlate at .98, indicating that they are functionally equivalent.

Domain scales.
As noted earlier, the original Piers-Harris included six factor-analytically derived cluster scales (Behavior, Intellectual and School Status, Physical Appearance and Attributes, Anxiety, Popularity, and Happiness and Satisfaction). These scales have been retained in the Piers-Harris 2, with very slight modifications. They are scored by crediting 1 point for each scale item answered in the direction of positive self-concept.

Table 6
Demographic Characteristics of the Piers-Harris 2 Standardization Sample

                                          Sample              U.S. Census
                                          n        %          %
  Sex
    Male                                  689      49.7       51.3
    Female                                698      50.3       48.7
  Age in years
    7–8                                   188      13.6
    9–10                                  231      16.7
    11–12                                 277      20.0
    13–14                                 271      19.5
    15–16                                 255      18.4
    17–18                                 165      11.9
  Race/Ethnic background
    Asian                                 17       1.2        3.5
    Black                                 255      18.4       14.7
    Hispanic/Latino                       102      7.4        17.1
    White                                 943      68.0       60.9
    Native American                       16       1.2        0.9
    Other                                 50       3.6        2.9
    Not specified                         4        0.2
  U.S. geographic region
    Northeast                             316      22.8       19.0
    Midwest                               424      30.6       22.9
    South                                 463      33.4       35.6
    West                                  184      13.3       22.5
  Head of household's educational level
    Less than high school graduate        73       10.8       11.4
    High school graduate                  182      27.0       31.9
    Some college                          100      14.9       28.0
    Four-year college degree or more      318      47.3       28.7

Note. N = 1,387, except for head of household's educational level, where N = 673 (see text for discussion of missing data). U.S. Census figures for sex, age, race/ethnicity, and geographic region (U.S. Census Bureau, 2001a) are based on the U.S. population of school-aged children. Census data for head of household's educational level (U.S. Census Bureau, 2001b) are based on adults aged 25 to 54 (those most likely to be parents of school-aged children).

Several labeling changes have been instituted in the Piers-Harris 2. The cluster scales have been relabeled domain scales to reflect the fact that they are intended to measure particular domains of self-concept. This label was judged more appropriate than cluster scales, which refers to the statistical procedures used to assign items to the scales, rather than the clinical utility of the scales themselves. In addition, the Anxiety scale is labeled Freedom From Anxiety in the Piers-Harris 2. This was done to correct a confusing aspect in the scoring of the original instrument. In the original Piers-Harris, as in the Piers-Harris 2, all scales are scored so that higher scores reflect more positive self-concept. On the original instrument, the scale labeled Anxiety created confusion because most users intuitively believed that higher scores indicated more anxiety, when in fact higher scores on this scale indicated less anxiety. The Piers-Harris 2 name Freedom From Anxiety removes this source of confusion and is more consistent with the labeling conventions for the other five domain scales, which generally refer to positive attributes (e.g., Popularity, Happiness and Satisfaction). Finally, the Behavior scale has been given the more specific, descriptive label of Behavioral Adjustment in the Piers-Harris 2.

The item assignments for the Piers-Harris 2 domain scales (see Table 3) represent relatively minor changes from the composition of the original cluster scales.
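The scoring rule for the domain scales — one raw-score point per item answered in the positive direction, with some items counting toward more than one scale — can be sketched as follows. The item numbers, keys, and scale memberships below are hypothetical stand-ins chosen for illustration; the actual assignments are those given in Table 3.

```python
# Scoring sketch: 1 raw-score point per item answered in the positive
# direction. Item numbers, keys, and scale memberships are HYPOTHETICAL
# stand-ins, not the actual Piers-Harris 2 assignments (see Table 3).

# For each item, the response ("yes"/"no") that indicates positive self-concept.
KEY = {3: "no", 9: "yes", 41: "yes", 45: "no", 51: "no"}

# Hypothetical scale memberships; the real scales share overlapping items,
# which is why domain raw scores do not sum to the TOT score.
SCALES = {
    "POP": [3, 41, 51],
    "PHY": [9, 45, 51],  # item 51 also appears on POP (overlap)
}

def score(responses):
    """responses: dict mapping item number -> 'yes'/'no'."""
    scores = {}
    for scale, items in SCALES.items():
        scores[scale] = sum(1 for i in items if responses.get(i) == KEY[i])
    # TOT credits every keyed item once, regardless of scale overlap.
    scores["TOT"] = sum(1 for i, k in KEY.items() if responses.get(i) == k)
    return scores

child = {3: "no", 9: "yes", 41: "yes", 45: "yes", 51: "no"}
print(score(child))  # {'POP': 3, 'PHY': 2, 'TOT': 4}
```

Note how the overlapping item (51 in this sketch) is credited on every scale that contains it, which is exactly the property the nonoverlapping-scale proposal discussed below would have removed.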
The item revisions did not affect the Freedom From Anxiety (FRE), Popularity (POP), and Happiness and Satisfaction (HAP) scales, which retain the same items as their counterparts in the original measure. The Behavioral Adjustment (BEH) and Physical Appearance and Attributes (PHY) scales each lost two items in the revision, and the Intellectual and School Status (INT) scale lost one item. Most of the 20 items dropped for the Piers-Harris 2 contributed only to the Total score and not to any of the domain scales. Table 7 presents reliability coefficients for the revised and original versions of these three scales, demonstrating that the item changes did not cause a meaningful decrement in the internal consistency of these scales. In addition, the extremely high correlation coefficients demonstrate that the revised and original versions of these scales are essentially equivalent.

As Table 3 indicates, the Piers-Harris 2 domain scales, like their counterparts in the original measure, contain numerous overlapping items. The item overlap is an artifact of the factor-analytic procedures used to derive the original scales. Cooley and Ayres (1988) advocated eliminating this item overlap in order to increase the independence of the cluster scales. They suggested assigning each overlapping item only to the scale with which it correlated most strongly. This procedure was considered for the Piers-Harris 2, but reliability analyses showed that nonoverlapping domain scales suffered a significant drop in internal consistency. This loss of reliability was especially apparent in the youngest children in the Piers-Harris 2 standardization sample. For 7- to 8-year-old children, coefficient alpha was less than .60 for one scale and less than .70 for two others. Because such low reliability coefficients are undesirable, the authors of the Piers-Harris 2 decided against the creation of nonoverlapping domain scales.

Validity scales.
The Piers-Harris 2 includes two Validity scales, the Inconsistent Responding (INC) index and the Response Bias (RES) index, which help the test user to detect deviant response sets. The INC scale is designed to identify random response patterns. It is based on the supposition that certain pairs of responses are contradictory and/or statistically improbable. The INC scale was introduced in the 1984 edition of the Piers-Harris Manual, but has been revised extensively for the Piers-Harris 2.

Table 7
Intercorrelations and Reliability Coefficients for Revised and Original Domain Scales

                                                         Coefficient alpha
  Domain scale                                 r(a)    Piers-Harris 2    Original Piers-Harris
  Behavioral Adjustment (BEH)                  .98          .81                 .81
  Intellectual and School Status (INT)         .99          .81                 .82
  Physical Appearance and Attributes (PHY)     .98          .75                 .79

Note. These are the only domain scales (called "cluster scales" on the original Piers-Harris) that had item changes in the Piers-Harris 2 revision.
a The correlation coefficient between the revised and original scale raw scores.

Inconsistent Responding index. The INC scale was constructed using both rational and empirical procedures. First, the 60 Piers-Harris 2 items were examined to determine pairs of items on which it is possible to produce logically inconsistent responses (e.g., Item 3, "It is hard for me to make friends," and Item 41, "I have many friends"). Second, a correlation matrix was formed for all 60 items using the Piers-Harris 2 standardization data. Item pairs that correlated at r ≥ .25 were examined, and pairs were retained if their content created the potential for a logically inconsistent pair of responses, as in the aforementioned example. Third, frequency tables were constructed for all item pairs identified in the first two steps. These tables were used to determine which particular combination of responses (e.g., yes on one item, no on another) occurred least frequently in the standardization sample. Item pairs were retained if they produced a pair of logically inconsistent responses that occurred in less than 10% of the sample.

These procedures resulted in an INC scale composed of 15 item pairs, which are displayed in Table 8. This revised INC scale features two substantial improvements over the one described in the 1984 edition of the Piers-Harris Manual. First, the current revision includes no duplicate items among its 15 item pairs, whereas the 25 item pairs in the original version of the INC scale included numerous duplicate items. This meant that a particular response to a single item could have a disproportionate effect on the total INC score, potentially distorting the meaning of the scale. Second, the items in the current revision of the scale are distributed relatively evenly among the six domain scales. The original version of the INC scale was heavily weighted with items from the Behavior scale, with fewer items representing the remaining five domain scales. This meant that the original INC score gave little information about the validity of responses on these other five scales.

The INC index is scored by crediting 1 point for each pair of items for which the examinee gives the specified combination of keyed responses. For example, a yes response to Item 1 ("My classmates make fun of me") coupled with a no response to Item 47 ("People pick on me") would be scored "1" on the INC index. Note that the other combination of logically inconsistent responses to this item pair (no to Item 1, yes to Item 47) would not be scored on the INC index. This scoring system results in a possible INC index score of 0 to 15. In the Piers-Harris 2 standardization sample, the INC score had a mean of .89 and a standard deviation of 1.12. As would be expected, the distribution of this score was highly positively skewed (that is, most examinees produced very low scores on the INC index).
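This scoring rule can be sketched directly. The two item pairs below, with their keyed response combinations, are taken from Table 8; the full index uses all 15 pairs, so scores range from 0 to 15.

```python
# INC index sketch: credit 1 point per item pair showing the specific keyed
# combination of inconsistent responses. Two of the 15 pairs from Table 8
# are shown; the full index scores 0-15.

# (item A, keyed response A, item B, keyed response B)
INC_PAIRS = [
    (1, "yes", 47, "no"),  # "My classmates make fun of me." / "People pick on me."
    (2, "no", 42, "yes"),  # "I am a happy person." / "I am cheerful."
]

def inc_score(responses):
    """responses: dict mapping item number -> 'yes'/'no'."""
    return sum(
        1
        for a, key_a, b, key_b in INC_PAIRS
        if responses.get(a) == key_a and responses.get(b) == key_b
    )

# Yes to "My classmates make fun of me" but no to "People pick on me" is
# the keyed inconsistent combination, so this pair contributes 1 point;
# the reverse combination (no/yes) would contribute nothing.
child = {1: "yes", 47: "no", 2: "no", 42: "no"}
print(inc_score(child))  # 1
```

With all 15 pairs in place, a result of 4 or more triggers the validity concern discussed next; the cumulative frequencies in Table 9 show how rarely genuine respondents reach that level.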
Table 8
Inconsistent Responding (INC) Index Item Pairs

INC item pair (keyed response)                                              r     Freq. (%)
1. My classmates make fun of me. (Y) / 47. People pick on me. (N)           .46     7.2
2. I am a happy person. (N) / 42. I am cheerful. (Y)                        .45     4.5
3. It is hard for me to make friends. (N) / 41. I have many friends. (N)    .40     7.9
4. I am often sad. (N) / 40. I am unhappy. (Y)                              .46     5.3
5. I am smart. (N) / 43. I am dumb about most things. (N)                   .42     4.9
7. I get nervous when the teacher calls on me. (Y) /
   10. I get worried when we have tests in school. (N)                      .35     9.6
9. I am a leader in games and sports. (Y) /
   51. In games and sports, I watch instead of play. (Y)                    .27     5.3
14. I cause trouble to my family. (N) / 20. I behave badly at home. (Y)     .41     6.3
18. I am good in my schoolwork. (N) /
    21. I am slow in finishing my schoolwork. (N)                           .32     8.1
19. I do many bad things. (Y) / 27. I often get into trouble. (N)           .47     6.6
26. My friends like my ideas. (N) /
    39. My classmates in school think I have good ideas. (Y)                .59     3.7
29. I worry a lot. (N) / 56. I am often afraid. (Y)                         .36     6.0
31. I like being the way I am. (N) / 35. I wish I were different. (N)       .48     3.9
44. I am good-looking. (Y) / 49. I have a pleasant face. (N)                .56     6.5
53. I am easy to get along with. (Y) / 60. I am a good person. (N)          .38     3.3

Note. r = inter-item correlation; Freq. = frequency of the keyed response combination in the standardization sample.

Table 9
Cumulative Frequency of Inconsistent Responding (INC) Index Scores in the Standardization Sample

INC score   Cumulative frequency (%)
0            47.9
1            77.0
2            91.0
3            96.7
4            99.0
5            99.8
6            99.9
7            99.9
8            99.9
9           100.0

Note. N = 1,387.

The cumulative frequencies for the INC index are shown in Table 9. As noted in chapter 3, interpretation of the INC index is based on a cutoff score of 4. When a child scores 4 or greater on this index, there is a very high likelihood that the
Piers-Harris 2 was completed in a random or inconsistent manner. This cutoff score was determined by first selecting a score that was approximately 2 standard deviation units above the mean. In the Piers-Harris 2 standardization sample, the corresponding score was 3.13 (or 3, when rounded to the nearest whole number). As Table 9 illustrates, only 3.3% of the children in the Piers-Harris 2 standardization sample produced an INC index score greater than 3. This suggested that a cutoff score of 4 would identify those children whose INC scores were extremely deviant.

To further assess the interpretive value of this cutoff score, analyses were undertaken to determine the probability that an INC score of 4 represented a random response set. To accomplish this, the distribution of INC scores from the Piers-Harris 2 standardization sample (N = 1,387) was compared to the distribution of INC scores from 1,387 random Piers-Harris 2 response sets. The question of interest was: For any given INC score, what was the probability that it was drawn from the random data set as opposed to the data collected from actual respondents? For each INC score, the ratio of the frequency counts between the two samples was calculated and expressed as a percentage indicating the probability that the score was from the random data. Table 10 presents these percentages for all possible scores on the INC scale. Of the 337 cases that scored 4 on the INC scale, 32 were in the respondent sample and 305 were in the random sample. Thus, any Piers-Harris 2 response set that yielded an INC score of 4 had a 91% likelihood of being a random response set.

These analyses suggest that a cutoff score of 4 is useful for interpreting this index. An INC score of 4 represents a high probability (91%) that a particular Piers-Harris 2 was completed at random, or without adequate understanding of the content of the test items.

Response Bias (RES) index.
The RES scale is a straightforward measure of response bias. A positive response bias represents the tendency to answer yes regardless of item content, whereas a negative response bias represents the tendency to answer no regardless of item content. Either of these biases might distort the meaning of Piers-Harris 2 Self-Concept scores. The RES index allows the examiner to take response bias into account when interpreting the Piers-Harris 2. This index is a count of the number of items to which the examinee has responded yes. The RES score has a possible range of 0 to 60. Very high RES scores indicate a positive response bias, and very low scores signal a negative response bias. In the Piers-Harris 2 standardization sample, the RES score had a mean of 28.39 and a standard deviation of 5.28. The distribution of the scores approximated a normal distribution.

Table 10
Probability That Various Inconsistent Responding (INC) Scores Represent Random Responding

INC score   Probability of random response (%)
0             2
1            21
2            52
3            81
4            91
5            96
6            99
≥7          >99

Table 11
Descriptive Statistics for Piers-Harris 2 Raw Scores in the Standardization Sample

Scale                                        No. of items   M      SD
Self-Concept Scales
  Total (TOT)                                60             44.6   10.2
Domain Scales
  Behavioral Adjustment (BEH)                14             11.2    2.9
  Intellectual and School Status (INT)       16             11.9    3.4
  Physical Appearance and Attributes (PHY)   11              7.8    2.6
  Freedom From Anxiety (FRE)                 14             10.0    3.4
  Popularity (POP)                           12              8.4    2.7
  Happiness and Satisfaction (HAP)           10              8.1    2.2
Validity Scales
  Inconsistent Responding (INC)              15 pairs        0.9    1.1
  Response Bias (RES)                        60             28.4    5.3

Note. N = 1,387.

Cutoff scores for interpretation were set at ±2 standard deviation units. These cutoffs corresponded to RES scores of 40 and 18, respectively. Thus, children scoring 40 or above on the RES scale are identified as having given an
unusually large number of yes responses. Conversely, children scoring 18 or below on the RES index are flagged as having given an unusually large number of no responses. These two cutoff scores were chosen because they are reasonably conservative and because moderate deviations in response set in either direction usually do not create major problems for interpretation.

Derivation of Standard Scores

Table 11 presents means and standard deviations for the raw scores of the Piers-Harris 2 scales in the standardization sample. To facilitate interpretation of Piers-Harris 2 results, these raw scores have been converted to normalized T-scores (see Anastasi, 1988, p. 88). The original distribution of Piers-Harris 2 raw scores underwent a nonlinear transformation so that it would approximately fit a normal curve. The normalized raw scores were converted to T-scores, which have a mean of 50 and a standard deviation of 10. Because the Piers-Harris 2 uses normalized T-scores, every scale has an approximately normal distribution of standard scores. Thus, the frequency of cases in the standardization sample falling above or below a given T-score is nearly identical for every scale. This simplifies intra-individual comparisons among the domain scales of the Piers-Harris 2. For example, if a child achieves normalized T-scores of 60 on the BEH and POP domain scales, the examiner knows that in those areas of self-concept, the child has scored higher than approximately 84% of the children in the standardization sample. The use of normalized T-scores also enables the examiner to compare a particular child's test results with those of the children in the standardization sample.

Moderator Variables

Moderator variables are respondent characteristics, such as age, sex, or ethnicity, that may affect scores on a psychological test independently of the construct that the test is attempting to measure.
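The normalized T-score conversion described in the Derivation of Standard Scores section above can be sketched as follows: map each raw score to its percentile rank in the normative sample, convert the percentile to a normal deviate z, and compute T = 50 + 10z. The midpoint-percentile convention and the toy sample below are illustrative assumptions; the exact smoothing used to build the published norms tables is not reproduced here.

```python
# Sketch of a normalized (area) T-score transformation: raw score ->
# percentile rank in the normative sample -> normal deviate z -> T = 50 + 10z.
# The midpoint convention for ties is an illustrative choice, not the
# manual's published smoothing procedure.
from statistics import NormalDist

def normalized_t(raw, norm_sample):
    below = sum(1 for x in norm_sample if x < raw)
    ties = sum(1 for x in norm_sample if x == raw)
    percentile = (below + 0.5 * ties) / len(norm_sample)  # midpoint convention
    z = NormalDist().inv_cdf(percentile)
    return 50 + 10 * z

# A raw score at the sample median comes out near T = 50.
sample = list(range(1, 100))
print(round(normalized_t(50, sample)))  # 50
```

Because the transformation is applied through percentile ranks rather than a linear z-score formula, every scale ends up with the same approximately normal T-score distribution, which is what makes the cross-scale comparisons described above possible.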
When groups that differ on moderator variables also perform differently on a test, it can be difficult to interpret test results without separate norms for each group. This section discusses moderator variables that may impact Piers-Harris 2 results. The section also presents statistical comparisons that determine whether there are any meaningful differences between groups defined by the moderator variables in the standardization sample.

One of the advantages of the Piers-Harris 2 standardization sample, as compared with the original Piers-Harris standardization sample (Piers, 1984), is its demographic representativeness. Whereas the original sample was recruited from a single school district in Pennsylvania, the new sample consists of an ethnically diverse group of children from all regions of the United States. This allows statistical analysis of the potential moderating effects of ethnicity on Piers-Harris 2 scores.

With a sample as large as the Piers-Harris 2 standardization sample, it is necessary to set clear guidelines as to what constitutes a practically meaningful difference between average group scores. Because the power to detect group differences is so high in large-sample analyses, differences as small as 1 or 2 T-score points are often statistically significant. From a practical perspective, however, a 1- or 2-point change in a test score yields virtually no new information about the individual who took the test. In addition, such small differences are usually within the standard error of measurement of the test, meaning that the difference is not likely to be reliable across administrations of the test. An effect size metric can be used to evaluate whether statistically significant differences between large groups are also practically meaningful (Cohen, 1992). The relevant statistic is calculated by dividing the mean difference between groups on a score by the standard deviation of that score for the two groups combined.
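The effect-size statistic just described (a mean difference expressed in pooled-standard-deviation units, i.e., Cohen's d) can be sketched in a couple of lines. The example values are taken from Table 13 (the male BEH mean of 48.2) and assume the pooled SD of 10 that T-scores approximate.

```python
# Sketch of the effect-size metric described above: the difference between two
# means divided by the pooled standard deviation. With T-scores the pooled SD
# is about 10, so 1 effect-size unit corresponds to roughly 10 T-score points.

def effect_size(mean_a, mean_b, pooled_sd):
    """Cohen's d: absolute mean difference in pooled-SD units."""
    return abs(mean_a - mean_b) / pooled_sd

# Example: the male BEH group mean from Table 13 (48.2) compared against the
# overall standardization-sample mean of 50, assuming a pooled SD of 10.
d = effect_size(48.2, 50.0, pooled_sd=10.0)
print(round(d, 2))  # 0.18, i.e., a 1.8-point deviation, within the small range
```

This is why the decision rules in the text can be stated directly in T-score points: a 3-point deviation is an effect size of about 0.3, and a 5-point deviation about 0.5.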
The use of T-scores simplifies these calculations. Because the pooled standard deviation always approximates 10 for any T-score comparison, 1 effect-size unit equals about 10 T-score points. Effect sizes of 1 to 3 T-score points are considered small and not clinically meaningful; effect sizes between 3T and 5T are considered moderate; and those greater than 5T are considered large.

Moderator effects for Piers-Harris 2 scale scores were evaluated by calculating average T-scores for groups defined by the moderator variables (e.g., males and females, ethnic groups). These average group scores were then compared to the average T-score for the entire standardization sample (which is 50 by definition). For each moderator, the following decision rule was applied: If the deviation of a particular moderator group from the overall standardization sample was equivalent to a small effect size, the moderator was considered meaningless in practical applications. In these cases, stratification of the normative data was not required. If, on the other hand, the deviation constituted a moderate to large effect size, the pattern of T-score differences among the moderator groups was examined more closely to determine whether it was consistent with other known characteristics of the groups under consideration. If robust, replicable patterns of group differences were present, the use of separate norms for each group was considered.

Age. Many psychological theories predict substantial changes in self-concept between certain phases of development (e.g., between adolescence and adulthood). Certain measures of self-concept, especially those designed for both adolescents and adults, have employed age-stratified norms (e.g., Tennessee Self-Concept Scale, Second Edition; Fitts & Warren, 1996). However, empirical research has not supported an association between age and self-concept scores for individuals between the ages of 8 and 23 (see Wylie, 1979, for a review).
Table 12 presents average T-scores for age groupings in the Piers-Harris 2 standardization sample. Out of all the possible comparisons, only one domain scale score (INT) differed from the sample mean by more than 3 T-score points, and this occurred only for a single age group (7- to 8-year-olds). There was no evidence of a consistent pattern of clinically significant age-related differences in Piers-Harris 2 scores. Thus, there appeared to be no need for age-stratified norms.

Sex. Generally speaking, investigators have failed to find significant sex differences in self-concept (Hattie, 1992;
Wylie, 1979). Early studies using the Piers-Harris as a measure of general self-concept (e.g., Piers & Harris, 1964) were consistent with this notion.

However, more recent studies have suggested sex differences in specific aspects of self-concept. In particular, males tend to report less anxiety and more problem behaviors than females. Lewis and Knight (2000) examined moderator variables for the original Piers-Harris in 368 intellectually gifted children in Grades 4 to 12. Males and females did not differ on the Total score, but there were sex differences on three cluster scales. Females scored significantly higher than males on the Behavior and Intellectual and School Status scales, whereas the reverse pattern occurred on the Anxiety scale. Similar sex differences on the Anxiety scale were reported by Lord (1971) and Osborne and LeGette (1982). Furthermore, in the normative sample for the cluster scales described in the 1984 edition of the Piers-Harris Manual, males rated themselves significantly lower in self-concept on the Behavior scale than females. Additional studies related to sex differences are listed by topic in Appendix A.

Average T-scores for males and females in the Piers-Harris 2 standardization sample are provided in Table 13. There were slight sex differences on the BEH and FRE scales, consistent with the previously cited literature. However, the effect sizes were not large enough to constitute a practically meaningful difference. These findings do not support stratification of the Piers-Harris 2 norms by sex.

Ethnicity. The original Piers-Harris norms were based on an ethnically homogeneous sample of public-school children from one Pennsylvania school district (Piers, 1984).
Because it was not possible to examine ethnic differences in self-concept using these data, the 1984 edition of the Piers-Harris Manual relied on an extensive review of the literature documenting comparisons among various ethnic groups on Piers-Harris scores. Those studies, along with more recent literature relating to this issue, are listed by topic in Appendix A.

Certain themes emerge from this literature. Most importantly, ethnicity per se does not appear to be a significant determinant of self-concept. Rather, children in certain ethnic or cultural groups may be at higher risk for the kinds of ongoing stressors that can impact self-esteem over the long run. These experiences may include racial discrimination, academic failure due to language deficits or other problems, and general difficulties adjusting to the mainstream culture. On the other hand, adequate social support seems to insulate children from at least some of these potentially harmful factors.

Cultural differences in response style must also be taken into account. In some cultures, saying positive things about oneself is frowned upon because it is considered boastful. Also, individuals from certain cultures may be less inclined to provide socially desirable responses on psychological tests. Either of these factors could cause spuriously low Piers-Harris 2 scores. Alternatively, individuals from groups that tend to be more defensive or reluctant to disclose negative emotions may produce spuriously high scores.
Table 12
Average T-Scores by Age Group in the Piers-Harris 2 Standardization Sample

                                            Age group
                                          7–8      9–10     11–12    13–14    15–16    17–18
Self-Concept scale                        (n=188)  (n=231)  (n=277)  (n=271)  (n=255)  (n=165)
Total (TOT)                               51.2     51.5     51.3     48.4     48.6     49.1
Behavioral Adjustment (BEH)               51.2     50.9     51.4     47.8     47.6     50.1
Intellectual and School Status (INT)      53.3     52.0     50.0     47.5     48.1     49.2
Physical Appearance and Attributes (PHY)  50.4     50.2     50.2     48.4     50.6     49.5
Freedom From Anxiety (FRE)                49.9     50.6     50.7     50.4     48.2     48.8
Popularity (POP)                          47.7     50.3     51.2     50.5     50.0     48.6
Happiness and Satisfaction (HAP)          50.7     51.0     50.7     47.9     48.5     48.8

Note. N = 1,387.

Table 13
Average T-Scores by Sex in the Piers-Harris 2 Standardization Sample

Self-Concept scale                        Male (n=689)   Female (n=698)
Total (TOT)                               49.6           50.4
Behavioral Adjustment (BEH)               48.2           51.3
Intellectual and School Status (INT)      48.7           51.0
Physical Appearance and Attributes (PHY)  49.6           50.1
Freedom From Anxiety (FRE)                51.6           48.1
Popularity (POP)                          49.6           50.2
Happiness and Satisfaction (HAP)          49.1           50.0

Note. N = 1,387.
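The decision rule described earlier (comparing each group mean with the overall mean of 50 and treating deviations of 3 T-score points or less as small) can be applied programmatically to the age-group means. A minimal sketch, with the values transcribed from Table 12; it flags the single INT deviation noted in the text.

```python
# Sketch of the moderator screen described above, applied to the Table 12
# age-group means: flag any group mean that deviates from the overall mean
# of 50 by more than 3 T-score points (a more-than-small effect size).
AGE_GROUPS = ["7-8", "9-10", "11-12", "13-14", "15-16", "17-18"]
TABLE_12 = {
    "TOT": [51.2, 51.5, 51.3, 48.4, 48.6, 49.1],
    "BEH": [51.2, 50.9, 51.4, 47.8, 47.6, 50.1],
    "INT": [53.3, 52.0, 50.0, 47.5, 48.1, 49.2],
    "PHY": [50.4, 50.2, 50.2, 48.4, 50.6, 49.5],
    "FRE": [49.9, 50.6, 50.7, 50.4, 48.2, 48.8],
    "POP": [47.7, 50.3, 51.2, 50.5, 50.0, 48.6],
    "HAP": [50.7, 51.0, 50.7, 47.9, 48.5, 48.8],
}

flags = [(scale, AGE_GROUPS[i], t)
         for scale, means in TABLE_12.items()
         for i, t in enumerate(means)
         if abs(t - 50) > 3]
print(flags)  # [('INT', '7-8', 53.3)] -- the single deviation noted in the text
```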
The preceding remarks can be viewed as a caveat to consideration of ethnic differences in the Piers-Harris 2 norms. Unlike the 1984 manual's norms, the new norms are based on an ethnically diverse sample, and so it is possible to investigate directly the moderating effects of ethnicity. The Piers-Harris 2 standardization sample included sufficiently large numbers of Black, Hispanic, and White participants to permit evaluation for meaningful ethnic differences. Table 14 presents average T-scores for these groups. None of the individual scale scores differed by more than 3 T-score points from the grand mean for the standardization sample. These small effect sizes suggest that Piers-Harris 2 scores can be interpreted without the need for separate norms for these groups.

Table 14 also presents average T-scores for Asians, Native Americans, and those who reported their ethnicity as "Other." Because of the small cell sizes, these results are not reliable enough to aid interpretation of Piers-Harris 2 scores. Nevertheless, the findings may be of interest to those planning research on self-concept with the Asian or Native American populations.

Socioeconomic status. Generally speaking, the studies that have examined socioeconomic status (SES) and self-concept have not found significant relationships between these constructs (Hattie, 1992; Wylie, 1979). There are a few exceptions to this trend. For example, Osborne and LeGette (1982) studied a sample of 374 middle school students and found that lower SES was associated with lower self-concept scores. As noted with respect to ethnicity, the relationships between SES and self-concept are likely to be complex, in that SES may be a marker for other variables, such as level of stress within the family, that affect self-concept more directly.
As noted in Table 6, SES data are available for about half of the Piers-Harris 2 standardization sample, but there is good reason to believe that these data provide a good approximation of the SES distribution in the entire sample.

Table 14
Average T-Scores by Race/Ethnicity in the Piers-Harris 2 Standardization Sample

                                                                          Native
                                          Black    Hispanic  White    Asian   American  Other
Self-Concept scale                        (n=255)  (n=102)   (n=943)  (n=17)  (n=16)    (n=50)
Total (TOT)                               48.7     47.4      50.8     48.0    50.9      46.8
Behavioral Adjustment (BEH)               48.4     47.7      50.5     48.7    50.6      47.1
Intellectual and School Status (INT)      48.9     47.6      50.6     47.9    50.4      46.8
Physical Appearance and Attributes (PHY)  50.7     48.8      49.8     48.8    49.2      49.0
Freedom From Anxiety (FRE)                48.1     47.5      50.8     46.6    51.1      47.0
Popularity (POP)                          48.9     47.9      50.6     49.0    48.2      47.6
Happiness and Satisfaction (HAP)          48.0     48.4      50.2     51.4    51.2      47.5

Note. N = 1,383. Due to the small cell sizes for the Asian, Native American, and Other samples, the results for those groups are not reliable enough to aid in interpreting Piers-Harris 2 scores.

Table 15
Average T-Scores by Head of Household's Education Level in the Piers-Harris 2 Standardization Sample

                                          Not HS     HS         Some      College    Post-
                                          graduate   graduate   college   graduate   graduate
Self-Concept scale                        (n=73)     (n=182)    (n=100)   (n=157)    (n=161)
Total (TOT)                               47.0       49.8       50.8      51.9       50.3
Behavioral Adjustment (BEH)               46.6       48.2       49.6      51.3       50.0
Intellectual and School Status (INT)      47.2       49.2       50.8      51.5       50.1
Physical Appearance and Attributes (PHY)  48.5       50.7       50.8      50.8       50.9
Freedom From Anxiety (FRE)                47.2       50.5       49.1      51.6       50.1
Popularity (POP)                          48.1       50.8       51.4      51.2       50.5
Happiness and Satisfaction (HAP)          47.3       49.6       49.8      51.3       49.9

Note. N = 673. See text for discussion of missing data.

Table 15 shows average T-scores for groups defined by the
reported education level of the head of household. There is a trend toward lower self-concept scores in the group that is lowest on this SES index (the head of the household did not graduate from high school), which is consistent with Osborne and LeGette's (1982) findings. However, the differences between this group's scores and the grand means for the entire standardization sample remain in the realm of small effect sizes (two scales, BEH and TOT, differ by slightly more than 3 T-score points from the grand mean). Considering all of the SES groups, there is no consistent pattern of clinically meaningful score differences associated with increasing SES. These data do not indicate a need for Piers-Harris 2 norms that are stratified by SES.

U.S. geographic region. Although there is no theoretical reason to expect regional differences in Piers-Harris scores, the possibility of such differences was explored in the current data set. Table 16 presents the average T-scores by U.S. geographic region for the Piers-Harris 2 standardization sample. As with the other moderators, there is no consistent pattern of clinically meaningful differences among regions.

Intelligence and academic achievement. Numerous investigators have examined the relationships among intelligence, academic achievement, and self-concept (see Appendix A). Generally speaking, researchers have found moderate positive correlations between measures of achievement and self-concept scores. Furthermore, as hypothesized by Shavelson et al. (1976), the relationship between these two constructs appears to be due to a specific academic component of self-concept, rather than to generalized self-concept. For example, two studies of elementary-school and middle-school students demonstrated that specific measures of academic self-concept were stronger predictors of achievement than the Piers-Harris Total score (Lyon & MacDonald, 1990; Schike & Fagan, 1994).
In contrast to achievement, researchers have typically found only weak associations between intelligence test scores and measures of self-concept (e.g., Black, 1974; McIntire & Drummond, 1977). With regard to the Piers-Harris 2 normative sample, the literature suggests that children who perform well on academic tasks are likely to have higher than average scores on the INT cluster scale, but not on the other domain scales or the Total score. However, this must remain a tentative hypothesis. Academic achievement data are not available for the Piers-Harris 2 standardization sample, so it is not possible to study empirically the effects of this moderator.

Summary. The Piers-Harris 2 standardization sample has been examined for group differences related to the potential moderating variables of age, sex, ethnicity, socioeconomic status (as indexed by education level of head of household), and U.S. geographic region. Where effects of these moderators were present, they were small and applied only to one or two domain scales. None of the analyses identified clinically meaningful patterns of differences that were consistent with other knowledge about the groups in question. Consequently, it was determined that one set of nonstratified normative data could be used for interpreting Piers-Harris 2 scores.

Nevertheless, the relationships among moderators are complex and deserve additional study. Although the current analyses do not support stratified norms for the Piers-Harris 2, several prior studies have found group differences on each of the moderators described in the previous section. These studies vary greatly in their methodological quality. The authors of the Piers-Harris 2 recognize that particular clinical and research applications of this measure may raise concerns about the suitability of the nonstratified norms presented in this manual.
For example, special care should be taken in interpreting Piers-Harris 2 results for children with mental retardation, children with severe psychiatric disorders, and children from ethnic groups not well represented in the standardization sample. It is suggested that in these and similar situations users familiarize themselves with the appropriate research literature to determine what caveats should be considered in using the Piers-Harris 2 norms. To facilitate this process, Appendix B of this manual includes a brief review of research relating to the use of the Piers-Harris for children with special needs. Other studies involving special populations are listed by topic in Appendix A.

Table 16
Average T-Scores by U.S. Geographic Region in the Piers-Harris 2 Standardization Sample

                                          Northeast   Midwest   South     West
Self-Concept scale                        (n=316)     (n=424)   (n=463)   (n=184)
Total (TOT)                               50.8        50.8      49.4      48.2
Behavioral Adjustment (BEH)               50.2        50.9      48.3      49.9
Intellectual and School Status (INT)      50.5        51.3      48.8      47.9
Physical Appearance and Attributes (PHY)  51.0        49.9      50.3      46.8
Freedom From Anxiety (FRE)                49.9        50.4      49.9      48.4
Popularity (POP)                          50.4        49.4      50.7      48.4
Happiness and Satisfaction (HAP)          50.4        50.2      48.6      49.2

Note. N = 1,387.
5 TECHNICAL PROPERTIES

During the nearly four decades since the introduction of the Piers-Harris, numerous investigators have studied the technical characteristics of the instrument. This chapter reviews selected studies from that literature and presents new reliability and validity evidence from the Piers-Harris 2 standardization sample. As demonstrated in the previous chapter, the Piers-Harris 2 is essentially identical to the original measure from a psychometric perspective. Researchers may therefore proceed to use the Piers-Harris 2 with the confidence that its reliability and validity are strongly upheld by the extensive database pertaining to the original scale.

Reliability

Reliability concerns the stability of scores on a psychological test. A reliable test should produce consistent scores for the same individual when he or she takes the test on different occasions and under varying conditions of examination. A reliable test should also yield scores that are relatively free of measurement error, or variance in a score due to chance factors rather than true variance in the psychological construct being assessed. Reliability estimates are expressed in terms of correlation coefficients that vary from 0 to 1, with higher figures indicating greater reliability. Reliability is considered the most basic psychometric property of a test, because the discussion of whether a test measures what it is supposed to measure cannot even begin until the test's reliability has been established. In other words, reliability is a necessary, but not a sufficient, condition for validity. This section will consider two aspects of reliability: internal consistency and test-retest reliability. The section will describe new reliability data collected for the Piers-Harris 2 revision, and will also review the extensive literature of reliability studies on the original Piers-Harris.
Internal Consistency of the Original and Revised Instruments

The reliability of a psychological test is determined in large part by how well the test items sample the content domain being assessed. A set of test items that performs well in this regard has the quality of internal consistency. In an internally consistent test, items tend to be highly intercorrelated, presumably because they are measuring the same construct. The primary indexes of internal consistency are coefficient alpha (Cronbach, 1988) or, if the items are dichotomous, Kuder-Richardson Formula 20 (KR-20; Kuder & Richardson, 1937). These statistics measure the average intercorrelations among items in a test and are thought to establish an upper limit for the reliability of a test. Another method for assessing the consistency of content sampling in a test is split-half reliability, in which the test is split into equivalent halves for each individual, and the relevant statistic is the Pearson correlation between the two halves.

Piers-Harris 2. Table 17 presents internal consistency estimates for the six domain scales and Total (TOT) score of the Piers-Harris 2. Alphas are reported for the entire standardization sample and for six age strata. These figures demonstrate good internal consistency and are comparable to the values reported for the original Piers-Harris. The age-stratified values are presented to address concerns regarding the reliability of self-concept scores with younger children. These analyses show that the TOT scale and five of the six domain scales maintain good internal consistency throughout the six age ranges. The Popularity (POP) scale demonstrates weaker internal consistency for the youngest children (7- and 8-year-olds; alpha = .60) and also, unexpectedly, for the oldest adolescents (17- and 18-year-olds; alpha = .62). These findings suggest that the POP scale should be interpreted cautiously for children in these age ranges.
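For dichotomously scored items such as the Piers-Harris 2's yes/no items, KR-20 is the special case of coefficient alpha, computed as (k/(k − 1))(1 − Σp(1 − p)/σ²total), where k is the number of items, p the proportion endorsing each item, and σ²total the variance of total scores. A minimal sketch, using made-up data rather than standardization-sample responses:

```python
# Sketch of Kuder-Richardson Formula 20 for dichotomously scored items:
# KR-20 = (k / (k - 1)) * (1 - sum(p_j * (1 - p_j)) / var(total)).
# The data set below is a tiny illustrative example, not real item responses.

def kr20(data):
    """data: list of respondents, each a list of 0/1 item scores."""
    n, k = len(data), len(data[0])
    p = [sum(row[j] for row in data) / n for j in range(k)]
    totals = [sum(row) for row in data]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    return (k / (k - 1)) * (1 - sum(pj * (1 - pj) for pj in p) / var_total)

scores = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
print(kr20(scores))  # 0.75 for this small illustrative data set
```

The same function applied to multi-point items (with the item-variance sum in place of Σp(1 − p)) would give coefficient alpha in its general form.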
The standard error of measurement (SEM) statistic translates the alpha coefficient into practical terms by providing an index of how close an individual test score is likely to be to the "true" score that would be obtained if there were no measurement error. (The formula for calculating SEM for a given scale is SEM = SD × √(1 − r), where SD is the standard deviation and r is the reliability coefficient for that scale.) SEM values for the Piers-Harris 2 scales are listed in Table 17.

Original Piers-Harris. Table 18 summarizes several studies that reported alpha, KR-20, or split-half coefficients for the 80-item Total score of the original Piers-Harris. These values, which approach or exceed .90, demonstrate that the original scale has excellent internal consistency for both younger and older children.
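The SEM formula given above can be checked against the published tables. A minimal sketch using the Total score raw-score SD from Table 11 (10.2) and its alpha from Table 17 (.91), which reproduces the tabled SEM of 3.07 to within rounding:

```python
# Sketch of the SEM formula given above: SEM = SD * sqrt(1 - r),
# where SD is the scale's standard deviation and r its reliability.
import math

def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

# Total score: raw-score SD of 10.2 (Table 11) with alpha = .91 (Table 17).
print(round(sem(10.2, 0.91), 2))  # 3.06, matching the tabled 3.07 to within rounding
```

The same check works for the domain scales, e.g., BEH (SD 2.9, alpha .81) yields about 1.26 against the tabled 1.27; the small discrepancies reflect rounding of the published SDs.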
Alpha coefficients for the original six cluster scales are available from several samples. Piers (1984) reported alpha coefficients from two samples. The first was a combination of the cluster scale standardization sample (Piers, 1984), which consisted of 485 4th through 10th graders, and 97 additional children referred from psychiatric clinics. In this sample, cluster scale alpha coefficients ranged from .73 (Happiness and Satisfaction) to .81 (Behavior), with a mean of .76. The second sample was the 1996 WPS TEST REPORT™ sample, a regionally and ethnically diverse pool of 1,772 individuals whose Piers-Harris responses were scored by the WPS TEST REPORT™ service. In this sample, cluster scale alpha coefficients ranged from .76 (Happiness and Satisfaction) to .83 (Physical Appearance and Attributes), with a mean of .80. These reliability figures indicate that the cluster scales of the original Piers-Harris have good internal consistency.

Hattie (1992) reported cluster scale alpha coefficients for a sample of 135 Australian students in Grades 10 through 12. Alphas ranged from .70 (Happiness and Satisfaction) to .82 (Physical Appearance and Attributes), with a mean of .75. Hattie described similar cluster scale alphas for another sample of 367 children, except that the internal consistency of the Happiness and Satisfaction scale was somewhat low (alpha = .64). This scale also had the lowest alpha coefficient in the three samples described in the previous paragraph. This finding suggests that the Happiness and Satisfaction scale may be more multidimensional than the other Piers-Harris scales, a notion supported by several factor analyses described in the next section.

Test-Retest Reliability of the Original Piers-Harris

Test-retest reliability measures the extent to which scores for a single individual are consistent over time and across settings.

Table 17
Piers-Harris 2 Internal Consistency Estimates

                                                 Entire standardization
                                                 sample (N = 1,387)         Alpha by age group
Self-Concept scale                        Items    Alpha    SEM     7–8   9–10  11–12  13–14  15–16  17–18
Total (TOT)                                  60     .91    3.07     .89   .92    .92    .91    .93    .89
Behavioral Adjustment (BEH)                  14     .81    1.27     .75   .84    .81    .81    .81    .76
Intellectual and School Status (INT)         16     .81    1.50     .76   .82    .81    .82    .82    .72
Physical Appearance and Attributes (PHY)     11     .75    1.29     .72   .75    .80    .77    .73    .65
Freedom From Anxiety (FRE)                   14     .81    1.46     .77   .82    .82    .82    .84    .80
Popularity (POP)                             12     .74    1.37     .60   .72    .80    .79    .78    .62
Happiness and Satisfaction (HAP)             10     .77    1.05     .71   .82    .78    .77    .78    .71

Note. Age group ns: 7–8, n = 188; 9–10, n = 231; 11–12, n = 277; 13–14, n = 271; 15–16, n = 255; 17–18, n = 165.

Table 18
Studies Reporting Internal Consistency Coefficients for Piers-Harris Total Score

Study                             Sample                               Age or grade     Sex        n   Index        r
Center and Ward (1986)            Nonclinical Australian               Grade 2          Both     183   Split-half  .89
                                                                       Grade 3          Both     104   Split-half  .91
                                                                       Grades 4, 6, 9   Both     114   Split-half  .91
Cooley and Ayres (1988)           Mixed nonclinical/special education  Grades 6–8       Both     155   Alpha       .92
Franklin, Duley, et al. (1981)    Nonclinical                          Grades 4–6       Both     180   Alpha       .92
Lefley (1974)                     Nonclinical Native American          7–14 years       Both      53   Split-half  .91
Piers (1973)                      Nonclinical                          Grade 6          Female    70   KR-20       .88
                                                                                        Male      76   KR-20       .90
                                                                       Grade 10         Female    84   KR-20       .88
                                                                                        Male      67   KR-20       .93
Smith and Rogers (1978)           Learning disabled                    6–12 years       Both     206   Alpha       .89
Winne, Marx, and Taylor (1977)    Nonclinical                          Grades 3–6       Female    42   Alpha       .90
                                                                                        Male      61   Alpha       .90
WPS TEST REPORT™ (1996)a          Mixed nonclinical/clinic referred    7–19 years       Both   1,772   Alpha       .93
Yonker, Blixt, and Dinero (1974)  Nonclinical                          Grade 10         Both     208   Alpha       .90

a This sample is a regionally and ethnically diverse pool of individuals from clinical and nonclinical samples whose tests were scored by the WPS TEST REPORT™ service.
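The alpha and SEM columns of Table 17 are linked by a standard formula, SEM = SD × sqrt(1 − alpha). The following is a minimal computational sketch (the function names and the toy data are illustrative, not part of the Piers-Harris scoring materials):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) matrix of 0/1 item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def sem(sd: float, alpha: float) -> float:
    """Standard error of measurement: SEM = SD * sqrt(1 - alpha)."""
    return sd * np.sqrt(1 - alpha)

# Worked check against Table 17: TOT alpha = .91 and SEM = 3.07 imply a score
# SD of SEM / sqrt(1 - alpha) = 3.07 / 0.30, i.e., roughly 10 (a T-score metric).
print(round(3.07 / np.sqrt(1 - 0.91), 1))  # → 10.2
```

The same SEM arithmetic underlies the confidence intervals reported for any scale score: a larger alpha shrinks the SEM relative to the score SD.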
Personality measures such as the Piers-Harris are usually assumed to measure relatively enduring characteristics of individuals, and so are expected to produce scores that remain stable across time. However, self-concept may be less stable among younger children, whose sense of self is still developing (Harter, 1983). Thus, low test-retest reliability in younger children may be partially due to the instability of the underlying construct, rather than measurement error per se.

Test-retest reliability data are not available for the Piers-Harris 2 revision. However, a number of studies have investigated the test-retest reliability of the original Piers-Harris Total score, in both normal children and special populations. Most of these studies were completed in the 1960s, 1970s, and early 1980s; they are summarized in Table 19. When reviewing these studies, it is important to note that more heterogeneous samples are expected to yield higher reliability coefficients, due to greater variance of scores. If for any reason a small standard deviation is obtained in a given sample, the test-retest coefficient is expected to be lower. Furthermore, it is not surprising that shorter test-retest intervals are generally associated with higher reliability estimates, because there is presumably less chance that environmental or developmental changes will affect children’s self-concepts during shorter intervals. In fact, studies with retest intervals of 6 months or longer are probably best conceptualized as measuring the stability of the construct of self-concept over time, rather than test-retest reliability per se. Table 19 is organized to reflect this distinction.

Test-retest reliability in general samples.
An early study by Piers and Harris (1964) investigated the stability of the Piers-Harris in the original standardization sample, using a 95-item experimental version of the scale and a retest interval of 4 months. Reliability coefficients for 3rd, 6th, and 10th graders were .72, .71, and .72, respectively. These reliability estimates were deemed satisfactory by the authors, especially given the relatively long retest interval and the fact that the scale was still in the development stage. The 80-item Piers-Harris, though shorter than the experimental version, was shown to have marginally better stability for both 2-month (r = .77) and 4-month (r = .77) retest intervals (Wing, 1966). Additional studies of nonclinical students report reliability coefficients ranging from .65 to .81 over 2- to 5-month retest intervals (McLaughlin, 1970; Platten & Williams, 1979, 1981; Shavelson & Bolus, 1982).

Hattie (1992) reported a test-retest study of the Piers-Harris Total score and the six cluster scales, using a sample of 135 Australian students in Grades 10 through 12 and a retest interval of 4 weeks. The reliability coefficients were as follows: Total, .87; Behavior, .80; Intellectual and School Status, .84; Physical Appearance and Attributes, .88; Anxiety, .80; Popularity, .80; Happiness and Satisfaction, .65.
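A test-retest coefficient of this kind is simply the Pearson correlation between the same children’s scores on the two occasions. The sketch below is an illustrative simulation only; the trait and error standard deviations are invented, not Piers-Harris parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
trait = rng.normal(50, 10, size=n)        # stable self-concept component
time1 = trait + rng.normal(0, 5, size=n)  # occasion-specific error, occasion 1
time2 = trait + rng.normal(0, 5, size=n)  # occasion-specific error, occasion 2

# Test-retest reliability: Pearson correlation across the two occasions.
r = np.corrcoef(time1, time2)[0, 1]

# With trait variance 100 and error variance 25, the expected value of r is
# 100 / (100 + 25) = .80, in the range reported in Table 19. Restricting the
# trait SD (a more homogeneous sample) lowers r, as noted in the text above.
print(round(r, 2))
```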
Table 19
Studies Reporting on Test-Retest Reliability and Construct Stability for Piers-Harris Total Score

Study                            Sample                                    Age or grade     Sex     n    Retest interval    r

Test-retest reliability (retest intervals of less than 6 months)
Alban Metcalfe (1981)            British                                   11–20 years      Both   182   2 weeks      .69
Lefley (1974)                    American Indian                           7–14 years       Both    40   10 weeks     .73
McLaughlin (1970)                Private school                            Grade 5          Male    67   5 months     .75
                                                                           Grade 6          Male    98   5 months     .72
                                                                           Grade 7          Male    69   5 months     .71
Piers and Harris (1964)a         Public school                             Grade 3          Both    56   4 months     .72
                                                                           Grade 6          Both    66   4 months     .71
                                                                           Grade 10         Both    60   4 months     .72
Platten and Williams (1979)      Mixed ethnic groups                       Grades 4–6       Both   159   10 weeks     .65
Platten and Williams (1981)                                                                 Both   173   10 weeks     .75
Querry (1970)                    Normal speech                             Grades 3–4       Both    10   3–4 weeks    .86
                                 Mild articulation disorders                                Both    10   3–4 weeks    .96
                                 Moderate articulation disorders                            Both    10   3–4 weeks    .83
Shavelson and Bolus (1982)       Public school                             Grades 7–8       Both    99   5 months     .81
Tavormina (1975)                 Chronic medical illness                   M = 12 years     Both    94   3–4 weeks    .80
Wing (1966)                      Public school                             Grade 5          Both   244   2 months     .77
                                                                                                         4 months     .77

Construct stability (retest intervals of 6 months or more)
Henggeler and Tavormina (1979)   Mexican American                          M = 10.5 years   Both    12   1 year       .51
Smith and Rogers (1977)          Learning disability                       6–12 years       Both    89   6 months     .62
Wolf (1981)                      Mental retardation/emotional disturbance  11–16 years      Both    39   8 months     .42

a Based on 95-item experimental version of the Piers-Harris.
Test-retest reliability in special populations.

Many studies have reported test-retest reliability estimates for samples of minority and special-needs students. These studies have differed in sample characteristics (e.g., age, sex, and group membership) and retest intervals, and some have modified the administration procedures or the instrument itself. It is therefore not unexpected that the reliability coefficients vary considerably across these studies.

Minority students. The stability of the Piers-Harris in populations of diverse ethnic or national backgrounds can be ascertained from three studies. Lefley (1974) found a reliability coefficient of .73 for a sample of Native American students tested over a 10-week interval. Henggeler and Tavormina (1979) obtained 1-year retest data from 12 Mexican American migrant children. The reported reliability coefficient of .51 is best understood as representing construct stability, as opposed to test-retest reliability. Viewed in this light, the coefficient is actually rather high, given the long retest interval and small sample size. Finally, Alban Metcalfe (1981) reported a 2-week test-retest coefficient of .69 in a sample of students from Northern England. This study used a modified Piers-Harris, with “Americanisms” removed and the response format altered to a 5-point Likert-type scale.

Students with special needs. Studies of students with special needs have included investigations of children with mental retardation, children with learning and speech disabilities, and children with chronic medical illness. Querry (1970) compared the self-concepts of 3rd and 4th graders with normal speech, mild articulation disorders, and moderate articulation disorders. The reported test-retest reliability coefficients were .86, .96, and .83, respectively, which is notable considering the small sample size in each group (n = 10).
Tavormina (1975) obtained a stability coefficient of .80 with a sample of children with chronic illnesses. These latter two studies were based on relatively short (3- to 4-week) retest intervals. In a study of students with learning disabilities (ages 6 to 12), Smith and Rogers (1977) reported a reliability coefficient of .62 over a retest interval of 6 months. In a study of 39 students identified as having mental retardation or emotional disturbance, Wolf (1981) reported a stability coefficient of .42 over an 8-month interval. Again, this study is probably best construed as an example of construct stability rather than test-retest reliability. The long retest interval, small sample size, and relatively low variance of scores (compared with the Piers-Harris normative sample) probably all contributed to the low coefficient.

Unreliability due to random responding.

Wylie (1974) has suggested that low scores on self-concept tests may be less reliable than higher scores, particularly when the low scores are generated by younger children, whose inability to read or understand the items could lead to random responding. A randomly answered questionnaire tends to produce a score at or near the scale’s midpoint. In normative terms, this is likely to be a low self-concept score, simply because most self-concept scales produce negatively skewed distributions, with scores accumulating disproportionately above the scale’s midpoint.

Random response patterns need to be considered when interpreting the results of test-retest reliability studies. Random responding tends to produce inconsistent item responses between two testing occasions, which is a hallmark of poor reliability. However, children who respond randomly are likely to obtain low self-concept scores on both occasions, for the reasons described previously.
This creates a situation in which item-level responding is in fact unreliable, yet the test-retest correlation is spuriously high, because both total scores fall in the low range. Such a result might be interpreted as reflecting a substantial relationship between the two testing occasions when, in reality, the only common element is the randomness of the responding.

These speculations raise two empirical questions: (a) Do Piers-Harris scores below the mean exhibit more item instability across testing occasions than scores above the mean? and (b) If this differential instability exists, is it due to random responding? The Smith and Rogers (1977) study provides data relevant to these questions. Smith and Rogers administered the Piers-Harris to 89 children (aged 6 to 12) with significant learning deficiencies, taking extra care to ensure that the children understood the items. The sample was split into three groups based on initial Piers-Harris Total score (high, middle, or low); the groups did not differ significantly in age or IQ. The Piers-Harris was administered again 6 months later, and an index of instability was calculated for each child as the number of item inconsistencies between the first and second testings. Results indicated that the index of instability for the high self-concept group was significantly lower than for the middle or low groups, but that the latter two groups did not differ significantly from each other.

Because those with low and middle scores did not differ in item-response stability, Smith and Rogers (1977) concluded that random responding alone could not account for low test-retest reliability among low scorers. If random responding were the cause of low reliability, then the middle-scoring group, which presumably included fewer random response sets, would have shown less item instability than the low-scoring group.
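The random-responding scenario described above is easy to demonstrate in simulation. The sketch below is purely illustrative (a generic 60-item yes/no scale, not actual Piers-Harris norms): coin-flip responding yields total scores near the midpoint on both occasions, even though roughly half of the individual item responses change between testings.

```python
import numpy as np

rng = np.random.default_rng(1)
n_children, n_items = 500, 60

# Each simulated "random responder" answers every yes/no item by coin flip
# on both testing occasions.
t1 = rng.integers(0, 2, size=(n_children, n_items))
t2 = rng.integers(0, 2, size=(n_children, n_items))

totals1, totals2 = t1.sum(axis=1), t2.sum(axis=1)
item_inconsistencies = (t1 != t2).sum(axis=1)  # Smith and Rogers's instability index

print(totals1.mean())               # near the midpoint (30): low in normative terms
print(item_inconsistencies.mean())  # about half the items flip between occasions
```

Both occasions thus produce similarly low totals (seemingly "stable" scores) while the item-level instability index stays high, which is exactly the dissociation discussed in the text.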
As an alternative explanation for their findings, Smith and Rogers proposed that, for younger children, the point in time at which self-concept stabilizes is a function of the favorableness of that self-concept. Under this view, younger children with high self-regard demonstrate stability of self-concept earlier than do same-age children with low self-regard. Once self-concept has become relatively stable (as in older children and adults), no differential item instability would be expected among those with high, middle, and low self-concept scores. According to this theory, then, greater variability in low-scoring children should be interpreted as reflecting an uncertain and poorly defined self-image rather than inadequate test-retest reliability. Clearly, more research is needed to test these hypotheses.

Validity

Validity refers to a test’s ability to measure accurately those psychological characteristics that it purports to measure.
Validity is a multidimensional concept that can be divided into several types, each of which plays a different role in establishing the usefulness and accuracy of a test (Anastasi, 1988). Content validity addresses the question of whether the test’s item content adequately samples the behavior that is being measured. A second type of validity, construct validity, refers to how well the test performs in measuring a theoretical psychological characteristic (e.g., introversion, neuroticism). Finally, criterion validity involves how well the test performs in predicting an individual’s performance or status in other activities (e.g., school achievement, response to psychiatric treatment).

Since the introduction of the original Piers-Harris in the early 1960s, researchers have produced a large body of evidence supporting the measure’s validity. The validation process for the Piers-Harris 2 is based on this existing literature. In addition, new data concerning construct validity were collected as part of the standardization study for the Piers-Harris 2.

This section begins by briefly examining the content validity of the Piers-Harris 2, in terms of the evolution and refinement of the original Piers-Harris item set. Content validity is the least important aspect of the validity of the Piers-Harris 2, simply because self-concept is by definition a theoretical entity, and thus the validity of the Piers-Harris 2 is more appropriately established by construct validation methods. In addition, the Piers-Harris 2 is frequently used as an outcome measure, which highlights the importance of establishing its criterion validity. The majority of this section is therefore devoted to construct and criterion validity, describing first the recent studies contributing to the Piers-Harris 2 revision, followed by a review of validity research pertaining to the original measure.

Content Validity of the Original and Revised Instruments

Piers-Harris 2.
The Piers-Harris 2 contains 60 items, 20 fewer than the item set of the original measure. The question arose as to whether this item reduction had any impact on the content validity of the Piers-Harris 2. This issue concerns only the TOT score, as the Piers-Harris 2 domain scales are almost identical to their counterparts in the original measure. To determine whether the item reduction would affect content validity, a clinical judge compared the deleted items (see Appendix E) with the retained items (see Table 4). Sixteen of the deleted items were judged to have adequate content overlap with the retained items, so it was concluded that deleting them would not result in an overall loss of content coverage in the Piers-Harris 2.

The judge identified four deleted items that had relatively little content overlap with the retained items: “I am good at making things with my hands,” “I can draw well,” “I am good in music,” and “I sleep well at night.” These items refer to specific abilities and attributes rather than to more general facets of self-concept (e.g., “I am a good person”). It was therefore decided that deleting them would have relatively little effect on the overall content sampling of the Piers-Harris 2, and that the content validity of the measure would not be threatened by proceeding with the planned item reduction.

Original Piers-Harris. The items of the original Piers-Harris were written with the goal of maximizing content validity. The universe of content to be sampled was defined as qualities that children reported liking or disliking about themselves (Jersild, 1952).
Children’s statements were grouped into the following categories: (a) physical characteristics and appearance; (b) clothing and grooming; (c) health and physical well-being; (d) home and family; (e) enjoyment of recreation; (f) ability in sports and play; (g) academic performance and attitudes toward school; (h) intellectual abilities; (i) special talents (music, arts); (j) “Just Me, Myself”; and (k) personality characteristics, inner resources, and emotional tendencies. As detailed in chapter 4, an initial pool of 164 pilot items was developed to reflect all of these categories. This pool was reduced to 80 items by eliminating items that discriminated poorly between high and low scores on the entire item set. The reduced item set contained items from all of the original categories.

The original factor analysis (Piers, 1963) identified six factors that became the cluster scales: Behavior, Intellectual and School Status, Physical Appearance and Attributes, Anxiety, Popularity, and Happiness and Satisfaction. These factors collapsed several of Jersild’s (1952) categories and emphasized items from the two most general categories (“Just Me, Myself” and personality characteristics, inner resources, and emotional tendencies). These general categories are presumably a better reflection of a child’s overall self-concept than narrower categories (e.g., special talents or enjoyment of recreation).

Construct Validity of the Piers-Harris 2

The Piers-Harris 2 standardization study provided two kinds of evidence related to the construct validity of the revised instrument. The study allowed detailed examination of the instrument’s structural characteristics, that is, the intercorrelations and item composition of the Piers-Harris 2 domain scales. Determining the interrelatedness of the domain scales helps establish whether they can be viewed as measuring separate components of overall self-concept.
In addition, concurrent data on other psychological tests were collected from subsamples of the Piers-Harris 2 standardization sample. These data permit assessment of convergent validity, or the extent to which the Piers-Harris 2 correlates with measures of similar psychological constructs.

Structural characteristics. Interscale correlation analysis and factor analysis are the two methods used to illuminate the structural characteristics of the Piers-Harris 2.

Interscale correlations. Table 20 presents interscale correlation coefficients for the Piers-Harris 2 standardization sample. Most of the scales correlate with each other in the moderate to high moderate range. Interscale correlations of this magnitude are to be expected for several reasons. First of all, each scale shares items with at least two other scales. The magnitude of the interscale correlation
coefficient reflects the number of shared items between scales. For example, the FRE and HAP scales, which share four items, correlate at r = .66; in contrast, BEH and POP, which share no items, have a much weaker association (r = .30). Furthermore, the theory underlying the Piers-Harris specifies that a child’s general sense of self-worth should influence his or her self-appraisals in specific areas of functioning. This suggests that the factor-analytically derived subscales of the test, which are intended to measure distinct dimensions of self-concept, should nevertheless share variance with one another.

Additional findings support the notion that the domain scales represent separate but interrelated aspects of self-concept. First, in all cases, the interscale correlations are lower than the scale reliabilities (see Table 17). This indicates that individual items are related more strongly to other items on the same domain scale than to items on other scales (with the exception of overlapping items). Second, as Table 20 shows, each domain scale correlates more strongly with the TOT score than with any of the other domain scales. This demonstrates that each domain scale is a better index of general self-concept than of the particular components of self-concept measured by the other domain scales.

Factor analysis. An exploratory common factor analysis with oblimin rotation was conducted on the Piers-Harris 2 standardization sample data. The common factor approach was selected because it allows for sources of variance (e.g., measurement error) other than the extracted factors. Oblimin rotation was chosen because it assumes correlated factors, which is a theoretically and empirically reasonable assumption for this measure (see the discussion in the previous section). Table 21 presents the factor loadings (item-factor correlations), with items organized by domain scale.
The factor analysis yielded six factors with eigenvalues greater than 1. The first factor was weighted with items representing feelings of happiness and perceptions that one is important and valued by others. The second factor reflected endorsement of troublesome behaviors at school and at home. The third factor appeared to represent freedom from anxiety, worry, and nervousness. The fourth factor reflected perceptions of being good at schoolwork and fitting in well at school. The fifth factor represented dissatisfaction with one’s physical appearance and personal attributes. The sixth factor reflected a perception that one has many friends, makes new friends easily, and is well liked by others. Note that the first, third, fourth, and sixth factors represent qualities that are positively correlated with self-esteem, whereas the second and fifth factors represent qualities that are negatively correlated with self-esteem. Thus, a child with a high self-concept score would be expected to have relatively high scores on Factors I, III, IV, and VI, and relatively low scores on Factors II and V.

There is reasonably good correspondence between the results of the factor analysis and the item assignments for the Piers-Harris 2 domain scales. The BEH, INT, FRE, and POP scales map relatively cleanly onto Factors II, IV, III, and VI, respectively. The situation is somewhat more complex for the other two domain scales. On the PHY scale, the items that reflect concerns with physical appearance map cleanly onto Factor V, but PHY items that represent other personal attributes (e.g., Item 9, “I am a leader in games and sports,” or Item 15, “I am strong”) do not load on Factor V or on any other factor. The HAP scale appears to be bifactorial: items representing feelings of happiness, importance, and being valued by others load strongly on Factor I.
Other HAP items, having to do with satisfaction with one’s appearance, load on Factor V, reflecting some of the item overlap between HAP and PHY.

It is worth noting that in this factor analysis, items were recoded to conform to the Piers-Harris 2 scoring system (i.e., “1” is coded for the response in the direction of positive self-concept, “0” for the alternative response). This recoding tends to obliterate the original distinction between positive and negative item phrasing. For example, a positively phrased item (e.g., Item 60, “I am a good person”) is coded “1” for a yes response, and a negatively phrased item (e.g., Item 36, “I hate school”) is coded “1” for a no response. Closer inspection of the factor analysis, however, does reveal some effects of item phrasing on the factor structure of the Piers-Harris 2. In particular, Factors II and III are composed primarily of negatively phrased items; Factors IV and V contain mostly positively phrased items; and Factors I and VI are evenly mixed.

Table 20
Interscale Correlations in the Piers-Harris 2 Standardization Sample

Scale                                       TOT   BEH   INT   PHY   FRE   POP   HAP
Total (TOT)                                  -
Behavioral Adjustment (BEH)                 .73    -
Intellectual and School Status (INT)        .84   .64    -
Physical Appearance and Attributes (PHY)    .76   .34   .65    -
Freedom From Anxiety (FRE)                  .79   .42   .52   .50    -
Popularity (POP)                            .75   .30   .50   .66   .64    -
Happiness and Satisfaction (HAP)            .81   .53   .60   .69   .66   .55    -

Note. N = 1,387.
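One quick structural check on Table 20 is to examine the eigenvalues of the domain-scale correlation matrix (TOT excluded, since it is composed of the other scales). The sketch below simply re-enters the Table 20 values; a single dominant eigenvalue is what would be expected if the six scales tap separate but interrelated facets of one general self-concept dimension.

```python
import numpy as np

# Interscale correlations from Table 20; order: BEH, INT, PHY, FRE, POP, HAP.
R = np.array([
    [1.00, 0.64, 0.34, 0.42, 0.30, 0.53],
    [0.64, 1.00, 0.65, 0.52, 0.50, 0.60],
    [0.34, 0.65, 1.00, 0.50, 0.66, 0.69],
    [0.42, 0.52, 0.50, 1.00, 0.64, 0.66],
    [0.30, 0.50, 0.66, 0.64, 1.00, 0.55],
    [0.53, 0.60, 0.69, 0.66, 0.55, 1.00],
])

# eigvalsh returns eigenvalues in ascending order for a symmetric matrix;
# reverse to descending. The first eigenvalue dominates, consistent with a
# strong general factor running through the six domain scales.
eigenvalues = np.linalg.eigvalsh(R)[::-1]
print(np.round(eigenvalues, 2))
```

Note that this scale-level decomposition is only a summary device; it is not the item-level common factor analysis reported in Table 21.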
Table 21
Factor Loadings for Piers-Harris 2 Item Responses in Standardization Sample

Behavioral Adjustment (BEH)
  12. I am well behaved in school. (Y)                        -.44
  13. It is usually my fault when something goes wrong. (N)   -.41
  14. I cause trouble to my family. (N)                       -.49
  18. I am good in my schoolwork. (Y)                          .56
  19. I do many bad things. (N)                               -.62
  20. I behave badly at home. (N)                             -.56
  27. I often get into trouble. (N)                           -.62
  30. My parents expect too much of me. (N)                      -
  36. I hate school. (N)                                       .45
  38. I am often mean to other people. (N)                    -.47
  45. I get into a lot of fights. (N)                         -.50
  48. My family is disappointed in me. (N)                    -.47
  58. I think bad thoughts. (N)                               -.46
  60. I am a good person. (Y)                                  .53

Intellectual and School Status (INT)
  5. I am smart. (Y)                                           .51
  7. I get nervous when the teacher calls on me. (N)           .51
  12. I am well behaved in school. (Y)                        -.44
  16. I am an important member of my family. (Y)               .49
  18. I am good in my schoolwork. (Y)                          .56
  21. I am slow in finishing my schoolwork. (N)                  -
  22. I am an important member of my class. (Y)                .44
  24. I can give a good report in front of the class. (Y)      .51
  25. In school I am a dreamer. (N)                              -
  26. My friends like my ideas. (Y)                            .49
  34. I often volunteer in school. (Y)                         .45
  39. My classmates in school think I have good ideas. (Y)     .49, .50
  43. I am dumb about most things. (N)                         .41
  50. When I grow up, I will be an important person. (Y)       .42
  52. I forget what I learn. (N)                               .50
  55. I am a good reader. (Y)                                  .46

Physical Appearance and Attributes (PHY)
  5. I am smart. (Y)                                           .51
  8. My looks bother me. (N)                                  -.58
  9. I am a leader in games and sports. (Y)                      -
  15. I am strong. (Y)                                           -
  26. My friends like my ideas. (Y)                            .49
  33. I have nice hair. (Y)                                   -.55
  39. My classmates in school think I have good ideas. (Y)     .49, .50
  44. I am good-looking. (Y)                                  -.65
  46. I am popular with boys. (Y)                                -
  49. I have a pleasant face. (Y)                             -.63
  54. I am popular with girls. (Y)                             .42

Freedom From Anxiety (FRE)
  4. I am often sad. (N)                                       .44
  6. I am shy. (N)                                               -
  7. I get nervous when the teacher calls on me. (N)           .51
  8. My looks bother me. (N)                                  -.58
  10. I get worried when we have tests in school. (N)          .50
  17. I give up easily. (N)                                      -
  23. I am nervous. (N)                                        .55
  29. I worry a lot. (N)                                       .56
  31. I like being the way I am. (Y)                          -.54
  32. I feel left out of things. (N)                           .48, -.46
  35. I wish I were different. (N)                             .42, -.58
  40. I am unhappy. (N)                                        .54
  56. I am often afraid. (N)                                   .54
  59. I cry easily. (N)                                        .44

Popularity (POP)
  1. My classmates make fun of me. (N)                         .41
  3. It is hard for me to make friends. (N)                    .56
  6. I am shy. (N)                                               -
  11. I am unpopular. (N)                                      .52
  32. I feel left out of things. (N)                           .48, -.46
  37. I am among the last to be chosen for games and sports. (N)  .51
  39. My classmates in school think I have good ideas. (Y)     .49, .50
  41. I have many friends. (Y)                                 .43, .63
  47. People pick on me. (N)                                   .52
  51. In games and sports, I watch instead of play. (N)          -
  54. I am popular with girls. (Y)                             .42
  57. I am different from other people. (N)                      -

Happiness and Satisfaction (HAP)
  2. I am a happy person. (Y)                                  .63
  8. My looks bother me. (N)                                  -.58
  28. I am lucky. (Y)                                            -
  31. I like being the way I am. (Y)                          -.54
  35. I wish I were different. (N)                             .42, -.58
  40. I am unhappy. (N)                                        .54
  42. I am cheerful. (Y)                                       .60
  49. I have a pleasant face. (Y)                             -.63
  53. I am easy to get along with. (Y)                           -
  60. I am a good person. (Y)                                  .53

Note. N = 1,387. Principal axis extraction method with oblimin rotation. Loadings are item-factor correlations; loadings of less than .40 are not displayed (shown here as "-"). Letter in parentheses indicates the response scored as positive self-concept (Y = yes, N = no).
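The recoding convention noted in the factor-analysis discussion and in the Table 21 key can be sketched as follows. The dictionary of keyed directions below is hypothetical shorthand for illustration, not WPS scoring material; it simply shows how yes/no answers map onto the positive-self-concept direction.

```python
# Hypothetical key fragment: True means "yes" is the positive-self-concept
# response. Items 60 ("I am a good person", Y) and 36 ("I hate school", N)
# are taken from the Table 21 key.
keyed_yes = {60: True, 36: False}

def score_item(item: int, answered_yes: bool) -> int:
    """Return 1 if the answer falls in the positive self-concept direction."""
    return int(answered_yes == keyed_yes[item])

print(score_item(60, True), score_item(36, False))  # → 1 1
```

Because both positively and negatively phrased items are coded this way, a recoded "1" always means the same thing (a positive self-report), which is exactly why the recoding obscures the phrasing distinction discussed above.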