SENG 421:
Software Metrics
                Measurement Theory
                    (Chapter 2)

Department of Electrical & Computer Engineering, University of Calgary

                 B.H. Far      (far@ucalgary.ca)
      http://www.enel.ucalgary.ca/People/far/Lectures/SENG421/02/
Contents
   Metrology
   Property-oriented measurement
   Meaningfulness in measurement
   Scale
   Measurement validation
   Object-oriented measurement
   Subject-domain-oriented measurement




Metrology
   Metrology is the science of measurement.
   Metrology is the basis for empirical science and engineering: it brings knowledge under general laws, i.e., it distils observations into formal theories and expresses them mathematically.
   Measurement is used for the formal (logical or mathematical, orderly and reliable) representation of observations.



Two Problem Categories
   Components of a measurement system:
        m = <attribute, scale, unit>
           Attribute is what is being measured (e.g., size of a program)
           Scale is the standard and scope of measurement (e.g., nominal,
            ordinal, ratio scale, etc.)
           Unit is the physical meaning of scale (e.g., a positive integer, a
            symbol, etc.)
   Determining the value of an attribute of an entity.
   Determining the class of entities to which the
    measurement relates.



Empirical Relations
   Empirical relation preserved under measurement M
    as numerical relation




                                         Figure from Fenton’s Book

Real, Empirical & Formal Worlds
   [Diagram: Measurement maps the Real World onto the Empirical World, expressed in scales and units; a Mapping relates the Empirical World to the Formal World, a mathematical (logical) model; Modeling & Verification connect the Formal World back to the Real World.]
Real, Empirical & Formal Worlds
   [Diagram: measuring the attribute "height" maps Entity A to 197 and Entity B to 124 (scale: ratio, unit: cm); in the formal world this becomes the relation M(A) > M(B); Modeling & Verification and Mapping connect the worlds as before.]
Measurement: Activities /1
   Problem definition
        Defining measurement problem
        Designating set of entities forming the target of
         measurement
        Identifying key attributes for the entities
   Identifying scales
        Identifying scales for which the attributes can be
         measured
   Forming the empirical relational system
        Mapping entities and their selected attributes to numbers
         or values on the scales

Measurement: Activities /2
   Modeling
        Developing mathematical (logical) representation of the
         entities and their attributes
   Defining the formal relational system
        Mapping the empirical relational system to the formal
         model
   Verifying the results of measurement
        Verifying whether the measurement results reflect the
         properties of the entities, attributes and relationships



Empirical Relational System
   E = {A, R, O}
   A = {a, b, c, …, z}: the set of real-world entities
   k: the property of interest for each element of A
   The model of A describes each element of A in terms of the property k
   Empirical observations are defined on this model set:
        Set of n-ary relations R, e.g., "X is taller than Y" (binary) or "X is tall" (unary)
        Set of binary operations O, e.g., concatenating two programs
Formal Relational System
   F = {A’, R’, O’}
   F should satisfy the following conditions:
       Capable of expressing all relations and
        operations in E.
        Support meaningful conclusions drawn from the data.
       Mapping from E to F must represent all the
        observations, preserving all the relations and
        operations of E.

Example 1                /1
     Ranking 4 software products based on user preferences
     Problem definition: rank the products (entities) A, B, C and D based on user preferences
     Scale: a single 0-100% linear scale
     Empirical relational system: represented by the table below; (A,B) = 80 means that 80% of the users preferred product A to product B

                A     B     C     D
          A     -     80    10    80
          B     20    -     5     50
          C     90    95    -     96
          D     20    50    4     -

                                                          Example from Fenton's Book
Example 1                    /2
    Modeling:
         A pair is valid if its value is greater than 60: if more than 60% of users prefer x to y, then x is "definitely better" than y.
            If  M(x,y) > 60%   then  p(x) > p(y)
    Formal relational system (transitivity):
            If  p(x) > p(y)  and  p(y) > p(z),  then  p(x) > p(z)
    Verification:
         Valid pairs: (C,A), (C,B), (C,D), (A,B) and (A,D)
         No conflict between the data collected and the model




Example 1                /3
    Verification fails if the data had been collected as shown below, because then
       M(B,C) > 60%  implies  p(B) > p(C)
       M(C,A) > 60%  implies  p(C) > p(A)
       M(A,B) > 60%  implies  p(A) > p(B)
    which is a preference cycle.

                A     B     C     D
          A     -     80    10    80
          B     20    -     95    50
          C     90    5     -     96
          D     20    50    4     -

    There is a conflict between the empirical data and the formal model, so the model must be revised.
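The verification step can be automated. Below is a minimal sketch (an illustration, not part of the lecture) that builds the "definitely better" relation from a preference matrix using the 60% threshold and reports preference cycles, i.e., conflicts with the transitive formal model; the two matrices are the ones from Example 1.

```python
# Minimal sketch (not from the slides): detect conflicts between a user-preference
# matrix and the transitive formal model "p(x) > p(y)".

from itertools import permutations

THRESHOLD = 60  # a pair (x, y) is "valid" if more than 60% of users prefer x to y

def better_pairs(matrix):
    """Return the set of valid pairs (x, y) with matrix[x][y] > THRESHOLD."""
    return {(x, y) for x in matrix for y in matrix[x] if matrix[x][y] > THRESHOLD}

def find_conflicts(matrix):
    """Return preference cycles of length 3, which violate transitivity."""
    pairs = better_pairs(matrix)
    return [(x, y, z) for x, y, z in permutations(matrix, 3)
            if (x, y) in pairs and (y, z) in pairs and (z, x) in pairs]

# Data from Example 1 /1 (consistent) and Example 1 /3 (inconsistent).
consistent = {
    "A": {"B": 80, "C": 10, "D": 80}, "B": {"A": 20, "C": 5, "D": 50},
    "C": {"A": 90, "B": 95, "D": 96}, "D": {"A": 20, "B": 50, "C": 4},
}
inconsistent = {
    "A": {"B": 80, "C": 10, "D": 80}, "B": {"A": 20, "C": 95, "D": 50},
    "C": {"A": 90, "B": 5, "D": 96}, "D": {"A": 20, "B": 50, "C": 4},
}

print(find_conflicts(consistent))    # [] -> no conflict, the model is validated
print(find_conflicts(inconsistent))  # e.g. [('A', 'B', 'C'), ...] -> the model must be revised
```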
Example 2               /1
   Entity: software failure
   Attribute: criticality
   Three types of failure are observed:
        Delayed response
        Incorrect output
        Data loss
   There are 3 failure classes in E
        Delayed response (R1)
        Incorrect output (R2)
        Data loss (R3)


Example 2       /2
   Measurement mapping




                                       Figure from Fenton’s Book

Example 2           /3
   Next we would like to add a new binary relation:
         x is more critical than y
   Each data loss failure (x in R3) is more critical than
    incorrect output failure (y in R2) and delayed
    response failure (y in R1)
   Each incorrect output failure (x in R2) is more
    critical than delayed response failure (y in R1)
   The model should be revised to account for this new
    binary relation

Example 2        /4
   Revised measurement mapping




                                           Figure from Fenton’s Book

Measurement: Properties /1
   Completeness
        Components of a measurement system:
           m = <attribute, scale, unit>
        Attribute is what is being measured (e.g., size of a
         program)
        Scale is the standard and scope of measurement (e.g.,
         ratio)
        Unit is the physical meaning of scale (e.g., a positive
         integer)
   Uniqueness
        A measurement result should be unique and match with
         the scale and units


Measurement: Properties /2
   Extendibility
       Two or more formal measures applied to the same entity are compatible via an explicit compatibility relation (e.g., cm and inch both measure length, with 1 in = 2.54 cm)
   Discrete Differentiability
       The minimum unit of the measurement scale determines how finely the measurement system can differentiate between entities.

Measurement: Properties /3
   Deterministic or Probabilistic
       Measurement system should be either
        deterministic (i.e., lead to same results under
        same conditions) or probabilistic (e.g.,
        productivity per hour, reliability)
   Quantitative or Qualitative
       Result of measurement is either quantitative
        (represented by number values) or qualitative
        (represented by qualitative values or range
        intervals)

Measurement: Direct & Indirect
   Direct measurement of an attribute of an entity
    involves no other attribute or entity. E.g., software
    length in terms of lines of code.
   Indirect measurement is useful in making visible the
    interaction between direct measurements. E.g.,
    productivity, software quality.
   Indirect measurement is particularly helpful when the empirical relational system is large (i.e., many relations and many entities) or when the cost of direct measurement is high.


Example: Direct & Indirect
   In a software system,
    measuring number of faults
    (i.e., direct measurement)
    leads to identification of 5
    problem areas.
   However, measuring faults
    per KLOC (i.e., indirect
    measurement) leads to
    identification of only one
    problem area.


                                              Example from Fenton’s Book

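As a rough sketch of this effect (the module names and numbers are made up for illustration; they are not Fenton's data), normalizing the direct fault count by module size can change which module stands out:

```python
# Illustrative sketch: direct measure (fault count) vs. indirect measure (faults per KLOC).
# Module names and numbers are hypothetical.

modules = {              # name: (faults found, size in KLOC)
    "parser":    (40, 20.0),
    "scheduler": (35,  5.0),
    "ui":        (38, 19.0),
}

by_faults = sorted(modules, key=lambda m: modules[m][0], reverse=True)
by_density = sorted(modules, key=lambda m: modules[m][0] / modules[m][1], reverse=True)

print("direct  (faults):      ", by_faults)    # parser looks worst
print("indirect (faults/KLOC):", by_density)   # scheduler stands out once size is factored in
```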
Measurement: Size of Empirical Set
   Example: measuring popularity of 4 software
    products
   E = {A, R, O}
        |A| = 4
        |R| = 3
             r1: x is more popular than y
             r2: x is less popular than y
             r3: indifferent
   The total number of possible empirical observations (ordered pairs × relations) is:
    4 × (4 - 1) × 3 = 36
   This size grows as the number of elements and relations grows.

Software Measurement:
Scale
Scale: Informal Definition
   A measurement scale is a set of predefined symbols or values used to represent certain common measures.
   A scale is an abstract
    measurement tool for
    measuring defined
    common attributes of
    entities.


Scale: Formal Definition
   Assume that the following are given:
        A = {a, b, c, …, z}: the set of real-world entities
        k: the property of interest for each element of A
        The model of A, which describes each element of A in terms of the property k
        The empirical and formal relational systems E and F
        m: a mapping from (the model of) A to A'
   Then S = (E, F, m) is called the scale of measurement for the property k of the target set A.
   If F is defined over a subset of the real numbers, the scale maps the key property of each object of the target set onto a number.
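A minimal sketch of this definition in code, reusing the height example from slide 7; the class and field names are illustrative assumptions, not notation from the lecture:

```python
# Illustrative sketch of a scale S = (E, F, m): an empirical relational system,
# a formal relational system, and a measurement mapping between them.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Scale:
    name: str                      # e.g. "ratio"
    unit: str                      # e.g. "cm"
    m: Callable[[str], float]      # the measurement mapping A -> A'

# Empirical observation: "Entity A is taller than Entity B" (the height example).
heights: Dict[str, float] = {"Entity A": 197.0, "Entity B": 124.0}
height_scale = Scale(name="ratio", unit="cm", m=lambda entity: heights[entity])

# The formal relation M(A) > M(B) must mirror the empirical relation "taller than".
assert height_scale.m("Entity A") > height_scale.m("Entity B")
```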
Scales (History)
   In the early 1940s, the Harvard psychologist Stanley Smith Stevens coined the terms nominal, ordinal, interval, and ratio to describe a hierarchy of measurement scales and to classify statistical procedures according to the scales to which they apply.
   Stevens' scales were first presented in his 1946 article "On the theory of scales of measurement" [Stevens, 1946]. They have been adopted by most statistics textbooks and have influenced statistical practice ever since.

Scales (Summary)


   [Diagram: the scale types form a hierarchy - Nominal, Ordinal, Interval, Ratio - each level carrying more information than the previous one.]
Measurement Scale: Questions
       How do we know if a scale appropriately represents
        the relationships between measured properties of
        entities? (representation problem)
          Has to do with the validity of the measure
        What do we do when we have several different scales
         for the same measure? (uniqueness problem)
           Has to do with transformations between scales (→ unit!)
       For a defined scale, what arithmetic operations make
        sense for measurement values?
          Has to do with meaningfulness of measurement-based
            statements
Scale Types
   Objective (regular) scales
        Nominal, ordinal, interval, ratio, absolute


   Subjective scales
        Likert-Type scale (Evaluation-Type, Frequency-Type,
         Agreement-Type)
        Semantic differential scale
        Summative scale




Regular Scales
   The scale (E, F, m) is regular if and only if
        for every other scale (E, F, g), and
        for all a and b in A,
                   m(a) = m(b) implies g(a) = g(b)

   This restates the uniqueness property across scales:
   if two objects measured on one scale yield the same measure for the property k, the same holds on any other scale.

Objective Scale: Types /1
Nominal Scales
   Define classes or categories, and then place each entity in a
    particular class or category, based on the value of the
    attribute.
   Properties:
        The empirical relation system consists only of different classes; there
         is no notion of ordering among the classes.
        Any distinct numbering or symbolic representation of the classes is
         an acceptable measure, but there is no notion of magnitude
         associated with the numbers or symbols.
   Nominal-scale measurement places elements in a
    classification scheme. The classes are not ordered; even if
    the classes are numbered from 1 to n for identification, there
    is no implied ordering of the classes.


Example: Nominal Scale
    Classification of cars based on their colour (entity: Car; attribute: Colour):

         Car      Measure   Colour
         C-C1        1      White
         C-C2        2      Yellow
         C-C3        3      Red
         C-C4        4      Blue
         C-C5        5      Green
         C-C6        6      other

         Measure (Car Colour) ∈ {"1", "2", "3", "4", "5", "6"}, or equivalently {White, …, other}

    Classification of faults in software:
        Specification fault
        Design fault
        Coding fault
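For a nominal scale, only equality tests and per-class counts are meaningful, and any one-to-one relabeling of the classes is admissible. A small illustrative sketch (the relabeling shown is an assumption):

```python
# Nominal scale sketch: only equality / counting is meaningful, and any bijective
# relabeling of the classes preserves all the information.

from collections import Counter

colours = {"C-C1": "White", "C-C2": "Yellow", "C-C3": "Red",
           "C-C4": "Blue", "C-C5": "Green", "C-C6": "other"}

relabel = {"White": 1, "Yellow": 2, "Red": 3, "Blue": 4, "Green": 5, "other": 6}
numbered = {car: relabel[c] for car, c in colours.items()}

# Meaningful: are two cars in the same class? How many cars per class?
print(colours["C-C1"] == colours["C-C2"])     # False, and the same under any relabeling
print(Counter(numbered.values()))             # class frequencies

# NOT meaningful: ordering or averaging the labels (3 > 1 says nothing about colours).
```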
Objective Scale: Types /2
Ordinal Scales
   The ordinal scale is useful to augment the nominal
    scale with information about an ordering of the
    classes or categories.
   Properties:
        The empirical relation system consists of classes that are
         ordered with respect to the attribute.
        Any mapping that preserves the ordering (that is, any
         monotonic function) is acceptable.
        The numbers represent ranking only, so addition,
         subtraction, and other arithmetic operations have no
         meaning.


Example: Ordinal Scale
    Any admissible transformation must preserve order:
        if M(x) > M(y) then M'(x) > M'(y)
    Measuring the complexity of many software modules by defining 5 complexity classes:
        Trivial
        Simple
        Moderate
        Complex
        Incomprehensible
    Another example (entity: Defect; attribute: Severity):

         Defect    Measure   Severity
         D-S1         1      S1  Documentation
         D-S2         2      S2  Minor
         D-S3         3      S3  Major
         D-S4         4      S4  Critical

         Measure (Defect Severity) ∈ {S1, …, S4}, or equivalently {1, …, 4}
Objective Scale: Types /3
Interval Scales
   Interval scale carries more information than ordinal and
    nominal scale. It captures information about the size of the
    intervals that separate the classes, so that we can in some
    sense understand the size of the jump from one class to
    another.
   Properties:
        An interval scale preserves order, as with an ordinal scale.
        An interval scale preserves differences but not ratios. That is, we
         know the difference between any two of the ordered classes in the
         range of the mapping, but computing the ratio of two classes in the
         range does not make sense.
        Addition and subtraction are acceptable on the interval scale, but not
         multiplication and division.



Example: Interval Scale
   M’ = aM + b
                                                  Entity       Attr
   Temperature ranges in Celsius
    and Fahrenheit                                Engine       Temp

   Project scheduling                                  …
                                                        E-T1
                                                                 … …
                                                                 -20 -4
                                                        E-T2     -10 14
        Requirement analysis 3 weeks                   E-T3     0   32
                                                        E-T4     10 50
        Design                   4 weeks               E-T5
                                                        …
                                                                 20 68
                                                                 … …
        Coding                   4 weeks
        Testing starts after coding is done Measure (Engine Temperature)
                                             ∈ [min, max]
        When testing starts? After 11 weeks


                                far@ucalgary.ca                       43
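A short illustrative sketch of the interval-scale properties above: Celsius to Fahrenheit is an affine transformation M' = aM + b, so differences survive the change of unit but ratios do not, and the schedule arithmetic only ever adds durations:

```python
# Interval scale sketch: Celsius -> Fahrenheit is an admissible affine transformation.

def c_to_f(c):          # M' = a*M + b with a = 1.8, b = 32
    return 1.8 * c + 32

print(c_to_f(20), c_to_f(-20))          # 68.0 -4.0, matching the table above

# Differences are meaningful: a 10-degree step maps to a fixed 18-degree step.
print(c_to_f(10) - c_to_f(0), c_to_f(20) - c_to_f(10))   # 18.0 18.0

# Ratios are NOT meaningful: 20 C is not "twice as hot" as 10 C.
print(20 / 10, c_to_f(20) / c_to_f(10))                  # 2.0 vs ~1.36

# Project schedule: adding durations (differences) is fine on an interval time scale.
phases = {"requirements": 3, "design": 4, "coding": 4}    # weeks
print(sum(phases.values()))                               # testing starts after 11 weeks
```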
Objective Scale: Types /4
Ratio Scales
   Sometimes we would like to be able to say that one liquid is
    twice as hot as another, or that one project took twice as long
    as another. This needs the ratio scale, which is the most
    useful scale of measurement, and quite common in the
    physical sciences.
   Properties:
        It is a measurement mapping that preserves ordering, preserves
         size of intervals between entities, and preserves ratios between
         entities.
        There is a zero element, representing total lack of the attribute.
        The measurement mapping must start at zero and increase at equal
         intervals, known as units.
         All arithmetic can be meaningfully applied to the classes in the range
          of the mapping.

                                  far@ucalgary.ca                         44
Example: Ratio Scale
    Measuring length, distance, etc.; admissible transformations preserve ratios: M' = aM.
    Measuring the execution time of a program, shown here in two units related by M' = aM (e.g., seconds and milliseconds):

         Prog      Exec Time      Exec Time
         P-E1        0                0
         P-E2        0.001            1
         P-E3        0.002            2
         P-E4        0.003            3
         P-E5        0.004            4
         P-E6        0.005            5
         …           …                …

         Measure (Program Execution Time) ∈ [0, ∞)
Objective Scale: Types /5
Absolute Scales
   The absolute scale is the most restrictive of all. For any two
    measures, M and M', there is only one admissible
    transformation: the identity transformation.
   Properties:
        The measurement for an absolute scale is made simply by counting
         the number of elements in the entity set.
        The attribute always takes the form “number of occurrences of x in
         the entity.”
        There is only one possible measurement mapping.
        All arithmetic analysis of the resulting count is meaningful.



Example: Absolute Scale
   M’ = M
   Number of failures                      Entity             Attr
    observed in a module is
                                            Module           #Defects
    absolute but the reliability
    is not.                                           M-D1
                                                      M-D2
                                                                  0
                                                                  1
                                                      M-D3        2
                                                      M-D4        3
                                                      M-D5        4
                                                      M-D6        5
   Number of people working                          …           …

    on a project is absolute but
                                            Measure (Module Defect Count) ∈ IN0
    their productivity is not.

                          far@ucalgary.ca                                    47
Scale Types (Summary)
    Nominal scale: classification of objects, where the fact that objects are different is preserved
    Ordinal scale: objects are ranked/ordered according to some criterion, but no information about the distance between the values is given
    Interval scale: differences between values are meaningful
    Ratio scale: there is a meaningful "zero" value, and ratios between values are meaningful
    Absolute scale: no transformation (other than identity) is meaningful

    A Measure (Attribute) is well-defined if scale and unit are clearly specified; specifying the unit makes the measure unambiguous!
Measurement Scale Types [Mor01]

Scale Type | Characterization | Example (generic) | Example (SE)
Nominal    | Divides the set of objects into categories, with no particular ordering among them | Labeling, classification | Name of programming language, name of defect type
Ordinal    | Divides the set of entities into categories that are ordered | Preference, ranking, difficulty | Ranking of failures (as measure of failure severity)
Interval   | Comparing the differences between values is meaningful | Calendar time, temperature (Fahrenheit, Celsius) | Beginning and end date of activities (as measures of time distance)
Ratio      | There is a meaningful "zero" value, and ratios between values are meaningful | Length, weight, time intervals, absolute temperature (Kelvin) | Lines of code (as measure of attribute "Program length/size")
Absolute   | There are no meaningful transformations of values other than identity | Object count | Count (as measure of attribute "Number of lines of code")
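Read as admissible transformations, the hierarchy can be summarized in code. The sketch below is illustrative (the severity numbers are made up); it also shows why a ranking survives a monotonic transformation on an ordinal scale while an average does not:

```python
# Illustrative sketch: admissible transformations per scale type (Stevens' hierarchy).

admissible = {
    "nominal":  "any one-to-one relabeling",
    "ordinal":  "any strictly monotonic increasing function",
    "interval": "affine maps g(x) = a*x + b, a > 0",
    "ratio":    "similarity maps g(x) = a*x, a > 0",
    "absolute": "identity only, g(x) = x",
}

# Ordinal example: rankings survive any monotone transformation ...
severities = [3, 1, 4, 2]
monotone = [x ** 2 for x in severities]              # strictly increasing for positive values
print(sorted(range(4), key=lambda i: severities[i]) ==
      sorted(range(4), key=lambda i: monotone[i]))   # True: the ranking is unchanged

# ... but means do not: "average severity" is not meaningful on an ordinal scale.
print(sum(severities) / 4, sum(monotone) / 4)        # 2.5 vs 7.5 -- not comparable
```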
Measurement Scale – Summary
   There are 5 types of measurement scales
   The type of the measurement scale
    determines
       how measurement data can be treated
       whether statements involving measurement data
        are meaningful




Commonly Used Subjective
Measurement Scales

   Likert-Type Scale
       Evaluation-Type
       Frequency-Type
       Agreement-Type

   Semantic Differential Scale
   Summative Scale
Likert Type Scales /1
   Evaluation-type [Spe92]
   Example:
       Familiarity with and comprehension of the
        software development environment (e.g.,
        compiler, code generator, CASE tools):

             Little
             Unsatisfactory
             Satisfactory
             Excellent
Likert Type Scales /2
   Frequency-type [Spe92]
   Example:
        Customers provided information to the project team
             e.g., during interviews, when given questionnaires by the project
              staff, when presented with a “system walkthrough”, and/or when
              they are asked to provide feedback on a prototype:


                 Never
                 Rarely
                 Occasionally
                 Most of the time
Likert Type Scales /3
   Agreement-type [Spe92]
   Example:
        The tasks supported by the software at the
         customer site were undergoing numerous
         changes during the project:

           Strongly Agree
           Agree
           Disagree
           Strongly Disagree
Semantic Differential Scale
   Items which include semantic opposites
   Example:
        Processing of requests for changes to existing systems:
         the manner, method, and required time with which the
         MIS staff responds to user requests for changes in
         existing computer-based information systems or services.

          Slow   □□□□□□□              Fast
          Timely □ □ □ □ □ □ □        Untimely
Assigning Numbers to Scale
    Responses /1
   Likert-Type Scales:
   Ordinal Scale
   But: Often the distances between the four response
    categories are approximately (conceptually)
    equidistant [Spe76][Spe80] and thus are treated like
    approximate interval scales
          Strongly Agree         1
          Agree                  2
          Disagree               3
          Strongly Disagree      4
Assigning Numbers to Scale
    Responses /2
   Semantic Differential Scale:
   Ordinal scale, but again, often treated as
    interval scales

Slow   □   □     □     □     □     □    □        Fast
       1   2     3     4     5     6    7
Assigning Numbers to Scale
    Responses /3
   Summative Scale:
        A summative scale is constructed by forming an
         unweighted sum of n items:
          Y = X1 + X2 + … + Xn

        This is commonly considered to be an approximately
         interval scale, but controversial (see [McC81][GL89] for
         two different views)
        If the X items are on a Likert-type scale then there is a
         good empirical basis for treating this as an approximately
         interval scale
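A minimal sketch of a summative score built from Likert-type items; the 1-4 coding follows the agreement-type example above, but the items and responses are hypothetical:

```python
# Summative scale sketch: an unweighted sum Y = X1 + X2 + ... + Xn of Likert-type items.
# Items and responses below are hypothetical placeholders.

AGREEMENT = {"Strongly Agree": 1, "Agree": 2, "Disagree": 3, "Strongly Disagree": 4}

responses = ["Agree", "Strongly Agree", "Disagree", "Agree"]   # one respondent, n = 4 items
scores = [AGREEMENT[r] for r in responses]

Y = sum(scores)            # summative score, often treated as approximately interval
print(Y, Y / len(scores))  # 8 and the per-item mean 2.0
```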
Exercise 1
Determine the best “scale” for each of the following
measurements using each of the Nominal, Ordinal, Interval
and Ratio scales only once.
   Measuring execution time of a program Ratio
   Classification of objects based on their color Nominal
   Measuring duration of various phases of projects (Project
    scheduling)        Interval
   Measuring complexity of many software modules by
    defining 4 complexity classes (Trivial, Simple, Moderate,
    Complex)                                 Ordinal
   Measuring complexity of many software modules by
    defining cyclomatic complexity metrics
Exercise 1 (cont’d)
   Software products categorized according to their
    compatibility with the operating system (i.e.
    Windows, Linux, MacOS, DOS, etc.).          Nominal
   Internet services categorized according to their
    relevant technologies (i.e. dial-up, DSL, high-speed,
    wireless, etc.)      Nominal

   Measuring attitudes towards an Internet
    service (say, on an n-point rating, n = 0 to
    10).        Ordinal



Exercise 1 (cont’d)
   Measuring attitude towards Internet services, if the
    evaluation scores are numerically meaningful so the
    difference between a rate of 3 and a rate of 6 is
    exactly the same as the difference between a rate of
    7 and a rate of 10. Interval
   Measuring attitude towards Internet services, if the
    differences between the data values are numerically
    meaningful and equal. In addition, a score of zero
    implies either the full absence of the service being
    evaluated or the full dissatisfaction with it. Ratio

Exercise 2
   Suppose that you are asked to study various
    software development tools and recommend the
    best three to your company. The following table
    shows a list of available development tools.




Exercise 2 (cont’d)
Tool Name / Vendor               | Languages Supported  | Platforms          | Features
Bean Machine / IBM               | Java                 | Windows, OS2, Unix | Best: Visual applet and JavaBean generation
CodeWarrior Pro / Metrowerks     | Java, C, C++, Pascal | Unix, Windows, Mac | Best: if you need to support Unix, Windows, and Mac platforms
Java Workshop / Sun Microsystems | Java                 | Solaris, Windows   | Better: written 100% in Java; tools based on a web browser metaphor
JBuilder / Imprise               | Java                 | Windows, AS400     | Better: database support
Visual Cafe for Java / Symantec  | Java                 | Windows            | Good: multithreaded debugger
VisualAge / IBM                  | Java                 | Unix, Windows      | Good: includes incremental compiler and automatic version control
Visual J++ / Microsoft           | Java                 | Windows            | Fair: all the bells and whistles for Windows
Exercise 2 (cont’d)
   What are the entities, attributes and their
    values in your model?
    Entity           Attribute            Value

    Development      Language             Java, C, C++, Pascal
    Tool             supported
                     Platform             Win, Unix, Mac,
                                          OS2, AS400
                     Feature              Fair, Good, Better,
                                          Best



Exercise 2 (cont’d)
   What is the best scale for each of the
    attributes you defined?
Entity        Attribute   Value                      Scale

Development   Language    Java, C, C++, Pascal       Nominal
Tool          supported
              Platform    Win, Unix, Mac, OS2,       Nominal
                          AS400
              Feature     Fair, Good, Better, Best   Ordinal




Are the following statements
meaningful?
1. Peter is twice as tall as Hermann
2. Peter’s temperature is 10% higher than Hermann’s
3. Defect X is more severe than defect Y
4. Defect X is twice as severe as defect Y
5. The cost for correcting defect X is twice as high as the cost
   for correcting defect Y
6. The average temperature of city A (30 ºC) is twice as high
   as the average temperature of city B (15 ºC)
7. Project Milestone 3 (end of coding) took ten times longer
   than Project Milestone 0 (project start)
8. Coding took as long as requirements analysis
Are the following statements meaningful?
1. "Peter is twice as tall as Hermann"                                        interval*  no*
2. "Peter's temperature is 10% higher than Hermann's"                         interval*  no*
3. "Defect X is more severe than defect Y"                                    ordinal    yes
4. "Defect X is twice as severe as defect Y"                                  ordinal    no
5. "The cost for correcting defect X is twice as high as the cost
   for correcting defect Y"                                                   ratio      yes
6. "The average temperature of city A (30 ºC) is twice as high as
   the average temperature of city B (15 ºC)"                                 interval   no
7. "Project Milestone 3 (end of coding) took ten times longer than
   Project Milestone 0 (project start)"                                       interval   no
8. "Coding took as long as requirements analysis"                             interval   yes
Software Measurement:
Validation & Accuracy
Measurement vs. Prediction
   Measurement systems are used to assess existing
    entities by numerically characterizing one or more
    of their attributes.
   Prediction systems are used to predict some
    attributes of a future entity, involving a
    mathematical model with associated prediction
    procedures.
        Deterministic prediction system: The same output will be
         generated for a given input.
        Stochastic prediction system: The output for a given input
         will vary probabilistically. E.g., weather forecast.


Measurement Validation
   Validation question: Does the measure used
    capture the information that it was intended
    for?
   A measure is valid if it accurately
    characterizes the attribute it claims to
    measure.
   A prediction system is valid if it makes
    accurate predictions.

Validating Measures
   Definition: The process of ensuring that the
    measure is a proper numerical characterization of
    the claimed attribute by showing that the
    representation condition is satisfied.
   Example: Measuring program length
        Any measure of length should satisfy conditions such as:
         The length of two programs joined together is the sum of their lengths,
            m(p1 ; p2) = m(p1) + m(p2)
         If the length of p1 is greater than that of p2, any measure of length
         should satisfy m(p1) > m(p2)



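A small illustrative sketch of checking this representation condition for a candidate length measure (lines of code); the program texts and the use of string concatenation to model p1 ; p2 are assumptions:

```python
# Sketch: check the representation condition for a candidate length measure m = LOC.
# Program texts are hypothetical.

def loc(program: str) -> int:
    """Length measure: number of non-empty lines of code."""
    return sum(1 for line in program.splitlines() if line.strip())

p1 = "x = 1\ny = 2\n"
p2 = "print(x + y)\n"

# Concatenation p1 ; p2 is modelled here as joining the program texts.
assert loc(p1 + p2) == loc(p1) + loc(p2)     # additivity under concatenation

# Order preservation: p1 is longer than p2, so the measure must reflect that.
assert loc(p1) > loc(p2)
print("representation condition holds for these programs")
```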
Validating Prediction Systems
   Definition: The process of establishing the
    accuracy of the prediction system by empirical
    means, i.e., by comparing model performance with
    known data in the given environment.
   Example: Software Reliability
        Using tools such as CASRE to fit the failure data into the
         reliability model.
        Accuracy of estimation of the failure intensity λ depends
         on the number of failures experienced (i.e., the sample
         size).
        Good results in estimating failure intensity are generally
         experienced for programs with 5,000 or more developed
         source lines.

Classes of Prediction Systems
   Class 1: Using internal attribute measures of early
    life-cycle products to predict measures of internal
    attributes of later life-cycle products.
        Example: measures of size, modularity, and reuse of a
         specification are used to predict size of the final code.
   Class 2: Using early life-cycle process attribute
    measures and resource-attribute measures to predict
    measures of attributes of later life-cycle processes
    and resources.
        Example: the number of faults found during formal
         design review is used to predict the cost of
         implementation.


Classes of Prediction Systems
   Class 3: Using internal product-attribute measures to predict
    process attributes.
        Example: measures of structuredness are used to predict time to
         perform some maintenance task, or number of faults found during
         unit testing.
   Class 4: Using process measures to predict later process
    measures.
        Example: measures of failures during one operational period are
         used to predict likely failure occurrences in a subsequent operational
         period. In examples like this, where an external product attribute
         (reliability) is effectively defined in terms of process attributes
         (operational failures), we may also think of this class of prediction
         systems as using process measures to predict later external product
         measures.


Accuracy of Measurement /1
   In an ongoing project, it is possible to validate the estimate
    by comparing actual values with predicted values. If E is an
    estimate of a value and A is the actual value, the relative
    error (RE) in the estimate is:
                   RE = (A - E) / A
   Relative error will be negative (if the estimate is greater than
    the actual value) or positive (if the estimate is less than the
    actual value).
   Inaccuracies can result from either statistical variation or
    estimating bias.
   Estimating inaccuracies must be compensated for throughout
    the project.


Accuracy of Measurement /2
   Relative Error (RE) can be calculated for a
    collection of estimates, e.g., it is useful to determine
    if predictions are accurate for a group of projects.
   The mean relative error (MRE) is calculated as
    the sum of RE for n projects divided by n. It is
    possible for small relative errors to balance out
    large ones, so the magnitude of the error can also be
    considered.
   The mean magnitude of relative error (MMRE)
    is calculated using the absolute value of RE.


Accuracy of Measurement /3
   MMRE is used to define prediction quality.
   On a set of n projects, if k is the number of projects
    whose magnitude of relative error |RE| is less than or equal to q, then prediction
    quality is calculated as:
        PRED(q) = k/n
   Example:
        PRED(0.25) = 0.47 means that 47% of the predicted
         values fall within 25% of their actual values.
        An estimation technique is (usually) considered acceptable if
          PRED(0.25) ≥ 0.75.


Exercise
  Assume you want to predict the error proneness of
   software modules.
a) How would you assess prediction accuracy?
      RE (relative error) = (A – E) / A
      with A: actual, E: estimated
        MRE  = (1/n) · Σ_{i=1..n} RE_i

        MMRE = (1/n) · Σ_{i=1..n} |RE_i|
b) How would you assess prediction quality?
      PRED(q) = k/n
      where k is the number of predictions with |RE|<=q
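A short sketch putting these formulas together; the actual and estimated values are hypothetical:

```python
# Sketch: relative error, MRE, MMRE and PRED(q) for a set of estimates.
# The actual/estimated pairs below are hypothetical.

actuals   = [100, 80, 120, 60, 90]
estimates = [ 90, 95, 100, 58, 130]

re   = [(a - e) / a for a, e in zip(actuals, estimates)]
mre  = sum(re) / len(re)                       # signed errors may cancel out
mmre = sum(abs(r) for r in re) / len(re)       # magnitude of relative error

def pred(q):
    """Fraction of predictions whose |RE| is at most q."""
    return sum(abs(r) <= q for r in re) / len(re)

print(round(mre, 3), round(mmre, 3), pred(0.25))
# An estimation technique is usually considered acceptable if PRED(0.25) >= 0.75.
```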
Software Measurement:
Object Oriented & Subject-
Domain-Oriented Measurement
Object-Oriented Measurement
   Measurement of a single property is seldom
    adequate for the purpose of the software
    measurement because:
        Direct measurement of a property may be costly,
         inconvenient, perhaps even impossible. Measured results
         may be better achieved by observing one or more other
         properties and deriving the measurement of the required
         property from them.
        Measurement is seldom focused on a single property. It
         may be necessary to characterize a class of objects by the
         combination of several properties in an object profile.



Definition: Object Profile
   Profile is a set of disjoint alternatives called
    “elements” together with their values or
    occurrence probabilities.
   Object Profile is the set of properties and
    their values or probabilities of occurrences.
   Representing object profile
       Tabular representation
       Graphical representation


Tabular Representation
   Properties 1~n are directly measurable
   Properties n+1~q are measured indirectly
   Such a matrix stores properties which have already been
    found important, provides information for deriving new
    indirectly measured properties, and offers a framework for
    adding new properties when needed.
    Objects of the                                Properties
     model set A      m1      m2             …           mn    …    mq

          a          m1(a)   m2(a)                     mn(a)       mq(a)
          b          m1(b)   m2(b)                     mn(b)       mq(b)
          c          m1(c)   m2(c)                     mn(c)       mq(c)
          .            .       .                         .           .
          .            .       .                         .           .
          z          m1(z)   m2(z)                     mn(z)       mq(z)


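A minimal sketch of such a property matrix in code, with an indirectly measured property derived from the directly measured ones; the module names and values are hypothetical:

```python
# Sketch of an object profile matrix: rows are objects of the model set,
# columns are directly measured properties; indirect properties are derived from them.
# Module names and values are hypothetical.

profiles = {                 # object: {property: value}
    "parser":    {"loc": 2000, "faults": 40},
    "scheduler": {"loc":  500, "faults": 35},
}

# Add an indirectly measured property as a new column, derived from the direct ones.
for name, p in profiles.items():
    p["faults_per_kloc"] = p["faults"] / (p["loc"] / 1000)

for name, p in profiles.items():
    print(name, p)
```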
OO Measurement /1
   Let Ai = {ai, bi, … , zi} be a model of A with respect to the ith
    key property.
   Then the empirical relational system
    Ei = {Ai, Ri, Oi}
    models the set A with respect to the ith property, preserving
    the relations and operations which the property imposes on
    the objects of A.
   Si = (Ei, Fi, mi) will be a scale for the ith property.
   Now the set A will be an n-dimensional model of
    A: A = {A1, A2, … , An}, each with its own scale.



OO Measurement /2
   Assume that M = {m1, m2, … , mn} is the set of the
    real-valued functions on A, composed of measures
    of one of the directly measured key properties.
   If one can define a real-valued function g on A in
    terms of the primitive measures of M, then g will be
    an indirect measure of the set of objects of A in
    terms of the chosen set of key properties, shown by
    g(m1, m2, … , mn).



Meaningfulness
   Indirect measures must preserve meaningful
    empirical properties called interrelation conditions
    which permit one to deduce the meaningful indirect
    measure as a function of the primitive measures,
    according to the laws imposed by the model.
   Example:
        Measuring three direct attributes: power (P), voltage (V)
         and current (I)
        Measuring power (indirect) in terms of voltage (direct)
         and current (direct) attributes
        They should satisfy the condition P=V x I



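As a tiny illustration (the voltage and current readings are made up), an indirect power measure is meaningful only if the interrelation condition holds for the measured values:

```python
# Sketch: checking the interrelation condition P = V * I for an indirect power measure.
# Voltage/current/power readings are hypothetical.

readings = [(12.0, 2.0, 24.0), (5.0, 1.5, 7.5)]   # (V, I, P) triples

for v, i, p in readings:
    assert abs(p - v * i) < 1e-9, f"interrelation condition violated: {p} != {v} * {i}"
print("interrelation condition P = V * I holds for all readings")
```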
Example
   Setting failure intensity objective (λ) in terms
    of system availability (A):
         A = 1 / (1 + λ·tm)        or        λ = (1 - A) / (A·tm)
     where λ is the failure intensity (failures per unit of time)
     and tm is the downtime per failure
   The representation condition of the indirect
    measure of λ should preserve the relationship.

Scale Invariance
   The scale of any indirect measure is deduced by
    means of the interrelation condition.
    Example: for a database system
      λ = 20 failures / million transactions
      tm = 6 minutes
      λ and tm must be expressed on a compatible scale to calculate A
      If there are 20,000 transactions/day and a day has 8 working
         hours (about 41.7 transactions per minute), then tm is equivalent to 250
         transactions, and
      A = 1 / (1 + λ·tm) = 1 / (1 + 0.005) ≈ 0.995

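The unit conversion and the availability calculation can be checked in a few lines, using the numbers from this slide:

```python
# Sketch: availability A = 1 / (1 + lambda * tm), with lambda and tm on a compatible scale.

failure_intensity = 20 / 1_000_000        # failures per transaction
transactions_per_day = 20_000
working_minutes_per_day = 8 * 60

tm_minutes = 6                                            # downtime per failure, in minutes
tm_transactions = tm_minutes * transactions_per_day / working_minutes_per_day
print(tm_transactions)                                    # 250.0 transactions

availability = 1 / (1 + failure_intensity * tm_transactions)
print(round(availability, 3))                             # 0.995
```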
Subject Domain-Oriented
    Measurement (SDO)
   When approaching the issue of object-oriented measurement,
    it was necessary to broaden the perspective from a single
    property to the many-featured characterization of objects.
    When considering subject-domain-oriented measurement, one must widen one's horizons once more, to consider the problem of characterizing by measurement any of the objects of a discipline.

    [Diagram: Single Property Measurement → Object-Oriented Measurement → Subject-Domain-Oriented Measurement, each step broadening the perspective.]
SDO Example: SI
   The international measurement system (SI) provides measures for characterizing observable entities of the physical universe by means of only seven orthogonal base units:
        1. length (metre)
        2. mass (kilogram)
        3. time (second)
         4. temperature (kelvin)
        5. electric current (ampere)
        6. luminous intensity (candela)
        7. the amount of substance (mole)
   Other physical quantities are measured in units derived from
    these, such as velocity measured in metre/second (m/s).


SDO for Software Engineering
   At present, software engineering is still in need of stronger
    discipline-wide conceptual foundations and more
    comprehensive formal theories and empirical laws.
   The subject domain is not ready for the formulation of a
    comprehensive measurement system.
   However, there is widespread recognition of the need for
    measurement-based characterization of software, stemming
    from:
        Discontent about the quality of software and the productivity of the
         industry.
        Anxiety about the safety of software-related systems.
        Concern about the security of data entrusted to software.
         The demand for convincing evidence of safety and security.



Software Measurement!




References
   [Bas92] V. R. Basili: Software modeling and measurement: The Goal/Question/Metric paradigm.
    Technical Report CS-TR-2956, Department of Computer Science, University of Maryland, College
    Park, MD 20742, September 1992.
   [BEM95] L. Briand, K. El Emam, S. Morasca: Theoretical and Empirical Validation of Software
    Product Measures. ISERN technical report 95-03, 1995.
   [BMB96] L. C. Briand, S. Morasca, and V. R. Basili: Property-Based Software Engineering
    Measurement. IEEE Transactions on Software Engineering, Vol.22, No.1, January 1996.
   [EM95] K. El Emam and N. H. Madhavji: Measuring the success of the requirements engineering
    process. Proceedings of the Second IEEE International Symposium on Requirements Engineering, pp.
    204-211, 1995.
   [GL89] D. Galletta and A. Lederer: Some cautions on the measurement of user information
    satisfaction. Decision Sciences, 20:419-438, 1989.
   [GPC02] M. Genero, M. Piattini, C. Calero: Empirical Validation of Class Diagram Metrics. ISESE
    2002: 195-203, 2002.
   [HK81] S. M. Henry, D. G. Kafura: Software Structure Metrics Based on Information Flow. IEEE
    Trans. Software Eng. 7(5): 510-518, 1981.
   [McC76] T. McCabe: A software complexity measure, IEEE Transactions on Software Engineering
    2(4):308-320, 1976.
   [McC81] J. McIver and E. Carmines: Unidimensional Scaling. Sage Publications, 1981.
   [Mor01] S. Morasca: Software Measurement, Handbook of Software Engineering and Knowledge
    Engineering, Vol. I, World Scientific Publishing, 2001.
References
   [MK93] J.C. Munson, T. M. Koshgoftaar: Measurement of Data Structures Complexity, Journal of
    Systems and Software, 20:217-225, 1993.
   [NASA/SEL] V. R. Basili: Software Modeling and Measurement: The Goal/Question/Metric Paradigm.
    Technical Report UNIACS-TR-92-96, University of Maryland, September 1992.
   [Nat92] National Aeronautics and Space Administration: Data Collection Procedures for the Software
    Engineering Laboratory (SEL) Database, March 1992.
   [Nat94] National Aeronautics and Space Administration: Software Measurement Guidebook. SEL-94-
    002. July 1994.
   [Nun78] J.Nunnaly: Psychometric Theory, McGraw Hill, 1978.
   [Osg67] C. Osgood, G. Suci, and P. Tannenbaum: The measurement of meaning. University of Illinois
    Press, 1967.
    [Spe76] P. Spector: Choosing response categories for summated rating scales. Journal of Applied
     Psychology, 61(3):374-375, 1976.
   [Spe80] P. Spector: Ratings of equal and unequal response choice intervals. The Journal of Social
    Psychology, 112:115-119, 1980.
   [Spe92] P. Spector: Summated Rating Scale Construction: An Introduction. Sage Publications, 1992.
   [Wey88] E. Weyuker: Evaluating Software Complexity Measures. IEEE Transactions on Software
    Engineering, 14(9):1357-1365, 1988.
   [Zus91] H. Zuse: Software Complexity: Measures and Methods, de Gruyter, 1991.

Arev05c icfca05lessonslearned
 
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERSA WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
 
LAK13 linkedup tutorial_evaluation_framework
LAK13 linkedup tutorial_evaluation_frameworkLAK13 linkedup tutorial_evaluation_framework
LAK13 linkedup tutorial_evaluation_framework
 
Chapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.pptChapter 3. Data Preprocessing.ppt
Chapter 3. Data Preprocessing.ppt
 
Lec10 matching
Lec10 matchingLec10 matching
Lec10 matching
 
Linear Regression Parameters
Linear Regression ParametersLinear Regression Parameters
Linear Regression Parameters
 
A Modified KS-test for Feature Selection
A Modified KS-test for Feature SelectionA Modified KS-test for Feature Selection
A Modified KS-test for Feature Selection
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdf
 
[ppt]
[ppt][ppt]
[ppt]
 
[ppt]
[ppt][ppt]
[ppt]
 

Kürzlich hochgeladen

Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxMichelleTuguinay1
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsPooky Knightsmith
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptxDhatriParmar
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 

Kürzlich hochgeladen (20)

Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young minds
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 

SENG 421 Software Metrics Chapter

  • 9. Measurement: Activities /2  Modeling  Developing mathematical (logical) representation of the entities and their attributes  Defining the formal relational system  Mapping the empirical relational system to the formal model  Verifying the results of measurement  Verifying whether the measurement results reflect the properties of the entities, attributes and relationships far@ucalgary.ca 9
  • 10. Empirical Relational System  E = {A, R, O}  A = {a, b, c, …, z} set of real world entities  k: property for each element of A  A = {a, b, c, …, z} is the model of A which describes each element of A in terms of the property k  Empirical observations defined on the model set A  Set of n-ary relations: R, e.g., X is taller than Y  Set of binary operations: O , e.g., X is tall far@ucalgary.ca 10
  • 11. Formal Relational System  F = {A’, R’, O’}  F should satisfy the following conditions:  Capable of expressing all relations and operations in E.  Support the meaningful conclusions from the data.  Mapping from E to F must represent all the observations, preserving all the relations and operations of E. far@ucalgary.ca 11
  • 12. Example 1 /1 Ranking 4 software products based on user preferences.  Problem definition: ranking the products (entities) A, B, C and D based on user preferences.  Scale: a single 0-100% linear scale.  Empirical relational system: represented by the table below; (A,B) = 80 means 80% of the users preferred product A to B. Example from Fenton's Book. far@ucalgary.ca 12
         A    B    C    D
    A    -    80   10   80
    B    20   -    5    50
    C    90   95   -    96
    D    20   50   4    -
  • 13. Example 1 /2  Modeling: valid pairs are those having a value greater than 60. If, for a pair (A,B), more than 60% of users prefer A to B, then A is "definitely better" than B: if M(x,y) > 60% then p(x) > p(y).  Formal relation system: if p(x) > p(y) and p(y) > p(z), then p(x) > p(z).  Verification: valid pairs are (C,A), (C,B), (C,D), (A,B) and (A,D); there is no conflict between the data collected and the model. far@ucalgary.ca 13
  • 14. Example 1 /3  Verification fails if the data was collected as shown below, because M(B,C) > 60% implies p(B) > p(C), M(C,A) > 60% implies p(C) > p(A), and M(A,B) > 60% implies p(A) > p(B).  There is a conflict between the real and formal model, and the model must be revised. far@ucalgary.ca 14
         A    B    C    D
    A    -    80   10   80
    B    20   -    95   50
    C    90   5    -    96
    D    20   50   4    -
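To make the verification step concrete, here is a minimal Python sketch (illustrative only, not part of the lecture material): it derives the "definitely better" relation from a preference matrix using the 60% threshold and searches for a cyclic triple, which is exactly the kind of conflict shown on slide 14.

from itertools import permutations

def better_pairs(pref, threshold=60):
    """Return the set of pairs (x, y) with M(x, y) > threshold."""
    return {(x, y) for x in pref for y in pref[x] if pref[x][y] > threshold}

def find_conflict(pairs):
    """Return a cyclic triple (x, y, z) if one exists, else None."""
    elements = {a for p in pairs for a in p}
    for x, y, z in permutations(elements, 3):
        if (x, y) in pairs and (y, z) in pairs and (z, x) in pairs:
            return (x, y, z)
    return None

# Consistent data (slide 12): valid pairs (C,A), (C,B), (C,D), (A,B), (A,D)
consistent = {
    "A": {"B": 80, "C": 10, "D": 80},
    "B": {"A": 20, "C": 5,  "D": 50},
    "C": {"A": 90, "B": 95, "D": 96},
    "D": {"A": 20, "B": 50, "C": 4},
}

# Conflicting data (slide 14): A > B, B > C and C > A form a cycle
conflicting = {
    "A": {"B": 80, "C": 10, "D": 80},
    "B": {"A": 20, "C": 95, "D": 50},
    "C": {"A": 90, "B": 5,  "D": 96},
    "D": {"A": 20, "B": 50, "C": 4},
}

print(find_conflict(better_pairs(consistent)))   # None: model verified
print(find_conflict(better_pairs(conflicting)))  # e.g. ('A', 'B', 'C'): model must be revised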
  • 15. Example 2 /1  Entity: software failure.  Attribute: criticality.  Three types of failure are observed: delayed response, incorrect output, data loss.  There are 3 failure classes in E: delayed response (R1), incorrect output (R2), data loss (R3). far@ucalgary.ca 15
  • 16. Example 2 /2  Measurement mapping Figure from Fenton’s Book far@ucalgary.ca 16
  • 17. Example 2 /3  Next we would like to add a new binary relation: x is more critical than y  Each data loss failure (x in R3) is more critical than incorrect output failure (y in R2) and delayed response failure (y in R1)  Each incorrect output failure (x in R2) is more critical than delayed response failure (y in R1)  The model should be revised to account for this new binary relation far@ucalgary.ca 17
  • 18. Example 2 /4  Revised measurement mapping (figure from Fenton's Book). far@ucalgary.ca 18
  • 19. Measurement: Properties /1  Completeness  Components of a measurement system: m = <attribute, scale, unit>  Attribute is what is being measured (e.g., size of a program)  Scale is the standard and scope of measurement (e.g., ratio)  Unit is the physical meaning of scale (e.g., a positive integer)  Uniqueness  A measurement result should be unique and match with the scale and units far@ucalgary.ca 19
  • 20. Measurement: Properties /2  Extendibility  Two or more formal measures mapping to the same entity are compatible using explicit compatibility relation (e.g., cm and inch used to measure length and 1 in = 2.54 cm)  Discrete Differentiability  The minimum unit of measurement scale is used to determine the differential rate of the measurement system. far@ucalgary.ca 20
  • 21. Measurement: Properties /3  Deterministic or Probabilistic  Measurement system should be either deterministic (i.e., lead to same results under same conditions) or probabilistic (e.g., productivity per hour, reliability)  Quantitative or Qualitative  Result of measurement is either quantitative (represented by number values) or qualitative (represented by qualitative values or range intervals) far@ucalgary.ca 21
  • 22. Measurement: Direct & Indirect  Direct measurement of an attribute of an entity involves no other attribute or entity. E.g., software length in terms of lines of code.  Indirect measurement is useful in making visible the interaction between direct measurements. E.g., productivity, software quality.  Indirect measurement is particularly helpful when the size of empirical measures is quite big (i.e., many relations and many entities) or the cost of direct measurement is high. far@ucalgary.ca 22
  • 23. Example: Direct & Indirect  In a software system, measuring number of faults (i.e., direct measurement) leads to identification of 5 problem areas.  However, measuring faults per KLOC (i.e., indirect measurement) leads to identification of only one problem area. Example from Fenton’s Book far@ucalgary.ca 23
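To make the contrast concrete, a toy Python sketch with invented module data (these are not the numbers behind Fenton's example): ranking by raw fault counts (direct) and by faults per KLOC (indirect) can point to different problem areas.

modules = {          # module: (faults found, size in KLOC); hypothetical values
    "m1": (40, 5.0),
    "m2": (38, 20.0),
    "m3": (37, 18.5),
    "m4": (12, 0.8),
}

by_faults = sorted(modules, key=lambda m: modules[m][0], reverse=True)
by_density = sorted(modules, key=lambda m: modules[m][0] / modules[m][1], reverse=True)

print("direct  (faults):      ", by_faults)    # ['m1', 'm2', 'm3', 'm4']
print("indirect (faults/KLOC):", by_density)   # ['m4', 'm1', 'm3', 'm2']: m4 stands out despite few faults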
  • 24. Measurement: Size of Empirical Set  Example: measuring popularity of 4 software products  E = {A, R, O}  |A| = 4  |R| = 3  r1: x is more popular than y  r2: x is less popular than y  r3: indifferent  The total number of possible measures is: 4 × (4 - 1) × 3 = 36  The size will grow when the number of elements and relations grow. far@ucalgary.ca 24
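The count on slide 24, written out as a quick Python check (illustrative only):

# With |A| = 4 entities and |R| = 3 relations, each ordered pair of
# distinct entities can be judged under each relation.
n_entities, n_relations = 4, 3
n_possible = n_entities * (n_entities - 1) * n_relations
print(n_possible)   # 36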
  • 26. Scale: Informal Definition  A measurement scale is a set of predefined symbols or values in order to represent certain common measures.  A scale is an abstract measurement tool for measuring defined common attributes of entities. far@ucalgary.ca 31
  • 27. Scale: Formal Definition  Assume that the following are given:  A = {a, b, c, …, z} set of real world entities  k: property for each element of A  A = {a, b, c, …, z} is the model of A which describes each element of A in terms of the property k  The empirical and formal relational systems E and F  m, which is a mapping A → A'  Then S = (E, F, m) is called the scale of measurement for the property k of the target set A  If F is defined over a subset of real numbers, within the scale, the measurement maps the key property of each object of the target set into a number. far@ucalgary.ca 32
  • 28. Scales (History)  In the early 1940s, the Harvard psychologist Stanley Smith Stevens coined the terms nominal, ordinal, interval, and ratio to describe a hierarchy of measurement scales and to classify statistical procedures according to the scales for which they are appropriate.  Stevens' scales were first presented in his 1946 article "On the theory of scales of measurement" [Stevens, 1946]. They have been adopted by most statistics textbooks and have influenced statistical practice to date. far@ucalgary.ca 33
  • 29. Scales (Summary)  A hierarchy of scale types, from least to most informative: Nominal, Ordinal, Interval, Ratio. far@ucalgary.ca 34
  • 30. Measurement Scale: Questions  How do we know if a scale appropriately represents the relationships between measured properties of entities? (representation problem)  Has to do with the validity of the measure  What do we do when we have several different scales for the same measure? (uniqueness problem)  Has to do with transformation between scales ( unit!)  For a defined scale, what arithmetic operations make sense for measurement values?  Has to do with meaningfulness of measurement-based statements
  • 31. Scale Types  Objective (regular) scales  Nominal, ordinal, interval, ratio, absolute  Subjective scales  Likert-Type scale (Evaluation-Type, Frequency-Type, Agreement-Type)  Semantic differential scale  Summative scale far@ucalgary.ca 36
  • 32. Regular Scales  The scale (E,F,m) is regular if and only if, for every other scale (E,F,g) and for all a and b in A, m(a) = m(b) implies g(a) = g(b).  This restates the uniqueness property across scales:  If two objects measured on one scale yield the same measure for the property k, the same holds on any other scale. far@ucalgary.ca 37
  • 33. Objective Scale: Types /1 Nominal Scales  Define classes or categories, and then place each entity in a particular class or category, based on the value of the attribute.  Properties:  The empirical relation system consists only of different classes; there is no notion of ordering among the classes.  Any distinct numbering or symbolic representation of the classes is an acceptable measure, but there is no notion of magnitude associated with the numbers or symbols.  Nominal-scale measurement places elements in a classification scheme. The classes are not ordered; even if the classes are numbered from 1 to n for identification, there is no implied ordering of the classes. far@ucalgary.ca 38
  • 34. Example: Nominal Scale  Classification of cars based on their colour.  Classification of faults in a software: specification fault, design fault, coding fault.  Entity: Car; Attribute: Colour.
    Mapping: C-C1 → 1 (White), C-C2 → 2 (Yellow), C-C3 → 3 (Red), C-C4 → 4 (Blue), C-C5 → 5 (Green), C-C6 → 6 (other).
    Measure (Car Colour) ∈ {"1", "2", "3", "4", "5", "6"} or {White, …, other} far@ucalgary.ca 39
  • 35. Objective Scale: Types /2 Ordinal Scales  The ordinal scale is useful to augment the nominal scale with information about an ordering of the classes or categories.  Properties:  The empirical relation system consists of classes that are ordered with respect to the attribute.  Any mapping that preserves the ordering (that is, any monotonic function) is acceptable.  The numbers represent ranking only, so addition, subtraction, and other arithmetic operations have no meaning. far@ucalgary.ca 40
  • 36. Example: Ordinal Scale  If M(x) > M(y) then M'(x) > M'(y).  Measuring complexity of many software modules by defining 5 complexity classes: Trivial, Simple, Moderate, Complex, Incomprehensible.  Entity: Defect; Attribute: Severity.
    Mapping: D-S1 → 1 (S1 Documentation), D-S2 → 2 (S2 Minor), D-S3 → 3 (S3 Major), D-S4 → 4 (S4 Critical).
    Measure (Defect Severity) ∈ {S1, …, S4} or {1, …, 4} far@ucalgary.ca 41
  • 37. Objective Scale: Types /3 Interval Scales  Interval scale carries more information than ordinal and nominal scale. It captures information about the size of the intervals that separate the classes, so that we can in some sense understand the size of the jump from one class to another.  Properties:  An interval scale preserves order, as with an ordinal scale.  An interval scale preserves differences but not ratios. That is, we know the difference between any two of the ordered classes in the range of the mapping, but computing the ratio of two classes in the range does not make sense.  Addition and subtraction are acceptable on the interval scale, but not multiplication and division. far@ucalgary.ca 42
  • 38. Example: Interval Scale  Admissible transformation: M' = aM + b.  Temperature ranges in Celsius and Fahrenheit.  Entity: Engine; Attribute: Temperature.
    E-T1: -20 °C / -4 °F, E-T2: -10 °C / 14 °F, E-T3: 0 °C / 32 °F, E-T4: 10 °C / 50 °F, E-T5: 20 °C / 68 °F, …
    Measure (Engine Temperature) ∈ [min, max]
     Project scheduling: Requirement analysis 3 weeks, Design 4 weeks, Coding 4 weeks; testing starts after coding is done. When does testing start? After 11 weeks. far@ucalgary.ca 43
  • 39. Objective Scale: Types /4 Ratio Scales  Sometimes we would like to be able to say that one liquid is twice as hot as another, or that one project took twice as long as another. This needs the ratio scale, which is the most useful scale of measurement, and quite common in the physical sciences.  Properties:  It is a measurement mapping that preserves ordering, preserves size of intervals between entities, and preserves ratios between entities.  There is a zero element, representing total lack of the attribute.  The measurement mapping must start at zero and increase at equal intervals, known as units.  All arithmetic can be meaningfully applied to the classes in the range of the mapping. far@ucalgary.ca 44
  • 40. Example: Ratio Scale  Measuring length, distance, etc.  Admissible transformation (ratios preserved): M' = aM.  Measuring execution time of a program.  Entity: Program; Attribute: Execution Time.
    P-E1: 0 → 0, P-E2: 0.001 → 1, P-E3: 0.002 → 2, P-E4: 0.003 → 3, P-E5: 0.004 → 4, P-E6: 0.005 → 5, …
    Measure (Progr. Exec. Time) ∈ [0, ∞) far@ucalgary.ca 45
  • 41. Objective Scale: Types /5 Absolute Scales  The absolute scale is the most restrictive of all. For any two measures, M and M', there is only one admissible transformation: the identity transformation.  Properties:  The measurement for an absolute scale is made simply by counting the number of elements in the entity set.  The attribute always takes the form “number of occurrences of x in the entity.”  There is only one possible measurement mapping.  All arithmetic analysis of the resulting count is meaningful. far@ucalgary.ca 46
  • 42. Example: Absolute Scale  Only admissible transformation: M' = M.  Number of failures observed in a module is absolute, but the reliability is not.  Number of people working on a project is absolute, but their productivity is not.  Entity: Module; Attribute: #Defects.
    M-D1: 0, M-D2: 1, M-D3: 2, M-D4: 3, M-D5: 4, M-D6: 5, …
    Measure (Module Defect Count) ∈ IN0 far@ucalgary.ca 47
  • 43. Scale Types (Summary)  Nominal scale: classification of objects, where the fact that objects are different is preserved  Ordinal scale: objects are ranked/ordered according to some criteria, but no information about the distance between the values is given  Interval scale: differences between values are meaningful  Ratio scale: there is a meaningful "zero" value, and ratios between values are meaningful  Absolute scale: no transformation (other than identity) is meaningful. Note: Measure (Attribute) is well-defined if scale and unit are clearly specified; specification of the unit makes the measure unambiguous!
  • 44. Measurement Scale Types [Mor01]
    Scale Type | Characterization | Example (generic) | Example (SE)
    Nominal | Divides the set of objects into categories, with no particular ordering among them | Labeling, classification | Name of programming language, name of defect type
    Ordinal | Divides the set of entities into categories that are ordered | Preference, ranking, difficulty | Ranking of failures (as measure of failure severity)
    Interval | Comparing the differences between values is meaningful | Calendar time, temperature (Fahrenheit, Celsius) | Beginning and end date of activities (as measures of time distance)
    Ratio | There is a meaningful "zero" value, and ratios between values are meaningful | Length, weight, time intervals, absolute temperature (Kelvin) | Lines of code (as measure of attribute "Program length/size")
    Absolute | There are no meaningful transformations of values other than identity | Object count | Count (as measure of attribute "Number of lines of code")
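For quick reference, a small Python sketch that encodes the table's view of admissible transformations and meaningful summary statistics per scale type (my own illustrative encoding in the spirit of Stevens' hierarchy, not code from the course):

SCALES = {
    # scale:    (admissible transformation,           meaningful central tendencies)
    "nominal":  ("any one-to-one relabeling",         ["mode"]),
    "ordinal":  ("any monotonic increasing mapping",  ["mode", "median"]),
    "interval": ("M' = a*M + b  (a > 0)",             ["mode", "median", "mean"]),
    "ratio":    ("M' = a*M      (a > 0)",             ["mode", "median", "mean", "geometric mean"]),
    "absolute": ("identity only (M' = M)",            ["mode", "median", "mean", "geometric mean"]),
}

def allowed_stats(scale):
    """Return the summary statistics that are meaningful on the given scale type."""
    return SCALES[scale.lower()][1]

print(allowed_stats("ordinal"))   # ['mode', 'median']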
  • 45. Measurement Scale – Summary  There are 5 types of measurement scales  The type of the measurement scale determines  how measurement data can be treated  whether statements involving measurement data are meaningful far@ucalgary.ca 50
  • 46. Commonly Used Subjective Measurement Scales  Likert-Type Scale  Evaluation-Type  Frequency-Type  Agreement-Type  Semantic Differential Scale  Summative Scale
  • 47. Likert Type Scales /1  Evaluation-type [Spe92]  Example:  Familiarity with and comprehension of the software development environment (e.g., compiler, code generator, CASE tools):  Little  Unsatisfactory  Satisfactory  Excellent
  • 48. Likert Type Scales /2  Frequency-type [Spe92]  Example:  Customers provided information to the project team  e.g., during interviews, when given questionnaires by the project staff, when presented with a “system walkthrough”, and/or when they are asked to provide feedback on a prototype:  Never  Rarely  Occasionally  Most of the time
  • 49. Likert Type Scales /3  Agreement-type [Spe92]  Example:  The tasks supported by the software at the customer site were undergoing numerous changes during the project:  Strongly Agree  Agree  Disagree  Strongly Disagree
  • 50. Semantic Differential Scale  Items which include semantic opposites  Example:  Processing of requests for changes to existing systems: the manner, method, and required time with which the MIS staff responds to user requests for changes in existing computer-based information systems or services. Slow □□□□□□□ Fast Timely □ □ □ □ □ □ □ Untimely
  • 51. Assigning Numbers to Scale Responses /1  Likert-Type Scales:  Ordinal Scale  But: often the distances between the four response categories are approximately (conceptually) equidistant [Spe76][Spe80] and thus are treated like approximate interval scales: Strongly Agree → 1, Agree → 2, Disagree → 3, Strongly Disagree → 4
  • 52. Assigning Numbers to Scale Responses /2  Semantic Differential Scale:  Ordinal scale, but again, often treated as interval scales Slow □ □ □ □ □ □ □ Fast 1 2 3 4 5 6 7
  • 53. Assigning Numbers to Scale Responses /3  Summative Scale:  A summative scale is constructed by forming an unweighted sum of n items: Y = X1 + X2 + … + Xn  This is commonly considered to be an approximately interval scale, but controversial (see [McC81][GL89] for two different views)  If the X items are on a Likert-type scale then there is a good empirical basis for treating this as an approximately interval scale
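A minimal Python sketch of a summative scale built from Likert-type items, assuming the 4-point agreement coding from slide 51 (Strongly Agree = 1 through Strongly Disagree = 4); the questionnaire responses below are invented:

CODING = {"Strongly Agree": 1, "Agree": 2, "Disagree": 3, "Strongly Disagree": 4}

def summative_score(responses):
    """Unweighted sum Y = X1 + X2 + ... + Xn of the coded item responses."""
    return sum(CODING[r] for r in responses)

one_respondent = ["Agree", "Strongly Agree", "Disagree", "Agree"]
print(summative_score(one_respondent))   # 2 + 1 + 3 + 2 = 8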
  • 54. Exercise 1 Determine the best “scale” for each of the following measurements using each of the Nominal, Ordinal, Interval and Ratio scales only once.  Measuring execution time of a program Ratio  Classification of objects based on their color Nominal  Measuring duration of various phases of projects (Project scheduling) Interval  Measuring complexity of many software modules by defining 4 complexity classes (Trivial, Simple, Moderate, Complex) Ordinal  Measuring complexity of many software modules by defining cyclomatic complexity metrics far@ucalgary.ca 59
  • 55. Exercise 1 (cont’d)  Software products categorized according to their compatibility with the operating system (i.e. Windows, Linux, MacOS, DOS, etc.). Nominal  Internet services categorized according to their relevant technologies (i.e. dial-up, DSL, high-speed, wireless, etc.) Nominal  Measuring attitudes towards an Internet service (say, on an n-point rating, n = 0 to 10). Ordinal far@ucalgary.ca 60
  • 56. Exercise 1 (cont’d)  Measuring attitude towards Internet services, if the evaluation scores are numerically meaningful so the difference between a rate of 3 and a rate of 6 is exactly the same as the difference between a rate of 7 and a rate of 10. Interval  Measuring attitude towards Internet services, if the differences between the data values are numerically meaningful and equal. In addition, a score of zero implies either the full absence of the service being evaluated or the full dissatisfaction with it. Ratio far@ucalgary.ca 61
  • 57. Exercise 2  Suppose that you are asked to study various software development tools and recommend the best three to your company. The following table shows a list of available development tools. far@ucalgary.ca 62
  • 58. Exercise 2 (cont'd)
    Tool Name / Vendor | Languages Supported | Platforms | Features
    Bean Machine / IBM | Java | Windows, OS2, Unix | Best: Visual applet and JavaBean generation
    CodeWarrior Pro / Metrowerks | Java, C, C++, Pascal | Unix, Windows, Mac | Best: if you need to support Unix, Windows, and Mac platforms
    Java Workshop / Sun Microsystems | Java | Solaris, Windows | Better: written 100% in Java; tools based on a web browser metaphor
    JBuilder / Imprise | Java | Windows, AS400 | Better: database support
    Visual Cafe for Java / Symantec | Java | Windows | Good: multithreaded debugger
    VisualAge / IBM | Java | Unix, Windows | Good: includes incremental compiler and automatic version control
    Visual J++ / Microsoft | Java | Windows | Fair: all the bells and whistles for Windows
    far@ucalgary.ca 63
  • 59. Exercise 2 (cont'd)  What are the entities, attributes and their values in your model?
    Entity | Attribute | Value
    Development Tool | Language supported | Java, C, C++, Pascal
    Development Tool | Platform | Win, Unix, Mac, OS2, AS400
    Development Tool | Feature | Fair, Good, Better, Best
    far@ucalgary.ca 64
  • 60. Exercise 2 (cont'd)  What is the best scale for each of the attributes you defined?
    Entity | Attribute | Value | Scale
    Development Tool | Language supported | Java, C, C++, Pascal | Nominal
    Development Tool | Platform | Win, Unix, Mac, OS2, AS400 | Nominal
    Development Tool | Feature | Fair, Good, Better, Best | Ordinal
    far@ucalgary.ca 65
  • 61. Are the following statements meaningful? 1. Peter is twice as tall as Hermann 2. Peter’s temperature is 10% higher than Hermann’s 3. Defect X is more severe than defect Y 4. Defect X is twice as severe as defect Y 5. The cost for correcting defect X is twice as high as the cost for correcting defect Y 6. The average temperature of city A (15 ºC) is twice as high as the average temperature of city B (30 ºC) 7. Project Milestone 3 (end of coding) took ten times longer than Project Milestone 0 (project start) 8. Coding took as long as requirements analysis
  • 62. Are the following statements meaningful?
    1. "Peter is twice as tall as Hermann": interval*, no*
    2. "Peter's temperature is 10% higher than Hermann's": interval*, no*
    3. "Defect X is more severe than defect Y": ordinal, yes
    4. "Defect X is twice as severe as defect Y": ordinal, no
    5. "The cost for correcting defect X is twice as high as the cost for correcting defect Y": ratio, yes
    6. "The average temperature of city A (30 ºC) is twice as high as the average temperature of city B (15 ºC)": interval, no
    7. "Project Milestone 3 (end of coding) took ten times longer than Project Milestone 0 (project start)": interval, no
    8. "Coding took as long as requirements analysis": interval, yes
  • 64. Measurement vs. Prediction  Measurement systems are used to assess existing entities by numerically characterizing one or more of its attributes.  Prediction systems are used to predict some attributes of a future entity, involving a mathematical model with associated prediction procedures.  Deterministic prediction system: The same output will be generated for a given input.  Stochastic prediction system: The output for a given input will vary probabilistically. E.g., weather forecast. far@ucalgary.ca 69
  • 65. Measurement Validation  Validation question: Does the measure used capture the information that it was intended for?  A measure is valid if it accurately characterizes the attribute it claims to measure.  A prediction system is valid if it makes accurate predictions. far@ucalgary.ca 70
  • 66. Validating Measures  Definition: The process of ensuring that the measure is a proper numerical characterization of the claimed attribute by showing that the representation condition is satisfied.  Example: Measuring program length  Any measure of length should satisfy conditions such as: the length of joint programs should be m(p1 ; p2) = m(p1) + m(p2); if the length of p1 is greater than that of p2, any measure of length should satisfy m(p1) > m(p2) far@ucalgary.ca 71
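A tiny Python sketch of this representation-condition check, assuming a simple line-count measure m (illustrative only; "number of lines" is just one candidate length measure):

def m(program: str) -> int:
    """Length measure: count the lines of the program text."""
    return len(program.splitlines())

p1 = "x = 1\ny = 2\n"        # p1 ends with a newline, so concatenation is clean
p2 = "print(x + y)\n"

assert m(p1 + p2) == m(p1) + m(p2)   # additivity under concatenation: m(p1 ; p2) = m(p1) + m(p2)
assert m(p1) > m(p2)                 # p1 is empirically longer, and the measure preserves that order
print(m(p1), m(p2), m(p1 + p2))      # 2 1 3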
  • 67. Validating Prediction Systems  Definition: The process of establishing the accuracy of the prediction system by empirical means, i.e., by comparing model performance with known data in the given environment.  Example: Software Reliability  Using tools such as CASRE to fit the failure data into the reliability model.  Accuracy of estimation of the failure intensity λ depends on the number of failures experienced (i.e., the sample size).  Good results in estimating failure intensity are generally experienced for programs with 5,000 or more developed source lines. far@ucalgary.ca 72
  • 68. Classes of Prediction Systems  Class 1: Using internal attribute measures of early life-cycle products to predict measures of internal attributes of later life-cycle products.  Example: measures of size, modularity, and reuse of a specification are used to predict size of the final code.  Class 2: Using early life-cycle process attribute measures and resource-attribute measures to predict measures of attributes of later life-cycle processes and resources.  Example: the number of faults found during formal design review is used to predict the cost of implementation. far@ucalgary.ca 73
  • 69. Classes of Prediction Systems  Class 3: Using internal product-attribute measures to predict process attributes.  Example: measures of structuredness are used to predict time to perform some maintenance task, or number of faults found during unit testing.  Class 4: Using process measures to predict later process measures.  Example: measures of failures during one operational period are used to predict likely failure occurrences in a subsequent operational period. In examples like this, where an external product attribute (reliability) is effectively defined in terms of process attributes (operational failures), we may also think of this class of prediction systems as using process measures to predict later external product measures. far@ucalgary.ca 74
  • 70. Accuracy of Measurement /1  In an ongoing project, it is possible to validate the estimate by comparing actual values with predicted values. If E is an estimate of a value and A is the actual value, the relative error (RE) in the estimate is: RE = (A - E) / A  Relative error will be negative (if the estimate is greater than the actual value) or positive (if the estimate is less than the actual value).  Inaccuracies can result from either statistical variation or estimating bias.  Estimating inaccuracies must be compensated for throughout the project. far@ucalgary.ca 76
  • 71. Accuracy of Measurement /2  Relative Error (RE) can be calculated for a collection of estimates, e.g., it is useful to determine if predictions are accurate for a group of projects.  The mean relative error (MRE) is calculated as the sum of RE for n projects divided by n. It is possible for small relative errors to balance out large ones, so the magnitude of the error can also be considered.  The mean magnitude of relative error (MMRE) is calculated using the absolute value of RE. far@ucalgary.ca 77
  • 72. Accuracy of Measurement /3  MMRE is used to define prediction quality.  On a set of n projects, if k is the number of projects whose |RE| is less than or equal to q, then prediction quality is calculated as: PRED(q) = k/n  Example:  PRED(0.25) = 0.47 means that 47% of the predicted values fall within 25% of their actual values.  An estimation technique is (usually) acceptable if PRED(0.25) >= 0.75. far@ucalgary.ca 78
  • 73. Exercise  Assume you want to predict the error proneness of software modules. a) How would you assess prediction accuracy? RE (relative error) = (A - E) / A, with A: actual, E: estimated; MRE = (1/n) ∑ REi (i = 1..n); MMRE = (1/n) ∑ |REi| (i = 1..n). b) How would you assess prediction quality? PRED(q) = k/n, where k is the number of predictions with |RE| <= q. far@ucalgary.ca 79
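As a worked example, a short Python sketch of these accuracy measures; the actual and estimated values below are invented:

def relative_errors(actuals, estimates):
    """RE = (A - E) / A for each actual/estimate pair."""
    return [(a - e) / a for a, e in zip(actuals, estimates)]

def mre(res):
    """Mean relative error (signed errors may cancel out)."""
    return sum(res) / len(res)

def mmre(res):
    """Mean magnitude of relative error."""
    return sum(abs(r) for r in res) / len(res)

def pred(res, q=0.25):
    """PRED(q): fraction of predictions whose |RE| is <= q."""
    return sum(1 for r in res if abs(r) <= q) / len(res)

actual    = [100, 80, 120, 60]      # e.g. actual effort in person-days (hypothetical)
estimated = [ 90, 95, 110, 40]

res = relative_errors(actual, estimated)
print([round(r, 3) for r in res])                            # [0.1, -0.188, 0.083, 0.333]
print(round(mre(res), 3), round(mmre(res), 3), pred(res))    # 0.082 0.176 0.75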
  • 74. Software Measurement: Object-Oriented & Subject-Domain-Oriented Measurement
  • 75. Object-Oriented Measurement  Measurement of a single property is seldom adequate for the purpose of software measurement because:  Direct measurement of a property may be costly, inconvenient, perhaps even impossible. Measured results may be better achieved by observing one or more other properties and deriving the measurement of the required property from them.  Measurement is seldom focused on a single property. It may be necessary to characterize a class of objects by the combination of several properties in an object profile. far@ucalgary.ca 81
  • 76. Definition: Object Profile  Profile is a set of disjoint alternatives called “elements” together with their values or occurrence probabilities.  Object Profile is the set of properties and their values or probabilities of occurrences.  Representing object profile  Tabular representation  Graphical representation far@ucalgary.ca 82
  • 77. Tabular Representation  Properties 1~n are directly measurable  Properties n+1~q are measured indirectly  Such a matrix stores properties which have already been found important, provides information for deriving new indirectly measured properties, and offers a framework for adding new properties when needed.
    Objects of the model set A | m1 | m2 | … | mn | … | mq
    a | m1(a) | m2(a) | … | mn(a) | … | mq(a)
    b | m1(b) | m2(b) | … | mn(b) | … | mq(b)
    c | m1(c) | m2(c) | … | mn(c) | … | mq(c)
    … | … | … | … | … | … | …
    z | m1(z) | m2(z) | … | mn(z) | … | mq(z)
    far@ucalgary.ca 83
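A small Python sketch of such a property matrix (objects and measures invented for illustration): the direct columns are lines of code and fault count, and a derived column, faults per KLOC, is added as an indirectly measured property.

objects = {
    # object: {directly measured properties}
    "a": {"loc": 1200, "faults": 6},
    "b": {"loc":  300, "faults": 3},
    "c": {"loc": 5000, "faults": 10},
}

def add_derived(profile):
    """Append an indirectly measured property (fault density) to each object's row."""
    for row in profile.values():
        row["faults_per_kloc"] = row["faults"] / (row["loc"] / 1000)
    return profile

for name, row in add_derived(objects).items():
    print(name, row)   # each row is the object's profile over all properties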
  • 78. OO Measurement /1  Let Ai = {ai, bi, … , zi} be a model of A with respect to the ith key property.  Then the empirical relational system Ei = {Ai, Ri, Oi} models the set A with respect to the ith property, preserving the relations and operations which the property imposes on the objects of A.  Si = (Ei, Fi, mi) will be a scale for the ith property.  Now the set A will be an n-dimensional model of A: A = {A1, A2, … , An}, each with its own scale. far@ucalgary.ca 84
  • 79. OO Measurement /2  Assume that M = {m1, m2, … , mn} is the set of the real-valued functions on A, composed of measures of one of the directly measured key properties.  If one can define a real-valued function g on A in terms of the primitive measures of M, then g will be an indirect measure of the set of objects of A in terms of the chosen set of key properties, shown by g(m1, m2, … , mn). far@ucalgary.ca 85
  • 80. Meaningfulness  Indirect measures must preserve meaningful empirical properties called interrelation conditions which permit one to deduce the meaningful indirect measure as a function of the primitive measures, according to the laws imposed by the model.  Example:  Measuring three direct attributes: power (P), voltage (V) and current (I)  Measuring power (indirect) in terms of voltage (direct) and current (direct) attributes  They should satisfy the condition P=V x I far@ucalgary.ca 86
  • 81. Example  Setting failure intensity objective (λ) in terms of system availability (A): A = 1 / (1 + λ tm), or equivalently λ = (1 - A) / (A tm), where λ is failure intensity (failures per unit of time) and tm is downtime per failure.  The representation condition of the indirect measure of λ should preserve this relationship. far@ucalgary.ca 87
  • 82. Scale Invariance  The scale of any indirect measure is deduced by means of the interrelation condition.  Example: for a database system, λ = 20 / million transactions and tm = 6 minutes; λ and tm must have a compatible scale to calculate A. If there are 20,000 transactions/day and a day has 8 working hours, then tm can be expressed as equivalent to 250 transactions, and A = 1 / (1 + λ tm) = 1 / (1 + 0.005) ≈ 0.995 far@ucalgary.ca 88
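The same calculation as a short Python sketch (the numbers are the ones on the slide; the variable names are my own):

failures_per_million_tx = 20
downtime_minutes_per_failure = 6
tx_per_day = 20_000
working_hours_per_day = 8

lam = failures_per_million_tx / 1_000_000            # failure intensity per transaction
tx_per_minute = tx_per_day / (working_hours_per_day * 60)
t_m = downtime_minutes_per_failure * tx_per_minute   # downtime expressed in transactions

availability = 1 / (1 + lam * t_m)
print(round(t_m), round(availability, 3))             # 250  0.995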
  • 83. Subject Domain-Oriented Measurement (SDO)  When approaching the issue of object-oriented measurement, it was necessary to broaden the perspective from a single property to the many-featured characterization of objects.  When considering subject domain-oriented measurement, one must widen one's horizons once more, to consider the problem of characterizing by measurement any of the objects of a discipline. (Diagram: Single-Property Measurement → broaden perspective → Object-Oriented Measurement → broaden perspective → Subject-Domain-Oriented Measurement) far@ucalgary.ca 89
  • 84. SDO Example: SI  The international measurement system (SI) provides measures for characterizing observable entities of the physical universe by means of only seven orthogonal base units:  1. length (metre)  2. mass (kilogram)  3. time (second)  4. thermodynamic temperature (kelvin)  5. electric current (ampere)  6. luminous intensity (candela)  7. the amount of substance (mole)  Other physical quantities are measured in units derived from these, such as velocity measured in metre/second (m/s). far@ucalgary.ca 90
  • 85. SDO for Software Engineering  At present, software engineering is still in need of stronger discipline-wide conceptual foundations and more comprehensive formal theories and empirical laws.  The subject domain is not ready for the formulation of a comprehensive measurement system.  However, there is widespread recognition of the need for measurement-based characterization of software, stemming from:  Discontent about the quality of software and the productivity of the industry.  Anxiety about the safety of software-related systems.  Concern about the security of data entrusted to software.  The need for convincing evidence of safety and security. far@ucalgary.ca 91
  • 86. Software Measurement! far@ucalgary.ca 93
  • 87. References  [Bas92] V. R. Basili: Software modeling and measurement: The Goal/Question/Metric paradigm. Technical Report CS-TR-2956, Department of Computer Science, University of Maryland, College Park, MD 20742, September 1992.  [BEM95] L. Briand, K. El Emam, S. Morasca: Theoretical and Empirical Validation of Software Product Measures. ISERN technical report 95-03, 1995.  [BMB96] L. C. Briand, S. Morasca, and V. R. Basili: Property-Based Software Engineering Measurement. IEEE Transactions on Software Engineering, Vol.22, No.1, January 1996.  [EM95] K. El Emam and N. H. Madhavji: Measuring the success of the requirements engineering process. Proceedings of the Second IEEE International Symposium on Requirements Engineering, pp. 204-211, 1995.  [GL89] D. Galletta and A. Lederer: Some cautions on the measurement of user information satisfaction. Decision Sciences, 20:419-438, 1989.  [GPC02] M. Genero, M. Piattini, C. Calero: Empirical Validation of Class Diagram Metrics. ISESE 2002: 195-203, 2002.  [HK81] S. M. Henry, D. G. Kafura: Software Structure Metrics Based on Information Flow. IEEE Trans. Software Eng. 7(5): 510-518, 1981.  [McC76] T. McCabe: A software complexity measure, IEEE Transactions on Software Engineering 2(4):308-320, 1976.  [McC81] J. McIver and E. Carmines: Unidimensional Scaling. Sage Publications, 1981.  [Mor01] S. Morasca: Software Measurement, Handbook of Software Engineering and Knowledge Engineering, Vol. I, World Scientific Publishing, 2001.
  • 88. References  [MK93] J.C. Munson, T. M. Koshgoftaar: Measurement of Data Structures Complexity, Journal of Systems and Software, 20:217-225, 1993.  [NASA/SEL] V. R. Basili: Software Modeling and Measurement: The Goal/Question/Metric Paradigm. Technical Report UNIACS-TR-92-96, University of Maryland, September 1992.  [Nat92] National Aeronautics and Space Administration: Data Collection Procedures for the Software Engineering Laboratory (SEL) Database, March 1992.  [Nat94] National Aeronautics and Space Administration: Software Measurement Guidebook. SEL-94-002. July 1994.  [Nun78] J. Nunnally: Psychometric Theory, McGraw Hill, 1978.  [Osg67] C. Osgood, G. Suci, and P. Tannenbaum: The measurement of meaning. University of Illinois Press, 1967.  [Spe76] P. Spector: Choosing response categories for summated rating scales. Journal of Applied Psychology, 61(3):374-375, 1976.  [Spe80] P. Spector: Ratings of equal and unequal response choice intervals. The Journal of Social Psychology, 112:115-119, 1980.  [Spe92] P. Spector: Summated Rating Scale Construction: An Introduction. Sage Publications, 1992.  [Wey88] E. Weyuker: Evaluating Software Complexity Measures. IEEE Transactions on Software Engineering, 14(9):1357-1365, 1988.  [Zus91] H. Zuse: Software Complexity: Measures and Methods, de Gruyter, 1991.