12. STATISTICAL PARAMETERS
Dispersion (also called Variability, Scatter, Spread)
It is the extent to which a distribution is stretched
or squeezed.
Common examples of statistical dispersion are the
variance, standard deviation, and interquartile
range.
Coefficient of Dispersion (COD)
It is a measure of spread that describes the amount of
variability relative to the mean; it is unitless.
$$\mathrm{COD} = \frac{\sigma}{\mu} \times 100$$
where $\sigma$ is the standard deviation and $\mu$ is the mean.
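A minimal NumPy sketch of the formula (the data values are made up purely for illustration):

```python
import numpy as np

# Hypothetical measurements (values chosen only to illustrate the formula)
x = np.array([12.0, 15.0, 11.0, 14.0, 13.0])

sigma = x.std()         # population standard deviation
mu = x.mean()           # mean
cod = sigma / mu * 100  # coefficient of dispersion, in percent

print(f"COD = {cod:.1f}%")
```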
13. Variance:
It is the expectation of the squared deviation of a
random variable from its mean; informally, it
measures how far a set of numbers is spread out
from the mean.
It is calculated by taking the difference between each
number in the set and the mean, squaring the
differences (to make them positive), and dividing the
sum of the squares by the number of values in the set.
The variance provides the user with a numerical
measure of the scatter of the data.
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{\sum X^2}{N} - \mu^2, \qquad \mu\ (\text{mean}) = \frac{\sum X}{N}$$
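A short sketch verifying that the two forms of the formula agree (illustrative data):

```python
import numpy as np

x = np.array([12.0, 15.0, 11.0, 14.0, 13.0])  # illustrative data
n = len(x)
mu = x.sum() / n                              # mean = sum(X) / N

var_definition = ((x - mu) ** 2).sum() / n    # sum((X - mu)^2) / N
var_shortcut = (x ** 2).sum() / n - mu ** 2   # sum(X^2)/N - mu^2

print(var_definition, var_shortcut, np.var(x))  # all three agree
```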
14. Standard Deviation (SD) σ
It is a measure used to quantify the amount of
variation or dispersion of a set of data values.
It is a number that tells how measurements for a group
are spread out from the average (mean) or expected
value.
A low standard deviation means most of the numbers
are very close to the average, while a high value
indicates that the data are spread out.
The SD provides the user with a numerical measure
of the scatter of the data.
$$\sigma = \sqrt{\frac{1}{N} \sum (X - \mu)^2}$$
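A one-line check of the formula against NumPy's built-in population standard deviation, using the same illustrative data as above:

```python
import numpy as np

x = np.array([12.0, 15.0, 11.0, 14.0, 13.0])   # illustrative data
sigma = np.sqrt(((x - x.mean()) ** 2).mean())  # sqrt(1/N * sum((X - mu)^2))
print(sigma, x.std())                          # matches NumPy's population SD
```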
15. Root Mean Squared Error
(RMSE)
It is also termed the Root Mean Square Deviation
(RMSD).
It is used to measure the differences between values
(sample or population values) predicted by a model
or an estimator and the values actually observed.
$$\mathrm{RMSE} = \sqrt{\frac{\sum (X_{\mathrm{observed}} - X_{\mathrm{modelled}})^2}{N}}$$
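A minimal sketch of the RMSE computation; the observed and modelled values below are made-up numbers, not data from the slides:

```python
import numpy as np

# Hypothetical observed vs. model-predicted values (illustrative only)
observed = np.array([2.1, 3.9, 6.2, 7.8])
modelled = np.array([2.0, 4.0, 6.0, 8.0])

rmse = np.sqrt(((observed - modelled) ** 2).mean())
print(f"RMSE = {rmse:.3f}")
```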
16. Absolute Error (AE)
It is the magnitude of the difference between the
exact value and the approximation.
The relative error is the absolute error divided by the
magnitude of the exact value.
$$\mathrm{AE} = \lvert X_{\mathrm{measured}} - X_{\mathrm{actual}} \rvert$$
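A small sketch of absolute and relative error for a single hypothetical measurement (the values are assumptions for illustration):

```python
# Absolute and relative error for one measurement (illustrative values)
x_actual = 9.81      # exact value
x_measured = 9.75    # approximation

ae = abs(x_measured - x_actual)  # absolute error: magnitude of the difference
re = ae / abs(x_actual)          # relative error: AE / magnitude of exact value

print(f"AE = {ae:.3f}, relative error = {re:.2%}")
```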
17. Mean Square Error (MSE)
It is also termed the Mean Square Deviation (MSD).
It measures the average of the squares of the errors
or deviations, i.e. the difference between the estimator
and what is estimated.
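The slide gives no formula; written consistently with the RMSE slide above, the MSE would be

$$\mathrm{MSE} = \frac{1}{N} \sum (X_{\mathrm{observed}} - X_{\mathrm{modelled}})^2 = \mathrm{RMSE}^2$$

so the RMSE is simply the square root of the MSE.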
18. Factor Analysis
It is a useful tool for investigating relationships
among variables underlying complex concepts. It
allows researchers to study concepts that are not
easily measured directly by collapsing a large number
of observed variables into a few interpretable underlying factors.
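A minimal sketch of this idea using scikit-learn's FactorAnalysis; the simulated data and the choice of two factors are assumptions made for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 6 observed variables driven by 2 hidden factors plus noise
# (this data-generating setup is an assumption for illustration)
factors = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = factors @ loadings + 0.3 * rng.normal(size=(200, 6))

# Collapse the 6 observed variables into 2 underlying factors
fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(X)

# Each row holds one factor's loadings on the 6 observed variables
print(fa.components_.round(2))
```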
32. Confidence region
• In statistics, a confidence region is a multi-dimensional
generalization of a confidence interval. It is a set of points in
an n-dimensional space, often represented as an ellipsoid
around a point which is an estimated solution to a problem,
although other shapes can occur.
• Interpretation
• The confidence region is calculated in such a way that if a set
of measurements were repeated many times and a confidence
region calculated in the same way on each set of
measurements, then a certain percentage of the time (e.g. 95%)
the confidence region would include the point representing the
"true" values of the set of variables being estimated
33. Nonlinear problems
• Confidence regions can be defined for any probability
distribution. The experimenter can choose the significance
level and the shape of the region, and then the size of the
region is determined by the probability distribution. A natural
choice is to use as a boundary a set of points with
constant $\chi^2$ (chi-squared) values.
• One approach is to use a linear approximation to the nonlinear
model, which may be a close approximation in the vicinity of
the solution, and then apply the analysis for a linear problem to
find an approximate confidence region. This may be a
reasonable approach if the confidence region is not very large
and the second derivatives of the model are also not very large.
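A sketch of this linear-approximation approach under assumed conditions (the exponential-decay model, the synthetic data, and the 95% level are all choices made for illustration): fit the nonlinear model, then use the Jacobian at the solution to approximate the parameter covariance.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import chi2

# Assumed exponential-decay model and synthetic data (illustrative only)
t = np.linspace(0, 5, 30)
rng = np.random.default_rng(1)
y = 3.0 * np.exp(-0.7 * t) + 0.05 * rng.normal(size=t.size)

def residuals(p):
    a, k = p
    return a * np.exp(-k * t) - y

fit = least_squares(residuals, x0=[1.0, 1.0])

# Linear approximation: cov(p) ~ s^2 * (J'J)^{-1} near the solution
J = fit.jac
dof = t.size - 2
s2 = (fit.fun ** 2).sum() / dof
cov = s2 * np.linalg.inv(J.T @ J)

# Approximate 95% region: (p - p_hat)' cov^{-1} (p - p_hat) <= c
c = chi2.ppf(0.95, df=2)
print("estimate:", fit.x.round(3))
print("covariance:\n", cov.round(5))
```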
34. Nonlinearity at the Optimum
• It is useful to study the degree of nonlinearity of our model in
a neighbourhood of the forecast.
• Briefly, there exist methods for assessing the maximum degree
of intrinsic nonlinearity that the model exhibits around the
optimum found. If the maximum nonlinearity is excessive, the
confidence regions obtained by applying the results of the
classical theory are not to be trusted for one or more parameters.
In this case, alternative simulation procedures may be employed
to provide empirical confidence regions.
35. SENSITIVITY ANALYSIS
• Sensitivity analysis is the study of how the uncertainty in the output of a
mathematical model or system (numerical or otherwise) can be divided and
allocated to different sources of uncertainty in its inputs. A related practice
is uncertainty analysis, which has a greater focus on uncertainty
quantification and propagation of uncertainty; ideally, uncertainty and
sensitivity analysis should be run in tandem.
• The process of recalculating outcomes under alternative assumptions to
determine the impact of a variable under sensitivity analysis can be useful
for a range of purposes including:
36.
• Testing the robustness of the results of a model or system in the presence of
uncertainty.
• Increased understanding of the relationships between input and output
variables in a system or model.
• Uncertainty reduction, through the identification of model inputs that cause
significant uncertainty in the output and should therefore be the focus of
attention in order to increase robustness (perhaps by further research).
• Searching for errors in the model (by encountering unexpected
relationships between inputs and outputs).
• Model simplification – fixing model inputs that have no effect on the
output, or identifying and removing redundant parts of the model structure.
• Enhancing communication from modellers to decision makers
(e.g. by making recommendations more credible, understandable,
compelling or persuasive).
37. Settings and constraints
• The choice of method of sensitivity analysis is typically dictated by a
number of problem constraints or settings. Some of the most common are:
• Computational expense: sensitivity analysis is almost always performed by
running the model a (possibly large) number of times, i.e. a sampling-based
approach; this becomes a constraint when a single model run is time-consuming.
• High dimensionality: the model may have a large number of uncertain
inputs. Sensitivity analysis is essentially the exploration of the
multidimensional input space, which grows exponentially in size with the
number of inputs (see the curse of dimensionality).
38.
• Correlated inputs: Most common sensitivity analysis methods assume
independence between model inputs, but sometimes inputs can be strongly
correlated. This is still an immature field of research and definitive methods
have yet to be established.
• Nonlinearity: Some sensitivity analysis approaches, such as those based on
linear regression, can inaccurately measure sensitivity when the model
response is nonlinear with respect to its inputs. In such cases, variance-
based measures are more appropriate.
39. Sensitivity analysis methods
• There are a large number of approaches to performing a sensitivity
analysis, many of which have been developed to address one or more
of the constraints discussed above. They are also distinguished by the
type of sensitivity measure, for example whether it is based on
variance decomposition.
• Regression analysis, in the context of sensitivity analysis, involves
fitting a linear regression to the model response and using
standardized regression coefficients as direct measures of sensitivity
(see the sketch below).
• Variance-based methods
• Variance-based methods are a class of probabilistic approaches which quantify
the input and output uncertainties as probability distributions, and
decompose the output variance into parts attributable to input
variables and combinations of variables.
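A minimal sketch of the regression-based approach (the three-input model and the input distributions are assumptions chosen for illustration): sample the inputs, run the model, fit a linear regression, and scale the coefficients into standardized regression coefficients (SRCs). A variance-based analysis would instead decompose var(Y) into per-input contributions, e.g. via Sobol indices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed model with three uncertain inputs (illustrative only)
def model(x1, x2, x3):
    return 4.0 * x1 + 2.0 * x2 + 0.5 * x3

# Sample the inputs and run the model (sampling-based approach)
X = rng.normal(size=(1000, 3))
Y = model(X[:, 0], X[:, 1], X[:, 2])

# Fit a linear regression Y ~ X (with intercept) by least squares
A = np.column_stack([np.ones(len(Y)), X])
coef = np.linalg.lstsq(A, Y, rcond=None)[0][1:]

# Standardized regression coefficients: b_i * std(X_i) / std(Y)
src = coef * X.std(axis=0) / Y.std()
print(src.round(3))  # ranks the inputs by sensitivity
```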
40. Applications
• Chemistry
• Sensitivity analysis is common in many areas of physics and
chemistry.
• Sensitivity analysis has proven to be a powerful tool for
investigating complex kinetic models.
• In a meta-analysis, a sensitivity analysis tests whether the results are
sensitive to restrictions on the data included. Common
examples are large trials only, higher-quality trials only, and
more recent trials only. If results are consistent, this provides
stronger evidence of an effect and of generalizability.
• Engineering
• Modern engineering design makes extensive use of computer
models to test designs before they are manufactured.
Sensitivity analysis allows designers to assess the effects and
sources of uncertainties, in the interest of building robust
models.
41. Optimal design
• In the design of experiments, optimal designs are a class of
experimental designs that are optimal with respect to some statistical
criterion.
• In the design of experiments for estimating statistical models, optimal
designs allow parameters to be estimated without bias and with
minimum variance.
• A non-optimal design requires a greater number of experimental runs
to estimate the parameters with the same precision as an optimal
design. In practical terms, optimal experiments can reduce the costs of
experimentation.
42. Advantages
• Optimal designs reduce the costs of experimentation by
allowing statistical models to be estimated with fewer
experimental runs.
• Optimal designs can accommodate multiple types of
factors, such as process, mixture, and discrete factors.
• Designs can be optimized when the design-space is
constrained, for example, when the mathematical process-
space contains factor-settings that are practically infeasible
(e.g. due to safety concerns).
43. Minimizing the variance of estimators
• Experimental designs are evaluated using statistical criteria.
• In the estimation theory for statistical models with one real parameter, the
reciprocal of the variance of an ("efficient") estimator is called the "Fisher
information" for that estimator. Because of this reciprocity, minimizing the variance
corresponds to maximizing the information.
• When the statistical model has several parameters, however, the mean of the
parameter-estimator is a vector and its variance is a matrix.
• The inverse matrix of the variance-matrix is called the "information matrix".
Because the variance of the estimator of a parameter vector is a matrix, the problem
of "minimizing the variance" is complicated.
• Using statistical theory, statisticians compress the information-matrix using real-
valued summary statistics; being real-valued functions, these "information criteria"
can be maximized.
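As a small sketch of one such real-valued summary, the D-optimality criterion maximizes the determinant of the information matrix $X^{T}X$; the two candidate designs for a straight-line model below are assumptions chosen for illustration.

```python
import numpy as np

def d_criterion(X):
    """Determinant of the information matrix X'X for a linear model."""
    return np.linalg.det(X.T @ X)

# Two candidate designs for a straight-line model y = b0 + b1*x on [-1, 1]
# (the design points are assumptions chosen for illustration)
spread = np.array([[1, -1], [1, -1], [1, 1], [1, 1]])     # runs at the ends
middle = np.array([[1, -0.5], [1, 0], [1, 0], [1, 0.5]])  # runs in the middle

print(d_criterion(spread))  # larger determinant -> more information
print(d_criterion(middle))  # smaller determinant -> less precise estimates
```

The design with runs at the ends of the interval yields the larger determinant, which is why D-optimal designs for a straight line place observations at the extremes.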
44. POPULATION MODELLING
• A population model is a type of mathematical model
that is applied to the study of population dynamics.
• Population modelling is a tool to identify and describe
relationships between a subject’s physiologic
characteristics and observed drug exposure or response.
Population pharmacokinetics (PK) modelling is not a
new concept; it was first introduced in 1972 by Sheiner
et al.
46. The figure (not reproduced here) outlines some areas in which modelling and simulation are
commonly employed during drug development. Appropriate models can provide a
framework for predicting the time course of exposure and response for different dose
regimens. Central to this evolution has been the widespread adoption of population
modelling methods that provide a framework for quantitating and explaining variability
in drug exposure and response.
Types of Models
PK models
PK models describe the relationship between drug concentration(s) and time. The
building block of many PK models is a "compartment": a region of the body in which
the drug is well mixed and kinetically homogeneous (and can therefore be described in
terms of a single representative concentration at any time point).
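As a sketch of the compartment idea, a one-compartment model with first-order elimination after an intravenous bolus gives $C(t) = (\mathrm{Dose}/V)\,e^{-(CL/V)t}$; the dose and parameter values below are assumptions for illustration.

```python
import numpy as np

# Assumed one-compartment IV-bolus model (illustrative parameter values)
dose = 100.0   # mg
V = 20.0       # L, volume of distribution
CL = 5.0       # L/h, clearance
k = CL / V     # 1/h, first-order elimination rate constant

t = np.linspace(0, 24, 9)        # hours
C = (dose / V) * np.exp(-k * t)  # concentration in the compartment, mg/L

for ti, ci in zip(t, C):
    print(f"t = {ti:4.1f} h  C = {ci:6.3f} mg/L")
```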
47. Disease progression models
• Disease progression models were first used in 1992 to describe the time
course of a disease metric (e.g., ADAS-Cog in Alzheimer's disease).
• Such models also capture the inter-subject variability in disease
progression, and the manner in which the time course is influenced by
covariates or by treatment.
• They can be linked to a concurrent PK model and used to determine
whether a drug exhibits symptomatic activity or affects progression.
• Models of disease progression in placebo groups are crucial for
understanding the time course of the disease in treated groups, as well as
for predicting the likely response in a placebo group in a clinical trial.
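A minimal sketch contrasting the two drug-effect patterns mentioned above on an assumed linear disease-progression model (all numbers are illustrative): a symptomatic drug offsets the curve, while a disease-modifying drug changes its slope.

```python
import numpy as np

t = np.linspace(0, 36, 7)  # months

# Assumed natural history: baseline status plus linear progression
baseline, slope = 20.0, 0.5
placebo = baseline + slope * t

# Symptomatic drug: shifts the curve without changing its slope
symptomatic = baseline - 3.0 + slope * t

# Disease-modifying drug: slows the rate of progression
modifying = baseline + 0.3 * t

print(np.column_stack([t, placebo, symptomatic, modifying]).round(1))
```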
48. References
• Sean Ekins (ed.), Computer Applications in Pharmaceutical Research and Development, 2006.
• Bates, D.M. and Watts, D.G., Nonlinear Regression Analysis and Its Applications.
• www.slideshare.net
• www.google.com