
Mathematics and Statistics –
Design and Analysis of Experiments
Week 8-9 – Factorial Designs


Design of Engineering Experiments Part 5 – The 2^k Factorial Design
• Text reference, Chapter 6
• Special case of the general factorial design; k factors,
all at two levels
• The two levels are usually called low and high (they could be either quantitative or qualitative)
• Very widely used in industrial experimentation
• Form a basic “building block” for other very useful
experimental designs (DNA)
• Special (short-cut) methods for analysis

The Simplest Case: The 2^2
“-” and “+” denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different

Chemical Process Example
A = reactant concentration, B = catalyst amount, y = recovery

Analysis Procedure for a Factorial Design
• Estimate factor effects
• Formulate model
– With replication, use full model
– With an unreplicated design, use normal probability plots
• Statistical testing (ANOVA)
• Refine the model
• Analyze residuals (graphical)
• Interpret results

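Estimation of Factor Effects
\[ A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\,[ab + a - b - (1)] \]
\[ B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\,[ab + b - a - (1)] \]
\[ AB = \frac{ab + (1)}{2n} - \frac{a + b}{2n} = \frac{1}{2n}\,[ab + (1) - a - b] \]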
See textbook, pg. 235-236 for manual calculations
The effect estimates are:
A = 8.33, B = -5.00, AB = 1.67
Practical interpretation?
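
As a rough check on the contrast formulas, the sketch below computes the three effects from treatment totals. The totals (1) = 80, a = 100, b = 60, ab = 90 over n = 3 replicates are assumed here from the textbook example; they do not appear in these notes, but they reproduce the estimates quoted above.

```python
# Treatment totals over n = 3 replicates.
# ASSUMPTION: these totals are taken from the textbook example, not from these notes.
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}
n = 3

def effect(plus, minus, totals, n):
    """Contrast of treatment totals divided by 2n."""
    contrast = sum(totals[t] for t in plus) - sum(totals[t] for t in minus)
    return contrast / (2 * n)

A = effect(["ab", "a"], ["b", "(1)"], totals, n)
B = effect(["ab", "b"], ["a", "(1)"], totals, n)
AB = effect(["ab", "(1)"], ["a", "b"], totals, n)

print(f"A = {A:.2f}, B = {B:.2f}, AB = {AB:.2f}")
```

Each effect is a difference of two averages of 2n observations, which is why the contrast is divided by 2n rather than by the total number of runs.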

Statistical Testing – ANOVA
The F-test for the “model” source is testing the significance of the overall model; that is, is either A, B, or AB or some combination of these effects important?

Residuals and Diagnostic Checking

The 2^3 Factorial Design

Effects in The 2^3 Factorial Design
\[ A = \bar{y}_{A^+} - \bar{y}_{A^-}, \quad B = \bar{y}_{B^+} - \bar{y}_{B^-}, \quad C = \bar{y}_{C^+} - \bar{y}_{C^-} \]
etc, etc, …
Analysis done via computer

An Example of a 2^3 Factorial Design
A = gap, B = Flow, C = Power, y = Etch Rate

Table of – and + Signs for the 2^3 Factorial Design (pg. 218)

Properties of the Table
• Except for column I, every column has an equal number of + and – signs
• The sum of the product of signs in any two columns is zero
• Multiplying any column by I leaves that column unchanged (identity element)
• The product of any two columns yields a column in the table:
AB × BC = AB^2C = AC
• Orthogonal design
• Orthogonality is an important property shared by all factorial designs
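
The properties listed above can be checked directly by building the sign table. This is a minimal sketch in plain Python; the run order (standard order, with A changing fastest) and the helper name `times` are choices made for the illustration.

```python
from itertools import product

# Runs in standard order: (1), a, b, ab, c, ac, bc, abc (A changes fastest).
runs = [(a, b, c) for c, b, a in product((-1, 1), repeat=3)]

def times(u, v):
    """Element-wise product of two sign columns."""
    return [x * y for x, y in zip(u, v)]

columns = {
    "I": [1 for _ in runs],
    "A": [a for a, b, c in runs],
    "B": [b for a, b, c in runs],
    "C": [c for a, b, c in runs],
}
columns["AB"] = times(columns["A"], columns["B"])
columns["AC"] = times(columns["A"], columns["C"])
columns["BC"] = times(columns["B"], columns["C"])
columns["ABC"] = times(columns["AB"], columns["C"])

# Every column except I has as many + signs as - signs.
balanced = all(sum(col) == 0 for name, col in columns.items() if name != "I")

# The sum of products of signs in any two distinct columns is zero.
names = list(columns)
orthogonal = all(
    sum(times(columns[p], columns[q])) == 0
    for i, p in enumerate(names)
    for q in names[i + 1:]
)

# AB x BC = AB^2C = AC, because any column times itself gives I.
ab_times_bc_is_ac = times(columns["AB"], columns["BC"]) == columns["AC"]

print(balanced, orthogonal, ab_times_bc_is_ac)
```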

Estimation of Factor Effects

ANOVA Summary – Full Model

Model Coefficients – Full Model

Refine Model – Remove Non-significant Factors

Model Coefficients – Reduced Model

Model Summary Statistics for Reduced Model
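• R^2 and adjusted R^2
\[ R^2 = \frac{SS_{Model}}{SS_T} = \frac{5.106 \times 10^5}{5.314 \times 10^5} = 0.9608 \]
\[ R^2_{Adj} = 1 - \frac{SS_E / df_E}{SS_T / df_T} = 1 - \frac{20857.75 / 12}{5.314 \times 10^5 / 15} = 0.9509 \]
• R^2 for prediction (based on PRESS)
\[ R^2_{Pred} = 1 - \frac{PRESS}{SS_T} = 1 - \frac{37080.44}{5.314 \times 10^5} = 0.9302 \]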
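
The three summary statistics can be recomputed from the sums of squares quoted on this slide. The sketch below uses only those rounded inputs, so the results should be expected to agree with the slide values to about three decimal places rather than exactly.

```python
# Inputs as quoted on the slide (rounded, so results match only approximately).
ss_model = 5.106e5    # SS_Model
ss_total = 5.314e5    # SS_T
ss_error = 20857.75   # SS_E for the reduced model
press = 37080.44      # PRESS statistic
df_error, df_total = 12, 15

r2 = ss_model / ss_total
r2_adj = 1 - (ss_error / df_error) / (ss_total / df_total)
r2_pred = 1 - press / ss_total

print(f"R2 = {r2:.4f}, adjusted R2 = {r2_adj:.4f}, prediction R2 = {r2_pred:.4f}")
```

Adjusted R^2 penalises extra model terms through the degrees of freedom, and prediction R^2 replaces SS_E with PRESS, so both are smaller than the ordinary R^2.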

Model Summary Statistics
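• Standard error of model coefficients (full model)
\[ se(\hat{\beta}) = \sqrt{V(\hat{\beta})} = \sqrt{\frac{\hat{\sigma}^2}{n 2^k}} = \sqrt{\frac{MS_E}{n 2^k}} = \sqrt{\frac{2252.56}{2(8)}} = 11.87 \]
• Confidence interval on model coefficients
\[ \hat{\beta} - t_{\alpha/2,\, df_E}\, se(\hat{\beta}) \le \beta \le \hat{\beta} + t_{\alpha/2,\, df_E}\, se(\hat{\beta}) \]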

The Regression Model

Model Interpretation
Cube plots are often useful visual displays of experimental results

Cube Plot of Ranges
What do the large ranges when gap and power are at the high level tell you?

The General 2^k Factorial Design
• Section 6-4, pg. 253, Table 6-9, pg. 25
• There will be k main effects, and
\[ \binom{k}{2} \text{ two-factor interactions}, \quad \binom{k}{3} \text{ three-factor interactions}, \quad \ldots, \quad 1 \ k\text{-factor interaction} \]
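
To make the counting concrete, here is a small sketch that tabulates the number of effects of each order for a given k. The function name `effect_counts` is made up for this illustration.

```python
from math import comb

def effect_counts(k):
    """Number of effects of each order (1 = main effects) in a 2^k design."""
    return {order: comb(k, order) for order in range(1, k + 1)}

counts = effect_counts(4)
total = sum(counts.values())

print(counts)   # main effects, two-factor, three-factor, four-factor interactions
print(total)    # all effects together: 2^k - 1
```

For k = 4 this gives 4 main effects, 6 two-factor interactions, 4 three-factor interactions and 1 four-factor interaction, which is 15 = 2^4 - 1 effects in total.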

The General 2^k Factorial Design

6.5 Unreplicated 2^k Factorial Designs
• These are 2^k factorial designs with one observation at each corner of the “cube”
• An unreplicated 2^k factorial design is also sometimes called a “single replicate” of the 2^k
• These designs are very widely used
• Risks…if there is only one observation at each corner, is there a chance of unusual response observations spoiling the results?
• Modeling “noise”?

Unreplicated 2^k Factorial Designs
• Lack of replication causes potential problems in statistical testing
– Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)
– With no replication, fitting the full model results in zero degrees of freedom for error
• Potential solutions to this problem
– Pooling high-order interactions to estimate error
– Normal probability plotting of effects (Daniel, 1959)
– Other methods…see text

Example of an Unreplicated 2^k Design
• A 2^4 factorial was used to investigate the effects of four factors on the filtration rate of a resin
• The factors are A = temperature, B = pressure, C = mole ratio, D= stirring rate
• Experiment was performed in a pilot plant

The Resin Plant Experiment

Estimates of the Effects

Design Projection: ANOVA Summary for the Model as a 2^3 in Factors A, C, and D

The Regression Model

Model Residuals are Satisfactory

Model Interpretation – Main Effects and Interactions

Outliers: suppose that cd = 375 (instead of 75)

Dealing with Outliers
• Replace with an estimate
– Make the highest-order interaction zero
– In this case, estimate cd such that ABCD = 0
• Analyze only the data you have
– Now the design isn’t orthogonal
– Consequences?

The Drilling Experiment, Example 6.3
A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill

Normal Probability Plot of Effects – The Drilling Experiment

Residual Plots
[Figure: DESIGN-EXPERT plot of residuals vs. predicted advance rate]

Residual Plots
• The residual plots indicate that there are problems with the equality of variance assumption
• The usual approach to this problem is to employ a transformation on the response
• Power family transformations are widely used: y* = y^λ
• Transformations are typically performed to
– Stabilize variance
– Induce at least approximate normality
– Simplify the model

Selecting a Transformation
• Empirical selection of lambda
• Prior (theoretical) knowledge or experience can
often suggest the form of a transformation
• Analytical selection of lambda…the Box-Cox (1964) method (simultaneously estimates the model parameters and the transformation parameter lambda)
• Box-Cox method implemented in Minitab, Design-Expert, …
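
As a sketch of what the Box-Cox procedure evaluates for each candidate lambda, the function below applies the normalized power transformation: (y^lambda - 1) / (lambda * g^(lambda - 1)) for lambda not equal to 0, and g * ln(y) for lambda = 0, where g is the geometric mean of the responses. The response values in the example are hypothetical and are not the drilling data.

```python
import math

def box_cox(y, lam):
    """Normalized Box-Cox transform of positive responses y for a given lambda."""
    log_mean = sum(math.log(v) for v in y) / len(y)
    gm = math.exp(log_mean)  # geometric mean of the responses
    if abs(lam) < 1e-12:
        # Limiting case lambda = 0: the log transformation.
        return [gm * math.log(v) for v in y]
    return [(v**lam - 1) / (lam * gm**(lam - 1)) for v in y]

# HYPOTHETICAL responses, used only to exercise the function.
y = [1.0, 2.0, 4.0]

print(box_cox(y, 1))    # lambda = 1 only shifts the data
print(box_cox(y, 0))    # lambda = 0 is the log transformation
```

Dividing by g^(lambda - 1) puts the transformed responses on a comparable scale, so the residual sums of squares from fitting the model at different lambda values can be compared; the lambda with the smallest residual sum of squares is the one a Box-Cox plot highlights.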

The Box-Cox Method
[Figure: DESIGN-EXPERT Box-Cox plot for power transforms; Ln(Residual SS) vs. lambda from -3 to 3; response adv._rate]
Current lambda = 1
Best lambda = -0.23
Low C.I. = -0.79, High C.I. = 0.32
Recommend transform: Log (Lambda = 0)
A log transformation is recommended
The procedure provides a confidence interval on the transformation parameter lambda
If unity is included in the confidence interval, no transformation would be needed

Effect Estimates Following the Log Transformation
Three main effects are large
No indication of large interaction effects
What happened to the interactions?

ANOVA Following the Log Transformation

Following the Log Transformation

The Log Advance Rate Model
• Is the log model “better”?
• We would generally prefer a simpler model in a transformed scale to a more complicated model in the original metric
• What happened to the interactions?
• Sometimes transformations provide insight into the underlying mechanism

Other Analysis Methods for Unreplicated 2^k Designs
• Lenth’s method (see text, pg. 262)
– Analytical method for testing effects, uses an estimate
of error formed by pooling small contrasts
– Some adjustment to the critical values in the original method can be helpful
– Probably most useful as a supplement to the normal probability plot
• Conditional inference charts (pg. 264)

Overview of Lenth’s method
For an individual contrast, compare to the margin of error

Adjusted multipliers for Lenth’s method
Suggested because the original method makes too many type I errors, especially for small designs (few contrasts)
Simulation was used to find these adjusted multipliers
Lenth’s method is a nice supplement to the normal probability plot of effects

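The 2^k design and design optimality
The model parameter estimates in a 2^k design (and the effect estimates) are least squares estimates. For example, for a 2^2 design the model is
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon \]
The four observations from a 2^2 design are
\[ (1) = \beta_0 + \beta_1(-1) + \beta_2(-1) + \beta_{12}(-1)(-1) + \varepsilon_1 \]
\[ a = \beta_0 + \beta_1(1) + \beta_2(-1) + \beta_{12}(1)(-1) + \varepsilon_2 \]
\[ b = \beta_0 + \beta_1(-1) + \beta_2(1) + \beta_{12}(-1)(1) + \varepsilon_3 \]
\[ ab = \beta_0 + \beta_1(1) + \beta_2(1) + \beta_{12}(1)(1) + \varepsilon_4 \]
In matrix form,
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad \mathbf{y} = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}, \quad \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \end{bmatrix} \]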

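The least squares estimate of β is
\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}^{-1} \begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix} \]
The X'X matrix is diagonal, a consequence of an orthogonal design, and the elements of X'y are the “usual” contrasts.
\[ \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{\beta}_{12} \end{bmatrix} = \frac{1}{4}\,\mathbf{I}_4 \begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix} = \begin{bmatrix} \dfrac{(1) + a + b + ab}{4} \\ \dfrac{a + ab - b - (1)}{4} \\ \dfrac{b + ab - a - (1)}{4} \\ \dfrac{(1) - a - b + ab}{4} \end{bmatrix} \]
The regression coefficient estimates are exactly half of the “usual” effect estimates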

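The matrix X'X has interesting and useful properties:
\[ V(\hat{\beta}) = \sigma^2 \times (\text{diagonal element of } (\mathbf{X}'\mathbf{X})^{-1}) = \frac{\sigma^2}{4} \]
Minimum possible value for a four-run design
\[ |\mathbf{X}'\mathbf{X}| = 256 \]
Maximum possible value for a four-run design
Notice that these results depend on both the design that you have chosen and the model
What about predicting the response?

\[ V[\hat{y}(x_1, x_2)] = \sigma^2\, \mathbf{x}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}, \quad \mathbf{x}' = [1, x_1, x_2, x_1 x_2] \]
\[ V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}\,(1 + x_1^2 + x_2^2 + x_1^2 x_2^2) \]
The maximum prediction variance occurs when x_1 = ±1, x_2 = ±1
\[ V[\hat{y}(x_1, x_2)] = \sigma^2 \]
The prediction variance when x_1 = x_2 = 0 is
\[ V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4} \]
What about average prediction variance over the design space?

Average prediction variance
\[ \frac{1}{A} \int_{-1}^{1}\int_{-1}^{1} V[\hat{y}(x_1, x_2)]\, dx_1\, dx_2, \quad A = \text{area of design space} = 2^2 = 4 \]
\[ = \frac{1}{4} \int_{-1}^{1}\int_{-1}^{1} \frac{\sigma^2}{4}\,(1 + x_1^2 + x_2^2 + x_1^2 x_2^2)\, dx_1\, dx_2 = \frac{4\sigma^2}{9} \]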
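
The three prediction-variance results (maximum, centre, and average) can be checked numerically. The sketch below works in units of sigma^2 and approximates the double integral with a midpoint rule; the grid size is an arbitrary choice for the illustration.

```python
def scaled_pred_var(x1, x2):
    """V[y_hat(x1, x2)] / sigma^2 for the 2^2 design with the interaction model."""
    return (1 + x1**2 + x2**2 + x1**2 * x2**2) / 4

corner = scaled_pred_var(1, 1)   # at a design point (corner of the square)
center = scaled_pred_var(0, 0)   # at the centre of the design space

# Midpoint-rule approximation of the average over the square [-1, 1] x [-1, 1].
m = 200                          # grid cells per axis (arbitrary)
h = 2 / m
pts = [-1 + (i + 0.5) * h for i in range(m)]
area = 4
average = sum(scaled_pred_var(u, v) for u in pts for v in pts) * h * h / area

print(corner, center, average)
```

The numerical average should come out close to 4/9, matching the closed-form integral, with the variance smallest at the centre and largest at the corners.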

For the 2^2 and in general the 2^k
• The design produces regression model coefficients that
have the smallest variances (D-optimal design)
• The design results in minimizing the maximum variance of the predicted response over the design space (G-optimal design)
• The design results in minimizing the average variance of the predicted response over the design space (I-optimal design)

Optimal Designs
• These results give us some assurance that these designs are “good” designs in some general ways
• Factorial designs typically share some (most) of these properties
• There are excellent computer routines for finding optimal designs

Addition of Center Points to a 2^k Design
• Based on the idea of replicating some of
the runs in a factorial design
• Runs at the center provide an estimate of
error and allow the experimenter to
distinguish between two possible models:
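First-order model (interaction)
\[ y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i} \beta_{ij} x_i x_j + \varepsilon \]
Second-order model
\[ y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon \]

\[ \bar{y}_F = \bar{y}_C \Rightarrow \text{no “curvature”} \]
The hypotheses are:
\[ H_0: \sum_{i=1}^{k} \beta_{ii} = 0, \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \ne 0 \]
\[ SS_{Pure\ Quad} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C} \]
This sum of squares has a single degree of freedom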

Example 6.7, Pg. 286
Refer to the original experiment shown in Table 6.10.
Suppose that four center points are added to this experiment, and at the points x1=x2 =x3=x4=0 the four observed filtration rates were 73, 75, 66, and 69.
The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06.
Since these two averages are very similar, we suspect that there is no strong curvature present.
Usually between 3 and 6 center points will work well
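
The curvature check for this example can be sketched with the numbers given above. The single-degree-of-freedom sum of squares is n_F n_C (ybar_F - ybar_C)^2 / (n_F + n_C); using the spread of the four centre points as the error estimate (3 degrees of freedom) is an assumption made for this illustration and may differ from the error term in the ANOVA table.

```python
# Values quoted in Example 6.7.
center_points = [73, 75, 66, 69]   # observed filtration rates at the centre
n_f = 16                           # number of factorial runs
ybar_f = 70.06                     # average of the factorial runs (as quoted, rounded)

n_c = len(center_points)
ybar_c = sum(center_points) / n_c

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quad = n_f * n_c * (ybar_f - ybar_c) ** 2 / (n_f + n_c)

# ASSUMPTION: error is estimated from the centre points only (n_c - 1 df).
ms_error = sum((y - ybar_c) ** 2 for y in center_points) / (n_c - 1)

f_ratio = ss_pure_quad / ms_error

print(f"ybar_C = {ybar_c:.2f}, SS curvature = {ss_pure_quad:.2f}, "
      f"MS error = {ms_error:.2f}, F = {f_ratio:.3f}")
```

An F ratio far below 1 is consistent with the slide's conclusion that there is no strong curvature.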

ANOVA for Example 6.7

If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model

Practical Use of Center Points (pg. 289)
• Use current operating conditions as the center point
• Check for “abnormal” conditions during the time the experiment was conducted
• Check for time trends
• Use center points as the first few runs when there is little or no information available about the magnitude of error
• Center points and qualitative factors?

Center Points and Qualitative Factors

Case Study – HPLC method
• Aim: to optimise the separation of peaks in an HPLC analysis

Define the Response
• The CRF (chromatographic response function) is used to quantify separation of peaks. This function thus gives a single number to the ‘quality’ of a chromatogram. The aim is thus to maximise the CRF

Define the Factors
• The factors in this study were the levels in the eluent of:
– Acetic Acid
– Methanol
– Citric Acid

Experimental Domain
Acetic Acid (mol/L)
% Methanol
Citric Acid (g/L)

Factorial design (Coded form)
This design gives all combinations of the factors at 2 levels: ‘+’ (high) and ‘-’ (low)
[Table: coded design; columns Run Number, Acetic Acid, Methanol, Citric Acid]

Factorial design (Uncoded)
This table shows the actual levels of the variables used in the experiments. Normally the order of experiments is randomised, but we will keep it in this structured form so you can see the patterns
[Table: uncoded design; columns Run Number, Acetic Acid, Methanol, Citric Acid, CRF]
Results are inserted here when the experiments are performed

Factorial design (Uncoded)
The CRF values are now inserted after the experiments (chromatographic runs) are carried out
[Table: uncoded design with results; columns Run Number, Acetic Acid, Methanol, Citric Acid, CRF]

[Figure: Main Effects Plot (data means) for CRF; panels for Acetic Acid, Methanol and Citric Acid; y-axis: Mean of CRF; the CRF averages at ‘high’ and ‘low’ methanol are marked]
Note that Methanol has the steepest slope, indicating the strongest effect

[Figure: Interaction Plot (data means) for CRF; Acetic Acid (0.004, 0.010), Methanol (70, 80), Citric Acid]
The plots show there is an interaction between methanol and citric acid: at high methanol, citric acid has a positive effect, but at low methanol it has a negative effect on CRF

Conclusions
• Methanol has the largest effect on CRF
• The Methanol effect strongly depends on the Citric Acid level. Citric acid has a positive effect at high Methanol but a negative effect at low Methanol
• All 3 variables do seem to affect the result. Citric acid has the smallest main effect but large interaction effect
• Hence probably can’t ‘screen out’ any of these variables for further study

Significance – Normal Probability Plots
• Normal Probability Plots are used to test whether data is
normally distributed.
• In our case, we can use such a plot to test for significance of the effects/coefficients
• If the effects are not significant we expect variations just to be due to random error and this can be tested with the plots. It is only a guide, however, as we have no real estimate of the experimental error
• In Minitab the plot can be generated:
• Stat > DOE > Factorial > Analyse Factorial Design > Graphs, then select Effects Plots (Normal)
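
Outside Minitab, the coordinates of a normal probability plot of effects can be computed by hand. The sketch below uses the six effect estimates listed in the Estimated Effects table in these notes; the three-factor interaction is not listed there, so it is left out, and the plotting position (i - 0.5)/m is one common convention assumed here.

```python
from statistics import NormalDist

# Effect estimates for CRF (A = Acetic Acid, B = Methanol, C = Citric Acid).
# ASSUMPTION: the ABC effect is not listed in the notes and is omitted.
effects = {
    "A": -0.375, "B": 1.925, "C": 0.125,
    "AB": 0.125, "AC": 0.025, "BC": 0.825,
}

ordered = sorted(effects.items(), key=lambda item: item[1])
m = len(ordered)

points = []
for rank, (name, value) in enumerate(ordered, start=1):
    prob = (rank - 0.5) / m            # plotting position for this rank
    z = NormalDist().inv_cdf(prob)     # corresponding standard normal score
    points.append((name, value, z))

for name, value, z in points:
    print(f"{name:>2}  effect = {value:7.3f}  normal score = {z:6.3f}")
```

Plotting the normal score against the effect puts negligible effects near a straight line through zero; effects that sit well off that line, here the two largest, are the candidates for being real.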

[Figure: Normal Probability Plot of the Effects (response is CRF, Alpha = .05); Lenth’s PSE = 0.1875; effect types: Not Significant, Significant; factors: A = Acetic Acid, B = Methanol, C = Citric Acid]
Effects due to random errors should be on a straight line. This plot indicates Methanol and the Methanol/Citric Acid interaction are significant effects
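
The pseudo standard error reported on the plot can be reproduced with Lenth's definition: s0 = 1.5 × median|effect|, then PSE = 1.5 × median of the absolute effects smaller than 2.5 × s0. The sketch below applies it to the six effects listed in the Estimated Effects table in these notes; the three-factor interaction is not listed there and is omitted, which is an assumption of this illustration.

```python
from statistics import median

def lenth_pse(effects):
    """Lenth's pseudo standard error for a list of effect estimates."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)                       # initial scale estimate
    trimmed = [e for e in abs_effects if e < 2.5 * s0]   # drop apparently active effects
    return 1.5 * median(trimmed)

# Effects for A, B, C, AB, AC, BC (ASSUMPTION: ABC is not listed and is omitted).
effects = [-0.375, 1.925, 0.125, 0.125, 0.025, 0.825]
pse = lenth_pse(effects)

print(pse)
```

Trimming the large effects before taking the second median keeps genuinely active effects, such as Methanol here, from inflating the error estimate.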

• We have not replicated any experiments, so there is no estimate of error. We cannot tell if the coefficients (effects) overall are significant (although normal probability plots help). We can only compare them to see which is the most significant
• We also cannot test for curvature, i.e. whether the effects of the variables are linear. A non-linear effect can occur when the response at the high and low levels is similar but at intermediate values is much higher or lower. pH effects are often non-linear

• Add centre points!!
• Centre points are experiments with all variables set at 0 (coded) i.e. mid values
• Replication of the centre point allows determination of error

[Table: design with centre points; columns Acetic Acid, Methanol, Citric Acid, CRF]
Results added for the centre points

Estimated Effects and Coefficients for CRF (coded units)
Term                       Effect    Coef      SE Coef   T        P
Constant                             10.362    0.05000   207.25
Acetic Acid                -0.3750   -0.1875   0.05000   -3.75
Methanol                   1.9250    0.9625    0.05000   19.25
Citric Acid                0.1250    0.0625    0.05000   1.25     0.430
Acetic Acid*Methanol       0.1250    0.0625    0.05000   1.25     0.430
Acetic Acid*Citric Acid    0.0250    0.0125    0.05000   0.25     0.844
Methanol*Citric Acid       0.8250    0.4125    0.05000   8.25
P is the probability of observing a T value at least this extreme if the coefficient were truly zero, i.e. no effect on CRF. A low probability (< 0.05 at the 5% level) indicates high significance. The methanol effect is the only significant one at the 5% level, although the methanol-citric acid effect is just above the 5% level
[Figure: Main Effects Plot (data means) for CRF with centre points; panels for Acetic Acid (0.007, 0.010), Methanol (70, 75, 80) and Citric Acid; y-axis: Mean of CRF; point types: Corner, Center]
The centre point responses are all on the linear response line. Thus no curvature is indicated.
[Figure: Normal Probability Plot of the Standardized Effects (response is CRF, Alpha = .05); effect types: Not Significant, Significant; factors: A = Acetic Acid, B = Methanol, C = Citric Acid; x-axis: Standardized Effect]
The Normal Plot also shows that Methanol is the only significant effect, but the Methanol/CA interaction is probably also significant
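
The link between the T and P columns can be illustrated with a short calculation. The sketch below assumes the error estimate has a single degree of freedom; that figure is not stated in these notes and is chosen because it reproduces the P values 0.430 and 0.844 printed for T = 1.25 and T = 0.25. With 1 degree of freedom the t distribution is the Cauchy distribution, so the two-sided P value has a closed form.

```python
import math

def two_sided_p_1df(t):
    """Two-sided P value for a t statistic with 1 degree of freedom (Cauchy)."""
    return 1 - (2 / math.pi) * math.atan(abs(t))

# T values from the Estimated Effects and Coefficients output.
# ASSUMPTION: 1 error degree of freedom (not stated in the notes).
t_values = {
    "Acetic Acid": -3.75,
    "Methanol": 19.25,
    "Citric Acid": 1.25,
    "Acetic Acid*Methanol": 1.25,
    "Acetic Acid*Citric Acid": 0.25,
    "Methanol*Citric Acid": 8.25,
}

p_values = {term: two_sided_p_1df(t) for term, t in t_values.items()}

for term, p in p_values.items():
    print(f"{term:<24} P = {p:.3f}")
```

Under this assumption only Methanol falls below 0.05, and the Methanol*Citric Acid interaction lands just above it, in line with the interpretation given in the notes.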