Probability & Statistics for Engineers & Scientists
NINTH EDITION
Ronald E. Walpole
Roanoke College
Raymond H. Myers
Virginia Tech
Sharon L. Myers
Radford University
Keying Ye
University of Texas at San Antonio
Prentice Hall

Editor in Chief: Deirdre Lynch
Acquisitions Editor: Christopher Cummings
Executive Content Editor: Christine O’Brien
Associate Editor: Christina Lepre
Senior Managing Editor: Karen Wernholm
Senior Production Project Manager: Tracy Patruno
Design Manager: Andrea Nix
Cover Designer: Heather Scott
Digital Assets Manager: Marianne Groth
Associate Media Producer: Vicki Dreyfus
Marketing Manager: Alex Gay
Marketing Assistant: Kathleen DeChavez
Senior Author Support/Technology Specialist: Joe Vetere
Rights and Permissions Advisor: Michael Joyce
Senior Manufacturing Buyer: Carol Melville
Production Coordination: Lifland et al. Bookmakers
Composition: Keying Ye
Cover photo: Marjory Dressler/Dressler Photo-Graphics
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Pearson was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Library of Congress Cataloging-in-Publication Data
Probability & statistics for engineers & scientists/Ronald E. Walpole . . . [et al.] — 9th ed.
p. cm.
ISBN 978-0-321-62911-1
1. Engineering—Statistical methods. 2. Probabilities. I. Walpole, Ronald E.
TA340.P738 2011 519.02’462–dc22
2010004857
Copyright © 2012, 2007, 2002 Pearson Education, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. For information on obtaining permission for use of material in this work, please submit
a written request to Pearson Education, Inc., Rights and Contracts Department, 501 Boylston Street, Suite 900, Boston, MA 02116, fax your request to 617-671-3447, or e-mail at http://www.pearsoned.com/legal/permissions.htm.
ISBN-10: 0-321-62911-6
ISBN-13: 978-0-321-62911-1

This book is dedicated to Billy and Julie
R.H.M. and S.L.M.
Limin, Carolyn and Emily
K.Y.


Contents
Preface

1 Introduction to Statistics and Data Analysis
  1.1 Overview: Statistical Inference, Samples, Populations, and the Role of Probability
  1.2 Sampling Procedures; Collection of Data
  1.3 Measures of Location: The Sample Mean and Median
  1.4 Measures of Variability
  1.5 Discrete and Continuous Data
  1.6 Statistical Modeling, Scientific Inspection, and Graphical Diagnostics
  1.7 General Types of Statistical Studies: Designed Experiment, Observational Study, and Retrospective Study

2 Probability
  2.1 Sample Space
  2.2 Events
  2.3 Counting Sample Points
  2.4 Probability of an Event
  2.5 Additive Rules
  2.6 Conditional Probability, Independence, and the Product Rule
  2.7 Bayes’ Rule
  2.8 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

3 Random Variables and Probability Distributions
  3.1 Concept of a Random Variable
  3.2 Discrete Probability Distributions
  3.3 Continuous Probability Distributions
  3.4 Joint Probability Distributions
  3.5 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

4 Mathematical Expectation
  4.1 Mean of a Random Variable
  4.2 Variance and Covariance of Random Variables
  4.3 Means and Variances of Linear Combinations of Random Variables
  4.4 Chebyshev’s Theorem
  4.5 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

5 Some Discrete Probability Distributions
  5.1 Introduction and Motivation
  5.2 Binomial and Multinomial Distributions
  5.3 Hypergeometric Distribution
  5.4 Negative Binomial and Geometric Distributions
  5.5 Poisson Distribution and the Poisson Process
  5.6 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

6 Some Continuous Probability Distributions
  6.1 Continuous Uniform Distribution
  6.2 Normal Distribution
  6.3 Areas under the Normal Curve
  6.4 Applications of the Normal Distribution
  6.5 Normal Approximation to the Binomial
  6.6 Gamma and Exponential Distributions
  6.7 Chi-Squared Distribution
  6.8 Beta Distribution
  6.9 Lognormal Distribution
  6.10 Weibull Distribution (Optional)
  6.11 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

7 Functions of Random Variables (Optional)
  7.1 Introduction
  7.2 Transformations of Variables
  7.3 Moments and Moment-Generating Functions

8 Fundamental Sampling Distributions and Data Descriptions
  8.1 Random Sampling
  8.2 Some Important Statistics
  8.3 Sampling Distributions
  8.4 Sampling Distribution of Means and the Central Limit Theorem
  8.5 Sampling Distribution of S²
  8.6 t-Distribution
  8.7 F-Distribution
  8.8 Quantile and Probability Plots
  8.9 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

9 One- and Two-Sample Estimation Problems
  9.1 Introduction
  9.2 Statistical Inference
  9.3 Classical Methods of Estimation
  9.4 Single Sample: Estimating the Mean
  9.5 Standard Error of a Point Estimate
  9.6 Prediction Intervals
  9.7 Tolerance Limits
  9.8 Two Samples: Estimating the Difference between Two Means
  9.9 Paired Observations
  9.10 Single Sample: Estimating a Proportion
  9.11 Two Samples: Estimating the Difference between Two Proportions
  9.12 Single Sample: Estimating the Variance
  9.13 Two Samples: Estimating the Ratio of Two Variances
  9.14 Maximum Likelihood Estimation (Optional)
  9.15 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

10 One- and Two-Sample Tests of Hypotheses
  10.1 Statistical Hypotheses: General Concepts
  10.2 Testing a Statistical Hypothesis
  10.3 The Use of P-Values for Decision Making in Testing Hypotheses
  10.4 Single Sample: Tests Concerning a Single Mean
  10.5 Two Samples: Tests on Two Means
  10.6 Choice of Sample Size for Testing Means
  10.7 Graphical Methods for Comparing Means
  10.8 One Sample: Test on a Single Proportion
  10.9 Two Samples: Tests on Two Proportions
  10.10 One- and Two-Sample Tests Concerning Variances
  10.11 Goodness-of-Fit Test
  10.12 Test for Independence (Categorical Data)
  10.13 Test for Homogeneity
  10.14 Two-Sample Case Study
  10.15 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

11 Simple Linear Regression and Correlation
  11.1 Introduction to Linear Regression
  11.2 The Simple Linear Regression Model
  11.3 Least Squares and the Fitted Model
  11.4 Properties of the Least Squares Estimators
  11.5 Inferences Concerning the Regression Coefficients
  11.6 Prediction
  11.7 Choice of a Regression Model
  11.8 Analysis-of-Variance Approach
  11.9 Test for Linearity of Regression: Data with Repeated Observations
  11.10 Data Plots and Transformations
  11.11 Simple Linear Regression Case Study
  11.12 Correlation
  11.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

12 Multiple Linear Regression and Certain Nonlinear Regression Models
  12.1 Introduction
  12.2 Estimating the Coefficients
  12.3 Linear Regression Model Using Matrices
  12.4 Properties of the Least Squares Estimators
  12.5 Inferences in Multiple Linear Regression
  12.6 Choice of a Fitted Model through Hypothesis Testing
  12.7 Special Case of Orthogonality (Optional)
  12.8 Categorical or Indicator Variables
  12.9 Sequential Methods for Model Selection
  12.10 Study of Residuals and Violation of Assumptions (Model Checking)
  12.11 Cross Validation, Cp, and Other Criteria for Model Selection
  12.12 Special Nonlinear Models for Nonideal Conditions
  12.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

13 One-Factor Experiments: General
  13.1 Analysis-of-Variance Technique
  13.2 The Strategy of Experimental Design
  13.3 One-Way Analysis of Variance: Completely Randomized Design (One-Way ANOVA)
  13.4 Tests for the Equality of Several Variances
  13.5 Single-Degree-of-Freedom Comparisons
  13.6 Multiple Comparisons
  13.7 Comparing a Set of Treatments in Blocks
  13.8 Randomized Complete Block Designs
  13.9 Graphical Methods and Model Checking
  13.10 Data Transformations in Analysis of Variance
  13.11 Random Effects Models
  13.12 Case Study
  13.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

14 Factorial Experiments (Two or More Factors)
  14.1 Introduction
  14.2 Interaction in the Two-Factor Experiment
  14.3 Two-Factor Analysis of Variance
  14.4 Three-Factor Experiments
  14.5 Factorial Experiments for Random Effects and Mixed Models
  14.6 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

15 2^k Factorial Experiments and Fractions
  15.1 Introduction
  15.2 The 2^k Factorial: Calculation of Effects and Analysis of Variance
  15.3 Nonreplicated 2^k Factorial Experiment
  15.4 Factorial Experiments in a Regression Setting
  15.5 The Orthogonal Design
  15.6 Fractional Factorial Experiments
  15.7 Analysis of Fractional Factorial Experiments
  15.8 Higher Fractions and Screening Designs
  15.9 Construction of Resolution III and IV Designs with 8, 16, and 32 Design Points
  15.10 Other Two-Level Resolution III Designs; The Plackett-Burman Designs
  15.11 Introduction to Response Surface Methodology
  15.12 Robust Parameter Design
  15.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

16 Nonparametric Statistics
  16.1 Nonparametric Tests
  16.2 Signed-Rank Test
  16.3 Wilcoxon Rank-Sum Test
  16.4 Kruskal-Wallis Test
  16.5 Runs Test
  16.6 Tolerance Limits
  16.7 Rank Correlation Coefficient

17 Statistical Quality Control
  17.1 Introduction
  17.2 Nature of the Control Limits
  17.3 Purposes of the Control Chart
  17.4 Control Charts for Variables
  17.5 Control Charts for Attributes
  17.6 Cusum Control Charts

18 Bayesian Statistics
  18.1 Bayesian Concepts
  18.2 Bayesian Inferences
  18.3 Bayes Estimates Using Decision Theory Framework

Bibliography

Appendix A: Statistical Tables and Proofs

Appendix B: Answers to Odd-Numbered Non-Review Exercises

Index

Preface
General Approach and Mathematical Level
Our emphasis in creating the ninth edition is less on adding new material and more on providing clarity and deeper understanding. This objective was accomplished in part by including new end-of-chapter material that adds connective tissue between chapters. We affectionately call these comments at the end of the chapter “Pot Holes.” They are very useful to remind students of the big picture and how each chapter fits into that picture, and they aid the student in learning about limitations and pitfalls that may result if procedures are misused. A deeper understanding of real-world use of statistics is made available through class projects, which were added in several chapters. These projects provide the opportunity for students alone, or in groups, to gather their own experimental data and draw inferences. In some cases, the work involves a problem whose solution will illustrate the meaning of a concept or provide an empirical understanding of an important statistical result. Some existing examples were expanded and new ones were introduced to create “case studies,” in which commentary is provided to give the student a clear understanding of a statistical concept in the context of a practical situation.
In this edition, we continue to emphasize a balance between theory and applications. Calculus and other types of mathematical support (e.g., linear algebra) are used at about the same level as in previous editions. The coverage of analytical tools in statistics is enhanced with the use of calculus when discussion centers on rules and concepts in probability. Probability distributions and statistical inference are highlighted in Chapters 2 through 10. Linear algebra and matrices are very lightly applied in Chapters 11 through 15, where linear regression and analysis of variance are covered. Students using this text should have had the equivalent of one semester of differential and integral calculus. Linear algebra is helpful but not necessary so long as the section in Chapter 12 on multiple linear regression using matrix algebra is not covered by the instructor. As in previous editions, a large number of exercises that deal with real-life scientific and engineering applications are available to challenge the student. The many data sets associated with the exercises are available for download from the website http://www.pearsonhighered.com/datasets.

Summary of the Changes in the Ninth Edition
• Class projects were added in several chapters to provide a deeper understanding of the real-world use of statistics. Students are asked to produce or gather their own experimental data and draw inferences from these data.
• More case studies were added and others expanded to help students understand the statistical methods being presented in the context of a real-life situation. For example, the interpretation of confidence limits, prediction limits, and tolerance limits is given using a real-life situation.
• “Pot Holes” were added at the end of some chapters and expanded in others. These comments are intended to present each chapter in the context of the big picture and discuss how the chapters relate to one another. They also provide cautions about the possible misuse of statistical techniques presented in the chapter.
• Chapter 1 has been enhanced to include more on single-number statistics as well as graphical techniques. New fundamental material on sampling and experimental design is presented.
• Examples added to Chapter 8 on sampling distributions are intended to motivate P-values and hypothesis testing. This prepares the student for the more challenging material on these topics that will be presented in Chapter 10.
• Chapter 12 contains additional development regarding the effect of a single regression variable in a model in which collinearity with other variables is severe.
• Chapter 15 now introduces material on the important topic of response surface methodology (RSM). The use of noise variables in RSM allows the illustration of mean and variance (dual response surface) modeling.
• The central composite design (CCD) is introduced in Chapter 15.
• More examples are given in Chapter 18, and the discussion of using Bayesian methods for statistical decision making has been enhanced.
Content and Course Planning
This text is designed for either a one- or a two-semester course. A reasonable plan for a one-semester course might include Chapters 1 through 10. This would result in a curriculum that concluded with the fundamentals of both estimation and hypothesis testing. Instructors who desire that students be exposed to simple linear regression may wish to include a portion of Chapter 11. For instructors who desire to have analysis of variance included rather than regression, the one-semester course may include Chapter 13 rather than Chapters 11 and 12. Chapter 13 features one-factor analysis of variance. Another option is to eliminate portions of Chapters 5 and/or 6 as well as Chapter 7. With this option, one or more of the discrete or continuous distributions in Chapters 5 and 6 may be eliminated. These distributions include the negative binomial, geometric, gamma, Weibull, beta, and lognormal distributions. Other features that one might consider removing from a one-semester curriculum include maximum likelihood estimation, prediction, and/or tolerance limits in Chapter 9. A one-semester curriculum has built-in flexibility, depending on the relative interest of the instructor in regression, analysis of variance, experimental design, and response surface methods (Chapter 15). There are several discrete and continuous distributions (Chapters 5 and 6) that have applications in a variety of engineering and scientific areas.
Chapters 11 through 18 contain substantial material that can be added for the second semester of a two-semester course. The material on simple and multiple linear regression is in Chapters 11 and 12, respectively. Chapter 12 alone offers a substantial amount of flexibility. Multiple linear regression includes such “special topics” as categorical or indicator variables, sequential methods of model selection such as stepwise regression, the study of residuals for the detection of violations of assumptions, cross validation and the use of the PRESS statistic as well as Cp, and logistic regression. The use of orthogonal regressors, a precursor to the experimental design in Chapter 15, is highlighted. Chapters 13 and 14 offer a relatively large amount of material on analysis of variance (ANOVA) with fixed, random, and mixed models. Chapter 15 highlights the application of two-level designs in the context of full and fractional factorial experiments (2k). Special screening designs are illustrated. Chapter 15 also features a new section on response surface methodology (RSM) to illustrate the use of experimental design for finding optimal process conditions. The fitting of a second order model through the use of a central composite design is discussed. RSM is expanded to cover the analysis of robust parameter design type problems. Noise variables are used to accommodate dual response surface models. Chapters 16, 17, and 18 contain a moderate amount of material on nonparametric statistics, quality control, and Bayesian inference.
Chapter 1 is an overview of statistical inference presented on a mathematically simple level. It has been expanded from the eighth edition to more thoroughly cover single-number statistics and graphical techniques. It is designed to give students a preliminary presentation of elementary concepts that will allow them to understand more involved details that follow. Elementary concepts in sampling, data collection, and experimental design are presented, and rudimentary aspects of graphical tools are introduced, as well as a sense of what is garnered from a data set. Stem-and-leaf plots and box-and-whisker plots have been added. Graphs are better organized and labeled. The discussion of uncertainty and variation in a system is thorough and well illustrated. There are examples of how to sort out the important characteristics of a scientific process or system, and these ideas are illustrated in practical settings such as manufacturing processes, biomedical studies, and studies of biological and other scientific systems. A contrast is made between the use of discrete and continuous data. Emphasis is placed on the use of models and the information concerning statistical models that can be obtained from graphical tools.
Chapters 2, 3, and 4 deal with basic probability as well as discrete and continuous random variables. Chapters 5 and 6 focus on specific discrete and continuous distributions as well as relationships among them. These chapters also highlight examples of applications of the distributions in real-life scientific and engineering studies. Examples, case studies, and a large number of exercises edify the student concerning the use of these distributions. Projects bring the practical use of these distributions to life through group work. Chapter 7 is the most theoretical chapter in the text. It deals with transformation of random variables and will likely not be used unless the instructor wishes to teach a relatively theoretical course. Chapter 8 contains graphical material, expanding on the more elementary set of graphical tools presented and illustrated in Chapter 1. Probability plotting is discussed and illustrated with examples. The very important concept of sampling distributions is presented thoroughly, and illustrations are given that involve the central limit theorem and the distribution of a sample variance under normal, independent (i.i.d.) sampling. The t and F distributions are introduced to motivate their use in chapters to follow. New material in Chapter 8 helps the student to visualize the importance of hypothesis testing, motivating the concept of a P-value.
Chapter 9 contains material on one- and two-sample point and interval estimation. A thorough discussion with examples points out the contrast between the different types of intervals—confidence intervals, prediction intervals, and tolerance intervals. A case study illustrates the three types of statistical intervals in the context of a manufacturing situation. This case study highlights the differences among the intervals, their sources, and the assumptions made in their development, as well as what type of scientific study or question requires the use of each one. A new approximation method has been added for the inference concerning a proportion. Chapter 10 begins with a basic presentation on the pragmatic meaning of hypothesis testing, with emphasis on such fundamental concepts as null and alternative hypotheses, the role of probability and the P-value, and the power of a test. Following this, illustrations are given of tests concerning one and two samples under standard conditions. The two-sample t-test with paired observations is also described. A case study helps the student to develop a clear picture of what interaction among factors really means as well as the dangers that can arise when interaction between treatments and experimental units exists. At the end of Chapter 10 is a very important section that relates Chapters 9 and 10 (estimation and hypothesis testing) to Chapters 11 through 16, where statistical modeling is prominent. It is important that the student be aware of the strong connection.
Chapters 11 and 12 contain material on simple and multiple linear regression, respectively. Considerably more attention is given in this edition to the effect that collinearity among the regression variables plays. A situation is presented that shows how the role of a single regression variable can depend in large part on what regressors are in the model with it. The sequential model selection procedures (forward, backward, stepwise, etc.) are then revisited in regard to this concept, and the rationale for using certain P-values with these procedures is provided. Chapter 12 offers material on nonlinear modeling with a special presentation of logistic regression, which has applications in engineering and the biological sciences. The material on multiple regression is quite extensive and thus provides considerable flexibility for the instructor, as indicated earlier. At the end of Chapter 12 is commentary relating that chapter to Chapters 14 and 15. Several features were added that provide a better understanding of the material in general. For example, the end-of-chapter material deals with cautions and difficulties one might encounter. It is pointed out that there are types of responses that occur naturally in practice (e.g., proportion responses, count responses, and several others) with which standard least squares regression should not be used because standard assumptions do not hold and violation of assumptions may induce serious errors. The suggestion is made that data transformation on the response may alleviate the problem in some cases.

Flexibility is again available in Chapters 13 and 14, on the topic of analysis of variance. Chapter 13 covers one-factor ANOVA in the context of a completely randomized design. Complementary topics include tests on variances and multiple comparisons. Comparisons of treatments in blocks are highlighted, along with the topic of randomized complete blocks. Graphical methods are extended to ANOVA to aid the student in supplementing the formal inference with a pictorial type of inference that can aid scientists and engineers in presenting material. A new project is given in which students incorporate the appropriate randomization into each plan and use graphical techniques and P-values in reporting the results. Chapter 14 extends the material in Chapter 13 to accommodate two or more factors that are in a factorial structure. The ANOVA presentation in Chapter 14 includes work in both random and fixed effects models. Chapter 15 offers material associated with 2^k factorial designs; examples and case studies present the use of screening designs and special higher fractions of the 2^k. Two new and special features are the presentations of response surface methodology (RSM) and robust parameter design. These topics are linked in a case study that describes and illustrates a dual response surface design and analysis featuring the use of process mean and variance response surfaces.
Computer Software
Case studies, beginning in Chapter 8, feature computer printout and graphical material generated using both SAS and MINITAB. The inclusion of the computer reflects our belief that students should have the experience of reading and interpreting computer printout and graphics, even if the software in the text is not that which is used by the instructor. Exposure to more than one type of software can broaden the experience base for the student. There is no reason to believe that the software used in the course will be that which the student will be called upon to use in practice following graduation. Examples and case studies in the text are supplemented, where appropriate, by various types of residual plots, quantile plots, normal probability plots, and other plots. Such plots are particularly prevalent in Chapters 11 through 15.

Supplements
Instructor’s Solutions Manual. This resource contains worked-out solutions to all text exercises and is available for download from Pearson Education’s Instructor Resource Center.
Student Solutions Manual ISBN-10: 0-321-64013-6; ISBN-13: 978-0-321-64013-0. Featuring complete solutions to selected exercises, this is a great tool for students as they study and work through the problem material.
PowerPoint® Lecture Slides ISBN-10: 0-321-73731-8; ISBN-13: 978-0-321-73731-1. These slides include most of the figures and tables from the text. Slides are available to download from Pearson Education’s Instructor Resource Center.

StatCrunch eText. This interactive, online textbook includes StatCrunch, powerful web-based statistical software. Embedded StatCrunch buttons allow users to open all data sets and tables from the book with the click of a button and immediately perform an analysis using StatCrunch.
StatCrunch™. StatCrunch is web-based statistical software that allows users to perform complex analyses, share data sets, and generate compelling reports of their data. Users can upload their own data to StatCrunch or search the library of over twelve thousand publicly shared data sets, covering almost any topic of interest. Interactive graphical outputs help users understand statistical concepts and are available for export to enrich reports with visual representations of data. Additional features include
• A full range of numerical and graphical methods that allow users to analyze and gain insights from any data set.
• Reporting options that help users create a wide variety of visually appealing representations of their data.
• An online survey tool that allows users to quickly build and administer surveys via a web form.
StatCrunch is available to qualified adopters. For more information, visit our website at www.statcrunch.com or contact your Pearson representative.
Acknowledgments
We are indebted to those colleagues who reviewed the previous editions of this book and provided many helpful suggestions for this edition. They are David Groggel, Miami University; Lance Hemlow, Raritan Valley Community College; Ying Ji, University of Texas at San Antonio; Thomas Kline, University of Northern Iowa; Sheila Lawrence, Rutgers University; Luis Moreno, Broome County Community College; Donald Waldman, University of Colorado—Boulder; and Marlene Will, Spalding University. We would also like to thank Delray Schulz, Millersville University; Roxane Burrows, Hocking College; and Frank Chmely for ensuring the accuracy of this text.
We would like to thank the editorial and production services provided by numerous people from Pearson/Prentice Hall, especially the editor in chief Deirdre Lynch, acquisitions editor Christopher Cummings, executive content editor Christine O’Brien, production editor Tracy Patruno, and copyeditor Sally Lifland. Many useful comments and suggestions by proofreader Gail Magin are greatly appreciated. We thank the Virginia Tech Statistical Consulting Center, which was the source of many real-life data sets.
R.H.M. S.L.M. K.Y.

Chapter 1
Introduction to Statistics and Data Analysis
1.1 Overview: Statistical Inference, Samples, Populations, and the Role of Probability
Beginning in the 1980s and continuing into the 21st century, an inordinate amount of attention has been focused on improvement of quality in American industry. Much has been said and written about the Japanese “industrial miracle,” which began in the middle of the 20th century. The Japanese were able to succeed where we and other countries had failed, namely, to create an atmosphere that allows the production of high-quality products. Much of the success of the Japanese has been attributed to the use of statistical methods and statistical thinking among management personnel.
Use of Scientific Data
The use of statistical methods in manufacturing, development of food products, computer software, energy sources, pharmaceuticals, and many other areas involves the gathering of information or scientific data. Of course, the gathering of data is nothing new. It has been done for well over a thousand years. Data have been collected, summarized, reported, and stored for perusal. However, there is a profound distinction between collection of scientific information and inferential statistics. It is the latter that has received rightful attention in recent decades.
The offspring of inferential statistics has been a large “toolbox” of statistical methods employed by statistical practitioners. These statistical methods are designed to contribute to the process of making scientific judgments in the face of uncertainty and variation. The product density of a particular material from a manufacturing process will not always be the same. Indeed, if the process involved is a batch process rather than continuous, there will be not only variation in material density among the batches that come off the line (batch-to-batch variation), but also within-batch variation. Statistical methods are used to analyze data from a process such as this one in order to gain more sense of where in the process changes may be made to improve the quality of the process. In this process, quality may well be defined in relation to closeness to a target density value in harmony with what portion of the time this closeness criterion is met.

An engineer may be concerned with a specific instrument that is used to measure sulfur monoxide in the air during pollution studies. If the engineer has doubts about the effectiveness of the instrument, there are two sources of variation that must be dealt with. The first is the variation in sulfur monoxide values that are found at the same locale on the same day. The second is the variation between values observed and the true amount of sulfur monoxide that is in the air at the time. If either of these two sources of variation is exceedingly large (according to some standard set by the engineer), the instrument may need to be replaced.

In a biomedical study of a new drug that reduces hypertension, 85% of patients experienced relief, while it is generally recognized that the current drug, or “old” drug, brings relief to 80% of patients that have chronic hypertension. However, the new drug is more expensive to make and may result in certain side effects. Should the new drug be adopted? This is a problem that is encountered (often with much more complexity) frequently by pharmaceutical firms in conjunction with the FDA (Food and Drug Administration). Again, the consideration of variation needs to be taken into account. The “85%” value is based on a certain number of patients chosen for the study. Perhaps if the study were repeated with new patients the observed number of “successes” would be 75%! It is the natural variation from study to study that must be taken into account in the decision process. Clearly this variation is important, since variation from patient to patient is endemic to the problem.
Variability in Scientific Data
In the problems discussed above the statistical methods used involve dealing with variability, and in each case the variability to be studied is that encountered in scientific data. If the observed product density in the process were always the same and were always on target, there would be no need for statistical methods. If the device for measuring sulfur monoxide always gives the same value and the value is accurate (i.e., it is correct), no statistical analysis is needed. If there were no patient-to-patient variability inherent in the response to the drug (i.e., it either always brings relief or not), life would be simple for scientists in the pharmaceutical firms and FDA and no statistician would be needed in the decision process. Statistics researchers have produced an enormous number of analytical methods that allow for analysis of data from systems like those described above. This reflects the true nature of the science that we call inferential statistics, namely, using techniques that allow us to go beyond merely reporting data to drawing conclusions (or inferences) about the scientific system. Statisticians make use of fundamental laws of probability and statistical inference to draw conclusions about scientific systems. Information is gathered in the form of samples, or collections of observations. The process of sampling is introduced in Chapter 2, and the discussion continues throughout the entire book.
Samples are collected from populations, which are collections of all individuals or individual items of a particular type. At times a population signifies a scientific system. For example, a manufacturer of computer boards may wish to eliminate defects. A sampling process may involve collecting information on 50 computer boards sampled randomly from the process. Here, the population is all computer boards manufactured by the firm over a specific period of time. If an improvement is made in the computer board process and a second sample of boards is collected, any conclusions drawn regarding the effectiveness of the change in process should extend to the entire population of computer boards produced under the “improved process.” In a drug experiment, a sample of patients is taken and each is given a specific drug to reduce blood pressure. The interest is focused on drawing conclusions about the population of those who suffer from hypertension.
Often, it is very important to collect scientific data in a systematic way, with planning being high on the agenda. At times the planning is, by necessity, quite limited. We often focus only on certain properties or characteristics of the items or objects in the population. Each characteristic has particular engineering or, say, biological importance to the “customer,” the scientist or engineer who seeks to learn about the population. For example, in one of the illustrations above the quality of the process had to do with the product density of the output of a process. An engineer may need to study the effect of process conditions, temperature, humidity, amount of a particular ingredient, and so on. He or she can systematically move these factors to whatever levels are suggested according to whatever prescription or experimental design is desired. However, a forest scientist who is interested in a study of factors that influence wood density in a certain kind of tree cannot necessarily design an experiment. This case may require an observational study in which data are collected in the field but factor levels cannot be preselected. Both of these types of studies lend themselves to methods of statistical inference. In the former, the quality of the inferences will depend on proper planning of the experiment. In the latter, the scientist is at the mercy of what can be gathered. For example, it is sad if an agronomist is interested in studying the effect of rainfall on plant yield and the data are gathered during a drought.
The importance of statistical thinking by managers and the use of statistical inference by scientific personnel are widely acknowledged. Research scientists gain much from scientific data. Data provide understanding of scientific phenomena. Product and process engineers learn a great deal in their off-line efforts to improve the process. They also gain valuable insight by gathering production data (on-line monitoring) on a regular basis. This allows them to determine necessary modifications in order to keep the process at a desired level of quality.
There are times when a scientific practitioner wishes only to gain some sort of summary of a set of data represented in the sample. In other words, inferential statistics is not required. Rather, a set of single-number statistics or descriptive statistics is helpful. These numbers give a sense of the center of location of the data, variability in the data, and the general nature of the distribution of observations in the sample. Though no specific statistical methods leading to statistical inference are incorporated, much can be learned. At times, descriptive statistics are accompanied by graphics. Modern statistical software packages allow for computation of means, medians, standard deviations, and other single-number statistics as well as production of graphs that show a “footprint” of the nature of the sample. Definitions and illustrations of the single-number statistics and graphs, including histograms, stem-and-leaf plots, scatter plots, dot plots, and box plots, will be given in sections that follow.
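As a small illustration of such single-number summaries, the sketch below (written in Python, used here purely for illustration since it is not the software featured in this text, and applied to a made-up sample) computes a mean, a median, and a standard deviation:

    import statistics

    # A small hypothetical sample, purely to illustrate single-number summaries.
    sample = [2.3, 1.9, 2.7, 2.1, 3.0, 2.4]

    print("mean   =", statistics.mean(sample))    # center: the sample mean
    print("median =", statistics.median(sample))  # center: the sample median
    print("stdev  =", statistics.stdev(sample))   # variability: sample standard deviation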

The Role of Probability
In this book, Chapters 2 to 6 deal with fundamental notions of probability. A thorough grounding in these concepts allows the reader to have a better understanding of statistical inference. Without some formalism of probability theory, the student cannot appreciate the true interpretation from data analysis through modern statistical methods. It is quite natural to study probability prior to studying statistical inference. Elements of probability allow us to quantify the strength or “confidence” in our conclusions. In this sense, concepts in probability form a major component that supplements statistical methods and helps us gauge the strength of the statistical inference. The discipline of probability, then, provides the transition between descriptive statistics and inferential methods. Elements of probability allow the conclusion to be put into the language that the science or engineering practitioners require. An example follows that will enable the reader to understand the notion of a P-value, which often provides the “bottom line” in the interpretation of results from the use of statistical methods.
Example 1.1: Suppose that an engineer encounters data from a manufacturing process in which 100 items are sampled and 10 are found to be defective. It is expected and anticipated that occasionally there will be defective items. Obviously these 100 items represent the sample. However, it has been determined that in the long run, the company can only tolerate 5% defective in the process. Now, the elements of probability allow the engineer to determine how conclusive the sample information is regarding the nature of the process. In this case, the population conceptually represents all possible items from the process. Suppose we learn that if the process is acceptable, that is, if it does produce items no more than 5% of which are defective, there is a probability of 0.0282 of obtaining 10 or more defective items in a random sample of 100 items from the process. This small probability suggests that the process does, indeed, have a long-run rate of defective items that exceeds 5%. In other words, under the condition of an acceptable process, the sample information obtained would rarely occur. However, it did occur! Clearly, though, it would occur with a much higher probability if the process defective rate exceeded 5% by a significant amount.
From this example it becomes clear that the elements of probability aid in the translation of sample information into something conclusive or inconclusive about the scientific system. In fact, what was learned likely is alarming information to the engineer or manager. Statistical methods, which we will actually detail in Chapter 10, produced a P-value of 0.0282. The result suggests that the process very likely is not acceptable. The concept of a P-value is dealt with at length in succeeding chapters. The example that follows provides a second illustration.
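The 0.0282 figure is a binomial tail probability of the kind developed formally in Chapter 5. As a quick sketch of where such a number comes from (Python is used here only for illustration; the text itself works from tables and from SAS and MINITAB output), the tail can be summed directly:

    from math import comb

    # P(X >= 10) for X ~ Binomial(n = 100, p = 0.05): the chance of seeing
    # 10 or more defectives in a random sample of 100 items when the true
    # long-run defective rate is 5%.
    n, p, k = 100, 0.05, 10
    tail = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))
    print(f"P(X >= {k}) = {tail:.4f}")  # approximately 0.0282, as quoted above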
Example 1.2: Often the nature of the scientific study will dictate the role that probability and deductive reasoning play in statistical inference. Exercise 9.40 on page 294 provides data associated with a study conducted at the Virginia Polytechnic Institute and State University on the development of a relationship between the roots of trees and the action of a fungus. Minerals are transferred from the fungus to the trees and sugars from the trees to the fungus. Two samples of 10 northern red oak seedlings were planted in a greenhouse, one containing seedlings treated with nitrogen and the other containing seedlings with no nitrogen. All other environmental conditions were held constant. All seedlings contained the fungus Pisolithus tinctorus. More details are supplied in Chapter 9. The stem weights in grams were recorded after the end of 140 days. The data are given in Table 1.1.
Table 1.1: Data Set for Example 1.2
No Nitrogen   Nitrogen
0.32          0.26
0.53          0.43
0.28          0.47
0.37          0.49
0.47          0.52
0.43          0.75
0.36          0.79
0.42          0.86
0.38          0.62
0.43          0.46
[Figure 1.1: A dot plot of stem weight data.]
In this example there are two samples from two separate populations. The purpose of the experiment is to determine if the use of nitrogen has an influence on the growth of the roots. The study is a comparative study (i.e., we seek to compare the two populations with regard to a certain important characteristic). It is instructive to plot the data as shown in the dot plot of Figure 1.1. The ◦ values represent the “nitrogen” data and the × values represent the “no-nitrogen” data.
Notice that the general appearance of the data might suggest to the reader that, on average, the use of nitrogen increases the stem weight. Four nitrogen observations are considerably larger than any of the no-nitrogen observations. Most of the no-nitrogen observations appear to be below the center of the data. The appearance of the data set would seem to indicate that nitrogen is effective. But how can this be quantified? How can all of the apparent visual evidence be summarized in some sense? As in the preceding example, the fundamentals of probability can be used. The conclusions may be summarized in a probability statement or P-value. We will not show here the statistical inference that produces the summary probability. As in Example 1.1, these methods will be discussed in Chapter 10. The issue revolves around the “probability that data like these could be observed” given that nitrogen has no effect, in other words, given that both samples were generated from the same population. Suppose that this probability is small, say 0.03. That would certainly be strong evidence that the use of nitrogen does indeed influence (apparently increases) average stem weight of the red oak seedlings.
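The formal inference is deferred to Chapter 10, but the flavor of “the probability that data like these could be observed” under a no-effect assumption can be conveyed with a small simulation, sketched below in Python. (This randomization sketch is offered only as an illustration; it is not the procedure developed later in the text.) It repeatedly shuffles the 20 stem weights into two arbitrary groups of 10 and records how often the difference in group means is at least as large as the difference actually observed:

    import random

    no_nitrogen = [0.32, 0.53, 0.28, 0.37, 0.47, 0.43, 0.36, 0.42, 0.38, 0.43]
    nitrogen    = [0.26, 0.43, 0.47, 0.49, 0.52, 0.75, 0.79, 0.86, 0.62, 0.46]

    # Observed difference in sample means (nitrogen minus no nitrogen).
    observed = sum(nitrogen) / 10 - sum(no_nitrogen) / 10

    pooled = no_nitrogen + nitrogen
    random.seed(1)                 # fixed seed so the sketch is reproducible
    extreme, trials = 0, 100_000
    for _ in range(trials):
        random.shuffle(pooled)     # regroup at random, as if nitrogen had no effect
        diff = sum(pooled[10:]) / 10 - sum(pooled[:10]) / 10
        if diff >= observed:
            extreme += 1

    print(f"observed difference = {observed:.3f}")
    print(f"estimated chance of so large a difference by luck = {extreme / trials:.4f}")

A small estimated probability here would play exactly the role of the probability statement discussed above: strong evidence that the two samples were not generated from one common population.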

How Do Probability and Statistical Inference Work Together?
It is important for the reader to understand the clear distinction between the discipline of probability, a science in its own right, and the discipline of inferential statistics. As we have already indicated, the use or application of concepts in probability allows real-life interpretation of the results of statistical inference. As a result, it can be said that statistical inference makes use of concepts in probability. One can glean from the two examples above that the sample information is made available to the analyst and, with the aid of statistical methods and elements of probability, conclusions are drawn about some feature of the population (the process does not appear to be acceptable in Example 1.1, and nitrogen does appear to influence average stem weights in Example 1.2). Thus for a statistical problem, the sample along with inferential statistics allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability. This reasoning is inductive in nature. Now as we move into Chapter 2 and beyond, the reader will note that, unlike what we do in our two examples here, we will not focus on solving statistical problems. Many examples will be given in which no sample is involved. There will be a population clearly described with all features of the population known. Then questions of importance will focus on the nature of data that might hypothetically be drawn from the population. Thus, one can say that elements in probability allow us to draw conclusions about characteristics of hypothetical data taken from the population, based on known features of the population. This type of reasoning is deductive in nature. Figure 1.2 shows the fundamental relationship between probability and inferential statistics.
[Figure 1.2: Fundamental relationship between probability and inferential statistics. Probability reasons from the population to the sample; statistical inference reasons from the sample back to the population.]
Now, in the grand scheme of things, which is more important, the field of probability or the field of statistics? They are both very important and clearly are complementary. The only certainty concerning the pedagogy of the two disciplines lies in the fact that if statistics is to be taught at more than merely a “cookbook” level, then the discipline of probability must be taught first. This rule stems from the fact that nothing can be learned about a population from a sample until the analyst learns the rudiments of uncertainty in that sample. For example, consider Example 1.1. The question centers around whether or not the population, defined by the process, is no more than 5% defective. In other words, the conjecture is that on the average 5 out of 100 items are defective. Now, the sample contains 100 items and 10 are defective. Does this support the conjecture or refute it? On the surface it would appear to be a refutation of the conjecture because 10 out of 100 seem to be “a bit much.” But without elements of probability, how do we know? Only through the study of material in future chapters will we learn the conditions under which the process is acceptable (5% defective). The probability of obtaining 10 or more defective items in a sample of 100 is 0.0282.
We have given two examples where the elements of probability provide a summary that the scientist or engineer can use as evidence on which to build a decision. The bridge between the data and the conclusion is, of course, based on foundations of statistical inference, distribution theory, and sampling distributions discussed in future chapters.
1.2 Sampling Procedures; Collection of Data
In Section 1.1 we discussed very briefly the notion of sampling and the sampling process. While sampling appears to be a simple concept, the complexity of the questions that must be answered about the population or populations necessitates that the sampling process be very complex at times. While the notion of sampling is discussed in a technical way in Chapter 8, we shall endeavor here to give some common-sense notions of sampling. This is a natural transition to a discussion of the concept of variability.
Simple Random Sampling
The importance of proper sampling revolves around the degree of confidence with which the analyst is able to answer the questions being asked. Let us assume that only a single population exists in the problem. Recall that in Example 1.2 two populations were involved. Simple random sampling implies that any particular sample of a specified sample size has the same chance of being selected as any other sample of the same size. The term sample size simply means the number of elements in the sample. Obviously, a table of random numbers can be utilized in sample selection in many instances. The virtue of simple random sampling is that it aids in the elimination of the problem of having the sample reflect a different (possibly more confined) population than the one about which inferences need to be made. For example, a sample is to be chosen to answer certain questions regarding political preferences in a certain state in the United States. The sample involves the choice of, say, 1000 families, and a survey is to be conducted. Now, suppose it turns out that random sampling is not used. Rather, all or nearly all of the 1000 families chosen live in an urban setting. It is believed that political preferences in rural areas differ from those in urban areas. In other words, the sample drawn actually confined the population and thus the inferences need to be confined to the “limited population,” and in this case confining may be undesirable. If, indeed, the inferences need to be made about the state as a whole, the sample of size 1000 described here is often referred to as a biased sample.
As we hinted earlier, simple random sampling is not always appropriate. Which alternative approach is used depends on the complexity of the problem. Often, for example, the sampling units are not homogeneous and naturally divide themselves into nonoverlapping groups that are homogeneous. These groups are called strata, and a procedure called stratified random sampling involves random selection of a sample within each stratum. The purpose is to be sure that each of the strata is neither over- nor underrepresented. For example, suppose a sample survey is conducted in order to gather preliminary opinions regarding a bond referendum that is being considered in a certain city. The city is subdivided into several ethnic groups which represent natural strata. In order not to disregard or overrepresent any group, separate random samples of families could be chosen from each group.
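To make the two schemes concrete, here is a minimal sketch in Python of drawing both kinds of samples; the family labels and the urban/rural stratum sizes are invented for the illustration and do not come from the text:

    import random

    random.seed(7)  # fixed seed for a reproducible illustration

    # Simple random sampling: every possible sample of size 50 is equally likely.
    population = list(range(1, 1001))      # hypothetical 1000 families, labeled 1-1000
    srs = random.sample(population, 50)

    # Stratified random sampling: sample within each stratum in proportion
    # to its size, so that no group is over- or underrepresented.
    strata = {"urban": list(range(1, 701)),     # invented: 700 urban families
              "rural": list(range(701, 1001))}  # invented: 300 rural families
    stratified = {name: random.sample(group, round(50 * len(group) / 1000))
                  for name, group in strata.items()}

    print(len(srs))                                           # 50
    print({name: len(s) for name, s in stratified.items()})   # {'urban': 35, 'rural': 15}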
Experimental Design
The concept of randomness or random assignment plays a huge role in the area of experimental design, which was introduced very briefly in Section 1.1 and is an important staple in almost any area of engineering or experimental science. This will be discussed at length in Chapters 13 through 15. However, it is instructive to give a brief presentation here in the context of random sampling. A set of so-called treatments or treatment combinations becomes the populations to be studied or compared in some sense. An example is the nitrogen versus no-nitrogen treatments in Example 1.2. Another simple example would be “placebo” versus “active drug,” or in a corrosion fatigue study we might have treatment combinations that involve specimens that are coated or uncoated as well as conditions of low or high humidity to which the specimens are exposed. In fact, there are four treatment or factor combinations (i.e., 4 populations), and many scientific questions may be asked and answered through statistical and inferential methods. Consider first the situation in Example 1.2. There are 20 diseased seedlings involved in the experiment. It is easy to see from the data themselves that the seedlings are different from each other. Within the nitrogen group (or the no-nitrogen group) there is considerable variability in the stem weights. This variability is due to what is generally called the experimental unit. This is a very important concept in inferential statistics, in fact one whose description will not end in this chapter. The nature of the variability is very important. If it is too large, stemming from a condition of excessive nonhomogeneity in experimental units, the variability will “wash out” any detectable difference between the two populations. Recall that in this case that did not occur.
The dot plot in Figure 1.1 and P-value indicated a clear distinction between these two conditions. What role do those experimental units play in the data-taking process itself? The common-sense and, indeed, quite standard approach is to assign the 20 seedlings or experimental units randomly to the two treatments or conditions. In the drug study, we may decide to use a total of 200 available patients, patients that clearly will be different in some sense. They are the experimental units. However, they all may have the same chronic condition for which the drug is a potential treatment. Then in a so-called completely randomized design, 100 patients are assigned randomly to the placebo and 100 to the active drug. Again, it is these experimental units within a group or treatment that produce the variability in data results (i.e., variability in the measured result), say blood pressure, or whatever drug efficacy value is important. In the corrosion fatigue study, the experimental units are the specimens that are the subjects of the corrosion.
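To make random assignment concrete, here is a minimal Python sketch (ours, not the text's; the unit labels are hypothetical) of a completely randomized design that splits 20 experimental units evenly between two treatments:

import random

random.seed(1)                    # reproducible illustration
units = list(range(1, 21))        # labels for 20 experimental units (e.g., seedlings)
random.shuffle(units)             # random permutation of the units
groups = {"nitrogen": sorted(units[:10]),
          "no nitrogen": sorted(units[10:])}
print(groups)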
Why Assign Experimental Units Randomly?
What is the possible negative impact of not randomly assigning experimental units to the treatments or treatment combinations? This is seen most clearly in the case of the drug study. Among the characteristics of the patients that produce variability in the results are age, gender, and weight. Suppose merely by chance the placebo group contains a sample of people that are predominantly heavier than those in the treatment group. Perhaps heavier individuals have a tendency to have a higher blood pressure. This clearly biases the result, and indeed, any result obtained through the application of statistical inference may have little to do with the drug and more to do with differences in weights between the two samples of patients.
We should emphasize the attachment of importance to the term variability. Excessive variability among experimental units “camouflages” scientific findings. In future sections, we attempt to characterize and quantify measures of variability. In sections that follow, we introduce and discuss specific quantities that can be computed in samples; the quantities give a sense of the nature of the sample with respect to center of location of the data and variability in the data. A discussion of several of these single-number measures serves to provide a preview of what statistical information will be important components of the statistical methods that are used in future chapters. These measures that help characterize the nature of the data set fall into the category of descriptive statistics. This material is a prelude to a brief presentation of pictorial and graphical methods that go even further in characterization of the data set. The reader should understand that the statistical methods illustrated here will be used throughout the text. In order to offer the reader a clearer picture of what is involved in experimental design studies, we offer Example 1.3.
Example 1.3: A corrosion study was made in order to determine whether coating an aluminum metal with a corrosion retardation substance reduced the amount of corrosion. The coating is a protectant that is advertised to minimize fatigue damage in this type of material. Also of interest is the influence of humidity on the amount of corrosion. A corrosion measurement can be expressed in thousands of cycles to failure. Two levels of coating, no coating and chemical corrosion coating, were used. In addition, the two relative humidity levels are 20% relative humidity and 80% relative humidity.
The experiment involves four treatment combinations that are listed in the table that follows. There are eight experimental units used, and they are aluminum specimens prepared; two are assigned randomly to each of the four treatment combinations. The data are presented in Table 1.2.
Table 1.2: Data for Example 1.3

Coating                       Humidity   Average Corrosion in
                                         Thousands of Cycles to Failure
Uncoated                      20%        975
Uncoated                      80%        350
Chemical Corrosion Coating    20%        1750
Chemical Corrosion Coating    80%        1550

The corrosion data are averages of two specimens. A plot of the averages is pictured in Figure 1.3. A relatively large value of cycles to failure represents a small amount of corrosion. As one might expect, an increase in humidity appears to make the corrosion worse. The use of the chemical corrosion coating procedure appears to reduce corrosion.

Figure 1.3: Corrosion results for Example 1.3 (average corrosion in thousands of cycles to failure plotted against humidity, 20% and 80%, for the uncoated and chemical corrosion coating conditions).

In this experimental design illustration, the engineer has systematically selected the four treatment combinations. In order to connect this situation to concepts to which the reader has been exposed to this point, it should be assumed that the conditions representing the four treatment combinations are four separate populations and that the two corrosion values observed for each population are important pieces of information. The importance of the average in capturing and summarizing certain features in the population will be highlighted in Section 1.3. While we might draw conclusions about the role of humidity and the impact of coating the specimens from the figure, we cannot truly evaluate the results from an analytical point of view without taking into account the variability around the average. Again, as we indicated earlier, if the two corrosion values for each treatment combination are close together, the picture in Figure 1.3 may be an accurate depiction. But if each corrosion value in the figure is an average of two values that are widely dispersed, then this variability may, indeed, truly “wash away” any information that appears to come through when one observes averages only. The foregoing example illustrates these concepts:
(1) random assignment of treatment combinations (coating, humidity) to experimental units (specimens)
(2) the use of sample averages (average corrosion values) in summarizing sample information
(3) the need for consideration of measures of variability in the analysis of any sample or sets of samples

This example suggests the need for what follows in Sections 1.3 and 1.4, namely, descriptive statistics that indicate measures of center of location in a set of data, and those that measure variability.
1.3 Measures of Location: The Sample Mean and Median
Definition 1.1: Suppose that the observations in a sample are x1, x2, . . . , xn. The sample mean, denoted by x̄, is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$
Measures of location are designed to provide the analyst with some quantitative values of where the center, or some other location, of data is located. In Example 1.2, it appears as if the center of the nitrogen sample clearly exceeds that of the no-nitrogen sample. One obvious and very useful measure is the sample mean. The mean is simply a numerical average.
There are other measures of central tendency that are discussed in detail in future chapters. One important measure is the sample median. The purpose of the sample median is to reflect the central tendency of the sample in such a way that it is uninfluenced by extreme values or outliers.
Definition 1.2: Given that the observations in a sample are x1, x2, . . . , xn, arranged in increasing order of magnitude, the sample median is

$$\tilde{x} = \begin{cases} x_{(n+1)/2}, & \text{if } n \text{ is odd},\\ \tfrac{1}{2}\left(x_{n/2} + x_{n/2+1}\right), & \text{if } n \text{ is even}. \end{cases}$$

As an example, suppose the data set is the following: 1.7, 2.2, 3.9, 3.11, and 14.7. The sample mean and median are, respectively,

x̄ = 5.12, x̃ = 3.9.

Clearly, the mean is influenced considerably by the presence of the extreme observation, 14.7, whereas the median places emphasis on the true “center” of the data set. In the case of the two-sample data set of Example 1.2, the two measures of central tendency for the individual samples are

x̄ (no nitrogen) = 0.399 gram,
x̃ (no nitrogen) = (0.38 + 0.42)/2 = 0.400 gram,
x̄ (nitrogen) = 0.565 gram,
x̃ (nitrogen) = (0.49 + 0.52)/2 = 0.505 gram.

Clearly there is a difference in concept between the mean and median. It may be of interest to the reader with an engineering background that the sample mean

is the centroid of the data in a sample. In a sense, it is the point at which a fulcrum can be placed to balance a system of “weights” which are the locations of the individual data. This is shown in Figure 1.4 with regard to the with-nitrogen sample.
Figure 1.4: Sample mean as a centroid of the with-nitrogen stem weight.
In future chapters, the basis for the computation of x̄ is that of an estimate of the population mean. As we indicated earlier, the purpose of statistical inference is to draw conclusions about population characteristics or parameters and estimation is a very important feature of statistical inference.
The median and mean can be quite different from each other. Note, however, that in the case of the stem weight data the sample mean value for no-nitrogen is quite similar to the median value.
Other Measures of Location
There are several other methods of quantifying the center of location of the data in the sample. We will not deal with them at this point. For the most part, alternatives to the sample mean are designed to produce values that represent compromises between the mean and the median. Rarely do we make use of these other measures. However, it is instructive to discuss one class of estimators, namely the class of trimmed means. A trimmed mean is computed by “trimming away” a certain percent of both the largest and the smallest set of values. For example, the 10% trimmed mean is found by eliminating the largest 10% and smallest 10% and computing the average of the remaining values. For example, in the case of the stem weight data, we would eliminate the largest and smallest since the sample size is 10 for each sample. So for the without-nitrogen group the 10% trimmed mean is given by
$$\bar{x}_{\mathrm{tr}(10)} = \frac{0.32 + 0.37 + 0.47 + 0.43 + 0.36 + 0.42 + 0.38 + 0.43}{8} = 0.39750,$$
and for the 10% trimmed mean for the with-nitrogen group we have
$$\bar{x}_{\mathrm{tr}(10)} = \frac{0.43 + 0.47 + 0.49 + 0.52 + 0.75 + 0.79 + 0.62 + 0.46}{8} = 0.56625.$$
Note that in this case, as expected, the trimmed means are close to both the mean and the median for the individual samples. The trimmed mean is, of course, less sensitive to outliers than the sample mean, although not as insensitive as the median. On the other hand, the trimmed mean approach makes use of more information than the sample median. Note that the sample median is, indeed, a special case of the trimmed mean in which all of the sample data are eliminated apart from the middle one or two observations.
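As a quick computational check on the calculations above, here is a minimal Python sketch (ours, not the text's) of the mean, median, and 10% trimmed mean, applied to the Example 1.2 stem weight samples; the values reproduce the means, medians, and trimmed means quoted in this section:

def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s, n = sorted(xs), len(xs)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def trimmed_mean(xs, p):
    s = sorted(xs)
    k = int(len(s) * p)            # number of values trimmed from each end
    return mean(s[k:len(s) - k])

no_nitrogen = [0.32, 0.53, 0.28, 0.37, 0.47, 0.43, 0.36, 0.42, 0.38, 0.43]
nitrogen = [0.26, 0.43, 0.47, 0.49, 0.52, 0.75, 0.79, 0.86, 0.62, 0.46]
for name, data in [("no nitrogen", no_nitrogen), ("nitrogen", nitrogen)]:
    print(name, round(mean(data), 4), round(median(data), 4),
          round(trimmed_mean(data, 0.10), 5))
# no nitrogen 0.399 0.4 0.3975
# nitrogen 0.565 0.505 0.56625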

Exercises
1.1 The following measurements were recorded for the drying time, in hours, of a certain brand of latex paint.
3.4 2.5 4.8 2.9 3.6
2.8 3.3 5.6 3.7 2.8
4.4 4.0 5.2 3.0 4.8
Assume that the measurements are a simple random sample.
(a) What is the sample size for the above sample?
(b) Calculate the sample mean for these data.
(c) Calculate the sample median.
(d) Plot the data by way of a dot plot.
(e) Compute the 20% trimmed mean for the above data set.
(f) Is the sample mean for these data more or less descriptive as a center of location than the trimmed mean?
1.2 According to the journal Chemical Engineering, an important property of a fiber is its water absorbency. A random sample of 20 pieces of cotton fiber was taken and the absorbency on each piece was measured. The following are the absorbency values:
18.71 21.41 20.72 21.81 19.29 22.43 20.17 23.71 19.44 20.50 18.92 20.33 23.00 22.85 19.25 21.77 22.11 19.77 18.04 21.12
(a) Calculate the sample mean and median for the above sample values.
(b) Compute the 10% trimmed mean.
(c) Do a dot plot of the absorbency data.
(d) Using only the values of the mean, median, and trimmed mean, do you have evidence of outliers in the data?
1.3 A certain polymer is used for evacuation systems for aircraft. It is important that the polymer be resistant to the aging process. Twenty specimens of the polymer were used in an experiment. Ten were assigned randomly to be exposed to an accelerated batch aging process that involved exposure to high temperatures for 10 days. Measurements of tensile strength of the specimens were made, and the following data were recorded on tensile strength in psi:

No aging: 227 222 218 217 225
          218 216 229 228 221
Aging:    219 214 215 211 209
          218 203 204 201 205

(a) Do a dot plot of the data.
(b) From your plot, does it appear as if the aging process has had an effect on the tensile strength of this polymer? Explain.
(c) Calculate the sample mean tensile strength of the two samples.
(d) Calculate the median for both. Discuss the similarity or lack of similarity between the mean and median of each group.

1.4 In a study conducted by the Department of Mechanical Engineering at Virginia Tech, the steel rods supplied by two different companies were compared. Ten sample springs were made out of the steel rods supplied by each company, and a measure of flexibility was recorded for each. The data are as follows:

Company A: 9.3 8.8 6.8 8.7 8.5 6.7 8.0 6.5 9.2 7.0
Company B: 11.0 9.8 9.9 10.2 10.1 9.7 11.0 11.1 10.2 9.6

(a) Calculate the sample mean and median for the data for the two companies.
(b) Plot the data for the two companies on the same line and give your impression regarding any apparent differences between the two companies.

1.5 Twenty adult males between the ages of 30 and 40 participated in a study to evaluate the effect of a specific health regimen involving diet and exercise on the blood cholesterol. Ten were randomly selected to be a control group, and ten others were assigned to take part in the regimen as the treatment group for a period of 6 months. The following data show the reduction in cholesterol experienced for the time period for the 20 subjects:

Control group:    7  3 −4 14  2  5 22 −7  9  5
Treatment group: −6  5  9  4  4 12 37  5  3  3

(a) Do a dot plot of the data for both groups on the same graph.
(b) Compute the mean, median, and 10% trimmed mean for both groups.
(c) Explain why the difference in means suggests one conclusion about the effect of the regimen, while the difference in medians or trimmed means suggests a different conclusion.

1.6 The tensile strength of silicone rubber is thought to be a function of curing temperature. A study was carried out in which samples of 12 specimens of the rubber were prepared using curing temperatures of 20°C and 45°C. The data below show the tensile strength values in megapascals.

20°C: 2.07 2.14 2.22 2.03 2.21 2.03 2.05 2.18 2.09 2.14 2.11 2.02
45°C: 2.52 2.15 2.49 2.03 2.37 2.05 1.99 2.42 2.08 2.42 2.29 2.01

(a) Show a dot plot of the data with both low and high temperature tensile strength values.
(b) Compute sample mean tensile strength for both samples.
(c) Does it appear as if curing temperature has an influence on tensile strength, based on the plot? Comment further.
(d) Does anything else appear to be influenced by an increase in curing temperature? Explain.

1.4 Measures of Variability
Sample variability plays an important role in data analysis. Process and product variability is a fact of life in engineering and scientific systems: The control or reduction of process variability is often a source of major difficulty. More and more process engineers and managers are learning that product quality and, as a result, profits derived from manufactured products are very much a function of process variability. As a result, much of Chapters 9 through 15 deals with data analysis and modeling procedures in which sample variability plays a major role. Even in small data analysis problems, the success of a particular statistical method may depend on the magnitude of the variability among the observations in the sample. Measures of location in a sample do not provide a proper summary of the nature of a data set. For instance, in Example 1.2 we cannot conclude that the use of nitrogen enhances growth without taking sample variability into account.
While the details of the analysis of this type of data set are deferred to Chapter 9, it should be clear from Figure 1.1 that variability among the no-nitrogen observations and variability among the nitrogen observations are certainly of some consequence. In fact, it appears that the variability within the nitrogen sample is larger than that of the no-nitrogen sample. Perhaps there is something about the inclusion of nitrogen that not only increases the stem height (x̄ of 0.565 gram compared to an x̄ of 0.399 gram for the no-nitrogen sample) but also increases the variability in stem height (i.e., renders the stem height more inconsistent).
As another example, contrast the two data sets below. Each contains two samples and the difference in the means is roughly the same for the two samples, but data set B seems to provide a much sharper contrast between the two populations from which the samples were taken. If the purpose of such an experiment is to detect differences between the two populations, the task is accomplished in the case of data set B. However, in data set A the large variability within the two samples creates difficulty. In fact, it is not clear that there is a distinction between the two populations.
Data set A: the X observations and the 0 observations intermingle along the measurement axis, and the two sample means, x̄_X and x̄_0, lie close together.

Data set B: the X observations cluster tightly and well apart from the cluster of 0 observations, and the sample means x̄_X and x̄_0 are clearly separated.

Sample Range and Sample Standard Deviation
Just as there are many measures of central tendency or location, there are many measures of spread or variability. Perhaps the simplest one is the sample range $X_{\max} - X_{\min}$. The range can be very useful and is discussed at length in Chapter 17 on statistical quality control. The sample measure of spread that is used most often is the sample standard deviation. We again let x1, x2, . . . , xn denote sample values.
Definition 1.3: The sample variance, denoted by s², is given by

$$s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}.$$

The sample standard deviation, denoted by s, is the positive square root of s², that is,

$$s = \sqrt{s^2}.$$
It should be clear to the reader that the sample standard deviation is, in fact, a measure of variability. Large variability in a data set produces relatively large values of (x − x̄)² and thus a large sample variance. The quantity n − 1 is often called the degrees of freedom associated with the variance estimate. In this simple example, the degrees of freedom depict the number of independent pieces of information available for computing variability. For example, suppose that we wish to compute the sample variance and standard deviation of the data set (5, 17, 6, 4). The sample average is x̄ = 8. The computation of the variance involves
$$(5-8)^2 + (17-8)^2 + (6-8)^2 + (4-8)^2 = (-3)^2 + 9^2 + (-2)^2 + (-4)^2.$$
The quantities inside parentheses sum to zero. In general, $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$ (see Exercise 1.16 on page 31). Then the computation of a sample variance does not involve n independent squared deviations from the mean x̄. In fact, since the last value of x − x̄ is determined by the initial n − 1 of them, we say that these are n − 1 “pieces of information” that produce s². Thus, there are n − 1 degrees of freedom rather than n degrees of freedom for computing a sample variance.
Example 1.4: In an example discussed extensively in Chapter 10, an engineer is interested in testing the “bias” in a pH meter. Data are collected on the meter by measuring the pH of a neutral substance (pH = 7.0). A sample of size 10 is taken, with results given by
7.07 7.00 7.10 6.97 7.00 7.03 7.01 7.01 6.98 7.08. The sample mean x̄ is given by
$$\bar{x} = \frac{7.07 + 7.00 + 7.10 + \cdots + 7.08}{10} = 7.0250.$$

The sample variance s² is given by

$$s^2 = \frac{1}{9}\left[(7.07 - 7.025)^2 + (7.00 - 7.025)^2 + (7.10 - 7.025)^2 + \cdots + (7.08 - 7.025)^2\right] = 0.001939.$$

As a result, the sample standard deviation is given by

$$s = \sqrt{0.001939} = 0.044.$$

So the sample standard deviation is 0.0440 with n − 1 = 9 degrees of freedom.
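The same computation is easy to express in a few lines of Python; this sketch (ours, not the text's) reproduces the pH meter numbers:

ph = [7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08]

n = len(ph)
xbar = sum(ph) / n                                  # sample mean
s2 = sum((x - xbar) ** 2 for x in ph) / (n - 1)     # n - 1 = 9 degrees of freedom
s = s2 ** 0.5                                       # sample standard deviation
print(round(xbar, 4), round(s2, 6), round(s, 4))    # 7.025 0.001939 0.044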
Units for Standard Deviation and Variance
It should be apparent from Definition 1.3 that the variance is a measure of the average squared deviation from the mean x ̄. We use the term average squared deviation even though the definition makes use of a division by degrees of freedom n − 1 rather than n. Of course, if n is large, the difference in the denominator is inconsequential. As a result, the sample variance possesses units that are the square of the units in the observed data whereas the sample standard deviation is found in linear units. As an example, consider the data of Example 1.2. The stem weights are measured in grams. As a result, the sample standard deviations are in grams and the variances are measured in grams2. In fact, the individual standard deviations are 0.0728 gram for the no-nitrogen case and 0.1867 gram for the nitrogen group. Note that the standard deviation does indicate considerably larger variability in the nitrogen sample. This condition was displayed in Figure 1.1.
Which Variability Measure Is More Important?
As we indicated earlier, the sample range has applications in the area of statistical quality control. It may appear to the reader that the use of both the sample variance and the sample standard deviation is redundant. Both measures reflect the same concept in measuring variability, but the sample standard deviation measures variability in linear units whereas the sample variance is measured in squared units. Both play huge roles in the use of statistical methods. Much of what is accomplished in the context of statistical inference involves drawing conclusions about characteristics of populations. Among these characteristics are constants which are called population parameters. Two important parameters are the population mean and the population variance. The sample variance plays an explicit role in the statistical methods used to draw inferences about the population variance. The sample standard deviation has an important role along with the sample mean in inferences that are made about the population mean. In general, the variance is considered more in inferential theory, while the standard deviation is used more in applications.

Exercises
1.7 Consider the drying time data for Exercise 1.1 on page 13. Compute the sample variance and sample standard deviation.
1.8 Compute the sample variance and standard deviation for the water absorbency data of Exercise 1.2 on page 13.
1.9 Exercise 1.3 on page 13 showed tensile strength data for two samples, one in which specimens were exposed to an aging process and one in which there was no aging of the specimens.
(a) Calculate the sample variance as well as standard deviation in tensile strength for both samples.
(b) Does there appear to be any evidence that aging affects the variability in tensile strength? (See also the plot for Exercise 1.3 on page 13.)
1.10 For the data of Exercise 1.4 on page 13, compute both the mean and the variance in “flexibility” for both company A and company B. Does there appear to be a difference in flexibility between company A and company B?
1.11 Consider the data in Exercise 1.5 on page 13. Compute the sample variance and the sample standard deviation for both control and treatment groups.
1.12 For Exercise 1.6 on page 13, compute the sample standard deviation in tensile strength for the samples separately for the two temperatures. Does it appear as if an increase in temperature influences the variability in tensile strength? Explain.
1.5 Discrete and Continuous Data
Statistical inference through the analysis of observational studies or designed experiments is used in many scientific areas. The data gathered may be discrete or continuous, depending on the area of application. For example, a chemical engineer may be interested in conducting an experiment that will lead to conditions where yield is maximized. Here, of course, the yield may be in percent or grams/pound, measured on a continuum. On the other hand, a toxicologist conducting a combination drug experiment may encounter data that are binary in nature (i.e., the patient either responds or does not).
Great distinctions are made between discrete and continuous data in the probability theory that allow us to draw statistical inferences. Often applications of statistical inference are found when the data are count data. For example, an engineer may be interested in studying the number of radioactive particles passing through a counter in, say, 1 millisecond. Personnel responsible for the efficiency of a port facility may be interested in the properties of the number of oil tankers arriving each day at a certain port city. In Chapter 5, several distinct scenarios, leading to different ways of handling data, are discussed for situations with count data.
Special attention even at this early stage of the textbook should be paid to some details associated with binary data. Applications requiring statistical analysis of binary data are voluminous. Often the measure that is used in the analysis is the sample proportion. Obviously the binary situation involves two categories. If there are n units involved in the data and x is defined as the number that fall into category 1, then n − x fall into category 2. Thus, x/n is the sample proportion in category 1, and 1 − x/n is the sample proportion in category 2. In the biomedical application, 50 patients may represent the sample units, and if 20 out of 50 experienced an improvement in a stomach ailment (common to all 50) after all were given the drug, then 20/50 = 0.4 is the sample proportion for which

the drug was a success and 1 − 0.4 = 0.6 is the sample proportion for which the drug was not successful. Actually the basic numerical measurement for binary data is generally denoted by either 0 or 1. For example, in our medical example, a successful result is denoted by a 1 and a nonsuccess a 0. As a result, the sample proportion is actually a sample mean of the ones and zeros. For the successful category,
$$\frac{x_1 + x_2 + \cdots + x_{50}}{50} = \frac{1 + 1 + 0 + \cdots + 0 + 1}{50} = \frac{20}{50} = 0.4.$$
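Because the sample proportion is just a sample mean of zeros and ones, the computation takes one line; a small Python illustration of the medical example above:

responses = [1] * 20 + [0] * 30   # 20 successes and 30 nonsuccesses among the 50 patients
p_hat = sum(responses) / len(responses)
print(p_hat, 1 - p_hat)           # 0.4 0.6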
What Kinds of Problems Are Solved in Binary Data Situations?
The kinds of problems facing scientists and engineers dealing in binary data are
not a great deal unlike those seen where continuous measurements are of interest.
However, different techniques are used since the statistical properties of sample
proportions are quite different from those of the sample means that result from
averages taken from continuous populations. Consider the example data in Exercise 1.6 on page 13. The statistical problem underlying this illustration focuses
on whether an intervention, say, an increase in curing temperature, will alter the
population mean tensile strength associated with the silicone rubber process. On
the other hand, in a quality control area, suppose an automobile tire manufacturer
reports that a shipment of 5000 tires selected randomly from the process results
in 100 of them showing blemishes. Here the sample proportion is 100/5000 = 0.02.
Following a change in the process designed to reduce blemishes, a second sample of
5000 is taken and 90 tires are blemished. The sample proportion has been reduced
to 90/5000 = 0.018. The question arises, “Is the decrease in the sample proportion
from 0.02 to 0.018 substantial enough to suggest a real improvement in the pop- ulation proportion?” Both of these illustrations require the use of the statistical properties of sample averages—one from samples from a continuous population, and the other from samples from a discrete (binary) population. In both cases, the sample mean is an estimate of a population parameter, a population mean in the first illustration (i.e., mean tensile strength), and a population proportion in the second case (i.e., proportion of blemished tires in the population). So here we have sample estimates used to draw scientific conclusions regarding population parameters. As we indicated in Section 1.3, this is the general theme in many practical problems using statistical inference.
1.6 Statistical Modeling, Scientific Inspection, and Graphical Diagnostics
Often the end result of a statistical analysis is the estimation of parameters of a postulated model. This is natural for scientists and engineers since they often deal in modeling. A statistical model is not deterministic but, rather, must entail some probabilistic aspects. A model form is often the foundation of assumptions that are made by the analyst. For example, in Example 1.2 the scientist may wish to draw some level of distinction between the nitrogen and no-nitrogen populations through the sample information. The analysis may require a certain model for

the data, for example, that the two samples come from normal or Gaussian distributions. See Chapter 6 for a discussion of the normal distribution.
Obviously, the user of statistical methods cannot generate sufficient information or experimental data to characterize the population totally. But sets of data are often used to learn about certain properties of the population. Scientists and engineers are accustomed to dealing with data sets. The importance of characterizing or summarizing the nature of collections of data should be obvious. Often a summary of a collection of data via a graphical display can provide insight regarding the system from which the data were taken. For instance, in Sections 1.1 and 1.3, we have shown dot plots.
In this section, the role of sampling and the display of data for enhancement of statistical inference is explored in detail. We merely introduce some simple but often effective displays that complement the study of statistical populations.
Scatter Plot

At times the model postulated may take on a somewhat complicated form. Consider, for example, a textile manufacturer who designs an experiment where cloth specimens that contain various percentages of cotton are produced. Consider the data in Table 1.3.
Table 1.3: Tensile Strength

Cotton Percentage   Tensile Strength
15                  7, 7, 9, 8, 10
20                  19, 20, 21, 20, 22
25                  21, 21, 17, 19, 20
30                  8, 7, 8, 9, 10
Five cloth specimens are manufactured for each of the four cotton percentages. In this case, both the model for the experiment and the type of analysis used should take into account the goal of the experiment and important input from the textile scientist. Some simple graphics can shed important light on the clear distinction between the samples. See Figure 1.5; the sample means and variability are depicted nicely in the scatter plot. One possible goal of this experiment is simply to determine which cotton percentages are truly distinct from the others. In other words, as in the case of the nitrogen/no-nitrogen data, for which cotton percentages are there clear distinctions between the populations or, more specifically, between the population means? In this case, perhaps a reasonable model is that each sample comes from a normal distribution. Here the goal is very much like that of the nitrogen/no-nitrogen data except that more samples are involved. The formalism of the analysis involves notions of hypothesis testing discussed in Chapter 10. Incidentally, this formality is perhaps not necessary in light of the diagnostic plot. But does this describe the real goal of the experiment and hence the proper approach to data analysis? It is likely that the scientist anticipates the existence of a maximum population mean tensile strength in the range of cotton concentration in the experiment. Here the analysis of the data should revolve

around a different type of model, one that postulates a type of structure relating the population mean tensile strength to the cotton concentration. In other words, a model may be written
$$\mu_{t,c} = \beta_0 + \beta_1 C + \beta_2 C^2,$$
where μt,c is the population mean tensile strength, which varies with the amount of cotton in the product C. The implication of this model is that for a fixed cotton level, there is a population of tensile strength measurements and the population mean is μt,c. This type of model, called a regression model, is discussed in Chapters 11 and 12. The functional form is chosen by the scientist. At times the data analysis may suggest that the model be changed. Then the data analyst “entertains” a model that may be altered after some analysis is done. The use of an empirical model is accompanied by estimation theory, where β0, β1, and β2 are estimated by the data. Further, statistical inference can then be used to determine model adequacy.
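To make the regression idea concrete, the following sketch (our illustration, not an analysis from the text) fits the quadratic model above to the Table 1.3 data by least squares using NumPy:

import numpy as np

C = np.repeat([15.0, 20.0, 25.0, 30.0], 5)   # cotton percentage for each of the 20 specimens
y = np.array([7, 7, 9, 8, 10,                # tensile strengths from Table 1.3
              19, 20, 21, 20, 22,
              21, 21, 17, 19, 20,
              8, 7, 8, 9, 10], dtype=float)

b2, b1, b0 = np.polyfit(C, y, deg=2)         # coefficients returned highest degree first
print(b0, b1, b2)
# With b2 < 0 the fitted parabola has a maximum at C = -b1/(2*b2), an
# estimate of the cotton percentage giving maximum mean tensile strength.
print(-b1 / (2 * b2))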
Figure 1.5: Scatter plot of tensile strength and cotton percentages.
Two points become evident from the two data illustrations here: (1) The type of model used to describe the data often depends on the goal of the experiment; and (2) the structure of the model should take advantage of nonstatistical scientific input. A selection of a model represents a fundamental assumption upon which the resulting statistical inference is based. It will become apparent throughout the book how important graphics can be. Often, plots can illustrate information that allows the results of the formal statistical inference to be better communicated to the scientist or engineer. At times, plots or exploratory data analysis can teach the analyst something not retrieved from the formal analysis. Almost any formal analysis requires assumptions that evolve from the model of the data. Graphics can nicely highlight violation of assumptions that would otherwise go unnoticed. Throughout the book, graphics are used extensively to supplement formal data analysis. The following sections reveal some graphical tools that are useful in exploratory or descriptive data analysis.

Stem-and-Leaf Plot
Statistical data, generated in large masses, can be very useful for studying the behavior of the distribution if presented in a combined tabular and graphic display called a stem-and-leaf plot.
To illustrate the construction of a stem-and-leaf plot, consider the data of Table 1.4, which specifies the “life” of 40 similar car batteries recorded to the nearest tenth of a year. The batteries are guaranteed to last 3 years. First, split each observation into two parts consisting of a stem and a leaf such that the stem represents the digit preceding the decimal and the leaf corresponds to the decimal part of the number. In other words, for the number 3.7, the digit 3 is designated the stem and the digit 7 is the leaf. The four stems 1, 2, 3, and 4 for our data are listed vertically on the left side in Table 1.5; the leaves are recorded on the right side opposite the appropriate stem value. Thus, the leaf 6 of the number 1.6 is recorded opposite the stem 1; the leaf 5 of the number 2.5 is recorded opposite the stem 2; and so forth. The number of leaves recorded opposite each stem is summarized under the frequency column.
Table 1.4: Car Battery Life
2.2 4.1 3.5 4.5 3.2 3.7 3.0 2.6 3.4 1.6 3.1 3.3 3.8 3.1 4.7 3.7 2.5 4.3 3.4 3.6 2.9 3.3 3.9 3.1 3.3 3.1 3.7 4.4 3.2 4.1 1.9 3.4 4.7 3.8 3.2 2.6 3.9 3.0 4.2 3.5
Table 1.5: Stem-and-Leaf Plot of Battery Life
Stem Leaf Frequency
1 69 2
2 25669 5
3 0011112223334445567778899 25
4 11234577 8
The stem-and-leaf plot of Table 1.5 contains only four stems and consequently does not provide an adequate picture of the distribution. To remedy this problem, we need to increase the number of stems in our plot. One simple way to accomplish this is to write each stem value twice and then record the leaves 0, 1, 2, 3, and 4 opposite the appropriate stem value where it appears for the first time, and the leaves 5, 6, 7, 8, and 9 opposite this same stem value where it appears for the second time. This modified double-stem-and-leaf plot is illustrated in Table 1.6, where the stems corresponding to leaves 0 through 4 have been coded by the symbol ⋆ and the stems corresponding to leaves 5 through 9 by the symbol ·.
In any given problem, we must decide on the appropriate stem values. This decision is made somewhat arbitrarily, although we are guided by the size of our sample. Usually, we choose between 5 and 20 stems. The smaller the number of data available, the smaller is our choice for the number of stems. For example, if

the data consist of numbers from 1 to 21 representing the number of people in a cafeteria line on 40 randomly selected workdays and we choose a double-stem-and- leaf plot, the stems will be 0⋆, 0·, 1⋆, 1·, and 2⋆ so that the smallest observation 1 has stem 0⋆ and leaf 1, the number 18 has stem 1· and leaf 8, and the largest observation 21 has stem 2⋆ and leaf 1. On the other hand, if the data consist of numbers from $18,800 to $19,600 representing the best possible deals on 100 new automobiles from a certain dealership and we choose a single-stem-and-leaf plot, the stems will be 188, 189, 190, . . . , 196 and the leaves will now each contain two digits. A car that sold for $19,385 would have a stem value of 193 and the two-digit leaf 85. Multiple-digit leaves belonging to the same stem are usually separated by commas in the stem-and-leaf plot. Decimal points in the data are generally ignored when all the digits to the right of the decimal represent the leaf. Such was the case in Tables 1.5 and 1.6. However, if the data consist of numbers ranging from 21.8 to 74.9, we might choose the digits 2, 3, 4, 5, 6, and 7 as our stems so that a number such as 48.3 would have a stem value of 4 and a leaf of 8.3.
Table 1.6: Double-Stem-and-Leaf Plot of Battery Life
Stem Leaf Frequency
1·   69                  2
2⋆   2                   1
2·   5669                4
3⋆   001111222333444     15
3·   5567778899          10
4⋆   11234               5
4·   577                 3
The stem-and-leaf plot represents an effective way to summarize data. Another way is through the use of the frequency distribution, where the data, grouped into different classes or intervals, can be constructed by counting the leaves belonging to each stem and noting that each stem defines a class interval. In Table 1.5, the stem 1 with 2 leaves defines the interval 1.0–1.9 containing 2 observations; the stem 2 with 5 leaves defines the interval 2.0–2.9 containing 5 observations; the stem 3 with 25 leaves defines the interval 3.0–3.9 with 25 observations; and the stem 4 with 8 leaves defines the interval 4.0–4.9 containing 8 observations. For the double-stem-and-leaf plot of Table 1.6, the stems define the seven class intervals 1.5–1.9, 2.0–2.4, 2.5–2.9, 3.0–3.4, 3.5–3.9, 4.0–4.4, and 4.5–4.9 with frequencies 2, 1, 4, 15, 10, 5, and 3, respectively.
Dividing each class frequency by the total number of observations, we obtain the proportion of the set of observations in each of the classes. A table listing relative frequencies is called a relative frequency distribution. The relative frequency distribution for the data of Table 1.4, showing the midpoint of each class interval, is given in Table 1.7.
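The construction just described is mechanical enough to automate. This Python sketch (ours, not the text's) rebuilds the stem-and-leaf plot of Table 1.5, with each stem's class frequency and relative frequency, from the battery life data of Table 1.4:

from collections import defaultdict

life = [2.2, 4.1, 3.5, 4.5, 3.2, 3.7, 3.0, 2.6, 3.4, 1.6,
        3.1, 3.3, 3.8, 3.1, 4.7, 3.7, 2.5, 4.3, 3.4, 3.6,
        2.9, 3.3, 3.9, 3.1, 3.3, 3.1, 3.7, 4.4, 3.2, 4.1,
        1.9, 3.4, 4.7, 3.8, 3.2, 2.6, 3.9, 3.0, 4.2, 3.5]

stems = defaultdict(list)
for x in life:
    stem, leaf = divmod(round(10 * x), 10)   # e.g., 3.7 -> stem 3, leaf 7
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in sorted(stems[stem]))
    print(stem, "|", leaves, "  frequency:", len(leaves),
          "  relative frequency:", len(leaves) / len(life))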
Histogram

The information provided by a relative frequency distribution in tabular form is easier to grasp if presented graphically. Using the midpoint of each interval and the corresponding relative frequency, we construct a relative frequency histogram (Figure 1.6).

Table 1.7: Relative Frequency Distribution of Battery Life

Class       Class      Frequency,   Relative
Interval    Midpoint   f            Frequency
1.5–1.9     1.7        2            0.050
2.0–2.4     2.2        1            0.025
2.5–2.9     2.7        4            0.100
3.0–3.4     3.2        15           0.375
3.5–3.9     3.7        10           0.250
4.0–4.4     4.2        5            0.125
4.5–4.9     4.7        3            0.075

Figure 1.6: Relative frequency histogram (relative frequency plotted against battery life in years).
Many continuous frequency distributions can be represented graphically by the characteristic bell-shaped curve of Figure 1.7. Graphical tools such as what we see in Figures 1.6 and 1.7 aid in the characterization of the nature of the population. In Chapters 5 and 6 we discuss a property of the population called its distribution. While a more rigorous definition of a distribution or probability distribution will be given later in the text, at this point one can view it as what would be seen in Figure 1.7 in the limit as the size of the sample becomes larger.
A distribution is said to be symmetric if it can be folded along a vertical axis so that the two sides coincide. A distribution that lacks symmetry with respect to a vertical axis is said to be skewed. The distribution illustrated in Figure 1.8(a) is said to be skewed to the right since it has a long right tail and a much shorter left tail. In Figure 1.8(b) we see that the distribution is symmetric, while in Figure 1.8(c) it is skewed to the left.
If we rotate a stem-and-leaf plot counterclockwise through an angle of 90◦, we observe that the resulting columns of leaves form a picture that is similar to a histogram. Consequently, if our primary purpose in looking at the data is to determine the general shape or form of the distribution, it will seldom be necessary to construct a relative frequency histogram.
Figure 1.7: Estimating frequency distribution.
Figure 1.8: Skewness of data: (a) skewed to the right; (b) symmetric; (c) skewed to the left.
Box-and-Whisker Plot or Box Plot
Another display that is helpful for reflecting properties of a sample is the box-and-whisker plot. This plot encloses the interquartile range of the data in a box that has the median displayed within. The interquartile range has as its extremes the 75th percentile (upper quartile) and the 25th percentile (lower quartile). In addition to the box, “whiskers” extend, showing extreme observations in the sample. For reasonably large samples, the display shows center of location, variability, and the degree of asymmetry.
In addition, a variation called a box plot can provide the viewer with information regarding which observations may be outliers. Outliers are observations that are considered to be unusually far from the bulk of the data. There are many statistical tests that are designed to detect outliers. Technically, one may view an outlier as being an observation that represents a “rare event” (there is a small probability of obtaining a value that far from the bulk of the data). The concept of outliers resurfaces in Chapter 12 in the context of regression analysis.

The visual information in the box-and-whisker plot or box plot is not intended to be a formal test for outliers. Rather, it is viewed as a diagnostic tool. While the determination of which observations are outliers varies with the type of software that is used, one common procedure is to use a multiple of the interquartile range. For example, if the distance from the box exceeds 1.5 times the interquartile range (in either direction), the observation may be labeled an outlier.
Example 1.5: Nicotine content was measured in a random sample of 40 cigarettes. The data are displayed in Table 1.8.
Table 1.8: Nicotine Data for Example 1.5
1.09 1.92 2.31 1.79 2.28 1.74 1.47 1.97 0.85 1.24 1.58 2.03 1.70 2.17 2.55 2.11 1.86 1.90 1.68 1.51 1.64 0.72 1.69 1.85 1.82 1.79 2.46 1.88 2.08 1.67 1.37 1.93 1.40 1.64 2.09 1.75 1.63 2.37 1.75 1.69
Figure 1.9: Box-and-whisker plot for Example 1.5.
Figure 1.9 shows the box-and-whisker plot of the data, depicting the observations 0.72 and 0.85 as mild outliers in the lower tail, whereas the observation 2.55 is a mild outlier in the upper tail. In this example, the interquartile range is 0.365, and 1.5 times the interquartile range is 0.5475. Figure 1.10, on the other hand, provides a stem-and-leaf plot.
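The outlier labeling just described can be checked directly. In the Python sketch below (ours; quartile conventions differ across software, and here the quartiles are taken as the medians of the lower and upper halves of the ordered sample), the 1.5 times interquartile range rule recovers exactly the three mild outliers noted above:

nicotine = sorted([
    1.09, 1.92, 2.31, 1.79, 2.28, 1.74, 1.47, 1.97,
    0.85, 1.24, 1.58, 2.03, 1.70, 2.17, 2.55, 2.11,
    1.86, 1.90, 1.68, 1.51, 1.64, 0.72, 1.69, 1.85,
    1.82, 1.79, 2.46, 1.88, 2.08, 1.67, 1.37, 1.93,
    1.40, 1.64, 2.09, 1.75, 1.63, 2.37, 1.75, 1.69])

def median_sorted(s):                  # median of an already sorted list
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

half = len(nicotine) // 2
q1 = median_sorted(nicotine[:half])    # lower quartile
q3 = median_sorted(nicotine[half:])    # upper quartile
iqr = q3 - q1
outliers = [x for x in nicotine if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
print(round(iqr, 4), outliers)         # 0.365 [0.72, 0.85, 2.55]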
Example 1.6: Consider the data in Table 1.9, consisting of 30 samples measuring the thickness of paint can “ears” (see the work by Hogg and Ledolter, 1992, in the Bibliography). Figure 1.11 depicts a box-and-whisker plot for this asymmetric set of data. Notice that the left block is considerably larger than the block on the right. The median is 35. The lower quartile is 31, while the upper quartile is 36. Notice also that the extreme observation on the right is farther away from the box than the extreme observation on the left. There are no outliers in this data set.

The decimal point is 1 digit(s) to the left of the |
 7 | 2
8|5
9|
10 | 9
11 |
12 | 4
13 | 7
14 | 07
15 | 18
16 | 3447899
17 | 045599
18 | 2568
19 | 0237
20 | 389
21 | 17
22 | 8
23 | 17
24 | 6
25 | 5
Figure 1.10: Stem-and-leaf plot for the nicotine data.
Table 1.9: Data for Example 1.6
Sample   Measurements        Sample   Measurements
1        29 36 39 34 34      16       35 30 35 29 37
2        29 29 28 32 31      17       40 31 38 35 31
3        34 34 39 38 37      18       35 36 30 33 32
4        35 37 33 38 41      19       35 34 35 30 36
5        30 29 31 38 29      20       35 35 31 38 36
6        34 31 37 39 36      21       32 36 36 32 36
7        30 35 33 40 36      22       36 37 32 34 34
8        28 28 31 34 30      23       29 34 33 37 35
9        32 36 38 38 35      24       36 36 35 37 37
10       35 30 37 35 31      25       36 30 35 33 31
11       35 30 35 38 35      26       35 30 29 38 35
12       38 34 35 35 31      27       35 36 30 34 36
13       34 35 33 30 34      28       35 30 36 29 35
14       40 35 34 33 35      29       38 36 35 31 31
15       34 35 38 35 30      30       30 34 40 28 30
There are additional ways that box-and-whisker plots and other graphical displays can aid the analyst. Multiple samples can be compared graphically. Plots of data can suggest relationships between variables. Graphs can aid in the detection of anomalies or outlying observations in samples.
There are other types of graphical tools and plots that are used. These are discussed in Chapter 8 after we introduce additional theoretical details.

Figure 1.11: Box-and-whisker plot for thickness of paint can “ears.”
Other Distinguishing Features of a Sample
There are features of the distribution or sample other than measures of center of location and variability that further define its nature. For example, while the median divides the data (or distribution) into two parts, there are other measures that divide parts or pieces of the distribution that can be very useful. Separation is made into four parts by quartiles, with the third quartile separating the upper quarter of the data from the rest, the second quartile being the median, and the first quartile separating the lower quarter of the data from the rest. The distribution can be even more finely divided by computing percentiles of the distribution. These quantities give the analyst a sense of the so-called tails of the distribution (i.e., values that are relatively extreme, either small or large). For example, the 95th percentile separates the highest 5% from the bottom 95%. Similar definitions prevail for extremes on the lower side or lower tail of the distribution. The 1st percentile separates the bottom 1% from the rest of the distribution. The concept of percentiles will play a major role in much that will be covered in future chapters.
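Percentile conventions also vary from one software package to another. The following nearest-rank Python sketch (ours, with hypothetical data) returns the smallest observation with at least p percent of the sample at or below it:

import math

def percentile(xs, p):
    s = sorted(xs)
    k = max(1, math.ceil(p / 100 * len(s)))   # nearest-rank position
    return s[k - 1]

scores = [66, 75, 84, 59, 91, 72, 80, 88, 63, 95]   # hypothetical data
print(percentile(scores, 95))   # 95: separates the highest 5% from the rest
print(percentile(scores, 25))   # 66: the first quartile under this convention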
1.7 General Types of Statistical Studies: Designed Experiment, Observational Study, and Retrospective Study
In the foregoing sections we have emphasized the notion of sampling from a population and the use of statistical methods to learn or perhaps affirm important information about the population. The information sought and learned through the use of these statistical methods can often be influential in decision making and problem solving in many important scientific and engineering areas. As an illustration, Example 1.3 describes a simple experiment in which the results may provide an aid in determining the kinds of conditions under which it is not advisable to use a particular aluminum alloy that may have a dangerous vulnerability to corrosion. The results may be of use not only to those who produce the alloy, but also to the customer who may consider using it. This illustration, as well as many more that appear in Chapters 13 through 15, highlights the concept of designing or controlling experimental conditions (combinations of coating conditions and humidity) of

interest to learn about some characteristic or measurement (level of corrosion) that results from these conditions. Statistical methods that make use of measures of central tendency in the corrosion measure, as well as measures of variability, are employed. As the reader will observe later in the text, these methods often lead to a statistical model like that discussed in Section 1.6. In this case, the model may be used to estimate (or predict) the corrosion measure as a function of humidity and the type of coating employed. Again, in developing this kind of model, descriptive statistics that highlight central tendency and variability become very useful.
The information supplied in Example 1.3 illustrates nicely the types of engineering questions asked and answered by the use of statistical methods that are employed through a designed experiment and presented in this text. They are
(i) What is the nature of the impact of relative humidity on the corrosion of the aluminum alloy within the range of relative humidity in this experiment?
(ii) Does the chemical corrosion coating reduce corrosion levels and can the effect be quantified in some fashion?
(iii) Is there interaction between coating type and relative humidity that impacts their influence on corrosion of the alloy? If so, what is its interpretation?
What Is Interaction?
The importance of questions (i) and (ii) should be clear to the reader, as they deal with issues important to both producers and users of the alloy. But what about question (iii)? The concept of interaction will be discussed at length in Chapters 14 and 15. Consider the plot in Figure 1.3. This is an illustration of the detection of interaction between two factors in a simple designed experiment. Note that the lines connecting the sample means are not parallel. Parallelism would have indicated that the effect (seen as a result of the slope of the lines) of relative humidity is the same, namely a negative effect, for both an uncoated condition and the chemical corrosion coating. Recall that the negative slope implies that corrosion becomes more pronounced as humidity rises. Lack of parallelism implies an interaction between coating type and relative humidity. The nearly “flat” line for the corrosion coating as opposed to a steeper slope for the uncoated condition suggests that not only is the chemical corrosion coating beneficial (note the displacement between the lines), but the presence of the coating renders the effect of humidity negligible. Clearly all these questions are very important to the effect of the two individual factors and to the interpretation of the interaction, if it is present.
Statistical models are extremely useful in answering questions such as those listed in (i), (ii), and (iii), where the data come from a designed experiment. But one does not always have the luxury or resources to employ a designed experiment. For example, there are many instances in which the conditions of interest to the scientist or engineer cannot be implemented simply because the important factors cannot be controlled. In Example 1.3, the relative humidity and coating type (or lack of coating) are quite easy to control. This of course is the defining feature of a designed experiment. In many fields, factors that need to be studied cannot be controlled for any one of various reasons. Tight control as in Example 1.3 allows the analyst to be confident that any differences found (for example, in corrosion levels)

are due to the factors under control. As a second illustration, consider Exercise 1.6 on page 13. Suppose in this case 24 specimens of silicone rubber are selected and 12 assigned to each of the curing temperature levels. The temperatures are controlled carefully, and thus this is an example of a designed experiment with a single factor being curing temperature. Differences found in the mean tensile strength would be assumed to be attributed to the different curing temperatures.
What If Factors Are Not Controlled?
Suppose there are no factors controlled and no random assignment of fixed treatments to experimental units and yet there is a need to glean information from a data set. As an illustration, consider a study in which interest centers around the relationship between blood cholesterol levels and the amount of sodium measured in the blood. A group of individuals were monitored over time for both blood cholesterol and sodium. Certainly some useful information can be gathered from such a data set. However, it should be clear that there certainly can be no strict control of blood sodium levels. Ideally, the subjects should be divided randomly into two groups, with one group assigned a specific high level of blood sodium and the other a specific low level of blood sodium. Obviously this cannot be done. Clearly changes in cholesterol can be experienced because of changes in one of a number of other factors that were not controlled. This kind of study, without factor control, is called an observational study. Much of the time it involves a situation in which subjects are observed across time.
Biological and biomedical studies are often by necessity observational studies. However, observational studies are not confined to those areas. For example, consider a study that is designed to determine the influence of ambient temperature on the electric power consumed by a chemical plant. Clearly, levels of ambient temperature cannot be controlled, and thus the data structure can only be a monitoring of the data from the plant over time.
It should be apparent that the striking difference between a well-designed experiment and observational studies is the difficulty in determination of true cause and effect with the latter. Also, differences found in the fundamental response (e.g., corrosion levels, blood cholesterol, plant electric power consumption) may be due to other underlying factors that were not controlled. Ideally, in a designed experiment the nuisance factors would be equalized via the randomization process. Certainly changes in blood cholesterol could be due to fat intake, exercise activity, and so on. Electric power consumption could be affected by the amount of product produced or even the purity of the product produced.
Another often ignored disadvantage of an observational study when compared to carefully designed experiments is that, unlike the latter, observational studies are at the mercy of nature, environmental or other uncontrolled circumstances that impact the ranges of factors of interest. For example, in the biomedical study regarding the influence of blood sodium levels on blood cholesterol, it is possible that there is indeed a strong influence but the particular data set used did not involve enough observed variation in sodium levels because of the nature of the subjects chosen. Of course, in a designed experiment, the analyst chooses and controls ranges of factors.
A third type of statistical study which can be very useful but has clear disadvantages when compared to a designed experiment is a retrospective study. This type of study uses strictly historical data, data taken over a specific period of time. One obvious advantage of retrospective data is that there is reduced cost in collecting the data. However, as one might expect, there are clear disadvantages.
(i) Validity and reliability of historical data are often in doubt.
(ii) If time is an important aspect of the structure of the data, there may be data missing.
(iii) There may be errors in collection of the data that are not known.
(iv) Again, as in the case of observational data, there is no control on the ranges of the measured variables (the factors in a study). Indeed, the ranges found in historical data may not be relevant for current studies.
In Section 1.6, some attention was given to modeling of relationships among variables. We introduced the notion of regression analysis, which is covered in Chapters 11 and 12 and is illustrated as a form of data analysis for designed experiments discussed in Chapters 14 and 15. In Section 1.6, a model relating population mean tensile strength of cloth to percentages of cotton was used for illustration, where 20 specimens of cloth represented the experimental units. In that case, the data came from a simple designed experiment where the individual cotton percentages were selected by the scientist.
Often both observational data and retrospective data are used for the purpose of observing relationships among variables through model-building procedures discussed in Chapters 11 and 12. While the advantages of designed experiments certainly apply when the goal is statistical model building, there are many areas in which designing of experiments is not possible. Thus, observational or historical data must be used. We refer here to a historical data set that is found in Exercise 12.5 on page 450. The goal is to build a model that will result in an equation or relationship that relates monthly electric power consumed to average ambient temperature x1, the number of days in the month x2, the average product purity x3, and the tons of product produced x4. The data are the past year's historical data.
Exercises
1.13 A manufacturer of electronic components is interested in determining the lifetime of a certain type of battery. A sample, in hours of life, is as follows:

123, 116, 122, 110, 175, 126, 125, 111, 118, 117.

(a) Find the sample mean and median.
(b) What feature in this data set is responsible for the substantial difference between the two?

1.14 A tire manufacturer wants to determine the inner diameter of a certain grade of tire. Ideally, the diameter would be 570 mm. The data are as follows:

572, 572, 573, 568, 569, 575, 565, 570.

(a) Find the sample mean and median.
(b) Find the sample variance, standard deviation, and range.
(c) Using the calculated statistics in parts (a) and (b), can you comment on the quality of the tires?

1.15 Five independent coin tosses result in HHHHH. It turns out that if the coin is fair the probability of this outcome is (1/2)^5 = 0.03125. Does this produce strong evidence that the coin is not fair? Comment and use the concept of P-value discussed in Section 1.1.
1.16 Show that the n pieces of information in Σ_{i=1}^{n} (x_i − x̄)² are not independent; that is, show that

Σ_{i=1}^{n} (x_i − x̄) = 0.

1.17 A study of the effects of smoking on sleep patterns is conducted. The measure observed is the time, in minutes, that it takes to fall asleep. These data are obtained:

Smokers: 69.3 56.0 22.1 47.6 53.2 48.1 52.7 34.4 60.2 43.8 23.2 13.8
Nonsmokers: 28.6 25.1 26.4 34.9 29.8 28.4 38.5 30.2 30.6 31.8 41.6 21.1 36.0 37.9 13.9

(a) Find the sample mean for each group.
(b) Find the sample standard deviation for each group.
(c) Make a dot plot of the data sets A and B on the same line.
(d) Comment on what kind of impact smoking appears to have on the time required to fall asleep.

1.18 The following scores represent the final examination grades for an elementary statistics course:

23 60 79 32 57 74 52 70 82
36 80 77 81 95 41 65 92 85
55 76 52 10 64 75 78 25 80
98 81 67 41 71 83 54 64 72
88 62 74 43 60 78 89 76 84
48 84 90 15 79 34 67 17 82
69 74 63 80 85 61

(a) Construct a stem-and-leaf plot for the examination grades in which the stems are 1, 2, 3, . . . , 9.
(b) Construct a relative frequency histogram, draw an estimate of the graph of the distribution, and discuss the skewness of the distribution.
(c) Compute the sample mean, sample median, and sample standard deviation.

1.19 The following data represent the length of life in years, measured to the nearest tenth, of 30 similar fuel pumps:

2.0 3.0 0.3 3.3 1.3 0.4 0.2 6.0 5.5 6.5 0.2 2.3 1.5 4.0 5.9 1.8 4.7 0.7 4.5 0.3 1.5 0.5 2.5 5.0 1.0 6.0 5.6 6.0 1.2 0.2

(a) Construct a stem-and-leaf plot for the life in years of the fuel pumps, using the digit to the left of the decimal point as the stem for each observation.
(b) Set up a relative frequency distribution.
(c) Compute the sample mean, sample range, and sample standard deviation.

1.20 The following data represent the length of life, in seconds, of 50 fruit flies subject to a new spray in a controlled laboratory experiment:

17 20 10 9 23 13 12 19 18 24
12 14 6 9 13 6 7 10 13 7
16 18 8 13 3 32 9 7 10 11
13 7 18 7 10 4 27 19 16 8
7 10 5 14 15 10 9 6 7 15

(a) Construct a double-stem-and-leaf plot for the life span of the fruit flies using the stems 0⋆, 0·, 1⋆, 1·, 2⋆, 2·, and 3⋆ such that stems coded by the symbols ⋆ and · are associated, respectively, with leaves 0 through 4 and 5 through 9.
(b) Set up a relative frequency distribution.
(c) Construct a relative frequency histogram.
(d) Find the median.

1.21 The lengths of power failures, in minutes, are recorded in the following table.

22 18 135 15 90 78 69 98 102
83 55 28 121 120 13 22 124 112
70 66 74 89 103 24 21 112 21
40 98 87 132 115 21 28 43 37
50 96 118 158 74 78 83 93 95

(a) Find the sample mean and sample median of the power-failure times.
(b) Find the sample standard deviation of the power-failure times.

1.22 The following data are the measures of the diameters of 36 rivet heads in 1/100 of an inch.

6.72 6.77 6.82 6.70 6.78 6.70 6.62 6.75 6.66
6.66 6.64 6.76 6.73 6.80 6.72 6.76 6.76 6.68
6.66 6.62 6.72 6.76 6.70 6.78 6.76 6.67 6.70
6.72 6.74 6.81 6.79 6.78 6.66 6.76 6.76 6.72

(a) Compute the sample mean and sample standard deviation.
(b) Construct a relative frequency histogram of the data.
(c) Comment on whether or not there is any clear indication that the sample came from a population that has a bell-shaped distribution.
1.23 The hydrocarbon emissions at idling speed in parts per million (ppm) for automobiles of 1980 and 1990 model years are given for 20 randomly selected cars.
1980 models:
141 359 247 940 882 494 306 210 105 880 200 223 188 940 241 190 300 435 241 380
1990 models:
140 160 20 20 223 60 20 95 360 70 220 400 217 58 235 380 200 175 85 65
(a) Construct a dot plot as in Figure 1.1.
(b) Compute the sample means for the two years and superimpose the two means on the plots.
(c) Comment on what the dot plot indicates regarding whether or not the population emissions changed from 1980 to 1990. Use the concept of variability in your comments.
1.24 The following are historical data on staff salaries (dollars per pupil) for 30 schools sampled in the eastern part of the United States in the early 1970s.

3.79 2.99 2.77 2.91 3.10 1.84 2.52 3.22
2.45 2.14 2.67 2.52 2.71 2.75 3.57 3.85
3.36 2.05 2.89 2.83 3.13 2.44 2.10 3.71
3.14 3.54 2.37 2.68 3.51 3.37

(a) Compute the sample mean and sample standard deviation.
(b) Construct a relative frequency histogram of the data.
(c) Construct a stem-and-leaf display of the data.

1.25 The following data set is related to that in Exercise 1.24. It gives the percentages of the families that are in the upper income level, for the same individual schools in the same order as in Exercise 1.24.

72.2 31.9 26.5 29.1 27.3 8.6 22.3 26.5 20.4 12.8 25.1 19.2 24.1 58.2 68.1 89.2 55.1 9.4 14.5 13.9 20.7 17.9 8.5 55.4 38.1 54.2 21.5 26.2 59.1 43.3

(a) Calculate the sample mean.
(b) Calculate the sample median.
(c) Construct a relative frequency histogram of the data.
(d) Compute the 10% trimmed mean. Compare with the results in (a) and (b) and comment.

1.26 Suppose it is of interest to use the data sets in Exercises 1.24 and 1.25 to derive a model that would predict staff salaries as a function of percentage of families in a high income level for current school systems. Comment on any disadvantage in carrying out this type of analysis.

1.27 A study is done to determine the influence of the wear, y, of a bearing as a function of the load, x, on the bearing. A designed experiment is used for this study. Three levels of load were used, 700 lb, 1000 lb, and 1300 lb. Four specimens were used at each level, and the sample means were, respectively, 210, 325, and 375.

(a) Plot average wear against load.
(b) From the plot in (a), does it appear as if a relationship exists between wear and load?
(c) Suppose we look at the individual wear values for each of the four specimens at each load level (see the data that follow). Plot the wear results for all specimens against the three load values.
(d) From your plot in (c), does it appear as if a clear relationship exists? If your answer is different from that in (b), explain why.

x      700   1000   1300
y1     145    250    150
y2     105    195    180
y3     260    375    420
y4     330    480    750
ȳ      210    325    375
1.28 Many manufacturing companies in the United States and abroad use molded parts as components of a process. Shrinkage is often a major problem. Thus, a molded die for a part is built larger than nominal size to allow for part shrinkage. In an injection molding study it is known that the shrinkage is influenced by many factors, among which are the injection velocity in ft/sec and mold temperature in °C. The following two data sets show the results of a designed experiment in which injection velocity was held at two levels (low and high) and mold temperature was held constant at a low level. The shrinkage is measured in cm × 10^4.
Shrinkage values at low injection velocity:
72.68 72.62 72.58 72.48 73.07 72.55 72.42 72.84 72.58 72.92

Shrinkage values at high injection velocity:

71.62 71.68 71.74 71.48 71.55 71.52 71.71 71.56 71.70 71.50
(a) Construct a dot plot of both data sets on the same graph. Indicate on the plot both shrinkage means, that for low injection velocity and high injection velocity.
(b) Based on the graphical results in (a), using the location of the two means and your sense of variability, what do you conclude regarding the effect of injection velocity on shrinkage at low mold temperature?
1.29 Use the data in Exercise 1.24 to construct a box plot.
1.30 Below are the lifetimes, in hours, of fifty 40-watt, 110-volt internally frosted incandescent lamps, taken from forced life tests:
919 1196 785 1126 936 918
1156 920 948 1067 1092 1162
1170 929 950 905 972 1035
1045 855 1195 1195 1340 1122
938 970 1237 956 1102 1157
978 832 1009 1157 1151 1009
765 958 902 1022 1333 811
1217 1085 896 958 1311 1037
702 923

Construct a box plot for these data.

1.31 Consider the situation of Exercise 1.28. But now use the following data set, in which shrinkage is measured once again at low injection velocity and high injection velocity. However, this time the mold temperature is raised to a high level and held constant.

Shrinkage values at low injection velocity:

76.20 76.09 75.98 76.15 76.17 75.94 76.12 76.18 76.25 75.82

Shrinkage values at high injection velocity:

93.25 93.19 92.87 93.29 93.37 92.98 93.47 93.75 93.89 91.62

(a) As in Exercise 1.28, construct a dot plot with both data sets on the same graph and identify both means (i.e., mean shrinkage for low injection velocity and for high injection velocity).
(b) As in Exercise 1.28, comment on the influence of injection velocity on shrinkage for high mold temperature. Take into account the position of the two means and the variability around each mean.
(c) Compare your conclusion in (b) with that in (b) of Exercise 1.28 in which mold temperature was held at a low level. Would you say that there is an interaction between injection velocity and mold temperature? Explain.

1.32 Use the results of Exercises 1.28 and 1.31 to create a plot that illustrates the interaction evident from the data. Use the plot in Figure 1.3 in Example 1.3 as a guide. Could the type of information found in Exercises 1.28 and 1.31 have been found in an observational study in which there was no control on injection velocity and mold temperature by the analyst? Explain why or why not.

1.33 Group Project: Collect the shoe size of everyone in the class. Use the sample means and variances and the types of plots presented in this chapter to summarize any features that draw a distinction between the distributions of shoe sizes for males and females. Do the same for the height of everyone in the class.
Chapter 2 Probability
2.1 Sample Space
In the study of statistics, we are concerned basically with the presentation and interpretation of chance outcomes that occur in a planned study or scientific investigation. For example, we may record the number of accidents that occur monthly at the intersection of Driftwood Lane and Royal Oak Drive, hoping to justify the installation of a traffic light; we might classify items coming off an assembly line as "defective" or "nondefective"; or we may be interested in the volume of gas released in a chemical reaction when the concentration of an acid is varied. Hence, the statistician is often dealing with either numerical data, representing counts or measurements, or categorical data, which can be classified according to some criterion.
We shall refer to any recording of information, whether it be numerical or categorical, as an observation. Thus, the numbers 2, 0, 1, and 2, representing the number of accidents that occurred for each month from January through April during the past year at the intersection of Driftwood Lane and Royal Oak Drive, constitute a set of observations. Similarly, the categorical data N, D, N, N, and D, representing the items found to be defective or nondefective when five items are inspected, are recorded as observations.
Statisticians use the word experiment to describe any process that generates a set of data. A simple example of a statistical experiment is the tossing of a coin. In this experiment, there are only two possible outcomes, heads or tails. Another experiment might be the launching of a missile and observing of its velocity at specified times. The opinions of voters concerning a new sales tax can also be considered as observations of an experiment. We are particularly interested in the observations obtained by repeating the experiment several times. In most cases, the outcomes will depend on chance and, therefore, cannot be predicted with certainty. If a chemist runs an analysis several times under the same conditions, he or she will obtain different measurements, indicating an element of chance in the experimental procedure. Even when a coin is tossed repeatedly, we cannot be certain that a given toss will result in a head. However, we know the entire set of possibilities for each toss.
Given the discussion in Section 1.7, we should deal with the breadth of the term experiment. Three types of statistical studies were reviewed, and several examples were given of each. In each of the three cases, designed experiments, observational studies, and retrospective studies, the end result was a set of data that of course is
subject to uncertainty. Though only one of these has the word experiment in its description, the process of generating the data or the process of observing the data is part of an experiment. The corrosion study discussed in Section 1.2 certainly involves an experiment, with measures of corrosion representing the data. The example given in Section 1.7 in which blood cholesterol and sodium were observed on a group of individuals represented an observational study (as opposed to a designed experiment), and yet the process generated data and the outcome is subject to uncertainty. Thus, it is an experiment. A third example in Section 1.7 represented a retrospective study in which historical data on monthly electric power consumption and average monthly ambient temperature were observed. Even though the data may have been in the files for decades, the process is still referred to as an experiment.

Definition 2.1: The set of all possible outcomes of a statistical experiment is called the sample space and is represented by the symbol S.

Each outcome in a sample space is called an element or a member of the sample space, or simply a sample point. If the sample space has a finite number of elements, we may list the members separated by commas and enclosed in braces. Thus, the sample space S, of possible outcomes when a coin is flipped, may be written

S = {H, T},

where H and T correspond to heads and tails, respectively.
Example 2.1: Consider the experiment of tossing a die. If we are interested in the number that shows on the top face, the sample space is
S1 = {1,2,3,4,5,6}.
If we are interested only in whether the number is even or odd, the sample space
is simply
S2 = {even, odd}.
Example 2.1 illustrates the fact that more than one sample space can be used to describe the outcomes of an experiment. In this case, S1 provides more information than S2. If we know which element in S1 occurs, we can tell which outcome in S2 occurs; however, a knowledge of what happens in S2 is of little help in determining which element in S1 occurs. In general, it is desirable to use the sample space that gives the most information concerning the outcomes of the experiment. In some experiments, it is helpful to list the elements of the sample space systematically by means of a tree diagram.
Example 2.2: An experiment consists of flipping a coin and then flipping it a second time if a head occurs. If a tail occurs on the first flip, then a die is tossed once. To list the elements of the sample space providing the most information, we construct the tree diagram of Figure 2.1. The various paths along the branches of the tree give the distinct sample points. Starting with the top left branch and moving to the right along the first path, we get the sample point HH, indicating the possibility that heads occurs on two successive flips of the coin. Likewise, the sample point T3 indicates the possibility that the coin will show a tail followed by a 3 on the toss of the die. By proceeding along all paths, we see that the sample space is
S = {HH, HT, T1, T2, T3, T4, T5, T6}.
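Sample spaces of this kind are easy to enumerate by mirroring the branching of the tree in a program. The following Python sketch is our own illustration (the function name sample_points is not from the text):

```python
# Enumerate the sample space of Example 2.2: flip a coin; if heads,
# flip the coin again; if tails, toss a die once.
def sample_points():
    points = []
    for first in ("H", "T"):
        if first == "H":
            for second in ("H", "T"):      # second coin flip
                points.append(first + second)
        else:
            for face in range(1, 7):       # single die toss
                points.append(first + str(face))
    return points

print(sample_points())
# ['HH', 'HT', 'T1', 'T2', 'T3', 'T4', 'T5', 'T6']
```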
Figure 2.1: Tree diagram for Example 2.2.
Many of the concepts in this chapter are best illustrated with examples involving the use of dice and cards. These are particularly important applications to use early in the learning process, to facilitate the flow of these new concepts into scientific and engineering examples such as the following.
Example 2.3: Suppose that three items are selected at random from a manufacturing process. Each item is inspected and classified defective, D, or nondefective, N. To list the elements of the sample space providing the most information, we construct the tree diagram of Figure 2.2. Now, the various paths along the branches of the tree give the distinct sample points. Starting with the first path, we get the sample point DDD, indicating the possibility that all three items inspected are defective. As we proceed along the other paths, we see that the sample space is
S = {DDD, DDN, DND, DNN, NDD, NDN, NND, NNN}.
Sample spaces with a large or infinite number of sample points are best described by a statement or rule method. For example, if the possible outcomes of an experiment are the set of cities in the world with a population over 1 million, our sample space is written
S = {x | x is a city with a population over 1 million},
which reads “S is the set of all x such that x is a city with a population over 1 million.” The vertical bar is read “such that.” Similarly, if S is the set of all points (x,y) on the boundary or the interior of a circle of radius 2 with center at the origin, we write the rule
S = {(x, y) | x2 + y2 ≤ 4}.
Figure 2.2: Tree diagram for Example 2.3.
Whether we describe the sample space by the rule method or by listing the elements will depend on the specific problem at hand. The rule method has practical advantages, particularly for many experiments where listing becomes a tedious chore.
Consider the situation of Example 2.3 in which items from a manufacturing process are either D, defective, or N, nondefective. There are many important statistical procedures called sampling plans that determine whether or not a "lot" of items is considered satisfactory. One such plan involves sampling until k defectives are observed. Suppose the experiment is to sample items randomly until one defective item is observed. The sample space for this case is
S = {D,ND,NND,NNND,…}.
2.2 Events

For any given experiment, we may be interested in the occurrence of certain events rather than in the occurrence of a specific element in the sample space. For instance, we may be interested in the event A that the outcome when a die is tossed is divisible by 3. This will occur if the outcome is an element of the subset A = {3, 6} of the sample space S1 in Example 2.1. As a further illustration, we may be interested in the event B that the number of defectives is greater than 1 in Example 2.3. This will occur if the outcome is an element of the subset
B = {DDN,DND,NDD,DDD}
of the sample space S.
To each event we assign a collection of sample points, which constitute a subset
of the sample space. That subset represents all of the elements for which the event is true.
Definition 2.2: An event is a subset of a sample space.

Example 2.4: Given the sample space S = {t | t ≥ 0}, where t is the life in years of a certain electronic component, then the event A that the component fails before the end of the fifth year is the subset A = {t | 0 ≤ t < 5}.

It is conceivable that an event may be a subset that includes the entire sample space S or a subset of S called the null set and denoted by the symbol φ, which contains no elements at all. For instance, if we let A be the event of detecting a microscopic organism by the naked eye in a biological experiment, then A = φ. Also, if B = {x | x is an even factor of 7}, then B must be the null set, since the only possible factors of 7 are the odd numbers 1 and 7.

Consider an experiment where the smoking habits of the employees of a manufacturing firm are recorded. A possible sample space might classify an individual as a nonsmoker, a light smoker, a moderate smoker, or a heavy smoker. Let the subset of smokers be some event. Then all the nonsmokers correspond to a different event, also a subset of S, which is called the complement of the set of smokers.

Definition 2.3: The complement of an event A with respect to S is the subset of all elements of S that are not in A. We denote the complement of A by the symbol A′.

Example 2.5: Let R be the event that a red card is selected from an ordinary deck of 52 playing cards, and let S be the entire deck. Then R′ is the event that the card selected from the deck is not a red card but a black card.

Example 2.6: Consider the sample space S = {book, cell phone, mp3, paper, stationery, laptop}. Let A = {book, stationery, laptop, paper}. Then the complement of A is A′ = {cell phone, mp3}.

We now consider certain operations with events that will result in the formation of new events. These new events will be subsets of the same sample space as the given events. Suppose that A and B are two events associated with an experiment. In other words, A and B are subsets of the same sample space S. For example, in the tossing of a die we might let A be the event that an even number occurs and B the event that a number greater than 3 shows. Then the subsets A = {2, 4, 6} and B = {4, 5, 6} are subsets of the same sample space S = {1, 2, 3, 4, 5, 6}. Note that both A and B will occur on a given toss if the outcome is an element of the subset {4, 6}, which is just the intersection of A and B.

Definition 2.4: The intersection of two events A and B, denoted by the symbol A ∩ B, is the event containing all elements that are common to A and B.

Example 2.7: Let E be the event that a person selected at random in a classroom is majoring in engineering, and let F be the event that the person is female. Then E ∩ F is the event of all female engineering students in the classroom.

Example 2.8: Let V = {a, e, i, o, u} and C = {l, r, s, t}; then it follows that V ∩ C = φ. That is, V and C have no elements in common and, therefore, cannot both simultaneously occur.

For certain statistical experiments it is by no means unusual to define two events, A and B, that cannot both occur simultaneously. The events A and B are then said to be mutually exclusive. Stated more formally, we have the following definition:

Definition 2.5: Two events A and B are mutually exclusive, or disjoint, if A ∩ B = φ, that is, if A and B have no elements in common.

Example 2.9: A cable television company offers programs on eight different channels, three of which are affiliated with ABC, two with NBC, and one with CBS. The other two are an educational channel and the ESPN sports channel. Suppose that a person subscribing to this service turns on a television set without first selecting the channel. Let A be the event that the program belongs to the NBC network and B the event that it belongs to the CBS network. Since a television program cannot belong to more than one network, the events A and B have no programs in common. Therefore, the intersection A ∩ B contains no programs, and consequently the events A and B are mutually exclusive.

Often one is interested in the occurrence of at least one of two events associated with an experiment. Thus, in the die-tossing experiment, if A = {2, 4, 6} and B = {4, 5, 6}, we might be interested in either A or B occurring or both A and B occurring. Such an event, called the union of A and B, will occur if the outcome is an element of the subset {2, 4, 5, 6}.

Definition 2.6: The union of the two events A and B, denoted by the symbol A ∪ B, is the event containing all the elements that belong to A or B or both.

Example 2.10: Let A = {a, b, c} and B = {b, c, d, e}; then A ∪ B = {a, b, c, d, e}.

Example 2.11: Let P be the event that an employee selected at random from an oil drilling company smokes cigarettes. Let Q be the event that the employee selected drinks alcoholic beverages. Then the event P ∪ Q is the set of all employees who either drink or smoke or do both.

Example 2.12: If M = {x | 3 < x < 9} and N = {y | 5 < y < 12}, then M ∪ N = {z | 3 < z < 12}.
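These operations on events map directly onto Python's built-in set type. The sketch below is our own illustration (not part of the text), using the die-tossing events A and B defined above:

```python
# Events as Python sets over the die-tossing sample space above.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}                 # an even number occurs
B = {4, 5, 6}                 # a number greater than 3 shows

print(A & B)                  # intersection A ∩ B -> {4, 6}
print(A | B)                  # union A ∪ B -> {2, 4, 5, 6}

A_complement = S - A          # complement A′ -> {1, 3, 5}
print(A_complement)
print(A & A_complement == set())   # A and A′ are mutually exclusive -> True
```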
Definition 2.10: The conditional probability of B, given A, denoted by P(B|A), is defined by

P(B|A) = P(A ∩ B)/P(A), provided P(A) > 0.

As an illustration, suppose that our sample space S is the population of adults in a small town, classified according to gender and employment status:

          Employed   Unemployed   Total
Male         460          40       500
Female       140         260       400
Total        600         300       900
M: a man is chosen,
E: the one chosen is employed.
Using the reduced sample space E, we find that

P(M|E) = 460/600 = 23/30.
Let n(A) denote the number of elements in any set A. Using this notation, since each adult has an equal chance of being selected, we can write
P(M|E) = n(E ∩ M)/n(E) = [n(E ∩ M)/n(S)] / [n(E)/n(S)] = P(E ∩ M)/P(E),
where P(E ∩ M) and P(E) are found from the original sample space S. To verify this result, note that

P(E) = 600/900 = 2/3 and P(E ∩ M) = 460/900 = 23/45.

Hence,

P(M|E) = (23/45)/(2/3) = 23/30,

as before.
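Both routes to P(M|E) can be checked with a few lines of Python; the sketch is ours and the counts dictionary is simply an illustrative encoding of the table above:

```python
# Conditional probability from the employment table:
# (gender, employment) -> number of adults.
counts = {
    ("M", "E"): 460, ("M", "U"): 40,
    ("F", "E"): 140, ("F", "U"): 260,
}
total = sum(counts.values())                                        # 900

p_E = sum(v for (g, e), v in counts.items() if e == "E") / total    # 600/900
p_E_and_M = counts[("M", "E")] / total                              # 460/900

print(p_E_and_M / p_E)     # P(M|E) = 23/30 = 0.7666...
print(460 / 600)           # same value from the reduced sample space E
```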
Example 2.34: The probability that a regularly scheduled flight departs on time is P(D) = 0.83; the probability that it arrives on time is P (A) = 0.82; and the probability that it departs and arrives on time is P (D ∩ A) = 0.78. Find the probability that a plane
(a) arrives on time, given that it departed on time, and (b) departed on time, given
that it has arrived on time.
Solution: Using Definition 2.10, we have the following.
(a) The probability that a plane arrives on time, given that it departed on time, is
P(A|D) = P(D ∩ A)/P(D) = 0.78/0.83 = 0.94.
(b) The probability that a plane departed on time, given that it has arrived on time, is
P(D|A) = P(D ∩ A)/P(A) = 0.78/0.82 = 0.95.
The notion of conditional probability provides the capability of reevaluating the idea of probability of an event in light of additional information, that is, when it is known that another event has occurred. The probability P(A|B) is an updating of P(A) based on the knowledge that event B has occurred. In Example 2.34, it is important to know the probability that the flight arrives on time. One is given the information that the flight did not depart on time. Armed with this additional information, one can calculate the more pertinent probability P(A|D′), that is, the probability that it arrives on time, given that it did not depart on time. In many situations, the conclusions drawn from observing the more important conditional probability change the picture entirely. In this example, the computation of P(A|D′) is
P(A|D′) = P(A ∩ D′)/P(D′) = (0.82 − 0.78)/0.17 = 0.24.
As a result, the probability of an on-time arrival is diminished severely in the presence of the additional information.
Example 2.35: The concept of conditional probability has countless uses in both industrial and biomedical applications. Consider an industrial process in the textile industry in which strips of a particular type of cloth are being produced. These strips can be defective in two ways, length and nature of texture. For the case of the latter, the process of identification is very complicated. It is known from historical information on the process that 10% of strips fail the length test, 5% fail the texture test, and only 0.8% fail both tests. If a strip is selected randomly from the process and a quick measurement identifies it as failing the length test, what is the probability that it is texture defective?
Solution: Consider the events
L: length defective, T: texture defective.
Given that the strip is length defective, the probability that this strip is texture defective is given by
P(T|L) = P(T ∩ L)/P(L) = 0.008/0.1 = 0.08.
Thus, knowing the conditional probability provides considerably more information than merely knowing P(T).
Independent Events
In the die-tossing experiment discussed on page 62, we note that P(B|A) = 2/5 whereas P (B) = 1/3. That is, P (B|A) ̸= P (B), indicating that B depends on A. Now consider an experiment in which 2 cards are drawn in succession from an ordinary deck, with replacement. The events are defined as
A: the first card is an ace,
B: the second card is a spade.
Since the first card is replaced, our sample space for both the first and the second
draw consists of 52 cards, containing 4 aces and 13 spades. Hence, P(B|A)= 13 = 1 and P(B)= 13 = 1.
That is, P(B|A) = P(B). When this is true, the events A and B are said to be independent.
Although conditional probability allows for an alteration of the probability of an event in the light of additional material, it also enables us to understand better the very important concept of independence or, in the present context, independent events. In the airport illustration in Example 2.34, P(A|D) differs from P(A). This suggests that the occurrence of D influenced A, and this is certainly expected in this illustration. However, consider the situation where we have events A and B and
P(A|B) = P(A).
In other words, the occurrence of B had no impact on the odds of occurrence of A. Here the occurrence of A is independent of the occurrence of B. The importance of the concept of independence cannot be overemphasized. It plays a vital role in material in virtually all chapters in this book and in all areas of applied statistics.
The condition P(B|A) = P(B) implies that P(A|B) = P(A), and conversely. For the card-drawing experiments, where we showed that P(B|A) = P(B) = 1/4, we also can see that P(A|B) = P(A) = 1/13.
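A Monte Carlo sketch in Python (ours, with an arbitrary rank/suit encoding) illustrates the card-drawing case: with replacement, the estimated P(B|A) and the estimated P(B) agree.

```python
import random

# With replacement: A = first card is an ace, B = second card is a spade.
# Independence predicts P(B|A) = P(B) = 13/52 = 0.25.
random.seed(1)
deck = [(rank, suit) for rank in range(13) for suit in "SHDC"]  # 52 cards

trials = 200_000
a_count = b_count = ab_count = 0
for _ in range(trials):
    first = random.choice(deck)      # full deck available both times,
    second = random.choice(deck)     # since the first card is replaced
    a = first[0] == 0                # rank 0 plays the role of the ace
    b = second[1] == "S"             # "S" marks the spades
    a_count += a
    b_count += b
    ab_count += a and b

print(b_count / trials)              # estimate of P(B), near 0.25
print(ab_count / a_count)            # estimate of P(B|A), also near 0.25
```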
Definition 2.11: Two events A and B are independent if and only if

P(B|A) = P(B) or P(A|B) = P(A),

assuming the existence of the conditional probabilities. Otherwise, A and B are dependent.
The Product Rule, or the Multiplicative Rule
Multiplying the formula in Definition 2.10 by P(A), we obtain the following important multiplicative rule (or product rule), which enables us to calculate
the probability that two events will both occur.

Theorem 2.10: If in an experiment the events A and B can both occur, then

P(A ∩ B) = P(A)P(B|A), provided P(A) > 0.

Thus, the probability that both A and B occur is equal to the probability that A occurs multiplied by the conditional probability that B occurs, given that A occurs. Since the events A ∩ B and B ∩ A are equivalent, it follows from Theorem 2.10 that we can also write

P(A ∩ B) = P(B ∩ A) = P(B)P(A|B).

In other words, it does not matter which event is referred to as A and which event is referred to as B.

Example 2.36: Suppose that we have a fuse box containing 20 fuses, of which 5 are defective. If 2 fuses are selected at random and removed from the box in succession without replacing the first, what is the probability that both fuses are defective?

Solution: We shall let A be the event that the first fuse is defective and B the event that the second fuse is defective; then we interpret A ∩ B as the event that A occurs and then B occurs after A has occurred. The probability of first removing a defective fuse is 1/4; then the probability of removing a second defective fuse from the remaining 4 is 4/19. Hence,

P(A ∩ B) = (1/4)(4/19) = 1/19.
Example 2.37: One bag contains 4 white balls and 3 black balls, and a second bag contains 3 white balls and 5 black balls. One ball is drawn from the first bag and placed unseen in the second bag. What is the probability that a ball now drawn from the second bag is black?

Solution: Let B1, B2, and W1 represent, respectively, the drawing of a black ball from bag 1, a black ball from bag 2, and a white ball from bag 1. We are interested in the union of the mutually exclusive events B1 ∩ B2 and W1 ∩ B2. The various possibilities and their probabilities are illustrated in Figure 2.8. Now

P[(B1 ∩ B2) or (W1 ∩ B2)] = P(B1 ∩ B2) + P(W1 ∩ B2)
= P(B1)P(B2|B1) + P(W1)P(B2|W1)
= (3/7)(6/9) + (4/7)(5/9) = 38/63.
If, in Example 2.36, the first fuse is replaced and the fuses thoroughly rear- ranged before the second is removed, then the probability of a defective fuse on the second selection is still 1/4; that is, P(B|A) = P(B) and the events A and B are independent. When this is true, we can substitute P(B) for P(B|A) in Theorem 2.10 to obtain the following special multiplicative rule.
Figure 2.8: Tree diagram for Example 2.37.
Theorem 2.11: Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

Therefore, to obtain the probability that two independent events will both occur, we simply find the product of their individual probabilities.
Example 2.38: A small town has one fire engine and one ambulance available for emergencies. The probability that the fire engine is available when needed is 0.98, and the probability that the ambulance is available when called is 0.92. In the event of an injury resulting from a burning building, find the probability that both the ambulance and the fire engine will be available, assuming they operate independently.
Solution: Let A and B represent the respective events that the fire engine and the ambulance are available. Then
P (A ∩ B) = P (A)P (B) = (0.98)(0.92) = 0.9016.
Example 2.39: An electrical system consists of four components as illustrated in Figure 2.9. The system works if components A and B work and either of the components C or D works. The reliability (probability of working) of each component is also shown in Figure 2.9. Find the probability that (a) the entire system works and (b) the component C does not work, given that the entire system works. Assume that the four components work independently.
Solution: In this configuration of the system, A, B, and the subsystem C and D constitute a serial circuit system, whereas the subsystem C and D itself is a parallel circuit system.
(a) Clearly the probability that the entire system works can be calculated as
follows:
P[A ∩ B ∩ (C ∪ D)] = P(A)P(B)P(C ∪ D) = P(A)P(B)[1 − P(C′ ∩ D′)]
= P(A)P(B)[1 − P(C′)P(D′)]
= (0.9)(0.9)[1 − (1 − 0.8)(1 − 0.8)] = 0.7776.

The equalities above hold because of the independence among the four components.
(b) To calculate the conditional probability in this case, notice that
P = P(the system works but C does not work) / P(the system works)
= P(A ∩ B ∩ C′ ∩ D) / P(the system works) = (0.9)(0.9)(1 − 0.8)(0.8) / 0.7776 = 0.1667.
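The series/parallel computation translates directly into a few lines of Python; the sketch below is ours, with illustrative variable names:

```python
# Reliability of the system in Example 2.39: A and B in series
# with the parallel pair (C, D); all components independent.
pA = pB = 0.9
pC = pD = 0.8

p_parallel = 1 - (1 - pC) * (1 - pD)       # P(C ∪ D) for independent parts
p_system = pA * pB * p_parallel
print(p_system)                            # 0.7776

# P(C fails | system works) = P(A ∩ B ∩ C′ ∩ D) / P(system works)
print(pA * pB * (1 - pC) * pD / p_system)  # 0.1666...
```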
Figure 2.9: An electrical system for Example 2.39.
The multiplicative rule can be extended to more than two-event situations.
Theorem 2.12: If, in an experiment, the events A1, A2, . . . , Ak can occur, then

P(A1 ∩ A2 ∩ ··· ∩ Ak) = P(A1)P(A2|A1)P(A3|A1 ∩ A2) ··· P(Ak|A1 ∩ A2 ∩ ··· ∩ Ak−1).

If the events A1, A2, . . . , Ak are independent, then

P(A1 ∩ A2 ∩ ··· ∩ Ak) = P(A1)P(A2) ··· P(Ak).
Example 2.40: Three cards are drawn in succession, without replacement, from an ordinary deck of playing cards. Find the probability that the event A1 ∩ A2 ∩ A3 occurs, where A1 is the event that the first card is a red ace, A2 is the event that the second card is a 10 or a jack, and A3 is the event that the third card is greater than 3 but less than 7.
Solution: First we define the events
A1: the first card is a red ace,
A2: the second card is a 10 or a jack,
A3: the third card is greater than 3 but less than 7. Now
P(A1) = 2/52, P(A2|A1) = 8/51, P(A3|A1 ∩ A2) = 12/50,

and hence, by Theorem 2.12,

P(A1 ∩ A2 ∩ A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2) = (2/52)(8/51)(12/50) = 8/5525.
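Python's fractions module reproduces this chain-rule product exactly; an illustrative sketch, not from the text:

```python
from fractions import Fraction

# Example 2.40: three draws without replacement, multiplied by
# the chain rule of Theorem 2.12.
p = Fraction(2, 52) * Fraction(8, 51) * Fraction(12, 50)
print(p)          # 8/5525
```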
The property of independence stated in Theorem 2.11 can be extended to deal with more than two events. Consider, for example, the case of three events A, B, and C. It is not sufficient to only have that P(A ∩ B ∩ C) = P(A)P(B)P(C) as a definition of independence among the three. Suppose A = B and C = φ, the null set. Although A∩B∩C = φ, which results in P(A∩B∩C) = 0 = P(A)P(B)P(C), events A and B are not independent. Hence, we have the following definition.
Definition 2.12: A collection of events A = {A1, . . . , An} are mutually independent if for any subset of A, Ai1, . . . , Aik, for k ≤ n, we have

P(Ai1 ∩ ··· ∩ Aik) = P(Ai1) ··· P(Aik).
Exercises
2.73 If R is the event that a convict committed armed robbery and D is the event that the convict pushed dope, state in words what probabilities are expressed by
(a) P(R|D); (b) P(D′|R); (c) P(R′|D′).
2.74 A class in advanced physics is composed of 10 juniors, 30 seniors, and 10 graduate students. The final grades show that 3 of the juniors, 10 of the seniors, and 5 of the graduate students received an A for the course. If a student is chosen at random from this class and is found to have earned an A, what is the probability that he or she is a senior?
2.75 A random sample of 200 adults are classified below by sex and their level of education attained.

Education    Male   Female
Elementary    38      45
Secondary     28      50
College       22      17

If a person is picked at random from this group, find the probability that
(a) the person is a male, given that the person has a secondary education;
(b) the person does not have a college degree, given that the person is a female.

2.76 In an experiment to study the relationship of hypertension and smoking habits, the following data are collected for 180 individuals:

      Nonsmokers   Moderate Smokers   Heavy Smokers
H         21              36               30
NH        48              26               19

where H and NH in the table stand for Hypertension and Nonhypertension, respectively. If one of these individuals is selected at random, find the probability that the person is
(a) experiencing hypertension, given that the person is a heavy smoker;
(b) a nonsmoker, given that the person is experiencing no hypertension.

2.77 In the senior year of a high school graduating class of 100 students, 42 studied mathematics, 68 studied psychology, 54 studied history, 22 studied both mathematics and history, 25 studied both mathematics and psychology, 7 studied history but neither mathematics nor psychology, 10 studied all three subjects, and 8 did not take any of the three. Randomly select a student from the class and find the probabilities of the following events.
(a) A person enrolled in psychology takes all three sub- jects.
(b) A person not taking psychology is taking both his- tory and mathematics.
2.78 A manufacturer of a flu vaccine is concerned about the quality of its flu serum. Batches of serum are processed by three different departments having rejection rates of 0.10, 0.08, and 0.12, respectively. The inspections by the three departments are sequential and independent.
(a) What is the probability that a batch of serum survives the first departmental inspection but is rejected by the second department?
(b) What is the probability that a batch of serum is rejected by the third department?
2.79 In USA Today (Sept. 5, 1996), the results of a survey involving the use of sleepwear while traveling were listed as follows:

            Male    Female   Total
Underwear   0.220   0.024    0.244
Nightgown   0.002   0.180    0.182
Nothing     0.160   0.018    0.178
Pajamas     0.102   0.073    0.175
T-shirt     0.046   0.088    0.134
Other       0.084   0.003    0.087

(a) What is the probability that a traveler is a female who sleeps in the nude?
(b) What is the probability that a traveler is male?
(c) Assuming the traveler is male, what is the probability that he sleeps in pajamas?
(d) What is the probability that a traveler is male if the traveler sleeps in pajamas or a T-shirt?

2.80 The probability that an automobile being filled with gasoline also needs an oil change is 0.25; the probability that it needs a new oil filter is 0.40; and the probability that both the oil and the filter need changing is 0.14.
(a) If the oil has to be changed, what is the probability that a new oil filter is needed?
(b) If a new oil filter is needed, what is the probability that the oil has to be changed?

2.81 The probability that a married man watches a certain television show is 0.4, and the probability that a married woman watches the show is 0.5. The probability that a man watches the show, given that his wife does, is 0.7. Find the probability that
(a) a married couple watches the show;
(b) a wife watches the show, given that her husband does;
(c) at least one member of a married couple will watch the show.

2.82 For married couples living in a certain suburb, the probability that the husband will vote on a bond referendum is 0.21, the probability that the wife will vote on the referendum is 0.28, and the probability that both the husband and the wife will vote is 0.15. What is the probability that
(a) at least one member of a married couple will vote?
(b) a wife will vote, given that her husband will vote?
(c) a husband will vote, given that his wife will not vote?

2.83 The probability that a vehicle entering the Luray Caverns has Canadian license plates is 0.12; the probability that it is a camper is 0.28; and the probability that it is a camper with Canadian license plates is 0.09. What is the probability that
(a) a camper entering the Luray Caverns has Canadian license plates?
(b) a vehicle with Canadian license plates entering the Luray Caverns is a camper?
(c) a vehicle entering the Luray Caverns does not have Canadian plates or is not a camper?

2.84 The probability that the head of a household is home when a telemarketing representative calls is 0.4. Given that the head of the house is home, the probability that goods will be bought from the company is 0.3. Find the probability that the head of the house is home and goods are bought from the company.

2.85 The probability that a doctor correctly diagnoses a particular illness is 0.7. Given that the doctor makes an incorrect diagnosis, the probability that the patient files a lawsuit is 0.9. What is the probability that the doctor makes an incorrect diagnosis and the patient sues?

2.86 In 1970, 11% of Americans completed four years of college; 43% of them were women. In 1990, 22% of Americans completed four years of college; 53% of them were women (Time, Jan. 19, 1996).
(a) Given that a person completed four years of college in 1970, what is the probability that the person was a woman?
(b) What is the probability that a woman finished four years of college in 1990?
(c) What is the probability that a man had not finished college in 1990?
2.87 A real estate agent has 8 master keys to open several new homes. Only 1 master key will open any given house. If 40% of these homes are usually left unlocked, what is the probability that the real estate agent can get into a specific home if the agent selects 3 master keys at random before leaving the office?

2.88 Before the distribution of certain statistical software, every fourth compact disk (CD) is tested for accuracy. The testing process consists of running four independent programs and checking the results. The failure rates for the four testing programs are, respectively, 0.01, 0.03, 0.02, and 0.01.
(a) What is the probability that a CD was tested and failed any test?
(b) Given that a CD was tested, what is the probability that it failed program 2 or 3?
(c) In a sample of 100, how many CDs would you expect to be rejected?
(d) Given that a CD was defective, what is the probability that it was tested?

2.89 A town has two fire engines operating independently. The probability that a specific engine is available when needed is 0.96.
(a) What is the probability that neither is available when needed?
(b) What is the probability that a fire engine is available when needed?

2.90 Pollution of the rivers in the United States has been a problem for many years. Consider the following events:
A: the river is polluted,
B: a sample of water tested detects pollution,
C: fishing is permitted.
Assume P(A) = 0.3, P(B|A) = 0.75, P(B|A′) = 0.20, P(C|A ∩ B) = 0.20, P(C|A′ ∩ B) = 0.15, P(C|A ∩ B′) = 0.80, and P(C|A′ ∩ B′) = 0.90.
(a) Find P(A ∩ B ∩ C).
(b) Find P(B′ ∩ C).
(c) Find P(C).
(d) Find the probability that the river is polluted, given that fishing is permitted and the sample tested did not detect pollution.

2.91 Find the probability of randomly selecting 4 good quarts of milk in succession from a cooler containing 20 quarts of which 5 have spoiled, by using
(a) the first formula of Theorem 2.12 on page 68;
(b) the formulas of Theorem 2.6 and Rule 2.3 on pages 50 and 54, respectively.
2.92 Suppose the diagram of an electrical system is as given in Figure 2.10. What is the probability that the system works? Assume the components fail independently.

2.93 A circuit system is given in Figure 2.11. Assume the components fail independently.
(a) What is the probability that the entire system works?
(b) Given that the system works, what is the probability that the component A is not working?

2.94 In the situation of Exercise 2.93, it is known that the system does not work. What is the probability that the component A also does not work?
Figure 2.10: Diagram for Exercise 2.92.

Figure 2.11: Diagram for Exercise 2.93.
2.7 Bayes' Rule
Bayesian statistics is a collection of tools that is used in a special form of statistical inference which applies in the analysis of experimental data in many practical situations in science and engineering. Bayes’ rule is one of the most important rules in probability theory. It is the foundation of Bayesian inference, which will be discussed in Chapter 18.
Total Probability
Let us now return to the illustration of Section 2.6, where an individual is being selected at random from the adults of a small town to tour the country and publicize the advantages of establishing new industries in the town. Suppose that we are now given the additional information that 36 of those employed and 12 of those unemployed are members of the Rotary Club. We wish to find the probability of the event A that the individual selected is a member of the Rotary Club. Referring to Figure 2.12, we can write A as the union of the two mutually exclusive events E ∩A and E′ ∩A. Hence, A = (E ∩A)∪(E′ ∩A), and by Corollary 2.1 of Theorem 2.7, and then Theorem 2.10, we can write
P(A) = P[(E ∩ A) ∪ (E′ ∩ A)] = P(E ∩ A) + P(E′ ∩ A) = P(E)P(A|E) + P(E′)P(A|E′).
Figure 2.12: Venn diagram for the events A, E, and E′.
The data of Section 2.6, together with the additional data given above for the set
A, enable us to compute
P(E) = 600/900 = 2/3, P(A|E) = 36/600 = 3/50,

and

P(E′) = 1/3, P(A|E′) = 12/300 = 1/25.
If we display these probabilities by means of the tree diagram of Figure 2.13, where the first branch yields the probability P(E)P(A|E) and the second branch yields

2.7 Bayes’ Rule 73
the probability P(E′)P(A|E′), it follows that

P(A) = (2/3)(3/50) + (1/3)(1/25) = 4/75.

Figure 2.13: Tree diagram for the data on page 63, using additional information on page 72.
A generalization of the foregoing illustration to the case where the sample space is partitioned into k subsets is covered by the following theorem, sometimes called the theorem of total probability or the rule of elimination.
Theorem 2.13: If the events B1, B2, . . . , Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for i = 1, 2, . . . , k, then for any event A of S,

P(A) = Σ_{i=1}^{k} P(Bi ∩ A) = Σ_{i=1}^{k} P(Bi)P(A|Bi).
Figure 2.14: Partitioning the sample space S.
Proof: Consider the Venn diagram of Figure 2.14. The event A is seen to be the union of
the mutually exclusive events
B1 ∩A, B2 ∩A, …, Bk ∩A;
that is,
A=(B1 ∩A)∪(B2 ∩A)∪···∪(Bk ∩A).
Using Corollary 2.2 of Theorem 2.7 and Theorem 2.10, we have
P(A) = P[(B1 ∩ A) ∪ (B2 ∩ A) ∪ ··· ∪ (Bk ∩ A)]
= P(B1 ∩ A) + P(B2 ∩ A) + ··· + P(Bk ∩ A)
= Σ_{i=1}^{k} P(Bi ∩ A) = Σ_{i=1}^{k} P(Bi)P(A|Bi).
Example 2.41: In a certain assembly plant, three machines, B1, B2, and B3, make 30%, 45%, and 25%, respectively, of the products. It is known from past experience that 2%, 3%, and 2% of the products made by each machine, respectively, are defective. Now, suppose that a finished product is randomly selected. What is the probability that it is defective?
Solution: Consider the following events: A: the product is defective,
B1: the product is made by machine B1, B2: the product is made by machine B2, B3: the product is made by machine B3.
Applying the rule of elimination, we can write
P (A) = P (B1)P (A|B1) + P (B2)P (A|B2) + P (B3)P (A|B3).
Referring to the tree diagram of Figure 2.15, we find that the three branches give the probabilities
P(B1)P(A|B1) = (0.3)(0.02) = 0.006,
P(B2)P(A|B2) = (0.45)(0.03) = 0.0135,
P(B3)P(A|B3) = (0.25)(0.02) = 0.005,

and hence
P (A) = 0.006 + 0.0135 + 0.005 = 0.0245.
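Once the machine shares and conditional defect rates are stored in dictionaries, the rule of elimination is a one-line sum. The Python sketch below is our own illustration of Example 2.41:

```python
# Rule of elimination (total probability) for Example 2.41.
priors = {"B1": 0.30, "B2": 0.45, "B3": 0.25}        # machine shares
defect_rates = {"B1": 0.02, "B2": 0.03, "B3": 0.02}  # P(A | machine)

p_defective = sum(priors[m] * defect_rates[m] for m in priors)
print(p_defective)   # 0.0245
```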
Figure 2.15: Tree diagram for Example 2.41.

Bayes' Rule
Instead of asking for P(A) in Example 2.41, by the rule of elimination, suppose that we now consider the problem of finding the conditional probability P(Bi|A). In other words, suppose that a product was randomly selected and it is defective. What is the probability that this product was made by machine Bi? Questions of this type can be answered by using the following theorem, called Bayes’ rule:
Theorem 2.14: (Bayes' Rule) If the events B1, B2, . . . , Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for i = 1, 2, . . . , k, then for any event A in S such that P(A) ≠ 0,

P(Br|A) = P(Br ∩ A) / Σ_{i=1}^{k} P(Bi ∩ A) = P(Br)P(A|Br) / Σ_{i=1}^{k} P(Bi)P(A|Bi), for r = 1, 2, . . . , k.
Proof: By the definition of conditional probability,

P(Br|A) = P(Br ∩ A)/P(A),

and then using Theorem 2.13 in the denominator, we have

P(Br|A) = P(Br ∩ A) / Σ_{i=1}^{k} P(Bi ∩ A) = P(Br)P(A|Br) / Σ_{i=1}^{k} P(Bi)P(A|Bi),

which completes the proof.
Example 2.42: With reference to Example 2.41, if a product was chosen randomly and found to be defective, what is the probability that it was made by machine B3?
Solution: Using Bayes' rule, we write

P(B3|A) = P(B3)P(A|B3) / [P(B1)P(A|B1) + P(B2)P(A|B2) + P(B3)P(A|B3)],
and then substituting the probabilities calculated in Example 2.41, we have

P(B3|A) = 0.005/(0.006 + 0.0135 + 0.005) = 0.005/0.0245 = 10/49.
In view of the fact that a defective product was selected, this result suggests that it probably was not made by machine B3.
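Normalizing the same branch products gives the posterior probability for every machine at once; the following Python sketch (ours, self-contained) reproduces 10/49 for B3:

```python
# Bayes' rule for Example 2.42: posterior probability of each machine
# given that the selected product is defective.
priors = {"B1": 0.30, "B2": 0.45, "B3": 0.25}
defect_rates = {"B1": 0.02, "B2": 0.03, "B3": 0.02}

p_defective = sum(priors[m] * defect_rates[m] for m in priors)      # 0.0245
posteriors = {m: priors[m] * defect_rates[m] / p_defective for m in priors}
print(posteriors["B3"])   # 0.2040... = 10/49
```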
Example 2.43: A manufacturing firm employs three analytical plans for the design and development of a particular product. For cost reasons, all three are used at varying times. In fact, plans 1, 2, and 3 are used for 30%, 20%, and 50% of the products, respectively. The defect rate is different for the three procedures as follows:
P (D|P1) = 0.01, P (D|P2) = 0.03, P (D|P3) = 0.02,
where P(D|Pj) is the probability of a defective product, given plan j. If a random product was observed and found to be defective, which plan was most likely used and thus responsible?
Solution: From the statement of the problem
P (P1) = 0.30, P (P2) = 0.20, and P (P3) = 0.50,
we must find P (Pj |D) for j = 1, 2, 3. Bayes’ rule (Theorem 2.14) shows
P(P1|D) = P(P1)P(D|P1) / [P(P1)P(D|P1) + P(P2)P(D|P2) + P(P3)P(D|P3)]
= (0.30)(0.01)/[(0.30)(0.01) + (0.20)(0.03) + (0.50)(0.02)] = 0.003/0.019 = 0.158.
Similarly,
P(P2|D) = (0.20)(0.03)/0.019 = 0.316 and P(P3|D) = (0.50)(0.02)/0.019 = 0.526.
The conditional probability of a defect given plan 3 is the largest of the three; thus a randomly selected defective product is most likely the result of the use of plan 3.
Using Bayes’ rule, a statistical methodology called the Bayesian approach has attracted a lot of attention in applications. An introduction to the Bayesian method will be discussed in Chapter 18.
Exercises
2.95 In a certain region of the country it is known from past experience that the probability of selecting an adult over 40 years of age with cancer is 0.05. If the probability of a doctor correctly diagnosing a person with cancer as having the disease is 0.78 and the probability of incorrectly diagnosing a person without cancer as having the disease is 0.06, what is the probability that an adult over 40 years of age is diagnosed as having cancer?
2.96 Police plan to enforce speed limits by using radar traps at four different locations within the city limits. The radar traps at each of the locations L1, L2, L3, and L4 will be operated 40%, 30%, 20%, and 30% of
the time. If a person who is speeding on her way to work has probabilities of 0.2, 0.1, 0.5, and 0.2, respec- tively, of passing through these locations, what is the probability that she will receive a speeding ticket?
2.97 Referring to Exercise 2.95, what is the probability that a person diagnosed as having cancer actually has the disease?
2.98 If the person in Exercise 2.96 received a speed- ing ticket on her way to work, what is the probability that she passed through the radar trap located at L2?
2.99 Suppose that the four inspectors at a film factory are supposed to stamp the expiration date on each package of film at the end of the assembly line. John, who stamps 20% of the packages, fails to stamp the expiration date once in every 200 packages; Tom, who stamps 60% of the packages, fails to stamp the expiration date once in every 100 packages; Jeff, who stamps 15% of the packages, fails to stamp the expiration date once in every 90 packages; and Pat, who stamps 5% of the packages, fails to stamp the expiration date once in every 200 packages. If a customer complains that her package of film does not show the expiration date, what is the probability that it was inspected by John?
2.100 A regional telephone company operates three identical relay stations at different locations. During a one-year period, the number of malfunctions reported by each station and the causes are shown below.

                                       Station A   B   C
Problems with electricity supplied         2       1   1
Computer malfunction                       4       3   2
Malfunctioning electrical equipment        5       4   2
Caused by other human errors               7       7   5

Suppose that a malfunction was reported and it was found to be caused by other human errors. What is the probability that it came from station C?

2.101 A paint-store chain produces and sells latex and semigloss paint. Based on long-range sales, the probability that a customer will purchase latex paint is 0.75. Of those that purchase latex paint, 60% also purchase rollers. But only 30% of semigloss paint buyers purchase rollers. A randomly selected buyer purchases a roller and a can of paint. What is the probability that the paint is latex?

2.102 Denote by A, B, and C the events that a grand prize is behind doors A, B, and C, respectively. Suppose you randomly picked a door, say A. The game host opened a door, say B, and showed there was no prize behind it. Now the host offers you the option of either staying at the door that you picked (A) or switching to the remaining unopened door (C). Use probability to explain whether you should switch or not.

Review Exercises

2.103 A truth serum has the property that 90% of the guilty suspects are properly judged while, of course, 10% of the guilty suspects are improperly found innocent. On the other hand, innocent suspects are misjudged 1% of the time. If the suspect was selected from a group of suspects of which only 5% have ever committed a crime, and the serum indicates that he is guilty, what is the probability that he is innocent?

2.104 An allergist claims that 50% of the patients she tests are allergic to some type of weed. What is the probability that
(a) exactly 3 of her next 4 patients are allergic to weeds?
(b) none of her next 4 patients is allergic to weeds?

2.105 By comparing appropriate regions of Venn diagrams, verify that
(a) (A ∩ B) ∪ (A ∩ B′) = A;
(b) A′ ∩ (B′ ∪ C) = (A′ ∩ B′) ∪ (A′ ∩ C).

2.106 The probabilities that a service station will pump gas into 0, 1, 2, 3, 4, or 5 or more cars during a certain 30-minute period are 0.03, 0.18, 0.24, 0.28, 0.10, and 0.17, respectively. Find the probability that in this 30-minute period
(a) more than 2 cars receive gas;
(b) at most 4 cars receive gas;
(c) 4 or more cars receive gas.

2.107 How many bridge hands are possible containing 4 spades, 6 diamonds, 1 club, and 2 hearts?

2.108 If the probability is 0.1 that a person will make a mistake on his or her state income tax return, find the probability that
(a) four totally unrelated persons each make a mistake;
(b) Mr. Jones and Ms. Clark both make mistakes, and Mr. Roberts and Ms. Williams do not make a mistake.

78
Chapter 2 Probability
2.109 A large industrial firm uses three local motels to provide overnight accommodations for its clients. From past experience it is known that 20% of the clients are assigned rooms at the Ramada Inn, 50% at the Sheraton, and 30% at the Lakeview Motor Lodge. If the plumbing is faulty in 5% of the rooms at the Ra- mada Inn, in 4% of the rooms at the Sheraton, and in 8% of the rooms at the Lakeview Motor Lodge, what is the probability that
(a) a client will be assigned a room with faulty plumbing?
(b) a person with a room having faulty plumbing was assigned accommodations at the Lakeview Motor Lodge?
2.110 The probability that a patient recovers from a delicate heart operation is 0.8. What is the probability that
(a) exactly 2 of the next 3 patients who have this op- eration survive?
(b) all of the next 3 patients who have this operation survive?
2.111 In a certain federal prison, it is known that 2/3 of the inmates are under 25 years of age. It is also known that 3/5 of the inmates are male and that 5/8 of the inmates are female or 25 years of age or older. What is the probability that a prisoner selected at random from this prison is female and at least 25 years old?
2.112 From 4 red, 5 green, and 6 yellow apples, how many selections of 9 apples are possible if 3 of each color are to be selected?
2.113 From a box containing 6 black balls and 4 green balls, 3 balls are drawn in succession, each ball being re- placed in the box before the next draw is made. What is the probability that
(a) all 3 are the same color? (b) each color is represented?
2.114 A shipment of 12 television sets contains 3 de- fective sets. In how many ways can a hotel purchase 5 of these sets and receive at least 2 of the defective sets?
2.115 A certain federal agency employs three con- sulting firms (A, B, and C) with probabilities 0.40, 0.35, and 0.25, respectively. From past experience it is known that the probability of cost overruns for the firms are 0.05, 0.03, and 0.15, respectively. Suppose a cost overrun is experienced by the agency.
(a) What is the probability that the consulting firm involved is company C?
(b) What is the probability that it is company A?
2.116 A manufacturer is studying the effects of cook- ing temperature, cooking time, and type of cooking oil for making potato chips. Three different temperatures, 4 different cooking times, and 3 different oils are to be used.
(a) What is the total number of combinations to be studied?
(b) How many combinations will be used for each type of oil?
(c) Discuss why permutations are not an issue in this exercise.
2.117 Consider the situation in Exercise 2.116, and suppose that the manufacturer can try only two com- binations in a day.
(a) What is the probability that any given set of two runs is chosen?
(b) What is the probability that the highest tempera- ture is used in either of these two combinations?
2.118 A certain form of cancer is known to be found in women over 60 with probability 0.07. A blood test exists for the detection of the disease, but the test is not infallible. In fact, it is known that 10% of the time the test gives a false negative (i.e., the test incorrectly gives a negative result) and 5% of the time the test gives a false positive (i.e., incorrectly gives a positive result). If a woman over 60 is known to have taken the test and received a favorable (i.e., negative) result, what is the probability that she has the disease?
2.119 A producer of a certain type of electronic com- ponent ships to suppliers in lots of twenty. Suppose that 60% of all such lots contain no defective compo- nents, 30% contain one defective component, and 10% contain two defective components. A lot is picked, two components from the lot are randomly selected and tested, and neither is defective.
(a) What is the probability that zero defective compo- nents exist in the lot?
(b) What is the probability that one defective exists in the lot?
(c) What is the probability that two defectives exist in the lot?
2.120 A rare disease exists with which only 1 in 500 is affected. A test for the disease exists, but of course it is not infallible. A correct positive result (patient actually has the disease) occurs 95% of the time, while a false positive result (patient does not have the dis-
//

8.9 Potential Misconceptions and Hazards
79
ease) occurs 1% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease?
2.121 A construction company employs two sales en- gineers. Engineer 1 does the work of estimating cost for 70% of jobs bid by the company. Engineer 2 does the work for 30% of jobs bid by the company. It is known that the error rate for engineer 1 is such that 0.02 is the probability of an error when he does the work, whereas the probability of an error in the work of engineer 2 is 0.04. Suppose a bid arrives and a se- rious error occurs in estimating cost. Which engineer would you guess did the work? Explain and show all work.
2.122 In the field of quality control, the science of statistics is often used to determine if a process is “out of control.” Suppose the process is, indeed, out of con- trol and 20% of items produced are defective.
(a) If three items arrive off the process line in succes- sion, what is the probability that all three are de- fective?
(b) If four items arrive in succession, what is the prob- ability that three are defective?
2.123 An industrial plant is conducting a study to determine how quickly injured workers are back on the job following injury. Records show that 10% of all in- jured workers are admitted to the hospital for treat- ment and 15% are back on the job the next day. In addition, studies show that 2% are both admitted for hospital treatment and back on the job the next day. If a worker is injured, what is the probability that the worker will either be admitted to a hospital or be back on the job the next day or both?
2.124 A firm is accustomed to training operators who do certain tasks on a production line. Those operators who attend the training course are known to be able to meet their production quotas 90% of the time. New op- erators who do not take the training course only meet their quotas 65% of the time. Fifty percent of new op- erators attend the course. Given that a new operator meets her production quota, what is the probability that she attended the program?
2.125 A survey of those using a particular statistical software system indicated that 10% were dissatisfied.
Half of those dissatisfied purchased the system from vendor A. It is also known that 20% of those surveyed purchased from vendor A. Given that the software was purchased from vendor A, what is the probability that that particular user is dissatisfied?
2.126 During bad economic times, industrial workers are dismissed and are often replaced by machines. The history of 100 workers whose loss of employment is at- tributable to technological advances is reviewed. For each of these individuals, it is determined if he or she was given an alternative job within the same company, found a job with another company in the same field, found a job in a new field, or has been unemployed for 1 year. In addition, the union status of each worker is recorded. The following table summarizes the results.
Same Company
New Company (same field) New Field
Unemployed
Union Nonunion
40 15 13 10 4 11 2 5
2.8 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
This chapter contains the fundamental definitions, rules, and theorems that provide a foundation that renders probability an important tool for evaluating
(a) If the selected worker found a job with a new com- pany in the same field, what is the probability that the worker is a union member?
(b) If the worker is a union member, what is the prob- ability that the worker has been unemployed for a year?
2.127 There is a 50-50 chance that the queen carries the gene of hemophilia. If she is a carrier, then each prince has a 50-50 chance of having hemophilia inde- pendently. If the queen is not a carrier, the prince will not have the disease. Suppose the queen has had three princes without the disease. What is the probability the queen is a carrier?
2. 128 Group Pro ject: Give each student a bag of chocolate M&Ms. Divide the students into groups of 5 or 6. Calculate the relative frequency distribution for color of M&Ms for each group.
(a) What is your estimated probability of randomly picking a yellow? a red?
(b) Redo the calculations for the whole classroom. Did the estimates change?
(c) Do you believe there is an equal number of each color in a process batch? Discuss.

80
Chapter 2 Probability
scientific and engineering systems. The evaluations are often in the form of prob- ability computations, as is illustrated in examples and exercises. Concepts such as independence, conditional probability, Bayes’ rule, and others tend to mesh nicely to solve practical problems in which the bottom line is to produce a probability value. Illustrations in exercises are abundant. See, for example, Exercises 2.100 and 2.101. In these and many other exercises, an evaluation of a scientific system is being made judiciously from a probability calculation, using rules and definitions discussed in the chapter.
Now, how does the material in this chapter relate to that in other chapters? It is best to answer this question by looking ahead to Chapter 3. Chapter 3 also deals with the type of problems in which it is important to calculate probabili- ties. We illustrate how system performance depends on the value of one or more probabilities. Once again, conditional probability and independence play a role. However, new concepts arise which allow more structure based on the notion of a random variable and its probability distribution. Recall that the idea of frequency distributions was discussed briefly in Chapter 1. The probability distribution dis- plays, in equation form or graphically, the total information necessary to describe a probability structure. For example, in Review Exercise 2.122 the random variable of interest is the number of defective items, a discrete measurement. Thus, the probability distribution would reveal the probability structure for the number of defective items out of the number selected from the process. As the reader moves into Chapter 3 and beyond, it will become apparent that assumptions will be re- quired in order to determine and thus make use of probability distributions for solving scientific problems.

Chapter 3
Random Variables and Probability Distributions
3.1 Concept of a Random Variable
Definition 3.1:
Statistics is concerned with making inferences about populations and population characteristics. Experiments are conducted with results that are subject to chance. The testing of a number of electronic components is an example of a statistical experiment, a term that is used to describe any process by which several chance observations are generated. It is often important to allocate a numerical description to the outcome. For example, the sample space giving a detailed description of each possible outcome when three electronic components are tested may be written
S = {NNN,NND,NDN,DNN,NDD,DND,DDN,DDD},
where N denotes nondefective and D denotes defective. One is naturally concerned with the number of defectives that occur. Thus, each point in the sample space will be assigned a numerical value of 0, 1, 2, or 3. These values are, of course, random quantities determined by the outcome of the experiment. They may be viewed as values assumed by the random variable X, the number of defective items when three electronic components are tested.
We shall use a capital letter, say X, to denote a random variable and its correspond- ing small letter, x in this case, for one of its values. In the electronic component testing illustration above, we notice that the random variable X assumes the value 2 for all elements in the subset
E = {DDN,DND,NDD}
of the sample space S. That is, each possible value of X represents an event that
is a subset of the sample space for the given experiment. 81
A random variable is a function that associates a real number with each element in the sample space.

82
Chapter 3 Random Variables and Probability Distributions
Example 3.1:
Two balls are drawn in succession without replacement from an urn containing 4 red balls and 3 black balls. The possible outcomes and the values y of the random variable Y , where Y is the number of red balls, are
Sample Space y RR 2 RB 1 BR 1 BB 0
A stockroom clerk returns three safety helmets at random to three steel mill em- ployees who had previously checked them. If Smith, Jones, and Brown, in that order, receive one of the three hats, list the sample points for the possible orders of returning the helmets, and find the value m of the random variable M that represents the number of correct matches.
If S, J, and B stand for Smith’s, Jones’s, and Brown’s helmets, respectively, then the possible arrangements in which the helmets may be returned and the number of correct matches are
Sample Space m SJB 3 SBJ 1 BJS 1 JSB 1 JBS 0 BSJ 0
In each of the two preceding examples, the sample space contains a finite number of elements. On the other hand, when a die is thrown until a 5 occurs, we obtain a sample space with an unending sequence of elements,
S = {F,NF,NNF,NNNF,…},
where F and N represent, respectively, the occurrence and nonoccurrence of a 5. But even in this experiment, the number of elements can be equated to the number of whole numbers so that there is a first element, a second element, a third element, and so on, and in this sense can be counted.
There are cases where the random variable is categorical in nature. Variables, often called dummy variables, are used. A good illustration is the case in which the random variable is binary in nature, as shown in the following example.
Consider the simple condition in which components are arriving from the produc- tion line and they are stipulated to be defective or not defective. Define the random variable X by
􏰥
X=
Example 3.2:
Solution:
Example 3.3:
1, if the component is defective,
0, if the component is not defective.

3.1 Concept of a Random Variable 83
Clearly the assignment of 1 or 0 is arbitrary though quite convenient. This will become clear in later chapters. The random variable for which 0 and 1 are chosen to describe the two possible values is called a Bernoulli random variable.
Further illustrations of random variables are revealed in the following examples.
Example 3.4: Statisticians use sampling plans to either accept or reject batches or lots of material. Suppose one of these sampling plans involves sampling independently 10 items from a lot of 100 items in which 12 are defective.
Let X be the random variable defined as the number of items found defec- tive in the sample of 10. In this case, the random variable takes on the values 0,1,2,…,9,10.
Example 3.5: Suppose a sampling plan involves sampling items from a process until a defective is observed. The evaluation of the process will depend on how many consecutive items are observed. In that regard, let X be a random variable defined by the number of items observed before a defective is found. With N a nondefective and D a defective, sample spaces are S = {D} given X = 1, S = {ND} given X = 2, S = {NND} given X = 3, and so on.
Example 3.6: Interest centers around the proportion of people who respond to a certain mail order solicitation. Let X be that proportion. X is a random variable that takes on all values x for which 0 ≤ x ≤ 1.
Example 3.7: Let X be the random variable defined by the waiting time, in hours, between successive speeders spotted by a radar unit. The random variable X takes on all values x for which x ≥ 0.
Definition 3.2:
Definition 3.3:
The outcomes of some statistical experiments may be neither finite nor countable. Such is the case, for example, when one conducts an investigation measuring the distances that a certain make of automobile will travel over a prescribed test course on 5 liters of gasoline. Assuming distance to be a variable measured to any degree of accuracy, then clearly we have an infinite number of possible distances in the sample space that cannot be equated to the number of whole numbers. Or, if one were to record the length of time for a chemical reaction to take place, once again the possible time intervals making up our sample space would be infinite in number and uncountable. We see now that all sample spaces need not be discrete.
A random variable is called a discrete random variable if its set of possible outcomes is countable. The random variables in Examples 3.1 to 3.5 are discrete random variables. But a random variable whose set of possible values is an entire interval of numbers is not discrete. When a random variable can take on values
If a sample space contains a finite number of possibilities or an unending sequence with as many elements as there are whole numbers, it is called a discrete sample space.
If a sample space contains an infinite number of possibilities equal to the number of points on a line segment, it is called a continuous sample space.

84
Chapter 3 Random Variables and Probability Distributions
3.2
on a continuous scale, it is called a continuous random variable. Often the possible values of a continuous random variable are precisely the same values that are contained in the continuous sample space. Obviously, the random variables described in Examples 3.6 and 3.7 are continuous random variables.
In most practical problems, continuous random variables represent measured data, such as all possible heights, weights, temperatures, distance, or life periods, whereas discrete random variables represent count data, such as the number of defectives in a sample of k items or the number of highway fatalities per year in a given state. Note that the random variables Y and M of Examples 3.1 and 3.2 both represent count data, Y the number of red balls and M the number of correct hat matches.
Discrete Probability Distributions
A discrete random variable assumes each of its values with a certain probability. In the case of tossing a coin three times, the variable X, representing the number of heads, assumes the value 2 with probability 3/8, since 3 of the 8 equally likely sample points result in two heads and one tail. If one assumes equal weights for the simple events in Example 3.2, the probability that no employee gets back the right helmet, that is, the probability that M assumes the value 0, is 1/3. The possible values m of M and their probabilities are
m 013
P(M = m)
111 326
Definition 3.4:
Note that the values of m exhaust all possible cases and hence the probabilities add to 1.
Frequently, it is convenient to represent all the probabilities of a random variable X by a formula. Such a formula would necessarily be a function of the numerical values x that we shall denote by f(x), g(x), r(x), and so forth. Therefore, we write f(x) = P(X = x); that is, f(3) = P(X = 3). The set of ordered pairs (x,f(x)) is called the probability function, probability mass function, or probability distribution of the discrete random variable X.
The set of ordered pairs (x, f (x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x,
1. f(x)≥0,
2. 􏰦f(x)=1,
x
3. P(X=x)=f(x).
Example 3.8: A shipment of 20 similar laptop computers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.
Solution: Let X be a random variable whose values x are the possible numbers of defective computers purchased by the school. Then x can only take the numbers 0, 1, and

3.2 Discrete Probability Distributions
85
2. Now
f(0)=P(X =0)=
f(2)=P(X =2)=
􏰩3􏰪􏰩17􏰪 68 􏰩3􏰪􏰩17􏰪 02 11
51 = 190,
􏰩20􏰪 = 95, f(1)=P(X =1)= 􏰩20􏰪 22
􏰩3􏰪􏰩17􏰪
3 􏰩20􏰪 = 190.
20 2
Thus, the probability distribution of X is x012
68 51 3 95 190 190
f (x)
Example 3.9: If a car agency sells 50% of its inventory of a certain foreign car equipped with side airbags, find a formula for the probability distribution of the number of cars with side airbags among the next 4 cars sold by the agency.
Solution : Since the probability of selling an automobile with side airbags is 0.5, the 24 = 16 points in the sample space are equally likely to occur. Therefore, the denominator for all probabilities, and also for our function, is 16. To obtain the number of ways of selling 3 cars with side airbags, we need to consider the number of ways of partitioning 4 outcomes into two cells, with 3 cars with side airbags assigned to one cell and the model without side airbags assigned to the other. This can be
done in 􏰩4􏰪= 4 ways. In general, the event of selling x models with side airbags 3 􏰩4􏰪
1 􏰧4􏰨
f(x) = 16 x , for x = 0,1,2,3,4.
There are many problems where we may wish to compute the probability that the observed value of a random variable X will be less than or equal to some real number x. Writing F(x) = P(X ≤ x) for every real number x, we define F(x) to be the cumulative distribution function of the random variable X.
For the random variable M, the number of correct matches in Example 3.2, we
and 4 − x models without side airbags can occur in x ways, where x can be 0, 1, 2, 3, or 4. Thus, the probability distribution f (x) = P (X = x) is
The cumulative distribution function F (x) of a discrete random variable X with probability distribution f(x) is
􏰤
t≤x
F(x)=P(X ≤x)=
f(t), for −∞ 3);
(c) P(1.4 < T < 6); (d)P(T ≤5|T ≥2). 3.13 The probability distribution of X, the number of imperfections per 10 meters of a synthetic fabric in continuous rolls of uniform width, is given by x01234 f (x) 0.41 0.37 0.16 0.05 0.01 Construct the cumulative distribution function of X. 3.14 The waiting time, in hours, between successive speeders spotted by a radar unit is a continuous ran- dom variable with cumulative distribution function 􏰰 F(x)= 0, x<0, 1−e−8x, x≥0. Find the probability of waiting less than 12 minutes between successive speeders (a) using the cumulative distribution function of X; (b) using the probability density function of X. 3.15 Find the cumulative distribution function of the random variable X representing the number of defec- tives in Exercise 3.11. Then using F (x), find (a) P(X = 1); (b) P(0 < X ≤ 2). 3.16 Construct a graph of the cumulative distribution function of Exercise 3.15. 3.17 A continuous random variable X that can as- sume values between x = 1 and x = 3 has a density function given by f(x) = 1/2. (a) Show that the area under the curve is equal to 1. (b) Find P(2 < X < 2.5). 􏰥 20,000 (x+100)3 0, have a shell life of (a) at least 200 days; (b) anywhere from 80 to 120 days. 3.7 The total number of hours, measured in units of 100 hours, that a family runs a vacuum cleaner over a period of one year is a continuous random variable X that has the density function ⎧ ⎨x, 0 < x < 1, f(x)=⎩2−x, 1≤x<2, 0, elsewhere. Find the probability that over a period of one year, a family runs their vacuum cleaner (a) less than 120 hours; (b) between 50 and 100 hours. 3.8 Find the probability distribution of the random variable W in Exercise 3.3, assuming that the coin is biased so that a head is twice as likely to occur as a tail. 3.9 The proportion of people who respond to a certain mail-order solicitation is a continuous random variable X that has the density function 􏰥 , x > 0, elsewhere.
//
92 Chapter 3 Random Variables and Probability Distributions
f(x) =
Find the probability that a bottle of this medicine will
f(x) = 5 0,
2(x+2), 0 0); (b)P(−1≤W <3). 3.24 Find the probability distribution for the number of jazz CDs when 4 CDs are selected at random from a collection consisting of 5 jazz CDs, 2 classical CDs, and 3 rock CDs. Express your results by means of a formula. 3.25 From a box containing 4 dimes and 2 nickels, 3 coins are selected at random without replacement. Find the probability distribution for the total T of the 3 coins. Express the probability distribution graphi- cally as a probability histogram. 3.26 From a box containing 4 black balls and 2 green balls, 3 balls are drawn in succession, each ball being replaced in the box before the next draw is made. Find the probability distribution for the number of green balls. 3.27 The time to failure in hours of an important piece of electronic equipment used in a manufactured DVD player has the density function 􏰰 1 exp(−x/2000), x ≥ 0, f(x) = 2000 0, x < 0. (b) (c) // 3.28 A cereal manufacturer is aware that the weight of the product in the box varies slightly from box to box. In fact, considerable historical data have al- lowed the determination of the density function that describes the probability structure for the weight (in ounces). Letting X be the random variable weight, in ounces, the density function can be described as (a) (b) (c) 􏰰2 , 23.75 ≤ x ≤ 26.25, f(x)= 5 0, elsewhere. Verify that this is a valid density function. Determine the probability that the weight is smaller than 24 ounces. The company desires that the weight exceeding 26 ounces be an extremely rare occurrence. What is the probability that this rare occurrence does ac- tually occur? 3.29 An important factor in solid missile fuel is the particle size distribution. Significant problems occur if the particle sizes are too large. From production data in the past, it has been determined that the particle size (in micrometers) distribution is characterized by 􏰰3x−4, x > 1, f(x) = 0, elsewhere.
(a) Verify that this is a valid density function. (b) Evaluate F(x).
(c) What is the probability that a random particle from the manufactured fuel exceeds 4 micrometers?
3.30 Measurements of scientific systems are always subject to variation, some more than others. There are many structures for measurement error, and statis- ticians spend a great deal of time modeling these errors. Suppose the measurement error X of a certain physical quantity is decided by the density function
(a) (b) (c)
􏰰2
k(3−x ), −1≤x≤1,
f(x)=
Determine k that renders f(x) a valid density func-
0, elsewhere.
tion.
Find the probability that a random error in mea- surement is less than 1/2.
For this particular measurement, it is undesirable if the magnitude of the error (i.e., |x|) exceeds 0.8. What is the probability that this occurs?

94 Chapter 3 Random Variables and Probability Distributions
3.31 Based on extensive testing, it is determined by the manufacturer of a washing machine that the time Y (in years) before a major repair is required is char- acterized by the probability density function
3.34 Magnetron tubes are produced on an automated assembly line. A sampling plan is used periodically to assess quality of the lengths of the tubes. This mea- surement is subject to uncertainty. It is thought that the probability that a random tube meets length spec- ification is 0.99. A sampling plan is used in which the lengths of 5 random tubes are measured.
(a) Show that the probability function of Y , the num- ber out of 5 that meet length specification, is given by the following discrete probability function:
f(y) = 5! (0.99)y(0.01)5−y, y!(5 − y)!
for y = 0,1,2,3,4,5.
(b) Suppose random selections are made off the line and 3 are outside specifications. Use f(y) above ei- ther to support or to refute the conjecture that the probability is 0.99 that a single tube meets specifi- cations.
3.35 Suppose it is known from large amounts of his- torical data that X, the number of cars that arrive at a specific intersection during a 20-second time period, is characterized by the following discrete probability function:
􏰰1e−y/4, y≥0, f(y)= 4
0, elsewhere.
(a) Critics would certainly consider the product a bar- gain if it is unlikely to require a major repair before the sixth year. Comment on this by determining P(Y >6).
(b) What is the probability that a major repair occurs in the first year?
3.32 The proportion of the budget for a certain type of industrial company that is allotted to environmental and pollution control is coming under scrutiny. A data collection project determines that the distribution of these proportions is given by
􏰰5(1−y)4, 0≤y≤1, f(y) = 0, elsewhere.
(a) Verify that the above is a valid density function.
(b) What is the probability that a company chosen at random expends less than 10% of its budget on en- vironmental and pollution controls?
(c) What is the probability that a company selected at random spends more than 50% of its budget on environmental and pollution controls?
3.33 Suppose a certain type of small data processing firm is so specialized that some have difficulty making a profit in their first year of operation. The probabil- ity density function that characterizes the proportion Y that make a profit is given by
􏰰ky4(1−y)3, 0≤y≤1, f(y) = 0, elsewhere.
(a) What is the value of k that renders the above a valid density function?
(b) Find the probability that at most 50% of the firms make a profit in the first year.
(c) Find the probability that at least 80% of the firms make a profit in the first year.
f(x)=e
−6 6x
x!, forx=0,1,2,….
3.4 Joint Probability Distributions
Our study of random variables and their probability distributions in the preced- ing sections is restricted to one-dimensional sample spaces, in that we recorded outcomes of an experiment as values assumed by a single random variable. There will be situations, however, where we may find it desirable to record the simulta-
(a) Find the probability that in a specific 20-second time period, more than 8 cars arrive at the intersection.
(b) Find the probability that only 2 cars arrive.
3.36 On a laboratory assignment, if the equipment is working, the density function of the observed outcome, X,is
􏰰2(1−x), f(x) = 0,
0 < x < 1, otherwise. (a) Calculate P (X ≤ 1/3). (b) What is the probability that X will exceed 0.5? (c) Given that X ≥ 0.5, what is the probability that X will be less than 0.75? 3.4 Joint Probability Distributions 95 neous outcomes of several random variables. For example, we might measure the amount of precipitate P and volume V of gas released from a controlled chemical experiment, giving rise to a two-dimensional sample space consisting of the out- comes (p,v), or we might be interested in the hardness H and tensile strength T of cold-drawn copper, resulting in the outcomes (h, t). In a study to determine the likelihood of success in college based on high school data, we might use a three- dimensional sample space and record for each individual his or her aptitude test score, high school class rank, and grade-point average at the end of freshman year in college. If X and Y are two discrete random variables, the probability distribution for their simultaneous occurrence can be represented by a function with values f(x,y) for any pair of values (x, y) within the range of the random variables X and Y . It is customary to refer to this function as the joint probability distribution of X and Y . Hence, in the discrete case, f(x,y)=P(X=x,Y =y); that is, the values f(x,y) give the probability that outcomes x and y occur at the same time. For example, if an 18-wheeler is to have its tires serviced and X represents the number of miles these tires have been driven and Y represents the number of tires that need to be replaced, then f(30000,5) is the probability that the tires are used over 30,000 miles and the truck needs 5 new tires. The function f(x,y) is a joint probability distribution or probability mass function of the discrete random variables X and Y if 1. f(x,y) ≥ 0 for all (x,y), 2. 􏰦􏰦f(x,y)=1, xy 3. P(X=x,Y =y)=f(x,y). For any region A in the xy plane, P [(X, Y ) ∈ A] = 􏰦 􏰦 f (x, y). A Definition 3.8: Example 3.14: Two ballpoint pens are selected at random from a box that contains 3 blue pens, 2 red pens, and 3 green pens. If X is the number of blue pens selected and Y is the number of red pens selected, find (a) the joint probability function f(x,y), (b) P[(X,Y)∈A],whereAistheregion{(x,y)|x+y≤1}. Solution : The possible pairs of values (x, y) are (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), and (2, 0). (a) Now, f(0,1), for example, represents the probability that a red and a green pens are selected. The total number of equally likely ways of selecting any 2 pens from the 8 is 􏰩8􏰪 = 28. The number of ways of selecting 1 red from 2 2 􏰩2􏰪􏰩3􏰪 red pens and 1 green from 3 green pens is 1 1 = 6. Hence, f(0,1) = 6/28 = 3/14. Similar calculations yield the probabilities for the other cases, which are presented in Table 3.1. Note that the probabilities sum to 1. In Chapter 96 Chapter 3 Random Variables and Probability Distributions 5, it will become clear that the joint probability distribution of Table 3.1 can be represented by the formula 􏰩3􏰪􏰩2􏰪􏰩 3 􏰪 x y 2−x−y f(x,y) = 􏰩8􏰪 , 2 for x = 0, 1, 2; y = 0, 1, 2; and 0 ≤ x + y ≤ 2. (b) The probability that (X, Y ) fall in the region A is P[(X,Y)∈A]=P(X+Y ≤1)=f(0,0)+f(0,1)+f(1,0) =3+3+9=9. 
Table 3.1: Joint Probability Distribution for Example 3.14 Row f(x,y) Totals 28 14 28 14 x 012 393 28 28 28 1 28 0 1 2 y 15 28 3 7 1 28 33 14 14 0 00 5 15 3 14 28 28 Column Totals 1 When X and Y are continuous random variables, the joint density function f(x,y) is a surface lying above the xy plane, and P[(X,Y) ∈ A], where A is any region in the xy plane, is equal to the volume of the right cylinder bounded by the base A and the surface. The function f(x,y) is a joint density function of the continuous random variables X and Y if 1. f(x,y) ≥ 0, for all (x,y), 2.􏰬∞ 􏰬∞ f(x,y)dxdy=1, 􏰬􏰬 −∞ −∞ 3. P[(X,Y)∈A]= Af(x,y)dxdy,foranyregionAinthexyplane. Definition 3.9: Example 3.15: A privately owned business operates both a drive-in facility and a walk-in facility. On a randomly selected day, let X and Y , respectively, be the proportions of the time that the drive-in and the walk-in facilities are in use, and suppose that the joint density function of these random variables is 􏰥 2(2x+3y), 0≤x≤1,0≤y≤1, 5 f(x,y) = (a) Verify condition 2 of Definition 3.9. 0, elsewhere. (b) FindP[(X,Y)∈A],whereA={(x,y)|00, P (A)

3.4 Joint Probability Distributions 99 where A and B are now the events defined by X = x and Y = y, respectively, then
Definition 3.11:
P(Y =y|X=x)=P(X=x,Y =y)=f(x,y), providedg(x)>0, P(X = x) g(x)
where X and Y are discrete random variables.
It is not difficult to show that the function f (x, y)/g(x), which is strictly a func-
tion of y with x fixed, satisfies all the conditions of a probability distribution. This is also true when f(x,y) and g(x) are the joint density and marginal distribution, respectively, of continuous random variables. As a result, it is extremely important that we make use of the special type of distribution of the form f(x,y)/g(x) in order to be able to effectively compute conditional probabilities. This type of dis- tribution is called a conditional probability distribution; the formal definition follows.
Let X and Y be two random variables, discrete or continuous. The conditional distribution of the random variable Y given that X = x is
f(y|x) = f(x,y), provided g(x) > 0. g(x)
Similarly, the conditional distribution of X given that Y = y is
f(x|y) = f(x,y), provided h(y) > 0. h(y)
If we wish to find the probability that the discrete random variable X falls between a and b when it is known that the discrete variable Y = y, we evaluate
􏰤
P(a2􏰭􏰭X=0.25 =
1/2
Given the joint density function
f(y|x=0.25)dy=
􏰫13y2 8
1/2
1−0.253 dy=9.
Example 3.20:
􏰥x(1+3y2), 0 0,
0, elsewhere.
f(x) =
Let X1, X2, and X3 represent the shelf lives for three of these containers selected
independently and find P(X1 < 2,1 < X2 < 3,X3 > 2).
Solution: Since the containers were selected independently, we can assume that the random
variables X1, X2, and X3 are statistically independent, having the joint probability density
f(x1, x2, x3) = f(x1)f(x2)f(x3) = e−x1 e−x2 e−x3 = e−x1−x2−x3 , for x1 > 0, x2 > 0, x3 > 0, and f(x1,x2,x3) = 0 elsewhere. Hence
􏰫∞􏰫3􏰫2
P(X1 < 2,1 < X2 < 3,X3 > 2) =
= (1 − e−2)(e−1 − e−3)e−2 = 0.0372.
210
e−x1−x2−x3 dx1 dx2 dx3

Exercises
//
104 Chapter 3 Random Variables and Probability Distributions What Are Important Characteristics of Probability Distributions
and Where Do They Come From?
This is an important point in the text to provide the reader with a transition into the next three chapters. We have given illustrations in both examples and exercises of practical scientific and engineering situations in which probability distributions and their properties are used to solve important problems. These probability dis- tributions, either discrete or continuous, were introduced through phrases like “it is known that” or “suppose that” or even in some cases “historical evidence sug- gests that.” These are situations in which the nature of the distribution and even a good estimate of the probability structure can be determined through historical data, data from long-term studies, or even large amounts of planned data. The reader should remember the discussion of the use of histograms in Chapter 1 and from that recall how frequency distributions are estimated from the histograms. However, not all probability functions and probability density functions are derived from large amounts of historical data. There are a substantial number of situa- tions in which the nature of the scientific scenario suggests a distribution type. Indeed, many of these are reflected in exercises in both Chapter 2 and this chap- ter. When independent repeated observations are binary in nature (e.g., defective or not, survive or not, allergic or not) with value 0 or 1, the distribution covering this situation is called the binomial distribution and the probability function is known and will be demonstrated in its generality in Chapter 5. Exercise 3.34 in Section 3.3 and Review Exercise 3.80 are examples, and there are others that the reader should recognize. The scenario of a continuous distribution in time to failure, as in Review Exercise 3.69 or Exercise 3.27 on page 93, often suggests a dis- tribution type called the exponential distribution. These types of illustrations are merely two of many so-called standard distributions that are used extensively in real-world problems because the scientific scenario that gives rise to each of them is recognizable and occurs often in practice. Chapters 5 and 6 cover many of these types along with some underlying theory concerning their use.
A second part of this transition to material in future chapters deals with the notion of population parameters or distributional parameters. Recall in Chapter 1 we discussed the need to use data to provide information about these parameters. We went to some length in discussing the notions of a mean and variance and provided a vision for the concepts in the context of a population. Indeed, the population mean and variance are easily found from the probability function for the discrete case or probability density function for the continuous case. These parameters and their importance in the solution of many types of real-world problems will provide much of the material in Chapters 8 through 17.
3.37 Determine the values of c so that the follow- 3.38 If the joint probability distribution of X and Y
ing functions represent joint probability distributions of the random variables X and Y:
(a)f(x,y)=cxy,forx=1,2,3;y=1,2,3;
(b) f (x, y) = c|x − y|, for x = −2, 0, 2; y = −2, 3.
is given by
f(x,y)= 30 , forx=0,1,2,3; y=0,1,2,
find
x+y

Exercises
105
(a)P(X≤2,Y =1); (b)P(X>2,Y ≤1);
(c) P (X > Y ); (d)P(X+Y =4).
3.39 From a sack of fruit containing 3 oranges, 2 ap- ples, and 3 bananas, a random sample of 4 pieces of fruit is selected. If X is the number of oranges and Y is the number of apples in the sample, find
(a) the joint probability distribution of X and Y ;
(b) P[(X,Y)∈A],whereAistheregionthatisgiven
by{(x,y)|x+y≤2}.
3.40 A fast-food restaurant operates both a drive- through facility and a walk-in facility. On a randomly selected day, let X and Y , respectively, be the propor- tions of the time that the drive-through and walk-in facilities are in use, and suppose that the joint density function of these random variables is
􏰰2
f(x,y)= 3(x+2y), 0≤x≤1, 0≤y≤1,
0, elsewhere.
(a) Find the marginal density of X.
(b) Find the marginal density of Y .
(c) Find the probability that the drive-through facility is busy less than one-half of the time.
3.41 A candy company distributes boxes of choco- lates with a mixture of creams, toffees, and cordials. Suppose that the weight of each box is 1 kilogram, but the individual weights of the creams, toffees, and cor- dials vary from box to box. For a randomly selected box, let X and Y represent the weights of the creams and the toffees, respectively, and suppose that the joint density function of these variables is
􏰰24xy, 0≤x≤1, 0≤y≤1, x+y≤1, f (x, y) = 0, elsewhere.
(a) Find the probability that in a given box the cordials account for more than 1/2 of the weight.
(b) Find the marginal density for the weight of the creams.
(c) Find the probability that the weight of the toffees in a box is less than 1/8 of a kilogram if it is known that creams constitute 3/4 of the weight.
3.42 Let X and Y denote the lengths of life, in years, of two components in an electronic system. If the joint density function of these variables is
, x>0,y>0, elsewhere,
findP(01/2).

106
Chapter 3 Random Variables and Probability Distributions
3. 53 Given the joint density function
􏰰6−x−y, 00.3|Y =0.5).
3.57 Let X, Y , and Z have the joint probability den- sity function
elsewhere,
(a) Evaluate the (b) Evaluate the
marginal distribution of X.
marginal distribution of Y . (c)FindP(Y =3|X=2).
3.50 Suppose that X and Y have the following joint probability distribution:
x f(x,y) 2 4
1 0.10 0.15
f(x,y,z) = (a) Find k.
􏰰kxy2z, 01,1 1 , 1 < Z < 2); 423 (d)P(00,
0, eleswhere.
//
⎨0.4, F (x) = ⎪0.6, ⎪⎩0.8, 1.0,
⎧⎪0, ⎪
3.66 Consider the random variables X and Y with joint density function
􏰰
f(x,y)= x+y, 0≤x,y≤1, 0, elsewhere.
(a) Find the marginal distributions of X and Y . (b) Find P(X > 0.5,Y > 0.5).
3.67 An industrial process manufactures items that can be classified as either defective or not defective. The probability that an item is defective is 0.1. An experiment is conducted in which 5 items are drawn randomly from the process. Let the random variable X be the number of defectives in this sample of 5. What is the probability mass function of X?
3. 68 Consider the following joint probability density function of the random variables X and Y :
elsewhere.
(a) Find the marginal density functions of X and Y . (b) Are X and Y independent?
(c) Find P(X > 2).
3.69 The life span in hours of an electrical compo- nent is a random variable with cumulative distribution function
(a) What is the probability mass function of X? (b) Compute P(4 < X ≤ 7). 3.63 Two electronic components of a missile system work in harmony for the success of the total system. Let X and Y denote the life in hours of the two com- ponents. The joint density of X and Y is 􏰰ye−y(1+x), 0, elsewhere. (a) (b) (c) e−2 2x f(x) = x! , for x = 0,1,2,.... Determine the probability that X equals 0, 1, 2, 3, 4,5,and6. Graph the probability mass function for these val- ues of x. Determine the cumulative distribution function for these values of X. 􏰰3x−y, 1 0.5?
Give the conditional distribution fX1|X2 (x1|x2).
0, elsewhere.
(a) What is the probability that there are no calls
within a 20-minute time interval?
(b) What is the probability that the first call comes within 10 minutes of opening?
3.75 A chemical system that results from a chemical reaction has two important components among others in a blend. The joint distribution describing the pro- portions X1 and X2 of these two components is given by
0, elsewhere.
What fraction of the loaves of this product stocked to-
day would you expect to be sellable 3 days from now?
3.72 Passenger congestion is a service problem in air- ports. Trains are installed within the airport to reduce the congestion. With the use of the train, the time X in minutes that it takes to travel from the main terminal to a particular concourse has density function
3.76 Consider the situation of Review Exercise 3.75. But suppose the joint distribution of the two propor- tions is given by
􏰰1, 0≤x≤10, f(x)= 10
􏰰
f(x1,x2)= 6×2, 0 100, elsewhere.
Find the expected life of this type of device. Solution: Using Definition 4.1, we have
􏰫 ∞ μ = E(X) =
100
x
20,000 x3
􏰫 ∞ 20,000 dx = x2
100
dx = 200.
0,
Therefore, we can expect this type of device to last, on average, 200 hours.
Now let us consider a new random variable g(X), which depends on X; that is, each value of g(X) is determined by the value of X. For instance, g(X) might be X2 or 3X − 1, and whenever X assumes the value 2, g(X) assumes the value g(2). In particular, if X is a discrete random variable with probability distribution
f(x), for x = −1,0,1,2, and g(X) = X2, then
P[g(X)=0]=P(X =0)=f(0),
P[g(X)=1]=P(X =−1)+P(X =1)=f(−1)+f(1), P[g(X)=4]=P(X =2)=f(2),
and so the probability distribution of g(X) may be written g(x) 0 1 4
P[g(X) = g(x)] f(0) f(−1) + f(1) f(2)
By the definition of the expected value of a random variable, we obtain
μg(X) = E[g(x)] = 0f(0) + 1[f(−1) + f(1)] + 4f(2)
= (−1)2f(−1) + (0)2f(0) + (1)2f(1) + (2)2f(2) =
g(x)f(x).
This result is generalized in Theorem 4.1 for both discrete and continuous random
variables.
􏰤
x
Let X be a random variable with probability distribution f(x). The expected value of the random variable g(X) is
if X is discrete, and
if X is continuous.
μg(X) = E[g(X)] =
g(x)f(x) dx
μg(X) = E[g(X)] =
g(x)f(x)
􏰤
x
􏰫∞ −∞
Theorem 4.1:

4.1 Mean of a Random Variable 115
Example 4.4: Suppose that the number of cars X that pass through a car wash between 4:00 P.M. and 5:00 P.M. on any sunny Friday has the following probability distribution:
x 456789
P(X = x)
111111 12 12 4 4 6 6
Let g(X) = 2X−1 represent the amount of money, in dollars, paid to the attendant by the manager. Find the attendant’s expected earnings for this particular time period.
Solution: By Theorem 4.1, the attendant can expect to receive
􏰤9 x=4
􏰧1􏰨 􏰧1􏰨
+ (15) 6 + (17) 6 = $12.67.
Example 4.5: Let X be a random variable with density function
􏰥x2, −10, (x+4)
f(x) =
Find the average number of days that a person is hos-
Find the probability that at least one light bulb chosen is defective. [Hint: Compute P (X1 + X2 = 1).]
4.17 Let X be a random variable with the following probability distribution:
x −3 6 9 f (x) 1/6 1/2 1/3
Find μg(X), where g(X) = (2X + 1)2.
4.18 Find the expected value of the random variable g(X) = X2, where X has the probability distribution of Exercise 4.2.
4.19 A large industrial firm purchases several new word processors at the end of each year, the exact num- ber depending on the frequency of repairs in the previ- ous year. Suppose that the number of word processors, X, purchased each year has the following probability distribution:
x0123 f (x) 1/10 3/10 2/5 1/5
If the cost of the desired model is $1200 per unit and at the end of the year a refund of 50X2 dollars will be issued, how much can this firm expect to spend on new word processors during this year?
4.20 A continuous random variable X has the density
0.20 0.30
(a) Find the expected value of g(X, Y ) = XY 2.
y 3
5 0.10 0.15
function
􏰰e−x, x > 0,
0, elsewhere.
4.27 In Exercise 3.27 on page 93, a density function is given for the time to failure of an important compo- nent of a DVD player. Find the mean number of hours to failure of the component and thus the DVD player.
4.28 Consider the information in Exercise 3.28 on page 93. The problem deals with the weight in ounces of the product in a cereal box, with
􏰰2 , 23.75 ≤ x ≤ 26.25, f(x)= 5
f(x) =
Find the expected value of g(X) = e2X/3.
4.21 What is the dealer’s average profit per auto- mobile if the profit on each automobile is given by g(X) = X2, where X is a random variable having the density function of Exercise 4.12?
0, elsewhere.
Find the expected value of Z =

X2 + Y 2.

4.2 Variance and Covariance of Random Variables
119
What is the population mean of the times to repair?
4.31 Consider Exercise 3.32 on page 94.
(a) What is the mean proportion of the budget allo-
cated to environmental and pollution control?
(b) What is the probability that a company selected at random will have allocated to environmental and pollution control a proportion that exceeds the population mean given in (a)?
4.32 In Exercise 3.13 on page 92, the distribution of the number of imperfections per 10 meters of synthetic fabric is given by
x01234 f(x) 0.41 0.37 0.16 0.05 0.01
(a) Plot the probability function.
(b) Find the expected number of imperfections,
E(X) = μ. (c)FindE(X2).
Variance and Covariance of Random Variables
The mean, or expected value, of a random variable X is of special importance in statistics because it describes where the probability distribution is centered. By itself, however, the mean does not give an adequate description of the shape of the distribution. We also need to characterize the variability in the distribution. In Figure 4.1, we have the histograms of two discrete probability distributions that have the same mean, μ = 2, but differ considerably in variability, or the dispersion of their observations about the mean.
(a) Plot the density function.
(b) Compute the expected value, or mean weight, in ounces.
(c) Are you surprised at your answer in (b)? Explain why or why not.
4.29 Exercise 3.29 on page 93 dealt with an impor- tant particle size distribution characterized by
f(x) =
(a) Plot the density function.
􏰰
3x−4, x > 1,
0, elsewhere.
(b) Give the mean particle size.
4.30 In Exercise 3.31 on page 94, the distribution of times before a major repair of a washing machine was given as
􏰰1e−y/4, y≥0, f(y)= 4
4.2
0, elsewhere.
123 x01234x (a) (b)
Figure 4.1: Distributions with equal means and unequal dispersions.
The most important measure of variability of a random variable X is obtained by applying Theorem 4.1 with g(X) = (X − μ)2. The quantity is referred to as the variance of the random variable X or the variance of the probability

120 Chapter 4 Mathematical Expectation distribution of X and is denoted by Var(X) or the symbol σX2 , or simply by σ2
Definition 4.3:
when it is clear to which random variable we refer.
Let X be a random variable with probability distribution f(x) and mean μ. The variance of X is
σ2 = E[(X − μ)2] = σ2 = E[(X − μ)2] =
􏰤
(x − μ)2f(x), 􏰫∞
(x − μ)2f(x) dx,
if X is discrete, and
if X is continuous.
x
−∞
The positive square root of the variance, σ, is called the standard deviation of X.
The quantity x−μ in Definition 4.3 is called the deviation of an observation from its mean. Since the deviations are squared and then averaged, σ2 will be much smaller for a set of x values that are close to μ than it will be for a set of values that vary considerably from μ.
Example 4.8: Let the random variable X represent the number of automobiles that are used for official business purposes on any given workday. The probability distribution for company A [Figure 4.1(a)] is
x123 f (x) 0.3 0.4 0.3
and that for company B [Figure 4.1(b)] is x01234
f (x) 0.2 0.1 0.3 0.3 0.1
Show that the variance of the probability distribution for company B is greater
than that for company A. Solution : For company A, we find that
μA = E(X) = (1)(0.3) + (2)(0.4) + (3)(0.3) = 2.0,
and then
􏰤3
σA2 = (x−2)2 =(1−2)2(0.3)+(2−2)2(0.4)+(3−2)2(0.3)=0.6.
x=1
For company B, we have
μB = E(X) = (0)(0.2) + (1)(0.1) + (2)(0.3) + (3)(0.3) + (4)(0.1) = 2.0, and then
􏰤4 x=0
= (0 − 2)2(0.2) + (1 − 2)2(0.1) + (2 − 2)2(0.3) + (3 − 2)2(0.3) + (4 − 2)2(0.1) = 1.6.
σB2 =
(x − 2)2f(x)

4.2 Variance and Covariance of Random Variables 121
Theorem 4.2:
Clearly, the variance of the number of automobiles that are used for official business purposes is greater for company B than for company A.
An alternative and preferred formula for finding σ2, which often simplifies the calculations, is stated in the following theorem.
The variance of a random variable X is
σ2 =E(X2)−μ2.
Proof : For the discrete case, we can write
σ2 = =
􏰤􏰤
(x−μ)2f(x)= (x2 −2μx+μ2)f(x) xx
􏰤􏰤􏰤
x2f(x) − 2μ xf(x) + μ2 f(x). xxx
Since μ = 􏰦xf(x) by definition, and 􏰦f(x) = 1 for any discrete probability xx
distribution, it follows that
􏰤
σ2 =
For the continuous case the proof is step by step the same, with summations
replaced by integrations.
Example 4.9: Let the random variable X represent the number of defective parts for a machine when 3 parts are sampled from a production line and tested. The following is the probability distribution of X.
x0123 f (x) 0.51 0.38 0.10 0.01
Using Theorem 4.2, calculate σ2. Solution: First, we compute
μ = (0)(0.51) + (1)(0.38) + (2)(0.10) + (3)(0.01) = 0.61.
Now,
E(X2) = (0)(0.51) + (1)(0.38) + (4)(0.10) + (9)(0.01) = 0.87.
Therefore,
σ2 = 0.87 − (0.61)2 = 0.4979.
Example 4.10: The weekly demand for a drinking-water product, in thousands of liters, from a local chain of efficiency stores is a continuous random variable X having the probability density
􏰰2(x−1), 10andρXY =−1ifb<0. (SeeExercise4.48.) Thecorrelation coefficient is the subject of more discussion in Chapter 12, where we deal with linear regression. Find the correlation coefficient between X and Y in Example 4.13. Since 􏰧5􏰨 􏰧15􏰨 􏰧3􏰨 27 E(X2)=(02) 14 +(12) 28 +(22) 28 =28 and we obtain 􏰧15􏰨 􏰧3􏰨 􏰧1􏰨 4 E(Y2)=(02) 28 +(12) 7 +(22) 28 =7, 2 27 􏰧3􏰨2 45 2 4 􏰧1􏰨2 9 σX=28− 4 =112andσY =7− 2 =28. Therefore, the correlation coefficient between X and Y is σXY −9/56 1 =−√5. Find the correlation coefficient of X and Y in Example 4.14. ρXY=σσ =􏰱 X Y (45/112)(9/28) Example 4.16: Solution: Because E(X2)= 4x5 dx=3andE(Y2)= 4y3(1−y2)dy=1−3=3, 􏰫12􏰫1 21 00 we conclude that 2 2􏰧4􏰨2 2 2 1􏰧8􏰨2 11 σX=3− 5 =75andσY =3− 15 =225. 4/225 4 Note that although the covariance in Example 4.15 is larger in magnitude (dis- regarding the sign) than that in Example 4.16, the relationship of the magnitudes of the correlation coefficients in these two examples is just the reverse. This is evidence that we cannot look at the magnitude of the covariance to decide on how strong the relationship is. Hence, ρX Y = 􏰱(2/75)(11/225) = √66 . Exercises 127 Exercises 4.33 Use Definition 4.3 on page 120 to find the vari- ance of the random variable X of Exercise 4.7 on page 117. 4.34 Let X be a random variable with the following probability distribution: x −2 3 5 f (x) 0.3 0.2 0.5 Find the standard deviation of X. 4.35 The random variable X, representing the num- ber of errors per 100 lines of software code, has the following probability distribution: x23456 f (x) 0.01 0.25 0.4 0.3 0.04 Using Theorem 4.2 on page 121, find the variance of X. 4.36 Suppose that the probabilities are 0.4, 0.3, 0.2, and 0.1, respectively, that 0, 1, 2, or 3 power failures will strike a certain subdivision in any given year. Find the mean and variance of the random variable X repre- senting the number of power failures striking this sub- division. 4.37 A dealer’s profit, in units of $5000, on a new automobile is a random variable X having the density function given in Exercise 4.12 on page 117. Find the variance of X. 4.38 The proportion of people who respond to a cer- tain mail-order solicitation is a random variable X hav- ing the density function given in Exercise 4.14 on page 117. Find the variance of X. 4.39 The total number of hours, in units of 100 hours, that a family runs a vacuum cleaner over a period of one year is a random variable X having the density function given in Exercise 4.13 on page 117. Find the variance of X. random variable Y = 3X − 2, where X has the density function // 􏰰1e−x/4, x>0 f(x)= 4
Exercises

4.33 Use Definition 4.3 on page 120 to find the variance of the random variable X of Exercise 4.7 on page 117.

4.34 Let X be a random variable with the following probability distribution:
x      −2    3     5
f(x)   0.3   0.2   0.5
Find the standard deviation of X.

4.35 The random variable X, representing the number of errors per 100 lines of software code, has the following probability distribution:
x      2     3     4     5     6
f(x)   0.01  0.25  0.4   0.3   0.04
Using Theorem 4.2 on page 121, find the variance of X.

4.36 Suppose that the probabilities are 0.4, 0.3, 0.2, and 0.1, respectively, that 0, 1, 2, or 3 power failures will strike a certain subdivision in any given year. Find the mean and variance of the random variable X representing the number of power failures striking this subdivision.

4.37 A dealer's profit, in units of $5000, on a new automobile is a random variable X having the density function given in Exercise 4.12 on page 117. Find the variance of X.

4.38 The proportion of people who respond to a certain mail-order solicitation is a random variable X having the density function given in Exercise 4.14 on page 117. Find the variance of X.

4.39 The total number of hours, in units of 100 hours, that a family runs a vacuum cleaner over a period of one year is a random variable X having the density function given in Exercise 4.13 on page 117. Find the variance of X.

4.40 Referring to Exercise 4.14 on page 117, find σ²_g(X) for the function g(X) = 3X² + 4.

4.41 Find the standard deviation of the random variable g(X) = (2X + 1)² in Exercise 4.17 on page 118.

4.42 Using the results of Exercise 4.21 on page 118, find the variance of g(X) = X², where X is a random variable having the density function given in Exercise 4.12 on page 117.

4.43 The length of time, in minutes, for an airplane to obtain clearance for takeoff at a certain airport is a random variable Y = 3X − 2, where X has the density function
f(x) = (1/4)e^(−x/4), x > 0,
       0, elsewhere.
Find the mean and variance of the random variable Y.
4.44 Find the covariance of the random variables X and Y of Exercise 3.39 on page 105.
4.45 Find the covariance of the random variables X and Y of Exercise 3.49 on page 106.
4.46 Find the covariance of the random variables X and Y of Exercise 3.44 on page 105.
4.47 For the random variables X and Y whose joint density function is given in Exercise 3.40 on page 105, find the covariance.
4.48 Given a random variable X, with standard deviation σX, and a random variable Y = a + bX, show that if b < 0, the correlation coefficient ρXY = −1, and if b > 0, ρXY = 1.
4.49 Consider the situation in Exercise 4.32 on page 119. The distribution of the number of imperfections per 10 meters of synthetic fabric is given by
x      0     1     2     3     4
f(x)   0.41  0.37  0.16  0.05  0.01
Find the variance and standard deviation of the num- ber of imperfections.
4.50 For a laboratory assignment, if the equipment is working, the density function of the observed outcome X is
f(x) = 2(1 − x), 0 < x < 1,
       0, otherwise.
Find the variance and standard deviation of X.

4.51 For the random variables X and Y in Exercise 3.39 on page 105, determine the correlation coefficient between X and Y.

4.52 Random variables X and Y follow a joint distribution
f(x, y) = 2, 0 < x ≤ y < 1,
          0, otherwise.
Determine the correlation coefficient between X and Y.

Review Exercises

4.82 The length of time, in minutes, of a certain type of telephone conversation is a random variable X with density function
f(x) = (1/5)e^(−x/5), x > 0,
       0, elsewhere.
(a) Determine the mean length E(X) of this type of telephone conversation.
(b) Find the variance and standard deviation of X.
(c) Find E[(X + 5)²].
4.83 Referring to the random variables whose joint density function is given in Exercise 3.41 on page 105, find the covariance between the weight of the creams and the weight of the toffees in these boxes of choco- lates.
4.84 Referring to the random variables whose joint probability density function is given in Exercise 3.41 on page 105, find the expected weight for the sum of the creams and toffees if one purchased a box of these chocolates.
4.85 Suppose it is known that the life X of a particular compressor, in hours, has the density function
f(x) = (1/900)e^(−x/900), x > 0,
       0, elsewhere.
(a) Find the mean life of the compressor.
(b) Find E(X²).
(c) Find the variance and standard deviation of the random variable X.
4.86 Referring to the random variables whose joint density function is given in Exercise 3.40 on page 105,
(a) find μX and μY;
(b) find E[(X + Y)/2].
4.87 Show that Cov(aX, bY ) = ab Cov(X, Y ).
4.88 Consider the density function of Review Ex- ercise 4.85. Demonstrate that Chebyshev’s theorem holds for k = 2 and k = 3.
4.89 Consider the joint density function
f(x, y) = 16y/x³, x > 2, 0 < y < 1,
          0, elsewhere.
Compute the correlation coefficient ρXY.

4.91 A dealer's profit, in units of $5000, on a new automobile is a random variable X having density function
f(x) = 2(1 − x), 0 ≤ x ≤ 1,
       0, elsewhere.
(a) Find the variance of the dealer's profit.
(b) Demonstrate that Chebyshev's theorem holds for k = 2 with the density function above.
(c) What is the probability that the profit exceeds $500?

4.92 Consider Exercise 4.10 on page 117. Can it be said that the ratings given by the two experts are independent? Explain why or why not.

4.93 A company's marketing and accounting departments have determined that if the company markets its newly developed product, the contribution of the product to the firm's profit during the next 6 months will be described by the following:

Profit Contribution    Probability
−$5,000                0.2
$10,000                0.5
$30,000                0.3

What is the company's expected profit?
4.94 In a support system in the U.S. space program, a single crucial component works only 85% of the time. In order to enhance the reliability of the system, it is decided that 3 components will be installed in parallel such that the system fails only if they all fail. Assume the components act independently and that they are equivalent in the sense that all 3 of them have an 85% success rate. Consider the random variable X as the number of components out of 3 that fail.
(a) Write out a probability function for the random variable X.
(b) What is E(X) (i.e., the mean number of compo- nents out of 3 that fail)?
(c) What is Var(X)?
(d) What is the probability that the entire system is
successful?
(e) What is the probability that the system fails?
(f) If the desire is to have the system be successful with probability 0.99, are three components suffi- cient? If not, how many are required?
4.95 In business, it is important to plan and carry out research in order to anticipate what will occur at the end of the year. Research suggests that the profit (loss) spectrum for a certain company, with corresponding probabilities, is as follows:
5.2 Binomial and Multinomial Distributions

Example 5.4: It is conjectured that an impurity exists in 30% of all drinking wells in a certain rural community. Since it is too expensive to test all of the wells in the area, 10 are randomly selected for testing.
(a) Using the binomial distribution, what is the probability that exactly 3 wells have the impurity, assuming that the conjecture is correct?
(b) What is the probability that more than 3 wells are impure?
Solution: (a) We require

b(3; 10, 0.3) = Σ_{x=0}^{3} b(x; 10, 0.3) − Σ_{x=0}^{2} b(x; 10, 0.3) = 0.6496 − 0.3828 = 0.2668.

(b) P(X > 3) = 1 − P(X ≤ 3) = 1 − 0.6496 = 0.3504.

Example 5.5: Find the mean and variance of the binomial random variable of Example 5.2, and then use Chebyshev's theorem (on page 137) to interpret the interval μ ± 2σ.
Solution: Since Example 5.2 was a binomial experiment with n = 15 and p = 0.4, by Theorem 5.1, we have

μ = (15)(0.4) = 6 and σ² = (15)(0.4)(0.6) = 3.6.
Taking the square root of 3.6, we find that σ = 1.897. Hence, the required interval is 6±(2)(1.897), or from 2.206 to 9.794. Chebyshev’s theorem states that the number of recoveries among 15 patients who contracted the disease has a probability of at least 3/4 of falling between 2.206 and 9.794 or, because the data are discrete, between 2 and 10 inclusive.
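A few lines of Python reproduce the binomial computations in Examples 5.4 and 5.5. This is an informal sketch using only the standard library; the helper name binom_pmf is ours.

from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 5.4: n = 10 wells, conjectured impurity rate p = 0.3.
print(binom_pmf(3, 10, 0.3))                              # about 0.2668
print(1 - sum(binom_pmf(x, 10, 0.3) for x in range(4)))   # P(X > 3), about 0.3504

# Example 5.5: n = 15, p = 0.4, so mu = np and sigma^2 = npq.
n, p = 15, 0.4
print(n * p, n * p * (1 - p))                             # 6.0 3.6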
There are situations in which the computation of binomial probabilities may allow us to draw a scientific inference about a population after data are collected. An illustration is given in the next example.
Example 5.6: Consider the situation of Example 5.4. The notion that 30% of the wells are impure is merely a conjecture put forth by the area water board. Suppose 10 wells are randomly selected and 6 are found to contain the impurity. What does this imply about the conjecture? Use a probability statement.
Solution : We must first ask: “If the conjecture is correct, is it likely that we would find 6 or more impure wells?”
P(X ≥ 6) = Σ_{x=0}^{10} b(x; 10, 0.3) − Σ_{x=0}^{5} b(x; 10, 0.3) = 1 − 0.9527 = 0.0473.
As a result, it is very unlikely (4.7% chance) that 6 or more wells would be found impure if only 30% of all are impure. This casts considerable doubt on the conjec- ture and suggests that the impurity problem is much more severe.
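The tail probability behind this inference can be checked the same way; a brief sketch of ours, again using only the standard library.

from math import comb

# Example 5.6: if 30% of wells are impure, how unusual are 6 or more impure out of 10?
p_at_most_5 = sum(comb(10, x) * 0.3**x * 0.7**(10 - x) for x in range(6))
print(1 - p_at_most_5)   # P(X >= 6), about 0.0473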
As the reader should realize by now, in many applications there are more than two possible outcomes. To borrow an example from the field of genetics, the color of guinea pigs produced as offspring may be red, black, or white. Often the “defective” or “not defective” dichotomy is truly an oversimplification in engineering situations. Indeed, there are often more than two categories that characterize items or parts coming off an assembly line.
Multinomial Experiments and the Multinomial Distribution
The binomial experiment becomes a multinomial experiment if we let each trial have more than two possible outcomes. The classification of a manufactured product as being light, heavy, or acceptable and the recording of accidents at a certain intersection according to the day of the week constitute multinomial exper- iments. The drawing of a card from a deck with replacement is also a multinomial experiment if the 4 suits are the outcomes of interest.
In general, if a given trial can result in any one of k possible outcomes E1, E2, ..., Ek with probabilities p1, p2, ..., pk, then the multinomial distribution will give the probability that E1 occurs x1 times, E2 occurs x2 times, ..., and Ek occurs xk times in n independent trials, where

x1 + x2 + ··· + xk = n.

We shall denote this joint probability distribution by

f(x1, x2, ..., xk; p1, p2, ..., pk, n).

Clearly, p1 + p2 + ··· + pk = 1, since the result of each trial must be one of the k possible outcomes.
To derive the general formula, we proceed as in the binomial case. Since the trials are independent, any specified order yielding x1 outcomes for E1, x2 for E2, ..., and xk for Ek will occur with probability p1^x1 p2^x2 ··· pk^xk. The total number of orders yielding similar outcomes for the n trials is equal to the number of partitions of n items into k groups with x1 in the first group, x2 in the second group, ..., and xk in the kth group. This can be done in

n!/(x1! x2! ··· xk!)

ways. Since all the partitions are mutually exclusive and occur with equal probability, we obtain the multinomial distribution by multiplying the probability for a specified order by the total number of partitions.

Multinomial Distribution: If a given trial can result in the k outcomes E1, E2, ..., Ek with probabilities p1, p2, ..., pk, then the probability distribution of the random variables X1, X2, ..., Xk, representing the number of occurrences for E1, E2, ..., Ek in n independent trials, is

f(x1, x2, ..., xk; p1, p2, ..., pk, n) = [n!/(x1! x2! ··· xk!)] p1^x1 p2^x2 ··· pk^xk,

with Σ_{i=1}^{k} xi = n and Σ_{i=1}^{k} pi = 1.
The multinomial distribution derives its name from the fact that the terms of the multinomial expansion of (p1 + p2 + · · · + pk )n correspond to all the possible values of f(x1,x2,…,xk;p1,p2,…,pk,n).
Example 5.7: The complexity of arrivals and departures of planes at an airport is such that computer simulation is often used to model the "ideal" conditions. For a certain airport with three runways, it is known that in the ideal setting the following are the probabilities that the individual runways are accessed by a randomly arriving commercial jet:

Runway 1: p1 = 2/9,
Runway 2: p2 = 1/6,
Runway 3: p3 = 11/18.

What is the probability that 6 randomly arriving airplanes are distributed in the following fashion?

Runway 1: 2 airplanes,
Runway 2: 1 airplane,
Runway 3: 3 airplanes

Solution: Using the multinomial distribution, we have

f(2, 1, 3; 2/9, 1/6, 11/18, 6) = [6!/(2! 1! 3!)] (2/9)^2 (1/6)^1 (11/18)^3 = 0.1127.
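The multinomial probability in Example 5.7 can be evaluated directly from the boxed formula. The Python sketch below is ours; the helper name multinomial_pmf is introduced only for illustration.

from math import factorial

def multinomial_pmf(xs, ps):
    """f(x1,...,xk; p1,...,pk, n): multinomial probability for counts xs."""
    coef = factorial(sum(xs))
    for x in xs:
        coef //= factorial(x)          # n!/(x1! x2! ... xk!)
    prob = float(coef)
    for x, p in zip(xs, ps):
        prob *= p**x                   # p1^x1 p2^x2 ... pk^xk
    return prob

# Example 5.7: 6 jets distributed 2, 1, 3 over the three runways.
print(multinomial_pmf([2, 1, 3], [2/9, 1/6, 11/18]))   # about 0.1127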
Exercises

5.1 A random variable X that assumes the values x1, x2, ..., xk is called a discrete uniform random variable if its probability mass function is f(x) = 1/k for all of x1, x2, ..., xk and 0 otherwise. Find the mean and variance of X.

5.2 Twelve people are given two identical speakers, which they are asked to listen to for differences, if any. Suppose that these people answer simply by guessing. Find the probability that three people claim to have heard a difference between the two speakers.

5.3 An employee is selected from a staff of 10 to supervise a certain project by selecting a tag at random from a box containing 10 tags numbered from 1 to 10. Find the formula for the probability distribution of X representing the number on the tag that is drawn. What is the probability that the number drawn is less than 4?

5.4 In a certain city district, the need for money to buy drugs is stated as the reason for 75% of all thefts. Find the probability that among the next 5 theft cases reported in this district,
(a) exactly 2 resulted from the need for money to buy drugs;
(b) at most 3 resulted from the need for money to buy drugs.

5.5 According to Chemical Engineering Progress (November 1990), approximately 30% of all pipework failures in chemical plants are caused by operator error.
(a) What is the probability that out of the next 20 pipework failures at least 10 are due to operator error?
(b) What is the probability that no more than 4 out of 20 such failures are due to operator error?
(c) Suppose, for a particular plant, that out of the random sample of 20 such failures, exactly 5 are due to operator error. Do you feel that the 30% figure stated above applies to this plant? Comment.

5.6 According to a survey by the Administrative Management Society, one-half of U.S. companies give employees 4 weeks of vacation after they have been with the company for 15 years. Find the probability that among 6 companies surveyed at random, the number that give employees 4 weeks of vacation after 15 years of employment is
(a) anywhere from 2 to 5;
(b) fewer than 3.

5.7 One prominent physician claims that 70% of those with lung cancer are chain smokers. If his assertion is correct,
(a) find the probability that of 10 such patients
recently admitted to a hospital, fewer than half are chain smokers;
(b) find the probability that of 20 such patients re- cently admitted to a hospital, fewer than half are chain smokers.
5.8 According to a study published by a group of Uni- versity of Massachusetts sociologists, approximately 60% of the Valium users in the state of Massachusetts first took Valium for psychological problems. Find the probability that among the next 8 users from this state who are interviewed,
(a) exactly 3 began taking Valium for psychological problems;
(b) at least 5 began taking Valium for problems that were not psychological.
5.9 In testing a certain kind of truck tire over rugged terrain, it is found that 25% of the trucks fail to com- plete the test run without a blowout. Of the next 15 trucks tested, find the probability that
(a) from 3 to 6 have blowouts; (b) fewer than 4 have blowouts; (c) more than 5 have blowouts.
5.10 A nationwide survey of college seniors by the University of Michigan revealed that almost 70% dis- approve of daily pot smoking, according to a report in Parade. If 12 seniors are selected at random and asked their opinion, find the probability that the number who disapprove of smoking pot daily is
(a) anywhere from 7 to 9; (b) at most 5;
(c) not less than 8.
5.11 The probability that a patient recovers from a delicate heart operation is 0.9. What is the probabil- ity that exactly 5 of the next 7 patients having this operation survive?
5.12 A traffic control engineer reports that 75% of the vehicles passing through a checkpoint are from within the state. What is the probability that fewer than 4 of the next 9 vehicles are from out of state?
5.13 A national study that examined attitudes about antidepressants revealed that approximately 70% of re- spondents believe “antidepressants do not really cure anything, they just cover up the real trouble.” Accord- ing to this study, what is the probability that at least 3 of the next 5 people selected at random will hold this opinion?
5.14 The percentage of wins for the Chicago Bulls basketball team going into the playoffs for the 1996–97 season was 87.7. Round the 87.7 to 90 in order to use Table A.1.
(a) What is the probability that the Bulls sweep (4-0) the initial best-of-7 playoff series?
(b) What is the probability that the Bulls win the ini- tial best-of-7 playoff series?
(c) What very important assumption is made in an- swering parts (a) and (b)?
5.15 It is known that 60% of mice inoculated with a serum are protected from a certain disease. If 5 mice are inoculated, find the probability that
(a) none contracts the disease;
(b) fewer than 2 contract the disease; (c) more than 3 contract the disease.
5.16 Suppose that airplane engines operate indepen- dently and fail with probability equal to 0.4. Assuming that a plane makes a safe flight if at least one-half of its engines run, determine whether a 4-engine plane or a 2- engine plane has the higher probability for a successful flight.
5.17 If X represents the number of people in Exer- cise 5.13 who believe that antidepressants do not cure but only cover up the real problem, find the mean and variance of X when 5 people are selected at random.
5.18 (a) In Exercise 5.9, how many of the 15 trucks would you expect to have blowouts?
(b) What is the variance of the number of blowouts ex- perienced by the 15 trucks? What does that mean?
5.19 As a student drives to school, he encounters a traffic signal. This traffic signal stays green for 35 sec- onds, yellow for 5 seconds, and red for 60 seconds. As- sume that the student goes to school each weekday between 8:00 and 8:30 a.m. Let X1 be the number of times he encounters a green light, X2 be the number of times he encounters a yellow light, and X3 be the number of times he encounters a red light. Find the joint distribution of X1 , X2 , and X3 .
5.20 According to USA Today (March 18, 1997), of 4 million workers in the general workforce, 5.8% tested positive for drugs. Of those testing positive, 22.5% were cocaine users and 54.4% marijuana users.
(a) What is the probability that of 10 workers testing positive, 2 are cocaine users, 5 are marijuana users, and 3 are users of other drugs?
(b) What is the probability that of 10 workers testing positive, all are marijuana users?
(c) What is the probability that of 10 workers testing positive, none is a cocaine user?
5.21 The surface of a circular dart board has a small center circle called the bull’s-eye and 20 pie-shaped re- gions numbered from 1 to 20. Each of the pie-shaped regions is further divided into three parts such that a person throwing a dart that lands in a specific region scores the value of the number, double the number, or triple the number, depending on which of the three parts the dart hits. If a person hits the bull’s-eye with probability 0.01, hits a double with probability 0.10, hits a triple with probability 0.05, and misses the dart board with probability 0.02, what is the probability that 7 throws will result in no bull’s-eyes, no triples, a double twice, and a complete miss once?
5.22 According to a genetics theory, a certain cross of guinea pigs will result in red, black, and white offspring in the ratio 8:4:4. Find the probability that among 8 offspring, 5 will be red, 2 black, and 1 white.
5.23 The probabilities are 0.4, 0.2, 0.3, and 0.1, re- spectively, that a delegate to a certain convention ar- rived by air, bus, automobile, or train. What is the probability that among 9 delegates randomly selected at this convention, 3 arrived by air, 3 arrived by bus, 1 arrived by automobile, and 2 arrived by train?
5.24 A safety engineer claims that only 40% of all workers wear safety helmets when they eat lunch at the workplace. Assuming that this claim is right, find the probability that 4 of 6 workers randomly chosen will be wearing their helmets while having lunch at the workplace.
5.25 Suppose that for a very large shipment of integrated-circuit chips, the probability of failure for any one chip is 0.10. Assuming that the assumptions underlying the binomial distributions are met, find the probability that at most 3 chips fail in a random sample of 20.
5.26 Assuming that 6 in 10 automobile accidents are due mainly to a speed violation, find the probabil- ity that among 8 automobile accidents, 6 will be due mainly to a speed violation
(a) by using the formula for the binomial distribution; (b) by using Table A.1.
5.27 If the probability that a fluorescent light has a useful life of at least 800 hours is 0.9, find the proba- bilities that among 20 such lights
(a) exactly 18 will have a useful life of at least 800 hours;
(b) at least 15 will have a useful life of at least 800 hours;
(c) at least 2 will not have a useful life of at least 800 hours.
5.28 A manufacturer knows that on average 20% of the electric toasters produced require repairs within 1 year after they are sold. When 20 toasters are ran- domly selected, find appropriate numbers x and y such that
(a) the probability that at least x of them will require repairs is less than 0.5;
(b) the probability that at least y of them will not re- quire repairs is greater than 0.8.
5.3 Hypergeometric Distribution
The simplest way to view the distinction between the binomial distribution of Section 5.2 and the hypergeometric distribution is to note the way the sampling is done. The types of applications for the hypergeometric are very similar to those for the binomial distribution. We are interested in computing probabilities for the number of observations that fall into a particular category. But in the case of the binomial distribution, independence among trials is required. As a result, if that distribution is applied to, say, sampling from a lot of items (deck of cards, batch of production items), the sampling must be done with replacement of each item after it is observed. On the other hand, the hypergeometric distribution does not require independence and is based on sampling done without replacement.
Applications for the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic testing, and quality assurance. Ob- viously, in many of these fields, testing is done at the expense of the item being tested. That is, the item is destroyed and hence cannot be replaced in the sample. Thus, sampling without replacement is necessary. A simple example with playing
cards will serve as our first illustration.
If we wish to find the probability of observing 3 red cards in 5 draws from an
ordinary deck of 52 playing cards, the binomial distribution of Section 5.2 does not
apply unless each card is replaced and the deck reshuffled before the next draw is
made. To solve the problem of sampling without replacement, let us restate the
problem. If 5 cards are drawn at random, we are interested in the probability of
selecting 3 red cards from the 26 available in the deck and 2 black cards from the 26
available in the deck. There are C(26, 3) = 26!/(3! 23!) ways of selecting 3 red cards, and for each of these ways we can choose 2 black cards in C(26, 2) ways. Therefore, the total number of ways to select 3 red and 2 black cards in 5 draws is the product C(26, 3) C(26, 2). The total number of ways to select any 5 cards from the 52 that are available is C(52, 5). Hence, the probability of selecting 5 cards without replacement of which 3 are red and 2 are black is given by

C(26, 3) C(26, 2) / C(52, 5) = (26!/(3! 23!)) (26!/(2! 24!)) / (52!/(5! 47!)) = 0.3251.
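This counting argument is a one-line computation with math.comb; a quick sketch of ours, not part of the text.

from math import comb

# 3 red and 2 black cards in 5 draws without replacement from 52 cards.
print(comb(26, 3) * comb(26, 2) / comb(52, 5))   # about 0.3251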
In general, we are interested in the probability of selecting x successes from the k items labeled successes and n − x failures from the N − k items labeled failures when a random sample of size n is selected from N items. This is known as a hypergeometric experiment, that is, one that possesses the following two properties:
1. A random sample of size n is selected without replacement from N items.
2. Of the N items, k may be classified as successes and N − k are classified as
failures.
The number X of successes of a hypergeometric experiment is called a hyper- geometric random variable. Accordingly, the probability distribution of the hypergeometric variable is called the hypergeometric distribution, and its val- ues are denoted by h(x; N, n, k), since they depend on the number of successes k in the set N from which we select n items.
Hypergeometric Distribution in Acceptance Sampling
Like the binomial distribution, the hypergeometric distribution finds applications in acceptance sampling, where lots of materials or parts are sampled in order to determine whether or not the entire lot is accepted.
Example 5.8: A particular part that is used as an injection device is sold in lots of 10. The producer deems a lot acceptable if no more than one defective is in the lot. A sampling plan involves random sampling and testing 3 of the parts out of 10. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.
Solution : Let us assume that the lot is truly unacceptable (i.e., that 2 out of 10 parts are defective). The probability that the sampling plan finds the lot acceptable is
P(X = 0) = C(2, 0) C(8, 3) / C(10, 3) = 0.467.
Thus, if the lot is truly unacceptable, with 2 defective parts, this sampling plan will allow acceptance roughly 47% of the time. As a result, this plan should be considered faulty.
Let us now generalize in order to find a formula for h(x; N, n, k). The total number of samples of size n chosen from N items is C(N, n). These samples are assumed to be equally likely. There are C(k, x) ways of selecting x successes from the k that are available, and for each of these ways we can choose the n − x failures in C(N − k, n − x) ways. Thus, the total number of favorable samples among the C(N, n) possible samples is given by C(k, x) C(N − k, n − x). Hence, we have the following definition.

Hypergeometric Distribution: The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N − k labeled failure, is

h(x; N, n, k) = C(k, x) C(N − k, n − x) / C(N, n), max{0, n − (N − k)} ≤ x ≤ min{n, k}.

The range of x can be determined by the three binomial coefficients in the definition, where x and n − x are no more than k and N − k, respectively, and both of them cannot be less than 0. Usually, when both k (the number of successes) and N − k (the number of failures) are larger than the sample size n, the range of a hypergeometric random variable will be x = 0, 1, ..., n.
Example 5.9: Lots of 40 components each are deemed unacceptable if they contain 3 or more defectives. The procedure for sampling a lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot?
Solution: Using the hypergeometric distribution with n = 5, N = 40, k = 3, and x = 1, we find the probability of obtaining 1 defective to be

h(1; 40, 5, 3) = C(3, 1) C(37, 4) / C(40, 5) = 0.3011.

Once again, this plan is not desirable since it detects a bad lot (3 defectives) only about 30% of the time.

Theorem 5.2: The mean and variance of the hypergeometric distribution h(x; N, n, k) are

μ = nk/N and σ² = [(N − n)/(N − 1)] · n · (k/N)(1 − k/N).

The proof for the mean is shown in Appendix A.24.
Example 5.10: Let us now reinvestigate Example 3.4 on page 83. The purpose of this example was to illustrate the notion of a random variable and the corresponding sample space. In the example, we have a lot of 100 items of which 12 are defective. What is the probability that in a sample of 10, 3 are defective?
Solution: Using the hypergeometric probability function, we have

h(3; 100, 10, 12) = C(12, 3) C(88, 7) / C(100, 10) = 0.08.

Example 5.11: Find the mean and variance of the random variable of Example 5.9 and then use Chebyshev's theorem to interpret the interval μ ± 2σ.
Solution: Since Example 5.9 was a hypergeometric experiment with N = 40, n = 5, and k = 3, by Theorem 5.2, we have

μ = (5)(3)/40 = 3/8 = 0.375

and

σ² = [(40 − 5)/39] (5) (3/40)(1 − 3/40) = 0.3113.

Taking the square root of 0.3113, we find that σ = 0.558. Hence, the required interval is 0.375 ± (2)(0.558), or from −0.741 to 1.491. Chebyshev's theorem states that the number of defectives obtained when 5 components are selected at random from a lot of 40 components of which 3 are defective has a probability of at least 3/4 of falling between −0.741 and 1.491. That is, at least three-fourths of the time, the 5 components include fewer than 2 defectives.
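The hypergeometric pmf and the formulas of Theorem 5.2 are straightforward to code. This Python sketch is ours (the helper name hypergeom_pmf is introduced for illustration) and replays Examples 5.10 and 5.11.

from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in a sample of size n from N items, k of them successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Example 5.10: 3 defectives in a sample of 10 from a lot of 100 with 12 defective.
print(hypergeom_pmf(3, 100, 10, 12))                     # about 0.08

# Example 5.11 via Theorem 5.2, with N = 40, n = 5, k = 3.
N, n, k = 40, 5, 3
mu = n * k / N                                           # 0.375
var = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)      # about 0.3113
print(mu, var)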
Relationship to the Binomial Distribution
In this chapter, we discuss several important discrete distributions that have wide applicability. Many of these distributions relate nicely to each other. The beginning student should gain a clear understanding of these relationships. There is an interesting relationship between the hypergeometric and the binomial distribution. As one might expect, if n is small compared to N, the nature of the N items changes very little in each draw. So a binomial distribution can be used to approximate the hypergeometric distribution when n is small compared to N. In fact, as a rule of thumb, the approximation is good when n/N ≤ 0.05.
Thus, the quantity k/N plays the role of the binomial parameter p. As a result, the binomial distribution may be viewed as a large-population version of the hypergeometric distribution. The mean and variance then come from the formulas
μ = np = nk/N and σ² = npq = n · (k/N)(1 − k/N).
Comparing these formulas with those of Theorem 5.2, we see that the mean is the same but the variance differs by a correction factor of (N − n)/(N − 1), which is negligible when n is small relative to N.
Example 5.12: A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local distributor, 1000 are slightly blemished. If one purchases 10 of these tires at random from the distributor, what is the probability that exactly 3 are blemished?
Solution :
Since N = 5000 is large relative to the sample size n = 10, we shall approximate the desired probability by using the binomial distribution. The probability of obtaining a blemished tire is 0.2. Therefore, the probability of obtaining exactly 3 blemished tires is
h(3; 5000, 10, 1000) ≈ b(3; 10, 0.2) = 0.8791 − 0.6778 = 0.2013.
On the other hand, the exact probability is h(3; 5000, 10, 1000) = 0.2015.
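The quality of the binomial approximation in Example 5.12 can be seen by computing both values side by side; an illustrative sketch of ours.

from math import comb

N, n, k = 5000, 10, 1000
p = k / N                                               # k/N = 0.2 plays the role of p
exact = comb(k, 3) * comb(N - k, n - 3) / comb(N, n)    # hypergeometric, about 0.2015
approx = comb(n, 3) * p**3 * (1 - p)**(n - 3)           # b(3; 10, 0.2), about 0.2013
print(round(exact, 4), round(approx, 4))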
The hypergeometric distribution can be extended to treat the case where the N items can be partitioned into k cells A1, A2, . . . , Ak with a1 elements in the first cell, a2 elements in the second cell, …, ak elements in the kth cell. We are now interested in the probability that a random sample of size n yields x1 elements from A1, x2 elements from A2, …, and xk elements from Ak. Let us represent
this probability by
f(x1,x2,…,xk;a1,a2,…,ak,N,n).
To obtain a general formula, we note that the total number of samples of size n that can be chosen from N items is still C(N, n). There are C(a1, x1) ways of selecting x1 items from the items in A1, and for each of these we can choose x2 items from the items in A2 in C(a2, x2) ways. Therefore, we can select x1 items from A1 and x2 items from A2 in C(a1, x1) C(a2, x2) ways. Continuing in this way, we can select all n items consisting of x1 from A1, x2 from A2, ..., and xk from Ak in

C(a1, x1) C(a2, x2) ··· C(ak, xk) ways.

The required probability distribution is now defined as follows.

Multivariate Hypergeometric Distribution: If N items can be partitioned into the k cells A1, A2, ..., Ak with a1, a2, ..., ak elements, respectively, then the probability distribution of the random variables X1, X2, ..., Xk, representing the number of elements selected from A1, A2, ..., Ak in a random sample of size n, is

f(x1, x2, ..., xk; a1, a2, ..., ak, N, n) = C(a1, x1) C(a2, x2) ··· C(ak, xk) / C(N, n),

with Σ_{i=1}^{k} xi = n and Σ_{i=1}^{k} ai = N.
Example 5.13: A group of 10 individuals is used for a biological case study. The group contains 3 people with blood type O, 4 with blood type A, and 3 with blood type B. What is the probability that a random sample of 5 will contain 1 person with blood type O, 2 people with blood type A, and 2 people with blood type B?
Solution: Using the extension of the hypergeometric distribution with x1 = 1, x2 = 2, x3 = 2, a1 = 3, a2 = 4, a3 = 3, N = 10, and n = 5, we find that the desired probability is

f(1, 2, 2; 3, 4, 3, 10, 5) = C(3, 1) C(4, 2) C(3, 2) / C(10, 5) = 3/14.
Exercises
5.29 A homeowner plants 6 bulbs selected at ran- dom from a box containing 5 tulip bulbs and 4 daf- fodil bulbs. What is the probability that he planted 2 daffodil bulbs and 4 tulip bulbs?
5.30 To avoid detection at customs, a traveler places 6 narcotic tablets in a bottle containing 9 vitamin tablets that are similar in appearance. If the customs official selects 3 of the tablets at random for analysis, what is the probability that the traveler will be arrested for illegal possession of narcotics?
5.31 A random committee of size 3 is selected from 4 doctors and 2 nurses. Write a formula for the prob- ability distribution of the random variable X repre- senting the number of doctors on the committee. Find P(2 ≤ X ≤ 3).
5.32 From a lot of 10 missiles, 4 are selected at ran- dom and fired. If the lot contains 3 defective missiles that will not fire, what is the probability that
(a) all 4 will fire?
(b) at most 2 will not fire?
5.33 If 7 cards are dealt from an ordinary deck of 52 playing cards, what is the probability that
(a) exactly 2 of them will be face cards? (b) at least 1 of them will be a queen?
5.34 What is the probability that a waitress will refuse to serve alcoholic beverages to only 2 minors if she randomly checks the IDs of 5 among 9 students, 4 of whom are minors?
5.35 A company is interested in evaluating its cur- rent inspection procedure for shipments of 50 identical items. The procedure is to take a sample of 5 and pass the shipment if no more than 2 are found to be defective. What proportion of shipments with 20% de- fectives will be accepted?
5.36 A manufacturing company uses an acceptance scheme on items from a production line before they are shipped. The plan is a two-stage one. Boxes of 25 items are readied for shipment, and a sample of 3 items is tested for defectives. If any defectives are found, the entire box is sent back for 100% screening. If no defec- tives are found, the box is shipped.
(a) What is the probability that a box containing 3 defectives will be shipped?
(b) What is the probability that a box containing only 1 defective will be sent back for screening?
5.37 Suppose that the manufacturing company of Ex- ercise 5.36 decides to change its acceptance scheme. Under the new scheme, an inspector takes 1 item at random, inspects it, and then replaces it in the box; a second inspector does likewise. Finally, a third in- spector goes through the same procedure. The box is not shipped if any of the three inspectors find a de- fective. Answer the questions in Exercise 5.36 for this new plan.
5.38 Among 150 IRS employees in a large city, only 30 are women. If 10 of the employees are chosen at random to provide free tax assistance for the residents of this city, use the binomial approximation to the hy- pergeometric distribution to find the probability that at least 3 women are selected.
5.39 An annexation suit against a county subdivision of 1200 residences is being considered by a neighboring city. If the occupants of half the residences object to being annexed, what is the probability that in a ran- dom sample of 10 at least 3 favor the annexation suit?
5.40 It is estimated that 4000 of the 10,000 voting residents of a town are against a new sales tax. If 15 eligible voters are selected at random and asked their opinion, what is the probability that at most 7 favor the new tax?
5.41 A nationwide survey of 17,000 college seniors by the University of Michigan revealed that almost 70% disapprove of daily pot smoking. If 18 of these seniors are selected at random and asked their opinion, what is the probability that more than 9 but fewer than 14 disapprove of smoking pot daily?
5.42 Find the probability of being dealt a bridge hand of 13 cards containing 5 spades, 2 hearts, 3 diamonds, and 3 clubs.
5.43 A foreign student club lists as its members 2 Canadians, 3 Japanese, 5 Italians, and 2 Germans. If a committee of 4 is selected at random, find the prob- ability that
(a) all nationalities are represented;
(b) all nationalities except Italian are represented.
5.44 An urn contains 3 green balls, 2 blue balls, and 4 red balls. In a random sample of 5 balls, find the probability that both blue balls and at least 1 red ball are selected.
5.45 Biologists doing studies in a particular environ- ment often tag and release subjects in order to estimate
the size of a population or the prevalence of certain features in the population. Ten animals of a certain population thought to be extinct (or near extinction) are caught, tagged, and released in a certain region. After a period of time, a random sample of 15 of this type of animal is selected in the region. What is the probability that 5 of those selected are tagged if there are 25 animals of this type in the region?
5.46 A large company has an inspection system for the batches of small compressors purchased from ven- dors. A batch typically contains 15 compressors. In the inspection system, a random sample of 5 is selected and all are tested. Suppose there are 2 faulty compressors in the batch of 15.
(a) What is the probability that for a given sample there will be 1 faulty compressor?
(b) What is the probability that inspection will dis- cover both faulty compressors?
5.47 A government task force suspects that some manufacturing companies are in violation of federal pollution regulations with regard to dumping a certain type of product. Twenty firms are under suspicion but not all can be inspected. Suppose that 3 of the firms are in violation.
(a) What is the probability that inspection of 5 firms will find no violations?
(b) What is the probability that the plan above will find two violations?
5.48 Every hour, 10,000 cans of soda are filled by a machine, among which 300 underfilled cans are pro- duced. Each hour, a sample of 30 cans is randomly selected and the number of ounces of soda per can is checked. Denote by X the number of cans selected that are underfilled. Find the probability that at least 1 underfilled can will be among those sampled.
5.4 Negative Binomial and Geometric Distributions
Let us consider an experiment where the properties are the same as those listed for a binomial experiment, with the exception that the trials will be repeated until a fixed number of successes occur. Therefore, instead of the probability of x successes in n trials, where n is fixed, we are now interested in the probability that the kth success occurs on the xth trial. Experiments of this kind are called negative binomial experiments.
As an illustration, consider the use of a drug that is known to be effective in 60% of the cases where it is used. The drug will be considered a success if it is effective in bringing some degree of relief to the patient. We are interested in finding the probability that the fifth patient to experience relief is the seventh patient to receive the drug during a given week. Designating a success by S and a failure by F, a possible order of achieving the desired result is SFSSSFS, which occurs with probability
(0.6)(0.4)(0.6)(0.6)(0.6)(0.4)(0.6) = (0.6)5(0.4)2.
We could list all possible orders by rearranging the F's and S's except for the last outcome, which must be the fifth success. The total number of possible orders is equal to the number of partitions of the first six trials into two groups with 2 failures assigned to the one group and 4 successes assigned to the other group. This can be done in C(6, 4) = 15 mutually exclusive ways. Hence, if X represents the outcome on which the fifth success occurs, then

P(X = 7) = C(6, 4) (0.6)^5 (0.4)^2 = 0.1866.

What Is the Negative Binomial Random Variable?
The number X of trials required to produce k successes in a negative binomial experiment is called a negative binomial random variable, and its probability
distribution is called the negative binomial distribution. Since its probabilities depend on the number of successes desired and the probability of a success on a given trial, we shall denote them by b*(x; k, p). To obtain the general formula for b*(x; k, p), consider the probability of a success on the xth trial preceded by k − 1 successes and x − k failures in some specified order. Since the trials are independent, we can multiply all the probabilities corresponding to each desired outcome. Each success occurs with probability p and each failure with probability q = 1 − p. Therefore, the probability for the specified order ending in success is

p^(k−1) q^(x−k) p = p^k q^(x−k).

The total number of sample points in the experiment ending in a success, after the occurrence of k − 1 successes and x − k failures in any order, is equal to the number of partitions of x − 1 trials into two groups with k − 1 successes corresponding to one group and x − k failures corresponding to the other group. This number is specified by the term C(x − 1, k − 1), each partition mutually exclusive and occurring with equal probability p^k q^(x−k). We obtain the general formula by multiplying p^k q^(x−k) by C(x − 1, k − 1).

Negative Binomial Distribution: If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is

b*(x; k, p) = C(x − 1, k − 1) p^k q^(x−k), x = k, k + 1, k + 2, ....

Example 5.14: In an NBA (National Basketball Association) championship series, the team that wins four games out of seven is the winner. Suppose that teams A and B face each other in the championship games and that team A has probability 0.55 of winning a game over team B.
(a) What is the probability that team A will win the series in 6 games?
(b) What is the probability that team A will win the series?
(c) If teams A and B were facing each other in a regional playoff series, which is decided by winning three out of five games, what is the probability that team A would win the series?
Solution:
(a) b*(6; 4, 0.55) = C(5, 3) (0.55)^4 (1 − 0.55)^(6−4) = 0.1853.
(b) P(team A wins the championship series) is
b*(4; 4, 0.55) + b*(5; 4, 0.55) + b*(6; 4, 0.55) + b*(7; 4, 0.55)
= 0.0915 + 0.1647 + 0.1853 + 0.1668 = 0.6083.
(c) P(team A wins the playoff) is
b*(3; 3, 0.55) + b*(4; 3, 0.55) + b*(5; 3, 0.55)
= 0.1664 + 0.2246 + 0.2021 = 0.5931.
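The probabilities in Example 5.14 follow directly from the boxed formula; here is a brief Python sketch of ours (the helper name neg_binom_pmf is introduced for illustration).

from math import comb

def neg_binom_pmf(x, k, p):
    """b*(x; k, p): probability that the kth success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

# Example 5.14(a): team A wins the series in exactly 6 games.
print(neg_binom_pmf(6, 4, 0.55))                             # about 0.1853
# Example 5.14(b): team A wins a best-of-seven series (x = 4, 5, 6, 7).
print(sum(neg_binom_pmf(x, 4, 0.55) for x in range(4, 8)))   # about 0.6083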
The negative binomial distribution derives its name from the fact that each term in the expansion of p^k (1 − q)^(−k) corresponds to the values of b*(x; k, p) for x = k, k + 1, k + 2, .... If we consider the special case of the negative binomial distribution where k = 1, we have a probability distribution for the number of trials required for a single success. An example would be the tossing of a coin until a head occurs. We might be interested in the probability that the first head occurs on the fourth toss. The negative binomial distribution reduces to the form

b*(x; 1, p) = p q^(x−1), x = 1, 2, 3, ....

Since the successive terms constitute a geometric progression, it is customary to refer to this special case as the geometric distribution and denote its values by g(x; p).

Geometric Distribution: If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is

g(x; p) = p q^(x−1), x = 1, 2, 3, ....

Example 5.15: For a certain manufacturing process, it is known that, on the average, 1 in every 100 items is defective. What is the probability that the fifth item inspected is the first defective item found?
Solution: Using the geometric distribution with x = 5 and p = 0.01, we have

g(5; 0.01) = (0.01)(0.99)^4 = 0.0096.

Example 5.16: At a "busy time," a telephone exchange is very near capacity, so callers have difficulty placing their calls. It may be of interest to know the number of attempts necessary in order to make a connection. Suppose that we let p = 0.05 be the probability of a connection during a busy time. We are interested in knowing the probability that 5 attempts are necessary for a successful call.
Solution: Using the geometric distribution with x = 5 and p = 0.05 yields

P(X = 5) = g(5; 0.05) = (0.05)(0.95)^4 = 0.041.

Quite often, in applications dealing with the geometric distribution, the mean and variance are important. For example, in Example 5.16, the expected number of calls necessary to make a connection is quite important. The following theorem states without proof the mean and variance of the geometric distribution.

Theorem 5.3: The mean and variance of a random variable following the geometric distribution are

μ = 1/p and σ² = (1 − p)/p².
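The geometric case is simple enough to code in a few lines; a short illustrative sketch of ours, covering Example 5.15 and Theorem 5.3.

def geom_pmf(x, p):
    """g(x; p) = p q^(x-1): first success on trial x."""
    return p * (1 - p) ** (x - 1)

# Example 5.15: fifth item inspected is the first defective, p = 0.01.
print(geom_pmf(5, 0.01))            # about 0.0096

# Theorem 5.3 for Example 5.16 (p = 0.05): mean 1/p and variance (1 - p)/p^2.
p = 0.05
print(1 / p, (1 - p) / p**2)        # 20.0 380.0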
Applications of Negative Binomial and Geometric Distributions
Areas of application for the negative binomial and geometric distributions become obvious when one focuses on the examples in this section and the exercises devoted to these distributions at the end of Section 5.5. In the case of the geometric distribution, Example 5.16 depicts a situation where engineers or managers are attempting to determine how inefficient a telephone exchange system is during busy times. Clearly, in this case, trials occurring prior to a success represent a cost. If there is a high probability of several attempts being required prior to making a connection, then plans should be made to redesign the system.
Applications of the negative binomial distribution are similar in nature. Sup- pose attempts are costly in some sense and are occurring in sequence. A high probability of needing a “large” number of attempts to experience a fixed number of successes is not beneficial to the scientist or engineer. Consider the scenarios of Review Exercises 5.90 and 5.91. In Review Exercise 5.91, the oil driller defines a certain level of success from sequentially drilling locations for oil. If only 6 at- tempts have been made at the point where the second success is experienced, the profits appear to dominate substantially the investment incurred by the drilling.
5.5 Poisson Distribution and the Poisson Process
Experiments yielding numerical values of a random variable X, the number of outcomes occurring during a given time interval or in a specified region, are called Poisson experiments. The given time interval may be of any length, such as a minute, a day, a week, a month, or even a year. For example, a Poisson experiment can generate observations for the random variable X representing the number of telephone calls received per hour by an office, the number of days school is closed due to snow during the winter, or the number of games postponed due to rain during a baseball season. The specified region could be a line segment, an area, a volume, or perhaps a piece of material. In such instances, X might represent the number of field mice per acre, the number of bacteria in a given culture, or the number of typing errors per page. A Poisson experiment is derived from the Poisson process and possesses the following properties.
Properties of the Poisson Process
1. The number of outcomes occurring in one time interval or specified region of space is independent of the number that occur in any other disjoint time in- terval or region. In this sense we say that the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region and does not depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.
The number X of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution is called the Poisson
distribution. The mean number of outcomes is computed from μ = λt, where t is the specific "time," "distance," "area," or "volume" of interest. Since the probabilities depend on λ, the rate of occurrence of outcomes, we shall denote them by p(x; λt). The derivation of the formula for p(x; λt), based on the three properties of a Poisson process listed above, is beyond the scope of this book. The following formula is used for computing Poisson probabilities.

Poisson Distribution: The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is

p(x; λt) = e^(−λt) (λt)^x / x!, x = 0, 1, 2, ...,

where λ is the average number of outcomes per unit time, distance, area, or volume and e = 2.71828....
Table A.2 contains Poisson probability sums,

P(r; λt) = Σ_{x=0}^{r} p(x; λt),

for selected values of λt ranging from 0.1 to 18.0. We illustrate the use of this table with the following two examples.
Example 5.17: During a laboratory experiment, the average number of radioactive particles pass- ing through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in a given millisecond?
Solution: Using the Poisson distribution with x = 6 and λt = 4 and referring to Table A.2, we have

p(6; 4) = e^(−4) 4^6 / 6! = Σ_{x=0}^{6} p(x; 4) − Σ_{x=0}^{5} p(x; 4) = 0.8893 − 0.7851 = 0.1042.
Example 5.18: Ten is the average number of oil tankers arriving each day at a certain port. The facilities at the port can handle at most 15 tankers per day. What is the probability that on a given day tankers have to be turned away?
Solution: Let X be the number of tankers arriving each day. Then, using Table A.2, we have

P(X > 15) = 1 − P(X ≤ 15) = 1 − Σ_{x=0}^{15} p(x; 10) = 1 − 0.9513 = 0.0487.

Like the binomial distribution, the Poisson distribution is used for quality control, quality assurance, and acceptance sampling. In addition, certain important continuous distributions used in reliability theory and queuing theory depend on the Poisson process. Some of these distributions are discussed and developed in Chapter 6. The following theorem concerning the Poisson random variable is given in Appendix A.25.

Theorem 5.4: Both the mean and the variance of the Poisson distribution p(x; λt) are λt.
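Poisson probabilities such as those read from Table A.2 in Examples 5.17 and 5.18 can be computed directly; the helper poisson_pmf below is our own sketch, not part of the text.

from math import exp, factorial

def poisson_pmf(x, mu):
    """p(x; lambda*t), with mu = lambda*t."""
    return exp(-mu) * mu**x / factorial(x)

# Example 5.17: 6 particles when the mean count is lambda*t = 4.
print(poisson_pmf(6, 4))                                  # about 0.1042
# Example 5.18: P(X > 15) with a mean of 10 tankers per day.
print(1 - sum(poisson_pmf(x, 10) for x in range(16)))     # about 0.0487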
Nature of the Poisson Probability Function
Like so many discrete and continuous distributions, the form of the Poisson distribution becomes more and more symmetric, even bell-shaped, as the mean grows large. Figure 5.1 illustrates this, showing plots of the probability function for μ = 0.1, μ = 2, and μ = 5. Note the nearness to symmetry when μ becomes as large as 5. A similar condition exists for the binomial distribution, as will be illustrated later in the text.

Figure 5.1: Poisson density functions for different means (μ = 0.1, μ = 2, μ = 5).
Approximation of Binomial Distribution by a Poisson Distribution
It should be evident from the three principles of the Poisson process that the Poisson distribution is related to the binomial distribution. Although the Poisson usually finds applications in space and time problems, as illustrated by Examples 5.17 and 5.18, it can be viewed as a limiting form of the binomial distribution. In the case of the binomial, if n is quite large and p is small, the conditions begin to simulate the continuous space or time implications of the Poisson process. The in- dependence among Bernoulli trials in the binomial case is consistent with principle 2 of the Poisson process. Allowing the parameter p to be close to 0 relates to prin- ciple 3 of the Poisson process. Indeed, if n is large and p is close to 0, the Poisson distribution can be used, with μ = np, to approximate binomial probabilities. If p is close to 1, we can still use the Poisson distribution to approximate binomial probabilities by interchanging what we have defined to be a success and a failure, thereby changing p to a value close to 0.
Theorem 5.5: Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0, and np → μ remains constant,

b(x; n, p) → p(x; μ).
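Theorem 5.5 can be watched numerically: hold μ = np fixed and let n grow. A small sketch of ours along those lines, using the same p(x; μ) that appears in Example 5.19 below.

from math import comb, exp, factorial

mu, x = 2.0, 1           # np = 2, as in Example 5.19, with P(X = 1) wanted
for n in (10, 100, 1000, 10000):
    p = mu / n
    print(n, round(comb(n, x) * p**x * (1 - p)**(n - x), 5))   # b(x; n, p)
print("limit:", round(exp(-mu) * mu**x / factorial(x), 5))     # p(1; 2) = 0.27067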
Example 5.19: In a certain industrial facility, accidents occur infrequently. It is known that the probability of an accident on any given day is 0.005 and accidents are independent of each other.
(a) What is the probability that in any given period of 400 days there will be an accident on one day?
(b) What is the probability that there are at most three days with an accident?
Solution : Let X be a binomial random variable with n = 400 and p = 0.005. Thus, np = 2. Using the Poisson approximation,
(a) P(X = 1) = e^(−2) 2^1 = 0.271 and

(b) P(X ≤ 3) = Σ_{x=0}^{3} e^(−2) 2^x / x! = 0.857.

Example 5.20: In a manufacturing process where glass products are made, defects or bubbles occur, occasionally rendering the piece undesirable for marketing. It is known that, on average, 1 in every 1000 of these items produced has one or more bubbles. What is the probability that a random sample of 8000 will yield fewer than 7 items possessing bubbles?
Solution: This is essentially a binomial experiment with n = 8000 and p = 0.001. Since p is very close to 0 and n is quite large, we shall approximate with the Poisson distribution using

μ = (8000)(0.001) = 8.

Hence, if X represents the number of bubbles, we have

P(X < 7) = Σ_{x=0}^{6} b(x; 8000, 0.001) ≈ Σ_{x=0}^{6} p(x; 8) = 0.3134.

Exercises
5.49 The probability that a person living in a certain city owns a dog is estimated to be 0.3. Find the prob- ability that the tenth person randomly interviewed in that city is the fifth one to own a dog.
5.50 Find the probability that a person flipping a coin gets
(a) the third head on the seventh flip; (b) the first head on the fourth flip.
5.51 Three people toss a fair coin and the odd one pays for coffee. If the coins all turn up the same, they are tossed again. Find the probability that fewer than 4 tosses are needed.
5.52 A scientist inoculates mice, one at a time, with a disease germ until he finds 2 that have contracted the
disease. If the probability of contracting the disease is 1/6, what is the probability that 8 mice are required?
5.53 An inventory study determines that, on aver- age, demands for a particular item at a warehouse are made 5 times per day. What is the probability that on a given day this item is requested
(a) more than 5 times? (b) not at all?
5.54 According to a study published by a group of University of Massachusetts sociologists, about two- thirds of the 20 million persons in this country who take Valium are women. Assuming this figure to be a valid estimate, find the probability that on a given day the fifth prescription written by a doctor for Valium is
(a) the first prescribing Valium for a woman;
(b) the third prescribing Valium for a woman.

5.55 The probability that a student pilot passes the written test for a private pilot's license is 0.7. Find the probability that a given student will pass the test
(a) on the third try;
(b) before the fourth try.

5.56 On average, 3 traffic accidents per month occur at a certain intersection. What is the probability that in any given month at this intersection
(a) exactly 5 accidents will occur?
(b) fewer than 3 accidents will occur?
(c) at least 2 accidents will occur?

5.57 On average, a textbook author makes two word-processing errors per page on the first draft of her textbook. What is the probability that on the next page she will make
(a) 4 or more errors?
(b) no errors?

5.58 A certain area of the eastern United States is, on average, hit by 6 hurricanes a year. Find the probability that in a given year that area will be hit by
(a) fewer than 4 hurricanes;
(b) anywhere from 6 to 8 hurricanes.

5.59 Suppose the probability that any given person will believe a tale about the transgressions of a famous actress is 0.8. What is the probability that
(a) the sixth person to hear this tale is the fourth one to believe it?
(b) the third person to hear this tale is the first one to believe it?

5.60 The average number of field mice per acre in a 5-acre wheat field is estimated to be 12. Find the probability that fewer than 7 field mice are found
(a) on a given acre;
(b) on 2 of the next 3 acres inspected.

5.61 Suppose that, on average, 1 person in 1000 makes a numerical error in preparing his or her income tax return. If 10,000 returns are selected at random and examined, find the probability that 6, 7, or 8 of them contain an error.

5.62 The probability that a student at a local high school fails the screening test for scoliosis (curvature of the spine) is known to be 0.004. Of the next 1875 students at the school who are screened for scoliosis, find the probability that
(a) fewer than 5 fail the test;
(b) 8, 9, or 10 fail the test.

5.63 Find the mean and variance of the random variable X in Exercise 5.58, representing the number of hurricanes per year to hit a certain area of the eastern United States.

5.64 Find the mean and variance of the random variable X in Exercise 5.61, representing the number of persons among 10,000 who make an error in preparing their income tax returns.

5.65 An automobile manufacturer is concerned about a fault in the braking mechanism of a particular model. The fault can, on rare occasions, cause a catastrophe at high speed. The distribution of the number of cars per year that will experience the catastrophe is a Poisson random variable with λ = 5.
(a) What is the probability that at most 3 cars per year will experience a catastrophe?
(b) What is the probability that more than 1 car per year will experience a catastrophe?

5.66 Changes in airport procedures require considerable planning. Arrival rates of aircraft are important factors that must be taken into account. Suppose small aircraft arrive at a certain airport, according to a Poisson process, at the rate of 6 per hour. Thus, the Poisson parameter for arrivals over a period of t hours is μ = 6t.
(a) What is the probability that exactly 4 small aircraft arrive during a 1-hour period?
(b) What is the probability that at least 4 arrive during a 1-hour period?
(c) If we define a working day as 12 hours, what is the probability that at least 75 small aircraft arrive during a working day?
5.67 The number of customers arriving per hour at a certain automobile service facility is assumed to follow a Poisson distribution with mean λ = 7.
(a) Compute the probability that more than 10 customers will arrive in a 2-hour period.
(b) What is the mean number of arrivals during a 2-hour period?

5.68 Consider Exercise 5.62. What is the mean number of students who fail the test?

5.69 The probability that a person will die when he or she contracts a virus infection is 0.001. Of the next 4000 people infected, what is the mean number who will die?

5.70 A company purchases large lots of a certain kind of electronic device. A method is used that rejects a lot if 2 or more defective units are found in a random sample of 100 units.
(a) What is the mean number of defective units found in a sample of 100 units if the lot is 1% defective?
(b) What is the variance?

5.71 For a certain type of copper wire, it is known that, on the average, 1.5 flaws occur per millimeter. Assuming that the number of flaws is a Poisson random variable, what is the probability that no flaws occur in a certain portion of wire of length 5 millimeters? What is the mean number of flaws in a portion of length 5 millimeters?

5.72 Potholes on a highway can be a serious problem, and are in constant need of repair. With a particular type of terrain and make of concrete, past experience suggests that there are, on the average, 2 potholes per mile after a certain amount of usage. It is assumed that the Poisson process applies to the random variable "number of potholes."
(a) What is the probability that no more than one pothole will appear in a section of 1 mile?
(b) What is the probability that no more than 4 potholes will occur in a given section of 5 miles?

5.73 Hospital administrators in large cities anguish about traffic in emergency rooms. At a particular hospital in a large city, the staff on hand cannot accommodate the patient traffic if there are more than 10 emergency cases in a given hour. It is assumed that patient arrival follows a Poisson process, and historical data suggest that, on the average, 5 emergencies arrive per hour.
(a) What is the probability that in a given hour the staff cannot accommodate the patient traffic?
(b) What is the probability that more than 20 emergencies arrive during a 3-hour shift?

5.74 It is known that 3% of people whose luggage is screened at an airport have questionable objects in their luggage. What is the probability that a string of 15 people pass through screening successfully before an individual is caught with a questionable object? What is the expected number of people to pass through before an individual is stopped?

5.75 Computer technology has produced an environment in which robots operate with the use of microprocessors. The probability that a robot fails during any 6-hour shift is 0.10. What is the probability that a robot will operate through at most 5 shifts before it fails?

5.76 The refusal rate for telephone polls is known to be approximately 20%. A newspaper report indicates that 50 people were interviewed before the first refusal.
(a) Comment on the validity of the report. Use a probability in your argument.
(b) What is the expected number of people interviewed before a refusal?

Review Exercises

5.77 During a manufacturing process, 15 units are randomly selected each day from the production line to check the percent defective. From historical information it is known that the probability of a defective unit is 0.05. Any time 2 or more defectives are found in the sample of 15, the process is stopped. This procedure is used to provide a signal in case the probability of a defective has increased.
(a) What is the probability that on any given day the production process will be stopped? (Assume 5% defective.)
(b) Suppose that the probability of a defective has increased to 0.07. What is the probability that on any given day the production process will not be stopped?

5.78 An automatic welding machine is being considered for use in a production process. It will be considered for purchase if it is successful on 99% of its welds. Otherwise, it will not be considered efficient. A test is to be conducted with a prototype that is to perform 100 welds. The machine will be accepted for manufacture if it misses no more than 3 welds.
(a) What is the probability that a good machine will be rejected?
(b) What is the probability that an inefficient machine with 95% welding success will be accepted?

5.79 A car rental agency at a local airport has available 5 Fords, 7 Chevrolets, 4 Dodges, 3 Hondas, and 4 Toyotas. If the agency randomly selects 9 of these cars to chauffeur delegates from the airport to the downtown convention center, find the probability that 2 Fords, 3 Chevrolets, 1 Dodge, 1 Honda, and 2 Toyotas are used.

5.80 Service calls come to a maintenance center according to a Poisson process, and on average, 2.7 calls are received per minute. Find the probability that
(a) no more than 4 calls come in any minute;
(b) fewer than 2 calls come in any minute;
(c) more than 10 calls come in a 5-minute period.

5.81 An electronics firm claims that the proportion of defective units from a certain process is 5%. A buyer has a standard procedure of inspecting 15 units selected randomly from a large lot. On a particular occasion, the buyer found 5 items defective.
(a) What is the probability of this occurrence, given that the claim of 5% defective is correct?
(b) What would be your reaction if you were the buyer?

5.82 An electronic switching device occasionally malfunctions, but the device is considered satisfactory if it makes, on average, no more than 0.20 error per hour. A particular 5-hour period is chosen for testing the device. If no more than 1 error occurs during the time period, the device will be considered satisfactory.
(a) What is the probability that a satisfactory device will be considered unsatisfactory on the basis of the test? Assume a Poisson process.
(b) What is the probability that a device will be accepted as satisfactory when, in fact, the mean number of errors is 0.25? Again, assume a Poisson process.

5.83 A company generally purchases large lots of a certain kind of electronic device. A method is used that rejects a lot if 2 or more defective units are found in a random sample of 100 units.
(a) What is the probability of rejecting a lot that is 1% defective?
(b) What is the probability of accepting a lot that is 5% defective?

5.84 A local drugstore owner knows that, on average, 100 people enter his store each hour.
(a) Find the probability that in a given 3-minute period nobody enters the store.
(b) Find the probability that in a given 3-minute period more than 5 people enter the store.

5.85 (a) Suppose that you throw 4 dice. Find the probability that you get at least one 1.
(b) Suppose that you throw 2 dice 24 times.
Find the probability that you get at least one (1, 1), that is, "snake-eyes."

5.86 Suppose that out of 500 lottery tickets sold, 200 pay off at least the cost of the ticket. Now suppose that you buy 5 tickets. Find the probability that you will win back at least the cost of 3 tickets.

5.87 Imperfections in computer circuit boards and computer chips lend themselves to statistical treatment. For a particular type of board, the probability of a diode failure is 0.03 and the board contains 200 diodes.
(a) What is the mean number of failures among the diodes?
(b) What is the variance?
(c) The board will work if there are no defective diodes. What is the probability that a board will work?

5.88 The potential buyer of a particular engine requires (among other things) that the engine start successfully 10 consecutive times. Suppose the probability of a successful start is 0.990. Let us assume that the outcomes of attempted starts are independent.
(a) What is the probability that the engine is accepted after only 10 starts?
(b) What is the probability that 12 attempted starts are made during the acceptance process?

5.89 The acceptance scheme for purchasing lots containing a large number of batteries is to test no more than 75 randomly selected batteries and to reject a lot if a single battery fails. Suppose the probability of a failure is 0.001.
(a) What is the probability that a lot is accepted?
(b) What is the probability that a lot is rejected on the 20th test?
(c) What is the probability that it is rejected in 10 or fewer trials?

5.90 An oil drilling company ventures into various locations, and its success or failure is independent from one location to another. Suppose the probability of a success at any specific location is 0.25.
(a) What is the probability that the driller drills at 10 locations and has 1 success?
(b) The driller will go bankrupt if it drills 10 times before the first success occurs. What are the driller's prospects for bankruptcy?

5.91 Consider the information in Review Exercise 5.90. The drilling company feels that it will "hit it big" if the second success occurs on or before the sixth attempt. What is the probability that the driller will hit it big?

5.92 A couple decides to continue to have children until they have two males. Assuming that P(male) = 0.5, what is the probability that their second male is their fourth child?

5.93 It is known by researchers that 1 in 100 people carries a gene that leads to the inheritance of a certain chronic disease. In a random sample of 1000 individuals, what is the probability that fewer than 7 individuals carry the gene? Use a Poisson approximation. Again, using the approximation, what is the approximate mean number of people out of 1000 carrying the gene?

5.94 A production process produces electronic component parts. It is presumed that the probability of a defective part is 0.01. During a test of this presumption, 500 parts are sampled randomly and 15 defectives are observed.
(a) What is your response to the presumption that the process is 1% defective? Be sure that a computed probability accompanies your comment.
(b) Under the presumption of a 1% defective process, what is the probability that only 3 parts will be found defective?
(c) Do parts (a) and (b) again using the Poisson approximation.

5.95 A production process outputs items in lots of 50.
Sampling plans exist in which lots are pulled aside periodically and exposed to a certain type of inspection. It is usually assumed that the proportion defective is very small. It is important to the company that lots containing defectives be a rare event. The current inspection plan is to periodically sample randomly 10 out of the 50 items in a lot and, if none are defective, to perform no intervention.
(a) Suppose in a lot chosen at random, 2 out of 50 are defective. What is the probability that at least 1 in the sample of 10 from the lot is defective?
(b) From your answer to part (a), comment on the quality of this sampling plan.
(c) What is the mean number of defects found out of 10 items sampled?

5.96 Consider the situation of Review Exercise 5.95. It has been determined that the sampling plan should be extensive enough that there is a high probability, say 0.9, that if as many as 2 defectives exist in the lot of 50 being sampled, at least 1 will be found in the sampling. With these restrictions, how many of the 50 items should be sampled?

5.97 National security requires that defense technology be able to detect incoming projectiles or missiles. To make the defense system successful, multiple radar screens are required. Suppose that three independent screens are to be operated and the probability that any one screen will detect an incoming missile is 0.8. Obviously, if no screens detect an incoming projectile, the system is unworthy and must be improved.
(a) What is the probability that an incoming missile will not be detected by any of the three screens?
(b) What is the probability that the missile will be detected by only one screen?
(c) What is the probability that it will be detected by at least two out of three screens?

5.98 Suppose it is important that the overall missile defense system be as near perfect as possible.
(a) Assuming the quality of the screens is as indicated in Review Exercise 5.97, how many are needed to ensure that the probability that a missile gets through undetected is 0.0001?
(b) Suppose it is decided to stay with only 3 screens and attempt to improve the screen detection ability. What must the individual screen effectiveness (i.e., probability of detection) be in order to achieve the effectiveness required in part (a)?

5.99 Go back to Review Exercise 5.95(a). Recompute the probability using the binomial distribution. Comment.

5.100 There are two vacancies in a certain university statistics department. Five individuals apply. Two have expertise in linear models, and one has expertise in applied probability. The search committee is instructed to choose the two applicants randomly.
(a) What is the probability that the two chosen are those with expertise in linear models?
(b) What is the probability that of the two chosen, one has expertise in linear models and one has expertise in applied probability?

5.101 The manufacturer of a tricycle for children has received complaints about defective brakes in the product. According to the design of the product and considerable preliminary testing, it had been determined that the probability of the kind of defect in the complaint was 1 in 10,000 (i.e., 0.0001). After a thorough investigation of the complaints, it was determined that during a certain period of time, 200 products were randomly chosen from production and 5 had defective brakes.
(a) Comment on the "1 in 10,000" claim by the manufacturer. Use a probabilistic argument. Use the binomial distribution for your calculations.
(b) Repeat part (a) using the Poisson approximation.

5.102 Group Project: Divide the class into two groups of approximately equal size. The students in group 1 will each toss a coin 10 times (n1) and count the number of heads obtained. The students in group 2 will each toss a coin 40 times (n2) and again count the number of heads. The students in each group should individually compute the proportion of heads observed, which is an estimate of p, the probability of observing a head. Thus, there will be a set of values of p1 (from group 1) and a set of values p2 (from group 2). All of the values of p1 and p2 are estimates of 0.5, which is the true value of the probability of observing a head for a fair coin.
(a) Which set of values is consistently closer to 0.5, the values of p1 or p2? Consider the proof of Theorem 5.1 on page 147 with regard to the estimates of the parameter p = 0.5. The values of p1 were obtained with n = n1 = 10, and the values of p2 were obtained with n = n2 = 40. Using the notation of the proof, the estimates are given by
p1 = x̄1 = (I1 + ··· + In1)/n1,
where I1, . . . , In1 are 0s and 1s and n1 = 10, and
p2 = x̄2 = (I1 + ··· + In2)/n2,
where I1, . . . , In2, again, are 0s and 1s and n2 = 40.
(b) Referring again to Theorem 5.1, show that E(p1) = E(p2) = p = 0.5.
(c) Show that σ²_{p1} = σ²_{X̄1} = σ²/n1 is 4 times the value of σ²_{p2} = σ²_{X̄2} = σ²/n2. Then explain further why the values of p2 from group 2 are more consistently closer to the true value, p = 0.5, than the values of p1 from group 1.
You will continue to learn more and more about parameter estimation beginning in Chapter 9. At that point emphasis will be put on the importance of the mean and variance of an estimator of a parameter.

5.6 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

The discrete distributions discussed in this chapter occur with great frequency in engineering and the biological and physical sciences. The exercises and examples certainly suggest this. Industrial sampling plans and many engineering judgments are based on the binomial and Poisson distributions as well as on the hypergeometric distribution. While the geometric and negative binomial distributions are used to a somewhat lesser extent, they also find applications. In particular, a negative binomial random variable can be viewed as a mixture of Poisson and gamma random variables (the gamma distribution will be discussed in Chapter 6).
Despite the rich heritage that these distributions find in real life, they can be misused unless the scientific practitioner is prudent and cautious. Of course, any probability calculation for the distributions discussed in this chapter is made under the assumption that the parameter value is known. Real-world applications often result in a parameter value that may "move around" due to factors that are difficult to control in the process or because of interventions in the process that have not been taken into account. For example, in Review Exercise 5.77, "historical information" is used. But is the process that exists now the same as that under which the historical data were collected? The use of the Poisson distribution can suffer even more from this kind of difficulty. For example, in Review Exercise 5.80, the questions in parts (a), (b), and (c) are based on the use of μ = 2.7 calls per minute.
Based on historical records, this is the number of calls that occur "on average." But in this and many other applications of the Poisson distribution, there are slow times and busy times and so there are times in which the conditions for the Poisson process may appear to hold when in fact they do not. Thus, the probability calculations may be incorrect. In the case of the binomial, the assumption that may fail in certain applications (in addition to nonconstancy of p) is the independence assumption, stating that the Bernoulli trials are independent.
One of the most famous misuses of the binomial distribution occurred in the 1961 baseball season, when Mickey Mantle and Roger Maris were engaged in a friendly battle to break Babe Ruth's all-time record of 60 home runs. A famous magazine article made a prediction, based on probability theory, that Mantle would break the record. The prediction was based on probability calculation with the use of the binomial distribution. The classic error made was to estimate the parameter p (one for each player) based on relative historical frequency of home runs throughout the players' careers. Maris, unlike Mantle, had not been a prodigious home run hitter prior to 1961, so his estimate of p was quite low. As a result, the calculated probability of breaking the record was quite high for Mantle and low for Maris. The end result: Mantle failed to break the record and Maris succeeded.

Chapter 6
Some Continuous Probability Distributions

6.1 Continuous Uniform Distribution

One of the simplest continuous distributions in all of statistics is the continuous uniform distribution. This distribution is characterized by a density function that is "flat," and thus the probability is uniform in a closed interval, say [A, B]. Although applications of the continuous uniform distribution are not as abundant as those for other distributions discussed in this chapter, it is appropriate for the novice to begin this introduction to continuous distributions with the uniform distribution.

Uniform Distribution: The density function of the continuous uniform random variable X on the interval [A, B] is
f(x; A, B) = 1/(B − A) for A ≤ x ≤ B, and f(x; A, B) = 0 elsewhere.

Probabilities are simple to calculate for the uniform distribution because of the simple nature of the density function. However, note that the application of this distribution is based on the assumption that the probability of falling in an interval of fixed length within [A, B] is constant. The density function forms a rectangle with base B − A and constant height 1/(B − A). As a result, the uniform distribution is often called the rectangular distribution. Note, however, that the interval may not always be closed: [A, B]. It can be (A, B) as well. The density function for a uniform random variable on the interval [1, 3] is shown in Figure 6.1.

Example 6.1: Suppose that a large conference room at a certain company can be reserved for no more than 4 hours. Both long and short conferences occur quite often. In fact, it can be assumed that the length X of a conference has a uniform distribution on the interval [0, 4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3 hours?
Solution: (a)
The appropriate density function for the uniformly distributed random variable X in this situation is
f(x) = 1/4 for 0 ≤ x ≤ 4, and f(x) = 0 elsewhere.

Figure 6.1: The density function for a random variable on the interval [1, 3].

(b) P[X ≥ 3] = ∫_3^4 (1/4) dx = 1/4.

Theorem 6.1: The mean and variance of the uniform distribution are
μ = (A + B)/2 and σ² = (B − A)²/12.
The proofs of the theorems are left to the reader. See Exercise 6.1 on page 185.

6.2 Normal Distribution

The most important continuous probability distribution in the entire field of statistics is the normal distribution. Its graph, called the normal curve, is the bell-shaped curve of Figure 6.2, which approximately describes many phenomena that occur in nature, industry, and research. For example, physical measurements in areas such as meteorological experiments, rainfall studies, and measurements of manufactured parts are often more than adequately explained with a normal distribution. In addition, errors in scientific measurements are extremely well approximated by a normal distribution. In 1733, Abraham DeMoivre developed the mathematical equation of the normal curve. It provided a basis from which much of the theory of inductive statistics is founded. The normal distribution is often referred to as the Gaussian distribution, in honor of Karl Friedrich Gauss (1777–1855), who also derived its equation from a study of errors in repeated measurements of the same quantity.

Figure 6.2: The normal curve.

A continuous random variable X having the bell-shaped distribution of Figure 6.2 is called a normal random variable. The mathematical equation for the probability distribution of the normal variable depends on the two parameters μ and σ, its mean and standard deviation, respectively. Hence, we denote the values of the density of X by n(x; μ, σ).

Normal Distribution: The density of the normal random variable X, with mean μ and variance σ², is
n(x; μ, σ) = (1/(√(2π)σ)) e^(−(1/(2σ²))(x−μ)²), for −∞ < x < ∞.

Example 6.3: Given a standard normal distribution, find the value of k such that
(a) P(Z > k) = 0.3015 and
(b) P(k < Z < −0.18) = 0.4197.

Figure 6.10: Areas for Example 6.3.

Solution: Distributions and the desired areas are shown in Figure 6.10.
(a) In Figure 6.10(a), we see that the k value leaving an area of 0.3015 to the right must then leave an area of 0.6985 to the left. From Table A.3 it follows that k = 0.52.
(b) From Table A.3 we note that the total area to the left of −0.18 is equal to 0.4286. In Figure 6.10(b), we see that the area between k and −0.18 is 0.4197, so the area to the left of k must be 0.4286 − 0.4197 = 0.0089. Hence, from Table A.3, we have k = −2.37.

Example 6.4: Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62.

Figure 6.11: Area for Example 6.4.

Solution: The z values corresponding to x1 = 45 and x2 = 62 are
z1 = (45 − 50)/10 = −0.5 and z2 = (62 − 50)/10 = 1.2.
Therefore,
P(45 < X < 62) = P(−0.5 < Z < 1.2) = P(Z < 1.2) − P(Z < −0.5) = 0.8849 − 0.3085 = 0.5764.

Example 6.5: Given a normal distribution with μ = 300 and σ = 50, find the probability that X assumes a value greater than 362.
Solution: To find P(X > 362), we need to evaluate the area under the normal curve to the right of x = 362. This can be done by transforming x = 362 to the corresponding z value, obtaining the area to the left of z from Table A.3, and then subtracting this area from 1. We find that
z = (362 − 300)/50 = 1.24.
Hence,
P(X > 362) = P(Z > 1.24) = 1 − P(Z < 1.24) = 1 − 0.8925 = 0.1075.

Figure 6.12: Area for Example 6.5.

According to Chebyshev's theorem on page 137, the probability that a random variable assumes a value within 2 standard deviations of the mean is at least 3/4. If the random variable has a normal distribution, the z values corresponding to x1 = μ − 2σ and x2 = μ + 2σ are easily computed to be
z1 = ((μ − 2σ) − μ)/σ = −2 and z2 = ((μ + 2σ) − μ)/σ = 2.
Hence,
P(μ − 2σ < X < μ + 2σ) = P(−2 < Z < 2) = P(Z < 2) − P(Z < −2) = 0.9772 − 0.0228 = 0.9544,
which is a much stronger statement than that given by Chebyshev's theorem.

Using the Normal Curve in Reverse

Sometimes, we are required to find the value of z corresponding to a specified probability that falls between values listed in Table A.3 (see Example 6.6). For convenience, we shall always choose the z value corresponding to the tabular probability that comes closest to the specified probability.
The preceding two examples were solved by going first from a value of x to a z value and then computing the desired area. In Example 6.6, we reverse the process and begin with a known area or probability, find the z value, and then determine x by rearranging the formula
z = (x − μ)/σ to give x = σz + μ.

Example 6.6: Given a normal distribution with μ = 40 and σ = 6, find the value of x that has
(a) 45% of the area to the left and
(b) 14% of the area to the right.
Solution:
(a) An area of 0.45 to the left of the desired x value is shaded in Figure 6.13(a). We require a z value that leaves an area of 0.45 to the left. From Table A.3 we find P(Z < −0.13) = 0.45, so the desired z value is −0.13. Hence,
x = (6)(−0.13) + 40 = 39.22.
(b) In Figure 6.13(b), we shade an area equal to 0.14 to the right of the desired x value. This time we require a z value that leaves 0.14 of the area to the right and hence an area of 0.86 to the left. Again, from Table A.3, we find P(Z < 1.08) = 0.86, so the desired z value is 1.08 and
x = (6)(1.08) + 40 = 46.48.

Figure 6.13: Areas for Example 6.6.

6.4 Applications of the Normal Distribution

Some of the many problems for which the normal distribution is applicable are treated in the following examples. The use of the normal curve to approximate binomial probabilities is considered in Section 6.5.

Example 6.7: A certain type of storage battery lasts, on average, 3.0 years with a standard deviation of 0.5 year. Assuming that battery life is normally distributed, find the probability that a given battery will last less than 2.3 years.
Solution: First construct a diagram such as Figure 6.14, showing the given distribution of battery lives and the desired area. To find P(X < 2.3), we need to evaluate the area under the normal curve to the left of 2.3. This is accomplished by finding the area to the left of the corresponding z value. Hence, we find that
z = (2.3 − 3)/0.5 = −1.4,
and then, using Table A.3, we have
P(X < 2.3) = P(Z < −1.4) = 0.0808.

Figure 6.14: Area for Example 6.7.
Figure 6.15: Area for Example 6.8.

Example 6.8: An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a bulb burns between 778 and 834 hours.
Solution: The distribution of light bulb life is illustrated in Figure 6.15.
The z values corresponding to x1 = 778 and x2 = 834 are
z1 = (778 − 800)/40 = −0.55 and z2 = (834 − 800)/40 = 0.85.
Hence,
P(778 < X < 834) = P(−0.55 < Z < 0.85) = P(Z < 0.85) − P(Z < −0.55) = 0.8023 − 0.2912 = 0.5111.

Example 6.9: In an industrial process, the diameter of a ball bearing is an important measurement. The buyer sets specifications for the diameter to be 3.0 ± 0.01 cm, and all parts falling outside these specifications are scrapped. It is known that the diameter has a normal distribution with mean 3.0 and standard deviation 0.005. On average, how many manufactured ball bearings will be scrapped?
Solution: The specification limits are x1 = 2.99 and x2 = 3.01, with corresponding z values
z1 = (2.99 − 3.0)/0.005 = −2.0 and z2 = (3.01 − 3.0)/0.005 = +2.0.
Hence, the fraction scrapped is
P(Z < −2.0) + P(Z > 2.0) = 2(0.0228) = 0.0456.
As a result, it is anticipated that, on average, 4.56% of manufactured ball bearings will be scrapped.
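As an editorial aside (not part of the original text), the Table A.3 look-ups in Examples 6.5 through 6.9 are easy to reproduce numerically. The following minimal sketch assumes Python with SciPy; the parameters and z values are those of the examples.

    from scipy.stats import norm
    # Example 6.5: P(X > 362) for X normal with mu = 300, sigma = 50
    print(1 - norm.cdf((362 - 300) / 50))              # ~0.1075
    # Example 6.9: fraction outside the specification 3.0 +/- 0.01 when sigma = 0.005
    print(2 * (1 - norm.cdf((3.01 - 3.00) / 0.005)))   # ~0.0455 (table rounding gives 0.0456)
    # Using the normal curve in reverse, as in Example 6.6(a): x = sigma*z + mu
    print(6 * norm.ppf(0.45) + 40)                     # ~39.25 (the table's z = -0.13 gives 39.22)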
Figure 6.16: Area for Example 6.9.
Figure 6.17: Specifications for Example 6.10.
Example 6.10: Gauges are used to reject all components for which a certain dimension is not within the specification 1.50 ± d. It is known that this measurement is normally distributed with mean 1.50 and standard deviation 0.2. Determine the value d such that the specifications “cover” 95% of the measurements.
Solution: From Table A.3 we know that
P(−1.96 < Z < 1.96) = 0.95.
Therefore,
1.96 = ((1.50 + d) − 1.50)/0.2,
from which we obtain
d = (0.2)(1.96) = 0.392.
An illustration of the specifications is shown in Figure 6.17.

Example 6.11: A certain machine makes electrical resistors having a mean resistance of 40 ohms and a standard deviation of 2 ohms. Assuming that the resistance follows a normal distribution and can be measured to any degree of accuracy, what percentage of resistors will have a resistance exceeding 43 ohms?
Solution: A percentage is found by multiplying the relative frequency by 100%. Since the relative frequency for an interval is equal to the probability of a value falling in the interval, we must find the area to the right of x = 43 in Figure 6.18. This can be done by transforming x = 43 to the corresponding z value, obtaining the area to the left of z from Table A.3, and then subtracting this area from 1. We find
z = (43 − 40)/2 = 1.5.
Therefore,
P(X > 43) = P(Z > 1.5) = 1 − P(Z < 1.5) = 1 − 0.9332 = 0.0668.
Hence, 6.68% of the resistors will have a resistance exceeding 43 ohms.

Figure 6.18: Area for Example 6.11.
Figure 6.19: Area for Example 6.12.

Example 6.12: Find the percentage of resistances exceeding 43 ohms for Example 6.11 if resistance is measured to the nearest ohm.
Solution: This problem differs from that in Example 6.11 in that we now assign a measurement of 43 ohms to all resistors whose resistances are greater than 42.5 and less than 43.5. We are actually approximating a discrete distribution by means of a continuous normal distribution. The required area is the region shaded to the right of 43.5 in Figure 6.19. We now find that
z = (43.5 − 40)/2 = 1.75.
Hence,
P(X > 43.5) = P(Z > 1.75) = 1 − P(Z < 1.75) = 1 − 0.9599 = 0.0401.
Therefore, 4.01% of the resistances exceed 43 ohms when measured to the nearest ohm. The difference 6.68% − 4.01% = 2.67% between this answer and that of Example 6.11 represents all those resistance values greater than 43 and less than 43.5 that are now being recorded as 43 ohms.

Example 6.13: The average grade for an exam is 74, and the standard deviation is 7. If 12% of the class is given As, and the grades are curved to follow a normal distribution, what is the lowest possible A and the highest possible B?
Solution: In this example, we begin with a known area of probability, find the z value, and then determine x from the formula x = σz + μ. An area of 0.12, corresponding to the fraction of students receiving As, is shaded in Figure 6.20. We require a z value that leaves 0.12 of the area to the right and, hence, an area of 0.88 to the left. From Table A.3, P(Z < 1.18) has the closest value to 0.88, so the desired z value is 1.18. Hence,
x = (7)(1.18) + 74 = 82.26.
Therefore, the lowest A is 83 and the highest B is 82.

Figure 6.20: Area for Example 6.13.
Figure 6.21: Area for Example 6.14.

Example 6.14: Refer to Example 6.13 and find the sixth decile.
Solution: The sixth decile, written D6, is the x value that leaves 60% of the area to the left, as shown in Figure 6.21. From Table A.3 we find P(Z < 0.25) ≈ 0.6, so the desired z value is 0.25. Now
x = (7)(0.25) + 74 = 75.75.
Hence, D6 = 75.75. That is, 60% of the grades are 75 or less.

Exercises

6.1 Given a continuous uniform distribution, show that
(a) μ = (A + B)/2 and
(b) σ² = (B − A)²/12.

6.2 Suppose X follows a continuous uniform distribution from 1 to 5. Determine the conditional probability P(X > 2.5 | X ≤ 4).
6.3 The daily amount of coffee, in liters, dispensed by a machine located in an airport lobby is a random
variable X having a continuous uniform distribution with A = 7 and B = 10. Find the probability that on a given day the amount of coffee dispensed by this machine will be
(a) at most 8.8 liters;
(b) more than 7.4 liters but less than 9.5 liters;
(c) at least 8.5 liters.
6.4 A bus arrives every 10 minutes at a bus stop. It is assumed that the waiting time for a particular individual is a random variable with a continuous uniform distribution.

(a) What is the probability that the individual waits more than 7 minutes?
(b) What is the probability that the individual waits between 2 and 7 minutes?
6.5 Given a standard normal distribution, find the area under the curve that lies
(a) to the left of z = −1.39; (b) to the right of z = 1.96;
(c) between z = −2.16 and z = −0.65; (d) to the left of z = 1.43;
(e) to the right of z = −0.89;
(f) between z = −0.48 and z = 1.74.
6.6 Find the value of z if the area under a standard normal curve
(a) to the right of z is 0.3622; (b) to the left of z is 0.1131;
(c) between 0 and z, with z > 0, is 0.4838; (d) between −z and z, with z > 0, is 0.9500.
6.7 Given a standard normal distribution, find the value of k such that
(a) P(Z > k) = 0.2946; (b) P(Z < k) = 0.0427; (c) P(−0.93 < Z < k) = 0.7235.

6.8 Given a normal distribution with μ = 30 and σ = 6, find
(a) the normal curve area to the right of x = 17;
(b) the normal curve area to the left of x = 22;
(c) the normal curve area between x = 32 and x = 41;
(d) the value of x that has 80% of the normal curve area to the left;
(e) the two values of x that contain the middle 75% of the normal curve area.

6.9 Given the normally distributed variable X with mean 18 and standard deviation 2.5, find
(a) P(X < 15);
(b) the value of k such that P(X < k) = 0.2236;
(c) the value of k such that P(X > k) = 0.1814;
(d) P(17 < X < 21).

6.10 According to Chebyshev's theorem, the probability that any random variable assumes a value within 3 standard deviations of the mean is at least 8/9. If it is known that the probability distribution of a random variable X is normal with mean μ and variance σ², what is the exact value of P(μ − 3σ < X < μ + 3σ)?

6.11 A soft-drink machine is regulated so that it discharges an average of 200 milliliters per cup. If the amount of drink is normally distributed with a standard deviation equal to 15 milliliters,
(a) what fraction of the cups will contain more than 224 milliliters?
(b) what is the probability that a cup contains between 191 and 209 milliliters?
(c) how many cups will probably overflow if 230-milliliter cups are used for the next 1000 drinks?
(d) below what value do we get the smallest 25% of the drinks?

6.12 The loaves of rye bread distributed to local stores by a certain bakery have an average length of 30 centimeters and a standard deviation of 2 centimeters. Assuming that the lengths are normally distributed, what percentage of the loaves are
(a) longer than 31.7 centimeters?
(b) between 29.3 and 33.5 centimeters in length?
(c) shorter than 25.5 centimeters?

6.13 A research scientist reports that mice will live an average of 40 months when their diets are sharply restricted and then enriched with vitamins and proteins. Assuming that the lifetimes of such mice are normally distributed with a standard deviation of 6.3 months, find the probability that a given mouse will live
(a) more than 32 months;
(b) less than 28 months;
(c) between 37 and 49 months.

6.14 The finished inside diameter of a piston ring is normally distributed with a mean of 10 centimeters and a standard deviation of 0.03 centimeter.
(a) What proportion of rings will have inside diameters exceeding 10.075 centimeters?
(b) What is the probability that a piston ring will have an inside diameter between 9.97 and 10.03 centimeters?
(c) Below what value of inside diameter will 15% of the piston rings fall?

6.15 A lawyer commutes daily from his suburban home to his midtown office. The average time for a one-way trip is 24 minutes, with a standard deviation of 3.8 minutes. Assume the distribution of trip times to be normally distributed.
(a) What is the probability that a trip will take at least 1/2 hour?
(b) If the office opens at 9:00 A.M. and the lawyer leaves his house at 8:45 A.M. daily, what percentage of the time is he late for work?
(c) If he leaves the house at 8:35 A.M. and coffee is served at the office from 8:50 A.M. until 9:00 A.M., what is the probability that he misses coffee?
(d) Find the length of time above which we find the slowest 15% of the trips.
(e) Find the probability that 2 of the next 3 trips will take at least 1/2 hour.
6.16 In the November 1990 issue of Chemical Engineering Progress, a study discussed the percent purity of oxygen from a certain supplier. Assume that the mean was 99.61 with a standard deviation of 0.08. Assume that the distribution of percent purity was approximately normal.
(a) What percentage of the purity values would you expect to be between 99.5 and 99.7?
(b) What purity value would you expect to exceed exactly 5% of the population?

6.17 The average life of a certain type of small motor is 10 years with a standard deviation of 2 years. The manufacturer replaces free all motors that fail while under guarantee. If she is willing to replace only 3% of the motors that fail, how long a guarantee should be offered? Assume that the lifetime of a motor follows a normal distribution.

6.18 The heights of 1000 students are normally distributed with a mean of 174.5 centimeters and a standard deviation of 6.9 centimeters. Assuming that the heights are recorded to the nearest half-centimeter, how many of these students would you expect to have heights
(a) less than 160.0 centimeters?
(b) between 171.5 and 182.0 centimeters inclusive?
(c) equal to 175.0 centimeters?
(d) greater than or equal to 188.0 centimeters?

6.19 A company pays its employees an average wage of $15.90 an hour with a standard deviation of $1.50. If the wages are approximately normally distributed and paid to the nearest cent,
(a) what percentage of the workers receive wages between $13.75 and $16.22 an hour inclusive?
(b) the highest 5% of the employee hourly wages is greater than what amount?

6.20 The weights of a large number of miniature poodles are approximately normally distributed with a mean of 8 kilograms and a standard deviation of 0.9 kilogram. If measurements are recorded to the nearest tenth of a kilogram, find the fraction of these poodles with weights
(a) over 9.5 kilograms;
(b) of at most 8.6 kilograms;
(c) between 7.3 and 9.1 kilograms inclusive.

6.21 The tensile strength of a certain metal component is normally distributed with a mean of 10,000 kilograms per square centimeter and a standard deviation of 100 kilograms per square centimeter. Measurements are recorded to the nearest 50 kilograms per square centimeter.
(a) What proportion of these components exceed 10,150 kilograms per square centimeter in tensile strength?
(b) If specifications require that all components have tensile strength between 9800 and 10,200 kilograms per square centimeter inclusive, what proportion of pieces would we expect to scrap?

6.22 If a set of observations is normally distributed, what percent of these differ from the mean by
(a) more than 1.3σ?
(b) less than 0.52σ?

6.23 The IQs of 600 applicants to a certain college are approximately normally distributed with a mean of 115 and a standard deviation of 12. If the college requires an IQ of at least 95, how many of these students will be rejected on this basis of IQ, regardless of their other qualifications? Note that IQs are recorded to the nearest integers.

6.5 Normal Approximation to the Binomial

Probabilities associated with binomial experiments are readily obtainable from the formula b(x; n, p) of the binomial distribution or from Table A.1 when n is small. In addition, binomial probabilities are readily available in many computer software packages. However, it is instructive to learn the relationship between the binomial and the normal distribution.
In Section 5.5, we illustrated how the Poisson distribution can be used to approximate binomial probabilities when n is quite large and p is very close to 0 or 1. Both the binomial and the Poisson distributions are discrete. The first application of a continuous probability distribution to approximate probabilities over a discrete sample space was demonstrated in Example 6.12, where the normal curve was used. The normal distribution is often a good approximation to a discrete distribution when the latter takes on a symmetric bell shape. From a theoretical point of view, some distributions converge to the normal as their parameters approach certain limits. The normal distribution is a convenient approximating distribution because the cumulative distribution function is so easily tabled. The binomial distribution is nicely approximated by the normal in practical problems when one works with the cumulative distribution function. We now state a theorem that allows us to use areas under the normal curve to approximate binomial properties when n is sufficiently large.

Theorem 6.3: If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of
Z = (X − np)/√(npq),
as n → ∞, is the standard normal distribution n(z; 0, 1).

It turns out that the normal distribution with μ = np and σ² = np(1 − p) not only provides a very accurate approximation to the binomial distribution when n is large and p is not extremely close to 0 or 1 but also provides a fairly good approximation even when n is small and p is reasonably close to 1/2.
To illustrate the normal approximation to the binomial distribution, we first draw the histogram for b(x; 15, 0.4) and then superimpose the particular normal curve having the same mean and variance as the binomial variable X. Hence, we draw a normal curve with
μ = np = (15)(0.4) = 6 and σ² = npq = (15)(0.4)(0.6) = 3.6.
The histogram of b(x; 15, 0.4) and the corresponding superimposed normal curve, which is completely determined by its mean and variance, are illustrated in Figure 6.22.

Figure 6.22: Normal approximation of b(x; 15, 0.4).

The exact probability that the binomial random variable X assumes a given value x is equal to the area of the bar whose base is centered at x. For example, the exact probability that X assumes the value 4 is equal to the area of the rectangle with base centered at x = 4. Using Table A.1, we find this area to be
P(X = 4) = b(4; 15, 0.4) = 0.1268,
which is approximately equal to the area of the shaded region under the normal curve between the two ordinates x1 = 3.5 and x2 = 4.5 in Figure 6.23. Converting to z values, we have
z1 = (3.5 − 6)/1.897 = −1.32 and z2 = (4.5 − 6)/1.897 = −0.79.
Then
P(X = 4) = b(4; 15, 0.4) ≈ P(−1.32 < Z < −0.79) = P(Z < −0.79) − P(Z < −1.32) = 0.2148 − 0.0934 = 0.1214.
This agrees very closely with the exact value of 0.1268.
The normal approximation is most useful in calculating binomial sums for large values of n. Referring to Figure 6.23, we might be interested in the probability that X assumes a value from 7 to 9 inclusive.
The exact probability is given by
P(7 ≤ X ≤ 9) = Σ_{x=0}^{9} b(x; 15, 0.4) − Σ_{x=0}^{6} b(x; 15, 0.4) = 0.9662 − 0.6098 = 0.3564,
which is equal to the sum of the areas of the rectangles with bases centered at x = 7, 8, and 9. For the normal approximation, we find the area of the shaded region under the curve between the ordinates x1 = 6.5 and x2 = 9.5 in Figure 6.23. The corresponding z values are
z1 = (6.5 − 6)/1.897 = 0.26 and z2 = (9.5 − 6)/1.897 = 1.85.

Figure 6.23: Normal approximation of b(x; 15, 0.4) and Σ_{x=7}^{9} b(x; 15, 0.4).

Now,
P(7 ≤ X ≤ 9) ≈ P(0.26 < Z < 1.85) = P(Z < 1.85) − P(Z < 0.26) = 0.9678 − 0.6026 = 0.3652.
If X is a binomial random variable and Z a standard normal variable, then
P(X ≤ x) ≈ P(Z ≤ (x + 0.5 − np)/√(npq)).
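As an editorial aside, the comparison above is easy to reproduce numerically. The sketch below (assuming Python with SciPy, neither used by the text) computes the exact binomial sum and the continuity-corrected normal approximation.

    from math import sqrt
    from scipy.stats import binom, norm
    n, p = 15, 0.4
    mu, sigma = n * p, sqrt(n * p * (1 - p))                 # 6 and 1.897
    exact = binom.cdf(9, n, p) - binom.cdf(6, n, p)          # 0.9662 - 0.6098
    approx = norm.cdf((9.5 - mu) / sigma) - norm.cdf((6.5 - mu) / sigma)
    print(exact, approx)  # ~0.3564 and ~0.364 (table-rounded z values give 0.3652)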
6.6 Gamma and Exponential Distributions

Although the normal distribution can be used to solve many problems in engineering and science, there are still numerous situations that require different types of density functions. Two such density functions, the gamma and exponential distributions, are discussed in this section.
It turns out that the exponential distribution is a special case of the gamma distribution. Both find a large number of applications. The exponential and gamma distributions play an important role in both queuing theory and reliability problems. Time between arrivals at service facilities and time to failure of component parts and electrical systems often are nicely modeled by the exponential distribution. The relationship between the gamma and the exponential allows the gamma to be used in similar types of problems. More details and illustrations will be supplied later in the section.
The gamma distribution derives its name from the well-known gamma function, studied in many areas of mathematics. Before we proceed to the gamma distribution, let us review this function and some of its important properties.

Definition 6.2: The gamma function is defined by
Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx, for α > 0.
The following are a few simple properties of the gamma function. (a) Γ(n) = (n − 1)(n − 2) · · · (1)Γ(1), for a positive integer n.
To see the proof, integrating by parts with u = x^(α−1) and dv = e^(−x) dx, we obtain
Γ(α) = [−e^(−x) x^(α−1)] evaluated from 0 to ∞ + ∫_0^∞ e^(−x)(α − 1)x^(α−2) dx = (α − 1) ∫_0^∞ x^(α−2) e^(−x) dx,
for α > 1, which yields the recursion formula
Γ(α) = (α − 1)Γ(α − 1).
The result follows after repeated application of the recursion formula. Using this result, we can easily show the following two properties.

(b) Γ(n) = (n − 1)! for a positive integer n.
(c) Γ(1) = 1.
Furthermore, we have the following property of Γ(α), which is left for the reader to verify (see Exercise 6.39 on page 206).
(d) Γ(1/2) = √π.
The following is the definition of the gamma distribution.

Gamma Distribution: The continuous random variable X has a gamma distribution, with parameters α and β, if its density function is given by
f(x; α, β) = (1/(β^α Γ(α))) x^(α−1) e^(−x/β) for x > 0, and f(x; α, β) = 0 elsewhere,
where α > 0 and β > 0.
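As a brief numerical aside, not part of the original text: the gamma-function properties (b) through (d) and the density just defined can be checked with SciPy (an assumption of this sketch). Note that scipy.stats.gamma takes a = α and scale = β, matching the book's parameterization.

    import math
    from scipy.special import gamma as G
    from scipy.stats import gamma
    print(G(5), math.factorial(4))      # property (b): Gamma(5) = 4! = 24
    print(G(0.5), math.sqrt(math.pi))   # property (d): Gamma(1/2) = sqrt(pi)
    print(G(3.7), 2.7 * G(2.7))         # recursion: Gamma(a) = (a - 1)*Gamma(a - 1)
    print(gamma.pdf(2, a=2, scale=1))   # f(2; 2, 1) = 2*e^(-2) ~ 0.2707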
Graphs of several gamma distributions are shown in Figure 6.28 for certain specified values of the parameters α and β. The special gamma distribution for which α = 1 is called the exponential distribution.
Figure 6.28: Gamma distributions for (α, β) = (1, 1), (2, 1), and (4, 1).

Exponential Distribution: The continuous random variable X has an exponential distribution, with parameter β, if its density function is given by
f(x; β) = (1/β) e^(−x/β) for x > 0, and f(x; β) = 0 elsewhere,
where β > 0.

The following theorem and corollary give the mean and variance of the gamma and exponential distributions.

Theorem 6.4: The mean and variance of the gamma distribution are
μ = αβ and σ² = αβ².
The proof of this theorem is found in Appendix A.26.

Corollary 6.1: The mean and variance of the exponential distribution are
μ = β and σ² = β².
Relationship to the Poisson Process
We shall pursue applications of the exponential distribution and then return to the gamma distribution. The most important applications of the exponential distribution are situations where the Poisson process applies (see Section 5.5). The reader should recall that the Poisson process allows for the use of the discrete distribution called the Poisson distribution. Recall that the Poisson distribution is used to compute the probability of specific numbers of "events" during a particular period of time or span of space. In many applications, the time period or span of space is the random variable. For example, an industrial engineer may be interested in modeling the time T between arrivals at a congested intersection during rush hour in a large city. An arrival represents the Poisson event.
The relationship between the exponential distribution (often called the negative exponential) and the Poisson process is quite simple. In Chapter 5, the Poisson distribution was developed as a single-parameter distribution with parameter λ, where λ may be interpreted as the mean number of events per unit "time." Consider now the random variable described by the time required for the first event to occur. Using the Poisson distribution, we find that the probability of no events occurring in the span up to time t is given by
p(0; λt) = e^(−λt)(λt)^0/0! = e^(−λt).
We can now make use of the above and let X be the time to the first Poisson event. The probability that the length of time until the first event will exceed x is the same as the probability that no Poisson events will occur in x. The latter, of course, is given by e−λx. As a result,
P(X > x) = e−λx.
Thus, the cumulative distribution function for X is given by
P(0≤X ≤x)=1−e−λx.
Now, in order that we may recognize the presence of the exponential distribution, we differentiate the cumulative distribution function above to obtain the density

6.6 Gamma and Exponential Distributions
197
function
f(x) = λe−λx,
which is the density function of the exponential distribution with λ = 1/β.
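The identity just derived can be verified numerically. The following sketch (an editorial addition assuming SciPy) checks that the exponential survival probability P(X > x) = e^(−λx) coincides with the Poisson probability of zero events in (0, x]; the values of lam (λ) and x are arbitrary.

    from scipy.stats import expon, poisson
    lam, x = 5.0, 0.3                 # arbitrary rate and time span
    print(expon.sf(x, scale=1/lam))   # P(X > x) = e^(-lam*x), with beta = 1/lam
    print(poisson.pmf(0, lam * x))    # p(0; lam*x) = e^(-lam*x), the same number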
Applications of the Exponential and Gamma Distributions
In the foregoing, we provided the foundation for the application of the exponential distribution in "time to arrival" or time to Poisson event problems. We will illustrate some applications here and then proceed to discuss the role of the gamma distribution in these modeling applications. Notice that the mean of the exponential distribution is the parameter β, the reciprocal of the parameter in the Poisson distribution. The reader should recall that it is often said that the Poisson distribution has no memory, implying that occurrences in successive time periods are independent. The important parameter β is the mean time between events. In reliability theory, where equipment failure often conforms to this Poisson process, β is called mean time between failures. Many equipment breakdowns do follow the Poisson process, and thus the exponential distribution does apply. Other applications include survival times in biomedical experiments and computer response time.
In the following example, we show a simple application of the exponential distribution to a problem in reliability. The binomial distribution also plays a role in the solution.
Example 6.17: Suppose that a system contains a certain type of component whose time, in years, to failure is given by T . The random variable T is modeled nicely by the exponential distribution with mean time to failure β = 5. If 5 of these components are installed in different systems, what is the probability that at least 2 are still functioning at the end of 8 years?
Solution: The probability that a given component is still functioning after 8 years is given by
P(T > 8) = (1/5) ∫_8^∞ e^(−t/5) dt = e^(−8/5) ≈ 0.2.
Let X represent the number of components functioning after 8 years. Then, using the binomial distribution, we have
P(X ≥ 2) = Σ_{x=2}^{5} b(x; 5, 0.2) = 1 − Σ_{x=0}^{1} b(x; 5, 0.2) = 1 − 0.7373 = 0.2627.
There are exercises and examples in Chapter 3 where the reader has already encountered the exponential distribution. Others involving waiting time and reliability include Example 6.24 and some of the exercises and review exercises at the end of this chapter.

The Memoryless Property and Its Effect on the Exponential Distribution

The types of applications of the exponential distribution in reliability and component or machine lifetime problems are influenced by the memoryless (or lack-of-memory) property of the exponential distribution. For example, in the case of,
say, an electronic component where lifetime has an exponential distribution, the probability that the component lasts, say, t hours, that is, P(X ≥ t), is the same as the conditional probability
P(X ≥ t0 + t | X ≥ t0).
So if the component "makes it" to t0 hours, the probability of lasting an additional t hours is the same as the probability of lasting t hours. There is no "punishment" through wear that may have ensued for lasting the first t0 hours. Thus, the exponential distribution is more appropriate when the memoryless property is justified. But if the failure of the component is a result of gradual or slow wear (as in mechanical wear), then the exponential does not apply and either the gamma or the Weibull distribution (Section 6.10) may be more appropriate.
The importance of the gamma distribution lies in the fact that it defines a family of which other distributions are special cases. But the gamma itself has important applications in waiting time and reliability theory. Whereas the exponential distribution describes the time until the occurrence of a Poisson event (or the time between Poisson events), the time (or space) occurring until a specified number of Poisson events occur is a random variable whose density function is described by the gamma distribution. This specific number of events is the parameter α in the gamma density function. Thus, it becomes easy to understand that when α = 1, the special case of the exponential distribution occurs. The gamma density can be developed from its relationship to the Poisson process in much the same manner as we developed the exponential density. The details are left to the reader. The following is a numerical example of the use of the gamma distribution in a waiting-time application.
Example 6.18: Suppose that telephone calls arriving at a particular switchboard follow a Poisson process with an average of 5 calls coming per minute. What is the probability that up to a minute will elapse by the time 2 calls have come in to the switchboard?
Solution: The Poisson process applies, with time until 2 Poisson events following a gamma distribution with β = 1/5 and α = 2. Denote by X the time in minutes that transpires before 2 calls come. The required probability is given by
P(X ≤ 1) = ∫_0^1 (1/β²) x e^(−x/β) dx = 25 ∫_0^1 x e^(−5x) dx = 1 − e^(−5)(1 + 5) = 0.96.
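Examples 6.17 and 6.18 can be checked with a few lines of SciPy; this sketch is an editorial addition, not part of the text.

    from scipy.stats import expon, binom, gamma
    # Example 6.17: P(T > 8) with beta = 5, then at least 2 of 5 still functioning
    p = expon.sf(8, scale=5)             # e^(-8/5) ~ 0.2019
    print(1 - binom.cdf(1, 5, p))        # ~0.2666; the text rounds p to 0.2, giving 0.2627
    # Example 6.18: time until the 2nd call, gamma with alpha = 2, beta = 1/5
    print(gamma.cdf(1, a=2, scale=1/5))  # 1 - e^(-5)*(1 + 5) ~ 0.9596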
While the origin of the gamma distribution deals in time (or space) until the
occurrence of α Poisson events, there are many instances where a gamma distribution works very well even though there is no clear Poisson structure. This is particularly true for survival time problems in both engineering and biomedical applications.
Example 6.19: In a biomedical study with rats, a dose-response investigation is used to determine the effect of the dose of a toxicant on their survival time. The toxicant is one that is frequently discharged into the atmosphere from jet fuel. For a certain dose of the toxicant, the study determines that the survival time, in weeks, has a gamma distribution with α = 5 and β = 10. What is the probability that a rat survives no longer than 60 weeks?

Solution: Let the random variable X be the survival time (time to death). The required probability is
P(X ≤ 60) = (1/β^5) ∫_0^60 (x^(α−1) e^(−x/β) / Γ(5)) dx.
The integral above can be solved through the use of the incomplete gamma function, which becomes the cumulative distribution function for the gamma distribution. This function is written as
F(x; α) = ∫_0^x (y^(α−1) e^(−y) / Γ(α)) dy.
If we let y = x/β, so x = βy, we have
P(X ≤ 60) = ∫_0^6 (y^4 e^(−y) / Γ(5)) dy,
which is denoted as F(6; 5) in the table of the incomplete gamma function in Appendix A.23. Note that this allows a quick computation of probabilities for the gamma distribution. Indeed, for this problem, the probability that the rat survives no longer than 60 weeks is given by
P(X ≤ 60) = F(6; 5) = 0.715.
Example 6.20: It is known, from previous data, that the length of time in months between customer complaints about a certain product is a gamma distribution with α = 2 and β = 4. Changes were made to tighten quality control requirements. Following these changes, 20 months passed before the first complaint. Does it appear as if the quality control tightening was effective?
Solution: Let X be the time to the first complaint, which, under conditions prior to the changes, followed a gamma distribution with α = 2 and β = 4. The question centers around how rare X ≥ 20 is, given that α and β remain at values 2 and 4, respectively. In other words, under the prior conditions is a “time to complaint”
as large as 20 months reasonable? Thus, following the solution to Example 6.19,
P(X ≥ 20) = 1 − (1/β^α) ∫_0^20 (x^(α−1) e^(−x/β) / Γ(α)) dx.
Again, using y = x/β, we have
P(X ≥ 20) = 1 − ∫_0^5 (y e^(−y) / Γ(2)) dy = 1 − F(5; 2) = 1 − 0.96 = 0.04,
where F(5; 2) = 0.96 is found from Table A.23.
As a result, we could conclude that the conditions of the gamma distribution
with α = 2 and β = 4 are not supported by the data; under those conditions, an observed time to complaint as large as 20 months would be quite unlikely. Thus, it is reasonable to conclude that the quality control work was effective.
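The incomplete gamma function F(x; α) tabulated in Appendix A.23 corresponds to the regularized lower incomplete gamma function. A sketch, assuming SciPy (an editorial addition), that reproduces the two table values used above:

    from scipy.special import gammainc
    from scipy.stats import gamma
    print(gammainc(5, 6))                # F(6; 5) ~ 0.7149 (Example 6.19)
    print(1 - gammainc(2, 5))            # 1 - F(5; 2) ~ 0.0404 (Example 6.20)
    print(gamma.cdf(60, a=5, scale=10))  # same as F(6; 5), since y = x/beta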
Example 6.21: Consider Exercise 3.31 on page 94. Based on extensive testing, it is determined
that the time Y in years before a major repair is required for a certain washing
machine is characterized by the density function
f(y) = (1/4) e^(−y/4) for y ≥ 0, and f(y) = 0 elsewhere.

Note that Y is an exponential random variable with μ = 4 years. The machine is considered a bargain if it is unlikely to require a major repair before the sixth year. What is the probability P(Y > 6)? What is the probability that a major repair is required in the first year?
Solution: Consider the cumulative distribution function F(y) for the exponential distribution,

F(y) = (1/β) ∫₀^y e^(−t/β) dt = 1 − e^(−y/β).

Then

P(Y > 6) = 1 − F(6) = e^(−3/2) = 0.2231.
Thus, the probability that the washing machine will require major repair after year six is 0.223. Of course, it will require repair before year six with probability 0.777. Thus, one might conclude the machine is not really a bargain. The probability that a major repair is necessary in the first year is

P(Y < 1) = 1 − e^(−1/4) = 1 − 0.779 = 0.221.

6.7 Chi-Squared Distribution

Another very important special case of the gamma distribution is obtained by letting α = v/2 and β = 2, where v is a positive integer. The result is called the chi-squared distribution. The distribution has a single parameter, v, called the degrees of freedom.

Chi-Squared Distribution: The continuous random variable X has a chi-squared distribution, with v degrees of freedom, if its density function is given by

f(x; v) = x^(v/2−1) e^(−x/2) / [2^(v/2) Γ(v/2)] for x > 0, and f(x; v) = 0 elsewhere,

where v is a positive integer.
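The special-case relationship just stated is easy to illustrate numerically; a small sketch assuming scipy.stats (the values x = 3.7 and v = 5 are arbitrary illustrative choices of ours):

    from scipy.stats import chi2, gamma
    x, v = 3.7, 5
    print(chi2.pdf(x, df=v))             # chi-squared density with v degrees of freedom
    print(gamma.pdf(x, a=v/2, scale=2))  # identical value from the gamma density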
The chi-squared distribution plays a vital role in statistical inference. It has considerable applications in both methodology and theory. While we do not discuss applications in detail in this chapter, it is important to understand that Chapters 8, 9, and 16 contain important applications. The chi-squared distribution is an important component of statistical hypothesis testing and estimation. Topics dealing with sampling distributions, analysis of variance, and nonparametric statistics involve extensive use of the chi-squared distribution.

Theorem 6.5: The mean and variance of the chi-squared distribution are μ = v and σ² = 2v.

6.8 Beta Distribution

An extension to the uniform distribution is a beta distribution. Let us start by defining a beta function.

Definition 6.3: A beta function is defined by

B(α, β) = ∫₀¹ x^(α−1) (1 − x)^(β−1) dx = Γ(α)Γ(β)/Γ(α + β), for α, β > 0,

where Γ(α) is the gamma function.

Beta Distribution: The continuous random variable X has a beta distribution with parameters α > 0 and β > 0 if its density function is given by

f(x) = x^(α−1) (1 − x)^(β−1) / B(α, β) for 0 < x < 1, and f(x) = 0 elsewhere.

6.9 Lognormal Distribution

The continuous random variable X has a lognormal distribution if the random variable Y = ln(X) has a normal distribution with mean μ and standard deviation σ.

Example 6.22: Suppose the concentration of a certain pollutant, in parts per million, has a lognormal distribution with parameters μ = 3.2 and σ = 1. What is the probability that the concentration exceeds 8 parts per million?

Solution: Let the random variable X be the pollutant concentration. Then

P(X > 8) = 1 − P(X ≤ 8).
Since ln(X) has a normal distribution with mean μ = 3.2 and standard deviation σ = 1,
P(X ≤ 8) = Φ((ln(8) − 3.2)/1) = Φ(−1.12) = 0.1314.
Here, we use Φ to denote the cumulative distribution function of the standard normal distribution. As a result, the probability that the pollutant concentration exceeds 8 parts per million is 1 − 0.1314 = 0.8686.
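A numerical cross-check of the lognormal calculation; a sketch assuming scipy.stats, whose lognorm takes the shape s = σ and scale = e^μ:

    from math import exp, log
    from scipy.stats import lognorm, norm
    print(lognorm.cdf(8, s=1, scale=exp(3.2)))  # P(X <= 8), approximately 0.1314
    print(norm.cdf((log(8) - 3.2) / 1))         # same value via the standard normal cdf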

Example 6.23: The life, in thousands of miles, of a certain type of electronic control for locomotives has an approximately lognormal distribution with μ = 5.149 and σ = 0.737. Find the 5th percentile of the life of such an electronic control.
Solution: From Table A.3, we know that P(Z < −1.645) = 0.05. Denote by X the life of such an electronic control. Since ln(X) has a normal distribution with mean μ = 5.149 and σ = 0.737, the 5th percentile of X can be calculated as

ln(x) = 5.149 + (0.737)(−1.645) = 3.937.

Hence, x = 51.265. This means that only 5% of the controls will have lifetimes less than 51,265 miles.

6.10 Weibull Distribution (Optional)

Modern technology has enabled engineers to design many complicated systems whose operation and safety depend on the reliability of the various components making up the systems. For example, a fuse may burn out, a steel column may buckle, or a heat-sensing device may fail. Identical components subjected to identical environmental conditions will fail at different and unpredictable times. We have seen the role that the gamma and exponential distributions play in these types of problems. Another distribution that has been used extensively in recent years to deal with such problems is the Weibull distribution, introduced by the Swedish physicist Waloddi Weibull in 1939.

Weibull Distribution: The continuous random variable X has a Weibull distribution, with parameters α and β, if its density function is given by

f(x; α, β) = αβ x^(β−1) e^(−αx^β) for x > 0, and f(x; α, β) = 0 elsewhere,

where α > 0 and β > 0.
The graphs of the Weibull distribution for α = 1 and various values of the parameter β are illustrated in Figure 6.30. We see that the curves change considerably in shape for different values of the parameter β. If we let β = 1, the Weibull distribution reduces to the exponential distribution. For values of β > 1, the curves become somewhat bell shaped and resemble the normal curve but display some skewness.

[Figure 6.30: Weibull distributions (α = 1).]

The mean and variance of the Weibull distribution are stated in the following theorem. The reader is asked to provide the proof in Exercise 6.52 on page 206.

Theorem 6.8: The mean and variance of the Weibull distribution are

μ = α^(−1/β) Γ(1 + 1/β) and σ² = α^(−2/β) { Γ(1 + 2/β) − [Γ(1 + 1/β)]² }.
Like the gamma and exponential distributions, the Weibull distribution is also applied to reliability and life-testing problems such as the time to failure or

life length of a component, measured from some specified time until it fails. Let us represent this time to failure by the continuous random variable T, with probability density function f (t), where f (t) is the Weibull distribution. The Weibull distribution has inherent flexibility in that it does not require the lack of memory property of the exponential distribution. The cumulative distribution function (cdf) for the Weibull can be written in closed form and certainly is useful in computing probabilities.
The cumulative distribution function for the Weibull distribution is given by
F(x) = 1 − e^(−αx^β), for x ≥ 0, where α > 0 and β > 0.
Example 6.24: The length of life X, in hours, of an item in a machine shop has a Weibull distri- bution with α = 0.01 and β = 2. What is the probability that it fails before eight hours of usage?
Solution: P(X < 8) = F(8) = 1 − e^(−(0.01)(8²)) = 1 − 0.527 = 0.473.

The Failure Rate for the Weibull Distribution

When the Weibull distribution applies, it is often helpful to determine the failure rate (sometimes called the hazard rate) in order to get a sense of wear or deterioration of the component. Let us first define the reliability of a component or product as the probability that it will function properly for at least a specified time under specified experimental conditions. Therefore, if R(t) is defined to be the reliability of the given component at time t, we may write

R(t) = P(T > t) = ∫_t^∞ f(t) dt = 1 − F(t),

where F(t) is the cumulative distribution function of T. The conditional probability that a component will fail in the interval from T = t to T = t + Δt, given that it survived to time t, is

[F(t + Δt) − F(t)] / R(t).

Dividing this ratio by Δt and taking the limit as Δt → 0, we get the failure rate, denoted by Z(t). Hence,

Z(t) = lim_(Δt→0) [F(t + Δt) − F(t)]/Δt · 1/R(t) = F′(t)/R(t) = f(t)/R(t) = f(t)/[1 − F(t)],

which expresses the failure rate in terms of the distribution of the time to failure. Since Z(t) = f(t)/[1 − F(t)], the failure rate for the Weibull distribution is given by

Z(t) = αβ t^(β−1), t > 0.
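Example 6.24 and the failure rate can both be checked numerically. One caution for readers using scipy: it parameterizes the Weibull as F(x) = 1 − exp[−(x/scale)^c], so c = β and scale = α^(−1/β); a sketch under that assumption:

    from scipy.stats import weibull_min
    alpha, beta = 0.01, 2.0
    dist = weibull_min(c=beta, scale=alpha ** (-1 / beta))
    print(dist.cdf(8))                     # approximately 0.473, as in Example 6.24
    t = 4.0
    print(alpha * beta * t ** (beta - 1))  # failure rate Z(t) = 0.02t evaluated at t = 4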
Interpretation of the Failure Rate
The quantity Z(t) is aptly named as a failure rate since it does quantify the rate of change over time of the conditional probability that the component lasts an additional Δt given that it has lasted to time t. The rate of decrease (or increase) with time is important. The following are crucial points.
(a) If β = 1, the failure rate = α, a constant. This, as indicated earlier, is the special case of the exponential distribution in which lack of memory prevails.
(b) If β > 1, Z(t) is an increasing function of time t, which indicates that the component wears over time.
(c) If β < 1, Z(t) is a decreasing function of time t and hence the component strengthens or hardens over time.

For example, the item in the machine shop in Example 6.24 has β = 2, and hence it wears over time. In fact, the failure rate function is given by Z(t) = 0.02t. On the other hand, suppose the parameters were β = 3/4 and α = 2. In that case, Z(t) = 1.5/t^(1/4) and hence the component gets stronger over time.

Exercises

6.39 Use the gamma function with y = √(2x) to show that Γ(1/2) = √π.

6.40 In a certain city, the daily consumption of water (in millions of liters) follows approximately a gamma distribution with α = 2 and β = 3. If the daily capacity of that city is 9 million liters of water, what is the probability that on any given day the water supply is inadequate?

6.41 If a random variable X has the gamma distribution with α = 2 and β = 1, find P(1.8 < X < 2.4).

6.42 Suppose that the time, in hours, required to repair a heat pump is a random variable X having a gamma distribution with parameters α = 2 and β = 1/2. What is the probability that on the next service call
(a) at most 1 hour will be required to repair the heat pump?
(b) at least 2 hours will be required to repair the heat pump?

6.43 (a) Find the mean and variance of the daily water consumption in Exercise 6.40.
(b) According to Chebyshev's theorem, there is a probability of at least 3/4 that the water consumption on any given day will fall within what interval?

6.44 In a certain city, the daily consumption of electric power, in millions of kilowatt-hours, is a random variable X having a gamma distribution with mean μ = 6 and variance σ² = 12.
(a) Find the values of α and β.
(b) Find the probability that on any given day the daily power consumption will exceed 12 million kilowatt-hours.

6.45 The length of time for one individual to be served at a cafeteria is a random variable having an exponential distribution with a mean of 4 minutes. What is the probability that a person is served in less than 3 minutes on at least 4 of the next 6 days?

6.46 The life, in years, of a certain type of electrical switch has an exponential distribution with an average life β = 2. If 100 of these switches are installed in different systems, what is the probability that at most 30 fail during the first year?

6.47 Suppose that the service life, in years, of a hearing aid battery is a random variable having a Weibull distribution with α = 1/2 and β = 2.
(a) How long can such a battery be expected to last?
(b) What is the probability that such a battery will be operating after 2 years?

6.48 Derive the mean and variance of the beta distribution.

6.49 Suppose the random variable X follows a beta distribution with α = 1 and β = 3.
(a) Determine the mean and median of X.
(b) Determine the variance of X.
(c) Find the probability that X > 1/3.

6.50 If the proportion of a brand of television set requiring service during the first year of operation is a random variable having a beta distribution with α = 3 and β = 2, what is the probability that at least 80% of the new models of this brand sold this year will require service during their first year of operation?

6.51 The lives of a certain automobile seal have the Weibull distribution with failure rate Z(t) = 1/√t. Find the probability that such a seal is still intact after 4 years.

6.52 Derive the mean and variance of the Weibull distribution.

6.53 In a biomedical research study, it was determined that the survival time, in weeks, of an animal subjected to a certain exposure of gamma radiation has a gamma distribution with α = 5 and β = 10.
(a) What is the mean survival time of a randomly selected animal of the type used in the experiment?
(b) What is the standard deviation of survival time?
(c) What is the probability that an animal survives more than 30 weeks?

6.54 The lifetime, in weeks, of a certain type of transistor is known to follow a gamma distribution with mean 10 weeks and standard deviation √50 weeks.
(a) What is the probability that a transistor of this type will last at most 50 weeks?
(b) What is the probability that a transistor of this type will not survive the first 10 weeks?

6.55 Computer response time is an important application of the gamma and exponential distributions. Suppose that a study of a certain computer system reveals that the response time, in seconds, has an exponential distribution with a mean of 3 seconds.
(a) What is the probability that response time exceeds 5 seconds?
(b) What is the probability that response time exceeds 10 seconds?

6.56 Rate data often follow a lognormal distribution. Average power usage (dB per hour) for a particular company is studied and is known to have a lognormal distribution with parameters μ = 4 and σ = 2. What is the probability that the company uses more than 270 dB during any particular hour?

6.57 For Exercise 6.56, what is the mean power usage (average dB per hour)? What is the variance?

6.58 The number of automobiles that arrive at a certain intersection per minute has a Poisson distribution with a mean of 5. Interest centers around the time that elapses before 10 automobiles appear at the intersection.
(a) What is the probability that more than 10 automobiles appear at the intersection during any given minute of time?
(b) What is the probability that more than 2 minutes elapse before 10 cars arrive?

6.59 Consider the information in Exercise 6.58.
(a) What is the probability that more than 1 minute elapses between arrivals?
(b) What is the mean number of minutes that elapse between arrivals?

6.60 Show that the failure-rate function is given by

Z(t) = αβ t^(β−1), t > 0,

if and only if the time to failure distribution is the Weibull distribution

f(t) = αβ t^(β−1) e^(−αt^β), t > 0.

Review Exercises

6.61 According to a study published by a group of sociologists at the University of Massachusetts, approximately 49% of the Valium users in the state of Massachusetts are white-collar workers. What is the probability that between 482 and 510, inclusive, of the next 1000 randomly selected Valium users from this state are white-collar workers?

6.62 The exponential distribution is frequently applied to the waiting times between successes in a Poisson process. If the number of calls received per hour by a telephone answering service is a Poisson random variable with parameter λ = 6, we know that the time, in hours, between successive calls has an exponential distribution with parameter β = 1/6. What is the probability of waiting more than 15 minutes between any two successive calls?

6.63 When α is a positive integer n, the gamma distribution is also known as the Erlang distribution. Setting α = n in the gamma distribution on page 195, the Erlang distribution is

f(x) = x^(n−1) e^(−x/β) / [β^n (n − 1)!] for x > 0, and f(x) = 0 elsewhere.

It can be shown that if the times between successive events are independent, each having an exponential distribution with parameter β, then the total elapsed waiting time X until all n events occur has the Erlang distribution. Referring to Review Exercise 6.62, what is the probability that the next 3 calls will be received within the next 30 minutes?

6.64 A manufacturer of a certain type of large machine wishes to buy rivets from one of two manufacturers. It is important that the breaking strength of each rivet exceed 10,000 psi. Two manufacturers (A and B) offer this type of rivet and both have rivets whose breaking strength is normally distributed. The mean breaking strengths for manufacturers A and B are 14,000 psi and 13,000 psi, respectively. The standard deviations are 2000 psi and 1000 psi, respectively. Which manufacturer will produce, on the average, the fewest number of defective rivets?

6.65 According to a recent census, almost 65% of all households in the United States were composed of only one or two persons. Assuming that this percentage is still valid today, what is the probability that between 590 and 625, inclusive, of the next 1000 randomly selected households in America consist of either one or two persons?

6.66 A certain type of device has an advertised failure rate of 0.01 per hour. The failure rate is constant and the exponential distribution applies.
(a) What is the mean time to failure?
(b) What is the probability that 200 hours will pass before a failure is observed?

6.67 In a chemical processing plant, it is important that the yield of a certain type of batch product stay

above 80%. If it stays below 80% for an extended period of time, the company loses money. Occasional defective batches are of little concern. But if several batches per day are defective, the plant shuts down and adjustments are made. It is known that the yield is normally distributed with standard deviation 4%.
(a) What is the probability of a “false alarm” (yield below 80%) when the mean yield is 85%?
(b) What is the probability that a batch will have a yield that exceeds 80% when in fact the mean yield is 79%?
6.68 For an electrical component with a failure rate of once every 5 hours, it is important to consider the time that it takes for 2 components to fail.
(a) Assuming that the gamma distribution applies, what is the mean time that it takes for 2 components to fail?
(b) What is the probability that 12 hours will elapse before 2 components fail?
6.69 The elongation of a steel bar under a particular load has been established to be normally distributed with a mean of 0.05 inch and σ = 0.01 inch. Find the probability that the elongation is
(a) above 0.1 inch; (b) below 0.04 inch;
(c) between 0.025 and 0.065 inch.
6.70 A controlled satellite is known to have an error (distance from target) that is normally distributed with mean zero and standard deviation 4 feet. The manufacturer of the satellite defines a success as a firing in which the satellite comes within 10 feet of the target. Compute the probability that the satellite fails.

6.71 A technician plans to test a certain type of resin developed in the laboratory to determine the nature of the time required before bonding takes place. It is known that the mean time to bonding is 3 hours and the standard deviation is 0.5 hour. It will be considered an undesirable product if the bonding time is either less than 1 hour or more than 4 hours. Comment on the utility of the resin. How often would its performance be considered undesirable? Assume that time to bonding is normally distributed.
6.72 Consider the information in Review Exercise 6.66. What is the probability that less than 200 hours will elapse before 2 failures occur?
6.73 For Review Exercise 6.72, what are the mean and variance of the time that elapses before 2 failures occur?
6.74 The average rate of water usage (thousands of gallons per hour) by a certain community is known to involve the lognormal distribution with parameters μ = 5 and σ = 2. It is important for planning purposes to get a sense of periods of high usage. What is the probability that, for any given hour, 50,000 gallons of water are used?
6.75 For Review Exercise 6.74, what is the mean of the average water usage per hour in thousands of gallons?
6.76 In Exercise 6.54 on page 206, the lifetime of a transistor is assumed to have a gamma distribution with mean 10 weeks and standard deviation √50 weeks. Suppose that the gamma distribution assumption is incorrect. Assume that the distribution is normal.
(a) What is the probability that a transistor will last at most 50 weeks?
(b) What is the probability that a transistor will not survive for the first 10 weeks?
(c) Comment on the difference between your results here and those found in Exercise 6.54 on page 206.
6.77 The beta distribution has considerable application in reliability problems in which the basic random variable is a proportion, as in the practical scenario illustrated in Exercise 6.50 on page 206. In that regard, consider Review Exercise 3.73 on page 108. Impurities in batches of product of a chemical process reflect a serious problem. It is known that the proportion of impurities Y in a batch has the density function

f(y) = 10(1 − y)⁹ for 0 ≤ y ≤ 1, and f(y) = 0 elsewhere.

(a) Verify that the above is a valid density function.
(b) What is the probability that a batch is considered not acceptable (i.e., Y > 0.6)?
(c) What are the parameters α and β of the beta distribution illustrated here?
(d) The mean of the beta distribution is α/(α + β). What is the mean proportion of impurities in the batch?
(e) The variance of a beta distributed random variable is σ² = αβ/[(α + β)²(α + β + 1)]. What is the variance of Y in this problem?
6.78 Consider now Review Exercise 3.74 on page 108. The density function of the time Z in minutes between calls to an electrical supply store is given by

f(z) = (1/10) e^(−z/10) for 0 < z < ∞, and f(z) = 0 elsewhere.

Since x₁ > 0 and x₂ > 0, the transformation x₁ = y₁ − x₂ implies that y₂ and hence x₂ must always be less than or equal to y₁. Consequently, the marginal probability distribution of Y₁ is
h(y₁) = Σ_(y₂=0)^(y₁) g(y₁, y₂) = e^(−(μ₁+μ₂)) Σ_(y₂=0)^(y₁) μ₁^(y₁−y₂) μ₂^(y₂) / [(y₁ − y₂)! y₂!]
      = [e^(−(μ₁+μ₂)) / y₁!] Σ_(y₂=0)^(y₁) [y₁! / (y₂! (y₁ − y₂)!)] μ₁^(y₁−y₂) μ₂^(y₂)
      = [e^(−(μ₁+μ₂)) / y₁!] Σ_(y₂=0)^(y₁) (y₁ choose y₂) μ₁^(y₁−y₂) μ₂^(y₂).

Recognizing this sum as the binomial expansion of (μ₁ + μ₂)^(y₁), we obtain

h(y₁) = e^(−(μ₁+μ₂)) (μ₁ + μ₂)^(y₁) / y₁!, y₁ = 0, 1, 2, …,
from which we conclude that the sum of the two independent random variables having Poisson distributions, with parameters μ1 and μ2, has a Poisson distribution with parameter μ1 + μ2.
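The conclusion is easy to confirm by simulation; a minimal sketch with numpy (the parameter values 2 and 3 and the seed are arbitrary choices of ours):

    import numpy as np
    rng = np.random.default_rng(7)
    mu1, mu2 = 2.0, 3.0
    y = rng.poisson(mu1, 100_000) + rng.poisson(mu2, 100_000)
    # For a Poisson(mu1 + mu2) variable, mean and variance are both mu1 + mu2 = 5.
    print(y.mean(), y.var())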
To find the probability distribution of the random variable Y = u(X) when X is a continuous random variable and the transformation is one-to-one, we shall need Theorem 7.3. The proof of the theorem is left to the reader.
Theorem 7.3: Suppose that X is a continuous random variable with probability distribution f(x). Let Y = u(X) define a one-to-one correspondence between the values of X and Y so that the equation y = u(x) can be uniquely solved for x in terms of y, say x = w(y). Then the probability distribution of Y is

g(y) = f[w(y)] |J|,

where J = w′(y) and is called the Jacobian of the transformation.

Example 7.3: Let X be a continuous random variable with probability distribution

f(x) = x/12 for 1 < x < 5, and f(x) = 0 elsewhere.

Find the probability distribution of the random variable Y = 2X − 3.

Solution: The inverse solution of y = 2x − 3 yields x = w(y) = (y + 3)/2, from which we obtain J = w′(y) = 1/2. Therefore, using Theorem 7.3, we find the density function of Y to be

g(y) = [(y + 3)/2]/12 · (1/2) = (y + 3)/48 for −1 < y < 7, and g(y) = 0 elsewhere.

Example 7.5: Let Z be a standard normal random variable with density

f(z) = (1/√(2π)) e^(−z²/2), −∞ < z < ∞.

The transformation Y = Z² leads to a density of the form g(y) = (1/√(2π)) y^(1/2−1) e^(−y/2) for y > 0. Since g(y) is a density function, it follows that

1 = (1/√(2π)) ∫₀^∞ y^(1/2−1) e^(−y/2) dy = [Γ(1/2)/√π] ∫₀^∞ y^(1/2−1) e^(−y/2) / [2^(1/2) Γ(1/2)] dy = Γ(1/2)/√π,

the integral being the area under a gamma probability curve with parameters α = 1/2 and β = 2. Hence, √π = Γ(1/2) and the density of Y is given by

g(y) = y^(1/2−1) e^(−y/2) / [2^(1/2) Γ(1/2)] for y > 0, and g(y) = 0 elsewhere,

which is seen to be a chi-squared distribution with 1 degree of freedom.
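The chi-squared conclusion for Y = Z² can be verified empirically; a sketch of our own, assuming numpy and scipy are available:

    import numpy as np
    from scipy.stats import chi2
    rng = np.random.default_rng(1)
    y = rng.standard_normal(100_000) ** 2
    # Empirical quantiles of Z^2 should track the chi-squared(1) quantiles.
    print(np.quantile(y, [0.5, 0.9]))
    print(chi2.ppf([0.5, 0.9], df=1))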

7.3 Moments and Moment-Generating Functions
In this section, we concentrate on applications of moment-generating functions. The obvious purpose of the moment-generating function is in determining moments of random variables. However, the most important contribution is to establish distributions of functions of random variables.
If g(X) = X^r for r = 0, 1, 2, 3, . . . , Definition 7.1 yields an expected value called the rth moment about the origin of the random variable X, which we denote by μ′_r.
Since the first and second moments about the origin are given by μ′₁ = E(X) and μ′₂ = E(X²), we can write the mean and variance of a random variable as

μ = μ′₁ and σ² = μ′₂ − μ².
Although the moments of a random variable can be determined directly from Definition 7.1, an alternative procedure exists. This procedure requires us to utilize a moment-generating function.
Moment-generating functions will exist only if the sum or integral of Definition 7.2 converges. If a moment-generating function of a random variable X does exist, it can be used to generate all the moments of that variable. The method is described in Theorem 7.6 without proof.
Definition 7.1: The rth moment about the origin of the random variable X is given by

μ′_r = E(X^r) = Σₓ x^r f(x) if X is discrete, and μ′_r = E(X^r) = ∫_−∞^∞ x^r f(x) dx if X is continuous.

Definition 7.2: The moment-generating function of the random variable X is given by E(e^(tX)) and is denoted by M_X(t). Hence,

M_X(t) = E(e^(tX)) = Σₓ e^(tx) f(x) if X is discrete, and M_X(t) = E(e^(tX)) = ∫_−∞^∞ e^(tx) f(x) dx if X is continuous.
Theorem 7.6: Let X be a random variable with moment-generating function M_X(t). Then

d^r M_X(t)/dt^r |_(t=0) = μ′_r.
Example 7.6: Find the moment-generating function of the binomial random variable X and then use it to verify that μ = np and σ² = npq.
Solution: From Definition 7.2 we have
M_X(t) = Σ_(x=0)^n e^(tx) (n choose x) p^x q^(n−x) = Σ_(x=0)^n (n choose x) (pe^t)^x q^(n−x).

Recognizing this last sum as the binomial expansion of (pe^t + q)^n, we obtain

M_X(t) = (pe^t + q)^n.

Now

dM_X(t)/dt = n(pe^t + q)^(n−1) pe^t

and

d²M_X(t)/dt² = np[e^t (n − 1)(pe^t + q)^(n−2) pe^t + (pe^t + q)^(n−1) e^t].

Setting t = 0, we get

μ′₁ = np and μ′₂ = np[(n − 1)p + 1].

Therefore,

μ = μ′₁ = np and σ² = μ′₂ − μ² = np(1 − p) = npq,

which agrees with the results obtained in Chapter 5.
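The differentiation in Example 7.6 can also be reproduced symbolically; a sketch using sympy (our choice of tool, not the text's; any computer algebra system would serve):

    import sympy as sp
    t, p, n = sp.symbols('t p n', positive=True)
    q = 1 - p
    M = (p * sp.exp(t) + q) ** n                    # binomial moment-generating function
    mu1 = sp.simplify(sp.diff(M, t).subs(t, 0))     # n*p
    mu2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))  # n*p*((n - 1)*p + 1)
    print(mu1, sp.simplify(mu2 - mu1 ** 2))         # mean np and variance np(1 - p)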
Example 7.7: Show that the moment-generating function of the random variable X having a normal probability distribution with mean μ and variance σ2 is given by
M_X(t) = exp(μt + σ²t²/2).
Solution : From Definition 7.2 the moment-generating function of the normal random variable X is
M_X(t) = ∫_−∞^∞ e^(tx) · (1/(√(2π)σ)) exp[−(1/2)((x − μ)/σ)²] dx
       = ∫_−∞^∞ (1/(√(2π)σ)) exp[−(x² − 2(μ + tσ²)x + μ²)/(2σ²)] dx.

Completing the square in the exponent, we can write

x² − 2(μ + tσ²)x + μ² = [x − (μ + tσ²)]² − 2μtσ² − t²σ⁴,

and then

M_X(t) = ∫_−∞^∞ (1/(√(2π)σ)) exp{−([x − (μ + tσ²)]² − 2μtσ² − t²σ⁴)/(2σ²)} dx
       = exp[(2μt + σ²t²)/2] ∫_−∞^∞ (1/(√(2π)σ)) exp{−[x − (μ + tσ²)]²/(2σ²)} dx.

Let w = [x − (μ + tσ²)]/σ; then dx = σ dw and

M_X(t) = exp(μt + σ²t²/2) ∫_−∞^∞ (1/√(2π)) e^(−w²/2) dw = exp(μt + σ²t²/2),

since the last integral represents the area under a standard normal density curve and hence equals 1.
Although the method of transforming variables provides an effective way of finding the distribution of a function of several variables, there is an alternative and often preferred procedure when the function in question is a linear combination of independent random variables. This procedure utilizes the properties of moment- generating functions discussed in the following four theorems. In keeping with the mathematical scope of this book, we state Theorem 7.7 without proof.
Theorem 7.7 (Uniqueness Theorem): Let X and Y be two random variables with moment-generating functions M_X(t) and M_Y(t), respectively. If M_X(t) = M_Y(t) for all values of t, then X and Y have the same probability distribution.
Theorem 7.8: M_(X+a)(t) = e^(at) M_X(t).

Proof: M_(X+a)(t) = E[e^(t(X+a))] = e^(at) E(e^(tX)) = e^(at) M_X(t).

Theorem 7.9: M_(aX)(t) = M_X(at).

Proof: M_(aX)(t) = E[e^(t(aX))] = E[e^((at)X)] = M_X(at).

Theorem 7.10: If X₁, X₂, . . . , Xₙ are independent random variables with moment-generating functions M_(X₁)(t), M_(X₂)(t), . . . , M_(Xₙ)(t), respectively, and Y = X₁ + X₂ + · · · + Xₙ, then

M_Y(t) = M_(X₁)(t) M_(X₂)(t) · · · M_(Xₙ)(t).

The proof of Theorem 7.10 is left for the reader.

Theorems 7.7 through 7.10 are vital for understanding moment-generating functions. An example follows to illustrate. There are many situations in which we need to know the distribution of the sum of random variables. We may use Theorems 7.7 and 7.10 and the result of Exercise 7.19 on page 224 to find the distribution of a sum of two independent Poisson random variables with moment-generating functions given by

M_(X₁)(t) = e^(μ₁(e^t−1)) and M_(X₂)(t) = e^(μ₂(e^t−1)),

respectively. According to Theorem 7.10, the moment-generating function of the random variable Y₁ = X₁ + X₂ is

M_(Y₁)(t) = M_(X₁)(t) M_(X₂)(t) = e^(μ₁(e^t−1)) e^(μ₂(e^t−1)) = e^((μ₁+μ₂)(e^t−1)),

which we immediately identify as the moment-generating function of a random variable having a Poisson distribution with the parameter μ₁ + μ₂. Hence, according to Theorem 7.7, we again conclude that the sum of two independent random variables having Poisson distributions, with parameters μ₁ and μ₂, has a Poisson distribution with parameter μ₁ + μ₂.

Linear Combinations of Random Variables
In applied statistics one frequently needs to know the probability distribution of a linear combination of independent normal random variables. Let us obtain the distribution of the random variable Y = a₁X₁ + a₂X₂ when X₁ is a normal variable with mean μ₁ and variance σ₁² and X₂ is also a normal variable but independent of X₁ with mean μ₂ and variance σ₂². First, by Theorem 7.10, we find

M_Y(t) = M_(a₁X₁)(t) M_(a₂X₂)(t),

and then, using Theorem 7.9, we find

M_Y(t) = M_(X₁)(a₁t) M_(X₂)(a₂t).

Substituting a₁t for t and then a₂t for t in the moment-generating function of the normal distribution derived in Example 7.7, we have

M_Y(t) = exp(a₁μ₁t + a₁²σ₁²t²/2 + a₂μ₂t + a₂²σ₂²t²/2) = exp[(a₁μ₁ + a₂μ₂)t + (a₁²σ₁² + a₂²σ₂²)t²/2],

which we recognize as the moment-generating function of a distribution that is normal with mean a₁μ₁ + a₂μ₂ and variance a₁²σ₁² + a₂²σ₂².

Generalizing to the case of n independent normal variables, we state the following result.
Theorem 7.11: If X₁, X₂, . . . , Xₙ are independent random variables having normal distributions with means μ₁, μ₂, . . . , μₙ and variances σ₁², σ₂², . . . , σₙ², respectively, then the random variable

Y = a₁X₁ + a₂X₂ + · · · + aₙXₙ

has a normal distribution with mean

μ_Y = a₁μ₁ + a₂μ₂ + · · · + aₙμₙ

and variance

σ_Y² = a₁²σ₁² + a₂²σ₂² + · · · + aₙ²σₙ².
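Theorem 7.11 is easy to illustrate by simulation; a minimal sketch with numpy (all numbers below are arbitrary illustrative choices of ours):

    import numpy as np
    rng = np.random.default_rng(3)
    x1 = rng.normal(2.0, 3.0, 200_000)   # mu1 = 2, sigma1 = 3
    x2 = rng.normal(-1.0, 2.0, 200_000)  # mu2 = -1, sigma2 = 2
    y = 3 * x1 - 2 * x2                  # a1 = 3, a2 = -2
    # Theorem 7.11 gives mean 3(2) - 2(-1) = 8 and variance 9(9) + 4(4) = 97.
    print(y.mean(), y.var())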
It is now evident that the Poisson distribution and the normal distribution possess a reproductive property in that the sum of independent random variables having either of these distributions is a random variable that also has the same type of distribution. The chi-squared distribution also has this reproductive property.
Theorem 7.12: If X₁, X₂, . . . , Xₙ are mutually independent random variables that have, respectively, chi-squared distributions with v₁, v₂, . . . , vₙ degrees of freedom, then the random variable

Y = X₁ + X₂ + · · · + Xₙ

has a chi-squared distribution with v = v₁ + v₂ + · · · + vₙ degrees of freedom.

Proof: By Theorem 7.10 and Exercise 7.21,

M_Y(t) = M_(X₁)(t) M_(X₂)(t) · · · M_(Xₙ)(t) and M_(Xᵢ)(t) = (1 − 2t)^(−vᵢ/2), i = 1, 2, . . . , n.

Therefore,

M_Y(t) = (1 − 2t)^(−v₁/2) (1 − 2t)^(−v₂/2) · · · (1 − 2t)^(−vₙ/2) = (1 − 2t)^(−(v₁+v₂+···+vₙ)/2),

which we recognize as the moment-generating function of a chi-squared distribution with v = v₁ + v₂ + · · · + vₙ degrees of freedom.

Corollary 7.1: If X₁, X₂, . . . , Xₙ are independent random variables having identical normal distributions with mean μ and variance σ², then the random variable

Y = Σ_(i=1)^n ((Xᵢ − μ)/σ)²

has a chi-squared distribution with v = n degrees of freedom.
This corollary is an immediate consequence of Example 7.5. It establishes a re- lationship between the very important chi-squared distribution and the normal distribution. It also should provide the reader with a clear idea of what we mean by the parameter that we call degrees of freedom. In future chapters, the notion of degrees of freedom will play an increasingly important role.
Corollary 7.2: If X₁, X₂, . . . , Xₙ are independent random variables and Xᵢ follows a normal distribution with mean μᵢ and variance σᵢ² for i = 1, 2, . . . , n, then the random variable

Y = Σ_(i=1)^n ((Xᵢ − μᵢ)/σᵢ)²

has a chi-squared distribution with v = n degrees of freedom.
Exercises
7.1 Let X be a random variable with probability distribution

f(x) = 1/3 for x = 1, 2, 3, and f(x) = 0 elsewhere.

Find the probability distribution of the random variable Y = 2X − 1.

7.2 Let X be a binomial random variable with probability distribution

f(x) = (3 choose x) (2/5)^x (3/5)^(3−x) for x = 0, 1, 2, 3, and f(x) = 0 elsewhere.

Find the probability distribution of the random variable Y = X².

7.3 Let X₁ and X₂ be discrete random variables with the joint multinomial distribution

f(x₁, x₂) = (2 choose x₁, x₂, 2 − x₁ − x₂) (1/4)^(x₁) (1/3)^(x₂) (5/12)^(2−x₁−x₂)

for x₁ = 0, 1, 2; x₂ = 0, 1, 2; x₁ + x₂ ≤ 2; and zero elsewhere. Find the joint probability distribution of Y₁ = X₁ + X₂ and Y₂ = X₁ − X₂.

7.4 Let X₁ and X₂ be discrete random variables with joint probability distribution

f(x₁, x₂) = x₁x₂/18 for x₁ = 1, 2; x₂ = 1, 2, 3, and f(x₁, x₂) = 0 elsewhere.

Find the probability distribution of the random variable Y = X₁X₂.

7.5 Let X have the probability distribution f(x) = 1 for 0 < x < 1, and f(x) = 0 elsewhere. Show that the random variable Y = −2 ln X has a chi-squared distribution with 2 degrees of freedom.

7.7 The speed of a molecule in a uniform gas at equilibrium is a random variable V whose probability distribution is given by f(v) = kv² e^(−bv²) for v > 0, and f(v) = 0 elsewhere, where k is an appropriate constant and b depends on the absolute temperature and mass of the molecule. Find the probability distribution of the kinetic energy of the molecule W, where W = mV²/2.

7.8 A dealer's profit, in units of $5000, on a new automobile is given by Y = X², where X is a random variable having the density function f(x) = 2(1 − x) for 0 < x < 1, and f(x) = 0 elsewhere.
(a) Find the probability density function of the random variable Y.
(b) Using the density function of Y, find the probability that the profit on the next new automobile sold by this dealership will be less than $500.

7.9 The hospital period, in days, for patients following treatment for a certain type of kidney disorder is a random variable Y = X + 4, where X has the density function f(x) = 32/(x + 4)³ for x > 0, and f(x) = 0 elsewhere.
(a) Find the probability density function of the random variable Y.
(b) Using the density function of Y, find the probability that the hospital period for a patient following this treatment will exceed 8 days.

7.12 Let X₁ and X₂ be independent random variables each having the probability distribution f(x) = e^(−x) for x > 0, and f(x) = 0 elsewhere. Show that the random variables Y₁ and Y₂ are independent when Y₁ = X₁ + X₂ and Y₂ = X₁/(X₁ + X₂).

7.13 A current of I amperes flowing through a resistance of R ohms varies according to the probability distribution f(i) = 6i(1 − i) for 0 < i < 1, and f(i) = 0 elsewhere. If the resistance varies independently of the current according to the probability distribution g(r) = 2r for 0 < r < 1, and g(r) = 0 elsewhere, find the probability distribution for the power W = I²R watts.

7.14 Let X be a random variable with probability distribution f(x) = (1 + x)/2 for −1 < x < 1, and f(x) = 0 elsewhere. Find the probability distribution of the random variable Y = X².

7.15 Let X have the probability distribution f(x) = 2(x + 1)/9 for −1 < x < 2, and f(x) = 0 elsewhere. Find the probability distribution of the random variable Y = X².

7.16 Show that the rth moment about the origin of the gamma distribution is μ′_r = β^r Γ(α + r)/Γ(α). [Hint: Substitute y = x/β in the integral defining μ′_r and then use the gamma function to evaluate the integral.]

7.17 A random variable X has the discrete uniform distribution f(x; k) = 1/k for x = 1, 2, . . . , k, and f(x; k) = 0 elsewhere. Show that the moment-generating function of X is M_X(t) = e^t(1 − e^(kt)) / [k(1 − e^t)].

7.18 A random variable X has the geometric distribution g(x; p) = pq^(x−1) for x = 1, 2, 3, . . . . Show that the moment-generating function of X is M_X(t) = pe^t/(1 − qe^t) for t < −ln q.

7.19 A random variable X has the Poisson distribution p(x; μ) = e^(−μ)μ^x/x! for x = 0, 1, 2, . . . . Show that the moment-generating function of X is M_X(t) = e^(μ(e^t−1)). Using M_X(t), find the mean and variance of the Poisson distribution.

7.20 The moment-generating function of a certain Poisson random variable X is given by M_X(t) = e^(4(e^t−1)). Find P(μ − 2σ < X < μ + 2σ).

7.21 Show that the moment-generating function of the random variable X having a chi-squared distribution with v degrees of freedom is M_X(t) = (1 − 2t)^(−v/2).

7.22 Using the moment-generating function of Exercise 7.21, show that the mean and variance of the chi-squared distribution with v degrees of freedom are, respectively, v and 2v.

7.23 If both X and Y, distributed independently, follow exponential distributions with mean parameter 1, find the distributions of
(a) U = X + Y;
(b) V = X/(X + Y).

7.24 By expanding e^(tx) in a Maclaurin series and integrating term by term, show that

M_X(t) = ∫_−∞^∞ e^(tx) f(x) dx = 1 + μt + μ′₂ t²/2! + · · · + μ′_r t^r/r! + · · · .

8.4 Sampling Distribution of Means and the Central Limit Theorem

Example 8.5: The time, in minutes, that a bus takes to complete a particular trip has mean μ = 28 and standard deviation σ = 5, with times recorded to the nearest minute. Find the probability that the average time of n = 40 such trips exceeds 30 minutes.

Solution: We need P(X̄ > 30) with n = 40. Since the time is measured on a continuous scale to the nearest minute, an x̄ greater than 30 is equivalent to x̄ ≥ 30.5. Hence,
P(X̄ > 30) = P[(X̄ − 28)/(5/√40) ≥ (30.5 − 28)/(5/√40)] = P(Z ≥ 3.16) = 0.0008.

There is only a slight chance that the average time of one bus trip will exceed 30 minutes. An illustrative graph is shown in Figure 8.4.

[Figure 8.4: Area for Example 8.5.]
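The tail probability in Example 8.5 can be checked directly; a sketch assuming scipy.stats:

    from math import sqrt
    from scipy.stats import norm
    z = (30.5 - 28) / (5 / sqrt(40))
    print(z, norm.sf(z))  # z is about 3.16 and the tail area about 0.0008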
Sampling Distribution of the Difference between Two Means
The illustration in Case Study 8.1 deals with notions of statistical inference on a single mean μ. The engineer was interested in supporting a conjecture regarding a single population mean. A far more important application involves two populations. A scientist or engineer may be interested in a comparative experiment in which two manufacturing methods, 1 and 2, are to be compared. The basis for that comparison is μ₁ − μ₂, the difference in the population means.
Suppose that we have two populations, the first with mean μ₁ and variance σ₁², and the second with mean μ₂ and variance σ₂². Let the statistic X̄₁ represent the mean of a random sample of size n₁ selected from the first population, and the statistic X̄₂ represent the mean of a random sample of size n₂ selected from

the second population, independent of the sample from the first population. What can we say about the sampling distribution of the difference X̄₁ − X̄₂ for repeated samples of size n₁ and n₂? According to Theorem 8.2, the variables X̄₁ and X̄₂ are both approximately normally distributed with means μ₁ and μ₂ and variances σ₁²/n₁ and σ₂²/n₂, respectively. This approximation improves as n₁ and n₂ increase. By choosing independent samples from the two populations we ensure that the variables X̄₁ and X̄₂ will be independent, and then using Theorem 7.11, with a₁ = 1 and a₂ = −1, we can conclude that X̄₁ − X̄₂ is approximately normally distributed with mean

μ_(X̄₁−X̄₂) = μ_(X̄₁) − μ_(X̄₂) = μ₁ − μ₂

and variance

σ²_(X̄₁−X̄₂) = σ²_(X̄₁) + σ²_(X̄₂) = σ₁²/n₁ + σ₂²/n₂.
The Central Limit Theorem can be easily extended to the two-sample, two-population case.
Theorem 8.3: If independent samples of size n₁ and n₂ are drawn at random from two populations, discrete or continuous, with means μ₁ and μ₂ and variances σ₁² and σ₂², respectively, then the sampling distribution of the differences of means, X̄₁ − X̄₂, is approximately normally distributed with mean and variance given by

μ_(X̄₁−X̄₂) = μ₁ − μ₂ and σ²_(X̄₁−X̄₂) = σ₁²/n₁ + σ₂²/n₂.

Hence,

Z = [(X̄₁ − X̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)

is approximately a standard normal variable.
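Theorem 8.3 can be illustrated by simulation; a minimal sketch with numpy (the population parameters, sample sizes, and seed below are our own arbitrary choices):

    import numpy as np
    rng = np.random.default_rng(5)
    mu1, mu2, s1, s2, n1, n2 = 5.0, 4.0, 2.0, 1.5, 40, 60
    d = (rng.normal(mu1, s1, (20_000, n1)).mean(axis=1)
         - rng.normal(mu2, s2, (20_000, n2)).mean(axis=1))
    z = (d - (mu1 - mu2)) / np.sqrt(s1**2 / n1 + s2**2 / n2)
    print(z.mean(), z.std())  # near 0 and 1, as Theorem 8.3 asserts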
If both n1 and n2 are greater than or equal to 30, the normal approximation for the distribution of X ̄1 − X ̄2 is very good when the underlying distributions are not too far away from normal. However, even when n1 and n2 are less than 30, the normal approximation is reasonably good except when the populations are decidedly nonnormal. Of course, if both populations are normal, then X ̄1 −X ̄2 has a normal distribution no matter what the sizes of n1 and n2 are.
The utility of the sampling distribution of the difference between two sample averages is very similar to that described in Case Study 8.1 on page 235 for the case of a single mean. Case Study 8.2 that follows focuses on the use of the difference between two sample means to support (or not) the conjecture that two population means are the same.
Case Study 8.2: Paint Drying Time: Two independent experiments are run in which two different types of paint are compared. Eighteen specimens are painted using type A, and the drying time, in hours, is recorded for each. The same is done with type B. The population standard deviations are both known to be 1.0.

Assuming that the mean drying time is equal for the two types of paint, find P(X̄_A − X̄_B > 1.0), where X̄_A and X̄_B are average drying times for samples of size n_A = n_B = 18.
Solution: From the sampling distribution of X̄_A − X̄_B, we know that the distribution is approximately normal with mean

μ_(X̄A−X̄B) = μ_A − μ_B = 0

and variance

σ²_(X̄A−X̄B) = σ²_A/n_A + σ²_B/n_B = 1/18 + 1/18 = 1/9.
[Figure 8.5: Area for Case Study 8.2.]
The desired probability is given by the shaded region in Figure 8.5. Corresponding to the value X̄_A − X̄_B = 1.0, we have

z = [1 − (μ_A − μ_B)] / √(1/9) = (1 − 0)/√(1/9) = 3.0;
so
P(Z > 3.0) = 1 − P(Z < 3.0) = 1 − 0.9987 = 0.0013.

What Do We Learn from Case Study 8.2?

The machinery in the calculation is based on the presumption that μ_A = μ_B. Suppose, however, that the experiment is actually conducted for the purpose of drawing an inference regarding the equality of μ_A and μ_B, the two population mean drying times. If the two averages differ by as much as 1 hour (or more), this clearly is evidence that would lead one to conclude that the population mean drying time is not equal for the two types of paint. On the other hand, suppose that the difference in the two sample averages is as small as, say, 15 minutes. If μ_A = μ_B,

P[(X̄_A − X̄_B) > 0.25 hour] = P[(X̄_A − X̄_B − 0)/√(1/9) > 3/4] = P(Z > 3/4) = 1 − P(Z < 0.75) = 1 − 0.7734 = 0.2266.

Since this probability is not low, one would conclude that a difference in sample means of 15 minutes can happen by chance (i.e., it happens frequently even though μ_A = μ_B). As a result, that type of difference in average drying times certainly is not a clear signal that μ_A ≠ μ_B.

As we indicated earlier, a more detailed formalism regarding this and other types of statistical inference (e.g., hypothesis testing) will be supplied in future chapters. The Central Limit Theorem and sampling distributions discussed in the next three sections will also play a vital role.

Example 8.6: The television picture tubes of manufacturer A have a mean lifetime of 6.5 years and a standard deviation of 0.9 year, while those of manufacturer B have a mean lifetime of 6.0 years and a standard deviation of 0.8 year. What is the probability that a random sample of 36 tubes from manufacturer A will have a mean lifetime that is at least 1 year more than the mean lifetime of a sample of 49 tubes from manufacturer B?

Solution: We are given the following information:

Population 1: μ₁ = 6.5, σ₁ = 0.9, n₁ = 36
Population 2: μ₂ = 6.0, σ₂ = 0.8, n₂ = 49

If we use Theorem 8.3, the sampling distribution of X̄₁ − X̄₂ will be approximately normal and will have mean and standard deviation

μ_(X̄₁−X̄₂) = 6.5 − 6.0 = 0.5 and σ_(X̄₁−X̄₂) = √(0.81/36 + 0.64/49) = 0.189.

The probability that the mean lifetime for 36 tubes from manufacturer A will be at least 1 year longer than the mean lifetime for 49 tubes from manufacturer B is given by the area of the shaded region in Figure 8.6. Corresponding to the value x̄₁ − x̄₂ = 1.0, we find that

z = (1.0 − 0.5)/0.189 = 2.65,

and hence

P(X̄₁ − X̄₂ ≥ 1.0) = P(Z > 2.65) = 1 − P(Z < 2.65) = 1 − 0.9960 = 0.0040.

[Figure 8.6: Area for Example 8.6.]

More on Sampling Distribution of Means—Normal Approximation to the Binomial Distribution

Section 6.5 presented the normal approximation to the binomial distribution at length. Conditions were given on the parameters n and p for which the distribution of a binomial random variable can be approximated by the normal distribution. Examples and exercises reflected the importance of the concept of the "normal approximation." It turns out that the Central Limit Theorem sheds even more light on how and why this approximation works. We certainly know that a binomial random variable is the number X of successes in n independent trials, where the outcome of each trial is binary. We also illustrated in Chapter 1 that the proportion computed in such an experiment is an average of a set of 0s and 1s. Indeed, while the proportion X/n is an average, X is the sum of this set of 0s and 1s, and both X and X/n are approximately normal if n is sufficiently large. Of course, from what we learned in Chapter 6, we know that there are conditions on n and p that affect the quality of the approximation, namely np ≥ 5 and nq ≥ 5.

Exercises

8.17 If all possible samples of size 16 are drawn from a normal population with mean equal to 50 and standard deviation equal to 5, what is the probability that a sample mean X̄ will fall in the interval from μ_X̄ − 1.9σ_X̄ to μ_X̄ − 0.4σ_X̄? Assume that the sample means can be measured to any degree of accuracy.

8.18 If the standard deviation of the mean for the sampling distribution of random samples of size 36 from a large or infinite population is 2, how large must the sample size become if the standard deviation is to be reduced to 1.2?
8.19 A certain type of thread is manufactured with a mean tensile strength of 78.3 kilograms and a standard deviation of 5.6 kilograms. How is the variance of the sample mean changed when the sample size is
(a) increased from 64 to 196?
(b) decreased from 784 to 49?

8.20 Given the discrete uniform population f(x) = 1/3 for x = 2, 4, 6, and f(x) = 0 elsewhere, find the probability that a random sample of size 54, selected with replacement, will yield a sample mean greater than 4.1 but less than 4.4. Assume the means are measured to the nearest tenth.

8.21 A soft-drink machine is regulated so that the amount of drink dispensed averages 240 milliliters with a standard deviation of 15 milliliters. Periodically, the machine is checked by taking a sample of 40 drinks and computing the average content. If the mean of the 40 drinks is a value within the interval μ_X̄ ± 2σ_X̄, the machine is thought to be operating satisfactorily; otherwise, adjustments are made. In Section 8.3, the company official found the mean of 40 drinks to be x̄ = 236 milliliters and concluded that the machine needed no adjustment. Was this a reasonable decision?

8.22 The heights of 1000 students are approximately normally distributed with a mean of 174.5 centimeters and a standard deviation of 6.9 centimeters. Suppose 200 random samples of size 25 are drawn from this population and the means recorded to the nearest tenth of a centimeter. Determine
(a) the mean and standard deviation of the sampling distribution of X̄;
(b) the number of sample means that fall between 172.5 and 175.8 centimeters inclusive;
(c) the number of sample means falling below 172.0 centimeters.

8.23 The random variable X, representing the number of cherries in a cherry puff, has the following probability distribution:

x           4    5    6    7
P(X = x)  0.2  0.4  0.3  0.1

(a) Find the mean μ and the variance σ² of X.
(b) Find the mean μ_X̄ and the variance σ²_X̄ of the mean X̄ for random samples of 36 cherry puffs.
(c) Find the probability that the average number of cherries in 36 cherry puffs will be less than 5.5.

8.24 If a certain machine makes electrical resistors having a mean resistance of 40 ohms and a standard deviation of 2 ohms, what is the probability that a random sample of 36 of these resistors will have a combined resistance of more than 1458 ohms?

8.25 The average life of a bread-making machine is 7 years, with a standard deviation of 1 year. Assuming that the lives of these machines follow approximately a normal distribution, find
(a) the probability that the mean life of a random sample of 9 such machines falls between 6.4 and 7.2 years;
(b) the value of x to the right of which 15% of the means computed from random samples of size 9 would fall.

8.26 The amount of time that a drive-through bank teller spends on a customer is a random variable with a mean μ = 3.2 minutes and a standard deviation σ = 1.6 minutes. If a random sample of 64 customers is observed, find the probability that their mean time at the teller's window is
(a) at most 2.7 minutes;
(b) more than 3.5 minutes;
(c) at least 3.2 minutes but less than 3.4 minutes.

8.27 In a chemical process, the amount of a certain type of impurity in the output is difficult to control and is thus a random variable. Speculation is that the population mean amount of the impurity is 0.20 gram per gram of output. It is known that the standard deviation is 0.1 gram per gram. An experiment is conducted to gain more insight regarding the speculation that μ = 0.2. The process is run on a lab scale 50 times and the sample average x̄ turns out to be 0.23 gram per gram. Comment on the speculation that the mean amount of impurity is 0.20 gram per gram. Make use of the Central Limit Theorem in your work.

8.28 A random sample of size 25 is taken from a normal population having a mean of 80 and a standard deviation of 5. A second random sample of size 36 is taken from a different normal population having a mean of 75 and a standard deviation of 3. Find the probability that the sample mean computed from the 25 measurements will exceed the sample mean computed from the 36 measurements by at least 3.4 but less than 5.9. Assume the difference of the means to be measured to the nearest tenth.

8.29 The distribution of heights of a certain breed of terrier has a mean of 72 centimeters and a standard deviation of 10 centimeters, whereas the distribution of heights of a certain breed of poodle has a mean of 28 centimeters with a standard deviation of 5 centimeters. Assuming that the sample means can be measured to any degree of accuracy, find the probability that the sample mean for a random sample of heights of 64 terriers exceeds the sample mean for a random sample of heights of 100 poodles by at most 44.2 centimeters.

8.30 The mean score for freshmen on an aptitude test at a certain college is 540, with a standard deviation of 50. Assume the means to be measured to any degree of accuracy. What is the probability that two groups selected at random, consisting of 32 and 50 students, respectively, will differ in their mean scores by
(a) more than 20 points?
(b) an amount between 5 and 10 points?

8.31 Consider Case Study 8.2 on page 238. Suppose 18 specimens were used for each type of paint in an experiment and x̄_A − x̄_B, the actual difference in mean drying time, turned out to be 1.0.
(a) Does this seem to be a reasonable result if the two population mean drying times truly are equal? Make use of the result in the solution to Case Study 8.2.
(b) If someone did the experiment 10,000 times under the condition that μ_A = μ_B, in how many of those 10,000 experiments would there be a difference x̄_A − x̄_B that was as large as (or larger than) 1.0?

8.32 Two different box-filling machines are used to fill cereal boxes on an assembly line. The critical measurement influenced by these machines is the weight of the product in the boxes. Engineers are quite certain that the variance of the weight of product is σ² = 1 ounce. Experiments are conducted using both machines with sample sizes of 36 each. The sample averages for machines A and B are x̄_A = 4.5 ounces and x̄_B = 4.7 ounces. Engineers are surprised that the two sample averages for the filling machines are so different.
(a) Use the Central Limit Theorem to determine P(X̄_B − X̄_A ≥ 0.2) under the condition that μ_A = μ_B.
(b) Do the aforementioned experiments seem to, in any way, strongly support a conjecture that the population means for the two machines are different? Explain using your answer in (a).

8.33 The chemical benzene is highly toxic to humans. However, it is used in the manufacture of many medicine dyes, leather, and coverings. Government regulations dictate that for any production process involving benzene, the water in the output of the process must not exceed 7950 parts per million (ppm) of benzene. For a particular process of concern, the water sample was collected by a manufacturer 25 times randomly and the sample average x̄ was 7960 ppm. It is known from historical data that the standard deviation σ is 100 ppm.
(a) What is the probability that the sample average in this experiment would exceed the government limit if the population mean is equal to the limit? Use the Central Limit Theorem.
(b) Is an observed x̄ = 7960 in this experiment firm evidence that the population mean for the process exceeds the government limit? Answer your question by computing P(X̄ ≥ 7960 | μ = 7950). Assume that the distribution of benzene concentration is normal.

8.34 Two alloys A and B are being used to manufacture a certain steel product. An experiment needs to be designed to compare the two in terms of maximum load capacity in tons (the maximum weight that can be tolerated without breaking). It is known that the two standard deviations in load capacity are equal at 5 tons each. An experiment is conducted in which 30 specimens of each alloy (A and B) are tested and the results recorded as follows:

x̄_A = 49.5, x̄_B = 45.5; x̄_A − x̄_B = 4.

The manufacturers of alloy A are convinced that this evidence shows conclusively that μ_A > μ_B and strongly supports the claim that their alloy is superior. Manufacturers of alloy B claim that the experiment could easily have given x̄_A − x̄_B = 4 even if the two population means are equal. In other words, "the results are inconclusive!"
(a) Make an argument that manufacturers of alloy B are wrong. Do it by computing
P(X̄_A − X̄_B > 4 | μ_A = μ_B).
(b) Do you think these data strongly support alloy A?
8.35 Consider the situation described in Example 8.4 on page 234. Do these results prompt you to question the premise that μ = 800 hours? Give a probabilistic result that indicates how rare an event X̄ ≤ 775 is when μ = 800. On the other hand, how rare would it be if μ truly were, say, 760 hours?
8.36 Let X1,X2,…,Xn be a random sample from a distribution that can take on only positive values. Use the Central Limit Theorem to produce an argument that if n is sufficiently large, then Y = X1X2 · · · Xn has approximately a lognormal distribution.
8.5 Sampling Distribution of S2
In the preceding section we learned about the sampling distribution of X ̄. The
Central Limit Theorem allowed us to make use of the fact that
(X̄ − μ)/(σ/√n)

tends toward N(0,1) as the sample size grows large. Sampling distributions of important statistics allow us to learn information about parameters. Usually, the parameters are the counterpart to the statistics in question. For example, if an engineer is interested in the population mean resistance of a certain type of resistor, the sampling distribution of X ̄ will be exploited once the sample information is gathered. On the other hand, if the variability in resistance is to be studied, clearly the sampling distribution of S2 will be used in learning about the parametric counterpart, the population variance σ2.
If a random sample of size n is drawn from a normal population with mean μ and variance σ2, and the sample variance is computed, we obtain a value of the statistic S2. We shall proceed to consider the distribution of the statistic (n − 1)S2/σ2.
By the addition and subtraction of the sample mean X̄, it is easy to see that

Σ_(i=1)^n (Xᵢ − μ)² = Σ_(i=1)^n [(Xᵢ − X̄) + (X̄ − μ)]²
  = Σ_(i=1)^n (Xᵢ − X̄)² + Σ_(i=1)^n (X̄ − μ)² + 2(X̄ − μ) Σ_(i=1)^n (Xᵢ − X̄)
  = Σ_(i=1)^n (Xᵢ − X̄)² + n(X̄ − μ)².

Dividing each term of the equality by σ² and substituting (n − 1)S² for Σ_(i=1)^n (Xᵢ − X̄)², we obtain

(1/σ²) Σ_(i=1)^n (Xᵢ − μ)² = (n − 1)S²/σ² + (X̄ − μ)²/(σ²/n).

Now, according to Corollary 7.1 on page 222, we know that

Σ_(i=1)^n (Xᵢ − μ)²/σ²

is a chi-squared random variable with n degrees of freedom. We have a chi-squared random variable with n degrees of freedom partitioned into two components. Note that in Section 6.7 we showed that a chi-squared distribution is a special case of a gamma distribution. The second term on the right-hand side is Z², which is a chi-squared random variable with 1 degree of freedom, and it turns out that (n − 1)S²/σ² is a chi-squared random variable with n − 1 degrees of freedom. We formalize this in the following theorem.
Theorem 8.4: If S² is the variance of a random sample of size n taken from a normal population having the variance σ², then the statistic

χ² = (n − 1)S²/σ² = Σ_(i=1)^n (Xᵢ − X̄)²/σ²

has a chi-squared distribution with v = n − 1 degrees of freedom.
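Theorem 8.4 can be illustrated by simulation; a sketch with numpy and scipy (the sample size, population mean, and seed are arbitrary choices of ours):

    import numpy as np
    from scipy.stats import chi2
    rng = np.random.default_rng(11)
    n, sigma = 5, 1.0
    x = rng.normal(3.0, sigma, size=(50_000, n))
    stat = (n - 1) * x.var(axis=1, ddof=1) / sigma**2
    # Empirical quantiles should match the chi-squared(n - 1) quantiles.
    print(np.quantile(stat, [0.025, 0.975]))
    print(chi2.ppf([0.025, 0.975], df=n - 1))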
The values of the random variable χ2 are calculated from each sample by the

formula
χ² = (n − 1)s²/σ².
The probability that a random sample produces a χ² value greater than some specified value is equal to the area under the curve to the right of this value. It is customary to let χ²_α represent the χ² value above which we find an area of α. This is illustrated by the shaded region in Figure 8.7.

[Figure 8.7: The chi-squared distribution.]
Table A.5 gives values of χ²_α for various values of α and v. The areas, α, are the column headings; the degrees of freedom, v, are given in the left column; and the table entries are the χ² values. Hence, the χ² value with 7 degrees of freedom, leaving an area of 0.05 to the right, is χ²_(0.05) = 14.067. Owing to lack of symmetry, we must also use the tables to find χ²_(0.95) = 2.167 for v = 7.
Exactly 95% of a chi-squared distribution lies between χ20.975 and χ20.025. A χ2 value falling to the right of χ20.025 is not likely to occur unless our assumed value of σ2 is too small. Similarly, a χ2 value falling to the left of χ20.975 is unlikely unless our assumed value of σ2 is too large. In other words, it is possible to have a χ2 value to the left of χ20.975 or to the right of χ20.025 when σ2 is correct, but if this should occur, it is more probable that the assumed value of σ2 is in error.
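The table lookups above are easy to verify numerically. A minimal Python sketch, assuming the scipy library is available, reproduces the two χ² values quoted for v = 7:

# Verifying chi-squared table values (sketch; assumes scipy is installed).
from scipy.stats import chi2

# isf(alpha, v) returns the value with area alpha to its right.
print(chi2.isf(0.05, 7))   # about 14.067, matching Table A.5
print(chi2.isf(0.95, 7))   # about 2.167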
Example 8.7: A manufacturer of car batteries guarantees that the batteries will last, on average, 3 years with a standard deviation of 1 year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years, should the manufacturer still be convinced that the batteries have a standard deviation of 1 year? Assume that the battery lifetime follows a normal distribution.
Solution: We first find the sample variance using Theorem 8.1,

s² = [(5)(48.26) − (15)²] / [(5)(4)] = 0.815.

Then

χ² = (4)(0.815)/1 = 3.26
is a value from a chi-squared distribution with 4 degrees of freedom. Since 95% of the χ2 values with 4 degrees of freedom fall between 0.484 and 11.143, the computed value with σ2 = 1 is reasonable, and therefore the manufacturer has no reason to suspect that the standard deviation is other than 1 year.
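The arithmetic in Example 8.7 can be cross-checked with a few lines of code. A minimal sketch, assuming numpy and scipy are available:

import numpy as np
from scipy.stats import chi2

lifetimes = np.array([1.9, 2.4, 3.0, 3.5, 4.2])
n = len(lifetimes)
s2 = lifetimes.var(ddof=1)            # sample variance, about 0.815
chi2_value = (n - 1) * s2 / 1.0**2    # sigma = 1 year under the guarantee

# Central 95% range of a chi-squared variable with 4 degrees of freedom:
lower, upper = chi2.ppf(0.025, n - 1), chi2.ppf(0.975, n - 1)
print(chi2_value, lower, upper)       # 3.26 falls between 0.484 and 11.143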
Degrees of Freedom as a Measure of Sample Information
Recall from Corollary 7.1 in Section 7.3 that

Σ_{i=1}^{n} (Xi − μ)²/σ²

has a χ²-distribution with n degrees of freedom. Note also Theorem 8.4, which indicates that the random variable
(n − 1)S²/σ² = Σ_{i=1}^{n} (Xi − X̄)²/σ²

has a χ²-distribution with n − 1 degrees of freedom. The reader may also recall that the term degrees of freedom, used in this identical context, is discussed in Chapter 1.
As we indicated earlier, the proof of Theorem 8.4 will not be given. However, the reader can view Theorem 8.4 as indicating that when μ is not known and one considers the distribution of

Σ_{i=1}^{n} (Xi − X̄)²/σ²,
there is 1 less degree of freedom, or a degree of freedom is lost in the estimation of μ (i.e., when μ is replaced by x̄). In other words, there are n degrees of freedom, or independent pieces of information, in the random sample from the normal distribution. When the data (the values in the sample) are used to compute the mean, there is 1 less degree of freedom in the information used to estimate σ².
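Theorem 8.4 can also be probed by simulation. The sketch below, assuming numpy and scipy, compares empirical quantiles of (n − 1)S²/σ² computed from normal samples with the quantiles of a chi-squared distribution with n − 1 degrees of freedom; the two should agree closely:

import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(11)
n, sigma2 = 8, 2.0
s2 = rng.normal(0.0, np.sqrt(sigma2), size=(200_000, n)).var(axis=1, ddof=1)
stat = (n - 1) * s2 / sigma2

for q in (0.05, 0.50, 0.95):
    print(np.quantile(stat, q), chi2.ppf(q, n - 1))   # close agreement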
8.6 t-Distribution
In Section 8.4, we discussed the utility of the Central Limit Theorem. Its applications revolve around inferences on a population mean or the difference between two population means. Use of the Central Limit Theorem and the normal distribution is certainly helpful in this context. However, it was assumed that the population standard deviation is known. This assumption may not be unreasonable in situations where the engineer is quite familiar with the system or process. However, in many experimental scenarios, knowledge of σ is certainly no more reasonable than knowledge of the population mean μ. Often, in fact, an estimate of σ must be supplied by the same sample information that produced the sample average x̄. As a result, a natural statistic to consider to deal with inferences on μ is
T = (X̄ − μ)/(S/√n),
since S is the sample analog to σ. If the sample size is small, the values of S² fluctuate considerably from sample to sample (see Exercise 8.43 on page 259) and the distribution of T deviates appreciably from that of a standard normal distribution.
If the sample size is large enough, say n ≥ 30, the distribution of T does not differ considerably from the standard normal. However, for n < 30, it is useful to deal with the exact distribution of T. In developing the sampling distribution of T, we shall assume that our random sample was selected from a normal population. We can then write

T = [(X̄ − μ)/(σ/√n)] / √(S²/σ²) = Z / √(V/(n − 1)),

where

Z = (X̄ − μ)/(σ/√n)

has the standard normal distribution and

V = (n − 1)S²/σ²

has a chi-squared distribution with v = n − 1 degrees of freedom. In sampling from normal populations, we can show that X̄ and S² are independent, and consequently so are Z and V. The following theorem gives the definition of a random variable T as a function of Z (standard normal) and χ². For completeness, the density function of the t-distribution is given.

Theorem 8.5: Let Z be a standard normal random variable and V a chi-squared random variable with v degrees of freedom. If Z and V are independent, then the distribution of the random variable T, where

T = Z/√(V/v),

is given by the density function

h(t) = Γ[(v + 1)/2] / [Γ(v/2)√(πv)] · (1 + t²/v)^(−(v+1)/2),  −∞ < t < ∞.

This is known as the t-distribution with v degrees of freedom.

…If μ > 500, the value of t computed from the sample is more reasonable. Hence, the engineer is likely to conclude that the process produces a better product than he thought.
What Is the t-Distribution Used For?
The t-distribution is used extensively in problems that deal with inference about the population mean (as illustrated in Example 8.11) or in problems that involve comparative samples (i.e., in cases where one is trying to determine if means from two samples are significantly different). The use of the distribution will be extended in Chapters 9, 10, 11, and 12. The reader should note that use of the t-distribution for the statistic
T = (X̄ − μ)/(S/√n)
requires that X1, X2, . . . , Xn be normal. The use of the t-distribution and the sample size consideration do not relate to the Central Limit Theorem. The use of the standard normal distribution rather than T for n ≥ 30 merely implies that S is a sufficiently good estimator of σ in this case. In chapters that follow the t-distribution finds extensive usage.
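Because Table A.4 tabulates only selected tail areas, a computational check is often convenient. A minimal sketch, assuming scipy, of the kinds of t-values and probabilities used with this statistic:

from scipy.stats import t

print(t.isf(0.025, 14))   # t_0.025 with v = 14 (about 2.145)
print(t.cdf(2.365, 7))    # P(T < 2.365) when v = 7 (about 0.975)
print(t.sf(1.318, 24))    # P(T > 1.318) when v = 24 (about 0.10)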
8.7 F-Distribution
We have motivated the t-distribution in part by its application to problems in which there is comparative sampling (i.e., a comparison between two sample means). For example, some of our examples in future chapters will take a more formal approach: a chemical engineer collects data on two catalysts, a biologist collects data on two growth media, or a chemist gathers data on two methods of coating material to inhibit corrosion. While it is of interest to let sample information shed light on two population means, it is often the case that a comparison of variability is equally important, if not more so. The F-distribution finds enormous application in comparing sample variances. Applications of the F-distribution are found in problems involving two or more samples.
The statistic F is defined to be the ratio of two independent chi-squared random variables, each divided by its number of degrees of freedom. Hence, we can write
F = (U/v1)/(V/v2),
where U and V are independent random variables having chi-squared distributions with v1 and v2 degrees of freedom, respectively. We shall now state the sampling distribution of F.
Theorem 8.6: Let U and V be two independent random variables having chi-squared distributions with v1 and v2 degrees of freedom, respectively. Then the distribution of the random variable F = (U/v1)/(V/v2) is given by the density function

h(f) = Γ[(v1 + v2)/2] (v1/v2)^(v1/2) / [Γ(v1/2) Γ(v2/2)] · f^(v1/2 − 1) / (1 + v1 f/v2)^((v1+v2)/2), for f > 0,

and h(f) = 0 for f ≤ 0. This is known as the F-distribution with v1 and v2 degrees of freedom (d.f.).
We will make considerable use of the random variable F in future chapters. However, the density function will not be used and is given only for completeness. The curve of the F-distribution depends not only on the two parameters v1 and v2 but also on the order in which we state them. Once these two values are given, we can identify the curve. Typical F-distributions are shown in Figure 8.11.
Let fα be the f-value above which we find an area equal to α. This is illustrated by the shaded region in Figure 8.12. Table A.6 gives values of fα only for α = 0.05 and α = 0.01 for various combinations of the degrees of freedom v1 and v2. Hence, the f-value with 6 and 10 degrees of freedom, leaving an area of 0.05 to the right, is f0.05 = 3.22. By means of the following theorem, Table A.6 can also be used to find values of f0.95 and f0.99. The proof is left for the reader.

[Figure 8.11: Typical F-distributions, with d.f. = (10, 30) and d.f. = (6, 10).]
[Figure 8.12: Illustration of fα for the F-distribution: the area α lies to the right of fα.]
Theorem 8.7: Writing fα(v1, v2) for fα with v1 and v2 degrees of freedom, we obtain

f_{1−α}(v1, v2) = 1/fα(v2, v1).
Thus, the f-value with 6 and 10 degrees of freedom, leaving an area of 0.95 to the right, is
f0.95(6, 10) = 1/f0.05(10, 6) = 1/4.06 = 0.246.
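Theorem 8.7 is easy to confirm numerically; the following sketch (assuming scipy) reproduces the f0.95(6, 10) calculation above:

from scipy.stats import f

f_upper = f.isf(0.05, 10, 6)     # f_0.05 with (10, 6) d.f., about 4.06
f_lower = f.isf(0.95, 6, 10)     # f_0.95 with (6, 10) d.f.
print(f_lower, 1.0 / f_upper)    # both about 0.246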
The F-Distribution with Two Sample Variances
Suppose that random samples of size n1 and n2 are selected from two normal populations with variances σ1² and σ2², respectively. From Theorem 8.4, we know that

χ1² = (n1 − 1)S1²/σ1²  and  χ2² = (n2 − 1)S2²/σ2²

are random variables having chi-squared distributions with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom. Furthermore, since the samples are selected at random, we are dealing with independent random variables. Then, using Theorem 8.6 with χ1² = U and χ2² = V, we obtain the following result.
Theorem 8.8: If S1² and S2² are the variances of independent random samples of size n1 and n2 taken from normal populations with variances σ1² and σ2², respectively, then

F = (S1²/σ1²)/(S2²/σ2²) = (σ2² S1²)/(σ1² S2²)

has an F-distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.

What Is the F-Distribution Used For?
We answered this question, in part, at the beginning of this section. The F-distribution is used in two-sample situations to draw inferences about the population variances. This involves the application of Theorem 8.8. However, the F-distribution can also be applied to many other types of problems involving sample variances. In fact, the F-distribution is called the variance ratio distribution. As an illustration, consider Case Study 8.2, in which two paints, A and B, were compared with regard to mean drying time. The normal distribution applies nicely (assuming that σA and σB are known). However, suppose that there are three types of paints to compare, say A, B, and C. We wish to determine if the population means are equivalent. Suppose that important summary information from the experiment is as follows:
Paint    Sample Mean    Sample Variance    Sample Size
A        X̄A = 4.5       s²A = 0.20         10
B        X̄B = 5.5       s²B = 0.14         10
C        X̄C = 6.5       s²C = 0.11         10
The problem centers around whether or not the sample averages (x ̄A, x ̄B, x ̄C) are far enough apart. The implication of “far enough apart” is very important. It would seem reasonable that if the variability between sample averages is larger than what one would expect by chance, the data do not support the conclusion that μA = μB = μC. Whether these sample averages could have occurred by chance depends on the variability within samples, as quantified by s2A, s2B, and s2C. The notion of the important components of variability is best seen through some simple graphics. Consider the plot of raw data from samples A, B, and C, shown in Figure 8.13. These data could easily have generated the above summary information.
[Figure 8.13: Data from three distinct samples, clustered around x̄A = 4.5, x̄B = 5.5, and x̄C = 6.5.]
It appears evident that the data came from distributions with different population means, although there is some overlap between the samples. An analysis that involves all of the data would attempt to determine if the variability between the sample averages and the variability within the samples could have occurred jointly if in fact the populations have a common mean. Notice that the key to this analysis centers around the two following sources of variability.
(1) Variability within samples (between observations in distinct samples)
(2) Variability between samples (between sample averages)
Clearly, if the variability in (1) is considerably larger than that in (2), there will be considerable overlap in the sample data, a signal that the data could all have come
from a common distribution. An example is found in the data set shown in Figure 8.14. On the other hand, it is very unlikely that data from distributions with a common mean could have variability between sample averages that is considerably larger than the variability within samples.
[Figure 8.14: Data that easily could have come from the same population.]
The sources of variability in (1) and (2) above generate important ratios of sample variances, and ratios are used in conjunction with the F-distribution. The general procedure involved is called analysis of variance. It is interesting that in the paint example described here, we are dealing with inferences on three population means, but two sources of variability are used. We will not supply details here, but in Chapters 13 through 15 we make extensive use of analysis of variance, and, of course, the F-distribution plays an important role.
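Although the formal analysis of variance is deferred to Chapter 13, the two sources of variability in the paint summary can already be combined into a variance ratio. The sketch below is ours, assumes scipy, and anticipates the later development rather than reproducing it:

import numpy as np
from scipy.stats import f

means = np.array([4.5, 5.5, 6.5])          # sample means for A, B, C
variances = np.array([0.20, 0.14, 0.11])   # sample variances for A, B, C
n, k = 10, 3                               # 10 observations in each of 3 samples

ms_between = n * means.var(ddof=1)   # variability between sample averages
ms_within = variances.mean()         # variability within samples
f_ratio = ms_between / ms_within     # about 66.7

# Right-tail area under the F-distribution with (k-1, k(n-1)) d.f.:
print(f_ratio, f.sf(f_ratio, k - 1, k * (n - 1)))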
8.8 Quantile and Probability Plots
In Chapter 1 we introduced the reader to empirical distributions. The motivation is to use creative displays to extract information about properties of a set of data. For example, stem-and-leaf plots provide the viewer with a look at symmetry and other properties of the data. In this chapter we deal with samples, which, of course, are collections of experimental data from which we draw conclusions about populations. Often the appearance of the sample provides information about the distribution from which the data are taken. For example, in Chapter 1 we illustrated the general nature of pairs of samples with point plots that displayed a relative comparison between central tendency and variability in two samples.
In chapters that follow, we often make the assumption that a distribution is normal. Graphical information regarding the validity of this assumption can be retrieved from displays like stem-and-leaf plots and frequency histograms. In addition, we will introduce the notion of normal probability plots and quantile plots in this section. These plots are used in studies that have varying degrees of complexity, with the main objective of the plots being to provide a diagnostic check on the assumption that the data came from a normal distribution.
We can characterize statistical analysis as the process of drawing conclusions about systems in the presence of system variability. For example, an engineer’s attempt to learn about a chemical process is often clouded by process variability. A study involving the number of defective items in a production process is often made more difficult by variability in the method of manufacture of the items. In what has preceded, we have learned about samples and statistics that express center of location and variability in the sample. These statistics provide single measures, whereas a graphical display adds additional information through a picture.
One type of plot that can be particularly useful in characterizing the nature of a data set is the quantile plot. As in the case of the box-and-whisker plot (Section 1.6), one can use the basic ideas in the quantile plot to compare samples of data, where the goal of the analyst is to draw distinctions. Further illustrations of this type of usage of quantile plots will be given in future chapters where the formal statistical inference associated with comparing samples is discussed. At that point, case studies will expose the reader to both the formal inference and the diagnostic graphics for the same data set.

The purpose of the quantile plot is to depict, in sample form, the cumulative distribution function discussed in Chapter 3.

Definition 8.6: A quantile of a sample, q(f), is a value for which a specified fraction f of the data values is less than or equal to q(f).

Obviously, a quantile represents an estimate of a characteristic of a population, or rather, the theoretical distribution. The sample median is q(0.5). The 75th percentile (upper quartile) is q(0.75) and the lower quartile is q(0.25).

A quantile plot simply plots the data values on the vertical axis against an empirical assessment of the fraction of observations exceeded by the data value. For theoretical purposes, this fraction is computed as

fi = (i − 3/8) / (n + 1/4),
where i is the order of the observations when they are ranked from low to high. In other words, if we denote the ranked observations as
y(1) ≤ y(2) ≤ y(3) ≤ ··· ≤ y(n−1) ≤ y(n),
then the quantile plot depicts a plot of y(i) against fi. In Figure 8.15, the quantile plot is given for the paint can ear data discussed previously.

[Figure 8.15: Quantile plot for paint data; ordered observations plotted against the fraction f.]
Unlike the box-and-whisker plot, the quantile plot actually shows all observations. All quantiles, including the median and the upper and lower quartiles, can be approximated visually. For example, we readily observe a median of 35 and an upper quartile of about 36. Relatively large clusters around specific values are indicated by slopes near zero, while sparse data in certain areas produce steeper slopes. Figure 8.15 depicts sparsity of data from the values 28 through 30 but relatively high density at 36 through 38. In Chapters 9 and 10 we pursue quantile plotting further by illustrating useful ways of comparing distinct samples.
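The plotting positions fi and the coordinates of a quantile plot are straightforward to compute; a minimal sketch using numpy:

import numpy as np

def quantile_plot_points(data):
    # Return (f_i, y_(i)) pairs: plot y_(i) against f_i for the quantile plot.
    y = np.sort(np.asarray(data, dtype=float))
    n = len(y)
    i = np.arange(1, n + 1)
    f = (i - 3.0 / 8.0) / (n + 1.0 / 4.0)
    return f, y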
It should be somewhat evident to the reader that detection of whether or not a data set came from a normal distribution can be an important tool for the data analyst. As we indicated earlier in this section, we often make the assumption that all or subsets of observations in a data set are realizations of independent identically distributed normal random variables. Once again, the diagnostic plot can often nicely augment (for display purposes) a formal goodness-of-fit test on the data. Goodness-of-fit tests are discussed in Chapter 10. Readers of a scientific paper or report tend to find diagnostic information much clearer, less dry, and perhaps less boring than a formal analysis. In later chapters (Chapters 9 through 13), we focus
again on methods of detecting deviations from normality as an augmentation of formal statistical inference. Quantile plots are useful in detection of distribution types. There are also situations in both model building and design of experiments in which the plots are used to detect important model terms or effects that are active. In other situations, they are used to determine whether or not the underlying assumptions made by the scientist or engineer in building the model are reasonable. Many examples with illustrations will be encountered in Chapters 11, 12, and 13. The following subsection provides a discussion and illustration of a diagnostic plot called the normal quantile-quantile plot.
Normal Quantile-Quantile Plot
The normal quantile-quantile plot takes advantage of what is known about the quantiles of the normal distribution. The methodology involves a plot of the empirical quantiles recently discussed against the corresponding quantile of the normal distribution. Now, the expression for a quantile of an N(μ, σ) random variable is very complicated. However, a good approximation is given by
q_{μ,σ}(f) = μ + σ{4.91[f^0.14 − (1 − f)^0.14]}.

The expression in braces (the multiple of σ) is the approximation for the corresponding quantile of the N(0, 1) random variable, that is,

q_{0,1}(f) = 4.91[f^0.14 − (1 − f)^0.14].
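The approximation above tracks the exact standard normal quantile closely over the usual plotting range. A quick comparison, assuming scipy for the exact quantiles:

import numpy as np
from scipy.stats import norm

f = np.linspace(0.01, 0.99, 99)
approx = 4.91 * (f**0.14 - (1 - f)**0.14)   # approximate q_{0,1}(f)
exact = norm.ppf(f)                          # exact standard normal quantile
print(np.max(np.abs(approx - exact)))        # small over this range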
Definition 8.7: The normal quantile-quantile plot is a plot of y(i) (ordered observations) against q_{0,1}(fi), where fi = (i − 3/8)/(n + 1/4).

A nearly straight-line relationship suggests that the data came from a normal distribution. The intercept on the vertical axis is an estimate of the population mean μ and the slope is an estimate of the standard deviation σ. Figure 8.16 shows a normal quantile-quantile plot for the paint can data.

[Figure 8.16: Normal quantile-quantile plot for paint data; ordered observations plotted against the standard normal quantile q_{0,1}(f).]

Notice how the deviation from normality becomes clear from the appearance of the plot. The asymmetry exhibited in the data results in changes in the slope.

Normal Probability Plotting

The ideas of probability plotting are manifested in plots other than the normal quantile-quantile plot discussed here. For example, much attention is given to the so-called normal probability plot, in which f is plotted against the ordered data values on special paper and the scale used results in a straight line. In addition, an alternative plot makes use of the expected values of the ranked observations for the normal distribution and plots the ranked observations against their expected value, under the assumption of data from N(μ, σ). Once again, the straight line is the graphical yardstick used. We continue to suggest that the foundation in graphical analytical methods developed in this section will aid in understanding formal methods of distinguishing between distinct samples of data.
Example 8.12: Consider the data in Exercise 10.41 on page 358 in Chapter 10. In a study "Nutrient Retention and Macro Invertebrate Community Response to Sewage Stress in a Stream Ecosystem," conducted in the Department of Zoology at the Virginia Polytechnic Institute and State University, data were collected on density measurements (number of organisms per square meter) at two different collecting stations. Details are given in Chapter 10 regarding analytical methods of comparing samples to determine if both are from the same N(μ, σ) distribution. The data are given in Table 8.1.
Construct a normal quantile-quantile plot and draw conclusions regarding whether or not it is reasonable to assume that the two samples are from the same n(x; μ, σ) distribution.

Table 8.1: Data for Example 8.12 (Number of Organisms per Square Meter)

Station 1: 5,030  13,700  10,730  11,400  860  2,200  4,250  15,040  4,980  11,910  8,130  26,850  17,660  22,800  1,130  1,690
Station 2: 2,800  4,670  6,890  7,720  7,030  7,330  2,810  1,330  3,320  1,230  2,130  2,190
[Figure 8.17: Normal quantile-quantile plot for density data of Example 8.12; quantiles of each station plotted against the standard normal quantile q_{0,1}(f), with separate symbols for Station 1 and Station 2.]
Solution : Figure 8.17 shows the normal quantile-quantile plot for the density measurements. The plot is far from a single straight line. In fact, the data from station 1 reflect a few values in the lower tail of the distribution and several in the upper tail. The “clustering” of observations would make it seem unlikely that the two samples came from a common N(μ,σ) distribution.
Although we have concentrated our development and illustration on probability plotting for the normal distribution, we could focus on any distribution. We would merely need to compute quantities analytically for the theoretical distribution in question.
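Figure 8.17 can be regenerated from Table 8.1. A sketch of the computation, assuming numpy and scipy; any plotting tool can then display the two point sets on common axes:

import numpy as np
from scipy.stats import norm

station1 = [5030, 13700, 10730, 11400, 860, 2200, 4250, 15040,
            4980, 11910, 8130, 26850, 17660, 22800, 1130, 1690]
station2 = [2800, 4670, 6890, 7720, 7030, 7330,
            2810, 1330, 3320, 1230, 2130, 2190]

def qq_points(data):
    # Coordinates for a normal quantile-quantile plot of one sample.
    y = np.sort(np.asarray(data, dtype=float))
    n = len(y)
    f = (np.arange(1, n + 1) - 3.0 / 8.0) / (n + 1.0 / 4.0)
    return norm.ppf(f), y

q1, y1 = qq_points(station1)
q2, y2 = qq_points(station2)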
Exercises
8.37 For a chi-squared distribution, find
(a) χ²0.025 when v = 15;
(b) χ²0.01 when v = 7;
(c) χ²0.05 when v = 24.

8.38 For a chi-squared distribution, find
(a) χ²0.005 when v = 5;
(b) χ²0.05 when v = 19;
(c) χ²0.01 when v = 12.

8.39 For a chi-squared distribution, find χ²α such that
(a) P(X² > χ²α) = 0.99 when v = 4;
(b) P(X² > χ²α) = 0.025 when v = 19;
(c) P(37.652 < X² < χ²α) = 0.045 when v = 25.

8.40 For a chi-squared distribution, find χ²α such that
(a) P(X² > χ²α) = 0.01 when v = 21;
(b) P(X² < χ²α) = 0.95 when v = 6;
(c) P(χ²α < X² < 23.209) = 0.015 when v = 10.

8.41 Assume the sample variances to be continuous measurements. Find the probability that a random sample of 25 observations, from a normal population with variance σ² = 6, will have a sample variance S²
(a) greater than 9.1;
(b) between 3.462 and 10.745.

8.42 The scores on a placement test given to college freshmen for the past five years are approximately normally distributed with a mean μ = 74 and a variance σ² = 8. Would you still consider σ² = 8 to be a valid value of the variance if a random sample of 20 students who take the placement test this year obtain a value of s² = 20?

8.43 Show that the variance of S² for random samples of size n from a normal population decreases as n becomes large. [Hint: First find the variance of (n − 1)S²/σ².]

8.44 (a) Find t0.025 when v = 14.
(b) Find −t0.10 when v = 10.
(c) Find t0.995 when v = 7.

8.45 (a) Find P(T < 2.365) when v = 7.
(b) Find P(T > 1.318) when v = 24.
(c) Find P(−1.356 < T < 2.179) when v = 12.
(d) Find P(T > −2.567) when v = 17.

8.46 (a) Find P(−t0.005 < T < t0.01) for v = 20.
(b) Find P(T > −t0.025).

8.47 Given a random sample of size 24 from a normal distribution, find k such that
(a) P(−2.069 < T < k) = 0.965;
(b) P(k < T < 2.807) = 0.095;
(c) P(−k < T < k) = 0.90.
8.65 Consider Example 1.5 on page 25. Comment on
any outliers.
8.66 Consider Review Exercise 8.56. Comment on any outliers in the data.
8.67 The breaking strength X of a certain rivet used in a machine engine has a mean 5000 psi and standard deviation 400 psi. A random sample of 36 rivets is taken. Consider the distribution of X̄, the sample mean breaking strength.
(a) What is the probability that the sample mean falls between 4800 psi and 5200 psi?
(b) What sample size n would be necessary in order to have P(4900 < X̄ < 5100) = 0.99?

8.68 Consider the situation of Review Exercise 8.62. If the population from which the sample was taken has population mean μ = 53,000 kilometers, does the sample information here seem to support that claim? In your answer, compute

t = (x̄ − 53,000)/(s/√10)

and determine from Table A.4 (with 9 d.f.) whether the computed t-value is reasonable or appears to be a rare event.

8.69 Two distinct solid fuel propellants, type A and type B, are being considered for a space program activity. Burning rates of the propellant are crucial. Random samples of 20 specimens of the two propellants are taken with sample means 20.5 cm/sec for propellant A and 24.50 cm/sec for propellant B. It is generally assumed that the variability in burning rate is roughly the same for the two propellants and is given by a population standard deviation of 5 cm/sec. Assume that the burning rates for each propellant are approximately normal and hence make use of the Central Limit Theorem. Nothing is known about the two population mean burning rates, and it is hoped that this experiment might shed some light on them.
(a) If, indeed, μA = μB, what is P(X̄B − X̄A ≥ 4.0)?
(b) Use your answer in (a) to shed some light on the proposition that μA = μB.

8.70 The concentration of an active ingredient in the output of a chemical reaction is strongly influenced by the catalyst that is used in the reaction. It is felt that when catalyst A is used, the population mean concentration exceeds 65%. The standard deviation is known to be σ = 5%. A sample of outputs from 30 independent experiments gives the average concentration of x̄A = 64.5%.
(a) Does this sample information with an average concentration of x̄A = 64.5% provide disturbing information that perhaps μA is not 65%, but less than 65%? Support your answer with a probability statement.
(b) Suppose a similar experiment is done with the use of another catalyst, catalyst B. The standard deviation σ is still assumed to be 5% and x̄B turns out to be 70%. Comment on whether or not the sample information on catalyst B strongly suggests that μB is truly greater than μA. Support your answer by computing P(X̄B − X̄A ≥ 5.5 | μB = μA).
(c) Under the condition that μA = μB = 65%, give the approximate distribution of the following quantities (with mean and variance of each). Make use of the Central Limit Theorem.
(i) X̄B;
(ii) X̄A − X̄B;
(iii) (X̄A − X̄B)/√(2σ²/30).

8.71 From the information in Review Exercise 8.70, compute (assuming μB = 65%) P(X̄B ≥ 70).

8.72 Given a normal random variable X with mean 20 and variance 9, and a random sample of size n taken from the distribution, what sample size n is necessary in order that P(19.9 ≤ X̄ ≤ 20.1) = 0.95?

8.73 In Chapter 9, the concept of parameter estimation will be discussed at length. Suppose X is a random variable with mean μ and variance σ² = 1.0. Suppose also that a random sample of size n is to be taken and x̄ is to be used as an estimate of μ. When the data are taken and the sample mean is measured, we wish it to be within 0.05 unit of the true mean with probability 0.99. That is, we want there to be a good chance that the computed x̄ from the sample is "very close" to the population mean (wherever it is!), so we wish

P(|X̄ − μ| < 0.05) = 0.99.

What sample size is required?
8.74 Suppose a filling machine is used to fill cartons with a liquid product. The specification that is strictly enforced for the filling machine is 9 ± 1.5 oz. If any carton is produced with weight outside these bounds, it is considered by the supplier to be defective. It is hoped that at least 99% of cartons will meet these specifications. With the conditions μ = 9 and σ = 1, what proportion of cartons from the process are defective? If changes are made to reduce variability, what must σ be reduced to in order to meet specifications with probability 0.99? Assume a normal distribution for the weight.
8.75 Consider the situation in Review Exercise 8.74. Suppose a considerable effort is conducted to "tighten" the variability in the system. Following the effort, a random sample of size 40 is taken from the new assembly line and the sample variance is s² = 0.188 ounces². Do we have strong numerical evidence that σ² has been reduced below 1.0? Consider the probability

P(S² ≤ 0.188 | σ² = 1.0),

and give your conclusion.
8.76 Group Project: The class should be divided into groups of four people. The four students in each group should go to the college gym or a local fitness center. The students should ask each person who comes through the door his or her height in inches. Each group will then divide the height data by gender and work together to answer the following questions.
(a) Construct a normal quantile-quantile plot of the data. Based on the plot, do the data appear to follow a normal distribution?
(b) Use the estimated sample variance as the true variance for each gender. Assume that the population mean height for male students is actually three inches larger than that of female students. What is the probability that the average height of the male students will be 4 inches larger than that of the female students in your sample?
(c) What factors could render these results misleading?
8.9 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
The Central Limit Theorem is one of the most powerful tools in all of statistics, and even though this chapter is relatively short, it contains a wealth of fundamental information about tools that will be used throughout the balance of the text.
The notion of a sampling distribution is one of the most important fundamental concepts in all of statistics, and the student at this point in his or her training should gain a clear understanding of it before proceeding beyond this chapter. All chapters that follow will make considerable use of sampling distributions. Suppose one wants to use the statistic X ̄ to draw inferences about the population mean μ. This will be done by using the observed value x ̄ from a single sample of size n. Then any inference made must be accomplished by taking into account not just the single value but rather the theoretical structure, or distribution of all x ̄ values that could be observed from samples of size n. Thus, the concept of a sampling distribution comes to the surface. This distribution is the basis for the Central Limit Theorem. The t, χ2, and F-distributions are also used in the context of sampling distributions. For example, the t-distribution, pictured in Figure 8.8,
represents the structure that occurs if all of the values of (x̄ − μ)/(s/√n) are formed, where
x̄ and s are taken from samples of size n from a n(x; μ, σ) distribution. Similar remarks can be made about χ² and F, and the reader should not forget that the sample information forming the statistics for all of these distributions is the normal. So it can be said that where there is a t, F, or χ², the source was a sample from a normal distribution.
The three distributions described above may appear to have been introduced in a rather self-contained fashion with no indication of what they are about. However, they will appear in practical problem-solving throughout the balance of the text.
Now, there are three things that one must bear in mind, lest confusion set in regarding these fundamental sampling distributions:
(i) One cannot use the Central Limit Theorem unless σ is known. When σ is not known, it should be replaced by s, the sample standard deviation, in order to use the Central Limit Theorem.
(ii) The T statistic is not a result of the Central Limit Theorem; x1, x2, . . . , xn must come from an n(x; μ, σ) distribution in order for (x̄ − μ)/(s/√n) to follow a t-distribution; s is, of course, merely an estimate of σ.
(iii) While the notion of degrees of freedom is new at this point, the concept should be very intuitive, since it is reasonable that the nature of the distribution of S and also t should depend on the amount of information in the sample x1, x2, . . . , xn.
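Point (ii) is visible in a quick simulation: for small n, the statistic (x̄ − μ)/(s/√n) computed from normal samples has heavier tails than the standard normal predicts, exactly as the t-distribution requires. A sketch, assuming numpy and scipy:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, reps = 5, 100_000
samples = rng.normal(0.0, 1.0, size=(reps, n))
T = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))

print((np.abs(T) > 2.776).mean())   # near 0.05 = P(|T| > t_0.025) for v = 4
print(2 * norm.sf(2.776))           # about 0.006: the normal understates it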
Chapter 9
One- and Two-Sample Estimation Problems
9.1 Introduction
In previous chapters, we emphasized sampling properties of the sample mean and variance. We also emphasized displays of data in various forms. The purpose of these presentations is to build a foundation that allows us to draw conclusions about the population parameters from experimental data. For example, the Central Limit Theorem provides information about the distribution of the sample mean X ̄. The distribution involves the population mean μ. Thus, any conclusions concerning μ drawn from an observed sample average must depend on knowledge of this sampling distribution. Similar comments apply to S2 and σ2. Clearly, any conclusions we draw about the variance of a normal distribution will likely involve the sampling distribution of S2.
In this chapter, we begin by formally outlining the purpose of statistical inference. We follow this by discussing the problem of estimation of population parameters. We confine our formal developments of specific estimation procedures to problems involving one and two samples.
9.2 Statistical Inference
In Chapter 1, we discussed the general philosophy of formal statistical inference. Statistical inference consists of those methods by which one makes inferences or generalizations about a population. The trend today is to distinguish between the classical method of estimating a population parameter, whereby inferences are based strictly on information obtained from a random sample selected from the population, and the Bayesian method, which utilizes prior subjective knowledge about the probability distribution of the unknown parameters in conjunction with the information provided by the sample data. Throughout most of this chapter, we shall use classical methods to estimate unknown population parameters such as the mean, the proportion, and the variance by computing statistics from random
samples and applying the theory of sampling distributions, much of which was covered in Chapter 8. Bayesian estimation will be discussed in Chapter 18.
Statistical inference may be divided into two major areas: estimation and tests of hypotheses. We treat these two areas separately, dealing with theory and applications of estimation in this chapter and hypothesis testing in Chapter 10. To distinguish clearly between the two areas, consider the following examples. A candidate for public office may wish to estimate the true proportion of voters favoring him by obtaining opinions from a random sample of 100 eligible voters. The fraction of voters in the sample favoring the candidate could be used as an estimate of the true proportion in the population of voters. A knowledge of the sampling distribution of a proportion enables one to establish the degree of accuracy of such an estimate. This problem falls in the area of estimation.
Now consider the case in which one is interested in finding out whether brand A floor wax is more scuff-resistant than brand B floor wax. He or she might hypothesize that brand A is better than brand B and, after proper testing, accept or reject this hypothesis. In this example, we do not attempt to estimate a parameter, but instead we try to arrive at a correct decision about a prestated hypothesis. Once again we are dependent on sampling theory and the use of data to provide us with some measure of accuracy for our decision.
9.3 Classical Methods of Estimation
A point estimate of some population parameter θ is a single value θˆ of a statistic Θˆ. For example, the value x ̄ of the statistic X ̄, computed from a sample of size n, is a point estimate of the population parameter μ. Similarly, pˆ = x/n is a point estimate of the true proportion p for a binomial experiment.
An estimator is not expected to estimate the population parameter without error. We do not expect X ̄ to estimate μ exactly, but we certainly hope that it is not far off. For a particular sample, it is possible to obtain a closer estimate of μ by using the sample median X ̃ as an estimator. Consider, for instance, a sample consisting of the values 2, 5, and 11 from a population whose mean is 4 but is supposedly unknown. We would estimate μ to be x ̄ = 6, using the sample mean as our estimate, or x ̃ = 5, using the sample median as our estimate. In this case, the estimator X ̃ produces an estimate closer to the true parameter than does the estimator X ̄. On the other hand, if our random sample contains the values 2, 6, and 7, then x ̄ = 5 and x ̃ = 6, so X ̄ is the better estimator. Not knowing the true value of μ, we must decide in advance whether to use X ̄ or X ̃ as our estimator.
Unbiased Estimator
What are the desirable properties of a "good" decision function that would influence us to choose one estimator rather than another? Let Θ̂ be an estimator whose value θ̂ is a point estimate of some unknown population parameter θ. Certainly, we would like the sampling distribution of Θ̂ to have a mean equal to the parameter estimated. An estimator possessing this property is said to be unbiased.

Definition 9.1: A statistic Θ̂ is said to be an unbiased estimator of the parameter θ if

μ_Θ̂ = E(Θ̂) = θ.

Example 9.1: Show that S² is an unbiased estimator of the parameter σ².

Solution: In Section 8.5 on page 244, we showed that

Σ_{i=1}^{n} (Xi − X̄)² = Σ_{i=1}^{n} (Xi − μ)² − n(X̄ − μ)².

Now

E(S²) = E[(1/(n − 1)) Σ_{i=1}^{n} (Xi − X̄)²] = (1/(n − 1)) [Σ_{i=1}^{n} E(Xi − μ)² − n E(X̄ − μ)²]
      = (1/(n − 1)) (Σ_{i=1}^{n} σ²_{Xi} − n σ²_{X̄}).

However,

σ²_{Xi} = σ², for i = 1, 2, . . . , n, and σ²_{X̄} = σ²/n.

Therefore,

E(S²) = (1/(n − 1)) (nσ² − n · σ²/n) = σ².
Although S2 is an unbiased estimator of σ2, S, on the other hand, is usually a biased estimator of σ, with the bias becoming insignificant for large samples. This example illustrates why we divide by n − 1 rather than n when the variance is estimated.
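The role of the n − 1 divisor can be seen in a small simulation: dividing by n − 1 centers the estimates on σ², while dividing by n systematically underestimates it. A sketch, assuming numpy:

import numpy as np

rng = np.random.default_rng(7)
n, reps, sigma2 = 5, 200_000, 4.0
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

print(x.var(axis=1, ddof=1).mean())   # near 4.0: unbiased
print(x.var(axis=1, ddof=0).mean())   # near 3.2 = (n-1)/n * sigma^2: biased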
Variance of a Point Estimator
If Θ̂1 and Θ̂2 are two unbiased estimators of the same population parameter θ, we want to choose the estimator whose sampling distribution has the smaller variance. Hence, if σ²_Θ̂1 < σ²_Θ̂2, we say that Θ̂1 is a more efficient estimator of θ than Θ̂2.

Definition 9.2: If we consider all possible unbiased estimators of some parameter θ, the one with the smallest variance is called the most efficient estimator of θ.

Figure 9.1 illustrates the sampling distributions of three different estimators, Θ̂1, Θ̂2, and Θ̂3, all estimating θ. It is clear that only Θ̂1 and Θ̂2 are unbiased, since their distributions are centered at θ. The estimator Θ̂1 has a smaller variance than Θ̂2 and is therefore more efficient. Hence, our choice for an estimator of θ, among the three considered, would be Θ̂1.

[Figure 9.1: Sampling distributions of different estimators of θ.]

For normal populations, one can show that both X̄ and X̃ are unbiased estimators of the population mean μ, but the variance of X̄ is smaller than the variance of X̃. Thus, both estimates x̄ and x̃ will, on average, equal the population mean μ, but x̄ is likely to be closer to μ for a given sample, and thus X̄ is more efficient than X̃.

Interval Estimation

Even the most efficient unbiased estimator is unlikely to estimate the population parameter exactly. It is true that estimation accuracy increases with large samples, but there is still no reason we should expect a point estimate from a given sample to be exactly equal to the population parameter it is supposed to estimate. There are many situations in which it is preferable to determine an interval within which we would expect to find the value of the parameter. Such an interval is called an interval estimate.

An interval estimate of a population parameter θ is an interval of the form θ̂L < θ < θ̂U, where θ̂L and θ̂U depend on the value of the statistic Θ̂ for a particular sample and also on the sampling distribution of Θ̂. For example, a random sample of SAT verbal scores for students in the entering freshman class might produce an interval from 530 to 550, within which we expect to find the true average of all SAT verbal scores for the freshman class. The values of the endpoints, 530 and 550, will depend on the computed sample mean x̄ and the sampling distribution of X̄. As the sample size increases, we know that σ²_X̄ = σ²/n decreases, and consequently our estimate is likely to be closer to the parameter μ, resulting in a shorter interval. Thus, the interval estimate indicates, by its length, the accuracy of the point estimate. An engineer will gain some insight into the population proportion defective by taking a sample and computing the sample proportion defective. But an interval estimate might be more informative.

Interpretation of Interval Estimates

Since different samples will generally yield different values of Θ̂ and, therefore, different values for θ̂L and θ̂U, these endpoints of the interval are values of corresponding random variables Θ̂L and Θ̂U. From the sampling distribution of Θ̂ we shall be able to determine Θ̂L and Θ̂U such that P(Θ̂L < θ < Θ̂U) is equal to any positive fractional value we care to specify. If, for instance, we find Θ̂L and Θ̂U such that

P(Θ̂L < θ < Θ̂U) = 1 − α,

for 0 < α < 1, then we have a probability of 1 − α of selecting a random sample that will produce an interval containing θ.
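The probability statement above has a direct long-run frequency interpretation that simulation makes concrete. A sketch, assuming numpy and scipy, with μ, σ, and n chosen purely for illustration:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
mu, sigma, n, alpha = 50.0, 8.0, 40, 0.05
z = norm.isf(alpha / 2)

x_bar = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
half_width = z * sigma / np.sqrt(n)
covered = (x_bar - half_width < mu) & (mu < x_bar + half_width)
print(covered.mean())   # close to 1 - alpha = 0.95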
The interval θ̂L < θ < θ̂U, computed from the selected sample, is called a 100(1 − α)% confidence interval, the fraction 1 − α is called the confidence coefficient or the degree of confidence, and the endpoints, θ̂L and θ̂U, are called the lower and upper confidence limits. Thus, when α = 0.05, we have a 95% confidence interval, and when α = 0.01, we obtain a wider 99% confidence interval. The wider the confidence interval is, the more confident we can be that the interval contains the unknown parameter. Of course, it is better to be 95% confident that the average life of a certain television transistor is between 6 and 7 years than to be 99% confident that it is between 3 and 10 years. Ideally, we prefer a short interval with a high degree of confidence. Sometimes, restrictions on the size of our sample prevent us from achieving short intervals without sacrificing some degree of confidence.

In the sections that follow, we pursue the notions of point and interval estimation, with each section presenting a different special case. The reader should notice that while point and interval estimation represent different approaches to gaining information regarding a parameter, they are related in the sense that confidence interval estimators are based on point estimators. In the following section, for example, we will see that X̄ is a very reasonable point estimator of μ. As a result, the important confidence interval estimator of μ depends on knowledge of the sampling distribution of X̄.

We begin the following section with the simplest case of a confidence interval. The scenario is simple and yet unrealistic. We are interested in estimating a population mean μ and yet σ is known. Clearly, if μ is unknown, it is quite unlikely that σ is known. Any historical results that produced enough information to allow the assumption that σ is known would likely have produced similar information about μ. Despite this argument, we begin with this case because the concepts and indeed the resulting mechanics associated with confidence interval estimation remain the same for the more realistic situations presented later in Section 9.4 and beyond.

9.4 Single Sample: Estimating the Mean

The sampling distribution of X̄ is centered at μ, and in most applications the variance is smaller than that of any other estimators of μ. Thus, the sample mean x̄ will be used as a point estimate for the population mean μ. Recall that σ²_X̄ = σ²/n, so a large sample will yield a value of X̄ that comes from a sampling distribution with a small variance. Hence, x̄ is likely to be a very accurate estimate of μ when n is large.

Let us now consider the interval estimate of μ. If our sample is selected from a normal population or, failing this, if n is sufficiently large, we can establish a confidence interval for μ by considering the sampling distribution of X̄.

According to the Central Limit Theorem, we can expect the sampling distribution of X̄ to be approximately normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ/√n. Writing zα/2 for the z-value above which we find an area of α/2 under the normal curve, we can see from Figure 9.2 that

P(−zα/2 < Z < zα/2) = 1 − α,

where

Z = (X̄ − μ)/(σ/√n).

Hence,

…P(μ > X̄ − zα σ/√n) = 1 − α.

Similar manipulation of P((X̄ − μ)/(σ/√n) > −zα) = 1 − α gives P(μ < X̄ + zα σ/√n) = 1 − α.
= 1 − α gives P ( μ < X ̄ + z α σ / √ n ) = 1 − α . One-Sided Confidence Bounds on μ, σ2 Known As a result, the upper and lower one-sided bounds follow. If X ̄ is the mean of a random sample of size n from a population with variance σ2, the one-sided 100(1 − α)% confidence bounds for μ are given by √ upper one-sided bound: x ̄ + zασ/√n; lower one-sided bound: x ̄ − zασ/ n. 274 Chapter 9 One- and Two-Sample Estimation Problems Example 9.4: In a psychological testing experiment, 25 subjects are selected randomly and their reaction time, in seconds, to a particular stimulus is measured. Past experience suggests that the variance in reaction times to these types of stimuli is 4 sec2 and that the distribution of reaction times is approximately normal. The average time for the subjects is 6.2 seconds. Give an upper 95% bound for the mean reaction time. Solution: The upper 95% bound is given by √ 􏰱 x ̄+zασ/ n=6.2+(1.645) 4/25=6.2+0.658 = 6.858 seconds. Hence, we are 95% confident that the mean reaction time is less than 6.858 seconds. The Case of σ Unknown Frequently, we must attempt to estimate the mean of a population when the vari- ance is unknown. The reader should recall learning in Chapter 8 that if we have a random sample from a normal distribution, then the random variable X ̄ − μ T = S/√n has a Student t-distribution with n − 1 degrees of freedom. Here S is the sample standard deviation. In this situation, with σ unknown, T can be used to construct a confidence interval on μ. The procedure is the same as that with σ known except that σ is replaced by S and the standard normal distribution is replaced by the t-distribution. Referring to Figure 9.5, we can assert that P(−tα/2 μ2 with little risk of being in error. For example, in Example 9.11, we are 90% confident that the interval from 0.593 to 1.547 contains the difference of the population means for values of the species diversity index at the two stations. The fact that both confidence limits are positive indicates that, on the average, the index for the station located downstream from the discharge point is greater than the index for the station located upstream.
Equal Sample Sizes
The procedure for constructing confidence intervals for μ1 − μ2 with σ1 = σ2 = σ unknown requires the assumption that the populations are normal. Slight departures from either the equal variance or the normality assumption do not seriously alter the degree of confidence for our interval. (A procedure is presented in Chapter 10 for testing the equality of two unknown population variances based on the information provided by the sample variances.) If the population variances are considerably different, we still obtain reasonable results when the populations are normal, provided that n1 = n2. Therefore, in planning an experiment, one should make every effort to equalize the size of the samples.
Unknown and Unequal Variances
Let us now consider the problem of finding an interval estimate of μ1 − μ2 when the unknown population variances are not likely to be equal. The statistic most often used in this case is
T′ = [(X̄1 − X̄2) − (μ1 − μ2)] / √(S1²/n1 + S2²/n2),

which has approximately a t-distribution with v degrees of freedom, where

v = (s1²/n1 + s2²/n2)² / {[(s1²/n1)²/(n1 − 1)] + [(s2²/n2)²/(n2 − 1)]}.
Since v is seldom an integer, we round it down to the nearest whole number. The above estimate of the degrees of freedom is called the Satterthwaite approximation (Satterthwaite, 1946, in the Bibliography).
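The Satterthwaite degrees of freedom are simple to compute directly. A sketch (the summary values below are hypothetical, chosen only for illustration):

import math

def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    # Approximate degrees of freedom, rounded down to the nearest integer.
    num = (s1_sq / n1 + s2_sq / n2) ** 2
    den = (s1_sq / n1) ** 2 / (n1 - 1) + (s2_sq / n2) ** 2 / (n2 - 1)
    return math.floor(num / den)

print(satterthwaite_df(s1_sq=9.0, n1=12, s2_sq=16.0, n2=15))   # 24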
Using the statistic T′, we write
P(−tα/2 < T′ < tα/2) ≈ 1 − α.

…f(x; β) = (1/β) e^(−x/β) for x > 0, and 0 elsewhere.
Thus, the log-likelihood function for the data, given n = 10, is

ln L(x1, x2, . . . , x10; β) = −10 ln β − (1/β) Σ_{i=1}^{10} xi.

Setting

∂ ln L/∂β = −10/β + (1/β²) Σ_{i=1}^{10} xi = 0

implies that

β̂ = (1/10) Σ_{i=1}^{10} xi = x̄ = 16.2.
Evaluating the second derivative of the log-likelihood function at the value β̂ above yields a negative value. As a result, the estimator of the parameter β, the population mean, is the sample average x̄.
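The closed-form answer β̂ = x̄ can also be confirmed by maximizing the log-likelihood numerically. A sketch, assuming scipy, using n = 10 and the sample total 162 (so that x̄ = 16.2, as in the example):

import numpy as np
from scipy.optimize import minimize_scalar

n, total = 10, 162.0
neg_log_lik = lambda b: n * np.log(b) + total / b   # negative log-likelihood

result = minimize_scalar(neg_log_lik, bounds=(0.1, 100.0), method="bounded")
print(result.x)   # about 16.2, agreeing with beta-hat = x-bar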
The following example shows the maximum likelihood estimator for a distribution that does not appear in previous chapters.
Example 9.23: It is known that a sample consisting of the values 12, 11.2, 13.5, 12.3, 13.8, and 11.9 comes from a population with the density function

f(x; θ) = θ/x^(θ+1), x > 1,
          0, elsewhere,

where θ > 0. Find the maximum likelihood estimate of θ.

Solution: The likelihood function of n observations from this population can be written as

L(x1, x2, . . . , x10; θ) = Π_{i=1}^{n} θ/xi^(θ+1) = θ^n / (Π_{i=1}^{n} xi)^(θ+1),

which implies that

ln L(x1, x2, . . . , x10; θ) = n ln(θ) − (θ + 1) Σ_{i=1}^{n} ln(xi).

Setting

∂ ln L/∂θ = n/θ − Σ_{i=1}^{n} ln(xi) = 0

results in

θ̂ = n / Σ_{i=1}^{n} ln(xi)
   = 6 / [ln(12) + ln(11.2) + ln(13.5) + ln(12.3) + ln(13.8) + ln(11.9)]
   = 0.3970.

Since the second derivative of the log-likelihood function is −n/θ², which is always negative, the likelihood function does achieve its maximum value at θ̂.
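The arithmetic in Example 9.23 takes one line to confirm with numpy:

import numpy as np

x = np.array([12, 11.2, 13.5, 12.3, 13.8, 11.9])
theta_hat = len(x) / np.log(x).sum()
print(theta_hat)   # about 0.3970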
Additional Comments Concerning Maximum Likelihood Estimation
A thorough discussion of the properties of maximum likelihood estimation is beyond the scope of this book and is usually a major topic of a course in the theory of statistical inference. The method of maximum likelihood allows the analyst to make use of knowledge of the distribution in determining an appropriate estimator. The method of maximum likelihood cannot be applied without knowledge of the underlying distribution. We learned in Example 9.21 that the maximum likelihood estimator is not necessarily unbiased. The maximum likelihood estimator is unbiased asymptotically or in the limit; that is, the amount of bias approaches zero as the sample size becomes large. Earlier in this chapter the notion of efficiency was discussed, efficiency being linked to the variance property of an estimator. Maximum likelihood estimators possess desirable variance properties in the limit. The reader should consult Lehmann and D'Abrera (1998) for details.
Exercises

9.81 Suppose that there are n trials x1, x2, . . . , xn from a Bernoulli process with parameter p, the probability of a success. That is, the probability of r successes is given by (n choose r) p^r (1 − p)^(n−r). Work out the maximum likelihood estimator for the parameter p.
9.82 Consider the lognormal distribution with the density function given in Section 6.9. Suppose we have a random sample x1, x2, . . . , xn from a lognormal distribution.
(a) Write out the likelihood function.
(b) Develop the maximum likelihood estimators of μ and σ².

9.83 Consider a random sample of x1, . . . , xn coming from the gamma distribution discussed in Section 6.6. Suppose the parameter α is known, say 5, and determine the maximum likelihood estimator for the parameter β.
9.84 Consider a random sample of observations x1, x2, . . . , xn from a Weibull distribution with parameters α and β and density function

f(x) = αβ x^(β−1) e^(−αx^β), x > 0,
       0, elsewhere,

for α, β > 0.
(a) Write out the likelihood function.
(b) Write out the equations that, when solved, give the maximum likelihood estimators of α and β.
9.85 Consider a random sample of x1, . . . , xn from a uniform distribution U(0,θ) with unknown parameter θ, where θ > 0. Determine the maximum likelihood estimator of θ.
9.86 Consider the independent observations x1 , x2 , . . . , xn from the gamma distribution discussed in Section 6.6.
(a) Write out the likelihood function.
(b) Write out a set of equations that, when solved, give the maximum likelihood estimators of α and β.

9.87 Consider a hypothetical experiment where a man with a fungus uses an antifungal drug and is cured. Consider this, then, a sample of one from a Bernoulli distribution with probability function

f(x) = p^x q^(1−x), x = 0, 1,

where p is the probability of a success (cure) and q = 1 − p. Now, of course, the sample information gives x = 1. Write out a development that shows that p̂ = 1.0 is the maximum likelihood estimator of the probability of a cure.

9.88 Consider the observation X from the negative binomial distribution given in Section 5.4. Find the maximum likelihood estimator for p, assuming k is known.

Review Exercises

9.89 Consider two estimators of σ² for a sample x1, x2, . . . , xn, which is drawn from a normal distribution with mean μ and variance σ². The estimators are the unbiased estimator s² = [1/(n − 1)] Σ_{i=1}^{n} (xi − x̄)² and the maximum likelihood estimator σ̂² = (1/n) Σ_{i=1}^{n} (xi − x̄)². Discuss the variance properties of these two estimators.

9.90 According to the Roanoke Times, McDonald's sold 42.1% of the market share of hamburgers. A random sample of 75 burgers sold resulted in 28 of them being from McDonald's. Use material in Section 9.10 to determine if this information supports the claim in the Roanoke Times.

9.91 It is claimed that a new diet will reduce a person's weight by 4.5 kilograms on average in a period of 2 weeks. The weights of 7 women who followed this diet were recorded before and after the 2-week period.

Woman    Weight Before    Weight After
1        58.5             60.0
2        60.3             54.9
3        61.7             58.1
4        69.0             62.1
5        64.0             58.5
6        62.6             59.9
7        56.7             54.4

Test the claim about the diet by computing a 95% confidence interval for the mean difference in weights. Assume the differences of weights to be approximately normally distributed.

9.92 A study was undertaken at Virginia Tech to determine if fire can be used as a viable management tool to increase the amount of forage available to deer during the critical months in late winter and early spring. Calcium is a required element for plants and animals. The amount taken up and stored in plants is closely correlated to the amount present in the soil. It was hypothesized that a fire may change the calcium levels present in the soil and thus affect the amount available to deer. A large tract of land in the Fishburn Forest was selected for a prescribed burn. Soil samples were taken from 12 plots of equal area just prior to the burn and analyzed for calcium. Postburn calcium levels were analyzed from the same plots. These values, in kilograms per plot, are presented in the following table:

Calcium Level (kg/plot)
Plot    Preburn    Postburn
1       50         9
2       50         18
3       82         45
4       64         18
5       82         18
6       73         9
7       77         32
8       54         9
9       23         18
10      45         9
11      36         9
12      54         9

Construct a 95% confidence interval for the mean difference in calcium levels in the soil prior to and after the prescribed burn. Assume the distribution of differences in calcium levels to be approximately normal.

9.93 A health spa claims that a new exercise program will reduce a person's waist size by 2 centimeters on average over a 5-day period. The waist sizes, in centimeters, of 6 men who participated in this exercise program are recorded before and after the 5-day period in the following table:

Man    Waist Size Before    Waist Size After
1      90.4                 91.7
2      95.5                 93.9
3      98.7                 97.4
4      115.9                112.8
5      104.0                101.3
6      85.6                 84.0

By computing a 95% confidence interval for the mean reduction in waist size, determine whether the health spa's claim is valid. Assume the distribution of differences in waist sizes before and after the program to be approximately normal.
9.94 The Department of Civil Engineering at Virginia Tech compared a modified (M-5 hr) assay technique for recovering fecal coliforms in stormwater runoff from an urban area to a most probable number (MPN) tech- nique. A total of 12 runoff samples were collected and analyzed by the two techniques. Fecal coliform counts per 100 milliliters are recorded in the following table.
suming that the populations are approximately nor- mally distributed.
9.96 An anthropologist is interested in the proportion of individuals in two Indian tribes with double occipi- tal hair whorls. Suppose that independent samples are taken from each of the two tribes, and it is found that 24 of 100 Indians from tribe A and 36 of 120 Indians from tribe B possess this characteristic. Construct a 95% confidence interval for the difference pB − pA be- tween the proportions of these two tribes with occipital hair whorls.
9.97 A manufacturer of electric irons produces these items in two plants. Both plants have the same suppli- ers of small parts. A saving can be made by purchasing thermostats for plant B from a local supplier. A sin- gle lot was purchased from the local supplier, and a test was conducted to see whether or not these new thermostats were as accurate as the old. The ther- mostats were tested on tile irons on the 550◦F setting, and the actual temperatures were read to the nearest 0.1◦F with a thermocouple. The data are as follows:
Sample
1 2 3 4 5 6 7 8 9
10 11 12
MPN Count
2010 930 450 400 210 436 270 4100 450 2090 154 219 179 169 192 194 230 174 340 274 194 183
2300 1200
M-5 hr Count
//
New Supplier (◦F) 530.3 559.3 549.4 544.0 551.7 549.9 556.9 536.7 558.8 538.8 559.1 555.0 538.6 551.1 565.4 550.0 554.9 554.7 536.1 569.1
Old Supplier (◦F) 559.7 534.7 554.8 545.0 544.6 550.7 563.1 551.1 553.8 538.8 554.5 553.0 538.4 548.3 552.9 555.0 544.8 558.4 548.7 560.3
Find 95% confidence intervals for σ12 /σ2 and
where σ12 and σ2 are the population variances of the thermostat readings for the new and old suppliers, re- spectively.
9.98 It is argued that the resistance of wire A is greater than the resistance of wire B. An experiment on the wires shows the following results (in ohms):
Construct a 90% confidence interval for the difference in the mean fecal coliform counts between the M-5 hr and the MPN techniques. Assume that the count dif- ferences are approximately normally distributed.
9.95 An experiment was conducted to determine whether surface finish has an effect on the endurance limit of steel. There is a theory that polishing in- creases the average endurance limit (for reverse bend- ing). From a practical point of view, polishing should not have any effect on the standard deviation of the endurance limit, which is known from numerous en- durance limit experiments to be 4000 psi. An ex- periment was performed on 0.4% carbon steel using both unpolished and polished smooth-turned speci- mens. The data are as follows:
Endurance Limit (psi)
566.3 543.3 554.9
538.0 564.6 535.1
for σ1 /σ2 ,
Polished 0.4% Carbon 85,500 91,900 89,400 84,000 89,900 78,700 87,500 83,100
Unpolished 0.4% Carbon 82,600 82,400 81,700 79,500 79,400 69,800 79,900 83,400
Wire A 0.140 0.138 0.143 0.142 0.144 0.137
Wire B 0.135 0.140 0.136 0.142 0.138 0.140
Find a 95% confidence interval for the difference be- tween the population means for the two methods, as-
Assuming equal variances, what conclusions do you draw? Justify your answer.
9.99 An alternative form of estimation is accomplished through the method of moments. This method involves equating the population mean and variance to the corresponding sample mean x̄ and sample variance s² and solving for the parameters, the results being the moment estimators. In the case of a single parameter, only the means are used. Give an argument that in the case of the Poisson distribution the maximum likelihood estimator and the moment estimator are the same.
9.100 Specify the moment estimators for μ and σ2 for the normal distribution.
9.101 Specify the moment estimators for μ and σ2 for the lognormal distribution.
9.102 Specify the moment estimators for α and β for the gamma distribution.
9.103 A survey was done with the hope of comparing salaries of chemical plant managers employed in two areas of the country, the northern and west central regions. An independent random sample of 300 plant managers was selected from each of the two regions. These managers were asked their annual salaries. The results are as follows:

         North                 West Central
x̄1 = $102,300           x̄2 = $98,500
s1 = $5,700             s2 = $3,800

(a) Construct a 99% confidence interval for μ1 − μ2, the difference in the mean salaries.
(b) What assumption did you make in (a) about the distribution of annual salaries for the two regions? Is the assumption of normality necessary? Why or why not?
(c) What assumption did you make about the two variances? Is the assumption of equality of variances reasonable? Explain!

9.104 Consider Review Exercise 9.103. Let us assume that the data have not been collected yet and that previous statistics suggest that σ1 = σ2 = $4000. Are the sample sizes in Review Exercise 9.103 sufficient to produce a 95% confidence interval on μ1 − μ2 having a width of only $1000? Show all work.

9.105 A labor union is becoming defensive about gross absenteeism by its members. The union leaders had always claimed that, in a typical month, 95% of its members were absent less than 10 hours. The union decided to check this by monitoring a random sample of 300 of its members. The number of hours absent was recorded for each of the 300 members. The results were x̄ = 6.5 hours and s = 2.5 hours. Use the data to respond to this claim, using a one-sided tolerance limit and choosing the confidence level to be 99%. Be sure to interpret what you learn from the tolerance limit calculation.

9.106 A random sample of 30 firms dealing in wireless products was selected to determine the proportion of such firms that have implemented new software to improve productivity. It turned out that 8 of the 30 had implemented such software. Find a 95% confidence interval on p, the true proportion of such firms that have implemented new software.

9.107 Refer to Review Exercise 9.106. Suppose there is concern about whether the point estimate p̂ = 8/30 is accurate enough because the confidence interval around p is not sufficiently narrow. Using p̂ as the estimate of p, how many companies would need to be sampled in order to have a 95% confidence interval with a width of only 0.05?

9.108 A manufacturer turns out a product item that is labeled either "defective" or "not defective." In order to estimate the proportion defective, a random sample of 100 items is taken from production, and 10 are found to be defective. Following implementation of a quality improvement program, the experiment is conducted again. A new sample of 100 is taken, and this time only 6 are found to be defective.
(a) Give a 95% confidence interval on p1 − p2, where p1 is the population proportion defective before improvement and p2 is the proportion defective after improvement.
(b) Is there information in the confidence interval found in (a) that would suggest that p1 > p2? Explain.

9.109 A machine is used to fill boxes with product in an assembly line operation. Much concern centers around the variability in the number of ounces of product in a box. The standard deviation in weight of product is known to be 0.3 ounce. An improvement is implemented, after which a random sample of 20 boxes is selected and the sample variance is found to be 0.045 ounce². Find a 95% confidence interval on the variance in the weight of the product. Does it appear from the range of the confidence interval that the improvement of the process enhanced quality as far as variability is concerned? Assume normality on the distribution of weights of product.

9.110 A consumer group is interested in comparing operating costs for two different types of automobile engines. The group is able to find 15 owners whose cars have engine type A and 15 whose cars have engine type B. All 30 owners bought their cars at roughly the same time, and all have kept good records for a certain 12-month period. In addition, these owners drove roughly the same number of miles. The cost statistics are ȳA = $87.00/1000 miles, ȳB = $75.00/1000 miles, sA = $5.99, and sB = $4.85. Compute a 95% confidence interval to estimate μA − μB, the difference in the mean operating costs. Assume normality and equal variances.
9.111 Consider the statistic Sp², the pooled estimate of σ² discussed in Section 9.8. It is used when one is willing to assume that σ1² = σ2² = σ². Show that the estimator is unbiased for σ² [i.e., show that E(Sp²) = σ²]. You may make use of results from any theorem or example in this chapter.
9.112 A group of human factor researchers is concerned about reaction to a stimulus by airplane pilots in a certain cockpit arrangement. An experiment was conducted in a simulation laboratory, and 15 pilots were used with average reaction time of 3.2 seconds with a sample standard deviation of 0.6 second. It is of interest to characterize the extreme (i.e., the worst-case scenario). To that end, do the following:
(a) Give a particularly important one-sided 99% confidence bound on the mean reaction time. What assumption, if any, must you make on the distribution of reaction times?
(b) Give a 99% one-sided prediction interval and give an interpretation of what it means. Must you make an assumption about the distribution of reaction times to compute this bound?
(c) Compute a one-sided tolerance bound with 99% confidence that involves 95% of reaction times. Again, give an interpretation and assumptions about the distribution, if any. (Note: The one-sided tolerance limit values are also included in Table A.7.)
9.113 A certain supplier manufactures a type of rubber mat that is sold to automotive companies. The material used to produce the mats must have certain hardness characteristics. Defective mats are occasionally discovered and rejected. The supplier claims that the proportion defective is 0.05. A challenge was made by one of the clients who purchased the mats, so an experiment was conducted in which 400 mats were tested and 17 were found defective.
(a) Compute a 95% two-sided confidence interval on the proportion defective.
(b) Compute an appropriate 95% one-sided confidence interval on the proportion defective.
(c) Interpret both intervals from (a) and (b) and comment on the claim made by the supplier.

9.15 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

The concept of a large-sample confidence interval on a population mean is often confusing to the beginning student. It is based on the notion that even when σ is unknown and one is not convinced that the distribution being sampled is normal, a confidence interval on μ can be computed from

x̄ ± z_{α/2} s/√n.

In practice, this formula is often used when the sample is too small. The genesis of this large-sample interval is, of course, the Central Limit Theorem (CLT), under which normality is not necessary. Here the CLT requires a known σ, of which s is only an estimate. Thus, n must be at least as large as 30 and the underlying distribution must be close to symmetric, in which case the interval is still an approximation.
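For the reader who wants to see this interval in computational form, here is a minimal Python sketch; the function name and the summary statistics used to exercise it are illustrative, not data from the text:

import math
from statistics import NormalDist

def large_sample_ci(xbar, s, n, alpha=0.05):
    # Approximate (1 - alpha)100% confidence interval on mu:
    #   xbar +/- z_{alpha/2} * s / sqrt(n)
    # Justified by the CLT; treat with suspicion if n < 30 or the
    # underlying distribution is far from symmetric.
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} = 1.96 for alpha = 0.05
    half = z * s / math.sqrt(n)
    return xbar - half, xbar + half

# Hypothetical summary statistics: n = 36, xbar = 68.2, s = 3.5
print(large_sample_ci(68.2, 3.5, 36))         # about (67.06, 69.34)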
There are instances in which the appropriateness of the practical application of material in this chapter depends very much on the specific context. One very important illustration is the use of the t-distribution for the confidence interval on μ when σ is unknown. Strictly speaking, the use of the t-distribution requires that the distribution sampled from be normal. However, it is well known that any application of the t-distribution is reasonably insensitive (i.e., robust) to the normality assumption. This represents one of those fortunate situations which occur often in the field of statistics in which a basic assumption does not hold and yet "everything turns out all right!" However, the population from which the sample is drawn cannot deviate substantially from normal. Thus, the normal probability plots discussed in Chapter 8 and the goodness-of-fit tests introduced in Chapter 10 often need to be called upon to ascertain some sense of "nearness to normality." This idea of "robustness to normality" will reappear in Chapter 10.
It is our experience that one of the most serious "misuses of statistics" in practice evolves from confusion about distinctions in the interpretation of the types of statistical intervals. Thus, the subsection in this chapter where differences among the three types of intervals are discussed is important. It is very likely that in practice the confidence interval is heavily overused. That is, it is used when there is really no interest in the mean; rather, the question is "Where is the next observation going to fall?" or often, more importantly, "Where is the large bulk of the distribution?" These are crucial questions that are not answered by computing an interval on the mean. The interpretation of a confidence interval is often misunderstood. It is tempting to conclude that the parameter falls inside the interval with probability 0.95. While this is a correct interpretation of a Bayesian posterior interval (readers are referred to Chapter 18 for more information on Bayesian inference), it is not the proper frequency interpretation.

A confidence interval merely suggests that if the experiment is conducted and data are observed again and again, about 95% of such intervals will contain the true parameter. Any beginning student of practical statistics should be very clear on the difference among these statistical intervals.
Another potentially serious misuse of statistics centers around the use of the χ²-distribution for a confidence interval on a single variance. Again, normality of the distribution from which the sample is drawn is assumed. Unlike the use of the t-distribution, the use of the χ² statistic for this application is not robust to the normality assumption (i.e., the sampling distribution of (n − 1)S²/σ² deviates far from χ² if the underlying distribution is not normal). Thus, strict use of goodness-of-fit tests (Chapter 10) and/or normal probability plotting can be extremely important in such contexts. More information about this general issue will be given in future chapters.
Chapter 10
One- and Two-Sample Tests of Hypotheses
10.1 Statistical Hypotheses: General Concepts
Often, the problem confronting the scientist or engineer is not so much the estimation of a population parameter, as discussed in Chapter 9, but rather the formation of a data-based decision procedure that can produce a conclusion about some scientific system. For example, a medical researcher may decide on the basis of experimental evidence whether coffee drinking increases the risk of cancer in humans; an engineer might have to decide on the basis of sample data whether there is a difference between the accuracy of two kinds of gauges; or a sociologist might wish to collect appropriate data to enable him or her to decide whether a person's blood type and eye color are independent variables. In each of these cases, the scientist or engineer postulates or conjectures something about a system. In addition, each must make use of experimental data and make a decision based on the data. In each case, the conjecture can be put in the form of a statistical hypothesis. Procedures that lead to the acceptance or rejection of statistical hypotheses such as these comprise a major area of statistical inference. First, let us define precisely what we mean by a statistical hypothesis.

Definition 10.1: A statistical hypothesis is an assertion or conjecture concerning one or more populations.

The truth or falsity of a statistical hypothesis is never known with absolute certainty unless we examine the entire population. This, of course, would be impractical in most situations. Instead, we take a random sample from the population of interest and use the data contained in this sample to provide evidence that either supports or does not support the hypothesis. Evidence from the sample that is inconsistent with the stated hypothesis leads to a rejection of the hypothesis.
The Role of Probability in Hypothesis Testing
It should be made clear to the reader that the decision procedure must include an awareness of the probability of a wrong conclusion. For example, suppose that the hypothesis postulated by the engineer is that the fraction defective p in a certain process is 0.10. The experiment is to observe a random sample of the product in question. Suppose that 100 items are tested and 12 items are found defective. It is reasonable to conclude that this evidence does not refute the condition that the binomial parameter p = 0.10, and thus it may lead one not to reject the hypothesis. However, it also does not refute p = 0.12 or perhaps even p = 0.15. As a result, the reader must be accustomed to understanding that rejection of a hypothesis implies that the sample evidence refutes it. Put another way, rejection means that there is a small probability of obtaining the sample information observed when, in fact, the hypothesis is true. For example, for our proportion-defective hypothesis, a sample of 100 revealing 20 defective items is certainly evidence for rejection. Why? If, indeed, p = 0.10, the probability of obtaining 20 or more defectives is approximately 0.002. With the resulting small risk of a wrong conclusion, it would seem safe to reject the hypothesis that p = 0.10. In other words, rejection of a hypothesis tends to all but “rule out” the hypothesis. On the other hand, it is very important to emphasize that acceptance or, rather, failure to reject does not rule out other possibilities. As a result, the firm conclusion is established by the data analyst when a hypothesis is rejected.
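The 0.002 figure is a routine binomial tail computation; a one-line check in Python (using the SciPy library) is:

from scipy.stats import binom

# P(X >= 20) when X ~ b(x; 100, 0.10); sf(19) = P(X > 19) = P(X >= 20)
print(binom.sf(19, 100, 0.10))   # about 0.002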
The formal statement of a hypothesis is often influenced by the structure of the probability of a wrong conclusion. If the scientist is interested in strongly supporting a contention, he or she hopes to arrive at the contention in the form of rejection of a hypothesis. If the medical researcher wishes to show strong evidence in favor of the contention that coffee drinking increases the risk of cancer, the hypothesis tested should be of the form “there is no increase in cancer risk produced by drinking coffee.” As a result, the contention is reached via a rejection. Similarly, to support the claim that one kind of gauge is more accurate than another, the engineer tests the hypothesis that there is no difference in the accuracy of the two kinds of gauges.
The foregoing implies that when the data analyst formalizes experimental evidence on the basis of hypothesis testing, the formal statement of the hypothesis is very important.
The Null and Alternative Hypotheses
The structure of hypothesis testing will be formulated with the use of the term null hypothesis, which refers to any hypothesis we wish to test and is denoted by H0. The rejection of H0 leads to the acceptance of an alternative hypothesis, denoted by H1. An understanding of the different roles played by the null hypothesis (H0) and the alternative hypothesis (H1) is crucial to one's understanding of the rudiments of hypothesis testing. The alternative hypothesis H1 usually represents the question to be answered or the theory to be tested, and thus its specification is crucial. The null hypothesis H0 nullifies or opposes H1 and is often the logical complement to H1. As the reader gains more understanding of hypothesis testing, he or she should note that the analyst arrives at one of the two following conclusions:
reject H0 in favor of H1 because of sufficient evidence in the data or fail to reject H0 because of insufficient evidence in the data.
Note that the conclusions do not involve a formal and literal “accept H0.” The statement of H0 often represents the “status quo” in opposition to the new idea, conjecture, and so on, stated in H1, while failure to reject H0 represents the proper conclusion. In our binomial example, the practical issue may be a concern that the historical defective probability of 0.10 no longer is true. Indeed, the conjecture may be that p exceeds 0.10. We may then state
H0: p=0.10, H1: p>0.10.
Now 12 defective items out of 100 does not refute p = 0.10, so the conclusion is “fail to reject H0.” However, if the data produce 20 out of 100 defective items, then the conclusion is “reject H0” in favor of H1: p > 0.10.
Though the applications of hypothesis testing are quite abundant in scientific and engineering work, perhaps the best illustration for a novice lies in the predicament encountered in a jury trial. The null and alternative hypotheses are
H0: defendant is innocent, H1: defendant is guilty.
The indictment comes because of suspicion of guilt. The hypothesis H0 (the status quo) stands in opposition to H1 and is maintained unless H1 is supported by evidence “beyond a reasonable doubt.” However, “failure to reject H0” in this case does not imply innocence, but merely that the evidence was insufficient to convict. So the jury does not necessarily accept H0 but fails to reject H0.
10.2 Testing a Statistical Hypothesis
To illustrate the concepts used in testing a statistical hypothesis about a population, we present the following example. A certain type of cold vaccine is known to be only 25% effective after a period of 2 years. To determine if a new and somewhat more expensive vaccine is superior in providing protection against the same virus for a longer period of time, suppose that 20 people are chosen at random and inoculated. (In an actual study of this type, the participants receiving the new vaccine might number several thousand. The number 20 is being used here only to demonstrate the basic steps in carrying out a statistical test.) If more than 8 of those receiving the new vaccine surpass the 2-year period without contracting the virus, the new vaccine will be considered superior to the one presently in use. The requirement that the number exceed 8 is somewhat arbitrary but appears reasonable in that it represents a modest gain over the 5 people who could be expected to receive protection if the 20 people had been inoculated with the vaccine already in use. We are essentially testing the null hypothesis that the new vaccine is equally effective after a period of 2 years as the one now commonly used. The alternative hypothesis is that the new vaccine is in fact superior. This is equivalent to testing the hypothesis that the binomial parameter for the probability of a success on a given trial is p = 1/4 against the alternative that p > 1/4. This is usually written as follows:

H0: p = 0.25, H1: p > 0.25.

The Test Statistic

The test statistic on which we base our decision is X, the number of individuals in our test group who receive protection from the new vaccine for a period of at least 2 years. The possible values of X, from 0 to 20, are divided into two groups: those numbers less than or equal to 8 and those greater than 8. All possible scores greater than 8 constitute the critical region. The last number that we observe in passing into the critical region is called the critical value. In our illustration, the critical value is the number 8. Therefore, if x > 8, we reject H0 in favor of the alternative hypothesis H1. If x ≤ 8, we fail to reject H0. This decision criterion is illustrated in Figure 10.1.
Figure 10.1: Decision criterion for testing p = 0.25 versus p > 0.25 (do not reject H0 for x ≤ 8; reject H0 for x > 8).

The decision procedure just described could lead to either of two wrong conclusions. For instance, the new vaccine may be no better than the one now in use (H0 true) and yet, in this particular randomly selected group of individuals, more than 8 surpass the 2-year period without contracting the virus. We would be committing an error by rejecting H0 in favor of H1 when, in fact, H0 is true. Such an error is called a type I error.

Definition 10.2: Rejection of the null hypothesis when it is true is called a type I error.

A second kind of error is committed if 8 or fewer of the group surpass the 2-year period successfully and we are unable to conclude that the vaccine is better when it actually is better (H1 true). Thus, in this case, we fail to reject H0 when in fact H0 is false. This is called a type II error.

Definition 10.3: Nonrejection of the null hypothesis when it is false is called a type II error.

In testing any statistical hypothesis, there are four possible situations that determine whether our decision is correct or in error. These four situations are summarized in Table 10.1.

Table 10.1: Possible Situations for Testing a Statistical Hypothesis

                      H0 is true          H0 is false
Do not reject H0      Correct decision    Type II error
Reject H0             Type I error        Correct decision

The Probability of a Type I Error
The probability of committing a type I error, also called the level of significance, is denoted by the Greek letter α. In our illustration, a type I error will occur when more than 8 individuals inoculated with the new vaccine surpass the 2-year period without contracting the virus and researchers conclude that the new vaccine is better when it is actually equivalent to the one in use. Hence, if X is the number of individuals who remain free of the virus for at least 2 years,
α = P(type I error) = P(X > 8 when p = 1/4) = Σ_{x=9}^{20} b(x; 20, 1/4)
  = 1 − Σ_{x=0}^{8} b(x; 20, 1/4) = 1 − 0.9591 = 0.0409.
We say that the null hypothesis, p = 1/4, is being tested at the α = 0.0409 level of significance. Sometimes the level of significance is called the size of the test. A critical region of size 0.0409 is very small, and therefore it is unlikely that a type I error will be committed. Consequently, it would be most unusual for more than 8 individuals to remain immune to a virus for a 2-year period using a new vaccine that is essentially equivalent to the one now on the market.
The Probability of a Type II Error
The probability of committing a type II error, denoted by β, is impossible to compute unless we have a specific alternative hypothesis. If we test the null hypothesis that p = 1/4 against the alternative hypothesis that p = 1/2, then we are able to compute the probability of not rejecting H0 when it is false. We simply find the probability of obtaining 8 or fewer in the group that surpass the 2-year period when p = 1/2. In this case,
β = P(type II error) = P(X ≤ 8 when p = 1/2) = Σ_{x=0}^{8} b(x; 20, 1/2) = 0.2517.
This is a rather high probability, indicating a test procedure in which it is quite likely that we shall reject the new vaccine when, in fact, it is superior to what is now in use. Ideally, we like to use a test procedure for which the type I and type II error probabilities are both small.
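Both error probabilities for this decision criterion follow directly from binomial sums, as the following Python sketch (with illustrative variable names) confirms:

from scipy.stats import binom

n, crit = 20, 8
alpha = 1 - binom.cdf(crit, n, 0.25)   # P(X > 8 when p = 1/4) = 0.0409
beta = binom.cdf(crit, n, 0.50)        # P(X <= 8 when p = 1/2) = 0.2517
print(round(alpha, 4), round(beta, 4))

Setting crit = 7 in the same sketch reproduces the trade-off examined below, and replacing 0.50 with 0.70 gives the value of β computed next.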
It is possible that the director of the testing program is willing to make a type II error if the more expensive vaccine is not significantly superior. In fact, the only
time he wishes to guard against the type II error is when the true value of p is at least 0.7. If p = 0.7, this test procedure gives

β = P(type II error) = P(X ≤ 8 when p = 0.7) = Σ_{x=0}^{8} b(x; 20, 0.7) = 0.0051.

With such a small probability of committing a type II error, it is extremely unlikely that the new vaccine would be rejected when it was 70% effective after a period of 2 years. As the alternative hypothesis approaches unity, the value of β diminishes to zero.

The Role of α, β, and Sample Size

Let us assume that the director of the testing program is unwilling to commit a type II error when the alternative hypothesis p = 1/2 is true, even though we have found the probability of such an error to be β = 0.2517. It is always possible to reduce β by increasing the size of the critical region. For example, consider what happens to the values of α and β when we change our critical value to 7 so that all scores greater than 7 fall in the critical region and those less than or equal to 7 fall in the nonrejection region. Now, in testing p = 1/4 against the alternative hypothesis that p = 1/2, we find that

α = Σ_{x=8}^{20} b(x; 20, 1/4) = 1 − Σ_{x=0}^{7} b(x; 20, 1/4) = 1 − 0.8982 = 0.1018

and

β = Σ_{x=0}^{7} b(x; 20, 1/2) = 0.1316.
By adopting a new decision procedure, we have reduced the probability of committing a type II error at the expense of increasing the probability of committing a type I error. For a fixed sample size, a decrease in the probability of one error will usually result in an increase in the probability of the other error. Fortunately, the probability of committing both types of error can be reduced by increasing the sample size. Consider the same problem using a random sample of 100 individuals. If more than 36 of the group surpass the 2-year period, we reject the null hypothesis that p = 1/4 and accept the alternative hypothesis that p > 1/4. The critical value is now 36. All possible scores above 36 constitute the critical region, and all possible scores less than or equal to 36 fall in the nonrejection region.
To determine the probability of committing a type I error, we shall use the normal curve approximation with

μ = np = (100)(1/4) = 25  and  σ = √(npq) = √((100)(1/4)(3/4)) = 4.33.

Referring to Figure 10.2, we need the area under the normal curve to the right of x = 36.5. The corresponding z-value is

z = (36.5 − 25)/4.33 = 2.66.
Figure 10.2: Probability of a type I error (normal curve with μ = 25, σ = 4.33; shaded area to the right of x = 36.5).

From Table A.3 we find that

α = P(type I error) = P(X > 36 when p = 1/4) ≈ P(Z > 2.66)
  = 1 − P(Z < 2.66) = 1 − 0.9961 = 0.0039.

If H0 is false and the true value of H1 is p = 1/2, we can determine the probability of a type II error using the normal curve approximation with

μ = np = (100)(1/2) = 50  and  σ = √(npq) = √((100)(1/2)(1/2)) = 5.

The probability of a value falling in the nonrejection region when H1 is true is given by the area of the shaded region to the left of x = 36.5 in Figure 10.3. The z-value corresponding to x = 36.5 is

z = (36.5 − 50)/5 = −2.7.

Therefore,

β = P(type II error) = P(X ≤ 36 when p = 1/2) ≈ P(Z < −2.7) = 0.0035.

Figure 10.3: Probability of a type II error (normal curves with μ = 25, σ = 4.33 and μ = 50, σ = 5; shaded area to the left of x = 36.5).

Obviously, the type I and type II errors will rarely occur if the experiment consists of 100 individuals.

The illustration above underscores the strategy of the scientist in hypothesis testing. After the null and alternative hypotheses are stated, it is important to consider the sensitivity of the test procedure. By this we mean that there should be a determination, for a fixed α, of a reasonable value for the probability of wrongly accepting H0 (i.e., the value of β) when the true situation represents some important deviation from H0. A value for the sample size can usually be determined for which there is a reasonable balance between the values of α and β computed in this fashion. The vaccine problem provides an illustration.

Illustration with a Continuous Random Variable

The concepts discussed here for a discrete population can be applied equally well to continuous random variables. Consider the null hypothesis that the average weight of male students in a certain college is 68 kilograms against the alternative hypothesis that it is unequal to 68. That is, we wish to test

H0: μ = 68, H1: μ ≠ 68.

The alternative hypothesis allows for the possibility that μ < 68 or μ > 68.
A sample mean that falls close to the hypothesized value of 68 would be considered evidence in favor of H0. On the other hand, a sample mean that is considerably less than or more than 68 would be evidence inconsistent with H0 and therefore favoring H1. The sample mean is the test statistic in this case. A critical region for the test statistic might arbitrarily be chosen to be the two intervals x̄ < 67 and x̄ > 69. The nonrejection region will then be the interval 67 ≤ x̄ ≤ 69. This decision criterion is illustrated in Figure 10.4.
Figure 10.4: Critical region (in blue): reject H0 (μ ≠ 68) for x̄ < 67 or x̄ > 69; do not reject H0 (μ = 68) for 67 ≤ x̄ ≤ 69.
Let us now use the decision criterion of Figure 10.4 to calculate the probabilities of committing type I and type II errors when testing the null hypothesis that μ = 68 kilograms against the alternative that μ ̸= 68 kilograms.
Assume the standard deviation of the population of weights to be σ = 3.6. For large samples, we may substitute s for σ if no other estimate of σ is available. Our decision statistic, based on a random sample of size n = 36, will be X̄, the most efficient estimator of μ. From the Central Limit Theorem, we know that the sampling distribution of X̄ is approximately normal with standard deviation σ_X̄ = σ/√n = 3.6/6 = 0.6.
The probability of committing a type I error, or the level of significance of our test, is equal to the sum of the areas that have been shaded in each tail of the distribution in Figure 10.5. Therefore,

α = P(X̄ < 67 when μ = 68) + P(X̄ > 69 when μ = 68).

Figure 10.5: Critical region for testing μ = 68 versus μ ≠ 68 (area α/2 in each tail, critical values 67 and 69).

The z-values corresponding to x̄1 = 67 and x̄2 = 69 when H0 is true are

z1 = (67 − 68)/0.6 = −1.67  and  z2 = (69 − 68)/0.6 = 1.67.

Therefore,

α = P(Z < −1.67) + P(Z > 1.67) = 2P(Z < −1.67) = 0.0950.

Thus, 9.5% of all samples of size 36 would lead us to reject μ = 68 kilograms when, in fact, it is true. To reduce α, we have a choice of increasing the sample size or widening the fail-to-reject region. Suppose that we increase the sample size to n = 64. Then σ_X̄ = 3.6/8 = 0.45. Now

z1 = (67 − 68)/0.45 = −2.22  and  z2 = (69 − 68)/0.45 = 2.22.

Hence,

α = P(Z < −2.22) + P(Z > 2.22) = 2P(Z < −2.22) = 0.0264.

The reduction in α is not sufficient by itself to guarantee a good testing procedure. We must also evaluate β for various alternative hypotheses. If it is important to reject H0 when the true mean is some value μ ≥ 70 or μ ≤ 66, then the probability of committing a type II error should be computed and examined for the alternatives μ = 66 and μ = 70. Because of symmetry, it is only necessary to consider the probability of not rejecting the null hypothesis that μ = 68 when the alternative μ = 70 is true. A type II error will result when the sample mean x̄ falls between 67 and 69 when H1 is true. Therefore, referring to Figure 10.6, we find that

β = P(67 ≤ X̄ ≤ 69 when μ = 70).

Figure 10.6: Probability of type II error for testing μ = 68 versus μ = 70.

The z-values corresponding to x̄1 = 67 and x̄2 = 69 when H1 is true are

z1 = (67 − 70)/0.45 = −6.67  and  z2 = (69 − 70)/0.45 = −2.22.

Therefore,

β = P(−6.67 < Z < −2.22) = P(Z < −2.22) − P(Z < −6.67) = 0.0132 − 0.0000 = 0.0132.

If the true value of μ is the alternative μ = 66, the value of β will again be 0.0132. For all possible values of μ < 66 or μ > 70, the value of β will be even smaller when n = 64, and consequently there would be little chance of not rejecting H0 when it is false.
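All of the normal-curve calculations in this illustration can be verified with a short function. The following Python sketch is illustrative (the function name is ours; the argument defaults encode the decision criterion above) and returns α and β for a given sample size:

from math import sqrt
from statistics import NormalDist

def error_probs(n, mu0=68.0, mu1=70.0, sigma=3.6, lo=67.0, hi=69.0):
    # alpha = P(X-bar outside [lo, hi] when mu = mu0)
    # beta  = P(X-bar inside  [lo, hi] when mu = mu1)
    se = sigma / sqrt(n)
    null, alt = NormalDist(mu0, se), NormalDist(mu1, se)
    alpha = null.cdf(lo) + (1 - null.cdf(hi))
    beta = alt.cdf(hi) - alt.cdf(lo)
    return alpha, beta

print(error_probs(36))   # alpha ~ 0.096 (0.0950 in the text, from rounded z-values)
print(error_probs(64))   # alpha ~ 0.026, beta(70) ~ 0.013

Passing a different mu1 evaluates β for any other alternative of interest.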
The probability of committing a type II error increases rapidly when the true value of μ approaches, but is not equal to, the hypothesized value. Of course, this is usually the situation where we do not mind making a type II error. For example, if the alternative hypothesis μ = 68.5 is true, we do not mind committing a type II error by concluding that the true answer is μ = 68. The probability of making such an error will be high when n = 64. Referring to Figure 10.7, we have
β = P(67 ≤ X ̄ ≤ 69 when μ = 68.5).
The z-values corresponding to x ̄1 = 67 and x ̄2 = 69 when μ = 68.5 are
z1 = (67 − 68.5)/0.45 = −3.33  and  z2 = (69 − 68.5)/0.45 = 1.11.
Therefore,
β = P(−3.33 < Z < 1.11) = P(Z < 1.11) − P(Z < −3.33) = 0.8665 − 0.0004 = 0.8661.

Figure 10.7: Type II error for testing μ = 68 versus μ = 68.5 (normal curves centered at 68 and 68.5; nonrejection region from 67 to 69).

The preceding examples illustrate the following important properties:

Important Properties of a Test of Hypothesis

1. The type I error and type II error are related. A decrease in the probability of one generally results in an increase in the probability of the other.
2. The size of the critical region, and therefore the probability of committing a type I error, can always be reduced by adjusting the critical value(s).
3. An increase in the sample size n will reduce α and β simultaneously.
4. If the null hypothesis is false, β is a maximum when the true value of a parameter approaches the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller β will be.

One very important concept that relates to error probabilities is the notion of the power of a test.

Definition 10.4: The power of a test is the probability of rejecting H0 given that a specific alternative is true.

The power of a test can be computed as 1 − β. Often different types of tests are compared by contrasting power properties. Consider the previous illustration, in which we were testing H0: μ = 68 and H1: μ ≠ 68. As before, suppose we are interested in assessing the sensitivity of the test. The test is governed by the rule that we do not reject H0 if 67 ≤ x̄ ≤ 69. We seek the capability of the test to properly reject H0 when indeed μ = 68.5. We have seen that the probability of a type II error is given by β = 0.8661. Thus, the power of the test is 1 − 0.8661 = 0.1339. In a sense, the power is a more succinct measure of how sensitive the test is for detecting differences between a mean of 68 and a mean of 68.5. In this case, if μ is truly 68.5, the test as described will properly reject H0 only 13.39% of the time. As a result, the test would not be a good one if it was important that the analyst have a reasonable chance of truly distinguishing between a mean of 68.0 (specified by H0) and a mean of 68.5. From the foregoing, it is clear that to produce a desirable power (say, greater than 0.8), one must either increase α or increase the sample size.

So far in this chapter, much of the discussion of hypothesis testing has focused on foundations and definitions. In the sections that follow, we get more specific and put hypotheses in categories as well as discuss tests of hypotheses on various parameters of interest. We begin by drawing the distinction between a one-sided and a two-sided hypothesis.

One- and Two-Tailed Tests

A test of any statistical hypothesis where the alternative is one sided, such as

H0: θ = θ0, H1: θ > θ0,

or perhaps
H0: θ=θ0, H1: θ<θ0, is called a one-tailed test. Earlier in this section, we referred to the test statistic for a hypothesis. Generally, the critical region for the alternative hypothesis θ > θ0 lies in the right tail of the distribution of the test statistic, while the critical region for the alternative hypothesis θ < θ0 lies entirely in the left tail. (In a sense, the inequality symbol points in the direction of the critical region.) A one-tailed test was used in the vaccine experiment to test the hypothesis p = 1/4 against the one-sided alternative p > 1/4 for the binomial distribution. The one-tailed critical region is usually obvious; the reader should visualize the behavior of the test statistic and notice the obvious signal that would produce evidence supporting the alternative hypothesis.
A test of any statistical hypothesis where the alternative is two sided, such as H0: θ=θ0,
H1: θ̸=θ0,
is called a two-tailed test, since the critical region is split into two parts, often having equal probabilities, in each tail of the distribution of the test statistic. The alternative hypothesis θ ̸= θ0 states that either θ < θ0 or θ > θ0. A two-tailed test was used to test the null hypothesis that μ = 68 kilograms against the two- sided alternative μ ̸= 68 kilograms in the example of the continuous population of student weights.
How Are the Null and Alternative Hypotheses Chosen?
The null hypothesis H0 will often be stated using the equality sign. With this approach, it is clear how the probability of type I error is controlled. However, there are situations in which “do not reject H0” implies that the parameter θ might be any value defined by the natural complement to the alternative hypothesis. For example, in the vaccine example, where the alternative hypothesis is H1: p > 1/4, it is quite possible that nonrejection of H0 cannot rule out a value of p less than 1/4. Clearly though, in the case of one-tailed tests, the statement of the alternative is the most important consideration.
Whether one sets up a one-tailed or a two-tailed test will depend on the conclusion to be drawn if H0 is rejected. The location of the critical region can be determined only after H1 has been stated. For example, in testing a new drug, one sets up the hypothesis that it is no better than similar drugs now on the market and tests this against the alternative hypothesis that the new drug is superior. Such an alternative hypothesis will result in a one-tailed test with the critical region in the right tail. However, if we wish to compare a new teaching technique with the conventional classroom procedure, the alternative hypothesis should allow for the new approach to be either inferior or superior to the conventional procedure. Hence, the test is two-tailed with the critical region divided equally so as to fall in the extreme left and right tails of the distribution of our statistic.
Example 10.1: A manufacturer of a certain brand of rice cereal claims that the average saturated fat content does not exceed 1.5 grams per serving. State the null and alternative hypotheses to be used in testing this claim and determine where the critical region is located.
Solution: The manufacturer's claim should be rejected only if μ is greater than 1.5 grams and should not be rejected if μ is less than or equal to 1.5 grams. We test

H0: μ = 1.5, H1: μ > 1.5.

Nonrejection of H0 does not rule out values less than 1.5 grams. Since we have a one-tailed test, the greater than symbol indicates that the critical region lies entirely in the right tail of the distribution of our test statistic X̄.
Example 10.2: A real estate agent claims that 60% of all private residences being built today are 3-bedroom homes. To test this claim, a large sample of new residences is inspected; the proportion of these homes with 3 bedrooms is recorded and used as the test statistic. State the null and alternative hypotheses to be used in this test and determine the location of the critical region.
Solution: If the test statistic were substantially higher or lower than p = 0.6, we would reject the agent's claim. Hence, we should make the hypothesis

H0: p = 0.6, H1: p ≠ 0.6.

The alternative hypothesis implies a two-tailed test with the critical region divided equally in both tails of the distribution of P̂, our test statistic.

10.3 The Use of P-Values for Decision Making in Testing Hypotheses
In testing hypotheses in which the test statistic is discrete, the critical region may be chosen arbitrarily and its size determined. If α is too large, it can be reduced by making an adjustment in the critical value. It may be necessary to increase the
sample size to offset the decrease that occurs automatically in the power of the test.
Over a number of generations of statistical analysis, it had become customary to choose an α of 0.05 or 0.01 and select the critical region accordingly. Then, of course, strict rejection or nonrejection of H0 would depend on that critical region. For example, if the test is two tailed and α is set at the 0.05 level of significance and the test statistic involves, say, the standard normal distribution, then a z-value is observed from the data and the critical region is
z > 1.96 or z < −1.96,

where the value 1.96 is found as z0.025 in Table A.3. A value of z in the critical region prompts the statement "The value of the test statistic is significant," which we can then translate into the user's language. For example, if the hypothesis is given by

H0: μ = 10, H1: μ ≠ 10,

one might say, "The mean differs significantly from the value 10."

Preselection of a Significance Level

This preselection of a significance level α has its roots in the philosophy that the maximum risk of making a type I error should be controlled. However, this approach does not account for values of test statistics that are "close" to the critical region. Suppose, for example, in the illustration with H0: μ = 10 versus H1: μ ≠ 10, a value of z = 1.87 is observed; strictly speaking, with α = 0.05, the value is not significant. But the risk of committing a type I error if one rejects H0 in this case could hardly be considered severe. In fact, in a two-tailed scenario, one can quantify this risk as

P = 2P(Z > 1.87 when μ = 10) = 2(0.0307) = 0.0614.
As a result, 0.0614 is the probability of obtaining a value of z as large as or larger (in magnitude) than 1.87 when in fact μ = 10. Although this evidence against H0 is not as strong as that which would result from rejection at an α = 0.05 level, it is important information to the user. Indeed, continued use of α = 0.05 or 0.01 is only a result of what standards have been passed down through the generations. The P-value approach has been adopted extensively by users of applied statistics. The approach is designed to give the user an alternative (in terms of a probability) to a mere “reject” or “do not reject” conclusion. The P-value computation also gives the user important information when the z-value falls well into the ordinary critical region. For example, if z is 2.73, it is informative for the user to observe that
P = 2(0.0032) = 0.0064,
and thus the z-value is significant at a level considerably less than 0.05. It is important to know that under the condition of H0, a value of z = 2.73 is an extremely rare event. That is, a value at least that large in magnitude would only occur 64 times in 10,000 experiments.
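Such P-values are just standard normal tail areas; a minimal Python check is shown below (the small discrepancies from the text come from the four-decimal rounding in Table A.3):

from statistics import NormalDist

def two_sided_p(z):
    # P = 2 P(Z > |z|) for a two-tailed test based on the standard normal
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p(1.87), 4))   # 0.0615 (0.0614 from Table A.3)
print(round(two_sided_p(2.73), 4))   # 0.0063 (0.0064 from Table A.3)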
A Graphical Demonstration of a P-Value
One very simple way of explaining a P -value graphically is to consider two distinct samples. Suppose that two materials are being considered for coating a particular type of metal in order to inhibit corrosion. Specimens are obtained, and one collection is coated with material 1 and one collection coated with material 2. The sample sizes are n1 = n2 = 10, and corrosion is measured in percent of surface area affected. The hypothesis is that the samples came from common distributions with mean μ = 10. Let us assume that the population variance is 1.0. Then we are testing
H0: μ1 =μ2 =10.
Let Figure 10.8 represent a point plot of the data; the data are placed on the distribution stated by the null hypothesis. Let us assume that the “×” data refer to material 1 and the “◦” data refer to material 2. Now it seems clear that the data do refute the null hypothesis. But how can this be summarized in one number? The P-value can be viewed as simply the probability of obtaining these data given that both samples come from the same distribution. Clearly, this probability is quite small, say 0.00000001! Thus, the small P -value clearly refutes H0, and the conclusion is that the population means are significantly different.
Figure 10.8: Data that are likely generated from populations having two different means.
Use of the P -value approach as an aid in decision-making is quite natural, and nearly all computer packages that provide hypothesis-testing computation print out P -values along with values of the appropriate test statistic. The following is a formal definition of a P-value.
Definition 10.5:
A P -value is the lowest level (of significance) at which the observed value of the test statistic is significant.
How Does the Use of P-Values Differ from Classic Hypothesis Testing?
It is tempting at this point to summarize the procedures associated with testing, say, H0: θ = θ0. However, the student who is a novice in this area should understand that there are differences in approach and philosophy between the classic fixed α approach that is climaxed with either a "reject H0" or a "do not reject H0" conclusion and the P-value approach. In the latter, no fixed α is determined and conclusions are drawn on the basis of the size of the P-value in harmony with the subjective judgment of the engineer or scientist. While modern computer software will output P-values, nevertheless it is important that readers understand both approaches in order to appreciate the totality of the concepts. Thus, we offer a brief list of procedural steps for both the classical and the P-value approach.

Approach to Hypothesis Testing with Fixed Probability of Type I Error
1. State the null and alternative hypotheses.
2. Choose a fixed significance level α.
3. Choose an appropriate test statistic and establish the critical region based on α.
4. Reject H0 if the computed test statistic is in the critical region. Otherwise, do not reject.
5. Draw scientific or engineering conclusions.

Significance Testing (P-Value Approach)
1. State null and alternative hypotheses.
2. Choose an appropriate test statistic.
3. Compute the P-value based on the computed value of the test statistic.
4. Use judgment based on the P-value and knowledge of the scientific system.

In later sections of this chapter and chapters that follow, many examples and exercises emphasize the P-value approach to drawing scientific conclusions.
Exercises
10.1 Suppose that an allergist wishes to test the hy- pothesis that at least 30% of the public is allergic to some cheese products. Explain how the allergist could commit
(a) a type I error; (b) a type II error.
10.2 A sociologist is concerned about the effective- ness of a training course designed to get more drivers to use seat belts in automobiles.
(a) What hypothesis is she testing if she commits a type I error by erroneously concluding that the training course is ineffective?
(b) What hypothesis is she testing if she commits a type II error by erroneously concluding that the training course is effective?
10.3 A large manufacturing firm is being charged with discrimination in its hiring practices.
(a) What hypothesis is being tested if a jury commits a type I error by finding the firm guilty?
(b) What hypothesis is being tested if a jury commits a type II error by finding the firm guilty?
10.4 A fabric manufacturer believes that the proportion of orders for raw material arriving late is p = 0.6. If a random sample of 10 orders shows that 3 or fewer arrived late, the hypothesis that p = 0.6 should be rejected in favor of the alternative p < 0.6. Use the binomial distribution.
(a) Find the probability of committing a type I error if the true proportion is p = 0.6.
(b) Find the probability of committing a type II error for the alternatives p = 0.3, p = 0.4, and p = 0.5.

10.5 Repeat Exercise 10.4 but assume that 50 orders are selected and the critical region is defined to be x ≤ 24, where x is the number of orders in the sample that arrived late. Use the normal approximation.

10.6 The proportion of adults living in a small town who are college graduates is estimated to be p = 0.6. To test this hypothesis, a random sample of 15 adults is selected. If the number of college graduates in the sample is anywhere from 6 to 12, we shall not reject the null hypothesis that p = 0.6; otherwise, we shall conclude that p ≠ 0.6.
(a) Evaluate α assuming that p = 0.6. Use the binomial distribution.
(b) Evaluate β for the alternatives p = 0.5 and p = 0.7.
(c) Is this a good test procedure?

10.7 Repeat Exercise 10.6 but assume that 200 adults are selected and the fail-to-reject region is defined to be 110 ≤ x ≤ 130, where x is the number of college graduates in our sample. Use the normal approximation.

10.8 In Relief from Arthritis published by Thorsons Publishers, Ltd., John E. Croft claims that over 40% of those who suffer from osteoarthritis receive measurable relief from an ingredient produced by a particular species of mussel found off the coast of New Zealand. To test this claim, the mussel extract is to be given to a group of 7 osteoarthritic patients. If 3 or more of the patients receive relief, we shall not reject the null hypothesis that p = 0.4; otherwise, we conclude that p < 0.4.
(a) Evaluate α, assuming that p = 0.4.
(b) Evaluate β for the alternative p = 0.3.

10.9 A dry cleaning establishment claims that a new spot remover will remove more than 70% of the spots to which it is applied. To check this claim, the spot remover will be used on 12 spots chosen at random. If fewer than 11 of the spots are removed, we shall not reject the null hypothesis that p = 0.7; otherwise, we conclude that p > 0.7.
(a) Evaluate α, assuming that p = 0.7.
(b) Evaluate β for the alternative p = 0.9.
10.10 Repeat Exercise 10.9 but assume that 100 spots are treated and the critical region is defined to be x > 82, where x is the number of spots removed.
10.11 Repeat Exercise 10.8 but assume that 70 patients are given the mussel extract and the critical region is defined to be x < 24, where x is the number of osteoarthritic patients who receive relief.

10.12 A random sample of 400 voters in a certain city are asked if they favor an additional 4% gasoline sales tax to provide badly needed revenues for street repairs. If more than 220 but fewer than 260 favor the sales tax, we shall conclude that 60% of the voters are for it.
(a) Find the probability of committing a type I error if 60% of the voters favor the increased tax.
(b) What is the probability of committing a type II error using this test procedure if actually only 48% of the voters are in favor of the additional gasoline tax?

10.13 Suppose, in Exercise 10.12, we conclude that 60% of the voters favor the gasoline sales tax if more than 214 but fewer than 266 voters in our sample favor it. Show that this new critical region results in a smaller value for α at the expense of increasing β.

10.14 A manufacturer has developed a new fishing line, which the company claims has a mean breaking strength of 15 kilograms with a standard deviation of 0.5 kilogram. To test the hypothesis that μ = 15 kilograms against the alternative that μ < 15 kilograms, a random sample of 50 lines will be tested. The critical region is defined to be x̄ < 14.9.
(a) Find the probability of committing a type I error when H0 is true.
(b) Evaluate β for the alternatives μ = 14.8 and μ = 14.9 kilograms.

10.15 A soft-drink machine at a steak house is regulated so that the amount of drink dispensed is approximately normally distributed with a mean of 200 milliliters and a standard deviation of 15 milliliters. The machine is checked periodically by taking a sample of 9 drinks and computing the average content. If x̄ falls in the interval 191 < x̄ < 209, the machine is thought to be operating satisfactorily; otherwise, we conclude that μ ≠ 200 milliliters.
(a) Find the probability of committing a type I error when μ = 200 milliliters.
(b) Find the probability of committing a type II error when μ = 215 milliliters.

10.16 Repeat Exercise 10.15 for samples of size n = 25. Use the same critical region.

10.17 A new curing process developed for a certain type of cement results in a mean compressive strength of 5000 kilograms per square centimeter with a standard deviation of 120 kilograms. To test the hypothesis that μ = 5000 against the alternative that μ < 5000, a random sample of 50 pieces of cement is tested. The critical region is defined to be x̄ < 4970.
(a) Find the probability of committing a type I error when H0 is true.
(b) Evaluate β for the alternatives μ = 4970 and μ = 4960.

10.18 If we plot the probabilities of failing to reject H0 corresponding to various alternatives for μ (including the value specified by H0) and connect all the points by a smooth curve, we obtain the operating characteristic curve of the test criterion, or simply the OC curve. Note that the probability of failing to reject H0 when it is true is simply 1 − α. Operating characteristic curves are widely used in industrial applications to provide a visual display of the merits of the test criterion. With reference to Exercise 10.15, find the probabilities of failing to reject H0 for the following 9 values of μ and plot the OC curve: 184, 188, 192, 196, 200, 204, 208, 212, and 216.
// 336 Chapter 10 One- and Two-Sample Tests of Hypotheses 10.4 Single Sample: Tests Concerning a Single Mean In this section, we formally consider tests of hypotheses on a single population mean. Many of the illustrations from previous sections involved tests on the mean, so the reader should already have insight into some of the details that are outlined here. Tests on a Single Mean (Variance Known) We should first describe the assumptions on which the experiment is based. The model for the underlying situation centers around an experiment with X1, X2, . . . , Xn representing a random sample from a distribution with mean μ and variance σ2 > 0. Consider first the hypothesis
H0: μ = μ0, H1: μ ̸= μ0.
The appropriate test statistic should be based on the random variable X ̄. In
Chapter 8, the Central Limit Theorem was introduced, which essentially states
that despite the distribution of X, the random variable X ̄ has approximately a
normal distribution with mean μ and variance σ2/n for reasonably large sample
sizes. So, μX ̄ = μ and σ2 ̄ = σ2/n. We can then determine a critical region based X
on the computed sample average, x ̄. It should be clear to the reader by now that there will be a two-tailed critical region for the test.
Standardization of X ̄
It is convenient to standardize X ̄ and formally involve the standard normal
random variable Z, where
X ̄ − μ Z = σ/√n .
We know that under H0, that is, if μ = μ0, √n(X ̄ − μ0)/σ follows an n(x; 0, 1) distribution, and hence the expression
􏰧 X ̄−μ 􏰨 P−zα/2< √0zα/2 or z= σ/√n <−zα/2 If −zα/2 acceptance of the alternative hypothesis μ ̸= μ0. With this definition of the critical region, it should be clear that there will be probability α of rejecting H0 (falling into the critical region) when, indeed, μ = μ0. Although it is easier to understand the critical region written in terms of z, we can write the same critical region in terms of the computed average x ̄. The following can be written as an identical decision procedure: < z < zα/2 , do not reject H0 . Rejection of H0 , of course, implies where reject H0 ifx ̄b, σσ
a=μ0 −zα/2√n, b=μ0 +zα/2√n.
Hence, for a significance level α, the critical values of the random variable z and x ̄
are both depicted in Figure 10.9.
1􏱍α
α/2 α/2 aμb
x
Figure 10.9: Critical region for the alternative hypothesis μ ̸= μ0.
Tests of one-sided hypotheses on the mean involve the same statistic described in the two-sided case. The difference, of course, is that the critical region is only in one tail of the standard normal distribution. For example, suppose that we seek to test
H0: μ=μ0, H1: μ>μ0.
The signal that favors H1 comes from large values of z. Thus, rejection of H0 results when the computed z > zα. Obviously, if the alternative is H1: μ < μ0, the critical region is entirely in the lower tail and thus rejection results from z < −zα. Although in a one-sided testing case the null hypothesis can be written as H0 : μ ≤ μ0 or H0: μ ≥ μ0, it is usually written as H0: μ = μ0. The following two examples illustrate tests on means for the case in which σ is known. 338 Chapter 10 One- and Two-Sample Tests of Hypotheses Example 10.3: Solution : A random sample of 100 recorded deaths in the United States during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance. 1. H0: μ = 70 years. 2. H1: μ > 70 years.
3. α = 0.05.
4. Critical region: z > 1.645, where z = x ̄−μ0 .

σ/ n
5. Computations: x ̄ = 71.8 years, σ = 8.9 years, and hence z = 71.8−70 = 2.02.

Example 10.4:
Solution :
8.9/ 100
6. Decision: Reject H0 and conclude that the mean life span today is greater
than 70 years.
The P-value corresponding to z = 2.02 is given by the area of the shaded region in Figure 10.10.
Using Table A.3, we have
P = P(Z > 2.02) = 0.0217.
As a result, the evidence in favor of H1 is even stronger than that suggested by a 0.05 level of significance.
A manufacturer of sports equipment has developed a new synthetic fishing line that the company claims has a mean breaking strength of 8 kilograms with a standard deviation of 0.5 kilogram. Test the hypothesis that μ = 8 kilograms against the alternative that μ ̸= 8 kilograms if a random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kilograms. Use a 0.01 level of significance.
1. H0: μ = 8 kilograms.
2. H1: μ ̸= 8 kilograms.
3. α = 0.01.
4. Critical region: z < −2.575 and z > 2.575, where z = x ̄−μ0 .

σ/ n
5. Computations: x ̄ = 7.8 kilograms, n = 50, and hence z = 7.8−8 = −2.83.
equal to 8 but is, in fact, less than 8 kilograms.
Since the test in this example is two tailed, the desired P-value is twice the area of the shaded region in Figure 10.11 to the left of z = −2.83. Therefore, using Table A.3, we have
P = P(|Z| > 2.83) = 2P(Z < −2.83) = 0.0046, which allows us to reject the null hypothesis that μ = 8 kilograms at a level of significance smaller than 0.01. √ 0.5/ 50 6. Decision: Reject H0 and conclude that the average breaking strength is not 10.4 Single Sample: Tests Concerning a Single Mean 339 P P/2 P/2 zz 0 2.02 −2.83 0 2.83 Figure 10.10: P-value for Example 10.3. Figure 10.11: P-value for Example 10.4. Relationship to Confidence Interval Estimation The reader should realize by now that the hypothesis-testing approach to statistical inference in this chapter is very closely related to the confidence interval approach in Chapter 9. Confidence interval estimation involves computation of bounds within which it is “reasonable” for the parameter in question to lie. For the case of a single population mean μ with σ2 known, the structure of both hypothesis testing and confidence interval estimation is based on the random variable X ̄ − μ Z = σ/√n . It turns out that the testing of H0: μ = μ0 against H1: μ ̸= μ0 at a significance level α is equivalent to computing a 100(1 − α)% confidence interval on μ and rejecting H0 if μ0 is outside the confidence interval. If μ0 is inside the confidence interval, the hypothesis is not rejected. The equivalence is very intuitive and quite simple to illustrate. Recall that with an observed value x ̄, failure to reject H0 at significance level α implies that which is equivalent to x ̄ − μ 0 −zα/2 ≤ σ/√n ≤ zα/2, σσ x ̄−zα/2√n ≤μ0 ≤x ̄+zα/2√n. The equivalence of confidence interval estimation to hypothesis testing extends to differences between two means, variances, ratios of variances, and so on. As a result, the student of statistics should not consider confidence interval estimation and hypothesis testing as separate forms of statistical inference. For example, consider Example 9.2 on page 271. The 95% confidence interval on the mean is given by the bounds (2.50, 2.70). Thus, with the same sample information, a two- sided hypothesis on μ involving any hypothesized value between 2.50 and 2.70 will not be rejected. As we turn to different areas of hypothesis testing, the equivalence to the confidence interval estimation will continue to be exploited. 340 Chapter 10 One- and Two-Sample Tests of Hypotheses Tests on a Single Sample (Variance Unknown) The t-Statistic for a Test on a Single Mean (Variance Unknown) One would certainly suspect that tests on a population mean μ with σ2 unknown, like confidence interval estimation, should involve the use of Student t-distribution. Strictly speaking, the application of Student t for both confidence intervals and hypothesis testing is developed under the following assumptions. The random variables X1, X2, . . . , Xn represent a random sample from a normal distribution with unknown μ and σ2. Then the random variable √n(X ̄ − μ)/S has a Student t-distribution with n−1 degrees of freedom. The structure of the test is identical to that for the case of σ known, with the exception that the value σ in the test statistic is replaced by the computed estimate S and the standard normal distribution is replaced by a t-distribution. For the two-sided hypothesis H0: μ=μ0, H1: μ̸=μ0, t= s/√n exceeds tα/2,n−1 or is less than −tα/2,n−1. The reader should recall from Chapters 8 and 9 that the t-distribution is symmetric around the value zero. Thus, this two-tailed critical region applies in a fashion similar to that for the case of known σ. For the two-sided hypothesis at significance level α, the two-tailed critical regions apply. 
For H1: μ > μ0, rejection results when t > tα,n−1. For H1: μ < μ0, the critical region is given by t < −tα,n−1. The Edison Electric Institute has published figures on the number of kilowatt hours used annually by various home appliances. It is claimed that a vacuum cleaner uses an average of 46 kilowatt hours per year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners use an average of 42 kilowatt hours per year with a standard deviation of 11.9 kilowatt hours, does this suggest at the 0.05 level of significance that vacuum cleaners use, on average, less than 46 kilowatt hours annually? Assume the population of kilowatt hours to be normal. we reject H0 at significance level α when the computed t-statistic x ̄ − μ 0 Example 10.5: Solution : 1. H0: μ = 46 kilowatt hours. 2. H1: μ < 46 kilowatt hours. 3. α = 0.05. 4. Critical region: t < −1.796, where t = x ̄−μ0 Hence, with 11 degrees of freedom. √ s/ n 5. Computations: x ̄ = 42 kilowatt hours, s = 11.9 kilowatt hours, and n = 12. 42−46 t= √ =−1.16, P =P(T <−1.16)≈0.135. 11.9/ 12 10.4 Single Sample: Tests Concerning a Single Mean 341 6. Decision: Do not reject H0 and conclude that the average number of kilowatt hours used annually by home vacuum cleaners is not significantly less than 46. Comment on the Single-Sample t-Test The reader has probably noticed that the equivalence of the two-tailed t-test for a single mean and the computation of a confidence interval on μ with σ replaced by s is maintained. For example, consider Example 9.5 on page 275. Essentially, we can view that computation as one in which we have found all values of μ0, the hypothesized mean volume of containers of sulfuric acid, for which the hypothesis H0: μ = μ0 will not be rejected at α = 0.05. Again, this is consistent with the statement “Based on the sample information, values of the population mean volume between 9.74 and 10.26 liters are not unreasonable.” Comments regarding the normality assumption are worth emphasizing at this point. We have indicated that when σ is known, the Central Limit Theorem allows for the use of a test statistic or a confidence interval which is based on Z, the standard normal random variable. Strictly speaking, of course, the Central Limit Theorem, and thus the use of the standard normal distribution, does not apply unless σ is known. In Chapter 8, the development of the t-distribution was given. There we pointed out that normality on X1, X2, . . . , Xn was an underlying assumption. Thus, strictly speaking, the Student’s t-tables of percentage points for tests or confidence intervals should not be used unless it is known that the sample comes from a normal population. In practice, σ can rarely be assumed to be known. However, a very good estimate may be available from previous experiments. Many statistics textbooks suggest that one can safely replace σ by s in the test statistic x ̄ − μ 0 z = σ/√n when n ≥ 30 with a bell-shaped population and still use the Z-tables for the appropriate critical region. The implication here is that the Central Limit Theorem is indeed being invoked and one is relying on the fact that s ≈ σ. Obviously, when this is done, the results must be viewed as approximate. Thus, a computed P- value (from the Z-distribution) of 0.15 may be 0.12 or perhaps 0.17, or a computed confidence interval may be a 93% confidence interval rather than a 95% interval as desired. Now what about situations where n ≤ 30? 
The user cannot rely on s being close to σ, and in order to take into account the inaccuracy of the estimate, the confidence interval should be wider or the critical value larger in magnitude. The t-distribution percentage points accomplish this but are correct only when the sample is from a normal distribution. Of course, normal probability plots can be used to ascertain some sense of the deviation of normality in a data set. For small samples, it is often difficult to detect deviations from a normal dis- tribution. (Goodness-of-fit tests are discussed in a later section of this chapter.) For bell-shaped distributions of the random variables X1,X2,...,Xn, the use of the t-distribution for tests or confidence intervals is likely to produce quite good results. When in doubt, the user should resort to nonparametric procedures, which are presented in Chapter 16. 342 Chapter 10 One- and Two-Sample Tests of Hypotheses Annotated Computer Printout for Single-Sample t-Test It should be of interest for the reader to see an annotated computer printout showing the result of a single-sample t-test. Suppose that an engineer is interested in testing the bias in a pH meter. Data are collected on a neutral substance (pH = 7.0). A sample of the measurements were taken with the data as follows: 7.07 7.00 7.10 6.97 7.00 7.03 7.01 7.01 6.98 7.08 It is, then, of interest to test H0: μ = 7.0, H1: μ ̸= 7.0. In this illustration, we use the computer package MINITAB to illustrate the anal- ysis of the data set above. Notice the key components of the printout shown in Figure 10.12. Of course, the mean y ̄ is 7.0250, StDev is simply the sample standard deviation s = 0.044, and SE Mean is the estimated standard error of the mean and is computed as s/√n = 0.0139. The t-value is the ratio (7.0250 − 7)/0.0139 = 1.80. pH-meter 7.07 7.00 7.10 6.97 7.00 7.03 7.01 7.01 MTB > Onet ’pH-meter’; SUBC> Test 7.
One-Sample T: pH-meter Test of mu = 7 vs not = 7
Variable N Mean StDev SE Mean 95% CI
pH-meter 10 7.02500 0.04403 0.01392 (6.99350, 7.05650) 1.80 0.106
Figure 10.12: MINITAB printout for one sample t-test for pH meter.
The P-value of 0.106 suggests results that are inconclusive. There is no evi- dence suggesting a strong rejection of H0 (based on an α of 0.05 or 0.10), yet one certainly cannot truly conclude that the pH meter is unbiased. Notice that the sample size of 10 is rather small. An increase in sample size (perhaps an- other experiment) may sort things out. A discussion regarding appropriate sample size appears in Section 10.6.
Two Samples: Tests on Two Means
The reader should now understand the relationship between tests and confidence intervals, and can only heavily rely on details supplied by the confidence interval material in Chapter 9. Tests concerning two means represent a set of very impor- tant analytical tools for the scientist or engineer. The experimental setting is very much like that described in Section 9.8. Two independent random samples of sizes
6.98 7.08
T P
10.5

10.5 Two Samples: Tests on Two Means 343 n1 and n2, respectively, are drawn from two populations with means μ1 and μ2
and variances σ12 and σ2. We know that the random variable ( X ̄ 1 − X ̄ 2 ) − ( μ 1 − μ 2 )
has a standard normal distribution. Here we are assuming that n1 and n2 are sufficiently large that the Central Limit Theorem applies. Of course, if the two populations are normal, the statistic above has a standard normal distribution even for small n1 and n2. Obviously, if we can assume that σ1 = σ2 = σ, the statistic above reduces to
( X ̄ 1 − X ̄ 2 ) − ( μ 1 − μ 2 ) Z=􏰱.
σ 1/n1 + 1/n2
The two statistics above serve as a basis for the development of the test procedures involving two means. The equivalence between tests and confidence intervals, along with the technical detail involving tests on one mean, allow a simple transition to tests on two means.
The two-sided hypothesis on two means can be written generally as H0: μ1 − μ2 = d0.
Obviously, the alternative can be two sided or one sided. Again, the distribu- tion used is the distribution of the test statistic under H0. Values x ̄1 and x ̄2 are computed and, for σ1 and σ2 known, the test statistic is given by
( x ̄ 1 − x ̄ 2 ) − d 0 z=􏰱σ2/n +σ2/n ,
Z = 􏰱σ12/n1 + σ2/n2
with a two-tailed critical region in the case of a two-sided alternative. That is, reject H0 in favor of H1: μ1 − μ2 ̸= d0 if z > zα/2 or z < −zα/2 . One-tailed critical regions are used in the case of the one-sided alternatives. The reader should, as before, study the test statistic and be satisfied that for, say, H1: μ1 − μ2 > d0, the signal favoring H1 comes from large values of z. Thus, the upper-tailed critical region applies.
Unknown But Equal Variances
The more prevalent situations involving tests on two means are those in which variances are unknown. If the scientist involved is willing to assume that both distributions are normal and that σ1 = σ2 = σ, the pooled t-test (often called the two-sample t-test) may be used. The test statistic (see Section 9.8) is given by the following test procedure.
1122

344 Chapter 10 One- and Two-Sample Tests of Hypotheses
Two-Sample For the two-sided hypothesis Pooled t-Test
H0: μ1 = μ2, H1: μ1 ̸= μ2,
we reject H0 at significance level α when the computed t-statistic
where
( x ̄ 1 − x ̄ 2 ) − d 0 t=􏰱,
sp 1/n1 + 1/n2
s2p = s21(n1 −1)+s2(n2 −1) n1 +n2 −2
exceeds tα/2,n1+n2−2 or is less than −tα/2,n1+n2−2.
Recall from Chapter 9 that the degrees of freedom for the t-distribution are a result of pooling of information from the two samples to estimate σ2. One-sided alternatives suggest one-sided critical regions, as one might expect. For example, f o r H 1: μ 1 − μ 2 > d 0 , r e j e c t H 1: μ 1 − μ 2 = d 0 w h e n t > t α , n 1 + n 2 − 2 .
Example 10.6: An experiment was performed to compare the abrasive wear of two different lami- nated materials. Twelve pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the populations to be approximately normal with equal variances.
Solution: Let μ1 and μ2 represent the population means of the abrasive wear for material 1 and material 2, respectively.
1. H0: μ1 − μ2 = 2.
2. H1: μ1 − μ2 > 2.
3. α = 0.05.
4. Critical region: t > 1.725, where t =
(x ̄1 −x ̄2 )−d0
freedom.
5. Computations:
x ̄1 =85, s1 =4, x ̄2 =81, s2 =5,
n1 =12, n2 =10.
sp

1/n1 +1/n2
with v = 20 degrees of

10.5 Two Samples: Tests on Two Means
345
Hence
6. Decision: Do not reject H0. We are unable to conclude that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units.
Unknown But Unequal Variances
There are situations where the analyst is not able to assume that σ1 = σ2. Recall from Section 9.8 that, if the populations are normal, the statistic
′ ( X ̄ 1 − X ̄ 2 ) − d 0 T = 􏰱s21/n1 + s2/n2
sp =
􏰼
(11)(16) + (9)(25) = 4.478, 12+10−2
(85−81)−2
􏰱 = 1.04,
t =
P = P (T > 1.04) ≈ 0.16. (See Table A.4.)
4.478 1/12 + 1/10
has an approximate t-distribution with approximate degrees of freedom (s21/n1 + s2/n2)2
v = (s21/n1)2/(n1 − 1) + (s2/n2)2/(n2 − 1). As a result, the test procedure is to not reject H0 when
−tα/2,v < t′ < tα/2,v, with v given as above. Again, as in the case of the pooled t-test, one-sided alter- natives suggest one-sided critical regions. Paired Observations A study of the two-sample t-test or confidence interval on the difference between means should suggest the need for experimental design. Recall the discussion of experimental units in Chapter 9, where it was suggested that the conditions of the two populations (often referred to as the two treatments) should be assigned randomly to the experimental units. This is done to avoid biased results due to systematic differences between experimental units. In other words, in hypothesis- testing jargon, it is important that any significant difference found between means be due to the different conditions of the populations and not due to the exper- imental units in the study. For example, consider Exercise 9.40 in Section 9.9. The 20 seedlings play the role of the experimental units. Ten of them are to be treated with nitrogen and 10 with no nitrogen. It may be very important that this assignment to the “nitrogen” and “no-nitrogen” treatments be random to en- sure that systematic differences between the seedlings do not interfere with a valid comparison between the means. In Example 10.6, time of measurement is the most likely choice for the experi- mental unit. The 22 pieces of material should be measured in random order. We 346 Chapter 10 One- and Two-Sample Tests of Hypotheses need to guard against the possibility that wear measurements made close together in time might tend to give similar results. Systematic (nonrandom) differences in experimental units are not expected. However, random assignments guard against the problem. References to planning of experiments, randomization, choice of sample size, and so on, will continue to influence much of the development in Chapters 13, 14, and 15. Any scientist or engineer whose interest lies in analysis of real data should study this material. The pooled t-test is extended in Chapter 13 to cover more than two means. Testing of two means can be accomplished when data are in the form of paired observations, as discussed in Chapter 9. In this pairing structure, the conditions of the two populations (treatments) are assigned randomly within homogeneous units. Computation of the confidence interval for μ1 − μ2 in the situation with paired observations is based on the random variable D ̄ − μ D T = Sd/√n , where D ̄ and Sd are random variables representing the sample mean and standard deviation of the differences of the observations in the experimental units. As in the case of the pooled t-test, the assumption is that the observations from each popu- lation are normal. This two-sample problem is essentially reduced to a one-sample problem by using the computed differences d1, d2, . . . , dn. Thus, the hypothesis reduces to H0: μD =d0. The computed test statistic is then given by d−d0 t = sd/√n. Critical regions are constructed using the t-distribution with n − 1 degrees of free- dom. Problem of Interaction in a Paired t-Test Not only will the case study that follows illustrate the use of the paired t-test but the discussion will shed considerable light on the difficulties that arise when there is an interaction between the treatments and the experimental units in the paired t structure. Recall that interaction between factors was introduced in Section 1.7 in a discussion of general types of statistical studies. 
The concept of interaction will be an important issue from Chapter 13 through Chapter 15. There are some types of statistical tests in which the existence of interaction results in difficulty. The paired t-test is one such example. In Section 9.9, the paired structure was used in the computation of a confidence interval on the difference between two means, and the advantage in pairing was revealed for situations in which the experimental units are homogeneous. The pairing results in a reduction in σD, the standard deviation of a difference Di = X1i − X2i, as discussed in 10.5 Two Samples: Tests on Two Means 347 Section 9.9. If interaction exists between treatments and experimental units, the advantage gained in pairing may be substantially reduced. Thus, in Example 9.13 on page 293, the no interaction assumption allowed the difference in mean TCDD levels (plasma vs. fat tissue) to be the same across veterans. A quick glance at the data would suggest that there is no significant violation of the assumption of no interaction. In order to demonstrate how interaction influences Var(D) and hence the quality of the paired t-test, it is instructive to revisit the ith difference given by Di = X1i − X2i = (μ1 − μ2) + (ε1 − ε2), where X1i and X2i are taken on the ith experimental unit. If the pairing unit is homogeneous, the errors in X1i and in X2i should be similar and not independent. We noted in Chapter 9 that the positive covariance between the errors results in a reduced Var(D). Thus, the size of the difference in the treatments and the relationship between the errors in X1i and X2i contributed by the experimental unit will tend to allow a significant difference to be detected. What Conditions Result in Interaction? Let us consider a situation in which the experimental units are not homogeneous. Rather, consider the ith experimental unit with random variables X1i and X2i that are not similar. Let ε1i and ε2i be random variables representing the errors in the values X1i and X2i, respectively, at the ith unit. Thus, we may write X1i = μ1 + ε1i and X2i = μ2 + ε2i. The errors with expectation zero may tend to cause the response values X1i and X2i to move in opposite directions, resulting in a negative value for Cov(ε1i,ε2i) and hence negative Cov(X1i,X2i). In fact, the model may be complicated even more by the fact that σ12 = Var(ε1i) ̸= σ2 = Var(ε2i). The variance and covari- ance parameters may vary among the n experimental units. Thus, unlike in the homogeneous case, Di will tend to be quite different across experimental units due to the heterogeneous nature of the difference in ε1 − ε2 among the units. This produces the interaction between treatments and units. In addition, for a specific experimental unit (see Theorem 4.9), σD2 = Var(D) = Var(ε1) + Var(ε2) − 2 Cov(ε1, ε2) is inflated by the negative covariance term, and thus the advantage gained in pairing in the homogeneous unit case is lost in the case described here. While the inflation in Var(D) will vary from case to case, there is a danger in some cases that the increase in variance may neutralize any difference that exists between μ1 and μ2. Of course, a large value of d ̄ in the t-statistic may reflect a treatment difference that overcomes the inflated variance estimate, s2d. Case Study 10.1: Blood Sample Data: In a study conducted in the Forestry and Wildlife De- partment at Virginia Tech, J. A. Wesson examined the influence of the drug suc- cinylcholine on the circulation levels of androgens in the blood. 
Blood samples were taken from wild, free-ranging deer immediately after they had received an intramuscular injection of succinylcholine administered using darts and a capture gun. A second blood sample was obtained from each deer 30 minutes after the 348 Chapter 10 One- and Two-Sample Tests of Hypotheses first sample, after which the deer was released. The levels of androgens at time of capture and 30 minutes later, measured in nanograms per milliliter (ng/mL), for 15 deer are given in Table 10.2. Assuming that the populations of androgen levels at time of injection and 30 minutes later are normally distributed, test at the 0.05 level of significance whether the androgen concentrations are altered after 30 minutes. Table 10.2: Data for Case Study 10.1 Androgen (ng/mL) Deer At Time of Injection 1 2.76 2 5.18 3 2.68 4 3.05 5 4.10 6 7.05 7 6.60 8 4.79 9 7.39 10 7.30 11 11.78 12 3.90 13 26.00 14 67.48 15 17.04 30 Minutes after Injection di 7.02 4.26 3.10 −2.08 5.44 2.76 3.99 0.94 5.21 1.11 10.26 3.21 13.91 7.31 18.53 13.74 7.91 0.52 4.85 −2.45 11.10 −0.68 3.74 −0.16 94.03 68.03 94.03 26.55 41.70 24.66 Solution: Let μ1 and μ2 be the average androgen concentration at the time of injection and 30 minutes later, respectively. We proceed as follows: 1. H0: μ1 = μ2 or μD = μ1 − μ2 = 0. 2. H1: μ1 ̸= μ2 or μD = μ1 − μ2 ̸= 0. 3. α = 0.05. 4. Critical region: t < −2.145 and t > 2.145, where t = d−d0
with v = 14 5. Computations: The sample mean and standard deviation for the di are

degrees of freedom.
Therefore,
As a result, there is some evidence that there is a difference in mean circulating levels of androgen.
9.848 − 0
√ = 2.06.
t =
6. Though the t-statistic is not significant at the 0.05 level, from Table A.4,
sD/ n
d = 9.848 and sd = 18.474.
18.474/ 15
P = P(|T| > 2.06) ≈ 0.06.

10.6 Choice of Sample Size for Testing Means 349
The assumption of no interaction would imply that the effect on androgen levels of the deer is roughly the same in the data for both treatments, i.e., at the time of injection of succinylcholine and 30 minutes following injection. This can be expressed with the two factors switching roles; for example, the difference in treatments is roughly the same across the units (i.e., the deer). There certainly are some deer/treatment combinations for which the no interaction assumption seems to hold, but there is hardly any strong evidence that the experimental units are homogeneous. However, the nature of the interaction and the resulting increase in Var(D ̄) appear to be dominated by a substantial difference in the treatments. This is further demonstrated by the fact that 11 of the 15 deer exhibited positive signs for the computed di and the negative di (for deer 2, 10, 11, and 12) are small in magnitude compared to the 12 positive ones. Thus, it appears that the mean level of androgen is significantly higher 30 minutes following injection than at injection, and the conclusions may be stronger than p = 0.06 would suggest.
Annotated Computer Printout for Paired t-Test
Figure 10.13 displays a SAS computer printout for a paired t-test using the data of Case Study 10.1. Notice that the printout looks like that for a single sample t-test and, of course, that is exactly what is accomplished, since the test seeks to determine if d is significantly different from zero.
Analysis Variable : Diff
N Mean Std Error t Value Pr > |t|
———————————————————
15 9.8480000 4.7698699 2.06 0.0580
———————————————————
Figure 10.13: SAS printout of paired t-test for data of Case Study 10.1. Summary of Test Procedures
As we complete the formal development of tests on population means, we offer Table 10.3, which summarizes the test procedure for the cases of a single mean and two means. Notice the approximate procedure when distributions are normal and variances are unknown but not assumed to be equal. This statistic was introduced in Chapter 9.
10.6 Choice of Sample Size for Testing Means
In Section 10.2, we demonstrated how the analyst can exploit relationships among the sample size, the significance level α, and the power of the test to achieve a certain standard of quality. In most practical circumstances, the experiment should be planned, with a choice of sample size made prior to the data-taking process if possible. The sample size is usually determined to achieve good power for a fixed α and fixed specific alternative. This fixed alternative may be in the

350
Chapter 10 One- and Two-Sample Tests of Hypotheses
Table 10.3: Tests Concerning Means
H0 μ=μ0
Value of Test Statistic
x ̄ − μ 0
z= σ/√n; σknown
x ̄ − μ 0
t= s/√n; v=n−1,
H1 μ < μ 0 μ>μ0 μ̸=μ0
μ < μ 0 μ>μ
Critical Region
z < − z α z>zα
z<−zα/2 orz>zα/2
t < − t α t>t
μ=μ
00α
σ unknown
( x ̄ 1 − x ̄ 2 ) − d 0
μ ̸= μ0 μ1−μ2tα/2 z<−zα z>z
z<−zα/2 orz>zα/2
t<−tα t>tα
t<−tα/2 or t>tα/2
μ−μ=d z=􏰱2 2 ;
120 σ1/n1+σ2/n2 120
μ−μ>d μ1−μ2̸=d0
α
μ1 − μ2 = d0
μ1 − μ2 = d0 μD = d0
paired observations
σ1 and σ2 known
( x ̄ 1 − x ̄ 2 ) − d 0 t=􏰱;
sp 1/n1 + 1/n2 v = n1 + n2 − 2,
σ1 = σ2 but unknown,
s2p = (n1 −1)s21 +(n2 −1)s2
n1 +n2 −2
μ1−μ2d0 μ1−μ2̸=d0
′ ( x ̄ 1 − x ̄ 2 ) − d 0
t=􏰱22; ′
s1/n1 + s2/n2 (s21/n1 + s2/n2)2
μ1−μ2d0 μ1−μ2̸=d0
μD < d0 μD > d0 μD ̸= d0
t<−tα t′>tα t′<−tα/2ort′>tα/2
t<−tα t>tα
t<−tα/2 or t>tα/2
v = (s21/n1)2 + (s2/n2)2 , n1 −1 n2 −1
σ1 ̸= σ2 and unknown d−d0
t = sd/√n; v=n−1
form of μ − μ0 in the case of a hypothesis involving a single mean or μ1 − μ2 in the case of a problem involving two means. Specific cases will provide illustrations.
Suppose that we wish to test the hypothesis H0: μ=μ0,
H1: μ>μ0,
with a significance level α, when the variance σ2 is known. For a specific alternative,
say μ = μ0 + δ, the power of our test is shown in Figure 10.14 to be 1 − β = P ( X ̄ > a w h e n μ = μ 0 + δ ) .
Therefore,
β = P ( X ̄ < a w h e n μ = μ 0 + δ ) 􏰮 X ̄ − ( μ 0 + δ ) a − ( μ 0 + δ ) 􏰯 =P σ/√n < σ/√n whenμ=μ0+δ . 10.6 Choice of Sample Size for Testing Means 351 βα μ0 a μ0+δ Figure 10.14: Testing μ = μ0 versus μ = μ0 + δ. x Under the alternative hypothesis μ = μ0 + δ, the statistic X ̄ − ( μ 0 + δ ) σ/√n is the standard normal variable Z. So 􏰧a−μ0δ􏰨􏰧 δ􏰨 β=P Z<σ/√n−σ/√n =P Z 68 kilograms
for the weights of male students at a certain college, using an α = 0.05 level of significance, when it is known that σ = 5. Find the sample size required if the power of our test is to be 0.95 when the true mean is 69 kilograms.

352
Chapter 10 One- and Two-Sample Tests of Hypotheses
Solution : Since α = β = 0.05, we have zα = zβ = 1.645. For the alternative β = 69, we take δ = 1 and then
n = (1.645 + 1.645)2(25) = 270.6. 1
Therefore, 271 observations are required if the test is to reject the null hypothesis 95% of the time when, in fact, μ is as large as 69 kilograms.
Two-Sample Case
A similar procedure can be used to determine the sample size n = n1 = n2 required for a specific power of the test in which two population means are being compared. For example, suppose that we wish to test the hypothesis
H0: μ1 − μ2 = d0, H1: μ1 − μ2 ̸= d0,
when σ1 and σ2 are known. For a specific alternative, say μ1 − μ2 = d0 + δ, the power of our test is shown in Figure 10.15 to be
1 − β = P ( | X ̄ 1 − X ̄ 2 | > a w h e n μ 1 − μ 2 = d 0 + δ ) .
α2 βα2
−a d0 a d0+δ
x
Figure 10.15: Testing μ1 − μ2 = d0 versus μ1 − μ2 = d0 + δ. Therefore,
β = P ( − a < X ̄ 1 − X ̄ 2 < a w h e n μ 1 − μ 2 = d 0 + δ ) 􏰷 = P 􏰱(σ12 + σ2)/n < a−(d0 +δ) −a−(d0 +δ) (X ̄1 −X ̄2)−(d0 +δ) 􏰱(σ12 + σ2)/n <􏰱(σ12+σ2)/nwhenμ1−μ2=d0+δ . Under the alternative hypothesis μ1 − μ2 = d0 + δ, the statistic X ̄1 −X ̄2 −(d0 +δ) 􏰱(σ12 + σ2)/n 􏰸 10.6 Choice of Sample Size for Testing Means 353 is the standard normal variable Z. Now, writing −zα/2=􏰱−a−d0 and zα/2=􏰱 a−d0 , (σ12 + σ2)/n (σ12 + σ2)/n 􏰷􏰸 we have β = P −zα/2 − 􏰱 δ < Z < zα/2 − 􏰱 δ , (σ12 + σ2)/n (σ12 + σ2)/n from which we conclude that −zβ ≈ zα/2 − 􏰱 δ (σ12 +σ2)/n n ≈ (zα/2 + zβ)2(σ12 + σ2). δ2 , and hence For the one-tailed test, the expression for the required sample size when n = n1 = n2 is Choice of sample size: n = (zα + zβ)2(σ12 + σ2). δ2 When the population variance (or variances, in the two-sample situation) is un- known, the choice of sample size is not straightforward. In testing the hypothesis μ=μ0 whenthetruevalueisμ=μ0+δ,thestatistic X ̄ − ( μ 0 + δ ) S/√n does not follow the t-distribution, as one might expect, but instead follows the noncentral t-distribution. However, tables or charts based on the noncentral t-distribution do exist for determining the appropriate sample size if some estimate of σ is available or if δ is a multiple of σ. Table A.8 gives the sample sizes needed to control the values of α and β for various values of Δ = |δ| = |μ − μ0| σσ for both one- and two-tailed tests. In the case of the two-sample t-test in which the variances are unknown but assumed equal, we obtain the sample sizes n = n1 = n2 needed to control the values of α and β for various values of Δ= |δ| = |μ1 −μ2 −d0| σσ from Table A.9. Example 10.8: In comparing the performance of two catalysts on the effect of a reaction yield, a two-sample t-test is to be conducted with α = 0.05. The variances in the yields 354 Chapter 10 One- and Two-Sample Tests of Hypotheses 10.7 are considered to be the same for the two catalysts. How large a sample for each catalyst is needed to test the hypothesis H0: μ1 = μ2, H1: μ1 ̸= μ2 if it is essential to detect a difference of 0.8σ between the catalysts with probability 0.9? Solution : From Table A.9, with α = 0.05 for a two-tailed test, β = 0.1, and Δ = |0.8σ| = 0.8, σ we find the required sample size to be n = 34. In practical situations, it might be difficult to force a scientist or engineer to make a commitment on information from which a value of Δ can be found. The reader is reminded that the Δ-value quantifies the kind of difference between the means that the scientist considers important, that is, a difference considered significant from a scientific, not a statistical, point of view. Example 10.8 illustrates how this choice is often made, namely, by selecting a fraction of σ. Obviously, if the sample size is based on a choice of |δ| that is a small fraction of σ, the resulting sample size may be quite large compared to what the study allows. Graphical Methods for Comparing Means In Chapter 1, considerable attention was directed to displaying data in graphical form, such as stem-and-leaf plots and box-and-whisker plots. In Section 8.8, quan- tile plots and quantile-quantile normal plots were used to provide a “picture” to summarize a set of experimental data. Many computer software packages produce graphical displays. As we proceed to other forms of data analysis (e.g., regression analysis and analysis of variance), graphical methods become even more informa- tive. Graphical aids cannot be used as a replacement for the test procedure itself. 
Certainly, the value of the test statistic indicates the proper type of evidence in support of H0 or H1. However, a pictorial display provides a good illustration and is often a better communicator of evidence to the beneficiary of the analysis. Also, a picture will often clarify why a significant difference was found. Failure of an important assumption may be exposed by a summary type of graphical tool. For the comparison of means, side-by-side box-and-whisker plots provide a telling display. The reader should recall that these plots display the 25th per- centile, 75th percentile, and the median in a data set. In addition, the whiskers display the extremes in a data set. Consider Exercise 10.40 at the end of this sec- tion. Plasma ascorbic acid levels were measured in two groups of pregnant women, smokers and nonsmokers. Figure 10.16 shows the box-and-whisker plots for both groups of women. Two things are very apparent. Taking into account variability, there appears to be a negligible difference in the sample means. In addition, the variability in the two groups appears to be somewhat different. Of course, the analyst must keep in mind the rather sizable differences between the sample sizes in this case. 10.7 Graphical Methods for Comparing Means 355 1.5 1.0 0.5 0.0 0.8 0.7 0.6 0.5 0.4 0.3 Nonsmoker Smoker No Nitrogen Nitrogen Figure 10.16: Two box-and-whisker plots of Figure 10.17: Two box-and-whisker plots of plasma ascorbic acid in smokers and nonsmokers. seedling data. Consider Exercise 9.40 in Section 9.9. Figure 10.17 shows the multiple box- and-whisker plot for the data on 10 seedlings, half given nitrogen and half given no nitrogen. The display reveals a smaller variability for the group containing no nitrogen. In addition, the lack of overlap of the box plots suggests a significant difference between the mean stem weights for the two groups. It would appear that the presence of nitrogen increases the stem weights and perhaps increases the variability in the weights. There are no certain rules of thumb regarding when two box-and-whisker plots give evidence of significant difference between the means. However, a rough guide- line is that if the 25th percentile line for one sample exceeds the median line for the other sample, there is strong evidence of a difference between means. More emphasis is placed on graphical methods in a real-life case study presented later in this chapter. Annotated Computer Printout for Two-Sample t-Test Consider once again Exercise 9.40 on page 294, where seedling data under condi- tions of nitrogen and no nitrogen were collected. Test H0: μNIT = μNON, H1: μNIT > μNON,
where the population means indicate mean weights. Figure 10.18 is an annotated computer printout generated using the SAS package. Notice that sample standard deviation and standard error are shown for both samples. The t-statistics under the assumption of equal variance and unequal variance are both given. From the box- and-whisker plot of Figure 10.17 it would certainly appear that the equal variance assumption is violated. A P -value of 0.0229 suggests a conclusion of unequal means. This concurs with the diagnostic information given in Figure 10.18. Incidentally, notice that t and t′ are equal in this case, since n1 = n2.
Ascorbic Acid
Weight

356
Chapter 10 One- and Two-Sample Tests of Hypotheses
Exercises
No nitrogen
Nitrogen
Variances
Equal
Unequal
Variable
Weight
DF t Value
18 2.62
11.7 2.62
Pr > |t|
0.0174
0.0229
10.19 In a research report, Richard H. Weindruch of the UCLA Medical School claims that mice with an average life span of 32 months will live to be about 40 months old when 40% of the calories in their diet are replaced by vitamins and protein. Is there any reason to believe that μ < 40 if 64 mice that are placed on this diet have an average life of 38 months with a stan- dard deviation of 5.8 months? Use a P-value in your conclusion. 10.20 A random sample of 64 bags of white ched- dar popcorn weighed, on average, 5.23 ounces with a standard deviation of 0.24 ounce. Test the hypothesis that μ = 5.5 ounces against the alternative hypothesis, μ < 5.5 ounces, at the 0.05 level of significance. 10.21 An electrical firm manufactures light bulbs that have a lifetime that is approximately normally distributed with a mean of 800 hours and a standard deviation of 40 hours. Test the hypothesis that μ = 800 hours against the alternative, μ ̸= 800 hours, if a ran- dom sample of 30 bulbs has an average life of 788 hours. Use a P-value in your answer. 10.22 In the American Heart Association journal Hy- pertension, researchers report that individuals who practice Transcendental Meditation (TM) lower their blood pressure significantly. If a random sample of 225 male TM practitioners meditate for 8.5 hours per week with a standard deviation of 2.25 hours, does that sug- gest that, on average, men who use TM meditate more than 8 hours per week? Quote a P-value in your con- clusion. 10.23 Test the hypothesis that the average content of containers of a particular lubricant is 10 liters if the contents of a random sample of 10 containers are 10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and 9.8 liters. Use a 0.01 level of significance and assume that the distribution of contents is normal. 10.24 The average height of females in the freshman class of a certain college has historically been 162.5 cen- timeters with a standard deviation of 6.9 centimeters. Is there reason to believe that there has been a change in the average height if a random sample of 50 females in the present freshman class has an average height of 165.2 centimeters? Use a P-value in your conclusion. Assume the standard deviation remains the same. 10.25 It is claimed that automobiles are driven on average more than 20,000 kilometers per year. To test this claim, 100 randomly selected automobile owners are asked to keep a record of the kilometers they travel. Would you agree with this claim if the random sample showed an average of 23,500 kilometers and a standard deviation of 3900 kilometers? Use a P-value in your conclusion. 10.26 According to a dietary study, high sodium in- take may be related to ulcers, stomach cancer, and migraine headaches. The human requirement for salt is only 220 milligrams per day, which is surpassed in most single servings of ready-to-eat cereals. If a ran- dom sample of 20 similar servings of a certain cereal has a mean sodium content of 244 milligrams and a standard deviation of 24.5 milligrams, does this sug- gest at the 0.05 level of significance that the average sodium content for a single serving of such cereal is greater than 220 milligrams? Assume the distribution of sodium contents to be normal. // Variable Weight Mineral Std Dev Std Err 0.0728 0.0230 TTEST Procedure N Mean 10 0.3990 10 0.5650 0.1867 0.0591 Test the Equality of Variances Num DF Den DF F Value 9 9 6.58 Pr > F 0.0098
Figure 10.18: SAS printout for two-sample t-test.

Exercises
357
10.27 A study at the University of Colorado at Boul- der shows that running increases the percent resting metabolic rate (RMR) in older women. The average RMR of 30 elderly women runners was 34.0% higher than the average RMR of 30 sedentary elderly women, and the standard deviations were reported to be 10.5 and 10.2%, respectively. Was there a significant in- crease in RMR of the women runners over the seden- tary women? Assume the populations to be approxi- mately normally distributed with equal variances. Use a P-value in your conclusions.
10.28 According to Chemical Engineering, an impor- tant property of fiber is its water absorbency. The aver- age percent absorbency of 25 randomly selected pieces of cotton fiber was found to be 20 with a standard de- viation of 1.5. A random sample of 25 pieces of acetate yielded an average percent of 12 with a standard devi- ation of 1.25. Is there strong evidence that the popula- tion mean percent absorbency is significantly higher for cotton fiber than for acetate? Assume that the percent absorbency is approximately normally distributed and that the population variances in percent absorbency for the two fibers are the same. Use a significance level of 0.05.
10.29 Past experience indicates that the time re- quired for high school seniors to complete a standard- ized test is a normal random variable with a mean of 35 minutes. If a random sample of 20 high school seniors took an average of 33.1 minutes to complete this test with a standard deviation of 4.3 minutes, test the hy- pothesis, at the 0.05 level of significance, that μ = 35 minutes against the alternative that μ < 35 minutes. 10.30 A random sample of size n1 = 25, taken from a normal population with a standard deviation σ1 = 5.2, has a mean x ̄1 = 81. A second random sample of size n2 = 36, taken from a different normal population with a standard deviation σ2 = 3.4, has a mean x ̄2 = 76. Test the hypothesis that μ1 = μ2 against the alterna- tive, μ1 ̸= μ2. Quote a P-value in your conclusion. 10.31 A manufacturer claims that the average ten- sile strength of thread A exceeds the average tensile strength of thread B by at least 12 kilograms. To test this claim, 50 pieces of each type of thread were tested under similar conditions. Type A thread had an aver- age tensile strength of 86.7 kilograms with a standard deviation of 6.28 kilograms, while type B thread had an average tensile strength of 77.8 kilograms with a standard deviation of 5.61 kilograms. Test the manu- facturer’s claim using a 0.05 level of significance. 10.32 Amstat News (December 2004) lists median salaries for associate professors of statistics at research institutions and at liberal arts and other institutions in the United States. Assume that a sample of 200 associate professors from research institutions has an average salary of $70,750 per year with a standard de- viation of $6000. Assume also that a sample of 200 as- sociate professors from other types of institutions has an average salary of $65,200 with a standard deviation of $5000. Test the hypothesis that the mean salary for associate professors in research institutions is $2000 higher than for those in other institutions. Use a 0.01 level of significance. 10.33 A study was conducted to see if increasing the substrate concentration has an appreciable effect on the velocity of a chemical reaction. With a substrate concentration of 1.5 moles per liter, the reaction was run 15 times, with an average velocity of 7.5 micro- moles per 30 minutes and a standard deviation of 1.5. With a substrate concentration of 2.0 moles per liter, 12 runs were made, yielding an average velocity of 8.8 micromoles per 30 minutes and a sample standard de- viation of 1.2. Is there any reason to believe that this increase in substrate concentration causes an increase in the mean velocity of the reaction of more than 0.5 micromole per 30 minutes? Use a 0.01 level of signifi- cance and assume the populations to be approximately normally distributed with equal variances. 10.34 A study was made to determine if the subject matter in a physics course is better understood when a lab constitutes part of the course. Students were ran- domly selected to participate in either a 3-semester- hour course without labs or a 4-semester-hour course with labs. In the section with labs, 11 students made an average grade of 85 with a standard deviation of 4.7, and in the section without labs, 17 students made an average grade of 79 with a standard deviation of 6.1. Would you say that the laboratory course increases the average grade by as much as 8 points? Use a P -value in your conclusion and assume the populations to be ap- proximately normally distributed with equal variances. 
10.35 To find out whether a new serum will arrest leukemia, 9 mice, all with an advanced stage of the disease, are selected. Five mice receive the treatment and 4 do not. Survival times, in years, from the time the experiment commenced are as follows: Treatment 2.1 5.3 1.4 4.6 0.9 No Treatment 1.9 0.5 2.8 3.1 At the 0.05 level of significance, can the serum be said to be effective? Assume the two populations to be nor- mally distributed with equal variances. 10.36 Engineers at a large automobile manufactur- ing company are trying to decide whether to purchase brand A or brand B tires for the company’s new mod- els. To help them arrive at a decision, an experiment is conducted using 12 of each brand. The tires are run // 358 Chapter 10 One- and Two-Sample Tests of Hypotheses until they wear out. The results are as follows: blood samples, the following plasma ascorbic acid val- ues were determined, in milligrams per 100 milliliters: Plasma Ascorbic Acid Values Nonsmokers Smokers 0.97 1.16 0.48 0.72 0.86 0.71 1.00 0.85 0.98 0.81 0.58 0.68 0.62 0.57 1.18 1.32 0.64 1.36 1.24 0.98 0.78 0.99 1.09 1.64 0.90 0.92 0.74 0.78 0.88 1.24 0.94 1.18 Is there sufficient evidence to conclude that there is a difference between plasma ascorbic acid levels of smok- ers and nonsmokers? Assume that the two sets of data came from normal populations with unequal variances. Use a P -value. 10.41 A study was conducted by the Department of Zoology at Virginia Tech to determine if there is a significant difference in the density of organisms at two different stations located on Cedar Run, a sec- ondary stream in the Roanoke River drainage basin. Sewage from a sewage treatment plant and overflow from the Federal Mogul Corporation settling pond en- ter the stream near its headwaters. The following data give the density measurements, in number of organisms per square meter, at the two collecting stations: Number of Organisms per Square Meter Brand A : Brand B : x ̄1 = 37,900 kilometers, s1 = 5100 kilometers. x ̄1 = 39,800 kilometers, s2 = 5900 kilometers. // Test the hypothesis that there is no difference in the average wear of the two brands of tires. Assume the populations to be approximately normally distributed with equal variances. Use a P-value. 10.37 In Exercise 9.42 on page 295, test the hypoth- esis that the fuel economy of Volkswagen mini-trucks, on average, exceeds that of similarly equipped Toyota mini-trucks by 4 kilometers per liter. Use a 0.10 level of significance. 10.38 A UCLA researcher claims that the average life span of mice can be extended by as much as 8 months when the calories in their diet are reduced by approx- imately 40% from the time they are weaned. The re- stricted diets are enriched to normal levels by vitamins and protein. Suppose that a random sample of 10 mice is fed a normal diet and has an average life span of 32.1 months with a standard deviation of 3.2 months, while a random sample of 15 mice is fed the restricted diet and has an average life span of 37.6 months with a standard deviation of 2.8 months. Test the hypothesis, at the 0.05 level of significance, that the average life span of mice on this restricted diet is increased by 8 months against the alternative that the increase is less than 8 months. Assume the distributions of life spans for the regular and restricted diets are approximately normal with equal variances. 
5030 13,700 10,730 11,400 102 86 98 109 92 860 81 165 97 134 92 87 114 2200 Station 1 Station 2 10.39 The following data represent the running times of films produced by two motion-picture companies: 4980 2800 11,910 4670 8130 6890 26,850 7720 17,660 7030 22,800 7330 1130 2810 1330 3320 1230 2130 2190 Company 1 2 Time (minutes) Test the hypothesis that the average running time of 4250 films produced by company 2 exceeds the average run- 15,040 ning time of films produced by company 1 by 10 min- utes against the one-sided alternative that the differ- ence is less than 10 minutes. Use a 0.1 level of sig- nificance and assume the distributions of times to be approximately normal with unequal variances. 10.40 In a study conducted at Virginia Tech, the plasma ascorbic acid levels of pregnant women were compared for smokers versus nonsmokers. Thirty-two women in the last three months of pregnancy, free of major health disorders and ranging in age from 15 to 32 years, were selected for the study. Prior to the col- lection of 20 ml of blood, the participants were told to avoid breakfast, forgo their vitamin supplements, and avoid foods high in ascorbic acid content. From the 1690 Can we conclude, at the 0.05 level of significance, that the average densities at the two stations are equal? Assume that the observations come from normal pop- ulations with different variances. 10.42 Five samples of a ferrous-type substance were used to determine if there is a difference between a laboratory chemical analysis and an X-ray fluorescence analysis of the iron content. Each sample was split into two subsamples and the two types of analysis were ap- plied. Following are the coded data showing the iron content analysis: Exercises 359 Sample Analysis 1 2 3 4 Sorbic Acid Residuals in Ham // 5 Slice 1 2 3 4 5 6 7 8 Before Storage 224 270 400 444 590 660 1400 680 After Storage 116 96 239 329 437 597 689 576 X-ray Chemical 2.2 1.9 2.5 2.3 2.4 Assuming that the populations are normal, test at the 0.05 level of significance whether the two methods of analysis give, on the average, the same result. 10.43 According to published reports, practice un- der fatigued conditions distorts mechanisms that gov- ern performance. An experiment was conducted using 15 college males, who were trained to make a continu- ous horizontal right-to-left arm movement from a mi- croswitch to a barrier, knocking over the barrier co- incident with the arrival of a clock sweephand to the 6 o’clock position. The absolute value of the differ- ence between the time, in milliseconds, that it took to knock over the barrier and the time for the sweephand to reach the 6 o’clock position (500 msec) was recorded. Each participant performed the task five times under prefatigue and postfatigue conditions, and the sums of the absolute differences for the five performances were recorded. 2.0 2.0 2.3 2.1 2.4 Absolute Time Differences Prefatigue Postfatigue 158 91 92 59 65 215 98 226 33 223 89 91 148 92 58 177 142 134 117 116 74 153 66 219 109 143 57 164 85 100 Assuming the populations to be normally distributed, is there sufficient evidence, at the 0.05 level of signifi- cance, to say that the length of storage influences sorbic acid residual concentrations? 10.45 A taxi company manager is trying to decide whether the use of radial tires instead of regular belted tires improves fuel economy. Twelve cars were equipped with radial tires and driven over a prescribed test course. 
Without changing drivers, the same cars were then equipped with regular belted tires and driven once again over the test course. The gasoline consump- tion, in kilometers per liter, was recorded as follows: Kilometers per Liter Sub ject 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Car Radial Tires 1 4.2 2 4.7 3 6.6 4 7.0 5 6.7 6 4.5 7 5.7 8 6.0 9 7.4 10 4.9 11 6.1 12 5.2 Belted Tires 4.1 4.9 6.2 6.9 6.8 4.4 5.7 5.8 6.9 4.7 6.0 4.9 with radial tires An increase in the mean absolute time difference when the task is performed under postfatigue conditions would support the claim that practice under fatigued conditions distorts mechanisms that govern perfor- mance. Assuming the populations to be normally dis- tributed, test this claim. 10.44 In a study conducted by the Department of Human Nutrition and Foods at Virginia Tech, the fol- lowing data were recorded on sorbic acid residuals, in parts per million, in ham immediately after dipping in a sorbate solution and after 60 days of storage: Can we conclude that cars equipped give better fuel economy than those equipped with belted tires? Assume the populations to be normally distributed. Use a P -value in your conclusion. 10.46 In Review Exercise 9.91 on page 313, use the t- distribution to test the hypothesis that the diet reduces a woman’s weight by 4.5 kilograms on average against the alternative hypothesis that the mean difference in weight is less than 4.5 kilograms. Use a P -value. 10.47 How large a sample is required in Exercise 10.20 if the power of the test is to be 0.90 when the true mean is 5.20? Assume that σ = 0.24. 10.48 If the distribution of life spans in Exercise 10.19 is approximately normal, how large a sample is re- quired in order that the probability of committing a type II error be 0.1 when the true mean is 35.9 months? Assume that σ = 5.8 months. 360 Chapter 10 One- and Two-Sample Tests of Hypotheses 10.49 How large a sample is required in Exercise 10.24 if the power of the test is to be 0.95 when the true average height differs from 162.5 by 3.1 centime- ters? Use α = 0.02. 10.50 How large should the samples be in Exercise 10.31 if the power of the test is to be 0.95 when the true difference between thread types A and B is 8 kilo- grams? 10.51 How large a sample is required in Exercise 10.22 if the power of the test is to be 0.8 when the true mean meditation time exceeds the hypothesized value by 1.2σ? Use α = 0.05. 10.52 For testing H0: μ=14, H1: μ̸=14, an α = 0.05 level t-test is being considered. What sam- ple size is necessary in order for the probability to be 0.1 of falsely failing to reject H0 when the true popula- tion mean differs from 14 by 0.5? From a preliminary sample we estimate σ to be 1.25. 10.53 A study was conducted at the Department of Veterinary Medicine at Virginia Tech to determine if the “strength” of a wound from surgical incision is af- fected by the temperature of the knife. Eight dogs were used in the experiment. “Hot” and “cold” in- cisions were made on the abdomen of each dog, and the strength was measured. The resulting data appear below. Dog Knife Strength 1 Hot 5120 1 Cold 8200 2 Hot 10, 000 2 Cold 8600 3 Hot 10, 000 3 Cold 9200 4 Hot 10, 000 4 Cold 6200 Dog Knife Strength 5 Hot 10,000 5 Cold 10, 000 6 Hot 7900 6 Cold 5200 7 Hot 510 7 Cold 885 8 Hot 1020 8 Cold 460 (a) Write an appropriate hypothesis to determine if there is a significant difference in strength between the hot and cold incisions. (b) Test the hypothesis using a paired t-test. Use a P-value in your conclusion. 
10.54 Nine subjects were used in an experiment to determine if exposure to carbon monoxide has an impact on breathing capability. The data were collected by personnel in the Health and Physical Education Department at Virginia Tech and were analyzed in the Statistics Consulting Center at Hokie Land. The subjects were exposed to breathing chambers, one of which contained a high concentration of CO. Breathing frequency measures were made for each subject for each chamber. The subjects were exposed to the breathing chambers in random sequence. The data give the breathing frequency, in number of breaths taken per minute. Make a one-sided test of the hypothesis that mean breathing frequency is the same for the two environments. Use α = 0.05. Assume that breathing frequency is approximately normal.

Subject   With CO   Without CO
   1         30         30
   2         45         40
   3         26         25
   4         25         23
   5         34         30
   6         51         49
   7         46         41
   8         32         35
   9         30         28

10.8 One Sample: Test on a Single Proportion

Tests of hypotheses concerning proportions are required in many areas. Politicians are certainly interested in knowing what fraction of the voters will favor them in the next election. All manufacturing firms are concerned about the proportion of defective items when a shipment is made. Gamblers depend on a knowledge of the proportion of outcomes that they consider favorable. We shall consider the problem of testing the hypothesis that the proportion of successes in a binomial experiment equals some specified value. That is, we are testing the null hypothesis H0 that p = p0, where p is the parameter of the binomial distribution. The alternative hypothesis may be one of the usual one-sided or two-sided alternatives: p < p0, p > p0, or p ≠ p0.
The appropriate random variable on which we base our decision criterion is the binomial random variable X, although we could just as well use the statistic p̂ = X/n. Values of X that are far from the mean μ = np0 will lead to the rejection of the null hypothesis. Because X is a discrete binomial variable, it is unlikely that a critical region can be established whose size is exactly equal to a prespecified value of α. For this reason it is preferable, in dealing with small samples, to base our decisions on P-values. To test the hypothesis
H0: p = p0,
H1: p < p0,
at the α-level of significance, we compute
P = P(X ≤ x when p = p0)
and reject H0 in favor of H1 if this P-value is less than or equal to α. Similarly, to test the hypothesis
H0: p = p0,
H1: p > p0,
at the α-level of significance, we compute
P = P(X ≥ x when p = p0)
and reject H0 in favor of H1 if this P-value is less than or equal to α. Finally, to test the hypothesis
H0: p = p0,
H1: p ≠ p0,
at the α-level of significance, we compute
P = 2P(X ≤ x when p = p0)   if x < np0
or
P = 2P(X ≥ x when p = p0)   if x > np0
and reject H0 in favor of H1 if the computed P-value is less than or equal to α.
The steps for testing a null hypothesis about a proportion against various alternatives using the binomial probabilities of Table A.1 are as follows:

Testing a Proportion (Small Samples)
1. H0: p = p0.
2. One of the alternatives H1: p < p0, p > p0, or p ≠ p0.
3. Choose a level of significance equal to α.
4. Test statistic: Binomial variable X with p = p0.
5. Computations: Find x, the number of successes, and compute the appropriate P-value.
6. Decision: Draw appropriate conclusions based on the P-value.

Example 10.9:
A builder claims that heat pumps are installed in 70% of all homes being constructed today in the city of Richmond, Virginia. Would you agree with this claim if a random survey of new homes in this city showed that 8 out of 15 had heat pumps installed? Use a 0.10 level of significance.
Solution:
1. H0: p = 0.7.
2. H1: p ≠ 0.7.
3. α = 0.10.
4. Test statistic: Binomial variable X with p = 0.7 and n = 15.
5. Computations: x = 8 and np0 = (15)(0.7) = 10.5. Therefore, from Table A.1, the computed P-value is
\[ P = 2P(X \le 8 \text{ when } p = 0.7) = 2\sum_{x=0}^{8} b(x; 15, 0.7) = 0.2622 > 0.10. \]
6. Decision: Do not reject H0. Conclude that there is insufficient reason to doubt the builder's claim.
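The small-sample procedure lends itself to direct computation. Below is a minimal sketch in Python (assuming SciPy is available; the text itself uses Table A.1) that reproduces the P-value of Example 10.9 by doubling the lower binomial tail, exactly as in step 5:

```python
from scipy.stats import binom

n, p0, x = 15, 0.7, 8            # survey size, hypothesized proportion, successes

# x = 8 lies below np0 = 10.5, so double the lower-tail probability
p_value = 2 * binom.cdf(x, n, p0)
print(round(p_value, 4))         # 0.2622, matching the Table A.1 calculation
```

Since 0.2622 > 0.10, H0 is not rejected. (SciPy's binomtest uses a different two-sided convention, so its P-value need not equal this doubled-tail value.)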
In Section 5.2, we saw that binomial probabilities can be obtained from the actual binomial formula or from Table A.1 when n is small. For large n, approximation procedures are required. When the hypothesized value p0 is very close to 0 or 1, the Poisson distribution, with parameter μ = np0, may be used. However, the normal curve approximation, with parameters μ = np0 and σ² = np0q0, is usually preferred for large n and is very accurate as long as p0 is not extremely close to 0 or to 1. If we use the normal approximation, the z-value for testing p = p0 is given by
\[ z = \frac{x - np_0}{\sqrt{np_0 q_0}} = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}, \]
which is a value of the standard normal variable Z. Hence, for a two-tailed test at the α-level of significance, the critical region is z < −zα/2 or z > zα/2. For the one-sided alternative p < p0, the critical region is z < −zα, and for the alternative p > p0, the critical region is z > zα.
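Purely for comparison, here is a sketch of the normal-approximation version applied to the heat-pump data of Example 10.9, where n = 15 is really too small for the approximation to be trusted:

```python
from math import sqrt
from scipy.stats import norm

n, p0, x = 15, 0.7, 8
q0 = 1 - p0

z = (x - n * p0) / sqrt(n * p0 * q0)    # z = (x - np0)/sqrt(np0 q0)
p_value = 2 * norm.sf(abs(z))           # two-sided tail area
print(round(z, 2), round(p_value, 3))   # about -1.41 and 0.159
```

The exact doubled-tail P-value was 0.2622, a reminder of why the approximation is reserved for large n with p0 not extremely close to 0 or 1.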
Example 10.10: A commonly prescribed drug for relieving nervous tension is believed to be only 60% effective. Experimental results with a new drug administered to a random sample of 100 adults who were suffering from nervous tension show that 70 received relief. Is this sufficient evidence to conclude that the new drug is superior to the one commonly prescribed? Use a 0.05 level of significance.
Solution:
1. H0: p = 0.6.
2. H1: p > 0.6.
3. α = 0.05.
4. Critical region: z > 1.645.

5. Computations: x = 70, n = 100, p̂ = 70/100 = 0.7, and
\[ z = \frac{0.7 - 0.6}{\sqrt{(0.6)(0.4)/100}} = 2.04, \qquad P = P(Z > 2.04) = 0.0207. \]
6. Decision: Reject H0 and conclude that the new drug is superior.

10.9 Two Samples: Tests on Two Proportions

Situations often arise where we wish to test the hypothesis that two proportions are equal. For example, we might want to show evidence that the proportion of doctors who are pediatricians in one state is equal to the proportion in another state. A person may decide to give up smoking only if he or she is convinced that the proportion of smokers with lung cancer exceeds the proportion of nonsmokers with lung cancer.
In general, we wish to test the null hypothesis that two proportions, or binomial parameters, are equal. That is, we are testing p1 = p2 against one of the alternatives p1 < p2, p1 > p2, or p1 ≠ p2. Of course, this is equivalent to testing the null hypothesis that p1 − p2 = 0 against one of the alternatives p1 − p2 < 0, p1 − p2 > 0, or p1 − p2 ≠ 0. The statistic on which we base our decision is the random variable P̂1 − P̂2. Independent samples of sizes n1 and n2 are selected at random from two binomial populations and the proportions of successes P̂1 and P̂2 for the two samples are computed.
In our construction of confidence intervals for p1 and p2 we noted, for n1 and n2 sufficiently large, that the point estimator P̂1 − P̂2 was approximately normally distributed with mean
\[ \mu_{\hat{P}_1 - \hat{P}_2} = p_1 - p_2 \]
and variance
\[ \sigma^2_{\hat{P}_1 - \hat{P}_2} = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}. \]
Therefore, our critical region(s) can be established by using the standard normal variable
\[ Z = \frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)}{\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}}. \]
When H0 is true, we can substitute p1 = p2 = p and q1 = q2 = q (where p and q are the common values) in the preceding formula for Z to give the form
\[ Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{pq(1/n_1 + 1/n_2)}}. \]
To compute a value of Z, however, we must estimate the parameters p and q that appear in the radical. Upon pooling the data from both samples, the pooled estimate of the proportion p is
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \]
where x1 and x2 are the numbers of successes in each of the two samples. Substituting p̂ for p and q̂ = 1 − p̂ for q, the z-value for testing p1 = p2 is determined from the formula
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}}. \]
The critical regions for the appropriate alternative hypotheses are set up as before, using critical points of the standard normal curve. Hence, for the alternative p1 ̸= p2 at the α-level of significance, the critical region is z < −zα/2 or z > zα/2. For a test where the alternative is p1 < p2, the critical region is z < −zα, and when the alternative is p1 > p2, the critical region is z > zα.
Example 10.11: A vote is to be taken among the residents of a town and the surrounding county to determine whether a proposed chemical plant should be constructed. The construction site is within the town limits, and for this reason many voters in the county believe that the proposal will pass because of the large proportion of town voters who favor the construction. To determine if there is a significant difference in the proportions of town voters and county voters favoring the proposal, a poll is taken. If 120 of 200 town voters favor the proposal and 240 of 500 county residents favor it, would you agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters? Use an α = 0.05 level of significance.
Solution : Let p1 and p2 be the true proportions of voters in the town and county, respectively, favoring the proposal.
1. H0: p1 = p2.
2. H1: p1 > p2.
3. α = 0.05.
4. Critical region: z > 1.645.
5. Computations:
\[ \hat{p}_1 = \frac{x_1}{n_1} = \frac{120}{200} = 0.60, \qquad \hat{p}_2 = \frac{x_2}{n_2} = \frac{240}{500} = 0.48, \]
and
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{120 + 240}{200 + 500} = 0.51. \]
Therefore,
\[ z = \frac{0.60 - 0.48}{\sqrt{(0.51)(0.49)(1/200 + 1/500)}} = 2.9, \qquad P = P(Z > 2.9) = 0.0019. \]
6. Decision: Reject H0 and agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters.
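A minimal sketch of the pooled two-proportion z-test in Python (SciPy assumed), using the vote counts of Example 10.11:

```python
from math import sqrt
from scipy.stats import norm

x1, n1 = 120, 200          # town voters favoring the proposal
x2, n2 = 240, 500          # county voters favoring the proposal

p1_hat, p2_hat = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)           # pooled estimate, 0.51
q_hat = 1 - p_hat

z = (p1_hat - p2_hat) / sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
p_value = norm.sf(z)                    # upper tail for H1: p1 > p2
print(round(z, 2), round(p_value, 4))   # about 2.87 and 0.0021
```

Carrying full precision gives z ≈ 2.87 rather than the hand-rounded 2.9 above, hence a slightly different P-value; the conclusion is unchanged.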

Exercises
10.55 A marketing expert for a pasta-making com- pany believes that 40% of pasta lovers prefer lasagna. If 9 out of 20 pasta lovers choose lasagna over other pas- tas, what can be concluded about the expert’s claim? Use a 0.05 level of significance.
10.56 Suppose that, in the past, 40% of all adults favored capital punishment. Do we have reason to believe that the proportion of adults favoring capital punishment has increased if, in a random sample of 15 adults, 8 favor capital punishment? Use a 0.05 level of significance.
10.57 A new radar device is being considered for a certain missile defense system. The system is checked by experimenting with aircraft in which a kill or a no kill is simulated. If, in 300 trials, 250 kills occur, accept or reject, at the 0.04 level of significance, the claim that the probability of a kill with the new system does not exceed the 0.8 probability of the existing device.
10.58 It is believed that at least 60% of the residents in a certain area favor an annexation suit by a neigh- boring city. What conclusion would you draw if only 110 in a sample of 200 voters favored the suit? Use a 0.05 level of significance.
10.59 A fuel oil company claims that one-fifth of the homes in a certain city are heated by oil. Do we have reason to believe that fewer than one-fifth are heated by oil if, in a random sample of 1000 homes in this city, 136 are heated by oil? Use a P-value in your conclu- sion.
10.60 At a certain college, it is estimated that at most 25% of the students ride bicycles to class. Does this seem to be a valid estimate if, in a random sample of 90 college students, 28 are found to ride bicycles to class? Use a 0.05 level of significance.
10.61 In a winter of an epidemic flu, the parents of 2000 babies were surveyed by researchers at a well- known pharmaceutical company to determine if the company’s new medicine was effective after two days. Among 120 babies who had the flu and were given the medicine, 29 were cured within two days. Among 280 babies who had the flu but were not given the medicine, 56 recovered within two days. Is there any significant indication that supports the company’s claim of the effectiveness of the medicine?
10.62 In a controlled laboratory experiment, scien- tists at the University of Minnesota discovered that 25% of a certain strain of rats subjected to a 20% coffee bean diet and then force-fed a powerful cancer-causing chemical later developed cancerous tumors. Would we have reason to believe that the proportion of rats devel- oping tumors when subjected to this diet has increased if the experiment were repeated and 16 of 48 rats de- veloped tumors? Use a 0.05 level of significance.
10.63 In a study to estimate the proportion of resi- dents in a certain city and its suburbs who favor the construction of a nuclear power plant, it is found that 63 of 100 urban residents favor the construction while only 59 of 125 suburban residents are in favor. Is there a significant difference between the proportions of ur- ban and suburban residents who favor construction of the nuclear plant? Make use of a P -value.
10.64 In a study on the fertility of married women conducted by Martin O’Connell and Carolyn C. Rogers for the Census Bureau in 1979, two groups of childless wives aged 25 to 29 were selected at random, and each was asked if she eventually planned to have a child. One group was selected from among wives married less than two years and the other from among wives married five years. Suppose that 240 of the 300 wives married less than two years planned to have children some day compared to 288 of the 400 wives married five years. Can we conclude that the proportion of wives married less than two years who planned to have children is significantly higher than the proportion of wives married five years? Make use of a P -value.
10.65 An urban community would like to show that the incidence of breast cancer is higher in their area than in a nearby rural area. (PCB levels were found to be higher in the soil of the urban community.) If it is found that 20 of 200 adult women in the urban com- munity have breast cancer and 10 of 150 adult women in the rural community have breast cancer, can we con- clude at the 0.05 level of significance that breast cancer is more prevalent in the urban community?
10.66 Group Project: The class should be divided into pairs of students for this project. Suppose it is conjectured that at least 25% of students at your uni- versity exercise for more than two hours a week. Col- lect data from a random sample of 50 students. Ask each student if he or she works out for at least two hours per week. Then do the computations that allow either rejection or nonrejection of the above conjecture. Show all work and quote a P -value in your conclusion.

10.10 One- and Two-Sample Tests Concerning Variances
In this section, we are concerned with testing hypotheses concerning population variances or standard deviations. Applications of one- and two-sample tests on variances are certainly not difficult to motivate. Engineers and scientists are con- fronted with studies in which they are required to demonstrate that measurements involving products or processes adhere to specifications set by consumers. The specifications are often met if the process variance is sufficiently small. Attention is also focused on comparative experiments between methods or processes, where inherent reproducibility or variability must formally be compared. In addition, to determine if the equal variance assumption is violated, a test comparing two variances is often applied prior to conducting a t-test on two means.
Let us first consider the problem of testing the null hypothesis H0 that the population variance σ² equals a specified value σ₀² against one of the usual alternatives σ² < σ₀², σ² > σ₀², or σ² ≠ σ₀². The appropriate statistic on which to base our decision is the chi-squared statistic of Theorem 8.4, which was used in Chapter 9 to construct a confidence interval for σ². Therefore, if we assume that the distribution of the population being sampled is normal, the chi-squared value for testing σ² = σ₀² is given by
\[ \chi^2 = \frac{(n-1)s^2}{\sigma_0^2}, \]
where n is the sample size, s² is the sample variance, and σ₀² is the value of σ² given by the null hypothesis. If H0 is true, χ² is a value of the chi-squared distribution with v = n − 1 degrees of freedom. Hence, for a two-tailed test at the α-level of significance, the critical region is χ² < χ²_{1−α/2} or χ² > χ²_{α/2}. For the one-sided alternative σ² < σ₀², the critical region is χ² < χ²_{1−α}, and for the one-sided alternative σ² > σ₀², the critical region is χ² > χ²_{α}.
Robustness of χ2-Test to Assumption of Normality
The reader may have discerned that various tests depend, at least theoretically, on the assumption of normality. In general, many procedures in applied statistics have theoretical underpinnings that depend on the normal distribution. These procedures vary in the degree of their dependency on the assumption of normality. A procedure that is reasonably insensitive to the assumption is called a robust procedure (i.e., robust to normality). The χ²-test on a single variance is very nonrobust to normality (i.e., the practical success of the procedure depends on normality). As a result, the P-value computed may be appreciably different from the actual P-value if the population sampled is not normal. Indeed, it is quite feasible that a statistically significant P-value may not truly signal H1: σ ≠ σ0; rather, a significant value may be a result of the violation of the normality assumptions. Therefore, the analyst should approach the use of this particular χ²-test with caution.
Example 10.12: A manufacturer of car batteries claims that the life of the company’s batteries is approximately normally distributed with a standard deviation equal to 0.9 year.

If a random sample of 10 of these batteries has a standard deviation of 1.2 years, do you think that σ > 0.9 year? Use a 0.05 level of significance.
Solution:
1. H0: σ² = 0.81.
2. H1: σ² > 0.81.
3. α = 0.05.
4. Critical region: From Figure 10.19 we see that the null hypothesis is rejected when χ² > 16.919, where χ² = (n − 1)s²/σ₀², with v = 9 degrees of freedom.
Figure 10.19: Critical region for the alternative hypothesis σ > 0.9.

5. Computations: s² = 1.44, n = 10, and
\[ \chi^2 = \frac{(9)(1.44)}{0.81} = 16.0, \qquad P \approx 0.07. \]
6. Decision: The χ²-statistic is not significant at the 0.05 level. However, based on the P-value 0.07, there is evidence that σ > 0.9.
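A quick check of Example 10.12 in Python (SciPy assumed), using the upper-tail chi-squared probability for the one-sided alternative:

```python
from scipy.stats import chi2

n, s2, sigma0_sq = 10, 1.2 ** 2, 0.9 ** 2

chi2_stat = (n - 1) * s2 / sigma0_sq            # (9)(1.44)/0.81 = 16.0
p_value = chi2.sf(chi2_stat, df=n - 1)          # upper tail with v = 9
print(round(chi2_stat, 1), round(p_value, 3))   # 16.0 and about 0.067
```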
Now let us consider the problem of testing the equality of the variances σ₁² and σ₂² of two populations. That is, we shall test the null hypothesis H0 that σ₁² = σ₂² against one of the usual alternatives
σ₁² < σ₂²,   σ₁² > σ₂²,   or   σ₁² ≠ σ₂².
For independent random samples of sizes n1 and n2, respectively, from the two populations, the f-value for testing σ₁² = σ₂² is the ratio
\[ f = \frac{s_1^2}{s_2^2}, \]
where s₁² and s₂² are the variances computed from the two samples. If the two populations are approximately normally distributed and the null hypothesis is true, according to Theorem 8.8 the ratio f = s₁²/s₂² is a value of the F-distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom. Therefore, the critical regions

of size α corresponding to the one-sided alternatives σ₁² < σ₂² and σ₁² > σ₂² are, respectively, f < f_{1−α}(v1, v2) and f > f_{α}(v1, v2). For the two-sided alternative σ₁² ≠ σ₂², the critical region is f < f_{1−α/2}(v1, v2) or f > f_{α/2}(v1, v2).
Example 10.13: In testing for the difference in the abrasive wear of the two materials in Example 10.6, we assumed that the two unknown population variances were equal. Were we justified in making this assumption? Use a 0.10 level of significance.
Solution: Let σ₁² and σ₂² be the population variances for the abrasive wear of material 1 and material 2, respectively.
1. H0: σ₁² = σ₂².
2. H1: σ₁² ≠ σ₂².
3. α = 0.10.
4. Critical region: From Figure 10.20, we see that f_{0.05}(11, 9) = 3.11, and, by using Theorem 8.7, we find
\[ f_{0.95}(11, 9) = \frac{1}{f_{0.05}(9, 11)} = 0.34. \]
Therefore, the null hypothesis is rejected when f < 0.34 or f > 3.11, where f = s₁²/s₂² with v1 = 11 and v2 = 9 degrees of freedom.
5. Computations: s₁² = 16, s₂² = 25, and hence f = 16/25 = 0.64.
6. Decision: Do not reject H0. Conclude that there is insufficient evidence that the variances differ.
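The same F-ratio check can be scripted, recovering both critical values of Example 10.13 directly instead of from Table A.6 (a minimal sketch, SciPy assumed):

```python
from scipy.stats import f

s1_sq, s2_sq = 16, 25
v1, v2 = 11, 9

f_stat = s1_sq / s2_sq            # 0.64
lower = f.ppf(0.05, v1, v2)       # the book's f_{0.95}(11, 9), about 0.34
upper = f.ppf(0.95, v1, v2)       # the book's f_{0.05}(11, 9), about 3.1 (Table A.6: 3.11)
print(f_stat, round(lower, 2), round(upper, 2))
```

Since 0.34 < 0.64 < 3.11, H0 is not rejected, in agreement with the hand calculation.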
Figure 10.20: Critical region for the alternative hypothesis σ₁² ≠ σ₂².

F-Test for Testing Variances in SAS
Figure 10.18 on page 356 displays the printout of a two-sample t-test where two means from the seedling data in Exercise 9.40 were compared. Box-and-whisker plots in Figure 10.17 on page 355 suggest that variances are not homogeneous, and thus the t′-statistic and its corresponding P-value are relevant. Note also that

the printout displays the F-statistic for H0: σ1 = σ2 with a P-value of 0.0098, additional evidence that more variability is to be expected when nitrogen is used than under the no-nitrogen condition.
Exercises
10.67 The content of containers of a particular lubricant is known to be normally distributed with a variance of 0.03 liter. Test the hypothesis that σ² = 0.03 against the alternative that σ² ≠ 0.03 for the random sample of 10 containers in Exercise 10.23 on page 356. Use a P-value in your conclusion.
10.68 Past experience indicates that the time required for high school seniors to complete a standardized test is a normal random variable with a standard deviation of 6 minutes. Test the hypothesis that σ = 6 against the alternative that σ < 6 if a random sample of the test times of 20 high school seniors has a standard deviation s = 4.51. Use a 0.05 level of significance.

10.69 Aflatoxins produced by mold on peanut crops in Virginia must be monitored. A sample of 64 batches of peanuts reveals levels of 24.17 ppm, on average, with a variance of 4.25 ppm. Test the hypothesis that σ² = 4.2 ppm against the alternative that σ² ≠ 4.2 ppm. Use a P-value in your conclusion.

10.70 Past data indicate that the amount of money contributed by the working residents of a large city to a volunteer rescue squad is a normal random variable with a standard deviation of $1.40. It has been suggested that the contributions to the rescue squad from just the employees of the sanitation department are much more variable. If the contributions of a random sample of 12 employees from the sanitation department have a standard deviation of $1.75, can we conclude at the 0.01 level of significance that the standard deviation of the contributions of all sanitation workers is greater than that of all workers living in the city?

10.71 A soft-drink dispensing machine is said to be out of control if the variance of the contents exceeds 1.15 deciliters. If a random sample of 25 drinks from this machine has a variance of 2.03 deciliters, does this indicate at the 0.05 level of significance that the machine is out of control? Assume that the contents are approximately normally distributed.

10.72 Large-Sample Test of σ² = σ₀²: When n ≥ 30, we can test the null hypothesis that σ² = σ₀², or σ = σ₀, by computing
\[ z = \frac{s - \sigma_0}{\sigma_0/\sqrt{2n}}, \]
which is a value of a random variable whose sampling distribution is approximately the standard normal distribution.
(a) With reference to Example 10.4, test at the 0.05 level of significance whether σ = 10.0 years against the alternative that σ ≠ 10.0 years.
(b) It is suspected that the variance of the distribution of distances in kilometers traveled on 5 liters of fuel by a new automobile model equipped with a diesel engine is less than the variance of the distribution of distances traveled by the same model equipped with a six-cylinder gasoline engine, which is known to be σ² = 6.25. If 72 test runs of the diesel model have a variance of 4.41, can we conclude at the 0.05 level of significance that the variance of the distances traveled by the diesel model is less than that of the gasoline model?

10.73 A study is conducted to compare the lengths of time required by men and women to assemble a certain product. Past experience indicates that the distribution of times for both men and women is approximately normal but the variance of the times for women is less than that for men. A random sample of times for 11 men and 14 women produced the following data:

Men:    n1 = 11,  s1 = 6.1
Women:  n2 = 14,  s2 = 5.3

Test the hypothesis that σ₁² = σ₂² against the alternative that σ₁² > σ₂². Use a P-value in your conclusion.
10.74 For Exercise 10.41 on page 358, test the hypothesis at the 0.05 level of significance that σ₁² = σ₂² against the alternative that σ₁² ≠ σ₂², where σ₁² and σ₂² are the variances of the number of organisms per square meter of water at the two different locations on Cedar Run.
10.75 With reference to Exercise 10.39 on page 358, test the hypothesis that σ₁² = σ₂² against the alternative that σ₁² ≠ σ₂², where σ₁² and σ₂² are the variances for the running times of films produced by company 1 and company 2, respectively. Use a P-value.
10.76 Two types of instruments for measuring the amount of sulfur monoxide in the atmosphere are being compared in an air-pollution experiment. Researchers
wish to determine whether the two types of instruments yield measurements having the same variability. The readings in the following table were recorded for the two instruments.
Sulfur Monoxide
Instrument A: 0.86  0.82  0.75  0.61  0.89  0.64  0.81  0.68  0.65
Instrument B: 0.87  0.74  0.63  0.55  0.76  0.70  0.69  0.57  0.53

Assuming the populations of measurements to be approximately normally distributed, test the hypothesis that σA = σB against the alternative that σA ≠ σB. Use a P-value.

10.77 An experiment was conducted to compare the alcohol content of soy sauce on two different production lines. Production was monitored eight times a day. The data are shown here.

Production line 1: 0.48  0.39  0.42  0.52  0.40  0.48  0.52  0.52
Production line 2: 0.38  0.37  0.39  0.41  0.38  0.39  0.40  0.39

Assume both populations are normal. It is suspected that production line 1 is not producing as consistently as production line 2 in terms of alcohol content. Test the hypothesis that σ1 = σ2 against the alternative that σ1 ≠ σ2. Use a P-value.

10.78 Hydrocarbon emissions from cars are known to have decreased dramatically during the 1980s. A study was conducted to compare the hydrocarbon emissions at idling speed, in parts per million (ppm), for automobiles from 1980 and 1990. Twenty cars of each model year were randomly selected, and their hydrocarbon emission levels were recorded. The data are as follows:

1980 models: 141 359 247 940 882 494 306 210 105 880 200 223 188 940 241 190 300 435 241 380
1990 models: 140 160  20  20 223  60  20  95 360  70 220 400 217  58 235 380 200 175  85  65

Test the hypothesis that σ1 = σ2 against the alternative that σ1 ≠ σ2. Assume both populations are normal. Use a P-value.
10.11 Goodness-of-Fit Test
Throughout this chapter, we have been concerned with the testing of statistical hypotheses about single population parameters such as μ, σ2, and p. Now we shall consider a test to determine if a population has a specified theoretical distribution. The test is based on how good a fit we have between the frequency of occurrence of observations in an observed sample and the expected frequencies obtained from the hypothesized distribution.
To illustrate, we consider the tossing of a die. We hypothesize that the die is honest, which is equivalent to testing the hypothesis that the distribution of outcomes is the discrete uniform distribution
f(x) = 1/6,   x = 1, 2, ..., 6.
Suppose that the die is tossed 120 times and each outcome is recorded. Theoret- ically, if the die is balanced, we would expect each face to occur 20 times. The results are given in Table 10.4.
Table 10.4: Observed and Expected Frequencies of 120 Tosses of a Die

Face:       1    2    3    4    5    6
Observed:  20   22   17   18   19   24
Expected:  20   20   20   20   20   20

Goodness-of-Fit Test
By comparing the observed frequencies with the corresponding expected fre- quencies, we must decide whether these discrepancies are likely to occur as a result of sampling fluctuations and the die is balanced or whether the die is not honest and the distribution of outcomes is not uniform. It is common practice to refer to each possible outcome of an experiment as a cell. In our illustration, we have 6 cells. The appropriate statistic on which we base our decision criterion for an experiment involving k cells is defined by the following.
A goodness-of-fit test between observed and expected frequencies is based on the quantity
\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}, \]
where χ² is a value of a random variable whose sampling distribution is approximated very closely by the chi-squared distribution with v = k − 1 degrees of freedom. The symbols oi and ei represent the observed and expected frequencies, respectively, for the ith cell.
The number of degrees of freedom associated with the chi-squared distribution used here is equal to k − 1, since there are only k − 1 freely determined cell fre- quencies. That is, once k − 1 cell frequencies are determined, so is the frequency for the kth cell.
If the observed frequencies are close to the corresponding expected frequencies, the χ2-value will be small, indicating a good fit. If the observed frequencies differ considerably from the expected frequencies, the χ2-value will be large and the fit is poor. A good fit leads to the acceptance of H0, whereas a poor fit leads to its rejection. The critical region will, therefore, fall in the right tail of the chi-squared distribution. For a level of significance equal to α, we find the critical value χ2α from Table A.5, and then χ2 > χ2α constitutes the critical region. The decision criterion described here should not be used unless each of the expected frequencies is at least equal to 5. This restriction may require the combining of adjacent cells, resulting in a reduction in the number of degrees of freedom.
From Table 10.4, we find the χ²-value to be
\[ \chi^2 = \frac{(20-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(17-20)^2}{20} + \frac{(18-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(24-20)^2}{20} = 1.7. \]
Using Table A.5, we find χ²_{0.05} = 11.070 for v = 5 degrees of freedom. Since 1.7 is less than the critical value, we fail to reject H0. We conclude that there is insufficient evidence that the die is not balanced.
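For the die data of Table 10.4, the whole test is one call in Python (SciPy assumed); scipy.stats.chisquare defaults to equal expected frequencies, which is exactly the uniform hypothesis here:

```python
from scipy.stats import chisquare

observed = [20, 22, 17, 18, 19, 24]     # Table 10.4, 120 tosses of the die
result = chisquare(observed)            # expected frequencies default to 20 each
print(round(result.statistic, 2), round(result.pvalue, 3))   # 1.7 and about 0.889
```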
As a second illustration, let us test the hypothesis that the frequency distribution of battery lives given in Table 1.7 on page 23 may be approximated by a normal distribution with mean μ = 3.5 and standard deviation σ = 0.7. The expected frequencies for the 7 classes (cells), listed in Table 10.5, are obtained by computing the areas under the hypothesized normal curve that fall between the various class boundaries.

Table 10.5: Observed and Expected Frequencies of Battery Lives, Assuming Normality

Class Boundaries     oi          ei
1.45–1.95             2 ⎫        0.5 ⎫
1.95–2.45             1 ⎬ 7      2.1 ⎬ 8.5
2.45–2.95             4 ⎭        5.9 ⎭
2.95–3.45            15         10.3
3.45–3.95            10         10.7
3.95–4.45             5 ⎫ 8      7.0 ⎫ 10.5
4.45–4.95             3 ⎭        3.5 ⎭
For example, the z-values corresponding to the boundaries of the fourth class are
\[ z_1 = \frac{2.95 - 3.5}{0.7} = -0.79 \quad \text{and} \quad z_2 = \frac{3.45 - 3.5}{0.7} = -0.07. \]
From Table A.3 we find the area between z1 = −0.79 and z2 = −0.07 to be
area = P(−0.79 < Z < −0.07) = P(Z < −0.07) − P(Z < −0.79) = 0.4721 − 0.2148 = 0.2573.
Hence, the expected frequency for the fourth class is
e4 = (0.2573)(40) = 10.3.
It is customary to round these frequencies to one decimal.
The expected frequency for the first class interval is obtained by using the total area under the normal curve to the left of the boundary 1.95. For the last class interval, we use the total area to the right of the boundary 4.45. All other expected frequencies are determined by the method described for the fourth class. Note that we have combined adjacent classes in Table 10.5 where the expected frequencies are less than 5 (a rule of thumb in the goodness-of-fit test). Consequently, the total number of intervals is reduced from 7 to 4, resulting in v = 3 degrees of freedom. The χ²-value is then given by
\[ \chi^2 = \frac{(7-8.5)^2}{8.5} + \frac{(15-10.3)^2}{10.3} + \frac{(10-10.7)^2}{10.7} + \frac{(8-10.5)^2}{10.5} = 3.05. \]
Since the computed χ²-value is less than χ²_{0.05} = 7.815 for 3 degrees of freedom, we have no reason to reject the null hypothesis and conclude that the normal distribution with μ = 3.5 and σ = 0.7 provides a good fit for the distribution of battery lives.
The chi-squared goodness-of-fit test is an important resource, particularly since so many statistical procedures in practice depend, in a theoretical sense, on the assumption that the data gathered come from a specific type of distribution. As we have already seen, the normality assumption is often made. In the chapters that follow, we shall continue to make normality assumptions in order to provide a theoretical basis for certain tests and confidence intervals.
There are tests in the literature that are more powerful than the chi-squared test for testing normality. One such test is called Geary's test. This test is based on a very simple statistic which is a ratio of two estimators of the population standard deviation σ. Suppose that a random sample X1, X2, ..., Xn is taken from a normal distribution, N(μ, σ). Consider the ratio
\[ U = \frac{\sqrt{\pi/2}\,\sum_{i=1}^{n} |X_i - \bar{X}|/n}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2/n}}. \]
The reader should recognize that the denominator is a reasonable estimator of σ whether the distribution is normal or not. The numerator is a good estimator of σ if the distribution is normal but may overestimate or underestimate σ when there are departures from normality. Thus, values of U differing considerably from 1.0 signal that the hypothesis of normality should be rejected. For large samples, a reasonable test is based on approximate normality of U. The test statistic is then a standardization of U, given by
\[ Z = \frac{U - 1}{0.2661/\sqrt{n}}. \]
Of course, the test procedure involves the two-sided critical region. We compute a value of z from the data and do not reject the hypothesis of normality when −z_{α/2} < Z < z_{α/2}. A paper dealing with Geary's test is cited in the Bibliography (Geary, 1947).
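The large-sample version of Geary's test is simple to script; below is a minimal sketch in Python (NumPy and SciPy assumed; geary_test is our own name, not a library routine):

```python
import numpy as np
from scipy.stats import norm

def geary_test(x):
    """Geary's U: ratio of the mean-absolute-deviation estimator of sigma
    to the root-mean-square estimator; U near 1 is consistent with normality."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dev = x - x.mean()
    u = np.sqrt(np.pi / 2) * np.abs(dev).mean() / np.sqrt((dev ** 2).mean())
    z = (u - 1) / (0.2661 / np.sqrt(n))      # large-sample standardization
    p = 2 * norm.sf(abs(z))                  # two-sided critical region
    return u, z, p
```

For a sample x, the hypothesis of normality is rejected at level α when the returned p is at most α.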
10.12 Test for Independence (Categorical Data)

The chi-squared test procedure discussed in Section 10.11 can also be used to test the hypothesis of independence of two variables of classification. Suppose that we wish to determine whether the opinions of the voting residents of the state of Illinois concerning a new tax reform are independent of their levels of income. Members of a random sample of 1000 registered voters from the state of Illinois are classified as to whether they are in a low, medium, or high income bracket and whether or not they favor the tax reform. The observed frequencies are presented in Table 10.6, which is known as a contingency table.

Table 10.6: 2 × 3 Contingency Table

                      Income Level
Tax Reform    Low    Medium    High    Total
For           182      213      203      598
Against       154      138      110      402
Total         336      351      313     1000

A contingency table with r rows and c columns is referred to as an r × c table ("r × c" is read "r by c"). The row and column totals in Table 10.6 are called marginal frequencies. Our decision to accept or reject the null hypothesis, H0, of independence between a voter's opinion concerning the tax reform and his or her level of income is based upon how good a fit we have between the observed frequencies in each of the 6 cells of Table 10.6 and the frequencies that we would expect for each cell under the assumption that H0 is true. To find these expected frequencies, let us define the following events:

L: A person selected is in the low-income level.
M: A person selected is in the medium-income level.
H: A person selected is in the high-income level.
F: A person selected is for the tax reform.
A: A person selected is against the tax reform.

By using the marginal frequencies, we can list the following probability estimates:
\[ P(L) = \frac{336}{1000}, \quad P(M) = \frac{351}{1000}, \quad P(H) = \frac{313}{1000}, \quad P(F) = \frac{598}{1000}, \quad P(A) = \frac{402}{1000}. \]
Now, if H0 is true and the two variables are independent, we should have
\[ P(L \cap F) = P(L)P(F) = \left(\tfrac{336}{1000}\right)\left(\tfrac{598}{1000}\right), \qquad P(L \cap A) = P(L)P(A) = \left(\tfrac{336}{1000}\right)\left(\tfrac{402}{1000}\right), \]
\[ P(M \cap F) = P(M)P(F) = \left(\tfrac{351}{1000}\right)\left(\tfrac{598}{1000}\right), \qquad P(M \cap A) = P(M)P(A) = \left(\tfrac{351}{1000}\right)\left(\tfrac{402}{1000}\right), \]
\[ P(H \cap F) = P(H)P(F) = \left(\tfrac{313}{1000}\right)\left(\tfrac{598}{1000}\right), \qquad P(H \cap A) = P(H)P(A) = \left(\tfrac{313}{1000}\right)\left(\tfrac{402}{1000}\right). \]
The expected frequencies are obtained by multiplying each cell probability by the total number of observations. As before, we round these frequencies to one decimal. Thus, the expected number of low-income voters in our sample who favor the tax reform is estimated to be
\[ \left(\tfrac{336}{1000}\right)\left(\tfrac{598}{1000}\right)(1000) = \frac{(336)(598)}{1000} = 200.9 \]
when H0 is true. The general rule for obtaining the expected frequency of any cell is given by the following formula:
\[ \text{expected frequency} = \frac{(\text{column total}) \times (\text{row total})}{\text{grand total}}. \]
The expected frequency for each cell is recorded in parentheses beside the actual observed value in Table 10.7. Note that the expected frequencies in any row or column add up to the appropriate marginal total. In our example, we need to compute only two expected frequencies in the top row of Table 10.7 and then find the others by subtraction. The number of degrees of freedom associated with the chi-squared test used here is equal to the number of cell frequencies that may be filled in freely when we are given the marginal totals and the grand total, and in this illustration that number is 2. A simple formula providing the correct number of degrees of freedom is
v = (r − 1)(c − 1).

Table 10.7: Observed and Expected Frequencies

                             Income Level
Tax Reform    Low            Medium         High           Total
For           182 (200.9)    213 (209.9)    203 (187.2)     598
Against       154 (135.1)    138 (141.1)    110 (125.8)     402
Total         336            351            313            1000

Hence, for our example, v = (2 − 1)(3 − 1) = 2 degrees of freedom. To test the null hypothesis of independence, we use the following decision criterion.

Test for Independence: Calculate
\[ \chi^2 = \sum \frac{(o_i - e_i)^2}{e_i}, \]
where the summation extends over all rc cells in the r × c contingency table.
If χ² > χ²_α with v = (r − 1)(c − 1) degrees of freedom, reject the null hypothesis of independence at the α-level of significance; otherwise, fail to reject the null hypothesis.
Applying this criterion to our example, we find that
\[ \chi^2 = \frac{(182-200.9)^2}{200.9} + \frac{(154-135.1)^2}{135.1} + \frac{(213-209.9)^2}{209.9} + \frac{(138-141.1)^2}{141.1} + \frac{(203-187.2)^2}{187.2} + \frac{(110-125.8)^2}{125.8} = 7.85, \]
\[ P \approx 0.02. \]
From Table A.5 we find that χ²_{0.05} = 5.991 for v = (2 − 1)(3 − 1) = 2 degrees of freedom. The null hypothesis is rejected and we conclude that a voter's opinion concerning the tax reform and his or her level of income are not independent.
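SciPy packages this entire procedure, expected counts included; a minimal sketch for the tax-reform data of Table 10.6:

```python
import numpy as np
from scipy.stats import chi2_contingency

# rows: For, Against; columns: Low, Medium, High income
observed = np.array([[182, 213, 203],
                     [154, 138, 110]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(round(chi2_stat, 2), round(p_value, 3), dof)   # about 7.88, 0.02, 2
```

The slight difference from the hand value 7.85 comes from rounding the expected frequencies to one decimal in the text.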

It is important to remember that the statistic on which we base our decision has a distribution that is only approximated by the chi-squared distribution. The computed χ2-values depend on the cell frequencies and consequently are discrete. The continuous chi-squared distribution seems to approximate the discrete sam- pling distribution of χ2 very well, provided that the number of degrees of freedom is greater than 1. In a 2 × 2 contingency table, where we have only 1 degree of freedom, a correction called Yates’ correction for continuity is applied. The corrected formula then becomes
\[ \chi^2(\text{corrected}) = \sum \frac{(|o_i - e_i| - 0.5)^2}{e_i}. \]
If the expected cell frequencies are large, the corrected and uncorrected results are almost the same. When the expected frequencies are between 5 and 10, Yates’ correction should be applied. For expected frequencies less than 5, the Fisher-Irwin exact test should be used. A discussion of this test may be found in Basic Concepts of Probability and Statistics by Hodges and Lehmann (2005; see the Bibliography). The Fisher-Irwin test may be avoided, however, by choosing a larger sample.
10.13 Test for Homogeneity
When we tested for independence in Section 10.12, a random sample of 1000 vot- ers was selected and the row and column totals for our contingency table were determined by chance. Another type of problem for which the method of Section 10.12 applies is one in which either the row or column totals are predetermined. Suppose, for example, that we decide in advance to select 200 Democrats, 150 Republicans, and 150 Independents from the voters of the state of North Carolina and record whether they are for a proposed abortion law, against it, or undecided. The observed responses are given in Table 10.8.
Table 10.8: Observed Frequencies

                     Political Affiliation
Abortion Law   Democrat   Republican   Independent   Total
For               82          70            62         214
Against           93          62            67         222
Undecided         25          18            21          64
Total            200         150           150         500
Now, rather than test for independence, we test the hypothesis that the popu- lation proportions within each row are the same. That is, we test the hypothesis that the proportions of Democrats, Republicans, and Independents favoring the abortion law are the same; the proportions of each political affiliation against the law are the same; and the proportions of each political affiliation that are unde- cided are the same. We are basically interested in determining whether the three categories of voters are homogeneous with respect to their opinions concerning the proposed abortion law. Such a test is called a test for homogeneity.
Assuming homogeneity, we again find the expected cell frequencies by multi- plying the corresponding row and column totals and then dividing by the grand

total. The analysis then proceeds using the same chi-squared statistic as before. We illustrate this process for the data of Table 10.8 in the following example.

Example 10.14: Referring to the data of Table 10.8, test the hypothesis that opinions concerning the proposed abortion law are the same within each political affiliation. Use a 0.05 level of significance.
Solution:
1. H0: For each opinion, the proportions of Democrats, Republicans, and Inde- pendents are the same.
2. H1: For at least one opinion, the proportions of Democrats, Republicans, and Independents are not the same.
3. α = 0.05.
4. Critical region: χ2 > 9.488 with v = 4 degrees of freedom.
5. Computations: Using the expected cell frequency formula on page 375, we need to compute 4 cell frequencies. All other frequencies are found by sub- traction. The observed and expected cell frequencies are displayed in Table 10.9.
Table 10.9: Observed and Expected Frequencies

                        Political Affiliation
Abortion Law   Democrat     Republican   Independent   Total
For            82 (85.6)    70 (64.2)    62 (64.2)      214
Against        93 (88.8)    62 (66.6)    67 (66.6)      222
Undecided      25 (25.6)    18 (19.2)    21 (19.2)       64
Total          200          150          150            500

Now,
\[ \chi^2 = \frac{(82-85.6)^2}{85.6} + \frac{(70-64.2)^2}{64.2} + \frac{(62-64.2)^2}{64.2} + \frac{(93-88.8)^2}{88.8} + \frac{(62-66.6)^2}{66.6} + \frac{(67-66.6)^2}{66.6} + \frac{(25-25.6)^2}{25.6} + \frac{(18-19.2)^2}{19.2} + \frac{(21-19.2)^2}{19.2} = 1.53. \]
6. Decision: Do not reject H0. There is insufficient evidence to conclude that the proportions of Democrats, Republicans, and Independents differ for each stated opinion.
Testing for Several Proportions
The chi-squared statistic for testing for homogeneity is also applicable when testing the hypothesis that k binomial parameters have the same value. This is, therefore, an extension of the test presented in Section 10.9 for determining differences be- tween two proportions to a test for determining differences among k proportions. Hence, we are interested in testing the null hypothesis
H0 : p1 =p2 =···=pk

378 Chapter 10 One- and Two-Sample Tests of Hypotheses
against the alternative hypothesis, H1, that the population proportions are not all equal. To perform this test, we first observe independent random samples of size n1, n2, . . . , nk from the k populations and arrange the data in a 2 × k contingency table, Table 10.10.
Table 10.10: k Independent Binomial Samples
Sample:      1         2        ···    k
Successes:   x1        x2       ···    xk
Failures:    n1 − x1   n2 − x2  ···    nk − xk
Depending on whether the sizes of the random samples were predetermined or occurred at random, the test procedure is identical to the test for homogeneity or the test for independence. Therefore, the expected cell frequencies are calculated as before and substituted, together with the observed frequencies, into the chi-squared statistic
\[ \chi^2 = \sum \frac{(o_i - e_i)^2}{e_i}, \]
with
v = (2 − 1)(k − 1) = k − 1
degrees of freedom. By selecting the appropriate upper-tail critical region of the form χ² > χ²_α, we can now reach a decision concerning H0.
Example 10.15: In a shop study, a set of data was collected to determine whether or not the proportion of defectives produced was the same for workers on the day, evening, and night shifts. The data collected are shown in Table 10.11.

Table 10.11: Data for Example 10.15

Shift:           Day    Evening    Night
Defectives        45       55        70
Nondefectives    905      890       870

Use a 0.025 level of significance to determine if the proportion of defectives is the same for all three shifts.
Solution: Let p1,p2, and p3 represent the true proportions of defectives for the day, evening,
and night shifts, respectively.
1. H0: p1 =p2 =p3.
2. H1: p1, p2, and p3 are not all equal.
3. α = 0.025.
4. Critical region: χ2 > 7.378 for v = 2 degrees of freedom.

5. Computations: Corresponding to the observed frequencies o1 = 45 and o2 = 55, we find
\[ e_1 = \frac{(950)(170)}{2835} = 57.0 \quad \text{and} \quad e_2 = \frac{(945)(170)}{2835} = 56.7. \]
All other expected frequencies are found by subtraction and are displayed in Table 10.12.
Table 10.12: Observed and Expected Frequencies

Shift:           Day           Evening       Night         Total
Defectives        45 (57.0)     55 (56.7)     70 (56.3)      170
Nondefectives    905 (893.0)   890 (888.3)   870 (883.7)    2665
Total            950           945           940            2835
Now
\[ \chi^2 = \frac{(45-57.0)^2}{57.0} + \frac{(55-56.7)^2}{56.7} + \frac{(70-56.3)^2}{56.3} + \frac{(905-893.0)^2}{893.0} + \frac{(890-888.3)^2}{888.3} + \frac{(870-883.7)^2}{883.7} = 6.29, \qquad P \approx 0.04. \]
6. Decision: We do not reject H0 at α = 0.025. Nevertheless, with the above P-value computed, it would certainly be dangerous to conclude that the proportion of defectives produced is the same for all shifts.
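Because the shift data form a 2 × 3 table, the same chi2_contingency call used for the independence test reproduces Example 10.15 (a minimal sketch, SciPy assumed):

```python
import numpy as np
from scipy.stats import chi2_contingency

# rows: defectives, nondefectives; columns: day, evening, night shifts
observed = np.array([[ 45,  55,  70],
                     [905, 890, 870]])

chi2_stat, p_value, dof, _ = chi2_contingency(observed)
print(round(chi2_stat, 2), round(p_value, 3), dof)   # about 6.23, 0.044, 2
```

(The text's 6.29 uses expected counts rounded to one decimal; the conclusion is the same.)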
Often a complete study involving the use of statistical methods in hypothesis testing can be illustrated for the scientist or engineer using both test statistics, complete with P-values and statistical graphics. The graphics supplement the numerical diagnostics with pictures that show intuitively why the P -values appear as they do, as well as how reasonable (or not) the operative assumptions are.
10.14 Two-Sample Case Study
In this section, we consider a study involving a thorough graphical and formal anal- ysis, along with annotated computer printout and conclusions. In a data analysis study conducted by personnel at the Statistics Consulting Center at Virginia Tech, two different materials, alloy A and alloy B, were compared in terms of breaking strength. Alloy B is more expensive, but it should certainly be adopted if it can be shown to be stronger than alloy A. The consistency of performance of the two alloys should also be taken into account.
Random samples of beams made from each alloy were selected, and strength was measured in units of 0.001-inch deflection as a fixed force was applied at both ends of the beam. Twenty specimens were used for each of the two alloys. The data are given in Table 10.13.
It is important that the engineer compare the two alloys. Of concern is average strength and reproducibility. It is of interest to determine if there is a severe

Table 10.13: Data for Two-Sample Case Study
Alloy A 88 82
79 85 84 88 89 80 81 85 83 87 82 80 79 78
Alloy B
87 75 81 80 90 77 78 81 83 86 78 77 81 84 82 78
80 80 78 76 83 85 76 79
violation of the normality assumption required of both the t- and F-tests. Figures 10.21 and 10.22 are normal quantile-quantile plots of the samples of the two alloys. There does not appear to be any serious violation of the normality assumption. In addition, Figure 10.23 shows two box-and-whisker plots on the same graph. The box-and-whisker plots suggest that there is no appreciable difference in the vari- ability of deflection for the two alloys. However, it seems that the mean deflection for alloy B is significantly smaller, suggesting, at least graphically, that alloy B is
stronger. The sample means and standard deviations are
ȳA = 83.55,  sA = 3.663;   ȳB = 79.70,  sB = 3.097.
The SAS printout for the PROC TTEST is shown in Figure 10.24. The F-test suggests no significant difference in variances (P = 0.4709), and the two-sample t-statistic for testing
H0: μA = μB,   H1: μA > μB
(t = 3.59, P = 0.0009) rejects H0 in favor of H1 and thus confirms what the graphical information suggests. Here we use the t-test that pools the two-sample variances together in light of the results of the F -test. On the basis of this analysis, the adoption of alloy B would seem to be in order.
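The pooled t-statistic of Figure 10.24 can be reproduced from the summary statistics alone (a minimal sketch in Python, SciPy assumed):

```python
from scipy.stats import ttest_ind_from_stats

# summary statistics for the alloy data (n = 20 per alloy)
t_stat, p_two_sided = ttest_ind_from_stats(
    mean1=83.55, std1=3.663, nobs1=20,
    mean2=79.70, std2=3.097, nobs2=20,
    equal_var=True)                     # pooled variances, as the F-test justifies

print(round(t_stat, 2), round(p_two_sided, 4))   # 3.59 and 0.0009, as in the printout
```

Halving the two-sided value gives the one-sided P-value for H1: μA > μB.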
Statistical Significance and Engineering or Scientific Significance
While the statistician may feel quite comfortable with the results of the comparison between the two alloys in the case study above, a dilemma remains for the engineer. The analysis demonstrated a statistically significant improvement with the use of alloy B. However, is the difference found really worth pursuing, since alloy B is more expensive? This illustration highlights a very important issue often overlooked by statisticians and data analysts—the distinction between statistical significance and engineering or scientific significance. Here the average difference in deflection is ȳA − ȳB = 0.00385 inch. In a complete analysis, the engineer must determine if the difference is sufficient to justify the extra cost in the long run. This is an economic and engineering issue. The reader should understand that a statistically significant difference merely implies that the difference in the sample means found in the data could hardly have occurred by chance. It does not imply that the difference in the population means is profound or particularly significant in the context of the problem. For example, in Section 10.4, an annotated computer printout was used to show evidence that a pH meter was, in fact, biased. That is, it does not demonstrate a mean pH of 7.00 for the material on which it was tested. But the variability among the observations in the sample is very small. The engineer may decide that the small deviations from 7.0 render the pH meter adequate.

Figure 10.21: Normal quantile-quantile plot of data for alloy A.
Figure 10.22: Normal quantile-quantile plot of data for alloy B.
Figure 10.23: Box-and-whisker plots for both alloys.

The TTEST Procedure
Alloy      N    Mean    Std Dev   Std Err
Alloy A   20    83.55    3.6631    0.8191
Alloy B   20    79.70    3.0967    0.6924

Variances   DF   t Value   Pr > |t|
Equal       38     3.59     0.0009
Unequal     37     3.59     0.0010

Equality of Variances
Num DF   Den DF   F Value   Pr > F
  19       19       1.40    0.4709

Figure 10.24: Annotated SAS printout for alloy data.

Exercises
10.79 A machine is supposed to mix peanuts, hazel- nuts, cashews, and pecans in the ratio 5:2:2:1. A can containing 500 of these mixed nuts was found to have 269 peanuts, 112 hazelnuts, 74 cashews, and 45 pecans. At the 0.05 level of significance, test the hypothesis that the machine is mixing the nuts in the ratio 5:2:2:1.
10.80 The grades in a statistics course for a particular semester were as follows:

Grade:   A    B    C    D    F
f:      14   18   32   20   16

Test the hypothesis, at the 0.05 level of significance, that the distribution of grades is uniform.
10.81 A die is tossed 180 times with the following results:
x:   1    2    3    4    5    6
f:  28   36   36   30   27   23
Is this a balanced die? Use a 0.01 level of significance.
10.82 Three marbles are selected from an urn con- taining 5 red marbles and 3 green marbles. After the number X of red marbles is recorded, the marbles are replaced in the urn and the experiment repeated 112 times. The results obtained are as follows:
x:  0    1    2    3
f:  1   31   55   25
Test the hypothesis, at the 0.05 level of significance, that the recorded data may be fitted by the hypergeo- metric distribution h(x; 8, 3, 5), x = 0, 1, 2, 3.
10.83 A coin is thrown until a head occurs and the number X of tosses recorded. After repeating the experiment 256 times, we obtained the following results:

x:    1    2    3    4    5    6    7    8
f:  136   60   34   12    9    1    3    1
Test the hypothesis, at the 0.05 level of significance, that the observed distribution of X may be fitted by the geometric distribution g(x; 1/2), x = 1, 2, 3, . . . .
10.84 For Exercise 1.18 on page 31, test the good- ness of fit between the observed class frequencies and the corresponding expected frequencies of a normal dis- tribution with μ = 65 and σ = 21, using a 0.05 level of significance.
10.85 For Exercise 1.19 on page 31, test the good- ness of fit between the observed class frequencies and the corresponding expected frequencies of a normal dis- tribution with μ = 1.8 and σ = 0.4, using a 0.01 level of significance.
10.86 In an experiment to study the dependence of hypertension on smoking habits, the following data were taken on 180 individuals:
                  Nonsmokers   Moderate Smokers   Heavy Smokers
Hypertension          21              36                30
No hypertension       48              26                19
Test the hypothesis that the presence or absence of hy- pertension is independent of smoking habits. Use a 0.05 level of significance.
10.87 A random sample of 90 adults is classified ac- cording to gender and the number of hours of television watched during a week:
                   Gender
                Male   Female
Over 25 hours    15      29
Under 25 hours   27      19

Use a 0.01 level of significance and test the hypothesis that the time spent watching television is independent of whether the viewer is male or female.

10.88 A random sample of 200 married men, all retired, was classified according to education and number of children:

                Number of Children
Education      0–1    2–3    Over 3
Elementary      14     37      32
Secondary       19     42      17
College         12     17      10

Test the hypothesis, at the 0.05 level of significance, that the size of a family is independent of the level of education attained by the father.

10.89 A criminologist conducted a survey to determine whether the incidence of certain types of crime varied from one part of a large city to another. The particular crimes of interest were assault, burglary, larceny, and homicide. The following table shows the numbers of crimes committed in four areas of the city during the past year.

                     Type of Crime
District   Assault   Burglary   Larceny   Homicide
   1         162        118       451        18
   2         310        196       996        25
   3         258        193       458        10
   4         280        175       390        19

Can we conclude from these data at the 0.01 level of significance that the occurrence of these types of crime is dependent on the city district?

10.90 According to a Johns Hopkins University study published in the American Journal of Public Health, widows live longer than widowers. Consider the following survival data collected on 100 widows and 100 widowers following the death of a spouse:

Years Lived      Widow   Widower
Less than 5        25       39
5 to 10            42       40
More than 10       33       21

Can we conclude at the 0.05 level of significance that the proportions of widows and widowers are equal with respect to the different time periods that a spouse survives after the death of his or her mate?

10.91 The following responses concerning the standard of living at the time of an independent opinion poll of 1000 households versus one year earlier seem to be in agreement with the results of a study published in Across the Board (June 1981):

                       Standard of Living
Period        Somewhat Better   Same   Not as Good   Total
1980: Jan.          72           144        84        300
      May           63           135       102        300
      Sept.         47           100        53        200
1981: Jan.          40           105        55        200

Test the hypothesis that the proportions of households within each standard of living category are the same for each of the four time periods. Use a P-value.

10.92 A college infirmary conducted an experiment to determine the degree of relief provided by three cough remedies. Each cough remedy was tried on 50 students and the following data recorded:

                    Cough Remedy
               NyQuil   Robitussin   Triaminic
No relief        11         13            9
Some relief      32         28           27
Total relief      7          9           14

Test the hypothesis that the three cough remedies are equally effective. Use a P-value in your conclusion.

10.93 To determine current attitudes about prayer in public schools, a survey was conducted in four Virginia counties. The following table gives the attitudes of 200 parents from Craig County, 150 parents from Giles County, 100 parents from Franklin County, and 100 parents from Montgomery County:

                        County
Attitude      Craig   Giles   Franklin   Mont.
Favor           65      66       40        34
Oppose          42      30       33        42
No opinion      93      54       27        24

Test for homogeneity of attitudes among the four counties concerning prayer in the public schools. Use a P-value in your conclusion.

10.94 A survey was conducted in Indiana, Kentucky, and Ohio to determine the attitude of voters concerning school busing. A poll of 200 voters from each of these states yielded the following results:

                     Voter Attitude
State       Support   Do Not Support   Undecided
Indiana        82           97             21
Kentucky      107           66             27
Ohio           93           74             33

At the 0.05 level of significance, test the null hypothesis that the proportions of voters within each attitude category are the same for each of the three states.

10.95 A survey was conducted in two Virginia cities to determine voter sentiment about two gubernatorial candidates in an upcoming election. Five hundred voters were randomly selected from each city and the following data were recorded:

                        City
Voter Sentiment   Richmond   Norfolk
Favor A              204        225
Favor B              211        198
Undecided             85         77

At the 0.05 level of significance, test the null hypothesis that proportions of voters favoring candidate A, favoring candidate B, and undecided are the same for each city.

10.96 In a study to estimate the proportion of wives who regularly watch soap operas, it is found that 52 of 200 wives in Denver, 31 of 150 wives in Phoenix, and 37 of 150 wives in Rochester watch at least one soap opera. Use a 0.05 level of significance to test the hypothesis that there is no difference among the true proportions of wives who watch soap operas in these three cities.
Review Exercises
10.97 State the null and alternative hypotheses to be used in testing the following claims and determine gen- erally where the critical region is located:
(a) The mean snowfall at Lake George during the month of February is 21.8 centimeters.
(b) No more than 20% of the faculty at the local uni- versity contributed to the annual giving fund.
(c) On the average, children attend schools within 6.2 kilometers of their homes in suburban St. Louis.
(d) At least 70% of next year’s new cars will be in the compact and subcompact category.
(e) The proportion of voters favoring the incumbent in the upcoming election is 0.58.
(f) The average rib-eye steak at the Longhorn Steak house weighs at least 340 grams.
10.98 A geneticist is interested in the proportions of males and females in a population who have a cer- tain minor blood disorder. In a random sample of 100 males, 31 are found to be afflicted, whereas only 24 of 100 females tested have the disorder. Can we conclude at the 0.01 level of significance that the proportion of men in the population afflicted with this blood disorder is significantly greater than the proportion of women afflicted?
10.99 A study was made to determine whether more Italians than Americans prefer white champagne to pink champagne at weddings. Of the 300 Italians selected at random, 72 preferred white champagne, and of the 400 Americans selected, 70 preferred white champagne. Can we conclude that a higher proportion of Italians than Americans prefer white champagne at weddings? Use a 0.05 level of significance.
10.100 Consider the situation of Exercise 10.54 on page 360. Oxygen consumption in mL/kg/min, was also measured.
Subject With CO
1 26.46 2 17.46 3 16.32 4 20.19 5 19.84 6 20.65 7 28.21 8 33.94 9 29.32
Without CO
25.41 22.53 16.32 27.48 24.97 21.77 28.17 32.02 28.96
It is conjectured that oxygen consumption should be higher in an environment relatively free of CO. Do a significance test and discuss the conjecture.
10.101 In a study analyzed by the Statistics Consult- ing Center at Virginia Tech, a group of subjects was asked to complete a certain task on the computer. The response measured was the time to completion. The purpose of the experiment was to test a set of facilita- tion tools developed by the Department of Computer Science at the university. There were 10 subjects in- volved. With a random assignment, five were given a standard procedure using Fortran language for comple- tion of the task. The other five were asked to do the task with the use of the facilitation tools. The data on the completion times for the task are given here.
Group 1 Group 2 (Standard Procedure) (Facilitation Tool)
161 132 169 162 174 134 158 138 163 133
Assuming that the population distributions are nor- mal and variances are the same for the two groups, support or refute the conjecture that the facilitation tools increase the speed with which the task can be accomplished.
10.102 State the null and alternative hypotheses to be used in testing the following claims, and determine

Review Exercises
385
generally where the critical region is located:
(a) At most, 20% of next year’s wheat crop will be exported to the Soviet Union.
(b) On the average, American homemakers drink 3 cups of coffee per day.
(c) The proportion of college graduates in Virginia this year who majored in the social sciences is at least 0.15.
(d) The average donation to the American Lung Asso- ciation is no more than $10.
(e) Residents in suburban Richmond commute, on the average, 15 kilometers to their place of employ- ment.
10.103 If one can containing 500 nuts is selected at random from each of three different distributors of mixed nuts and there are, respectively, 345, 313, and 359 peanuts in each of the cans, can we conclude at the 0.01 level of significance that the mixed nuts of the three distributors contain equal proportions of peanuts?
10.104 A study was made to determine whether there is a difference between the proportions of parents in the states of Maryland (MD), Virginia (VA), Georgia (GA), and Alabama (AL) who favor placing Bibles in the elementary schools. The responses of 100 parents selected at random in each of these states are recorded in the following table:
State
Preference MD VA GA AL
Yes 65 71 78 82
No 35 29 22 18
Can we conclude that the proportions of parents who favor placing Bibles in the schools are the same for these four states? Use a 0.01 level of significance.
10.105 A study was conducted at the Virginia- Maryland Regional College of Veterinary Medicine Equine Center to determine if the performance of a certain type of surgery on young horses had any effect on certain kinds of blood cell types in the animal. Fluid samples were taken from each of six foals before and af- ter surgery. The samples were analyzed for the number of postoperative white blood cell (WBC) leukocytes. A preoperative measure of WBC leukocytes was also measured. The data are given as follows:
nificant change in WBC leukocytes with the surgery.
10.106 A study was conducted at the Department of Health and Physical Education at Virginia Tech to de- termine if 8 weeks of training truly reduces the choles- terol levels of the participants. A treatment group con- sisting of 15 people was given lectures twice a week on how to reduce cholesterol level. Another group of 18 people of similar age was randomly selected as a control group. All participants’ cholesterol levels were recorded at the end of the 8-week program and are listed below.
Treatment:
129 131 154 172 115 126 175 191
122 238 159 156 176 175 126
Control:
151 132 196 195 188 198 187 168 115
165 137 208 133 217 191 193 140 146
Can we conclude, at the 5% level of significance, that the average cholesterol level has been reduced due to the program? Make the appropriate test on means.
10.107 In a study conducted by the Department of Mechanical Engineering and analyzed by the Statistics Consulting Center at Virginia Tech, steel rods supplied by two different companies were compared. Ten sam- ple springs were made out of the steel rods supplied by each company, and the “bounciness” was studied. The data are as follows:
Company A:
9.3 8.8 6.8 8.7 8.5 6.7 8.0 6.5 9.2 7.0
Company B:
11.0 9.8 9.9 10.2 10.1 9.7 11.0 11.1 10.2 9.6
Can you conclude that there is virtually no difference in means between the steel rods supplied by the two companies? Use a P-value to reach your conclusion. Should variances be pooled here?
10.108 In a study conducted by the Water Resources Center and analyzed by the Statistics Consulting Cen- ter at Virginia Tech, two different wastewater treat- ment plants are compared. Plant A is located where the median household income is below $22,000 a year, and plant B is located where the median household income is above $60,000 a year. The amount of waste- water treated at each plant (thousands of gallons/day) was randomly sampled for 10 days. The data are as follows:
Plant A:
21 19 20 23 22 28 32 19 13 18
Plant B:
20 39 24 33 30 28 30 22 33 24
Can we conclude, at the 5% level of significance, that
Foal Presurgery*
1 10.80 2 12.90 3 9.59 4 8.81 5 12.00 6 6.07
Postsurgery*
10.60 16.60 17.20 14.00 10.60
//
8.60
Use a paired sample t-test to determine if there is a sig-
*All values × 10−3.

386 Chapter 10 One- and Two-Sample Tests of Hypotheses
the average amount of wastewater treated at the plant in the high-income neighborhood is more than that treated at the plant in the low-income area? Assume normality.
10.109 The following data show the numbers of de- fects in 100,000 lines of code in a particular type of software program developed in the United States and Japan. Is there enough evidence to claim that there is a significant difference between the programs developed in the two countries? Test on means. Should variances be pooled?
U.S. 48 39 42 52 40 48 52 52 54 48 52 55 43 46 48 52 Japan 5048424043485046 38 38 36 40 40 48 48 45
10.110 Studies show that the concentration of PCBs is much higher in malignant breast tissue than in normal breast tissue. If a study of 50 women with
breast cancer reveals an average PCB concentration of 22.8 × 10−4 gram, with a standard deviation of 4.8 × 10−4 gram, is the mean concentration of PCBs less than 24 × 10−4 gram?
10.111 z-Value for Testing p1−p2 = d0: To test the null hypothesis H0 that p1 −p2 = d0, where d0 ̸= 0, we base our decision on
z=􏰱 pˆ1−pˆ2−d0 , pˆ 1 qˆ 1 / n 1 + pˆ 2 qˆ 2 / n 2
which is a value of a random variable whose distribu- tion approximates the standard normal distribution as long as n1 and n2 are both large. With reference to Example 10.11 on page 364, test the hypothesis that the percentage of town voters favoring the construction of the chemical plant will not exceed the percentage of county voters by more than 3%. Use a P -value in your conclusion.
10.15 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
One of the easiest ways to misuse statistics relates to the final scientific conclusion drawn when the analyst does not reject the null hypothesis H0. In this text, we have attempted to make clear what the null hypothesis means and what the al- ternative means, and to stress that, in a large sense, the alternative hypothesis is much more important. Put in the form of an example, if an engineer is attempt- ing to compare two gauges using a two-sample t-test, and H0 is “the gauges are equivalent” while H1 is “the gauges are not equivalent,” not rejecting H0 does not lead to the conclusion of equivalent gauges. In fact, a case can be made for never writing or saying “accept H0”! Not rejecting H0 merely implies insufficient evidence. Depending on the nature of the hypothesis, a lot of possibilities are still not ruled out.
In Chapter 9, we considered the case of the large-sample confidence interval
using
x ̄ − μ z = s/√n.
In hypothesis testing, replacing σ by s for n < 30 is risky. If n ≥ 30 and the distribution is not normal but somehow close to normal, the Central Limit Theorem is being called upon and one is relying on the fact that with n ≥ 30, s ≈ σ. Of course, any t-test is accompanied by the concomitant assumption of normality. As in the case of confidence intervals, the t-test is relatively robust to normality. However, one should still use normal probability plotting, goodness-of-fit tests, or other graphical procedures when the sample is not too small. Most of the chapters in this text include discussions whose purpose is to relate the chapter in question to other material that will follow. The topics of estimation 10.15 Potential Misconceptions and Hazards 387 and hypothesis testing are both used in a major way in nearly all of the tech- niques that fall under the umbrella of “statistical methods.” This will be readily noted by students who advance to Chapters 11 through 16. It will be obvious that these chapters depend heavily on statistical modeling. Students will be ex- posed to the use of modeling in a wide variety of applications in many scientific and engineering fields. It will become obvious quite quickly that the framework of a statistical model is useless unless data are available with which to estimate parameters in the formulated model. This will become particularly apparent in Chapters 11 and 12 as we introduce the notion of regression models. The concepts and theory associated with Chapter 9 will carry over. As far as material in the present chapter is concerned, the framework of hypothesis testing, P -values, power of tests, and choice of sample size will collectively play a major role. Since initial model formulation quite often must be supplemented by model editing before the analyst is sufficiently comfortable to use the model for either process understand- ing or prediction, Chapters 11, 12, and 15 make major use of hypothesis testing to supplement diagnostic measures that are used to assess model quality. This page intentionally left blank Chapter 11 Simple Linear Regression and Correlation 11.1 Introduction to Linear Regression Often, in practice, one is called upon to solve problems involving sets of variables when it is known that there exists some inherent relationship among the variables. For example, in an industrial situation it may be known that the tar content in the outlet stream in a chemical process is related to the inlet temperature. It may be of interest to develop a method of prediction, that is, a procedure for estimating the tar content for various levels of the inlet temperature from experimental infor- mation. Now, of course, it is highly likely that for many example runs in which the inlet temperature is the same, say 130◦C, the outlet tar content will not be the same. This is much like what happens when we study several automobiles with the same engine volume. They will not all have the same gas mileage. Houses in the same part of the country that have the same square footage of living space will not all be sold for the same price. Tar content, gas mileage (mpg), and the price of houses (in thousands of dollars) are natural dependent variables, or responses, in these three scenarios. Inlet temperature, engine volume (cubic feet), and square feet of living space are, respectively, natural independent variables, or regressors. 
A reasonable form of a relationship between the response Y and the regressor x is the linear relationship Y = β0 + β1x, where, of course, β0 is the intercept and β1 is the slope. The relationship is illustrated in Figure 11.1. If the relationship is exact, then it is a deterministic relationship between two scientific variables and there is no random or probabilistic component to it. However, in the examples listed above, as well as in countless other scientific and engineering phenomena, the relationship is not deterministic (i.e., a given x does not always give the same value for Y ). As a result, important problems here are probabilistic in nature since the relationship above cannot be viewed as being exact. The concept of regression analysis deals with finding the best relationship 389 390 Chapter 11 Simple Linear Regression and Correlation Y } β0 Figure 11.1: A linear relationship; β0: intercept; β1: slope. between Y and x, quantifying the strength of that relationship, and using methods that allow for prediction of the response values given values of the regressor x. In many applications, there will be more than one regressor (i.e., more than one independent variable that helps to explain Y ). For example, in the case where the response is the price of a house, one would expect the age of the house to contribute to the explanation of the price, so in this case the multiple regression structure might be written Y =β0 +β1x1 +β2x2, where Y is price, x1 is square footage, and x2 is age in years. In the next chap- ter, we will consider problems with multiple regressors. The resulting analysis is termed multiple regression, while the analysis of the single regressor case is called simple regression. As a second illustration of multiple regression, a chem- ical engineer may be concerned with the amount of hydrogen lost from samples of a particular metal when the material is placed in storage. In this case, there may be two inputs, storage time x1 in hours and storage temperature x2 in degrees centigrade. The response would then be hydrogen loss Y in parts per million. In this chapter, we deal with the topic of simple linear regression, treating only the case of a single regressor variable in which the relationship between y and x is linear. For the case of more than one regressor variable, the reader is referred to Chapter 12. Denote a random sample of size n by the set {(xi,yi); i = 1,2,...,n}. If additional samples were taken using exactly the same values of x, we should expect the y values to vary. Hence, the value yi in the ordered pair (xi,yi) is a value of some random variable Yi. The Simple Linear Regression (SLR) Model We have already confined the terminology regression analysis to situations in which relationships among variables are not deterministic (i.e., not exact). In other words, there must be a random component to the equation that relates the variables. x 11.2 ββ Y=0+1x 11.2 The Simple Linear Regression Model 391 Simple Linear Regression Model This random component takes into account considerations that are not being mea- sured or, in fact, are not understood by the scientists or engineers. Indeed, in most applications of regression, the linear equation, say Y = β0 + β1x, is an approxima- tion that is a simplification of something unknown and much more complicated. For example, in our illustration involving the response Y = tar content and x = inlet temperature, Y = β0 + β1x is likely a reasonable approximation that may be operative within a confined range on x. 
More often than not, the models that are simplifications of more complicated and unknown structures are linear in nature (i.e., linear in the parameters β0 and β1 or, in the case of the model involving the price, size, and age of the house, linear in the parameters β0, β1, and β2). These linear structures are simple and empirical in nature and are thus called empirical models. An analysis of the relationship between Y and x requires the statement of a statistical model. A model is often used by a statistician as a representation of an ideal that essentially defines how we perceive that the data were generated by the system in question. The model must include the set {(xi, yi); i = 1, 2, . . . , n} of data involving n pairs of (x, y) values. One must bear in mind that the value yi depends on xi via a linear structure that also has the random component involved. The basis for the use of a statistical model relates to how the random variable Y moves with x and the random component. The model also includes what is assumed about the statistical properties of the random component. The statistical model for simple linear regression is given below. The response Y is related to the independent variable x through the equation Y = β0 + β1x + ε. In the above, β0 and β1 are unknown intercept and slope parameters, respectively, and ε is a random variable that is assumed to be distributed with E(ε) = 0 and Var(ε) = σ2. The quantity σ2 is often called the error variance or residual variance. From the model above, several things become apparent. The quantity Y is a random variable since ε is random. The value x of the regressor variable is not random and, in fact, is measured with negligible error. The quantity ε, often called a random error or random disturbance, has constant variance. This portion of the assumptions is often called the homogeneous variance assump- tion. The presence of this random error, ε, keeps the model from becoming simply a deterministic equation. Now, the fact that E(ε) = 0 implies that at a specific x the y-values are distributed around the true, or population, regression line y = β0 + β1x. If the model is well chosen (i.e., there are no additional important regressors and the linear approximation is good within the ranges of the data), then positive and negative errors around the true regression are reasonable. We must keep in mind that in practice β0 and β1 are not known and must be estimated from data. In addition, the model described above is conceptual in nature. As a result, we never observe the actual ε values in practice and thus we can never draw the true regression line (but we assume it is there). We can only draw an estimated line. Figure 11.2 depicts the nature of hypothetical (x, y) data scattered around a true regression line for a case in which only n = 5 observations are available. Let us emphasize that what we see in Figure 11.2 is not the line that is used by the 392 Chapter 11 Simple Linear Regression and Correlation scientist or engineer. Rather, the picture merely describes what the assumptions mean! The regression that the user has at his or her disposal will now be described. y ε2 ε3 ε1 ε4 ε5 “True’’ Regression Line E(Y)=β0+β1x x Figure 11.2: Hypothetical (x, y) data scattered around the true regression line for n = 5. The Fitted Regression Line An important aspect of regression analysis is, very simply, to estimate the parame- ters β0 and β1 (i.e., estimate the so-called regression coefficients). 
The method of estimation will be discussed in the next section. Suppose we denote the esti- mates b0 for β0 and b1 for β1. Then the estimated or fitted regression line is given by yˆ = b 0 + b 1 x , where yˆ is the predicted or fitted value. Obviously, the fitted line is an estimate of the true regression line. We expect that the fitted line should be closer to the true regression line when a large amount of data are available. In the following example, we illustrate the fitted line for a real-life pollution study. One of the more challenging problems confronting the water pollution control field is presented by the tanning industry. Tannery wastes are chemically complex. They are characterized by high values of chemical oxygen demand, volatile solids, and other pollution measures. Consider the experimental data in Table 11.1, which were obtained from 33 samples of chemically treated waste in a study conducted at Virginia Tech. Readings on x, the percent reduction in total solids, and y, the percent reduction in chemical oxygen demand, were recorded. The data of Table 11.1 are plotted in a scatter diagram in Figure 11.3. From an inspection of this scatter diagram, it can be seen that the points closely follow a straight line, indicating that the assumption of linearity between the two variables appears to be reasonable. 11.2 The Simple Linear Regression Model 393 Table 11.1: Measures of Reduction in Solids and Oxygen Demand Solids Reduction, x (%) Oxygen Demand Solids Reduction, Oxygen Demand 3 7 11 15 18 27 29 30 30 31 31 32 33 33 34 36 36 Reduction, y (%) 5 11 21 16 16 28 27 25 35 30 40 32 34 32 34 37 38 x (%) Reduction, y (%) 36 34 37 36 38 38 39 37 39 36 39 45 40 39 41 41 42 40 42 44 43 37 44 44 45 46 46 46 47 49 50 51 y 55 50 45 40 35 30 25 20 15 10 5 0 3 6 9 121518212427303336394245485154 Figure 11.3: Scatter diagram with regression lines. x The fitted regression line and a hypothetical true regression line are shown on the scatter diagram of Figure 11.3. This example will be revisited as we move on to the method of estimation, discussed in Section 11.3. y^ = b 0 + b 1 x μ|ββ Yx= 0+ 1x 394 Chapter 11 Simple Linear Regression and Correlation Another Look at the Model Assumptions It may be instructive to revisit the simple linear regression model presented previ- ously and discuss in a graphical sense how it relates to the so-called true regression. Let us expand on Figure 11.2 by illustrating not merely where the εi fall on a graph but also what the implication is of the normality assumption on the εi. Suppose we have a simple linear regression with n = 6 evenly spaced values of x and a single y-value at each x. Consider the graph in Figure 11.4. This illustration should give the reader a clear representation of the model and the assumptions involved. The line in the graph is the true regression line. The points plotted are actual (y,x) points which are scattered about the line. Each point is on its own normal distribution with the center of the distribution (i.e., the mean of y) falling on the line. This is certainly expected since E(Y ) = β0 + β1x. As a result, the true regression line goes through the means of the response, and the actual observations are on the distribution around the means. Note also that all distributions have the same variance, which we referred to as σ2. Of course, the deviation between an individual y and the point on the line will be its individual ε value. This is clear since yi −E(Yi)=yi −(β0 +β1xi)=εi. 
Thus, at a given x, Y and the corresponding ε both have variance σ2. Y x x1 x2 x3 x4 x5 x6 Figure 11.4: Individual observations around true regression line. Note also that we have written the true regression line here as μY |x = β0 + β1 x in order to reaffirm that the line goes through the mean of the Y random variable. 11.3 Least Squares and the Fitted Model In this section, we discuss the method of fitting an estimated regression line to the data. This is tantamount to the determination of estimates b0 for β0 and b1 Y/x= 0+ 1x μββ 11.3 Least Squares and the Fitted Model 395 for β1. This of course allows for the computation of predicted values from the fitted line yˆ = b0 + b1x and other types of analyses and diagnostic information that will ascertain the strength of the relationship and the adequacy of the fitted model. Before we discuss the method of least squares estimation, it is important to introduce the concept of a residual. A residual is essentially an error in the fit o f t h e m o d e l yˆ = b 0 + b 1 x . Residual: Error in Given a set of regression data {(xi, yi); i = 1, 2, . . . , n} and a fitted model, yˆi = Fit b0 + b1xi, the ith residual ei is given by ei=yi−yˆi, i=1,2,...,n. Obviously, if a set of n residuals is large, then the fit of the model is not good. Small residuals are a sign of a good fit. Another interesting relationship which is useful at times is the following: yi = b0 + b1xi + ei. The use of the above equation should result in clarification of the distinction be- tween the residuals, ei, and the conceptual model errors, εi. One must bear in mind that whereas the εi are not observed, the ei not only are observed but also play an important role in the total analysis. Figure 11.5 depicts the line fit to this set of data, namely yˆ = b0 + b1 x, and the line reflecting the model μY |x = β0 + β1x. Now, of course, β0 and β1 are unknown parameters. The fitted line is an estimate of the line produced by the statistical model. Keep in mind that the line μY |x = β0 + β1x is not known. y εi{}ei ( x i , y i ) y^ = b 0 + b 1 x μY|x =β0+β1x Figure 11.5: Comparing εi with the residual, ei. The Method of Least Squares We shall find b0 and b1, the estimates of β0 and β1, so that the sum of the squares of the residuals is a minimum. The residual sum of squares is often called the sum of squares of the errors about the regression line and is denoted by SSE. This x 396 Chapter 11 Simple Linear Regression and Correlation minimization procedure for estimating the parameters is called the method of least squares. Hence, we shall find a and b so as to minimize 􏰤n 􏰤n 􏰤n SSE = e2i = i=1 (yi −yˆi)2 = (yi −b0 −b1xi)2. i=1 i=1 Differentiating SSE with respect to b0 and b1, we have ∂ ( S S E ) 􏰤n ∂ ( S S E ) 􏰤n =−2 Setting the partial derivatives equal to zero and rearranging the terms, we obtain (yi −b0 −b1xi), the equations (called the normal equations) =−2 (yi −b0 −b1xi)xi. ∂b 0 i=1 ∂b 1 i=1 􏰤n 􏰤n nb0 +b1 xi = 􏰤n i=1 􏰤n 􏰤n x2i = xiyi, yi, which may be solved simultaneously to yield computing formulas for b0 and b1. Estimating the Given the sample {(xi , yi ); i = 1, 2, . . . , n}, the least squares estimates b0 and b1 Regression of the regression coefficients β0 and β1 are computed from the formulas i=1 i=1 i=1 i=1 b0 xi +b1 Coefficients 􏰨 􏰦n =i=1 and i=1 i=1 xi 􏰦n n x i y i − 􏰧 􏰦n i=1 x i 􏰨 􏰧 􏰦n y i ( x i − x ̄ ) ( y i − y ̄ ) b= i=1 1 􏰦n􏰧􏰦n􏰨2 􏰦n 􏰦n b0 = i=1 i=1 n x2i− xi (xi−x ̄)2 i=1 yi − b1 􏰦n i=1 Estimate the regression line for the pollution data of Table 11.1. 
33 33 33 33 =y ̄−b1x ̄. The calculations of b0 and b1, using the data of Table 11.1, are illustrated by the following example. n Example 11.1: Solution : 􏰤􏰤􏰤􏰤 i=1 Therefore, i=1 xi = 1104, yi = 1124, i=1 xiyi = 41,355, x2i = 41,086 i=1 b1 = (33)(41,355) − (1104)(1124) = 0.903643 and (33)(41,086) − (1104)2 b0 = 1124 − (0.903643)(1104) = 3.829633. 33 Thus, the estimated regression line is given by yˆ = 3.8296 + 0.9036x. Using the regression line of Example 11.1, we would predict a 31% reduction in the chemical oxygen demand when the reduction in the total solids is 30%. The 11.3 Least Squares and the Fitted Model 397 31% reduction in the chemical oxygen demand may be interpreted as an estimate of the population mean μY |30 or as an estimate of a new observation when the reduction in total solids is 30%. Such estimates, however, are subject to error. Even if the experiment were controlled so that the reduction in total solids was 30%, it is unlikely that we would measure a reduction in the chemical oxygen demand exactly equal to 31%. In fact, the original data recorded in Table 11.1 show that measurements of 25% and 35% were recorded for the reduction in oxygen demand when the reduction in total solids was kept at 30%. What Is Good about Least Squares? It should be noted that the least squares criterion is designed to provide a fitted line that results in a “closeness” between the line and the plotted points. There are many ways of measuring closeness. For example, one may wish to determine b0 􏰦n and b1 for which |yi − yˆi | is minimized or for which 􏰦n 1.5 |yi − yˆi | is minimized. i=1 These are both viable and reasonable methods. Note that both of these, as well as the least squares procedure, result in forcing residuals to be “small” in some sense. One should remember that the residuals are the empirical counterpart to the ε values. Figure 11.6 illustrates a set of residuals. One should note that the fitted line has predicted values as points on the line and hence the residuals are vertical deviations from points to the line. As a result, the least squares procedure produces a line that minimizes the sum of squares of vertical deviations from the points to the line. y i=1 x Figure 11.6: Residuals as vertical deviations. y^ = b 0 + b 1 x 398 Chapter 11 Simple Linear Regression and Correlation Exercises 11.1 A study was conducted at Virginia Tech to de- termine if certain static arm-strength measures have an influence on the “dynamic lift” characteristics of an individual. Twenty-five individuals were subjected to strength tests and then were asked to perform a weight- lifting test in which weight was dynamically lifted over- head. The data are given here. x (◦C) y (grams) 0868 15 12 10 14 30 25 21 24 45 31 33 28 60 44 39 42 75 48 51 44 (a) Find the equation of the regression line. (b) Graph the line on a scatter diagram. (c) Estimate the amount of chemical that will dissolve in 100 grams of water at 50◦C. 11.4 The following data were collected to determine the relationship between pressure and the correspond- ing scale reading for the purpose of calibration. // Individual Arm Strength, x Dynamic Lift, y 71.7 48.3 88.3 75.0 91.7 100.0 73.3 65.0 75.0 88.3 68.3 96.7 76.7 78.3 60.0 71.7 85.0 85.0 88.3 100.0 100.0 100.0 91.7 100.0 71.7 1 17.3 2 19.3 3 19.5 4 19.7 5 22.9 6 23.1 7 26.4 8 26.8 9 27.6 10 28.1 11 28.2 12 28.7 13 29.0 14 29.6 15 29.9 16 29.9 17 30.3 18 31.3 19 36.0 20 39.5 21 40.4 22 44.3 23 44.6 24 50.4 25 55.9 Pressure, x (lb/sq in.) 
10 10 10 10 10 50 50 50 50 50 Scale Reading, y 13 18 16 15 20 86 90 88 88 92 (a) Estimate β0 and β1 for the linear regression curve μY|x=β0+β1x. (b) Find a point estimate of μY |30. (c) Plot the residuals versus the x’s (arm strength). Comment. 11.2 The grades of a class of 9 students on a midterm report (x) and on the final examination (y) are as fol- lows: x 775071728194969967 y 82 66 78 34 47 85 99 99 68 (a) Estimate the linear regression line. (b) Estimate the final examination grade of a student who received a grade of 85 on the midterm report. 11.3 The amounts of a chemical compound y that dis- solved in 100 grams of water at various temperatures x were recorded as follows: (a) Find the equation of the regression line. (b) The purpose of calibration in this application is to estimate pressure from an observed scale reading. Estimate the pressure for a scale reading of 54 using xˆ = (54 − b0)/b1. 11.5 A study was made on the amount of converted sugar in a certain process at various temperatures. The data were coded and recorded as follows: Temperature, x 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 Converted Sugar, y 8.1 7.8 8.5 9.8 9.5 8.9 8.6 10.2 9.3 1.9 9.2 2.0 10.5 (a) Estimate the linear regression line. (b) Estimate the mean amount of converted sugar pro- duced when the coded temperature is 1.75. (c) Plot the residuals versus temperature. Comment. Exercises 399 11.6 In a certain type of metal test specimen, the nor- mal stress on a specimen is known to be functionally related to the shear resistance. The following is a set of coded experimental data on the two variables: Placement Test 50 35 35 40 55 65 35 60 90 35 90 80 60 60 60 40 55 50 65 50 Course Grade 53 41 61 56 68 36 11 70 79 59 54 91 48 71 71 47 53 68 57 79 // Normal Stress, x 26.8 25.4 28.9 23.6 27.7 23.9 24.7 28.1 26.9 27.4 22.6 25.6 Shear Resistance, y 26.5 27.3 24.2 27.1 23.6 25.9 26.3 22.5 21.7 21.4 25.8 24.9 (a) Estimate the (b) Estimate the 24.5. regression line μY |x = β0 + β1x. shear resistance for a normal stress of 11.7 The following is a portion of a classic data set called the “pilot plot data” in Fitting Equations to Data by Daniel and Wood, published in 1971. The response y is the acid content of material produced by titration, whereas the regressor x is the organic acid content produced by extraction and weighing. yxyx A study was made by a retail merchant to deter- mine the relation between weekly advertising expendi- 76 123 62 55 66 100 58 75 88 159 70 109 37 48 82 138 88 164 43 28 tures and sales. Advertising Costs ($) 40 20 25 20 30 50 40 20 50 40 25 50 (a) Plot a scatter diagram. Sales ($) 385 400 395 365 475 440 490 420 560 525 480 510 11.9 (a) Plot the data; does it appear that a simple linear regression will be a suitable model? (b) Fit a simple linear regression; estimate a slope and intercept. (c) Graph the regression line on the plot in (a). 11.8 A mathematics placement test is given to all en- tering freshmen at a small college. A student who re- ceives a grade below 35 is denied admission to the regu- lar mathematics course and placed in a remedial class. The placement test scores and the final grades for 20 students who took the regular course were recorded. (a) Plot a scatter diagram. (b) Find the equation of the regression line to predict course grades from placement test scores. (c) Graph the line on the scatter diagram. (d) If 60 is the minimum passing grade, below which placement test score should students in the future be denied admission to this course? 
(b) Find the equation of the regression line to predict weekly sales from advertising expenditures. (c) Estimate the weekly sales when advertising costs are $35. (d) Plot the residuals versus advertising costs. Com- ment. 11.10 The following data are the selling prices z of a certain make and model of used car w years old. Fit a curve of the form μz|w = γδw by means of the nonlin- ear sample regression equation zˆ = cdw. [Hint: Write lnzˆ=lnc+(lnd)w=b0+b1w.] w (years) z (dollars) w (years) z (dollars) 1 6350 3 5395 2 5695 5 4985 2 5750 5 4895 400 Chapter 11 Simple Linear Regression and Correlation 11.11 The thrust of an engine (y) is a function of exhaust temperature (x) in ◦F when other important variables are held constant. Consider the following data. yxyx 4300 1760 4010 1665 4650 1652 3810 1550 3200 1485 4500 1700 3150 1390 3008 1270 4950 1820 (a) Plot the data. (b) Fit a simple linear regression to the data and plot the line through the data. 11.12 A study was done to study the effect of ambi- ent temperature x on the electric power consumed by a chemical plant y. Other factors were held constant, and the data were collected from an experimental pilot plant. data: Daily Rainfall, x (0.01 cm) 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1 7.5 Particulate Removed, y (μg/m3) 126 121 116 118 114 118 132 141 108 y (BTU) 250 285 320 295 (a) Plot the data. x (◦F) y (BTU) x (◦F) 31 60 34 74 (a) Find the equation of the regression line to predict the particulate removed from the amount of daily rainfall. (b) Estimate the amount of particulate removed when the daily rainfall is x = 4.8 units. 11.14 A professor in the School of Business in a uni- versity polled a dozen colleagues about the number of professional meetings they attended in the past five years (x) and the number of papers they submitted to refereed journals (y) during the same period. The summary data are given as follows: 27 45 72 58 265 298 267 321 (b) Estimate the slope and intercept in a simple linear regression model. (c) Predict power consumption for an ambient temper- ature of 65◦F. 11.13 A study of the amount of rainfall and the quan- tity of air pollution removed produced the following 􏰤n i=1 x2i = 232, 􏰤n i=1 xiyi = 318. 11.4 Properties of the Least Squares Estimators In addition to the assumptions that the error term in the model Yi = β0 + β1xi + εi is a random variable with mean 0 and constant variance σ2, suppose that we make the further assumption that ε1, ε2, . . . , εn are independent from run to run in the experiment. This provides a foundation for finding the means and variances for the estimators of β0 and β1. It is important to remember that our values of b0 and b1, based on a given sample of n observations, are only estimates of true parameters β0 and β1. If the experiment is repeated over and over again, each time using the same fixed values of x, the resulting estimates of β0 and β1 will most likely differ from experiment to experiment. These different estimates may be viewed as values assumed by the random variables B0 and B1, while b0 and b1 are specific realizations. Since the values of x remain fixed, the values of B0 and B1 depend on the vari- ations in the values of y or, more precisely, on the values of the random variables, n=12, x ̄=4, y ̄=12, Fit a simple linear regression model between x and y by finding out the estimates of intercept and slope. Com- ment on whether attending more professional meetings would result in publishing more papers. 11.4 Properties of the Least Squares Estimators 401 Y1,Y2,...,Yn. 
The distributional assumptions imply that the Yi, i = 1,2,...,n, are also independently distributed, with mean μY |xi = β0 + β1xi and equal vari- ances σ2; that is, σ2 = σ2 for i = 1, 2, . . . , n. Y |xi Mean and Variance of Estimators In what follows, we show that the estimator B1 is unbiased for β1 and demonstrate the variances of both B0 and B1. This will begin a series of developments that lead to hypothesis testing and confidence interval estimation on the intercept and slope. Since the estimator is of the form 􏰦n i=1 ciYi, where c i = x i − x ̄ ( x i − x ̄ ) 2 , 􏰦n ̄􏰦n (xi −x ̄)(Yi −Y) (xi −x ̄)Yi B1 = i=1 (xi − x ̄)2 = i=1 􏰦n (xi − x ̄)2 i=1 i = 1 , 2 , . . . , n , 􏰦n i=1 􏰦n i=1 we may conclude from Theorem 7.11 that B1 has a n(μB1 , σB1 ) distribution with (xi − x ̄)(β0 + β1xi) =βandσ2 =i=1 B1􏰦n 1B1􏰮􏰦n􏰯2􏰦n (xi − x ̄)2 (xi − x ̄)2 (xi − x ̄)2 i=1 i=1 i=1 It can also be shown (Review Exercise 11.60 on page 438) that the random variable B0 is normally distributed with 􏰦n 2 xi β0 and β1 are both unbiased estimators. Partition of Total Variability and Estimation of σ2 To draw inferences on β0 and β1, it becomes necessary to arrive at an estimate of the parameter σ2 appearing in the two preceding variance formulas for B0 and B1. The parameter σ2, the model error variance, reflects random variation or 􏰦n μ =i=1 􏰦n 22 (xi − x ̄) σYi σ2 = . mean μB0 = β0 and variance σB2 From the foregoing results, it is apparent that the least squares estimators for = i=1 σ2. 0 􏰦n n (xi−x ̄)2 i=1 402 Chapter 11 Simple Linear Regression and Correlation Theorem 11.1: experimental error variation around the regression line. In much of what follows, it is advantageous to use the notation 􏰤n 􏰤n 􏰤n Sxx = (xi −x ̄)2, Syy = i=1 (yi −y ̄)2, Sxy = (xi −x ̄)(yi −y ̄). i=1 i=1 Now we may write the error sum of squares as follows: 􏰤n SSE = 􏰤n i=1 (yi −b0 −b1xi)2 = 􏰤n 􏰤n 􏰤n = (yi −y ̄)2 −2b1 (xi −x ̄)(yi −y ̄)+b21 (xi −x ̄)2 i=1 i=1 i=1 = Syy − 2b1Sxy + b21Sxx = Syy − b1Sxy, the final step following from the fact that b1 = Sxy/Sxx. The proof of Theorem 11.1 is left as an exercise (see Review Exercise 11.59). i=1 [(yi −y ̄)−b1(xi −x ̄)]2 An unbiased estimate of σ2 is 2 SSE 􏰤n (yi−yˆi)2 Syy−b1Sxy s=n−2=n−2=n−2. i=1 The Estimator of σ2 as a Mean Squared Error One should observe the result of Theorem 11.1 in order to gain some intuition about the estimator of σ2. The parameter σ2 measures variance or squared deviations between Y values and their mean given by μY |x (i.e., squared deviations between Y and β0 +β1x). Of course, β0 +β1x is estimated by yˆ = b0 +b1x. Thus, it would make sense that the variance σ2 is best depicted as a squared deviation of the typical observation yi from the estimated mean, yˆi, which is the corresponding point on the fitted line. Thus, (yi − yˆi)2 values reveal the appropriate variance, much like the way (yi − y ̄)2 values measure variance when one is sampling in a nonregression scenario. In other words, y ̄ estimates the mean in the latter simple situation, whereas yˆi estimates the mean of yi in a regression structure. Now, what about the divisor n−2? In future sections, we shall note that these are the degrees of freedom associated with the estimator s2 of σ2. Whereas in the standard normal i.i.d. scenario, one degree of freedom is subtracted from n in the denominator and a reasonable explanation is that one parameter is estimated, namely the mean μ by, say, y ̄, but in the regression problem, two parameters are estimated, namely β0 and β1 by b0 and b1. 
Thus, the important parameter σ2, estimated by 􏰤n i=1 is called a mean squared error, depicting a type of mean (division by n − 2) of the squared residuals. s2 = (yi − yˆi)2/(n − 2), 11.5 Inferences Concerning the Regression Coefficients 403 11.5 Inferences Concerning the Regression Coefficients Aside from merely estimating the linear relationship between x and Y for purposes of prediction, the experimenter may also be interested in drawing certain inferences about the slope and intercept. In order to allow for the testing of hypotheses and the construction of confidence intervals on β0 and β1, one must be willing to make the further assumption that each εi , i = 1, 2, . . . , n, is normally distributed. This assumption implies that Y1, Y2, . . . , Yn are also normally distributed, each with probability distribution n(yi; β0 + β1xi, σ). From Section 11.4 we know that B1 follows a normal distribution. It turns out that under the normality assumption, a result very much analogous to that given in Theorem 8.4 allows us to conclude that (n − 2)S2/σ2 is a chi-squared variable with n − 2 degrees of freedom, independent of the random variable B1. Theorem 8.5 then assures us that the statistic (B1 − β1)/(σ/√Sxx) B1 − β1 T = S/σ = S/√Sxx has a t-distribution with n − 2 degrees of freedom. The statistic T can be used to construct a 100(1 − α)% confidence interval for the coefficient β1. Confidence Interval A 100(1 − α)% confidence interval for the parameter β1 in the regression line forβ1 μY|x=β0+β1xis b1 −tα/2√s <β1 |t| <.0001 <.0001 Upred 17.6451 34.5891 31.8386 32.9327 26.4693 24.5069 20.4279 17.5493 25.1227 21.4822 MODEL WT MPG Predict LMean UMean 11.9752 15.5688 28.6063 32.6213 26.4143 29.6681 27.2967 30.8438 21.7478 23.9758 19.8160 21.9972 15.3213 18.1224 11.8570 15.4811 20.4390 22.6091 16.5379 19.1011 Residual 1.22804 -1.61381 2.95877 -1.07026 0.13825 0.09341 -1.72185 -0.66905 0.47599 0.18051 Adj R-Sq Parameter Estimates 0.9509 0.9447 t Value 23.21 -12.44 Lpred 9.8988 26.6385 24.2439 25.2078 19.2543 17.3062 13.0158 9.7888 17.9253 14.1568 Figure 11.13: SAS printout for Exercise 11.27. 414 Chapter 11 Simple Linear Regression and Correlation 11.7 Choice of a Regression Model Much of what has been presented thus far on regression involving a single inde- pendent variable depends on the assumption that the model chosen is correct, the presumption that μY |x is related to x linearly in the parameters. Certainly, one cannot expect the prediction of the response to be good if there are several inde- pendent variables, not considered in the model, that are affecting the response and are varying in the system. In addition, the prediction will certainly be inadequate if the true structure relating μY |x to x is extremely nonlinear in the range of the variables considered. Often the simple linear regression model is used even though it is known that the model is something other than linear or that the true structure is unknown. This approach is often sound, particularly when the range of x is narrow. Thus, the model used becomes an approximating function that one hopes is an adequate rep- resentation of the true picture in the region of interest. One should note, however, the effect of an inadequate model on the results presented thus far. 
For example, if the true model, unknown to the experimenter, is linear in more than one x, say μY |x1,x2 = β0 + β1x1 + β2x2, then the ordinary least squares estimate b1 = Sxy/Sxx, calculated by only con- sidering x1 in the experiment, is, under general circumstances, a biased estimate of the coefficient β1, the bias being a function of the additional coefficient β2 (see Review Exercise 11.65 on page 438). Also, the estimate s2 for σ2 is biased due to the additional variable. Analysis-of-Variance Approach Often the problem of analyzing the quality of the estimated regression line is han- dled by an analysis-of-variance (ANOVA) approach: a procedure whereby the total variation in the dependent variable is subdivided into meaningful compo- nents that are then observed and treated in a systematic fashion. The analysis of variance, discussed in Chapter 13, is a powerful resource that is used for many applications. Suppose that we have n experimental data points in the usual form (xi, yi) and that the regression line is estimated. In our estimation of σ2 in Section 11.4, we established the identity Syy = b1Sxy + SSE. An alternative and perhaps more informative formulation is 􏰤n 􏰤n 􏰤n 11.8 i=1 i=1 (yi −y ̄)2 = (yˆi −y ̄)2 + (yi −yˆi)2. i=1 We have achieved a partitioning of the total corrected sum of squares of y into two components that should reflect particular meaning to the experimenter. We shall indicate this partitioning symbolically as SST = SSR + SSE. 11.8 Analysis-of-Variance Approach 415 The first component on the right, SSR, is called the regression sum of squares, and it reflects the amount of variation in the y-values explained by the model, in this case the postulated straight line. The second component is the familiar error sum of squares, which reflects variation about the regression line. Suppose that we are interested in testing the hypothesis H0: β1 =0versusH1: β1 ̸=0, where the null hypothesis says essentially that the model is μY |x = β0. That is, the variation in Y results from chance or random fluctuations which are independent of the values of x. This condition is reflected in Figure 11.10(b). Under the conditions of this null hypothesis, it can be shown that SSR/σ2 and SSE/σ2 are values of independent chi-squared variables with 1 and n−2 degrees of freedom, respectively, and then by Theorem 7.12 it follows that SST/σ2 is also a value of a chi-squared variable with n − 1 degrees of freedom. To test the hypothesis above, we compute f = SSR/1 = SSR SSE/(n − 2) s2 and reject H0 at the α-level of significance when f > fα (1, n − 2).
The computations are usually summarized by means of an analysis-of-variance
table, as in Table 11.2. It is customary to refer to the various sums of squares divided by their respective degrees of freedom as the mean squares.
Table 11.2: Analysis of Variance for Testing β1 = 0
Source of Variation
Regression Error
Sum of Squares
SSR SSE
Degrees of Freedom
1 n−2
Computed
Mean
Square f
Total SST n−1
SSR s2 = SSE n−2
SSR s2
When the null hypothesis is rejected, that is, when the computed F-statistic exceeds the critical value fα (1, n − 2), we conclude that there is a significant amount of variation in the response accounted for by the postulated model, the straight-line function. If the F-statistic is in the fail to reject region, we conclude that the data did not reflect sufficient evidence to support the model postulated.
In Section 11.5, a procedure was given whereby the statistic
B1 − β10 T=√
S/ Sxx
is used to test the hypothesis
H0: β1 = β10 versus H1: β1 ̸= β10,
where T follows the t-distribution with n − 2 degrees of freedom. The hypothesis is rejected if |t| > tα/2 for an α-level of significance. It is interesting to note that

416
Chapter 11 Simple Linear Regression and Correlation
in the special case in which we are testing
H0: β1 =0versusH1: β1 ̸=0,
the value of our T-statistic becomes t=√,
b1
s/ Sxx
and the hypothesis under consideration is identical to that being tested in Table 11.2. Namely, the null hypothesis states that the variation in the response is due merely to chance. The analysis of variance uses the F-distribution rather than the t-distribution. For the two-sided alternative, the two approaches are identical. This we can see by writing
t2 = b21Sxx = b1Sxy = SSR, s2 s2 s2
which is identical to the f -value used in the analysis of variance. The basic relation- ship between the t-distribution with v degrees of freedom and the F-distribution with 1 and v degrees of freedom is
t2 =f(1,v).
Of course, the t-test allows for testing against a one-sided alternative while the
F -test is restricted to testing against a two-sided alternative. Annotated Computer Printout for Simple Linear Regression
Consider again the chemical oxygen demand reduction data of Table 11.1. Figures 11.14 and 11.15 show more complete annotated computer printouts. Again we illustrate it with MINITAB software. The t-ratio column indicates tests for null hypotheses of zero values on the parameter. The term “Fit” denotes yˆ-values, often called fitted values. The term “SE Fit” is used in computing confidence intervals on mean response. The item R2 is computed as (SSR/SST)×100 and signifies the proportion of variation in y explained by the straight-line regression. Also shown are confidence intervals on the mean response and prediction intervals on a new observation.
11.9 Test for Linearity of Regression: Data with Repeated Observations
In certain kinds of experimental situations, the researcher has the capability of obtaining repeated observations on the response for each value of x. Although it is not necessary to have these repetitions in order to estimate β0 and β1, nevertheless repetitions enable the experimenter to obtain quantitative information concerning the appropriateness of the model. In fact, if repeated observations are generated, the experimenter can make a significance test to aid in determining whether or not the model is adequate.

11.9 Test for Linearity of Regression: Data with Repeated Observations 417
The regression equation is COD = 3.83 + 0.904 Per_Red
Predictor Coef SE Coef T P
Constant 3.830 1.768 2.17 0.038
Per_Red 0.90364 0.05012 18.03 0.000
S = 3.22954 R-Sq = 91.3% R-Sq(adj) = 91.0%
Analysis of Variance
Source DF SS MS F P
Regression 1 3390.6 3390.6 325.08 0.000
Residual Error 31 323.3 10.4
Total
Obs Per_Red
1 3.0
2 36.0
3 7.0
4 37.0
5 11.0
6 38.0
7 15.0
8 39.0
9 18.0
10 39.0
11 27.0
12 39.0
13 29.0
14 40.0
15 30.0
16 41.0
17 30.0
18 42.0
19 31.0
20 42.0
21 31.0
22 43.0
23 32.0
24 44.0
25 33.0
26 45.0
27 33.0
28 46.0
29 34.0
30 47.0
31 36.0
32 50.0
33 36.0
32 3713.9
COD Fit SE Fit Residual St Resid
5.000 6.541
34.000 36.361
11.000 10.155
36.000 37.264
21.000 13.770
38.000 38.168
16.000 17.384
37.000 39.072
16.000 20.095
36.000 39.072
28.000 28.228
45.000 39.072
27.000 30.035
39.000 39.975
25.000 30.939
41.000 40.879
35.000 30.939
40.000 41.783
30.000 31.843
44.000 41.783
40.000 31.843
37.000 42.686
32.000 32.746
44.000 43.590
34.000 33.650
46.000 44.494
32.000 33.650
46.000 45.397
34.000 34.554
49.000 46.301
37.000 36.361
51.000 49.012
38.000 36.361
1.627
0.576
1.440
0.590
1.258
0.607
1.082
0.627
0.957
0.627
0.649
0.627
0.605
0.651
0.588
0.678
0.588
0.707
0.575
0.707
0.575
0.738
0.567
0.772
0.563
0.807
0.563
0.843
0.563
0.881
0.576
1.002
0.576
-1.541
-2.361
0.845
-1.264
7.230
-0.168
-1.384
-2.072
-4.095
-3.072
-0.228
5.928
-3.035
-0.975
-5.939
0.121
4.061
-1.783
-1.843
2.217
8.157
-5.686
-0.746
0.410
0.350
1.506
-1.650
0.603
-0.554
2.699
0.639
1.988
1.639
-0.55
-0.74
0.29
-0.40
2.43
-0.05
-0.45
-0.65
-1.33
-0.97
-0.07
1.87
-0.96
-0.31
-1.87
0.04
1.28
-0.57
-0.58
0.70
2.57
-1.81
-0.23
0.13
0.11
0.48
-0.52
0.19
-0.17
0.87
0.20
0.65
0.52
Figure 11.14: MINITAB printout of simple linear regression for chemical oxygen demand reduction data; part I.
Let us select a random sample of n observations using k distinct values of x,
sayx1,x2,…,xn,suchthatthesamplecontainsn1 observedvaluesoftherandom
variable Y1 corresponding to x1, n2 observed values of Y2 corresponding to x2, . . . ,
􏰦k i=1
nk observed values of Yk corresponding to xk. Of necessity, n =
ni.

418 Chapter 11 Simple Linear Regression and Correlation
Obs Fit SE Fit 95% CI 95% PI
1 6.541 1.627 ( 3.223, 9.858) (-0.834, 13.916)
2 36.361 0.576 (35.185, 37.537) (29.670, 43.052)
3 10.155 1.440 ( 7.218, 13.092) ( 2.943, 17.367)
4 37.264 0.590 (36.062, 38.467) (30.569, 43.960)
5 13.770 1.258 (11.204, 16.335) ( 6.701, 20.838)
6 38.168 0.607 (36.931, 39.405) (31.466, 44.870)
7 17.384 1.082 (15.177, 19.592) (10.438, 24.331)
8 39.072 0.627 (37.793, 40.351) (32.362, 45.781)
9 20.095 0.957 (18.143, 22.047) (13.225, 26.965)
10 39.072 0.627 (37.793, 40.351) (32.362, 45.781)
11 28.228 0.649 (26.905, 29.551) (21.510, 34.946)
12 39.072 0.627 (37.793, 40.351) (32.362, 45.781)
13 30.035 0.605 (28.802, 31.269) (23.334, 36.737)
14 39.975 0.651 (38.648, 41.303) (33.256, 46.694)
15 30.939 0.588 (29.739, 32.139) (24.244, 37.634)
16 40.879 0.678 (39.497, 42.261) (34.149, 47.609)
17 30.939 0.588 (29.739, 32.139) (24.244, 37.634)
18 41.783 0.707 (40.341, 43.224) (35.040, 48.525)
19 31.843 0.575 (30.669, 33.016) (25.152, 38.533)
20 41.783 0.707 (40.341, 43.224) (35.040, 48.525)
21 31.843 0.575 (30.669, 33.016) (25.152, 38.533)
22 42.686 0.738 (41.181, 44.192) (35.930, 49.443)
23 32.746 0.567 (31.590, 33.902) (26.059, 39.434)
24 43.590 0.772 (42.016, 45.164) (36.818, 50.362)
25 33.650 0.563 (32.502, 34.797) (26.964, 40.336)
26 44.494 0.807 (42.848, 46.139) (37.704, 51.283)
27 33.650 0.563 (32.502, 34.797) (26.964, 40.336)
28 45.397 0.843 (43.677, 47.117) (38.590, 52.205)
29 34.554 0.563 (33.406, 35.701) (27.868, 41.239)
30 46.301 0.881 (44.503, 48.099) (39.473, 53.128)
31 36.361 0.576 (35.185, 37.537) (29.670, 43.052)
32 49.012 1.002 (46.969, 51.055) (42.115, 55.908)
33 36.361 0.576 (35.185, 37.537) (29.670, 43.052)
Figure 11.15: MINITAB printout of simple linear regression for chemical oxygen demand reduction data; part II.
We define
Hence, if n4 = 3 measurements of Y were made corresponding to x = x4, we would indicate these observations by y41,y42, and y43. Then
Ti. =y41 +y42 +y43.
Concept of Lack of Fit
The error sum of squares consists of two parts: the amount due to the variation between the values of Y within given values of x and a component that is normally
yij = the jth value of the random variable Yi, ni
yij,
yi. =Ti. = y ̄i. = Ti. .
􏰤
j=1 ni

11.9 Test for Linearity of Regression: Data with Repeated Observations 419
called the lack-of-fit contribution. The first component reflects mere random variation, or pure experimental error, while the second component is a measure of the systematic variation brought about by higher-order terms. In our case, these are terms in x other than the linear, or first-order, contribution. Note that in choosing a linear model we are essentially assuming that this second component does not exist and hence our error sum of squares is completely due to random errors. If this should be the case, then s2 = SSE/(n − 2) is an unbiased estimate of σ2. However, if the model does not adequately fit the data, then the error sum of squares is inflated and produces a biased estimate of σ2. Whether or not the model fits the data, an unbiased estimate of σ2 can always be obtained when we have repeated observations simply by computing
si² = [∑_{j=1}^{ni} (yij − ȳi.)²] / (ni − 1),        i = 1, 2, . . . , k,

for each of the k distinct values of x and then pooling these variances to get

s² = [∑_{i=1}^k (ni − 1)si²] / (n − k) = [∑_{i=1}^k ∑_{j=1}^{ni} (yij − ȳi.)²] / (n − k).
The numerator of s² is a measure of the pure experimental error. A computational procedure for separating the error sum of squares into the two components representing pure error and lack of fit is as follows:
Computation of Lack-of-Fit Sum of Squares

1. Compute the pure error sum of squares

   SSE(pure) = ∑_{i=1}^k ∑_{j=1}^{ni} (yij − ȳi.)².

   This sum of squares has n − k degrees of freedom associated with it, and the resulting mean square is our unbiased estimate s² of σ².

2. Subtract the pure error sum of squares from the error sum of squares SSE, thereby obtaining the sum of squares due to lack of fit. The degrees of freedom for lack of fit are obtained by simply subtracting (n − 2) − (n − k) = k − 2.
The computations required for testing hypotheses in a regression problem with repeated measurements on the response may be summarized as shown in Table 11.3.
Figures 11.16 and 11.17 display the sample points for the “correct model” and “incorrect model” situations. In Figure 11.16, where the μY |x fall on a straight line, there is no lack of fit when a linear model is assumed, so the sample variation around the regression line is a pure error resulting from the variation that occurs among repeated observations. In Figure 11.17, where the μY |x clearly do not fall on a straight line, the lack of fit from erroneously choosing a linear model accounts for a large portion of the variation around the regression line, supplementing the pure error.
Table 11.3: Analysis of Variance for Testing Linearity of Regression

Source of       Sum of             Degrees of   Mean Square                   Computed f
Variation       Squares            Freedom
Regression      SSR                1            SSR                           SSR/s²
Error           SSE                n − 2
  Lack of fit   SSE − SSE(pure)    k − 2        [SSE − SSE(pure)]/(k − 2)     [SSE − SSE(pure)]/[s²(k − 2)]
  Pure error    SSE(pure)          n − k        s² = SSE(pure)/(n − k)
Total           SST                n − 1
Figure 11.16: Correct linear model with no lack-of-fit component.

Figure 11.17: Incorrect linear model with lack-of-fit component.

What Is the Importance in Detecting Lack of Fit?
The concept of lack of fit is extremely important in applications of regression analysis. In fact, the need to construct or design an experiment that will account for lack of fit becomes more critical as the problem and the underlying mechanism involved become more complicated. Surely, one cannot always be certain that his or her postulated structure, in this case the linear regression model, is correct or even an adequate representation. The following example shows how the error sum of squares is partitioned into the two components representing pure error and lack of fit. The adequacy of the model is tested at the α-level of significance by comparing the lack-of-fit mean square divided by s2 with fα(k − 2, n − k).
Example 11.8: Observations of the yield of a chemical reaction taken at various temperatures were recorded in Table 11.4. Estimate the linear model μY |x = β0 + β1x and test for lack of fit.
Solution: Results of the computations are shown in Table 11.5.
Conclusion: The partitioning of the total variation in this manner reveals a significant variation accounted for by the linear model and an insignificant amount of variation due to lack of fit. Thus, the experimental data do not seem to suggest the need to consider terms higher than first order in the model, and the null hypothesis is not rejected.
Table 11.4: Data for Example 11.8
x (°C)   y (%)    x (°C)   y (%)
150      77.4     250      88.9
150      76.7     250      89.2
150      78.2     250      89.7
200      84.1     300      94.8
200      84.5     300      94.7
200      83.7     300      95.9
Table 11.5: Analysis of Variance on Yield-Temperature Data
Source of       Sum of     Degrees of   Mean       Computed
Variation       Squares    Freedom      Square     f          P-Value
Regression      509.2507    1           509.2507   1531.58    < 0.0001
Error             3.8660   10
  Lack of fit     1.2060    2             0.6030      1.81      0.2241
  Pure error      2.6600    8             0.3325
Total           513.1167   11
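The partition in Table 11.5 is easy to verify directly. The following is a minimal Python sketch (NumPy and SciPy assumed available; the variable names are ours, not from the text) that recomputes the pure error and lack-of-fit sums of squares from the Table 11.4 data; it should reproduce f ≈ 1.81 with P ≈ 0.2241.

```python
import numpy as np
from scipy import stats

# Yield-temperature data of Table 11.4: three observations at each temperature.
x = np.repeat([150.0, 200.0, 250.0, 300.0], 3)
y = np.array([77.4, 76.7, 78.2, 84.1, 84.5, 83.7,
              88.9, 89.2, 89.7, 94.8, 94.7, 95.9])
n, k = len(x), len(np.unique(x))

# Simple linear regression fit.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y - b0 - b1 * x) ** 2)

# Pure error: pooled sum of squares about each group mean (n - k df).
sse_pure = sum(np.sum((y[x == xi] - y[x == xi].mean()) ** 2) for xi in np.unique(x))
ss_lof = sse - sse_pure                                  # lack of fit, k - 2 df

f_lof = (ss_lof / (k - 2)) / (sse_pure / (n - k))
print(f"SSR = {sst - sse:.4f}, SSE = {sse:.4f}")
print(f"lack of fit = {ss_lof:.4f}, pure error = {sse_pure:.4f}")
print(f"f = {f_lof:.2f}, P = {stats.f.sf(f_lof, k - 2, n - k):.4f}")
```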
Annotated Computer Printout for Test for Lack of Fit

Figure 11.18 is an annotated computer printout showing analysis of the data of Example 11.8 with SAS. Note the "LOF" with 2 degrees of freedom, representing the quadratic and cubic contribution to the model, and the P-value of 0.22, suggesting that the linear (first-order) model is adequate.

Dependent Variable: yield
                          Sum of
Source            DF      Squares         Mean Square    F Value   Pr > F
Model              3      510.4566667     170.1522222    511.74    <.0001
Error              8        2.6600000       0.3325000
Corrected Total   11      513.1166667

R-Square    Coeff Var    Root MSE    yield Mean
0.994816    0.666751     0.576628    86.48333

Source         DF    Type I SS       Mean Square    F Value   Pr > F
temperature     1    509.2506667     509.2506667    1531.58   <.0001
LOF             2      1.2060000       0.6030000       1.81   0.2241

Figure 11.18: SAS printout, showing analysis of data of Example 11.8.

Exercises

11.31 Test for linearity of regression in Exercise 11.3 on page 398. Use a 0.05 level of significance. Comment.

11.32 Test for linearity of regression in Exercise 11.8 on page 399. Comment.

11.33 Suppose we have a linear equation through the origin (Exercise 11.28), μY|x = βx.
(a) Estimate the regression line passing through the origin for the following data:
    x   0.5   1.5   3.2   4.2    5.1    6.5
    y   1.3   3.4   6.7   8.0   10.0   13.2
(b) Suppose it is not known whether the true regression should pass through the origin. Estimate the linear model μY|x = β0 + β1x and test the hypothesis that β0 = 0, at the 0.10 level of significance, against the alternative that β0 ≠ 0.

11.34 Use an analysis-of-variance approach to test the hypothesis that β1 = 0 against the alternative hypothesis β1 ≠ 0 in Exercise 11.5 on page 398 at the 0.05 level of significance.

11.35 The following data are a result of an investigation as to the effect of reaction temperature x on percent conversion of a chemical process y. (See Myers, Montgomery and Anderson-Cook, 2009.) Fit a simple linear regression, and use a lack-of-fit test to determine if the model is adequate. Discuss.

    Observation   Temperature (°C), x   Conversion (%), y
     1            200                   43
     2            250                   78
     3            200                   69
     4            250                   73
     5            189.65                48
     6            260.35                78
     7            225                   65
     8            225                   74
     9            225                   76
    10            225                   79
    11            225                   83
    12            225                   81
11.36 Transistor gain between emitter and collector in an integrated circuit device (hFE) is related to two variables (Myers, Montgomery and Anderson-Cook, 2009) that can be controlled at the deposition process, emitter drive-in time (x1, in minutes) and emitter dose (x2, in ions × 10^14). Fourteen samples were observed following deposition, and the resulting data are shown in the table below. We will consider linear regression models using gain as the response and emitter drive-in time or emitter dose as the regressor variable.

    Obs.   x1 (drive-in time, min)   x2 (dose, ions × 10^14)   y (gain, or hFE)
     1     195                       4.00                      1004
     2     255                       4.00                      1636
     3     195                       4.60                       852
     4     255                       4.60                      1506
     5     255                       4.20                      1272
     6     255                       4.10                      1270
     7     255                       4.60                      1269
     8     195                       4.30                       903
     9     255                       4.30                      1555
    10     255                       4.00                      1260
    11     255                       4.70                      1146
    12     255                       4.30                      1276
    13     255                       4.72                      1225
    14     340                       4.30                      1321

(a) Determine if emitter drive-in time influences gain in a linear relationship. That is, test H0: β1 = 0, where β1 is the slope of the regressor variable.
(b) Do a lack-of-fit test to determine if the linear relationship is adequate. Draw conclusions.
(c) Determine if emitter dose influences gain in a linear relationship. Which regressor variable is the better predictor of gain?

11.37 Organophosphate (OP) compounds are used as pesticides. However, it is important to study their effect on species that are exposed to them. In the laboratory study Some Effects of Organophosphate Pesticides on Wildlife Species, by the Department of Fisheries and Wildlife at Virginia Tech, an experiment was conducted in which different dosages of a particular OP pesticide were administered to 5 groups of 5 mice (Peromyscus leucopus). The 25 mice were females of similar age and condition. One group received no chemical. The basic response y was a measure of activity in the brain. It was postulated that brain activity would decrease with an increase in OP dosage. The data are as follows:

    Animal   Dose, x (mg/kg body weight)   Activity, y (moles/liter/min)
     1        0.0                          10.9
     2        0.0                          10.6
     3        0.0                          10.8
     4        0.0                           9.8
     5        0.0                           9.0
     6        2.3                          11.0
     7        2.3                          11.3
     8        2.3                           9.9
     9        2.3                           9.2
    10        2.3                          10.1
    11        4.6                          10.6
    12        4.6                          10.4
    13        4.6                           8.8
    14        4.6                          11.1
    15        4.6                           8.4
    16        9.2                           9.7
    17        9.2                           7.8
    18        9.2                           9.0
    19        9.2                           8.2
    20        9.2                           2.3
    21       18.4                           2.9
    22       18.4                           2.2
    23       18.4                           3.4
    24       18.4                           5.4
    25       18.4                           8.2

Using the model Yi = β0 + β1xi + εi, i = 1, 2, . . . , 25,
(a) find the least squares estimates of β0 and β1;
(b) construct an analysis-of-variance table in which the lack of fit and pure error have been separated. Determine if the lack of fit is significant at the 0.05 level. Interpret the results.

11.38 Heat treating is often used to carburize metal parts such as gears. The thickness of the carburized layer is considered an important feature of the gear, and it contributes to the overall reliability of the part. Because of the critical nature of this feature, a lab test is performed on each furnace load. The test is a destructive one, where an actual part is cross sectioned and soaked in a chemical for a period of time. This test involves running a carbon analysis on the surface of both the gear pitch (top of the gear tooth) and the gear root (between the gear teeth). The data below are the results of the pitch carbon-analysis test for 19 parts.

    Soak Time   Pitch    Soak Time   Pitch
    0.58        0.013    1.17        0.021
    0.66        0.016    1.17        0.019
    0.66        0.015    1.17        0.021
    0.66        0.016    1.20        0.025
    0.66        0.015    2.00        0.025
    0.66        0.016    2.00        0.026
    1.00        0.014    2.20        0.024
    1.17        0.021    2.20        0.025
    1.17        0.018    2.20        0.024
    1.17        0.019

(a) Fit a simple linear regression relating the pitch carbon analysis y against soak time. Test H0: β1 = 0.
(b) If the hypothesis in part (a) is rejected, determine if the linear model is adequate.

11.39 A regression model is desired relating temperature and the proportion of impurities passing through solid helium. Temperature is listed in degrees centigrade. The data are as follows:

    Temperature (°C)   Proportion of Impurities
    −260.5             0.425
    −255.7             0.224
    −264.6             0.453
    −265.0             0.475
    −270.0             0.705
    −272.0             0.860
    −272.5             0.935
    −272.6             0.961
    −272.8             0.979
    −272.9             0.990

(a) Fit a linear regression model.
(b) Does it appear that the proportion of impurities passing through helium increases as the temperature approaches −273 degrees centigrade?
(c) Find R2.
(d) Based on the information above, does the linear model seem appropriate? What additional information would you need to better answer that question?

11.40 It is of interest to study the effect of population size in various cities in the United States on ozone concentrations. The data consist of the 1999 population in millions and the amount of ozone present per hour in ppb (parts per billion). The data are as follows:

    Ozone (ppb/hour), y   126   135   124   128   130   128   126   128   128   129
    Population, x         0.6   4.9   0.2   0.5   1.1   0.1   1.1   2.3   0.6   2.3

(a) Fit the linear regression model relating ozone concentration to population. Test H0: β1 = 0 using the ANOVA approach.
(b) Do a test for lack of fit. Is the linear model appropriate based on the results of your test?
(c) Test the hypothesis of part (a) using the pure mean square error in the F-test. Do the results change? Comment on the advantage of each test.

11.41 Evaluating nitrogen deposition from the atmosphere is a major role of the National Atmospheric Deposition Program (NADP), a partnership of many agencies. NADP is studying atmospheric deposition and its effect on agricultural crops, forest surface waters, and other resources.
Nitrogen oxides may affect the ozone in the atmosphere and the amount of pure nitrogen in the air we breathe. The data are as follows:

    Year   Nitrogen Oxide   Year   Nitrogen Oxide
    1978   0.73             1989   5.07
    1979   2.55             1990   3.95
    1980   2.90             1991   3.14
    1981   3.83             1992   3.44
    1982   2.53             1993   3.63
    1983   2.77             1994   4.50
    1984   3.93             1995   3.95
    1985   2.03             1996   5.24
    1986   4.39             1997   3.30
    1987   3.04             1998   4.36
    1988   3.41             1999   3.33

(a) Plot the data.
(b) Fit a linear regression model and find R2.
(c) What can you say about the trend in nitrogen oxide across time?

11.42 For a particular variety of plant, researchers wanted to develop a formula for predicting the quantity of seeds (in grams) as a function of the density of plants. They conducted a study with four levels of the factor x, the number of plants per plot. Four replications were used for each level of x. The data are shown as follows:

    Plants per Plot, x   Quantity of Seeds, y (grams)
    10                   12.6   11.0   12.1   10.9
    20                   15.3   16.1   14.9   15.6
    30                   17.9   18.3   18.6   17.8
    40                   19.2   19.6   18.9   20.0

Is a simple linear regression model adequate for analyzing this data set?

11.10 Data Plots and Transformations

In this chapter, we deal with building regression models where there is one independent, or regressor, variable. In addition, we are assuming, through model formulation, that both x and y enter the model in a linear fashion. Often it is advisable to work with an alternative model in which either x or y (or both) enters in a nonlinear way. A transformation of the data may be indicated because of theoretical considerations inherent in the scientific study, or a simple plotting of the data may suggest the need to reexpress the variables in the model. The need to perform a transformation is rather simple to diagnose in the case of simple linear regression because two-dimensional plots give a true pictorial display of how each variable enters the model.

A model in which x or y is transformed should not be viewed as a nonlinear regression model. We normally refer to a regression model as linear when it is linear in the parameters. In other words, suppose the complexion of the data or other scientific information suggests that we should regress y* against x*, where each is a transformation on the natural variables x and y. Then the model of the form

yi* = β0 + β1xi* + εi

is a linear model since it is linear in the parameters β0 and β1. The material given in Sections 11.2 through 11.9 remains intact, with yi* and xi* replacing yi and xi. A simple and useful example is the log-log model

log yi = β0 + β1 log xi + εi.
Although this model is not linear in x and y, it is linear in the parameters and is thus treated as a linear model. On the other hand, an example of a truly nonlinear model is

yi = β0 + β1x^{β2} + εi,

where the parameter β2 (as well as β0 and β1) is to be estimated. The model is not linear in β2.

Transformations that may enhance the fit and predictability of a model are many in number. For a thorough discussion of transformations, the reader is referred to Myers (1990, see the Bibliography). We choose here to indicate a few of them and show the appearance of the graphs that serve as a diagnostic tool. Consider Table 11.6. Several functions are given describing relationships between y and x that can produce a linear regression through the transformation indicated. In addition, for the sake of completeness the reader is given the dependent and independent variables to use in the resulting simple linear regression. Figure 11.19 depicts functions listed in Table 11.6. These serve as a guide for the analyst in choosing a transformation from the observation of the plot of y against x.

Table 11.6: Some Useful Transformations to Linearize

    Functional Form Relating y to x     Proper Transformation     Form of Simple Linear Regression
    Exponential: y = β0 e^{β1 x}        y* = ln y                 Regress y* against x
    Power: y = β0 x^{β1}                y* = log y; x* = log x    Regress y* against x*
    Reciprocal: y = β0 + β1(1/x)        x* = 1/x                  Regress y against x*
    Hyperbolic: y = x/(β0 + β1 x)       y* = 1/y; x* = 1/x        Regress y* against x*
Figure 11.19: Diagrams depicting functions listed in Table 11.6: (a) exponential function, (b) power function, (c) reciprocal function, (d) hyperbolic function.
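To make the use of Table 11.6 concrete, here is a brief sketch for the exponential case. The data are hypothetical values that we assume follow y = β0e^{β1x} with multiplicative error; regressing y* = ln y against x then recovers the constants, as the table indicates.

```python
import numpy as np

# Hypothetical (x, y) data assumed to follow y = b0 * exp(b1 * x).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.7, 7.6, 19.1, 54.8, 148.9, 400.2])

# Per Table 11.6: regress y* = ln y against x; the slope estimates b1
# and the intercept estimates ln b0.
slope, intercept = np.polyfit(x, np.log(y), 1)
b0, b1 = np.exp(intercept), slope
print(f"y-hat = {b0:.3f} * exp({b1:.3f} x)")
```

The power, reciprocal, and hyperbolic rows are handled the same way, using the transformed variables from the middle column of the table.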
What Are the Implications of a Transformed Model?

The foregoing is intended as an aid for the analyst when it is apparent that a transformation will provide an improvement. However, before we provide an example, two important points should be made. The first one revolves around the formal writing of the model when the data are transformed. Quite often the analyst does not think about this. He or she merely performs the transformation without any concern about the model form before and after the transformation. The exponential model serves as a good illustration. The model in the natural (untransformed) variables that produces an additive error model in the transformed variables is given by

yi = β0 e^{β1 xi} · εi,

which is a multiplicative error model. Clearly, taking logs produces

ln yi = ln β0 + β1 xi + ln εi.

As a result, it is on ln εi that the basic assumptions are made. The purpose of this presentation is merely to remind the reader that one should not view a transformation as merely an algebraic manipulation with an error added. Often a model in the transformed variables that has a proper additive error structure is a result of a model in the natural variables with a different type of error structure.

The second important point deals with the notion of measures of improvement. Obvious measures of comparison are, of course, R2 and the residual mean square, s2. (Other measures of performance used to compare competing models are given in Chapter 12.) Now, if the response y is not transformed, then clearly s2 and R2 can be used in measuring the utility of the transformation. The residuals will be in the same units for both the transformed and the untransformed models. But when y is transformed, performance criteria for the transformed model should be based on values of the residuals in the metric of the untransformed response so that comparisons that are made are proper. The example that follows provides an illustration.

Example 11.9: The pressure P of a gas corresponding to various volumes V is recorded, and the data are given in Table 11.7.

Table 11.7: Data for Example 11.9

    V (cm3)      50     60     70     90     100
    P (kg/cm2)   64.7   51.3   40.5   25.9   7.8

The ideal gas law is given by the functional form PV^γ = C, where γ and C are constants. Estimate the constants C and γ.

Solution: Let us take natural logs of both sides of the model

Pi Vi^γ = C · εi,    i = 1, 2, 3, 4, 5.

As a result, a linear model can be written

ln Pi = ln C − γ ln Vi + εi*,    i = 1, 2, 3, 4, 5,

where εi* = ln εi. The following represents results of the simple linear regression:

Intercept: ln Ĉ = 14.7589, Ĉ = 2,568,862.88;    Slope: γ̂ = 2.65347221.

The following represents information taken from the regression analysis.

    Pi     Vi    ln Pi     ln Vi     ln P̂i     P̂i     ei = Pi − P̂i
    64.7    50   4.16976   3.91202   4.37853   79.7   −15.0
    51.3    60   3.93769   4.09434   3.89474   49.1     2.2
    40.5    70   3.70130   4.24850   3.48571   32.6     7.9
    25.9    90   3.25424   4.49981   2.81885   16.8     9.1
     7.8   100   2.05412   4.60517   2.53921   12.7    −4.9

It is instructive to plot the data and the regression equation. Figure 11.20 shows a plot of the data in the untransformed pressure and volume and the curve representing the regression equation.

Figure 11.20: Pressure and volume data and fitted regression.
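The computation in Example 11.9 can be verified in a few lines. The sketch below (NumPy assumed available) regresses ln P against ln V for the Table 11.7 data; it should reproduce γ̂ ≈ 2.6535 and Ĉ ≈ 2,568,863, and the back-transformed fitted pressures should match the P̂i column above.

```python
import numpy as np

V = np.array([50.0, 60.0, 70.0, 90.0, 100.0])
P = np.array([64.7, 51.3, 40.5, 25.9, 7.8])

# ln P = ln C - gamma * ln V, so regress ln P on ln V.
slope, intercept = np.polyfit(np.log(V), np.log(P), 1)
gamma, C = -slope, np.exp(intercept)
print(f"gamma = {gamma:.4f}, C = {C:,.2f}")

P_hat = C * V ** (-gamma)        # back-transformed fitted pressures
print(np.round(P_hat, 1))        # compare with 79.7, 49.1, 32.6, 16.8, 12.7
```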
Diagnostic Plots of Residuals: Graphical Detection of Violation of Assumptions

Plots of the raw data can be extremely helpful in determining the nature of the model that should be fit to the data when there is a single independent variable. We have attempted to illustrate this in the foregoing. Detection of proper model form is, however, not the only benefit gained from diagnostic plotting. As in much of the material associated with significance testing in Chapter 10, plotting methods can illustrate and detect violation of assumptions. The reader should recall that much of what is illustrated in this chapter requires assumptions made on the model errors, the εi. In fact, we assume that the εi are independent N(0, σ) random variables. Now, of course, the εi are not observed. However, the ei = yi − ŷi, the residuals, are the error in the fit of the regression line and thus serve to mimic the εi. Thus, the general complexion of these residuals can often highlight difficulties. Ideally, of course, the plot of the residuals is as depicted in Figure 11.21. That is, they should truly show random fluctuations around a value of zero.

Figure 11.21: Ideal residual plot.

Nonhomogeneous Variance

Homogeneous variance is an important assumption made in regression analysis. Violations can often be detected through the appearance of the residual plot. Increasing error variance with an increase in the regressor variable is a common condition in scientific data. Large error variance produces large residuals, and hence a residual plot like the one in Figure 11.22 is a signal of nonhomogeneous variance. More discussion regarding these residual plots and information regarding different types of residuals appears in Chapter 12, where we deal with multiple linear regression.

Figure 11.22: Residual plot depicting heterogeneous error variance.

Normal Probability Plotting

The assumption that the model errors are normal is made when the data analyst deals in either hypothesis testing or confidence interval estimation. Again, the numerical counterpart to the εi, namely the residuals, are subjects of diagnostic plotting to detect any extreme violations. In Chapter 8, we introduced normal quantile-quantile plots and briefly discussed normal probability plots. These plots on residuals are illustrated in the case study introduced in the next section.

11.11 Simple Linear Regression Case Study

In the manufacture of commercial wood products, it is important to estimate the relationship between the density of a wood product and its stiffness. A relatively new type of particleboard is being considered that can be formed with considerably more ease than the accepted commercial product. It is necessary to know at what density the stiffness is comparable to that of the well-known, well-documented commercial product. A study was done by Terrance E. Conners, Investigation of Certain Mechanical Properties of a Wood-Foam Composite (M.S. Thesis, Department of Forestry and Wildlife Management, University of Massachusetts). Thirty particleboards were produced at densities ranging from roughly 8 to 26 pounds per cubic foot, and the stiffness was measured in pounds per square inch. Table 11.8 shows the data.

It is necessary for the data analyst to focus on an appropriate fit to the data and use inferential methods discussed in this chapter. Hypothesis testing on the slope of the regression, as well as confidence or prediction interval estimation, may well be appropriate. We begin by demonstrating a simple scatter plot of the raw data with a simple linear regression superimposed. Figure 11.23 shows this plot.
Table 11.8: Density and Stiffness for 30 Particleboards

    Density, x   Stiffness, y    Density, x   Stiffness, y
     9.50         14,814.00       8.40         17,502.00
     9.80         14,007.00      11.00         19,443.00
     8.30           7573.00       9.90         14,191.00
     8.60           9714.00       6.40           8076.00
     7.00           5304.00       8.20         10,728.00
    17.40         43,243.00      15.00         25,319.00
    15.20         28,028.00      16.40         41,792.00
    16.70         49,499.00      15.40         25,312.00
    15.00         26,222.00      14.50         22,148.00
    14.80         26,751.00      13.60         18,036.00
    25.60         96,305.00      23.40        104,170.00
    24.40         72,594.00      23.30         49,512.00
    19.50         32,207.00      21.20         48,218.00
    22.80         70,453.00      21.70         47,661.00
    19.80         38,138.00      21.30         53,045.00

Figure 11.23: Scatter plot of the wood density data.

Figure 11.24: Residual plot for the wood density data.

The simple linear regression fit to the data produced the fitted model

ŷ = −25,433.739 + 3884.976x    (R2 = 0.7975),

and the residuals were computed. Figure 11.24 shows the residuals plotted against the measurements of density. This is hardly an ideal or healthy set of residuals. They do not show a random scatter around a value of zero. In fact, clusters of positive and negative values suggest that a curvilinear trend in the data should be investigated.

To gain some type of idea regarding the normal error assumption, a normal probability plot of the residuals was generated. This is the type of plot discussed in Section 8.8 in which the horizontal axis represents the empirical normal distribution function on a scale that produces a straight-line plot when plotted against the residuals. Figure 11.25 shows the normal probability plot of the residuals. The normal probability plot does not reflect the straight-line appearance that one would like to see. This is another symptom of a faulty, perhaps overly simplistic choice of a regression model.

Figure 11.25: Normal probability plot of residuals for wood density data.

Both types of residual plots and, indeed, the scatter plot itself suggest here that a somewhat complicated model would be appropriate. One possible approach is to use a natural log transformation. In other words, one might choose to regress ln y against x. This produces the regression

ln ŷ = 8.257 + 0.125x    (R2 = 0.9016).

To gain some insight into whether the transformed model is more appropriate, consider Figures 11.26 and 11.27, which reveal plots of the residuals in stiffness units [i.e., yi − antilog(ln ŷi)] against density. Figure 11.26 appears to be closer to a random pattern around zero, while Figure 11.27 is certainly closer to a straight line. This in addition to the higher R2-value would suggest that the transformed model is more appropriate.
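The comparison recommended above, residuals in the metric of the untransformed response, can be computed directly. The following sketch fits both models to the Table 11.8 data and expresses both sets of residuals in stiffness units; it is our illustration of the idea, not code from the study, and the printed coefficients should roughly reproduce the fitted models above.

```python
import numpy as np

density = np.array([9.50, 9.80, 8.30, 8.60, 7.00, 17.40, 15.20, 16.70, 15.00, 14.80,
                    25.60, 24.40, 19.50, 22.80, 19.80, 8.40, 11.00, 9.90, 6.40, 8.20,
                    15.00, 16.40, 15.40, 14.50, 13.60, 23.40, 23.30, 21.20, 21.70, 21.30])
stiffness = np.array([14814, 14007, 7573, 9714, 5304, 43243, 28028, 49499, 26222, 26751,
                      96305, 72594, 32207, 70453, 38138, 17502, 19443, 14191, 8076, 10728,
                      25319, 41792, 25312, 22148, 18036, 104170, 49512, 48218, 47661, 53045.0])

# Untransformed fit: stiffness on density.
b1, b0 = np.polyfit(density, stiffness, 1)
resid_raw = stiffness - (b0 + b1 * density)

# Transformed fit: ln(stiffness) on density; residuals back in stiffness units.
c1, c0 = np.polyfit(density, np.log(stiffness), 1)
resid_back = stiffness - np.exp(c0 + c1 * density)

print(f"linear fit:  y-hat = {b0:.1f} + {b1:.1f} x")
print(f"log model:   ln y-hat = {c0:.3f} + {c1:.3f} x")
print(f"residual SS, raw model:        {np.sum(resid_raw ** 2):.4g}")
print(f"residual SS, back-transformed: {np.sum(resid_back ** 2):.4g}")
```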
Figure 11.26: Residual plot using the log transformation for the wood density data.

Figure 11.27: Normal probability plot of residuals using the log transformation for the wood density data.

11.12 Correlation

Up to this point we have assumed that the independent regressor variable x is a physical or scientific variable but not a random variable. In fact, in this context, x is often called a mathematical variable, which, in the sampling process, is measured with negligible error. In many applications of regression techniques, it is more realistic to assume that both X and Y are random variables and the measurements {(xi, yi); i = 1, 2, . . . , n} are observations from a population having the joint density function f(x, y). We shall consider the problem of measuring the relationship between the two variables X and Y. For example, if X and Y represent the length and circumference of a particular kind of bone in the adult body, we might conduct an anthropological study to determine whether large values of X are associated with large values of Y, and vice versa. On the other hand, if X represents the age of a used automobile and Y represents the retail book value of the automobile, we would expect large values of X to correspond to small values of Y and small values of X to correspond to large values of Y. Correlation analysis attempts to measure the strength of such relationships between two variables by means of a single number called a correlation coefficient.

In theory, it is often assumed that the conditional distribution f(y|x) of Y, for fixed values of X, is normal with mean μY|x = β0 + β1x and variance σ²_{Y|x} = σ² and that X is likewise normally distributed with mean μX and variance σX². The joint density of X and Y is then

f(x, y) = n(y|x; β0 + β1x, σ) n(x; μX, σX)
        = [1/(2πσXσ)] exp{−(1/2)[((y − β0 − β1x)/σ)² + ((x − μX)/σX)²]},

for −∞ < x < ∞ and −∞ < y < ∞.

Let us write the random variable Y in the form Y = β0 + β1X + ε, where X is now a random variable independent of the random error ε. Since the mean of the random error ε is zero, it follows that

μY = β0 + β1μX    and    σY² = σ² + β1²σX².

Substituting for β0 and σ² into the preceding expression for f(x, y), we obtain the bivariate normal distribution

f(x, y) = [1/(2πσXσY√(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [((x − μX)/σX)² − 2ρ((x − μX)/σX)((y − μY)/σY) + ((y − μY)/σY)²] },

for −∞ < x < ∞ and −∞ < y < ∞, where

ρ² = 1 − σ²/σY² = β1² σX²/σY².

The constant ρ (rho) is called the population correlation coefficient and plays a major role in many bivariate data analysis problems. It is important for the reader to understand the physical interpretation of this correlation coefficient and the distinction between correlation and regression. The term regression still has meaning here. In fact, the straight line given by μY|x = β0 + β1x is still called the regression line as before, and the estimates of β0 and β1 are identical to those given in Section 11.3. The value of ρ is 0 when β1 = 0, which results when there essentially is no linear regression; that is, the regression line is horizontal and any knowledge of X is useless in predicting Y. Since σY² ≥ σ², we must have ρ² ≤ 1 and hence −1 ≤ ρ ≤ 1. Values of ρ = ±1 only occur when σ² = 0, in which case we have a perfect linear relationship between the two variables. Thus, a value of ρ equal to +1 implies a perfect linear relationship with a positive slope, while a value of ρ equal to −1 results from a perfect linear relationship with a negative slope. It might be said, then, that sample estimates of ρ close to unity in magnitude imply good correlation, or linear association, between X and Y, whereas values near zero indicate little or no correlation.

To obtain a sample estimate of ρ, recall from Section 11.4 that the error sum of squares is

SSE = Syy − b1Sxy.
Dividing both sides of this equation by Syy and replacing Sxy by b1Sxx, we obtain the relation

b1²(Sxx/Syy) = 1 − SSE/Syy.

The value of b1²Sxx/Syy is zero when b1 = 0, which will occur when the sample points show no linear relationship. Since Syy ≥ SSE, we conclude that b1²Sxx/Syy must be between 0 and 1. Consequently, b1√(Sxx/Syy) must range from −1 to +1, negative values corresponding to lines with negative slopes and positive values to lines with positive slopes. A value of −1 or +1 will occur when SSE = 0, but this is the case where all sample points lie in a straight line. Hence, a perfect linear relationship appears in the sample data when b1√(Sxx/Syy) = ±1. Clearly, the quantity b1√(Sxx/Syy), which we shall henceforth designate as r, can be used as an estimate of the population correlation coefficient ρ. It is customary to refer to the estimate r as the Pearson product-moment correlation coefficient or simply the sample correlation coefficient.

Correlation Coefficient: The measure ρ of linear association between two variables X and Y is estimated by the sample correlation coefficient r, where

r = b1√(Sxx/Syy) = Sxy/√(Sxx Syy).

For values of r between −1 and +1 we must be careful in our interpretation. For example, values of r equal to 0.3 and 0.6 only mean that we have two positive correlations, one somewhat stronger than the other. It is wrong to conclude that r = 0.6 indicates a linear relationship twice as good as that indicated by the value r = 0.3. On the other hand, if we write

r² = Sxy²/(Sxx Syy) = SSR/Syy,

then r², which is usually referred to as the sample coefficient of determination, represents the proportion of the variation of Syy explained by the regression of Y on x, namely SSR. That is, r² expresses the proportion of the total variation in the values of the variable Y that can be accounted for or explained by a linear relationship with the values of the random variable X. Thus, a correlation of 0.6 means that 0.36, or 36%, of the total variation of the values of Y in our sample is accounted for by a linear relationship with values of X.

Example 11.10: It is important that scientific researchers in the area of forest products be able to study correlation among the anatomy and mechanical properties of trees. For the study Quantitative Anatomical Characteristics of Plantation Grown Loblolly Pine (Pinus Taeda L.) and Cottonwood (Populus deltoides Bart. Ex Marsh.) and Their Relationships to Mechanical Properties, conducted by the Department of Forestry and Forest Products at Virginia Tech, 29 loblolly pines were randomly selected for investigation. Table 11.9 shows the resulting data on the specific gravity in grams/cm3 and the modulus of rupture in kilopascals (kPa). Compute and interpret the sample correlation coefficient.

Table 11.9: Data on 29 Loblolly Pines for Example 11.10

    Specific Gravity,   Modulus of Rupture,   Specific Gravity,   Modulus of Rupture,
    x (g/cm3)           y (kPa)               x (g/cm3)           y (kPa)
    0.414               29,186                0.581               85,156
    0.383               29,266                0.557               69,571
    0.399               26,215                0.550               84,160
    0.402               30,162                0.531               73,466
    0.442               38,867                0.550               78,610
    0.422               37,831                0.556               67,657
    0.466               44,576                0.523               74,017
    0.500               46,097                0.602               87,291
    0.514               59,698                0.569               86,836
    0.530               67,705                0.544               82,540
    0.569               66,088                0.557               81,699
    0.558               78,486                0.530               82,096
    0.577               89,869                0.547               75,657
    0.572               77,369                0.585               80,490
    0.548               67,095

Solution: From the data we find that

Sxx = 0.11273,    Syy = 11,807,324,805,    Sxy = 34,422.27572.

Therefore,

r = 34,422.27572/√((0.11273)(11,807,324,805)) = 0.9435.
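As a quick check on Example 11.10, r follows in one line from the three sums of squares (NumPy assumed available):

```python
import numpy as np

Sxx, Syy, Sxy = 0.11273, 11_807_324_805.0, 34_422.27572
r = Sxy / np.sqrt(Sxx * Syy)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")   # about 0.9435 and 0.8902
```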
A correlation coefficient of 0.9435 indicates a good linear relationship between X and Y. Since r² = 0.8902, we can say that approximately 89% of the variation in the values of Y is accounted for by a linear relationship with X.

A test of the special hypothesis ρ = 0 versus an appropriate alternative is equivalent to testing β1 = 0 for the simple linear regression model, and therefore the procedures of Section 11.8 using either the t-distribution with n − 2 degrees of freedom or the F-distribution with 1 and n − 2 degrees of freedom are applicable. However, if one wishes to avoid the analysis-of-variance procedure and compute only the sample correlation coefficient, it can be verified (see Review Exercise 11.66 on page 438) that the t-value

t = b1/(s/√Sxx)

can also be written as

t = r√(n − 2)/√(1 − r²),

which, as before, is a value of the statistic T having a t-distribution with n − 2 degrees of freedom.

Example 11.11: For the data of Example 11.10, test the hypothesis that there is no linear association among the variables.

Solution:
1. H0: ρ = 0.
2. H1: ρ ≠ 0.
3. α = 0.05.
4. Critical region: t < −2.052 or t > 2.052.
5. Computations: t = 0.9435√27/√(1 − 0.9435²) = 14.79, P < 0.0001.
6. Decision: Reject the hypothesis of no linear association.

A test of the more general hypothesis ρ = ρ0 against a suitable alternative is easily conducted from the sample information. If X and Y follow the bivariate normal distribution, the quantity

(1/2) ln[(1 + r)/(1 − r)]

is the value of a random variable that follows approximately the normal distribution with mean (1/2) ln[(1 + ρ)/(1 − ρ)] and variance 1/(n − 3). Thus, the test procedure is to compute

z = [√(n − 3)/2] [ln((1 + r)/(1 − r)) − ln((1 + ρ0)/(1 − ρ0))] = [√(n − 3)/2] ln[(1 + r)(1 − ρ0)/((1 − r)(1 + ρ0))]

and compare it with the critical points of the standard normal distribution.

Example 11.12: For the data of Example 11.10, test the null hypothesis that ρ = 0.9 against the alternative that ρ > 0.9. Use a 0.05 level of significance.

Solution:
1. H0: ρ = 0.9.
2. H1: ρ > 0.9.
3. α = 0.05.
4. Critical region: z > 1.645.
5. Computations:
z = (√26/2) ln[(1 + 0.9435)(0.1)/((1 − 0.9435)(1.9))] = 1.51,    P = 0.0655.
6. Decision: There is certainly some evidence that the correlation coefficient does not exceed 0.9.
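Both correlation tests in this section reduce to short computations. The sketch below (SciPy assumed available) reproduces the z statistic of Example 11.12 and, for comparison, the t statistic of Example 11.11.

```python
import numpy as np
from scipy import stats

n, r, rho0 = 29, 0.9435, 0.9

# Fisher z test of H0: rho = rho0 against H1: rho > rho0 (Example 11.12).
z = (np.sqrt(n - 3) / 2) * np.log((1 + r) * (1 - rho0) / ((1 - r) * (1 + rho0)))
print(f"z = {z:.2f}, P = {stats.norm.sf(z):.4f}")   # about 1.51 and 0.0655

# t test of H0: rho = 0 (Example 11.11) uses the same r.
t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
print(f"t = {t:.2f}")                                # about 14.79
```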
It should be pointed out that in correlation studies, as in linear regression problems, the results obtained are only as good as the model that is assumed. In the correlation techniques studied here, a bivariate normal density is assumed for the variables X and Y, with the mean value of Y at each x-value being linearly related to x. To observe the suitability of the linearity assumption, a preliminary plotting of the experimental data is often helpful. A value of the sample correlation coefficient close to zero will result from data that display a strictly random effect as in Figure 11.28(a), thus implying little or no causal relationship. It is important to remember that the correlation coefficient between two variables is a measure of their linear relationship and that a value of r = 0 implies a lack of linearity and not a lack of association. Hence, if a strong quadratic relationship exists between X and Y, as indicated in Figure 11.28(b), we can still obtain a zero correlation indicating a nonlinear relationship.

Figure 11.28: Scatter diagram showing zero correlation: (a) no association, (b) causal relationship.
Exercises
11.43 Compute and interpret the correlation coefficient for the following grades of 6 students selected at random:

    Mathematics grade   70   92   80   74   65   83
    English grade       74   84   63   87   78   90

11.44 With reference to Exercise 11.1 on page 398, assume that x and y are random variables with a bivariate normal distribution.
(a) Calculate r.
(b) Test the hypothesis that ρ = 0 against the alternative that ρ ≠ 0 at the 0.05 level of significance.
11.45 With reference to Exercise 11.13 on page 400, assume a bivariate normal distribution for x and y.
(a) Calculate r.
(b) Test the null hypothesis that ρ = −0.5 against the alternative that ρ < −0.5 at the 0.025 level of significance.
(c) Determine the percentage of the variation in the amount of particulate removed that is due to changes in the daily amount of rainfall.

11.46 Test the hypothesis that ρ = 0 in Exercise 11.43 against the alternative that ρ ≠ 0. Use a 0.05 level of significance.

11.47 The following data were obtained in a study of the relationship between the weight and chest size of infants at birth.

    Weight (kg)      2.75   2.15   4.41   5.52   3.21   4.32   2.31   4.30   3.71
    Chest Size (cm)  29.5   26.3   32.2   36.5   27.2   27.7   28.3   30.3   28.7

(a) Calculate r.
(b) Test the null hypothesis that ρ = 0 against the alternative that ρ > 0 at the 0.01 level of significance.
(c) What percentage of the variation in infant chest sizes is explained by difference in weight?

Review Exercises

11.48 With reference to Exercise 11.8 on page 399, construct
(a) a 95% confidence interval for the average course grade of students who make a 35 on the placement test;
(b) a 95% prediction interval for the course grade of a student who made a 35 on the placement test.

11.49 The Statistics Consulting Center at Virginia Tech analyzed data on normal woodchucks for the Department of Veterinary Medicine. The variables of interest were body weight in grams and heart weight in grams. It was desired to develop a linear regression equation in order to determine if there is a significant linear relationship between heart weight and total body weight. Use heart weight as the independent variable and body weight as the dependent variable and fit a simple linear regression using the following data. In addition, test the hypothesis H0: β1 = 0 versus H1: β1 ≠ 0. Draw conclusions.

    Body Weight (grams)   Heart Weight (grams)
    4050                  11.2
    2465                  12.4
    3120                  10.5
    5700                  13.2
    2595                   9.8
    3640                  11.0
    2050                  10.8
    4235                  10.4
    2935                  12.2
    4975                  11.2
    3690                  10.8
    2800                  14.2
    2775                  12.2
    2170                  10.0
    2370                  12.3
    2055                  12.5
    2025                  11.8
    2645                  16.0
    2675                  13.8
11.50 The amounts of solids removed from a particular material when exposed to drying periods of different lengths are as shown.

    x (hours)   y (grams)
    4.4         13.1   14.2
    4.5          9.0   11.5
    4.8         10.4   11.5
    5.5         13.8   14.8
    5.7         12.7   15.1
    5.9          9.9   12.7
    6.3         13.8   16.5
    6.9         16.4   15.7
    7.5         17.6   16.9
    7.8         18.3   17.2
(a) Estimate the linear regression line.
(b) Test at the 0.05 level of significance whether the linear model is adequate.
11.51 With reference to Exercise 11.9 on page 399, construct
(a) a 95% confidence interval for the average weekly sales when $45 is spent on advertising;
(b) a 95% prediction interval for the weekly sales when $45 is spent on advertising.
11.52 An experiment was designed for the Department of Materials Engineering at Virginia Tech to study hydrogen embrittlement properties based on electrolytic hydrogen pressure measurements. The solution used was 0.1 N NaOH, and the material was a certain type of stainless steel. The cathodic charging current density was controlled and varied at four levels. The effective hydrogen pressure was observed as the response. The data follow.

    Run   Charging Current Density,   Effective Hydrogen Pressure,
          x (mA/cm2)                  y (atm)
     1    0.5                          86.1
     2    0.5                          92.1
     3    0.5                          64.7
     4    0.5                          74.7
     5    1.5                         223.6
     6    1.5                         202.1
     7    1.5                         132.9
     8    2.5                         413.5
     9    2.5                         231.5
    10    2.5                         466.7
    11    2.5                         365.3
    12    3.5                         493.7
    13    3.5                         382.3
    14    3.5                         447.2
    15    3.5                         563.8

(a) Run a simple linear regression of y against x.
(b) Compute the pure error sum of squares and make a test for lack of fit.
(c) Does the information in part (b) indicate a need for a model in x beyond a first-order regression? Explain.

11.53 The following data represent the chemistry grades for a random sample of 12 freshmen at a certain college along with their scores on an intelligence test administered while they were still seniors in high school.

    Student   Test Score, x   Chemistry Grade, y
     1        65              85
     2        50              74
     3        55              76
     4        65              90
     5        55              85
     6        70              87
     7        65              94
     8        70              98
     9        55              81
    10        70              91
    11        50              76
    12        55              74

(a) Compute and interpret the sample correlation coefficient.
(b) State necessary assumptions on random variables.
(c) Test the hypothesis that ρ = 0.5 against the alternative that ρ > 0.5. Use a P-value in the conclusion.

11.54 The business section of the Washington Times in March of 1997 listed 21 different used computers and printers and their sale prices. Also listed was the average hover bid. Partial results from regression analysis using SAS software are shown in Figure 11.29 on page 439.
(a) Explain the difference between the confidence interval on the mean and the prediction interval.
(b) Explain why the standard errors of prediction vary from observation to observation.
(c) Which observation has the lowest standard error of prediction? Why?

11.55 Consider the vehicle data from Consumer Reports in Figure 11.30 on page 440. Weight is in tons, mileage in miles per gallon, and drive ratio is also indicated. A regression model was fitted relating weight x to mileage y. A partial SAS printout in Figure 11.30 on page 440 shows some of the results of that regression analysis, and Figure 11.31 on page 441 gives a plot of the residuals and weight for each vehicle.
(a) From the analysis and the residual plot, does it appear that an improved model might be found by using a transformation? Explain.
(b) Fit the model by replacing weight with log weight. Comment on the results.
(c) Fit a model by replacing mpg with gallons per 100 miles traveled, as mileage is often reported in other countries. Which of the three models is preferable? Explain.

11.56 Observations on the yield of a chemical reaction taken at various temperatures were recorded as follows:

    x (°C)   y (%)    x (°C)   y (%)
    150      75.4     150      77.7
    150      81.2     200      84.4
    200      85.5     200      85.7
    250      89.0     250      89.4
    250      90.5     300      94.8
    300      96.7     300      95.3

(a) Plot the data.
(b) Does it appear from the plot as if the relationship is linear?
(c) Fit a simple linear regression and test for lack of fit.
(d) Draw conclusions based on your result in (c).
11.57 Physical fitness testing is an important aspect of athletic training. A common measure of the magnitude of cardiovascular fitness is the maximum volume of oxygen uptake during strenuous exercise. A study was conducted on 24 middle-aged men to determine the influence on oxygen uptake of the time required to complete a two-mile run. Oxygen uptake was measured with standard laboratory methods as the subjects performed on a treadmill. The work was published in "Maximal Oxygen Intake Prediction in Young and Middle Aged Males," Journal of Sports Medicine 9, 1969, 17–22. The data are as follows:

    Subject   y, Maximum Volume of O2   x, Time in Seconds
     1        42.33                      918
     2        53.10                      805
     3        42.08                      892
     4        50.06                      962
     5        42.45                      968
     6        42.46                      907
     7        47.82                      770
     8        49.92                      743
     9        36.23                     1045
    10        49.66                      810
    11        41.49                      927
    12        46.17                      813
    13        46.18                      858
    14        43.21                      860
    15        51.81                      760
    16        53.28                      747
    17        53.29                      743
    18        47.18                      803
    19        56.91                      683
    20        47.80                      844
    21        48.65                      755
    22        53.67                      700
    23        60.62                      748
    24        56.73                      775

(a) Estimate the parameters in a simple linear regression model.
(b) Does the time it takes to run two miles have a significant influence on maximum oxygen uptake? Use H0: β1 = 0 versus H1: β1 ≠ 0.
(c) Plot the residuals on a graph against x and comment on the appropriateness of the simple linear model.

11.58 Suppose a scientist postulates a model

Yi = β0 + β1xi + εi,    i = 1, 2, . . . , n,

and β0 is a known value, not necessarily zero.
(a) What is the appropriate least squares estimator of β1? Justify your answer.
(b) What is the variance of the slope estimator?

11.59 For the simple linear regression model, prove that E(s²) = σ².

11.60 Assuming that the εi are independent and normally distributed with zero means and common variance σ², show that B0, the least squares estimator of β0 in μY|x = β0 + β1x, is normally distributed with mean β0 and variance

σ²_{B0} = [∑_{i=1}^n xi² / (n ∑_{i=1}^n (xi − x̄)²)] σ².

11.61 For a simple linear regression model

Yi = β0 + β1xi + εi,    i = 1, 2, . . . , n,

where the εi are independent and normally distributed with zero means and equal variances σ², show that Ȳ and

B1 = ∑_{i=1}^n (xi − x̄)Yi / ∑_{i=1}^n (xi − x̄)²

have zero covariance.

11.62 Show, in the case of a least squares fit to the simple linear regression model

Yi = β0 + β1xi + εi,    i = 1, 2, . . . , n,

that ∑_{i=1}^n (yi − ŷi) = ∑_{i=1}^n ei = 0.

11.63 Consider the situation of Review Exercise 11.62 but suppose n = 2 (i.e., only two data points are available). Give an argument that the least squares regression line will result in (y1 − ŷ1) = (y2 − ŷ2) = 0. Also show that for this case R² = 1.0.

11.64 In Review Exercise 11.62, the student was required to show that ∑_{i=1}^n (yi − ŷi) = 0 for a standard simple linear regression model. Does the same hold for a model with zero intercept? Show why or why not.

11.65 Suppose that an experimenter postulates a model of the type

Yi = β0 + β1x1i + εi,    i = 1, 2, . . . , n,

when in fact an additional variable, say x2, also contributes linearly to the response. The true model is then given by

Yi = β0 + β1x1i + β2x2i + εi,    i = 1, 2, . . . , n.

Compute the expected value of the estimator

B1 = ∑_{i=1}^n (x1i − x̄1)Yi / ∑_{i=1}^n (x1i − x̄1)².

11.66 Show the necessary steps in converting the equation t = b1/(s/√Sxx) to the equivalent form t = r√(n − 2)/√(1 − r²).
11.67 Consider the fictitious set of data shown below, where the line through the data is the fitted simple linear regression line. Sketch a residual plot.

11.68 Project: This project can be done in groups or as individuals. Each group or person must find a set of data, preferably but not restricted to their field of study. The data need to fit the regression framework with a regression variable x and a response variable y. Carefully make the assignment as to which variable is x and which y. It may be necessary to consult a journal or periodical from your field if you do not have other research data available.
(a) Plot y versus x. Comment on the relationship as seen from the plot.
(b) Fit an appropriate regression model from the data. Use simple linear regression or fit a polynomial model to the data. Comment on measures of quality.
(c) Plot residuals as illustrated in the text. Check possible violation of assumptions. Show graphically a plot of confidence intervals on a mean response plotted against x. Comment.

Figure 11.29: SAS printout, showing partial analysis of data of Review Exercise 11.54 (for each of the 21 products, the buyer bid, price, predicted value with its standard error, and 95% confidence and prediction limits; summary values below).

    R-Square   Coeff Var   Root MSE   Price Mean
    0.967472   7.923338    70.83841   894.0476

    Parameter   Estimate      Standard Error   t Value   Pr > |t|
    Intercept   59.93749137   38.34195754       1.56     0.1345
    Buyer        1.04731316    0.04405635      23.77     <.0001

Figure 11.30: SAS printout, showing partial analysis of data of Review Exercise 11.55 (weight, mileage, and drive ratio for the 38 vehicle models; summary values below).

    R-Square   Coeff Var   Root MSE   MPG Mean
    0.817244   11.46010    2.837580   24.76053

    Parameter   Estimate       Standard Error   t Value   Pr > |t|
    Intercept   48.67928080    1.94053995        25.09    <.0001
    WT          -8.36243141    0.65908398       -12.69    <.0001

Figure 11.31: SAS printout, showing residual plot of Review Exercise 11.55.

11.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters

Anytime one is considering the use of simple linear regression, a plot of the data is not only recommended but essential. A plot of the ordinary residuals and a normal probability plot of these residuals are always edifying. In addition, we introduce and illustrate an additional type of residual in Chapter 12 that is in a standardized form. All of these plots are designed to detect violation of assumptions. The use of t-statistics for tests on regression coefficients is reasonably robust to the normality assumption. The homogeneous variance assumption is crucial, and residual plots are designed to detect a violation.

The material in this chapter is used heavily in Chapters 12 and 15. All of the information involving the method of least squares in the development of regression models carries over into Chapter 12. The difference is that Chapter 12 deals with the scientific conditions in which there is more than a single x variable, i.e., more than one regression variable. However, material in the current chapter that deals with regression diagnostics, types of residual plots, measures of model quality, and so on, applies and will carry over. The student will realize that more complications occur in Chapter 12 because the problems in multiple regression models often involve the backdrop of questions regarding how the various regression variables enter the model and even issues of which variables should remain in the model. Certainly Chapter 15 heavily involves the use of regression modeling, but we will preview the connection in the summary at the end of Chapter 12.

Chapter 12

Multiple Linear Regression and Certain Nonlinear Regression Models

12.1 Introduction

In most research problems where regression analysis is applied, more than one independent variable is needed in the regression model. The complexity of most scientific mechanisms is such that in order to be able to predict an important response, a multiple regression model is needed. When this model is linear in the coefficients, it is called a multiple linear regression model.
For the case of k independent variables x1, x2, . . . , xk, the mean of Y|x1, x2, . . . , xk is given by the multiple linear regression model

μY|x1,x2,...,xk = β0 + β1x1 + · · · + βkxk,

and the estimated response is obtained from the sample regression equation

ŷ = b0 + b1x1 + · · · + bkxk,

where each regression coefficient βi is estimated by bi from the sample data using the method of least squares. As in the case of a single independent variable, the multiple linear regression model can often be an adequate representation of a more complicated structure within certain ranges of the independent variables.

Similar least squares techniques can also be applied for estimating the coefficients when the linear model involves, say, powers and products of the independent variables. For example, when k = 1, the experimenter may believe that the means μY|x do not fall on a straight line but are more appropriately described by the polynomial regression model

μY|x = β0 + β1x + β2x² + · · · + βrx^r,

and the estimated response is obtained from the polynomial regression equation

ŷ = b0 + b1x + b2x² + · · · + brx^r.

Confusion arises occasionally when we speak of a polynomial model as a linear model. However, statisticians normally refer to a linear model as one in which the parameters occur linearly, regardless of how the independent variables enter the model. An example of a nonlinear model is the exponential relationship

μY|x = αβ^x,

whose response is estimated by the regression equation

ŷ = ab^x.

There are many phenomena in science and engineering that are inherently nonlinear in nature, and when the true structure is known, an attempt should certainly be made to fit the actual model. The literature on estimation by least squares of nonlinear models is voluminous. The nonlinear models discussed in this chapter deal with nonideal conditions in which the analyst is certain that the response and hence the response model error are not normally distributed but, rather, have a binomial or Poisson distribution. These situations do occur extensively in practice. A student who wants a more general account of nonlinear regression should consult Classical and Modern Regression with Applications by Myers (1990; see the Bibliography).

12.2 Estimating the Coefficients

In this section, we obtain the least squares estimators of the parameters β0, β1, . . . , βk by fitting the multiple linear regression model

μY|x1,x2,...,xk = β0 + β1x1 + · · · + βkxk

to the data points

{(x1i, x2i, . . . , xki, yi);  i = 1, 2, . . . , n and n > k},
where yi is the observed response to the values x1i, x2i, . . . , xki of the k independent variables x1, x2, . . . , xk. Each observation (x1i, x2i, . . . , xki, yi) is assumed to satisfy the following equation.
Multiple Linear Regression Model:

    y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + ··· + β_k x_{ki} + ε_i

or

    y_i = ŷ_i + e_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + ··· + b_k x_{ki} + e_i,

where ε_i and e_i are the random error and residual, respectively, associated with the response y_i and fitted value ŷ_i.
As in the case of simple linear regression, it is assumed that the εi are independent and identically distributed with mean 0 and common variance σ2.
In using the concept of least squares to arrive at estimates b_0, b_1, ..., b_k, we minimize the expression

    SSE = Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_i − b_0 − b_1 x_{1i} − b_2 x_{2i} − ··· − b_k x_{ki})².

Differentiating SSE in turn with respect to b_0, b_1, ..., b_k and equating to zero, we generate the set of k + 1 normal equations for multiple linear regression.
Normal Estimation Equations for Multiple Linear Regression:

    n b_0 + b_1 Σ x_{1i} + b_2 Σ x_{2i} + ··· + b_k Σ x_{ki} = Σ y_i,
    b_0 Σ x_{1i} + b_1 Σ x_{1i}² + b_2 Σ x_{1i}x_{2i} + ··· + b_k Σ x_{1i}x_{ki} = Σ x_{1i}y_i,
    ⋮
    b_0 Σ x_{ki} + b_1 Σ x_{ki}x_{1i} + b_2 Σ x_{ki}x_{2i} + ··· + b_k Σ x_{ki}² = Σ x_{ki}y_i,

where all sums run over i = 1, 2, ..., n.
These equations can be solved for b0, b1, b2, . . . , bk by any appropriate method for solving systems of linear equations. Most statistical software can be used to obtain numerical solutions of the above equations.
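Since most readers will carry out these computations with software, the following is a minimal Python sketch (numpy assumed; the small data set is hypothetical) of assembling and solving the normal equations for k = 2 regressors:

    import numpy as np

    # Hypothetical data for two regressors.
    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
    y  = np.array([3.1, 3.9, 6.2, 6.8, 9.6, 9.0])
    n  = len(y)

    # Coefficient matrix and right-hand side built from the sums
    # appearing in the normal equations above.
    A = np.array([
        [n,        x1.sum(),        x2.sum()],
        [x1.sum(), (x1 * x1).sum(), (x1 * x2).sum()],
        [x2.sum(), (x1 * x2).sum(), (x2 * x2).sum()],
    ])
    g = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

    b = np.linalg.solve(A, g)   # least squares estimates b0, b1, b2
    print(b)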
Example 12.1: A study was done on a diesel-powered light-duty pickup truck to see if humidity, air temperature, and barometric pressure influence emission of nitrous oxide (in ppm). Emission measurements were taken at different times, with varying experimental conditions. The data are given in Table 12.1. The model is

    μ_{Y|x1,x2,x3} = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_3,

or, equivalently,

    y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + β_3 x_{3i} + ε_i,   i = 1, 2, ..., 20.

Fit this multiple linear regression model to the given data and then estimate the amount of nitrous oxide emitted for the conditions where humidity is 50%, temperature is 76°F, and barometric pressure is 29.30.
Table 12.1: Data for Example 12.1

Nitrous Oxide, y   Humidity, x1   Temp., x2   Pressure, x3
0.90                72.4           76.3        29.18
0.91                41.6           70.3        29.35
0.96                34.3           77.1        29.24
0.89                35.1           68.0        29.27
1.00                10.7           79.0        29.78
1.10                12.9           67.4        29.39
1.15                 8.3           66.8        29.69
1.03                20.1           76.9        29.48
0.77                72.2           77.7        29.09
1.07                24.0           67.7        29.60
1.07                23.2           76.8        29.38
0.94                47.4           86.6        29.35
1.10                31.5           76.9        29.63
1.10                10.6           86.3        29.56
1.10                11.2           86.0        29.48
0.91                73.3           76.3        29.40
0.87                75.4           77.9        29.28
0.78                96.6           78.7        29.29
0.82               107.4           86.8        29.03
0.95                54.9           70.9        29.37

Source: Charles T. Hare, "Light-Duty Diesel Emission Correction Factors for Ambient Conditions," EPA-600/2-77-116, U.S. Environmental Protection Agency.
Solution: The solution of the set of estimating equations yields the unique estimates b0 = −3.507778, b1 = −0.002625, b2 = 0.000799, b3 = 0.154155.
Therefore, the regression equation is
    ŷ = −3.507778 − 0.002625 x_1 + 0.000799 x_2 + 0.154155 x_3.
For 50% humidity, a temperature of 76◦F, and a barometric pressure of 29.30, the estimated amount of nitrous oxide emitted is
    ŷ = −3.507778 − 0.002625(50.0) + 0.000799(76.0) + 0.154155(29.30) = 0.9384 ppm.
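As a check, the fit of Example 12.1 can be reproduced with a least squares routine. The sketch below assumes the data exactly as transcribed in Table 12.1 (numpy assumed):

    import numpy as np

    # Columns: y = nitrous oxide, x1 = humidity, x2 = temperature,
    # x3 = barometric pressure (Table 12.1, as transcribed above).
    data = np.array([
        [0.90, 72.4, 76.3, 29.18], [0.91, 41.6, 70.3, 29.35],
        [0.96, 34.3, 77.1, 29.24], [0.89, 35.1, 68.0, 29.27],
        [1.00, 10.7, 79.0, 29.78], [1.10, 12.9, 67.4, 29.39],
        [1.15,  8.3, 66.8, 29.69], [1.03, 20.1, 76.9, 29.48],
        [0.77, 72.2, 77.7, 29.09], [1.07, 24.0, 67.7, 29.60],
        [1.07, 23.2, 76.8, 29.38], [0.94, 47.4, 86.6, 29.35],
        [1.10, 31.5, 76.9, 29.63], [1.10, 10.6, 86.3, 29.56],
        [1.10, 11.2, 86.0, 29.48], [0.91, 73.3, 76.3, 29.40],
        [0.87, 75.4, 77.9, 29.28], [0.78, 96.6, 78.7, 29.29],
        [0.82, 107.4, 86.8, 29.03], [0.95, 54.9, 70.9, 29.37],
    ])
    y = data[:, 0]
    X = np.column_stack([np.ones(len(data)), data[:, 1:]])

    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(b)  # close to (-3.507778, -0.002625, 0.000799, 0.154155)

    # Estimated emission at 50% humidity, 76 deg F, pressure 29.30.
    print(b @ [1.0, 50.0, 76.0, 29.30])  # roughly 0.94 ppm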
Polynomial Regression
Now suppose that we wish to fit the polynomial equation

    μ_{Y|x} = β_0 + β_1 x + β_2 x² + ··· + β_r x^r

to the n pairs of observations {(x_i, y_i); i = 1, 2, ..., n}. Each observation, y_i, satisfies the equation

    y_i = β_0 + β_1 x_i + β_2 x_i² + ··· + β_r x_i^r + ε_i

or

    y_i = ŷ_i + e_i = b_0 + b_1 x_i + b_2 x_i² + ··· + b_r x_i^r + e_i,

where r is the degree of the polynomial and ε_i and e_i are again the random error and residual associated with the response y_i and fitted value ŷ_i, respectively. Here, the number of pairs, n, must be at least as large as r + 1, the number of parameters to be estimated.
Notice that the polynomial model can be considered a special case of the more general multiple linear regression model, where we set x_1 = x, x_2 = x², ..., x_r = x^r. The normal equations assume the same form as those given on page 445. They are then solved for b_0, b_1, b_2, ..., b_r.
Example 12.2: Given the data

x   0    1    2    3    4    5    6    7    8    9
y   9.1  7.3  3.2  4.6  4.8  2.9  5.7  7.1  8.8  10.2

fit a regression curve of the form μ_{Y|x} = β_0 + β_1 x + β_2 x² and then estimate μ_{Y|2}.
Solution: From the data given, we find that

    10 b_0 +   45 b_1 +    285 b_2 =   63.7,
    45 b_0 +  285 b_1 +   2025 b_2 =  307.3,
    285 b_0 + 2025 b_1 + 15,333 b_2 = 2153.3.
Solving these normal equations, we obtain
b0 = 8.698, b1 = −2.341, b2 = 0.288.
Therefore,
yˆ = 8.698 − 2.341x + 0.288×2.
When x = 2, our estimate of μ_{Y|2} is

    ŷ = 8.698 − (2.341)(2) + (0.288)(2²) = 5.168.
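A least squares polynomial fit of this kind can be reproduced with numpy.polyfit; the sketch below uses the data of Example 12.2:

    import numpy as np

    # numpy.polyfit returns coefficients highest degree first.
    x = np.arange(10)
    y = np.array([9.1, 7.3, 3.2, 4.6, 4.8, 2.9, 5.7, 7.1, 8.8, 10.2])

    b2, b1, b0 = np.polyfit(x, y, deg=2)
    print(b0, b1, b2)               # roughly 8.698, -2.341, 0.288

    # Estimate of the mean response at x = 2.
    print(b0 + b1 * 2 + b2 * 2**2)  # roughly 5.168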
Example 12.3: The data in Table 12.2 represent the percent of impurities that resulted for various temperatures and sterilizing times during a reaction associated with the manufacturing of a certain beverage. Estimate the regression coefficients in the polynomial model

    y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + β_{11} x_{1i}² + β_{22} x_{2i}² + β_{12} x_{1i}x_{2i} + ε_i,   for i = 1, 2, ..., 18.
Table 12.2: Data for Example 12.3

Sterilizing          Temperature, x1 (°C)
Time, x2 (min)    75              100             125
15                14.05  14.93    10.55   9.48     7.55   6.59
20                16.56  15.85    13.63  11.75     9.23   8.78
25                22.41  21.66    18.55  17.98    15.93  16.44
Solution: Using the normal equations, we obtain

    b_0 = 56.4411, b_1 = −0.36190, b_2 = −2.75299,
    b_11 = 0.00081, b_22 = 0.08173, b_12 = 0.00314,

and our estimated regression equation is

    ŷ = 56.4411 − 0.36190 x_1 − 2.75299 x_2 + 0.00081 x_1² + 0.08173 x_2² + 0.00314 x_1x_2.
Many of the principles and procedures associated with the estimation of polynomial regression functions fall into the category of response surface methodology, a collection of techniques that have been used quite successfully by scientists and engineers in many fields. The x_i² are called pure quadratic terms, and the x_i x_j (i ≠ j) are called interaction terms. Such problems as selecting a proper experimental design, particularly in cases where a large number of variables are in the model, and choosing optimum operating conditions for x_1, x_2, ..., x_k are often approached through the use of these methods. For an extensive exposure, the reader is referred to Response Surface Methodology: Process and Product Optimization Using Designed Experiments by Myers, Montgomery, and Anderson-Cook (2009; see the Bibliography).
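A second-order model of this kind is still linear in its coefficients, so it can be fit by ordinary least squares once quadratic and interaction columns are added to the design matrix. The sketch below is illustrative only; the arrays x1, x2, and y are hypothetical placeholders, not the data of Table 12.2:

    import numpy as np

    def second_order_design(x1, x2):
        # Pure quadratic terms x1^2, x2^2 and the interaction x1*x2
        # enter as extra columns; the model stays linear in the b's.
        return np.column_stack([np.ones_like(x1), x1, x2,
                                x1**2, x2**2, x1 * x2])

    # Hypothetical temperature/time/impurity values.
    x1 = np.array([75.0, 75.0, 100.0, 100.0, 125.0, 125.0])
    x2 = np.array([15.0, 25.0, 15.0, 25.0, 15.0, 25.0])
    y  = np.array([14.5, 22.0, 10.0, 18.3, 7.1, 16.2])

    X = second_order_design(x1, x2)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(b)   # b0, b1, b2, b11, b22, b12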
12.3 Linear Regression Model Using Matrices
In fitting a multiple linear regression model, particularly when the number of variables exceeds two, a knowledge of matrix theory can facilitate the mathematical manipulations considerably. Suppose that the experimenter has k independent variables x_1, x_2, ..., x_k and n observations y_1, y_2, ..., y_n, each of which can be expressed by the equation

    y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + ··· + β_k x_{ki} + ε_i.

This model essentially represents n equations describing how the response values are generated in the scientific process. Using matrix notation, we can write the following equation:

General Linear Model:

    y = Xβ + ε,

where

    y = [y_1, y_2, ..., y_n]′,   β = [β_0, β_1, ..., β_k]′,   ε = [ε_1, ε_2, ..., ε_n]′,

        ⎡ 1  x_{11}  x_{21}  ···  x_{k1} ⎤
        ⎢ 1  x_{12}  x_{22}  ···  x_{k2} ⎥
    X = ⎢ ⋮     ⋮       ⋮            ⋮   ⎥ .
        ⎣ 1  x_{1n}  x_{2n}  ···  x_{kn} ⎦

Then the least squares method for estimation of β, illustrated in Section 12.2, involves finding b for which

    SSE = (y − Xb)′(y − Xb)

is minimized. This minimization process involves solving for b in the equation

    ∂(SSE)/∂b = 0.

We will not present the details regarding solution of the equations above. The result reduces to the solution of b in

    (X′X)b = X′y.

Notice the nature of the X matrix. Apart from the initial element, the ith row represents the x-values that give rise to the response y_i. Writing

                 ⎡ n           Σx_{1i}        Σx_{2i}        ···  Σx_{ki}       ⎤
                 ⎢ Σx_{1i}     Σx_{1i}²       Σx_{1i}x_{2i}  ···  Σx_{1i}x_{ki} ⎥
    A = X′X =    ⎢ ⋮              ⋮              ⋮                   ⋮          ⎥
                 ⎣ Σx_{ki}     Σx_{ki}x_{1i}  Σx_{ki}x_{2i}  ···  Σx_{ki}²      ⎦

and

    g = X′y = [g_0, g_1, ..., g_k]′,   with g_0 = Σy_i, g_1 = Σx_{1i}y_i, ..., g_k = Σx_{ki}y_i

(all sums running over i = 1, 2, ..., n), allows the normal equations to be put in the matrix form

    Ab = g.
If the matrix A is nonsingular, we can write the solution for the regression coefficients as

    b = A⁻¹g = (X′X)⁻¹X′y.
Thus, we can obtain the prediction equation or regression equation by solving a set of k + 1 equations in a like number of unknowns. This involves the inversion of the k + 1 by k + 1 matrix X′X. Techniques for inverting this matrix are explained in most textbooks on elementary determinants and matrices. Of course, there are many high-speed computer packages available for multiple regression problems, packages that not only print out estimates of the regression coefficients but also provide other information relevant to making inferences concerning the regression equation.
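A minimal sketch of this matrix solution follows (placeholder data; numpy assumed). Note that solving the linear system (X′X)b = X′y directly is generally preferred to forming the inverse explicitly, since it is cheaper and numerically more stable:

    import numpy as np

    # Placeholder design matrix (intercept column plus two regressors)
    # and a response generated for illustration.
    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(10), rng.uniform(0, 10, size=(10, 2))])
    y = X @ np.array([2.0, 1.5, -0.5]) + rng.normal(0, 0.1, size=10)

    A = X.T @ X                  # the matrix A of the normal equations
    g = X.T @ y                  # the right-hand side g
    b = np.linalg.solve(A, g)    # equivalent to (X'X)^(-1) X'y
    print(b)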
Example 12.4: The percent survival rate of sperm in a certain type of animal semen, after storage, was measured at various combinations of concentrations of three materials used to increase chance of survival. The data are given in Table 12.3. Estimate the multiple linear regression model for the given data.
Table 12.3: Data for Example 12.4

y (% survival)   x1 (weight %)   x2 (weight %)   x3 (weight %)
25.5              1.74            5.30           10.80
31.2              6.32            5.42            9.40
25.9              6.22            8.41            7.20
38.4             10.52            4.63            8.50
18.4              1.19           11.60            9.40
26.7              1.22            5.85            9.90
26.4              4.10            6.62            8.00
25.9              6.32            8.72            9.10
32.0              4.08            4.42            8.70
25.2              4.15            7.60            9.20
39.7             10.15            4.83            9.40
35.7              1.72            3.12            7.60
26.5              1.70            5.30            8.20
Solution: The least squares estimating equations, (X′X)b = X′y, are

    ⎡  13.0    59.43     81.82     115.40  ⎤ ⎡ b_0 ⎤   ⎡  377.5   ⎤
    ⎢  59.43  394.7255  360.6621   522.0780⎥ ⎢ b_1 ⎥ = ⎢ 1877.567 ⎥
    ⎢  81.82  360.6621  576.7264   728.3100⎥ ⎢ b_2 ⎥   ⎢ 2246.661 ⎥
    ⎣ 115.40  522.0780  728.3100  1035.9600⎦ ⎣ b_3 ⎦   ⎣ 3337.780 ⎦

From a computer readout we obtain the elements of the inverse matrix

                ⎡  8.0648  −0.0826  −0.0942  −0.7905 ⎤
    (X′X)⁻¹ =   ⎢ −0.0826   0.0085   0.0017   0.0037 ⎥
                ⎢ −0.0942   0.0017   0.0166  −0.0021 ⎥
                ⎣ −0.7905   0.0037  −0.0021   0.0886 ⎦

and then, using the relation b = (X′X)⁻¹X′y, the estimated regression coefficients are obtained as
    b_0 = 39.1574, b_1 = 1.0161, b_2 = −1.8616, b_3 = −0.3433.

Hence, our estimated regression equation is

    ŷ = 39.1574 + 1.0161 x_1 − 1.8616 x_2 − 0.3433 x_3.

Exercises

12.1 A set of experimental runs was made to determine a way of predicting cooking time y at various values of oven width x1 and flue temperature x2. The coded data were recorded as follows:

y       x1     x2
6.40    1.32   1.15
15.05   2.69   3.40
18.75   3.56   4.10
30.25   4.41   8.75
44.85   5.35   14.82
48.94   6.20   15.15
51.55   7.12   15.32
61.50   8.87   18.18
100.44  9.80   35.19
111.42  10.65  40.40

Estimate the multiple linear regression equation μ_{Y|x1,x2} = β0 + β1x1 + β2x2.

12.2 In Applied Spectroscopy, the infrared reflectance spectra properties of a viscous liquid used in the electronics industry as a lubricant were studied. The designed experiment consisted of the effect of band frequency x1 and film thickness x2 on optical density y using a Perkin-Elmer Model 621 infrared spectrometer. (Source: Pacansky, J., England, C. D., and Wattman, R., 1986.)

y      x1    x2
0.231  740   1.10
0.107  740   0.62
0.053  740   0.31
0.129  805   1.10
0.069  805   0.62
0.030  805   0.31
1.005  980   1.10
0.559  980   0.62
0.321  980   0.31
2.948  1235  1.10
1.633  1235  0.62
0.934  1235  0.31

Estimate the multiple linear regression equation ŷ = b0 + b1x1 + b2x2.

12.3 Suppose in Review Exercise 11.53 on page 437 that we were also given the number of class periods missed by the 12 students taking the chemistry course. The complete data are shown.

Student   Chemistry Grade, y   Test Score, x1   Classes Missed, x2
1          85                   65               1
2          74                   50               7
3          76                   55               5
4          90                   65               2
5          85                   55               6
6          87                   70               3
7          94                   65               2
8          98                   70               5
9          81                   55               4
10         91                   70               3
11         76                   50               1
12         74                   55               4

(a) Fit a multiple linear regression equation of the form ŷ = b0 + b1x1 + b2x2.
(b) Estimate the chemistry grade for a student who has an intelligence test score of 60 and missed 4 classes.

12.4 An experiment was conducted to determine if the weight of an animal can be predicted after a given period of time on the basis of the initial weight of the animal and the amount of feed that was eaten. The following data, measured in kilograms, were recorded:

Final Weight, y   Initial Weight, x1   Feed Weight, x2
95                 42                   272
77                 33                   226
80                 33                   259
100                45                   292
97                 39                   311
70                 36                   183
50                 32                   173
80                 41                   236
92                 40                   230
84                 38                   235

(a) Fit a multiple regression equation of the form μ_{Y|x1,x2} = β0 + β1x1 + β2x2.
(b) Predict the final weight of an animal having an initial weight of 35 kilograms that is given 250 kilograms of feed.
12.5 The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature x1, the number of days in the month x2, the average product purity x3, and the tons of product produced x4. The past year’s historical data are available and are presented in the following table.
y    x1  x2  x3  x4
240  25  24  91  100
236  31  21  90   95
290  45  24  88  110
274  60  25  87   88
301  65  25  91   94
316  72  26  94   99
300  80  25  87   97
296  84  25  86   96
267  75  24  88  110
276  60  25  91  105
288  50  25  90  100
261  38  23  89   98

(a) Fit a multiple linear regression model using the above data set.
(b) Predict power consumption for a month in which x1 = 75°F, x2 = 24 days, x3 = 90%, and x4 = 98 tons.

12.6 An experiment was conducted on a new model of a particular make of automobile to determine the stopping distance at various speeds. The following data were recorded.

Speed, v (km/hr)           35  50  65  80  95  110
Stopping Distance, d (m)   16  26  41  62  88  119

(a) Fit a multiple regression curve of the form μ_{D|v} = β0 + β1v + β2v².
(b) Estimate the stopping distance when the car is traveling at 70 kilometers per hour.

12.7 An experiment was conducted in order to determine if cerebral blood flow in human beings can be predicted from arterial oxygen tension (millimeters of mercury). Fifteen patients participated in the study, and the following data were collected:

Blood Flow, y   Arterial Oxygen Tension, x
84.33           603.40
87.80           582.50
82.20           556.20
78.21           594.60
78.44           558.90
80.01           575.20
83.53           580.10
79.46           451.20
75.22           404.00
76.58           484.00
77.90           452.40
78.80           448.40
80.67           334.80
86.60           320.30
78.20           350.30

Estimate the quadratic regression equation μ_{Y|x} = β0 + β1x + β2x².

12.8 The following is a set of coded experimental data on the compressive strength of a particular alloy at various values of the concentration of some additive:

Concentration, x   Compressive Strength, y
10.0               25.2  27.3  28.7
15.0               29.8  31.1  27.8
20.0               31.2  32.6  29.7
25.0               31.7  30.1  32.3
30.0               29.4  30.8  32.8

(a) Estimate the quadratic regression equation μ_{Y|x} = β0 + β1x + β2x².
(b) Test for lack of fit of the model.

12.9 (a) Fit a multiple regression equation of the form μ_{Y|x} = β0 + β1x1 + β2x2 to the data of Example 11.8 on page 420.
(b) Estimate the yield of the chemical reaction for a temperature of 225°C.

12.10 The following data are given:

x   0  1  2  3  4  5  6
y   1  4  5  3  2  3  4

(a) Fit the cubic model μ_{Y|x} = β0 + β1x + β2x² + β3x³.
(b) Predict Y when x = 2.

12.11 An experiment was conducted to study the size of squid eaten by sharks and tuna. The regressor variables are characteristics of the beaks of the squid. The data are given as follows:

x1    x2    x3    x4    x5    y
1.31  1.07  0.44  0.75  0.35   1.95
1.55  1.49  0.53  0.90  0.47   2.90
0.99  0.84  0.34  0.57  0.32   0.72
0.99  0.83  0.34  0.54  0.27   0.81
1.01  0.90  0.36  0.64  0.30   1.09
1.09  0.93  0.42  0.61  0.31   1.22
1.08  0.90  0.40  0.51  0.31   1.02
1.27  1.08  0.44  0.77  0.34   1.93
0.99  0.85  0.36  0.56  0.29   0.64
1.34  1.13  0.45  0.77  0.37   2.08
1.30  1.10  0.45  0.76  0.38   1.98
1.33  1.10  0.48  0.77  0.38   1.90
1.86  1.47  0.60  1.01  0.65   8.56
1.58  1.34  0.52  0.95  0.50   4.49
1.97  1.59  0.67  1.20  0.59   8.49
1.80  1.56  0.66  1.02  0.59   6.17
1.75  1.58  0.63  1.09  0.59   7.54
1.72  1.43  0.64  1.02  0.63   6.36
1.68  1.57  0.72  0.96  0.68   7.63
1.75  1.59  0.68  1.08  0.62   7.78
2.19  1.86  0.75  1.24  0.72  10.15
1.73  1.67  0.64  1.14  0.55   6.88
In the study, the regressor variables and response considered are

x1 = rostral length, in inches,
x2 = wing length, in inches,
x3 = rostral to notch length, in inches,
x4 = notch to wing length, in inches,
x5 = width, in inches,
y = weight, in pounds.

Estimate the multiple linear regression equation

    μ_{Y|x1,x2,x3,x4,x5} = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5.
12.12 The following data reflect information from 17 U.S. Naval hospitals at various sites around the world. The regressors are workload variables, that is, items that result in the need for personnel in a hospital. A brief description of the variables is as follows:

y = monthly labor-hours,
x1 = average daily patient load,
x2 = monthly X-ray exposures,
x3 = monthly occupied bed-days,
x4 = eligible population in the area/1000,
x5 = average length of patient's stay, in days.

Site   x1      x2      x3         x4     x5     y
1       15.57   2463     472.92   18.0   4.45     566.52
2       44.02   2048    1339.75    9.5   6.92     696.82
3       20.42   3940     620.25   12.8   4.28    1033.15
4       18.74   6505     568.33   36.7   3.90    1003.62
5       49.20   5723    1497.60   35.7   5.50    1611.37
6       44.92  11,520   1365.83   24.0   4.60    1613.27
7       55.48   5779    1687.00   43.3   5.62    1854.17
8       59.28   5969    1639.92   46.7   5.15    2160.55
9       94.39   8461    2872.33   78.7   6.18    2305.58
10     128.02  20,106   3655.08  180.5   6.15    3503.93
11      96.00  13,313   2912.00   60.9   5.88    3571.59
12     131.42  10,771   3921.00  103.7   4.88    3741.40
13     127.21  15,543   3865.67  126.8   5.50    4026.52
14     252.90  36,194   7684.10  157.7   7.00   10,343.81
15     409.20  34,703  12,446.33 169.4  10.75   11,732.17
16     463.70  39,204  14,098.40 331.4   7.05   15,414.94
17     510.22  86,533  15,524.00 371.6   6.35   18,854.45

The goal here is to produce an empirical equation that will estimate (or predict) personnel needs for Naval hospitals. Estimate the multiple linear regression equation

    μ_{Y|x1,x2,x3,x4,x5} = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5.

12.13 A study was performed on a type of bearing to find the relationship of amount of wear y to x1 = oil viscosity and x2 = load. The following data were obtained. (From Response Surface Methodology, Myers, Montgomery, and Anderson-Cook, 2009.)

y    x1    x2
193   1.6   851
230  15.5   816
172  22.0  1058
91   43.0  1201
113  33.0  1357
125  40.0  1115

(a) Estimate the unknown parameters of the multiple linear regression equation μ_{Y|x1,x2} = β0 + β1x1 + β2x2.
(b) Predict wear when oil viscosity is 20 and load is 1200.

12.14 Eleven student teachers took part in an evaluation program designed to measure teacher effectiveness and determine what factors are important. The response measure was a quantitative evaluation of the teacher. The regressor variables were scores on four standardized tests given to each teacher. The data are as follows:

y    x1  x2   x3     x4
410  69  125  59.00  55.66
569  57  131  31.75  63.97
425  77  141  80.50  45.32
344  81  122  75.00  46.67
324   0  141  49.00  41.21
505  53  152  49.35  43.83
235  77  141  60.75  41.61
501  76  132  41.25  64.57
400  65  157  50.75  42.41
584  97  166  32.25  57.95
434  76  141  54.50  57.90

Estimate the multiple linear regression equation

    μ_{Y|x1,x2,x3,x4} = β0 + β1x1 + β2x2 + β3x3 + β4x4.

12.15 The personnel department of a certain industrial firm used 12 subjects in a study to determine the relationship between job performance rating (y) and scores on four tests. The data are as follows:

y     x1    x2    x3    x4
11.2  56.5  71.0  38.5  43.0
14.5  59.5  72.5  38.2  44.8
17.2  69.2  76.0  42.5  49.0
17.8  74.5  79.5  43.4  56.3
19.3  81.2  84.0  47.5  60.2
24.5  88.0  86.2  47.4  62.0
21.2  78.2  80.5  44.5  58.1
16.9  69.0  72.0  41.8  48.1
14.8  58.1  68.0  42.1  46.0
20.0  80.5  85.0  48.1  60.3
13.2  58.3  71.0  37.5  47.1
22.5  84.0  87.2  51.0  65.2

Estimate the regression coefficients in the model ŷ = b0 + b1x1 + b2x2 + b3x3 + b4x4.
12.16 An engineer at a semiconductor company wants to model the relationship between the gain or hFE of a device (y) and three parameters: emitter-RS (x1), base-RS (x2), and emitter-to-base-RS (x3). The data are shown below:

x1, Emitter-RS   x2, Base-RS   x3, E-B-RS   y, hFE
16.12            220.5          3.375        48.14
15.13            223.5          6.125       109.60
15.50            217.6          5.000        82.68
15.13            228.5          6.625       112.60
15.50            230.2          5.750        97.52
16.12            226.5          3.750        59.06
15.13            226.6          6.125       111.80
15.63            225.6          5.375        89.09
15.38            234.0          8.875       171.90
15.50            230.0          4.000        66.80
14.25            224.3          8.000       157.10
14.50            240.5         10.870       208.40
14.62            223.7          7.375       133.40
14.62            226.0          7.000       128.40
15.63            220.0          3.375        52.62
14.62            217.4          6.375       113.90
15.00            220.0          6.000        98.01
14.50            226.5          7.625       139.90
15.25            224.1          6.000       102.60
(Data from Myers, Montgomery, and Anderson-Cook, 2009.)
(a) Fit a multiple linear regression to the data.
(b) Predict hFE when x1 = 14, x2 = 220, and x3 = 5.
12.4 Properties of the Least Squares Estimators
The means and variances of the estimators b0, b1, ..., bk are readily obtained under certain assumptions on the random errors ε1, ε2, ..., εn that are identical to those made in the case of simple linear regression. When we assume these errors to be independent, each with mean 0 and variance σ², it can be shown that b0, b1, ..., bk are, respectively, unbiased estimators of the regression coefficients β0, β1, ..., βk. In addition, the variances of the b's are obtained through the elements of the inverse of the A matrix. Note that the off-diagonal elements of A = X′X represent sums of products of elements in the columns of X, while the diagonal elements of A represent sums of squares of elements in the columns of X. The inverse matrix, A⁻¹, apart from the multiplier σ², represents the variance-covariance matrix of the estimated regression coefficients. That is, the elements of the matrix A⁻¹σ² display the variances of b0, b1, ..., bk on the main diagonal and covariances on the off-diagonal. For example, in a k = 2 multiple linear regression problem, we might write
                ⎡ c_{00}  c_{01}  c_{02} ⎤
    (X′X)⁻¹ =   ⎢ c_{10}  c_{11}  c_{12} ⎥
                ⎣ c_{20}  c_{21}  c_{22} ⎦

with the elements below the main diagonal determined through the symmetry of the matrix. Then we can write

    σ²_{b_i} = c_{ii} σ²,   i = 0, 1, 2,
    σ_{b_i b_j} = Cov(b_i, b_j) = c_{ij} σ²,   i ≠ j.

Of course, the estimates of the variances and hence the standard errors of these estimators are obtained by replacing σ² with the appropriate estimate obtained through experimental data. An unbiased estimate of σ² is once again defined in
terms of the error sum of squares, which is computed using the formula established in Theorem 12.1. In the theorem, we are making the assumptions on the ε_i described above.

Theorem 12.1: For the linear regression equation

    y = Xβ + ε,

an unbiased estimate of σ² is given by the error or residual mean square

    s² = SSE / (n − k − 1),   where   SSE = Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_i − ŷ_i)².
We can see that Theorem 12.1 represents a generalization of Theorem 11.1 for the simple linear regression case. The proof is left for the reader. As in the simpler linear regression case, the estimate s2 is a measure of the variation in the prediction errors or residuals. Other important inferences regarding the fitted regression equation, based on the values of the individual residuals ei = yi − yˆi, i = 1, 2, . . . , n, are discussed in Sections 12.10 and 12.11.
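A sketch of these computations follows (placeholder data; numpy assumed): the matrix of c_{ij} values comes from (X′X)⁻¹, and s² follows Theorem 12.1.

    import numpy as np

    def coef_covariance(X, y):
        n, p = X.shape                  # p = k + 1 parameters
        C = np.linalg.inv(X.T @ X)      # the matrix of c_ij values
        b = C @ X.T @ y                 # least squares estimates
        resid = y - X @ b
        s2 = resid @ resid / (n - p)    # s^2 = SSE/(n - k - 1)
        return b, s2 * C                # variance-covariance matrix of b

    # Placeholder data for illustration.
    rng = np.random.default_rng(7)
    X = np.column_stack([np.ones(12), rng.uniform(0, 5, size=(12, 2))])
    y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(0, 0.2, size=12)

    b, cov_b = coef_covariance(X, y)
    print(np.sqrt(np.diag(cov_b)))      # standard errors of b0, b1, b2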
The error and regression sums of squares take on the same form and play the same role as in the simple linear regression case. In fact, the sum-of-squares identity

    Σ_{i=1}^{n} (y_i − ȳ)² = Σ_{i=1}^{n} (ŷ_i − ȳ)² + Σ_{i=1}^{n} (y_i − ŷ_i)²

continues to hold, and we retain our previous notation, namely

    SST = SSR + SSE,

with

    SST = Σ_{i=1}^{n} (y_i − ȳ)² = total sum of squares

and

    SSR = Σ_{i=1}^{n} (ŷ_i − ȳ)² = regression sum of squares.
There are k degrees of freedom associated with SSR, and, as always, SST has n − 1 degrees of freedom. Therefore, after subtraction, SSE has n − k − 1 degrees of freedom. Thus, our estimate of σ2 is again given by the error sum of squares divided by its degrees of freedom. All three of these sums of squares will appear on the printouts of most multiple regression computer packages. Note that the condition n > k in Section 12.2 guarantees that the degrees of freedom of SSE cannot be negative.
Analysis of Variance in Multiple Regression
The partition of the total sum of squares into its components, the regression and error sums of squares, plays an important role. An analysis of variance can be conducted to shed light on the quality of the regression equation. A useful hypothesis that determines if a significant amount of variation is explained by the model is
    H0: β1 = β2 = β3 = ··· = βk = 0.
The analysis of variance involves an F-test via a table given as follows:

Source       Sum of Squares   Degrees of Freedom   Mean Squares                F
Regression   SSR              k                    MSR = SSR/k                 f = MSR/MSE
Error        SSE              n − (k + 1)          MSE = SSE/[n − (k + 1)]
Total        SST              n − 1
This test is an upper-tailed test. Rejection of H0 implies that the regression equation differs from a constant. That is, at least one regressor variable is important. More discussion of the use of analysis of variance appears in subsequent sections.
Further utility of the mean square error (or residual mean square) lies in its use in hypothesis testing and confidence interval estimation, which is discussed in Sec- tion 12.5. In addition, the mean square error plays an important role in situations where the scientist is searching for the best from a set of competing models. Many model-building criteria involve the statistic s2. Criteria for comparing competing models are discussed in Section 12.11.
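A minimal sketch of this F-test follows (placeholder data; numpy and scipy assumed):

    import numpy as np
    from scipy import stats

    def regression_f_test(X, y):
        # X must include the intercept column; k = number of regressors.
        n, p = X.shape
        k = p - 1
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        yhat = X @ b
        sse = np.sum((y - yhat) ** 2)
        ssr = np.sum((yhat - y.mean()) ** 2)
        f = (ssr / k) / (sse / (n - k - 1))
        p_value = stats.f.sf(f, k, n - k - 1)   # upper-tailed test
        return f, p_value

    # Placeholder data for illustration.
    rng = np.random.default_rng(3)
    X = np.column_stack([np.ones(15), rng.uniform(size=(15, 2))])
    y = 2.0 + 3.0 * X[:, 1] + rng.normal(0, 0.3, size=15)
    print(regression_f_test(X, y))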
12.5 Inferences in Multiple Linear Regression
A knowledge of the distributions of the individual coefficient estimators enables the experimenter to construct confidence intervals for the coefficients and to test hypotheses about them. Recall from Section 12.4 that the bj (j = 0, 1, 2, . . . , k) are normally distributed with mean βj and variance cjjσ2. Thus, we can use the statistic
    t = (b_j − β_{j0}) / (s √c_{jj})
with n − k − 1 degrees of freedom to test hypotheses and construct confidence intervals on βj. For example, if we wish to test
    H0: βj = βj0,
    H1: βj ≠ βj0,
we compute the above t-statistic and do not reject H0 if −t_{α/2} < t < t_{α/2}, where t_{α/2} has n − k − 1 degrees of freedom.

Example 12.5: For the model of Example 12.4, test the hypothesis that β2 = −2.5 at the 0.05 level of significance against the alternative that β2 > −2.5.
Solution: The hypotheses are

    H0: β2 = −2.5,
    H1: β2 > −2.5.

Computations:

    t = (b_2 − β_{20}) / (s √c_{22}) = (−1.8616 + 2.5) / (2.073 √0.0166) = 2.390,
    P = P(T > 2.390) = 0.04.

Decision: Reject H0 and conclude that β2 > −2.5.

Individual t-Tests for Variable Screening

The t-test most often used in multiple regression is the one that tests the importance of individual coefficients (i.e., H0: βj = 0 against the alternative H1: βj ≠ 0). These tests often contribute to what is termed variable screening, where the analyst attempts to arrive at the most useful model (i.e., the choice of which regressors to use). It should be emphasized here that if a coefficient is found insignificant (i.e., the hypothesis H0: βj = 0 is not rejected), the conclusion drawn is that the variable is insignificant (i.e., explains an insignificant amount of variation in y) in the presence of the other regressors in the model. This point will be reaffirmed in a future discussion.

Inferences on Mean Response and Prediction

One of the most useful inferences that can be made regarding the quality of the predicted response ŷ0 corresponding to the values x10, x20, ..., xk0 is the confidence interval on the mean response μ_{Y|x10,x20,...,xk0}. We are interested in constructing a confidence interval on the mean response for the set of conditions given by

    x′_0 = [1, x_{10}, x_{20}, ..., x_{k0}].

We augment the conditions on the x's by the number 1 in order to facilitate the matrix notation. Normality in the ε_i produces normality in the b_j, and the mean and variance are still the same as indicated in Section 12.4. So is the covariance between b_i and b_j, for i ≠ j. Hence,

    ŷ_0 = b_0 + Σ_{j=1}^{k} b_j x_{j0}

is likewise normally distributed and is, in fact, an unbiased estimator for the mean response on which we are attempting to attach a confidence interval. The variance of ŷ_0, written in matrix notation simply as a function of σ², (X′X)⁻¹, and the condition vector x′_0, is

    σ²_{ŷ_0} = σ² x′_0 (X′X)⁻¹ x_0.
If this expression is expanded for a given case, say k = 2, it is readily seen that it appropriately accounts for the variance of the b_j and the covariance of b_i and b_j, for i ≠ j. After σ² is replaced by s² as given by Theorem 12.1, the 100(1 − α)% confidence interval on μ_{Y|x10,x20,...,xk0} can be constructed from the statistic

    T = (ŷ_0 − μ_{Y|x10,x20,...,xk0}) / (s √(x′_0 (X′X)⁻¹ x_0)),

which has a t-distribution with n − k − 1 degrees of freedom.

Confidence Interval for μ_{Y|x10,x20,...,xk0}: A 100(1 − α)% confidence interval for the mean response μ_{Y|x10,x20,...,xk0} is

    ŷ_0 − t_{α/2} s √(x′_0 (X′X)⁻¹ x_0) < μ_{Y|x10,x20,...,xk0} < ŷ_0 + t_{α/2} s √(x′_0 (X′X)⁻¹ x_0),

where t_{α/2} is a value of the t-distribution with n − k − 1 degrees of freedom.

Figure 12.1: SAS printout for data in Example 12.4. (The printout shows the analysis of variance: Model with DF 3, Sum of Squares 399.45437, Mean Square 133.15146, F Value 30.98, Pr > F <.0001; Error with DF 9, Sum of Squares 38.67640, Mean Square 4.29738. Summary statistics: Root MSE 2.07301, R-Square 0.9117, Dependent Mean 29.03846, Adj R-Sq 0.8823, Coeff Var 7.13885. Parameter estimates:

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1        39.15735            5.88706        6.65     <.0001
x1           1         1.01610            0.19090        5.32     0.0005
x2           1        −1.86165            0.26733       −6.96     <.0001
x3           1        −0.34326            0.61705       −0.56     0.5916

For each observation, the printout also lists the predicted value with its standard error, 95% confidence limits on the mean response, 95% prediction limits, and the residual.)

More on Analysis of Variance in Multiple Regression (Optional)

In Section 12.4, we discussed briefly the partition of the total sum of squares Σ_{i=1}^{n} (y_i − ȳ)² into its two components, the regression model and error sums of squares (illustrated in Figure 12.1). The analysis of variance leads to a test of

    H0: β1 = β2 = β3 = ··· = βk = 0.

Rejection of the null hypothesis has an important interpretation for the scientist or engineer. (For those who are interested in more extensive treatment of the subject using matrices, it is useful to discuss the development of these sums of squares used in ANOVA.) First, recall from Section 12.3 that b, the vector of least squares estimators, is given by

    b = (X′X)⁻¹X′y.

A partition of the uncorrected sum of squares

    y′y = Σ_{i=1}^{n} y_i²

into two components is given by

    y′y = b′X′y + (y′y − b′X′y) = y′X(X′X)⁻¹X′y + [y′y − y′X(X′X)⁻¹X′y].

The second term (in brackets) on the right-hand side is simply the error sum of squares Σ_{i=1}^{n} (y_i − ŷ_i)². The reader should see that an alternative expression for the error sum of squares is

    SSE = y′[I_n − X(X′X)⁻¹X′]y.

The term y′X(X′X)⁻¹X′y is called the regression sum of squares. However, it is not the expression Σ_{i=1}^{n} (ŷ_i − ȳ)² used for testing the "importance" of the terms b1, b2, ..., bk but, rather,

    y′X(X′X)⁻¹X′y = Σ_{i=1}^{n} ŷ_i²,

which is a regression sum of squares uncorrected for the mean. As such, it would only be used in testing if the regression equation differs significantly from zero, that is,

    H0: β0 = β1 = β2 = ··· = βk = 0.

In general, this is not as important as testing

    H0: β1 = β2 = ··· = βk = 0,

since the latter states that the mean response is a constant, not necessarily zero.

Degrees of Freedom

Thus, the partition of sums of squares and degrees of freedom reduces to

Source       Sum of Squares                                     d.f.
Regression   Σ ŷ_i² = y′X(X′X)⁻¹X′y                             k + 1
Error        Σ (y_i − ŷ_i)² = y′[I_n − X(X′X)⁻¹X′]y             n − (k + 1)
Total        Σ y_i² = y′y                                       n

Hypothesis of Interest

Now, of course, the hypotheses of interest for an ANOVA must eliminate the role of the intercept described previously. Strictly speaking, if H0: β1 = β2 = ··· = βk = 0, then the estimated regression line is merely ŷ_i = ȳ. As a result, we are actually seeking evidence that the regression equation "varies from a constant." Thus, the total and regression sums of squares must be corrected for the mean. As a result, we have

    Σ_{i=1}^{n} (y_i − ȳ)² = Σ_{i=1}^{n} (ŷ_i − ȳ)² + Σ_{i=1}^{n} (y_i − ŷ_i)².

In matrix notation this is simply

    y′[I_n − 1(1′1)⁻¹1′]y = y′[X(X′X)⁻¹X′ − 1(1′1)⁻¹1′]y + y′[I_n − X(X′X)⁻¹X′]y.

In this expression, 1 is a vector of n ones. As a result, we are merely subtracting

    y′1(1′1)⁻¹1′y = (1/n) (Σ_{i=1}^{n} y_i)²

from y′y and from y′X(X′X)⁻¹X′y (i.e., correcting the total and regression sums of squares for the mean). Finally, the appropriate partitioning of sums of squares with degrees of freedom is as follows:
Source       Sum of Squares                                                  d.f.
Regression   Σ (ŷ_i − ȳ)² = y′[X(X′X)⁻¹X′ − 1(1′1)⁻¹1′]y                     k
Error        Σ (y_i − ŷ_i)² = y′[I_n − X(X′X)⁻¹X′]y                          n − (k + 1)
Total        Σ (y_i − ȳ)² = y′[I_n − 1(1′1)⁻¹1′]y                            n − 1

This is the ANOVA table that appears in the computer printout of Figure 12.1. The expression y′[1(1′1)⁻¹1′]y is often called the regression sum of squares associated with the mean, and 1 degree of freedom is allocated to it.

Exercises

12.17 For the data of Exercise 12.2 on page 450, estimate σ².

12.18 For the data of Exercise 12.1 on page 450, estimate σ².

12.19 For the data of Exercise 12.5 on page 450, estimate σ².

12.20 Obtain estimates of the variances and the covariance of the estimators b1 and b2 of Exercise 12.2 on page 450.

12.21 Referring to Exercise 12.5 on page 450, find the estimate of
(a) σ²_{b2};
(b) Cov(b1, b4).

12.22 For the model of Exercise 12.7 on page 451, test the hypothesis that β2 = 0 at the 0.05 level of significance against the alternative that β2 ≠ 0.

12.23 For the model of Exercise 12.2 on page 450, test the hypothesis that β1 = 0 at the 0.05 level of significance against the alternative that β1 ≠ 0.

12.24 For the model of Exercise 12.1 on page 450, test the hypothesis that β1 = 2 against the alternative that β1 ≠ 2. Use a P-value in your conclusion.

12.25 Using the data of Exercise 12.2 on page 450 and the estimate of σ² from Exercise 12.17, compute 95% confidence intervals for the predicted response and the mean response when x1 = 900 and x2 = 1.00.

12.26 For Exercise 12.8 on page 451, construct a 90% confidence interval for the mean compressive strength when the concentration is x = 19.5 and a quadratic model is used.

12.27 Using the data of Exercise 12.5 on page 450 and the estimate of σ² from Exercise 12.19, compute 95% confidence intervals for the predicted response and the mean response when x1 = 75, x2 = 24, x3 = 90, and x4 = 98.

12.28 Consider the following data from Exercise 12.13 on page 452.

y (wear)            193   230   172   91    113   125
x1 (oil viscosity)  1.6   15.5  22.0  43.0  33.0  40.0
x2 (load)           851   816   1058  1201  1357  1115

(a) Estimate σ² using multiple regression of y on x1 and x2.
(b) Compute predicted values, a 95% confidence interval for mean wear, and a 95% prediction interval for observed wear if x1 = 20 and x2 = 1000.

12.29 Using the data from Exercise 12.28, test the following at the 0.05 level.
(a) H0: β1 = 0 versus H1: β1 ≠ 0;
(b) H0: β2 = 0 versus H1: β2 ≠ 0;
(c) Do you have any reason to believe that the model in Exercise 12.28 should be changed? Why or why not?

12.30 Use the data from Exercise 12.16 on page 453.
(a) Estimate σ² using the multiple regression of y on x1, x2, and x3.
(b) Compute a 95% prediction interval for the observed gain with the three regressors at x1 = 15.0, x2 = 220.0, and x3 = 6.0.

12.6 Choice of a Fitted Model through Hypothesis Testing

In many regression situations, individual coefficients are of importance to the experimenter. For example, in an economics application, β1, β2, ... might have some particular significance, and thus confidence intervals and tests of hypotheses on these parameters would be of interest to the economist. However, consider an industrial chemical situation in which the postulated model assumes that reaction yield is linearly dependent on reaction temperature and concentration of a certain catalyst. It is probably known that this is not the true model but an adequate approximation, so interest is likely to be not in the individual parameters but rather in the ability of the entire function to predict the true response in the range of the variables considered. Therefore, in this situation, one would put more emphasis on σ²_{Ŷ}, confidence intervals on the mean response, and so forth, and likely deemphasize inferences on individual parameters.

The experimenter using regression analysis is also interested in deletion of variables when the situation dictates that, in addition to arriving at a workable prediction equation, he or she must find the "best regression" involving only variables that are useful predictors. There are a number of computer programs that sequentially arrive at the so-called best regression equation depending on certain criteria. We discuss this further in Section 12.9.

One criterion that is commonly used to illustrate the adequacy of a fitted regression model is the coefficient of determination, or R².

Coefficient of Determination, or R²:

    R² = SSR/SST = Σ_{i=1}^{n} (ŷ_i − ȳ)² / Σ_{i=1}^{n} (y_i − ȳ)² = 1 − SSE/SST.

Note that this parallels the description of R² in Chapter 11. At this point the explanation might be clearer since we now focus on SSR as the variability explained. The quantity R² merely indicates what proportion of the total variation in the response Y is explained by the fitted model. Often an experimenter will report R² × 100% and interpret the result as percent variation explained by the postulated model. The square root of R² is called the multiple correlation coefficient between Y and the set x1, x2, ..., xk. The value of R² for the case in Example 12.4, indicating the proportion of variation explained by the three independent variables x1, x2, and x3, is

    R² = SSR/SST = 399.45/438.13 = 0.9117,

which means that 91.17% of the variation in percent survival has been explained by the linear regression model.

The regression sum of squares can be used to give some indication concerning whether or not the model is an adequate explanation of the true situation. We can test the hypothesis H0 that the regression is not significant by merely forming the ratio

    f = (SSR/k) / (SSE/(n − k − 1)) = (SSR/k) / s²

and rejecting H0 at the α-level of significance when f > f_α(k, n − k − 1). For the data of Example 12.4, we obtain
    f = (399.45/3) / 4.298 = 30.98.
From the printout in Figure 12.1, the P -value is less than 0.0001. This should not be misinterpreted. Although it does indicate that the regression explained by the model is significant, this does not rule out the following possibilities:
1. The linear regression model for this set of x’s is not the only model that can be used to explain the data; indeed, there may be other models with transformations on the x’s that give a larger value of the F-statistic.
2. The model might have been more effective with the inclusion of other variables in addition to x1, x2, and x3 or perhaps with the deletion of one or more of the variables in the model, say x3, which has a P = 0.5916.
The reader should recall the discussion in Section 11.5 regarding the pitfalls in the use of R2 as a criterion for comparing competing models. These pitfalls are certainly relevant in multiple linear regression. In fact, in its employment in multiple regression, the dangers are even more pronounced since the temptation
to overfit is so great. One should always keep in mind that R2 ≈ 1.0 can always be achieved at the expense of error degrees of freedom when an excess of model terms is employed. However, R2 = 1, describing a model with a near perfect fit, does not always result in a model that predicts well.
The Adjusted Coefficient of Determination (R²_adj)

In Chapter 11, several figures displaying computer printout from both SAS and MINITAB featured a statistic called adjusted R², or the adjusted coefficient of determination. Adjusted R² is a variation on R² that provides an adjustment for degrees of freedom. The coefficient of determination as defined on page 407 cannot decrease as terms are added to the model. In other words, R² does not decrease as the error degrees of freedom n − k − 1 are reduced, the latter result being produced by an increase in k, the number of model terms. Adjusted R² is computed by dividing SSE and SST by their respective degrees of freedom as follows:

Adjusted R²:

    R²_adj = 1 − [SSE/(n − k − 1)] / [SST/(n − 1)].
To illustrate the use of R²_adj, Example 12.4 will be revisited.

How Are R² and R²_adj Affected by Removal of x3?

The t-test (or corresponding F-test) for x3 suggests that a simpler model involving only x1 and x2 may well be an improvement. In other words, the complete model with all the regressors may be an overfitted model. It is certainly of interest to investigate R² and R²_adj for both the full (x1, x2, x3) and the reduced (x1, x2) models. We already know that R²_full = 0.9117 from Figure 12.1. The SSE for the reduced model is 40.01, and thus

    R²_reduced = 1 − 40.01/438.13 = 0.9087.

Thus, more variability is explained with x3 in the model. However, as we have indicated, this will occur even if the model is an overfitted model. Now, of course, R²_adj is designed to provide a statistic that punishes an overfitted model, so we might expect it to favor the reduced model. Indeed, for the full model

    R²_adj = 1 − (38.6764/9)/(438.1308/12) = 1 − 4.2974/36.5109 = 0.8823,

whereas for the reduced model (deletion of x3)

    R²_adj = 1 − (40.01/10)/(438.1308/12) = 1 − 4.001/36.5109 = 0.8904.

Thus, R²_adj does indeed favor the reduced model and confirms the evidence produced by the t- and F-tests, suggesting that the reduced model is preferable to the model containing all three regressors. The reader may expect that other statistics would suggest rejection of the overfitted model. See Exercise 12.40 on page 471.
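The comparison above is easy to script; the sketch below simply recomputes R² and R²_adj from the error and total sums of squares quoted in this section for Example 12.4 (SST = 438.1308, n = 13):

    def r2_and_adj(sse, sst, n, k):
        # R^2 and its degrees-of-freedom-adjusted counterpart.
        r2 = 1.0 - sse / sst
        r2_adj = 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))
        return r2, r2_adj

    print(r2_and_adj(38.6764, 438.1308, n=13, k=3))  # full: 0.9117, 0.8823
    print(r2_and_adj(40.01,   438.1308, n=13, k=2))  # reduced: 0.9087, 0.8904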
Test on an Individual Coefficient
The addition of any single variable to a regression system will increase the regression sum of squares and thus reduce the error sum of squares. Consequently, we must decide whether the increase in regression is sufficient to warrant using the variable in the model. As we might expect, the use of unimportant variables can reduce the effectiveness of the prediction equation by increasing the variance of the estimated response. We shall pursue this point further by considering the importance of x3 in Example 12.4. Initially, we can test

    H0: β3 = 0,
    H1: β3 ≠ 0

by using the t-distribution with 9 degrees of freedom. We have

    t = (b_3 − 0) / (s √c_{33}) = −0.3433 / (2.073 √0.0886) = −0.556,

which indicates that β3 does not differ significantly from zero, and hence we may very well feel justified in removing x3 from the model. Suppose that we consider the regression of Y on the set (x1, x2), the least squares normal equations now reducing to

    ⎡ 13.0    59.43     81.82  ⎤ ⎡ b_0 ⎤   ⎡  377.50   ⎤
    ⎢ 59.43  394.7255  360.6621⎥ ⎢ b_1 ⎥ = ⎢ 1877.5670 ⎥ .
    ⎣ 81.82  360.6621  576.7264⎦ ⎣ b_2 ⎦   ⎣ 2246.6610 ⎦

The estimated regression coefficients for this reduced model are

    b_0 = 36.094,  b_1 = 1.031,  b_2 = −1.870,

and the resulting regression sum of squares with 2 degrees of freedom is

    R(β1, β2) = 398.12.

Here we use the notation R(β1, β2) to indicate the regression sum of squares of the restricted model; it should not be confused with SSR, the regression sum of squares of the original model with 3 degrees of freedom. The new error sum of squares is then

    SST − R(β1, β2) = 438.13 − 398.12 = 40.01,

and the resulting mean square error with 10 degrees of freedom becomes

    s² = 40.01/10 = 4.001.

Does a Single Variable t-Test Have an F Counterpart?

From Example 12.4, the amount of variation in the percent survival that is attributed to x3, in the presence of the variables x1 and x2, is

    R(β3 | β1, β2) = SSR − R(β1, β2) = 399.45 − 398.12 = 1.33,
which represents a small proportion of the entire regression variation. This amount of added regression is statistically insignificant, as indicated by our previous test on β3. An equivalent test involves the formation of the ratio
    f = R(β3 | β1, β2) / s² = 1.33/4.298 = 0.309,
which is a value of the F-distribution with 1 and 9 degrees of freedom. Recall that the basic relationship between the t-distribution with v degrees of freedom and the F-distribution with 1 and v degrees of freedom is
    t² = f(1, v),
and note that the f-value of 0.309 is indeed the square of the t-value of −0.56.
To generalize the concepts above, we can assess the work of an independent variable x_i in the general multiple linear regression model

    μ_{Y|x1,x2,...,xk} = β0 + β1x1 + ··· + βkxk

by observing the amount of regression attributed to x_i over and above that attributed to the other variables, that is, the regression on x_i adjusted for the other variables. For example, we say that x1 is assessed by calculating

    R(β1 | β2, β3, ..., βk) = SSR − R(β2, β3, ..., βk),

where R(β2, β3, ..., βk) is the regression sum of squares with β1x1 removed from the model. To test the hypothesis

    H0: β1 = 0,
    H1: β1 ≠ 0,

we compute

    f = R(β1 | β2, β3, ..., βk) / s²

and compare it with f_α(1, n − k − 1).

Partial F-Tests on Subsets of Coefficients
In a similar manner, we can test for the significance of a set of the variables. For example, to investigate simultaneously the importance of including x1 and x2 in the model, we test the hypothesis
H0: β1 = β2 = 0,
H1: β1 and β2 are not both zero,
by computing
    f = [R(β1, β2 | β3, β4, ..., βk)]/2 / s² = [SSR − R(β3, β4, ..., βk)]/2 / s²
and comparing it with fα(2, n−k−1). The number of degrees of freedom associated with the numerator, in this case 2, equals the number of variables in the set being investigated.
Suppose we wish to test the hypothesis

    H0: β2 = β3 = 0,
    H1: β2 and β3 are not both zero,

for Example 12.4. If we develop the regression model
y = β0 + β1×1 + ε,
we can obtain R(β1) = SSR_reduced = 187.31179. From Figure 12.1 on page 459, we have s² = 4.29738 for the full model. Hence, the f-value for testing the hypothesis is

    f = R(β2, β3 | β1) / (2s²) = [R(β1, β2, β3) − R(β1)] / (2s²) = (SSR_full − SSR_reduced) / (2s²)
      = (399.45437 − 187.31179) / (2 × 4.29738) = 24.68278.
This implies that β2 and β3 are not simultaneously zero. Using statistical software such as SAS one can directly obtain the above result with a P-value of 0.0002. Readers should note that in statistical software package output there are P -values associated with each individual model coefficient. The null hypothesis for each is that the coefficient is zero. However, it should be noted that the insignificance of any coefficient does not necessarily imply that it does not belong in the final model. It merely suggests that it is insignificant in the presence of all other variables in the problem. The case study at the end of this chapter illustrates this further.
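The partial F computation itself is a one-liner; the sketch below reuses the sums of squares quoted above for Example 12.4:

    # Book values for Example 12.4: SSR of the full (x1, x2, x3) model,
    # R(beta1) for the model with x1 alone, and s^2 for the full model.
    ssr_full, ssr_reduced, s2 = 399.45437, 187.31179, 4.29738
    r = 2                                   # number of coefficients tested
    f = (ssr_full - ssr_reduced) / r / s2
    print(f)                                # roughly 24.68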
12.7 Special Case of Orthogonality (Optional)
Prior to our original development of the general linear regression problem, the assumption was made that the independent variables are measured without error and are often controlled by the experimenter. Quite often they occur as a result of an elaborately designed experiment. In fact, we can increase the effectiveness of the resulting prediction equation with the use of a suitable experimental plan.
Suppose that we once again consider the X matrix as defined in Section 12.3. We can rewrite it as
X = [1,×1,x2,…,xk],
where 1 represents a column of ones and xj is a column vector representing the
levels of xj. If
    x′_p x_q = 0,   for p ≠ q,

the variables x_p and x_q are said to be orthogonal to each other. There are certain obvious advantages to having a completely orthogonal situation where x′_p x_q = 0
for all possible p and q, p ≠ q, and, in addition,
    Σ_{i=1}^{n} x_{ji} = 0,   j = 1, 2, ..., k.
The resulting X′X is a diagonal matrix, and the normal equations in Section 12.3
reduce to
    n b_0 = Σ_{i=1}^{n} y_i,   b_1 Σ_{i=1}^{n} x_{1i}² = Σ_{i=1}^{n} x_{1i}y_i,   ...,   b_k Σ_{i=1}^{n} x_{ki}² = Σ_{i=1}^{n} x_{ki}y_i.
An important advantage is that one is easily able to partition SSR into single- degree-of-freedom components, each of which corresponds to the amount of variation in Y accounted for by a given controlled variable. In the orthogonal situation, we can write
    SSR = Σ_{i=1}^{n} (ŷ_i − ȳ)² = Σ_{i=1}^{n} (b_0 + b_1 x_{1i} + ··· + b_k x_{ki} − b_0)²
        = b_1² Σ_{i=1}^{n} x_{1i}² + b_2² Σ_{i=1}^{n} x_{2i}² + ··· + b_k² Σ_{i=1}^{n} x_{ki}²
        = R(β1) + R(β2) + ··· + R(βk).
The quantity R(βi) is the amount of the regression sum of squares associated with a model involving a single independent variable xi.
To test simultaneously for the significance of a set of m variables in an orthogonal situation, the regression sum of squares becomes

    R(β1, β2, ..., βm | β_{m+1}, β_{m+2}, ..., βk) = R(β1) + R(β2) + ··· + R(βm),

and thus we have the further simplification
R(β1 | β2,β3,…,βk) = R(β1)
when evaluating a single independent variable. Therefore, the contribution of a given variable or set of variables is essentially found by ignoring the other variables in the model. Independent evaluations of the worth of the individual variables are accomplished using analysis-of-variance techniques, as given in Table 12.4. The total variation in the response is partitioned into single-degree-of-freedom components plus the error term with n − k − 1 degrees of freedom. Each computed f-value is used to test one of the hypotheses
    H0: β_i = 0,
    H1: β_i ≠ 0,      i = 1, 2, ..., k,

by comparing with the critical point f_α(1, n − k − 1) or merely interpreting the P-value computed from the f-distribution.
Table 12.4: Analysis of Variance for Orthogonal Variables

Source of    Sum of                        Degrees of   Mean                    Computed
Variation    Squares                       Freedom      Square                  f
β1           R(β1) = b_1² Σ x_{1i}²        1            R(β1)                   R(β1)/s²
β2           R(β2) = b_2² Σ x_{2i}²        1            R(β2)                   R(β2)/s²
⋮            ⋮                             ⋮            ⋮                       ⋮
βk           R(βk) = b_k² Σ x_{ki}²        1            R(βk)                   R(βk)/s²
Error        SSE                           n − k − 1    s² = SSE/(n − k − 1)
Total        SST = S_yy                    n − 1
Example 12.8: Suppose that a scientist takes experimental data on the radius of a propellant grain Y as a function of powder temperature x1, extrusion rate x2, and die temperature x3. Fit a linear regression model for predicting grain radius, and determine the effectiveness of each variable in the model. The data are given in Table 12.5.
Table 12.5: Data for Example 12.8

Grain Radius   Powder Temperature   Extrusion Rate   Die Temperature
82             150 (−1)             12 (−1)          220 (−1)
93             190 (+1)             12 (−1)          220 (−1)
114            150 (−1)             24 (+1)          220 (−1)
124            150 (−1)             12 (−1)          250 (+1)
111            190 (+1)             24 (+1)          220 (−1)
129            190 (+1)             12 (−1)          250 (+1)
157            150 (−1)             24 (+1)          250 (+1)
164            190 (+1)             24 (+1)          250 (+1)
Solution : Note that each variable is controlled at two levels, and the experiment is composed of the eight possible combinations. The data on the independent variables are coded for convenience by means of the following formulas:
    x1 = (powder temperature − 170)/20,
    x2 = (extrusion rate − 18)/6,
    x3 = (die temperature − 235)/15.
The resulting levels of x1, x2, and x3 take on the values −1 and +1 as indicated in the table of data. This particular experimental design affords the orthogonality that we want to illustrate here. (A more thorough treatment of this type of experimental layout appears in Chapter 15.) The X matrix is

        ⎡ 1  −1  −1  −1 ⎤
        ⎢ 1   1  −1  −1 ⎥
        ⎢ 1  −1   1  −1 ⎥
    X = ⎢ 1  −1  −1   1 ⎥ ,
        ⎢ 1   1   1  −1 ⎥
        ⎢ 1   1  −1   1 ⎥
        ⎢ 1  −1   1   1 ⎥
        ⎣ 1   1   1   1 ⎦
and the orthogonality conditions are readily verified. We can now compute coefficients
    b_0 = (1/8) Σ_{i=1}^{8} y_i = 121.75,
    b_1 = (1/8) Σ_{i=1}^{8} x_{1i}y_i = 20/8 = 2.5,
    b_2 = (1/8) Σ_{i=1}^{8} x_{2i}y_i = 118/8 = 14.75,
    b_3 = (1/8) Σ_{i=1}^{8} x_{3i}y_i = 174/8 = 21.75,

so in terms of the coded variables, the prediction equation is

    ŷ = 121.75 + 2.5 x_1 + 14.75 x_2 + 21.75 x_3.
The analysis of variance in Table 12.6 shows independent contributions to SSR for each variable. The results, when compared to the f0.05(1,4) critical point of 7.71, indicate that x1 does not contribute significantly at the 0.05 level, whereas variables x2 and x3 are significant. In this example, the estimate for σ2 is 23.1250. As for the single independent variable case, it should be pointed out that this estimate does not solely contain experimental error variation unless the postulated model is correct. Otherwise, the estimate is “contaminated” by lack of fit in addition to pure error, and the lack of fit can be separated out only if we obtain multiple experimental observations for the various (x1, x2, x3) combinations.
Table 12.6: Analysis of Variance for Grain Radius Data

Source of    Sum of                   Degrees of   Mean      Computed
Variation    Squares                  Freedom      Squares   f          P-Value
β1           (2.5)²(8) = 50.00        1            50.00     2.16       0.2156
β2           (14.75)²(8) = 1740.50    1            1740.50   75.26      0.0010
β3           (21.75)²(8) = 3784.50    1            3784.50   163.65     0.0002
Error        92.50                    4            23.13
Total        5667.50                  7
Since x1 is not significant, it can simply be eliminated from the model without altering the effects of the other variables. Note that x2 and x3 both impact the grain radius in a positive fashion, with x3 being the more important factor based on the smallness of its P-value.
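Because the design is orthogonal, the whole analysis reduces to column sums; the sketch below (numpy assumed) reproduces the coefficients and the single-degree-of-freedom sums of squares of Table 12.6 from the coded data of Table 12.5:

    import numpy as np

    # Coded design and responses from Table 12.5.
    y  = np.array([82, 93, 114, 124, 111, 129, 157, 164], dtype=float)
    x1 = np.array([-1, 1, -1, -1, 1, 1, -1, 1], dtype=float)
    x2 = np.array([-1, -1, 1, -1, 1, -1, 1, 1], dtype=float)
    x3 = np.array([-1, -1, -1, 1, -1, 1, 1, 1], dtype=float)

    # With orthogonal +/-1 columns, each b_j is a scaled column sum.
    b0 = y.mean()                               # 121.75
    b1, b2, b3 = (x1 @ y) / 8, (x2 @ y) / 8, (x3 @ y) / 8
    print(b0, b1, b2, b3)                       # 121.75, 2.5, 14.75, 21.75

    # Single-degree-of-freedom components R(beta_j) = b_j^2 * 8.
    for b in (b1, b2, b3):
        print(b**2 * 8)                         # 50.00, 1740.50, 3784.50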
Exercises
12.31 Compute and interpret the coefficient of multiple determination for the variables of Exercise 12.1 on page 450.
12.32 Test whether the regression explained by the model in Exercise 12.1 on page 450 is significant at the 0.01 level of significance.
12.33 Test whether the regression explained by the model in Exercise 12.5 on page 450 is significant at the 0.01 level of significance.
12.34 For the model of Exercise 12.5 on page 450, test the hypothesis
H0: β1 =β2 =0,
H1: β1 and β2 are not both zero.
12.35 Repeat Exercise 12.17 on page 461 using an F-statistic.
12.36 A small experiment was conducted to fit a multiple regression equation relating the yield y to temperature x1, reaction time x2, and concentration of one of the reactants x3. Two levels of each variable were chosen, and measurements corresponding to the coded independent variables were recorded as follows:

y     x1  x2  x3
7.6   −1  −1  −1
8.4    1  −1  −1
9.2   −1   1  −1
10.3  −1  −1   1
9.8    1   1  −1
11.1   1  −1   1
10.2  −1   1   1
12.6   1   1   1

(a) Using the coded variables, estimate the multiple linear regression equation μ_{Y|x1,x2,x3} = β0 + β1x1 + β2x2 + β3x3.
(b) Partition SSR, the regression sum of squares, into three single-degree-of-freedom components attributable to x1, x2, and x3, respectively. Show an analysis-of-variance table, indicating significance tests on each variable.

12.37 Consider the electric power data of Exercise 12.5 on page 450. Test H0: β1 = β2 = 0, making use of R(β1, β2 | β3, β4). Give a P-value, and draw conclusions.

12.38 Consider the data for Exercise 12.36. Compute the following:

    R(β1 | β0),            R(β1 | β0, β2, β3),
    R(β2 | β0, β1),        R(β2 | β0, β1, β3),
    R(β3 | β0, β1, β2),    R(β1, β2 | β3).

Comment.

12.39 Consider the data of Exercise 11.55 on page 437. Fit a regression model using weight and drive ratio as explanatory variables. Compare this model with the SLR (simple linear regression) model using weight alone. Use R², R²_adj, and any t-statistics (or F-statistics) you may need to compare the SLR with the multiple regression model.

12.40 Consider Example 12.4. Figure 12.1 on page 459 displays a SAS printout of an analysis of the model containing variables x1, x2, and x3. Focus on the confidence interval of the mean response μY at the (x1, x2, x3) locations representing the 13 data points. Consider an item in the printout indicated by C.V. This is the coefficient of variation, which is defined by

    C.V. = (s/ȳ) · 100,

where s = √s² is the root mean squared error. The coefficient of variation is often used as yet another criterion for comparing competing models. It is a scale-free quantity which expresses the estimate of σ, namely s, as a percent of the average response ȳ. In competition for the "best" among a group of competing models, one strives for the model with a small value of C.V. Do a regression analysis of the data set shown in Example 12.4 but eliminate x3. Compare the full (x1, x2, x3) model with the restricted (x1, x2) model and focus on two criteria: (i) C.V.; (ii) the widths of the confidence intervals on μY. For the second criterion you may want to use the average width. Comment.

12.41 Consider Example 12.3 on page 447. Compare the two competing models

    First order:   y_i = β0 + β1x_{1i} + β2x_{2i} + ε_i,
    Second order:  y_i = β0 + β1x_{1i} + β2x_{2i} + β11x_{1i}² + β22x_{2i}² + β12x_{1i}x_{2i} + ε_i.

Use R²_adj in your comparison. Test H0: β11 = β22 = β12 = 0. In addition, use the C.V. discussed in Exercise 12.40.
12.42 In Example 12.8, a case is made for eliminating x1, powder temperature, from the model since the P-value based on the F-test is 0.2156 while P-values for x2 and x3 are near zero.
(a) Reduce the model by eliminating x1, thereby producing a full and a restricted (or reduced) model, and compare them on the basis of R²_adj.
(b) Compare the full and restricted models using the width of the 95% prediction intervals on a new observation. The better of the two models would be that with the tighter prediction intervals. Use the average of the width of the prediction intervals.

12.43 Consider the data of Exercise 12.13 on page 452. Can the response, wear, be explained adequately by a single variable (either viscosity or load) in an SLR rather than with the full two-variable regression? Justify your answer thoroughly through tests of hypotheses as well as comparison of the three competing models.

12.44 For the data set given in Exercise 12.16 on page 453, can the response be explained adequately by any two regressor variables? Discuss.
12.8 Categorical or Indicator Variables
An extremely important special-case application of multiple linear regression occurs when one or more of the regressor variables are categorical, indicator, or dummy variables. In a chemical process, the engineer may wish to model the process yield against regressors such as process temperature and reaction time. However, there is interest in using two different catalysts and somehow including "the catalyst" in the model. The catalyst effect cannot be measured on a continuum and is hence a categorical variable. An analyst may wish to model the price of homes against regressors that include square feet of living space x1, the land acreage x2, and age of the house x3. These regressors are clearly continuous in nature. However, it is clear that cost of homes may vary substantially from one area of the country to another. If data are collected on homes in the east, midwest, south, and west, we have an indicator variable with four categories. In the chemical process example, if two catalysts are used, we have an indicator variable with two categories. In a biomedical example in which a drug is to be compared to a placebo, all subjects are evaluated on several continuous measurements such as age, blood pressure, and so on, as well as gender, which of course is categorical with two categories. So, included along with the continuous variables are two indicator variables: treatment with two categories (active drug and placebo) and gender with two categories (male and female).
Model with Categorical Variables
Let us use the chemical processing example to illustrate how indicator variables are involved in the model. Suppose y = yield and x1 = temperature and x2 = reaction time. Now let us denote the indicator variable by z. Let z = 0 for catalyst 1 and z = 1 for catalyst 2. The assignment of the (0, 1) indicator to the catalyst is arbitrary. As a result, the model becomes
yi = β0 + β1x1i + β2x2i + β3zi + εi,   i = 1, 2, ..., n.
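As a small computational sketch (with invented numbers, purely for illustration), the indicator simply enters the least squares fit as one more column of the design matrix:

import numpy as np

# Sketch: fitting yi = b0 + b1*x1 + b2*x2 + b3*z by least squares, where z is
# the 0/1 catalyst indicator. All data values below are hypothetical.
x1 = np.array([150.0, 160.0, 170.0, 180.0, 150.0, 160.0, 170.0, 180.0])  # temperature
x2 = np.array([1.0, 2.0, 1.5, 2.5, 2.0, 1.0, 2.5, 1.5])                  # reaction time
z = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])                   # catalyst indicator
y = np.array([75.2, 78.1, 80.9, 83.4, 79.8, 82.5, 85.0, 87.7])           # yield

X = np.column_stack([np.ones_like(y), x1, x2, z])  # design matrix with intercept
b, *_ = np.linalg.lstsq(X, y, rcond=None)          # least squares estimates
print(b)  # b[3] estimates the intercept shift attributable to catalyst 2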
Three Categories
The estimation of coefficients by the method of least squares continues to apply. In the case of three levels or categories of a single indicator variable, the model will include two regressors, say z1 and z2, where the (0, 1) assignment is as follows:

z1   z2
1    0    (category 1)
0    1    (category 2)
0    0    (category 3)
where 0 and 1 are vectors of 0’s and 1’s, respectively. In other words, if there are l categories, the model includes l − 1 actual model terms.
It may be instructive to look at a graphical representation of the model with three categories. For the sake of simplicity, let us assume a single continuous variable x. As a result, the model is given by
yi = β0 + β1xi + β2z1i + β3z2i + εi.

Thus, Figure 12.2 reflects the nature of the model. The following are the model expressions for the three categories:

E(Y) = (β0 + β2) + β1x,   category 1,
E(Y) = (β0 + β3) + β1x,   category 2,
E(Y) = β0 + β1x,          category 3.
As a result, the model involving categorical variables essentially involves a change in the intercept as we change from one category to another. Here of course we are assuming that the coefficients of continuous variables remain the same across the categories.
Figure 12.2: Case of three categories (three parallel regression lines of E(Y) against x, one for each of categories 1, 2, and 3).
Example 12.9: Consider the data in Table 12.7. The response y is the amount of suspended solids in a coal cleansing system. The variable x is the pH of the system. Three different polymers are used in the system. Thus, “polymer” is categorical with three categories and hence produces two model terms. The model is given by
yi = β0 + β1xi + β2z1i + β3z2i + εi,   i = 1, 2, ..., 18.

Here we have

z1 = 1 for polymer 1 and 0 otherwise;   z2 = 1 for polymer 2 and 0 otherwise.
From the analysis in Figure 12.3, the following conclusions are drawn. The coefficient b1 for pH is the estimate of the common slope that is assumed in the regression analysis. All model terms are statistically significant. Thus, pH and the nature of the polymer have an impact on the amount of cleansing. The signs and magnitudes of the coefficients of z1 and z2 indicate that polymer 1 is most effective (producing higher suspended solids) for cleansing, followed by polymer 2. Polymer 3 is least effective.
Table 12.7: Data for Example 12.9

x (pH)   y (amount of suspended solids)   Polymer
6.5      292                              1
6.9      329                              1
7.8      352                              1
8.4      378                              1
8.8      392                              1
9.2      410                              1
6.7      198                              2
6.9      227                              2
7.5      277                              2
7.9      297                              2
8.7      364                              2
9.2      375                              2
6.5      167                              3
7.0      225                              3
7.2      247                              3
7.6      268                              3
8.7      288                              3
9.2      342                              3

Figure 12.3: SAS printout for Example 12.9.

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3      80181.73127   26727.24376     73.68   <.0001
Error             14       5078.71318     362.76523
Corrected Total   17      85260.44444

R-Square 0.940433   Coeff Var 6.316049   Root MSE 19.04640   y Mean 301.5556

Parameter   Estimate        Standard Error   t Value   Pr > |t|
Intercept   −161.8973333    37.43315576      −4.32     0.0007
x             54.2940260     4.75541126      11.42     <.0001
z1            89.9980606    11.05228237       8.14     <.0001
z2            27.1656970    11.01042883       2.47     0.0271
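A minimal least squares sketch of this fit, assuming the transcription of Table 12.7 above, is the following; the estimates should agree, to rounding, with the Figure 12.3 printout:

import numpy as np

# Example 12.9 model: yi = b0 + b1*x + b2*z1 + b3*z2, data from Table 12.7.
x = np.array([6.5, 6.9, 7.8, 8.4, 8.8, 9.2,    # polymer 1
              6.7, 6.9, 7.5, 7.9, 8.7, 9.2,    # polymer 2
              6.5, 7.0, 7.2, 7.6, 8.7, 9.2])   # polymer 3
y = np.array([292, 329, 352, 378, 392, 410,
              198, 227, 277, 297, 364, 375,
              167, 225, 247, 268, 288, 342], dtype=float)
polymer = np.repeat([1, 2, 3], 6)

z1 = (polymer == 1).astype(float)  # 1 for polymer 1, 0 otherwise
z2 = (polymer == 2).astype(float)  # 1 for polymer 2, 0 otherwise
X = np.column_stack([np.ones_like(x), x, z1, z2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # expect approximately [-161.90, 54.29, 90.00, 27.17]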
Slope May Vary with Indicator Categories
In the discussion given here, we have assumed that the indicator variable model terms enter the model in an additive fashion. This suggests that the slopes, as in Figure 12.2, are constant across categories. Obviously, this is not always going to be the case. We can account for the possibility of varying slopes and indeed test for this condition of parallelism by including product or interaction terms between indicator terms and continuous variables. For example, suppose a model with one continuous regressor and an indicator variable with two levels is chosen. The model is given by
y = β0 + β1x + β2z + β3xz + ε.
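A sketch of this parallelism test, applied for illustration to the Table 12.7 polymer data (the text itself does not carry out this particular test on these data), compares the common-slope model with one that adds the products x·z1 and x·z2 and refers the partial F-statistic to fα(2, n − 6):

import numpy as np

# Parallelism check: compare SSE of the common-slope model with that of a
# model containing x*z1 and x*z2 interaction terms (Table 12.7 data).
x = np.array([6.5, 6.9, 7.8, 8.4, 8.8, 9.2, 6.7, 6.9, 7.5, 7.9, 8.7, 9.2,
              6.5, 7.0, 7.2, 7.6, 8.7, 9.2])
y = np.array([292, 329, 352, 378, 392, 410, 198, 227, 277, 297, 364, 375,
              167, 225, 247, 268, 288, 342], dtype=float)
polymer = np.repeat([1, 2, 3], 6)
z1, z2 = (polymer == 1).astype(float), (polymer == 2).astype(float)

def sse(X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r)

ones = np.ones_like(x)
X_red = np.column_stack([ones, x, z1, z2])                   # common slope
X_full = np.column_stack([ones, x, z1, z2, x * z1, x * z2])  # separate slopes

n = len(y)
s2_full = sse(X_full) / (n - X_full.shape[1])
f = ((sse(X_red) - sse(X_full)) / 2) / s2_full  # partial F with (2, n - 6) df
print(f)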
This model suggests that for category 1 (z = 1),

E(y) = (β0 + β2) + (β1 + β3)x,

while for category 2 (z = 0),

E(y) = β0 + β1x.

Thus, we allow for varying intercepts and slopes for the two categories. Figure 12.4 displays the regression lines with varying slopes for the two categories.

Figure 12.4: Nonparallelism in categorical variables (category 1 with slope β1 + β3, category 2 with slope β1).

In this case, β0, β1, and β2 are positive while β3 is negative with |β3| < β1. Obviously, if the interaction coefficient β3 is insignificant, we are back to the common slope model.

Exercises

12.45 A study was done to assess the cost effectiveness of driving a four-door sedan instead of a van or an SUV (sports utility vehicle). The continuous variables are odometer reading and octane of the gasoline used. The response variable is miles per gallon. The data are presented here.

(a) Fit a linear regression model including two indicator variables. Use (0, 0) to denote the four-door sedan.
(b) Which type of vehicle appears to get the best gas mileage?
(c) Discuss the difference between a van and an SUV in terms of gas mileage.

MPG    Car Type   Odometer   Octane
34.5   sedan       75,000    87.5
33.3   sedan       60,000    87.5
30.4   sedan       88,000    78.0
32.8   sedan       15,000    78.0
35.0   sedan       25,000    90.0
29.0   sedan       35,000    78.0
32.5   sedan      102,000    90.0
29.6   sedan       98,000    87.5
16.8   van         56,000    87.5
19.2   van         72,000    90.0
22.6   van         14,500    87.5
24.4   van         22,000    90.0
20.7   van         66,500    78.0
25.1   van         35,000    90.0
18.8   van         97,500    87.5
15.8   van         65,500    78.0
17.4   van         42,000    78.0
15.6   SUV         65,000    78.0
17.3   SUV         55,500    87.5
20.8   SUV         26,500    87.5
22.2   SUV         11,500    90.0
16.5   SUV         38,000    78.0
21.3   SUV         77,500    90.0
20.7   SUV         19,500    78.0
24.1   SUV         87,000    90.0

12.46 A study was done to determine whether the gender of the credit card holder was an important factor in generating profit for a certain credit card company. The variables considered were income, the number of family members, and the gender of the card holder. The data are as follows:

Profit   Income   Family Members   Gender
157      45,000   1                M
−181     55,000   2                M
−253     45,800   4                M
158      38,000   3                M
75       75,000   4                M
202      99,750   4                M
−451     28,000   1                M
146      39,000   2                M
89       54,350   1                M
−357     32,500   1                M
522      36,750   1                F
78       42,500   3                F
5        34,250   2                F
−177     36,750   3                F
123      24,500   2                F
251      27,500   1                F
−56      18,000   1                F
453      24,500   1                F
288      88,750   1                F
−104     19,750   2                F
(a) Fit a linear regression model using the variables available. Based on the fitted model, would the company prefer male or female customers?
(b) Would you say that income was an important factor in explaining the variability in profit?

12.9 Sequential Methods for Model Selection

At times, the significance tests outlined in Section 12.6 are quite adequate for determining which variables should be used in the final regression model. These tests are certainly effective if the experiment can be planned and the variables are orthogonal to each other. Even if the variables are not orthogonal, the individual t-tests can be of some use in many problems where the number of variables under investigation is small. However, there are many problems where it is necessary to use more elaborate techniques for screening variables, particularly when the experiment exhibits a substantial deviation from orthogonality. Useful measures of multicollinearity (linear dependency) among the independent variables are provided by the sample correlation coefficients rxixj. Since we are concerned only with linear dependency among independent variables, no confusion will result if we drop the x's from our notation and simply write rxixj = rij, where

rij = Sij / √(Sii Sjj).

Note that the rij do not give true estimates of population correlation coefficients in the strict sense, since the x's are actually not random variables in the context discussed here. Thus, the term correlation, although standard, is perhaps a misnomer.

When one or more of these sample correlation coefficients deviate substantially from zero, it can be quite difficult to find the most effective subset of variables for inclusion in our prediction equation. In fact, for some problems the multicollinearity will be so extreme that a suitable predictor cannot be found unless all possible subsets of the variables are investigated. Informative discussions of model selection in regression by Hocking (1976) are cited in the Bibliography. Procedures for detection of multicollinearity are discussed in the textbook by Myers (1990), also cited.

The user of multiple linear regression attempts to accomplish one of three objectives:
1. Obtain estimates of individual coefficients in a complete model.
2. Screen variables to determine which have a significant effect on the response.
3. Arrive at the most effective prediction equation.

In (1) it is known a priori that all variables are to be included in the model. In (2) prediction is secondary, while in (3) individual regression coefficients are not as important as the quality of the estimated response ŷ. For each of the situations above, multicollinearity in the experiment can have a profound effect on the success of the regression.

In this section, some standard sequential procedures for selecting variables are discussed. They are based on the notion that a single variable or a collection of variables should not appear in the estimating equation unless the variables result in a significant increase in the regression sum of squares or, equivalently, a significant increase in R², the coefficient of multiple determination.

Illustration of Variable Screening in the Presence of Collinearity

Example 12.10: Consider the data of Table 12.8, where measurements were taken for nine infants. The purpose of the experiment was to arrive at a suitable estimating equation relating the length of an infant to all or a subset of the independent variables. The sample correlation coefficients, indicating the linear dependency among the independent variables, are displayed in the symmetric matrix

       x1       x2       x3       x4
x1   1.0000   0.9523   0.5340   0.3900
x2   0.9523   1.0000   0.2626   0.1549
x3   0.5340   0.2626   1.0000   0.7847
x4   0.3900   0.1549   0.7847   1.0000

Table 12.8: Data Relating to Infant Length*

Infant   Length, y (cm)   Age, x1 (days)   Length at Birth, x2 (cm)   Weight at Birth, x3 (kg)   Chest Size at Birth, x4 (cm)
1        57.5              78              48.2                       2.75                       29.5
2        52.8              69              45.5                       2.15                       26.3
3        61.3              77              46.3                       4.41                       32.2
4        67.0              88              49.0                       5.52                       36.5
5        53.5              67              43.0                       3.21                       27.2
6        62.7              80              48.0                       4.32                       27.7
7        56.2              74              48.0                       2.31                       28.3
8        68.5              94              53.0                       4.30                       30.3
9        69.2             102              58.0                       3.71                       28.7

*Data analyzed by the Statistical Consulting Center, Virginia Tech, Blacksburg, Virginia.
Note that there appears to be an appreciable amount of multicollinearity. Using the least squares technique outlined in Section 12.2, the estimated regression equation was fitted using the complete model and is

ŷ = 7.1475 + 0.1000x1 + 0.7264x2 + 3.0758x3 − 0.0300x4.

The value of s² with 4 degrees of freedom is 0.7414, and the value for the coefficient of determination for this model is found to be 0.9907. Regression sums of squares, measuring the variation attributed to each individual variable in the presence of the others, and the corresponding t-values are given in Table 12.9.

Table 12.9: t-Values for the Regression Data of Table 12.8

Variable x1:  R(β1 | β2, β3, β4) = 0.0644,  t = 0.2947
Variable x2:  R(β2 | β1, β3, β4) = 0.6334,  t = 0.9243
Variable x3:  R(β3 | β1, β2, β4) = 6.2523,  t = 2.9040
Variable x4:  R(β4 | β1, β2, β3) = 0.0241,  t = −0.1805

A two-tailed critical region with 4 degrees of freedom at the 0.05 level of significance is given by |t| > 2.776. Of the four computed t-values, only that of variable x3 appears to be significant. However, recall that although the t-statistic described in Section 12.6 measures the worth of a variable adjusted for all other variables, it does not detect the potential importance of a variable in combination with a subset of the variables. For example, consider the model with only the variables x2 and x3 in the equation. The data analysis gives the regression function
ŷ = 2.1833 + 0.9576x2 + 3.3253x3,
with R² = 0.9905, certainly not a substantial reduction from R² = 0.9907 for the complete model. However, unless the performance characteristics of this particular combination had been observed, one would not be aware of its predictive potential. This, of course, lends support for a methodology that observes all possible regressions or a systematic sequential procedure designed to test subsets.
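The comparison just described is easy to reproduce; a sketch, assuming the Table 12.8 transcription given earlier, is:

import numpy as np

# R^2 for the complete infant-length model versus the (x2, x3) subset.
y = np.array([57.5, 52.8, 61.3, 67.0, 53.5, 62.7, 56.2, 68.5, 69.2])
x1 = np.array([78, 69, 77, 88, 67, 80, 74, 94, 102], dtype=float)
x2 = np.array([48.2, 45.5, 46.3, 49.0, 43.0, 48.0, 48.0, 53.0, 58.0])
x3 = np.array([2.75, 2.15, 4.41, 5.52, 3.21, 4.32, 2.31, 4.30, 3.71])
x4 = np.array([29.5, 26.3, 32.2, 36.5, 27.2, 27.7, 28.3, 30.3, 28.7])

def r_squared(*regressors):
    X = np.column_stack([np.ones_like(y)] + list(regressors))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

print(r_squared(x1, x2, x3, x4))  # complete model; expect about 0.9907
print(r_squared(x2, x3))          # subset model; expect about 0.9905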
Stepwise Regression
One standard procedure for searching for the “optimum subset” of variables in the absence of orthogonality is a technique called stepwise regression. It is based on the procedure of sequentially introducing the variables into the model one at a time. Given a predetermined size α, the description of the stepwise routine will be better understood if the methods of forward selection and backward elimination are described first.
Forward selection is based on the notion that variables should be inserted one at a time until a satisfactory regression equation is found. The procedure is as follows:
STEP 1. Choose the variable that gives the largest regression sum of squares when performing a simple linear regression with y or, equivalently, that which gives the largest value of R². We shall call this initial variable x1. If x1 is insignificant, the procedure is terminated.

STEP 2. Choose the variable that, when inserted in the model, gives the largest increase in R², in the presence of x1, over the R² found in step 1. This, of course, is the variable xj for which

R(βj | β1) = R(β1, βj) − R(β1)

is largest. Let us call this variable x2. The regression model with x1 and x2 is then fitted and R² observed. If x2 is insignificant, the procedure is terminated.

STEP 3. Choose the variable xj that gives the largest value of

R(βj | β1, β2) = R(β1, β2, βj) − R(β1, β2),

again resulting in the largest increase of R² over that given in step 2. Calling this variable x3, we now have a regression model involving x1, x2, and x3. If x3 is insignificant, the procedure is terminated.
This process is continued until the most recent variable inserted fails to induce a significant increase in the explained regression. Such an increase can be determined at each step by using the appropriate partial F-test or t-test. For example, in step 2 the value

f = R(β2 | β1) / s²

can be determined to test the appropriateness of x2 in the model. Here the value of s² is the mean square error for the model containing the variables x1 and x2. Similarly, in step 3 the ratio

f = R(β3 | β1, β2) / s²
tests the appropriateness of x3 in the model. Now, however, the value for s² is the mean square error for the model that contains the three variables x1, x2, and x3. If f < fα(1, n − 3) at step 2, for a prechosen significance level, x2 is not included and the process is terminated, resulting in a simple linear equation relating y and x1. However, if f > fα(1, n − 3), we proceed to step 3. Again, if f < fα(1, n − 4) at step 3, x3 is not included and the process is terminated with the appropriate regression equation containing the variables x1 and x2.

Backward elimination involves the same concepts as forward selection except that one begins with all the variables in the model. Suppose, for example, that there are five variables under consideration. The steps are as follows:

STEP 1. Fit a regression equation with all five variables included in the model. Choose the variable that gives the smallest value of the regression sum of squares adjusted for the others. Suppose that this variable is x2. Remove x2 from the model if

f = R(β2 | β1, β3, β4, β5) / s²

is insignificant.

STEP 2. Fit a regression equation using the remaining variables x1, x3, x4, and x5, and repeat step 1. Suppose that variable x5 is chosen this time. Once again, if

f = R(β5 | β1, β3, β4) / s²

is insignificant, the variable x5 is removed from the model. At each step, the s² used in the F-test is the mean square error for the regression model at that stage.

This process is repeated until at some step the variable with the smallest adjusted regression sum of squares results in a significant f-value for some predetermined significance level.

Stepwise regression is accomplished with a slight but important modification of the forward selection procedure. The modification involves further testing at each stage to ensure the continued effectiveness of variables that had been inserted into the model at an earlier stage. This represents an improvement over forward selection, since it is quite possible that a variable entering the regression equation at an early stage might have been rendered unimportant or redundant because of relationships that exist between it and other variables entering at later stages. Therefore, at a stage in which a new variable has been entered into the regression equation through a significant increase in R² as determined by the F-test, all the variables already in the model are subjected to F-tests (or, equivalently, to t-tests) in light of this new variable and are deleted if they do not display a significant f-value. The procedure is continued until a stage is reached where no additional variables can be inserted or deleted. We illustrate the stepwise procedure in the following example.

Example 12.11: Using the techniques of stepwise regression, find an appropriate linear regression model for predicting the length of infants for the data of Table 12.8.

Solution: STEP 1. Considering each variable separately, four individual simple linear regression equations are fitted. The following pertinent regression sums of squares are computed:

R(β1) = 288.1468,   R(β2) = 215.3013,
R(β3) = 186.1065,   R(β4) = 100.8594.

Variable x1 clearly gives the largest regression sum of squares. The mean square error for the equation involving only x1 is s² = 4.7276, and since

f = R(β1)/s² = 288.1468/4.7276 = 60.9500,

which exceeds f0.05(1, 7) = 5.59, the variable x1 is significant and is entered into the model.
STEP 2. Three regression equations are fitted at this stage, all containing x1. The important results for the combinations (x1, x2), (x1, x3), and (x1, x4) are

R(β2 | β1) = 23.8703,   R(β3 | β1) = 29.3086,   R(β4 | β1) = 13.8178.

Variable x3 displays the largest regression sum of squares in the presence of x1. The regression involving x1 and x3 gives a new value of s² = 0.6307, and since

f = R(β3 | β1)/s² = 29.3086/0.6307 = 46.47,

which exceeds f0.05(1, 6) = 5.99, the variable x3 is significant and is included along with x1 in the model. Now we must subject x1 in the presence of x3 to a significance test. We find that R(β1 | β3) = 131.349, and hence

f = R(β1 | β3)/s² = 131.349/0.6307 = 208.26,

which is highly significant. Therefore, x1 is retained along with x3.

STEP 3. With x1 and x3 already in the model, we now require R(β2 | β1, β3) and R(β4 | β1, β3) in order to determine which, if any, of the remaining two variables is entered at this stage. From the regression analysis using x2 along with x1 and x3, we find R(β2 | β1, β3) = 0.7948, and when x4 is used along with x1 and x3, we obtain R(β4 | β1, β3) = 0.1855. The value of s² is 0.5979 for the (x1, x2, x3) combination and 0.7198 for the (x1, x3, x4) combination. Since neither f-value is significant at the α = 0.05 level, the final regression model includes only the variables x1 and x3. The estimating equation is found to be

ŷ = 20.1084 + 0.4136x1 + 2.0253x3,

and the coefficient of determination for this model is R² = 0.9882.

Although (x1, x3) is the combination chosen by stepwise regression, it is not necessarily the combination of two variables that gives the largest value of R². In fact, we have already observed that the combination (x2, x3) gives R² = 0.9905. Of course, the stepwise procedure never observed this combination. A rational argument could be made that there is actually a negligible difference in performance between these two estimating equations, at least in terms of percent variation explained. It is interesting to observe, however, that the backward elimination procedure gives the combination (x2, x3) in the final equation (see Exercise 12.49 on page 494).

Summary

The main function of each of the procedures explained in this section is to expose the variables to a systematic methodology designed to ensure the eventual inclusion of the best combinations of the variables. Obviously, there is no assurance that this will happen in all problems, and, of course, it is possible that the multicollinearity is so extensive that one has no alternative but to resort to estimation procedures other than least squares. These estimation procedures are discussed in Myers (1990), listed in the Bibliography. The sequential procedures discussed here represent three of many such methods that have been put forth in the literature and appear in various regression computer packages that are available. These methods are designed to be computationally efficient but, of course, do not give results for all possible subsets of the variables. As a result, the procedures are most effective for data sets that involve a large number of variables. For regression problems involving a relatively small number of variables, modern regression computer packages allow for the computation and summarization of quantitative information on all models for every possible subset of the variables. Illustrations are provided in Section 12.11.
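A sketch of the forward selection portion of the procedure (without the deletion step that full stepwise regression adds), applied to the Table 12.8 data and assuming SciPy is available for the f quantiles:

import numpy as np
from scipy.stats import f as f_dist  # assumed available for critical values

# Forward selection with partial F-tests, in the spirit of Example 12.11.
y = np.array([57.5, 52.8, 61.3, 67.0, 53.5, 62.7, 56.2, 68.5, 69.2])
data = {
    "x1": np.array([78, 69, 77, 88, 67, 80, 74, 94, 102], dtype=float),
    "x2": np.array([48.2, 45.5, 46.3, 49.0, 43.0, 48.0, 48.0, 53.0, 58.0]),
    "x3": np.array([2.75, 2.15, 4.41, 5.52, 3.21, 4.32, 2.31, 4.30, 3.71]),
    "x4": np.array([29.5, 26.3, 32.2, 36.5, 27.2, 27.7, 28.3, 30.3, 28.7]),
}
n, alpha = len(y), 0.05

def sse(cols):
    X = np.column_stack([np.ones(n)] + [data[c] for c in cols])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r)

selected = []
while True:
    remaining = [c for c in data if c not in selected]
    if not remaining:
        break
    best = min(remaining, key=lambda c: sse(selected + [c]))  # largest R(beta_j | selected)
    df_err = n - (len(selected) + 2)       # error df after the candidate enters
    s2 = sse(selected + [best]) / df_err   # mean square error of enlarged model
    f = (sse(selected) - sse(selected + [best])) / s2
    if f <= f_dist.ppf(1 - alpha, 1, df_err):
        break
    selected.append(best)
print(selected)  # should enter x1, then x3, and then stop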
Choice of P-Values

As one might expect, the choice of the final model with these procedures may depend dramatically on what P-value is chosen. In addition, a procedure is most successful when it is forced to test a large number of candidate variables. For this reason, any forward procedure will be most useful when a relatively large P-value is used. Thus, some software packages use a default P-value of 0.50.

12.10 Study of Residuals and Violation of Assumptions (Model Checking)

It was suggested earlier in this chapter that the residuals, or errors in the regression fit, often carry information that can be very informative to the data analyst. The residuals ei = yi − ŷi, i = 1, 2, ..., n, which are the numerical counterparts to the εi, the model errors, often shed light on the possible violation of assumptions or the presence of "suspect" data points. Suppose that we let the vector xi denote the values of the regressor variables corresponding to the ith data point, supplemented by a 1 in the initial position. That is,

x′i = [1, x1i, x2i, ..., xki].

Consider the quantity

hii = x′i(X′X)⁻¹xi,   i = 1, 2, ..., n.

The reader should recognize that hii was used in the computation of the confidence intervals on the mean response in Section 12.5. Apart from σ², hii represents the variance of the fitted value ŷi. The hii values are the diagonal elements of the HAT matrix

H = X(X′X)⁻¹X′,

which plays an important role in any study of residuals and in other modern aspects of regression analysis (see Myers, 1990, listed in the Bibliography). The term HAT matrix is derived from the fact that H generates the "y-hats," or the fitted values, when multiplied by the vector y of observed responses. That is, ŷ = Xb, and thus

ŷ = X(X′X)⁻¹X′y = Hy,

where ŷ is the vector whose ith element is ŷi.

If we make the usual assumptions that the εi are independent and normally distributed with mean 0 and variance σ², the statistical properties of the residuals are readily characterized. Then

E(ei) = E(yi − ŷi) = 0   and   Var(ei) = (1 − hii)σ²,

for i = 1, 2, ..., n. (See Myers, 1990, for details.) It can be shown that the HAT diagonal values are bounded according to the inequality

1/n ≤ hii ≤ 1.

In addition, Σ hii = k + 1 (summing over i = 1, ..., n), the number of regression parameters. As a result, any data point whose HAT diagonal element is large, that is, well above the average value of (k + 1)/n, is in a position in the data set where the variance of ŷi is relatively large and the variance of a residual is relatively small. As a result, the data analyst can gain some insight into how large a residual may become before its deviation from zero can be attributed to something other than mere chance. Many of the commercial regression computer packages produce the set of studentized residuals:

Studentized Residual:   ri = ei / (s√(1 − hii)),   i = 1, 2, ..., n.

Here each residual has been divided by an estimate of its standard deviation, creating a t-like statistic that is designed to give the analyst a scale-free quantity providing information regarding the size of the residual. In addition, standard computer packages often provide values of another set of studentized-type residuals, called the R-Student values:

R-Student Residual:   ti = ei / (s₋ᵢ√(1 − hii)),   i = 1, 2, ..., n,

where s₋ᵢ is an estimate of the error standard deviation, calculated with the ith data point deleted.
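A sketch computing these diagnostics for any design matrix X (intercept column included) and response y follows; the deleted-point variance s₋ᵢ² uses the standard deletion identity (n − k − 2)s₋ᵢ² = (n − k − 1)s² − ei²/(1 − hii):

import numpy as np

# HAT diagonals, studentized residuals, and R-Student values.
def residual_diagnostics(X, y):
    n, p = X.shape                          # p = k + 1 parameters
    H = X @ np.linalg.inv(X.T @ X) @ X.T    # HAT matrix
    h = np.diag(H)                          # h_ii values
    e = y - H @ y                           # ordinary residuals
    s2 = float(e @ e) / (n - p)             # mean square error
    r = e / np.sqrt(s2 * (1.0 - h))         # studentized residuals
    s2_del = ((n - p) * s2 - e ** 2 / (1.0 - h)) / (n - p - 1)  # s_{-i}^2
    t = e / np.sqrt(s2_del * (1.0 - h))     # R-Student values
    return h, r, t

Applied to the grasshopper data of Case Study 12.1 below, this should reproduce, to rounding, the hii, ri, and ti columns of Table 12.11.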
There are three types of violations of assumptions that are readily detected through use of residuals or residual plots. While plots of the raw residuals, the ei, can be helpful, it is often more informative to plot the studentized residuals. The three violations are as follows:

1. Presence of outliers
2. Heterogeneous error variance
3. Model misspecification

In case 1, we choose to define an outlier as a data point where there is a deviation from the usual assumption E(εi) = 0 for a specific value of i. If there is reason to believe that a specific data point is an outlier exerting a large influence on the fitted model, ri or ti may be informative. The R-Student values can be expected to be more sensitive to outliers than the ri values. In fact, under the condition that E(εi) = 0, ti is a value of a random variable following a t-distribution with n − 1 − (k + 1) = n − k − 2 degrees of freedom. Thus, a two-sided t-test can be used to provide information for detecting whether or not the ith point is an outlier.

Although the R-Student statistic ti produces an exact t-test for detection of an outlier at a specific data location, the t-distribution would not apply for simultaneously testing for outliers at all locations. As a result, the studentized residuals or R-Student values should be used strictly as diagnostic tools without formal hypothesis testing as the mechanism. The implication is that these statistics highlight data points where the error of fit is larger than what is expected by chance. R-Student values large in magnitude suggest a need for "checking" the data with whatever resources are possible. The practice of eliminating observations from regression data sets should not be done indiscriminately. (For further information regarding the use of outlier diagnostics, see Myers, 1990, in the Bibliography.)

Illustration of Outlier Detection

Case Study 12.1: Method for Capturing Grasshoppers: In a biological experiment conducted at Virginia Tech by the Department of Entomology, n experimental runs were made with two different methods for capturing grasshoppers. The methods were drop net catch and sweep net catch. The average number of grasshoppers caught within a set of field quadrants on a given date was recorded for each of the two methods. An additional regressor variable, the average plant height in the quadrants, was also recorded. The experimental data are given in Table 12.10. The goal is to be able to estimate grasshopper catch by using only the sweep net method, which is less costly. There was some concern about the validity of the fourth data point. The observed catch that was reported using the net drop method seemed unusually high given the other conditions and, indeed, it was felt that the figure might be erroneous. Fit a model of the type

yi = β0 + β1x1 + β2x2

to the 17 data points and study the residuals to determine if data point 4 is an outlier.
Table 12.10: Data Set for Case Study 12.1

Observation   Drop Net Catch, y   Sweep Net Catch, x1   Plant Height, x2 (cm)
1             18.0000             4.15476                52.705
2              8.8750             2.02381                42.069
3              2.0000             0.15909                34.766
4             20.0000             2.32812                27.622
5              2.3750             0.25521                45.879
6              2.7500             0.57292                97.472
7              3.3333             0.70139               102.062
8              1.0000             0.13542                97.790
9              1.3333             0.12121                88.265
10             1.7500             0.10937                58.737
11             4.1250             0.56250                42.386
12            12.8750             2.45312                31.274
13             5.3750             0.45312                31.750
14            28.0000             6.68750                35.401
15             4.7500             0.86979                64.516
16             1.7500             0.14583                25.241
17             0.1333             0.01562                36.354

Solution: A computer package generated the fitted regression model

ŷ = 3.6870 + 4.1050x1 − 0.0367x2,

along with the statistics R² = 0.9244 and s² = 5.580. The residuals and other diagnostic information were also generated and recorded in Table 12.11.

As expected, the residual at the fourth location appears to be unusually high, namely 7.769. The vital issue here is whether or not this residual is larger than one would expect by chance. The residual standard error for point 4 is 2.209. The R-Student value t4 is found to be 9.9315. Viewing this as a value of a random variable having a t-distribution with 13 degrees of freedom, one would certainly conclude that the residual of the fourth observation is estimating something greater than 0 and that the suspected measurement error is supported by the study of residuals. Notice that no other residual results in an R-Student value that produces any cause for alarm.

Table 12.11: Residual Information for the Data Set of Case Study 12.1

Obs.   yi       ŷi       yi − ŷi   hii      s√(1 − hii)   ri       ti
1      18.000   18.809   −0.809    0.2291   2.074         −0.390   −0.3780
2       8.875   10.452   −1.577    0.0766   2.270         −0.695   −0.6812
3       2.000    3.065   −1.065    0.1364   2.195         −0.485   −0.4715
4      20.000   12.231    7.769    0.1256   2.209          3.517    9.9315
5       2.375    3.052   −0.677    0.0931   2.250         −0.301   −0.2909
6       2.750    2.464    0.286    0.2276   2.076          0.138    0.1329
7       3.333    2.823    0.510    0.2669   2.023          0.252    0.2437
8       1.000    0.656    0.344    0.2318   2.071          0.166    0.1601
9       1.333    0.947    0.386    0.1691   2.153          0.179    0.1729
10      1.750    1.982   −0.232    0.0852   2.260         −0.103   −0.0989
11      4.125    4.442   −0.317    0.0884   2.255         −0.140   −0.1353
12     12.875   12.610    0.265    0.1152   2.222          0.119    0.1149
13      5.375    4.383    0.992    0.1339   2.199          0.451    0.4382
14     28.000   29.841   −1.841    0.6233   1.450         −1.270   −1.3005
15      4.750    4.891   −0.141    0.0699   2.278         −0.062   −0.0598
16      1.750    3.360   −1.610    0.1891   2.127         −0.757   −0.7447
17      0.133    2.418   −2.285    0.1386   2.193         −1.042   −1.0454

Plotting Residuals for Case Study 12.1

In Chapter 11, we discussed, in some detail, the usefulness of plotting residuals in regression analysis. Violation of model assumptions can often be detected through these plots. In multiple regression, normal probability plotting of residuals or plotting of residuals against ŷ may be useful. However, it is often preferable to plot studentized residuals.

Keep in mind that the preference for the studentized residuals over ordinary residuals for plotting purposes stems from the fact that, since the variance of the ith residual depends on the ith HAT diagonal, variances of residuals will differ if there is a dispersion in the HAT diagonals. Thus, the appearance of a plot of residuals may seem to suggest heterogeneity because the residuals themselves do not behave, in general, in an ideal way. The purpose of using studentized residuals is to provide a type of standardization.
Clearly, if σ were known, then under ideal conditions (i.e., a correct model and homogeneous variance), we would have

E[ei/(σ√(1 − hii))] = 0   and   Var[ei/(σ√(1 − hii))] = 1.

So the studentized residuals produce a set of statistics that behave in a standard way under ideal conditions.

Figure 12.5 shows a plot of the R-Student values for the grasshopper data of Case Study 12.1. Note how the value for observation 4 stands out from the rest. The R-Student plot was generated by SAS software. The plot shows the residuals against the ŷ-values.

Figure 12.5: R-Student values plotted against predicted values for grasshopper data of Case Study 12.1. (Observation 4 stands far above the rest; axes: Predicted Value of Y versus Studentized Residual without Current Obs.)

Normality Checking

The reader should recall the importance of normality checking through the use of normal probability plotting, as discussed in Chapter 11. The same recommendation holds for the case of multiple linear regression. Normal probability plots can be generated using standard regression software. Again, however, they can be more effective when one does not use ordinary residuals but, rather, studentized residuals or R-Student values.

12.11 Cross Validation, Cp, and Other Criteria for Model Selection

For many regression problems, the experimenter must choose among various alternative models or model forms that are developed from the same data set. Quite often, the model that best predicts or estimates mean response is required. The experimenter should take into account the relative sizes of the s²-values for the candidate models and certainly the general nature of the confidence intervals on the mean response. One must also consider how well the model predicts response values that were not used in building the candidate models. The models should be subjected to cross validation. What are required, then, are cross-validation errors rather than fitting errors. Such errors in prediction are the PRESS residuals

δi = yi − ŷi,−i,   i = 1, 2, ..., n,

where ŷi,−i is the prediction of the ith data point by a model that did not make use of the ith point in the calculation of the coefficients. These PRESS residuals are calculated from the formula

δi = ei/(1 − hii),   i = 1, 2, ..., n.

(The derivation can be found in Myers, 1990.)

Use of the PRESS Statistic

The motivation for PRESS and the utility of PRESS residuals are very simple to understand. The purpose of extracting or setting aside data points one at a time is to allow the use of separate methodologies for fitting and assessment of a specific model. For assessment of a model, the "−i" indicates that the PRESS residual gives a prediction error where the observation being predicted is independent of the model fit. Criteria that make use of the PRESS residuals are given by

Σ|δi|   and   PRESS = Σδi²,

with sums over i = 1, 2, ..., n. (The term PRESS is an acronym for prediction sum of squares.) We suggest that both of these criteria be used. It is possible for PRESS to be dominated by one or only a few large PRESS residuals.
Clearly, the criterion on Σ|δi| is less sensitive to a small number of large values. In addition to the PRESS statistic itself, the analyst can simply compute an R²-like statistic reflecting prediction performance. The statistic is often called R²pred and is given as follows:

R² of Prediction: Given a fitted model with a specific value for PRESS,

R²pred = 1 − PRESS / Σ(yi − ȳ)².

Note that R²pred is merely the ordinary R² statistic with SSE replaced by the PRESS statistic.

In the following case study, an illustration is provided in which many candidate models are fit to a set of data and the best model is chosen. The sequential procedures described in Section 12.9 are not used. Rather, the role of the PRESS residuals and other statistical values in selecting the best regression equation is illustrated.

Case Study 12.2: Football Punting: Leg strength is a necessary characteristic of a successful punter in American football. One measure of the quality of a good punt is the "hang time." This is the time that the ball hangs in the air before being caught by the punt returner. To determine what leg strength factors influence hang time and to develop an empirical model for predicting this response, a study on The Relationship Between Selected Physical Performance Variables and Football Punting Ability was conducted by the Department of Health, Physical Education, and Recreation at Virginia Tech. Thirteen punters were chosen for the experiment, and each punted a football 10 times. The average hang times, along with the strength measures used in the analysis, were recorded in Table 12.12. Each regressor variable is defined as follows:

1. RLS, right leg strength (pounds)
2. LLS, left leg strength (pounds)
3. RHF, right hamstring muscle flexibility (degrees)
4. LHF, left hamstring muscle flexibility (degrees)
5. Power, overall leg strength (foot-pounds)

Determine the most appropriate model for predicting hang time.

Table 12.12: Data for Case Study 12.2

Punter   Hang Time, y (sec)   RLS, x1   LLS, x2   RHF, x3   LHF, x4   Power, x5
1        4.75                 170       170       106       106       240.57
2        4.07                 140       130        92        93       195.49
3        4.04                 180       170        93        78       152.99
4        4.18                 160       160       103        93       197.09
5        4.35                 170       150       104        93       266.56
6        4.16                 150       150       101        87       260.56
7        4.43                 170       180       108       106       219.25
8        3.20                 110       110        86        92       132.68
9        3.02                 120       110        90        86       130.24
10       3.64                 130       120        85        80       205.88
11       3.68                 120       140        89        83       153.92
12       3.60                 140       130        92        94       154.64
13       3.85                 160       150        95        95       240.57

Solution: In the search for the best of the candidate models for predicting hang time, the information in Table 12.13 was obtained from a regression computer package. The models are ranked in ascending order of the values of the PRESS statistic. This display provides enough information on all possible models to enable the user to eliminate from consideration all but a few models. The model containing x2 and x5 (LLS and Power), denoted by x2x5, appears to be superior for predicting punter hang time. Also note that all models with low PRESS, low s², low Σ|δi|, and high R²-values contain these two variables.

In order to gain some insight from the residuals of the fitted regression

ŷi = b0 + b2x2i + b5x5i,

the residuals and PRESS residuals were generated. The actual prediction model (see Exercise 12.47 on page 494) is given by

ŷ = 1.10765 + 0.01370x2 + 0.00429x5.

Residuals, HAT diagonal values, and PRESS values are listed in Table 12.14. Note the relatively good fit of the two-variable regression model to the data. The PRESS residuals reflect the capability of the regression equation to predict hang time if independent predictions were to be made. For example, for punter number 4, the hang time of 4.180 would encounter a prediction error of 0.039 if the model constructed by using the remaining 12 punters were used.
For this model, the average prediction error or cross-validation error is

(1/13) Σ|δi| = 0.1489 second,

which is small compared to the average hang time for the 13 punters.

Table 12.13: Comparing Different Regression Models (all 31 subset models, ranked in ascending order of PRESS, with columns Model, s², Σ|δi|, PRESS, and R²; the top-ranked model, x2x5, has s² = 0.036907, Σ|δi| = 1.93583, PRESS = 0.54683, and R² = 0.871300).

Table 12.14: PRESS Residuals

Punter   yi      ŷi      ei = yi − ŷi   hii     δi
1        4.750   4.470    0.280         0.198    0.349
2        4.070   3.728    0.342         0.118    0.388
3        4.040   4.094   −0.054         0.444   −0.097
4        4.180   4.146    0.034         0.132    0.039
5        4.350   4.307    0.043         0.286    0.060
6        4.160   4.281   −0.121         0.250   −0.161
7        4.430   4.515   −0.085         0.298   −0.121
8        3.200   3.184    0.016         0.294    0.023
9        3.020   3.174   −0.154         0.301   −0.220
10       3.640   3.636    0.004         0.231    0.005
11       3.680   3.687   −0.007         0.152   −0.008
12       3.600   3.553    0.047         0.142    0.055
13       3.850   4.196   −0.346         0.154   −0.409

We indicated in Section 12.9 that the use of all possible subset regressions is often advisable when searching for the best model. Most commercial statistics software packages contain an all possible regressions routine. These algorithms compute various criteria for all subsets of model terms. Obviously, criteria such as R², s², and PRESS are reasonable for choosing among candidate subsets. Another very popular and useful statistic, particularly for areas in the physical sciences and engineering, is the Cp statistic, described below.
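A sketch of the PRESS computation for the (x2, x5) model, assuming the Table 12.12 transcription above, is the following; the totals should land close to the Table 12.13 entries for x2x5:

import numpy as np

# PRESS residuals, PRESS, and R^2_pred for the (x2, x5) punter model.
y = np.array([4.75, 4.07, 4.04, 4.18, 4.35, 4.16, 4.43,
              3.20, 3.02, 3.64, 3.68, 3.60, 3.85])
x2 = np.array([170, 130, 170, 160, 150, 150, 180,
               110, 110, 120, 140, 130, 150], dtype=float)      # LLS
x5 = np.array([240.57, 195.49, 152.99, 197.09, 266.56, 260.56, 219.25,
               132.68, 130.24, 205.88, 153.92, 154.64, 240.57])  # Power

X = np.column_stack([np.ones_like(y), x2, x5])
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)
e = y - H @ y
delta = e / (1.0 - h)                 # PRESS residuals
press = float(delta @ delta)          # expect about 0.54683
r2_pred = 1.0 - press / float(np.sum((y - y.mean()) ** 2))
print(np.abs(delta).sum(), press, r2_pred)  # sum |delta_i| about 1.93583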
The Cp Statistic

Quite often, the choice of the most appropriate model involves many considerations. Obviously, the number of model terms is important; the matter of parsimony is a consideration that cannot be ignored. On the other hand, the analyst cannot be pleased with a model that is too simple, to the point where there is serious underspecification. A single statistic that represents a nice compromise in this regard is the Cp statistic. (See Mallows, 1973, in the Bibliography.)

The Cp statistic appeals nicely to common sense and is developed from considerations of the proper compromise between excessive bias incurred when one underfits (chooses too few model terms) and excessive prediction variance produced when one overfits (has redundancies in the model). The Cp statistic is a simple function of the total number of parameters in the candidate model and the mean square error s². We will not present the entire development of the Cp statistic. (For details, the reader is referred to Myers, 1990, in the Bibliography.) The Cp for a particular subset model is an estimate of the following:

Γ(p) = (1/σ²) Σ Var(ŷi) + (1/σ²) Σ (Bias ŷi)²,

with sums over i = 1, ..., n. It turns out that under the standard least squares assumptions indicated earlier in this chapter, and assuming that the "true" model is the model containing all candidate variables,

(1/σ²) Σ Var(ŷi) = p   (the number of parameters in the candidate model)

(see Review Exercise 12.63), and an unbiased estimate of (1/σ²) Σ (Bias ŷi)² is given by

(s² − σ²)(n − p)/σ².

In the above, s² is the mean square error for the candidate model and σ² is the population error variance. Thus, if we assume that some estimate σ̂² is available for σ², Cp is given by the following equation:

Cp Statistic:

Cp = p + (s² − σ̂²)(n − p)/σ̂²,

where p is the number of model parameters, s² is the mean square error for the candidate model, and σ̂² is an estimate of σ².

Obviously, the scientist should adopt models with small values of Cp. The reader should note that, unlike the PRESS statistic, Cp is scale-free. In addition, one can gain some insight concerning the adequacy of a candidate model by observing its value of Cp. For example, Cp > p indicates a model that is biased due to being an underfitted model, whereas Cp ≈ p indicates a reasonable model.

There is often confusion concerning where σ̂² comes from in the formula for Cp. Obviously, the scientist or engineer does not have access to the population quantity σ². In applications where replicated runs are available, say in an experimental design situation, a model-independent estimate of σ² is available (see Chapters 11 and 15). However, most software packages use σ̂² as the mean square error from the most complete model. Obviously, if this is not a good estimate, the bias portion of the Cp statistic can be negative. Thus, Cp can be less than p.
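The computation is a one-liner once s², σ̂², n, and p are in hand; the following sketch checks it against the (x2, x3) model of Example 12.12 below, where the full-model mean square error serves as σ̂²:

def cp_statistic(s2, sigma2_hat, n, p):
    # Cp = p + (s^2 - sigma2_hat)(n - p) / sigma2_hat
    return p + (s2 - sigma2_hat) * (n - p) / sigma2_hat

# Example 12.12 check: (x2, x3) model with s^2 = 44.5552, p = 3, n = 15,
# and sigma2_hat = 26.2073 (full-model MSE); expect Cp of about 11.40.
print(cp_statistic(44.5552, 26.2073, 15, 3))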
Example 12.12: Consider the data set in Table 12.15, in which a maker of asphalt shingles is interested in the relationship between sales for a particular year and factors that influence sales. (The data were taken from Kutner et al., 2004, in the Bibliography.)
Of the possible subset models, three are of particular interest. These three are x2x3, x1x2x3, and x1x2x3x4. The following represents pertinent information for comparing the three models. We include the PRESS statistics for the three models to supplement the decision making.
Model          R²       R²pred   s²        PRESS      Cp
x2 x3          0.9940   0.9913   44.5552   782.1896   11.4013
x1 x2 x3       0.9970   0.9928   24.7956   643.3578    3.4075
x1 x2 x3 x4    0.9971   0.9917   26.2073   741.7557    5.0
It seems clear from the information in the table that the model (x1, x2, x3) is preferable to the other two. Notice that, for the full model, Cp = 5.0. This occurs since the bias portion is zero, and σ̂² = 26.2073 is the mean square error from the full model.
Figure 12.6 is a SAS PROC REG printout showing information for all possible regressions. Here we are able to show comparisons of other models with (x1, x2, x3). Note that (x1,x2,x3) appears to be quite good when compared to all models.
As a final check on the model (x1, x2, x3), Figure 12.7 shows a normal probability plot of the residuals for this model.
Table 12.15: Data for Example 12.12

District   Promotional Accounts, x1   Active Accounts, x2   Competing Brands, x3   Potential, x4   Sales, y (thousands)
1          5.5                        31                    10                      8              $79.3
2          2.5                        55                     8                      6              200.1
3          8.0                        67                    12                      9              163.2
4          3.0                        50                     7                     16              200.1
5          3.0                        38                     8                     15              146.0
6          2.9                        71                    12                     17              177.7
7          8.0                        30                    12                      8              30.9
8          9.0                        56                     5                     10              291.9
9          4.0                        42                     8                      4              160.0
10         6.5                        73                     5                     16              339.4
11         5.5                        60                    11                      7              159.6
12         5.0                        44                    12                     12              86.3
13         6.0                        50                     6                      6              237.5
14         5.0                        39                    10                      4              107.2
15         3.5                        55                    10                      4              155.0
Dependent Variable: sales

Number in Model   C(p)       R-Square   Adjusted R-Square   MSE          Variables in Model
3                 3.4075     0.9970     0.9961              24.79560     x1 x2 x3
4                 5.0000     0.9971     0.9959              26.20728     x1 x2 x3 x4
2                 11.4013    0.9940     0.9930              44.55518     x2 x3
3                 13.3770    0.9940     0.9924              48.54787     x2 x3 x4
3                 1053.643   0.6896     0.6049              2526.96144   x1 x3 x4
2                 1082.670   0.6805     0.6273              2384.14286   x3 x4
2                 1215.316   0.6417     0.5820              2673.83349   x1 x3
1                 1228.460   0.6373     0.6094              2498.68333   x3
3                 1653.770   0.5140     0.3814              3956.75275   x1 x2 x4
2                 1668.699   0.5090     0.4272              3663.99357   x1 x2
2                 1685.024   0.5042     0.4216              3699.64814   x2 x4
1                 1693.971   0.5010     0.4626              3437.12846   x2
2                 3014.641   0.1151     −.0324              6603.45109   x1 x4
1                 3088.650   0.0928     0.0231              6248.72283   x4
1                 3364.884   0.0120     −.0640              6805.59568   x1

Figure 12.6: SAS printout of all possible subsets on sales data for Example 12.12.
Figure 12.7: Normal probability plot of residuals using the model x1x2x3 for Example 12.12. (Theoretical Quantiles versus Sample Quantiles.)

Exercises
12.47 Consider the "hang time" punting data given in Case Study 12.2, using only the variables x2 and x5.
(a) Verify the regression equation shown on page 489.
(b) Predict punter hang time for a punter with LLS = 180 pounds and Power = 260 foot-pounds.
(c) Construct a 95% confidence interval for the mean hang time of a punter with LLS = 180 pounds and Power = 260 foot-pounds.
12.48 For the data of Exercise 12.15 on page 452, use the techniques of
(a) forward selection with a 0.05 level of significance to choose a linear regression model;
(b) backward elimination with a 0.05 level of significance to choose a linear regression model;
(c) stepwise regression with a 0.05 level of significance to choose a linear regression model.
12.49 Use the techniques of backward elimination with α = 0.05 to choose a prediction equation for the data of Table 12.8.
12.50 For the punter data in Case Study 12.2, an additional response, "punting distance," was also recorded. The average distance values for each of the 13 punters are given.

(a) Using the distance data rather than the hang times, estimate a multiple linear regression model of the type
μY|x1,x2,x3,x4,x5 = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5
for predicting punting distance.
(b) Use stepwise regression with a significance level of 0.10 to select a combination of variables.
(c) Generate values for s², R², PRESS, and Σ|δi| (i = 1, ..., 13) for the entire set of 31 models. Use this information to determine the best combination of variables for predicting punting distance.
(d) For the final model you choose, plot the standardized residuals against Y and do a normal probability plot of the ordinary residuals. Comment.

Punter   Distance, y (ft)
1        162.50
2        144.00
3        147.50
4        163.50
5        192.00
6        171.75
7        162.00
8        104.93
9        105.67
10       117.59
11       140.25
12       150.17
13       165.16

12.51 The following is a set of data for y, the amount of money (in thousands of dollars) contributed to the alumni association at Virginia Tech by the Class of 1960, and x, the number of years following graduation:

y         x
812.52    1
822.50    2
1211.50   3
1348.00   4
1301.00   8
2567.50   9
2526.50   10
2755.00   11
4390.50   12
5581.50   13
5548.00   14
6086.00   15
5764.00   16
8903.00   17

(a) Fit a regression model of the type
μY|x = β0 + β1x.
(b) Fit a quadratic model of the type
μY|x = β0 + β1x + β11x².
(c) Determine which of the models in (a) or (b) is preferable. Use s², R², and the PRESS residuals to support your decision.
12.52 For the model of Exercise 12.50(a), test the hypothesis
H0: β4 = 0,
H1: β4 ≠ 0.
Use a P-value in your conclusion.
12.53 For the quadratic model of Exercise 12.51(b), give estimates of the variances and covariances of the estimates of β1 and β11.
12.54 A client from the Department of Mechanical Engineering approached the Consulting Center at Virginia Tech for help in analyzing an experiment dealing with gas turbine engines. The voltage output of engines was measured at various combinations of blade speed and sensor extension.

y (volts)   Speed, x1 (in./sec)   Extension, x2 (in.)
1.95        6336                  0.000
2.50        7099                  0.000
2.93        8026                  0.000
1.69        6230                  0.000
1.23        5369                  0.000
3.13        8343                  0.000
1.55        6522                  0.006
1.94        7310                  0.006
2.18        7974                  0.006
2.70        8501                  0.006
1.32        6646                  0.012
1.60        7384                  0.012
1.89        8000                  0.012
2.15        8545                  0.012
1.09        6755                  0.018
1.26        7362                  0.018
1.57        7934                  0.018
1.92        8554                  0.018

(a) Fit a multiple linear regression to the data.
(b) Compute t-tests on coefficients. Give P-values.
(c) Comment on the quality of the fitted model.

12.55 Rayon whiteness is an important factor for scientists dealing in fabric quality. Whiteness is affected by pulp quality and other processing variables. Some of the variables include acid bath temperature, °C (x1); cascade acid concentration, % (x2); water temperature, °C (x3); sulfide concentration, % (x4); amount of chlorine bleach, lb/min (x5); and blanket finish temperature, °C (x6). A set of data from rayon specimens is given here. The response, y, is the measure of whiteness.

y      x1   x2      x3   x4      x5      x6
88.7   43   0.211   85   0.243   0.606   48
89.3   42   0.604   89   0.237   0.600   55
75.5   47   0.450   87   0.198   0.527   61
92.1   46   0.641   90   0.194   0.500   65
83.4   52   0.370   93   0.198   0.485   54
44.8   50   0.526   85   0.221   0.533   60
50.9   43   0.486   83   0.203   0.510   57
78.0   49   0.504   93   0.279   0.489   49
86.8   51   0.609   90   0.220   0.462   64
47.3   51   0.702   86   0.198   0.478   63
53.7   48   0.397   92   0.231   0.411   61
92.0   46   0.488   88   0.211   0.387   88
87.9   43   0.525   85   0.199   0.437   63
90.3   45   0.486   84   0.189   0.499   58
94.2   53   0.527   87   0.245   0.530   65
89.5   47   0.601   95   0.208   0.500   67

(a) Use the criteria MSE, Cp, and PRESS to find the "best" model from among all subset models.
(b) Plot standardized residuals against Y and do a normal probability plot of residuals for the "best" model. Comment.

12.56 In an effort to model executive compensation for the year 1979, 33 firms were selected, and data were gathered on compensation, sales, profits, and employment. The following data were gathered for the year 1979.

Firm   Compensation, y (thousands)   Sales, x1 (millions)   Profits, x2 (millions)   Employment, x3
1      $450                          $4600.6                $128.1                   48,000
2      387                           9255.4                 783.9                    55,900
3      368                           1526.2                 136.0                    13,783
4      277                           1683.2                 179.0                    27,765
5      676                           2752.8                 231.5                    34,000
6      454                           2205.8                 329.5                    26,500
7      507                           2384.6                 381.8                    30,800
8      496                           2746.0                 237.9                    41,000
9      487                           1434.0                 222.3                    25,900
10     383                           470.6                  63.7                     8600
11     311                           1508.0                 149.5                    21,075
12     271                           464.4                  30.0                     6874
13     524                           9329.3                 577.3                    39,000
14     498                           2377.5                 250.7                    34,300
15     343                           1174.3                 82.6                     19,405
16     354                           409.3                  61.5                     3586
17     324                           724.7                  90.8                     3905
18     225                           578.9                  63.3                     4139
19     254                           966.8                  42.8                     6255
20     208                           591.0                  48.5                     10,605
21     518                           4933.1                 310.6                    65,392
22     406                           7613.2                 491.6                    89,400
23     332                           3457.4                 228.0                    55,200
24     340                           545.3                  54.6                     7800
25     698                           22,862.8               3011.3                   337,119
26     306                           2361.0                 203.0                    52,000
27     613                           2614.1                 201.0                    50,500
28     302                           1013.2                 121.3                    18,625
29     540                           4560.3                 194.6                    97,937
30     293                           855.7                  63.4                     12,300
31     528                           4211.6                 352.1                    71,800
32     456                           5440.4                 655.2                    87,700
33     417                           1229.9                 97.5                     14,600

Consider the model

yi = β0 + β1 ln x1i + β2 ln x2i + β3 ln x3i + εi,   i = 1, 2, ..., 33.

(a) Fit the regression with the model above.
(b) Is a model with a subset of the variables preferable to the full model?
12.57 The pull strength of a wire bond is an impor- tant characteristic. The following data give informa- tion on pull strength y, die height x1, post height x2, loop height x3, wire length x4, bond width on the die x5, and bond width on the post x6. (From Myers, Mont- gomery, and Anderson-Cook, 2009.)
(a) Fit a regression model using all independent vari- ables.
(b) Use stepwise regression with input significance level 0.25 and removal significance level 0.05. Give your final model.
(c) Use all possible regression models and compute R2, Cp, s2, and adjusted R2 for all models.
y (wear) 193
230
172
91
113
125
x1 (oil viscosity) 1.6
15.5 22.0 43.0 33.0 40.0
x2 (load) 851
816
1058
1201
1357
1115
12.12 Special Nonlinear Models for Nonideal Conditions
In much of the preceding material in this chapter and in Chapter 11, we have benefited substantially from the assumption that the model errors, the εi, are normal with mean 0 and constant variance σ2. However, there are many real-life
(a) The following model may be considered to describe the data:
yi = β0 + β1x1i + β2x2i + β12x1ix2i + εi,
for i = 1,2,…,6. The x1x2 is an “interaction”
term. Fit this model and estimate the parameters. (b) Use the models (x1), (x1, x2), (x2), (x1, x2, x1x2) and compute PRESS, Cp, and s2 to determine the
“best” model.

12.12 Special Nonlinear Models for Nonideal Conditions 497
situations in which the response is clearly nonnormal. For example, a wealth of applications exist where the response is binary (0 or 1) and hence Bernoulli in nature. In the social sciences, the problem may be to develop a model to predict whether or not an individual is a good credit risk (0 or 1) as a function of certain socioeconomic regressors such as income, age, gender, and level of education. In a biomedical drug trial, the response is often whether or not the patient responds positively to a drug while regressors may include drug dosage as well as biological factors such as age, weight, and blood pressure. Again the response is binary in nature. Applications are also abundant in manufacturing areas where certain controllable factors influence whether a manufactured item is defective or not.
A second type of nonnormal application on which we will touch briefly has to do with count data. Here the assumption of a Poisson response is often convenient. In biomedical applications, the number of cancer cell colonies may be the response which is modeled against drug dosages. In the textile industry, the number of imperfections per yard of cloth may be a reasonable response which is modeled against certain process variables.
Nonhomogeneous Variance
The reader should note the comparison of the ideal (i.e., the normal response) situation with that of the Bernoulli (or binomial) or the Poisson response. We have become accustomed to the fact that the normal case is very special in that the variance is independent of the mean. Clearly this is not the case for either Bernoulli or Poisson responses. For example, if the response is 0 or l, suggesting a Bernoulli response, then the model is of the form
p = f(x,β),
where p is the probability of a success (say response = 1). The parameter p plays the role of μY |x in the normal case. However, the Bernoulli variance is p(1 − p), which, of course, is also a function of the regressor x. As a result, the variance is not constant. This rules out the use of standard least squares, which we have utilized in our linear regression work up to this point. The same is true for the Poisson case since the model is of the form
λ = f(x,β), with Var(y) = μy = λ, which varies with x.
Binary Response (Logistic Regression)
The most popular approach to modeling binary responses is a technique entitled logistic regression. It is used extensively in the biological sciences, biomedical research, and engineering. Indeed, even in the social sciences binary responses are found to be plentiful. The basic distribution for the response is either Bernoulli or binomial. The former is found in observational studies where there are no repeated runs at each regressor level, while the latter will be the case when an experiment is designed. For example, in a clinical trial in which a new drug is being evaluated, the goal might be to determine the dose of the drug that provides efficacy. So

498 Chapter 12 Multiple Linear Regression and Certain Nonlinear Regression Models certain doses will be employed in the experiment, and more than one subject will
be used for each dose. This case is called the grouped case. What Is the Model for Logistic Regression?
In the case of binary responses ,the mean response is a probability. In the preceding clinical trial illustration, we might say that we wish to estimate the probability that the patient responds properly to the drug, P(success). Thus, the model is written in terms of a probability. Given regressors x, the logistic function is given by
p=1. 1 + e−xβ
The portion x′β is called the linear predictor, and in the case of a single regressor x it might be written x′β = β0 + β1x. Of course, we do not rule out involving multiple regressors and polynomial terms in the so-called linear predictor. In the grouped case, the model involves modeling the mean of a binomial rather than a Bernoulli, and thus we have the mean given by
np= n . 1 + e−xβ
Characteristics of Logistic Function
A plot of the logistic function reveals a great deal about its characteristics and why it is utilized for this type of problem. First, the function is nonlinear. In addition, the plot in Figure 12.8 reveals the S-shape with the function approaching p = 1.0 as an asymptote. In this case, β1 > 0. Thus, we would never experience an estimated probability exceeding 1.0.
p
1.0
Figure 12.8: The logistic function.
The regression coefficients in the linear predictor can be estimated by the method of maximum likelihood, as described in Chapter 9. The solution to the
x

12.12 Special Nonlinear Models for Nonideal Conditions 499
likelihood equations involves an iterative methodology that will not be described here. However, we will present an example and discuss the computer printout and conclusions.
Example 12.13: The data set in Table 12.16 will be used to illustrate the use of logistic regression to analyze a single-agent quantal bioassay of a toxicity experiment. The results show the effect of different doses of nicotine on the common fruit fly.
Table 12.16: Data Set for Example 12.13
x ni y
Concentration (grams/100 cc)
0.10 0.15 0.20 0.30 0.50 0.70 0.95
Number of Number Insects Killed
Percent Killed
47 8 17.0 53 14 26.4 55 24 43.6 52 32 61.5 46 38 82.6 54 50 92.6 52 50 96.2
The purpose of the
probability of “kill” to concentration. In addition, the analyst sought the so-called effective dose (ED), that is, the concentration of nicotine that results in a certain probability. Of particular interest was the ED50, the concentration that produces a 0.5 probability of “insect kill.”
This example is grouped, and thus the model is given by E(Yi) = nipi = ni .
Estimates of β0 and β1 and their standard errors are found by the method of maximum likelihood. Tests on individual coefficients are found using χ2-statistics rather than t-statistics since there is no common variance σ2. The χ2-statistic is derived from 􏰩 coeff 􏰪2.
standard error
Thus, we have the following from a SAS PROC LOGIST printout. Analysis of Parameter Estimates
experiment was to arrive at an appropriate model relating
1 + e−(β0+β1xi)
df
Estimate
Standard Error Chi-Squared
0.2420 51.4482 0.7422 71.9399
P-Value < 0.0001 β0 1 β1 1 −1.7361 6.2954 < 0.0001 Both coefficients are significantly different from zero. Thus, the fitted model used to predict the probability of “kill” is given by pˆ= 1 . 1 + e−(−1.7361+6.2954x) Thus, ED50 is given by // 500 Chapter 12 Multiple Linear Regression and Certain Nonlinear Regression Models Estimate of Effective Dose The estimate of ED50 for Example 12.13 is found very simply from the estimates b0 for β0 and b1 for β1. From the logistic function, we see that 􏰧p􏰨 log 1−p =β0+β1x. As a result, for p = 0.5, an estimate of x is found from b0 + b1x = 0. x = − 􏰧b 􏰨 0 = 0.276 gram/100 cc. b1 Concept of Odds Ratio In logistic regression, an odds ratio is the ratio of odds of success at condition 2 to that of condition 1 in the regressors, that is, [p/(1 − p)]2 . [p/(1 − p)]1 Definition 12.1: Another form of inference that is conveniently accomplished using logistic regres- sion is derived from the use of the odds ratio. The odds ratio is designed to determine how the odds of success, p , increases as certain changes in regressor 1−p values occur. For example, in the case of Example 12.13 we may wish to know how the odds would increase if one were to increase dosage by, say, 0.2 gram/100 cc. This allows the analyst to ascertain a sense of the utility of changing the regressor by a certain number of units. Now, since p = eβ0+β1x, for Example 12.13, 1−p the ratio reflecting the increase in odds of success when the dosage of nicotine is increased by 0.2 gram/100 cc is given by e0.2b1 = e(0.2)(6.2954) = 3.522. The implication of an odds ratio of 3.522 is that the odds of success is enhanced by a factor of 3.522 when the nicotine dose is increased by 0.2 gram/100 cc. 􏰽􏰾 Exercises 12.60 From a set of streptonignic dose-response data, an experimenter desires to develop a relationship be- tween the proportion of lymphoblasts sampled that contain aberrations and the dosage of streptonignic. Five dosage levels were applied to the rabbits used for the experiment. The data are as follows (see Myers, 1990, in the Bibliography): Dose Number of (mg/kg) Lymphoblasts 0 600 30 500 60 600 75 300 90 300 Number with Aberrations 15 96 187 100 145 Review Exercises 501 (a) Fit a logistic regression to the data set and thus estimate β0 and β1 in the model p=1, 1 + e−(β0+β1x) where n is the number of lymphoblasts, x is the dose, and p is the probability of an aberration. (b) Show results of χ2-tests revealing the significance of the regression coefficients β0 and β1. (c) Estimate ED50 and give an interpretation. 12.61 In an experiment to ascertain the effect of load, x, in lb/inches2, on the probability of failure of speci- mens of a certain fabric type, an experiment was con- ducted in which numbers of specimens were exposed to loads ranging from 5 lb/in.2 to 90 lb/in.2. The numbers Review Exercises 12.62 In the Department of Fisheries and Wildlife at Virginia Tech, an experiment was conducted to study the effect of stream characteristics on fish biomass. The regressor variables are as follows: average depth (of 50 cells), x1; area of in-stream cover (i.e., undercut banks, logs, boulders, etc.), x2; percent canopy cover (average of 12), x3 ; and area ≥ 25 centimeters in depth, x4 . of “failures” were observed. The data are as follows: // Number of Load Specimens 5 600 35 500 70 600 80 300 90 300 Number of Failures 13 95 189 95 130 (a) (b) Use logistic regression to fit the model p=1, The response is y, the fish biomass. follows: Obs. 
y x1 x2 The data are as y x1 7.6 −1 5.5 1 9.2 −1 10.3 −1 11.6 1 11.1 1 10.2 −1 14.0 1 x2 x3 −1 −1 −1 −1 1 −1 −1 1 1 −1 −1 1 1 1 1 1 1 + e−(β0+β1x) where p is the probability of failure and x is load. Use the odds ratio concept to determine the in- crease in odds of failure that results by increasing the load from 20 lb/in.2. 12.64 A small experiment was conducted to fit a mul- tiple regression equation relating the yield y to tem- perature x1, reaction time x2, and concentration of one of the reactants x3. Two levels of each variable were chosen, and measurements corresponding to the coded independent variables were recorded as follows: 1 2 3 4 5 6 7 8 9 10 100 14.3 388 19.1 755 54.6 1288 28.8 230 16.1 0 10.0 551 28.5 345 13.8 0 10.7 348 25.9 15.0 29.4 58.0 42.6 15.9 56.4 95.1 60.6 35.2 52.0 x3 12.2 26.0 24.2 26.1 31.6 23.3 13.0 7.5 40.3 40.3 x4 48.0 152.2 469.7 485.9 87.6 6.9 192.9 105.8 0.0 116.6 (a) (b) Using the coded variables, estimate the multiple linear regression equation μY |x1,x2,x3 = β0 + β1x1 + β2x2 + β3x3. Partition SSR, the regression sum of squares, into three single-degree-of-freedom components at- tributable to x1, x2, and x3, respectively. Show an analysis-of-variance table, indicating significance tests on each variable. Comment on the results. (a) Fit a multiple linear regression including all four regression variables. (b) Use Cp, R2, and s2 to determine the best subset of variables. Compute these statistics for all possible subsets. (c) Compare the appropriateness of the models in parts (a) and (b) for predicting fish biomass. 12.63 Show that, in a multiple linear regression data set, 􏰤n hii = p. i=1 12.65 In a chemical engineering experiment dealing with heat transfer in a shallow fluidized bed, data are collected on the following four regressor variables: flu- idizing gas flow rate, lb/hr (x1); supernatant gas flow rate, lb/hr (x2); supernatant gas inlet nozzle opening, millimeters (x3); and supernatant gas inlet tempera- ture, ◦F (x4). The responses measured are heat trans- fer efficiency (y1 ) and thermal efficiency (y2 ). The data are as follows: Obs. 
y1 y2 x1 x2 x3 x4 1 41.852 38.75 69.69 170.83 45 219.74 2 155.329 51.87 113.46 230.06 25 181.22 3 99.628 53.79 113.54 228.19 65 179.06 4 49.409 53.84 118.75 117.73 65 281.30 5 72.958 49.17 119.72 117.69 25 282.20 6 107.702 47.61 168.38 173.46 45 216.14 7 97.239 64.19 169.85 169.85 45 223.88 8 105.856 52.73 169.85 170.86 45 222.80 9 99.348 51.00 170.89 173.92 80 218.84 10 111.907 47.37 171.31 173.34 25 218.12 11 100.008 43.18 171.43 171.43 45 219.20 12 175.380 71.23 171.59 263.49 45 168.62 13 117.800 49.30 171.63 171.63 45 217.58 14 217.409 50.87 171.93 170.91 10 219.92 15 41.725 54.44 173.92 71.73 45 296.60 16 151.139 47.93 221.44 217.39 65 189.14 17 220.630 42.91 222.74 221.73 25 186.08 18 131.666 66.60 228.90 114.40 25 285.80 19 80.537 64.94 231.19 113.52 65 286.34 20 152.966 43.18 236.84 167.77 45 221.72 Consider the model for predicting the heat transfer co- efficient response x1 x2 44 89.47 40 75.07 44 85.84 42 68.15 38 89.02 47 77.45 40 75.98 43 81.19 44 81.42 38 81.87 44 73.03 45 87.66 45 66.45 47 79.15 54 83.12 49 81.42 51 69.63 51 77.91 48 91.63 49 73.37 57 73.37 54 79.38 52 76.32 50 70.87 51 67.25 54 91.63 51 73.71 57 59.08 49 76.32 48 61.24 52 82.78 x3 x4 11.37 62 10.07 62 8.65 45 8.17 40 9.22 55 11.63 58 11.95 70 10.85 64 13.08 63 8.63 48 10.13 45 14.03 56 11.12 51 10.60 47 10.33 50 8.95 44 10.95 57 10.00 48 10.25 48 10.08 76 12.63 58 11.17 62 9.63 48 8.92 48 11.08 48 12.88 44 10.47 59 9.93 49 9.40 56 11.50 52 10.50 53 x5 x6 178 182 185 185 156 168 166 172 178 180 176 176 176 180 162 170 174 176 170 186 168 168 186 192 176 176 162 164 166 170 180 185 168 172 162 168 162 164 168 168 174 176 156 165 164 166 146 155 172 172 168 172 186 188 148 155 186 188 170 176 170 172 y1i = β0 + βjjx2ji βjlxjixli +εi, + 􏰤􏰤 j ̸=l i=1,2,...,20. 􏰤4 j=1 􏰤4 i=1 βjxji + 􏰦n i=1 |yi − yˆi,−i| for the least (a) Compute PRESS and squares regression fit to the model above. (b) Fit a second-order model with x4 completely elim- inated (i.e., deleting all terms involving x4). Com- pute the prediction criteria for the reduced model. Comment on the appropriateness of x4 for predic- tion of the heat transfer coefficient. (c) Repeat parts (a) and (b) for thermal efficiency. 12. 66 In exercise physiology, an ob jective measure of aerobic fitness is the oxygen consumption in volume per unit body weight per unit time. Thirty-one individuals were used in an experiment in order to be able to model oxygen consumption against age in years (x1), weight in kilograms (x2 ), time to run 1 1 miles (x3 ), resting 2 pulse rate (x4), pulse rate at the end of run (x5), and maximum pulse rate during run (x6). (a) Do a stepwise regression with input significance level 0.25. Quote the final model. (b) Do all possible subsets using s2, Cp, R2, and Ra2dj. Make a decision and quote the final model. data of Review Exercise 12.64. // 502 Chapter 12 Multiple Linear Regression and Certain Nonlinear Regression Models ID y 1 44.609 2 45.313 3 54.297 4 59.571 5 49.874 6 44.811 7 45.681 8 49.091 9 39.442 10 60.055 11 50.541 12 37.388 13 44.754 14 47.273 15 51.855 16 49.156 17 40.836 18 46.672 19 46.774 20 50.388 21 39.407 22 46.080 23 45.441 24 54.625 25 45.118 26 39.203 27 45.790 28 50.545 29 48.673 30 47.920 31 47.467 12.67 Consider the Suppose it is of interest to add some “interaction” terms. Namely, consider the model yi = β0 + β1x1i + β2x2i + β3x3i + β12x1ix2i + β13x1ix3i + β23x2ix3i + β123x1ix2ix3i + εi. (a) Do we still have orthogonality? Comment. 
(b) With the fitted model in part (a), can you find prediction intervals and confidence intervals on the mean response? Why or why not? (c) Consider a model with β123x1x2x3 removed. To determine if interactions (as a whole) are needed, test H0: β12 =β13 =β23 =0. Give the P-value and conclusions. 12.68 A carbon dioxide (CO2) flooding technique is used to extract crude oil. The CO2 floods oil pock- ets and displaces the crude oil. In an experiment, flow tubes are dipped into sample oil pockets containing a known amount of oil. Using three different values of Review Exercises 503 flow pressure and three different values of dipping an- gles, the oil pockets are flooded with CO2, and the per- centage of oil displaced recorded. Consider the model yi = β0 + β1x1i + β2x2i + β11x21i + β22x2i + β12x1ix2i + εi. Fit the model above to the data, and suggest any model editing that may be needed. (a) (b) (c) TestH0: β1 =β2 =β3 =0. Plot studentized residuals against Pressure (lb/in2), x1 1000 1000 1000 1500 1500 1500 2000 2000 2000 Dipping Angle, x2 0 15 30 0 15 30 0 15 30 Oil Recovery (%), y 60.58 72.72 79.99 66.83 80.78 89.78 69.18 80.31 91.99 12.70 A study was conducted to determine whether lifestyle changes could replace medication in reducing blood pressure among hypertensives. The factors con- sidered were a healthy diet with an exercise program, the typical dosage of medication for hypertension, and no intervention. The pretreatment body mass index (BMI) was also calculated because it is known to affect blood pressure. The response considered in this study was change in blood pressure. The variable “group” had the following levels. 1 = Healthy diet and an exercise program 2 = Medication 3 = No intervention (a) Fit an appropriate model using the data below. Does it appear that exercise and diet could be effec- tively used to lower blood pressure? Explain your answer from the results. (b) Would exercise and diet be an effective alternative to medication? (Hint: You may wish to form the model in more than one way to answer both of these questions.) Change in Blood Pressure Group BMI −32 1 27.3 −21 1 22.1 −26 1 26.1 −16 1 27.8 −11 2 19.2 −19 2 26.1 −23 2 28.6 −5 2 23.0 −6 3 28.1 5 3 25.3 −11 3 26.7 14 3 22.3 Show that in choosing the so-called best subset model from a series of candidate models, choosing the model with the smallest s2 is equivalent to choosing the model with the smallest Ra2dj. Source: Wang, G. C. “Microscopic Investigations of CO2 Flooding Process,” Journal of Petroleum Technology, Vol. 34, No. 8, Aug. 1982. 12.69 An article in the Journal of Pharmaceutical Sciences (Vol. 80, 1991) presents data on the mole fraction solubility of a solute at a constant tempera- ture. Also measured are the dispersion x1 and dipolar and hydrogen bonding solubility parameters x2 and x3. A portion of the data is shown in the table below. In the model, y is the negative logarithm of the mole frac- tion. Fit the model yi = β0 +β1x1i +β2x2i +β3x3i +εi, for i = 1,2,...,20. Obs. y x1 x2 x3 // x1 , Consider two additional models that are competi- x2 , Model 2: Add x21, x2, x23. Model 3: Add x21, x2, x23, x1x2, x1x3, x2x3. Use PRESS and Cp with these three models to ar- rive at the best among the three. (three plots). Comment. 
tors to the models above: and x3 1 0.2220 7.3 2 0.3950 8.7 3 0.4220 8.8 4 0.4370 8.1 5 0.4280 9.0 6 0.4670 8.7 7 0.4440 9.3 8 0.3780 7.6 9 0.4940 10.0 10 0.4560 8.4 11 0.4520 9.3 12 0.1120 7.7 13 0.4320 9.8 14 0.1010 7.3 15 0.2320 8.5 16 0.3060 9.5 17 0.0923 7.4 18 0.1160 7.8 19 0.0764 7.7 20 0.4390 10.3 0.0 0.0 0.0 0.3 0.7 1.0 4.0 0.2 0.5 1.0 1.5 2.8 2.1 1.0 5.1 3.4 0.0 0.3 3.7 4.1 3.6 2.0 2.8 7.1 4.2 2.0 2.5 6.8 2.0 6.6 2.5 5.0 2.8 7.8 2.8 7.7 3.0 8.0 1.7 4.2 12.71 504 Chapter 12 Multiple Linear Regression and Certain Nonlinear Regression Models 12.72 Case Study: Consider the data set for Exer- cise 12.12, page 452 (hospital data), repeated here. Sitex1 x2 x3 x4 x5 y (a) The SAS PROC REG outputs provided in Figures 12.9 and 12.10 supply a considerable amount of in- formation. Goals are to do outlier detection and eventually determine which model terms are to be used in the final model. (b) Often the role of a single regressor variable is not apparent when it is studied in the presence of sev- eral other variables. This is due to multicollinear- ity. With this in mind, comment on the importance of x2 and x3 in the full model as opposed to their importance in a model in which they are the only variables. (c) Comment on what other analyses should be run. (d) Run appropriate analyses and write your conclu- sions concerning the final model. 1 15.57 2463 2 44.02 2048 3 20.42 3940 4 18.74 6505 5 49.20 5723 6 44.92 11,520 7 55.48 5779 8 59.28 5969 9 94.39 8461 10 128.02 20,106 11 96.00 13,313 12 131.42 10,771 13 127.21 15,543 14 252.90 36,194 15 409.20 34,703 16 463.70 39,204 17 510.22 86,533 472.92 18.0 4.45 1339.75 9.5 6.92 620.25 12.8 4.28 568.33 36.7 3.90 1497.60 35.7 5.50 1365.83 24.0 4.60 1687.00 43.3 5.62 1639.92 46.7 5.l5 2872.33 78.7 6.18 3655.08 180.5 6.15 2912.00 60.9 5.88 3921.00 103.7 4.88 3865.67 126.8 5.50 7684.10 157.7 7.00 12,446.33 169.4 10.75 14,098.40 331.4 7.05 15,524.00 371.6 6.35 566.52 696.82 1033.15 1003.62 1611.37 1613.27 1854.17 2160.55 2305.58 3503.93 3571.59 3741.40 4026.52 10,343.81 11,732.17 15,414.94 18,854.45 Dependent Variable: y Source Model Error Corrected Total Analysis of Variance Sum of Mean Square 98035498 412277 Root MSE Dependent Mean Coeff Var R-Square 0.9908 Adj R-Sq 0.9867 DF 5 11 16 Squares 490177488 4535052 494712540 642.08838 4978.48000 12.89728 F Value 237.79 Pr > F <.0001 Parameter Estimates Parameter Standard Variable Label DF Estimate Error t Value Pr > |t|
Intercept Intercept
x1 Average Daily Patient Load
x2 Monthly X-Ray Exposure
x3 Monthly Occupied Bed Days
x4 Eligible Population in the
Area/100
x5 Average Length of Patients
Stay in Days
1 1962.94816 1071.36170 1.83
0.0941
0.8740
0.0234
0.6174
0.5685
0.0867
1 1 1 1
1
-15.85167
0.05593
1.58962
-4.21867
97.65299 -0.16
0.02126 2.63
3.09208 0.51
7.17656 -0.59
-394.31412 209.63954 -1.88
Figure 12.9: SAS output for Review Exercise 12.72; part I.

Review Exercises
505
Dependent Predicted Std Error
Obs Variable Value Mean Predict
95% CL Mean
244.0765 1306
11.8355 1470
490.9234 1717
650.3459 1831
1099 2029
1535 2767
1208 2172
703.9948 2768
2098 3376
2394 4970
2823 3655
3630 5077
3566 4948
8213 9323
10974 13500
13749 16328
18000 20641
1 566.5200 775.0251
2 696.8200 740.6702
3 1033
4 1604
5 1611
6 1613
7 1854
8 2161
9 2306
10 3504
11 3572
12 3741
13 4027
14 10344
15 11732
16 15415
17 18854
Obs Residual
1 -208.5051
2 -43.8502
3 -70.7734
4 363.1244
5 46.9483
6 -538.0017
7 164.4696
8 424.3145
9 -431.4090
10 -177.9234
11 332.6011
12 -611.9330
13 -230.5684
14 1576
15 -504.8574
16 376.5491
17 -466.2470
241.2323
95% CL Predict
-734.6494 2285
-849.4275 2331
-436.5244 2644
-291.0028 2772
76.6816 3052
609.5796 3693
196.5345 3183
-13.8306 3486
1186 4288
1770 5594
1766 4713
2766 5941
2684 5830
7249 10286
10342 14133
13126 16951
17387 21255
12
|
|
|
| |* |
| | |
| *| |
| | |
| |* |
| *| |
| *| |
| |* |
| **| |
| | |
| |***** |
| ***| |
| |** |
| ****| |
331.1402
1104 278.5116
1240 268.1298
1564 211.2372
2151 279.9293
1690 218.9976
1736 468.9903
2737 290.4749
3682 585.2517
3239 189.0989
4353 328.8507
4257 314.0481
8768 252.2617
12237 573.9168
15038 585.7046
19321 599.9780
Std Error Student
Residual Residual
595.0 -0.350
550.1 -0.0797
578.5 -0.122
583.4 0.622
606.3 0.0774
577.9 -0.931
603.6 0.272
438.5 0.968
572.6 -0.753
264.1 -0.674
613.6 0.542
551.5 -1.110
560.0 -0.412
590.5 2.669
287.9 -1.753
263.1 1.431
228.7 -2.039
-2-1 0 | | | | | |
Figure 12.10: SAS output for Review Exercise 12.72; part II.

506
Chapter 12 Multiple Linear Regression and Certain Nonlinear Regression Models
12.13
Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
There are several procedures discussed in this chapter for use in the “attempt” to find the best model. However, one of the most important misconceptions under which na ̈ıve scientists or engineers labor is that there is a true linear model and that it can be found. In most scientific phenomena, relationships between scientific variables are nonlinear in nature and the true model is unknown. Linear statistical models are empirical approximations.
At times, the choice of the model to be adopted may depend on what informa- tion needs to be derived from the model. Is it to be used for prediction? Is it to be used for the purpose of explaining the role of each regressor? This “choice” can be made difficult in the presence of collinearity. It is true that for many regression problems there are multiple models that are very similar in performance. See the Myers reference (1990) for details.
One of the most damaging misuses of the material in this chapter is to assign too much importance to R2 in the choice of the so-called best model. It is important to remember that for any data set, one can obtain an R2 as large as one desires, within the constraint 0 ≤ R2 ≤ 1. Too much attention to R2 often leads to overfitting.
Much attention was given in this chapter to outlier detection. A classical serious misuse of statistics centers around the decision made concerning the detection of outliers. We hope it is clear that the analyst should absolutely not carry out the exercise of detecting outliers, eliminating them from the data set, fitting a new model, reporting outlier detection, and so on. This is a tempting and disastrous procedure for arriving at a model that fits the data well, with the result being an example of how to lie with statistics. If an outlier is detected, the history of the data should be checked for possible clerical or procedural error before it is eliminated from the data set. One must remember that an outlier by definition is a data point that the model did not fit well. The problem may not be in the data but rather in the model selection. A changed model may result in the point not being detected as an outlier.
There are many types of responses that occur naturally in practice but can’t be used in an analysis of standard least squares because classic least squares as- sumptions do not hold. The assumptions that often fail are those of normal errors and homogeneous variance. For example, if the response is a proportion, say pro- portion defective, the response distribution is related to the binomial distribution. A second response that occurs often in practice is that of Poisson counts. Clearly the distribution is not normal, and the response variance, which is equal to the Poisson mean, will vary from observation to observation. For more details on these nonideal conditions, see Myers et al. (2008) in the Bibliography.

Chapter 13
One-Factor Experiments: General
13.1 Analysis-of-Variance Technique
In the estimation and hypothesis testing material covered in Chapters 9 and 10, we were restricted in each case to considering no more than two population parameters. Such was the case, for example, in testing for the equality of two population means using independent samples from normal populations with common but unknown variance, where it was necessary to obtain a pooled estimate of σ2.
This material dealing in two-sample inference represents a special case of what we call the one-factor problem. For example, in Exercise 10.35 on page 357, the survival time was measured for two samples of mice, where one sample received a new serum for leukemia treatment and the other sample received no treatment. In this case, we say that there is one factor, namely treatment, and the factor is at two levels. If several competing treatments were being used in the sampling process, more samples of mice would be necessary. In this case, the problem would involve one factor with more than two levels and thus more than two samples.
In the k > 2 sample problem, it will be assumed that there are k samples from k populations. One very common procedure used to deal with testing population means is called the analysis of variance, or ANOVA.
The analysis of variance is certainly not a new technique to the reader who has followed the material on regression theory. We used the analysis-of-variance approach to partition the total sum of squares into a portion due to regression and a portion due to error.
Suppose in an industrial experiment that an engineer is interested in how the mean absorption of moisture in concrete varies among 5 different concrete aggre- gates. The samples are exposed to moisture for 48 hours. It is decided that 6 samples are to be tested for each aggregate, requiring a total of 30 samples to be tested. The data are recorded in Table 13.1.
The model for this situation may be set up as follows. There are 6 observations taken from each of 5 populations with means μ1, μ2, . . . , μ5, respectively. We may wish to test
H0: μ1 =μ2 =···=μ5,
H1: At least two of the means are not equal.
507

508
Chapter 13 One-Factor Experiments: General
Table 13.1: Absorption of Moisture in Concrete Aggregates
Aggregate: 1 2
3 4 5
551 595 639 457 580 615 450 508 511 731 583 573 499 633 648 632 517 677
417 563
449 631
517 522
438 613
415 656
555 679
2791 3664 465.17 610.67
Total
Mean 553.33
16,854 561.80
3320 3416 3663 569.33 610.50
In addition, we may be interested in making individual comparisons among these 5 population means.
Two Sources of Variability in the Data
In the analysis-of-variance procedure, it is assumed that whatever variation exists among the aggregate averages is attributed to (1) variation in absorption among observations within aggregate types and (2) variation among aggregate types, that is, due to differences in the chemical composition of the aggregates. The within- aggregate variation is, of course, brought about by various causes. Perhaps humidity and temperature conditions were not kept entirely constant throughout the experiment. It is possible that there was a certain amount of heterogeneity in the batches of raw materials that were used. At any rate, we shall consider the within-sample variation to be chance or random variation. Part of the goal of the analysis of variance is to determine if the differences among the 5 sample means are what we would expect due to random variation alone or, rather, due to variation beyond merely random effects, i.e., differences in the chemical composition of the aggregates.
Many pointed questions appear at this stage concerning the preceding problem. For example, how many samples must be tested for each aggregate? This is a question that continually haunts the practitioner. In addition, what if the within- sample variation is so large that it is difficult for a statistical procedure to detect the systematic differences? Can we systematically control extraneous sources of variation and thus remove them from the portion we call random variation? We shall attempt to answer these and other questions in the following sections.
13.2 The Strategy of Experimental Design
In Chapters 9 and 10, the notions of estimation and testing for the two-sample case were covered under the important backdrop of the way the experiment is con- ducted. This falls into the broad category of design of experiments. For example, for the pooled t-test discussed in Chapter 10, it is assumed that the factor levels (treatments in the mice example) are assigned randomly to the experimental units (mice). The notion of experimental units was discussed in Chapters 9 and 10 and

13.3
One-Way Analysis of Variance: Completely Randomized Design 509
13.3
illustrated through examples. Simply put, experimental units are the units (mice, patients, concrete specimens, time) that provide the heterogeneity that leads to experimental error in a scientific investigation. The random assignment elim- inates bias that could result with systematic assignment. The goal is to distribute uniformly among the factor levels the risks brought about by the heterogeneity of the experimental units. Random assignment best simulates the conditions that are assumed by the model. In Section 13.7, we discuss blocking in experiments. The notion of blocking was presented in Chapters 9 and 10, when comparisons between means were accomplished with pairing, that is, the division of the experimental units into homogeneous pairs called blocks. The factor levels or treatments are then assigned randomly within blocks. The purpose of blocking is to reduce the effective experimental error. In this chapter, we naturally extend the pairing to larger block sizes, with analysis of variance being the primary analytical tool.
One-Way Analysis of Variance:
Completely Randomized Design (One-Way ANOVA)
Random samples of size n are selected from each of k populations. The k differ- ent populations are classified on the basis of a single criterion such as different treatments or groups. Today the term treatment is used generally to refer to the various classifications, whether they be different aggregates, different analysts, different fertilizers, or different regions of the country.
Assumptions and Hypotheses in One-Way ANOVA
It is assumed that the k populations are independent and normally distributed with means μ1, μ2, . . . , μk and common variance σ2. As indicated in Section 13.2, these assumptions are made more palatable by randomization. We wish to derive appropriate methods for testing the hypothesis
H0: μ1 =μ2 =···=μk,
H1: At least two of the means are not equal.
Let yij denote the jth observation from the ith treatment and arrange the data as in Table 13.2. Here, Yi. is the total of all observations in the sample from the ith treatment, y ̄i. is the mean of all observations in the sample from the ith treatment, Y.. is the total of all nk observations, and y ̄.. is the mean of all nk observations.
Model for One-Way ANOVA
Each observation may be written in the form Yij =μi+εij,
where εij measures the deviation of the jth observation of the ith sample from the corresponding treatment mean. The εij-term represents random error and plays the same role as the error terms in the regression models. An alternative and

510
Chapter 13 One-Factor Experiments: General Table 13.2: k Random Samples
Treatment: 1 2 ··· i ··· k y11 y21 ··· yi1 ··· yk1 y12 y22 ··· yi2 ··· yk2
. . . .
Theorem 13.1:
the constraint
αi = 0. Hence, we may write
y1n y2n Total Y1. Y2. Mean y ̄1. y ̄2.
··· yin ··· ykn
··· Yi. ··· ··· y ̄i. ···
Yk. Y.. y ̄k. y ̄..
preferred form of this equation is obtained by substituting μi = μ + αi, subject to
􏰦k i=1
Yij =μ+αi+εij, where μ is just the grand mean of all the μi, that is,
1 􏰤k μ=k μi,
i=1
and αi is called the effect of the ith treatment.
The null hypothesis that the k population means are equal against the alter-
native that at least two of the means are unequal may now be replaced by the equivalent hypothesis
H0: α1 =α2 =···=αk =0,
H1: At least one of the αi is not equal to zero.
Resolution of Total Variability into Components
Our test will be based on a comparison of two independent estimates of the common population variance σ2. These estimates will be obtained by partitioning the total variability of our data, designated by the double summation
into two components.
􏰤k 􏰤n
(yij −y ̄..)2,
i=1 j=1
Sum-of-Squares Identity
􏰤k 􏰤n i=1 j=1
(yij − y ̄..)2 = n
􏰤k i=1
(y ̄i. − y ̄..)2 +
􏰤k 􏰤n i=1 j=1
(yij − y ̄i.)2
It will be convenient in what follows to identify the terms of the sum-of-squares identity by the following notation:

13.3 One-Way Analysis of Variance: Completely Randomized Design 511
Three Important Measures of Variability
􏰤k 􏰤n i=1 j=1
Theorem 13.2:
Treatment Mean Square
SST =
SSA = n
(yij − y ̄..)2 = total sum of squares, (y ̄i. − y ̄..)2 = treatment sum of squares,
􏰤k i=1
􏰤k 􏰤n i=1 j=1
The sum-of-squares identity can then be represented symbolically by the equation SST = SSA + SSE.
The identity above expresses how between-treatment and within-treatment variation add to the total sum of squares. However, much insight can be gained by investigating the expected value of both SSA and SSE. Eventually, we shall develop variance estimates that formulate the ratio to be used to test the equality of population means.
The proof of the theorem is left as an exercise (see Review Exercise 13.53 on page 556).
If H0 is true, an estimate of σ2, based on k − 1 degrees of freedom, is provided by this expression:
2 SSA s1 = k − 1
SSE =
(yij − y ̄i.)2 = error sum of squares.
E(SSA)=(k−1)σ2 +n
􏰤k i=1
αi2
If H0 is true and thus each αi in Theorem 13.2 is equal to zero, we see that 􏰧SSA􏰨
E k−1 =σ2,
and s21 is an unbiased estimate of σ2. However, if H1 is true, we have
Error Mean Square
􏰧 S S A 􏰨 n 􏰤k
E k−1 =σ2+k−1 αi2,
i=1
and s21 estimates σ2 plus an additional term, which measures variation due to the systematic effects.
A second and independent estimate of σ2, based on k(n−1) degrees of freedom, is this familiar formula:
s2= SSE k(n−1)

512
Chapter 13 One-Factor Experiments: General
It is instructive to point out the importance of the expected values of the mean squares indicated above. In the next section, we discuss the use of an F-ratio with the treatment mean square residing in the numerator. It turns out that when H1 is true, the presence of the condition E(s21) > E(s2) suggests that the F-ratio be used in the context of a one-sided upper-tailed test. That is, when H1 is true, we would expect the numerator s21 to exceed the denominator.
Use of F-Test in ANOVA
The estimate s2 is unbiased regardless of the truth or falsity of the null hypothesis (see Review Exercise 13.52 on page 556). It is important to note that the sum-of- squares identity has partitioned not only the total variability of the data, but also the total number of degrees of freedom. That is,
nk − 1 = k − 1 + k(n − 1). F-Ratio for Testing Equality of Means
When H0 is true, the ratio f = s21/s2 is a value of the random variable F having the F-distribution with k − 1 and k(n − 1) degrees of freedom (see Theorem 8.8). Since s21 overestimates σ2 when H0 is false, we have a one-tailed test with the critical region entirely in the right tail of the distribution.
The null hypothesis H0 is rejected at the α-level of significance when f >fα[k−1,k(n−1)].
Another approach, the P-value approach, suggests that the evidence in favor of or against H0 is
P =P{f[k−1,k(n−1)]>f}.
The computations for an analysis-of-variance problem are usually summarized in
tabular form, as shown in Table 13.3.
Table 13.3: Analysis of Variance for the One-Way ANOVA
Source of Variation
Treatments
Error Total
Sum of Squares
SSA
SSE SST
Degrees of Freedom
k−1 k(n−1)
kn − 1
Mean
Square f
Computed
s21 = SSA s21 k−1 s2
s2 = SSE k(n−1)
Example 13.1: Test the hypothesis μ1 = μ2 = · · · = μ5 at the 0.05 level of significance for the data of Table 13.1 on absorption of moisture by various types of cement aggregates.

13.3
One-Way Analysis of Variance: Completely Randomized Design 513 Solution: The hypotheses are
H0: μ1 =μ2 =···=μ5,
H1: At least two of the means are not equal. α = 0.05.
Critical region: f > 2.76 with v1 = 4 and v2 = 25 degrees of freedom. The sum-of-squares computations give
SST = 209,377, SSA = 85,356, SSE = 209,377 − 85,356 = 124,021.
These results and the remaining computations are exhibited in Figure 13.1 in the SAS ANOVA procedure.
The GLM Procedure
Dependent Variable: moisture
Source DF
Model 4
Error 25
Corrected Total 29
R-Square Coeff Var
0.407669 12.53703 70.43304
Sum of
Mean Square
21339.1167
4960.8133
moisture Mean
561.8000
Mean Square
21339.11667 4.30 0.0088
Source
aggregate
DF Type I SS
4 85356.46667
F Value
Pr > F
Squares
85356.4667
124020.3333
209376.8000
F Value 4.30
Pr > F 0.0088
Root MSE
Sum of Squares, Unequal Sample Sizes
Figure 13.1: SAS output for the analysis-of-variance procedure.
Decision: Reject H0 and conclude that the aggregates do not have the same mean absorption. The P-value for f = 4.30 is 0.0088, which is smaller than 0.05.
In addition to the ANOVA, a box plot was constructed for each aggregate. The plots are shown in Figure 13.2. From these plots it is evident that the absorption is not the same for all aggregates. In fact, it appears as if aggregate 4 stands out from the rest. A more formal analysis showing this result will appear in Exercise 13.21 on page 531.
During experimental work, one often loses some of the desired observations. Experimental animals may die, experimental material may be damaged, or human subjects may drop out of a study. The previous analysis for equal sample size will still be valid if we slightly modify the sum of squares formulas. We now assume the k random samples to be of sizes n1, n2, . . . , nk, respectively.
k ni k 􏰤􏰤􏰤
SST = (yij − y ̄..)2, SSA = ni(y ̄i. − y ̄..)2, SSE = SST − SSA i=1 j=1 i=1

514
Chapter 13 One-Factor Experiments: General
sample Q3
raw data
sample median sample Q1
12345 Aggregate
Figure 13.2: Box plots for the absorption of moisture in concrete aggregates. The degrees of freedom are then partitioned as before: N − 1 for SST, k − 1 for
􏰦k i=1
Example 13.2: Part of a study conducted at Virginia Tech was designed to measure serum alka- line phosphatase activity levels (in Bessey-Lowry units) in children with seizure disorders who were receiving anticonvulsant therapy under the care of a private physician. Forty-five subjects were found for the study and categorized into four drug groups:
G-1: Control (not receiving anticonvulsants and having no history of seizure disorders)
G-2: Phenobarbital
G-3: Carbamazepine
G-4: Other anticonvulsants
From blood samples collected from each subject, the serum alkaline phosphatase activity level was determined and recorded as shown in Table 13.4. Test the hy- pothesis at the 0.05 level of significance that the average serum alkaline phosphatase activity level is the same for the four drug groups.
SSA, and N − 1 − (k − 1) = N − k for SSE, where N =
ni.
Moisture
450 500 550 600 650 700

13.3 One-Way Analysis of Variance: Completely Randomized Design 515 Table 13.4: Serum Alkaline Phosphatase Activity Level
G-2
G-3
97.07 73.40 68.50 91.85
106.60 0.57 0.79 0.77 0.81
62.10
94.95 142.50 53.00 175.00 79.50 29.50 78.40 127.50
49.20 44.54 45.80 95.84 30.10 36.50 82.30 87.85
105.00 95.22
97.50 105.00 58.05 86.60 58.35 72.80 116.70 45.15 70.35 77.40
G-1
G-4
110.60 57.10 117.60 77.71 150.00 82.90 111.50
Solution: With the level of significance at 0.05, the hypotheses are H0: μ1 =μ2 =μ3 =μ4,
H1: At least two of the means are not equal.
Critical region: f > 2.836, from interpolating in Table A.6.
Computations: Y1. = 1460.25, Y2. = 440.36, Y3. = 842.45, Y4. = 707.41, and Y.. = 3450.47. The analysis of variance is shown in the M I N I T AB output of Figure 13.3.
One-way ANOVA: G-1, G-2, G-3, G-4
Source DF SS MS F P
Factor 3 13939 4646 3.57 0.022
Error 41 53376 1302
Total 44 67315
S = 36.08
R-Sq = 20.71%
R-Sq(adj) = 14.90%
Individual 95% CIs For Mean Based on
Pooled StDev
–+———+———+———+——-
(—-*—–)
(——-*——-)
(——-*——-)
(——–*——–)
–+———+———+———+——-
30 60 90 120
Pooled StDev = 36.08
Mean StDev
73.01 25.75
48.93 47.11
93.61 46.57
Level N
G-1 20
G-2 9
G-3 9
G-4 7 101.06 30.76
Figure 13.3: MINITAB analysis of data in Table 13.4.

516
Chapter 13 One-Factor Experiments: General
13.4
Decision: Reject H0 and conclude that the average serum alkaline phosphatase activity levels for the four drug groups are not all the same. The calculated P- value is 0.022.
In concluding our discussion on the analysis of variance for the one-way classi- fication, we state the advantages of choosing equal sample sizes over the choice of unequal sample sizes. The first advantage is that the f-ratio is insensitive to slight departures from the assumption of equal variances for the k populations when the samples are of equal size. Second, the choice of equal sample sizes minimizes the probability of committing a type II error.
Tests for the Equality of Several Variances
Although the f-ratio obtained from the analysis-of-variance procedure is insensitive to departures from the assumption of equal variances for the k normal populations when the samples are of equal size, we may still prefer to exercise caution and run a preliminary test for homogeneity of variances. Such a test would certainly be advisable in the case of unequal sample sizes if there was a reasonable doubt concerning the homogeneity of the population variances. Suppose, therefore, that we wish to test the null hypothesis
H 0 : σ 12 = σ 2 2 = · · · = σ k2
H1: The variances are not all equal.
The test that we shall use, called Bartlett’s test, is based on a statistic whose sampling distribution provides exact critical values when the sample sizes are equal. These critical values for equal sample sizes can also be used to yield highly accurate approximations to the critical values for unequal sample sizes.
First, we compute the k sample variances s21, s2, . . . , s2k from samples of size
􏰦k i=1
against the alternative
n1, n2, . . . , nk, with the pooled estimate
ni = N. Second, we combine the sample variances to give
1 􏰤k
(ni − 1)s2i .
b = [(s21)n1−1(s2)n2−1 ···(s2k)nk−1]1/(N−k)
s2p
s2p = N − k
i=1
Now
is a value of a random variable B having the Bartlett distribution. For the special case where n1 = n2 = ··· = nk = n, we reject H0 at the α-level of significance if
b < bk(α;n), 13.4 Tests for the Equality of Several Variances 517 where bk(α;n) is the critical value leaving an area of size α in the left tail of the Bartlett distribution. Table A.10 gives the critical values, bk(α;n), for α = 0.01 and 0.05; k = 2,3,...,10; and selected values of n from 3 to 100. When the sample sizes are unequal, the null hypothesis is rejected at the α-level of significance if b < bk(α;n1,n2,...,nk), bk(α;n1,n2,...,nk)≈ n1bk(α;n1)+n2bk(α;n2)+···+nkbk(α;nk). N As before, all the bk(α; ni) for sample sizes n1, n2, . . . , nk are obtained from Table A.10. Example 13.3: Use Bartlett’s test to test the hypothesis at the 0.01 level of significance that the population variances of the four drug groups of Example 13.2 are equal. Solution: We have the hypotheses H0: σ12 =σ2 =σ32 =σ42, H1: The variances are not equal, with α = 0.01. Critical region: Referring to Example 13.2, we have n1 = 20, n2 = 9, n3 = 9, n4 = 7, N = 45, and k = 4. Therefore, we reject when b < b4(0.01; 20, 9, 9, 7) ≈ (20)(0.8586) + (9)(0.6892) + (9)(0.6892) + (7)(0.6045) 45 = 0.7513. Computations: First compute s21 = 662.862, s2 = 2219.781, s23 = 2168.434, s24 = 946.032, and then s2p = (19)(662.862) + (8)(2219.781) + (8)(2168.434) + (6)(946.032) 41 = 1301.861. where Now b = 1301.861 = 0.8557. Decision: Do not reject the hypothesis, and conclude that the population variances of the four drug groups are not significantly different. Although Bartlett’s test is most often used for testing of homogeneity of vari- ances, other methods are available. A method due to Cochran provides a compu- tationally simple procedure, but it is restricted to situations in which the sample [(662.862)19 (2219.781)8 (2168.434)8 (946.032)6 ]1/41 518 Chapter 13 One-Factor Experiments: General // sizes are equal. Cochran’s test is particularly useful for detecting if one variance is much larger than the others. The statistic that is used is largest Si2 G=􏰦k , Si2 i=1 and the hypothesis of equality of variances is rejected if g > gα, where the value of gα is obtained from Table A.11.
To illustrate Cochran’s test, let us refer again to the data of Table 13.1 on moisture absorption in concrete aggregates. Were we justified in assuming equal variances when we performed the analysis of variance in Example 13.1? We find that
s21 = 12,134, s2 = 2303, s23 = 3594, s24 = 3319, s25 = 3455. Therefore,
g = 12,134 = 0.4892, 24,805
which does not exceed the table value g0.05 = 0.5065. Hence, we conclude that the assumption of equal variances is reasonable.
Exercises
13.1 Six different machines are being considered for use in manufacturing rubber seals. The machines are being compared with respect to tensile strength of the product. A random sample of four seals from each ma- chine is used to determine whether the mean tensile strength varies from machine to machine. The follow- ing are the tensile-strength measurements in kilograms per square centimeter × 10−1:
Machine 123456
17.5 16.4 20.3 14.6 17.5 18.3
16.9 19.2 15.7 16.7 19.2 16.2
15.8 17.7 17.8 20.8 16.5 17.5
18.6 15.4 18.9 18.9 20.5 20.1 Perform the analysis of variance at the 0.05 level of sig- nificance and indicate whether or not the mean tensile
strengths differ significantly for the six machines.
13.2 The data in the following table represent the number of hours of relief provided by five different brands of headache tablets administered to 25 subjects experiencing fevers of 38◦C or more. Perform the anal- ysis of variance and test the hypothesis at the 0.05 level of significance that the mean number of hours of relief provided by the tablets is the same for all five brands. Discuss the results.
Tablet
ABCDE
5.2 9.1 3.2 2.4 7.1 4.7 7.1 5.8 3.4 6.6 8.1 8.2 2.2 4.1 9.3 6.2 6.0 3.1 1.0 4.2 3.0 9.1 7.2 4.0 7.6
In an article “Shelf-Space Strategy in Retailing,” published in Proceedings: Southern Marketing Associa- tion, the effect of shelf height on the supermarket sales of canned dog food is investigated. An experiment was conducted at a small supermarket for a period of 8 days on the sales of a single brand of dog food, referred to as Arf dog food, involving three levels of shelf height: knee level, waist level, and eye level. During each day, the shelf height of the canned dog food was randomly changed on three different occasions. The remaining sections of the gondola that housed the given brand were filled with a mixture of dog food brands that were both familiar and unfamiliar to customers in this par- ticular geographic area. Sales, in hundreds of dollars, of Arf dog food per day for the three shelf heights are given. Based on the data, is there a significant differ- ence in the average daily sales of this dog food based on shelf height? Use a 0.01 level of significance.
13. 3

Exercises
519
//
Knee Level
Shelf Height
Waist Level
Eye Level
State University, was designed to assess the ability of this enzyme to undergo conformation or shape changes. Changes in the specific activity of the enzyme caused by variations in the concentration of NADP could be interpreted as supporting the theory of conformational change. The enzyme in question is located in the in- ner membrane of the tapeworm’s mitochondria. Tape- worms were homogenized, and through a series of cen- trifugations, the enzyme was isolated. Various con- centrations of NADP were then added to the isolated enzyme solution, and the mixture was then incubated in a water bath at 56◦C for 3 minutes. The enzyme was then analyzed on a dual-beam spectrophotometer, and the results shown were calculated, with the specific activity of the enzyme given in nanomoles per minute per milligram of protein. Test the hypothesis at the 0.01 level that the average specific activity is the same for the four concentrations.
NADP Concentration (nm) 0 80 160 360
77 88 85 82 94 85 86 93 87 78 90 81 81 91 80 86 94 79 77 90 87 81 87 93
13.4 Immobilization of free-ranging white-tailed deer by drugs allows researchers the opportunity to closely examine the deer and gather valuable physiological in- formation. In the study Influence of Physical Restraint and Restraint Facilitating Drugs on Blood Measure- ments of White-Tailed Deer and Other Selected Mam- mals, conducted at Virginia Tech, wildlife biologists tested the “knockdown” time (time from injection to immobilization) of three different immobilizing drugs. Immobilization, in this case, is defined as the point where the animal no longer has enough muscle control to remain standing. Thirty male white-tailed deer were randomly assigned to each of three treatments. Group A received 5 milligrams of liquid succinylcholine chlo- ride (SCC); group B received 8 milligrams of powdered SCC; and group C received 200 milligrams of phency- clidine hydrochloride. Knockdown times, in minutes, were recorded. Perform an analysis of variance at the 0.01 level of significance and determine whether or not the average knockdown time for the three drugs is the same.
Group
ABC
11 10 4 574
11.01 11.38 12.09 10.67 10.55 12.33 11.26 10.08
11.02 10.67 11.50 10.31
6.04 10.31 8.65 8.30 7.76 9.48
10.13 8.89 9.36
13.6 A study measured the sorption (either absorp- tion or adsorption) rates of three different types of or- ganic chemical solvents. These solvents are used to clean industrial fabricated-metal parts and are poten- tial hazardous waste. Independent samples from each type of solvent were tested, and their sorption rates were recorded as a mole percentage. (See McClave, Dietrich, and Sincich, 1997.)
Aromatics
Chloroalkanes
Esters
0.29 0.43 0.06 0.06 0.51 0.09 0.44 0.10 0.17 0.55 0.53 0.17 0.61 0.34 0.60
14
7
10
7
23
4
11
11
16 6 7 3 7 5 5 6
10 8 10 3 6 7 12 3
1.06 0.95 1.58 1.12 0.79 0.65 1.45 0.91 0.82 1.15 0.57 0.83 0.89 1.12 1.16 0.43
1.05
Is there a significant difference in
13.5 The mitochondrial enzyme NADPH:NAD transhydrogenase of the common rat tapeworm (Hy- menolepiasis diminuta) catalyzes hydrogen in the transfer from NADPH to NAD, producing NADH. This enzyme is known to serve a vital role in the tapeworm’s anaerobic metabolism, and it has recently been hypothesized that it may serve as a proton ex- change pump, transferring protons across the mito- chondrial membrane. A study on Effect of Various Substrate Concentrations on the Conformational Vari- ation of the NADPH:NAD Transhydrogenase of Hy- menolepiasis diminuta, conducted at Bowling Green
conclusions. Which solvent would you use?
13.7 It has been shown that the fertilizer magnesium ammonium phosphate, MgNH4 PO4 , is an effective sup- plier of the nutrients necessary for plant growth. The compounds supplied by this fertilizer are highly solu- ble in water, allowing the fertilizer to be applied di- rectly on the soil surface or mixed with the growth substrate during the potting process. A study on the Effect of Magnesium Ammonium Phosphate on Height of Chrysanthemums was conducted at George Mason University to determine a possible optimum level of fertilization, based on the enhanced vertical growth re- sponse of the chrysanthemums. Forty chrysanthemum
the mean sorption rates for the three solvents? Use a P-value for your

520
Chapter 13 One-Factor Experiments: General
seedlings were divided into four groups, each containing 10 plants. Each was planted in a similar pot containing a uniform growth medium. To each group of plants an increasing concentration of MgNH4PO4, measured in grams per bushel, was added. The four groups of plants were grown under uniform conditions in a greenhouse for a period of four weeks. The treatments and the re- spective changes in heights, measured in centimeters, are shown next.
erage attained height of chrysanthemums? How much MgNH4PO4 appears to be best?
13.8 For the data set in Exercise 13.7, use Bartlett’s test to check whether the variances are equal. Use α = 0.05.
13.9 Use Bartlett’s test at the 0.01 level of signifi- cance to test for homogeneity of variances in Exercise 13.5 on page 519.
13.10 Use Cochran’s test at the 0.01 level of signifi- cance to test for homogeneity of variances in Exercise 13.4 on page 519.
13.11 Use Bartlett’s test at the 0.05 level of signifi- cance to test for homogeneity of variances in Exercise 13.6 on page 519.
Treatment
100 g/bu
200 g/bu
50 g/bu
400 g/bu
13.2 12.4 16.0 12.6 7.8 14.4 21.0 14.8 12.8 17.2 14.8 13.0 20.0 15.8 19.1 15.8 13.0 14.0 14.0 23.6 17.0 27.0 18.0 26.0 14.2 21.6 14.0 17.0 19.6 18.0 21.1 22.0 15.0 20.0 22.2 24.4 20.2 23.2 25.0 18.2
Can we conclude at the 0.05 level of significance that different concentrations of MgNH4PO4 affect the av-
13.5 Single-Degree-of-Freedom Comparisons
The analysis of variance in a one-way classification, or a one-factor experiment, as it is often called, merely indicates whether or not the hypothesis of equal treatment means can be rejected. Usually, an experimenter would prefer his or her analysis to probe deeper. For instance, in Example 13.1, by rejecting the null hypothesis we concluded that the means are not all equal, but we still do not know where the differences exist among the aggregates. The engineer might have the feeling a priori that aggregates 1 and 2 should have similar absorption properties and that the same is true for aggregates 3 and 5. However, it is of interest to study the difference between the two groups. It would seem, then, appropriate to test the hypothesis

    H0: μ1 + μ2 − μ3 − μ5 = 0,
    H1: μ1 + μ2 − μ3 − μ5 ≠ 0.

We notice that the hypothesis is a linear function of the population means where the coefficients sum to zero.

Definition 13.1: Any linear function of the form

    ω = Σ_{i=1}^{k} c_i μ_i,

where Σ_{i=1}^{k} c_i = 0, is called a comparison or contrast in the treatment means.
The experimenter can often make multiple comparisons by testing the significance of contrasts in the treatment means, that is, by testing a hypothesis of the following type:
Hypothesis for a Contrast:

    H0: Σ_{i=1}^{k} c_i μ_i = 0,
    H1: Σ_{i=1}^{k} c_i μ_i ≠ 0,

where Σ_{i=1}^{k} c_i = 0.

The test is conducted by first computing a similar contrast in the sample means,

    w = Σ_{i=1}^{k} c_i ȳ_i..

Since Ȳ1., Ȳ2., . . . , Ȳk. are independent random variables having normal distributions with means μ1, μ2, . . . , μk and variances σ²/n1, σ²/n2, . . . , σ²/nk, respectively, Theorem 7.11 assures us that w is a value of the normal random variable W with

    mean μ_W = Σ_{i=1}^{k} c_i μ_i   and variance   σ_W² = σ² Σ_{i=1}^{k} c_i²/n_i.

Therefore, when H0 is true, μ_W = 0 and, by Example 7.5, the statistic

    W²/σ_W² = (Σ_{i=1}^{k} c_i Ȳ_i.)² / (σ² Σ_{i=1}^{k} c_i²/n_i)
is distributed as a chi-squared random variable with 1 degree of freedom. Our hypothesis is tested at the α-level of significance by computing

Test Statistic for Testing a Contrast:

    f = (Σ_{i=1}^{k} c_i ȳ_i.)² / (s² Σ_{i=1}^{k} c_i²/n_i) = SSw / s²,

where

    SSw = (Σ_{i=1}^{k} c_i ȳ_i.)² / (Σ_{i=1}^{k} c_i²/n_i).
Here f is a value of the random variable F having the F-distribution with 1 and N − k degrees of freedom.
When the sample sizes are all equal to n,

    SSw = (Σ_{i=1}^{k} c_i Y_i.)² / (n Σ_{i=1}^{k} c_i²),

where Y_i. denotes the total of the n observations for the ith treatment.
The quantity SSw, called the contrast sum of squares, indicates the portion of SSA that is explained by the contrast in question.
This sum of squares will be used to test the hypothesis that

    Σ_{i=1}^{k} c_i μ_i = 0.
It is often of interest to test multiple contrasts, particularly contrasts that are
linearly independent or orthogonal. As a result, we need the following definition:
Definition 13.2: The two contrasts

    ω1 = Σ_{i=1}^{k} b_i μ_i   and   ω2 = Σ_{i=1}^{k} c_i μ_i

are said to be orthogonal if Σ_{i=1}^{k} b_i c_i / n_i = 0 or, when the n_i are all equal to n, if

    Σ_{i=1}^{k} b_i c_i = 0.
If ω1 and ω2 are orthogonal, then the quantities SSw1 and SSw2 are components of SSA, each with a single degree of freedom. The treatment sum of squares with k − 1 degrees of freedom can be partitioned into at most k − 1 independent single-degree-of-freedom contrast sums of squares satisfying the identity

    SSA = SSw1 + SSw2 + ··· + SSw_{k−1},

if the contrasts are orthogonal to each other.
Example 13.4: Referring to Example 13.1, find the contrast sum of squares corresponding to the orthogonal contrasts
    ω1 = μ1 + μ2 − μ3 − μ5,   ω2 = μ1 + μ2 + μ3 − 4μ4 + μ5,
and carry out appropriate tests of significance. In this case, it is of interest a priori to compare the two groups (1, 2) and (3, 5). An important and independent contrast is the comparison between the set of aggregates (1, 2, 3, 5) and aggregate 4.
Solution : It is obvious that the two contrasts are orthogonal, since
(1)(1) + (1)(1) + (−1)(1) + (0)(−4) + (−1)(1) = 0.
The second contrast indicates a comparison between aggregates (1, 2, 3, and 5) and aggregate 4. We can write two additional contrasts orthogonal to the first two, namely
ω3 = μ1 − μ2 (aggregate 1 versus aggregate 2), ω4 = μ3 − μ5 (aggregate 3 versus aggregate 5).
From the data of Table 13.1, we have

    SSw1 = (3320 + 3416 − 3663 − 3664)² / (6[(1)² + (1)² + (−1)² + (−1)²]) = 14,553,

    SSw2 = [3320 + 3416 + 3663 + 3664 − 4(2791)]² / (6[(1)² + (1)² + (1)² + (1)² + (−4)²]) = 70,035.
A more extensive analysis-of-variance table is shown in Table 13.5. We note that the two contrast sums of squares account for nearly all the aggregate sum of squares. There is a significant difference between aggregates in their absorption properties, and the contrast ω1 is marginally significant. However, the f-value of 14.12 for ω2 is highly significant, and the hypothesis
    H0: μ1 + μ2 + μ3 + μ5 = 4μ4
is rejected.
Table 13.5: Analysis of Variance Using Orthogonal Contrasts

Source of Variation     Sum of Squares   Degrees of Freedom   Mean Square   Computed f
Aggregates                  85,356               4               21,339        4.30
  (1, 2) vs. (3, 5)         14,553               1               14,553        2.93
  (1, 2, 3, 5) vs. 4        70,035               1               70,035       14.12
Error                      124,021              25                4,961
Total                      209,377              29
Orthogonal contrasts allow the practitioner to partition the treatment variation into independent components. Normally, the experimenter would have certain contrasts that were of interest to him or her. Such was the case in our example, where a priori considerations suggested that aggregates (1, 2) and (3, 5) constituted distinct groups with different absorption properties, a postulation that was not strongly supported by the significance test. However, the second comparison supported the conclusion that aggregate 4 seemed to "stand out" from the rest. In this case, the complete partitioning of SSA was not necessary, since two of the four possible independent comparisons accounted for a majority of the variation in treatments.
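These contrast computations are easy to reproduce directly. The following is a minimal Python sketch, not part of the original text, that rebuilds SSw and the f statistic for the two contrasts from the treatment totals quoted in Example 13.4 (n = 6 observations per aggregate, with mean square error s² = 4961 on 25 degrees of freedom); numpy and scipy are assumed to be available.

    import numpy as np
    from scipy.stats import f as f_dist

    totals = np.array([3320.0, 3416.0, 3663.0, 2791.0, 3664.0])  # treatment totals T_i.
    n, s2, err_df = 6, 4961.0, 25          # per-treatment n, MSE, and its error df

    def contrast_test(c):
        # SSw = (sum c_i T_i.)^2 / (n * sum c_i^2) for equal sample sizes,
        # tested against F with 1 and err_df degrees of freedom.
        c = np.asarray(c, dtype=float)
        ssw = (c @ totals) ** 2 / (n * (c ** 2).sum())
        f_value = ssw / s2
        return ssw, f_value, f_dist.sf(f_value, 1, err_df)

    print(contrast_test([1, 1, -1, 0, -1]))  # omega_1: SSw ~ 14,553, f ~ 2.93
    print(contrast_test([1, 1, 1, -4, 1]))   # omega_2: SSw ~ 70,035, f ~ 14.12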
Figure 13.4 shows a SAS GLM procedure that displays a complete set of orthogonal contrasts. Note that the sums of squares for the four contrasts add to the aggregate sum of squares. Also, note that the latter two contrasts (1 versus 2, 3 versus 5) reveal insignificant comparisons.
13.6 Multiple Comparisons
The analysis of variance is a powerful procedure for testing the homogeneity of a set of means. However, if we reject the null hypothesis and accept the stated alternative—that the means are not all equal—we still do not know which of the population means are equal and which are different.
                             The GLM Procedure

    Dependent Variable: moisture

                                    Sum of
    Source           DF        Squares      Mean Square   F Value   Pr > F
    Model             4     85356.4667      21339.1167       4.30   0.0088
    Error            25    124020.3333       4960.8133
    Corrected Total  29    209376.8000

    R-Square    Coeff Var    Root MSE    moisture Mean
    0.407669     12.53703    70.43304         561.8000

    Source       DF      Type I SS    Mean Square   F Value   Pr > F
    aggregate     4    85356.46667    21339.11667      4.30   0.0088

    Source       DF    Type III SS    Mean Square   F Value   Pr > F
    aggregate     4    85356.46667    21339.11667      4.30   0.0088

    Contrast           DF    Contrast SS    Mean Square   F Value   Pr > F
    (1,2,3,5) vs. 4     1    70035.00833    70035.00833     14.12   0.0009
    (1,2) vs. (3,5)     1    14553.37500    14553.37500      2.93   0.0991
    1 vs. 2             1      768.00000      768.00000      0.15   0.6973
    3 vs. 5             1        0.08333        0.08333      0.00   0.9968

Figure 13.4: A set of orthogonal contrasts.
Often it is of interest to make several (perhaps all possible) paired comparisons among the treatments. Actually, a paired comparison may be viewed as a simple contrast, namely, a test of
H0: μi − μj = 0, H1: μi − μj ̸= 0,
for all i ̸= j. Making all possible paired comparisons among the means can be very beneficial when particular complex contrasts are not known a priori. For example, in the aggregate data of Table 13.1, suppose that we wish to test
H0: μ1 − μ5 = 0, H1: μ1 − μ5 ̸= 0.
The test is developed through use of an F, t, or confidence interval approach. Using t, we have

    t = (ȳ1. − ȳ5.) / (s √(2/n)),

where s is the square root of the mean square error and n = 6 is the sample size per treatment. In this case,

    t = (553.33 − 610.67) / (√4961 √(1/3)) = −1.41.
The P-value for the t-test with 25 degrees of freedom is 0.17. Thus, there is not sufficient evidence to reject H0.

Relationship between T and F
In the foregoing, we displayed the use of a pooled t-test along the lines of that discussed in Chapter 10. The pooled estimate was taken from the mean squared error in order to enjoy the degrees of freedom that are pooled across all five samples. In addition, we have tested a contrast. The reader should note that if the t-value is squared, the result is exactly of the same form as the value of f for a test on a contrast, discussed in the preceding section. In fact,
    f = (ȳ1. − ȳ5.)² / [s²(1/6 + 1/6)] = (553.33 − 610.67)² / [4961(1/3)] = 1.988,

which, of course, is t².
Confidence Interval Approach to a Paired Comparison
It is straightforward to solve the same problem of a paired comparison (or a contrast) using a confidence interval approach. Clearly, if we compute a 100(1 − α)% confidence interval on μ1 − μ5, we have

    ȳ1. − ȳ5. ± t_{α/2} s √(2/6),
where t_{α/2} is the upper 100(1 − α/2)% point of a t-distribution with 25 degrees of freedom (degrees of freedom coming from s²). This straightforward connection between hypothesis testing and confidence intervals should be obvious from discussions in Chapters 9 and 10. The test of the simple contrast μ1 − μ5 involves no more than observing whether or not the confidence interval above covers zero. Substituting the numbers, we have as the 95% confidence interval

    (553.33 − 610.67) ± 2.060 √4961 √(1/3) = −57.34 ± 83.77.
Thus, since the interval covers zero, the contrast is not significant. In other words, we do not find a significant difference between the means of aggregates 1 and 5.
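As a quick check, the t statistic, its square, the P-value, and the 95% interval above can be reproduced in a few lines of Python (a sketch only, not from the text; scipy is assumed):

    import math
    from scipy.stats import t as t_dist

    ybar1, ybar5, s2, n, df = 553.33, 610.67, 4961.0, 6, 25
    se = math.sqrt(2 * s2 / n)                 # standard error of ybar1 - ybar5
    t_value = (ybar1 - ybar5) / se             # ~ -1.41
    p_value = 2 * t_dist.sf(abs(t_value), df)  # ~ 0.17
    half = t_dist.ppf(0.975, df) * se          # ~ 83.77
    print(round(t_value, 2), round(t_value**2, 3), round(p_value, 2))
    print(round(ybar1 - ybar5 - half, 2), round(ybar1 - ybar5 + half, 2))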
Experiment-wise Error Rate
Serious difficulties occur when the analyst attempts to make many or all possible paired comparisons. For the case of k means, there will be, of course, r = k(k − 1)/2 possible paired comparisons. Assuming independent comparisons, the experiment-wise error rate or family error rate (i.e., the probability of false rejection of at least one of the hypotheses) is given by 1 − (1 − α)^r, where α is the selected probability of a type I error for a specific comparison. Clearly, this measure of experiment-wise type I error can be quite large. For example, even if there are only 6 comparisons, say, in the case of 4 means, and α = 0.05, the experiment-wise rate is

    1 − (0.95)⁶ ≈ 0.26.
When many paired comparisons are being tested, there is usually a need to make the effective contrast on a single comparison more conservative. That is, with the confidence interval approach, the confidence intervals would be much wider than the ±t_{α/2} s √(2/n) used for the case where only a single comparison is being made.

Tukey's Test
There are several standard methods for making paired comparisons that sustain the credibility of the type I error rate. We shall discuss and illustrate two of them here. The first one, called Tukey's procedure, allows formation of simultaneous 100(1 − α)% confidence intervals for all paired comparisons. The method is based on the studentized range distribution. The appropriate percentile point is a function of α, k, and v = degrees of freedom for s². A list of upper percentage points for α = 0.05 is shown in Table A.12. The method of paired comparisons by Tukey involves finding a significant difference between means i and j (i ≠ j) if |ȳi. − ȳj.| exceeds q(α, k, v) √(s²/n).

Tukey's procedure is easily illustrated. Consider a hypothetical example where we have 6 treatments in a one-factor completely randomized design, with 5 observations taken per treatment. Suppose that the mean square error taken from the analysis-of-variance table is s² = 2.45 (24 degrees of freedom). The sample means are in ascending order:

    ȳ2.    ȳ5.    ȳ1.    ȳ3.    ȳ6.    ȳ4.
    14.50  16.75  19.84  21.12  22.90  23.20

With α = 0.05, the value of q(0.05, 6, 24) is 4.37. Thus, all absolute differences are to be compared to

    4.37 √(2.45/5) = 3.059.

As a result, the following represent means found to be significantly different using Tukey's procedure:

    4 and 1,  4 and 5,  4 and 2,  6 and 1,  6 and 5,
    6 and 2,  3 and 5,  3 and 2,  1 and 5,  1 and 2.
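The same comparisons can be generated programmatically. The sketch below is an illustration rather than part of the text; it obtains q(0.05, 6, 24) from the studentized range distribution (available in scipy 1.7 or later) and flags the significant pairs:

    import itertools, math
    from scipy.stats import studentized_range

    means = {2: 14.50, 5: 16.75, 1: 19.84, 3: 21.12, 6: 22.90, 4: 23.20}
    s2, n, k, df = 2.45, 5, 6, 24

    q_crit = studentized_range.ppf(0.95, k, df)   # ~ 4.37, as in Table A.12
    threshold = q_crit * math.sqrt(s2 / n)        # ~ 3.06

    for (i, mi), (j, mj) in itertools.combinations(means.items(), 2):
        if abs(mi - mj) > threshold:
            print(f"{i} and {j}: |difference| = {abs(mi - mj):.2f}")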
Where Does the α-Level Come From in Tukey’s Test?
We briefly alluded to the concept of simultaneous confidence intervals being employed for Tukey's procedure. The reader will gain a useful insight into the notion of multiple comparisons if he or she gains an understanding of what is meant by simultaneous confidence intervals.
In Chapter 9, we saw that if we compute a 95% confidence interval on, say, a mean μ, then the probability that the interval covers the true mean μ is 0.95.
However, as we have discussed, for the case of multiple comparisons, the effective probability of interest is tied to the experiment-wise error rate, and it should be emphasized that the confidence intervals of the type ȳi. − ȳj. ± q(α, k, v) s √(1/n) are not independent, since they all involve s and many involve the use of the same averages, the ȳi.. Despite the difficulties, if we use q(0.05, k, v), the simultaneous confidence level is controlled at 95%. The same holds for q(0.01, k, v); namely, the confidence level is controlled at 99%. In the case of α = 0.05, there is a probability of 0.05 that at least one pair of measures will be falsely found to be different (false rejection of at least one null hypothesis). In the α = 0.01 case, the corresponding probability will be 0.01.
Duncan's Test

The second procedure we shall discuss is called Duncan's procedure or Duncan's multiple-range test. This procedure is also based on the general notion of studentized range. The range of any subset of p sample means must exceed a certain value before any of the p means are found to be different. This value is called the least significant range for the p means and is denoted by Rp, where

    Rp = rp √(s²/n).
The values of the quantity rp, called the least significant studentized range, depend on the desired level of significance and the number of degrees of freedom of the mean square error. These values may be obtained from Table A.13 for p = 2,3,…,10 means.
To illustrate the multiple-range test procedure, let us consider the hypothetical example where 6 treatments are compared, with 5 observations per treatment. This is the same example used to illustrate Tukey’s test. We obtain Rp by multiplying each rp by 0.70. The results of these computations are summarized as follows:
    p      2      3      4      5      6
    rp   2.919  3.066  3.160  3.226  3.276
    Rp   2.043  2.146  2.212  2.258  2.293
Comparing these least significant ranges with the differences in ordered means, we
arrive at the following conclusions:
1. Since ȳ4. − ȳ2. = 8.70 > R6 = 2.293, we conclude that μ4 and μ2 are significantly different.
2. Comparing ȳ4. − ȳ5. and ȳ6. − ȳ2. with R5, we conclude that μ4 is significantly greater than μ5 and μ6 is significantly greater than μ2.
3. Comparing ȳ4. − ȳ1., ȳ6. − ȳ5., and ȳ3. − ȳ2. with R4, we conclude that each difference is significant.
4. Comparing ȳ4. − ȳ3., ȳ6. − ȳ1., ȳ3. − ȳ5., and ȳ1. − ȳ2. with R3, we find all differences significant except for μ4 − μ3. Therefore, μ3, μ4, and μ6 constitute a subset of homogeneous means.
5. Comparing ȳ3. − ȳ1., ȳ1. − ȳ5., and ȳ5. − ȳ2. with R2, we conclude that only μ3 and μ1 are not significantly different.
It is customary to summarize the conclusions above by drawing a line under any subsets of adjacent means that are not significantly different. Thus, we have

    ȳ2.    ȳ5.    ȳ1.    ȳ3.    ȳ6.    ȳ4.
    14.50  16.75  19.84  21.12  22.90  23.20
                  ------------
                         --------------------

It is clear that in this case the results from Tukey's and Duncan's procedures are very similar. Tukey's procedure did not detect a difference between 2 and 5, whereas Duncan's did.
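A rough Python sketch of this summary step follows. It is only an illustration: the r_p values come from Table A.13 and are entered by hand, and the scan simply reports every set of p adjacent ordered means whose range fails to exceed R_p (the subsets one would underline); a full Duncan procedure applies these range tests sequentially.

    import numpy as np

    means = np.array([14.50, 16.75, 19.84, 21.12, 22.90, 23.20])  # ascending order
    labels = [2, 5, 1, 3, 6, 4]
    R = {2: 2.043, 3: 2.146, 4: 2.212, 5: 2.258, 6: 2.293}        # R_p = r_p * sqrt(s^2/n)

    for p in range(2, 7):
        for start in range(0, 7 - p):
            if means[start + p - 1] - means[start] <= R[p]:
                print("homogeneous subset:", labels[start:start + p])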
Dunnett’s Test: Comparing Treatment with a Control
In many scientific and engineering problems, one is not interested in drawing inferences regarding all possible comparisons among the treatment means of the type μi − μj. Rather, the experiment often dictates the need to simultaneously compare each treatment with a control. A test procedure developed by C. W. Dunnett determines significant differences between each treatment mean and the control, at a single joint significance level α. To illustrate Dunnett's procedure, let us consider the experimental data of Table 13.6 for a one-way classification where the effect of three catalysts on the yield of a reaction is being studied. A fourth treatment, no catalyst, is used as a control.
Table 13.6: Yield of Reaction

    Control      Catalyst 1   Catalyst 2   Catalyst 3
     50.7           54.1         52.7         51.2
     51.5           53.8         53.9         50.8
     49.2           53.1         57.0         49.7
     53.1           52.5         54.1         48.0
     52.7           54.0         52.5         47.2
    ȳ0. = 51.44   ȳ1. = 53.50   ȳ2. = 54.04   ȳ3. = 49.38

In general, we wish to test the k hypotheses

    H0: μ0 = μi,
    H1: μ0 ≠ μi,       i = 1, 2, ..., k,

where μ0 represents the mean yield for the population of measurements in which the control is used. The usual analysis-of-variance assumptions, as outlined in Section 13.3, are expected to remain valid. To test the null hypotheses specified by H0 against two-sided alternatives for an experimental situation in which there are k treatments, excluding the control, and n observations per treatment, we first calculate the values

    di = (ȳi. − ȳ0.) / √(2s²/n),   i = 1, 2, ..., k.

The sample variance s² is obtained, as before, from the mean square error in the analysis of variance. Now, the critical region for rejecting H0, at the α-level of significance, is established by the inequality

    |di| > d_{α/2}(k, v),

where v is the number of degrees of freedom for the mean square error. The values of the quantity d_{α/2}(k, v) for a two-tailed test are given in Table A.14 for α = 0.05 and α = 0.01 for various values of k and v.
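The d_i computations are simple enough to script. A minimal Python sketch (not from the text), using the catalyst data of Table 13.6, the mean square error s² = 2.30075 computed in Example 13.5 below, and the critical value d_{0.025}(3, 16) = 2.59 from Table A.14:

    import math

    control_mean = 51.44
    treatment_means = [53.50, 54.04, 49.38]   # catalysts 1, 2, 3
    s2, n, d_crit = 2.30075, 5, 2.59          # MSE, obs. per treatment, d_{0.025}(3, 16)

    denom = math.sqrt(2 * s2 / n)             # ~ 0.9593
    for i, m in enumerate(treatment_means, start=1):
        d = (m - control_mean) / denom
        print(i, round(d, 3), "significant" if abs(d) > d_crit else "not significant")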
Example 13.5: For the data of Table 13.6, test hypotheses comparing each catalyst with the control, using two-sided alternatives. Choose α = 0.05 as the joint significance level.

Solution: The mean square error with 16 degrees of freedom is obtained from the analysis-of-variance table, using all k + 1 treatments. The mean square error is given by

    s² = 36.812/16 = 2.30075   and   √(2s²/n) = √((2)(2.30075)/5) = 0.9593.

Hence,

    d1 = (53.50 − 51.44)/0.9593 = 2.147,
    d2 = (54.04 − 51.44)/0.9593 = 2.710,
    d3 = (49.38 − 51.44)/0.9593 = −2.147.
From Table A.14 the critical value for α = 0.05 is found to be d0.025(3, 16) = 2.59. Since |d1| < 2.59 and |d3| < 2.59, we conclude that only the mean yield for catalyst 2 is significantly different from the mean yield of the reaction using the control.

Many practical applications dictate the need for a one-tailed test for comparing treatments with a control. Certainly, when a pharmacologist is concerned with the effect of various dosages of a drug on cholesterol level and his control is zero dosage, it is of interest to determine if each dosage produces a significantly larger reduction than the control. Table A.15 shows the critical values of dα(k, v) for one-sided alternatives.

Exercises

13.12 Consider the data of Review Exercise 13.45 on page 555. Make significance tests on the following contrasts:
(a) B versus A, C, and D;
(b) C versus A and D;
(c) A versus D.

13.13 The purpose of the study The Incorporation of a Chelating Agent into a Flame Retardant Finish of a Cotton Flannelette and the Evaluation of Selected Fabric Properties, conducted at Virginia Tech, was to evaluate the use of a chelating agent as part of the flame-retardant finish of cotton flannelette by determining its effects upon flammability after the fabric is laundered under specific conditions. Two baths were prepared, one with carboxymethyl cellulose and one without. Twelve pieces of fabric were laundered 5 times in bath I, and 12 other pieces of fabric were laundered 10 times in bath I. This was repeated using 24 additional pieces of cloth in bath II. After the washings the lengths of fabric that burned and the burn times were measured. For convenience, let us define the following treatments:

Treatment 1: 5 launderings in bath I,
Treatment 2: 5 launderings in bath II,
Treatment 3: 10 launderings in bath I,
Treatment 4: 10 launderings in bath II.

Burn times, in seconds, were recorded as follows:

    Treatment
      1      2      3      4
    13.7    6.2   27.2   18.2
    23.0    5.4   16.8    8.8
    15.7    5.0   12.9   14.5
    25.5    4.4   14.9   14.7
    15.8    5.0   17.1   17.1
    14.8    3.3   13.0   13.9
    14.0   16.0   10.8   10.6
    29.4    2.5   13.5    5.8
     9.7    1.6   25.5    7.3
    14.0    3.9   14.2   17.7
    12.3    2.5   27.4   18.3
    12.3    7.1   11.5    9.9

(a) Perform an analysis of variance, using a 0.01 level of significance, and determine whether there are any significant differences among the treatment means.
(b) Use single-degree-of-freedom contrasts with α = 0.01 to compare the mean burn time of treatment 1 versus treatment 2 and also treatment 3 versus treatment 4.

13.14 The study Loss of Nitrogen Through Sweat by Preadolescent Boys Consuming Three Levels of Dietary Protein was conducted by the Department of Human Nutrition and Foods at Virginia Tech to determine perspiration nitrogen loss at various dietary protein levels. Twelve preadolescent boys ranging in age from 7 years, 8 months to 9 years, 8 months, all judged to be clinically healthy, were used in the experiment. Each boy was subjected to one of three controlled diets in which 29, 54, or 84 grams of protein were consumed per day. The following data represent the body perspiration nitrogen loss, in milligrams, during the last two days of the experimental period:

    Protein Level
    29 Grams: 190  266  270  438
    54 Grams: 318  295  271  402
    84 Grams: 390  321  396  399

(a) Perform an analysis of variance at the 0.05 level of significance to show that the mean perspiration nitrogen losses at the three protein levels are different.
(b) Use Tukey's test to determine which protein levels are significantly different from each other in mean nitrogen loss.

13.15 Use Tukey's test, with a 0.05 level of significance, to analyze the means of the five different brands of headache tablets in Exercise 13.2 on page 518.

13.16 An investigation was conducted to determine the source of reduction in yield of a certain chemical product. It was known that the loss in yield occurred in the mother liquor, that is, the material removed at the filtration stage. It was thought that different blends of the original material might result in different yield reductions at the mother liquor stage. The following are the percent reductions for 3 batches at each of 4 preselected blends:

    Blend
      1     2     3     4
    25.6  25.2  20.8  31.6
    24.3  28.6  26.7  29.8
    27.9  24.7  22.2  34.3

(a) Perform the analysis of variance at the α = 0.05 level of significance.
(b) Use Duncan's multiple-range test to determine which blends differ.
(c) Do part (b) using Tukey's test.

13.17 In the study An Evaluation of the Removal Method for Estimating Benthic Populations and Diversity, conducted by Virginia Tech on the Jackson River, 5 different sampling procedures were used to determine the species counts. Twenty samples were selected at random, and each of the 5 sampling procedures was repeated 4 times. The species counts were recorded as follows:

    Sampling Procedure
    Depletion   Modified Hess   Surber   Removal Kicknet   Kicknet
       85            75           31           43            17
       55            45           20           21            10
       40            35            9           15             8
       77            67           37           27            15

(a) Is there a significant difference in the average species counts for the different sampling procedures? Use a P-value in your conclusion.
(b) Use Tukey's test with α = 0.05 to find which sampling procedures differ.

13.18 The following data are values of pressure (psi) in a torsion spring for several settings of the angle between the legs of the spring in a free position:

    Angle (°)
    67:  83  85
    71:  84  85  85  86  86  87
    75:  86  87  87  87  88  88  88  88  88  89  90
    79:  89  90  90  91
    83:  90  92

Compute a one-way analysis of variance for this experiment and state your conclusion concerning the effect of angle on the pressure in the spring. (From C. R. Hicks, Fundamental Concepts in the Design of Experiments, Holt, Rinehart and Winston, New York, 1973.)

13.19 It is suspected that the environmental temperature at which batteries are activated affects their life. Thirty homogeneous batteries were tested, six at each of five temperatures, and the data are shown below (activated life in seconds). Analyze and interpret the data. (From C. R. Hicks, Fundamental Concepts in Design of Experiments, Holt, Rinehart and Winston, New York, 1973.)

    Temperature (°C)
      0    25    50    75   100
     55    60    70    72    65
     55    61    72    72    66
     57    60    72    72    60
     54    60    68    70    64
     54    60    77    68    65
     56    60    77    69    65

13.20 The following table (from A. Hald, Statistical Theory with Engineering Applications, John Wiley & Sons, New York, 1952) gives tensile strengths (in deviations from 340) for wires taken from nine cables to be used for a high-voltage network. Each cable is made from 12 wires. We want to know whether the mean strengths of the wires in the nine cables are the same. If the cables are different, which ones differ? Use a P-value in your analysis of variance.

    Cable   Tensile Strength
      1       5  −13   −5   −2  −10   −6   −5    0   −3    2   −7   −5
      2     −11  −13   −8    8   −3  −12  −12  −10    5   −6  −12  −10
      3       0  −10  −15  −12   −2   −8   −5    0   −4   −1   −5  −11
      4     −12    4    2   10   −5   −8  −12    0   −5   −3   −3    0
      5       7    1    5    0   10    6    5    2    0   −1  −10   −2
      6       1    0   −5   −4   −1    0    2    5    1   −2    6    7
      7      −1    0    2    1   −4    2    7    5    1    0   −4    2
      8      −1    0    7    5   10    8    1    2   −3    6    0    5
      9       2    6    7    8   15   11   −7    7   10    7    8    1

13.21 The SAS printout in Figure 13.5 on page 532 gives information on Duncan's test, using PROC GLM in SAS, for the aggregate data in Example 13.1. Give conclusions regarding paired comparisons using Duncan's test results.

13.22 Do Duncan's test for paired comparisons for the data of Exercise 13.6 on page 519. Discuss the results.

13.23 In a biological experiment, four concentrations of a certain chemical are used to enhance the growth of a certain type of plant over time. Five plants are used at each concentration, and the growth in each plant is measured in centimeters. The following growth data are taken. A control (no chemical) is also applied.

    Concentration
    Control    1     2     3     4
      6.8     8.2   7.3   8.7   6.3
      9.4     6.9   9.2   7.1   8.6
      7.7     6.9   5.9   8.4   5.8
      6.1     8.6   7.2   6.9   8.1
      6.8     5.7   8.0   7.4   6.1

Use Dunnett's two-sided test at the 0.05 level of significance to simultaneously compare the concentrations with the control.

13.24 The financial structure of a firm refers to the way the firm's assets are divided into equity and debt, and the financial leverage refers to the percentage of assets financed by debt. In the paper The Effect of Financial Leverage on Return, Tai Ma of Virginia Tech claims that financial leverage can be used to increase the rate of return on equity. To say it another way, stockholders can receive higher returns on equity with the same amount of investment through the use of financial leverage. The following data show the rates of return on equity using 3 different levels of financial leverage and a control level (zero debt) for 24 randomly selected firms:

    Financial Leverage
    Control   Low   Medium   High
      2.1     6.2    9.6     10.3
      5.6     4.0    8.0      6.9
      3.0     8.4    5.5      7.8
      7.8     2.8   12.6      5.8
      5.2     4.2    7.0      7.2
      2.6     5.0    7.8     12.0

    Source: Standard & Poor's Machinery Industry Survey, 1975.

(a) Perform the analysis of variance at the 0.05 level of significance.
(b) Use Dunnett's test at the 0.01 level of significance to determine whether the mean rates of return on equity are higher at the low, medium, and high levels of financial leverage than at the control level.

                             The GLM Procedure

    Duncan's Multiple Range Test for moisture

    NOTE: This test controls the Type I comparisonwise error rate, not the
    experimentwise error rate.

    Alpha                        0.05
    Error Degrees of Freedom       25
    Error Mean Square        4960.813

    Number of Means       2       3       4       5
    Critical Range    83.75   87.97   90.69   92.61

    Means with the same letter are not significantly different.

    Duncan Grouping      Mean   N   aggregate
                 A     610.67   6   5
                 A     610.50   6   3
                 A     569.33   6   2
                 A     553.33   6   1
                 B     465.17   6   4

Figure 13.5: SAS printout for Exercise 13.21.

13.7 Comparing a Set of Treatments in Blocks

In Section 13.2, we discussed the idea of blocking, that is, isolating sets of experimental units that are reasonably homogeneous and randomly assigning treatments to these units. This is an extension of the "pairing" concept discussed in Chapters 9 and 10, and it is done to reduce experimental error, since the units in a block have more common characteristics than units in different blocks.

The reader should not view blocks as a second factor, although this is a tempting way of visualizing the design. In fact, the main factor (treatments) still carries the major thrust of the experiment. Experimental units are still the source of error, just as in the completely randomized design. We merely treat sets of these units more systematically when blocking is accomplished. In this way, we say there are restrictions in randomization.
Before we turn to a discussion of blocking, let us look at two examples of a completely randomized design. The first example is a chemical experiment designed to determine if there is a difference in mean reaction yield among four catalysts. Samples of materials to be tested are drawn from the same batches of raw materials, while other conditions, such as temperature and concentration of reactants, are held constant. In this case, the time of day for the experimental runs might represent the experimental units, and if the experimenter believed that there could possibly be a slight time effect, he or she would randomize the assignment of the catalysts to the runs to counteract the possible trend. As a second example of such a design, consider an experiment to compare four methods of measuring a particular physical property of a fluid substance. Suppose the sampling process is destructive; that is, once a sample of the substance has been measured by one method, it cannot be measured again by any of the other methods. If it is decided that five measurements are to be taken for each method, then 20 samples of the material are selected from a large batch at random and are used in the experiment to compare the four measuring methods. The experimental units are the randomly selected samples. Any variation from sample to sample will appear in the error variation, as measured by s² in the analysis.

What Is the Purpose of Blocking?

If the variation due to heterogeneity in experimental units is so large that the sensitivity with which treatment differences are detected is reduced due to an inflated value of s², a better plan might be to "block off" variation due to these units and thus reduce the extraneous variation to that accounted for by smaller or more homogeneous blocks. For example, suppose that in the previous catalyst illustration it is known a priori that there definitely is a significant day-to-day effect on the yield and that we can measure the yield for four catalysts on a given day. Rather than assign the four catalysts to the 20 test runs completely at random, we choose, say, five days and run each of the four catalysts on each day, randomly assigning the catalysts to the runs within days. In this way, the day-to-day variation is removed from the analysis, and consequently the experimental error, which still includes any time trend within days, more accurately represents chance variation. Each day is referred to as a block.

The most straightforward of the randomized block designs is one in which we randomly assign each treatment once to every block. Such an experimental layout is called a randomized complete block (RCB) design, each block constituting a single replication of the treatments.

13.8 Randomized Complete Block Designs

A typical layout for the randomized complete block design using 3 measurements in 4 blocks is as follows:

    Block 1   Block 2   Block 3   Block 4
      t2        t1        t3        t2
      t1        t3        t2        t1
      t3        t2        t1        t3

The t's denote the assignment to blocks of each of the 3 treatments. Of course, the true allocation of treatments to units within blocks is done at random.
Once the experiment has been completed, the data can be recorded in the following 3 × 4 array:

                Block:   1     2     3     4
    Treatment 1         y11   y12   y13   y14
    Treatment 2         y21   y22   y23   y24
    Treatment 3         y31   y32   y33   y34

where y11 represents the response obtained by using treatment 1 in block 1, y12 represents the response obtained by using treatment 1 in block 2, ..., and y34 represents the response obtained by using treatment 3 in block 4.

Let us now generalize and consider the case of k treatments assigned to b blocks. The data may be summarized as shown in the k × b rectangular array of Table 13.7. It will be assumed that the yij, i = 1, 2, ..., k and j = 1, 2, ..., b, are values of independent random variables having normal distributions with means μij and common variance σ².

Table 13.7: k × b Array for the RCB Design

                            Block
    Treatment    1     2    ···    j    ···    b      Total   Mean
        1       y11   y12   ···   y1j   ···   y1b      T1.    ȳ1.
        2       y21   y22   ···   y2j   ···   y2b      T2.    ȳ2.
        .        .     .           .           .        .      .
        i       yi1   yi2   ···   yij   ···   yib      Ti.    ȳi.
        .        .     .           .           .        .      .
        k       yk1   yk2   ···   ykj   ···   ykb      Tk.    ȳk.
    Total       T.1   T.2   ···   T.j   ···   T.b      T..
    Mean        ȳ.1   ȳ.2   ···   ȳ.j   ···   ȳ.b             ȳ..

Let μi. represent the average (rather than the total) of the b population means for the ith treatment. That is,

    μi. = (1/b) Σ_{j=1}^{b} μij,   for i = 1, ..., k.

Similarly, the average of the population means for the jth block, μ.j, is defined by

    μ.j = (1/k) Σ_{i=1}^{k} μij,   for j = 1, ..., b,

and the average of the bk population means, μ, is defined by

    μ = (1/bk) Σ_{i=1}^{k} Σ_{j=1}^{b} μij.

To determine if part of the variation in our observations is due to differences among the treatments, we consider the following test:

Hypothesis of Equal Treatment Means:

    H0: μ1. = μ2. = ··· = μk. = μ,
    H1: The μi. are not all equal.

Model for the RCB Design: Each observation may be written in the form

    yij = μij + εij,

where εij measures the deviation of the observed value yij from the population mean μij. The preferred form of this equation is obtained by substituting

    μij = μ + αi + βj,

where αi is, as before, the effect of the ith treatment and βj is the effect of the jth block. It is assumed that the treatment and block effects are additive. Hence, we may write

    yij = μ + αi + βj + εij.

Notice that the model resembles that of the one-way classification, the essential difference being the introduction of the block effect βj. The basic concept is much like that of the one-way classification except that we must account in the analysis for the additional effect due to blocks, since we are now systematically controlling variation in two directions. If we now impose the restrictions that

    Σ_{i=1}^{k} αi = 0   and   Σ_{j=1}^{b} βj = 0,

then

    μi. = (1/b) Σ_{j=1}^{b} (μ + αi + βj) = μ + αi,   for i = 1, ..., k,

and

    μ.j = (1/k) Σ_{i=1}^{k} (μ + αi + βj) = μ + βj,   for j = 1, ..., b.

The null hypothesis that the k treatment means μi. are equal, and therefore equal to μ, is now equivalent to testing the hypothesis

    H0: α1 = α2 = ··· = αk = 0,
    H1: At least one of the αi is not equal to zero.

Each of the tests on treatments will be based on a comparison of independent estimates of the common population variance σ². These estimates will be obtained by splitting the total sum of squares of our data into three components by means of the following identity.

Theorem 13.3 (Sum-of-Squares Identity):

    Σ_{i=1}^{k} Σ_{j=1}^{b} (yij − ȳ..)² = b Σ_{i=1}^{k} (ȳi. − ȳ..)² + k Σ_{j=1}^{b} (ȳ.j − ȳ..)² + Σ_{i=1}^{k} Σ_{j=1}^{b} (yij − ȳi. − ȳ.j + ȳ..)².

The proof is left to the reader.
The sum-of-squares identity may be presented symbolically by the equation

    SST = SSA + SSB + SSE,

where

    SST = Σ_{i=1}^{k} Σ_{j=1}^{b} (yij − ȳ..)²              = total sum of squares,
    SSA = b Σ_{i=1}^{k} (ȳi. − ȳ..)²                        = treatment sum of squares,
    SSB = k Σ_{j=1}^{b} (ȳ.j − ȳ..)²                        = block sum of squares,
    SSE = Σ_{i=1}^{k} Σ_{j=1}^{b} (yij − ȳi. − ȳ.j + ȳ..)²  = error sum of squares.

Following the procedure outlined in Theorem 13.2, where we interpreted the sums of squares as functions of the independent random variables Y11, Y12, ..., Ykb, we can show that the expected values of the treatment, block, and error sums of squares are given by

    E(SSA) = (k − 1)σ² + b Σ_{i=1}^{k} αi²,
    E(SSB) = (b − 1)σ² + k Σ_{j=1}^{b} βj²,
    E(SSE) = (b − 1)(k − 1)σ².

As in the case of the one-factor problem, we have the treatment mean square

    s1² = SSA/(k − 1).

If the treatment effects α1 = α2 = ··· = αk = 0, s1² is an unbiased estimate of σ². However, if the treatment effects are not all zero, we have the following:

Expected Treatment Mean Square:

    E[SSA/(k − 1)] = σ² + (b/(k − 1)) Σ_{i=1}^{k} αi².

In this case, s1² overestimates σ². A second estimate of σ², based on b − 1 degrees of freedom, is

    s2² = SSB/(b − 1).

The estimate s2² is an unbiased estimate of σ² if the block effects β1 = β2 = ··· = βb = 0. If the block effects are not all zero, then

    E[SSB/(b − 1)] = σ² + (k/(b − 1)) Σ_{j=1}^{b} βj²,

and s2² will overestimate σ². A third estimate of σ², based on (k − 1)(b − 1) degrees of freedom and independent of s1² and s2², is

    s² = SSE/((k − 1)(b − 1)),

which is unbiased regardless of the truth or falsity of either null hypothesis. To test the null hypothesis that the treatment effects are all equal to zero, we compute the ratio f1 = s1²/s², which is a value of the random variable F1 having an F-distribution with k − 1 and (k − 1)(b − 1) degrees of freedom when the null hypothesis is true. The null hypothesis is rejected at the α-level of significance when

    f1 > fα[k − 1, (k − 1)(b − 1)].
In practice, we first compute SST, SSA, and SSB and then, using the sum-of-squares identity, obtain SSE by subtraction. The degrees of freedom associated with SSE are also usually obtained by subtraction; that is,
(k − 1)(b − 1) = kb − 1 − (k − 1) − (b − 1).
The computations in an analysis-of-variance problem for a randomized complete
block design may be summarized as shown in Table 13.8.
Example 13.6: Four different machines, M1, M2, M3, and M4, are being considered for the assembling of a particular product. It was decided that six different operators would be used in a randomized block experiment to compare the machines. The machines were assigned in a random order to each operator. The operation of the machines requires physical dexterity, and it was anticipated that there would be a difference among the operators in the speed with which they operated the machines. The amounts of time (in seconds) required to assemble the product are shown in Table 13.9.
Test the hypothesis H0, at the 0.05 level of significance, that the machines perform at the same mean rate of speed.
Table 13.8: Analysis of Variance for the Randomized Complete Block Design

Source of    Sum of    Degrees of        Mean Square                 Computed f
Variation    Squares   Freedom
Treatments   SSA       k − 1             s1² = SSA/(k − 1)           f1 = s1²/s²
Blocks       SSB       b − 1             s2² = SSB/(b − 1)
Error        SSE       (k − 1)(b − 1)    s² = SSE/[(k − 1)(b − 1)]
Total        SST       kb − 1
Table 13.9: Time, in Seconds, to Assemble Product

                          Operator
    Machine     1      2      3      4      5      6     Total
    1         42.5   39.3   39.6   39.9   42.9   43.6    247.8
    2         39.8   40.1   40.5   42.3   42.5   43.1    248.3
    3         40.2   40.5   41.3   43.4   44.9   45.1    255.4
    4         41.3   42.2   43.5   44.2   45.9   42.3    259.4
    Total    163.8  162.1  164.9  169.8  176.2  174.1   1010.9

Solution: The hypotheses are
H0: α1 = α2 = α3 = α4 = 0 (machine effects are zero), H1: At least one of the αi is not equal to zero.
The sum-of-squares formulas shown on page 536 and the degrees of freedom are used to produce the analysis of variance in Table 13.10. The value f = 3.34 is significant at P = 0.048. If we use α = 0.05 as at least an approximate yardstick, we conclude that the machines do not perform at the same mean rate of speed.
Table 13.10: Analysis of Variance for the Data of Table 13.9

Source of    Sum of    Degrees of    Mean     Computed
Variation    Squares   Freedom       Square   f
Machines      15.93        3          5.31      3.34
Operators     42.09        5          8.42
Error         23.84       15          1.59
Total         81.86       23
Further Comments Concerning Blocking
In Chapter 10, we presented a procedure for comparing means when the observations were paired. The procedure involved "subtracting out" the effect due to the
homogeneous pair and thus working with differences. This is a special case of a randomized complete block design with k = 2 treatments. The n homogeneous units to which the treatments were assigned take on the role of blocks.
If there is heterogeneity in the experimental units, the experimenter should not be misled into believing that it is always advantageous to reduce the experimental error through the use of small homogeneous blocks. Indeed, there may be instances where it would not be desirable to block. The purpose in reducing the error variance is to increase the sensitivity of the test for detecting differences in the treatment means. This is reflected in the power of the test procedure. (The power of the analysis-of-variance test procedure is discussed more extensively in Section 13.11.) The power to detect certain differences among the treatment means increases with a decrease in the error variance. However, the power is also affected by the degrees of freedom with which this variance is estimated, and blocking reduces the degrees of freedom that are available from k(b − 1) for the one-way classification to (k − 1)(b−1). So one could lose power by blocking if there is not a significant reduction in the error variance.
Interaction between Blocks and Treatments
Another important assumption that is implicit in writing the model for a randomized complete block design is that the treatment and block effects are additive. This is equivalent to stating that
    μij − μij′ = μi′j − μi′j′   or   μij − μi′j = μij′ − μi′j′,
for every value of i, i′, j, and j′. That is, the difference between the population means for blocks j and j′ is the same for every treatment and the difference between the population means for treatments i and i′ is the same for every block. The parallel lines of Figure 13.6(a) illustrate a set of mean responses for which the treatment and block effects are additive, whereas the intersecting lines of Figure 13.6(b) show a situation in which treatment and block effects are said to interact. Referring to Example 13.6, if operator 3 is 0.5 second faster on the average than operator 2 when machine 1 is used, then operator 3 will still be 0.5 second faster on the average than operator 2 when machine 2, 3, or 4 is used. In many experiments, the assumption of additivity does not hold and the analysis described in this section leads to erroneous conclusions. Suppose, for instance, that operator 3 is 0.5 second faster on the average than operator 2 when machine 1 is used but is 0.2 second slower on the average than operator 2 when machine 2 is used. The operators and machines are now interacting.
An inspection of Table 13.9 suggests the possible presence of interaction. This apparent interaction may be real or it may be due to experimental error. The analysis of Example 13.6 was based on the assumption that the apparent interaction was due entirely to experimental error. If the total variability of our data was in part due to an interaction effect, this source of variation remained a part of the error sum of squares, causing the mean square error to overestimate σ2 and thereby increasing the probability of committing a type II error. We have, in fact, assumed an incorrect model. If we let (αβ)ij denote the interaction effect of the ith treatment and the jth block, we can write a more appropriate model in the
form

    yij = μ + αi + βj + (αβ)ij + εij,

on which we impose the additional restrictions

    Σ_{i=1}^{k} (αβ)ij = 0, for j = 1, ..., b,   and   Σ_{j=1}^{b} (αβ)ij = 0, for i = 1, ..., k.

We can now readily verify that

    E[SSE/((b − 1)(k − 1))] = σ² + (1/((b − 1)(k − 1))) Σ_{i=1}^{k} Σ_{j=1}^{b} (αβ)ij².

Figure 13.6: Population means for (a) additive results and (b) interacting effects.

Thus, the mean square error is seen to be a biased estimate of σ² when existing interaction has been ignored. It would seem necessary at this point to arrive at a procedure for the detection of interaction for cases where there is suspicion that it exists. Such a procedure requires the availability of an unbiased and independent estimate of σ². Unfortunately, the randomized block design does not lend itself to such a test unless the experimental setup is altered. This subject is discussed extensively in Chapter 14.
13.9 Graphical Methods and Model Checking
In several chapters, we make reference to graphical procedures displaying data and analytical results. In early chapters, we used stem-and-leaf and box-and-whisker plots as visuals to aid in summarizing samples. We used similar diagnostics to better understand the data in two sample problems in Chapter 10. In Chapter 11 we introduced the notion of residual plots to detect violations of standard assumptions. In recent years, much attention in data analysis has centered on graphical
methods. Like regression, analysis of variance lends itself to graphics that aid in summarizing data as well as detecting violations. For example, a simple plotting of the raw observations around each treatment mean can give the analyst a feel for variability between sample means and within samples. Figure 13.7 depicts such a plot for the aggregate data of Table 13.1. From the appearance of the plot one may even gain a graphical insight into which aggregates (if any) stand out from the others. It is clear that aggregate 4 stands out from the others. Aggregates 3 and 5 certainly form a homogeneous group, as do aggregates 1 and 2.

Figure 13.7: Plot of data around the mean for the aggregate data of Table 13.1.

Figure 13.8: Plot of residuals for five aggregates, using data in Table 13.1.
As in the case of regression, residuals can be helpful in analysis of variance in providing a diagnostic that may detect violations of assumptions. To form the residuals, we merely need to consider the model of the one-factor problem, namely
    yij = μi + εij.
It is straightforward to determine that the estimate of μi is ȳi.. Hence, the ijth residual is yij − ȳi.. This is easily extendable to the randomized complete block model. It may be instructive to have the residuals plotted for each aggregate in order to gain some insight regarding the homogeneous variance assumption. This plot is shown in Figure 13.8.
Trends in plots such as these may reveal difficulties in some situations, particularly when the violation of a particular assumption is graphic. In the case of Figure 13.8, the residuals seem to indicate that the within-treatment variances are reasonably homogeneous apart from aggregate 1. There is some graphical evidence that the variance for aggregate 1 is larger than the rest.
What Is a Residual for an RCB Design?
The randomized complete block design is another experimental situation in which graphical displays can make the analyst feel comfortable with an “ideal picture” or
perhaps highlight difficulties. Recall that the model for the randomized complete block design is
    yij = μ + αi + βj + εij,   i = 1, ..., k,   j = 1, ..., b,

with the imposed constraints

    Σ_{i=1}^{k} αi = 0,   Σ_{j=1}^{b} βj = 0.
To determine what indeed constitutes a residual, consider that

    αi = μi. − μ,   βj = μ.j − μ,

and that μ is estimated by ȳ.., μi. is estimated by ȳi., and μ.j is estimated by ȳ.j. As a result, the predicted or fitted value ŷij is given by

    ŷij = μ̂ + α̂i + β̂j = ȳi. + ȳ.j − ȳ..,

and thus the residual at the (i, j) observation is given by

    yij − ŷij = yij − ȳi. − ȳ.j + ȳ..
Note that ŷij, the fitted value, is an estimate of the mean μij. This is consistent with the partitioning of variability given in Theorem 13.3, where the error sum of squares is

    SSE = Σ_{i=1}^{k} Σ_{j=1}^{b} (yij − ȳi. − ȳ.j + ȳ..)².

The visual displays in the randomized complete block design involve plotting the residuals separately for each treatment and for each block. The analyst should expect roughly equal variability if the homogeneous variance assumption holds. The reader should recall that in Chapter 12 we discussed plotting residuals for the purpose of detecting model misspecification. In the case of the randomized complete block design, the serious model misspecification may be related to our assumption of additivity (i.e., no interaction). If no interaction is present, a random pattern should appear.

Consider the data of Example 13.6, in which treatments are four machines and blocks are six operators. Figures 13.9 and 13.10 give the residual plots for separate treatments and separate blocks. Figure 13.11 shows a plot of the residuals against the fitted values. Figure 13.9 reveals that the error variance may not be the same for all machines. The same may be true for error variance for each of the six operators. However, two unusually large residuals appear to produce the apparent difficulty. Figure 13.11 is a plot of residuals that shows reasonable evidence of random behavior. However, the two large residuals displayed earlier still stand out.
Figure 13.9: Residual plot for the four machines for the data of Example 13.6.

Figure 13.10: Residual plot for the six operators for the data of Example 13.6.

Figure 13.11: Residuals plotted against fitted values for the data of Example 13.6.
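The fitted values and residuals plotted in these figures follow directly from the formula ŷij = ȳi. + ȳ.j − ȳ... A short Python sketch (numpy assumed; an illustration, not the text's code):

    import numpy as np

    y = np.array([                             # Table 13.9 again
        [42.5, 39.3, 39.6, 39.9, 42.9, 43.6],
        [39.8, 40.1, 40.5, 42.3, 42.5, 43.1],
        [40.2, 40.5, 41.3, 43.4, 44.9, 45.1],
        [41.3, 42.2, 43.5, 44.2, 45.9, 42.3],
    ])
    fitted = (y.mean(axis=1, keepdims=True)    # ybar_i.
              + y.mean(axis=0, keepdims=True)  # ybar_.j
              - y.mean())                      # ybar_..
    residuals = y - fitted
    print(residuals.round(2))
    print(round((residuals ** 2).sum(), 2))    # reproduces SSE ~ 23.84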
13.10 Data Transformations in Analysis of Variance
In Chapter 11, considerable attention was given to transformation of the response y in situations where a linear regression model was being fit to a set of data. Obviously, the same concept applies to multiple linear regression, though it was not discussed in Chapter 12. In the regression modeling discussion, emphasis was placed on the transformations of y that would produce a model that fit the data better than the model in which y enters linearly. For example, if the “time” structure is exponential in nature, then a log transformation on y linearizes the
structure and thus more success is anticipated when one uses the transformed response.
While the primary purpose for data transformation discussed thus far has been to improve the fit of the model, there are certainly other reasons to transform or reexpress the response y, and many of them are related to assumptions that are being made (i.e., assumptions on which the validity of the analysis depends). One very important assumption in analysis of variance is the homogeneous variance assumption discussed early in Section 13.4. We assume a common variance σ2. If the variance differs a great deal from treatment to treatment and we perform the standard ANOVA discussed in this chapter (and future chapters), the results can be substantially flawed. In other words, the analysis of variance is not robust to the assumption of homogeneous variance. As we have discussed thus far, this is the centerpiece of motivation for the residual plots discussed in the previous section and illustrated in Figures 13.9, 13.10, and 13.11. These plots allow us to detect nonhomogeneous variance problems. However, what do we do about them? How can we accommodate them?
Where Does Nonhomogeneous Variance Come From?
Often, but not always, nonhomogeneous variance in ANOVA is present because of the distribution of the responses. Now, of course we assume normality in the response. But there certainly are situations in which tests on means are needed even though the distribution of the response is one of the nonnormal distributions discussed in Chapters 5 and 6, such as Poisson, lognormal, exponential, or gamma. ANOVA-type problems certainly exist with count data, time to failure data, and so on.
We demonstrated in Chapters 5 and 6 that, apart from the normal case, the variance of a distribution will often be a function of the mean, say σi² = g(μi). For example, in the Poisson case Var(Yi) = μi = σi² (i.e., the variance is equal to the mean). In the case of the exponential distribution, Var(Yi) = σi² = μi² (i.e., the variance is equal to the square of the mean). For the case of the lognormal, a log transformation produces a normal distribution with constant variance σ².
The same concepts that we used in Chapter 4 to determine the variance of a nonlinear function can be used as an aid to determine the nature of the variance stabilizing transformation g(y). Recall the first order Taylor series expansion of g(yi) around yi = μi,

    g(yi) ≈ g(μi) + (yi − μi) g′(μi),   where g′(μi) = ∂g(yi)/∂yi evaluated at yi = μi.

The transformation function g(y) must be independent of μ in order to suffice as the variance stabilizing transformation. From the above,

    Var[g(yi)] ≈ [g′(μi)]² σi².

As a result, g(yi) must be such that g′(μi) ∝ 1/σi. Thus, if we suspect that the response is Poisson distributed, σi = μi^{1/2}, so g′(μi) ∝ 1/μi^{1/2}. Thus, the variance stabilizing transformation is g(yi) = yi^{1/2}. From this illustration and similar manipulation for the exponential and gamma distributions, we have the following.
    Distribution    Variance Stabilizing Transformation
    Poisson         g(y) = y^{1/2}
    Exponential     g(y) = ln y
    Gamma           g(y) = ln y
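A small simulation makes the Poisson entry of the table concrete: on the raw scale the variance grows with the mean, while after the square-root transform it is roughly constant (about 1/4). This is an illustration only, not from the text; numpy is assumed.

    import numpy as np

    rng = np.random.default_rng(1)
    for mean in [2, 10, 50]:
        counts = rng.poisson(mean, size=100_000)
        print(mean, round(counts.var(), 2), round(np.sqrt(counts).var(), 3))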
French, and biology:
Sub ject
Student Math English French Biology
//
Exercises
Block 1
Block 2
Block 3
f1 = 42.7 f3 = 48.5 f4 = 32.8 f2 = 39.3
f3 = 50.9 f1 = 50.0 f2 = 38.0 f4 = 40.2
Conduct an analysis of variance at the 0.05 level of sig- nificance using the randomized complete block model.
13.26 Three varieties of potatoes are being compared for yield. The experiment is conducted by assigning each variety at random to one of 3 equal-size plots at each of 4 different locations. The following yields for varieties A, B, and C, in 100 kilograms per plot, were recorded:
Location 1 Location 2 Location 3 Location 4
Perform a randomized complete block analysis of vari- ance to test the hypothesis that there is no difference in the yielding capabilities of the 3 varieties of potatoes. Use a 0.05 level of significance. Draw conclusions.
13.27 The following data are the percents of foreign additives measured by 5 analysts for 3 similar brands of strawberry jam, A, B, and C:
Analyst 1 Analyst 2 Analyst 3 Analyst 4 Analyst 5
Perform a randomized complete block analysis of vari- ance to test the hypothesis, at the 0.05 level of signifi- cance, that the percent of foreign additives is the same for all 3 brands of jam. Which brand of jam appears to have fewer additives?
13.28 The following data represent the final grades obtained by 5 students in mathematics, English,
13.29 In a study on The Periphyton of the South River, Virginia: Mercury Concentration, Productivity, and Autotropic Index Studies, conducted by the De- partment of Environmental Sciences and Engineering at Virginia Tech, the total mercury concentration in periphyton total solids was measured at 6 different sta- tions on 6 different days. Determine whether the mean mercury content is significantly different between the stations by using the following recorded data. Use a P-value and discuss your findings.
Station
Date CA CB El E2 E3 E4
April8 0.45 3.24 1.33 2.04 3.93 5.93 June23 0.10 0.10 0.99 4.31 9.92 6.49 July1 0.25 0.25 1.65 3.13 7.39 4.43 July8 0.09 0.06 0.92 3.66 7.88 6.24 July15 0.15 0.16 2.17 3.50 8.82 5.39 July23 0.17 0.39 4.30 2.91 5.50 4.29
13. 30 A nuclear power facility produces a vast amount of heat, which is usually discharged into aquatic systems. This heat raises the temperature of the aquatic system, resulting in a greater concentration of chlorophyll a, which in turn extends the growing sea- son. To study this effect, water samples were collected monthly at 3 stations for a period of 12 months. Sta- tion A is located closest to a potential heated water discharge, station C is located farthest away from the discharge, and station B is located halfway between stations A and C. The following concentrations of chlorophyll a were recorded.
f4 = 51.1 f2 = 46.3 f1 = 51.9 f3 = 53.5
Test the hypothesis that the courses
ficulty. Use a P-value in your conclusions and discuss your findings.
are of
B: 13 A: 18 C: 12
C: 21 A: 20 B: 23
C:9 B: 12 A: 14
A: 11 C: 10 B: 17
B: 2.7 C: 3.6 A: 3.8
C: 7.5 A: 1.6 B: 5.2
B: 2.8 A: 2.7 C: 6.4
A: 1.7 B: 1.9 C: 2.6
C: 8.1 A: 2.0 B: 4.8

                 Station
Month         A        B        C
January     9.867    3.723    4.410
February   14.035    8.416   11.100
March      10.700   20.723    4.470
April      13.853    9.168    8.010
May         7.067    4.778   34.080
June       11.670    9.145    8.990
July        7.357    8.463    3.350
August      3.358    4.086    4.500
September   4.210    4.233    6.830
October     3.630    2.320    5.800
November    2.953    3.843    3.480
December    2.640    3.610    3.020

Perform an analysis of variance and test the hypothesis, at the 0.05 level of significance, that there is no difference in the mean concentrations of chlorophyll a at the 3 stations.

13.31 In a study conducted by the Department of Health and Physical Education at Virginia Tech, 3 diets were assigned for a period of 3 days to each of 6 subjects in a randomized complete block design. The subjects, playing the role of blocks, were assigned the following 3 diets in a random order:

Diet 1: mixed fat and carbohydrates,
Diet 2: high fat,
Diet 3: high carbohydrates.

At the end of the 3-day period, each subject was put on a treadmill and the time to exhaustion, in seconds, was measured. Perform the analysis of variance, separating out the diet, subject, and error sum of squares. Use a P-value to determine if there are significant differences among the diets, using the following recorded data.

            Subject
Diet     1     2     3     4     5     6
1       84    35    91    57    56    45
2       91    48    71    45    61    61
3      122    53   110    71    91   122

13.32 Organic arsenicals are used by forestry personnel as silvicides. The amount of arsenic that the body takes in when exposed to these silvicides is a major health problem. It is important that the amount of exposure be determined quickly so that a field worker with a high level of arsenic can be removed from the job. In an experiment reported in the paper "A Rapid Method for the Determination of Arsenic Concentrations in Urine at Field Locations," published in the American Industrial Hygiene Association Journal (Vol. 37, 1976), urine specimens from 4 forest service personnel were divided equally into 3 samples each so that each individual's urine could be analyzed for arsenic by a university laboratory, by a chemist using a portable system, and by a forest-service employee after a brief orientation. The following arsenic levels, in parts per million, were recorded:

                   Analyst
Individual   Employee   Chemist   Laboratory
1              0.05       0.05       0.04
2              0.05       0.05       0.04
3              0.04       0.04       0.03
4              0.15       0.17       0.10

Perform an analysis of variance and test the hypothesis, at the 0.05 level of significance, that there is no difference in the arsenic levels for the 3 methods of analysis.

13.33 Scientists in the Department of Plant Pathology at Virginia Tech devised an experiment in which 5 different treatments were applied to 6 different locations in an apple orchard to determine if there were significant differences in growth among the treatments. Treatments 1 through 4 represent different herbicides and treatment 5 represents a control. The growth period was from May to November in 1982, and the amounts of new growth, measured in centimeters, for samples selected from the 6 locations in the orchard were recorded as follows:

                    Location
Treatment    1     2     3     4     5     6
1          455    72    61   215   695   501
2          622    82   444   170   437   134
3          695    56    50   443   701   373
4          607   650   493   257   490   262
5          388   263   185   103   518   622

Perform an analysis of variance, separating out the treatment, location, and error sum of squares. Determine if there are significant differences among the treatment means. Quote a P-value.

13.34 In the paper "Self-Control and Therapist Control in the Behavioral Treatment of Overweight Women," published in Behavioral Research and Therapy (Vol. 10, 1972), two reduction treatments and a control treatment were studied for their effects on the weight change of obese women. The two reduction treatments were a self-induced weight reduction program and a therapist-controlled reduction program. Each of 10 subjects was assigned to one of the 3 treatment programs in a random order and measured for weight loss. The following weight changes were recorded:

                    Treatment
Subject   Control   Self-induced   Therapist
1           1.00       −2.25        −10.50
2           3.75       −6.00        −13.50
3           0.00       −2.00          0.75
4          −0.25       −1.50         −4.50
5          −2.25       −3.25         −6.00
6          −1.00       −1.50          4.00
7          −1.00      −10.75        −12.25
8           3.75       −0.75         −2.75
9           1.50        0.00         −6.75
10          0.50       −3.75         −7.00

Perform an analysis of variance and test the hypothesis, at the 0.01 level of significance, that there is no difference in the mean weight losses for the 3 treatments. Which treatment was best?

13.35 In the book Design of Experiments for Quality Improvement, published by the Japanese Standards Association (1989), a study on the amount of dye needed to get the best color for a certain type of fabric was reported. The three amounts of dye, 1/3% wof (1/3% of the weight of a fabric), 1% wof, and 3% wof, were each administered at two different plants. The color density of the fabric was then observed four times for each level of dye at each plant.

                          Amount of Dye
          1/3%                  1%                      3%
Plant 1   5.2  6.0  5.9  5.9    12.3  10.5  12.4  10.9  22.4  17.8  22.5  18.4
Plant 2   6.5  5.5  6.4  5.9    14.5  11.8  16.0  13.6  29.0  23.2  29.7  24.0

Perform an analysis of variance to test the hypothesis, at the 0.05 level of significance, that there is no difference in the color density of the fabric for the three levels of dye. Consider plants to be blocks.

13.36 An experiment was conducted to compare three types of coating materials for copper wire. The purpose of the coating is to eliminate "flaws" in the wire. Ten different specimens of length 5 millimeters were randomly assigned to receive each coating, and the thirty specimens were subjected to an abrasive wear type process. The number of flaws was measured for each, and the results are as follows:

   Material
 1     2     3
 6     8     4
 5     3     3
 5     4    12
 8     7    14
 7     7     9
 6     2     4
 4     5    18
 6     7    18
 7     8     4
 3     8     5

Suppose it is assumed that the Poisson process applies and thus the model is Yij = μi + εij, where μi is the mean of a Poisson distribution and σ²Yij = μi.
(a) Do an appropriate transformation on the data and perform an analysis of variance.
(b) Determine whether or not there is sufficient evidence to choose one coating material over the others. Show whatever findings suggest a conclusion.
(c) Do a plot of the residuals and comment.
(d) Give the purpose of your data transformation.
(e) What additional assumption is made here that may not have been completely satisfied by your transformation?
(f) Comment on (e) after doing a normal probability plot on the residuals.
13.11 Random Effects Models
Throughout this chapter, we deal with analysis-of-variance procedures in which the primary goal is to study the effect on some response of certain fixed or predetermined treatments. Experiments in which the treatments or treatment levels are preselected by the experimenter as opposed to being chosen randomly are called fixed effects experiments. For the fixed effects model, inferences are made only on those particular treatments used in the experiment.
It is often important that the experimenter be able to draw inferences about a population of treatments by means of an experiment in which the treatments used are chosen randomly from the population. For example, a biologist may be interested in whether or not there is significant variance in some physiological characteristic due to animal type. The animal types actually used in the experiment are then chosen randomly and represent the treatment effects. A chemist may be interested in studying the effect of analytical laboratories on the chemical analysis of a substance. She is not concerned with particular laboratories but rather with a large population of laboratories. She might then select a group of laboratories
at random and allocate samples to each for analysis. The statistical inference would then involve (1) testing whether or not the laboratories contribute a nonzero variance to the analytical results and (2) estimating the variance due to laboratories and the variance within laboratories.
Model and Assumptions for Random Effects Model
The one-way random effects model is written like the fixed effects model but with the terms taking on different meanings. The response yij = μ + αi + εij is now a value of the random variable

Yij = μ + Ai + εij,  with i = 1, 2, . . . , k and j = 1, 2, . . . , n,

where the Ai are independently and normally distributed with mean 0 and variance σα² and are independent of the εij. As for the fixed effects model, the εij are also independently and normally distributed with mean 0 and variance σ². Note that for a random effects experiment, the constraint that Σᵢ₌₁ᵏ αi = 0 no longer applies.

Theorem 13.4: For the one-way random effects analysis-of-variance model,

E(SSA) = (k − 1)σ² + n(k − 1)σα²  and  E(SSE) = k(n − 1)σ².

Table 13.11 shows the expected mean squares for both a fixed effects and a random effects experiment. The computations for a random effects experiment are carried out in exactly the same way as for a fixed effects experiment. That is, the sum-of-squares, degrees-of-freedom, and mean-square columns in an analysis-of-variance table are the same for both models.

Table 13.11: Expected Mean Squares for the One-Factor Experiment

Source of     Degrees of   Mean      Expected Mean Squares
Variation     Freedom      Squares   Fixed Effects               Random Effects
Treatments    k − 1        s1²       σ² + (n/(k − 1)) Σᵢ αi²     σ² + nσα²
Error         k(n − 1)     s²        σ²                          σ²
Total         nk − 1
Hypothesis for a Random Effects Experiment

For the random effects model, the hypothesis that the treatment effects are all zero is written as follows:

H0: σα² = 0,    H1: σα² ≠ 0.

This hypothesis says that the different treatments contribute nothing to the variability of the response. It is obvious from Table 13.11 that s1² and s² are both estimates of σ² when H0 is true and that the ratio

f = s1²/s²

is a value of the random variable F having the F-distribution with k − 1 and k(n − 1) degrees of freedom. The null hypothesis is rejected at the α-level of significance when

f > fα[k − 1, k(n − 1)].

In many scientific and engineering studies, interest is not centered on the F-test. The scientist knows that the random effect does, indeed, have a significant effect. What is more important is estimation of the various variance components. This produces a ranking in terms of what factors produce the most variability and by how much. In the present context, it may be of interest to quantify how much larger the single-factor variance component is than that produced by chance (random variation).

Estimation of Variance Components

Table 13.11 can also be used to estimate the variance components σ² and σα². Since s1² estimates σ² + nσα² and s² estimates σ²,

σ̂² = s²,    σ̂α² = (s1² − s²)/n.
Example 13.7: The data in Table 13.12 are coded observations on the yield of a chemical process, using five batches of raw material selected randomly. Show that the batch variance component is significantly greater than zero and obtain its estimate.

Table 13.12: Data for Example 13.7

Batch:    1      2      3      4      5
         9.7   10.4   15.9    8.6    9.7
         5.6    9.6   14.4   11.1   12.8
         8.4    7.3    8.3   10.7    8.7
         7.9    6.8   12.8    7.6   13.4
         8.2    8.8    7.9    6.4    8.3
         7.7    9.2   11.6    5.9   11.7
         8.1    7.6    9.8    8.1   10.7
Total   55.6   59.7   80.7   58.4   75.3    329.7
Solution: The total, batch, and error sums of squares are, respectively,

SST = 194.64,  SSA = 72.60,  and  SSE = 194.64 − 72.60 = 122.04.

These results, with the remaining computations, are shown in Table 13.13.

Table 13.13: Analysis of Variance for Example 13.7

Source of    Sum of     Degrees of    Mean      Computed
Variation    Squares    Freedom       Square    f
Batches       72.60      4            18.15     4.46
Error        122.04     30             4.07
Total        194.64     34
The f-ratio is significant at the α = 0.05 level, indicating that the hypothesis of a zero batch component is rejected. An estimate of the batch variance component is

σ̂α² = (18.15 − 4.07)/7 = 2.01.
Note that while the batch variance component is significantly different from zero, when gauged against the estimate of σ², namely σ̂² = MSE = 4.07, it appears as if the batch variance component is not appreciably large.
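The computations in Example 13.7 are easy to reproduce in a few lines of code. The following NumPy sketch (ours, not part of the original example) recomputes the analysis of variance of Table 13.13 and the batch variance component estimate from the Table 13.12 data.

import numpy as np

# Coded yields from Table 13.12: k = 5 randomly selected batches, n = 7 each.
y = np.array([
    [9.7, 5.6, 8.4, 7.9, 8.2, 7.7, 8.1],       # batch 1
    [10.4, 9.6, 7.3, 6.8, 8.8, 9.2, 7.6],      # batch 2
    [15.9, 14.4, 8.3, 12.8, 7.9, 11.6, 9.8],   # batch 3
    [8.6, 11.1, 10.7, 7.6, 6.4, 5.9, 8.1],     # batch 4
    [9.7, 12.8, 8.7, 13.4, 8.3, 11.7, 10.7],   # batch 5
])
k, n = y.shape

sst = ((y - y.mean()) ** 2).sum()                    # total sum of squares
ssa = n * ((y.mean(axis=1) - y.mean()) ** 2).sum()   # between-batch sum of squares
sse = sst - ssa                                      # within-batch (error) sum of squares

s1_sq = ssa / (k - 1)        # treatment mean square, estimates sigma^2 + n*sigma_alpha^2
s_sq = sse / (k * (n - 1))   # error mean square, estimates sigma^2
f = s1_sq / s_sq

# Variance component estimate; a negative value would be set to zero
# (see the discussion of REML that follows).
sigma_alpha_sq = max((s1_sq - s_sq) / n, 0.0)

print(f"SST = {sst:.2f}, SSA = {ssa:.2f}, SSE = {sse:.2f}")
print(f"f = {f:.2f}, sigma^2 = {s_sq:.2f}, sigma_alpha^2 = {sigma_alpha_sq:.2f}")
# Expected: SST = 194.64, SSA = 72.60, SSE = 122.04, f = 4.46, estimates 4.07 and 2.01.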
If the result using the formula for σα² appears negative (i.e., when s1² is smaller than s²), σ̂α² is then set to zero. This is a biased estimator. In order to have a better estimator of σα², a method called restricted (or residual) maximum likelihood (REML) is commonly used (see Harville, 1977, in the Bibliography). Such an estimator can be found in many statistical software packages. The details for this estimation procedure are beyond the scope of this text.
Randomized Block Design with Random Blocks
In a randomized complete block experiment where the blocks represent days, it is conceivable that the experimenter would like the results to apply not only to the actual days used in the analysis but to every day in the year. He or she would then select at random the days on which to run the experiment as well as the treatments and use the random effects model
Yij = μ + Ai + Bj + εij , for i = 1, 2, . . . , k and j = 1, 2, . . . , b,
with the Ai, Bj, and εij being independent random variables with means 0 and variances σα², σβ², and σ², respectively. The expected mean squares for a random effects randomized complete block design are obtained, using the same procedure as for the one-factor problem, and are presented along with those for a fixed effects experiment in Table 13.14.
Again the computations for the individual sums of squares and degrees of freedom are identical to those of the fixed effects model. The hypothesis

H0: σα² = 0,    H1: σα² ≠ 0

is carried out by computing

f = s1²/s²

and rejecting H0 when f > fα[k − 1, (b − 1)(k − 1)].

Table 13.14: Expected Mean Squares for the Randomized Complete Block Design

Source of    Degrees of        Mean      Expected Mean Squares
Variation    Freedom           Squares   Fixed Effects               Random Effects
Treatments   k − 1             s1²       σ² + (b/(k − 1)) Σᵢ αi²     σ² + bσα²
Blocks       b − 1             s2²       σ² + (k/(b − 1)) Σⱼ βj²     σ² + kσβ²
Error        (k − 1)(b − 1)    s²        σ²                          σ²
Total        kb − 1

The unbiased estimates of the variance components are

σ̂² = s²,    σ̂α² = (s1² − s²)/b,    σ̂β² = (s2² − s²)/k.

Tests of hypotheses concerning the various variance components are made by computing the ratios of appropriate mean squares, as indicated in Table 13.14, and comparing them with corresponding f-values from Table A.6.
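As a small computational companion, these estimates can be packaged as a helper that converts the three mean squares into variance component estimates. The sketch below is ours (the function name is hypothetical, not from the text); it truncates negative estimates at zero in the manner discussed after Example 13.7, which sacrifices unbiasedness.

def rcb_variance_components(s1_sq, s2_sq, s_sq, k, b):
    """Variance component estimates for the random effects RCB model.

    s1_sq: treatment mean square, s2_sq: block mean square, s_sq: error
    mean square; k treatments, b blocks.
    """
    sigma_sq = s_sq                                # estimates sigma^2
    sigma_alpha_sq = max((s1_sq - s_sq) / b, 0.0)  # estimates sigma_alpha^2
    sigma_beta_sq = max((s2_sq - s_sq) / k, 0.0)   # estimates sigma_beta^2
    return sigma_sq, sigma_alpha_sq, sigma_beta_sq

For example, rcb_variance_components(18.0, 6.0, 4.0, k=3, b=5) returns (4.0, 2.8, 0.667), since (18 − 4)/5 = 2.8 and (6 − 4)/3 ≈ 0.667.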
13.12 Case Study
Case Study 13.1: Chemical Analysis: Personnel in the Chemistry Department of Virginia Tech were called upon to analyze a data set that was produced to compare 4 different methods of analysis of aluminum in a certain solid igniter mixture. To get a broad range of analytical laboratories involved, 5 laboratories were used in the experiment. These laboratories were selected because they are generally adept in doing these types of analyses. Twenty samples of igniter material containing 2.70% aluminum were assigned randomly, 4 to each laboratory, and directions were given on how to carry out the chemical analysis using all 4 methods. The data retrieved
are as follows:
Laboratory
Method 1 2 3 4 5 Mean
A 2.67 2.69 2.62 2.66 2.70 2.668
B 2.71 2.74 2.69 2.70 2.77 2.722
C 2.76 2.76 2.70 2.76 2.81 2.758
D 2.65 2.69 2.60 2.64 2.73 2.662
The laboratories are not considered as random effects since they were not selected randomly from a larger population of laboratories. The data were analyzed as a randomized complete block design. Plots of the data were sought to determine if an additive model of the type
yij = μ + mi + lj + εij
is appropriate: in other words, a model with additive effects. The randomized block is not appropriate when interaction between laboratories and methods exists. Consider the plot shown in Figure 13.12. Although this plot is a bit difficult to interpret because each point is a single observation, there appears to be no appreciable interaction between methods and laboratories.
Figure 13.12: Interaction plot for data of Case Study 13.1.

Residual Plots

Residual plots were used as diagnostic indicators regarding the homogeneous variance assumption. Figure 13.13 shows a plot of residuals against analytical methods. The variability depicted in the residuals seems to be remarkably homogeneous. For completeness, a normal probability plot of the residuals is shown in Figure 13.14.

Figure 13.13: Plot of residuals against method for the data of Case Study 13.1.

Figure 13.14: Normal probability plot of residuals for the data of Case Study 13.1.

The residual plots show no difficulty with either the assumption of normal errors or the assumption of homogeneous variance. SAS PROC GLM was used to conduct the analysis of variance. Figure 13.15 shows the annotated computer printout.
The computed f- and P-values do indicate a significant difference between analytical methods. This analysis can be followed by a multiple comparison analysis to determine where the differences are among the methods.
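For readers who want to reproduce the analysis of Figure 13.15 without SAS, here is a sketch (our code, not part of the original study) using Python's pandas and statsmodels packages to fit the additive randomized complete block model by ordinary least squares. For this balanced design, the resulting sums of squares match the Type III analysis in the SAS printout.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Aluminum analyses from Case Study 13.1: 4 methods (treatments), 5 laboratories (blocks).
data = {
    "A": [2.67, 2.69, 2.62, 2.66, 2.70],
    "B": [2.71, 2.74, 2.69, 2.70, 2.77],
    "C": [2.76, 2.76, 2.70, 2.76, 2.81],
    "D": [2.65, 2.69, 2.60, 2.64, 2.73],
}
rows = [(m, lab, y) for m, ys in data.items() for lab, y in enumerate(ys, start=1)]
df = pd.DataFrame(rows, columns=["method", "lab", "response"])

# Additive model y_ij = mu + m_i + l_j + e_ij, both factors categorical.
model = smf.ols("response ~ C(method) + C(lab)", data=df).fit()
print(anova_lm(model, typ=2))  # Type II equals Type III here: the design is balanced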
Exercises
13.37 Testing patient blood samples for HIV antibodies, a spectrophotometer determines the optical density of each sample. Optical density is measured as the absorbance of light at a particular wavelength. The blood sample is positive if it exceeds a certain cutoff value that is determined by the control samples for that run. Researchers are interested in comparing the laboratory variability for the positive control values. The data represent positive control values for 10 different runs at 4 randomly selected laboratories.

             Laboratory
Run      1       2       3       4
1      0.888   1.065   1.325   1.232
2      0.983   1.226   1.069   1.127
3      1.047   1.332   1.219   1.051
4      1.087   0.958   0.958   0.897
5      1.125   0.816   0.819   1.222
6      0.997   1.015   1.140   1.125
7      1.025   1.071   1.222   0.990
8      0.969   0.905   0.995   0.875
9      0.898   1.140   0.928   0.930
10     1.018   1.051   1.322   0.775

(a) Write an appropriate model for this experiment.
(b) Estimate the laboratory variance component and the variance within laboratories.

13.38 An experiment is conducted in which 4 treatments are to be compared in 5 blocks. The data are given below.

                   Block
Treatment    1      2      3      4      5
1          12.8   10.6   11.7   10.7   11.0
2          11.7   14.2   11.8    9.9   13.8
3          11.5   14.7   13.6   10.7   15.9
4          12.6   16.5   15.4    9.6   17.1

(a) Assuming a random effects model, test the hypothesis, at the 0.05 level of significance, that there is no difference between treatment means.
(b) Compute estimates of the treatment and block variance components.

13.39 The following data show the effect of 4 operators, chosen randomly, on the output of a particular machine.

          Operator
   1       2       3       4
175.4   168.5   170.1   175.2
171.7   162.7   173.4   175.7
173.0   165.0   175.7   180.1
170.5   164.1   170.7   183.7

(a) Perform a random effects analysis of variance at the 0.05 level of significance.
(b) Compute an estimate of the operator variance component and the experimental error variance component.

13.40 Five "pours" of metals have had 5 core samples each analyzed for the amount of a trace element. The data for the 5 randomly selected pours are as follows:

              Pour
Core     1      2      3      4      5
1      0.98   0.85   1.12   1.21   1.00
2      1.02   0.92   1.68   1.19   1.21
3      1.57   1.16   0.99   1.32   0.93
4      1.25   1.43   1.26   1.08   0.86
5      1.16   0.99   1.05   0.94   1.41

(a) The intent is that the pours be identical. Thus, test that the "pour" variance component is zero. Draw conclusions.
(b) Show a complete ANOVA along with an estimate of the within-pour variance.

13.41 A textile company weaves a certain fabric on a large number of looms. The managers would like the looms to be homogeneous so that their fabric is of uniform strength. It is suspected that there may be significant variation in strength among looms. Consider the following data for 4 randomly selected looms. Each observation is a determination of strength of the fabric in pounds per square inch.

       Loom
  1     2     3     4
 99    97    94    93
 97    96    95    94
 97    92    90    90
 96    98    92    92

(a) Write a model for the experiment.
(b) Does the loom variance component differ significantly from zero?
(c) Comment on the managers' suspicion.
The GLM Procedure

Class Level Information
Class     Levels    Values
Method    4         A B C D
Lab       5         1 2 3 4 5

Number of Observations Read    20
Number of Observations Used    20

Dependent Variable: Response
                               Sum of
Source            DF          Squares     Mean Square    F Value   Pr > F
Model              7       0.05340500      0.00762929      42.19   <.0001
Error             12       0.00217000      0.00018083
Corrected Total   19       0.05557500

R-Square    Coeff Var    Root MSE    Response Mean
0.960954     0.497592    0.013447         2.702500

Source     DF    Type III SS    Mean Square    F Value   Pr > F
Method      3     0.03145500     0.01048500      57.98   <.0001
Lab         4     0.02195000     0.00548750      30.35   <.0001

Observation    Observed      Predicted      Residual
 1           2.67000000    2.66300000     0.00700000
 2           2.71000000    2.71700000    -0.00700000
 3           2.76000000    2.75300000     0.00700000
 4           2.65000000    2.65700000    -0.00700000
 5           2.69000000    2.68550000     0.00450000
 6           2.74000000    2.73950000     0.00050000
 7           2.76000000    2.77550000    -0.01550000
 8           2.69000000    2.67950000     0.01050000
 9           2.62000000    2.61800000     0.00200000
10           2.69000000    2.67200000     0.01800000
11           2.70000000    2.70800000    -0.00800000
12           2.60000000    2.61200000    -0.01200000
13           2.66000000    2.65550000     0.00450000
14           2.70000000    2.70950000    -0.00950000
15           2.76000000    2.74550000     0.01450000
16           2.64000000    2.64950000    -0.00950000
17           2.70000000    2.71800000    -0.01800000
18           2.77000000    2.77200000    -0.00200000
19           2.81000000    2.80800000     0.00200000
20           2.73000000    2.71200000     0.01800000

Figure 13.15: SAS printout for data of Case Study 13.1.

Review Exercises

13.42 An analysis was conducted by the Statistics Consulting Center at Virginia Tech in conjunction with the Department of Forestry. A certain treatment was applied to a set of tree stumps in which the chemical Garlon was used with the purpose of regenerating the roots of the stumps. A spray was used with four levels of Garlon concentration. After a period of time, the height of the shoots was observed. Perform a one-factor analysis of variance on the following data. Test to see if the concentration of Garlon has a significant impact on the height of the shoots. Use α = 0.05.

     Garlon Level
  1      2      3      4
2.87   2.31   3.27   2.66
2.39   1.91   3.05   0.91
3.91   2.04   3.15   2.00
2.89   1.89   2.43   0.01

13.43 Consider the aggregate data of Example 13.1. Perform Bartlett's test, at level α = 0.1, to determine if there is heterogeneity of variance among the aggregates.

13.44 Three catalysts are used in a chemical process; a control (no catalyst) is also included. The following are yield data from the process:

              Catalyst
Control     1       2       3
 74.5     77.5    81.5    82.0
 76.1     82.3    80.2    80.6
 75.9     81.4    81.5    84.9
 78.1     79.5    83.0    81.0
 76.2     83.0    82.1    78.1

Use Dunnett's test, at the α = 0.01 level of significance, to determine if a significantly higher yield is obtained with the catalysts than with no catalyst.

13.45 Four laboratories are being used to perform chemical analysis. Samples of the same material are sent to the laboratories for analysis as part of a study to determine whether or not they give, on the average, the same results. The analytical results for the four laboratories are as follows:

        Laboratory
  A       B       C       D
58.7    62.7    55.9    60.7
61.4    64.5    56.1    60.3
60.9    63.1    57.3    60.9
59.1    59.2    55.2    61.4
58.2    60.3    58.1    62.3

(a) Use Bartlett's test to show that the within-laboratory variances are not significantly different at the α = 0.05 level of significance.
(b) Perform the analysis of variance and give conclusions concerning the laboratories.
(c) Do a normal probability plot of residuals.

13.46 An experiment was designed for personnel in the Department of Animal Science at Virginia Tech to study urea and aqueous ammonia treatment of wheat straw. The purpose was to improve nutritional value for male sheep. The diet treatments were control, urea at feeding, ammonia-treated straw, and urea-treated straw. Twenty-four sheep were used in the experiment, and they were separated according to relative weight. There were four sheep in each homogeneous group (by weight) and each of them was given one of the four diets in random order. For each of the 24 sheep, the percent dry matter digested was measured. The data follow.

                        Group by Weight (block)
Diet                1      2      3      4      5      6
Control           32.68  36.22  36.36  40.95  34.99  33.89
Urea at feeding   35.90  38.73  37.55  34.64  37.36  34.35
Ammonia treated   49.43  53.50  52.86  45.00  47.20  49.76
Urea treated      46.58  42.82  45.41  45.08  43.81  47.40

(a) Use a randomized complete block type of analysis to test for differences between the diets. Use α = 0.05.
(b) Use Dunnett's test to compare the three diets with the control. Use α = 0.05.
(c) Do a normal probability plot of residuals.

13.47 In a study that was analyzed for personnel in the Department of Biochemistry at Virginia Tech, three diets were given to groups of rats in order to study the effect of each on dietary residual zinc in the bloodstream. Five pregnant rats were randomly assigned to each diet group, and each was given the diet on day 22 of pregnancy. The amount of zinc in parts per million was measured. The data are as follows:

      Diet
  1      2      3
0.50   0.42   1.06
0.42   0.40   0.82
0.65   0.73   0.72
0.47   0.47   0.72
0.44   0.69   0.82

Determine if there is a significant difference in residual dietary zinc among the three diets. Use α = 0.05. Perform a one-way ANOVA.

13.48 An experiment was conducted to compare three types of paint for evidence of differences in their wearing qualities. They were exposed to abrasive action and the time in hours until abrasion was noticed was observed. Six specimens were used for each type of paint. The data are as follows.

     Paint Type
  1      2      3
 158     97    282
 515    264    544
 317    662    213
 315    220    115
 525    330    525
 536    175    614

(a) Do an analysis of variance to determine if the evidence suggests that wearing quality differs for the three paints. Use a P-value in your conclusion.
(b) If significant differences are found, characterize what they are. Is there one paint that stands out? Discuss your findings.
(c) Do whatever graphical analysis you need to determine if assumptions used in (a) are valid. Discuss your findings.
(d) Suppose it is determined that the data for each treatment follow an exponential distribution. Does this suggest an alternative analysis? If so, do the alternative analysis and give findings.

13.49 A company that stamps gaskets out of sheets of rubber, plastic, and cork wants to compare the mean number of gaskets produced per hour for the three types of material. Two randomly selected stamping machines are chosen as blocks. The data represent the number of gaskets (in thousands) produced per hour. The data are given below. In addition, the printout analysis is given in Figure 13.16.

                       Material
Machine    Cork              Plastic           Rubber
A       4.31 4.27 4.40    4.01 3.94 3.89    3.36 3.42 3.48
B       3.94 3.81 3.99    3.48 3.53 3.42    3.91 3.80 3.85

(a) Why would the stamping machines be chosen as blocks?
(b) Plot the six means for machine and material combinations.
(c) Is there a single material that is best?
(d) Is there an interaction between treatments and blocks? If so, is the interaction causing any serious difficulty in arriving at a proper conclusion? Explain.

13.50 A study is conducted to compare gas mileage for 3 competing brands of gasoline. Four different automobile models of varying size are randomly selected. The data, in miles per gallon, follow. The order of testing is random for each model.

         Gasoline Brand
Model     A       B       C
A        32.4    35.6    38.7
B        28.8    28.6    29.9
C        36.5    37.6    39.1
D        34.4    36.2    37.9

(a) Discuss the need for the use of more than a single model of car.
(b) Consider the ANOVA from the SAS printout in Figure 13.17. Does brand of gasoline matter?
(c) Which brand of gasoline would you select? Consult the result of Duncan's test.

13.51 Four different locations in the northeast were used for collecting ozone measurements in parts per million. Amounts of ozone were collected in 5 samples at each location.

       Location
  1      2      3      4
0.09   0.15   0.10   0.10
0.10   0.12   0.13   0.07
0.08   0.17   0.08   0.05
0.08   0.18   0.08   0.08
0.11   0.14   0.09   0.09

(a) Is there sufficient information here to suggest that there are differences in the mean ozone levels across locations? Be guided by a P-value.
(b) If significant differences are found in (a), characterize the nature of the differences. Use whatever methods you have learned.

13.52 Show that the mean square error

s² = SSE/(k(n − 1))

for the analysis of variance in a one-way classification is an unbiased estimate of σ².

13.53 Prove Theorem 13.2.

13.54 Show that the computing formula for SSB, in the analysis of variance of the randomized complete block design, is equivalent to the corresponding term in the identity of Theorem 13.3.

13.55 For the randomized block design with k treatments and b blocks, show that

E(SSB) = (b − 1)σ² + k Σⱼ₌₁ᵇ βj².

The GLM Procedure
Dependent Variable: gasket
                               Sum of
Source            DF          Squares     Mean Square    F Value   Pr > F
Model              5       1.68122778      0.33624556      76.52   <.0001
Error             12       0.05273333      0.00439444
Corrected Total   17       1.73396111

R-Square    Coeff Var    Root MSE    gasket Mean
0.969588     1.734095    0.066291       3.822778

Source              DF    Type III SS    Mean Square    F Value   Pr > F
material             2     0.81194444     0.40597222      92.38   <.0001
machine              1     0.10125000     0.10125000      23.04   0.0004
material*machine     2     0.76803333     0.38401667      87.39   <.0001

Level of    Level of          gasket
material    machine     N     Mean           Std Dev
cork        A           3     4.32666667     0.06658328
cork        B           3     3.91333333     0.09291573
plastic     A           3     3.94666667     0.06027714
plastic     B           3     3.47666667     0.05507571
rubber      A           3     3.42000000     0.06000000
rubber      B           3     3.85333333     0.05507571

Level of                gasket
material    N           Mean           Std Dev
cork        6           4.12000000     0.23765521
plastic     6           3.71166667     0.26255793
rubber      6           3.63666667     0.24287171

Level of                gasket
machine     N           Mean           Std Dev
A           9           3.89777778     0.39798800
B           9           3.74777778     0.21376259

Figure 13.16: SAS printout for Review Exercise 13.49.
R-Square
0.953591
Source
Model
Brand
Coeff Var Root MSE
3.218448 1.114924
MPG Mean
34.64167
DF 5 6
Squares
153.2508333
7.4583333
DF Type III SS Mean Square F Value Pr > F
3 130.3491667 43.4497222 34.95 0.0003
2 22.9016667 11.4508333
Duncan’s Multiple Range Test for MPG
9.21 0.0148
NOTE: This test controls the Type I comparisonwise error rate, not
the experimentwise error rate.
Alpha
Error Degrees of Freedom
Error Mean Square
0.05
6
1.243056
Number of Means
2 3
1.929 1.999
Critical Range
Means with the same letter are not significantly different.
Duncan Grouping Mean N Brand
A 36.4000 4 C
A
B A 34.5000 4 B
B
B 33.0250 4 A
Figure 13.17: SAS printout for Review Exercise 13.50.
13.56 Group Project: It is of interest to determine which type of sports ball can be thrown the longest distance. The competition involves a tennis ball, a baseball, and a softball. Divide the class into teams of five individuals. Each team should design and conduct a separate experiment. Each team should also analyze the data from its own experiment. For a given team, each of the five individuals will throw each ball (after sufficient arm warmup). The experimental response will be the distance (in feet) that the ball is thrown. The data for each team will involve 15 observations. Important points:
(a) This is not a competition among teams. The competition is among the three types of sports balls. One would expect that the conclusion drawn by each team would be similar.
(b) Each team should be gender mixed.
(c) The experimental design for each team should be a randomized complete block design. The five individuals throwing are the blocks.
(d) Be sure to incorporate the appropriate randomization in conducting the experiment.
(e) The results should contain a description of the experiment with an ANOVA table complete with a P-value and appropriate conclusions. Use graphical techniques where appropriate. Use multiple comparisons where appropriate. Draw practical conclusions concerning differences between the ball types. Be thorough.

13.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
As in other procedures covered in previous chapters, the analysis of variance is reasonably robust to the normality assumption but less robust to the homogeneous variance assumption. Also we note here that Bartlett’s test for equal variance is extremely nonrobust to normality.
This chapter is an extremely pivotal chapter in that it is essentially an "entry level" point for important topics such as design of experiments and analysis of variance. Chapter 14 will concern itself with the same topics, but the expansion will be to more than one factor, with the total analysis further complicated by the interpretation of interaction among factors. There are times when the role of interaction in a scientific experiment is more important than the role of the main factors (main effects). The presence of interaction results in even more emphasis placed on graphical displays. In Chapters 14 and 15, it will be necessary to give more details regarding the randomization process since the number of factor combinations can be large.
Chapter 14
Factorial Experiments (Two or More Factors)
14.1 Introduction
Consider a situation where it is of interest to study the effects of two factors, A and B, on some response. For example, in a chemical experiment, we would like to vary simultaneously the reaction pressure and reaction time and study the effect of each on the yield. In a biological experiment, it is of interest to study the effects of drying time and temperature on the amount of solids (percent by weight) left in samples of yeast. As in Chapter 13, the term factor is used in a general sense to denote any feature of the experiment such as temperature, time, or pressure that may be varied from trial to trial. We define the levels of a factor to be the actual values used in the experiment.
For each of these cases, it is important to determine not only if each of the two factors has an influence on the response, but also if there is a significant interaction between the two factors. As far as terminology is concerned, the experiment described here is a two-factor experiment and the experimental design may be either a completely randomized design, in which the various treatment combinations are assigned randomly to all the experimental units, or a randomized complete block design, in which factor combinations are assigned randomly within blocks. In the case of the yeast example, the various treatment combinations of temperature and drying time would be assigned randomly to the samples of yeast if we were using a completely randomized design.
Many of the concepts studied in Chapter 13 are extended in this chapter to two and three factors. The main thrust of this material is the use of the completely randomized design with a factorial experiment. A factorial experiment in two factors involves experimental trials (or a single trial) with all factor combinations. For example, in the temperature-drying-time example with, say, 3 levels of each and n = 2 runs at each of the 9 combinations, we have a two-factor factorial experiment in a completely randomized design. Neither factor is a blocking factor; we are interested in how each influences percent solids in the samples and whether or not they interact. The biologist would have available 18 physical samples of
material which are experimental units. These would then be assigned randomly to the 18 combinations (9 treatment combinations, each duplicated).
Before we launch into analytical details, sums of squares, and so on, it may be of interest for the reader to observe the obvious connection between what we have described and the situation with the one-factor problem. Consider the yeast experiment. Explanation of degrees of freedom aids the reader or the analyst in visualizing the extension. We should initially view the 9 treatment combinations as if they represented one factor with 9 levels (8 degrees of freedom). Thus, an initial look at degrees of freedom gives
Treatment combinations     8
Error                      9
Total                     17
Main Effects and Interaction
The experiment could be analyzed as described in the above table. However, the F-test for combinations would probably not give the analyst the information he or she desires, namely, that which considers the role of temperature and drying time. Three drying times have 2 associated degrees of freedom; three temperatures have 2 degrees of freedom. The main factors, temperature and drying time, are called main effects. The main effects represent 4 of the 8 degrees of freedom for factor combinations. The additional 4 degrees of freedom are associated with interaction between the two factors. As a result, the analysis involves
Combinations       8
  Temperature      2
  Drying time      2
  Interaction      4
Error              9
Total             17
Recall from Chapter 13 that factors in an analysis of variance may be viewed as fixed or random, depending on the type of inference desired and how the levels were chosen. Here we must consider fixed effects, random effects, and even cases where effects are mixed. Most attention will be directed toward expected mean squares when we advance to these topics. In the following section, we focus on the concept of interaction.
14.2 Interaction in the Two-Factor Experiment
In the randomized block model discussed previously, it was assumed that one observation on each treatment is taken in each block. If the model assumption is correct, that is, if blocks and treatments are the only real effects and interaction does not exist, the expected value of the mean square error is the experimental error variance σ2. Suppose, however, that there is interaction occurring between treatments and blocks as indicated by the model
yij = μ + αi + βj + (αβ)ij + εij

of Section 13.8. The expected value of the mean square error is then given as

E[SSE/((b − 1)(k − 1))] = σ² + (1/((b − 1)(k − 1))) Σᵢ Σⱼ (αβ)²ij.

The treatment and block effects do not appear in the expected mean square error, but the interaction effects do. Thus, if there is interaction in the model, the mean square error reflects variation due to experimental error plus an interaction contribution, and for this experimental plan, there is no way of separating them.
Interaction and the Interpretation of Main Effects
From an experimenter’s point of view it should seem necessary to arrive at a significance test on the existence of interaction by separating true error variation from that due to interaction. The main effects, A and B, take on a different meaning in the presence of interaction. In the previous biological example, the effect that drying time has on the amount of solids left in the yeast might very well depend on the temperature to which the samples are exposed. In general, there could be experimental situations in which factor A has a positive effect on the response at one level of factor B, while at a different level of factor B the effect of A is negative. We use the term positive effect here to indicate that the yield or response increases as the levels of a given factor increase according to some defined order. In the same sense, a negative effect corresponds to a decrease in response for increasing levels of the factor.
Consider, for example, the following data on temperature (factor A at levels t1, t2, and t3 in increasing order) and drying time d1, d2, and d3 (also in increasing order). The response is percent solids. These data are completely hypothetical and given to illustrate a point.
             B
A        d1      d2      d3      Total
t1       4.4     8.8     5.2     18.4
t2       7.5     8.5     2.4     18.4
t3       9.7     7.9     0.8     18.4
Total   21.6    25.2     8.4     55.2
Clearly the effect of temperature on percent solids is positive at the low drying time d1 but negative for high drying time d3. This clear interaction between temperature and drying time is obviously of interest to the biologist, but, based on the totals of the responses for temperatures t1, t2, and t3, the temperature sum of squares, SSA, will yield a value of zero. We say then that the presence of interaction is masking the effect of temperature. Thus, if we consider the average effect of temperature, averaged over drying time, there is no effect. This then defines the main effect. But, of course, this is likely not what is pertinent to the biologist.
Before drawing any final conclusions resulting from tests of significance on the main effects and interaction effects, the experimenter should first observe whether or not the test for interaction is significant. If interaction is
not significant, then the results of the tests on the main effects are meaningful. However, if interaction should be significant, then only those tests on the main effects that turn out to be significant are meaningful. Nonsignificant main effects in the presence of interaction might well be a result of masking and dictate the need to observe the influence of each factor at fixed levels of the other.
A Graphical Look at Interaction
The presence of interaction as well as its scientific impact can be interpreted nicely through the use of interaction plots. The plots clearly give a pictorial view of the tendency in the data to show the effect of changing one factor as one moves from one level to another of a second factor. Figure 14.1 illustrates the strong temperature by drying time interaction. The interaction is revealed in nonparallel lines.

Figure 14.1: Interaction plot for temperature–drying time data.
The relatively strong temperature effect on percent solids at the lower drying time is reflected in the steep slope at d1. At the middle drying time d2 the temperature has very little effect, while at the high drying time d3 the negative slope illustrates a negative effect of temperature. Interaction plots such as this set give the scientist a quick and meaningful interpretation of the interaction that is present. It should be apparent that parallelism in the plots signals an absence of interaction.
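An interaction plot such as Figure 14.1 takes only a few lines to produce. Here is a minimal matplotlib sketch (ours) using the hypothetical percent-solids data tabulated earlier; nonparallel (here, crossing) lines are the visual signature of interaction.

import matplotlib.pyplot as plt

# Percent solids at temperatures t1 < t2 < t3 for each drying time.
temperature = [1, 2, 3]
percent_solids = {"d1": [4.4, 7.5, 9.7], "d2": [8.8, 8.5, 7.9], "d3": [5.2, 2.4, 0.8]}

for drying_time, ys in percent_solids.items():
    plt.plot(temperature, ys, marker="o", label=drying_time)

plt.xticks(temperature, ["t1", "t2", "t3"])
plt.xlabel("Temperature")
plt.ylabel("Percent Solids")
plt.legend(title="Drying time")
plt.show()  # parallel lines would suggest no interaction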
Need for Multiple Observations
Interaction and experimental error are separated in the two-factor experiment only if multiple observations are taken at the various treatment combinations. For maximum efficiency, there should be the same number n of observations at each combination. These should be true replications, not just repeated measurements. For
example, in the yeast illustration, if we take n = 2 observations at each combination of temperature and drying time, there should be two separate samples and not merely repeated measurements on the same sample. This allows variability due to experimental units to appear in "error," so the variation is not merely measurement error.
14.3 Two-Factor Analysis of Variance
To present general formulas for the analysis of variance of a two-factor experiment using repeated observations in a completely randomized design, we shall consider the case of n replications of the treatment combinations determined by a levels of factor A and b levels of factor B. The observations may be classified by means of a rectangular array where the rows represent the levels of factor A and the columns represent the levels of factor B. Each treatment combination defines a cell in our array. Thus, we have ab cells, each cell containing n observations. Denoting the kth observation taken at the ith level of factor A and the jth level of factor B by yijk, Table 14.1 shows the abn observations.
Table 14.1: Two-Factor Experiment with n Replications

                         B
A        1       2       ···     b        Total    Mean
1        y111    y121    ···     y1b1     Y1..     ȳ1..
         y112    y122    ···     y1b2
         ...     ...             ...
         y11n    y12n    ···     y1bn
2        y211    y221    ···     y2b1     Y2..     ȳ2..
         y212    y222    ···     y2b2
         ...     ...             ...
         y21n    y22n    ···     y2bn
...      ...     ...             ...      ...      ...
a        ya11    ya21    ···     yab1     Ya..     ȳa..
         ya12    ya22    ···     yab2
         ...     ...             ...
         ya1n    ya2n    ···     yabn
Total    Y.1.    Y.2.    ···     Y.b.     Y...
Mean     ȳ.1.    ȳ.2.    ···     ȳ.b.              ȳ...
The observations in the (ij)th cell constitute a random sample of size n from a population that is assumed to be normally distributed with mean μij and variance σ². All ab populations are assumed to have the same variance σ². Let us define
the following useful symbols, some of which are used in Table 14.1:
Yij. = sum of the observations in the (ij)th cell,
Yi.. = sum of the observations for the ith level of factor A,
Y.j. = sum of the observations for the jth level of factor B,
Y... = sum of all abn observations,
ȳij. = mean of the observations in the (ij)th cell,
ȳi.. = mean of the observations for the ith level of factor A,
ȳ.j. = mean of the observations for the jth level of factor B,
ȳ... = mean of all abn observations.
Unlike in the one-factor situation covered at length in Chapter 13, here we are assuming that the populations, where n independent identically distributed observations are taken, are combinations of factors. Also we will assume throughout that an equal number (n) of observations are taken at each factor combination. In cases in which the sample sizes per combination are unequal, the computations are more complicated but the concepts are transferable.
Model and Hypotheses for the Two-Factor Problem
Each observation in Table 14.1 may be written in the form

yijk = μij + εijk,

where εijk measures the deviations of the observed yijk values in the (ij)th cell from the population mean μij. If we let (αβ)ij denote the interaction effect of the ith level of factor A and the jth level of factor B, αi the effect of the ith level of factor A, βj the effect of the jth level of factor B, and μ the overall mean, we can write

μij = μ + αi + βj + (αβ)ij,

and then

yijk = μ + αi + βj + (αβ)ij + εijk,

on which we impose the restrictions

Σᵢ αi = 0,    Σⱼ βj = 0,    Σᵢ (αβ)ij = 0,    Σⱼ (αβ)ij = 0.
The three hypotheses to be tested are as follows:

1. H0′: α1 = α2 = ··· = αa = 0,
   H1′: At least one of the αi is not equal to zero.
2. H0″: β1 = β2 = ··· = βb = 0,
   H1″: At least one of the βj is not equal to zero.
3. H0‴: (αβ)11 = (αβ)12 = ··· = (αβ)ab = 0,
   H1‴: At least one of the (αβ)ij is not equal to zero.
We warned the reader about the problem of masking of main effects when interaction is a heavy contributor in the model. It is recommended that the interaction test result be considered first. The interpretation of the main effect test follows, and the nature of the scientific conclusion depends on whether interaction is found. If interaction is ruled out, then hypotheses 1 and 2 above can be tested and the interpretation is quite simple. However, if interaction is found to be present the interpretation can be more complicated, as we have seen from the discussion of the drying time and temperature in the previous section. In what follows, the structure of the tests of hypotheses 1, 2, and 3 will be discussed. Interpretation of results will be incorporated in the discussion of the analysis in Example 14.1.

The tests of the hypotheses above will be based on a comparison of independent estimates of σ² provided by splitting the total sum of squares of our data into four components by means of the following identity.
Partitioning of Variability in the Two-Factor Case

Theorem 14.1 (Sum-of-Squares Identity):

Σᵢ Σⱼ Σₖ (yijk − ȳ...)² = bn Σᵢ (ȳi.. − ȳ...)² + an Σⱼ (ȳ.j. − ȳ...)²
    + n Σᵢ Σⱼ (ȳij. − ȳi.. − ȳ.j. + ȳ...)² + Σᵢ Σⱼ Σₖ (yijk − ȳij.)²,

where the sums run over i = 1, . . . , a; j = 1, . . . , b; and k = 1, . . . , n.

Symbolically, we write the sum-of-squares identity as

SST = SSA + SSB + SS(AB) + SSE,

where SSA and SSB are called the sums of squares for the main effects A and B, respectively, SS(AB) is called the interaction sum of squares for A and B, and SSE is the error sum of squares. The degrees of freedom are partitioned according to the identity

abn − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1) + ab(n − 1).

Formation of Mean Squares

If we divide each of the sums of squares on the right side of the sum-of-squares identity by its corresponding number of degrees of freedom, we obtain the four statistics

S1² = SSA/(a − 1),  S2² = SSB/(b − 1),  S3² = SS(AB)/((a − 1)(b − 1)),  S² = SSE/(ab(n − 1)).
All of these variance estimates are independent estimates of σ² under the condition that there are no effects αi, βj, and, of course, (αβ)ij. If we interpret the sums of
squares as functions of the independent random variables y111, y112, . . . , yabn, it is not difficult to verify that

E(S1²) = E[SSA/(a − 1)] = σ² + (nb/(a − 1)) Σᵢ αi²,
E(S2²) = E[SSB/(b − 1)] = σ² + (na/(b − 1)) Σⱼ βj²,
E(S3²) = E[SS(AB)/((a − 1)(b − 1))] = σ² + (n/((a − 1)(b − 1))) Σᵢ Σⱼ (αβ)²ij,
E(S²) = E[SSE/(ab(n − 1))] = σ²,

from which we immediately observe that all four estimates of σ² are unbiased when H0′, H0″, and H0‴ are true.

F-Test for Factor A

To test the hypothesis H0′, that the effects of factor A are all equal to zero, we compute the following ratio:

f1 = s1²/s²,

which is a value of the random variable F1 having the F-distribution with a − 1 and ab(n − 1) degrees of freedom when H0′ is true. The null hypothesis is rejected at the α-level of significance when f1 > fα[a − 1, ab(n − 1)].

F-Test for Factor B

Similarly, to test the hypothesis H0″, that the effects of factor B are all equal to zero, we compute the following ratio:

f2 = s2²/s²,

which is a value of the random variable F2 having the F-distribution with b − 1 and ab(n − 1) degrees of freedom when H0″ is true. This hypothesis is rejected at the α-level of significance when f2 > fα[b − 1, ab(n − 1)].

F-Test for Interaction

Finally, to test the hypothesis H0‴, that the interaction effects are all equal to zero, we compute the following ratio:

f3 = s3²/s²,

which is a value of the random variable F3 having the F-distribution with (a − 1)(b − 1) and ab(n − 1) degrees of freedom when H0‴ is true. We conclude that, at the α-level of significance, interaction is present when f3 > fα[(a − 1)(b − 1), ab(n − 1)].
As indicated in Section 14.2, it is advisable to interpret the test for interaction before attempting to draw inferences on the main effects. If interaction is not significant, there is certainly evidence that the tests on main effects are interpretable. Rejection of hypothesis 1 above implies that the response means at the levels
of factor A are significantly different, while rejection of hypothesis 2 implies a similar condition for the means at levels of factor B. However, a significant interaction could very well imply that the data should be analyzed in a somewhat different manner—perhaps observing the effect of factor A at fixed levels of factor B, and so forth.

The computations in an analysis-of-variance problem, for a two-factor experiment with n replications, are usually summarized as in Table 14.2.
Table 14.2: Analysis of Variance for the Two-Factor Experiment with n Replications

Source of       Sum of     Degrees of        Mean                             Computed
Variation       Squares    Freedom           Square                           f
Main effect:
  A             SSA        a − 1             s1² = SSA/(a − 1)                f1 = s1²/s²
  B             SSB        b − 1             s2² = SSB/(b − 1)                f2 = s2²/s²
Two-factor interaction:
  AB            SS(AB)     (a − 1)(b − 1)    s3² = SS(AB)/((a − 1)(b − 1))    f3 = s3²/s²
Error           SSE        ab(n − 1)         s² = SSE/(ab(n − 1))
Total           SST        abn − 1
Example 14.1: In an experiment conducted to determine which of 3 different missile systems is preferable, the propellant burning rate for 24 static firings was measured. Four different propellant types were used. The experiment yielded duplicate observations of burning rates at each combination of the treatments.

The data, after coding, are given in Table 14.3. Test the following hypotheses: (a) H0′: there is no difference in the mean propellant burning rates when different missile systems are used, (b) H0″: there is no difference in the mean propellant burning rates of the 4 propellant types, (c) H0‴: there is no interaction between the different missile systems and the different propellant types.

Table 14.3: Propellant Burning Rates

Missile          Propellant Type
System      b1      b2      b3      b4
a1         34.0    30.1    29.8    29.0
           32.7    32.8    26.7    28.9
a2         32.0    30.2    28.7    27.6
           33.2    29.8    28.1    27.8
a3         28.4    27.3    29.7    28.8
           29.3    28.9    27.3    29.1

Solution:
1. (a) H0′: α1 = α2 = α3 = 0.
   (b) H0″: β1 = β2 = β3 = β4 = 0.
   (c) H0‴: (αβ)11 = (αβ)12 = ··· = (αβ)34 = 0.
2. (a) H1′: At least one of the αi is not equal to zero.
   (b) H1″: At least one of the βj is not equal to zero.
   (c) H1‴: At least one of the (αβ)ij is not equal to zero.
The sum-of-squares formulas are used as described in Theorem 14.1. The analysis of variance is shown in Table 14.4.

Table 14.4: Analysis of Variance for the Data of Table 14.3

Source of          Sum of     Degrees of    Mean      Computed
Variation          Squares    Freedom       Square    f
Missile system      14.52      2             7.26      5.84
Propellant type     40.08      3            13.36     10.75
Interaction         22.16      6             3.69      2.97
Error               14.91     12             1.24
Total               91.68     23
The reader is directed to a SAS GLM Procedure (General Linear Models) for analysis of the burning rate data in Figure 14.2. Note how the "model" (11 degrees of freedom) is initially tested and the system, type, and system by type interaction are tested separately. The F-test on the model (P = 0.0030) is testing the accumulation of the two main effects and the interaction.

(a) Reject H0′ and conclude that different missile systems result in different mean propellant burning rates. The P-value is approximately 0.0169.
(b) Reject H0″ and conclude that the mean propellant burning rates are not the same for the four propellant types. The P-value is approximately 0.0010.
(c) Interaction is barely insignificant at the 0.05 level, but the P-value of approximately 0.0512 would indicate that interaction must be taken seriously.
At this point we should draw some type of interpretation of the interaction. It should be emphasized that statistical significance of a main effect merely implies that marginal means are significantly different. However, consider the two-way table of averages in Table 14.5.
Table 14.5: Interpretation of Interaction

           b1       b2       b3       b4      Average
a1        33.35    31.45    28.25    28.95    30.50
a2        32.60    30.00    28.40    27.70    29.68
a3        28.85    28.10    28.50    28.95    28.60
Average   31.60    29.85    28.38    28.53
The GLM Procedure
Dependent Variable: rate
                               Sum of
Source            DF          Squares      Mean Square    F Value   Pr > F
Model             11      76.76833333       6.97893939       5.62   0.0030
Error             12      14.91000000       1.24250000
Corrected Total   23      91.67833333

R-Square    Coeff Var    Root MSE    rate Mean
0.837366     3.766854    1.114675     29.59167

Source          DF    Type III SS     Mean Square    F Value   Pr > F
system           2    14.52333333      7.26166667       5.84   0.0169
type             3    40.08166667     13.36055556      10.75   0.0010
system*type      6    22.16333333      3.69388889       2.97   0.0512

Figure 14.2: SAS printout of the analysis of the propellant rate data of Table 14.3.

It is apparent that more important information exists in the body of the table—trends that are inconsistent with the trend depicted by marginal averages. Table 14.5 certainly suggests that the effect of propellant type depends on the system
being used. For example, for system 3 the propellant-type effect does not appear to be important, although it does have a large effect if either system 1 or system 2 is used. This explains the “significant” interaction between these two factors. More will be revealed subsequently concerning this interaction.
Example 14.2: Referring to Example 14.1, choose two orthogonal contrasts to partition the sum of squares for the missile systems into single-degree-of-freedom components to be used in comparing systems 1 and 2 versus 3, and system 1 versus system 2.

Solution: The contrast for comparing systems 1 and 2 with 3 is

w1 = μ1. + μ2. − 2μ3..

A second contrast, orthogonal to w1, for comparing system 1 with system 2, is given by w2 = μ1. − μ2.. The single-degree-of-freedom sums of squares are

SSw1 = [244.0 + 237.4 − (2)(228.8)]² / {(8)[(1)² + (1)² + (−2)²]} = 11.80

and

SSw2 = (244.0 − 237.4)² / {(8)[(1)² + (−1)²]} = 2.72.

Notice that SSw1 + SSw2 = SSA, as expected. The computed f-values corresponding to w1 and w2 are, respectively,

f1 = 11.80/1.24 = 9.5  and  f2 = 2.72/1.24 = 2.2.

Compared to the critical value f0.05(1, 12) = 4.75, we find f1 to be significant. In fact, the P-value is less than 0.01. Thus, the first contrast indicates that the hypothesis

H0: (1/2)(μ1. + μ2.) = μ3.
is rejected. Since f2 < 4.75, the mean burning rates of the first and second systems are not significantly different.

Impact of Significant Interaction in Example 14.1

If the hypothesis of no interaction in Example 14.1 is true, we could make the general comparisons of Example 14.2 regarding our missile systems rather than separate comparisons for each propellant. Similarly, we might make general comparisons among the propellants rather than separate comparisons for each missile system. For example, we could compare propellants 1 and 2 with 3 and 4 and also propellant 1 versus propellant 2. The resulting f-ratios, each with 1 and 12 degrees of freedom, turn out to be 24.81 and 7.39, respectively, and both are quite significant at the 0.05 level.

From propellant averages there appears to be evidence that propellant 1 gives the highest mean burning rate. A prudent experimenter might be somewhat cautious in drawing overall conclusions in a problem such as this one, where the f-ratio for interaction is barely below the 0.05 critical value. For example, the overall evidence, 31.60 versus 29.85 on the average for the two propellants, certainly indicates that propellant 1 is superior, in terms of a higher burning rate, to propellant 2. However, if we restrict ourselves to system 3, where we have an average of 28.85 for propellant 1 as opposed to 28.10 for propellant 2, there appears to be little or no difference between these two propellants. In fact, there appears to be a stabilization of burning rates for the different propellants if we operate with system 3. There is certainly overall evidence which indicates that system 1 gives a higher burning rate than system 3, but if we restrict ourselves to propellant 4, this conclusion does not appear to hold.

The analyst can conduct a simple t-test using average burning rates for system 3 in order to display conclusive evidence that interaction is producing considerable difficulty in allowing broad conclusions on main effects. Consider a comparison of propellant 1 against propellant 2 only using system 3. Borrowing an estimate of σ² from the overall analysis, that is, using s² = 1.24 with 12 degrees of freedom, we have

|t| = 0.75/√(2s²/n) = 0.75/√(2(1.24)/2) = 0.67,

which is not even close to being significant. This illustration suggests that one must be cautious about strict interpretation of main effects in the presence of interaction.
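Both the contrast sums of squares of Example 14.2 and the t-statistic above are quick to verify numerically. Here is a short NumPy check (ours), using the system totals from Example 14.2 and the error mean square s² = 1.24 from Table 14.4.

import numpy as np

# System totals from Example 14.2; each total pools bn = 8 observations.
totals = np.array([244.0, 237.4, 228.8])
s_sq = 1.24  # error mean square from Table 14.4

for c in (np.array([1.0, 1.0, -2.0]), np.array([1.0, -1.0, 0.0])):
    ss_w = (c @ totals) ** 2 / (8 * (c ** 2).sum())
    print(f"contrast {c}: SS = {ss_w:.2f}, f = {ss_w / s_sq:.2f}")
# Gives SS = 11.80 and SS = 2.72, with f-ratios of about 9.5 and 2.2, as in the example.

# t-test of propellant 1 vs 2 within system 3: cell means 28.85 and 28.10,
# n = 2 observations behind each cell mean.
t = (28.85 - 28.10) / np.sqrt(2 * s_sq / 2)
print(f"|t| = {abs(t):.2f}")  # about 0.67, nowhere near significant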
Figure 14.4 shows the plot of residuals against fitted values for the same data. There is no apparent sign of difficulty with the homogeneous variance assumption.

Figure 14.3: Plot of cell means for data of Example 14.1. Numbers represent missile systems.

Figure 14.4: Residual plot of data of Example 14.1.

Example 14.3: An electrical engineer is investigating a plasma etching process used in semiconductor manufacturing. It is of interest to study the effects of two factors, the C2F6 gas flow rate (A) and the power applied to the cathode (B). The response is the etch rate. Each factor is run at 3 levels, and 2 experimental runs on etch rate are made for each of the 9 combinations. The setup is that of a completely randomized design. The data are given in Table 14.6. The etch rate is in Å/min. The levels of the factors are in ascending order, with level 1 being low level and level 3 being the highest.

Table 14.6: Data for Example 14.3

                          Power Supplied
C2F6 Flow Rate        1          2          3
      1           288  360   488  465   670  720
      2           385  411   482  521   692  724
      3           488  462   595  612   761  801

(a) Show an analysis of variance table and draw conclusions, beginning with the test on interaction.
(b) Do tests on main effects and draw conclusions.

Solution: A SAS output is given in Figure 14.5. From the output we learn the following.

The GLM Procedure
Dependent Variable: etchrate
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              8     379508.7778    47438.5972      61.00   <.0001
Error              9       6999.5000      777.7222
Corrected Total   17     386508.2778

R-Square    Coeff Var    Root MSE    etchrate Mean
0.981890     5.057714    27.88767         551.3889

Source        DF    Type III SS    Mean Square   F Value   Pr > F
c2f6           2     46343.1111     23171.5556     29.79   0.0001
power          2    330003.4444    165001.7222    212.16   <.0001
c2f6*power     4      3162.2222       790.5556      1.02   0.4485

Figure 14.5: SAS printout for Example 14.3.

(a) The P-value for the test of interaction is 0.4485. We can conclude that there is no significant interaction.
(b) There is a significant difference in mean etch rate for the 3 levels of C2F6 flow rate. Duncan's test shows that the mean etch rate for level 3 is significantly higher than that for level 2 and the rate for level 2 is significantly higher than that for level 1. See Figure 14.6(a). There is also a significant difference in mean etch rate based on the level of power to the cathode. Duncan's test revealed that the etch rate for level 3 is significantly higher than that for level 2 and the rate for level 2 is significantly higher than that for level 1. See Figure 14.6(b).

(a) Duncan Grouping    Mean     N   c2f6      (b) Duncan Grouping    Mean     N   power
         A           619.83     6      3               A           728.00     6      3
         B           535.83     6      2               B           527.17     6      2
         C           498.50     6      1               C           399.00     6      1

Figure 14.6: SAS output for Example 14.3. (a) Duncan's test on gas flow rate; (b) Duncan's test on power.
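Analyses of this kind are routine in other software as well. Below is a minimal Python sketch using pandas and statsmodels (the library choice and the column names are our assumptions, not from the text). Duncan's multiple-range test is not available in statsmodels, so Tukey's HSD is shown as a more conservative stand-in for the pairwise comparisons.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Etch-rate data of Table 14.6, two runs per cell.
data = pd.DataFrame({
    "c2f6":  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
    "power": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "etch":  [288, 360, 488, 465, 670, 720,
              385, 411, 482, 521, 692, 724,
              488, 462, 595, 612, 761, 801],
})

# Both factors are categorical; the crossed term carries the interaction test.
model = ols("etch ~ C(c2f6) * C(power)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))   # design is balanced, so Type II = Type III

# Pairwise comparisons of the flow-rate means (Tukey's HSD in place of Duncan).
print(pairwise_tukeyhsd(data["etch"], data["c2f6"]))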
Exercises

14.1 An experiment was conducted to study the effects of temperature and type of oven on the life of a particular component. Four types of ovens and 3 temperature levels were used in the experiment. Twenty-four pieces were assigned randomly, two to each combination of treatments, and the following results recorded.

Temperature (°F)     O1        O2        O3        O4
      500         227  221  214  259  225  236  260  229
      550         187  208  181  179  232  198  246  273
      600         174  202  198  194  178  213  206  219

Use a 0.05 level of significance to test the hypothesis that
(a) different temperatures have no effect on the life of the component;
(b) different ovens have no effect on the life of the component;
(c) the type of oven and temperature do not interact.

14.2 To ascertain the stability of vitamin C in reconstituted frozen orange juice concentrate stored in a refrigerator for a period of up to one week, the study Vitamin C Retention in Reconstituted Frozen Orange Juice was conducted by the Department of Human Nutrition and Foods at Virginia Tech. Three types of frozen orange juice concentrate were tested using 3 different time periods. The time periods refer to the number of days from when the orange juice was blended until it was tested. The results, in milligrams of ascorbic acid per liter, were recorded.

                          Time (days)
Brand              0            3            7
Richfood      52.6  54.2   49.4  49.2   42.7  48.8
              49.8  46.5   42.8  53.2   40.4  47.6
Sealed-Sweet  56.0  48.0   48.8  44.0   49.2  44.0
              49.6  48.4   44.0  42.4   42.0  43.2
Minute Maid   52.5  52.0   48.0  47.0   48.5  43.3
              51.8  53.6   48.2  49.6   45.2  47.6

Using a 0.05 level of significance, test the hypothesis that
(a) there is no difference in ascorbic acid contents among the different brands of orange juice concentrate;
(b) there is no difference in ascorbic acid contents for the different time periods;
(c) the brands of orange juice concentrate and the number of days from the time the juice was blended until it was tested do not interact.

14.3 Three strains of rats were studied under 2 environmental conditions for their performance in a maze test. The error scores for the 48 rats were recorded.

                        Environment
Strain         Free                  Restricted
Bright    28  12  25  10         72  32  35 126
          22  23  36  86         60  89  38 153
Mixed     33  83  41  76        136 120  25  31
          36  14  22  58         48  93  83 110
Dull     101  94 122  83         64 128  99 118
          33  56  35  23         91  19  87 140

Use a 0.01 level of significance to test the hypothesis that
(a) there is no difference in error scores for different environments;
(b) there is no difference in error scores for different strains;
(c) the environments and strains of rats do not interact.
14.4 Corrosion fatigue in metals has been defined as the simultaneous action of cyclic stress and chemical attack on a metal structure. A widely used technique for minimizing corrosion fatigue damage in aluminum involves the application of a protective coating. A study conducted by the Department of Mechanical Engineering at Virginia Tech used 3 different levels of humidity,

Low: 20-25% relative humidity
Medium: 55-60% relative humidity
High: 86-91% relative humidity

and 3 types of surface coatings,

Uncoated: no coating
Anodized: sulfuric acid anodic oxide coating
Conversion: chromate chemical conversion coating

The corrosion fatigue data, expressed in thousands of cycles to failure, were recorded as follows:

                        Relative Humidity
Coating         Low           Medium          High
Uncoated     361   469      314   522     1344  1216
             466   937      244   739     1027  1097
            1069  1357      261   134     1011  1011
Anodized     114  1032      322   471       78   466
            1236    92      306   130      387   107
             533   211       68   398      130   327
Conversion   130  1482      252   105      586   524
             841   529      847   874      402   751
            1595   754      755   573      846   529

(a) Perform an analysis of variance with α = 0.05 to test for significant main and interaction effects.
(b) Use Duncan's multiple-range test at the 0.05 level of significance to determine which humidity levels result in different corrosion fatigue damage.

14.5 To determine which muscles need to be subjected to a conditioning program in order to improve one's performance on the flat serve used in tennis, a study was conducted by the Department of Health, Physical Education and Recreation at Virginia Tech. Five different muscles,

1: anterior deltoid
2: pectoral major
3: posterior deltoid
4: middle deltoid
5: triceps

were tested on each of 3 subjects, and the experiment was carried out 3 times for each treatment combination. The electromyographic data, recorded during the serve, are presented here.

                       Muscle
Subject    1     2     3     4     5
   1      32    41    26    10    19
          59    43    29    10    20
          38    42    23    14    23
   2      63     5    58    45    43
          60   1.5    61    61    61
          50     2    66    71    42
   3      43    10    64    63    61
          54     9    78    46    85
          47     7    78    55    95

Use a 0.01 level of significance to test the hypothesis that
(a) different subjects have equal electromyographic measurements;
(b) different muscles have no effect on electromyographic measurements;
(c) subjects and types of muscle do not interact.

14.6 An experiment was conducted to determine whether additives increase the adhesiveness of rubber products. Sixteen products were made with the new additive and another 16 without the new additive. The observed adhesiveness was as recorded below.

                          Temperature (°C)
                     50     60     70     80
Without Additive    2.3    3.4    3.8    3.9
                    2.9    3.7    3.9    3.2
                    3.1    3.6    4.1    3.0
                    3.2    3.2    3.8    2.7
With Additive       4.3    3.8    3.9    3.5
                    3.9    3.8    4.0    3.6
                    3.9    3.9    3.7    3.8
                    4.2    3.5    3.6    3.9

Perform an analysis of variance to test for main and interaction effects.
14.7 The extraction rate of a certain polymer is known to depend on the reaction temperature and the amount of catalyst used. An experiment was conducted at four levels of temperature and five levels of the catalyst, and the extraction rate was recorded in the following table.

                     Amount of Catalyst
          0.5%      0.6%      0.7%      0.8%      0.9%
50°C     38  41    45  47    57  59    59  61    57  58
60°C     44  43    56  57    70  69    73  72    61  58
70°C     44  47    60  56    67  70    61  73    59  61
80°C     49  47    65  55    69  53    58  62    70  62

Perform an analysis of variance. Test for significant main and interaction effects.

14.8 In Myers, Montgomery, and Anderson-Cook (2009), a scenario is discussed involving an auto bumper plating process. The response is the thickness of the material. Factors that may impact the thickness include amount of nickel (A) and pH (B). A two-factor experiment is designed. The plan is a completely randomized design in which the individual bumpers are assigned randomly to the factor combinations. Three levels of pH and two levels of nickel content are involved in the experiment. The thickness data, in cm × 10⁻³, are as follows:

Nickel Content                pH
   (grams)           5        5.5         6
      18           250        172        221
                   211        188        150
                   195        165        170
      10           115        142         69
                   165        108        101
                   112         88         72

(a) Display the analysis-of-variance table with tests for both main effects and interaction. Show P-values.
(b) Give engineering conclusions. What have you learned from the analysis of the data?
(c) Show a plot that depicts either a presence or an absence of interaction.

14.9 An engineer is interested in the effects of cutting speed and tool geometry on the life in hours of a machine tool. Two cutting speeds and two different geometries are used. Three experimental tests are accomplished at each of the four combinations. The data are as follows.

                       Cutting Speed
Tool Geometry       Low           High
      1         22  28  20    34  37  29
      2         18  15  16    11  10  10

(a) Show an analysis-of-variance table with tests on interaction and main effects.
(b) Comment on the effect that interaction has on the test on cutting speed.
(c) Do secondary tests that will allow the engineer to learn the true impact of cutting speed.
(d) Show a plot that graphically displays the interaction effect.

14.10 Two factors in a manufacturing process for an integrated circuit are studied in a two-factor experiment. The purpose of the experiment is to learn their effect on the resistivity of the wafer. The factors are implant dose (2 levels) and furnace position (3 levels). Experimentation is costly, so only one experimental run is made at each combination. The data are as follows.

            Position
Dose     1      2      3
  1    15.5   14.8   21.3
  2    27.2   24.9   26.1

It is to be assumed that no interaction exists between these two factors.
(a) Write the model and explain terms.
(b) Show the analysis-of-variance table.
(c) Explain the 2 "error" degrees of freedom.
(d) Use Tukey's test to do multiple-comparison tests on furnace position. Explain what the results show.

14.11 A study was done to determine the impact of two factors, method of analysis and the laboratory doing the analysis, on the level of sulfur content in coal. Twenty-eight coal specimens were randomly assigned to 14 factor combinations, the structure of the experimental units represented by combinations of seven laboratories and two methods of analysis with two specimens per factor combination. The data, expressed in percent of sulfur, are as follows:

                        Method
Laboratory        1                2
    1       0.109  0.105    0.105  0.108
    2       0.129  0.122    0.127  0.124
    3       0.115  0.112    0.109  0.111
    4       0.108  0.108    0.117  0.118
    5       0.097  0.096    0.110  0.097
    6       0.114  0.119    0.116  0.122
    7       0.155  0.145    0.164  0.160

(The data are taken from G. Taguchi, "Signal to Noise Ratio and Its Applications to Testing Material," Reports of Statistical Application Research, Union of Japanese Scientists and Engineers, Vol. 18, No. 4, 1971.)
(a) Do an analysis of variance and show results in an analysis-of-variance table.
(b) Is interaction significant? If so, discuss what it means to the scientist. Use a P-value in your conclusion.
(c) Are the individual main effects, laboratory, and method of analysis statistically significant? Discuss what is learned and let your answer be couched in the context of any significant interaction.
(d) Do an interaction plot that illustrates the effect of interaction.
(e) Do a test comparing methods 1 and 2 at laboratory 1 and do the same test at laboratory 7. Comment on what these results illustrate.
14.12 In an experiment conducted in the Civil Engineering Department at Virginia Tech, growth of a certain type of algae in water was observed as a function of time and the dosage of copper added to the water. The data are as follows. Response is in units of algae.

                 Time in Days
Copper      5       12       18
  1       0.30     0.37     0.25
          0.34     0.36     0.23
          0.32     0.35     0.24
  2       0.24     0.30     0.27
          0.23     0.32     0.25
          0.22     0.31     0.25
  3       0.20     0.30     0.27
          0.28     0.31     0.29
          0.24     0.30     0.25

(a) Do an analysis of variance and show the analysis-of-variance table.
(b) Comment concerning whether the data are sufficient to show a time effect on algae concentration.
(c) Do the same for copper content. Does the level of copper impact algae concentration?
(d) Comment on the results of the test for interaction. How is the effect of copper content influenced by time?

14.13 In Myers, Classical and Modern Regression with Applications (Duxbury Classic Series, 2nd edition, 1990), an experiment is described in which the Environmental Protection Agency seeks to determine the effect of two water treatment methods on magnesium uptake. Magnesium levels in grams per cubic centimeter (cc) are measured, and two different time levels are incorporated into the experiment. The data are as follows:

                      Time (hr)
Treatment         1                 2
    1       2.19  2.15  2.16   2.03  2.01  2.04
    2       2.01  2.03  2.04   1.88  1.86  1.91

(a) Do an interaction plot. What is your impression?
(b) Do an analysis of variance and show tests for the main effects and interaction.
(c) Give scientific findings regarding how time and treatment influence magnesium uptake.
(d) Fit the appropriate regression model with treatment as a categorical variable. Include interaction in the model.
(e) Is interaction significant in the regression model?

14.14 Consider the data set in Exercise 14.12 and answer the following questions.
(a) Both factors, copper and time, are quantitative in nature. As a result, a regression model may be of interest. Describe what might be an appropriate model using x1 = copper content and x2 = time. Fit the model to the data, showing regression coefficients and a t-test on each.
(b) Fit the model

Y = β0 + β1x1 + β2x2 + β12x1x2 + β11x1² + β22x2² + ε,

and compare it to the one you chose in (a). Which is more appropriate? Use R²adj as a criterion.

14.15 The purpose of the study The Incorporation of a Chelating Agent into a Flame Retardant Finish of a Cotton Flannelette and the Evaluation of Selected Fabric Properties, conducted at Virginia Tech, was to evaluate the use of a chelating agent as part of the flame retardant finish of cotton flannelette by determining its effect upon flammability after the fabric is laundered under specific conditions. There were two treatments at two levels. Two baths were prepared, one with carboxymethyl cellulose (bath I) and one without (bath II). Half of the fabric was laundered 5 times and half was laundered 10 times. There were 12 pieces of fabric in each bath/number of launderings combination. After the washings, the lengths of fabric that burned and the burn times were measured. Burn times (in seconds) were recorded as follows:

                         Launderings
Treatment          5                     10
Bath I     13.7  23.0  25.5      15.8  14.0  29.4
           14.0  12.3  27.2      16.8  14.9  17.1
           10.8  13.5  14.2      27.4  15.7  14.8
            9.7  12.3  12.9      13.0  25.5  11.5
Bath II     6.2   5.4   5.0       4.4   5.0   3.3
           16.0   2.5   1.6       3.9   2.5   7.1
           18.2   8.8  14.5      14.7  17.1  13.9
           10.6   5.8   7.3      17.7  18.3   9.9

(a) Perform an analysis of variance. Is there a significant interaction term?
(b) Are there main effect differences? Discuss.

14.4 Three-Factor Experiments

Model for the Three-Factor Experiment

In this section, we consider an experiment with three factors, A, B, and C, at a, b, and c levels, respectively, in a completely randomized experimental design. Assume again that we have n observations for each of the abc treatment combinations. We shall proceed to outline significance tests for the three main effects and interactions involved. It is hoped that the reader can then use the description given here to generalize the analysis to k > 3 factors.
The model for the three-factor experiment is
yijkl =μ+αi +βj +γk +(αβ)ij +(αγ)ik +(βγ)jk +(αβγ)ijk +εijkl,
i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., c; and l = 1, 2, ..., n, where αi, βj, and γk are the main effects and (αβ)ij, (αγ)ik, and (βγ)jk are the two-factor interaction effects that have the same interpretation as in the two-factor experiment.
The term (αβγ)ijk is called the three-factor interaction effect, a term that represents a nonadditivity of the (αβ)ij over the different levels of the factor C. As before, the sum of all main effects is zero and the sum over any subscript of the two- and three-factor interaction effects is zero. In many experimental situations, these higher-order interactions are insignificant and their mean squares reflect only random variation, but we shall outline the analysis in its most general form.
Again, in order that valid significance tests can be made, we must assume that the errors are values of independent and normally distributed random variables, each with mean 0 and common variance σ2.
The general philosophy concerning the analysis is the same as that discussed for the one- and two-factor experiments. The sum of squares is partitioned into eight terms, each representing a source of variation from which we obtain independent estimates of σ2 when all the main effects and interaction effects are zero. If the effects of any given factor or interaction are not all zero, then the mean square will estimate the error variance plus a component due to the systematic effect in question.
Sum of Squares for a Three-Factor Experiment

SSA = bcn Σi (ȳi... − ȳ....)²
SSB = acn Σj (ȳ.j.. − ȳ....)²
SSC = abn Σk (ȳ..k. − ȳ....)²
SS(AB) = cn Σi Σj (ȳij.. − ȳi... − ȳ.j.. + ȳ....)²
SS(AC) = bn Σi Σk (ȳi.k. − ȳi... − ȳ..k. + ȳ....)²
SS(BC) = an Σj Σk (ȳ.jk. − ȳ.j.. − ȳ..k. + ȳ....)²
SS(ABC) = n Σi Σj Σk (ȳijk. − ȳij.. − ȳi.k. − ȳ.jk. + ȳi... + ȳ.j.. + ȳ..k. − ȳ....)²
SST = Σi Σj Σk Σl (yijkl − ȳ....)²
SSE = Σi Σj Σk Σl (yijkl − ȳijk.)²
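These formulas translate directly into array operations on a balanced data layout. The Python sketch below (the data, array shape, and variable names are our assumptions, not from the text) computes every sum of squares by broadcasting the marginal and cell averages, and verifies that the eight components partition the total sum of squares.

import numpy as np

# Hypothetical balanced data: a x b x c factor levels with n replicates.
rng = np.random.default_rng(7)
y = rng.normal(10.0, 1.0, size=(3, 3, 2, 2))   # shape (a, b, c, n)
a, b, c, n = y.shape

gm  = y.mean()                  # y-bar ....
A   = y.mean(axis=(1, 2, 3))    # y-bar i...
B   = y.mean(axis=(0, 2, 3))    # y-bar .j..
C   = y.mean(axis=(0, 1, 3))    # y-bar ..k.
AB  = y.mean(axis=(2, 3))       # y-bar ij..
AC  = y.mean(axis=(1, 3))       # y-bar i.k.
BC  = y.mean(axis=(0, 3))       # y-bar .jk.
ABC = y.mean(axis=3)            # y-bar ijk.

SSA = b * c * n * ((A - gm) ** 2).sum()
SSB = a * c * n * ((B - gm) ** 2).sum()
SSC = a * b * n * ((C - gm) ** 2).sum()
SSAB = c * n * ((AB - A[:, None] - B[None, :] + gm) ** 2).sum()
SSAC = b * n * ((AC - A[:, None] - C[None, :] + gm) ** 2).sum()
SSBC = a * n * ((BC - B[:, None] - C[None, :] + gm) ** 2).sum()
SSABC = n * ((ABC - AB[:, :, None] - AC[:, None, :] - BC[None, :, :]
              + A[:, None, None] + B[None, :, None] + C[None, None, :]
              - gm) ** 2).sum()
SST = ((y - gm) ** 2).sum()
SSE = ((y - ABC[..., None]) ** 2).sum()

# The components partition the total sum of squares exactly.
assert np.isclose(SST, SSA + SSB + SSC + SSAB + SSAC + SSBC + SSABC + SSE)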

Although we emphasize interpretation of annotated computer printout in this section rather than being concerned with laborious computation of sums of squares, we do offer the sums of squares shown above for the three main effects and interactions. Notice the obvious extension from the two- to three-factor problem.
The averages in the formulas are defined as follows:
ȳ.... = average of all abcn observations,
ȳi... = average of the observations for the ith level of factor A,
ȳ.j.. = average of the observations for the jth level of factor B,
ȳ..k. = average of the observations for the kth level of factor C,
ȳij.. = average of the observations for the ith level of A and the jth level of B,
ȳi.k. = average of the observations for the ith level of A and the kth level of C,
ȳ.jk. = average of the observations for the jth level of B and the kth level of C,
ȳijk. = average of the observations for the (ijk)th treatment combination.
The computations in an analysis-of-variance table for a three-factor problem with n replicated runs at each factor combination are summarized in Table 14.7.
Table 14.7: ANOVA for the Three-Factor Experiment with n Replications

Source of                  Sum of     Degrees of              Mean     Computed
Variation                  Squares    Freedom                 Square   f
Main effect:
  A                        SSA        a − 1                   s1²      f1 = s1²/s²
  B                        SSB        b − 1                   s2²      f2 = s2²/s²
  C                        SSC        c − 1                   s3²      f3 = s3²/s²
Two-factor interaction:
  AB                       SS(AB)     (a − 1)(b − 1)          s4²      f4 = s4²/s²
  AC                       SS(AC)     (a − 1)(c − 1)          s5²      f5 = s5²/s²
  BC                       SS(BC)     (b − 1)(c − 1)          s6²      f6 = s6²/s²
Three-factor interaction:
  ABC                      SS(ABC)    (a − 1)(b − 1)(c − 1)   s7²      f7 = s7²/s²
Error                      SSE        abc(n − 1)              s²
Total                      SST        abcn − 1
For the three-factor experiment with a single experimental run per combination, we may use the analysis of Table 14.7 by setting n = 1 and using the ABC interaction sum of squares for SSE. In this case, we are assuming that the (αβγ)ijk interaction effects are all equal to zero, so that

E[SS(ABC)/((a − 1)(b − 1)(c − 1))] = σ² + [n/((a − 1)(b − 1)(c − 1))] Σi Σj Σk (αβγ)²ijk = σ².

That is, SS(ABC) represents variation due only to experimental error. Its mean square thereby provides an unbiased estimate of the error variance. With n = 1 and SSE = SS(ABC), the error sum of squares is found by subtracting the sums of squares of the main effects and two-factor interactions from the total sum of squares.
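As a small arithmetic sketch of this subtraction (the sums of squares below are illustrative placeholders, not from the text):

# With n = 1, SSE = SS(ABC) is obtained by subtraction.
a, b, c = 3, 3, 2
SST = 63.2                                      # illustrative totals
SSA, SSB, SSC = 14.0, 10.2, 1.2
SSAB, SSAC, SSBC = 4.8, 2.9, 3.6
SSE = SST - (SSA + SSB + SSC + SSAB + SSAC + SSBC)   # plays the role of SS(ABC)
df_error = (a - 1) * (b - 1) * (c - 1)               # error degrees of freedom
s2 = SSE / df_error                                  # estimate of sigma^2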
Example 14.4: In the production of a particular material, three variables are of interest: A, the operator effect (three operators); B, the catalyst used in the experiment (three catalysts); and C, the washing time of the product following the cooling process (15 minutes and 20 minutes). Three runs were made at each combination of factors. It was felt that all interactions among the factors should be studied. The coded yields are in Table 14.8. Perform an analysis of variance to test for significant effects.
Table 14.8: Data for Example 14.4

                        Washing Time, C
               15 Minutes            20 Minutes
               Catalyst, B           Catalyst, B
Operator, A    1     2     3         1     2     3
     1       10.7  10.3  11.2      10.9  10.5  12.2
             10.8  10.2  11.6      12.1  11.1  11.7
             11.3  10.5  12.0      11.5  10.3  11.0
     2       11.4  10.2  10.7       9.8  12.6  10.8
             11.8  10.9  10.5      11.3   7.5  10.2
             11.5  10.5  10.2      10.9   9.9  11.5
     3       13.6  12.0  11.1      10.7  10.2  11.9
             14.1  11.6  11.0      11.7  11.5  11.6
             14.5  11.5  11.5      12.7  10.9  12.2
Solution: Table 14.9 shows an analysis of variance of the data given above. None of the interactions show a significant effect at the α = 0.05 level. However, the P-value for BC is 0.0610; thus, it should not be ignored. The operator and catalyst effects are significant, while the effect of washing time is not significant.
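An analysis like Table 14.9 can be reproduced from Table 14.8 with standard software. The sketch below uses Python with pandas and statsmodels (the library choice and the column names are our assumptions; the data are those of Table 14.8).

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Table 14.8 yields, keyed by (operator, catalyst, washing time in minutes).
runs = {
    (1, 1, 15): [10.7, 10.8, 11.3], (1, 2, 15): [10.3, 10.2, 10.5],
    (1, 3, 15): [11.2, 11.6, 12.0], (1, 1, 20): [10.9, 12.1, 11.5],
    (1, 2, 20): [10.5, 11.1, 10.3], (1, 3, 20): [12.2, 11.7, 11.0],
    (2, 1, 15): [11.4, 11.8, 11.5], (2, 2, 15): [10.2, 10.9, 10.5],
    (2, 3, 15): [10.7, 10.5, 10.2], (2, 1, 20): [9.8, 11.3, 10.9],
    (2, 2, 20): [12.6, 7.5, 9.9],   (2, 3, 20): [10.8, 10.2, 11.5],
    (3, 1, 15): [13.6, 14.1, 14.5], (3, 2, 15): [12.0, 11.6, 11.5],
    (3, 3, 15): [11.1, 11.0, 11.5], (3, 1, 20): [10.7, 11.7, 12.7],
    (3, 2, 20): [10.2, 11.5, 10.9], (3, 3, 20): [11.9, 11.6, 12.2],
}
rows = [(op, cat, wt, y) for (op, cat, wt), ys in runs.items() for y in ys]
df = pd.DataFrame(rows, columns=["operator", "catalyst", "washtime", "yield_"])

# Full three-factor model with all interactions; balanced data, so typ=2 suffices.
model = ols("yield_ ~ C(operator) * C(catalyst) * C(washtime)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # compare with Table 14.9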
Impact of Interaction BC
More should be discussed regarding Example 14.4, particularly about dealing with the effect that the interaction between catalyst and washing time is having on the test on the washing time main effect (factor C). Recall our discussion in Section 14.2. Illustrations were given of how the presence of interaction could change the interpretation that we make regarding main effects. In Example 14.4, the BC interaction is significant at approximately the 0.06 level. Suppose, however, that we observe a two-way table of means as in Table 14.10.
It is clear why washing time was found not to be significant. A non-thorough analyst may get the impression that washing time can be eliminated from any future study in which yield is being measured. However, it is obvious how the effect of washing time changes from a negative effect for the first catalyst to what appears to be a positive effect for the third catalyst.

Table 14.9: ANOVA for a Three-Factor Experiment in a Completely Randomized Design

Source   df   Sum of Squares   Mean Square   F-Value   P-Value
A         2       13.98           6.99        11.64     0.0001
B         2       10.18           5.09         8.48     0.0010
AB        4        4.77           1.19         1.99     0.1172
C         1        1.19           1.19         1.97     0.1686
AC        2        2.91           1.46         2.43     0.1027
BC        2        3.63           1.82         3.03     0.0610
ABC       4        4.91           1.23         2.04     0.1089
Error    36       21.61           0.60
Total    53       63.19
Table 14.10: Two-Way Table of Means for Example 14.4

                 Washing Time, C
Catalyst, B    15 min     20 min
     1          12.19      11.29
     2          10.86      10.50
     3          11.09      11.46
Means           11.38      11.08
If we merely focus on the data for catalyst 1, a simple comparison between the means at the two washing times produces the t-statistic

t = (12.19 − 11.29)/√(0.6(2/9)) = 2.5,

which is significant at a level less than 0.02. Thus, an important negative effect of washing time for catalyst 1 might very well be ignored if the analyst makes the incorrect broad interpretation of the insignificant F-ratio for washing time.
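The same comparison is easy to script. A minimal sketch (scipy is assumed to be available; the values come from Table 14.10 and the error mean square of Table 14.9):

from math import sqrt
from scipy import stats

# Catalyst 1 at the two washing times: each mean averages n = 9 observations,
# and s^2 = 0.60 with 36 df is borrowed from the full ANOVA (Table 14.9).
ybar_15, ybar_20, s2, n = 12.19, 11.29, 0.60, 9
t = (ybar_15 - ybar_20) / sqrt(s2 * (2 / n))
p = 2 * stats.t.sf(abs(t), df=36)        # two-sided P-value on 36 error df
print(round(t, 2), round(p, 4))          # about 2.5 and a P-value below 0.02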
Pooling in Multifactor Models
We have described the three-factor model and its analysis in the most general form by including all possible interactions in the model. Of course, there are many situations where it is known a priori that the model should not contain certain interactions. We can then take advantage of this knowledge by combining or pooling the sums of squares corresponding to negligible interactions with the error sum of squares to form a new estimator for σ² with a larger number of degrees of freedom. For example, in a metallurgy experiment designed to study the effect on film thickness of three important processing variables, suppose it is known that factor A, acid concentration, does not interact with factors B and C.

Table 14.11: ANOVA with Factor A Noninteracting

Source of                Sum of      Degrees of        Mean     Computed
Variation                Squares     Freedom           Square   f
Main effect:
  A                      SSA         a − 1             s1²      f1 = s1²/s²
  B                      SSB         b − 1             s2²      f2 = s2²/s²
  C                      SSC         c − 1             s3²      f3 = s3²/s²
Two-factor interaction:
  BC                     SS(BC)      (b − 1)(c − 1)    s4²      f4 = s4²/s²
Error                    SSE         Subtraction       s²
Total                    SST         abcn − 1
The sums of squares SSA, SSB, SSC, and SS(BC) are computed using the methods described earlier in this section. The mean squares for the remaining effects will now all independently estimate the error variance σ². Therefore, we form our new mean square error by pooling SS(AB), SS(AC), SS(ABC), and SSE, along with the corresponding degrees of freedom. The resulting denominator for the significance tests is then the mean square error given by

s² = [SS(AB) + SS(AC) + SS(ABC) + SSE] / [(a − 1)(b − 1) + (a − 1)(c − 1) + (a − 1)(b − 1)(c − 1) + abc(n − 1)].
Computationally, of course, one obtains the pooled sum of squares and the pooled degrees of freedom by subtraction once SST and the sums of squares for the existing effects are computed. The analysis-of-variance table would then take the form of Table 14.11.
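As a quick arithmetic check of the pooling formula, the sketch below uses the sums of squares and degrees of freedom of Example 14.4 (Table 14.9), whose pooled value is derived in the next subsection:

# Pooling negligible interactions (AB, AC, ABC) with error, per Table 14.9.
ss = {"AB": 4.77, "AC": 2.91, "ABC": 4.91, "Error": 21.61}
dof = {"AB": 4, "AC": 2, "ABC": 4, "Error": 36}
s2_pooled = sum(ss.values()) / sum(dof.values())
print(round(s2_pooled, 2))   # 0.74, matching the value used in Table 14.12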
Factorial Experiments in Blocks
In this chapter, we have assumed that the experimental design used is a completely randomized design. By interpreting the levels of factor A in Table 14.11 as different blocks, we then have the analysis-of-variance procedure for a two-factor experiment in a randomized block design. For example, if we interpret the operators in Example 14.4 as blocks and assume no interaction between blocks and the other two factors, the analysis of variance takes the form of Table 14.12 rather than that of Table 14.9. The reader can verify that the mean square error is also
s² = (4.77 + 2.91 + 4.91 + 21.61)/(4 + 2 + 4 + 36) = 0.74,
which demonstrates the pooling of the sums of squares for the nonexisting inter- action effects. Note that factor B, catalyst, has a significant effect on yield.
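In the statsmodels sketch begun after Example 14.4, this randomized block analysis amounts to dropping all interactions with the blocking factor from the model formula. Assuming the data frame df built in that earlier sketch is still in scope:

import statsmodels.api as sm
from statsmodels.formula.api import ols

# Operators as blocks: keep block, catalyst (B), washtime (C), and BC only;
# the sums of squares for the omitted interactions are pooled into error.
block_model = ols("yield_ ~ C(operator) + C(catalyst) * C(washtime)", data=df).fit()
print(sm.stats.anova_lm(block_model, typ=2))   # compare with Table 14.12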

Table 14.12: ANOVA for a Two-Factor Experiment in a Randomized Block Design

Source of      Sum of     Degrees of    Mean     Computed
Variation      Squares    Freedom       Square   f          P-Value
Blocks          13.98          2         6.99
Main effect:
  B             10.18          2         5.09     6.88       0.0024
  C              1.18          1         1.18     1.59       0.2130
Two-factor interaction:
  BC             3.64          2         1.82     2.46       0.0966
Error           34.21         46         0.74
Total           63.19         53
Example 14.5: An experiment was conducted to determine the effects of temperature, pressure, and stirring rate on product filtration rate. This was done in a pilot plant. The experiment was run at two levels of each factor. In addition, it was decided that two batches of raw materials should be used, where batches were treated as blocks. Eight experimental runs were made in random order for each batch of raw materials. It is thought that all two-factor interactions may be of interest. No interactions with batches are assumed to exist. The data appear in Table 14.13. “L” and “H” imply low and high levels, respectively. The filtration rate is in gallons per hour.
(a) Show the complete ANOVA table. Pool all “interactions” with blocks into error.
(b) What interactions appear to be significant?
(c) Create plots to reveal and interpret the significant interactions. Explain what the plot means to the engineer.
Table 14.13: Data for Example 14.5

Batch 1
           Low Stirring Rate          High Stirring Rate
Temp.   Pressure L   Pressure H    Pressure L   Pressure H
  L         43           49            44           47
  H         64           68            97          102

Batch 2
           Low Stirring Rate          High Stirring Rate
Temp.   Pressure L   Pressure H    Pressure L   Pressure H
  L         49           57            51           55
  H         70           76           103          106

Solution: (a) The SAS printout is given in Figure 14.7.
(b) As seen in Figure 14.7, the temperature by stirring rate (strate) interaction appears to be highly significant. The pressure by stirring rate interaction also appears to be significant. Incidentally, if one were to do further pooling by combining the insignificant interactions with error, the conclusions would remain the same and the P-value for the pressure by stirring rate interaction would become stronger, namely 0.0517.
(c) The main effects for both stirring rate and temperature are highly significant, as shown in Figure 14.7. A look at the interaction plot of Figure 14.8(a) shows that the effect of stirring rate is dependent upon the level of temperature. At the low level of temperature the stirring rate effect is negligible, whereas at the high level of temperature stirring rate has a strong positive effect on mean filtration rate. In Figure 14.8(b), the interaction between pressure and stirring rate, though not as pronounced as that of Figure 14.8(a), still shows a slight inconsistency of the stirring rate effect across pressure.
Source                 DF   Type III SS    Mean Square   F Value   Pr > F
batch                   1    175.562500     175.562500    177.14   <.0001
pressure                1     95.062500      95.062500     95.92   <.0001
temp                    1   5292.562500    5292.562500   5340.24   <.0001
pressure*temp           1      0.562500       0.562500      0.57   0.4758
strate                  1   1040.062500    1040.062500   1049.43   <.0001
pressure*strate         1      5.062500       5.062500      5.11   0.0583
temp*strate             1   1072.562500    1072.562500   1082.23   <.0001
pressure*temp*strate    1      1.562500       1.562500      1.58   0.2495
Error                   7      6.937500       0.991071
Corrected Total        15   7689.937500

Figure 14.7: ANOVA for Example 14.5, batch interaction pooled with error.

Figure 14.8: Interaction plots for Example 14.5. (a) Temperature versus stirring rate; (b) pressure versus stirring rate.

Exercises

14.16 Consider an experimental situation involving factors A, B, and C, where we assume a three-way fixed effects model of the form yijkl = μ + αi + βj + γk + (βγ)jk + εijkl. All other interactions are considered to be nonexistent or negligible. The data are presented here.

              B1                                   B2
        C1        C2        C3            C1        C2        C3
A1   4.0  4.9  3.4  4.1  3.9  4.4      4.3  3.4  3.1  3.5  3.1  3.7
A2   3.6  3.9  2.8  3.2  3.1  2.7      3.5  3.0  2.9  3.7  3.2  2.9
A3   4.8  3.7  3.3  3.8  3.6  3.6      4.2  3.8  2.9  3.3  3.5  2.9
A4   3.6  3.9  3.2  2.8  3.2  2.2      3.4  3.5  3.6  3.2  4.3  4.2

(a) Perform a test of significance on the BC interaction at the α = 0.05 level.
(b) Perform tests of significance on the main effects A, B, and C using a pooled mean square error at the α = 0.05 level.

14.17 The following data are measurements from an experiment conducted using three factors A, B, and C, all fixed effects:

           C1                      C2                      C3
      B1    B2    B3          B1    B2    B3          B1    B2    B3
A1  15.0  14.8  15.9        17.0  18.2  14.2        13.7  13.0  12.6
    18.5  13.6  14.8        13.4  16.1  18.6        12.4  12.7  17.3
    22.1  12.2  13.6        16.8  14.2  13.2        13.6  14.2  15.8
A2  11.3  17.2  16.1        15.4  12.9  11.6        15.2  15.9  14.6
    18.9  15.4  14.6        14.3  13.0  10.1        19.2  13.5  11.1
    15.5  14.7  17.3        15.8  15.5  14.3         7.8  11.5  12.2

(a) Perform tests of significance on all interactions at the α = 0.05 level.
(b) Perform tests of significance on the main effects at the α = 0.05 level.
(c) Give an explanation of how a significant interaction has masked the effect of factor C.

14.18 The method of X-ray fluorescence is an important analytical tool for determining the concentration of material in solid missile propellants. In the paper An X-ray Fluorescence Method for Analyzing Polybutadiene Acrylic Acid (PBAA) Propellants (Quarterly Report, RK-TR-62-1, Army Ordnance Missile Command, 1962), it is postulated that the propellant mixing process and analysis time have an influence on the homogeneity of the material and hence on the accuracy of X-ray intensity measurements. An experiment was conducted using 3 factors: A, the mixing conditions (4 levels); B, the analysis time (2 levels); and C, the method of loading propellant into sample holders (hot and room temperature). The following data, which represent the weight percent of ammonium perchlorate in a particular propellant, were recorded.

             Method of Loading, C
            Hot                 Room Temp.
A      B1       B2          B1        B2
1    38.62    38.45       39.82     39.82
     37.20    38.64       39.15     40.26
     38.02    38.75       39.78     39.72
2    37.67    37.81       39.53     39.56
     37.57    37.75       39.76     39.25
     37.85    37.91       39.90     39.04
3    37.51    37.21       39.34     39.74
     37.74    37.42       39.60     39.49
     37.58    37.79       39.62     39.45
4    37.52    37.60       40.09     39.36
     37.15    37.55       39.63     39.38
     37.51    37.91       39.67     39.00

(a) Perform an analysis of variance with α = 0.01 to test for significant main and interaction effects.
(b) Discuss the influence of the three factors on the weight percent of ammonium perchlorate. Let your discussion involve the role of any significant interaction.

14.19 Corrosion fatigue in metals has been defined as the simultaneous action of cyclic stress and chemical attack on a metal structure. In the study Effect of Humidity and Several Surface Coatings on the Fatigue Life of 2024-T351 Aluminum Alloy, conducted by the Department of Mechanical Engineering at Virginia Tech, a technique involving the application of a protective chromate coating was used to minimize corrosion fatigue damage in aluminum. Three factors were used in the investigation, with 5 replicates for each treatment combination: coating, at 2 levels, and humidity and shear stress, both with 3 levels. The fatigue data, recorded in thousands of cycles to failure, are presented here.
                                     Shear Stress (psi)
Coating     Humidity              13,000    17,000    20,000
Uncoated    Low (20-25% RH)         4580      5252       361
                                  10,126       897       466
                                    1341      1465      1069
                                    6414      2694       469
                                    3549      1017       937
            Medium (50-60% RH)      2858       799       314
                                    8829      3471       244
                                  10,914       685       261
                                    4067       810       522
                                    2595      3409       739
            High (86-91% RH)        6489      1862      1344
                                    5248      2710      1027
                                    6816      2632       663
                                    5860      2131      1216
                                    5901      2470      1097
Chromated   Low (20-25% RH)         5395      4035       130
                                    2768      2022       841
                                    1821       914      1595
                                    3604      2036      1482
                                    4106      3524       529
            Medium (50-60% RH)      4833      1847       252
                                    7414      1684       105
                                  10,022      3042       847
                                    7463      4482       874
                                  21,906       996       755
            High (86-91% RH)        3287      1319       586
                                    5200       929       402
                                    5493      1263       846
                                    4145      2236       524
                                    3336      1392       751

(a) Perform an analysis of variance with α = 0.01 to test for significant main and interaction effects.
(b) Make a recommendation for combinations of the three factors that would result in low fatigue damage.

14.20 For a study of the hardness of gold dental fillings, five randomly chosen dentists were assigned combinations of three methods of condensation and two types of gold. The hardness was measured. (See Hoaglin, Mosteller, and Tukey, 1991.)
Let the dentists play the role of blocks. The data are presented here.

                       Type
Dentist   Method   Gold Foil   Goldent
   1         1        792         824
             2        772         772
             3        782         803
   2         1        803         803
             2        752         772
             3        715         707
   3         1        715         724
             2        792         715
             3        762         606
   4         1        673         946
             2        657         743
             3        690         245
   5         1        634         715
             2        649         724
             3        724         627

(a) State the appropriate model with the assumptions.
(b) Is there a significant interaction between method of condensation and type of gold filling material?
(c) Is there one method of condensation that seems to be best? Explain.

14.21 Electronic copiers make copies by gluing black ink on paper, using static electricity. Heating and gluing the ink on the paper comprise the final stage of the copying process. The gluing power during this final process determines the quality of the copy. It is postulated that temperature, surface state of the gluing roller, and hardness of the press roller influence the gluing power of the copier. An experiment is run with treatments consisting of a combination of these three factors at each of three levels. The following data show the gluing power for each treatment combination, as recorded; row blocks correspond to the temperature levels (Low, Medium, High) crossed with the surface states of the gluing roller (Soft, Medium, Hard), and column groups to press-roller hardness (20, 40, 60). Perform an analysis of variance with α = 0.05 to test for significant main and interaction effects.

Hardness 40:
0.54 0.52 0.65 0.56    0.79 0.73 0.79 0.78    0.58 0.68 0.57 0.59
0.31 0.49 0.48 0.66    0.66 0.57 0.72 0.56    0.53 0.45 0.59 0.47
0.54 0.52 0.65 0.56    0.53 0.45 0.59 0.47

Hardness 20 and 60:
0.52 0.44 0.60 0.55    0.57 0.53 0.78 0.68    0.64 0.59 0.49 0.48
0.58 0.64 0.74 0.50    0.67 0.77 0.55 0.65    0.74 0.65 0.57 0.58
0.46 0.40 0.56 0.42    0.58 0.37 0.49 0.49    0.60 0.43 0.64 0.54
0.62 0.61 0.74 0.56    0.53 0.65 0.56 0.66    0.66 0.56 0.71 0.67
0.52 0.44 0.65 0.49    0.57 0.53 0.65 0.52    0.53 0.65 0.49 0.48
0.66 0.56 0.74 0.50    0.43 0.43 0.48 0.31    0.55 0.65 0.47 0.44
0.43 0.27 0.57 0.58

14.22 Consider the data set in Exercise 14.21.
(a) Construct an interaction plot for any two-factor interaction that is significant.
(b) Do a normal probability plot of residuals and comment.

14.23 Consider combinations of three factors in the removal of dirt from standard loads of laundry. The first factor is the brand of the detergent, X, Y, or Z. The second factor is the type of detergent, liquid or powder. The third factor is the temperature of the water, hot or warm. The experiment was replicated three times. Response is percent dirt removal. The data are as follows:

Brand   Type     Temperature
  X     Powder   Hot     82  83  85
                 Warm    85  88  80
        Liquid   Hot     75  75  73
                 Warm    78  75  72
  Y     Powder   Hot     88  86  88
                 Warm    90  92  92
        Liquid   Hot     76  77  76
                 Warm    78  76  70
  Z     Powder   Hot     76  74  78
                 Warm    85  87  88
        Liquid   Hot     60  70  68
                 Warm    55  57  54

(a) Are there significant interaction effects at the α = 0.05 level?
(b) Are there significant differences between the three brands of detergent?
(c) Which combination of factors would you prefer to use?

14.24 A scientist collects experimental data on the radius of a propellant grain, y, as a function of powder temperature, extrusion rate, and die temperature. Results of the three-factor experiment are as follows:

                    Powder Temp
                150              190
             Die Temp         Die Temp
Rate        220     250      220     250
 12          82     124       88     129
 24         114     157      121     164

Resources are not available to make repeated experimental trials at the eight combinations of factors. It is believed that extrusion rate does not interact with die temperature and that the three-factor interaction should be negligible. Thus, these two interactions may be pooled to produce a 2 d.f. "error" term.
(a) Do an analysis of variance that includes the three main effects and two two-factor interactions. Determine what effects influence the radius of the propellant grain.
(b) Construct interaction plots for the powder temperature by die temperature and powder temperature by extrusion rate interactions.
(c) Comment on the consistency in the appearance of the interaction plots and the tests on the two interactions in the ANOVA.

14.25 In the book Design of Experiments for Quality Improvement, published by the Japanese Standards Association (1989), a study is reported on the extraction of polyethylene by using a solvent and how the amount of gel (proportion) is influenced by three factors: the type of solvent, extraction temperature, and extraction time. A factorial experiment was designed, and the following data were collected on proportion of gel.

                              Time
Solvent   Temp.       4            8            16
Ethanol    120    94.0  94.0   93.8  94.2   91.1  90.5
            80    95.3  95.1   94.9  95.3   92.5  92.4
Toluene    120    94.6  94.5   93.6  94.1   91.1  91.0
            80    95.4  95.4   95.6  96.0   92.1  92.1

(a) Do an analysis of variance and determine what factors and interactions influence the proportion of gel.
(b) Construct an interaction plot for any two-factor interaction that is significant. In addition, explain what conclusion can be drawn from the presence of the interaction.
(c) Do a normal probability plot of residuals and comment.
14.5 Factorial Experiments for Random Effects and Mixed Models

In a two-factor experiment with random effects, we have the model

Yijk = μ + Ai + Bj + (AB)ij + εijk,

for i = 1, 2, ..., a; j = 1, 2, ..., b; and k = 1, 2, ..., n, where the Ai, Bj, (AB)ij, and εijk are independent random variables with means 0 and variances σα², σβ², σαβ², and σ², respectively. The sums of squares for random effects experiments are computed in exactly the same way as for fixed effects experiments. We are now interested in testing hypotheses of the form

H0: σα² = 0,    H0: σβ² = 0,    H0: σαβ² = 0,
H1: σα² ≠ 0,    H1: σβ² ≠ 0,    H1: σαβ² ≠ 0,

where the denominator in the f-ratio is not necessarily the mean square error. The appropriate denominator can be determined by examining the expected values of the various mean squares. These are shown in Table 14.14.

Table 14.14: Expected Mean Squares for a Two-Factor Random Effects Experiment

Source of    Degrees of        Mean      Expected
Variation    Freedom           Square    Mean Square
A            a − 1             s1²       σ² + nσαβ² + bnσα²
B            b − 1             s2²       σ² + nσαβ² + anσβ²
AB           (a − 1)(b − 1)    s3²       σ² + nσαβ²
Error        ab(n − 1)         s²        σ²
Total        abn − 1

From Table 14.14 we see that the hypotheses on σα² and σβ² are tested by using s3² in the denominator of the f-ratio, whereas the hypothesis on σαβ² is tested using s² in the denominator. The unbiased estimates of the variance components are

σˆ² = s²,   σˆαβ² = (s3² − s²)/n,   σˆα² = (s1² − s3²)/(bn),   σˆβ² = (s2² − s3²)/(an).

The expected mean squares for the three-factor experiment with random effects in a completely randomized design are shown in Table 14.15.

Table 14.15: Expected Mean Squares for a Three-Factor Random Effects Experiment

Source of    Degrees of                  Mean      Expected
Variation    Freedom                     Square    Mean Square
A            a − 1                       s1²       σ² + nσαβγ² + cnσαβ² + bnσαγ² + bcnσα²
B            b − 1                       s2²       σ² + nσαβγ² + cnσαβ² + anσβγ² + acnσβ²
C            c − 1                       s3²       σ² + nσαβγ² + bnσαγ² + anσβγ² + abnσγ²
AB           (a − 1)(b − 1)              s4²       σ² + nσαβγ² + cnσαβ²
AC           (a − 1)(c − 1)              s5²       σ² + nσαβγ² + bnσαγ²
BC           (b − 1)(c − 1)              s6²       σ² + nσαβγ² + anσβγ²
ABC          (a − 1)(b − 1)(c − 1)       s7²       σ² + nσαβγ²
Error        abc(n − 1)                  s²        σ²
Total        abcn − 1

It is evident from the expected mean squares of Table 14.15 that one can form appropriate f-ratios for testing all two-factor and three-factor interaction variance components. However, to test a hypothesis of the form

H0: σα² = 0,
H1: σα² ≠ 0,

there appears to be no appropriate f-ratio unless we have found one or more of the two-factor interaction variance components not significant. Suppose, for example, that we have compared s5² (mean square AC) with s7² (mean square ABC) and found σαγ² to be negligible. We could then argue that the term σαγ² should be dropped from all the expected mean squares of Table 14.15; then the ratio s1²/s4² provides a test for the significance of the variance component σα². Therefore, if we are to test hypotheses concerning the variance components of the main effects, it is necessary first to investigate the significance of the two-factor interaction components. An approximate test derived by Satterthwaite (1946; see the Bibliography) may be used when certain two-factor interaction variance components are found to be significant and hence must remain a part of the expected mean square.
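Once the mean squares are in hand, the method-of-moments estimates above are one-line computations. A minimal Python sketch, with illustrative mean squares rather than values from the text:

# Two-factor random effects model: variance-component estimates from the
# mean squares, following Table 14.14 (all numbers are illustrative).
a, b, n = 3, 4, 3
ms_a, ms_b, ms_ab, ms_e = 9.11, 16.72, 0.92, 0.44

var_e  = ms_e                       # sigma^2
var_ab = (ms_ab - ms_e) / n         # sigma^2 for the interaction
var_a  = (ms_a - ms_ab) / (b * n)   # sigma^2 for factor A
var_b  = (ms_b - ms_ab) / (a * n)   # sigma^2 for factor B
print(var_a, var_b, var_ab, var_e)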
Example 14.6: In a study to determine which are the important sources of variation in an industrial process, 3 measurements are taken on yield for 3 operators chosen randomly and 4 batches of raw materials chosen randomly. It is decided that a statistical test should be made at the 0.05 level of significance to determine if the variance components due to batches, operators, and interaction are significant. In addition, estimates of variance components are to be computed. The data are given in Table 14.16, with the response being percent by weight.

Table 14.16: Data for Example 14.6

                       Batch
Operator     1       2       3       4
    1      66.9    68.3    69.0    69.3
           68.1    67.4    69.8    70.9
           67.2    67.7    67.5    71.4
    2      66.3    68.1    69.7    69.4
           65.4    66.9    68.8    69.6
           65.8    67.6    69.2    70.0
    3      65.6    66.0    67.1    67.9
           66.3    66.9    66.2    68.4
           65.2    67.3    67.4    68.7

Solution: The sums of squares are found in the usual way, with the following results:

SST (total) = 84.5564, SSE (error) = 10.6733, SSA (operators) = 18.2106,
SSB (batches) = 50.1564, SS(AB) (interaction) = 5.5161.

All other computations are carried out and exhibited in Table 14.17. Since f0.05(2, 6) = 5.14, f0.05(3, 6) = 4.76, and f0.05(6, 24) = 2.51, we find the operator and batch variance components to be significant. Although the interaction variance is not significant at the α = 0.05 level, the P-value is 0.095. Estimates of the main effect variance components are

σˆα² = (9.1053 − 0.9194)/12 = 0.68,   σˆβ² = (16.7188 − 0.9194)/9 = 1.76.

Table 14.17: Analysis of Variance for Example 14.6

Source of      Sum of      Degrees of    Mean       Computed
Variation      Squares     Freedom       Square     f
Operators      18.2106          2         9.1053      9.90
Batches        50.1564          3        16.7188     18.18
Interaction     5.5161          6         0.9194      2.07
Error          10.6733         24         0.4447
Total          84.5564         35

Mixed Model Experiment

There are situations where the experiment dictates the assumption of a mixed model (i.e., a mixture of random and fixed effects). For example, for the case of two factors, we may have

Yijk = μ + Ai + Bj + (AB)ij + εijk,

for i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., n. The Ai may be independent random variables, independent of εijk, and the Bj may be fixed effects. The mixed nature of the model requires that the interaction terms be random variables. As a result, the relevant hypotheses are of the form

H0: σα² = 0,    H0: B1 = B2 = ··· = Bb = 0,        H0: σαβ² = 0,
H1: σα² ≠ 0,    H1: At least one of the Bj is not zero,    H1: σαβ² ≠ 0.

Again, the computations of sums of squares are identical to those of fixed and random effects situations, and the F-test is dictated by the expected mean squares. Table 14.18 provides the expected mean squares for the two-factor mixed model problem.

Table 14.18: Expected Mean Squares for Two-Factor Mixed Model Experiment

Factor          Expected Mean Square
A (random)      σ² + bnσα²
B (fixed)       σ² + nσαβ² + [an/(b − 1)] Σj Bj²
AB (random)     σ² + nσαβ²
Error           σ²

From the nature of the expected mean squares it becomes clear that the test on the random effect employs the mean square error s² as the denominator, whereas the test on the fixed effect uses the interaction mean square. Suppose we now consider three factors. Here, of course, we must take into account the situation where one factor is fixed and the situation in which two factors are fixed. Table 14.19 covers both situations.

Table 14.19: Expected Mean Squares for Mixed Model Factorial Experiments in Three Factors

         A Random                                              A Random, B Random
A        σ² + bcnσα²                                           σ² + cnσαβ² + bcnσα²
B        σ² + cnσαβ² + [acn/(b − 1)] Σj Bj²                    σ² + cnσαβ² + acnσβ²
C        σ² + bnσαγ² + [abn/(c − 1)] Σk Ck²                    σ² + nσαβγ² + anσβγ² + bnσαγ² + [abn/(c − 1)] Σk Ck²
AB       σ² + cnσαβ²                                           σ² + cnσαβ²
AC       σ² + bnσαγ²                                           σ² + nσαβγ² + bnσαγ²
BC       σ² + nσαβγ² + [an/((b − 1)(c − 1))] Σj Σk (BC)jk²     σ² + nσαβγ² + anσβγ²
ABC      σ² + nσαβγ²                                           σ² + nσαβγ²
Error    σ²                                                    σ²

Note that in the case of A random, all effects have proper f-tests. But in the case of A and B random, the main effect C must be tested using a Satterthwaite-type procedure similar to that used in the random effects experiment.
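The expected-mean-square columns dictate which mean square goes in the denominator of each F-ratio. A small sketch of the two-factor mixed-model tests (the mean squares are illustrative placeholders, not from the text):

# Two-factor mixed model (A random, B fixed), per Table 14.18: the fixed
# effect is tested against the interaction mean square, while the random
# effects are tested against the error mean square.
ms = {"A": 9.11, "B": 16.72, "AB": 0.92, "Error": 0.44}

f_A  = ms["A"] / ms["Error"]    # H0: sigma^2_alpha = 0
f_B  = ms["B"] / ms["AB"]       # H0: B_1 = ... = B_b = 0
f_AB = ms["AB"] / ms["Error"]   # H0: sigma^2_alpha-beta = 0
print(f_A, f_B, f_AB)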
Exercises

14.26 Assuming a random effects experiment for Exercise 14.2 on page 575, estimate the variance components for brand of orange juice concentrate, for number of days from when orange juice was blended until it was tested, and for experimental error.

14.27 To estimate the various components of variability in a filtration process, the percent of material lost in the mother liquor is measured for 12 experimental conditions, with 3 runs on each condition. Three filters and 4 operators are selected at random for use in the experiment.

                              Operator
Filter         1                2                3                4
  1     16.2 16.8 17.1   15.9 15.1 14.5   15.6 15.9 16.1   14.9 15.2 14.9
  2     16.6 16.9 16.8   16.0 16.3 16.5   16.1 16.0 17.2   15.4 14.6 15.9
  3     16.7 16.9 17.1   16.5 16.9 16.8   16.4 17.4 16.9   16.1 15.4 15.6

(a) Test the hypothesis of no interaction variance component between filters and operators at the α = 0.05 level of significance.
(b) Test the hypotheses that the operators and the filters have no effect on the variability of the filtration process at the α = 0.05 level of significance.
(c) Estimate the components of variance due to filters, operators, and experimental error.

14.28 A defense contractor is interested in studying an inspection process to detect failure or fatigue of transformer parts. Three levels of inspections are used by three randomly chosen inspectors. Five lots are used for each combination in the study. The factor levels are given in the data. The response is in failures per 1000 pieces.
Brand of Paint 14.31 Consider the following analysis of a random effects experiment: 14.29 Degrees of Freedom Mean Square C Source of Variation A B C AB AC BC ABC Error 24 5 Total 47 Test for significant variance components among all main effects and interaction effects at the 0.01 level of significance (a) by using a pooled estimate of error when appropri- ate; (b) by not pooling sums of squares of insignificant ef- fects. 14.30 A plant manager would like to show that the yield of a woven fabric in the plant does not depend on machine operator or time of day and is consistently high. Four randomly selected operators and 3 ran- domly selected hours of the day are chosen for the study. The yield is measured in yards produced per minute. Samples are taken on 3 randomly chosen days. (a) Write the appropriate model. (b) Evaluate the variance components for operator and time. (c) Draw conclusions. Material A A 5.50 5.15 4.75 4.60 5.10 5.20 B 5.60 5.55 5.50 5.60 5.40 5.50 C 5.40 5.48 5.05 4.95 4.50 4.55 (a) What is this type of model called? (b) Analyze the data, using the appropriate model. (c) Did the manufacturer of brand A support its claim with the data? 14.32 A process engineer wants to determine if the power setting on the machines used to fill certain types of cereal boxes results in a significant effect on the ac- tual weight of the product. The study consists of 3 randomly chosen types of cereal manufactured by the company and 3 fixed power settings. Weight is mea- sured for 4 different randomly selected boxes of cereal at each combination. The desired weight is 400 grams. The data are presented here. 3 140 1 480 325 3 15 6 24 2 18 6 2 2 Power Setting Low Current High 1 Cereal Type 390 402 400 399 399 404 402 400 408 404 406 415 407 401 400 413 3 2 392 392 394 401 390 392 395 502 395 401 396 400 410 408 405 399 403 399 412 415 (a) (b) (c) Give the appropriate model, and list the assump- tion being made. Is there a significant effect due to the power set- ting? Is there a significant variance component due to cereal type? B 14.33 The Statistics Consulting Center at Virginia Tech was involved in analyzing a set of data taken by personnel in the Human Nutrition and Foods Depart- ment in which it was of interest to study the effects of flour type and percent sweetener on certain physical attributes of a type of cake. All-purpose flour and cake flour were used, and the percent sweetener was varied at four levels. The following data show information on specific gravity of cake samples. Three cakes were prepared at each of the eight factor combinations. Sweetener Flour Concentration All-Purpose Cake 0 0.90 0.87 0.90 0.91 0.90 0.80 50 0.86 0.89 0.91 0.88 0.82 0.83 75 0.93 0.88 0.87 0.86 0.85 0.80 100 0.79 0.82 0.80 0.86 0.85 0.85 (a) Treat the analysis as a two-factor analysis of vari- ance. Test for differences between flour type. Test for differences between sweetener concentration. (b) Discuss the effect of interaction, if any. Give P- values on all tests. 14.34 An experiment was conducted in the Depart- ment of Food Science at Virginia Tech. It was of inter- est to characterize the texture of certain types of fish in the herring family. The effect of sauce types used in preparing the fish was also studied. The response in the experiment was “texture value,” measured with a machine that sliced the fish product. 
The following are data on texture values:

                             Fish Type
Sauce Type     Unbleached     Bleached
               Menhaden       Menhaden      Herring
Sour Cream       27.6           64.0          57.4
                 47.8           66.9          71.1
                 53.8           66.5          49.8
                 31.0           66.8          48.3
                 62.2           53.8          88.0
Wine Sauce       95.2          107.0          11.8
                 35.1           83.9          54.6
                 43.6          110.4         108.2
                 86.7           93.4          16.1
                 41.8           83.1         105.2

(a) Do an analysis of variance. Determine whether or not there is an interaction between sauce type and fish type.
(b) Based on your results from part (a) and on F-tests on main effects, determine if there is a significant difference in texture due to sauce types, and determine whether there is a significant difference due to fish types.

Review Exercises

14.35 A study was made to determine if humidity conditions have an effect on the force required to pull apart pieces of glued plastic. Three types of plastic were tested using 4 different levels of humidity. The results, in kilograms, are as follows:

                        Humidity
Plastic Type    30%     50%     70%     90%
      A        39.0    33.1    33.8    33.0
               42.8    37.8    30.7    32.9
      B        36.9    27.2    29.7    28.5
               41.0    26.8    29.1    27.9
      C        27.4    29.2    26.7    30.9
               30.3    29.9    32.0    31.5

(a) Assuming a fixed effects experiment, perform an analysis of variance and test the hypothesis of no interaction between humidity and plastic type at the 0.05 level of significance.
(b) Using only plastics A and B and the value of s² from part (a), once again test for the presence of interaction at the 0.05 level of significance.

14.36 Personnel in the Materials Engineering Department at Virginia Tech conducted an experiment to study the effects of environmental factors on the stability of a certain type of copper-nickel alloy. The basic response was the fatigue life of the material. The factors are level of stress and environment. The data are as follows:

                                Stress Level
Environment              Low      Medium      High
Dry Hydrogen            11.08     13.12      14.18
                        10.98     13.04      14.90
                        11.24     13.37      15.10
High Humidity (95%)     10.75     12.73      14.15
                        10.52     12.87      14.42
                        10.43     12.95      14.25

(a) Do an analysis of variance to test for interaction between the factors. Use α = 0.05.
(b) Based on part (a), do an analysis on the two main effects and draw conclusions. Use a P-value approach in drawing conclusions.

14.37 In the experiment of Review Exercise 14.33, cake volume was also used as a response. The units are cubic inches. Test for interaction between factors and discuss main effects. Assume that both factors are fixed effects.

Sweetener                      Flour
Concentration      All-Purpose            Cake
      0        4.48  3.98  4.42    4.12  4.92  5.10
     50        3.68  5.04  3.72    5.00  4.26  4.34
     75        3.92  3.82  4.06    4.82  4.34  4.40
    100        3.26  3.80  3.40    4.32  4.18  4.30
Three different tire air pressures were compared on three dif- ferent driving surfaces. The three air pressures were both left- and right-side tires inflated to 6 kgf/cm2, left-side tires inflated to 6 kgf/cm2 and right-side tires inflated to 3 kgf/cm2, and both left- and right-side tires inflated to 3 kgf/cm2. The three driving surfaces were asphalt, dry asphalt, and dry cement. The turning ra- dius of a test vehicle was observed twice for each level of tire pressure on each of the three different driving surfaces. // Tire Air Pressure 1 3 44.0 25.5 27.4 42.8 31.9 33.7 43.7 38.2 2 34.2 37.2 31.8 27.6 Driving Surface Asphalt Dry Asphalt Dry Cement 27.3 39.5 46.6 28.1 35.5 34.6 A1 B1 A1 B2 A1 B3 A2 B1 A2 B2 A2 B3 A3 B1 A3 B2 A3 B3 151 135 151 135 178 171 180 173 204 190 205 190 156 148 158 149 183 168 183 170 210 204 211 203 161 145 162 148 189 182 191 184 215 202 216 203 151 138 181 174 206 192 158 150 183 172 213 204 163 148 192 183 217 205 Perform an analysis of variance of the above data. Comment on the interpretation of the main and in- teraction effects. 14.41 The manufacturer of a certain brand of freeze- dried coffee hopes to shorten the process time without jeopardizing the integrity of the product. The process engineer wants to use 3 temperatures for the drying chamber and 4 drying times. The current drying time is 3 hours at a temperature of −15◦C. The flavor re- sponse is an average of scores of 4 professional judges. The score is on a scale from 1 to 10, with 10 being the best. The data are as shown in the following table. Perform an analysis of variance with for significant main and interaction effects. Draw con- clusions. 14.39 Exercise 14.25 on page 588 describes an exper- iment involving the extraction of polyethylene through use of a solvent. Solvent Temp. 4 Ethanol 120 94.0 94.0 80 95.3 95.1 Time Time 1 hr 1.5 hr 2 hr 3 hr −20◦ C 9.60 9.63 9.75 9.73 9.82 9.93 9.78 9.81 Temperature −15◦ C 9.55 9.50 9.60 9.61 9.81 9.78 9.80 9.75 −10◦ C 9.40 9.43 9.55 9.48 9.50 9.52 9.55 9.58 α = 0.05 to test 8 93.8 94.2 94.9 95.3 Toluene 120 94.6 94.5 93.6 94.1 80 95.4 95.4 95.6 96.0 16 91.1 90.5 92.5 92.4 91.1 91.0 92.1 92.1 (a) (b) (c) What type of model should be used? State assump- tions. Analyze the data appropriately. Write a brief report to the vice-president in charge and make a recommendation for future manufac- turing of this product. (a) Do a different sort of analysis on the data. Fit an appropriate regression model with a solvent cate- gorical variable, a temperature term, a time term, a temperature by time interaction, a solvent by tem- perature interaction, and a solvent by time interac- tion. Do t-tests on all coefficients and report your findings. (b) Do your findings suggest that different models are appropriate for ethanol and toluene, or are they equivalent apart from the intercepts? Explain. (c) Do you find any conclusions here that contra- dict conclusions drawn in your solution of Exercise 14.25? Explain. 14.40 In the book SN-Ratio for the Quality Evalua- tion, published by the Japanese Standards Association 14.42 To ascertain the number of tellers needed dur- ing peak hours of operation, data were collected by an urban bank. Four tellers were studied during three “busy” times: (1) weekdays between 10:00 and 11:00 A.M., (2) weekday afternoons between 2:00 and 3:00 P.M., and (3) Saturday mornings between 11:00 A.M. and 12:00 noon. 
An analyst chose four randomly se- lected times within each of the three time periods for each of the four teller positions over a period of months, and the numbers of customers serviced were observed. The data are as follows: C2 596 Chapter 14 Factorial Experiments (Two or More Factors) Time Period any, would be violated? (b) Construct a standard ANOVA table that includes F-tests on main effects and interactions. If interac- tions and main effects are found to be significant, give scientific conclusions. What have we learned? Be sure to interpret any significant interaction. Use your own judgment regarding P-values. (c) Do the entire analysis again using an appropriate transformation on the response. Do you see any differences in your findings? Comment. Teller 1 1 18241722 25292332 29302134 2 16111914 23322517 27291816 3 12191122 27332724 25202915 4 11 9 13 8 10 7 19 8 11 9 17 9 It is assumed that the number of customers served is a Poisson random variable. (a) Discuss the danger in doing a standard analysis of variance on the data above. What assumptions, if 2 3 14.6 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters One of the most confusing issues in the analysis of factorial experiments resides in the interpretation of main effects in the presence of interaction. The presence of a relatively large P-value for a main effect when interactions are clearly present may tempt the analyst to conclude “no significant main effect.” However, one must understand that if a main effect is involved in a significant interaction, then the main effect is influencing the response. The nature of the effect is inconsistent across levels of other effects. The nature of the role of the main effect can be deduced from interaction plots. In light of what is communicated in the preceding paragraph, there is danger of a substantial misuse of statistics when one employs a multiple comparison test on main effects in the clear presence of interaction among the factors. One must be cautious in the analysis of a factorial experiment when the assump- tion of a complete randomized design is made when in fact complete randomization is not carried out. For example, it is common to encounter factors that are very difficult to change. As a result, factor levels may need to be held without change for long periods of time throughout the experiment. For instance, a temperature factor is a common example. Moving temperature up and down in a randomization scheme is a costly plan, and most experimenters will refuse to do it. Experimental designs with restrictions in randomization are quite common and are called split plot designs. They are beyond the scope of the book, but presentations are found in Montgomery (2008a). Many of the concepts discussed in this chapter carry over into Chapter 15 (e.g., the importance of randomization and the role of interaction in the interpretation of results). However, there are two areas covered in Chapter 15 that represent an expansion of principles dealt with both in Chapter 13 and in this chapter. In Chapter 15, problem solving through the use of factorial experiments is done with regression analysis since most of the factors are assumed to be quantitative and measured on a continuum (e.g., temperature and time). Prediction equations are developed from the data of the designed experiment, and they are used for process improvement or even process optimization. 
In addition, Chapter 15 develops the topic of fractional factorials, in which only a portion, or fraction, of the entire factorial experiment is implemented, owing to the prohibitive cost of carrying out the complete experiment.

Chapter 15

2k Factorial Experiments and Fractions

15.1 Introduction

We have already been exposed to certain experimental design concepts. The sampling plan for the simple t-test on the mean of a normal population and the analysis of variance involve randomly allocating pre-chosen treatments to experimental units. The randomized block design, where treatments are assigned to units within relatively homogeneous blocks, involves restricted randomization. In this chapter, we give special attention to experimental designs in which the experimental plan calls for the study of the effect on a response of k factors, each at two levels. These are commonly known as 2^k factorial experiments. We often denote the levels as "high" and "low," even though this notation may be arbitrary in the case of qualitative variables. The complete factorial design requires that each level of every factor occur with each level of every other factor, giving a total of 2^k treatment combinations.

Factor Screening and Sequential Experimentation

Often, when experimentation is conducted either on a research or on a development level, a well-planned experimental design is one stage of what is truly a sequential plan of experimentation. More often than not, the scientists and engineers at the outset of a study may not be aware of which factors are important or what the appropriate ranges are for the potential factors over which experimentation should be conducted. For example, in the text Response Surface Methodology by Myers, Montgomery, and Anderson-Cook (2009), an example is given of a pilot plant experiment in which four factors (temperature, pressure, concentration of formaldehyde, and stirring rate) are varied in order to establish their influence on the response, the filtration rate of a certain chemical product. Even at the pilot plant level, the scientists are not certain that all four factors should be involved in the model. In addition, the eventual goal is to determine the proper settings of the contributing factors that maximize the filtration rate. Thus, there is a need to determine the proper region of experimentation. These questions can be answered only if the total experimental plan is done sequentially. Many experimental endeavors are plans that feature iterative learning, the type of learning that is consistent with the scientific method, with the word iterative implying stage-wise experimentation.

Generally, the initial stage of the ideal sequential plan is variable or factor screening, a procedure that involves an inexpensive experimental design using the candidate factors. This is particularly important when the plan involves a complex system like a manufacturing process. The information received from the results of a screening design is used to design one or more subsequent experiments in which adjustments in the important factors are made, the adjustments that provide improvements in the system or process. The 2^k factorial experiments and fractions of the 2^k are powerful tools that are ideal screening designs. They are simple, practical, and intuitively appealing. Many of the general concepts discussed in Chapter 14 continue to apply.
However, there are also graphical methods that provide useful intuition in the analysis of two-level designs.

Screening Designs for Large Numbers of Factors

When k is small, say k = 2 or even k = 3, the utility of the 2^k factorial for factor screening is clear. Analysis of variance and/or regression analysis, as discussed and illustrated in Chapters 12, 13, and 14, remain useful as tools. In addition, graphical approaches are helpful. If k is large, say as large as 6, 7, or 8, the number of factor combinations, and thus the number of experimental runs required for the 2^k factorial, often becomes prohibitive. For example, suppose one is interested in carrying out a screening design involving k = 8 factors. There may be interest in gaining information on all k = 8 main effects as well as the k(k − 1)/2 = 28 two-factor interactions. However, the 2^8 = 256 runs required would appear to make the study much too large and wasteful for studying only 28 + 8 = 36 effects. But, as we will illustrate in future sections, when k is large we can gain considerable information in an efficient manner by using only a fraction of the complete 2^k factorial experiment. This class of designs is the class of fractional factorial designs. The goal is to retain high-quality information on main effects and interesting interactions even though the size of the design is reduced considerably.

15.2 The 2k Factorial: Calculation of Effects and Analysis of Variance

Consider initially a 2² factorial with factors A and B and n experimental observations per factor combination. It is useful to use the symbols (1), a, b, and ab to signify the design points, where the presence of a lowercase letter implies that the corresponding factor (A or B) is at the high level; absence of the lowercase letter implies that the factor is at the low level. So ab is the design point (+, +), a is (+, −), b is (−, +), and (1) is (−, −). There are situations in which the notation also stands for the response data at the design point in question. As an introduction to the calculation of the important effects that aid in determining the influence of the factors, and of the sums of squares that are incorporated into analysis-of-variance computations, we have Table 15.1.

Table 15.1: A 2² Factorial Experiment

                          A
B         −                +                Mean
+         b                ab               (b + ab)/(2n)
−         (1)              a                ((1) + a)/(2n)
Mean      ((1) + b)/(2n)   (a + ab)/(2n)

In this table, (1), a, b, and ab signify totals of the n response values at the individual design points. The simplicity of the 2² factorial lies in the fact that, apart from experimental error, important information comes to the analyst in single-degree-of-freedom components, one each for the two main effects A and B and one degree of freedom for the interaction AB. The information retrieved on all of these takes the form of three contrasts. Let us define the following contrasts among the treatment totals:

A contrast = ab + a − b − (1),
B contrast = ab − a + b − (1),
AB contrast = ab − a − b + (1).

The three effects from the experiment involve these contrasts and appeal to common sense and intuition. The two computed main effects are of the form

effect = ȳ_H − ȳ_L,

where ȳ_H and ȳ_L are the average responses at the high, or "+," level and at the low, or "−," level, respectively. As a result, we have the following:

Calculation of Main Effects:
A = [ab + a − b − (1)]/(2n) = (A contrast)/(2n),
B = [ab − a + b − (1)]/(2n) = (B contrast)/(2n).

The quantity A is seen to be the difference between the mean responses at the high and low levels of factor A. In fact, we call A the main effect of factor A; similarly, B is the main effect of factor B.
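To make the contrast arithmetic concrete, the following minimal Python sketch computes the two main effects and the interaction effect from the four treatment totals of a 2² factorial. The function name is ours, for illustration only; the example call uses the totals of Table 15.2 in Example 15.1 below (where n = 1).

def effects_2x2(t1, a, b, ab, n):
    """Main and interaction effects of a 2^2 factorial from treatment totals.
    t1 is the total at (1); n is the number of replicates per design point."""
    A = (ab + a - b - t1) / (2 * n)    # A contrast / 2n
    B = (ab - a + b - t1) / (2 * n)    # B contrast / 2n
    AB = (ab - a - b + t1) / (2 * n)   # AB contrast / 2n
    return A, B, AB

# Totals of Table 15.2 with n = 1: (1) = 80, a = 50, b = 100, ab = 70
print(effects_2x2(80, 50, 100, 70, 1))   # -> (-30.0, 20.0, 0.0)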
Apparent interaction in the data is observed by inspecting the difference between ab − b and a − (1), or between ab − a and b − (1), in Table 15.1. If, for example,

ab − a ≈ b − (1), or ab − a − b + (1) ≈ 0,

a line connecting the responses for each level of factor A at the high level of factor B will be approximately parallel to a line connecting the responses for each level of factor A at the low level of factor B. Nonparallel lines, as in Figure 15.1, suggest the presence of interaction. To test whether this apparent interaction is significant, a third contrast in the treatment totals, orthogonal to the main effect contrasts and called the interaction effect, is constructed by evaluating

Interaction Effect:
AB = [ab − a − b + (1)]/(2n) = (AB contrast)/(2n).

[Figure 15.1: Response suggesting apparent interaction. The plot shows the response against the level of A, with one line connecting (1) and a at the low level of B and a nonparallel line connecting b and ab at the high level of B.]

Example 15.1: Consider the data in Tables 15.2 and 15.3, with n = 1, for a 2² factorial experiment.

Table 15.2: 2² Factorial with No Interaction     Table 15.3: 2² Factorial with Interaction

          B                                                B
A      −      +                                  A      −      +
+      50     70                                 +      50     70
−      80    100                                 −      80     40

The numbers in the cells of Tables 15.2 and 15.3 clearly illustrate how the contrasts, the resulting calculation of the two main effects, and the resulting conclusions can be highly influenced by the presence of interaction. In Table 15.2, the effect of A is −30 at both the low and high levels of factor B, and the effect of B is 20 at both the low and high levels of factor A. This "consistency of effect" (no interaction) can be very important information for the analyst. The main effects are

A = (70 + 50)/2 − (100 + 80)/2 = 60 − 90 = −30,
B = (100 + 70)/2 − (80 + 50)/2 = 85 − 65 = 20,

while the interaction effect is

AB = (100 + 50)/2 − (80 + 70)/2 = 75 − 75 = 0.

On the other hand, in Table 15.3 the effect of A is once again −30 at the low level of B but +30 at the high level of B. This "inconsistency of effect" (interaction) is also present for B across the levels of A. In such cases, the main effects can be meaningless and, in fact, highly misleading. For example, the effect of A is

A = (50 + 70)/2 − (80 + 40)/2 = 60 − 60 = 0,

since there is a complete "masking" of the effect as one averages over the levels of B. The strong interaction is illustrated by the calculated effect

AB = (70 + 80)/2 − (50 + 40)/2 = 75 − 45 = 30.

Here it is convenient to illustrate the scenarios of Tables 15.2 and 15.3 with interaction plots. Note the parallelism in the plot of Figure 15.2 and the interaction that is apparent in Figure 15.3.

[Figures 15.2 and 15.3: Interaction plots for the data of Tables 15.2 and 15.3, respectively. Each plot shows the response against A = −1, +1 with separate lines for B = −1 and B = +1; the lines are parallel in Figure 15.2 and cross in Figure 15.3.]

Computation of Sums of Squares

We take advantage of the fact that in the 2² factorial, or for that matter in the general 2^k factorial experiment, each main effect and interaction effect has an associated single degree of freedom. Therefore, we can write 2^k − 1 orthogonal single-degree-of-freedom contrasts in the treatment combinations, each accounting for variation due to some main or interaction effect. Thus, under the usual independence and normality assumptions in the experimental model, we can make tests to determine whether a contrast reflects systematic variation or merely chance or random variation. The sums of squares for each contrast are found by following the procedures given in Section 13.5.
Writing

Y₁.. = b + (1),   Y₂.. = ab + a,   c₁ = −1, and c₂ = 1,

where Y₁.. and Y₂.. are each totals of 2n observations, we have

SSA = (c₁Y₁.. + c₂Y₂..)² / [2n(c₁² + c₂²)] = [ab + a − b − (1)]²/(2²n) = (A contrast)²/(2²n),

with 1 degree of freedom. Similarly, we find that

SSB = [ab + b − a − (1)]²/(2²n) = (B contrast)²/(2²n)

and

SS(AB) = [ab + (1) − a − b]²/(2²n) = (AB contrast)²/(2²n).

Each contrast has 1 degree of freedom, whereas the error sum of squares, with 2²(n − 1) degrees of freedom, is obtained by subtraction from the formula

SSE = SST − SSA − SSB − SS(AB).

In computing the sums of squares for the main effects A and B and the interaction effect AB, it is convenient to present the total responses of the treatment combinations along with the appropriate algebraic signs for each contrast, as in Table 15.4. The main effects are obtained as simple comparisons between the low and high levels. Therefore, we assign a positive sign to the treatment combination that is at the high level of a given factor and a negative sign to the treatment combination at the low level. The positive and negative signs for the interaction effect are obtained by multiplying the corresponding signs of the contrasts of the interacting factors.

Table 15.4: Signs for Contrasts in a 2² Factorial Experiment

Treatment           Factorial Effect
Combination      A      B      AB
(1)              −      −      +
a                +      −      −
b                −      +      −
ab               +      +      +

The 23 Factorial

Let us now consider an experiment using three factors, A, B, and C, each with levels −1 and +1. This is a 2³ factorial experiment giving the eight treatment combinations (1), a, b, c, ab, ac, bc, and abc. The treatment combinations and the appropriate algebraic signs for each contrast used in computing the sums of squares for the main effects and interaction effects are presented in Table 15.5.

Table 15.5: Signs for Contrasts in a 2³ Factorial Experiment

Treatment               Factorial Effect (symbolic)
Combination     A     B     C     AB    AC    BC    ABC
(1)             −     −     −     +     +     +     −
a               +     −     −     −     −     +     +
b               −     +     −     −     +     −     +
c               −     −     +     +     −     −     +
ab              +     +     −     +     −     −     −
ac              +     −     +     −     +     −     −
bc              −     +     +     −     −     +     −
abc             +     +     +     +     +     +     +

[Figure 15.4: Geometric view of the 2³. The eight design points (1), a, b, c, ab, ac, bc, and abc are the vertices of a cube in the coordinates A, B, and C, each running from −1 to +1.]

It is helpful to discuss and illustrate the geometry of the 2³ factorial, much as we illustrated that of the 2² factorial in Figure 15.1. For the 2³, the eight design points represent the vertices of a cube, as shown in Figure 15.4. The columns of Table 15.5 represent the signs that are used for the contrasts and thus for the computation of the seven effects and corresponding sums of squares. These columns are analogous to those given in Table 15.4 for the case of the 2². Seven effects are available, since there are eight design points. For example,

A = [a + ab + ac + abc − (1) − b − c − bc]/(4n),
AB = [(1) + c + ab + abc − a − b − ac − bc]/(4n),

and so on. The sums of squares are given by

SS(effect) = (contrast)²/(2³n).

An inspection of Table 15.5 reveals that for the 2³ experiment all contrasts among the seven are mutually orthogonal; therefore, the seven effects are assessed independently.
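The sign table generalizes mechanically: the signs in an effect's column are the products of the main-effect signs of the letters it contains. The Python sketch below builds the sign table for a 2³ factorial and computes each contrast, effect, and sum of squares from the eight treatment totals. The function is ours, and the totals in the example call are hypothetical.

def factorial_2k_effects(totals, n):
    """totals: dict mapping labels like '(1)', 'a', 'bc' to treatment totals
    of n observations each in a 2^3 factorial.
    Returns {effect name: (contrast, effect, sum of squares)}."""
    factors = "ABC"
    k = len(factors)
    # Sign of a factor at a treatment combination: +1 if its lowercase
    # letter appears in the label, -1 otherwise; '(1)' has all factors low.
    def sign(label, f):
        return 1 if f.lower() in label else -1
    results = {}
    # Every nonempty subset of {A, B, C} is an effect (A, B, AB, ..., ABC).
    for mask in range(1, 2 ** k):
        effect = "".join(f for i, f in enumerate(factors) if mask >> i & 1)
        contrast = 0.0
        for label, y in totals.items():
            s = 1
            for f in effect:
                s *= sign(label, f)
            contrast += s * y
        eff = contrast / (2 ** (k - 1) * n)   # effect = contrast / (2^(k-1) n)
        ss = contrast ** 2 / (2 ** k * n)     # SS = (contrast)^2 / (2^k n)
        results[effect] = (contrast, eff, ss)
    return results

# Hypothetical treatment totals for a single-replicate (n = 1) 2^3 design:
totals = {"(1)": 10, "a": 14, "b": 9, "ab": 16,
          "c": 11, "ac": 15, "bc": 10, "abc": 18}
for name, vals in factorial_2k_effects(totals, 1).items():
    print(name, vals)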
Effects and Sums of Squares for the 2k

For a 2^k factorial experiment, the single-degree-of-freedom sums of squares for the main effects and interaction effects are obtained by squaring the appropriate contrasts in the treatment totals and dividing by 2^k n, where n is the number of replications of the treatment combinations. As before, an effect is always calculated by subtracting the average response at the "low" level from the average response at the "high" level. The high and low levels for main effects are quite clear, and the symbolic high and low levels for interactions are evident from information such as that in Table 15.5.

The orthogonality property has the same importance here as it does for the material on comparisons discussed in Chapter 13. Orthogonality of contrasts implies that the estimated effects, and thus the sums of squares, are independent. This independence is readily illustrated in the 2³ factorial experiment if the responses with factor A at its high level are increased by an amount x in Table 15.5. Only the A contrast leads to a larger sum of squares, since the x effect cancels out in the formation of the six remaining contrasts, as a result of the two positive and two negative signs associated with treatment combinations in which A is at the high level. There are additional advantages produced by orthogonality; these are pointed out when we discuss the 2^k factorial experiment in regression situations.

15.3 Nonreplicated 2k Factorial Experiment

The full 2^k factorial may often involve considerable experimentation, particularly when k is large. As a result, replication of each factor combination is often not feasible. If all effects, including all interactions, are included in the model of the experiment, no degrees of freedom are allowed for error. Often, when k is large, the data analyst will pool the sums of squares and corresponding degrees of freedom for high-order interactions that are known or assumed to be negligible. This produces F-tests for main effects and lower-order interactions.

Diagnostic Plotting with Nonreplicated 2k Factorial Experiments

Normal probability plotting can be a very useful methodology for determining the relative importance of effects in a reasonably large two-level factorial experiment when there is no replication. This type of diagnostic plot can be particularly useful when the data analyst is hesitant to pool high-order interactions for fear that some of the effects pooled into the "error" may truly be real effects and not merely random. The reader should bear in mind that all effects that are not real (i.e., that are independent estimates of zero) follow a normal distribution with mean near zero and constant variance. For example, in a 2⁴ factorial experiment, we are reminded that all effects (keep in mind that n = 1) are of the form

AB = contrast/8 = ȳ_H − ȳ_L,
where ȳ_H is the average of eight independent experimental runs at the high, or "+," level and ȳ_L is the average of eight independent runs at the low, or "−," level. Thus, the variance of each contrast is Var(ȳ_H − ȳ_L) = σ²/4. For any real effects, E(ȳ_H − ȳ_L) ≠ 0. Thus, normal probability plotting should reveal "significant" effects as those that fall off the straight line that depicts realizations of independent, identically distributed normal random variables.

The probability plotting can take one of many forms. The reader is referred to Chapter 8, where these plots were first presented. The empirical normal quantile-quantile plot may be used. The plotting procedure that makes use of normal probability paper may also be used. In addition, there are several other types of diagnostic normal probability plots. In summary, the procedure for diagnostic effect plots is as follows.

Probability Effect Plots for Nonreplicated 2k Factorial Experiments:
1. Calculate the effects as effect = contrast/2^(k−1).
2. Construct a normal probability plot of all effects.
3. Effects that fall off the straight line should be considered real effects.

Further comments regarding normal probability plotting of effects are in order. First, the data analyst may feel frustrated if he or she uses these plots with a small experiment. On the other hand, the plotting is likely to give satisfying results when there is effect sparsity, that is, when many effects are truly not real. This sparsity will be evident in large experiments, where high-order interactions are not likely to be real.

Case Study 15.1: Injection Molding: Many manufacturing companies in the United States and abroad use molded parts as components, and shrinkage is often a major problem. Often, a molded die for a part is built larger than nominal to allow for part shrinkage. In the following experimental situation, a new die is being produced, and ultimately it is important to find the proper process settings to minimize shrinkage. The response values are deviations from nominal (i.e., shrinkage). The factors and levels are as follows:

                                      Coded Levels
                                    −1         +1
A. Injection velocity (ft/sec)      1.0        2.0
B. Mold temperature (°C)            100        150
C. Mold pressure (psi)              500        1000
D. Back pressure (psi)              75         120

The purpose of the experiment was to determine what effects (main effects and interaction effects) influence shrinkage. The experiment was considered a preliminary screening experiment from which the factors for a more complete analysis might be determined. It was also hoped that some insight might be gained into how the important factors impact shrinkage. The data from a nonreplicated 2⁴ factorial experiment are given in Table 15.6.

Table 15.6: Data for Case Study 15.1

Factor          Response        Factor          Response
Combination     (cm × 10⁴)      Combination     (cm × 10⁴)
(1)             72.68           d               73.52
a               71.74           ad              75.97
b               76.09           bd              74.28
ab              93.19           abd             92.87
c               71.25           cd              79.34
ac              70.59           acd             75.12
bc              70.92           bcd             79.67
abc             104.96          abcd            97.80

Initially, effects were calculated and placed on a normal probability plot. The calculated effects are as follows:

A = 10.5613,     B = 12.4463,     C = 2.4138,      D = 2.1438,
AB = 11.4038,    AC = 1.2613,     AD = −1.8238,    BC = 1.8163,
BD = −2.2787,    CD = 1.4088,     ABC = 2.8588,    ABD = −1.7813,
ACD = −3.0438,   BCD = −0.4788,   ABCD = −1.3063.

The normal quantile-quantile plot is shown in Figure 15.5. The plot seems to imply that effects A, B, and AB stand out as being important, and the signs of the important effects indicate the preliminary conclusions.

[Figure 15.5: Normal quantile-quantile plot of effects for Case Study 15.1, plotting the ordered effects against theoretical normal quantiles; A, B, and AB fall well off the straight line formed by the remaining effects.]

1. An increase in injection velocity from 1.0 to 2.0 increases shrinkage.
2. An increase in mold temperature from 100°C to 150°C increases shrinkage.
3. There is an interaction between injection velocity and mold temperature; although both main effects are important, it is crucial that we understand the impact of the two-factor interaction.
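As a rough cross-check of the effect calculations and of Figure 15.5, the following Python sketch computes all 15 effects from the Table 15.6 responses and pairs the ordered effects with theoretical normal quantiles; plotting the pairs (e.g., with matplotlib) would reproduce the quantile-quantile plot. The helper name and plotting positions (i − 0.5)/m are ours.

import numpy as np
from scipy.stats import norm

# Responses from Table 15.6, keyed by treatment combination (n = 1).
y = {"(1)": 72.68, "a": 71.74, "b": 76.09, "ab": 93.19,
     "c": 71.25, "ac": 70.59, "bc": 70.92, "abc": 104.96,
     "d": 73.52, "ad": 75.97, "bd": 74.28, "abd": 92.87,
     "cd": 79.34, "acd": 75.12, "bcd": 79.67, "abcd": 97.80}
factors = "ABCD"
k = len(factors)

def level(label, f):
    return 1 if f.lower() in label else -1

effects = {}
for mask in range(1, 2 ** k):
    name = "".join(f for i, f in enumerate(factors) if mask >> i & 1)
    contrast = sum(np.prod([level(lab, f) for f in name]) * resp
                   for lab, resp in y.items())
    effects[name] = contrast / 2 ** (k - 1)    # effect = contrast / 8

# Ordered effects vs. theoretical normal quantiles (for a Q-Q plot).
ordered = sorted(effects.items(), key=lambda kv: kv[1])
m = len(ordered)
quantiles = norm.ppf((np.arange(1, m + 1) - 0.5) / m)
for (name, eff), q in zip(ordered, quantiles):
    print(f"{name:5s} effect = {eff:8.4f}   normal quantile = {q:6.3f}")

Running this reproduces the effect values listed above (e.g., A = 10.5613), with A, B, and AB appearing at the extreme right of the quantile scale.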
Interpretation of Two-Factor Interaction

As one would expect, a two-way table of means provides ease of interpretation of the AB interaction. Consider the two-factor situation in Table 15.7.

Table 15.7: Illustration of Two-Factor Interaction

                    B (temperature)
A (velocity)       100          150
1                  74.1975      75.240
2                  73.355       97.205

Notice that the large sample mean at high velocity and high temperature created the significant interaction; the shrinkage increases in a nonadditive manner. Mold temperature appears to have a positive effect regardless of the velocity level, but the effect is greatest at high velocity. The velocity effect is very slight at low temperature but is clearly positive at high mold temperature. To control shrinkage at a low level, one should avoid using high injection velocity and high mold temperature simultaneously. All of these results are illustrated graphically in Figure 15.6.

[Figure 15.6: Interaction plot for Case Study 15.1, showing mean shrinkage against mold temperature (100, 150) with separate lines for velocity levels 1 and 2; the lines diverge sharply at the high temperature.]

Analysis with Pooled Mean Square Error: Annotated Computer Printout

It may be of interest to observe an analysis of variance of the injection molding data with high-order interactions pooled to form a mean square error. Interactions of order three and four are pooled. Figure 15.7 shows a SAS PROC GLM printout. The analysis of variance reveals essentially the same conclusion as that of the normal probability plot.

The tests and P-values shown in Figure 15.7 require interpretation. A significant P-value suggests that the effect differs significantly from zero. The tests on main effects (which, in the presence of interactions, may be regarded as the effects averaged over the levels of the other factors) indicate significance for effects A and B. The signs of the effects are also important: an increase in the level from low to high of A, injection velocity, results in increased shrinkage, and the same is true for B. However, because of the significant interaction AB, main effect interpretations may be viewed only as trends across the levels of the other factors. The impact of the significant AB interaction is better understood by using a two-way table of means, as in Table 15.7.

                            The GLM Procedure
Dependent Variable: y
                                   Sum of
Source             DF             Squares     Mean Square    F Value   Pr > F
Model              10         1689.237462      168.923746       9.37   0.0117
Error               5           90.180831       18.036166
Corrected Total    15         1779.418294

R-Square      Coeff Var      Root MSE        y Mean
0.949320       5.308667      4.246901      79.99938

Source    DF     Type III SS     Mean Square    F Value   Pr > F
A          1     446.1600062     446.1600062      24.74   0.0042
B          1     619.6365563     619.6365563      34.36   0.0020
C          1      23.3047563      23.3047563       1.29   0.3072
D          1      18.3826563      18.3826563       1.02   0.3590
A*B        1     520.1820562     520.1820562      28.84   0.0030
A*C        1       6.3630063       6.3630063       0.35   0.5784
A*D        1      13.3042562      13.3042562       0.74   0.4297
B*C        1      13.1950562      13.1950562       0.73   0.4314
B*D        1      20.7708062      20.7708062       1.15   0.3322
C*D        1       7.9383063       7.9383063       0.44   0.5364

                  Parameter      Standard
Parameter          Estimate         Error    t Value   Pr > |t|
Intercept       79.99937500    1.06172520      75.35     <.0001
A                5.28062500    1.06172520       4.97     0.0042
B                6.22312500    1.06172520       5.86     0.0020
C                1.20687500    1.06172520       1.14     0.3072
D                1.07187500    1.06172520       1.01     0.3590
A*B              5.70187500    1.06172520       5.37     0.0030
A*C              0.63062500    1.06172520       0.59     0.5784
A*D             -0.91187500    1.06172520      -0.86     0.4297
B*C              0.90812500    1.06172520       0.86     0.4314
B*D             -1.13937500    1.06172520      -1.07     0.3322
C*D              0.70437500    1.06172520       0.66     0.5364

Figure 15.7: SAS printout for data of Case Study 15.1.
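The pooled analysis in Figure 15.7 can be mimicked directly: each effect's sum of squares is (contrast)²/16, and the five three- and four-factor interaction sums of squares are pooled into a 5-degree-of-freedom error term. The Python sketch below reproduces those F-tests for the Case Study 15.1 data; scipy is used only to supply the P-values.

import numpy as np
from scipy.stats import f as f_dist

y = {"(1)": 72.68, "a": 71.74, "b": 76.09, "ab": 93.19,
     "c": 71.25, "ac": 70.59, "bc": 70.92, "abc": 104.96,
     "d": 73.52, "ad": 75.97, "bd": 74.28, "abd": 92.87,
     "cd": 79.34, "acd": 75.12, "bcd": 79.67, "abcd": 97.80}
factors = "ABCD"

def contrast(name):
    s = 0.0
    for lab, resp in y.items():
        sign = 1
        for fac in name:
            sign *= 1 if fac.lower() in lab else -1
        s += sign * resp
    return s

names = ["".join(f for i, f in enumerate(factors) if m >> i & 1)
         for m in range(1, 16)]
ss = {nm: contrast(nm) ** 2 / 16 for nm in names}   # SS = contrast^2 / 2^k

# Pool third- and fourth-order interactions into error (5 df).
pooled = [nm for nm in names if len(nm) >= 3]
sse = sum(ss[nm] for nm in pooled)
mse = sse / len(pooled)
print(f"Error: SS = {sse:.4f}, df = {len(pooled)}, MSE = {mse:.4f}")

for nm in [n for n in names if len(n) <= 2]:
    f_val = ss[nm] / mse                    # each effect carries 1 df
    p = f_dist.sf(f_val, 1, len(pooled))
    print(f"{nm:4s} SS = {ss[nm]:12.4f}  F = {f_val:6.2f}  P = {p:.4f}")

The printed sums of squares, F-ratios, and P-values agree with the Type III table in Figure 15.7 (e.g., SS(A) = 446.16 with F = 24.74).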
Exercises
15.1 The following data are obtained from a 2³ factorial experiment replicated three times. Evaluate the sums of squares for all factorial effects by the contrast method. Draw conclusions.

Treatment
Combination    Rep 1    Rep 2    Rep 3
(1)            12       19       10
a              15       20       16
b              24       16       17
ab             23       17       27
c              17       25       21
ac             16       19       19
bc             24       23       29
abc            28       25       20

15.2 In an experiment conducted by the Mining Engineering Department at Virginia Tech to study a particular filtering system for coal, a coagulant was added to a solution in a tank containing coal and sludge, which was then placed in a recirculation system in order that the coal could be washed. Three factors were varied in the experimental process:

Factor A: percent solids circulated initially in the overflow
Factor B: flow rate of the polymer
Factor C: pH of the tank

The amount of solids in the underflow of the cleansing system determines how clean the coal has become. Two levels of each factor were used, and two experimental runs were made for each of the 2³ = 8 combinations. The response measurements, in percent solids by weight in the underflow of the circulation system, are as specified in the following table:

Treatment              Response
Combination    Replication 1    Replication 2
(1)             4.65             5.81
a              21.42            21.35
b              12.66            12.56
ab             18.27            16.62
c               7.93             7.88
ac             13.18            12.87
bc              6.51             6.26
abc            18.23            17.83

Assuming that all interactions are potentially important, do a complete analysis of the data. Use P-values in your conclusion.

15.3 In a metallurgy experiment, it is desired to test the effect of four factors and their interactions on the concentration (percent by weight) of a particular phosphorus compound in casting material. The variables are A, percent phosphorus in the refinement; B, percent remelted material; C, fluxing time; and D, holding time. The four factors are varied in a 2⁴ factorial experiment with two castings taken at each factor combination. The 32 castings were made in random order. The following table shows the data, and an ANOVA table is given in Figure 15.8. Discuss the effects of the factors and their interactions on the concentration of the phosphorus compound.

Weight % of Phosphorus Compound

Treatment
Combination    Rep 1    Rep 2    Total
(1)            30.3     28.6      58.9
a              28.5     31.4      59.9
b              24.5     25.6      50.1
ab             25.9     27.2      53.1
c              24.8     23.4      48.2
ac             26.9     23.8      50.7
bc             24.8     27.8      52.6
abc            22.2     24.9      47.1
d              31.7     33.5      65.2
ad             24.6     26.2      50.8
bd             27.6     30.6      58.2
abd            26.3     27.8      54.1
cd             29.9     27.7      57.6
acd            26.8     24.2      51.0
bcd            26.4     24.9      51.3
abcd           26.9     29.3      56.2
Total         428.1    436.9     865.0

15.4 A preliminary experiment is conducted to study the effects of four factors and their interactions on the output of a certain machining operation. Two runs are made at each of the treatment combinations in order to supply a measure of pure experimental error. Two levels of each factor are used, resulting in the data shown after Figure 15.8. Make tests on all main effects and interactions at the 0.05 level of significance. Draw conclusions.

Source of          Effects    Sum of     Degrees of    Mean      Computed
Variation                     Squares    Freedom       Square    f         P-Value
Main effect:
  A                −1.2000     11.52        1           11.52      4.68    0.0459
  B                −1.2250     12.01        1           12.01      4.88    0.0421
  C                −2.2250     39.61        1           39.61     16.10    0.0010
  D                 1.4875     17.70        1           17.70      7.20    0.0163
Two-factor interaction:
  AB                0.9875      7.80        1            7.80      3.17    0.0939
  AC                0.6125      3.00        1            3.00      1.22    0.2857
  AD               −1.3250     14.05        1           14.05      5.71    0.0295
  BC                1.1875     11.28        1           11.28      4.59    0.0480
  BD                0.6250      3.13        1            3.13      1.27    0.2763
  CD                0.7000      3.92        1            3.92      1.59    0.2249
Three-factor interaction:
  ABC              −0.5500      2.42        1            2.42      0.98    0.3360
  ABD               1.7375     24.15        1           24.15      9.82    0.0064
  ACD               1.4875     17.70        1           17.70      7.20    0.0163
  BCD              −0.8625      5.95        1            5.95      2.42    0.1394
Four-factor interaction:
  ABCD              0.7000      3.92        1            3.92      1.59    0.2249
Error                          39.36       16            2.46
Total                         217.51       31

Figure 15.8: ANOVA table for Exercise 15.3.
The data for Exercise 15.4 are as follows:

Treatment
Combination    Replicate 1    Replicate 2
(1)             7.9            9.6
a               9.1           10.2
b               8.6            5.8
ab             10.4           12.0
c               7.1            8.3
ac             11.1           12.3
bc             16.4           15.5
abc             7.1            8.7
d              12.6           15.2
ad              4.7            5.8
bd              7.4           10.9
abd            21.9           21.9
cd              9.8            7.8
acd            13.8           11.2
bcd            10.2           11.1
abcd           12.8           14.3

15.5 In the study An X-Ray Fluorescence Method for Analyzing Polybutadiene-Acrylic Acid (PBAA) Propellants (Quarterly Reports, RK-TR-62-1, Army Ordnance Missile Command), an experiment was conducted to determine whether or not there was a significant difference in the amount of aluminum obtained in an analysis under certain levels of certain processing variables. The data are shown below.

        Phys.    Mixing    Blade    Nitrogen
Obs.    State    Time      Speed    Condition    Aluminum
 1      1        1         2        2            16.3
 2      1        2         2        2            16.0
 3      1        1         1        1            16.2
 4      1        2         1        2            16.1
 5      1        1         1        2            16.0
 6      1        2         1        1            16.0
 7      1        2         2        1            15.5
 8      1        1         2        1            15.9
 9      2        1         2        2            16.7
10      2        2         2        2            16.1
11      2        1         1        1            16.3
12      2        2         1        2            15.8
13      2        1         1        2            15.9
14      2        2         1        1            15.9
15      2        2         2        1            15.6
16      2        1         2        1            15.8

The variables in the data are defined as follows.
A: mixing time (level 1: 2 hours; level 2: 4 hours)
B: blade speed (level 1: 36 rpm; level 2: 78 rpm)
C: condition of nitrogen passed over propellant (level 1: dry; level 2: 72% relative humidity)
D: physical state of propellant (level 1: uncured; level 2: cured)

Assuming all three- and four-factor interactions to be negligible, analyze the data. Use a 0.05 level of significance. Write a brief report summarizing the findings.
15.6 It is important to study the effect of the concentration of the reactant and the feed rate on the viscosity of the product from a chemical process. Let the reactant concentration be factor A, at levels 15% and 25%. Let the feed rate be factor B, with levels 20 lb/hr and 30 lb/hr. The experiment involves two experimental runs at each of the four combinations (L = low and H = high). The viscosity readings are as follows.

                  A
B         L             H
L      132, 137     145, 147
H      149, 152     154, 150

(a) Assuming a model containing two main effects and an interaction, calculate the three effects. Do you have any interpretation at this point?
(b) Do an analysis of variance and test for interaction. Give conclusions.
(c) Test for main effects and give final conclusions regarding the importance of all these effects.

15.7 Consider Exercise 15.3. It is of interest to the researcher to learn not only that AD, BC, and possibly AB are important, but also what they mean scientifically. Show two-dimensional interaction plots for all three and give an interpretation.

15.8 Consider Exercise 15.3 once again. Three-factor interactions are often not significant, and even if they are, they are difficult to interpret. The interaction ABD appears to be important. To gain some sense of interpretation, show two AD interaction plots, one for B = −1 and the other for B = +1. From the appearance of these, give an interpretation of the ABD interaction.

15.9 Consider Exercise 15.6. Use a +1 and −1 scaling for "high" and "low," respectively, and do a multiple linear regression with the model

Y_i = β₀ + β₁x_{1i} + β₂x_{2i} + β₁₂x_{1i}x_{2i} + ε_i,

with x_{1i} = reactant concentration (−1, +1) and x_{2i} = feed rate (−1, +1).
(a) Compute the regression coefficients.
(b) How do the coefficients b₁, b₂, and b₁₂ relate to the effects you found in Exercise 15.6(a)?
(c) In your regression analysis, do t-tests on b₁, b₂, and b₁₂. How do these test results relate to those in Exercise 15.6(b) and (c)?

15.10 Consider Exercise 15.5. Compute all 15 effects and do normal probability plots of the effects.
(a) Does it appear as if your assumption of negligible three- and four-factor interactions has merit?
(b) Are the results of the effect plots consistent with what you communicated about the importance of main effects and two-factor interactions in your summary report?

15.11 In Myers, Montgomery, and Anderson-Cook (2009), a data set is discussed in which a 2³ factorial is used by an engineer to study the effects of cutting speed (A), tool geometry (B), and cutting angle (C) on the life (in hours) of a machine tool. Two levels of each factor are chosen, and duplicates are run at each design point, with the order of the runs being random. The data are presented here.

         A    B    C    Life
(1)      −    −    −    22, 31
a        +    −    −    32, 43
b        −    +    −    35, 34
ab       +    +    −    35, 47
c        −    −    +    44, 45
ac       +    −    +    40, 37
bc       −    +    +    60, 50
abc      +    +    +    39, 41

(a) Calculate all seven effects. Which appear, based on their magnitude, to be important?
(b) Do an analysis of variance and observe the P-values.
(c) Do your results in (a) and (b) agree?
(d) The engineer felt confident that cutting speed and cutting angle should interact. If this interaction is significant, draw an interaction plot and discuss the engineering meaning of the interaction.

15.12 Consider Exercise 15.11. Suppose there was some experimental difficulty in making the runs. In fact, the total experiment had to be halted after only 4 runs. As a result, the abbreviated experiment is given by

         Life
a        43
b        35
c        44
abc      39

With only these runs, we have the signs for contrasts given by

         A    B    C    AB   AC   BC   ABC
a        +    −    −    −    −    +    +
b        −    +    −    −    +    −    +
c        −    −    +    +    −    −    +
abc      +    +    +    +    +    +    +

Comment. In your comments, determine whether or not the contrasts are orthogonal. Which are and which are not? Are the main effects orthogonal to each other? In this abbreviated experiment (called a fractional factorial), can we study interactions independent of main effects? Is it a useful experiment if we are convinced that interactions are negligible? Explain.

15.4 Factorial Experiments in a Regression Setting

Thus far in this chapter, we have mostly confined our discussion of the analysis of data for a 2^k factorial to the method of analysis of variance. The only reference to an alternative analysis resides in Exercise 15.9. Indeed, this exercise introduces much of what motivates the present section. There are situations in which model fitting is important and the factors under study can be controlled. For example, a biologist may wish to study the growth of a certain type of algae in the water, and so a model that relates units of algae to the amount of a pollutant and, say, time would be very helpful. Thus, the study involves a factorial experiment in a laboratory setting in which the concentration of the pollutant and time are the factors. As we shall discuss later in this section, a more precise model can be fitted if the factors are controlled in a factorial array, with the 2^k factorial often being a useful choice. In many biological and chemical processes, the levels of the regressor variables can and should be controlled.

Recall that the regression model employed in Chapter 12 can be written in matrix notation as

y = Xβ + ε.

The X matrix is referred to as the model matrix. Suppose, for example, that a 2³ factorial experiment is employed with the variables

Temperature:       150°C     200°C
Humidity:          15%       20%
Pressure (psi):    1000      1500

The familiar +1, −1 levels can be generated through the following centering and scaling to design units:

x₁ = (temperature − 175)/25,   x₂ = (humidity − 17.5)/2.5,   x₃ = (pressure − 1250)/250.

As a result, the X matrix becomes

        1     x₁    x₂    x₃       Design Identification
        1    −1    −1    −1        (1)
        1     1    −1    −1        a
        1    −1     1    −1        b
X =     1    −1    −1     1        c
        1     1     1    −1        ab
        1     1    −1     1        ac
        1    −1     1     1        bc
        1     1     1     1        abc

It is now seen that the contrasts illustrated and discussed in Section 15.2 are directly related to the regression coefficients. Notice that all the columns of the X matrix in our 2³ example are orthogonal. As a result, the computation of the regression coefficients, as described in Section 12.3, becomes

b = [b₀, b₁, b₂, b₃]′ = (X′X)⁻¹X′y = (1/8) X′y
  = (1/8) [ (1) + a + b + c + ab + ac + bc + abc,
            a + ab + ac + abc − (1) − b − c − bc,
            b + ab + bc + abc − (1) − a − c − ac,
            c + ac + bc + abc − (1) − a − b − ab ]′,

where a, ab, and so on, are response measures.

One can now see that the notion of calculated main effects, which has been emphasized throughout this chapter for 2^k factorials, is related to the coefficients in a fitted regression model when the factors are quantitative. In fact, for a 2^k with, say, n experimental runs per design point, the relationships between effects and regression coefficients are as follows:

Effect = contrast/(2^(k−1) n),
Regression coefficient = contrast/(2^k n) = effect/2.

This relationship should make sense to the reader, since a regression coefficient b_j is an average rate of change in the response per unit change in x_j. Of course, as one goes from −1 to +1 in x_j (low to high), the design variable changes by 2 units.
Example 15.2: Consider an experiment in which an engineer desires to fit a linear regression of yield y against holding time x₁ and flexing time x₂ in a certain chemical system. All other factors are held fixed. The data in the natural units are given in Table 15.8. Estimate the multiple linear regression model.

Table 15.8: Data for Example 15.2

Holding Time (hr)    Flexing Time (hr)    Yield (%)
0.5                  0.10                 28
0.8                  0.10                 39
0.5                  0.20                 32
0.8                  0.20                 46

Solution: The fitted regression model is

ŷ = b₀ + b₁x₁ + b₂x₂.

The design units are

x₁ = (holding time − 0.65)/0.15,   x₂ = (flexing time − 0.15)/0.05,

and the X matrix is

        1    x₁    x₂
        1   −1    −1
        1    1    −1
        1   −1     1
        1    1     1

with the regression coefficients

b = [b₀, b₁, b₂]′ = (X′X)⁻¹X′y
  = (1/4) [ (1) + a + b + ab,  a + ab − (1) − b,  b + ab − (1) − a ]′
  = [36.25, 6.25, 2.75]′.

Thus, the least squares regression equation is

ŷ = 36.25 + 6.25x₁ + 2.75x₂.

This example provides an illustration of the use of the two-level factorial experiment in a regression setting. The four experimental runs in the 2² design were used to calculate a regression equation, with the obvious interpretation of the regression coefficients. The value b₁ = 6.25 represents the estimated increase in response (percent yield) per design-unit change (0.15 hour) in holding time. The value b₂ = 2.75 represents a similar rate of change for flexing time.
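The least squares computation in Example 15.2 is small enough to verify directly; a minimal numpy sketch, using the Table 15.8 data in design units:

import numpy as np

X = np.array([[1, -1, -1],
              [1,  1, -1],
              [1, -1,  1],
              [1,  1,  1]], dtype=float)
y = np.array([28.0, 39.0, 32.0, 46.0])   # yields from Table 15.8

# X has orthogonal columns, so (X'X)^{-1} X'y reduces to X'y / 4.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # [36.25  6.25  2.75]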
Interaction in the Regression Model

The interaction contrasts discussed in Section 15.2 have definite interpretations in the regression context. In fact, interactions are accounted for in regression models by product terms. For example, in Example 15.2, the model with interaction is

ŷ = b₀ + b₁x₁ + b₂x₂ + b₁₂x₁x₂,

with b₀, b₁, and b₂ as before and

b₁₂ = [ab + (1) − a − b]/4 = (46 + 28 − 39 − 32)/4 = 0.75.

Thus, the regression equation expressing two linear main effects and an interaction is

ŷ = 36.25 + 6.25x₁ + 2.75x₂ + 0.75x₁x₂.
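Extending the previous sketch with the product column x₁x₂ recovers the interaction coefficient; because the added column is orthogonal to the others, b₀, b₁, and b₂ are unchanged.

import numpy as np

X = np.array([[1, -1, -1,  1],
              [1,  1, -1, -1],
              [1, -1,  1, -1],
              [1,  1,  1,  1]], dtype=float)   # last column is x1*x2
y = np.array([28.0, 39.0, 32.0, 46.0])

b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # [36.25  6.25  2.75  0.75]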
The regression context provides a framework in which the reader should better understand the advantage of orthogonality that is enjoyed by the 2^k factorial. In Section 15.2, the merits of orthogonality were discussed from the point of view of the analysis of variance of the data in a 2^k factorial experiment, and it was pointed out that orthogonality among effects leads to independence among the sums of squares. Of course, the presence of regression variables certainly does not rule out the use of analysis of variance; in fact, f-tests are conducted just as they were described in Section 15.2. A distinction must be made, however: in the case of ANOVA, the hypotheses involve population means, while in the regression case, the hypotheses involve regression coefficients.

For instance, consider the experimental design in Exercise 15.2 on page 609. Each factor is continuous. Suppose that the levels are

A (x₁): percent solids     20%         40%
B (x₂): flow rate          5 lb/sec    10 lb/sec
C (x₃): pH                 5           5.5

and we have, for design levels,

x₁ = (% solids − 30)/10,   x₂ = (flow rate − 7.5)/2.5,   x₃ = (pH − 5.25)/0.25.

Suppose that it is of interest to fit a multiple regression model in which all linear coefficients and available interactions are to be considered. In addition, the engineer wants to obtain some insight into what levels of the factors will maximize cleansing (i.e., maximize the response). This problem will be the subject of Case Study 15.2.
Case Study 15.2: Coal Cleansing Experiment¹: Figure 15.9 represents annotated computer printout for the regression analysis for the fitted model

ŷ = b₀ + b₁x₁ + b₂x₂ + b₃x₃ + b₁₂x₁x₂ + b₁₃x₁x₃ + b₂₃x₂x₃ + b₁₂₃x₁x₂x₃,

where x₁, x₂, and x₃ are percent solids, flow rate, and pH of the system, respectively. The computer system used is SAS PROC REG.

Note the parameter estimates, standard errors, and P-values in the printout. The parameter estimates represent coefficients in the model. All model coefficients are significant except that of the x₂x₃ term (the BC interaction). Note also that residuals, confidence intervals, and prediction intervals appear, as discussed in the regression material in Chapters 11 and 12.

The reader can use the values of the model coefficients and the predicted values from the printout to ascertain what combination of the factors results in maximum cleansing efficiency. Factor A (percent solids circulated) has a large positive coefficient, suggesting a high value for percent solids. In addition, a low value for factor C (pH of the tank) is suggested. Though the B main effect (flow rate of the polymer) coefficient is positive, the rather large positive coefficient of x₁x₂x₃ (ABC) suggests that flow rate should be at the low level to enhance efficiency. Indeed, the regression model generated in the SAS printout suggests that the combination of factors that may produce optimum results, or perhaps suggest a direction for further experimentation, is given by

A: high level
B: low level
C: low level

Dependent Variable: Y

                      Analysis of Variance
                          Sum of          Mean
Source           DF      Squares        Square    F Value    Pr > F
Model             7    490.23499      70.03357     254.43    <.0001
Error             8      2.20205       0.27526
Corrected Total  15    492.43704

Root MSE           0.52465    R-Square    0.9955
Dependent Mean    12.75188    Adj R-Sq    0.9916
Coeff Var          4.11429

                   Parameter Estimates
              Parameter     Standard
Variable   DF  Estimate        Error    t Value    Pr > |t|
Intercept   1  12.75188      0.13116      97.22      <.0001
A           1   4.71938      0.13116      35.98      <.0001
B           1   0.86563      0.13116       6.60      0.0002
C           1  -1.41563      0.13116     -10.79      <.0001
AB          1  -0.59938      0.13116      -4.57      0.0018
AC          1  -0.52813      0.13116      -4.03      0.0038
BC          1   0.00562      0.13116       0.04      0.9668
ABC         1   2.23063      0.13116      17.01      <.0001

[The printout also lists, for each of the 16 runs, the observed value, the predicted value with its standard error, 95% confidence and prediction limits, and the residual.]

Figure 15.9: SAS printout for data of Case Study 15.2.

¹See Exercise 15.2.

15.5 The Orthogonal Design

In experimental situations where it is appropriate to fit models that are linear in the design variables and possibly involve interaction or product terms, there are advantages gained from the two-level orthogonal design, or orthogonal array. By an orthogonal design we mean one in which there is orthogonality among the columns of the X matrix. For example, consider the X matrix for the 2² factorial of Example 15.2. Notice that all three columns are mutually orthogonal. The X matrix for the 2³ factorial also contains orthogonal columns. The 2³ factorial with interactions would yield an X matrix of the type

     1    x₁    x₂    x₃    x₁x₂    x₁x₃    x₂x₃    x₁x₂x₃
     1   −1    −1    −1      1       1       1       −1
     1    1    −1    −1     −1      −1       1        1
     1   −1     1    −1     −1       1      −1        1
     1   −1    −1     1      1      −1      −1        1
     1    1     1    −1      1      −1      −1       −1
     1    1    −1     1     −1       1      −1       −1
     1   −1     1     1     −1      −1       1       −1
     1    1     1     1      1       1       1        1

The outline of degrees of freedom is

Source            d.f.
Regression          3
Lack of fit         4    (x₁x₂, x₁x₃, x₂x₃, x₁x₂x₃)
Error (pure)        8
Total              15

The 8 degrees of freedom for pure error are obtained from the duplicate runs at each design point. The lack-of-fit degrees of freedom may be viewed as the difference between the number of distinct design points and the number of total model terms; in this case, there are 8 points and 4 model terms.
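The orthogonality claim is easy to verify numerically: building the full eight-column model matrix for a duplicated 2³ design (16 rows) and forming X′X gives 16 times the identity. A small numpy sketch:

import numpy as np
from itertools import product

# Full model matrix for a 2^3 factorial: intercept, x1, x2, x3,
# all two-factor products, and the three-factor product.
rows = []
for x1, x2, x3 in product([-1, 1], repeat=3):
    rows.append([1, x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])
X = np.array(rows * 2, dtype=float)        # duplicated runs: 16 x 8

XtX = X.T @ X
print(np.allclose(XtX, 16 * np.eye(8)))    # True: the columns are orthogonal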
Standard Error of Coefficients and T-Tests

In previous sections, we showed how the designer of an experiment may exploit the notion of orthogonality to design a regression experiment with coefficients that attain minimum variance on a per cost basis. We should be able to make use of our exposure to regression in Section 12.4 to compute estimates of the variances of the coefficients and hence their standard errors. It is also of interest to note the relationship between the t-statistic on a coefficient and the F-statistic described and illustrated in previous chapters.

Recall from Section 12.4 that the variances and covariances of the coefficients appear in A⁻¹, or, in terms of the present notation, the variance-covariance matrix of the coefficients is

σ²A⁻¹ = σ²(X′X)⁻¹.

In the case of the 2^k factorial experiment, the columns of X are mutually orthogonal, imposing a very special structure. In general, for the 2^k we can write

X = [1   x₁   x₂   ···   x_k   x₁x₂   ···],

where each column contains 2^k n entries (n being the number of replicate runs at each design point), all equal to ±1 apart from the column of ones. Thus, formation of X′X yields

X′X = 2^k n I_p,

where I_p is the identity matrix of dimension p, the number of model parameters.

Example 15.3: Consider a 2³ factorial design with duplicated runs fitted to the model

E(Y) = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₁₂x₁x₂ + β₁₃x₁x₃ + β₂₃x₂x₃.

Give expressions for the standard errors of the least squares estimates of b₀, b₁, b₂, b₃, b₁₂, b₁₃, and b₂₃.

Solution: The model matrix is

        1    x₁    x₂    x₃    x₁x₂    x₁x₃    x₂x₃
(1)     1   −1    −1    −1      1       1       1
a       1    1    −1    −1     −1      −1       1
b       1   −1     1    −1     −1       1      −1
c       1   −1    −1     1      1      −1      −1
ab      1    1     1    −1      1      −1      −1
ac      1    1    −1     1     −1       1      −1
bc      1   −1     1     1     −1      −1       1
abc     1    1     1     1      1       1       1

with each row viewed as being repeated (i.e., each observation is duplicated). As a result,

X′X = 16 I₇,  and thus  (X′X)⁻¹ = (1/16) I₇.

From the foregoing it should be clear that the variances of all coefficients for a 2^k factorial with n runs at each design point are

Var(b_j) = σ²/(2^k n),

and, of course, all covariances are zero. As a result, the standard errors of the coefficients are calculated as

s_{b_j} = s √(1/(2^k n)),

where s is found from the square root of the mean square error (hopefully obtained from adequate replication). Thus, in our case with the 2³,

s_{b_j} = s √(1/16) = s/4.
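Both the standard-error formula and the t-versus-F relationship discussed just below can be checked numerically. A sketch with hypothetical duplicated 2³ responses (the generating coefficients and noise are invented for illustration):

import numpy as np
from itertools import product

rng = np.random.default_rng(1)
pts = list(product([-1, 1], repeat=3))
X = np.array([[1, x1, x2, x3, x1*x2, x1*x3, x2*x3]
              for (x1, x2, x3) in pts for _ in range(2)], dtype=float)
# Hypothetical duplicated responses
y = X @ np.array([50, 3, -2, 1, 0.5, 0, 0]) + rng.normal(0, 1, 16)

b = np.linalg.solve(X.T @ X, X.T @ y)     # X'X = 16 I, so b = X'y / 16
resid = y - X @ b
s2 = resid @ resid / (16 - 7)             # mean square error, 9 df
s_b = np.sqrt(s2 / 16)                    # s_bj = s * sqrt(1/(2^k n)) = s/4

t = b[1] / s_b                            # t-statistic for the x1 coefficient
contrast = X[:, 1] @ y                    # A contrast from the +-1 column
F = contrast**2 / (16 * s2)               # F = (contrast)^2 / (2^k n s^2)
print(round(t**2 - F, 10))                # 0.0: t^2 equals F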
As we might expect, the only difference between the use of t and F in assessing significance lies in the fact that the t-statistic indicates the sign, or direction, of the effect of the coefficient. It would appear that the 2k factorial plan would handle many practical situa- tions in which regression models are fitted. It can accommodate linear and inter- action terms, providing optimal estimates of all coefficients (from a variance point of view). However, when k is large, the number of design points required is very large. Often, portions of the total design can be used and still allow orthogonality with all its advantages. These designs are discussed in Section 15.6. 620 Chapter 15 2k Factorial Experiments and Fractions A More Thorough Look at the Orthogonality Property in the 2k Factorial We have learned that for the case of the 2k factorial all the information that is delivered to the analyst about the main effects and interactions is in the form of contrasts. These “2k − 1 pieces of information” carry a single degree of freedom apiece and they are independent of each other. In an analysis of variance, they manifest themselves as effects, whereas if a regression model is being constructed, the effects turn out to be regression coefficients, apart from a factor of 2. With either form of analysis, significance tests can be carried out and the t-test for a given effect is numerically the same as that for the corresponding regression coefficient. In the case of ANOVA, variable screening and scientific interpretation of interactions are important, whereas in the case of a regression analysis, a model may be used to predict response and/or determine which factor/level combinations are optimum (e.g. maximize yield or maximize cleaning efficiency, as in the case of Case Study 15.2). It turns out that the orthogonality property is important whether the analysis is to be ANOVA or regression. The orthogonality among the columns of X, the model matrix in, say, Example 15.3, provides special conditions that have an im- portant impact on the variance of effects or regression coefficients. In fact, it has already become apparent that the orthogonal design results in equality of variance for all effects or coefficients. Thus, in this way, the precision, for purposes of estimation or testing, is the same for all coefficients, main effects, or interac- tions. In addition, if the regression model contains only linear terms and thus only main effects are of interest, the following conditions result in the minimization of variances of all effects (or, correspondingly, first-order regression coefficients). If the regression model contains terms no higher than first order, and if the Minimum ranges on the variables are given by xj ∈ [−1,+1] for j = 1,2,...,k, then Conditions for Variances of Coefficients Var(bj )/σ2, for j = 1, 2, . . . , k, is minimized if the design is orthogonal and all xi levels in the design are at ±1 for i = 1,2,...,k. Thus, in terms of coefficients of model terms or main effects, orthogonality in the 2k is a very desirable property. Another approach to a better understanding of the “balance” provided by the 23 is to look at the situation graphically. All of the contrasts that are orthogonal and thus mutually independent are shown graphically in Figure 15.10. 
In the graphs, the planes of the squares whose vertices contain the responses labeled "+" are compared to those containing the responses labeled "−." Those given in (a) show contrasts for main effects and should be obvious to the reader. Those in (b) show the planes representing "+" vertices and "−" vertices for the three two-factor interaction contrasts. In (c), we see the geometric representation of the contrast for the three-factor (ABC) interaction.

Figure 15.10: Geometric presentation of contrasts for the 2³ factorial design: (a) main effects; (b) two-factor interactions (AB, AC, BC); (c) three-factor interaction (ABC).

Center Runs with 2^k Designs

In the situation in which the 2^k design is implemented with continuous design variables and one is seeking to fit a linear regression model, the use of replicated runs in the design center can be extremely useful. In fact, quite apart from the advantages that will be discussed in what follows, a majority of scientists and engineers would consider center runs (i.e., the runs at x_i = 0 for i = 1, 2, ..., k) not only a reasonable practice but something that is intuitively appealing. In many areas of application of the 2^k design, the scientist desires to determine if he or she might benefit from moving to a different region of interest in the factors. In many cases, the center (i.e., the point (0, 0, ..., 0) in the coded factors) is often either the current operating condition of the process or at least the condition that is considered "currently optimum." So it is often the case that the scientist will require data on the response at the center.

Center Runs and Lack of Fit

In addition to the intuitive appeal of the augmentation of the 2^k with center runs, a second advantage is enjoyed that relates to the kind of model that is fitted to the data. Consider, for example, the case with k = 2, illustrated in Figure 15.11.

Figure 15.11: A 2² design with center runs: factorial points at the ±1 corners of A(x₁) and B(x₂), plus replicated runs at (0, 0).

It is clear that without the center runs the model terms are the intercept, x₁, x₂, and x₁x₂. These account for the four model degrees of freedom delivered by the four design points, apart from any replication. Since each factor has response information available only at two locations, {−1, +1}, no "pure" second-order curvature terms can be accommodated in the model (i.e., x₁² or x₂²). But the information at (0, 0) produces an additional model degree of freedom. While this important degree of freedom does not allow both x₁² and x₂² to be used in the model, it does allow for testing the significance of a linear combination of x₁² and x₂². For n_c center runs, there are then n_c − 1 degrees of freedom available for replication or "pure" error. This allows an estimate of σ² for testing the model terms and significance of the 1 d.f. for quadratic lack of fit. The concept here is very much like that discussed in the lack-of-fit material in Chapter 11.

In order to gain a complete understanding of how the lack-of-fit test works, assume that for k = 2 the true model contains the full second-order complement of terms, including x₁² and x₂². In other words,

    E(Y) = β₀ + β₁x₁ + β₂x₂ + β₁₂x₁x₂ + β₁₁x₁² + β₂₂x₂².

Now, consider the contrast ȳ_f − ȳ₀, where ȳ_f is the average response at the factorial locations and ȳ₀ is the average response at the center point.
It can be shown easily (see Review Exercise 15.46) that

    E(ȳ_f − ȳ₀) = β₁₁ + β₂₂,

and, in fact, for the general case with k factors,

    E(ȳ_f − ȳ₀) = Σ_{i=1}^{k} β_ii.

As a result, the lack-of-fit test is a simple t-test (or F = t²) with

    t_{n_c − 1} = (ȳ_f − ȳ₀)/s_{ȳ_f − ȳ₀} = (ȳ_f − ȳ₀)/√(MSE(1/n_f + 1/n_c)),

where n_f is the number of factorial points and MSE is simply the sample variance of the response values at (0, 0, ..., 0).

Example 15.5: This example is taken from Myers, Montgomery, and Anderson-Cook (2009). A chemical engineer is attempting to model the percent conversion in a process. There are two variables of interest, reaction time and reaction temperature. In an attempt to arrive at the appropriate model, a preliminary experiment was conducted in a 2² factorial using the current region of interest in reaction time and temperature. Single runs were made at each of the four factorial points, and five runs were made at the design center in order that a lack-of-fit test for curvature could be conducted. Figure 15.12 shows the design region and the experimental runs on yield. The time and temperature readings at the center are, of course, 35 minutes and 145°C.

Figure 15.12: 2² factorial with 5 center runs. The factorial yields are 39.3 (30 min, 130°C), 40.9 (40 min, 130°C), 40.0 (30 min, 160°C), and 41.5 (40 min, 160°C); the center runs (35 min, 145°C) gave 40.3, 40.5, 40.7, 40.2, and 40.6.

The estimates of the main effects and single interaction coefficient are computed through contrasts, just as before. The center runs play no role in the computation of b₁, b₂, and b₁₂. This should be intuitively reasonable to the reader. The intercept is merely ȳ for the entire experiment. This value is ȳ = 40.4444. The standard errors are found through the use of diagonal elements of (X′X)⁻¹, as discussed earlier. For this case, the model matrix X contains the columns 1, x₁, x₂, and x₁x₂, with the four factorial rows followed by five center rows in which only the intercept column is nonzero.

After the computations, we have

    b₀ = 40.4444,  s_b0 = 0.06231,  t_b0 = 649.07,
    b₁ = 0.7750,   s_b1 = 0.09347,  t_b1 = 8.29,
    b₂ = 0.3250,   s_b2 = 0.09347,  t_b2 = 3.48,
    b₁₂ = −0.0250, s_b12 = 0.09347, t_b12 = −0.27 (P = 0.800).

The contrast ȳ_f − ȳ₀ = 40.425 − 40.46 = −0.035, and the t-statistic that tests for curvature is given by

    t = (40.425 − 40.46)/√(0.0430(1/4 + 1/5)) = −0.25 (P = 0.814).

As a result, it appears as if the appropriate model should contain only first-order terms (apart from the intercept).

An Intuitive Look at the Test on Curvature

If one considers the simple case of a single design variable with runs at −1 and +1, it should seem clear that the average response at −1 and +1 should be close to the response at 0, the center, if the model is first order in nature. Any deviation would certainly suggest curvature. This is simple to extend to two variables. Consider Figure 15.13.

Figure 15.13: 2² factorial with runs at (0, 0). The figure shows the plane on y that passes through the y-values of the factorial points. This is the plane that would represent the perfect fit for the model containing x₁, x₂, and x₁x₂. If the model contains no quadratic curvature (i.e., β₁₁ = β₂₂ = 0), we would expect the response at (0, 0) to be at or near the plane. If the response is far away from the plane, as in the case of Figure 15.13, then it can be seen graphically that quadratic curvature is present.
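The curvature test is easy to reproduce numerically. Here is a minimal sketch (Python with NumPy and SciPy, our own tooling choice) applying the t-test above to the yields of Example 15.5; it recovers t ≈ −0.25 with P ≈ 0.81.

    import numpy as np
    from scipy import stats

    # Example 15.5: four factorial yields and five center-run yields.
    y_fact = np.array([39.3, 40.9, 40.0, 41.5])
    y_ctr = np.array([40.3, 40.5, 40.7, 40.2, 40.6])

    nf, nc = len(y_fact), len(y_ctr)
    mse = y_ctr.var(ddof=1)            # pure error from the center replicates
    t = (y_fact.mean() - y_ctr.mean()) / np.sqrt(mse * (1/nf + 1/nc))
    p = 2 * stats.t.sf(abs(t), df=nc - 1)
    print(round(t, 3), round(p, 3))    # approximately -0.252 and 0.814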
Exercises

15.13 Consider a 2⁵ experiment where the experimental runs are on 4 different machines. Use the machines as blocks, and assume that all main effects and two-factor interactions may be important.
(a) Which runs would be made on each of the 4 machines?
(b) Which effects are confounded with blocks?

15.14 An experiment is described in Myers, Montgomery, and Anderson-Cook (2009) in which optimum conditions are sought for storing bovine semen to obtain maximum survival. The variables are percent sodium citrate, percent glycerol, and equilibration time in hours. The response is percent survival of the motile spermatozoa. The natural levels are found in the above reference. The data, with coded levels for the factorial portion of the design and the center runs, are given here.

    x₁, % Sodium   x₂, %      x₃, Equilibration   y, %
    Citrate        Glycerol   Time                Survival
    −1             −1         −1                  57
     1             −1         −1                  40
    −1              1         −1                  19
     1              1         −1                  40
    −1             −1          1                  54
     1             −1          1                  41
    −1              1          1                  21
     1              1          1                  43
     0              0          0                  63
     0              0          0                  61

(a) Fit a linear regression model to the data and determine which linear and interaction terms are significant. Assume that the x₁x₂x₃ interaction is negligible.
(b) Test for quadratic lack of fit and comment.

15.15 Oil producers are interested in nickel alloys that are strong and corrosion resistant. An experiment was conducted in which yield strengths were compared for nickel alloy tensile specimens charged in solutions of sulfuric acid saturated with carbon disulfide. Two alloys were compared: a 75% nickel composition and a 30% nickel composition. The alloys were tested under two different charging times, 25 and 50 days. A 2³ factorial was conducted with the following factors:

    % sulfuric acid: 4%, 6% (x₁)
    charging time: 25 days, 50 days (x₂)
    nickel composition: 30%, 75% (x₃)

A specimen was prepared for each of the 8 conditions. Since the engineers were not certain of the nature of the model (i.e., whether or not quadratic terms would be needed), a third level (middle level) was incorporated, and 4 center runs were employed using 4 specimens at 5% sulfuric acid, 37.5 days, and 52.5% nickel composition. The following are the yield strengths in kilograms per square inch.

                       25 Days              50 Days
                       Nickel Comp.         Nickel Comp.
    Sulfuric Acid      75%      30%         75%      30%
    4%                 52.5     50.2        47.9     47.4
    6%                 56.5     50.8        47.2     41.7

The center runs gave the following strengths: 51.6, 51.4, 52.4, 52.9.

(a) Test to determine which main effects and interactions should be involved in the fitted model.
(b) Test for quadratic curvature.
(c) If quadratic curvature is significant, how many additional design points are needed to determine which quadratic terms should be included in the model?

15.16 Suppose a second replicate of the experiment in Exercise 15.13 could be performed.
(a) Would a second replication of the blocking scheme of Exercise 15.13 be the best choice?
(b) If the answer to part (a) is no, give the layout for a better choice for the second replicate.
(c) What concept did you use in your design selection?

15.17 Consider Figure 15.14, which represents a 2² factorial with 3 center runs. If quadratic curvature is significant, what additional design points would you select that might allow the estimation of the terms x₁² and x₂²? Explain.

Figure 15.14: Graph for Exercise 15.17 (a 2² factorial with corner points (±1, ±1) and center runs at (0, 0)).

15.6 Fractional Factorial Experiments

The 2^k factorial experiment can become quite demanding, in terms of the number of experimental units required, when k is large. One of the real advantages of this experimental plan is that it allows a degree of freedom for each interaction.
However, in many experimental situations, it is known that certain interactions are negligible, and thus it would be a waste of experimental effort to use the complete factorial experiment. In fact, the experimenter may have an economic constraint that disallows taking observations at all of the 2^k treatment combinations. When k is large, we can often make use of a fractional factorial experiment where perhaps one-half, one-fourth, or even one-eighth of the total factorial plan is actually carried out.

Construction of the 1/2 Fraction

The construction of the half-replicate design is identical to the allocation of the 2^k factorial experiment into two blocks. We begin by selecting a defining contrast that is to be completely sacrificed. We then construct the two blocks accordingly and choose either of them as the experimental plan. A 1/2 fraction of a 2^k factorial is often referred to as a 2^{k−1} design, the latter indicating the number of design points. Our first illustration of a 2^{k−1} will be a 1/2 of a 2³, or a 2^{3−1}, design. In other words, the scientist or engineer cannot use the full complement (i.e., the full 2³ with 8 design points) and hence must settle for a design with only four design points. The question is, of the design points (1), a, b, ab, ac, c, bc, and abc, which four design points would result in the most useful design? The answer, along with the important concepts involved, appears in the table of + and − signs displaying contrasts for the full 2³. Consider Table 15.9.

Table 15.9: Contrasts for the Seven Available Effects for a 2³ Factorial Experiment

              Treatment                     Effects
              Combination   I   A   B   C   AB   AC   BC   ABC
    2^{3−1}   a             +   +   −   −   −    −    +    +
              b             +   −   +   −   −    +    −    +
              c             +   −   −   +   +    −    −    +
              abc           +   +   +   +   +    +    +    +
    2^{3−1}   ab            +   +   +   −   +    −    −    −
              ac            +   +   −   +   −    +    −    −
              bc            +   −   +   +   −    −    +    −
              (1)           +   −   −   −   +    +    +    −

Note that the two 1/2 fractions are {a, b, c, abc} and {ab, ac, bc, (1)}. Note also from Table 15.9 that in both designs ABC has no contrast but all other effects do have contrasts. In one of the fractions we have ABC containing all + signs, and in the other fraction the ABC effect contains all − signs. As a result, we say that the top design in the table is described by ABC = I and the bottom design by ABC = −I. The interaction ABC is called the design generator, and ABC = I (or ABC = −I for the second design) is called the defining relation.

Aliases in the 2^{3−1}

If we focus on the ABC = I design (the upper 2^{3−1}), it becomes apparent that six effects contain contrasts. This produces the initial appearance that all effects can be studied apart from ABC. However, the reader can certainly recall that with only four design points, even if points are replicated, the degrees of freedom available (apart from experimental error) are

    Regression model terms   3
    Intercept                1
    Total                    4

A closer look suggests that the seven effects are not orthogonal, and each contrast is represented in another effect. In fact, using ≡ to signify identical contrasts, we have

    A ≡ BC;   B ≡ AC;   C ≡ AB.

As a result, within a pair an effect cannot be estimated independently of its alias "partner." The effects

    A = (a + abc − b − c)/2   and   BC = (a + abc − b − c)/2

will produce the same numerical result and thus contain the same information. In fact, it is often said that they share a degree of freedom. In truth, the estimated effect actually estimates the sum, namely A + BC. We say that A and BC are aliases, B and AC are aliases, and C and AB are aliases.
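The shared degree of freedom can be seen directly by computing the columns. A small sketch (Python/NumPy, an illustration of the idea rather than anything prescribed by the text) selects the ABC = I half of the 2³ and confirms that the A column coincides with the BC column:

    import numpy as np
    from itertools import product

    # Keep the half of the 2^3 on which the product ABC equals +1.
    full = np.array(list(product([-1, 1], repeat=3)))
    half = full[full.prod(axis=1) == 1]          # the runs a, b, c, abc

    A, B, C = half[:, 0], half[:, 1], half[:, 2]
    print(np.array_equal(A, B * C))              # True: A and BC are aliases
    print(np.array_equal(B, A * C), np.array_equal(C, A * B))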
For the ABC = −I fraction we can observe that the aliases are the same as those for the ABC = I fraction, apart from sign. Thus, we have

    A ≡ −BC;   B ≡ −AC;   C ≡ −AB.

The two fractions appear on corners of the cubes in Figures 15.15(a) and 15.15(b).

Figure 15.15: The 1/2 fractions of the 2³ factorial: (a) the ABC = I fraction {a, b, c, abc}; (b) the ABC = −I fraction {(1), ab, ac, bc}.

How Aliases Are Determined in General

In general, for a 2^{k−1}, each effect, apart from that defined by the generator, will have a single alias partner. The effect defined by the generator will not be aliased by another effect but rather will be aliased with the mean, since the least squares estimator will be the mean. To determine the alias for each effect, one merely begins with the defining relation, say ABC = I for the 2^{3−1}. Then to find, say, the alias for effect A, multiply A by both sides of the equation ABC = I and reduce any exponent by modulo 2. For example,

    A · ABC = A²BC = BC;   thus, BC ≡ A.

In a similar fashion,

    B ≡ B · ABC ≡ AB²C ≡ AC,

and, of course,

    C ≡ C · ABC ≡ ABC² ≡ AB.

Now for the second fraction (i.e., defined by the relation ABC = −I),

    A ≡ −BC;   B ≡ −AC;   C ≡ −AB.

As a result, the numerical value of effect A is actually estimating A − BC. Similarly, the value of B estimates B − AC, and the value of C estimates C − AB.

Formal Construction of the 2^{k−1}

A clear understanding of the concept of aliasing makes it very simple to understand the construction of the 2^{k−1}. We begin with investigation of the 2^{3−1}. There are three factors and four design points required. The procedure begins with a full factorial in k − 1 = 2 factors, A and B. Then a third factor is added according to the desired alias structure. For example, with ABC as the generator, clearly C = ±AB. Thus, C = AB or C = −AB is used to supplement the full factorial in A and B. Table 15.10 illustrates what is a very simple procedure.

Table 15.10: Construction of the Two 2^{3−1} Designs

    Basic 2²      2^{3−1}; ABC = I       2^{3−1}; ABC = −I
    A    B        A    B    C = AB       A    B    C = −AB
    −    −        −    −    +            −    −    −
    +    −        +    −    −            +    −    +
    −    +        −    +    −            −    +    +
    +    +        +    +    +            +    +    −

Note that we saw earlier that ABC = I gives the design points a, b, c, and abc, while ABC = −I gives (1), ac, bc, and ab. Earlier we were able to construct the same designs using the table of contrasts in Table 15.9. However, as the design becomes more complicated with higher fractions, these contrast tables become more difficult to deal with.

Consider now a 2^{4−1} (i.e., a 1/2 of a 2⁴ factorial design) involving factors A, B, C, and D. As in the case of the 2^{3−1}, the highest-order interaction, in this case ABCD, is used as the generator. We must keep in mind that ABCD = I; the defining relation suggests that the information on ABCD is sacrificed. Here we begin with the full 2³ in A, B, and C and form D = ±ABC to generate the two 2^{4−1} designs. Table 15.11 illustrates the construction of both designs.

Table 15.11: Construction of the Two 2^{4−1} Designs

    Basic 2³         2^{4−1}; ABCD = I         2^{4−1}; ABCD = −I
    A   B   C        A   B   C   D = ABC       A   B   C   D = −ABC
    −   −   −        −   −   −   −             −   −   −   +
    +   −   −        +   −   −   +             +   −   −   −
    −   +   −        −   +   −   +             −   +   −   −
    +   +   −        +   +   −   −             +   +   −   +
    −   −   +        −   −   +   +             −   −   +   −
    +   −   +        +   −   +   −             +   −   +   +
    −   +   +        −   +   +   −             −   +   +   +
    +   +   +        +   +   +   +             +   +   +   −

Here, using the notation a, b, c, and so on, we have the following designs:

    ABCD = I:   (1), ad, bd, ab, cd, ac, bc, abcd
    ABCD = −I:  d, a, b, abd, c, acd, bcd, abc.
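The table-based construction is mechanical enough to automate. The sketch below (Python, our illustration) builds both 2^{4−1} fractions by adjoining D = ±ABC to a full 2³ and prints the treatment-combination labels; the two printed sets match those listed above.

    import numpy as np
    from itertools import product

    abc = np.array(list(product([-1, 1], repeat=3)))

    def half_fraction(sign):
        # Adjoin D = sign * ABC to the full 2^3 and label each run.
        labels = []
        for row in abc:
            levels = dict(zip("abcd", list(row) + [sign * row.prod()]))
            name = "".join(f for f, v in levels.items() if v == 1)
            labels.append(name or "(1)")
        return labels

    print(half_fraction(+1))   # the set {(1), ad, bd, ab, cd, ac, bc, abcd}
    print(half_fraction(-1))   # the set {d, a, b, abd, c, acd, bcd, abc}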
The aliases in the case of the 2^{4−1} are found as illustrated earlier for the 2^{3−1}. Each effect has a single alias partner and is found by multiplication via the use of the defining relation. For example, the alias for A for the ABCD = I design is given by

    A = A · ABCD = A²BCD = BCD.

The alias for AB is given by

    AB = AB · ABCD = A²B²CD = CD.

As we can observe easily, main effects are aliased with three-factor interactions and two-factor interactions are aliased with other two-factor interactions. A complete listing is given by

    A = BCD    AB = CD
    B = ACD    AC = BD
    C = ABD    AD = BC
    D = ABC

Construction of the 1/4 Fraction

In the case of the 1/4 fraction, two interactions are selected to be sacrificed rather than one, and a third results from finding the generalized interaction of the selected two. Note that this is very much like the construction of four blocks. The fraction used is simply one of the blocks. A simple example aids a great deal in seeing the connection to the construction of the 1/4 fraction. Consider the construction of 1/4 of a 2⁵ factorial (i.e., a 2^{5−2}), with factors A, B, C, D, and E. One procedure that avoids the confounding of two main effects is the choice of ABD and ACE as the interactions that correspond to the two generators, giving ABD = I and ACE = I as the defining relations. The third interaction sacrificed would then be

    (ABD)(ACE) = A²BCDE = BCDE.

For the construction of the design, we begin with a 2^{5−2} = 2³ factorial in A, B, and C. We use the interactions ABD and ACE to supply the generators, so the 2³ factorial in A, B, and C is supplemented by the factors D = ±AB and E = ±AC. Thus, one of the fractions is given by

    A    B    C    D = AB    E = AC
    −    −    −    +         +         de
    +    −    −    −         −         a
    −    +    −    −         +         be
    +    +    −    +         −         abd
    −    −    +    +         −         cd
    +    −    +    −         +         ace
    −    +    +    −         −         bc
    +    +    +    +         +         abcde

The other three fractions are found by using the generators {D = −AB, E = AC}, {D = AB, E = −AC}, and {D = −AB, E = −AC}.

Consider an analysis of the above 2^{5−2} design. It contains eight design points to study five factors. The aliases for main effects are found by multiplying each effect by the defining relations ABD = I, ACE = I, and BCDE = I:

    A ≡ BD ≡ CE ≡ ABCDE
    B ≡ AD ≡ ABCE ≡ CDE
    C ≡ ABCD ≡ AE ≡ BDE
    D ≡ AB ≡ ACDE ≡ BCE
    E ≡ ABDE ≡ AC ≡ BCD

Aliases for other effects can be found in the same fashion. The breakdown of degrees of freedom (apart from replication) is

    Main effects   5
    Lack of fit    2   (CD = BE, BC = DE)
    Total          7

We list interactions only through degree 2 in the lack of fit.

Consider now the case of a 2^{6−2}, which allows 16 design points to study six factors. Once again two design generators are chosen. A pragmatic choice to supplement a 2^{6−2} = 2⁴ full factorial in A, B, C, and D is to use E = ±ABC and F = ±BCD. The construction is given in Table 15.12.

Table 15.12: A 2^{6−2} Design

    A   B   C   D   E = ABC   F = BCD   Treatment Combination
    −   −   −   −   −         −         (1)
    +   −   −   −   +         −         ae
    −   +   −   −   +         +         bef
    +   +   −   −   −         +         abf
    −   −   +   −   +         +         cef
    +   −   +   −   −         +         acf
    −   +   +   −   −         −         bc
    +   +   +   −   +         −         abce
    −   −   −   +   −         +         df
    +   −   −   +   +         +         adef
    −   +   −   +   +         −         bde
    +   +   −   +   −         −         abd
    −   −   +   +   +         −         cde
    +   −   +   +   −         −         acd
    −   +   +   +   −         +         bcdf
    +   +   +   +   +         +         abcdef

Obviously, with eight more design points than in the 2^{5−2}, the aliases for main effects will not present as difficult a problem. In fact, note that with defining relations ABCE = ±I, BCDF = ±I, and (ABCE)(BCDF) = ADEF = ±I, main effects will be aliased with interactions that are no less complex than those of third order. The alias structure for main effects is written

    A ≡ BCE ≡ ABCDF ≡ DEF,
    B ≡ ACE ≡ CDF ≡ ABDEF,
    C ≡ ABE ≡ BDF ≡ ACDEF,
    D ≡ ABCDE ≡ BCF ≡ AEF,
    E ≡ ABC ≡ BCDEF ≡ ADF,
    F ≡ ABCEF ≡ BCD ≡ ADE,

each with a single degree of freedom. For the two-factor interactions,

    AB ≡ CE ≡ ACDF ≡ BDEF,
    AC ≡ BE ≡ ABDF ≡ CDEF,
    AD ≡ BCDE ≡ ABCF ≡ EF,
    AE ≡ BC ≡ ABCDEF ≡ DF,
    AF ≡ BCEF ≡ ABCD ≡ DE,
    BD ≡ ACDE ≡ CF ≡ ABEF,
    BF ≡ ACEF ≡ CD ≡ ABDE.

Here, of course, there is some aliasing among the two-factor interactions. The remaining 2 degrees of freedom are accounted for by the following groups:

    ABD ≡ CDE ≡ ACF ≡ BEF,   ACD ≡ BDE ≡ ABF ≡ CEF.
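Alias chains follow from nothing more than multiplication with exponents reduced modulo 2, which is the symmetric difference of the letter sets. The following sketch (Python, our illustration) reproduces the main-effect alias structure of the 2^{6−2} just listed:

    from functools import reduce

    def word_product(*effects):
        # Multiply effects, reducing exponents modulo 2 (symmetric difference).
        letters = reduce(lambda a, b: a ^ b, (set(e) for e in effects), set())
        return "".join(sorted(letters)) or "I"

    generators = ["ABCE", "BCDF"]
    defining = generators + [word_product(*generators)]   # adds ADEF

    for effect in "ABCDEF":
        partners = [word_product(effect, d) for d in defining]
        print(effect, "=", " = ".join(partners))   # e.g. A = BCE = ABCDF = DEF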
It becomes evident that we should always be aware of what the alias structure is for a fractional experiment before we finally recommend the experimental plan. Proper choice of defining contrasts is important, since it dictates the alias structure.

15.7 Analysis of Fractional Factorial Experiments

The difficulty of making formal significance tests using data from fractional factorial experiments lies in the determination of the proper error term. Unless there are data available from prior experiments, the error must come from a pooling of contrasts representing effects that are presumed to be negligible.

Sums of squares for individual effects are found by using essentially the same procedures given for the complete factorial. We can form a contrast in the treatment combinations by constructing the table of positive and negative signs. For example, for a half-replicate of a 2³ factorial experiment with ABC the defining contrast, one possible set of treatment combinations, along with the appropriate algebraic sign for each contrast used in computing effects and the sums of squares for the various effects, is presented in Table 15.13.

Table 15.13: Signs for Contrasts in a Half-Replicate of a 2³ Factorial Experiment

    Treatment                     Factorial Effect
    Combination   A   B   C   AB   AC   BC   ABC
    a             +   −   −   −    −    +    +
    b             −   +   −   −    +    −    +
    c             −   −   +   +    −    −    +
    abc           +   +   +   +    +    +    +

Note that in Table 15.13 the A and BC contrasts are identical, illustrating the aliasing. Also, B ≡ AC and C ≡ AB. In this situation, we have three orthogonal contrasts representing the 3 degrees of freedom available. If two observations were obtained for each of the four treatment combinations, we would then have an estimate of the error variance with 4 degrees of freedom. Assuming the interaction effects to be negligible, we could test all the main effects for significance. An example effect and corresponding sum of squares is

    A = (a − b − c + abc)/(2n),   SSA = (a − b − c + abc)²/(2²n).

In general, the single-degree-of-freedom sum of squares for any effect in a 2^{−p} fraction of a 2^k factorial experiment (p < k) is obtained by squaring contrasts in the treatment totals selected and dividing by 2^{k−p}n, where n is the number of replications of these treatment combinations.

Example 15.6: Suppose that we wish to use a half-replicate to study the effects of five factors, each at two levels, on some response, and it is known that whatever the effect of each factor, it will be constant for each level of the other factors. In other words, there are no interactions. Let the defining contrast be ABCDE, causing main effects to be aliased with four-factor interactions. The pooling of contrasts involving interactions provides 15 − 5 = 10 degrees of freedom for error. Perform an analysis of variance on the data in Table 15.14, testing all main effects for significance at the 0.05 level.
Solution: The sums of squares and effect estimates for the main effects are

    SSA = (11.3 − 15.6 − ⋯ − 14.7 + 13.2)²/2^{5−1} = (−17.5)²/16 = 19.14,   A = −17.5/8 = −2.19,
    SSB = (−11.3 + 15.6 − ⋯ − 14.7 + 13.2)²/2^{5−1} = (18.1)²/16 = 20.48,   B = 18.1/8 = 2.26,
    SSC = (−11.3 − 15.6 + ⋯ + 14.7 + 13.2)²/2^{5−1} = (10.3)²/16 = 6.63,    C = 10.3/8 = 1.29,
    SSD = (−11.3 − 15.6 − ⋯ + 14.7 + 13.2)²/2^{5−1} = (−7.7)²/16 = 3.71,    D = −7.7/8 = −0.96,
    SSE = (−11.3 − 15.6 − ⋯ + 14.7 + 13.2)²/2^{5−1} = (8.9)²/16 = 4.95,     E = 8.9/8 = 1.11.

Table 15.14: Data for Example 15.6

    Treatment   Response   Treatment   Response
    a           11.3       bcd         14.1
    b           15.6       abe         14.2
    c           12.7       ace         11.7
    d           10.4       ade         9.4
    e           9.2        bce         16.2
    abc         11.0       bde         13.9
    abd         8.9        cde         14.7
    acd         9.6        abcde       13.2

All other calculations and tests of significance are summarized in Table 15.15. The tests indicate that factor A has a significant negative effect on the response, whereas factor B has a significant positive effect. Factors C, D, and E are not significant at the 0.05 level.

Table 15.15: Analysis of Variance for the Data of a Half-Replicate of a 2⁵ Factorial Experiment

    Source of        Sum of     Degrees of   Mean     Computed
    Variation        Squares    Freedom      Square   f
    Main effect:
      A              19.14      1            19.14    6.21
      B              20.48      1            20.48    6.65
      C              6.63       1            6.63     2.15
      D              3.71       1            3.71     1.20
      E              4.95       1            4.95     1.61
    Error            30.83      10           3.08
    Total            85.74      15
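Each contrast, effect, and sum of squares in the solution can be generated mechanically. A short sketch (Python, our illustration) reproduces the values above from the data of Table 15.14:

    # Half-replicate of a 2^5 with defining contrast ABCDE (Example 15.6).
    data = {"a": 11.3, "b": 15.6, "c": 12.7, "d": 10.4, "e": 9.2,
            "abc": 11.0, "abd": 8.9, "acd": 9.6, "bcd": 14.1, "abe": 14.2,
            "ace": 11.7, "ade": 9.4, "bce": 16.2, "bde": 13.9, "cde": 14.7,
            "abcde": 13.2}

    n_runs = len(data)                 # 2^(5-1) = 16 design points
    for factor in "abcde":
        # A treatment total enters with + if the factor is at its high level.
        contrast = sum(y if factor in t else -y for t, y in data.items())
        effect = contrast / (n_runs / 2)
        ss = contrast**2 / n_runs
        print(factor.upper(), round(effect, 2), round(ss, 2))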
Exercises

15.18 List the aliases for the various effects in a 2⁵ factorial experiment when the defining contrast is ACDE.

15.19 (a) Obtain a 1/2 fraction of a 2⁴ factorial design using BCD as the defining contrast.
(b) Divide the 1/2 fraction into 2 blocks of 4 units each by confounding ABC.
(c) Show the analysis-of-variance table (sources of variation and degrees of freedom) for testing all unconfounded main effects, assuming that all interaction effects are negligible.

15.20 Construct a 1/4 fraction of a 2⁶ factorial design using ABCD and BDEF as the defining contrasts. Show what effects are aliased with the six main effects.

15.21 (a) Using the defining contrasts ABCE and ABDF, obtain a 1/4 fraction of a 2⁶ design.
(b) Show the analysis-of-variance table (sources of variation and degrees of freedom) for all appropriate tests, assuming that E and F do not interact and all three-factor and higher interactions are negligible.

15.22 Seven factors are varied at two levels in an experiment involving only 16 trials. A 1/8 fraction of a 2⁷ factorial experiment is used, with the defining contrasts being ACD, BEF, and CEG. The data are as follows:

    Treat. Comb.   Response     Treat. Comb.   Response
    (1)            31.6         acg            31.1
    ad             28.7         cdg            32.0
    abce           33.1         beg            32.8
    cdef           33.6         adefg          35.3
    acef           33.7         efg            32.4
    bcde           34.2         abdeg          35.3
    abdf           32.5         bcdfg          35.6
    bf             27.8         abcfg          35.1

Perform an analysis of variance on all seven main effects, assuming that interactions are negligible. Use a 0.05 level of significance.

15.23 An experiment is conducted so that an engineer can gain insight into the influence of sealing temperature A, cooling bar temperature B, percent polyethylene additive C, and pressure D on the seal strength (in grams per inch) of a bread-wrapper stock. A 1/2 fraction of a 2⁴ factorial experiment is used, with the defining contrast being ABCD. The data are presented here. Perform an analysis of variance on main effects only. Use α = 0.05.

    A     B     C     D     Response
    −1    −1    −1    −1    6.6
     1    −1    −1     1    6.9
    −1     1    −1     1    7.9
     1     1    −1    −1    6.1
    −1    −1     1     1    9.2
     1    −1     1    −1    6.8
    −1     1     1    −1    10.4
     1     1     1     1    7.3

15.24 In an experiment conducted at the Department of Mechanical Engineering and analyzed by the Statistics Consulting Center at Virginia Tech, a sensor detects an electrical charge each time a turbine blade makes one rotation. The sensor then measures the amplitude of the electrical current. Six factors are rpm A, temperature B, gap between blades C, gap between blade and casing D, location of input E, and location of detection F. A 1/4 fraction of a 2⁶ factorial experiment is used, with defining contrasts being ABCE and BCDF. The 16 measured amplitudes are

    3.89, 10.46, 25.98, 39.88, 61.88, 3.22, 8.94, 20.29,
    32.07, 50.76, 2.80, 8.15, 16.80, 25.47, 44.44, 2.45.

Perform an analysis of variance on main effects and two-factor interactions, assuming that all three-factor and higher interactions are negligible. Use α = 0.05.

15.25 In the study Durability of Rubber to Steel Adhesively Bonded Joints, conducted at the Department of Environmental Science and Mechanics and analyzed by the Statistics Consulting Center at Virginia Tech, an experimenter measured the number of breakdowns in an adhesive seal. It was postulated that concentration of seawater A, temperature B, pH C, voltage D, and stress E influence the breakdown of an adhesive seal. A 1/2 fraction of a 2⁵ factorial experiment was used, with the defining contrast being ABCDE. The 16 recorded breakdown responses are

    832, 764, 1087, 522, 854, 773, 1068, 572,
    831, 819, 462, 746, 714, 1070, 474, 1104.

Perform an analysis of variance on main effects and the two-factor interactions AD, AE, BD, and BE, assuming that all three-factor and higher interactions are negligible.

15.26 Consider a 2^{5−1} design with factors A, B, C, D, and E. Construct the design by beginning with a 2⁴ and use E = ABCD as the generator. Show all aliases.

15.27 There are six factors, and only eight design points can be used. Construct a 2^{6−3} by beginning with a 2³ and use D = AB, E = −AC, and F = BC as the generators.

15.28 Consider Exercise 15.27. Construct another 2^{6−3} that is different from the design chosen in Exercise 15.27.

15.29 For Exercise 15.27, give all aliases for the six main effects.

15.30 In Myers, Montgomery, and Anderson-Cook (2009), an application is discussed in which an engineer is concerned with the effects on the cracking of a titanium alloy. The three factors are A, temperature; B, titanium content; and C, amount of grain refiner. The following table gives a portion of the design and the response, crack length induced in the sample of the alloy.

    A     B     C     Response
    −1    −1    −1    0.5269
     1     1    −1    2.3380
     1    −1     1    4.0060
    −1     1     1    3.3640

(a) What is the defining relation?
(b) Give aliases for all three main effects, assuming that two-factor interactions may be real.
(c) Assuming that interactions are negligible, which main factor is most important? Use α = 0.05.
(d) At what level would you suggest the factor named in (c) be for final production, high or low?
(e) At what levels would you suggest the other factors be for final production?
(f) What hazards lie in the recommendations you made in (d) and (e)? Be thorough in your answer.
15.8 Higher Fractions and Screening Designs

Some industrial situations require the analyst to determine which of a large number of controllable factors have an impact on some important response. The factors may be qualitative or class variables, regression variables, or a mixture of both. The analytical procedure may involve analysis of variance, regression, or both. Often the regression model used involves only linear main effects, although a few interactions may be estimated. The situation calls for variable screening, and the resulting experimental designs are known as screening designs. Clearly, two-level orthogonal designs that are saturated or nearly saturated are viable candidates.

Design Resolution

Two-level orthogonal designs are often classified according to their resolution, the latter determined through the following definition.

Definition 15.1: The resolution of a two-level orthogonal design is the length of the smallest (least complex) interaction among the set of defining contrasts.

If the design is constructed as a full or fractional factorial (i.e., either a 2^k or a 2^{k−p} design, p = 1, 2, ..., k − 1), the notion of design resolution is an aid in categorizing the impact of the aliasing. For example, a resolution II design would have little use, since there would be at least one instance of aliasing of one main effect with another. A resolution III design will have all main effects (linear effects) orthogonal to each other. However, there will be some aliasing among linear effects and two-factor interactions. Clearly, then, if the analyst is interested in studying main effects (linear effects in the case of regression) and there are no two-factor interactions, a design of resolution at least III is required.

15.9 Construction of Resolution III and IV Designs with 8, 16, and 32 Design Points

Useful designs of resolution III and IV can be constructed for 2 to 7 variables with 8 design points. We begin with a 2³ factorial that has been symbolically saturated with interactions:

    x₁   x₂   x₃   x₁x₂   x₁x₃   x₂x₃   x₁x₂x₃
    −1   −1   −1    1      1      1     −1
     1   −1   −1   −1     −1      1      1
    −1    1   −1   −1      1     −1      1
     1    1   −1    1     −1     −1     −1
    −1   −1    1    1     −1     −1      1
     1   −1    1   −1      1     −1     −1
    −1    1    1   −1     −1      1     −1
     1    1    1    1      1      1      1

It is clear that a resolution III design can be constructed merely by replacing interaction columns by new main effects, through 7 variables. For example, we may define

    x₄ = x₁x₂     (defining contrast ABD)
    x₅ = x₁x₃     (defining contrast ACE)
    x₆ = x₂x₃     (defining contrast BCF)
    x₇ = x₁x₂x₃   (defining contrast ABCG)

and obtain a 2^{−4} fraction of a 2⁷ factorial. The preceding expressions identify the chosen defining contrasts. Eleven additional defining contrasts result, and all defining contrasts contain at least three letters. Thus, the design is a resolution III design.
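The saturated matrix and its reuse as a screening design are easy to generate. The sketch below (Python/NumPy, our illustration) forms the seven columns and verifies that the resulting 2^{7−4} design is orthogonal in all seven main effects:

    import numpy as np
    from itertools import product

    base = np.array(list(product([-1, 1], repeat=3)))
    x1, x2, x3 = base[:, 0], base[:, 1], base[:, 2]

    # Reassign the interaction columns to the new factors x4, ..., x7.
    design = np.column_stack([x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])
    print(design.T @ design)   # 8 * I_7: seven orthogonal main effects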
Clearly, if we begin with a subset of the augmented columns and conclude with a design involving fewer than 7 design variables, the result is a resolution III design in fewer than 7 variables. A similar set of possible designs can be constructed for 16 design points by beginning with a 2⁴ saturated with interactions. Definitions of variables that correspond to these interactions produce resolution III designs through 15 variables. In a similar fashion, designs containing 32 runs can be constructed by beginning with a 2⁵. Table 15.16 provides guidelines for constructing 8-, 16-, 32-, and 64-point designs that are resolution III, IV, and even V. The table gives the number of factors, the number of runs, and the generators that are used to produce the 2^{k−p} designs. The generator given is used to augment the full factorial containing k − p factors.

Table 15.16: Some Resolution III, IV, V, VI, and VII 2^{k−p} Designs

    Number of   Design     Resolution   Number of       Generators
    Factors                             Design Points
    3           2^{3−1}    III          4               C = ±AB
    4           2^{4−1}    IV           8               D = ±ABC
    5           2^{5−2}    III          8               D = ±AB; E = ±AC
    6           2^{6−1}    VI           32              F = ±ABCDE
    6           2^{6−2}    IV           16              E = ±ABC; F = ±BCD
    6           2^{6−3}    III          8               D = ±AB; E = ±AC; F = ±BC
    7           2^{7−1}    VII          64              G = ±ABCDEF
    7           2^{7−2}    IV           32              F = ±ABCD; G = ±ABDE
    7           2^{7−3}    IV           16              E = ±ABC; F = ±BCD; G = ±ACD
    7           2^{7−4}    III          8               D = ±AB; E = ±AC; F = ±BC; G = ±ABC
    8           2^{8−2}    V            64              G = ±ABCD; H = ±ABEF
    8           2^{8−3}    IV           32              F = ±ABC; G = ±ABD; H = ±BCDE
    8           2^{8−4}    IV           16              E = ±BCD; F = ±ACD; G = ±ABC; H = ±ABD

15.10 Other Two-Level Resolution III Designs; The Plackett-Burman Designs

A family of designs developed by Plackett and Burman (1946; see the Bibliography) fills sample size voids that exist with the fractional factorials. The latter are useful with sample sizes 2^r (i.e., they involve sample sizes 4, 8, 16, 32, 64, ...). The Plackett-Burman designs involve 4r design points, and thus designs of sizes 12, 20, 24, 28, and so on, are available. These two-level Plackett-Burman designs are resolution III designs and are very simple to construct. "Basic lines" are given for each sample size. These lines of + and − signs are N − 1 in number. To construct the columns of the design matrix, we begin with the basic line and do a cyclic permutation on the columns until k (the desired number of variables) columns are formed. Then we fill in the last row with negative signs. The result will be a resolution III design with k variables (k = 1, 2, ..., N − 1). The basic lines are as follows:

    N = 12:  + + − + + + − − − + −
    N = 16:  + + + + − + − + + − − + − − −
    N = 20:  + + − − + + + + − + − + − − − − + + −
    N = 24:  + + + + + − + − + + − − + + − − + − + − − − −

Example 15.7: Construct a two-level screening design with 6 variables containing 12 design points.

Solution: Begin with the basic line in the initial column. The second column is formed by bringing the bottom entry of the first column to the top of the second column and repeating the first column. The third column is formed in the same fashion, using entries in the second column. When there is a sufficient number of columns, simply fill in the last row with negative signs. The resulting design is as follows:

    x₁   x₂   x₃   x₄   x₅   x₆
    +    −    +    −    −    −
    +    +    −    +    −    −
    −    +    +    −    +    −
    +    −    +    +    −    +
    +    +    −    +    +    −
    +    +    +    −    +    +
    −    +    +    +    −    +
    −    −    +    +    +    −
    −    −    −    +    +    +
    +    −    −    −    +    +
    −    +    −    −    −    +
    −    −    −    −    −    −

The Plackett-Burman designs are popular in industry for screening situations. Because they are resolution III designs, all linear effects are orthogonal. For any sample size, the user has available a design for k = 2, 3, ..., N − 1 variables. The alias structure for the Plackett-Burman design is very complicated, and thus the user cannot construct the design with complete control over the alias structure, as in the case of 2^k or 2^{k−p} designs. However, in the case of regression models, the Plackett-Burman design can accommodate interactions (although they will not be orthogonal) when sufficient degrees of freedom are available.
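The cyclic construction is equally simple to code. This sketch (Python/NumPy, our illustration) generates the 12-run, 6-variable design of Example 15.7 from the N = 12 basic line and confirms the orthogonality of the linear effects:

    import numpy as np

    # Basic line for N = 12; successive columns are cyclic shifts of it.
    basic = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])

    def plackett_burman(k):
        cols = [np.roll(basic, j) for j in range(k)]   # bottom entry moves to top
        X = np.column_stack(cols)
        return np.vstack([X, -np.ones(k, dtype=int)])  # last row of minus signs

    X = plackett_burman(6)
    print(X)                  # matches the design displayed above
    print(X.T @ X)            # 12 * I_6: all linear effects orthogonal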
15.11 Introduction to Response Surface Methodology

In Case Study 15.2, a regression model was fitted to a set of data with the specific goal of finding conditions on those design variables that optimize (maximize) the cleansing efficiency of coal. The model contained three linear main effects, three two-factor interaction terms, and one three-factor interaction term. The model response was the cleansing efficiency, and the optimum conditions on x₁, x₂, and x₃ were found by using the signs and the magnitudes of the model coefficients. In this example, a two-level design was employed for process improvement or process optimization. In many areas of science and engineering, the application is expanded to involve more complicated models and designs, and this collection of techniques is called response surface methodology (RSM). It encompasses both graphical and analytical approaches. The term response surface is derived from the appearance of the multidimensional surface of constant estimated response from a second-order model, i.e., a model with first- and second-order terms. An example will follow.

The Second-Order Response Surface Model

In many industrial examples of process optimization, a second-order response surface model is used. For the case of, say, k = 2 process variables, or design variables, and a single response y, the model is given by

    y = β₀ + β₁x₁ + β₂x₂ + β₁₁x₁² + β₂₂x₂² + β₁₂x₁x₂ + ε.

Here we have k = 2 first-order terms, two pure second-order, or quadratic, terms, and one interaction term given by β₁₂x₁x₂. The terms x₁ and x₂ are taken to be in the familiar ±1 coded form. The ε term designates the usual model error. In general, for k design variables the model will contain 1 + k + k + k(k − 1)/2 model terms (the intercept, the first-order terms, the quadratic terms, and the two-factor interactions), and hence the experimental design must contain at least a like number of design points. In addition, the quadratic terms require that the design variables be fixed in the design at at least three levels. The resulting design is referred to as a second-order design. Illustrations will follow.

The following central composite design (CCD) and example are taken from Myers, Montgomery, and Anderson-Cook (2009). Perhaps the most popular class of second-order designs is the class of central composite designs. The example given in Table 15.17 involves a chemical process in which reaction temperature, ξ₁, and reactant concentration, ξ₂, are shown at their natural levels. They also appear in coded form. There are five levels of each of the two factors. In addition, we have the order in which the observations on x₁ and x₂ were run. The column on the right gives values of the response y, the percent conversion of the process. The first four design points represent the familiar factorial points at levels ±1. The next four points are called axial points. They are followed by the center runs that were discussed and illustrated earlier in this chapter.
Table 15.17: Central Composite Design for Example 15.8

    Observation   Run   Temperature   Concentration   x₁       x₂       y
                        ξ₁ (°C)       ξ₂ (%)
    1             4     200           15              −1       −1       43
    2             12    250           15               1       −1       78
    3             11    200           25              −1        1       69
    4             5     250           25               1        1       73
    5             6     189.65        20              −1.414    0       48
    6             7     260.35        20               1.414    0       78
    7             1     225           12.93            0       −1.414   65
    8             3     225           27.07            0        1.414   74
    9             8     225           20               0        0       76
    10            10    225           20               0        0       79
    11            9     225           20               0        0       83
    12            2     225           20               0        0       81

Thus, the five levels of each of the two factors are −1, +1, −1.414, +1.414, and 0. A clear picture of the geometry of the central composite design for this k = 2 example appears in Figure 15.16. The figure illustrates the source of the term axial points. These four points are on the factor axes at an axial distance of α = √2 = 1.414 from the design center. In fact, for this particular CCD, the perimeter points, axial and factorial, are all at the distance √2 from the design center, and as a result we have eight equally spaced points on a circle plus four replications at the design center.

Figure 15.16: Central composite design for Example 15.8 (four factorial points, four axial points, and four center runs in the (x₁, x₂) plane).

Example 15.8: Response Surface Analysis: An analysis of the data in the two-variable example may involve the fitting of a second-order response function. The resulting response surface can be used analytically or graphically to determine the impact that x₁ and x₂ have on percent conversion of the process. The coefficients in the response function are determined by the method of least squares developed in Chapter 12 and illustrated throughout this chapter. The resulting second-order response model is given in the coded variables as

    ŷ = 79.75 + 10.18x₁ + 4.22x₂ − 8.50x₁² − 5.25x₂² − 7.75x₁x₂,

whereas in the natural variables it is given by

    ŷ = −1080.22 + 7.7671ξ₁ + 23.1932ξ₂ − 0.0136ξ₁² − 0.2100ξ₂² − 0.0620ξ₁ξ₂.

Since the current example contains only two design variables, the most illuminating approach to determining the nature of the response surface in the design region is through two- or three-dimensional graphics. It is of interest to determine what levels of temperature x₁ and concentration x₂ produce a desirable estimated percent conversion, ŷ. The estimated response function above was plotted in three dimensions, and the resulting response surface is shown in Figure 15.17. The height of the surface is ŷ in percent. It is readily seen from this figure why the term response surface is employed. In cases where only two design variables are used, two-dimensional contour plotting can be useful. Thus, make note of Figure 15.18. Contours of constant estimated conversion are seen as slices from the response surface. Note that the viewer of either figure can readily observe which coordinates of temperature and concentration produce the largest estimated percent conversion. In the plots, the coordinates are given in both coded units and natural units. Notice that the largest estimated conversion is at approximately 240°C and 20% concentration. The maximum estimated (or predicted) response at that location is 82.47%.

Figure 15.17: Plot of the response surface of predicted conversion for Example 15.8.
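The second-order fit itself is a routine least squares computation. The sketch below (Python/NumPy, our illustration) fits the model to the coded data of Table 15.17 and reproduces the coded-variable coefficients quoted above:

    import numpy as np

    a = 1.414
    x1 = np.array([-1, 1, -1, 1, -a, a, 0, 0, 0, 0, 0, 0])
    x2 = np.array([-1, -1, 1, 1, 0, 0, -a, a, 0, 0, 0, 0])
    y = np.array([43, 78, 69, 73, 48, 78, 65, 74, 76, 79, 83, 81])

    # Columns: 1, x1, x2, x1^2, x2^2, x1*x2.
    X = np.column_stack([np.ones(12), x1, x2, x1**2, x2**2, x1*x2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(np.round(b, 2))   # approx [79.75, 10.18, 4.22, -8.50, -5.25, -7.75]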
Other Comments Concerning Response Surface Analysis

The book by Myers, Montgomery, and Anderson-Cook (2009) provides a great deal of information concerning both design and analysis of RSM. The graphical illustration we have used here can be augmented by analytical results that provide information about the nature of the response surface inside the design region.

Figure 15.18: Contour plot of predicted conversion for Example 15.8, with axes in both coded units (x₁, x₂) and natural units (temperature, °C; concentration, %).

Other computations can be used to determine whether the location of the optimum conditions is, in fact, inside or remote from the experimental design region. There are many important considerations when one is required to determine appropriate conditions for future operation of a process. Other material in Myers, Montgomery, and Anderson-Cook (2009) deals with further experimental design issues. For example, the CCD, while the most generally useful design, is not the only class of design used in RSM. Many others are discussed in the aforementioned text. Also, the CCD discussed here is a special case in which k = 2. The more general k > 2 case is discussed in Myers, Montgomery, and Anderson-Cook (2009).
15.12 Robust Parameter Design
In this chapter, we have emphasized the notion of using design of experiments (DOE) to learn about engineering and scientific processes. In the case where the process involves a product, DOE can be used to provide product improvement or quality improvement. As we pointed out in Chapter 1, much importance has been attached to the use of statistical methods in product improvement. An important aspect of this quality improvement effort that surfaced in the 1980s and continued through the 1990s is to design quality into processes and products at the research stage or the process design stage. One often requires DOE in the development of processes that have the following properties:
1. Insensitive (robust) to environmental conditions
2. Insensitive (robust) to factors difficult to control
3. Provide minimum variation in performance
The methods used to attain the desirable characteristics in 1, 2, and 3 are a part of what is referred to as robust parameter design, or RPD (see Taguchi, 1991; Taguchi and Wu, 1985; and Kackar, 1985, in the Bibliography). The term design in this context refers to the design of the process or system; parameter refers to the parameters in the system. These are what we have been calling factors or variables.
It is very clear that goals 1, 2, and 3 above are quite noble. For example, a petroleum engineer may have a fine gasoline blend that performs quite well as long as conditions are ideal and stable. However, the performance may deteriorate because of changes in environmental conditions, such as type of driver, weather conditions, type of engine, and so forth. A scientist at a food company may have a cake mix that is quite good unless the user does not exactly follow directions on the box, directions that deal with oven temperature, baking time, and so forth. A product or process whose performance is consistent when exposed to these changing environmental conditions is called a robust product or robust process. (See Myers, Montgomery, and Anderson-Cook, 2009, in the Bibliography.)
Control and Noise Variables

Taguchi (1991) emphasized the notion of using two classes of design variables in a study involving RPD: control factors and noise factors.

Definition 15.2: Control factors are variables that can be controlled both in the experiment and in the process. Noise factors are variables that may or may not be controlled in the experiment but cannot be controlled in the process (or not controlled well in the process).

An important approach is to use control variables and noise variables in the same experiment as fixed effects. Orthogonal designs or orthogonal arrays are popular designs to use in this effort.

Goal of Robust Parameter Design

The goal of robust parameter design is to choose the levels of the control variables (i.e., the design of the process) that are most robust (insensitive) to changes in the noise variables.
It should be noted that changes in the noise variables actually imply changes during the process, changes in the field, changes in the environment, changes in handling or usage by the consumer, and so forth.
The Product Array
One approach to the design of experiments involving both control and noise vari- ables is to use an experimental plan that calls for an orthogonal design for both the control and the noise variables separately. The complete experiment, then, is merely the product or crossing of these two orthogonal designs. The following is a simple example of a product array with two control and two noise variables.

Example 15.9: In the article "The Taguchi Approach to Parameter Design" in Quality Progress, December 1987, D. M. Byrne and S. Taguchi discuss an interesting example in which a method is sought for attaching an elastomeric connector to a nylon tube so as to deliver the pull-off performance required for an automotive engine application. The objective is to find controllable conditions that maximize pull-off force. Among the controllable variables are A, connector wall thickness, and B, insertion depth. During routine operation there are several variables that cannot be controlled, although they will be controlled during the experiment. Among them are C, conditioning time, and D, conditioning temperature. Three levels are taken for each control variable and two for each noise variable. As a result, the crossed array is as follows. The control array is a 3 × 3 array, and the noise array is a familiar 2² factorial with (1), c, d, and cd representing the four factor combinations. The purpose of the noise factors is to create the kind of variability in the response, pull-off force, that might be expected in day-to-day operation with the process. The design is shown in Table 15.18.
Table 15.18: Design for Example 15.9

                              B (depth)
    A (wall thickness)   Shallow          Medium           Deep
    Thin                 (1), c, d, cd    (1), c, d, cd    (1), c, d, cd
    Medium               (1), c, d, cd    (1), c, d, cd    (1), c, d, cd
    Thick                (1), c, d, cd    (1), c, d, cd    (1), c, d, cd

Case Study 15.3: Solder Process Optimization: In an experiment described in Understanding Industrial Designed Experiments by Schmidt and Launsby (1991; see the Bibliography), solder process optimization is accomplished by a printed circuit-board assembly plant. Parts are inserted either manually or automatically into a bare board with a circuit printed on it. After the parts are inserted, the board is put through a wave solder machine, which is used to connect all the parts into the circuit. Boards are placed on a conveyor and taken through a series of steps. They are bathed in a flux mixture to remove oxide. To minimize warpage, they are preheated before the solder is applied. Soldering takes place as the boards move across the wave of solder. The object of the experiment is to minimize the number of solder defects per million joints. The control factors and levels are as given in Table 15.19.

Table 15.19: Control Factors for Case Study 15.3

    Factor                               (−1)    (+1)
    A, solder pot temperature (°F)       480     510
    B, conveyor speed (ft/min)           7.2     10
    C, flux density                      0.9°    1.0°
    D, preheat temperature               150     200
    E, wave height (in.)                 0.5     0.6

These factors are easy to control at the experimental level but are more formidable at the plant or process level.
Noise Factors: Tolerances on Control Factors

Often in processes such as this one, the natural noise factors are tolerances on the control factors. For example, in the actual on-line process, solder pot temperature and conveyor-belt speed are difficult to control. It is known that the control of temperature is within ±5°F and the control of conveyor-belt speed is within ±0.2 ft/min. It is certainly conceivable that variability in the product response (soldering performance) is increased because of an inability to control these two factors at some nominal levels. The third noise factor is the type of assembly involved. In practice, one of two types of assemblies will be used. Thus, we have the noise factors given in Table 15.20.

Table 15.20: Noise Factors for Case Study 15.3

    Factor                                                               (−1)    (+1)
    A*, solder pot temperature tolerance (°F) (deviation from nominal)   −5      +5
    B*, conveyor speed tolerance (ft/min) (deviation from ideal)         −0.2    +0.2
    C*, assembly type                                                    1       2
Both the control array (inner array) and the noise array (outer array) were chosen to be fractional factorials, the former a 1/4 of a 2⁵ and the latter a 1/2 of a 2³. The crossed array and the response values are shown in Table 15.21. The first three columns of the inner array represent a 2³. The fourth and fifth columns are formed by D = −AC and E = −BC. Thus, the defining interactions for the inner array are ACD, BCE, and ABDE. The outer array is a standard resolution III fraction of a 2³. Notice that each inner array point contains runs from the outer array. Thus, four response values are observed at each combination of the control array. Figure 15.19 displays plots which reveal the effect of temperature and density on the mean response.

Table 15.21: Crossed Arrays and Response Values for Case Study 15.3
         Inner Array                          Outer Array
    A     B     C     D     E      (1)    a*b*   a*c*   b*c*      ȳ        s_y
    1     1     1    −1    −1      194    197    193    275    214.75    40.20
    1     1    −1     1     1      136    136    132    136    135.00     2.00
    1    −1     1    −1     1      185    261    264    264    243.50    39.03
    1    −1    −1     1    −1       47    125    127     42     85.25    47.11
   −1     1     1     1    −1      295    216    204    293    252.00    48.75
   −1     1    −1    −1     1      234    159    231    157    195.25    43.04
   −1    −1     1     1     1      328    326    247    322    305.75    39.25
   −1    −1    −1    −1    −1      186    187    105    104    145.50    47.35
Figure 15.19: Plot showing the influence of factors on the mean response.
Simultaneous Analysis of Process Mean and Variance
In most examples using RPD, the analyst is interested in finding conditions on the control variables that give suitable values for the mean response ȳ. However, varying the noise variables produces information on the process variance σ_y² that might be anticipated in the process. Obviously a robust product is one for which the process is consistent and thus has a small process variance. RPD may involve the simultaneous analysis of ȳ and s_y.
It turns out that temperature and flux density are the most important factors in Case Study 15.3. They seem to influence both s_y and ȳ. Fortunately, high temperature and low flux density are preferable for both. From Figure 15.19, the "optimum" conditions are
solder temperature = 510°F,  flux density = 0.9°.

Alternative Approaches to Robust Parameter Design
One approach suggested by many is to model the sample mean and sample variance separately. Separate modeling often helps the experimenter to obtain a better understanding of the process involved. In the following example, we illustrate this approach with the solder process experiment.
Case Study 15.4: Consider the data set of Case Study 15.3. An alternative approach is to fit separate models for the mean ȳ and the sample standard deviation. Suppose that we use the usual +1 and −1 coding for the control factors. Based on the apparent importance of solder pot temperature x₁ and flux density x₂, linear regression on the response (number of errors per million joints) produces

    ŷ = 197.125 − 27.5x₁ + 57.875x₂.
To find the most robust levels of temperature and flux density, it is important to procure a compromise between the mean response and variability, which requires a modeling of the variability. An important tool in this regard is the log transformation (see Bartlett and Kendall, 1946, or Carroll and Ruppert, 1988):

    ln s² = γ₀ + γ₁x₁ + γ₂x₂.

This modeling process produces the following result:

    ln ŝ² = 6.6975 − 0.7458x₁ + 0.6150x₂.
The log linear model finds extensive use for modeling sample variance, since the log transformation on the sample variance lends itself to use of the method of least squares. This results from the fact that normality and homogeneous variance assumptions are often quite good when one uses ln s² rather than s² as the model response.
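Both models can be fit in a few lines. The sketch below (Python/NumPy, our illustration) regresses the row means and the log sample variances from Table 15.21 on x₁ (solder temperature) and x₂ (flux density); because the tabled means and standard deviations are rounded summaries, the printed coefficients may differ slightly from those quoted in the text.

    import numpy as np

    # Row summaries from Table 15.21; x1 = solder temperature (A), x2 = flux density (C).
    x1 = np.array([1, 1, 1, 1, -1, -1, -1, -1])
    x2 = np.array([1, -1, 1, -1, 1, -1, 1, -1])
    ybar = np.array([214.75, 135.00, 243.50, 85.25, 252.00, 195.25, 305.75, 145.50])
    s = np.array([40.20, 2.00, 39.03, 47.11, 48.75, 43.04, 39.25, 47.35])

    X = np.column_stack([np.ones(8), x1, x2])
    b_mean, *_ = np.linalg.lstsq(X, ybar, rcond=None)             # mean model
    b_lnvar, *_ = np.linalg.lstsq(X, np.log(s**2), rcond=None)    # log-variance model
    print(np.round(b_mean, 3), np.round(b_lnvar, 4))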
The analysis that is important to the scientist or engineer makes use of the two models simultaneously. A graphical approach can be very useful. Figure 15.20 shows simple plots of the mean and standard deviation models simultaneously. As one would expect, the location of temperature and flux density that minimizes the mean number of errors is the same as that which minimizes variability, namely high temperature and low flux density. The graphical multiple response surface approach allows the user to see tradeoffs between process mean and process variability. For this example, the engineer may be dissatisfied with the extreme conditions in solder temperature and flux density. The figure offers estimates of how much is lost as one moves away from the optimum mean and variability conditions to any intermediate conditions.
Figure 15.20: Mean and standard deviation for Case Study 15.4: contours of the estimated mean ŷ (from 120 to 260) and of s (from 14.2 to 57.4), plotted against x1, temperature, and x2, flux density.

In Case Study 15.4, values for control variables were chosen that gave desirable conditions for both the mean and the variance of the process. The mean and variance were taken across the distribution of noise variables in the process and were modeled separately, and appropriate conditions were found through a dual response surface approach. Since Case Study 15.4 involved two models (mean and variance), this can be viewed as a dual response surface analysis. Fortunately, in this example the same conditions on the two relevant control variables, solder temperature and flux density, were optimal for both the process mean and the variance. Much of the time in practice, some type of compromise between the mean and variance would need to be invoked.
The approach illustrated in Case Study 15.4 involves finding optimal process conditions when the data used are from a product array (or crossed array) type of experimental design. Using a product array, a cross between two designs, can often be very costly. However, the development of dual response surface models, i.e., a model for the mean and a model for the variance, can be accomplished without a product array. A design that involves both control and noise variables is often called a combined array. This type of design and the resulting analysis can be used to determine what conditions on the control variables are most robust (insensitive) to variation in the noise variables. This can be viewed as tantamount to finding control levels that minimize the process variance produced by movement in the noise variables.
The Role of the Control-by-Noise Interaction
The structure of the process variance is greatly determined by the nature of the control-by-noise interaction. The nature of the nonhomogeneity of process variance is a function of which control variables interact with which noise variables. Specifically, as we will illustrate, those control variables that interact with one or more noise variables can be the object of the analysis. For example, let us consider an illustration used in Myers, Montgomery, and Anderson-Cook (2009) involving two control variables and a single noise variable, with the data given in Table 15.22. A and B are control variables and C is a noise variable.
One can illustrate the interactions AC and BC with plots, as given in Figure 15.21.
Table 15.22: Experimental Data in a Crossed Array
 Inner Array          Outer Array
  A      B        C = −1    C = +1    Response Mean
 −1     −1          11        15          13.0
 −1      1           7         8           7.5
  1     −1          10        26          18.0
  1      1          10        14          12.0
One must understand that while A and B are held constant in the process, C follows a probability distribution during the process. Given this information, it becomes clear that A = −1 and B = +1 are the levels that produce smaller values for the process variance, while A = +1 and B = −1 give larger values. Thus, we say that A = −1 and B = +1 are robust values, i.e., insensitive to inevitable changes in the noise variable C during the process.
Figure 15.21: Interaction plots for the data in Table 15.22: (a) AC interaction plot (lines A = +1 and A = −1); (b) BC interaction plot (lines B = +1 and B = −1); in each panel the noise variable C runs from −1 to +1.
In the above example, we say that both A and B are dispersion effects (i.e., both factors impact the process variance). In addition, both factors are location effects, since the mean of y changes as both factors move from −1 to +1.
Analysis Involving the Model Containing Both Control and Noise Variables
While it has been emphasized that noise variables are not constant during the working of the process, analysis that results in desirable or even optimal conditions on the control variables is best accomplished through an experiment in which both control and noise variables are fixed effects. Thus, both main effects in the control and noise variables and all the important control-by-noise interactions can be evaluated. This model in x and z, often called a response model, can both directly and indirectly provide useful information regarding the process. The response model is actually a response surface model in vector x and vector z, where x contains the control variables and z the noise variables. Certain operations allow models to be generated for the process mean and variance much as in Case Study 15.4. Details are supplied in Myers, Montgomery, and Anderson-Cook (2009); we will illustrate with a very simple example. Consider the data of Table 15.22 on page 650 with control variables A and B and noise variable C. There are eight experimental runs in a 2² × 2, or 2³, factorial. Thus, the response model can be written
y(x, z) = β0 + β1x1 + β2x2 + β3z + β12x1x2 + β1zx1z + β2zx2z + ε.
We will not include the three-factor interaction in the regression model. A, B, and C in Table 15.22 are represented by x1, x2, and z, respectively, in the model. We assume that the error term ε has the usual independence and constant variance properties.
The Mean and Variance Response Surfaces
The process mean and variance response surfaces are best understood by considering the expectation and variance of y(x, z) taken across the distribution of the noise variable z. We assume that the noise variable C [denoted by z in y(x, z)] is continuous with mean 0 and variance σz². The process mean and variance models may be viewed as
Ez[y(x, z)] = β0 + β1x1 + β2x2 + β12x1x2,
Varz[y(x, z)] = σ² + σz²(β3 + β1zx1 + β2zx2)² = σ² + σz²lx²,

where lx = ∂y(x, z)/∂z is the slope of the response surface in the direction of z. As we indicated earlier, note how the interactions of factors A and B with the noise variable C are key components of the process variance.
Though we have already analyzed the current example through the plots in Figure 15.21, which displayed the role of the AC and BC interactions, it is instructive to look at the analysis in light of Ez[y(x, z)] and Varz[y(x, z)] above. In this example, the reader can easily verify that the estimate b1z of β1z is 15/8, while the estimate b2z of β2z is −15/8. The coefficient b3 = 25/8. Thus, the condition x1 = +1 and x2 = −1 results in a process variance estimate of
Varz[y(x, z)] = σ² + σz²(b3 + b1zx1 + b2zx2)²
             = σ² + σz²[25/8 + (15/8)(1) + (−15/8)(−1)]² = σ² + σz²(55/8)²,

whereas for x1 = −1 and x2 = 1, we have

Varz[y(x, z)] = σ² + σz²[25/8 + (15/8)(−1) + (−15/8)(1)]² = σ² + σz²(−5/8)².

Thus, for the most desirable (robust) condition of x1 = −1 and x2 = 1, the estimated process variance due to the noise variable C (or z) is (25/64)σz².

The most undesirable condition, the condition of maximum process variance (i.e., x1 = +1 and x2 = −1), produces an estimated process variance of (3025/64)σz². As far as the mean response is concerned, Figure 15.21 indicates that if maximum response is desired, x1 = +1 and x2 = −1 produce the best result.
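Because the design is a full 2³ factorial, the response-model coefficients are simply orthogonal contrasts, so the variance calculation above is easy to verify numerically. The sketch below, assuming the ±1 coding of Table 15.22 and NumPy for the least-squares fit, reproduces b3 = 25/8, b1z = 15/8, and b2z = −15/8 and evaluates the slope lx at the four control-variable corners.

```python
import numpy as np

# Table 15.22 in coded units: control variables x1 (A), x2 (B), noise variable z (C).
# Each row is (x1, x2, z, y).
runs = np.array([
    [-1, -1, -1, 11], [-1, -1,  1, 15],
    [-1,  1, -1,  7], [-1,  1,  1,  8],
    [ 1, -1, -1, 10], [ 1, -1,  1, 26],
    [ 1,  1, -1, 10], [ 1,  1,  1, 14],
], dtype=float)
x1, x2, z, y = runs.T

# Response model y = b0 + b1*x1 + b2*x2 + b3*z + b12*x1*x2 + b1z*x1*z + b2z*x2*z.
X = np.column_stack([np.ones(8), x1, x2, z, x1 * x2, x1 * z, x2 * z])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3, b12, b1z, b2z = b
print(f"b3 = {b3:.4f}, b1z = {b1z:.4f}, b2z = {b2z:.4f}")   # 3.125, 1.875, -1.875

# The z-slope l_x = b3 + b1z*x1 + b2z*x2 drives the process variance,
# Var_z[y] = sigma^2 + sigma_z^2 * l_x^2, so |l_x| should be made small.
for c1 in (-1, 1):
    for c2 in (-1, 1):
        lx = b3 + b1z * c1 + b2z * c2
        print(f"x1={c1:+d}, x2={c2:+d}: l_x = {lx:+.3f}, l_x^2 = {lx**2:.3f}")
```

Running this confirms that lx² = 25/64 at (x1, x2) = (−1, +1) and lx² = 3025/64 at (+1, −1), exactly the robust and worst-case conditions identified above.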
Figure 15.22: Interaction plots for the data in Exercise 15.31: (a) x1z interaction plot (lines x1 = +1 and x1 = −1); (b) x2z interaction plot (lines x2 = +1 and x2 = −1); in each panel z runs from −1 to +1.
Exercises
15.31 Consider an example in which there are two control variables x1 and x2 and a single noise variable z. The goal is to determine the levels of x1 and x2 that are robust to changes in z, i.e., levels of x1 and x2 that minimize the variance produced in the response y as z moves between −1 and +1. The variables x1 and x2 are at two levels, −1 and +1, in the experiment. The data produce the plots in Figure 15.22 above. Note that x1 and x2 interact with the noise variable z. What settings on x1 and x2 (−1 or +1 for each) result in minimum variance in y? Explain.
15.32 Consider the following 2³ factorial with control variables x1 and x2 and noise variable z. Can x1 and x2 be chosen at levels for which Var(y) is minimized? Explain why or why not.
              z = −1               z = +1
          x2 = −1  x2 = +1     x2 = −1  x2 = +1
x1 = −1      4        6           8        10
x1 = +1      1        3           3         5

15.33 Consider Case Study 15.1 involving the injection molding data. Suppose mold temperature is difficult to control and thus it can be assumed that in the process it follows a normal distribution with mean 0 and variance σz². Of concern is the variance of the shrinkage response in the process itself. In the analysis of Figure 15.7, it is clear that mold temperature, injection velocity, and the interaction between the two are the only important factors.
(a) Can the setting on velocity be used to create some type of control on the process variance in shrinkage, which arises due to the inability to control temperature? Explain.
(b) Using parameter estimates from Figure 15.7, give an estimate of the following models:
    (i) mean shrinkage across the distribution of temperature;
    (ii) shrinkage variance as a function of σz².
(c) Use the estimated variance model to determine the level of velocity that minimizes the shrinkage variance.
(d) Use the mean shrinkage model to determine what value of velocity minimizes mean shrinkage.
(e) Are your results above consistent with your analysis from the interaction plot in Figure 15.6? Explain.
15.34 In Case Study 15.2 involving the coal cleansing data, the percent solids in the process system is known to vary uncontrollably during the process and is viewed as a noise factor with mean 0 and variance σz². The response, cleansing efficiency, has a mean and variance that change behavior during the process. Use only significant terms in the following parts.
(a) Use the estimates in Figure 15.9 to develop the process mean efficiency and variance models.
(b) What factor (or factors) might be controlled at certain levels to control or otherwise minimize the process variance?
(c) What conditions of factors B and C within the design region maximize the estimated mean?
(d) What level of C would you suggest for minimization of process variance when B = 1? When B = −1?
15.35 Use the coal cleansing data of Exercise 15.2 on page 609 to fit a model of the type

E(Y) = β0 + β1x1 + β2x2 + β3x3,

where the levels are
x1, percent solids: 8, 12
x2, flow rate: 150, 250 gal/min
x3, pH: 5, 6
Center and scale the variables to design units. Also conduct a test for lack of fit, and comment concerning the adequacy of the linear regression model.

15.36 A 2⁵ factorial plan is used to build a regression model containing first-order coefficients and model terms for all two-factor interactions. Duplicate runs are made for each factor. Outline the analysis-of-variance table, showing degrees of freedom for regression, lack of fit, and pure error.

15.37 Consider the 1/16 of the 2⁷ factorial discussed in Section 15.9. List the additional 11 defining contrasts.

15.38 Construct a Plackett-Burman design for 10 variables containing 24 experimental runs.

Review Exercises

15.39 A Plackett-Burman design was used to study the rheological properties of high-molecular-weight copolymers. Two levels of each of six variables were fixed in the experiment. The viscosity of the polymer is the response. The data were analyzed by the Statistics Consulting Center at Virginia Tech for personnel in the Chemical Engineering Department at the University. The variables are as follows: hard block chemistry x1, nitrogen flow rate x2, heat-up time x3, percent compression x4, scans (high and low) x5, percent strain x6. The data are presented here.

Obs.       y           Obs.       y
 1       194,700        7        8969
 2       588,400        8      18,340
 3          7533        9        6793
 4       514,100       10     160,400
 5       277,300       11        7008
 6       493,500       12        3637

Design settings (coded ±1) for x1 through x6:
1 −1 1 −1 −1 1 1 −1 1 1 −1 1 1 −1 1 1 1 1 −1 1 −1 −1 −1 −1 −1 −1
−1 −1 −1 −1 1 −1 −1 1 1 −1 1 1 −1 1 1 −1 1 1 1 1 −1 1 −1 −1
1 1 −1 1 1 −1 1 1 1 1 −1 1 −1 −1 −1 −1 1 −1 −1 1 −1 −1

Build a regression equation relating viscosity to the levels of the six variables. Conduct t-tests for all main effects. Recommend factors that should be retained for future studies and those that should not. Use the residual mean square (5 degrees of freedom) as a measure of experimental error.

15.40 A large petroleum company in the Southwest regularly conducts experiments to test additives to drilling fluids. Plastic viscosity is a rheological measure reflecting the thickness of the fluid. Various polymers are added to the fluid to increase viscosity. The following is a data set in which two polymers are used at two levels each and the viscosity measured. The concentration of the polymers is indicated as "low" or "high." Conduct an analysis of the 2² factorial experiment. Test for effects for the two polymers and interaction.

                   Polymer 2
Polymer 1      Low           High
Low          3.0, 3.5     11.7, 12.0
High        11.3, 12.0    21.7, 22.4
15.41 A 2² factorial experiment is analyzed by the Statistics Consulting Center at Virginia Tech. The client is a member of the Department of Housing, Interior Design, and Resource Management. The client is interested in comparing cold start to preheating ovens in terms of total energy delivered to the product. In addition, convection is being compared to regular mode. Four experimental runs are made at each of the four factor combinations. Following are the data from the experiment:

                       Preheat                 Cold
Convection Mode    618, 619.3, 629, 611   575, 573.7, 574, 572
Regular Mode       581, 585.7, 581, 595   558, 562, 562, 566

Do an analysis of variance to study main effects and interaction. Draw conclusions.

15.42 In the study “The Use of Regression Analysis for Correcting Matrix Effects in the X-Ray Fluorescence Analysis of Pyrotechnic Compositions,” published in the Proceedings of the Tenth Conference on the Design of Experiments in Army Research Development and Testing, ARO-D Report 65-3 (1965), an experiment was conducted in which the concentrations of four components of a propellant mixture and the weights of fine and coarse particles in the slurry were each allowed to vary. Factors A, B, C, and D, each at two levels, represent the concentrations of the four components, and factors E and F, also at two levels, represent the weights of the fine and coarse particles present in the slurry. The goal of the analysis was to determine if the X-ray intensity ratios associated with component 1 of the propellant were significantly influenced by varying the concentrations of the various components and the weights of the particles in the mixture. A 1/8 fraction of a 2⁶ factorial experiment was used, with the defining contrasts being ADE, BCE, and ACF. The data shown here represent the total of a pair of intensity readings.

Batch   Treatment Combination   Intensity Ratio Total
  1            abef                    2.2480
  2            cdef                    1.8570
  3            (1)                     2.2428
  4            ace                     2.3270
  5            bde                     1.8830
  6            abcd                    1.8078
  7            adf                     2.1424
  8            bcf                     1.9122

The pooled mean square error with 8 degrees of freedom is given by 0.02005. Analyze the data using a 0.05 level of significance to determine if the concentrations of the components and the weights of the fine and coarse particles present in the slurry have a significant influence on the intensity ratios associated with component 1. Assume that no interaction exists among the six factors.
15.43 Use Table 15.16 to construct a 16-run design with 8 factors that is resolution IV.
15.44 Verify that your design in Review Exercise 15.43 is indeed resolution IV.
15.45 Construct a design that contains 9 design points, is orthogonal, contains 12 total runs and 3 degrees of freedom for replication error, and allows for a lack-of-fit test for pure quadratic curvature.
15.46 Consider a design which is a 2³⁻¹ (resolution III) with 2 center runs. Consider ȳf as the average response at the factorial design points and ȳ0 as the average response at the design center. Suppose the true regression model is

E(Y) = β0 + β1x1 + β2x2 + β3x3 + β11x1² + β22x2² + β33x3².

(a) Give (and verify) E(ȳf − ȳ0).
(b) Explain what you have learned from the result in (a).
15.13 Potential Misconceptions and Hazards; Relationship to Material in Other Chapters
In the use of fractional factorial experiments, one of the most important considerations that the analyst must be aware of is the design resolution. A design of low resolution is smaller (and hence cheaper) than one of higher resolution. However, a price is paid for the cheaper design. The design of lower resolution has heavier aliasing than one of higher resolution. For example, if the researcher has expectations that two-factor interactions may be important, then resolution III should not be used. A resolution III design is strictly a main effects plan.
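The aliasing implied by a defining relation can be worked out mechanically by multiplying effect "words" modulo 2 (repeated letters cancel). The short sketch below is a generic illustration, not anything from the text: it lists the aliases of each effect in a 2³⁻¹ design with the assumed defining relation I = ABC, making it plain why such a resolution III plan confounds main effects with two-factor interactions.

```python
# Alias computation for a two-level fractional factorial: multiplying an
# effect "word" by a defining word cancels repeated letters (mod 2).
def multiply(word1: str, word2: str) -> str:
    letters = set(word1) ^ set(word2)          # symmetric difference of letters
    return "".join(sorted(letters)) or "I"

defining = "ABC"                               # defining relation I = ABC
for effect in ["A", "B", "C", "AB", "AC", "BC"]:
    print(f"{effect} = {multiply(effect, defining)}")
# Output shows A = BC, B = AC, C = AB: every main effect is aliased with a
# two-factor interaction, the signature of a resolution III design.
```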

Chapter 16
Nonparametric Statistics
16.1 Nonparametric Tests
Most of the hypothesis-testing procedures discussed in previous chapters are based on the assumption that the random samples are selected from normal populations. Fortunately, most of these tests are still reliable when we experience slight departures from normality, particularly when the sample size is large. Traditionally, these testing procedures have been referred to as parametric methods. In this chapter, we consider a number of alternative test procedures, called nonparametric or distribution-free methods, that often assume no knowledge whatsoever about the distributions of the underlying populations, except perhaps that they are continuous.
Nonparametric, or distribution-free, procedures are used with increasing frequency by data analysts. There are many applications in science and engineering where the data are reported as values not on a continuum but rather on an ordinal scale such that it is quite natural to assign ranks to the data. In fact, the reader may notice quite early in this chapter that the distribution-free methods described here involve an analysis of ranks. Most analysts find the computations involved in nonparametric methods to be very appealing and intuitive.
For an example where a nonparametric test is applicable, consider the situation in which two judges rank five brands of premium beer by assigning a rank of 1 to the brand believed to have the best overall quality, a rank of 2 to the second best, and so forth. A nonparametric test could then be used to determine whether there is any agreement between the two judges.
We should also point out that there are a number of disadvantages associated with nonparametric tests. Primarily, they do not utilize all the information provided by the sample, and thus a nonparametric test will be less efficient than the corresponding parametric procedure when both methods are applicable. Consequently, to achieve the same power, a nonparametric test will require a larger sample size than will the corresponding parametric test.
As we indicated earlier, slight departures from normality result in minor deviations from the ideal for the standard parametric tests. This is particularly true for the t-test and the F-test. In the case of the t-test and the F-test, the P-value quoted may be slightly in error if there is a moderate violation of the normality assumption.
In summary, if a parametric and a nonparametric test are both applicable to the same set of data, we should carry out the more efficient parametric technique. However, we should recognize that the assumptions of normality often cannot be justified and that we do not always have quantitative measurements. It is fortunate that statisticians have provided us with a number of useful nonparametric procedures. Armed with nonparametric techniques, the data analyst has more ammunition to accommodate a wider variety of experimental situations. It should be pointed out that even under the standard normal theory assumptions, the efficiencies of the nonparametric techniques are remarkably close to those of the corresponding parametric procedure. On the other hand, serious departures from normality will render the nonparametric method much more efficient than the parametric procedure.
Sign Test

The reader should recall that the procedures discussed in Section 10.4 for testing the null hypothesis that μ = μ0 are valid only if the population is approximately normal or if the sample is large. If n < 30 and the population is decidedly nonnormal, we must resort to a nonparametric test.

The sign test is used to test hypotheses on a population median. In the case of many of the nonparametric procedures, the mean is replaced by the median as the pertinent location parameter under test. Recall that the sample median was defined in Section 1.3. The population counterpart, denoted by μ̃, has an analogous definition. Given a random variable X, μ̃ is defined such that P(X > μ̃) ≤ 0.5 and P(X < μ̃) ≤ 0.5. In the continuous case, P(X > μ̃) = P(X < μ̃) = 0.5. Of course, if the distribution is symmetric, the population mean and median are equal.

In testing the null hypothesis H0 that μ̃ = μ̃0 against an appropriate alternative, on the basis of a random sample of size n, we replace each sample value exceeding μ̃0 with a plus sign and each sample value less than μ̃0 with a minus sign. If the null hypothesis is true and the population is symmetric, the sum of the plus signs should be approximately equal to the sum of the minus signs. When one sign appears more frequently than it should based on chance alone, we reject the hypothesis that the population median μ̃ is equal to μ̃0.

In theory, the sign test is applicable only in situations where μ̃0 cannot equal the value of any of the observations. Although there is a zero probability of obtaining a sample observation exactly equal to μ̃0 when the population is continuous, nevertheless, in practice a sample value equal to μ̃0 will often occur from a lack of precision in recording the data. When sample values equal to μ̃0 are observed, they are excluded from the analysis and the sample size is correspondingly reduced.

The appropriate test statistic for the sign test is the binomial random variable X, representing the number of plus signs in our random sample. If the null hypothesis that μ̃ = μ̃0 is true, the probability that a sample value results in either a plus or a minus sign is equal to 1/2. Therefore, to test the null hypothesis that μ̃ = μ̃0, we actually test the null hypothesis that the number of plus signs is a value of a random variable having the binomial distribution with the parameter p = 1/2. P-values for both one-sided and two-sided alternatives can then be calculated using this binomial distribution. For example, in testing

H0: μ̃ = μ̃0,   H1: μ̃ < μ̃0,

we shall reject H0 in favor of H1 only if the proportion of plus signs is sufficiently less than 1/2, that is, when the value x of our random variable is small. Hence, if the computed P-value

P = P(X ≤ x when p = 1/2)

is less than or equal to some preselected significance level α, we reject H0 in favor of H1. For example, when n = 15 and x = 3, we find from Table A.1 that

P = P(X ≤ 3 when p = 1/2) = Σ_{x=0}^{3} b(x; 15, 1/2) = 0.0176,

so the null hypothesis μ̃ = μ̃0 can certainly be rejected at the 0.05 level of significance but not at the 0.01 level. To test the hypothesis

H0: μ̃ = μ̃0,   H1: μ̃ > μ̃0,
we reject H0 in favor of H1 only if the proportion of plus signs is sufficiently greater than 1/2, that is, when x is large. Hence, if the computed P-value

P = P(X ≥ x when p = 1/2)

is less than α, we reject H0 in favor of H1. Finally, to test the hypothesis

H0: μ̃ = μ̃0,   H1: μ̃ ≠ μ̃0,

we reject H0 in favor of H1 when the proportion of plus signs is significantly less than or greater than 1/2. This, of course, is equivalent to x being sufficiently small or sufficiently large. Therefore, if x < n/2 and the computed P-value

P = 2P(X ≤ x when p = 1/2)

is less than or equal to α, or if x > n/2 and the computed P-value

P = 2P(X ≥ x when p = 1/2)

is less than or equal to α, we reject H0 in favor of H1.

Whenever n > 10, binomial probabilities with p = 1/2 can be approximated from the normal curve, since np = nq > 5. Suppose, for example, that we wish to test the hypothesis
H0: μ̃ = μ̃0,   H1: μ̃ < μ̃0,

at the α = 0.05 level of significance, for a random sample of size n = 20 that yields x = 6 plus signs. Using the normal curve approximation with

μ = np = (20)(0.5) = 10 and σ = √(npq) = √((20)(0.5)(0.5)) = 2.236,

we find that

z = (6.5 − 10)/2.236 = −1.57.

Therefore,

P = P(X ≤ 6) ≈ P(Z < −1.57) = 0.0582,

which leads to the nonrejection of the null hypothesis.

Example 16.1: The following data represent the number of hours that a rechargeable hedge trimmer operates before a recharge is required: 1.5, 2.2, 0.9, 1.3, 2.0, 1.6, 1.8, 1.5, 2.0, 1.2, 1.7. Use the sign test to test the hypothesis, at the 0.05 level of significance, that this particular trimmer operates a median of 1.8 hours before requiring a recharge.

Solution:
1. H0: μ̃ = 1.8.
2. H1: μ̃ ≠ 1.8.
3. α = 0.05.
4. Test statistic: Binomial variable X with p = 1/2.
5. Computations: Replacing each value by the symbol "+" if it exceeds 1.8 and by the symbol "−" if it is less than 1.8 and discarding the one measurement that equals 1.8, we obtain the sequence

− + − − + − − + − −

for which n = 10, x = 3, and n/2 = 5. Therefore, from Table A.1 the computed P-value is

P = 2P(X ≤ 3 when p = 1/2) = 2 Σ_{x=0}^{3} b(x; 10, 1/2) = 0.3438 > 0.05.

6. Decision: Do not reject the null hypothesis and conclude that the median operating time is not significantly different from 1.8 hours.
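The binomial arithmetic behind the sign test is easy to check with software. The sketch below, assuming SciPy's binom distribution is available, reproduces the two-sided P-value of Example 16.1 as well as the n = 15, x = 3 calculation shown earlier.

```python
from scipy.stats import binom

# Example 16.1: hedge trimmer operating times, H0: median = 1.8.
times = [1.5, 2.2, 0.9, 1.3, 2.0, 1.6, 1.8, 1.5, 2.0, 1.2, 1.7]
signs = [t for t in times if t != 1.8]           # discard values equal to 1.8
n = len(signs)                                   # n = 10
x = sum(t > 1.8 for t in signs)                  # number of plus signs, x = 3

# Two-sided sign test: double the smaller tail probability under p = 1/2.
p_value = 2 * min(binom.cdf(x, n, 0.5), binom.sf(x - 1, n, 0.5))
print(f"n = {n}, x = {x}, P = {p_value:.4f}")    # P = 0.3438 > 0.05

# One-sided check from the text: n = 15, x = 3 gives P = 0.0176.
print(f"{binom.cdf(3, 15, 0.5):.4f}")
```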
We can also use the sign test to test the null hypothesis μ̃1 − μ̃2 = d0 for paired observations. Here we replace each difference, di, with a plus or minus sign depending on whether the adjusted difference, di − d0, is positive or negative. Throughout this section, we have assumed that the populations are symmetric. However, even if populations are skewed, we can carry out the same test procedure, but the hypotheses refer to the population medians rather than the means.
Example 16.2: A taxi company is trying to decide whether the use of radial tires instead of regular belted tires improves fuel economy. Sixteen cars are equipped with radial tires and driven over a prescribed test course. Without changing drivers, the same cars are then equipped with the regular belted tires and driven once again over the test course. The gasoline consumption, in kilometers per liter, is given in Table 16.1. Can we conclude at the 0.05 level of significance that cars equipped with radial tires obtain better fuel economy than those equipped with regular belted tires?
Table 16.1: Data for Example 16.2

Car             1    2    3    4    5    6    7    8
Radial Tires   4.2  4.7  6.6  7.0  6.7  4.5  5.7  6.0
Belted Tires   4.1  4.9  6.2  6.9  6.8  4.4  5.7  5.8

Car             9   10   11   12   13   14   15   16
Radial Tires   7.4  4.9  6.1  5.2  5.7  6.9  6.8  4.9
Belted Tires   6.9  4.9  6.0  4.9  5.3  6.5  7.1  4.8
Solution: Let μ̃1 and μ̃2 represent the median kilometers per liter for cars equipped with radial and belted tires, respectively.
1. H0: μ̃1 − μ̃2 = 0.
2. H1: μ̃1 − μ̃2 > 0.
3. α = 0.05.
4. Test statistic: Binomial variable X with p = 1/2.
5. Computations: After replacing each positive difference by a “+” symbol and each negative difference by a “−” symbol and then discarding the two zero differences, we obtain the sequence
+ − + + − + + + + + + + − +
for which n = 14 and x = 11. Using the normal curve approximation, we find
z = (10.5 − 7)/√((14)(0.5)(0.5)) = 1.87,

and then

P = P(X ≥ 11) ≈ P(Z > 1.87) = 0.0307.

6. Decision: Reject H0 and conclude that, on the average, radial tires do improve fuel economy.
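As a quick computational check on Example 16.2, the sketch below, again assuming SciPy is available, computes both the exact binomial P-value and the continuity-corrected normal approximation used in the text; the exact value (about 0.029) agrees closely with the approximate 0.0307.

```python
from math import sqrt
from scipy.stats import binom, norm

radial = [4.2, 4.7, 6.6, 7.0, 6.7, 4.5, 5.7, 6.0,
          7.4, 4.9, 6.1, 5.2, 5.7, 6.9, 6.8, 4.9]
belted = [4.1, 4.9, 6.2, 6.9, 6.8, 4.4, 5.7, 5.8,
          6.9, 4.9, 6.0, 4.9, 5.3, 6.5, 7.1, 4.8]

diffs = [r - b for r, b in zip(radial, belted) if r != b]  # drop zero differences
n = len(diffs)                                             # n = 14
x = sum(d > 0 for d in diffs)                              # plus signs, x = 11

p_exact = binom.sf(x - 1, n, 0.5)                # P(X >= 11) exactly
z = (x - 0.5 - 0.5 * n) / sqrt(n * 0.25)         # continuity-corrected z = 1.87
p_normal = norm.sf(z)                            # approximately 0.0307
print(f"x = {x}, exact P = {p_exact:.4f}, normal approx P = {p_normal:.4f}")
```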
Not only is the sign test one of the simplest nonparametric procedures to apply; it has the additional advantage of being applicable to dichotomous data that cannot be recorded on a numerical scale but can be represented by positive and negative responses. For example, the sign test is applicable in experiments where a qualitative response such as "hit" or "miss" is recorded, and in sensory-type experiments where a plus or minus sign is recorded depending on whether the taste tester correctly or incorrectly identifies the desired ingredient.

We shall attempt to make comparisons between many of the nonparametric procedures and the corresponding parametric tests. In the case of the sign test the competition is, of course, the t-test. If we are sampling from a normal distribution, the use of the t-test will result in a larger power for the test. If the distribution is merely symmetric, though not normal, the t-test is preferred in terms of power unless the distribution has extremely "heavy tails" compared to the normal distribution.
16.2 Signed-Rank Test
The reader should note that the sign test utilizes only the plus and minus signs of the differences between the observations and μ ̃0 in the one-sample case, or the plus and minus signs of the differences between the pairs of observations in the paired-sample case; it does not take into consideration the magnitudes of these differences. A test utilizing both direction and magnitude, proposed in 1945 by Frank Wilcoxon, is now commonly referred to as the Wilcoxon signed-rank test.
The analyst can extract more information from the data in a nonparametric fashion if it is reasonable to invoke an additional restriction on the distribution from which the data were taken. The Wilcoxon signed-rank test applies in the case of a symmetric continuous distribution. Under this condition, we can test the null hypothesis μ̃ = μ̃0. We first subtract μ̃0 from each sample value, discarding all differences equal to zero. The remaining differences are then ranked without regard to sign. A rank of 1 is assigned to the smallest absolute difference (i.e., without sign), a rank of 2 to the next smallest, and so on. When the absolute value of two or more differences is the same, assign to each the average of the ranks that would have been assigned if the differences were distinguishable. For example, if the fifth and sixth smallest differences are equal in absolute value, each is assigned a rank of 5.5. If the hypothesis μ̃ = μ̃0 is true, the total of the ranks corresponding to the positive differences should nearly equal the total of the ranks corresponding to the negative differences. Let us represent these totals by w+ and w−, respectively. We designate the smaller of w+ and w− by w.
In selecting repeated samples, we would expect w+ and w−, and therefore w, to vary. Thus, we may think of w+, w−, and w as values of the corresponding random variables W+, W−, and W. The null hypothesis μ̃ = μ̃0 can be rejected in favor of the alternative μ̃ < μ̃0 only if w+ is small and w− is large. Likewise, the alternative μ̃ > μ̃0 can be accepted only if w+ is large and w− is small. For a two-sided alternative, we may reject H0 in favor of H1 if either w+ or w−, and hence w, is sufficiently small. Therefore, no matter what the alternative hypothesis

may be, we reject the null hypothesis when the value of the appropriate statistic W+, W−, or W is sufficiently small.

Two Samples with Paired Observations
To test the null hypothesis that we are sampling two continuous symmetric populations with μ̃1 = μ̃2 for the paired-sample case, we rank the differences of the paired observations without regard to sign and proceed as in the single-sample case. The various test procedures for both the single- and paired-sample cases are summarized in Table 16.2.
Table 16.2: Signed-Rank Test

H0              H1              Compute
μ̃ = μ̃0         μ̃ < μ̃0         w+
                μ̃ > μ̃0         w−
                μ̃ ≠ μ̃0         w
μ̃1 = μ̃2        μ̃1 < μ̃2        w+
                μ̃1 > μ̃2        w−
                μ̃1 ≠ μ̃2        w
It is not difficult to show that whenever n < 5 and the level of significance does not exceed 0.05 for a one-tailed test or 0.10 for a two-tailed test, all possible values of w+, w−, or w will lead to the acceptance of the null hypothesis. However, when 5 ≤ n ≤ 30, Table A.16 shows approximate critical values of W+ and W− for levels of significance equal to 0.01, 0.025, and 0.05 for a one-tailed test and critical values of W for levels of significance equal to 0.02, 0.05, and 0.10 for a two-tailed test. The null hypothesis is rejected if the computed value w+, w−, or w is less than or equal to the appropriate tabled value. For example, when n = 12, Table A.16 shows that a value of w+ ≤ 17 is required for the one-sided alternative μ̃ < μ̃0 to be significant at the 0.05 level.

Example 16.3: Rework Example 16.1 by using the signed-rank test.

Solution:
1. H0: μ̃ = 1.8.
2. H1: μ̃ ≠ 1.8.
3. α = 0.05.
4. Critical region: Since n = 10 after discarding the one measurement that equals 1.8, Table A.16 shows the critical region to be w ≤ 8.
5. Computations: Subtracting 1.8 from each measurement and then ranking the differences without regard to sign, we have

di      −0.3   0.4  −0.9  −0.5   0.2  −0.2  −0.3   0.2  −0.6  −0.1
Ranks    5.5     7    10     8     3     3   5.5     3     9     1

Now w+ = 13 and w− = 42, so w = 13, the smaller of w+ and w−.
6. Decision: As before, do not reject H0 and conclude that the median operating time is not significantly different from 1.8 hours.

The signed-rank test can also be used to test the null hypothesis that μ̃1 − μ̃2 = d0. In this case, the populations need not be symmetric. As with the sign test, we subtract d0 from each difference, rank the adjusted differences without regard to sign, and apply the same procedure as above.

Example 16.4: It is claimed that a college senior can increase his or her score in the major field area of the graduate record examination by at least 50 points if he or she is provided with sample problems in advance. To test this claim, 20 college seniors are divided into 10 pairs such that the students in each matched pair have almost the same overall grade-point averages for their first 3 years in college. Sample problems and answers are provided at random to one member of each pair 1 week prior to the examination. The examination scores are given in Table 16.3.

Table 16.3: Data for Example 16.4

Pair                       1    2    3    4    5    6    7    8    9   10
With Sample Problems     531  621  663  579  451  660  591  719  543  575
Without Sample Problems  509  540  688  502  424  683  568  748  530  524

Test the null hypothesis, at the 0.05 level of significance, that sample problems increase scores by 50 points against the alternative hypothesis that the increase is less than 50 points.

Solution: Let μ̃1 and μ̃2 represent the median scores of all students taking the test in question with and without sample problems, respectively.
1. H0: μ̃1 − μ̃2 = 50.
2. H1: μ̃1 − μ̃2 < 50.
3. α = 0.05.
4. Critical region: Since n = 10, Table A.16 shows the critical region to be w+ ≤ 11.
5. Computations:

Pair        1    2    3    4    5    6    7    8    9   10
di         22   81  −25   77   27  −23   23  −29   13   51
di − d0   −28   31  −75   27  −23  −73  −27  −79  −37    1
Ranks       5    6    9  3.5    2    8  3.5   10    7    1

Now we find that w+ = 6 + 3.5 + 1 = 10.5.
6. Decision: Reject H0 and conclude that sample problems do not, on average, increase one's graduate record score by as much as 50 points.
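SciPy implements the Wilcoxon signed-rank procedure directly, including the midrank treatment of ties. The sketch below, assuming scipy.stats.wilcoxon with its default two-sided alternative (under which the reported statistic is the smaller of w+ and w−), reproduces w = 13 for Example 16.3.

```python
from scipy.stats import wilcoxon

times = [1.5, 2.2, 0.9, 1.3, 2.0, 1.6, 1.8, 1.5, 2.0, 1.2, 1.7]
diffs = [t - 1.8 for t in times if t != 1.8]    # discard the zero difference

# Two-sided signed-rank test of H0: median difference = 0.
result = wilcoxon(diffs)
print(result.statistic, result.pvalue)           # statistic = 13.0 (= w), P > 0.05
```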
Normal Approximation for Large Samples

When n ≥ 15, the sampling distribution of W+ (or W−) approaches the normal distribution with mean and variance given by

μ_{W+} = n(n + 1)/4 and σ²_{W+} = n(n + 1)(2n + 1)/24.

Therefore, when n exceeds the largest value in Table A.16, the statistic

Z = (W+ − μ_{W+})/σ_{W+}

can be used to determine the critical region for the test.

Exercises

16.1 The following data represent the time, in minutes, that a patient has to wait during 12 visits to a doctor's office before being seen by the doctor:

17 15 20 20 32 28 12 26 25 25 35 24

Use the sign test at the 0.05 level of significance to test the doctor's claim that the median waiting time for her patients is not more than 20 minutes.

16.2 The following data represent the number of hours of flight training received by 18 student pilots from a certain instructor prior to their first solo flight:

9 12 18 14 12 14 12 10 16 11 9 11 13 11 13 15 13 14

Using binomial probabilities from Table A.1, perform a sign test at the 0.02 level of significance to test the instructor's claim that the median time required before his students' solo is 12 hours of flight training.

16.3 A food inspector examined 16 jars of a certain brand of jam to determine the percent of foreign impurities. The following data were recorded:

2.4 2.3 3.1 2.2 2.3 1.2 1.0 2.4 1.7 1.1 4.2 1.9 1.7 3.6 1.6 2.3

Using the normal approximation to the binomial distribution, perform a sign test at the 0.05 level of significance to test the null hypothesis that the median percent of impurities in this brand of jam is 2.5% against the alternative that the median percent of impurities is not 2.5%.

16.4 A paint supplier claims that a new additive will reduce the drying time of its acrylic paint. To test this claim, 12 panels of wood were painted, one-half of each panel with paint containing the regular additive and the other half with paint containing the new additive. The drying times, in hours, were recorded as follows:

             Drying Time (hours)
Panel   New Additive   Regular Additive
  1         6.4              6.6
  2         5.8              5.8
  3         7.4              7.8
  4         5.5              5.7
  5         6.3              6.0
  6         7.8              8.4
  7         8.6              8.8
  8         8.2              8.4
  9         7.0              7.3
 10         4.9              5.8
 11         5.9              5.8
 12         6.5              6.5

Use the sign test at the 0.05 level to test the null hypothesis that the new additive is no better than the regular additive in reducing the drying time of this kind of paint.

16.5 It is claimed that a new diet will reduce a person's weight by 4.5 kilograms, on average, in a period of 2 weeks. The weights of 10 women were recorded before and after a 2-week period during which they followed this diet, yielding the following data:

Woman   Weight Before   Weight After
  1         58.5            60.0
  2         60.3            54.9
  3         61.7            58.1
  4         69.0            62.1
  5         64.0            58.5
  6         62.6            59.9
  7         56.7            54.4
  8         63.6            60.2
  9         68.2            62.3
 10         59.4            58.7

Use the sign test at the 0.05 level of significance to test the hypothesis that the diet reduces the median weight by 4.5 kilograms against the alternative hypothesis that the median weight loss is less than 4.5 kilograms.

16.6 Two types of instruments for measuring the amount of sulfur monoxide in the atmosphere are being compared in an air-pollution experiment. The following readings were recorded daily for a period of 2 weeks:

           Sulfur Monoxide
Day   Instrument A   Instrument B
  1       0.96           0.87
  2       0.82           0.74
  3       0.75           0.63
  4       0.61           0.55
  5       0.89           0.76
  6       0.64           0.70
  7       0.81           0.69
  8       0.68           0.57
  9       0.65           0.53
 10       0.84           0.88
 11       0.59           0.51
 12       0.94           0.79
 13       0.91           0.84
 14       0.77           0.63

Using the normal approximation to the binomial distribution, perform a sign test to determine whether the different instruments lead to different results. Use a 0.05 level of significance.

16.7 The following figures give the systolic blood pressure of 16 joggers before and after an 8-kilometer run:

Jogger   Before   After
   1       158      164
   2       149      158
   3       160      163
   4       155      160
   5       164      172
   6       138      147
   7       163      167
   8       159      169
   9       165      173
  10       145      147
  11       150      156
  12       161      164
  13       132      133
  14       155      161
  15       146      154
  16       159      170

Use the sign test at the 0.05 level of significance to test the null hypothesis that jogging 8 kilometers increases the median systolic blood pressure by 8 points against the alternative that the increase in the median is less than 8 points.

16.8 Analyze the data of Exercise 16.1 by using the signed-rank test.

16.9 Analyze the data of Exercise 16.2 by using the signed-rank test.
16.10 The weights of 5 people before they stopped smoking and 5 weeks after they stopped smoking, in kilograms, are as follows:

           Individual
         1    2    3    4    5
Before  66   80   69   52   75
After   71   82   68   56   73

Use the signed-rank test for paired observations to test the hypothesis, at the 0.05 level of significance, that giving up smoking has no effect on a person's weight against the alternative that one's weight increases if he or she quits smoking.

16.11 Rework Exercise 16.5 by using the signed-rank test.

16.12 The following are the numbers of prescriptions filled by two pharmacies over a 20-day period:

Day   Pharmacy A   Pharmacy B
  1       19           17
  2       21           15
  3       15           12
  4       17           12
  5       24           16
  6       12           15
  7       19           11
  8       14           13
  9       20           14
 10       18           21
 11       23           19
 12       21           15
 13       17           11
 14       12           10
 15       16           20
 16       15           12
 17       20           13
 18       18           17
 19       14           16
 20       22           18

Use the signed-rank test at the 0.01 level of significance to determine whether the two pharmacies, on average, fill the same number of prescriptions against the alternative that pharmacy A fills more prescriptions than pharmacy B.

16.13 Rework Exercise 16.7 by using the signed-rank test.

16.14 Rework Exercise 16.6 by using the signed-rank test.

16.3 Wilcoxon Rank-Sum Test

As we indicated earlier, the nonparametric procedure is generally an appropriate alternative to the normal theory test when the normality assumption does not hold. When we are interested in testing equality of means of two continuous distributions that are obviously nonnormal, and samples are independent (i.e., there is no pairing of observations), the Wilcoxon rank-sum test or Wilcoxon two-sample test is an appropriate alternative to the two-sample t-test described in Chapter 10.

We shall test the null hypothesis H0 that μ̃1 = μ̃2 against some suitable alternative. First we select a random sample from each of the populations. Let n1 be the number of observations in the smaller sample, and n2 the number of observations in the larger sample. When the samples are of equal size, n1 and n2 may be randomly assigned. Arrange the n1 + n2 observations of the combined samples in ascending order and substitute a rank of 1, 2, . . . , n1 + n2 for each observation. In the case of ties (identical observations), we replace the observations by the mean of the ranks that the observations would have if they were distinguishable. For example, if the seventh and eighth observations were identical, we would assign a rank of 7.5 to each of the two observations. The sum of the ranks corresponding to the n1 observations in the smaller sample is denoted by w1.
Similarly, the value w2 represents the sum of the n2 ranks corresponding to the larger sample. The total w1 + w2 depends only on the number of observations in the two samples and is in no way affected by the results of the experiment. Hence, if n1 = 3 and n2 = 4, then w1 + w2 = 1 + 2 + ··· + 7 = 28, regardless of the numerical values of the observations. In general,

w1 + w2 = (n1 + n2)(n1 + n2 + 1)/2,

the arithmetic sum of the integers 1, 2, . . . , n1 + n2. Once we have determined w1, it may be easier to find w2 by the formula

w2 = (n1 + n2)(n1 + n2 + 1)/2 − w1.

In choosing repeated samples of sizes n1 and n2, we would expect w1, and therefore w2, to vary. Thus, we may think of w1 and w2 as values of the random variables W1 and W2, respectively. The null hypothesis μ̃1 = μ̃2 will be rejected in favor of the alternative μ̃1 < μ̃2 only if w1 is small and w2 is large. Likewise, the alternative μ̃1 > μ̃2 can be accepted only if w1 is large and w2 is small. For a two-tailed test, we may reject H0 in favor of H1 if w1 is small and w2 is large or if w1 is large and w2 is small. In other words, the alternative μ̃1 < μ̃2 is accepted if w1 is sufficiently small; the alternative μ̃1 > μ̃2 is accepted if w2 is sufficiently small; and the alternative μ̃1 ≠ μ̃2 is accepted if the minimum of w1 and w2 is sufficiently small. In actual practice, we usually base our decision on the value

u1 = w1 − n1(n1 + 1)/2 or u2 = w2 − n2(n2 + 1)/2
of the related statistic U1 or U2 or on the value u of the statistic U, the minimum of U1 and U2. These statistics simplify the construction of tables of critical values,

since both U1 and U2 have symmetric sampling distributions and assume values in the interval from 0 to n1n2 such that u1 + u2 = n1n2.
From the formulas for u1 and u2 we see that u1 will be small when w1 is small and u2 will be small when w2 is small. Consequently, the null hypothesis will be rejected whenever the appropriate statistic U1, U2, or U assumes a value less than or equal to the desired critical value given in Table A.17. The various test procedures are summarized in Table 16.4.
Table 16.4: Rank-Sum Test

H0              H1              Compute
μ̃1 = μ̃2        μ̃1 < μ̃2        u1
                μ̃1 > μ̃2        u2
                μ̃1 ≠ μ̃2        u
Table A.17 gives critical values of U1 and U2 for levels of significance equal to 0.001, 0.01, 0.025, and 0.05 for a one-tailed test, and critical values of U for levels of significance equal to 0.002, 0.02, 0.05, and 0.10 for a two-tailed test. If the observed value of u1, u2, or u is less than or equal to the tabled critical value, the null hypothesis is rejected at the level of significance indicated by the table. Suppose, for example, that we wish to test the null hypothesis that μ̃1 = μ̃2 against the one-sided alternative that μ̃1 < μ̃2 at the 0.05 level of significance for random samples of sizes n1 = 3 and n2 = 5 that yield the value w1 = 8. It follows that

u1 = 8 − (3)(4)/2 = 2.

Our one-tailed test is based on the statistic U1. Using Table A.17, we reject the null hypothesis of equal means when u1 ≤ 1. Since u1 = 2 does not fall in the rejection region, the null hypothesis cannot be rejected.

Example 16.5: The nicotine content of two brands of cigarettes, measured in milligrams, was found to be as follows:

Brand A   2.1  4.0  6.3  5.4  4.8  3.7  6.1  3.3
Brand B   4.1  0.6  3.1  2.5  4.0  6.2  1.6  2.2  1.9  5.4

Test the hypothesis, at the 0.05 level of significance, that the median nicotine contents of the two brands are equal against the alternative that they are unequal.

Solution:
1. H0: μ̃1 = μ̃2.
2. H1: μ̃1 ≠ μ̃2.
3. α = 0.05.
4. Critical region: u ≤ 17 (from Table A.17).
5. Computations: The observations are arranged in ascending order and ranks from 1 to 18 assigned.

Original Data   Ranks        Original Data   Ranks
    0.6           1              4.0          10.5*
    1.6           2              4.0          10.5
    1.9           3              4.1          12
    2.1           4*             4.8          13*
    2.2           5              5.4          14.5*
    2.5           6              5.4          14.5
    3.1           7              6.1          16*
    3.3           8*             6.2          17
    3.7           9*             6.3          18*

The ranks marked with an asterisk belong to sample A. Now

w1 = 4 + 8 + 9 + 10.5 + 13 + 14.5 + 16 + 18 = 93

and

w2 = (18)(19)/2 − 93 = 78.

Therefore,

u1 = 93 − (8)(9)/2 = 57,   u2 = 78 − (10)(11)/2 = 23.

6. Decision: Do not reject the null hypothesis H0 and conclude that there is no significant difference in the median nicotine contents of the two brands of cigarettes.

Normal Theory Approximation for Two Samples

When both n1 and n2 exceed 8, the sampling distribution of U1 (or U2) approaches the normal distribution with mean and variance given by

μ_{U1} = n1n2/2 and σ²_{U1} = n1n2(n1 + n2 + 1)/12.

Consequently, when n2 is greater than 20, the maximum value in Table A.17, and n1 is at least 9, we can use the statistic

Z = (U1 − μ_{U1})/σ_{U1}

for our test, with the critical region falling in either or both tails of the standard normal distribution, depending on the form of H1.

The use of the Wilcoxon rank-sum test is not restricted to nonnormal populations. It can be used in place of the two-sample t-test when the populations are normal, although the power will be smaller. The Wilcoxon rank-sum test is always superior to the t-test for decidedly nonnormal populations.
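The u statistics of Example 16.5 correspond to the Mann-Whitney U statistic, which SciPy computes directly. A minimal check, assuming scipy.stats.mannwhitneyu and the convention of recent SciPy versions that the reported statistic is U for the first sample:

```python
from scipy.stats import mannwhitneyu

brand_a = [2.1, 4.0, 6.3, 5.4, 4.8, 3.7, 6.1, 3.3]
brand_b = [4.1, 0.6, 3.1, 2.5, 4.0, 6.2, 1.6, 2.2, 1.9, 5.4]

res = mannwhitneyu(brand_a, brand_b, alternative="two-sided")
u1 = res.statistic                      # U for brand A: u1 = 57
u2 = len(brand_a) * len(brand_b) - u1   # u2 = 80 - 57 = 23
print(u1, u2, min(u1, u2), res.pvalue)  # u = 23 > 17, so do not reject H0
```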
16.4 Kruskal-Wallis Test

In Chapters 13, 14, and 15, the technique of analysis of variance was prominent as an analytical technique for testing equality of k ≥ 2 population means. Again, however, the reader should recall that normality must be assumed in order for the F-test to be theoretically correct. In this section, we investigate a nonparametric alternative to analysis of variance.

The Kruskal-Wallis test, also called the Kruskal-Wallis H test, is a generalization of the rank-sum test to the case of k > 2 samples. It is used to test the null hypothesis H0 that k independent samples are from identical populations. Introduced in 1952 by W. H. Kruskal and W. A. Wallis, the test is a nonparametric procedure for testing the equality of means in the one-factor analysis of variance when the experimenter wishes to avoid the assumption that the samples were selected from normal populations.
Let ni (i = 1, 2, . . . , k) be the number of observations in the ith sample. First, we combine all k samples and arrange the n = n1 + n2 + ··· + nk observations in ascending order, substituting the appropriate rank from 1, 2, . . . , n for each observation. In the case of ties (identical observations), we follow the usual procedure of replacing the observations by the mean of the ranks that the observations would have if they were distinguishable. The sum of the ranks corresponding to the ni observations in the ith sample is denoted by the random variable Ri. Now let us consider the statistic
H = [12/(n(n + 1))] Σ_{i=1}^{k} Ri²/ni − 3(n + 1),
which is approximated very well by a chi-squared distribution with k − 1 degrees of freedom when H0 is true, provided each sample consists of at least 5 observations. The fact that h, the assumed value of H, is large when the independent samples come from populations that are not identical allows us to establish the following decision criterion for testing H0:
To test the null hypothesis H0 that k independent samples are from identical populations, compute

h = [12/(n(n + 1))] Σ_{i=1}^{k} ri²/ni − 3(n + 1),

where ri is the assumed value of Ri, for i = 1, 2, . . . , k. If h falls in the critical region h > χ²α with v = k − 1 degrees of freedom, reject H0 at the α-level of significance; otherwise, fail to reject H0.
Example 16.6: In an experiment to determine which of three different missile systems is preferable, the propellant burning rate is measured. The data, after coding, are given in Table 16.5. Use the Kruskal-Wallis test and a significance level of α = 0.05 to test the hypothesis that the propellant burning rates are the same for the three missile systems.

Table 16.5: Propellant Burning Rates

    Missile System
    1       2       3
 24.0    23.2    18.4
 16.7    19.8    19.1
 22.8    18.1    17.3
 19.8    17.6    17.3
 18.9    20.2    19.7
         17.8    18.9
                 18.8
                 19.3

Solution:
1. H0: μ1 = μ2 = μ3.
2. H1: The three means are not all equal.
3. α = 0.05.
4. Critical region: h > χ²_{0.05} = 5.991, for v = 2 degrees of freedom.
5. Computations: In Table 16.6, we convert the 19 observations to ranks and sum the ranks for each missile system.
Table 16.6: Ranks for Propellant Burning Rates

      Missile System
     1         2         3
    19        18         7
     1      14.5        11
    17         6       2.5
  14.5         4       2.5
   9.5        16        13
               5       9.5
                         8
                        12
r1 = 61.0  r2 = 63.5  r3 = 65.5
Now, substituting n1 = 5, n2 = 6, n3 = 8 and r1 = 61.0, r2 = 63.5, r3 = 65.5, our test statistic H assumes the value

h = [12/((19)(20))] (61.0²/5 + 63.5²/6 + 65.5²/8) − (3)(20) = 1.66.
6. Decision: Since h = 1.66 does not fall in the critical region h > 5.991, we have insufficient evidence to reject the hypothesis that the propellant burning rates are the same for the three missile systems.
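SciPy's kruskal function reproduces this analysis, assuming SciPy is available; note that kruskal applies a correction for ties, so its statistic differs very slightly from the uncorrected h = 1.66 computed above.

```python
from scipy.stats import kruskal

system1 = [24.0, 16.7, 22.8, 19.8, 18.9]
system2 = [23.2, 19.8, 18.1, 17.6, 20.2, 17.8]
system3 = [18.4, 19.1, 17.3, 17.3, 19.7, 18.9, 18.8, 19.3]

h, p = kruskal(system1, system2, system3)
print(f"h = {h:.3f}, P = {p:.3f}")   # h close to 1.66; P well above 0.05
```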

Exercises
16.15 A cigarette manufacturer claims that the tar content of brand B cigarettes is lower than that of brand A cigarettes. To test this claim, the following determinations of tar content, in milligrams, were recorded:

Brand A   1  12   9  13  11  14
Brand B   8  10   7

Use the rank-sum test with α = 0.05 to test whether the claim is valid.

16.16 To find out whether a new serum will arrest leukemia, nine patients, who have all reached an advanced stage of the disease, are selected. Five patients receive the treatment and four do not. The survival times, in years, from the time the experiment commenced are

Treatment      2.1  5.3  1.4  4.6  0.9
No treatment   1.9  0.5  2.8  3.1

Use the rank-sum test, at the 0.05 level of significance, to determine if the serum is effective.

16.17 The following data represent the number of hours that two different types of scientific pocket calculators operate before a recharge is required.

Calculator A   5.5  5.6  6.3  4.6  5.3  5.0  6.2  5.8  5.1
Calculator B   3.8  4.8  4.3  4.2  4.0  4.9  4.5  5.2  4.5

Use the rank-sum test with α = 0.01 to determine if calculator A operates longer than calculator B on a full battery charge.

16.18 A fishing line is being manufactured by two processes. To determine if there is a difference in the mean breaking strength of the lines, 10 pieces manufactured by each process are selected and then tested for breaking strength. The results are as follows:

Process 1   10.4   9.8  11.5   9.6  10.9  11.8  10.0   9.9   9.3  10.7
Process 2    8.7  11.2   9.8  10.1  10.8   9.5  11.0   9.8  10.5   9.9

Use the rank-sum test with α = 0.1 to determine if there is a difference between the mean breaking strengths of the lines manufactured by the two processes.

16.19 From a mathematics class of 12 equally capable students using programmed materials, 5 students are selected at random and given additional instruction by the teacher. The results on the final examination are as follows:

                                    Grade
Additional Instruction      87  69  78  91  80
No Additional Instruction   75  88  64  82  93  79  67

Use the rank-sum test with α = 0.05 to determine if the additional instruction affects the average grade.

16.20 The following data represent the weights, in kilograms, of personal luggage carried on various flights by a member of a baseball team and a member of a basketball team.

             Luggage Weight (kilograms)
Baseball player     16.3  20.0  18.6  18.1  15.0  15.4  15.9
                    18.6  15.6  14.1  14.5  18.3  17.7  19.1
                    17.4  16.3  13.6  14.8  13.2  17.2  16.5
Basketball player   15.4  16.3  17.7  18.1  18.6  16.8
                    12.7  14.1  15.0  13.6  15.9  16.3

Use the rank-sum test, at the 0.05 level of significance, to test the null hypothesis that the two athletes carry the same amount of luggage on the average against the alternative hypothesis that the average weights of luggage for the two athletes are different.

16.21 The following data represent the operating times in hours for three types of scientific pocket calculators before a recharge is required:

    Calculator
   A      B      C
  4.9    5.5    6.4
  6.1    5.4    6.8
  4.3    6.2    5.6
  4.6    5.8    6.5
  5.2    5.5    6.3
         5.2    6.6
         4.8

Use the Kruskal-Wallis test, at the 0.01 level of significance, to test the hypothesis that the operating times for all three calculators are equal.

16.22 In Exercise 13.6 on page 519, use the Kruskal-Wallis test at the 0.05 level of significance to determine if the organic chemical solvents differ significantly in sorption rate.

16.5 Runs Test
In applying the many statistical concepts discussed throughout this book, it was always assumed that the sample data had been collected by some randomization procedure. The runs test, based on the order in which the sample observations are obtained, is a useful technique for testing the null hypothesis H0 that the observations have indeed been drawn at random.
To illustrate the runs test, let us suppose that 12 people are polled to find out if they use a certain product. We would seriously question the assumed randomness of the sample if all 12 people were of the same sex. We shall designate a male and a female by the symbols M and F, respectively, and record the outcomes according to their sex in the order in which they occur. A typical sequence for the experiment might be
M M | F F F | M | F F | M M M M,

where we have grouped subsequences of identical symbols. Such groupings are called runs.

Definition 16.1: A run is a subsequence of one or more identical symbols representing a common property of the data.
Regardless of whether the sample measurements represent qualitative or quantitative data, the runs test divides the data into two mutually exclusive categories: male or female; defective or nondefective; heads or tails; above or below the median; and so forth. Consequently, a sequence will always be limited to two distinct symbols. Let n1 be the number of symbols associated with the category that occurs the least and n2 be the number of symbols that belong to the other category. Then the sample size n = n1 + n2.
For the n = 12 symbols in our poll, we have five runs, with the first containing two M’s, the second containing three F’s, and so on. If the number of runs is larger or smaller than what we would expect by chance, the hypothesis that the sample was drawn at random should be rejected. Certainly, a sample resulting in only two runs,
MMMMMMMFFFFF
or the reverse, is most unlikely to occur from a random selection process. Such a result indicates that the first 7 people interviewed were all males, followed by 5 females. Likewise, if the sample resulted in the maximum number of 12 runs, as in the alternating sequence
M F M F M F M F M F M F,
we would again be suspicious of the order in which the individuals were selected for the poll.
The runs test for randomness is based on the random variable V , the total number of runs that occur in the complete sequence of the experiment. In Table A.18, values of P(V ≤ v∗ when H0 is true) are given for v∗ = 2,3,…,20 runs and
values of n1 and n2 less than or equal to 10. The P-values for both one-tailed and two-tailed tests can be obtained using these tabled values.
For the poll taken previously, we exhibit a total of 5 F’s and 7 M’s. Hence, with n1 = 5, n2 = 7, and v = 5, we note from Table A.18 that the P-value for a two-tailed test is

P = 2P(V ≤ 5 when H0 is true) = 0.394 > 0.05.
That is, the value v = 5 is reasonable at the 0.05 level of significance when H0 is true, and therefore we have insufficient evidence to reject the hypothesis of randomness in our sample.
When the number of runs is large (for example, if v = 11 while n1 = 5 and n2 = 7), the P-value for a two-tailed test is
P = 2P(V ≥ 11 when H0 is true) = 2[1 − P(V ≤ 10 when H0 is true)] = 2(1 − 0.992) = 0.016 < 0.05,

which leads us to reject the hypothesis that the sample values occurred at random.

The runs test can also be used to detect departures from randomness of a sequence of quantitative measurements over time, caused by trends or periodicities. Replacing each measurement, in the order in which it was collected, by a plus symbol if it falls above the median or by a minus symbol if it falls below the median and omitting all measurements that are exactly equal to the median, we generate a sequence of plus and minus symbols that is tested for randomness as illustrated in the following example.

Example 16.7: A machine dispenses acrylic paint thinner into containers. Would you say that the amount of paint thinner being dispensed by this machine varies randomly if the contents of the next 15 containers are measured and found to be 3.6, 3.9, 4.1, 3.6, 3.8, 3.7, 3.4, 4.0, 3.8, 4.1, 3.9, 4.0, 3.8, 4.2, and 4.1 liters? Use a 0.1 level of significance.

Solution:
1. H0: Sequence is random.
2. H1: Sequence is not random.
3. α = 0.1.
4. Test statistic: V, the total number of runs.
5. Computations: For the given sample, we find x̃ = 3.9. Replacing each measurement by the symbol “+” if it falls above 3.9 or by the symbol “−” if it falls below 3.9 and omitting the two measurements that equal 3.9, we obtain the sequence

− + − − − − + − + + − + +

for which n1 = 6, n2 = 7, and v = 8. Therefore, from Table A.18, the computed P-value is

P = 2P(V ≥ 8 when H0 is true) = 2[1 − P(V ≤ 7 when H0 is true)] = 2(0.5) = 1.

6. Decision: Do not reject the hypothesis that the sequence of measurements varies randomly.

The runs test, although less powerful, can also be used as an alternative to the Wilcoxon two-sample test to test the claim that two random samples come from populations having the same distributions and therefore equal means. If the populations are symmetric, rejection of the claim of equal distributions is equivalent to accepting the alternative hypothesis that the means are not equal. In performing the test, we first combine the observations from both samples and arrange them in ascending order. Now assign the letter A to each observation taken from one of the populations and the letter B to each observation from the other population, thereby generating a sequence consisting of the symbols A and B. If observations from one population are tied with observations from the other population, the sequence of A and B symbols generated will not be unique and consequently the number of runs is unlikely to be unique. Procedures for breaking ties usually result in additional tedious computations, and for this reason we might prefer to apply the Wilcoxon rank-sum test whenever these situations occur.

To illustrate the use of runs in testing for equal means, consider the survival times of the leukemia patients of Exercise 16.16 on page 670, for which we have

0.5  0.9  1.4  1.9  2.1  2.8  3.1  4.6  5.3
 B    A    A    B    A    B    B    A    A

resulting in v = 6 runs. If the two symmetric populations have equal means, the observations from the two samples will be intermingled, resulting in many runs. However, if the population means are significantly different, we would expect most of the observations for one of the two samples to be smaller than those for the other sample. In the extreme case where the populations do not overlap, we would obtain a sequence of the form

A A A A A B B B B   or   B B B B A A A A A

and in either case there would be only two runs.
Consequently, the hypothesis of equal population means will be rejected at the α-level of significance only when v is small enough so that P = P(V ≤ v when H0 is true) ≤ α, implying a one-tailed test. Returning to the data of Exercise 16.16 on page 670, for which n1 = 4, n2 = 5, and v = 6, we find from Table A.18 that

P = P(V ≤ 6 when H0 is true) = 0.786 > 0.05
and therefore fail to reject the null hypothesis of equal means. Hence, we conclude that the new serum does not prolong life by arresting leukemia.
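The run count for a merged, ordered sequence like the one above is easy to reproduce numerically. Below is a minimal Python sketch (our illustration, not from the text) that recovers v = 6, n1 = 4, and n2 = 5 for the leukemia sequence:

```python
# Count the runs in the combined A/B sequence of Exercise 16.16,
# with the observations already arranged in ascending order.
labels = "BAABABBAA"

v = 1 + sum(labels[i] != labels[i - 1] for i in range(1, len(labels)))  # total runs
n1 = min(labels.count("A"), labels.count("B"))  # size of the rarer category
n2 = max(labels.count("A"), labels.count("B"))
print(v, n1, n2)  # 6 4 5; Table A.18 then gives P(V <= 6 when H0 is true) = 0.786
```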
When n1 and n2 increase in size, the sampling distribution of V approaches the normal distribution with mean and variance given by
μV = 2n1n2/(n1 + n2) + 1   and   σ²V = 2n1n2(2n1n2 − n1 − n2) / [(n1 + n2)²(n1 + n2 − 1)].
Consequently, when n1 and n2 are both greater than 10, we can use the statistic
Z = (V − μV)/σV
to establish the critical region for the runs test.
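This normal approximation is simple to apply in code. The following Python sketch uses the mean and variance formulas just given; the counts in the example call are illustrative values, not from the text:

```python
import math

def runs_z(v, n1, n2):
    """Standardized large-sample runs statistic Z = (V - mu_V) / sigma_V."""
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    return (v - mu) / math.sqrt(var)

# Hypothetical counts: 18 runs observed with n1 = 14 and n2 = 16.
# Reject randomness at alpha = 0.05 when |z| > 1.96.
print(round(runs_z(18, 14, 16), 3))
```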
16.6 Tolerance Limits
Tolerance limits for a normal distribution of measurements were discussed in Chap- ter 9. In this section, we consider a method for constructing tolerance intervals that is independent of the shape of the underlying distribution. As we might sus- pect, for a reasonable degree of confidence they will be substantially longer than those constructed assuming normality, and the sample size required is generally very large. Nonparametric tolerance limits are stated in terms of the smallest and largest observations in our sample.
Two-Sided Tolerance Limits: For any distribution of measurements, two-sided tolerance limits are indicated by the smallest and largest observations in a sample of size n, where n is determined so that one can assert with 100(1 − γ)% confidence that at least the proportion 1 − α of the distribution is included between the sample extremes.
Table A.19 gives required sample sizes for selected values of γ and 1 − α. For example, when γ = 0.01 and 1 − α = 0.95, we must choose a random sample of size n = 130 in order to be 99% confident that at least 95% of the distribution of measurements is included between the sample extremes.
Instead of determining the sample size n such that a specified proportion of measurements is contained between the sample extremes, it is desirable in many industrial processes to determine the sample size such that a fixed proportion of the population falls below the largest (or above the smallest) observation in the sample. Such limits are called one-sided tolerance limits.
One-Sided Tolerance Limits: For any distribution of measurements, a one-sided tolerance limit is determined by the smallest (largest) observation in a sample of size n, where n is determined so that one can assert with 100(1 − γ)% confidence that at least the proportion 1 − α of the distribution will exceed the smallest (be less than the largest) observation in the sample.
Table A.20 shows required sample sizes corresponding to selected values of γ and 1−α. Hence, when γ = 0.05 and 1−α = 0.70, we must choose a sample of size n = 9 in order to be 95% confident that 70% of our distribution of measurements will exceed the smallest observation in the sample.
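The entries in Tables A.19 and A.20 can be reproduced with a short search, using the standard distribution-free coverage results for the sample extremes (a sketch under that assumption, not the book's own algorithm):

```python
def n_two_sided(gamma, p):
    """Smallest n giving confidence >= 1 - gamma that at least the proportion p
    of the distribution lies between the sample minimum and maximum, assuming
    the order-statistic identity: confidence = 1 - n*p**(n-1) + (n-1)*p**n."""
    n = 2
    while n * p ** (n - 1) - (n - 1) * p ** n > gamma:
        n += 1
    return n

def n_one_sided(gamma, p):
    """One-sided analogue, assuming confidence = 1 - p**n."""
    n = 1
    while p ** n > gamma:
        n += 1
    return n

print(n_two_sided(0.01, 0.95))  # 130, matching the gamma = 0.01, 1 - alpha = 0.95 example
print(n_one_sided(0.05, 0.70))  # 9, matching the gamma = 0.05, 1 - alpha = 0.70 example
```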
16.7 Rank Correlation Coefficient
In Chapter 11, we used the sample correlation coefficient r to measure the pop- ulation correlation coefficient ρ, the linear relationship between two continuous variables X and Y. If ranks 1,2,…,n are assigned to the x observations in or- der of magnitude and similarly to the y observations, and if these ranks are then substituted for the actual numerical values in the formula for the correlation coef- ficient in Chapter 11, we obtain the nonparametric counterpart of the conventional correlation coefficient. A correlation coefficient calculated in this manner is known as the Spearman rank correlation coefficient and is denoted by rs. When there are no ties among either set of measurements, the formula for rs reduces to a much simpler expression involving the differences di between the ranks assigned to the n pairs of x’s and y’s, which we now state.
Rank Correlation Coefficient: A nonparametric measure of association between two variables X and Y is given by the rank correlation coefficient

rs = 1 − [6 ∑ᵢ₌₁ⁿ di²] / [n(n² − 1)],

where di is the difference between the ranks assigned to xi and yi and n is the number of pairs of data.
In practice, the preceding formula is also used when there are ties among ei- ther the x or y observations. The ranks for tied observations are assigned as in the signed-rank test by averaging the ranks that would have been assigned if the observations were distinguishable.
The value of rs will usually be close to the value obtained by finding r based on numerical measurements and is interpreted in much the same way. As before, the value of rs will range from −1 to +1. A value of +1 or −1 indicates perfect association between X and Y , the plus sign occurring for identical rankings and the minus sign occurring for reverse rankings. When rs is close to zero, we conclude that the variables are uncorrelated.
Example 16.8: The figures listed in Table 16.7, released by the Federal Trade Commission, show the milligrams of tar and nicotine found in 10 brands of cigarettes. Calculate the rank correlation coefficient to measure the degree of relationship between tar and nicotine content in cigarettes.
Table 16.7: Tar and Nicotine Contents

Cigarette Brand    Tar Content    Nicotine Content
Viceroy                 14              0.9
Marlboro                17              1.1
Chesterfield            28              1.6
Kool                    17              1.3
Kent                    16              1.0
Raleigh                 13              0.8
Old Gold                24              1.5
Philip Morris           25              1.4
Oasis                   18              1.2
Players                 31              2.0
Solution: Let X and Y represent the tar and nicotine contents, respectively. First we assign ranks to each set of measurements, with the rank of 1 assigned to the lowest number in each set, the rank of 2 to the second lowest number in each set, and so forth, until the rank of 10 is assigned to the largest number. Table 16.8 shows the individual rankings of the measurements and the differences in ranks for the 10 pairs of observations.
Table 16.8: Rankings for Tar and Nicotine Content

Cigarette Brand     xi      yi      di
Viceroy            2.0     2.0     0.0
Marlboro           4.5     4.0     0.5
Chesterfield       9.0     9.0     0.0
Kool               4.5     6.0    −1.5
Kent               3.0     3.0     0.0
Raleigh            1.0     1.0     0.0
Old Gold           7.0     8.0    −1.0
Philip Morris      8.0     7.0     1.0
Oasis              6.0     5.0     1.0
Players           10.0    10.0     0.0
Substituting into the formula for rs, we find that

rs = 1 − (6)(5.50) / [(10)(100 − 1)] = 0.967,
indicating a high positive correlation between the amounts of tar and nicotine found in cigarettes.
Some advantages to using rs rather than r do exist. For instance, we no longer assume the underlying relationship between X and Y to be linear and therefore, when the data possess a distinct curvilinear relationship, the rank correlation co- efficient will likely be more reliable than the conventional measure. A second ad- vantage to using the rank correlation coefficient is the fact that no assumptions of normality are made concerning the distributions of X and Y . Perhaps the greatest advantage occurs when we are unable to make meaningful numerical measurements but nevertheless can establish rankings. Such is the case, for example, when dif- ferent judges rank a group of individuals according to some attribute. The rank correlation coefficient can be used in this situation as a measure of the consistency of the two judges.
To test the hypothesis that ρ = 0 by using a rank correlation coefficient, one needs to consider the sampling distribution of the rs-values under the assumption of no correlation. Critical values for α = 0.05,0.025,0.01, and 0.005 have been calculated and appear in Table A.21. The setup of this table is similar to that of the table of critical values for the t-distribution except for the left column, which now gives the number of pairs of observations rather than the degrees of freedom. Since the distribution of the rs-values is symmetric about zero when ρ = 0, the rs-value that leaves an area of α to the left is equal to the negative of the rs-value that leaves an area of α to the right. For a two-sided alternative hypothesis, the critical region of size α falls equally in the two tails of the distribution. For a test in which the alternative hypothesis is negative, the critical region is entirely in the left tail of the distribution, and when the alternative is positive, the critical region is placed entirely in the right tail.
Example 16.9: Refer to Example 16.8 and test the hypothesis that the correlation between the amounts of tar and nicotine found in cigarettes is zero against the alternative that it is greater than zero. Use a 0.01 level of significance.

Solution:
1. H0: ρ=0.
2. H1: ρ > 0.
3. α = 0.01.
4. Critical region: rs > 0.745 from Table A.21.
5. Computations: From Example 16.8, rs = 0.967.
6. Decision: Reject H0 and conclude that there is a significant correlation be- tween the amounts of tar and nicotine found in cigarettes.
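As a numerical cross-check of Examples 16.8 and 16.9, SciPy's implementation of the Spearman rank correlation (scipy.stats.spearmanr, which averages tied ranks as described above) reproduces the same value; a short sketch:

```python
from scipy.stats import spearmanr

tar      = [14, 17, 28, 17, 16, 13, 24, 25, 18, 31]
nicotine = [0.9, 1.1, 1.6, 1.3, 1.0, 0.8, 1.5, 1.4, 1.2, 2.0]

rs, _ = spearmanr(tar, nicotine)   # rank correlation of the two lists
print(round(rs, 3))                # 0.967, as in Example 16.8
print(rs > 0.745)                  # True: reject H0 at alpha = 0.01 (Table A.21)
```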
Under the assumption of no correlation, it can be shown that the distribution of the rs-values approaches a normal distribution with a mean of 0 and a standard deviation of 1/√(n − 1) as n increases. Consequently, when n exceeds the values given in Table A.21, one can test for a significant correlation by computing
z = (rs − 0) / (1/√(n − 1)) = rs√(n − 1)

and comparing with critical values of the standard normal distribution shown in Table A.3.

Exercises
16.23 A random sample of 15 adults living in a small town was selected to estimate the proportion of voters favoring a certain candidate for mayor. Each individual was also asked if he or she was a college graduate. By letting Y and N designate the responses of “yes” and “no” to the education question, the following sequence was obtained:
NNNNYYNYYNYNNNN
Use the runs test at the 0.1 level of significance to de- termine if the sequence supports the contention that the sample was selected at random.
16.24 A silver-plating process is used to coat a cer- tain type of serving tray. When the process is in con- trol, the thickness of the silver on the trays will vary randomly following a normal distribution with a mean of 0.02 millimeter and a standard deviation of 0.005 millimeter. Suppose that the next 12 trays examined show the following thicknesses of silver: 0.019, 0.021, 0.020, 0.019, 0.020, 0.018, 0.023, 0.021, 0.024, 0.022, 0.023, 0.022. Use the runs test to determine if the fluctuations in thickness from one tray to another are random. Let α = 0.05.
16.25 Use the runs test to test, at level 0.01, whether there is a difference in the average operating time for the two calculators of Exercise 16.17 on page 670.
16.26 In an industrial production line, items are in- spected periodically for defectives. The following is a sequence of defective items, D, and nondefective items, N, produced by this production line:
DDNNNDNNDDNNNN NDDDNNDNNNNDND
Use the large-sample theory for the runs test, with a significance level of 0.05, to determine whether the de- fectives are occurring at random.
16.27 Assuming that the measurements of Exercise 1.14 on page 30 were recorded successively from left to right as they were collected, use the runs test, with α = 0.05, to test the hypothesis that the data represent a random sequence.
16.28 How large a sample is required to be 95% con- fident that at least 85% of the distribution of measure- ments is included between the sample extremes?
16.29 What is the probability that the range of a random sample of size 24 includes at least 90% of the population?
16.30 How large a sample is required to be 99% con- fident that at least 80% of the population will be less than the largest observation in the sample?
16.31 What is the probability that at least 95% of a population will exceed the smallest value in a random sample of size n = 135?
16.32 The following table gives the recorded grades for 10 students on a midterm test and the final examination in a calculus course:

Student    Midterm Test    Final Examination
L.S.A.          84                 73
W.P.B.          98                 63
R.W.K.          91                 87
J.R.L.          72                 66
J.K.L.          86                 78
D.L.P.          93                 78
B.L.P.          80                 91
D.W.M.           0                  0
M.N.M.          92                 88
R.H.S.          87                 77

(a) Calculate the rank correlation coefficient.
(b) Test the null hypothesis that ρ = 0 against the alternative that ρ > 0. Use α = 0.025.

16.33 With reference to the data of Exercise 11.1 on page 398,
(a) calculate the rank correlation coefficient;
(b) test the null hypothesis, at the 0.05 level of significance, that ρ = 0 against the alternative that ρ ≠ 0. Compare your results with those obtained in Exercise 11.44 on page 435.

16.34 Calculate the rank correlation coefficient for the daily rainfall and amount of particulate removed in Exercise 11.13 on page 400.

16.35 With reference to the weights and chest sizes of infants in Exercise 11.47 on page 436,
(a) calculate the rank correlation coefficient;
(b) test the hypothesis, at the 0.025 level of significance, that ρ = 0 against the alternative that ρ > 0.

16.36 A consumer panel tests nine brands of microwave ovens for overall quality. The ranks assigned by the panel and the suggested retail prices are as follows:

Manufacturer    Panel Rating    Suggested Price
A                    6              $480
B                    9               395
C                    2               575
D                    8               550
E                    5               510
F                    1               545
G                    7               400
H                    4               465
I                    3               420

Is there a significant relationship between the quality and the price of a microwave oven? Use a 0.05 level of significance.

16.37 Two judges at a college homecoming parade rank eight floats in the following order:

Float      1  2  3  4  5  6  7  8
Judge A    5  8  4  3  6  2  7  1
Judge B    7  5  4  2  8  1  6  3

(a) Calculate the rank correlation coefficient.
(b) Test the null hypothesis that ρ = 0 against the alternative that ρ > 0. Use α = 0.05.
16.38 In the article called “Risky Assumptions” by Paul Slovic, Baruch Fischoff, and Sarah Lichtenstein, published in Psychology Today (June 1980), the risk of dying in the United States from 30 activities and tech- nologies is ranked by members of the League of Women Voters and also by experts who are professionally in- volved in assessing risks. The rankings are as shown in Table 16.9.
(a) Calculate the rank correlation coefficient.
(b) Test the null hypothesis of zero correlation between the rankings of the League of Women Voters and the experts against the alternative that the corre- lation is not zero. Use a 0.05 level of significance.
Table 16.9: The Ranking Data for Exercise 16.38

Activity or Technology    Voters  Experts    Activity or Technology    Voters  Experts
Nuclear power                1      20       Motor vehicles               2       1
Handguns                     3       4       Smoking                      4       2
Motorcycles                  5       6       Alcoholic beverages          6       3
Private aviation             7      12       Police work                  8      17
Pesticides                   9       8       Surgery                     10       5
Fire fighting               11      18       Large construction          12      13
Hunting                     13      23       Spray cans                  14      26
Mountain climbing           15      29       Bicycles                    16      15
Commercial aviation         17      16       Electric power              18       9
Swimming                    19      10       Contraceptives              20      11
Skiing                      21      30       X-rays                      22       7
Football                    23      27       Railroads                   24      19
Food preservatives          25      14       Food coloring               26      21
Power mowers                27      28       Antibiotics                 28      24
Home appliances             29      22       Vaccinations                30      25

Review Exercises

16.39 A study by a chemical company compared the drainage properties of two different polymers. Ten different sludges were used, and both polymers were allowed to drain in each sludge. The free drainage was measured in mL/min.

Sludge Type    Polymer A    Polymer B
     1            12.7         12.0
     2            14.6         15.0
     3            18.6         19.2
     4            17.5         17.3
     5            11.8         12.2
     6            16.9         16.6
     7            19.9         20.1
     8            17.6         17.6
     9            15.6         16.0
    10            16.0         16.1

(a) Use the sign test at the 0.05 level to test the null hypothesis that polymer A has the same median drainage as polymer B.
(b) Use the signed-rank test to test the hypotheses of part (a).

16.40 In Review Exercise 13.45 on page 555, use the Kruskal-Wallis test, at the 0.05 level of significance, to determine if the chemical analyses performed by the four laboratories give, on average, the same results.

16.41 Use the data from Exercise 13.14 on page 530 to see if the median amount of nitrogen lost in perspiration is different for the three levels of dietary protein.
Chapter 17
Statistical Quality Control
17.1 Introduction
The notion of using sampling and statistical analysis techniques in a production setting had its beginning in the 1920s. The objective of this highly successful concept is the systematic reduction of variability and the accompanying isolation of sources of difficulties during production. In 1924, Walter A. Shewhart of the Bell Telephone Laboratories developed the concept of a control chart. However, it was not until World War II that the use of control charts became widespread. This was due to the importance of maintaining quality in production processes during that period. In the 1950s and 1960s, the development of quality control and the general area of quality assurance grew rapidly, particularly with the emergence of the space program in the United States. There has been widespread and successful use of quality control in Japan thanks to the efforts of W. Edwards Deming, who served as a consultant in Japan following World War II. Quality control has been, and is, an important ingredient in the development of Japan’s industry and economy.
Quality control is receiving increasing attention as a management tool in which important characteristics of a product are observed, assessed, and compared with some type of standard. The various procedures in quality control involve consider- able use of sampling procedures and statistical principles that have been presented in previous chapters. The primary users of quality control are, of course, indus- trial corporations. It has become clear that an effective quality control program enhances the quality of the product being produced and increases profits. This is particularly true today since products are produced in such high volume. Before the movement toward quality control methods, quality often suffered because of lack of efficiency, which, of course, increases cost.
The Control Chart
The purpose of a control chart is to determine if the performance of a process is maintaining an acceptable level of quality. It is expected, of course, that any process will experience natural variability, that is, variability due to essentially unimportant and uncontrollable sources of variation. On the other hand, a process may experience more serious types of variability in key performance measures.
These sources of variability may arise from one of several types of nonrandom “assignable causes,” such as operator errors or improperly adjusted dials on a machine. A process operating in this state is called out of control. A process experiencing only chance variation is said to be in statistical control. Of course, a successful production process may operate in an in-control state for a long period. It is presumed that during this period, the process is producing an acceptable product. However, there may be either a gradual or a sudden “shift” that requires detection.
A control chart is intended as a device to detect the nonrandom or out-of- control state of a process. Typically, the control chart takes the form indicated in Figure 17.1. It is important that the shift be detected quickly so that the problem can be corrected. Obviously, if detection is slow, many defective or nonconforming items are produced, resulting in considerable waste and increased cost.
Some type of quality characteristic must be under consideration, and units of the process must be sampled over time. Say, for example, the characteristic is the circumference of an engine bearing. The centerline represents the average value of the characteristic when the process is in control. The points depicted in the figure represent results of, say, sample averages of this characteristic, with the samples taken over time. The upper control limit and the lower control limit are chosen in such a way that one would expect all sample points to be covered by these boundaries if the process is in control. As a result, the general complexion of the plotted points over time determines whether or not the process is concluded to be in control. The “in control” evidence is produced by a random pattern of points, with all plotted values being inside the control limits. When a point falls outside the control limits, this is taken to be evidence of a process that is out of control, and a search for the assignable cause is suggested. In addition, a nonrandom pattern of points may be considered suspicious and certainly an indication that an investigation for the appropriate corrective action is needed.
Figure 17.1: Typical control chart.
17.2 Nature of the Control Limits
The fundamental ideas on which control charts are based are similar in structure to those of hypothesis testing. Control limits are established to control the probability of making the error of concluding that the process is out of control when in fact it is not. This corresponds to the probability of making a type I error if we were testing the null hypothesis that the process is in control. On the other hand, we must be attentive to an error of the second kind, namely, not finding the process out of control when in fact it is (type II error). Thus, the choice of control limits is similar to the choice of a critical region.
As in the case of hypothesis testing, the sample size at each point is important. The choice of sample size depends to a large extent on the sensitivity or power of detection of the out-of-control state. In this application, the notion of power is very similar to that of the hypothesis-testing situation. Clearly, the larger the sample at each time period, the quicker the detection of an out-of-control process. In a sense, the control limits actually define what the user considers as being in control. In other words, the latitude given by the control limits must depend in some sense on the process variability. As a result, the computation of the control limits will naturally depend on data taken from the process results. Thus, any quality control application must have its beginning with computation from a preliminary sample or set of samples which will establish both the centerline and the quality control limits.
17.3 Purposes of the Control Chart
One obvious purpose of the control chart is mere surveillance of the process, that is, to determine if changes need to be made. In addition, the constant systematic gathering of data often allows management to assess process capability. Clearly, if a single performance characteristic is important, continual sampling and estimation of the mean and standard deviation of that performance characteristic provide an update on what the process can do in terms of mean performance and random variation. This is valuable even if the process stays in control for long periods. The systematic and formal structure of the control chart can often prevent overreaction to changes that represent only random fluctuations. Obviously, in many situations, changes brought about by overreaction can create serious problems that are difficult to solve.
Quality characteristics of control charts fall generally into two categories, vari- ables and attributes. As a result, types of control charts often take the same classifications. In the case of the variables type of chart, the characteristic is usu- ally a measurement on a continuum, such as diameter or weight. For the attribute chart, the characteristic reflects whether the individual product conforms (defective or not). Applications for these two distinct situations are obvious.
In the case of the variables chart, control must be exerted on both central ten- dency and variability. A quality control analyst must be concerned about whether there has been a shift in values of the performance characteristic on average. In addition, there will always be a concern about whether some change in process con- ditions results in a decrease in precision (i.e., an increase in variability). Separate
control charts are essential for dealing with these two concepts. Central tendency is controlled by the X̄-chart, where means of relatively small samples are plotted on a control chart. Variability around the mean is controlled by the range in the sample, or the sample standard deviation. In the case of attribute sampling, the proportion defective from a sample is often the quantity plotted on the chart. In the following section, we discuss the development of control charts for the variables type of performance characteristic.

17.4 Control Charts for Variables
Providing an example is a relatively easy way to explain the rudiments of the X̄-chart for variables. Suppose that quality control charts are to be used on a process for manufacturing a certain engine part. Suppose the process mean is μ = 50 mm and the standard deviation is σ = 0.01 mm. Suppose that groups of 5 are sampled every hour and the values of the sample mean X̄ are recorded and plotted on a chart like the one in Figure 17.2. The limits for the X̄-charts are based on the standard deviation of the random variable X̄. We know from material in Chapter 8 that for the average of independent observations in a sample of size n,
σX̄ = σ/√n,
where σ is the standard deviation of an individual observation. The control limits are designed to result in a small probability that a given value of X ̄ is outside the limits given that, indeed, the process is in control (i.e., μ = 50). If we invoke the Central Limit Theorem, we have that under the condition that the process is in control,
X̄ ∼ N(50, 0.01/√5).
As a result, 100(1 − α)% of the X ̄ -values fall inside the limits when the process is in control if we use the limits
LCL = μ − zα/2 σ/√n = 50 − zα/2(0.0045),   UCL = μ + zα/2 σ/√n = 50 + zα/2(0.0045).
Here LCL and UCL stand for lower control limit and upper control limit, respec- tively. Often the X ̄ -charts are based on limits that are referred to as “three-sigma” limits, referring, of course, to zα/2 = 3 and limits that become
μ ± 3σ/√n.
In our illustration, the upper and lower limits become
LCL = 50 − 3(0.0045) = 49.9865, UCL = 50 + 3(0.0045) = 50.0135.
Thus, if we view the structure of the 3σ limits from the point of view of hypothesis testing, for a given sample point, the probability is 0.0026 that the X ̄-value falls outside control limits, given that the process is in control. This is the probability
Figure 17.2: The 3σ control limits for the engine part example.
of the analyst erroneously determining that the process is out of control (see Table A.3).
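A quick Python sketch of this computation follows (our illustration; the text's limits 49.9865 and 50.0135 come from rounding σ/√n to 0.0045, and 0.0026 from table rounding):

```python
from scipy.stats import norm

mu, sigma, n, k = 50.0, 0.01, 5, 3
se = sigma / n ** 0.5                  # sigma / sqrt(n), about 0.00447
lcl, ucl = mu - k * se, mu + k * se
alpha = 2 * norm.sf(k)                 # P(|Z| > 3), the false-alarm probability
print(round(lcl, 4), round(ucl, 4), round(alpha, 4))  # 49.9866 50.0134 0.0027
```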
The example above not only illustrates the X ̄-chart for variables, but also should provide the reader with insight into the nature of control charts in general. The centerline generally reflects the ideal value of an important parameter. Control limits are established from knowledge of the sampling properties of the statistic that estimates the parameter in question. They very often involve a multiple of the standard deviation of the statistic. It has become general practice to use 3σ limits. In the case of the X ̄-chart provided here, the Central Limit Theorem provides the user with a good approximation of the probability of falsely ruling that the process is out of control. In general, though, the user may not be able to rely on the normality of the statistic on the centerline. As a result, the exact probability of “type I error” may not be known. Despite this, it has become fairly standard to use the kσ limits. While use of the 3σ limits is widespread, at times the user may wish to deviate from this approach. A smaller multiple of σ may be appropriate when it is important to quickly detect an out-of-control situation. Because of economic considerations, it may prove costly to allow a process to continue to run out of control for even short periods, while the cost of the search and correction of assignable causes may be relatively small. Clearly, in this case, control limits that are tighter than 3σ limits are appropriate.
Rational Subgroups
The sample values to be used in a quality control effort are divided into subgroups, with a sample representing a subgroup. As we indicated earlier, time order of pro- duction is certainly a natural basis for selection of the subgroups. We may view the quality control effort very simply as (1) sampling, (2) detection of an out-of-control state, and (3) a search for assignable causes that may be occurring over time. The selection of the basis for these sample groups would appear to be straightforward, but the choice of these subgroups of sampling information can have an important effect on the success of the quality control program. These subgroups are often called rational subgroups. Generally, if the analyst is interested in detecting a
shift in location, the subgroups should be chosen so that within-subgroup variabil- ity is small and assignable causes, if they are present, have the greatest chance of being detected. Thus, we want to choose the subgroups in such a way as to maximize the between-subgroup variability. Choosing units in a subgroup that are produced close together in time, for example, is a reasonable approach. On the other hand, control charts are often used to control variability, in which case the performance statistic is variability within the sample. Thus, it is more important to choose the rational subgroups to maximize the within-sample variability. In this case, the observations in the subgroups should behave more like a random sample and the variability within samples needs to be a depiction of the variability of the process.
It is important to note that control charts on variability should be established before the development of charts on center of location (say, X ̄-charts). Any control chart on center of location will certainly depend on variability. For example, we have seen an illustration of the central tendency chart and it depends on σ. In the sections that follow, an estimate of σ from the data will be discussed.
X̄-Chart with Estimated Parameters
In the foregoing, we have illustrated notions of the X ̄-chart that make use of the Central Limit Theorem and employ known values of the process mean and standard deviation. As we indicated earlier, the control limits
LCL = μ − zα/2 σ/√n,   UCL = μ + zα/2 σ/√n
are used, and an X ̄-value falling outside these limits is viewed as evidence that the mean μ has changed and thus the process may be out of control.
In many practical situations, it is unreasonable to assume that we know μ and σ. As a result, estimates must be supplied from data taken when the process is in control. Typically, the estimates are determined during a period in which background information or start-up information is gathered. A basis for rational subgroups is chosen, and data are gathered with samples of size n in each subgroup. The sample sizes are usually small, say 4, 5, or 6, and k samples are taken, with k being at least 20. During this period in which it is assumed that the process is in control, the user establishes estimates of μ and σ on which the control chart is based. The important information gathered during this period includes the sample means in the subgroup, the overall mean, and the sample range in each subgroup. In the following paragraphs, we outline how this information is used to develop the control chart.
A portion of the sample information from these k samples takes the form X̄1, X̄2, . . . , X̄k, where the random variable X̄i is the average of the values in the ith sample. Obviously, the overall average is the random variable

X̿ = (1/k) ∑ᵢ₌₁ᵏ X̄i.

This is the appropriate estimator of the process mean and, as a result, is the centerline in the X̄ control chart. In quality control applications, it is often convenient
to estimate σ from the information related to the ranges in the samples rather than
sample standard deviations. Let us define
Ri = Xmax,i − Xmin,i
as the range for the data in the ith sample. Here Xmax,i and Xmin,i are the largest and smallest observations, respectively, in the sample. The appropriate estimate of σ is a function of the average range

R̄ = (1/k) ∑ᵢ₌₁ᵏ Ri.

An estimate of σ, say σ̂, is obtained by

σ̂ = R̄/d2,
where d2 is a constant depending on the sample size. Values of d2 are shown in Table A.22.
Use of the range in producing an estimate of σ has roots in quality-control-type applications, particularly since the range was so easy to compute, compared to other variability estimates, in the era when efficient computation was still an issue. The assumption of normality of the individual observations is implicit in the X ̄- chart. Of course, the existence of the Central Limit Theorem is certainly helpful in this regard. Under the assumption of normality, we make use of a random variable called the relative range, given by
W = R/σ.
It turns out that the moments of W are simple functions of the sample size n (see the reference to Montgomery, 2000b, in the Bibliography). The expected value of W is often referred to as d2. Thus, by taking the expected value of W above, we have
E(W) = E(R)/σ = d2.
As a result, the rationale for the estimate σ̂ = R̄/d2 is readily understood. It is well known that the range method produces an efficient estimator of σ in relatively small samples. This makes the estimator particularly attractive in quality control applications, since the sample sizes in the subgroups are generally small. Using the range method for estimation of σ results in control charts with the following parameters:

UCL = X̿ + 3R̄/(d2√n),   centerline = X̿,   LCL = X̿ − 3R̄/(d2√n).
Defining the quantity
A2 = 3/(d2√n),
we have that

UCL = X̿ + A2R̄,   LCL = X̿ − A2R̄.
To simplify the structure, the user of X ̄-charts often finds values of A2 tabulated. Values of A2 are given for various sample sizes in Table A.22.
R-Charts to Control Variation
Up to this point, all illustrations and details have dealt with the quality control analysts’ attempts at detection of out-of-control conditions produced by a shift in the mean. The control limits are based on the distribution of the random variable X ̄ and depend on the assumption of normality of the individual observations. It is important for control to be applied to variability as well as center of location. In fact, many experts believe that control of variability of the performance char- acteristic is more important and should be established before center of location is considered. Process variability can be controlled through the use of plots of the sample range. A plot over time of the sample ranges is called an R-chart. The same general structure can be used as in the case of the X ̄-chart, with R ̄ being the centerline and the control limits depending on an estimate of the standard devia- tion of the random variable R. Thus, as in the case of the X ̄-chart, 3σ limits are established where “3σ” implies 3σR. The quantity σR must be estimated from the data just as σX ̄ is estimated.
The estimate of σR, the standard deviation, is also based on the distribution of the relative range
W = R/σ.
The standard deviation of W is a known function of the sample size and is generally denoted by d3. As a result,
σR = σd3.
We can now replace σ by σˆ = R ̄/d2, and thus the estimator of σR is
σ̂R = R̄ d3/d2.
Thus, the quantities that define the R-chart are
UCL = R̄D4,   centerline = R̄,   LCL = R̄D3,

where the constants D4 and D3 (depending only on n) are

D4 = 1 + 3d3/d2   and   D3 = 1 − 3d3/d2.

The constants D4 and D3 are tabulated in Table A.22.
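The relationships among these constants are easy to verify numerically. In the sketch below, the d2 and d3 values are the standard tabled values for subgroups of size 5 (assumed here; consult Table A.22):

```python
n, d2, d3 = 5, 2.326, 0.864   # assumed Table A.22 values for n = 5

A2 = 3 / (d2 * n ** 0.5)          # X-bar chart factor, about 0.577
D4 = 1 + 3 * d3 / d2              # upper R-chart factor, about 2.114
D3 = max(0.0, 1 - 3 * d3 / d2)    # negative values are truncated to 0 in the tables
print(round(A2, 3), round(D4, 3), D3)   # 0.577 2.114 0.0
```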
X̄- and R-Charts for Variables
A process manufacturing missile component parts is being controlled, with the performance characteristic being the tensile strength in pounds per square inch. Samples of size 5 each are taken every hour and 25 samples are reported. The data are shown in Table 17.1.
Table 17.1: Sample Information on Tensile Strength Data

Sample Number       Observations             X̄i       Ri
 1    1515  1518  1512  1498  1511    1510.8    20
 2    1504  1511  1507  1499  1502    1504.6    12
 3    1517  1513  1504  1521  1520    1515.0    17
 4    1497  1503  1510  1508  1502    1504.0    13
 5    1507  1502  1497  1509  1512    1505.4    15
 6    1519  1522  1523  1517  1511    1518.4    12
 7    1498  1497  1507  1511  1508    1504.2    14
 8    1511  1518  1507  1503  1509    1509.6    15
 9    1506  1503  1498  1508  1506    1504.2    10
10    1503  1506  1511  1501  1500    1504.2    11
11    1499  1503  1507  1503  1501    1502.6     8
12    1507  1503  1502  1500  1501    1502.6     7
13    1500  1506  1501  1498  1507    1502.4     9
14    1501  1509  1503  1508  1503    1504.8     8
15    1507  1508  1502  1509  1501    1505.4     8
16    1511  1509  1503  1510  1507    1508.0     8
17    1508  1511  1513  1509  1506    1509.4     7
18    1508  1509  1512  1515  1519    1512.6    11
19    1520  1517  1519  1522  1516    1518.8     6
20    1506  1511  1517  1516  1508    1511.6    11
21    1500  1498  1503  1504  1508    1502.6    10
22    1511  1514  1509  1508  1506    1509.6     8
23    1505  1508  1500  1509  1503    1505.0     9
24    1501  1498  1505  1502  1505    1502.2     7
25    1509  1511  1507  1500  1499    1505.2    12

As we indicated earlier, it is important initially to establish “in control” conditions on variability. The calculated centerline for the R-chart is
R̄ = (1/25) ∑ᵢ₌₁²⁵ Ri = 10.72.
We find from Table A.22 that for n = 5, D3 = 0 and D4 = 2.114. As a result, the control limits for the R-chart are

LCL = R̄D3 = (10.72)(0) = 0,
UCL = R̄D4 = (10.72)(2.114) = 22.6621.

The R-chart is shown in Figure 17.3. None of the plotted ranges fall outside the
control limits. As a result, there is no indication of an out-of-control situation.
Figure 17.3: R-chart for the tensile strength example.
The X̄-chart can now be constructed for the tensile strength readings. The centerline is

X̿ = (1/25) ∑ᵢ₌₁²⁵ X̄i = 1507.328.

For samples of size 5, we find A2 = 0.577 from Table A.22. Thus, the control limits are

UCL = X̿ + A2R̄ = 1507.328 + (0.577)(10.72) = 1513.5134,
LCL = X̿ − A2R̄ = 1507.328 − (0.577)(10.72) = 1501.1426.

The X̄-chart is shown in Figure 17.4. As the reader can observe, three values fall outside the control limits. As a result, the control limits for X̄ should not be used for line quality control.

Figure 17.4: X̄-chart for the tensile strength example.
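These limits can be checked directly from the summary statistics in the text (a sketch using R̄ = 10.72, X̿ = 1507.328, and the n = 5 constants quoted above):

```python
Rbar, Xbarbar = 10.72, 1507.328
A2, D3, D4 = 0.577, 0.0, 2.114    # Table A.22 constants for n = 5

print(round(Rbar * D3, 4), round(Rbar * D4, 4))   # R-chart: 0.0 and 22.6621
print(round(Xbarbar - A2 * Rbar, 4), round(Xbarbar + A2 * Rbar, 4))
# X-bar chart: 1501.1426 and 1513.5134
```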
Further Comments about Control Charts for Variables
A process may appear to be in control and, in fact, may stay in control for a long period. Does this necessarily mean that the process is operating successfully? A process that is operating in control is merely one in which the process mean and variability are stable. Apparently, no serious changes have occurred. “In control” implies that the process remains consistent with natural variability. Quality control charts may be viewed as a method in which the inherent natural variability governs the width of the control limits. There is no implication, however, to what extent an in-control process satisfies predetermined specifications required of the process. Specifications are limits that are established by the consumer. If the current natural
variability of the process is larger than that dictated by the specifications, the process will not produce items that meet specifications with high frequency, even though the process is stable and in control.
We have alluded to the normality assumption on the individual observations in a variables control chart. For the X ̄-chart, if the individual observations are normal, the statistic X ̄ is normal. As a result, the quality control analyst has control over the probability of type I error in this case. If the individual X’s are not normal, X ̄ is approximately normal and thus there is approximate control over the probability of type I error for the case in which σ is known. However, the use of the range method for estimating the standard deviation also depends on the normality assumption. Studies regarding the robustness of the X ̄-chart to departures from normality indicate that for samples of size k ≥ 4 the X ̄ chart results in an α-risk close to that advertised (see the work by Montgomery, 2000b, and Schilling and Nelson, 1976, in the Bibliography). We indicated earlier that the ±kσR approach to the R-chart is a matter of convenience and tradition. Even if the distribution of individual observations is normal, the distribution of R is not normal. In fact, the distribution of R is not even symmetric. The symmetric control limits of ±kσR only give an approximation to the α-risk, and in some cases the approximation is not particularly good.
Choice of Sample Size (Operating Characteristic Function) in the Case of the X ̄-Chart
Scientists and engineers dealing in quality control often refer to factors that affect the design of the control chart. Components that determine the design of the chart include the sample size taken in each subgroup, the width of the control limits, and the frequency of sampling. All of these factors depend to a large extent on economic and practical considerations. Frequency of sampling obviously depends on the cost of sampling and the cost incurred if the process continues out of control for a long period. These same factors affect the width of the “in-control” region. The cost that is associated with investigation and search for assignable causes has an impact
on the width of the region and on frequency of sampling. A considerable amount of attention has been devoted to optimal design of control charts, and extensive details will not be given here. The reader should refer to the work by Montgomery (2000b) cited in the Bibliography for an excellent historical account of much of this research.
Choice of sample size and frequency of sampling involves balancing available resources allocated to these two efforts. In many cases, the analyst may need to make changes in the strategy until the proper balance is achieved. The analyst should always be aware that if the cost of producing nonconforming items is great, a high sampling frequency with relatively small sample size is a proper strategy.
Many factors must be taken into consideration in the choice of a sample size. In the illustrations and discussion, we have emphasized the use of n = 4, 5, or 6. These values are considered relatively small for general problems in statistical inference but perhaps proper sample sizes for quality control. One justification, of course, is that quality control is a continuing process and the results produced by one sample or set of units will be followed by results from many more. Thus, the “effective” sample size of the entire quality control effort is many times larger than that used in a subgroup. It is generally considered to be more effective to sample frequently with a small sample size.
The analyst can make use of the notion of the power of a test to gain some insight into the effectiveness of the sample size chosen. This is particularly impor- tant since small sample sizes are usually used in each subgroup. Refer to Chapters 10 and 13 for a discussion of the power of formal tests on means and the analysis of variance. Although formal tests of hypotheses are not actually being conducted in quality control, one can treat the sampling information as if the strategy at each subgroup were to test a hypothesis, either on the population mean μ or on the standard deviation σ. Of interest is the probability of detection of an out-of-control condition for a given sample and, perhaps more important, the expected number of runs required for detection. The probability of detection of a specified out-of- control condition corresponds to the power of a test. It is not our intention to show development of the power for all of the types of control charts presented here, but rather to show the development for the X ̄-chart and present power results for the R-chart.
Consider the X ̄-chart for σ known. Suppose that the in-control state has μ = μ0. A study of the role of the subgroup sample size is tantamount to investigating the β-risk, that is, the probability that an X ̄-value remains inside the control limits given that, indeed, a shift in the mean has occurred. Suppose that the form the shift takes is
μ = μ0 + rσ. Again, making use of the normality of X ̄, we have
β = P (LCL ≤ X ̄ ≤ UCL | μ = μ0 + rσ). For the case of kσ limits,
LCL = μ0 − kσ/√n   and   UCL = μ0 + kσ/√n.

As a result, if we denote by Z the standard normal random variable,

β = P[Z < (μ0 + kσ/√n − (μ0 + rσ))/(σ/√n)] − P[Z < (μ0 − kσ/√n − (μ0 + rσ))/(σ/√n)]
  = P(Z < k − r√n) − P(Z < −k − r√n).

17.5 Control Charts for Attributes

A parallel argument fixes the subgroup size for the p-chart: choosing n so that a shift from the in-control proportion of defectives p to a larger value p1 is detected with probability 0.5 on a single sample requires, for p1 > p, that

P(p̂ ≥ UCL) = P[Z ≥ (UCL − p1)/√(p1(1 − p1)/n)] = 0.5.

Since P(Z > 0) = 0.5, we set

(UCL − p1)/√(p1(1 − p1)/n) = 0.

Substituting

UCL = p + 3√(p(1 − p)/n),

we have

(p − p1) + 3√(p(1 − p)/n) = 0.

We can now solve for n, the size of each sample:

n = 9p(1 − p)/Δ²,

where, of course, Δ is the “shift” in the value of p, and p is the probability of a defective on which the control limits are based. However, if the control charts are based on kσ limits, then

n = (k²/Δ²) p(1 − p).
Example 17.4: Suppose that an attribute quality control chart is being designed with a value of p = 0.01 for the in-control probability of a defective. What is the sample size per subgroup producing a probability of 0.5 that a process shift to p = p1 = 0.05 will be detected? The resulting p-chart will involve 3σ limits.
Solution : Here we have Δ = 0.04. The appropriate sample size is
n = 9(0.01)(0.99)/(0.04)² = 55.69 ≈ 56.
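The subgroup-size formula is a one-liner in code; a sketch (generalized to kσ limits) that reproduces Example 17.4:

```python
import math

def p_chart_n(p, p1, k=3):
    """Subgroup size giving probability 0.5 of detecting a shift from p to p1
    on a single sample, for a p-chart with k-sigma limits."""
    delta = abs(p1 - p)
    return math.ceil(k ** 2 * p * (1 - p) / delta ** 2)

print(p_chart_n(0.01, 0.05))   # 56, as in Example 17.4
```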
Control Charts for Defects (Use of the Poisson Model)
In the preceding development, we assumed that the item under consideration is one that is either defective (i.e., nonfunctional) or not defective. In the latter case, it is functional and thus acceptable to the consumer. In many situations, this “defective or not” approach is too simplistic. Units may contain defects or nonconformities but still function quite well for the consumer. Indeed, in this case, it may be important to exert control on the number of defects or number of nonconformities. This type of quality control effort finds application when the units are either not simplistic or large. For example, the number of defects may be quite useful as the object of control when the single item or unit is, say, a personal computer. Another example is a unit defined by 50 feet of manufactured pipeline, where the number of defective welds is the object of quality control; the number of defects in 50 feet of manufactured carpeting; or the number of “bubbles” in a large manufactured sheet of glass.
It is clear from what we describe here that the binomial distribution is not appropriate. The total number of nonconformities in a unit or the average number per unit can be used as the measure for the control chart. Often it is assumed that the number of nonconformities in a sample of items follows the Poisson distribution. This type of chart is often called a C-chart.
Suppose that the number of defects X in one unit of product follows the Poisson distribution with parameter λ. (Here t = 1 for the Poisson model.) Recall that for the Poisson distribution,
P(X = x) = e^−λ λ^x / x!,   x = 0, 1, 2, . . . .
Here, the random variable X is the number of nonconformities. In Chapter 5, we learned that the mean and variance of the Poisson random variable are both λ. Thus, if the quality control chart were to be structured according to the usual 3σ limits, we could have, for λ known,
UCL = λ + 3√λ,   centerline = λ,   LCL = λ − 3√λ.
As usual, λ often must come from an estimator from the data. An unbiased estimate of λ is the average number of nonconformities per sample. Denote this estimate by λˆ. Thus, the control chart has the limits
UCL = λ̂ + 3√λ̂,   centerline = λ̂,   LCL = λ̂ − 3√λ̂.
Example 17.5: Table 17.4 represents the number of defects in 20 successive samples of sheet metal rolls each 100 feet long. A control chart is to be developed from these preliminary data for the purpose of controlling the number of defects in such samples. The estimate of the Poisson parameter λ is given by λˆ = 5.95. As a result, the control limits suggested by these preliminary data are
UCL = λ̂ + 3√λ̂ = 13.2678   and   LCL = λ̂ − 3√λ̂ = −1.3678,
with LCL being set to zero.
Table 17.4: Data for Example 17.5; Control Involves Number of Defects in Sheet Metal Rolls

Sample Number  Number of Defects    Sample Number  Number of Defects
      1               8                  11               3
      2               7                  12               7
      3               5                  13               5
      4               4                  14               9
      5               4                  15               7
      6               7                  16               7
      7               6                  17               8
      8               4                  18               6
      9               5                  19               7
     10               6                  20               4
                                        Ave.            5.95
Figure 17.9 shows a plot of the preliminary data with the control limits revealed. Table 17.5 shows additional data taken from the production process. For each sample, the unit on which the chart was based, namely 100 feet of the metal, was inspected. The information on 20 samples is included. Figure 17.10 shows a plot of the additional production data. It is clear that the process is in control, at least
through the period for which the data were taken.
Table 17.5: Additional Data from the Production Process of Example 17.5

Sample Number  Number of Defects    Sample Number  Number of Defects
      1               3                  11               7
      2               5                  12               5
      3               8                  13               9
      4               5                  14               4
      5               8                  15               6
      6               4                  16               5
      7               3                  17               3
      8               6                  18               2
      9               5                  19               1
     10               2                  20               6

Figure 17.9: Preliminary data plotted on the control chart for Example 17.5.

Figure 17.10: Additional production data for Example 17.5.
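A sketch of the full C-chart computation for Example 17.5, using the counts from Tables 17.4 and 17.5:

```python
prelim = [8, 7, 5, 4, 4, 7, 6, 4, 5, 6, 3, 7, 5, 9, 7, 7, 8, 6, 7, 4]   # Table 17.4
extra  = [3, 5, 8, 5, 8, 4, 3, 6, 5, 2, 7, 5, 9, 4, 6, 5, 3, 2, 1, 6]   # Table 17.5

lam = sum(prelim) / len(prelim)        # lambda-hat = 5.95
ucl = lam + 3 * lam ** 0.5             # 13.2678
lcl = max(0.0, lam - 3 * lam ** 0.5)   # -1.3678 before truncation to zero
print(round(lam, 2), round(ucl, 4), lcl)
print(all(lcl <= c <= ucl for c in extra))   # True: the new samples stay in control
```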
In Example 17.5, we have made very clear what the sampling or inspection unit is, namely, 100 feet of metal. In many cases where the item is a specific one (e.g., a personal computer or a specific type of electronic device), the inspection unit may be a set of items. For example, the analyst may decide to use 10 computers in each subgroup and observe a count of the total number of defects found. Thus, the preliminary sample for construction of the control chart would involve several samples, each containing 10 computers. The choice of the sample size may depend on many factors. Often, we may want a sample size that will ensure an LCL that is positive.
The analyst may wish to use the average number of defects per sampling unit as the basic measure in the control chart. For example, for the case of the personal
computer, let the random variable U = (total number of defects)/n
be measured for each sample of, say, n = 10. We can use the method of moment- generating functions to show that U is a Poisson random variable (see Review Exercise 17.1) if we assume that the number of defects per sampling unit is Poisson with parameter λ. Thus, the control chart for this situation is characterized by the following:
UCL = Ū + 3√(Ū/n),   centerline = Ū,   LCL = Ū − 3√(Ū/n).
Here, of course, U ̄ is the average of the U-values in the preliminary or base data
set. The term Ū/n is derived from the result that E(U) = λ and Var(U) = λ/n,
and thus U ̄ is an unbiased estimate of E(U) = λ and U ̄/n is an unbiased estimate of Var(U) = λ/n. This type of control chart is often called a U-chart.
In this section, we based our entire development of control charts on the Poisson probability model. This model has been used in combination with the 3σ concept. As we implied earlier in this chapter, the notion of 3σ limits has its roots in the normal approximation, although many users feel that the concept works well as a pragmatic tool even if normality is not even approximately correct. The difficulty, of course, is that in the absence of normality, we cannot control the probability of incorrect specification of an out-of-control state. In the case of the Poisson model, when λ is small the distribution is quite asymmetric, a condition that may produce undesirable results if we hold to the 3σ approach.
17.6 Cusum Control Charts
The disadvantage of the Shewhart-type control charts, developed and illustrated in the preceding sections, lies in their inability to detect small changes in the mean. A quality control mechanism that has received considerable attention in the statistics literature and usage in industry is the cumulative sum (cusum) chart. The method for the cusum chart is simple and its appeal is intuitive. It should become obvious to the reader why it is more responsive to small changes in the mean. Consider a control chart for the mean with a reference level established at value W. Consider particular observations X1,X2,…,Xr. The first r cusums are
S1 = X1 − W
S2 = S1 + (X2 − W)
S3 = S2 + (X3 − W)
  ⋮
Sr = Sr−1 + (Xr − W).
It becomes clear that the cusum is merely the accumulation of differences from the reference level. That is,

Sk = Σ_{i=1}^{k} (Xi − W), k = 1, 2, . . . .

The cusum chart is, then, a plot of Sk against time.
Suppose that we consider the reference level W to be an acceptable value of the
mean μ. Clearly, if there is no shift in μ, the cusum chart should be approximately horizontal, with some minor fluctuations balanced around zero. Now, if there is only a moderate change in the mean, a relatively large change in the slope of the cusum chart should result, since each new observation has a chance of contributing a shift and the measure being plotted is accumulating these shifts. Of course, the signal that the mean has shifted lies in the nature of the slope of the cusum chart. The purpose of the chart is to detect changes that are moving away from the reference level. A nonzero slope (in either direction) represents a change away from the reference level. A positive slope indicates an increase in the mean above the reference level, while a negative slope signals a decrease.
Cusum charts are often devised with a defined acceptable quality level (AQL) and rejectable quality level (RQL) preestablished by the user. Both represent values of the mean. These may be viewed as playing roles somewhat similar to those of the null and alternative mean of hypothesis testing. Consider a situation where the analyst hopes to detect an increase in the value of the process mean. We shall use the notation μ0 for AQL and μ1 for RQL and let μ1 > μ0. The reference level is now set at
W = (μ0 + μ1)/2.

The values of Sr (r = 1, 2, . . .) will have a negative slope if the process mean is at μ0 and a positive slope if the process mean is at μ1.
Decision Rule for Cusum Charts
As indicated earlier, the slope of the cusum chart provides the signal for action by the quality control analyst. The decision rule calls for action if, at the rth sampling period,
dr > h,
where h is a prespecified value called the length of the decision interval and
dr = Sr − min_{1≤i≤r−1} Si.
In other words, action is taken if the data reveal that the current cusum value exceeds by a specified amount the previous smallest cusum value.
A modification in the mechanics described above makes employing the method easier. We have described a procedure that plots the cusums and computes differ- ences. A simple modification involves plotting the differences directly and allows for checking against the decision interval. The general expression for dr is quite simple. For the cusum procedure where we are detecting increases in the mean,
dr =max[0,dr−1 +(Xr −W)].
The choice of the value of h is, of course, very important. We do not attempt in this book to provide the many details in the literature dealing with this choice. The reader is referred to Ewan and Kemp, 1960, and Montgomery, 2008b (see the Bibliography) for a thorough discussion. One important consideration is the expected run length. Ideally, the expected run length is quite large under μ = μ0 and quite small when μ = μ1.
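The rule is easy to implement. Below is a minimal sketch in Python; the data, the reference level W, and the decision interval h are illustrative values of ours, not taken from the text.

    # Sketch of the one-sided cusum decision rule:
    #   d_r = max(0, d_{r-1} + (x_r - W)),  signal when d_r > h.
    def cusum(xs, W, h):
        """Return the d_r sequence and the index of the first signal."""
        d, ds, first = 0.0, [], None
        for i, x in enumerate(xs):
            d = max(0.0, d + (x - W))
            ds.append(round(d, 2))
            if first is None and d > h:
                first = i
        return ds, first

    # Mean shifts from about 10 (= mu0) to about 11 (= mu1) halfway
    # through, so W = (mu0 + mu1)/2 = 10.5.
    data = [10.2, 9.8, 10.1, 9.9, 10.0, 11.2, 10.9, 11.4, 11.1, 11.3]
    print(cusum(data, W=10.5, h=1.5))   # first signal at index 7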
Review Exercises

17.1 Consider X1, X2, . . . , Xn independent Poisson random variables with parameters μ1, μ2, . . . , μn. Use the properties of moment-generating functions to show that the random variable Σ_{i=1}^{n} Xi is a Poisson random variable with mean Σ_{i=1}^{n} μi and variance Σ_{i=1}^{n} μi.

17.2 Consider the following data taken on subgroups of size 5. The data contain 20 averages and ranges on the diameter (in millimeters) of an important component part of an engine. Display X̄- and R-charts. Does the process appear to be in control?

Sample    X̄        R        Sample    X̄        R
  1      2.3972   0.0052      11      2.3887   0.0082
  2      2.4191   0.0117      12      2.4107   0.0032
  3      2.4215   0.0062      13      2.4009   0.0077
  4      2.3917   0.0089      14      2.3992   0.0107
  5      2.4151   0.0095      15      2.3889   0.0025
  6      2.4027   0.0101      16      2.4107   0.0138
  7      2.3921   0.0091      17      2.4109   0.0037
  8      2.4171   0.0059      18      2.3944   0.0052
  9      2.3951   0.0068      19      2.3951   0.0038
 10      2.4215   0.0048      20      2.4015   0.0017

17.3 Suppose for Review Exercise 17.2 that the buyer has set specifications for the part. The specifications require that the diameter fall in the range covered by 2.40000 ± 0.0100 mm. What proportion of units produced by this process will not conform to specifications?

17.4 For the situation of Review Exercise 17.2, give numerical estimates of the mean and standard deviation of the diameter for the part being manufactured in the process.
17.5 Consider the data of Table 17.1. Suppose that additional samples of size 5 are taken and tensile strength recorded. The sampling produces the following results (in pounds per square inch):

Sample    X̄      R
  1      1511   22
  2      1508   14
  3      1522   11
  4      1488   18
  5      1519    6
  6      1524   11
  7      1519    8
  8      1504    7
  9      1500    8
 10      1519   14

(a) Plot the data, using the X̄- and R-charts for the preliminary data of Table 17.1.
(b) Does the process appear to be in control? If not, explain why.

17.6 Consider an in-control process with mean μ = 25 and σ = 1.0. Suppose that subgroups of size 5 are used with control limits μ ± 3σ/√n, and centerline at μ. Suppose that a shift occurs in the mean, and the new mean is μ = 26.5.
(a) What is the average number of samples required (following the shift) to detect the out-of-control situation?
(b) What is the standard deviation of the number of runs required?

17.7 Consider the situation of Example 17.2. The following data are taken on additional samples of size 5. Plot the X̄- and S-values on the X̄- and S-charts that were produced with the data in the preliminary sample. Does the process appear to be in control? Explain why or why not.

Sample    X̄        S
  1      62.280   0.062
  2      62.319   0.049
  3      62.297   0.077
  4      62.318   0.042
  5      62.315   0.038
  6      62.389   0.052
  7      62.401   0.059
  8      62.315   0.042
  9      62.298   0.036
 10      62.337   0.068

17.8 Samples of size 50 are taken every hour from a process producing a certain type of item that is considered either defective or not defective. Twenty samples are taken.
(a) Construct a control chart for control of proportion defective.
(b) Does the process appear to be in control? Explain.

         Number of                  Number of
Sample   Defective Items   Sample   Defective Items
  1      4                   11     2
  2      3                   12     4
  3      5                   13     1
  4      3                   14     2
  5      2                   15     3
  6      2                   16     1
  7      2                   17     1
  8      1                   18     2
  9      4                   19     3
 10      3                   20     1

17.9 For the situation of Review Exercise 17.8, suppose that additional data are collected as follows:

Sample   Number of Defective Items
  1      3
  2      4
  3      2
  4      2
  5      3
  6      1
  7      3
  8      5
  9      7
 10      7

Does the process appear to be in control? Explain.

17.10 A quality control effort is being undertaken for a process where large steel plates are manufactured and surface defects are of concern. The goal is to set up a quality control chart for the number of defects per plate. The data are given below. Set up the appropriate control chart, using this sample information. Does the process appear to be in control?

         Number of            Number of
Sample   Defects     Sample   Defects
  1      4             11     1
  2      2             12     2
  3      1             13     2
  4      3             14     3
  5      0             15     1
  6      4             16     4
  7      5             17     3
  8      3             18     2
  9      2             19     1
 10      2             20     3
Chapter 18
Bayesian Statistics
18.1 Bayesian Concepts
The classical methods of estimation that we have studied in this text are based solely on information provided by the random sample. These methods essentially interpret probabilities as relative frequencies. For example, in arriving at a 95% confidence interval for μ, we interpret the statement
P(−1.96 < Z < 1.96) = 0.95

to mean that 95% of the time in repeated experiments Z will fall between −1.96 and 1.96. Since

Z = (X̄ − μ)/(σ/√n)

for a normal sample with known variance, the probability statement here means that 95% of the random intervals (X̄ − 1.96σ/√n, X̄ + 1.96σ/√n) contain the true mean μ. Another approach to statistical methods of estimation is called Bayesian methodology. The main idea of the method comes from Bayes’ rule, described in Section 2.7. The key difference between the Bayesian approach and the classical or frequentist approach is that in Bayesian concepts, the parameters are viewed as random variables.

Subjective Probability

Subjective probability is the foundation of Bayesian concepts. In Chapter 2, we discussed two possible approaches to probability, namely the relative frequency and the indifference approaches. The first one determines a probability as a consequence of repeated experiments. For instance, to decide the free-throw percentage of a basketball player, we can record the number of shots made and the total number of attempts this player has made. The probability of hitting a free throw for this player can be calculated as the ratio of these two numbers. On the other hand, if we have no knowledge of any bias in a die, the probability that a 3 will appear in the next throw is 1/6. Such an approach to probability interpretation is based on the indifference rule.

However, in many situations, the preceding probability interpretations cannot be applied. For instance, consider the questions “What is the probability that it will rain tomorrow?” “How likely is it that this stock will go up by the end of the month?” and “What is the likelihood that two companies will be merged together?” They can hardly be interpreted by the aforementioned approaches, and the answers to these questions may be different for different people. Yet these questions are constantly asked in daily life, and the approach used to explain these probabilities is called subjective probability, which reflects one’s subjective opinion.

Conditional Perspective

Recall that in Chapters 9 through 17, all statistical inferences were based on the fact that the parameters are unknown but fixed quantities, apart from those in Section 9.14, in which the parameters were treated as variables and the maximum likelihood estimates (MLEs) were calculated conditioning on the observed sample data. In Bayesian statistics, not only are the parameters treated as variables as in MLE calculation, but also they are treated as random. Because the observed data are the only experimental results for the practitioner, statistical inference is based on the actual observed data from a given experiment. Such a view is called a conditional perspective. Furthermore, in Bayesian concepts, since the parameters are treated as random, a probability distribution can be specified, generally by using the subjective probability for the parameter. Such a distribution is called a prior distribution and it usually reflects the experimenter’s prior belief about the parameter. In the Bayesian perspective, once an experiment is conducted and data are observed, all knowledge about the parameter is contained in the actual observed data and in the prior information.
Bayesian Applications

Although Bayes’ rule is credited to Thomas Bayes, Bayesian applications were first introduced by French scientist Pierre Simon Laplace, who published a paper on using Bayesian inference on the unknown binomial proportions (for binomial distribution, see Section 5.2).

Since the introduction of the Markov chain Monte Carlo (MCMC) computational tools for Bayesian analysis in the early 1990s, Bayesian statistics has become more and more popular in statistical modeling and data analysis. Meanwhile, methodology developments using Bayesian concepts have progressed dramatically, and they are applied in fields such as bioinformatics, biology, business, engineering, environmental and ecology science, life science and health, medicine, and many others.

18.2 Bayesian Inferences

Consider the problem of finding a point estimate of the parameter θ for the population with distribution f(x|θ), given θ. Denote by π(θ) the prior distribution of θ. Suppose that a random sample of size n, denoted by x = (x1, x2, . . . , xn), is observed.

Definition 18.1: The distribution of θ, given x, which is called the posterior distribution, is given by

π(θ|x) = f(x|θ)π(θ)/g(x),

where g(x) is the marginal distribution of x.

The marginal distribution of x in the above definition can be calculated using the following formula:

g(x) = Σ_θ f(x|θ)π(θ)              if θ is discrete,
g(x) = ∫_{−∞}^{∞} f(x|θ)π(θ) dθ    if θ is continuous.

Example 18.1: Assume that the prior distribution for the proportion of defectives produced by a machine is

p       0.1   0.2
π(p)    0.6   0.4

Denote by x the number of defectives among a random sample of size 2. Find the posterior probability distribution of p, given that x is observed.

Solution: The random variable X follows a binomial distribution

f(x|p) = b(x; 2, p) = (2 choose x) p^x q^(2−x), x = 0, 1, 2.

The marginal distribution of x can be calculated as

g(x) = f(x|0.1)π(0.1) + f(x|0.2)π(0.2)
     = (2 choose x)[(0.1)^x (0.9)^(2−x) (0.6) + (0.2)^x (0.8)^(2−x) (0.4)].

Hence, for x = 0, 1, 2, we obtain the marginal probabilities as

x       0       1       2
g(x)    0.742   0.236   0.022

The posterior probability of p = 0.1, given x, is

π(0.1|x) = f(x|0.1)π(0.1)/g(x) = (0.1)^x (0.9)^(2−x) (0.6) / [(0.1)^x (0.9)^(2−x) (0.6) + (0.2)^x (0.8)^(2−x) (0.4)],

and π(0.2|x) = 1 − π(0.1|x). Suppose that x = 0 is observed. Then

π(0.1|0) = f(0|0.1)π(0.1)/g(0) = (0.1)^0 (0.9)^2 (0.6)/0.742 = 0.6550,

and π(0.2|0) = 0.3450. If x = 1 is observed, π(0.1|1) = 0.4576, and π(0.2|1) = 0.5424. Finally, π(0.1|2) = 0.2727, and π(0.2|2) = 0.7273.

The prior distribution for Example 18.1 is discrete, although the natural range of p is from 0 to 1. Consider the following example, where we have a prior distribution covering the whole space for p.

Example 18.2: Suppose that the prior distribution of p is uniform (i.e., π(p) = 1, for 0 < p < 1). Use the same random variable X as in Example 18.1 to find the posterior distribution of p.

Solution: As in Example 18.1, we have

f(x|p) = b(x; 2, p) = (2 choose x) p^x q^(2−x), x = 0, 1, 2.

The marginal distribution of x can be calculated as

g(x) = ∫_0^1 f(x|p)π(p) dp = (2 choose x) ∫_0^1 p^x (1 − p)^(2−x) dp.

The integral above can be evaluated at each x directly as g(0) = 1/3, g(1) = 1/3, and g(2) = 1/3. Therefore, the posterior distribution of p, given x, is

π(p|x) = (2 choose x) p^x (1 − p)^(2−x) / (1/3) = 3 (2 choose x) p^x (1 − p)^(2−x), 0 < p < 1.

Example 18.3: Suppose that x = (x1, . . . , xn) is a random sample from a Poisson distribution with parameter λ, and that the prior distribution of λ is exponential with density π(λ) = e^(−λ), λ > 0. Find the posterior distribution of λ.

Solution: Using Definition 18.1, we obtain the posterior distribution of λ as

π(λ|x) ∝ f(x|λ)π(λ) = e^(−nλ) (∏_{i=1}^{n} λ^(xi)/xi!) e^(−λ) ∝ e^(−(n+1)λ) λ^(Σ_{i=1}^{n} xi).
Referring to the gamma distribution in Section 6.6, we conclude that the posterior distribution of λ follows a gamma distribution with parameters 1 + Σ_{i=1}^{n} xi and 1/(n + 1). Hence, we have the posterior mean and variance of λ as

(Σ_{i=1}^{n} xi + 1)/(n + 1)   and   (Σ_{i=1}^{n} xi + 1)/(n + 1)².

So, when x̄ = 3 with n = 10, we have Σ_{i=1}^{10} xi = 30. Hence, the posterior distribution of λ is a gamma distribution with parameters 31 and 1/11.
From Example 18.3 we observe that it is sometimes quite convenient to use the “proportional to” technique in calculating the posterior distribution, especially when the result takes the form of a commonly used distribution, as described in Chapters 5 and 6.
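For readers who wish to verify such posterior calculations numerically, here is a brief sketch in Python (scipy is assumed to be available); it reproduces the posterior probabilities of Example 18.1 and the posterior mean implied by Example 18.3.

    from scipy.stats import binom

    # Example 18.1: two-point prior on p with a binomial (n = 2) likelihood.
    ps, prior = [0.1, 0.2], [0.6, 0.4]
    for x in (0, 1, 2):
        g = sum(binom.pmf(x, 2, p) * w for p, w in zip(ps, prior))  # g(x)
        post = [binom.pmf(x, 2, p) * w / g for p, w in zip(ps, prior)]
        print(f"x = {x}: g(x) = {g:.3f}, posterior = {post}")
    # x = 0 gives g(0) = 0.742 and posterior (0.6550, 0.3450), as in the text.

    # Example 18.3: gamma posterior with parameters sum(x) + 1 and 1/(n + 1);
    # for x-bar = 3 and n = 10 this is gamma(31, 1/11), with mean 31/11.
    n, xbar = 10, 3
    shape, scale = n * xbar + 1, 1 / (n + 1)
    print("posterior mean of lambda =", shape * scale)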
Point Estimation Using the Posterior Distribution
Once the posterior distribution is derived, we can easily use the summary of the posterior distribution to make inferences on the population parameters. For in- stance, the posterior mean, median, and mode can all be used to estimate the parameter.
Example 18.4: Suppose that x = 1 is observed for Example 18.2. Find the posterior mean and the posterior mode.
Solution: When x = 1, the posterior distribution of p can be expressed as

π(p|1) = 6p(1 − p), for 0 < p < 1.

To calculate the mean of this distribution, we need to find

∫_0^1 6p²(1 − p) dp = 6(1/3 − 1/4) = 1/2.

To find the posterior mode, we need to obtain the value of p such that the posterior distribution is maximized. Taking the derivative of π(p|1) with respect to p, we obtain 6 − 12p. Solving for p in 6 − 12p = 0, we obtain p = 1/2. The second derivative is −12, which implies that the posterior mode is achieved at p = 1/2.

Bayesian methods of estimation concerning the mean μ of a normal population are based on the following example.

Example 18.5: If x̄ is the mean of a random sample of size n from a normal population with known variance σ², and the prior distribution of the population mean is a normal distribution with known mean μ0 and known variance σ0², then show that the posterior distribution of the population mean is also a normal distribution with mean μ* and standard deviation σ*, where

μ* = [σ0²/(σ0² + σ²/n)] x̄ + [(σ²/n)/(σ0² + σ²/n)] μ0   and   σ* = √(σ0² σ²/(nσ0² + σ²)).

Solution: The density function of our sample is

f(x1, x2, . . . , xn|μ) = [1/((2π)^(n/2) σ^n)] exp[−(1/2) Σ_{i=1}^{n} ((xi − μ)/σ)²],

for −∞ < xi < ∞ and −∞ < μ < ∞.
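The posterior formulas of Example 18.5 are easy to apply; the following sketch encodes them in Python, with purely illustrative inputs (the numbers are ours, not from the text).

    def normal_posterior(xbar, n, sigma, mu0, sigma0):
        """Posterior mean and sd of a normal mean, known variance sigma^2."""
        v_data, v_prior = sigma**2 / n, sigma0**2  # var of x-bar; prior var
        mu_star = (v_prior * xbar + v_data * mu0) / (v_prior + v_data)
        sd_star = (v_prior * v_data / (v_prior + v_data)) ** 0.5
        return mu_star, sd_star

    # Illustrative values: x-bar = 52 from n = 25, sigma = 5, prior N(50, 2^2).
    print(normal_posterior(xbar=52.0, n=25, sigma=5.0, mu0=50.0, sigma0=2.0))
    # -> (51.6, 0.894...): the posterior mean is pulled toward the data.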
From prior experience we are led to believe that θ is a value of an exponential random variable with probability density

π(θ) = 2e^(−2θ), θ > 0.

If we have a sample of n observations on T, show that the posterior distribution of Θ is a gamma distribution with parameters

α = n + 1   and   β = (Σ_{i=1}^{n} ti + 2)^(−1).

18.12 Suppose that a sample consisting of 5, 6, 6, 7, 5, 6, 4, 9, 3, and 6 comes from a Poisson population with mean λ. Assume that the parameter λ follows a gamma distribution with parameters (3, 2). Under the squared-error loss function, find the Bayes estimate of λ.

18.13 A random variable X follows a negative binomial distribution with parameters k = 5 and p [i.e., b*(x; 5, p)]. Furthermore, we know that p follows a uniform distribution on the interval (0, 1). Find the Bayes estimate of p under the squared-error loss function.

18.14 A random variable X follows an exponential distribution with mean 1/β. Assume the prior distribution of β is another exponential distribution with mean 2.5. Determine the Bayes estimate of β under the absolute-error loss function.

18.15 A random sample X1, . . . , Xn comes from a uniform distribution (see Section 6.1) population U(0, θ) with unknown θ. The data are given below:

0.13, 1.06, 1.65, 1.73, 0.95, 0.56, 2.14, 0.33, 1.22, 0.20, 1.55, 1.18, 0.71, 0.01, 0.42, 1.03, 0.43, 1.02, 0.83, 0.88

Suppose the prior distribution of θ has the density

π(θ) = 1/θ²  for θ > 1,   and   π(θ) = 0  for θ ≤ 1.

Determine the Bayes estimator under the absolute-error loss function.
Bibliography
[1] Bartlett, M. S., and Kendall, D. G. (1946). “The Statistical Analysis of Variance Heterogeneity and Logarithmic Transformation,” Journal of the Royal Statistical Society, Ser. B. 8, 128–138.
[2] Bowker, A. H., and Lieberman, G. J. (1972). Engineering Statistics, 2nd ed. Upper Saddle River, N.J.: Prentice Hall.
[3] Box, G. E. P. (1988). “Signal to Noise Ratios, Performance Criteria and Transformations (with discussion),” Technometrics, 30, 1–17.
[4] Box, G. E. P., and Fung, C. A. (1986). “Studies in Quality Improvement: Minimizing Transmitted Variation by Parameter Design,” Report 8. University of Wisconsin-Madison, Center for Quality and Productivity Improvement.
[5] Box, G. E. P., Hunter, W. G., and Hunter, J. S. (1978). Statistics for Experimenters. New York: John Wiley & Sons.
[6] Brownlee, K. A. (1984). Statistical Theory and Methodology: In Science and Engineering, 2nd ed. New York: John Wiley & Sons.
[7] Carroll, R. J., and Ruppert, D. (1988). Transformation and Weighting in Regression. New York: Chapman and Hall.
[8] Chatterjee, S., Hadi, A. S., and Price, B. (1999). Regression Analysis by Example, 3rd ed. New York: John Wiley & Sons.
[9] Cook, R. D., and Weisberg, S. (1982). Residuals and Influence in Regression. New York: Chapman and Hall.
[10] Daniel, C. and Wood, F. S. (1999). Fitting Equations to Data: Computer Analysis of Multifactor Data, 2nd ed. New York: John Wiley & Sons.
[11] Daniel, W. W. (1989). Applied Nonparametric Statistics, 2nd ed. Belmont, Calif.: Wadsworth Publishing Company.
[12] Devore, J. L. (2003). Probability and Statistics for Engineering and the Sciences, 6th ed. Belmont, Calif: Duxbury Press.
[13] Dixon, W. J. (1983). Introduction to Statistical Analysis, 4th ed. New York: McGraw-Hill.
[14] Draper, N. R., and Smith, H. (1998). Applied Regression Analysis, 3rd ed. New York: John Wiley & Sons.
[15] Duncan, A. (1986). Quality Control and Industrial Statistics, 5th ed. Homewood, Ill.: Irwin.
[16] Dyer, D. D., and Keating, J. P. (1980). “On the Determination of Critical Values for Bartlett’s Test,” Journal of the American Statistical Association, 75, 313–319.
[17] Ewan, W. D., and Kemp, K. W. (1960). “Sampling Inspection of Continuous Processes with No Autocorrelation between Successive Results,” Biometrika, 47, 363–380.
[18] Geary, R. C. (1947). “Testing for Normality,” Biometrika, 34, 209–242.
[19] Gunst, R. F., and Mason, R. L. (1980). Regression Analysis and Its Application: A Data-Oriented
Approach. New York: Marcel Dekker.
[20] Guttman, I., Wilks, S. S., and Hunter, J. S. (1971). Introductory Engineering Statistics. New York:
John Wiley & Sons.
[21] Harville, D. A. (1977). “Maximum Likelihood Approaches to Variance Component Estimation and
to Related Problems,” Journal of the American Statistical Association, 72, 320–338.
[22] Hicks, C. R., and Turner, K. V. (1999). Fundamental Concepts in the Design of Experiments, 5th
ed. Oxford: Oxford University Press.
[23] Hoaglin, D. C., Mosteller, F., and Tukey, J. W. (1991). Fundamentals of Exploratory Analysis of
Variance. New York: John Wiley & Sons.
[24] Hocking, R. R. (1976). “The Analysis and Selection of Variables in Linear Regression,” Biometrics,
32, 1–49.
[25] Hodges, J. L., and Lehmann, E. L. (2005). Basic Concepts of Probability and Statistics, 2nd ed.
Philadelphia: Society for Industrial and Applied Mathematics.
[26] Hoerl, A. E., and Kennard, R. W. (1970). “Ridge Regression: Applications to Nonorthogonal
Problems,” Technometrics, 12, 55–67.
[27] Hogg, R. V., and Ledolter, J. (1992). Applied Statistics for Engineers and Physical Scientists, 2nd
ed. Upper Saddle River, N.J.: Prentice Hall.
[28] Hogg, R. V., McKean, J. W., and Craig, A. (2005). Introduction to Mathematical Statistics, 6th ed. Upper Saddle River, N.J.: Prentice Hall.
[29] Hollander, M., and Wolfe, D. (1999). Nonparametric Statistical Methods. New York: John Wiley & Sons.
[30] Johnson, N. L., and Leone, F. C. (1977). Statistics and Experimental Design: In Engineering and the Physical Sciences, 2nd ed. Vols. I and II, New York: John Wiley & Sons.
[31] Kackar, R. (1985). “Off-Line Quality Control, Parameter Design, and the Taguchi Methods,” Journal of Quality Technology, 17, 176–188.
[32] Koopmans, L. H. (1987). An Introduction to Contemporary Statistics, 2nd ed. Boston: Duxbury Press.
[33] Kutner, M. H., Nachtsheim, C. J., Neter, J., and Li, W. (2004). Applied Linear Regression Models, 5th ed. New York: McGraw-Hill/Irwin.
[34] Larsen, R. J., and Morris, M. L. (2000). An Introduction to Mathematical Statistics and Its Appli-
cations, 3rd ed. Upper Saddle River, N.J.: Prentice Hall.
[35] Lehmann, E. L., and D’Abrera, H. J. M. (1998). Nonparametrics: Statistical Methods Based on
Ranks, rev. ed. Upper Saddle River, N.J.: Prentice Hall.
[36] Lentner, M., and Bishop, T. (1986). Design and Analysis of Experiments, 2nd ed. Blacksburg, Va.:
Valley Book Co.
[37] Mallows, C. L. (1973). “Some Comments on Cp,” Technometrics, 15, 661–675.
[38] McClave, J. T., Dietrich, F. H., and Sincich, T. (1997). Statistics, 7th ed. Upper Saddle River, N.J.: Prentice Hall.
[39] Montgomery, D. C. (2008a). Design and Analysis of Experiments, 7th ed. New York: John Wiley & Sons.
[40] Montgomery, D. C. (2008b). Introduction to Statistical Quality Control, 6th ed. New York: John Wiley & Sons.
[41] Mosteller, F., and Tukey, J. (1977). Data Analysis and Regression. Reading, Mass.: Addison-Wesley Publishing Co.
[42] Myers, R. H. (1990). Classical and Modern Regression with Applications, 2nd ed. Boston: Duxbury Press.
[43] Myers, R. H., Khuri, A. I., and Vining, G. G. (1992). “Response Surface Alternatives to the Taguchi Robust Parameter Design Approach,” The American Statistician, 46, 131–139.
[44] Myers, R. H., Montgomery, D. C., and Anderson-Cook, C. M. (2009). Response Surface Method- ology: Process and Product Optimization Using Designed Experiments, 3rd ed. New York: John Wiley & Sons.
[45] Myers, R. H., Montgomery, D. C., Vining, G. G., and Robinson, T. J. (2008). Generalized Linear Models with Applications in Engineering and the Sciences, 2nd ed., New York: John Wiley & Sons.
[46] Noether, G. E. (1976). Introduction to Statistics: A Nonparametric Approach, 2nd ed. Boston: Houghton Mifflin Company.
[47] Olkin, I., Gleser, L. J., and Derman, C. (1994). Probability Models and Applications, 2nd ed. New York: Prentice Hall.
[48] Ott, R. L., and Longnecker, M. T. (2000). An Introduction to Statistical Methods and Data Analysis, 5th ed. Boston: Duxbury Press.
[49] Pacansky, J., England, C. D., and Wattman, R. (1986). “Infrared Spectroscopic Studies of Poly (perfluoropropyleneoxide) on Gold Substrate: A Classical Dispersion Analysis for the Refractive Index.” Applied Spectroscopy, 40, 8–16.
[50] Plackett, R. L., and Burman, J. P. (1946). “The Design of Multifactor Experiments,” Biometrika, 33, 305–325.
[51] Ross, S. M. (2002). Introduction to Probability Models, 9th ed. New York: Academic Press, Inc.
[52] Satterthwaite, F. E. (1946). “An Approximate Distribution of Estimates of Variance Components,”
Biometrics, 2, 110–114.
[53] Schilling, E. G., and Nelson, P. R. (1976). “The Effect of Nonnormality on the Control Limits of
X ̄ Charts,” Journal of Quality Technology, 8, 347–373.
[54] Schmidt, S. R., and Launsby, R. G. (1991). Understanding Industrial Designed Experiments. Col-
orado Springs, Colo.: Air Academy Press.
[55] Shoemaker, A. C., Tsui, K.-L., and Wu, C. F. J. (1991). “Economical Experimentation Methods
for Robust Parameter Design,” Technometrics, 33, 415–428.
[56] Snedecor, G. W., and Cochran, W. G. (1989). Statistical Methods, 8th ed. Ames, Iowa: The Iowa
State University Press.
[57] Steel, R. G. D., Torrie, J. H., and Dickey, D. A. (1996). Principles and Procedures of Statistics: A
Biometrical Approach, 3rd ed. New York: McGraw-Hill.
[58] Taguchi, G. (1991). Introduction to Quality Engineering. White Plains, N.Y.: Unipub/Kraus In-
ternational.
[59] Taguchi, G., and Wu, Y. (1985). Introduction to Off-Line Quality Control. Nagoya, Japan: Central Japan Quality Control Association.
[60] Thompson, W. O., and Cady, F. B. (1973). Proceedings of the University of Kentucky Conference on Regression with a Large Number of Predictor Variables. Lexington, Ken.: University of Kentucky Press.
[61] Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass.: Addison-Wesley Publishing Co.
[62] Vining, G. G., and Myers, R. H. (1990). “Combining Taguchi and Response Surface Philosophies:
A Dual Response Approach,” Journal of Quality Technology, 22, 38–45.
[63] Welch, W. J., Yu, T. K., Kang, S. M., and Sacks, J. (1990). “Computer Experiments for Quality
Control by Parameter Design,” Journal of Quality Technology, 22, 15–22.
[64] Winer, B. J. (1991). Statistical Principles in Experimental Design, 3rd ed. New York: McGraw-Hill.
Appendix A
Statistical Tables and Proofs
Table A.1 Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.25 0.30 0.40
0.7500 0.7000 0.6000 1.0000 1.0000 1.0000
0.5625 0.4900 0.3600 0.9375 0.9100 0.8400 1.0000 1.0000 1.0000
0.4219 0.3430 0.2160 0.8438 0.7840 0.6480 0.9844 0.9730 0.9360 1.0000 1.0000 1.0000
0.3164 0.2401 0.1296 0.7383 0.6517 0.4752 0.9492 0.9163 0.8208 0.9961 0.9919 0.9744 1.0000 1.0000 1.0000
0.2373 0.1681 0.0778 0.6328 0.5282 0.3370 0.8965 0.8369 0.6826 0.9844 0.9692 0.9130 0.9990 0.9976 0.9898 1.0000 1.0000 1.0000
0.1780 0.1176 0.0467 0.5339 0.4202 0.2333 0.8306 0.7443 0.5443 0.9624 0.9295 0.8208 0.9954 0.9891 0.9590 0.9998 0.9993 0.9959 1.0000 1.0000 1.0000
0.1335 0.0824 0.0280 0.4449 0.3294 0.1586 0.7564 0.6471 0.4199 0.9294 0.8740 0.7102 0.9871 0.9712 0.9037 0.9987 0.9962 0.9812 0.9999 0.9998 0.9984 1.0000 1.0000 1.0000
p
n r
0.10
0.20
0.8000 1.0000
0.6400 0.9600 1.0000
0.5120 0.8960 0.9920 1.0000
0.4096 0.8192 0.9728 0.9984 1.0000
0.3277 0.7373 0.9421 0.9933 0.9997 1.0000
0.2621 0.6554 0.9011 0.9830 0.9984 0.9999 1.0000
0.2097 0.5767 0.8520 0.9667 0.9953 0.9996 1.0000
0.50 0.60 0.70 0.80
0.5000 0.4000 0.3000 0.2000 1.0000 1.0000 1.0000 1.0000
0.2500 0.1600 0.0900 0.0400 0.7500 0.6400 0.5100 0.3600 1.0000 1.0000 1.0000 1.0000
0.1250 0.0640 0.0270 0.0080 0.5000 0.3520 0.2160 0.1040 0.8750 0.7840 0.6570 0.4880 1.0000 1.0000 1.0000 1.0000
0.0625 0.0256 0.0081 0.0016 0.3125 0.1792 0.0837 0.0272 0.6875 0.5248 0.3483 0.1808 0.9375 0.8704 0.7599 0.5904 1.0000 1.0000 1.0000 1.0000
0.0313 0.0102 0.0024 0.0003 0.1875 0.0870 0.0308 0.0067 0.5000 0.3174 0.1631 0.0579 0.8125 0.6630 0.4718 0.2627 0.9688 0.9222 0.8319 0.6723 1.0000 1.0000 1.0000 1.0000
0.0156 0.0041 0.0007 0.0001 0.1094 0.0410 0.0109 0.0016 0.3438 0.1792 0.0705 0.0170 0.6563 0.4557 0.2557 0.0989 0.8906 0.7667 0.5798 0.3446 0.9844 0.9533 0.8824 0.7379 1.0000 1.0000 1.0000 1.0000
0.0078 0.0016 0.0002 0.0000 0.0625 0.0188 0.0038 0.0004 0.2266 0.0963 0.0288 0.0047 0.5000 0.2898 0.1260 0.0333 0.7734 0.5801 0.3529 0.1480 0.9375 0.8414 0.6706 0.4233 0.9922 0.9720 0.9176 0.7903 1.0000 1.0000 1.0000 1.0000
0.90
0.1000 1.0000
0.0100 0.1900 1.0000
0.0010 0.0280 0.2710 1.0000
0.0001 0.0037 0.0523 0.3439 1.0000
0.0000 0.0005 0.0086 0.0815 0.4095 1.0000
0.0000 0.0001 0.0013 0.0159 0.1143 0.4686 1.0000
0.0000 0.0002 0.0027 0.0257 0.1497 0.5217 1.0000
1 0
1 1.0000
2 0
1 0.9900
0.9000 0.8100
2 1.0000
0.7290 2 0.9990
3 0
1 0.9720
3 1.0000
0.6561 2 0.9963
4 0
1 0.9477
3 0.9999
4 1.0000
0.5905 2 0.9914
5 0
1 0.9185
3 0.9995
4 1.0000
5 1.0000
0.5314 2 0.9842
6 0
1 0.8857
3 0.9987
4 0.9999
5 1.0000
6 1.0000
0.4783 2 0.9743
7 0
1 0.8503
3 0.9973
4 0.9998
5 1.0000
6
7
Table A.1 (continued) Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.40
0.0168 0.1064 0.3154 0.5941 0.8263 0.9502 0.9915 0.9993 1.0000
0.0101 0.0705 0.2318 0.4826 0.7334 0.9006 0.9750 0.9962 0.9997 1.0000
0.0060 0.0464 0.1673 0.3823 0.6331 0.8338 0.9452 0.9877 0.9983 0.9999 1.0000
0.0036 0.0302 0.1189 0.2963 0.5328 0.7535 0.9006 0.9707 0.9941 0.9993 1.0000
b(x; n, p)
0.50 0.60 0.70 0.80
0.0039 0.0007 0.0001 0.0000 0.0352 0.0085 0.0013 0.0001 0.1445 0.0498 0.0113 0.0012 0.3633 0.1737 0.0580 0.0104 0.6367 0.4059 0.1941 0.0563 0.8555 0.6846 0.4482 0.2031 0.9648 0.8936 0.7447 0.4967 0.9961 0.9832 0.9424 0.8322 1.0000 1.0000 1.0000 1.0000
0.0020 0.0003 0.0000
0.0195 0.0038 0.0004 0.0000 0.0898 0.0250 0.0043 0.0003 0.2539 0.0994 0.0253 0.0031 0.5000 0.2666 0.0988 0.0196 0.7461 0.5174 0.2703 0.0856 0.9102 0.7682 0.5372 0.2618 0.9805 0.9295 0.8040 0.5638 0.9980 0.9899 0.9596 0.8658 1.0000 1.0000 1.0000 1.0000
0.0010 0.0001 0.0000
0.0107 0.0017 0.0001 0.0000 0.0547 0.0123 0.0016 0.0001 0.1719 0.0548 0.0106 0.0009 0.3770 0.1662 0.0473 0.0064 0.6230 0.3669 0.1503 0.0328 0.8281 0.6177 0.3504 0.1209 0.9453 0.8327 0.6172 0.3222 0.9893 0.9536 0.8507 0.6242 0.9990 0.9940 0.9718 0.8926 1.0000 1.0000 1.0000 1.0000
0.0005 0.0000
0.0059 0.0007 0.0000
0.0327 0.0059 0.0006 0.0000 0.1133 0.0293 0.0043 0.0002 0.2744 0.0994 0.0216 0.0020 0.5000 0.2465 0.0782 0.0117 0.7256 0.4672 0.2103 0.0504 0.8867 0.7037 0.4304 0.1611 0.9673 0.8811 0.6873 0.3826 0.9941 0.9698 0.8870 0.6779 0.9995 0.9964 0.9802 0.9141 1.0000 1.0000 1.0000 1.0000
n r
0.10
0.20
0.1678 0.5033 0.7969 0.9437 0.9896 0.9988 0.9999 1.0000
0.1342 0.4362 0.7382 0.9144 0.9804 0.9969 0.9997 1.0000
0.1074 0.3758 0.6778 0.8791 0.9672 0.9936 0.9991 0.9999 1.0000
0.0859 0.3221 0.6174 0.8389 0.9496 0.9883 0.9980 0.9998 1.0000
0.25
0.1001 0.3671 0.6785 0.8862 0.9727 0.9958 0.9996 1.0000
0.0751 0.3003 0.6007 0.8343 0.9511 0.9900 0.9987 0.9999 1.0000
0.0563 0.2440 0.5256 0.7759 0.9219 0.9803 0.9965 0.9996 1.0000
0.0422 0.1971 0.4552 0.7133 0.8854 0.9657 0.9924 0.9988 0.9999 1.0000
0.30
0.0576 0.2553 0.5518 0.8059 0.9420 0.9887 0.9987 0.9999 1.0000
0.0404 0.1960 0.4628 0.7297 0.9012 0.9747 0.9957 0.9996 1.0000
0.0282 0.1493 0.3828 0.6496 0.8497 0.9527 0.9894 0.9984 0.9999 1.0000
0.0198 0.1130 0.3127 0.5696 0.7897 0.9218 0.9784 0.9957 0.9994 1.0000
0.90
0.0000 0.0004 0.0050 0.0381 0.1869 0.5695 1.0000
0.0000 0.0001 0.0009 0.0083 0.0530 0.2252 0.6126 1.0000
0.0000 0.0001 0.0016 0.0128 0.0702 0.2639 0.6513 1.0000
0.0000 0.0003 0.0028 0.0185 0.0896 0.3026 0.6862 1.0000
0.4305
2 0.9619
3 0.9950
4 0.9996
5 1.0000
6
7 8
8 0
1 0.8131
0.3874 2 0.9470
9 0
1 0.7748
3 0.9917
4 0.9991
5 0.9999
6 1.0000
7
8 9
0.3487 2 0.9298
10 0
1 0.7361
3 0.9872
4 0.9984
5 0.9999
6 1.0000
7
8
9 10
0.3138 2 0.9104
11 0
1 0.6974
3 0.9815
4 0.9972
5 0.9997
6 1.0000
7
8
9 10 11
Table A.1 (continued) Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.2824
2 0.8891
3 0.9744
4 0.9957
5 0.9995
6 0.9999
7 1.0000
8
9 10 11 12
12 0
1 0.6590
0.2542 2 0.8661
13 0
1 0.6213
3 0.9658
4 0.9935
5 0.9991
6 0.9999
7 1.0000
8
9 10 11 12 13
0.2288 2 0.8416
14 0
1 0.5846
3 0.9559
4 0.9908
5 0.9985
6 0.9998
7 1.0000
8
9 10 11 12 13 14
0.40
0.0022 0.0196 0.0834 0.2253 0.4382 0.6652 0.8418 0.9427 0.9847 0.9972 0.9997 1.0000
0.0013 0.0126 0.0579 0.1686 0.3530 0.5744 0.7712 0.9023 0.9679 0.9922 0.9987 0.9999 1.0000
0.0008 0.0081 0.0398 0.1243 0.2793 0.4859 0.6925 0.8499 0.9417 0.9825 0.9961 0.9994 0.9999 1.0000
b(x; n, p)
0.50 0.60 0.70 0.80
0.0002 0.0000
0.0032 0.0003 0.0000
0.0193 0.0028 0.0002 0.0000 0.0730 0.0153 0.0017 0.0001 0.1938 0.0573 0.0095 0.0006 0.3872 0.1582 0.0386 0.0039 0.6128 0.3348 0.1178 0.0194 0.8062 0.5618 0.2763 0.0726 0.9270 0.7747 0.5075 0.2054 0.9807 0.9166 0.7472 0.4417 0.9968 0.9804 0.9150 0.7251 0.9998 0.9978 0.9862 0.9313 1.0000 1.0000 1.0000 1.0000
0.0001 0.0000
0.0017 0.0001 0.0000
0.0112 0.0013 0.0001
0.0461 0.0078 0.0007 0.0000 0.1334 0.0321 0.0040 0.0002 0.2905 0.0977 0.0182 0.0012 0.5000 0.2288 0.0624 0.0070 0.7095 0.4256 0.1654 0.0300 0.8666 0.6470 0.3457 0.0991 0.9539 0.8314 0.5794 0.2527 0.9888 0.9421 0.7975 0.4983 0.9983 0.9874 0.9363 0.7664 0.9999 0.9987 0.9903 0.9450 1.0000 1.0000 1.0000 1.0000
0.0001 0.0000
0.0009 0.0001
0.0065 0.0006 0.0000
0.0287 0.0039 0.0002
0.0898 0.0175 0.0017 0.0000 0.2120 0.0583 0.0083 0.0004 0.3953 0.1501 0.0315 0.0024 0.6047 0.3075 0.0933 0.0116 0.7880 0.5141 0.2195 0.0439 0.9102 0.7207 0.4158 0.1298 0.9713 0.8757 0.6448 0.3018 0.9935 0.9602 0.8392 0.5519 0.9991 0.9919 0.9525 0.8021 0.9999 0.9992 0.9932 0.9560 1.0000 1.0000 1.0000 1.0000
n r
0.10
0.20
0.0687 0.2749 0.5583 0.7946 0.9274 0.9806 0.9961 0.9994 0.9999 1.0000
0.0550 0.2336 0.5017 0.7473 0.9009 0.9700 0.9930 0.9988 0.9998 1.0000
0.0440 0.1979 0.4481 0.6982 0.8702 0.9561 0.9884 0.9976 0.9996 1.0000
0.25
0.0317 0.1584 0.3907 0.6488 0.8424 0.9456 0.9857 0.9972 0.9996 1.0000
0.0238 0.1267 0.3326 0.5843 0.7940 0.9198 0.9757 0.9944 0.9990 0.9999 1.0000
0.0178 0.1010 0.2811 0.5213 0.7415 0.8883 0.9617 0.9897 0.9978 0.9997 1.0000
0.30
0.0138 0.0850 0.2528 0.4925 0.7237 0.8822 0.9614 0.9905 0.9983 0.9998 1.0000
0.0097 0.0637 0.2025 0.4206 0.6543 0.8346 0.9376 0.9818 0.9960 0.9993 0.9999 1.0000
0.0068 0.0475 0.1608 0.3552 0.5842 0.7805 0.9067 0.9685 0.9917 0.9983 0.9998 1.0000
0.90
0.0000 0.0001 0.0005 0.0043 0.0256 0.1109 0.3410 0.7176 1.0000
0.0000 0.0001 0.0009 0.0065 0.0342 0.1339 0.3787 0.7458 1.0000
0.0000 0.0002 0.0015 0.0092 0.0441 0.1584 0.4154 0.7712 1.0000
Table A.1 (continued) Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.40
0.0005 0.0052 0.0271 0.0905 0.2173 0.4032 0.6098 0.7869 0.9050 0.9662 0.9907 0.9981 0.9997 1.0000
0.0003 0.0033 0.0183 0.0651 0.1666 0.3288 0.5272 0.7161 0.8577 0.9417 0.9809 0.9951 0.9991 0.9999 1.0000
b(x; n, p)
0.50 0.60
0.0000
0.0005 0.0000 0.0037 0.0003 0.0176 0.0019 0.0592 0.0093 0.1509 0.0338 0.3036 0.0950 0.5000 0.2131 0.6964 0.3902 0.8491 0.5968 0.9408 0.7827 0.9824 0.9095 0.9963 0.9729 0.9995 0.9948 1.0000 0.9995
n r
0.10
0.20
0.0352 0.1671 0.3980 0.6482 0.8358 0.9389 0.9819 0.9958 0.9992 0.9999 1.0000
0.0281 0.1407 0.3518 0.5981 0.7982 0.9183 0.9733 0.9930 0.9985 0.9998 1.0000
0.25
0.0134 0.0802 0.2361 0.4613 0.6865 0.8516 0.9434 0.9827 0.9958 0.9992 0.9999 1.0000
0.0100 0.0635 0.1971 0.4050 0.6302 0.8103 0.9204 0.9729 0.9925 0.9984 0.9997 1.0000
0.30
0.0047 0.0353 0.1268 0.2969 0.5155 0.7216 0.8689 0.9500 0.9848 0.9963 0.9993 0.9999 1.0000
0.0033 0.0261 0.0994 0.2459 0.4499 0.6598 0.8247 0.9256 0.9743 0.9929 0.9984 0.9997 1.0000
0.70 0.80
0.0000
0.0001
0.0007 0.0000 0.0037 0.0001 0.0152 0.0008 0.0500 0.0042 0.1311 0.0181 0.2784 0.0611 0.4845 0.1642 0.7031 0.3518 0.8732 0.6020 0.9647 0.8329 0.9953 0.9648
0.90
0.0000 0.0003 0.0022 0.0127 0.0556 0.1841 0.4510 0.7941 1.0000
0.0000 0.0001 0.0005 0.0033 0.0170 0.0684 0.2108 0.4853 0.8147 1.0000
0.2059
2 0.8159
3 0.9444
4 0.9873
5 0.9978
6 0.9997
7 1.0000
8
9
10
11
12
13
14
15
15 0
1 0.5490
0.1853 2 0.7892
0.0000
0.0003 0.0000 0.0021 0.0001 0.0106 0.0009 0.0384 0.0049 0.1051 0.0191 0.2272 0.0583 0.4018 0.1423 0.5982 0.2839 0.7728 0.4728 0.8949 0.6712 0.9616 0.8334 0.9894 0.9349 0.9979 0.9817 0.9997 0.9967 1.0000 0.9997
16 0
1 0.5147
3 0.9316
4 0.9830
5 0.9967
6 0.9995
7 0.9999
8 1.0000
9
10
11
12
13
14
15
16
0.0000
0.0003
0.0016 0.0000 0.0071 0.0002 0.0257 0.0015 0.0744 0.0070 0.1753 0.0267 0.3402 0.0817 0.5501 0.2018 0.7541 0.4019 0.9006 0.6482 0.9739 0.8593 0.9967 0.9719
1.0000 1.0000 1.0000
1.0000 1.0000 1.0000
Table A.1 (continued) Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.40
0.0002 0.0021 0.0123 0.0464 0.1260 0.2639 0.4478 0.6405 0.8011 0.9081 0.9652 0.9894 0.9975 0.9995 0.9999 1.0000
0.0001 0.0013 0.0082 0.0328 0.0942 0.2088 0.3743 0.5634 0.7368 0.8653 0.9424 0.9797 0.9942 0.9987 0.9998 1.0000
b(x; n, p)
0.50
0.0000 0.0001 0.0012 0.0064 0.0245 0.0717 0.1662 0.3145 0.5000 0.6855 0.8338 0.9283 0.9755 0.9936 0.9988 0.9999 1.0000
0.0000 0.0001 0.0007 0.0038 0.0154 0.0481 0.1189 0.2403 0.4073 0.5927 0.7597 0.8811 0.9519 0.9846 0.9962 0.9993 0.9999 1.0000
n r
0.10
0.20
0.0225 0.1182 0.3096 0.5489 0.7582 0.8943 0.9623 0.9891 0.9974 0.9995 0.9999 1.0000
0.0180 0.0991 0.2713 0.5010 0.7164 0.8671 0.9487 0.9837 0.9957 0.9991 0.9998 1.0000
0.25
0.0075 0.0501 0.1637 0.3530 0.5739 0.7653 0.8929 0.9598 0.9876 0.9969 0.9994 0.9999 1.0000
0.0056 0.0395 0.1353 0.3057 0.5187 0.7175 0.8610 0.9431 0.9807 0.9946 0.9988 0.9998 1.0000
0.30
0.0023 0.0193 0.0774 0.2019 0.3887 0.5968 0.7752 0.8954 0.9597 0.9873 0.9968 0.9993 0.9999 1.0000
0.0016 0.0142 0.0600 0.1646 0.3327 0.5344 0.7217 0.8593 0.9404 0.9790 0.9939 0.9986 0.9997 1.0000
0.60 0.70 0.80
0.0000
0.0001
0.0005 0.0000
0.0025 0.0001
0.0106 0.0007 0.0000 0.0348 0.0032 0.0001 0.0919 0.0127 0.0005 0.1989 0.0403 0.0026 0.3595 0.1046 0.0109 0.5522 0.2248 0.0377 0.7361 0.4032 0.1057 0.8740 0.6113 0.2418 0.9536 0.7981 0.4511 0.9877 0.9226 0.6904 0.9979 0.9807 0.8818 0.9998 0.9977 0.9775 1.0000 1.0000 1.0000
0.0000
0.0002
0.0013 0.0000
0.0058 0.0003
0.0203 0.0014 0.0000 0.0576 0.0061 0.0002 0.1347 0.0210 0.0009 0.2632 0.0596 0.0043 0.4366 0.1407 0.0163 0.6257 0.2783 0.0513 0.7912 0.4656 0.1329 0.9058 0.6673 0.2836 0.9672 0.8354 0.4990 0.9918 0.9400 0.7287 0.9987 0.9858 0.9009 0.9999 0.9984 0.9820 1.0000 1.0000 1.0000
0.90
0.1668
2 0.7618
3 0.9174
4 0.9779
5 0.9953
6 0.9992
7 0.9999
8 1.0000
9
10
11
12
13
14
15
16
17
17 0
1 0.4818
0.1501 2 0.7338
18 0
1 0.4503
3 0.9018
4 0.9718
5 0.9936
6 0.9988
7 0.9998
8 1.0000
9
10
11
12
13
14
15
16
17
18
0.0000 0.0002 0.0012 0.0064 0.0282 0.0982 0.2662 0.5497 0.8499 1.0000
0.0000 0.0001 0.0008 0.0047 0.0221 0.0826 0.2382 0.5182 0.8332 1.0000
Table A.1 (continued) Binomial Probability Sums Σ_{x=0}^{r} b(x; n, p)
0.40
0.0001 0.0008 0.0055 0.0230 0.0696 0.1629 0.3081 0.4878 0.6675 0.8139 0.9115 0.9648 0.9884 0.9969 0.9994 0.9999 1.0000
0.0000 0.0005 0.0036 0.0160 0.0510 0.1256 0.2500 0.4159 0.5956 0.7553 0.8725 0.9435 0.9790 0.9935 0.9984 0.9997 1.0000
b(x; n, p)
0.50
0.0000 0.0004 0.0022 0.0096 0.0318 0.0835 0.1796 0.3238 0.5000 0.6762 0.8204 0.9165 0.9682 0.9904 0.9978 0.9996 1.0000
0.0000 0.0002 0.0013 0.0059 0.0207 0.0577 0.1316 0.2517 0.4119 0.5881 0.7483 0.8684 0.9423 0.9793 0.9941 0.9987 0.9998 1.0000
n r
0.10
0.20
0.0144 0.0829 0.2369 0.4551 0.6733 0.8369 0.9324 0.9767 0.9933 0.9984 0.9997 1.0000
0.25
0.0042 0.0310 0.1113 0.2631 0.4654 0.6678 0.8251 0.9225 0.9713 0.9911 0.9977 0.9995 0.9999 1.0000
0.0032 0.0243 0.0913 0.2252 0.4148 0.6172 0.7858 0.8982 0.9591 0.9861 0.9961 0.9991 0.9998 1.0000
0.30
0.0011 0.0104 0.0462 0.1332 0.2822 0.4739 0.6655 0.8180 0.9161 0.9674 0.9895 0.9972 0.9994 0.9999 1.0000
0.0008 0.0076 0.0355 0.1071 0.2375 0.4164 0.6080 0.7723 0.8867 0.9520 0.9829 0.9949 0.9987 0.9997 1.0000
0.60 0.70
0.0000
0.0001
0.0006 0.0000 0.0031 0.0001 0.0116 0.0006 0.0352 0.0028 0.0885 0.0105 0.1861 0.0326 0.3325 0.0839 0.5122 0.1820 0.6919 0.3345 0.8371 0.5261 0.9304 0.7178 0.9770 0.8668 0.9945 0.9538 0.9992 0.9896 0.9999 0.9989 1.0000 1.0000
0.0000
0.0003
0.0016 0.0000 0.0065 0.0003 0.0210 0.0013 0.0565 0.0051 0.1275 0.0171 0.2447 0.0480 0.4044 0.1133 0.5841 0.2277 0.7500 0.3920 0.8744 0.5836 0.9490 0.7625 0.9840 0.8929 0.9964 0.9645 0.9995 0.9924 1.0000 0.9992
0.80 0.90
0.0000
0.0003
0.0016
0.0067 0.0000 0.0233 0.0003 0.0676 0.0017 0.1631 0.0086 0.3267 0.0352 0.5449 0.1150 0.7631 0.2946 0.9171 0.5797 0.9856 0.8649 1.0000 1.0000
0.0000
0.0001
0.0006
0.0026 0.0000 0.0100 0.0001 0.0321 0.0004 0.0867 0.0024 0.1958 0.0113 0.3704 0.0432 0.5886 0.1330 0.7939 0.3231 0.9308 0.6083 0.9885 0.8784
0.1351
2 0.7054
3 0.8850
4 0.9648
5 0.9914
6 0.9983
7 0.9997
8 1.0000
9
10
11
12
13
14
15
16
17
18
19
19 0
1 0.4203
0.1216 2 0.6769
0.0115 0.0692 0.2061 0.4114 0.6296 0.8042 0.9133 0.9679 0.9900 0.9974 0.9994 0.9999 1.0000
20 0
1 0.3917
3 0.8670
4 0.9568
5 0.9887
6 0.9976
7 0.9996
8 0.9999
9 1.0000
10
11
12
13
14
15
16
17
18
19
20
1.0000 1.0000
1.0000
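For readers reproducing these cumulative binomial sums by machine, each entry is a one-line computation; a sketch in Python, assuming scipy is available:

    from scipy.stats import binom
    print(binom.cdf(1, 2, 0.5))   # n = 2, r = 1, p = 0.50 -> 0.7500
    print(binom.cdf(3, 5, 0.3))   # n = 5, r = 3, p = 0.30 -> 0.9692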
Table A.2 Poisson Probability Sums Σ_{x=0}^{r} p(x; μ)
r
0 1 2 3 4 5 6
r
0 1 2 3 4 5
6 7 8 9
10
11
12
13
14
15
16
0.1
0.9048 0.9953 0.9998 1.0000
1.0
0.3679 0.7358 0.9197 0.9810 0.9963 0.9994
0.9999 1.0000
0.2
0.8187 0.9825 0.9989 0.9999 1.0000
1.5
0.2231 0.5578 0.8088 0.9344 0.9814 0.9955
0.9991 0.9998 1.0000
0.3
0.7408 0.9631 0.9964 0.9997 1.0000
2.0
0.1353 0.4060 0.6767 0.8571 0.9473 0.9834
0.9955 0.9989 0.9998 1.0000
0.4
0.6703 0.9384 0.9921 0.9992 0.9999 1.0000
2.5
0.0821 0.2873 0.5438 0.7576 0.8912 0.9580
0.9858 0.9958 0.9989 0.9997 0.9999
1.0000
0.6
0.5488 0.8781 0.9769 0.9966 0.9996 1.0000
3.5
0.0302 0.1359 0.3208 0.5366 0.7254 0.8576
0.9347 0.9733 0.9901 0.9967 0.9990
0.9997 0.9999 1.0000
0.7
0.4966 0.8442 0.9659 0.9942 0.9992 0.9999 1.0000
4.0
0.0183 0.0916 0.2381 0.4335 0.6288 0.7851
0.8893 0.9489 0.9786 0.9919 0.9972
0.9991 0.9997 0.9999 1.0000
0.8
0.4493 0.8088 0.9526 0.9909 0.9986 0.9998 1.0000
4.5
0.0111 0.0611 0.1736 0.3423 0.5321 0.7029
0.8311 0.9134 0.9597 0.9829 0.9933
0.9976 0.9992 0.9997 0.9999 1.0000
0.9
0.4066 0.7725 0.9371 0.9865 0.9977 0.9997 1.0000
5.0
0.0067 0.0404 0.1247 0.2650 0.4405 0.6160
0.7622 0.8666 0.9319 0.9682 0.9863
0.9945 0.9980 0.9993 0.9998 0.9999 1.0000
0.5
0.6065 0.9098 0.9856 0.9982 0.9998 1.0000
μ
3.0
0.0498 0.1991 0.4232 0.6472 0.8153 0.9161
0.9665 0.9881 0.9962 0.9989 0.9997
0.9999 1.0000
Table A.2 (continued) Poisson Probability Sums Σ_{x=0}^{r} p(x; μ)
8.0
0.0003 0.0030 0.0138 0.0424 0.0996 0.1912
0.3134 0.4530 0.5925 0.7166 0.8159
0.8881 0.9362 0.9658 0.9827 0.9918
0.9963 0.9984 0.9993 0.9997 0.9999
1.0000
r 5.5
0 0.0041
1 0.0266
2 0.0884
3 0.2017
4 0.3575
5 0.5289
6 0.6860
7 0.8095
8 0.8944
9 0.9462
10 0.9747
11 0.9890
12 0.9955
13 0.9983
14 0.9994
15 0.9998
16 0.9999
17 1.0000
18 19
20
21 22 23 24
6.0
0.0025 0.0174 0.0620 0.1512 0.2851 0.4457
0.6063 0.7440 0.8472 0.9161 0.9574
0.9799 0.9912 0.9964 0.9986 0.9995
0.9998 0.9999 1.0000
6.5
0.0015 0.0113 0.0430 0.1118 0.2237 0.3690
0.5265 0.6728 0.7916 0.8774 0.9332
0.9661 0.9840 0.9929 0.9970 0.9988
0.9996 0.9998 0.9999 1.0000
7.0
0.0009 0.0073 0.0296 0.0818 0.1730 0.3007
0.4497 0.5987 0.7291 0.8305 0.9015
0.9467 0.9730 0.9872 0.9943 0.9976
0.9990 0.9996 0.9999 1.0000
7.5
0.0006 0.0047 0.0203 0.0591 0.1321 0.2414
0.3782 0.5246 0.6620 0.7764 0.8622
0.9208 0.9573 0.9784 0.9897 0.9954
0.9980 0.9992 0.9997 0.9999
8.5
0.0002 0.0019 0.0093 0.0301 0.0744 0.1496
0.2562 0.3856 0.5231 0.6530 0.7634
0.8487 0.9091 0.9486 0.9726 0.9862
0.9934 0.9970 0.9987 0.9995 0.9998
0.9999 1.0000
9.0
0.0001 0.0012 0.0062 0.0212 0.0550 0.1157
0.2068 0.3239 0.4557 0.5874 0.7060
0.8030 0.8758 0.9261 0.9585 0.9780
0.9889 0.9947 0.9976 0.9989 0.9996
0.9998 0.9999 1.0000
9.5
0.0001 0.0008 0.0042 0.0149 0.0403 0.0885
0.1649 0.2687 0.3918 0.5218 0.6453
0.7520 0.8364 0.8981 0.9400 0.9665
0.9823 0.9911 0.9957 0.9980 0.9991
0.9996 0.9999 0.9999 1.0000
Table A.2 (continued) Poisson Probability Sums Σ_{x=0}^{r} p(x; μ)
15.0
0.0000 0.0002 0.0009 0.0028
0.0076 0.0180 0.0374 0.0699 0.1185
0.1848 0.2676 0.3632 0.4657 0.5681
0.6641 0.7489 0.8195 0.8752 0.9170
0.9469 0.9673 0.9805 0.9888 0.9938
0.9967 0.9983 0.9991 0.9996 0.9998
0.9999 1.0000
r 10.0
0 0.0000
1 0.0005
2 0.0028
3 0.0103
4 0.0293
5 0.0671
6 0.1301
7 0.2202
8 0.3328
9 0.4579
10 0.5830
11 0.6968
12 0.7916
13 0.8645
14 0.9165
15 0.9513
16 0.9730
17 0.9857
18 0.9928
19 0.9965
20 0.9984
21 0.9993
22 0.9997
23 0.9999
24 1.0000
25
26 27 28 29 30
31 32 33 34 35
36 37
11.0
0.0000 0.0002 0.0012 0.0049 0.0151 0.0375
0.0786 0.1432 0.2320 0.3405 0.4599
0.5793 0.6887 0.7813 0.8540 0.9074
0.9441 0.9678 0.9823 0.9907 0.9953
0.9977 0.9990 0.9995 0.9998 0.9999
1.0000
12.0
0.0000 0.0001 0.0005 0.0023 0.0076 0.0203
0.0458 0.0895 0.1550 0.2424 0.3472
0.4616 0.5760 0.6815 0.7720 0.8444
0.8987 0.9370 0.9626 0.9787 0.9884
0.9939 0.9970 0.9985 0.9993 0.9997
0.9999 0.9999 1.0000
13.0
0.0000 0.0002 0.0011 0.0037 0.0107
0.0259 0.0540 0.0998 0.1658 0.2517
0.3532 0.4631 0.5730 0.6751 0.7636
0.8355 0.8905 0.9302 0.9573 0.9750
0.9859 0.9924 0.9960 0.9980 0.9990
0.9995 0.9998 0.9999 1.0000
14.0
0.0000 0.0001 0.0005 0.0018 0.0055
0.0142 0.0316 0.0621 0.1094 0.1757
0.2600 0.3585 0.4644 0.5704 0.6694
0.7559 0.8272 0.8826 0.9235 0.9521
0.9712 0.9833 0.9907 0.9950 0.9974
0.9987 0.9994 0.9997 0.9999 0.9999
1.0000
16.0
0.0000 0.0001 0.0004 0.0014
0.0040 0.0100 0.0220 0.0433 0.0774
0.1270 0.1931 0.2745 0.3675 0.4667
0.5660 0.6593 0.7423 0.8122 0.8682
0.9108 0.9418 0.9633 0.9777 0.9869
0.9925 0.9959 0.9978 0.9989 0.9994
0.9997 0.9999 0.9999 1.0000
17.0
0.0000 0.0002 0.0007
0.0021 0.0054 0.0126 0.0261 0.0491
0.0847 0.1350 0.2009 0.2808 0.3715
0.4677 0.5640 0.6550 0.7363 0.8055
0.8615 0.9047 0.9367 0.9594 0.9748
0.9848 0.9912 0.9950 0.9973 0.9986
0.9993 0.9996 0.9998 0.9999 1.0000
18.0
0.0000 0.0001 0.0003
0.0010 0.0029 0.0071 0.0154 0.0304
0.0549 0.0917 0.1426 0.2081 0.2867
0.3751 0.4686 0.5622 0.6509 0.7307
0.7991 0.8551 0.8989 0.9317 0.9554
0.9718 0.9827 0.9897 0.9941 0.9967
0.9982 0.9990 0.9995 0.9998 0.9999
0.9999 1.0000
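As with Table A.1, these cumulative Poisson sums can be reproduced numerically; a sketch assuming scipy:

    from scipy.stats import poisson
    print(poisson.cdf(2, 1.0))   # r = 2, mu = 1.0 -> 0.9197
    print(poisson.cdf(5, 5.0))   # r = 5, mu = 5.0 -> 0.6160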
Table A.3 Areas under the Normal Curve
z .00 .01 .02 .03 .04
−3.4 0.0003 0.0003 0.0003 0.0003 0.0003 −3.3 0.0005 0.0005 0.0005 0.0004 0.0004 −3.2 0.0007 0.0007 0.0006 0.0006 0.0006 −3.1 0.0010 0.0009 0.0009 0.0009 0.0008 −3.0 0.0013 0.0013 0.0013 0.0012 0.0012
−2.9 0.0019 0.0018 0.0018 0.0017 0.0016 −2.8 0.0026 0.0025 0.0024 0.0023 0.0023 −2.7 0.0035 0.0034 0.0033 0.0032 0.0031 −2.6 0.0047 0.0045 0.0044 0.0043 0.0041 −2.5 0.0062 0.0060 0.0059 0.0057 0.0055
−2.4 0.0082 0.0080 0.0078 0.0075 0.0073 −2.3 0.0107 0.0104 0.0102 0.0099 0.0096 −2.2 0.0139 0.0136 0.0132 0.0129 0.0125 −2.1 0.0179 0.0174 0.0170 0.0166 0.0162 −2.0 0.0228 0.0222 0.0217 0.0212 0.0207
−1.9 0.0287 0.0281 0.0274 0.0268 0.0262 −1.8 0.0359 0.0351 0.0344 0.0336 0.0329 −1.7 0.0446 0.0436 0.0427 0.0418 0.0409 −1.6 0.0548 0.0537 0.0526 0.0516 0.0505 −1.5 0.0668 0.0655 0.0643 0.0630 0.0618
−1.4 0.0808 0.0793 0.0778 0.0764 0.0749 −1.3 0.0968 0.0951 0.0934 0.0918 0.0901 −1.2 0.1151 0.1131 0.1112 0.1093 0.1075 −1.1 0.1357 0.1335 0.1314 0.1292 0.1271 −1.0 0.1587 0.1562 0.1539 0.1515 0.1492
−0.9 0.1841 0.1814 0.1788 0.1762 0.1736 −0.8 0.2119 0.2090 0.2061 0.2033 0.2005 −0.7 0.2420 0.2389 0.2358 0.2327 0.2296 −0.6 0.2743 0.2709 0.2676 0.2643 0.2611 −0.5 0.3085 0.3050 0.3015 0.2981 0.2946
−0.4 0.3446 0.3409 0.3372 0.3336 0.3300 −0.3 0.3821 0.3783 0.3745 0.3707 0.3669 −0.2 0.4207 0.4168 0.4129 0.4090 0.4052 −0.1 0.4602 0.4562 0.4522 0.4483 0.4443 −0.0 0.5000 0.4960 0.4920 0.4880 0.4840
.05 .06
0.0003 0.0003 0.0004 0.0004 0.0006 0.0006 0.0008 0.0008 0.0011 0.0011
0.0016 0.0015 0.0022 0.0021 0.0030 0.0029 0.0040 0.0039 0.0054 0.0052
0.0071 0.0069 0.0094 0.0091 0.0122 0.0119 0.0158 0.0154 0.0202 0.0197
0.0256 0.0250 0.0322 0.0314 0.0401 0.0392 0.0495 0.0485 0.0606 0.0594
0.0735 0.0721 0.0885 0.0869 0.1056 0.1038 0.1251 0.1230 0.1469 0.1446
0.1711 0.1685 0.1977 0.1949 0.2266 0.2236 0.2578 0.2546 0.2912 0.2877
0.3264 0.3228 0.3632 0.3594 0.4013 0.3974 0.4404 0.4364 0.4801 0.4761
[Figure: standard normal curve; the table entries give the area under the curve to the left of z.]
.07 .08
0.0003 0.0003 0.0004 0.0004 0.0005 0.0005 0.0008 0.0007 0.0011 0.0010
0.0015 0.0014 0.0021 0.0020 0.0028 0.0027 0.0038 0.0037 0.0051 0.0049
0.0068 0.0066 0.0089 0.0087 0.0116 0.0113 0.0150 0.0146 0.0192 0.0188
0.0244 0.0239 0.0307 0.0301 0.0384 0.0375 0.0475 0.0465 0.0582 0.0571
0.0708 0.0694 0.0853 0.0838 0.1020 0.1003 0.1210 0.1190 0.1423 0.1401
0.1660 0.1635 0.1922 0.1894 0.2206 0.2177 0.2514 0.2483 0.2843 0.2810
0.3192 0.3156 0.3557 0.3520 0.3936 0.3897 0.4325 0.4286 0.4721 0.4681
.09
0.0002 0.0003 0.0005 0.0007 0.0010
0.0014 0.0019 0.0026 0.0036 0.0048
0.0064 0.0084 0.0110 0.0143 0.0183
0.0233 0.0294 0.0367 0.0455 0.0559
0.0681 0.0823 0.0985 0.1170 0.1379
0.1611 0.1867 0.2148 0.2451 0.2776
0.3121 0.3483 0.3859 0.4247 0.4641
Table A.3 (continued) Areas under the Normal Curve
z .00
0.0 0.5000
0.1 0.5398
0.2 0.5793
0.3 0.6179
0.4 0.6554
0.5 0.6915
0.6 0.7257
0.7 0.7580
0.8 0.7881
0.9 0.8159
1.0 0.8413
1.1 0.8643
1.2 0.8849
1.3 0.9032
1.4 0.9192
1.5 0.9332
1.6 0.9452
1.7 0.9554
1.8 0.9641
1.9 0.9713
2.0 0.9772
2.1 0.9821
2.2 0.9861
2.3 0.9893
2.4 0.9918
2.5 0.9938
2.6 0.9953
2.7 0.9965
2.8 0.9974
2.9 0.9981
3.0 0.9987
3.1 0.9990
3.2 0.9993
3.3 0.9995
3.4 0.9997
.01 .02
0.5040 0.5080 0.5438 0.5478 0.5832 0.5871 0.6217 0.6255 0.6591 0.6628
0.6950 0.6985 0.7291 0.7324 0.7611 0.7642 0.7910 0.7939 0.8186 0.8212
0.8438 0.8461 0.8665 0.8686 0.8869 0.8888 0.9049 0.9066 0.9207 0.9222
0.9345 0.9357 0.9463 0.9474 0.9564 0.9573 0.9649 0.9656 0.9719 0.9726
0.9778 0.9783 0.9826 0.9830 0.9864 0.9868 0.9896 0.9898 0.9920 0.9922
0.9940 0.9941 0.9955 0.9956 0.9966 0.9967 0.9975 0.9976 0.9982 0.9982
0.9987 0.9987 0.9991 0.9991 0.9993 0.9994 0.9995 0.9995 0.9997 0.9997
.03 .04 .05
0.5120 0.5160 0.5199 0.5517 0.5557 0.5596 0.5910 0.5948 0.5987 0.6293 0.6331 0.6368 0.6664 0.6700 0.6736
0.7019 0.7054 0.7088 0.7357 0.7389 0.7422 0.7673 0.7704 0.7734 0.7967 0.7995 0.8023 0.8238 0.8264 0.8289
0.8485 0.8508 0.8531 0.8708 0.8729 0.8749 0.8907 0.8925 0.8944 0.9082 0.9099 0.9115 0.9236 0.9251 0.9265
0.9370 0.9382 0.9394 0.9484 0.9495 0.9505 0.9582 0.9591 0.9599 0.9664 0.9671 0.9678 0.9732 0.9738 0.9744
0.9788 0.9793 0.9798 0.9834 0.9838 0.9842 0.9871 0.9875 0.9878 0.9901 0.9904 0.9906 0.9925 0.9927 0.9929
0.9943 0.9945 0.9946 0.9957 0.9959 0.9960 0.9968 0.9969 0.9970 0.9977 0.9977 0.9978 0.9983 0.9984 0.9984
0.9988 0.9988 0.9989 0.9991 0.9992 0.9992 0.9994 0.9994 0.9994 0.9996 0.9996 0.9996 0.9997 0.9997 0.9997
.06 .07 .08
0.5239 0.5279 0.5319 0.5636 0.5675 0.5714 0.6026 0.6064 0.6103 0.6406 0.6443 0.6480 0.6772 0.6808 0.6844
0.7123 0.7157 0.7190 0.7454 0.7486 0.7517 0.7764 0.7794 0.7823 0.8051 0.8078 0.8106 0.8315 0.8340 0.8365
0.8554 0.8577 0.8599 0.8770 0.8790 0.8810 0.8962 0.8980 0.8997 0.9131 0.9147 0.9162 0.9279 0.9292 0.9306
0.9406 0.9418 0.9429 0.9515 0.9525 0.9535 0.9608 0.9616 0.9625 0.9686 0.9693 0.9699 0.9750 0.9756 0.9761
0.9803 0.9808 0.9812 0.9846 0.9850 0.9854 0.9881 0.9884 0.9887 0.9909 0.9911 0.9913 0.9931 0.9932 0.9934
0.9948 0.9949 0.9951 0.9961 0.9962 0.9963 0.9971 0.9972 0.9973 0.9979 0.9979 0.9980 0.9985 0.9985 0.9986
0.9989 0.9989 0.9990 0.9992 0.9992 0.9993 0.9994 0.9995 0.9995 0.9996 0.9996 0.9996 0.9997 0.9997 0.9997
.09
0.5359 0.5753 0.6141 0.6517 0.6879
0.7224 0.7549 0.7852 0.8133 0.8389
0.8621 0.8830 0.9015 0.9177 0.9319
0.9441 0.9545 0.9633 0.9706 0.9767
0.9817 0.9857 0.9890 0.9916 0.9936
0.9952 0.9964 0.9974 0.9981 0.9986
0.9990 0.9993 0.9995 0.9997 0.9998
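The tabulated areas are values of the standard normal cdf and can be reproduced numerically; a sketch assuming scipy:

    from scipy.stats import norm
    print(norm.cdf(1.96))    # -> 0.9750
    print(norm.cdf(-1.25))   # -> 0.1056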
Table A.4 Critical Values of the t-Distribution
[Figure: t-distribution curve; tα cuts off an area of α in the right tail.]
0.05 0.025
6.314 12.706 2.920 4.303 2.353 3.182 2.132 2.776 2.015 2.571
1.943 2.447 1.895 2.365 1.860 2.306 1.833 2.262 1.812 2.228
1.796 2.201 1.782 2.179 1.771 2.160 1.761 2.145 1.753 2.131
1.746 2.120 1.740 2.110 1.734 2.101 1.729 2.093 1.725 2.086
1.721 2.080 1.717 2.074 1.714 2.069 1.711 2.064 1.708 2.060
1.706 2.056 1.703 2.052 1.701 2.048 1.699 2.045 1.697 2.042
1.684 2.021 1.671 2.000 1.658 1.980 1.645 1.960
v 0.40 0.30
1 0.325 0.727
2 0.289 0.617
3 0.277 0.584
4 0.271 0.569
5 0.267 0.559
6 0.265 0.553
7 0.263 0.549
8 0.262 0.546
9 0.261 0.543
10 0.260 0.542
11 0.260 0.540
12 0.259 0.539
13 0.259 0.538
14 0.258 0.537
15 0.258 0.536
16 0.258 0.535
17 0.257 0.534
18 0.257 0.534
19 0.257 0.533
20 0.257 0.533
21 0.257 0.532
22 0.256 0.532
23 0.256 0.532
24 0.256 0.531
25 0.256 0.531
26 0.256 0.531
27 0.256 0.531
28 0.256 0.530
29 0.256 0.530
30 0.256 0.530
40 0.255 0.529
60 0.254 0.527 120 0.254 0.526 ∞ 0.253 0.524
α
0.20 0.15 0.10
1.376 1.963 3.078 1.061 1.386 1.886 0.978 1.250 1.638 0.941 1.190 1.533 0.920 1.156 1.476
0.906 1.134 1.440 0.896 1.119 1.415 0.889 1.108 1.397 0.883 1.100 1.383 0.879 1.093 1.372
0.876 1.088 1.363 0.873 1.083 1.356 0.870 1.079 1.350 0.868 1.076 1.345 0.866 1.074 1.341
0.865 1.071 1.337 0.863 1.069 1.333 0.862 1.067 1.330 0.861 1.066 1.328 0.860 1.064 1.325
0.859 1.063 1.323 0.858 1.061 1.321 0.858 1.060 1.319 0.857 1.059 1.318 0.856 1.058 1.316
0.856 1.058 1.315 0.855 1.057 1.314 0.855 1.056 1.313 0.854 1.055 1.311 0.854 1.055 1.310
0.851 1.050 1.303 0.848 1.045 1.296 0.845 1.041 1.289 0.842 1.036 1.282
Table A.4 (continued) Critical Values of the t-Distribution α
v 0.02
1 15.894
2 4.849
3 3.482
4 2.999
5 2.757
6 2.612
7 2.517
8 2.449
9 2.398
10 2.359
11 2.328
12 2.303
13 2.282
14 2.264
15 2.249
16 2.235
17 2.224
18 2.214
19 2.205
20 2.197
21 2.189
22 2.183
23 2.177
24 2.172
25 2.167
26 2.162
27 2.158
28 2.154
29 2.150
30 2.147
40 2.123
60 2.099 120 2.076 ∞ 2.054
0.015 0.01
21.205 31.821 5.643 6.965 3.896 4.541 3.298 3.747 3.003 3.365
2.829 3.143 2.715 2.998 2.634 2.896 2.574 2.821 2.527 2.764
2.491 2.718 2.461 2.681 2.436 2.650 2.415 2.624 2.397 2.602
2.382 2.583 2.368 2.567 2.356 2.552 2.346 2.539 2.336 2.528
2.328 2.518 2.320 2.508 2.313 2.500 2.307 2.492 2.301 2.485
2.296 2.479 2.291 2.473 2.286 2.467 2.282 2.462 2.278 2.457
2.250 2.423 2.223 2.390 2.196 2.358 2.170 2.326
0.0075 0.005 0.0025
42.433 63.656 127.321 8.073 9.925 14.089 5.047 5.841 7.453 4.088 4.604 5.598 3.634 4.032 4.773
3.372 3.707 4.317 3.203 3.499 4.029 3.085 3.355 3.833 2.998 3.250 3.690 2.932 3.169 3.581
2.879 3.106 3.497 2.836 3.055 3.428 2.801 3.012 3.372 2.771 2.977 3.326 2.746 2.947 3.286
2.724 2.921 3.252 2.706 2.898 3.222 2.689 2.878 3.197 2.674 2.861 3.174 2.661 2.845 3.153
2.649 2.831 3.135 2.639 2.819 3.119 2.629 2.807 3.104 2.620 2.797 3.091 2.612 2.787 3.078
2.605 2.779 3.067 2.598 2.771 3.057 2.592 2.763 3.047 2.586 2.756 3.038 2.581 2.750 3.030
2.542 2.704 2.971 2.504 2.660 2.915 2.468 2.617 2.860 2.432 2.576 2.807
0.0005
636.578 31.600 12.924
8.610 6.869
5.959 5.408 5.041 4.781 4.587
4.437 4.318 4.221 4.140 4.073
4.015 3.965 3.922 3.883 3.850
3.819 3.792 3.768 3.745 3.725
3.707 3.689 3.674 3.660 3.646
3.551 3.460 3.373 3.290
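Each critical value tα is the (1 − α) quantile of the t-distribution with v degrees of freedom, so the table can be reproduced numerically; a sketch assuming scipy:

    from scipy.stats import t
    print(t.ppf(1 - 0.025, 10))   # v = 10, alpha = 0.025 -> 2.228
    print(t.ppf(1 - 0.05, 30))    # v = 30, alpha = 0.05  -> 1.697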
Table A.5 Critical Values of the Chi-Squared Distribution
[Figure: chi-squared curve; χ²α cuts off an area of α in the right tail.]
v 0.995 0.99 0.98 0.975 0.95 0.90 0.80 0.75
0.70 0.50
0.148 0.455 0.713 1.386 1.424 2.366 2.195 3.357 3.000 4.351
3.828 5.348 4.671 6.346 5.527 7.344 6.393 8.343 7.267 9.342
8.148 10.341 9.034 11.340 9.926 12.340
10.821 13.339 11.721 14.339
12.624 15.338 13.531 16.338 14.440 17.338 15.352 18.338 16.266 19.337
17.182 20.337 18.101 21.337 19.021 22.337 19.943 23.337 20.867 24.337
21.792 25.336 22.719 26.336 23.647 27.336 24.577 28.336 25.508 29.336
34.872 39.335 44.313 49.335 53.809 59.335
1 0.04 393
2 0.0100
3 0.0717
4 0.207
5 0.412
6 0.676
7 0.989
8 1.344
9 1.735
10 2.156
11 2.603
12 3.074
13 3.565
14 4.075
15 4.601
16 5.142
17 5.697
18 6.265
19 6.844
20 7.434
21 8.034
22 8.643
23 9.260
24 9.886
25 10.520
26 11.160
27 11.808
28 12.461
29 13.121
30 13.787
40 20.707 50 27.991 60 35.534
0.03 157 0.0201 0.115 0.297 0.554
0.872 1.239 1.647 2.088 2.558
3.053 3.571 4.107 4.660 5.229
5.812 6.408 7.015 7.633 8.260
8.897
9.542 10.196 10.856 11.524
12.198 12.878 13.565 14.256 14.953
22.164 29.707 37.485
0.03 628 0.0404 0.185 0.429 0.752
1.134 1.564 2.032 2.532 3.059
3.609 4.178 4.765 5.368 5.985
6.614 7.255 7.906 8.567 9.237
9.915 10.600 11.293 11.992 12.697
13.409 14.125 14.847 15.574 16.306
23.838 31.664 39.699
0.03 982 0.0506 0.216 0.484 0.831
1.237 1.690 2.180 2.700 3.247
3.816 4.404 5.009 5.629 6.262
6.908 7.564 8.231 8.907 9.591
10.283 10.982 11.689 12.401 13.120
13.844 14.573 15.308 16.047 16.791
24.433 32.357 40.482
0.00393 0.0158 0.103 0.211 0.352 0.584 0.711 1.064 1.145 1.610
1.635 2.204 2.167 2.833 2.733 3.490 3.325 4.168 3.940 4.865
4.575 5.578 5.226 6.304 5.892 7.041 6.571 7.790 7.261 8.547
7.962 9.312 8.672 10.085 9.390 10.865
10.117 11.651 10.851 12.443
11.591 13.240 12.338 14.041 13.091 14.848 13.848 15.659 14.611 16.473
15.379 17.292 16.151 18.114 16.928 18.939 17.708 19.768 18.493 20.599
26.509 29.051 34.764 37.689 43.188 46.459
0.0642 0.102 0.446 0.575 1.005 1.213 1.649 1.923 2.343 2.675
3.070 3.455 3.822 4.255 4.594 5.071 5.380 5.899 6.179 6.737
6.989 7.584 7.807 8.438 8.634 9.299 9.467 10.165
10.307 11.037
11.152 11.912 12.002 12.792 12.857 13.675 13.716 14.562 14.578 15.452
15.445 16.344 16.314 17.240 17.187 18.137 18.062 19.037 18.940 19.939
19.820 20.843 20.703 21.749 21.588 22.657 22.475 23.567 23.364 24.478
32.345 33.66 41.449 42.942 50.641 52.294

Table A.5 (continued) Critical Values of the Chi-Squared Distribution
v 0.30 0.25 0.20 0.10
1 1.074 1.323 1.642 2.706
2 2.408 2.773 3.219 4.605
3 3.665 4.108 4.642 6.251
4 4.878 5.385 5.989 7.779
5 6.064 6.626 7.289 9.236
6 7.231 7.841 8.558 10.645
7 8.383 9.037 9.803 12.017
8 9.524 10.219 11.030 13.362
9 10.656 11.389 12.242 14.684
10 11.781 12.549 13.442 15.987
11 12.899 13.701 14.631 17.275
12 14.011 14.845 15.812 18.549
13 15.119 15.984 16.985 19.812
14 16.222 17.117 18.151 21.064
15 17.322 18.245 19.311 22.307
16 18.418 19.369 20.465 23.542
17 19.511 20.489 21.615 24.769
18 20.601 21.605 22.760 25.989
19 21.689 22.718 23.900 27.204
20 22.775 23.828 25.038 28.412
21 23.858 24.935 26.171 29.615
22 24.939 26.039 27.301 30.813
23 26.018 27.141 28.429 32.007
24 27.096 28.241 29.553 33.196
25 28.172 29.339 30.675 34.382
26 29.246 30.435 31.795 35.563
27 30.319 31.528 32.912 36.741
28 31.391 32.620 34.027 37.916
29 32.461 33.711 35.139 39.087
30 33.530 34.800 36.250 40.256
40 44.165 45.616 47.269 51.805 50 54.723 56.334 58.164 63.167 60 65.226 66.981 68.972 74.397
α
0.05 0.025
3.841 5.024 5.991 7.378 7.815 9.348 9.488 11.143
11.070 12.832
12.592 14.449 14.067 16.013 15.507 17.535 16.919 19.023 18.307 20.483
19.675 21.920 21.026 23.337 22.362 24.736 23.685 26.119 24.996 27.488
26.296 28.845 27.587 30.191 28.869 31.526 30.144 32.852 31.410 34.170
32.671 35.479 33.924 36.781 35.172 38.076 36.415 39.364 37.652 40.646
38.885 41.923 40.113 43.195 41.337 44.461 42.557 45.722 43.773 46.979
55.758 59.342 67.505 71.420 79.082 83.298
0.02 0.01
5.412 6.635 7.824 9.210 9.837 11.345
11.668 13.277 13.388 15.086
15.033 16.812 16.622 18.475 18.168 20.090 19.679 21.666 21.161 23.209
22.618 24.725 24.054 26.217 25.471 27.688 26.873 29.141 28.259 30.578
29.633 32.000 30.995 33.409 32.346 34.805 33.687 36.191 35.020 37.566
36.343 38.932 37.659 40.289 38.968 41.638 40.270 42.980 41.566 44.314
42.856 45.642 44.140 46.963 45.419 48.278 46.693 49.588 47.962 50.892
60.436 63.691 72.613 76.154 84.58 88.379
0.005 0.001
7.879 10.827 10.597 13.815 12.838 16.266 14.860 18.466 16.750 20.515
18.548 22.457 20.278 24.321 21.955 26.124 23.589 27.877 25.188 29.588
26.757 31.264 28.300 32.909 29.819 34.527 31.319 36.124 32.801 37.698
34.267 39.252 35.718 40.791 37.156 42.312 38.582 43.819 39.997 45.314
41.401 46.796 42.796 48.268 44.181 49.728 45.558 51.179 46.928 52.619
48.290 54.051 49.645 55.475 50.994 56.892 52.335 58.301 53.672 59.702
66.766 73.403 79.490 86.660 91.952 99.608
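As with the t-table, these entries can be reproduced numerically. A minimal illustrative sketch (an addition to the original table, assuming SciPy):

```python
# Illustrative check of Table A.5: chi-squared critical values are
# the (1 - alpha) quantiles of the chi-squared distribution.
from scipy.stats import chi2

print(f"{chi2.ppf(1 - 0.05, 10):.3f}")   # 18.307 (v = 10, alpha = 0.05)
print(f"{chi2.ppf(1 - 0.99, 5):.3f}")    # 0.554  (v = 5, alpha = 0.99, a lower-tail column)
```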

Table A.6 Critical Values of the F-Distribution
f0.05(v1, v2) v1
(Figure: F density curve; the critical value fα cuts off an upper-tail area of α.)
v2 1 2 3 4 5 6 7 8 9
1 161.45
2 18.51
3 10.13
4 7.71
5 6.61
6 5.99
7 5.59
8 5.32
9 5.12
10 4.96
11 4.84
12 4.75
13 4.67
14 4.60
15 4.54
16 4.49
17 4.45
18 4.41
19 4.38
20 4.35
21 4.32
22 4.30
23 4.28
24 4.26
25 4.24
26 4.23
27 4.21
28 4.20
29 4.18
30 4.17
40 4.08
60 4.00 120 3.92 ∞ 3.84
199.50 215.71 19.00 19.16 9.55 9.28 6.94 6.59 5.79 5.41
5.14 4.76 4.74 4.35 4.46 4.07 4.26 3.86 4.10 3.71
3.98 3.59 3.89 3.49 3.81 3.41 3.74 3.34 3.68 3.29
3.63 3.24 3.59 3.20 3.55 3.16 3.52 3.13 3.49 3.10
3.47 3.07 3.44 3.05 3.42 3.03 3.40 3.01 3.39 2.99
3.37 2.98 3.35 2.96 3.34 2.95 3.33 2.93 3.32 2.92
3.23 2.84 3.15 2.76 3.07 2.68 3.00 2.60
224.58 230.16 233.99 19.25 19.30 19.33 9.12 9.01 8.94 6.39 6.26 6.16 5.19 5.05 4.95
4.53 4.39 4.28 4.12 3.97 3.87 3.84 3.69 3.58 3.63 3.48 3.37 3.48 3.33 3.22
3.36 3.20 3.09 3.26 3.11 3.00 3.18 3.03 2.92 3.11 2.96 2.85 3.06 2.90 2.79
3.01 2.85 2.74 2.96 2.81 2.70 2.93 2.77 2.66 2.90 2.74 2.63 2.87 2.71 2.60
2.84 2.68 2.57 2.82 2.66 2.55 2.80 2.64 2.53 2.78 2.62 2.51 2.76 2.60 2.49
2.74 2.59 2.47 2.73 2.57 2.46 2.71 2.56 2.45 2.70 2.55 2.43 2.69 2.53 2.42
2.61 2.45 2.34 2.53 2.37 2.25 2.45 2.29 2.18 2.37 2.21 2.10
236.77 238.88 240.54 19.35 19.37 19.38 8.89 8.85 8.81 6.09 6.04 6.00 4.88 4.82 4.77
4.21 4.15 4.10 3.79 3.73 3.68 3.50 3.44 3.39 3.29 3.23 3.18 3.14 3.07 3.02
3.01 2.95 2.90 2.91 2.85 2.80 2.83 2.77 2.71 2.76 2.70 2.65 2.71 2.64 2.59
2.66 2.59 2.54 2.61 2.55 2.49 2.58 2.51 2.46 2.54 2.48 2.42 2.51 2.45 2.39
2.49 2.42 2.37 2.46 2.40 2.34 2.44 2.37 2.32 2.42 2.36 2.30 2.40 2.34 2.28
2.39 2.32 2.27 2.37 2.31 2.25 2.36 2.29 2.24 2.35 2.28 2.22 2.33 2.27 2.21
2.25 2.18 2.12 2.17 2.10 2.04 2.09 2.02 1.96 2.01 1.94 1.88
Reproduced from Table 18 of Biometrika Tables for Statisticians, Vol. I, by permission of E. S. Pearson and the Biometrika Trustees.

Table A.6 (continued) Critical Values of the F-Distribution
f0.05(v1, v2) v1
v2 10 12 15 20 24 30 40 60 120 ∞
1 241.88 243.91
2 19.40 19.41
3 8.79 8.74
4 5.96 5.91
5 4.74 4.68
6 4.06 4.00
7 3.64 3.57
8 3.35 3.28
9 3.14 3.07
10 2.98 2.91
11 2.85 2.79
12 2.75 2.69
13 2.67 2.60
14 2.60 2.53
15 2.54 2.48
16 2.49 2.42
17 2.45 2.38
18 2.41 2.34
19 2.38 2.31
20 2.35 2.28
21 2.32 2.25
22 2.30 2.23
23 2.27 2.20
24 2.25 2.18
25 2.24 2.16
26 2.22 2.15
27 2.20 2.13
28 2.19 2.12
29 2.18 2.10
30 2.16 2.09
40 2.08 2.00
60 1.99 1.92 120 1.91 1.83 ∞ 1.83 1.75
245.95 248.01 249.05 19.43 19.45 19.45 8.70 8.66 8.64 5.86 5.80 5.77 4.62 4.56 4.53
3.94 3.87 3.84 3.51 3.44 3.41 3.22 3.15 3.12 3.01 2.94 2.90 2.85 2.77 2.74
2.72 2.65 2.61 2.62 2.54 2.51 2.53 2.46 2.42 2.46 2.39 2.35 2.40 2.33 2.29
2.35 2.28 2.24 2.31 2.23 2.19 2.27 2.19 2.15 2.23 2.16 2.11 2.20 2.12 2.08
2.18 2.10 2.05 2.15 2.07 2.03 2.13 2.05 2.01 2.11 2.03 1.98 2.09 2.01 1.96
2.07 1.99 1.95 2.06 1.97 1.93 2.04 1.96 1.91 2.03 1.94 1.90 2.01 1.93 1.89
1.92 1.84 1.79 1.84 1.75 1.70 1.75 1.66 1.61 1.67 1.57 1.52
250.10 251.14 252.20 19.46 19.47 19.48 8.62 8.59 8.57 5.75 5.72 5.69 4.50 4.46 4.43
3.81 3.77 3.74 3.38 3.34 3.30 3.08 3.04 3.01 2.86 2.83 2.79 2.70 2.66 2.62
2.57 2.53 2.49 2.47 2.43 2.38 2.38 2.34 2.30 2.31 2.27 2.22 2.25 2.20 2.16
2.19 2.15 2.11 2.15 2.10 2.06 2.11 2.06 2.02 2.07 2.03 1.98 2.04 1.99 1.95
2.01 1.96 1.92 1.98 1.94 1.89 1.96 1.91 1.86 1.94 1.89 1.84 1.92 1.87 1.82
1.90 1.85 1.80 1.88 1.84 1.79 1.87 1.82 1.77 1.85 1.81 1.75 1.84 1.79 1.74
1.74 1.69 1.64 1.65 1.59 1.53 1.55 1.50 1.43 1.46 1.39 1.32
253.25 254.31 19.49 19.50 8.55 8.53 5.66 5.63 4.40 4.36
3.70 3.67 3.27 3.23 2.97 2.93 2.75 2.71 2.58 2.54
2.45 2.40 2.34 2.30 2.25 2.21 2.18 2.13 2.11 2.07
2.06 2.01 2.01 1.96 1.97 1.92 1.93 1.88 1.90 1.84
1.87 1.81 1.84 1.78 1.81 1.76 1.79 1.73 1.77 1.71
1.75 1.69 1.73 1.67 1.71 1.65 1.70 1.64 1.68 1.62
1.58 1.51 1.47 1.39 1.35 1.25 1.22 1.00

Table A.6 (continued) Critical Values of the F-Distribution
f0.01(v1, v2) v1
v2 1 2 3 4 5 6 7 8 9
1 4052.18
2 98.50
3 34.12
4 21.20
5 16.26
6 13.75
7 12.25
8 11.26
9 10.56
10 10.04
11 9.65
12 9.33
13 9.07
14 8.86
15 8.68
16 8.53
17 8.40
18 8.29
19 8.18
20 8.10
21 8.02
22 7.95
23 7.88
24 7.82
25 7.77
26 7.72
27 7.68
28 7.64
29 7.60
30 7.56
40 7.31
60 7.08 120 6.85 ∞ 6.63
4999.50 5403.35 99.00 99.17 30.82 29.46 18.00 16.69 13.27 12.06
10.92 9.78 9.55 8.45 8.65 7.59 8.02 6.99 7.56 6.55
7.21 6.22 6.93 5.95 6.70 5.74 6.51 5.56 6.36 5.42
6.23 5.29 6.11 5.18 6.01 5.09 5.93 5.01 5.85 4.94
5.78 4.87 5.72 4.82 5.66 4.76 5.61 4.72 5.57 4.68
5.53 4.64 5.49 4.60 5.45 4.57 5.42 4.54 5.39 4.51
5.18 4.31 4.98 4.13 4.79 3.95 4.61 3.78
5624.58 5763.65 99.25 99.30 28.71 28.24 15.98 15.52 11.39 10.97
9.15 8.75 7.85 7.46 7.01 6.63 6.42 6.06 5.99 5.64
5.67 5.32 5.41 5.06 5.21 4.86 5.04 4.69 4.89 4.56
4.77 4.44 4.67 4.34 4.58 4.25 4.50 4.17 4.43 4.10
4.37 4.04 4.31 3.99 4.26 3.94 4.22 3.90 4.18 3.85
4.14 3.82 4.11 3.78 4.07 3.75 4.04 3.73 4.02 3.70
3.83 3.51 3.65 3.34 3.48 3.17 3.32 3.02
5858.99 5928.36 99.33 99.36 27.91 27.67 15.21 14.98 10.67 10.46
8.47 8.26 7.19 6.99 6.37 6.18 5.80 5.61 5.39 5.20
5.07 4.89 4.82 4.64 4.62 4.44 4.46 4.28 4.32 4.14
4.20 4.03 4.10 3.93 4.01 3.84 3.94 3.77 3.87 3.70
3.81 3.64 3.76 3.59 3.71 3.54 3.67 3.50 3.63 3.46
3.59 3.42 3.56 3.39 3.53 3.36 3.50 3.33 3.47 3.30
3.29 3.12 3.12 2.95 2.96 2.79 2.80 2.64
5981.07 6022.47 99.37 99.39 27.49 27.35 14.80 14.66 10.29 10.16
8.10 7.98 6.84 6.72 6.03 5.91 5.47 5.35 5.06 4.94
4.74 4.63 4.50 4.39 4.30 4.19 4.14 4.03 4.00 3.89
3.89 3.78 3.79 3.68 3.71 3.60 3.63 3.52 3.56 3.46
3.51 3.40 3.45 3.35 3.41 3.30 3.36 3.26 3.32 3.22
3.29 3.18 3.26 3.15 3.23 3.12 3.20 3.09 3.17 3.07
2.99 2.89 2.82 2.72 2.66 2.56 2.51 2.41

Table A.6 (continued) Critical Values of the F-Distribution
f0.01(v1, v2) v1
v2 10 12 15 20 24 30 40 60 120 ∞
1 6055.85
2 99.40
3 27.23
4 14.55
5 10.05
6 7.87
7 6.62
8 5.81
9 5.26
10 4.85
11 4.54
12 4.30
13 4.10
14 3.94
15 3.80
16 3.69
17 3.59
18 3.51
19 3.43
20 3.37
21 3.31
22 3.26
23 3.21
24 3.17
25 3.13
26 3.09
27 3.06
28 3.03
29 3.00
30 2.98
40 2.80
60 2.63 120 2.47 ∞ 2.32
6106.32 6157.28 99.42 99.43 27.05 26.87 14.37 14.20
9.89 9.72
7.72 7.56 6.47 6.31 5.67 5.52 5.11 4.96 4.71 4.56
4.40 4.25 4.16 4.01 3.96 3.82 3.80 3.66 3.67 3.52
3.55 3.41 3.46 3.31 3.37 3.23 3.30 3.15 3.23 3.09
3.17 3.03 3.12 2.98 3.07 2.93 3.03 2.89 2.99 2.85
2.96 2.81 2.93 2.78 2.90 2.75 2.87 2.73 2.84 2.70
2.66 2.52 2.50 2.35 2.34 2.19 2.18 2.04
6208.73 6234.63 99.45 99.46 26.69 26.60 14.02 13.93
9.55 9.47
7.40 7.31 6.16 6.07 5.36 5.28 4.81 4.73 4.41 4.33
4.10 4.02 3.86 3.78 3.66 3.59 3.51 3.43 3.37 3.29
3.26 3.18 3.16 3.08 3.08 3.00 3.00 2.92 2.94 2.86
2.88 2.80 2.83 2.75 2.78 2.70 2.74 2.66 2.70 2.62
2.66 2.58 2.63 2.55 2.60 2.52 2.57 2.49 2.55 2.47
2.37 2.29 2.20 2.12 2.03 1.95 1.88 1.79
6260.65 6286.78 99.47 99.47 26.50 26.41 13.84 13.75
9.38 9.29
7.23 7.14 5.99 5.91 5.20 5.12 4.65 4.57 4.25 4.17
3.94 3.86 3.70 3.62 3.51 3.43 3.35 3.27 3.21 3.13
3.10 3.02 3.00 2.92 2.92 2.84 2.84 2.76 2.78 2.69
2.72 2.64 2.67 2.58 2.62 2.54 2.58 2.49 2.54 2.45
2.50 2.42 2.47 2.38 2.44 2.35 2.41 2.33 2.39 2.30
2.20 2.11 2.03 1.94 1.86 1.76 1.70 1.59
6313.03 6339.39 6365.86 99.48 99.49 99.50 26.32 26.22 26.13 13.65 13.56 13.46
9.20 9.11 9.02
7.06 6.97 6.88 5.82 5.74 5.65 5.03 4.95 4.86 4.48 4.40 4.31 4.08 4.00 3.91
3.78 3.69 3.60 3.54 3.45 3.36 3.34 3.25 3.17 3.18 3.09 3.00 3.05 2.96 2.87
2.93 2.84 2.75 2.83 2.75 2.65 2.75 2.66 2.57 2.67 2.58 2.49 2.61 2.52 2.42
2.55 2.46 2.36 2.50 2.40 2.31 2.45 2.35 2.26 2.40 2.31 2.21 2.36 2.27 2.17
2.33 2.23 2.13 2.29 2.20 2.10 2.26 2.17 2.06 2.23 2.14 2.03 2.21 2.11 2.01
2.02 1.92 1.80 1.84 1.73 1.60 1.66 1.53 1.38 1.47 1.32 1.00
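Entries on both the f0.05 and f0.01 pages are the corresponding (1 − α) quantiles of the F-distribution. A short illustrative sketch (an addition to the original table, assuming SciPy):

```python
# Illustrative check of Table A.6: f_alpha(v1, v2) is the (1 - alpha)
# quantile of the F-distribution with (v1, v2) degrees of freedom.
from scipy.stats import f

print(f"{f.ppf(1 - 0.05, 6, 10):.2f}")   # 3.22, cf. the f_0.05 page
print(f"{f.ppf(1 - 0.01, 6, 10):.2f}")   # 5.39, cf. the f_0.01 page
```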

Table A.7 Tolerance Factors for Normal Distributions
Two-Sided Intervals One-Sided Intervals
γ =0.05 γ =0.01 γ =0.05 γ =0.01
1−α 1−α 1−α 1−α
n     0.90    0.95    0.99    0.90     0.95     0.99     0.90    0.95    0.99    0.90     0.95     0.99
2   32.019  37.674  48.430  160.193  188.491  242.300  20.581  26.260  37.094  103.029  131.426  185.617
3    8.380   9.916  12.861   18.930   22.401   29.055   6.156   7.656  10.553   13.995   17.170   23.896
4    5.369   6.370   8.299    9.398   11.150   14.527   4.162   5.144   7.042    7.380    9.083   12.387
5    4.275   5.079   6.634    6.612    7.855   10.260   3.407   4.203   5.741    5.362    6.578    8.939
6    3.712   4.414   5.775    5.337    6.345    8.301   3.006   3.708   5.062    4.411    5.406    7.335
7    3.369   4.007   5.248    4.613    5.488    7.187   2.756   3.400   4.642    3.859    4.728    6.412
8    3.136   3.732   4.891    4.147    4.936    6.468   2.582   3.187   4.354    3.497    4.285    5.812
9    2.967   3.532   4.631    3.822    4.550    5.966   2.454   3.031   4.143    3.241    3.972    5.389
10   2.839   3.379   4.433    3.582    4.265    5.594   2.355   2.911   3.981    3.048    3.738    5.074
11   2.737   3.259   4.277    3.397    4.045    5.308   2.275   2.815   3.852    2.898    3.556    4.829
12   2.655   3.162   4.150    3.250    3.870    5.079   2.210   2.736   3.747    2.777    3.410    4.633
13   2.587   3.081   4.044    3.130    3.727    4.893   2.155   2.671   3.659    2.677    3.290    4.472
14   2.529   3.012   3.955    3.029    3.608    4.737   2.109   2.615   3.585    2.593    3.189    4.337
15   2.480   2.954   3.878    2.945    3.507    4.605   2.068   2.566   3.520    2.522    3.102    4.222
16   2.437   2.903   3.812    2.872    3.421    4.492   2.033   2.524   3.464    2.460    3.028    4.123
17   2.400   2.858   3.754    2.808    3.345    4.393   2.002   2.486   3.414    2.405    2.963    4.037
18   2.366   2.819   3.702    2.753    3.279    4.307   1.974   2.453   3.370    2.357    2.905    3.960
19   2.337   2.784   3.656    2.703    3.221    4.230   1.949   2.423   3.331    2.314    2.854    3.892
20   2.310   2.752   3.615    2.659    3.168    4.161   1.926   2.396   3.295    2.276    2.808    3.832
25   2.208   2.631   3.457    2.494    2.972    3.904   1.838   2.292   3.158    2.129    2.633    3.601
30   2.140   2.549   3.350    2.385    2.841    3.733   1.777   2.220   3.064    2.030    2.516    3.447
35   2.090   2.490   3.272    2.306    2.748    3.611   1.732   2.167   2.995    1.957    2.430    3.334
40   2.052   2.445   3.213    2.247    2.677    3.518   1.697   2.126   2.941    1.902    2.364    3.249
45   2.021   2.408   3.165    2.200    2.621    3.444   1.669   2.092   2.898    1.857    2.312    3.180
50   1.996   2.379   3.126    2.162    2.576    3.385   1.646   2.065   2.863    1.821    2.269    3.125
60   1.958   2.333   3.066    2.103    2.506    3.293   1.609   2.022   2.807    1.764    2.202    3.038
70   1.929   2.299   3.021    2.060    2.454    3.225   1.581   1.990   2.765    1.722    2.153    2.974
80   1.907   2.272   2.986    2.026    2.414    3.173   1.559   1.965   2.733    1.688    2.114    2.924
90   1.889   2.251   2.958    1.999    2.382    3.130   1.542   1.944   2.706    1.661    2.082    2.883
100  1.874   2.233   2.934    1.977    2.355    3.096   1.527   1.927   2.684    1.639    2.056    2.850
150  1.825   2.175   2.859    1.905    2.270    2.983   1.478   1.870   2.611    1.566    1.971    2.741
200  1.798   2.143   2.816    1.865    2.222    2.921   1.450   1.837   2.570    1.524    1.923    2.679
250  1.780   2.121   2.788    1.839    2.191    2.880   1.431   1.815   2.542    1.496    1.891    2.638
300  1.767   2.106   2.767    1.820    2.169    2.850   1.417   1.800   2.522    1.476    1.868    2.608
∞    1.645   1.960   2.576    1.645    1.960    2.576   1.282   1.645   2.326    1.282    1.645    2.326
Adapted from C. Eisenhart, M. W. Hastay, and W. A. Wallis, Techniques of Statistical Analysis, Chapter 2, McGraw-Hill Book Company, New York, 1947. Used with permission of McGraw-Hill Book Company.
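The one-sided factors can also be computed exactly. The sketch below is an illustrative addition, assuming the standard noncentral-t characterization of one-sided normal tolerance factors, k = t′(1−γ; n−1, δ)/√n with noncentrality δ = z(1−α)·√n; the two-sided factors require a different computation and are not covered here.

```python
# Illustrative sketch: exact one-sided normal tolerance factors via the
# noncentral t-distribution (an assumption stated above, not from the text).
from math import sqrt
from scipy.stats import nct, norm

def one_sided_k(n, coverage, gamma):
    delta = norm.ppf(coverage) * sqrt(n)          # noncentrality parameter
    return nct.ppf(1 - gamma, n - 1, delta) / sqrt(n)

print(round(one_sided_k(10, 0.95, 0.05), 3))      # cf. the tabled 2.911
```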

Table A.8 Sample Size for the t-Test of the Mean
Single-Sided Test Double-Sided Test β = 0.1
0.05 0.10 0.15 0.20 0.25
0.30 0.35 0.40 0.45 0.50
0.55 0.60 0.65 0.70 0.75
α = 0.005 α = 0.01
α = 0.01 α = 0.02
α = 0.025 α = 0.05 α = 0.05 α = 0.1
Value of
Δ = |δ|/σ 1.00
0.80 0.85 0.90 0.95
312420171127211814 9231714117191411 282219161025191613 9211613106181311
9 5 8 5
9 7 8 6 7 6 5
6
6 6 6 5 7 6 7 5
6
6 6 5
1.1 1.2 1.3 1.4 1.5
1.6 1.7 1.8 1.9 2.0
2.1 2.2 2.3 2.4 2.5
3.0 3.5 4.0
24 191614 21 161412 18 151311 16 131210 15 12119
13 11108 12 1098 12 1098 11 987 10 887
9 21 8 18 8 16 7 14 7 13
6 12 6
6
6
5
16 14 12 8 14 12 10 7 13 11 9 6 11 10 9 6 109 8 6
109 7 5 119 8 7 108 7 7 108 7 6
97 7 6
87 6 6 8765 866 766 766
655 5
18 15
12 11
13 11 9 6 12 10 8 5 14 10 9 7
15 13 11
108 9
11 10 8 7 7
8 8
100
77 62 37 110
Level of t-Test
.01.05
.1 .2 .5.01.05
110
.1 .2
115 109 85 85 66 68 53 55 43
.5.01.05 .1
139 90
63 119 47 109 88 37 117 84 68 309367 54 257654 44
.2 .5.01.05
99 128 64
90 45 122 67 34 90 51 26 101 70 412180 55 341865 45
.1
.2 .5
115 92 75
134 78 125 99 58 97 77 45
122 70 139 101 45
97 71 32 72 52 24 55 40 19 44 33 15 36 27 13
101 81 63 51 30 90 66
83
71534536226347393118533832241346322619 9 61463931205541342716463327211239282217 8 53403428174735302414402924191034241915 8 4736302516423127211335262116930211713 7
4132272214372824191231221915927191512 6 3729242013332521171128211713824171411 6 3426221812292319161025191612721151310 5
63
53 42 26 75 55
46 36
216345 37
281554 38
30 22 11
1087 7 9876 9776 8776 8766
7665 655 6
9 8
10 9
8 7 7 6
8 7 6 7 6 5 8 7 6 8 6 6 7 6 5
7 6 76 65
6 6
5
Reproduced with permission from O. L. Davies, ed., Design and Analysis of Industrial Experiments, Oliver & Boyd, Edinburgh, 1956.

Table A.9 Sample Size for the t-Test of the Difference between Two Means
Level of t-Test
α = 0.005 α = 0.01 α = 0.025
α = 0.01 α = 0.02 α = 0.05
.01.05 .1 .2 .5.01.05 .1 .2 .5.01.05 .1 .2 .5.01.05 .1 .2 .5
Single-Sided Test Double-Sided Test β=0.1
0.05 0.10 0.15 0.20 0.25
0.30 0.35 0.40 0.45 0.50
0.55 0.60 0.65 0.70 0.75
1.1 4232272213382823191132231914 1.2 362723181132242016 927201612 1.3 312320161028211714 823171411 1.4 27201714 924181512 820151210 1.5 24181513 821161411 7181311 9
1.6 21161411 719141210 6161210 8 1.7 19151310 7171311 9 61411 9 7 1.8 17137110 6151210 8 51310 8 6 1.9 1612119614119851297 6 2.0 1411108613109751187 6
2.1 13109851298751086 5 2.2 1210875119764976 5 2.3 119875108764976 5 2.4 1198651087648654 2.5 10876497654865 4
3.0 86654765436544 3.5 6554365445443 4.0 6544 5443443
α = 0.05 α = 0.1
0.80 0.85 0.90 0.95
352821 10
100 88
101 101 85 87 73 75 63 66 55
110 85 118 68 96 55
79 46
67 39
57 34 104 50 29 90 44 26 79
101 106 82
106 88 68 907458 776449 665543 584838
123 90 70 55 45
38
32 104 27 88 24 76 21 67
100 105 79 106 86 64
877153 746045 635139 554434 483929
124
87 64 50 39 32
27 112 23 89 20 76 17 66 15 57
137 88
61 102 45 108 78 35 108 86 62 28 887051 23
735842 19 614936 16 524230 14 453626 12 403223 11
77
69514335216246383017523731231245312518 9 62463931195541342715473427211140282216 8 55423528175037312414423025191036252015 7
58 49
Value of
Δ = |δ|/σ 1.00 5038322615453328221338272317
39 23 70
514333
19 59
423426
14 50
933231814 7
827191512 6 723161310 5 6201411 9 5 6171210 8 4 515119 7 4
514108 6 4 41297 6 3 41187 5 41076 5
4 3
976 4
865 4 865 4 755 4
7 544 654 3
543 43
4
Reproduced with permission from O. L. Davies, ed., Design and Analysis of Industrial Experiments, Oliver & Boyd, Edinburgh, 1956.
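Sample sizes of the kind tabled in Tables A.8 and A.9 can be approximated with a power calculation. The following illustrative sketch (an addition to the original tables, assuming the statsmodels package) solves for n given the standardized effect Δ = |δ|/σ, the level α, and the power 1 − β:

```python
# Illustrative sketch: approximate the tabled sample sizes by solving a
# t-test power equation (one-sample and two-sample versions).
from statsmodels.stats.power import TTestPower, TTestIndPower

n1 = TTestPower().solve_power(effect_size=1.0, alpha=0.05,
                              power=0.9, alternative='two-sided')
n2 = TTestIndPower().solve_power(effect_size=1.0, alpha=0.05,
                                 power=0.9, alternative='two-sided')
print(n1, n2)  # round up and compare with the Delta = 1.0, beta = 0.1 entries
```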

Table A.10 Critical Values for Bartlett’s Test
bk(0.01; n)
Number of Populations, k
3 4 5 6 7 8
n 2
3 0.1411
4 0.2843
5 0.3984
6 0.4850
7 0.5512
8 0.6031
9 0.6445
10 0.6783
11 0.7063
12 0.7299
13 0.7501
14 0.7674
15 0.7825
16 0.7958
17 0.8076
18 0.8181
19 0.8275
20 0.8360
21 0.8437
22 0.8507
23 0.8571
24 0.8630
25 0.8684
26 0.8734
27 0.8781
28 0.8824
29 0.8864
30 0.8902
40 0.9175 50 0.9339 60 0.9449 80 0.9586
100 0.9669
0.1672
0.3165 0.3475 0.3729 0.3937 0.4110 0.4304 0.4607 0.4850 0.5046 0.5207
0.5149 0.5430 0.5653 0.5832 0.5978 0.5787 0.6045 0.6248 0.6410 0.6542 0.6282 0.6518 0.6704 0.6851 0.6970 0.6676 0.6892 0.7062 0.7197 0.7305 0.6996 0.7195 0.7352 0.7475 0.7575
0.7260 0.7445 0.7590 0.7703 0.7795 0.7483 0.7654 0.7789 0.7894 0.7980 0.7672 0.7832 0.7958 0.8056 0.8135 0.7835 0.7985 0.8103 0.8195 0.8269 0.7977 0.8118 0.8229 0.8315 0.8385
0.8101 0.8235 0.8339 0.8421 0.8486 0.8211 0.8338 0.8436 0.8514 0.8576 0.8309 0.8429 0.8523 0.8596 0.8655 0.8397 0.8512 0.8601 0.8670 0.8727 0.8476 0.8586 0.8671 0.8737 0.8791
0.8548 0.8653 0.8734 0.8797 0.8848 0.8614 0.8714 0.8791 0.8852 0.8901 0.8673 0.8769 0.8844 0.8902 0.8949 0.8728 0.8820 0.8892 0.8948 0.8993 0.8779 0.8867 0.8936 0.8990 0.9034
0.8825 0.8911 0.8977 0.9029 0.9071 0.8869 0.8951 0.9015 0.9065 0.9105 0.8909 0.8988 0.9050 0.9099 0.9138 0.8946 0.9023 0.9083 0.9130 0.9167 0.8981 0.9056 0.9114 0.9159 0.9195
0.9235 0.9291 0.9335 0.9370 0.9397 0.9387 0.9433 0.9468 0.9496 0.9518 0.9489 0.9527 0.9557 0.9580 0.9599 0.9617 0.9646 0.9668 0.9685 0.9699 0.9693 0.9716 0.9734 0.9748 0.9759
0.5343
0.6100 0.6652 0.7069 0.7395 0.7657
0.7871 0.8050 0.8201 0.8330 0.8443
0.8541 0.8627 0.8704 0.8773 0.8835
0.8890 0.8941 0.8988 0.9030 0.9069
0.9105 0.9138 0.9169 0.9198 0.9225
0.9420 0.9536 0.9614 0.9711 0.9769
9 10
0.5458 0.5558
0.6204 0.6293 0.6744 0.6824 0.7153 0.7225 0.7471 0.7536 0.7726 0.7786
0.7935 0.7990 0.8109 0.8160 0.8256 0.8303 0.8382 0.8426 0.8491 0.8532
0.8586 0.8625 0.8670 0.8707 0.8745 0.8780 0.8811 0.8845 0.8871 0.8903
0.8926 0.8956 0.8975 0.9004 0.9020 0.9047 0.9061 0.9087 0.9099 0.9124
0.9134 0.9158 0.9166 0.9190 0.9196 0.9219 0.9224 0.9246 0.9250 0.9271
0.9439 0.9455 0.9551 0.9564 0.9626 0.9637 0.9720 0.9728 0.9776 0.9783
Reproduced from D. D. Dyer and J. P. Keating, “On the Determination of Critical Values for Bartlett’s Test,” J. Am. Stat. Assoc., 75, 1980, by permission of the Board of Directors.

Table A.10 (continued) Critical Values for Bartlett’s Test
n 2 3
3 0.3123 0.3058
4 0.4780 0.4699
5 0.5845 0.5762
6 0.6563 0.6483
7 0.7075 0.7000
8 0.7456 0.7387
9 0.7751 0.7686
10 0.7984 0.7924
11 0.8175 0.8118
12 0.8332 0.8280
13 0.8465 0.8415
14 0.8578 0.8532
15 0.8676 0.8632
16 0.8761 0.8719
17 0.8836 0.8796
18 0.8902 0.8865
19 0.8961 0.8926
20 0.9015 0.8980
21 0.9063 0.9030
22 0.9106 0.9075
23 0.9146 0.9116
24 0.9182 0.9153
25 0.9216 0.9187
26 0.9246 0.9219
27 0.9275 0.9249
28 0.9301 0.9276
29 0.9326 0.9301
30 0.9348 0.9325
40 0.9513 0.9495 50 0.9612 0.9597 60 0.9677 0.9665 80 0.9758 0.9749
100 0.9807 0.9799
bk(0.05; n)
Number of Populations, k
4 5 6 7 8
0.3173 0.3299
0.4803 0.4921 0.5028 0.5122 0.5204 0.5850 0.5952 0.6045 0.6126 0.6197
0.6559 0.6646 0.6727 0.6798 0.6860 0.7065 0.7142 0.7213 0.7275 0.7329 0.7444 0.7512 0.7574 0.7629 0.7677 0.7737 0.7798 0.7854 0.7903 0.7946 0.7970 0.8025 0.8076 0.8121 0.8160
0.8160 0.8210 0.8257 0.8298 0.8333 0.8317 0.8364 0.8407 0.8444 0.8477 0.8450 0.8493 0.8533 0.8568 0.8598 0.8564 0.8604 0.8641 0.8673 0.8701 0.8662 0.8699 0.8734 0.8764 0.8790
0.8747 0.8782 0.8815 0.8843 0.8868 0.8823 0.8856 0.8886 0.8913 0.8936 0.8890 0.8921 0.8949 0.8975 0.8997 0.8949 0.8979 0.9006 0.9030 0.9051 0.9003 0.9031 0.9057 0.9080 0.9100
0.9051 0.9078 0.9103 0.9124 0.9143 0.9095 0.9120 0.9144 0.9165 0.9183 0.9135 0.9159 0.9182 0.9202 0.9219 0.9172 0.9195 0.9217 0.9236 0.9253 0.9205 0.9228 0.9249 0.9267 0.9283
0.9236 0.9258 0.9278 0.9296 0.9311 0.9265 0.9286 0.9305 0.9322 0.9337 0.9292 0.9312 0.9330 0.9347 0.9361 0.9316 0.9336 0.9354 0.9370 0.9383 0.9340 0.9358 0.9376 0.9391 0.9404
0.9506 0.9520 0.9533 0.9545 0.9555 0.9606 0.9617 0.9628 0.9637 0.9645 0.9672 0.9681 0.9690 0.9698 0.9705 0.9754 0.9761 0.9768 0.9774 0.9779 0.9804 0.9809 0.9815 0.9819 0.9823
9 10
0.5277 0.5341 0.6260 0.6315
0.6914 0.6961 0.7376 0.7418 0.7719 0.7757 0.7984 0.8017 0.8194 0.8224
0.8365 0.8392 0.8506 0.8531 0.8625 0.8648 0.8726 0.8748 0.8814 0.8834
0.8890 0.8909 0.8957 0.8975 0.9016 0.9033 0.9069 0.9086 0.9117 0.9132
0.9160 0.9175 0.9199 0.9213 0.9235 0.9248 0.9267 0.9280 0.9297 0.9309
0.9325 0.9336 0.9350 0.9361 0.9374 0.9385 0.9396 0.9406 0.9416 0.9426
0.9564 0.9572 0.9652 0.9658 0.9710 0.9716 0.9783 0.9787 0.9827 0.9830

Table A.11 Critical Values for Cochran’s Test
α = 0.01 n
k 2 3 4 5 6 7 8 9 10 11 17 37 145 ∞
2 0.9999 0.9950 0.9794 0.9586 0.9373 0.9172 0.8988 0.8823 0.8674 0.8539 0.7949 0.7067 0.6062 0.5000 3 0.9933 0.9423 0.8831 0.8335 0.7933 0.7606 0.7335 0.7107 0.6912 0.6743 0.6059 0.5153 0.4230 0.3333 4 0.9676 0.8643 0.7814 0.7212 0.6761 0.6410 0.6129 0.5897 0.5702 0.5536 0.4884 0.4057 0.3251 0.2500
5 0.9279 0.7885 0.6957 0.6329 0.5875 0.5531 0.5259 0.5037 0.4854 0.4697 0.4094 0.3351 0.2644 0.2000 6 0.8828 0.7218 0.6258 0.5635 0.5195 0.4866 0.4608 0.4401 0.4229 0.4084 0.3529 0.2858 0.2229 0.1667 7 0.8376 0.6644 0.5685 0.5080 0.4659 0.4347 0.4105 0.3911 0.3751 0.3616 0.3105 0.2494 0.1929 0.1429
8 0.7945 0.6152 0.5209 0.4627 0.4226 0.3932 0.3704 0.3522 0.3373 0.3248 0.2779 0.2214 0.1700 0.1250
9 0.7544 0.5727 0.4810 0.4251 0.3870 0.3592 0.3378 0.3207 0.3067 0.2950 0.2514 0.1992 0.1521 0.1111 10 0.7175 0.5358 0.4469 0.3934 0.3572 0.3308 0.3106 0.2945 0.2813 0.2704 0.2297 0.1811 0.1376 0.1000
12 0.6528 0.4751 0.3919 0.3428 0.3099 0.2861 0.2680 0.2535 0.2419 0.2320 0.1961 0.1535 0.1157 0.0833 15 0.5747 0.4069 0.3317 0.2882 0.2593 0.2386 0.2228 0.2104 0.2002 0.1918 0.1612 0.1251 0.0934 0.0667 20 0.4799 0.3297 0.2654 0.2288 0.2048 0.1877 0.1748 0.1646 0.1567 0.1501 0.1248 0.0960 0.0709 0.0500
24 0.4247 0.2871 0.2295 0.1970 0.1759 0.1608 0.1495 0.1406 0.1338 0.1283 0.1060 0.0810 0.0595 0.0417 30 0.3632 0.2412 0.1913 0.1635 0.1454 0.1327 0.1232 0.1157 0.1100 0.1054 0.0867 0.0658 0.0480 0.0333 40 0.2940 0.1915 0.1508 0.1281 0.1135 0.1033 0.0957 0.0898 0.0853 0.0816 0.0668 0.0503 0.0363 0.0250
60 0.2151 0.1371 0.1069 0.0902 0.0796 0.0722 0.0668 0.0625 0.0594 0.0567 0.0461 0.0344 0.0245 0.0167 120 0.1225 0.0759 0.0585 0.0489 0.0429 0.0387 0.0357 0.0334 0.0316 0.0302 0.0242 0.0178 0.0125 0.0083
∞00000000000000
Reproduced from C. Eisenhart, M. W. Hastay, and W. A. Wallis, Techniques of Statistical Analysis, Chapter 15, McGraw-Hill Book Company, New York, 1947. Used with permission of McGraw-Hill Book Company.

Table A.11 (continued) Critical Values for Cochran’s Test
α = 0.05
n
k 2 3 4 5 6 7 8 9 10 11 17 37 145 ∞
2 0.9985 0.9750 0.9392 0.9057 0.8772 0.8534 0.8332 0.8159 0.8010 0.7880 0.7341 0.6602 0.5813 0.5000 3 0.9669 0.8709 0.7977 0.7457 0.7071 0.6771 0.6530 0.6333 0.6167 0.6025 0.5466 0.4748 0.4031 0.3333 4 0.9065 0.7679 0.6841 0.6287 0.5895 0.5598 0.5365 0.5175 0.5017 0.4884 0.4366 0.3720 0.3093 0.2500
5 0.8412 0.6838 0.5981 0.5441 0.5065 0.4783 0.4564 0.4387 0.4241 0.4118 0.3645 0.3066 0.2513 0.2000 6 0.7808 0.6161 0.5321 0.4803 0.4447 0.4184 0.3980 0.3817 0.3682 0.3568 0.3135 0.2612 0.2119 0.1667 7 0.7271 0.5612 0.4800 0.4307 0.3974 0.3726 0.3535 0.3384 0.3259 0.3154 0.2756 0.2278 0.1833 0.1429
8 0.6798 0.5157 0.4377 0.3910 0.3595 0.3362 0.3185 0.3043 0.2926 0.2829 0.2462 0.2022 0.1616 0.1250
9 0.6385 0.4775 0.4027 0.3584 0.3286 0.3067 0.2901 0.2768 0.2659 0.2568 0.2226 0.1820 0.1446 0.1111 10 0.6020 0.4450 0.3733 0.3311 0.3029 0.2823 0.2666 0.2541 0.2439 0.2353 0.2032 0.1655 0.1308 0.1000
12 0.5410 0.3924 0.3264 0.2880 0.2624 0.2439 0.2299 0.2187 0.2098 0.2020 0.1737 0.1403 0.1100 0.0833 15 0.4709 0.3346 0.2758 0.2419 0.2195 0.2034 0.1911 0.1815 0.1736 0.1671 0.1429 0.1144 0.0889 0.0667 20 0.3894 0.2705 0.2205 0.1921 0.1735 0.1602 0.1501 0.1422 0.1357 0.1303 0.1108 0.0879 0.0675 0.0500
24 0.3434 0.2354 0.1907 0.1656 0.1493 0.1374 0.1286 0.1216 0.1160 0.1113 0.0942 0.0743 0.0567 0.0417 30 0.2929 0.1980 0.1593 0.1377 0.1237 0.1137 0.1061 0.1002 0.0958 0.0921 0.0771 0.0604 0.0457 0.0333 40 0.2370 0.1576 0.1259 0.1082 0.0968 0.0887 0.0827 0.0780 0.0745 0.0713 0.0595 0.0462 0.0347 0.0250
60 0.1737 0.1131 0.0895 0.0765 0.0682 0.0623 0.0583 0.0552 0.0520 0.0497 0.0411 0.0316 0.0234 0.0167 120 0.0998 0.0632 0.0495 0.0419 0.0371 0.0337 0.0312 0.0292 0.0279 0.0266 0.0218 0.0165 0.0120 0.0083
∞00000000000000

Table A.12 Upper Percentage Points of the Studentized Range Distribution: Values of q(0.05; k, v)
Degrees of Number of Treatments k
Freedom,v 2 3 4 5 6 7 8 9 10
1 18.0
2 6.09
3 4.50
4 3.93
5 3.64
6 3.46
7 3.34
8 3.26
9 3.20
10 3.15
11 3.11
12 3.08
13 3.06
14 3.03
15 3.01
16 3.00
17 2.98
18 2.97
19 2.96
20 2.95
24 2.92 30 2.89 40 2.86 60 2.83
120 2.80 ∞ 2.77
27.0 32.8 5.33 9.80 5.91 6.83 5.04 5.76 4.60 5.22
4.34 4.90 4.16 4.68 4.04 4.53 3.95 4.42 3.88 4.33
3.82 4.26 3.77 4.20 3.73 4.15 3.70 4.11 3.67 4.08
3.65 4.05 3.62 4.02 3.61 4.00 3.59 3.98 3.58 3.96
3.53 3.90 3.48 3.84 3.44 3.79 3.40 3.74 3.36 3.69 3.32 3.63
37.2 40.5 10.89 11.73 7.51 8.04 6.29 6.71 5.67 6.03
5.31 5.63 5.06 5.35 4.89 5.17 4.76 5.02 4.66 4.91
4.58 4.82 4.51 4.75 4.46 4.69 4.41 4.65 4.37 4.59
4.34 4.56 4.31 4.52 4.28 4.49 4.26 4.47 4.24 4.45
4.17 4.37 4.11 4.30 4.04 4.23 3.98 4.16 3.92 4.10 3.86 4.03
43.1 45.4 47.4 49.1 12.43 13.03 13.54 13.99 8.47 8.85 9.18 9.46 7.06 7.35 7.60 7.83 6.33 6.58 6.80 6.99
5.89 6.12 6.32 6.49 5.59 5.80 5.99 6.15 5.40 5.60 5.77 5.92 5.24 5.43 5.60 5.74 5.12 5.30 5.46 5.60
5.03 5.20 5.35 5.49 4.95 5.12 5.27 5.40 4.88 5.05 5.19 5.32 4.83 4.99 5.13 5.25 4.78 4.94 5.08 5.20
4.74 4.90 5.03 5.05 4.70 4.86 4.99 5.11 4.67 4.83 4.96 5.07 4.64 4.79 4.92 5.04 4.62 4.77 4.90 5.01
4.54 4.68 4.81 4.92 4.46 4.60 4.72 4.83 4.39 4.52 4.63 4.74 4.31 4.44 4.55 4.65 4.24 4.36 4.47 4.56 4.17 4.29 4.39 4.47
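Recent versions of SciPy (1.7 and later) expose the studentized range distribution directly, so the tabled q(0.05; k, v) values can be recomputed. A minimal illustrative sketch (an addition to the original table):

```python
# Illustrative check of Table A.12 via the studentized range distribution.
from scipy.stats import studentized_range

print(round(studentized_range.ppf(0.95, 3, 10), 2))   # 3.88 for k = 3, v = 10
```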

Table A.13 Least Significant Studentized Ranges rp(0.05; p, v)
α = 0.05 p
v 2 3 4 5 6 7 8 9 10
1 17.97
2 6.085
3 4.501
4 3.927
5 3.635
6 3.461
7 3.344
8 3.261
9 3.199
10 3.151
11 3.113
12 3.082
13 3.055
14 3.033
15 3.014
16 2.998
17 2.984
18 2.971
19 2.960
20 2.950
24 2.919 30 2.888 40 2.858 60 2.829
120 2.800 ∞ 2.772
17.97 17.97 17.97 17.97 6.085 6.085 6.085 6.085 4.516 4.516 4.516 4.516 4.013 4.033 4.033 4.033 3.749 3.797 3.814 3.814
3.587 3.649 3.68 3.694 3.477 3.548 3.588 3.611 3.399 3.475 3.521 3.549 3.339 3.420 3.470 3.502 3.293 3.376 3.430 3.465
3.256 3.342 3.397 3.435 3.225 3.313 3.370 3.410 3.200 3.289 3.348 3.389 3.178 3.268 3.329 3.372 3.160 3.25 3.312 3.356
3.144 3.235 3.298 3.343 3.130 3.222 3.285 3.331 3.118 3.210 3.274 3.321 3.107 3.199 3.264 3.311 3.097 3.190 3.255 3.303
3.066 3.160 3.226 3.276 3.035 3.131 3.199 3.250 3.006 3.102 3.171 3.224 2.976 3.073 3.143 3.198 2.947 3.045 3.116 3.172 2.918 3.017 3.089 3.146
17.97 17.97 6.085 6.085 4.516 4.516 4.033 4.033 3.814 3.814
3.697 3.697 3.622 3.626 3.566 3.575 3.523 3.536 3.489 3.505
3.462 3.48 3.439 3.459 3.419 3.442 3.403 3.426 3.389 3.413
3.376 3.402 3.366 3.392 3.356 3.383 3.347 3.375 3.339 3.368
3.315 3.345 3.290 3.322 3.266 3.300 3.241 3.277 3.217 3.254 3.193 3.232
17.97 17.97 6.085 6.085 4.516 4.516 4.033 4.033 3.814 3.814
3.697 3.697 3.626 3.626 3.579 3.579 3.544 3.547 3.516 3.522
3.493 3.501 3.474 3.484 3.458 3.470 3.444 3.457 3.432 3.446
3.422 3.437 3.412 3.429 3.405 3.421 3.397 3.415 3.391 3.409
3.370 3.390 3.349 3.371 3.328 3.352 3.307 3.333 3.287 3.314 3.265 3.294
Abridged from H. L. Harter, “Critical Values for Duncan’s New Multiple Range Test,” Biometrics, 16, No. 4, 1960, by permission of the author and the editor.

Table A.13 (continued) Least Significant Studentized Ranges rp(0.01; p, v) α = 0.01
p
v 2 3 4 5 6 7 8 9 10
1 90.03 90.03
2 14.04 14.04
3 8.261 8.321
4 6.512 6.677
5 5.702 5.893
6 5.243 5.439
7 4.949 5.145
8 4.746 4.939
9 4.596 4.787
10 4.482 4.671
11 4.392 4.579
12 4.320 4.504
13 4.260 4.442
14 4.210 4.391
15 4.168 4.347
16 4.131 4.309
17 4.099 4.275
18 4.071 4.246
19 4.046 4.220
20 4.024 4.197
24 3.956 4.126 30 3.889 4.056 40 3.825 3.988 60 3.762 3.922
120 3.702 3.858 ∞ 3.643 3.796
90.03 90.03 14.04 14.04
8.321 8.321 6.740 6.756 5.989 6.040
5.549 5.614 5.260 5.334 5.057 5.135 4.906 4.986 4.790 4.871
4.697 4.780 4.622 4.706 4.560 4.644 4.508 4.591 4.463 4.547
4.425 4.509 4.391 4.475 4.362 4.445 4.335 4.419 4.312 4.395
4.239 4.322 4.168 4.250 4.098 4.180 4.031 4.111 3.965 4.044 3.900 3.978
90.03 90.03 14.04 14.04
8.321 8.321 6.756 6.756 6.065 6.074
5.655 5.680 5.383 5.416 5.189 5.227 5.043 5.086 4.931 4.975
4.841 4.887 4.767 4.815 4.706 4.755 4.654 4.704 4.610 4.660
4.572 4.622 4.539 4.589 4.509 4.560 4.483 4.534 4.459 4.510
4.386 4.437 4.314 4.366 4.244 4.296 4.174 4.226 4.107 4.158 4.040 4.091
90.03 90.03 14.04 14.04
8.321 8.321 6.756 6.756 6.074 6.074
5.694 5.701 5.439 5.454 5.256 5.276 5.118 5.142 5.010 5.037
4.924 4.952 4.852 4.883 4.793 4.824 4.743 4.775 4.700 4.733
4.663 4.696 4.630 4.664 4.601 4.635 4.575 4.610 4.552 4.587
4.480 4.516 4.409 4.445 4.339 4.376 4.270 4.307 4.202 4.239 4.135 4.172
90.03 14.04
8.321 6.756 6.074
5.703 5.464 5.291 5.160 5.058
4.975 4.907 4.850 4.802 4.760
4.724 4.693 4.664 4.639 4.617
4.546 4.477 4.408 4.340 4.272 4.205

Table A.14 Values of dα/2(k, v) for Two-Sided Comparisons between k Treatments and a Control
α = 0.05
k = Number of Treatment Means (excluding control)
v123456789
5 2.57 3.03 3.29
6 2.45 2.86 3.10
7 2.36 2.75 2.97
8 2.31 2.67 2.88
9 2.26 2.61 2.81
10 2.23 2.57 2.76
11 2.20 2.53 2.72
12 2.18 2.50 2.68
13 2.16 2.48 2.65
14 2.14 2.46 2.63
15 2.13 2.44 2.61
16 2.12 2.42 2.59
17 2.11 2.41 2.58
18 2.10 2.40 2.56
19 2.09 2.39 2.55
20 2.09 2.38 2.54
24 2.06 2.35 2.51 30 2.04 2.32 2.47 40 2.02 2.29 2.44 60 2.00 2.27 2.41
120 1.98 2.24 2.38 ∞ 1.96 2.21 2.35
3.48 3.62 3.26 3.39 3.12 3.24 3.02 3.13 2.95 3.05
2.89 2.99 2.84 2.94 2.81 2.90 2.78 2.87 2.75 2.84
2.73 2.82 2.71 2.80 2.69 2.78 2.68 2.76 2.66 2.75
2.65 2.73 2.61 2.70 2.58 2.66 2.54 2.62 2.51 2.58
2.47 2.55 2.44 2.51
3.73 3.82 3.90 3.97 3.49 3.57 3.64 3.71 3.33 3.41 3.47 3.53 3.22 3.29 3.35 3.41 3.14 3.20 3.26 3.32
3.07 3.14 3.19 3.24 3.02 3.08 3.14 3.19 2.98 3.04 3.09 3.14 2.94 3.00 3.06 3.10 2.91 2.97 3.02 3.07
2.89 2.95 3.00 3.04 2.87 2.92 2.97 3.02 2.85 2.90 2.95 3.00 2.83 2.89 2.94 2.98 2.81 2.87 2.92 2.96
2.80 2.86 2.90 2.95 2.76 2.81 2.86 2.90 2.72 2.77 2.82 2.86 2.68 2.73 2.77 2.81 2.64 2.69 2.73 2.77
2.60 2.65 2.69 2.73 2.57 2.61 2.65 2.69
Reproduced from Charles W. Dunnett, “New Tables for Multiple Comparison with a Control,” Biometrics, 20, No. 3, 1964, by permission of the author and the editor.

Table A.14 (continued) Values of dα/2(k, v) for Two-Sided Comparisons between k Treatments and a Control
α = 0.01
k = Number of Treatment Means (excluding control)
v123456789
5 4.03
6 3.71
7 3.50
8 3.36
9 3.25
10 3.17
11 3.11
12 3.05
13 3.01
14 2.98
15 2.95
16 2.92
17 2.90
18 2.88
19 2.86
20 2.85
24 2.80 30 2.75 40 2.70 60 2.66
120 2.62 ∞ 2.58
4.63 4.98 4.21 4.51 3.95 4.21 3.77 4.00 3.63 3.85
3.53 3.74 3.45 3.65 3.39 3.58 3.33 3.52 3.29 3.47
3.25 3.43 3.22 3.39 3.19 3.36 3.17 3.33 3.15 3.31
3.13 3.29 3.07 3.22 3.01 3.15 2.95 3.09 2.90 3.03
2.85 2.97 2.79 2.92
5.22 5.41 4.71 4.87 4.39 4.53 4.17 4.29 4.01 4.12
3.88 3.99 3.79 3.89 3.71 3.81 3.65 3.74 3.59 3.69
3.55 3.64 3.51 3.60 3.47 3.56 3.44 3.53 3.42 3.50
3.40 3.48 3.32 3.40 3.25 3.33 3.19 3.26 3.12 3.19
3.06 3.12 3.00 3.06
5.56 5.69 5.80 5.89 5.00 5.10 5.20 5.28 4.64 4.74 4.82 4.89 4.40 4.48 4.56 4.62 4.22 4.30 4.37 4.43
4.08 4.16 4.22 4.28 3.98 4.05 4.11 4.16 3.89 3.96 4.02 4.07 3.82 3.89 3.94 3.99 3.76 3.83 3.88 3.93
3.71 3.78 3.83 3.88 3.67 3.73 3.78 3.83 3.63 3.69 3.74 3.79 3.60 3.66 3.71 3.75 3.57 3.63 3.68 3.72
3.55 3.60 3.65 3.69 3.47 3.52 3.57 3.61 3.39 3.44 3.49 3.52 3.32 3.37 3.41 3.44 3.25 3.29 3.33 3.37
3.18 3.22 3.26 3.29 3.11 3.15 3.19 3.22

Table A.15 Values of dα(k, v) for One-Sided Comparisons between k Treatments and a Control
α = 0.05
k = Number of Treatment Means (excluding control)
v123456789
5 2.02
6 1.94
7 1.89
8 1.86
9 1.83
10 1.81
11 1.80
12 1.78
13 1.77
14 1.76
15 1.75
16 1.75
17 1.74
18 1.73
19 1.73
20 1.72
24 1.71 30 1.70 40 1.68 60 1.67
120 1.66 ∞ 1.64
2.44 2.68 2.34 2.56 2.27 2.48 2.22 2.42 2.18 2.37
2.15 2.34 2.13 2.31 2.11 2.29 2.09 2.27 2.08 2.25
2.07 2.24 2.06 2.23 2.05 2.22 2.04 2.21 2.03 2.20
2.03 2.19 2.01 2.17 1.99 2.15 1.97 2.13 1.95 2.10
1.93 2.08 1.92 2.06
2.85 2.98 2.71 2.83 2.62 2.73 2.55 2.66 2.50 2.60
2.47 2.56 2.44 2.53 2.41 2.50 2.39 2.48 2.37 2.46
2.36 2.44 2.34 2.43 2.33 2.42 2.32 2.41 2.31 2.40
2.30 2.39 2.28 2.36 2.25 2.33 2.23 2.31 2.21 2.28
2.18 2.26 2.16 2.23
3.08 3.16 2.92 3.00 2.82 2.89 2.74 2.81 2.68 2.75
2.64 2.70 2.60 2.67 2.58 2.64 2.55 2.61 2.53 2.59
2.51 2.57 2.50 2.56 2.49 2.54 2.48 2.53 2.47 2.52
2.46 2.51 2.43 2.48 2.40 2.45 2.37 2.42 2.35 2.39
2.32 2.37 2.29 2.34
3.24 3.30 3.07 3.12 2.95 3.01 2.87 2.92 2.81 2.86
2.76 2.81 2.72 2.77 2.69 2.74 2.66 2.71 2.64 2.69
2.62 2.67 2.61 2.65 2.59 2.64 2.58 2.62 2.57 2.61
2.56 2.60 2.53 2.57 2.50 2.54 2.47 2.51 2.44 2.48
2.41 2.45 2.38 2.42
Reproduced from Charles W. Dunnett, “A Multiple Comparison Procedure for Comparing Several Treatments with a Control,” J. Am. Stat. Assoc., 50, 1955, 1096–1121, by permission of the author and the editor.

Table A.15 (continued) Values of dα(k, v) for One-Sided Comparisons between k Treatments and a Control
α = 0.01
k = Number of Treatment Means (excluding control)
v123456789
5 3.37
6 3.14
7 3.00
8 2.90
9 2.82
10 2.76
11 2.72
12 2.68
13 2.65
14 2.62
15 2.60
16 2.58
17 2.57
18 2.55
19 2.54
20 2.53
24 2.49 30 2.46 40 2.42 60 2.39
120 2.36 ∞ 2.33
3.90 4.21 3.61 3.88 3.42 3.66 3.29 3.51 3.19 3.40
3.11 3.31 3.06 3.25 3.01 3.19 2.97 3.15 2.94 3.11
2.91 3.08 2.88 3.05 2.86 3.03 2.84 3.01 2.83 2.99
2.81 2.97 2.77 2.92 2.72 2.87 2.68 2.82 2.64 2.78
2.60 2.73 2.56 2.68
4.43 4.60 4.07 4.21 3.83 3.96 3.67 3.79 3.55 3.66
3.45 3.56 3.38 3.48 3.32 3.42 3.27 3.37 3.23 3.32
3.20 3.29 3.17 3.26 3.14 3.23 3.12 3.21 3.10 3.18
3.08 3.17 3.03 3.11 2.97 3.05 2.92 2.99 2.87 2.94
2.82 2.89 2.77 2.84
4.73 4.85 4.33 4.43 4.07 4.15 3.88 3.96 3.75 3.82
3.64 3.71 3.56 3.63 3.50 3.56 3.44 3.51 3.40 3.46
3.36 3.42 3.33 3.39 3.30 3.36 3.27 3.33 3.25 3.31
3.23 3.29 3.17 3.22 3.11 3.16 3.05 3.10 3.00 3.04
2.94 2.99 2.89 2.93
4.94 5.03 4.51 4.59 4.23 4.30 4.03 4.09 3.89 3.94
3.78 3.83 3.69 3.74 3.62 3.67 3.56 3.61 3.51 3.56
3.47 3.52 3.44 3.48 3.41 3.45 3.38 3.42 3.36 3.40
3.34 3.38 3.27 3.31 3.21 3.24 3.14 3.18 3.08 3.12
3.03 3.06 2.97 3.00

Table A.16 Critical Values for the Signed-Rank Test

      One-Sided α = 0.01   One-Sided α = 0.025   One-Sided α = 0.05
n     Two-Sided α = 0.02   Two-Sided α = 0.05    Two-Sided α = 0.1
5     —    —    1
6     —    1    2
7     0    2    4
8     2    4    6
9     3    6    8
10 5 8 11
11 7 11 14
12 10 14 17
13 13 17 21
14 16 21 26
15 20 25 30
16 24 30 36
17 28 35 41
18 33 40 47
19 38 46 54
20 43 52 60
21 49 59 68
22 56 66 75
23 62 73 83
24 69 81 92
25 77 90 101
26 85 98 110
27 93 107 120
28 102 117 130
29 111 127 141
30 120 137 152
Reproduced from F. Wilcoxon and R. A. Wilcox, Some Rapid Approximate Statistical Procedures, American Cyanamid Company, Pearl River, N.Y., 1964, by permission of the American Cyanamid Company.

Table A.17 Critical Values for the Wilcoxon Rank-Sum Test
One-Tailed Test at α = 0.001 or Two-Tailed Test at α = 0.002
n1
1 2 3 4 5 6 7 8 9
10 11
12
13
14
15
16
17
18
19
20
n1
1 2 3 4 5 6 7 8 9
10 11
12
13
14
15
16
17
18
19
20
n2
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0000 00011122333 00112233455677
0 1 2 2 3 2335 556 78 10
4 4 67 89
10 12 12 14 15 17
20
5
8
11
14
17
20
23
26
6 7 8 9 10
11 12 15 16 20 21 25 26 29 32 34 37 40 42 45 48 50 54 55 59 60 65 66 70
9
12
15
19
22
25
29
32
10
14
17
21
24
28
32
36
40
11
15
19
23
27
31
35
39
43
48
13 14 17 18 21 23 25 27 29 32 34 37 38 42 43 46 47 51 52 56 57 61
66 71 76 77 82 88
One-Tailed Test at α = 0.01 or Two-Tailed Test at α n2
5 6 7 8 9 10 11 12 13 14 15 16
= 0.02 17 18
19 20
0 1 1 2 3
1 2 3 4 46 68
10
3 5 7 9
11 14
3 6 8
11 13 16 19
4 5 5 6 7 7 8
9
14
19
24
30
36
41
47
53
59
65
70
76
82
88
9 10 15 16 20 22 26 28 32 34 38 40 44 47 50 53 56 60 63 67 69 73 75 80 82 87 88 93 94 100
101 107 114
00000011 00111222334445
9 12 15 18 22 25
11
14
17
21
24
28
31
12
16
20
23
27
31
35
39
7 8 9
10 13
17
22
26
30
34
38
43
47
11 15
19
24
28
33
37
42
47
51
56
12 16
21
26
31
36
41
46
51
56
61
66
13 18
23
28
33
38
44
49
55
60
66
71
77
Based in part on Tables 1, 3, 5, and 7 of D. Auble, “Extended Tables for the Mann-Whitney Statistic,” Bulletin of the Institute of Educational Research at Indiana University, 1, No. 2, 1953, by permission of the director.

Table A.17 (continued) Critical Values for the Wilcoxon Rank-Sum Test One-Tailed Test at α = 0.025 or Two-Tailed Test at α = 0.05
n2
n1 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20
4 5 6 7 8 9
10 11
12
13
14
15
16
17
18
19
20
0 1 2 3 4 2356 568 8 10 13
4
7 10 12 15 17
5
8 11 14 17 20 23
6 7 8 9
10
14
19
24
29
34
39
44
49
54
59
64
11
15
21
26
31
37
42
47
53
59
64
70
75
11
17
22
28
34
39
45
51
57
63
67
75
81
87
12 13 13 18 19 20 24 25 27 30 32 34 36 38 41 42 45 48 48 52 55 55 58 62 61 65 69 67 72 76 74 78 83 80 85 90 86 92 98 93 99 105 99 106 112
113 119 127
1
2 0000111112222 30112233445566778
9
13
16
19
23
26
30
11 14
18
22
26
29
33
37
12 16
20
24
28
33
37
41
45
13 17
22
26
31
36
40
45
50
55
One-Tailed Test at α = 0.05 or Two-Tailed Test at α = 0.1 n2
n1
2 3 4 5 6 7 8 9
10 11
12
13
14
15
16
17
18
19
20
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 100
0001111223333444
0 0 1 2 2 1234 456 78 11
3 5 8
10 13 15
4 4 5 5 6 7 7 8 9
9
16
22
28
35
41
48
55
61
68
75
82
88
95
102
109
10 11 17 18 23 25 30 32 37 39 44 47 51 54 58 62 65 69 72 77 80 84 87 92 94 100
101 107 109 115 116 123 123 130
138
9 12 15 18 21
11 14
17 20 24 27
12 16
19 23 27 31 34
13 17
21
26
30
34
38
42
6789
10
15
19
24
28
33
37
42
47
51
11
16
21
26
31
36
41
46
51
56
61
12
18
23
28
33
39
44
50
55
61
66
72
14
19
25
30
36
42
48
54
60
65
71
77
83
15
20
26
33
39
45
51
57
64
70
77
83
89
96

Table A.18 P (V ≤ v∗ when H0 is true) in the Runs Test v∗
(n1,n2) 2 3 4 5 6 7 8 9 10
0.200 0.500 0.133 0.400 0.095 0.333 0.071 0.286 0.056 0.250 0.044 0.222 0.036 0.200 0.030 0.182 0.100 0.300 0.057 0.200 0.036 0.143 0.024 0.107 0.017 0.083 0.012 0.067 0.009 0.055 0.007 0.045 0.029 0.114 0.016 0.071 0.010 0.048 0.006 0.033 0.004 0.024 0.003 0.018 0.002 0.014 0.008 0.040 0.004 0.024 0.003 0.015 0.002 0.010 0.001 0.007 0.001 0.005 0.002 0.013 0.001 0.008 0.001 0.005 0.000 0.003 0.000 0.002 0.001 0.004 0.000 0.002 0.000 0.001 0.000 0.001 0.000 0.001 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
(2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10) (3, 3) (3,4) (3,5) (3,6) (3,7) (3,8) (3,9) (3,10) (4,4) (4,5) (4,6) (4,7) (4,8) (4,9) (4,10) (5,5) (5,6) (5,7) (5,8) (5,9) (5,10) (6,6) (6,7) (6,8) (6,9) (6,10) (7,7) (7,8) (7,9) (7,10) (8,8) (8,9) (8,10) (9,9) (9,10) (10,10)
0.900 1.000 0.800 1.000 0.714 1.000 0.643 1.000 0.583 1.000 0.533 1.000 0.491 1.000 0.455 1.000 0.700 0.900 0.543 0.800 0.429 0.714 0.345 0.643 0.283 0.583 0.236 0.533 0.200 0.491 0.171 0.455 0.371 0.629 0.262 0.500 0.190 0.405 0.142 0.333 0.109 0.279 0.085 0.236 0.068 0.203 0.167 0.357 0.110 0.262 0.076 0.197 0.054 0.152 0.039 0.119 0.029 0.095 0.067 0.175 0.043 0.121 0.028 0.086 0.019 0.063 0.013 0.047 0.025 0.078 0.015 0.051 0.010 0.035 0.006 0.024 0.009 0.032 0.005 0.020 0.003 0.013 0.003 0.012 0.002 0.008 0.001 0.004
1.000
0.971 1.000
0.929 1.000
0.881 1.000
0.833 1.000
0.788 1.000
0.745 1.000
0.706 1.000
0.886 0.971 1.000
0.786 0.929 0.992 1.000 0.690 0.881 0.976 1.000 0.606 0.833 0.954 1.000 0.533 0.788 0.929 1.000 0.471 0.745 0.902 1.000 0.419 0.706 0.874 1.000 0.643 0.833 0.960 0.992 0.522 0.738 0.911 0.976 0.424 0.652 0.854 0.955 0.347 0.576 0.793 0.929 0.287 0.510 0.734 0.902 0.239 0.455 0.678 0.874 0.392 0.608 0.825 0.933 0.296 0.500 0.733 0.879 0.226 0.413 0.646 0.821 0.175 0.343 0.566 0.762 0.137 0.288 0.497 0.706 0.209 0.383 0.617 0.791 0.149 0.296 0.514 0.704 0.108 0.231 0.427 0.622 0.080 0.182 0.355 0.549 0.100 0.214 0.405 0.595 0.069 0.157 0.319 0.500 0.048 0.117 0.251 0.419 0.044 0.109 0.238 0.399 0.029 0.077 0.179 0.319 0.019 0.051 0.128 0.242
1.000 0.998 0.992 0.984 0.972 0.958 0.987 0.966 0.937 0.902 0.864 0.922 0.867 0.806 0.743 0.786 0.702 0.621 0.601 0.510 0.414
Reproduced from C. Eisenhart and R. Swed, “Tables for Testing Randomness of Grouping in a Sequence of Alternatives,” Ann. Math. Stat., 14, 1943, by permission of the editor.

Table A.18 (continued) P(V ≤ v∗ when H0 is true) in the Runs Test
v∗
(n1,n2) 11 12 13 14 15 16 17 18 19 20
(2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10)
(3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8) (3, 9) (3, 10)
(4, 4) (4, 5) (4, 6) (4, 7) (4, 8) (4, 9) (4, 10)
(5, 5) (5, 6) (5, 7) (5, 8) (5, 9) (5, 10)
(6, 6) (6, 7) (6, 8) (6, 9) (6, 10)
(7, 7) (7, 8) (7, 9) (7, 10)
(8,8)
(8,9) (8,10)
(9,9) (9,10) (10,10)
1.000 1.000 1.000 1.000 1.000
0.998 1.000
0.992 0.999 1.000 0.984 0.998 1.000 0.972 0.994 1.000 0.958 0.990 1.000
0.975 0.996 0.999 1.000 0.949 0.988 0.998 1.000 0.916 0.975 0.994 0.999 0.879 0.957 0.990 0.998
0.900 0.968 0.991 0.999 0.843 0.939 0.980 0.996 0.782 0.903 0.964 0.990
0.762 0.891 0.956 0.988 0.681 0.834 0.923 0.974 0.586 0.758 0.872 0.949
1.000 1.000 1.000
1.000 1.000
0.999 1.000 1.000 0.998 1.000 1.000
0.997 1.000 1.000 0.992 0.999 1.000 0.981 0.996 0.999
1.000
1.000 1.000
1.000 1.000 1.000
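The entries of Table A.18 follow from the classical distribution of the total number of runs V among n1 items of one kind and n2 of another. The sketch below is an illustrative addition (not part of the original table) that reproduces the tabled probabilities from the standard runs-count formulas:

```python
# Illustrative check of Table A.18: exact cumulative distribution of the
# number of runs V under randomness.
from math import comb

def runs_cdf(v_star, n1, n2):
    total = comb(n1 + n2, n1)
    p = 0.0
    for v in range(2, v_star + 1):
        if v % 2 == 0:                      # v = 2k: k runs of each kind
            k = v // 2
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:                               # v = 2k + 1: k + 1 runs of one kind
            k = (v - 1) // 2
            ways = (comb(n1 - 1, k - 1) * comb(n2 - 1, k)
                    + comb(n1 - 1, k) * comb(n2 - 1, k - 1))
        p += ways / total
    return p

print(runs_cdf(3, 2, 3))   # 0.500, matching the (2, 3) row of the table
```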

Table A.19 Sample Size for Two-Sided Nonparametric Tolerance Limits 1−γ
1 − α 0.50 0.70 0.90 0.95 0.99
0.995
1483
740
146
72
47
34
27
22
16
12
488 777 947 1325 244 388 473 662 49 77 93 130 24 38 46 64 16 25 30 42 12 18 22 31 10 15 18 24 8 12 14 20 6 9 10 14 5 7 8 11
Reproduced from Table A–25d of Wilfrid J. Dixon and Frank J. Massey, Jr., Introduction to Statistical Analysis, 3rd ed., McGraw-Hill, New York, 1969. Used with permission of McGraw-Hill Book Company.
Table A.20 Sample Size for One-Sided Nonparametric Tolerance Limits 1−γ
0.995 336 0.99 168 0.95 34 0.90 17 0.85 11 0.80 9 0.75 7 0.70 6 0.60 4 0.50 3
1 − α 0.50 0.70 0.95 0.99
0.995
0.995 139 0.99 69 0.95 14 0.90 7 0.85 5 0.80 4 0.75 3 0.70 2 0.60 2 0.50 1
241 598 919 1379 120 299 459 688 24 59 90 135 12 29 44 66 8 19 29 43 6 14 21 31 5 11 7 25 4 9 13 20 3 6 10 14 2 5 7 10
Reproduced from Table A–25e of Wilfrid J. Dixon and Frank J. Massey, Jr., Introduction to Statistical Analysis, 3rd ed. McGraw-Hill, New York, 1969. Used with permission of McGraw-Hill Book Company.

Table A.21 Critical Values for Spearman’s Rank Correlation Coefficients
n α = 0.05
5 0.900
6 0.829
7 0.714
8 0.643
9 0.600
10 0.564
11 0.523
12 0.497
13 0.475
14 0.457
15 0.441
16 0.425
17 0.412
18 0.399
19 0.388
20 0.377
21 0.368
22 0.359
23 0.351
24 0.343
25 0.336
26 0.329
27 0.323
28 0.317
29 0.311
30 0.305
α = 0.025 0.886
0.786 0.738 0.683 0.648
0.623 0.591 0.566 0.545 0.525
0.507 0.490 0.476 0.462 0.450
0.438 0.428 0.418 0.409 0.400
0.392 0.385 0.377 0.370 0.364
α = 0.01 0.943
0.893 0.833 0.783 0.745
0.736 0.703 0.673 0.646 0.623
0.601 0.582 0.564 0.549 0.534
0.521 0.508 0.496 0.485 0.475
0.465 0.456 0.448 0.440 0.432
α = 0.005
0.881 0.833 0.794
0.818 0.780 0.745 0.716 0.689
0.666 0.645 0.625 0.608 0.591
0.576 0.562 0.549 0.537 0.526
0.515 0.505 0.496 0.487 0.478
Reproduced from E. G. Olds, “Distribution of Sums of Squares of Rank Differences for Small Samples,” Ann. Math. Stat., 9, 1938, by permission of the editor.
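For moderately large n, the tabled critical values are close to a simple normal approximation, r* ≈ z(1−α)/√(n − 1). The following is an illustrative sketch (an addition to the original table, assuming SciPy):

```python
# Illustrative large-sample approximation to Table A.21's critical values.
from math import sqrt
from scipy.stats import norm

for n, alpha in [(30, 0.05), (30, 0.01)]:
    print(round(norm.ppf(1 - alpha) / sqrt(n - 1), 3))  # cf. 0.305 and 0.432
```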

Table A.22 Factors for Constructing Control Charts Chart for
Averages Obs. in Factors for
Chart for Standard Deviations Factors for Factors for
Chart for Ranges Factors for Factors for
Sample Control Limits
Centerline Control Limits
n A2 A3
Centerline Control Limits c4 1/c4 B3 B4 B5 B6 d2 1/d2 d3 D3 D4
2 1.880 2.659 0.7979 1.2533 0 3.267 0 2.606 1.128 0.8865 0.853 0 3.267 3 1.023 1.954 0.8862 1.1284 0 2.568 0 2.276 1.693 0.5907 0.888 0 2.574 4 0.729 1.628 0.9213 1.0854 0 2.266 0 2.088 2.059 0.4857 0.880 0 2.282 5 0.577 1.427 0.9400 1.0638 0 2.089 0 1.964 2.326 0.4299 0.864 0 2.114
6 0.483 1.287 0.9515 1.0510 0.030 1.970 0.029 1.874 2.534 0.3946 0.848 0 2.004 7 0.419 1.182 0.9594 1.0423 0.118 1.882 0.113 1.806 2.704 0.3698 0.833 0.076 1.924 8 0.373 1.099 0.9650 1.0363 0.185 1.815 0.179 1.751 2.847 0.3512 0.820 0.136 1.864 9 0.337 1.032 0.9693 1.0317 0.239 1.761 0.232 1.707 2.970 0.3367 0.808 0.184 1.816
10 0.308 0.975 0.9727 1.0281 0.284 1.716 0.276 1.669 3.078 0.3249 0.797 0.223 1.777
11 0.285 0.927 0.9754 1.0252 0.321 1.679 0.313 1.637 3.173 0.3152 0.787 0.256 1.744 12 0.266 0.886 0.9776 1.0229 0.354 1.646 0.346 1.610 3.258 0.3069 0.778 0.283 1.717 13 0.249 0.850 0.9794 1.0210 0.382 1.618 0.374 1.585 3.336 0.2998 0.770 0.307 1.693 14 0.235 0.817 0.9810 1.0194 0.406 1.594 0.399 1.563 3.407 0.2935 0.763 0.328 1.672 15 0.223 0.789 0.9823 1.0180 0.428 1.572 0.421 1.544 3.472 0.2880 0.756 0.347 1.653
16 0.212 0.763 0.9835 1.0168 0.448 1.552 0.440 1.526 3.532 0.2831 0.750 0.363 1.637 17 0.203 0.739 0.9845 1.0157 0.466 1.534 0.458 1.511 3.588 0.2787 0.744 0.378 1.622 18 0.194 0.718 0.9854 1.0148 0.482 1.518 0.475 1.496 3.640 0.2747 0.739 0.391 1.608 19 0.187 0.698 0.9862 1.0140 0.497 1.503 0.490 1.483 3.689 0.2711 0.734 0.403 1.597 20 0.180 0.680 0.9869 1.0133 0.510 1.490 0.504 1.470 3.735 0.2677 0.729 0.415 1.585
21 0.173 0.663 0.9876 1.0126 0.523 1.477 0.516 1.459 3.778 0.2647 0.724 0.425 1.575 22 0.167 0.647 0.9882 1.0119 0.534 1.466 0.528 1.448 3.819 0.2618 0.720 0.434 1.566 23 0.162 0.633 0.9887 1.0114 0.545 1.455 0.539 1.438 3.858 0.2592 0.716 0.443 1.557 24 0.157 0.619 0.9892 1.0109 0.555 1.445 0.549 1.429 3.895 0.2567 0.712 0.451 1.548 25 0.153 0.606 0.9896 1.0105 0.565 1.435 0.559 1.420 3.931 0.2544 0.708 0.459 1.541
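Several of these factors have closed forms; for example, c4 = √(2/(n − 1)) · Γ(n/2)/Γ((n − 1)/2), with A3 = 3/(c4√n), B3 = max{0, 1 − 3√(1 − c4²)/c4}, and B4 = 1 + 3√(1 − c4²)/c4. A brief illustrative consistency check (an addition to the original table):

```python
# Illustrative check of Table A.22's standard-deviation chart factors.
from math import sqrt, gamma

def c4(n):
    return sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

n = 5
print(round(c4(n), 4))                                    # 0.9400
print(round(3 / (c4(n) * sqrt(n)), 3))                    # A3 = 1.427
print(round(max(0, 1 - 3 * sqrt(1 - c4(n)**2) / c4(n)), 3))  # B3 = 0
print(round(1 + 3 * sqrt(1 - c4(n)**2) / c4(n), 3))       # B4 = 2.089
```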

Table A.23 The Incomplete Gamma Function: $F(x;\alpha)=\int_{0}^{x}\frac{1}{\Gamma(\alpha)}\,y^{\alpha-1}e^{-y}\,dy$
α
x 1 2 3 4 5 6 7 8 9 10
1 0.6320
2 0.8650
3 0.9500
4 0.9820
5 0.9930
6 0.9980
7 0.9990
8 1.0000
9
10
11 12 13 14 15
0.2640 0.5940 0.8010 0.9080 0.9600
0.9830 0.9930 0.9970 0.9990 1.0000
0.0800 0.3230 0.5770 0.7620 0.8750
0.9380 0.9700 0.9860 0.9940 0.9970
0.9990 1.0000
0.0190 0.1430 0.3530 0.5670 0.7350
0.8490 0.9180 0.9580 0.9790 0.9900
0.9950 0.9980 0.9990 1.0000
0.0040 0.0010 0.0000 0.0000 0.0000 0.0530 0.0170 0.0050 0.0010 0.0000 0.1850 0.0840 0.0340 0.0120 0.0040 0.3710 0.2150 0.1110 0.0510 0.0210 0.5600 0.3840 0.2380 0.1330 0.0680
0.7150 0.5540 0.3940 0.2560 0.1530 0.8270 0.6990 0.5500 0.4010 0.2710 0.9000 0.8090 0.6870 0.5470 0.4070 0.9450 0.8840 0.7930 0.6760 0.5440 0.9710 0.9330 0.8700 0.7800 0.6670
0.9850 0.9620 0.9210 0.8570 0.7680 0.9920 0.9800 0.9540 0.9110 0.8450 0.9960 0.9890 0.9740 0.9460 0.9000 0.9980 0.9940 0.9860 0.9680 0.9380 0.9990 0.9970 0.9920 0.9820 0.9630
0.0000 0.0000 0.0010 0.0080 0.0320
0.0840 0.1700 0.2830 0.4130 0.5420
0.6590 0.7580 0.8340 0.8910 0.9300
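F(x; α) is the regularized lower incomplete gamma function, so the table can be reproduced directly. A minimal illustrative sketch (an addition to the original table, assuming SciPy):

```python
# Illustrative check of Table A.23 via the regularized incomplete gamma.
from scipy.special import gammainc

print(round(gammainc(1, 1), 4))    # 0.6321, cf. the x = 1, alpha = 1 entry
print(round(gammainc(5, 10), 4))   # ~0.971, cf. the x = 10, alpha = 5 entry
```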
A.24 Proof of Mean of the Hypergeometric Distribution

To find the mean of the hypergeometric distribution, we write

$$E(X)=\sum_{x=0}^{n}x\,\frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}
      =k\sum_{x=1}^{n}\frac{(k-1)!}{(x-1)!\,(k-x)!}\cdot\frac{\binom{N-k}{n-x}}{\binom{N}{n}}
      =k\sum_{x=1}^{n}\frac{\binom{k-1}{x-1}\binom{N-k}{n-x}}{\binom{N}{n}}.$$

Since

$$\binom{N-k}{n-x}=\binom{(N-1)-(k-1)}{(n-1)-(x-1)}
\quad\text{and}\quad
\binom{N}{n}=\frac{N!}{n!\,(N-n)!}=\frac{N}{n}\binom{N-1}{n-1},$$

letting $y = x-1$, we obtain

$$E(X)=k\sum_{y=0}^{n-1}\frac{\binom{k-1}{y}\binom{N-k}{n-1-y}}{\binom{N}{n}}
      =\frac{nk}{N}\sum_{y=0}^{n-1}\frac{\binom{k-1}{y}\binom{(N-1)-(k-1)}{n-1-y}}{\binom{N-1}{n-1}}
      =\frac{nk}{N},$$

since the summation represents the total of all probabilities in a hypergeometric experiment in which $n-1$ items are selected at random from $N-1$, of which $k-1$ are labeled success.
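As an illustrative numerical check of this result (an addition to the original proof), one can sum the distribution directly:

```python
# Illustrative check that E(X) = nk/N for a hypergeometric distribution.
from math import comb

N, k, n = 20, 8, 5
mean = sum(x * comb(k, x) * comb(N - k, n - x)
           for x in range(n + 1)) / comb(N, n)
print(mean, n * k / N)   # both equal 2.0
```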

A.25 Proof of Mean and Variance of the Poisson Distribution

Let $\mu = \lambda t$. Then

$$E(X)=\sum_{x=0}^{\infty}x\,\frac{e^{-\mu}\mu^{x}}{x!}
      =\sum_{x=1}^{\infty}x\,\frac{e^{-\mu}\mu^{x}}{x!}
      =\mu\sum_{x=1}^{\infty}\frac{e^{-\mu}\mu^{x-1}}{(x-1)!}.$$

Since the summation in the last term above is the total probability of a Poisson random variable with mean $\mu$, which can easily be seen by letting $y = x-1$, it equals 1. Therefore, $E(X)=\mu$. To calculate the variance of $X$, note that

$$E[X(X-1)]=\sum_{x=0}^{\infty}x(x-1)\,\frac{e^{-\mu}\mu^{x}}{x!}
          =\mu^{2}\sum_{x=2}^{\infty}\frac{e^{-\mu}\mu^{x-2}}{(x-2)!}.$$

Again, letting $y = x-2$, the summation in the last term above is the total probability of a Poisson random variable with mean $\mu$, so it equals 1. Hence, we obtain

$$\sigma^{2}=E(X^{2})-[E(X)]^{2}=E[X(X-1)]+E(X)-[E(X)]^{2}
           =\mu^{2}+\mu-\mu^{2}=\mu=\lambda t.$$
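An illustrative numerical check (an addition to the original proof), using the recurrence P(X = x) = P(X = x − 1)·μ/x to avoid large factorials:

```python
# Illustrative check that the Poisson mean and variance both equal mu.
from math import exp

mu = 3.7
p = exp(-mu)                 # P(X = 0)
mean = var = 0.0
for x in range(1, 200):
    p *= mu / x              # recurrence: P(X = x) from P(X = x - 1)
    mean += x * p
    var += x * x * p
var -= mean ** 2
print(mean, var)             # both approximately 3.7
```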
A.26 Proof of Mean and Variance of the Gamma Distribution

To find the mean and variance of the gamma distribution, we first calculate

$$E(X^{k})=\frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}x^{\alpha+k-1}e^{-x/\beta}\,dx
         =\frac{\beta^{k+\alpha}\Gamma(\alpha+k)}{\beta^{\alpha}\Gamma(\alpha)}
          \int_{0}^{\infty}\frac{x^{\alpha+k-1}e^{-x/\beta}}{\beta^{k+\alpha}\Gamma(\alpha+k)}\,dx,$$

for $k = 0, 1, 2, \ldots$. Since the integrand in the last term above is a gamma density function with parameters $\alpha+k$ and $\beta$, it equals 1. Therefore,

$$E(X^{k})=\beta^{k}\,\frac{\Gamma(k+\alpha)}{\Gamma(\alpha)}.$$

Using the recursion formula of the gamma function from page 194, we obtain

$$\mu=\beta\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha)}=\alpha\beta
\quad\text{and}\quad
\sigma^{2}=E(X^{2})-\mu^{2}=\beta^{2}\,\frac{\Gamma(\alpha+2)}{\Gamma(\alpha)}-\mu^{2}
          =\beta^{2}\alpha(\alpha+1)-(\alpha\beta)^{2}=\alpha\beta^{2}.$$
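An illustrative numerical check (an addition to the original proof), integrating the gamma density directly:

```python
# Illustrative check of mu = alpha*beta and sigma^2 = alpha*beta^2.
from math import exp, gamma as gamma_fn
from scipy.integrate import quad

alpha, beta = 2.5, 1.5

def pdf(x):
    return x ** (alpha - 1) * exp(-x / beta) / (beta ** alpha * gamma_fn(alpha))

m1 = quad(lambda x: x * pdf(x), 0, float("inf"))[0]
m2 = quad(lambda x: x * x * pdf(x), 0, float("inf"))[0]
print(m1, alpha * beta)                  # ~3.75
print(m2 - m1 ** 2, alpha * beta ** 2)   # ~5.625
```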

Appendix B
Answers to Odd-Numbered Non-Review Exercises
Chapter 1
1.11
1.13
1.15 1.17
1.19
Control: sample variance = 69.38, sample standard deviation = 8.33. Treatment: sample variance = 128.04, sample standard deviation = 11.32.
(a) Mean = 124.3, median = 120 (b) 175 is an extreme observation.
Yes, P -value = 0.03125, probability of obtaining HHHHH with a fair coin.
1.1
1.3
1.5
(a) Sample size = 15
(b) Sample mean = 3.787 (c) Sample median = 3.6
(e) x ̄tr(20) = 3.678
(f) They are about the same.
(b) Yes, the aging process has reduced the ten- sile strength.
(c) x ̄Aging = 209.90, x ̄No aging = 222.10
(d) x ̃Aging = 210.00, x ̃No aging = 221.50. The means and medians are similar for each group.
(b) Control: x ̄ = 5.60, x ̃ = 5.00, x ̄tr(10) = 5.13. Treatment: x ̄ = 7.60, x ̃ = 4.50, x ̄tr(10) = 5.63.
(c) The extreme value of 37 in the treatment group plays a strong leverage role for the mean calculation.
(a) (b)
(d)
(a)
(b)
The sample means for nonsmokers and smokers are 30.32 and 43.70, respectively.
The sample standard deviations for non- smokers and smokers are 7.13 and 16.93, respectively.
Smokers appear to take a longer time to fall asleep. For smokers the time to fall asleep is more variable.
Stem  Leaf      Frequency
0     22233457  8
1     023558    6
2     035       3
3     03        2
4     057       3
5     0569      4
6     0005      4
1.7
1.9 (a)
Sample variance = 0.943
Sample standard deviation = 0.971
No aging: sample variance = 23.66, sample standard deviation = 4.86. Aging: sample variance = 42.10, sample standard deviation = 6.49.
(b) Based on the numbers in (a), the variation in “Aging” is smaller than the variation in “No aging,” although the difference is not so apparent in the plot.
Class Interval  Class Midpoint  Freq.  Rel. Freq.
0.0–0.9         0.45            8      0.267
1.0–1.9         1.45            6      0.200
2.0–2.9         2.45            3      0.100
3.0–3.9         3.45            2      0.067
4.0–4.9         4.45            3      0.100
5.0–5.9         5.45            4      0.133
6.0–6.9         6.45            4      0.133

(c)
1.21 (a) (b)
1.23 (b) (c)
1.25 (a) (b) (d)
Sample mean = 2.7967
Sample range = 6.3
Sample standard deviation = 2.2273
x ̄=74.02andx ̃=78 s = 39.26
x ̄1980 = 395.10, x ̄1990 = 160.15
The mean emissions dropped between 1980 and 1990; the variability also decreased be- cause there were no longer extremely large emissions.
Sample mean = 33.31 Sample median = 26.35 x ̄tr(10) = 30.97
(b) A = {M1M2, M1F1, M1F2, M2M1, M2F1, M2F2}
(c) B = {M1F1, M1F2, M2F1, M2F2, F1M1, F1M2, F2M1, F2M2}
(d) C={F1F2,F2F1}
(e) A∩B={M1F1,M1F2,M2F1,M2F2}
(f) A ∪ C = {M1M2, M1F1, M1F2, M2M1, M2F1, M2F2, F1F2, F2F1}
2.15 (a) {nitrogen, potassium, uranium, oxygen} (b) {copper, sodium, zinc, oxygen}
(c) {copper, sodium, nitrogen, potassium, ura- nium, zinc}
(d) {copper, uranium, zinc} (e) φ
(f) {oxygen}
2.19 (a) The family will experience mechanical problems but will receive no ticket for a traffic violation and will not arrive at a
campsite that has no vacancies.
(b) The family will receive a traffic ticket and arrive at a campsite that has no vacancies but will not experience mechanical prob- lems.
(c) The family will experience mechanical problems and will arrive at a campsite that has no vacancies.
(d) The family will receive a traffic ticket but will not arrive at a campsite that has no vacancies.
(e) The family will not experience mechanical problems.
2.21 18
2.23 156
2.25 20
2.27 48
2.29 210
2.31 72
2.33 (a) 1024; (b) 243 2.35 362,880
2.37 2880
2.39 (a) 40,320; (b) 336 2.41 360
Chapter 2
2.1
2.3 2.5
2.7
2.9
2.11
(a) S={8,16,24,32,40,48} (b) S={−5,1}
(c) S={T,HT,HHT,HHH}
(d) S ={Africa, Antarctica, Asia, Australia,
Europe, North America, South America} (e) S = φ
A = C
Using the tree diagram, we obtain
S = {1HH, 1HT, 1TH, 1TT, 2H, 2T, 3HH, 3HT, 3TH, 3TT, 4H, 4T, 5HH, 5HT, 5TH, 5TT, 6H, 6T}
S1 = {MMMM,MMMF,MMFM,MFMM, FMMM,MMFF,MFMF,MFFM,FMFM, FFMM,FMMF,MFFF,FMFF,FFMF, FFFM,FFFF};
S2 = {0,1,2,3,4}
(a) A = {1HH,1HT,1TH,1TT,2H,2T}
(b) B = {1TT, 3TT, 5TT}
(c) A′ = {3HH, 3HT, 3TH, 3TT, 4H, 4T, 5HH, 5HT, 5TH, 5TT, 6H, 6T}
(d) A′ ∩ B = {3TT, 5TT}
(e) A ∪ B = {1HH, 1HT, 1TH, 1TT, 2H, 2T, 3TT, 5TT}
(a) S = {M1M2, M1F1, M1F2, M2M1, M2F1, M2F2, F1M1, F1M2, F1F2, F2M1, F2M2, F2F1}

2.43 24 2.45 3360 2.47 56
2.49 (a) (b) (c) (d)
Sum of the probabilities exceeds 1.
Sum of the probabilities is less than 1.
A negative probability
Probabilityofbothaheartandablackcard is zero.
2.93 (a) 0.75112; (b) 0.2045 2.95 0.0960
2.97 0.40625
2.99 0.1124
2.101 0.857
Chapter 3
3.1 Discrete; continuous; continuous; discrete; dis- crete; continuous
3.3 Sample Space w
HHH 3 HHT 1 HTH 1 THH 1 HTT −1 THT −1 TTH −1 TTT −3
3.5 (a) 1/30; (b) 1/10 3.7 (a) 0.68; (b) 0.375 3.9 (b) 19/80
2.51 S = {$10,$25,$100}; P(10) = 11, P(25) = 3 ,
P(100)= 15 ; 17 100 20
20
10
2.53 (a) 0.3; (b) 0.2
2.55 10/117
2.57 (a) 5/26; (b) 9/26; (c) 19/26
2.59 (a) 94/54,145; (b) 143/39,984 2.61 (a) 22/25; (b) 3/25; (c) 17/50 2.63 (a) 0.32; (b) 0.68; (c) office or den 2.65 (a) 0.8; (b) 0.45; (c) 0.55
2.67 (a) 0.31; (b) 0.93; (c) 0.31
2.69 (a) 0.009; (b) 0.999; (c) 0.01
2.71 (a) 0.048; (b) $50,000; (c) $12,500
2.73 (a) The probability that a convict who pushed dope also committed armed robbery.
3.11 x: 0, 1, 2; f(x): 2/7, 4/7, 1/7
⎪0, forx<0, ⎪⎨0.41, for0≤x < 1, F(x)= 0.78, for1≤x < 2, ⎪0.94, for2≤x < 3, ⎪⎩0.99, for3≤x < 4, (b) The probability that a convict who com- mitted armed robbery did not push dope. 3.13 (c) The probability that a convict who did not push dope also did not commit armed rob- bery. 777 2.75 (a) 14/39; (b) 95/112 2.77 (a) 5/34; (b) 3/8 2.79 (a) 0.018; (b) 0.614; (c) 0.166; (d) 0.479 3.15 2.81 (a) 0.35; (b) 0.875; (c) 0.55 2.83 (a) 9/28; (b) 3/4; (c) 0.91 2.85 0.27 2.87 5/8 2.89 (a) 0.0016; (b) 0.9984 2.91 (a) 91/323; (b) 91/323 forx≥4 1, ⎪⎨0, for x < 0, ⎧ 2, for0≤x<1, F(x)= 7 ⎪6, for1≤x<2, ⎪⎩ 7 1, forx≥2 (a) 4/7; (b) 5/7 3.17 (b) 1/4; (c) 0.3 ⎧ ⎨0, x<1 3.19 F(x)= x−1, 1≤x<3;1/4 ⎩2 1, x≥3 772 Appendix B Answers to Odd-Numbered Non-Review Exercises ⎧ ⎨0, x<0 3.21 (a) 3/2; (b) F(x) = ⎩x3/2, 0 ≤ x < 1; 0.3004 1, x≥1 3.23 3.25 3.51 ⎪1,for−3≤w<−1, 31000 ⎧ (a) x f(x,y) 0 1 2 3 01661 55 55 55 55 y 1 6 16 6 0 for w < −3, F(w)= 7, for−1≤w<1, 26600 55 55 55 55 55 0, ⎨27 ⎪ ⎪⎩ 27 1, for w ≥ 3 (b) 42/55 3.53 5/8 3.55 Independent 3.57 (a) 3; (b) 21/512 3.59 Dependent Chapter 4 4.1 0.88 4.3 25¢ 4.5 $1.23 4.7 $500 4.9 $6900 4.11 (ln4)/π 4.13 100 hours 4.15 0 4.17 209 4.19 $1855 4.21 $833.33 55 ⎪27 ⎪19, for1≤w<3, (a) 20/27; (b) 2/3 t 20 25 30 131 555 􏰰 F(x)= 0, 1 − exp(−x/2000), 0.6065; (c) 0.6321 􏰰 F(x)= 0, x<1, 1−x−3, x≥1 0.0156 0.2231; (b) 0.2212 k = 280; (b) 0.3633; (c) 0.0563 0.1528; (b) 0.0446 1/36; (b) 1/15 x f(x,y) 0 1 2 3 00393 P(T =t) 3.27 (a) (b) 3.29 (b) (c) 3.31 (a) 3.33 (a) 3.35 (a) 3.37 (a) 3.39 (a) y (b) x<0, x ≥ 0 1/2 70 70 70 1 2 18 18 2 70 70 70 70 23930 70 70 70 4.23 (a) 35.2; (b) μX = 3.20, μY = 3.00 4.25 2 4.27 2000 hours 4.29 (b) 3/2 4.31 (a) 1/6; (b) (5/6)5 4.33 $5,250,000 4.35 0.74 4.37 1/18; in terms of actual profit, the variance is 1 (5000)2 4.39 1/6 4.41 118.9 4 . 4 3 μ Y = 1 0 ; σ Y2 = 1 4 4 4.45 0.01 1/16; (b) g(x) = 12x(1−x)2, for 0 ≤ x ≤ 1; 3.41 (a) (c) 1/4 3.43 (a) 3/64; (b) 1/2 3.45 0.6534 3.47 (a) Dependent; (b) 1/3 3.49 (a)x123 g(x) 0.10 0.35 0.55 (b) 18 y123 h(y) 0.20 0.50 0.30 (c) 0.2857 Answers to Chapter 5 773 4.47 −0.0062 5.17 μ = 3.5, σ2 = 1.05 2􏰺􏰻 4.49 σX = 0.8456, σX = 0.9196 4.51 −1/√5 4.53 μg(X) = 10.33, σg(X) = 6.66 4.55 $0.80 4.57 209 4.59 μ = 7/2, σ2 = 15/4 4.61 3/14 4.63 52 4.65 (a) 7; (b) 0; (c) 12.25 4.67 46/63 4.69 (a) E(X) = E(Y) = 1/3 and Var(X) = Var(Y ) = 4/9; (b) E(Z) = 2/3 and Var(Z) = 8/9 4.71 (a) 4; (b) 32; 16 4.73 By direct calculation, E(eY ) = 1884.32. Us- ing the second-order approximation, E(eY ) ≈ 1883.38, which is very close to the true value. 4.75 0.03125 4.77 (a) At most 4/9; (b) at least 5/9; (c) at least 21/25; (d) 10 Chapter 5 1􏰦k 21􏰦k 2 5.1μ=k xi,σ =k (xi−μ) i=1 i=1 5.3f(x)= 1,forx=1,2,...,10,andf(x)=0 elsewhere; 3/10 5.5 (a)0.0480;(b)0.2375;(c)P(X=5|p=0.3)= 0.1789, P = 0.3 is reasonable. 
Chapter 5
5.1 μ = (1/k) Σ_{i=1}^{k} x_i; σ² = (1/k) Σ_{i=1}^{k} (x_i − μ)²
5.3 f(x) = 1/10, for x = 1, 2, ..., 10, and f(x) = 0 elsewhere; 3/10
5.5 (a) 0.0480; (b) 0.2375; (c) P(X = 5 | p = 0.3) = 0.1789; p = 0.3 is reasonable.
5.7 (a) 0.0474; (b) 0.0171
5.9 (a) 0.7073; (b) 0.4613; (c) 0.1484
5.11 0.1240
5.13 0.8369
5.15 (a) 0.0778; (b) 0.3370; (c) 0.0870
5.17 μ = 3.5, σ² = 1.05
5.19 f(x1, x2, x3) = C(n; x1, x2, x3) 0.35^x1 0.05^x2 0.60^x3
5.21 0.0095
5.23 0.0077
5.25 0.8670
5.27 (a) 0.2852; (b) 0.9887; (c) 0.6083
5.29 5/14
5.31 h(x; 6, 3, 4) = C(4, x)C(2, 3 − x)/C(6, 3), for x = 1, 2, 3; P(2 ≤ X ≤ 3) = 4/5
5.33 (a) 0.3246; (b) 0.4496
5.35 0.9517
5.37 (a) 0.6815; (b) 0.1153
5.39 0.9453
5.41 0.6077
5.43 (a) 4/33; (b) 8/165
5.45 0.2315
5.47 (a) 0.3991; (b) 0.1316
5.49 0.0515
5.51 63/64
5.53 (a) 0.3840; (b) 0.0067
5.55 (a) 0.0630; (b) 0.9730
5.57 (a) 0.1429; (b) 0.1353
5.59 (a) 0.1638; (b) 0.032
5.61 0.2657
5.63 μ = 6, σ² = 6
5.65 (a) 0.2650; (b) 0.9596
5.67 (a) 0.8243; (b) 14
5.69 4
5.71 5.53 × 10⁻⁴; μ = 7.5
5.73 (a) 0.0137; (b) 0.0830
5.75 0.4686
Chapter 6
6.3 (a) 0.6; (b) 0.7; (c) 0.5
6.5 (a) 0.0823; (b) 0.0250; (c) 0.2424; (d) 0.9236; (e) 0.8133; (f) 0.6435
6.7 (a) 0.54; (b) −1.72; (c) 1.28
6.9 (a) 0.1151; (b) 16.1; (c) 20.275; (d) 0.5403
6.11 (a) 0.0548; (b) 0.4514; (c) 23 cups; (d) 189.95 milliliters
6.13 (a) 0.8980; (b) 0.0287; (c) 0.6080
6.15 (a) 0.0571; (b) 99.11%; (c) 0.3974; (d) 27.952 minutes; (e) 0.0092
6.17 6.24 years
6.19 (a) 51%; (b) $18.37
6.21 (a) 0.0401; (b) 0.0244
6.23 26 students
6.25 (a) 0.3085; (b) 0.0197
6.27 (a) 0.9514; (b) 0.0668
6.29 (a) 0.1171; (b) 0.2049
6.31 0.1357
6.33 (a) 0.0778; (b) 0.0571; (c) 0.6811
6.35 (a) 0.8749; (b) 0.0059
6.37 (a) 0.0228; (b) 0.3974
6.41 2.8e^(−1.8) − 3.4e^(−2.4) = 0.1545
6.43 (a) μ = 6, σ² = 18; (b) between 0 and 14.485 million liters
6.57 Mean = e⁶, variance = e¹²(e⁴ − 1)
6.59 (a) e⁻⁵; (b) β = 0.2
Chapter 7
7.1 g(y) = 1/3, for y = 1, 3, 5
7.3 g(y1, y2) = C(2; (y1 + y2)/2, (y1 − y2)/2, 2 − y1) (1/4)^((y1+y2)/2) (1/3)^((y1−y2)/2) (5/12)^(2−y1), for y1 = 0, 1, 2; y2 = −2, −1, 0, 1, 2; y2 ≤ y1; y1 + y2 = 0, 2, 4
7.7 Gamma distribution with α = 3/2 and β = m/(2b)
7.9 (a) g(y) = 32/y³, for y > 4; (b) 1/4
7.11 h(z) = 2(1 − z), for 0 < z < 1
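Most of the Chapter 5 probabilities can be reproduced with scipy.stats. The first lines below use the hypergeometric parameters of 5.31, whose stated answer is P(2 ≤ X ≤ 3) = 4/5; the binomial and Poisson lines use placeholder parameters, since each exercise supplies its own:

from scipy import stats

# h(x; 6, 3, 4): population M = 6, k = 4 successes, sample of N = 3
p = stats.hypergeom.pmf([2, 3], 6, 4, 3).sum()
print(p)  # 0.8, i.e., 4/5 as in answer 5.31

# Placeholder parameters for the other discrete models
print(stats.binom.pmf(5, n=15, p=0.3))   # a binomial pmf value
print(stats.poisson.cdf(3, mu=2.0))      # a Poisson cdf value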
Chapter 8
8.51 (a) 2.71; (b) 3.51; (c) 2.92; (d) 0.47; (e) 0.34
8.53 The F-ratio is 1.44. The variances are not significantly different.
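If the 8.51 entries are read as tabled distribution quantiles, values of that kind can be checked with scipy.stats; the degrees of freedom below are placeholders rather than the exercise's actual values:

from scipy import stats

# Placeholder degrees of freedom; each exercise specifies its own
print(stats.chi2.ppf(0.95, df=10))       # upper 5% chi-squared value
print(stats.f.ppf(0.95, dfn=5, dfd=8))   # upper 5% f value

# A sample F-ratio such as 8.53's 1.44 is compared against
# such a quantile to judge whether two variances differ.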
Chapter 9
9.1 56
9.3 0.3097 < μ < 0.3103
9.5 (a) 22,496 < μ < 24,504; (b) error ≤ 1004
9.7 35
9.9 10.15 < μ < 12.45
9.11 0.978 < μ < 1.033
9.13 47.722 < μ < 49.278
9.15 (13,075, 33,925)
9.17 (6.05, 16.55)
9.19 323.946 to 326.154
9.21 Upper prediction limit: 9.42; upper tolerance limit: 11.72
9.25 Yes, the value of 6.9 is outside of the prediction interval.
9.27 (a) (0.9876, 1.0174); (b) (0.9411, 1.0639); (c) (0.9334, 1.0716)
9.35 2.9 < μ1 − μ2 < 7.1
9.37 2.80 < μ1 − μ2 < 3.40
9.39 1.5 < μ1 − μ2 < 12.5
9.41 0.70 < μ1 − μ2 < 3.30
9.43 −6536 < μ1 − μ2 < 2936
9.45 (−0.74, 6.30)
9.47 (−6.92, 36.70)
9.49 0.54652 < μB − μA < 1.69348
9.51 Method 1: 0.194 < p < 0.262; method 2: 0.1957 < p < 0.2639
9.53 (a) 0.498 < p < 0.642; (b) error ≤ 0.072
9.55 (a) 0.739 < p < 0.961; (b) no
9.57 (a) 0.644 < p < 0.690; (b) error ≤ 0.023
9.59 2576
9.61 160
9.63 9604
9.65 −0.0136 < pF − pM < 0.0636
9.67 0.0011 < p1 − p2 < 0.0869
9.69 (−0.0849, 0.0013); not significantly different
9.71 0.293 < σ² < 6.736; valid claim
9.73 3.472 < σ² < 12.804
9.75 9.27 < σ < 34.16
9.77 0.549 < σ1/σ2 < 2.690
9.79 0.016 < σ1²/σ2² < 0.454; no
9.81 μ̂ = (1/n) Σ_{i=1}^{n} x_i
9.83 β̂ = x̄/5
9.85 θ̂ = max{x1, ..., xn}
9.87 ln L = x ln p + (1 − x) ln(1 − p); set the derivative with respect to p equal to 0; p̂ = x = 1.0
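The Chapter 9 interval answers follow the standard t-based recipe. A self-contained sketch with made-up data (not from any exercise):

import numpy as np
from scipy import stats

x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.3])   # hypothetical sample
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

# 95% two-sided interval: xbar +/- t(0.025, n-1) * s / sqrt(n)
t = stats.t.ppf(0.975, df=n - 1)
half = t * s / np.sqrt(n)
print(xbar - half, xbar + half)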
Chapter 10
10.1 (a) Conclude that less than 30% of the public is allergic to some cheese products when, in fact, 30% or more is allergic. (b) Conclude that at least 30% of the public is allergic to some cheese products when, in fact, less than 30% is allergic.
10.3 (a) The firm is not guilty; (b) the firm is guilty.
10.5 (a) 0.0559; (b) β = 0.0017; β = 0.00968; β = 0.5557
10.7 (a) 0.1286; (b) β = 0.0901; β = 0.0708; (c) The probability of a type I error is somewhat large.
10.9 (a) α = 0.0850; (b) β = 0.3410
10.11 (a) α = 0.1357; (b) β = 0.2578
10.13 α = 0.0094; β = 0.0122
10.15 (a) α = 0.0718; (b) β = 0.1151
10.17 (a) α = 0.0384; (b) β = 0.5; β = 0.2776
10.19 z = −2.76; yes, μ < 40 months; P-value = 0.0029
10.21 z = −1.64; P-value = 0.10
10.23 t = 0.77; fail to reject H0.
10.25 z = 8.97; yes, μ > 20,000 kilometers; P-value < 0.001
10.27 t = 12.72; P-value < 0.0005; reject H0.
10.29 t = −1.98; P-value = 0.0312; reject H0.
10.31 z = −2.60; conclude μA − μB ≤ 12 kilograms.
10.33 t = 1.50; there is not sufficient evidence to conclude that the increase in substrate concentration would cause an increase in the mean velocity of more than 0.5 micromole per 30 minutes.
10.35 t = 0.70; there is not sufficient evidence to support the conclusion that the serum is effective.
10.37 t = 2.55; reject H0; conclude that μ1 − μ2 > 4 kilometers.
10.39 t = 0.22; fail to reject H0.
10.41 t = 2.76; reject H0.
10.43 t = −2.53; reject H0; the claim is valid.
10.45 t = 2.48; P-value < 0.02; reject H0.
10.47 n = 6
10.49 78.28 ≈ 79
10.51 5
10.53 (a) H0: Mhot − Mcold = 0, H1: Mhot − Mcold ≠ 0; (b) paired t, t = 0.99; P-value > 0.30; fail to reject H0.
10.55 P-value = 0.4044 (with a one-tailed test); the claim is not refuted.
10.57 z = 1.44; fail to reject H0.
10.59 z = −5.06 with P-value ≈ 0; conclude that fewer than one-fifth of the homes are heated by oil.
10.61 z = 0.93 with P-value = P(Z > 0.93) = 0.1762; there is not sufficient evidence to conclude that the new medicine is effective.
10.63 z = 2.36 with P-value = 0.0182; yes, the difference is significant.
10.65 z = 1.10 with P-value = 0.1357; we do not have sufficient evidence to conclude that breast cancer is more prevalent in the urban community.
10.67 χ² = 18.13 with P-value = 0.0676 (from computer output); do not reject H0: σ² = 0.03.
10.69 χ² = 63.75 with P-value = 0.8998 (from computer output); do not reject H0.
10.71 χ² = 42.37 with P-value = 0.0117 (from computer output); machine is out of control.
10.73 f = 1.33 with P-value = 0.3095 (from computer output); fail to reject H0: σ1 = σ2.
10.75 f = 0.086 with P-value = 0.0328 (from computer output); reject H0: σ1 = σ2 at level greater than 0.0328.
10.77 f = 19.67 with P-value = 0.0008 (from computer output); reject H0: σ1 = σ2.
10.79 χ² = 10.14; reject H0, the ratio is not 5:2:2:1.
10.81 χ² = 4.47; there is not sufficient evidence to claim that the die is unbalanced.
10.83 χ² = 3.125; do not reject H0: geometric distribution.
10.85 χ² = 5.19; do not reject H0: normal distribution.
10.87 χ² = 5.47; do not reject H0.
10.89 χ² = 124.59; yes, occurrence of these types of crime is dependent on the city district.
10.91 χ² = 5.92 with P-value = 0.4332; do not reject H0.
10.93 χ² = 31.17 with P-value < 0.0001; attitudes are not homogeneous.
10.95 χ² = 1.84; do not reject H0.
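The one-sample t statistics and P-values reported throughout Chapter 10 can be reproduced as follows; the data and the null mean are hypothetical stand-ins:

import numpy as np
from scipy import stats

x = np.array([10.4, 11.1, 9.9, 10.8, 11.3, 10.6, 10.9, 11.0])
t_stat, p_two = stats.ttest_1samp(x, popmean=10)   # H0: mu = 10

# Halve the two-sided P-value for a one-tailed H1: mu > 10
print(t_stat, p_two / 2)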
Chapter 11
11.1 (a) b0 = 64.529, b1 = 0.561; (b) ŷ = 81.4
11.3 (a) ŷ = 5.8254 + 0.5676x; (c) ŷ = 34.205 at 50°C
11.5 (a) ŷ = 6.4136 + 1.8091x; (b) ŷ = 9.580 at temperature 1.75
11.7 (b) ŷ = 31.709 + 0.353x
11.9 (b) ŷ = 343.706 + 3.221x; (c) ŷ = $456 at advertising costs = $35
11.11 (b) ŷ = −1847.633 + 3.653x
11.13 (a) ŷ = 153.175 − 6.324x; (b) ŷ = 123 at x = 4.8 units
11.15 (a) s² = 176.4; (b) t = 2.04; fail to reject H0: β1 = 0.
11.17 (a) s² = 0.40; (b) 4.324 < β0 < 8.503; (c) 0.446 < β1 < 3.172
11.19 (a) s² = 6.626; (b) 2.684 < β0 < 8.968; (c) 0.498 < β1 < 0.637
11.21 t = −2.24; reject H0 and conclude β1 < 6.
11.23 (a) 24.438 < μ_{Y|24.5} < 27.106; (b) 21.88 < y0 < 29.66
11.25 7.81 < μ_{Y|1.6} < 10.81
11.27 (a) 17.1812 mpg; (b) no, the 95% confidence interval on mean mpg is (27.95, 29.60); (c) miles per gallon will likely exceed 18.
11.29 (b) ŷ = 3.4156x
11.31 The f-value for testing the lack of fit is 1.58, and the conclusion is that H0 is not rejected; the lack-of-fit test is insignificant.
11.33 (a) ŷ = 2.003x; (b) t = 1.40, fail to reject H0.
11.35 f = 1.71 and P-value = 0.2517; the regression is linear.
11.37 (a) b0 = 10.812, b1 = −0.3437; (b) f = 0.43; the regression is linear.
11.39 (a) P̂ = −11.3251 − 0.0449T; (b) yes; (c) R² = 0.9355; (d) yes
11.41 (b) N̂ = −175.9025 + 0.0902Y; R² = 0.3322
11.43 r = 0.240
11.45 (a) r = −0.979; (b) P-value = 0.0530; do not reject H0 at 0.025 level; (c) 95.8%
11.47 (a) r = 0.784; (b) reject H0 and conclude that ρ > 0; (c) 61.5%
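The Chapter 11 fitted lines are ordinary least squares; a minimal sketch with hypothetical (x, y) pairs:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical regressor
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])   # hypothetical response

# b1 = Sxy / Sxx and b0 = ybar - b1 * xbar
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
print(b0, b1)            # coefficients of yhat = b0 + b1 * x
print(b0 + b1 * 3.5)     # predicted mean response at x = 3.5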

Chapter 12
12.1 ŷ = 0.5800 + 2.7122x1 + 2.0497x2
12.3 (a) ŷ = 27.547 + 0.922x1 + 0.284x2; (b) ŷ = 84 at x1 = 64 and x2 = 4
12.5 (a) ŷ = −102.7132 + 0.6054x1 + 8.9236x2 + 1.4374x3 + 0.0136x4; (b) ŷ = 287.6
12.7 ŷ = 141.6118 − 0.2819x + 0.0003x²
12.9 (a) ŷ = 56.4633 + 0.1525x − 0.00008x²; (b) ŷ = 86.7% when temperature is at 225°C
12.11 ŷ = −6.5122 + 1.9994x1 − 3.6751x2 + 2.5245x3 + 5.1581x4 + 14.4012x5
12.13 (a) ŷ = 350.9943 − 1.2720x1 − 0.1539x2; (b) ŷ = 140.9
12.15 ŷ = 3.3205 + 0.4210x1 − 0.2958x2 + 0.0164x3 + 0.1247x4
12.17 0.1651
12.19 242.72
12.21 (a) σ̂²_B2 = 28.0955; (b) σ̂_B1,B2 = −0.0096
12.23 t = 5.91 with P-value = 0.0002; reject H0 and claim that β1 ≠ 0.
12.25 0.4516 < μ_{Y|x1=900,x2=1} < 1.2083 and −0.1640 < y0 < 1.8239
12.27 263.7879 < μ_{Y|x1=75,x2=24,x3=90,x4=98} < 311.3357 and 243.7175 < y0 < 331.4062
12.29 (a) t = −1.09 with P-value = 0.3562; (b) t = −1.72 with P-value = 0.1841; (c) yes; not sufficient evidence to show that x1 and x2 are significant
12.31 R² = 0.9997
12.33 f = 5.106 with P-value = 0.0303; the regression is not significant at level 0.01.
12.35 f = 34.90 with P-value = 0.0002; reject H0 and conclude β1 > 0.
12.37 f = 10.18 with P-value < 0.01; x1 and x2 are significant in the presence of x3 and x4.
12.39 The two-variable model is better.
12.41 Using x2 alone is not much different from using x1 and x2 together, since the R²adj values are 0.7696 versus 0.7591, respectively.
12.43 (a) m̂pg = 5.9593 − 0.00003773 odometer + 0.3374 octane − 12.6266z1 − 12.9846z2; (b) sedan; (c) they are not significantly different.
12.45 (b) ŷ = 4.690 seconds; (c) 4.450 < μ_{Y|{180,260}} < 4.930
12.47 ŷ = 2.1833 + 0.9576x2 + 3.3253x3
12.49 (a) ŷ = −587.211 + 428.433x; (b) ŷ = 1180 − 191.691x + 35.20945x²; (c) quadratic model
12.51 First model: R²adj = 92.7%, C.V. = 9.0385; second model: R²adj = 98.1%, C.V. = 4.6287. The partial F-test shows P-value = 0.0002; model 2 is better.
12.53 σ̂²_B1 = 20,588; σ̂²_B11 = 62.6502; σ̂_B1,B11 = −1103.5
12.55 (a) The intercept model is the best.
12.57 (a) ŷ = 3.1368 + 0.6444x1 − 0.0104x2 + 0.5046x3 − 0.1197x4 − 2.4618x5 + 1.5044x6; (b) ŷ = 4.6563 + 0.5133x3 − 0.1242x4; (c) Cp criterion: variables x1 and x2 with s² = 0.7317 and R² = 0.6476; s² criterion: variables x1, x3, and x4 with s² = 0.7251 and R² = 0.6726; (d) ŷ = 4.6563 + 0.5133x3 − 0.1242x4; this one does not lose much in s² and R²; (e) two observations have large R-Student values and should be checked.
12.59 (a) ŷ = 125.8655 + 7.7586x1 + 0.0943x2 − 0.0092x1x2; (b) the model with x2 alone is the best.
12.61 (a) p̂ = (1 + e^(2.9949 − 0.0308x))⁻¹; (b) 1.8515
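The Chapter 12 multiple-regression coefficients come from the least squares normal equations; a sketch using a hypothetical two-regressor data set:

import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([4.1, 4.9, 8.2, 8.8, 12.1, 12.7])

# Design matrix with an intercept column; minimize ||Xb - y||^2
X = np.column_stack([np.ones_like(x1), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # b0, b1, b2 in yhat = b0 + b1*x1 + b2*x2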
Chapter 13
13.1 f = 0.31; not sufficient evidence to support the hypothesis that there are differences among the 6 machines.
13.3 f = 14.52; yes, the difference is significant.
13.5 f = 8.38; the average specific activities differ significantly.
13.7 f = 2.25; not sufficient evidence to support the hypothesis that the different concentrations of MgNH4PO4 significantly affect the attained height of chrysanthemums.
13.9 b = 0.79 > b4(0.01, 4, 4, 4, 9) = 0.4939. Do not reject H0; there is not sufficient evidence to claim that the variances are different.
13.11 b = 0.7822 < b4(0.05, 9, 8, 15) = 0.8055. The variances are significantly different.
13.13 (a) P-value < 0.0001, significant; (b) for contrast 1 vs. 2, P-value < 0.0001, significantly different; for contrast 3 vs. 4, P-value = 0.0648, not significantly different.
13.15 Results of Tukey's tests are given below.
ȳ4. = 2.98, ȳ3. = 4.30, ȳ1. = 5.44, ȳ5. = 6.96, ȳ2. = 7.90
13.17 (a) P-value = 0.0121; yes, there is a significant difference; (b) Tukey grouping of the substrate removal methods: Depletion, Modified Hess, Kicknet, Surber.
13.19 f = 70.27 with P-value < 0.0001; reject H0. Tukey grouping: x̄0 = 55.167, x̄25 = 60.167, x̄100 = 64.167, x̄75 = 70.500, x̄50 = 72.833. Temperature is important; both 75°C and 50°C yielded batteries with significantly longer activated life.
13.21 The mean absorption is significantly lower for aggregate 4 than for the other aggregates.
13.23 Comparing the control to 1 and 2: significant; comparing the control to 3 and 4: insignificant.
13.25 f(fertilizer) = 6.11; there is significant difference among the fertilizers.
13.27 f = 5.99; the percent of foreign additives is not the same for all three brands of jam; brand A.
13.29 P-value < 0.0001; significant
13.31 P-value = 0.0023; significant
13.33 P-value = 0.1250; not significant
13.35 P-value < 0.0001; f = 122.37; the amount of dye has an effect on the color of the fabric.
13.37 (a) yij = μ + Ai + εij, Ai ∼ n(x; 0, σα), εij ∼ n(x; 0, σ); (b) σ̂²α = 0 (the estimated variance component is −0.00027); σ̂² = 0.0206.
13.39 (a) f = 14.9; operators differ significantly; (b) σ̂²α = 28.91; s² = 8.32.
13.41 (a) yij = μ + Ai + εij, Ai ∼ n(x; 0, σα); (b) yes; f = 5.63 with P-value = 0.0121; (c) there is a significant loom variance component.
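The single-factor f statistics of Chapter 13 are one-way ANOVA ratios; a sketch with three hypothetical treatment groups:

from scipy import stats

a = [55, 58, 54, 61]   # hypothetical treatment 1
b = [66, 63, 68, 64]   # hypothetical treatment 2
c = [60, 59, 62, 57]   # hypothetical treatment 3

f_stat, p_value = stats.f_oneway(a, b, c)
print(f_stat, p_value)   # compare with a tabled f and P-value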
Chapter 14
14.1 (a) f = 8.13; significant; (b) f = 5.18; significant; (c) f = 1.63; insignificant
14.3 (a) f = 14.81; significant; (b) f = 9.04; significant; (c) f = 0.61; insignificant
14.5 (a) f = 34.40; significant; (b) f = 26.95; significant; (c) f = 20.30; significant
14.7 Test for effect of temperature: f1 = 10.85 with P-value = 0.0002; test for effect of amount of catalyst: f2 = 46.63 with P-value < 0.0001; test for effect of interaction: f = 2.06 with P-value = 0.074.
14.9 (a)
Source of Variation   df   Sum of Squares   Mean Squares      f        P
Cutting speed          1          12.000         12.000     1.32   0.2836
Tool geometry          1         675.000        675.000    74.31   < 0.0001
Interaction            1         192.000        192.000    21.14   0.0018
Error                  8          72.667          9.083
Total                 11         951.667
(b) The interaction effect masks the effect of cutting speed. (c) f(tool geometry = 1) = 16.51 and P-value = 0.0036; f(tool geometry = 2) = 5.94 and P-value = 0.0407.
14.11 (a)
Source of Variation   df   Sum of Squares   Mean Squares      f        P
Method                 1        0.000104       0.000104     6.57   0.0226
Laboratory             6        0.008058       0.001343    84.70   < 0.0001
Interaction            6        0.000198       0.000033     2.08   0.1215
Error                 14        0.000222       0.000016
Total                 27        0.008582
(b) The interaction is not significant; (c) both main effects are significant; (e) f(laboratory = 1) = 0.01576 and P-value = 0.9019; no significant difference between the methods in laboratory 1; f = 9.081 and P-value = 0.0093.
14.13 (b)
Source of Variation   df   Sum of Squares   Mean Squares       f        P
Time                   1        0.060208       0.060208   157.07   < 0.0001
Treatment              1        0.060208       0.060208   157.07   < 0.0001
Interaction            1        0.000008       0.000008     0.02   0.8864
Error                  8        0.003067       0.000383
Total                 11        0.123492
(c) Both time and treatment influence the magnesium uptake significantly, although there is no significant interaction between them. (d) Y = μ + βT Time + βZ Z + βTZ Time Z + ε, where Z = 1 when treatment = 1 and Z = 0 when treatment = 2; (e) f = 0.02 with P-value = 0.8864; the interaction in the model is insignificant.
14.15 (a) Interaction is significant at a level of 0.05, with a P-value of 0.0166; (b) both main effects are significant.
14.17 (a) AB: f = 3.83, significant; AC: f = 3.79, significant; BC: f = 1.31, not significant; ABC: f = 1.63, not significant; (b) A: f = 0.54, not significant; B: f = 6.85, significant; C: f = 2.15, not significant; (c) the presence of AC interaction masks the main effect C.
14.19 (a) Stress: f = 45.96 with P-value < 0.0001; coating: f = 0.05 with P-value = 0.8299; humidity: f = 2.13 with P-value = 0.1257; coating × humidity: f = 3.41 with P-value = 0.0385; coating × stress: f = 0.08 with P-value = 0.9277; humidity × stress: f = 3.15 with P-value = 0.0192; coating × humidity × stress: f = 1.93 with P-value = 0.1138. (b) The best combination appears to be uncoated, medium humidity, and a stress level of 20.
14.21
Effect           f       P
Temperature    14.22   0.0001
Surface         6.70   0.0020
HRC             1.67   0.1954
T × S           5.50   0.0006
T × HRC         2.69   0.0369
S × HRC         5.41   0.0007
T × S × HRC     3.02   0.0051
14.23 (a) Yes; brand × type and brand × temperature; (b) yes; (c) brand Y, powdered detergent, hot temperature.
14.25 (a)
Effect                       f        P
Time                    543.53   < 0.0001
Temp                    209.79   < 0.0001
Solvent                   4.97   0.0457
Time × Temp               2.66   0.1103
Time × Solvent            2.04   0.1723
Temp × Solvent            0.03   0.8558
Time × Temp × Solvent     6.22   0.0140
(b) Although three two-way interactions are shown to be insignificant, they may be masked by the significant three-way interaction.
14.27 (a) f = 1.49; no significant interaction; (b) f(operators) = 12.45, significant; f(filters) = 8.39, significant; (c) σ̂²α = 0.1777 (filters); σ̂²β = 0.3516 (operators); s² = 0.185
14.29 (a) σ̂²β, σ̂²γ, and σ̂²αγ are significant; (b) σ̂²γ and σ̂²αγ are significant.
14.31 (a) Mixed model; (b) material: f = 47.42 with P-value < 0.0001; brand: f = 1.73 with P-value = 0.2875; material × brand: f = 16.06 with P-value = 0.0004; (c) no
Chapter 15
15.1 B and C are significant at level 0.05.
15.3 Factors A, B, and C have negative effects on the phosphorus compound, and factor D has a positive effect. However, the interpretation of the effect of individual factors should involve the use of interaction plots.
15.5 Significant effects: A: f = 9.98; BC: f = 19.03. Insignificant effects: B: f = 0.20; C: f = 6.54; D: f = 0.02; AB: f = 1.83; AC: f = 0.20; AD: f = 0.57; BD: f = 1.83; CD: f = 0.02. Since the BC interaction is significant, both B and C would be investigated further.
15.7 (a) x2, x3, x1x2, and x1x3; (b) curvature: P-value = 0.0038; (c) one additional design point different from the original ones (0, −1), (0, 1), (−1, 0), (1, 0) might be used.
15.9 (a) bA = 5.5, bB = −3.25, and bAB = 2.5; (b) the values of the coefficients are one-half those of the effects; (c) tA = 5.99 with P-value = 0.0039; tB = −3.54 with P-value = 0.0241; tAB = 2.72 with P-value = 0.0529; t² = F.
15.11 (a) A = −0.8750, B = 5.8750, C = 9.6250, AB = −3.3750, AC = −9.6250, BC = 0.1250, and ABC = −1.1250; B, C, AB, and AC appear important based on their magnitude. (b) P-values: A 0.7528; B 0.0600; C 0.0071; AB 0.2440; AC 0.0071; BC 0.9640; ABC 0.6861. (c) Yes. (d) At a high level of A, C essentially has no effect; at a low level of A, C has a positive effect.
15.13 (a) Machine: 1, 2, 3, 4; (b) ABD, CDE, ABCDE (one possible design)
15.15 With BCD as the defining contrast, the principal block contains (1), a, bc, abc, bd, abd, cd, acd. Defining contrast BCD produces the following aliases: A ≡ ABCD, B ≡ CD, C ≡ BD, D ≡ BC, AB ≡ ACD, AC ≡ ABD, and AD ≡ ABC. Since AD and ABC are confounded with blocks, there are only 2 degrees of freedom for error from the interactions not confounded:
Source of Variation   Degrees of Freedom
A                      1
B                      1
C                      1
D                      1
Blocks                 1
Error                  2
Total                  7
15.17 With the defining contrasts ABCE and ABDF, the principal block contains (1), ab, acd, bcd, ce, abce, ade, bde, acf, bcf, df, abdf, aef, bef, cdef, abcdef; the aliases are
A ≡ BCE ≡ BDF ≡ ACDEF
B ≡ ACE ≡ ADF ≡ BCDEF
C ≡ ABE ≡ DEF ≡ ABCDF
D ≡ ABF ≡ CEF ≡ ABCDE
E ≡ ABC ≡ CDF ≡ ABDEF
F ≡ ABD ≡ CDE ≡ ABCEF
AB ≡ CE ≡ DF ≡ ABCDEF
AC ≡ BE ≡ BCDF ≡ ADEF
AD ≡ BF ≡ BCDE ≡ ACEF
AE ≡ BC ≡ BDEF ≡ ACDF
AF ≡ BD ≡ ACDE ≡ BCEF
CD ≡ EF ≡ ABDE ≡ ABCF
DE ≡ CF ≡ ABCD ≡ ABEF
BCD ≡ ADE ≡ ACF ≡ BEF
BCF ≡ ACD ≡ AEF ≡ BDE
15.19
Block 1: (1), ab, cd, ce, de, abcd, abce, abde, a, b, acd, ace, ade, bcd, bce, bde
Block 2: c, d, e, abc, abd, abe, cde, abcde, ac, ad, ae, bc, bd, be, acde, bcde
15.21 (a) Confounded by ABC; (b) Block 1: (1), bc, abd, acd; Block 2: a, abc, bd, cd.
15.23
Source   df        SS        MS      f        P
A         1    6.1250    6.1250   5.81   0.0949
B         1    0.6050    0.6050   0.57   0.5036
C         1    4.8050    4.8050   4.56   0.1223
D         1    0.2450    0.2450   0.23   0.6626
Error     3    3.1600    1.0533
Total     7   14.9400
15.25
Source   df            SS            MS         f        P
A         1    388,129.00    388,129.00   3585.49   0.0001
B         1    277,202.25    277,202.25   2560.76   0.0001
C         1       4692.25       4692.25     43.35   0.0006
D         1       9702.25       9702.25     89.63   0.0001
E         1       1806.25       1806.25     16.69   0.0065
AD        1       1406.25       1406.25     12.99   0.0113
AE        1        462.25        462.25      4.27   0.0843
BD        1       1156.00       1156.00     10.68   0.0171
BE        1        961.00        961.00      8.88   0.0247
Error     6        649.50        108.25
Total    15    686,167.00
All main effects are significant at the 0.05 level; AD, BD, and BE are also significant at the 0.05 level.
15.27 Source of variation (each with 1 degree of freedom): A, B, C, D, E, F, AB, AC, AD, BC, BD, CD, CF; Error 2; Total 15.
15.29 AFG, BEG, CDG, DEF, CEFG, BDFG, BCDE, ADEG, ACDF, ABCDEFG
15.31 x1 = 1 and x2 = 1
15.33 (a) Yes; (b), (c) velocity at low level; (d) velocity at low level; (e) yes
15.35 ŷ = 12.7519 + 4.7194x1 + 0.8656x2 − 1.4156x3; units are centered and scaled; test for lack of fit, F = 81.58, with P-value < 0.0001.
15.37 (i) E(ŷ) = 79.00 + 5.281A; (ii) Var(ŷ) = 6.22²σ²Z + 5.70²A²σ²Z + 2(6.22)(5.70)Aσ²Z
Chapter 16
16.1 x = 7 with P-value = 0.1719; fail to reject H0.
16.3 x = 3 with P-value = 0.0244; reject H0.
16.5 x = 4 with P-value = 0.3770; fail to reject H0.
16.7 x = 4 with P-value = 0.1335; fail to reject H0.
16.9 w = 43; fail to reject H0.
16.11 w+ = 17.5; fail to reject H0.
16.13 w+ = 15 with n = 13; reject H0 in favor of μ̃1 − μ̃2 < 8.
16.15 u1 = 4; claim is not valid.
16.17 u2 = 5; A operates longer.
16.19 u = 15; fail to reject H0.
16.21 h = 10.58; operating times are different.
16.23 v = 7 with P-value = 0.910; random sample.
16.25 v = 6 with P-value = 0.044; fail to reject H0.
16.27 v = 4; random sample.
16.29 0.70
16.31 0.995
16.33 (a) rs = 0.39; (b) fail to reject H0.
16.35 (a) rs = 0.72; (b) reject H0, so ρ > 0.
16.37 (a) rs = 0.71; (b) reject H0, so ρ > 0.
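The signed-rank answers in Chapter 16 (w and w+) can be reproduced with scipy.stats.wilcoxon; the paired differences here are hypothetical:

from scipy import stats

d = [1.2, -0.4, 0.8, 2.1, -0.3, 1.5, 0.9, 1.1]   # hypothetical differences
w, p = stats.wilcoxon(d)   # H0: the median difference is 0
print(w, p)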
Chapter 18
18.1 p* = 0.173
18.3 (a) π(p | x = 1) = 40p(1 − p)³/0.2844, for 0.05 < p < 0.15; (b) p* = 0.106
18.5 (a) beta(95, 45); (b) 1
18.7 8.077 < μ < 8.692
18.9 (a) 0.2509; (b) 68.71 < μ < 71.69; (c) 0.0174
18.13 p* = (x + 2)/6
18.15 2.21
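The Chapter 18 estimates are posterior summaries. Assuming a beta(2, 2) prior and n = 2 binomial trials (the combination under which the posterior mean takes the (x + 2)/6 form shown in 18.13), a sketch:

from scipy import stats

alpha, beta, x, n = 2, 2, 1, 2   # prior parameters and data (illustrative)
posterior = stats.beta(alpha + x, beta + n - x)   # beta posterior for p

print(posterior.mean())           # Bayes estimate under squared-error loss
print(posterior.interval(0.95))   # 95% posterior interval for p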