
ISE 562, Dr. Smith: Bayes Methods and Decision Theory (9/4/2022)

Differences between discrete and continuous random variables

• For a continuous random variable, probability is not the height of the function but the area under the curve over an interval.
• Discrete case: p(x) gives P(X = a) directly.


• The total area under the density function equals 1.
• Continuous case: f(x) gives P(a ≤ X ≤ b) as the area under f between a and b.
• Require f(X) ≥ 0 for all X (but X itself can be < 0).

Probability vs. Cumulative Probability: PDFs and CDFs

PDF: P(a ≤ X ≤ b) = ∫ₐᵇ f(ξ) dξ,  f(X) = dF/dX
CDF: F(x) = P(X ≤ x) = ∫₋∞ˣ f(ξ) dξ

• P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a)
• P(X > a) = 1 − P(X ≤ a)
• P(X ≥ a) = 1 − P(X < a)

Empirical pdfs

Example: fit a pdf of the form f(X) = kX on 0 ≤ X ≤ 2 (suggested by a histogram). To find k, set the total area to 1:

∫₀² kX dX = k·X²/2 |₀² = 2k = 1, so k = 1/2 and f(X) = X/2 on 0 ≤ X ≤ 2.

Moments:

E[X] = ∫₀² X·(X/2) dX = X³/6 |₀² = 8/6 = 4/3
E[X²] = ∫₀² X²·(X/2) dX = X⁴/8 |₀² = 16/8 = 2

Cumulative Distribution Function (CDF):

F(x) = ∫₀ˣ (ξ/2) dξ = x²/4, 0 ≤ x ≤ 2

Expected value of a function, Y = Cost(X) = aX + b:

E[Cost(X)] = ∫₀² (aX + b)(X/2) dX = ∫₀² (aX²/2 + bX/2) dX = [aX³/6 + bX²/4]₀² = 4a/3 + b

Or use the properties of expectation:

E[Cost(X)] = E[aX + b] = aE[X] + b = (4/3)a + b

PDFs with more than one random variable

f(X, Y) is called the joint probability density of X and Y. Marginal pdfs are obtained by integrating one variable out of the function:

f(X) = ∫ f(X, Y) dY,  f(Y) = ∫ f(X, Y) dX

Example: suppose we have a reservoir fed by 2 rivers. Let X = stream flow in river X and Y = flow in river Y. The joint pdf of flows is:

f(X, Y) = c(4000 − X)/4000, 0 ≤ X ≤ 4000, 0 ≤ Y ≤ 2000, with c = 2.5 × 10⁻⁷

Marginals:

f(X) = ∫₀²⁰⁰⁰ c(4000 − X)/4000 dY = c(4000 − X)/2
f(Y) = ∫₀⁴⁰⁰⁰ c(4000 − X)/4000 dX = 2000c

E[X + Y] = ∫₀²⁰⁰⁰ ∫₀⁴⁰⁰⁰ (X + Y)·c(4000 − X)/4000 dX dY

Conditional Probability

f(X|Y) = f(X, Y)/f(Y)

Bayes rule is derived from this conditional probability relationship, where Y represents the sample information (observations) and X is the random (decision) variable.
Bayes for Continuous Random Variables

i) Conditional probability: f(θ|y) = f(θ, y)/f(y)
ii) As in the discrete case: f(θ, y) = f(θ)·f(y|θ)
iii) Integrating out θ: f(y) = ∫ f(θ, y) dθ = ∫ f(θ)·f(y|θ) dθ

Substituting (ii) and (iii) in (i):

f(θ|y) = f(θ)·f(y|θ) / ∫ f(θ)·f(y|θ) dθ

posterior pdf ∝ (prior pdf)(likelihood)

Sufficiency

• We denote an uncertain decision variable as θ and assume that sample information involving θ can be summarized by a sample statistic y.
• If y has all the information from the sample relevant to the uncertainty about θ, then y is called a sufficient statistic.
• Example: for a Bernoulli process, the sample information can be summarized by n and r; the actual sequence of successes and failures adds no additional information about p, which is estimated by p̂ = r/n.

Why do we care about sufficiency? For Bayes rule this means that knowledge of n and r is sufficient to determine the likelihoods, so the posterior distribution of p given n and r is exactly the same as the posterior distribution of p given the entire sequence of observations. Simply put, everything we need to know about the sample is contained in the likelihood function.

Example: Let θ = market share of a new product (0 ≤ θ ≤ 1); assume it is continuous with prior pdf:

f(θ) = 2(1 − θ), 0 ≤ θ ≤ 1

• We take a sample of 5 consumers; 1 buys the new brand while the other 4 purchase a different brand.
• Assume a binomial likelihood with success = "buys new product", so:

f(y|θ) = P(r = 1 | n = 5, p = θ) = (5!/(4!·1!)) θ¹(1 − θ)⁴ = 5θ(1 − θ)⁴
Applying Bayes rule, we substitute the prior and likelihood functions:

f(θ|y) = f(θ)·f(y|θ) / ∫₀¹ f(θ)·f(y|θ) dθ
       = 2(1 − θ)·5θ(1 − θ)⁴ / ∫₀¹ 2(1 − θ)·5θ(1 − θ)⁴ dθ
       = 10θ(1 − θ)⁵ / ∫₀¹ 10θ(1 − θ)⁵ dθ
       = θ(1 − θ)⁵ / ∫₀¹ θ(1 − θ)⁵ dθ

Since ∫₀¹ x^(m−1)(1 − x)^(n−1) dx = Γ(m)Γ(n)/Γ(m + n), where Γ(n) = (n − 1)!,

∫₀¹ θ(1 − θ)⁵ dθ = Γ(2)Γ(6)/Γ(8) = 1!·5!/7! = 1/42

Posterior pdf: f(θ|y) = 42θ(1 − θ)⁵

Notice this is a function of θ. In the discrete Bayes case we had specific values of p for the states; here the state variable can take any value in [0, 1].

• As shown, the process can be computationally challenging.
• The integral may not be computable in closed form; numerical quadrature techniques may be needed.
• There is another way: conjugate families of distributions simplify the process of combining priors and likelihoods.

Properties of conjugate families of pdfs

1. Tractability: easy to specify the posterior given the prior and likelihood function.
2. Richness: the prior should reflect the prior information (this is done with parameters that fit the distribution to the information).
3. Ease of interpretation: the prior should be interpretable in terms of previous sample results.

Two conjugate families are studied here:

1. Sampling from a Bernoulli process, whose conjugate is the family of beta distributions.
2. Sampling from a normally distributed process with known variance, whose conjugate is the family of normal distributions.

Beta/binomial

"Conjugate" family refers to the relationship between the prior and the likelihood function. Sampling from a Bernoulli process (the likelihood function) has as its conjugate the family of beta distributions:

f(p) = (n − 1)!/((r − 1)!(n − r − 1)!) · p^(r−1)(1 − p)^(n−r−1), 0 ≤ p ≤ 1

(Note that the random variable p varies from zero to one for the beta pdf.)
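The normalizing constant 42 in the market-share posterior can be recovered numerically. This is a sketch (not from the slides; `integrate` is a hypothetical midpoint-rule helper):

```python
# Recover the posterior 42*theta*(1-theta)^5 for the market-share example.
def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

prior = lambda t: 2 * (1 - t)                    # f(theta)
likelihood = lambda t: 5 * t * (1 - t) ** 4      # f(y|theta), r=1, n=5
evidence = integrate(lambda t: prior(t) * likelihood(t), 0, 1)
posterior = lambda t: prior(t) * likelihood(t) / evidence

# The posterior has the form C*theta*(1-theta)^5; recover C at theta = 0.5.
const = posterior(0.5) / (0.5 * (1 - 0.5) ** 5)  # ~42
```

The quadrature reproduces the beta-function result 1!·5!/7! = 1/42 without any closed-form work, which is exactly the fallback the slides mention when no conjugate pair is available.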
If n, r are not integers we must use gamma functions:

f(p) = Γ(n)/(Γ(r)Γ(n − r)) · p^(r−1)(1 − p)^(n−r−1)

Γ(t) = ∫₀^∞ x^(t−1) e^(−x) dx, t > 0; note: if t is an integer, Γ(t) = (t − 1)!

Mean and variance of the beta distribution:

μ = E(p̂ | r, n) = r/n
σ² = V(p̂ | r, n) = r(n − r)/(n²(n + 1))

To calculate probabilities we use fractiles: the f fractile of the pdf of a continuous random variable is the value x_f where P(X ≤ x_f) = f.

Example: suppose we have a beta pdf of p with r = 5 and n = 8 and we want P(p ≤ 0.562 | r = 5, n = 8) and P(0.265 ≤ p ≤ 0.562 | r = 5, n = 8).

• Using the beta calculator (back of book): P(p ≤ 0.562 | r = 5, n = 8) = 0.3402
• For P(0.265 ≤ p ≤ 0.562 | r = 5, n = 8), use P(a ≤ x ≤ b) = P(x ≤ b) − P(x ≤ a):
  = P(p ≤ 0.562) − P(p ≤ 0.265)
  = 0.3402 − 0.0167 = 0.3235
• The fractiles (on the lhs of the calculator) can be used to enter a probability and then look up p.

Notation Alert!

• Prior pdfs and their parameters are denoted with single primes; e.g., p′, f′(θ), E′[·], σ′, ...
• Posterior pdfs and their parameters are denoted with double primes; e.g., p″, f″(θ), E″[·], σ″, ...

Now the big advantage of conjugate pdfs...

Given a binomial process (stationarity and independence) and a beta prior pdf of the form

f′(p) = (n′ − 1)!/((r′ − 1)!(n′ − r′ − 1)!) · p^(r′−1)(1 − p)^(n′−r′−1),

suppose we draw a sample for the binomial likelihood function with r successes in n trials. Then the posterior pdf becomes

f″(p) = (n″ − 1)!/((r″ − 1)!(n″ − r″ − 1)!) · p^(r″−1)(1 − p)^(n″−r″−1), 0 ≤ p ≤ 1

with n″ = n′ + n and r″ = r′ + r.
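The two probabilities in the r = 5, n = 8 example can be reproduced without the printed beta calculator. This sketch (not from the slides) integrates the beta pdf in the book's (r, n) parameterization numerically:

```python
# Probabilities for the Beta example with r = 5, n = 8.
from math import factorial

def beta_pdf(p, r, n):
    """Beta pdf in the book's (r, n) parameterization."""
    c = factorial(n - 1) / (factorial(r - 1) * factorial(n - r - 1))
    return c * p ** (r - 1) * (1 - p) ** (n - r - 1)

def beta_cdf(x, r, n, steps=200_000):
    """P(p <= x) by midpoint-rule integration of the pdf."""
    h = x / steps
    return sum(beta_pdf((i + 0.5) * h, r, n) for i in range(steps)) * h

p_low = beta_cdf(0.562, 5, 8)            # ~0.3402
p_band = p_low - beta_cdf(0.265, 5, 8)   # ~0.3402 - 0.0167 = 0.3235
```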
Beta/binomial example: Let θ = market share of a new product (0 ≤ θ ≤ 1) with the beta prior

f′(θ) = 2(1 − θ), 0 ≤ θ ≤ 1

E′[θ] = ∫₀¹ θ·2(1 − θ) dθ = 1/3

• Matching the beta mean E[p] = r′/n′ = 1/3 gives r′ = 1 and n′ = 3.
• The sample results were r = 1 and n = 5, so the posterior pdf has parameters:
  r″ = r′ + r = 1 + 1 = 2
  n″ = n′ + n = 3 + 5 = 8
• So the mean went from a prior value of 1/3 (0.33) to a posterior value of 2/8 = 1/4 (0.25); it shifted to the left.

Notice also the shift in variance:

• Prior variance = r′(n′ − r′)/(n′²(n′ + 1)) = 1(3 − 1)/(9·4) = 1/18 = 0.055
• Posterior variance = 2(8 − 2)/(64·9) = 12/576 = 1/48 = 0.021

• The posterior mean always lies between the prior mean and the sample mean for the Bernoulli/beta family.
• The posterior variance is generally smaller than the prior variance due to the addition of sample information to prior knowledge.

Normal Conjugate

• "Conjugate" family refers to the relationship between the prior and the likelihood function.
• Sampling from a normal process (the likelihood function) has as its conjugate the family of normal distributions:

f(x) = 1/(σ√(2π)) · e^(−(x − μ)²/(2σ²))

With z = (x − μ)/σ, f(x | μ, σ²) transforms to the standard normal f(z | 0, 1):

f(z) = 1/√(2π) · e^(−z²/2)

Standard normal distribution:

P(X ≤ a) = ∫₋∞ᵃ f(X) dX = P(z ≤ (a − μ)/σ) = Φ((a − μ)/σ), from the standard normal table.

The table gives the area below Z = (a − μ)/σ for a standard normal centered at 0. Depending on the form of the table, the symmetry properties of the normal can be used to calculate various probabilities:

• P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a)
• P(X > a) = 1 − P(X ≤ a)
• P(X ≥ a) = 1 − P(X < a)

Two types of normal pdf problems:

1. Given X, μ, σ², find P(X)
   a) Transform the question into a probability statement
   b) Translate X to Z using Z = (X − μ)/σ
   c) Look up P(Z) in the standard normal table
2. Given P(X), find X, μ, or σ²
   a) Look up P(Z) in the body of the table to find Z
   b) Solve Z = (X − μ)/σ for the unknown quantity

Type 1 example (find probability): A carpet warehouse keeps 6000 yards of carpet in stock during a month. The demand is normally distributed with mean 4500 yards and standard deviation 900 yards. What is the probability a customer order won't be met?

We want P(d ≥ 6000):
P(d ≥ 6000) = P(z ≥ (6000 − 4500)/900) = P(z ≥ 1.67) = 1 − P(z ≤ 1.67) = 1 − 0.9525 = 0.0475

Type 2 example (find unknown): The amount of coffee a filling machine puts into 4 oz. jars is normally distributed with std. dev. σ = 0.04 oz. If only 2% of the jars are to contain less than 4 oz., what should be the average fill amount?

• We want P(X ≤ 4) = 0.02.
• Find Z such that P(Z ≤ z) = 0.02; from the table, Z = −2.05.
• Solve −2.05 = (4 − μ)/0.04 for μ: μ = 4.082 oz.

Back to Bayes... Normal Conjugate

• μ and σ are summary measures of the normal pdf.
• For the binomial, r and n are summary measures of the sample information.
• For the normal, the sample mean and sample variance are summary measures for a sample from a normal population.
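Both table-lookup problem types can be done in code with `math.erf` in place of the printed table. This is a sketch (not from the slides; `phi_inv` is a hypothetical bisection helper standing in for the reverse table lookup):

```python
# The two normal-table problem types, using math.erf instead of a table.
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Type 1: P(demand >= 6000) for demand ~ N(4500, 900^2).
p_short = 1 - phi((6000 - 4500) / 900)   # ~0.0475

# Type 2: find mu so that P(X < 4) = 0.02 with sigma = 0.04.
def phi_inv(p, lo=-10.0, hi=10.0):
    """Invert phi by bisection (phi is increasing)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mu = 4 - phi_inv(0.02) * 0.04            # ~4.082 oz
```

Tiny differences from the slide answers come from the table's two-decimal rounding of Z (1.67 and −2.05).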
• If x₁, ..., xₙ represent a random sample from a normal population with mean μ and variance σ², then the sample mean m is normally distributed with E[m | μ, σ²] = μ and V[m | μ, σ²] = σ²/n.
• Even if the population is not normally distributed, the Central Limit Theorem shows that as n → ∞, the distribution of (m − μ)/(σ/√n) converges to the standard normal. This is true for any population with finite mean and variance.

• Be careful: check your data for a normal distribution.

Testing for Normality: Chi-squared (χ²) goodness-of-fit test

• The probability distribution is unknown.
• Sample n values from the unknown pdf.
• Sort the sample into k class intervals (a histogram) and let f_oi be the observed frequency (count) for interval i.
• Compute the theoretical frequency f_ti for each interval using the normal (or any hypothesized) distribution.
• If any interval contains fewer than 5 theoretical observations, combine it with the next interval.
• Calculate the total deviation of observed values from theoretical values and test for significance.

The chi-square statistic is:

χ² = Σᵢ₌₁ᵏ (f_oi − f_ti)²/f_ti

where k = number of intervals and p = number of estimated parameters (2 for the normal). The hypothesis test:

Ho: X is normally distributed
Ha: X is not normally distributed

Reject Ho if the data are not normal (if the differences are too large); i.e., reject normality if χ² > χ²_(α, k−p−1).

Example: a cell phone company has frequency data for the length of calls outside a roaming area. The mean and std. dev. are 14.3 and 3.7 minutes respectively. Are the data normally distributed? Use α = 0.05.

For the first interval:

P(0 ≤ X ≤ 5) = P(z ≤ (5 − 14.3)/3.7) − P(z ≤ (0 − 14.3)/3.7) = P(z ≤ −2.51) − P(z ≤ −3.86) = (1 − 0.9940) − (1 − 0.9999) = 0.006

[Table in the slides, columns: Length (min) | Freq (obs) | Theor. prob | Theor. freq (np) | (f_o − f_t)²/f_t; the cell values are not recoverable here.]
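The theoretical interval probability for the first bin of the call-length example can be checked the same way. A sketch (not from the slides), again using `math.erf` for the normal CDF:

```python
# Theoretical probability for one histogram bin under N(14.3, 3.7^2):
# P(0 <= X <= 5), the first interval of the chi-square table.
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, sd = 14.3, 3.7
p_interval = phi((5 - mean) / sd) - phi((0 - mean) / sd)   # ~0.006
```

Multiplying each such interval probability by n gives the theoretical frequencies f_ti that enter the χ² statistic.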
Reject normality if χ² > χ²_(α, k−p−1):

χ²_(α, k−p−1) = χ²_(0.05, 4−2−1) = χ²_(0.05, 1) = 3.841

112.2 > 3.841, so reject normality.

Normal Conjugate

Returning to the conjugate discussion... Suppose the prior distribution on μ is normal with mean m′ and variance σ′²:

f′(μ) = 1/(σ′√(2π)) · e^(−(μ − m′)²/(2σ′²))

We draw a sample of size n and observe a sample mean of m; the posterior density is then normal,

f″(μ | y) = 1/(σ″√(2π)) · e^(−(μ − m″)²/(2σ″²))

where y represents the sample results and the posterior parameters are computed from:

1/σ″² = 1/σ′² + n/σ²
m″ = (m′/σ′² + n·m/σ²) / (1/σ′² + n/σ²)

Discrete prior example: a retailer is interested in weekly sales, x, at one of their stores.

• x is distributed normally with unknown mean μ and known variance σ² = 90000.
• Only 5 potential values of μ are considered: μ = 1100, 1150, 1200, 1250, 1300.
• The prior distribution is estimated to be:

P(μ=1100) = .15, P(μ=1150) = .20, P(μ=1200) = .30, P(μ=1250) = .20, P(μ=1300) = .15

The retailer wants more information about the store, so takes a sample from past sales records, assuming weekly sales are independent: n = 60 weeks, with sample mean m = 1240. The standard error is σ/√n = 300/√60 = 38.73, so the likelihoods are f(1240 | μ) = φ((1240 − μ)/38.73)/38.73, where φ is the standard normal density:

f(1240 | 1100) = φ(3.61)/38.73 = .0006/38.73
f(1240 | 1150) = φ(2.32)/38.73 = .0270/38.73
f(1240 | 1200) = φ(1.03)/38.73 = .2347/38.73
f(1240 | 1250) = φ(−0.26)/38.73 = .3857/38.73
f(1240 | 1300) = φ(−1.55)/38.73 = .1200/38.73

Note that the denominator 38.73 is the same for all calculations (see page 35), so it can be dropped from the table; it is optional since it cancels out when the posterior is normalized. The table columns are: Prior prob. | Likelihood | Prior prob. × likelihood | Posterior prob.
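The discrete-prior table can be built in a few lines. This sketch (not from the slides) computes the likelihood column and normalizes to get the posterior column:

```python
# Discrete-prior Bayes update for the retailer example.
from math import exp, pi, sqrt

def normal_pdf(z):
    """Standard normal density."""
    return exp(-z * z / 2) / sqrt(2 * pi)

mus = [1100, 1150, 1200, 1250, 1300]
priors = [0.15, 0.20, 0.30, 0.20, 0.15]
m, se = 1240, 300 / sqrt(60)             # sample mean and std error (38.73)

likes = [normal_pdf((m - mu) / se) / se for mu in mus]
joint = [p * l for p, l in zip(priors, likes)]
post = [j / sum(joint) for j in joint]   # normalize: denominator cancels
```

As the slide notes, dividing every likelihood by 38.73 changes nothing: the common factor cancels in the normalization, so the posterior is the same either way.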
Continuous prior example: suppose the manager decides the prior on μ is normally distributed with mean m′ = 1200 and σ′ = 50.

• Note that σ² = 90000 is the variance of weekly sales.
• Note that σ′² = 2500 is the variance of the prior pdf of μ, the average weekly sales.

Using the previous sample information (n = 60 weeks, sample mean m = 1240), calculate the posterior parameters:

1/σ″² = 1/σ′² + n/σ² = 1/2500 + 60/90000 = 96/90000
m″ = (m′/σ′² + n·m/σ²) / (1/σ′² + n/σ²) = (1200/2500 + 60·1240/90000) / (96/90000) = 1225

So the mean and variance of the posterior pdf are 1225 and 90000/96 = 937.5.

• The original belief about the mean (prior) was m′ = 1200 with prior standard deviation σ′ = 50 (the population standard deviation is 300).
• A sample was taken with sample mean 1240 and standard error 300/√60 = 38.73.
• The posterior mean moved from the prior value of 1200 toward the sample mean, to 1225.
• The posterior standard deviation shrank from the prior value of 50 to √937.5 = 30.62: less dispersion due to "learning" from the sample information.

What if no conjugates?

• What if the prior and likelihood are not conjugate?
• All is not lost. From Bayes' time until digital computers arrived in the 1950s-1960s, computation of the posterior for an arbitrary prior and likelihood was not feasible.
• Today we have software that makes these calculations easy (e.g., Mathematica, Matlab, etc.).
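The normal/normal update above can be packaged as a two-line precision-weighting rule. A sketch (not from the slides; the function name is hypothetical):

```python
# Normal/normal conjugate update for the continuous-prior retailer example.
def normal_update(m_prior, var_prior, m_sample, var_pop, n):
    """Precision-weighted combination of the prior mean and the sample mean."""
    precision = 1 / var_prior + n / var_pop   # 1/sigma''^2
    var_post = 1 / precision
    m_post = (m_prior / var_prior + n * m_sample / var_pop) * var_post
    return m_post, var_post

m_post, var_post = normal_update(1200, 2500, 1240, 90000, 60)
print(m_post, var_post)   # ~1225, ~937.5
```

The posterior mean is a precision-weighted average of m′ and m, which is why it lands between 1200 and 1240, closer to the data because the sample carries more precision than the prior.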