MAST20004 Probability Assignment Three [Due 4:00 pm Monday 07/10]

There are 5 problems in total, of which 3 randomly chosen ones will be marked. You are expected to submit answers to all questions, otherwise a mark penalty will apply. Calculations and reasoning must be given in order to obtain full credit.

Problem 1. Choose a number $X$ at random from the set $\{1, 2, 3, 4, 5\}$, then choose a number $Y$ at random from the subset $\{1, \cdots, X\}$.
(i) Find the joint pmf of $X$ and $Y$.
(ii) Find the conditional pmf of $X$ given $Y = 3$.
(iii) Are $X$ and $Y$ independent?
(iv) Compute the expected value of X Y.

Problem 2. Let $(X, Y)$ be a bivariate random variable whose joint pdf is given by
$$f_{X,Y}(x, y) = \begin{cases} \dfrac{Cy}{x^3}, & 0 < x < 1 \text{ and } 0 < y < x^2, \\ 0, & \text{otherwise.} \end{cases}$$
(i) Compute the constant $C$, and the marginal pdf's of $X$ and $Y$ respectively.
(ii) Compute $f_{Y|X}(y|x)$ and deduce that $E[Y|X] = \frac{2}{3}X^2$.
(iii) Compute $f_{X|Y}(x|y)$ and $E[X|Y]$.

Problem 3. Let $(X, Y)$ be a general bivariate normal random variable.
(i) If $\mathrm{Cov}(X, Y) = 0$, show that $X$, $Y$ are independent.
(ii) If $\mathrm{Var}[X] = \mathrm{Var}[Y]$, show that $X + Y$ and $X - Y$ are independent.
(iii) Assume that $\mu_X = 0$, $\sigma_X^2 = 1$, $\mu_Y = -1$, $\sigma_Y^2 = 4$, $\rho = 1/2$. Compute $P(X + Y > 0)$ and $P(X + Y > 0 \mid 2X - Y = 0)$.

Problem 4. Suppose that $X$, $Y$ are independent random variables, both uniformly distributed over $[0, 1]$.
(i) Find the pdf of $R = |X - Y|$ and $E[R]$.
(ii) Find the joint pdf of $U = X + Y$ and $V = X/Y$.
[Hint: The trickier part is to find out what the transformed region $T(D)$ is, where $D$ is the unit square in the $x$-$y$ plane. There is no general way of doing so, and one needs to argue example by example. For this problem, we can first figure out the range of $u$, and then for each $u$ within this range we further figure out the range of $v$. It is typical that this range of $v$, denoted as $R_u$ and visualized in the $(u, v)$-plane as the vertical segment $L_u = \{(u, v) : v \in R_u\}$, will depend on $u$ and lie between the graphs of two functions of $u$.
The region $T(D)$ is then determined by figuring out what $L_u$ sweeps out as $u$ runs over its own range.]
(iii) Show that the bivariate random variable $(U, V)$ defined by
$$U = \sqrt{-2\log X}\,\cos(2\pi Y), \qquad V = \sqrt{-2\log X}\,\sin(2\pi Y)$$
is a standard bivariate normal random variable with parameter $\rho = 0$.
[Remark: By taking the $U$-component, this gives a simple way of generating a standard normal random variable. To some extent, this method is better than the one we discussed in lecture using $U = \Phi^{-1}(X)$, where $\Phi$ is the cdf of $N(0, 1)$, since $\Phi^{-1}$ is very hard to obtain. This is a nice illustration of the following philosophy: when we are working on a one-dimensional problem, sometimes it could be substantially easier if we look at the problem from a multi-dimensional perspective. Another example of this kind is the computation of $\int_{-\infty}^{\infty} e^{-x^2/2}\,dx$ we did in the lecture. This idea is further developed and appreciated in the subject of complex analysis.]
(iv) By using a given pair of independent uniform random variables $(X, Y)$ over $[0, 1]$, find a way to construct a bivariate random variable $(Z, W)$ satisfying
$$E[Z] = E[W] = 0, \quad \mathrm{Var}[Z] = \mathrm{Var}[W] = 5, \quad \mathrm{Cov}(Z, W) = 4.$$
[Remark: Essentially, this method allows us to generate a general bivariate normal random variable from a pair of independent uniform random variables over $[0, 1]$.]

Problem 5. [Hard] For this problem, you might need to use the following so-called inclusion-exclusion principle without proof. Let $A_1, A_2, \cdots, A_n$ be $n$ events. Then
$$P(A_1 \cup \cdots \cup A_n) = \sum_{i=1}^{n} P(A_i) - \sum_{1 \le i < j \le n} P(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap \cdots \cap A_n).$$
A little girl is painting on a blank paper. Suppose that there is a total number of $N$ available colors. At each time she selects one color randomly and paints on the paper. It is possible that she picks a color that she has already used before. Different selections are assumed to be independent.
(1) Suppose that the little girl makes $n$ selections.
(1-i) If red and blue are among the available colors, let $R$ (respectively, $B$) be the event that her painting contains color red (respectively, blue). What are $P(R)$ and $P(R \cup B)$?
(1-ii) Suppose that she is about to make the $(n + 1)$-th selection. What is the probability that she will obtain a new color in this selection? [Hint: discuss according to the specific color in her $(n + 1)$-th selection.]
(1-iii) Suppose that $n = N$. For $1 \le i \le N$, let $E_i$ be the event that her painting does not contain color $i$. By applying the inclusion-exclusion principle to $\bigcup_{i=1}^{N} E_i$, show that
$$N! = \sum_{k=0}^{N} (-1)^k \binom{N}{k} (N - k)^N.$$
(1-iv) Let $D$ be the number of different colors she obtains among her $n$ selections. By writing $N - D$ as a sum of Bernoulli random variables, compute $E[D]$ and $\mathrm{Var}[D]$.
(2) Let $S$ be the number of selections needed until every available color has been selected by the little girl.
(2-i) Find the pmf of $S$. [Hint: consider $\{S > n\}$ and use the inclusion-exclusion principle to compute this probability.]
(2-ii) For $0 \le i \le N - 1$, let $X_i$ be the number of extra selections needed, after $i$ different colors have been obtained, until a new color is obtained. By understanding the distributions of these $X_i$'s and their relationship with $S$, show that
$$E[S] = N \times \left( 1 + \frac{1}{2} + \cdots + \frac{1}{N} \right).$$
Since the harmonic sum $H(N) = 1 + \frac{1}{2} + \cdots + \frac{1}{N}$ has logarithmic growth (i.e. $\frac{H(N)}{\log N} \to 1$ as $N \to \infty$), this result shows that when $N$ is large, on average the little girl needs to make about $N \log N$ selections before obtaining all different colors.
(3) Let $T$ be the number of selections until the little girl picks a color that she has obtained before.
(3-i) Find the pmf and expected value of $T$.
(3-ii) Consider $E[T]$ as a function of $N$. What is the growth rate of $E[T]$ as $N \to \infty$? You don't need to solve this problem mathematically. Simply make an educated guess.
[Hint: firstly, use the formula
$$E[T] = \sum_{k=0}^{\infty} P(T > k)$$
given in Tutorial 6, Problem 4 to simplify the expression of the mean. Secondly, either relate $E[T]$ to some standard Taylor series to guess the growth rate, or relate it to Poisson random variables and the central limit theorem. The central limit theorem (which we will learn soon) says that if $X_1, X_2, \cdots$ are independent and identically distributed with finite variance, then
$$\lim_{n \to \infty} P\left( \frac{S_n - E[S_n]}{\sqrt{\mathrm{Var}[S_n]}} \le x \right) = \Phi(x), \quad \text{for all } x \in \mathbb{R},$$
where $S_n = X_1 + \cdots + X_n$ and $\Phi(x)$ is the cdf of the standard normal distribution.]
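
The central limit theorem quoted in the hint above can be checked numerically. The following is a minimal sketch, not part of the assignment, assuming Uniform[0, 1] summands (so each term has mean 1/2 and variance 1/12). The function names `standard_normal_cdf` and `standardized_sum_cdf`, the number of trials, and the seed are illustrative choices only.

```python
import math
import random


def standard_normal_cdf(x: float) -> float:
    """Phi(x), the cdf of N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def standardized_sum_cdf(n: int, x: float, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P((S_n - E[S_n]) / sqrt(Var[S_n]) <= x).

    Assumption: the summands are i.i.d. Uniform[0, 1], so
    E[S_n] = n / 2 and Var[S_n] = n / 12.
    """
    rng = random.Random(seed)        # fixed seed keeps the estimate reproducible
    mean = n * 0.5                   # E[S_n]
    std = math.sqrt(n / 12.0)        # sqrt(Var[S_n])
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if (s - mean) / std <= x:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Compare the simulated probability with Phi(x) at a few points.
    for x in (-1.0, 0.0, 1.0):
        print(x, standardized_sum_cdf(30, x), standard_normal_cdf(x))
```

Increasing `n` or `trials` should bring the two printed columns closer together, up to Monte Carlo error of order $1/\sqrt{\text{trials}}$.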