- April 11, 2020

**ECO220Y1Y, APRIL 2018, FINAL EXAM: SOLUTIONS**

**(1) (a)** We need to make an inference about the difference in population proportions. To obtain the sample proportion for no taglines:

$$\hat{P}_{\text{none}} = \frac{0.43 \times 407 + 0.65 \times 397}{804} = 0.5386$$

To obtain the sample proportion for superfluous taglines:

$$\hat{P}_{\text{sup}} = \frac{0.37 \times 394 + 0.51 \times 405}{799} = 0.4410$$

Hence, the point estimate of the difference is -0.0976, which says that people shown
misleading advertising (superfluous taglines) are nearly 10 percentage points less likely to choose the best credit card
compared to people who make the choice without being confronted with misleading advertising (no taglines).
To test for a difference (either way) requires a two-tailed test:
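$$H_0: (p_{\text{sup}} - p_{\text{none}}) = 0$$

$$H_1: (p_{\text{sup}} - p_{\text{none}}) \neq 0$$

Taking group 1 as no taglines and group 2 as superfluous taglines:

$$z = \frac{\hat{P}_2 - \hat{P}_1}{\sqrt{\frac{\bar{P}(1-\bar{P})}{n_1} + \frac{\bar{P}(1-\bar{P})}{n_2}}}$$

where

$$\bar{P} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{(407 \times 0.43 + 397 \times 0.65) + (394 \times 0.37 + 405 \times 0.51)}{804 + 799} = \frac{433.06 + 352.33}{804 + 799} = \frac{785.39}{1603} = 0.4900$$

$$z = \frac{0.4410 - 0.5386}{\sqrt{\frac{0.49(1-0.49)}{804} + \frac{0.49(1-0.49)}{799}}} = \frac{-0.0976}{\sqrt{\frac{0.2499}{804} + \frac{0.2499}{799}}} = \frac{-0.0976}{0.02497} = -3.91$$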
The difference is highly statistically significant at any conventional significance level (noting that the Standard Normal
table stops at z values of 3.69 as the tail areas become so tiny), including an 𝛼 of 0.001.
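The pooled two-proportion z test in part (a) can be checked numerically. The sketch below is only an illustration: the function name `two_proportion_z` is hypothetical, and the cell sizes (407, 397, 394, 405) and cell proportions are the values read from the worked solution, which sum to the stated group sizes of 804 and 799.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p2 - p1 = 0, with its two-tailed P-value."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    z = (p2_hat - p1_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Expected counts of "best card" choices, rebuilt from the cell proportions
# in the worked solution (assumed values, not raw data).
x_none = 0.43 * 407 + 0.65 * 397  # no taglines, n = 804
x_sup = 0.37 * 394 + 0.51 * 405   # superfluous taglines, n = 799

z, p_value = two_proportion_z(x_none, 804, x_sup, 799)
print(f"z = {z:.2f}, two-tailed P-value = {p_value:.5f}")
```

Computing the tail area directly sidesteps the limitation that the printed Standard Normal table stops at 3.69.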
**(b)** We need to make an inference about the difference in population proportions. The point estimate of the difference
is 0.14, which says that among people who saw misleading advertising (superfluous taglines) those that saw the
implemental video were 14 percentage points more likely to choose the best credit card compared to people who saw
the baseline video.
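$$(\hat{P}_2 - \hat{P}_1) \pm z_{\alpha/2}\sqrt{\frac{\hat{P}_2(1-\hat{P}_2)}{n_2} + \frac{\hat{P}_1(1-\hat{P}_1)}{n_1}}$$

$$(0.51 - 0.37) \pm 2.576\sqrt{\frac{0.51(1-0.51)}{405} + \frac{0.37(1-0.37)}{394}}$$

$$0.14 \pm 2.576 \times 0.03477$$

$$0.14 \pm 0.0896$$

which gives a LCL of 0.05 and an UCL of 0.23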
For people who have to make a credit card choice while faced with misleading ads, we are 99% confident that showing
them the longer (implemental) video *increases* the percent selecting the best credit card by between 5 and 23
percentage points compared to the shorter (baseline) video. (A causal interpretation is correct because these are
experimental data where the key x variable – which video a person watched – is randomly assigned.) While it is clear
that the longer video helps people not be distracted by misleading advertising, the interval is wide: the video may
increase the percent making the best choice by only 5 p.p. or by as much as 23 p.p.
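The 99% confidence interval in part (b) can be reproduced with a short script. This is a minimal sketch: the helper name `diff_proportion_ci` is hypothetical, and the inputs (0.37 with n = 394 for the baseline video, 0.51 with n = 405 for the implemental video) are the values used in the worked solution.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(p1, n1, p2, n2, confidence=0.99):
    """Confidence interval for p2 - p1 using the unpooled standard error."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.576 for 99%
    se = sqrt(p2 * (1 - p2) / n2 + p1 * (1 - p1) / n1)
    diff = p2 - p1
    return diff - z_crit * se, diff + z_crit * se


# Group 1 = baseline video, group 2 = implemental video (superfluous taglines only).
lcl, ucl = diff_proportion_ci(0.37, 394, 0.51, 405)
print(f"99% CI: ({lcl:.2f}, {ucl:.2f})")
```

Unlike the hypothesis test in part (a), the interval does not pool the two sample proportions, because no null hypothesis of equality is being imposed.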
**(2)** This requires an inference about the difference between means for *paired* data: $H_0: \mu_d = 0$ versus $H_1: \mu_d \neq 0$, where the correct test statistic is given by $t = \frac{\bar{d}}{s_d/\sqrt{n}}$.
**(3) (a)** In Panel A, the shape of the distribution is Uniform.
In Panel B, the shape of the distribution from $0 to $50 is positively (right) skewed.
In Panel B, the shape of the distribution from $900.01 to $1,000 is bimodal.
In Panel B, the shape of the distribution from $950.01 to $1,000 is negatively (left) skewed.
**(b)** Observations with a listing price less than $300.01 represent 30 percent of the subsample data. Work for 30: Since it
is Uniform, we find the first percent as 100*6/20 = 30, or equivalently, 0.001*50*6 = 0.30, which is 30%.
Observations with a listing price from $990.01 to $1,000 represent 3.5 percent of the subsample data. Work for 3.5: The
height of the bar is around 0.0035 and the width is 10, which means 0.0035*10=0.035, which is about 3.5%.
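Both percentages in part (b) come from the same rule: in a density histogram, the share of observations in a range is bar height times bar width. A small sketch, assuming (as the work above implies) 20 bins of width $50 for the Uniform panel and a $10-wide bar of height about 0.0035 in the other; the function name is hypothetical.

```python
def share_from_density(height, bin_width, n_bins=1):
    """Fraction of observations covered by n_bins bars of equal height and width."""
    return height * bin_width * n_bins


# Uniform panel: density 0.001 per dollar, six $50-wide bins below $300.01.
uniform_share = share_from_density(0.001, 50, n_bins=6)

# Top bar: height read off the graph as roughly 0.0035, width $10.
top_bar_share = share_from_density(0.0035, 10)

print(f"{uniform_share:.1%}, {top_bar_share:.1%}")
```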
**(4) (a)** The exact value of the IQR is 12.8 (=17.4 – 4.6), which is the only reasonable choice among those given. As the
distance between the 75th and 25th percentiles, it measures the spread (i.e. variability) of the middle 50% of the data.
There is substantial variability among countries in the % of wealth held offshore even once we exclude the bottom and
top quarters of data (where all the extremes are): there is a 12.8 percentage point difference between the 75th
percentile country and the 25th percentile country.
**(b)** Can choose to draw a relative frequency, frequency or density histogram (all three shown below), but it must be
clearly labelled. Also, it is reasonable to put Ireland, which is very near the boundary of bin 1 and bin 2, into either bin.
No matter how you draw the histogram, it is positively skewed.
[Figure: three histograms of offshore wealth as % of GDP, n = 38 countries (w/ more than $200 billion in GDP in 2007). Horizontal axis in each panel: offshore wealth as % of GDP, 0 to 80. Vertical axes: Fraction (0 to .5), Frequency (0 to 20), Density (0 to .05).]
**(5) (a)** We need to obtain $b_1$ and $b_0$ in $\hat{y} = b_0 + b_1 x$ where $y$ is the firm's adaptive practice and $x$ is the natural log of the firm's age. Plugging in: $b_1 = r\frac{s_y}{s_x} = 0.15 \times \frac{1.39}{0.40} = 0.52125$ and $b_0 = \bar{Y} - b_1\bar{X} = 4.18 - 0.52125 \times 1.26 = 3.523225$.
Hence, the OLS equation is $\widehat{adaptive}_i = 3.52 + 0.52 \times \ln(age)$. [Firms that are 10 percent older on average have
adaptive practices that are 0.052 units higher on a seven-point Likert scale.]
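The slope and intercept in part (a) follow directly from the summary statistics. The sketch below recomputes them; the function name `ols_from_summary` is hypothetical and the inputs are the values plugged in above. It also compares the 10 percent rule of thumb with the exact log change.

```python
from math import log


def ols_from_summary(r, s_y, s_x, y_bar, x_bar):
    """Simple-regression slope and intercept from summary statistics."""
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = ols_from_summary(r=0.15, s_y=1.39, s_x=0.40, y_bar=4.18, x_bar=1.26)

# Effect of a firm being 10 percent older:
approx_effect = b1 * 0.10      # rule of thumb: change in ln(age) is about 0.10
exact_effect = b1 * log(1.10)  # exact change in ln(age)

print(f"b0 = {b0:.6f}, b1 = {b1:.5f}")
print(f"approx = {approx_effect:.3f}, exact = {exact_effect:.3f}")
```

The rule of thumb gives the 0.052 quoted in the interpretation; the exact log change gives a slightly smaller number, which is why the approximation is best reserved for small percentage changes.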
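**(b)** $R^2 = (r)^2 = (0.14)^2 = 0.0196$

$s_y^2 = \frac{SST}{n-1} \Rightarrow 1.39^2 = \frac{SST}{207-1} \Rightarrow SST = 398.0126$

$R^2 = \frac{SSR}{SST} \Rightarrow 0.0196 = \frac{SSR}{398.0126} \Rightarrow SSR = 7.80104696$

$SST = SSR + SSE \Rightarrow SSE = 398.0126 - 7.80104696 = 390.211553$

$s_e = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{390.211553}{207-2}} = 1.38$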
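The chain of steps in part (b), from $r$ and $s_y$ to the standard deviation of the residuals, can be sketched as follows. The variable names are illustrative; the inputs ($n = 207$, $s_y = 1.39$, $r = 0.14$) are those used in the work above.

```python
from math import sqrt

n = 207      # number of firms
s_y = 1.39   # sample s.d. of the y variable
r = 0.14     # correlation used in part (b)

r_squared = r ** 2
sst = s_y ** 2 * (n - 1)   # from s_y^2 = SST / (n - 1)
ssr = r_squared * sst      # from R^2 = SSR / SST
sse = sst - ssr            # from SST = SSR + SSE
s_e = sqrt(sse / (n - 2))  # s.d. of the residuals in a simple regression

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, s_e = {s_e:.2f}")
```

Because $R^2$ is so small, $s_e$ is almost identical to $s_y$: the regression explains very little of the variation in $y$.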
**(c)** We can use an F-test to test if there is a statistically significant correlation between the natural log of firm age and
rainfall change. $F = \frac{R^2/k}{(1-R^2)/(n-k-1)} = \frac{(-0.08)^2/1}{(1-(-0.08)^2)/(207-1-1)} = 1.32$. The relevant critical value from the F table is 2.71
(or 2.75 if you want to be very conservative) and hence this correlation is not close to being statistically different from
zero.
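The F statistic in part (c) is a one-line calculation once $R^2$ is written as the squared correlation. A minimal sketch, with a hypothetical function name and the critical value simply copied from the F table as quoted above:

```python
def f_statistic(r_squared, n, k):
    """Overall F statistic for a regression with k explanatory variables."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


r = -0.08  # correlation between ln(firm age) and rainfall change
f_stat = f_statistic(r ** 2, n=207, k=1)

critical_value = 2.71  # taken from the F table, as in the text
reject_null = f_stat > critical_value

print(f"F = {f_stat:.2f}, reject H0: {reject_null}")
```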
**(6) (a)** $P(\text{10th decile in 2012} \mid \text{1st decile in 2007})$, which is a conditional probability. There is a 1.5 percent chance
that a Canadian taxfiler who was in the poorest income decile (1st decile) in 2007 will move up to the richest income decile
(10th decile) in 2012.
**(b)** Use the Binomial probability formula to find: $P(X \geq 7) = P(X = 7) + P(X = 8)$.

$p(7) = \frac{8!}{7!(8-7)!}\,0.574^7(1-0.574)^{8-7} = 8 \times 0.574^7(0.426)^1 = 0.06997$

$p(8) = \frac{8!}{8!(8-8)!}\,0.574^8(1-0.574)^{8-8} = 0.574^8 = 0.01178$

$P(X \geq 7) = 0.06997 + 0.01178 = 0.08175$
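The Binomial calculation in part (b) can be verified with the standard library. This sketch uses a hypothetical helper `binomial_pmf` with $n = 8$ and $p = 0.574$ as in the work above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 8, 0.574
p7 = binomial_pmf(7, n, p)
p8 = binomial_pmf(8, n, p)
p_at_least_7 = p7 + p8

print(f"P(7) = {p7:.5f}, P(8) = {p8:.5f}, P(X >= 7) = {p_at_least_7:.5f}")
```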
**(c)** Two mathematically equivalent approaches are to find $P(X > 400)$ or $P(\hat{P} > 0.40)$. Either way, we use the Normal
approximation (we expect more than 10 successes and 10 failures).
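That the count and proportion versions give the same standardized value can be confirmed numerically before working them by hand. The sketch below assumes $n = 1{,}000$ and $p = 0.416$, as in the two approaches; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.416

# Normal approximation is reasonable: expected successes and failures both exceed 10.
assert n * p > 10 and n * (1 - p) > 10

# Count version: X is approximately Normal with mean np and variance np(1 - p).
z_count = (400 - n * p) / sqrt(n * p * (1 - p))

# Proportion version: P-hat is approximately Normal with mean p and variance p(1 - p)/n.
z_prop = (0.40 - p) / sqrt(p * (1 - p) / n)

prob = 1 - NormalDist().cdf(z_count)  # P(Z > z)
print(f"z (count) = {z_count:.4f}, z (proportion) = {z_prop:.4f}, P = {prob:.4f}")
```

The software answer uses the unrounded z value, so it differs slightly in the third decimal from a table lookup at z = -1.03; both round to 0.85.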
**Approach #1:** $E[X] = np = 1{,}000 \times 0.416 = 416$ and $V[X] = np(1-p) = 242.944$

$P(X > 400) = P\left(Z > \frac{400 - 416}{\sqrt{242.944}}\right) = P(Z > -1.0265) \approx P(Z > -1.03) = 0.5 + 0.3485 = 0.8485 \cong 0.85$

**OR**

**Approach #2:** $E[\hat{P}] = p = 0.416$ and $V[\hat{P}] = \frac{p(1-p)}{n} = 0.000242944$

$P(\hat{P} > 0.40) = P\left(Z > \frac{0.40 - 0.416}{\sqrt{0.000242944}}\right) = P(Z > -1.0265) \approx P(Z > -1.03) = 0.5 + 0.3485 = 0.8485 \cong 0.85$

**(d)** All the cells in the transition matrix – the 10 rows and first 10 columns of results – would be 10. Hence, people in the
2nd decile in 2007 would have a 10% chance of immobility (remaining in the 2nd decile in 2012), a 10% chance at
downward mobility, and an 80% chance of upward mobility. Canada is very different: people who are quite poor (2nd
decile) in 2007 have a much greater chance of staying exactly that poor (39.4 compared to 10%) and a somewhat higher
chance of becoming even poorer (13.5 compared to 10%). In Canada, the chance that someone in the 2nd decile is
upwardly mobile is only 47%, which is much lower than 80%. Hence, income mobility is much less in Canada than in the
hypothetical country (which has an extreme form of income mobility).
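The comparison in part (d) can be laid out as a small calculation. This is a sketch under stated assumptions: the hypothetical country has every cell equal to 10 percent, and the Canadian upward-mobility figure is taken as the remainder after the two percentages quoted above (39.4 and 13.5), which assumes those three outcomes exhaust the row. The function name is illustrative.

```python
DECILES = 10

# Hypothetical country with perfect mobility: every destination decile is equally likely.
uniform_row = [100 / DECILES] * DECILES  # percent in each 2012 decile


def mobility_from_row(row, start_decile):
    """Split a transition-matrix row into (down, stay, up) percentages."""
    down = sum(row[: start_decile - 1])
    stay = row[start_decile - 1]
    up = sum(row[start_decile:])
    return down, stay, up


down, stay, up = mobility_from_row(uniform_row, start_decile=2)

# Canada, 2nd decile in 2007 (assumes down + stay + up = 100).
canada_stay, canada_down = 39.4, 13.5
canada_up = 100 - canada_stay - canada_down

print(f"hypothetical: down {down:.0f}, stay {stay:.0f}, up {up:.0f}")
print(f"Canada: down {canada_down}, stay {canada_stay}, up {canada_up:.1f}")
```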
**(7) (a)**
$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$
$t = \frac{b_j - \beta_{j0}}{s_{b_j}} = \frac{0.020}{0.007} = 2.857$ with degrees of freedom that are far above 1,000 so we can use the Normal table as an
excellent approximation when finding the P-value:
$P\text{-value} = 2 \times P(t > 2.86) = 2 \times (0.5 - 0.4979) = 0.0042$, which means the coefficient is highly statistically
significant. (Alternatively, someone using the Student t table could say that the P-value lies between 0.002 and 0.01.)
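The t ratio and two-tailed P-value in part (a) can be checked with the Normal approximation the solution relies on. In this sketch the coefficient and standard error are read as 0.020 and 0.007, an assumption that reproduces the stated t of 2.857; variable names are illustrative.

```python
from statistics import NormalDist

b_j = 0.020        # coefficient estimate (assumed reading; reproduces t = 2.857)
se_b_j = 0.007     # its standard error (assumed reading)
beta_null = 0.0    # hypothesized value under H0

t_stat = (b_j - beta_null) / se_b_j

# With degrees of freedom far above 1,000, the t distribution is essentially Normal.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.3f}, two-tailed P-value = {p_value:.4f}")
```

The result agrees with the table-based 0.0042 to within rounding and falls inside the 0.002 to 0.01 bracket from the Student t table.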
**(b)** After controlling for the many housing characteristics (such as age, overall size, location) listed as explanatory
variables in Table 2, houses with an additional bedroom on average have selling prices that are 5.4 percent *lower.* [Note:
This does not mean that extra bedrooms are bad – there is surely a positive correlation between selling price and
number of bedrooms – but rather having more bedrooms when we hold the overall size of the house, number of
bathrooms, and other key variables fixed is not good (as it means tiny rooms).]
After controlling for the many housing characteristics (such as age, overall size, location) listed as explanatory variables
in Table 2, houses that are 1 percent larger on average have selling prices that are 0.8 percent higher.
**(c)** Looking at Table 1 we can see that Energy Star homes on average sell for 22 percent higher prices than homes with
no certification (the difference between the two average selling prices in Table 1, divided by the average for homes with no certification, is 0.22). Hence, the simple regression coefficient would be approximately 0.22,
compared to the much lower value of 0.027 in the multiple regression. Why? Table 1 also shows that Energy Star homes
are on average better than homes without certification: they are much more likely to be new, are larger, and much more
likely to have a garage. The multiple regression controls for these other important factors, which are correlated with the
Energy Star rating (and hence are lurking/confounding/omitted/unobserved variables in the simple regression). Those
variables bias the simple regression coefficient estimate (endogeneity bias) and multiple regression helps isolate the
effect of the Energy Star rating on housing prices, which is much more modest: 2.7 percent higher prices, not 22 percent
higher.
**(d)** No. Even though that coefficient is definitely not statistically significant (the 𝑡 ratio would be 0.67, which falls far
short of meeting even an easy 10 percent significance level and we cannot reject the null hypothesis that the coefficient
is zero), we must remember that it does *not* tell us how prices differ between these two groups of houses (remember
part (c)). Instead, it tells us about differences after controlling for other more important house characteristics (like new, size,
garage). We should expect that Austin also has higher average selling prices for Energy Star homes (that are highly
statistically significant) like that observed in North Carolina. The key issue is the difference between controlling for other
housing characteristics or not (not potential differences across cities).