Chapter 6.2.2.1 Exercises

In [4]:
library("HSAUR")
library("HSAUR2")
library("HSAUR3")
library("graphics")
data("clouds")
clouds
# bxpseeding and bxpecho are boxplot objects whose outliers ($out) we want to
# look up; they have not been created in this session yet, so build them first
# (plot = FALSE returns the boxplot statistics without drawing anything):
bxpseeding <- boxplot(rainfall ~ seeding, data = clouds, plot = FALSE)
bxpecho <- boxplot(rainfall ~ echomotion, data = clouds, plot = FALSE)
rownames(clouds)[clouds$rainfall %in% c(bxpseeding$out, bxpecho$out)]
A data.frame: 24 × 7
   seeding time  sne cloudcover prewetness echomotion rainfall
     <fct> <int> <dbl>    <dbl>      <dbl>      <fct>    <dbl>
1       no    0 1.75       13.4      0.274 stationary    12.85
2      yes    1 2.70       37.9      1.267     moving     5.52
3      yes    3 4.10        3.9      0.198 stationary     6.29
4       no    4 2.35        5.3      0.526     moving     6.11
5      yes    6 4.25        7.1      0.250     moving     2.45
6       no    9 1.60        6.9      0.018 stationary     3.61
7       no   18 1.30        4.6      0.307     moving     0.47
8       no   25 3.35        4.9      0.194     moving     4.56
9       no   27 2.85       12.1      0.751     moving     6.35
10     yes   28 2.20        5.2      0.084     moving     5.06
11     yes   29 4.40        4.1      0.236     moving     2.76
12     yes   32 3.10        2.8      0.214     moving     4.05
13      no   33 3.95        6.8      0.796     moving     5.74
14     yes   35 2.90        3.0      0.124     moving     4.84
15     yes   38 2.05        7.0      0.144     moving    11.86
16      no   39 4.00       11.3      0.398     moving     4.45
17      no   53 3.35        4.2      0.237 stationary     3.66
18     yes   55 3.70        3.3      0.960     moving     4.22
19      no   56 3.80        2.2      0.230     moving     1.16
20     yes   59 3.40        6.5      0.142 stationary     5.45
21     yes   65 3.15        3.1      0.073     moving     2.02
22      no   68 3.15        2.6      0.136     moving     0.82
23     yes   82 4.01        8.3      0.123     moving     1.09
24      no   83 4.65        7.4      0.168     moving     0.28
In [5]:
clouds_formula <- rainfall ~ seeding * (sne + cloudcover +
prewetness + echomotion) + time
clouds_formula
rainfall ~ seeding * (sne + cloudcover + prewetness + echomotion) + 
    time
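The crossing operator can be checked directly. A quick sketch (using only base R's terms(); no data needed) of how seeding * (sne + cloudcover + prewetness + echomotion) expands into main effects plus one seeding:<variable> interaction per variable:

```r
# Expand the formula into its individual model terms: main effects for
# seeding, sne, cloudcover, prewetness, echomotion and time, plus the four
# seeding interactions produced by the * operator.
attr(terms(clouds_formula), "term.labels")
```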
In [10]:
Xstar <- model.matrix(clouds_formula, data = clouds)
Xstar
A matrix: 24 × 11 of type dbl (wide output wrapped into two column blocks)
   (Intercept) seedingyes  sne cloudcover prewetness echomotionstationary time
1            1          0 1.75       13.4      0.274                    1    0
2            1          1 2.70       37.9      1.267                    0    1
3            1          1 4.10        3.9      0.198                    1    3
4            1          0 2.35        5.3      0.526                    0    4
5            1          1 4.25        7.1      0.250                    0    6
6            1          0 1.60        6.9      0.018                    1    9
7            1          0 1.30        4.6      0.307                    0   18
8            1          0 3.35        4.9      0.194                    0   25
9            1          0 2.85       12.1      0.751                    0   27
10           1          1 2.20        5.2      0.084                    0   28
11           1          1 4.40        4.1      0.236                    0   29
12           1          1 3.10        2.8      0.214                    0   32
13           1          0 3.95        6.8      0.796                    0   33
14           1          1 2.90        3.0      0.124                    0   35
15           1          1 2.05        7.0      0.144                    0   38
16           1          0 4.00       11.3      0.398                    0   39
17           1          0 3.35        4.2      0.237                    1   53
18           1          1 3.70        3.3      0.960                    0   55
19           1          0 3.80        2.2      0.230                    0   56
20           1          1 3.40        6.5      0.142                    1   59
21           1          1 3.15        3.1      0.073                    0   65
22           1          0 3.15        2.6      0.136                    0   68
23           1          1 4.01        8.3      0.123                    0   82
24           1          0 4.65        7.4      0.168                    0   83
   seedingyes:sne seedingyes:cloudcover seedingyes:prewetness seedingyes:echomotionstationary
1            0.00                   0.0                 0.000                               0
2            2.70                  37.9                 1.267                               0
3            4.10                   3.9                 0.198                               1
4            0.00                   0.0                 0.000                               0
5            4.25                   7.1                 0.250                               0
6            0.00                   0.0                 0.000                               0
7            0.00                   0.0                 0.000                               0
8            0.00                   0.0                 0.000                               0
9            0.00                   0.0                 0.000                               0
10           2.20                   5.2                 0.084                               0
11           4.40                   4.1                 0.236                               0
12           3.10                   2.8                 0.214                               0
13           0.00                   0.0                 0.000                               0
14           2.90                   3.0                 0.124                               0
15           2.05                   7.0                 0.144                               0
16           0.00                   0.0                 0.000                               0
17           0.00                   0.0                 0.000                               0
18           3.70                   3.3                 0.960                               0
19           0.00                   0.0                 0.000                               0
20           3.40                   6.5                 0.142                               1
21           3.15                   3.1                 0.073                               0
22           0.00                   0.0                 0.000                               0
23           4.01                   8.3                 0.123                               0
24           0.00                   0.0                 0.000                               0
In [12]:
library(MASS)
library(ggplot2)
data(geyser)
fit.ols <- lm(waiting ~ duration, data = geyser)
fit.ols
Call:
lm(formula = waiting ~ duration, data = geyser)

Coefficients:
(Intercept)     duration  
      99.31        -7.80  
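With the coefficients above, the fitted line is waiting = 99.31 − 7.80 × duration. As a quick sanity check (an illustrative sketch, not part of the original analysis), the expected waiting time after a 4-minute eruption:

```r
# Predicted waiting time for duration = 4, read straight off the fitted line
# waiting = 99.31 - 7.80 * duration.
unname(coef(fit.ols)[1] + coef(fit.ols)[2] * 4)   # ≈ 68 minutes
```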
In [13]:
# Note: fit.ols regresses waiting on duration, so put duration on the x-axis
# to make the geom_smooth() line match the fitted model.
ggplot(geyser, aes(x = duration, y = waiting)) +
  geom_point() +
  labs(x = "duration", y = "waiting", 
       title = "geyser") +
  geom_smooth(method = "lm", se = TRUE)
`geom_smooth()` using formula 'y ~ x'

In [7]:
# In Figure 7.4, we plot a scatterplot of score over age. 
# Given that gender is a binary categorical variable in this study, we can make some interesting tweaks: we can assign a color to the points from each of the two levels of gender: female and male. 
# Furthermore, the geom_smooth(method = "lm", se = FALSE) layer automatically fits a separate regression line for each level, since we have mapped color = gender at the top level in ggplot(). All geometries that follow then inherit the same mapping of aesthetics to variables throughout the plot.

ggplot(evals_ch7, aes(x = age, y = score, color = gender)) +
  geom_jitter() +
  labs(x = "Age", y = "Teaching Score", color = "Gender") +
  geom_smooth(method = "lm", se = FALSE)

# We notice some interesting trends: there are almost no women faculty over the age of 60, which we can see from the lack of red points to the right of age 60. Fitting separate regression lines for men and women, we see they have different slopes: the associated effect of increasing age appears to be much harsher for women than for men. In other words, as women age, the drop in their teaching score appears to be steeper.
In [13]:
# Parallel slopes model:

score_model_2 <- lm(score ~ age + gender, data = evals_ch7)
get_regression_table(score_model_2)
term         estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept 4.484 0.125 35.792 0.000 4.238 4.730
age -0.009 0.003 -3.280 0.001 -0.014 -0.003
gendermale 0.191 0.052 3.632 0.000 0.087 0.294
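The table above implies two parallel lines: one intercept per gender and a single shared slope. A sketch of reading them off the fitted model (assumes score_model_2 from above):

```r
# The parallel slopes model implies two lines with a common slope:
#   female: score_hat = intercept              + age_slope * age
#   male:   score_hat = intercept + gendermale + age_slope * age
coefs2 <- coef(score_model_2)
unname(coefs2["(Intercept)"])                          # female intercept, ~4.48
unname(coefs2["(Intercept)"] + coefs2["gendermale"])   # male intercept,   ~4.68
unname(coefs2["age"])                                  # shared slope,     ~-0.009
```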
In [22]:
ggplot(score_model_2, aes(x = age, y = score, color = gender)) +
  geom_jitter() +
  labs(x = "Age", y = "Teaching Score", color = "Gender") +
  geom_smooth(method = "lm", se = FALSE)

# Mismatch! geom_smooth(method = "lm") fits a separate slope for each gender,
# so these lines are NOT the parallel slopes fitted by score_model_2.
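Because geom_smooth() cannot draw the parallel slopes fit, newer versions of the moderndive package provide a dedicated layer for it. A sketch, assuming a moderndive version that exports geom_parallel_slopes():

```r
# geom_parallel_slopes() draws one line per gender with a common slope,
# matching what score_model_2 actually fits.
ggplot(evals_ch7, aes(x = age, y = score, color = gender)) +
  geom_jitter() +
  labs(x = "Age", y = "Teaching Score", color = "Gender") +
  geom_parallel_slopes(se = FALSE)
```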
In [17]:
score_model_interaction <- lm(score ~ age * gender, data = evals_ch7) 
get_regression_table(score_model_interaction)
term            estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept 4.883 0.205 23.795 0.000 4.480 5.286
age -0.018 0.004 -3.919 0.000 -0.026 -0.009
gendermale -0.446 0.265 -1.681 0.094 -0.968 0.076
age:gendermale 0.014 0.006 2.446 0.015 0.003 0.024
In [18]:
ggplot(score_model_interaction, aes(x = age, y = score, color = gender)) +
  geom_jitter() +
  labs(x = "Age", y = "Teaching Score", color = "Gender") +
  geom_smooth(method = "lm", se = FALSE)

# These lines DO match the model: geom_smooth(method = "lm") fits a separate
# line per gender, which is exactly what the interaction model does.
In [3]:
library(ggplot2)
library(dplyr)
library(moderndive)
library(ISLR)
library(skimr)

evals_ch7 <- evals %>%
  select(score, age, gender)

score_model_interaction <- lm(score ~ age * gender, data = evals_ch7)
In [4]:
# Now say we want to apply the above calculations for male and female instructors for all 463 instructors in the evals_ch7 dataset. As our multiple regression models get more and more complex, computing such values by hand gets more and more tedious. The get_regression_points() function spares us this tedium and returns all fitted values and all residuals. For simplicity, let’s focus only on the fitted interaction model, which is saved in score_model_interaction.

regression_points <- get_regression_points(score_model_interaction)
regression_points

# Recall the format of the output:
# score corresponds to y, the observed value.
# score_hat corresponds to ŷ, the fitted value.
# residual corresponds to the residual y − ŷ.
ID  score  age  gender  score_hat  residual
1 4.7 36 female4.252 0.448
2 4.1 36 female4.252 -0.152
3 3.9 36 female4.252 -0.352
4 4.8 36 female4.252 0.548
5 4.6 59 male 4.201 0.399
6 4.3 59 male 4.201 0.099
7 2.8 59 male 4.201 -1.401
8 4.1 51 male 4.233 -0.133
9 3.4 51 male 4.233 -0.833
10 4.5 40 female4.182 0.318
11 3.8 40 female4.182 -0.382
12 4.5 40 female4.182 0.318
13 4.6 40 female4.182 0.418
14 3.9 40 female4.182 -0.282
15 3.9 40 female4.182 -0.282
16 4.3 40 female4.182 0.118
17 4.5 40 female4.182 0.318
18 4.8 31 female4.340 0.460
19 4.6 31 female4.340 0.260
20 4.6 31 female4.340 0.260
21 4.9 31 female4.340 0.560
22 4.6 31 female4.340 0.260
23 4.5 31 female4.340 0.160
24 4.4 62 male 4.189 0.211
25 4.6 62 male 4.189 0.411
26 4.7 62 male 4.189 0.511
27 4.5 62 male 4.189 0.311
28 4.8 62 male 4.189 0.611
29 4.9 62 male 4.189 0.711
30 4.5 62 male 4.189 0.311
⋮
434 2.8 62 male 4.189 -1.389
435 3.1 62 male 4.189 -1.089
436 4.2 62 male 4.189 0.011
437 3.4 62 male 4.189 -0.789
438 3.0 62 male 4.189 -1.189
439 3.3 35 female4.270 -0.970
440 3.6 35 female4.270 -0.670
441 3.7 35 female4.270 -0.570
442 3.6 61 male 4.193 -0.593
443 4.3 61 male 4.193 0.107
444 4.1 52 female3.972 0.128
445 4.9 52 female3.972 0.928
446 4.8 52 female3.972 0.828
447 3.7 60 female3.832 -0.132
448 3.9 60 female3.832 0.068
449 4.5 60 female3.832 0.668
450 3.6 60 female3.832 -0.232
451 4.4 60 female3.832 0.568
452 3.4 60 female3.832 -0.432
453 4.4 60 female3.832 0.568
454 4.5 32 male 4.309 0.191
455 4.5 32 male 4.309 0.191
456 4.5 32 male 4.309 0.191
457 4.6 32 male 4.309 0.291
458 4.1 32 male 4.309 -0.209
459 4.5 32 male 4.309 0.191
460 3.5 42 female4.147 -0.647
461 4.4 42 female4.147 0.253
462 4.4 42 female4.147 0.253
463 4.1 42 female4.147 -0.047
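The first few rows can be verified by hand. Instructor 1 is a 36-year-old female, so the gendermale and age:gendermale terms drop out. A sketch using the fitted, unrounded coefficients (the rounded table values give 4.883 − 0.018 × 36 ≈ 4.24, slightly off due to rounding):

```r
# score_hat for a female instructor: intercept + age slope * age
# (the gendermale terms are multiplied by 0).
coefs <- coef(score_model_interaction)
unname(coefs["(Intercept)"] + coefs["age"] * 36)   # ≈ 4.25, matching rows 1-4
```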
In [5]:
# As always, let’s perform a residual analysis first with a histogram, which we can facet by gender:

ggplot(regression_points, aes(x = residual)) +
  geom_histogram(binwidth = 0.25, color = "white") +
  labs(x = "Residual") +
  facet_wrap(~gender)
In [7]:
# Second, we plot the residuals against the explanatory/predictor variables: x1, the numerical predictor age (on the x-axis), and x2, the categorical predictor gender (via faceting).

ggplot(regression_points, aes(x = age, y = residual)) +
  geom_point() +
  labs(x = "age", y = "Residual") +
  geom_hline(yintercept = 0, col = "blue", size = 1) +
  facet_wrap(~ gender)
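A third standard check, not shown above but a natural companion (a sketch using the same regression_points data): residuals against the fitted values themselves.

```r
# Residuals vs fitted values, faceted by gender; under the model assumptions
# this should show no pattern around the zero line.
ggplot(regression_points, aes(x = score_hat, y = residual)) +
  geom_point() +
  labs(x = "Fitted value (score_hat)", y = "Residual") +
  geom_hline(yintercept = 0, col = "blue", size = 1) +
  facet_wrap(~ gender)
```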
In [3]:
# Recall from Table 7.2 that the correlation coefficient between Income in thousands of dollars and credit card Balance was 0.464. What if instead we looked at the correlation coefficient between Income and credit card Balance, but where Income was in dollars and not thousands of dollars? This can be done by multiplying Income by 1000.

library(ISLR)
library(dplyr)
data(Credit)
Credit %>% 
  select(Balance, Income) %>% 
  mutate(Income = Income * 1000) %>% 
  cor()

# We see it is the same! We say that the correlation coefficient is invariant to positive linear transformations: the correlation between x and y is the same as the correlation between a×x + b and y for any numerical values a and b with a > 0. (For negative a, the magnitude is unchanged but the sign flips.)
          Balance    Income
Balance 1.0000000 0.4636565
Income  0.4636565 1.0000000
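The same check works for any positive rescaling plus a shift. A quick sketch; the multiplier 2.5 and offset 7 are arbitrary illustrative values:

```r
# cor() is unchanged by positive linear transformations a*x + b (a > 0).
with(Credit, cor(Balance, Income))
with(Credit, cor(Balance, 2.5 * Income + 7))   # identical value
```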
In [2]:
library(ggplot2)
library(dplyr)
library(moderndive)
library(ISLR)
library(skimr)
In [6]:
# Let’s load the Credit data and select() only the needed subset of variables.

(Credit <- Credit %>%
  select(Balance, Limit, Income, Rating, Age))
Balance  Limit   Income  Rating  Age
    333   3606   14.891     283   34
    903   6645  106.025     483   82
    580   7075  104.593     514   71
    964   9504  148.924     681   36
    331   4897   55.882     357   68
   1151   8047   80.180     569   77
    203   3388   20.996     259   37
    872   7114   71.408     512   87
    279   3300   15.125     266   66
   1350   6819   71.061     491   41
   1407   8117   63.095     589   30
      0   1311   15.045     138   64
    204   5308   80.616     394   57
   1081   6922   43.682     511   49
    148   3291   19.144     269   75
      0   2525   20.089     200   57
      0   3714   53.598     286   73
    368   4378   36.496     339   69
    891   6384   49.570     448   28
   1048   6626   42.079     479   44
     89   2860   17.700     235   63
    968   6378   37.348     458   72
      0   2631   20.103     213   61
    411   5179   64.027     398   48
      0   1757   10.742     156   57
    671   4323   14.090     326   25
    654   3625   42.471     289   44
    467   4534   32.793     333   44
   1809  13414  186.634     949   41
    915   5611   26.813     411   55
    992   6135   35.610     466   40
      0   2150   39.116     173   75
    840   3782   19.782     293   46
   1003   5354   55.412     383   37
    588   4840   29.400     368   76
   1000   5673   20.974     413   44
    767   7167   87.625     515   46
      0   1567   28.144     142   51
    717   4941   19.349     366   33
      0   2860   53.308     214   84
    661   7760  115.123     538   83
    849   8029  101.788     574   84
   1352   5495   24.824     409   33
    382   3274   14.292     282   64
      0   1870   20.088     180   76
    905   5640   26.400     398   58
    371   3683   19.253     287   57
      0   1357   16.529     126   62
   1129   6827   37.878     482   80
    806   7100   83.948     503   44
   1393  10578  135.118     747   81
    721   6555   73.327     472   43
      0   2308   25.974     196   24
      0   1335   17.316     138   65
    734   5758   49.794     410   40
    560   4100   12.096     307   32
    480   3838   13.364     296   65
    138   4171   57.872     321   67
      0   2525   37.728     192   44
    966   5524   18.701     415   64
⋮ (60 of 400 rows shown)
In [5]:
# Let’s look at the raw data values, both by bringing up RStudio’s spreadsheet viewer and with the glimpse() function. Note that in Table 7.1 we only show 5 randomly selected credit card holders out of the 400:

View(Credit)  # only works in RStudio; in the Jupyter R kernel use head(Credit) instead

# 	Balance 	Limit 	Income 	Rating 	Age
# 141 	1425 	6045 	39.8 	459 	32
# 9 	279 	3300 	15.1 	266 	66
# 13 	204 	5308 	80.6 	394 	57
# 262 	1050 	9310 	180.4 	665 	67
# 267 	15 	4952 	88.8 	360 	86
Error in View(Credit): ‘View()’ not yet supported in the Jupyter R kernel
Traceback:

1. View(Credit)
2. stop(sQuote("View()"), " not yet supported in the Jupyter R kernel")
In [6]:
glimpse(Credit)
Observations: 400
Variables: 5
$ Balance <int> 333, 903, 580, 964, 331, 1151, 203, 872, 279, 1350, 1407, 0...
$ Limit   <int> 3606, 6645, 7075, 9504, 4897, 8047, 3388, 7114, 3300, 6819,...
$ Income  <dbl> 14.891, 106.025, 104.593, 148.924, 55.882, 80.180, 20.996, ...
$ Rating  <int> 283, 483, 514, 681, 357, 569, 259, 512, 266, 491, 589, 138,...
$ Age     <int> 34, 82, 71, 36, 68, 77, 37, 87, 66, 41, 30, 64, 57, 49, 75,...
In [5]:
# Let’s look at some summary statistics, again using the skim() function from the skimr package:

Credit %>% 
  select(Balance, Limit, Income) %>% 
  skim()

# We observe for example:
# The mean and median credit card balance are $520.01 and $459.50 respectively.
# 25% of card holders had debts of $68.75 or less.
# The mean and median credit card limit are $4735.6 and $4622.50 respectively.
# 75% of these card holders had incomes of $57,470 or less.
variable  type     stat      level  value        formatted
Balance   integer  missing   .all   0.00000      0
Balance   integer  complete  .all   400.00000    400
Balance   integer  n         .all   400.00000    400
Balance   integer  mean      .all   520.01500    520.01
Balance   integer  sd        .all   459.75888    459.76
Balance   integer  p0        .all   0.00000      0
Balance   integer  p25       .all   68.75000     68.75
Balance   integer  p50       .all   459.50000    459.5
Balance   integer  p75       .all   863.00000    863
Balance   integer  p100      .all   1999.00000   1999
Balance   integer  hist      .all   NA           ▇▃▃▃▂▁▁▁
Limit     integer  missing   .all   0.00000      0
Limit     integer  complete  .all   400.00000    400
Limit     integer  n         .all   400.00000    400
Limit     integer  mean      .all   4735.60000   4735.6
Limit     integer  sd        .all   2308.19885   2308.2
Limit     integer  p0        .all   855.00000    855
Limit     integer  p25       .all   3088.00000   3088
Limit     integer  p50       .all   4622.50000   4622.5
Limit     integer  p75       .all   5872.75000   5872.75
Limit     integer  p100      .all   13913.00000  13913
Limit     integer  hist      .all   NA           ▅▇▇▃▂▁▁▁
Income    numeric  missing   .all   0.00000      0
Income    numeric  complete  .all   400.00000    400
Income    numeric  n         .all   400.00000    400
Income    numeric  mean      .all   45.21889     45.22
Income    numeric  sd        .all   35.24427     35.24
Income    numeric  p0        .all   10.35400     10.35
Income    numeric  p25       .all   21.00725     21.01
Income    numeric  p50       .all   33.11550     33.12
Income    numeric  p75       .all   57.47075     57.47
Income    numeric  p100      .all   186.63400    186.63
Income    numeric  hist      .all   NA           ▇▃▂▁▁▁▁▁
In [9]:
# Since our outcome variable Balance and the explanatory variables Limit and Rating are numerical, we can compute the correlation coefficient between pairs of these variables. First, we could run the get_correlation() command as seen in Subsection 6.1.1 twice, once for each explanatory variable:

Credit %>% 
  get_correlation(Balance ~ Limit)
Credit %>%  
  get_correlation(Balance ~ Income)
correlation
0.8616973
correlation
0.4636565
In [9]:
# Or we can simultaneously compute them by returning a correlation matrix in Table 7.2. We can read off the correlation coefficient for any pair of variables by looking them up in the appropriate row/column combination.

Credit %>%
  select(Balance, Limit, Income) %>% 
  cor()
          Balance     Limit    Income
Balance 1.0000000 0.8616973 0.4636565
Limit   0.8616973 1.0000000 0.7920883
Income  0.4636565 0.7920883 1.0000000
In [10]:
# Let’s visualize the relationship of the outcome variable with each of the two explanatory variables in two separate plots:

ggplot(Credit, aes(x = Limit, y = Balance)) +
  geom_point() +
  labs(x = "Credit limit (in $)", y = "Credit card balance (in $)", 
       title = "Relationship between balance and credit limit") +
  geom_smooth(method = "lm", se = FALSE)
  
ggplot(Credit, aes(x = Income, y = Balance)) +
  geom_point() +
  labs(x = "Income (in $1000)", y = "Credit card balance (in $)", 
       title = "Relationship between balance and income") +
  geom_smooth(method = "lm", se = FALSE)
In [2]:
Balance_model <- lm(Balance ~ Limit + Income, data = Credit)
get_regression_table(Balance_model)
term        estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept-385.179 19.465 -19.789 0 -423.446 -346.912
Limit 0.264 0.006 44.955 0 0.253 0.276
Income -7.663 0.385 -19.901 0 -8.420 -6.906
In [15]:
Balance_model <- lm(Balance ~ Limit + Income, data = Credit)
get_regression_table(Balance_model)
Balance_model <- lm(Balance ~ Limit, data = Credit)
get_regression_table(Balance_model)
Balance_model <- lm(Balance ~ Income, data = Credit)
get_regression_table(Balance_model)
term        estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept-385.179 19.465 -19.789 0 -423.446 -346.912
Limit 0.264 0.006 44.955 0 0.253 0.276
Income -7.663 0.385 -19.901 0 -8.420 -6.906
term        estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept-292.790 26.683 -10.973 0 -345.249 -240.332
Limit 0.172 0.005 33.879 0 0.162 0.182
term        estimate  std_error  statistic  p_value  lower_ci  upper_ci
intercept246.515 33.199 7.425 0 181.247 311.783
Income 6.048 0.579 10.440 0 4.909 7.187
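The sign flip on Income (−7.663 with Limit in the model, +6.048 alone) comes from the strong Limit–Income correlation of 0.792. One way to see the multiple-regression slope directly is the Frisch–Waugh–Lovell idea; a sketch, not part of the original exercise:

```r
# Regress Limit out of both Balance and Income; the slope between the two
# residual series reproduces the Income coefficient from the two-predictor model.
balance_resid <- residuals(lm(Balance ~ Limit, data = Credit))
income_resid  <- residuals(lm(Income  ~ Limit, data = Credit))
unname(coef(lm(balance_resid ~ income_resid))[2])   # ≈ -7.66
```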
In [21]:
# Recall in Section 6.1.4, our first residual analysis plot investigated the presence of any systematic pattern in the residuals when we had a single numerical predictor: bty_age. For the Credit card dataset, since we have two numerical predictors, Limit and Income, we must perform this twice:

# Balance_model was reassigned above, so refit the two-predictor model and
# recompute its residuals before plotting:
Balance_model <- lm(Balance ~ Limit + Income, data = Credit)
regression_points <- get_regression_points(Balance_model)

ggplot(regression_points, aes(x = Income, y = residual)) +
  geom_point() +
  labs(x = "Income (in $1000)", y = "Residual", title = "Residuals vs income")
In [20]:
ggplot(regression_points, aes(x = Limit, y = residual)) +
  geom_point() +
  labs(x = "Credit limit (in $)", y = "Residual", title = "Residuals vs credit limit")
In [5]:
# In this case, there does appear to be a systematic pattern in the residuals: the scatter of the residuals around the line y = 0 is clearly not consistent. This behavior is further evidenced by the histogram of residuals in Figure 7.3. We observe that the residuals have a slight right skew (recall we say that data is right-skewed, or positively skewed, if there is a tail to the right). Ideally, these residuals should be bell-shaped around a residual value of 0.

ggplot(regression_points, aes(x = residual)) +
  geom_histogram(color = "white") +
  labs(x = "Residual")

# Another way to interpret this histogram: since the residual is computed as y − ŷ = balance − balance_hat, we have some values where the fitted value ŷ is very much lower than the observed value y. In other words, we are underestimating certain credit card holders’ balances by a very large amount.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
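Following stat_bin()'s advice, the binwidth can be set explicitly. A sketch; $100 is an illustrative choice for residuals measured in dollars, not a value from the original analysis:

```r
# Same histogram with an explicit binwidth instead of the default 30 bins.
ggplot(regression_points, aes(x = residual)) +
  geom_histogram(binwidth = 100, color = "white") +
  labs(x = "Residual")
```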
In [4]:
# We will examine the interest rate for four-year car loans, using data from the U.S. Federal Reserve’s mean rates. Note that we are looking at and plotting means, which is misleading because averaging removes most of the variance; the only reason we work with the data this way is to provide a linear regression example that does not use too many data points.
# The first thing to do is to specify the data. Here there are only five pairs of numbers, so we can enter them manually. Each of the five pairs consists of a year and the mean interest rate:

year <- c(2000, 2001, 2002, 2003, 2004)
rate <- c(9.34, 8.50, 7.62, 6.93, 6.60)
In [3]:
# The next thing we do is take a look at the data. We first plot the data using a scatter plot and notice that it looks linear. To confirm our suspicions we then find the correlation between the year and the mean interest rates:

plot(year,rate,
     main="Commercial Banks Interest Rate for 4 Year Car Loan",
     sub="http://www.federalreserve.gov/releases/g19/20050805/")
In [6]:
fit <- lm(rate ~ year); fit

# When you make the call to lm, it returns a variable with a lot of information in it. If you are just learning about least squares regression, you are probably only interested in two things at this point: the slope and the y-intercept. If you just type the name of the variable returned by lm, it prints this minimal information to the screen.
Call:
lm(formula = rate ~ year)

Coefficients:
(Intercept)         year  
   1419.208       -0.705  
In [7]:
# If you would like to know what else is stored in the variable, you can use the attributes command:

attributes(fit)
$names
  1. 'coefficients'
  2. 'residuals'
  3. 'effects'
  4. 'rank'
  5. 'fitted.values'
  6. 'assign'
  7. 'qr'
  8. 'df.residual'
  9. 'xlevels'
  10. 'call'
  11. 'terms'
  12. 'model'
$class
'lm'
In [9]:
# One of the things you should notice is the coefficients element within fit. You can print out the y-intercept and slope by accessing this part of the variable:

fit$coefficients[1]
fit$coefficients[[1]]

# Note that if you just want the number itself, without its name, use double square brackets.
(Intercept): 1419.20799999989
1419.20799999989
In [12]:
# A better use of these coefficients is to calculate the residuals and plot them:

res <- rate - (fit$coefficients[[2]]*year+fit$coefficients[[1]]); res
plot(year,res)
[1]  0.132 -0.003 -0.178 -0.163  0.212
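Rather than rebuilding the line from the coefficients by hand, the same prediction is available through predict(). A sketch; note that year 2005 lies outside the observed 2000–2004 range, so this is an extrapolation:

```r
# Predicted mean rate for 2005 from the fitted line rate = 1419.208 - 0.705 * year.
predict(fit, newdata = data.frame(year = 2005))   # ≈ 5.68
```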
In [13]:
# That is a bit messy, but fortunately there are easier ways to get the residuals. Two other ways are shown below:

residuals(fit)
fit$residuals
plot(year,fit$residuals)
     1      2      3      4      5 
 0.132 -0.003 -0.178 -0.163  0.212 
     1      2      3      4      5 
 0.132 -0.003 -0.178 -0.163  0.212 
In [15]:
# If you want to plot the regression line on the same plot as your scatter plot you can use the abline function along with your variable fit:

plot(year,rate,
     main="Commercial Banks Interest Rate for 4 Year Car Loan",
     sub="http://www.federalreserve.gov/releases/g19/20050805/")
abline(fit)
In [16]:
# Finally, as a teaser for the kinds of analyses you might see later, you can get the results of an F-test by asking R for a summary of the fit variable:

summary(fit)
Call:
lm(formula = rate ~ year)

Residuals:
     1      2      3      4      5 
 0.132 -0.003 -0.178 -0.163  0.212 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept) 1419.20800  126.94957   11.18  0.00153 **
year          -0.70500    0.06341  -11.12  0.00156 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2005 on 3 degrees of freedom
Multiple R-squared:  0.9763,	Adjusted R-squared:  0.9684 
F-statistic: 123.6 on 1 and 3 DF,  p-value: 0.001559
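Individual pieces of the summary can also be extracted programmatically (a sketch):

```r
# summary.lm returns a list; r.squared and the coefficient matrix are
# among its components.
s <- summary(fit)
s$r.squared      # ≈ 0.9763
s$coefficients   # estimates, std. errors, t values, p-values as a matrix
```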