Chapter 4 Matrix Algebra

Note: I have rephrased some parts of the questions for clarity. These changes are bolded. Due to the random numbers, the exact values of the answers, despite the same seeds, might differ. So please be mindful of that.

First, upload necessary package(s).

library(dplyr) # uploads the function filter() and %>%
library(rafalib) # important for plotting

4.2 Exercises

#install.packages('UsingR')
data('father.son', package = 'UsingR')

Question 1

What is the average height of the sons (don’t round off)?


y <- father.son$sheight # son
x <- father.son$fheight # father
mean(y)
## [1] 68.68407

Question 2

One of the defining features of regression is that we stratify one variable based on others. In statistics, we use the verb ‘condition.’ For exmaple,, the linear model for son and father heights answers the question: how tall do I expect a son to be if I condition on his father being \(x\) inches? The regression line answers this question for any \(x\).

Using the father.son dataset described above, we want to know the expected height of sons, if we condition on the father being 71 inches. Create a list of son heights for sons that have fathers with heights of 71 inches, rounding to the nearest inch.

What is the mean of the son heights for fathers that have a height of 71 inches (don’t round off your answer)? Hint: use the function round on the fathers’ heights.


groups <- split(y, round(x))
mean(groups['71'] %>% unlist())
## [1] 70.54082

Question 3

Which of the following cannot be written as a linear model?


The answer is C: \(Y = a + b^t + \epsilon\). This is because the variable \(t\) is an exponent unlike all the other answer choices.

Question 4

Suppose you model the relationship between weight and height across individuals with a linear model. You assume that the height of individuals for a fixed weight \(x\) follows a linear model \(Y = a + bx + \epsilon\). Which of the following do you feel best describes what \(\epsilon\) represents?


The answer is D: Between individual variability: people of the same height vary in their weight.

Let’s look at each answer choice. Let’s first think \(Y\) as son’s height and \(x\) as father’s height. So in this linear model, our goal is to predict son’s height based on father’s height. Choice A (Measurement error: scales are not perfect) seems not wrong. However, if the father’s height is 71 inches, can we gurantee that the son’s height is a certain number with a small measurement variability? This is not true because the son’s range of height can still be wide even if the father’s height is 71 inches. In fact, siblings can have very different heights even if their biological parents are identical. Since choice A only describes the \(\epsilon\) as measurement variability, the description seems inadequate. This explanation also applies to Choice B (Within individual random fluctuations: you don’t weigh the same in the morning as in the afternoon) and C (Round off error introduced by the computer). Therefore, choice D seems to be most appropriate.

4.6 Exercises

Question 1

In R we have vectors and matrices. You can create your own vectors with the function c.

c(1,5,3,4)
## [1] 1 5 3 4

They are also the output of many functions such as:

rnorm(10)
##  [1] -0.8043316 -1.0565257 -1.0353958 -1.1855604 -0.5004395
##  [6] -0.5249887 -0.3024330  0.4719681 -0.2483839  1.2593180

You can turn vectors into matrcies using functions such as rbind, cbind, or matrix. Create the matrix from the vector 1:1000 like this:

X = matrix(1:1000,100,10)

What is the entry in row 25, column 3?


X[25,3]
## [1] 225

Question 2

Using the function cbind, create a 10 x 5 matrix with first column x=1:10. Then add 2*x, 3*x, 4*x and 5*x to columns 2 through 5. What is the sum of the elements of the 7th row?


first_column <- 1:10
y <- cbind(x1=first_column, x2=first_column*2, x3=first_column*3, x4=first_column*4, x5=first_column*5)
sum(y[7,])
## [1] 105

Question 3

Which of the following creates a matrix with multiples of 3 in the third column?


matrix(1:60,20,3) # choice A
##       [,1] [,2] [,3]
##  [1,]    1   21   41
##  [2,]    2   22   42
##  [3,]    3   23   43
##  [4,]    4   24   44
##  [5,]    5   25   45
##  [6,]    6   26   46
##  [7,]    7   27   47
##  [8,]    8   28   48
##  [9,]    9   29   49
## [10,]   10   30   50
## [11,]   11   31   51
## [12,]   12   32   52
## [13,]   13   33   53
## [14,]   14   34   54
## [15,]   15   35   55
## [16,]   16   36   56
## [17,]   17   37   57
## [18,]   18   38   58
## [19,]   19   39   59
## [20,]   20   40   60
matrix(1:60,20,3, byrow=T) # choice B
##       [,1] [,2] [,3]
##  [1,]    1    2    3
##  [2,]    4    5    6
##  [3,]    7    8    9
##  [4,]   10   11   12
##  [5,]   13   14   15
##  [6,]   16   17   18
##  [7,]   19   20   21
##  [8,]   22   23   24
##  [9,]   25   26   27
## [10,]   28   29   30
## [11,]   31   32   33
## [12,]   34   35   36
## [13,]   37   38   39
## [14,]   40   41   42
## [15,]   43   44   45
## [16,]   46   47   48
## [17,]   49   50   51
## [18,]   52   53   54
## [19,]   55   56   57
## [20,]   58   59   60
x = 11:20; rbind(x,2*x,3*x) # choice C
##   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x   11   12   13   14   15   16   17   18   19    20
##     22   24   26   28   30   32   34   36   38    40
##     33   36   39   42   45   48   51   54   57    60
x = 1:40; matrix(3*x,20,2) # choice D 
##       [,1] [,2]
##  [1,]    3   63
##  [2,]    6   66
##  [3,]    9   69
##  [4,]   12   72
##  [5,]   15   75
##  [6,]   18   78
##  [7,]   21   81
##  [8,]   24   84
##  [9,]   27   87
## [10,]   30   90
## [11,]   33   93
## [12,]   36   96
## [13,]   39   99
## [14,]   42  102
## [15,]   45  105
## [16,]   48  108
## [17,]   51  111
## [18,]   54  114
## [19,]   57  117
## [20,]   60  120

The answer is B.

4.8 Exercises

Question 1

Suppose \(X\) is a matrix in R. Which of the following is not equivalent to \(X\)?


head(t(t(X))) # A
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    1  101  201  301  401  501  601  701  801   901
## [2,]    2  102  202  302  402  502  602  702  802   902
## [3,]    3  103  203  303  403  503  603  703  803   903
## [4,]    4  104  204  304  404  504  604  704  804   904
## [5,]    5  105  205  305  405  505  605  705  805   905
## [6,]    6  106  206  306  406  506  606  706  806   906
head(X %*% matrix(1,ncol(X))) # B
##      [,1]
## [1,] 4510
## [2,] 4520
## [3,] 4530
## [4,] 4540
## [5,] 4550
## [6,] 4560
head(X*1) # C
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    1  101  201  301  401  501  601  701  801   901
## [2,]    2  102  202  302  402  502  602  702  802   902
## [3,]    3  103  203  303  403  503  603  703  803   903
## [4,]    4  104  204  304  404  504  604  704  804   904
## [5,]    5  105  205  305  405  505  605  705  805   905
## [6,]    6  106  206  306  406  506  606  706  806   906
head(X %*% diag(ncol(X))) # D
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    1  101  201  301  401  501  601  701  801   901
## [2,]    2  102  202  302  402  502  602  702  802   902
## [3,]    3  103  203  303  403  503  603  703  803   903
## [4,]    4  104  204  304  404  504  604  704  804   904
## [5,]    5  105  205  305  405  505  605  705  805   905
## [6,]    6  106  206  306  406  506  606  706  806   906

The answer is B.

In choice A, the trasnposed matrix \(X\) gets transposed again, thereby returning to its original matrix \(X\). In choice B, the matrix \(X\) gets multiplied by 1, which is a scalar. So it will not be changed. In choice D, \(X\) gets multiplied by an identity matrix, so \(X\) does not change even after the matrix multiplication. Therefore, the answer is B.

Question 2

Solve the following system of equations using R: \[ \begin{align*} 3a + 4b - 5c + d=10\\ 2a + 2b + 2c - d = 5\\ a - b + 5c - 5d = 7 \\ 5a + 5d = 4 \end{align*} \] What is the solution for \(c\)?


X <- matrix(c(3,4,-5,1,2,2,2,-1,1,-1,5,-5,5,0,0,1),4,4,byrow=T)
ans <- solve(X) %*% matrix(c(10,5,7,4),4,1)
ans[3]
## [1] -0.8849558

Question 3

Load the following two matrices into R:

a <- matrix(1:12, nrow=4)
b <- matrix(1:15, nrow=3)

What is the value in the 3rd row and the 2nd column of the matrix product of a and b?


c <- a %*% b
c[3,2]
## [1] 113

Question 4

Multiply the 3rd row of a with the 2nd column of b, using the element-wise vector multiplication with *. What is the sum of the elements in the resulting vector?


sum(a[3,] * b[,2])
## [1] 113

4.10 Exercises

Question 1

Suppose we are analyzing a set of 4 samples. The first two samples are from a treatment group A and the second two samples are from a treatment group B. This design can be represented with a model matrix like so:

X <- matrix(c(1,1,1,1,0,0,1,1),nrow=4)
rownames(X) <- c('a','a','b','b')
X
##   [,1] [,2]
## a    1    0
## a    1    0
## b    1    1
## b    1    1

Suppose that the fitted parameters for a linear model give us:

beta <- c(5,2)

Use the matrix multiplication operator, %*%, in R to answer the following questions: What is the fitted value for the A samples? (The fitted Y values.)


X <- matrix(c(1,1,1,1,0,0,1,1),nrow=4)
rownames(X) <- c('a','a','b','b')
#beta <- c(5,2)
beta <- matrix(c(5,2),nrow =2, ncol=1) # matrix form of the vector beta
X[1:2,] %*% beta
##   [,1]
## a    5
## a    5

When I perform matrix multiplication in R %*%, I usually make sure that all my vectors are converted into matrix. In this case, I rewrote the beta variable in the function of matrix. However, this is not necessary. The vector form of beta beta <- c(5,2) works well too.

Question 2

What is the fitted value for the B samples? (The fitted Y values.)


X[3:4,] %*% beta
##   [,1]
## b    7
## b    7

Question 3

Suppose now we are comparing two treatments B and C to a control group A, each with two samples. This design can be represented with a model matrix like so:

X <- matrix(c(1,1,1,1,1,1,0,0,1,1,0,0,0,0,0,0,1,1),nrow=6)
rownames(X) <- c("a","a","b","b","c","c")
X
##   [,1] [,2] [,3]
## a    1    0    0
## a    1    0    0
## b    1    1    0
## b    1    1    0
## c    1    0    1
## c    1    0    1

Suppose that the fitted values for the linear model are given by:

beta <- c(10,3,-3)

What is the fitted values for the B sample?


beta <- matrix(c(10,3,-3),nrow = 3)
X[3:4,] %*% beta
##   [,1]
## b   13
## b   13

Question 4

What is the fitted values for the C sample?


X[5:6,] %*% beta
##   [,1]
## c    7
## c    7