apply(X, MARGIN, FUN, ...)
• X: an array, including a matrix
• MARGIN: the subscripts to apply the function over; for a matrix, 1 means rows and 2 means columns
• FUN: the function to apply
...
> BOD  # R built-in dataset, Biochemical Oxygen Demand
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
4    4   16.0
5    5   15.6
6    7   19.8
Sum up for each row:
> apply(BOD,1,sum)
[1] 9.3 12.3 22.0 20.0 20.6 26.8
Sum up for each column:
> apply(BOD,2,sum)
  Time demand
    22     89
Multiply all values by 10:
> apply(BOD,1:2,function(x) 10 * x)
     Time demand
[1,]   10     83
[2,]   20    103
[3,]   30    190
[4,]   40    160
[5,]   50    156
[6,]   70    198
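The three calls above can be cross-checked against base R's vectorized equivalents. The following is a minimal sketch, not part of the original examples: the variable names (m, row_totals, col_totals, scaled) are chosen here for illustration, and BOD is converted with as.matrix() explicitly because apply() coerces a data frame to a matrix before looping.

```r
# BOD ships with R; convert it once so every call sees the same matrix.
m <- as.matrix(BOD)

row_totals <- apply(m, 1, sum)                       # one sum per row
col_totals <- apply(m, 2, sum)                       # one sum per column
scaled     <- apply(m, 1:2, function(v) 10 * v)      # one call per cell

# Each comparison is expected to print TRUE.
print(all.equal(row_totals, rowSums(m)))
print(all.equal(col_totals, colSums(m)))
print(all.equal(scaled, 10 * m, check.attributes = FALSE))
```

For plain sums and element-wise arithmetic, rowSums(), colSums() and `10 * m` are usually the simpler choice; apply() is more useful when the function has no vectorized counterpart.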
For a one-dimensional array, set the margin to 1:
> x <- array(1:9)
> apply(x,1,function(x) x * 10)
[1] 10 20 30 40 50 60 70 80 90
For a two-dimensional array, the margin can be 1 or 2:
> x <- array(1:9,c(3,3))
> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
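> apply(x,1,function(x) x * 10)
     [,1] [,2] [,3]
[1,]   10   20   30
[2,]   40   50   60
[3,]   70   80   90
> apply(x,2,function(x) x * 10)
     [,1] [,2] [,3]
[1,]   10   40   70
[2,]   20   50   80
[3,]   30   60   90
The function returns a vector here, so each result is stored as one column of the output. With margin 1 the result is therefore the transpose of x * 10; with margin 2 it has the same layout as x.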
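When the function passed to apply() returns a vector rather than a single value, apply() stores each result as one column of its output. The sketch below reuses the 3 x 3 array from above to show the effect on orientation and how t() restores the row layout. The names times10, by_row and by_col are illustrative only.

```r
x <- array(1:9, c(3, 3))
times10 <- function(v) v * 10    # returns a vector as long as its input

by_row <- apply(x, 1, times10)   # column i holds the result for row i of x
by_col <- apply(x, 2, times10)   # column j holds the result for column j of x

# Both comparisons are expected to print TRUE.
print(all.equal(by_col, x * 10))       # margin 2 keeps the layout of x
print(all.equal(t(by_row), x * 10))    # margin 1 needs t() to match x
```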
The lapply() function handles a data frame with similar results, but it returns a list:
> lapply(BOD,sum)
$Time
[1] 22

$demand
[1] 89
> lapply(BOD,mean)
$Time
[1] 3.666667

$demand
[1] 14.83333
sapply() works the same way, but it sets simplify=TRUE by default and therefore returns a vector when the results can be simplified:
> sapply(BOD,sum)
  Time demand
    22     89
> sapply(BOD,sum,simplify=FALSE)
$Time
[1] 22

$demand
[1] 89
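The relationship between the list and vector forms can be checked directly. The sketch below also uses vapply(), which does not appear in the examples above; it is included here as an assumption about what a reader may want next, since it is the base R variant that requires the type and length of each result to be declared up front. The variable names are illustrative.

```r
sums_list <- lapply(BOD, sum)                          # list, one element per column
sums_vec  <- sapply(BOD, sum)                          # simplified to a named numeric vector
sums_safe <- vapply(BOD, sum, FUN.VALUE = numeric(1))  # errors unless each result is one number

print(is.list(sums_list))                      # TRUE
print(is.numeric(sums_vec))                    # TRUE
print(all.equal(unlist(sums_list), sums_vec))  # TRUE: same values, same names
print(all.equal(sums_vec, sums_safe))          # TRUE
```

Because sapply() decides the shape of its result from the data, vapply() is often preferred inside functions where an unexpected shape should fail loudly.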
mapply() applies a function to the corresponding elements of one or more vectors:
> mapply(sum,BOD$Time,BOD$demand)
[1]  9.3 12.3 22.0 20.0 20.6 26.8
> mapply(sum,BOD$Time)
[1] 1 2 3 4 5 7
> mapply(sum,BOD$demand)
[1]  8.3 10.3 19.0 16.0 15.6 19.8
> mapply(sum, trees)
 Girth Height Volume
 410.7 2356.0  935.3
> f <- function(x,y) (x * 12 + y) * 0.0254  # convert feet and inches to meters
> mapply(f, c(5,6,5),c(3,1,9))
[1] 1.6002 1.8542 1.7526
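To make the element-by-element pairing explicit, the sketch below repeats the feet-and-inches conversion three ways: with mapply(), with an explicit loop, and with plain vectorized arithmetic. The function is the same formula as f above under the hypothetical name to_meters; the other variable names are also illustrative.

```r
# Same formula as f above: (feet * 12 + inches) inches, times 0.0254 m per inch.
to_meters <- function(feet, inches) (feet * 12 + inches) * 0.0254

feet   <- c(5, 6, 5)
inches <- c(3, 1, 9)

via_mapply <- mapply(to_meters, feet, inches)

# What mapply() does conceptually: call the function once per position.
via_loop <- numeric(length(feet))
for (i in seq_along(feet)) {
  via_loop[i] <- to_meters(feet[i], inches[i])
}

# The formula only uses vectorized operators, so it also works directly.
via_vector <- to_meters(feet, inches)

print(all.equal(via_mapply, via_loop))    # TRUE
print(all.equal(via_mapply, via_vector))  # TRUE
```

mapply() earns its place when the function is not vectorized, for example when it branches on its arguments with if.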
This example uses the built-in dataset CO2 and sums the uptake for each plant:
> tapply(CO2$uptake,CO2$Plant, sum)
  Qn1   Qn2   Qn3   Qc1   Qc3   Qc2   Mn3   Mn2   Mn1   Mc2   Mc3   Mc1
232.6 246.1 263.3 209.8 228.1 228.9 168.8 191.4 184.8  85.0 121.1 126.0
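Grouping with tapply() can be understood as splitting the values by a factor and then applying the function to each piece. The sketch below, with illustrative variable names, rebuilds the same per-plant totals from split() and sapply() and checks that nothing was lost in the grouping.

```r
by_plant <- tapply(CO2$uptake, CO2$Plant, sum)

# Split the uptake values into one vector per plant, then sum each vector.
pieces   <- split(CO2$uptake, CO2$Plant)
by_split <- sapply(pieces, sum)

# tapply() returns a one-dimensional array and sapply() a named vector,
# so compare the group names and the values separately.
print(identical(names(by_plant), names(by_split)))          # TRUE
print(all.equal(as.vector(by_plant), as.vector(by_split)))  # TRUE

# The group totals add up to the overall total.
print(all.equal(sum(by_plant), sum(CO2$uptake)))            # TRUE
```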
The built-in CO2 dataset is shown below: