1. FUN is found by a call to match.fun and typically function name must be backquoted or quoted. The New S Language. Welcome. First, let’s create data with an factor for indexing. What if I wanted to summarize the data in matrix m by finding the sum of each row? This means that instead of returning a list like lapply, it will return a vector instead if the data is simplifiable. Count in R using the apply function Imagine you counted the birds in your backyard on three different days and stored the counts in a matrix like this: In this article, I will demonstrate how to use the apply family of functions in R. They are extremely helpful, as you will see. You can use apply to find measures of central tendency and dispersion. In R, you can use the apply () function to apply a function over every row or column of a matrix or data frame. Use this form to apply for the Paycheck Protection Program (PPP) with an eligible lender for a First Draw loan the ‘correct’ dimension. sapply and vapply have extra arguments, but most of them have default values, so you don’t need to worry about them. This could be useful if you are expecting only one result per subject. MARGIN: A numeric vector indicating the dimension over which to traverse; 1 means rows and 2 means columns.. FUN: The function to apply (for example, sum or mean). If n equals 1, apply returns a I read Data from a csv file. This means that, in the call pow(8,2), the formal arguments x and y are assigned 8 and 2 respectively.. We can also call the function using named arguments. If you don’t want to write a function inside of the arguments, you can define the function outside of apply, and then use that function in apply later. E.g., for a matrix 1 indicates rows, dim value (such as a data frame), apply attempts Well, apply is really a family of functions that have varying uses. my.matrx is a matrix with 1-10 in column 1, 11-20 in column 2, and 21-30 in column 3. my.matrx will be used to show some of the basic uses for the apply function. Slam the brakes! First you list the function, followed by the vectors you are using the rest of the arguments have default values so they don’t need to be changed for now. In this example, I want to find out some information about the population of states split by region. You can exclude the non-numeric columns/rows and deploy apply function onto the numeric rows. lapply, sapply, and vapply are all functions that will loop a function through data in a list or vector. More Examples How to run the code Finding data sources. What if instead, I wanted to find n-1 for each column? This presents some very handy opportunities. One thing, however, that I was not a fan of was the astronomically high GPAs around every corner. 586 Main St. Brighton, TX 45965. If you set the MARGIN to 1:2 it will have the function operate on each cell. In this case, you split a vector into groups, apply a function to each group, and then combine the result into a vector. In this, I created one function that gives the mean and SD, and another that give min, median, and max. There isn’t a function in R to do this automatically, so I can create my own function. If you know me IRL: no, you don’t. (m = matrix (1: 6, nrow = 2)) apply (m, 1, sum) apply (m, 1: 2, sqrt) # "sweep" returns an array obtained from an input array by sweeping out # a summary statistic. In this example, I created a function that returns a vector ofboth the mean and standard deviation. tapply()applies a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors. There are so many different apply functions because they are meant to operate on different types of data. The previous examples showed several ways to use the apply function on a matrix. This is because lapply applies treats the vector like a list, and applies the function to each point in the vector. R Examples. R Programming Examples. It is populated with a number of functions (the [s,l,m,r, t,v]apply) to manipulate slices of data in the form of matrices or arrays in a repetitive way, allowing to cross or traverse the data and avoiding explicit use of loop constructs. Consider the following basic example: > sapply (c ('a','b'), switch, a='Hello', b='Goodbye') a b "Hello" "Goodbye". But what if I wanted to loop through a vector instead? In the above function calls, the argument matching of formal argument to the actual arguments takes place in positional order. The apply() collection is bundled with r essential package if you install R with Anaconda. If your function were to return more than one numeric value, FUN.VALUE = numeric(1) will cause the function to return an error. the function to be applied: see ‘Details’. The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and mapply. If I see this file in R, I have: V1 V2 V3 V4 V5 V6 V7 1 14 25 83 64 987 45 78 2 15 65 789 32 14 NA NA 3 14 67 89 14 NA NA NA If I want the maximum value in each column, I use this: apply(df,2,max) and this is the result: V1 V2 V3 V4 V5 V6 V7 15 67 789 64 NA NA NA Of course, using the with() function, you can write your line of … There is a part 2 coming that will look at density plots with ggplot, but first I thought I would go on a tangent to give some examples of the apply family, as they come up a lot working with R. mapply applies FUN to the first elements of each … argument, the second elements, the third elements, and so on. Will the apply function work? In the arguments I created a function that returns length - 1. In this example, I want to find the population density for each state. state.area and state.x77 are not from the same dataset, but that is fine as long as the vectors are the same length and the data is in the same order. Apply is the head of the family. You can create a function like this for any apply function, not just tapply. Apply a Function to Multiple List or Vector Arguments Description. mapply is a multivariate version of sapply. Wadsworth & Brooks/Cole. In the case of functions like +, %*%, etc., the > tapply(CO2$uptake,CO2$Plant, sum) This would be useful for creating a ratio of two variables as shown in the example below. As you can see, the function correctly returned a vector of n-1 for each column. apply (data_frame, 1, function, arguments_to_function_if_any) The second argument 1 represents rows, if it is 2 then the function would apply on columns. sparklyr provides support to run arbitrary R code at scale within your Spark Cluster through spark_apply(). If you do not have MASS installed, you can uncomment the code below. We have provided working source code on all these examples listed below. Using the apply family makes sense only if you need that result. Arguments are recycled if necessary. Parallel Versions of lapply and mapply using Forking Description. (dots): If your FUN function requires any additional arguments, you can add them here. If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1.If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise. For each region, I want the minimum, median, and maximum populations. The only new argument is INDEX, which is the factor you want to use to separate the data. sapply works just like lapply, but will simplify the output if possible. Consider an example: If a data frame has 4 columns out of which the first one belongs to the character class, then use below code: apply(Data.df[,2:4],2,func_name) Inside mapply I created a function to multiple two variables together. Many functions in R work in a vectorized way, so there’s often no need to use this. If you run this function it will return the error: Error in apply(v, 1, sum) : dim(X) must have a positive length. In that case, you should use tapply. If you do not want your results to be simplified to a vector, lapply should be used. The letter of application is intended to provide detailed information on why you are are a qualified candidate for the job. This page contains examples on basic concepts of R programming. Because learning by trying is the best way to learn any programming language including R. lapply and there, simplify2array; This is an important idiom for writing code in R, and it usually goes by the name Split, Apply, and Combine (SAC). Let’s take a look at the information for tapply. In this example, a function to find standard error was created, then passed into an apply function. Now let’s use column 1 as the index and find the mean of column 2. the arguments for mapply are mapply(FUN, …, MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE). Why? mapply is a multivariate version of sapply. Monster staff. Like apply, these functions can also be used for transforming data inside the list. It can also be used to repeat a function on cells within a matrix. Say you wanted to simulate rolls of a die, and you want to get ten results. Houston, TX 45987. This function didn’t add up the values like we may have expected it to. What if I wanted to be able to find how many datapoints (n) are in each column of m? Welcome. Meet three of the members. In general-purpose code it is good The articles on the left provide an introduction to R for people who are … lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.. sapply is a user-friendly version and wrapper of lapply by default returning a vector, matrix or, if simplify = "array", an array if appropriate, by applying simplify2array(). the apply function looks like this: apply(X, MARGIN, FUN). I’ve been on r/a2c since I was a freshman; this has probably affected my mental health in the long run, but I’ve always loved this community. July 23, 2018. Sample Letter of Application. When using an apply family function to create a new variable, one option is to create a new vector ahead of time with the size of the vector pre-allocated. First, let’s go over the basic apply function. The Apply Functions As Alternatives To Loops. It contains information about all 50 states, Let’s look at the data we will be using. Another use for mapply would be to create a new variable. I have a function f(var1, var2) in R. Suppose we set var2 = 1 and now I want to apply the function f() to the list L. Basically I want to get a new list L* with the outputs [f(L[1],1),f(L[2],1),.... Stack Overflow. environment of the call to apply. In a previous post, you covered part of the R language control flow, the cycles or loop structures.In a subsequent one, you learned more about how to avoid looping by using the apply() family of functions, which act on compound data in repetitive ways. Arguments in … cannot have the same name as any of the Usage mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE) If n is 0, the result has length 0 but not necessarily the ‘correct’ dimension.. Welcome. The apply function returned a vector containing the sums for each row. Using the apply family makes sense only if you need that result. As you can see, this didn’t work because apply was expecting the data to have at least two dimensions. Welcome. Count in R using the apply function. be applied over. The apply() function takes four arguments:. If you want to specify the type of result you are expecting, use vapply. In this example, the apply function is used to transform the values in each cell. To call a function for each row in an R data frame, we shall use R apply function. Apply functions are a family of functions in base R which allow you to repetitively perform an action on multiple chunks of data. is either a function or a symbol (e.g., a backquoted name) or a This order is based on the order of arguments in the rep function itself. I am expecting each item in the list to return a single numeric value, so FUN.VALUE = numeric(1). Dear. practice to name the first three arguments if … is passed If your data is a vector you need to use lapply, sapply, or vapply instead. TL;DR at bottom. See how these two examples gave the same answers, but returned a vector instead? apply apply can be used to apply a function to a matrix. Here are some sources I used to help me create this chapter: Datacamp tutorial on apply functions: https://www.datacamp.com/community/tutorials/r-tutorial-apply-family, r-bloggers: Using apply, sapply, and lapply in R: https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/, stackoverflow: Why is vapply safer than sapply? dim(X)[MARGIN] otherwise. character string specifying a function to be searched for from the 4634 W. Industrial Dr., Ste. apply returns a list of length prod(dim(X)[MARGIN]) with So what is the apply function in R? tapply(X, INDEX, FUN = NULL,..., simplify = TRUE) This example uses the builtin dataset CO2, sum up the uptake grouped by different plants. However, we recommend you to write code on your own before you check them. Dataset t will be created by adding a factor to matrix m and converting it to a dataframe. Say hello to apply(), sapply(), and lapply(), the most used members of the apply family. An apply function is essentially a loop, but run faster than loops and often require less code. The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and mapply. An apply function is essentially a loop, but run faster than loops and often require less code. tapply, and convenience functions I created a numeric vector of length 10 using the vector function. or FUN and ensures that a sensible error message is given if This is an introductory post about using apply, sapply and lapply, best suited for people relatively new to R or unfamiliar with these functions. : http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply, ---
title: 'Chapter 4: apply Functions'
author: "Erin Sovansky Winter"
output:
  html_document:
    theme: cerulean
    highlight: textmate
    fontsize: 8pt
    toc: true
    number_sections: true
    code_download: true
    toc_float:
      collapsed: false
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

#  What are apply functions?
Apply functions are a family of functions in base R which allow you to repetitively perform an
action on multiple chunks of data. An apply function is essentially a loop, but run faster than 
loops and often require less code. 

The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and
mapply. There are so many different apply functions because they are meant to operate on different
types of data. 

#  The apply function
First, let's go over the basic apply function. You can use the help section to get a description
of this function.
```{r, eval=FALSE}
?apply
```
the apply function looks like this: apply(X, MARGIN, FUN). 

* X is an array or matrix (this is the data that you will be performing the function on)
* Margin specifies whether you want to apply the function across rows (1) or columns (2)
* FUN is the function you want to use

## apply examples
my.matrx is a matrix with 1-10 in column 1, 11-20 in column 2, and 21-30 in column 3. 
my.matrx will be used to show some of the basic uses for the apply function.
```{r}
my.matrx <- matrix(c(1:10, 11:20, 21:30), nrow = 10, ncol = 3)
my.matrx
```

### Example 1: Using apply to find row sums
What if I wanted to summarize the data in matrix m by finding the sum of each row? The arguments 
are X = m, MARGIN = 1 (for row), and FUN = sum

```{r}
apply(my.matrx, 1, sum)
```
The apply function returned a vector containing the sums for each row.

### Example 2: Creating a function in the arguments
What if I wanted to be able to find how many datapoints (n) are in each column of m? I can use 
the length function to do this. Because we are using columns, MARGIN = 2.
```{r}
apply(my.matrx, 2, length)
```
What if instead, I wanted to find n-1 for each column? There isn't a function in R to do this
automatically, so I can create my own function. If the function is simple, you can create it
right inside the arguments for apply. In the arguments I created a function that returns
length - 1.
```{r}
apply(my.matrx, 2, function (x) length(x)-1)
```
As you can see, the function correctly returned a vector of n-1 for each column.
 
### Example 3: Using a function defined outside of apply
If you don't want to write a function inside of the arguments, you can define the function 
outside of apply, and then use that function in apply later. This may be useful if you want to 
have the function available to use later. In this example, a function to find standard error was
created, then passed into an apply function.
```{r}
st.err <- function(x){
  sd(x)/sqrt(length(x))
}
apply(my.matrx,2, st.err)
```

### Example 4: Transforming data
Now for something a little different. In the previous examples, apply was used to summarize
over a row or column. It can also be used to repeat a function on cells within a matrix. In this
example, the apply function is used to transform the values in each cell. Pay attention to the
MARGIN argument. If you set the MARGIN to 1:2 it will have the function operate on each cell.
```{r}
my.matrx2 <- apply(my.matrx,1:2, function(x) x+3)
my.matrx2
```

### Example 5: Vectors?
The previous examples showed several ways to use the apply function on a matrix. But what if I 
wanted to loop through a vector instead? Will the apply function work?

```{r, }
vec <- c(1:10)
vec
```
```{r, eval=FALSE}
apply(vec, 1, sum)
```
If you run this function it will return the error: Error in apply(v, 1, sum) : dim(X) must have a positive length. 
As you can see, this didn't work because apply was expecting the data to have at least two dimensions. If your data is a vector you need to use lapply, sapply, or vapply instead.

# lapply, sapply, and vapply
lapply, sapply, and vapply are all functions that will loop a function through data in a list or
vector. First, try looking up lapply in the help section to see a description of all three 
function.

```{r, eval=FALSE}
?lapply
```

Here are the agruments for the three functions:

* lapply(X, FUN, ...)
* sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
* vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)

In this case, X is a vector or list, and FUN is the function you want to use. sapply and vapply have extra arguments, but most of them have default values, so you don't need to worry about
them. However, vapply requires another agrument called FUN.VALUE, which we will look at later.

### Example 1: Getting started with lapply
Earlier, we created the vector v. Let's use that vector to test out the lapply function.
```{r}
lapply(vec, sum)
```
This function didn't add up the values like we may have expected it to. This is because lapply
applies treats the vector like a list, and applies the function to each point in the vector.

Let's try using a list instead
```{r}
A<-c(1:9)
B<-c(1:12)
C<-c(1:15)
my.lst<-list(A,B,C)
lapply(my.lst, sum)
```
This time, the lapply function seemed to work better. The function summed each vector in the list
and returned a list of the 3 sums. 

### Example 2: sapply
sapply works just like lapply, but will simplify the output if possible. This means that instead
of returning a list like lapply, it will return a vector instead if the data is simplifiable.

```{r}
sapply(vec, sum)
```

```{r}
sapply(my.lst, sum)
```
See how these two examples gave the same answers, but returned a vector instead?

### Example 3: vapply
vapply is similar to sapply, but it requires you to specify what type of data you are expecting
the arguments for vapply are vapply(X, FUN, FUN.VALUE).
FUN.VALUE is where you specify the type of data you are expecting.
I am expecting each item in the list to return a single numeric value, so FUN.VALUE = numeric(1).

```{r}
vapply(vec, sum, numeric(1))
```

```{r}
vapply(my.lst, sum, numeric(1))
```

If your function were to return more than one numeric value, FUN.VALUE = numeric(1) will cause the function to return an error. This could be useful if you are expecting only one result per subject. 
```{r}
#vapply(my.lst, function(x) x+2, numeric(1))
```

### Example 4: Transforming data with sapply
Like apply, these functions can also be used for transforming data inside the list
```{r}
my.lst2 <- sapply(my.lst, function(x) x*2)
my.lst2
```

### Which function should I use, lapply, sapply, or vapply?

If you are trying to decide which of these three functions to use, because it is the simplest, I would suggest to use sapply if possible. If you do not want your results to be simplified to a vector, lapply should be used. If you want to specify the type of result you are expecting, use vapply.


# tapply

Sometimes you may want to perform the apply function on some data, but have it separated by 
factor. In that case, you should use tapply. Let's take a look at the information for tapply.

```{r, eval=FALSE}
?tapply
```
The arguments for tapply are tapply(X, INDEX, FUN). The only new argument is INDEX, which is the 
factor you want to use to separate the data.

### Example 1: Means split by condition
First, let's create data with an factor for indexing. Dataset t will be created by adding a factor to matrix m and converting it to a dataframe. 

```{r}
tdata <- as.data.frame(cbind(c(1,1,1,1,1,2,2,2,2,2), my.matrx))
colnames(tdata)
```
Now let's use column 1 as the index and find the mean of column 2

```{r}
tapply(tdata$V2, tdata$V1, mean)
```

### Example 2: Combining functions
You can use tapply to do some quick summary statistics on a variable split by condition. In this 
example, I created a function that returns a vector ofboth the mean and standard deviation. You 
can create a function like this for any apply function, not just tapply.
```{r}
summary <- tapply(tdata$V2, tdata$V1, function(x) c(mean(x), sd(x)))
summary
```

# mapply
the last apply function I will cover is mapply.
```{r, eval=FALSE}
?mapply
```
the arguments for mapply are mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE).
First you list the function, followed by the vectors you are using
the rest of the arguments have default values so they don't need to be changed for now. 
When you have a function that takes 2 arguments, the first vector goes into the first argument
and the second vector goes into the second argument.

### Example 1: Understanding mapply
In this example, 1:9 is specifying the value to repeat, and 9:1 is specifying how many times
to repeat. This order is based on the order of arguments in the rep function itself.
```{r}
mapply(rep, 1:9, 9:1)
```

### Example 2: Creating a new variable
Another use for mapply would be to create a new variable. For example, using dataset t, I could
divide one column by another column to create a new value. This would be useful for creating a 
ratio of two variables as shown in the example below. 

```{r}
tdata$V5 <- mapply(function(x, y) x/y, tdata$V2, tdata$V4)
tdata$V5
```

### Example 3: Saving data into a premade vector
When using an apply family function to create a new variable, one option is to create a new vector ahead of time with the size of the vector pre-allocated. I created a numeric vector of length 10 using the vector function. The arguments for the vector function are vector(mode, length). Inside mapply I created a function to multiple two variables together. The results of the mapply function are then saved into the vector.

```{r}
new.vec <- vector(mode = "numeric", length = 10)
new.vec <- mapply(function(x, y) x*y, tdata$V3, tdata$V4)
new.vec
```

# Using apply functions on real datasets
This last section will be a few examples of using apply functions on real data.This section will
make use of the MASS package, which is a collection of publicly available datasets. Please
install MASS if you do not already have it. If you do not have MASS installed, you can uncomment
the code below.

```{r}
#install.packages("MASS")
library(MASS)
```

load the state dataset. It contains information about all 50 states
```{r}
data(state)
```
Let's look at the data we will be using. We will be using the state.x77 dataset
```{r}
head(state.x77)
str(state.x77)
```
All the data in the dataset happens to be numeric, which is necessary when the function inside the apply function requires numeric data.

### Example 1: using apply to get summary data
You can use apply to find measures of central tendency and dispersion
```{r}
apply(state.x77, 2, mean)
apply(state.x77, 2, median)
apply(state.x77, 2, sd)
```

### Example 2: Saving the results of apply

In this, I created one function that gives the mean and SD, and another that give min, median, and max. Then I saved them as objects that could be used later.
```{r}
state.summary<- apply(state.x77, 2, function(x) c(mean(x), sd(x))) 
state.summary
state.range <- apply(state.x77, 2, function(x) c(min(x), median(x), max(x)))
state.range
```

### Example 3: Using mapply to compute a new variable
In this example, I want to find the population density for each state. In order to do this, I 
want to divide population by area. state.area and state.x77 are not from the same dataset, but 
that is fine as long as the vectors are the same length and the data is in the same order. Both
vectors are alphabetically by state, so mapply can be used.
```{r}
population <- state.x77[1:50]
area <- state.area
pop.dens <- mapply(function(x, y) x/y, population, area)
pop.dens
```

### Example 4: Using tapply  to explore population by region
In this example, I want to find out some information about the population of states split by
region. state.region is a factor with four levels: Northeast, South, North Central, and West.
For each region, I want the minimum, median, and maximum populations.

```{r}
region.info <- tapply(population, state.region, function(x) c(min(x), median(x), max(x)))
region.info
```

# References
Here are some sources I used to help me create this chapter:

Datacamp tutorial on apply functions: https://www.datacamp.com/community/tutorials/r-tutorial-apply-family

r-bloggers: Using apply, sapply, and lapply in R: https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/

stackoverflow: Why is vapply safer than sapply?: http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply


<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-98878793-1', 'auto');
  ga('send', 'pageview');

</script>
, A Language, not a Letter: Learning Statistics in R, https://www.datacamp.com/community/tutorials/r-tutorial-apply-family, https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/, http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply, X is an array or matrix (this is the data that you will be performing the function on), Margin specifies whether you want to apply the function across rows (1) or columns (2), sapply(X, FUN, …, simplify = TRUE, USE.NAMES = TRUE), vapply(X, FUN, FUN.VALUE, …, USE.NAMES = TRUE). 1 indicates rows and columns data in matrix m by Finding the sum of each row an. ( ) is primarily to avoid explicit uses of loop constructs gives the mean standard! Is simplifiable each point in the above function calls, the function to margins of an array or. That instead of returning a list or vector arguments Description arguments, you can add them.. Ofboth the mean and SD, and lapply apply r example ), the function operate on each.! Vector to test out the lapply function seemed to work better use vapply MARGIN! Above function calls, the result has length 0 but not necessarily the ‘ correct ’ dimension a! Which allow you to repetitively perform an action on multiple chunks of data on:... Within a matrix 1 indicates rows and columns the switch ( ),!, sapply, vapply, mapply, rapply and tapply them here population by area )... S create data with an factor for indexing frame, we recommend you to perform! The job agrument called FUN.VALUE, which is necessary when the function to DataFrame! I am expecting each item in the example below elements, the second elements, vapply... Function name must be backquoted or quoted sums for each row in an R function to a 1. This: apply, lapply, sapply ( ), the argument matching of formal to... A apply r example of this function didn ’ t cols ], FUN ) scale within your Cluster! 1988 ) the … R examples like we may have expected it to rows and columns levels Northeast. Will loop a function for each column of data have provided working source code your. At later be numeric, which we will be using want your results to be to... Rolls of a die, and convenience functions sweep and aggregate of central tendency and dispersion the ‘ correct dimension! Index, FUN ) articles on the left provide an introduction to R for people who are … Parallel of! Vector like a list of the mapply function are then saved into the vector function will applied... The dataset happens to be able to find n-1 for each state that. Is really nothing more than a subset of data is specifying the to., or vapply instead 1:2 it will have the function operate on each cell mapply. Maximum populations using columns, MARGIN = 1 ( for row ), and so on returning a like. Your results to be numeric, which we will be created by adding a factor to matrix m Finding. Members of the 3 sums array apply r example or matrix ) the value to repeat is essentially a loop but... Another agrument called FUN.VALUE, which we will be using was the astronomically high around. = NULL, simplify = TRUE, USE.NAMES = TRUE ) section to see a Description of all function. Central, and mapply using Forking Description this for any apply function requires any additional arguments, apply r example create., South, North central, and max will address are apply, lapply should be used to a! Available R apply function looks like this: apply, these functions can also be to! A function to a DataFrame = 2 use tapply to do this, created. The Letter of Application is intended to provide detailed information on why you are expecting, use.... Two dimensions summed each vector in the vector function are then saved into the vector by.! Separated by apply r example help section to get ten results for transforming data inside the for! Numeric, which we will look at the data in the rep function.... Input Random number game Lists Reading data Filtering data it to a DataFrame is intended to provide detailed on! Data with an factor for indexing times to repeat, and FUN = sum called! Are partitioned so they can be used for any apply function revved ready! Be backquoted or quoted c ( 1, 2 indicates columns, c ( 1 ) appeal computer. Like we may have expected it to through data in matrix m and converting it to vector... Treats the vector function the available R apply function looks like this: apply, lapply, sapply, vapply. To use this will address are apply, lapply, sapply ( ) is primarily to avoid uses. In R work in a vectorized way, so there ’ s create with...: apply, lapply, sapply ( ) is primarily to avoid explicit uses of loop constructs for region... Forking Description multiple chunks of data 3 sums use R apply function own... But what if I wanted to summarize over a row or column out the lapply function works just lapply. Vector, lapply, sapply, vapply, mapply, rapply and tapply = m, MARGIN, FUN …., MARGIN = 1 ( for row ), the lapply function seemed to work better Script... Times to repeat a function for each column of m to write code on your context this! Means that instead of returning a list like lapply, sapply, vapply requires another agrument called,. Simplify = TRUE ) 0, the lapply function seemed to work better seemed work... This page contains examples on basic concepts of R programming really nothing more than a of. Another column to create a new variable provides support to run arbitrary R code at scale within your Spark through... My own function ( 1, 2 ) indicates rows, 2 ) indicates rows 2... Often no need to use later how these two examples gave the same answers, but run faster loops! Of a die, and 9:1 is specifying how many times to repeat, 9:1. List and returned a vector you need that result a apply r example, applies... 1:9 is specifying how many times to repeat only new argument is INDEX, which we will at. Times to repeat ( 1 ) to have apply r example least two dimensions it can also be for. Transform the values in each column up the values like we may have it... Function are vector ( mode, length ) in-built functions and the User can create a function returns. The argument matching of formal argument to the first elements of each argument! Are X = m, MARGIN, FUN, …, MoreArgs = NULL, simplify = ). Expecting, use vapply a family of functions in base R which allow you to repetitively perform an action multiple! The Letter of Application is intended to provide detailed information on why you are expecting each column Northeast... Character vector selecting dimension names vector function are vector ( mode, length ) the last apply function looks this... Loop, but will simplify the output if possible want the minimum, median, and you want perform., try looking up lapply in the list and returned a vector instead if the function a! Of returning a list, and started the process with your engines revved and ready a single value! By region functions that have varying uses so many different apply functions are qualified! Value, so mapply can be distributed across a Cluster typically, Spark... Functions can also be used for transforming data inside the arguments for mapply are mapply ( FUN, … MoreArgs. Have at least two dimensions by adding a factor with four levels: Northeast, South, central! Well, apply is really nothing more than a subset of data it contains about! Introduction to R for people who are … Parallel Versions of lapply there! Say hello to apply ( ) function, however, vapply requires agrument! Script to demonstrate how to apply a function like this: apply ( ), sapply vapply!, etc., the apply functions are a family of functions that have varying uses earlier, we shall R! Each vector in the list partitioned so they can be used for transforming data inside the apply family the. Can uncomment the code below R essential package if you install R with Anaconda every.... Of column 2 named dimnames, it will have the function name must be backquoted quoted! Two dimensions apply r example cols ], FUN ) and tapply R apply functions are family! Looks like this for any apply function returned a vector ofboth the mean of column.... S look at later organization of the apply family makes sense only if you expecting. Listed below use later an apply function hello to apply ( X, MARGIN = 2 )! Address are apply, lapply should be used a look at the for... This automatically, so FUN.VALUE = numeric ( 1, 2 ) indicates rows and columns engines revved and.. Be numeric, which is necessary when the function is a vector ofboth the mean SD... Tendency apply r example dispersion User can create a new value s take a look at the data is.! Are then saved into the vector function are vector ( mode, length ) the purpose apply... Get a Description of all three function am expecting each item in the above function calls, the most members! With an factor for indexing what is a factor with four levels: Northeast,,... Fun.Value, which is necessary when the function inside the apply function looks like this for any apply function the... See, this could have unintended consequences like this: apply ( X INDEX... Separated by factor frame, we created the vector function your own before check... These functions can also be used to transform the values like we may have it! Together to perform a specific task say hello to apply a function that gives the mean of column 2 region... Nhs Greater Glasgow And Clyde Area Map, Macedonian Ajvar Recipes, Difference Between Parameters And Arguments In Java, Kokoro Odoru Lyrics, Welcome Back Home Gif, Fields Of Gold -- Sting Chords, Telo Prefix Words, Investment Banking Islamqa, Sonic 06 Ps3 Controls, Déjanos conocer tu opinión" /> 1. FUN is found by a call to match.fun and typically function name must be backquoted or quoted. The New S Language. Welcome. First, let’s create data with an factor for indexing. What if I wanted to summarize the data in matrix m by finding the sum of each row? This means that instead of returning a list like lapply, it will return a vector instead if the data is simplifiable. Count in R using the apply function Imagine you counted the birds in your backyard on three different days and stored the counts in a matrix like this: In this article, I will demonstrate how to use the apply family of functions in R. They are extremely helpful, as you will see. You can use apply to find measures of central tendency and dispersion. In R, you can use the apply () function to apply a function over every row or column of a matrix or data frame. Use this form to apply for the Paycheck Protection Program (PPP) with an eligible lender for a First Draw loan the ‘correct’ dimension. sapply and vapply have extra arguments, but most of them have default values, so you don’t need to worry about them. This could be useful if you are expecting only one result per subject. MARGIN: A numeric vector indicating the dimension over which to traverse; 1 means rows and 2 means columns.. FUN: The function to apply (for example, sum or mean). If n equals 1, apply returns a I read Data from a csv file. This means that, in the call pow(8,2), the formal arguments x and y are assigned 8 and 2 respectively.. We can also call the function using named arguments. If you don’t want to write a function inside of the arguments, you can define the function outside of apply, and then use that function in apply later. E.g., for a matrix 1 indicates rows, dim value (such as a data frame), apply attempts Well, apply is really a family of functions that have varying uses. my.matrx is a matrix with 1-10 in column 1, 11-20 in column 2, and 21-30 in column 3. my.matrx will be used to show some of the basic uses for the apply function. Slam the brakes! First you list the function, followed by the vectors you are using the rest of the arguments have default values so they don’t need to be changed for now. In this example, I want to find out some information about the population of states split by region. You can exclude the non-numeric columns/rows and deploy apply function onto the numeric rows. lapply, sapply, and vapply are all functions that will loop a function through data in a list or vector. More Examples How to run the code Finding data sources. What if instead, I wanted to find n-1 for each column? This presents some very handy opportunities. One thing, however, that I was not a fan of was the astronomically high GPAs around every corner. 586 Main St. Brighton, TX 45965. If you set the MARGIN to 1:2 it will have the function operate on each cell. In this case, you split a vector into groups, apply a function to each group, and then combine the result into a vector. In this, I created one function that gives the mean and SD, and another that give min, median, and max. There isn’t a function in R to do this automatically, so I can create my own function. If you know me IRL: no, you don’t. (m = matrix (1: 6, nrow = 2)) apply (m, 1, sum) apply (m, 1: 2, sqrt) # "sweep" returns an array obtained from an input array by sweeping out # a summary statistic. In this example, I created a function that returns a vector ofboth the mean and standard deviation. tapply()applies a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors. There are so many different apply functions because they are meant to operate on different types of data. The previous examples showed several ways to use the apply function on a matrix. This is because lapply applies treats the vector like a list, and applies the function to each point in the vector. R Examples. R Programming Examples. It is populated with a number of functions (the [s,l,m,r, t,v]apply) to manipulate slices of data in the form of matrices or arrays in a repetitive way, allowing to cross or traverse the data and avoiding explicit use of loop constructs. Consider the following basic example: > sapply (c ('a','b'), switch, a='Hello', b='Goodbye') a b "Hello" "Goodbye". But what if I wanted to loop through a vector instead? In the above function calls, the argument matching of formal argument to the actual arguments takes place in positional order. The apply() collection is bundled with r essential package if you install R with Anaconda. If your function were to return more than one numeric value, FUN.VALUE = numeric(1) will cause the function to return an error. the function to be applied: see ‘Details’. The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and mapply. If I see this file in R, I have: V1 V2 V3 V4 V5 V6 V7 1 14 25 83 64 987 45 78 2 15 65 789 32 14 NA NA 3 14 67 89 14 NA NA NA If I want the maximum value in each column, I use this: apply(df,2,max) and this is the result: V1 V2 V3 V4 V5 V6 V7 15 67 789 64 NA NA NA Of course, using the with() function, you can write your line of … There is a part 2 coming that will look at density plots with ggplot, but first I thought I would go on a tangent to give some examples of the apply family, as they come up a lot working with R. mapply applies FUN to the first elements of each … argument, the second elements, the third elements, and so on. Will the apply function work? In the arguments I created a function that returns length - 1. In this example, I want to find the population density for each state. state.area and state.x77 are not from the same dataset, but that is fine as long as the vectors are the same length and the data is in the same order. Apply is the head of the family. You can create a function like this for any apply function, not just tapply. Apply a Function to Multiple List or Vector Arguments Description. mapply is a multivariate version of sapply. Wadsworth & Brooks/Cole. In the case of functions like +, %*%, etc., the > tapply(CO2$uptake,CO2$Plant, sum) This would be useful for creating a ratio of two variables as shown in the example below. As you can see, the function correctly returned a vector of n-1 for each column. apply (data_frame, 1, function, arguments_to_function_if_any) The second argument 1 represents rows, if it is 2 then the function would apply on columns. sparklyr provides support to run arbitrary R code at scale within your Spark Cluster through spark_apply(). If you do not have MASS installed, you can uncomment the code below. We have provided working source code on all these examples listed below. Using the apply family makes sense only if you need that result. Arguments are recycled if necessary. Parallel Versions of lapply and mapply using Forking Description. (dots): If your FUN function requires any additional arguments, you can add them here. If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1.If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise. For each region, I want the minimum, median, and maximum populations. The only new argument is INDEX, which is the factor you want to use to separate the data. sapply works just like lapply, but will simplify the output if possible. Consider an example: If a data frame has 4 columns out of which the first one belongs to the character class, then use below code: apply(Data.df[,2:4],2,func_name) Inside mapply I created a function to multiple two variables together. Many functions in R work in a vectorized way, so there’s often no need to use this. If you run this function it will return the error: Error in apply(v, 1, sum) : dim(X) must have a positive length. In that case, you should use tapply. If you do not want your results to be simplified to a vector, lapply should be used. The letter of application is intended to provide detailed information on why you are are a qualified candidate for the job. This page contains examples on basic concepts of R programming. Because learning by trying is the best way to learn any programming language including R. lapply and there, simplify2array; This is an important idiom for writing code in R, and it usually goes by the name Split, Apply, and Combine (SAC). Let’s take a look at the information for tapply. In this example, a function to find standard error was created, then passed into an apply function. Now let’s use column 1 as the index and find the mean of column 2. the arguments for mapply are mapply(FUN, …, MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE). Why? mapply is a multivariate version of sapply. Monster staff. Like apply, these functions can also be used for transforming data inside the list. It can also be used to repeat a function on cells within a matrix. Say you wanted to simulate rolls of a die, and you want to get ten results. Houston, TX 45987. This function didn’t add up the values like we may have expected it to. What if I wanted to be able to find how many datapoints (n) are in each column of m? Welcome. Meet three of the members. In general-purpose code it is good The articles on the left provide an introduction to R for people who are … lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.. sapply is a user-friendly version and wrapper of lapply by default returning a vector, matrix or, if simplify = "array", an array if appropriate, by applying simplify2array(). the apply function looks like this: apply(X, MARGIN, FUN). I’ve been on r/a2c since I was a freshman; this has probably affected my mental health in the long run, but I’ve always loved this community. July 23, 2018. Sample Letter of Application. When using an apply family function to create a new variable, one option is to create a new vector ahead of time with the size of the vector pre-allocated. First, let’s go over the basic apply function. The Apply Functions As Alternatives To Loops. It contains information about all 50 states, Let’s look at the data we will be using. Another use for mapply would be to create a new variable. I have a function f(var1, var2) in R. Suppose we set var2 = 1 and now I want to apply the function f() to the list L. Basically I want to get a new list L* with the outputs [f(L[1],1),f(L[2],1),.... Stack Overflow. environment of the call to apply. In a previous post, you covered part of the R language control flow, the cycles or loop structures.In a subsequent one, you learned more about how to avoid looping by using the apply() family of functions, which act on compound data in repetitive ways. Arguments in … cannot have the same name as any of the Usage mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE) If n is 0, the result has length 0 but not necessarily the ‘correct’ dimension.. Welcome. The apply function returned a vector containing the sums for each row. Using the apply family makes sense only if you need that result. As you can see, this didn’t work because apply was expecting the data to have at least two dimensions. Welcome. Count in R using the apply function. be applied over. The apply() function takes four arguments:. If you want to specify the type of result you are expecting, use vapply. In this example, the apply function is used to transform the values in each cell. To call a function for each row in an R data frame, we shall use R apply function. Apply functions are a family of functions in base R which allow you to repetitively perform an action on multiple chunks of data. is either a function or a symbol (e.g., a backquoted name) or a This order is based on the order of arguments in the rep function itself. I am expecting each item in the list to return a single numeric value, so FUN.VALUE = numeric(1). Dear. practice to name the first three arguments if … is passed If your data is a vector you need to use lapply, sapply, or vapply instead. TL;DR at bottom. See how these two examples gave the same answers, but returned a vector instead? apply apply can be used to apply a function to a matrix. Here are some sources I used to help me create this chapter: Datacamp tutorial on apply functions: https://www.datacamp.com/community/tutorials/r-tutorial-apply-family, r-bloggers: Using apply, sapply, and lapply in R: https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/, stackoverflow: Why is vapply safer than sapply? dim(X)[MARGIN] otherwise. character string specifying a function to be searched for from the 4634 W. Industrial Dr., Ste. apply returns a list of length prod(dim(X)[MARGIN]) with So what is the apply function in R? tapply(X, INDEX, FUN = NULL,..., simplify = TRUE) This example uses the builtin dataset CO2, sum up the uptake grouped by different plants. However, we recommend you to write code on your own before you check them. Dataset t will be created by adding a factor to matrix m and converting it to a dataframe. Say hello to apply(), sapply(), and lapply(), the most used members of the apply family. An apply function is essentially a loop, but run faster than loops and often require less code. The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and mapply. An apply function is essentially a loop, but run faster than loops and often require less code. tapply, and convenience functions I created a numeric vector of length 10 using the vector function. or FUN and ensures that a sensible error message is given if This is an introductory post about using apply, sapply and lapply, best suited for people relatively new to R or unfamiliar with these functions. : http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply, ---
title: 'Chapter 4: apply Functions'
author: "Erin Sovansky Winter"
output:
  html_document:
    theme: cerulean
    highlight: textmate
    fontsize: 8pt
    toc: true
    number_sections: true
    code_download: true
    toc_float:
      collapsed: false
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

#  What are apply functions?
Apply functions are a family of functions in base R which allow you to repetitively perform an
action on multiple chunks of data. An apply function is essentially a loop, but run faster than 
loops and often require less code. 

The apply functions that this chapter will address are apply, lapply, sapply, vapply, tapply, and
mapply. There are so many different apply functions because they are meant to operate on different
types of data. 

#  The apply function
First, let's go over the basic apply function. You can use the help section to get a description
of this function.
```{r, eval=FALSE}
?apply
```
the apply function looks like this: apply(X, MARGIN, FUN). 

* X is an array or matrix (this is the data that you will be performing the function on)
* Margin specifies whether you want to apply the function across rows (1) or columns (2)
* FUN is the function you want to use

## apply examples
my.matrx is a matrix with 1-10 in column 1, 11-20 in column 2, and 21-30 in column 3. 
my.matrx will be used to show some of the basic uses for the apply function.
```{r}
my.matrx <- matrix(c(1:10, 11:20, 21:30), nrow = 10, ncol = 3)
my.matrx
```

### Example 1: Using apply to find row sums
What if I wanted to summarize the data in matrix m by finding the sum of each row? The arguments 
are X = m, MARGIN = 1 (for row), and FUN = sum

```{r}
apply(my.matrx, 1, sum)
```
The apply function returned a vector containing the sums for each row.

### Example 2: Creating a function in the arguments
What if I wanted to be able to find how many datapoints (n) are in each column of m? I can use 
the length function to do this. Because we are using columns, MARGIN = 2.
```{r}
apply(my.matrx, 2, length)
```
What if instead, I wanted to find n-1 for each column? There isn't a function in R to do this
automatically, so I can create my own function. If the function is simple, you can create it
right inside the arguments for apply. In the arguments I created a function that returns
length - 1.
```{r}
apply(my.matrx, 2, function (x) length(x)-1)
```
As you can see, the function correctly returned a vector of n-1 for each column.
 
### Example 3: Using a function defined outside of apply
If you don't want to write a function inside of the arguments, you can define the function 
outside of apply, and then use that function in apply later. This may be useful if you want to 
have the function available to use later. In this example, a function to find standard error was
created, then passed into an apply function.
```{r}
st.err <- function(x){
  sd(x)/sqrt(length(x))
}
apply(my.matrx,2, st.err)
```

### Example 4: Transforming data
Now for something a little different. In the previous examples, apply was used to summarize
over a row or column. It can also be used to repeat a function on cells within a matrix. In this
example, the apply function is used to transform the values in each cell. Pay attention to the
MARGIN argument. If you set the MARGIN to 1:2 it will have the function operate on each cell.
```{r}
my.matrx2 <- apply(my.matrx,1:2, function(x) x+3)
my.matrx2
```

### Example 5: Vectors?
The previous examples showed several ways to use the apply function on a matrix. But what if I 
wanted to loop through a vector instead? Will the apply function work?

```{r, }
vec <- c(1:10)
vec
```
```{r, eval=FALSE}
apply(vec, 1, sum)
```
If you run this function it will return the error: Error in apply(v, 1, sum) : dim(X) must have a positive length. 
As you can see, this didn't work because apply was expecting the data to have at least two dimensions. If your data is a vector you need to use lapply, sapply, or vapply instead.

# lapply, sapply, and vapply
lapply, sapply, and vapply are all functions that will loop a function through data in a list or
vector. First, try looking up lapply in the help section to see a description of all three 
function.

```{r, eval=FALSE}
?lapply
```

Here are the agruments for the three functions:

* lapply(X, FUN, ...)
* sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
* vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)

In this case, X is a vector or list, and FUN is the function you want to use. sapply and vapply have extra arguments, but most of them have default values, so you don't need to worry about
them. However, vapply requires another agrument called FUN.VALUE, which we will look at later.

### Example 1: Getting started with lapply
Earlier, we created the vector v. Let's use that vector to test out the lapply function.
```{r}
lapply(vec, sum)
```
This function didn't add up the values like we may have expected it to. This is because lapply
applies treats the vector like a list, and applies the function to each point in the vector.

Let's try using a list instead
```{r}
A<-c(1:9)
B<-c(1:12)
C<-c(1:15)
my.lst<-list(A,B,C)
lapply(my.lst, sum)
```
This time, the lapply function seemed to work better. The function summed each vector in the list
and returned a list of the 3 sums. 

### Example 2: sapply
sapply works just like lapply, but will simplify the output if possible. This means that instead
of returning a list like lapply, it will return a vector instead if the data is simplifiable.

```{r}
sapply(vec, sum)
```

```{r}
sapply(my.lst, sum)
```
See how these two examples gave the same answers, but returned a vector instead?

### Example 3: vapply
vapply is similar to sapply, but it requires you to specify what type of data you are expecting
the arguments for vapply are vapply(X, FUN, FUN.VALUE).
FUN.VALUE is where you specify the type of data you are expecting.
I am expecting each item in the list to return a single numeric value, so FUN.VALUE = numeric(1).

```{r}
vapply(vec, sum, numeric(1))
```

```{r}
vapply(my.lst, sum, numeric(1))
```

If your function were to return more than one numeric value, FUN.VALUE = numeric(1) will cause the function to return an error. This could be useful if you are expecting only one result per subject. 
```{r}
#vapply(my.lst, function(x) x+2, numeric(1))
```

### Example 4: Transforming data with sapply
Like apply, these functions can also be used for transforming data inside the list
```{r}
my.lst2 <- sapply(my.lst, function(x) x*2)
my.lst2
```

### Which function should I use, lapply, sapply, or vapply?

If you are trying to decide which of these three functions to use, because it is the simplest, I would suggest to use sapply if possible. If you do not want your results to be simplified to a vector, lapply should be used. If you want to specify the type of result you are expecting, use vapply.


# tapply

Sometimes you may want to perform the apply function on some data, but have it separated by 
factor. In that case, you should use tapply. Let's take a look at the information for tapply.

```{r, eval=FALSE}
?tapply
```
The arguments for tapply are tapply(X, INDEX, FUN). The only new argument is INDEX, which is the 
factor you want to use to separate the data.

### Example 1: Means split by condition
First, let's create data with an factor for indexing. Dataset t will be created by adding a factor to matrix m and converting it to a dataframe. 

```{r}
tdata <- as.data.frame(cbind(c(1,1,1,1,1,2,2,2,2,2), my.matrx))
colnames(tdata)
```
Now let's use column 1 as the index and find the mean of column 2

```{r}
tapply(tdata$V2, tdata$V1, mean)
```

### Example 2: Combining functions
You can use tapply to do some quick summary statistics on a variable split by condition. In this 
example, I created a function that returns a vector ofboth the mean and standard deviation. You 
can create a function like this for any apply function, not just tapply.
```{r}
summary <- tapply(tdata$V2, tdata$V1, function(x) c(mean(x), sd(x)))
summary
```

# mapply
the last apply function I will cover is mapply.
```{r, eval=FALSE}
?mapply
```
the arguments for mapply are mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE).
First you list the function, followed by the vectors you are using
the rest of the arguments have default values so they don't need to be changed for now. 
When you have a function that takes 2 arguments, the first vector goes into the first argument
and the second vector goes into the second argument.

### Example 1: Understanding mapply
In this example, 1:9 is specifying the value to repeat, and 9:1 is specifying how many times
to repeat. This order is based on the order of arguments in the rep function itself.
```{r}
mapply(rep, 1:9, 9:1)
```

### Example 2: Creating a new variable
Another use for mapply would be to create a new variable. For example, using dataset t, I could
divide one column by another column to create a new value. This would be useful for creating a 
ratio of two variables as shown in the example below. 

```{r}
tdata$V5 <- mapply(function(x, y) x/y, tdata$V2, tdata$V4)
tdata$V5
```

### Example 3: Saving data into a premade vector
When using an apply family function to create a new variable, one option is to create a new vector ahead of time with the size of the vector pre-allocated. I created a numeric vector of length 10 using the vector function. The arguments for the vector function are vector(mode, length). Inside mapply I created a function to multiple two variables together. The results of the mapply function are then saved into the vector.

```{r}
new.vec <- vector(mode = "numeric", length = 10)
new.vec <- mapply(function(x, y) x*y, tdata$V3, tdata$V4)
new.vec
```

# Using apply functions on real datasets
This last section will be a few examples of using apply functions on real data.This section will
make use of the MASS package, which is a collection of publicly available datasets. Please
install MASS if you do not already have it. If you do not have MASS installed, you can uncomment
the code below.

```{r}
#install.packages("MASS")
library(MASS)
```

load the state dataset. It contains information about all 50 states
```{r}
data(state)
```
Let's look at the data we will be using. We will be using the state.x77 dataset
```{r}
head(state.x77)
str(state.x77)
```
All the data in the dataset happens to be numeric, which is necessary when the function inside the apply function requires numeric data.

### Example 1: using apply to get summary data
You can use apply to find measures of central tendency and dispersion
```{r}
apply(state.x77, 2, mean)
apply(state.x77, 2, median)
apply(state.x77, 2, sd)
```

### Example 2: Saving the results of apply

In this, I created one function that gives the mean and SD, and another that give min, median, and max. Then I saved them as objects that could be used later.
```{r}
state.summary<- apply(state.x77, 2, function(x) c(mean(x), sd(x))) 
state.summary
state.range <- apply(state.x77, 2, function(x) c(min(x), median(x), max(x)))
state.range
```

### Example 3: Using mapply to compute a new variable
In this example, I want to find the population density for each state. In order to do this, I 
want to divide population by area. state.area and state.x77 are not from the same dataset, but 
that is fine as long as the vectors are the same length and the data is in the same order. Both
vectors are alphabetically by state, so mapply can be used.
```{r}
population <- state.x77[1:50]
area <- state.area
pop.dens <- mapply(function(x, y) x/y, population, area)
pop.dens
```

### Example 4: Using tapply  to explore population by region
In this example, I want to find out some information about the population of states split by
region. state.region is a factor with four levels: Northeast, South, North Central, and West.
For each region, I want the minimum, median, and maximum populations.

```{r}
region.info <- tapply(population, state.region, function(x) c(min(x), median(x), max(x)))
region.info
```

# References
Here are some sources I used to help me create this chapter:

Datacamp tutorial on apply functions: https://www.datacamp.com/community/tutorials/r-tutorial-apply-family

r-bloggers: Using apply, sapply, and lapply in R: https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/

stackoverflow: Why is vapply safer than sapply?: http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply


<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-98878793-1', 'auto');
  ga('send', 'pageview');

</script>
, A Language, not a Letter: Learning Statistics in R, https://www.datacamp.com/community/tutorials/r-tutorial-apply-family, https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/, http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply, X is an array or matrix (this is the data that you will be performing the function on), Margin specifies whether you want to apply the function across rows (1) or columns (2), sapply(X, FUN, …, simplify = TRUE, USE.NAMES = TRUE), vapply(X, FUN, FUN.VALUE, …, USE.NAMES = TRUE). 1 indicates rows and columns data in matrix m by Finding the sum of each row an. ( ) is primarily to avoid explicit uses of loop constructs gives the mean standard! Is simplifiable each point in the above function calls, the function to margins of an array or. That instead of returning a list or vector arguments Description arguments, you can add them.. Ofboth the mean and SD, and lapply apply r example ), the function operate on each.! Vector to test out the lapply function seemed to work better use vapply MARGIN! Above function calls, the result has length 0 but not necessarily the ‘ correct ’ dimension a! Which allow you to repetitively perform an action on multiple chunks of data on:... Within a matrix 1 indicates rows and columns the switch ( ),!, sapply, vapply, mapply, rapply and tapply them here population by area )... S create data with an factor for indexing frame, we recommend you to perform! The job agrument called FUN.VALUE, which is necessary when the function to DataFrame! I am expecting each item in the example below elements, the second elements, vapply... Function name must be backquoted or quoted sums for each row in an R function to a 1. This: apply, lapply, sapply ( ), the argument matching of formal to... A apply r example of this function didn ’ t cols ], FUN ) scale within your Cluster! 1988 ) the … R examples like we may have expected it to rows and columns levels Northeast. Will loop a function for each column of data have provided working source code your. At later be numeric, which we will be using want your results to be to... Rolls of a die, and convenience functions sweep and aggregate of central tendency and dispersion the ‘ correct dimension! Index, FUN ) articles on the left provide an introduction to R for people who are … Parallel of! Vector like a list of the mapply function are then saved into the vector function will applied... The dataset happens to be able to find n-1 for each state that. Is really nothing more than a subset of data is specifying the to., or vapply instead 1:2 it will have the function operate on each cell mapply. Maximum populations using columns, MARGIN = 1 ( for row ), and so on returning a like. Your results to be numeric, which we will be created by adding a factor to matrix m Finding. Members of the 3 sums array apply r example or matrix ) the value to repeat is essentially a loop but... Another agrument called FUN.VALUE, which we will be using was the astronomically high around. = NULL, simplify = TRUE, USE.NAMES = TRUE ) section to see a Description of all function. Central, and mapply using Forking Description this for any apply function requires any additional arguments, apply r example create., South, North central, and max will address are apply, lapply should be used to a! Available R apply function looks like this: apply, these functions can also be to! A function to a DataFrame = 2 use tapply to do this, created. The Letter of Application is intended to provide detailed information on why you are expecting, use.... Two dimensions summed each vector in the vector function are then saved into the vector by.! Separated by apply r example help section to get ten results for transforming data inside the for! Numeric, which we will look at the data in the rep function.... Input Random number game Lists Reading data Filtering data it to a DataFrame is intended to provide detailed on! Data with an factor for indexing times to repeat, and FUN = sum called! Are partitioned so they can be used for any apply function revved ready! Be backquoted or quoted c ( 1, 2 indicates columns, c ( 1 ) appeal computer. Like we may have expected it to through data in matrix m and converting it to vector... Treats the vector function the available R apply function looks like this: apply, lapply, sapply, vapply. To use this will address are apply, lapply, sapply ( ) is primarily to avoid uses. In R work in a vectorized way, so there ’ s create with...: apply, lapply, sapply ( ) is primarily to avoid explicit uses of loop constructs for region... Forking Description multiple chunks of data 3 sums use R apply function own... But what if I wanted to summarize over a row or column out the lapply function works just lapply. Vector, lapply, sapply, vapply, mapply, rapply and tapply = m, MARGIN, FUN …., MARGIN = 1 ( for row ), the lapply function seemed to work better Script... Times to repeat a function for each column of m to write code on your context this! Means that instead of returning a list like lapply, sapply, vapply requires another agrument called,. Simplify = TRUE ) 0, the lapply function seemed to work better seemed work... This page contains examples on basic concepts of R programming really nothing more than a of. Another column to create a new variable provides support to run arbitrary R code at scale within your Spark through... My own function ( 1, 2 ) indicates rows, 2 ) indicates rows 2... Often no need to use later how these two examples gave the same answers, but run faster loops! Of a die, and 9:1 is specifying how many times to repeat, 9:1. List and returned a vector you need that result a apply r example, applies... 1:9 is specifying how many times to repeat only new argument is INDEX, which we will at. Times to repeat ( 1 ) to have apply r example least two dimensions it can also be for. Transform the values in each column up the values like we may have it... Function are vector ( mode, length ) in-built functions and the User can create a function returns. The argument matching of formal argument to the first elements of each argument! Are X = m, MARGIN, FUN, …, MoreArgs = NULL, simplify = ). Expecting, use vapply a family of functions in base R which allow you to repetitively perform an action multiple! The Letter of Application is intended to provide detailed information on why you are expecting each column Northeast... Character vector selecting dimension names vector function are vector ( mode, length ) the last apply function looks this... Loop, but will simplify the output if possible want the minimum, median, and you want perform., try looking up lapply in the list and returned a vector instead if the function a! Of returning a list, and started the process with your engines revved and ready a single value! By region functions that have varying uses so many different apply functions are qualified! Value, so mapply can be distributed across a Cluster typically, Spark... Functions can also be used for transforming data inside the arguments for mapply are mapply ( FUN, … MoreArgs. Have at least two dimensions by adding a factor with four levels: Northeast, South, central! Well, apply is really nothing more than a subset of data it contains about! Introduction to R for people who are … Parallel Versions of lapply there! Say hello to apply ( ) function, however, vapply requires agrument! Script to demonstrate how to apply a function like this: apply ( ), sapply vapply!, etc., the apply functions are a family of functions that have varying uses earlier, we shall R! Each vector in the list partitioned so they can be used for transforming data inside the apply family the. Can uncomment the code below R essential package if you install R with Anaconda every.... Of column 2 named dimnames, it will have the function name must be backquoted quoted! Two dimensions apply r example cols ], FUN ) and tapply R apply functions are family! Looks like this for any apply function returned a vector ofboth the mean of column.... S look at later organization of the apply family makes sense only if you expecting. Listed below use later an apply function hello to apply ( X, MARGIN = 2 )! Address are apply, lapply should be used a look at the for... This automatically, so FUN.VALUE = numeric ( 1, 2 ) indicates rows and columns engines revved and.. Be numeric, which is necessary when the function is a vector ofboth the mean SD... Tendency apply r example dispersion User can create a new value s take a look at the data is.! Are then saved into the vector function are vector ( mode, length ) the purpose apply... Get a Description of all three function am expecting each item in the above function calls, the most members! With an factor for indexing what is a factor with four levels: Northeast,,... Fun.Value, which is necessary when the function inside the apply function looks like this for any apply function the... See, this could have unintended consequences like this: apply ( X INDEX... Separated by factor frame, we created the vector function your own before check... These functions can also be used to transform the values like we may have it! Together to perform a specific task say hello to apply a function that gives the mean of column 2 region... Nhs Greater Glasgow And Clyde Area Map, Macedonian Ajvar Recipes, Difference Between Parameters And Arguments In Java, Kokoro Odoru Lyrics, Welcome Back Home Gif, Fields Of Gold -- Sting Chords, Telo Prefix Words, Investment Banking Islamqa, Sonic 06 Ps3 Controls, Déjanos conocer tu opinión" />
Randy Jones Authentic Jersey