Formal Arguments
Jake Elmstedt
05/02/2020
What is an “argument?”
In general, you will hear people talk about function arguments in two different contexts:
1. When you are using a function with a function call.
2. When you are defining a function with a function declaration.
These are two very different things, though, so I will try to refer to them by their more proper terms here.
The arguments in a function call would more properly be called the actual arguments as they contain
actual values rather than simply being placeholders.
The arguments in a function declaration are more properly called parameters or formal arguments. R is largely focused on statistical computing and, possibly because statisticians strongly associate the term parameter with another concept, the convention among most R programmers is to use formal arguments. That’s what I’ll use here.
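As a quick illustration of the distinction, here is a minimal sketch. The function scale_by() and its argument names are made up for this example; formals() is the base R function that returns a function's formal arguments.

```r
# Function declaration: 'x' and 'factor' are the formal arguments.
# (scale_by is a hypothetical function used only for illustration.)
scale_by <- function(x, factor = 2) {
  x * factor
}

# formals() returns the formal arguments as a named list,
# so its names are "x" and "factor".
names(formals(scale_by))

# Function call: 10 and 3 are the actual arguments.
scale_by(10, factor = 3)
## [1] 30
```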
Conventions
When you are writing functions for this class, you are often given precisely what the formal arguments must be. You won’t always have this luxury, so here are some suggested conventions to guide you in the future.
Naming
• For data objects the convention in R is generally to keep it short and simple. If your function takes in a single data object, x should be your first choice almost every time and you should have very good reasons before you choose something else. If you have additional data objects, y, z, a, b, etc. make great choices.
• In mathematics and statistics, it is common to refer to matrices with capital letters, X, Y, etc., and it is acceptable to do that in your programming, though certainly not required. x and y work just fine and are more consistent with R style.
• Be consistent with R style even if that means being inconsistent with R style... So, if you need a formal argument to indicate whether or not you should remove NA values from the data before processing, it should be named na.rm, not na_rm, because na.rm is the name base R functions such as mean() already use.
• Keep the names of things short and meaningful. Avoid trying to be funny or cute with your names.
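To make these conventions concrete, here is a small sketch of a function that follows them. The function center() is hypothetical and exists only for this example; it takes a single data object named x and an na.rm flag spelled the way base R spells it.

```r
# Hypothetical helper: subtract the mean from every element of 'x'.
# 'x' is the single data object; 'na.rm' follows the base R spelling
# and is passed straight through to mean().
center <- function(x, na.rm = FALSE) {
  x - mean(x, na.rm = na.rm)
}

center(c(1, 2, 3))
## [1] -1  0  1

# With na.rm = TRUE the mean ignores the NA, but the NA itself
# is still present in the result.
center(c(1, NA, 3), na.rm = TRUE)
## [1] -1 NA  1
```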
Order
R matches actual arguments to formal arguments by, in order:
1. Perfect name matching
2. Partial name matching
3. Position matching
So the order of your formal arguments matters.
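The three matching rules can be sketched with a single function. The function round_value() and its argument names are hypothetical, chosen so that the partial name dig can only match one formal argument.

```r
# Hypothetical function with one required and two optional arguments.
round_value <- function(value, digits = 0, verbose = FALSE) {
  result <- round(value, digits)
  if (verbose) message("Rounded to ", digits, " digits")
  result
}

# 1. Perfect name matching: the order of the actual arguments is irrelevant.
round_value(digits = 2, value = 3.14159)
## [1] 3.14

# 2. Partial name matching: 'dig' uniquely identifies 'digits'.
round_value(3.14159, dig = 2)
## [1] 3.14

# 3. Position matching: unnamed arguments fill the remaining
#    formal arguments from left to right.
round_value(3.14159, 2)
## [1] 3.14
```

All three calls give the same result, but only the last one depends on the order in which the formal arguments were declared.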
In general, you want to list the most important arguments first and the least important arguments last. In particular, you should never put a required argument after an optional argument.
Optional Arguments
When a user calls a function they don’t necessarily need to provide an actual argument for every formal argument in the function declaration. When a function is written in a way that allows for this, those formal arguments are called optional arguments.
There are two ways to make an argument optional; the first is much more common.
Optional Arguments with Defaults
You can, in the function declaration, provide a formal argument with a default value or expression it should take if the user elects not to provide an actual argument. A simple example of this is,
add_xy <- function(x, y = 0) {
  x + y
}
In the function add_xy(), y is an optional argument because the function will still work if the user doesn’t provide a value to use for y. The function will throw an error though if an actual argument for x is not provided, so the formal argument x is a required argument.
This works very well and should be the way you make optional arguments in almost all cases.
But, what if you don’t want, have, or know of an appropriate default value to use but you still want to make the formal argument optional? Or, what if you want to warn the user a default value is being used with a message() call?
Think for a few moments how you could change add_xy() to warn the user when you’ve used the default value of y = 0 because they didn’t provide an actual argument for y.
Optional Arguments without Defaults
The Problem
Some people try to get around this by using NULL or NA or something similar as a default value like this,
add_xy <- function(x, y = NULL) {
  if (is.null(y)) {
    warning("No value provided for 'y', using 'y = 0'")
    y <- 0
  }
  x + y
}
and this is often “good enough,” though it has a fundamental flaw. You can never know if the user passed in a NULL value or if they didn’t and you are using the default.
add_xy(1)
## Warning in add_xy(1): No value provided for 'y', using 'y = 0'
## [1] 1
add_xy(1, NULL)
## Warning in add_xy(1, NULL): No value provided for 'y', using 'y = 0'
## [1] 1
There is no way for the function body to know if the user passed the actual argument or not. In the second case, the warning you give the user is misleading: they did provide a value for y, it just happened to be NULL¹. What’s worse, you’ll end up giving the user an incorrect answer!
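To see why returning 1 is the wrong answer in that case, consider what R itself does when NULL takes part in arithmetic. This is standard R behavior rather than anything specific to add_xy():

```r
# Adding NULL to a number produces a zero-length numeric vector.
result <- 1 + NULL
result
## numeric(0)

length(result)
## [1] 0
```

So a user who really does pass NULL for y should get numeric(0) back, not 1.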
So how do we solve it?
The Solution
In R, we have the predicate function missing(), which returns TRUE if no actual argument was provided for a formal argument.
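Here is a minimal sketch of how missing() behaves on its own. The function report_y() is hypothetical; note that missing() reports whether the caller supplied the argument, even when the formal argument also has a default value.

```r
# Hypothetical function: reports whether the caller supplied 'y'.
report_y <- function(x, y = 0) {
  if (missing(y)) "y not supplied" else "y supplied"
}

report_y(1)
## [1] "y not supplied"

# An explicit NULL still counts as a supplied argument.
report_y(1, NULL)
## [1] "y supplied"

# So does a value that happens to equal the default.
report_y(1, 0)
## [1] "y supplied"
```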
Revisiting our function, we could write,
add_xy <- function(x, y) {
  if (missing(y)) {
    warning("No value provided for 'y', using 'y = 0'")
    y <- 0
  }
  x + y
}
which takes care of everything very neatly,
add_xy(1)
## Warning in add_xy(1): No value provided for 'y', using 'y = 0'
## [1] 1
add_xy(1, NULL)
## numeric(0)
¹ This is much more likely to occur when you are using computed values for actual arguments, for instance if, in add_xy(1, f(32)), the function call f(32) returns NULL.