| model.frame {stats} | R Documentation |
model.frame (a generic function) and its methods return a
data.frame with the variables needed to use
formula and any ... arguments.
model.frame(formula, ...)
## Default S3 method:
model.frame(formula, data = NULL,
subset = NULL, na.action = na.fail,
drop.unused.levels = FALSE, xlev = NULL, ...)
## S3 method for class 'aovlist':
model.frame(formula, data = NULL, ...)
## S3 method for class 'glm':
model.frame(formula, ...)
## S3 method for class 'lm':
model.frame(formula, ...)
formula |
a model formula or terms
object or an R object. |
data |
a data.frame, list or environment (or object
coercible by as.data.frame to a data.frame),
containing the variables in formula. Neither a matrix nor an
array will be accepted. |
subset |
a specification of the rows to be used: defaults to all
rows. This can be any valid indexing vector (see
[.data.frame) for the rows of data or, if that is not
supplied, of a data frame made up of the variables used in formula. |
na.action |
how NAs are treated. The default is, first,
any na.action attribute of data; second,
the na.action setting of options; and third,
na.fail if that is unset. The “factory-fresh”
default is na.omit. Another possible value is NULL. |
drop.unused.levels |
should factors have unused levels dropped?
Defaults to FALSE. |
xlev |
a named list of character vectors giving the full set of levels to be assumed for each factor. |
... |
further arguments such as data, na.action,
subset. Any additional arguments such as offset and
weights which reach the default method are used to create
further columns in the model frame, with parenthesised names such as
"(offset)". |
Exactly what happens depends on the class and attributes of the object
formula. If this is an object of fitted-model class such as
"lm", the method will either return the saved model frame
used when fitting the model (if any, often selected by argument
model = TRUE) or pass the call used when fitting on to the
default method. The default method itself can cope with rather
standard model objects such as those of classes
"lqs" and "ppr" from
package MASS if no other arguments are supplied.
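The two routes for a fitted-model object can be sketched as follows, using the built-in cars data. The object names fit_saved and fit_unsaved are chosen only for this illustration.

```r
# Default lm() call keeps the model frame (model = TRUE).
fit_saved <- lm(dist ~ speed, data = cars)
# With model = FALSE no model frame is stored in the fit.
fit_unsaved <- lm(dist ~ speed, data = cars, model = FALSE)

# The lm method hands back the stored frame unchanged.
same_as_saved <- identical(model.frame(fit_saved), fit_saved$model)

# Here the method re-evaluates the fitting call via the default method.
mf_rebuilt <- model.frame(fit_unsaved)
dim(mf_rebuilt)
```

Rebuilding relies on the original data still being visible from the environment of the formula, so the second route can fail if cars (or its equivalent) has since been removed or changed.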
The rest of this section applies only to the default method.
If either formula or data is already a model frame (a
data frame with a "terms" attribute) and the other is missing,
the model frame is returned. Unless formula is a terms object,
terms is called on it. (If you wish to use the
keep.order argument of terms.formula, pass a terms
object rather than a formula.)
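A minimal sketch of both points above: an existing model frame is passed straight through, and a terms object built with keep.order = TRUE keeps its term order. The data frame d and its columns y, a and b are hypothetical toy values.

```r
# A model frame given as 'formula' (with 'data' missing) is returned as is.
mf <- model.frame(dist ~ speed, data = cars)
passed_through <- identical(model.frame(mf), mf)

# Hypothetical toy data for the keep.order illustration.
d <- data.frame(y = c(1, 3, 2, 5), a = c(0, 1, 0, 1), b = c(2, 2, 3, 3))

# Build the terms object first so that keep.order can be set.
tt <- terms(y ~ a:b + a + b, keep.order = TRUE)
mf_keep <- model.frame(tt, data = d)
labels_keep <- attr(attr(mf_keep, "terms"), "term.labels")

# For comparison: without keep.order, terms are sorted by interaction order.
labels_default <- attr(terms(y ~ a:b + a + b), "term.labels")
```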
Row names for the model frame are taken from the data argument
if present, then from the names of the response in the formula (or
rownames if it is a matrix), if there is one.
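The two sources of row names can be illustrated as below. The vectors y and x are made-up values; mtcars is the built-in data set.

```r
# Row names come from 'data' when it is supplied.
mf_data <- model.frame(mpg ~ wt, data = mtcars)
head(rownames(mf_data), 3)

# Without 'data', the names of the response are used (hypothetical values).
y <- c(first = 1.5, second = 2.5, third = 4.0)
x <- c(10, 20, 30)
mf_resp <- model.frame(y ~ x)
rownames(mf_resp)
```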
All the variables in formula, subset and in ...
are looked for first in data and then in the environment of
formula (see the help for formula() for further
details) and collected into a data frame. Then the subset
expression is evaluated, and it is used as a row index to the data
frame. Then the na.action function is applied to the data frame
(and may well add attributes). The levels of any factors in the data
frame are adjusted according to the drop.unused.levels and
xlev arguments.
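The sequence just described (collect variables, subset, apply na.action, adjust levels) can be traced on a small example. The data frames d and newd are hypothetical and exist only for this sketch.

```r
# Hypothetical data: one NA in the response, a three-level factor.
d <- data.frame(y = c(1, 2, NA, 4, 5, 6),
                g = factor(c("a", "a", "b", "b", "c", "c")),
                x = c(10, 20, 30, 40, 50, 60))

# subset keeps rows 2 to 4, na.omit then drops row 3,
# and the now-unused level "c" is dropped from g.
mf <- model.frame(y ~ g + x, data = d,
                  subset = x > 10 & g != "c",
                  na.action = na.omit,
                  drop.unused.levels = TRUE)
rownames(mf)
levels(mf$g)

# xlev imposes a full set of levels on a factor that shows only some of them.
newd <- data.frame(y = c(7, 8), g = factor(c("a", "b")))
mf_xlev <- model.frame(y ~ g, data = newd, xlev = list(g = c("a", "b", "c")))
levels(mf_xlev$g)
```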
Unless na.action = NULL, time-series attributes will be removed
from the variables found (since they will be wrong if NAs are
removed).
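A short sketch of the time-series behaviour, assuming a made-up annual series y and a plain numeric x:

```r
# Hypothetical annual series starting in 2001.
y <- ts(c(1.2, 2.3, 3.1, 4.8, 5.0), start = 2001)
x <- c(1, 2, 3, 4, 5)

# With an na.action in force, the time-series attributes are stripped.
mf_drop <- model.frame(y ~ x, na.action = na.omit)
is.ts(mf_drop$y)

# With na.action = NULL the variable is left as a time series.
mf_keep <- model.frame(y ~ x, na.action = NULL)
is.ts(mf_keep$y)
```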
Note that all the variables in the formula are included in the
data frame, even those preceded by -.
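For instance, with the built-in mtcars data, a variable removed from the model with - still appears as a column of the model frame, although it is not among the model terms:

```r
mf <- model.frame(mpg ~ wt - hp, data = mtcars)

# hp is carried along as a column ...
names(mf)

# ... but it is not one of the terms of the model.
term_labels <- attr(attr(mf, "terms"), "term.labels")
term_labels
```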
Only variables whose type is raw, logical, integer, real, complex or character can be included in a model frame: this includes classed variables such as factors (whose underlying type is integer), but excludes lists.
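The type restriction can be checked with a small sketch. The list bad and its components are hypothetical, and the error is caught rather than shown, since its exact wording may vary between R versions.

```r
# A list-valued variable is rejected (hypothetical input).
bad <- list(y = c(1, 2), l = list("a", 2))
res <- tryCatch(model.frame(y ~ l, data = bad), error = function(e) e)
inherits(res, "error")

# A factor is accepted: it is a classed variable of integer type.
ok <- model.frame(y ~ f,
                  data = data.frame(y = c(1, 2), f = factor(c("u", "v"))))
typeof(ok$f)
```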
A data.frame containing the variables used in
formula plus those specified in the ... arguments.
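As an illustration of the returned value, the sketch below supplies weights and offset through ... using the built-in cars data. The expressions 1 / speed and log(speed) are arbitrary choices made for this example.

```r
# Extra arguments reaching the default method become parenthesised columns.
mf <- model.frame(dist ~ speed, data = cars,
                  weights = 1 / speed, offset = log(speed))
names(mf)

# The extra columns hold the evaluated expressions, row for row.
all.equal(mf[["(weights)"]], 1 / cars$speed)
```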
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
model.matrix for the “design matrix”,
formula for formulas and
expand.model.frame for model.frame manipulation.
data.class(model.frame(dist ~ speed, data = cars))