library("nnet")
dummy_income <- class.ind(hairPre$income)
head(dummy_income)  # Show the first 6 rows
class(hairPre$income)
# No need to convert the variable with as.factor() first
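Because `hairPre` is not reproduced here, the following is a minimal sketch on a made-up vector of what `class.ind()` returns. The `income` values are hypothetical stand-ins for `hairPre$income`, not data from the original.

```r
library(nnet)

# Hypothetical income bands standing in for hairPre$income
income <- c("low", "mid", "high", "mid")

# class.ind() converts its argument to a factor internally,
# so a plain character vector works without as.factor()
ind <- class.ind(income)

# One indicator column per level; each row has exactly one 1
print(ind)
```

Each row of the result sums to 1, and the column names are the factor levels.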
library(dummies)
# Create dummies for a specific variable
# No need to convert the variable with as.factor() first
d1 <- hairPre
d1 <- cbind(d1, dummy(d1$income, sep = "."))
# dummy() and dummy.data.frame() call model.matrix() internally
class(d1$income)
names(d1)
# The variable must be converted with as.factor() first
hairPre$income <- as.factor(hairPre$income)
dum_income <- model.matrix(~income, hairPre)
class(hairPre$income)
head(cbind(dum_income, hairPre$income))
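Note that `model.matrix(~income, ...)` uses treatment coding by default, so it returns an intercept plus one dummy fewer than the number of levels. A small sketch on hypothetical data (the `toy` data frame is an assumption, standing in for `hairPre`) shows the difference between that and a full set of dummies:

```r
# Hypothetical data standing in for hairPre
toy <- data.frame(income = factor(c("low", "mid", "high", "mid")))

# Default treatment coding: intercept plus k - 1 dummies
# (the first level, "high", is the reference and gets no column)
mm_ref <- model.matrix(~ income, data = toy)

# Dropping the intercept gives one dummy per level
mm_full <- model.matrix(~ income - 1, data = toy)

colnames(mm_ref)   # "(Intercept)" "incomelow" "incomemid"
colnames(mm_full)  # "incomehigh" "incomelow" "incomemid"
```

Which form is preferable depends on the model: the reference-level form suits regression with an intercept, while the one-column-per-level form matches what `class.ind()` and `dummy()` produce.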
The caret package makes it easy to generate dummy variables.
Two approaches:
The function dummyVars can be used to generate a complete (less than full rank parameterized) set of dummy variables from one or more factors. The function takes a formula and a data set and outputs an object that can be used to create the dummy variables using the predict method.
For example, the etitanic data set in the earth package includes two factors: pclass (passenger class, with levels 1st, 2nd, 3rd) and sex (with levels female, male).
The base R function model.matrix would generate the following variables:
library(earth)
data(etitanic)
head(model.matrix(survived ~ ., data = etitanic))
Using dummyVars:
library(caret)
dummies <- dummyVars(survived ~ ., data = etitanic)
head(predict(dummies, newdata = etitanic))
Note there is no intercept and each factor has a dummy variable for each level, so this parameterization may not be useful for some model functions, such as lm.
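To see why a full set of dummies can be a problem for `lm`, here is a small sketch in base R. The data frame `toy` and its columns are hypothetical, and `model.matrix(~ grp - 1)` is used only to mimic a one-column-per-level encoding; it is not the output of `dummyVars` itself.

```r
set.seed(1)

# Hypothetical data: a three-level factor and a numeric response
toy <- data.frame(
  grp = factor(rep(c("a", "b", "c"), each = 4)),
  y   = rnorm(12)
)

# Full set of dummies, one column per level (no intercept column)
full <- model.matrix(~ grp - 1, data = toy)

# lm() adds its own intercept, so the columns are linearly dependent
# and one coefficient cannot be estimated (it is reported as NA)
fit_full <- lm(toy$y ~ full)
coef(fit_full)

# Suppressing the intercept makes the same columns usable
fit_noint <- lm(toy$y ~ full - 1)
coef(fit_noint)
```

As an alternative to removing the intercept, `dummyVars()` accepts `fullRank = TRUE`, which drops one level per factor so that the result can be combined with an intercept.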
