This question is related to this question, but not quite the same.
Say I have this data frame,
df <- data.frame(
id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))
and a named vector with human-readable labels for the profession codes. Say,
profession.code <- c(
Optometrists=1, Accountants=2, Veterinarians=3,
`Financial analysts`=4, Nurses=5)
Now, I'm looking for the easiest way to replace the values in df$profession with the text found in profession.code. Preferably without use of special libraries, unless it shortens the code significantly.
I would like my end result to be
df <- data.frame(
id = c(1:6),
profession = c("Optometrists", "Nurses",
"Financial analysts", NA, 0, "Nurses"))
Any help would be greatly appreciated.
Thanks, Eric
You can do it this way:
df <- data.frame(id = c(1:6),
profession = c(1, 5, 4, NA, 0, 5))
profession.code <- c(`0` = 0, Optometrists=1, Accountants=2, Veterinarians=3,
`Financial analysts`=4, Nurses=5)
df$profession.str <- names(profession.code)[match(df$profession, profession.code)]
df
#   id profession     profession.str
# 1  1          1       Optometrists
# 2  2          5             Nurses
# 3  3          4 Financial analysts
# 4  4         NA               <NA>
# 5  5          0                  0
# 6  6          5             Nurses
Note that I had to add a 0 entry in your profession.code vector to account for those zeroes.
EDIT: here is an updated solution to account for Eric's comment below that the data may contain any number of profession codes for which there are no corresponding descriptions:
match.idx <- match(df$profession, profession.code)
df$profession.str <- ifelse(is.na(match.idx),
df$profession,
names(profession.code)[match.idx])
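To see the updated approach handle codes that have no description, here is a minimal sketch that wraps it in a small helper. The name decode_codes and the extra code 7 are made up for illustration and are not part of the original answer; the as.character() call is only there to make the fallback type explicit.

```r
# Hypothetical helper: map codes to labels, keeping the original code
# (as text) wherever no label exists in the lookup vector.
decode_codes <- function(x, codes) {
  idx <- match(x, codes)
  ifelse(is.na(idx), as.character(x), names(codes)[idx])
}

# Lookup vector from the question (no entry for 0 is needed here).
profession.code <- c(Optometrists = 1, Accountants = 2, Veterinarians = 3,
                     `Financial analysts` = 4, Nurses = 5)

# Same data as the question, plus an assumed unmapped code 7.
df <- data.frame(id = 1:7, profession = c(1, 5, 4, NA, 0, 5, 7))
df$profession.str <- decode_codes(df$profession, profession.code)
print(df)
```

Because the result mixes labels and leftover codes, the new column is a character vector, so 0 and 7 come back as the strings "0" and "7".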
I played around with it and this is my current solution using the car package.
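library(car)
# Build one "code='label';" term per entry; paste0 avoids stray spaces inside the labels.
pLoop <- function(v) paste0(profession.code[v], "='", names(profession.code)[v], "';")
# Join the terms into a single recode specification and apply it.
df$profession <- recode(df$profession, paste(sapply(seq_along(profession.code), pLoop), collapse = ""))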
df
# id profession
# 1 Optometrists
# 2 Nurses
# 3 Financial analysts
# 4 <NA>
# 5 0
# 6 Nurses
I'm still interested to hear if anyone has other suggestions for a solution. I would prefer to do it using only base R functions.
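One base-only possibility is to invert the lookup vector and index it by code. This is only a sketch under the assumption that the codes convert cleanly to character keys (as the whole numbers here do); the names code.label and labels are made up for illustration.

```r
profession.code <- c(Optometrists = 1, Accountants = 2, Veterinarians = 3,
                     `Financial analysts` = 4, Nurses = 5)
df <- data.frame(id = 1:6, profession = c(1, 5, 4, NA, 0, 5))

# Invert the lookup: the codes become the names, the labels become the values.
code.label <- setNames(names(profession.code), profession.code)

# Index by the code as a character key; unknown codes and NA both yield NA.
labels <- unname(code.label[as.character(df$profession)])

# Fall back to the original code (as text) where no label was found.
df$profession <- ifelse(is.na(labels), as.character(df$profession), labels)
print(df)
```

This needs no packages, but like any approach that mixes labels with leftover codes it turns the column into character.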
I personally like the way the arules package deals with this problem, using the decode function. From the documentation:
library(arules)
data("Adult")
## Example 1: Manual decoding
## get code
iLabels <- itemLabels(Adult)
head(iLabels)
## get undecoded list and decode in a second step
list <- LIST(Adult[1:5], decode = FALSE)
list
decode(list, itemLabels = iLabels)
An advantage is that the package also offers the functions encode and recode. Their respective purposes are straightforward, I believe.