I'm importing a csv file that consists of a crosstab with column names in a two-row hierarchy. When I get the table in R, the result looks like this:
   alpha  X.1  X.2 beta  X.1  X.2  X.3 gamma  X.1
    var1 var2 var3 var1 var2 var3 var4  var1 var4
1     21   50    5   22   48    6    8    25    8 
2     27   50    5   24   48    6    8    33    8 
3     26   50    5   28   48    6    8    33    8 
4     25   50    5   28   48    6    8    20    8
Here, alpha, beta, and gamma are all one level of the hierarchy, while var1, var2, var3, and var4 are the second level.
What I would like it do is get output like the following, where the row names are concatenated but also keeping in mind the structure of the data.
   alpha_var1  alpha_var2  alpha_var3 beta_var1  beta_var2  beta_var3  beta_var4 gamma_var1  gamma_var4
1          21          50           5        22         48          6          8         25           8 
2          27          50           5        24         48          6          8         33           8 
3          26          50           5        28         48          6          8         33           8 
4          25          50           5        28         48          6          8         20           8
Any ideas here? Haven't been able to find anything to deal with this issue. Thanks in advance.
read.table() taking advantage of the skip and nrows parameters - Chase 2012-04-03 23:28
This seems to work, though the use of xts seems a bit heavy handed for the na.locf() function, but I know it works and use it frequently, so that's what I used.
library(xts)
#Read in data without headers
x <- read.delim("Book1.txt", skip = 2, header = FALSE)
#Read in header files transposing them into columns
headers <- data.frame(t(read.delim("Book1.txt", nrows = 2, header = FALSE)), stringsAsFactors = FALSE)
#Create a now column with the value of alpha, beta, gama or NA
headers$vals <- with(headers, ifelse(grepl("[abg]", X1), X1, NA))
#Fill down the values above
headers$vals <- na.locf(headers$vals)
#Paste column names together
colnames(x) <- with(headers, paste(vals, X2, sep = "_"))
#Resulting object
x
 alpha_var1 alpha_var2 alpha_var3 beta_var1 beta_var2 beta_var3 beta_var4 gamma_ var1 gamma_var4
1         21         50          5        22        48         6         8          25          8
2         27         50          5        24        48         6         8          33          8
3         26         50          5        28        48         6         8          33          8
4         25         50          5        28        48         6         8          20          8
!(grepl("X\\.\\d", x)). It's finding the pattern "X." followed by a digit...then taking the negative of that - Chase 2012-04-04 01:32
X.1, etc are auto-generated when you import empty header cells from a csv. If you say header=FALSE, they appear as <NA>. Another issue is that if any of the columns are <NA> in both the first and second level, this fails - user1202761 2012-04-04 01:42
dput() to represent your R objects in the futre). Transpose the two rows of data, use the na.locf() function I showed you, paste them together, and stick it on top of the other data that you read in with the skip=... parameter. All of the pieces you need to solve this are here, just need to connect the dots now - Chase 2012-04-04 01:45
dput() in the future. As I should have done earlier, the help for the na.locf() function solved my problem of NA observations (set na.rm=FALSE of course). Dots connected, problems solved, thank you again - user1202761 2012-04-05 01:37