I haven't found a way to do this in parallel so far either. However, for my dataset (a list of about 1,500 data frames totaling 4.5M rows), the following snippet seemed to help:
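    # Repeatedly bind neighbouring pairs until a single data frame remains.
    while (length(lst) > 1) {
      # Indices of the first element of each pair: 1, 3, 5, ...
      idxlst <- seq(from = 1, to = length(lst), by = 2)
      lst <- lapply(idxlst, function(i) {
        # An odd-length list leaves its last element without a partner.
        if (i == length(lst)) { return(lst[[i]]) }
        return(rbind(lst[[i]], lst[[i + 1]]))
      })
    }
    # lst is now a list of length 1; the combined data frame is lst[[1]].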
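where lst is the list. It seemed to be about 4 times faster than using do.call(rbind, lst) or even do.call(rbind.fill, lst) (with rbind.fill from the plyr package). Each iteration halves the number of data frames by binding neighbouring pairs, so the loop finishes after about log2(n) passes (11 passes for 1,500 data frames). When the loop ends, lst has length 1 and the combined data frame is lst[[1]].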
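To check that the pairwise approach gives the same rows as do.call(rbind, lst), here is a minimal sketch that wraps the loop in a function and compares the two results on a tiny made-up list. The function name rbind_pairwise and the example data are assumptions for illustration only, not part of the original snippet.

```r
# Hypothetical helper: the pairwise loop from above wrapped as a function.
rbind_pairwise <- function(lst) {
  if (length(lst) == 0) return(NULL)  # nothing to bind
  while (length(lst) > 1) {
    idx <- seq(from = 1, to = length(lst), by = 2)
    lst <- lapply(idx, function(i) {
      # The last element of an odd-length list is carried over unchanged.
      if (i == length(lst)) return(lst[[i]])
      rbind(lst[[i]], lst[[i + 1]])
    })
  }
  lst[[1]]
}

# Made-up example data: 5 data frames with 2 rows each.
dfs <- lapply(1:5, function(k) data.frame(id = k, value = c(k, k * 10)))

combined  <- rbind_pairwise(dfs)
reference <- do.call(rbind, dfs)

# Both approaches should yield the same rows in the same order.
all.equal(combined$value, reference$value)
```

Because neighbours are always bound left to right, the row order matches that of do.call(rbind, lst). Whether the speedup carries over to other data will depend on the number and size of the data frames, so it is worth timing both versions with system.time() on your own list.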