The goal of this step is to make sure that 1) every row has the same number of fields and 2) the fields are in the right order. In read.table, lines that contain less fields than the maximum number of fields detected are appended with NA. One advantage of the do-it-yourself approach shown here is that we do not have to make this assumption. The easiest way to standardize rows is to write a function that takes a single character vector as input and assigns the values in the right order.

assignFields <- function(x){
out <- character(3)
# get names
i <- grepl("[[:alpha:]]",x)
out[1] <- x[i]
# get birth date (if any)
i <- which(as.numeric(x) < 1890)
out[2] <- ifelse(length(i)>0, x[i], NA)
# get death date (if any)
i <- which(as.numeric(x) > 1890)
out[3] <- ifelse(length(i)>0, x[i], NA)
out
}