I'm learning how to analyze data obtained through web scraping. At the moment I get an error when I scrape the 2020 season from the website used in the code below.
If I scrape the 2019 season instead, everything works.
The error I get is: Error in names(x) <- value : 'names' attribute [27] must be the same length as the vector [20]
What does it mean, and how can I fix this code so that I can create a data frame?
Load the data
# Load the packages used below
library(rvest)   # read_html(), html_node(), html_table(); also provides %>%
library(dplyr)   # filter(), as_tibble()
library(tidyr)   # gather()
# Import/ingest the Formula 1 race results for season 2020 ----------------
# Take a look at the data in the browser
browseURL('https://www.formel1.de/rennergebnisse/wm-stand/2020/fahrerwertung')
# Fetch the contents of the HTML-table into the variable f1
f1 <- read_html('https://www.formel1.de/rennergebnisse/wm-stand/2020/fahrerwertung') %>%
  html_node('table') %>%
  html_table()
# Display our data
f1
This part works fine.
Transform the data
# Transform & tidy the data -----------------------------------------------
# Add missing column headers
colnames(f1) <- c('Pos', 'Driver', 'Total', sprintf('R%02d', 1:24))
# Convert to a tibble and keep the top 10 drivers
f1 <- as_tibble(f1) %>%
  filter(as.integer(Pos) <= 10)
# Make Driver a factor, replace all '-' with zeros, convert to long format
f1$Driver <- as.factor(f1$Driver)
f1[, -2] <- apply(f1[, -2], 2, function(x) as.integer(gsub('-', '0', as.character(x))))
f1long <- gather(f1, Race, Points, R01:R21)
# That looks better
f1long
The error is: Error in names(x) <- value : 'names' attribute [27] must be the same length as the vector [20]
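One possible reading of the message, as a minimal sketch: the two numbers suggest that 27 column names (3 fixed names plus 24 race names) are being assigned to a table that has only 20 columns. The snippet below reproduces that mismatch without touching the website. `f1_demo` is a hypothetical stand-in for the scraped table, and the layout (three leading columns followed by one column per race) is an assumption, not something confirmed from the page.

```r
# Hypothetical stand-in for the scraped table: 20 columns, as the error reports
f1_demo <- as.data.frame(matrix(0L, nrow = 3, ncol = 20))

# 3 fixed names + 24 race names = 27 names for 20 columns, so this fails
too_many <- c('Pos', 'Driver', 'Total', sprintf('R%02d', 1:24))
result <- tryCatch(
  { colnames(f1_demo) <- too_many; 'ok' },
  error = function(e) conditionMessage(e)
)
print(result)

# Derive the number of race columns from the table itself
# (assumes exactly three non-race columns come first)
n_races <- ncol(f1_demo) - 3
colnames(f1_demo) <- c('Pos', 'Driver', 'Total',
                       sprintf('R%02d', seq_len(n_races)))

# Collect the race column names so later steps need no hard-coded range
race_cols <- grep('^R[0-9]+$', colnames(f1_demo), value = TRUE)
print(race_cols)
```

If this reading is right, the hard-coded range `R01:R21` in the `gather()` call would need to follow the same count, for example by using the first and last entries of `race_cols`.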
Sources:
2020: https://www.formel1.de/rennergebnisse/wm-stand/2020/fahrerwertung
2019: https://www.formel1.de/rennergebnisse/wm-stand/2019/fahrerwertung