I am working on a project to find the number of times a service has to be re-worked but only within the first 30 days of the end of the original service. Using the sample datasets below:
work <- tribble(
~Client , ~Initial_Date_In, ~Initial_Date_Out,
'Car1', '2019-01-01', '2019-01-02',
'Car1', '2019-06-01', '2019-06-03',
'Car2', '2019-01-01', '2019-01-02',
'Truck1', '2019-06-01', '2019-06-02',
'Truck2', '2019-04-01', '2019-04-02',
'Truck3', '2019-08-01', '2019-08-04',
'Van1', '2019-01-01', '2019-01-02',
'Van2', '2019-06-01', '2019-06-02',
'Truck4', '2019-10-05', '2019-10-09',
)
rework <- tribble(
~Client , ~Date_In, ~Date_Out,
'Car1', '2019-01-05', '2019-01-07',
'Car1', '2019-01-09', '2019-01-10',
'Car1', '2019-06-09', '2019-06-10',
'Truck3', '2019-08-07', '2019-08-08',
'Van1', '2019-03-01', '2019-03-01',
'Van2', '2019-06-09', '2019-06-11',
'Truck4', '2019-11-01', '2019-11-02',
)
work$Initial_Date_In <- lubridate::as_date(work$Initial_Date_In)
work$Initial_Date_Out <- lubridate::as_date(work$Initial_Date_Out)
rework$Date_In <- lubridate::as_date(rework$Date_In)
rework$Date_Out <- lubridate::as_date(rework$Date_Out)
I tried multiple things, such as :
work %>%
left_join(select(rework, Client,Date_Out), by = ("Client" = "Client")) %>%
group_by(Client) %>%
filter(max(Date_Out) >Initial_Date_Out & max(Date_Out) < (Initial_Date_Out+duration(30,"days")))
This code seems to pull the max Initial_Date_In(Out), but I want all from the work table where there is an appropriate entry in rework (displaying on the max Date_Out from the rework table)
The expected output would be something like:
Client Initial_Date_In Initial_Date_Out Date_Out #_Of_Rework_Visits
Car1 2019-01-01 2019-01-02 2019-01-10 2
Car1 2019-06-01 2019-06-03 2019-06-10 1
Truck3 2019-08-01 2019-08-04 2019-08-08 1
Van1 2019-06-01 2019-06-02 2019-06-11 1
Truck4 2019-10-05 2019-10-09 2019-11-02 1
Any tips on what I am doing wrong?