
How can I replace values for each week with the value of a single day?

I'm trying to look at how changing sampling frequency affects results for my dataset. So I'm trying to take data collected every day and make it weekly.

I'm trying to create a new column so that every day has both its daily value for y and a weekly value for y. I have the week as a numeric column and the day of the week extracted into its own column. I also have a date if that's more useful.

data <- data.frame(Week = c(21,21,21,22,22,22), Day = c("Monday","Tuesday","Friday","Monday","Tuesday","Friday"), x = c(5,7,8,6,5,4), y=c(3,2,4,6,8,2))

Week	Day	x	y
21	Monday	5	3
21	Tuesday	7	2
21	Friday	8	4
22	Monday	6	6
22	Tuesday	5	8
22	Friday	4	2

I can select a random weekday fine:

weekdays <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")
randay = sample(weekdays,1)

I've tried to use if_else to replace but I don't think I'm selecting the day properly:

data$y_week <- if_else(test = data$day == randay , yes = data$y, no = data$y_week)

Ideally I'd like to get something like the table below, where the day of the week is selected randomly for each week individually (I think I'd need to add some pipes for that, but I'm struggling just to replace the values right now). Let's say the random day for week 21 was Monday, and for week 22 it was Tuesday:

Week	Day	x	y	y_week
21	Monday	5	3	3
21	Tuesday	7	2	3
21	Friday	8	4	3
22	Monday	6	6	8
22	Tuesday	5	8	8
22	Friday	4	2	8

I'd like to keep x in the data frame because later I'm going to multiply each daily value of x by y, but I think I could join a dataframe with separate week values back to the original if I was able to get that.

Thanks for the help!

Solution:

If I understand the problem correctly, this is easily accomplished using the dplyr package, as follows:

First, use group_by() to group the dataset by week, so that the computations are done week by week.
Create an additional column using mutate() in which you randomly select one day of each week; this is an intermediate helper variable.
Use mutate() again to obtain the y value that corresponds to the random day you selected, and assign it to the new column y_week.
Finally, you can remove the helper variable random_day from the data frame if you don't need it anymore.

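library(dplyr)

set.seed(999) # set seed for reproducibility

data <- data %>%
  group_by(Week) %>%                         # work week by week
  mutate(random_day = sample(Day, 1)) %>%    # pick one of the observed days in each week
  mutate(y_week = y[Day == random_day]) %>%  # y on that day (assumes each Day occurs once per week)
  select(-random_day) %>%                    # drop the helper column
  ungroup()                                  # remove the grouping for later steps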
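
The question also mentions joining a separate data frame of weekly values back onto the original. That route could be sketched roughly as below. This is only an illustration, not a replacement for the code above: the names weekly, result and sampled_day are made up for this example, and it assumes dplyr 1.0.0 or later for slice_sample().

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  Week = c(21, 21, 21, 22, 22, 22),
  Day  = c("Monday", "Tuesday", "Friday", "Monday", "Tuesday", "Friday"),
  x    = c(5, 7, 8, 6, 5, 4),
  y    = c(3, 2, 4, 6, 8, 2)
)

set.seed(999) # set seed for reproducibility

# One randomly chosen row per week, kept as a small lookup table
weekly <- data %>%
  group_by(Week) %>%
  slice_sample(n = 1) %>%
  ungroup() %>%
  select(Week, sampled_day = Day, y_week = y)

# Attach the weekly value to every daily row of the matching week
result <- left_join(data, weekly, by = "Week")
```

Because a whole row is sampled rather than a day name, this version does not depend on each day appearing only once per week, and the sampled_day column keeps a record of which day was drawn. The x column is untouched, so x * y_week can still be computed afterwards.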

Hope this helps!