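I have a data frame in which each row represents a single event that occurred in a city. The data frame contains the city name and the date of the event, as follows: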
df <- data.frame(city = c("Seattle", "Seattle", "Seattle", "Seattle", "Seattle", "NYC", "NYC", "NYC", "Chicago",
"Chicago", "Chicago", "Chicago", "Chicago"),
date_of_event = c("01/13/2011", "01/17/2011", "03/15/2011", "05/21/2011", "05/23/2011",
"01/20/2011", "01/22/2011", "03/23/2011", "01/18/2011", "02/24/2011",
"02/26/2011", "04/30/2011", "06/18/2011"),
stringsAsFactors = FALSE)
df$date_of_event <- as.Date(df$date_of_event, "%m/%d/%Y")
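The above is only an example; my real data is in a CSV with thousands of rows, many cities, many dates, and so on. What I want to do is generate a new data frame that has one row for each city and each month/year represented in the data set, with a corresponding count column showing how many events occurred in each city in each month in the original data frame. The second data frame would look like this: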
df2 <- data.frame(city = c("Seattle", "Seattle", "Seattle", "Seattle", "Seattle", "Seattle", "NYC", "NYC", "NYC", "NYC",
"NYC", "NYC", "Chicago", "Chicago", "Chicago", "Chicago", "Chicago", "Chicago"),
month_year = c("01/01/2011", "02/01/2011", "03/01/2011", "04/01/2011", "05/01/2011", "06/01/2011",
"01/01/2011", "02/01/2011", "03/01/2011", "04/01/2011", "05/01/2011", "06/01/2011",
"01/01/2011", "02/01/2011", "03/01/2011", "04/01/2011", "05/01/2011", "06/01/2011"),
count = c(2, 0, 1, 0, 2, 0, 2, 0, 1, 0, 0, 0, 1, 2, 0, 1, 0, 1),
stringsAsFactors = FALSE)
df2$month_year <- as.Date(df2$month_year, "%m/%d/%Y")
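I know you can use count from dplyr, and that dates can be rounded down to the first day of each month, but I tried and failed to group and count correctly to produce the second data frame I want. Can anyone help me? Thanks very much in advance.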
Latest reply
You can try the following:
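The code that originally accompanied this step is not preserved here, so what follows is only a minimal sketch of one way to do it, assuming the dplyr and lubridate packages are installed. It rounds each date down to the first day of its month with `floor_date()` and then counts rows per city and month. The result name `counts` is chosen for illustration.

```r
library(dplyr)
library(lubridate)

# Example data from the question, rebuilt so this block runs on its own
df <- data.frame(
  city = rep(c("Seattle", "NYC", "Chicago"), times = c(5, 3, 5)),
  date_of_event = c("01/13/2011", "01/17/2011", "03/15/2011", "05/21/2011", "05/23/2011",
                    "01/20/2011", "01/22/2011", "03/23/2011", "01/18/2011", "02/24/2011",
                    "02/26/2011", "04/30/2011", "06/18/2011"),
  stringsAsFactors = FALSE
)
df$date_of_event <- as.Date(df$date_of_event, "%m/%d/%Y")

counts <- df %>%
  # Round every date down to the first day of its month
  mutate(month_year = floor_date(date_of_event, unit = "month")) %>%
  # One row per observed city/month pair, with the number of events
  count(city, month_year, name = "count")

print(counts)
```

This only returns city/month combinations that actually occur in the data, so months with no events are missing rather than shown as zero.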
To cover the full period, including months with a count of zero, you can try the following:
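The original code for this step is also missing, so this is a hedged sketch rather than the answerer's exact method. It assumes dplyr, lubridate, and tidyr are installed, and uses `tidyr::complete()` to add the city/month combinations that have no events, filling their counts with zero. The names `monthly`, `all_months`, and `counts_full` are introduced for illustration.

```r
library(dplyr)
library(lubridate)
library(tidyr)

# Example data from the question, rebuilt so this block runs on its own
df <- data.frame(
  city = rep(c("Seattle", "NYC", "Chicago"), times = c(5, 3, 5)),
  date_of_event = c("01/13/2011", "01/17/2011", "03/15/2011", "05/21/2011", "05/23/2011",
                    "01/20/2011", "01/22/2011", "03/23/2011", "01/18/2011", "02/24/2011",
                    "02/26/2011", "04/30/2011", "06/18/2011"),
  stringsAsFactors = FALSE
)
df$date_of_event <- as.Date(df$date_of_event, "%m/%d/%Y")

# Round every date down to the first day of its month
monthly <- df %>%
  mutate(month_year = floor_date(date_of_event, unit = "month"))

# Every month between the earliest and latest month in the data
all_months <- seq(min(monthly$month_year), max(monthly$month_year), by = "month")

counts_full <- monthly %>%
  count(city, month_year, name = "count") %>%
  # Add the missing city/month pairs and set their count to zero
  complete(city, month_year = all_months, fill = list(count = 0L))

print(counts_full, n = Inf)
```

With the example data this gives 18 rows (3 cities by 6 months), matching the shape of df2 in the question, although the cities come out sorted alphabetically rather than in the order shown there.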