Selecting groups in which one or more rows meet certain criteria
I am cleaning up data in R using the tidyverse package. I would like to select all groups in which one or more rows meet a certain criterion.
I have data that looks like the following:
library(tidyverse)
# tibble() replaces the deprecated data_frame()
dat <- tibble(
  group = rep(c("A", "B", "C"), 3),
  key = c(1, 1, 0, 0, 0, 0, 1, 0, 0),
  value = rnorm(n = 9, mean = 3, sd = 1)
)
#A tibble: 9 x 3
#Groups: group [3]
group key value
<chr> <dbl> <dbl>
1 A 1 3.97
2 B 1 2.05
3 C 0 3.28
4 A 0 4.22
5 B 0 2.67
6 C 0 5.02
7 A 1 2.60
8 B 0 3.99
9 C 0 4.42
For this example, I would like to select the groups in which one or more rows have a key equal to 1. Only groups A and B include rows whose key is 1. Hence, my expected result would be:
#A tibble: 6 x 3
#Groups: group [2]
group key value
<chr> <dbl> <dbl>
1 A 1 3.97
2 B 1 2.05
4 A 0 4.22
5 B 0 2.67
7 A 1 2.60
8 B 0 3.99
r dplyr
asked Dec 29 '18 at 6:15 by user8460166
edited Dec 29 '18 at 6:32 by Ronak Shah
3 Answers
A relatively simple solution is as follows:
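library(dplyr)
set.seed(12345)
# tibble() replaces the deprecated data_frame()
dat <- tibble(
  group = rep(c("A", "B", "C"), 3),
  key = c(1, 1, 0, 0, 0, 0, 1, 0, 0),
  value = rnorm(n = 9, mean = 3, sd = 1)
)
# Keep every row of any group that has at least one key equal to 1
dat %>%
  group_by(group) %>%
  filter(sum(key == 1) > 0)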
#> # A tibble: 6 x 3
#> # Groups: group [2]
#> group key value
#> <chr> <dbl> <dbl>
#> 1 A 1 3.59
#> 2 B 1 3.71
#> 3 A 0 2.55
#> 4 B 0 3.61
#> 5 A 1 3.63
#> 6 B 0 2.72
Once you have grouped by a variable, you can apply a filter. Remember that any function referring to a column inside filter() is evaluated only on that column's values within the current group, so sum(key == 1) > 0 is computed once per group and keeps or drops the whole group.
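To illustrate that per-group evaluation, the same condition can be written with any(), which states the intent ("at least one row matches") directly. This is a minimal sketch and not part of the original answer; it assumes the column names from the question, and the values are fixed rather than drawn with rnorm() so the result is reproducible.

```r
library(dplyr)

# Same shape as the question's data, with fixed values
dat <- tibble(
  group = rep(c("A", "B", "C"), 3),
  key   = c(1, 1, 0, 0, 0, 0, 1, 0, 0),
  value = c(3.97, 2.05, 3.28, 4.22, 2.67, 5.02, 2.60, 3.99, 4.42)
)

# Inside a grouped filter(), `key` is only the current group's vector,
# so any(key == 1) gives one TRUE/FALSE per group, recycled to its rows
res <- dat %>%
  group_by(group) %>%
  filter(any(key == 1)) %>%
  ungroup()

# The same condition summarised to one row per group
flags <- dat %>%
  group_by(group) %>%
  summarise(has_one = any(key == 1))
```

Here res holds the six rows of groups A and B, and flags shows which groups passed the test. The ungroup() call is optional and only drops the grouping once the filter is done.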
answered Dec 29 '18 at 6:21 by g_t_m
edited Dec 29 '18 at 6:24
Thank you so much, @g_t_m!! Super quick and that's exactly what I was looking for. I've never thought of using sum there. Thanks a lot. :)
– user8460166
Dec 29 '18 at 6:24
A base R option using ave would be
dat[with(dat, ave(key == 1, group, FUN = any)), ]
# group key value
# <chr> <dbl> <dbl>
#1 A 1. 0.875
#2 B 1. 2.61
#3 A 0. 3.30
#4 B 0. 1.40
#5 A 1. 4.52
#6 B 0. 3.34
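ave() applies FUN within each group and returns a vector as long as its input, which is why its result can index the rows directly. The same selection can also be spelled out in two steps, which may be easier to read. This is a sketch that is not part of the original answer; the helper name keep is hypothetical, and the data uses fixed values in place of rnorm().

```r
# Plain data.frame with the question's columns and fixed values
dat <- data.frame(
  group = rep(c("A", "B", "C"), 3),
  key   = c(1, 1, 0, 0, 0, 0, 1, 0, 0),
  value = c(3.97, 2.05, 3.28, 4.22, 2.67, 5.02, 2.60, 3.99, 4.42),
  stringsAsFactors = FALSE
)

# Step 1: groups that contain at least one row with key == 1
keep <- unique(dat$group[dat$key == 1])

# Step 2: every row belonging to one of those groups
res <- dat[dat$group %in% keep, ]
```

With this data, keep is c("A", "B") and res contains original rows 1, 2, 4, 5, 7 and 8.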
answered Dec 29 '18 at 6:27 by Ronak Shah
Here are some options.
1) using data.table
library(data.table)
setDT(dat)[dat[, .I[sum(key == 1) > 0], group]$V1]
# group key value
#1: A 1 3.97
#2: A 0 4.22
#3: A 1 2.60
#4: B 1 2.05
#5: B 0 2.67
#6: B 0 3.99
2) with base R
a) in a compact way with ave
dat[!!with(dat, ave(key, group, FUN = max)), ]
b) using table
subset(dat, group %in% names(which(!!table(dat[1:2])[,2])))
c) using rowsum
subset(dat, group %in% names(which((rowsum(key, group) > 0) [, 1])))
3) Using tidyverse
library(tidyverse)
dat %>%
group_by(group) %>%
filter(sum(key) > 0)
data
dat <- structure(list(group = c("A", "B", "C", "A", "B", "C", "A", "B",
"C"), key = c(1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), value = c(3.97,
2.05, 3.28, 4.22, 2.67, 5.02, 2.6, 3.99, 4.42)), class = "data.frame",
row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9"))
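One caveat about the variants above that test sum(key) > 0 or max(key): they match "at least one key equals 1" only because key is restricted to 0 and 1 here. The sketch below uses hypothetical data (dat2, with keys outside 0/1) to show where the two conditions diverge. It is an illustration under that assumption, not a correction of the answer for the question's data.

```r
library(dplyr)

# Hypothetical data: keys are no longer restricted to 0/1
dat2 <- tibble(
  group = c("A", "A", "B", "B", "C", "C"),
  key   = c(1, -1, 2, 0, 0, 0)
)

# Sum-based test: A sums to 0 (dropped despite having a 1),
# B sums to 2 (kept despite having no 1)
by_sum <- dat2 %>%
  group_by(group) %>%
  filter(sum(key) > 0) %>%
  ungroup()

# Equality-based test: keeps exactly the groups with a key equal to 1
by_eq <- dat2 %>%
  group_by(group) %>%
  filter(any(key == 1)) %>%
  ungroup()
```

For dat2, by_sum keeps only group B while by_eq keeps only group A, so the explicit comparison key == 1 is the safer choice when the key column can take other values.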
answered Dec 29 '18 at 9:11 by akrun
edited Dec 29 '18 at 9:35