Drop rows in data frame only if values in two columns are reversed and all other values identical
I am working with the iris dataset and reshaping it as follows to get a data frame of species, feature1, feature2 and value:
library(dplyr)
library(tidyr)

gatherpairs <- function(data, ...,
                        xkey = '.xkey', xvalue = '.xvalue',
                        ykey = '.ykey', yvalue = '.yvalue',
                        na.rm = FALSE, convert = FALSE, factor_key = FALSE) {
  vars <- quos(...)
  xkey <- enquo(xkey)
  xvalue <- enquo(xvalue)
  ykey <- enquo(ykey)
  yvalue <- enquo(yvalue)
  # First gather builds the x side; cbind() re-attaches the original columns
  data %>% {
    cbind(gather(., key = !!xkey, value = !!xvalue, !!!vars,
                 na.rm = na.rm, convert = convert, factor_key = factor_key),
          select(., !!!vars))
  } %>% gather(., key = !!ykey, value = !!yvalue, !!!vars,
               na.rm = na.rm, convert = convert, factor_key = factor_key) %>%
    # Drop the diagonal, then label each unordered pair with a sorted key
    filter(!(.xkey == .ykey)) %>%
    mutate(var = apply(.[, c(".xkey", ".ykey")], 1, function(x) paste(sort(x), collapse = ""))) %>%
    arrange(var)
}

test = iris %>%
  gatherpairs(sapply(colnames(iris[, -ncol(iris)]), eval))
The function was taken from https://stackoverflow.com/a/47731111/8315659
This gives me a data frame with every combination of feature1 and feature2, but I want to remove the duplicates that are just the same pair reversed. For example, Petal.Length vs Petal.Width is the same as Petal.Width vs Petal.Length. However, if two rows have identical values for Petal.Length vs Petal.Width, I do not want to drop either of them. So I only want to drop rows where all values are identical except that .xkey and .ykey are reversed. Essentially, this is to recreate just the bottom triangle of the ggplot matrix shown in the answer linked above.
How can this be done?
Jack
r dplyr tidyr
edited Dec 27 '18 at 17:45
PoGibas
15.5k134175
asked Dec 27 '18 at 15:50
Jack Arnestad
837211
1 Answer
I think this could be accomplished using the first part of the source code, which performs a single gathering operation. Using the iris example, this will produce 600 rows of output, one for each of the 150 rows x 4 columns in iris.
gatherpairs <- function(data, ...,
                        xkey = '.xkey', xvalue = '.xvalue',
                        ykey = '.ykey', yvalue = '.yvalue',
                        na.rm = FALSE, convert = FALSE, factor_key = FALSE) {
  vars <- quos(...)
  xkey <- enquo(xkey)
  xvalue <- enquo(xvalue)
  ykey <- enquo(ykey)
  yvalue <- enquo(yvalue)
  data %>% {
    cbind(gather(., key = !!xkey, value = !!xvalue, !!!vars,
                 na.rm = na.rm, convert = convert, factor_key = factor_key),
          select(., !!!vars))
  } # %>% gather(., key = !!ykey, value = !!yvalue, !!!vars,
  #            na.rm = na.rm, convert = convert, factor_key = factor_key)%>%
  #   filter(!(.xkey == .ykey)) %>%
  #   mutate(var = apply(.[, c(".xkey", ".ykey")], 1, function(x) paste(sort(x), collapse = ""))) %>%
  #   arrange(var)
}
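If the fully paired output from the question is still needed, a different approach (not part of the answer above) might be to keep only one orientation of each feature pair by comparing the two key columns. The following is a minimal sketch on a hand-built toy data frame. The name `pair_rows` and its values are made up for illustration, and the column names assume the default `.xkey`/`.ykey` from the question's function.

```r
library(dplyr)

# Toy stand-in for the full gatherpairs() output (hypothetical values):
# one feature pair in both orientations, with the same observation
# appearing twice to mimic genuinely repeated measurements.
pair_rows <- data.frame(
  Species = "setosa",
  .xkey   = c("Petal.Length", "Petal.Length", "Petal.Width", "Petal.Width"),
  .xvalue = c(1.4, 1.4, 0.2, 0.2),
  .ykey   = c("Petal.Width", "Petal.Width", "Petal.Length", "Petal.Length"),
  .yvalue = c(0.2, 0.2, 1.4, 1.4),
  stringsAsFactors = FALSE
)

# Keep only rows whose x key sorts before the y key. Each unordered pair
# survives in exactly one orientation, and repeated observations within
# that orientation are left alone. as.character() guards against the keys
# being factors (e.g. when factor_key = TRUE).
lower <- pair_rows %>%
  filter(as.character(.xkey) < as.character(.ykey))

nrow(lower)
```

Because the filter looks only at the key columns, it never compares measurement values, so two observations that happen to have identical values are both kept. Using `>` instead of `<` would keep the opposite triangle.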
answered Dec 28 '18 at 0:08
Jon Spring
5,2331625