Does dplyr::row_number() calculate the row number for each observation? If so, how?
In the tidyverse website reference I saw two usages: mutate(mtcars, row_number() == 1L) and mtcars %>% filter(between(row_number(), 1, 10)).
It would be straightforward to think that row_number() returns the row number of each observation in the data frame.
However, the documentation emphasizes that the function is a window function and is similar to sortperm in other languages, as in this example:
x <- c(5, 1, 3, 2, 2, NA)
row_number(x)
# [1] 5 1 4 2 3 NA
Is this function intended to report the row number of each observation? If so, what is the logic behind the function call?
Thanks!
r dplyr row-number
edited Jan 3 at 0:44
Julius Vainora
38.2k76685
asked Jan 3 at 0:05
Daniel
304
1 Answer
As ?row_number says, row_number is equivalent to rank(ties.method = "first"), where rank (see ?rank) returns the sample ranks of the values in a vector. With "first", ties are broken by order of appearance, so tied values receive increasing ranks in the order in which they occur:
row_number
# function (x)
# rank(x, ties.method = "first", na.last = "keep")
# <bytecode: 0x108538478>
# <environment: namespace:dplyr>
So,
x <- c(5, 1, 3, 2, 2, NA)
row_number(x)
# [1] 5 1 4 2 3 NA
rank(x, ties.method = "first", na.last = "keep") # I added na.last = "keep" to fully replicate row_number
# [1] 5 1 4 2 3 NA
since
sort(x)
# [1] 1 2 2 3 5
and the first 2 gets the lower rank of the two tied values because of ties.method = "first".
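To connect this with the sort permutation mentioned in the question, here is a minimal base-R sketch that assumes only the vector x from the example (the names p and manual are illustrative, not from the documentation). order() returns the indices that sort x, whereas the ranks produced by rank(ties.method = "first") are the inverse of that permutation, so the two are related but not the same thing.

```r
x <- c(5, 1, 3, 2, 2, NA)

# Ranks as computed above: ties broken by order of appearance, NA kept as NA.
r <- rank(x, ties.method = "first", na.last = "keep")

# The sorting permutation: the indices that would sort x (NA dropped).
p <- order(x, na.last = NA)
# p is 2 4 5 3 1, i.e. x[p] is 1 2 2 3 5

# Inverting the permutation gives each element's position in the sorted
# vector, which is exactly its rank.
manual <- rep(NA_integer_, length(x))
manual[p] <- seq_along(p)

manual
# [1]  5  1  4  2  3 NA
```

So the ranks answer "where does this element land after sorting?", while order() answers "which element lands in this position?".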
Now, when we call row_number() with no arguments inside filter or mutate, it does indeed seem to simply return a vector of row numbers, as can be found here.
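The two behaviours can be compared side by side in a small sketch. It assumes dplyr is installed; the data frame df and the column names g, v, pos and rnk are made up for illustration. On grouped data, both forms are evaluated within each group.

```r
library(dplyr)

# Hypothetical toy data: two groups with unsorted values.
df <- data.frame(
  g = c("a", "a", "b", "b", "b"),
  v = c(10, 30, 20, 50, 40)
)

out <- df %>%
  group_by(g) %>%
  mutate(
    pos = row_number(),   # no argument: position of the row within its group
    rnk = row_number(v)   # with argument: rank of v within its group
  ) %>%
  ungroup()

# Keep the first row of each group by its position.
first_rows <- df %>%
  group_by(g) %>%
  filter(row_number() == 1L) %>%
  ungroup()
```

Here pos counts rows in their current order, while rnk depends on the values of v, which is why the two columns differ in group "b".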
edited Jan 3 at 0:43
answered Jan 3 at 0:25
Julius Vainora
38.2k76685