Does dplyr::row_number() calculate row number for each obs? If so, how?












On the tidyverse reference site, I saw two usages: mutate(mtcars, row_number() == 1L) and mtcars %>% filter(between(row_number(), 1, 10)). From these it would be straightforward to conclude that row_number() returns the row number of each observation in the data frame.



However, the documentation emphasizes that the function is a window function and is similar to sortperm in other languages, as in this example:



x <- c(5, 1, 3, 2, 2, NA)
row_number(x)
# [1] 5 1 4 2 3 NA


Is this function intended to report the row number of each observation? If so, what is the logic behind the function call?



Thanks!










      r dplyr row-number






      edited Jan 3 at 0:44









      Julius Vainora

      asked Jan 3 at 0:05









      Daniel

          1 Answer

          As ?row_number says, row_number() is equivalent to rank(ties.method = "first"). Here rank() (see ?rank) returns the sample ranks of the values in a vector, and ties.method = "first" breaks ties by position: tied values receive increasing ranks in the order they appear, so the result is a permutation. Printing the function shows this:



          row_number
          # function (x)
          # rank(x, ties.method = "first", na.last = "keep")
          # <bytecode: 0x108538478>
          # <environment: namespace:dplyr>


          So,



          x <- c(5, 1, 3, 2, 2, NA)
          row_number(x)
          # [1] 5 1 4 2 3 NA
          rank(x, ties.method = "first", na.last = "keep") # na.last = "keep" is added to fully replicate row_number()
          # [1] 5 1 4 2 3 NA


          since



          sort(x)
          # [1] 1 2 2 3 5


          and the first 2 receives the lower rank because of ties.method = "first".
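The question mentions sortperm, so it may help to see how the ranking relates to a sorting permutation. The following is a minimal sketch in base R only, assuming that order() plays the role of sortperm here; the helper names r, o, n_ok and inv are illustrative and not from the original post.

```r
# Base R only: rank() with ties.method = "first" is the call that appears
# in the row_number() body printed above.
x <- c(5, 1, 3, 2, 2, NA)

r <- rank(x, ties.method = "first", na.last = "keep")
r
# [1]  5  1  4  2  3 NA

# order() returns the permutation that sorts x (NA is placed last by default).
o <- order(x)
o
# [1] 2 4 5 3 1 6

# On the non-missing values, ranking is the inverse of that permutation:
# position o[k] holds the k-th smallest value, so its rank is k.
n_ok <- sum(!is.na(x))
inv <- rep(NA_integer_, length(x))
inv[o[seq_len(n_ok)]] <- seq_len(n_ok)

isTRUE(all.equal(r, inv))
```

So row_number(x) answers "which position would this value take after sorting?", whereas a sorting permutation answers "which element goes into this position?".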



          When row_number() is called with no argument inside filter() or mutate(), it does indeed seem to simply return a vector of row numbers, as shown at the page linked from the original answer.
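To make the two behaviours concrete, here is a small sketch that assumes dplyr is installed. The data frame df and its columns g and v are made up for illustration and do not come from the original post.

```r
library(dplyr)

# Hypothetical toy data, not from the original post.
df <- data.frame(
  g = c("a", "a", "b", "b", "b"),
  v = c(10, 5, 7, 7, 1)
)

out <- df %>%
  mutate(
    pos    = row_number(),   # no argument: position of each row
    rank_v = row_number(v)   # with a vector: rank of v, ties broken by position
  )
out$pos
# [1] 1 2 3 4 5
out$rank_v
# [1] 5 2 3 4 1

# On grouped data the numbering restarts within each group.
by_g <- df %>%
  group_by(g) %>%
  mutate(pos_in_group = row_number()) %>%
  ungroup()
by_g$pos_in_group
# [1] 1 2 1 2 3
```

In other words, the argument-free form is what the mutate() and filter() examples in the question rely on, while the one-argument form is the ranking behaviour described above.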






                edited Jan 3 at 0:43

























                answered Jan 3 at 0:25









                Julius Vainora































