Checking whether value in current row of pandas Series is in lagging window













I have a pandas DataFrame similar to the one generated by this code:



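import numpy as np
import pandas as pd

names = ['steve', 'bob', 'harry', 'jeff'] * 5
df = pd.DataFrame(
    # pd.date_range replaces DatetimeIndex(start=..., end=...), which newer pandas rejects
    index=pd.date_range(start='2018-10-10', end='2018-10-29', freq='D'),
    data={
        'value': list(range(20)),
        'names': names,
    },
)
df['roll'] = np.random.randint(1, 6, df.shape[0])  # random integers from 1 to 5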


Which produces data that looks like this:



            value  names  roll
2018-10-10      0  steve     2
2018-10-11      1    bob     5
2018-10-12      2  harry     4
2018-10-13      3   jeff     2
2018-10-14      4  steve     2
2018-10-15      5    bob     4
2018-10-16      6  harry     1
2018-10-17      7   jeff     2
2018-10-18      8  steve     3
2018-10-19      9    bob     3
...


I'd like to add a boolean column, result, computed per name: it should be True when the current row's roll value already appears among that name's earlier rows within a 10-day lagging window. That is, I want this:



            value  names  roll  result
2018-10-10      0  steve     2   False
2018-10-11      1    bob     5   False
2018-10-12      2  harry     4   False
2018-10-13      3   jeff     2   False
2018-10-14      4  steve     2    True
2018-10-15      5    bob     4   False
2018-10-16      6  harry     1   False
2018-10-17      7   jeff     2    True
2018-10-18      8  steve     3   False
2018-10-19      9    bob     3   False
...
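
To pin the rule down, here is a deliberately slow reference loop. It is only a sketch: it assumes the window covers strictly earlier rows of the same name that are less than 10 days old, it hard-codes the ten sample rows instead of drawing random rolls, and in_lagging_window is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# First ten sample rows, with the roll values typed in rather than randomly drawn.
df = pd.DataFrame(
    index=pd.date_range('2018-10-10', periods=10, freq='D'),
    data={
        'value': list(range(10)),
        'names': ['steve', 'bob', 'harry', 'jeff'] * 2 + ['steve', 'bob'],
        'roll': [2, 5, 4, 2, 2, 4, 1, 2, 3, 3],
    },
)


def in_lagging_window(group, window='10D'):
    """Hypothetical helper: flag rows whose roll already occurred in the lagging window."""
    span = pd.Timedelta(window)
    flags = []
    for ts, roll in group['roll'].items():
        # Strictly earlier rows of this group that are less than `span` old.
        in_window = (group.index < ts) & (group.index > ts - span)
        earlier = group.loc[in_window, 'roll']
        flags.append(bool((earlier == roll).any()))
    return pd.Series(flags, index=group.index)


# Evaluate each name separately, then stitch the pieces back into date order.
parts = [in_lagging_window(g) for _, g in df.groupby('names')]
df['result'] = pd.concat(parts).sort_index()
print(df)
```

This does one boolean scan per row, so it is quadratic within each group. That is fine as a correctness check, but it is not the pandas-native approach I'm after.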


I've tried this:



df['result'] = (
    df.groupby('names')
      .apply(lambda x: x['roll'].isin(x.shift().rolling('10D')['roll']))
)


which feels logical to me, but I get a NotImplementedError that points me here: https://github.com/pandas-dev/pandas/issues/11704.



Is there a pandas-native way to get where I want to be?










      python pandas dataframe
















asked Dec 29 '18 at 2:41 by bigsim
























          1 Answer














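I don't think rolling is needed here. Group by both names and roll, then check whether the previous row with the same pair is less than 10 days back. The previous occurrence is the nearest one, so if it falls outside the window, every older one does too. The first occurrence of a pair has no predecessor, so its diff is missing and the comparison gives False. This assumes the rows are sorted by date; after reset_index() the dates sit in a column named index.

df.reset_index().groupby(['names', 'roll'])['index'].diff().dt.days < 10
Out[49]:
0    False
1    False
2    False
3    False
4     True
5    False
6    False
7     True
8    False
9    False
Name: index, dtype: bool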
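
To write the flags back onto the original date-indexed frame, the same idea can be used without resetting the index. This is a minimal sketch under a few assumptions: the index is sorted and unique, the roll values are hard-coded from the question's sample table rather than randomly drawn, and stamps and gap are illustrative variable names.

```python
import pandas as pd

# The ten sample rows from the question, with roll values typed in by hand.
df = pd.DataFrame(
    index=pd.date_range('2018-10-10', periods=10, freq='D'),
    data={
        'value': list(range(10)),
        'names': ['steve', 'bob', 'harry', 'jeff'] * 2 + ['steve', 'bob'],
        'roll': [2, 5, 4, 2, 2, 4, 1, 2, 3, 3],
    },
)

# Turn the timestamps into an ordinary Series so they can be diffed per group.
stamps = df.index.to_series()

# Gap between each row and the previous row with the same (names, roll) pair;
# the first occurrence of a pair has no predecessor and yields NaT.
gap = stamps.groupby([df['names'], df['roll']]).diff()

# Comparing NaT with a Timedelta gives False, so first occurrences are never flagged.
df['result'] = gap < pd.Timedelta(days=10)
print(df)
```

Comparing against pd.Timedelta directly avoids the .dt.days step and keeps the result aligned with the DatetimeIndex, so it can be assigned as a column as-is.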





answered Dec 29 '18 at 3:16 by W-B





























