How to groupby a percentage range of each value in pandas python












If I have a dataframe of the format:



          date              value
2018-10-31 23:45:00 0.031190
2018-11-01 00:00:00 0.031211
2018-11-01 00:15:00 0.031201
2018-11-01 00:30:00 0.031203
2018-11-01 00:45:00 0.031186
2018-11-01 01:00:00 0.031208
2018-11-01 01:15:00 0.031191
2018-11-01 01:30:00 0.031170
2018-11-01 01:45:00 0.031155
2018-11-01 02:00:00 0.031146
2018-11-01 02:15:00 0.031176
2018-11-01 02:30:00 0.031178
2018-11-01 02:45:00 0.031163
2018-11-01 03:00:00 0.031187
2018-11-01 03:15:00 0.031140
2018-11-01 03:30:00 0.031165
2018-11-01 03:45:00 0.031166
2018-11-01 04:00:00 0.031182
2018-11-01 04:15:00 0.031155
2018-11-01 04:30:00 0.031145
2018-11-01 04:45:00 0.031177
2018-11-01 05:00:00 0.031189
2018-11-01 05:15:00 0.031183
2018-11-01 05:30:00 0.031175
2018-11-01 05:45:00 0.031184
2018-11-01 06:00:00 0.031174
2018-11-01 06:15:00 0.031167
2018-11-01 06:30:00 0.031161
2018-11-01 06:45:00 0.031163
2018-11-01 07:00:00 0.031211
2018-11-01 07:15:00 0.031183
2018-11-01 07:30:00 0.031156
2018-11-01 07:45:00 0.031142
2018-11-01 08:00:00 0.031154
2018-11-01 08:15:00 0.031152
2018-11-01 08:30:00 0.031137
2018-11-01 08:45:00 0.031142
2018-11-01 09:00:00 0.031155
2018-11-01 09:15:00 0.031145
2018-11-01 09:30:00 0.031154
2018-11-01 09:45:00 0.031140
2018-11-01 10:00:00 0.031146
2018-11-01 10:15:00 0.031149
2018-11-01 10:30:00 0.031164
2018-11-01 10:45:00 0.031172
2018-11-01 11:00:00 0.031162
2018-11-01 11:15:00 0.031141
2018-11-01 11:30:00 0.031165
2018-11-01 11:45:00 0.031174
2018-11-01 12:00:00 0.031180


How do I segment the data into groups of a 5% difference in value?



For example, 0.031190 would be in a group of values between 0.0296305 and 0.0327495. If a value is within multiple groups that is fine - in fact it is expected. If a value is not anywhere near any other values, then it will just be by itself.
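Since the desired groups overlap (every value anchors its own ±5% band), ordinary non-overlapping binning will not reproduce this directly. A minimal sketch of one interpretation, using a small hypothetical series rather than the full data above:

```python
import pandas as pd

# Hypothetical sample in the same spirit as the data above,
# plus one outlier to show a value ending up alone.
df = pd.DataFrame({"value": [0.031190, 0.031211, 0.031140, 0.040000]})

def band_members(v, series, pct=0.05):
    """Index labels of every value within +/-pct of v."""
    lo, hi = v * (1 - pct), v * (1 + pct)
    return series[(series >= lo) & (series <= hi)].index.tolist()

# Each row gets the (possibly overlapping) group anchored on its own
# value; the outlier 0.040000 ends up in a group by itself.
df["group"] = df["value"].apply(lambda v: band_members(v, df["value"]))
print(df)
```

Here 0.031190 collects every value between 0.0296305 and 0.0327495, matching the example range in the question, while 0.040000 sits alone in its band.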










Comments:

  • Please include complete sample data with expected results. This data will lead to one group, which is not a valid test in my opinion. – Scott Boston, Jan 3 at 18:20

  • Check with qcut. – Wen-Ben, Jan 3 at 18:24

  • @user2330270, is the answer helpful? – Zanshin, Jan 8 at 14:04
















Tags: python, pandas






asked Jan 3 at 18:16 by user2330270

1 Answer
Based on the data you provided, something like this would work, assuming you want the range divided into 20 bins of 5% each:

df['binned'] = pd.qcut(df['value'], 20)
df = df.groupby('binned')['value'].count()
print(df.head())

binned
(0.031127000000000002, 0.03114]    3
(0.03114, 0.031142]                3
(0.031142, 0.031145]               2
(0.031145, 0.031148]               2
(0.031148, 0.031154]               4
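Note that qcut produces quantile bins, so each bin holds roughly 5% of the rows rather than spanning a 5% range of value. If fixed-width 5% value bands are wanted instead, a sketch using pd.cut with multiplicatively spaced edges (the sample values here are hypothetical):

```python
import numpy as np
import pandas as pd

values = pd.Series([0.031190, 0.031211, 0.031140, 0.031180])

# Bin edges grow by 5% per step from the minimum, so every bin covers
# a fixed 5% range of `value` regardless of how many rows fall in it.
lo, hi = values.min(), values.max()
n_bins = max(int(np.ceil(np.log(hi / lo) / np.log(1.05))), 1)
edges = lo * 1.05 ** np.arange(n_bins + 1)

# include_lowest=True keeps the minimum value inside the first bin.
binned = pd.cut(values, bins=edges, include_lowest=True)
print(binned.value_counts().sort_index())
```

For data as tightly clustered as the question's sample, this collapses to a single bin, which is consistent with Scott Boston's comment that the posted data would lead to one group.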





answered Jan 7 at 18:57 (edited Jan 8 at 7:15) by Zanshin































