How to group by a percentage range of each value in pandas (Python)

If I have a dataframe of the format:

date                    value
2018-10-31 23:45:00     0.031190
2018-11-01 00:00:00     0.031211
2018-11-01 00:15:00     0.031201
2018-11-01 00:30:00     0.031203
2018-11-01 00:45:00     0.031186
2018-11-01 01:00:00     0.031208
2018-11-01 01:15:00     0.031191
2018-11-01 01:30:00     0.031170
2018-11-01 01:45:00     0.031155
2018-11-01 02:00:00     0.031146
2018-11-01 02:15:00     0.031176
2018-11-01 02:30:00     0.031178
2018-11-01 02:45:00     0.031163
2018-11-01 03:00:00     0.031187
2018-11-01 03:15:00     0.031140
2018-11-01 03:30:00     0.031165
2018-11-01 03:45:00     0.031166
2018-11-01 04:00:00     0.031182
2018-11-01 04:15:00     0.031155
2018-11-01 04:30:00     0.031145
2018-11-01 04:45:00     0.031177
2018-11-01 05:00:00     0.031189
2018-11-01 05:15:00     0.031183
2018-11-01 05:30:00     0.031175
2018-11-01 05:45:00     0.031184
2018-11-01 06:00:00     0.031174
2018-11-01 06:15:00     0.031167
2018-11-01 06:30:00     0.031161
2018-11-01 06:45:00     0.031163
2018-11-01 07:00:00     0.031211
2018-11-01 07:15:00     0.031183
2018-11-01 07:30:00     0.031156
2018-11-01 07:45:00     0.031142
2018-11-01 08:00:00     0.031154
2018-11-01 08:15:00     0.031152
2018-11-01 08:30:00     0.031137
2018-11-01 08:45:00     0.031142
2018-11-01 09:00:00     0.031155
2018-11-01 09:15:00     0.031145
2018-11-01 09:30:00     0.031154
2018-11-01 09:45:00     0.031140
2018-11-01 10:00:00     0.031146
2018-11-01 10:15:00     0.031149
2018-11-01 10:30:00     0.031164
2018-11-01 10:45:00     0.031172
2018-11-01 11:00:00     0.031162
2018-11-01 11:15:00     0.031141
2018-11-01 11:30:00     0.031165
2018-11-01 11:45:00     0.031174
2018-11-01 12:00:00     0.031180

How do I segment the data into groups of a 5% difference in value?

For example, 0.031190 would be in a group of values between 0.0296305 and 0.0327495. If a value falls within multiple groups, that is fine; in fact, it is expected. If a value is not near any other values, it will simply be in a group by itself.
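
The bounds in the example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name `pct_band` and its `pct` parameter are made up for illustration and are not part of pandas.

```python
def pct_band(value, pct=0.05):
    """Return the (lower, upper) bounds of the +/- pct band around value."""
    return value * (1 - pct), value * (1 + pct)

lower, upper = pct_band(0.031190)
print(lower, upper)  # approximately 0.0296305 and 0.0327495

# The smallest (0.031137) and largest (0.031211) values in the sample both
# fall inside this band, so every sample value belongs to this group.
in_band = [lower <= v <= upper for v in (0.031137, 0.031211)]
print(in_band)
```

With the sample shown, the band around any one value therefore covers the whole data set.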

asked Jan 3 at 18:16

user2330270

Please include complete sample data with expected results. This data will lead to one group, which is not a valid test in my opinion.

– Scott Boston
Jan 3 at 18:20

Check with qcut.

– Wen-Ben
Jan 3 at 18:24

@user2330270, is the answer helpful?

– Zanshin
Jan 8 at 14:04


1 Answer

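Based on the data you provided, something like this should work, assuming you want the range divided into 20 bins. Note that pd.qcut bins by quantile, so each bin holds roughly 5% of the rows rather than spanning 5% of the value.

import pandas as pd

# Assign each row to one of 20 quantile bins.
df['binned'] = pd.qcut(df['value'], 20)

# Count the rows per bin (stored separately so df is not overwritten).
counts = df.groupby('binned')['value'].count()
print(counts.head())

Output:

binned
(0.031127000000000002, 0.03114]    3
(0.03114, 0.031142]                3
(0.031142, 0.031145]               2
(0.031145, 0.031148]               2
(0.031148, 0.031154]               4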
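
The question also allows overlapping groups, one per value, which non-overlapping bins cannot express. One possible way to build such groups is a pairwise comparison using NumPy broadcasting. The following is a minimal sketch, not a definitive implementation: it assumes a DataFrame `df` with `date` and `value` columns as in the question, reproduces only the first four sample rows, and adds one made-up outlier row (value 0.040000, with a date after the end of the sample) to show a value that ends up by itself. The names `pct`, `within`, `members`, and `group_size` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            [
                "2018-10-31 23:45:00",
                "2018-11-01 00:00:00",
                "2018-11-01 00:15:00",
                "2018-11-01 00:30:00",
                "2018-11-01 12:15:00",  # made-up row, not in the question
            ]
        ),
        # First four values come from the question; the last one is a
        # made-up outlier so that not everything lands in one group.
        "value": [0.031190, 0.031211, 0.031201, 0.031203, 0.040000],
    }
)

pct = 0.05  # half-width of the band, as a fraction of each value
v = df["value"].to_numpy()

# within[i, j] is True when value j lies inside the +/- 5% band of value i.
within = np.abs(v[None, :] - v[:, None]) <= pct * np.abs(v[:, None])

# Size of the (possibly overlapping) group centred on each row.
df["group_size"] = within.sum(axis=1)

# Members of each group, keyed by the date of the row it is centred on.
members = {
    df["date"].iloc[i]: df.loc[within[i], "value"].tolist()
    for i in range(len(df))
}

print(df[["date", "value", "group_size"]])
```

The comparison matrix has one entry per pair of rows, so memory grows quadratically with the number of rows; for long series a sorted-values approach (for example with `numpy.searchsorted`) may be a better fit.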

edited Jan 8 at 7:15

answered Jan 7 at 18:57

Zanshin
