Count categorical values in DataFrame


























I have a DataFrame that contains only categorical values:

    Row | Col1 | Col2 | ... | ColM
    1   | X    | Y    | ... | X
    2   | Z    | X    | ... | Y
    3   | Y    | Z    | ... | X
    ...
    N   | X    | Z    | ... | Z

I would like to count how many times each category appears in the whole DataFrame. Example result:

    X - 100 times
    Y - 30 times
    Z - 210 times

Thank you for your help.




























  • df.stack().value_counts()? – coldspeed, Dec 27 '18 at 18:23
















python-3.x pandas dataframe
















asked Dec 27 '18 at 18:22 by goskan




















1 Answer














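The most performant option is usually np.unique with the return_counts flag set. It flattens the whole DataFrame and returns the sorted unique values together with their counts:

    import numpy as np
    import pandas as pd

    u, c = np.unique(df, return_counts=True)  # unique values and their counts
    pd.Series(c, index=u)                     # label each count with its value

There's also stack followed by value_counts, which is much slower but simple and intuitive:

    df.stack().value_counts()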
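
As a rough illustration, here is a minimal sketch that runs both approaches on a small made-up frame. The column names and values are assumptions chosen for the example, not the asker's data, and the frame is assumed to contain no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 frame of category labels, standing in for the N x M frame.
df = pd.DataFrame({
    "Col1": ["X", "Z", "Y"],
    "Col2": ["Y", "X", "Z"],
    "Col3": ["X", "Y", "X"],
})

# Approach 1: flatten the frame to a NumPy array and count unique values.
values, counts = np.unique(df.to_numpy(), return_counts=True)
counts_np = pd.Series(counts, index=values)

# Approach 2: stack all columns into a single Series, then count.
counts_pd = df.stack().value_counts()

print(counts_np.to_dict())
print(counts_pd.to_dict())
```

Both results report X four times, Y three times and Z twice. The np.unique result is ordered by label, while value_counts orders by descending count.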






















  • That's the answer. I was thinking about value_counts() but didn't think of stack() at all. Thank you! – goskan, Dec 27 '18 at 18:28























answered Dec 27 '18 at 18:25 by coldspeed














