Pandas calling apply on an empty dataframe using the reduce option changes datatypes
I understand that the apply method is called even for an empty DataFrame, and that an error raised inside the applied function is not propagated in that case. I was looking at this Stack Overflow question, which suggests using the reduce option so that the function is not called:
Pandas: why does DataFrame.apply(f, axis=1) call f when the DataFrame is empty?
Consider this example: every value in col1 is less than 10, so the filtered DataFrame is empty. When I use the reduce option, the datatype of col2 changes: the integers are converted to floats.
import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
mask = df["col1"] > 10
df.loc[mask, "col2"] = df[mask].apply(lambda x: x+2, axis=1, result_type='reduce')
print(df)
Expected output:
   col1  col2
0     1     3
1     2     4
Actual output:
   col1  col2
0     1   3.0
1     2   4.0
I am not sure why it converts the integers to decimals. Does anyone know how to avoid this?
python pandas dataframe
5
To avoid float conversion, use
df[mask].apply(lambda x: x+2, axis=1, result_type='reduce').astype(int)
To understand why the float occurs, try axis=0: you will see NaN introduced, which has type float64. Similarly, with axis=1 the output has no rows, but its type has already been converted to float64.
– meW
Jan 2 at 5:08
Simply
df.loc[mask, "col2"] = df[mask].apply(lambda x: x+2, axis=1, result_type='reduce').astype(int)
should do the job.
– pygo
Jan 2 at 5:47
Thank you @meW for the detailed explanation.
– Bala
Jan 3 at 0:33
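One way to sidestep the dtype change described in the comments above is to skip the assignment entirely when the mask selects no rows. The following is only a minimal sketch, not a fix confirmed in this thread. It assumes the intent is to add 2 to col2 for the selected rows, so the lambda returns one scalar per row (row["col2"] + 2) rather than the original x+2 applied to the whole row.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
mask = df["col1"] > 10  # no row satisfies this, so the selection is empty

# Assumption: we only want to modify col2 for the selected rows.
# Skipping the assignment when nothing is selected means apply is never
# called on an empty frame, so no float-typed empty result is written back.
if mask.any():
    df.loc[mask, "col2"] = df.loc[mask].apply(lambda row: row["col2"] + 2, axis=1)

print(df.dtypes)
```

Because the guard runs before apply, it also avoids the original concern about the function being invoked on an empty DataFrame.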
edited Jan 2 at 5:06 by hygull
asked Jan 2 at 5:02 by Bala
1 Answer
You can use pd.to_numeric() to downcast the column back to an integer type.
I will update this if I find a better way to do it.
>>> import pandas as pd
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> mask = df["col1"] > 10
>>> df.loc[mask, "col2"] = df[mask].apply(lambda x: x+2, axis=1, result_type='reduce')
>>>
>>> df
   col1  col2
0     1   3.0
1     2   4.0
>>>
>>> pd.to_numeric(df.col2, downcast='integer')
0    3
1    4
Name: col2, dtype: int8
>>>
>>> df.col2 = pd.to_numeric(df.col2, downcast='integer')
>>> df
   col1  col2
0     1     3
1     2     4
>>>
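Note that downcast='integer' picks the smallest integer type that fits the values, which is why the output above reports int8 rather than the column's original dtype. If keeping the original dtype matters, one possible alternative (a sketch, not part of the answer) is to record the dtype before the assignment and cast back afterwards. The name original_dtype and the explicit astype("float64") step, which stands in for the upcast from the question, are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
original_dtype = df["col2"].dtype  # remember the dtype before any assignment

# Stand-in for the upcast seen in the question: force col2 to float64.
df["col2"] = df["col2"].astype("float64")

# Cast back to the recorded dtype. This raises if the column holds NaN and
# silently truncates fractional values, so it only suits columns whose
# values are still whole numbers.
df["col2"] = df["col2"].astype(original_dtype)

print(df.dtypes)
```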
answered Jan 2 at 5:21 by hygull