Using Standardization in sklearn pipeline

I am using StandardScaler to standardize my dataset; that is, I turn each feature into a z-score by subtracting its mean and dividing by its standard deviation.

I would like to use StandardScaler within sklearn's Pipeline, and I am wondering how exactly the transformation is applied to X_test. In the code below, when I run pipeline.predict(X_test), my understanding is that both StandardScaler and SVC() are run on X_test, but what exactly does StandardScaler use as the mean and the standard deviation? The ones from X_train, or does it compute them from X_test alone? If, for instance, X_test consists of only 2 samples, the standardization would look very different than if I had standardized X_train and X_test together, right?

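from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

steps = [('scaler', StandardScaler()),
         ('model', SVC())]

pipeline = Pipeline(steps)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)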

asked Jan 4 at 7:52 by Tartaglia

edited Jan 4 at 14:19 by desertnaut

scikit-learn normalization pipeline


1 Answer

sklearn's Pipeline calls fit_transform() on each transformer when pipeline.fit() is called, and transform() on each transformer when pipeline.predict() is called. So in your case, StandardScaler is fitted on X_train only, and the mean and standard deviation learned from X_train are then used to scale X_test.
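To check this behaviour yourself, here is a minimal sketch. The data is synthetic and made up purely for illustration (the shapes, the random seed, and the labelling rule are all assumptions, not part of the question). It inspects the fitted scaler through pipeline.named_steps and compares it against statistics computed by hand from X_train.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, assumed only for this illustration.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_train = (X_train[:, 0] > 5.0).astype(int)
X_test = rng.normal(loc=5.0, scale=2.0, size=(2, 3))  # deliberately tiny test set

pipeline = Pipeline([("scaler", StandardScaler()), ("model", SVC())])
pipeline.fit(X_train, y_train)

scaler = pipeline.named_steps["scaler"]

# The fitted statistics come from X_train only.
train_mean_matches = np.allclose(scaler.mean_, X_train.mean(axis=0))
train_std_matches = np.allclose(scaler.scale_, X_train.std(axis=0))

# Scaling X_test reuses those training statistics; nothing is refitted.
X_test_scaled = scaler.transform(X_test)
manual = (X_test - X_train.mean(axis=0)) / X_train.std(axis=0)
test_uses_train_stats = np.allclose(X_test_scaled, manual)

# predict() on the pipeline is equivalent to scaling first, then calling the model.
y_pred = pipeline.predict(X_test)
y_pred_manual = pipeline.named_steps["model"].predict(X_test_scaled)
same_prediction = np.array_equal(y_pred, y_pred_manual)

print(train_mean_matches, train_std_matches, test_uses_train_stats, same_prediction)
```

Note that the test set here has only 2 rows, yet its own mean and standard deviation never enter the computation.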

Scaling with statistics from X_train alone would indeed give different values than scaling with statistics from X_train and X_test combined. How large the difference is depends on how much the distribution of X_train differs from that of the combined data. However, if the two sets are randomly partitioned from the same original dataset and are of a reasonable size, the distributions of X_train and X_test will probably be similar.
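As a rough illustration of that point, the sketch below compares the two scalings on synthetic data. The helper max_scaling_gap, the sample sizes, and the shift of 3.0 are hypothetical choices made only for this example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed only for this illustration.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
X_test_similar = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # same distribution
X_test_shifted = rng.normal(loc=3.0, scale=1.0, size=(200, 2))  # shifted mean


def max_scaling_gap(X_train, X_test):
    """Largest absolute difference between scaling X_test with train-only
    statistics and with statistics from train and test combined."""
    train_only = StandardScaler().fit(X_train)
    combined = StandardScaler().fit(np.vstack([X_train, X_test]))
    diff = train_only.transform(X_test) - combined.transform(X_test)
    return np.abs(diff).max()


gap_similar = max_scaling_gap(X_train, X_test_similar)
gap_shifted = max_scaling_gap(X_train, X_test_shifted)
print(gap_similar, gap_shifted)
```

With a test set drawn from the same distribution the two scalings should nearly agree, while the shifted test set should produce a much larger gap.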

Regardless, it is important to treat X_test as though it were out of sample, so that performance on it is a (hopefully) reliable estimate of performance on unseen data. Since you don't know the distribution of unseen data, you should act as if you don't know the distribution of X_test either, including its mean and standard deviation.

answered Jan 4 at 16:59 by Chris

edited Jan 4 at 17:05

Very happy to hear that, that makes perfect sense. Thank you so much for the explanation Chris!!
– Tartaglia, Jan 4 at 19:55

@Tartaglia glad to be able to help.
– Chris, Jan 4 at 20:20
