Train/Test Split Python
There are 250 randomly generated data points that are obtained as follows:
[X, y] = getDataSet() # getDataSet() randomly generates 250 data points
X looks like:
[array([[-2.44141527e-01, 8.39016956e-01],
[ 1.37468561e+00, 4.97114860e-01],
[ 3.08071887e-02, -2.03260255e-01],...
While y looks like:
y is array([[0.],
[0.],
[0.],...
(it also contains 1s)
So, I'm trying to split [X, y] into training and testing sets. The training set is supposed to be a random selection of 120 of the randomly generated data points. Here is how I'm generating the training set:
nTrain = 120
maxIndex = len(X)
randomTrainingSamples = np.random.choice(maxIndex, nTrain, replace=False)
trainX = X[randomTrainingSamples, :] # training samples
trainY = y[randomTrainingSamples, :] # labels of training samples nTrain X 1
Now, what I can't seem to figure out is how to get the testing set, which is the other 130 randomly generated data points that are not included in the training set:
testX = # testing samples
testY = # labels of testing samples nTest x 1
Suggestions are much appreciated. Thank you!
python numpy machine-learning
asked Dec 28 '18 at 5:45
MatthewSpire
3 Answers
You can try this: collect every index that was not selected for training and use those rows as the test set.
randomTestingSamples = [i for i in range(maxIndex) if i not in randomTrainingSamples]
testX = X[randomTestingSamples, :] # testing samples
testY = y[randomTestingSamples, :] # labels of testing samples nTest x 1
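The list comprehension scans the whole training index array for every candidate index, which is fine for 250 points but gets slow on larger data. A minimal NumPy-only sketch of the same idea with a boolean mask is shown here; the stand-in X and y arrays and the seed are assumptions for illustration, not the output of the asker's getDataSet().

```python
import numpy as np

# Stand-in data (assumption): 250 points with 2 features and 0/1 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(250, 2))
y = rng.integers(0, 2, size=(250, 1)).astype(float)

nTrain = 120
maxIndex = len(X)
randomTrainingSamples = rng.choice(maxIndex, nTrain, replace=False)

# Start with every row marked as "test", then switch off the training rows.
isTest = np.ones(maxIndex, dtype=bool)
isTest[randomTrainingSamples] = False

trainX, trainY = X[randomTrainingSamples, :], y[randomTrainingSamples, :]
testX, testY = X[isTest, :], y[isTest, :]

print(trainX.shape, testX.shape)  # (120, 2) (130, 2)
```

np.setdiff1d(np.arange(maxIndex), randomTrainingSamples) should give the same test indices as a sorted integer array, if index arrays are preferred over a mask.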
answered Dec 28 '18 at 6:02
feed liu
I think this worked. Thank you!
– MatthewSpire
Dec 28 '18 at 12:51
I wanted to let you know this was for an assignment and I have attempted to give credit for your help.
– MatthewSpire
Dec 29 '18 at 17:28
You can use sklearn.model_selection.train_test_split:
import numpy as np
from sklearn.model_selection import train_test_split
# Placeholder arrays with the same shapes as the data in the question
X, y = np.zeros((250, 2)), np.zeros((250, 1))
# An integer test_size is the absolute number of test samples
trainX, testX, trainY, testY = train_test_split(X, y, test_size=130)
trainX.shape
# (120, 2)
testX.shape
# (130, 2)
trainY.shape
# (120, 1)
testY.shape
# (130, 1)
answered Dec 28 '18 at 5:47
Chris
Cannot use sklearn, otherwise I would have. Thank you!
– MatthewSpire
Dec 28 '18 at 12:46
@MatthewSpire Only numpy then?
– Chris
Dec 28 '18 at 12:48
Yes sir. I've already got the training, but I can't seem to figure out how to select the other 130 for testing.
– MatthewSpire
Dec 28 '18 at 12:50
I think feed liu got it 'cause that code seems to work.
– MatthewSpire
Dec 28 '18 at 12:52
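You can shuffle the indices and take the first 120 as the training set and the remaining 130 as the test set. Note that np.random.shuffle shuffles in place and returns None, so use np.random.permutation to get the shuffled indices back:
random_index = np.random.permutation(len(X))  # shuffled copy of 0 .. len(X)-1
randomTrainingSamples = random_index[:120]
randomTestSamples = random_index[120:250]
trainX = X[randomTrainingSamples, :]
trainY = y[randomTrainingSamples, :]
testX = X[randomTestSamples, :]
testY = y[randomTestSamples, :]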
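The shuffle-and-slice idea can also be wrapped in a small helper so the split is reproducible. This is only a sketch: the function name split_train_test, its seed parameter, and the stand-in arrays are hypothetical and not part of the question or of NumPy.

```python
import numpy as np

def split_train_test(X, y, n_train, seed=None):
    """Randomly split the rows of X and y into n_train training rows and the rest as test rows."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))  # shuffled row indices, each appearing once
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Stand-in data (assumption): 250 rows, 2 features, alternating 0/1 labels.
X = np.arange(500, dtype=float).reshape(250, 2)
y = (np.arange(250) % 2).reshape(250, 1).astype(float)

trainX, trainY, testX, testY = split_train_test(X, y, n_train=120, seed=0)
print(trainX.shape, trainY.shape, testX.shape, testY.shape)
# (120, 2) (120, 1) (130, 2) (130, 1)
```

Because both halves come from one permutation, no row can land in both sets, and passing the same seed repeats the same split.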
answered Dec 28 '18 at 5:52
Ernest S Kirubakaran