What is screwing up my pickling of scikit-learn model objects?
I'm at least assuming something is screwing that up...
Here is what is going on. I am making ML models for currency pairs in the foreign exchange market. I make one model for each pair I'm looking at, so I have 8 models.
I do the following:
1.) Make a model based on data from a CSV file for each pair, kind of like this:
namelist = ['usdcad', 'eurjpy', 'usdjpy', 'gbpjpy', 'audusd', 'usdchf', 'nzdusd', 'gbpusd', 'eurusd']
for filename in namelist:
    with open(filename+"/all.csv",'r') as dest_f:
        data_iter = csv.reader(dest_f,
                               delimiter = ",",
                               quotechar = '"')
        data = [data for data in data_iter]
    ##do some stuff to organize the data
    model50 = linear_model.LinearRegression()
    model50.fit(X, Y50)
    afile = open(r''+filename+'.pkl', 'wb')
    pickle.dump(model50, afile)
    afile.close()
Then in another file I load real-time data, which I use as input for my model, and save that data to CSVs:
namelist = ['USDCAD', 'EURJPY', 'USDJPY', 'GBPJPY', 'AUDUSD', 'USDCHF', 'USDCAD', 'NZDUSD', 'GBPUSD', 'EURUSD']
for name in namelist:
    CSV_URL = 'REMOVED BECAUSE IT HAS MY KEY'+name
    with requests.Session() as s:
        download = s.get(CSV_URL)
        decoded_content = download.content.decode('utf-8')
        cr = csv.reader(decoded_content.splitlines(), delimiter=',')
        my_list = list(cr)
        del my_list[:18]
        del my_list[-1]
        with open(name+".csv", "wb") as f:
            writer = csv.writer(f)
            writer.writerows(my_list)
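For reference, here is a minimal sketch of the CSV-writing step as it would look on Python 3, where `csv.writer` needs a text-mode file and opening with `"wb"` fails with a `TypeError`. The question does not say which Python version is in use, so this is an assumption; the helper name `save_rows` and its parameters are hypothetical, and the trimming only mirrors the two `del` calls above.

```python
import csv

def save_rows(rows, path, skip_head=18, skip_tail=1):
    """Drop leading/trailing rows and write the rest as CSV (Python 3).

    Hypothetical helper: skip_head/skip_tail mirror `del my_list[:18]`
    and `del my_list[-1]` from the download script.
    """
    trimmed = rows[skip_head:len(rows) - skip_tail]
    # Text mode with newline="" is what the csv module documents for writers.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(trimmed)
    return trimmed
```

Reading the file back with `csv.reader` should return exactly the rows that were kept, which is a quick way to confirm the prediction script sees the intended input.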
Then finally, I run all my predictions at once:
namelist = ['USDCAD', 'EURJPY', 'USDJPY', 'GBPJPY', 'AUDUSD', 'USDCHF', 'USDCAD', 'NZDUSD', 'GBPUSD', 'EURUSD']
for name in namelist:
    with open(name+".csv",'r') as dest_f:
        data_iter = csv.reader(dest_f,
                               delimiter = ",",
                               quotechar = '"')
        res = [data for data in data_iter]
    res = [[x[3],x[4],x[5],x[6]] for x in res]
    input = np.zeros((1,len(res)*4))
    indx = 0
    while 1:
        for row in res:
            for col in row:
                input[0,indx] = float(col)
                indx += 1
        break
    pname = name.lower()
    file = open(r''+pname+'.pkl', 'rb')
    mymodel = pickle.load(file)
    file.close()
    prediction = mymodel.predict(input)
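The flattening loop above can also be written with a single NumPy reshape. This is only a sketch of an equivalent way to build the input row, not a fix for the problem; the function name `rows_to_input` and the `cols` parameter are hypothetical. The number of resulting columns still has to match the number of features the model was fitted on.

```python
import numpy as np

def rows_to_input(rows, cols=(3, 4, 5, 6)):
    """Flatten the selected columns of every row into one (1, n) float array.

    Hypothetical helper; the default column indices are the ones used
    in the prediction loop (x[3] .. x[6]).
    """
    selected = [[row[c] for c in cols] for row in rows]
    # Row-major reshape keeps the same order as the nested for-loops.
    return np.asarray(selected, dtype=float).reshape(1, -1)
```

Printing the shape of the returned array next to the shape of `X` used in training is a cheap check that both scripts agree on the feature layout.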
Sorry about any small formatting errors, I am still pretty bad at using this site properly.
Anyway, what actually happens is that two of my predictions are correct, so the model is clearly working right there. The other 6 predictions make NO SENSE AT ALL. It's like it's trying to use one of the other models, but it shouldn't be... Or like the wrong model gets pickled to a filename. I've examined the inputs and the files, and they are all correct. Furthermore, using the exact same CSV files and the exact same techniques, I made a model and predicted JUST usdcad, and it works great! But the usdcad prediction comes out totally wrong when I use the scripts above.
Does anyone know what is going wrong here? I can't figure it out. Thanks.
QUICK EDIT:
I've confirmed that the problem is with the pickled models. It's overwriting them or keeping old model objects in memory or something...
EDIT2: Some extra info
I have discovered that if I pickle my model, then load it, then use it, all in the same script, it works fine. If I pickle my model and then load it in another script (using the exact same code), it does not work correctly. This is the case even when I only make one model.
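One way to narrow this down is to check that both scripts really touch the same file. The sketch below is a diagnostic only, under the assumption that the second script might be reading a stale or different .pkl (for example from another working directory); the question does not confirm that. The function names `fingerprint` and `dump_with_fingerprint` are hypothetical.

```python
import hashlib
import os
import pickle

def fingerprint(path):
    """Return (absolute path, size in bytes, SHA-256 hex digest) of a file."""
    with open(path, "rb") as f:
        payload = f.read()
    return os.path.abspath(path), len(payload), hashlib.sha256(payload).hexdigest()

def dump_with_fingerprint(obj, path):
    """Pickle obj to path and return the fingerprint of what was written."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return fingerprint(path)
```

Printing `dump_with_fingerprint(model50, filename + '.pkl')` in the training script and `fingerprint(pname + '.pkl')` in the prediction script should give identical tuples; if they differ, the prediction script is not loading the file that was just written.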
python csv scikit-learn pickle
You should try running pickle.load with a specified encoding to see if that fixes your issue on import.
– Dascienz
Jan 3 at 17:07
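A minimal sketch of what this comment suggests, assuming the .pkl files were written under Python 2 and are being read under Python 3 (the question does not state this). The `encoding` keyword of `pickle.load` exists only on Python 3, and `"latin1"` is the value the pickle documentation mentions for NumPy arrays pickled by Python 2. The helper name `load_legacy_pickle` is hypothetical.

```python
import pickle

def load_legacy_pickle(path, encoding="latin1"):
    """Load a pickle, decoding Python 2 byte strings with the given encoding."""
    with open(path, "rb") as f:
        # encoding only affects 8-bit strings written by Python 2;
        # pickles created by Python 3 load the same way with or without it.
        return pickle.load(f, encoding=encoding)
```

If the training and prediction scripts run under the same interpreter, this is unlikely to change anything, so it is worth comparing `sys.version` in both first.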
edited Jan 3 at 16:53
asked Jan 3 at 16:02
Travis Black