What is screwing up my pickling of scikit learn model objects?












At least, I'm assuming something is going wrong with the pickling.
Here is what is going on. I am building ML models for currency pairs in the foreign exchange market, one model for each pair I'm looking at, so I have 8 models.



I do the following:



1.) Make a model for each pair, based on data from a CSV file, roughly like this:



import csv
import pickle

from sklearn import linear_model

namelist = ['usdcad', 'eurjpy', 'usdjpy', 'gbpjpy', 'audusd', 'usdchf', 'nzdusd', 'gbpusd', 'eurusd']

for filename in namelist:
    with open(filename + "/all.csv", 'r') as dest_f:
        data_iter = csv.reader(dest_f, delimiter=",", quotechar='"')
        data = [row for row in data_iter]

    ## do some stuff to organize the data (this builds X and Y50)

    model50 = linear_model.LinearRegression()
    model50.fit(X, Y50)

    # One pickle file per pair, e.g. usdcad.pkl
    with open(filename + '.pkl', 'wb') as afile:
        pickle.dump(model50, afile)
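One way to rule out the "wrong file" theory is to fingerprint each pickle when it is written and again when it is read. The following is only a minimal sketch using the standard library; the helper names `dump_with_fingerprint` and `file_fingerprint` are hypothetical and not part of the scripts above.

```python
import hashlib
import pickle


def dump_with_fingerprint(obj, path):
    """Pickle obj to path and return the SHA-256 hex digest of the bytes written."""
    payload = pickle.dumps(obj)
    with open(path, "wb") as f:
        f.write(payload)
    return hashlib.sha256(payload).hexdigest()


def file_fingerprint(path):
    """Return the SHA-256 hex digest of the file currently stored at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

Printing the digest together with `os.path.abspath(path)` in the training script, and `file_fingerprint(...)` with the absolute path in the prediction script, would show whether both scripts really touch the same file. Differing digests would point to a stale or overwritten pickle, or to a different working directory.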


2.) Then, in another file, I load real-time data to use as input for my models and save that data to CSV files:



import csv

import requests

namelist = ['USDCAD', 'EURJPY', 'USDJPY', 'GBPJPY', 'AUDUSD', 'USDCHF', 'USDCAD', 'NZDUSD', 'GBPUSD', 'EURUSD']

for name in namelist:
    CSV_URL = 'REMOVED BECAUSE IT HAS MY KEY' + name

    with requests.Session() as s:
        download = s.get(CSV_URL)

    decoded_content = download.content.decode('utf-8')

    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    my_list = list(cr)
    del my_list[:18]  # drop the first 18 rows
    del my_list[-1]   # drop the last row
    # "wb" is the Python 2 mode for csv.writer; Python 3 needs open(name + ".csv", "w", newline="")
    with open(name + ".csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerows(my_list)


3.) Finally, I run all my predictions at once:



import csv
import pickle

import numpy as np

namelist = ['USDCAD', 'EURJPY', 'USDJPY', 'GBPJPY', 'AUDUSD', 'USDCHF', 'USDCAD', 'NZDUSD', 'GBPUSD', 'EURUSD']

for name in namelist:
    with open(name + ".csv", 'r') as dest_f:
        data_iter = csv.reader(dest_f, delimiter=",", quotechar='"')
        res = [row for row in data_iter]

    # Keep columns 3-6 of every row and flatten them, row by row,
    # into a single feature vector of shape (1, n_rows * 4).
    res = [[x[3], x[4], x[5], x[6]] for x in res]
    features = np.zeros((1, len(res) * 4))
    indx = 0
    for row in res:
        for col in row:
            features[0, indx] = float(col)
            indx += 1

    pname = name.lower()

    # Load the model that was pickled for this pair, e.g. usdcad.pkl
    with open(pname + '.pkl', 'rb') as f:
        mymodel = pickle.load(f)

    prediction = mymodel.predict(features)
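The flattening step in the prediction loop can be checked in isolation from the model. This is a rough sketch in plain Python, assuming the same column positions (3 to 6) as above; `flatten_rows` and `check_feature_count` are hypothetical helper names, and the expected count would have to come from the training side (for a fitted scikit-learn linear model, something like `model.coef_.shape[-1]`).

```python
def flatten_rows(rows, columns=(3, 4, 5, 6)):
    """Flatten the selected columns of every row, row by row, into one list of floats."""
    return [float(row[c]) for row in rows for c in columns]


def check_feature_count(features, expected):
    """Raise ValueError if the feature vector length differs from the expected count."""
    if len(features) != expected:
        raise ValueError(
            "expected %d features, got %d" % (expected, len(features))
        )
    return features
```

Because the vector length is `len(res) * 4`, it depends on how many rows the real-time CSV contains, so a check like this makes a row-count difference between training data and live data visible early.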


Sorry about any small formatting errors; I am still pretty bad at using this site properly.



Anyway, what actually happens is that two of my predictions are correct, so those models are clearly working. The other 6 predictions make no sense at all. It's as if it's trying to use one of the other models, which it shouldn't be, or as if the wrong model gets pickled to a filename. I've examined the inputs and the files, and they are all correct. Furthermore, using the exact same CSV files and the exact same techniques, I made a model and predicted just usdcad, and it works great! But the usdcad prediction comes out totally wrong when I use the scripts above.



Does anyone know what is going wrong here? I can't figure it out. Thanks.



QUICK EDIT:
I've confirmed that the problem is with the pickled models. It's overwriting them, or keeping old model objects in memory, or something...



EDIT2: Some extra info
I have discovered that if I pickle my model, then load it and use it, all in the same script, it works fine. If I pickle my model and then load it in another script (using the exact same code), it does not work correctly. This is the case even when I only make one model.
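Since the same pickle behaves differently in a second script, it may help to store some context next to the model so the two environments can be compared. The sketch below is only an illustration under that assumption: `save_bundle` and `load_bundle` are hypothetical names, and the recorded fields (Python version, working directory) are just examples. A library version such as `sklearn.__version__` could be recorded the same way.

```python
import os
import pickle
import platform
import warnings


def save_bundle(model, path):
    """Pickle the model together with a little context about where it was saved."""
    bundle = {
        "model": model,
        "python_version": platform.python_version(),
        "saved_from": os.getcwd(),
    }
    with open(path, "wb") as f:
        pickle.dump(bundle, f)


def load_bundle(path):
    """Load a bundle written by save_bundle and warn if the Python version differs."""
    with open(path, "rb") as f:
        bundle = pickle.load(f)
    if bundle["python_version"] != platform.python_version():
        warnings.warn(
            "pickle was written by Python %s, now running %s"
            % (bundle["python_version"], platform.python_version())
        )
    return bundle
```

The loading script would then use `load_bundle(path)["model"]` in place of the bare `pickle.load` result, and any warning or unexpected `saved_from` value would hint at an environment difference between the two scripts.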










  • You should try running pickle.load with a specified encoding to see if that fixes your issue on import.

    – Dascienz
    Jan 3 at 17:07
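For reference, the `encoding` argument mentioned in the comment only matters on Python 3 when the pickle contains Python 2 `str` objects; the question does not say which Python versions are involved, so this is just a sketch of what the suggestion means. `PY2_PICKLE` is a hand-written stand-in for the kind of bytes a Python 2 protocol-2 pickle of a one-byte, non-ASCII `str` contains.

```python
import pickle

# Stand-in for an old pickle: protocol 2, one Python 2 str holding the byte 0xe9.
PY2_PICKLE = b"\x80\x02U\x01\xe9q\x00."


def load_with_encoding(data, encoding):
    """Unpickle data, decoding Python 2 str objects with the given encoding."""
    return pickle.loads(data, encoding=encoding)


def default_load_fails(data):
    """Return True if unpickling with the default encoding raises UnicodeDecodeError."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False
```

With a file object the equivalent call is `pickle.load(f, encoding="latin1")`. The standard library documentation recommends `latin1` for NumPy arrays pickled by Python 2, while `encoding="bytes"` returns the raw bytes undecoded.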
















python csv scikit-learn pickle






edited Jan 3 at 16:53
asked Jan 3 at 16:02 by Travis Black












