Getting a key error while converting npz to csv format
I am trying to convert a .npz file to .csv format, but it gives the following error:

    KeyError: '0 is not a file in the archive'

I had a sparse matrix which I saved in .npz format. I then loaded the npz file using np.load() and tried to convert the loaded object to csv using np.savetxt(), which raises the KeyError shown above.

What does this key error mean, and how can I solve it?

I tried the following code:

    import numpy as np

    DF = np.load("DF_tfidf.npz")
    np.savetxt("DF.csv", DF)

Tags: python-3.x csv numpy scipy sparse-matrix

asked Jan 2 at 4:11 by Hardik Bapna
edited Jan 2 at 17:49 by hpaulj

np.load gives you a dictionary-like object. The actual arrays are accessed by name, or dictionary key. So it doesn't make sense to simply pass this object to the savetxt function. I suspect you are trying to use these functions without learning what they produce and require.
– hpaulj, Jan 2 at 6:32

If you have created a scipy sparse matrix and saved it with save_npz, you have added another layer of complexity. While such a file can be read with np.load, you have to understand the save format first. If instead you use load_npz, you get a sparse matrix, just like what you started with. Saving that to a csv text format is a different topic. The simplest would be to convert it to a dense array with toarray(), and write that with savetxt. But if the sparse matrix is at all large, you could end up with a MemoryError.
– hpaulj, Jan 2 at 6:37

What exactly do you expect the csv to look like?
– hpaulj, Jan 2 at 12:26

Question has nothing to do with machine-learning - kindly do not spam the tag (removed).
– desertnaut, Jan 2 at 15:45

2 Answers

You cannot convert an NPZ file directly to a csv file. First find out which arrays the NPZ file contains:

    import numpy as np

    np_Array = np.load('DF_tfidf.npz')
    print(np_Array.files)

Suppose the print above outputs ['arr_0']. You then need to extract that array and write it to csv:

    arr = np_Array.files[0]
    np.savetxt("DF.csv", np_Array[arr], delimiter=",")

answered Jan 2 at 6:51 by Lakshmi Bhavani - Intel

This ignores the shape and dtype of the arrays. The OP doesn't understand the content of the npz well enough to get meaningful csvs.
– hpaulj, Jan 2 at 12:25
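
One way to take the shape and dtype concern from the comment into account is to check each array before writing it. The sketch below is only an illustration: the helper name npz_to_csvs and the demo archive are made up here, and it assumes that skipping anything that is not a 1-D or 2-D numeric array is acceptable.

```python
import os
import tempfile

import numpy as np


def npz_to_csvs(npz_path, out_dir):
    """Write each 1-D or 2-D numeric array of an npz archive to its own csv.

    Returns a dict mapping array name to the csv path, or to None if skipped.
    (Hypothetical helper, not part of numpy.)
    """
    written = {}
    with np.load(npz_path) as archive:
        for name in archive.files:
            arr = archive[name]
            # savetxt handles only 1-D and 2-D input, and its default
            # format expects numbers, so everything else is skipped.
            if arr.ndim in (1, 2) and np.issubdtype(arr.dtype, np.number):
                path = os.path.join(out_dir, name + ".csv")
                np.savetxt(path, arr, delimiter=",")
                written[name] = path
            else:
                written[name] = None
    return written


# Demo with a throwaway archive holding one table and one 0-d bytes array.
tmp = tempfile.mkdtemp()
demo_npz = os.path.join(tmp, "demo.npz")
np.savez(demo_npz, table=np.arange(6.0).reshape(2, 3), label=np.array(b"csr"))
result = npz_to_csvs(demo_npz, tmp)
print(result)
```

For an archive written by scipy's save_npz this would still produce separate csv files for the internal index arrays, which is rarely what is wanted; loading with load_npz is the better starting point in that case.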

This isn't a problem of how to convert npz to csv, but of how to properly load the data from the npz and then save that as csv. In general an npz is a file archive that contains several arrays. A csv, on the other hand, is a format for saving one 2d array.

You could, in theory, write each file of the npz to its own csv. But if the npz saves some complex object, rather than an arbitrary set of arrays, that's probably not what you want to do. My guess is that you have a scipy.sparse matrix (possibly created in the course of some machine learning project). In that case you should focus on how to write a sparse matrix, or some representation of it, not on converting its npz save.

Let's make a scipy sparse matrix and save it (the session assumes numpy has already been imported as np):

    In [45]: from scipy import sparse
    In [46]: M = sparse.random(4,4,.2,'csr')
    In [47]: M
    Out[47]:
    <4x4 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in Compressed Sparse Row format>
    In [48]: M.toarray()
    Out[48]:
    array([[0.30442216, 0.        , 0.        , 0.        ],
           [0.29783572, 0.        , 0.        , 0.        ],
           [0.        , 0.        , 0.83881939, 0.        ],
           [0.        , 0.        , 0.        , 0.        ]])
    In [49]: sparse.save_npz('sparse.npz',M)

Now load it:

    In [50]: sparse.load_npz('sparse.npz')
    Out[50]:
    <4x4 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in Compressed Sparse Row format>

That's the same thing that we saved.

Now look at it with np.load:

    In [51]: data = np.load('sparse.npz')
    In [52]: list(data.keys())
    Out[52]: ['indices', 'indptr', 'format', 'shape', 'data']
    In [53]: data['indices']
    Out[53]: array([0, 0, 2], dtype=int32)
    In [54]: data['indptr']
    Out[54]: array([0, 1, 2, 3, 3], dtype=int32)
    In [55]: data['format']
    Out[55]: array(b'csr', dtype='|S3')
    In [56]: data['shape']
    Out[56]: array([4, 4])
    In [57]: data['data']
    Out[57]: array([0.30442216, 0.29783572, 0.83881939])
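
To make the connection between those five entries and the sparse matrix concrete, here is a small sketch that rebuilds a CSR matrix by hand from the arrays np.load returns. The matrix values are made up to resemble the example, and the key names are assumed to be the ones listed in the session above; in practice sparse.load_npz does this for you.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Small matrix with made-up values, saved the same way as in the session.
dense = np.array([[0.3, 0.0, 0.0, 0.0],
                  [0.2, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.8, 0.0],
                  [0.0, 0.0, 0.0, 0.0]])
M = sparse.csr_matrix(dense)
path = os.path.join(tempfile.mkdtemp(), "sparse.npz")
sparse.save_npz(path, M)

with np.load(path) as parts:
    # 'format' is stored as a 0-d array; unwrap it to a plain string.
    fmt = parts["format"].item()
    if isinstance(fmt, bytes):
        fmt = fmt.decode("ascii")
    if fmt != "csr":
        raise ValueError("this sketch only handles CSR archives")
    # The (data, indices, indptr) triple is the standard CSR constructor input.
    rebuilt = sparse.csr_matrix(
        (parts["data"], parts["indices"], parts["indptr"]),
        shape=tuple(int(n) for n in parts["shape"]),
    )

same = np.array_equal(rebuilt.toarray(), sparse.load_npz(path).toarray())
print(fmt, same)
```

None of the individual arrays is the matrix itself, which is why writing any one of them to a csv does not give a meaningful table.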

I can save the dense equivalent of this sparse matrix to a csv with:

    In [60]: np.savetxt('sparse.csv', M.toarray(), fmt='%10f',delimiter=',')
    In [61]: cat sparse.csv
      0.304422,  0.000000,  0.000000,  0.000000
      0.297836,  0.000000,  0.000000,  0.000000
      0.000000,  0.000000,  0.838819,  0.000000
      0.000000,  0.000000,  0.000000,  0.000000

For a small matrix like this that's no problem. But in machine learning the sparse matrix is often very large, and converting it to a dense array with M.toarray() can raise a MemoryError.
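
If a dense csv is really what is needed, one possible way around the MemoryError is to densify only a block of rows at a time. This is a sketch under that assumption; the function name write_dense_csv_in_chunks and its parameters are hypothetical, and the resulting file can still be very large on disk.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def write_dense_csv_in_chunks(M, path, rows_per_chunk=1000, fmt="%.6f"):
    """Write a sparse matrix as a dense csv, a block of rows at a time."""
    M = M.tocsr()  # CSR supports efficient row slicing
    with open(path, "w") as f:
        for start in range(0, M.shape[0], rows_per_chunk):
            # Only this block is ever held in memory as a dense array.
            block = M[start:start + rows_per_chunk].toarray()
            np.savetxt(f, block, fmt=fmt, delimiter=",")


# Demo on a tiny matrix with made-up values.
M = sparse.csr_matrix(np.array([[0.3, 0.0, 0.0, 0.0],
                                [0.2, 0.0, 0.0, 0.0],
                                [0.0, 0.0, 0.8, 0.0],
                                [0.0, 0.0, 0.0, 0.0]]))
csv_path = os.path.join(tempfile.mkdtemp(), "dense.csv")
write_dense_csv_in_chunks(M, csv_path, rows_per_chunk=3)
print(open(csv_path).read())
```

The chunk size trades memory for the number of savetxt calls; it only bounds peak memory and does nothing about the size of the output.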

I suppose one could try to write a 3-column csv with the row, col, and data attributes of a coo format matrix, the same sort of numbers we get with:

    In [62]: print(M)
      (0, 0)	0.3044221604204369
      (1, 0)	0.29783571660339536
      (2, 2)	0.8388193913095385
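
A minimal sketch of that 3-column idea follows. The column header and the file name are choices made for this example, not anything scipy prescribes, and the matrix shape has to be recorded separately because trailing empty rows or columns leave no trace in the triples.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Tiny matrix with made-up values; the last row is entirely zero.
M = sparse.csr_matrix(np.array([[0.3, 0.0, 0.0, 0.0],
                                [0.2, 0.0, 0.0, 0.0],
                                [0.0, 0.0, 0.8, 0.0],
                                [0.0, 0.0, 0.0, 0.0]]))

# COO format exposes one (row, col, value) triple per stored element.
coo = M.tocoo()
triples = np.column_stack((coo.row, coo.col, coo.data))

path = os.path.join(tempfile.mkdtemp(), "triples.csv")
np.savetxt(path, triples, fmt=["%d", "%d", "%.17g"], delimiter=",",
           header="row,col,value", comments="")

# Reading it back: the shape is supplied by hand because the csv lacks it.
loaded = np.loadtxt(path, delimiter=",", skiprows=1, ndmin=2)
rebuilt = sparse.coo_matrix(
    (loaded[:, 2], (loaded[:, 0].astype(int), loaded[:, 1].astype(int))),
    shape=M.shape,
)
print(np.allclose(rebuilt.toarray(), M.toarray()))
```

The file size grows with the number of stored elements rather than with rows times columns, so this avoids the dense conversion entirely.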

answered Jan 2 at 17:31 by hpaulj
edited Jan 2 at 17:57