Pandas.read_csv issue


I am trying to read messages from a CSV dataset, but the values under the class label are not read the way they appear in the file.

import pandas

messages = pandas.read_csv('bitcoin_reddit.csv', delimiter='\t',
                           names=["title", "class"])
print(messages)

Under the class label, pandas reads only NaN.

The text version of my CSV file:

title,url,timestamp,class
"It's official! 1 Bitcoin = $10,000 USD",https://v.redd.it/e7io27rdgt001,29/11/2017 17:25,0
The last 3 months in 47 seconds.,https://v.redd.it/typ8fdslz3e01,4/2/2018 18:42,0
It's over 9000!!!,https://i.imgur.com/jyoZGyW.gifv,26/11/2017 20:55,1
Everyone who's trading BTC right now,http://cdn.mutually.com/wp-content/uploads/2017/06/08-19.jpg,7/1/2018 12:38,1
I hope James is doing well,https://i.redd.it/h4ngqma643101.jpg,1/12/2017 1:50,1
Weeeeeeee!,https://i.redd.it/iwl7vz69cea01.gif,17/1/2018 1:13,0
Bitcoin.. The King,https://i.redd.it/4tl0oustqed01.jpg,1/2/2018 5:46,1
Nothing can increase by that much and still be a good investment.,https://i.imgur.com/oWePY7q.jpg,14/12/2017 0:02,1
"This is why I want bitcoin to hit $10,000",https://i.redd.it/fhzsxgcv9nyz.jpg,18/11/2017 18:25,1
Bitcoin Doesn't Give a Fuck.,https://v.redd.it/ty2y74gawug01,18/2/2018 15:19,-1
Working Hard or Hardly Working?,https://i.redd.it/c2o6204tvc301.jpg,12/12/2017 12:49,1

edited Dec 28 '18 at 11:25

asked Dec 28 '18 at 11:08

Andy Hui

Better if you can provide a text version of your file, not Excel.
– meW, Dec 28 '18 at 11:12

Are you sure the delimiter is tab? You can try omitting the delimiter specification and letting Python auto-detect it.
– Mortz, Dec 28 '18 at 11:12

I have updated the question with a text version of my file.
– Andy Hui, Dec 28 '18 at 11:25

Your delimiter is a comma here. Also, it's sep=, not delimiter=.
– Michael O., Dec 28 '18 at 11:27

Looking at the Excel file, the delimiter is a comma. Also, if you only want to extract title and class, use usecols=[0,3].
– gyx-hh, Dec 28 '18 at 11:29

python pandas

1 Answer

The separator in your CSV file is a comma, not a tab. Since a comma is the default separator, there is no need to specify it.

However, names= defines custom names for the columns and, when no header= is given, makes pandas treat the file's first line as data rather than as a header. Your header already provides these names, so passing the column names you are interested in to usecols is all you need:

>>> pd.read_csv(file, usecols=['title', 'class'])
                                               title  class
0             It's official! 1 Bitcoin = $10,000 USD      0
1                   The last 3 months in 47 seconds.      0
2                                  It's over 9000!!!      1
3               Everyone who's trading BTC right now      1
4                         I hope James is doing well      1
5                                         Weeeeeeee!      0
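As a rough illustration of the points above, the sketch below first reproduces the NaN column and then applies the fix to a three-row excerpt of the question's data. It assumes io.StringIO as a stand-in for bitcoin_reddit.csv, and the names CSV_TEXT, broken, fixed and by_position are made up for this example. Note also that pandas documents delimiter as an alias for sep in read_csv, so either keyword should behave the same.

```python
import io

import pandas as pd

# Three rows copied from the question; io.StringIO stands in for the file on disk (assumption).
CSV_TEXT = '''title,url,timestamp,class
"It's official! 1 Bitcoin = $10,000 USD",https://v.redd.it/e7io27rdgt001,29/11/2017 17:25,0
The last 3 months in 47 seconds.,https://v.redd.it/typ8fdslz3e01,4/2/2018 18:42,0
It's over 9000!!!,https://i.imgur.com/jyoZGyW.gifv,26/11/2017 20:55,1
'''

# The question's call: there are no tabs in the file, so every line is a single field.
# That field lands in "title", and "class" has nothing left to hold, hence NaN.
# Because names= is given without header=, the header line is read as a data row too.
broken = pd.read_csv(io.StringIO(CSV_TEXT), delimiter="\t", names=["title", "class"])
print(broken["class"].isna().all())

# The fix: keep the default comma separator, let the header row name the columns,
# and select the two wanted columns with usecols.
fixed = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["title", "class"])
print(fixed)

# Selecting by position, as suggested in the comments, should give the same frame.
by_position = pd.read_csv(io.StringIO(CSV_TEXT), usecols=[0, 3])
```

The quoted first title contains a comma, which the default parser keeps inside one field, so no extra quoting options should be needed for this file.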

answered Dec 28 '18 at 12:01

jorijnsmit
