Remove elements from a list of strings if they contain a stopword [duplicate]

This question already has an answer here:
How to remove items from a list that contains words found in items in another list [duplicate]
4 answers
I have a list as below:
lst = ['for Sam', 'Just in', 'Mark Rich']
I am trying to remove every element from a list of strings (each string contains one or more words) that contains a stopword.
As the 1st and 2nd elements in the list contain 'for' and 'in', which are stopwords, it should return
new_lst = ['Mark Rich']
What I tried
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
lst = ['for Sam', 'Just in', 'Mark Rich']
new_lst = [i.split(" ") for i in lst]
new_lst = [" ".join(i) for i in new_lst for j in i if j not in stop_words]
Which is giving me output as:
['for Sam', 'Just in', 'Mark Rich', 'Mark Rich']
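To see where the extra 'Mark Rich' comes from, it may help to unroll the nested comprehension into plain loops. This is only a sketch: the small stop_words set below is a hypothetical stand-in for the NLTK list, so the snippet runs without the corpus. Membership is case-sensitive, which is why 'Just' is not treated as a stopword here.

```python
# Hypothetical stand-in for set(stopwords.words('english')).
stop_words = {'for', 'in', 'just'}

lst = ['for Sam', 'Just in', 'Mark Rich']
tokenized = [s.split(" ") for s in lst]

# Equivalent to:
# [" ".join(i) for i in tokenized for j in i if j not in stop_words]
unrolled = []
for words in tokenized:
    for word in words:
        if word not in stop_words:
            # One copy of the whole string is emitted per non-stopword token.
            unrolled.append(" ".join(words))

print(unrolled)
# ['for Sam', 'Just in', 'Mark Rich', 'Mark Rich']
```

So the filter is applied per word rather than per string: a string survives once for each of its words that is not a stopword.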
python python-3.x nltk
Marked as duplicate by Kasrâmvd, Jan 2 at 11:35
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
asked Jan 2 at 11:23 by AkshayNevrekar
2 Answers
You need an if statement rather than extra nesting:
new_lst = [' '.join(i) for i in new_lst if not any(j in i for j in stop_words)]
If you wish to utilize set, you can use set.isdisjoint:
new_lst = [' '.join(i) for i in new_lst if stop_words.isdisjoint(i)]
Here's a demonstration:
stop_words = {'for', 'in'}
lst = ['for Sam', 'Just in', 'Mark Rich']
new_lst = [i.split() for i in lst]
new_lst = [' '.join(i) for i in new_lst if stop_words.isdisjoint(i)]
print(new_lst)
# ['Mark Rich']
edited Jan 2 at 11:34, answered Jan 2 at 11:27 by jpp
Your 1st answer worked like a charm but the 2nd one is giving an empty list. – AkshayNevrekar, Jan 2 at 11:33
@Sociopath, Nope, works fine, see my example. – jpp, Jan 2 at 11:34
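The thread does not say why the second snippet returned an empty list for the asker. One possible cause, offered here only as a guess, is running it on a list whose elements had already been joined back into strings (for example, right after running the first snippet). set.isdisjoint accepts any iterable, and iterating over a string yields single characters, so one-letter stopwords such as 'a' and 'i' would match inside almost any string. The stop_words set below is a small hypothetical stand-in for the NLTK list.

```python
# Hypothetical stand-in for the NLTK list; note the one-letter entries.
stop_words = {'for', 'in', 'a', 'i'}

lst = ['for Sam', 'Just in', 'Mark Rich']

# Intended input: each element is a list of words.
tokenized = [s.split() for s in lst]
good = [' '.join(words) for words in tokenized if stop_words.isdisjoint(words)]
print(good)  # ['Mark Rich']

# Suspected mistake: the elements are already strings, so isdisjoint
# compares the stopwords against individual characters.
already_joined = ['Mark Rich']
bad = [' '.join(s) for s in already_joined if stop_words.isdisjoint(s)]
print(bad)  # []
```

If that is what happened, re-creating new_lst with the split() step before applying the filter should give the expected result.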
You can use a list comprehension and use sets to check if any words within the two lists intersect:
[i for i in lst if not set(stop_words) & set(i.split(' '))]
['Mark Rich']
edited Jan 2 at 11:35, answered Jan 2 at 11:30 by yatu
Thanks. Worked like a charm. Only one thing: in your answer you misplaced a ]. – AkshayNevrekar, Jan 2 at 11:34
Note set.intersection has higher complexity vs set.isdisjoint. It's not necessary to calculate the exact intersection of the 2 sets to know if the intersection is empty. – jpp, Jan 2 at 11:35
Yes, I actually thought about .isdisjoint right when I saw your answer. Thanks for clarifying. – yatu, Jan 2 at 11:37
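The point raised in these comments can be sketched as follows. The & operator builds a new set holding the common words, whereas isdisjoint only answers yes or no and takes any iterable, so the per-string set(...) conversion can be dropped. The tiny stop_words set is a hypothetical stand-in for the NLTK list.

```python
# Hypothetical stand-in for set(stopwords.words('english')).
stop_words = {'for', 'in'}
lst = ['for Sam', 'Just in', 'Mark Rich']

# Intersection: constructs an intermediate set for every string.
via_intersection = [s for s in lst if not stop_words & set(s.split(' '))]

# isdisjoint: returns a bool and accepts the list from split() directly.
via_isdisjoint = [s for s in lst if stop_words.isdisjoint(s.split(' '))]

print(via_intersection)  # ['Mark Rich']
print(via_isdisjoint)    # ['Mark Rich']
```

Both variants filter the original strings directly, so no join step is needed afterwards.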