Remove elements from a list of strings if they contain a stopword [duplicate]
This question already has an answer here:
How to remove items from a list that contains words found in items in another list [duplicate]
4 answers
I have a list as below:
lst = ['for Sam', 'Just in', 'Mark Rich']
I am trying to remove every element that contains a stopword from a list of strings (each string contains one or more words).
Since the 1st and 2nd elements in the list contain 'for' and 'in', which are stopwords, the result should be:
new_lst = ['Mark Rich']
What I tried
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
lst = ['for Sam', 'Just in', 'Mark Rich']
new_lst = [i.split(" ") for i in lst]
new_lst = [" ".join(i) for i in new_lst for j in i if j not in stop_words]
This gives me the following output:
['for Sam', 'Just in', 'Mark Rich', 'Mark Rich']
python python-3.x nltk
marked as duplicate by Kasrâmvd
Jan 2 at 11:35
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
asked Jan 2 at 11:23 by AkshayNevrekar
2 Answers
You need an if condition in the comprehension rather than an extra nested loop:
new_lst = [' '.join(i) for i in new_lst if not any(j in i for j in stop_words)]
If you wish to utilize set, you can use set.isdisjoint:
new_lst = [' '.join(i) for i in new_lst if stop_words.isdisjoint(i)]
Here's a demonstration:
stop_words = {'for', 'in'}
lst = ['for Sam', 'Just in', 'Mark Rich']
new_lst = [i.split() for i in lst]
new_lst = [' '.join(i) for i in new_lst if stop_words.isdisjoint(i)]
print(new_lst)
# ['Mark Rich']
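To see why the attempt in the question produces duplicates while the filtered version does not, here is a minimal side-by-side sketch. It assumes a small hand-written stop_words set in place of NLTK's list, so it runs without downloading the corpus; the names tokens, nested and filtered are only illustrative.

```python
# Assumption: a tiny hand-written stopword set stands in for
# set(stopwords.words('english')) so the example runs without NLTK data.
stop_words = {'for', 'in'}
lst = ['for Sam', 'Just in', 'Mark Rich']
tokens = [s.split() for s in lst]

# Nested form from the question: emits one copy of the string for every
# word that is NOT a stopword, so no string is actually filtered out.
nested = [' '.join(words) for words in tokens for w in words if w not in stop_words]

# Filtered form: one test per string, kept only if none of its words is a stopword.
filtered = [' '.join(words) for words in tokens if stop_words.isdisjoint(words)]

print(nested)
print(filtered)
```

The nested loop runs the join once per non-stopword, which is why 'Mark Rich' shows up twice and the first two strings still appear once each.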
Your 1st snippet worked like a charm, but the 2nd one is giving an empty list.
– AkshayNevrekar
Jan 2 at 11:33
@Sociopath, Nope, works fine, see my example.
– jpp
Jan 2 at 11:34
You can use a list comprehension with sets to check whether the words in each string intersect with the stopwords:
[i for i in lst if not set(stop_words) & set(i.split(' '))]
['Mark Rich']
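The same intersection test can be wrapped in a small function. The sketch below is only an illustration: drop_strings_with_stopwords is a hypothetical helper, not part of NLTK, and it lowercases both sides on the assumption that the stopword list is lowercase, so that a capitalised word such as 'In' is still matched.

```python
def drop_strings_with_stopwords(strings, stop_words):
    """Return the strings that contain no stopword, ignoring case.

    Hypothetical helper for illustration; not part of NLTK.
    """
    stop_set = {w.lower() for w in stop_words}
    kept = []
    for s in strings:
        words = {w.lower() for w in s.split()}
        # A non-empty intersection means at least one word is a stopword.
        if not (stop_set & words):
            kept.append(s)
    return kept


# Assumption: a short hand-written stopword list instead of NLTK's.
result = drop_strings_with_stopwords(
    ['for Sam', 'Just in', 'Mark Rich', 'In Time'], ['for', 'in']
)
print(result)
```

Without the lowercasing step, 'In Time' would be kept, because the comparison is otherwise case-sensitive.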
Thanks. Worked like a charm. Only one thing: in your answer you misplaced a ]
– AkshayNevrekar
Jan 2 at 11:34
Note that set.intersection has higher complexity than set.isdisjoint. It's not necessary to calculate the exact intersection of the 2 sets to know whether the intersection is empty.
– jpp
Jan 2 at 11:35
Yes, I actually thought about .isdisjoint right when I saw your answer. Thanks for clarifying.
– yatu
Jan 2 at 11:37
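The point made in the comments above can be checked with a short sketch. It again assumes a hand-written stop_words set rather than NLTK's list, and the variable names are illustrative. Both forms keep the same strings; the difference is that the intersection builds a new set of shared words first, while isdisjoint only answers yes or no (in CPython it can return as soon as one shared word is found).

```python
# Assumption: a small hand-written stopword set instead of NLTK's list.
stop_words = {'for', 'in'}
lst = ['for Sam', 'Just in', 'Mark Rich']

# Intersection: builds a set of the shared words, then tests whether it is empty.
via_intersection = [s for s in lst if not stop_words & set(s.split())]

# isdisjoint: accepts any iterable, so the split list can be passed
# directly without converting it to a set first.
via_isdisjoint = [s for s in lst if stop_words.isdisjoint(s.split())]

print(via_intersection == via_isdisjoint)
```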
First answer (set.isdisjoint): answered Jan 2 at 11:27 by jpp, edited Jan 2 at 11:34
Second answer (set intersection): answered Jan 2 at 11:30 by yatu, edited Jan 2 at 11:35