How to find and replace certain keywords in a specific column of a data frame in R? [closed]
I want to find specific keywords in one column of a data frame and replace them with other keywords that already exist in that column. For example, replace technology (freq=2) with technologies (freq=3).
I need to do this without changing the other columns, and save the result in the same column of the same data frame. That way I would end up with 5 occurrences of "technologies".
However, I have no idea how to start doing this in RStudio, especially because the output has to remain a data frame. Can you please point me to where to begin?
r dataframe text-mining
closed as off-topic by phiver, MLavoie, greg-449, EdChum, Tomasz Mularczyk Jan 2 at 13:44
This question appears to be off-topic. The users who voted to close gave this specific reason:
- "Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers. See: How to create a Minimal, Complete, and Verifiable example." – MLavoie, greg-449, EdChum, Tomasz Mularczyk
If this question can be reworded to fit the rules in the help center, please edit the question.
Please tag your question with the language you're using, not the IDE.
– coldspeed
Jan 2 at 4:10
Welcome to StackOverflow! Please read the info about how to ask a good question and how to give a reproducible example. This will make it much easier for others to help you.
– Ronak Shah
Jan 2 at 4:18
edited Jan 2 at 9:26 by NelsonGon
asked Jan 2 at 3:17 by Arbo94
1 Answer
Say this is your data:
dat <- data.frame(C1 = c("Hi", "My", "Example", "Hi"),
                  C2 = c("This", "Is", "An", "Example"),
                  stringsAsFactors = FALSE)
You can use gsub to replace all occurrences of a value in one column like this:
dat$C1 <- gsub(pattern="Example", replacement="NEW", dat$C1)
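You can go through all columns like this (assigning into dat[] keeps the result a data frame, because lapply on its own returns a list):
dat[] <- lapply(dat, gsub, pattern="Hi", replacement="NEW")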
Does that do what you are after?
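For the case described in the question, a minimal sketch might look like the following. The column names `id` and `keyword` and the sample values are made up for illustration, since the question gives no data. It shows two options: an exact whole-cell match with logical indexing, and `gsub` with `\\b` word boundaries for cells that hold longer phrases.

```r
# Hypothetical data: the question provides none, so these values are assumptions.
df <- data.frame(id = 1:6,
                 keyword = c("technology", "technologies", "data",
                             "technologies", "technology", "technologies"),
                 stringsAsFactors = FALSE)

# Option 1: replace cells that are exactly "technology" (no regex involved).
exact <- df
exact$keyword[exact$keyword == "technology"] <- "technologies"

# Option 2: replace the whole word "technology" wherever it appears in a cell.
# \\b marks a word boundary, so a value such as "biotechnology" is left alone.
regex <- df
regex$keyword <- gsub("\\btechnology\\b", "technologies", regex$keyword)

# Count the keyword after replacement; the id column is untouched.
n_after <- sum(exact$keyword == "technologies")
```

Both options assign back into the same column, so the object stays a data frame and the other columns are not modified. Option 1 is the safer choice when every cell holds a single keyword.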
You could use the stringr package from the tidyverse and its "str_replace" or "str_replace_all" functions. They also look for a pattern and replace it with something you want. You may also want to look into regex, as that could save you some time when trying to find the correct pattern.
– Data Science
Jan 2 at 9:29
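As a rough illustration of the stringr functions mentioned in this comment, the sketch below reuses the answer's `dat`, with the last `C1` value changed to "Hi Hi" (an assumption made here) so that the difference between the two functions is visible. It assumes stringr is installed; the guard simply skips the example if it is not.

```r
# Assumes the stringr package is installed; the guard skips the example otherwise.
if (requireNamespace("stringr", quietly = TRUE)) {
  dat <- data.frame(C1 = c("Hi", "My", "Example", "Hi Hi"),
                    C2 = c("This", "Is", "An", "Example"),
                    stringsAsFactors = FALSE)

  # str_replace() changes only the first match in each string ...
  first_only <- stringr::str_replace(dat$C1, "Hi", "NEW")

  # ... while str_replace_all() changes every match, much like gsub().
  dat$C1 <- stringr::str_replace_all(dat$C1, "Hi", "NEW")
}
```

In both functions the string comes first, then the pattern, then the replacement, which is the reverse of the `gsub(pattern, replacement, x)` argument order.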
thanks! they look useful. regex kills me though, i find it so confusing :P
– RAB
Jan 2 at 10:49
good link for regex is this: hackerearth.com/practice/machine-learning/advanced-techniques/…
– Data Science
Jan 2 at 11:07
It is exactly what I was looking for! Thank you so much.
– Arbo94
Jan 2 at 11:37
An additional question: what if I have a phrase like " Hi randomword", where I don't know what the random word will be, and I want to cut it from the phrase but keep the "Hi"?
– Arbo94
Jan 2 at 11:42
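One possible way to handle this follow-up, sketched under the assumption that the known keyword always comes first in the phrase: capture the keyword with a group and replace the whole string with just that group. The vector `x` and its values are invented for illustration, and `perl = TRUE` is used here only so that `\\s` and `\\b` follow the familiar Perl-style rules.

```r
# Hypothetical values; only " Hi randomword" comes from the comment above.
x <- c(" Hi randomword", "Hi", "My", "Hi there friend", "History")

# ^\\s*   optional leading whitespace
# (Hi)    the keyword to keep, captured as group 1
# \\b     word boundary, so "History" does not count as "Hi"
# .*$     everything after the keyword, which is discarded
cleaned <- sub("^\\s*(Hi)\\b.*$", "\\1", x, perl = TRUE)
```

Strings that do not start with the keyword are returned unchanged, so the result can be assigned straight back to the data frame column.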
answered Jan 2 at 5:43 by RAB