How to replace letters in lines in fasta file using bash loops?
I want to change all `n` in the sequence into `-`, but I don't know how to make my bash script not change the `n` that show up in sequence names. I'm not experienced with sed or regex to make sure my bash script reads only the lines that do not start with `>`, as that indicates the header.
Example file:
>Name_with_nnn
nnnatgcnnnatttg
>Name2_with_nnn
atgggnnnnGGtnnn
At the same time I want to convert all lowercase letters into uppercase, only in the sequence lines. I don't even know how to begin using sed, I find it really tricky to understand.
Expected output:
>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---
So after I created my sequence files I tried to continue my script with:
while IFS= read -r line
do
    if [[ $line == ">"* ]]
    then
        echo "Ignoring header line: $line"
    else
        echo "Converting to uppercase and then N-to-gaps"
        # sed or tr?? do call $line or do I call $OUTFILE? so confused..
    fi
done
bash sed
edited Jan 3 at 18:49 by Benjamin W.
asked Jan 3 at 18:27 by DNAngel
4 Answers
You can resolve this with `sed` using the line below (GNU sed, since `\U` is a GNU extension):

sed -i "/^>/! {s/n/-/g; s/\(.*\)/\U\1/g}" text.txt

The address `/^>/!` restricts the commands to lines that do not start with `>`. On those lines every `n` is replaced with `-`, and the whole line is then converted to uppercase. Because of `-i`, the file is edited in place, and it would then contain:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---

answered Jan 3 at 18:38, edited Jan 3 at 18:59 by Cedric Zoppolo

@DNAngel be aware I updated the script as converting to uppercase was missing. – Cedric Zoppolo, Jan 3 at 18:55

@CedricZoppolo Not only uppercase, you updated "start with `>`" too, might be worth mentioning. – Tiw, Jan 3 at 19:05

@Tiw is correct. I also updated to ensure only lines not starting with `>` will be converted. I was missing the `^` character in order to get the lines starting with `>` and not lines containing such a character in any part of the line. – Cedric Zoppolo, Jan 3 at 19:12
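If you want to try the in-place edit without risking your data, here is a minimal sketch. It assumes GNU sed, and `demo.fasta` is just a hypothetical scratch file created in a temporary directory. The `-i.bak` form keeps the untouched original next to the edited file.

```shell
# Sketch only: assumes GNU sed (needed for \U); demo.fasta is a hypothetical scratch file.
workdir=$(mktemp -d)
fasta="$workdir/demo.fasta"

# Recreate the example input from the question.
printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' \
              '>Name2_with_nnn' 'atgggnnnnGGtnnn' > "$fasta"

# -i.bak edits the file in place and keeps the original as demo.fasta.bak.
sed -i.bak '/^>/! {s/n/-/g; s/\(.*\)/\U\1/g}' "$fasta"

# Show the edited file, then remove the scratch directory.
result=$(cat "$fasta")
printf '%s\n' "$result"
rm -rf "$workdir"
```

This should print the expected output from the question. Once you trust the result on your real files, you can delete the `.bak` copies.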
You may use this simple GNU sed command:

sed '/^>/!{s/n/-/g; s/.*/\U&/;}' file

Output:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---

answered Jan 3 at 18:35 by anubhava
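In case GNU sed is not available (`\U` is a GNU extension), a possible alternative is the POSIX `y` command, which maps characters one to one. This is a sketch of a different technique than the one in the answer above, and `to_gapped_upper` is a hypothetical helper name.

```shell
# Sketch: portable variant using the POSIX y command instead of GNU \U.
# to_gapped_upper is a hypothetical name; it reads the given files, or stdin if none.
to_gapped_upper() {
    sed '/^>/!{
s/n/-/g
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
}' "$@"
}

# Header lines pass through untouched; sequence lines are gapped and uppercased.
printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' | to_gapped_upper
```

The `s/n/-/g` has to run before `y`, otherwise the `n` characters would already be `N` and would no longer match.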
In pure Bash, likely quite slow for larger inputs:

while IFS= read -r line; do
    case $line in
        '>'*)
            printf '%s\n' "$line"
            ;;
        *)
            line=${line//n/-}
            printf '%s\n' "${line^^}"
            ;;
    esac
done < infile

This uses a `case` statement with pattern matching to test if a line starts with `>` or not; to modify the lines, parameter expansions are used. The `${parameter^^}` expansion requires Bash 4.0 or newer.

answered Jan 3 at 18:58 by Benjamin W.
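If you are stuck on a shell older than Bash 4.0, one possible fallback is to keep the same loop and hand each sequence line to `tr`, which is what the question was wondering about. This is only a sketch; `fasta_gaps_portable` is a hypothetical name, and starting two `tr` processes per line makes it even slower than the pure Bash version.

```shell
# Sketch: same loop shape, but using tr instead of Bash 4 parameter expansions.
# fasta_gaps_portable is a hypothetical helper; it reads stdin and writes stdout.
fasta_gaps_portable() {
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*)
                # Header line: print unchanged.
                printf '%s\n' "$line"
                ;;
            *)
                # Sequence line: n -> gap first, then lowercase -> uppercase.
                printf '%s\n' "$line" | tr 'n' '-' | tr '[:lower:]' '[:upper:]'
                ;;
        esac
    done
}

printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' | fasta_gaps_portable
```

For a real file you would call it as `fasta_gaps_portable < infile > outfile`, where both file names are placeholders.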
How about `awk`?

awk '/^[^>]/{gsub("n","-");print toupper($0);next;}1' data

Output:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---

However, `sed` can do it too (GNU sed):

sed -E '/^[^>]/{s/n/-/g;s/(.*)/\U\1/g;}' data

It's the same as:

sed -E '/^>/!{s/n/-/g;s/(.*)/\U\1/g;}' data

If you want to change the file in place, you can add the `-i` switch to `sed`.

answered Jan 3 at 18:29, edited Jan 3 at 18:52 by Tiw
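Unlike `sed -i`, standard awk has no in-place switch, so if you prefer the awk version you would typically write to a temporary file and copy it back. A minimal sketch, assuming `mktemp` is available; `fasta_gaps_inplace` is a hypothetical helper name.

```shell
# Sketch: in-place editing with awk via a temporary file.
# fasta_gaps_inplace is a hypothetical helper; $1 is the FASTA file to rewrite.
fasta_gaps_inplace() {
    tmp=$(mktemp) || return 1
    # Headers are printed unchanged; all other lines get n -> "-" and uppercase.
    if awk '/^>/ { print; next } { gsub(/n/, "-"); print toupper($0) }' "$1" > "$tmp"
    then
        # Copy the result back so the original file keeps its permissions.
        cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a scratch file holding part of the example from the question.
demo=$(mktemp)
printf '%s\n' '>Name2_with_nnn' 'atgggnnnnGGtnnn' > "$demo"
fasta_gaps_inplace "$demo"
cat "$demo"
rm -f "$demo"
```

The original file is only overwritten when awk exits successfully, so a failed run leaves your data as it was.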