How to replace letters in lines in fasta file using bash loops?

I want to change every n in the sequences into -, but I don't know how to keep my bash script from also changing the n characters that appear in sequence names. I'm not experienced enough with sed or regex to make the script process only the lines that do not start with >, which marks a header line.

Example file:

>Name_with_nnn
nnnatgcnnnatttg
>Name2_with_nnn
atgggnnnnGGtnnn

At the same time, I want to convert all lowercase letters to uppercase, but only in the sequence lines. I don't even know how to begin using sed; I find it really tricky to understand.

Expected output:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---

So after I created my sequence files I tried to continue my script with:

while IFS= read -r line
do
     if [[ $line == ">"* ]]
     then
          echo "Ignoring header line: $line"
     else
          echo "Converting to uppercase and then N-to-gaps"
          # sed or tr?? do call $line or do I call $OUTFILE? so confused..
     fi
done

asked Jan 3 at 18:27 by DNAngel; edited Jan 3 at 18:49 by Benjamin W.


bash sed


4 Answers

You can solve this with the following sed command (GNU sed, because of \U):

sed -i '/^>/! {s/n/-/g; s/\(.*\)/\U\1/g}' text.txt

Because of -i the file is edited in place, so afterwards text.txt would contain:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---
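Since -i overwrites the file, it can help to preview the result first and to keep a backup. The following is only a sketch under assumptions: it assumes GNU sed, and the file name demo.fasta, the temporary directory, and the .bak suffix are illustrative choices that do not come from the answer.

```shell
# Hypothetical demo file built from the question's example data.
workdir=$(mktemp -d)
demo="$workdir/demo.fasta"
printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' \
              '>Name2_with_nnn' 'atgggnnnnGGtnnn' > "$demo"

# Without -i the result only goes to standard output; the file is untouched.
sed '/^>/! {s/n/-/g; s/\(.*\)/\U\1/g}' "$demo"

# With -i.bak the file is rewritten and the original is kept as demo.fasta.bak.
sed -i.bak '/^>/! {s/n/-/g; s/\(.*\)/\U\1/g}' "$demo"
```

If the preview looks wrong, nothing has been changed yet; if the in-place edit looks wrong, the .bak copy can be moved back.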

answered Jan 3 at 18:38 by Cedric Zoppolo (edited Jan 3 at 18:59)

@DNAngel Be aware that I updated the command because the uppercase conversion was missing. – Cedric Zoppolo, Jan 3 at 18:55

@CedricZoppolo Not only the uppercase conversion: you also changed the address to match lines that start with >, which might be worth mentioning. – Tiw, Jan 3 at 19:05

@Tiw is correct. I also updated the command so that only lines not starting with > are converted. I was missing the ^ anchor, which is needed to match lines that start with > rather than lines that contain > anywhere. – Cedric Zoppolo, Jan 3 at 19:12

You may use this simple GNU sed command:

sed '/^>/!{s/n/-/g; s/.*/\U&/;}' file

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---
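The \U case conversion is a GNU extension. Where GNU sed is not available, a rough equivalent can be sketched with the POSIX y command, which maps characters one to one. The function name mask_fasta_portable is hypothetical, and matching [nN] is an assumption that uppercase N should become a gap too; the question itself only asks for lowercase n.

```shell
# Hypothetical helper: reads FASTA from the named files (or stdin).
mask_fasta_portable() {
    sed '/^>/!{
s/[nN]/-/g
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
}' "$@"
}

# Run it on the question's example data.
result=$(printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' \
                       '>Name2_with_nnn' 'atgggnnnnGGtnnn' | mask_fasta_portable)
printf '%s\n' "$result"
```

The substitution runs before the case mapping, but with [nN] the order no longer matters for the result.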

answered Jan 3 at 18:35 by anubhava

In pure Bash, likely quite slow for larger inputs:

while IFS= read -r line; do
    case $line in
        '>'*)
            printf '%s\n' "$line"
            ;;
        *)
            line=${line//n/-}
            printf '%s\n' "${line^^}"
            ;;
    esac
done < infile

This uses a case statement with pattern matching to test if a line starts with > or not; to modify the lines, parameter expansions are used. The ${parameter^^} expansion requires Bash 4.0 or newer.
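For shells without the ${parameter^^} expansion (Bash older than 4.0, or plain sh), the same loop can be sketched with tr doing both conversions. This is an illustration rather than part of the answer: the function name convert_fasta is hypothetical, and starting two tr processes per sequence line makes it slower still than the pure Bash version.

```shell
# Hypothetical helper: reads FASTA on stdin, writes the converted FASTA to stdout.
convert_fasta() {
    # The "|| [ -n "$line" ]" part also handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*)   # header line: print unchanged
                printf '%s\n' "$line"
                ;;
            *)      # sequence line: n -> -, then lowercase -> uppercase
                printf '%s\n' "$line" | tr 'n' '-' | tr '[:lower:]' '[:upper:]'
                ;;
        esac
    done
}

# Example data from the question, deliberately without a final newline.
demo_output=$(printf '%s\n%s\n%s\n%s' '>Name_with_nnn' 'nnnatgcnnnatttg' \
                                      '>Name2_with_nnn' 'atgggnnnnGGtnnn' | convert_fasta)
printf '%s\n' "$demo_output"
```

To work on files instead, redirect as in the loop above, for example convert_fasta < infile > outfile, where both file names are placeholders.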

answered Jan 3 at 18:58 by Benjamin W.

How about awk?

awk '/^[^>]/{gsub("n","-");print toupper($0);next;}1' data

Output:

>Name_with_nnn
---ATGC---ATTTG
>Name2_with_nnn
ATGGG----GGT---

However, sed can do it too (GNU sed):

sed -E '/^[^>]/{s/n/-/g;s/(.*)/\U\1/g;}' data

For this purpose it is the same as the following (the two differ only on empty lines, where there is nothing to convert):

sed -E '/^>/!{s/n/-/g;s/(.*)/\U\1/g;}' data

If you want to change the file in place, add the -i switch to sed.
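Standard awk has no counterpart to sed's -i, so the usual workaround is to write to a temporary file and then replace the original. A minimal sketch, assuming mktemp is available; the names workdir, data.fasta and tmp.XXXXXX are illustrative and not from the answer.

```shell
# Hypothetical demo file built from the question's example data.
workdir=$(mktemp -d)
data="$workdir/data.fasta"
printf '%s\n' '>Name_with_nnn' 'nnnatgcnnnatttg' \
              '>Name2_with_nnn' 'atgggnnnnGGtnnn' > "$data"

# Write the converted output to a temporary file in the same directory,
# and replace the original only if awk succeeded.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk '/^[^>]/ { gsub("n", "-"); print toupper($0); next } 1' "$data" > "$tmp" &&
    mv "$tmp" "$data"

cat "$data"
```

Note that after mv the file carries the temporary file's permissions. GNU awk 4.1 and newer also provides an -i inplace extension, which avoids the temporary file where gawk can be assumed.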

answered Jan 3 at 18:29 by Tiw (edited Jan 3 at 18:52)

