Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm fairly new in linux world and i need your help. i need a code to search for specific characters in spcific positions in a text file. i.e

The file sequences.txt looks like this:

ACGTCAGTCAG**T**CAGCATC**G**ATCGACTACGACCGTAGCTAGCTATACGACT**G**ATCAGCTACGATCAGCTACGATCAGCTACGAT
ACGTCAGTCAG**A**CAGCATC**C**ATCGACCATGCTAGCCGTACGATTAGCGACT**C**ATCAGCTACGATCAGCTACGATCAGCTACGAT
ACGTCAGTCAG**T**CAGCATCATCGACTACGACTACGATCGATCGATCGGACT**G**ATCAGCTACGATCAGCTACGATCAGCTACGATG
ACGTCAGTCAG**A**CAGCATC**G**ATCGACTACGACGATCGATCGATCTACGACT**C**ATCAGCTACGATCAGCTACGATCAGCTACGAT

What i want is to split the dataset in different output files grouping the equal lines containing the same specific charactrs.

hope someone can help me, all the best

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
231 views
Welcome To Ask or Share your Answers For Others

1 Answer

To search for "foo" at position 42:

egrep '^.{42}foo'

You can run a command like this multiple times on your input:

egrep '^.{42}foo' inputfile.txt > lineswithfoo.txt
egrep '^.{42}bar' inputfile.txt > lineswithbar.txt
...

or as a loop:

for pattern in foo bar qux; do
  egrep "^.{42}$pattern" inputfile.txt > lineswith$pattern.txt
done

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...