Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have a large data set in which one variable contains values in different formats:

Subject   Result
1           3
2           4
3          <4
4          <3
5          I need to go to school<>
6          I need to <> be there
7          2.3 need to be< there
8          <.3
9          .<9
10         ..<9
11         >3 need to go to school
12         <16.1
13         <5.0

I just want to keep the rows that contain "< number" or "> number" and drop the rows that contain free text (for example, I want to exclude ">3 need to go to school" and "I need to go to school<>"). The problem is that some records look like .<3, ..<9, >9., or >:9. So how can I remove ".", "..", and ":" from the data set and then keep the rows with the "< number" notation? How can I use the "grep" function? Again, I just want to keep the following rows:

Subject   Result
3          <4
4          <3
8          <.3
9          .<9
10         ..<9
12         <16.1
13         <5.0

1 Answer

You can simply chain two greps: one to keep the lines that contain "<" or ">", and one to drop the lines that contain letters:

grep "[><]" | grep -v "[A-Za-z]"

If you want to be pedantic, you can also add a third grep to keep only the lines that contain a digit:

grep "[><]" | grep -v "[A-Za-z]" | grep "[0-9]"
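To check the pipeline against the sample data from the question, here is a minimal sketch. The function names sample_data and filter_rows are made up for illustration, and the heredoc just stands in for your real file.

```shell
#!/bin/sh
# Print the sample table from the question (a stand-in for the real data file).
sample_data() {
cat <<'EOF'
Subject   Result
1           3
2           4
3          <4
4          <3
5          I need to go to school<>
6          I need to <> be there
7          2.3 need to be< there
8          <.3
9          .<9
10         ..<9
11         >3 need to go to school
12         <16.1
13         <5.0
EOF
}

# Keep lines that contain < or >, contain no letters, and contain a digit.
filter_rows() {
    grep "[><]" | grep -v "[A-Za-z]" | grep "[0-9]"
}

sample_data | filter_rows
```

This prints rows 3, 4, 8, 9, 10, 12 and 13. Note that with this layout the last grep always succeeds, because the Subject column is itself a number; it only adds a real check if you run it on the Result column alone.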

By the way, "grep -v" inverts the match: it prints only the lines that do not match the pattern.
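The greps only select rows; they leave values such as .<9 as they are. If you also want to strip the stray punctuation the question mentions, a sed pass is one way to sketch it. This assumes that stray "." or ":" characters show up only directly before the sign, ":" directly after it, or "." at the very end of the line, and that a "." right after the sign (as in <.3) is a decimal point to keep. clean_rows is a hypothetical helper name.

```shell
#!/bin/sh
# Strip stray punctuation around the comparison sign.
# Assumptions: "." or ":" before the sign is noise, ":" after the sign is noise,
# a trailing "." after a digit is noise, and "<.3" keeps its decimal point.
clean_rows() {
    sed -E \
        -e 's/[.:]+([<>])/\1/' \
        -e 's/([<>]):+/\1/' \
        -e 's/([0-9])\.[[:space:]]*$/\1/'
}

printf '%s\n' '8          <.3' '9          .<9' '10         ..<9' | clean_rows
```

With these three rows, the first one comes out unchanged and the other two become <9. You would run this after the grep filter, so that the free-text rows are already gone.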

