Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
I created a file that logs the training of a random forest classifier and logistic regression. It has the following text:

Creating logistic regression model...
Done.
Creating random forest classifier model...
building tree 1 of 27
building tree 2 of 27
building tree 3 of 27
building tree 4 of 27
building tree 5 of 27
building tree 6 of 27
building tree 7 of 27
building tree 8 of 27
building tree 9 of 27
building tree 10 of 27
building tree 11 of 27
building tree 12 of 27
building tree 13 of 27
building tree 14 of 27
building tree 15 of 27
building tree 16 of 27
building tree 17 of 27
building tree 18 of 27
building tree 19 of 27
building tree 20 of 27
building tree 21 of 27
building tree 22 of 27
building tree 23 of 27
building tree 24 of 27
building tree 25 of 27
building tree 26 of 27
building tree 27 of 27
Train scores:
    Logistic Regression Recall: 0.6892336879192357
    Random Forest Recall: 0.5848905752422251
Test scores:
    Logistic Regression Recall: 0.6746186562629912
    Random Forest Recall: 0.5647724728982124

I'd like to extract just the score lines. I've tried sed -n '/[Train|Test|Recall]/p' scores (scores being the file name), but for some reason, even though -n is supposed to suppress all printing but the pattern-matched lines, it still prints the full text of the file.

When I ran cat scores | grep "[Train|Test|Recall]" -, the pattern-matching highlighter checked off the letters from each of those lines that seemed to match [Train|Test|Recall], rather than the actual words: For example, Creating logistic regression model... had _creatin_ l__istic re_ressi_n ___el... highlighted. The problem persists even if I add boundaries: cat scores | grep "[Train|Test|Recall]" -.

My understanding of grep is that it should be matching the full text of each of those words; the pipe in between each word should identify each word as its own pattern to be checked. How do I need to write this regex, and how can I specify whatever arguments I'd need in sed?

question from:https://stackoverflow.com/questions/65713904/sed-regex-recognizing-individual-letters-instead-of-words

1 Answer

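Square brackets [] contain a list of possible matching characters and match exactly one character from that list, so you often see examples like gr[ae]y to match both gray and grey. Your pattern [Train|Test|Recall] therefore matches any line that contains at least one of the characters T, r, a, i, n, |, e, s, t, R, c or l. Every line in your file contains one of those, which is why sed -n still printed the whole file.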

For your usage, drop the brackets and write the alternation as Train|Test|Recall, or group it with parentheses as (Train|Test|Recall).
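To make the difference concrete, here is a minimal sketch that counts matching lines for both patterns. The variable names and the four sample lines (copied from the log in the question) are only illustrative.

```shell
# Four sample lines taken from the log in the question (illustrative subset).
sample='Creating logistic regression model...
building tree 1 of 27
Train scores:
    Random Forest Recall: 0.5848905752422251'

# Bracket expression: a line matches if it contains ANY ONE of the characters
# T, r, a, i, n, |, e, s, t, R, c, l, so all four sample lines match.
class_count=$(printf '%s\n' "$sample" | grep -c '[Train|Test|Recall]')

# Alternation in extended regex mode: a line matches only if it contains
# one of the whole strings Train, Test or Recall.
alt_count=$(printf '%s\n' "$sample" | grep -E -c 'Train|Test|Recall')

echo "bracket expression matched $class_count lines"
echo "alternation matched $alt_count lines"
```

With this sample, the bracket expression matches all four lines, while the alternation matches only the two score lines.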

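For grep in basic (default) regex mode, the parentheses and the pipe have to be escaped to get their special meaning (\| is a GNU extension), so your command becomes

cat scores | grep "\(Train\|Test\|Recall\)"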

Or in extended regex mode it becomes

cat scores | grep -E "(Train|Test|Recall)"
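The choice of mode matters because basic regular expressions treat an unescaped (, | and ) as literal characters. The sketch below compares the modes on two sample lines; the variable names are hypothetical, and the escaped form relies on GNU grep accepting \| in basic mode, which is an assumption about your environment.

```shell
# Two sample lines from the log in the question (illustrative subset).
sample='building tree 1 of 27
Train scores:'

# Basic mode, unescaped: looks for the literal text "(Train|Test|Recall)",
# which appears nowhere. grep -c prints 0 and exits non-zero, hence "|| true".
literal_count=$(printf '%s\n' "$sample" | grep -c '(Train|Test|Recall)' || true)

# Extended mode: ( | ) are operators, so the "Train scores:" line matches.
ere_count=$(printf '%s\n' "$sample" | grep -E -c '(Train|Test|Recall)')

# Basic mode with escapes (assumes GNU grep, where \| is supported).
gnu_count=$(printf '%s\n' "$sample" | grep -c '\(Train\|Test\|Recall\)' || true)

echo "basic unescaped: $literal_count, extended: $ere_count, basic escaped: $gnu_count"
```

If portability matters, grep -E is the safer choice, since alternation is part of POSIX extended regular expressions but not of basic ones.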

Or in sed:

cat scores | sed -E -n "/(Train|Test|Recall)/p"
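Both tools can also read the file directly, so the cat is optional. The following sketch recreates a shortened copy of the scores file in a temporary directory (the temporary path and variable names are assumptions for illustration) and runs the same pattern through grep and sed.

```shell
# Recreate a shortened copy of the "scores" file from the question.
tmpdir=$(mktemp -d)
scores="$tmpdir/scores"
cat > "$scores" <<'EOF'
Creating logistic regression model...
Done.
Creating random forest classifier model...
building tree 1 of 27
Train scores:
    Logistic Regression Recall: 0.6892336879192357
    Random Forest Recall: 0.5848905752422251
Test scores:
    Logistic Regression Recall: 0.6746186562629912
    Random Forest Recall: 0.5647724728982124
EOF

# Pass the file name as an argument instead of piping through cat.
grep_out=$(grep -E 'Train|Test|Recall' "$scores")
sed_out=$(sed -E -n '/Train|Test|Recall/p' "$scores")

# Print the extracted score lines, then clean up the temporary copy.
printf '%s\n' "$sed_out"
rm -r "$tmpdir"
```

Both commands select the same six lines here: the two headings and the four Recall lines.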
