Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have been using the following Perl code to extract text from multiple text files. It works fine.

Example of a couple of lines in one of the input files:

Fa0/19    CUTExyz     notconnect   129         half    100 10/100BaseTX
Fa0/22    xyz MLS     notconnect   1293        half     10 10/100BaseTX

What I need is to match the numbers in each line exactly (i.e. 129 is not matched by 1293) and print the corresponding lines.

It would also be nice to match a range of numbers while leaving specific numbers out, i.e. match 2 through 10, skip 11, then match 12 through 20.

#!/perl/bin/perl

use warnings;

my @files = <c:/perl64/files/*>;

foreach $file ( @files ) {

    open( FILE, "$file" );

    while ( $line = <FILE> ) {
        print "$file $line" if $line =~ /123/n;
    }

    close FILE;
}

Thank you for the suggestions, but can it be done using the code structure above?



1 Answer

I suggest that you take a look at perldoc perlre.

You need to anchor your regex pattern. The easiest way is probably using \b, which is a zero-width boundary between word characters (alphanumerics and underscore) and non-word characters.

#!/perl/bin/perl
use warnings;
use strict;

foreach my $file ( glob "c:/perl64/files/*" ) {
   open( my $input, '<', $file ) or die $!;
   while (<$input>) {
      print "$file $_" if m/\b123\b/;
   }
   close $input;
}

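Note - you should use three-argument open with lexical file handles as above. Two-argument open treats characters in the file name (such as a leading > or |) as part of the mode, and bareword handles like FILE are global.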

I've also removed the n pattern modifier, as it appears redundant: it only stops parentheses from capturing, and this pattern has none.
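To see what the \b anchors change, here is a minimal sketch run against the two sample lines from the question. The helper name lines_with_number is made up for illustration, and the lines are hard-coded rather than read from files.

```perl
use strict;
use warnings;

# Sample lines copied from the question.
my @lines = (
    "Fa0/19    CUTExyz     notconnect   129         half    100 10/100BaseTX",
    "Fa0/22    xyz MLS     notconnect   1293        half     10 10/100BaseTX",
);

# Hypothetical helper: keep the lines that contain $number as a whole word.
sub lines_with_number {
    my ( $number, @candidates ) = @_;
    return grep { /\b\Q$number\E\b/ } @candidates;
}

# Only the Fa0/19 line is kept: "1293" fails the trailing \b after "129".
print "$_\n" for lines_with_number( 129, @lines );
```

One caveat: \b does not tie the match to a column. Searching for 10 this way would also hit the "10" at the start of "10/100BaseTX", because "/" is a non-word character.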

Following your edit, which gives us some source data, I'd suggest the solution is not to use a regex: your source data looks space delimited (or maybe those are tabs?).

So I'd suggest you're better off using split and selecting the field you want, and testing it numerically, because you mention matching ranges. This is not a good fit for regexes because they don't understand the numeric content.

Instead:

while ( <$input> ) {
   print if (split)[-4] == 129;
}

Note - I use -4 in the slice, which indexes from the end of the list. This is because column 2 contains spaces in some rows (for example "xyz MLS"), so splitting on whitespace yields a different number of fields per line and counting from the front picks the wrong one. Using a negative index we get the right field each time.
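A small sketch of why the negative index matters, again using the two sample lines from the question. The helper name number_field is an assumption for illustration only.

```perl
use strict;
use warnings;

# Sample lines copied from the question.
my @lines = (
    "Fa0/19    CUTExyz     notconnect   129         half    100 10/100BaseTX",
    "Fa0/22    xyz MLS     notconnect   1293        half     10 10/100BaseTX",
);

# Hypothetical helper: return the fourth field counted from the end.
sub number_field {
    my ($line) = @_;
    my @fields = split ' ', $line;    # split on runs of whitespace
    return $fields[-4];
}

for my $line (@lines) {
    my @fields = split ' ', $line;
    # Index 3 from the front drifts when the name contains a space;
    # index -4 from the end does not.
    print "front: $fields[3]  end: ", number_field($line), "\n";
}
```

For the second line the front index lands on "notconnect", while the index from the end still returns 1293.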

If your data is tab separated then you could use chomp and split /\t/. Or potentially split on /\s{2,}/ to split on two or more whitespace characters.
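Here is a rough sketch of the /\s{2,}/ variant, assuming the columns really are padded with at least two spaces as they appear in the sample. The helper name columns is hypothetical.

```perl
use strict;
use warnings;

# Sample lines copied from the question.
my @lines = (
    "Fa0/19    CUTExyz     notconnect   129         half    100 10/100BaseTX",
    "Fa0/22    xyz MLS     notconnect   1293        half     10 10/100BaseTX",
);

# Hypothetical helper: split a line on runs of two or more whitespace characters.
sub columns {
    my ($line) = @_;
    chomp $line;
    return split /\s{2,}/, $line;
}

for my $line (@lines) {
    my @cols = columns($line);
    # A name such as "xyz MLS" stays in one piece, so index 3 is the number.
    print "name: $cols[1]  number: $cols[3]\n";
}
```

With this sample the last two columns are separated by a single space, so they come back as one field. That does not affect the number column, but it is worth checking against the real files.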

But by selecting the field, you can do numeric tests on it, like

my @fields = split;
print if $fields[-4] > 100 and $fields[-4] < 200;

etc.
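For the range in the question (2 through 10 and 12 through 20, but not 11), one possible sketch is a small predicate on the extracted field. The name wanted_number is made up for illustration, and the sample values are not from your files.

```perl
use strict;
use warnings;

# Hypothetical predicate: true for 2..10 and 12..20, false for 11 and anything else.
sub wanted_number {
    my ($n) = @_;
    return 0 unless defined $n && $n =~ /^\d+$/;    # ignore non-numeric fields
    return ( $n >= 2 && $n <= 10 ) || ( $n >= 12 && $n <= 20 ) ? 1 : 0;
}

my @samples = ( 1, 2, 10, 11, 12, 20, 21 );
print join( ' ', grep { wanted_number($_) } @samples ), "\n";    # prints "2 10 12 20"
```

Inside the existing while loop this would slot in as something like print "$file $_" if wanted_number( (split)[-4] ); so the overall code structure stays the same.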

