Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a huge (10+ GB) .csv file on a Linux server. The lines look something like this:

6;20000327;20000425;990099,0;20000327;LL;UBXO;7;-1;62;F;30;001;NO;NO;wgB;0;99;0002;5530;001;708;196;1;AA;N;N;100;53,81;0;0;0;1;1;;1;
6;20000327;20000425;990099,0;20000425;LL;OLD*;62;62;92;F;30;001;NO;NO;ueB;0;99;0002;XXXX;001;;;1;AA;N;N;;;0;0;1;0;0;;30;

I am looking for a fast script that does the following:

  1. change any occurrence of <number>,<number> to <number>.<number>
  2. delete the last semicolon of each line

I am having particular trouble with the second one, because the script should work regardless of whether the file has Linux (LF) or Windows (CRLF) line endings.

I tried to do it with sed but have failed so far.

[edit]

I finally used a mix of Dennis Williams' and SiegeX's solutions:

sed 's/;\([0-9]*\),\([0-9]*\);/;\1.\2;/g;s/;\(\r\?\)$/\1/' inputfile

(the part with s/;[[:blank:]]*$// didn't work on my file...)
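
A likely reason, sketched here rather than confirmed against the original file: [[:blank:]] matches only space and tab, so on a line with a Windows ending the carriage return sits between the last semicolon and the end of the line and the pattern never matches. The sample lines below are made up, and the `\r` and `\?` escapes are GNU sed extensions, so this assumes GNU sed.

```shell
# Hypothetical one-line samples: one with a Windows (CRLF) ending, one with a Unix (LF) ending.
crlf_line() { printf '1;53,81;0;\r\n'; }
lf_line()   { printf '1;53,81;0;\n'; }

# tr turns the otherwise invisible carriage return into '#' so it shows up in the output.

# [[:blank:]] covers only space and tab, so on a CRLF line the ';' before '\r' survives.
blank_result=$(crlf_line | sed 's/;[[:blank:]]*$//' | tr '\r' '#')

# An optional '\r' is captured and put back, so only the ';' is dropped (GNU sed assumed).
cr_result=$(crlf_line | sed 's/;\(\r\?\)$/\1/' | tr '\r' '#')

# The same expression on an LF line: the group matches nothing and the ';' is still removed.
lf_result=$(lf_line | sed 's/;\(\r\?\)$/\1/' | tr '\r' '#')

printf '%s\n' "$blank_result" "$cr_result" "$lf_result"
```

With GNU sed this should print `1;53,81;0;#`, then `1;53,81;0#`, then `1;53,81;0`: the first expression leaves the semicolon in place on the CRLF line, while the second removes it for both kinds of line ending and keeps the original ending intact.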


1 Answer

sed 's/;\([0-9]*\),\([0-9]*\);/;\1.\2;/g;s/;[[:blank:]]*$//' ./infile
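
One caveat that may be worth checking on real data: the first expression consumes the semicolon on both sides of a match, so when two decimal fields sit next to each other the second one no longer has a leading semicolon to match against. The sketch below uses a made-up line to show this, together with a variant (not the expression above) that matches only digit, comma, digit. The variant assumes a comma between two digits never occurs in a field that should stay unchanged.

```shell
# Hypothetical sample line with two adjacent decimal fields (990099,0 and 53,81), LF ending.
sample_line() { printf '6;990099,0;53,81;;1;\n'; }

# Expression from the answer: the ';' after 990099,0 is consumed, so 53,81 is skipped.
anchored=$(sample_line | sed 's/;\([0-9]*\),\([0-9]*\);/;\1.\2;/g;s/;[[:blank:]]*$//')

# Variant: match digit,digit without the surrounding semicolons, so every decimal is converted.
unanchored=$(sample_line | sed 's/\([0-9]\),\([0-9]\)/\1.\2/g;s/;[[:blank:]]*$//')

printf '%s\n' "$anchored" "$unanchored"
```

This should print `6;990099.0;53,81;;1` for the first command and `6;990099.0;53.81;;1` for the variant. On a file of this size it may also be worth trying the same command prefixed with `LC_ALL=C`, which often speeds up sed in multibyte locales, though the gain depends on the system.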
