Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have very long integer sequences that look like this (arbitrary length!):

0000000001110002220033333

Now I need some algorithm to convert this string into something compressed like

a9b3a3c3a2d5

Which means "a 9 times, then b 3 times, then a 3 times" and so on, where "a" stands for 0, "b" for 1, "c" for 2 and "d" for 3.

How would you do that? So far nothing suitable came to my mind, and I had no luck with google because I didn't really know what to search for. What is this kind of encoding / compression called?

PS: I am going to do the encoding with PHP, and the decoding in JavaScript.

Edit: Thank you all!

I ended up with this function for encoding:

protected function numStringToRle($s){          
        $rle    = '';
        $count = 1;
        $len    = strlen($s);
        for($i = 0; $i < $len; $i++){
            if($i != $len && isset($s[$i+1]) && $s[$i] == $s[$i+1]){
                $count++;                
            } else {
                $rle .= chr($s[$i] + 97).( $count == 1 ? '' : $count);                                
                $count = 1;
            }
        }
        return $rle;            
}

And that for decoding:

var decodeCoords = function(str) {

   str = str.replace(/(.)(d+)/g, function(_, x, n) {
       return new Array(parseInt(n, 10) + 1).join(x);
   });

   return str.
     replace(/a/g, '0').
     replace(/b/g, '1').
     replace(/c/g, '2').
     replace(/d/g, '3');     
};
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
400 views
Welcome To Ask or Share your Answers For Others

1 Answer

It is called Run Length Encoding

Basic encoder in PHP:

function numStringToRle($s){
    $rle = '';
    $count = 1;
    $len = strlen($s);
    for ( $i = 0; $i < $len; $i++ ){
        if ( $i != $len && $s[$i] == $s[$i+1] ){
            $count++;                
        }else{
          $rle .= chr($s[$i] + 97).$count;    
          $count = 1;
        }
    }
    return $rle;
}

Be warned it will preform badly issues with a string like

 123456789123456789

If you were going to be handling a string that may have a lot of individual single characters you would be better to add some complexity and not write the length of the run if the length of the run is 1.

//change
$rle .= chr($s[$i] + 97).$count;    

//to
$rle .= chr($s[$i] + 97).( $count == 1 ? '' : $count );   

//or
$rle .= chr($s[$i] + 97)
if ( $count != 1 ){
    $rle .= $count;
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...