In C#, what is the fastest way to detect duplicate characters in a String and remove them (removal including 1st instance of the duplicated character)?
Example Input: nbHHkRvrXbvkn
Example Output: RrX
Fastest as in fewest-lines-of-code:
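// Requires: using System.Linq;
var s = "nbHHkRvrXbvkn";
// Every character that occurs more than once (O(n^2): Count rescans s for each character).
var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
// Except returns the distinct characters of s that are not in duplicates.
var result = new string(s.Except(duplicates).ToArray()); // = "RrX"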
Fastest as in best runtime performance would probably be something like this (a single pass over the string with two hash sets; the order of the output is not guaranteed):
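// Requires: using System.Collections.Generic;
var h1 = new HashSet<char>(); // every character seen so far
var h2 = new HashSet<char>(); // characters seen more than once
foreach (var ch in "nbHHkRvrXbvkn")
{
    // Add returns false when ch is already in h1, i.e. ch is a repeat.
    if (!h1.Add(ch))
    {
        h2.Add(ch);
    }
}
h1.ExceptWith(h2); // remove every character that occurred more than once
var chars = new char[h1.Count];
h1.CopyTo(chars);
var result = new string(chars); // = "RrX" (order not guaranteed)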
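If the original order matters, one possible variant is to count occurrences in a first pass and then filter in a second pass. This is only a sketch and was not part of the timings in this answer; the class name DuplicateRemover and the method name RemoveDuplicated are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DuplicateRemover
{
    // Hypothetical helper (name is an assumption, not from the original answer).
    public static string RemoveDuplicated(string input)
    {
        // First pass: count how often each character occurs.
        var counts = new Dictionary<char, int>();
        foreach (var ch in input)
        {
            int n;
            counts.TryGetValue(ch, out n); // n stays 0 when ch is not present yet
            counts[ch] = n + 1;
        }

        // Second pass: keep only characters that occur exactly once, in input order.
        var sb = new StringBuilder(input.Length);
        foreach (var ch in input)
        {
            if (counts[ch] == 1)
            {
                sb.Append(ch);
            }
        }
        return sb.ToString();
    }
}
```

Calling DuplicateRemover.RemoveDuplicated("nbHHkRvrXbvkn") returns "RrX". It is still linear in the length of the input, at the cost of walking the string twice.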
Performance test
When in doubt -- test it :)
Yuriy Faktorovich's answer   00:00:00.2360900
Luke's answer                00:00:00.2225683
My 'few lines' answer        00:00:00.5318395
My 'fast' answer             00:00:00.1842144
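For anyone who wants to repeat the measurement, a timing harness could look roughly like the sketch below. It is not the code that produced the numbers above: the Benchmark class, the Time helper and the iteration count are all assumptions, and only the 'few lines' approach is wired in as an example candidate.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class Benchmark
{
    // Hypothetical helper: runs candidate(input) `iterations` times and returns the elapsed time.
    public static TimeSpan Time(Func<string, string> candidate, string input, int iterations)
    {
        candidate(input); // warm-up call so JIT compilation is not part of the measurement
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            candidate(input);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    // The 'few lines' LINQ approach from this answer, wrapped as a function.
    public static string FewLines(string s)
    {
        var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
        return new string(s.Except(duplicates).ToArray());
    }

    public static void Main()
    {
        const int iterations = 100000; // assumed; the iteration count of the original run is not stated
        Console.WriteLine("Few lines: " + Time(FewLines, "nbHHkRvrXbvkn", iterations));
    }
}
```

Each other candidate can be wrapped in a method with the same string-to-string signature and passed to Time in the same way. Absolute numbers will differ by machine and runtime, so only the relative ordering is meaningful.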