Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Using regular expressions in C#, is there any way to find and remove duplicate words or symbols in a string containing a variety of words and symbols?

Ex.

Initial string of words:

"I like the environment. The environment is good."

Desired string:

"I like the environment. is good"

Duplicates removed: "the", "environment", "."

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
280 views
Welcome To Ask or Share your Answers For Others

1 Answer

As said by others, you need more than a regex to keep track of words:

var words = new HashSet<string>();
string text = "I like the environment. The environment is good.";
text = Regex.Replace(text, "\w+", m =>
                     words.Add(m.Value.ToUpperInvariant())
                         ? m.Value
                         : String.Empty);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...