Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a string variable response:

where where where is it         
I'm going there
where where did you say
sometimes it is where you think
i think its where where you go
its everywhere where you are
i am planning on going where where where i want to

As you can see, the word "where" is often repeated. I want to collapse any run of repeated words, such as "where where", "where where where", or even "where where where where", into a single "where".

However, I don't want to replace "everywhere where" with "where".

I know I can do this manually, but I was hoping to condense the code into as few lines as possible.

This is what I have been trying so far:

gen temp = regexr(response, " (where)+ where ", " where ") 
replace temp = regexr(response, "^(where)+ where ", "where ")

These are my results after running the code above:

where where is it  
I'm going there
where did you say
sometimes it is where you think
i think its where where you go
its everywhere where you are
i am planning on going where where where i want to

Instead, I want the final data to look like this:

where is it         
I'm going there
where did you say
sometimes it is where you think
i think its where you go
its everywhere where you are
i am planning on going where i want to

I have been using "(where)+" to capture both "where where" and "where where where", but it doesn't seem to work. I also split the code into two commands, one beginning with "^(where)" and the other with " (where)", to avoid capturing the "where" in "everywhere". Even so, the code does not seem to capture "where where" when it occurs in the middle of a sentence.


1 Answer

A quick fix using Stata's string functions is shown below. It assumes that every "where" in a row belongs to one repeated run, which holds for the sample data:

clear

input str50 string1
"where where where is it"        
"I'm going there"
"where where did you say"
"sometimes it is where you think"
"i think its where where you go"
"its everywhere where you are"
"i am planning on going where where where i want to"
end

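* tag1 = 1 unless the row contains "everywhere where"
generate tag1 = !strmatch(string1, "*everywhere where*")

* tag2 = number of occurrences of "where" (characters removed / 5)
generate tag2 = ( length(string1) - length(subinstr(string1, "where", "", .)) ) / 5

* in tagged rows, delete the first tag2-1 occurrences and collapse the leftover blanks
generate string2 = cond(tag1 == 1, stritrim(subinstr(string1, "where", "", tag2-1)), string1)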


list string2, separator(0)

     +----------------------------------------+
     |                                string2 |
     |----------------------------------------|
  1. |                            where is it |
  2. |                        I'm going there |
  3. |                      where did you say |
  4. |        sometimes it is where you think |
  5. |               i think its where you go |
  6. |           its everywhere where you are |
  7. | i am planning on going where i want to |
     +----------------------------------------+
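
For comparison, here is a regex-based sketch. It assumes Stata 14 or later, where the Unicode function ustrregexra() is available. The variable name string3 is made up for this example. Two things appear to hold the original attempt back. First, "(where)+" only matches copies of "where" with no blank between them. Second, the second command reads from response again, so it overwrites whatever the first command changed. ustrregexra() replaces every match rather than only the first. The word boundary \b should keep the "where" inside "everywhere" from starting a match.

```stata
clear

input str60 string1
"where where where is it"
"I'm going there"
"where where did you say"
"sometimes it is where you think"
"i think its where where you go"
"its everywhere where you are"
"i am planning on going where where where i want to"
end

* \bwhere       : a standalone "where" (not the tail of "everywhere")
* (\s+where)+\b : followed by one or more further whole-word copies
* the whole run is replaced by a single "where"
generate string3 = ustrregexra(string1, "\bwhere(\s+where)+\b", "where")

list string3, separator(0)
```

Unlike counting occurrences, this approach does not rely on all the "where"s in a row being adjacent, and a row such as "its everywhere where you are" is left alone because no standalone "where" there is followed by another one.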
