Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm trying to remove punctuation from the column "text" using this code:

import string

import pandas as pd

texttweet = pd.read_csv("../input/pfizer-vaccine-tweets/vaccination_tweets.csv")

i = 0
punct = "\n\r"+string.punctuation

for tweet in texttweet['text']:
    texttweet['text'][i] = tweet.translate(str.maketrans('', '', punct))
    i += 1

texttweet

This gives me the results I need, but I also get this warning:

A value is trying to be set on a copy of a slice from a DataFrame

So is it OK to keep my code regardless of the message or should I change something?

Question from: https://stackoverflow.com/questions/65640789/how-to-remove-punctuation-from-one-column-of-a-dataframe

1 Answer

The best way to do this is to apply the translation to the whole column at once:

import string

import pandas as pd

texttweet = pd.read_csv("../input/pfizer-vaccine-tweets/vaccination_tweets.csv")
punct = "\n\r"+string.punctuation
texttweet['text'] = texttweet['text'].str.translate(str.maketrans('','',punct))
texttweet

For an explanation of the problem you were having, see the pandas documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

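Basically, texttweet['text'] is a "slice" of the DataFrame, and texttweet['text'][i] = ... takes that slice and then, in a second step, assigns something to it at position i. pandas cannot guarantee whether that slice is a view of the original data or a copy, so the assignment may not reach the original DataFrame. That is why you get the warning, and why it is safer to change the code even though the results look right here.

To avoid the warning you can use texttweet.loc[i, 'text'] = ... instead. This is different because the row and the column are selected in a single operation that is applied directly to the original DataFrame, not to a slice of it.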
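As a rough sketch of the .loc approach described above, the loop from the question could look like the following. The small in-memory DataFrame and the names df, table and cleaned are made up for illustration so the snippet runs without the CSV file; only ASCII punctuation from string.punctuation is stripped here.

```python
import string

import pandas as pd

# Hypothetical stand-in for the "text" column of the CSV in the question.
df = pd.DataFrame({"text": ["Hello, world!", "Dose #2 done :)"]})

# Translation table that deletes every ASCII punctuation character.
table = str.maketrans("", "", string.punctuation)

# .loc selects the row and the column in one indexing step on the
# original DataFrame, so no intermediate slice is assigned to.
for i in df.index:
    df.loc[i, "text"] = df.loc[i, "text"].translate(table)

cleaned = df["text"].tolist()
print(cleaned)
```

Iterating over df.index rather than keeping a manual counter also means the loop still works when the index is not 0, 1, 2, and so on. For a whole column, the vectorized .str.translate call from the answer is shorter and avoids the Python-level loop entirely.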

