sentiment analysis - How to remove specific words from text corpus in R? Code provided to amend

asked Jan 29, 2021 in Technique by 深蓝 (71.8m points)

Suppose you have a corpus, e.g.

myCorpus <- c("Carles werwa went to sadaf buy trsfr in the supermanket", 
           "Marta needs to werwa sadaf go to Jamaica")

I have a dictionary (data_int_syllables) containing a list of words that I would like to remove from myCorpus.

Using library('quanteda'), I tried the following:

myTokens <- tokens(myCorpus, remove_punct = TRUE, remove_numbers = TRUE)
myTokens <- tokens_select(myTokens, names(data_int_syllables))

The issue is, this code amends myTokens to keep only the tokens found in an English dictionary (data_int_syllables). Instead, I want to remove all words found in data_int_syllables.

Does anyone know how to adjust the code so that the words are removed, rather than kept?

390 views

1 Answer

深蓝 · Answer 1 · 2021-01-29T04:24:24+0000

answered Jan 29, 2021 by 深蓝 (71.8m points)

Waiting for an expert to reply.
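
A minimal sketch of one possible fix, assuming a reasonably recent quanteda: tokens_select() takes a selection argument that defaults to "keep", so passing selection = "remove" (or calling the tokens_remove() shortcut) should drop the matched words instead of keeping them. The dict_words vector below is a small hypothetical stand-in for data_int_syllables, used only so the example runs on its own; with the real object, names(data_int_syllables) would be passed as the pattern, as in the question.

```r
library(quanteda)

myCorpus <- c("Carles werwa went to sadaf buy trsfr in the supermanket",
              "Marta needs to werwa sadaf go to Jamaica")

# Hypothetical stand-in for data_int_syllables: a named integer vector
# whose names are the dictionary words (values are placeholders here).
dict_words <- setNames(
  rep(1L, 8),
  c("went", "to", "buy", "in", "the", "needs", "go", "jamaica")
)

myTokens <- tokens(myCorpus, remove_punct = TRUE, remove_numbers = TRUE)

# Option 1: same call as in the question, but removing matches instead of keeping them
removed_a <- tokens_select(myTokens, pattern = names(dict_words),
                           selection = "remove")

# Option 2: tokens_remove() is shorthand for selection = "remove"
removed_b <- tokens_remove(myTokens, pattern = names(dict_words))

print(removed_a)
```

Matching is case-insensitive by default, so "Jamaica" is matched by the lowercase entry "jamaica". Words that are not in the dictionary, such as "werwa" and "sadaf", are left in place.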
