r - Remove numbers and everything after them in strings

Question

asked Feb 21, 2021 in Technique by 深蓝 (71.8m points)

I have receipt data and I want to aggregate the items which are the same product type;

for example, milk, cheeses, jams.

At the moment, the data includes specifics like pack size and price if the item is flashed as a special offer.

I want to remove this specific info from the end of the string and keep only the text before the first number.

For example:

    Dairy Milk 3Litre
    Brown Onions 1KG
    Avocado 2 AT 3.00 EACH

I want to remove everything from the first number onward, including the number itself.

I want to be left with Dairy Milk, Brown Onions, Avocado, etc.

asked by Helena H, translated from Stack Overflow

1 Answer

深蓝 · Answer 1 · 2021-02-21T04:22:31+0000

We can use sub here to remove everything from the first digit onward:

    x <- c("Dairy Milk 3Litre", "Brown Onions 1KG", "Avocado 2 AT 3.00 EACH")

    # Backslashes must be doubled inside R string literals
    sub("\\d+.*", "", x)
    #[1] "Dairy Milk "   "Brown Onions " "Avocado "

Or, the other way round, extract everything before the first digit:

    # Capture the text before the first digit lazily and keep only that group
    sub("(.*?)\\d+.*", "\\1", x)
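
Both patterns keep the space that sat in front of the number, which could get in the way when grouping by product name. A minimal sketch of one way to handle that, and of the aggregation the question is after, might look like this. The `receipts` data frame, its column names, and the prices are made up for illustration and are not from the question.

```r
# Sample strings from the question
x <- c("Dairy Milk 3Litre", "Brown Onions 1KG", "Avocado 2 AT 3.00 EACH")

# Also swallow any whitespace directly before the first digit
product <- sub("\\s*\\d.*", "", x)
product
#[1] "Dairy Milk"   "Brown Onions" "Avocado"

# Hypothetical receipt data (item names and prices are assumptions)
receipts <- data.frame(
  item  = c("Dairy Milk 3Litre", "Dairy Milk 1Litre", "Brown Onions 1KG"),
  price = c(4.50, 1.80, 2.00)
)

# Derive the product type from each item description
receipts$product <- sub("\\s*\\d.*", "", receipts$item)

# Sum the price per product type
totals <- aggregate(price ~ product, data = receipts, FUN = sum)
```

Wrapping the result in trimws() instead of adding `\\s*` to the pattern should work as well; which one to use is mostly a matter of taste.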