Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Whenever I run the code, the output contains \n characters and extra spaces. I used the strip function, but it didn't work. How can I resolve this issue? Here is the link: https://ibb.co/VtVV2fb

import scrapy
from ..items import FetchingItem

class SiteFetching(scrapy.Spider):
    name = 'Site'
    start_urls = ['https://www.rev.com/freelancers']
    transcription_page = 'https://www.rev.com/freelancers/transcription'

    def parse(self, response):
        items = {
            'Heading': response.css('#sign-up::text').extract(),
            'Earn_steps': response.css('.pb2 .lh-copy::text , .mb1::text , .mb3 .lh-copy::text').extract(),
        }

        yield response.follow(self.transcription_page, self.trans_faqs, meta={'items': items})

    def trans_faqs(self, response):
        items = response.meta['items']
        names = {
            'name1': 'FAQ1',
            'name2': 'FAQ2',
        }

        finder = {
            'find1': '#whatentailed p::text , #whatentailed .mr3::text',
            'find2': '#requirements p::text , #requirements .mr3::text',
        }

        for name, find in zip(names.values(), finder.values()):
            items[name] = response.css(find.strip()).extract()
        yield items

1 Answer

strip() removes \n only at the start and end of a string, not inside it. If you have \n inside the text, use text = text.replace('\n', '')
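A quick way to see the difference, as a minimal sketch (the sample string is made up for illustration and is not taken from the scraped page):

```python
# Hypothetical sample: newlines at both ends and one in the middle.
text = "\n  First line\n  Second line  \n"

stripped = text.strip()             # trims whitespace at the two ends only
replaced = text.replace("\n", "")   # removes every newline, keeps the spaces

print(repr(stripped))
print(repr(replaced))
```

The inner newline survives strip(), while replace() removes all newlines but leaves the surrounding spaces untouched.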

It seems you get the text as a list created by extract(), so you have to use a list comprehension to remove \n from every element of the list.

data = response.css(find).extract()
data = [x.replace('\n', '').strip() for x in data]
items[name] = data

EDIT: To remove spaces and \n between sentences, you can use split('\n') to create a list of sentences, then strip() every sentence, and finally ' '.join() all the sentences back into one string.

text = 'Sentence 1\n    Sentence 2'

data = text.split('\n')
data = [x.strip() for x in data]
text = ' '.join(data)

print(text)

The same in one line

text = 'Sentence 1\n    Sentence 2'

text = ' '.join(x.strip() for x in text.split('\n'))

print(text)

The same with the re module

import re

text = 'Sentence 1\n    Sentence 2'

# raw string, so \n and \s reach the regex engine unchanged
text = re.sub(r'\n\s+', ' ', text)

print(text)

Applied to the loop in your spider (this needs import re at the top of the file):

for name, find in zip(names.values(), finder.values()):
    data = response.css(find.strip()).extract()
    data = [re.sub(r'\n\s+', ' ', text) for text in data]
    items[name] = data
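Text nodes that contain only whitespace may still be left in the list as empty or near-empty strings after cleaning. As a rough sketch, a helper could collapse whitespace and drop those entries in one pass. The name clean_fragments and the sample list are hypothetical, and it uses str.split() with no arguments instead of the regex, which splits on any run of whitespace:

```python
def clean_fragments(fragments):
    """Collapse whitespace runs in each string and drop empty results."""
    cleaned = []
    for fragment in fragments:
        # str.split() with no arguments splits on any run of whitespace
        # (spaces, tabs, newlines) and ignores leading/trailing whitespace.
        text = " ".join(fragment.split())
        if text:  # skip fragments that contained only whitespace
            cleaned.append(text)
    return cleaned


# Made-up sample resembling what extract() might return.
sample = ["\n      Sentence 1\n      Sentence 2\n  ", "\n   ", "FAQ answer"]
print(clean_fragments(sample))
```

In the spider this would presumably be used as items[name] = clean_fragments(response.css(find).extract()).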
