Python Web Scraping: Setting Up a Random Request-Header (User-Agent) Middleware in the Scrapy Framework


Abstract: Two ways to give Scrapy requests a random User-Agent header through a downloader middleware: method one keeps a hand-picked list of User-Agent strings (collected from http://www.useragentstring.com/pages/useragentstring.php?name=All) and chooses one at random per request; method two draws a random header from the fake_useragent module.

Method 1: define a list of User-Agent strings and pick one at random for each request.

User-Agent strings can be collected from http://www.useragentstring.com/pages/useragentstring.php?name=All

import random


class UserAgentDownloadMiddleware(object):
    # Pool of User-Agent strings to rotate through
    USER_AGENTS = [
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36',
        'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2226.0 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.4; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36',
        'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2224.3 Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36',
    ]

    def process_request(self, request, spider):
        # Choose a random User-Agent from the pool and attach it to the outgoing request
        user_agent = random.choice(self.USER_AGENTS)
        request.headers['User-Agent'] = user_agent
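The middleware does nothing until it is registered in the project's settings.py under DOWNLOADER_MIDDLEWARES (the same registration applies to method 2 below). A minimal sketch, assuming a project named myproject with the class above in myproject/middlewares.py; both names are placeholders for your own layout:

# settings.py -- hypothetical paths, adjust to your project
DOWNLOADER_MIDDLEWARES = {
    # Any priority above 500 runs this middleware after Scrapy's built-in
    # UserAgentMiddleware (priority 500), so the random header takes effect.
    'myproject.middlewares.UserAgentDownloadMiddleware': 543,
}

543 is simply the slot suggested in Scrapy's generated project template; the exact number only matters relative to the other downloader middlewares.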

Method 2: use the fake_useragent module to get a random request header.

from fake_useragent import UserAgent


class UserAgentDownloadMiddleware(object):
    def process_request(self, request, spider):
        # fake_useragent picks a random real-world User-Agent string
        user_agent = UserAgent().random
        request.headers['User-Agent'] = user_agent
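fake_useragent is a third-party package installed with pip install fake-useragent; creating the UserAgent() object once (for example in the middleware's __init__) rather than on every request avoids reloading its data each time. To confirm that headers really are rotating, a throwaway spider can hit an echo endpoint such as https://httpbin.org/user-agent. The spider below is only an illustrative sketch; its name and placement are assumptions:

# Hypothetical check spider: httpbin.org echoes back the User-Agent it received
import json

import scrapy


class UACheckSpider(scrapy.Spider):
    name = 'ua_check'

    def start_requests(self):
        # Send a few identical requests; dont_filter stops the dupefilter from dropping them
        for _ in range(3):
            yield scrapy.Request('https://httpbin.org/user-agent', dont_filter=True)

    def parse(self, response):
        # Log the User-Agent header the server saw for each request
        self.logger.info(json.loads(response.text)['user-agent'])

With the middleware enabled, the logged values should vary from request to request.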
