Baiduspider requests the wrong page URL after fetching the feed file, resulting in a crawl…


Author: 更新带动器 • Date: 2020-03-09 • Category: Q&A

For example, the spider first requests the feed:

123.125.71.16 - - [25/Aug/2019:01:42:22 +0800] 'GET /2224.html/feed HTTP/1.1' 200 979 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'

It then goes on to request:

220.181.108.158 - - [25/Aug/2019:02:18:41 +0800] 'GET /www.whlihun.com/2224.html HTTP/1.1' 404 479 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'

The path in this second request has an extra www.whlihun.com in it, so the request returns 404. Is the feed misconfigured?
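To see how often this happens, the access log can be scanned for Baiduspider requests that return 404 and whose path starts with the site's own host name. The following is only a minimal sketch in Python: it assumes the log uses exactly the single-quoted layout of the two lines quoted above, and the names `LOG_RE`, `find_host_prefixed_404s`, and `SAMPLE` are made up for illustration.

```python
import re

# Pattern for access-log lines in the layout quoted above
# (single-quoted request, referer, and user-agent fields).
LOG_RE = re.compile(
    r"(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "
    r"'(?P<method>\S+) (?P<path>\S+) \S+' "
    r"(?P<status>\d{3}) (?P<size>\d+) "
    r"'(?P<referer>[^']*)' '(?P<agent>[^']*)'"
)

HOST = "www.whlihun.com"  # host name taken from the log and feed in this post


def find_host_prefixed_404s(lines, host=HOST):
    """Return (time, path) for Baiduspider 404s whose path starts with /<host>/."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        if (
            "Baiduspider" in m["agent"]
            and m["status"] == "404"
            and m["path"].startswith("/" + host + "/")
        ):
            hits.append((m["time"], m["path"]))
    return hits


# The two log lines from this post, used as sample input.
SAMPLE = [
    "123.125.71.16 - - [25/Aug/2019:01:42:22 +0800] 'GET /2224.html/feed HTTP/1.1' 200 979 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'",
    "220.181.108.158 - - [25/Aug/2019:02:18:41 +0800] 'GET /www.whlihun.com/2224.html HTTP/1.1' 404 479 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'",
]

if __name__ == "__main__":
    for time, path in find_host_prefixed_404s(SAMPLE):
        print(time, path)
```

Run against the two sample lines, only the second request is reported. On a real log file, pass the opened file object instead of `SAMPLE`.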

The feed content is as follows:

<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
>
<channel>
    <title>《离婚协议是否受合同法约束?》的评论</title>
    <atom:link href="https://www.whlihun.com/2224.html/feed" rel="self" type="application/rss+xml" />
    <link>https://www.whlihun.com/2224.html</link>
    <description></description>
    <lastBuildDate>Wed, 26 Dec 2018 10:06:23 +0000</lastBuildDate>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <generator>https://wordpress.org/?v=5.2.2</generator>
</channel>
</rss>
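One quick check on the feed itself is whether the channel-level links are absolute URLs (scheme plus host), since a link without a scheme could be read as a relative path. The sketch below uses only Python's standard library. `FEED` is a trimmed copy of the feed above with the `<rss ...>` start tag closed, and `feed_links` and `is_absolute` are hypothetical helper names, not part of WordPress or any Baidu tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Trimmed copy of the feed from this post (only the link-bearing elements).
FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
<channel>
  <atom:link href="https://www.whlihun.com/2224.html/feed" rel="self" type="application/rss+xml" />
  <link>https://www.whlihun.com/2224.html</link>
</channel>
</rss>"""


def feed_links(xml_text):
    """Collect channel-level <link> text values, then atom:link href values."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    channel = root.find("channel")
    links = [el.text.strip() for el in channel.findall("link") if el.text]
    links += [el.get("href") for el in channel.findall(ATOM_NS + "link")]
    return links


def is_absolute(url):
    """True if the URL carries an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for url in feed_links(FEED):
        print(url, "absolute" if is_absolute(url) else "NOT absolute")
```

For the feed quoted here, both links come out as absolute `https://www.whlihun.com/...` URLs. That suggests, but does not prove, that these two elements are not where the host-prefixed path comes from. A scheme-less value such as `www.whlihun.com/2224.html` would be flagged as not absolute by the same check.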

This post is a user submission and does not represent the position of 微盟圈. If you repost it, please credit the source: https://www.vm7.com/a/ask/80871.html
