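Kisssub (爱恋动漫) BT can be reached without a proxy and its catalogue is fairly complete, so I strongly recommend it.
A quick disclaimer: I have only just started learning Python and my skills are modest, so please forgive the clumsy code. I don't care about crawling efficiency; it only needs to work.
1: Environment:
Windows 10 x64
Python 3.10
Modules: pyppeteer, requests, lxml
pip install pyppeteer
pip install lxml  # (pyppeteer actually ships with its own XPath support, but I didn't know that at first and haven't bothered to change it)
pip install requests
2: The Kisssub anime list file, anime_list.txt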
http://www.kisssub.org/search.php?keyword=夏日重现 诸神 720P
http://www.kisssub.org/search.php?keyword=式守同学不止可爱 【喵萌奶茶屋】 1080 简体翻译
http://www.kisssub.org/search.php?keyword=海贼王 ][1080p][周日版][MKV
http://www.kisssub.org/search.php?keyword=间谍过家家 1080p 简日双语 【喵萌奶茶屋
http://www.kisssub.org/search.php?keyword=凡人修仙 NC-Raws
http://www.kisssub.org/search.php?keyword=王者天下 繁體 1080P
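Create a file named anime_list.txt in the same directory with the content above: one search URL per line. Kisssub supports multi-keyword search, which lets you pin down the fansub group, resolution, and so on.
Copying the full address from Chrome's address bar percent-encodes the URL automatically; if you leave out the http:// part when copying, it is not encoded. The encoded form works just as well, but it is harder to read.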
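To illustrate the encoding point above, here is a minimal sketch using Python's standard `urllib.parse`. The helper names `build_search_url`, `encode_url`, and `decode_url` are made up for this example and are not part of the script; the sketch only assumes the search endpoint shown in the list above.

```python
from urllib.parse import quote, unquote

# Search endpoint as it appears in anime_list.txt.
BASE = 'http://www.kisssub.org/search.php?keyword='


def build_search_url(*keywords):
    """Hypothetical helper: join keywords with spaces and append them, unencoded, to BASE."""
    return BASE + ' '.join(keywords)


def encode_url(url):
    """Percent-encode the keyword part, similar to what copying from the address bar produces."""
    prefix, keyword = url.split('keyword=', 1)
    return prefix + 'keyword=' + quote(keyword)


def decode_url(url):
    """Turn a percent-encoded URL back into its readable form."""
    return unquote(url)


readable = build_search_url('凡人修仙', 'NC-Raws')
encoded = encode_url(readable)
print(readable)
print(encoded)
print(decode_url(encoded) == readable)
```

Running `decode_url` over a pasted, encoded line before saving it keeps anime_list.txt readable; either form should work as a search URL.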
3: The complete .py file:
# -*- coding: utf-8 -*-
import asyncio
from pyppeteer import launch
from lxml import etree
import time
import re
import requests

path = r'D:\bt/'  # directory where the .torrent files are saved
filename = 'log.txt'

# Create log.txt if it does not exist yet; it is used to avoid downloading the same torrent twice.
try:
    with open(filename, 'r', encoding='utf-8') as txt:
        pass
except FileNotFoundError:
    with open(filename, 'w', encoding='utf-8') as txt:
        pass

# Read the search URLs, one per line.
with open('anime_list.txt', 'r', encoding='utf-8') as txt:
    anime_list = txt.readlines()

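def kakunin(url):  # check whether this link was already downloaded; if not, record it in log.txt
    with open('log.txt', 'r+', encoding='utf-8') as txt:
        txt_s = txt.read()
        if url in txt_s:
            return 1
        else:
            # read() left the file position at the end, so this appends
            txt.write(url + ' ' + time.asctime(time.localtime()) + '\n')
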
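# Use a regular expression to strip characters that Windows does not allow in file names
# (dots are removed as well, as in the original pattern)
def fixname(filename):
    intab = r'[\\/:*?"<>|.]'
    filename = re.sub(intab, "", filename)
    return filename
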
# Main function
async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://www.kisssub.org/addon.php?r=document/view&page=mika-mode')
    await page.click('#user_script_save')  # apply the script on the "survival strategy mode" (生存战略模式) page
    for search_url in anime_list:
        search_url = search_url.strip()
        if not search_url:  # skip blank lines in anime_list.txt
            continue
        await page.goto(search_url)
        doc = await page.content()
        page_etree = etree.HTML(doc)
        title_url = page_etree.xpath('//*[@id="data_list"]/tr/td[3]/a/@href')
        for url in title_url:
            fin = kakunin(url)
            if fin == 1:
                break  # already logged, so stop processing this search page
            else:
                try:
                    child_url = 'http://www.kisssub.org/' + url
                    await page.goto(child_url)
                    child_doc = await page.content()
                    child_etree = etree.HTML(child_doc)
                    child_torrent = child_etree.xpath('/html/body/div[1]/div[6]/div[2]/div/div[2]/div[4]/div[1]/ul/li[2]/a/@href')
                    child_title = child_etree.xpath('//*[@id="btm"]/div[4]/a[3]/text()')
                    fix_title = fixname(child_title[0])  # sanitize the torrent name
                    file_path = path + fix_title + '.torrent'
                    # Download first, so a failed request does not leave an empty file behind
                    bt_data = requests.get(child_torrent[0]).content
                    with open(file_path, 'wb') as torrent:
                        torrent.write(bt_data)
                    print(fix_title, 'downloaded successfully')
                except Exception:
                    print('Error on child page: http://www.kisssub.org/' + url)
    await browser.close()

# Entry point
if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(main())
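
A possible refinement of `kakunin`, offered only as a sketch: the function above records a link before the download is attempted, so a link whose download fails is never retried. The version below separates the check from the recording step and compares whole entries rather than substrings. The names `is_downloaded` and `mark_downloaded` are hypothetical; the log format (link, a space, then a timestamp) is the one the script writes.

```python
import os
import tempfile
import time


def is_downloaded(url, log_path):
    """Return True if url is the first field of any line in the log."""
    if not os.path.exists(log_path):
        return False
    with open(log_path, 'r', encoding='utf-8') as txt:
        return any(line.split(' ', 1)[0] == url for line in txt)


def mark_downloaded(url, log_path):
    """Append url and a timestamp to the log; meant to be called after a successful download."""
    with open(log_path, 'a', encoding='utf-8') as txt:
        txt.write(url + ' ' + time.asctime(time.localtime()) + '\n')


# Small demonstration with a temporary log file and a made-up link.
demo_log = os.path.join(tempfile.mkdtemp(), 'log.txt')
print(is_downloaded('show-example.html', demo_log))
mark_downloaded('show-example.html', demo_log)
print(is_downloaded('show-example.html', demo_log))
```

In the main loop this would mean calling `is_downloaded` where `kakunin` is called now, and `mark_downloaded` right after the torrent file has been written.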
Put anime_list.txt and the .py file in the same directory, then change the torrent download path to whatever you need.
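As an illustration of changing the download directory, the following sketch builds the target path with `pathlib` instead of string concatenation and creates the directory if it is missing. `torrent_path` is a hypothetical helper, not part of the script, and the temporary directory in the demonstration is only a placeholder for your own path. Unlike `fixname`, this sketch keeps dots in the title.

```python
import re
import tempfile
from pathlib import Path

# Characters that Windows does not allow in file names.
FORBIDDEN = r'[\\/:*?"<>|]'


def torrent_path(directory, title):
    """Hypothetical helper: return directory/<cleaned title>.torrent, creating the directory if needed."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    clean = re.sub(FORBIDDEN, '', title).strip()
    return folder / (clean + '.torrent')


# Demonstration with a placeholder directory and a made-up title.
demo_dir = Path(tempfile.mkdtemp()) / 'bt'
target = torrent_path(demo_dir, 'A: B/C? [1080p]')
print(target.name)
print(demo_dir.is_dir())
```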
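Paired with Xunlei's automatic downloading, it is a joy to use!
The code is rather messy, but it may serve as a reference for beginners. Suggestions are welcome in the comments.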