Scraping beautiful images from Pixiv with Python

Pixiv (P站) can't be opened directly without a proxy, so we use a mirror site instead:

P站 - pixiv - Illustration World (vilipix.com)

The images are all quite beautiful and high in quality.


Here is the code:

import re

import requests

# Browser-like headers so the request looks like a normal page visit.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36 Edg/115.0.1901.203",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
}

img_set = set()  # a set drops URLs that repeat across pages
num = int(input("Scrape up to page: "))
for page in range(1, num + 1):
    url = f"https://www.vilipix.com/new?p={page}"
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error
    # Slashes in the embedded URLs are escaped as "\u002F"; restore them.
    html = response.text.replace("\\u002F", "/")
    img_set.update(re.findall(r'original_url:"(.*?)"', html))

# Write one URL per line.
with open("vimg.txt", "w", encoding="UTF-8") as f:
    for img_url in img_set:
        f.write(f"{img_url}\n")
print(f"URLs collected: {len(img_set)}")
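To see what the string replacement and the regular expression do without touching the network, here is a small offline sketch. The sample text is made up for illustration and is not real site output; it assumes the page embeds each URL as original_url:"..." with slashes escaped as \u002F, which is what the script implies. extract_original_urls is a hypothetical helper name, and it uses a single replace that has the same effect on the URLs.

```python
import re


def extract_original_urls(page_text):
    """Return the set of original_url values found in page_text."""
    # Undo the JavaScript-style escape for "/" before matching.
    unescaped = page_text.replace("\\u002F", "/")
    # Non-greedy match up to the closing quote of each original_url value.
    return set(re.findall(r'original_url:"(.*?)"', unescaped))


# Made-up sample (hypothetical host and file names); the first URL repeats.
sample = (
    'original_url:"http:\\u002F\\u002Fimg.example.com\\u002Fa_p0.jpg",'
    'original_url:"http:\\u002F\\u002Fimg.example.com\\u002Fb_p0.png",'
    'original_url:"http:\\u002F\\u002Fimg.example.com\\u002Fa_p0.jpg"'
)
urls = extract_original_urls(sample)
print(sorted(urls))
```

Because the results go into a set, the repeated URL in the sample is kept only once, which is the same reason the script collects into img_set.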

And here are 11 of the scraped URLs:

http://img9.vilipix.com/picture/pages/original/2023/04/30/13/110287699_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/14/110352287_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/13/110762332_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/15/110350646_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/13/109880680_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/15/110878733_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/14/110807197_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/13/110271013_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/12/110720774_p0.png
http://img9.vilipix.com/picture/pages/original/2023/04/30/14/110038269_p0.jpg
http://img9.vilipix.com/picture/pages/original/2023/04/30/12/109807253_p0.png
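The file names in these URLs all look like <number>_p<index>.<extension>. As a rough sketch, assuming that pattern holds (it is inferred only from the samples above), the pieces can be split out like this, for example to pick a local file name later. split_image_url is a hypothetical helper name.

```python
import posixpath
import re
from urllib.parse import urlparse


def split_image_url(url):
    """Return (file_name, number, page_index, extension) for an image URL.

    The last three items are None if the name does not match the
    assumed <number>_p<index>.<extension> pattern.
    """
    file_name = posixpath.basename(urlparse(url).path)
    match = re.fullmatch(r"(\d+)_p(\d+)\.(\w+)", file_name)
    if match is None:
        return file_name, None, None, None
    return file_name, match.group(1), int(match.group(2)), match.group(3)


info = split_image_url(
    "http://img9.vilipix.com/picture/pages/original/2023/04/30/12/110720774_p0.png"
)
print(info)
```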