SCRAPER NOT FUCKING WORKING

I'VE BEEN SITTING SINCE LIKE 3AM TRYING TO DOWNLOAD A BUNCH OF LOLI HENTAI USING PYTHON AND BEAUTIFULSOUP AND ITS NOT WORKING WHY THE FUCK DOES IT NOT WORK I CHECKED EVERYTHING I SWEAR I DID. THE DAMN THING WILL DOWNLOAD ONE OR TWO VIDEOS FROM HIM AND THEN JUST FREEZE ON ONE AND NOT PROGRESS. NO CRASHES, NO ERRORS, NOTHING. IT JUST FUCKING STOPS. AT THIS POINT I REFUSE TO DOWNLOAD THE SHIT MANUALLY OUT OF ANGER AND SPITE. IT MUST WORK, I CAN'T EVEN IMAGINE WHERE I WENT WRONG WITH THIS SHIT. ARE THE FILES IM TRYING TO DOWNLOAD TOO BIG AND MAKING THE DAMN THING HICCUP? DO I NEED TO PASS SOMETHING MORE THAN JUST USER-AGENT INFO AND THE URL?

from bs4 import BeautifulSoup
import requests
import re
import os
r_headers = {"user-agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36"}
reg = re.compile("[0-9][0-9][0-9][0-9]?")
reg4 = re.compile(r"a.kyot.me/|https://files.catbox.moe/|https://lewd.cat/")
r = requests.get("mantis-x.net/completed-animations/", headers=r_headers)
soup = BeautifulSoup(r.text, "html.parser")
elements = soup.find_all("a", attrs={"data-id":reg})
vid_page_links = {x.get("href") for x in elements}

os.chdir(r"/home/cereal/Videos/mantis-x")
for vid_page_link in vid_page_links:
r2 = requests.get(vid_page_link, headers=r_headers)
soup2 = BeautifulSoup(r2.text, "html.parser")
title = soup2.find("h1").text
vid_variations = {x.get("href") for x in soup2.find_all("a", href=reg4)}

for vid in vid_variations:
file_name = title.strip() + " - " + vid.split("/")[-1]
with open(file_name, "wb") as f:
try:
f.write(requests.get(vid, headers=r_headers).content)
except:
print(file_name)

Attached: fucking kill me.png (1120x1020, 497.57K)

i would help you, but I am most DEFINITELY not going to allow explicit images to be downloaded onto my computer. better luck with someone else.

>gs-mantis

Attached: 33B96453-9BDC-4113-9870-B3F8E5148288.png (800x650, 39.84K)

self bump, please help. I don't want to give up on this, I really feel like I should've been able to manage this task

Go fuck yourself, you degenerate lolicon coomer.

so can you like help me or not?

you are getting timed out

>mantis-x
this is awful
how can you enjoy any of this lol

:yotsubaexcited:

Yeah that might be it. OP needs to add a timeout parameter to requests.get and use more try catches, there's probably some nasty stuff happening in that HTML parsing logic.