SCRAPER NOT FUCKING WORKING

Question

SCRAPER NOT FUCKING WORKING

Eli Smith

I'VE BEEN SITTING SINCE LIKE 3AM TRYING TO DOWNLOAD A BUNCH OF LOLI HENTAI USING PYTHON AND BEAUTIFULSOUP AND ITS NOT WORKING WHY THE FUCK DOES IT NOT WORK I CHECKED EVERYTHING I SWEAR I DID. THE DAMN THING WILL DOWNLOAD ONE OR TWO VIDEOS FROM HIM AND THEN JUST FREEZE ON ONE AND NOT PROGRESS. NO CRASHES, NO ERRORS, NOTHING. IT JUST FUCKING STOPS. AT THIS POINT I REFUSE TO DOWNLOAD THE SHIT MANUALLY OUT OF ANGER AND SPITE. IT MUST WORK, I CAN'T EVEN IMAGINE WHERE I WENT WRONG WITH THIS SHIT. ARE THE FILES IM TRYING TO DOWNLOAD TOO BIG AND MAKING THE DAMN THING HICCUP? DO I NEED TO PASS SOMETHING MORE THAN JUST USER-AGENT INFO AND THE URL?

from bs4 import BeautifulSoup
import requests
import re
import os
r_headers = {"user-agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36"}
reg = re.compile("[0-9][0-9][0-9][0-9]?")
reg4 = re.compile(r"a.kyot.me/|https://files.catbox.moe/|https://lewd.cat/")
r = requests.get("mantis-x.net/completed-animations/", headers=r_headers)
soup = BeautifulSoup(r.text, "html.parser")
elements = soup.find_all("a", attrs={"data-id":reg})
vid_page_links = {x.get("href") for x in elements}

os.chdir(r"/home/cereal/Videos/mantis-x")
for vid_page_link in vid_page_links:
r2 = requests.get(vid_page_link, headers=r_headers)
soup2 = BeautifulSoup(r2.text, "html.parser")
title = soup2.find("h1").text
vid_variations = {x.get("href") for x in soup2.find_all("a", href=reg4)}

for vid in vid_variations:
file_name = title.strip() + " - " + vid.split("/")[-1]
with open(file_name, "wb") as f:
try:
f.write(requests.get(vid, headers=r_headers).content)
except:
print(file_name)

Attached: fucking kill me.png (1120x1020, 497.57K)

May 25, 2022 - 10:32

Austin Garcia

i would help you, but I am most DEFINITELY not going to allow explicit images to be downloaded onto my computer. better luck with someone else.

May 25, 2022 - 10:34

David Cook

>gs-mantis

Attached: 33B96453-9BDC-4113-9870-B3F8E5148288.png (800x650, 39.84K)

May 25, 2022 - 10:38

Camden Davis

self bump, please help. I don't want to give up on this, I really feel like I should've been able to manage this task

May 25, 2022 - 10:53

Matthew Ramirez

Go fuck yourself, you degenerate lolicon coomer.

May 25, 2022 - 10:56

Asher Butler

so can you like help me or not?

May 25, 2022 - 10:59

Dominic Watson

you are getting timed out

May 25, 2022 - 11:01

Jackson Edwards

>mantis-x
this is awful
how can you enjoy any of this lol

May 25, 2022 - 11:08

Levi Watson

:yotsubaexcited:

May 25, 2022 - 11:16

Dylan Price

Yeah that might be it. OP needs to add a timeout parameter to requests.get and use more try catches, there's probably some nasty stuff happening in that HTML parsing logic.

May 25, 2022 - 11:19

1 2 3 Next

SCRAPER NOT FUCKING WORKING

Last threads