Help with template

redundant
Posts: 3
Joined: 08 Jul 2019, 23:35

Help with template

Post by redundant » 08 Jul 2019, 23:49

Recently I found myself a mod in a forum and need to do a site rip of all the photos in order to get rid of the spammy posters but keep their content. The problem is they use several different image hosts that all have multiple redirects and popups, and I have no idea how to get around them. One poster uses a site called celebnewsfind.com to "anonymize" his posts, and that makes it even more difficult. The forum site is: http://celebgirls.fun/index.php. They have a sister site at youngermodel.pw/index.php.
Any help would be much appreciated but not expected. Thank you

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Help with template

Post by Maksym » 09 Jul 2019, 12:51

You can try using a generic template called "Download all images from entire website" and then add the following filters to the "Included URLs" to handle image hosts:

(?si)^https?://img\d+\.imagevenue\.com/img\.php\?
(?si)^https?://img\d+\.imagevenue\.com/[^/]+/loc\d+/
(?si)^https?://(www\.)?[^/\.\?]+\.[a-z]+/images?/[^/\?#]+$
(?si)^[^\?]+\.imagebam\.com/.+?\?download=1
(?si)^https?://(www\.)?imagflash\.com/images/
(?si)^https?://(www\.)?[^/\.\?]+\.com/[^/\?]+/[^/\?]+\.[jpeginf]+(\.html)?$
(?si)^https?://i(mg)?\d+\.[^/\.\?]+\.com/i/
(?si)^https?://(www\.)?imgbox\.com/[^/\?#]+$
(?si)^https?://i(mages)?(\d+)?\.imgbox\.com/[^\?#]+$
(?si)^https?://(www\.)?[^/\.\?]+\.[a-z]+/show/[^&/\?#]+$
(?si)^https?://i\d+\.imgchili\.net/
(?si)^https?://(www\.)?pixhost\.org/show/[^&]+$
(?si)^https?://img\d+\.pixhost\.org/images/
(?si)^https?://(www\.)?imageupper\.com/i/\?
(?si)^https?://s\d+\.imageupper\.com/\d+/
(?si)^https?://(www\.)?gogoimage\.org/img-
(?si)^https?://(www\.)?turboimagehost\.com/p/
turboimg\.net/sp/
\.imgswift\.com/files/\d+/
(?si)^https?://(www\.)share-image\.com/gallery/[^/]+/\d+$
(?si)^https?://(www\.)?mixbase\.net/gallery/image\.php\?id=\d+$
(?si)^https?://(www\.)?mixbase\.net/gallery/media/storage/
(?si)^https?://(www\.)?[^/]+/img-
/big/
abload\.de/img/
(?si)dpic\.me/[^/]+/[^/]+\.[jpginf]+$
(?si)^https?://(www\.)?greenpiccs\.com/.+?\.[jpegnif]+(\.html)?$
filesor\.com/pimpandhost\.com/.+?\.[jpegnif]+$
pixxxels\.cc/image/[^/]+/$
pixxxels\.cc/.+\?dl=1
pixhost\.to/show/[^\]#\?&]+$
img\d+\.pixhost\.to/images/[^\]#\?&]+$
^https?://(www\.)directupload\.net/file/[^\[]+$
directupload\.net/images/[^\[]+$
^[^\?]+/v\.php\?id=[^&]+$
/pic_b/
\.imx\.to/i/
^[^\?'"\]]+[postimgxel]+\.cc/[^/\?#\.]+$
^[^\?'"\]]+[postimgxel]+\.cc/[^/]+/[^/]+\?dl=1$
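To see how filters like these behave, here is a minimal sketch in Python. It assumes that an "Included URLs" filter accepts a URL when the expression matches anywhere in it, and that Python's `re` module treats these patterns the same way Extreme Picture Finder's own regex engine does; neither is confirmed here. The helper name `is_included` and the sample URLs are made up for illustration.

```python
import re

# A few of the "Included URLs" filters from the list above, copied verbatim.
INCLUDE_FILTERS = [
    r"(?si)^https?://(www\.)?imgbox\.com/[^/\?#]+$",
    r"(?si)^https?://(www\.)?pixhost\.org/show/[^&]+$",
    r"\.imx\.to/i/",
    r"/big/",
]


def is_included(url: str) -> bool:
    """Return True if at least one filter matches somewhere in the URL.

    Hypothetical helper: it only imitates the assumed filter behaviour.
    """
    return any(re.search(pattern, url) for pattern in INCLUDE_FILTERS)


# Hypothetical URLs, made up for illustration only.
print(is_included("https://imgbox.com/AbCd1234"))           # True
print(is_included("https://imgbox.com/AbCd1234/extra"))     # False
print(is_included("https://i.imx.to/i/2019/07/01/abc.jpg")) # True
print(is_included("https://example.com/about.html"))        # False
```

The second URL is rejected because the imgbox filter ends with `[^/\?#]+$`, which allows only one path segment after the host name.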

redundant
Posts: 3
Joined: 08 Jul 2019, 23:35

Re: Help with template

Post by redundant » 12 Jul 2019, 19:00

Thank you for helping. I tried your suggestion and it didn't work at first... I figured it was because of the anonymizer site the poster was using. So I added //celebnewsfind.com/ to the include list and now it works... sorta. It's downloading the imagespice and imagetwist images, but I think it needs something for the other file hosts he uses... imagefrost, imageadult, imagedrive, imgbaron, kvador, imgwallet... etc.
Should I just type them in as is? Or does it need all those question marks, slashes, carrots and money signs?

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Help with template

Post by Maksym » 13 Jul 2019, 14:41

Those are not supported by Extreme Picture Finder right now. Those image hosts require you to click a button, usually called something like "Continue to image...", which uses the HTTP POST method to get to the actual image page.

As for the carrots and money signs - those are parts of Regular Expressions. You can learn them here (for example):

https://regexone.com/
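As a quick illustration of why a host name typed "as is" is not quite the same as an escaped and anchored expression, here is a small sketch in Python. Python's `re` module is used only as a stand-in; Extreme Picture Finder's regex flavor may differ in details. The helper `matches` and the sample URLs are hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


plain = "imgbox.com"                   # typed as is: "." matches ANY character
escaped = r"imgbox\.com"               # "\." matches only a literal dot
anchored = r"^https?://imgbox\.com/"   # "^" = start of text, "s?" = optional "s"
jpg_at_end = r"\.jpg$"                 # "$" = end of text

print(matches(plain, "http://imgboxXcom/"))     # True  (the dot matched "X")
print(matches(escaped, "http://imgboxXcom/"))   # False
print(matches(anchored, "http://imgbox.com/abc"))                        # True
print(matches(anchored, "http://example.com/?u=http://imgbox.com/abc"))  # False
print(matches(jpg_at_end, "/pics/photo.jpg"))       # True
print(matches(jpg_at_end, "/pics/photo.jpg.html"))  # False
```

So a bare host name usually still matches, but it matches more than intended; the slashes, carets and dollar signs narrow the filter down to the exact URLs you want.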

redundant
Posts: 3
Joined: 08 Jul 2019, 23:35

Re: Help with template

Post by redundant » 19 Jul 2019, 05:59

The program reads the thumbnail file, which is just the picture file with _t added to the end of the name. How would I tell it to take the _t off the end? I'll give an example:
/nkrik67f9mho_t.jpg is the end of the file it already scans and says is too small.
All it needs to do is grab /nkrik67f9mho.jpg

Another example: it scans /nkrik67f9mho/m160-s185-001.jpg.html
If it dropped the .html at the end, it would get to a page with the original picture on it, and if it dropped the .html and the whole end string and put .jpg on the end, it would go to the source page.

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Help with template

Post by Maksym » 19 Jul 2019, 11:55

You have to use Custom Parsers for that. If I had the full URLs, I could provide more precise regular expressions, but based on your input, here you are.

This one removes "_t":

Expression: (.*/[^/]+)_t(\.[jpegnif]+)$
Result: [#1][#2]

This one removes the trailing ".html":

Expression: (.*\.[jpegnif]+)\.html$
Result: [#1]
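For anyone who wants to check these two rewrites outside the program, here is a rough Python equivalent. It assumes that the `[#1]` and `[#2]` placeholders in "Result" stand for the first and second capture groups, which Python writes as `\1` and `\2`; how the Custom Parsers are applied internally is not shown here. The function names are hypothetical, and the sample paths are the ones quoted earlier in this thread.

```python
import re


def strip_thumb_suffix(url: str) -> str:
    """Drop "_t" just before the image extension (Result: [#1][#2]).

    The second group opens with a plain "(" so that group 2 exists.
    """
    return re.sub(r"(.*/[^/]+)_t(\.[jpegnif]+)$", r"\1\2", url)


def strip_html_suffix(url: str) -> str:
    """Drop a trailing ".html" after an image extension (Result: [#1])."""
    return re.sub(r"(.*\.[jpegnif]+)\.html$", r"\1", url)


print(strip_thumb_suffix("/nkrik67f9mho_t.jpg"))
# /nkrik67f9mho.jpg
print(strip_html_suffix("/nkrik67f9mho/m160-s185-001.jpg.html"))
# /nkrik67f9mho/m160-s185-001.jpg
```

A URL that does not match either expression is returned unchanged, so the functions are safe to run over a mixed list of links.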

But let me assure you that it's never that easy :( When you remove the trailing ".html", you are automatically redirected back to it. Once you remove "_t" to generate the direct full-size image URL, it turns out the request needs to have a referer pointing to the page where the full-size image is shown. The image hosting websites really want people to view their ads, not just grab the images :(

But sometimes it does actually work. The most recent success was with the "imx.to" website. You can take a look at the Custom Parsers used for it in this template:

imx.to - gallery template
