snxnarod download

lemans
Posts: 3
Joined: 01 Mar 2018, 03:23

snxnarod download

Post by lemans » 01 Mar 2018, 04:27

Hi there,

I'm looking to download all pictures from some threads on www.sxnarod.com. This is a TGP website that uses www.backbook.me as its image host.

Example of a thread I want to download pictures from, where one link on sxnarod.com can lead to a group of images on backbook.me:
https://www.sxnarod.com/victoria-s-secret-angels-t.html

Would appreciate your help with the template.

Thank you.

Maxim
Site Admin
Posts: 933
Joined: 02 Mar 2009, 17:02

Re: snxnarod download

Post by Maxim » 01 Mar 2018, 17:37

This is not a TGP website, it is a forum. The best way to download from a forum thread is "exclude everything - include only required URLs". Create a project and set Exploration type to Regular. Then exclude all URLs by adding

.*

to the Excluded URLs filters. Then start adding Regular Expressions for the required URLs to the Included URLs. For any forum thread you need an expression that matches every page of that thread. Usually thread page URLs differ only by a number, which can be replaced with \d+ in the Regular Expression of the filter. In this case it would be:

\.com/victoria-s-secret-angels_\d+-t\.html$
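
As a quick sanity check outside the program, the thread filter can be tried against a few URLs in Python. This is only a sketch: it assumes the filter is applied as an unanchored regular-expression search (Python's `re.search` is used as a stand-in), and the numbered page URLs are made up for illustration. The unnumbered URL is the one from the first post.

```python
import re

# Filter exactly as given above.
THREAD_PAGE = re.compile(r"\.com/victoria-s-secret-angels_\d+-t\.html$")

# Hypothetical variant (not from the thread): the page number is optional.
THREAD_PAGE_OPTIONAL = re.compile(r"\.com/victoria-s-secret-angels(_\d+)?-t\.html$")

urls = [
    "https://www.sxnarod.com/victoria-s-secret-angels-t.html",     # from the first post
    "https://www.sxnarod.com/victoria-s-secret-angels_2-t.html",   # hypothetical page 2
    "https://www.sxnarod.com/victoria-s-secret-angels_15-t.html",  # hypothetical page 15
    "https://www.sxnarod.com/some-other-thread_2-t.html",          # hypothetical other thread
]

# Assumption: the downloader searches for the pattern anywhere in the URL.
strict = [bool(THREAD_PAGE.search(u)) for u in urls]
optional = [bool(THREAD_PAGE_OPTIONAL.search(u)) for u in urls]

for url, s, o in zip(urls, strict, optional):
    print(f"strict={s!s:5} optional={o!s:5} {url}")
```

The URL exactly as posted has no `_<number>` part, so the expression above does not match it. If that page also has to pass a filter, an optional group such as `(_\d+)?` would be one way to cover both forms.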

Now, this forum uses an image hosting website for images, so you need to add a regular expression that will match any image page from that hosting:

backbook\.me/photo-[^/\?]+\.html$

And finally you need to add a filter that matches the path to any full-size image on this hosting:

backbook\.me/full/
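
To see how the three Included filters work together with the catch-all Excluded filter, here is a small Python simulation. It is a sketch under assumptions, not a description of the program's internals: it assumes that a URL matching any Included filter is followed even though `.*` excludes everything else, and that filters are unanchored searches. The `is_allowed` helper and every sample URL except the photo page are hypothetical.

```python
import re

EXCLUDED = [r".*"]  # exclude everything
INCLUDED = [
    r"\.com/victoria-s-secret-angels_\d+-t\.html$",  # thread pages
    r"backbook\.me/photo-[^/\?]+\.html$",            # image pages on the host
    r"backbook\.me/full/",                           # full-size images
]

def is_allowed(url: str) -> bool:
    """Assumed rule: an Included match wins, otherwise any Excluded match blocks."""
    if any(re.search(p, url) for p in INCLUDED):
        return True
    return not any(re.search(p, url) for p in EXCLUDED)

samples = [
    "https://www.backbook.me/photo-c193723503.html",              # photo page from the thread
    "https://dl.backbook.me/full/example.jpg",                    # hypothetical full-size image
    "https://www.sxnarod.com/victoria-s-secret-angels_3-t.html",  # hypothetical thread page
    "https://www.sxnarod.com/index.php",                          # hypothetical unrelated page
]

results = {url: is_allowed(url) for url in samples}
for url, ok in results.items():
    print("follow" if ok else "skip  ", url)
```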

That's it.

lemans
Posts: 3
Joined: 01 Mar 2018, 03:23

Re: snxnarod download

Post by lemans » 01 Mar 2018, 19:19

It seems to be working now.

The problem is, I can only manage to download a fraction of all linked images.
In one thread it parses all the backbook.me links (1200), but only a handful of them are dl.backbook.me/full (42), which resulted in 42 downloaded pics.

Maxim
Site Admin
Posts: 933
Joined: 02 Mar 2009, 17:02

Re: snxnarod download

Post by Maxim » 01 Mar 2018, 22:15

OK, have you seen where the rest of them are located? Do you know how to find that out?

lemans
Posts: 3
Joined: 01 Mar 2018, 03:23

Re: snxnarod download

Post by lemans » 02 Mar 2018, 15:19

One of the images that were not downloaded - https://www.backbook.me/photo-c193723503.html

This is the location - https://d.backbook.me/file/2017/11/23/8 ... 723503.jpg

I guess I need an additional Included URL filter, but don't know how to form it myself.


Another question, regarding naming. Is it possible to rename files in the format 'generated numerical file name_original file name', like 0000_c19372350 for example?

Thanks

Maxim
Site Admin
Posts: 933
Joined: 02 Mar 2009, 17:02

Re: snxnarod download

Post by Maxim » 02 Mar 2018, 16:08

lemans wrote:
"One of the images that were not downloaded - https://www.backbook.me/photo-c193723503.html

This is the location - https://d.backbook.me/file/2017/11/23/8 ... 723503.jpg

I guess I need an additional Included URL filter, but don't know how to form it myself."

In the Included URLs filter (as well as in the Excluded URLs) you have to add the common part of all similar URLs. In this case the common part of all full-size image URLs on backbook.me is the domain name and the text "full" as a part of the URL. So you can change that last filter to the following:

backbook\.me([^\?]+)?/full

to include all the URLs.
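
The difference between the original last filter and the broadened one can be checked with a short Python sketch. It assumes unanchored regular-expression searches, and all sample URLs except the photo page are hypothetical; the truncated location quoted above is not used because part of its path is elided.

```python
import re

OLD = re.compile(r"backbook\.me/full/")           # "full" must directly follow the domain
NEW = re.compile(r"backbook\.me([^\?]+)?/full")   # any path (no "?") may precede "/full"

samples = [
    "https://dl.backbook.me/full/a.jpg",                # hypothetical
    "https://d.backbook.me/file/2017/11/23/full/a.jpg", # hypothetical deeper path
    "https://www.backbook.me/photo-c193723503.html",    # photo page from the thread
    "https://www.backbook.me/view?next=/full",          # hypothetical, "full" only in the query
]

old_hits = [bool(OLD.search(u)) for u in samples]
new_hits = [bool(NEW.search(u)) for u in samples]

for url, o, n in zip(samples, old_hits, new_hits):
    print(f"old={o!s:5} new={n!s:5} {url}")
```

Because `[^\?]+` cannot cross a question mark, "full" appearing only in a query string does not count as a match.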

lemans wrote:
"Another question, regarding naming. Is it possible to rename files in the format 'generated numerical file name_original file name', like 0000_c19372350 for example?"

Well, there is an Expert tab in the Naming section of the project properties where you can use a Regular Expression of the file URL or any parent URL to create a file name, but the current version of EPF does not allow combining number generation with the original file name. I'm adding it to the to-do list for future versions.
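
In the meantime, one possible workaround is to rename the files after the download has finished, outside of EPF. The Python sketch below is only an illustration: `numbered_name` and `rename_all` are hypothetical helpers, and the numbering follows alphabetical order of the existing file names, not the order in which the files were downloaded.

```python
from pathlib import Path

def numbered_name(index: int, original: str, width: int = 4) -> str:
    """Build 'NNNN_original', e.g. index 0 and 'c193723503.jpg' -> '0000_c193723503.jpg'."""
    return f"{index:0{width}d}_{original}"

def rename_all(folder: Path) -> None:
    """Prefix every file in `folder` with a zero-padded counter (alphabetical order)."""
    # Collect the list first so renaming does not disturb the iteration.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    for index, path in enumerate(files):
        path.rename(path.with_name(numbered_name(index, path.name)))

print(numbered_name(0, "c193723503.jpg"))
```

It would be wise to try `rename_all` on a copy of the download folder first, since the renaming is not reversible by the script itself.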
