Downloading from forum

nonono2
Posts: 2
Joined: 12 Oct 2011, 20:20

Post by nonono2 » 12 Oct 2011, 21:18

Hi, I was testing some automated image downloaders and came across this wonderful software (I'm using the trial version before actually buying). It was doing the job really well until I found a forum where, for some reason I can't figure out, it doesn't work.

The forum requires a login, and the files are attached to the posts. You can view the posts without logging in, but not the images. I don't know the real image addresses, even after using another program to "see" the real links. They are behind a PHP script, so they look something like "http://site/forum/download.php?id=random-numbers". Even without knowing the real links, EPF sometimes scans them and reads the entire content, but I don't get the images.

I'm using:
Files: *.jp*, *.png, *.gif, *.bmp
Regular site, current page only, without files from external sites
Exclude URLs: .*
Include URLs: download\.php\?id=
Don't download if file size is less than 30 KB

I'm not good with regular expressions, and I don't know if the problem is caused by the required login. I ran into that with a similar program, which actually downloads the images I want, but only from the first page. In both programs I have to select manual login, and the login doesn't seem to persist from one page to another (I don't know if I'm doing something wrong).

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Downloading from forum

Post by Maksym » 13 Oct 2011, 10:44

Hello.

If images are given only through download.php and this script doesn't redirect to the real image file, then you have to add

download.php

to your target files list, then in Save -> Naming check the "Change file extension to" box and enter .jpg in the corresponding field.

Also, if you want to scan all pages, you should use the Follow all links option, not Current page only. After that, exclude all addresses using .* (just like you did) and then add the common part of the page addresses to the included addresses. Like this:

\?page=\d+

And you were right to add

download\.php

to the Included addresses list. But if this script redirects to the actual image file, then you should also add

\.jpg$

to the included addresses so that EPF will not exclude the image files.
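The filter logic described above can be illustrated with a small Python sketch. This is not EPF's actual implementation; it only assumes, as the advice in this post implies, that an address matching an include pattern is kept even though the catch-all exclude pattern .* also matches it. The function name is_allowed and the sample URLs are hypothetical.

```python
import re

# Filter lists mirroring the settings discussed in this thread.
EXCLUDE = [r".*"]
INCLUDE = [r"\?page=\d+", r"download\.php", r"\.jpg$"]


def is_allowed(url, include=INCLUDE, exclude=EXCLUDE):
    """Return True if the URL passes the filters.

    Assumption (not confirmed EPF behaviour): an include match
    overrides any exclude match.
    """
    if any(re.search(pattern, url) for pattern in include):
        return True
    return not any(re.search(pattern, url) for pattern in exclude)


# Hypothetical addresses of the kind a forum might expose.
urls = [
    "http://site/forum/viewtopic.php?page=2",
    "http://site/forum/download.php?id=12345",
    "http://site/forum/images/photo.jpg",
    "http://site/forum/memberlist.php",
]
for url in urls:
    print(url, "->", "scan" if is_allowed(url) else "skip")
```

Under that assumption, only the member list address is skipped, because it matches nothing in the include list and is caught by .*.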

This is all I can say without seeing the actual forum. If you want me to dig deeper - give me the real forum URL. If user name/password is required to download the files - you can send them to support@exisoftware.com.

nonono2
Posts: 2
Joined: 12 Oct 2011, 20:20

Re: Downloading from forum

Post by nonono2 » 15 Oct 2011, 08:06

Thank you so much for the support! The parameters actually worked for both downloading and page following, but there's a problem now: if I set Change file extension to .jpg, all the files are renamed that way, but the attached files include animated GIFs, and I want to keep them as GIFs, otherwise they will not work.

If I remove the extensions from Included URLs, I get no images. If the target files are file extensions (like .gif), all images are downloaded (the undesired ones too).

Is there a way to keep the original file name, or at least the extension? I know I'm asking too much without the actual site, but if you really need it I'll send it.
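One possible workaround outside the program, sketched here in Python, is to fix the extensions after downloading by looking at the first bytes of each file, since JPEG, PNG, GIF and BMP files each start with a known signature. This is only an illustration and not an EPF feature; the names detect_extension and fix_extensions are hypothetical, and the folder passed in is assumed to contain only the downloaded files.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of the image formats mentioned in this thread.
SIGNATURES = [
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
]


def detect_extension(header):
    """Return the extension matching the header bytes, or None if unknown."""
    for magic, ext in SIGNATURES:
        if header.startswith(magic):
            return ext
    return None


def fix_extensions(folder):
    """Rename files in folder so the extension matches the content.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            ext = detect_extension(fh.read(8))
        current = path.suffix.lower()
        if ext is None or current == ext:
            continue
        if ext == ".jpg" and current == ".jpeg":
            continue  # .jpeg is already a valid JPEG extension
        target = path.with_suffix(ext)
        if not target.exists():  # never overwrite an existing file
            path.rename(target)
            renamed.append((path.name, target.name))
    return renamed
```

Calling fix_extensions on the download folder would rename an animated GIF that was saved as .jpg back to .gif, while files that are already named correctly are left alone.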

Settings now
Target file: download.php
Regular site: follow all links, limit only the exploration depth (no limit)
Naming: Change file extension to .jpg
Excluded URLs: .*
Included URLs: \&start=\d+, download\.php, \.jpg$, \.jpeg$, \.gif$, \.png$, \.bmp$
File size: Don't download if file size is less than 30 KB

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Downloading from forum

Post by Maksym » 15 Oct 2011, 14:11

Those were general suggestions suitable for most forums. And I'm glad it worked.

If adding all the necessary target file extensions results in downloading all images, including ones you don't want, then you should tweak the Included URL filters and make them less general. Don't use just \.jpg$ or \.png$; include part of the actual image URL. If the images are hosted on a different domain, include the domain name; if they are stored in a common folder, include that folder name.
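As a rough illustration of the difference between a general and a narrowed filter, here is a short Python sketch. The folder names /forum/files/ and /forum/images/smilies/ are made up for the example, since the real forum layout is not known; the actual pattern would have to be built from the real image URLs.

```python
import re

# General filter: matches any address ending in .gif.
broad = re.compile(r"\.gif$")

# Narrowed filter. Hypothetical: attachments are assumed to be stored
# directly under /forum/files/ on the forum's own site.
narrow = re.compile(r"/forum/files/[^/]+\.gif$")

# Hypothetical addresses: one attachment and one forum smiley.
urls = [
    "http://site/forum/files/animation.gif",
    "http://site/forum/images/smilies/smile.gif",
]
for url in urls:
    print(url, "broad:", bool(broad.search(url)), "narrow:", bool(narrow.search(url)))
```

The general pattern accepts both addresses, while the narrowed one accepts only the attachment.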

OK, as I said before, feel free to send me the actual forum URL along with login/password to support@exisoftware.com. I'll try to find the necessary settings.
