Trying to get the right settings for a project I am working on

reef1980
Posts: 1
Joined: 09 Dec 2019, 12:20

Trying to get the right settings for a project I am working on

Post by reef1980 » 09 Dec 2019, 13:03

Hi. I have a project I am working on and need a little help. I am trying to download comic book covers from sites like

https://dc.fandom.com/wiki/Batman/Covers
and
http://www.coverbrowser.com/covers/batman

I only want the covers, though. I have tried all sorts of settings and I am not getting anywhere.
It either starts downloading everything from the site, or nothing at all.

I tried excluding the site URL and then including just the /covers/batman part, but then it seems to run forever and doesn't bring me any pictures.

Also, will this only download the thumbnails, or will it download the larger picture files as well?
Thanks.

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: Trying to get the right settings for a project I am working on

Post by Maksym » 09 Dec 2019, 14:21

Well, since all the covers are available on one page, you can safely exclude everything by adding

.

to the "Excluded URLs" and then adding only the URLs that lead to the files you need to the "Included URLs". In this case I did it like this:

(?i)\.[jpegnif]+$
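
If you want to sanity-check these two patterns outside the program, here is a rough Python sketch. It assumes the program's regular expressions behave like Python's `re` module for these patterns, and it assumes that an "Included URLs" match takes priority over an "Excluded URLs" match, as the advice above implies. The `should_download` helper and the sample URLs are made up for illustration.

```python
import re

# Patterns quoted from the post above.
EXCLUDE = re.compile(r".")                  # matches any non-empty URL
INCLUDE = re.compile(r"(?i)\.[jpegnif]+$")  # .jpg, .jpeg, .png, .gif in any case


def should_download(url: str) -> bool:
    """Hypothetical helper, not part of any real tool.

    Assumption: an include match wins over an exclude match.
    """
    if INCLUDE.search(url):
        return True
    return not EXCLUDE.search(url)


# Hypothetical sample URLs, for illustration only.
samples = [
    "https://example.com/images/Batman_1.jpg",
    "https://example.com/images/Batman_2.PNG",
    "https://example.com/wiki/Batman/Covers",
    "https://example.com/index.html",
]

for url in samples:
    print(url, should_download(url))
```

Note that the character class is loose: it accepts any extension built only from the letters j, p, e, g, n, i and f, so something like ".pig" would pass too. For a page that only links to images this is usually harmless.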

The trick here is that this page links only to thumbnails, but luckily each thumbnail URL contains the full-size image URL as its leading part. So you just have to trim those thumbnail URLs using Custom Parsers like this:

Expression: src="(http[^"]+)/revision/[^"]+"
Result: [#1]

This regular expression captures the part of the URL that comes before the "/revision/" text.
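
To see what the expression pulls out, here is a small Python sketch. It assumes that `[#1]` in the Result field refers to the first capture group, and that Python's `re` module treats this pattern the same way the program does. The HTML snippet and the `img.example.com` host are invented samples, not taken from the actual page.

```python
import re

# Expression quoted from the post above; group 1 is the URL before "/revision/".
PARSER = re.compile(r'src="(http[^"]+)/revision/[^"]+"')

# Hypothetical page fragment: one thumbnail link and one unrelated image.
html = (
    '<img src="https://img.example.com/images/a/ab/Batman_1.jpg'
    '/revision/latest/scale-to-width-down/120?cb=1">'
    '<img src="https://img.example.com/logo.png">'
)

# findall returns the contents of group 1 for every match.
full_size_urls = PARSER.findall(html)
print(full_size_urls)
# -> ['https://img.example.com/images/a/ab/Batman_1.jpg']
```

The second image is skipped because its URL has no "/revision/" part, so only thumbnail-style links produce a full-size URL.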

That's it.
