Fine Art America - fineartamerica.com

jojo1000
Posts: 3
Joined: 25 Aug 2024, 20:59

Post by jojo1000 »

Could an expert kindly create a script for this awesome website please?
Maksym
Site Admin
Posts: 2230
Joined: 02 Mar 2009, 17:02

Re: Fine Art America - fineartamerica.com

Post by Maksym »

Have you tried the built-in generic templates? If you want the images of a single product, use the "Download all images from a single-page gallery" template. If you want everything from this website, use the "Download all images from entire website" template. You can apply filters and limits to avoid downloading thumbnails and website design elements.
jojo1000
Posts: 3
Joined: 25 Aug 2024, 20:59

Re: Fine Art America - fineartamerica.com

Post by jojo1000 »

Yes, thank you, I did try it. But it's downloading the lowest quality images available.

My goal is to download all images from let's say:
https://fineartamerica.com/art/photographs/roger+moore

The quality I would like the software to download is the one shown on pages like these:
https://fineartamerica.com/featured/spa ... chive.html
https://fineartamerica.com/featured/sha ... -ruck.html

How can I make this possible?
Maksym
Site Admin
Posts: 2230
Joined: 02 Mar 2009, 17:02

Re: Fine Art America - fineartamerica.com

Post by Maksym »

Well, in this case, you'll need to use advanced configuration options. It's not going to be easy if you don't know a little bit of HTML. You need to limit the exploration only to the pages that you need. So, first of all, make sure you have [ Entire website ] selected in the [ Regular site ] section of the project properties. Then exclude all the pages of the website by adding

^https?://(www\.)?fineartamerica\.com

to the [ Excluded URLs ].

And then you'll have to add filters to the [ Included URLs ] that will allow the software to crawl only the pages with the full-size photos. Looks like adding

/featured/

is going to be enough. This will already do the job, and you'll get the full-size images (those that are shown on the website, not those that are hidden behind the "preview"). To get rid of the smaller-resolution images and website assets, you can also add the following filters to the [ Excluded URLs ]:

/assets/
/400/
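
To make the combined effect of these filters easier to picture, here is a rough Python sketch. It is not how the software works internally. The patterns are the ones from this post, but the order in which they are checked (specific excludes first, then includes, then the blanket exclude) is only an assumption, and the example URLs other than the domain are made up.

```python
import re

# Patterns copied from the post above.
EXCLUDE_ALL = re.compile(r"^https?://(www\.)?fineartamerica\.com")
INCLUDED_PARTS = ["/featured/"]
EXCLUDED_PARTS = ["/assets/", "/400/"]


def would_be_kept(url: str) -> bool:
    """Guess whether a URL survives the filters.

    Assumed precedence (not confirmed by the software's documentation):
    specific excludes win, then includes, then the blanket exclude.
    """
    if any(part in url for part in EXCLUDED_PARTS):
        return False
    if any(part in url for part in INCLUDED_PARTS):
        return True
    # Anything else on the site is caught by the blanket exclude.
    return EXCLUDE_ALL.search(url) is None


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for url in [
        "https://fineartamerica.com/featured/example-print.html",
        "https://fineartamerica.com/some/other/page.html",
        "https://fineartamerica.com/assets/logo.png",
        "https://fineartamerica.com/featured/400/example-thumb.jpg",
    ]:
        print(url, would_be_kept(url))
```

If the real precedence turns out to be different, only the order of the checks in would_be_kept needs to change.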

And now, for the final touches, you can open the HTML source of the pages that you need to crawl and exclude the parts of those pages that do not contain the information you need by adding them to the [ Excluded Page Parts ]. That's what I came up with:

From1=
To1=<div class='leftdiv'

From2=<div class='rightdiv
To2=

From3=
To3=<div id='imageFlowContainerDiv'

From4=<div id='searchEngineFooterDiv'>
To4=
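
For anyone wondering what a From/To pair does to a page, here is a small Python sketch of the idea. It rests on assumptions about the behavior: an empty From is taken to mean "from the start of the page", an empty To means "to the end of the page", and the cut stops just before the To marker. The function name and the sample HTML are made up for the illustration.

```python
def exclude_page_parts(html: str, rules: list[tuple[str, str]]) -> str:
    """Cut out every span described by a (from_marker, to_marker) pair.

    Assumptions: an empty from_marker means the start of the page, an empty
    to_marker means the end of the page, and the to_marker itself is kept.
    """
    for start, end in rules:
        begin = html.find(start) if start else 0
        if begin == -1:
            continue  # marker not on this page, rule does not apply
        if end:
            stop = html.find(end, begin + len(start))
            if stop == -1:
                continue
        else:
            stop = len(html)
        html = html[:begin] + html[stop:]
    return html


# Hypothetical page, using the first two rule pairs from the post.
page = (
    "<html><nav>menu</nav>"
    "<div class='leftdiv'><img src='full.jpg'></div>"
    "<div class='rightdiv'>related items</div></html>"
)
rules = [
    ("", "<div class='leftdiv'"),   # From1 / To1
    ("<div class='rightdiv", ""),   # From2 / To2
]
print(exclude_page_parts(page, rules))
```

With these two rules only the leftdiv block is left, so the crawler would see the full-size image and nothing around it.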
jojo1000
Posts: 3
Joined: 25 Aug 2024, 20:59

Re: Fine Art America - fineartamerica.com

Post by jojo1000 »

Wow, your reply by itself seemed like a training session. Thank you so much.

I'll try all of this and respond to you.

Thanks again.