'use full URL for file name' not working

theuntakenname
Posts: 10
Joined: 28 Feb 2023, 07:36

Post by theuntakenname » 28 Feb 2023, 07:41

I have ticked 'use full URL for file name', but it only saves the image filename, not the URL.
(This makes it impossible for me to match up the images I receive with my other data.)

Also, 'create a sub folder for each starting address' only uses the root domain as the starting address; it should use the actual starting address (including sub folders).

I have bought the software, so I am hoping to get it to work as expected.

any help appreciated.

thanks

Matt
Last edited by theuntakenname on 02 Mar 2023, 06:54, edited 1 time in total.

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: 'use full URL for file name' not working

Post by Maksym » 28 Feb 2023, 12:30

theuntakenname wrote: I have ticked 'use full URL for file name', but it only saves the image filename, not the URL.

I created a project with the following Starting URL:

https://webimagedownloader.com/screenshots/batch-image-downloader.png

Selected [ Use full original URL for file name ] and Extreme Picture Finder downloaded the image and saved it with the following file name:

https-webimagedownloader.com-screenshots-batch-image-downloader.png

Which is the exact full image URL with the following symbols replaced with a dash:

: / \

Those symbols are not allowed in Windows file names.
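
The replacement described above can be sketched in a few lines of Python. This is only an illustration, not Extreme Picture Finder's actual code: the function name `url_to_file_name` is hypothetical, and collapsing a run of consecutive symbols (such as `://`) into a single dash is an assumption inferred from the example file name shown above.

```python
import re


def url_to_file_name(url: str) -> str:
    """Replace each run of ':', '/' or '\\' with a single dash.

    Collapsing runs into one dash is an assumption based on the
    example file name in the post, not documented behaviour.
    """
    return re.sub(r"[:/\\]+", "-", url)


url = "https://webimagedownloader.com/screenshots/batch-image-downloader.png"
print(url_to_file_name(url))
```

For the Starting URL above, this prints the same file name that the program produced.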

theuntakenname wrote: also 'create a sub folder for each starting address' only uses the root domain as the starting address, it should use the actual starting address (including sub folders)

No, it shouldn't. This is exactly how this option is supposed to work. Previously, when you selected the sub-folder for each starting address option, the sub-folder names consisted only of URL numbers, like this:

00000
00001
...

The documentation still says so (this must be fixed).

theuntakenname wrote: I have bought the software, so I am hoping to get it to work as expected. any help appreciated.

So, the software does indeed work as expected. You just expected something different, I suppose. Now, if you tell me exactly what you need, with actual URL examples, I can try to help. Regular Expressions usually help a lot in such situations. If you do not want to share your URLs and project details here, send the information to support@exisoftware.com.

theuntakenname
Posts: 10
Joined: 28 Feb 2023, 07:36

Re: 'use full URL for file name' not working

Post by theuntakenname » 28 Feb 2023, 13:22

I use these starting addresses:

https://www.suninternational.com/grandwest/
https://www.suninternational.com/sun-city/

The results are below: two folders and the files in each.


How can I know which folder is which starting address?

The files do not seem to have the full URL in the filename.



Folders:

00007-www.suninternational.com
00008-www.suninternational.com

Files:

gran-city-lodge.jpg.sunimage.800.400.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.480.350.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.750.525.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.767.350.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.970.525.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.1200.525.jpg
gran-kids-fun-packs-november-2022.jpg.sunimage.1400.525.jpg
GRANa0001-0376-grandwest-exterior-evening-.jpg.sunimage.600.315.jpg
GRANb1004-grandwest-exterior-day.jpg.sunimage.800.400.jpg
grandwest-cape-town-fish-market-outside.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
grandwest-casino-slots-area-0386.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
grandwest-conference-5620.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
si-mvg-lost-card-december-2022.jpg.sunimage.750.525.jpg
si-mvg-lost-card-december-2022.jpg.sunimage.970.525.jpg
si-mvg-lost-card-december-2022.jpg.sunimage.1200.525.jpg
si-mvg-lost-card-december-2022.jpg.sunimage.1400.525.jpg
si-slots-royale-2023-updated.jpg.sunimage.480.350.jpg
si-slots-royale-2023-updated.jpg.sunimage.750.525.jpg
si-slots-royale-2023-updated.jpg.sunimage.767.350.jpg
si-slots-royale-2023-updated.jpg.sunimage.970.525.jpg
si-slots-royale-2023-updated.jpg.sunimage.1200.525.jpg
si-slots-royale-2023-updated.jpg.sunimage.1400.525.jpg


scvc-march-sales-aviary_1920x525.jpg.sunimage.1400.525.jpg
golf-clubs.jpg.sunimage.735.338.jpg
pala-exterior-night-signature-shot.jpg.sunimage.480.315.jpg
pala-exterior-night-signature-shot.jpg.sunimage.767.315.jpg
pala-exterior-night-signature-shot.jpg.sunimage.991.525.jpg
pala-exterior-night-signature-shot.jpg.sunimage.1400.525.jpg
sc-family-traditions.jpg.sunimage.449.206.jpg
sc-family-traditions.jpg.sunimage.735.338.jpg
sc-salon-prive-2.jpg.sunimage.449.206.jpg
sc-salon-prive-2.jpg.sunimage.735.338.jpg
scr5d0050-sun-city-sun-central-childrens-play-area-xs-5035-1.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
scvc-march-sales-aviary_1920x525.jpg.sunimage.480.350.jpg
scvc-march-sales-aviary_1920x525.jpg.sunimage.750.525.jpg
scvc-march-sales-aviary_1920x525.jpg.sunimage.767.350.jpg
scvc-march-sales-aviary_1920x525.jpg.sunimage.970.525.jpg
scvc-march-sales-aviary_1920x525.jpg.sunimage.1200.525.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.480.350.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.750.525.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.767.350.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.970.525.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.1200.525.jpg
scvc-march-sales-lefika_1920x525.jpg.sunimage.1400.525.jpg
sun-city-aerial-gpcc-resort-soho-web.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
sun-city-casino-prive slots-blurred.jpg.sunimage.800.1000.jpg.sunimage.800.400.jpg
Sun-City-exterior-evening.jpg.sunimage.600.315.jpg
sun-city-phase-2-pool-web.jpg.sunimage.480.315.jpg
sun-city-phase-2-pool-web.jpg.sunimage.767.315.jpg
sun-city-phase-2-pool-web.jpg.sunimage.991.525.jpg
sun-city-phase-2-pool-web.jpg.sunimage.1400.525.jpg
sunr-cashless-web-june-2022.png.sunimage.750.525.jpg
sunr-cashless-web-june-2022.png.sunimage.767.350.jpg
sunr-cashless-web-june-2022.png.sunimage.970.525.jpg
sunr-cashless-web-june-2022.png.sunimage.1200.525.jpg
sunr-cashless-web-june-2022.png.sunimage.1400.525.jpg
sunr-sunscapes-june-2022.jpg.sunimage.480.350.jpg
sunr-sunscapes-june-2022.jpg.sunimage.750.525.jpg
sunr-sunscapes-june-2022.jpg.sunimage.767.350.jpg
sunr-sunscapes-june-2022.jpg.sunimage.970.525.jpg
sunr-sunscapes-june-2022.jpg.sunimage.1200.525.jpg
sunr-sunscapes-june-2022.jpg.sunimage.1400.525.jpg

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: 'use full URL for file name' not working

Post by Maksym » 28 Feb 2023, 13:46

The number in the sub-folder name corresponds to the number of the Starting URL in the project. So, it looks like

https://www.suninternational.com/grandwest/

is the 7th URL in the list of Starting URLs in your project and

https://www.suninternational.com/sun-city/

is the 8th.
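
To make the naming scheme concrete, here is a minimal Python sketch of how such sub-folder names could be built and read back. It is an illustration only: the function names `subfolder_name` and `parse_subfolder` are hypothetical, and the five-digit zero-padded number followed by the host name is inferred from the folder names posted above, not taken from the program's source.

```python
import re
from urllib.parse import urlsplit


def subfolder_name(number: int, starting_url: str) -> str:
    """Build '<5-digit number>-<host>' from a Starting URL and its list number."""
    host = urlsplit(starting_url).netloc
    return f"{number:05d}-{host}"


def parse_subfolder(name: str) -> tuple[int, str]:
    """Split a sub-folder name back into (number, host)."""
    match = re.fullmatch(r"(\d{5})-(.+)", name)
    if match is None:
        raise ValueError(f"unexpected sub-folder name: {name!r}")
    return int(match.group(1)), match.group(2)


print(subfolder_name(7, "https://www.suninternational.com/grandwest/"))
print(parse_subfolder("00008-www.suninternational.com"))
```

With a helper like `parse_subfolder`, the number could be used to look up the matching line in a list of Starting URLs, provided the numbering is reliable.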

Now, if all your URLs are like these (they consist of a domain name and one folder name), then you can switch to the [ Expert ] tab, check the [ Use target file parent URL Regular Expression to create sub-folder ] box and add the following Regular Expression:

Expression: //([^/]+)/([^/]+)/$
Result: [#1]-[#2]

This will create the following folder names:

www.suninternational.com-grandwest
www.suninternational.com-sun-city

You can use any other character or word instead of "-" in the Result part, but this symbol must be a legal Windows file name character.
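
The expression can be tried outside the program with a short Python sketch. This assumes that the program's Regular Expression engine treats this pattern the same way Python's `re` module does and that `[#1]` and `[#2]` stand for the first and second capture groups; the function name `folder_from_parent_url` is hypothetical.

```python
import re
from typing import Optional

# The expression from the post: host name and exactly one folder, then end of URL.
PARENT_URL_RE = re.compile(r"//([^/]+)/([^/]+)/$")


def folder_from_parent_url(url: str, separator: str = "-") -> Optional[str]:
    """Return '<group 1><separator><group 2>', or None if the URL does not match."""
    match = PARENT_URL_RE.search(url)
    if match is None:
        return None
    return f"{match.group(1)}{separator}{match.group(2)}"


print(folder_from_parent_url("https://www.suninternational.com/grandwest/"))
print(folder_from_parent_url("https://www.suninternational.com/sun-city/"))
# A URL with no folder, or with more than one folder, does not match:
print(folder_from_parent_url("https://www.suninternational.com/"))
```

Because of the `$` anchor, only URLs of the form domain plus one folder produce a name; other URL shapes would need their own expression.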

As for the file names: the full names simply do not fit within the maximum length Windows allows, which is 256 characters. The full file path (Drive:\Folders\FileName) must be 256 characters or less. So, please use something like:

C:\EPF\

as a [Destination folder] for your project and Extreme Picture Finder will not have to truncate the file names to make them fit the character limit.
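
A quick way to check in advance whether a full-URL file name would fit is sketched below in Python. The 256-character limit is the figure quoted in this post; the function name `fits_in_path_limit` and the long example folder are hypothetical, and the sketch does not reproduce how Extreme Picture Finder itself truncates names.

```python
from pathlib import PureWindowsPath

MAX_PATH_CHARS = 256  # limit quoted in the post above


def fits_in_path_limit(destination: str, file_name: str,
                       limit: int = MAX_PATH_CHARS) -> bool:
    """Return True if Drive:\\Folders\\FileName is within the character limit."""
    full_path = PureWindowsPath(destination) / file_name
    return len(str(full_path)) <= limit


name = "https-webimagedownloader.com-screenshots-batch-image-downloader.png"
print(fits_in_path_limit("C:\\EPF\\", name))
# A deeply nested destination folder leaves less room for the file name:
print(fits_in_path_limit("C:\\" + "very-long-folder-name\\" * 10, name))
```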

theuntakenname
Posts: 10
Joined: 28 Feb 2023, 07:36

Re: 'use full URL for file name' not working

Post by theuntakenname » 28 Feb 2023, 17:29

thanks for your response

If the line number of the starting URL were correct, that would solve my problem. However, my results look like this: multiple folders with the same number, not related to the starting URLs.

I have 20 000 starting URLs. Is this too many? Maybe it is causing the strange results?


00001-thehoughtonhotel.com
00001-wildwatersboksburg.co.za
00001-www.arebbusch.com
00001-www.avanihotels.com
00001-www.crestahotels.com
00001-www.csiricc.co.za
00001-www.kievitskroon.co.za
00001-www.legacyhotels.co.za
00001-www.meikleshotel.com
00001-www.ngwenya.co.za
00001-www.saltrockbeach.co.za
00001-www.santosexpress.co.za
00001-www.serenahotels.com
00001-www.southernsun.com
00001-www.thebeachhotel.co.za
00001-www.thecountryclub.co.za
00001-www.tresjolie.co.za
00194-www.marriott.com
00197-happyrhino.co.za
00199-www.legacyhotels.co.za
00202-www.belmond.com
00205-www.radissonhotels.com
00208-www.kznwildlife.com
00209-www.krystalbeach.co.za
00210-thegardenvenue.co.za
00212-natalia.co.za
00213-www.hazyviewcabanas.co.za
00217-www.southernsun.com
00226-www.avianto.co.za
00231-www.marriott.com
00233-www.marriott.com
00235-www.marriott.com
00236-www.makiti.co.za
00238-www.montagusprings.co.za
00239-www.radissonhotels.com
00241-www.southernsun.com
00248-www.southernsun.com
00250-oceanbasket.co.za
00252-www.arnistonhotel.com
00253-www.radissonhotels.com
00254-www.radissonhotels.com
00256-anewhotels.com
00258-www.southernsun.com
00260-www.firstgroup-sa.co.za
00262-www.dikhololo.co.za
00267-www.southernsun.com
00269-www.southernsun.com
00270-www.firstgroup-sa.co.za
00271-eilandspa.co.za
00272-www.southernsun.com
00273-bushtavern.com
00275-www.southernsun.com
00283-www.pepperclub.co.za
00285-www.royalelephant.co.za
00286-www.reefhotel.co.za
00289-www.marriott.com
00290-kloofendalfriends.org.za
00292-www.fairview.co.za
00293-www.jhbcityparksandzoo.com
00295-www.southernsun.com
03633-www.vangaalen.co.za
09941-www.crystalsprings.co.za


The URLs are in different formats, so the regular expression approach only handles a subset of results.

Any of these would solve it:
the destination folder structure could include the sub-folder of the starting URL
or
the domain prefix on the filename could contain the sub-folders of the starting URL
or
the number on the folder could accurately match the starting URL line number

Is there a way to configure any of these solutions via regular expression?

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: 'use full URL for file name' not working

Post by Maksym » 28 Feb 2023, 17:49

Hmm. This does look like a bug. I'll see what can be done here.

What about the file names? Do they fit with a shorter folder name?

theuntakenname
Posts: 10
Joined: 28 Feb 2023, 07:36

Re: 'use full URL for file name' not working

Post by theuntakenname » 28 Feb 2023, 18:05

The filenames include the root domain, which is good (unfortunately the sub-folder is missing), and the filenames are not being truncated, so all is OK on that issue.

If the folder number prefix could be fixed to match the starting URL line number, it would be a huge win.

thanks

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: 'use full URL for file name' not working

Post by Maksym » 28 Feb 2023, 18:24

It will be done.

Maksym
Site Admin
Posts: 2077
Joined: 02 Mar 2009, 17:02

Re: 'use full URL for file name' not working

Post by Maksym » 01 Mar 2023, 14:17

OK, there is definitely a bug with the numbers. Can you send me a part of your URL list (1000 URLs will be enough) for the tests please? support@exisoftware.com

theuntakenname
Posts: 10
Joined: 28 Feb 2023, 07:36

Re: 'use full URL for file name' not working

Post by theuntakenname » 01 Mar 2023, 20:06

ok thanks

I have emailed 1000-URL.CSV

looking forward to the fix, can you estimate when it may be fixed?

Matt
