I am using Custom Parsers to generate URLs, e.g. from
"https://example.com/images/text1/101/filename.jpeg"
to
"https://example.com/images/text2/101/filename.jpeg"
It works fine when I test it,
but when I run the project, instead of giving the URL
"https://example.com/images/text2/101/filename.jpeg"
it gives
"https://example.com/images/text2/101/fi ... ed%20at%20............. " (very long url)
and sometimes
"https://example.com/images/text2/filename.jpeg'/%3E"
So I think I need to use Excluded Page Parts.
Can someone guide me on that?
Help with Custom Parsers
-
- Site Admin
- Posts: 2304
- Joined: 02 Mar 2009, 17:02
Re: Help with Custom Parsers
I can try to help you if you show me your Custom Parsers and give me a piece of the HTML code where those parsers are supposed to work.
-
- Posts: 3
- Joined: 17 Jan 2025, 10:26
Re: Help with Custom Parsers
I wanted to generate the link this way:
https://imgtaxi.com/images/small/2014/1 ... 4698f.jpeg
to
https://imgtaxi.com/images/big/2014/11/ ... 4698f.jpeg
Notice the change from "small" to "big".
So I configured the parser:
Regular expression: (https://imgtaxi\.com/images/)(small)(/.*)
Result: [#1]big[#3]
with none of the boxes checked.
It works as expected during the test run, but when I run the project it does what I mentioned earlier.
Let me know if you find a solution.
Thanks in advance
-
- Site Admin
- Posts: 2304
- Joined: 02 Mar 2009, 17:02
Re: Help with Custom Parsers
Most likely your Custom Parser works for you in the tests only because you test it with the URLs and not with the page source. Extreme Picture Finder applies the Custom Parsers to the entire HTML source of the downloaded page after it was processed by the [ Excluded Page Parts ]. And that's exactly how you should test your Custom Parsers:
1. First, open the HTML source of the page in your browser ([Ctrl + U] in most modern browsers), select the entire HTML source, and copy it to the Windows Clipboard.
2. If you have any [ Excluded Page Parts ] in your project, paste the original HTML source into the [ Excluded Page Parts ] tester, click the [ Test Regular Expressions ] button and copy the entire resulting text from the [ Resulting page text ] tab. If your project does not have any [ Excluded Page Parts ], skip this step and proceed to step 3.
3. Open the Custom Parsers tester and paste the HTML source from step 1 or step 2 into the [ Page source ] tab.
Now, to your Regular Expressions. I suggest you avoid ".*" whenever you can. This construct is too "greedy" and will easily match far more of the page text than you intended. I prefer using a "negative character set" that allows me to match any number of characters until one of the characters from the given set is reached. For example, if I want to match a full URL inside any HTML tag's attribute I do the following:
    ="(https://[^"<>]+)"
And when the page source has a text like this:
    <a href="https://example.com/folder/file.jpg">
or this:
    <img src="https://example.com/folder/file.jpg">
my Regular Expression match will give me this:
    https://example.com/folder/file.jpg
So, you can read the pattern [^"<>]+ like this: match one or more characters until you meet a ", <, or > character.
So, when I need to replace a part of an existing address to create a new one (that's your situation), I do the following:
Regular Expression:
    ="(https?://imgtaxi\.com/images/)small(/[^"<>]+)"
Result:
    [#1]big[#2]
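To see the difference outside the program, here is a small Python sketch that emulates both parsers. It is only an approximation: Python's re module stands in for the program's own regular expression engine, the Result placeholders [#1] and [#3] are written as \g<1> and \g<3>, and the PAGE fragment with its path and file name is a made-up sample, not the real imgtaxi.com markup.

```python
import re

# Hypothetical page fragment; the real imgtaxi.com HTML may look different.
PAGE = ('<a href="/view/1"><img src="https://imgtaxi.com/images/small/'
        '2020/01/sample.jpeg" alt="uploaded at noon"></a>')
BARE_URL = 'https://imgtaxi.com/images/small/2020/01/sample.jpeg'

# The parser from the question: "(/.*)" keeps matching to the end of the line.
GREEDY = re.compile(r'(https://imgtaxi\.com/images/)(small)(/.*)')
# The suggested parser: the match stops at the closing quote of the attribute.
BOUNDED = re.compile(r'="(https?://imgtaxi\.com/images/)small(/[^"<>]+)"')


def build_urls(pattern, template, source):
    """Build one URL per match, the way a parser's Result field would."""
    return [m.expand(template) for m in pattern.finditer(source)]


# Tested against a bare URL, the greedy parser looks correct.
bare_result = build_urls(GREEDY, r'\g<1>big\g<3>', BARE_URL)
# Tested against page source, it drags the rest of the tag into the URL.
greedy_result = build_urls(GREEDY, r'\g<1>big\g<3>', PAGE)
# The bounded parser returns only the address itself.
bounded_result = build_urls(BOUNDED, r'\g<1>big\g<2>', PAGE)

for label, urls in (('bare', bare_result), ('greedy', greedy_result),
                    ('bounded', bounded_result)):
    print(label, urls)
```

The greedy result carries the alt text and closing tags along with the address, which would explain a very long URL full of %20 once the program encodes it.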
-
- Posts: 3
- Joined: 17 Jan 2025, 10:26
Re: Help with Custom Parsers
I followed the steps you mentioned to test the Custom Parsers.
The tester is not giving any result, no processing or anything.
Can it be fixed?
The same issue persists with your suggested RegEx.
-
- Site Admin
- Posts: 2304
- Joined: 02 Mar 2009, 17:02
Re: Help with Custom Parsers
What is the URL you are trying to download from?
-
- Site Admin
- Posts: 2304
- Joined: 02 Mar 2009, 17:02
Re: Help with Custom Parsers
The actual Regular expression depends on the HTML source. So, you need to study that source to create a working Custom Parser. The one that I showed you was just an example - I haven't seen the actual HTML source.
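One thing worth checking in the source is the quote style. The broken URL reported earlier ended in '/%3E, which is '/> after URL encoding, so the page may wrap its attributes in single quotes; that is a guess, since the actual HTML has not been posted. If so, a pattern that expects =" will find nothing. Below is a Python sketch of a variant that accepts either quote. Python's re module is only a stand-in for the program's engine, and both sample tags are hypothetical.

```python
import re

# Assumption: the attribute value may be wrapped in either quote style.
# The groups keep the same numbers, so the Result would stay [#1]big[#2].
PATTERN = re.compile(
    r'''=["'](https?://imgtaxi\.com/images/)small(/[^"'<>]+)["']''')


def to_big(source):
    """Return the rewritten URL for every matching attribute in source."""
    return [m.group(1) + 'big' + m.group(2) for m in PATTERN.finditer(source)]


# Hypothetical tags, one per quote style.
single_quoted = "<img src='https://imgtaxi.com/images/small/2020/01/sample.jpeg'/>"
double_quoted = '<img src="https://imgtaxi.com/images/small/2020/01/sample.jpeg">'

print(to_big(single_quoted))
print(to_big(double_quoted))
```

The single quote is also added to the negative character set so the match cannot run past the end of a single-quoted attribute.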
-
- Site Admin
- Posts: 2304
- Joined: 02 Mar 2009, 17:02
Re: Help with Custom Parsers
By the way, there is a template that can download from imgtaxi.com galleries and users:
imgtaxi.com downloader