themaLeecher
http://leecher.themasoftware.com/forum/

Extracting and removing images.
http://leecher.themasoftware.com/forum/viewtopic.php?f=3&t=6733
Page 1 of 1

Author:  myfear [ January 24th, 2024, 11:29 pm ]
Post subject:  Extracting and removing images.

Hi,

I was wondering if it is possible to extract image URLs within posts. The leeched images may not be wanted, or may be watermarked and not reusable (the images after the cover).

I tried using the hosts function to extract but it did not work.

Example - I wish to remove any images contained within a URL structure:

Code:
[url=https://4.bp.blogspot.com/-HzyYF8LveOU/XxMBJkfqWnI/AAAAAAAAEKo/wPBRwfzmozo4vGqbqLRMvg2bXP-cg9DDgCLcBGAsYHQ/s1600/9a.jpg]

[img]https://4.bp.blogspot.com/-HzyYF8LveOU/XxMBJkfqWnI/AAAAAAAAEKo/wPBRwfzmozo4vGqbqLRMvg2bXP-cg9DDgCLcBGAsYHQ/s1600/9a.jpg[/img][/url]

[url=https://3.bp.blogspot.com/-ciLr7-lwgHc/XxMBJppQWTI/AAAAAAAAEKk/vTE8naEOuSs8l2uleJW-jSk-Mt2kTVujQCLcBGAsYHQ/s1600/9b.jpg]

[img]https://3.bp.blogspot.com/-ciLr7-lwgHc/XxMBJppQWTI/AAAAAAAAEKk/vTE8naEOuSs8l2uleJW-jSk-Mt2kTVujQCLcBGAsYHQ/s1600/9b.jpg[/img][/url]
[/center]


Thanks. I tried adding the blogspot URL under hosts for extraction, planning to then remove the links within the extraction area, but it didn't pick them up.

It is a bit more complex, though, as I ideally want to keep the cover image (the first image) and remove every image after it. I'm sure it's not straightforward, or maybe not even possible, but I thought I'd ask.

Thanks so much!

Author:  Freddy [ January 25th, 2024, 9:24 am ]
Post subject:  Re: Extracting and removing images

Hi,

In "HOSTS" -> add https://blogspot.com with type "For extracting".

In "Settings" -> "Links" -> "When extracting links ignore" -> disable "Images links".

Extraction works fine then. Tested.

To keep the first image you would need to copy its URL to the right area; the first image will then be kept and the others removed.
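
For anyone who wants to check outside the program which image links a host rule like this would catch, here is a minimal Python sketch. It is not how themaLeecher does the extraction internally; the function name extract_image_urls and the shortened example URLs are made up for illustration. It collects the URLs inside [img] tags and keeps those whose hostname is blogspot.com or a subdomain of it.

```python
import re
from urllib.parse import urlparse

# Matches the URL between [img] and [/img] in BBCode.
IMG_TAG = re.compile(r"\[img\](.*?)\[/img\]", re.IGNORECASE | re.DOTALL)

def extract_image_urls(bbcode, host="blogspot.com"):
    """Return [img] URLs whose hostname is `host` or a subdomain of it."""
    urls = []
    for match in IMG_TAG.finditer(bbcode):
        url = match.group(1).strip()
        hostname = (urlparse(url).hostname or "").lower()
        if hostname == host or hostname.endswith("." + host):
            urls.append(url)
    return urls

# Hypothetical, shortened post content (not the real URLs from the thread).
post = (
    "[url=https://4.bp.blogspot.com/a/s1600/9a.jpg]"
    "[img]https://4.bp.blogspot.com/a/s1600/9a.jpg[/img][/url]\n"
    "[img]https://example.org/cover.jpg[/img]"
)
print(extract_image_urls(post))
```

Matching on the parsed hostname rather than on a substring avoids catching a URL that merely contains "blogspot.com" somewhere in its path.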

Author:  myfear [ January 25th, 2024, 10:39 am ]
Post subject:  Re: Extracting and removing images.

Oh wow amazing, you have really thought of everything Freddy!

I just have one last ask: I don't suppose you think this could work, and could provide a regex to replace the first occurrence of .jpg without changing the rest?

I am thinking I could obscure the first occurrence of an image's .jpg as .xxx (or something else, as long as it is unique), then remove all image links, and then use replacements to find the obscured image URL and reintroduce the first image. What do you think? That would save manually copying and re-adding each one.

The cover image is always the first. It is similar to how you helped me with the brackets regex, except that there I need to keep reapplying it a few times depending on how many links I have; maybe a similar technique could be used here so that it only replaces the first .jpg with .xxx or something?

Then I can extract, remove all images, then find .xxx and replace it back to .jpg. (Well, that's the plan :) )
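
A rough Python sketch of that plan, as an illustration only: it does not use themaLeecher's own replacement rules, and the name keep_first_image, the .xxx placeholder and the example.org URLs are assumptions. One detail it works around: when the cover is wrapped in a [url=...] tag, the very first .jpg in the post is inside the link, not the image, so the sketch obscures the first .jpg that sits directly before [/img].

```python
import re

PLACEHOLDER = ".xxx"  # hypothetical marker; must not occur elsewhere in the post

# A bare or [url]-wrapped image whose [img] URL ends in .jpg.
IMAGE_BLOCK = re.compile(
    r"(?:\[url=[^\]]*\]\s*)?\[img\][^\[\]]*?\.jpg\[/img\](?:\s*\[/url\])?\s*",
    re.IGNORECASE,
)

def keep_first_image(bbcode):
    # 1. Obscure only the first ".jpg" that directly precedes [/img];
    #    count=1 makes re.sub stop after one replacement.
    text = re.sub(r"\.jpg(?=\[/img\])", PLACEHOLDER, bbcode,
                  count=1, flags=re.IGNORECASE)
    # 2. Remove every image block that still ends in .jpg.
    text = IMAGE_BLOCK.sub("", text)
    # 3. Restore the obscured extension (always written back as lowercase .jpg).
    return text.replace(PLACEHOLDER + "[/img]", ".jpg[/img]", 1)

# Hypothetical post content.
post = (
    "[img]https://example.org/cover.jpg[/img]\n"
    "[url=https://example.org/9a.jpg][img]https://example.org/9a.jpg[/img][/url]\n"
    "[img]https://example.org/9b.jpg[/img]\n"
    "Some text"
)
print(keep_first_image(post))
```

The key piece is the count=1 argument: it limits the replacement to the first match, so nothing has to be reapplied several times.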

Author:  Freddy [ January 25th, 2024, 4:55 pm ]
Post subject:  Re: Extracting and removing images.

It would be easier to add a "Remove elements" selector that removes all images except the first.

To test, I would need the URL of one post they are leeched from.

The same selector could later be used for other sites as well.

Author:  myfear [ January 25th, 2024, 8:07 pm ]
Post subject:  Re: Extracting and removing images.

Amazing,


Can't thank you enough mate.

Author:  Freddy [ January 26th, 2024, 9:11 am ]
Post subject:  Re: Extracting and removing images.

I only see some preview images at the bottom, which can be removed with this "Remove elements" selector for this site:

Go to "WEBSITES" -> select that website -> "Selectors" tab -> add these selectors:

Remove elements (one per line):
Code:
div.separator


Note that when you add a custom selector the default ones are ignored, so for this site you need to add some of the defaults back as well:

Remove elements (one per line):
Code:
*[style~=display: ?none]
div#comments
div#related-posts


This will be different for each site (usually there is no need to add the defaults back). You can remove other images with a "Remove elements" selector too; you just need to look in the HTML source to see which sections to remove.
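
To see what removing those sections does to a page, here is a small Python sketch using only the standard library. It is not themaLeecher's selector engine: the should_remove function is a hand-written stand-in for the four selectors above, the class and function names are made up, and the depth counting assumes reasonably well-formed HTML (every non-void tag is closed). Comments and doctype declarations are simply dropped.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}

def should_remove(tag, attrs):
    """Stand-in for: div.separator, div#comments, div#related-posts, display:none."""
    a = dict(attrs)
    if tag == "div":
        if "separator" in (a.get("class") or "").split():
            return True
        if a.get("id") in {"comments", "related-posts"}:
            return True
    style = (a.get("style") or "").replace(" ", "").lower()
    return "display:none" in style

class ElementRemover(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a removed element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth += 1
        elif should_remove(tag, attrs):
            if tag not in VOID:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth and not should_remove(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

def remove_elements(html):
    parser = ElementRemover()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

# Hypothetical page fragment.
page = ('<div class="post"><img src="cover.jpg">'
        '<div class="separator"><a href="x"><img src="9a.jpg"></a></div>'
        '<p>Text</p><div id="comments"><p>hi</p></div>'
        '<span style="display: none">x</span></div>')
print(remove_elements(page))
```

In this fragment the cover image sits outside div.separator, so it survives while the preview image, the comments and the hidden span are dropped.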

Author:  myfear [ January 26th, 2024, 12:27 pm ]
Post subject:  Re: Extracting and removing images.

Worked perfectly, thanks so much!

I have also re-visited your replacement regex FAQ and realised your examples are perfect for a few things I needed.

From a learning perspective, thank you! You are very talented.

All times are UTC