 Post subject: Extracting and removing images.
PostPosted: January 24th, 2024, 11:29 pm 

Joined: January 10th, 2021, 12:46 pm
Posts: 26
Hi,

I was wondering if it is possible to extract image URLs within posts. The leeched images may not be wanted, or may be watermarked and not reusable (the images after the cover).

I tried using the hosts function to extract but it did not work.

Example - I wish to remove any images contained within a URL structure:

Code:
[url=https://4.bp.blogspot.com/-HzyYF8LveOU/XxMBJkfqWnI/AAAAAAAAEKo/wPBRwfzmozo4vGqbqLRMvg2bXP-cg9DDgCLcBGAsYHQ/s1600/9a.jpg]

[img]https://4.bp.blogspot.com/-HzyYF8LveOU/XxMBJkfqWnI/AAAAAAAAEKo/wPBRwfzmozo4vGqbqLRMvg2bXP-cg9DDgCLcBGAsYHQ/s1600/9a.jpg[/img][/url]

[url=https://3.bp.blogspot.com/-ciLr7-lwgHc/XxMBJppQWTI/AAAAAAAAEKk/vTE8naEOuSs8l2uleJW-jSk-Mt2kTVujQCLcBGAsYHQ/s1600/9b.jpg]

[img]https://3.bp.blogspot.com/-ciLr7-lwgHc/XxMBJppQWTI/AAAAAAAAEKk/vTE8naEOuSs8l2uleJW-jSk-Mt2kTVujQCLcBGAsYHQ/s1600/9b.jpg[/img][/url]
[/center]


Thanks, I tried adding blogspot URL in the extraction under hosts, with the plan to then remove them within the extraction area, but it didn't pick it up.

It is a bit more complex though, as I ideally want to keep the cover image (the first image) and remove every image after it. I am sure it's not straightforward, or even possible, but I thought I'd ask.

Thanks so much!
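
For readers who post-process the leeched text outside the program, the "keep the cover, drop the rest" idea can be sketched in Python as below. This is only an illustration, not a feature of the program: the function name `keep_first_image`, the pattern `LINKED_IMG`, and the example.com URLs are all hypothetical, and it assumes every image uses the [url=...][img]...[/img][/url] structure shown in the example above.

```python
import re

# One linked image: [url=...], optional whitespace, [img]...[/img][/url],
# plus any whitespace that follows the block.
LINKED_IMG = re.compile(
    r"\[url=[^\]]+\]\s*\[img\][^\[]+\[/img\]\[/url\]\s*",
    re.IGNORECASE,
)

def keep_first_image(bbcode: str) -> str:
    """Return the text with every linked-image block removed except the first."""
    seen = 0

    def repl(match: re.Match) -> str:
        nonlocal seen
        seen += 1
        # Keep the first block untouched, replace later ones with nothing.
        return match.group(0) if seen == 1 else ""

    return LINKED_IMG.sub(repl, bbcode)

# Hypothetical sample post with a cover and one extra image.
post = (
    "[url=https://example.com/cover.jpg]\n\n"
    "[img]https://example.com/cover.jpg[/img][/url]\n\n"
    "[url=https://example.com/9a.jpg]\n\n"
    "[img]https://example.com/9a.jpg[/img][/url]\n\n"
    "Some text"
)
result = keep_first_image(post)
print(result)
```

The sketch relies on the cover always being the first linked image in the post, as described above.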


 Post subject: Re: Extracting and removing images
PostPosted: January 25th, 2024, 9:24 am 
Site Admin

Joined: March 10th, 2011, 11:14 pm
Posts: 12646
Location: Earth
Hi,

In "HOSTS" -> add https://blogspot.com with type "For extracting".

In "Settings" -> "Links" -> "When extracting links ignore" -> disable "Images links".

Extraction then works fine. Tested.

To keep the first image you would need to copy its URL to the right area; then the first image will be kept and the others removed.

_________________
themaPoster | themaCreator | themaManager | themaLeecher | themaRegister


 Post subject: Re: Extracting and removing images.
PostPosted: January 25th, 2024, 10:39 am 

Joined: January 10th, 2021, 12:46 pm
Posts: 26
Oh wow amazing, you have really thought of everything Freddy!

I just have one last ask: I don't suppose you think this could work, and could provide a regex to replace the first occurrence of .jpg without changing the rest?

I am thinking I could obscure the first occurrence of an image extension, changing .jpg to .xxx (or something else, as long as it is unique), then remove all image links, and then use replacements to find the obscured image URL and re-introduce the first image. What do you think? That would avoid manually copying and re-adding each one individually.

The cover image is always the first. It is similar to how you helped me with the brackets regex, except that there I need to keep reapplying it a few times depending on how many links I have. Could a similar technique be used here so that it only replaces the first .jpg with .xxx or something?

Then I can extract, remove all images, then find .xxx and replace it back with .jpg. (Well, that's the plan :) )
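
A minimal Python sketch of that plan, for illustration only. The marker `.xxx`, the function names, and the example.com URLs are hypothetical, and whether the program's own replacement feature can limit a regex to the first match is not confirmed in this thread. Two variants are shown: one using a match-count limit, and one using a pattern anchored to the start of the text, which can only match once even in a "replace all" setting.

```python
import re

MARKER = ".xxx"  # hypothetical placeholder; any string that is unique in the post works

def obscure_first(text: str) -> str:
    # count=1 restricts the substitution to the first ".jpg" only.
    return re.sub(r"\.jpg", MARKER, text, count=1)

def obscure_first_anchored(text: str) -> str:
    # \A anchors at the very start of the text and (.*?) lazily consumes
    # everything up to the first ".jpg", so the pattern matches at most once.
    return re.sub(r"\A(.*?)\.jpg", r"\1" + MARKER, text, flags=re.DOTALL)

def restore(text: str) -> str:
    # Put the original extension back once the other image links are gone.
    return text.replace(MARKER, ".jpg")

text = "[img]https://example.com/cover.jpg[/img] [img]https://example.com/9a.jpg[/img]"
obscured = obscure_first(text)
print(obscured)
print(restore(obscured))
```

Note that in the [url=...][img]...[/img][/url] structure from the first post the cover URL appears twice (once in the url tag and once in the img tag), so both occurrences would need obscuring, for example with `count=2`.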


 Post subject: Re: Extracting and removing images.
PostPosted: January 25th, 2024, 4:55 pm 
Site Admin

Joined: March 10th, 2011, 11:14 pm
Posts: 12646
Location: Earth
It would be easier to add a "Remove elements" selector that removes all images except the first.

I would need one post URL where they are leeched from to test.

The same selector later could be used for other sites as well.

 Post subject: Re: Extracting and removing images.
PostPosted: January 25th, 2024, 8:07 pm 

Joined: January 10th, 2021, 12:46 pm
Posts: 26
Amazing,


Can't thank you enough mate.


Last edited by myfear on January 26th, 2024, 12:26 pm, edited 1 time in total.

 Post subject: Re: Extracting and removing images.
PostPosted: January 26th, 2024, 9:11 am 
Site Admin

Joined: March 10th, 2011, 11:14 pm
Posts: 12646
Location: Earth
I only see some preview images at the bottom which can be removed with this "Remove elements" selector for this site:

Go to "WEBSITES" -> select that website -> "Selectors" tab -> add these selectors:

Remove elements (one per line):
Code:
div.separator


Note that when you add a custom selector, the default ones are ignored, so for this site you need to add some of the defaults back as well:

Remove elements (one per line):
Code:
*[style~=display: ?none]
div#comments
div#related-posts


The selectors will be different for each site (usually there is no need to add the defaults back). You can remove other images with the "Remove elements" selector as well; you just need to look at the HTML source to see which sections to remove.
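
As a side note on the `*[style~=display: ?none]` selector above: the `?` suggests that the part after `~=` is treated as a regular expression in which the space is optional. That is an assumption, since the thread does not state how the program interprets it. Under that assumption, this small Python sketch shows which inline style values the pattern would match (the sample style strings are made up):

```python
import re

# Same pattern as in the selector: "display:" followed by an optional single space and "none".
HIDDEN = re.compile(r"display: ?none")

# Hypothetical inline style values as they might appear in a page's HTML source.
styles = ["display:none", "display: none;", "display:  none", "display: block"]

# search() looks for the pattern anywhere inside the attribute value.
matches = [bool(HIDDEN.search(s)) for s in styles]
print(matches)
```

The value with two spaces is not matched, because `?` allows at most one space.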

 Post subject: Re: Extracting and removing images.
PostPosted: January 26th, 2024, 12:27 pm 

Joined: January 10th, 2021, 12:46 pm
Posts: 26
Worked perfectly, thanks so much!

I have also re-visited your replacement regex FAQ and realised your examples are perfect for a few things I needed.

From a learning perspective, thank you! You are very talented.


All times are UTC