I would like the feature that deletes orphan pages to be available for assets as well.
How it works now
Currently, if you paste an image into a block, Logseq generates a file with a non-human-readable name, for instance, image_1646019226269_0.png. If I remove the image from the page, the image file remains in the assets folder. This makes me concerned about accumulating too many trash files.
Desired Behaviour
There could be a way to easily query or view the list of orphan assets and then delete them, either in a batch action or by selecting individual assets, as we can now for orphan pages.
+1 Right now, I’m having to find the asset I want to discard in the folder and delete it manually. It’s not very practical… Orphan pages are the perfect analogy although I wish it was the other way around. It’s a lot of unnecessary storage in the long run.
I created a Python script to find unused assets. You can run it in your Logseq folder. The unused assets will be moved into a newly created to_delete folder. You can review them or just delete the whole directory.
import os
import shutil

assets_dir = './assets'
journal_dir = './journals'
pages_dir = './pages'
to_delete_dir = './to_delete'

# Create the review folder if it does not exist yet
if not os.path.exists(to_delete_dir):
    os.makedirs(to_delete_dir)

assets_files = os.listdir(assets_dir)
referenced_files = []

# Collect every asset whose file name appears in a journal or page
for dirname in [journal_dir, pages_dir]:
    for filename in os.listdir(dirname):
        if filename.endswith('.md'):
            with open(os.path.join(dirname, filename)) as f:
                for line in f:
                    for asset in assets_files:
                        if asset in line:
                            referenced_files.append(asset)

# Move unreferenced assets (except .edn files) into the review folder
for asset in assets_files:
    if asset not in referenced_files and not asset.endswith(".edn"):
        print(asset)
        shutil.move(os.path.join(assets_dir, asset), to_delete_dir)
Thank you for sharing and welcome to the community!
BTW, I recommend joining our Discord server and checking out the docs and blog as well if you haven’t already: https://docs.logseq.com/
Thanks so much for the Python script! I just noticed that it does not work with assets that have special characters (e.g. ‘äöü’) or spaces in their file names: they are incorrectly classified as to be deleted. Ouch.
Did anyone manage to fix this already or is there an updated solution to remove orphaned assets?
Hi Peter. I’ve migrated to other software so I cannot experiment and locate the exact reason. I suggest you use ChatGPT and see if it can provide you with an edited version of the script.
Thank you! ChatGPT brilliantly recognized that the special characters might have something to do with encoding, but failed to propose a correction. However, that finally got me on the right track.
On macOS, os.listdir returns file names with decomposed Unicode characters. These need to be normalized so that they match the text in Logseq's .md files on macOS. Otherwise, for example, the ‘Ö’ in the file name does not match the ‘Ö’ in the Logseq text. See python - UTF-8 and os.listdir() - Stack Overflow
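To illustrate the mismatch, here is a minimal sketch. The file name is made up, and it assumes the text in the .md file uses the precomposed form of ‘Ö’, which is what keyboards usually produce.

```python
import unicodedata

# "Ö" in decomposed form: "O" followed by a combining diaeresis (U+0308)
decomposed = "O\u0308.png"
# "Ö" in precomposed form: the single code point U+00D6
composed = "\u00d6.png"

# Both strings look identical on screen, but they do not compare equal
print(decomposed == composed)

# After NFC normalization the decomposed name matches the precomposed one
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized == composed)
```

The first comparison is false and the second is true, which is why a plain substring check misses such file names.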
I extended your code as per below, which now works as expected on macOS Sonoma 14.5.
Hope this helps
import os
import unicodedata
import shutil

assets_dir = './assets'
journal_dir = './journals'
pages_dir = './pages'
to_delete_dir = './to_delete'

# Create the review folder if it does not exist yet
if not os.path.exists(to_delete_dir):
    os.makedirs(to_delete_dir)

# normalization required to support Unicode characters in MacOS file names
assets_files = [unicodedata.normalize('NFC', f) for f in os.listdir(assets_dir)]
referenced_files = []

# Collect every asset whose file name appears in a journal or page
for dirname in [journal_dir, pages_dir]:
    for filename in os.listdir(dirname):
        if filename.endswith('.md'):
            with open(os.path.join(dirname, filename), encoding="utf-8") as f:
                for line in f:
                    for asset in assets_files:
                        if asset in line:
                            referenced_files.append(asset)

# Move unreferenced assets (except .edn files) into the review folder
for asset in assets_files:
    if asset not in referenced_files and not asset.endswith(".edn"):
        print(asset)
        shutil.move(os.path.join(assets_dir, asset), to_delete_dir)
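For file names with spaces, a possible extension is sketched below. It is not tested against a real graph and rests on an assumption: that a link in an .md file may contain either the literal file name or a percent-encoded form of it (for example %20 for a space). The function name find_orphan_assets is made up for this sketch. It only lists candidates and moves nothing, so the result can be reviewed first.

```python
import os
import unicodedata
from urllib.parse import quote


def find_orphan_assets(graph_dir):
    """Return asset file names not mentioned in any .md file of the graph."""
    # Read all journals and pages once, normalized to NFC
    chunks = []
    for sub in ("journals", "pages"):
        folder = os.path.join(graph_dir, sub)
        if not os.path.isdir(folder):
            continue
        for name in os.listdir(folder):
            if name.endswith(".md"):
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8") as f:
                    chunks.append(unicodedata.normalize("NFC", f.read()))
    text = "\n".join(chunks)

    orphans = []
    for name in sorted(os.listdir(os.path.join(graph_dir, "assets"))):
        if name.endswith(".edn"):
            continue
        nfc = unicodedata.normalize("NFC", name)
        # Assumption: the link may use the literal name, the name with only
        # spaces encoded, or the fully percent-encoded name
        candidates = {nfc, nfc.replace(" ", "%20"), quote(nfc)}
        if not any(c in text for c in candidates):
            # Keep the name exactly as the file system returned it
            orphans.append(name)
    return orphans


if __name__ == "__main__":
    for asset in find_orphan_assets("."):
        print(asset)
```

Run it from the Logseq folder, as with the scripts above. Reading each file once and returning the original file names keeps the listing separate from the move step, which can still be done with shutil.move.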