I often refer to external URLs, e.g. web pages or Twitter posts. Over time some of these URLs will become obsolete because the source page might move or cease to exist.
Is there a way to create an archival copy of the source page so that the Logseq representation isn't broken at some point in the future?
You can save the link to the snapshot or just keep the original link, because if it ever breaks you will still be able to find snapshots of it by looking it up on the Wayback Machine.
In general, when you run into broken links, check the Wayback Machine: often the snapshots are already there, taken automatically or by other people.
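If you want to automate this, here is a rough Python sketch (my own illustration, not an official client) that checks the Wayback Machine's public availability API for an existing snapshot and requests a new one via Save Page Now if nothing is there yet. The target URL at the bottom is just a placeholder.

```python
import json
import urllib.parse
import urllib.request

def find_snapshot(url: str) -> str | None:
    """Return the URL of the closest archived snapshot, if any."""
    api = "https://archive.org/wayback/available?url=" + urllib.parse.quote(url, safe="")
    with urllib.request.urlopen(api) as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None

def save_snapshot(url: str) -> None:
    """Ask the Wayback Machine to archive the page now (may be rate limited)."""
    urllib.request.urlopen("https://web.archive.org/save/" + url)

if __name__ == "__main__":
    target = "https://example.com/some-article"  # placeholder URL
    snapshot = find_snapshot(target)
    if snapshot:
        print("Existing snapshot:", snapshot)
    else:
        print("No snapshot yet; requesting one...")
        save_snapshot(target)
```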
Both of these tools are excellent suggestions, and both seem to have built-in support for the Wayback Machine. I will definitely be looking at them more closely, as I just found a second reference in my Logseq notes that went bad because the original source website changed.
Not the best solution, but when the linked content is critical I do a Print to PDF and attach the PDF to the content I am linking from. Since there's a great PDF viewer and annotator built in, this is sufficient most of the time. But I understand this doesn't scale unless it's automated as well.
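For anyone who wants to automate the Print-to-PDF step, here is a rough sketch (an illustration, not my actual setup): headless Chrome/Chromium can print a URL straight to PDF, and the file can then be dropped into the graph's assets folder and linked from a block. The Chrome path and assets location below are assumptions you'd adjust for your own system.

```python
import subprocess
from pathlib import Path

# Assumed paths; change these for your system and graph.
CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
ASSETS = Path.home() / "logseq-graph" / "assets"

def archive_as_pdf(url: str, name: str) -> Path:
    """Print a web page to PDF with headless Chrome and store it in the assets folder."""
    ASSETS.mkdir(parents=True, exist_ok=True)
    out = ASSETS / f"{name}.pdf"
    subprocess.run(
        [CHROME, "--headless", "--disable-gpu", f"--print-to-pdf={out}", url],
        check=True,
    )
    return out

if __name__ == "__main__":
    pdf = archive_as_pdf("https://example.com/some-article", "some-article")
    print(f"Saved {pdf}; link it in Logseq as ../assets/{pdf.name}")
```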
If you're on macOS, there's also Brett Terpstra's impressive (and free; he accepts donations and sponsorship) Gather command-line tool. It takes a web page and intelligently parses it into Markdown. Various arguments for customizing the output are built in.
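If you'd rather call it from a script than from Alfred, a minimal sketch like this should work, assuming `gather` is on your PATH and that `gather <url>` prints Markdown to stdout (check `gather --help` for the exact options on your version):

```python
import subprocess

def url_to_markdown(url: str) -> str:
    """Run the Gather CLI on a URL and return its Markdown output."""
    result = subprocess.run(
        ["gather", url],  # assumes basic `gather <url>` usage
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(url_to_markdown("https://example.com"))
```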
I use it in an Alfred workflow triggered by a text snippet: I just type `..pmd` and the text content of my frontmost browser tab is pasted in Markdown format wherever I happen to be typing.
Happy to share how I set up the Alfred workflow if there’s interest!