Advice for converting many long Word Docs to LogSeq?

I have hundreds of Word docs/Google Docs I’d like to just import into LogSeq. About 10 of them are 100+ pages, as I used to just keep a doc open every day and use it as my running journal. Anyone have any tips for how to import these into LogSeq? I think they’d be much more useful inside LogSeq.

Ideas I have had:

  • Export the docs to plain text using Word’s export function, merge a bunch into one big text file, then run a FIND + REPLACE to add a dash before every line (to put it in LogSeq’s block format), followed by some further FIND + REPLACE rules I come up with to clean it up a bit.
  • Use something like Pandoc (https://pandoc.org/) and then run the output through a similar clean-up pass

Or hopefully someone has come up with something else that is way awesomer than this drudgery!

After reading your question, the first thing that came to my mind is just using pandoc to convert to Markdown.
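A rough sketch of how that batch conversion might look in Python, assuming pandoc is installed and on your PATH and that the documents are saved as .docx (Google Docs can be downloaded in that format). The function names and folder names here are just placeholders I made up, not anything LogSeq or pandoc provides:

```python
import subprocess
from pathlib import Path


def pandoc_command(docx_path, out_dir):
    """Build the pandoc argument list for one .docx -> .md conversion."""
    src = Path(docx_path)
    dest = Path(out_dir) / (src.stem + ".md")
    # --wrap=none keeps each paragraph on a single line, which makes
    # later per-paragraph processing simpler.
    return ["pandoc", str(src), "-f", "docx", "-t", "markdown",
            "--wrap=none", "-o", str(dest)]


def convert_all(src_dir, out_dir):
    """Convert every .docx in src_dir (hypothetical folder) to Markdown."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for docx in sorted(Path(src_dir).glob("*.docx")):
        # check=True raises if pandoc reports an error for a file
        subprocess.run(pandoc_command(docx, out_dir), check=True)


if __name__ == "__main__":
    # Assumed folder names; point these at your own export folder
    # and your LogSeq graph's pages folder.
    convert_all("word_exports", "pages")
```

Try it on a copy of two or three files first, since images, tables, and footnotes may need manual clean-up afterwards.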

You can convert each of them to a .md file (a logseq page). You’d need to put a dash (- ) at every paragraph if you want each paragraph to be a block.
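For the dash step, something like the following might do instead of a manual find and replace. It is only a sketch: it assumes paragraphs are separated by blank lines, and it treats everything as plain paragraphs, so headings, existing bullet lists, and tables in the Markdown would need extra handling. The function name is made up for illustration:

```python
import re


def paragraphs_to_blocks(text):
    """Turn blank-line-separated paragraphs into "- " prefixed blocks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    blocks = []
    for para in paragraphs:
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        if not lines:
            continue
        # First line gets the dash; any further lines of the same
        # paragraph are indented two spaces so they stay in that block.
        blocks.append("- " + "\n  ".join(lines))
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = "Monday notes.\n\nTuesday line one\nTuesday line two"
    print(paragraphs_to_blocks(sample))
```

You would read each converted .md file, pass its text through the function, and write the result back into the pages folder.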

If these documents are static and won’t be edited anymore, you could export them as PDFs and just put the PDFs in logseq.

I like the PDF idea. I was dreading the transfer process from Google Docs/MS Word.

Thanks!