I have hundreds of Word docs/Google Docs I’d like to import into Logseq. About 10 of them are 100+ pages, as I used to keep a doc open every day and use it as my running journal. Does anyone have tips for importing these into Logseq? I think they’d be much more useful there.
Ideas I have had:
Export the docs to plain text using Word’s export function. Merge a bunch into one big text file, then run a find-and-replace to add a dash before every line to put it in Logseq format, followed by whatever further find-and-replace passes I come up with to clean it up a bit.
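The "add a dash before every line" step could be sketched as a small JavaScript function. This is only a minimal sketch: the name `toLogseqBlocks` is made up for illustration, and it assumes you want one Logseq block per non-empty line and are happy to drop blank lines.

```javascript
// Sketch: turn plain text into Logseq-style blocks by prefixing each line with "- ".
// Assumption: one block per non-empty line; blank lines are dropped.
function toLogseqBlocks(plainText) {
  return plainText
    .split(/\r?\n/)                 // handle both Unix and Windows line endings
    .map((line) => line.trim())     // strip stray leading/trailing whitespace
    .filter((line) => line !== "")  // skip empty lines
    .map((line) => "- " + line)     // prefix each remaining line with a bullet
    .join("\n");
}
```

For example, `toLogseqBlocks("Monday\n\nWent for a walk")` returns `"- Monday\n- Went for a walk"`.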
Use something like Pandoc (https://pandoc.org/) and then run the output through a similar clean-up pass.
Or hopefully someone has come up with something else that is way awesomer than this drudgery!
I am working on importing Google Docs into Logseq 0.10.8.
It seems to work quite well to copy all the Google Docs text, without selecting the ToC or the footnotes.
This is then easily pasted into a new Logseq 0.10.8 page.
The manual clean-up work still needed, which would be great to automate somehow, is:
tab/indent headings H2-H6; H1 is fine on the left margin of the page;
tab/indent text under headings H1-H6;
Citations look like they’ll have to be added manually using [^1] → bottom of page → [^1]: → type footnote text → increment numbers manually.
Automated citations/footnotes would be handy.
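The heading and text indentation from the list above could be automated roughly like this. It is a sketch under assumptions, not a tested importer: `indentByHeading` is a made-up name, the pasted page is assumed to be a flat list of `- ` blocks with headings written as `- # ...` through `- ###### ...`, and nesting is assumed to use one tab per level. Footnotes are not handled.

```javascript
// Sketch: nest a flat Logseq outline by heading level.
// H1 stays at the left margin, H2-H6 are indented by (level - 1),
// and ordinary blocks are indented one level deeper than the heading above them.
function indentByHeading(flatOutline) {
  let currentLevel = 0; // level of the most recent heading; 0 = none seen yet
  return flatOutline
    .split(/\r?\n/)
    .map((line) => {
      const block = line.trimStart();
      if (!block.startsWith("- ")) return line; // leave non-block lines untouched
      const heading = block.match(/^- (#{1,6}) /);
      let depth;
      if (heading) {
        currentLevel = heading[1].length; // number of "#" characters
        depth = currentLevel - 1;
      } else {
        depth = currentLevel;
      }
      return "\t".repeat(depth) + block; // assumption: one tab per nesting level
    })
    .join("\n");
}
```

Note that a document which skips heading levels (say H1 straight to H3) would get a jump of two indentation levels, so it may need a manual check afterwards.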
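I suppose that (1) the imported ToC does not contain headings, and (2) the first proper heading of the document is not itself indented. In that case you only need to discard everything before the first heading:
// textOfTheMarkdownDocument: the page's Markdown source as a string
let str = textOfTheMarkdownDocument;
// Keep everything from the first heading block ("- #...") onward; +1 skips the newline before it
str = str.substring(str.indexOf('\n- #') + 1);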
Thank you, I’m now researching how to apply this to Google Docs imported as .md files.
Bullet threading appears to be the Logseq terminology.
I have installed the Bullet Threading plugin; however, it does not seem to apply to another imported doc, as per the screenshot. This is with the headers cleaned up; I must have manually tabbed the subheader and the text belonging to the upper part.
FYI you don’t have to do that stuff within Logseq. Any scripting method that is able to read and write to plain text files within your filesystem suffices. If you run macOS, then the built-in Script Editor (or even Shortcuts) is good enough to get the job done.
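As one way of doing this outside Logseq, here is a minimal Node.js sketch that reads a Markdown file, applies the "discard everything before the first heading" step from earlier in the thread, and writes the result to a new file. Node.js is an assumption here (it is not one of the tools named above), and `convertFile`, `dropBeforeFirstHeading` and the file paths are hypothetical names.

```javascript
const fs = require("node:fs");

// The step suggested earlier in the thread: drop everything before the first heading block.
// If no heading is found, indexOf returns -1 and the text is returned unchanged.
function dropBeforeFirstHeading(text) {
  return text.substring(text.indexOf("\n- #") + 1);
}

// Hypothetical helper: read a plain-text file, transform it, write the result elsewhere.
function convertFile(inputPath, outputPath, transform) {
  const text = fs.readFileSync(inputPath, "utf8");
  fs.writeFileSync(outputPath, transform(text), "utf8");
}

// Example call (paths are placeholders):
// convertFile("pages/imported.md", "pages/imported-clean.md", dropBeforeFirstHeading);
```

Writing to a separate output file rather than overwriting the original makes it easy to compare the two before replacing anything in the graph.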