Import from a spreadsheet into logseq (each row a page, as outline not as tables) / one time automated linking?

0nobody0 · January 10, 2022, 4:30am

hi all,

first off, i think this might be more about data manipulation outside of logseq than a specific feature or plug-in for the program. i’m not trying to reinvent the wheel, and am very thankful for any thoughts/assistance.

i’m currently working on a cataloging project and the raw data was originally set up in an excel spreadsheet. the spreadsheet lists one individual item per row, with each item given a unique ItemID, and a variety of information in the remaining columns.

we are now maintaining that data in a cataloging program. i have used logseq a fair amount personally, and i think it would be a great tool for keeping track of supplemental research and notes for this project. in order to avoid a fair amount of manual data entry in logseq i was thinking it would be nice to have a page for each item autocreated, containing the information from the spreadsheet for reference.

1. the first working idea is to export the spreadsheet to JSON, XML, CSV (or whatever might be best), then split up that data, one file per item, with the columns converted to outlined markdown (not markdown tables). was thinking of hacking something together out of the python scripts floating around out there, or maybe creating a bash script, but wondering if anyone has tried this or has any ideas about a better approach?
2. after importing the pages, was thinking of taking some of the fields whose values come from a limited list, and auto creating individual logseq pages for each list item. for example, for a Genre column in the spreadsheet, taking the list of genres (duplicates removed) and then setting up a script or something to create a markdown page for each.
3. these list item page titles (for example each individual genre, if following the above example) could then be searched for in the full set of item markdown files from 1. above and automatically linked. i saw this post about automatic linking - Automatic linking - #17 by Lucas - but it looks somewhat different. rather than real time auto-linking at all times, this would just be a one time process run across all documents. was thinking of setting up a sed command or batch script for this but thought it might be a feature that already exists/has been requested?
4. one last idea is kind of the reverse of 1.: export from logseq to a database file or spreadsheet that would supplement the original data set. for example, if the last block of every item page in logseq is a new ‘Notes to be Published’ entry, then a script or method could scrape all the markdown files and create a JSON, XML, CSV etc. that just contains the original ItemID and the related ‘Notes to be Published’ block.

these are all just rough ideas at this point. if there is a relevant thread i missed, or if it sounds like i am completely on the wrong tangent would love to hear any other ideas too. many thanks

LarryK · January 11, 2022, 5:10am

If you’re starting with a CSV file, you can bash it into pretty much any format you like. I’m fond of saying, awk is how I hammer my text-based nails (but if you’re familiar with Python, use it instead). This is definitely something you do outside of Logseq, but you can add the results to your pages.
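If you go the Python route, the CSV-to-pages step could look roughly like the sketch below. It is only an illustration under some assumptions: the ID column is literally named `ItemID`, each page is named after that ID, and every other non-empty column becomes one top-level bullet. The function names `row_to_outline` and `csv_to_pages` are made up for this example.

```python
import csv
from pathlib import Path


def row_to_outline(row, id_column="ItemID"):
    """Render one spreadsheet row as a Markdown outline (one bullet per column)."""
    lines = []
    for column, value in row.items():
        # Skip overflow cells (key None), the ID column, and empty cells.
        if column is None or column == id_column or not (value or "").strip():
            continue
        lines.append(f"- {column}: {value.strip()}")
    return "\n".join(lines) + "\n"


def csv_to_pages(csv_file, out_dir, id_column="ItemID"):
    """Write one .md file per CSV row into out_dir and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(csv_file):
        # The item ID becomes the file name, and so the page title.
        page = out_dir / f"{row[id_column].strip()}.md"
        page.write_text(row_to_outline(row, id_column), encoding="utf-8")
        written.append(page)
    return written
```

To try it, open the exported file with `open("catalog.csv", newline="", encoding="utf-8")` (the file name is a placeholder) and pass the file object plus a scratch folder to `csv_to_pages`. Check a few generated pages before copying them into the graph's `pages` folder.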

For Point 2, I suggest using tags (like #genrename). Let Logseq build your category pages for you. If your tags are multi-word, use [[multi word tag]] instead. The two are equivalent, at least in Logseq. I might be missing something, but I think point 3 is already handled with point 2.
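As a rough Python sketch of the tagging idea: a small helper can turn a Genre cell into tags while the pages are being generated, so no separate linking pass is needed. The assumptions here are that multiple genres in one cell are separated by semicolons and that duplicates should be dropped case-insensitively. The helper name `to_logseq_refs` is hypothetical.

```python
def to_logseq_refs(cell, separator=";"):
    """Turn a cell such as "Jazz; Hard Bop" into "#Jazz [[Hard Bop]]"."""
    refs = []
    seen = set()
    for raw in cell.split(separator):
        name = raw.strip()
        # Skip blanks and repeats (compared case-insensitively).
        if not name or name.lower() in seen:
            continue
        seen.add(name.lower())
        # Single alphanumeric words become #tags; anything else gets brackets.
        refs.append(f"#{name}" if name.isalnum() else f"[[{name}]]")
    return " ".join(refs)
```

Writing the result into the bullet for that column (for example `- Genre: #Jazz [[Hard Bop]]`) should be enough for Logseq to treat each genre as a page reference.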

Point 4 sounds difficult. Unless you have a highly-structured dataset, round-tripping from JSON/XML to Markdown and back is a challenge. If you set up the initial conversion right, using properties and whatever structure works, it might work. But it’s going to be very application-specific. Going from JSON/XML to Markdown is easy. The other way… not so much. (In XML-land, this is called round-tripping and is kind of a Holy Grail of document conversions.)
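If the notes are kept in a predictable place, the export direction can stay simple. The sketch below assumes a layout that is purely illustrative: each page is named after its ItemID, and the notes are child blocks indented under a top-level block whose text is exactly `Notes to be Published`. The names `extract_notes` and `export_notes` are invented for this example, and a page that deviates from the layout would be silently skipped.

```python
import csv
from pathlib import Path

NOTES_HEADING = "- Notes to be Published"  # assumed block text, adjust as needed


def extract_notes(markdown_text, heading=NOTES_HEADING):
    """Return the text of the child blocks nested under the heading block."""
    notes = []
    inside = False
    for line in markdown_text.splitlines():
        if line.startswith("- "):  # a new top-level block starts here
            inside = line.strip() == heading
            continue
        if inside and line.strip():
            text = line.strip()
            # Drop the bullet marker of the child block, if present.
            notes.append(text[2:] if text.startswith("- ") else text)
    return " ".join(notes)


def export_notes(pages_dir, out_csv):
    """Write ItemID,Notes rows for every page that has non-empty notes."""
    rows = []
    for page in sorted(Path(pages_dir).glob("*.md")):
        notes = extract_notes(page.read_text(encoding="utf-8"))
        if notes:
            rows.append({"ItemID": page.stem, "Notes": notes})
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["ItemID", "Notes"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

This only pulls one known block back out. It is not a general Markdown-to-JSON/XML conversion, which is the hard part described above.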

0nobody0 · January 12, 2022, 4:22am

thank you for the insights. will try some tricks and report back!