Migrating from big ZIM wiki to Logseq, are there scaling or other issues to expect?

I’ve recently discovered Logseq and am currently playing around with it. From what I’ve seen, it seems to be a great tool and has features which I really like.

As a long-time ZIM user (approx. 10 years), I’d like to migrate all my stuff from ZIM to Logseq. Before I start migrating, I’m looking for answers to a few questions:

The questions mainly arise from the amount of data I manage in ZIM. I have approx.

  • 4200 txt files, which represent pages and subpages (in approx. 2100 directories) and
  • 4100 files as attachments, stored in the different directories along with the pages they belong to.

My questions are:

  • Does anyone happen to have experience with such large data sets? Or has anyone even migrated a ZIM wiki already?
    • The reasons for this question are:
      • As far as I understand, Logseq stores attachments in one folder, so all 4100 files would need to go into that folder. The file system will handle it, but working with such large folders could pose problems.
      • The same goes for ZIM pages and subpages, which would also all end up in one folder.
  • Is there an easy way to structure file storage (I’ve seen that I can enter paths to anywhere in the file system)?
  • When adding files via the GUI, Logseq renames them as it stores them in the folder. When I later use those files, e.g. by sending them by mail, the appended number could cause confusion. In addition, if I add the same file to a page multiple times, I suddenly have multiple identical files with different filenames, while the page still shows the original name (without the number) each time. Is there a way to change this behaviour?
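For the flattening concern above, one option is to do the file-layout conversion yourself before pointing Logseq at the graph. A minimal sketch, assuming Logseq's "triple lowbar" file name format (hierarchy encoded in flat filenames with `___`) and ZIM's convention of one `.txt` per page with subpages in a directory named after the parent — verify both against your versions before running this on real data:

```python
# Sketch: flatten a ZIM notebook tree into Logseq-style namespace filenames.
# Assumptions (check against your Logseq/ZIM versions):
#   - Logseq keeps all pages flat in pages/, encoding hierarchy in the
#     filename with "___" (e.g. "Projects___Home___Ideas.md").
#   - ZIM stores each page as a .txt file; subpages live in a directory
#     named after the parent page.
import shutil
from pathlib import Path

def migrate(zim_root: str, logseq_pages: str) -> None:
    src = Path(zim_root)
    dst = Path(logseq_pages)
    dst.mkdir(parents=True, exist_ok=True)
    for txt in src.rglob("*.txt"):
        # "Projects/Home/Ideas.txt" -> "Projects___Home___Ideas.md"
        parts = txt.relative_to(src).with_suffix("").parts
        target = dst / ("___".join(parts) + ".md")
        shutil.copyfile(txt, target)
```

Note that this only handles the layout: ZIM's wiki markup is not Markdown, so headings, links, and checkboxes would still need a separate content-conversion pass (e.g. via pandoc or a custom script).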

Any answers appreciated,

Here is an example graph that someone shared on Discord, containing .md files for all the Australian suburbs (18,500 files, one per suburb), linked to each other via various properties (e.g. State, Electorate, Region):


Thanks for your answer and the dataset.

I’ve tried to import the complete dataset as well as subsets of it.
The initial import of the full set took half an hour on one machine and more than an hour on another.
Navigation worked, but then I made the mistake of clicking on the graph view on the larger import. This left the app unresponsive until I terminated it (via Task Manager on Windows 10).

The largest subset that remains reasonably manageable is around 500 pages.

So my impression for the moment is that the app in its current state is not very well suited for my use case.