Why doesn't Logseq directly write to file?

miren · November 2, 2022, 7:15pm

I’m trying to understand (as a non-developer, so I have limits I know) the rationale behind Logseq not writing directly to file, but using the “Refresh”/“Re-index”, and files-metadata.edn et al. files instead.

I sync between devices with a combination of Syncthing and the Nextcloud client and this fact is getting me frequent conflicted files and even some data loss—and I don’t seem to be the only one (thankfully I’m just testing for now and manual recovery has always been possible). What benefit does Logseq get from using this “middle-layer” instead of directly writing to file?

Excuse my ignorance and thanks in advance!

Didac · November 2, 2022, 11:29pm

Ignorance is bliss. So many things to discover and so little fear of being wrong… That’s why I often write as if I knew what I am talking about :)))

Many times, the certainty of the veracity of the things that one affirms is directly proportional to the ignorance that one has about it. Well, I guess :)))))

Well, once warned, allow me to bring my esteemed ignorance to light.

As I understand it, it is a conceptual issue. These are logical layers of abstraction. What is a file? Exactly, a concept. Something conceived in your mind.

An idea that thanks to the interfaces we managed to give shape to. The file system is another interface for Logseq, like your keyboard and your screen are. There is no direct correlation between a page and a file in our storage, although it may match. A page is a block, another concept, which refers to, which is an index of, a collection of blocks. These blocks may or may not “be” in the same file.

Do you follow me?

A file is a representation, a projected image of something on your file system. Conceptually, we have two different things: the representation and the represented.

Let us understand an interface as the system through which two other systems, whatever their type, communicate and exchange information.

OK, here we have:

A meta-system which is Logseq (humans included)
Things to be represented (blocks, events).
Interfaces to communicate the representations between systems.

In other words, language comes into play, communication protocols, with which to represent and to interpret.

alex0 · November 4, 2022, 12:19am

Logseq uses a graph database to provide its features, otherwise it would be just a text editor.

Logseq already updates Markdown files live and watches them in case they are edited in some other way. You will hardly ever need to refresh manually, and, from my understanding, re-index is not involved in sync conflicts.

The conflicts happen, in my experience, if you try to use multiple instances of Logseq, maybe on different devices, open the same page and in particular keep the cursor inside a block in edit mode.

I use Syncthing too and for me just being sure not to leave a cursor in an editing block while using another instance of Logseq is enough to avoid conflicts.
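To make the scenario above concrete, here is a minimal sketch of why two devices that both write the same page end up with a conflicted copy. It assumes a simplified sync tool that compares each device's copy with the last version both devices agreed on. Real tools such as Syncthing or Nextcloud track versions in their own way, so `sync_status` and `digest` are hypothetical names used only for illustration.

```python
import hashlib


def digest(text: str) -> str:
    """Short content fingerprint used to compare file versions."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def sync_status(base: str, device_a: str, device_b: str) -> str:
    """Classify a file by comparing both devices with the last synced version.

    Assumption: this is a toy model, not the algorithm of any real sync tool.
    """
    a_changed = digest(device_a) != digest(base)
    b_changed = digest(device_b) != digest(base)
    if a_changed and b_changed and digest(device_a) != digest(device_b):
        # Both sides diverged from the base in different ways.
        return "conflict"
    if a_changed:
        return "take device_a"
    if b_changed:
        return "take device_b"
    return "in sync"


base = "- buy milk\n"
laptop = base + "- call Ana\n"

# Phone is idle: only the laptop changed, so the sync is clean.
print(sync_status(base, laptop, base))

# Phone had a block open in edit mode and saved its own version later.
phone = base + "- pay rent\n"
print(sync_status(base, laptop, phone))
```

In this toy model the first call reports a clean update and the second reports a conflict, which matches the advice above: if only one instance writes the page between syncs, there is nothing to reconcile.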

Didac · November 4, 2022, 5:49pm

That is good information to keep in mind. Thank you, @alex0. I take note of it.

On the other hand, this doesn’t answer @miren’s question in this thread…

Well, I know I’m kind of a blah blah blah talker. Surely some of you have already figured that out. And if you haven’t noticed it yet, it’s time you did :)))

In any case, allow me to dig a little deeper into this topic, because it is something that I don’t fully understand either, and in this way I share here what I am learning on the go.

For the record, the question is:
Why doesn’t Logseq directly write to file?

If you’re looking for a short answer, maybe something like “Because Logseq uses DataScript” will be enough.

If not, stay tuned.

alex0 · November 4, 2022, 6:06pm

Logseq updates the file as soon as you unfocus a block, this makes Logseq write to files even more frequently than a text editor where you have to Ctrl+S to manually save.
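As a rough illustration of that behaviour, the sketch below models an editor where typing only changes memory and leaving the block writes the file. It is not Logseq's code: `BlockEditor`, `type_text` and `unfocus` are hypothetical names, and the real application keeps its data in a database rather than a plain string.

```python
import tempfile
from pathlib import Path


class BlockEditor:
    """Toy editor: typing changes memory only, leaving the block writes the file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.buffer = path.read_text(encoding="utf-8") if path.exists() else ""
        self.writes = 0  # number of times the file was written

    def type_text(self, text: str) -> None:
        # Keystrokes only touch the in-memory buffer.
        self.buffer += text

    def unfocus(self) -> None:
        # Leaving the block persists the buffer, but only if it differs from disk.
        on_disk = self.path.read_text(encoding="utf-8") if self.path.exists() else ""
        if self.buffer != on_disk:
            self.path.write_text(self.buffer, encoding="utf-8")
            self.writes += 1


def demo() -> tuple:
    """Type seven characters, leave the block once, report writes and file content."""
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "page.md"
        editor = BlockEditor(page)
        for ch in "- hello":
            editor.type_text(ch)
        editor.unfocus()
        return editor.writes, page.read_text(encoding="utf-8")


print(demo())
```

Seven keystrokes produce a single write. The trade-off is that, until the block is unfocused, the newest text exists only in memory, which is exactly the window in which another instance can change the same file.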

OP was wrong about how Refresh and Reindex work and this made them think there is a “middle layer” or an additional step like Ctrl+S in text editors.

On the other hand, what else could Logseq do? Write to the file every time you type a character?

About using a database, that’s essential for basic features, but Logseq tries to write to files as soon as it can.

So there is no “why”, there is just a misunderstanding on what Logseq does.

Your reply is more about the difference between storage with a data structure like a file system and data being processed by a program and eventually cached in RAM.

Didac · November 4, 2022, 10:53pm

Thank you for your answer @alex0, I learn a lot reading your comments, which are very objective and clear to understand.

Logseq updates the file as soon as you unfocus a block, this makes Logseq write to files even more frequently than a text editor where you have to Ctrl+S to manually save.

Good to know.

OP was wrong about how Refresh and Reindex work and this made them think there is a “middle layer” or an additional step like Ctrl+S in text editors.

Technically, there is an intermediate layer, in any case, between leaving the editor and the “corresponding” file saved in the relevant storage.

Logseq uses Electron, a runtime environment that embeds Chromium and Node.js into its binary. That is, the code is run by a JavaScript engine (V8), and to operate with our file system, Logseq makes use of the corresponding module provided by Node.js, communicating through its API, if I am not mistaken.

The Logseq app is mostly written in ClojureScript, a dialect of Clojure (a dynamically typed functional programming language) that compiles to JavaScript.

Now, while Logseq is running, it interacts with the runtime environment via a runtime system. The runtime environment in turn acts as a go-between between the application and the operating system.

I see here an intermediate layer between the action of unfocusing a block (exiting the editor) and writing to the storage in order to achieve “persistence” of the database. In any case, I was referring more to the fact that Logseq uses a graph database and I was wondering what it could conceptually bring about when it is stored in a file system.

On the other hand, when you comment “what else could Logseq do? Write to the file every time you type a character?” You say that like it’s silly, but it doesn’t sound so silly when you put it in the context of real-time collaboration, Etherpad Lite style, where, instead of making write calls to the filesystem, the client submits a changeset to the server.

And if it makes sense in that context, given its centralized architecture and real-time requirement, I don’t find it unreasonable to wonder why something like this wouldn’t be possible in a local environment with multiple instances.
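To show what I mean by a changeset, here is a minimal sketch in which a client sends small edits instead of rewriting the whole file. This is not Etherpad's actual changeset format: `Changeset`, `apply` and `replay` are hypothetical names, and the sketch assumes the edits arrive one after another, so it does not handle two clients editing at the same time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Changeset:
    """Hypothetical edit: at `pos`, remove `delete` characters, then insert `insert`."""
    pos: int
    delete: int
    insert: str


def apply(text: str, change: Changeset) -> str:
    """Return a new document with one changeset applied."""
    if change.pos < 0 or change.delete < 0 or change.pos + change.delete > len(text):
        raise ValueError("changeset does not fit the document")
    return text[:change.pos] + change.insert + text[change.pos + change.delete:]


def replay(text: str, changes: List[Changeset]) -> str:
    """Apply changesets in order, as a server receiving them one by one would."""
    for change in changes:
        text = apply(text, change)
    return text


doc = "- buy milk"
changes = [
    Changeset(pos=10, delete=0, insert="!"),   # append "!" at the end
    Changeset(pos=2, delete=3, insert="get"),  # replace "buy" with "get"
]
print(replay(doc, changes))  # prints "- get milk!"
```

Each changeset is tiny compared with the file, so sending one per keystroke is cheap. The hard part, which this sketch leaves out, is merging changesets from several instances that were made against different versions of the text.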

Finally, I think that when someone asks a question, the misunderstanding of something is usually implicit in the fact of asking.

alex0 · November 4, 2022, 11:19pm

Thank you!

Well, the architecture or the language doesn’t matter: every computer program processes data in RAM and eventually reads from and writes to storage through an interface of the kernel, generally abstracted away by some library/framework API like the ones you have mentioned…

Graph file systems are a thing but they are used for very large graphs, there wouldn’t be any advantage in using a graph file system with applications like Logseq that can load the whole graph on RAM for max responsiveness.

If you ask me, I would really like if in the future the file systems of our PCs will handle graphs somehow natively, so that we can browse files without the restrictions of a tree of folders.

Didac · November 5, 2022, 11:00am

It doesn’t matter in the sense that there will always be an intermediate layer. That is, where there is a communication between two systems, there will be an interface that makes it possible.

Logseq will never write directly to a file because it is outside its remit.

I know that the meaning of the OP was not this, however I thought it was convenient enough to point out this fact. In the context of “Logseq writing directly to a file”, bypassing the logical layers of the operating system, and even the layers of Logseq itself, it turns out that even in the single-instance scenario, before processing a request to write to a file, Logseq still has to perform some changeset operations, I imagine.

Graph file systems are a thing but they are used for very large graphs, there wouldn’t be any advantage in using a graph file system with applications like Logseq that can load the whole graph on RAM for max responsiveness.

I was referring to saving a graph database to a non-graph filesystem.

If you ask me, I would really like if in the future the file systems of our PCs will handle graphs somehow natively, so that we can browse files without the restrictions of a tree of folders.

Me too!

miren · November 5, 2022, 4:49pm

Thank you very much @Didac and @alex0 for taking the time to write down and share your thoughtful and detailed responses!

Notice I began my message with “I’m trying to understand…” so it is actually explicit that I am very likely to have many things wrong

I’m not questioning architectural decisions which by now it is pretty obvious I don’t quite understand (I imagine and also infer from your explanations that part/most of Logseq’s functionality is made possible precisely through that architecture). The thing is… it’s not working for me right now. I really want to love Logseq and eventually use it as my go-to app for notes, so I’ve been testing what for me would be the essential setup, with a laptop and a mobile phone working over the same graph, but I keep experiencing sync conflicts and data loss.

Also I’m far more familiar with the “file representation”. The “graph” and runtime environments… not so much. Which in turn makes it too difficult for me to try to find out what I might be “doing wrong”—and then fix it.

And that’s where the question came from. Thanks again to you both for this conversation! I certainly have some more breadcrumbs to follow now

alex0 · November 5, 2022, 5:07pm

OK but what are you looking for? Logseq is like a text editor that autosaves every time you leave a block. So what is the alternative?

Didac · November 5, 2022, 11:43pm

Just take care to unfocus a block (that is, leave the text editor for that block) before using another instance of Logseq, as @alex0 mentioned. That’s all. Try it and tell us if you are still experiencing sync conflicts, ok @miren?

Don’t give up!

miren · November 6, 2022, 10:50am

Will try taking care to unfocus each time. Thanks again!