Hi everyone, we know a lot of people have questions about the upcoming database version, like why we’re developing it and why it’s taking so long.
We apologize for spending almost all our time developing it without communicating well with the community. This post will try to answer some of those questions.
Please don’t hesitate to leave a comment if you have questions!
Context
Everyone loves plain-text files, and with tools like Git and Obsidian, we can use them in conjunction with Logseq. However, there are some limitations:
Building real-time collaboration on top of Markdown files is extremely challenging, for example:
Creating a new block requires rewriting the entire Markdown file.
Renaming a page updates all files that reference it.
Support for structured data is limited compared to a database; it lacks features like persistent IDs, timestamps, and more.
Templates and properties make it easy to add new books, papers, and more, but they’re difficult to maintain and collaborate on.
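To make the renaming limitation above concrete, here is a minimal sketch comparing a name-based rename over files with a rename by persistent ID. It is only an illustration, not Logseq's actual code: the sample data and the helpers `rename_page_in_files` and `rename_page_by_id` are hypothetical.

```python
import re

def rename_page_in_files(files, old, new):
    """Name-based rename: return (updated files, number of files written)."""
    link = re.compile(r"\[\[" + re.escape(old) + r"\]\]")
    updated, written = {}, 0
    for name, text in files.items():
        # Every [[old]] link has to be rewritten in the file's text.
        new_text = link.sub(lambda _m: "[[" + new + "]]", text)
        new_name = new if name == old else name
        if new_text != text or new_name != name:
            written += 1
        updated[new_name] = new_text
    return updated, written

def rename_page_by_id(pages, page_id, new):
    """ID-based rename: one record changes, references by ID stay valid."""
    pages[page_id]["name"] = new
    return 1

# Hypothetical file-based graph: file name -> Markdown text.
files = {
    "Logseq": "- An outliner",
    "Journal": "- Tried [[Logseq]] today",
    "Tools": "- [[Logseq]] and [[Obsidian]]",
}

# Hypothetical database-style graph: pages keyed by persistent ID,
# blocks refer to pages by ID instead of by name.
pages = {1: {"name": "Logseq"}, 2: {"name": "Journal"}, 3: {"name": "Tools"}}
block_refs = {"journal-block": [1], "tools-block": [1]}

new_files, files_written = rename_page_in_files(files, "Logseq", "Logseq DB")
records_written = rename_page_by_id(pages, 1, "Logseq DB")

print(files_written, records_written)
```

In this toy graph the file-based rename writes every file that mentions the page, while the ID-based rename writes a single record and leaves the references untouched.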
Our vision is to create a better environment for learning and collaboration. The current app falls short of our goals, with limitations including:
Data loss when using Logseq sync with multiple clients.
Poor performance with large graphs.
Unreliable undo functionality.
No built-in publishing support for pages.
We received so much love and support from you guys that it’s unacceptable that Logseq still loses data. We wanted to do better, so we started to build a solid foundation for the future: the database version. Its goals are:
Be stable: improved data stability and reliable undo/redo
Be performant: fast to open, fast to type
Be joyful: anyone can create any workflow with the new classes and properties
We’ve also decided to develop the new database version with real-time collaboration (RTC) in parallel, as implementing RTC with offline support is extremely complex. By considering RTC early in the design process, we can minimize the risks of having to change our implementation later on.
Challenges
Storage
The new database version should be accessible across multiple platforms, including Web, Electron, and Mobile.
It should be capable of handling large-scale data, effortlessly supporting up to 50,000 pages.
Your data should be safe; it should never be erased by browsers.
To facilitate advanced querying, the new database version should offer support for Datalog queries.
Furthermore, it should provide flexibility by allowing users, plugins or even other apps to create custom classes and properties.
An intuitive UX for classes and properties
Writing should be a delightful experience
RTC should work offline
We’re committed to local-first, where users have full control over their data
Are you going to deprecate Markdown file support?
No, we’ll continue to support both file-based and database-based graphs, with a long-term goal of achieving seamless two-way sync between the database and Markdown files. This will allow you to leverage the benefits of the database version while still being able to use other tools.
Why is it taking so long?
When we began, there was no existing solution that met our requirements for a persistent database, so we had to build one from scratch.
We initially explored CRDTs for real-time collaboration with offline support, but ultimately found that current solutions didn’t meet our needs.
We spent significant time refining the user experience for classes and properties.
Our goal is to ensure that the new database version doesn’t affect the existing version’s functionality.
Will the database version be free to use?
Yes, all local features will be free to use. We’ll only charge for features that rely on our servers, such as real-time collaboration.
Future plan
We plan to start pre-alpha testing of the database version in 2 to 3 months, initially inviting a small group of users to help us improve it. As it becomes more stable, we’ll expand the testing group to include more users.
We’ll also extend invitations to a select group of users and companies to test our real-time collaboration feature once it’s ready for feedback.
If there will be no “native” writing to the Markdown files as a back end for Logseq data, I suppose that exporting/importing will be the only way to get the contents of the database into Markdown, or to read the modified content of the Markdown files back into the database. As import/export suggests a voluntary action, I hope this will be configurable to be done automatically (at intervals or maybe even live?)…
Thank you for the insightful update. It is impressive to see the remarkable progress made so far, and I eagerly anticipate the release of the DB version. Best of luck!
I look forward to the future. I think it will be great!
I’m only a little nervous about the transition, but if the DB version is just as solid as the current version of Logseq, then I’m completely sold! I can’t imagine not having Logseq in my life.
AFAIK the DB version is much more stable than the current version, because we’ve been focusing on avoiding any issue that can result in invalid data. We’ll also try our best to reduce any issues that can lead to data loss.
This doesn’t work for the DB version, because the graph data is stored as key-value pairs (id -> serialized node in a tree) in a table. The safe way to update contents might look like this:
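As a rough sketch of that idea (not Logseq's actual code), an update would read a node by ID, deserialize it, change it, and write it back inside one transaction, instead of editing the stored text directly. The `kvs` table layout, the JSON serialization, and the `update_node` helper are all assumptions made for illustration.

```python
import json
import sqlite3

# Hypothetical storage: one row per node, keyed by ID, content serialized as JSON.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kvs (id INTEGER PRIMARY KEY, content TEXT NOT NULL)")
conn.execute(
    "INSERT INTO kvs VALUES (?, ?)",
    (1, json.dumps({"title": "Book template", "properties": {"author": ""}})),
)
conn.commit()

def update_node(conn, node_id, change):
    """Apply `change` to the deserialized node and store it atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT content FROM kvs WHERE id = ?", (node_id,)
        ).fetchone()
        if row is None:
            raise KeyError(node_id)
        node = json.loads(row[0])   # deserialize
        change(node)                # modify the structured value
        conn.execute(
            "UPDATE kvs SET content = ? WHERE id = ?",
            (json.dumps(node), node_id),
        )
    return node

def add_rating(node):
    # Example change: add a new property to the template node.
    node["properties"]["rating"] = ""

updated = update_node(conn, 1, add_rating)
stored = json.loads(
    conn.execute("SELECT content FROM kvs WHERE id = 1").fetchone()[0]
)
print(stored)
```

Because the change operates on the deserialized node rather than on raw text, a failed update rolls back and the stored value is never left half-written.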
With the current implementation, if I modify a template or a custom command, I can’t have all templates in the Markdown-based “db” updated accordingly (even if only for an additional property I inserted in a template block). With VSCode it’s quite easy to use regex and to refactor any amount of complex find-and-replace stuff. I wish (but don’t hold my breath) that Logseq will have an easy way for users to do regex outside of advanced queries (like right in the Search (Ctrl+K) command or in some sort of visual query builder geared towards non-devs)…
Thanks for your quick reply!
I have no idea whether I can be categorized as an “active contributor” (2 PRs, 1 merged and 1 WIP) or not, but as a heavy user of Logseq, I really want to participate in early testing and am willing to explore and fix problems encountered during testing.
I’m also very excited for the DB version as the improved stability will be very welcome. But as referenced in another comment, I’m a bit worried about lock-in. Will the new database be open source and allow for SQL queries? What’s the reason Logseq couldn’t go with a solution like SQLite?
Also, regarding sync, will that be one of the paywalled features that “rely on [Logseq] servers”? I’m wondering if there will be options for third-party syncing, possibly similar to how Joplin implements it (offering people the ability to host their own notes or pay to use Joplin Cloud). Or will people looking for free syncing have to stick with the markdown version?