nice! thanks for the quick replies!!
Another PR that will help us reach mvp:
I’m VERY interested. Not sure how I feel about IPFS (my experience even trying to share a file between computers on the same network has been very bad), but CRDT always sounds nice and could have potential for other syncing applications.
It would be nice to get something that can be tried out! What’s generally needed to set this up? Can I use it with my phone?
Phone, not yet… We need upvotes for Logseq plugin support in the mobile apps.
We are thinking of adding that functionality to our fork if the Logseq team doesn’t prioritize adding it.
At the moment no release build is possible, so trying it out requires a full dev setup with three interdependent repos arranged in specific relative paths, plus an active IPFS node with experimental pubsub enabled. It’s OK as a dev setup (we have watch, auto build, and hot reload working), but it is way too involved even for extremely geeky end users.
I plan to keep this thread updated as we progress…
What is the consistency model of the official Logseq sync?
To be honest the current consistency model sounds rather scary:
At the moment, it’s important to keep an eye on the sync status. When you open Logseq, let it first sync before starting to type. Otherwise, you might run the risk of overwriting notes in the cloud. That’s because currently, Sync works different from Git; it does not compare pages. Instead, it syncs the entire page containing the most recent changes.
In case you do find yourself overwriting a remote change by accident, no worries. Logseq Sync keeps a page history for up to a year.
especially scary:
run the risk of overwriting notes in the cloud
I hope and assume that the team is working on better solutions, but I don’t know for sure.
Our solution will use atomic, attribute-by-attribute data logs and situation flags to provide reliable syncing and sharing of specific block trees.
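To make the "attribute by attribute" idea a bit more concrete, here is a minimal sketch of how such a log could behave. It is not the actual bygonz format: the `Atom` shape, the `merge_logs` and `current_state` helpers, and the `(ts, device)` ordering are all assumptions made for illustration. The point is only that two devices touching different attributes of the same block do not overwrite each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """One immutable fact about one attribute of one block (hypothetical shape)."""
    entity: str     # block id
    attribute: str  # e.g. "content", "collapsed"
    value: str
    ts: int         # local transaction time on the writing device
    device: str     # tie-breaker when two timestamps are equal


def merge_logs(a, b):
    """Logs only ever grow, so merging two of them is a plain set union."""
    return set(a) | set(b)


def current_state(log):
    """Replay atoms in a fixed order; the latest atom per (entity, attribute) wins."""
    state = {}
    for atom in sorted(log, key=lambda x: (x.ts, x.device, x.value)):
        state[(atom.entity, atom.attribute)] = atom.value
    return state


# Two devices edit different attributes of the same block while offline.
laptop = {Atom("block-1", "content", "buy milk and eggs", 10, "laptop")}
phone = {Atom("block-1", "collapsed", "true", 12, "phone")}

merged = current_state(merge_logs(laptop, phone))
print(merged)
```

Because the merge is a set union, it does not matter in which order the devices exchange their logs. Two writes to the same attribute still need a rule (here simply "latest wins"), which is presumably where something like situation flags would come in.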
What’s the algorithm (for the CS professor)?
For a great high-level summary of different techniques for data convergence, see Research for Practice: Convergence | November 2022 | Communications of the ACM (sorry, paywall). It references:
- Conflict-free Replicated Data Types: An Overview (June 2018); arXiv:1806.10254
- Real differences between OT and CRDT in correctness and complexity for consistency maintenance in co-editors. In Proceedings of the ACM on Human-Computer Interaction 4, CSCW1, article 21, (May 2020), 1–30; https://dl.acm.org/doi/10.1145/3392825
- Mergeable replicated data types. In Proceedings of the ACM Conference on Programming Languages 3, OOPSLA, article 154 (Oct. 2019), 1–29; https://dl.acm.org/doi/10.1145/3360580
- Logical monotonicity:
- Keeping CALM: When distributed consistency is easy. Commun. ACM 63, 9, (Sept. 2020), 72–81; https://dl.acm.org/doi/10.1145/3369736
- P. Bailis, A. Fekete, M. J. Franklin, A. Ghodsi, J.M. Hellerstein, and I. Stoica. Coordination avoidance in database systems. In Proceedings of the VLDB Endowment 8, 3 (2014), 185–196; http://www.vldb.org/pvldb/vol8/p185-bailis.pdf
Well…
I don’t know for sure about the algo for the official Logseq sync… But it seems to be a very simple full-page LWW (last write wins).
@tennox and I are using Logseq as a playground for developing bygonz. It’s rather early alpha, so our algos are still WIP.
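For anyone unfamiliar with the term, here is a rough sketch of what full-page last-write-wins means in general. This is not Logseq Sync’s actual code; `PageVersion`, `modified_at`, and `lww_sync` are hypothetical names used only to show why an edit can get lost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageVersion:
    text: str
    modified_at: int  # hypothetical "last modified" timestamp


def lww_sync(local, remote):
    """Keep whichever whole page was written last; the other one is discarded."""
    return local if local.modified_at >= remote.modified_at else remote


# Both devices started from "- buy milk" and each added one line while offline.
laptop = PageVersion("- buy milk\n- call Bob", modified_at=10)
phone = PageVersion("- buy milk\n- fix bike", modified_at=12)

result = lww_sync(laptop, phone)
print(result.text)
```

The phone’s page wins as a whole, so the laptop’s "call Bob" line is gone from the synced page even though the two edits never touched the same line.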
I was just now reviewing this list of datatypes:
In the way you’re using the term “algo”, each of those datatypes has an algo behind it, right?
We are looking to establish functional bi-temporality, by tracking both:
- “as of” local transaction times
- “based-on” previous relevant atom or tx hash
Having both will help flag situations that require automatic and/or manual merging.
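A small sketch of how tracking both could surface those situations, assuming each atom stores its local "as of" time plus the hash of the atom it was "based on". The `Atom` fields, the `digest` property, and `find_forks` are made up for this example and are not the real bygonz API.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Atom:
    entity: str
    attribute: str
    value: str
    as_of: int               # local transaction time
    based_on: Optional[str]  # digest of the atom this write was made on top of

    @property
    def digest(self):
        raw = f"{self.entity}|{self.attribute}|{self.value}|{self.as_of}|{self.based_on}"
        return hashlib.sha256(raw.encode()).hexdigest()[:12]


def find_forks(atoms):
    """Group atoms by what they were based on; more than one child means a fork."""
    children = defaultdict(list)
    for atom in atoms:
        children[(atom.entity, atom.attribute, atom.based_on)].append(atom)
    return [
        sorted(group, key=lambda a: a.as_of)
        for group in children.values()
        if len(group) > 1
    ]


root = Atom("block-1", "content", "draft", 1, None)
laptop = Atom("block-1", "content", "draft + laptop edit", 5, root.digest)
phone = Atom("block-1", "content", "draft + phone edit", 6, root.digest)
follow_up = Atom("block-1", "content", "laptop edit, polished", 7, laptop.digest)

forks = find_forks([root, laptop, phone, follow_up])
for fork in forks:
    print("needs merging:", [a.value for a in fork])
```

The "as of" times alone would just say the phone wrote later; the shared "based-on" hash is what reveals that the phone never saw the laptop’s edit, so the pair gets flagged instead of silently overwritten.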
We are not academic Computer scientists. Rather far from it actually. But we are passionate hackers who love decentralization, local first, far-edge, p2p, and privacy first…
@Ken_Arnold if you (or anyone else reading this) would like to have a video call about crdt design and development, I’m game.
I actually intended my question more for the logseq devs than for you, who seem to know what you’re doing — looks cool.
It looks like devs are thinking about CRDT but I haven’t found any details beyond some discord chats like Discord … I’m concerned about this for logseq as a platform: concurrency isn’t something you can just bolt on, it needs to inform the architecture, and the experience with the official sync isn’t promising.
Yeah, I got that… But I think maybe starting a new thread in the questions section or asking directly on Discord could help. I am also curious about the answer, and our concerns are quite justified, I think.
Thanks for the encouragement! Let’s see how far we get…
Dumb question: is this only for multiple devices for one user, or is it also for multi-user collaboration with nice permissions (like only sharing some notes with some users)?
that is the intention… although still far from ready for user testing
it does not compare pages. Instead, it syncs the entire page containing the most recent changes
Wow, this sounds worse than my Syncthing setup with a script to automatically merge conflicting files. Why are they even bothering if they’re not going to base it all on atomic changes? I don’t quite see the value proposition.
Anyway: How is this project coming along? Are there any perspectives for using it without IPFS on some kind of self hosted infra? I’m still most interested in the CRDT part.
Agreed.
So we are definitely envisioning and hacking away on various potentially “serverless” “back-end” options.
Running an IPFS node via the desktop app is pretty easy. But the first layer of CRDT (or sadl, as I prefer to call it) actually lives in IndexedDB in the browser.
As Electron is Chromium-based, we should also be able to write the atomic data logs directly to the user’s file system.
Dgraph, PocketBase, SurrealDB, and direct P2P are all in the running once we get IPFS a bit more solid.
Thanks for your interest!
Things always get tricky with file-based conflict resolution. We are trying several options for file-based diffing.
From a user experience perspective, TX-based serving / RTC is always better than file-based, and this is why we are developing: Trello
you bet… 1.5 years late, but I am interested
How did you do that?
well, we did it with an applog-style, attribute-by-attribute IPFS sync scheme… but we’ve moved on
neither are really ready for prime time quite yet, but getting there…