Specification for public graph discovery. Decentralized social network on Logseq

By “reference” I mean a usual Logseq link: one that leads to a page, created via [[]]; one that leads to a block, created via (()); and the implicit link from a child block to its parent block (the hierarchy on which block rendering is based).
The thing is that a reference leads to data about a block, not an HTML page, and this data can be interpreted by a Logseq app to render a page.
It’s like publishing the source of your graph in, say, .md, but a step further, as we publish the Logseq data model (in, say, JSON, or whatever).


Example Logseq data.

Such a graph

* A
** AA
** ((AA))

is represented by roughly the following Logseq data:

{:db/id <A>
 :block/content "A"}
{:db/id <AA>
 :block/content "AA"
 :block/parent <A>}
{:db/id <(AA)>
 :block/content "((AA))"
 :block/parent <A>
 :block/refs [<AA>]}
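
For illustration, here is a minimal sketch in Python (not Logseq’s actual implementation) of how a client could interpret such published block data to render a page, resolving the ((…)) reference to the referenced block’s content:

# Published block data, mirroring the EDN above.
blocks = {
    "<A>":    {"content": "A",      "parent": None,  "refs": []},
    "<AA>":   {"content": "AA",     "parent": "<A>", "refs": []},
    "<(AA)>": {"content": "((AA))", "parent": "<A>", "refs": ["<AA>"]},
}

def render(block_id, depth=0):
    """Render a block and its children as an outline."""
    block = blocks[block_id]
    text = block["content"]
    # Resolve ((...)) references to the referenced block's content.
    # (Real Logseq puts block UUIDs inside (()); this is simplified.)
    for ref in block["refs"]:
        text = text.replace(f"(({blocks[ref]['content']}))", blocks[ref]["content"])
    print("  " * depth + "* " + text)
    for child_id, child in blocks.items():
        if child["parent"] == block_id:
            render(child_id, depth + 1)

render("<A>")  # prints the A / AA / AA outline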

So now we can use references to point both to your own blocks and to the blocks of others.
And, in order to display others’ blocks, they are downloaded and become no different from yours.

Or should they differ?
Perhaps you wouldn’t like referencing another’s block, walking through that block into their graph, wandering around that interesting graph, and then having the blocks you visited listed in your graph, no different from yours.

A solution is to track the authorship of each block.
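
For instance (a sketch; the attribute name is an assumption, not part of Logseq’s actual schema), the block data above could carry an author attribute, so foreign blocks stay distinguishable after download:

# Hypothetical extension of the block data model: "block/author"
# is an assumed attribute name, for illustration only.
block = {
    "db/id": "<AA>",
    "block/content": "AA",
    "block/parent": "<A>",
    "block/author": "alice.example",  # who authored this block
}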


In the context of linking to others’ reasoning, we need to be sure that it won’t change, as a change could corrupt your own reasoning.
Matrix permits change; that is why I think it is not the best fit for the job.


The listed Matrix projects are interesting; I hadn’t known of them, thanks.

1 Like

Hey there, maybe in the meanwhile we could start an initiative to “import” other remote graphs/pages into our personal one, like the following:

Alice Graph
  /pages
  ...

is published as usual at alice.com.

Bob runs a simple script (sketched below) that:

  1. downloads Alice’s /pages folder
  2. renames its files by prefixing alice.com%3A (to create a namespace)
  3. rewrites the references in those files by prefixing alice.com/, for example [[alice.com/Something]]
  4. moves those files into Bob’s graph, in a subfolder like /pages/alice.com
  5. periodically checks alice.com for updated pages and, when needed, downloads and rewrites them again
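
A rough Python sketch of such a script; the URL, page list, and namespacing scheme are illustrative assumptions, not a standard:

import re
import urllib.request
from pathlib import Path

ALICE = "alice.com"
PAGES = ["Something.md"]        # in practice, fetched from an index
DEST = Path("pages") / ALICE    # step 4: subfolder in Bob's graph

DEST.mkdir(parents=True, exist_ok=True)
for name in PAGES:
    # Step 1: download Alice's page.
    with urllib.request.urlopen(f"https://{ALICE}/pages/{name}") as resp:
        text = resp.read().decode("utf-8")
    # Step 3: rewrite [[Ref]] links into the alice.com/ namespace,
    # skipping property lines (key:: value) so they stay queryable.
    lines = []
    for line in text.splitlines():
        if "::" not in line:
            line = re.sub(r"\[\[([^\]]+)\]\]", rf"[[{ALICE}/\1]]", line)
        lines.append(line)
    # Step 2: prefix the file name to create a namespace.
    # Step 5 (not shown): re-run periodically to pick up Alice's updates.
    (DEST / f"{ALICE}%3A{name}").write_text("\n".join(lines), encoding="utf-8")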

Then, if Alice adopted some kind of standard, her pages could have properties like

page-authors:: [[me@alice.com]]
publishing-date:: 2022-02-02

that are not edited by the script, so that Bob can, for example, query for pages written by [[me@alice.com]].

Not very powerful, but a simple script and a standard could be enough while a proper solution is discussed and developed.

1 Like

Hello, I’m a new member of your community and became fascinated by this topic.

@Alex_QWxleA’s answer back on June 20th really attracted my attention. I also noticed that the discussion has gone in a different direction than the one I would have hoped for, so here’s me reviving some of the old convos. This is all in good faith, from a humble member of your community who is thankful for Logseq.

Maybe some big-picture notes on sharing structured graphs: where it could be useful, and its current pitfalls given Logseq’s current serialisations:

Although, I doubt sharing structured graphs will become even remotely mainstream in the foreseeable future. As in, would they be adopted at the level of current social networks? While that would be awesome, by the way, it just doesn’t feel like humanity can handle such a big task at the moment. The one thing that might push some communities to take a position on this is the recent (as of Nov 1, 2022) question of how to tap into the wisdom of the crowd as rapidly as possible. Then I wonder what Logseq’s role can be in that. And frankly, it seems too big of a battle to position Logseq as a social networking platform, as some of the examples mentioned here indicate.

Those solutions won’t scale, and I mean functional scaling in this case. No particular vocabulary, grammar, or model governs those properties, etc., which means that a meaningful message sent from one graph to another can easily be misinterpreted. If Logseq does want to do what @Alex_QWxleA is saying here, you may want to start thinking of a neat way of mapping each Logseq block to an RDF resource description. The Semantic Web was a movement that took a while to mature, but the formalisms it has come up with by now seem colossal, and they are being adopted by the elite in the enterprise world as well.
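
For a taste, here is a hedged Python sketch with rdflib of what mapping one block to RDF could look like; the ex: vocabulary is made up for illustration, and a real mapping would pick or define a proper ontology:

from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("https://example.org/logseq/")  # hypothetical vocabulary
g = Graph()

block = URIRef("https://alice.com/blocks/AA")   # the block as an RDF resource
parent = URIRef("https://alice.com/blocks/A")

g.add((block, EX.content, Literal("AA")))  # :block/content as a literal
g.add((block, EX.parent, parent))          # :block/parent as a link

print(g.serialize(format="turtle"))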

SKOS can be a good first example, I think. It’s one of the simplest, yet very useful, ontologies that have been developed. Essentially, it provides a very simple yet powerful language for organising concepts, by the terms used to label them or the taxonomies used to categorise them. One of the direct applications of this is a person’s ability to easily document their view on existing concepts. Take Wikipedia as an example: Wikipedia editors aren’t supposed to reflect their personal views in encyclopedia articles. However, when a mere mortal knowledge artist is publishing a personal graph, they may want to refer to concepts that are already nicely defined in Wikipedia in a much more native way than is possible today, and then tell their story, whatever it may be, about those concepts.
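
To make that concrete, a small rdflib sketch in the same spirit (the personal-graph URIs are hypothetical; DBpedia mirrors Wikipedia’s topics as linked data):

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import SKOS

g = Graph()
mine = URIRef("https://alice.com/concepts/zettelkasten")

g.add((mine, SKOS.prefLabel, Literal("Zettelkasten", lang="en")))
# Point at the shared, already-defined concept instead of redefining it,
# then layer a personal view on top.
g.add((mine, SKOS.closeMatch, URIRef("http://dbpedia.org/resource/Zettelkasten")))
g.add((mine, SKOS.note, Literal("My take: works best with atomic blocks.")))

print(g.serialize(format="turtle"))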

Any information system has master, reference, transactional, analytical, and operational data, correct? More or less. Well, Logseq can be the platform used to create transactions (each block being a transaction), where the master and reference data used to document those transactions come more naturally from shared resources on the Web via URLs. That links the power of Logseq, which is its ability to create structured content, to the power of the Semantic Web, which has been building up for years and has enough decent implementations out there to start considering its usage.

Sorry for the long message. I hope you’ll find the time to read it.

And BTW, you see how I said Logseq is responsible for creating those transactions? Well, then social networks, such as those that implement an open protocol like ActivityPub, can be used to broadcast those transactions using RSS, another standard that is nicely linked to Semantic Web standards.

1 Like

Welcome!

To be clear: “social networks”, strictly speaking, doesn’t imply a microblogging platform with an ephemeral stream of content like Twitter, Facebook, etc.

Of course a Logseq-based network would look much more like a decentralized wiki(pedia).

I don’t think ActivityPub would be useful for Logseq since it’s more about streams of content that you probably won’t edit later. It’s RSS on steroids.

I suggested Matrix protocol because it’s de facto a decentralized database where users on different servers can access and edit data with a sort of built-in ACL (access control list). Indeed I think what’s really important is being able to give users on other servers the rights to read/write portions of our graphs.

The point of Matrix is keeping the data consistent even if portions of the network split and rejoin later. Plus Matrix comes with e2e encryption.

If you think about Matrix as an instant messaging protocol like XMPP, you are looking at it from the wrong perspective: it can be a new general-purpose layer on top of the Internet stack, providing features we generally see in silos, like logging in and interacting with other users of the same platform, but decentralized and federated.

1 Like

+1 to the idea of using blocks to express Semantic Web data.
Blocks are powerful in that they can hold both human-friendly text and computer-friendly properties,
allowing both humans and computers to reason on them.
And we want that computer assistance.
The Semantic Web is a great idea and fits perfectly with the design of blocks: by referencing other blocks they resemble a directed labeled graph, just like the Semantic Web’s model.

I like the metaphor of a distributed wiki.
I believe one important trait of such a wiki is immutability:
as we scale to the size of the Web, we had better not have link rot in the design.
Luckily, it’s possible to have an immutable Semantic Web graph by using content hashes to identify blocks. That way we have an immutable directed acyclic graph, and it’s possible to host it on top of content-addressable storages, such as IPFS or GNUnet, or use a CDN, Matrix, Solid, or all at once - the more the merrier.
Because blocks are content-addressed, we can take ‘trust’ out of the equation - we care little where the data for a content hash comes from, because we can verify that it’s exactly what we asked for by comparing hashes.
The same applies to ways of distributing the data; we can use any number of protocols we like - p2p, ActivityPub, RSS, Solid, IPFS, OrbitDB, Matrix - you name it. =)
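
A minimal Python sketch of the content-addressing idea (plain hashing, no particular storage assumed):

import hashlib
import json

def block_id(block: dict) -> str:
    """Content hash over a canonical (sorted-key) serialization."""
    canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# refs hold hashes of other blocks, so the whole structure forms
# an immutable directed acyclic graph.
leaf = {"content": "AA", "refs": []}
root = {"content": "A", "refs": [block_id(leaf)]}

# Verification: it doesn't matter which peer served the data; if the
# hashes match, it's exactly the block we asked for.
fetched = {"content": "AA", "refs": []}
assert block_id(fetched) == block_id(leaf)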

3 Likes

I love the way you think and surely, so many others here in the Logseq community, whom I am looking forward to hearing as well.

I think the protocol discussions are way too early, especially since there are other things the Logseq community can think about first. The thing about Logseq, for me as a mere bystander, is that Logseq is how HTTP/HTML should have been designed in the first place. Somehow that didn’t happen, and we are where we are.

So I think there’s an opportunity to first re-introduce the simple concept of a website to people, with Logseq as the starting point, simply due to the exact ability that you yourself mentioned: easy input of structured data.

In fact, Logseq may want to look at two aspects: the workflow side of things, and closer integration with the Unix platform. Logseq is client-based and can be tied to the power of the command line. Another aspect Logseq can look into, potentially, is ways to consume the data that is produced inside it. The query concept is very powerful, but we should easily be able to turn queries into tables, tile views, etc.

So, as you can see, I think there’s an immense opportunity ahead to let people know that anyone can now have an extremely nuanced website. At least people like me will join. I am a programmer in the sense that I can use an API if it’s created by responsible engineers; as soon as a library becomes experimental or poorly designed, people like me cannot use it. Logseq, on the other hand, compares (at least for me) with Excel: a piece of software that has the chance of being adopted by a larger audience than the usual geeks who adopt however-difficult tech in our industry. Kudos to them, but what about those of us who are not that technical?

1 Like

In fact, Logseq may want to look at two aspects: the workflow side of things

Agreed, the UX could be improved, and that would be of great value to users.

closer integration with the Unix platform

Going in the direction of OSes gets us into bog-land, because software reproducibility is bollocks.
I.e., it’s darn hard to get an environment set up so that software runs reproducibly across different machines.
There is Docker, which is meh. There are Nix and Guix, which are pretty good and push in this direction, but they still have their limitations. So it’s hard to rely on OS-level software, which in turn would make OS-level features of Logseq unreliable.
Personally, I would prefer Logseq to deepen in the Web direction first.

Another aspect Logseq can look into, potentially, is ways to consume the data that is produced inside of it.

Agreed, having computer assistance with the knowledge we create is of immense value.
And we can get a whole new level more out of it by interlinking our graphs with the existing Semantic Web.


Publishing a Logseq graph as a website is already there.
It’s a nice feature, but such a website will be yet another one out there, disjoint from the rest.
I think Logseq can be made into a far more powerful tool - a Semantic Web browser, where we would build one distributed wiki, having data and text co-aligned for computers and humans to reason on.
Where such a wiki is stored and how it is distributed is of less importance.

The publishing UX can be as easy as ticking an ‘auto-publish’ checkbox, and having your public blocks published out there when you’re finished updating them.
Discovery of content made by others can be done in numerous ways, to name a few:

  • getting notified when your block got referenced by somebody
  • when browsing the Web, being shown blocks that refer to the current page (à la Hypothes.is)
  • browsing your friends’ graphs
  • browsing some public graphs
  • browsing existing Semantic Web graphs

The more we interlink, the easier it is to travel this Semantic Web.

2 Likes

About the Unix tools thing: the fact that notes are just markdown is a huge thing.

I’ve been able to mass-process some things just using sed & awk, but you could use Perl or Python. But it would be much better if Logseq exported the graph somehow, with a kind of API, even if it’s only usable while Logseq is open.
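
For example, the kind of mass-processing that plain markdown allows, here as a small Python sketch (the page rename is made up):

from pathlib import Path

OLD, NEW = "Old Page", "New Page"  # illustrative rename
for path in Path("pages").glob("*.md"):
    text = path.read_text(encoding="utf-8")
    updated = text.replace(f"[[{OLD}]]", f"[[{NEW}]]")
    if updated != text:
        path.write_text(updated, encoding="utf-8")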

3 Likes

From my understanding, since Logseq is written in ClojureScript, it is compiled into JavaScript that is meant to be run by a browser.

But it seems you can use Clojure(Script) from a command-line interface using a tool called Babashka, and this is what logseq-query (the lq command) does:

Somehow it can connect to your Logseq graphs without running Logseq and perform queries. It’s already very useful and it can be combined with the usual Bash/Python/etc scripts.

Now what would be nice is the same with other Logseq features, not only queries.

Even better would be a library (maybe in C or Rust for max compatibility) that could be used from any programming language to manipulate Logseq files and entities programmatically without any JavaScript (or other interpreted languages) involved.

For instance, I spent some time trying to write a parser for properties:: (to read and write their values programmatically), but I failed because I’m not very used to this.
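
For what it’s worth, here is a hedged starting point in Python: not a complete grammar of Logseq’s format, just enough to read and update key:: value lines.

import re

# One "key:: value" line; indent and "- " prefix are optional.
PROP_RE = re.compile(r"^(?P<indent>\s*-?\s*)(?P<key>[\w-]+)::\s*(?P<value>.*)$")

def read_props(block_text: str) -> dict:
    """Collect key:: value pairs from a block's lines."""
    props = {}
    for line in block_text.splitlines():
        m = PROP_RE.match(line)
        if m:
            props[m.group("key")] = m.group("value").strip()
    return props

def write_prop(block_text: str, key: str, value: str) -> str:
    """Update a property in place, or append it if missing."""
    lines = block_text.splitlines()
    for i, line in enumerate(lines):
        m = PROP_RE.match(line)
        if m and m.group("key") == key:
            lines[i] = f"{m.group('indent')}{key}:: {value}"
            return "\n".join(lines)
    return "\n".join(lines + [f"{key}:: {value}"])

block = "- Some note\n  page-authors:: [[me@alice.com]]"
assert read_props(block) == {"page-authors": "[[me@alice.com]]"}
print(write_prop(block, "publishing-date", "2022-02-02"))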

2 Likes

You raise a good use case - programmatic manipulation on top of our notes; that’s powerful and we want it. And a good problem - dealing with text programmatically is a pain.

So we would want data (e.g., Logseq’s inner representation of our text notes) to be programmatically accessible.
One way would be to have Logseq serve it via an API. It’s a common approach.
Another would be to export Logseq data, in JSON or EDN.
But yet another way, one that seems very appealing to me, would be to derive a Semantic Web representation of Logseq’s data. Then it could be queried with SPARQL (akin to the Datalog that DataScript uses, but able to perform queries across the whole Semantic Web, not just local DBs). It can also be serialized as JSON for those cases where we don’t need SPARQL queries, and JSON can be handled by any programming language out there. And another dope thing is that we won’t be limited to accessing only our own knowledge graph; we will have access to the graphs of others and the rest of the Semantic Web, building one interconnected graph of knowledge. ^.^
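
As a taste of that, continuing the hypothetical ex: vocabulary from the rdflib sketch above; the same SPARQL that runs locally here also works against remote endpoints:

from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("https://example.org/logseq/")
g = Graph()
g.add((URIRef("https://alice.com/blocks/AA"), EX.content, Literal("AA")))

results = g.query("""
    PREFIX ex: <https://example.org/logseq/>
    SELECT ?block ?content
    WHERE { ?block ex:content ?content . }
""")
for block, content in results:
    print(block, content)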

I agree that having programming access to our data is of huge value, and it’s more valuable the more data there is. Integrating our graphs into the Semantic Web would be like merging our lakes of data into the ocean.

To have our notes as data would be a dream! Then indeed we can work on them programmatically from whatever language we prefer.

Good pointer to lq.
From what I reckon, atm Logseq’s graphs are stored as ~/.logseq/graph1.transit (.transit is a serialization of EDN).
lq runs in NodeJS, reads a graph, and feeds it to the DataScript engine. For this use case, .transit is ideal.
For reach from other languages, .transit is… not that accessible. Having it in JSON would give us far broader reach.
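
A heavily hedged sketch of such a conversion, assuming the transit-python2 reader (API as per its README) and assuming Logseq’s .transit uses the JSON transit encoding; transit keywords are stringified since JSON has no keyword type:

import json
from transit.reader import Reader  # pip install transit-python2 (assumed)

def plain(x):
    """Recursively coerce transit types (keywords, frozen maps) to JSON-safe ones."""
    if isinstance(x, dict):
        return {str(k): plain(v) for k, v in x.items()}
    if isinstance(x, (list, tuple, set, frozenset)):
        return [plain(v) for v in x]
    return x if isinstance(x, (str, int, float, bool, type(None))) else str(x)

with open("graph1.transit") as f:
    data = Reader("json").read(f)  # assumed JSON transit encoding

with open("graph1.json", "w") as out:
    json.dump(plain(data), out)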

To have our notes in a semantic form, such as JSON, and have a client of your choice to work with them (e.g., Logseq), yet retain programmatic access to do whatever we wish with them - that sounds very appealing.

FYI Tienson did an experiment about using EDN instead of Markdown/Org.

Interesting, more info?

I saw this tweet by Tienson:

https://twitter.com/tiensonqin/status/1583170757823430657

Sorry, I don’t have any other information; as I said, it seems to be just an experiment.

Edit: found the branch

1 Like

It is very important … not only from the knowledge-sharing perspective, but also to provide ready reference graphs to new users (like me) who wish to use Logseq from a publishing perspective. In fact, I landed on https://demo.diasporamemory.com/#/page/Diasporic%20Memory , which inspired me to explore this area … my initial attempt, with just a couple of days of work, is at https://shutri.com

I fully realize that Logseq is primarily targeted at mining your mind - notes, todos, journals, etc. … but that doesn’t stop it from being the best publishing platform as well. As for social features like “follow” or “share” - they are surely important, but they are not a MUST-have to get started …

Maybe it is just a thread, in this community (or elsewhere), where expert-hosted graphs are listed with basic instructions on their features and how to use them as reference templates …

1 Like

Hey @shutosha : Digitized Diasporic Memory is actually my graduate thesis project!!! I’ve been following this thread quietly and was pleasantly surprised to see it mentioned here. Thank you for the shoutout! Happy to hear that it inspired you. :slight_smile:
If you’d like to reach out to chat some more, let me know.

1 Like

Wow …great to meet you ! Please keep up the good work …

1 Like

Regarding implementation, I think there was a lot of good work done in the Semantic Web space, and to me combining the two (Logseq with triples, and URLs as entities in the RDF sense) makes perfect sense when thinking about multi-graph setups or federations in Logseq.

(potentially related topics: RDF, JSON-LD, triple stores, Schema.org, etc.)

2 Likes

Hi all, I just replied to @andrewzhurov on that referenced thread on this subject and just read over the latest posts here, so wanted to follow up here.

Exactly so, and that is basically what JSON-LD is designed for. To start, I think it might be as simple as the sketch I offered on the other thread for decorating property blocks with JSON-LD @context and @type tags at the system level.

From the schema.org homepage:

Schema.org vocabulary can be used with many different encodings, including RDFa, Microdata and JSON-LD. These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model. Over 10 million sites use Schema.org to markup their web pages and email messages. Many applications from Google, Microsoft, Pinterest, Yandex and others already use these vocabularies to power rich, extensible experiences.

Beyond providing a shared namespace of types which benefit from relational composability, there are huge wins baked in, including making real graph queries across federated graphs, a path to being indexed by traditional search engines, and so on. In short, JSON-LD is the lingua franca of linked data on the web these days, and is isomorphic to standard RDF triples.
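
For illustration, a guess at what a block decorated with schema.org JSON-LD might look like, here built as a Python dict (the shape and field choices are my assumptions, not the sketch from the other thread):

import json

block_jsonld = {
    "@context": "https://schema.org",
    "@type": "Article",  # type drawn from the shared vocabulary
    "name": "My note on Zettelkasten",
    "author": {"@type": "Person", "name": "Alice"},
    "datePublished": "2022-02-02",
}
print(json.dumps(block_jsonld, indent=2))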

Having gone there, I’m going to take one more swing at my Fluree pitch, to support a lot of the goals I read through in this thread. Andrew had expressed (in the other thread) that while Fluree would be a fairly straightforward engine to drop in, it would come with added complexity that seemed misplaced in the application. Given the discussion here, I beg to disagree.

I’ll preface this by recognizing that Logseq’s origins and mission are around being a second brain, a more inwardly focused mindset with a more ad hoc evolution that doesn’t lend itself to strong typing. Yet, to be interoperable with the Semantic Web, type consistency is needed.

Anyway, I realize the notion of shared editing of graphs may seem sacrilegious, yet I need to share my knowledge, and not by rendering out graph segments and hosting them as webpages; I actually do want to co-edit knowledge graphs. And even further, I want granular access control on the graph. And to be clear, I get that this isn’t/wasn’t the target use case of Logseq. But it would be great, lol.

Anyway, to clear up a couple of things about the Fluree architecture: it has two layers, which run (and scale independently) in separate containers:

  • a blockchain persistence/state layer, and
  • an RDF graph overlaying/indexing the state (this can even be run in client-side JavaScript).

So, yes, the underlying state is immutable and append-only. When an object is deleted or updated, a new block is written to reflect that change of state, while the graph (which is what is queryable) is updated to the new state. The graph can time-travel across states for free: queries include an “at time t” input, and it costs the same to look into the past as the present. It also provides for independent scaling of read and write performance. The engine can achieve millisecond response times for queries. Clients register for updates from the ledger nodes for commits to triples in their local cache (which basically functions like a CDN).

Granted, there is complexity added around consensus, and why take that on if you don’t need consensus? But for shared editing of a graph state that all parties can rely on, it would be well worth it, whether on a local node or in the cloud.

Turning to the out of the box advantages:

  • Semantic Web native out of the box. Can be queried using SPARQL and GraphQL. Native JSON-LD in the first half of the year.
  • All transactions are cryptographically signed by identities and encrypted; absolute provenance.
  • Built in “smart functions” for identity based access control
  • ACID transactions
  • The client edge-node graph only reads in the data it needs, and local queries are blazing fast. Write performance can be scaled (and paid for) according to need.

Then, finally, real, nestable graph queries executed directly against the Fluree API seem very powerful.

Anyway, I hope you’ll give Fluree a second look in light of the developments in this thread as it seems to hit a lot of the features that have been discussed here.

All the best!

This was done in 0.8.15!! (PR #7699)

2 Likes