Different ways to structure data

I understand that Logseq could provide better UI/UX for this, but why don’t you just

  • Organize Markdown files in folders like we do with our documents anyway, or
  • Maintain some indexes of pages, and possibly one for notes that are yet to be organized?

In the latter case you could add “Organized” as page-tag and in the page [[Not organized]] put a query that lists all the pages without the tag Organized.
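To make the logic of that [[Not organized]] query concrete, here is a minimal Python sketch of the same filter. It is not Logseq code: the `page_tags` dictionary and the page names in it are made up for illustration, and a real graph would be read from the Markdown files or queried inside Logseq.

```python
# Hypothetical stand-in for a graph: page name -> values of its tags:: property.
page_tags = {
    "Meeting notes": ["Organized"],
    "Random idea": [],
    "Reading list": ["Books"],
}


def not_organized(page_tags, marker="Organized"):
    """Return the pages that do not carry the marker tag (compared case-insensitively)."""
    wanted = marker.lower()
    return sorted(
        page
        for page, tags in page_tags.items()
        if wanted not in (tag.lower() for tag in tags)
    )


print(not_organized(page_tags))  # ['Random idea', 'Reading list']
```

The point is only that "yet to be organized" can be derived from the absence of a tag, so nothing has to be maintained by hand for the inbox list.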

Logseq provides powerful generic tools that the user can combine in creative ways, and this leads to complaints like the ones in this thread. But if Logseq forced one specific workflow there would be even more complaints, because any single workflow fits only a small portion of the users. And indeed Logseq receives complaints in the areas where it forces specific workflows, like task management with TODO/DONE etc.

Also I am convinced that one shouldn’t create pages lightly: Logseq has this great feature of focusing the view on a block and its children. This lets you write notes as “root” blocks; the t o shortcut lets you collapse all blocks so that you have an overview of your notes, and then you can click on a bullet point to focus on one of them.


Edit, example page to gather notes before and after clicking a bullet point:



What you described is literally 100% of what I suffer from when using Logseq. This lack of a fixed structural system to represent my graph has been driving me mad for the past 2 years, ever since I started using Logseq. The tool really just brings too many benefits for me to give up on it just yet. However, the need to recall where I stored something in my notes really defeats the entire purpose of a second brain; as Tiago states in his book:

“Now, it’s time to acknowledge that we can’t “use our head” to store everything we need to know, and outsource the job of remembering to technology.”

This outsourced remembering should also include a hierarchical structure of our pages.


I’m a pedestrian user of Logseq at best. What I’m finding, though, is this:

At first, “tag it, link it, and give it properties” works reasonably well, especially when the number of notes remains small. For me, though, this approach isn’t scaling well … and I find myself needing some kind of structure to help me a) navigate my notes and b) make sure I’m using tags, links, and properties consistently.

This creates an administrative burden – the manual maintenance of a map – that feels more like work than play. And the effort needed to maintain that map will only increase over time.

This is already a challenge for me, even in my own (very simple) applications of Logseq over two months. Search results are fine, but they don’t dispel a growing feeling that I’m getting lost in a trap of my own making.


I’ve found another way to organize pages that is as handy to browse as namespaces but more flexible.

It’s the “tags::” property. It is a special property for pages because it makes the pages used as tags display the “Pages tagged with <current-page>” section.

It can be used with this structure:

Child.md
tags:: [[Parent]]

Then at the bottom of [[Parent]] there will be the section mentioned above with the link to [[Child]].


My example in the first post would be implemented like this:

Linear Algebra.md
tags:: [[Algebra]]

Algebra.md
tags:: [[Mathematics]]

This is basically like organizing pages in folders but more flexible because a page can be in different “folders” at the same time.
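As a rough illustration of why this behaves like folders with multiple membership, here is a small Python sketch that inverts the tags:: property the way the “Pages tagged with” section does, and then walks upwards from a page. The `page_tags` dictionary is a hypothetical stand-in for a graph, not something Logseq exposes in this form.

```python
from collections import defaultdict

# Hypothetical stand-in for a graph: page name -> values of its tags:: property.
page_tags = {
    "Linear Algebra": ["Algebra"],
    "Algebra": ["Mathematics"],
    "Mathematics": [],
}


def pages_tagged_with(page_tags):
    """Invert tags:: so that each parent page maps to the pages tagged with it."""
    children = defaultdict(list)
    for page, tags in page_tags.items():
        for tag in tags:
            children[tag].append(page)
    return {parent: sorted(kids) for parent, kids in children.items()}


def ancestors(page, page_tags):
    """Follow tags:: upwards; a page may have several parents, unlike a file in a folder."""
    seen = []
    stack = list(page_tags.get(page, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(page_tags.get(parent, []))
    return seen


print(pages_tagged_with(page_tags)["Mathematics"])  # ['Algebra']
print(ancestors("Linear Algebra", page_tags))       # ['Algebra', 'Mathematics']
```

Adding a second value to a page’s tags:: list simply gives it a second parent, which is the “different folders at the same time” part.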

There is another way to organize pages and it is my favourite so far because it scales well with very long complex “poly-hierarchies”:

You can write indexes like the following wherever you want:

- [[Index]]
  - [[Foo parent]]
    - [[Foo child]]
  - [[Bar parent]]
    - [[Bar child]]

Then when you are on a page and you want to see where it is in one or more indexes, use

{{query (and <%current page%> [[Index]])}}

You will be able to browse the hierarchy without leaving the page.

You can create a macro with it and a custom command, so that you can just type /Index at the top of a page to add it.
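For anyone who wants to see the idea outside of Logseq, here is a minimal Python sketch of what the lookup amounts to: given an indented index, find every path that ends at the current page. It only emulates the breadcrumb you get in the query result; it is not how Logseq evaluates queries. The `INDEX` text mirrors the example above, with one extra [[Foo child]] entry added under [[Bar parent]] (my assumption) to show a page sitting in two branches.

```python
import re

# The example index from above, plus a second (assumed) occurrence of [[Foo child]].
INDEX = """\
- [[Index]]
  - [[Foo parent]]
    - [[Foo child]]
  - [[Bar parent]]
    - [[Bar child]]
    - [[Foo child]]
"""


def index_paths(outline, page):
    """Return every breadcrumb path in the outline whose last block references `page`."""
    paths = []
    trail = []  # (indent, page name) for the ancestors of the current block
    for line in outline.splitlines():
        match = re.match(r"^(\s*)- \[\[(.+?)\]\]\s*$", line)
        if not match:
            continue
        indent, name = len(match.group(1)), match.group(2)
        # Drop trail entries that are not shallower than the current block.
        while trail and trail[-1][0] >= indent:
            trail.pop()
        trail.append((indent, name))
        if name == page:
            paths.append([n for _, n in trail])
    return paths


for path in index_paths(INDEX, "Foo child"):
    print(" > ".join(path))
# Index > Foo parent > Foo child
# Index > Bar parent > Foo child
```

The same page showing up under two parents is exactly the poly-hierarchy case that namespaces cannot express.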


P.S. you can use my minimal style for queries to make the indexes look more native:


Precisely. I honestly think they need to implement a file browser that we can use to create folders and nest pages inside them. Then, when we insert a page reference into our notes, display a hint just above the page-reference name that shows where the file is located in the folder structure. If two or more pages with the same name exist, display them as separate items, each with its folder path above its name.

Example:
Say I am taking notes on [[Data Structures]] for Knowledge Management. I have two pages in my graph for [[Data Structures]], one for Programming and one for Knowledge Management. When I want to capture a new note on Data Structures for Knowledge Management, I start the linking process [[Data Structures]] and Logseq presents me with two options: the first that is under Knowledge Management, and the second that is under Programming. I can now wrap all my Knowledge Management notes for Data Structures using that first page, while keeping my Data Structures notes for Programming separate. Thus I now end up with structure to help me organize.

With this complete, I would then render the tree structure on the page so you can expand open a folder and view other related pages that are nested inside, or traverse up/down the tree as needed.

There really is no need to reinvent the wheel here and design something brand new. Just give us the ability to structure our notes in a way we are all familiar with and leave it there as optional so folks can choose to use it if they wish.

Final note: I would leave tags as a global feature, meaning that tags have no hierarchy to them. This could then provide some more benefits to the tagging system and allow it to be used for grouping related content together. Tags should also not create an actual page reference or physical file on the filesystem. They should only act as link nodes that can link notes together; they would not exist on their own, only through their ability to group content.


What the indexes I mentioned above look like:



While tagging and namespacing certainly work, they feel like a lot of friction. I’m finding it tedious to just type in “MGMT-364” over and over just to file my school notes, and keep them separate from similarly named “Week 3 Lecture” notes for other classes. Like, I can’t state how often I accidentally created a “MGMT264” page because I can’t type to save my life.

I guess I could change the way I do notes. Jam it all into a journal page first, then extract it to a well named page. But that also is friction – learning a new way to work. I’m not looking to become a gardener. I just want a way to jot down info I don’t want to forget, and quickly find it when I need to. Remembering that I learned “something about less rigid product development flow” as part of “some class last spring” is more likely than me remembering that there is this thing called “Agile Methodology”, or that I learned it on February 22nd. Flipping through a dozen lecture notes pages is good enough search capability for me in this case.

I think having an alternate view to manage namespaces would cut down on friction by a lot. Like many have suggested, a folder view is natural. It matches how namespaces work in reality (one location in a tree). Being able to see the whole tree at once and move items around would make it easier to groom the kb. Creating a note “inside” a node of the tree would automatically start the page name with the namespace filled out – a boon to poor typists such as myself. Behind the view, the data would still look and get updated in the same way it does today so people’s queries would still work.

Accommodating items in multiple locations would be more challenging. Trilium Notes lets you have a page in multiple locations of the tree, but I find it undesirable to do so in practice. When trying to link to the note, you see all of the locations in the auto-suggestion box – messy. Personally I don’t derive much value from doing so – navigating a graph view would be more than enough if I ever wanted to see all the varied linkages.

Namespaces are not folders, nor are they meant to organize existing pages into hierarchies. Namespaces in Logseq are what namespaces are in general: a way to disambiguate different things that happen to have the same name.

Have you read about the indexes I mentioned above? Logseq already has this awesome UI for organizing blocks hierarchically that we usually call the outliner, so why not just use that?

In my workflow, when I want to organize some pages in a hierarchy I just write it as an indented list of blocks where each block has a page reference. I tag the top of the list with #index or class:: [[index]].

Then since I have defined a custom command I just need to type /Index somewhere in a page to output a {{index}} macro I defined as a query with a simple AND between current page and [[index]].

The result is that I see which indexes mention the current page:


Hmm, unless I’m missing something, folders are namespaces? You must have unique filenames in a folder, but can have different folders with the same filenames across them. IMO, the term folder is a throwback to the days when developers had to make information systems more relatable to office workers.

I did see your example of using index pages, and I do something similar already:

In my use case I don’t have very unique names for my notes because it’s essentially “week # lecture notes” for every class I take. Right now namespaces don’t give me any functional benefit. I might as well name my lecture notes like [[week 5 lecture #mgmt-364]].

Actually …:thinking:. I have an idea…

Yes but folders being namespaces for files is secondary: we don’t use folders to disambiguate files with the same name, we use them to organize files with a hierarchy that has some kind of logic.

Let’s say you have [[Alice]], a page for a specific person. You want to put [[Alice]] both in [[Contacts]] and [[Friends]]. That would be a hierarchical structure between pages. But if you use namespaces to express this structure, you would have [[Contacts/Alice]] and [[Friends/Alice]], which are two different pages and so two different nodes.

In Logseq pages are nodes so you may want those nodes to be meaningful. For example a course is something that definitely makes sense as a node, but the lecture of a course of a specific week? I think not. There is the outliner for that: you can zoom on a block and its children by clicking on the bullet point and it’s often a better alternative to creating a dedicated page.

In your place I would organize that page like this:

MGMT-364
- [[Week 1]]
  - [[Lectures]]
    - actual content
  - [[Case study]]
    - actual content
- [[Week 2]]
  - [[Lectures]]
    - actual content
  - [[Case study]]
    - actual content

So basically instead of creating dedicated pages I would use collapsed blocks and click on the bullet point to open them.

There is a breadcrumb widget at the top to browse the hierarchy.

Then with a query you can retrieve those collapsed blocks in many ways:

#+BEGIN_QUERY
{:query
 (and
    [[MGMT-364]]
    [[Lectures]]
    (or
       [[Week 1]]
       [[Week 2]]
       [[Week 3]]
    )
  )
}
#+END_QUERY

Also this way the [[references]] you make in the content of your notes will be linked directly to [[MGMT-364]] instead of one of its “subpages” created using the namespace function.


There’s nothing wrong with how you are doing it. You gotta find what works and makes the most sense for you. Not everybody’s method will work for everyone. Though, I would say what you’re doing is what many others have found to be helpful for class lectures. There’s this video here from another user that shows how they use namespaces for class notes: Is this the BEST Notetaking app for students? | Logseq Student Workflow - YouTube

Also this may help:


@alex0 thank you very much for your examples on how to index using indentation + query in each child page (+ minimal css + command hack)
I struggled to find a proper way to construct index methodology and your method is working great! :pray:

  • Side note about namespace hierarchy, for those who find it useful for hierarchy (@boisjere FYI)
    Using alias, a page can be in several hierarchies, not only one.
    In fact, the page itself can have a short name, while the aliases define any hierarchies needed.
    E.g.
    Page “LogSeq”
    alias:: KPM/LogSeq, ProductivityTools/LogSeq
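A small Python sketch of what this buys you, under my own assumptions: the `pages` dictionary is a hypothetical stand-in for the graph, and lookups are lowercased because Logseq treats page names case-insensitively. Any of the hierarchical aliases, as well as the short name, resolves to the same single page.

```python
# Hypothetical stand-in for a graph: page name -> values of its alias:: property.
pages = {
    "LogSeq": ["KPM/LogSeq", "ProductivityTools/LogSeq"],
}


def build_alias_map(pages):
    """Map every page name and alias (lowercased) to the page that owns it."""
    lookup = {}
    for name, aliases in pages.items():
        for key in [name, *aliases]:
            lookup[key.lower()] = name
    return lookup


def resolve(link, lookup):
    """Return the page a [[link]] points to, or None if no page or alias matches."""
    return lookup.get(link.lower())


lookup = build_alias_map(pages)
print(resolve("KPM/LogSeq", lookup))                # LogSeq
print(resolve("ProductivityTools/LogSeq", lookup))  # LogSeq
print(resolve("logseq", lookup))                    # LogSeq
```

So the page lives in two namespace branches at once without being duplicated, which plain namespaces cannot do.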

(Beware of this active bug, which requires a re-index after every alias change… Renaming prop::alias doesn't cause the linked tag to change. · Issue #8602 · logseq/logseq · GitHub)


This use of the “alias” property is genius! Thank you for sharing it! It opens up a whole new vista of options with respect to hierarchies!


No problem!

I have tried to use alias that way too, but it is too buggy for me. I have heard that the implementation of alias isn’t great, so this is somewhat expected, and if I reported bugs like those the reply would probably be that the alias feature needs a rewrite.

I use namespaces to browse queries with incremental filters so it would be nice for me if aliases worked well with namespaces.


Alex, is this right? With namespaces we lack the ability to use the same page in two different branches of a logical tree, whereas by using your structured-block-of-only-pages approach we gain it…

Yes and you can also decide the order of each block, you can format them with highlight for example, you can compose hierarchies using embedded blocks, you can use as element of the hierarchy not only pages but also block references, links to blocks (with the syntax [Label](((block-id)))), plain text to emulate folders, URLs, queries, images etc.

Of course, with your approach we keep all the native functionality LogSeq offers for managing our indexes or hierarchies (synonyms here). I hadn’t thought of applying the embedding capability to indexes, as you pointed out, thanks a lot.

I for one came to LogSeq lured by the thinking-resemblance approach to blocks and pages (which are blocks as well, as you have insisted): bidirectional linking and immediate retrieval of previous concepts (here “tags” or better “pages”) at your fingertips, mimicking natural memory connections. But knowledge is that and structure as well. I was beginning to doubt that LogSeq would be useful for me in the long run, since there appears to be no section or plugin to manage hierarchy between pages. But after reading and understanding your posts (the epiphany came with this one Different ways to structure data - #49 by alex0), it is now clear to me how unwise it is to suppose that a program whose basic building block is hierarchy (structured blocks) wouldn’t have in its nature a way to bring structure into the bigger ideas (the page-blocks here).

Whether explicitly intended or not (my guess is the latter), giving your pages complex intertwined polyhierarchies is possible and manageable in LogSeq at a very reasonable cost, and, once you’re on, it does feel like playing (in reference to what a fellow posted here earlier). All you have to do is use LogSeq’s basic linking, blocking and database query capabilities. You’ve just been showing the way in different threads of this forum. Thanks for that.


I’m struggling with this as well.

I agree with previous points that there seems to be some convergence in the note-taking field to a “mixed strategy” including

  • bidirectional [[wikitextlike-links]]
  • categorization hierarchy
  • tagging

I have observed this particularly within the Obsidian community.

Goal, desired properties

One way to frame the problem would be: With an increasing number of notes, it’s not that easy to find a convention or structure that

  • is low friction
    • meaning: fast entry and low maintenance
    • would be helped by: allowing for adding structure/semantics incrementally[1][2]
  • fights duplication
  • allows heavy linking
    • e.g. in LogSeq, linking to blocks has many drawbacks vs. using ordinary [[page links]]
  • allows at least some rudimentary strategy for invariant enforcement/schema adherence/integrity checks (even if partly or fully manual)

…this list goes on and on…

…and here, far down - with regards to short-term feasibility, definitely not desirability! - we’re approaching holy grails such as

  • connecting our personal knowledge base, with full structure and semantics, to a global, distributed one
  • applying machine reasoning to our knowledge base

This is elaborated on further in the closing section - but first: LogSeq.

LogSeq functionality considerations

I’m not sure more “special features” of LogSeq would take us further in this regard. As has already been pointed out: structure is a very personal preference.

I think the best LogSeq can do right now is to provide as many general, use-case-agnostic capabilities as possible, and to do it as well as possible - e.g. allowing user customization through general and stable querying functionality, the properties:: implementation, and, also of importance, a customizable interface (example feature request).

That would allow experimentation. Among the user base, various strategies for structure could evolve, and the community as a whole can get inspiration and gain knowledge. Some convergence could be expected with time.

My current LogSeq schema

Only partly implemented, and in no way a perfect solution.

It consists basically of the following:

A hierarchy of categories

encodes: is-a relation
through: namespaces

example: [[vehicle/boat/submarine]]

The above page has a
alias:: submarine
for allowing shorter [[submarine]] link names.

I have chosen not to adhere to strict subtyping for my category taxonomies. Yes, Barbara is left unsatisfied as a consequence. The opposite choice here probably could allow for some clever tricks on how page properties could be utilized. But how the possible LogSeq queries that could perhaps make use of that would look… I don’t even want to think about. Even less, debug them.

Instance-to-category assignment

encodes: instance-of relation
through: a type:: page property with the category as target

example: page [[Boaty McBoatface]] has a page property
type:: [[vehicle/boat/submarine]]

Then, each category page has a query that lists all its instances.
example: page [[vehicle/boat/submarine]] has a query
{{query (page-property type <% current page %>)}}
which will list all submarines:

  • Boaty McBoatface
  • HSwMS Östergötland
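For readers who want the instance-of lookup spelled out, here is a minimal Python sketch of what the category-page query computes. The `page_types` dictionary is a hypothetical stand-in for the type:: properties in my graph; the real query runs inside LogSeq.

```python
# Hypothetical stand-in for a graph: page name -> values of its type:: property.
page_types = {
    "Boaty McBoatface": ["vehicle/boat/submarine"],
    "HSwMS Östergötland": ["vehicle/boat/submarine", "military_thing/naval_vessel"],
}


def instances_of(category, page_types):
    """List the pages whose type:: property contains the given category page."""
    return sorted(page for page, types in page_types.items() if category in types)


print(instances_of("vehicle/boat/submarine", page_types))
# ['Boaty McBoatface', 'HSwMS Östergötland']
print(instances_of("military_thing/naval_vessel", page_types))
# ['HSwMS Östergötland']
```

Note that this matches the exact category only, like the query above: a page typed as a submarine is not listed on the [[vehicle/boat]] page unless that query is widened.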

drawbacks:

  • refactoring is very tedious
  • the LogSeq bug of not resolving <% current page %> when page is opened in the right sidepane is a great nuisance

I use faceted classification:

  • meaning: we can have multiple, distinct, hierarchical taxonomies, and the final classification of the page will be the intersection of the assigned category node for each taxonomy hierarchy
  • this is possible in LogSeq since properties can have multiple values, we get optional support for
  • example: page [[HSwMS Östergötland]] has type:: [[vehicle/boat/submarine]], [[military_thing/naval_vessel]]
  • in the field of information science, faceted classification is generally considered a very good thing - and it does bring a lot of benefits to my LogSeq classification system

I don’t use polyhierarchies:

  • (and it wouldn’t be possible if using LogSeq namespaces for the category tree)
  • meaning: every category node has at most 1 parent node
  • a beneficial consequence is that it allows short, pragmatic node/category names, while we’re still conforming to the all-some rule: it’s generally easy to append an additional hierarchical level, with a short name, in order to allow further refinement/specificity
  • example: a category path /military_thing/naval_vessel/ can, because of no polyhierarchies, be read equivalently as /military_thing/military_thing--naval_vessel/. All military_thing--naval_vessels are military_things - so /military_thing/naval_vessel/ conforms to the all-some rule.
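A rough Python sketch of how the two ideas combine, with function names of my own choosing: because every namespace node has exactly one parent, a category path expands into a single ancestor chain, and a page with several type:: values gets one such chain per taxonomy (facet).

```python
def category_chain(category):
    """Expand 'a/b/c' into ['a', 'a/b', 'a/b/c']; each node has exactly one parent."""
    parts = category.split("/")
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def classification(types):
    """Faceted classification: one ancestor chain per taxonomy, keyed by its root."""
    return {t.split("/")[0]: category_chain(t) for t in types}


facets = classification(["vehicle/boat/submarine", "military_thing/naval_vessel"])
print(facets["vehicle"])
# ['vehicle', 'vehicle/boat', 'vehicle/boat/submarine']
print(facets["military_thing"])
# ['military_thing', 'military_thing/naval_vessel']
```

Reading a chain from right to left is the all-some rule in action: every submarine is a boat, and every boat is a vehicle. The sketch assumes at most one type per taxonomy root; two types under the same root would overwrite each other in the dictionary.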

Leaning towards fewer and longer pages

benefits:

  • hierarchy enforcement
  • faster entry
  • faster re-factoring
    • moving stuff around within the block hierarchy of a page is easy and fast

drawbacks:

  • linking to blocks is inferior vs. to pages (and the need increases as we reduce page granularity)
  • the target of a tag can’t be a block - it can only be a page, so this is a general drawback of less fine-grained pages
  • the still-present, much-too-old LogSeq UI bug of sometimes not displaying the full page is a terrible friction point (…this bug manifests itself both in the main pane and side pane!)

extensively using, for better overview within a page:

  • headings (# …, ## …, ### …)
  • folding

Tagging

i.e. page tags
through

  • tags:: myTag page property
  • tagging individual blocks: #myTag

used for: less formal, sometimes ad hoc, additional structure, such as marking a page as belonging to some bigger area, or collecting a number of related pages (instances or categories) together

benefit:

  • easy to tag not just pages, but also individual blocks

“Authority control”

For each page, I try to add some page property where the value is some URL pointing to some external reference for the intended scope of the page. Usually this is a link to a Wikipedia article. The purpose is to attach an identifier to the page that I can use if, at some later point in time, the page name isn’t enough for me to quickly determine what the intended scope of the page is. This can be seen as some rudimentary/poor man’s linked data vocabulary or authority file connection.

Example: When re-visiting the page [[grammar]] I might ask: does it refer to my internal LogSeq grammar? To grammar in natural language? To database grammars? The page property wikipedia:: https://en.wikipedia.org/wiki/Formal_grammar comes to the rescue and answers the question.

Further comments

  • this system is most of all just a convention I have for myself, in order to have an established standard for how to enter and structure information in LogSeq. Its intention is not to allow a lot of “implementation” such as using a lot of clever queries etc. I use queries very sparingly. I’ve gone down that route a few times, but the unstable state of LogSeq is just too prohibitive (bugs, and partly incomplete documentation)

  • about is-a and instance-of relations

    • difference between the two: the type-token distinction
    • I don’t use “category” pages differently from “instance” pages, e.g. pages of both kinds can contain contents, and can be used as tags
  • I make heavy use of aliases

    • often I want contents for similar but perhaps not distinct concepts on the same page, so terms for those concepts would be alias-ed to the same page
    • I usually provide aliases for alternative spellings, for singular+plural forms, and for both abbreviated and unabbreviated versions (yes, due to combinatorics this can indeed end up with a page having a lot of aliases)
    • this makes linking easier
      • I can just add [[]]'s around all occurrences in various texts to get linking
      • I can easily find the pages to link by using the “Unlinked References” section
  • I avoid page names/aliases that are too general, and that are often used in the general language

    • example: for the concept of frames in symbolic AI I don’t have a page name/alias “frame”, but stick to a qualified page name such as [[frame (AI)]]

My current graph has:

  • ~500 pages
  • ~18k lines
  • ~123k words
  • (for pages/*.md, so excluding journals - but I don’t use them much)
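If you want comparable numbers for your own graph, a minimal Python sketch along these lines should do. It is not the tool I used for the figures above; the `graph_stats` name and the way words are counted (whitespace-separated tokens, so bullets and property keys count too) are my own assumptions.

```python
from pathlib import Path


def graph_stats(pages_dir):
    """Count Markdown pages, lines and whitespace-separated words directly under pages_dir."""
    pages = lines = words = 0
    for path in sorted(Path(pages_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        pages += 1
        lines += len(text.splitlines())
        words += len(text.split())
    return {"pages": pages, "lines": lines, "words": words}
```

Calling `graph_stats("pages")` from the root folder of a graph returns a dictionary with the three counts. Journals are skipped simply because they live in a different folder.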

The ultimate, non-existing, note-taking system

Here, I’m leaving the LogSeq domain. This section is on note-taking systems in general.

A possible ultimate goal: a note taking system that is formal, fully semantic, fully linked (internally and externally), type-safe and invariant-enforced.

There is currently no such note-taking tool.

I regard the quest for a total knowledge representation system, as in the general information science sense, with full structure and semantics, as a perhaps unsolved problem.

“Note-taking system” might sound innocent, but it is probably one of the harder domains to model. It would need to be so all-encompassing: we take notes on facts, ideas, thoughts, beliefs, possibilities, to mention a few. It includes relations that are, e.g.: temporal, probabilistic, causal, conditional. Sometimes a connection/link would need to be specified along all of these dimensions in order to be fully described. Any one of these relationship types is likely to send shivers down the spine of a practicing ontologist. That’s about link/relationship types. Another link/relationship stratum would be arity - bi-directional links only would be a limitation (relates to hypergraphs further down). Yet another would be the possibility of linking not only to notes/nodes/entities - but also to some set of such (relates to metagraphs further down).


One direction would be some complete and fully-specified ontology. It would need to include all possible link relationships as well. To grasp the vastness of such a potential ontology, we can have a look at a published ontology for cultural heritage sites. That’s a 240-page pdf - for a reasonably narrow domain.

Another, but partially overlapping, perspective would be to see our data as a knowledge graph or graph database. Well, LogSeq could possibly be described as a knowledge graph. But if we want a knowledge graph that is fully-semantic and fully-typesafe it starts getting complicated. For full generality and expressivity the graph model won’t suffice. Probably not even its generalization, hypergraphs. Probably rather the generalization of those - metagraphs. This is following the lines of thinking of Ben Goertzel - see further down.

One obvious inspiration in the graph-based realm is the Semantic Web, with its roots in the early 2000s. Impractical, never really fully realized, and using the much-dreaded XML for everything - but, it is impressively rigorous, extensively researched, and very well specified. Within technology, it probably holds a world record in the (no. of published papers)/(actual practical use) category. The intersection of Semantic Web technology and personal note-taking has seen several product attempts and, not surprisingly, even more academic papers. See for example Max Völkel’s thesis (alt.) or his later papers. These are the best sources I have found so far on formalizing link relationships and their types in the area of personal note-taking. For a lighter take on the subject, see for example Jonathan Reeves blog post.

Among initiatives along other but related routes, and more recently, we have e.g. Hode by Jeff Brown. Hode can be described as a note-taking DSL in the form of a hypergraph editor (text+GUI). It is implemented in Haskell, so a checkmark on “type safe” would probably be an understatement. It is/was a single-developer tool, or prototype. It is now abandoned.

An interesting approach, that has not been tried yet as far as I know, would be to implement a note taking system in TypeDB: a fully type safe graph database with built-in schema. That’s a good start, and it gets even better as we encounter such gems as an included inference engine and hypergraph capabilities. The TypeDB+note-taking idea has been mentioned by the creator of Hode, and also by e.g. Robert Haisfield.

One could also take a category theory approach, e.g. with ologs (“knowledge representation with category theory”). Or - one step closer to note-taking - utilizing the Categorical Query Language (CQL) (my understanding: “a graph database language with category theory”). It is created by David Spivak et al., who were also behind the ologs. Ologs have been mentioned before on the forum, e.g. in A meta-graph as a set of linked graphs by @gax, who also briefly mentions CQL. He also presents a take on LogSeq vs graph databases.

For completeness we should also mention frames (my understanding: “knowledge representation inspired by object-oriented modeling”) as a possible note-taking data model.

This section should end with what is perhaps currently the pinnacle of formal, graph-based knowledge representation: Ben Goertzel’s knowledge representation model ([2]). It has its home within his OpenCog symbolic AI system. This model, and implementation, (my understanding:) has very far-reaching expressivity, is fully formalized, is totally type-safe, has total invariant/schema enforcement, includes its own ontology, includes/allows all the relationship types I mentioned previously, allows inference (and includes the inference engine), and has relationship expressivity on the metagraph level: relationships have arbitrary arity, and can point not only to entities but to arbitrary sets of entities.

Other links

ScalingSynthesis.com is a goldmine for any and all ideas and initiatives for bringing structure and semantics to note-taking. A project by Joel Chan and said Robert Haisfield.

Ivo Veltichkov has developed a semantic ontology-based Roam PKM system:

Transferring it to LogSeq is perhaps not impossible. At least not if some of the most restrictive LogSeq bugs were fixed.


changelog
2023-05-09: significant edits and additions
