Different ways to structure data

Here there are my understandings so far of data structures available on Logseq, feel free to comment and eventually correct me:

  • More than the “pages are collections of blocks” I like “pages are blocks with a name”.
    • We have indentation to create hierarchies between blocks.
    • Indentation of blocks can create hierarchies between pages too, the trick is using blocks containing only a reference to a page:
      		  - [[Parent]]
      		     - [[Child]]
      		       - [[Teddy]]
      		          - Some text (eventually)
      
    • Every page mentioned (parent, child, teddy) will have a backlink to the others at the bottom, something like:
      [[Parent]] > [[Child]] > [[Teddy]]
    • And we can have as many hierarchies as we like mentioning the same page. Here there is an example with [[Mathematics]] page serving as an “index” of two different hierarchies:
      		  Mathematics
      		  - Branches
      		    - [[Algebra]]
      		      - [[Linear Algebra]]
      		      - [[Group Theory]]
      		      - ...
      		    - [[Geometry]]
      		      - [[Euclidean Geometry]]
      		      - [[Topology]]
      		      - ...
      		  - Resources
      		    - [[Theorems]]
      		      - [[Linear Algebra]]
      		      - [[Topology]]
      		      - ...
      		    - [[Definitions]]
      		      - [[Algebra]]
      		      - ...
      
    • Pages names can create hierarchies with [[parent/child/teddy]] syntax and these are better described as “namespaces”.
      • In programming languages and informatics we use the concept of namespaces to avoid ambiguity.
      • For example [[Fruits/Apple]] and [[Companies/Apple]], but a case like this, with almost only one element in common between [[Companies]] and [[Fruits]] doesn’t justify the introduction of these hierarchies that weighten the syntax. This particular example is better solved by two pages like these: [[Apple (Fruit)]] and [[Apple (Company)]] (Wikipedia uses this method). If one want to display just “apple” or in general replace the page name with something else when mentioning it, there is the []() syntax.
      • So what are use cases for namespaces? The cases with huge overlap of childs names between different parents, for example:
        • Books’ chapters, lessons numbers:
          [[My New Book Title/Chapter 1]]
          [[Another Book/Chapter 1]]
          [[An Online Course/Lesson 1]]
          [[An Online Course/Lesson 2]]
        • They can be more compact:
          [[My New Book Title/1]]
          [[Another Book/1]]
          [[An Online Course/1]]
          [[An Online Course/2]]
        • The children names (in these case just the numbers 1, 2, …) makes sense only if associated with a parent page.
        • But remember: the children are pages and it makes sense to create them only if you want to mention them using their names somewhere else (pages are blocks with a name…). Otherwise they can just be indented blocks:
          				  Parent (page title)
          				  - Child 1
          				    - lorem ipsum dolor sit amet
          				    - ...
          				  - Child 2
          				    - amet sit dolor ipsum lorem
          				    - ...
          
          and one can still browse them individually (by clicking on children’s bullet points), so no need to turn children into pages with [[parent/child]] syntax if one don’t need to use that syntax to reference them.
        • Another use case for namespaces could be (I’m still experimenting with it) a quick way to browse queries:
          • [[Theorems]] contains a query listing all pages/blocks with a property like type:: theorem
            [[Geometry/Theorems]] contains a query listing that does the same with type:: theorem and area:: geometry
            [[Theorems/Geometry]] does exactly the same but this way I can use namespaces’ “hierarchy” section at the bottom of pages to quickly browse them:
            					  Hierarchy
            					  [[Geometry]] > [[Theorems]]
            					  [[Geometry]] > [[Definitions]]
            					  
            					  Hierarchy
            					  [[Theorems]] > [[Geometry]]
            					  [[Theorems]] > [[Algebra]]
            
          • This use case is a almost workaround for the lacking UI/UX of queries though. If there was a quick way to toggle filters to queries this hack could be useless.
37 Likes

Thank you for this well thought out article on hierarchies. I’ve been trying to find a way to make it work for me, and you’ve given me a few more options to try. THANK YOU!

2 Likes

So a hierarchy is actually a “namespace” - is that right?

1 Like

Generally speaking:

  • indenting blocks
  • indenting blocks containing only [[page names]]
  • namespaces

are different kinds of hierarchies.

But it happens that Logseq display the label “Hierarchy” above the namespace section, leading to confusion.

1 Like

A namespace is a hierarchy, but it’s not the only hierarchy in Logseq.

(To clarify to others reading this what we’re talking about with namespaces, it’s this:)[[ParentPage/ChildPage]]

As @alex0 explained in the original post, hierarchy in Logseq also happens on the branch level:

  • [[Parent]]
    • [[Child]]

So when you go to the [[Parent]] page, you’ll also be able to see (and filter on) [[Child]]. And when you’re on the [[Child]] page, you can also be able to see (and filter on) [[Parent]].

6 Likes

I believe going over the pros and cons of how the ways we structure our data affect our ability to query relevant information would be nice to cover during the summer sprint.

We can include properties, user-defined metadata, as another important aspect of querying data in Logseq that we could utilize.

Therefore, we have two primary tools:

- *Hierarchies* 
    - Namespaces
    - Pages
        - Indentation
- *Properties*
    - Page Level
    - Block Level
8 Likes

Thanks for pointing it out.

Indeed 90% of the structure of my graph is properties, 10% hierarchies. Maybe I should have mentioned it, I wrote about hierarchies because the database-like approach of queries was self-explanatory to me.

The first thing I do on a new page is setting up properties, using templates for the recurring ones or manually. But what I always set is a type:: property and the values are things like: bookmark, definition, connection, thread, note, video, book, author, area, etymology.

Then for each “type” I have a page with a query on it to display all the pages/blocks of that type.

Even when I have to save something from my Logseq Telegram bot I write messages like:

type:: [[book]]
title::
author::
area::

type:: [[bookmark]]
subtype:: [[git repo]]
url:: https://github.com/logseq/logseq
description:: main [[Logseq]] repo

and boom it’s in the database. Life changing.

10 Likes

Thank you for starting the discussion and sharing how you use the tools for us, Alex.

Obsidian’s Dataview plug-in showed me the power of using metadata to dynamically query relevant information—and I am excited to use hierarchies and properties in Logseq to do the same.

Ultimately, I believe the summer sprint will help in the process of developing a long-term system of querying data that can scale as the number of our notes, both blocks, and pages, increases.

1 Like

Alex: These are two separate examples - right?

1 Like

Yes they are supposed to be two blocks

1 Like

I’m confused about which parts of this post are available today and which are proposals.

For example, is this available today?

1 Like

Everything mentioned here is available, I guess the future tense caused confusion. This is the Learning Sprints section, here we post what we learn about Logseq.

The thing you quoted is displayed at the bottom of each page (in the example below there is the “Algebra” page) in the section “Linked Reference”:

immagine

1 Like

with logseq, is it just as easy to query this kind of hierarchy as it is to query relationships in the namespace hierarchy?

1 Like

Good question. How do you use namespaces with queries?

At the moment I use indentantion of blocks just for “indexes” like the math example above, so I don’t need to query them. I use properties:: to maximize the usefulness of queries.

1 Like

I don’t (yet), as I’m new to Obsidian. But it sounds like it’s not possible to do queries taking namespace into account? That’s a bit odd. In Obsidian, it’s normal to be able to search and query with a path: argument, which would be the equivalent.

You need to be more specific and anyway with advanced queries you can do everything since with them you are basically accessing the database Logseq is using internally.

Ok I’ll play with the queries to see how easy it is to write queries taking into account namespace and indentation structure

I kind of wish the section at the bottom of the page was called “Namespaces” then, or “Page Hierarchies”. I also wish there was a “Namespaces” page you could toggle on and off, so it would display a link on the Left Sidebar between Graph View and All Pages called “Namespaces”. Then I’d like to click there and see all the trees of namespaces in my graph.

It’s a bit like a folder view for namespaces, but my brain is more of a foldering one than a tagging one, at its core.

4 Likes

This kind of replicates the folder/tag distinction, in that the Namespaces/Pages/Indentation manner of creating context is structural, and the Page Properties/Block Properties manner of assembling items to view is dynamic.

When using Properties, contextual information information becomes widely distributed. You tag the note to a structural idea but did not follow a browsing path to the point in the structure where you want that note.

One challenge for me here is that building out structural relationships which have some stability (e.g namespaces, folders) and are easily visualized, has great mnemonic value.

For someone like me, the up front decision of “where to put a note” is actually one of the best parts of personal info management. I love it because it’s an up-front investment in developing mental clarity. I’m a structuralist thinker.

The timeline approach confuses the hell out of me. I suppose if I was free to build out my topic-oriented graph, and timestamp entries so that they got embedded into the Journal kind of incidentally, I wouldn’t mind. I hate the idea of working in the Journal though.

If I want to work on my History of Goth Rock page, I want to work on that page, toplogically! I don’t want to work in my Journal and tag it to that page. I’d rather it was the other way around.

Once I start dispersing structural information across blocks in the form of properties, I get lost. I lose my bearing/orientation, and each note loses meaning for me. It only makes sense when I traverse a meaningful path to get there.

Even if the UX is similar, knowing that I have to add properties to a block to get it to show up in the right place, instead of just clicking a stable structure to get there, is disorienting. I lose my trust in the system that way.

tl;dr There are cognitive reasons people might prefer the Namespaces/Pages/Indentation manner of creating context, as opposed to the Properties manner.

If the two were technically combined, some of that tension would ease. If I had a Namespace page that was editable, so I could move pages up, down and around the outline on that page, and this automatically updated a Namespace property, then it would matter less to me.

Or if there was a Hierarchies page, with one root-level “Namespaces” block as a system-provided block, but then you could add Query-based hierarchies below that, this would also be a bit more usable for me.

I have a little hack where I query {{namespace [[Hierarchies]]} on the Hierarchies page itself, and then I put every single page in a graph in the Hierarchies namespace. The page names get brutally long, but at least I have the deep, narrow pathways into the heart of my graph that match the way I prefer to access information.

In Bergman and Whittaker’s book, “The Science of Managing Our Digital Stuff”, they point to research indicating how when it comes to Personal (as opposed to public) information, most people prefer navigating a structure to searching for (therefore remembering) terms.

That may be because you can use procedural memory instead of declarative/semantic memory, or you can rely on recognition instead of recall to access information. Navigational methods activate a large bilateral posterior region of the brain, much of which operates subliminally, to guide info retrieval. Search leans very heavily on Broca’s area.

So the two manners of creating hierarchy Cesar mentioned kind of map onto this navigational vs. keyword-based way of structuring knowledge.

7 Likes

On analogy from other digital tools, I’d say that you combine querying and namespaces by creating something like “Smart Folders”. So you could have a query that returns the Child and Teddy pages of some set of Parent pages - and you could put filters on that.

So I could get everything from my Psychology and Physical Anthropology namespaces that has “Limbic System” in it - then I could put that page in my hierarchy.

1 Like