Would a rich commitment to hierarchies and classification be an anathema to Logseq culture?

Hi @boisjere and @alex0,

I’ve started to draft a feature request on “Knowledge Management for tags / Hierarchical tags”. Please let me know what you think:

Why do we need Knowledge Management for Tags?

  • Tagging leads to a large number of unrelated tags, easily many thousands
  • Browsing these tags through a list, graph, or tag cloud, is not very efficient
  • Many efficient search strategies exist elsewhere:
  • All of these search strategies require knowledge of the relationships between tags
  • Logseq’s hierarchies of the form [[parent/child/teddy]] are not enough
    • Each child can only have one parent (teddy can be both in the child, and in the stuffedAnimals category, but currently this can’t be recorded)
    • The classification is specified on the pages themselves and can’t be added on later
      • If I tag 100 pages with teddy and later want to add the tag to a hierarchy, I need to edit every single page. Instead, it should be possible to tag pages, and then later classify tags centrally, making all of the original pages findable under the proper hierarchies

What needs to be done?

  • Logseq needs a way to specify relationships between tags:
    • TagA is a broader/narrower version of TagB
    • TagA is related to TagB
  • These relationships are captured centrally, such that tags can be managed without editing each tagged page individually
  • We need a user interface to hierarchically browse Logseq pages (not part of this feature request)
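
To make the idea of centrally stored relationships concrete, here is a minimal sketch in Python. It is not how Logseq works today; the names `broader`, `page_tags`, `ancestors`, and `pages_under` are made up for illustration, and the data is the teddy example from above. The point is only that pages carry plain tags, while the hierarchy lives in one separate place.

```python
# Hypothetical central registry: tag -> set of directly broader tags.
# A tag may have several broader tags (polyhierarchy).
broader = {
    "teddy": {"child", "stuffedAnimals"},
    "child": {"parent"},
}

# Hypothetical page index: page name -> tags written on that page.
page_tags = {
    "Page 1": {"teddy"},
    "Page 2": {"child"},
    "Page 3": {"cooking"},
}


def ancestors(tag, broader):
    """Return every tag reachable by following 'broader' links from tag."""
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, ()):
            if parent not in seen:  # also protects against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def pages_under(tag, page_tags, broader):
    """Pages tagged with `tag` itself or with any narrower tag."""
    return sorted(
        page
        for page, tags in page_tags.items()
        if any(t == tag or tag in ancestors(t, broader) for t in tags)
    )
```

Adding or moving a tag in `broader` changes what `pages_under` returns without touching any page, which is the decoupling this request is about.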

Example use case

  • Unsinkable Sam example goes here

Example implementation

  • This example uses Markdown to represent a subset of the Simple Knowledge Organization System, which is widely used, e.g. by libraries.

    • Tags are connected using the relations broader, narrower, broaderTransitive, narrowerTransitive and related
      • broader, narrower
        • specify that one tag represents a broader/narrower concept than another
      • broaderTransitive, narrowerTransitive
        • specify that one tag represents a broader/narrower concept than another and all its children/parents
        • e.g. Cat is a narrower concept than Mammal, which automatically makes it a narrower concept than Animal
      • related
        • two tags are related, e.g. Apples and ApplePie
  • I am not tied to any specific syntax; this is just an example of how it could be done in Markdown itself.

    • Alternatively, Logseq could directly parse SKOS RDF/Turtle description files.
    • SKOS is a minimal example; other knowledge management systems exist, and in principle Logseq could record arbitrary relations between tags.
  • The following relationship is a (small) section of the animal taxonomy.
    Sub-items of the list are narrower terms for their parent items. The lists can be arbitrarily nested. For example, Chordata and Mammalia are both narrower terms for Animalia. For a non-transitive relationship, Chordata would be a narrower term for Animalia, but Mammalia would not.

    • The animal taxonomy has the namespace animals to distinguish it from other hierarchies that can exist in parallel. One item can be in multiple hierarchies at the same time
    ``` markdown
    semanticRelation::narrowerTransitive
    concept::animals
    - Animalia
      - Chordata
        - Mammalia
          - Carnivora
            - Feliformia
              - Felidae
                - Felinae
                  - Felis
                    - F. catus
                    - F. silvestris
    ```
      
    • This information is stored in a central Markdown file
    • If a user tags an item with animals:F. catus, the item will automatically appear in a search for Animalia
    • The user does not need to tag with the entire hierarchy animals:Animalia/Chordata/Mammalia/Carnivora/Feliformia/Felidae/Felinae/…, as this would duplicate the hierarchy on every item. The tag is only animals:F. catus, from which Logseq can infer that we are dealing with a type of cat.
  • Here is an example of a user-defined hierarchy that has non-unique names. It is not a taxonomy, but it is still useful to be able to capture such relationships, so this case probably shouldn’t be disallowed. If we require the user to provide minimal context, e.g. nutrition:Beef/Recipes instead of just nutrition:Recipes, we can make tags unique. This will make any blocks/pages tagged this way visible under the Products, Meats, and Beef categories.

    ``` markdown
    	  semanticRelation::narrowerTransitive
    	  concept::nutrition
    	  - Products
    	    - Meats
    	      - Beef
    	        - Recipes
    	        - Nutrition
    	      - Pork
    	        - Recipes
    	        - Nutrition
    ```
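
As a rough illustration of how such an outline could be read and how a tag with minimal context could be resolved, here is a small Python sketch. The function names `parse_outline` and `resolve` are hypothetical, and the parser assumes indentation with spaces only; it is a sketch of the idea, not a proposal for the actual implementation.

```python
OUTLINE = """\
semanticRelation::narrowerTransitive
concept::nutrition
- Products
  - Meats
    - Beef
      - Recipes
      - Nutrition
    - Pork
      - Recipes
      - Nutrition
"""


def parse_outline(text):
    """Return (header properties, list of root-to-node paths)."""
    header = {}
    paths = []
    stack = []  # (indent, name) for the current chain of parents
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            indent = len(raw) - len(raw.lstrip())
            name = stripped[2:].strip()
            # Drop siblings and deeper items until we reach the parent.
            while stack and stack[-1][0] >= indent:
                stack.pop()
            stack.append((indent, name))
            paths.append(tuple(n for _, n in stack))
        elif "::" in stripped:
            key, value = stripped.split("::", 1)
            header[key.strip()] = value.strip()
    return header, paths


def resolve(tag, paths):
    """Find all paths whose tail matches a tag such as 'Beef/Recipes'."""
    parts = tuple(tag.split("/"))
    return [path for path in paths if path[-len(parts):] == parts]


header, paths = parse_outline(OUTLINE)
```

With this data, `resolve("Beef/Recipes", paths)` yields a single path, while `resolve("Recipes", paths)` yields two, which is exactly the ambiguity that the minimal context removes. Everything before the last element of a resolved path is a category the tagged item should appear under.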
    
  • This is an example of a “related” relationship. All of the tags [frying, deepFrying, airFrying, grilling] are marked as related.

    • If a user tags an item with the tag frying, a search for related items will bring up the other 3
      	  semanticRelation::related
      	  concept::cooking
      	  frying
      	  deepFrying
      	  airFrying
      	  grilling
      	  
      
      • Related tags can also live in different namespaces
      	  semanticRelation::related
      	  cooking:frying
      	  nutrition:fat
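
A lookup for “related” tags could be sketched like this in Python. The names `related_groups` and `related_to` are hypothetical, and tags are written with their namespace prefix for clarity. In SKOS, related is symmetric but not transitive, so this sketch only returns tags that share a group with the starting tag and does not chain further.

```python
# Hypothetical store: each set is one group of mutually related tags.
related_groups = [
    {"cooking:frying", "cooking:deepFrying", "cooking:airFrying", "cooking:grilling"},
    {"cooking:frying", "nutrition:fat"},
]


def related_to(tag, groups):
    """Return all tags that share at least one group with `tag`."""
    result = set()
    for group in groups:
        if tag in group:
            result |= group
    result.discard(tag)  # a tag is not listed as related to itself
    return sorted(result)
```

Here `related_to("nutrition:fat", related_groups)` returns only `cooking:frying`; it does not pull in grilling, because the two are not declared as related directly.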
      

I am thinking, @alex0, that perhaps we agree on substance, but I used “hierarchy” in an ambiguous way. Thank you for your more focused description. When you wrote:

you can browse them maybe interactively with “virtual” hierarchies

That is what I want. A good UX layered over what you describe as a database approach is great! What matters is the experience of browsing a tree as a way of traversing a subset of your graph, above the level of the page.

Conveniently, a tree = an outline, in terms of UX. Logseq already does the UX right.

In terms of UX, I wish I could edit “All pages”, “Linked references” and “Unlinked references” exactly like a standard page, in outline form, with maybe a drop-down menu specifying the query:

  • So one could be the current default or flat list in reverse chronological order
  • One could be according to some taxonomy or classification scheme, plus a flat list for notes outside that scheme
  • One could be by project, plus a flat list for notes outside that scheme
  • And you could edit those pages like normal outlines, and turn them into saved search pages

I don’t need hierarchies to be stored objects necessarily. It will be better if they are saved queries that return outline-structured, editable, saveable results. It’s good because I’d want many notes to fall within the scope of more than one tree.

  • In one of my examples above, the hierarchy shows a note for [[Topics/Sexuality/Sexual Power Dynamics]].
  • That note is relevant to both [[Topics/Sexuality]] and [[Topics/Psychodynamics]] Topics.
  • But the reason I’m exploring it is for the novel I’m working on: [[Journals/Project Logs/Candace Valleigh]].
  • As a novelist, the fire in the story comes from deep inside my psyche, so I also need to ask hard, searching questions about that topic and why it is emerging in my heart as a topic relevant to this story, so I need to do personal journaling about it. I’ll want to open my personal journal and go to that topic to work on it, when I have reflections that relate to that topic.

In each of these locations, I want that note to appear in editable, collapsible outlines.

I can do this manually, or try my luck with Unlinked References, which I cannot currently edit as an outline in and of itself. This makes it a bit of a mess, taking too long to scan visually, so I ignore it.

As a non-technical user, I just want the experience to work. I want this tool to be useful for me and for the way I think. Since it’s in early stages of development, this is a chance to have input. Also, since what I want is editable outlines, and much of what Logseq does is present interrelated blocks in editable-outline form, this seems like a promising tool.

But I want this to be a tool, not a toy. As a non-technical user, if I have to tinker with it too much, it becomes a toy. I’d play with it for the sheer delight of dancing up the learning curve, and do my real work in Scrivener or something.

I’m currently using Scrivener as a massive database for teaching the history of psychology, but the search is too slow, and the foldering is a bit too rigid - even though labels help with lateral connections a bit. It’s really not optimized to be a long term environment for grooming and improving a web of knowledge, but I can be fast in it.

There’s no such thing as “just UX” for a non-technical user. I feel comfortable in Logseq, but some founders, like those building app.capacities.io, take UX pretty seriously, and it feels good to be respected like that, as a user, and not be told to construct something that looks like code. (I’m a bit technical, but prefer GUIs for most things).

I get what you are trying to accomplish here, but why wouldn’t properties that store relations, like in my example above, be enough?

My point is that we already have the data structure in Logseq and we just need a better way to query it.

If you were able to type something like {{tree }}, for example {{tree extended-by}}, and it rendered an indented list of pages linked together by the specified property (in this case “extended-by”), would that be enough for you?

For example:

Page A
...
Page B
extended-by:: [[Page A]]
{{tree extended-by}}

renders:

* Page A
    * Page B
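
To show that the rendering above needs nothing beyond the existing properties, here is a minimal Python sketch of what a hypothetical `{{tree <property>}}` command could compute. The `pages` dictionary stands in for Logseq's page store and `render_tree` is an invented name; neither is a real Logseq API.

```python
# Hypothetical page store: page name -> {property name: [linked page names]}.
pages = {
    "Page A": {},
    "Page B": {"extended-by": ["Page A"]},
}


def render_tree(pages, prop):
    """Render pages as an indented list, nesting each page under the
    pages it points to through the property `prop`."""
    children = {name: [] for name in pages}
    has_parent = set()
    for name, props in pages.items():
        for parent in props.get(prop, []):
            children.setdefault(parent, []).append(name)
            has_parent.add(name)

    lines = []

    def walk(name, depth, path):
        lines.append("    " * depth + "* " + name)
        for child in sorted(children.get(name, [])):
            if child not in path:  # skip links that would form a cycle
                walk(child, depth + 1, path | {child})

    # Roots are pages that do not point to any other page via `prop`.
    for root in sorted(n for n in children if n not in has_parent):
        walk(root, 0, {root})
    return "\n".join(lines)


print(render_tree(pages, "extended-by"))
```

A page that points to two parents simply shows up under both, so the same function covers polyhierarchies.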

If you mean that references etc should be a special default case of what we are discussing here and should be displayed with a command, i.e.

{{linked-references}}
{{unlinked-references}}
{{namespaces}}
{{tagged-pages}}
{{tree <property>}}

then I’m totally with you.

You know @gax, I think your feature request is really powerful. I would really, really, really want software that can do this with seamless UX. I am currently hacking together some of this kind of knowledge organization using these tools:

I’m using Vocbench and TemaTres to try and connect with existing taxonomies, and Taguette to infer my own, over my collections of notes. I am doing it for tag control.

This is exactly the point you make in your Feature Request draft. It fights tag explosion, which matters for those of us who like to maintain a synoptic overview of what our knowledgebase contains. It also surfaces knowledge, because some categorical knowledge can be inferred directly from taxonomic or categorical relations.

I don’t have anything to add really.

  • You are doing a very good what and why feature request for this kind of tagging
  • @alex0 is very good at explaining the best ways to do it in Logseq
  • I just want to start using it.

I think that with you two doing much better work than me, my feature requests can be residual UX considerations - that queries leveraging SKOS or hand-coded trees be returned as editable outlines that can be saved as auto-updating pages. Something like that.

The best I can do is suggest how I want the thing to feel in my hands.

This is tantalizing! And I guess an SKOS importer would be able to map SKOS info to this Logseq data structure?

If that is the case, your ideas about how Logseq can already handle this data extend the feature request being constructed by @gax. If there were a section in that feature request explaining in more detail how Logseq is already able to handle this data, it would focus the question.

I think the question becomes: should Logseq’s core behaviours evolve in a way that more deeply supports the management of these properties, or are we left with the personal burden of constructing queries, or waiting for a plugin and hoping its development always remains up to date?

Maybe this is ultimately not a technical question, but a user question. If people - and scholarly communities - who deal with taxonomies and classification schemes frequently (academics to a degree, transnational organizations to a very significant degree), are to be supported by Logseq, it would be best to make taxonomy-management functions core.

And queries should definitely have more display options and not only block list and table.

For example a carousel of cards:

@alex0 What I would like to do is decouple the tagging of pages from the organization of the tags themselves.

So for example, I might tag many pages with parent, child, and teddy. Over time, as the number of tags explodes, I find that I need to organize my tags, so I create a hierarchy and tell Logseq that teddy is a child node of child which is a child node of parent. I also tell Logseq that teddy is a child node of stuffedAnimal, which is a child node of toy. This automatically makes every single page tagged teddy appear in the hierarchies, without needing to edit the individual pages.

Another use case: Items imported from other systems (e.g. Zotero) have many overlapping tags, such as “History, 20th Century”, “history”, “History / World”. In practice, this creates graphs that are practically unusable, the recommended solution being to just delete the tags (and discard the information contained in the tags).

A much better solution would be to create your own hierarchy, and then tell Logseq where the existing tags fit into your hierarchy. For example, I could search for all tags containing “history”, and place them under my history hierarchy.
This will automatically categorize all of the imported data without needing to edit any of the pages. Later, as my organizational system changes, I can move the tags around or create other hierarchies that fit my workflow.
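
The “search for all tags containing history and place them under my hierarchy” step could look roughly like this in Python. `classify_matching` and the `broader` mapping are hypothetical names used only for this sketch, and the root tag `History` is an assumed name for my own hierarchy.

```python
existing_tags = ["History, 20th Century", "history", "History / World", "Biology"]


def classify_matching(tags, needle, parent, broader):
    """Record `parent` as a broader tag for every tag containing `needle`.

    Matching is case-insensitive. `broader` maps tag -> set of broader
    tags and is updated in place; pages are never touched.
    """
    needle = needle.casefold()
    for tag in tags:
        if needle in tag.casefold() and tag != parent:
            broader.setdefault(tag, set()).add(parent)
    return broader


broader = classify_matching(existing_tags, "history", "History", {})
```

After this call, all three imported history tags sit under `History`, and `Biology` is left unclassified for a later pass.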

Yes, I got the use case, but what you are asking for is assigning some properties to tags so that they form a hierarchy, i.e. one tag has some other tags as children. And someone mentioned “polyhierarchies” or something like that because some tags may belong to different hierarchies.

I’m saying this is already possible since tags are pages and pages can have properties with the syntax:

property-name:: something, [[Page A]], URL, whatever

So you can already structure those polyhierarchies but you don’t have the UI/UX to query and display them in an efficient manner.

I.e. you can use {{query }} syntax to list pages that have certain properties but it’s not iterative, you can’t make it display

* Page A
   * Page B
        * Page C
   * Page D

when these pages are linked together by a property you define, for example “extends”, “extended-by”, “generalizes” or whatever name you want to give to each of the polyhierarchies you need.
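
Since the relations are just `property:: value` lines in the page text, collecting them is simple. The following Python sketch extracts page links from property lines; the regular expressions are my own approximation of the syntax shown above and will not cover every form Logseq accepts, and `page_links` is an invented name.

```python
import re

# Approximation: an optional "- " bullet, a property name, "::", then the value.
PROPERTY_LINE = re.compile(r"^\s*(?:-\s+)?([A-Za-z0-9_.-]+)::\s*(.*)$")
# A page reference such as [[Page A]].
PAGE_REF = re.compile(r"\[\[([^\[\]]+)\]\]")


def page_links(text):
    """Return {property name: [referenced page names]} for one page."""
    links = {}
    for line in text.splitlines():
        match = PROPERTY_LINE.match(line)
        if match:
            key, value = match.groups()
            refs = PAGE_REF.findall(value)
            if refs:
                links.setdefault(key, []).extend(refs)
    return links


example = "extends:: [[Page A]], [[Page D]]\n- a note that mentions [[Page X]]\n"
print(page_links(example))
```

Only references inside property lines are collected; the ordinary mention of Page X in the note is ignored, so each property name can define its own hierarchy.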

If you don’t want to display an indented list but a graph, it doesn’t matter; it’s still a UI/UX issue. There is no need (from what I can tell) to extend Logseq’s data structures.

Is it clear what I mean?

SKOS items can have notes, which can contain images.

@alex0 I get it, I didn’t think about tags as pages. The user interface would then pull all the relationships out of the individual pages and make them editable.

This is so cool @alex0 ! If it’s already possible to specify the properties we need, we’d benefit from a standard way of doing it, so we can enjoy all the benefits of the large feature request @gax was drafting. Also, an SKOS-properties standard in Logseq would be a draw for many people.

And of course, people not interested in this could ignore the whole thing.

I’ve created a feature request: Knowledge Management for Tags / Tag Hierarchies

Hey @pvb, I just checked out Trilium and I agree!

  • a) it beautifully handles a tree-based organization with the ability to access clones of a note on different branches in the tree
  • b) it sucks that it’s implemented in SQLite. Logseq achieves a better separation of data from behaviour.

Also, even though I’ve been pushing hard for a more hierarchical way to traverse my graph, Trilium seems a bit too rigidly hierarchical. What I really want is a flexible graph, with lots of ways to traverse it - including a tree-constrained manner of doing it for content that is naturally connected that way.

I only want to put info into tree format that is helpfully organized that way. I don’t want to be prevented from doing that, but I don’t want to be forced to do it either.

@gax I see there is talk about simplifying the feature request.

It’s funny because here I see a feature request being tweeted out that your ideas would support. It’s exemplified without explanation.

I guess it’s better to post more full ideas in the general discussions area, and very simple concrete things in Feature Requests

This is a great idea and fairly straightforward. I am not sure how much text is required to calculate a meaningful similarity metric. It would work really well for scientific documents, though.

I am not very familiar with this, but I have seen some work on automatic concept hierarchy generation.
It might be possible, given a large enough collection, to automatically create yet another type of hierarchy at the tag level.

Length is a tradeoff: if I shorten it, some people might say “but why can’t you use tags?” Not everyone is familiar with the concepts. The important part is at the top anyway. Let’s see if it generates any feedback.

Do you want to add some user interface ideas to the proposal? You probably saw this, but SKOS notes support arbitrary documents and also images, so one can build e.g. a tree with little thumbnails.
It could also just take the first image it finds for any tag. Lots of opportunities; I really hope this can become part of Logseq.

I’ll add some feature requests over time, based on this conversation. I’ve fallen behind on other things and will have to do that later. I’m also going to play with queries and properties a bit first to see if I can get some ideas working.

I think we’ve been talking past each other. I agree with @alex0’s suggestion for adding tree and graph searches and for storing the information.
I did not think about designing hierarchies efficiently, which @boisjere’s program can do.
For now, I’d be happy to even design the hierarchy by hand in Markdown.

I was solely talking about efficiently classifying existing nodes. While theoretically this can be done on each individual page, it is too cumbersome to classify hundreds or many thousands of nodes this way.

What I would like to see is this:

The process should be designed to be as efficient as possible, such that it is realistic to sort a few thousand yet unclassified tags into the hierarchy.

If I understand it correctly you want to be able to select multiple pages (from the graph or from a list) and assign/remove the same properties at the same time. This could be useful in general and another feature request.

Also, the advantage of using text files to store data is that you can write scripts or entire programs to generate or manipulate them programmatically so you could manage properties/tags with another tool.
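
As a rough example of such a script, this Python sketch adds the same property to several page files at once. It assumes pages are plain Markdown files and that putting a `key:: value` line at the top of the file is an acceptable place for a page property; the property name `broader` and both function names are hypothetical. Try it on a copy of your graph first.

```python
from pathlib import Path


def with_property(text, key, value):
    """Return page text with `key:: value` as the first line,
    unless the page already sets that property."""
    prefix = f"{key}::"
    for line in text.splitlines():
        if line.strip().startswith(prefix):
            return text  # leave pages that already have the property alone
    return f"{key}:: {value}\n{text}"


def add_property_to_pages(paths, key, value):
    """Apply with_property to each file; return names of changed files."""
    changed = []
    for path in map(Path, paths):
        original = path.read_text(encoding="utf-8")
        updated = with_property(original, key, value)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

For example, `add_property_to_pages(["pages/teddy.md"], "broader", "[[toy]]")` would classify the teddy tag page without opening it in Logseq, and running it a second time changes nothing.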

Exactly. I don’t worry about the taxonomy generation; there are a couple of tools available, and in the worst case it can be done in a text editor. Compared to the number of tags, the taxonomy is small.

What I would like to see is:

  • a way to efficiently add properties. I think it makes sense to have this in the Logseq UI to leverage the existing search functionality on the graph. It should have a (poly)hierarchical representation of the nodes on the left, so that we don’t just move items between random tag clouds. It would be easy to parse the properties from the pages, but at the moment I can’t think of a more efficient way than something similar to the UI I suggested. I am open to suggestions.
  • tree search, as suggested by you. This also needs to be a core part of Logseq.