Would a rich commitment to hierarchies and classification be anathema to Logseq culture?

@alex0 What I would like to do is decouple the tagging of pages from the organization of the tags themselves.

So for example, I might tag many pages with parent, child, and teddy. Over time, as the number of tags explodes, I find that I need to organize my tags, so I create a hierarchy and tell Logseq that teddy is a child node of child, which is a child node of parent. I also tell Logseq that teddy is a child node of stuffedAnimal, which is a child node of toy. This automatically makes every single page tagged teddy appear in both hierarchies, without needing to edit the individual pages.

Another use case: Items imported from other systems (e.g. Zotero) have many overlapping tags, such as “History, 20th Century”, “history”, “History / World”. In practice, this creates graphs that look like this
*(attached screenshot: a dense, tangled tag graph)*
and is practically unusable, the recommended solution being to just delete the tags (and discard the information contained in the tags).

A much better solution would be to create your own hierarchy, and then tell Logseq where the existing tags fit into your hierarchy. For example, I could search for all tags containing “history”, and place them under my history hierarchy.
This will automatically categorize all of the imported data without needing to edit any of the pages. Later, as my organizational system changes, I can move the tags around or create other hierarchies that fit my workflow.
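That mapping step could be sketched in a few lines of Python (the data and the idea of a separate tag-to-hierarchy map are illustrative only; nothing here is an existing Logseq feature):

```python
# Illustrative sketch: file messy imported tags under one hierarchy node
# without touching the tagged pages themselves. All names are hypothetical.

imported_tags = ["History, 20th Century", "history", "History / World"]

# node -> raw tags filed under it; pages keep their original tags
hierarchy = {"history": []}

for tag in imported_tags:
    if "history" in tag.lower():   # "search for all tags containing 'history'"
        hierarchy["history"].append(tag)

print(hierarchy["history"])
# ['History, 20th Century', 'history', 'History / World']
```

The point is that the mapping lives in one place, so reorganizing it later means editing the map, not the pages.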


Yes, I got the use case, but what you are asking for is assigning properties to tags so that they form a hierarchy, i.e. one tag has other tags as children. Someone also mentioned “polyhierarchies” or something like that, because some tags may belong to different hierarchies.

I’m saying this is already possible since tags are pages and pages can have properties with the syntax:

property-name:: something, [[Page A]], URL, whatever

So you can already structure those polyhierarchies but you don’t have the UI/UX to query and display them in an efficient manner.

That is, you can use the {{query}} syntax to list pages that have certain properties, but it’s not recursive: you can’t make it display

* Page A
   * Page B
        * Page C
   * Page D

when these pages are linked together by a property you define, for example “extends”, “extended-by”, “generalizes” or whatever name you want to give to each of the polyhierarchies you need.

If you want to display a graph instead of an indented list, it doesn’t matter: it’s still a UI/UX issue; no need (from what I can tell) to extend Logseq’s data structures.
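To illustrate that this really is only a display problem, a short external script can already render such properties as an indented list (the property name “extends” and the page data are hypothetical and hard-coded here; in practice they would be parsed from each page’s `extends::` line):

```python
# Sketch: pages linked by a user-defined "extends" property, rendered as
# the indented list that Logseq's query UI cannot yet display.

extends = {          # child page -> parent page
    "Page B": "Page A",
    "Page C": "Page B",
    "Page D": "Page A",
}

# Invert the relation: parent page -> list of child pages
children = {}
for child, parent in extends.items():
    children.setdefault(parent, []).append(child)

def show(page, depth=0):
    print("  " * depth + "* " + page)
    for c in children.get(page, []):
        show(c, depth + 1)

show("Page A")
# * Page A
#   * Page B
#     * Page C
#   * Page D
```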

Is it clear what I mean?


SKOS items can have notes, which can contain images.

@alex0 I get it, I didn’t think about tags as pages. The user interface would then pull all the relationships out of the individual pages and make them editable.


This is so cool @alex0 ! If it’s already possible to specify the properties we need, we’d still benefit from a standard way of doing it, so we can enjoy everything in the large feature request @gax was drafting. Also, a SKOS-properties standard in Logseq would be a draw for many people.

And of course, people not interested in this could ignore the whole thing.


I’ve created a feature request: Knowledge Management for Tags / Tag Hierarchies


Hey @pvb, I just checked out Trilium and I agree!

  • a) it beautifully handles tree-based organization, with the ability to access clones of a note on different branches of the tree
  • b) it sucks that it’s implemented on top of SQLite. Logseq achieves a better separation of data from behaviour.

Also, even though I’ve been pushing hard for a more hierarchical way to traverse my graph, Trilium seems a bit too rigidly hierarchical. What I really want is a flexible graph, with lots of ways to traverse it, including a tree-constrained manner of doing it for content that is naturally connected that way.

I only want to put info into tree format that is helpfully organized that way. I don’t want to be prevented from doing that, but I don’t want to be forced to do it either.

@gax I see there is talk about simplifying the feature request.

It’s funny, because here I see a feature request being tweeted out that your ideas would support; it’s exemplified without explanation.

I guess it’s better to post fuller ideas in the general discussions area, and very simple, concrete things in Feature Requests.

This is a great idea and fairly straightforward. I am not sure how much text is required to calculate a meaningful similarity metric. It would work really well for scientific documents, though.

I am not very familiar with this, but I have seen some work on automatic concept hierarchy generation.
It might be possible, given a large enough collection, to automatically create yet another type of hierarchy at the tag level.

Length is a tradeoff: if I shorten it, some people might say “but why can’t you use tags?” Not everyone is familiar with the concepts. The important part is at the top anyway. Let’s see if it generates any feedback.

Do you want to add some user interface ideas to the proposal? You probably saw this, but SKOS notes support arbitrary documents and also images, so one can build e.g. a tree with little thumbnails.
It could also just take the first image it finds for any tag. Lots of opportunities; I really hope this can become part of Logseq.

I’ll add some feature requests over time, based on this conversation. I’ve fallen behind on other things and will have to do that later. I’m also going to play with queries and properties a bit first to see if I can get some ideas working.


I think we’ve been talking past each other. I agree with @alex0’s suggestion for adding tree and graph searches and for storing the information.
I did not think about designing hierarchies efficiently, which @boisjere’s program can do.
For now, I’d be happy to even design the hierarchy by hand in Markdown.

I was solely talking about efficiently classifying existing nodes. While theoretically this can be done on each individual page, it is too cumbersome to classify hundreds or many thousands of nodes this way.

What I would like to see is this:

The process should be designed to be as efficient as possible, such that it is realistic to sort a few thousand yet unclassified tags into the hierarchy.


If I understand correctly, you want to be able to select multiple pages (from the graph or from a list) and assign/remove the same properties at the same time. This could be useful in general and would make a good separate feature request.

Also, the advantage of using text files to store data is that you can write scripts or entire programs to generate or manipulate them programmatically so you could manage properties/tags with another tool.
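For instance, a minimal Python sketch of such an external tool (the file layout, property names, and the `add_property` helper are all assumptions for illustration, not a real utility; back up your graph before running anything like this):

```python
# Sketch: bulk-add a property line to every Markdown page whose first
# "tags::" line mentions a given tag. Paths and property names are
# hypothetical examples.
from pathlib import Path

def add_property(pages_dir, tag, prop_line):
    for page in Path(pages_dir).glob("*.md"):
        text = page.read_text(encoding="utf-8")
        first = text.splitlines()[0] if text else ""
        # Only touch pages tagged with `tag`, and never add the line twice.
        if first.startswith("tags::") and tag in first and prop_line not in text:
            page.write_text(prop_line + "\n" + text, encoding="utf-8")

# Hypothetical usage: file every page tagged "teddy" under a broader concept:
# add_property("graph/pages", "teddy", "broader:: [[stuffedAnimal]]")
```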


Exactly. I don’t worry about the taxonomy generation; there are a couple of tools available, and in the worst case it can be done in a text editor. In contrast to the number of tags, the taxonomy itself is small.

What I would like to see is:

  • a way to efficiently add properties. I think it makes sense to have this in the Logseq UI to leverage the existing search functionality on the graph. It should have a (poly)hierarchical representation of the nodes on the left, so that we don’t just move items between random tag clouds. It would be easy to parse the properties from the pages, but at the moment I can’t think of a more efficient way than something similar to the UI I suggested. I am open to suggestions.
  • tree search, as suggested by you. This also needs to be a core part of Logseq.

I’m getting indications that it is a bit contrary to the culture. The coolest feature about Roam and Logseq is bidirectionally linked block-based graph traversal. Early evangelists are super-excited about this, and kind of dump on earlier (millennia old) ways of organizing knowledge.

I get it as a way of highlighting what’s new, but I think the baby’s going out with the bathwater.

The main evangelist who is trying to balance both the old and the new ways of managing information is Nick Milo. He loves minimal foldering, Maps of Content, and Datascopes, as ways of maintaining enough high-level knowledge about your content to enter the graph with some positional awareness.

For those of us who need that initial moment of orientation, doing pure block-based graph traversal sucks, quite frankly.

Early evangelists are full of excitement about the moment they “got it”, and they’re trying to help everyone else “get it”, but aren’t widening the funnel much for those of us who prefer to do things a bit differently.

Historically, this has been very associated with the culture around Roam, but it happens a bit around Logseq too.

I’m the kind of person who is very mentally clear at the 50,000 ft overview level of my information, and I need to orient myself to my knowledge landscape before I decide where to land and hunt. But early evangelists of block-based PKM tools are sometimes most excited about ruling that out.

I want the new ways of exploring information, but I want to choose my starting place, somewhere besides the Daily Journal, or an un-indented list of Favourites.

So my question remains genuine. I’m not sure whether a rich commitment to hierarchies and classification is kosher for Logseq’s developers.

  • Is it an exciting way to differentiate themselves from Roam and support a wider market of users?
  • Or is it an irritating distraction, such that people like me should go and struggle with Obsidian?
  • Obsidian isn’t outline-based, but it has a culture supporting soft hierarchies of knowledge.

I decided to stop looking for tools, and chose Logseq, because of that “Hierarchies” section on a page. It felt like I was “coming home” to a tool that combined the ancient magic of classification with the new magic of graph traversal.

But I fear that the “Hierarchies” feature will be considered an aberration that they will leave to wither and then phase out, ironically because people may use it too much, the wrong way.

Not everyone will have that “conversion experience” to pure block-based graph traversal, and the developers may choose to focus on the true believers rather than the larger masses.

I anticipated this kind of issue. I wanted to know I could discuss hierarchies and classification with some clarity and energy. I didn’t want to disrupt this community, if in fact I was wrong about what Logseq plans to become.

So I’m still unsure if this is the place for me. It’s a bit sad, because I don’t want to be in “tool-choosing hell” again… but I don’t feel good when my needs are delegitimized (Different ways to structure data - #27 by boisjere) - especially because I love this tool, and I think my needs reflect those of future users, farther along the technology adoption curve, “across the chasm”.


If I had to design this from scratch:

  • every folder in our file system would potentially be a “graph”, even nested inside one another
  • the info about a certain graph would be saved in a hidden .logseq folder in every folder that is a graph
  • it would be possible to open any Markdown file with Logseq, and doing so would create a graph starting from that folder (creating the .logseq folder)
  • mentioning pages between different graphs would be possible using relative links (../ syntax to go up)
  • the journal would be a plugin managing one or more specified folders/graphs (even hidden ones inside the ones above)
  • Logseq wouldn’t be tied to “one graph” per instance; instead it would manage graphs the way file managers do folders

For example:

📂 All Encompassing Graph
  📂 Subgraph 1
    📂 .logseq
    📂 Journal
    📄 Page.md
    ...
  📂 Subgraph 2
    📂 .logseq
    📂 .journal
    📄 Page.md
    ...

and Logseq would hide relative paths outside editing mode, i.e.

[[./journal/2022-07-22.md]]

is displayed as

22 July 2022
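That display step could be sketched as follows (Python for illustration; the link-parsing rules are my assumptions, not Logseq behaviour):

```python
# Sketch: render a relative journal link as a human-readable date,
# as proposed above. The parsing rules are invented for illustration.
from datetime import date

def display(link):
    # "[[./journal/2022-07-22.md]]" -> "2022-07-22"
    stem = link.strip("[]").rsplit("/", 1)[-1].removesuffix(".md")
    d = date.fromisoformat(stem)
    return f"{d.day} {d.strftime('%B %Y')}"

print(display("[[./journal/2022-07-22.md]]"))
# 22 July 2022
```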

I’ve not long come across Logseq and am interested in it for both knowledge and project management in the professional services (in my case, legal) arena. I posted some thoughts here on structuring information/notes for easier (re)discovery using concept tagging, but think that this thread is now perhaps more appropriate. Comments welcome!


See my reply in the related thread: Knowledge Management for Tags / Tag Hierarchies - #19 by brsma

I still have trouble understanding your concrete use case for such an extensive hierarchical taxonomy. How do you intend to make all these tags actually productive? (see also The Collector’s Fallacy • Zettelkasten Method – I had a sobering ‘ouch!’ moment reading that… :see_no_evil: )

Your comments are well taken, but I would differentiate between the ability to build a rich classification scheme and actually deploying one at a scale larger than needed. I like the generality of the WordNet metamodel, but that doesn’t mean I want or need to load all 170,000+ entries into my graphs.

As far as a use case is concerned, I’m thinking of knowledge management in the typical SME professional practice, particularly law. There is, I think, a bright future in this area for tools like Logseq by virtue of the fact that they are text based and can readily be distributed using Git or the like. For an industry that purportedly relies on knowledge, “knowledge management” remains pretty problematic. I have experienced enough failures to appreciate that the biggest problem with current systems is that of “not knowing what you know”, a problem that grows with time and scope. Put another way, discovery is a key issue.

Systems like SharePoint and its ilk allow one to save documents in folder hierarchies and to build controlled and uncontrolled vocabularies with which to classify (i.e. to tag) them, but, at least in my opinion, they fail in two ways. First, although not directly related to this discussion, they don’t support document linking to the degree that Logseq (or, on macOS, Hook) does. Second, and relevant here, educating users in how taxonomies are structured and how they should be used is a real difficulty, and making them do it consistently even more so. What makes it so difficult is the “discount rate” we apply to our time: how much effort am I prepared to invest now so that you can find something later, perhaps years later? For most people, indexing as they enter information is an adjunct to their role; it’s just not worth a significant investment of their time, especially when they’re being evaluated on the basis of billable hours. For academics and researchers that’s clearly not the case, since their role is to link, classify, assimilate and derive, but I would suggest that for the majority of knowledge workers the value of future discovery by others is rather small.

To my mind, the trick to knowledge discovery is to make life as easy as possible for those entering information. Don’t try to force users into sticking to a rigidly defined vocabulary; don’t get in their way; by all means guide them (say, with properties), but accept anything that they think appropriate.

Generally the difficult question is something along the lines of “what do we know about X?”. Here X is typically a concept, not a specific thing like the contract we wrote for some client. It’s the sort of question that library indices are designed to facilitate, albeit for physical information that is not directly searchable…

Digital information has the advantage that it can be directly searched using full text indexing. Full text searches fail, however, if the word(s) that I use to envisage X differ from those actually in the document. The same applies to tagging. The word or brief phrase that comes to my mind when I think of a concept is not necessarily that which you used when you tagged the document years ago. It is highly likely, however, that whatever word(s) I think of will be a synonym of the one you actually used.

That, however, requires a means of equating terms/phrases to concepts, which is where WordNet synsets come in. As far as hyper- / hyponyms are concerned, I view them as enabling a sense of scale, of zooming in or out. I can start with a high level concept and quickly refine a search by looking at what hyponyms have actually been used, or if I happen to start with a hyponym that hasn’t been used, pull back to a hypernym and see which of its hyponyms have been used. Other linguistic relationships provide different ways of navigating the search space.
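That zooming in and out can be sketched with a toy hand-written relation table (the terms and relations below are made up for illustration; a real implementation would query WordNet itself):

```python
# Sketch: broaden/narrow a search term via hyper-/hyponym links.
# The tiny relation table is illustrative, not real WordNet data.

hypernym = {                 # term -> broader term
    "lease": "contract",
    "licence": "contract",
    "contract": "agreement",
}

# Invert the relation: broader term -> narrower terms
hyponyms = {}
for narrow, broad in hypernym.items():
    hyponyms.setdefault(broad, []).append(narrow)

used_tags = {"lease", "agreement"}   # tags actually used in the graph

def refine(term):
    """Zoom in: which narrower terms have actually been used?"""
    return [t for t in hyponyms.get(term, []) if t in used_tags]

def broaden(term):
    """Zoom out: walk up the hypernyms until we hit a used term."""
    while term not in used_tags and term in hypernym:
        term = hypernym[term]
    return term

print(refine("contract"))   # ['lease']
print(broaden("licence"))   # 'agreement'
```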

Hope this explains better!


Thanks a lot! Actually I was asking @gax for clarification :wink: (whose use case seems to lie more on the academic side of knowledge work), but I find yours very interesting, since I am on the business side of (P)KM as well and face at least partly similar challenges (one of my major professional concerns, besides organising my own work as a manager, is enabling cross-functional business intelligence across my teams in order to help everyone make the best decisions). Yet law, with its long cultural history of elaborate reference systems, is still a different kind of beast compared to (digital) product.

educating users into how taxonomies are structured and how they should be used is a real difficulty and making them do it consistently even more so.

the trick to knowledge discovery is to make life as easy as possible for those entering information. Don’t try to force users into sticking to a rigidly defined vocabulary; don’t get in their way; by all means guide them (say with properties), but accept anything that they think appropriate

+1 to both of that. (And to your observation re: puzzling lack of digitisation in knowledge-based industries)

At the same time I would like to propose that fuzzy information retrieval is more of a software problem than something to implement manually in your graph. DEVONthink (my universal vault for all kinds of documents and reference materials) is quite good at surfacing related documents, for example. Besides, your last paragraph reads like the informal description of an algorithm that shouldn’t be too hard to implement using WordNet’s API or a similar service in other languages. Add a smart user interface and you have a nice product for legal services that should even be commercially quite viable (assuming this is a common pain point). :slight_smile: As for Logseq, this could actually be a nice search plug-in.


Thanks @brsma for your comments! The quote about the downsides of tagging is worth reading:

Extensive content-based tagging is a known anti-pattern because tags create a weak association at best between notes.

By using content-based tags you are making yourself feel that you are creating associations but you are still really shifting the burden to your future self to figure out why the notes are associated.

This becomes a significant problem when you have a larger corpus and your tag ontology begins growing beyond control.

Once you decide to tag based on subject you have to keep expanding the subjects you tag.

Then every time you add a tag later you have to decide if you will go back and re-tag all applicable prior notes, which quickly becomes untenable.

But if you don’t do that then your tagging system becomes untrustworthy, because it is not returning all notes that it should, so you start developing workarounds to compensate for the faulty tagging system, which increases the friction of using the system.

To overcome these limitations, we need (poly)hierarchical tags that can be edited independently of individual notes.

My use case for tagging is classification of knowledge (instead of building relationships between the content at a block level, which comes at a later stage).
I have many thousands of items (e.g. articles, books etc.) that need to be classified by subject areas. Luckily in the case of articles, they fit nicely into a small group of polyhierarchies.

Backlinks are great for building relationships between individual blocks of these pages (e.g. a fine-grained link between items, such as “cites: [[other article]]”, “contradicts: [[other article]]”), but they are not good for sorting the knowledge on a large scale.

An additional advantage of tags over backlinks (for classification vs. linking) is that most (all?) fields of research already have well-established classification hierarchies. When a biologist talks about cats, it is clear where such a page will sit in the biological taxonomy. It is also easy to develop limited taxonomies for the purpose of a project.
This makes sharing knowledge easy: any biologist will find it very natural to browse a graph based on the well-known taxonomy of animals. A new student could also, e.g., import a scheme from someone else and use it to get started in their own research.

Taxonomies can be edited, exported, imported, shared without touching the individual pages, which addresses many of the issues in the quote above.

A lot of effort has been put into existing ontologies, and articles and books have typically been classified, so a large fraction of Logseq pages could be automatically organized without touching any of the pages themselves.

Logseq could easily build a searchable classification just by pulling subject classifications from library catalogs, e.g. in the Unsinkable Sam case. As @GaiusScotius suggested, this does not need to be limited to simple taxonomies, but can be extended to general ontologies like WordNet.
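As a sketch of how simple such sharing could be, a taxonomy might be exchanged as a plain text file of paths (the “A > B > C” line format is invented here for illustration) and imported into a child-to-parents map without touching any page:

```python
# Sketch: import a taxonomy shared as plain "A > B > C" path lines
# (an invented interchange format) into a tag -> parents map,
# independent of the pages carrying the tags.

taxonomy_file = """\
toy > stuffedAnimal > teddy
parent > child > teddy
"""

parent_of = {}               # tag -> set of parent tags (polyhierarchy)
for line in taxonomy_file.splitlines():
    path = [p.strip() for p in line.split(">")]
    for broad, narrow in zip(path, path[1:]):
        parent_of.setdefault(narrow, set()).add(broad)

print(sorted(parent_of["teddy"]))
# ['child', 'stuffedAnimal']
```

Because “teddy” ends up with two parents, the same tag appears in both hierarchies, which is exactly the polyhierarchy behaviour discussed earlier in the thread.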
