Would a rich commitment to hierarchies and classification be an anathema to Logseq culture?

gax · July 2, 2022, 6:08am

Thank you for this excellent writeup.

I am not familiar with Logseq’s internals, but I think a plugin would be the way to start initially and let different ideas compete. Later on it might become integrated into the core.

I personally very much dislike these ancient classification systems, because I wasted a lot of my life trying to organize things in such a way. I see people on the Zotero and Calibre forums ask all the time how they can maintain their library as a folder structure sorted by author, title, etc., so these ideas are alive and still doing a lot of harm.

My worry is that, because everyone is familiar with library classifications, people might think that this is the state of the art and oppose far superior approaches.
I’ve received massive pushback at work trying to implement something that would have been trivial in a database, but management wanted and got a hierarchical folder structure that was virtually unsearchable and a complete failure.
I suspect that any proposal will get some pushback as well, saying that we can solve the issue with tags or with the rudimentary hierarchical page structure that already exists. This is why we have to explain very clearly how these old systems were made to deal with physical realities of index cards (à la Luhmann’s Zettelkasten) and items stored on shelves.

That’s why I like the SKOS concepts: they allow indices and (poly)hierarchies, and they can also be completely ignored, falling back on what we have now.

sounds good

I dislike specifying the full hierarchy for every item. That would also require memorizing the taxonomy or relying on autocomplete.
My thought was to have a separate taxonomy file that specifies the relationships between tags. In the case of taxonomies, these are very simple nested lists. In the case of general SKOS concepts it is more complex and would require writing down the relations, but this could also be mapped to Markdown.

The process would look like this:

specify relationships between tags through user-defined taxonomies.
- For example, we have a document that specifies the taxonomy of animals. As this is a simple list that narrows transitively, it can be represented by a nested Markdown list.
- Here is a small part of the classification of animals:
  - Kingdom: Animalia
    - Phylum: Chordata
      - Class: Mammalia
        - Order: Carnivora
          - Suborder: Feliformia
            - Family: Felidae
              - Subfamily: Felinae
                - Genus: Felis
                  - Species: F. catus
as you suggested, this list could be imported
adding a tag automatically places the item in the right spot of the hierarchy if the tag is unique in the taxonomy
- A block on the common house cat will be tagged animals:F.catus, which will make it appear in the right place in the hierarchy, and it will be returned in a search for Felidae, etc.
if the tag appears in multiple places in the taxonomy, a sufficiently long path needs to be specified to make it unique
- I don’t think this happens much in good taxonomies, but it is no problem: if there were two species with the same name but in different phyla, one would just have to specify enough of the path to make it unique.
if the tag appears in multiple taxonomies, we need to specify the namespace (probably this should be done for any taxonomy for future extensibility)
- For example, the tag Felidae could point to the animal taxonomy, the novel, or the film with the same name. If a user has imported both a taxonomy of animals and one of movies, animals:Felidae and movies:Felidae make clear which one is meant (and an item might have both tags and appear in both hierarchies).
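
To make the process above a bit more concrete, here is a rough Python sketch of how a plugin might read a taxonomy file and resolve tags. Everything in it is an assumption for illustration only: the function names `parse_taxonomy` and `resolve`, the two-space indentation, the `/` separator for partial paths, and the tiny `movies` taxonomy are all made up, and none of this is existing Logseq functionality.

```python
def parse_taxonomy(markdown, indent=2):
    """Return one full path (a tuple of terms) per item of a nested Markdown list."""
    paths, stack = [], []
    for line in markdown.splitlines():
        stripped = line.lstrip(" ")
        if not stripped.startswith("- "):
            continue  # skip anything that is not a list item
        depth = (len(line) - len(stripped)) // indent
        del stack[depth:]  # keep only the ancestors of this item
        stack.append(stripped[2:].strip())
        paths.append(tuple(stack))
    return paths


def resolve(tag, taxonomies):
    """Resolve 'namespace:Term' or 'namespace:Parent/Term' to exactly one full path."""
    namespace, _, partial = tag.partition(":")
    wanted = tuple(partial.split("/"))
    # A tag matches every path that ends with the given (partial) path.
    matches = [p for p in taxonomies[namespace] if p[-len(wanted):] == wanted]
    if len(matches) != 1:
        raise ValueError(f"{tag!r} matches {len(matches)} entries; expected exactly one")
    return matches[0]


# Hypothetical taxonomy files, shortened for the example.
ANIMALS = """\
- Animalia
  - Chordata
    - Mammalia
      - Carnivora
        - Felidae
          - Felis
            - F. catus
"""
MOVIES = """\
- Animation
  - Felidae
"""
taxonomies = {"animals": parse_taxonomy(ANIMALS), "movies": parse_taxonomy(MOVIES)}

cat_path = resolve("animals:F. catus", taxonomies)
print(" > ".join(cat_path))
print("Felidae" in cat_path)  # True: a search for Felidae would find the block
print(resolve("movies:Felidae", taxonomies))
```

If a term appeared twice in one taxonomy, `resolve` would raise an error until enough of the path is given (e.g. `namespace:Parent/Term`), which is the "sufficiently long path" rule from the list above.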

boisjere:

I see references that indicate to me some other ways of defining namespaces in other places in this forum:

Collapse namespace prefixes - #9 by Alex_QWxleA

Collapse namespace prefixes - #11 by Aryan

Collapse namespace prefixes - #14 by cannibalox

Anyhow, all that is to say that Polyhierarchies (having a page appear in multiple namespaces) can be implemented without having a sidebar link to an overview page, and vice versa.

I think it’s a different kind of feature request from Feature Request 2, because it does touch how namespaces work, potentially.

I still think it’s a smaller feature request than implementing SKOS (provided Logseq isn’t architecturally similar to SKOS already).

FR 1.B: Vocabulary Control

Having a nested hierarchy of terms (broader, narrower, related) is different from having branching hierarchies of namespaces. In Ranganathan’s terms, it’s classification on the “conceptual plane” as opposed to the “physical plane” (I’ll treat digital notes as “things”, and so they’re analogous in some ways to objects - on the physical plane).

How are conceptual hierarchies different from hierarchies of things like notes? You can construct namespaces (for locating things like notes) any way you like. It’s essentially manual outlining, on a macro level above the page level. There are no conceptual limitations on what you put where.

I am torn about this. On the one hand, conceptual hierarchies, indices, related items etc. are different. On the other hand, they can all be represented by the same relations, as SKOS does.
If Logseq had a way to specify the following relationships for tags:

taxonomies (as in the example for cats above, not strictly needed, but it simplifies things)
transitive and intransitive broader and narrower relationships
relatedness
(do we need anything else?)

then all of the concepts, from full imported taxonomies, user-defined taxonomies, indices could be represented using the same mechanism.
For this reason, I think from a software design point of view, they should all be integrated.

This is an excellent idea!

If the taxonomies were specified as simple .md files, it would be very easy to edit and share them.
Importing existing taxonomies would indeed be a killer feature for Logseq. What I especially like about this is that it wouldn’t force the user to put any item into the taxonomy, like in a folder structure.
Let’s say I am learning about medicine. If I start with a folder structure, it will be mostly empty and it will be difficult to locate items. Also, as a beginner, I will probably not completely understand how things are classified in the field. On the other hand, if I can import the taxonomy, I can “tag” my items with one or multiple locations, or leave them as-is.
An item can also be part of multiple taxonomies, so if I work on the healing force of crystals, I can place an article on amethyst into both the medical and the mineralogical taxonomy.

Thanks for bringing up vocabulary grooming, the lack of which has become a major issue in Zotero. Items have user-defined and automatic tags (e.g. from a library import). I chose to use and keep the automatic tags (they are meaningful tags after all), which means that now I have thousands of items with tens of thousands of tags, many of which are duplicates, often just different capitalizations. If I deleted automatic tags, I would lose the classification of many of my items. Sadly, Zotero offers zero help with this: even if I rename items by hand, I might get the unwanted tags back on the next import. Logseq absolutely needs a way to deal with this, as it will, e.g., inherit the badly tagged Zotero databases.
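
As a minimal sketch of what such grooming could look like for the capitalization duplicates, the Python below groups tags that differ only in case or surrounding whitespace and proposes the most frequent spelling as the canonical one. The function names and sample tags are hypothetical, and this is neither a Zotero nor a Logseq API.

```python
from collections import Counter, defaultdict


def find_case_duplicates(tags):
    """Group tags that differ only in capitalization or surrounding whitespace."""
    groups = defaultdict(Counter)
    for tag in tags:
        groups[tag.strip().casefold()][tag] += 1
    # Only groups with more than one spelling are duplicates.
    return {key: counts for key, counts in groups.items() if len(counts) > 1}


def canonical_mapping(tags):
    """Map every variant spelling to the most frequent spelling in its group."""
    mapping = {}
    for counts in find_case_duplicates(tags).values():
        preferred, _ = counts.most_common(1)[0]
        for variant in counts:
            mapping[variant] = preferred
    return mapping


# Hypothetical tags as they might arrive from several imports.
imported = [
    "Machine learning",
    "machine learning",
    "Machine Learning",
    "machine learning",
    "Neuroscience",
]
mapping = canonical_mapping(imported)
cleaned = [mapping.get(tag, tag) for tag in imported]
print(sorted(set(cleaned)))
```

If the mapping were stored and applied again on every import, renamed tags would not come back in their old spelling.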

boisjere:

Idea: Logseq MOCs / Outliner MOCs / OMOCs

What I’m seeing here is the emergence of Logseq MOCs - or Outline MOCs (OMOCs) a distinct form of MOC native to outliners, with enforced indenting to preserve hierarchical relationships that span assemblies of notes (for manual page hierarchies) or ideas (for SKOS-light taxonomy-type conceptual relations) - but you can add notes to make them more informative than just TOC-like header links.

Question: Are Conceptual OMOCs Queries?

I think for the bottom-up conceptual OMOC, what I think you’re suggesting is the creation of a taxonomic index to notes. So you have a page with a taxonomy (you might zoom into specific blocks of it). There you find links to any notes you’ve decided to situate in that taxonomy, by tagging them with terms that fall within that taxonomy’s scope.

Navigating the taxonomy (the page defining the broader, narrower and related terms), would show links to the notes that have been gathered around those terms. It would pull notes into a hierarchy without using explicit namespaces. It’s a separation of the tree-making and leaf-placement concerns.

The concern of a taxonomy is hierarchy-construction. You make the tree as its own term-based thing, independent from notes, but it defines tags. You use those tags throughout your database. Then when you use the taxonomy page to explore the notes aggregated by that hierarchy, any notes you’ve tagged using those terms appear in place.

To express this second point a different way, notes flag themselves for inclusion within a hierarchy by wearing a tag belonging to it. (This is very Tinderboxy)

I see this taxonomic/conceptual plane organization as distinct from polyhierarchy. Polyhierarchy can possibly work with branching hierarchies of “things” only, not nested conceptual ones. So I think polyhierarchy and SKOS are different asks.

They seem different from the user’s point of view, but under the hood they are all relations between tags. If we specify that TagA is a broader version of TagAA, we get a hierarchy (taxonomy). If we specify that TagA is a transitive narrower version of TagX and TagY, we get a polyhierarchy that automatically includes all of the children of TagA (such as TagAA and TagAAA). If we specify the intransitive narrowing, we get only TagA as a narrower version of TagX. If we specify things as related, we get a “see-also” entry. If we want an index, we can rely on the ordered collections. So from a software engineering point of view, with just 5 relations, we can create almost any type of hierarchy.
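
Here is a small Python sketch of the transitive versus intransitive distinction, using the TagA/TagX names from the paragraph above. The class `TagRelations` and its methods are invented for this example, and the rule that an intransitive link is only followed from the tag being searched is my own simplifying assumption, not something taken from the SKOS specification or from Logseq.

```python
from collections import defaultdict


class TagRelations:
    """Hypothetical store for narrower and related links between tags."""

    def __init__(self):
        self.narrower_transitive = defaultdict(set)  # broader -> narrower, inherited
        self.narrower_direct = defaultdict(set)      # broader -> narrower, not inherited
        self.related = defaultdict(set)              # symmetric "see also" links

    def add_narrower(self, broader, narrower, transitive=True):
        target = self.narrower_transitive if transitive else self.narrower_direct
        target[broader].add(narrower)

    def add_related(self, a, b):
        self.related[a].add(b)
        self.related[b].add(a)

    def expand(self, tag):
        """Return every tag that a search for `tag` should cover."""
        # Intransitive links contribute the direct child only.
        result = {tag} | self.narrower_direct.get(tag, set())
        # Transitive links are followed all the way down.
        visited, stack = set(), [tag]
        while stack:
            current = stack.pop()
            if current in visited:
                continue
            visited.add(current)
            result.add(current)
            stack.extend(self.narrower_transitive.get(current, ()))
        return result


rel = TagRelations()
rel.add_narrower("TagA", "TagAA")
rel.add_narrower("TagAA", "TagAAA")
rel.add_narrower("TagX", "TagA")                    # transitive: children come along
rel.add_narrower("TagY", "TagA")                    # second parent: polyhierarchy
rel.add_narrower("TagZ", "TagA", transitive=False)  # intransitive: TagA only
rel.add_related("TagA", "TagB")

print(sorted(rel.expand("TagX")))
print(sorted(rel.expand("TagZ")))
print(sorted(rel.related["TagA"]))
```

The broader direction is just the inverse of the narrower links, so it does not need its own storage in this sketch.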

boisjere:

Splitting the First 2 Feature Requests into 3 Requests

I agree with you that if you are going to do any of this, it can theoretically be done using SKOS.

If I use namespaces to create branching hierarchies of notes, with no semantics that other people would care about, that can just be my own idiosyncratic instantiation of what, under the hood, is a concept hierarchy.

However, in terms of incremental steps towards developing these features, it’s possible that Feature Request 2, bullet 1, is quick and easy.

Polyhierarchy using the same namespace logic as the team currently uses might be a bit more work, but may not strain the existing architecture too much.

To allow the construction of SKOS-based conceptual taxonomies, and automatically aggregating links to notes tagged for inclusion in taxonomical indexes… the relative difficulty of doing this depends on how hospitable (or how “close”) the Logseq architecture is to SKOS already.

Perhaps it’s a North Star, and the team should try to make new developments future-compatible with SKOS, but it may not be something they can deliver in one incremental step yet. It may be more of a Saga than an Epic.

Feature Request 3: Tree Search

Maybe this isn’t a feature request to the development team at all, but a user-community effort to figure out how to construct this kind of query. I don’t know. If it’s a feature request, it doesn’t seem as big to me as SKOS-based conceptual hierarchy support (enforced nested hierarchies).

I think this will come naturally. Once Logseq has a nice search hierarchy, people will want to filter the view as well.

I think it is absolutely worth it. I am fighting the massively lacking Zotero system of tags and saved searches every single day. I am at the point where it is often easier for me to find items on Google Scholar than in my Zotero database.
Logseq will only make this 10x worse. A researcher might add 5 Zotero items a day, 1000 per year; in Logseq, it is easy to add 50 blocks in the same amount of time. So we are talking about 10k blocks per year of research. Over the course of a PhD, we are talking about tens of thousands of blocks, over a career hundreds of thousands. Especially as there might be tools that automatically feed into the Zotero database, such as RSS-feeds and the like.

boisjere:

I see a future where someone in their 20s today collects extensive notes for 40 years. Then they want to start writing masterworks. I think if they’ve been playing with and refining namespace OMOCs, and conceptual OMOCs all along, those will provide important pathways into their ginormous note clouds.

This isn’t foldering. These structured pathways into their graphs don’t need to be exhaustive. Not every note need be encompassed by these maps, and that’s fine.

I think you can often say about classification hierarchies what is sometimes said about plans. “Plans are useless, planning is priceless.”

When I spend time and expend intellectual effort organizing things hierarchically - essentially thinking like an outliner, but using crisp criteria over very large collections of notes, it increases my mental clarity. If at some point the hierarchy becomes useless, I’m happy to let it go stale.

I completely agree. The main reason I am looking at Logseq is that Zotero just doesn’t cut it anymore to organize the information I’ve collected.

The benefit of these hierarchies is that they are so lightweight that ignoring them or adding new hierarchies takes very little effort. This is not like a folder structure, which is fixed and can only be re-sorted with massive effort. Here the hierarchies are a type of overlay on the existing structure, so despite their power, they don’t complicate anything for users who choose not to rely on hierarchies.

boisjere:

When I’m writing, I start with an outline, but as I get into the compositional flow of things, the outline gets very malleable and may be abandoned altogether as a new emerging logic asserts itself.

At the same time, a meta-outline that grows resilient over time, and seems to legitimately embody the larger structure of your way of thinking, would be gold.

tl;dr - I think we have 4 feature requests here, not 3. Actually… I think there are 5 - one is hidden.

1. Namespace-based hierarchies homepage (the “Logseq namespace MOC”) autolinked from sidebar

2. Polyhierarchy support (multiple “Logseq namespace MOCs” - where children can have two parents)

3. Conceptual hierarchy (a complex query page that aggregates notes to the right nesting level in a conceptual tree on a page that has the property of being a conceptual hierarchy… ?)

4. Tree-search - a query that picks a page in a multi-page hierarchy as parent, and returns it and all its child pages.
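
The tree search in the last item could be sketched roughly as follows, assuming page names use `/` as the namespace separator. The function `tree_search` and the sample page names are hypothetical. This is plain Python over a list of names, not a Logseq query.

```python
def tree_search(pages, parent, separator="/"):
    """Return the parent page and every page below it in the namespace tree."""
    prefix = parent + separator
    return sorted(p for p in pages if p == parent or p.startswith(prefix))


# Hypothetical page names.
pages = [
    "History",
    "History/20thCentury",
    "History/20thCentury/WW2",
    "Historyish",
    "Military/Conflicts/WW2",
]
print(tree_search(pages, "History"))
print(tree_search(pages, "History/20thCentury"))
```

Matching on `parent + separator` rather than on the bare name keeps unrelated pages that merely share a prefix (like "Historyish") out of the result.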

So 1. is GUI representation, 2. is internal relationship between tags, 3. is different hierarchy views (?) and 4. is search on 1 and 4?

For the feature requests, we need to find the right balance between abstract terms and concrete examples.

While virtually everyone has used faceted search, e.g. on Amazon, only a few people are familiar with the term, and abstract nonsense gets pushback.
We also need a good, concrete example. Maybe something from literature. A book about World War 2 could be classified by

/Books/LastName/FirstName/Title
/Literature/Non-fiction/…
/History/20thCentury/WW2/…
/Military/Conflicts/WW2/…
/Places/Europe/…
/Technology/Weapons/…
Any of the Library classification schemes: Dewey, UDC, LCC, Bliss, …
- These are available from library catalogs and a plugin can tag book items automatically
- wouldn’t it be great to be able to browse our own libraries by these schemes?
…

All of which are completely reasonable and useful for finding the item.

We also need to explain why tagging is not sufficient: in this case our item would need a large number of tags, forgetting to tag the item would make it irretrievable, and tags by themselves do not give the hierarchical search we need.
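
A concrete demo for the feature request could look something like this Python sketch, where one item sits in several hierarchies at once and a faceted search narrows by any combination of them. The item titles, the second book, and the function names are made up for illustration, and the paths are cut off where the list above has "…".

```python
def is_under(path, prefix):
    """True if `path` is the prefix itself or lies somewhere below it."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def faceted_search(items, *prefixes):
    """Return titles of items that have at least one path under every given prefix."""
    return sorted(
        title
        for title, paths in items.items()
        if all(any(is_under(p, prefix) for p in paths) for prefix in prefixes)
    )


# Hypothetical library: each item is placed in several hierarchies at once.
items = {
    "A book about World War 2": [
        "/Literature/Non-fiction",
        "/History/20thCentury/WW2",
        "/Military/Conflicts/WW2",
        "/Places/Europe",
    ],
    "A book about medieval Europe": [
        "/Literature/Non-fiction",
        "/History/Medieval",
        "/Places/Europe",
    ],
}

print(faceted_search(items, "/History"))
print(faceted_search(items, "/History", "/Military/Conflicts"))
```

Browsing by `/History` alone finds both books, and adding a second facet narrows the result, which is what flat tags cannot do without listing every ancestor on every item.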