I am not familiar with Logseq’s internals, but I think a plugin would be the way to start initially and let different ideas compete. Later on it might become integrated into the core.
I personally very much dislike these ancient classification systems, because I wasted a lot of my life trying to organize things in such a way. I see people on the Zotero and Calibre forums ask all of the time how they can maintain their library as a folder structure sorted by author/title etc, so these ideas are alive and still doing a lot of harm.
My worry is that, because everyone is familiar with library classifications, people might think that this is the state of the art and oppose far superior approaches.
I’ve received massive pushback at work trying to implement something that would have been trivial in a database, but management wanted and got a hierarchical folder structure that was virtually unsearchable and a complete failure.
I suspect that any proposal will get some pushback as well, saying that we can solve the issue with tags or with the rudimentary hierarchical page structure that already exists. This is why we have to explain very clearly how these old systems were made to deal with physical realities of index cards (à la Luhmann’s Zettelkasten) and items stored on shelves.
That’s why I like the SKOS concepts: they allow indices and (poly)hierarchies, and they can also be completely ignored, in which case we fall back on what we have now.
Sounds good.
I dislike specifying the full hierarchy for every item. That would also require memorizing the taxonomy or relying on autocomplete.
My thought was to have a separate taxonomy file that specifies the relationships between tags. In the case of taxonomies, these are very simple nested lists. In the case of general SKOS concepts it is more complex and would require writing down the relations, but this could also be mapped to Markdown.
The process would look like this:
specify relationships between tags through user-defined taxonomies.
For example, we have a document that specifies the taxonomy of animals. As this is a simple list that narrows transitively, it can be represented by a nested Markdown list.
Here is a small part of the classification of animals:
adding a tag automatically places the item in the right spot of the hierarchy if the tag is unique in the taxonomy
A block on the common house cat will be tagged animals:F.catus, which will make it appear in the right place in the hierarchy; it will be returned in a search for Felidae, etc.
if the tag appears in multiple places in the taxonomy, a sufficiently long path needs to be specified to make it unique
I don’t think this happens much in good taxonomies, but it is not a problem: if there were two species with the same name in different phyla, one would just have to specify enough of the path to make it unique.
if the tag appears in multiple taxonomies, we need to specify the namespace (probably this should be done for any taxonomy for future extensibility)
For example, the tag Felidae could point to the animal taxonomy, the novel, or the film with the same name. If a user has imported both a taxonomy of animals and one of movies, animals:Felidae and movies:Felidae make clear which one is meant (and an item might have both tags and appear in both hierarchies).
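To make the process above a bit more concrete, here is a minimal Python sketch of the lookup. Everything in it is an assumption for illustration: the function names `parse_taxonomy` and `resolve`, the `namespace:partial/path` tag syntax, and the contents of the two taxonomy files (a short excerpt of the animal classification and a made-up movie list). It is not how Logseq works today.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]

# Illustrative taxonomy files (assumed content, written as nested Markdown lists).
ANIMALS_MD = """\
- Animalia
  - Chordata
    - Mammalia
      - Carnivora
        - Felidae
          - Felis
            - F.catus
"""

MOVIES_MD = """\
- Films
  - Animation
    - Felidae
"""


def parse_taxonomy(markdown: str, indent: int = 2) -> List[Path]:
    """Turn a nested Markdown bullet list into root-to-node paths."""
    paths: List[Path] = []
    stack: List[str] = []
    for line in markdown.splitlines():
        if not line.strip():
            continue
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // indent
        name = stripped.lstrip("-* ").strip()
        del stack[depth:]          # climb back up to the parent level
        stack.append(name)
        paths.append(tuple(stack))
    return paths


def resolve(tag: str, taxonomies: Dict[str, List[Path]]) -> List[Tuple[str, Path]]:
    """Return every (namespace, full path) whose tail matches the tag.

    The tag is an optional 'namespace:' prefix followed by a partial path
    whose segments are separated by '/'. One match means the tag is unique.
    """
    namespace, _, partial = tag.rpartition(":")
    wanted = tuple(partial.split("/"))
    matches: List[Tuple[str, Path]] = []
    for ns, paths in taxonomies.items():
        if namespace and ns != namespace:
            continue
        for path in paths:
            if path[-len(wanted):] == wanted:
                matches.append((ns, path))
    return matches


TAXONOMIES = {
    "animals": parse_taxonomy(ANIMALS_MD),
    "movies": parse_taxonomy(MOVIES_MD),
}

if __name__ == "__main__":
    print(resolve("animals:F.catus", TAXONOMIES))  # one match: unambiguous
    print(resolve("Felidae", TAXONOMIES))          # two matches: needs a namespace
```

Because `resolve` returns the full path, a block tagged animals:F.catus can be found by a search for any ancestor such as Felidae, and an ambiguous tag shows up as more than one match, which is the point where the user would be asked for a longer path or a namespace.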
I am torn about this. On the one hand, conceptual hierarchies, indices, related items, etc. are different. On the other hand, they can all be represented by the same relations, as SKOS does.
If Logseq had a way to specify the following relationships for tags:
taxonomies (as in the example for cats above, not strictly needed, but it simplifies things)
transitive and intransitive broader and narrower relationships
relatedness
(do we need anything else?)
then all of the concepts, from fully imported taxonomies to user-defined taxonomies and indices, could be represented using the same mechanism.
For this reason, I think from a software design point of view, they should all be integrated.
This is an excellent idea!
If the taxonomies were specified as simple .md files, it would be very easy to edit and share them.
Importing existing taxonomies would indeed be a killer feature for Logseq. What I especially like about this is that it wouldn’t force the user to put any item into the taxonomy, as a folder structure does.
Let’s say I am learning about medicine. If I start with a folder structure, it will be mostly empty and it will be difficult to locate items. Also, as a beginner, I will probably not completely understand how things are classified in the field. On the other hand, if I can import the taxonomy, I can “tag” my items with one or multiple locations, or leave them as-is.
An item can also be part of multiple taxonomies, so if I work on the healing force of crystals, I can place an article on amethyst into both the medical and the mineralogical taxonomy.
Thanks for bringing up vocabulary grooming, the lack of which has become a major issue in Zotero. Items have user-defined and automatic tags (e.g. from a library import). I chose to use and keep the automatic tags (they are meaningful tags after all), which means that I now have thousands of items with tens of thousands of tags, many of which are duplicates, often just different capitalizations. If I deleted the automatic tags, I would lose the classification of many of my items. Sadly, Zotero offers zero help with this; even if I rename items by hand, I might get unwanted tags back on the next import. Logseq absolutely needs a way to deal with this, as it will, for example, inherit the badly tagged Zotero databases.
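As a rough idea of what the simplest form of grooming could look like, here is a small Python sketch that groups tags differing only in capitalization or spacing. The function name `find_case_duplicates` and the sample tags are made up for illustration; a real tool would also need to handle synonyms, plurals, and tags that come back on re-import.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_case_duplicates(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that are identical after case folding and whitespace cleanup."""
    groups = defaultdict(set)
    for tag in tags:
        # Collapse runs of whitespace, then fold case for comparison.
        key = " ".join(tag.split()).casefold()
        groups[key].add(tag)
    # Keep only the keys that have more than one spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    sample = ["Machine Learning", "machine learning", "Statistics", "machine  learning"]
    for key, variants in find_case_duplicates(sample).items():
        print(key, "->", variants)
```

Each group could then be merged into one canonical tag, with the variants kept as aliases so that a later import maps onto the groomed tag instead of recreating the duplicates.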
They seem different from the user’s point of view, but under the hood they are all relations between tags. If we specify that TagA is a broader version of TagAA, we get a hierarchy (taxonomy). If we specify that TagA is a transitive narrower version of TagX and TagY, we get a polyhierarchy that automatically includes all of the children of TagA (such as TagAA and TagAAA). If we specify the intransitive narrowing, we get only TagA as a narrower version of TagX. If we specify things as related, we get a “see-also” entry. If we want an index, we can rely on the ordered collections. So from a software engineering point of view, with just 5 relations, we can create almost any type of hierarchy.
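Here is a minimal Python sketch of how the transitive and intransitive narrowing and the "see-also" relation could behave during search. The class `TagRelations` and its methods are hypothetical names of mine, not SKOS or Logseq APIs, and ordered collections are left out. In the example, TagX narrows to TagA transitively and TagY narrows to TagA intransitively, so the two cases can be compared side by side.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class TagRelations:
    """Stores broader/narrower and related links between tags (illustrative only)."""

    def __init__(self) -> None:
        # broader tag -> list of (narrower tag, is the link transitive?)
        self._narrower: Dict[str, List[Tuple[str, bool]]] = defaultdict(list)
        self._related: Dict[str, Set[str]] = defaultdict(set)

    def add_narrower(self, broader: str, narrower: str, transitive: bool = True) -> None:
        self._narrower[broader].append((narrower, transitive))

    def add_related(self, a: str, b: str) -> None:
        # Relatedness is symmetric, so store it in both directions.
        self._related[a].add(b)
        self._related[b].add(a)

    def expand(self, tag: str) -> Set[str]:
        """All narrower tags that a search for `tag` should also match."""
        found: Set[str] = set()
        expanded: Set[str] = set()
        stack = [tag]
        while stack:
            current = stack.pop()
            if current in expanded:
                continue  # also protects against cycles
            expanded.add(current)
            for child, transitive in self._narrower.get(current, []):
                found.add(child)
                if transitive:
                    stack.append(child)  # only transitive links pull in grandchildren
        return found

    def see_also(self, tag: str) -> Set[str]:
        return set(self._related.get(tag, set()))


if __name__ == "__main__":
    rel = TagRelations()
    rel.add_narrower("TagA", "TagAA")
    rel.add_narrower("TagAA", "TagAAA")
    rel.add_narrower("TagX", "TagA", transitive=True)
    rel.add_narrower("TagY", "TagA", transitive=False)
    rel.add_related("TagA", "TagB")
    print(sorted(rel.expand("TagX")))  # TagA plus all of its descendants
    print(sorted(rel.expand("TagY")))  # TagA only
    print(sorted(rel.see_also("TagA")))
```

The only difference between the two kinds of narrowing is whether the search keeps walking past the first step, which is why one mechanism can cover taxonomies, polyhierarchies, and flat "belongs under" links.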
I think this will come naturally. Once Logseq has a nice search hierarchy, people will want to filter the view as well.
I think it is absolutely worth it. I am fighting the massively lacking Zotero system of tags and saved searches every single day. I am at the point where it is often easier for me to find items on Google Scholar than in my Zotero database.
Logseq will only make this 10x worse. A researcher might add 5 Zotero items a day, 1000 per year; in Logseq, it is easy to add 50 blocks in the same amount of time. So we are talking about 10k blocks per year of research. Over the course of a PhD, we are talking about tens of thousands of blocks, over a career hundreds of thousands. Especially as there might be tools that automatically feed into the Zotero database, such as RSS-feeds and the like.
I completely agree. The main reason I am looking at Logseq is that Zotero just doesn’t cut it anymore to organize the information I’ve collected.
The benefit of these hierarchies is that they are so lightweight that ignoring them or adding new hierarchies takes very little effort. This is not like a folder structure, which is fixed and can only be re-sorted with a massive effort. Here the hierarchies are a type of overlay on the existing structure, so despite their power, they don’t complicate anything for users who choose not to rely on hierarchies.
So 1. is GUI representation, 2. is internal relationship between tags, 3. is different hierarchy views (?) and 4. is search on 1 and 4?
For the feature requests, we need to find the right balance between abstract terms and concrete examples.
While virtually everyone has used faceted search, e.g. on Amazon, only a few people are familiar with the term, and abstract nonsense gets pushback.
We also need a good, concrete example. Maybe something from literature. A book about World War 2 could be classified by
/Books/LastName/FirstName/Title
/Literature/Non-fiction/…
/History/20thCentury/WW2/…
/Military/Conflicts/WW2/…
/Places/Europe/…
/Technology/Weapons/…
Any of the Library classification schemes: Dewey, UDC, LCC, Bliss, …
These are available from library catalogs and a plugin can tag book items automatically
wouldn’t it be great to be able to browse our own libraries by these schemes?
…
All of which are completely reasonable and useful for finding the item.
We also need to explain why tagging is not sufficient: in this case our item would need a large number of tags, forgetting to tag the item would make it irretrievable, and tags by themselves do not give the hierarchical search we need.
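For the feature request, a tiny demo of the World War 2 example might help. The Python sketch below assumes that an item simply carries several hierarchy paths (truncated here where the list above has "…") and shows browsing by any prefix plus the per-branch counts that a faceted search sidebar would display. The item names and the functions `browse` and `child_counts` are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Each item can sit in several hierarchies at once (illustrative data).
LIBRARY: Dict[str, List[str]] = {
    "ww2-book": [
        "/Literature/Non-fiction",
        "/History/20thCentury/WW2",
        "/Military/Conflicts/WW2",
        "/Places/Europe",
        "/Technology/Weapons",
    ],
    "other-book": [
        "/Literature/Non-fiction",
        "/Places/Europe",
    ],
}


def under(path: str, prefix: str) -> bool:
    """True if `path` is `prefix` itself or lies below it."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def browse(library: Dict[str, List[str]], prefix: str) -> List[str]:
    """All items that have at least one path under `prefix`."""
    return sorted(
        item for item, paths in library.items()
        if any(under(p, prefix) for p in paths)
    )


def child_counts(library: Dict[str, List[str]], prefix: str) -> Counter:
    """Number of items below each direct child of `prefix` (the facet counts)."""
    prefix = prefix.rstrip("/")
    counts: Counter = Counter()
    for paths in library.values():
        children = set()
        for p in paths:
            if p.startswith(prefix + "/"):
                children.add(p[len(prefix) + 1:].split("/")[0])
        counts.update(children)  # count each item once per child
    return counts


if __name__ == "__main__":
    print(browse(LIBRARY, "/History"))
    print(browse(LIBRARY, "/Places/Europe"))
    print(child_counts(LIBRARY, ""))
```

The same book is reachable from History, Military, Places, and Technology without choosing one "true" location, which is exactly what a single folder structure cannot offer and what flat tags cannot show as a browsable tree.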