Help with implementing a tag management system

Hal9000 · February 25, 2024, 11:08pm

Hi everybody,

I have been experimenting with Logseq for the last two months, and I am trying to come up with a strategy to organize information. More specifically, I am trying to implement a tag management system without using namespaces, which expert users recommend against.

My approach is to develop a personal “PKM Schema” to implement the equivalent of a Network Model in database systems. I am trying to accomplish that by using a nesting structure to organize my tags. See example in the following screenshot:

Even though my PKM schema is not fully mature yet, my strategy is to have Level 0 (e.g., [[Career]]) and Level 1 (e.g., [[my-current-employer]]) as broad thematic categories that each correspond to a specific but broad aspect of my life. I call them “Life Blocks”, and my rule for defining them is that they are relatively continuous over time rather than time-bounded the way a project is. Then, levels 2 through “N” can be a) more specific knowledge domains that may or may not be similar across Level 0 and Level 1 tags or b) projects and time-bounded activities of narrower scope (e.g., prepare presentation for “x” audience).

A system like that would rely heavily on a robust property management system for each tag.
With regard to that, I am also building a strategy for the properties of items organized using my PKM schema. As an example, see below the information captured for an item called [[Apache Ozone]], an [[Open Source Engine]] that falls under [[Cloudera]], which falls under [[Career]].

My question for anybody who has built anything similar to the above is: how can I create a query that returns a list of all Level-2 items (e.g., all [[Open Source Engines]]) that fall under a given Level-1 and Level-0 tag (e.g., [[Cloudera]], which falls under [[Career]]), so that I can index the database I am building with my PKM schema?

Thanks in advance!
Best,
Andreas

mentaloid · February 26, 2024, 8:03am

Most probably nobody has built something very similar.
On the other hand, most serious PKMs are quite similar in what they try to accomplish.
- They try to manage knowledge, and tags are just one aspect of it.

Multiple ways. As long as there are references that connect the various items (whether at note-level or at tag-level), this won’t be difficult. The actual problem is different.

Nesting structures cannot model networks. Such an approach needs serious reconsideration.

The recommendation is not against namespaces altogether, but against:
- using namespaces for modeling hierarchies
- creating hierarchies of things that are not strictly hierarchical
  - Your case fits this point, as it tries to impose an artificial hierarchy on non-hierarchical things.
Some things may naturally fall under others, e.g.:
- Career Growth under Career
- Cloudera Tech Stack under Cloudera
- Interview-Q&A under Interviewing
- etc.
But most things fall under either:
- many others
- none
Forcing such things under a single parent causes problems down the road.
- For example, Cloudera is not a career, but a company.
  - Even if it happens to relate to someone’s career, this is a temporary relation.

Since you plan to make heavy use of properties, why bother with an inferior nesting approach? Rather, simply add one more property for each nesting-like type of relation.
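
To illustrate the property-per-relation idea outside Logseq, here is a minimal Python sketch. The property names `type` and `company` and the page contents are assumptions made up for the example; they are not taken from the graph discussed here, and this is not a Logseq API. In Logseq the same filter would be written as a query over page properties.

```python
# Each page carries its relations as properties instead of a position in a tree.
# Property names ("type", "company") and values are hypothetical.
pages = {
    "Apache Ozone": {
        "type": ["Open Source Engines"],
        "company": ["Cloudera"],
    },
    "Apache Iceberg": {
        "type": ["Open Source Engines", "Data and Analytics"],
        "company": ["Cloudera"],
    },
    "Career Growth": {"type": ["Topic"], "company": []},
}


def find_pages(pages, **wanted):
    """Return the sorted names of pages whose properties contain every wanted value."""
    return sorted(
        name
        for name, props in pages.items()
        if all(value in props.get(key, []) for key, value in wanted.items())
    )


# "All Open Source Engines related to Cloudera", with no nesting involved.
engines = find_pages(pages, type="Open Source Engines", company="Cloudera")
print(engines)
```

Adding another relation (say, a vendor) would only mean adding another property, not moving the page to a different parent.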

Hal9000 · February 26, 2024, 12:01pm

Hi @mentaloid,

Thank you very much for your detailed response! Also, let me add how useful your posts in this forum have been for me; I’ve learned a lot from you!

To address some of the points that you are making:

Nesting structures cannot model networks. Such approach needs serious reconsideration

I agree that you cannot design a network model using nesting structures, but that’s the simplest way I could think of to design the ‘schema’ for my PKMS in Logseq so that it contains adequate context about the nature and relationships of my tags. The alternative would be to build an Entity-Relationship diagram using Whiteboards, but that would make it difficult to perform CRUD operations on the schema (i.e., Create Tag, Read Tag, Update Tag, Delete Tag), given the number of tags I am planning to add.

The same thinking applies to your other point:

Since you plan to make heavy use of properties, why do you bother with an inferior nesting approach? Rather simply add one more property for each nesting-like type of relation.

I am not sure how, using only properties for tags, I could go through all of the properties and tags and easily identify how to add, change, or delete a tag or property, or decide which tag to use for a particular knowledge item I want to capture. The nesting structure gives me a way to organize and traverse the list of tags and easily perform CRUD operations on properties and tags. In summary, I am using the nesting structure to govern my schema, absent another mechanism (but I am open to suggestions if you have any!). For example, every time I want to add a new piece of information, I want to first define the tag, add it to the schema, and also identify the block-level properties (a block corresponds to an item or an “entry” in my PKM db) before capturing information.

Another approach I thought of was to define level-0 through level-“N” properties and enrich each block with information like the following:

Apache Iceberg (item title)
level-0-properties: [[Career]]
level-1-properties: [[Cloudera]]
level-2-properties: [[Open Source Engines]], [[Data and Analytics]]

* blah-blah-Blah
* blah-blah-blah

But then, I would need to create some really complex queries to visualize the relationships (the output of such queries would also look like nesting structures!)
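
For what it is worth, the grouping step itself is fairly mechanical. The rough Python sketch below (not a Logseq query) builds a nested view from level properties like the ones above. The shortened keys `level-0`, `level-1`, `level-2`, the `_items` marker, and the second block are assumptions added for illustration.

```python
# Hypothetical blocks mirroring the level-N properties in the example above.
blocks = [
    {"title": "Apache Iceberg", "level-0": ["Career"], "level-1": ["Cloudera"],
     "level-2": ["Open Source Engines", "Data and Analytics"]},
    {"title": "Apache Ozone", "level-0": ["Career"], "level-1": ["Cloudera"],
     "level-2": ["Open Source Engines"]},
]


def nested_view(blocks, levels=("level-0", "level-1", "level-2")):
    """Place each block title under every combination of its level values.

    A block that lacks a value for any level is left out of the view.
    """
    tree = {}
    for block in blocks:
        paths = [[]]
        for level in levels:
            paths = [p + [v] for p in paths for v in block.get(level, [])]
        for path in paths:
            node = tree
            for value in path:
                node = node.setdefault(value, {})
            node.setdefault("_items", []).append(block["title"])
    return tree


def render(node, depth=0):
    """Turn the tree into indented outline lines."""
    lines = []
    for key, child in node.items():
        if key == "_items":
            lines += ["  " * depth + "- " + title for title in child]
        else:
            lines.append("  " * depth + "- [[" + key + "]]")
            lines += render(child, depth + 1)
    return lines


view = nested_view(blocks)
print("\n".join(render(view)))
```

The nesting here is only an output of the properties: a block with two level-2 values simply shows up under both.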

Forcing such things under a single parent causes problems down the way.

For example, Cloudera is not a career, but a company.

You are correct. Currently, in the static ontology I am defining today, Cloudera has a very narrow meaning: it’s my employer. In the future it could be a vendor I am buying services from, or even a competitor. That would lead to a host of problems. A better descriptor would be Cloudera-Career. But if I were to use that term, it would be more difficult to find information. For example, a search for all the Open Source Engines associated with Cloudera, performed after I am no longer with that company, would require a query that includes Cloudera-Career rather than Cloudera to get all the relevant results. That query would then miss any Open Source Engines associated with Cloudera that I captured after leaving the company (since I would no longer be putting things under Cloudera-Career).

I don’t know if I am making a lot of sense here (and to be honest, as I am writing these sentences I think I have designed my schema for Open Source Engines in the wrong way), but I am really struggling to come up with a structured and rigorous approach to manage meta-information for the information I want to capture.

Thanks again for the help.

mentaloid · February 26, 2024, 1:28pm

I would suggest that you give yourself a chance to escape from the paradigms that you are familiar with.
- Try to “forget” about traditional databases, entities, diagrams, levels etc., and embrace the tool the way it is meant to be used.
- Otherwise, in your effort to go for simplicity, you end up imposing complexity.
Relationships are simple arrows that connect two concepts (a reference), often through a third concept (a property).
- That produces triplets, which fill the graph database and are the targets of queries. This is:
  - simple
    - Certainly simpler than the traditional approaches.
    - But being “new” and different, it may initially feel more complicated, while it is not.
      - This doesn’t mean that every tool is made as easy as it could be (Datalog for example).
  - powerful
    - You can quickly model complicated realities (realities are always complicated) with relatively minimal effort.
      - To express such models in traditional databases, it takes nothing less than an expert.
        - They often have to think in advance how the information will be used.
        - In real-life systems, this is even impossible, as the needs tend to change.
      - While in Logseq:
        - it is possible to start taking notes immediately, letting the exact organization emerge
        - as long as the basic principles are respected
        - it still won’t happen automagically (see “How to leverage Logseq’s linked structure?”)
        - you can capture the data in a natural way, then use it in multiple ways
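
The triplet idea can be sketched in a few lines of Python. This is only an illustration of the concept under assumed names (the triples, the `employer`/`type`/`company` properties, and the `match` helper are all made up here), not how Logseq stores or queries data internally.

```python
# Each fact is a (subject, property, object) triple; all names are illustrative.
triples = [
    ("Hal9000", "employer", "Cloudera"),
    ("Apache Ozone", "type", "Open Source Engines"),
    ("Apache Ozone", "company", "Cloudera"),
    ("Cloudera", "type", "Company"),
]


def match(triples, subject=None, prop=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (obj is None or t[2] == obj)
    ]


# Everything that points at Cloudera, whatever the relation happens to be.
linked_to_cloudera = match(triples, obj="Cloudera")
# Everything of a given type.
engines = [t[0] for t in match(triples, prop="type", obj="Open Source Engines")]
print(linked_to_cloudera)
print(engines)
```

The same small set of facts answers both questions, without deciding in advance which one would be asked.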
CUD (create, update, delete) operations are:
- a valid concern
  - currently supported only through the API
- a challenge, no matter the approach
- yet not the deciding factor for the structure to adopt
  - the coming database version will make them easier
In contrast, read operations are already good enough, thanks to queries, which:
- are not user-friendly
- but can achieve everything read-wise
  - particularly when it comes to tag-based relations, they are not even difficult
Concerning nesting
- Even when adopting properties, the outliner can and should be used for forming nesting views when needed.
  - But they should be views on top of the actual data, not the actual data themselves.
  - Consider “The contextual sidebar: query current page and organize pages in indexes”
- For things that are genuinely hierarchical in nature, consider “Generate explicit hierarchy out of properties”
The important point is to no longer just put things under other things.
- Cloudera needs its own page
  - preferably without a namespace
    - “under” only the overall graph
  - in the rare case of a name-conflict, the only fitting namespace would be Company
- Cloudera-Career is not much better than Career/Cloudera
  - In your particular case, the only thing that makes sense is for page Hal9000 to have Cloudera as the value of one of its properties (employer or something). This is the intuitive association that is:
    - easy to maintain
      - changing it won’t break things
    - easy to query
      - properties are straightforward
    - modular enough
      - can peacefully coexist with other employers
      - can be the value of other properties at the same time (vendor etc.)
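
A small Python sketch of why this association is easy to maintain. Everything in it is hypothetical: the `company`, `former-employer`, and `vendor` property names, the “NewCo” employer, and the `referencing` helper are invented for the example.

```python
# Hypothetical pages; only the employer property lives on the Hal9000 page.
pages = {
    "Hal9000": {"employer": ["Cloudera"]},
    "Apache Ozone": {"company": ["Cloudera"]},
}


def referencing(pages, target):
    """Map each page that mentions target to the properties it appears under."""
    result = {}
    for name, props in pages.items():
        keys = sorted(k for k, values in props.items() if target in values)
        if keys:
            result[name] = keys
    return result


before = referencing(pages, "Cloudera")

# A job change touches one page only; "NewCo" is a made-up name.
pages["Hal9000"] = {
    "employer": ["NewCo"],
    "former-employer": ["Cloudera"],
    "vendor": ["Cloudera"],
}
after = referencing(pages, "Cloudera")
print(before)
print(after)
```

The Apache Ozone entry is untouched by the change, and Cloudera can sit under several properties of the same page at once.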

Hal9000 · February 26, 2024, 3:13pm

That’s an excellent response with pointers to additional resources! Now it’s time for me to do some more reading about how to use them for my structure / workflow, before returning with more questions!

All the best,
Andreas