Sorting notes by the date of their source material (aka sorting tagged blocks using custom date tags)

kleer · May 4, 2023, 1:26am

This one is for the historians out there.

Logseq offers a lot of support, right out of the box, for querying and sorting journal entries by date. But for many users it’s their notes they will want to query and sort by date, not their journal entries. This is especially true for historians, who would typically use Logseq to keep track of the many primary documents they’ve read. They might find it very helpful, as I do, to later be able to sort their notes by keyword and the date of the source material on which those notes draw. In this way you get a kind of running commentary by contemporaries on the events of the day.

I had long been tagging my extensive notes on eighteenth-century money, banking and public finance, with the dates of the primary source materials (or, in the case of secondary sources, of the periods/events) to which they referred. I had hoped that the move to Logseq (from Roam Research) would provide me with a proper date type into which these tags could be converted. But I quickly discovered that Logseq’s built-in use of date fields, and the query machinery built around them, is confined to its journalling system. So I settled instead on custom hierarchical date tags of the general form #date/yyyy/mm/dd (with month and day optional, and with dashes instead of digits to indicate approximate dates, e.g. 178- to refer to anything in the 1780s).
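A small sketch of one reason this format is convenient, using made-up values rather than tags from my graph: with zero-padded months and days, plain string comparison puts the date portions in chronological order, and a decade written as 178- sorts ahead of the individual years in that decade. This is meant for a plain Clojure REPL, not for Logseq itself, and sample-dates is a hypothetical name.

```clojure
;; Hypothetical date strings, i.e. tag names with the leading "date/" removed.
(def sample-dates ["1715" "1780/02" "178-" "1706/03/12" "1706"])

;; Plain string comparison gives chronological order, provided months
;; and days are zero-padded to two digits.
(sort sample-dates)
;; => ("1706" "1706/03/12" "1715" "178-" "1780/02")
```

The approximate tag lands just before the exact dates it covers because the dash character compares lower than any digit.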

Since in Logseq child blocks inherit tags from their parents, I could stick with my existing system of assigning only a relatively few date tags, each at some relatively high level in my notes (e.g. a chapter heading) and covering all the children beneath them. As I worked my way through a given source, I might have to update the associated date tag as the source material moved into later time periods. Even so, there aren’t many date tags in my notes; they’re not nearly as numerous as my keyword tags. Sometimes I need only a single date tag for an entire document, in which case I just add it to the tags:: property for that page.

So far so good, and relatively straightforward (though very labour-intensive if you’re adding date tags after the fact). And to pull a list of blocks for any one keyword tag, or some combination of them, Logseq’s basic queries will work just fine, which is nice since the GUI tool makes that sort of thing easy.

But a real problem arises if you want to sort the blocks for a given keyword by the date tags associated with them. In that case you absolutely must use an advanced query.

The first step was to make sure that the names of the associated date tags were included in my query result set. This wasn’t exactly easy. The relevant date tags can be found at any of three places: a) on the block itself (in which case a call to block/refs will pull it in); b) on some parent of that block, perhaps several levels up (these will be found in block/path-refs); or c) in the page tags (to be found at {:block/page [:block/tags]}). I couldn’t find a way to build a series of :where clauses that would pull out those three different pieces of information and link each of them to some simple variable declared in the :find clause. So I solved the problem instead by using Datalog’s pull API, grabbing the page names of the full set of tags associated with each of the three possible locations for date tags. I also pulled in the :block/content and (the page’s) :block/original-name fields, to allow Logseq to work its default table-view magic with the eventual result set. It took some work to figure out how to get the desired output, since the Datalog/Datomic documentation isn’t exactly generous in its coverage of hierarchical/namespaced keywords, which of course are rampant in Logseq’s database schema. In any case, the raw query result set is a pretty ugly mishmash of vectors and maps, some of them nested several layers deep. But this approach gets you all the information you need to solve the date sorting problem and display the results.
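For readers who haven’t seen pull output before, here is roughly what a single row might look like. This is an invented example, not output from my graph: the page name, block text and tags are all hypothetical, and I’m assuming the usual lower-cased :block/name values. It can be pasted into a plain Clojure REPL to try get-in on the nested parts.

```clojure
;; One hypothetical row, shaped like the output of a pull expression that
;; asks for content, refs, path-refs and the page's name and tags.
;; Every name and value here is invented for illustration.
(def sample-row
  {:block/content   "Complaint about the scarcity of coin #specie"
   :block/refs      [{:block/name "specie"}]
   :block/path-refs [{:block/name "specie"}
                     {:block/name "date/1715"}
                     {:block/name "date"}]
   :block/page      {:block/original-name "Some Source Page"
                     :block/tags [{:block/name "date/171-"}]}})

;; Nested values are reached with get-in and a path of keys.
(get-in sample-row [:block/page :block/original-name])
;; => "Some Source Page"

;; Each of the three tag locations is a vector of small maps.
(map :block/name (:block/refs sample-row))
;; => ("specie")
(map :block/name (get-in sample-row [:block/page :block/tags]))
;; => ("date/171-")
```

The three places a date tag can live correspond to the three nested vectors: :block/refs, :block/path-refs, and :block/tags inside :block/page.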

The final step, sorting the results by date tag, was by far the hardest. It took me ten or so days to sew that part up, though in fairness this was largely because I am a complete noob when it comes to Clojure and to the result-transform component of Logseq’s advanced queries. It didn’t help that Logseq’s documentation doesn’t provide an actual specification of the result-transform clause (or, for that matter, of the view clause), just a set of examples. And the real work for me came from the fact that Clojure is a completely different kind of programming language from any I’ve encountered before. It took quite a while to wrap my head around its function-centric design and its insistence upon destructuring any data that gets passed to functions. Thank god for the REPL that comes with Clojure’s clj terminal tool, which facilitates trial-and-error learning. But I worked out a solution in the end. Someone who knows Clojure better could surely write up my formulation more elegantly. Suggestions welcome.

In list view, the query generates nothing more than a long, ugly set of text fields, arranged as vectors and nested maps. I suppose I could try to write up a corresponding view that would present this information in a much more orderly fashion, ideally including the breadcrumbs that come with Logseq’s default list view. But you can get a reasonably serviceable view without all that work just by setting the query results to table view and then left-clicking on any block that looks interesting; this will open the standard Logseq query result view for that block in the right panel.

Notice that the query results are sorted by the most appropriate date tag: the one closest to the block tagged with a given keyword (in order of priority, on the block itself, on a parent of the block, or on the page as a whole).

I’m sharing the whole query below, in case anyone else has been looking for a solution to this problem – to spare you the work I had to go through to figure it out myself.

#+BEGIN_QUERY
{
    :title ["get blocks tagged with x, sorted by their associated date tags"]
    :query [
        :find (pull ?b [:block/content
            {:block/refs [:block/name]}
            {:block/path-refs [:block/name]}
            {:block/page [:block/original-name {:block/tags [:block/name]}]}])
        :in $ ?tag   
        :where
            [?t :block/name ?tag]
            [?b :block/refs ?t]
        ]
    :inputs ["chosen keyword"]
    :result-transform
        (fn [result]
            (defn get-date [refs]
                (for [elem refs
                    :let [tag (get-in elem [:block/name])]
                    :when (clojure.string/includes? tag "date/")]
                    (subs tag 5)))
            (sort-by (fn [row]
                (let [{:keys [block/refs block/path-refs block/page] {:keys [block/tags]} :block/page} row]
                (or (first (get-date refs)) (first (get-date path-refs)) (first (get-date tags)))))
            result))
    :group-by-page? false
}
#+END_QUERY

Siferiax · May 4, 2023, 7:10am

Kudos! I was personally thinking about an or statement, or using rules. But if this works, then great!
The only thing I wonder about is the :block/refs. I don’t think you need that as it is already included in :block/path-refs.

kleer · May 4, 2023, 2:12pm

Thanks for pointing out that :block/refs could be redundant. I hadn’t realized, until you mentioned it, that all refs are also picked up in path-refs. That said, I still need to keep :block/refs in my query. This is to address the case where both the block and one or more of its parents contain date tags. This happens often in my notes – for instance, when someone writing in 1715 refers to something that happened say in 1706. In that case, programmatically my query would have no way of knowing which of the two date tags available in :block/path-refs is the one I want (in the case of my example, 1706 rather than 1715).
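A minimal sketch of that ambiguity, with invented data for the 1715/1706 case. The names block-refs, block-path-refs and date-names are hypothetical, and I’m assuming nothing about the order in which Logseq returns path-refs. It runs in a plain Clojure REPL.

```clojure
(require '[clojure.string :as str])

;; Hypothetical refs for a block under a parent tagged date/1715,
;; where the block itself is tagged date/1706.
(def block-refs      [{:block/name "windhandel"} {:block/name "date/1706"}])
(def block-path-refs [{:block/name "windhandel"}
                      {:block/name "date/1715"}
                      {:block/name "date/1706"}])

(defn date-names
  "Return the date portion of every date/... tag found in refs."
  [refs]
  (for [ref refs
        :let [tag (:block/name ref)]
        :when (str/starts-with? tag "date/")]
    (subs tag 5)))

;; path-refs alone offers both dates, with nothing to say which is closer.
(date-names block-path-refs)
;; => ("1715" "1706")

;; Checking :block/refs first settles it in favour of the block's own tag.
(or (first (date-names block-refs))
    (first (date-names block-path-refs)))
;; => "1706"
```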

I’ll keep experimenting with or patterns in the :where clause in case I can find a way to make that work. It would surely be a simpler solution. But so far I can’t keep my efforts from narrowing down the result set, often to zero items.

Siferiax · May 5, 2023, 6:07am

Made a page to test things.

First some data, then a query I’m messing with, then your query.
The problem I run into is that it will not give the full tag back. But it does get all the blocks.
If I just pull the blocks out, the result is this:

Here’s the query I’m working with. Just the blocks

#+BEGIN_QUERY
{:title "date blocks"
 :query [:find (pull ?b [*])
   :where
     [?date :block/name "date"]
     [?p :block/name "history"] ;keeping my results in check :)
     [?b :block/page ?p]
     (or 
       [?b :block/path-refs ?dtag]
       [?p :block/tags ?dtag]
     )
     [?dtag :block/namespace ?date]
     [?dtag :block/name ?datename]
 ]
 :result-transform (fn [result] (sort result) )
}
#+END_QUERY

The result from the first screenshot.

#+BEGIN_QUERY
{:title "date blocks"
 :query [:find ?datename ;(pull ?b [*])
   :where
     [?date :block/name "date"]
     [?p :block/name "history"] ;keeping my results in check :)
     [?b :block/page ?p]
     (or 
       [?b :block/path-refs ?dtag]
       [?p :block/tags ?dtag]
     )
     ;[?dtag :block/namespace ?date]
     [?dtag :block/name ?datename]
 ]
 :result-transform (fn [result] (sort result) )
}
#+END_QUERY

I can change it to:

#+BEGIN_QUERY
{:title "date blocks"
 :query [:find ?datename (pull ?b [*])
   :keys date block
   :where
     [?date :block/name "date"]
     [?p :block/name "history"] ;keeping my results in check :)
     [?b :block/page ?p]
     (or 
       [?b :block/path-refs ?dtag]
       [?p :block/tags ?dtag]
     )
     [?dtag :block/namespace ?date]
     [?dtag :block/name ?datename]
 ]
 :result-transform (fn [result] (sort result) )
}
#+END_QUERY

And you’ll see the date key only contains date/year (due to [?dtag :block/namespace ?date] btw)

Maybe this’ll give some ideas for further messing around
I know I’m intrigued to find a good solution for it!

Edit:
Here’s the problem I think…
Due to the way namespaces work a block will have an abundance of path-refs.

These will all be returned. We can use [(clojure.string/starts-with? ?datename "date")], but that would still leave lots of results.

Hmmmm no after further testing your solution seems the most solid.
The problem I run into is you get ALL blocks on a page with a date reference and I don’t think that was the intent.

Siferiax · May 5, 2023, 6:13am

This seems fine actually. - No wait it messes up the sorting
But you do still get the blocks lol.

#+BEGIN_QUERY
{
    :title ["get blocks tagged with x, sorted by their associated date tags"]
    :query [
        :find (pull ?b [:block/content
            {:block/path-refs [:block/name]}
            {:block/page [:block/original-name {:block/tags [:block/name]}]}])
        :in $ ?tag   
        :where
            [?t :block/name ?tag]
            [?b :block/refs ?t]
        ]
    :inputs ["test"]
    :result-transform
        (fn [result]
            (defn get-date [refs]
                (for [elem refs
                    :let [tag (get-in elem [:block/name])]
                    :when (clojure.string/includes? tag "date/")]
                    (subs tag 5)))
            (sort-by (fn [row]
                (let [{:keys [block/path-refs block/page] {:keys [block/tags]} :block/page} row]
                (or (first (get-date path-refs)) (first (get-date tags)))))
            result))
    :group-by-page? false
}
#+END_QUERY

kleer · May 6, 2023, 3:48am

Thanks Siferiax for giving it a go, and providing a few useful pointers along the way. I particularly liked your use of the :block/namespace property, which I hadn’t previously noticed. Very elegant. Unfortunately, as you point out, it can return only the YEAR portion of my date tags.

So it was back to the approach I had been trying previously, of trying to pull out the date tags associated with any blocks tagged “x”, and somehow getting the query to pick out the lowest-level date tag (the one closest to the block tagged “x”) and ignore any date tags higher up the tree. I had hoped that something like the following query would achieve this.

#+BEGIN_QUERY
{
:title ["get blocks tagged with x, sorted by their associated date tags"]
:query [
    :find ?content ?rname
    :keys content date
    :in $ ?tag
    :where
        [?t :block/name ?tag]
        [?b :block/refs ?t]
        [?b :block/content ?content]
        [?b :block/page ?p]
        (or
            (and
                    [?b :block/refs ?refs]    
                    [?refs :block/name ?rname]
                    [(clojure.string/starts-with? ?rname "date/")]
            )
            (and  
                    [?b :block/path-refs ?refs]
                    [?refs :block/name ?rname]
                    [(clojure.string/starts-with? ?rname "date/")]
                    (not
                        [?b :block/refs ?refs]
                        [?refs :block/name ?rname]
                        [(clojure.string/starts-with? ?rname "date/")]
                    )

            )
            (and
                    [?p :block/tags ?refs]
                    [?refs :block/name ?rname]
                    [(clojure.string/starts-with? ?rname "date/")]
                    (not
                        (or
                            (and
                                [?b :block/refs ?refs]
                                [?refs :block/name ?rname]
                                [(clojure.string/starts-with? ?rname "date/")]
                            )
                            (and
                                [?b :block/path-refs ?refs]
                                [?refs :block/name ?rname]
                                [(clojure.string/starts-with? ?rname "date/")]
                            )
                        )
                    )
            )
        )

]
:inputs ["windhandel"]
:group-by-page? false
}
#+END_QUERY

In other words, return a date tag at a parent level (or from the page as a whole) only if one doesn’t already exist at the original block level (or on one of its parents). But, for reasons I can’t quite pin down in my set-logic-impaired brain, this particular formulation doesn’t get me that result. It still returns date tags higher up the tree even if one already exists at the block level.

Furthermore, it’s starting to seem to me that even if I could find a solution along these lines, it wouldn’t be as clean as the one I’ve already hit upon.

So I’m about ready to admit defeat. Seems to me that Datalog/Datomic is good at isolating the particular records for which you’re looking, but not great at returning some logical subset of those records’ properties. The pull API is better at that, and that’s exactly how the documentation recommends it be used: separating the business of isolating relevant records from obtaining information about those records.

Unfortunately there’s no way I can find to supply logic (or even keys) to the nested fields selected out via the pull API, meaning that any logic processing of this kind has to be handled in the result-transform clause. I would be happier with this approach if I could extract from it the final value of the lowest-available-level date tag and supply that value to the view clause. But I haven’t been able to manage that. I will settle for my existing solution anyway, knowing that at least it solves the sorting problem correctly – even if it’s not quite an ideal solution on a few other counts.

Thanks again for your help. Much appreciated.

Siferiax · May 6, 2023, 7:41am

This is quite easy to explain.
When you have a block with #date/1900/01/01 that block references that page, but also #date/1900/01 and #date/1900 and #date because technically it does reference all of them.
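A rough illustration of that expansion, as a sketch in plain Clojure rather than anything Logseq runs internally. namespace-ancestors is a hypothetical helper; it only mimics which page names a single namespaced tag ends up referencing.

```clojure
(require '[clojure.string :as str])

(defn namespace-ancestors
  "Hypothetical helper: list a namespaced page name and all of its parents."
  [page-name]
  (let [parts (str/split page-name #"/")]
    (for [n (range 1 (inc (count parts)))]
      (str/join "/" (take n parts)))))

(namespace-ancestors "date/1900/01/01")
;; => ("date" "date/1900" "date/1900/01" "date/1900/01/01")
```

So a starts-with test on "date/" still matches three of those four names for one tag, which is why the parent levels keep showing up in the results.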
The only thing I can think of is doing a recursive search using rules.
Here’s an example

Otherwise I agree with your assessment in the rest of your post.
I had some ideas this morning, but they failed as well lol.

kleer · May 13, 2023, 5:50pm

Here’s a slightly improved version of the query. I managed to pull the actual value of the optimal date tag into the result set, and then built a small view to display it alongside the blocks (which also provides a handy check that the sort is working properly). This new version also handles the case where a block has more than one date tag (which can easily happen, such as when a writer makes a claim, in one and the same sentence, about two different past events). In that case the block appears multiple times in the list, once for each date tag.

The custom view, unlike Logseq’s default query results view, doesn’t allow you to click through to the underlying block(s) in the secondary pane. I will work on that next (probably by building in hyperlinks for the blocks, if that’s possible; otherwise I will try just mimicking as much as I can of Logseq’s default view, making it unnecessary to click through to a secondary view).

#+BEGIN_QUERY
{
:title ["get blocks tagged with x, sorted by their associated date tags"]
:query [
    :find (pull ?b [:db/id :block/content {:block/refs [:block/name]} {:block/path-refs [:block/name]} {:block/page [:block/original-name {:block/tags [:block/name]}]}])
    :in $ ?tag
    :where
        [?t :block/name ?tag]
        [?b :block/refs ?t]
]
:inputs ["desired tagname here"]
:result-transform
    (fn [result]
        (defn get-dates [refs]
            (for [ref refs
                    :let [tag (get-in ref [:block/name])]
                    :when (clojure.string/starts-with? tag "date/")
                ]
                (subs tag 5)
            )
        )
        (sort-by :date
            (for [row result
                :let [{:keys [db/id block/content block/refs block/path-refs] {:keys [block/original-name block/tags]} :block/page} row]]
                    (into {}
                        (for [date (first (filter not-empty [(get-dates refs) (get-dates path-refs) (get-dates tags)]))]
                            {:date date, :page original-name, :content content}
                        )
                    )
            )
        )
    )
:view (fn [rows]
    [:table 
      [:thead
        [:tr
            [:th {:width "8%"} "Date"]
            [:th {:width "25%"} "Page"]
            [:th "Block"]
        ]
      ]
      [:tbody
        (for [r rows]
            [:tr
                [:td (get r :date)]
                [:td [:a {:href (str "#/page/" (get r :page))} (get r :page)]]
                [:td (get r :content)]
            ]
        )
      ]
    ]
)
:group-by-page? false
}
#+END_QUERY

Siferiax · May 14, 2023, 10:22am

I see you already have a link to the page. It’s exactly the same but instead of :block/original-name you use the attribute :block/uuid.

Nice addition to the solution this way

kleer · May 16, 2023, 10:13pm

Thanks for this Siferiax. Since adding this feature turned out not to be a trivial exercise, I am going to post one last version of the query, with this feature implemented.

It wasn’t as simple as adding :block/uuid to my pull specification and picking that extra data up in :result-transform. Approaching it that way caused 2/3rds of my normal result set to come back as empty maps. I suspect this is because a UUID field appears in multiple parts of any given query result row. The fix was to add an :as clause to the main UUID field. Since Clojure also handles UUID variables in a special way, a bit of string transformation was also required in the :result-transform and view clauses.

I’m pretty happy with this close-to-final result. The one thing it lacks is the ability to open any of the returned blocks in the right side panel with the usual shift-click. On the assumption a standard block reference (the UUID in double parentheses) would offer that functionality, I tried adding that to my custom view instead of an anchor tag. But in a query custom view the result is treated just as standard text; there’s no aliasing of the underlying block and no ability to shift-click on the block and open it in the right side panel. If anyone knows how to integrate block reference functionality into a custom view, kindly let me know.

#+BEGIN_QUERY
{
:title ["get blocks tagged with x, sorted by their associated date tags"]
:query [
    :find (pull ?b [[:block/uuid :as :uuid] :block/content {:block/refs [:block/name]} {:block/path-refs [:block/name]} {:block/page [:block/original-name {:block/tags [:block/name]}]}])
    :in $ ?tag
    :where
        [?t :block/name ?tag]
        [?b :block/refs ?t]
]
:inputs ["desired tagname here"]
:result-transform
    (fn [result]
        (defn get-dates [refs]
            (for [ref refs
                    :let [tag (get-in ref [:block/name])]
                    :when (clojure.string/starts-with? tag "date/")
                ]
                (subs tag 5)
            )
        )
        (sort-by :date
            (for [row result
                :let [{:keys [uuid block/content block/refs block/path-refs] {:keys [block/original-name block/tags]} :block/page} row]]
                    (into {}
                        (for [date (first (filter not-empty [(get-dates refs) (get-dates path-refs) (get-dates tags)]))]
                            {:date date, :page original-name, :uuid (str uuid), :content content}
                        )
                    )
            )
        )
    )
:view (fn [rows]
    [:table 
      [:thead
        [:tr
            [:th {:width "8%"} "Date"]
            [:th {:width "25%"} "Page"]
            [:th "Block"]
        ]
      ]
      [:tbody
        (for [r rows]
            [:tr
                [:td (get r :date)]
                [:td [:a {:href (str "#/page/" (get r :page))} (get r :page)]]
                [:td [:a {:href (str "#/page/" (str (get r :uuid)))} (get r :content)]]
            ]
        )
      ]
    ]
)
:group-by-page? false
}
#+END_QUERY

Siferiax · May 17, 2023, 5:17am

Hmmm interesting. Well glad you managed though! I would’ve thought just using :block/uuid was enough.

As for shift-clicking, yeah that doesn’t seem to work in custom views. It’s a shame. I feel it should be (made) possible.
But I’m not sure how the shift-clicking mechanism works, so I can’t comment on why it doesn’t work.

Darwis · May 17, 2023, 5:26am

Example for shift-click: it is possible to call the open_in_right_sidebar API via call-api on a shift-click. I just tested the pattern with a Clojure eval code block in Logseq.

^:hiccup [:button {:on-click (fn [e] (if (aget e "shiftKey") (call-api "show_msg" "click with shift") (call-api "show_msg" "normal click")))} "test"]

kleer · May 17, 2023, 10:22pm

Thanks for that Darwis! Of course now, sigh, I will have to learn a few things about the hiccup library.

kleer · May 18, 2023, 1:39am

Well, that was easy! Just replace the third line in the table row portion of the view definition with the line below. With that small change, a shift-click opens the clicked block in the right side panel as desired.

[:td {:on-click (fn [e] (if (aget e "shiftKey") (call-api "open_in_right_sidebar" (get r :uuid))))} (get r :content)]

Thanks again Darwis.

kleer · May 26, 2023, 10:57pm

Two final (I hope) tweaks, to fix two problems with the previous version. First, the into {} call was returning only the last date tag in each set, not passing all of them through as I had thought. Second, once I corrected that problem, I ran into one that @Siferiax identified near the outset of this thread: that for hierarchical tags Datalog queries will return multiple results – the child, plus any of its higher-level parents. I wanted only the bottom-level (lowest-child) dates, to avoid a confusing duplication of results.
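The first problem is easy to reproduce in a plain Clojure REPL. The sketch below uses invented rows (rows and blocks are hypothetical names, not data from my graph) to show why into {} keeps only the last date, and one possible way to keep a row per date instead.

```clojure
;; Two hypothetical rows for one block that carries two date tags.
(def rows [{:date "1706" :content "same block"}
           {:date "1715" :content "same block"}])

;; into {} merges the maps, so the later :date overwrites the earlier one.
(into {} rows)
;; => {:date "1715", :content "same block"}

;; Hypothetical blocks, each with its list of dates.
(def blocks [{:content "a" :dates ["1706" "1715"]}
             {:content "b" :dates ["1710"]}])

;; mapcat emits one row per (block, date) pair and flattens one level.
(sort-by :date
         (mapcat (fn [{:keys [content dates]}]
                   (for [d dates] {:date d :content content}))
                 blocks))
;; => ({:date "1706", :content "a"}
;;     {:date "1710", :content "b"}
;;     {:date "1715", :content "a"})
```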

The following new, much more lengthy, version of the :result-transform handler fixes both of these problems.

:result-transform
    (fn [result]
        (defn count-of-date-tag-in-string-of-all-date-tags [ref refs]
            (if (or (clojure.string/includes? ref ")") (clojure.string/includes? ref "("))
                1 ;just pass the malformed tag through for now and catch the error in the query output
                (count (re-seq (re-pattern ref) refs))
            )
        )
        (defn get-dates [refs]
            (for [ref refs
                    :let [tag (get-in ref [:block/name])]
                    :when (clojure.string/starts-with? tag "date/")
                ]
                (subs tag 5)
            )
        )
        (defn get-most-childish-dates [refs]
            (let [string-of-all-date-tags (clojure.string/join (get-dates refs))]
                (filter
                    (fn [ref]
                        (= (count-of-date-tag-in-string-of-all-date-tags ref string-of-all-date-tags) 1)
                        ;parent date tags, because they are substrings of their child date tags, will have counts > 1
                    )
                    (get-dates refs)
                )
            )
        )
        (defn get-best-dates [refs path-refs tags]
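            ;the "undated" fallback is wrapped in a vector so that the for loop below sees one item rather than seven characters
            (first (filter not-empty [(get-most-childish-dates refs) (get-most-childish-dates path-refs) (get-most-childish-dates tags) ["undated"]]))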
        )
        (sort-by :date
            (flatten
                (for [row result
                    :let
                        [
                            uuid (get-in row [:uuid])
                            content (get-in row [:block/content])
                            original-name (get-in row [:block/page :block/original-name])
                            refs (get-in row [:block/refs])
                            path-refs (get-in row [:block/path-refs])
                            tags (get-in row [:block/page :block/tags])
                        ]
                    ]
                    (for [date (get-best-dates refs path-refs tags)]
                        {:date date, :page original-name, :uuid (str uuid), :content content}
                    )
                )
            )
        )
    )

I could no doubt have written that Clojure code more elegantly or succinctly. But I chose to go with maximum readability, in case I have to update this query a year or two from now – by which time I will have long forgotten what I had to learn to get it to work.

I added the if test in the first function definition because calls to re-seq fail if the regex pattern contains a left or right parenthesis. I would have preferred testing date tags against a full regex pattern specification, rather than just eliminating possible format errors one-by-one. But while I could get that approach to work just fine in the clj command-line environment, for some reason the same code (even with the requisite extra leading backslashes) fails in the Logseq query environment.
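One possible way to sidestep the regex limitation, offered as a sketch that I have only reasoned about in plain Clojure and not tried inside a Logseq query: treat a date as a parent whenever some other date in the same set starts with it followed by a slash. most-specific-dates is a hypothetical name, and this is a prefix test rather than the substring count used in the query above, so it is a different technique and not a drop-in copy.

```clojure
(require '[clojure.string :as str])

(defn most-specific-dates
  "Keep only dates that are not a namespace parent of another date in the set."
  [dates]
  (let [dates (distinct dates)]
    (remove (fn [d]
              ;; d is a parent if any other date begins with d plus "/"
              (some #(str/starts-with? % (str d "/")) dates))
            dates)))

(most-specific-dates ["1706" "1706/03" "1706/03/12" "1715"])
;; => ("1706/03/12" "1715")

;; Parentheses in a malformed tag need no special handling here,
;; because no regex pattern is ever built from the tag text.
(most-specific-dates ["1706(?)" "1715"])
;; => ("1706(?)" "1715")
```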

This whole exercise has led me to favour a separation-of-tasks approach to writing advanced queries in Logseq: the :where clause for selecting the target set of records in the graph database; Datalog’s pull API in the :find clause for extracting the requisite data from that set of records; and Clojure in the :result-transform clause for manipulating the returned data as needed for the custom :view.

Siferiax · May 27, 2023, 1:18pm

Lovely addition. You’re being a superstar.
I like this approach. I either use a normal pull with just a sort in the result-transform, or I often use :keys for further processing in the result-transform.
Though unsure if that’s a workable way in this scenario, because honestly I have no idea what is even happening
(I’m not well versed in coding tbh)