Wildcard functionality for linked references, unlinked references & hierarchy

Wildcard functionality of some sort would increase Logseq’s knowledge-connecting abilities.

It would apply to linked references, unlinked references and hierarchy.

The idea is to expand linking matches (expand-linking-matches) so that Logseq also finds words that partially match a page name. For example, the page “Darwin” would connect with pages that mention the word “Darwinism”.

Similar to a wildcard such as *darwin*
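
To make the wildcard idea concrete, here is a minimal sketch in JavaScript of how a pattern such as *darwin* could be turned into a case-insensitive regular expression. The function name wildcardToRegExp is hypothetical and this is not how Logseq currently matches references; it only illustrates the intended behaviour.

```javascript
// Hypothetical helper, not part of Logseq: turn a wildcard such as "*darwin*"
// into a case-insensitive RegExp that must match a whole word.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts between the stars.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    // Each "*" becomes "any run of characters".
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

const re = wildcardToRegExp("*darwin*");
console.log(re.test("Darwin"));    // true
console.log(re.test("Darwinism")); // true
console.log(re.test("Newton"));    // false
```

Dropping the leading star (darwin*) would keep “Darwinism” but reject a word that merely contains “darwin” somewhere in the middle.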

An option in config.edn such as the following could enable or disable this functionality:

 ;; expand-linking-matches so that Logseq searches for words that have a
 ;; partial match, for example the page "Darwin" would connect with pages that
 ;; mentioned the word "Darwinism".
 {
   :expand-linking-matches/linked-ref true ; Default false
   :expand-linking-matches/unlinked-ref true ; Default false
   :expand-linking-matches/hierarchy true ; Default false
 }

Surprisingly enough, partial matching is working in Arabic!


Something I added on Discord earlier. I’m not sure if it’s worth taking further, as my implementation is very clunky.


Note: You need to refresh/reload Logseq to get this to work (or at least I had to…)

I made a query to try to demo the wildcard functionality. It searches for partial references, with the target being the current page name (similar to your image example from earlier).

It’s not very practical at the moment as you need to add the query manually to pages.

{:query [:find (pull ?b [*])
         :in $ ?current-page
         :where
         [?b :block/content ?content]
         [(str "(?i).*" ?current-page ".*") ?regs]
         [(re-pattern ?regs) ?regx]
         [(re-matches ?regx ?content)]]
 :inputs [:current-page]}

In the attached screenshot, the query finds both linked and unlinked references when run on the page Experiment, matching Experiment and Experimentation in my notes.

The query runs with wildcards on both sides of the current page name.

I found that you might not want a wildcard at the beginning, because many words contain another, completely unrelated word; see Words in Words for some examples.

To remove the beginning wildcard, just modify the query with the replacement below. Note that re-matches has to match the whole block, so this version only finds blocks that begin with the page name:

         [(str "(?i)" ?current-page ".*") ?regs]
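
The difference between the two variants can be sketched outside Logseq in plain JavaScript. The helper names below are hypothetical, and the word boundary \b is my own substitution for "no leading wildcard" so that the name may start any word in the block, not just the block itself. JavaScript's \b only knows ASCII word characters, so treat this as an illustration for English text rather than a general solution.

```javascript
// Hypothetical helpers for illustration; none of this is Logseq code.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wildcards on both sides: the name may appear anywhere,
// even in the middle of an unrelated word.
function matchesAnywhere(pageName, content) {
  return new RegExp(escapeRegExp(pageName), "i").test(content);
}

// Trailing wildcard only: the name must begin a word,
// but other text may come before it in the block.
function matchesWordStart(pageName, content) {
  return new RegExp("\\b" + escapeRegExp(pageName), "i").test(content);
}

console.log(matchesAnywhere("art", "I started the party"));  // true
console.log(matchesWordStart("art", "I started the party")); // false
console.log(matchesWordStart("art", "An artistic block"));   // true
```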

To get around having to add the query to every page, you could put the query in custom.js and interact via the API.

// targetString is the text to match, e.g. the current page name; define it before running this.
let extendedReferenceSearchData = logseq.api.custom_query(`[:find (pull ?b [*])
:where
[?b :block/content ?content]
[(str "(?i).*${targetString}.*") ?regs]
[(re-pattern ?regs) ?regx]
[(re-matches ?regx ?content)]]`);

You could then use some sort of MutationObserver to run this on every page. One issue is that the output from the query via the API is just data, so you would need to style it, and that might be a waste of time if this is a 10-minute fix that the Logseq devs could make to the linked references and unlinked references functions…
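
A rough sketch of the MutationObserver idea, under stated assumptions: makePageChangeNotifier and watchPages are hypothetical names, and how to read the current page name (getPageName) is left to the caller because I have not settled on a reliable way to do that from custom.js. The first function is plain logic; the second is browser-only wiring.

```javascript
// Hypothetical helper: calls onPageChange(name) only when the page name
// returned by getPageName() differs from the last one seen.
function makePageChangeNotifier(getPageName, onPageChange) {
  let lastName = null;
  return function check() {
    const name = getPageName();
    if (name && name !== lastName) {
      lastName = name;
      onPageChange(name);
    }
  };
}

// Browser-only wiring (assumption: root is a DOM node that changes on
// navigation). Every DOM mutation triggers a cheap check.
function watchPages(root, check) {
  const observer = new MutationObserver(() => check());
  observer.observe(root, { childList: true, subtree: true });
  return observer; // call observer.disconnect() to stop watching
}
```

In custom.js, onPageChange would be the place to run the query above with the new page name as targetString and render the returned data.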

Amazing effort @drawingthesun. Of course, having it as built-in functionality would be a lot more convenient.


Here is a clear visualization of the inconsistency between Arabic and English


Removing the wrapping part of the regex (which bans alphabetic characters from the matching result) should work.
However, the existing behavior has some benefits, and that is what the regex is for. There should be a config option to switch the behavior (a UI would be better).
Furthermore, the code runs the regex against full datoms, which is very slow… Ideally it should switch to search/block-search, but that would harm flexibility and require a big effort…
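
For anyone picking this up, the switch could look roughly like the following JavaScript sketch. It is not Logseq's actual code: buildReferenceRegExp and its expand flag are hypothetical, and using the Unicode letter class \p{L} for the "wrapping" part is my assumption about one way to treat all scripts the same.

```javascript
// Hypothetical sketch, not Logseq's implementation.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// expand = false: reject matches that touch another letter (whole word only).
// expand = true:  accept partial matches such as "Darwin" in "Darwinism".
function buildReferenceRegExp(pageName, expand) {
  const core = escapeRegExp(pageName);
  // The lookarounds are the "wrapping part": no letter directly before or after.
  const source = expand ? core : "(?<!\\p{L})" + core + "(?!\\p{L})";
  return new RegExp(source, "iu");
}

const text = "Darwinism spread quickly";
console.log(buildReferenceRegExp("Darwin", false).test(text)); // false
console.log(buildReferenceRegExp("Darwin", true).test(text));  // true
console.log(buildReferenceRegExp("Darwin", false).test("Charles Darwin wrote")); // true
```

Because \p{L} covers letters in any script, the strict mode in this sketch would behave the same way for Arabic and English page names.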

Anyway, I’m happy to review if somebody is going to contribute a PR addressing this.
