Cleaning Up a Messy Graph

I followed the advice of “just start writing” when I was starting out. Over time, as I learned more, my ideas about tagging and the use of properties have changed. I want a consistent tagging and properties structure, but my existing notes don’t have one. I have a lot of tags and property keys that I no longer use, and notes that are improperly annotated.

What is your workflow for cleaning up the mess you created when getting started?

Good question. I would also like to hear about the experience of others.

My flow (which I would like to improve):

  • open Graph
  • find orphan page and open it
  • add property tags:: to page or delete a page
  • repeat

Unfortunately, it is not possible to keep the graph on the main screen and open a page in the sidebar with Shift+left click.

Maybe it is possible to create a query that shows all orphan pages. That would be better.

I have found the tags plug-in to be a useful aid here.

Too many tags with one or two references might be a sign that you’re over-thinking things and an opportunity to clean up near-duplicates.
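If you want the same overview without a plugin, here is a minimal Python sketch that counts tag usage across the Markdown files of a graph and lists the rarely used ones. Everything in it is an assumption for illustration: the function names `count_tags` and `rare_tags` are made up, the regex only recognizes `#tag` and `#[[multi word tag]]` (not `tags::` properties or plain `[[page]]` links), and it scans every `.md` file under the folder you give it, backups included.

```python
import re
from collections import Counter
from pathlib import Path

# Simplified tag pattern (an assumption, not Logseq's real parser):
# matches #tag and #[[multi word tag]], skips Markdown headings and URL fragments.
TAG_RE = re.compile(r"(?<![\w/#])#(?:\[\[([^\]]+)\]\]|([\w/-]+))")


def count_tags(graph_dir):
    """Count how often each tag appears in the .md files under graph_dir."""
    counts = Counter()
    for path in Path(graph_dir).rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        for bracketed, plain in TAG_RE.findall(text):
            # Lowercase so #Project and #project count as one tag.
            counts[(bracketed or plain).lower()] += 1
    return counts


def rare_tags(counts, max_refs=2):
    """Return the tags used at most max_refs times, sorted by name."""
    return sorted(tag for tag, n in counts.items() if n <= max_refs)
```

Calling `rare_tags(count_tags("path/to/graph"))` gives the list of tags with one or two uses, which is where near-duplicates tend to hide.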

But it would not help if you need to clean up pages with no tags :frowning_face:

Looking for some good suggestions for this too. In the past, when I ran into this problem, I just started a new database because it was too much to fix.

I will +1 the tags plugin suggestion, that helped me for sure.

I have a workflow page with a bunch of sanity check queries on it.

Here’s a few general ones for your use :saluting_face:

#+BEGIN_QUERY
{:title [:b "Tasks without tag"]
 :query [:find (pull ?b [*])
  :where
   [?b :block/marker ?mark]
   [(contains? #{"TODO" "LATER"} ?mark)]
   [?m :block/original-name ?mark]
   (not 
     [?b :block/refs ?r]
     [(!= ?m ?r)]
   )
   [?b :block/page ?p]
   [?p :block/journal? true]
 ]
 :breadcrumb-show? false
}
#+END_QUERY

#+BEGIN_QUERY
{:title [:b "Broken References"]
 :query [:find (pull ?b [*])
  :in $ ?matcher
  :where
   [(re-pattern ?matcher) ?regex]
   [?b :block/content ?c]
   [(re-find ?regex ?c)]
   [?b :block/refs ?br]
   [(missing? $ ?br :block/content)]
   [(missing? $ ?br :block/name)]
 ]
 :inputs [ "\\([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\\)"]
}
#+END_QUERY

(I hope this one still works as of today lol)

#+BEGIN_QUERY
{:title [:b "Empty files"]
 :query [:find (pull ?p [*])
  :where
   [?p :block/journal? false]
   [?p :block/file _]
   (not 
     [?b :block/page ?p]
     (not [?b :block/content ""] )
   )
 ]
}
#+END_QUERY

#+BEGIN_QUERY
{:title [:b "orphan pages"]
 :query [:find (pull ?p [*])
  :where
   [?p :block/file _]
   [?p :block/name ?name]
   (not [(clojure.string/starts-with? ?name "hls__")])
   [(missing? $ ?p :block/namespace)]
   (not (or
     [?b :block/refs ?p]
     [?b :block/page ?p]
   ) )
 ]
}
#+END_QUERY

Thanks for the queries!

The “Empty files” query misses files that are totally empty; it does detect ones that contain only “-” at the beginning (which appear empty in Logseq).

I will add another query that is useful for me, for those who get page conflicts via OneDrive, which creates another file with “-PC{SomeNumber}” in its name under the hood (in my company, every PC is named PC{SomeNumber}).
This query surfaces those conflicts so that I can handle them myself.

#+BEGIN_QUERY
{:title "Duplicate conflict pages"
 :query [:find (pull ?p [*])
  :where
   [?p :block/name ?n]
   [(re-pattern "(.+)-pc.+") ?rx]
   [(re-find ?rx ?n) [?s ?s1]]
   [?p1 :block/name ?s1]
 ]
 :breadcrumb-show? false
 ; :view :pprint
}
#+END_QUERY

Yes true. The only thing you’re missing is empty files (leftover as it were) that are still referenced somewhere in your graph.
If they’re not referenced, then through all pages => three dots menu => remove orphaned pages, Logseq can delete them for you.

Trying to explain to datalog what an empty file is, is a bit difficult.
We have to say ok it is a page without blocks. But (not [?b :block/page ?p]) results in basically all your graph’s pages, as there is always a block in your graph that’s not on that page.
So we would have to pose the question “of all the blocks in my graph which pages have none of them”. And that is probably a performance nightmare. Though I haven’t tried to build it.
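
One way to sidestep the datalog question is to check the files on disk instead of the database. Below is a minimal Python sketch under a few assumptions: the graph is stored as Markdown, pages live as `.md` files in one folder (typically `pages` inside the graph folder), and "empty" means the file has no text at all or only bare “-” bullets. The names `is_effectively_empty` and `find_empty_pages` are made up for this example.

```python
from pathlib import Path


def is_effectively_empty(text):
    """True if the text is blank or consists only of bare '-' bullets."""
    for line in text.splitlines():
        if line.strip() not in ("", "-"):
            return False
    return True


def find_empty_pages(pages_dir):
    """Return the sorted names of .md files in pages_dir that look empty."""
    return sorted(
        path.name
        for path in Path(pages_dir).glob("*.md")
        if is_effectively_empty(path.read_text(encoding="utf-8"))
    )
```

Calling `find_empty_pages("path/to/graph/pages")` returns the file names to review. It only reads files and deletes nothing, so pages that are still referenced elsewhere in the graph can be checked by hand before removal.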
