Understanding not-clause in Datalog

In a few words, why is this:

(not
  [?b :block/refs ?ref]
  [?ref :block/name ?name]
  [(contains? ?ignore ?name)]
)

is not the same as this:

[?b :block/refs ?ref]
[?ref :block/name ?name]
(not
  [(contains? ?ignore ?name)]
)

Above this code, I’m asking for all TODO tasks; ?ignore is a map with all the refs that I want to exclude.
The first example returns all tasks except those with a ref to a page included in ?ignore (expected behaviour).
The second returns all tasks, including the ones I didn’t want to retrieve.

I’m sure I’ve just misunderstood a basic concept of this language (I don’t even know what it is called: datascript? datalog? datomic? clojurescript? I’m a bit confused).

Thanks!

The meaning is roughly this:

  • for the first example:
    • the block should not contain a reference to something that is in the ignore list
  • for the second example:
    • the block should contain a reference to something that is not in the ignore list
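
To make the difference concrete, here is a small Python model of the two readings. It is only a sketch of the semantics, not how Logseq or DataScript actually evaluates a query; the block names, page names, and the `refs` dictionary are invented for illustration.

```python
# Hypothetical data: each task block maps to the names of the pages it references.
refs = {
    "task-1": {"project"},             # only an allowed ref
    "task-2": {"archive"},             # only an ignored ref
    "task-3": {"project", "archive"},  # one allowed and one ignored ref
    "task-4": set(),                   # no refs at all
}
ignore = {"archive"}

# Example 1: the ref clauses are inside the not,
# so a block is kept only if NONE of its refs is in the ignore list.
example_1 = {b for b, names in refs.items()
             if not any(n in ignore for n in names)}

# Example 2: the ref clauses are outside the not,
# so a block is kept if AT LEAST ONE of its refs is not in the ignore list.
example_2 = {b for b, names in refs.items()
             if any(n not in ignore for n in names)}

print(sorted(example_1))  # ['task-1', 'task-4']
print(sorted(example_2))  # ['task-1', 'task-3']
```

Note that task-3 (mixed refs) and task-4 (no refs) are exactly the blocks on which the two examples disagree.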

Great explanation! I think I’m a bit closer to managing queries.

PS. In general, the way to look at the not clause is this:
you build a dataset within that clause, which then gets subtracted from the query results as they stand.

  • The set that gets removed in your first example is any ?b entities with a reference from your ignore list
  • The set that gets removed in your second example is any ?name values that are in your ignore list.

And then you basically get what mentaloid indicates.
So the way to think of it is, “what subset of data do I wish to remove from my results?”

Whenever I get confused about what my not is supposed to be, I actually build a query that tries to get the exact subset I don’t want to see. This then informs my not clause.
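
That workflow can be sketched in Python as two steps: first compute the subset you do not want, then subtract it. This is only an illustration of the idea; the data and the names `results`, `refs`, and `unwanted` are made up.

```python
# Hypothetical query results and the page names each block references.
results = {"task-1", "task-2", "task-3"}
refs = {
    "task-1": {"project"},
    "task-2": {"archive"},
    "task-3": {"project", "archive"},
}
ignore = {"archive"}

# Step 1: a "query" that returns exactly the blocks we do NOT want to see,
# i.e. blocks whose refs overlap with the ignore list.
unwanted = {b for b in results if refs.get(b, set()) & ignore}

# Step 2: the not clause subtracts that subset from the results.
remaining = results - unwanted

print(sorted(unwanted))   # ['task-2', 'task-3']
print(sorted(remaining))  # ['task-1']
```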

Hope this helps :slight_smile:

  • That perspective goes well for the first level of nesting.
    • In the second example, it would essentially read like this:
      • create a set A of blocks that contain a reference
      • create a set B of blocks that contain a reference in the ignore list
      • subtract B from A
    • Would be easier to read it like this:
      • include blocks that contain references
      • from the resulting set, exclude blocks whose reference is in the ignore list
  • Things get tricky when adding more nested levels.
    • Consider a query like this:
      :where
        [...a...]
        (not
          [...b...]
          (not [...c...])
        )
      
    • The above perspective would read it like this:
      • create a set A that passes condition a
      • create a set B that additionally passes condition b
      • create a set C that additionally passes condition c
      • subtract C from B to get a set S
      • subtract S from A
    • Would be easier to read it like this:
      • include blocks that pass condition a
      • from the resulting set, exclude blocks that pass condition b
        • but don’t exclude them if they also pass condition c
    • However, I prefer reading it like this:
      • blocks that pass condition a
      • but don’t pass condition b
        • except if they also pass condition c
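
The set reading and the "but ... except" reading of the nested query can be checked against each other with a toy Python model. It simplifies each condition to a plain boolean per block (real clauses bind variables and may match several values), and the block ids are invented.

```python
# Hypothetical blocks, each with three boolean conditions.
blocks = {
    "b1": {"a": True,  "b": False, "c": False},
    "b2": {"a": True,  "b": True,  "c": False},
    "b3": {"a": True,  "b": True,  "c": True},
    "b4": {"a": False, "b": False, "c": False},
}

# Set reading: A, then B within A, then C within B, S = B - C, result = A - S.
A = {k for k, v in blocks.items() if v["a"]}
B = {k for k in A if blocks[k]["b"]}
C = {k for k in B if blocks[k]["c"]}
S = B - C
by_sets = A - S

# Predicate reading: passes a, but does not pass b, except if it also passes c.
by_predicate = {k for k, v in blocks.items()
                if v["a"] and not (v["b"] and not v["c"])}

print(sorted(by_sets))          # ['b1', 'b3']
print(by_sets == by_predicate)  # True
```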

I don’t think so?
I would read it as

  • from the dataset remove
    • those ?name values that are present in the ?ignore list
    • and therefore those ?ref entities with those ?name values as their :block/name
  • crucially that leaves all ?b entities that have a reference outside of the ignore list
    • regardless of whether they also have a reference present in the ignore list.
      • this is because :block/refs is not a single value
  • to work around the multiple values, we need to include the ?b entities in our subset
    • this way the ?b entity gets subtracted as a whole, even if there are references outside the ignore list.

Example 2 therefore only excludes ?b entities that only have references present in the ignore list.
And example 1 excludes all ?b entities that have 1 or more references in the ignore list.
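
The effect of :block/refs holding multiple values can be seen by modelling the bindings as rows. The Python sketch below is only an approximation of how the clauses bind variables, with invented (block, name) rows; for simplicity it only considers blocks that have at least one reference.

```python
# Hypothetical (?b, ?name) rows, as the clauses
# [?b :block/refs ?ref] [?ref :block/name ?name] would bind them.
rows = [
    ("task-2", "archive"),
    ("task-3", "project"),
    ("task-3", "archive"),
]
ignore = {"archive"}

# Example 2 removes individual rows whose ?name is ignored...
kept_rows = [(b, name) for b, name in rows if name not in ignore]
# ...so task-3 survives through its ("task-3", "project") row.
example_2 = {b for b, _ in kept_rows}

# Example 1 removes every block that appears in ANY ignored row.
blocks_with_ignored_ref = {b for b, name in rows if name in ignore}
example_1 = {b for b, _ in rows} - blocks_with_ignored_ref

print(sorted(example_2))  # ['task-3']
print(sorted(example_1))  # []
```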

I think I will just conclude not clauses can be very opaque if you want to understand them thoroughly.
I was trying to think of a way to better articulate how not clauses work, but I feel I’m writing in circles.
And as you demonstrated already, when it comes to nesting it gets even more confusing! I have such an example in my own graph:

  • Blocks that don’t have a reference
  • and whose lineage doesn’t have a reference
  • don’t count references to note or activity, as having a reference

Basically a nested example 2.

(not [?b :block/refs])
(not
  [?b :block/parent ?par]
  [?par :block/refs ?r]
  (not 
    [?r :block/name ?name]
    [(contains? #{"notitie" "activiteit"} ?name)]
  )
)

(The actual query uses a rule to get through the whole lineage)
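
As a rough cross-check of what the nested query selects, here is a Python model of the simplified version shown above (direct parent only, no lineage rule). The block ids and the `blocks` structure are hypothetical, and references are represented only by their page names.

```python
ALLOWED = {"notitie", "activiteit"}

# Hypothetical blocks: parent id (or None) and the names of referenced pages.
blocks = {
    "root-a": {"parent": None, "refs": {"notitie"}},
    "root-b": {"parent": None, "refs": {"project"}},
    "child-a": {"parent": "root-a", "refs": set()},
    "child-b": {"parent": "root-b", "refs": set()},
    "child-c": {"parent": "root-a", "refs": {"project"}},
}

def keep(block_id):
    block = blocks[block_id]
    # (not [?b :block/refs]): the block itself must have no reference.
    if block["refs"]:
        return False
    parent = block["parent"]
    if parent is None:
        return True
    # Outer not: reject the block if its parent has some reference...
    # inner not: ...whose name is NOT one of the allowed names.
    return not any(name not in ALLOWED for name in blocks[parent]["refs"])

result = {b for b in blocks if keep(b)}
print(sorted(result))  # ['child-a']
```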

Describing sets comes less naturally to people who don’t use them daily.

  • This description right there reads better in my opinion.
    • Although the last line is vague.
      • e.g. not clearing up whether the exception applies:
        • to blocks as well
        • exclusively to their lineage
      • Indenting it would clarify its scope.
  • I would slightly change it like this:
    • blocks that neither have a reference themselves
    • nor their lineage has a reference
      • except (or other than) to note or activity
  • In other words, it helps to express the various nots with more specific negative words (here: neither, nor, except).

True!

Perhaps it is a language barrier thing, for me except is not natural :slight_smile:
I mean in the not clause sense of using it.
I guess that’s why I went for the “not count as” wording.
Other than would work I guess. Not sure why I’m tripping over the word except.

Probably a language barrier that I use simpler/less specific terms.

“except” can be read as:

  • besides
  • excluding
    • This should be natural enough.
  • with the exception/exclusion of
  • without including/counting those
    • This is closer to your own expression.
  • keep out those
    • This is the literal meaning.