PDF annotation in Logseq

I suggest implementing PDF annotations as blocks. For example, annotations could be stored in a page named after the corresponding PDF file. The block text itself would be the comment or a label (sub-blocks would either not be rendered or could act as a comment thread?), and the properties of that block would then include the following (based on “What Are Annotations?” | PSPDFKit):

  • type: /Annot
  • subtype: /Highlight
  • Rect: [52.9556, 728.343, 191.196, 743.218]
  • C: [1.0, 1.0, 0.0]
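
As a rough illustration of the idea, the sketch below writes one annotation out as a Logseq-style block (`key:: value` property lines under the block text) and parses it back. The `Annotation` class, the lower-cased property names, and both helper functions are hypothetical, chosen only for this example; they are not an existing Logseq format.

```python
from dataclasses import dataclass
import json


@dataclass
class Annotation:
    """One PDF annotation stored as a block (hypothetical schema)."""
    comment: str        # block text: the comment or label
    subtype: str        # e.g. "/Highlight"
    rect: list[float]   # [x1, y1, x2, y2] in PDF coordinates
    color: list[float]  # RGB components in the range 0..1


def to_block(a: Annotation) -> str:
    """Render the annotation as block text followed by `key:: value` properties."""
    return "\n".join([
        f"- {a.comment}",
        "  type:: /Annot",
        f"  subtype:: {a.subtype}",
        f"  rect:: {json.dumps(a.rect)}",
        f"  c:: {json.dumps(a.color)}",
    ])


def from_block(text: str) -> Annotation:
    """Parse a block produced by to_block back into an Annotation."""
    first, *rest = text.splitlines()
    props = {}
    for line in rest:
        key, _, value = line.strip().partition(":: ")
        props[key] = value
    return Annotation(
        comment=first.removeprefix("- "),
        subtype=props["subtype"],
        rect=json.loads(props["rect"]),
        color=json.loads(props["c"]),
    )


note = Annotation("Key definition", "/Highlight",
                  [52.9556, 728.343, 191.196, 743.218], [1.0, 1.0, 0.0])
block = to_block(note)
print(block)
```

Because the stored form is plain text, the same block can be linked, queried, or version-controlled like any other note.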

This approach follows the TiddlyWiki philosophy that “everything is a tiddler” or, in this case, a block. It has many advantages, including:

  • We don’t have to develop or maintain anything new, as blocks and block properties already exist. It is “merely” a question of rendering the blocks as an annotation layer. That will of course take some work, but PDF.js supports an annotation layer, and it seems far easier than treating annotations as a “special data type” or something similar.
  • The annotation can now be treated as any other piece of information; it can be built upon or linked or filtered and the annotations are saved as accessible plain text

I will move some of my suggestions from the Discord here to keep track.

Touchpoints

Primarily examples of possible workflows that seem to integrate well with key Logseq design decisions (text-based, web-native, etc.)

  • hypothes.is: appears to be a general web-page annotation tool, but it works with PDFs (even local ones) just fine. They’re built on open web standards and have a quite active GitHub organization with a wide variety of tools/users.

  • Polar Bookshelf: a web-native, cross-platform tool for storing/managing/annotating PDFs. Seems to have a similar ethos to Logseq or Obsidian. I haven’t used it in a while, since I kept needing Zotero-style bib management (and switching apps for annotation, note-taking, and archival is what pushed me to OrgMode and then to Logseq in the first place). But it’s looking quite polished lately.

  • org-noter: not web-native, but it does a wonderful job of taking advantage of plain-text storage options (org). I used to use this a lot, along with org-noter-pdftools for pushing my notes back into the PDF file itself (to share with colleagues). But trying to merge this with org-roam, md-roam, org-roam-bibtex, etc. was becoming an ever-taller house of cards, even with Doom Emacs as a base.

  • Highlights.app: this is the one that almost made me get a Mac, haha. All annotations can be saved/synced to sidecar MD files, annotations can be saved to the PDF itself for sharing, etc. None of the above has come close for me in terms of capturing this functionality with open source and Linux.

Needs

So, this will obviously be a personal needs thing. But they come from a long-time researcher, and a lot of delving into each of these communities.

  • Sidecar and/or plaintext syncing: huge believer in version control for notes and the ability to mine/hack your own notes with scripts a la @karlicoss and HPI. This IMO necessitates human-readable plain-text as the annotation “source of truth” (barring e.g. drawings on the page).
  • Access to relative PDF locations: We probably have existing PDF management systems like Zotero, Polar, or even just Dropbox. I recommend that the “graph” link to those as relative locations and then cache the PDF in e.g. the database as needed. Try not to replace PDF storage solutions, but integrate with existing ones. Otherwise, that way lies madness.
  • User control over annotation metadata: The most “pdf-crazy” group is likely the academia crowd (I could be biased). But this means that some annotations need to be highly identifiable/sharable (e.g. Sending a cool paper with my notes to a friend, or collaborating on annotations with colleagues). Other times the annotations have to have NO metadata (e.g. an anonymous paper review). The chance for accidentally including your name in the MD sync is huge in something like Org-noter where it grabs your username by default as the author. This is not really configurable on a per-file/folder/environment setting in org-noter, which means lots of manual tweaking, ALL THE TIME.
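
Two of these needs can be sketched together: store the PDF link relative to the graph folder, and drop identifying properties when anonymity is required. Everything named here (`export_annotation`, the `anonymous` flag, the `author`, `created`, and `file` property names) is hypothetical and only illustrates the idea.

```python
import posixpath

# Properties that could identify the annotator (assumed names for this sketch).
IDENTIFYING_KEYS = {"author", "created"}


def relative_pdf_link(graph_root: str, pdf_path: str) -> str:
    """Return pdf_path expressed relative to graph_root (POSIX-style paths)."""
    return posixpath.relpath(pdf_path, start=graph_root)


def export_annotation(props: dict, graph_root: str, pdf_path: str,
                      anonymous: bool) -> dict:
    """Copy the annotation properties, add a relative file link,
    and strip identifying keys when anonymous is True."""
    out = dict(props)
    out["file"] = relative_pdf_link(graph_root, pdf_path)
    if anonymous:
        for key in IDENTIFYING_KEYS:
            out.pop(key, None)
    return out


props = {"subtype": "/Highlight", "author": "jane"}
shared = export_annotation(props, "/home/me/graph",
                           "/home/me/Zotero/paper.pdf", anonymous=False)
review = export_annotation(props, "/home/me/graph",
                           "/home/me/Zotero/paper.pdf", anonymous=True)
print(shared)
print(review)
```

Making `anonymous` an explicit argument means it could be set per file, folder, or environment instead of being tweaked by hand each time.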

Ok, sorry for the wall of text. Hopefully there’s some useful food for thought here? Thanks for making the thread!


Really happy with the option to read and annotate in Logseq. I would like to suggest an option to copy a reference, with link to the highlight in the pdf, and a simple copy of the text. In this way, you can edit the text, like include a cloze deletion via the SRS plugin, but with a reference to the original in the pdf. It would be similar to what Zotfile does in Zotero. An example:


And just to show the possibility of using SRS:

You can do this by using the [Title of the note]( ((block reference)) ) syntax, like I did for the example, but it’s a little more work.
The text copied from the PDF could also come wrapped in " ", to make it clear that it is quoted from another text.
It would be great if the highlights persisted in the original file, not only while you use Logseq.
Thank you so much for doing the plugin.


I think this is a great suggestion. I often manipulate the text that comes in from highlights or use it for making SR cards, but linking back to the original is helpful if I need to remind myself of the context or get more text from the source… Especially if I end up using it elsewhere in the graph.


I think this is exactly right. It might even be enough to copy the text without updating the original PDF with a highlight (i.e. use Ctrl+F if you need to find the exact text again).

The key problem right now is that the reference doesn’t link back to a stable citation for the PDF and a local copy, so it’s not useful for long term note taking.


I agree with this suggestion; it would significantly speed up my ability to read and take notes with PDFs. Below is how I think it would be most helpful:
image

So here’s how I envision it working:

  1. You highlight the text, and the pop-up menu comes up.
  2. The option would be available in the menu to “Copy Text with Reference”
  3. When “Copy Text with Reference” is selected, it would be added to my clipboard (the highlight would also remain in the PDF).
  4. When I go into the Logseq section and press Ctrl+V, it would paste as you showed.

Then I could edit the text and click the “P. 2” part to return to the highlighted part of the PDF.
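
A minimal sketch of what the “Copy Text with Reference” step might place on the clipboard, using the `[label]( ((block reference)) )` style of link mentioned earlier in the thread. The function name, the `P. n` label format, and the example identifier are made up for illustration; the actual plugin may format references differently.

```python
def copy_text_with_reference(text: str, page: int, block_uuid: str) -> str:
    """Build the clipboard string: quoted text plus a labelled block reference."""
    # Collapse the line breaks and repeated spaces that PDF text selection leaves behind.
    quoted = " ".join(text.split())
    return f'"{quoted}" [P. {page}]((({block_uuid})))'


clip = copy_text_with_reference("Deep work is\nincreasingly   rare.", 2, "example-uuid")
print(clip)
```

The quoted text stays editable after pasting (for example, to add a cloze deletion), while the `P. 2` link keeps pointing at the original highlight.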

This would be a game-changer for me, and I would finally start loading PDFs into Logseq. Hope it happens! Big thanks to the developers for putting together such amazing software!


The current PDF annotation implementation is pretty good, but I think the following improvements would make it better:

  1. Copy Reference without text
    Sometimes I only need a reference link back to the PDF, without the text (e.g. when summarizing a lengthy paragraph, or just referring to a picture).

  2. More contrast on the highlight region when clicking the reference link
    The current implementation only jumps to the page containing the highlight, but for academic papers and other densely packed PDF material, jumping to the page is not enough. Perhaps a little border, or temporarily changing the color to create a flash effect? (e.g. in Zotero a little frame is drawn to indicate the source highlight, see pic below)

image

  3. Indicate highlight color in the reference link
    Currently all reference links use the same pin emoji. It would be great if the emoji could be colored according to the highlight color. If that’s hard to implement, a solid color block would also work fine (like when you mark a block as TODO).
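
The color indicator suggested above could be as simple as a lookup from highlight color to a colored emoji, with the current pin as a fallback. The color names below are assumptions for this sketch, not a confirmed list of what Logseq stores.

```python
from typing import Optional

# Assumed highlight color names mapped to a same-colored marker.
COLOR_MARKERS = {
    "yellow": "🟡",
    "red": "🔴",
    "green": "🟢",
    "blue": "🔵",
    "purple": "🟣",
}
DEFAULT_MARKER = "📌"  # the pin used today, kept as the fallback


def reference_marker(color: Optional[str]) -> str:
    """Return the marker to show in front of a reference link."""
    if color is None:
        return DEFAULT_MARKER
    return COLOR_MARKERS.get(color.lower(), DEFAULT_MARKER)


print(reference_marker("Yellow"), reference_marker(None))
```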

For 1: agree that having a dedicated point reference is helpful. I have a workaround below:
    You can Shift + click and edit the text in the sidebar.
    But if you use Shift + Enter in a block, it doesn’t show the lines below. You can use <quote if you want to have multiple lines.
    I can add a caption for a picture this way.

Is it possible to let the reference text open the PDF directly, without first opening the PDF through a link?
For example, I would like to open page 243 of the PDF directly by clicking this link.
image


+1 for “Copy Reference without text” (as can be done in RemNote).


@basalt78 @taooceros There’s a feature request for this, “PDF annotation: add pin reference in addition to quote reference”. Please upvote.


I am now working with PDF annotation in Logseq and Zotero and expanded @bluejaw’s comments into a feature request.


On the topic of adding more “Copy…” operations, it’s important for the devs to consider that it might make more sense to have one single general “Copy” operation and a bunch of “Paste…” operations (as is now standard in many operating systems and applications) than the other way around. See my comment here: PDF Annotation: Copy Text With Block Ref, Citekey, and Page Number - #2 by huy
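
The single-copy, many-paste idea can be sketched as one clipboard payload plus several formatters that run at paste time. All names here (`CopiedHighlight`, the `paste_*` functions, the `P. n` label) are hypothetical and only illustrate the design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopiedHighlight:
    """Everything a single "Copy" captures; formatting is decided at paste time."""
    text: str
    page: int
    block_uuid: str


def paste_quote(h: CopiedHighlight) -> str:
    """Paste only the quoted text."""
    return f'"{h.text}"'


def paste_reference(h: CopiedHighlight) -> str:
    """Paste only a labelled link back to the highlight."""
    return f"[P. {h.page}]((({h.block_uuid})))"


def paste_quote_with_reference(h: CopiedHighlight) -> str:
    """Paste the quoted text followed by the link."""
    return f"{paste_quote(h)} {paste_reference(h)}"


# One copy operation, several paste operations to choose from.
PASTE_OPERATIONS = {
    "Paste text": paste_quote,
    "Paste reference": paste_reference,
    "Paste text with reference": paste_quote_with_reference,
}

copied = CopiedHighlight("Annotations are blocks.", 2, "example-uuid")
for label, paste in PASTE_OPERATIONS.items():
    print(f"{label}: {paste(copied)}")
```

A new output format then only needs one more entry in `PASTE_OPERATIONS`; the copy side never changes.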


Hi guys,

All my graphs are centralized on OneDrive. When I use the PDF annotator on my main computer to copy references, those references don’t work on my laptop’s Logseq installation. Likewise, references made on my laptop don’t work on my main system. In other words, the PDF annotator doesn’t seem able to handle links in a graph that is shared through the cloud.

Best,
Jean


It would help if PDFs with existing annotations had the block refs already available. Right now you have to highlight again to generate a block ref.


In addition to the above, I would really like to be able to type notes and draw directly on the PDF itself in Logseq, the way I can in Apple Preview. Right now Logseq’s PDF support only works for conventional languages, but for musical scores, it would be nice to be able to put in fingerings, chord analysis, and performance notes.


I am eager to see the PDF annotation features (the desktop version is fantastic) available on mobile devices. I believe it would make Logseq the best note-taking app. There is already a request page here: PDF annotation on mobile devices


I am gathering ideas on improvements for document annotations here if someone would like to discuss them:


Hi,

I was very frustrated by Logseq’s lack of PDF annotation import, so I forked pdfannots to create a proof of concept that imports PDF annotations. Here’s the link.

You can use it like so:

    python ./pdfannots.py -f md_and_edn path_to_pdf -o path_to_pages_markdown --edn_output path_to_asset_edn

This will create the markdown annotation file (in the pages folder) as well as the edn files containing the metadata of the highlights (in the assets folder).
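
For a whole folder of PDFs, the same command could be wrapped in a small script. This is only a sketch: the flags are the ones quoted above, but the output file names (`<pdf name>.md` and `<pdf name>.edn`) are placeholders I have assumed, as are the function names and the `dry_run` switch.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf: Path, pages_dir: Path, assets_dir: Path) -> list[str]:
    """Build the pdfannots command line for one PDF.

    Output names are assumed placeholders, not a naming scheme Logseq requires.
    """
    return [
        sys.executable, "./pdfannots.py",
        "-f", "md_and_edn",
        str(pdf),
        "-o", str(pages_dir / f"{pdf.stem}.md"),
        "--edn_output", str(assets_dir / f"{pdf.stem}.edn"),
    ]


def import_all(pdf_dir: Path, pages_dir: Path, assets_dir: Path,
               dry_run: bool = True) -> list[list[str]]:
    """Build (and, unless dry_run is set, run) one command per PDF in pdf_dir."""
    commands = [build_command(pdf, pages_dir, assets_dir)
                for pdf in sorted(pdf_dir.glob("*.pdf"))]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands


example = build_command(Path("assets/paper.pdf"), Path("pages"), Path("assets"))
print(" ".join(example[1:]))
```

With `dry_run=True` (the default here) nothing is executed, so the commands can be reviewed before any files are written.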

This is still early and does not yet support parsing highlight colors or annotations other than simple text highlights (no shapes, no rectangles, etc.). Don’t hesitate to help with the PR, as I’m not a pro at this!


Why do we want to do PDF annotation in Logseq? Maybe the basics, OK, but once we get beyond a very narrow window of annotation, why not send the PDF to another app (PDF Expert, Preview) and let that app annotate it?

That may create an added task of bringing back the annotated version, but that’s a small price to pay for always having access to the most up-to-date annotation app with no effort on our part.