PDF annotation in Logseq

What do you want for PDF annotation feature support in Logseq?

I am more interested in importing annotations from other apps than annotating in Logseq itself. Not sure if that should be a separate discussion? But basically if I have an annotated PDF that I import, I’d like to have a page of all the annotations. Ideally with some hierarchy marking section and chapter headings… (Right now I use Readwise to import these via an Obsidian plugin.)


I really like the implementation displayed here:

Highlights with text and area selection (for images and tables) options.
It would be cool to have the changes made in Logseq show up in the PDF file outside of the application.
For some suggestions on a nice mobile experience, I recommend having a look at this video by
Bryan Jenks on Zotero's PDF annotation feature: https://youtu.be/yjN-sdvhH3A?t=438
The UI is very clean and intuitive, but I don't know if mirroring Zotero is possible with an outline format.
Looking forward to this feature. Thank you!


I suggest using blocks to implement PDF annotations. For example, annotations can be stored in a page named after the corresponding PDF file. The block text itself would be the comment or a label (sub-points would not be rendered, or perhaps they could hold the comment as a thread?), and the properties of that block would then include the following (based on “What Are Annotations?” | PSPDFKit):

  • type: /Annot
  • subtype: /Highlight
  • Rect: [52.9556, 728.343, 191.196, 743.218]
  • C: [1.0, 1.0, 0.0]
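As a rough sketch of what that could look like on disk (Python, purely illustrative): the function name `annotation_to_block` is made up, the property names simply mirror the list above, and I'm assuming Logseq's `key:: value` block property syntax. Array values are written space-separated here only to keep them as plain text; that is my choice, not something Logseq requires.

```python
def annotation_to_block(comment, props):
    """Render one annotation as a Logseq-style block.

    The first line is the comment/label; each property becomes an
    indented `key:: value` line under it (hypothetical layout).
    """
    lines = [f"- {comment}"]
    for key, value in props.items():
        if isinstance(value, (list, tuple)):
            # Flatten PDF arrays such as Rect or C into plain text.
            value = " ".join(str(v) for v in value)
        lines.append(f"  {key}:: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(annotation_to_block(
        "Key definition",
        {
            "type": "/Annot",
            "subtype": "/Highlight",
            "Rect": [52.9556, 728.343, 191.196, 743.218],
            "C": [1.0, 1.0, 0.0],
        },
    ))
```

Rendering the annotation layer would then just mean reading these properties back from the blocks of the page that belongs to the PDF.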

This approach is based on the tiddlywiki philosophy that “everything is a tiddler” or in this case - a block. There are so many pros with this approach, including

  • We don’t have to develop or maintain anything new, as blocks and block properties already exist. It is “merely” a question of rendering the blocks as an annotation layer. That will of course be some work, but PDF.js supports an annotation layer, and it seems far easier than treating annotations as a “special data type” or something
  • The annotation can now be treated as any other piece of information; it can be built upon or linked or filtered and the annotations are saved as accessible plain text

I will move some of my suggestions from the Discord here to keep track.

Touchpoints

Primarily examples of possible workflows that seem to integrate well with key Logseq design decisions (text-based, web-native, etc.)

  • hypothes.is: appears to be a general web-page annotation tool, but it works with PDFs (local, even) just fine. They’re built on open web standards, and have a quite-active GitHub organization with a wide variety of tools/users.

  • Polar Bookshelf: a web-native, cross-platform tool for storing/managing/annotating PDFs. Seems to have a similar ethos to Logseq or Obsidian. Haven’t used it in a while since I kept needing Zotero-style bib management (and switching apps for annotation, note-taking, and archival is what pushed me to OrgMode and then to Logseq in the first place). But it's looking quite polished lately.

  • org-noter is not web-native, but rather does a wonderful job at taking advantage of plain-text storage options (org). Used to use this a lot, along with org-noter-pdftools for pushing my notes back into the pdf file itself (to share with colleagues). But trying to merge this with org-roam and md-roam and org-roam-bibtex, etc… was becoming an ever-taller house of cards, even with Doom emacs as a base.

  • Highlights.app: this is the one that almost made me get a Mac haha. All annotations can be saved/synced to sidecar MD files, annotations can be saved to the PDF itself for sharing, etc. None of the tools above has come close for me in terms of capturing this functionality with open source on Linux.

Needs

So, this will obviously be a personal needs thing. But they come from a long-time researcher, and a lot of delving into each of these communities.

  • Sidecar and/or plaintext syncing: huge believer in version control for notes and the ability to mine/hack your own notes with scripts a la @karlicoss and HPI. This IMO necessitates human-readable plain-text as the annotation “source of truth” (barring e.g. drawings on the page).
  • Access to relative PDF locations: We probably have existing PDF management systems like Zotero, Polar, or even just Dropbox. I recommend that the “graph” link to those as relative locations, and then cache the PDF in e.g. the database as needed. Try not to replace PDF storage solutions, but integrate with existing ones. Otherwise, that way lies madness.
  • User control over annotation metadata: The most “pdf-crazy” group is likely the academia crowd (I could be biased). But this means that some annotations need to be highly identifiable/sharable (e.g. sending a cool paper with my notes to a friend, or collaborating on annotations with colleagues). Other times the annotations have to have NO metadata (e.g. an anonymous paper review). The chance of accidentally including your name in the MD sync is huge in something like org-noter, where it grabs your username by default as the author. This is not really configurable on a per-file/folder/environment basis in org-noter, which means lots of manual tweaking, ALL THE TIME.
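To make the three needs above a bit more concrete, here is a small Python sketch under my own assumptions: the `Annotation` class, the `to_sidecar_markdown` function, the sidecar layout, and the example paths are all hypothetical and not taken from any of the tools mentioned. It writes plain-text Markdown, links the PDF by a path relative to the graph folder, and leaves the author out unless explicitly asked to include it.

```python
import posixpath
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    page: int
    text: str
    author: Optional[str] = None  # may be filled in by the annotating tool


def to_sidecar_markdown(pdf_path: str, graph_root: str,
                        annotations: List[Annotation],
                        include_author: bool = False) -> str:
    """Build a plain-text sidecar for one PDF (hypothetical layout)."""
    # Refer to the PDF relative to the graph instead of copying it in.
    rel_pdf = posixpath.relpath(pdf_path, start=graph_root)
    lines = [f"# Annotations for {rel_pdf}", ""]
    for a in annotations:
        entry = f"- p. {a.page}: {a.text}"
        # Anonymous by default: the author is written only on request.
        if include_author and a.author:
            entry += f" (by {a.author})"
        lines.append(entry)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    notes = [Annotation(page=3, text="Interesting claim", author="alice")]
    print(to_sidecar_markdown("/home/me/Zotero/storage/paper.pdf",
                              "/home/me/graph", notes))
```

Making "no author" the default is the point of the sketch: an anonymous review stays anonymous unless you opt in, rather than the other way around.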

Ok, sorry for the wall of text. Hopefully there’s some useful food for thought here? Thanks for making the thread!


Really happy with the option to read and annotate in Logseq. I would like to suggest an option to copy a reference, with a link to the highlight in the PDF, together with a simple copy of the text. This way you can edit the text, for example to include a cloze deletion via the SRS plugin, while keeping a reference to the original in the PDF. It would be similar to what Zotfile does in Zotero. An example:


And just to show the possibility of using SRS:

You can do this by using the [Title of the note]( ((block reference)) ) syntax, like I did for the example, but it's a little more work.
The text copied from the PDF could come wrapped in quotation marks too, to make it clear that it is from another text.
It would be great if the highlights persisted in the original file, not only when you use Logseq.
Thank you so much for doing the plugin.


I think this is a great suggestion. I often manipulate the text that comes in from highlights or use it for making SR cards, but linking back to the original is helpful if I need to remind myself of the context or get more text from the source… Especially if I end up using it elsewhere in the graph.

I think this is exactly right. It might even be enough to copy the text without updating the original PDF with a highlight (i.e. use Ctrl+F if you need to find the exact text again).

The key problem right now is that the reference doesn’t link back to a stable citation for the PDF and a local copy, so it’s not useful for long term note taking.


I agree with this suggestion - it would significantly speed up my ability to read/take notes with PDF. Below is how I think it would be most helpful -
image

So here’s how I envision it working:

  1. You highlight the text, and the pop-up menu comes up.
  2. The option would be available in the menu to “Copy Text with Reference”
  3. When “Copy Text with Reference” is selected, it would be added to my clipboard (the highlight would also remain in the PDF).
  4. When I go into the Logseq section and press Ctrl+V, it would paste as you showed.

Then I could edit the text and click the “P. 2” part to return to the highlighted part of the PDF.
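In code terms, the clipboard content for “Copy Text with Reference” could be built roughly like this. It is only a Python sketch: `copy_text_with_reference` is a made-up name, the UUID in the example is a placeholder, and the `[label](((block-uuid)))` link form is the one from the earlier post, not something I have checked against the plugin's source.

```python
def copy_text_with_reference(text: str, page: int, block_uuid: str) -> str:
    """Return the string to put on the clipboard (hypothetical format)."""
    # PDF selections often contain hard line breaks; collapse them.
    cleaned = " ".join(text.split())
    # Quote the copied text so it is clearly marked as a quotation.
    quoted = f'"{cleaned}"'
    # The label is the clickable "P. 2" part; the target is the block
    # that holds the highlight.
    return f"{quoted} [P. {page}]((({block_uuid})))"


if __name__ == "__main__":
    print(copy_text_with_reference("  Some\nhighlighted text ", 2,
                                   "00000000-0000-0000-0000-000000000000"))
```

The quoted part can be edited freely after pasting, since the link back to the highlight sits in a separate piece of the string.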

This would be a game-changer for me, and I would finally start loading PDFs into Logseq. Hope it happens! Big thanks to the developers for putting together such amazing software!


Current PDF annotation implementation is pretty good, but I think the following improvements may make it better:

  1. Copy Reference without text
    Sometimes I only need a reference link back to the PDF, without the text (e.g. when summarizing a lengthy paragraph, or just referring to a picture).

  2. More contrast on the highlight region when clicking the reference link
    Current implementation only jumps to the page containing the highlight, but for academic papers and other densely packed PDF materials, jumping to the page is not enough. Perhaps a little border, or temporarily changing the color to create a flash effect? (e.g. In Zotero a little frame is created to indicate the source highlight, see pic below)

image

  3. Indicate highlight color in the reference link
    Currently all reference links use the same pin emoji. It would be great if the emoji could be colored according to the highlight color. If that’s hard to implement, a solid color block would also work fine (like when you mark a block as TODO).
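
For the color idea in the last point, a lookup table might be all that is needed. A Python sketch, where the color names are my assumption about what the highlighter offers and `marker_for` is a hypothetical helper:

```python
# Assumed highlight color names mapped to colored circle emoji.
COLOR_MARKERS = {
    "yellow": "🟡",
    "red": "🔴",
    "green": "🟢",
    "blue": "🔵",
    "purple": "🟣",
}

DEFAULT_MARKER = "📌"  # the pin used for every reference today


def marker_for(color: str) -> str:
    """Pick the marker for a highlight color, falling back to the pin."""
    return COLOR_MARKERS.get(color.lower(), DEFAULT_MARKER)


if __name__ == "__main__":
    for name in ("yellow", "Blue", "orange"):
        print(name, marker_for(name))
```

Unknown colors fall back to the pin, so existing references would keep looking the way they do now.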

For 1: Agree that having a dedicated point reference is helpful. I have a workaround below:
    You can Shift + click and edit the text in the sidebar.
    But if you use Shift + Enter in a block, it doesn’t show the lines below. You can use <quote if you want to have multiple lines.
    I can add a caption for a picture this way.