Scientific Workflows with Zotero

Thank you for the well-done write-up! I can’t say that I am a power user of Zotero yet, so perhaps others can fill in their experiences, but here are my thoughts on some of your questions:

How to split workflow between Zotero and Logseq


Has anyone done an in-depth comparison between Zotero and Logseq PDF annotation? Are there any downsides of Logseq?

  • Zotero 6 Annotation Pros:
    • Text Search
    • Can edit highlight annotations
    • Highlight annotations have appropriate spacing between lines (there is an extra space between words at the end of a line and the first word in the next line which is missing from logseq highlight annotations).
    • Can highlight images
    • Can export annotations to Zotero’s new note format (at the cost of cloud space if you have image annotations), and then export to markdown (I have not tested how it works with image annotations)
    • Very stable (no data loss or links breaking)
    • The MarkdownDBConnect plugin in Zotero can link to Obsidian, Logseq and other software to add an icon on articles in the Zotero database. This helps differentiate between articles I have created a note for in Logseq and those I haven’t made a note for yet. It’s simple to set up, especially if you’re using citekeys as your markdown file names.
    • If I annotate the pdf file directly, the annotations show up in the sidebar similar to if I made the highlight in Zotero. However, I can’t edit the highlighted text.
  • Zotero 6 Cons:
    • As you mentioned, there is no note or article linking feature, which is what Logseq is best at.
    • Without exporting, annotations are stuck inside Zotero. However, there are hyperlinks at the end of each annotation that can open local Zotero when we need to see the context.
  • Logseq PDF Annotation Pros:
    • Highlights text and images that can be easily referenced anywhere in logseq.
    • Highlight annotations can be edited to include anything that can be rendered in logseq (mathjax, code, bold, italics, links…)
    • Zotero settings in logseq allows for importing of links to pdfs from our Zotero database.
  • Logseq PDF Annotation Cons:
    • No pdf text search
    • Image highlights don’t work with Zotero PDFs that have spaces in the file name: github issue. There is currently a small fix for that in the issue comments, but it requires renaming files with Zotfile.
    • Current Zotero plugin in logseq isn’t customizable like the one in Obsidian (no customizable template for yaml properties) which results in creating too many pages for all the authors. The search option is also very slow compared to Obsidian and shows less information (missing the authors, year of publication). Otherwise it does what it needs to do.
    • UI zoom scaling resets while editing or resizing the logseq window. When that happens the view also resets to the beginning of the file.
    • If the pdf file has highlights already, they do show up in the Logseq pdf viewer, but they don’t populate the Logseq annotation file, unlike in Zotero.
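
For the spaces-in-file-name problem in the list above, here is a rough Python sketch of what the renaming amounts to. This is not how Zotfile does it; the helper names `space_free_name` and `plan_renames` and the choice of an underscore as the replacement are my own assumptions. It only lists the renames it would make, because renaming attachments behind Zotero's back could break its links to them.

```python
from pathlib import Path


def space_free_name(filename: str, replacement: str = "_") -> str:
    """Replace each run of whitespace in a file name (hypothetical helper)."""
    return replacement.join(filename.split())


def plan_renames(folder: str) -> list[tuple[Path, Path]]:
    """Return (old, new) path pairs for PDFs whose names contain whitespace.

    Nothing is renamed here; the caller decides what to do with the plan.
    """
    pairs = []
    for pdf in sorted(Path(folder).rglob("*.pdf")):
        new_name = space_free_name(pdf.name)
        if new_name != pdf.name:
            pairs.append((pdf, pdf.with_name(new_name)))
    return pairs
```

Calling `plan_renames` on a copy of the attachment folder shows which files would be affected before anything is changed in Zotero itself.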

The most stable and consistent workflow I would think is to take all my notes on Zotero, and then export the notes and images to markdown. Unfortunately I’m more used to taking notes and summarizing as I read which leads to me annotating in logseq more. The caveat here being I need to screenshot figures and diagrams instead of linking in image annotation since that is still buggy at the moment.

If the Logseq team fixes the zoom scaling bug and the image highlight bug with Zotero PDFs, then I think the workflow where Zotero is used to capture articles and Logseq for annotation and linking would work well, especially for those who work mostly with text and less with figures and diagrams.

Note: I use Windows 10

How to transfer data from Zotero to Logseq?

Did I miss any options for Zotero integration?

Another method I’ve seen floating about is to use Obsidian’s Zotero Integration Plugin to make the markdown file in Logseq with a custom template (see the post). It’s essentially the same as the ‘loose-integration’ approach you described.

How to get outlines from Logseq into Word/Tex?

I can see the value of outlining in Logseq itself because it’ll keep a record of where I used my ideas and what new connections I can make. Still, I find I like to make the outlines in the software where I will write my full draft. Every time I make an outline in Logseq, I end up rewriting it anyway (for the reasons you pointed out).

Then again I have less experience with the output part of the workflow, so perhaps someone else could chime in?

I think many of us are setting up Zotero with Zotfile and are able to use Zotero for free.

The setup is based on the article “Zotero hacks: unlimited synced storage and its smooth use with rmarkdown” by Ilya Kashnitsky, with quite a bit of tweaking. The underlying mechanism is to use Zotero’s own sync for everything except attachment files, because it is free for that purpose, and to sync all the attachments such as PDFs using Zotfile and a third-party sync service (like Google Drive).

Setting up Zotero to play nicely with Logseq so that annotations are preserved when the Logseq graph is moved around is another headache. There is some effort in the works to improve this UX (see the Discord thread here), but no idea when it will be done.

I’m writing this just to argue that cost should not be a reason to not use “Integration using the Zotero web-API”. Your other 3 reasons are valid.

There’s a pull request to address this (feat(pdf): fix formatting of copied text), but apparently there’s some incompatibility with the old implementation, and no idea when it will be done.

Highlighting figures has been very stable for me, and I do a lot of it. Maybe there’s something wrong in your setup. My issue with PDF annotations in Logseq is that there are many moving parts that can go wrong (usually in the file name). You can get help with that from others on Discord, or tag me at @Nhan.

Thank you for your thoughts and for your in-depth comparison!

You mentioned quite a few issues with doing annotations in Logseq that I wasn’t aware of. They are not unsolvable, so let’s hope that they will be fixed soon.

I’ve accumulated a lot of annotations in Zotero (using the old notes and now the new annotations), but it feels very limited. Having the ability to add block-level tags is quite nice.

I’ll need to have a closer look at the MarkdownDBConnect plugin, this type of plugin could solve the backlink issue for the loosely coupled approach via a bib file.

You are right about Zotero storage. I think it is also possible to directly sync the storage folder with syncthing or similar, just the database itself has to be sync’ed through the Zotero server.

I had a look at how the Zotero annotations are stored in the database: The annotations are stored individually in the sqlite file. Images are stored as regular Items in the storage folder. So most likely most users will be able to stay under the free tier if they sync the storage folder manually.
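
If anyone wants to poke at this themselves, here is a minimal Python sketch for reading the annotations out of the sqlite file. The table name `itemAnnotations` and the columns `pageLabel`, `text`, `comment` and `sortIndex` are assumptions based on what I saw in my own database, so check your schema first, and work on a copy of zotero.sqlite because Zotero locks the file while it is running. The function names are made up for this example.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open a (copied) Zotero database read-only so nothing can be modified."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_annotations(conn: sqlite3.Connection) -> list[tuple]:
    """Return (pageLabel, text, comment) rows in reading order.

    Table and column names are assumptions about the Zotero schema.
    """
    query = (
        "SELECT pageLabel, text, comment "
        "FROM itemAnnotations "
        "ORDER BY sortIndex"
    )
    return conn.execute(query).fetchall()
```

Usage would be something like `list_annotations(open_read_only("zotero-copy.sqlite"))`, where the file name is just a placeholder for wherever you put the copy.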

For me it is still not an option to upload all my database to the Zotero cloud due to privacy concerns, but it might be ok for some.

Personally, I’d like to move away from Zotero for anything beyond collecting and managing items. The architecture of Zotero is too closed for my taste. Moving items around is surprisingly difficult if not impossible, for example, moving items between libraries resets the created date, which would mess up my workflow. Also, Zotero’s tagging and filtering is lacking compared to Logseq, no hierarchies etc.

Hi there, a scientist here, and a heavy user of Zotero, Zettlr, etc. Very recently a new Zotero plugin was announced, and it seems the author is keeping it well updated. It is still not well known, but it looks promising for fast outlining and linking when working with PDFs.
Thanks for the interesting discussion!

Another card carrier here.

I’d agree with the previous responses: a well thought out writeup of issues surrounding what is potentially a very useful workflow.

I’d hesitate to call myself a power-user in any of the programs under review (LibreOffice / LaTeX / zotero / logseq), despite a reasonable amount of experience in all.

For me, tight integration between zotero & logseq would be ideal. It strikes me that a useful avenue to pursue might be along the lines of zotero plugins for Libre(MS)office, which appear to reference local storage.

An equally workable solution would be for Logseq to be able to import .bib files, much like LaTeX’s bibliography. This would obviate the need to work with large bibliographies.

For me, Logseq’s ability to directly reference PDFs in notes is a game changer.

It might also be worthwhile asking what you require of each component of your workflow. I don’t require much more from Logseq than concept linkage and the export of a few dot points. I don’t require much more from Zotero than to store references for searching. Any writing that needs to be done, I’m doing in the end program (LibreOffice or LaTeX, as the case may be) so that I can leverage the strengths of each component. However, it is useful to export a series of dot points with notes and references through (eg) pandoc (such as Zettlr) to the end program.

$0.02.

As far as integration goes, I found out that Zotero is not very open and that it is quite difficult to get access to the data locally.
I looked at the office integration a while ago, and it was a very complex and limited protocol that was also completely different between MS Word (COM-based, I think) and OO. There is also this protocol:

Overall, I am torn about the Zotero integration. I see that Zotero is developing quite slowly and I feel that relying on Zotero internals might be dangerous in the long run. My library has become very large, and the Zotero citation picker has become extremely slow, a problem shared by many users.

For some reason, Zotero does not provide a local API to access the database, so there is no official way to interact with a local Zotero instance (which is needed for privacy reasons and to work offline). Zotero also plans to switch to Electron, a switch which might or might not affect any plugins Logseq would rely on.

For these reasons, I feel that the safest route is to go through .bib files (which would also open workflows with other reference managers).

An option for a tight integration could be to have a scanner that goes through the Logseq documents, finds any links to zotero, then opens Zotero and adds linked documents back to the markdown files. If the Zotero plugin goes down for whatever reason, it wouldn’t stop Logseq from working. I think this would be the best and most stable solution, short of an officially supported local API that exposes the full database (similar to Calibre’s API and the Content Server).
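
To make the scanner idea a bit more concrete, here is a minimal Python sketch of just the first step, finding the Zotero links in the markdown files of a graph. The regular expression and the function names are my own assumptions, and it does not talk to Zotero at all; it only collects the zotero:// links so that some later step could look them up.

```python
import re
from pathlib import Path

# Matches zotero:// links up to whitespace or a closing bracket/parenthesis.
ZOTERO_LINK = re.compile(r"zotero://[^\s)\]]+")


def find_zotero_links(markdown: str) -> list[str]:
    """Return every zotero:// link found in a piece of markdown text."""
    return ZOTERO_LINK.findall(markdown)


def scan_graph(graph_dir: str) -> dict[str, list[str]]:
    """Map each markdown file under graph_dir to the Zotero links it contains."""
    results = {}
    for page in sorted(Path(graph_dir).rglob("*.md")):
        links = find_zotero_links(page.read_text(encoding="utf-8"))
        if links:
            results[str(page)] = links
    return results
```

Because this only reads files, Logseq keeps working whether or not Zotero or any plugin is available, which is the property I care about.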

I agree with you that writing needs to be done in a word processor or LaTeX for the time being.

I’m new to Zotero so I don’t know much about it. Is it that you feel the development is slow or is this relative to another reference manager? Do you have an alternative in mind?

Could you give a few links to example workflows using a .bib file? I don’t know anything about this and would like to learn.

Zotero also plans to switch to Electron

They’ve talked about it for 5 years and the latest is “won’t be […] anytime soon” ha ha.

Zotero is a great program and I don’t see anything coming even remotely close, but I still have the feeling that Zotero is starting to lag behind. I am sure many problems are due to technical debt from being tied to the browser platform, which also makes it difficult to interface with third-party software. If you compare Zotero to Calibre, the latter has a much more vibrant developer community that has created a huge number of plugins.
Over the years, I have run into many limitations of Zotero, such as

  • no easy way to transfer items between libraries while maintaining all information
  • no way to support complex workflows
  • search is very slow
  • too much emphasis on cloud sync, which has privacy issues
  • citation picker is very slow
  • no supported local API
  • tag system is primitive compared to how it should be.
  • no way to automatically populate collections based on tags (search folders have no hierarchy)
  • no automatic renaming of tags.
  • Zotero notes are great, but they lack Logseq’s features for assembling the information into other documents. Can’t tag individual blocks in Zotero’s Notes, tags are per note.
  • The new note support is great, but it still doesn’t support TeX, and currently there is no good way to export notes. Writing a note is a substantial investment (many hours per article), and I don’t like my notes to end up in a format that I can’t export properly. I don’t want to rely on a plugin either that might stop working in a few years when they move to Electron.

All of these issues could be addressed with a couple lines of Python, but the lack of a local API makes this difficult and one has to rely on the unofficial debug-bridge or write a Zotero plugin.
The Zotero development is also not very open, they have a mailing list, but no public roadmap.
I don’t want to be too critical of Zotero, like I said, it is a unique program, but I am still worried about putting too much of my intellectual work into the Zotero ecosystem.

The Better BibTeX plugin can automatically write a .bib file and keep it synced. It still misses some information that would be useful (such as Zotero IDs for zotero://select links, but the author would probably be willing to add those).
Logseq could then parse this file. This has some major advantages: it still works if Zotero is down, and it doesn’t rely on the cloud, so there are no latency or privacy issues.
I wrote some more comments here.
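
As a rough idea of how little is needed to get started with the .bib route, here is a minimal Python sketch that pulls the citekeys out of an exported file. It is deliberately naive: the regular expression and the `citekeys` helper are my own assumptions, it ignores all fields, and a proper BibTeX parser would be the better choice for real use.

```python
import re

# Start of an entry such as "@article{smith2020," -> ("article", "smith2020").
ENTRY_HEAD = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")

# BibTeX blocks that are not bibliography entries.
NON_ENTRIES = {"comment", "string", "preamble"}


def citekeys(bibtex: str) -> dict[str, str]:
    """Map each citekey in a .bib file's text to its lower-cased entry type."""
    found = {}
    for entry_type, key in ENTRY_HEAD.findall(bibtex):
        if entry_type.lower() not in NON_ENTRIES:
            found[key] = entry_type.lower()
    return found
```

With the citekeys in hand, a tool could check which of them already have a page named after the citekey and create stubs for the rest.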

That’s a good example for the lack of openness. Three years ago it was supposed to happen within half a year and now it has been postponed forever without much of an explanation. I don’t care about the GUI, but if the switch eventually happens it might break add-ons. I am also not very inclined to write add-ons for this reason.

I recommend a Zotero plug-in called “Zotero IF pro max”. For content highlighted in Zotero’s own PDF reader, it supports automatic generation and export of markdown files, with or without highlight colors. The exported files can be written directly into the folder where Logseq keeps its data. It is designed for Obsidian, but it also works with Logseq.

The problem is that it’s a Chinese plugin, and that you have to pay for it. I’m not sure if it’s available in English. If you are willing to try it with translation software, I’m sure it would be very helpful. (See “Zotero IF Pro Max: Read Before First Use”.)

I just noticed GitHub - sawhney17/logseq-citation-manager — has anyone tried it?

It works great! My one issue is that Logseq doesn’t work with relative links (see Comprehensive Zotero Plugin - #42 by Luhmann), but that is a Logseq bug.
It might be related to my having the Zotero storage folder in a different location.

zotero-better-notes is great on this.
I take all my notes in zotero with zotero-better-notes, and then export markdown and sync them under Logseq folder.
Each note has a link to the reference pdf in zotero.
You can open the pdf from the note in Logseq with one click.
It works great.

geo_fan also mentions zotero plugin zotero-better-notes above, Scientific Workflows with Zotero - #8 by geo_fan

I’ve tried zotero-better-notes, but in the markdown file exported to Logseq, all of my annotations end up in one block. It’s kind of annoying, I have to say :joy:

@yangjincai what export settings do you use from zotero-better-notes? (see snapshot below).
I took a screenshot of an arbitrary selection, but I feel like whatever combination I try, the links aren’t working within Logseq. But this is an amazing project, and I hope I can get it working.

Hi, @Flaunster , I use this export setting.

And if you want the [[bi-directional links]] to work, you need to remove the random tag (there to avoid conflicts) from the export file names.
Zotero → Edit → Note Template Editor → ExportMDFileName:

related discussion in zotero-better-notes issue125.

Thank you @yangjincai !!! You just saved me untold hours trying to figure that out.

Also, the Zotero-better-notes plug-in sync is unidirectional, so accidentally overwriting notes seems likely (especially over time, when you forget about the sync and revisit a paper).

It’s frustrating because this solution is SO close to working if it could just sync both ways. Do any developers out there have a sense of how much work this would require to develop bi-directional sync? Like would it be an arm and leg to hire a freelancer or just a leg? :wink:

It’s quite frustrating, I must say, to see that the open source community is now, in 2022, finally able to start replicating the functionalities of a (relatively simple) commercial tool that had figured all of this workflow out about 10 years ago: Citavi. PDF highlights, annotation with page numbers that can be (1) tagged (2) categorized in multiple hierarchical trees, a plug-in that connects to word and shows this hierarchy in the text editor, automatic formatting of references…It’s all been there perfectly working and integrated (without the need to export, re-format, etc). Except of course it’s closed-source, becoming expensive, built on an outdated Microsoft database structure and is basically unable to move out of there (the web app is a joke).

If only Zotero and Logseq developers could take a good hard look at Citavi and say “we can do that, and much better”, we would save a huge amount of time. Anyway, in the meantime, we re-invent the wheel with the tools we have… For the time being, Citavi is such an efficient workflow that I’m willing to install a virtual machine and run Windows 11 just to be able to use that program. But I can’t wait until the day that I can switch to a good Zotero/Logseq integration…
