Why Logseq AI and how to preserve privacy?

tienson · May 4, 2023, 5:29am

Hello loggers! I’d like to share a few thoughts on the ongoing AI feature and privacy.

To provide some context, several users have expressed apprehension regarding the in-progress feat/ai-lab branch. We didn’t anticipate that people would be so closely monitoring the source code and caring about a project that has yet to be released. However, we do appreciate the privacy concerns (which are crucial to us) and welcome any feedback or criticism to enhance both the product and our communication.

Relevant links:
- Reddit - Dive into anything
- Post: "Given the points above and the fact I backed #Log…" - Qoto Mastodon

What is the purpose of the AI experiment?

The aim of Logseq is to establish a better environment for both learning and collaboration, enabling us to form a network that connects our ideas and enhances the collective knowledge of humanity.
AI can query and learn from the vast amount of information in such a network, which helps us learn anything.

The plan

Features:
- Comprehensive semantic search throughout your entire graph, including assets like PDFs
- Text generation such as summarization and translation
- Image generation
- Audio transcription
- Chat
AI solutions:
- Local AI
  - A plugin named LogMind, designed to offer local AI features, currently undergoing internal testing. Our engineer Junyi has been working diligently to ensure its performance and flexibility, allowing it to serve as a foundation for other plugins.
    - Vector Database
    - A runtime for any custom model
    - Transformer embedding
  - LangChain HTTP Proxy (TODO)
    - A local HTTP bridge for a LangChain server, making it easy to develop new AI solutions in any language (not just Python and JavaScript).
- Cloud services
  - The initial integration will be with OpenAI
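
To make the vector database and embedding items above more concrete, here is a minimal sketch of how embedding-based search over blocks could work. It is an illustration only, not LogMind's actual code: `embed`, `cosine`, and `VectorIndex` are hypothetical names, the in-memory list stands in for a real vector database, and the bag-of-words `embed` is a toy placeholder for a transformer embedding model (so it matches shared words, not meaning).

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a transformer embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorIndex:
    """Hypothetical in-memory index; a real setup would use a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, block_id: str, text: str) -> None:
        """Embed a block's text and store it under its id."""
        self._items.append((block_id, embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        """Return up to k block ids, most similar first, skipping zero scores."""
        q = embed(query)
        scored = [(cosine(q, vec), block_id) for block_id, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [block_id for score, block_id in scored[:k] if score > 0]
```

Swapping `embed` for a real local embedding model would keep the rest of the flow the same: embed every block once, embed the query, and rank by similarity, all without data leaving the device.
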
For plugin developers:
- We’re working on APIs to make it easier for you guys to hook in. We’re also working on local AI models that will not transfer any data over the internet nor give us access to people’s data in any way.

I’ll use this post to answer some questions. Feel free to leave a comment if you have any additional questions on the AI integration.

How do we stay privacy-first?
- Users don’t have to do anything: AI features are disabled by default.
- Each function can be individually enabled or disabled. For instance, you can choose to disable chat while still utilizing text generation.
- As fans of local-first technology, we anticipate the arrival of fully-featured local AI, and we’re putting more effort into local AI instead of cloud services such as the OpenAI integration.
- Regarding local AI:
  - All data, including chat conversations, will be stored exclusively on your local device. If you use Logseq Sync, your information will be encrypted before being transmitted to our server.
- Concerning OpenAI integration:
  - For those who prefer not to use this feature, no additional action is required, as it is disabled by default. If you choose to use it, the general guideline is to avoid sending sensitive information to OpenAI. Logseq will not transmit your entire graph to OpenAI; instead, you can select the data (specific blocks or pages) to be sent and review it before sharing it with OpenAI.
- We welcome any suggestions or ideas to further safeguard our privacy!
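
The opt-in behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not Logseq's implementation: `AISettings`, `build_payload`, the feature names, and the `approve` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AISettings:
    """Hypothetical settings object: every AI feature starts disabled."""
    enabled: dict = field(default_factory=lambda: {
        "chat": False,
        "text_generation": False,
        "semantic_search": False,
    })


def build_payload(
    settings: AISettings,
    feature: str,
    selected_blocks: list[str],
    approve: Callable[[str], bool],
) -> Optional[str]:
    """Return the text to send to a cloud service, or None if the user declines.

    Only the blocks the user selected are included, never the whole graph,
    and nothing is returned until the user has reviewed the preview.
    """
    if not settings.enabled.get(feature, False):
        raise PermissionError(f"AI feature '{feature}' is disabled")
    preview = "\n".join(selected_blocks)
    if not approve(preview):
        return None
    return preview
```

For example, enabling only `text_generation` leaves `chat` disabled, and passing an `approve` callback that shows the preview in a dialog gives the user the final say before anything is sent.
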
Why incorporate it into the core product instead of a plugin?
- A seamless user experience is crucial when integrating AI with other features like whiteboards, flashcards, and queries.
Will there be a charge for it?
- There will be no charge for OpenAI integration using your API token.
- We might offer a service in the future for users who lack an OpenAI API token but still wish to utilize it.
Will it be open-sourced?
- Yes, the OpenAI integration can be found in the branch feat/ai-lab, and the local AI plugin will be released following the testing phase.
How do you plan to enhance communication with the community?
- We will establish a monthly Q&A thread on this forum, followed by a Discord session where we address the most urgent and popular questions of the month.

omeganot · May 4, 2023, 8:53am

Maybe this isn’t the right place for this comment, but since features are mentioned, I’ll make a suggestion: AI curation of tags and text changes to automatically tag and link blocks and pages and make my knowledge graph more traversable. I don’t tell my brain how to connect information for me; it just does it. Having AI help me make connections that I might not be aware of would be a wonderful feature.

RichardJActon · May 4, 2023, 8:59am

If I might suggest an exemplar of how to handle AI integration as a privacy-focused open-source project, I would look to what Nextcloud are doing in the space: Nextcloud Ethical AI Rating - Nextcloud

Another initiative to be aware of is RAIL (Responsible AI Licenses): https://www.licenses.ai/

Federico_Frosini · May 4, 2023, 11:24am

@tienson
Thanks for the explanation. It is so cool to have a local AI able to help us with our notes.

When will the AI features be ready?

John_Doe · May 4, 2023, 7:56pm

Says the developer of an open-source project with 22.2k stars on GitHub.
Sure, we’re watching you.

The future sounds great, thank you.

It would be great to see some milestones for it.
Before enabling it, please make sure the application’s performance won’t get worse. As of now, the mobile applications are very slow.

Zac_Hansen · May 5, 2023, 10:34am

I want to just put it out there and say I’m all for AI integration. Not worried about privacy at all personally. In fact, I would not mind at all having a way in the future to have a Logseq graph that works for our company.

The only thing I care about is not so much cloud applications secured on something like AWS accessing data (because I trust AWS with security) but random cloud providers that could cause breaches.

I keep all my crypto codes on there so that would be my only concern.

Another idea I liked from Roam is locked pages: if someone accessed your graph, you would still have the ability to password protect pages.

I’m not technical at all and don’t code…but this would be a great use case for me.

Overall: would love more AI integrations to include things like auto-summarization and even auto-tagging and linking, like what Mem is trying to do.

John_Doe · May 5, 2023, 7:58pm

Zac_Hansen: "I keep all my crypto codes on there so that would be my only concern."

Zac_Hansen: "you would have the ability to password protect pages"

Sorry for possibly going off-topic, but I want to share this.
Applications should stick to their main purpose.
Logseq is a note-taking tool, so it should be about notes.
Storing passwords and secrets is another story.

For storing passwords, it is better to use a dedicated program: a password manager (KeePassXC, for example).

byron · May 13, 2023, 4:11pm

I’ll add my two cents: I have zero interest in any sort of feature that involves OpenAI, or really anything that gets called “”“AI”“” instead of being described in terms of the actual feature and the value it provides. So the amount of time devoted to it makes Logseq less appealing to me, and I wish such things would not be added.

Piotr · May 15, 2023, 10:33am

If you could implement this idea, it would fit perfectly into Logseq’s right sidebar: Compounding Knowledge Graphs with LLMs - YouTube

e_Spirals · May 15, 2023, 2:57pm

How will this work with PDFs that are not embedded and are stored in something like Zotero or cloud services?

Flan · August 25, 2023, 2:47am

I want to throw my vote in support of this, as an opt-in feature. Different users have different priorities. For me, I’m so disorganized that I’ve decided it’s worth risking some of my privacy (I understand and respect this may be inconsistent with other users’ needs and values).

vijiboy · August 25, 2023, 1:31pm

As far as our privacy-safeguarding objective goes, I vote fully for local AI, but I suggest avoiding monolithic closed-source AI services like OpenAI until we as a society are sure they will do more good than harm.
For now, we are not sure, and we should err on the side of safety rather than experiment.

fusionhammer · March 15, 2024, 3:45pm

I would like to see local LLM support in this feature, like using the API Ollama exposes. This would check the privacy checkbox.

Is this feature still being developed? I haven’t seen any movement on the ai-lab feature since May '23.

Does the monthly Q&A thread exist on the forum? I’ve been unable to find it.

mentaloid · March 15, 2024, 3:58pm

present thread - opening post - last paragraph

You gave the answer yourself. In short, it is low priority.

ama · October 21, 2024, 5:41am

DuckDuckGo provides anonymous access to four AI models, including both closed- and open-source ones. Integrating those into Logseq could be a great solution for those of us concerned about privacy, IMO.