I think audio recording via the `/` command could be very handy and useful, maybe like Obsidian's core Audio Recorder plugin, which saves the recorded audio as an attachment.
Yeah, and an automated speech-to-text transcript would be the icing on the cake. Especially useful on mobile, I think.
Agreed! I do a lot of audio notes and use Otter and Sonix to transcribe them. I made a few Zaps using Zapier to automatically save the audio files to my Logseq Assets folder, append the txt transcription file with metadata (tags, date, linked audio file, embedded Google map using Evernote location data), convert the txt transcription file to markdown, and save it to my Logseq Pages folder.
Very cool solution - Google’s new (I think?) Recorder app also does this.
Interesting. Would you be able to outline your workflow for the audio notes, transcription, zaps, etc?
I actually do this in the mobile version of Logseq on iOS.
iOS has a built-in speech-to-text function. macOS also has a dictation feature, which you can turn on under System Preferences → Keyboard.
Some people are more productive striking a keyboard, some prefer writing by hand, and some brilliant minds out there are at their best while conversing. If we believe that to be true, there are not a lot of tools out there to support the conversational way of thinking. In fact, conversations and dialogues are how philosophy and knowledge have been popularised since ancient times.
First off, if you’re a conversational thinker, accept it and demand better tools for recording and transcribing your thoughts, so that they become usable and their content ends up contributing to your mind graph in Logseq.
Here are some of the difficulties with voice recordings:
- They look orphaned, with little to no metadata describing them.
- I honestly don’t understand why that is the case. As if we are afraid to glean content from audio files.
- I think tools like Telegram, WhatsApp, and the like have made good progress in bringing more context and ease of use to audio files, but it’s clearly not enough.
- Imagine you want to know what the audio says, or at least get an idea of what topic it covers.
- Do you really have to start listening and going back and forth, in order to find out what the audio file says?
- Why should it be such a big deal and a hefty exercise to transcribe an audio file such that you can use the content to link it to the rest of your graph?
- Why should it be difficult to record something, cut out a piece, re-record that piece, and merge it back into the original file?
This is by no means a problem of Logseq, per se, but rather of how, as an industry, we seldom pay attention to other forms of human-computer interaction. Not everyone is comfortable with a keyboard. I don’t think it’s something nature would have wanted us to do after these mere hundreds of thousands of years of natural selection.
And I know that a combination of tools, commands, and whatnot can be used to address some of these requirements. But really, the question is: how can we distribute a solution to these problems in a democratic way, for non-tech-savvy folks?
Yes! This! A one tap record, transcribe, and add to graph would be amazing.
Audio clip as a block, then the transcription as a subblock maybe?
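A rough sketch of what that block structure might look like in a Logseq page (the property names and file path here are just guesses at how it could be modelled, not an existing feature):

```markdown
- 🎙️ ![recording-2024-05-01.m4a](../assets/recording-2024-05-01.m4a)
  recorded:: 2024-05-01 14:32
	- transcript:: Quick idea about the audio recording feature:
	  one tap to record, then the transcription lands here as a
	  child block, linkable like any other text in the graph.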
Cherry on top would be a homescreen widget that allows this in one tap from the homescreen.
I think this would be great. I love the way it’s handled in Obsidian, so something similar would be great.