My case against git commit bloat: Anyone else want a longer time period for Git autosave? (It's currently limited to 600 s / 10 minutes)

Amazing app here! Idea below.

Issues with current git limitations:

Size of Git Repo

Doesn’t a git repo that commits every little change every few seconds, or even every few minutes, eventually become quite large for an active Logseq user, since all those little changes are logged as well?

Do I really need to track all changes?

Do I really need to track every little change to a page? I’m not sure how useful it is to see every little change on a page. Hopefully my cloud storage is already handling the backup, so all I’d really like to do is track substantial edits, or preserve blocks of text that I delete while I am in an editing phase.

Better git usage: Developers only save git changes manually “on commit”

It seems the way devs use git is better as they only commit changes when they know there is a reason to commit, like some substantial piece of new work is done.

Solution idea

idea1: Autosave each day

I originally started thinking about this when I tried to raise the Logseq upper bound so that it would only save once a day (every 86400 seconds). This way the git repo size would be decreased and users would get a much more useful and neat list of commits showing how much they do each day. For a page you update frequently, the page change history would then show only the days you made changes, instead of a massive stream of little micro changes.

idea2: Manual commit button

I think the above method might be best, but in addition to longer save intervals there could also be a manual commit button. It would let someone commit changes to git in the middle of a day when lots of editing and deletions have occurred (instead of the more common activity of adding content), so that these earlier versions are logged as they are being worked on.

note: Possible current hacks

I guess it’s currently possible to turn the autosave feature off and on manually each night, and maybe this would help, but I’ll never remember to do it, and I think the above two ideas make some sense.

I also realize I could probably create my own git repo and program it to commit at a daily interval but this is also an annoying amount of extra maintenance.

Maybe I am just looking for a plugin that already allows this! If so, I apologize.

Thanks for all the hard work and consideration on such a wonderful app!

Logseq git interval limit

I would think the limit is there because the git commit is done on an internal timer in Logseq; making the interval too long would result in people closing Logseq before the git commit happens. But I’m speculating here without looking at the code.

Usability and end users

The main issue here is that you need to strike a balance between the end user and git, and as a professional git user I would say: yes, you do want all these little commits.

Because git keeps track of changes, having frequent commits also means you can say things like: I want to see everything that’s changed in the last 2 minutes, hour, day, week, etc.
And you can see those changes on a per-file basis, but also repository-wide.
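
As a rough illustration of that kind of query, here is a small Python sketch that builds `git log` commands for "what changed since X", either for the whole repository or for a single file. The helper names (`changes_since_cmd`, `run_git`) and the example page path are made up for this post; `--since`, `--stat` and the `--` path separator are standard `git log` options.

```python
import shlex
import subprocess

def changes_since_cmd(period, path=None):
    """Build a `git log` command listing commits newer than `period`.

    `period` is any date string git understands, e.g. "2 minutes ago".
    """
    cmd = ["git", "log", f"--since={period}", "--stat"]
    if path is not None:
        cmd += ["--", path]  # limit the history to a single file
    return cmd

def run_git(cmd, repo_dir):
    """Run a command inside `repo_dir` and return its text output."""
    result = subprocess.run(cmd, cwd=repo_dir, check=True,
                            capture_output=True, text=True)
    return result.stdout

# Repository-wide: everything committed in the last hour.
print(shlex.join(changes_since_cmd("1 hour ago")))
# Per file: one page's history over the last week (hypothetical path).
print(shlex.join(changes_since_cmd("1 week ago", "pages/ideas.md")))
```

To actually run one of these against a graph, something like `run_git(changes_since_cmd("1 day ago"), "/path/to/graph")` should do, where the directory is whatever folder holds your git repo.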

How developers really work

What most developers do (including me) is commit all the time, but once our change is done and we are happy with it we do something called a squash merge, meaning that all our changes get combined into a single commit that gets sent to the main branch. This keeps our history readable, but still gives me the ability to look at the changes I made in the last couple of hours.

Technical limits

Git effectively stores only the changes (history is delta-compressed when it gets packed). If you were the fastest typist in the world (998 characters per minute) and typed for the full 24 hours, you would get around 1.5 MB of data. A 512 GB SD card would store all your changes if you typed 24 hours a day, 365 days a year, from birth till death. And yes, git can handle that number of commits.
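
The back-of-the-envelope numbers can be checked with a few lines of Python. This is only a sketch under simplifying assumptions that are not in the post itself: one byte per character, decimal gigabytes, and no allowance for commit metadata or compression.

```python
CHARS_PER_MINUTE = 998        # typing speed quoted above
BYTES_PER_CHAR = 1            # assumption: plain ASCII text
CARD_BYTES = 512 * 10**9      # 512 GB, counted in decimal gigabytes

# Raw text produced by typing non-stop for 24 hours.
bytes_per_day = CHARS_PER_MINUTE * 60 * 24 * BYTES_PER_CHAR

# How long it would take to fill the card at that rate.
days_to_fill = CARD_BYTES / bytes_per_day
years_to_fill = days_to_fill / 365

print(f"{bytes_per_day / 10**6:.2f} MB per day")
print(f"{years_to_fill:.0f} years to fill the card")
```

That works out to roughly 1.44 MB per day and on the order of 976 years of non-stop typing, so the "birth till death" claim holds with a lot of room to spare, even if real commits carry some extra overhead.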

Other solutions

The further away something is, the less we care about the details. So in the last hour you might want changes from every 10 minutes, but after a week, you care about the changes made on X day. So in theory you could do a git rebase and combine daily commits into a single commit.

In theory, because it’s a pain to do: git was made not to forget anything. As someone who had to scrub passwords from old commits, I still feel the pain. To be fair, that’s mostly a problem when you use it with multiple people.

Daily squash merge

So to me the best solution would be a daily squash merge: have a main branch with daily commits and a running branch with all the commits, and do maintenance once a day.

  1. Make a new branch from the last 24h mark
  2. Squash merge that to the main branch
  3. Merge the changes after the 24h mark to a new workbranch
  4. Switch labels around
  5. Git will eventually clean up the now-orphaned commits.
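
One possible reading of those five steps, written as a Python sketch that only builds the git commands. It is an interpretation, not a tested recipe: the branch names (`main`, `work`, `daily-cut`), the function names and the placeholder hash are all hypothetical, and it assumes `main` is only ever updated by this routine and that the working tree is clean.

```python
import shlex

def cutoff_cmd(work="work", age="24 hours ago"):
    """Command that prints the newest commit on `work` older than `age`."""
    return ["git", "rev-list", "-1", f"--before={age}", work]

def daily_squash_plan(cutoff, main="main", work="work", tmp="daily-cut",
                      message="Daily snapshot"):
    """Commands for one maintenance run; `cutoff` is the hash from cutoff_cmd."""
    return [
        ["git", "branch", tmp, cutoff],                # 1. branch at the 24h mark
        ["git", "checkout", main],
        ["git", "merge", "--squash", tmp],             # 2. stage everything up to the mark
        ["git", "commit", "-m", message],              #    and record it as one commit
        ["git", "rebase", "--onto", main, tmp, work],  # 3+4. replay newer commits on the new main
        ["git", "branch", "-D", tmp],                  # drop the helper branch
        # 5. nothing to run: git's garbage collection removes the
        #    unreachable commits some time later
    ]

print(shlex.join(cutoff_cmd()))
for cmd in daily_squash_plan("abc1234"):  # "abc1234" is a placeholder hash
    print(shlex.join(cmd))
```

Here the rebase covers both "merge the changes after the 24h mark to a new work branch" and "switch labels around", since it moves the `work` label onto the rewritten commits. Note that `git commit` exits with an error when nothing was staged, so a real script would have to handle days without older changes.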

That’s a tricky action for little gain though.

Manual squash merge

Pick a time you’re happy, and hit a button.

  1. Squash merge your working branch to main branch
  2. Remove working branch and make new working branch from main

Simpler, more control and like how developers do it.

Tried creating a quick test, but it’s early and I couldn’t debug it in 5 minutes. But I could probably whip up a script for that.
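
For what it’s worth, a script for the manual variant might look roughly like the Python sketch below. It is untested against a real graph and rests on assumptions: the branch names `main` and `work`, the function names and the commit message are invented for illustration, and the working tree is expected to be clean before it runs.

```python
import subprocess

def manual_squash_plan(main="main", work="work", message="Squashed edits"):
    """Commands for the two manual steps, in order."""
    return [
        ["git", "checkout", main],
        ["git", "merge", "--squash", work],  # 1. stage all of work's changes on main
        ["git", "commit", "-m", message],    #    merge --squash does not commit by itself
        ["git", "branch", "-D", work],       # 2. remove the old working branch
        ["git", "checkout", "-b", work],     #    and start a fresh one from main
    ]

def run_plan(plan, repo_dir):
    """Run each command inside `repo_dir`, stopping at the first failure."""
    for cmd in plan:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `run_plan(manual_squash_plan(message="Finished editing"), "/path/to/graph")` would then be the "button". The forced delete (`-D`) is deliberate: after a squash merge git does not consider the working branch merged, so a plain `-d` would refuse to remove it.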

Final comments

Okay, that was a lot of text. Hope it didn’t sound negative; it was early when I read this and I’m a bit passionate when it comes to git. I do think it’s an amazing topic.

Being mostly a git noob, and not interested enough to completely figure it out myself: how would one squash merge a working branch to main branch?

Pretty simple: if you’re on the main branch, you run the command below, followed by the name of your working branch.

"git merge --squash "

That should stage all the changes that happened after the state of the main branch; a "git commit" afterwards then records them as a single commit.

I don’t use it very often myself, though, because for me it usually happens during pull requests in GitHub/GitLab/Azure DevOps.

Man! This is very good @Bas_Grolleman … I especially like you zooming out to calculate total size if someone were to type like mad each day for the rest of his life and detailing squash merging. Certainly very helpful.

Working on a PR for more git configurability…

I have pull-before and push-after working already:

I will add squash if I can get it working.

and a separate timer for:
pull --rebase + commit
and
squash and push

this will only work if you want to deal with multiple branches.

on a single branch one can:
git reset --soft origin
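
A sketch of how that single-branch variant could be scripted, again in Python and only building the commands. Assumptions that go beyond the post: the upstream is spelled out as `origin/main` (a bare `origin` only resolves when the remote’s HEAD reference is set locally), nothing new has landed on the remote in the meantime, and the function name and commit message are hypothetical.

```python
import shlex

def single_branch_squash_plan(upstream="origin/main",
                              message="Squashed local edits"):
    """Collapse all unpushed commits into one and push it."""
    return [
        ["git", "reset", "--soft", upstream],  # move the branch back, keep all changes staged
        ["git", "commit", "-m", message],      # record them as a single commit
        ["git", "push"],                       # fast-forward the remote branch
    ]

for cmd in single_branch_squash_plan():
    print(shlex.join(cmd))
```

Because the reset only rewinds commits that were never pushed, the final push stays a normal fast-forward and needs no force flag.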


Also
My PR for this feature is here:

be aware if you want to try the PR it is best to add a .gitignore with:

logseq/.recycle
logseq/bak

So…
the Logseq team decided against the PR for auto pull and push options.

Maybe it’s true that not many are asking for it… but if you like the idea, you can give the PR a thumbs up.

Considering that I’m currently writing a video about being selective on features for applications this hits close to home. Is there any way to move the idea to a plugin?

Check out the git plug-in by Hayden. I think it should be under the scope of that plug-in.

For sure possible, but I have concerns.

If git support will be fully ripped from the core, and plugins are the only way, then … well … then they’ll be the only way.

Buuuut, if there is partial git support in the core and plugins to enhance it, feels weird.

Annnd, I am concerned about the number of iframes needed to run all these sandboxed mini plugins. I have no hard data as grounds for these concerns… yet.