Correct me if I’m wrong, but it’s not enough to delete the files in the commit, unless you’re ok with Git tracking the large amount of data that was previously committed. Your git clones will be long, my friend
See this is the kind of shit that bothers me with Git and we just sort of accept it, because it’s THE STANDARD. And then we crank attach these shitty LFS solutions on the side because it don’t really work.
Give me Perforce, please.
What was Perforce’s solution to this? If you delete a file in a new revision, it still kept the old data around, right? Otherwise there’d be no way to roll back.
Yes but Perforce is a (broadly) centralised system, so you don’t end up with the whole history on your local computer. Yes, that then has some challenges (local branches etc, which Perforce mitigates with Streams) and local development (which is mitigated in other ways).
For how most teams work, I’d choose Perforce any day. Git is specialised towards very large, often part time, hyper-distributed development (AKA Linux development), but the reality is that most teams do work with a main branch in a central location.
https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/
Yes I’m aware, of course. But then you take on another set of trade-offs. It’s not like shallow cloning SOLVES your problem.
You’d have to rewrite the history so that those files were never committed in the first place, yes.
And then politely ask all your coworkers to reset their working environments to the “new” head of the branch, same as the old head but not quite.
Chaos ensues. Sirens in the distance wailing.
If this was committed to a branch, would doing a squash merge into another branch and then nuking the old one not do the trick?
Yes, that would do the trick
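For anyone who wants to check the squash-and-nuke approach before trying it on a real repo, here is a minimal local sketch. Everything in it is made up for the demo (the temp directory, the `feature` branch, the tiny `node_modules/big.js` stand-in, the demo identity). It also assumes the bad commits only ever lived on that one branch; on a hosted remote, when the unreachable objects actually get pruned is up to the server.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
repo="$work/repo"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
echo "hello" > "$repo/README"
git -C "$repo" add README
git -C "$repo" commit -q -m "initial commit"

# On a feature branch: commit a stand-in for node_modules, then delete it again.
git -C "$repo" checkout -q -b feature
mkdir "$repo/node_modules"
echo "pretend this is huge" > "$repo/node_modules/big.js"
git -C "$repo" add node_modules
git -C "$repo" commit -q -m "oops: commit node_modules"
big_blob=$(git -C "$repo" rev-parse HEAD:node_modules/big.js)
git -C "$repo" rm -q -r node_modules
echo "real change" >> "$repo/README"
git -C "$repo" add README
git -C "$repo" commit -q -m "remove node_modules, do real work"

# Squash-merge into main, then nuke the old branch.
git -C "$repo" checkout -q main
git -C "$repo" merge -q --squash feature
git -C "$repo" commit -q -m "feature (squashed)"
git -C "$repo" branch -q -D feature

# The blob is unreachable now, but stays on disk until reflogs expire and gc prunes.
git -C "$repo" reflog expire --expire=now --all
git -C "$repo" gc -q --prune=now

if git -C "$repo" cat-file -e "$big_blob" 2>/dev/null; then
  blob_state=present
else
  blob_state=pruned
fi
main_commits=$(git -C "$repo" rev-list --count main)
echo "big blob is $blob_state; main has $main_commits commits"
```

The squash commit only records the net result of the branch, so the file that was added and removed again never shows up in main's history.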
Rewrite history? Difficult.
Start a new project and nuke the old one? Finger guns.
History is written by the victors. The rest of us have to nuke the project and start over.
git clone --depth=1?
No, don’t do that. That modifies the commit hashes, so tags no longer work. git clone --filter=blob:none is where it’s at.
I don’t understand how we’re all using git and it’s not just some backend utility that we all use a sane wrapper for instead.
Every time you want to do anything with git it’s a weird series of arcane nonsense commands and then someone cuts in saying “oh yeah but that will destroy x y and z, you have to use this other arcane nonsense command that also sounds nothing like what you’re trying to do” and you sit there having no idea why either of them even kind of accomplish what you want.
There are tons of wrappers for git, but they all kinda suck. They either don’t let you do something the cli does, so you have to resort to the arcane magicks every now and then anyways. Or they just obfuscate things to the point where you have no idea what it’s doing, making it impossible to know how to fix things if (when) it fucks things up.
Git is complicated, but then again, it’s a tool with a lot of options. Could it be nicer and less abstract in its use? Sure!
However, if you compare what git does, and how it does it, to its competitors, then git is quite amazing. 5-10 years ago it was all svn, the dark times. Simpler tool and an actual headache to use.
You are not entirely wrong, but just as some advice I would refrain from displaying fear of the command line in interviews.
Lol if an employer can’t have an intelligent discussion about user friendly interface design I’m happy to not work for them.
For a lot of experienced people, command line tools are user friendly interface design.
Command line tools can be, git’s interface is not. There would not be a million memes about exiting vim if it was.
These things are not related. Git uses the system default editor, which is exactly what a cli program dropping you into an editor should use. If that’s Vim and you don’t like that, you need to configure your system or take it up with your distro maintainers.
It’s because git is a complex tool to solve complex problems. If you’re one hacker working alone, RCS will do an acceptable job. As soon as you add a second hacker, things change and RCS will quickly show its limitations. FOSS version control went through CVS and SVN before finally arriving at git, and there are good reasons we made each of those transitions. For that matter, CVS and SVN had plenty of arcane stuff to fix weird scenarios, too, and in my subjective experience, git doesn’t pile on appreciably more.
You think deleting an empty directory should be easy? CVS laughs at your effort, puny developer.
Yes it is a complex tool that can solve complex problems, but me as a typical developer, I am not doing anything complex with it, and the CLI surface area that’s exposed to me is by and large nonsense and does not meet me where I’m at or with the commands or naming I would expect.
I mean NPM is also a complex tool, but the CLI surface area of NPM is “npm install”.
Well, you’re free to try RCS if you like. It’s still out there.
Git is too hard for you. Please stop using it
So basic, well documented, easily understandable commands like git add, git commit, git push, git branch, and git checkout should have you covered.
“the CLI surface area that’s exposed to me is by and large nonsense and does not meet me where I’m at”
What an interesting way to say “git has a steep learning curve”. Which is true, git takes time to learn and even more to master. You can get there solely by reading the man pages and online docs though, which isn’t something a lot of other complex tools can say (looking at you kubernetes).
Also I don’t know if a package manager really compares in complexity to git, which is not just a version control tool, it’s also a thin interface for manipulating a directed acyclic graph.
You mean: git add -A, git commit -m "xxx", git push or git push -u origin --set-upstream, etc. etc. etc. I get that there’s probably a reason for its complexity, but it doesn’t change the fact that it doesn’t just have a steep learning curve, it’s flat out remarkably user unfriendly sometimes.
git add with no arguments outputs a message telling you to specify a path. git commit with no arguments drops you into a text editor with instructions on how to write a commit message. git push with no arguments will literally print the git push --set-upstream command you need to run if your branch has no upstream.
Again, I recognize that git has a steep learning curve, but you chose just about the worst possible examples to try and prove that point lol.
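The git push behaviour described above is easy to reproduce in a scratch directory. This is only a sketch: the bare `remote.git`, the `feature` branch and the demo identity are invented, and it assumes default push settings (no push.autoSetupRemote, push.default left alone) and English messages, which is why the script isolates itself from any user or system git config.

```shell
#!/bin/sh
set -eu
# Throwaway identity and English messages (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export LC_ALL=C

work=$(mktemp -d)
# Ignore any user/system git config so default push settings apply.
export HOME="$work" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME

# A bare repository stands in for the remote; cloning it gives us "origin".
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/clone" 2>/dev/null

# Work on a branch that has no upstream configured.
git -C "$work/clone" symbolic-ref HEAD refs/heads/feature
echo "hi" > "$work/clone/file.txt"
git -C "$work/clone" add file.txt
git -C "$work/clone" commit -q -m "first commit"

# Plain "git push" fails here, and the error text contains the command to run.
push_output=$(git -C "$work/clone" push 2>&1 || true)
echo "$push_output"
```

The exact wording of the hint varies a little between git versions, but the suggested git push --set-upstream origin feature line is the part that matters.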
Mercurial is way better.
There, I said it.
Thanks, didn’t know about that one.
What are you smoking? Shallow clones don’t modify commit hashes.
The only thing that you lose is history, but that usually isn’t a big deal.
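--filter=blob:none probably also won’t help too much here since the problem with node_modules is more about millions of individual files rather than large files (although both can be annoying).
From GitHub’s blog:
“git clone --depth=1 <url> creates a shallow clone. These clones truncate the commit history to reduce the clone size. This creates some unexpected behavior issues, limiting which Git commands are possible. These clones also put undue stress on later fetches, so they are strongly discouraged for developer use. They are helpful for some build environments where the repository will be deleted after a single build.”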
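The point about hashes can be checked without touching a real remote. A minimal sketch, assuming a made-up three-commit repo in a temp directory and a demo identity: it clones with --depth=1 and compares the HEAD hash and commit count on both sides.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main
for i in 1 2 3; do
  echo "$i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" commit -q -m "commit $i"
done

# file:// forces the normal transport; a plain local path would ignore --depth.
git clone -q --depth=1 "file://$work/origin" "$work/shallow"

full_head=$(git -C "$work/origin" rev-parse HEAD)
shallow_head=$(git -C "$work/shallow" rev-parse HEAD)
full_count=$(git -C "$work/origin" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)

echo "origin  HEAD=$full_head commits=$full_count"
echo "shallow HEAD=$shallow_head commits=$shallow_count"
```

Same hash on both sides, but the shallow clone only knows about one commit, which is where the history and ancestry questions come from.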
Maybe the hashes aren’t different, but the important part is that comparisons beyond the fetched depth don’t work: git can’t know if a shallowly cloned repo has a common ancestor with some given commit outside the range, e.g. a tag.
Blobless clones don’t have that limitation. Git will download a hash+path for each file, but it won’t download the contents, so it still takes much less space and time.
If you want to skip all file data without any limitations, you can do git clone --filter=tree:0 which doesn’t even download the metadata.
Yes, if you ask about a tag on a commit that you don’t have, git won’t know about it. You would need to download that history. You also can’t in general say “commit A doesn’t contain commit B” as you don’t know all of the parents.
You are completely right that --depth=1 will omit some data. That is sort of the point but it does have some downsides. Filters also omit some data but often the data will be fetched on demand which can be useful. (But will also cause other issues like blame taking ridiculous amounts of time.)
Neither option is wrong, they just have different tradeoffs.
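To see the other side of that tradeoff, here is a rough sketch of a blobless clone against a local scratch repo. The repo layout and demo identity are invented, and it assumes a git version recent enough to support partial clone; uploadpack.allowFilter is turned on because a plain local repository does not serve filtered clones by default. --no-checkout is used so that no blobs get fetched on demand for the working tree.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main
# Allow this repository to serve filtered (partial) clones.
git -C "$work/origin" config uploadpack.allowFilter true
for i in 1 2 3; do
  echo "$i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" commit -q -m "commit $i"
done

# Blobless clone: all commits and trees, file contents only when needed.
git clone -q --no-checkout --filter=blob:none "file://$work/origin" "$work/blobless"

commit_count=$(git -C "$work/blobless" rev-list --count HEAD)
# --missing=print lists objects that were not downloaded, prefixed with "?".
missing_blobs=$(git -C "$work/blobless" rev-list --objects --missing=print HEAD | grep -c '^?' || true)

echo "commits available: $commit_count, blobs not yet downloaded: $missing_blobs"
```

The full commit graph is there, so tag and ancestry queries work, while anything that needs old file contents (blame, for example) has to go back to the remote for them.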