Completely remove a file from the whole Git repository
While using Git for a project, I accidentally added a big .zip file to a commit. I didn’t notice until I started uploading it to GitHub. When I noticed, I hit Ctrl-C, ran git rm and git commit, and uploaded it again (now with the file no longer tracked).
I know this wasn’t the right thing to do: once the .zip was committed, it stays in the repository’s history even after a later commit removes it. I should have undone that commit instead, but sadly I didn’t.
Now, when someone tries to download from the repo, it takes a long time, sometimes fails with “the remote end hung up unexpectedly” (which I’ve read can be worked around with some git config changes), and is very annoying.
My question is: is there a way to make future pull/fetch requests skip this specific file in this specific commit?
GitHub provides a useful help page on removing files like this. There are also other questions on Stack Overflow that cover this:
- Completely remove unwanted file from Git repository history
- How can I completely remove a file from a git repository?
See also this section of the Pro Git book. An example given there:
To remove a file named passwords.txt from your entire history, you can
use the --tree-filter option to filter-branch:
$ git filter-branch --tree-filter 'rm -f passwords.txt' HEAD
Rewrite 6b9b3cf04e7c5686a9cb838c3f36a8cb6a0fc2bd (21/21)
Ref 'refs/heads/master' was rewritten
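After the cleanup, you could also try running git gc to further compress and clean up the repository. Note that the old objects are only discarded once nothing references them any more, including the backup refs that filter-branch leaves under refs/original and the reflog.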
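To see the whole sequence end to end, here is a minimal sketch that runs in a throwaway repository created under a temporary directory, so it never touches a real project. The file name big.zip, the directory name demo, and the commit messages are made up for illustration. The example rewrites all refs rather than only HEAD, and setting FILTER_BRANCH_SQUELCH_WARNING assumes a reasonably recent Git, where filter-branch otherwise prints a deprecation notice and pauses.

```shell
#!/bin/sh
# Sandbox demo: everything happens in a temporary directory.
set -eu

# Newer Git versions print a deprecation notice (with a delay) for
# filter-branch; this variable silences it.
export FILTER_BRANCH_SQUELCH_WARNING=1

workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.email "demo@example.com"
git config user.name "Demo User"

# Commit 1: a normal file plus the accidental archive
# (big.zip is a hypothetical name for this demo).
echo "hello" > README
head -c 100000 /dev/urandom > big.zip
git add README big.zip
git commit -q -m "Add README and, by accident, big.zip"

# Commit 2: git rm drops the file from the tip, but commit 1
# still references the blob.
git rm -q big.zip
git commit -q -m "Remove big.zip"

# Remember the blob id so we can check later that it is really gone.
blob=$(git rev-parse HEAD~1:big.zip)

# Rewrite every commit on every ref, deleting big.zip from each tree.
git filter-branch -f --tree-filter 'rm -f big.zip' -- --all >/dev/null 2>&1

# filter-branch keeps backups under refs/original; delete them, expire
# the reflog, then prune so the unreferenced blob is discarded.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now

# Count the commits that still touch big.zip after the rewrite.
remaining=$(git log --all --oneline -- big.zip | wc -l | tr -d ' ')
echo "commits still touching big.zip: $remaining"
```

On a real repository that has already been pushed, the rewritten branches would still need a forced push, and anyone with an existing clone would have to re-clone or reset onto the new history, since every rewritten commit gets a new hash.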
There’s now a tool called “BFG Repo-Cleaner”; it’s mentioned on github.com as an alternative to filter-branch: https://help.github.com/articles/remove-sensitive-data/
Link to the tool’s page: https://rtyley.github.io/bfg-repo-cleaner/