This document lists commands for removing specific elements from a repository's history. If you want to remove elements that match certain properties (minimum size, name pattern, etc.), you can use BFG Repo-Cleaner instead (less powerful, but much faster). Related link: https://rtyley.github.io/bfg-repo-cleaner/
List the SHA identifier and path of every object in the repo:
$ git rev-list --objects --all | sort -k 2 > allfileshas.txt
Get a list of the files ordered by the size (from biggest to smallest):
$ git gc && git verify-pack -v .git/objects/pack/pack-*.idx | grep -E "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
The file generated in the previous step identifies each file only by its SHA value. Now add the file name/path to each entry:
$ for SHA in $(cut -f 1 -d ' ' < bigobjects.txt); do
    echo $(grep "$SHA" bigobjects.txt) $(grep "$SHA" allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
done
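The three steps above can also be collapsed into a single pipeline. The following is only a sketch of an alternative technique, not the method used above: it relies on `git cat-file --batch-check` instead of `git verify-pack`, and the function name `list_blobs_by_size` is made up for illustration. Because the path is taken from the `%(rest)` field, paths that contain spaces should survive intact.

```shell
# Sketch: list every blob reachable from any ref, largest first,
# one per line as "<sha> <size-in-bytes> <path>".
# list_blobs_by_size is a hypothetical helper name; run it inside the repo.
list_blobs_by_size() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -k 2,2 -n -r
}
```

For example, `list_blobs_by_size | head -n 20 > bigtosmall.txt` would keep only the twenty largest entries.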
Remove the file from the history of every branch and tag:
$ git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch PATH-TO-FILE-TO-BE-REMOVED' --prune-empty --tag-name-filter cat -- --all
Where 'PATH-TO-FILE-TO-BE-REMOVED' is the path to the file or folder you want to remove.
Note: To remove a folder, add '-r' after 'git rm':
=> '... git rm -r --cached ...'
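If you run this command often, it can be wrapped in a small function. This is a minimal sketch under some assumptions: the function name `purge_from_history` and the variable `PURGE_PATH` are invented here, `-r` is always passed so that the same call works for files and folders, and `FILTER_BRANCH_SQUELCH_WARNING=1` is set only to skip the warning pause that recent Git versions print before filter-branch starts.

```shell
# Sketch: remove one file or folder from the history of all branches and tags.
# purge_from_history and PURGE_PATH are hypothetical names.
# $1: path of the file or folder to remove, relative to the repo root.
purge_from_history() {
    # The path is handed to the index filter through the environment,
    # so it does not have to be spliced into the quoted filter string.
    PURGE_PATH=$1 FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter 'git rm -r --cached --ignore-unmatch -- "$PURGE_PATH"' \
        --prune-empty --tag-name-filter cat -- --all
}
```

Usage would look like `purge_from_history PATH-TO-FILE-TO-BE-REMOVED`, run from the top level of a clean working tree.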
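Delete the backup references that filter-branch leaves under refs/original, then expire the reflog and prune the objects that are no longer reachable:
$ git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin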
$ git reflog expire --expire=now --all
$ git gc --prune=now
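At this point it is worth checking that the path is really gone before pushing anything. A minimal sketch, assuming the backup refs under refs/original have already been deleted (otherwise `--all` would still find the old commits); the function name `commits_touching` is made up for illustration.

```shell
# Sketch: print the commits, reachable from any ref, that still touch a path.
# commits_touching is a hypothetical helper name.
# No output means the path no longer appears in the reachable history.
commits_touching() {
    git log --all --format='%h %s' -- "$1"
}
```

Running `commits_touching PATH-TO-FILE-TO-BE-REMOVED` should print nothing once the cleanup has worked.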
Force-push the rewritten branches and tags to the remote:
$ git push origin --force --all
$ git push origin --force --tags
Expire the reflog again and repack the local repo aggressively to reclaim the space:
$ git reflog expire --expire=now --all
$ git gc --aggressive --prune=now
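To see how much space the cleanup actually saved, you can compare the size of the packfiles before and after. A small sketch, reading the `size-pack` value (reported in KiB) from `git count-objects -v`; the function name `pack_size_kib` is invented for this example.

```shell
# Sketch: print the total size of the repo's packfiles in KiB,
# taken from the "size-pack:" line of `git count-objects -v`.
# pack_size_kib is a hypothetical helper name.
pack_size_kib() {
    git count-objects -v | awk '$1 == "size-pack:" { print $2 }'
}
```

Call it once before the filter-branch step and once after the final `git gc`, and compare the two numbers.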
IMPORTANT: If you need to upload branches to other servers and they are not present in the current (pruned) clone, DO NOT PULL changes from the remote. Instead, check out only the branches you need and that's it.
Other people with a clone of the repository must not pull the changes (this could be catastrophic), but there is a safe way to synchronize their repos. For those with extra commits:
$ cd MY_LOCAL_GIT_REPO
$ git fetch origin
$ git rebase
$ git reflog expire --expire=now --all
$ git gc --aggressive --prune=now
For those with no extra data (Warning: this option erases all data that has not been pushed):
$ cd MY_LOCAL_GIT_REPO
$ git fetch origin
# WARNING: can destroy unpublished data!
$ git reset --hard origin/master
$ git reflog expire --expire=now --all
$ git gc --aggressive --prune=now
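To decide which of the two procedures applies to a given clone, you can list the commits that exist only locally. This is a sketch under one important assumption: it must be run BEFORE `git fetch origin`, while the remote-tracking branches still point at the old history; after the fetch every old commit would show up as local-only. The function name `unpushed_commits` is made up, and uncommitted changes in the working tree are not covered by this check.

```shell
# Sketch: list commits that are on local branches but on no remote-tracking branch.
# unpushed_commits is a hypothetical helper name.
# Run before `git fetch origin`; no output means there are no extra commits.
unpushed_commits() {
    git log --branches --not --remotes --format='%h %s'
}
```

If it prints nothing, the `git reset --hard` procedure loses no commits; otherwise use the rebase procedure.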
https://help.github.com/articles/removing-sensitive-data-from-a-repository/
http://naleid.com/blog/2012/01/17/finding-and-purging-big-files-from-git-history
http://blog.ostermiller.org/git-remove-from-history
https://git-scm.com/docs/git-filter-branch