Reduce repository size
This post is strongly inspired by the GitLab documentation (Reduce repository size).
Motivations
– it slows repository fetching and cloning, which decreases the productivity of development in general:
developers, continuous integration and so on…
– it takes a large amount of server storage.
– Git repository storage limits can be reached.
The cause
The problem is generally the history, which can grow very large over time.
So the fix is to remove unwanted history to make the repository smaller.
Solution: git filter-repo
Nowadays, the recommended way is git filter-repo.
It is open source and the repository is: https://github.com/newren/git-filter-repo/blob/main/README.md
Beware: it is not an official tool provided by Git; it was created by a user, and the
installation is not always standardized for now.
How to install git filter-repo
Prerequisites:
– git >= 2.22.0 at a minimum; some features require git >= 2.24.0
– python3 >= 3.5
– If your operating system supports it, you can install it directly with your current package
manager: apt, yum…
– If that is not the case, it does not matter, because it is not very difficult: the tool
is ultimately a single Python file.
To install it, we just have to clone the repository and add the cloned directory
(bin/cloned_directory here) to the PATH of our OS.
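As a rough sketch of the manual install, the helper below adds a directory to PATH only when it is not already listed. The function name add_to_path and the clone location shown in the comments are assumptions made for illustration, not something the tool prescribes.

```shell
# Add a directory to PATH for the current session, unless it is already there.
# add_to_path is a hypothetical helper name used only in this sketch.
add_to_path() {
  dir="$1"
  case ":$PATH:" in
    *":$dir:"*) ;;               # already present, nothing to do
    *) PATH="$dir:$PATH" ;;      # prepend so this copy is found first
  esac
  export PATH
}

# Possible usage (the clone location is an assumption):
#   git clone https://github.com/newren/git-filter-repo.git "$HOME/tools/git-filter-repo"
#   add_to_path "$HOME/tools/git-filter-repo"
#   git filter-repo --version
```

To make the change permanent, the same call can be placed in your shell startup file.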
How to use git filter-repo
1) Generate an export from the project and download it.
This project export contains all refs, so we can use it to purge files from the repository.
2) The export contains a project.bundle file, which was created by git bundle.
3) Clone a copy of the repository from the bundle using the --bare and --mirror options:
git clone --bare --mirror /path/to/project.bundle
4) Go to the project.git directory:
cd project.git
5) You can check the size of the repository with a command such as:
du -sh
With git filter-repo or git-sizer, analyze your repository and review the results
to determine which items you want to purge:
# Using git filter-repo (in a bare clone the analysis is written under filter-repo/,
# in a non-bare clone under .git/filter-repo/)
git filter-repo --analyze
head filter-repo/analysis/*-{all,deleted}-sizes.txt
# Using git-sizer
git-sizer
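If neither tool is at hand, a similar overview can be approximated with stock git commands. The sketch below is not what git filter-repo or git-sizer do internally; it simply lists the largest blobs reachable from any ref. The function name largest_blobs is an assumption made for this example.

```shell
# List the N largest blobs reachable from any ref, largest first.
# Output format: "<size in bytes><TAB><path>".
# largest_blobs is a hypothetical helper name used only in this sketch.
largest_blobs() {
  count="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { size = $2; sub(/^[^ ]+ [^ ]+ /, ""); print size "\t" $0 }' |
    sort -rn |
    head -n "$count"
}

# Possible usage, from inside project.git:
#   largest_blobs 20
```

Sizes are uncompressed object sizes, so they indicate which paths are worth purging rather than the exact space that will be recovered.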
6) Purge any unwanted files from the repository history.
Because we are trying to remove internal refs, we rely on the commit-map produced by each run to
tell us which internal refs to remove.
BEWARE: git filter-repo creates a new commit-map file on every run.
Examples:
– Remove history located in the foo-path directory:
git filter-repo --path foo-path --invert-paths
– Remove all history except the history located in the foo-path directory:
git filter-repo --path foo-path
Before pushing your modifications, you can check that you have really reduced the size of the
repository with the command issued previously: du -sh
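To put a number on the before/after comparison, something like the following could be used. It is only a sketch: the helper names dir_size_kb and reduction_percent are assumptions, and the percentage is truncated to a whole number.

```shell
# Size of a directory in kilobytes, as reported by du.
# dir_size_kb and reduction_percent are hypothetical helper names.
dir_size_kb() {
  du -sk "$1" | awk '{ print $1 }'
}

# Whole-number percentage saved between two sizes expressed in the same unit.
reduction_percent() {
  before="$1"
  after="$2"
  echo $(( (before - after) * 100 / before ))
}

# Possible usage, from inside project.git:
#   before=$(dir_size_kb .)
#   ... run git filter-repo here ...
#   after=$(dir_size_kb .)
#   echo "saved $(reduction_percent "$before" "$after")%"
```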
7) Push the modifications to the remote repository to update all branches.
But before that, we have to set the remote URL of the repository, because cloning from a
bundle file sets the origin remote to the local bundle file:
git remote remove origin
git remote add origin https://gitlab.example.com/<namespace>/<project_name>.git
git push origin --force 'refs/heads/*'
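Since a force push is hard to undo, it can be worth checking that origin no longer points at the local bundle before pushing. The check below is a minimal sketch based only on the shape of the URL; the function name is_remote_url and the list of accepted prefixes are assumptions.

```shell
# Succeed when the argument looks like a network remote rather than a local path.
# is_remote_url is a hypothetical helper; the accepted prefixes are an assumption.
is_remote_url() {
  case "$1" in
    http://*|https://*|ssh://*|git@*:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage, from inside project.git:
#   url=$(git remote get-url origin)
#   is_remote_url "$url" || echo "origin still points at a local path: $url" >&2
```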
8) Optionally, to remove large files from tagged releases, force push your changes to all tags on
GitLab:
git push origin --force 'refs/tags/*'
9) Wait at least 30 minutes, because the repository cleanup process only processes objects
older than 30 minutes, then run repository cleanup from the administration GUI.
To clean up a repository:
– Go to the project for the repository.
– Go to Settings > Repository.
– Upload a list of objects. For example, a commit-map file created by git filter-repo, which is
located in the filter-repo directory.
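Before uploading, a quick sanity check of the commit-map can confirm that the last run actually rewrote something. The sketch below assumes the file consists of one header line followed by one "old-hash new-hash" pair per line; that layout and the function name count_rewritten_commits are assumptions, so check them against your own file.

```shell
# Count commits whose hash changed, given the path to a commit-map file.
# Assumed layout: one header line, then "<old-hash> <new-hash>" on each line.
# count_rewritten_commits is a hypothetical helper name.
count_rewritten_commits() {
  awk 'NR > 1 && NF == 2 && $1 != $2 { n++ } END { print n + 0 }' "$1"
}

# Possible usage, from inside project.git:
#   count_rewritten_commits filter-repo/commit-map
```

A result of 0 suggests that the filter did not match anything and that there is nothing to clean up.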