Git: Reduce repository size

This post is strongly inspired by the GitLab documentation (Reduce repository size).

Motivations

– A large repository slows down fetching and cloning, which reduces productivity across development in general: developers, continuous integration, and so on.
– It takes up a large amount of server storage.
– Git repository storage limits can be reached.

The cause

The problem is generally the history, which can grow very large over time.
So the fix is to remove unwanted history to make the repository smaller.

Solution: git filter-repo

Nowadays, the recommended way is git filter-repo.
It is open source, and its repository is: https://github.com/newren/git-filter-repo/blob/main/README.md
Beware: it is not an official tool provided by Git. It was created by a user, and the installation is not always standard for now.

How to install git filter-repo

Prerequisites:
git >= 2.22.0 at a minimum; some features require git >= 2.24.0 or later
python3 >= 3.5
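
To check these prerequisites quickly, here is a minimal sketch. The helper names version_ge and check_min are made up for this post, and the comparison assumes a sort command that supports -V (version sort), as GNU coreutils does.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a detected version meets a minimum.
# $1 = label, $2 = detected version (may be empty), $3 = minimum version
check_min() {
  if [ -n "$2" ] && version_ge "$2" "$3"; then
    echo "$1 $2: OK (>= $3)"
  else
    echo "$1 ${2:-not found}: need >= $3"
  fi
}

# "git version 2.39.2" -> 2.39.2, "Python 3.11.4" -> 3.11.4
git_version=$(git --version 2>/dev/null | awk '{ print $3 }')
python_version=$(python3 --version 2>/dev/null | awk '{ print $2 }')

check_min git "$git_version" 2.22.0
check_min python3 "$python_version" 3.5
```

If one of the tools is missing, the corresponding line reports "not found" instead of a version.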

– If your operating system supports it, you can install it directly with your package manager: apt, yum…
– If not, it doesn't matter, because it is not very difficult: in the end, this tool is a single Python file.
To install it that way, we just have to clone the repository and add the cloned directory to the PATH of our OS.
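
The manual install can be sketched as follows. The function names and the $HOME/tools location are only examples chosen for this post; the sketch assumes the git-filter-repo script sits at the root of the cloned repository, so that putting this directory on the PATH lets git find it as the filter-repo subcommand.

```shell
# Clone git-filter-repo into a directory of our choice, unless it is already there.
# $1 = target directory (hypothetical location, pick any)
install_filter_repo() {
  [ -d "$1/.git" ] || git clone https://github.com/newren/git-filter-repo.git "$1"
}

# Prepend a directory to PATH for the current shell, skipping duplicates.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                        # already present, nothing to do
    *) PATH="$1:$PATH"; export PATH ;;
  esac
}

# Usage (needs network access):
#   install_filter_repo "$HOME/tools/git-filter-repo"
#   add_to_path "$HOME/tools/git-filter-repo"
#   git filter-repo --version
```

The PATH change only lasts for the current shell session; to make it permanent, the same export would go in the shell profile.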

How to use git filter-repo

1) Generate an export from the project and download it.
This project export contains all refs, so we can use it to purge files from the repository.

2) The export contains a project.bundle file, which was created by git bundle.

3) Clone a copy of the repository from the bundle using the --bare and --mirror options:
git clone --bare --mirror /path/to/project.bundle

4) Go to the project.git directory:
cd project.git

5) You can check the size of the repository with a command such as: du -sh
Then, with git filter-repo or git-sizer, analyze the repository and review the results to determine which items you want to purge:

# Using git filter-repo
git filter-repo --analyze
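# In a bare clone such as project.git, the analysis is written to filter-repo/ (there is no .git/ directory)
head filter-repo/analysis/*-{all,deleted}-sizes.txt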
 
# Using git-sizer
git-sizer

6) Purge any unwanted files from the repository history.
Because we are trying to remove internal refs, we rely on the commit-map produced by each run to tell us which internal refs to remove.
BEWARE: git filter-repo creates a new commit-map file every run, so keep the file from each run.
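
Since we need the commit-map of every run, one way to avoid losing any of them is to copy the file aside after each run. This is only a sketch: save_commit_map and the backup directory are hypothetical names, and it assumes the file is at filter-repo/commit-map inside the bare clone.

```shell
# Copy the commit-map of the latest run into a backup directory,
# numbering the copies so that earlier ones are never overwritten.
# $1 = repository directory (the bare clone)
# $2 = backup directory (hypothetical, pick any)
save_commit_map() {
  scm_src="$1/filter-repo/commit-map"
  if [ ! -f "$scm_src" ]; then
    echo "no commit-map found in $1" >&2
    return 1
  fi
  mkdir -p "$2"
  # Count the copies already saved to pick the next number.
  scm_count=$(find "$2" -type f -name 'commit-map-*' | wc -l)
  cp "$scm_src" "$2/commit-map-$((scm_count + 1))"
}

# Usage, after each git filter-repo run inside project.git:
#   save_commit_map . ../commit-maps
```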

Examples:
– Keep only the history located in the foo-path directory (everything else is removed): git filter-repo --path foo-path
– Remove the history located in the foo-path directory and keep everything else: git filter-repo --path foo-path --invert-paths

Before pushing your modifications, you can check whether you have really reduced the size of the repository with the command issued previously: du -sh
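
To put a number on the gain, the before/after comparison can be sketched like this. size_kb and percent_saved are helper names invented for this post; du -sk is used instead of du -sh because it prints a plain number of kilobytes that is easy to compare.

```shell
# Print the on-disk size of a directory in kilobytes.
size_kb() {
  du -sk "$1" | awk '{ print $1 }'
}

# Print how much smaller "after" is than "before", as a percentage.
# $1 = size before, $2 = size after (same unit)
percent_saved() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before <= 0) { print "n/a"; exit }
    printf "%.1f%%\n", (before - after) * 100 / before
  }'
}

# Usage, from inside project.git:
#   before=$(size_kb .)
#   ... run git filter-repo here ...
#   after=$(size_kb .)
#   percent_saved "$before" "$after"
```

For instance, going from 200 KB to 50 KB is reported as 75.0%.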

7) Push the modifications to the remote repository to update all branches.
Before that, we have to set the remote URL of the repository, because cloning from a bundle file sets the origin remote to the local bundle file:

git remote remove origin
git remote add origin https://gitlab.example.com/<namespace>/<project_name>.git
git push origin --force 'refs/heads/*'
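
To see why the remote has to be reset, here is a throwaway demonstration that needs neither GitLab nor the network. It is only a sketch: the small repository and the bundle are created on the fly in a temporary directory and stand in for the real project export, and nothing is pushed.

```shell
set -eu
work=$(mktemp -d)

# Build a tiny repository and bundle it, standing in for the GitLab export.
git init -q "$work/src"
echo hello > "$work/src/README"
git -C "$work/src" add README
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial commit"
git -C "$work/src" bundle create "$work/project.bundle" --all

# Clone from the bundle, as in step 3.
git clone -q --bare --mirror "$work/project.bundle" "$work/project.git"
origin_before=$(git -C "$work/project.git" remote get-url origin)
echo "origin after cloning: $origin_before"

# Point origin at the real server instead (placeholder URL from this post).
git -C "$work/project.git" remote remove origin
git -C "$work/project.git" remote add origin \
  "https://gitlab.example.com/<namespace>/<project_name>.git"
origin_after=$(git -C "$work/project.git" remote get-url origin)
echo "origin after reset:   $origin_after"

rm -rf "$work"
```

The first line printed ends with project.bundle, which confirms that a push would otherwise target the local bundle file. git remote remove may print a note about branches outside refs/remotes/ that it did not delete; in a mirror clone that is what we want.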

8) Optionally, to remove large files from tagged releases, force push your changes to all tags on GitLab:
git push origin --force 'refs/tags/*'

9) Wait at least 30 minutes, because the repository cleanup process only processes objects older than 30 minutes, then run the repository cleanup from the administration GUI.
To clean up a repository:
– Go to the project for the repository.
– Go to Settings > Repository.
– Upload a list of objects. For example, a commit-map file created by git filter-repo, which is located in the filter-repo directory.
