The minimalist's guide to cloning git repositories

📅️ Published: May 10, 2025  • 🕣 9 min read

Late last year, I launched osscooking.com to provide real-time analysis of open-source projects on certain opinionated metrics. One of the first optimization techniques while doing on-demand git analysis is to have a copy of the repository locally. Waiting time with plain git clones can gradually increase as the repository size grows ultimately affecting the response time. The topic of cloning is simple, yet complex, so much so that companies like TikTok & Microsoft had to build their own tools to improve the git mono-repo experience with cloning being one of the key areas of focus.

For instance, the obsproject repo is around 136MB in size at the time of publishing this post and takes around ~7 seconds to get it on my machine with a stable Internet speed of ~210Mbps.

In this post, I will try to explore 10 different ways to clone a git repository with the goal of analyzing git metadata. For the sake of simplicity, I will be using the vanilla git CLI with no additional tools. There’s a clear winner according to my acceptance criteria, however, it’s important to understand that the best approach depends on the use case. For folks, who can’t wait, please jump to the conclusion. For the rest, let’s dive in.

Acceptance Criteria

  1. Full commit history
  2. Only fetch the main branch
  3. Include Git tags
  4. Exclude file contents (blobs)

Setup

  • Git version: 2.47.1

Single Branch Clone

git clone --single-branch
# Size: ~136M

Requirements met?

  • Full commit history
  • Only main branch
  • Access to git tags
  • Exclude file contents

Now, I used the --single-branch flag under the assumption that pulling fewer branches would surely reduce the size of the cloned repository. However, the size of the repository is somehow a bit large than the plain clone?

The reason this happens is ordinary (plain) clones usually just download the server’s preexisting packfile(s) (which tend to be delta-compressed against all branches/tags). But once you ask only for one branch, the server frequently has to repack on the fly for just that branch’s objects—which can yield a slightly bigger pack overall despite fetching fewer commits, simply because it loses the delta relationships with other branches or with the full set of objects.

Bare Clone

A bare clone or as I would like to call it is a naked git repository.

git clone --bare https://github.com/obsproject/obs-studio.git
# Size: 84MB

Git documentation defines a bare repository as:

A bare repository is normally an appropriately named directory with a .git suffix that does not have a locally checked-out copy of any of the files under revision control. That is, all of the Git administrative and control files that would normally be present in the hidden .git sub-directory are directly present in the repository.git directory instead, and no other files are present and checked out. Usually, publishers of public repositories make bare repositories available.

Requirements met?

  • Full commit history
  • Only main branch
  • Access to git tags
  • Exclude file contents

Shallow Clone (Limited Depth Clone)

A shallow clone is a way to clone a Git repository with limited commit history. Instead of pulling the entire project’s lifetime of commits, you only grab the most recent ones, usually just the latest one.

git clone https://github.com/obsproject/obs-studio.git --depth=1
# Size: 66 MB

Similarly, --depth=4 will fetch the last 4 commits.

Create a shallow clone with a history truncated to the specified number of commits. Implies –single-branch unless --no-single-branch is given to fetch the histories near the tips of all branches.

Requirements met?

  • Full commit history (No access to git history beyond the depth specified.)
  • Only main branch
  • Access to git tags
  • Exclude file contents

Shallow clones can be an ideal approach to clone a repo for ephemeral purposes, like CI/CD pipelines, or to quickly build the latest code images.

Bare Blobless Clone (aka Partial Clone)

What is a blob?

For a git refresher, a blob is a git object that stores the content of a file. Blobs are stored in the .git/objects directory. If you were to clone the source of this blog, and run the following command, you would see the content of the file in the blob represented by the hash, 0015c5c42b1eeb1590fdb1a487a560b47cbd2975.

git cat-file -p 0015c5c42b1eeb1590fdb1a487a560b47cbd2975
---
layout: page
title: Work with Bhupesh
description: Interested to work with me professionally? I can assist you as a Software Developer, Technical Writer, or Dev Community Builder. Find out more about my work domains and how I can help you.
image: hire.png
permalink: /hire/
---

![hire-bhupesh](/images/hire.png)
.....

Anywho, As mentioned earlier, the goal is to clone the repository without the blobs.

git clone --bare --filter=blob:none https://github.com/obsproject/obs-studio.git
# Size: 15MB

Requirements met?

  • Full commit history
  • Only main branch
  • Access to git tags
  • Exclude file contents

Bare Treeless Clone

What is a tree?

A tree is a git object that represents the directory structure of the repository. Trees are simply directories.

git clone --bare --filter=tree:0 https://github.com/obsproject/obs-studio.git
# Size: 5.4 MB

Requirements met?

  • Full commit history (Fetching git log with this approach increases the final size of the repository, i.e. giving access to the full commit history)
  • Only main branch
  • Access to git tags
  • Exclude file contents

Combining Approaches

1. Bare + Blobless + Shallow

git clone --bare --filter=blob:none --depth=1 https://github.com/obsproject/obs-studio.git
# Size: 272K

Requirements met?

  • Full commit history
  • Only main branch
  • Access to git tags
  • Exclude file contents

2. Bare + Blobless + Single branch

git clone --bare --filter=blob:none --single-branch https://github.com/obsproject/obs-studio.git
# Size: 15MB

Requirements met?

  • Full commit history
  • Only main branch
  • Access to git tags
  • Exclude file contents

3. Bare + Blobless + Single branch + Shallow

git clone --bare --filter=blob:none --single-branch --depth=1 https://github.com/obsproject/obs-studio.git
# Size: 272K

Requirements met?

  • Full commit history (only latest commit)
  • Only main branch
  • Access to git tags
  • Exclude file contents

4. Bare + Treeless + Single branch

git clone --bare --filter=tree:0 --single-branch https://github.com/obsproject/obs-studio.git
# Size: 6.6MB

Requirements met?

  • Full commit history (Fetching the git log with this approach increases the final size of the repository)
  • Only main branch
  • Access to git tags
  • Exclude file contents

5. Bare + Treeless + Single branch + Shallow

git clone --bare --filter=tree:0 --single-branch --depth=1 https://github.com/obsproject/obs-studio.git
# Size: 104K

Requirements met?

  • Full commit history (only latest commit)
  • Only main branch
  • Access to git tags
  • Exclude file contents

What are the big players doing?

I spent some time poking around TikTok’s sparo & Microsoft’s scalar (both of which are wrappers over the Git CLI btw), here’s what I found:

sparo

sparo uses the following combinations:

  • blobless clone is a blobless + sparse + single branch clone.
  • treeless clone is a treeless + sparse + single branch clone.

At the time of publishing this post, the git clone service is available in GitCloneService.ts.

scalar

scalar which is now part of Microsoft’s fork of git, uses a similar approach to sparo. The scalar documentation mentions that the fork is needed to support the following features that stand out:

  1. Use Git’s partial clone feature to only download the files you need for your current checkout.
  2. Use Git’s sparse-checkout feature to minimize the number of files required in your working directory.

Conclusion

Here’s each approach, ranked by the size of the cloned repository:

Approach Size Full Commit History Access to Tags Only Main Branch Excludes File Contents ✅ Meets All Criteria Use Case
Single Branch Clone 135MB Minimal default clone for builds where blobs are needed.
Bare Clone 84MB Used for mirrors or server-side setups without working trees.
Shallow Clone (--depth=1) 66MB Fast and ephemeral; ideal for CI and latest-commit builds.
Bare Blobless Clone 15MB Ideal for analyzing commit and tag metadata, not for file contents.
Bare Treeless Clone 5.4MB Extremely lightweight; needs fetch for log/history.
Blobless + Shallow 272KB Lowest size; just HEAD + shallow metadata.
Blobless + Single Branch 15MB ✅ Best Option Best for git metadata analysis; matches all acceptance criteria.
Blobless + Single Branch + Shallow 272KB Variant of best-case option but with limited history.
Treeless + Single Branch 6.6MB Useful for resolving refs/tags; no log access until fetch.
Treeless + Single Branch + Shallow 104KB Ultra-light for pointer-level analysis or HEAD-only introspection.

The source code for running a similar cloning test can be found here. Here’s a sample run:

Using branch: master
Cloning into bare repository 'clones/blobless-shallow'...
Cloning into bare repository 'clones/treeless-single-branch'...
Cloning into bare repository 'clones/bare-treeless'...
Cloning into bare repository 'clones/bare-blobless-single-branch-shallow'...
Cloning into bare repository 'clones/bare-treeless-single-branch-shallow'...
Cloning into 'clones/shallow'...
Updating files: 100% (5014/5014), done.
Cloning into bare repository 'clones/blobless-single-branch'...
Cloning into bare repository 'clones/bare-blobless'...
Cloning into 'clones/single-branch'...
Cloning into bare repository 'clones/bare'...
Cloning into 'clones/plain'...

Results:

104K    clones/bare-treeless-single-branch-shallow
272K    clones/bare-blobless-single-branch-shallow
272K    clones/blobless-shallow
5.9M    clones/bare-treeless
5.9M    clones/treeless-single-branch
 14M    clones/bare-blobless
 14M    clones/blobless-single-branch
 66M    clones/shallow
 84M    clones/bare
134M    clones/plain
135M    clones/single-branch

According to our acceptance criteria, the winner is the Bare + Treeless approach, I ended up rolling it out on osscooking.com. The migration was beneficial as it reduced the repo lookup response time by seconds on the current production deployment.

blobless


treeless

Resources