Git Fundamentals: Repositories, Commits, Branches

Python Full Stack Web Developer Course - Week 1: Tuesday

Understanding Git Core Concepts

Git underpins most modern software development. Whether you're building a personal project or collaborating with thousands of developers worldwide, understanding Git's core concepts is essential. Think of Git as the DNA of your codebase: it preserves the evolutionary history of your project and enables efficient collaboration.

Today, we'll dive deep into the three fundamental pillars of Git: repositories, commits, and branches. With these concepts in hand, you'll be equipped to handle most day-to-day development scenarios.

Repositories: The Project's Timeline

What is a Repository?

A Git repository (or "repo") is like a project's digital time capsule. It contains your project's files and the entire history of changes made to those files. Think of it as a special kind of filesystem with a powerful time-travel mechanism built in.

Unlike traditional file storage, a Git repository tracks not just what your files look like now, but what they looked like at every point in their history. It's similar to how a historian might preserve not just the final draft of a constitution, but every draft, annotation, and revision along the way.

The .git Directory

When you initialize a Git repository, a hidden .git directory is created. This directory is the heart of Git's functionality—it's where Git stores all the data it needs to track your project. Think of the .git directory as the "backstage area" where all the magic happens.

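The .git directory typically contains, among other things:

  - HEAD: a reference to the currently checked-out branch
  - config: repository-specific settings
  - objects/: the stored commits, directory trees, and file contents
  - refs/: the branch and tag pointers
  - hooks/: sample scripts that can run on events such as commits
  - index: the staging area (created once you stage your first file)

$ ls -la .git

Lists all contents of the .git directory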
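Because the .git directory is what marks a folder as a repository, a tool can find the repository it is in by walking up the directory tree until it sees one. Here is a minimal Python sketch of that idea. The function name find_repo_root is hypothetical, chosen for this lesson; it is not part of Git or any library.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains .git, or None."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        # .git is normally a directory; exists() also accepts the case where it is a file
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the repository root for the current directory, or None outside a repository
    print(find_repo_root(Path.cwd()))
```

Running this from templates/ inside flask_todo_app should print the path of flask_todo_app itself, since that is where git init created the .git directory.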

Types of Repositories

Local Repository

A local repository resides on your computer. It's where you make changes, create commits, and develop your project. Think of it as your personal workspace where you craft your code before sharing it with others.

$ git init

Creates a new local repository in the current directory

$ git init project_name

Creates a new directory with a Git repository inside

Remote Repository

A remote repository is hosted on a server (like GitHub, GitLab, or Bitbucket) and allows for collaboration. Think of it as a central library where team members can share their work. Remote repositories enable multiple developers to work on the same project, each with their own local copy.

$ git remote add origin https://github.com/username/repository.git

Connects your local repository to a remote repository

$ git remote -v

Lists all remote repositories connected to your local repository

Bare Repository

A bare repository doesn't contain a working directory—it only has the version control information. It's typically used as a central repository that developers push to and pull from, but not where they work directly. It's like a vault that stores the project's history but doesn't display the files themselves.

$ git init --bare

Creates a bare repository

Real-World Repository Example

Let's create a repository for a simple Python web application:

$ mkdir flask_todo_app
$ cd flask_todo_app
$ git init
$ touch app.py README.md requirements.txt
$ mkdir templates static

Now, let's add some initial content to our app.py file:

    from flask import Flask, render_template, request, redirect, url_for

    app = Flask(__name__)
    todos = []

    @app.route('/')
    def index():
        return render_template('index.html', todos=todos)

    @app.route('/add', methods=['POST'])
    def add():
        todo = request.form.get('todo')
        if todo:
            todos.append(todo)
        return redirect(url_for('index'))

    if __name__ == '__main__':
        app.run(debug=True)

And let's create a simple requirements.txt file:

flask==2.0.1

Now we have a basic repository structure set up for our Flask to-do application.

Commits: Snapshots in Time

What is a Commit?

A commit is a snapshot of your repository at a specific point in time. It's like taking a photograph of your entire project, capturing the state of all tracked files at that moment. Each commit is identified by a unique SHA-1 hash (a 40-character hexadecimal string, usually shown in abbreviated form, like 4a5c9b2...).

Commits serve as the building blocks of your project's history. They allow you to see what changed, when it changed, and who changed it. Think of commits as journal entries in your project's diary, each one telling a story about a specific set of changes.
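To make the idea of a hash-based identifier concrete, here is a small Python sketch that computes the ID Git assigns to a file's contents (a "blob" object). Git hashes a short header ("blob", the content length, and a null byte) followed by the content itself. Commit objects are identified the same way, with a "commit" header and the commit's own metadata as content. The function name git_blob_id is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 ID Git gives to file contents stored as a blob object."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
readme_id = git_blob_id(b"# Flask Todo App\n")

print(empty_id)        # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the ID of any empty file)
print(len(readme_id))  # 40
print(readme_id[:7])   # the abbreviated form you usually see in git log --oneline
```

The same content always produces the same ID, and any change to the content produces a different one. That is what lets Git use these hashes as reliable names for every snapshot in your history.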

Anatomy of a Commit

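A commit contains the following information:

  - A snapshot of the project: a pointer to the tree of tracked files as they were staged
  - Parent commit(s): none for the first commit, one for an ordinary commit, two or more for a merge
  - Author and committer: names, email addresses, and timestamps
  - Commit message: the description of the change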

The Commit Process

Staging Area (Index)

Before creating a commit, you must first stage your changes. The staging area (or index) is like a preparation area where you select which changes will be included in your next commit. This allows you to craft purposeful, logical commits rather than including all changes at once.

Think of the staging area as a photographer's composition frame—it lets you decide exactly what goes into your snapshot before you capture it.

$ git add filename.py

Stages changes in a specific file

$ git add .

Stages all changes in the current directory (and subdirectories)

$ git add -p

Interactively stage portions of files (very useful for creating focused commits)
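The relationship between your working files, the staging area, and a commit can be sketched with a toy Python model. This is a deliberately simplified illustration, not how Git is implemented; the class ToyStagingRepo and its methods are hypothetical names invented for this lesson.

```python
class ToyStagingRepo:
    """Toy model of working directory -> staging area -> commit."""

    def __init__(self):
        self.working = {}   # path -> content as currently edited on disk
        self.staged = {}    # path -> content selected for the next commit
        self.commits = []   # list of snapshots, oldest first

    def add(self, path):
        # Like `git add path`: copy the current content into the staging area
        self.staged[path] = self.working[path]

    def commit(self):
        # Like `git commit`: the snapshot is whatever is staged, not what is on disk
        snapshot = dict(self.staged)
        self.commits.append(snapshot)
        return snapshot


repo = ToyStagingRepo()
repo.working["app.py"] = "version 1"
repo.working["README.md"] = "draft notes"

repo.add("app.py")                     # stage only app.py
repo.working["app.py"] = "version 2"   # keep editing after staging

snapshot = repo.commit()
print(snapshot)  # {'app.py': 'version 1'}
```

Two things are worth noticing: README.md was never staged, so it is not in the commit, and the edit made to app.py after staging is not in the commit either. In real Git you would run git add again to include it.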

Creating a Commit

Once you've staged your changes, you can create a commit with a descriptive message. The commit message should clearly explain what changes were made and why they were necessary.

$ git commit -m "Add user authentication system"

Creates a commit with a short message

$ git commit

Opens a text editor for a more detailed commit message

For more significant changes, it's recommended to write a more detailed commit message. A good format is:

    Short summary (50 chars or less)

    More detailed explanation of what was changed and why. This can span
    multiple lines and go into detail about the motivations for the change,
    any trade-offs made, and any other relevant context.

    References to issue tracking, if applicable.
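If you want to check your own messages against this format, a few lines of Python are enough. The sketch below assumes the conventions described above (a summary of at most 50 characters, followed by a blank line before any body text). The function name check_commit_message is hypothetical, and the 50-character limit is a convention, not something Git enforces.

```python
from typing import List


def check_commit_message(message: str, max_summary: int = 50) -> List[str]:
    """Return a list of problems found; an empty list means the format looks fine."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]

    problems = []
    if len(lines[0]) > max_summary:
        problems.append(f"summary is {len(lines[0])} chars (limit {max_summary})")
    # If there is a body, it should be separated from the summary by a blank line
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems


print(check_commit_message("Add user authentication system"))  # []
print(check_commit_message("Fix bug\nforgot the blank line"))   # ['second line should be blank']
```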

Commit Best Practices

Viewing Commit History

Git provides several ways to explore your project's commit history:

$ git log

Shows the commit history with details

$ git log --oneline

Shows a compact commit history (one line per commit)

$ git log --graph --oneline --all

Shows a graphical representation of the commit history, including branches

$ git show commit_hash

Shows details about a specific commit

Practical Commit Example

Let's continue with our Flask to-do app example. Now that we have our basic files, let's stage and commit them:

$ git add app.py README.md requirements.txt
$ git commit -m "Initial commit: Basic Flask to-do app structure"

Now, let's add an index.html template:

$ touch templates/index.html

Add this content to index.html:

    <!DOCTYPE html>
    <html>
    <head>
        <title>Flask Todo App</title>
    </head>
    <body>
        <h1>Todo List</h1>
        <form action="/add" method="post">
            <input type="text" name="todo" placeholder="Enter a todo item">
            <button type="submit">Add</button>
        </form>
        <ul>
            {% for todo in todos %}
            <li>{{ todo }}</li>
            {% endfor %}
        </ul>
    </body>
    </html>

Now, let's stage and commit this new file:

$ git add templates/index.html
$ git commit -m "Add index.html template for displaying todos"

Let's check our commit history:

$ git log --oneline

This will show our two commits with their respective hashes and messages.

Branches: Parallel Development Universes

What is a Branch?

A branch in Git is simply a movable pointer to a commit. Think of branches as parallel universes where you can develop features or fix bugs without affecting the main codebase. Branches allow multiple developers to work on different features simultaneously without interfering with each other.
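The "movable pointer" idea is easier to see in a toy model. In the Python sketch below, commits are just numbers and a branch is a name that maps to one of them. This is a simplified illustration rather than Git's actual data structures; ToyRepo and its methods are hypothetical names.

```python
class ToyRepo:
    """Toy model: commits are numbered, branches are names pointing at commits."""

    def __init__(self):
        self.parents = {}                # commit id -> parent commit id (None for the first)
        self.branches = {"main": None}   # branch name -> commit id it points at
        self.head = "main"               # name of the checked-out branch
        self._next_id = 1

    def commit(self):
        new_id = self._next_id
        self._next_id += 1
        self.parents[new_id] = self.branches[self.head]
        self.branches[self.head] = new_id   # only the current branch pointer moves
        return new_id

    def branch(self, name):
        # Creating a branch copies a pointer; no files or commits are duplicated
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        self.head = name


repo = ToyRepo()
repo.commit()                   # commit 1 on main
repo.branch("feature-name")     # feature-name also points at commit 1
repo.checkout("feature-name")
repo.commit()                   # commit 2: feature-name moves, main stays put
print(repo.branches)            # {'main': 1, 'feature-name': 2}
```

This is why creating a branch in Git is so cheap: it only records a new name for an existing commit.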

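Git's built-in default branch name has historically been "master", while GitHub and many newer setups use "main". This branch typically represents the stable, production-ready version of your code. The examples in this lesson use main; if your repository starts out on master, you can rename the branch with git branch -M main after your first commit.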

The Power of Branching

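Branching is one of Git's most powerful features because it enables:

  - Parallel development: several developers can work on different features at the same time
  - Isolation: unfinished work stays off the main codebase until it is ready
  - Safe experimentation: a branch that doesn't work out can simply be deleted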

Think of branches as alternate dimensions where you can safely experiment and develop. Once you're happy with your changes, you can merge these dimensions back together.

Creating and Switching Branches

$ git branch

Lists all local branches (* indicates the current branch)

$ git branch feature-name

Creates a new branch named "feature-name" at the current commit

$ git checkout feature-name

Switches to the "feature-name" branch

$ git checkout -b feature-name

Creates a new branch and switches to it in one command

$ git branch -d feature-name

Deletes the "feature-name" branch, but only if its commits have already been merged (Git refuses otherwise)

$ git branch -D feature-name

Force deletes the "feature-name" branch (even if not merged)

Branching Strategies

Different teams use different branching strategies depending on their workflow needs. Here are some common patterns:

Feature Branching

Create a new branch for each feature or task. Once the feature is complete, merge it back into the main branch. This approach keeps features isolated and enables parallel development.

$ git checkout -b feature/user-authentication
# Make changes and commits
$ git checkout main
$ git merge feature/user-authentication

Git Flow

A more structured branching model with specific branches for features, releases, and hotfixes. Git Flow is suitable for projects with scheduled releases.

GitHub Flow

A simpler model focusing on continuous delivery. Everything in the main branch is deployable, and all work happens in feature branches that are merged via pull requests.

Trunk-Based Development

Developers work directly on the main branch or on short-lived feature branches that are merged frequently. This approach emphasizes continuous integration.

Merging Branches

Once you've completed work on a branch, you'll want to integrate those changes back into your main branch. This process is called merging.

$ git checkout main
$ git merge feature-branch

Merges changes from "feature-branch" into the current branch (main)

Fast-Forward Merge

If the main branch hasn't changed since you created your feature branch, Git can simply move the main pointer forward to match the feature branch. This is called a "fast-forward" merge.

Three-Way Merge

If the main branch has changed since you created your feature branch, Git performs a three-way merge, combining the changes from both branches and creating a new merge commit.
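Whether a merge can fast-forward comes down to one question: is the current branch's commit an ancestor of the branch being merged? The Python sketch below answers that question on a small hand-written commit graph. It is a simplified model with hypothetical names (ancestors, merge_kind, and the single-letter commit IDs), not Git's own implementation.

```python
from typing import Dict, Set, Tuple


def ancestors(commit: str, parents: Dict[str, Tuple[str, ...]]) -> Set[str]:
    """Return `commit` plus every commit reachable by following parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def merge_kind(target: str, source: str, parents: Dict[str, Tuple[str, ...]]) -> str:
    """Classify merging `source` into `target` (both are commit IDs)."""
    if source in ancestors(target, parents):
        return "already up to date"
    if target in ancestors(source, parents):
        return "fast-forward"    # target just moves forward to source
    return "three-way merge"     # histories diverged; a merge commit is needed


# A <- B <- C (feature)   and   B <- D (a later commit on main)
parents = {"A": (), "B": ("A",), "C": ("B",), "D": ("B",)}

print(merge_kind("B", "C", parents))  # fast-forward
print(merge_kind("D", "C", parents))  # three-way merge
```

In the first call main is still at B, so it can simply slide forward to C. In the second call main has moved on to D, so neither commit is an ancestor of the other and Git has to combine the two lines of work.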

Merge Conflicts

Sometimes when merging, Git encounters changes in both branches that modify the same part of a file. This results in a merge conflict that requires manual resolution.

    <<<<<<< HEAD
    def hello():
        return "Hello, World!"
    =======
    def hello():
        return "Hello, Git!"
    >>>>>>> feature-branch

In this conflict, you need to decide which version to keep or combine them.
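The markers follow a fixed layout: everything between <<<<<<< and ======= is the version on your current branch, and everything between ======= and >>>>>>> is the version from the branch being merged. The Python sketch below separates the two sides of a single conflict region. The function name split_conflict is hypothetical, and the sketch assumes exactly one conflict region in the text.

```python
def split_conflict(text: str):
    """Split one conflict region into (ours, theirs), each a list of lines."""
    ours, theirs = [], []
    section = None
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            section = ours            # lines from the current branch follow
        elif line.startswith("=======") and section is ours:
            section = theirs          # lines from the branch being merged follow
        elif line.startswith(">>>>>>> "):
            section = None            # end of the conflict region
        elif section is not None:
            section.append(line)
    return ours, theirs


conflict = '''<<<<<<< HEAD
def hello():
    return "Hello, World!"
=======
def hello():
    return "Hello, Git!"
>>>>>>> feature-branch
'''

ours, theirs = split_conflict(conflict)
print(ours[1].strip())    # return "Hello, World!"
print(theirs[1].strip())  # return "Hello, Git!"
```

Resolving the conflict means replacing the whole marked region, markers included, with the final code you want, then staging the file with git add.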

Practical Branching Example

Let's continue with our Flask to-do app and add a feature branch for implementing deletion functionality:

$ git checkout -b feature/delete-todo

Creates and switches to a new branch for our delete feature

Now, let's modify app.py to add a delete route:

    from flask import Flask, render_template, request, redirect, url_for

    app = Flask(__name__)
    todos = []

    @app.route('/')
    def index():
        return render_template('index.html', todos=todos)

    @app.route('/add', methods=['POST'])
    def add():
        todo = request.form.get('todo')
        if todo:
            todos.append(todo)
        return redirect(url_for('index'))

    # <int:index> converts the URL segment into an integer list position
    @app.route('/delete/<int:index>')
    def delete(index):
        if 0 <= index < len(todos):
            todos.pop(index)
        return redirect(url_for('index'))

    if __name__ == '__main__':
        app.run(debug=True)

And update our index.html template to add delete links:

    <!DOCTYPE html>
    <html>
    <head>
        <title>Flask Todo App</title>
    </head>
    <body>
        <h1>Todo List</h1>
        <form action="/add" method="post">
            <input type="text" name="todo" placeholder="Enter a todo item">
            <button type="submit">Add</button>
        </form>
        <ul>
            {% for todo in todos %}
            <li>
                {{ todo }}
                <a href="/delete/{{ loop.index0 }}">(Delete)</a>
            </li>
            {% endfor %}
        </ul>
    </body>
    </html>

Now, let's commit these changes on our feature branch:

$ git add app.py templates/index.html
$ git commit -m "Add delete functionality for todo items"

Let's merge this feature back into the main branch:

$ git checkout main
$ git merge feature/delete-todo

After a successful merge, we can delete the feature branch:

$ git branch -d feature/delete-todo

Advanced Topics and Best Practices

Rebasing: An Alternative to Merging

Rebasing is another way to integrate changes from one branch to another. Instead of creating a merge commit, rebasing rewrites history by applying your branch's commits on top of the target branch. This creates a linear history but should be used with caution, especially on shared branches.

$ git checkout feature-branch
$ git rebase main

Replays your feature branch commits on top of the main branch
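Why does rebasing count as "rewriting history"? A commit's ID depends on its parent, so replaying the same changes on top of a different commit produces brand-new commits. The Python sketch below shows this with a toy ID scheme. It is an illustration only: toy_commit_id hashes just a parent ID and a change description, which is not Git's real commit format, and all names here are hypothetical.

```python
import hashlib
from typing import List


def toy_commit_id(parent_id: str, change: str) -> str:
    """Toy commit ID derived from the parent ID and a description of the change."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode("utf-8")).hexdigest()


def build_chain(base_id: str, changes: List[str]) -> List[str]:
    """Apply the changes in order on top of base_id and return the new commit IDs."""
    ids = []
    parent = base_id
    for change in changes:
        parent = toy_commit_id(parent, change)
        ids.append(parent)
    return ids


root = toy_commit_id("", "initial commit")
feature_changes = ["add delete route", "add delete links"]

before = build_chain(root, feature_changes)        # feature branch as first created
new_main = toy_commit_id(root, "update README")    # main gained a commit meanwhile
after = build_chain(new_main, feature_changes)     # the same changes replayed on new main

print(before == after)  # False: same changes, but every commit has a new ID
```

Anyone who based work on the old commits would now be pointing at IDs that are no longer part of the branch, which is why rebasing shared branches calls for caution.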

Stashing: Saving Work in Progress

Sometimes you need to switch branches but aren't ready to commit your changes. Git's stash feature allows you to temporarily save your uncommitted changes and reapply them later.

$ git stash

Saves your uncommitted changes

$ git stash pop

Reapplies the most recently stashed changes

Cherry-Picking: Selecting Specific Commits

Cherry-picking allows you to select specific commits from one branch and apply them to another branch. This is useful when you want to incorporate a specific fix without merging an entire branch.

$ git cherry-pick commit_hash

Applies the changes from the specified commit to your current branch

Remote Branches

When collaborating with others, you'll work with remote branches that exist on the remote repository (like GitHub). Understanding how to interact with these is crucial for collaboration.

$ git branch -r

Lists remote branches

$ git checkout -b local-branch origin/remote-branch

Creates a local branch that tracks a remote branch

$ git push origin local-branch:remote-branch

Pushes your local branch to a remote branch

Best Practices for Repository, Commit, and Branch Management

Hands-on Exercise: Repositories, Commits, and Branches

Now, let's practice these concepts with a step-by-step exercise:

  1. Create a new repository for a simple calculator application:
    $ mkdir git_calculator
    $ cd git_calculator
    $ git init
  2. Create a basic calculator.py file:
    # calculator.py
    def add(a, b):
        return a + b

    def subtract(a, b):
        return a - b
  3. Make your initial commit (the last command renames the branch to main in case Git created it as master):
    $ git add calculator.py
    $ git commit -m "Initial commit: Basic calculator with add and subtract functions"
    $ git branch -M main
  4. Create a branch for adding multiplication functionality:
    $ git checkout -b feature/multiplication
  5. Add a multiplication function to calculator.py:
    # calculator.py
    def add(a, b):
        return a + b

    def subtract(a, b):
        return a - b

    def multiply(a, b):
        return a * b
  6. Commit your changes:
    $ git add calculator.py
    $ git commit -m "Add multiplication function"
  7. Switch back to the main branch:
    $ git checkout main
  8. Create another branch for division:
    $ git checkout -b feature/division
  9. Add a division function to calculator.py:
    # calculator.py
    def add(a, b):
        return a + b

    def subtract(a, b):
        return a - b

    def divide(a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b
  10. Commit your changes:
    $ git add calculator.py
    $ git commit -m "Add division function with zero check"
  11. Switch back to main and merge the multiplication branch (this is a fast-forward merge, because main has not changed since the branch was created):
    $ git checkout main
    $ git merge feature/multiplication
  12. Now merge the division branch (this will produce a merge conflict, because both branches added different code at the same place in calculator.py):
    $ git merge feature/division
  13. Resolve the merge conflict by editing calculator.py to remove the conflict markers and include all functions:
    # calculator.py
    def add(a, b):
        return a + b

    def subtract(a, b):
        return a - b

    def multiply(a, b):
        return a * b

    def divide(a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b
  14. Mark the conflict as resolved and complete the merge:
    $ git add calculator.py
    $ git commit -m "Merge division feature and resolve conflicts"
  15. Clean up your branches:
    $ git branch -d feature/multiplication
    $ git branch -d feature/division
  16. View your commit history with the branch graph:
    $ git log --graph --oneline --all

Real-World Applications and Case Studies

Case Study 1: Small Web Development Team

A small team of 4 developers working on a web application might use a simple feature branch workflow. Each developer creates branches for features they're working on, pushes them to GitHub, and creates pull requests when ready for review.

Their typical workflow:

  1. Pull the latest main branch: git pull origin main
  2. Create a feature branch: git checkout -b feature/user-settings
  3. Make changes and commit regularly
  4. Push the branch to GitHub: git push origin feature/user-settings
  5. Create a pull request for code review
  6. Merge the PR after approval and delete the branch

Case Study 2: Open Source Project

Open source projects often use a fork-and-pull model. Contributors fork the main repository, create branches on their fork, and then create pull requests to the original repository.

A typical contribution workflow:

  1. Fork the repository on GitHub
  2. Clone your fork: git clone https://github.com/yourusername/project.git
  3. Add the original repo as upstream: git remote add upstream https://github.com/original/project.git
  4. Create a feature branch: git checkout -b fix-login-bug
  5. Make changes and commit
  6. Push to your fork: git push origin fix-login-bug
  7. Create a pull request to the upstream repository

Case Study 3: Enterprise Software Development

Large enterprises often use a more complex Git Flow model with specific branches for different purposes. This provides structure for managing releases in a more controlled environment.

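Their typical workflow includes:

  - A main (or master) branch that holds released, production-ready code
  - A develop branch where completed features are integrated
  - Feature branches created from develop and merged back into it
  - Release branches for stabilizing an upcoming release
  - Hotfix branches created from main for urgent production fixes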

Visualizing Git Concepts

Understanding Git concepts visually can help solidify your understanding:

Git Repository Visualization

Imagine a repository as a tree with branches growing in different directions. The trunk represents your main branch, and each branch represents a different feature or bug fix. Commits are like growth rings on the tree, marking points in time.

Commit History Visualization

Picture your commit history as a timeline with points (commits) connected by lines. Each point represents a snapshot of your code at a specific time. Branches are alternate timelines that can merge back into the main timeline.

Staging Area Visualization

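Think of your working directory, staging area, and repository as three distinct zones:

  - Working directory: the files as you are currently editing them
  - Staging area: the changes you have selected with git add for the next commit
  - Repository: the committed history stored in the .git directory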

Using visualization tools like git log --graph or GUI clients like GitKraken, SourceTree, or GitHub Desktop can help you better understand these concepts.

Key Takeaways

Remember, Git is a powerful tool that becomes more valuable as your projects grow in complexity and your teams expand. The investment you make in understanding these fundamentals will pay dividends throughout your development career.

Assignment: Git Repository Practice

Create your own Git repository that demonstrates your understanding of repositories, commits, and branches:

  1. Initialize a new Git repository for a personal project (it can be a simple website, application, or tool)
  2. Create at least 5 meaningful commits that show progressive development
  3. Create at least 2 feature branches
  4. Merge your feature branches back into the main branch
  5. Document your process with screenshots or a written explanation
  6. Push your repository to GitHub and share the link

Bonus challenge: Intentionally create a merge conflict and document how you resolved it.

Additional Resources