Transcript

Commit Objects

From the class:  Inside Git

Every time we write a tree to the object database, what we're doing is taking a snapshot of the index, and then storing it as a new object in the database. So we can have as many trees as we want. And some of these trees might have overlapping entries. They might point to the same objects in the database. And it might just have, say, one extra entry in it or a new entry for a change.

But they're snapshots of our index. And the index represents what's in our working directory up here. So when I say git write-tree, we get back this SHA1 hash. And if I say git write-tree again, without making any changes at all, it's just going to give me back the same SHA1 hash. Because nothing has changed. And so what's on disk in the object database is-- it's already there. So we just get back the SHA1 hash essentially of the last written tree.
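The reason the same hash comes back can be sketched in a few lines of Python. This is a minimal illustration, not Git's own code: the helper name git_object_id is made up for this example, and it assumes the usual object framing of a type name, a space, the content length, a null byte, and then the content. Identical content always produces the same id.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Hypothetical helper: SHA1 over '<type> <size>' + NUL byte + content."""
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()

# An empty tree has no entries, so its content is zero bytes.
first = git_object_id("tree", b"")
second = git_object_id("tree", b"")

print(first)
print(first == second)  # same content, same id
```

Writing the same tree twice therefore cannot create a second object; the id is a function of the content alone.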

But what about commits? Commits are a way for developers to give a little bit of a comment about what has changed in this snapshot of their working directory, and they're also a log of who made the changes, and when they made them. So the commit is a new type of object that we haven't seen yet. It's a new type of object in the Git database. And let's create one by hand.

Before we create a commit object, you're going to want to make sure that your username and email are configured in Git, if you haven't had that set up already. You can check by saying git config user.name, and git config user.email. And if those aren't set up, you can go ahead and set them by saying git config --global user.name, and then just put whatever name you want, and the same for email.

We can use a low-level plumbing command to create commits. And when you create a commit, what you're really doing is creating an object that points to a tree. To see what the last tree was, we can just say git write-tree again, and get back this hash identifier. And the next thing I'm going to do is use the git commit-tree command. And I need to provide it the identifier for the tree. So we'll say 1B23. That should be enough.

And what we can provide on standard input is some commit message. Normally you do this through the git commit command. But we're going to do it using just the command line. So I'll just say this is my first commit, or the initial commit.

And when I press Enter, I get back the SHA1 hash of a new object. This is actually the SHA1 hash of the commit. And these are the SHA1 hashes that you're most used to working with when we use Git in our projects. So this is the SHA1 hash of the commit object.

Let's take a look at what the commit object actually has in it by using the git show command. So git show cc74. And it gives us-- let me scroll back a little bit-- quite a bit of information. But this is starting to look like what a regular commit message would look like. So here's the SHA1 hash, the author of the commit, the date it was made. And here's the message that I stored with the commit itself. And then it shows me a nice diff of what was actually done in this commit.

Let's look at a slightly lower-level view by using the git cat-file command, and I'll give it the first four characters of the commit. And this shows us a little bit more closely what's actually stored in the object itself in the database. The key thing is that it points to a tree. So when we say points to, it means that it stores the SHA1 hash of some tree that is also an object in our database. And then the commit object adds a little bit more information to this tree, namely the author and committer, and the time that the commit happened, along with the commit message.

We can find the specific commit object in our database just like we have with any of the others, by searching through the .git/objects folder. And we'll just filter it again by file, and just look through this list. These are all the objects in our database. And you've seen now a couple different types. We've seen blobs, trees, and now we've added commit. So search for cc74, and here it is. So this is actually a commit object that's stored in the Git database, alongside all the other different types of objects.
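As a rough sketch of where to look, the Python below maps a full object id to the file that holds it when the object is stored loose (not packed). The function name loose_object_path is hypothetical, and the id used is a placeholder that merely starts with cc74; it is not the real commit from this session.

```python
from pathlib import PurePosixPath

def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Hypothetical helper: where a loose object with this full id would live."""
    if len(object_id) != 40:
        raise ValueError("expected a full 40-character SHA1 hex id")
    # The first two hex characters name the directory, the remaining 38 the file.
    return PurePosixPath(git_dir) / "objects" / object_id[:2] / object_id[2:]

# Placeholder id for illustration only.
example_id = "cc74" + "0" * 36
print(loose_object_path(example_id))
```

This is why searching the file list for cc74 means looking in the cc directory for a file whose name starts with 74.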

To make it even more clear, in the bottom window I've opened up this file in Ruby again. I inflated it, or unzipped it, if you will. And this is the actual contents of that file after it's been unzipped. And it's just a different type of object in the database. And this says what type of object it is. It's 187 bytes. And here's our null byte followed by the actual contents of the file.
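The session does this step in Ruby; a comparable sketch in Python, using the standard zlib module, might look like the following. The helper parse_loose_object is a made-up name, and the sample bytes are a stand-in built in memory instead of a real file from .git/objects, so the size printed will not be the 187 bytes seen on screen.

```python
import zlib

def parse_loose_object(compressed: bytes):
    """Hypothetical helper: inflate a loose object and split header from content."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\0")   # the null byte ends the header
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, int(size), content

# Stand-in for the bytes read from a file under .git/objects (simplified commit).
sample_content = b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n\nexample message\n"
sample_file = zlib.compress(b"commit %d\0" % len(sample_content) + sample_content)

obj_type, size, content = parse_loose_object(sample_file)
print(obj_type, size)
print(content.decode())
```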

So Git is storing this commit object with this content. It's a specific format for the content that includes the tree, and then a newline, the author, and then a newline, the committer, a newline, and so on. All the way until we get to the actual message of the commit, which is the last piece of content. So this is the content for this type of object in the Git database.
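To make that layout concrete, here is a small Python sketch that assembles commit content in the order just described and then hashes it the way the object database would. The helper name build_commit_content is hypothetical, and the identity, timestamp, and tree id are invented values for illustration (the tree id is that of an empty tree), so the printed hash will not match cc74.

```python
import hashlib

def build_commit_content(tree_id: str, author: str, committer: str, message: str) -> bytes:
    """Hypothetical helper: lay out the fields in the order described above."""
    text = (
        f"tree {tree_id}\n"
        f"author {author}\n"
        f"committer {committer}\n"
        "\n"  # a blank line separates the headers from the message
        f"{message}\n"
    )
    return text.encode("utf-8")

# All values below are made up for illustration.
identity = "A Developer <dev@example.com> 1700000000 +0000"
content = build_commit_content(
    "4b825dc642cb6eb9a060e54bf8d69288fbee4904",  # id of the empty tree
    identity,
    identity,
    "This is my first commit",
)
stored = b"commit %d\0" % len(content) + content
print(content.decode())
print(hashlib.sha1(stored).hexdigest())  # the id this commit object would get
```

Because the author, committer, and timestamps are part of the hashed content, two people committing the same tree with the same message still end up with different commit ids.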

And all the different types we've seen so far have different formats for their content. So we've seen blob, which is just any kind of content. Then we saw tree, which is a specific kind of content that stores entries to other blobs and to other trees. And now we've seen a new one called commit.

Now that we have at least one commit, we can provide it to the log command, git log cc74, and we have a little bit of a git log here. This is showing the commit that we just created. And this looks kind of like what you would see when you say git log. We just have only one commit at this point.

Let's create one more commit. In order to do that, I'm going to make a change to my working directory, get that into the index and write a new tree. Let's just change the file two.txt. I'll make this version two, and write that out to two.txt. And we should now see that we have a difference between our local directory, our project directory, and the index. And I can go ahead and update the index with git update-index, and provide it the path to the file. And then we'll write that index to a tree by saying git write-tree.

Now that we have a new tree object written, we can create a commit. But this time I'm going to do something a little special. I want this commit to have a parent commit. I want it to come after the previous commit that I made. So I can give it a parent. So what I can do is say echo. And I'll give it a commit message. This is my second commit with a parent. And we'll pipe that into git commit-tree. And we provide e6211 as the tree to commit, and then give it the -p option and provide the commit that we want to be the parent for this commit.

So the last commit had a hash that started with cc74. If I scroll back a little bit, we should see that, yep, cc74. And if I press Enter here, I do get a new commit with a new SHA1 hash of aa5c.

And if I take a look at the raw contents of that Git object, notice it has a new property in it called parent. And this parent points to the previous commit that we created prior to this one. So it has a tree, just like the other one. It has an author and a committer and a commit message. But this time it points up to a parent. So we can form this hierarchy of commits. And if I look at the git log starting at this commit, so git log aa5c, notice that the log command is now going to show me all the commits starting from this one, working its way all the way up the list of parents.
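The idea of walking up the list of parents can be sketched with a toy object store in Python. This is a simplification under stated assumptions, not how git log is implemented: parse_commit and walk_log are made-up names, the store is a plain dictionary keyed by the abbreviated ids from the session, the author and committer lines are placeholders, and only the first parent of each commit is followed.

```python
def parse_commit(content: str) -> dict:
    """Hypothetical helper: split a raw commit into headers and message."""
    head, _, message = content.partition("\n\n")
    commit = {"parents": [], "message": message.strip()}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)
        else:
            commit[key] = value
    return commit

def walk_log(store: dict, start_id: str):
    """Follow first-parent links from start_id back to the root commit."""
    current = start_id
    while current is not None:
        commit = parse_commit(store[current])
        yield current, commit["message"]
        current = commit["parents"][0] if commit["parents"] else None

# Toy object store; identities and timestamps are placeholders.
who = "A Developer <dev@example.com> 0 +0000"
store = {
    "cc74": f"tree 1b23\nauthor {who}\ncommitter {who}\n\nThis is my first commit\n",
    "aa5c": f"tree e6211\nparent cc74\nauthor {who}\ncommitter {who}\n\n"
            "This is my second commit with a parent\n",
}

for commit_id, message in walk_log(store, "aa5c"):
    print(commit_id, message)
```

Starting from aa5c, the loop visits aa5c and then cc74, and stops there because the first commit has no parent line.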

And now we're getting pretty close to what we see with Git inside of our projects, with the git log and commit messages.