Model

This chapter explains the user's view of Subversion -- what "objects" you interact with, how they behave, and how they relate to each other.

Working Directories and Repositories

Suppose you are using Subversion to manage a software project. There are two things you will interact with: your working directory, and the repository.

Your working directory is an ordinary directory tree, on your local system, containing your project's sources. You can edit these files and compile your program from them in the usual way. Your working directory is your own private work area: Subversion never changes the files in your working directory, or publishes the changes you make there, until you explicitly tell it to do so.

After you've made some changes to the files in your working directory, and verified that they work properly, Subversion provides commands to publish your changes to the other people working with you on your project. If they publish their own changes, Subversion provides commands to incorporate those changes into your working directory.

A working directory contains some extra files, created and maintained by Subversion, to help it carry out these commands. In particular, these files help Subversion recognize which files contain unpublished changes, and which files are out-of-date with respect to others' work.

While your working directory is for your use alone, the repository is the common public record you share with everyone else working on the project. To publish your changes, you use Subversion to put them in the repository. (What this means, exactly, we explain below.) Once your changes are in the repository, others can tell Subversion to incorporate your changes into their working directories. In a collaborative environment like this, each user will typically have their own working directory (or perhaps more than one), and all the working directories will be backed by a single repository, shared amongst all the users.

A Subversion repository holds a single directory tree, and records the history of changes to that tree. The repository retains enough information to recreate any prior state of the tree, compute the differences between any two prior trees, and report the relations between files in the tree -- which files are derived from which other files.

A Subversion repository can hold the source code for several projects; usually, each project is a subdirectory in the tree. In this arrangement, a working directory will usually correspond to a particular subtree of the repository.

For example, suppose you have a repository laid out like this:

/trunk/paint/Makefile
             canvas.c
             brush.c
       write/Makefile
             document.c
             search.c

In other words, the repository's root directory has a single subdirectory named trunk, which itself contains two subdirectories: paint and write.

To get a working directory, you must check out some subtree of the repository. If you check out /trunk/write, you will get a working directory like this:

write/Makefile
      document.c
      search.c
      .svn/

This working directory is a copy of the repository's /trunk/write directory, with one additional entry -- .svn -- which holds the extra information needed by Subversion, as mentioned above.

Suppose you make changes to search.c. Since the .svn directory remembers the file's modification date and original contents, Subversion can tell that you've changed the file. However, Subversion does not make your changes public until you explicitly tell it to.

To publish your changes, you can use Subversion's commit command:

$ pwd
/home/jimb/write
$ ls -a
.svn/    Makefile   document.c    search.c
$ svn commit search.c
$

Now your changes to search.c have been committed to the repository; if another user checks out a working copy of /trunk/write, they will see your text.

Suppose you have a collaborator, Felix, who checked out a working directory of /trunk/write at the same time you did. When you commit your change to search.c, Felix's working copy is left unchanged; Subversion only modifies working directories at the user's request.

To bring his working directory up to date, Felix can use the Subversion update command. This will incorporate your changes into his working directory, as well as any others that have been committed since he checked it out.

$ pwd
/home/felix/write
$ ls -a
.svn/    Makefile    document.c    search.c
$ svn update
U search.c
$

The output from the svn update command indicates that Subversion updated the contents of search.c. Note that Felix didn't need to specify which files to update; Subversion uses the information in the .svn directory, and further information in the repository, to decide which files need to be brought up to date.

We explain below what happens when both you and Felix make changes to the same file.

Transactions and Revision Numbers

A Subversion commit operation can publish changes to any number of files and directories as a single atomic transaction. In your working directory, you can change files' contents, create, delete, rename and copy files and directories, and then commit the completed set of changes as a unit.

In the repository, each commit is treated as an atomic transaction: either all the commit's changes take place, or none of them take place. Subversion tries to retain this atomicity in the face of program crashes, system crashes, network problems, and other users' actions. We may call a commit a transaction when we want to emphasize its indivisible nature.

Each time the repository accepts a transaction, this creates a new state of the tree, called a revision. Each revision is assigned a unique natural number, one greater than the number of the previous revision. The initial revision of a freshly created repository is numbered zero, and consists of an empty root directory.

Since each transaction creates a new revision, with its own number, we can also use these numbers to refer to transactions; transaction n is the transaction which created revision n. There is no transaction numbered zero.

Unlike those of many other systems, Subversion's revision numbers apply to an entire tree, not individual files. Each revision number selects an entire tree.
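The numbering and atomicity rules above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not a description of how Subversion stores data: the class name ToyRepository, its method names, and the representation of a tree as a dictionary from path to file text are all hypothetical.

```python
class ToyRepository:
    """Hypothetical in-memory model: one whole tree per revision."""

    def __init__(self):
        # Revision 0 of a fresh repository is an empty root directory.
        self.revisions = [{}]

    def head(self):
        """Return the number of the most recent revision."""
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply all changes as one transaction; return the new revision number.

        `changes` maps a path to its new text, or to None to delete the path.
        The new tree is built on a copy, so a failure leaves history untouched.
        """
        new_tree = dict(self.revisions[-1])
        for path, text in changes.items():
            if text is None:
                if path not in new_tree:
                    raise KeyError(f"cannot delete missing path: {path}")
                del new_tree[path]
            else:
                new_tree[path] = text
        self.revisions.append(new_tree)  # the only step that publishes anything
        return self.head()

    def tree(self, revision):
        """Return the entire tree selected by a revision number."""
        return self.revisions[revision]
```

In this sketch, a commit that fails halfway through never appends a tree, so the head revision does not move; a successful commit always yields a number one greater than the previous head.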

It's important to note that working directories do not always correspond to any single revision in the repository; they may contain files from several different revisions. For example, suppose you check out a working directory from a repository whose most recent revision is 4:

write/Makefile:4
      document.c:4
      search.c:4

At the moment, this working directory corresponds exactly to revision 4 in the repository. However, suppose you make a change to search.c, and commit that change. Assuming no other commits have taken place, your commit will create revision 5 of the repository, and your working directory will look like this:

write/Makefile:4
      document.c:4
      search.c:5

Suppose that, at this point, Felix commits a change to document.c, creating revision 6. If you use svn update to bring your working directory up to date, then it will look like this:

write/Makefile:6
      document.c:6
      search.c:6

Felix's changes to document.c will appear in your working copy of that file, and your change will still be present in search.c. In this example, the text of Makefile is identical in revisions 4, 5, and 6, but Subversion will mark your working copy with revision 6 to indicate that it is still current. So, after you do a clean update at the root of your working directory, your working directory will generally correspond exactly to some revision in the repository.
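The same sequence can be traced with a small Python sketch. It assumes a working directory can be modeled as a mapping from file name to revision number; the function names commit_files and update_all are hypothetical and exist only for this illustration.

```python
def commit_files(working, committed, new_revision):
    """Return the working directory's revisions after committing some files.

    Only the committed files move to the new revision; nothing else changes.
    """
    return {name: (new_revision if name in committed else rev)
            for name, rev in working.items()}


def update_all(working, head_revision):
    """Return the revisions after a clean update of the whole directory."""
    return {name: head_revision for name in working}


# The example from the text: check out at 4, commit search.c as 5,
# then update after Felix has created revision 6.
checked_out = {"Makefile": 4, "document.c": 4, "search.c": 4}
after_commit = commit_files(checked_out, {"search.c"}, 5)
after_update = update_all(after_commit, 6)
```

After the commit the mapping holds two different revision numbers, which is the mixed state described above; after the update every file is at the same revision again.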

How Working Directories Track the Repository

For each file in a working directory, Subversion records two essential pieces of information:

Given this information, by talking to the repository, Subversion can tell which of the following four states a file is in:

Subversion Does Not Lock Files

Subversion does not prevent two users from making changes to the same file at the same time. For example, if both you and Felix have checked out working directories of /trunk/write, Subversion will allow both of you to change write/search.c in your working directories. Then, the following sequence of events will occur:

Some version control systems provide "locks", which prevent others from changing a file once one person has begun working on it. In our experience, merging is preferable to locks, because:

Of course, the merge process needs to be under the users' control. Patch is not appropriate for files with rigid formats, like images or executables. Subversion allows users to customize its merging behavior on a per-file basis. You can direct Subversion to refuse to merge changes to certain files, and simply present you with the two original texts to choose from. Or, you can direct Subversion to merge using a tool which respects the semantics of the file format.

Properties

Files generally have interesting attributes beyond their contents: owners and groups, access permissions, creation and modification times, and so on. Subversion attempts to preserve these attributes, or at least record them, when doing so would be meaningful. However, different operating systems support very different sets of file attributes: Windows NT supports access control lists, while Linux provides only the simpler traditional Unix permission bits.

In order to interoperate well with clients on many different operating systems, Subversion supports property lists, a simple, general-purpose mechanism which clients can use to store arbitrary out-of-band information about files.

A property list is a set of name / value pairs. A property name is an arbitrary text string, expressed as a Unicode UTF-8 string, canonically decomposed and ordered. A property value is an arbitrary string of bytes. Property values may be of any size, but Subversion may not handle very large property values efficiently. No two properties in a given property list may have the same name. Although the word `list' usually denotes an ordered sequence, there is no fixed order to the properties in a property list; the term `property list' is historical.
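As a rough illustration of these rules, here is a minimal Python sketch. It assumes that "canonically decomposed and ordered" corresponds to Unicode normalization form NFD; the class PropertyList and its methods are hypothetical and are not part of any Subversion interface.

```python
import unicodedata


class PropertyList:
    """Hypothetical model of a property list: unordered, with unique names."""

    def __init__(self):
        self._props = {}

    @staticmethod
    def _canonical(name):
        # Assumption: NFD (canonical decomposition plus canonical ordering)
        # is the normal form the text refers to.
        return unicodedata.normalize("NFD", name)

    def set(self, name, value):
        """Store a byte-string value under a normalized text name."""
        if not isinstance(value, bytes):
            raise TypeError("property values are byte strings")
        self._props[self._canonical(name)] = value

    def get(self, name):
        """Look up a value; the name is normalized the same way as in set()."""
        return self._props[self._canonical(name)]

    def names(self):
        """Return the property names as a set, since order carries no meaning."""
        return set(self._props)
```

Normalizing names before storing them means that two spellings of the same text, one precomposed and one decomposed, refer to a single property rather than to two.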

Each revision number, file, directory, and directory entry in the Subversion repository has its own property list. Subversion puts these property lists to several uses:

Property lists are versioned, just like file contents. You can change properties in your working directory, but those changes are not visible in the repository until you commit your local changes. If you do commit a change to a property value, other users will see your change when they update their working directories.

Adds

This section describes how the Subversion client deals with the addition of files and directories.

Adding items

Adding items consists of two phases:

  1. Make your working copy aware of a new object it contains, effectively "scheduling" this new object to be added to the repository.
  2. Commit the object to the repository.

This section describes the first phase.

To add a file: svn add foo.c
The file foo.c is now tracked by your working copy. The svn status command will show the file with status code A and at local revision 0 (because it is not yet part of any repository revision).
To add a directory: svn add dirname
The directory dirname will be added, but not recursively. If you want to add everything within dirname, then you can pass the --recursive flag to svn add. Everything added will have status code A and be at revision 0.

(Note that unlike CVS, adding a directory does not effect an immediate change in the repository!)

To undo additions before committing: svn revert [items]
Because added items have not yet been committed to the repository, it's easy to make your working copy "forget" that you wanted to add something. This command will de-schedule items for addition. The svn status command will no longer show them at all.

There are two important exceptions which will prevent something from being scheduled for addition.

First, you can't schedule an item to be added if it doesn't exist. The svn add foo command will check that foo is an actual file or directory before succeeding.

Second, you can't schedule an item to be added if any of its parent directories are scheduled for deletion. This is a sanity check performed by svn add; it makes no sense to add an item within a directory that will be destroyed at commit-time.
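The two checks can be expressed as a short Python sketch. It is not the client's actual code: the function name can_schedule_add, the use of plain sets of relative paths in place of the filesystem and the administrative area, and the wording of the returned messages are all assumptions made for illustration.

```python
import posixpath


def can_schedule_add(path, existing, scheduled_for_deletion):
    """Return (ok, reason) for a request to schedule `path` for addition.

    `existing` is the set of paths present on disk; `scheduled_for_deletion`
    is the set of paths already marked for removal.
    """
    # First exception: the item must actually exist.
    if path not in existing:
        return False, "item does not exist"
    # Second exception: no parent directory may be scheduled for deletion.
    parent = posixpath.dirname(path)
    while parent not in ("", "/"):
        if parent in scheduled_for_deletion:
            return False, f"parent {parent} is scheduled for deletion"
        parent = posixpath.dirname(parent)
    return True, "scheduled for addition"
```

The loop walks every ancestor, not just the immediate parent, because the rule applies to any of the item's parent directories.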

Committing additions

If your working copy contains items scheduled for addition and you svn commit them, they will be copied into the repository and become a permanent part of your tree's history.

As usual, your commit will be rejected if any server-side conflicts result from your own working copy being out-of-date.

One final rule: you cannot commit an added item if any of its parents are scheduled for addition but are not included in the same commit. This is because it would be meaningless to commit a new item to the repository without a parent to hold that item. Therefore, to commit added items that are nested, you must commit from the top of the nesting.

For example, recall our old working copy:

write/Makefile
      document.c
      search.c
      .svn/

Say we add a new directory fonts to the working copy:

$ mkdir fonts
$ svn add fonts
$ svn st
_   1       (     1)  .
_   1       (     1)  ./Makefile
_   1       (     1)  ./document.c
_   1       (     1)  ./search.c
A   0       (     1)  ./fonts

And say we add two new files within fonts:

$ cp /some/path/font1.ttf fonts/
$ cp /some/path/font2.ttf fonts/
$ svn add fonts/font1.ttf fonts/font2.ttf
$ svn st
_   1       (     1)  .
_   1       (     1)  ./Makefile
_   1       (     1)  ./document.c
_   1       (     1)  ./search.c
A   0       (     1)  ./fonts
A   0       (     1)  ./fonts/font1.ttf
A   0       (     1)  ./fonts/font2.ttf

So what happens if we try to commit only font1.ttf? The command svn commit fonts/font1.ttf will fail, because it attempts to copy a file to the fonts directory on the repository - and no such directory exists there!

Thus the correct solution is to commit the parent directory. This will add fonts to the repository first, and then add its new contents:

$ svn commit fonts
Adding   ./fonts
Adding   ./fonts/font1.ttf
Adding   ./fonts/font2.ttf
Commit succeeded.
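The nesting rule illustrated above can be checked mechanically. The following Python sketch assumes that paths are relative and slash-separated and that a directory named on the command line covers everything below it; the function name uncommittable_adds is hypothetical.

```python
def uncommittable_adds(targets, scheduled_adds):
    """Return the added items under `targets` whose added parent is left out.

    `targets` are the paths named on the commit; a directory target covers
    everything below it. `scheduled_adds` is the set of paths marked A.
    """
    def covered(path):
        return any(path == t or path.startswith(t + "/") for t in targets)

    def parents(path):
        parts = path.split("/")
        return ["/".join(parts[:i]) for i in range(1, len(parts))]

    return sorted(
        path for path in scheduled_adds
        if covered(path)
        and any(p in scheduled_adds and not covered(p) for p in parents(path))
    )


adds = {"fonts", "fonts/font1.ttf", "fonts/font2.ttf"}
blocked = uncommittable_adds(["fonts/font1.ttf"], adds)  # parent fonts is left out
allowed = uncommittable_adds(["fonts"], adds)            # commit from the top
```

An empty result means the commit satisfies the rule; a non-empty result lists the items that would have no parent in the repository.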

Additions from updates

During an update, new files and directories may be added to your working copy. This is no surprise.

The only problems that may occur are those times when the items being added have the same names as non-versioned items already present in your working copy. As a rule, Subversion never loses or hides data in your working copy - versioned or not. Thus for the update to succeed, you'll have to move your unversioned items out of the way.

Replacements

Replacement is when you add a new item that has the same name as an item already scheduled for deletion. Instead of showing both "D" and "A" flags simultaneously, an "R" flag is shown.

For example:

$ svn st
_   1       (     1)  .
_   1       (     1)  ./foo.c

$ svn rm foo.c
$ svn st
_   1       (     1)  .
D   1       (     1)  ./foo.c

$ rm foo.c
$ echo "a whole new foo" > foo.c
$ svn add foo.c
$ svn st
_   1       (     1)  .
R   1       (     1)  ./foo.c

At this point, the replaced item acts like any other kind of addition. You can undo the replacement by running svn remove foo.c - and the file's status code will revert to D. If the replaced item is a directory, you can schedule items within it for addition as well.

When a replaced item is committed, the client will first delete the original foo.c from the repository, and then add the "new" foo.c.

Replacements are useful: the object being replaced can even change type. For example, a file foo can be deleted and replaced with a directory of the same name, or vice versa.
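The status changes described in this section form a small state machine, sketched here in Python. Only the transitions mentioned in the text are modeled; the table name TRANSITIONS, the function next_status, and the use of None for an untracked item are assumptions of this sketch.

```python
# Status codes as used in the listings above: "_" unmodified, "A" added,
# "D" deleted, "R" replaced. None stands for an item that is not tracked.
TRANSITIONS = {
    (None, "add"): "A",
    ("_", "rm"): "D",
    ("D", "add"): "R",  # adding over a pending deletion is a replacement
    ("R", "rm"): "D",   # removing again undoes the replacement
}


def next_status(status, command):
    """Return the status after `command`, or raise if the step is not modeled."""
    try:
        return TRANSITIONS[(status, command)]
    except KeyError:
        raise ValueError(
            f"transition not modeled: {status!r} + {command}"
        ) from None
```

Starting from "_", the commands rm, add, rm walk through D, R, and back to D, matching the foo.c example.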

Removals

This section describes how the Subversion client deals with removals of files and directories. Many of these behaviors are newly invented, because they follow from the fact that Subversion is versioning directories. (In other words, the CVS model hasn't had to deal with these scenarios before.)

Removing items

The svn rm subcommand is used to mark items in your working copy for removal from the repository. Note that marking them is not the same as actually removing them in the repository: the repository is never modified until you run svn commit (or svn import for new data).

Also, note that there are two different ways to interpret the phrase "remove an item". In the less destructive case, the item is removed from revision control (i.e. no longer tracked by Subversion), but it is not removed from your working copy. In the more destructive case, the item is removed both from revision control and from disk.

Subversion defaults to the less destructive behavior - svn rm by itself only removes an item from revision control. If the -f flag (--force) is given, the item(s) will also be removed from disk. However, no item containing local modifications will be removed, nor will items that are not under revision control (you can remove such items by hand).

To remove a file: svn rm foo.c
This will schedule foo.c to be deleted from the repository. The file is still tracked in the administrative directory until the user commits; afterwards, the working copy will no longer track the file.

If foo.c is locally modified, this command will return an error (you'll have to svn revert your change).

To recursively remove a directory: svn rm dirname
This will recursively schedule every item below directory dirname to be deleted from the repository.

If any locally modified items live below the directory, this command will return an error.

To undo a deletion before committing: svn undel item
This subcommand will "unmark" a file or directory that is scheduled for removal. In the directory case, it does not recurse by default (unlike svn rm, which does). To recurse, use the --recursive flag.

When an item has been scheduled for removal, but not yet committed, the client more-or-less treats the item as if it were gone. Although the item will still show up in the svn status command with a D flag next to it, the client will now allow the user to add a new item of the same name. In this case, the svn status output will describe the item as replaced (with an R flag).

(todo: perhaps we should show some examples here...)

This scenario is made even more complicated when the item in question is a directory. If a directory is recursively marked for deletion, and then a directory of the same name is added with svn add, the user can continue to add (or replace) items in the newly added directory. The svn status command would then show the parent directory as "replaced", and items inside the directory as a mixture of items that are scheduled to be "deleted", "added", and "replaced".

Committing removals

When the user runs svn commit, and items are scheduled for removal, the items are first removed from the repository. If there are server-side conflicts, then (as usual) an error message will explain that the working copy is out-of-date.

After the items are removed from the repository, all tracking information about the items is removed from the working copy. In the case of a file, its information is removed from .svn/. In the case of a directory, the entire .svn/ administrative area is removed, as well as all the administrative areas of its subdirectories.

Note that commit never removes any real working files or directories; that only happens with a svn rm -f command, or possibly during a svn update.

Removals in updates

When an update tries to remove a file or directory, the item is not only removed from local revision control, but the item itself is deleted. In the case of a directory removal, this is equivalent to a Unix rm -rf command.

There are two exceptions, for safety's sake:

Thus it's possible that after an update which recursively removes a directory, there may be stray path "trails" leading down to individual locally-modified files that were deliberately saved.

Directory Versioning

"The three cardinal virtues of a master technologist are: laziness, impatience, and hubris." - Larry Wall

This section describes some of the pitfalls around the (possibly arrogant) notion that one can simply version directories just as one versions files.

Revisions

To begin, recall that the Subversion repository is an array of trees. Each tree represents the application of a new atomic commit, and is called a revision. This is very different from a CVS repository, which stores file histories in a collection of RCS files (and doesn't track tree-structure).

So when we refer to "revision 4 of foo.c" (written foo.c:4) in CVS, this means the fourth distinct version of foo.c - but in Subversion this means "the version of foo.c in the fourth revision (tree)". It's quite possible that foo.c has never changed at all since revision 1! In other words, in Subversion, different revision numbers of the same versioned item do not imply different contents.

Nevertheless, the contents of foo.c:4 are still well-defined. The file foo.c in revision 4 has a specific text and properties.

Suppose, now, that we extend this concept to directories. If we have a directory DIR, define DIR:N to be "the directory DIR in the Nth revision." The contents are defined to be a particular set of directory entries (dirents) and properties.

So far, so good. The concept of versioning directories seems fine in the repository - the repository is very theoretically pure anyway. However, because working copies allow mixed revisions, it's easy to create problematic use-cases.

The Lagging Directory

Problem

Suppose our working copy has directory DIR:1 containing file foo:1, along with some other files. We remove foo and commit.

Already, we have a problem: our working copy still claims to have DIR:1. But on the repository, revision 1 of DIR is defined to contain foo - and our working copy DIR clearly does not have it anymore. How can we truthfully say that we still have DIR:1?

One answer is to force DIR to be updated when we commit foo's deletion. Assuming that our commit created revision 2, we would immediately update our working copy to DIR:2. Then the client and server would both agree that DIR:2 does not contain foo, and that DIR:2 is indeed exactly what is in the working copy.

This solution has nasty, un-user-friendly side effects, though. It's likely that other people may have committed before us, possibly adding new properties to DIR, or adding a new file bar. Now pretend our committed deletion creates revision 5 in the repository. If we instantly update our local DIR to 5, that means unexpectedly receiving a copy of bar and some new propchanges. This clearly violates a UI principle: "the client will never change your working copy until you ask it to." Committing changes to the repository is a server-write operation only; it should not modify your working data!

Another solution is to do the naive thing: after committing the deletion of foo, simply stop tracking the file in the .svn administrative directory. The client then loses all knowledge of the file.

But this doesn't work either: if we now update our working copy, the communication between client and server is incorrect. The client still believes that it has DIR:1 - which is false, since a "true" DIR:1 contains foo. The client gives this incorrect report to the repository, and the repository decides that in order to update to revision 2, foo must be deleted. Thus the repository sends a bogus (or at least unnecessary) deletion command.

Solution

This problem is solved through tricky administrative tracking in the client.

After deleting foo and committing, the file is not totally forgotten by the .svn directory. While the file is no longer considered to be under revision control, it is still secretly remembered as having been `deleted'.

When the user updates the working copy, the client correctly informs the server that the file is already missing from its local DIR:1; therefore the repository doesn't try to re-delete it when patching the client up to revision 2.
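To make the difference between the naive approach and the tracked approach concrete, here is a toy Python sketch. It is not the real client/server protocol: the report format, the function names client_report and server_deletions, and the representation of each revision of DIR as a set of entry names are all assumptions.

```python
def client_report(dir_revision, entries):
    """Describe the working copy to the server.

    `entries` maps a name to "normal" or "deleted"; a "deleted" entry is one
    whose removal was committed while the directory stayed at its old revision.
    """
    missing = sorted(n for n, state in entries.items() if state == "deleted")
    return {"revision": dir_revision, "missing": missing}


def server_deletions(report, trees, target_revision):
    """Return the names the server must tell the client to delete."""
    claimed = set(trees[report["revision"]]) - set(report["missing"])
    return sorted(claimed - set(trees[target_revision]))


# DIR:1 contains foo and bar; revision 2 is our own commit that deleted foo.
trees = {1: {"foo", "bar"}, 2: {"bar"}}

# Tracked approach: the client remembers that foo was deleted.
tracked = server_deletions(
    client_report(1, {"foo": "deleted", "bar": "normal"}), trees, 2)

# Naive approach: the client has forgotten foo entirely.
naive = server_deletions(client_report(1, {"bar": "normal"}), trees, 2)
```

In this sketch the tracked report produces no deletion commands, while the naive report makes the server send an unnecessary deletion of foo.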

The Overeager Directory

Problem

Again, suppose our working copy has directory DIR:1 containing file foo:1, along with some other files.

Now, unbeknownst to us, somebody else adds a new file bar to this directory, creating revision 2 (and DIR:2).

Now we add a property to DIR and commit, which creates revision 3. Our working-copy DIR is now marked as being at revision 3.

Of course, this is false; our working copy does not have DIR:3, because the "true" DIR:3 on the repository contains the new file bar. Our working copy has no knowledge of bar at all.

Again, we can't follow our commit of DIR with an automatic update (and addition of bar). As mentioned previously, commits are a one-way write operation; they must not change working copy data.

Solution

Let's enumerate exactly those times when a directory's local revision number changes:

In this light, it's clear that our "overeager directory" problem only happens in the second situation - those times when we're committing directory propchanges.

Thus the answer is simply not to allow property-commits on directories that are out-of-date. It sounds a bit restrictive, but there's no other way to keep directory revisions accurate.
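The restriction amounts to a single comparison, shown here as a Python sketch. It assumes that "out-of-date" means the repository holds a change to the directory newer than the working copy's revision of it; the function name may_commit_dir_props and both parameter names are hypothetical.

```python
def may_commit_dir_props(working_revision, last_changed_revision):
    """Return True if a directory property change may be committed.

    `working_revision` is the directory's revision in the working copy;
    `last_changed_revision` is the newest repository revision that changed
    the directory (its entries or its properties).
    """
    return working_revision >= last_changed_revision


# The example from the text: our DIR is at revision 1, but somebody added
# bar in revision 2, so the property commit is refused until we update.
before_update = may_commit_dir_props(1, 2)
after_update = may_commit_dir_props(2, 2)
```

Once the update has brought DIR to revision 2, the working copy knows about bar, so marking DIR with the revision created by the property commit is no longer a false claim.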

User impact

Really, the Subversion client seems to have two difficult--almost contradictory--goals.

First, it needs to make the user experience friendly, which generally means being a bit "sloppy" about deciding what a user can or cannot do. This is why it allows mixed-revision working copies, and why it tries to let users execute local tree-changing operations (delete, add, move, copy) in situations that aren't always perfectly, theoretically "safe" or pure.

Second, the client tries to keep the working copy correctly in sync with the repository using as little communication as possible. Of course, this is made much harder by the first goal!

So in the end, there's a tension here, and the resolutions to problems can vary. In one case (the "lagging directory"), the problem can be solved through secret, complex tracking in the client. In the other case ("the overeager directory"), the only solution is to restrict some of the theoretical laxness allowed by the client.

License

Copyright © 2000 Collab.Net. All rights reserved.

This software is licensed as described in the file COPYING, which you should have received as part of this distribution. The terms are also available at http://subversion.tigris.org/license-1.html. If newer versions of this license are posted there, you may use a newer version instead, at your option.