This chapter explains the user's view of Subversion -- what "objects" you interact with, how they behave, and how they relate to each other.
Suppose you are using Subversion to manage a software project. There are two things you will interact with: your working directory, and the repository.
Your working directory is an ordinary directory tree, on your local system, containing your project's sources. You can edit these files and compile your program from them in the usual way. Your working directory is your own private work area: Subversion never changes the files in your working directory, or publishes the changes you make there, until you explicitly tell it to do so.
After you've made some changes to the files in your working directory, and verified that they work properly, Subversion provides commands to publish your changes to the other people working with you on your project. If they publish their own changes, Subversion provides commands to incorporate those changes into your working directory.
A working directory contains some extra files, created and maintained by Subversion, to help it carry out these commands. In particular, these files help Subversion recognize which files contain unpublished changes, and which files are out-of-date with respect to others' work.
While your working directory is for your use alone, the repository is the common public record you share with everyone else working on the project. To publish your changes, you use Subversion to put them in the repository. (What this means, exactly, we explain below.) Once your changes are in the repository, others can tell Subversion to incorporate your changes into their working directories. In a collaborative environment like this, each user will typically have their own working directory (or perhaps more than one), and all the working directories will be backed by a single repository, shared amongst all the users.
A Subversion repository holds a single directory tree, and records the history of changes to that tree. The repository retains enough information to recreate any prior state of the tree, compute the differences between any two prior trees, and report the relations between files in the tree -- which files are derived from which other files.
A Subversion repository can hold the source code for several projects; usually, each project is a subdirectory in the tree. In this arrangement, a working directory will usually correspond to a particular subtree of the repository.
For example, suppose you have a repository laid out like this:
    /trunk/paint/Makefile
                 canvas.c
                 brush.c
           write/Makefile
                 document.c
                 search.c
In other words, the repository's root directory has a single subdirectory named trunk, which itself contains two subdirectories: paint and write.
To get a working directory, you must check out some subtree of the repository. If you check out /trunk/write, you will get a working directory like this:

    write/Makefile
          document.c
          search.c
          .svn/

This working directory is a copy of the repository's /trunk/write directory, with one additional entry -- .svn -- which holds the extra information needed by Subversion, as mentioned above.
Suppose you make changes to search.c. Since the .svn directory remembers the file's modification date and original contents, Subversion can tell that you've changed the file. However, Subversion does not make your changes public until you explicitly tell it to.
To publish your changes, you can use Subversion's commit command:

    $ pwd
    /home/jimb/write
    $ ls -a
    .svn/   Makefile   document.c   search.c
    $ svn commit search.c
    $

Now your changes to search.c have been committed to the repository; if another user checks out a working copy of /trunk/write, they will see your text.
Suppose you have a collaborator, Felix, who checked out a working directory of /trunk/write at the same time you did. When you commit your change to search.c, Felix's working copy is left unchanged; Subversion only modifies working directories at the user's request.
To bring his working directory up to date, Felix can use the Subversion update command. This will incorporate your changes into his working directory, as well as any others that have been committed since he checked it out.

    $ pwd
    /home/felix/write
    $ ls -a
    .svn/   Makefile   document.c   search.c
    $ svn update
    U search.c
    $

The output from the svn update command indicates that Subversion updated the contents of search.c. Note that Felix didn't need to specify which files to update; Subversion uses the information in the .svn directory, and further information in the repository, to decide which files need to be brought up to date.
We explain below what happens when both you and Felix make changes to the same file.
A Subversion commit operation can publish changes to any number of files and directories as a single atomic transaction. In your working directory, you can change files' contents, create, delete, rename and copy files and directories, and then commit the completed set of changes as a unit.
In the repository, each commit is treated as an atomic transaction: either all the commit's changes take place, or none of them take place. Subversion tries to retain this atomicity in the face of program crashes, system crashes, network problems, and other users' actions. We may call a commit a transaction when we want to emphasize its indivisible nature.
Each time the repository accepts a transaction, this creates a new state of the tree, called a revision. Each revision is assigned a unique natural number, one greater than the number of the previous revision. The initial revision of a freshly created repository is numbered zero, and consists of an empty root directory.
Since each transaction creates a new revision, with its own number, we can also use these numbers to refer to transactions; transaction n is the transaction which created revision n. There is no transaction numbered zero.
Unlike those of many other systems, Subversion's revision numbers apply to an entire tree, not individual files. Each revision number selects an entire tree.
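The numbering rules above can be illustrated with a toy model. This is only a sketch: the class name Repository, the dictionary representation of a tree, and the file contents are assumptions made for illustration, not Subversion's actual data structures.

```python
class Repository:
    """Toy model: the repository is a list of whole-tree snapshots."""

    def __init__(self):
        # Revision 0 of a fresh repository is an empty root directory.
        self.revisions = [{}]

    @property
    def head(self):
        """Number of the most recent revision."""
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply every change or none of them; return the new revision number.

        `changes` maps a path to its new contents, or to None to delete it.
        """
        new_tree = dict(self.revisions[-1])  # work on a copy, never in place
        for path, contents in changes.items():
            if contents is None:
                if path not in new_tree:
                    raise KeyError(f"cannot delete missing path: {path}")
                del new_tree[path]
            else:
                new_tree[path] = contents
        # Only reached if every change applied, so a failed commit
        # leaves the list of revisions untouched.
        self.revisions.append(new_tree)
        return self.head


repo = Repository()
r1 = repo.commit({"write/Makefile": "all:", "write/search.c": "v1"})
r2 = repo.commit({"write/search.c": "v2"})
print(r1, r2)                               # revision numbers of the two commits
print(repo.revisions[2]["write/Makefile"])  # unchanged file is still part of tree 2
```

Note that write/Makefile appears in revision 2 even though only search.c changed there: a revision number selects a whole tree, not a single file.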
It's important to note that working directories do not always correspond to any single revision in the repository; they may contain files from several different revisions. For example, suppose you check out a working directory from a repository whose most recent revision is 4:
    write/Makefile:4
          document.c:4
          search.c:4

At the moment, this working directory corresponds exactly to revision 4 in the repository. However, suppose you make a change to search.c, and commit that change. Assuming no other commits have taken place, your commit will create revision 5 of the repository, and your working directory will look like this:

    write/Makefile:4
          document.c:4
          search.c:5

Suppose that, at this point, Felix commits a change to document.c, creating revision 6. If you use svn update to bring your working directory up to date, then it will look like this:

    write/Makefile:6
          document.c:6
          search.c:6

Felix's changes to document.c will appear in your working copy of that file, and your change will still be present in search.c. In this example, the text of Makefile is identical in revisions 4, 5, and 6, but Subversion will mark your working copy with revision 6 to indicate that it is still current. So, after you do a clean update at the root of your working directory, your working directory will generally correspond exactly to some revision in the repository.
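The walk-through above can be replayed with a small sketch. The function names commit and update and the dictionary of per-file revision numbers are hypothetical; they only mirror the bookkeeping described in the text, not the real client.

```python
def commit(working_revs, head, paths):
    """Commit `paths`: only the committed files move to the new revision."""
    new_head = head + 1
    after = dict(working_revs)
    for path in paths:
        after[path] = new_head
    return after, new_head


def update(working_revs, head):
    """A clean update marks every file as being at the latest revision."""
    return {path: head for path in working_revs}


checked_out = {"Makefile": 4, "document.c": 4, "search.c": 4}

# You commit search.c, creating revision 5; the other files stay at 4.
after_commit, head = commit(checked_out, 4, ["search.c"])

# Felix commits document.c from his own working directory, creating revision 6.
head += 1

# Your update brings every file, changed or not, to revision 6.
after_update = update(after_commit, head)

print(after_commit)
print(after_update)
```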
For each file in a working directory, Subversion records two essential pieces of information:
Given this information, by talking to the repository, Subversion can tell which of the following four states a file is in:
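One way to read the four states, sketched here under an assumption: that they are the combinations of two yes/no questions the text has already raised, namely whether the file has unpublished local changes and whether it is out of date with respect to the repository. The names FileState, classify, base_revision and last_changed_revision are illustrative only.

```python
from enum import Enum


class FileState(Enum):
    UNCHANGED_CURRENT = "unchanged, and current"
    CHANGED_CURRENT = "locally changed, and current"
    UNCHANGED_OUT_OF_DATE = "unchanged, and out of date"
    CHANGED_OUT_OF_DATE = "locally changed, and out of date"


def classify(base_revision, last_changed_revision, locally_changed):
    """Classify one working file.

    base_revision:         revision the working file is based on (assumed name)
    last_changed_revision: newest repository revision in which the file
                           changed, learned by talking to the repository
    locally_changed:       True if the working file differs from its base text
    """
    out_of_date = last_changed_revision > base_revision
    if locally_changed:
        return (FileState.CHANGED_OUT_OF_DATE if out_of_date
                else FileState.CHANGED_CURRENT)
    return (FileState.UNCHANGED_OUT_OF_DATE if out_of_date
            else FileState.UNCHANGED_CURRENT)


print(classify(4, 4, False).value)
print(classify(4, 6, True).value)
```

Under this reading, a commit has something to publish only in the two "locally changed" states, and an update has something to fetch only in the two "out of date" states.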
Subversion does not prevent two users from making changes to the same file at the same time. For example, if both you and Felix have checked out working directories of /trunk/write, Subversion will allow both of you to change write/search.c in your working directories. Then, the following sequence of events will occur:
Suppose Felix commits his changes to search.c first. His commit will succeed, and his text will appear in the latest revision in the repository.
When you then try to commit your changes to search.c, Subversion will reject your commit, and tell you that you must update search.c before you can commit it.
When you update search.c, Subversion will try to merge Felix's changes from the repository with your local changes. By default, Subversion merges as if it were applying a patch: if your local changes do not overlap textually with Felix's, then all is well; otherwise, Subversion leaves it to you to resolve the overlapping changes. In either case, Subversion carefully preserves a copy of the original pre-merge text.
Once you have resolved any overlapping changes, you can commit search.c, which now contains everyone's changes.
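The difference between overlapping and non-overlapping changes can be shown with a deliberately simplified merge. This is not how patch works: the sketch assumes that both sides only rewrite existing lines, so that line numbers stay aligned, and the function name merge_lines is hypothetical.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equal-length line lists.

    Returns (merged, conflicts), where `conflicts` lists the indices of
    lines that both sides changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this toy merge does not handle added or deleted lines")
    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:          # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:        # only the other side changed it
            merged.append(t)
        elif t == b:        # only we changed it
            merged.append(m)
        else:               # overlapping change: left to the user to resolve
            conflicts.append(index)
            merged.append(m)
    return merged, conflicts


base = ["int find();", "int count();", "int main();"]
yours = ["int find(char *);", "int count();", "int main();"]
felix = ["int find();", "int count();", "int main(void);"]

clean, clean_conflicts = merge_lines(base, yours, felix)
print(clean, clean_conflicts)

felix_overlapping = ["long find();", "int count();", "int main();"]
overlapping, overlap_conflicts = merge_lines(base, yours, felix_overlapping)
print(overlap_conflicts)
```

In the second call the sketch keeps your text on the conflicting line and reports its index; it does not try to pick a winner.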
Some version control systems provide "locks", which prevent others from changing a file once one person has begun working on it. In our experience, merging is preferable to locks, because:
Of course, the merge process needs to be under the users' control. Patch is not appropriate for files with rigid formats, like images or executables. Subversion allows users to customize its merging behavior on a per-file basis. You can direct Subversion to refuse to merge changes to certain files, and simply present you with the two original texts to choose from. Or, you can direct Subversion to merge using a tool which respects the semantics of the file format.
Files generally have interesting attributes beyond their contents: owners and groups, access permissions, creation and modification times, and so on. Subversion attempts to preserve these attributes, or at least record them, when doing so would be meaningful. However, different operating systems support very different sets of file attributes: Windows NT supports access control lists, while Linux provides only the simpler traditional Unix permission bits.
In order to interoperate well with clients on many different operating systems, Subversion supports property lists, a simple, general-purpose mechanism which clients can use to store arbitrary out-of-band information about files.
A property list is a set of name / value pairs. A property name is an arbitrary text string, expressed as a Unicode UTF-8 string, canonically decomposed and ordered. A property value is an arbitrary string of bytes. Property values may be of any size, but Subversion may not handle very large property values efficiently. No two properties in a given property list may have the same name. Although the word `list' usually denotes an ordered sequence, there is no fixed order to the properties in a property list; the term `property list' is historical.
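A minimal sketch of such a structure follows. It assumes that "canonically decomposed and ordered" corresponds to Unicode normalization form NFD, which Python exposes as unicodedata.normalize("NFD", ...); the class PropertyList and the property name used in the example are illustrative, not part of Subversion.

```python
import unicodedata


class PropertyList:
    """Unordered set of name/value pairs with unique, normalized names."""

    def __init__(self):
        self._props = {}

    @staticmethod
    def _canonical(name):
        # NFD: canonical decomposition followed by canonical ordering.
        return unicodedata.normalize("NFD", name)

    def set(self, name, value):
        if not isinstance(value, (bytes, bytearray)):
            raise TypeError("property values are byte strings")
        self._props[self._canonical(name)] = bytes(value)

    def get(self, name):
        return self._props[self._canonical(name)]

    def names(self):
        return set(self._props)


props = PropertyList()
props.set("caf\u00e9", b"\x00\x01 any bytes")   # precomposed e-acute
props.set("cafe\u0301", b"second value")        # e followed by combining acute

# Both spellings normalize to the same name, so there is one property.
print(len(props.names()))
print(props.get("caf\u00e9"))
```

Normalizing on every access is what enforces the rule that no two properties in a list share a name, even when the same name arrives in two different encodings.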
Each revision number, file, directory, and directory entry in the Subversion repository has its own property list. Subversion puts these property lists to several uses:
Clients can record operating-system-specific file attributes as properties, such as Unix access permissions in svn:posix-access-permission. Operating systems which allow files to have more than one name, like Windows 95, can use directory entry property lists to record files' alternative names.
The svn-acl property holds an access control list which the Subversion server uses to regulate access to repository files.
Property lists are versioned, just like file contents. You can change properties in your working directory, but those changes are not visible in the repository until you commit your local changes. If you do commit a change to a property value, other users will see your change when they update their working directories.
This section describes how the Subversion client deals with the addition of files and directories.
Adding items consists of two phases:
This section describes the first phase.
svn add foo.c
foo.c is now tracked by your working copy. The svn status command will show the file with status code A and at local revision 0 (because it is not yet part of any repository revision).
svn add dirname
dirname will be added, but not recursively. If you want to add everything within dirname, then you can pass the --recursive flag to svn add. Everything added will have status code A and be at revision 0. (Note that unlike CVS, adding a directory does not effect an immediate change in the repository!)
svn revert [items]
Unschedules the items for addition; the svn status command will no longer show them at all.
There are two important exceptions which will prevent something from being scheduled for addition.
First, you can't schedule an item to be added if it doesn't exist. The svn add foo command will check that foo is an actual file or directory before succeeding.
Second, you can't schedule an item to be added if any of its parent directories are scheduled for deletion. This is a sanity check performed by svn add; it makes no sense to add an item within a directory that will be destroyed at commit-time.
If your working copy contains items scheduled for addition and you svn commit them, they will be copied into the repository and become a permanent part of your tree's history.
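The two checks can be written down as a small predicate. This is a sketch only: the function check_add and its arguments (a set of paths present on disk and a set of paths scheduled for deletion) are assumptions for illustration, and the messages are not real svn output.

```python
from pathlib import PurePosixPath


def check_add(path, existing, scheduled_for_deletion):
    """Return "ok" if `path` may be scheduled for addition, else a reason."""
    # First exception: the item must actually exist in the working copy.
    if path not in existing:
        return "error: no such file or directory"
    # Second exception: no parent directory may be scheduled for deletion.
    for parent in PurePosixPath(path).parents:
        if str(parent) in scheduled_for_deletion:
            return f"error: parent {parent} is scheduled for deletion"
    return "ok"


on_disk = {"fonts", "fonts/font1.ttf", "old", "old/notes.txt"}
to_delete = {"old"}

print(check_add("fonts/font1.ttf", on_disk, to_delete))
print(check_add("fonts/missing.ttf", on_disk, to_delete))
print(check_add("old/notes.txt", on_disk, to_delete))
```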
As usual, your commit will be rejected if any server-side conflicts result from your own working copy being out-of-date.
One final rule: you cannot commit an added item if any of its parents are scheduled for addition but are not included in the same commit. This is because it would be meaningless to commit a new item to the repository without a parent to hold that item. Therefore, to commit added items that are nested, you must commit from the top of the nesting.
For example, recall our old working copy:
    write/Makefile
          document.c
          search.c
          .svn/

Say we add a new directory fonts to the working copy:

    $ mkdir fonts
    $ svn add fonts
    $ svn st
    _   1  (  1)  .
    _   1  (  1)  ./Makefile
    _   1  (  1)  ./document.c
    _   1  (  1)  ./search.c
    A   0  (  1)  ./fonts

And say we add two new files within fonts:

    $ cp /some/path/font1.ttf fonts/
    $ cp /some/path/font2.ttf fonts/
    $ svn add fonts/font1.ttf fonts/font2.ttf
    $ svn st
    _   1  (  1)  .
    _   1  (  1)  ./Makefile
    _   1  (  1)  ./document.c
    _   1  (  1)  ./search.c
    A   0  (  1)  ./fonts
    A   0  (  1)  ./fonts/font1.ttf
    A   0  (  1)  ./fonts/font2.ttf

So what happens if we try to commit only font1.ttf? The command svn commit fonts/font1.ttf will fail, because it attempts to copy a file to the fonts directory on the repository - and no such directory exists there!
Thus the correct solution is to commit the parent directory. This will add fonts to the repository first, and then add its new contents:

    $ svn commit fonts
    Adding   ./fonts
    Adding   ./fonts/font1.ttf
    Adding   ./fonts/font2.ttf
    Commit succeeded.
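The rule about nested additions can be checked mechanically. The sketch below assumes that committing a directory covers everything beneath it, as in the example above; the function name check_commit and its return format are hypothetical.

```python
from pathlib import PurePosixPath


def check_commit(targets, scheduled_add):
    """Find added items whose added parent is missing from the commit.

    Returns a list of (item, uncommitted_parent) pairs; an empty list
    means the commit satisfies the nesting rule.
    """
    target_paths = [PurePosixPath(t) for t in targets]

    def covered(path):
        # A path is part of the commit if it is a target or lies below one.
        p = PurePosixPath(path)
        return any(t == p or t in p.parents for t in target_paths)

    problems = []
    for item in sorted(scheduled_add):
        if not covered(item):
            continue
        for parent in PurePosixPath(item).parents:
            if str(parent) in scheduled_add and not covered(str(parent)):
                problems.append((item, str(parent)))
    return problems


added = {"fonts", "fonts/font1.ttf", "fonts/font2.ttf"}

print(check_commit(["fonts/font1.ttf"], added))  # parent "fonts" is left out
print(check_commit(["fonts"], added))            # commits from the top: no problems
```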
During an update, new files and directories may be added to your working copy. This is no surprise.
The only problems that may occur are those times when the items being added have the same names as non-versioned items already present in your working copy. As a rule, Subversion never loses nor hides data in your working copy - versioned or not. Thus for the update to succeed, you'll have to move your unversioned items out of the way.
Replacement is when you add a new item that has the same name as an item already scheduled for deletion. Instead of showing both "D" and "A" flags simultaneously, an "R" flag is shown.
For example:
    $ svn st
    _   1  (  1)  .
    _   1  (  1)  ./foo.c
    $ svn rm foo.c
    $ svn st
    _   1  (  1)  .
    D   1  (  1)  ./foo.c
    $ rm foo.c
    $ echo "a whole new foo" > foo.c
    $ svn add foo.c
    $ svn st
    _   1  (  1)  .
    R   1  (  1)  ./foo.c

At this point, the replaced item acts like any other kind of addition. You can undo the replacement by running svn remove foo.c - and the file's status code will revert back to D. If the replaced item is a directory, you can schedule items within it for addition as well.
When a replaced item is committed, the client will first delete the original foo.c from the repository, and then add the "new" foo.c.
Replacements are useful: the object being replaced can even change type. For example, a file foo can be deleted and replaced with a directory of the same name, or vice versa.
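The status transitions described in this section form a small state machine. The sketch below uses the status codes from the text ("_", "A", "D", "R") and Python's None for an unversioned item; the function names are illustrative, and transitions the text does not describe are rejected rather than guessed.

```python
def schedule_add(status):
    """Status of an item after `svn add`, given its current status."""
    if status is None:      # unversioned item: plain addition
        return "A"
    if status == "D":       # same name as an item scheduled for deletion
        return "R"
    raise ValueError(f"add is not described for status {status!r}")


def schedule_remove(status):
    """Status of an item after `svn rm` / `svn remove`."""
    if status == "_":       # ordinary tracked item
        return "D"
    if status == "R":       # undoing a replacement falls back to deletion
        return "D"
    raise ValueError(f"remove is not described for status {status!r}")


status = "_"
status = schedule_remove(status)   # svn rm foo.c
after_rm = status
status = schedule_add(status)      # svn add foo.c (new file, same name)
after_add = status
status = schedule_remove(status)   # svn remove foo.c
after_undo = status
print(after_rm, after_add, after_undo)
```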
This section describes how the Subversion client deals with removals of files and directories. Many of these behaviors are newly invented, because they follow from the fact that Subversion is versioning directories. (In other words, the CVS model hasn't had to deal with these scenarios before.)
The svn rm subcommand is used to mark items in your working copy for removal from the repository. Note that marking them is not the same as actually removing them in the repository: the repository is never modified until you run svn commit (or svn import for new data).
Also, note that there are two different ways to interpret the phrase "remove an item". In the less destructive case, the item is removed from revision control (i.e. no longer tracked by Subversion), but it is not removed from your working copy. In the more destructive case, the item is removed both from revision control and from disk.
Subversion defaults to the less destructive behavior - svn rm by itself only removes an item from revision control. If the -f flag (--force) is given, the item(s) will also be removed from disk. In either case, no item containing local modifications will be removed, nor will items that are not under revision control (you can remove such items by hand).
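These rules fit in a few lines. The sketch below is a hypothetical planning function, not the client's code: plan_rm and the keys of the dictionary it returns are names invented for illustration.

```python
def plan_rm(versioned, locally_modified, force=False):
    """Decide what `svn rm` would do to one item, per the rules above."""
    if not versioned:
        raise ValueError("item is not under revision control; remove it by hand")
    if locally_modified:
        raise ValueError("item has local modifications; it will not be removed")
    return {
        "schedule_for_deletion": True,  # always: remove from revision control
        "remove_from_disk": force,      # only with -f / --force
    }


default_plan = plan_rm(versioned=True, locally_modified=False)
forced_plan = plan_rm(versioned=True, locally_modified=False, force=True)
print(default_plan)
print(forced_plan)
```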
svn rm foo.c
Schedules foo.c to be deleted from the repository. The file is still tracked in the administrative directory until the user commits; afterwards, the working copy will no longer track the file. If foo.c is locally modified, this command will return an error (you'll have to svn revert your change).
svn rm dirname
Schedules dirname to be deleted from the repository. If any locally modified items live below the directory, this command will return an error.
svn undel item
Cancels the scheduled removal of item. (This command does not recurse by default, as svn rm does.) To recurse, use the --recursive flag.
When an item has been scheduled for removal, but not yet committed, the client more-or-less treats the item as if it were gone. Although the item will still show up in the svn status command with a D flag next to it, the client will now allow the user to add a new item of the same name. In this case, the svn status output will describe the item as replaced (with an R flag).
(todo: perhaps we should show some examples here...)
This scenario is made even more complicated when the item in question is a directory. If a directory is recursively marked for deletion, and then a directory of the same name is added with svn add, the user can continue to add (or replace) items in the newly added directory. The svn status command would then show the parent directory as "replaced", and items inside the directory as a mixture of items that are scheduled to be "deleted", "added", and "replaced".
When the user runs svn commit, and items are scheduled for removal, the items are first removed from the repository. If there are server-side conflicts, then (as usual) an error message will explain that the working copy is out-of-date.
After the items are removed from the repository, all tracking information about the items is removed from the working copy. In the case of a file, its information is removed from .svn/. In the case of a directory, the entire .svn/ administrative area is removed, as well as all the administrative areas of its subdirectories.
Note that commit never removes any real working files or directories; that only happens with a svn rm -f command, or possibly during a svn update.
When an update tries to remove a file or directory, the item is not only removed from local revision control, but the item itself is deleted. In the case of a directory removal, this is equivalent to a Unix rm -rf command.
There are two exceptions, for safety's sake:
Thus it's possible that after an update which recursively removes a directory, there may be stray path "trails" leading down to individual locally-modified files that were deliberately saved.
"The three cardinal virtues of a master technologist are: laziness, impatience, and hubris." - Larry Wall
This section describes some of the pitfalls around the (possibly arrogant) notion that one can simply version directories just as one versions files.
To begin, recall that the Subversion repository is an array of trees. Each tree represents the application of a new atomic commit, and is called a revision. This is very different from a CVS repository, which stores file histories in a collection of RCS files (and doesn't track tree structure).
So when we refer to "revision 4 of foo.c" (written foo.c:4) in CVS, this means the fourth distinct version of foo.c - but in Subversion this means "the version of foo.c in the fourth revision (tree)". It's quite possible that foo.c has never changed at all since revision 1! In other words, in Subversion, different revision numbers of the same versioned item do not imply different contents.
Nevertheless, the contents of foo.c:4 are still well-defined. The file foo.c in revision 4 has a specific text and properties.
Suppose, now, that we extend this concept to directories. If we have a directory DIR, define DIR:N to be "the directory DIR in the Nth revision." The contents are defined to be a particular set of directory entries (dirents) and properties.
So far, so good. The concept of versioning directories seems fine in the repository - the repository is very theoretically pure anyway. However, because working copies allow mixed revisions, it's easy to create problematic use-cases.
Suppose our working copy has directory DIR:1 containing file foo:1, along with some other files. We remove foo and commit.
Already, we have a problem: our working copy still claims to have DIR:1. But on the repository, revision 1 of DIR is defined to contain foo - and our working copy DIR clearly does not have it anymore. How can we truthfully say that we still have DIR:1?
One answer is to force DIR to be updated when we commit foo's deletion. Assuming that our commit created revision 2, we would immediately update our working copy to DIR:2. Then the client and server would both agree that DIR:2 does not contain foo, and that DIR:2 is indeed exactly what is in the working copy.
This solution has nasty, un-user-friendly side effects, though. It's likely that other people may have committed before us, possibly adding new properties to DIR, or adding a new file bar. Now pretend our committed deletion creates revision 5 in the repository. If we instantly update our local DIR to 5, that means unexpectedly receiving a copy of bar and some new propchanges. This clearly violates a UI principle: "the client will never change your working copy until you ask it to." Committing changes to the repository is a server-write operation only; it should not modify your working data!
Another solution is to do the naive thing: after committing the deletion of foo, simply stop tracking the file in the .svn administrative directory. The client then loses all knowledge of the file.
But this doesn't work either: if we now update our working copy, the communication between client and server is incorrect. The client still believes that it has DIR:1 - which is false, since a "true" DIR:1 contains foo. The client gives this incorrect report to the repository, and the repository decides that in order to update to revision 2, foo must be deleted. Thus the repository sends a bogus (or at least unnecessary) deletion command.
This problem is solved through tricky administrative tracking in the client.
After deleting foo and committing, the file is not totally forgotten by the .svn directory. While the file is no longer considered to be under revision control, it is still secretly remembered as having been `deleted'.
When the user updates the working copy, the client correctly informs the server that the file is already missing from its local DIR:1; therefore the repository doesn't try to re-delete it when patching the client up to revision 2.
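The effect of this bookkeeping on an update can be sketched as follows. The function update_commands and the representation of a directory revision as a set of entry names are assumptions for illustration; the real client/server report is more involved.

```python
def update_commands(dir_revisions, base_rev, target_rev, already_deleted=()):
    """Commands the repository would send to bring DIR from base to target.

    dir_revisions:   maps a revision number to the set of entry names in DIR
    already_deleted: entries the client remembers having deleted and committed
    """
    # What the client really has: the "true" DIR:base minus what it reports gone.
    client_view = set(dir_revisions[base_rev]) - set(already_deleted)
    target = set(dir_revisions[target_rev])
    commands = [("delete", name) for name in client_view - target]
    commands += [("add", name) for name in target - client_view]
    return sorted(commands)


# DIR:1 contains foo; our own commit of foo's deletion created revision 2.
dir_revisions = {1: {"foo", "other.c"}, 2: {"other.c"}}

# Naive client: claims a pristine DIR:1, so it is told to delete foo again.
naive = update_commands(dir_revisions, 1, 2)

# Tracking client: reports foo as already deleted, so nothing is sent.
tracked = update_commands(dir_revisions, 1, 2, already_deleted={"foo"})

print(naive)
print(tracked)
```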
Again, suppose our working copy has directory DIR:1 containing file foo:1, along with some other files.
Now, unbeknownst to us, somebody else adds a new file bar to this directory, creating revision 2 (and DIR:2).
Now we add a property to DIR and commit, which creates revision 3. Our working-copy DIR is now marked as being at revision 3.
Of course, this is false; our working copy does not have DIR:3, because the "true" DIR:3 on the repository contains the new file bar. Our working copy has no knowledge of bar at all.
Again, we can't follow our commit of DIR with an automatic update (and addition of bar). As mentioned previously, commits are a one-way write operation; they must not change working copy data.
Let's enumerate exactly those times when a directory's local revision number changes:
In this light, it's clear that our "overeager directory" problem only happens in the second situation - those times when we're committing directory propchanges.
Thus the answer is simply not to allow property-commits on directories that are out-of-date. It sounds a bit restrictive, but there's no other way to keep directory revisions accurate.
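As a sketch of that restriction, replaying the scenario above: the function and exception names are hypothetical, and "out-of-date" is taken here to mean that the directory changed in the repository after the revision our working copy is based on.

```python
class OutOfDateError(Exception):
    """Raised when a property commit is attempted on a stale directory."""


def commit_dir_propchange(dir_base_rev, dir_last_changed_rev, head):
    """Commit a property change on a directory; return the new revision.

    dir_base_rev:         revision our working-copy directory is based on
    dir_last_changed_rev: newest revision in which the directory changed
                          in the repository
    head:                 latest revision in the repository
    """
    if dir_last_changed_rev > dir_base_rev:
        raise OutOfDateError("directory is out of date; update before committing")
    return head + 1


# We hold DIR:1, but somebody added bar in revision 2.
try:
    commit_dir_propchange(dir_base_rev=1, dir_last_changed_rev=2, head=2)
    rejected = False
except OutOfDateError:
    rejected = True

# After an update to DIR:2 (which brings in bar), the same commit is accepted
# and our directory can honestly be marked as being at the new revision.
new_rev = commit_dir_propchange(dir_base_rev=2, dir_last_changed_rev=2, head=2)
print(rejected, new_rev)
```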
Really, the Subversion client seems to have two difficult--almost contradictory--goals.
First, it needs to make the user experience friendly, which generally means being a bit "sloppy" about deciding what a user can or cannot do. This is why it allows mixed-revision working copies, and why it tries to let users execute local tree-changing operations (delete, add, move, copy) in situations that aren't always perfectly, theoretically "safe" or pure.
Second, the client tries to keep the working copy correctly in sync with the repository using as little communication as possible. Of course, this is made much harder by the first goal!
So in the end, there's a tension here, and the resolutions to problems can vary. In one case (the "lagging directory"), the problem can be solved through secret, complex tracking in the client. In the other case ("the overeager directory"), the only solution is to restrict some of the theoretical laxness allowed by the client.
Copyright © 2000 Collab.Net. All rights reserved.
This software is licensed as described in the file COPYING, which you should have received as part of this distribution. The terms are also available at http://subversion.tigris.org/license-1.html. If newer versions of this license are posted there, you may use a newer version instead, at your option.