The Future of Subversion
Posted by
kdawson
on Fri May 09, 2008 12:03 PM
from the had-a-good-run dept.
sciurus0 writes "As the open source version control system Subversion nears its 1.5 release, one of its developers asks, what is the project's future? On the one hand, the number of public Subversion DAV servers is still growing exponentially. On the other hand, open source developers are increasingly switching to distributed version control systems like Git and Mercurial. Is there still a need for centralized version control in some environments, or is Linus Torvalds right that all who use it are 'ugly and stupid'?" The comments on the blog post have high S/N.
Related Stories
BSD: FreeBSD Begins Switch to Subversion 70 comments
An anonymous reader writes "The FreeBSD Project has begun the switch of its source code management system from CVS to Subversion. At this point in time, FreeBSD's developers are making changes to the base system in the Subversion repository. We have a replication system in place that exports our work to the legacy CVS tree on a continuous basis.
People who are using our extensive CVS based distribution network (including anoncvs, CVSup, cvsweb, ftp) will not be interrupted by our work-in-progress. We are committed to maintaining the existing CVS based distribution system for at least the support lifetime of all existing 'stable' branches. Security and errata patches will continue to be made available in their usual CVS locations."

S/N Ratio (Score:4, Funny)
Well *I'm* ugly and stupid... (Score:5, Insightful)
Basically, Subversion is not suited for development with a diverse population of loosely connected individuals, each with their own private branches. Frankly, for corporate work, I don't understand why you would want the backup and integrity hassles of a distributed version control system. But maybe that's because I'm ugly and stupid.
Re:Well *I'm* ugly and stupid... (Score:4, Insightful)
Re:Well *I'm* ugly and stupid... (Score:5, Informative)
The big difference is that a DVCS adds a local workspace. I can check something out from the centralized server (with a DVCS, I pull the server tree to my local tree), mess around, make a branch, see what it does, decide it was stupid and throw the whole thing away, or I can decide it was a good idea and then commit it to the centralized server (by pushing my tree up to the central tree). The only real difference is that a check out is called a pull and a commit is called a push.
Separating change management from committing to the repository is not necessarily a bad thing. It may be undesirable in many situations, but it can also be handy.
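The pull, local branch, throw away or push cycle described above can be sketched as a toy model. This is only an illustration: the `Repo` class and the `pull` and `push` helpers are made up for the example, and real tools such as Git or Mercurial store history very differently.

```python
"""Toy model of the pull / local-branch / push cycle (not how any real DVCS stores data)."""
import copy


class Repo:
    """A repository as a mapping of branch name -> list of commit messages."""

    def __init__(self):
        self.branches = {"main": []}

    def commit(self, branch, message):
        self.branches[branch].append(message)

    def branch(self, source, name):
        # A new branch starts with a copy of the source branch's history.
        self.branches[name] = list(self.branches[source])

    def delete_branch(self, name):
        del self.branches[name]

    def merge(self, source, target):
        # Fast-forward only: append the commits the target has not seen yet.
        seen = len(self.branches[target])
        self.branches[target].extend(self.branches[source][seen:])


def pull(central):
    """The 'check out': copy the whole central history into a private repository."""
    return copy.deepcopy(central)


def push(local, central, branch="main"):
    """The 'commit to the server': publish local commits the central repo lacks."""
    seen = len(central.branches[branch])
    central.branches[branch].extend(local.branches[branch][seen:])


central = Repo()
central.commit("main", "initial import")

local = pull(central)
local.branch("main", "experiment")
local.commit("experiment", "try a risky rewrite")
local.delete_branch("experiment")   # decided it was stupid: the server never saw it

local.branch("main", "feature")
local.commit("feature", "add a useful feature")
local.merge("feature", "main")
push(local, central)                # decided it was a good idea: publish it

print(central.branches["main"])
```

The point of the sketch is that the abandoned experiment never touches the central repository, while the finished feature arrives there in one push.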
Re:Well *I'm* ugly and stupid... (Score:4, Informative)
Re:Well *I'm* ugly and stupid... (Score:5, Informative)
A few months ago I switched to git. Git seems like the winner - it's fast, modular, and many people are hacking on it and have written many cool tools (most of which are "built-in" git "commands.") However, its Windows support lags behind the other front-runner Mercurial. Darcs is mostly used by Haskell hackers, Monotone never seemed to really take off, and Codeville has died on the vine.
The good thing is you can switch because there are migration tools for almost every one and the histories tend to be isomorphic.
Re:Well *I'm* ugly and stupid... (Score:5, Funny)
Of course a CVS user only updates once a week and checks in once a month, so being on the beach for a few days wouldn't make any difference at all.
I have used Subversion, git, and most recently CVS, and the only big risks I've taken have been with CVS, where everything is so constrictive and painful that I tend to check in as little as possible. The bottom line is that whatever makes for the easiest, most natural development process will result in more frequent check-ins and less lost work.
(I've just stopped asking my colleagues, "How do I ___ in CVS?" because the answer is always slack-jawed silence, followed by, "Why would you want to do that?" accompanied with a suspicious squint-eyed stare that makes me feel like I'm in Deliverance, right there in a cube farm full of college-educated yuppies. CVS warps your brain to the point where you don't think there is ever any good reason to, say, rename a directory, and anyone who wants to rename a directory must be some kind of alien, possibly a marketer or a salesman who wandered into the wrong department, because a Real Programmer would never think up such a bizarre idea as renaming a directory to reflect its current contents. I mean, you pick a name, and it stays forever, right, like a street name! You don't go to Market Street and expect to find a market, so why are you surprised to find the networking code in the tpe_bckp directory? Gary Graybeard can tell you all about how it got that way, and it's a fascinating story. Think of all the rich history that would disappear if you renamed it the "net" or "networking" directory. So depressingly literal. And speaking of depressingly literal, the history would *literally disappear*, and the whole reason we have a SCM system is so we don't lose history. So don't go making changes that it doesn't know how to track, you hear?)
Re:Well *I'm* ugly and stupid... (Score:4, Insightful)
Subversion is good for small projects, or larger projects with a limited number of developers.
Once you get into the hundreds and thousands of developers working on the same project though you need to think a bit differently in terms of needs of the individual developer, and the group as a whole.
Re:Well *I'm* ugly and stupid... (Score:5, Insightful)
Correct me if I'm wrong, but isn't this the major selling point of distributed revision control? The idea being that since it is a distributed repository, everyone has a "backup" of someone else's repository (depending where they got their code from). No distributed copy is necessarily considered more important than another. However in a corporate environment I would imagine it works out quite well since there's an inherent hierarchy. Those "higher up" can pull changes from those "below". Those "higher" repositories you could (and probably should) back up.
As far as integrity goes, I think one of the main goals of both Mercurial and Git was to protect against corruption (using a SHA1 hash). You're much more likely to get corruption through CVS and SVN, which is awful considering it's in a central location.
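To make the SHA1 point concrete, here is a minimal sketch of how Git names a file's contents: it hashes a short header plus the bytes, so any later change to the bytes no longer matches the recorded ID. The helper names `blob_id` and `is_intact` are invented for this example, and Mercurial's hashing scheme differs in its details.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to file contents (a "blob")."""
    # Git hashes the header "blob <size>\0" followed by the raw contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def is_intact(data: bytes, recorded_id: str) -> bool:
    """True if the stored bytes still hash to the ID recorded for them."""
    return blob_id(data) == recorded_id


original = b"int main(void) { return 0; }\n"
recorded = blob_id(original)

corrupted = original.replace(b"0", b"1")   # simulate a damaged byte on disk

print(is_intact(original, recorded))
print(is_intact(corrupted, recorded))
```

Because the ID is derived from the contents, a damaged object is detected the next time it is hashed, rather than being silently served to everyone who checks out.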
Re:Well *I'm* ugly and stupid... (Score:5, Insightful)
By backup - I mean a tape or location where I know I can look to find the "good" copy that contains the official tree of code that represents what is going into my product. What you are describing is copies of repositories sitting in various locations, which isn't really the same as a backup. It's also a bit upside-down - I don't want to be "pulling" fixes from engineers, I want engineers "pushing" fixes into a known-good integration environment.
By integrity - I mean ensuring that you have all of the fixes you want to have from everyone who should be making changes on a project. NOT file corruption.
Re:Well *I'm* ugly and stupid... (Score:5, Insightful)
In a distributed environment usually there's someone's (or a group's) repository that's considered more important than others. In a software setting this could be a Lead Engineer's/QA/Certification's repository. Depending on what your definition of the "good" repository is, you would take the copy from the right place. It opens up in terms of flexibility what code you actually want to get to work with. The upcoming released version of your software from QA, the next-generation stuff that developers are working on, or maybe a new feature that you hear so-and-so is working on...
But you have someone who needs to approve a change to a central repository that everyone shares. Right? That person would probably want to examine those changes before they're committed. The only difference between distributed and centralized, in this case, is that it's a required step. Everyone is responsible for their own repository.
I'm no expert on distributed revision control, so anyone please feel free to correct me.
Distributed VCS can be used like this (Score:4, Insightful)
Re:Distributed VCS can be used like this (Score:5, Insightful)
How do you force your cvs/svn users to commit? You can't, you expect them to be responsible and do it. This isn't much different from a DVCS.
What if a user wants his work to be backed up but doesn't want to commit because his changes are not ready to be published? A centralized VCS forces them to commit with the side-effect of making their unfinished work immediately visible in the central repo, while a DVCS lets them commit to a private repo that you can back up independently.
Your backup requirements can be solved 2 different ways:
Besides, in this debate, you are completely ignoring the other major advantages of DVCS over centralized ones: scalability, no single point of failure, possibility to work offline and have full access to all of the features of your VCS, usually faster than centralized VCS, low-cost branching/merging, etc.
Git vs Subversion (Score:5, Informative)
1. timestamps. Subversion doesn't do that by default, but it has good enough metadata support that timestamps can be hacked in easily. For Git, metastore is nearly worthless. If you store a program source, you risk just skewed builds -- for other types of data, lack of timestamps is often a deal breaker.
2. move tracking: trying to move a directory branch from one dir to another means you lose history. Rename tracking works only sometimes, often it will take a random file from somewhere else, etc.
3. large files. Take a 100MB binary file into SVN, change one byte, commit. Change one byte again. And again. Git will waste the freaking 100MB for every single commit.
4. partial checkouts. If there's a 5GB repository, you'll often want to check out just a single dir. With Git, there is no such option.
5. ease of use. I see that ordinary users, web monkeys and so on can learn SVN easily; with Git, even I had some troubles initially.
On the other hand, SVN used to have no merge tracking (I wonder what that "limited merge tracking" in 1.5 means...) which for distributed development of program sources is worse than points 1..5 together.
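Point 3 above is about the cost of storing each revision of a large binary in full versus storing only what changed. The following sketch illustrates that difference with a deliberately naive delta format; `make_delta` and `apply_delta` are hypothetical helpers, the file is scaled down to 1MB, and nothing here reflects the actual storage format of Subversion or Git.

```python
def make_delta(old, new):
    """List (offset, new_byte) for every position where two equal-length blobs differ."""
    if len(old) != len(new):
        raise ValueError("this toy delta only handles in-place edits")
    return [(i, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


def apply_delta(old, delta):
    """Rebuild the newer blob from the older one plus its delta."""
    patched = bytearray(old)
    for offset, value in delta:
        patched[offset] = value
    return bytes(patched)


base = bytes(1_000_000)          # stand-in for the large binary, all zero bytes
revisions = [base]
deltas = []
for offset in (10, 20, 30):      # three commits, one changed byte each
    newer = bytearray(revisions[-1])
    newer[offset] = 0xFF
    newer = bytes(newer)
    deltas.append(make_delta(revisions[-1], newer))
    revisions.append(newer)

# Bytes needed if every commit stores a complete copy of the file.
full_copy_cost = sum(len(r) for r in revisions[1:])
# Number of (offset, byte) entries needed if every commit stores only its changes.
delta_cost = sum(len(d) for d in deltas)

print(full_copy_cost, delta_cost)
```

Whether a given tool actually pays the full-copy cost depends on how and when it compresses its history, so treat the numbers as an illustration of the argument rather than a measurement of either system.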
Re:Git vs Subversion (Score:5, Insightful)
To commit a change to the Linux kernel, you do need to build the whole thing. That's a monolithic thing.
To commit a change to a webpage, a graphical project, a set of biochem data, you don't need that. Do you need to check out the countless megs of Wesnoth to update your changes to a campaign? That's a modular thing.
Re:Git vs Subversion (Score:5, Informative)
As a matter of fact, a guy doing a demo at an Apple Developer conference once used the svn codebase as 'something big to compile' when demonstrating the XCode IDE. When we asked why he used svn, he said that it was "the only open source codebase he'd ever seen which compiles with no warnings."
If you have specific criticisms about the codebase, we'd like to hear. Instead, your post just seems to be about how your personal wish-list of features has never been added, and therefore "the codebase must be really bad." I'm not sensing any logic to this conclusion.
svn 1.6 is going to likely have
The fact is: we haven't added your pet features yet because we've been too busy working on other big fish, like FSFS, svnserve, locking, svnsync, SASL support, changelists, interactive conflict resolution, and oh yeah... automatic tracking of merges.
The working copy code was designed in a specific way -- the scattered
Depends on the environment (Score:5, Insightful)
If you're in a highly-distributed development environment like Linux, where the developers are spread across multiple continents and have very little shared infrastructure and a high need to work independently of each other (either because of preference or because they don't want their work stalled by another undersea cable cut half a world away), then yes using a centralized VCS like Subversion is stupid. But if you're a developer on a project where all the developers are in a common location sharing common infrastructure, often literally within speaking distance of each other, then a decentralized VCS like Git is stupid. It's harder to maintain and, in that situation, yields none of the offsetting benefits.
Analogy: a fleet of Chevy vans vs. a freight train. The vans are far more flexible, they can travel any route needed whereas the freight train's limited to fixed tracks, and their smaller size and lower cost each let you buy a lot of them and dedicate each one to just a few deliveries in a particular area without a lot of overhead. You can fan the vans out all over the city, sending just what you need where it's needed and rerouting each one to adapt to changes without upsetting the others. But if your only delivery each day is 1000 tons of a single product from one warehouse to another 600 miles away, you're better off with that one big freight train.
Linus has a big mouth... (Score:5, Insightful)
Re:Linus has a big mouth... (Score:4, Insightful)
He's an excellent assembly hacker, a fast learner, and at least a majority of the time a nice guy, so most people overlook it.
Don't knock it till you try it (Score:5, Insightful)
Most of the people knocking DVCS or saying they can't see the benefits haven't actually used them on any projects. They have built up a framework in their minds of How Things Should Work, but unfortunately that model was defined by the limitations of their tools.
we use SVN (Score:5, Interesting)
SVN is currently integrated with our IDEs (all 3), one of the main selling points in choosing a VCS.
Ease of backups:
We archive our repositories every day, IT loves being able to simply tgz the SVN directory and not have to worry about anything else, regardless of the state of any current projects (all groups use SVN).
Simplicity:
SVN/Trac training (client use, management, backend workings) takes less than 10 minutes. In another 15 minutes I can have someone setting up their own SVN repositories+Trac, without needing to pull up a single reference document, primarily because the SVN setup methodology is trivial to memorize.
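The nightly "tgz the SVN directory" step mentioned under "Ease of backups" might look roughly like this. It is a sketch under assumptions: the `archive_repository` helper and the directory names are invented, the demo runs against a throwaway directory rather than a real repository, and it assumes no commit is in progress while the archive is written (Subversion ships `svnadmin hotcopy` for making a consistent copy of a live repository).

```python
import tarfile
import tempfile
import time
from pathlib import Path


def archive_repository(repo_dir: Path, backup_dir: Path) -> Path:
    """Write repo_dir into a dated .tgz under backup_dir and return the archive path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y-%m-%d")
    archive = backup_dir / f"{repo_dir.name}-{stamp}.tgz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps archive paths relative to the repository's own name.
        tar.add(str(repo_dir), arcname=repo_dir.name)
    return archive


# Demo against a throwaway directory standing in for a repository.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "project"
    (repo / "db").mkdir(parents=True)
    (repo / "db" / "current").write_text("42\n")

    archive_path = archive_repository(repo, Path(tmp) / "backups")
    archive_existed = archive_path.exists()
    with tarfile.open(str(archive_path)) as tar:
        names = tar.getnames()

print(names)
```

In practice the function would be pointed at the real repository directory from a scheduled job, with the backup directory on separate storage.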
SVN's weaknesses (Score:4, Interesting)
- Merges in a typical environment become effectively anonymous. Let's say you have a build manager and individual developers working on different changes in parallel. The build manager can't merge the changes without those changes taking on his identity, that is, all identifying information about the originator of the changes is lost.
- So-called "best practice" for SVN branching means building new branches with every new release. That is, it's not recommended to build one branch and merge changes from the trunk into it as you're incrementally changing things on that branch, noooo. You have to keep polluting the repository with needless hair by making new branches every week, and sometimes, multiple ones per day.
These are just two I'm aware of that bite us in the ass on a regular basis. The first issue is supposed to be fixed in one of the near-term mods to SVN, but the fact that the second even exists tells me that the guys developing SVN don't really work in the same world as a lot of the bigger commercial development environments do.
Re:SVN's weaknesses (Score:5, Informative)
Since I'm sure you're not talking about what svn blame gives you, what do you mean exactly?
Umm, says who? That's exactly what we do. We have /trunk and /branches/devel. When one of us gets a particularly stable version of /branches/devel ready, we merge it to /trunk.
Have to? No way. But since branches are basically free, why would you want to avoid them?
We use them for experimental "what-if" branches, like "I wonder what would happen if I ported our application from MySQL to SQLite". You copy "/branches/devel" to "/branches/sqlite" and hack away. If it works, merge your changes back to devel. If it bombs, just delete the branch.
My goal regarding the future of Subversion... (Score:4, Insightful)
helloooo merge tracking (Score:5, Informative)
http://blogs.open.collab.net/svn/2007/09/what-subversion.html [collab.net]
BTW, they did a really nice job of mapping out the use cases and whatnot before implementing the feature. I guess source control people are natural planners.
http://subversion.tigris.org/merge-tracking/requirements.html [tigris.org]
Anyway, I'm sure the world will continue to have need for both distributed and client/server source control systems, and Subversion is a nice example of the latter.