[Mono-devel-list] Analyzing Subversion logs

Wed Jan 12 09:40:00 EST 2005

> Sorry for bothering you with these questions but I am working on my
> degree thesis which deals with Subversion repositories analysis. Now I

That sounds interesting. What exactly are you trying to dig out?

> For example, it happened more than 150 times that "miguel" performed a
> commit less than 10 seconds after his previous commit. I am trying to
> understand if this can be caused (when the repository was CVS), by
> problems during the commits (i.e. network problems). I suppose that it
> could happen that when you try to perform a commit using an IDE plugin
> or some evolved tool instead of the command line, and a problem occurs,
> the tool tries to complete the commit automatically generating, indeed,
> a new commit on CVS. This should not happen in the Subversion era.
> I had been told that the migration from CVS to Subversion happened on
> 11/11/2004, so I filtered the log to exclude CVS generated log entries.
> Unfortunately this did not result in "strange" commits being excluded.

Even with svn, some people are not using changesets to do commits. Paolo
posted some rationale for this to this list one or two months ago.

Sometimes, people realize that they forgot to add a file, so that can
result in a commit really fast. Also, people can have unrelated changes
that they don't want to commit as one patch.

There are also times when you want to commit something in mcs/ and in
mono/, and you can't do that nicely with svn, so you get two commits.

cvs2svn should have made changesets out of these commits: in the world of
cvs, there is no such thing as two files modified by one `change', so all
the changesets are made by cvs2svn.

Are any of the 10-second commits from before the cvs migration?
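
One way to check that, as a rough sketch only: the input is assumed to be a
list of (author, datetime) pairs that you have already extracted from the
log, the function names are made up for illustration, and the migration is
assumed to have happened at midnight on the date you quoted.

```python
from datetime import datetime, timedelta

# Migration date as quoted in this thread; the time of day is an assumption.
MIGRATION = datetime(2004, 11, 11)
# "Less than 10 seconds after his previous commit."
WINDOW = timedelta(seconds=10)


def rapid_commits(commits):
    """Return the (author, when) pairs committed less than WINDOW after
    the same author's previous commit.

    commits: iterable of (author, datetime) pairs, in any order.
    """
    last_seen = {}  # author -> time of that author's previous commit
    rapid = []
    for author, when in sorted(commits, key=lambda c: c[1]):
        prev = last_seen.get(author)
        if prev is not None and when - prev < WINDOW:
            rapid.append((author, when))
        last_seen[author] = when
    return rapid


def split_by_migration(rapid):
    """Split rapid commits into those before and those on/after MIGRATION."""
    before = [c for c in rapid if c[1] < MIGRATION]
    after = [c for c in rapid if c[1] >= MIGRATION]
    return before, after
```

If the `before' list is much longer than the `after' one, most of the quick
commits come from the CVS history rather than from how people use svn today.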

One filter you might try is looking for commit messages like
`oops', `duh', `forgot to add this'. Most of our commits are done with a
changelog-style message, and a good test for one of those is that the log
contains an `*'. The test is not perfect, though. For example, Miguel often
commits with the message `flush' when he updates the web page portion of
things.
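
A minimal sketch of that filter, in case it helps. The keyword list is just
the examples from this mail, and the function name and the three labels are
made up for illustration; you would want to tune both against the real log.

```python
import re

# Hypothetical keyword list, taken only from the examples in this mail.
FIXUP_PATTERN = re.compile(r"\b(oops|duh|forgot)\b", re.IGNORECASE)


def classify_message(message):
    """Label one commit message as 'changelog', 'fixup', or 'other'."""
    # Changelog-style messages normally contain a `*' entry marker.
    if "*" in message:
        return "changelog"
    # Short follow-up commits tend to use words like these.
    if FIXUP_PATTERN.search(message):
        return "fixup"
    # Everything else, e.g. a bare `flush', needs a manual look.
    return "other"
```

A message like `flush' ends up in the `other' bucket, which is exactly the
kind of case the `*' test alone cannot sort out.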

>  - how do you usually perform a commit? Do you use the command line?
> RapidSVN? TortoiseSVN? Do these tools fragment a single commit into more
> commits? Why?

Most people on Linux are using the command line.

-- Ben