[Mono-devel-list] Analyzing Subversion logs

Paolo Molaro lupus at ximian.com
Wed Jan 12 11:28:34 EST 2005

On 01/12/05 Paolo Marini wrote:
> Sorry for bothering you with these questions but I am working on my degree
> thesis which deals with Subversion repositories analysis. Now I am trying to
> analyse the mono project to extract some information but I have found that
> an important number of commits appear really close to each other.
> For example, it happened more than 150 times that "miguel" performed a
> commit less than 10 seconds after his previous commit. I am trying to

It depends on the specific files. I think this may happen for two reasons:
*) importing source code from somewhere else: miguel did most of it
and it may happen that cvs recorded different times as the import proceeded
*) some time ago we had some file system corruption on the cvs server
and of course something was wrong with the backup:-) We had the
(mostly) last version of the files, but we lost the history for
quite a few changes in mcs/. To try and get back the history information,
I wrote a perl script that grabbed the changes from the commit list
or something like that and recreated the missing history: this may have
resulted in a large number of changes being committed in fast succession.
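
To spot bursts like the two cases above in a commit log, something along
these lines might do. It is only a rough sketch: the function name, the
(author, timestamp) tuple layout and the sample data are made up for
illustration and are not part of any cvs or subversion tool; the 10 second
threshold is the one from the question.

```python
from datetime import datetime, timedelta


def find_rapid_commits(commits, max_gap_seconds=10):
    """Return (author, previous_time, time) for every commit made less than
    max_gap_seconds after the same author's previous commit.

    `commits` is an iterable of (author, datetime) pairs (assumed layout).
    """
    max_gap = timedelta(seconds=max_gap_seconds)
    last_seen = {}  # author -> time of that author's most recent commit
    rapid = []
    # Process commits in chronological order, whatever order they arrive in.
    for author, when in sorted(commits, key=lambda c: c[1]):
        prev = last_seen.get(author)
        if prev is not None and (when - prev) < max_gap:
            rapid.append((author, prev, when))
        last_seen[author] = when
    return rapid


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the mono repository.
    sample = [
        ("miguel", datetime(2003, 1, 1, 12, 0, 0)),
        ("miguel", datetime(2003, 1, 1, 12, 0, 5)),
        ("lupus", datetime(2003, 1, 1, 12, 0, 6)),
        ("miguel", datetime(2003, 1, 1, 12, 1, 0)),
    ]
    for author, prev, when in find_rapid_commits(sample):
        print(author, prev.isoformat(), when.isoformat())
```

Commits flagged this way are candidates for import or history-recreation
artifacts and probably need a look at the files involved before they are
counted as normal activity.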

I guess that, if you are interested in analyzing subversion repositories,
you should use the info from the repo only starting from the switch to subversion.
The info extracted from cvs also has a number of issues with skewed data,
resulting from imports and the like. For example, take the mini/ directory
(had to use cvs annotate on the cvs repo, since svn annotate is unusable):
miguel appears to have written 24K lines (of a total of 80K), but the
lines he wrote in the jit are at best a few hundred. This happens because
he imported the mini sources in cvs: there are many other cases like this.
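
Restricting an analysis to commits made after the move to subversion could
look roughly like the sketch below. The function name and the tuple layout
are assumptions for illustration, and the cutoff date has to be supplied by
whoever runs the analysis: the dates in the example are placeholders, not
the real migration date.

```python
from datetime import datetime


def commits_since(commits, cutoff):
    """Keep only the (author, datetime) pairs made at or after `cutoff`.

    `cutoff` stands for the point where the repository history becomes
    trustworthy; its value is an input, not something known to this code.
    """
    return [(author, when) for author, when in commits if when >= cutoff]


if __name__ == "__main__":
    # Placeholder data and placeholder cutoff, purely for illustration.
    sample = [
        ("miguel", datetime(2002, 6, 1, 10, 0, 0)),
        ("lupus", datetime(2005, 1, 10, 9, 30, 0)),
    ]
    placeholder_cutoff = datetime(2004, 1, 1)
    print(commits_since(sample, placeholder_cutoff))
```

This only drops the older entries; it does nothing about line counts from
annotate, which still credit imported sources to whoever imported them.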


lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better
