Re: [eigen] Bitbucket migration


Holy cow this looks impressive!

On Tue, Sep 17, 2019 at 2:31 PM Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:

Thanks to Joseph's work, and after fighting with's super aggressive spam filter, I finally managed to get the following project as a demo of what could be the outcome of a full migration:

This project includes:
 - a git repo with bug ids, commit hashes, and pull-request links updated in the commit messages.
 - a migration of all our bugzilla entries and comments, with the same kind of updates as in the commit messages.

Some examples:
- attachments, links to commits and bugs:
- link to PR:
- migration of our "3.x" bug entries as milestones:
- link to PR from commits:
- link to bug from commits:

The bugzilla to gitlab migration script is there: (adapted from

PS1: If you notice many extra newlines in issue descriptions/comments, that's normal; I have already fixed this shortcoming.

What do you think?


On Wed, Sep 11, 2019 at 6:03 PM Gael Guennebaud <gael.guennebaud@xxxxxxxxx> wrote:
To prepare the migration from bitbucket, I started to play a bit with its API to see what could be done. So far I've quickly drafted two (ugly) python scripts to archive the forks and pull-requests. Since this is a one-shot job for us, I did not care about robustness, safety, generality, beauty, etc.

You can see them there : and contribute!

** Forks **

You can see the summary of the fork script there:

The hg clones (history+checkout) represent 20GB, maybe 12GB if we remove the checkouts. Among the 460 forks, 214 seem to have no change at all (according to "hg out") and could be dropped. I don't know yet where to host them though.
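For illustration, the "hg out" check could be sketched as follows. This is not the fork script from this thread, only a minimal sketch under assumptions: the function name, the fork directory and the upstream URL are hypothetical, and it relies on "hg outgoing" exiting with 0 when there are outgoing changesets and 1 when there are none.

```python
import subprocess


def fork_has_changes(fork_dir, upstream_url, run=subprocess.run):
    """Return True if the clone in fork_dir has changesets missing upstream.

    `run` is injectable so the logic can be exercised without Mercurial.
    """
    result = run(
        ["hg", "outgoing", "--quiet", upstream_url],
        cwd=fork_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return True   # at least one outgoing changeset
    if result.returncode == 1:
        return False  # nothing to push: the fork could be dropped
    raise RuntimeError(
        f"hg outgoing failed in {fork_dir} (exit code {result.returncode})"
    )
```

Forks for which this returns False would be the candidates for dropping.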

This script can be run incrementally.

** Pull-Requests **

You can find the output of the pull-requests script there:

There is a short summary, and then for each PR a static .html file plus diff/patch files, and other details. For instance, see:

Currently this script cannot be run incrementally. You have to run it just before closing the respective repository!

Also, this script does not grab inline comments. Only the main discussion is archived. The inline comments could be obtained by iterating over the "activity" pages, but I don't think that's worth the effort because they would be difficult to exploit anyway.
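For reference, iterating over the "activity" pages might look like the sketch below. It is not part of the pull-requests script. It assumes the Bitbucket Cloud 2.0 convention where each page is a JSON object with a "values" list and an optional "next" URL, and it assumes that inline comments appear as activity entries whose "comment" object has an "inline" field. The function names are hypothetical.

```python
import json
import urllib.request


def fetch_json(url):
    """Download one page and decode it as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_pages(first_url, fetch=fetch_json):
    """Yield every entry of a paginated listing, following "next" links."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("values", [])
        url = page.get("next")  # absent on the last page


def inline_comments(first_url, fetch=fetch_json):
    """Yield only the activity entries that look like inline comments."""
    for entry in iter_pages(first_url, fetch):
        comment = entry.get("comment")
        if comment and comment.get("inline"):
            yield comment
```

The `fetch` parameter makes it possible to replay saved pages instead of hitting the API again.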

** hg to git **

As discussed in the other thread, if we switch from hg to git, then all hashes will have to be updated. Generating a map file is easy, and thus updating the links/hashes in bug comments and PR comments should not be too difficult (we only have to figure out the right regex to catch all variants).
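To make the idea concrete, here is a minimal sketch of a map-file-driven replacement. Everything in it is an assumption for illustration: the map-file format (one "<hg-hash> <git-hash>" pair per line), the function names, and the regex, which only catches bare hexadecimal tokens of 7 to 40 characters and would certainly need extending to cover all link variants.

```python
import re

# Bare hex tokens of 7..40 characters, delimited by word boundaries.
HASH_RE = re.compile(r"\b[0-9a-fA-F]{7,40}\b")


def load_hash_map(lines):
    """Parse lines of the (assumed) form "<hg-hash> <git-hash>"."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0].lower()] = parts[1].lower()
    return mapping


def rewrite_hashes(text, mapping):
    """Replace hg hashes (full or abbreviated) by the matching git hash.

    Tokens that match no hg hash, or more than one, are left untouched.
    """
    def lookup(match):
        token = match.group(0).lower()
        hits = {git for hg, git in mapping.items() if hg.startswith(token)}
        if len(hits) != 1:
            return match.group(0)
        # Keep the abbreviation length used in the original text.
        return next(iter(hits))[:len(token)]

    return HASH_RE.sub(lookup, text)
```

Leaving unknown tokens alone matters because plain decimal numbers, such as long bug ids, also look like hex strings.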

However, updating the hashes within the commit messages will require rewriting the whole history in a careful order. Does anyone here feel brave enough to write such a script? If not, I guess we could live with an online php script doing the hash conversion on demand. I don't think we'll have to follow such hashes so frequently.
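One possible reading of "careful order" is parents before children: a commit's new hash depends on its rewritten message and on its parents' new hashes, so a referenced commit must be rewritten before any commit that mentions it. The toy sketch below illustrates only that ordering constraint. It does not touch a real repository, the commit id is a stand-in (SHA-1 over parents plus message, not the real git object hash), and all names are hypothetical.

```python
import hashlib
import re

HASH_RE = re.compile(r"\b[0-9a-f]{12,40}\b")


def toy_commit_id(parents, message):
    """Stand-in for a VCS commit hash: depends on parents and message."""
    data = "\n".join(list(parents) + [message]).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def rewrite_history(commits):
    """Rewrite hashes quoted in commit messages.

    `commits` is a list of (old_id, old_parent_ids, message) tuples that
    must already be sorted with parents before children.
    Returns (old_to_new, rewritten_commits).
    """
    old_to_new = {}
    rewritten = []
    for old_id, parents, message in commits:
        def swap(match):
            token = match.group(0)
            hits = [new for old, new in old_to_new.items()
                    if old.startswith(token)]
            # Unknown, ambiguous, or not-yet-rewritten hashes stay as-is.
            return hits[0][:len(token)] if len(hits) == 1 else token

        new_message = HASH_RE.sub(swap, message)
        new_parents = [old_to_new[p] for p in parents]
        new_id = toy_commit_id(new_parents, new_message)
        old_to_new[old_id] = new_id
        rewritten.append((new_id, new_parents, new_message))
    return old_to_new, rewritten
```

A message that quotes a commit appearing later in the order cannot be resolved in a single pass like this, which is part of what makes the real task delicate.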

