Re: [eigen] Bitbucket migration





On Thu, Sep 12, 2019 at 9:53 AM Joseph Mirabel <joseph.mirabel@xxxxxxxx> wrote:

To summarize what is missing:

- there are a lot of "backporting rev[0-9]+" references that are not found. I don't know what "backporting" means. I just know that "hg log --rev N" returns "unknown revision" for all the ones I tested.


$ hg log -v  | grep "backporting"  does not give me that many:

backporting 1784: fixed bug #62

-> For this one "hg log -r 1784" tells me that the true hash is 0430786c2a6f (but we don't care if we lost one link!)

The following are very old Subversion references, which we don't care about at all:

backporting 964177 (gcc 3.3 fix)
backporting r964165 (gcc 3.3 fixes)
backporting rev 951682 (compilation fix in aligned allocator)
backporting rev 918446: fix MSVC internal compilation error
backporting rev925153 (bugfix in MapBase::coeffRef(int) )
backporting commit 918468 (fix MSVC internal error)
 

- I do not catch ranges of revision numbers like "[0-9]+-[0-9]+". It wouldn't be hard to add.

-> I could not find any meaningful one.

I only found two meaningful "([0-9]+:)([0-9]+)" occurrences like: 6089:76b6c62565a6. In this case, if \2 is a valid hash, then \1 (e.g., "6089:") should be dropped to enable auto-link creation, see: https://github.com/jmirabel/eigen_tmp/commit/4325f05a4b60

But that's really no big deal as the true hash has been properly updated, and there are only 2 occurrences!
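The prefix-dropping rule described above could be sketched roughly as follows. This is only an illustration, not the code used in the conversion: the function name `drop_rev_prefix` and the `known_hashes` set are hypothetical, and the pattern assumes 12-digit hexadecimal short hashes rather than the digit-only "\2" quoted above.

```python
import re

# "rev:hash" pairs as printed by hg, e.g. "6089:76b6c62565a6".
# Assumption: the hash part is a 12-character lowercase hex short hash.
REV_HASH = re.compile(r"\b([0-9]+:)([0-9a-f]{12})\b")

def drop_rev_prefix(message, known_hashes):
    """Drop the "N:" prefix when the part after the colon is a known hash."""
    def repl(match):
        if match.group(2) in known_hashes:
            return match.group(2)   # keep only the hash so it can auto-link
        return match.group(0)       # unknown hash: leave the text untouched
    return REV_HASH.sub(repl, message)
```

For example, `drop_rev_prefix("see 6089:76b6c62565a6", {"76b6c62565a6"})` keeps only the hash, while the same call with an empty set leaves the message unchanged.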

- what about unnamed hg heads? Should we drop them? If not, I would appreciate it if someone who knows Mercurial could name them (with a name valid for both hg and git).

I guess this issue is related to changeset 59a7e404a93c, which is an empty commit closing a branch. If your script ignores it without additional issue, then that's fine.

So for me this is all good, I would only care about updating references to bugs/PR by applying the following substitutions:

"([Bb]ug) (\d+)" -> "\1 #\2"
"bugs (\d+) and (\d+)" -> "bugs #\1 and #\2"
"http://eigen.tuxfamily.org/bz/show_bug.cgi\?id=(\d+)" -> "#\1"
"[Pp]ull [Rr]equest #(\d+)" -> "pull request PR-\1"
"PR #(\d+)" -> "PR-\1"
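
As a rough sketch, the substitutions listed above could be applied to a commit message with Python's `re.sub`. The names `RULES` and `rewrite_references` are made up for this illustration and are not part of the convert_references plugin. The first rule is written here as "\1 #\2", on the assumption that the intent is to turn "bug 62" into "bug #62"; the dots in the URL pattern are escaped, and the URL rule is applied first, both of which are choices of this sketch.

```python
import re

# (pattern, replacement) pairs, applied in order.
RULES = [
    (r"http://eigen\.tuxfamily\.org/bz/show_bug\.cgi\?id=(\d+)", r"#\1"),
    (r"bugs (\d+) and (\d+)", r"bugs #\1 and #\2"),
    (r"([Bb]ug) (\d+)", r"\1 #\2"),  # assumed intent: "bug 62" -> "bug #62"
    (r"[Pp]ull [Rr]equest #(\d+)", r"pull request PR-\1"),
    (r"PR #(\d+)", r"PR-\1"),
]

def rewrite_references(message):
    """Apply every substitution rule, in order, to one commit message."""
    for pattern, replacement in RULES:
        message = re.sub(pattern, replacement, message)
    return message
```

With these rules, `rewrite_references("fixed bug 62")` gives "fixed bug #62", and a message that already says "bug #62" is left as it is.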


gael
 


Joseph


Le 12/09/2019 à 00:37, Gael Guennebaud a écrit :


On Wed, Sep 11, 2019 at 7:38 PM Joseph Mirabel <joseph.mirabel@xxxxxxx> wrote:
Dear Eigen developers,

- I can convert all references to revisions or Mercurial hashes that
follow the regex in [2].

I'll look at it more carefully later, but... wow!! this looks very promising: https://github.com/jmirabel/eigen_tmp/commit/6e53e31dc2d79da
 
For comparison, the same commit in our official git-mirror: https://github.com/eigenteam/eigen-git-mirror/commit/6ba9310bc2c168

gael


- I did not try to convert URLs although it should not be hard.

- I manually edited the author file [5] so that they would fit git
author format. If you find yourself in the list and want to update it,
you can contact me.


It should not be hard to add more rules to the plugin convert_references
if anyone feels like doing it.


Best,

Joseph


[1] https://github.com/frej/fast-export.git

[2]
https://github.com/jmirabel/fast-export/blob/1fdc76e0626acd6adfcc0d900d14f36b459c4798/plugins/convert_references/__init__.py#L16

[3] https://github.com/jmirabel/fast-export

[4] https://github.com/jmirabel/eigen_tmp.git

[5]
https://github.com/jmirabel/fast-export/blob/master/eigen/authors_reworked



Le 11/09/2019 à 18:03, Gael Guennebaud a écrit :
> To prepare the migration from bitbucket, I started to play a bit with
> its API to see what could be done. So far I've quickly drafted two
> (ugly) python scripts to archive the forks and pull-requests. Since
> this is a one shot for us, I did not care about robustness, safety,
> generality, beauty, etc.
>
> You can see them there
> : https://gitlab.com/ggael/bitbucket-migration-tools and contribute!
>
> ** Forks **
>
> You can see the summary of the fork script
> there: http://manao.inria.fr/eigen_tmp/archive_forks_log.html
>
> The hg clones (history+checkout) represent 20GB, maybe 12GB if we
> remove the checkouts. Among the 460 forks, 214 seem to have no change
> at all (according to "hg out") and could be dropped. I don't know yet
> where to host them though.
>
> This script can be run incrementally.
>
>
> ** Pull-Requests **
>
> You can find the output of the pull-requests script
> there: http://manao.inria.fr/eigen_tmp/pullrequests/
>
> There is a short summary, and then for each PR a static .html file
> plus diff/patch files, and other details. For instance,
> see: http://manao.inria.fr/eigen_tmp/pullrequests/OPEN/686/pr686.html
>
> Currently this script cannot be run incrementally. You have to run it
> just before closing the respective repository!
>
> Also, this script does not grab inline comments. Only the main
> discussion is archived. Inline comments can be obtained by iterating
> over the "activity" pages, but I don't think that's worth the effort
> because they would be difficult to exploit anyway.
>
>
> ** hg to git **
>
> As discussed in the other thread, if we switch from hg to git, then
> all hashes will have to be updated. Generating a map file is easy, and
> thus updating the links/hashes in bug comments and PR comments should
> not be too difficult (we only have to figure out the right regex to
> catch all variants).
>
> However, updating the hashes within the commit messages will require
> rewriting the whole history in a careful order. Does anyone here
> feel brave enough to write such a script? If not, I guess we could
> live with an online php script doing the hash conversion on demand. I
> don't think we'll have to follow such hashes so frequently.
>
> cheers,
> gael
>
>
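
The map-file idea in the "hg to git" part above could look roughly like the sketch below, for updating hashes in bug or PR comments. Everything here is an assumption made for illustration: the map-file format (one "<hg_hash> <git_hash>" pair per line), the helper names `load_map` and `convert_hashes`, and the choice to only touch hex runs of 12 to 40 characters. It does not address rewriting commit messages in history order.

```python
import re

def load_map(lines):
    """Parse lines of the assumed form '<hg_hash> <git_hash>' into a dict."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    return mapping

# Candidate hashes: lowercase hex runs of 12 to 40 characters (assumption).
HEX_RUN = re.compile(r"\b[0-9a-f]{12,40}\b")

def convert_hashes(text, mapping):
    """Replace each hex run that is a prefix of exactly one known hg hash."""
    def repl(match):
        candidate = match.group(0)
        hits = [git for hg, git in mapping.items() if hg.startswith(candidate)]
        if len(hits) == 1:
            # Keep the same abbreviation length as the original reference.
            return hits[0][:len(candidate)]
        return candidate  # unknown or ambiguous: leave untouched
    return HEX_RUN.sub(repl, text)
```

Ambiguous or unknown hashes are deliberately left as they are, so a wrong link is never produced silently.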




