Re: [eigen] Bitbucket is dropping its Mercurial support!

Hi,

The point you missed is that especially the "grafted from" links do not include the full URL, just the hg-hash (which is different from git-hashes). And just grepping for "grafted from" gives me 425 results (in total -- if you want the log of individual branches, you need to use the `-b` option).
For a more precise count, you should grep for hexadecimal numbers longer than a few digits inside the commit messages.
I see, thanks for the explanation. 
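The suggested count of hexadecimal numbers could be sketched roughly as follows. This is only an illustration: the lower bound of 12 characters (the length of a short hg hash) is an assumption standing in for "longer than a few digits", and the function names are made up for this example.

```python
import re

# Hex runs of 12 to 40 characters. The lower bound is an assumption:
# 12 is the length of hg's short hash form, 40 the full hash length.
HASH_LIKE = re.compile(r"\b[0-9a-f]{12,40}\b")


def find_hash_candidates(message):
    """Return all hash-like tokens found in one commit message."""
    return HASH_LIKE.findall(message)


def count_hash_candidates(messages):
    """Count hash-like tokens over an iterable of commit messages."""
    return sum(len(find_hash_candidates(m)) for m in messages)
```

The messages could come from something like `hg log --template "{desc}\n"`. Some hits will be false positives (any long hexadecimal-looking word matches), so the result is an upper estimate rather than an exact number.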

I somewhat doubt that any existing hg->git converter automatically translates these hashes, but I'd be very happy if someone finds out otherwise. Changing these manually is definitely not an option.
I might have good news on this one: We are apparently not the only project working on a migration from Mercurial to Git. The OpenJDK project (a free implementation of the Java platform) has created Skara, a set of tools to handle all kinds of tasks related to contributing to OpenJDK (https://github.com/openjdk/skara). Some of the tools could be really helpful for our issues (see https://openjdk.java.net/jeps/357).

The relevant tool seems to be git-openjdk-import, which is used to import from Mercurial to Git. I only had a brief look at the code, but it seems to be very generic and does not seem to contain any OpenJDK-specific parts. The interesting part is the following paragraph from https://openjdk.java.net/jeps/357:

We've also prototyped a new tool, git-translate. This tool uses a file called .hgcommits that is generated by the conversion tools and committed to the Git repositories. This file contains a sequence of lines, each of which contains two hexadecimal hashes: the first is the hash of a Mercurial changeset and the second is the hash of the Git commit resulting from converting that Mercurial changeset. The tool git-translate simply queries the file .hgcommits
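To make the idea concrete, here is a minimal sketch of such a lookup. It assumes only the two-column format described in the paragraph above; it is not Skara's implementation, and the function names and the prefix matching for abbreviated hashes are my own additions.

```python
def parse_hgcommits(text):
    """Parse lines of the form '<hg-hash> <git-hash>' into a dict."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        hg_hash, git_hash = parts
        mapping[hg_hash] = git_hash
    return mapping


def translate(mapping, hg_ref):
    """Return the git hash for a full or abbreviated hg hash.

    Returns None if the reference is empty, unknown, or ambiguous.
    """
    if not hg_ref:
        return None
    if hg_ref in mapping:
        return mapping[hg_ref]
    # Abbreviated hashes (assumption): accept a prefix only if it is unique.
    hits = [git for hg, git in mapping.items() if hg.startswith(hg_ref)]
    return hits[0] if len(hits) == 1 else None
```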

I haven't managed to get everything to work out of the box, but I haven't tried too hard. It might even be worth opening a thread on the Skara mailing list.

However, even if we have a translation tool, this is still complicated: changing hashes or links in a commit message alters the git hash of that commit again, so the translation becomes wrong for this particular commit. This could be a problem if a commit is referenced by more than one other commit, or if commit a references commit b, which in turn references commit c.
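One possible way around this, sketched here under stated assumptions and not tested against a real converter, is to rewrite messages during the conversion itself and to process commits oldest first. A commit can only mention commits that already existed, so every referenced hash is already final when it is looked up. In the sketch, toy_git_hash is a stand-in for git's real commit hashing (which covers more than parents and message), and the commit tuples are a made-up input format.

```python
import hashlib
import re

HASH_RE = re.compile(r"\b[0-9a-f]{12,40}\b")


def toy_git_hash(parents, message):
    """Stand-in for a git commit hash: depends on parents and message."""
    data = "\n".join(parents) + "\n\n" + message
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


def convert(commits):
    """Convert commits given oldest first.

    commits: list of (hg_hash, [parent_hg_hashes], message).
    Returns a dict hg_hash -> (new_hash, rewritten_message).
    """
    hg_to_git = {}
    result = {}
    for hg_hash, parents, message in commits:

        def replace_ref(match):
            ref = match.group(0)
            # Accept full or abbreviated hg hashes, but only unique prefixes.
            hits = [g for h, g in hg_to_git.items() if h.startswith(ref)]
            if len(hits) != 1:
                return ref  # unknown or ambiguous: leave untouched
            return hits[0][:len(ref)]  # keep the original abbreviation length

        new_message = HASH_RE.sub(replace_ref, message)
        # The new hash is computed only after the message was rewritten.
        new_hash = toy_git_hash([hg_to_git[p] for p in parents], new_message)
        hg_to_git[hg_hash] = new_hash
        result[hg_hash] = (new_hash, new_message)
    return result
```

With this ordering, a commit referenced by several later commits and a chain like a -> b -> c are both handled, because each hash is entered into the table only after its own message has been rewritten. A mapping file produced by a plain conversion without rewriting would not be usable for this step.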

I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider
1. We could migrate to Tuxfamily and keep mercurial. As you said, this would imply that we have to handle pull requests separately, which is possible. As you surely know, LLVM does exactly that by using Phabricator. This would fix some of the issues above, but links to bitbucket would remain a problem. Another downside of mercurial is that only very few projects are using it, and contributing would be much easier with git.

I really don't see much difference in usability between hg and git -- both have their advantages and little quirks, IMO. And I don't think that hg was ever the main hurdle for people contributing to Eigen ...

If Phabricator allows importing our existing PRs, that would of course be a nice option. But I'm really pessimistic about that at the moment, since this also requires matching all users who made a PR or took part in the discussion to accounts on the new host (maybe that would be the only argument for staying with bitbucket).

I tried a few things regarding PRs: We can clearly get all Bitbucket PRs using its API (e.g. curl https://api.bitbucket.org/2.0/repositories/eigen/eigen/pullrequests --request GET), but such a Bitbucket PR is basically defined by its source and destination repos and doesn't seem to contain any kind of diff. The obvious problem is that not only the Eigen repo will be closed (or deleted...) but also all of its forks. To really transfer PRs we would have to migrate at least some of the forks as well, which is absolutely unrealistic.
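For reference, the curl call above returns only the first page of results. A small sketch for walking all pages, relying on the "values" and "next" fields of Bitbucket's paginated 2.0 responses; the function names are made up, and the fetch parameter exists only so the loop can be tried without network access.

```python
import json
from urllib.request import urlopen

PR_URL = "https://api.bitbucket.org/2.0/repositories/eigen/eigen/pullrequests"


def fetch_json(url):
    """Download one page and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_pullrequests(first_url=PR_URL, fetch=fetch_json):
    """Yield every pull request, following the 'next' link of each page."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("values", [])
        url = page.get("next")  # absent on the last page
```

Whether declined and merged PRs are included by default, or need an extra query parameter, is something I have not checked.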

I've also tried Phabricator and think it's a great tool, but it has major downsides: It uses a different kind of workflow based on pure diffs (you can literally just copy the result of hg diff or git diff into a web tool), which might be hard for new users to adapt to, and it is only free if self-hosted. The only real reason I'm mentioning this is that I guess we could get plain diffs from the Bitbucket PRs and could make them work with Phabricator. I really don't want to advertise this solution, but it is at least an option.

I'm really pessimistic on this issue but see basically two options:
1. Try something exotic like the Phabricator workaround sketched above (I'm totally unsure about this).
2. Get the diffs from all Bitbucket PRs and archive them separately (on an Eigen page for historical purposes only). Handle all open PRs and define a migration period during which we don't accept new PRs.

Thanks,
David


On 24. Aug 2019, at 15:05, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Hi!

On 24/08/2019 12.30, David Tellenbach wrote:
just some thoughts about some points you've made:
b) Fixing internal links inside commit messages ("grafted from ...", "fixes error introduced in commit ...")
Maybe I've forgotten something crucial but doing something like
# Count log lines mentioning bitbucket.org after updating to each named branch
for branch in $(hg branches | awk '{print $1}'); do
    hg update -C "$branch" > /dev/null
    echo "$branch $(hg log -v | grep -cF "bitbucket.org")"
done
gives me
Branch                       Links
------                       ------
default                      9
[...]

The point you missed is that especially the "grafted from" links do not include the full URL, just the hg-hash (which is different from git-hashes). And just grepping for "grafted from" gives me 425 results (in total -- if you want the log of individual branches, you need to use the `-b` option).
For a more precise count, you should grep for hexadecimal numbers longer than a few digits inside the commit messages.

I somewhat doubt that any existing hg->git converter automatically translates these hashes, but I'd be very happy if someone finds out otherwise. Changing these manually is definitely not an option.

Also, if we stayed with mercurial but used a different provider, we couldn't modify the history, because that would change all the hashes (but then only the 9 direct links to "bitbucket.org/..." you found would be broken, which is acceptable, IMO).

Of course we can just ignore these links (though I think broken links/hashes are even worse than non-existing ones ...)

Another point is the links inside the codebase that point to bitbucket.
Following the same logic as above I use
hg grep "bitbucket.org"
and get 11 links (all seem to be the same). Again something fixable manually.

Agreed, this part is easy to fix manually.

c) Fixing external links to the repository. Most notably, any links from our bugtracker will eventually fail (even if we stayed with bitbucket, the hashes won't match). I doubt that we could set up any automatic forwarding for that.
This might be by far the most complicated point, since a lot (the majority?) of all issues contain links to commits. If desired I can find a concrete number, but I doubt that it will be very...motivating. I also doubt that Bitbucket will provide any functionality to redirect links to other Git providers, but I could imagine that there could be some workaround if we decide to migrate to Bitbucket Git. Something we should keep in mind before choosing a new provider.

If you (or anyone else) are/is really interested, I can try to make a MySQL dump of the underlying database (I'd need to strip the user data). If we have some automatic translation between the hashes, this could even allow us to automatically convert all links.
Migrating to bitbucket-git will still break all existing links, since the hashes don't match. And as bitbucket is not even planning to provide an automated repository conversion, I would not count on any kind of forwarding mechanism.
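Given a hash translation table, converting the links in a database dump could look roughly like the sketch below. The old URL shape (bitbucket.org/eigen/eigen/commits/<hash>) is an assumption about how the bugtracker links look, and new_base is a placeholder for wherever the repository ends up.

```python
import re

# Assumed shape of the old links; both "commit/" and "commits/" are accepted.
OLD_LINK = re.compile(
    r"https?://bitbucket\.org/eigen/eigen/commits?/([0-9a-f]{7,40})\b"
)


def rewrite_links(text, hg_to_git, new_base):
    """Replace old commit URLs by new ones; leave unknown hashes untouched."""

    def replace_link(match):
        hg_ref = match.group(1)
        # Accept abbreviated hashes, but only if the prefix is unique.
        hits = [g for h, g in hg_to_git.items() if h.startswith(hg_ref)]
        if len(hits) != 1:
            return match.group(0)
        return new_base + hits[0]

    return OLD_LINK.sub(replace_link, text)
```

Leaving unresolved links untouched keeps the original hash visible, so they can still be found and fixed later.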


Any third-party which relies on our main repository will need to change as well (not directly "our" problem, but we need to give a reasonable amount of time for everyone to migrate to whatever will be our future official repository).
It's currently unclear to me what exactly will happen with the hg repo, but I guess it will be archived or something similar. In that case we can link to the new repo on the README page. I don't have any further ideas regarding this, but I also think we should migrate fairly quickly.

Yes, I think this is unclear to everyone at the moment. The announcement from bitbucket sounds a lot like they will literally delete all hg-repositories in June next year :(
If it was at least frozen/archived as it is, we would have almost no problems with point c).

For manual redirection, we can of course open a new git-project which just contains a README.md saying that bitbucket dropped hg-support, and point to where Eigen migrated to.

I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider
1. We could migrate to Tuxfamily and keep mercurial. As you said, this would imply that we have to handle pull requests separately, which is possible. As you surely know, LLVM does exactly that by using Phabricator. This would fix some of the issues above, but links to bitbucket would remain a problem. Another downside of mercurial is that only very few projects are using it, and contributing would be much easier with git.

I really don't see much difference in usability between hg and git -- both have their advantages and little quirks, IMO. And I don't think that hg was ever the main hurdle for people contributing to Eigen ...

If Phabricator allows importing our existing PRs, that would of course be a nice option. But I'm really pessimistic about that at the moment, since this also requires matching all users who made a PR or took part in the discussion to accounts on the new host (maybe that would be the only argument for staying with bitbucket).


2. The only reason I see for this is the one I mentioned above: If there is (or will be) any support to redirect bitbucket links it will most likely only work if we stay at bitbucket. Compared with other code hosting services I find bitbucket (not mercurial) to be really complicated and not intuitive.

It might be an option if they allowed pull-requests to be migrated automatically. But at the moment, they don't even seem to plan automatic migration of repositories.

3. In an ideal world this would be my absolute preference (not very surprising). Regarding the choice of a service, I want to make the personal point that I would rather migrate to Gitlab than to Github, because it is at least as good as Github and I think that diversity of tools and providers is crucial for open source. In the long run we could even think about migrating issues to Gitlab and installing test runners (this is another story).

In my ideal world, somebody volunteers to do the work necessary for migration :) -- including the issues I pointed out (doesn't have to be the same person doing everything, of course). Even some proof-of-concept demos of what can be automated would be nice!

I don't have any real preferences between mercurial/git or github/gitlab/bitbucket.

I totally agree that having automated test runners on pull-requests will be a big plus (for which I'm even willing to sacrifice some of my original points, especially since we may need to anyway).

Cheers,
Christoph


Thanks,
David
On 21. Aug 2019, at 14:53, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Hello Eigen users and contributors!

As some may have noticed, bitbucket/atlassian is "sunsetting" its mercurial support:

https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket

If they stick to their timeline, we will have to migrate by June 1st, 2020. That means we still have time, but if we do nothing, things will break ...


Converting the repository itself to git should not be a major issue -- and if we do this we could as well migrate to a more mainstream provider (i.e., github).

I think the main problems for migration are:
a) Migrating open pull-requests (for historical reasons, the closed/merged ones should probably be archived as well)
b) Fixing internal links inside commit messages ("grafted from ...", "fixes error introduced in commit ...")
c) Fixing external links to the repository. Most notably, any links from our bugtracker will eventually fail (even if we stayed with bitbucket, the hashes won't match). I doubt that we could set up any automatic forwarding for that.
d) Any third-party which relies on our main repository will need to change as well (not directly "our" problem, but we need to give a reasonable amount of time for everyone to migrate to whatever will be our future official repository).

Smaller issues (relatively easy to fix or not as important):
e) Change links from our wiki (to downloads)
f) Change URLs for automated doxygen generation and for unit-tests
g) Automatic links from the repository to our bugtracker (currently "Bug X" automatically links to http://eigen.tuxfamily.org/bz/show_bug.cgi?id=X)
h) Change hashes in bench/perf_monitoring/changesets.txt

I probably missed a few things ...


I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider

Honestly, I see no good reason for option 2. And the only real reason I see for option 1 would be that it saves a lot of hassle with b) and h) -- also perhaps it would simplify c) (e.g., we could easily crawl through our bugzilla-database and just replace some URLs).


Any opinions on this? Preferences for how to proceed, or other alternatives?
Does anyone have experience with migrating from hg to git? Or migrating between providers? Especially, also dealing with the issues listed above.
Does anyone see issues I forgot?


Cheers,
Christoph


--
Dr.-Ing. Christoph Hertzberg