How popular is git? Very. - Jef "I am the pusher robot" Spaleta
ramblings of the self-elected Fedora party whip
Let's look at github, which by the way is a for-profit hosting service that competes directly with launchpad while also providing no-cost hosting to open source projects.

Github has been up and running for about a year, I think. As of right now it appears to have over 70 thousand individual git repositories listed. Over 70,000 projects in just over a year! Here's how I got that number: I searched for all repositories which are themselves forks, fork:true (23143), and all repositories which are not forks, fork:false (47952). Together that makes 71,095.

70,000 individual repositories... wow! More than half of which are listed as new projects, not code forks of existing projects from somewhere else. Wow! And that's just the public repositories. That's not including the for-pay private personal or corporate repositories that customers haven't chosen to make public. Wow! Do I believe that? I'm not sure; I could have the search criteria wrong. Someone please double check that.
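
For anyone who wants to double check the arithmetic, here is a minimal Python sketch that adds up the two search counts quoted above and works out what share of the listed repositories are not forks. Only the two counts come from the searches; the function name is made up for illustration.

```python
def repo_summary(forks, non_forks):
    """Return (total, non-fork share) for two search result counts."""
    total = forks + non_forks
    return total, non_forks / total


# Counts quoted in the post: fork:true and fork:false
total, share = repo_summary(23143, 47952)
print(f"total repositories: {total}")
print(f"non-fork share: {share:.1%}")
```

That puts the non-fork share at roughly two thirds, which is consistent with "more than half".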

How does that compare to launchpad? Launchpad has just over 10,300 projects in, what, 4+ years? Can anyone dig up an exact date for when launchpad opened? Put into perspective with github's explosive project growth in one year, Launchpad's 10k in multiple years doesn't seem so impressive. And github isn't even tied directly into a distribution. It doesn't have a contributor network effect to leverage that blurs the line between code hosting and distribution building. How many of the projects registered with bzr trees in Launchpad only have them to build Ubuntu packages?
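
To put a rough number on that comparison, here is a small Python sketch of the average registrations per year for each service. The ages are assumptions taken from the guesses above (about one year for github, four years for launchpad, which the post gives only as "4+"), and it compares github repositories against launchpad projects, which may not be quite the same unit.

```python
def per_year(count, years):
    """Average number of repositories or projects registered per year."""
    return count / years


# 23143 + 47952 public repositories; assumes about one year of operation
github_rate = per_year(23143 + 47952, 1.0)
# "just over 10,300"; assumes four years, the post only says "4+"
launchpad_rate = per_year(10300, 4.0)

print(f"github: {github_rate:.0f} per year")
print(f"launchpad: {launchpad_rate:.0f} per year")
print(f"ratio: {github_rate / launchpad_rate:.1f}x")
```

Under those assumptions the ratio lands in the high twenties, and if launchpad has been open longer than four years the gap only gets wider.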

That integrated connection to the Ubuntu community via the Launchpad Soyuz component should give Launchpad a marked advantage over other hosting competitors like github. But looking at the project growth rate numbers, you don't see that materializing.
I have to wonder, how much more popular a service would Launchpad be if it made git available as a primary service for developers? Would it be able to compete with github's growth of new projects?

Can you run a competitive for-profit code hosting service that doesn't offer git as a service moving forward? I'm not sure. I'd like to see trendable graphs of growth rate for both launchpad and github over the last 6 months. I'm going to have to poke around at both APIs and see if I can dig up that information.

5 comments
From: spevack Date: February 18th, 2009 07:06 pm (UTC)
What do you think the proper metrics are for tracking growth of Fedora Hosted?
From: jspaleta Date: February 18th, 2009 07:38 pm (UTC)
I dug up the fedorahosted numbers from the admins in a previous post:

I would love to build weekly graphs of the number of projects using each of the vcs options we provide, the way Debian does by tracking its vcs usage at the packaging level.

You'll notice that the current Debian graphs would indicate that bzr has actually crested and started to drop, while git and svn are still growing. This is not yet statistically significant. It will be interesting to see what that graph looks like in April.

Fedorahosted's numbers, like the Debian numbers, are interesting in that we offer multiple options in an unbiased way and git is the fastest growing option. Debian has a large pre-existing pool of svn-using packagers, which is being eroded over time by migrations to dvcs. Fedorahosted doesn't have that pre-existing inertia, but both show the same trend even though the workflow is different. Debian's trending is mostly about packaging version control (I think); Fedorahosted is strictly about upstream hosting version control.
Fedorahosted's growth numbers are nowhere near what github is doing. And that's okay. We don't really need fedorahosted to be github or gitorious. Fedora as a project might not be structured to be a competitive hosting provider compared to other things we do, so having those other self-sufficient git hosting providers out there for people to choose will help us focus on other things.

Though there is going to come a moment in time when we are going to jump from cvs to a distributed system for our distribution and packaging processes. And it's pretty clear to me that that system is going to be git. The git momentum is huge. Everywhere I look, git uptake is just enormous. We need to find a way to work with fedorahosted, github, gitorious and other git repositories to streamline how that future process is going to work.

From: jspaleta Date: February 18th, 2009 08:02 pm (UTC)
Oh and take a good look at how github exposes commit activity. The 52 week graph of commit activity per repository is something you may like:

Can we get that sort of commit activity graph for fedorahosted or even fedora cvs? I don't know. It's something to ping Luke about I guess for his contributor portal framework.

I'm still having a hard time wrapping my head around github's network graph, for example:

From: (Anonymous) Date: February 19th, 2009 01:02 pm (UTC)


I'd be careful about those figures. If you browse through the list of "Latest Repos", you usually see a great many "Test" or "My first Git Repo lol" repositories and things like that. I suspect there's enough there to disturb the statistics, but I'm not sure. Just a heads up.
From: jspaleta Date: February 19th, 2009 04:35 pm (UTC)

Re: Careful

Are you saying that other no-cost hosting services like launchpad do a better job of pruning out dead wood? For example, I give you this: https://launchpad.net/ssctest1

I haven't seen Shuttleworth use any caveats when quoting the launchpad numbers, so I'm not going to bother to be any more precise than he feels he needs to be.

I think dead wood is going to be a general problem that's going to show up in the numbers for all service providers in a somewhat proportionate manner.

https://code.launchpad.net/tinyurl?field.lifecycle=ALL for example hasn't seen branch activity in 87 weeks. And in total it saw 7 commits, all in one week back in 2007. It's been inactive for longer than github has existed. Is that an active project? Should that be counted in Launchpad's rolling total that Shuttleworth has used in prior discussions?

The numbers are what they are. If you are going to pick apart the github numbers looking for dead wood and publish a more accurate picture of active repositories... feel free. But do it fairly: apply the same methodology to Launchpad and publish pruned Launchpad numbers side-by-side.
