Weeknotes 2021 WK 45

Finding the new (and old) contributors to Django 4.0

Previously, we identified contributors to Django 4.0 as those who’d committed to the django/django repo during the 4.0 development cycle. To reiterate, that’s not everybody who contributed, or every contribution (by a long shot), but it’s a good start.

New Contributors

One thing we wanted to do was to call out new contributors. To achieve that, we can compare the set of committers to Django 4.0 with the set of committers from before Django 4.0.

We can get those sets again from git. Committers for the last 10 commits on the stable/4.0.x branch:

% git shortlog -s HEAD~10..
     1  Adam Johnson
     1  Brad
     1  Can Sarigol
     4  Carlton Gibson
     3  Mariusz Felisiak

We don’t want the counts, so a bit of awk:

% git shortlog -s HEAD~10.. | awk '{ $1="";  print substr($0,2) }'
Adam Johnson
Brad
Can Sarigol
Carlton Gibson
Mariusz Felisiak

Perfect.

We know that Django 4.0 is everything in 75182a8..stable/4.0.x, so everything before 75182a8 is a good measure of pre-Django 4.0. (That’s the best part of two and a half thousand contributors, so I won’t list them here.)
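One way to get those sets into Python is to shell out to git and parse the shortlog output. This is a minimal sketch rather than the exact script used here: the helper names parse_shortlog and committers are made up for illustration, and it assumes you run it from inside a django/django checkout that has the stable/4.0.x branch.

```python
import subprocess


def parse_shortlog(output):
    """Turn `git shortlog -s` output into a set of author names."""
    names = set()
    for line in output.splitlines():
        # Each line looks like "     4  Carlton Gibson": a count, then the name.
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            names.add(parts[1])
    return names


def committers(revision_range, repo="."):
    """Author names for a revision range (hypothetical helper).

    Assumes `repo` is a local git checkout containing the revisions.
    """
    result = subprocess.run(
        ["git", "shortlog", "-s", revision_range],
        cwd=repo,
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return parse_shortlog(result.stdout)


# Inside a django/django checkout you could then build the two sets:
#   DJANGO_40 = committers("75182a8..stable/4.0.x")
#   PRE_40 = committers("75182a8")
```

Passing the revision explicitly matters: without one, and with no terminal attached, git shortlog reads a log from standard input instead.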

Subtracting the one from the other gives us a run at the new contributors for Django 4.0:

>>> new_contributors = DJANGO_40 - PRE_40
>>> print("New contributors in Django 4.0: ", len(new_contributors))
New contributors in Django 4.0:  141
>>> print(sorted(new_contributors))
['Aakash Singh', 'Abhyudai', 'Alex Dutton', 'Alex Hayward', 'AliGhotbizadeh',
'Aljaž Košir', 'Allan Feldman', 'Amankumar Singh', 'Amir Ajorloo', 'Andrew
Northall', 'Andrew-Chen-Wang', 'Angus Holder', 'Anil Khatri', 'Arthur Jovart',
'Ben Sturmfels', 'Ben Wilber', 'BeryCZ', 'Can Sarigol', 'Can Sarıgöl', 'Ceesjan
Luiten', 'Chenyang Yan', 'Christophe Henry', 'Cleiton Lima', 'Clumart.G', 'Dan
Strokirk', 'Daniel Ebrahimian', 'Daniyal', 'Daniyal Abbasi', 'Denis
Skulimovskiy', 'Diego Lima', 'Eduardo Aldair Ahumada Garcia Jurado', 'Egidijus
Macijauskas', 'Eugene Morozov', 'GabbyPrecious', 'Gildardo Adrian Maravilla
Jacome', 'Girish Sontakke', 'Greg Twohig', 'Hugo Cachitas', 'Igor Fernandes',
'Jack', 'Jack Aitken', 'Jan Schär', 'Jan Szoja', 'Jannis Vajen', 'Jarosław
Wygoda', 'Jerin Peter George', 'Jero Bado', 'Jim Xie', 'Joel Farthing', 'Johan
Schiff', 'Johannes Wilm', 'John', 'Jonathan Davis', 'Jonathan Richards', 'Jonny
Park', 'Jordan Bae', 'Jordi Castells', 'Karthikeyan Singaravelan', 'Lauri
Tirkkonen', 'Lou Huang', 'Lucidiot', 'Manav Agarwal', 'Marc Gibbons', 'Mart
Sõmermaa', 'Martin Svoboda', 'Mateo Radman', 'Matjaz Gregoric', 'Maxim Beder',
'Maxim Milovanov', 'Michael Lissner', 'Mikolaj Rybinski', 'Mohammadreza
Varasteh', 'Moriyoshi Koizumi', 'Muhammad Hammad', 'Märt Häkkinen', 'Nick
Frazier', 'Nick Touran', 'Nicolas Restrepo', 'Nikita Marchant', 'Nilo César
Teixeira', 'Paul Ganssle', 'Premkumar Chalmeti', 'Pēteris Caune', 'Raymond
Nunez', 'Rohith PR', 'Rust Saiargaliev', 'Sandro Covo', 'Sanskar Jaiswal',
'Sarah Abderemane', 'Seonghyeon Cho', 'Siburg', 'Sih Sîng-hông薛丞宏', 'Slava
Skvortsov', 'Sondre Lillebø Gundersen', 'Stefanos I. Tsaklidis', 'Steven
Maude', 'Susan Wright', 'Teresa Partida', 'ThinkChaos', 'Thomas Guettler',
'Tiago Honorato', 'Ties Jan Hefting', 'Tilman Koschnick', 'Timothy McCurrach',
'Tom Wojcik', 'Victor Sowa', 'Vikash Singh', 'Virtosu Bogdan', 'Wilhelm Klopp',
'Wille Marcel', 'Wu Haotian', 'Yuekui Li', 'Yuri Konotopov', 'Zain Patel',
'Zainab Amir', 'abhiabhi94', 'antoinehumbert', 'arcanemachine', 'aryabartar',
'bankc', 'cammil', 'chrishna1', 'ecogels', 'girishsontakke', 'ilu_vatar_',
'kshitijraghav', 'muskanvaswan', 'pochangl', 'pythonwood', 'qimingmafan',
'ryowright', 'saeedblanchette', 'sdwoodbury', 'snowman2', 'sreehari1997',
'taulant', 'tim-mccurrach', 'tomhamiltonstubber', 'yakimka', 'yujin',
'yyyyyyyan']

OK, there are a few apparent duplicates in there (‘Can Sarigol’ and ‘Can Sarıgöl’, say, or ‘Girish Sontakke’ and ‘girishsontakke’), but that’s still ≈140 new contributors. Out of just over 200 total contributors to Django 4.0, roughly two thirds were first-time contributors.
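If you wanted to flag likely duplicates automatically, a crude comparison key goes some of the way. This is only a sketch under my own assumptions, not something that was run for the numbers above; name_key and probable_duplicates are hypothetical helpers.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Crude comparison key: strip accents, case, spaces and punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(
        ch for ch in decomposed.casefold()
        if ch.isalnum() and not unicodedata.combining(ch)
    )


def probable_duplicates(names):
    """Group names that share the same comparison key."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# probable_duplicates(["Girish Sontakke", "girishsontakke", "Jack"])
# returns [['Girish Sontakke', 'girishsontakke']]
```

It only catches names that differ by case, spacing, punctuation or combining accents. A username next to a full name still needs a human look.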

On a roll

There are a couple of other groups we can identify while we’re here. First up, those who were also part of Django 3.2. We might say these folks are on a roll.

We identify Django 3.2 in the same way we identified Django 4.0, by looking at where we branched stable/3.1.x for Django 3.1. After that point, main (the line of history that eventually became stable/4.0.x) was Django 3.2.

% git merge-base origin/stable/3.1.x stable/4.0.x
d51e090db2110f016dbca1d794c0d379b3df551b
% git show d51e090db2110f016dbca1d794c0d379b3df551b | head -n6
commit d51e090db2110f016dbca1d794c0d379b3df551b
Author: Mariusz Felisiak <felisiak.mariusz@gmail.com>
Date:   Tue May 12 07:21:09 2020 +0200

    Updated man page for Django 3.1 alpha.

That looks right.

So, the contributors to Django 3.2 were (roughly) the authors of everything between d51e090 and our start point for Django 4.0, 75182a8. Again, some 200 contributors.

If we take the intersection of the two sets, we should have the folks who authored commits in both the Django 3.2 and Django 4.0 cycles:

>>> on_a_roll = DJANGO_40 & DJANGO_32
>>> print("On a roll in Django 4.0: ", len(on_a_roll))
On a roll in Django 4.0:  40
>>> print(sorted(on_a_roll))
['Adam Johnson', 'Artur Beltsov', 'Carlton Gibson', 'Chinmoy Chakraborty',
'Chris Jerdonek', 'Claude Paroz', 'Collin Anderson', 'Daniel Hahler', 'David D
Lowe', 'David Smith', 'David Wobrock', 'Florian Apolloner', 'Florian Demmer',
'François Freitag', 'Giannis Adamopoulos', 'Hannes Ljungberg', 'Hasan
Ramezani', 'Ian Foote', 'Iuri de Silvio', 'Jacob Walls', 'Johannes Maron', 'Jon
Dufresne', 'Josh Santos', 'Konstantin Alekseev', 'Mads Jensen', 'Mariusz
Felisiak', 'Matthias Kestenholz', 'Mike Lissner', 'Nick Pope', 'Raffaele
Salmaso', 'Simon Charette', 'Tim Graham', 'Tom Carrick', 'Tom Forbes', 'William
Schwartz', 'manav014', 'mimi89999', 'sage', 'starryrbs', 'ᴙɘɘᴙgYmɘᴙɘj']

Welcome back

Finally, if we take all the folks who contributed before Django 4.0, drop those who contributed to Django 3.2, and keep only those who also contributed to Django 4.0, we can spot the returning folks:

>>> welcome_back = (PRE_40 - DJANGO_32) & DJANGO_40
>>> print("Welcome back in Django 4.0: ", len(welcome_back))
Welcome back in Django 4.0:  32
>>> print(sorted(welcome_back))
['Adam Donaghy', 'Alex Hill', 'Arkadiusz Adamski', 'Baptiste Mispelon', 'Brad',
'Brad Solomon', 'Camilo Nova', 'Dan Swain', 'Daniele Procida', 'David Beitey',
'David Sanders', 'F. Malina', 'Haki Benita', 'Harm Geerts', 'Illia Volochii',
'Jaap Roes', 'Jacob Rief', 'Jozef', 'Ken Whitesell', 'Keryn Knight', 'Markus
Holtermann', 'Matt Westcott', 'Michał Górny', 'Peter Inglesby', 'Ramon
Saraiva', 'Russell Keith-Magee', 'Shipeng Feng', 'Simon Willison', 'Takayuki
Hirayama', 'Tobias Bengfort', 'Vinay Karanam', 'luzpaz']

More than a few familiar names there. Nice to have them on board.

The data here is only as good as the author records in the git history, which isn’t perfectly clean. Happily, the counts for the three sets add up to the total for Django 4.0 contributors, so maybe it’s not too far off. Do reach out if you spot something amiss.
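That the counts add up is what we’d expect. Assuming everyone in DJANGO_32 is also in PRE_40 (which holds if PRE_40 is built from everything before 75182a8), the three groups split DJANGO_40 with no overlap. A small sketch with made-up toy names, not the real data, shows the check:

```python
# Toy data for illustration only; the real sets come from git.
PRE_40 = {"ann", "bob", "cat", "dan"}
DJANGO_32 = {"cat", "dan"}  # a subset of PRE_40
DJANGO_40 = {"bob", "cat", "eve"}

new_contributors = DJANGO_40 - PRE_40
on_a_roll = DJANGO_40 & DJANGO_32
welcome_back = (PRE_40 - DJANGO_32) & DJANGO_40

groups = [new_contributors, on_a_roll, welcome_back]

# Together the groups cover DJANGO_40 exactly...
assert set().union(*groups) == DJANGO_40
# ...and no name lands in two groups, so the counts sum to the total.
assert sum(len(g) for g in groups) == len(DJANGO_40)

# The counts reported above: 141 new + 40 on a roll + 32 welcome back.
print(141 + 40 + 32)  # 213
```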