I collaborated with Deepnote to prepare the ‘Notebooks Year In Review’ report for 2020.
For the report, I analyzed 10,000 GitHub repositories created in 2020 that contain Jupyter Notebooks.
I also analyzed Google and YouTube search trends to find the most significant queries related to Jupyter Notebooks. The results are really interesting.
Based on the analysis, I found these patterns in the notebooks:
- The most used Python version is 3.6. The reason could be the stability that Python 3.6 offers.
- The top 3 imported libraries are NumPy, Pandas, and Matplotlib.
- Matplotlib is the most popular plotting library, with a clear lead over Plotly and Seaborn.
- The most starred repository of 2020 is Fast AI’s Fastbook, with 11k stars, 39 contributors, and 3.4k forks.
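If you are curious how numbers like these can be pulled out of notebooks, here is a minimal sketch. It is not the pipeline used for the report, only an illustration under a few assumptions: each notebook is in the nbformat 4 JSON layout (a top-level `cells` list and a `metadata.language_info.version` field), and the helper names `notebook_stats` and `tally` are hypothetical names I made up for this example.

```python
import ast
import json
from collections import Counter


def notebook_stats(notebook_json):
    """Return (python_version, set of top-level imported modules) for one notebook.

    Assumes the nbformat 4 layout; older formats would need different handling.
    """
    nb = json.loads(notebook_json)
    version = nb.get("metadata", {}).get("language_info", {}).get("version")
    modules = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines or a string
            source = "".join(source)
        # Drop IPython magics (%...) and shell escapes (!...), which are not valid Python.
        lines = [ln for ln in source.splitlines()
                 if not ln.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except (SyntaxError, ValueError):
            continue  # skip cells that do not parse as Python
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return version, modules


def tally(notebooks):
    """Count Python minor versions and imported libraries across notebook JSON strings."""
    version_counts = Counter()
    import_counts = Counter()
    for raw in notebooks:
        version, modules = notebook_stats(raw)
        if version:
            # Group patch releases together, e.g. "3.6.9" is counted as "3.6".
            version_counts[".".join(version.split(".")[:2])] += 1
        import_counts.update(modules)  # each library counts once per notebook
    return version_counts, import_counts
```

Calling `tally` on the raw text of a collection of `.ipynb` files and then `import_counts.most_common(3)` would give a top-3 list of libraries. Counting each library once per notebook, rather than once per import statement, is a design choice in this sketch and may differ from how the report counted.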
What does Google Search reveal?
I used Google Search trends to analyze the queries and topics related to Jupyter Notebooks.
Some of the top queries on Google and YouTube search are:
- jupyter notebook
- python
- python jupyter
- install jupyter
- how to use jupyter notebooks
I enjoyed analyzing gigabytes of data science notebooks and finding interesting patterns. Creating the dataset was a challenge, but seeing the results, it was all worth it.
You can read the full report here.
We (@jakubzitny and @yashika51) analyzed over 10M Jupyter notebooks to define key stats, most popular libraries, and other trends from 2020 🔥
Which of these will you add to your toolkit in the year ahead? ⛏️ https://t.co/4NfwLCIhoU
— Elizabeth Dlha (@elizabeth_dlha) January 26, 2021