11 Useful Resources To Learn Python's Internals From Scratch
Learn how Python works internally by going deep inside the Python virtual machine.
"How does Python work internally?"
I have been asking myself that question for the past few months and now it seems that I'm starting to grasp, slowly...
During this time, I have developed a strong interest in the inner workings of Python. I find the CPython implementation so fascinating that I even started contributing to the language.
CPython is the most popular runtime, but there are a few others, such as PyPy. Unlike PyPy, CPython's core is written in C, whereas the standard library is a blend of Python and C.
For newcomers, navigating through the code can be a daunting task. Fortunately, there are some nice resources out there that can help to smooth the learning curve.
In this post, I'll show you my favorite resources to start learning more about the inner workings of Python, a.k.a CPython Internals.
By the end of this tutorial, you should be able to:
- choose the best books that will help you understand Python's source code
- learn more about the CPython internals via public talks
- find the best blogs and other resources that cover the Python internals
Table of Contents
- Books About CPython Internals
- Videos Related to CPython Internals
- Blog Posts
- Other Resources About CPython Internals
- Conclusion
Books About CPython Internals
Python has grown immensely over the past few years, and as more people learn it, the demand for learning materials on advanced topics grows too.
A few years ago, I bet just a few curious developers would be interested in learning more about CPython.
These days it’s not unusual to find comprehensive blog posts, videos and books going over the inner guts of Python. Speaking of books, I can surely vouch for two:
- CPython Internals: Your Guide to the Python 3 Interpreter by Anthony Shaw
- Inside The Python Virtual Machine by Obi Ike-Nwosu
CPython Internals: Your Guide to the Python 3 Interpreter - A Brief Review
This is the most recently published book about CPython. Among the things it covers, you will find information about:
- How to build and compile Python from source on macOS, Linux and Windows
- How to set up a development environment
- Python’s grammar and language specification
- The eval loop
- How Python manages memory
- How to run the test suite
You can find the full table of contents on the Real Python website. The book comes in digital format and paperback. The eBook versions are DRM-free and available in three formats: epub, mobi and PDF. The paperback, on the other hand, can be found on Amazon.
The nicest thing to me is that Anthony demonstrates how to add a new operator to the language: the almost-equal. This operator is represented by “~=” and can determine if two numbers are close enough to each other but not exactly equal. It walks through all the fundamental changes required to make this happen, including extending the grammar.
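To get a feel for what an almost-equal comparison does, the standard library's math.isclose behaves in a similar spirit. This is only an analogy: the exact tolerance and semantics of the book's ~= operator are not described here, so the default tolerance below is an assumption for illustration.

```python
import math

a = 0.1 + 0.2
b = 0.3

# Floating-point rounding makes exact equality fail for these two values.
exactly_equal = (a == b)

# math.isclose compares within a relative tolerance (1e-09 by default).
almost_equal = math.isclose(a, b)

print(exactly_equal, almost_equal)
```

The two values differ only by a rounding error, so the exact comparison fails while the tolerance-based one succeeds.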
Thanks to this book, I started contributing to CPython and already have a handful of PRs merged, including enhancements, documentation and bug fixes.
Lastly, the code samples are available on GitHub.
Inside The Python Virtual Machine - A Brief Review
Another good book covering Python's internals and, if I'm not mistaken, the first one to explore CPython in detail. Inside The Python Virtual Machine is much shorter than CPython Internals but covers some parts of the language in more detail, such as Python objects, code objects and frame objects. It's available for free as PDF, ePub and Kindle (mobi) on Leanpub, but I definitely encourage you to buy it.
What I loved the most is the in-depth examination of Python objects. It does a brilliant job dissecting the types, goes over the internals of objects and their attributes and concludes with an explanation of the Method Resolution Order (MRO). I haven’t read it in full yet but I like it so far.
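If the MRO is new to you, a small diamond-shaped hierarchy makes it concrete. This is a minimal sketch with hypothetical class names (A, B, C, D), not an example taken from the book.

```python
class A:
    def who(self):
        return "A"

class B(A):
    def who(self):
        return "B"

class C(A):
    def who(self):
        return "C"

class D(B, C):
    # D defines nothing itself, so attribute lookup walks the MRO.
    pass

# __mro__ lists the classes in the order Python searches them.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)

# B comes before C in the MRO, so B.who is found first.
print(D().who())
```

Python searches D, then B, then C, then A, and finally object, which is why the call resolves to B's method.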
Videos Related to CPython Internals
When it comes to video content, there isn’t much structured material out there. The first one I learned about was P. Guo’s series covering Python 2.7. Unfortunately, he has taken the series down from his website, but you can still find it via an unlisted playlist.
CPython internals: A ten-hour codewalk through the Python interpreter source code
Check out the first video of the series.
The full playlist can be found here.
Pablo Salgado - The soul of the beast
In this talk, Pablo Galindo, a Python core developer, looks at Python’s former grammar and its limitations. This presentation is fantastic for those who want to understand the general structure of the compiler. By the end, Pablo shows how you can add a new operator to Python, the 'arrow operator' (->).
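You can poke at the general structure of the compiler from Python itself. The sketch below is my own illustration, not material from the talk: it uses the standard ast module and the built-in compile to follow a tiny expression from source text to a syntax tree to a code object. The filename "<sketch>" is an arbitrary placeholder.

```python
import ast

source = "1 + 2 * 3"

# Step 1: the parser turns source text into an abstract syntax tree.
tree = ast.parse(source, mode="eval")
root_is_binop = isinstance(tree.body, ast.BinOp)

# Step 2: the compiler turns the tree into a code object (bytecode).
code = compile(tree, "<sketch>", "eval")

# Step 3: the interpreter evaluates the code object.
result = eval(code)

print(ast.dump(tree))
print(result)
```

Adding a new operator means touching each of these stages in the C source, which is why the talk starts with the grammar.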
Eric Snow - to GIL or not to GIL: the Future of Multi-Core (C)Python - PyCon 2019
In this presentation, Eric Snow talks about Python’s GIL (Global Interpreter Lock) and the future developments to circumvent its impact on performance and unlock multi-core capability in Python. It won’t dive into the source code but it’s a good intro to one of the most controversial topics in Python.
Pablo Galindo Salgado - Time to take out the rubbish: garbage collector - PyCon 2019
In this talk, Pablo presents the "magic" behind Python’s memory management by detailing the inner workings of the garbage collector and why it matters. He illustrates some gotchas, such as the reason you cannot rely on the __del__ method, and describes reference counting in detail.
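The split between reference counting and the cyclic garbage collector can be observed with weak references. This is a minimal sketch of my own, assuming a standard CPython build; the Node class and the demo function are hypothetical names, and other implementations such as PyPy may behave differently.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def demo():
    # Pause automatic collection so only the explicit gc.collect() runs.
    gc.disable()
    try:
        # No cycle: the reference count drops to zero on `del`,
        # so the object is freed right away.
        a = Node()
        ref_a = weakref.ref(a)
        del a
        freed_by_refcount = ref_a() is None

        # A reference cycle: each object keeps the other alive,
        # so reference counting alone cannot free them.
        b, c = Node(), Node()
        b.other, c.other = c, b
        ref_b = weakref.ref(b)
        del b, c
        survived_del = ref_b() is not None

        # The cyclic garbage collector finds and frees the cycle.
        gc.collect()
        freed_by_gc = ref_b() is None
    finally:
        gc.enable()
    return freed_by_refcount, survived_del, freed_by_gc

print(demo())
```

Objects caught in a cycle are only reclaimed whenever the collector happens to run, which is one reason the timing of __del__ is hard to rely on.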
CPython Full Course - Dev Internals
I’ve recently found this resource, and it looks very neat. The author examines the implementation of NoneType and also demonstrates how to add a __len__ method to ints.
What caught my eye was that the author uses gdb to debug the C portion of the code. I find it nice because it’s not so easy to find videos demonstrating how to debug the CPython code.
Blog Posts
My favorite blog post series on this topic is from Ten thousand meters by Victor. I believe it’s the most complete and up-to-date coverage of the internals of the Python interpreter in the form of blog posts.
Another exceptional series, albeit slightly outdated, is the Python internals series from Eli Bendersky. There is lots of interesting stuff, including adding new keywords and great coverage of symbol tables.
Other Resources About CPython Internals
A Python Interpreter Written in Python by Allison Kaptur
This article / mini book is a great resource to understand Python’s bytecode. In only 500 lines of code Allison implements Byterun, a Python interpreter written in pure Python.
As impressive as it may sound, Byterun can actually run a variety of simple Python programs. After reading it you’ll have a much better understanding of the Python interpreter.
The booklet can be found on https://www.aosabook.org.
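Before diving into Byterun, it helps to look at the bytecode that CPython itself produces. The sketch below uses the standard dis module on a hypothetical add function. The exact instruction names vary between Python versions, so I only print them rather than claim a specific listing.

```python
import dis

def add(a, b):
    return a + b

# Collect the names of the bytecode instructions compiled for `add`.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# Human-readable disassembly, similar to what the article walks through.
dis.dis(add)
```

An interpreter like Byterun is essentially a loop that reads instructions such as these one by one and manipulates a stack accordingly.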
Python's Official Docs
The official documentation is also an excellent place to go. The text can be particularly dry, but that's what you usually expect from a reference. A nice section in particular is the exploring guide, which goes through the source code’s structure and also links to other resources. The official website also documents the C API in considerable detail.
Conclusion
Learning about CPython can be disheartening, but thanks to the increasing demand, more and more materials have been developed, which softens the learning curve considerably. In this article, I presented my preferred resources for learning about the internals of Python, ranging from books to videos and blog posts. I hope this is as useful to you as it is to me.
See you next time!