PROFILING FOR PERFORMANCE
You don't care about performance, before you rant at me and include my mother in the conversation, answer some of these following questions:
Do you test for performance?
If you do, do you do it in a automatic and repeatable manner?
Do you track, analyze and categorize the results?
If you answered yes to most of these questions, you could stick around for part two in which I try to play with some useful profiling tools like locust.io, django-silk or maybe you are curios to see the special app that I build especially for this demonstration. For the rest of you, read on.
Profiling is a complex topic, so my take on this is to split it in two parts:
Part one will be a superficial quick and general introduction to profiling python, no prior knowledge required1, I just want to make sure that we are on the same page.
Part two is where we get our hands dirty and simulate a workflow of profiling an application. This part will require from you knowledge on some advance topics, but fear not, still accessible to most of the readers out there.
Part 1
Profiling for performance2
I've used the term profiling several times before but never explained anything about it.
According to Wikipedia:
Profiling (computer programming) is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of functions calls. Most commonly, profiling information servers to aid program optimization.
With other words: profiling allows you to see where you are spending your time, it allows you to optimize your code in a intelligent fashion, runs almost exactly the same code as in production. You can easy3 track entry/exit times for functions and helps you build the bigger picture about the application.
TD;LR;
How much of resource X4 is used?
How exactly this amount of X is spent?
Looking for bottlenecks
Why do we profile python?
Because (and it hurts my heart to say this) cPython is slow. Python is a beautiful language, but the rumor spread around by jealous Java programmers is sadly true.
So why not make Python faster?
Well, that is exactly what PyPy is trying to do.
PyPy is an automated, runtime, profile-driven optimization a.k.a.. JIT compiler, because PyPy tries to tackle a non-trivial problem we are not quite there yet.
Even so, at the moment, computers still aren't powerful enough to restructure your entire program to use a better suited algorithm for the job.
So until then we profile.
Why not optimize everything?
Fast code is expensive
Well, this is exactly what the wizards of yesterday did, they wrote code in which every CPU cycle, every piece of memory mattered, control everything was the name of the game. They had no other way.
But this level of care comes at a cost, fast code is expensive because it takes effort to write (good) fast code. It needs better algorithms, deep research into approaches, diligent and focused effort.
Fast code is hard to maintain
Smart code introduces caches, lazy loading, parallelization, assumptions and requirements.
Much smarter code gets you to "meta-programming obscure", code where nothing you write is the code that runs.
Sure i's fast, but it is extremely slow to write, and/or slow to maintain.
We need fast but maintainable code, so we give up right?
No, we do "Intelligent optimization".
Obligatory Quote
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. -Knuth
The trick here is to mess up only a small portion of the code base, while preserving the elegance of the whole.
Maintain nice flexibility and uglify just the important 3%.
How to find the 3%?
Why not guess?
I really understand my code!
Well probably you do, and probably most of the time you do a good job but you are taking a bet every time. Better to look first.
What options do I have?
Before thinking at the tools, let's try to split them according to some general rules of thumb.
Overview of the available groups
What is being profiled?
CPU usage
RAM usage
I/O or network
How is it iprofiled?
Deterministic (one instruction at a time, precise information, slow generates huge amount of data)
Statistical (looks at the stack from time to time, not a precise, fast generates manageable amount of data)
Which granularity?
Application-level
Method-level
Line-level
How is it reported?
Textual reports
Color mode
Exploding pies
Square maps
Call graphs
The tools
Run time, aggregate ("snippet-level") : timeit
Run time, method-level, deterministic : cProfile
Runtime, method-level, statistical : statprof.py
Runtime, line-level, deterministic : line_profiler
Memory, line-level, deterministic : memory_profiler
Memory, method-level, deterministic : pympler (Py2/3) guppy/heapy (Py2)
This brings us to the end of Part 1. As promised, in Part 2 we start to play with the tools above, profiling the app I mentioned at the start of the article. I hope we see each other there.
You can find other interesting articles on similar topics here.
Cool stuff
Profile for performance I stole a lot of information from this wonderful presentation
Advance Python Profiling Excellent presentation with tons of code examples
Footnotes:
Comments