You don't care about performance
Profiling for performance

Development Cluj-Napoca, July 22, 2015, by Vlad

You don't care about performance. Before you rant at me and include my mother in the conversation, answer some of the following questions:

  • Do you test for performance?
  • If you do, do you do it in an automatic and repeatable manner?
  • Do you track, analyze and categorize the results?

If you answered yes to most of these questions, you can stick around for part two, in which I play with some useful profiling tools like locust.io and django-silk, or maybe you are curious to see the special app that I built especially for this demonstration. For the rest of you, read on.

Profiling is a complex topic, so my take on this is to split it in two parts:

Part one will be a superficial, quick and general introduction to profiling Python, no prior knowledge required1; I just want to make sure that we are on the same page.

Part two is where we get our hands dirty and simulate the workflow of profiling an application. This part will require some knowledge of more advanced topics, but fear not, it is still accessible to most readers out there.

Part 1

Profiling for performance2

I've used the term profiling several times before but never explained anything about it.

According to Wikipedia:

Profiling (computer programming) is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.

In other words: profiling allows you to see where you are spending your time, letting you optimize your code in an intelligent fashion while running almost exactly the same code as in production. You can easily3 track entry/exit times for functions, which helps you build the bigger picture about the application.


  • How much of resource X4 is used?
  • How exactly is this amount of X spent?
  • Where are the bottlenecks?
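To make this concrete, here is a minimal sketch of answering "where is the time spent?" with the standard library's cProfile module. The function names (slow_sum, fast_sum, work) are invented for the demo, not part of any real codebase:

```python
# Deterministic profiling with the standard-library cProfile:
# record every function call made by work() and print a textual report.
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately naive loop, so it shows up high in the report.
    total = 0
    for i in range(n):
        total += i
    return total

def fast_sum(n):
    return sum(range(n))

def work():
    slow_sum(1_000_000)
    fast_sum(1_000_000)

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort by cumulative time and show the top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
print(stream.getvalue())
```

The report lists, per function, how many times it was called and how much time it took, which is exactly the "how is X spent" question from the list above.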

Why do we profile Python?

Because (and it hurts my heart to say this) CPython is slow. Python is a beautiful language, but the rumor spread around by jealous Java programmers is sadly true.

So why not make Python faster?

Well, that is exactly what PyPy is trying to do.

PyPy is an automated, runtime, profile-driven optimizer, a.k.a. a JIT compiler. Because PyPy tries to tackle a non-trivial problem, we are not quite there yet.

Even so, at the moment, computers still aren't powerful enough to restructure your entire program to use a better suited algorithm for the job.

So until then we profile.

Why not optimize everything?

Fast code is expensive

Well, this is exactly what the wizards of yesterday did: they wrote code in which every CPU cycle and every piece of memory mattered; controlling everything was the name of the game. They had no other way.

But this level of care comes at a cost: fast code is expensive, because it takes effort to write (good) fast code. It needs better algorithms, deep research into approaches, and diligent, focused effort.

Fast code is hard to maintain

Smart code introduces caches, lazy loading, parallelization, assumptions and requirements.

Much smarter code gets you to obscure meta-programming, code where nothing you write is the code that runs.

Sure it's fast, but it is extremely slow to write and/or slow to maintain.
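A tiny sketch of the "smart code introduces caches and assumptions" trade-off, using the standard library's functools.lru_cache: one decorator makes the naive recursive Fibonacci fast, but it silently assumes the function is pure and its arguments are hashable:

```python
# Caching as a speed/assumptions trade-off: lru_cache memoizes results,
# turning an exponential-time recursion into a linear-time one.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Assumption baked in by the cache: fib is pure (same input,
    # same output) and n is hashable. Break either and the cache lies.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # instant with the cache; hopeless without it
```

This is the maintenance cost in miniature: the one-line decorator is easy to add and easy to forget about, until someone makes the function impure.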

We need fast but maintainable code, so we give up right?

No, we do "Intelligent optimization".

Obligatory Quote

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
-- Donald Knuth


The trick here is to mess up only a small portion of the code base, while preserving the elegance of the whole. 

Maintain nice flexibility and uglify just the important 3%.

How to find the 3%?

Why not guess?

I really understand my code!

Well, probably you do, and probably most of the time you do a good job, but you are taking a bet every time. Better to look first.
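"Better to look first" can be as cheap as a timeit one-liner. A hedged example, with made-up helper names: a common guess is that building a string with `+=` in a loop is about as fast as str.join; measuring settles it instead of betting:

```python
# Measure instead of guessing: timeit compares two ways of
# building one string out of many pieces.
import timeit

def concat(parts):
    # The "intuitive" version: repeated += on an accumulator string.
    s = ""
    for p in parts:
        s += p
    return s

def join(parts):
    # The idiomatic version: a single join over the pieces.
    return "".join(parts)

parts = ["x"] * 10_000
t_concat = timeit.timeit(lambda: concat(parts), number=100)
t_join = timeit.timeit(lambda: join(parts), number=100)
print(f"+= loop : {t_concat:.4f}s")
print(f"join    : {t_join:.4f}s")
```

Whatever the numbers come out as on your machine, they are evidence, and your guess was not.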

What options do I have?

Before thinking about the tools, let's try to split them according to some general rules of thumb.

Overview of the available groups

  1. What is being profiled?
  • CPU usage
  • RAM usage
  • I/O or network
  2. How is it profiled?
  • Deterministic (one instruction at a time; precise information; slow; generates a huge amount of data)
  • Statistical (looks at the stack from time to time; not as precise; fast; generates a manageable amount of data)
  3. Which granularity?
  • Application-level
  • Method-level
  • Line-level
  4. How is it reported?
  • Textual reports
  • Color mode
  • Exploding pies
  • Square maps
  • Call graphs
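Two of these granularities can be contrasted with nothing but the standard library. A sketch, with invented function names (parse, total, app): application-level timing gives one number for the whole run, while cProfile gives a deterministic, method-level breakdown with a textual report:

```python
# Application-level vs method-level granularity, standard library only.
import cProfile
import pstats
import time

def parse(data):
    return [int(x) for x in data]

def total(numbers):
    return sum(numbers)

def app():
    data = [str(i) for i in range(100_000)]
    return total(parse(data))

# Application-level: a single wall-clock number for the whole run.
start = time.perf_counter()
result = app()
elapsed = time.perf_counter() - start
print(f"whole run: {elapsed:.4f}s")

# Method-level: deterministic per-function breakdown, textual report.
profiler = cProfile.Profile()
profiler.runcall(app)
pstats.Stats(profiler).sort_stats("tottime").print_stats(3)
```

Line-level granularity needs third-party tools, which is exactly where part two picks up.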

The tools

This brings us to the end of Part 1. As promised, in Part 2 we start to play with the tools above, profiling the app I mentioned at the start of the article. I hope we see each other there.

You can find other interesting articles on similar topics here.

Cool stuff

  1. Profile for performance: I stole a lot of information from this wonderful presentation
  2. Advanced Python Profiling: an excellent presentation with tons of code examples


[1] well, that's not quite true; you still need to know about concepts such as functions, modules and the stack

[2] here we cheat and say that the system is performant if it satisfies some conditions, usually given in the code specifications.

[3] well, usually not that easy, but you get the point

[4] X can be CPU time, RAM, I/O, power

About the author
"All it takes is one bad day to reduce the sanest man alive to lunacy."
-- Joker in The Killing Joke (1988)

