From python script to package
Start with just a script
Your project is about spamming things; you kind of know a-priori that
in C++ it’d be a library called libspam; however, you’re using python
for some reason. To this end, start with a simple script called
spam.py
in a directory called spam
.
spam
└── spam.py
spam.py
is just a script; it doesn’t have any functions.
Progress to a module
After a little while, the spamming gets quite large. You need another
script called bar.py
that also spams, but differently, so you want
to split the spamming out to a library.
To do this, move the common code into functions in the same file
called spam.py
, but move the different bits into distinct scripts:
spam
├── bar.py
├── foo.py
└── spam.py
So, spam.py
contains a common function called spam()
.
Then in foo.py
you can say
import spam
and it will find your new library and import it. In python-speak,
spam.py
is now a module. Python will find it even without setting
PYTHONPATH
. However, your function is now accessed as
spam.spam()
, which is stupid. So, this is better:
from spam import *
Or this:
from spam import spam
Now it’s just spam()
.
And to a package
After a while, spam.py
might get quite big, including Spam
and
Eggs
classes (classes are CamelCase in python). So you tend to want
to split it up. The first thing might be to put the eggy bits into a
file called eggs.py
:
spam
├── bar.py
├── foo.py
├── spam.py
└── eggs.py
However, there’s now no relationship between these for python. To enforce that, put them both in a directory like this:
spam
├── bar.py
├── foo.py
└── spam
├── __init__.py
├── spam.py
└── eggs.py
In python speak, spam
is now a package. __init__.py
can be
empty; it just means “this is a package” to the python interpretter.
import spam
still finds everything. Python will still find it
without setting PYTHONPATH
.
There are now a lot of things called spam
though. You can say
(somewhat absurdly)
import spam
s = new spam.spam.Spam
It’s tricky to see a way around this.
As a C++ programmer, one sensible way seems to be to put classes in
files named for the class, so there’s a class Spam
in a file
spam.py
in the directory spam
(as implied above). Then put this
in __init__.py
:
from spam import *
from eggs import *
And in foo.py
you can say
from spam import Spam
from spam import Eggs
s = new Spam
e = new Eggs
So, the __init__.py
file can squash some of the namespace madness
that the file hierarchy introduces. Python people don’t appear to
think this is sensible though. They would have you put related
classes into one file. This has the side-effect of leading to big
files that are tricky to navigate.