Hello again. Ahead of the start of a new group on the Machine Learning course, we want to share a translation of an article that is only indirectly related to ML but will certainly be useful to our blog subscribers.
Mariatta, a Canadian developer, asked on Twitter about python -m pip, asking for a blog post that covers this idiom and explains how it works.
I recently learned that you should write python -m pip instead of the usual pip install, but now I can't remember who I heard it from. Probably from @brettsky or @zooba. Do any of you have a blog post so I can share it with readers?
- Mariatta (@mariatta) October 29, 2019 (https://twitter.com/mariatta/status/1189243515739561985?ref_src=twsrc%5Etfw)
I'm not sure whether it was me who told Mariatta about python -m pip, but there is every chance it was, since I have been asking since 2016 that instructions for installing packages from PyPI be written this way. So this article explains what python -m pip is and why you should use it when running pip.
What is python -m pip?
To begin with, python -m pip runs pip using the Python interpreter you specified as python. Thus, /usr/bin/python3.7 -m pip means that you are running pip for the interpreter located at /usr/bin/python3.7. You can read the documentation about the -m flag if you don't know how it works (by the way, it is extremely useful).
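As a minimal sketch of what the idiom looks like when a script has to launch pip, the helper below builds the argument list for a chosen interpreter. The helper name pip_command and the package name requests are illustrative assumptions, and /usr/bin/python3.7 is simply the example path from the text. Nothing is installed here; the commands are only printed.

```python
import sys


def pip_command(interpreter, *pip_args):
    """Return the argv list that runs pip under the given interpreter."""
    return [interpreter, "-m", "pip", *pip_args]


# Example path from the text; substitute the interpreter you care about.
explicit = pip_command("/usr/bin/python3.7", "install", "requests")
print(" ".join(explicit))

# sys.executable is the interpreter running this script, so this command
# always targets the same Python that the script itself uses.
current = pip_command(sys.executable, "install", "requests")
print(" ".join(current))
```

Passing such a list to subprocess.run would run pip for exactly that interpreter, with no dependence on what pip means in PATH.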
Why use python -m pip instead of pip / pip3?
You might say: “Okay, but why can't I just use pip by running the pip command?” The answer is: “You can, but you will have less control over it.” I will explain what “less control” means with an example.
Suppose I have two versions of Python installed, for example Python 3.7 and 3.8 (this is very common among people who work on macOS or Linux, not to mention that you probably wanted to play with Python 3.8 when you already had Python 3.7). So, if you type pip in the terminal, which Python interpreter will you be installing the package for?
Without more information, you cannot know the answer. First you need to know what is in PATH, that is, whether /usr/bin or /usr/local/bin comes first (these are the most common places to install Python, and the latter usually comes first). So, if you remember where you installed Python 3.7 and 3.8 and they went into different directories, you will know which one comes first in PATH. But suppose you installed both manually, perhaps because Python 3.7.3 came preinstalled on your system and you installed Python 3.7.5. In this case, both versions of Python are installed in /usr/local/bin. Can you tell me now which interpreter pip is attached to?
You do not know the answer. Unless you know when each version was installed, you cannot tell which one last wrote /usr/local/bin/pip, and therefore which interpreter the pip command will use. Now you might say: “I always install the latest versions, so Python 3.8.0 will have been installed last, because it is newer than, say, 3.7.5.” Fine, but what happens when Python 3.7.6 comes out and you install it? Your pip would no longer come from Python 3.8, but from Python 3.7.
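To see which interpreter a bare pip command is currently tied to, you can look at the first line of the script that PATH resolves to. The sketch below is one hedged way to do that. The function name interpreter_behind and its search_path parameter are hypothetical, and the shebang line is only a hint: on Windows pip.exe is a binary launcher with no shebang, and some installations wrap the script differently.

```python
import shutil


def interpreter_behind(command, search_path=None):
    """Return the shebang interpreter of the first `command` found on PATH.

    Returns None if the command is not found or has no shebang line
    (for example a binary launcher such as pip.exe on Windows).
    """
    location = shutil.which(command, path=search_path)
    if location is None:
        return None
    with open(location, "rb") as script:
        first_line = script.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()


# Report what the bare commands resolve to on this machine, if anything.
for name in ("pip", "pip3"):
    print(name, "->", shutil.which(name), "->", interpreter_behind(name))
```

If the answer surprises you, that is exactly the uncertainty described above.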
When you use python -m pip with the specific Python interpreter you need, all the uncertainty disappears. If I write python3.8 -m pip, I know exactly which pip will be used and that the package will be installed for Python 3.8 (the same would be true if I specified python3.7).
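The reason this works is that every interpreter carries its own install location. As a small sketch, running the lines below with the interpreter you are curious about prints that interpreter's path and its default site-packages directory. This assumes a plain pip install without options such as --user or --target, which change the destination.

```python
import sys
import sysconfig

# The interpreter that is running this code.
print("interpreter:  ", sys.executable)

# The default site-packages directory of that same interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print("site-packages:", site_packages)
```

Running it once with python3.7 and once with python3.8 should print two different directories.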
If you use Windows, you have an additional incentive to use python -m pip: it allows pip to update itself. The reason is that pip.exe counts as running when you type pip install --upgrade pip, and Windows will not let you overwrite pip.exe at that point. However, if you run python -m pip install --upgrade pip, you work around this problem, because it is python.exe that is running, not pip.exe.
And what happens when I am in an activated environment?
Whenever I explain the point of this article to people, someone always says: "I always use a virtual environment, so this does not apply to me." Well, for starters, it is good to ALWAYS use a virtual environment! (I will explain why I think so in one of my next articles!) But to be honest, I would still insist on using python -m pip, even if, strictly speaking, it is not necessary.
Firstly, if you use Windows, you will still want to use python -m pip so that you can update pip in your environment.
Secondly, even if you use a different operating system, I would say that you should still use python -m pip, since it works regardless of the situation. It will alert you to the mistake if you forget to activate the environment, and anyone watching you will pick up the best practice. Personally, I do not think that saving 10 keystrokes is worth abandoning a good practice. It will also help you avoid mistakes when writing automation scripts that would do the wrong thing if you forgot to activate the environment.
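For automation scripts, one possible safeguard is to check for an active virtual environment before building the install command. The sketch below is an assumption-laden illustration rather than an established recipe: the function names are hypothetical, the check relies on sys.prefix differing from sys.base_prefix (which holds for venv and recent virtualenv releases), and it does not detect conda environments.

```python
import sys


def in_virtual_environment():
    """Return True when running inside a venv/virtualenv environment."""
    # In a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def guarded_install_command(packages, require_venv=True):
    """Build a `python -m pip install` command, refusing outside a venv."""
    if require_venv and not in_virtual_environment():
        raise RuntimeError("No virtual environment is active; refusing to install.")
    return [sys.executable, "-m", "pip", "install", *packages]


print("inside a virtual environment:", in_virtual_environment())
```

Because the command starts with sys.executable, it targets whichever interpreter runs the script, and the guard turns a forgotten activation into an explicit error instead of a silent global install.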
Personally, whenever I use a tool whose behavior depends on which interpreter runs it, I always use -m, regardless of whether a virtual environment is activated. It is always important to me to know which Python interpreter I am using.
ALWAYS use an environment! Do not install everything into the global interpreter!
While we are on the subject of avoiding confusion when installing packages in Python, I want to stress that you should not install anything into the global Python interpreter when you work locally (containers are another matter entirely)! If it is the Python that came preinstalled with your system and you install an incompatible version of a library that your OS relies on, you will effectively break the system.
But even if you installed a separate copy of Python for yourself, I still strongly recommend not installing directly into it for local development. Eventually your projects will use various packages that may conflict with each other, and you will not have a clear picture of each project's dependencies. It is much better to use environments to isolate individual projects and their tools from each other. The Python community uses two types of environments: virtual environments and conda environments. There is even a dedicated way to install Python tools in isolation.
If you need to install a tool
For a standalone installation of a tool, I recommend pipx. Each tool gets its own virtual environment, so it does not conflict with the others. Thus, if you want a single installation of, for example, Black, you can have it without accidentally breaking your single installation of mypy.
If you need an environment for the project (and you do not use conda)
When I need to create an environment for a project, I personally always reach for venv and virtual environments. It is part of Python's standard library, so it is always available through python -m venv (unless, of course, you are on Debian or Ubuntu, in which case you may need to install the python3-venv apt package). A bit of history: I actually removed the old pyvenv command, which Python used to install for creating virtual environments with venv, for the same reasons that you should use python -m pip instead of pip. That is, it was not clear which interpreter you were creating the virtual environment for with the old pyvenv command. And remember that you do not need to activate an environment in order to use the interpreter it contains: running .venv/bin/python works just as well as activating the environment and typing python.
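The point about not needing activation can be sketched with the standard library's venv module. The example below creates a throwaway environment in a temporary directory and calls its interpreter directly by path. The helper names are hypothetical, pip is deliberately left out of the environment (with_pip=False) to keep the sketch fast, and the Scripts/python.exe location is the usual Windows layout.

```python
import os
import subprocess
import tempfile
import venv


def env_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def runs_inside_env(env_dir):
    """Create an environment and ask its interpreter whether it is in one."""
    # with_pip=False keeps the sketch fast and avoids needing ensurepip.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    check = "import sys; print(sys.prefix != sys.base_prefix)"
    # No activation: the environment's interpreter is called by its path.
    result = subprocess.run(
        [env_python(env_dir), "-c", check],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"


with tempfile.TemporaryDirectory() as tmp:
    created_ok = runs_inside_env(os.path.join(tmp, ".venv"))
    print("interpreter sees its own environment:", created_ok)
```

In a real project you would create the environment with pip included and then run .venv/bin/python -m pip install for your packages, again without activating anything.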
Today, some developers still prefer virtualenv, since it is available on Python 2 and has some additional features. Personally, I have little interest in the additional features, and having venv built in means that I do not need to use pipx to install virtualenv on every machine. But if venv does not suit your needs and you want a virtual environment, see whether virtualenv offers what you need.
If you use conda
If you use conda, you can use conda environments to get the same effect that virtual environments created with venv provide. I am not going to go into whether you should use conda or venv in your specific situation, but if you do use conda, be aware that you can (and should) create conda environments for your work instead of installing everything into your system-wide installation. That way you get a clear understanding of what dependencies your project has (and this is a good reason to use miniconda instead of the full anaconda, since the former is less than a tenth of the size of the latter).
There are always containers
Working in a container is a way to avoid dealing with environments at all, since your entire “machine” becomes a separate environment. As long as you do not install into the container's system Python, you should be able to safely make a global installation so that your container stays simple and straightforward.
Let me repeat, so that you really get the point ...
Do not install anything into your global Python interpreter! Always try to use an environment for local development! I have lost count of how many times I have had to help someone who thought pip was installing into one Python interpreter when it was actually installing into another. The same goes for all the times people broke their entire system, or wondered why they could not install something that conflicted with something else they had previously installed for another project, and so on, all because they did not bother to set up an environment on their local machine.
So, for the sake of both your sleep and mine, use python -m pip and try to always use an environment.