+++ rss = "GSoC 2020: Outro" date = Date(2020, 8, 31) tags = ["gsoc", "pip", "python"] +++ # Outro > Steamed fish was amazing, matter of fact\ > Let me get some jerk chicken to go\ > Grabbed me one of them lemon pie theories\ > And let me get some of them benchmarks you theories too \toc ## The Look At the time of writing, {{pip 8771 "implementation-wise parallel download is ready"}}: [![asciicast](/assets/pip-8771.svg)](https://asciinema.org/a/356704) Does this mean I've finished everything just-in-time? This sounds to good to be true! And how does it perform? Welp... ## The Benchmark Here comes the bad news: under a decent connection to the package index, using `fast-deps` does not make `pip` faster. For best comparison, I will time `pip download` on the following cases: ### Average Distribution For convenience purposes, let's refer to the commands to be used as follows ```console $ pip --no-cache-dir download {requirement} # legacy-resolver $ pip --use-feature=2020-resolver \ --no-cache-dir download {requirement} # 2020-resolver $ pip --use-feature=2020-resolver --use-feature=fast-deps \ --no-cache-dir download {requirement} # fast-deps ``` In the first test, I used [axuy] and obtained the following results | legacy-resolver | 2020-resolver | fast-deps | | --------------- | ------------- | --------- | | 7.709s | 7.888s | 10.993s | | 7.068s | 7.127s | 11.103s | | 8.556s | 6.972s | 10.496s | Funny enough, running `pip download` with `fast-deps` in a directory with downloaded files already took around 7-8 seconds. This is because to lazily download a wheel, `pip` has to {{pip 8670 "make many requests"}} which are apparently more expensive than actual data transmission on my network. !!! note "When is it useful then?" With unstable connection to PyPI (for some reason I am not confident enough to state), this is what I got | 2020-resolver | fast-deps | | ------------- | --------- | | 1m16.134s | 0m54.894s | | 1m0.384s | 0m40.753s | | 0m50.102s | 0m41.988s | As the connection was *unstable* and that the majority of `pip` networking is performed as CI/CD with large and stable bandwidth, I am unsure what this result is supposed to tell (-; ### Large Distribution In this test, I used [TensorFlow] as the requirement and obtained the following figures: | legacy-resolver | 2020-resolver | fast-deps | | --------------- | ------------- | --------- | | 0m52.135s | 0m58.809s | 1m5.649s | | 0m50.641s | 1m14.896s | 1m28.168s | | 0m49.691s | 1m5.633s | 1m22.131s | ### Distribution with Conflicting Dependencies Some requirement that will trigger a decent amount of backtracking by the current implementation of the new resolver `oslo-utils==1.4.0`: | 2020-resolver | fast-deps | | ------------- | --------- | | 14.497s | 24.010s | | 17.680s | 28.884s | | 16.541s | 26.333s | ## What Now? I don't know, to be honest. At this point I'm feeling I've failed my own (and that of other stakeholders of `pip`) expectation and wasted the time and effort of `pip`'s maintainers reviewing dozens of PRs I've made in the last three months. On the bright side, this has been an opportunity for me to explore the codebase of package manager and discovered various edge cases where the new resolver has yet to cover (e.g. I've just noticed that `pip download` would save to-be-discarded distributions, I'll file an issue on that soon). Plus I got to know many new and cool people and idea, which make me a more helpful individual to work on Python packaging in the future, I hope. [TensorFlow]: https://www.tensorflow.org [axuy]: https://sr.ht/~cnx/axuy