rss = "GSoC 2020: Unexpected Things When You're Expecting"
date = Date(2020, 6, 9)
@def tags = ["pip", "gsoc"]
# Unexpected Things When You're Expecting
Hi everyone, I hope that you are all doing well and wish you all good health!
The last week has not been really kind to me, with a decent amount of
academic pressure (my school year lasts until early July).
It would be bold to say that I have spent 10 hours working on my GSoC project
since the last check-in, let alone the 30 hours per week requirement.
That being said, there were still some discoveries that I wish to share.
## The `multiprocessing[.dummy]` wrapper
Most of my time was spent finalizing the multi{processing,threading}
wrapper for the `map` function that submits tasks to the worker pool.
To my surprise, it is rather difficult to write something that is
not only portable but also easy to read and test.
By {{pip 8320 "the latest commit"}}, I realized the following:
1. The `multiprocessing` module was not designed for the implementation
details to be abstracted away entirely.  For example, the lazy `map`s
can be really slow without a suitable chunk size
(to cut the input iterable and distribute it to workers in the pool).
By *suitable*, I mean only an order of magnitude smaller than the input.  This defeats
half of the purpose of making it lazy: allowing the input to be
evaluated lazily.  Luckily, in the use case I'm aiming for, the length of
the iterable argument is small and the laziness is only needed for the output
(to pipeline download and installation).
2. Mocking `import` for testing purposes can never be pretty.  One reason
is that we (Python users) have very little control over the calls of
`import` statements and its lower-level implementation `__import__`.
In order to properly patch this built-in function, unlike others
of the same group, we have to `monkeypatch` the name from `builtins`
(or `__builtin__` under Python 2) instead of the module that imports stuff.
Furthermore, because of the special namespacing, to avoid infinite recursion
we need to alias the function to a different name for fallback.
3. To add to the problem, `multiprocessing` lazily imports the fragile module
during pool creation.  Since the failure is platform-specific
(the lack of `sem_open`), it was decided to check for it upon the import
of `pip`'s module.  Although the behavior is easier to reason about
in human language, testing it requires invalidating the cached import and
re-importing the wrapper module.
4. Last but not least, I now understand the pain of keeping Python 2
compatibility that many package maintainers still need to deal with
every day (although Python 2 has reached its end-of-life, `pip`, for
example, {{pip 6148 "will still support it for another year"}}).
## The change in direction
Since last week, my mentor Pradyun Gedam and I have set up weekly real-time
meetings (a fancy term for video/audio chat in the worldwide quarantine
era) for the entire GSoC period. During the last session, we decided to
put parallelization of download during resolution on hold, in favor of a
more beneficial goal: {{pip 7819 "partially download the wheels during
dependency resolution"}}.
As discussed by Danny McClanahan and the maintainers of `pip`, it is feasible
to only download a few kB of a wheel to obtain enough metadata for
dependency resolution.  While this is only applicable to wheels
(i.e. prebuilt packages), other packaging formats make up less than 20%
of the downloads (at least on PyPI), and the figure is much lower for
the most popular packages.  Therefore, this optimization alone could put
[the upcoming backtracking resolver][]'s performance on par with the legacy one.
During the last few years, a lot of effort has been poured into
replacing `pip`'s current resolver, which is unable to resolve conflicts.
While the new resolver's correctness will be ensured by some of the most talented and
hard-working developers in the Python packaging community, from the users'
point of view, it would be better for its performance not to lag
behind the old one.  Aside from the increase in CPU cycles for more
rigorous resolution, more I/O, especially networking operations, is expected
to be performed.  This is due to {{pip 7406#issuecomment-583891169 "the lack
of a standard and efficient way to acquire the metadata"}}.  Therefore, unlike
most package managers we are familiar with, `pip` has to fetch
(and possibly build) the packages solely for dependency information.
Fortunately, {{pep 427 recommended-archiver-features}} recommends that
package builders place the metadata at the end of the archive.
This allows the resolver to only fetch the last few kB using
[HTTP range requests][] for the relevant information.
Simply adding `Range: bytes=-8000` to the request headers
in `pip._internal.network.download` makes the resolution process
*lightning* fast.  Of course this breaks the installation, but I am confident
that it is not difficult to implement this optimization cleanly.
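For illustration, this is roughly what such a request looks like when built with only the standard library.  It is a sketch rather than the actual change made in `pip`; `tail_request` and the URL are hypothetical.

```python
from urllib.request import Request


def tail_request(url, nbytes=8000):
    """Build a GET request asking only for the last `nbytes` bytes."""
    # A suffix range ("bytes=-N") means "the final N bytes of the resource".
    return Request(url, headers={"Range": f"bytes=-{nbytes}"})


# Hypothetical URL, used only to show the resulting header.
request = tail_request("https://example.org/some-wheel.whl")
print(request.get_header("Range"))
```

A server that supports range requests answers with `206 Partial Content`, while one that does not simply ignores the header and sends the whole file with `200 OK`, so the caller has to check the status code.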
One drawback of this optimization is compatibility.  Not every Python
package index supports range requests, and it is not possible to verify
the partial wheel.  While the first case is unavoidable, for the other,
hash checking is usually used for pinned/locked-version requirements,
thus no backtracking is done during dependency resolution.
Either way, before installation, the packages selected by the resolver
can be downloaded in parallel.  This guarantees a larger batch of packages,
compared to parallelization during resolution, where the number of downloads
can be as low as one during trials of different versions of the same package.
Unfortunately, I have not been able to do much other than
{{pip 8411 "a minor clean up"}}.  I am looking forward to accomplishing more
this week and seeing where this path will lead us!  At the moment,
I am happy that I'm able to meet the blog deadline, at least in UTC!
[the upcoming backtracking resolver]: http://www.ei8fdb.org/thoughts/2020/05/test-pips-alpha-resolver-and-help-us-document-dependency-conflicts
[HTTP range requests]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
rss = "GSoC 2020: The Wonderful Wizard of O'zip"
date = Date(2020, 6, 22)
@def tags = ["pip", "gsoc"]
# The Wonderful Wizard of O'zip
> Never give up... No one knows what's going to happen next.
## Preface
Greetings and best wishes!  I had a lot of fun during the last week,
although admittedly nothing was really finished.  In summary,
these are the works I carried out in the last seven days:
* Finalizing {{pip 8320 "utilities for parallelization"}}
* {{pip 8467 "Continuing experimenting"}}
on {{pip 8442 "using lazy wheels for dependency resolution"}}
* Polishing up {{pip 8411 "the patch"}} refactoring
`operations.prepare.prepare_linked_requirement`
* Adding `flake8-logging-format`
{{pip 8423#issuecomment-645418725 "to the linter"}}
* Splitting {{pip 8456 "the linting patch"}} from {{pip 8332 "the PR adding
the license requirement to vendor README"}}
## The `multiprocessing[.dummy]` wrapper
Yes, you read it right, this is the same section as last fortnight's blog.
My mentor Pradyun Gedam gave me a green light to have {{pip 8411}} merged
without support for Python 2 and the non-lazy map variant, which turns out
to be troublesome for multithreading.
The tests still need to pass of course and the flaky tests (see failing tests
over Azure Pipeline in the past) really gave me a panic attack earlier today.
We probably need to mark them as xfail or investigate why they are
nondeterministic specifically on Azure, but the real reason I was *all caught up
and confused* was that the unit tests I added mess with the cached imports
and as `pip`'s tests are run in parallel, who knows what they might affect.
I was so relieved not to discover any new set of tests made flaky by the ones
I'm trying to add!
## The file-like object mapping ZIP over HTTP
This is where the fun starts.  Before we dive in, let's recall some
background information on this.  As discovered by Danny McClanahan
in {{pip 7819}}, it is possible to only download a portion of a wheel
and it's still valid for `pip` to get the distribution's metadata.
In the same thread, Daniel Holth suggested that one may use
HTTP range requests to specifically ask for the tail of the wheel,
where the ZIP's central directory record and, usually,
`dist-info` (the directory containing `METADATA`) can be found.
Well, *usually*.  While {{pep 427}} does indeed recommend
> Archivers are encouraged to place the `.dist-info` files physically
> at the end of the archive.  This enables some potentially interesting
> ZIP tricks including the ability to amend the metadata without
> rewriting the entire archive.
one of the mentioned *tricks* is adding shared libraries to wheels
of extension modules (using e.g. `auditwheel` or `delocate`).
Thus for non-pure Python wheels, it is unlikely that the metadata
lies in the last few megabytes.  Ignoring source distributions is bad enough;
we can't afford making an optimization that doesn't work for extension modules,
which are still an integral part of the Python ecosystem )-:
But hey, the ZIP's directory record is guaranteed to be at the end of the file!
Couldn't we do something about that?  The short answer is yes.  The long answer
is, well, yessssssss!  With that, plus magic provided by most operating systems,
this is what we figured out:
1. We can download a relatively small chunk at the end of the wheel
   until it is recognizable as a valid ZIP file.
2. In order for the end of the archive to actually appear as the end to
   `zipfile`, we feed to it an object with `seek` and `read` defined.
   As navigating to the rear of the file is performed by calling `seek`
   with relative offset and `whence=SEEK_END` (see `man 3 fseek`
   for more details), we are completely able to make the wheel in the cloud
   behave as if it were available locally.
   ![Wheel in the cloud](/assets/cloud.gif)
3. For large wheels, it is better to store them on hard disks instead of in memory.
   For smaller ones, it is also preferable to store them as files to avoid
   (error-prone and often not really efficient) manual tracking and joining
   of downloaded segments.  We only use a small portion of the wheel; however,
   just in case one is wondering, we have very little control over
   when `tempfile.SpooledTemporaryFile` rolls over, so the memory-disk hybrid
   is not exactly working as expected.
4. With all these in mind, all we have to do is to define an intermediate object
   that checks for local availability and downloads if needed on calls to `read`,
   to lazily provide the data over HTTP and reduce execution time.
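The steps above can be sketched as follows.  This is a simplified illustration and not the code from the pull request: `LazyRemoteFile` and its `fetch` callable are made up, `fetch(start, stop)` stands in for an HTTP range request, and nothing is cached on disk.

```python
import io
import zipfile


class LazyRemoteFile:
    """Read-only file-like object that fetches bytes only when read."""

    def __init__(self, length, fetch):
        self._length = length  # total size, e.g. taken from Content-Length
        self._fetch = fetch    # hypothetical callable: (start, stop) -> bytes
        self._pos = 0

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos,
                io.SEEK_END: self._length}[whence]
        if base + offset < 0:
            raise OSError("negative seek position")
        self._pos = base + offset
        return self._pos

    def read(self, size=-1):
        stop = self._length if size is None or size < 0 else self._pos + size
        stop = min(stop, self._length)
        data = self._fetch(self._pos, stop) if stop > self._pos else b""
        self._pos += len(data)
        return data


# Stand-in for a wheel on a server: an in-memory archive and a fake
# range fetcher recording which byte ranges were asked for.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("pkg/__init__.py", "x = 1\n" * 10000)
    archive.writestr("pkg-1.0.dist-info/METADATA", "Name: pkg\n")
payload = buffer.getvalue()
requested = []


def fetch(start, stop):
    requested.append((start, stop))
    return payload[start:stop]


with zipfile.ZipFile(LazyRemoteFile(len(payload), fetch)) as wheel:
    metadata = wheel.read("pkg-1.0.dist-info/METADATA")
```

After running this, `requested` only contains a handful of small ranges near the end of the archive, even though the fake wheel is tens of kilobytes large.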
The only theoretical challenge left is to keep track of downloaded intervals,
which I finally figured out after some trial and error.  The code
was submitted as a pull request to `pip` at {{pip 8467}}.  A more modern
(read: Python 3-only) variant was packaged and uploaded to PyPI under
the name of [lazip][].  I am unaware of any use case for it outside of `pip`,
but it's certainly fun to play with d-:
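One possible way to track the intervals is sketched below.  It is not necessarily how the pull request or lazip does it, and `missing_and_merged` is a hypothetical name: given the sorted, disjoint intervals already on disk and a requested range, it returns the gaps still to be fetched along with the updated list.

```python
def missing_and_merged(downloaded, start, stop):
    """Return (gaps to fetch, updated intervals) for the range [start, stop).

    `downloaded` is a sorted list of disjoint half-open (start, stop) pairs.
    """
    gaps, merged = [], []
    low, high = start, stop   # bounds of the interval absorbing the request
    cursor = start            # first byte not yet known to be available
    for left, right in downloaded:
        if right < start or left > stop:
            merged.append((left, right))  # neither overlapping nor adjacent
            continue
        if left > cursor:
            gaps.append((cursor, left))   # hole before this downloaded piece
        cursor = max(cursor, right)
        low, high = min(low, left), max(high, right)
    if cursor < stop:
        gaps.append((cursor, stop))       # hole after the last piece
    merged.append((low, high))
    merged.sort()
    return gaps, merged


# Bytes 0-9 and 20-29 are available; reading 5-24 only needs 10-19.
gaps, intervals = missing_and_merged([(0, 10), (20, 30)], 5, 25)
print(gaps, intervals)
```

Only the ranges in `gaps` need to be requested from the server, and `intervals` replaces the old list once they have been written to the local file.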
## What's next?
I have been falling short of getting the PRs mentioned above merged for
quite a while.  With `pip`'s next beta coming really soon, I have to somehow
make the patches reach a certain standard and get enough attention to be part of
the pre-release.  Beta-testing would greatly help the success of the GSoC project.
To other GSoC students and mentors reading this, I also hope your projects
turn out successful!
[lazip]: https://pypi.org/project/lazip/
rss = "GSoC 2020: I'm Not Drowning On My Own"
date = Date(2020, 7, 6)
@def tags = ["pip", "gsoc"]
# I'm Not Drowning On My Own
## Cold Water
Hello there!  My school year is coming to an end, with some final assignments
and group projects left to be done.  I for sure underestimated the workload
of these and in the last (and probably next) few days I'm drowning in work
trying to meet my deadlines.
One project that might be remotely relevant is [cheese-shop][], which tries to
manage the metadata of packages from the real [Cheese Shop][].  Other than that,
schoolwork is draining a lot of my time and I can't remember the last time
I came up with something new for my GSoC project )-;
## Warm Water
On the bright side, I received a lot of help and encouragement
from contributors and stakeholders of `pip`.  In the last week alone,
I had five pull requests merged:
* {{pip 8332}}: Add license requirement to `_vendor/README.rst`
* {{pip 8320}}: Add utilities for parallelization
* {{pip 8504}}: Parallelize `pip list --outdated` and `--uptodate`
* {{pip 8411}}: Refactor `operations.prepare.prepare_linked_requirement`
* {{pip 8467}}: Add utility to lazily acquire wheel metadata over HTTP
In addition to helping me get my PRs merged, my mentor Pradyun Gedam
also gave me my first official feedback, including what I'm doing right
(and wrong too!) and what I should keep doing to increase the chance of
the project being successful.
{{pip 7819}}'s roadmap (Danny McClanahan's discoveries and works on lazy wheels)
is being closely tracked by `hatch`'s maintainer Ofek Lev, which really
makes me proud and warms my heart, that what I'm helping build is actually
needed by the community!
## Learning How To Swim
With {{pip 8467}} and {{pip 8530}} merged, I'm now working on {{pip 8532}}
which aims to roll out the lazy wheel as the way to obtain
dependency information via the CLI flag `--use-feature=lazy-wheel`.
{{pip 8532}} was failing initially, despite being relatively trivial and despite
the commit it was based on passing.  Surprisingly, after rebasing it
on top of {{pip 8530}}, it mysteriously turned green.  After the first
(early) review, I was able to iterate on my earlier code, which used
the ambiguous exception `RuntimeError`.
The rest to be done is *just* adding some functional tests (I'm pretty sure
this will be either overwhelming or underwhelming) to make sure that
the command-line flag is working correctly.  Hopefully this can make it into
the beta of the upcoming release {{pip 8511 "this month"}}.
![Lazy wheel](/assets/lazy-wheel.jpg)
In other news, I've also submitted {{pip 8538 "a patch improving the tests
for the parallelization utilities"}}, which were really messy as I wrote them.
Better late than never!
Metaphors aside, I actually can't swim d-:
## Diving Plan
After {{pip 8532}}, I think I'll try to parallelize downloads of wheels
that are lazily fetched only for metadata.  By the current implementation
of the new resolver, for `pip install`, this can be injected directly
between the resolution and build/installation process.
[cheese-shop]: https://github.com/McSinyx/cheese-shop
[Cheese Shop]: https://pypi.org
rss = "GSoC 2020: I've Walked 500 Miles..."
date = Date(2020, 7, 20)
@def tags = ["pip", "gsoc"]
# I've Walked 500 Miles...
> ... and I would walk 500 more\
> Just to be the man who walks a thousand miles\
> To fall down at your door
> ![500 miles](/assets/500-miles.gif)
## The Main Road
Hi, have you met `fast-deps`?  It's (going to be) the name of `pip`'s
experimental feature that may improve the speed of dependency resolution
of the new resolver.  By avoiding downloading whole wheels just to
obtain metadata, it is especially helpful when `pip` has to do
heavy backtracking to resolve conflicts.
Thanks to {{pip 8532#discussion_r453990728 "Chris Hunt's review on GH-8537"}},
my mentor Pradyun Gedam and I worked out a less hacky approach to interject
the call to lazy wheel during the resolution process.  A new PR {{pip 8588}}
was filed to implement it.  I could have *just* worked on top of the old PR
and rebased, but my `git` skill is far from gud enuff to confidently do it.
Testing this one has been a lot of fun though.  At first, integration tests
were added as a rerun of the tests for the new resolver, with an additional flag
to use feature `fast-deps`.  It indeed made me feel guilty towards [Travis][],
who has to work around 30 minutes more every run.  Per Chris Hunt's suggestion,
in the new PR, I instead wrote a few functional tests for the area relating
the most to the feature, namely `pip`'s subcommands `wheel`,
`download` and `install`.
It was also suggested that a mock server with HTTP range requests support
might be a better choice for testing (in terms of performance and reliability).
However, {{pip 8584#issuecomment-659227702 "I have yet to be able to make
Werkzeug do it"}}.
Why did I say I'm halfway there?  With the parallel utilities merged and a way
to quickly get the list of distributions to be downloaded being really close,
what's left is *only* to figure out a way to properly download them in parallel.
With no distribution to be added during the download process, the model of this
will fit very well with the architecture in [my original proposal][].
A batch downloader can be implemented to track the progress of each download
and thus report them cleanly as e.g. progress bar or percentage.  This is
the part I am second-most excited about of my GSoC project this summer
(after the synchronization of downloads written in my proposal, which was then
superseded by `fast-deps`) and I can't wait to do it!
## The Side Quests
As usual, I make sure that I complete every side quest I see during the journey:
* {{pip 8568}}: Declare constants in `configuration.py` as such
* {{pip 8571}}: Clean up `Configuration.unset_value`
  and nit the class' `__init__`
* {{pip 8578}}: Allow verbose/quiet level
  to be specified via config file and env var
* {{pip 8599}}: Replace tabs by spaces for consistency
## Snap Back to Reality
A bit about me, I actually walked 500 meters earlier today to a bank
and walked 500 more to another to prepare my Visa card for purchasing
the upcoming Pinephone prototype.  It's one of the first smartphones
to fully support a GNU/Linux distribution, where one can run desktop apps
(including proper terminals) as well as traditional services like SSH,
HTTP server and IPFS node because why not?  Just a few hours ago,
I pre-ordered the [postmarketOS community edition][] with additional hardware
for convergence.
If you did not come here for a Pinephone ad, please accept my apologies though d-;
and to the ones reading this, I hope you all can become the person who walks
a thousand miles to fall down at the door opening to all
that you ever wished for!
[Travis]: https://travis-ci.com
[my original proposal]: /assets/pip-parallel-dl.pdf
[postmarketOS community edition]: https://postmarketos.org/blog/2020/07/15/pinephone-ce-preorder/
rss = "GSoC 2020: Sorting Things Out"
date = Date(2020, 8, 3)
@def tags = ["pip", "gsoc"]
# Sorting Things Out
Hi!  I really hope that everyone reading this is still doing okay,
and if that isn't the case, I wish you a good day!
## `pip` 20.2 Released!
Last Wednesday, `pip` 20.2 was released, delivering the `2020-resolver`
as well as many other improvements!  I was lucky to be able
to get the `fast-deps` feature included as part of the release.
A brief description of this *experimental* feature as well as testing
instructions can be found on [Python Discuss][].
The public exposure of the feature also reminded me of some further
{{pip 8681 optimization}} to make on {{pip 8670 "the lazy wheel"}}.
Hopefully, even without download parallelization, it will not be so slow
as to put off testing by concerned users of `pip`.
## Preparation for Download Parallelization
As of this moment, we already have:
* {{pip 8162#issuecomment-667504162 "Multithreading pool fallback working"}}
* An opt-in to use lazy wheel to obtain dependency information,
  and thus getting a list of wheels at the end of resolution
  ready to be downloaded together
What's left is *only* to interject a parallel download somewhere after
the dependency resolution step.  Still, I've struggled with this way more than
I'd ever imagined.  I got so stuck that I had to give myself a day off
in the middle of the week (and study some Rust), then I came up with
{{pip 8638 "something that was agreed upon as difficult to maintain"}}.
Indeed, a large part of this is my fault, for not communicating the design
thoroughly with `pip`'s maintainers and not carefully noting stuff down
during (verbal) discussions with my mentor.  Thankfully {{pip 8685
"Chris Hunt came to the rescue"}} and did a refactoring that will
make my future work much easier and cleaner.
[Python Discuss]: https://discuss.python.org/t/announcement-pip-20-2-release/4863/2
rss = "GSoC 2020: Parallelizing Wheel Downloads"
date = Date(2020, 8, 17)
@def tags = ["pip", "gsoc"]
# Parallelizing Wheel Downloads
> And now it's clear as this promise\
> That we're making\
> Two progress bars into one
Hello there! It has been raining a lot lately and some mosquito has given me
the Dengue fever today.  To whoever is reading this, I hope it never happens
to you.
## Download Parallelization
I've been working on `pip`'s download parallelization for quite a while now.
As distribution download in `pip` was modeled as a lazily evaluated iterable
of chunks, parallelizing such a procedure is as simple as submitting routines
that write files to disk to a worker pool.
Or at least that is what I thought.
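The naive idea can be sketched like this, using the thread-based pool from the standard library.  The names `save`, `save_all` and `fake_chunks` are made up for illustration, and the chunk generators stand in for HTTP response bodies; this is not `pip`'s downloader.

```python
import os
import tempfile
from multiprocessing.dummy import Pool  # thread-based pool, same API


def save(job):
    """Write one download, given as (path, iterable of chunks), to disk."""
    path, chunks = job
    with open(path, "wb") as output:
        for chunk in chunks:  # chunks are only produced as we iterate
            output.write(chunk)
    return path


def save_all(jobs, workers=4):
    """Consume every download in a worker pool and return the paths."""
    with Pool(workers) as pool:
        return list(pool.imap_unordered(save, jobs))


def fake_chunks(payload, size=4):
    """Stand-in for the chunks of an HTTP response body."""
    for i in range(0, len(payload), size):
        yield payload[i:i + size]


with tempfile.TemporaryDirectory() as directory:
    jobs = [(os.path.join(directory, name), fake_chunks(name.encode() * 3))
            for name in ("a.whl", "b.whl", "c.whl")]
    print(sorted(os.path.basename(path) for path in save_all(jobs)))
```

Nothing here reports progress to the user, and that turned out to be the hard part.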
## Progress Reporting UI
`pip` is currently using custom progress reporting classes,
which were not designed to work with multithreaded code.  At first, I wanted to
try using these instead of defining a separate UI for multithreaded progress.
As they use system signals for termination, the progress bars have to be
running in the main thread.  Or sort of.
Since the progress bars are designed as iterators, I realized that we
can call `next` on them.  So quickly, I threw in some queues and locks,
and prototyped the first *working* {{pip 8771 "implementation of
progress synchronization"}}.
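The general shape of such a synchronization, with workers reporting through a queue and the main thread being the only one touching the display, might look like the sketch below.  It is an assumption-laden illustration rather than the prototype in the pull request: `download`, `run_with_progress` and `DONE` are invented names, and the running total stands in for a real progress bar.

```python
from queue import Queue
from threading import Thread

DONE = object()  # sentinel put on the queue when a worker has finished


def download(chunks, reports):
    """Worker: consume chunks and report each chunk's size."""
    for chunk in chunks:
        reports.put(len(chunk))  # a real worker would also write the chunk
    reports.put(DONE)


def run_with_progress(downloads):
    """Run all downloads in threads; yield the running total from this thread."""
    reports = Queue()
    threads = [Thread(target=download, args=(chunks, reports))
               for chunks in downloads]
    for thread in threads:
        thread.start()
    total, running = 0, len(threads)
    while running:
        report = reports.get()  # blocks until some worker reports
        if report is DONE:
            running -= 1
        else:
            total += report
            yield total  # the point where a progress bar would advance
    for thread in threads:
        thread.join()


for downloaded in run_with_progress([[b"ab", b"c"], [b"defg"]]):
    print(f"{downloaded} bytes so far")
```

Every chunk costs a round trip through the queue in this design, so the amount of synchronization grows with the number of chunks.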
## Performance Issues
Welp, I only said that it works, but I didn't mention the performance,
which is terrible.  I am pretty sure that the slowdown comes from
the synchronization, since the `map_multithread` call doesn't seem
to trigger anything that may introduce any sort of blocking.
This seems like a lot of fun, and I hope I'll get better tomorrow
to continue playing with it!
rss = "GSoC 2020: Outro"
date = Date(2020, 8, 31)
@def tags = ["pip", "gsoc"]
# Outro
> Steamed fish was amazing, matter of fact\
> Let me get some jerk chicken to go\
> Grabbed me one of them lemon pie theories\
> And let me get some of them benchmarks you theories too
## The Look
At the time of writing,
{{pip 8771 "implementation-wise parallel download is ready"}}:
Does this mean I've finished everything just-in-time?  This sounds too good
to be true!  And how does it perform?  Welp...
## The Benchmark
Here comes the bad news: under a decent connection to the package index,
using `fast-deps` does not make `pip` faster.  For best comparison,
I will time `pip download` on the following cases:
### Average Distribution
For convenience, let's refer to the commands to be used as follows:
$ pip --no-cache-dir download {requirement}  # legacy-resolver
$ pip --use-feature=2020-resolver \
   --no-cache-dir download {requirement}  # 2020-resolver
$ pip --use-feature=2020-resolver --use-feature=fast-deps \
   --no-cache-dir download {requirement}  # fast-deps
In the first test, I used [axuy][] and obtained the following results:
| legacy-resolver | 2020-resolver | fast-deps |
| --------------- | ------------- | --------- |
| 7.709s          | 7.888s        | 10.993s   |
| 7.068s          | 7.127s        | 11.103s   |
| 8.556s          | 6.972s        | 10.496s   |
Funny enough, running `pip download` with `fast-deps` in a directory
with downloaded files already took around 7-8 seconds.  This is because
to lazily download a wheel, `pip` has to {{pip 8670 "make many requests"}}
which are apparently more expensive than actual data transmission on my network.
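To summarize the three runs in the table above, the averages can be computed with a few lines of Python.  The timings are copied from the table and nothing else is assumed.

```python
from statistics import mean

# Timings in seconds, copied from the axuy table above.
timings = {
    "legacy-resolver": [7.709, 7.068, 8.556],
    "2020-resolver": [7.888, 7.127, 6.972],
    "fast-deps": [10.993, 11.103, 10.496],
}
averages = {name: mean(runs) for name, runs in timings.items()}
for name, average in averages.items():
    print(f"{name}: {average:.3f}s")
```

On average that is roughly 7.8 s, 7.3 s and 10.9 s respectively, so `fast-deps` costs about three and a half extra seconds here.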
!!! note "When is it useful then?"
    With an unstable connection to PyPI (for some reason I am not confident enough
    to state), this is what I got
    | 2020-resolver | fast-deps |
    | ------------- | --------- |
    | 1m16.134s     | 0m54.894s |
    | 1m0.384s      | 0m40.753s |
    | 0m50.102s     | 0m41.988s |
    As the connection was *unstable* and the majority of `pip` networking
    is performed in CI/CD with large and stable bandwidth, I am unsure what this
    result is supposed to tell (-;
### Large Distribution
In this test, I used [TensorFlow][] as the requirement and obtained
the following figures:
| legacy-resolver | 2020-resolver | fast-deps |
| --------------- | ------------- | --------- |
| 0m52.135s       | 0m58.809s     | 1m5.649s  |
| 0m50.641s       | 1m14.896s     | 1m28.168s |
| 0m49.691s       | 1m5.633s      | 1m22.131s |
### Distribution with Conflicting Dependencies
A requirement that triggers a decent amount of backtracking in
the current implementation of the new resolver is `oslo-utils==1.4.0`:
| 2020-resolver | fast-deps |
| ------------- | --------- |
| 14.497s       | 24.010s   |
| 17.680s       | 28.884s   |
| 16.541s       | 26.333s   |
## What Now?
I don't know, to be honest.  At this point I feel I've failed my own
expectations (and those of other stakeholders of `pip`) and wasted the time
and effort of `pip`'s maintainers reviewing dozens of PRs I've made
in the last three months.
On the bright side, this has been an opportunity for me to explore the codebase
of a package manager and discover various edge cases that the new resolver
has yet to cover (e.g. I've just noticed that `pip download` would save
to-be-discarded distributions, I'll file an issue on that soon).  Plus I got
to know many new and cool people and ideas, which makes me a more helpful
individual to work on Python packaging in the future, I hope.
[TensorFlow]: https://www.tensorflow.org
[axuy]: https://sr.ht/~cnx/axuy
# GSoC 2020 Blog Posts
Blog posts are longer descriptions of the work
I was doing as a Python GSoC student:
* {{abslink blog/2020/gsoc/article/1}}
* {{abslink blog/2020/gsoc/article/2}}
* {{abslink blog/2020/gsoc/article/3}}
* {{abslink blog/2020/gsoc/article/4}}
* {{abslink blog/2020/gsoc/article/5}}
* {{abslink blog/2020/gsoc/article/6}}
* {{abslink blog/2020/gsoc/article/7}}
rss = "GSoC 2020: First Check-In"
date = Date(2020, 6, 1)
@def tags = ["pip", "gsoc"]
# First Check-In
Hi everyone, I am McSinyx, a Vietnamese undergraduate student
who loves [free software][].  This summer I am working with
the maintainers and the contributors of `pip` to make
the package manager {{pip 825 "download in parallel"}}.
## What did I do during the community bonding period?
Aside from bonding with `pip`'s maintainers and contributors as well as
with my mentors, I was also experimenting on the theoretical and technical
obstacles blocking this GSoC project.  Pradyun Gedam (a mentor of mine)
suggested making [a proof of concept][] to determine if parallel downloading
can play nicely with [ResolveLib][]'s abstraction and we are reviewing it
together.  On the technical side, we `pip`'s committers are exploring
{{pip 8169 "available options for parallelization"}} and I made an attempt to
{{pip 8320 "make use of Python's standard worker pool in a portable way"}}.
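As a rough idea of what "portable" means here, the sketch below falls back to threads when process pools are unavailable.  It assumes Python 3 and is not the code from the pull request; `lazy_map` and `ProcessPool` are illustrative names.

```python
from multiprocessing.dummy import Pool as ThreadPool

try:
    # Importing this submodule fails on platforms lacking sem_open,
    # which is where creating a process pool would break later on.
    import multiprocessing.synchronize  # noqa: F401
    from multiprocessing import Pool as ProcessPool
except ImportError:
    ProcessPool = None


def lazy_map(func, iterable, chunksize=1, processes=False):
    """Yield func(item) for each item, computed by a worker pool.

    Results are yielded as they complete, in no particular order.
    """
    pool_class = ProcessPool if processes and ProcessPool else ThreadPool
    with pool_class() as pool:
        yield from pool.imap_unordered(func, iterable, chunksize)


print(sorted(lazy_map(abs, [-3, 1, -2])))
```

With `processes=True`, `func` and the items have to be picklable, and on platforms that spawn worker processes the call needs to live under an `if __name__ == "__main__":` guard.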
## Did I get stuck anywhere?
Yes, of course!  Neither of the experiments above is finished as of
this moment.  Though, I am optimistic that the issues will not be
real blockers and we will figure that out in the next few days.
## What is coming up next?
As planned, this week I am going to refactor the package downloading code
in `pip`.  The main purpose is to decouple the networking code from
the package preparation operation and make sure that it is thread-safe.
In addition, I am also continuing the mentioned experiments to gain more
confidence in the future of this GSoC project.
To other GSoC students, mentors and admins reading this, I am wishing
you all good health and successful projects this summer!
[free software]: https://www.gnu.org/philosophy/free-sw.html
[a proof of concept]: https://gist.github.com/McSinyx/513dbff71174fcc79f1cb600e09881af
[ResolveLib]: https://pypi.org/project/resolvelib
rss = "GSoC 2020: Second Check-In"
date = Date(2020, 6, 15)
@def tags = ["pip", "gsoc"]
# Second Check-In
Hi everyone and may the odds be ever in your favor, especially during this
tough time!
## What did I do last week?
Not as much as I wished, apparently (-:
* Finalizing {{pip 8411 "the refactoring patch"}}
  of `operations.prepare.prepare_linked_requirement`
* {{pip 8423 "Nitpicking some logging calls"}}.  This (as well as the next one)
  was to fill up the time when my brain was not as productive as I wanted it to be XD
* {{pip 8423 "Beginning to migrate"}} from `%`- to `{}`-style logging.
  The amount of tests failing due to this was way beyond my imagination,
  but I got functional tests for `pip install` and unit tests passing now!
* {{pip 8442 "Mocking up a working partial wheel download during
  dependency resolution"}} for [the new resolver][].
## Did I get stuck anywhere?
Yes, of course!  {{pip 8320 "Parallel maps"}} are still stalling
as well as other small PRs listed above.  The failures related to
`logging` are still making me pull my hair out and the proof of
concept for partial wheel downloading is too ugly even for a PoC.
I imagine that I will have a lot of clean up to do this week (yay!).
## What is coming up next?
I'm trying to get the multi-{threading,processing} facilities merged ASAP
to start rolling them out in practice.  The first thing popping out of my
head is to get back {{pip 7962 "the multi-threaded"}} `pip list -o`.
The other experimental improvement (this phrase does not sound right!)
I would like to get done is the partial wheel download.  It would be
really nice if I can get both included as `unstable-feature`'s
in {{pip 7628#issuecomment-636319539 "the upcoming beta release of pip 20.2"}}.
[the new resolver]: http://www.ei8fdb.org/thoughts/2020/05/test-pips-alpha-resolver-and-help-us-document-dependency-conflicts/
rss = "GSoC 2020: Third Check-In"
date = Date(2020, 6, 29)
@def tags = ["pip", "gsoc"]
# Third Check-In
Holla, holla, holla!  The last seven days have not been really productive
for me, though I think there are still some nice things to share with
you all here!  The good news is that I've finished my last leçon as a sophomore,
the bad news is that I have a bunch of upcoming tests, mainly in the form
of group projects and/or presentations (phew!).  Enough about me,
let's get back to `pip`:
## What did I do last week?
Not much, actually )-:
* Write some tests for {{pip 8467 "the HTTP range mapping for wheel"}}.
* {{pip 8504 "Try to bring back"}} multithreaded `pip list --outdated`
  and `--uptodate`, as {{pip 8320 "the parallel <code>map</code>"}} was merged
  earlier today.
* Nitpick {{pip 8332}}
  (yep it's a new low for me to include this in the list (-:).
## Did I get stuck anywhere?
Not exactly, since I didn't do much d-;  [Many of my PRs][] are stalling though.
On one hand the maintainers of `pip` are all volunteers working in
their free time, on the other hand I don't think I have tried hard enough
to get their attention on my PRs.
## What is coming up next?
I'll try my best getting the following merged upstream before
{{pip 8206 "the upcoming beta release"}}:
* Parallel networking for `pip list`: {{pip 8504}}
* Lazy wheel for dependency information: {{pip 8467}}, {{pip 8411}}
  (to determine if hashing is required) and {{pip 8467#issuecomment-648717032
  "a new patch introducing this as an unstable feature"}}
[Many of my PRs]: https://github.com/pulls?q=is:open+is:pr+author:McSinyx+repo:pypa/pip+sort:updated-desc
+rss = "GSoC 2020: Fourth Check-In"
+date = Date(2020, 7, 13)
+@def tags = ["pip", "gsoc"]
+# Fourth Check-In
+Hello there! I'm having my second year's last exam tomorrow,
+but it [feels like summer][] already!  I've been finalizing quite a few things
+to get them ready for pip 20.2b2.
+## What did I do last week?
+I've spent most of the time on getting {{pip 8532 "the opt-in"}} for obtaining
+dependency information via lazy wheels ready.  It will be available as
+`--use-feature=fast-deps` and only has an effect when
+`--use-feature=2020-resolver` is also present.
+While waiting for reviews and suggestions, I made some patches for
+internal cleanup, namely {{pip 8568}}, {{pip 8571}} and {{pip 8578}}.
+Some of the similar patches I made earlier were also merged last week:
+{{pip 8456}} and {{pip 8538}}.
+## Did I get stuck anywhere?
+Not really, everything was going as expected for me.
+## What is coming up next?
+After {{pip 8532}}, I'll work on the parallel download of the postponed wheels.
+My main current concern is with how the download progress will be reported
+to the users, but I think I'll figure it out soon.
+[feels like summer]: https://www.youtube.com/watch?v=F1B9Fk_SgI0
diff --git a/blog/2020/gsoc/checkin/5.md b/blog/2020/gsoc/checkin/5.md
+rss = "GSoC 2020: Fifth Check-In"
+date = Date(2020, 7, 27)
+@def tags = ["pip", "gsoc"]
+# Fifth Check-In
+Hello and I hope y'all are still doing well!
+## What did I do last week?
+I was not really productive last week—most of the following tickets are fillers
+to make use of the spare cycles I had when I was still trying to figure out
+the way to implement the main work.
+* Finalize the `--use-feature=fast-deps` flag ({{pip 8588}})
+* Improve mocking of environment variables in the test suite ({{pip 8614}})
+* Finalize the fix for verbose/quiet options specified via
+  configuration files and environment variables ({{pip 8578}})
+* Clean up a tiny bit in the resolver internal API ({{pip 8629}})
+* Start working on separating the download of wheels
+  from dependency resolution ({{pip 8638}})
+## Did I get stuck anywhere?
+I'm struggling with refactoring the code to support separate download.
+`pip`'s codebase was not intended for this and thus there are
+many execution paths and other details entangled around the relevant area.
+## What is coming up next?
+`pip` 20.2 is going to be released within the next few days with
+`--use-feature=fast-deps` included and I'm mentally prepared to fix
+any undiscovered problems.  At the same time, I will continue working
+on {{pip 8638}} and hopefully get it done soon enough to begin drafting
+download parallelization strategies, mostly with the UI.
diff --git a/blog/2020/gsoc/checkin/6.md b/blog/2020/gsoc/checkin/6.md
+rss = "GSoC 2020: Sixth Check-In"
+date = Date(2020, 8, 10)
+@def tags = ["pip", "gsoc"]
+# Sixth Check-In
+Hello there!
+## What did I do last week?
+It has been quite a fun week for me, given the current state of
+development and the newly discovered bugs thanks to the pip 20.2 release:
+* Initiate discussion with the maintainers of pip on isolating
+  networking code for late download in parallel ({{pip 8697}})
+* Discuss the UI of parallel download ({{pip 8698}})
+* Log debug information relating to the lazy wheel decision ({{pip 8710}})
+* Disable caching for range requests ({{pip 8716}})
+* Dedent late download logs ({{pip 8722}})
+* Add a hook for batch downloading (third attempt I think) ({{pip 8737}})
+* Test hash checking for fast-deps ({{pip 8743}})
+## Did I get stuck anywhere?
+Not exactly, everything is going smoothly and I'm feeling awesome!
+## What is coming up next?
+I'll try to solve {{pip 8697}} and {{pip 8698}} within the next few days.
+I am optimistic that the parallel download prototype will be done
+within this week.
diff --git a/blog/2020/gsoc/checkin/7.md b/blog/2020/gsoc/checkin/7.md
+rss = "GSoC 2020: Final Check-In"
+date = Date(2020, 8, 24)
+@def tags = ["pip", "gsoc"]
+# Final Check-In
+Hello there!
+## What did I do last week?
+Not much, but seemingly implementation-wise I have finished my GSoC project:
+* Finish the implementation of wheels' parallel download ({{pip 8771}})
+* Help make `pip`'s CI green again ({{pip 8790}})
+* Reformat a few spots in user guide ({{pip 8795}})
+## Did I get stuck anywhere?
+I got sick, but I am recovering now!
+## What is coming up next?
+I will try to spend the time I have left within the scope of GSoC
+to {{pip 8720 "improve cache usage of the fast-deps feature"}}.
+# GSoC 2020 Check Ins
+Weekly check-ins answer a few short questions as a sort of status report:
+* {{abslink blog/2020/gsoc/checkin/1}}
+* {{abslink blog/2020/gsoc/checkin/2}}
+* {{abslink blog/2020/gsoc/checkin/3}}
+* {{abslink blog/2020/gsoc/checkin/4}}
+* {{abslink blog/2020/gsoc/checkin/5}}
+* {{abslink blog/2020/gsoc/checkin/6}}
+* {{abslink blog/2020/gsoc/checkin/7}}
+rss = "GSoC 2020 final report"
+date = Date(2020, 8, 31)
+internship = "https://summerofcode.withgoogle.com/archive/2020/projects/6238594655584256"
+benchmark = "/blog/2020/gsoc/blog20200831/#the_benchmark"
+python_gsoc = "https://blogs.python-gsoc.org/en/mcsinyxs-blog"
+@def tags = ["fun", "pip", "gsoc"]
+# Google Summer of Code 2020
+In the summer of 2020, I worked with the contributors of `pip`,
+trying to improve the networking performance of the package manager.
+Admittedly, at the end of [the internship]({{internship}}) period,
+[the benchmark said otherwise]({{benchmark}}); though I really hope
+the clean-up and minor fixes I happened to be doing to the codebase
+over the summer, in addition to the implementation of parallel
+utils and lazy wheel, might actually help the project.
+Personally, I learned a lot: not just about Python packaging and
+networking stuff, but also about how to work with others.  I am really
+grateful to {{github pradyunsg}} (my mentor), {{github chrahunt}},
+{{github uranusjr}}, {{github pfmoore}}, {{github brainwane}},
+{{github sbidoul}}, {{github xavfernandez}}, {{github webknjaz}},
+{{github jaraco}}, {{github deveshks}}, {{github gutsytechster}},
+{{github dholth}}, {{github dstufft}}, {{github cosmicexplorer}}
+and {{github ofek}}.  While this feels like a long shout-out list,
+it really isn't.  These people are the maintainers and contributors of `pip`
+and/or other Python packaging projects, and more importantly, they have been
+more than helpful, encouraging and patient with me throughout all my activities,
+showing me the way when I was lost, correcting me when I was wrong, putting up with
+my carelessness and showing me support across different social media.
+To best serve the community, below I have tried my best to document
+what I have done, how I've done it and why I've done it over
+the last three months.  At the time of writing, some work is still in progress,
+so these also serve as a reference point for myself and others to reason
+about decisions in relevant topics.
+## The Main Story
+The storyline can be divided into the following four main acts.
+### Act One: Parallelization Utilities
+In this first act, I ensured the portability of parallelization
+measures for later use in the final act.  Multithreading and multiprocessing
+`map` were made to fall back properly on platforms without full support.
+* {{pip 8320}}: Add utilities for parallelization (close {{pip 8169}})
+* {{pip 8538}}: Make `utils.parallel` tests tear down properly
+* {{pip 8504}}: Parallelize `pip list --outdated` and `--uptodate`
+  (using {{pip 8320}})
+### Act Two: Lazy Wheels
+As proposed by {{github cosmicexplorer}} in {{pip 7819}}, it is possible to only
+download a portion of a wheel to obtain metadata during dependency resolution.
+Not only would this reduce the total amount of data to be transmitted over
+the network in case the resolver needs to perform heavy backtracking, but
+it would also create a synchronization point at the end of the resolution process
+where parallel downloading can be applied to the needed wheels (some wheels
+solely serve their metadata during dependency backtracking and are not needed
+by the users).
+* {{pip 8467}}: Add utility to lazily acquire wheel metadata over HTTP
+* {{pip 8584}}: Revise lazy wheel and its tests
+* {{pip 8681}}: Make range requests closer to chunk size (help {{pip 8670}})
+* {{pip 8716}} and {{pip 8730}}: Disable caching for range requests
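
To make the idea concrete, here is a rough sketch of reading one member of a zip archive while fetching only the byte ranges that `zipfile` asks for. It is not `pip`'s implementation: `LazyRemoteFile`, `read_member` and the `fetch_range(start, end)` callable are hypothetical, with the latter standing in for an HTTP request carrying a `Range: bytes=start-end` header (both ends inclusive).

```python
import io
import zipfile


class LazyRemoteFile(io.RawIOBase):
    """Read-only seekable file that fetches bytes only when they are read.

    fetch_range(start, end) is a hypothetical callable returning the bytes
    from start to end inclusive, as a ranged HTTP request would.
    """

    def __init__(self, length, fetch_range):
        super().__init__()
        self._length = length
        self._fetch_range = fetch_range
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        elif whence == io.SEEK_END:
            target = self._length + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        self._pos = max(0, target)
        return self._pos

    def readinto(self, buffer):
        start = self._pos
        end = min(start + len(buffer), self._length) - 1
        if end < start:
            return 0  # at or past the end of the file
        data = self._fetch_range(start, end)
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)


def read_member(length, fetch_range, name):
    """Read one archive member, fetching only the ranges zipfile asks for."""
    with zipfile.ZipFile(LazyRemoteFile(length, fetch_range)) as archive:
        return archive.read(name)
```

Since the zip central directory sits at the end of the file, `zipfile` first seeks there, then jumps to the requested member, so the bulk of the archive is never requested. A real implementation would also need to learn the total length up front (for instance from a `Content-Length` header) and would likely batch small reads into larger chunks.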
+### Act Three: Late Downloading
+During this act, the main work was refactoring to integrate the *lazy wheel*
+into `pip`'s codebase and clearing the way for download parallelization.
+* {{pip 8411}}: Refactor `operations.prepare.prepare_linked_requirement`
+* {{pip 8629}}: Abstract away `AbstractDistribution`
+  in higher-level resolver code
+* {{pip 8442}}, {{pip 8532}} and {{pip 8588}} (later reworked by
+  {{github chrahunt}} in {{pip 8685}}): Use lazy wheel to obtain
+  dependency information for the new resolver
+* {{pip 8743}}: Test hash checking for `fast-deps`
+* {{pip 8804}}: Check download directory before making range requests
+### Act Four: Batch Downloading in Parallel
+The final act is mostly about the UI of the parallel download.
+My work revolved around how the progress should be displayed
+and how other relevant information should be reported to the users.
+* {{pip 8710}}: Revise method fetching metadata using lazy wheels
+* {{pip 8722}}: Dedent late download logs (fix {{pip 8721}})
+* {{pip 8737}}: Add a hook for batch downloading
+* {{pip 8771}}: Parallelize wheel download
+## The Side Quests
+In order to keep the wheel turning (no pun intended) and avoid wasting time
+waiting for the pull requests above to be reviewed, I decided to create
+even more PRs (as I am typing this, many of the patches listed below
+are nowhere near being merged).
+* {{pip 7878}}: Fail early when install path is not writable
+* {{pip 7928}}: Fix rst syntax in Getting Started guide
+* {{pip 7988}}: Fix tabulate col size in case of empty cell
+* {{pip 8137}}: Add subcommand alias mechanism
+* {{pip 8143}}: Make mypy happy with beta release automation
+* {{pip 8248}}: Fix typo and simplify ireq call
+* {{pip 8332}}: Add license requirement to `_vendor/README.rst`
+* {{pip 8423}}: Nitpick logging calls
+* {{pip 8435}}: Use str.format style in logging calls
+* {{pip 8456}}: Lint `src/pip/_vendor/README.rst`
+* {{pip 8568}}: Declare constants in configuration.py as such
+* {{pip 8571}}: Clean up `Configuration.unset_value` and nit `__init__`
+* {{pip 8578}}: Allow verbose/quiet level to be specified
+  via config files and environment variables
+* {{pip 8599}}: Replace tabs by spaces for consistency
+* {{pip 8614}}: Use `monkeypatch.setenv` to mock environment variables
+* {{pip 8674}}: Fix `tests/functional/test_install_check.py`,
+  when run with new resolver
+* {{pip 8692}}: Make assertion failure give better message
+* {{pip 8709}}: List downloaded distributions before exiting (fix {{pip 8696}})
+* {{pip 8759}}: Allow py2 deprecation warning from setuptools
+* {{pip 8766}}: Use the new resolver for test requirements
+* {{pip 8790}}: Mark tests using remote svn and hg as xfail
+* {{pip 8795}}: Reformat a few spots in user guide
+## The Plot Summary
+Every Monday throughout the Summer of Code, I summarized what I had done
+in the week before in the form of either a short blog or an (even shorter)
+check-in.  These write-ups often contain handfuls of popular culture references
+and were originally hosted on [Python GSoC]({{python_gsoc}}).
+* {{abslink blog/2020/gsoc/checkin/1}}
+* {{abslink blog/2020/gsoc/article/1}}
+* {{abslink blog/2020/gsoc/checkin/2}}
+* {{abslink blog/2020/gsoc/article/2}}
+* {{abslink blog/2020/gsoc/checkin/3}}
+* {{abslink blog/2020/gsoc/article/3}}
+* {{abslink blog/2020/gsoc/checkin/4}}
+* {{abslink blog/2020/gsoc/article/4}}
+* {{abslink blog/2020/gsoc/checkin/5}}
+* {{abslink blog/2020/gsoc/article/5}}
+* {{abslink blog/2020/gsoc/checkin/6}}
+* {{abslink blog/2020/gsoc/article/6}}
+* {{abslink blog/2020/gsoc/checkin/7}}
+* {{abslink blog/2020/gsoc/article/7}}