author Nguyễn Gia Phong <mcsinyx@disroot.org> 2021-03-09 15:36:38 +0700
committer Nguyễn Gia Phong <mcsinyx@disroot.org> 2021-03-09 15:36:38 +0700
commit 1ff1746272a97d9c58d2e6a8936592f90fd5cd47
tree 9dad4f887fa2752de84dc6695a05590430669f11 /blog
parent 1f42cc72174d00f1dd2f93925a6931016298568f
Migrate GSoC 2020 check-ins
Diffstat (limited to 'blog')
-rw-r--r-- blog/gsoc2020/checkin20200601.md 45
-rw-r--r-- blog/gsoc2020/checkin20200615.md 45
-rw-r--r-- blog/gsoc2020/checkin20200629.md 44
-rw-r--r-- blog/gsoc2020/checkin20200713.md 35
-rw-r--r-- blog/gsoc2020/checkin20200727.md 37
-rw-r--r-- blog/gsoc2020/checkin20200810.md 33
-rw-r--r-- blog/gsoc2020/checkin20200824.md 26
-rw-r--r-- blog/gsoc2020/index.md 151
8 files changed, 416 insertions, 0 deletions
diff --git a/blog/gsoc2020/checkin20200601.md b/blog/gsoc2020/checkin20200601.md
new file mode 100644
index 0000000..a362f28
--- /dev/null
+++ b/blog/gsoc2020/checkin20200601.md
@@ -0,0 +1,45 @@
++++
+rss = "GSoC 2020: First Check-In"
+date = Date(2020, 6, 1)
++++
+@def tags = ["pip", "gsoc"]
+
+# First Check-In
+
+Hi everyone, I am McSinyx, a Vietnamese undergraduate student
+who loves [free software][].  This summer I am working with
+the maintainers and the contributors of `pip` to make
+the package manager {{pip 825 "download in parallel"}}.
+
+## What did I do during the community bonding period?
+
+Aside from bonding with `pip`'s maintainers and contributors as well as
+with my mentors, I was also experimenting on the theoretical and technical
+obstacles blocking this GSoC project.  Pradyun Gedam (a mentor of mine)
+suggested making [a proof of concept][] to determine if parallel downloading
+can play nicely with [ResolveLib][]'s abstraction and we are reviewing it
+together.  On the technical side, we, `pip`'s committers, are exploring
+{{pip 8169 "available options for parallelization"}} and I made an attempt to
+{{pip 8320 "make use of Python's standard worker pool in a portable way"}}.
+
+## Did I get stuck anywhere?
+
+Yes, of course!  Neither of the experiments above is finished as of
+this moment.  Though, I am optimistic that the issues will not be
+real blockers and we will figure that out in the next few days.
+
+## What is coming up next?
+
+As planned, this week I am going to refactor the package downloading code
+in `pip`.  The main purpose is to decouple the networking code from
+the package preparation operation and make sure that it is thread-safe.
+
+In addition, I am also continuing the experiments mentioned above to gain
+more confidence in the future of this GSoC project.
+
+To other GSoC students, mentors and admins reading this, I am wishing
+you all good health and successful projects this summer!
+
+[free software]: https://www.gnu.org/philosophy/free-sw.html
+[a proof of concept]: https://gist.github.com/McSinyx/513dbff71174fcc79f1cb600e09881af
+[ResolveLib]: https://pypi.org/project/resolvelib
diff --git a/blog/gsoc2020/checkin20200615.md b/blog/gsoc2020/checkin20200615.md
new file mode 100644
index 0000000..e59cac2
--- /dev/null
+++ b/blog/gsoc2020/checkin20200615.md
@@ -0,0 +1,45 @@
++++
+rss = "GSoC 2020: Second Check-In"
+date = Date(2020, 6, 15)
++++
+@def tags = ["pip", "gsoc"]
+
+# Second Check-In
+
+Hi everyone and may the odds be ever in your favor, especially during this
+tough time!
+
+## What did I do last week?
+
+Not as much as I wished, apparently (-:
+
+* Finalizing {{pip 8411 "the refactoring patch"}}
+  of `operations.prepare.prepare_linked_requirement`
+* {{pip 8423 "Nitpicking some logging calls"}}.  This (as well as the next one)
+  was to fill the time when my brain was not as productive as I wanted it to be XD
+* {{pip 8435 "Beginning to migrate"}} from `%`- to `{}`-style logging.
+  The number of tests failing due to this was way beyond my imagination,
+  but I got functional tests for `pip install` and unit tests passing now!
+* {{pip 8442 "Mocking up a working partial wheel download during
+  dependency resolution"}} for [the new resolver][].
+
+## Did I get stuck anywhere?
+
+Yes, of course!  {{pip 8320 "Parallel maps"}} are still stalling,
+as are the other small PRs listed above.  The failures related to
+`logging` are still making me pull my hair out and the proof of
+concept for partial wheel downloading is too ugly even for a PoC.
+I imagine that I will have a lot of clean up to do this week (yay!).
+
+## What is coming up next?
+
+I'm trying to get the multi-{threading,processing} facilities merged ASAP
+to start rolling them out in practice.  The first thing popping into my
+head is to get back {{pip 7962 "the multi-threaded"}} `pip list -o`.
+
+The other experimental improvement (this phrase does not sound right!)
+I would like to get done is the partial wheel download.  It would be
+really nice if I can get both included as `unstable-feature`'s
+in {{pip 7628#issuecomment-636319539 "the upcoming beta release of pip 20.2"}}.
+
+[the new resolver]: http://www.ei8fdb.org/thoughts/2020/05/test-pips-alpha-resolver-and-help-us-document-dependency-conflicts/
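For context on the `%`- to `{}`-style logging migration mentioned in the check-in above, here is a minimal sketch of one way to feed `str.format`-style messages to the standard `logging` module. It is not taken from `pip`: the `BraceMessage` helper and the `brace_demo` logger name are assumptions made for illustration. The idea is that `logging` calls `str()` on the message object only when a record is actually emitted, so formatting stays lazy.

```python
import io
import logging


class BraceMessage:
    """Hypothetical helper deferring str.format until the record is emitted."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)


# Capture output in memory so the two styles can be compared side by side.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("brace_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

# Classic %-style call: logging interpolates the arguments itself.
logger.info("Collecting %s (%d bytes)", "example", 1024)
# {}-style call: the helper object carries the arguments instead.
logger.info(BraceMessage("Collecting {} ({} bytes)", "example", 1024))

lines = stream.getvalue().splitlines()
```

Both calls should produce the same line, which hints at why such a migration touches many tests: anything asserting on the raw message template or its arguments sees a different object.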
diff --git a/blog/gsoc2020/checkin20200629.md b/blog/gsoc2020/checkin20200629.md
new file mode 100644
index 0000000..93699d1
--- /dev/null
+++ b/blog/gsoc2020/checkin20200629.md
@@ -0,0 +1,44 @@
++++
+rss = "GSoC 2020: Third Check-In"
+date = Date(2020, 6, 29)
++++
+@def tags = ["pip", "gsoc"]
+
+# Third Check-In
+
+Holla, holla, holla!  The last seven days have not been really productive
+for me, though I think there are still some nice things to share with
+you all here!  The good news is that I've finished my last leçon as a sophomore,
+the bad news is that I have a bunch of upcoming tests, mainly in the form
+of group projects and/or presentations (phew!).  Enough about me,
+let's get back to `pip`:
+
+## What did I do last week?
+
+Not much, actually )-:
+
+* Write some tests for {{pip 8467 "the HTTP range mapping for wheel"}}.
+* {{pip 8504 "Try to bring back"}} multithreaded `pip list --outdated`
+  and `--uptodate`, as {{pip 8320 "the parallel"}} `map` was merged
+  earlier today.
+* Nitpick {{pip 8332}}
+  (yep it's a new low for me to include this in the list (-:).
+
+## Did I get stuck anywhere?
+
+Not exactly, since I didn't do much d-;  [Many of my PRs][] are stalling though.
+On one hand the maintainers of `pip` are all volunteers working in
+their free time, on the other hand I don't think I have tried hard enough
+to get their attention on my PRs.
+
+## What is coming up next?
+
+I'll try my best getting the following merged upstream before
+{{pip 8206 "the upcoming beta release"}}:
+
+* Parallel networking for `pip list`: {{pip 8504}}
+* Lazy wheel for dependency information: {{pip 8467}}, {{pip 8411}}
+  (to determine if hashing is required) and {{pip 8467#issuecomment-648717032
+  "a new patch introducing this as an unstable feature"}}
+
+[Many of my PRs]: https://github.com/pulls?q=is:open+is:pr+author:McSinyx+repo:pypa/pip+sort:updated-desc
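The lazy wheel idea from the check-in above can be illustrated without touching the network. The sketch below is a toy under stated assumptions, not `pip`'s implementation: `RangeBackedFile` and `fetch` are hypothetical names, and `fetch(start, end)` stands in for an HTTP request carrying a `Range: bytes=start-end` header. Since a wheel is a zip archive whose index sits at the end of the file, `zipfile` only needs to read the tail and the requested member.

```python
import io
import zipfile


class RangeBackedFile(io.RawIOBase):
    """Read-only, seekable file that obtains bytes on demand.

    fetch(start, end) is a stand-in for an HTTP request carrying
    the header "Range: bytes=start-end" (both ends inclusive).
    """

    def __init__(self, fetch, length):
        super().__init__()
        self._fetch = fetch
        self._length = length
        self._pos = 0
        self.fetched = 0  # total number of bytes requested so far

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos,
                io.SEEK_END: self._length}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def read(self, size=-1):
        if size is None or size < 0:
            size = self._length - self._pos
        end = min(self._pos + size, self._length)
        if end <= self._pos:
            return b""
        data = self._fetch(self._pos, end - 1)
        self.fetched += len(data)
        self._pos = end
        return data


# Build a toy wheel in memory: a large payload followed by its metadata.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
    archive.writestr("toy/__init__.py", b"#" * 200_000)
    archive.writestr("toy-1.0.dist-info/METADATA", "Name: toy\nVersion: 1.0\n")
blob = buffer.getvalue()

# Slicing the in-memory blob plays the role of the remote server here.
lazy = RangeBackedFile(lambda start, end: blob[start:end + 1], len(blob))
with zipfile.ZipFile(lazy) as wheel:
    metadata = wheel.read("toy-1.0.dist-info/METADATA").decode()
```

In this toy, only the end-of-archive records, the central directory and the metadata member are requested, so `lazy.fetched` stays below `len(blob)`. A real client would also have to cope with servers that ignore range requests, which this sketch does not attempt.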
diff --git a/blog/gsoc2020/checkin20200713.md b/blog/gsoc2020/checkin20200713.md
new file mode 100644
index 0000000..417db58
--- /dev/null
+++ b/blog/gsoc2020/checkin20200713.md
@@ -0,0 +1,35 @@
++++
+rss = "GSoC 2020: Fourth Check-In"
+date = Date(2020, 7, 13)
++++
+@def tags = ["pip", "gsoc"]
+
+# Fourth Check-In
+
+Hello there! I'm having my second year's last exam tomorrow,
+but it [feels like summer][] already!  I've been finalizing quite a few things
+to get them ready for pip 20.2b2.
+
+## What did I do last week?
+
+I've spent most of the time on getting {{pip 8532 "the opt-in"}} for obtaining
+dependency information via lazy wheels ready.  It will be available as
+`--use-feature=fast-deps` and only takes effect when
+`--use-feature=2020-resolver` is also present.
+
+While waiting for reviews and suggestions, I made some patches for
+internal cleansing, namely {{pip 8568}}, {{pip 8571}} and {{pip 8578}}.
+Some of the similar patches I made earlier were also merged last week:
+{{pip 8456}} and {{pip 8538}}.
+
+## Did I get stuck anywhere?
+
+Not really, everything was going as expected for me.
+
+## What is coming up next?
+
+After {{pip 8532}}, I'll work on the parallel download of the postponed wheels.
+My main current concern is with how the download progress will be reported
+to the users, but I think I'll figure it out soon.
+
+[feels like summer]: https://www.youtube.com/watch?v=F1B9Fk_SgI0
diff --git a/blog/gsoc2020/checkin20200727.md b/blog/gsoc2020/checkin20200727.md
new file mode 100644
index 0000000..5e50f67
--- /dev/null
+++ b/blog/gsoc2020/checkin20200727.md
@@ -0,0 +1,37 @@
++++
+rss = "GSoC 2020: Fifth Check-In"
+date = Date(2020, 7, 27)
++++
+@def tags = ["pip", "gsoc"]
+
+# Fifth Check-In
+
+Hello and I hope y'all are still doing well!
+
+## What did I do last week?
+
+I was not really productive last week—most of the following tickets are fillers
+to make use of the spare cycles I had when I was still trying to figure out
+the way to implement the main work.
+
+* Finalize the `--use-feature=fast-deps` flag ({{pip 8588}})
+* Improve mocking of environment variables in the test suite ({{pip 8614}})
+* Finalize the fix for verbose/quiet options specified via
+  configuration files and environment variables ({{pip 8578}})
+* Clean up a tiny bit in the resolver internal API ({{pip 8629}})
+* Start working on separating the download of wheels
+  from dependency resolution ({{pip 8638}})
+
+## Did I get stuck anywhere?
+
+I'm struggling with refactoring the code to support separate download.
+`pip`'s codebase was not intended for this and thus there are
+many execution paths and other details entangled around the relevant area.
+
+## What is coming up next?
+
+`pip` 20.2 is going to be released within the next few days with
+`--use-feature=fast-deps` included and I'm mentally prepared to fix
+any undiscovered problem.  At the same time, I will continue working
+on {{pip 8638}} and hopefully get it done soon enough to begin drafting
+download parallelization strategies, mostly with the UI.
diff --git a/blog/gsoc2020/checkin20200810.md b/blog/gsoc2020/checkin20200810.md
new file mode 100644
index 0000000..aea9d5a
--- /dev/null
+++ b/blog/gsoc2020/checkin20200810.md
@@ -0,0 +1,33 @@
++++
+rss = "GSoC 2020: Sixth Check-In"
+date = Date(2020, 8, 10)
++++
+@def tags = ["pip", "gsoc"]
+
+# Sixth Check-In
+
+Hello there!
+
+## What did I do last week?
+
+It has been quite a fun week for me, given the current state of
+development and the newly discovered bugs thanks to the pip 20.2 release:
+
+* Initiate discussion with the maintainers of pip on isolating
+  networking code for late download in parallel ({{pip 8697}})
+* Discuss the UI of parallel download ({{pip 8698}})
+* Log debug information relating to the lazy wheel decision ({{pip 8710}})
+* Disable caching for range requests ({{pip 8716}})
+* Dedent late download logs ({{pip 8722}})
+* Add a hook for batch downloading (third attempt I think) ({{pip 8737}})
+* Test hash checking for fast-deps ({{pip 8743}})
+
+## Did I get stuck anywhere?
+
+Not exactly, everything is going smoothly and I'm feeling awesome!
+
+## What is coming up next?
+
+I'll try to solve {{pip 8697}} and {{pip 8698}} within the next few days.
+I am optimistic that the parallel download prototype will be done
+within this week.
diff --git a/blog/gsoc2020/checkin20200824.md b/blog/gsoc2020/checkin20200824.md
new file mode 100644
index 0000000..b87a7fd
--- /dev/null
+++ b/blog/gsoc2020/checkin20200824.md
@@ -0,0 +1,26 @@
++++
+rss = "GSoC 2020: Final Check-In"
+date = Date(2020, 8, 24)
++++
+@def tags = ["pip", "gsoc"]
+
+# Final Check-In
+
+Hello there!
+
+## What did I do last week?
+
+Not much, but seemingly implementation-wise I have finished my GSoC project:
+
+* Finish the implementation of wheels' parallel download ({{pip 8771}})
+* Help make `pip`'s CI green again ({{pip 8790}})
+* Reformat a few spots in user guide ({{pip 8795}})
+
+## Did I get stuck anywhere?
+
+I got sick, but I am recovering now!
+
+## What is coming up next?
+
+I will try to spend the time I have left within the scope of GSoC
+to {{pip 8720 "improve cache usage of the fast-deps feature"}}.
diff --git a/blog/gsoc2020/index.md b/blog/gsoc2020/index.md
new file mode 100644
index 0000000..09f208b
--- /dev/null
+++ b/blog/gsoc2020/index.md
@@ -0,0 +1,151 @@
++++
+rss = "GSoC 2020 final report"
+date = Date(2020, 8, 31)
++++
+@def tags = ["fun", "pip", "gsoc"]
+
+# Google Summer of Code 2020
+
+In the summer of 2020, I worked with the contributors of `pip`, trying
+to improve the networking performance of the package manager.  Admittedly, at
+the end of the [internship][] period, [the benchmark said otherwise][benchmark];
+though I really hope the clean-up and minor fixes I happened to be doing
+to the codebase over the summer, in addition to the implementation of parallel
+utils and lazy wheel, might actually help the project.
+
+Personally, I learned a lot: not just about Python packaging and
+networking stuff, but also on how to work with others.  I am really
+grateful to {{github pradyunsg}} (my mentor), {{github chrahunt}},
+{{github uranusjr}}, {{github pfmoore}}, {{github brainwane}},
+{{github sbidoul}}, {{github xavfernandez}}, {{github webknjaz}},
+{{github jaraco}}, {{github deveshks}}, {{github gutsytechster}},
+{{github dholth}}, {{github dstufft}}, {{github cosmicexplorer}}
+and {{github ofek}}.  While this feels like a long shout-out list,
+it really isn't.  These people are the maintainers, the contributors of `pip`
+and/or other Python packaging projects, and more importantly, they have been
+more than helpful, encouraging and patient with me throughout all my activities,
+showing me the way when I was lost, correcting me when I was wrong, putting up with
+my carelessness and showing me support across different social media.
+
+To best serve the community, below I have tried my best to document
+what I have done, how I've done it and why I've done it over
+the last three months.  At the time of writing, some work is still in progress,
+so these also serve as a reference point for myself and others to reason
+about decisions in relevant topics.
+
+\toc
+
+## The Main Story
+
+The storyline can be divided into the following four main acts.
+
+### Act One: Parallelization Utilities
+
+In this first act, I ensured the portability of parallelization
+measures for later use in the final act.  Multithreading and multiprocessing
+`map` were made to fall back properly on platforms without full support.
+
+* {{pip 8320}}: Add utilities for parallelization (close {{pip 8169}})
+* {{pip 8538}}: Make `utils.parallel` tests tear down properly
+* {{pip 8504}}: Parallelize `pip list --outdated` and `--uptodate`
+  (using {{pip 8320}})
+
+### Act Two: Lazy Wheels
+
+As proposed by {{github cosmicexplorer}} in {{pip 7819}}, it is possible to only
+download a portion of a wheel to obtain metadata during dependency resolution.
+Not only would this reduce the total amount of data to be transmitted over
+the network in case the resolver needs to perform heavy backtracking, but
+it would also create a synchronization point at the end of the resolution process
+where parallel downloading can be applied to the needed wheels (some wheels
+solely serve their metadata during dependency backtracking and are not needed
+by the users).
+
+* {{pip 8467}}: Add utility to lazily acquire wheel metadata over HTTP
+* {{pip 8584}}: Revise lazy wheel and its tests
+* {{pip 8681}}: Make range requests closer to chunk size (help {{pip 8670}})
+* {{pip 8716}} and {{pip 8730}}: Disable caching for range requests
+
+### Act Three: Late Downloading
+
+During this act, the main work was refactoring to integrate the *lazy wheel*
+into `pip`'s codebase and clear the way for download parallelization.
+
+* {{pip 8411}}: Refactor `operations.prepare.prepare_linked_requirement`
+* {{pip 8629}}: Abstract away `AbstractDistribution`
+  in higher-level resolver code
+* {{pip 8442}}, {{pip 8532}} and {{pip 8588}} (later reworked by
+  {{github chrahunt}} in {{pip 8685}}): Use lazy wheel to obtain
+  dependency information for the new resolver
+* {{pip 8743}}: Test hash checking for `fast-deps`
+* {{pip 8804}}: Check download directory before making range requests
+
+### Act Four: Batch Downloading in Parallel
+
+The final act is mostly about the UI of the parallel download.
+My work revolved around how the progress should be displayed
+and how other relevant information should be reported to the users.
+
+* {{pip 8710}}: Revise method fetching metadata using lazy wheels
+* {{pip 8722}}: Dedent late download logs (fix {{pip 8721}})
+* {{pip 8737}}: Add a hook for batch downloading
+* {{pip 8771}}: Parallelize wheel download
+
+The Side Quests
+---------------
+
+In order to keep the wheel turning (no pun intended) and avoid wasting time
+waiting for the pull requests above to be reviewed, I decided to create
+even more PRs (as I am typing this, many of the patches listed below
+are nowhere near being merged).
+
+* {{pip 7878}}: Fail early when install path is not writable
+* {{pip 7928}}: Fix rst syntax in Getting Started guide
+* {{pip 7988}}: Fix tabulate col size in case of empty cell
+* {{pip 8137}}: Add subcommand alias mechanism
+* {{pip 8143}}: Make mypy happy with beta release automation
+* {{pip 8248}}: Fix typo and simplify ireq call
+* {{pip 8332}}: Add license requirement to `_vendor/README.rst`
+* {{pip 8423}}: Nitpick logging calls
+* {{pip 8435}}: Use str.format style in logging calls
+* {{pip 8456}}: Lint `src/pip/_vendor/README.rst`
+* {{pip 8568}}: Declare constants in configuration.py as such
+* {{pip 8571}}: Clean up `Configuration.unset_value` and nit `__init__`
+* {{pip 8578}}: Allow verbose/quiet level to be specified
+  via config files and environment variables
+* {{pip 8599}}: Replace tabs by spaces for consistency
+* {{pip 8614}}: Use `monkeypatch.setenv` to mock environment variables
+* {{pip 8674}}: Fix `tests/functional/test_install_check.py`,
+  when run with new resolver
+* {{pip 8692}}: Make assertion failure give better message
+* {{pip 8709}}: List downloaded distributions before exiting (fix {{pip 8696}})
+* {{pip 8759}}: Allow py2 deprecation warning from setuptools
+* {{pip 8766}}: Use the new resolver for test requirements
+* {{pip 8790}}: Mark tests using remote svn and hg as xfail
+* {{pip 8795}}: Reformat a few spots in user guide
+
+## The Plot Summary
+
+Every Monday throughout the Summer of Code, I summarized what I had done
+in the week before in the form of either a short blog or an (even shorter)
+check-in.  These write-ups often contain handfuls of popular culture references
+and were originally hosted on [Python GSoC][].
+
+* [{{fill title blog/gsoc2020/checkin20200601}}](/blog/gsoc2020/checkin20200601)
+* [{{fill title blog/gsoc2020/blog20200609}}](/blog/gsoc2020/blog20200609)
+* [{{fill title blog/gsoc2020/checkin20200615}}](/blog/gsoc2020/checkin20200615)
+* [{{fill title blog/gsoc2020/blog20200622}}](/blog/gsoc2020/blog20200622)
+* [{{fill title blog/gsoc2020/checkin20200629}}](/blog/gsoc2020/checkin20200629)
+* [{{fill title blog/gsoc2020/blog20200706}}](/blog/gsoc2020/blog20200706)
+* [{{fill title blog/gsoc2020/checkin20200713}}](/blog/gsoc2020/checkin20200713)
+* [{{fill title blog/gsoc2020/blog20200720}}](/blog/gsoc2020/blog20200720)
+* [{{fill title blog/gsoc2020/checkin20200727}}](/blog/gsoc2020/checkin20200727)
+* [{{fill title blog/gsoc2020/blog20200803}}](/blog/gsoc2020/blog20200803)
+* [{{fill title blog/gsoc2020/checkin20200810}}](/blog/gsoc2020/checkin20200810)
+* [{{fill title blog/gsoc2020/blog20200817}}](/blog/gsoc2020/blog20200817)
+* [{{fill title blog/gsoc2020/checkin20200824}}](/blog/gsoc2020/checkin20200824)
+* [{{fill title blog/gsoc2020/blog20200831}}](/blog/gsoc2020/blog20200831)
+
+[internship]: https://summerofcode.withgoogle.com/archive/2020/projects/6238594655584256
+[benchmark]: /blog/gsoc2020/blog20200831/#the_benchmark
+[Python GSoC]: https://blogs.python-gsoc.org/en/mcsinyxs-blog/
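Act One of the final report above describes parallel `map` utilities that fall back gracefully on platforms without full support. A minimal sketch of that idea follows; it is not `pip`'s actual code. The function name `map_multithread` and the default of four workers are chosen here for illustration, and the sketch assumes that the missing platform support shows up as an `ImportError` when importing `multiprocessing.synchronize`, which the Python documentation describes for systems lacking a working shared semaphore implementation.

```python
def map_multithread(func, iterable, workers=4):
    """Apply func to every item, using a thread pool when available.

    Results come back in arbitrary order.  When the platform cannot
    provide the pool, the work is done sequentially instead.
    """
    try:
        # Importing this module fails where shared semaphores are
        # unavailable; that is the assumed signal to fall back.
        import multiprocessing.synchronize  # noqa: F401
        from multiprocessing.pool import ThreadPool
    except ImportError:
        return list(map(func, iterable))
    with ThreadPool(workers) as pool:
        # Consume the results before the pool is torn down.
        return list(pool.imap_unordered(func, iterable))


squares = sorted(map_multithread(lambda x: x * x, range(10)))
```

Returning results unordered suits I/O-bound work such as querying an index for many packages, where callers usually sort or aggregate afterwards anyway; a caller needing input order could use `pool.imap` instead.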