author Nguyễn Gia Phong <mcsinyx@disroot.org> 2021-09-21 17:02:17 +0700
committer Nguyễn Gia Phong <mcsinyx@disroot.org> 2021-09-21 17:02:17 +0700
commit 2c085d53133fd267a809d0a4e2cbf9421ea2a2a8 (patch)
tree a0ede5321105f8a92449d17bf0fcd999dac0a382 /blog/2020/gsoc/article/6.md
parent 7d8ce2a7f598312e3501b53d34ff8146b4dba0a6 (diff)
download site-2c085d53133fd267a809d0a4e2cbf9421ea2a2a8.tar.gz
Reorganize GSoC 2020
Diffstat (limited to 'blog/2020/gsoc/article/6.md')
-rw-r--r-- blog/2020/gsoc/article/6.md | 52
1 files changed, 52 insertions, 0 deletions
diff --git a/blog/2020/gsoc/article/6.md b/blog/2020/gsoc/article/6.md
new file mode 100644
index 0000000..40caad5
--- /dev/null
+++ b/blog/2020/gsoc/article/6.md
@@ -0,0 +1,52 @@
++++
+rss = "GSoC 2020: Parallelizing Wheel Downloads"
+date = Date(2020, 8, 17)
++++
+@def tags = ["pip", "gsoc"]
+
+# Parallelizing Wheel Downloads
+
+> And now it's clear as this promise\
+> That we're making\
+> Two progress bars into one
+
+\toc
+
+Hello there! It has been raining a lot lately and some mosquito has given me
+dengue fever today.  To whoever is reading this, I hope it never happens
+to you.
+
+Download Parallelization
+------------------------
+
+I've been working on `pip`'s download parallelization for quite a while now.
+Since distribution download in `pip` is modeled as a lazily evaluated iterable
+of chunks, parallelizing the procedure is as simple as submitting routines
+that write files to disk to a worker pool.
+
+Or at least that is what I thought.
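
The worker-pool idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not `pip`'s actual code: `fake_chunks`, `save` and `download_all` are hypothetical names, and the "download" is simulated by slicing an in-memory byte string instead of reading from the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Iterator


def fake_chunks(payload: bytes, size: int = 4) -> Iterator[bytes]:
    """Hypothetical stand-in for a lazy download: chunks are produced on demand."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


def save(chunks: Iterable[bytes], path: str) -> int:
    """Consume the chunk iterable and write it to path; return the bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            written += f.write(chunk)
    return written


def download_all(jobs: Dict[str, Iterable[bytes]], workers: int = 4) -> Dict[str, int]:
    """Submit one save() routine per file to a thread pool and collect the sizes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {path: pool.submit(save, chunks, path)
                   for path, chunks in jobs.items()}
        # result() re-raises any exception that occurred in the worker
        return {path: future.result() for path, future in futures.items()}
```

Because each chunk iterable is only pulled from inside its own `save` call, nothing is fetched until a worker picks the job up. Note that this sketch reports no progress at all.
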
+
+Progress Reporting UI
+---------------------
+
+`pip` currently uses custom-defined progress reporting classes,
+which were not designed to work with multithreaded code.  First, I wanted to
+try using these instead of defining a separate UI for multithreaded progress.
+As they use system signals for termination, the progress bars have to be
+running in the main thread.  Or sort of.
+
+Since the progress bars are designed as iterators, I realized that we
+can call `next` on them.  So I quickly threw in some queues and locks
+and prototyped the first *working* {{pip 8771 "implementation of
+progress synchronization"}}.
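
One way to picture the queue-based approach is the sketch below. It is not the code from the linked pull request. It is a minimal illustration, assuming that workers report chunk sizes through a shared `queue.Queue` and that the main thread alone advances an iterator-style bar. All names (`worker`, `drain`, `progress_bar`, `download_with_progress`) are hypothetical.

```python
import queue
import threading
from typing import Iterable, Iterator, List

DONE = None  # sentinel a worker enqueues once its chunks are exhausted


def worker(chunks: Iterable[bytes], updates: queue.Queue) -> None:
    """Runs in a worker thread: report each chunk's size, then the sentinel."""
    for chunk in chunks:
        updates.put(len(chunk))  # a real worker would also write the chunk to disk
    updates.put(DONE)


def drain(updates: queue.Queue, workers: int) -> Iterator[int]:
    """Yield reported sizes until every worker has sent its sentinel."""
    while workers:
        size = updates.get()  # blocks until some worker reports
        if size is DONE:
            workers -= 1
        else:
            yield size


def progress_bar(sizes: Iterable[int], total: int) -> Iterator[int]:
    """Hypothetical iterator-style bar: redraws each time it is advanced."""
    received = 0
    for size in sizes:
        received += size
        print(f"\r{received}/{total} bytes", end="", flush=True)
        yield received


def download_with_progress(downloads: List[List[bytes]]) -> int:
    """Start one thread per download and drive a single bar from the main thread."""
    updates: queue.Queue = queue.Queue()
    threads = [threading.Thread(target=worker, args=(chunks, updates))
               for chunks in downloads]
    for thread in threads:
        thread.start()
    total = sum(len(chunk) for chunks in downloads for chunk in chunks)
    received = 0
    for received in progress_bar(drain(updates, len(threads)), total):
        pass  # each step is the equivalent of calling next() on the bar
    print()
    for thread in threads:
        thread.join()
    return received
```

Funnelling every update through one queue keeps all drawing in the main thread, but every chunk then costs a queue round trip. That is one plausible place for synchronization overhead to appear, though this sketch does not measure it.
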
+
+Performance Issues
+------------------
+
+Welp, I only said that it works, but I didn't mention the performance,
+which is terrible.  I am pretty sure that the slowdown comes from
+the synchronization, since the `map_multithread` call doesn't seem
+to trigger anything that may introduce any sort of blocking.
+
+This seems like a lot of fun, and I hope I'll get better tomorrow
+to continue playing with it!