about summary refs log tree commit diff homepage
path: root/blog/2020/gsoc/article/6.md
blob: 996ba41c901000229680f71f084092ef90f804dd (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
+++
rss = "GSoC 2020: Parallelizing Wheel Downloads"
date = Date(2020, 8, 17)
tags = ["gsoc", "pip", "python"]
+++

# Parallelizing Wheel Downloads

> And now it's clear as this promise\
> That we're making\
> Two progress bars into one

\toc

Hello there! It has been raining a lot lately and some mosquito has given me
the Dengue fever today.  To whoever reading this, I hope it would never happen
to you.

Download Parallelization
------------------------

I've been working on `pip`'s download parallelization for quite a while now.
As distribution download in `pip` was modeled as a lazily evaluated iterable
of chunks, parallelizing such procedure is as simple as submitting routines
that write files to disk to a worker pool.

Or at least that is what I thought.

Progress Reporting UI
---------------------

`pip` is currently using customly defined progress reporting classes,
which was not designed to working with multithreading code.  Firstly, I want to
try using these instead of defining separate UI for multithreaded progresses.
As they use system signals for termination, one must the progress bars has to be
running the main thread.  Or sort of.

Since the progress bars are designed as iterators, I realized that we
can call `next` on them.  So quickly, I throw in some queues and locks,
and prototyped the first *working* {{pip 8771 "implementation of
progress synchronization"}}.

Performance Issues
------------------

Welp, I only said that it works, but I didn't mention the performance,
which is terrible.  I am pretty sure that the slow down is with
the synchronization, since the `map_multithread` call doesn't seem
to trigger anything that may introduce any sort of blocking.

This seems like a lot of fun, and I hope I'll get better tomorrow
to continue playing with it!