dlqueue job flow

- A job is inserted, by dlqueue.php or another tool. This adds a row to the `pending_jobs` table. The job data can be a URL by itself, or a URL with some extra data (such as `tags`).
- A worker finishes a job and requests a new job (or, if idle, polls for new jobs after a timeout). The middleman selects and locks a row of the database.
- The middleman processes transforms and/or rules on the job.
  - In my use case, this would include expanding galleries/playlists into a list of video URLs, and then checking that the resulting videos have not already been downloaded.
    - I'll run `yt-dlp -s` locally and then add a `--match-filter "url!=... & url!=... & ..."` to the arguments to exclude existing URLs.
  - This can also do things like check the tags against the worker's identity for geo-locked content, or add yt-dlp arguments for audio-only.
- The middleman provides the resulting data to the worker, and waits for the worker to accept or reject.
  - The worker can also process transforms and/or rules, and accept or reject the job.
    - This can also check the tags and everything else.
    - For example, a transform rule that places `--` before all arguments could be used to secure workers against an untrusted middleman/queue.
  - If the worker rejects, the job is unlocked and the next job (`WHERE id > rejected_id`) is processed and offered.
  - If the middleman (or worker) has an error during this step, the lock on the job will simply be released by the database.
- Once the worker accepts, the job is marked as "being worked on" as of $time.
- The worker updates the job progress regularly.
  - If the worker crashes, the job will eventually be re-queued due to age.
- Once completed, the worker sends a notification to mark the job as complete.
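
The offer/accept/reject cycle above can be sketched as a small in-memory model. This is only an illustration of the state transitions, not dlqueue's implementation: the names `Job`, `Queue`, `offer`, `progress`, `complete` and the `STALE_AFTER` threshold are all hypothetical, and a plain Python set stands in for the row lock that the real design leaves to the database.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

STALE_AFTER = 600  # seconds without progress before a job is re-queued (assumed value)


@dataclass
class Job:
    id: int
    url: str
    tags: list = field(default_factory=list)
    started_at: Optional[float] = None  # set when a worker accepts the job
    updated_at: Optional[float] = None  # refreshed by progress updates
    done: bool = False


class Queue:
    def __init__(self) -> None:
        self.jobs: dict = {}
        self.locked: set = set()  # stand-in for database row locks
        self._next_id = 1

    def insert(self, url: str, tags=()) -> int:
        job = Job(self._next_id, url, list(tags))
        self.jobs[job.id] = job
        self._next_id += 1
        return job.id

    def _available(self, job: Job, now: float) -> bool:
        if job.done or job.id in self.locked:
            return False
        if job.started_at is None:
            return True  # never accepted by any worker
        return now - job.updated_at > STALE_AFTER  # re-queued due to age

    def offer(self, accepts: Callable[[Job], bool], now: float) -> Optional[Job]:
        """Offer jobs in id order until accepts(job) is true; None if all are rejected."""
        last_id = 0
        while True:
            # Equivalent of "WHERE id > rejected_id" on the pending rows.
            candidates = [j for j in self.jobs.values()
                          if j.id > last_id and self._available(j, now)]
            if not candidates:
                return None
            job = min(candidates, key=lambda j: j.id)
            self.locked.add(job.id)
            try:
                if accepts(job):
                    job.started_at = job.updated_at = now  # "being worked on" as of now
                    return job
            finally:
                # The lock goes away on accept, on reject, and on any error.
                self.locked.discard(job.id)
            last_id = job.id

    def progress(self, job_id: int, now: float) -> None:
        self.jobs[job_id].updated_at = now

    def complete(self, job_id: int) -> None:
        self.jobs[job_id].done = True
```

In this sketch a worker would call `queue.offer(my_rules, now=time.time())`, then `progress` periodically and `complete` at the end. A job whose `updated_at` falls more than `STALE_AFTER` seconds behind becomes eligible for `offer` again, which models the crash case.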
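
The two transform examples mentioned above (a middleman adding a `--match-filter` for known URLs, and a worker placing `--` before all arguments) might look roughly like the following. The function names are hypothetical, the field name `url` is taken from the note above as written, and wrapping each URL in double quotes is an assumption that should be checked against yt-dlp's documentation for `--match-filter`.

```python
def add_match_filter(args, seen_urls, field="url"):
    """Middleman-side transform: append a filter that skips already-downloaded URLs."""
    if not seen_urls:
        return list(args)
    # One "field != value" condition per known URL, joined with "&".
    # Quoting the value is an assumption about yt-dlp's filter syntax.
    conditions = " & ".join(f'{field}!="{u}"' for u in seen_urls)
    return [*args, "--match-filter", conditions]


def guard_arguments(args):
    """Worker-side transform: put `--` first so nothing after it is parsed as an option."""
    return ["--", *args]


if __name__ == "__main__":
    print(add_match_filter(["-x"], ["https://example.com/v/1", "https://example.com/v/2"]))
    print(guard_arguments(["-x", "https://example.com/v/1"]))
```

Note that `guard_arguments` deliberately neutralises any options supplied by the queue, including a `--match-filter` added upstream, so a worker would apply it only when it does not trust the middleman's arguments.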