dlqueue job flow

- A job is inserted by dlqueue.php or another tool. This inserts a row into the `pending_jobs` table. The job data can be a URL by itself, or a URL with some extra data (such as `tags`).
- A worker finishes a job and requests a new one (or, if idle, polls for new jobs after a timeout). The middleman selects and locks a row in the database.
- The middleman processes transforms and/or rules on the job
  - In my use case, this would include expanding galleries/playlists into a list of video URLs, and then checking that the resulting videos have not already been downloaded
  - I'll run `yt-dlp -s` locally and then add a `--match-filter "url!=... & url!=... & ..."` to the arguments to exclude URLs that have already been downloaded
  - This can also do things like check the tags against the worker's identity for geo-locked content, or add yt-dlp arguments for audio-only
- The middleman provides the resulting data to the worker, and waits for the worker to accept or reject it.
- The worker can also process transforms and/or rules before it accepts or rejects the job
  - This can also check the tags and any other job data
  - For example, a transform rule that places `--` before all arguments could be used to secure workers against an untrusted middleman/queue
- If the worker rejects, the job is unlocked and the next job (`WHERE id > rejected_id`) is processed and offered
- If the middleman (or worker) has an error during this step, the lock on the job will simply be released by the database.
- Once the worker accepts, the job is marked as "being worked on" as of $time
- The worker updates the job progress regularly
- If the worker crashes, the job will eventually be re-queued due to age
- Once the job is completed, the worker sends a notification to mark it as complete
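
The offer/accept/reject cycle above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual dlqueue code: it uses Python with an in-memory SQLite database so it runs on its own, the columns `started_at`, `updated_at` and `done`, the helper names, the `geo:` tag format and the `STALE_AFTER` value are all made up for the example, and the row lock taken by the middleman is not modeled (on a server database that would presumably be a transaction with a locking select).

```python
import sqlite3
import time

STALE_AFTER = 600  # assumed: seconds without progress before a job is re-queued


def open_queue():
    """Create an in-memory stand-in for the pending_jobs table (columns are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pending_jobs ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " tags TEXT NOT NULL DEFAULT '',"  # comma-separated, hypothetical format
        " started_at REAL,"                # NULL until a worker accepts the job
        " updated_at REAL,"                # last progress report from the worker
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def insert_job(conn, url, tags=""):
    cur = conn.execute(
        "INSERT INTO pending_jobs (url, tags) VALUES (?, ?)", (url, tags)
    )
    return cur.lastrowid


def next_job(conn, after_id=0):
    """Return the first unclaimed job with id > after_id, or None."""
    return conn.execute(
        "SELECT id, url, tags FROM pending_jobs"
        " WHERE id > ? AND started_at IS NULL AND done = 0"
        " ORDER BY id LIMIT 1",
        (after_id,),
    ).fetchone()


def offer_job(conn, worker_accepts, now=None):
    """Offer jobs in id order until the worker accepts one; return its id or None."""
    now = time.time() if now is None else now
    after_id = 0
    while (job := next_job(conn, after_id)) is not None:
        job_id, url, tags = job
        if worker_accepts(url, tags.split(",") if tags else []):
            # Accepted: mark the job as "being worked on" as of `now`.
            conn.execute(
                "UPDATE pending_jobs SET started_at = ?, updated_at = ? WHERE id = ?",
                (now, now, job_id),
            )
            return job_id
        after_id = job_id  # rejected: continue with WHERE id > rejected_id
    return None


def report_progress(conn, job_id, now=None):
    now = time.time() if now is None else now
    conn.execute("UPDATE pending_jobs SET updated_at = ? WHERE id = ?", (now, job_id))


def complete_job(conn, job_id):
    conn.execute("UPDATE pending_jobs SET done = 1 WHERE id = ?", (job_id,))


def requeue_stale(conn, now=None):
    """Release jobs whose worker stopped reporting; return how many were re-queued."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE pending_jobs SET started_at = NULL, updated_at = NULL"
        " WHERE done = 0 AND started_at IS NOT NULL AND updated_at < ?",
        (now - STALE_AFTER,),
    )
    return cur.rowcount


# Example: a worker rule that rejects anything carrying a (hypothetical) geo tag.
conn = open_queue()
insert_job(conn, "https://example.com/a", "geo:us")
insert_job(conn, "https://example.com/b")
reject_geo = lambda url, tags: not any(t.startswith("geo:") for t in tags)
picked = offer_job(conn, reject_geo)  # skips the first job, accepts the second
```

Calling `requeue_stale` periodically is one way to get the "re-queued due to age" behaviour; a rejected job is left untouched here, so a different worker can still pick it up later.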