
Today I have mostly been using ${insert development technology here}

It may well be (it wouldn't be the first time) that by the time I've sussed out all the MIDI stuff, you'll be proved right and I'll be able to refactor all the USB cleverness out and just use MIDI-over-USB. I'm telling myself that, even if that turns out to be the case, this is all part of the learning curve, and nothing to be ashamed of :)
 
I've spent this week re-discovering PowerShell; I'd forgotten how powerful it is. Someone asked if she could install PowerGUI so that she could look at the contents of a bunch of distribution groups. After reading the very long instructions on what she needed to do to use it, I said I could do better: a single line of PowerShell would do what she wanted. So I downloaded PoshTools and built a GUI front end to search across domains for all the groups, then export the contents to a CSV file. It took about three hours, including the time to install the programs and work out how to use them.

Now I'm deep in Visual Studio making GUIs for PowerShell.
 
I've built a tool that automates one of the most boring parts of my job using ffmpeg, Python and Pillow.

I can feed in a video and it'll find the points where ads need to be placed, edit and encode the video, and put it on the CDN. All in one click.
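The gist of the detection can be done with ffmpeg's blackdetect filter, which flags the runs of black frames where breaks usually sit - this is just a minimal sketch of that idea, not my actual tool (filename made up too):

Code:
import re
import subprocess

def find_black_breaks(video_path, min_black=0.5):
    # Candidate ad-break points: runs of (near-)black frames, via ffmpeg's blackdetect filter.
    cmd = [
        "ffmpeg", "-hide_banner", "-i", video_path,
        "-vf", f"blackdetect=d={min_black}:pix_th=0.10",
        "-an", "-f", "null", "-",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # blackdetect logs lines like "black_start:623.4 black_end:625.9" to stderr
    return [float(t) for t in re.findall(r"black_start:(\d+(?:\.\d+)?)", result.stderr)]

print(find_black_breaks("episode.mp4"))  # prints a list of timestamps in seconds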

I've only tested for one show so far, but I see no reason it shouldn't work for the majority of our shows. If it does, my job has just got a whole lot easier :cool:
 
Nice! I was doing some work with a client who'd built a fairly big, expensive system to do that for video content in-app, so yours is pretty impressive!

I've spent the last week working with Ansible to automate all of our deployments, and ripping out Rundeck and Salt. It's been rather fun.
*mutters magical incantations*
*servers appear*
 
I've now moved onto automating the next most annoying/boring parts of my job. This might be even harder than the video edit detection :mad:

So, I need to take screenshots of the videos for display on the episode/series. The problem is we have so many platforms, and they all use different dimensions etc.

TV shows a 16:9 image, which is perfect, but when you click on it, it takes you to a second screen that does a crude crop to the centre third of the screen, so the image needs to look good if you cut the two sides away. Also, there are iOS/Android apps that take a 3:4 image, and a Samsung TV app that takes another ratio...I forget which.

I'm still selecting the images manually for now, but trying to automate the crop first. I've been reading about "seam carving", which is a clever way of cropping your content without losing any relevant detail. It's how Photoshop's content-aware stuff works. A seam carving crop looks for a path of least 'interest' across a row or column (or both!), removes that strip of image, and shifts everything else in (or out) to compensate. With the right photo (and implementation), the results can be incredible.

Sadly, I do not have any of those things yet.
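For anyone curious, the single-seam step itself is only a page of numpy - this is a rough sketch of the textbook algorithm (energy map plus dynamic programming), not the implementation I've actually been poking at:

Code:
import numpy as np
from PIL import Image

def remove_vertical_seam(img_arr):
    # Energy = gradient magnitude, so flat/boring areas score low.
    grey = img_arr.mean(axis=2)
    gy, gx = np.gradient(grey)
    energy = np.abs(gx) + np.abs(gy)

    h, w = energy.shape
    # Dynamic programming: cost[y, x] = cheapest path from the top row down to (y, x).
    cost = energy.copy()
    for y in range(1, h):
        up_left = np.roll(cost[y - 1], 1)
        up_right = np.roll(cost[y - 1], -1)
        up_left[0] = np.inf
        up_right[-1] = np.inf
        cost[y] += np.minimum(np.minimum(up_left, cost[y - 1]), up_right)

    # Trace the cheapest seam back up from the bottom row.
    seam = np.zeros(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))

    # Drop the seam pixel from every row and close the gap.
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return img_arr[keep].reshape(h, w - 1, 3)

img = np.asarray(Image.open("source.jpg").convert("RGB"))  # filename made up
for _ in range(200):  # carve 200 columns off
    img = remove_vertical_seam(img)
Image.fromarray(img).save("carved.jpg")

The results depend almost entirely on how good the energy function is, which is presumably where mine is falling over.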

Source image:
0aafc5ef-cd8d-48fc-a79f-7dc589bd8c2c.jpg

Seam carved crop:
carved.jpg

More work needed :facepalm:
 
Lol.

We just tell editors to specify an area on the image which should never be cropped (dragging a rectangle around it) and the resize/crop algorithm works on that. It’s pretty much foolproof but does admittedly require people to do their fucking jobs which is something that’s not at all guaranteed.
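Roughly speaking, the crop logic just clamps the crop window so it always contains the protected rectangle. A toy version of the idea (nothing like the real code) looks something like this:

Code:
def crop_window(img_w, img_h, target_ratio, keep):
    # Toy version: full height, horizontal crop only.
    # keep = (x0, y0, x1, y1) is the editor's "never crop this" box.
    crop_h = img_h
    crop_w = min(img_w, int(round(crop_h * target_ratio)))
    x0, _, x1, _ = keep
    # Centre the window on the box, clamp it to the image,
    # then make sure the box still fits inside the window.
    left = (x0 + x1) // 2 - crop_w // 2
    left = max(0, min(left, img_w - crop_w))
    left = min(left, x0)
    left = max(left, x1 - crop_w)
    return left, 0, left + crop_w, crop_h

# e.g. a 1920x1080 frame cropped to 3:4 around a face box
print(crop_window(1920, 1080, 3 / 4, (800, 200, 1100, 700)))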

I’m not going to mention the company, but a majority of the auto-generated thumbnails that don’t go through editors are at completely the wrong aspect ratio for their own self-declared metadata. Nobody seems to care and it drives me mad.
 
I've built a tool that automates one of the most boring parts of my job using ffmpeg, Python and Pillow.

I can feed in a video and it'll find the points where ads need to be placed, edit and encode the video, and put it on the CDN. All in one click.

I've only tested for one show so far, but I see no reason it shouldn't work for the majority of our shows. If it does, my job has just got a whole lot easier :cool:
Back to working on this.

The previous thing was OK, but it was still just a messy script that used hard-coded everything, needed tweaks for every show, and required some manual input (when encountering a new show). I've re-written it as a proper application now, and it's already so much better - it works on more cases with less input and fewer manual bits. The design I used also means I get some really handy error checking basically for free (this was accidental/fortunate :) )

I've got it working on every show I've tried so far (admittedly going for the low-hanging fruit first). I'm fairly confident I can use it on 90% of them, and with a bit of work another 5% should be doable. The rest will be much trickier and maybe not worth the effort of implementing.

One thing I've added is a directory watcher, so now I can just drop the files into the right dir, and it picks up the correct config etc and Just Works. One thing I've lost by doing this is my task scheduler/queue, so now I'm learning how to use RabbitMQ via Celery. I could just call out the encoding tasks to a system job and use the old scheduler, but I was already coming up against its limits (single queue, no priorities) so I think branching out to a proper task management solution is a good idea here.
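The watcher part is only a handful of lines with something like the watchdog package. A sketch (paths and handler names are made up; in practice the handler would hand the file off to the task queue rather than print):

Code:
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

WATCH_DIR = "/srv/encoder/incoming"  # made-up drop directory

class NewEpisodeHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        # Look up the show config from the filename, then kick off the encode.
        print(f"picked up {event.src_path}")

observer = Observer()
observer.schedule(NewEpisodeHandler(), WATCH_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()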
 
Nooooooooo don't use Celery. Honestly, it's almost always overkill. (If you *want* to learn it, then obviously go right ahead.) Python-RQ does 80% of what Celery does, and is about 20% of the hassle. Having used both, I'd recommend using Redis over RabbitMQ for the broker as well, as the setup is easier (imo). If you do go down the Celery route, though, then you probably want to check out Flower on top of it to check your job status in a nice visual way. We use it in production, and a single micro instance in AWS manages to keep track of tens of thousands of tasks an hour on our RabbitMQ broker server with four separate worker machines.
 
I saw Flower when I was looking around and it looked really nice.

My needs are pretty basic, so I'm inclined to trust your (clearly experienced) views here. I'd never heard of Python-RQ, so I'll check it out, thanks!
 
*bitter* experience ;)

It all works, and it'll be fine and useful, but this job (big company, global, hundreds of thousands of users) is really the first time I've seen the infra and setup cost of Celery justified, as I always default to the others as lighter, simpler libraries.
 
One question before I go too deep into this *ahem* rabbit hole: what is the 20% that's missing from Python-RQ? I doubt it'll be what I need but would hate to find out too late...
 
Hmm, one thing I've spotted already from the quickstart docs:

One of my issues with the previous queue system I used was that it only had a single queue. I could allow multiple jobs to run, but they all went through one queue, FIFO.

Thing is, most of the tasks I run are long-running, but they use different resources. I have I/O-bound processes and CPU-bound processes. I would like a different queue for each, with equal priority. Seems like Python-RQ can't do this?

Edit: unless I run multiple instances of Redis with a queue on each?

More doc reading required
 
Nah, you don't need multiple instances of Redis.

When you're building the queues, you can just do:
Code:
from redis import Redis
from rq import Queue
redis_conn = Redis()
lowpriority = Queue('low', connection=redis_conn)
highpriority = Queue('high', connection=redis_conn)
everythingelse = Queue('stuff', connection=redis_conn)

Then when you launch the workers, specify which queues they're for - the worker takes the queue names, so e.g. 'rq worker high' and 'rq worker low stuff' would cover all of the queues above, with the workers pulling the jobs out of each queue on a FIFO basis. (Not sure of the FIFO priority if the worker is listening to multiple queues, think it's FIFO across all of it, but not 100%)

If you want equal priority, just have the same number of workers for each queue - I tended to run my high priority queues with 2x the workers of the low priority queue.
 
I didn't really give an example in my post which would show the problem.

Let's say I have 10 tasks to run today, each one takes an hour. 5 are I/O bound (download/uploading/backing up), 5 are CPU bound (transcoding).

I want one CPU task and one I/O task to run simultaneously. If I had a single queue, with FIFO, it might end up looking something like this (top being next to process):

Code:
CPU
CPU
I/O
CPU
I/O
I/O
etc..

So if I set the max number of workers to one, my CPU task will run, but no I/O tasks will be running. Even if I set it to two, I get two CPU tasks running, but no I/O - meaning my CPU tasks take even longer, and my I/O tasks are sat there doing nothing when they could be being processed.

So let's say I have two queues:

Code:
Q1	 Q2
CPU	 I/O
CPU	 I/O
CPU	 I/O
etc..

Now, I want one worker per queue, with equal priority between queues - as in the two queues should not affect each other at all. If Q1 is cleared, I still only want one job to be running as it means I only have I/O tasks to do.

Do I just do:

rq worker Q1
rq worker Q2

?
 
(Not sure of the FIFO priority if the worker is listening to multiple queues, think it's FIFO across all of it, but not 100%)
btw, the docs are pretty clear on this point.

rq worker high low

All high queue jobs are processed first. When a job finishes, the worker moves on to the low queue, unless another high queue job has been added in the meantime. Makes sense to me :)
 
btw, sorry forgot to reply to these.
We just tell editors to specify an area on the image which should never be cropped (dragging a rectangle around it) and the resize/crop algorithm works on that. It’s pretty much foolproof but does admittedly require people to do their fucking jobs which is something that’s not at all guaranteed.

In my job, I am the 'editor' (along with everything else). We don't get images supplied, so I scrub through the video until I find a nice image and then extract a frame to use. Then for the other platforms, I edit them down to the correct size/ratios. It's a slow job, hence me wanting to automate it away.

My next idea is to generate candidate screenshots automatically, and then do some sort of 'complexity analysis' on the image, looking at a third of the image at a time.

So taking the image I posted above, I'd cut it into three giving me portrait images, and then calculate a score for each third. I don't know what algorithm to use here as I've not got as far as that yet, but I'm sure something must exist. The idea is that if the central third has more 'complexity' than the two side thirds, then there's a good chance we have an image like above - subject in the centre, blurred/plain around the subject.
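Edge density per third might be a good-enough stand-in for 'complexity' to start with - something along these lines (a completely untested sketch, filename and threshold made up):

Code:
import numpy as np
from PIL import Image, ImageFilter

def third_scores(path):
    # Score each horizontal third by edge density, as a crude 'complexity' measure.
    grey = Image.open(path).convert("L")
    edges = np.asarray(grey.filter(ImageFilter.FIND_EDGES), dtype=float)
    w = edges.shape[1]
    thirds = (edges[:, :w // 3], edges[:, w // 3:2 * w // 3], edges[:, 2 * w // 3:])
    return [t.mean() for t in thirds]

left, centre, right = third_scores("frame_0042.jpg")
# A frame is a candidate if the middle third is clearly busier than the sides.
if centre > 1.5 * max(left, right):
    print("subject probably in the centre - good crop candidate")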

If I take a screenshot every second of video then run the above on it, I can get the script to return the top 10 (or whatever) candidate pics and then choose between them.

If it works well enough, then I could just serve these somewhere and we'll get someone else to pick the images and I can work on more interesting things.
Can you use Photoshop to do the heavy lifting via a batch job/droplet thingy?
I could, but as above, I don't actually have the images yet. I have to create them from video footage, and then do the cropping. It's a ballache. I have another 16 to do in the next 2 hours...and even worse, I have another 1,000 (yep!) to do in the next err, well, they were due on the 1st June :facepalm:
 
More with you now. Yes, that's how you'd do it. :)
Had a little play with rq just now and I'm having some issues...the documentation is awful (the dev admitted as much!) - any chance you know what I'm doing wrong here?

Redis installed and running
rq installed and running etc

Code:
q = Queue('encoder', connection=Redis())
# lots of other stuff happens, then...
q.enqueue(episode.encode_freeview())

Code:
$ rq info
0 queues, 0 jobs total

arnie.61291 idle: encoder
1 workers, 0 queues

Updated: 2018-06-25 13:03:32.534670
But my encode is running!

So is it just bypassing the queue here and running as a normal python function? Can't figure out how to put the task on the queue (and rq info says I don't even have a queue...)
 
Hmm, it might be 'working' after all. Monitoring the Redis db shows new keys when a new episode is detected...

Code:
127.0.0.1:6379> monitor
OK
1529928897.008327 [0 127.0.0.1:46344] "EXPIRE" "rq:worker:arnie.37697" "420"
1529928897.008487 [0 127.0.0.1:46344] "HSET" "rq:worker:arnie.37697" "last_heartbeat" "2018-06-25T12:14:57.008388Z"
1529928897.008607 [0 127.0.0.1:46344] "BLPOP" "rq:queue:encoder" "405"
 
LaTeX, Python, and Jinja2. Generating invoices. I want to be able to send an email to myself which will get picked up by a script running on my home PC to generate an invoice PDF and email it back to me, for me to forward to the customer.

I thought it might be fun to have the email (almost) in natural language (which would mean I could use voice input, too), and have it parse it all out for generation...
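The Jinja2-into-LaTeX part is mostly just swapping Jinja2's delimiters for ones that don't clash with LaTeX braces. A sketch with made-up template and field names:

Code:
import subprocess
from jinja2 import Environment, FileSystemLoader

# LaTeX-friendly delimiters so Jinja2 doesn't fight with { } in the template.
env = Environment(
    loader=FileSystemLoader("templates"),
    block_start_string=r"\BLOCK{", block_end_string="}",
    variable_start_string=r"\VAR{", variable_end_string="}",
    comment_start_string=r"\#{", comment_end_string="}",
)

tex = env.get_template("invoice.tex.j2").render(
    customer="ACME Ltd",
    items=[{"desc": "Consulting", "hours": 8, "rate": 50}],
)
with open("invoice.tex", "w") as f:
    f.write(tex)
subprocess.run(["pdflatex", "-interaction=nonstopmode", "invoice.tex"], check=True)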
 
Fez909 The thing that jumps out is that you don't pass the invoked function/method, but the callable -

So rather than

Code:
q.enqueue(episode.encode_freeview())

it should be

Code:
q.enqueue(episode.encode_freeview)

but since that looks like a method, you may need to do it as

Code:
q.enqueue(Episode.encode_freeview, episode_instance)

So pass the callable Class & method as the first arg, and the actual instance as the first parameter, per this - Can't enqueue instance methods · Issue #582 · rq/rq

...if that makes sense.
 
I spent the last 4 days coding in Python, scraping content out of .docx files, squirting it into MySQL, then using Python, Jinja2 templating and LaTeX to produce beautiful PDFs of the lesson plans.
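The .docx scraping side of it is pleasantly simple with something like python-docx. A rough sketch (filename invented; this just prints rather than loading into MySQL):

Code:
from docx import Document

def extract_paragraphs(path):
    # Pull the plain text out of a .docx, skipping empty paragraphs.
    doc = Document(path)
    return [p.text.strip() for p in doc.paragraphs if p.text.strip()]

for line in extract_paragraphs("lesson_plan_week1.docx"):
    print(line)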

It's the most fun I've had sober and with my clothes on for ages! :)

Now coding a Django front end for all that data.
 
Did you do the rest of the code in Django? If it's pure python I wouldn't bother dropping it into Django for that (and I say that as someone who's ostensibly a professional Django programmer). Sounds like an ideal reason to play with React!
 
Ps: I set up a Minecraft server for my kid this week and it was an absolute fucking ballache, well harder than most stuff I do for work.
 
Did you do the rest of the code in Django? If it's pure python I wouldn't bother dropping it into Django for that (and I say that as someone who's ostensibly a professional Django programmer). Sounds like an ideal reason to play with React!
Doing all the front end stuff was going to be too much of a pain, so that's the Django bit - I'm not ready to take on that steep a learning curve with a new technology!

And the database access class layer I wrote for the pure Python bit is a nastily kludgy parody of Django's :eek:
 