Wire speech-to-text request (whisper container)
Once #1697 (closed) is implemented, the UI needs to be wired to the whisper container: it must request a transcription and transmit the selected language with that request.
This should come after the bounding boxes are properly implemented.
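
A minimal sketch of the UI-side wiring described above, to make the two requirements (requesting transcription, transmitting the selected language) concrete. Everything named here is an assumption for illustration only: the `/transcribe` path, the `language` field name, the `TranscriptionRequest` shape, and the `buildTranscriptionRequest` helper are hypothetical and not taken from the whisper container's actual API.

```typescript
// Hypothetical request descriptor; the whisper container's real API
// is not specified in this issue.
interface TranscriptionRequest {
  url: string;
  method: "POST";
  fields: { language: string };
  audio: Uint8Array;
}

// Assumed endpoint path, purely illustrative.
const TRANSCRIBE_PATH = "/transcribe";

// Builds the request the UI would send: the audio to transcribe plus
// the language the user selected. Rejects an empty language selection.
function buildTranscriptionRequest(
  baseUrl: string,
  audio: Uint8Array,
  language: string,
): TranscriptionRequest {
  const selected = language.trim();
  if (selected === "") {
    throw new Error("a language must be selected before requesting transcription");
  }
  return {
    // Drop trailing slashes so the path is joined with exactly one "/".
    url: baseUrl.replace(/\/+$/, "") + TRANSCRIBE_PATH,
    method: "POST",
    fields: { language: selected },
    audio,
  };
}
```

Keeping the request construction in a pure function like this would let the language handling be unit-tested before the container endpoint is finalized; the actual transport (for example a multipart upload) can be added once the container's interface is known.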