feat: two-way audio (listen + talk), all platforms #26
Merged
feat(audio): phase 17 listen slice, camera audio + volume/mute
- Core: AudioFrame, IVideoSession.AudioFrames, EnableAudio flag, IAudioOutput/NullAudioOutput, AudioMonitor (one-source policy + software gain)
- Video: FfmpegVideoSession decodes audio via swresample → S16/48k/stereo; auto-reconnect forwards it
- Desktop: native WASAPI shared-mode renderer (AUTOCONVERTPCM, ring buffer)
- App: speaker toggle + volume slider on single-camera page, default muted, persisted
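
The "software gain" mentioned for AudioMonitor can be illustrated with a minimal sketch. This is not the PR's implementation: `apply_gain_s16` is a hypothetical name, and the sketch assumes the gain is a plain linear multiplier applied to S16 samples with clamping, which the commit message does not confirm.

```python
import array


def apply_gain_s16(pcm: bytes, gain: float) -> bytes:
    """Scale signed 16-bit PCM (native byte order) by `gain`.

    Hypothetical helper: clamps to the S16 range so loud input
    saturates instead of wrapping around.
    """
    samples = array.array("h")
    samples.frombytes(pcm)  # raises ValueError if len(pcm) is odd
    for i, s in enumerate(samples):
        scaled = int(round(s * gain))
        samples[i] = max(-32768, min(32767, scaled))
    return samples.tobytes()
```

Because gain is applied per sample, the same function works for interleaved stereo; a gain of 0.0 is one way to implement mute without stopping the decoder.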
feat(audio): grid tile listen via dynamic SetAudioEnabled (phase 17)
- IVideoSession.SetAudioEnabled toggles audio decode on a live session, no video blip
- FfmpegVideoSession lazily sets up/tears down audio decoder in the loop; sticky across reconnects
- per-tile speaker button; one-source policy displaces the previously-listening tile
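
The one-source policy described above (a new listener displaces the previous one) can be sketched as follows. The class and method names are hypothetical stand-ins, not the actual AudioMonitor or IVideoSession API; the sketch only assumes that each session exposes a way to toggle audio decode.

```python
class FakeSession:
    """Stand-in for a video session; only tracks the audio flag."""

    def __init__(self, name):
        self.name = name
        self.audio_enabled = False

    def set_audio_enabled(self, enabled):
        self.audio_enabled = enabled


class OneSourceMonitorSketch:
    """At most one session has audio enabled at any time."""

    def __init__(self):
        self.active = None

    def listen(self, session):
        """Enable audio on `session`; return the displaced session, if any."""
        previous = self.active
        if previous is session:
            return None  # already the active source, nothing to displace
        if previous is not None:
            previous.set_audio_enabled(False)
        session.set_audio_enabled(True)
        self.active = session
        return previous

    def stop(self, session):
        """Disable audio only if `session` is the current source."""
        if self.active is session:
            session.set_audio_enabled(False)
            self.active = None
```

Returning the displaced session lets the UI update the speaker icon on the tile that just lost audio.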
feat(audio): two-way talk via ONVIF backchannel (phase 17.5/17.6)
- RtspBackchannelClient: minimal RTSP (DESCRIBE+Require/SETUP TCP-interleaved/PLAY), Basic/Digest auth, SDP sendonly track, interleaved RTP send + keepalive
- IAudioBackchannelClient/Session contracts; PushToTalkController wires capture->G711->RTP->send (tested with fakes)
- single-camera mic button (tap to talk), gated on mic availability; backchannel failures surface as TalkError
- NOTE: needs a Profile-T camera with a speaker to validate end-to-end
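
The capture->G711->RTP->send chain can be sketched end to end. Several details here are assumptions, not taken from the PR: the G.711 variant is taken to be μ-law (PCMU, static RTP payload type 0), the input is little-endian S16, and the interleaved channel number (normally taken from the `interleaved=` value in the SETUP response) is a placeholder. All function names are hypothetical.

```python
import struct

_BIAS = 0x84   # standard mu-law bias
_CLIP = 32635  # largest magnitude that survives adding the bias


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit sample as a G.711 mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(-sample if sample < 0 else sample, _CLIP) + _BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def encode_ulaw_frame(pcm_s16le: bytes) -> bytes:
    """Encode little-endian S16 mono PCM (assumed layout) to mu-law."""
    count = len(pcm_s16le) // 2
    samples = struct.unpack("<%dh" % count, pcm_s16le[: count * 2])
    return bytes(linear_to_ulaw(s) for s in samples)


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 0) -> bytes:
    """Prepend a 12-byte RTP header (version 2, no padding/extension/CSRC)."""
    header = struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def frame_interleaved(packet: bytes, channel: int) -> bytes:
    """Wrap a packet for RTSP TCP interleaving: '$', channel, 16-bit length."""
    return b"$" + struct.pack("!BH", channel, len(packet)) + packet
```

With 8 kHz audio the RTP timestamp advances by the number of samples per packet, so a sender using 160-sample packets would increment it by 160 each time while incrementing the sequence number by 1.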
feat(audio): mic capture on all platforms (phase 17.6)
- Linux AlsaAudioInput (libasound), macOS CoreAudioInput (AudioQueue)
- Android AudioRecord (+RECORD_AUDIO manifest), iOS AVAudioEngine (+NSMicrophoneUsageDescription)
- registered per head; 8k mono S16 to match the G.711 backchannel
- verified compile: Win/Linux/macOS/Android; iOS builds on Mac only
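
Capturing at 8 kHz mono S16 keeps the arithmetic simple: one sample in, one G.711 byte out. Platform capture APIs usually deliver buffers of arbitrary size, so a sender typically re-chunks them into fixed frames first. The sketch below assumes a 20 ms packetization interval (160 samples, 320 bytes of S16), which is a common choice but is not stated in the PR; `FrameChunker` is a hypothetical name.

```python
from typing import List


class FrameChunker:
    """Re-chunk arbitrary capture buffers into fixed-duration PCM frames."""

    def __init__(self, sample_rate: int = 8000, frame_ms: int = 20,
                 bytes_per_sample: int = 2):
        # 8000 Hz * 20 ms = 160 samples; 160 * 2 bytes = 320 bytes per frame
        self.frame_bytes = sample_rate * frame_ms // 1000 * bytes_per_sample
        self._pending = bytearray()

    def push(self, data: bytes) -> List[bytes]:
        """Add captured bytes; return every complete frame now available."""
        self._pending += data
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[: self.frame_bytes]))
            del self._pending[: self.frame_bytes]
        return frames
```

Any leftover bytes stay buffered until the next capture callback, so frame boundaries never split a packet.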
feat(audio): native playback on all platforms (phase 17.2)
- Linux AlsaAudioOutput, macOS CoreAudioOutput (shared PcmRing), Android AudioTrack, iOS AVAudioPlayerNode
- registered per head -> listen now works everywhere, not just Windows
- verified compile: Win/Linux/macOS/Android; iOS builds on Mac only
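
A PCM ring buffer of the kind the playback heads share decouples the decoder, which writes in bursts, from the audio callback, which reads at a fixed rate. The sketch below is an illustration only: the real PcmRing's overflow and underrun behavior is not described in the PR, so dropping excess input and padding short reads with silence are assumptions, and `PcmRingSketch` is a hypothetical name.

```python
class PcmRingSketch:
    """Fixed-capacity byte ring for PCM; single-threaded illustration."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._read = 0   # index of the oldest unread byte
        self._size = 0   # number of unread bytes

    def write(self, data: bytes) -> int:
        """Store as much of `data` as fits; return bytes accepted."""
        n = min(len(data), self._capacity - self._size)
        pos = (self._read + self._size) % self._capacity
        first = min(n, self._capacity - pos)
        self._buf[pos:pos + first] = data[:first]
        self._buf[0:n - first] = data[first:n]  # wrapped remainder
        self._size += n
        return n

    def read(self, count: int) -> bytes:
        """Return exactly `count` bytes, zero-padded on underrun."""
        n = min(count, self._size)
        first = min(n, self._capacity - self._read)
        out = bytes(self._buf[self._read:self._read + first])
        out += bytes(self._buf[0:n - first])
        self._read = (self._read + n) % self._capacity
        self._size -= n
        return out + bytes(count - n)  # silence for S16 is all zeros
```

A production version would need synchronization (or a lock-free design) between the decode thread and the audio callback; that is omitted here.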
feat(audio): hold-to-talk, tile speaker gating, Android mic prompt (phase 17.6)
- push-to-talk is now press-and-hold (BeginTalk/EndTalk via pointer events), guards release-during-connect
- grid tile speaker hidden only when a probed ONVIF camera reports no mic
- request RECORD_AUDIO at Android startup
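
The release-during-connect guard is the subtle part of press-and-hold: if the user lets go while the backchannel is still connecting, the session must be closed as soon as it arrives rather than left open. A minimal asyncio sketch of that idea follows. It is not the PR's PushToTalkController; the class, the `connect` callable, and the session's `close()` method are all hypothetical.

```python
import asyncio


class HoldToTalkSketch:
    """Press-and-hold controller that tolerates release during connect."""

    def __init__(self, connect):
        self._connect = connect    # async callable returning a session
        self._held = False         # True while the button is pressed
        self._session = None
        self._lock = asyncio.Lock()

    @property
    def is_talking(self) -> bool:
        return self._session is not None

    async def begin_talk(self):
        self._held = True
        async with self._lock:
            if self._session is not None or not self._held:
                return
            session = await self._connect()
            if not self._held:
                # Button was released while connecting: discard the session.
                await session.close()
                return
            self._session = session

    async def end_talk(self):
        self._held = False  # visible to an in-flight begin_talk immediately
        async with self._lock:
            if self._session is not None:
                await self._session.close()
                self._session = None
```

Clearing the held flag before taking the lock is the key design choice: an in-flight connect sees the release as soon as it completes, and `end_talk` then finds nothing left to close.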
Summary
Adds two-way audio: listen to the camera's mic and push-to-talk into its speaker. FFmpeg is used only for audio decode/encode; playback, capture, and the backchannel are native per platform.
Related
Type
Checklist
- TreatWarningsAsErrors=true).
- dotnet test); new Core logic has unit tests.
- App references Core only (Infrastructure / Video / Devices wired via DI in a head).
Platforms tested
Screenshots / notes