Initial version of handling audio object content. #304
Closed
thirv wants to merge 2 commits into xiph:master from
Member
I've noticed that object_analysis() is quite similar to surround_analysis(). Did you consider having a single function implement the analysis for both cases? Or are you expecting object_analysis() to further diverge from the surround case as it gets improved?
Author
Yeah, I think it might be good to have it separate in case of further developments.
Contributor
The master branch has been deleted and all PRs against master are closed. If this change is still relevant, please reopen and repoint your PR to the main branch.
Object audio is an emerging immersive format that is especially interesting for content creators. An audio object consists of 1) an audio waveform stem (typically mono) and 2) associated object metadata. Currently the most prominent application is spatial objects, where the metadata describes the spatial position of the object as a function of time, and all objects in the content are rendered simultaneously according to their metadata. Objects are in principle largely layout-agnostic, so the content can be reproduced on any multi-loudspeaker setup, on headphones, etc. Typically, content creators prefer that the objects are purely spatial, without any renderer-side interactivity, to preserve the artistic intent.
This PR is an initial step toward object audio compression using Opus. The most naive solution would be to code each object with a separate mono Opus instance at equal bitrate. However, since the number of objects can be large, this quickly becomes costly in total bitrate. Luckily, Opus already implements multistream coding, as well as a mechanism for adjusting the rate of individual channels/streams based on an analysis of the joint masking among all channels. Neither handling of object metadata nor decoder-side rendering is implemented, and it may be reasonable to leave these outside Opus in general. All object PCM waveforms are assumed to be provided as input in, e.g., a multichannel file.
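To make the multistream idea concrete, here is a minimal sketch of how N mono objects could be laid out as N uncoupled streams. The helper name build_object_mapping() is hypothetical and not part of Opus, and whether this PR goes through the plain multistream entry point shown in the comment is an assumption; the PR text does not say.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Opus.
 * Fills an identity channel mapping so that input channel i (object i)
 * is coded by mono stream i. Returns 0 on success, -1 if the object
 * count is outside the 1..255 range accepted by the multistream API. */
int build_object_mapping(unsigned char *mapping, int n_objects)
{
    assert(mapping != NULL);
    if (n_objects < 1 || n_objects > 255)
        return -1;
    for (int i = 0; i < n_objects; i++)
        mapping[i] = (unsigned char)i;
    return 0;
}

/* Possible use with the existing libopus multistream API (needs libopus,
 * so it is only shown as a comment here):
 *
 *   int err;
 *   OpusMSEncoder *enc = opus_multistream_encoder_create(
 *       48000, n_objects,   // sample rate, input channels
 *       n_objects, 0,       // streams, coupled (stereo) streams
 *       mapping, OPUS_APPLICATION_AUDIO, &err);
 */
```

With zero coupled streams every object stays an independent mono stream inside one multistream packet, which is what lets a joint analysis shift rate between objects instead of fixing an equal bitrate per encoder instance.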
The underlying spatial masking model for bitrate allocation assumes that all objects in the content/multistream are rendered with a typical spatial renderer (such as EAR). Decoder-side interactivity (e.g. changing the levels) is not assumed here. In a typical listening room with reflections (as opposed to a free field or an anechoic room), spatial release from masking is not very prominent, so for typical object content this first approximation simply assumes no spatial release from masking between the objects. Despite the simplicity, object_analysis() is added as a separate function to make future development easier.
Related PRs to other Opus projects: TODO.
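As an illustration of what "no spatial release from masking" implies, here is a minimal sketch in which every object contributes fully to the mask in each band, regardless of where it is rendered. It is not the code in this PR: the function name object_band_share(), the linear-energy input, and the object-major layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Opus.
 * energy: n_objects * n_bands linear band energies, object-major.
 * share:  output with the same layout; the fraction of each band's
 *         total energy that belongs to the object (0 in a silent band).
 * Object positions are deliberately ignored: with no spatial release
 * from masking, all objects add to the same per-band mask. */
void object_band_share(const float *energy, int n_objects, int n_bands,
                       float *share)
{
    assert(energy != NULL && share != NULL);
    assert(n_objects > 0 && n_bands > 0);

    for (int b = 0; b < n_bands; b++) {
        /* Position-independent mask: sum over all objects in this band. */
        float total = 0.0f;
        for (int o = 0; o < n_objects; o++)
            total += energy[(size_t)o * n_bands + b];

        for (int o = 0; o < n_objects; o++) {
            size_t i = (size_t)o * n_bands + b;
            share[i] = total > 0.0f ? energy[i] / total : 0.0f;
        }
    }
}
```

A small share means the object is largely covered by the other objects in that band, so a rate allocator could spend fewer bits there; how the analysis result is actually mapped to per-stream rate is left to the encoder.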
Comments and suggestions welcome!