-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Post processing step #77
base: main
Are you sure you want to change the base?
Conversation
… before that it gets encoded. - DataSources now have an optional handler: "PostProcessor" that will be called for each input and output buffers, just like the resample ore remix classes, but the goal in this case is to do a custom manipulation of the final raw data. It can also be use to prevent the data from being sent to the encoder (by not not filling the outputBuffer in the postProcess method) . For example, it can be use to merge two audio DataSource. - Adds two PostProcessors that can be used to mix the audio of two DataSources: MixerSourceAudioPostProcessor is used accumulate and not write the raw audio data of the first DataSouce. MixerTargetAudioPostProcessor is used to do the mixing and write the mixed audio data to the output buffer when processing the second DataSource. - Adds an AudioPostProcessor to change the volume of a DataSource
…ta sources." This reverts commit 68198ce.
Hey @cbernier2 , thanks a lot for your contribution. I'm sorry for delay, I've been very busy. I'll take a look at this tomorrow! Also out of curiosity, do you and @mudar work together? Not sure if I have asked already |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left questions here and there, my main doubts are around the changes in Engine
(I don't think they are needed) and also about PostProcessor.calculateNewDurationUs()
, do we really need the ability for a processor to change the length? If we don't, that would be extremely simpler.
@@ -253,46 +258,55 @@ public long interpolate(@NonNull TrackType type, long time) { | |||
}; | |||
} | |||
|
|||
private long getTrackDurationUs(@NonNull TrackType type) { | |||
private long getTrackDurationUs(@NonNull TrackType type, boolean processedDuration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain to me why we need two versions of getTrackDurationUs
and getTotalDurationUs
? These are used to compute transcoding progress/end so it seems to me that we should not take post processing into account here. Even if the processor transforms 10 seconds into 20 seconds, when the source reaches 10, we'll have decoded 10 seconds and encoded 20 and so we're done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
processedDuration=true: Duration after that the data went through the processor. It is used to get the expected total duratio, to know if we should force the end of the stream.
processedDuration=false: Duration of the data that has been read. It is used to calculate the progress because it is the reading and extracting steps that take most of the time: It depends of course of what the processors do but that we can't know.
In our case, we use this system to mix the sound of the audio track with the data sources (https://github.com/cbernier2/Transcoder/blob/release/lib/src/main/java/com/otaliastudios/transcoder/postprocessor/MixerSourceAudioPostProcessor.java and https://github.com/cbernier2/Transcoder/blob/release/lib/src/main/java/com/otaliastudios/transcoder/postprocessor/MixerTargetAudioPostProcessor.java) ... so one processor only acculate data and skip it so the processed duration doesn't increase while the transcoder is doing this step,
@@ -361,8 +375,8 @@ public void transcode(@NonNull TranscoderOptions options) throws InterruptedExce | |||
// This can happen, for example, if user adds 1 minute (video only) with 20 seconds | |||
// of audio. The video track must be stopped once the audio stops. | |||
long totalUs = getTotalDurationUs() + 100 /* tolerance */; | |||
forceAudioEos = getTrackReadUs(TrackType.AUDIO) > totalUs; | |||
forceVideoEos = getTrackReadUs(TrackType.VIDEO) > totalUs; | |||
forceAudioEos = getTrackProgressUs(TrackType.AUDIO, true) > totalUs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, here you compare a processed progress (getTrackProgress(true)
) with a non-processed duration (getTrackDuration(false)
). I don't think this makes sense, we should rather use false for both. But maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our case we use this system to mix the custom audio track with the videos audio track, So the initial audio track duration is 2x the duration fo the videos. We use https://github.com/cbernier2/Transcoder/blob/release/lib/src/main/java/com/otaliastudios/transcoder/postprocessor/MixerSourceAudioPostProcessor.java to accumulate the audio track data and not write it until it it can be mixed: with https://github.com/cbernier2/Transcoder/blob/release/lib/src/main/java/com/otaliastudios/transcoder/postprocessor/MixerTargetAudioPostProcessor.java.
Without processedDuration=true, the transcoder would end the audio track stream before that the audio get mixed... even if nothing has been written to the output file for the audio track.
* @param bufferDurationUs the duration of the input data | ||
* @return the duration of the output data | ||
*/ | ||
long postProcess(@NonNull final ShortBuffer inputBuffer, @NonNull final ShortBuffer outputBuffer, long bufferDurationUs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the number of channels and sample rate should also be here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the number of channels and sample rate should also be here.
Indeed! I will add them.
/** | ||
* Returns the duration of the data source on it has been processed (after calling the postProcess() method) | ||
* @param durationUs the original duratin in Us | ||
* @return the new duration in Us | ||
*/ | ||
long calculateNewDurationUs(long durationUs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you guys really need this ability? For what? Everything would be much simpler if you remove this option.
For example the postProcess() function could be as simple as :
void postProcess(ShortBuffer buffer, int sampleRate, int channels);
If the duration is unchanged, the audio processor can simply rewrite into the input buffer. This would be more efficient and simplify the AudioEngine as well.
public interface DataSource { | ||
|
||
/** | ||
* Returns an handler that need to be executed with the raw data source data | ||
* before that it gets encoded. | ||
* | ||
* @return the PostProcessor object | ||
*/ | ||
PostProcessor getPostProcessor(); | ||
|
||
/** | ||
* Sets the handler that needs to be called before that the raw data source data | ||
* gets sent to the encoder. | ||
* | ||
* @param postProcessor the PostProcessor object | ||
*/ | ||
void setPostProcessor(@NonNull PostProcessor postProcessor); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is practical but I think that a DataSource should not hold the processor. It's just a source.
Instead, we could create an internal map like so TranscoderOptions.addProcessor(Processor, DataSource...)
. So when you add the processor you specify all the sources it applies to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
package com.otaliastudios.transcoder.postprocessor; | ||
|
||
public interface PostProcessor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small detail, can we change the package and name to processor/Processor? Removing the "post" everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
Yes we are :) |
#76
It includes an example of AudioPostProcessor to change the volume, but not the two AudioPostProcessors that we use to mix the audio as they are maybe too specific to our need. They are in a temporary commit instead.