Rivalium integration. #3

Merged
merged 18 commits into from
Oct 21, 2021


ijc8
Owner

@ijc8 ijc8 commented Sep 4, 2021

Resolves #1.

@ijc8
Owner Author

ijc8 commented Sep 6, 2021

Made a first pass at implementing rivalium.send().
PyOgg doesn't have OggOpusWriter in the latest version on PyPI, so this requires installing directly from GitHub:
pip install git+https://github.com/TeamPyOgg/PyOgg.git
Audio is a bit glitchy (https://rvm.sh/0t4hcgfrxdr3); some issue with encoding to Opus, as this happens even when saving locally.

@ijc8
Owner Author

ijc8 commented Sep 6, 2021

Found the issue (OggOpusWriter expects a byte-view of an array of shorts, rather than the array of shorts itself.)
Audio sounds much better (https://rvm.sh/03hpmdggm6gp).
I'm noticing, however, that the POST fails (with 404, strangely) if I try to send segments longer than 0.5 seconds.
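
The byte-view fix described above can be illustrated with the standard library alone. This is a minimal sketch, not PyOgg's API: the helper names `to_pcm16` and `byte_view` are made up for illustration, and the exact argument type that `OggOpusWriter` accepts is known here only from the comment above.

```python
from array import array

def to_pcm16(samples):
    """Clamp floats to [-1.0, 1.0] and scale them to signed 16-bit integers."""
    return array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))

def byte_view(pcm):
    """Return a zero-copy view of the int16 array as unsigned bytes."""
    return memoryview(pcm).cast("B")

pcm = to_pcm16([0.0, 0.5, -1.0, 1.0])
view = byte_view(pcm)
print(len(pcm), "samples ->", len(view), "bytes")
```

Each 16-bit sample occupies two bytes, so the view is twice as long as the array; an encoder that reads the data as bytes then sees the full buffer.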

Also, make `recv()` work with the latest PyOgg.
@ijc8 ijc8 force-pushed the rivalium-integration branch from 6f756fd to 82fc015 Compare September 6, 2021 22:20
Avoids underruns between segments.
@ijc8
Owner Author

ijc8 commented Sep 8, 2021

I noticed that rivalium.send() is not working in PyPy because of an issue when encoding Ogg Opus.
I think the issue is in PyPy, not PyOgg, so I've submitted an issue there: https://foss.heptapod.net/pypy/pypy/-/issues/3546.

ijc8 added 2 commits September 7, 2021 21:36
Exiting during `rivalium.send()` is less annoying now.
Avoids underruns while playing `rivalium.send()`.
@drohen

drohen commented Sep 8, 2021

> I'm noticing, however, that the POST fails (with 404, strangely) if I try to send segments longer than 0.5 seconds.

Your encoder probably isn't compressing enough. In the current version I compress heavily (resampling down to 12000 Hz, then encoding at a 12 kbps bitrate with only 1 channel), which gets each 1-second segment down to around 2 kB; anything double that size is rejected by the server.

This will be changing in the next version, as I have received continual requests to allow slightly higher-quality audio, and I think it's still reasonable for weak or expensive connections to fetch ~7 kB segments. Those use a 64 kbps (variable) bitrate with no downsampling (48 kHz), still on 1 channel. Additionally, with the new design, the 1-second segment size (plus 80 ms of pre-skip, which is excluded before passing to the decoding buffer) will be strict, as the decoding buffer will be limited to 1 second at the output sample rate.
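
To make the size figures above concrete, here is a rough back-of-the-envelope sketch. It only computes the raw Opus payload for a nominal bitrate; container overhead adds a little on top, and variable bitrate can land below the nominal figure. The `REJECT_LIMIT` value is an assumption derived from "double" the ~2 kB typical segment, not a documented server constant.

```python
def payload_bytes(bitrate_kbps, seconds=1.0):
    """Raw encoded payload size for a nominal bitrate, ignoring Ogg overhead."""
    return bitrate_kbps * 1000 * seconds / 8

# Assumption: "double" the ~2 kB segment size mentioned in the comment above.
REJECT_LIMIT = 2 * 2000

for kbps in (12, 64):
    size = payload_bytes(kbps)
    print(f"{kbps} kbps -> {size:.0f} bytes, over limit: {size > REJECT_LIMIT}")
```

At 12 kbps a 1-second segment carries 1500 bytes of payload, consistent with "around 2 kB" once overhead is included, while a nominal 64 kbps segment would exceed the assumed limit.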

@ijc8
Owner Author

ijc8 commented Sep 9, 2021

Okay, thanks for the info. In that case, I'll add a sample_rate argument to send(), and, for now, have it default to 12 kHz. (I'll also up the default segment_duration to 1 second.)

@ijc8 ijc8 marked this pull request as ready for review September 10, 2021 20:19
@ijc8
Owner Author

ijc8 commented Sep 10, 2021

@drohen Okay, I think this includes everything we discussed last week.
(Note that it just supports the random playback mode; there's a TODO to add support for normal and live once they are settled.)
Let me know if this looks good to you; if so, I'll go ahead and merge it in.

@drohen

drohen commented Sep 13, 2021

Sorry for the delay, short of time at the moment and want to give this a proper test. Will let you know by the end of this weekend.

@drohen

drohen commented Sep 18, 2021

Code looks good, I could mostly follow along with the implementation. Did you record the audio for "4hb496kn6yh" directly to rivalium using aleatora?

Took a bit of effort to get started, but that was mostly due to me trying to figure out installing from source and vscode not playing nicely.

when running: play(rivalium.recv("rvm.sh/0VDK-rWa2dsm1fC70sRn5"))
I got:

raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

I tried variations of this:

  • play.rivalium.com/api/VDK-rWa2dsm1fC70sRn5
  • play.rivalium.com/stream/VDK-rWa2dsm1fC70sRn5
  • VDK-rWa2dsm1fC70sRn5

none of which worked. Not sure what is happening because this is a valid endpoint.

when running: play(rivalium.recv("4hb496kn6yh"))

raise PyOggError("The OpusFile library wasn't found or couldn't be loaded (maybe you're trying to use 64bit libraries with 32bit Python?)")
pyogg.pyogg_error.PyOggError: The OpusFile library wasn't found or couldn't be loaded (maybe you're trying to use 64bit libraries with 32bit Python?)

when running:

upload_stream, public_url, admin_url = rivalium.send(rand * 2 - 1)
play(upload_stream)

I hear the random white noise, however I also get:

TypeError: expected c_ubyte_Array_28 instead of c_ubyte

Maybe if you could share a test script I can run with python main.py as well as a couple of steps like "run this, then open browser and listen at x URL" I can make sure I'm using it correctly as well.

@ijc8
Owner Author

ijc8 commented Sep 18, 2021

> Code looks good, I could mostly follow along with the implementation. Did you record the audio for "4hb496kn6yh" directly to rivalium using aleatora?

No, that's just from me playing around with Rivalium in the browser. I used that for testing recv(), which I implemented before send(). Here's an example uploaded directly from Aleatora.

> Took a bit of effort to get started, but that was mostly due to me trying to figure out installing from source and vscode not playing nicely.

Sorry about that; let me know what steps were missing or unclear, and I'll document them.

Setup problems are also the cause of this issue:

> pyogg.pyogg_error.PyOggError: The OpusFile library wasn't found or couldn't be loaded (maybe you're trying to use 64bit libraries with 32bit Python?)

PyOgg requires libopusfile. (On Ubuntu, sudo apt install libopusfile0.)
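
A quick way to check for the shared libraries before importing PyOgg is sketched below using `ctypes.util.find_library` from the standard library. The library names `opus` and `opusfile` are assumptions about what the system loader calls them, and this is not necessarily how PyOgg itself locates them.

```python
from ctypes.util import find_library

def missing_libraries(names=("opus", "opusfile")):
    """Return the names that the system loader cannot locate."""
    return [name for name in names if find_library(name) is None]

missing = missing_libraries()
if missing:
    print("Missing shared libraries:", ", ".join(missing))
else:
    print("All required libraries were found.")
```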

> when running: play(rivalium.recv("rvm.sh/0VDK-rWa2dsm1fC70sRn5"))
> I got:
>
> raise HTTPError(req.full_url, code, msg, hdrs, fp)
> urllib.error.HTTPError: HTTP Error 404: Not Found

This is due to the regex that parses the descriptor. I had the impression that stream/group IDs were alphanumeric, so it uses \w to match them; it fails on the hyphen here. I'll fix the regex; are there any characters in the ID alphabet besides uppercase, lowercase, digits, and hyphen?

However, the last variation you tried ("VDK-rWa2dsm1fC70sRn5") works for me. Did it also fail with an HTTPError, or a PyOggError?

> when running:
>
> upload_stream, public_url, admin_url = rivalium.send(rand * 2 - 1)
> play(upload_stream)
>
> I hear the random white noise, however I also get:
>
> TypeError: expected c_ubyte_Array_28 instead of c_ubyte

You're using send() correctly. Unfortunately there was a bug in PyPy's ctypes implementation that broke PyOgg's encoder, so (until the next release comes out with the fix) you'll need to run in CPython (or PyPy's nightly build) for send().

> Maybe if you could share a test script I can run with python main.py as well as a couple of steps like "run this, then open browser and listen at x URL" I can make sure I'm using it correctly as well.

Sure, here's a quick example:

from aleatora import *
from aleatora.thirdparty import rivalium
# Create an Aleatora stream, and a Rivalium stream.
wavy = osc(200 + 20 * osc(0.2))
upload_wavy, public_url, admin_url = rivalium.send(wavy)
# Play 10 seconds while uploading to Rivalium stream.
# To *just* upload (without playing), use `upload_wavy[:10.0].run()` instead.
run(upload_wavy[:10.0])
print(f"Open this link in a browser: https://rvm.sh/0{rivalium.extract_id(public_url)[1]}")

Save as main.py. Running python main.py should play for 10 seconds, then spit out a link like https://rvm.sh/0whdg9f4nn9f.

@drohen

drohen commented Sep 19, 2021

I noticed the playback is a bit weird for: https://rvm.sh/0whdg9f4nn9f.
When I imported it into Audacity, I saw that no samples were really being dropped, but there is some empty buffer and some bad samples at the beginning. I think the amount of pages/data required to ensure the compression works is about 80 ms, which in my version below is 3840 samples.

It's weird that playback length is coming back as 0.25s, because it clearly is not. Seems like some encoder weirdness. Also page duration in yours is longer, not sure it matters though.

» opusinfo 00000001.ogx
Processing file "00000001.ogx"...

New logical stream (#1, serial: 5755991c): type opus
Encoded with ENCODER=PyOgg
Opus stream 1:
	WARNING: stream 1 has more than one packet of end trimming
	Pre-skip: 240
	Playback gain: 0 dB
	Channels: 1
	Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
	Page duration:   1020.0ms (max), 1020.0ms (avg), 1020.0ms (min)
	Total data length: 2051 bytes (overhead: 8.87%)
	Playback length: 0m:00.250s
	Average bitrate: 65.63 kb/s, w/o overhead: 59.81 kb/s
Logical stream 1 ended

» opusinfo 1IJ4S_8PbRLtONmKHToC.ogx
Processing file "1IJ4S_8PbRLtONmKHToC.ogx"...

New logical stream (#1, serial: 2193bcb5): type opus
Encoded with RecorderJS
Opus stream 1:
	Pre-skip: 3840
	Playback gain: 0 dB
	Channels: 1
	Original sample rate: 48000Hz
	Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
	Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
	Total data length: 1866 bytes (overhead: 9.54%)
	Playback length: 0m:00.919s
	Average bitrate: 16.23 kb/s, w/o overhead: 14.68 kb/s
Logical stream 1 ended

> Sorry about that; let me know what steps were missing or unclear, and I'll document them.

The main point where I got stuck was installing from git using pip, because it didn't seem to install the correct branch. Then when I cloned the project to the same dir and installed it that way, it confused vscode, but I think that was just an edge case.

> I had the impression that stream/group IDs were alphanumeric, so it uses \w to match them; it fails on the hyphen here. I'll fix the regex; are there any characters in the ID alphabet besides uppercase, lowercase, digits, and hyphen?

I use nanoid. The older IDs used the standard library; the new one excludes a bunch of characters.

> However, the last variation you tried ("VDK-rWa2dsm1fC70sRn5") works for me. Did it also fail with an HTTPError, or a PyOggError?

Ah yeah I think it was PyOggError as after installing the missing opus lib it now works. So I can confirm it works as expected in this case.

@ijc8
Owner Author

ijc8 commented Sep 19, 2021

> It's weird that playback length is coming back as 0.25s, because it clearly is not. Seems like some encoder weirdness.

I think you're right (and thanks for the opusinfo output).
I did some poking around, and it looks like OggOpusWriter consistently generates files with incorrect durations for sample rates other than 48kHz. (So encoding at 12kHz results in a file with 1/4th of the intended duration; encoding at 24kHz results in 1/2, and so on.)
I have created an issue for this. In the meantime, I'll see if I can find a workaround.
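
One hypothesis consistent with these symptoms: Ogg Opus granule positions always count samples at 48 kHz, whatever the input rate, so a writer that counts samples at the input rate instead would produce exactly these shortened durations. The sketch below only illustrates that arithmetic; it is not a confirmed diagnosis of PyOgg, and `reported_duration` is a made-up helper.

```python
OPUS_GRANULE_RATE = 48_000  # Ogg Opus granule positions always count 48 kHz samples

def reported_duration(true_seconds, input_rate):
    """Duration a player would report if granules were counted at input_rate."""
    granule = true_seconds * input_rate  # hypothesised (incorrect) sample count
    return granule / OPUS_GRANULE_RATE

for rate in (12_000, 24_000, 48_000):
    print(f"{rate} Hz input: 1.0 s reported as {reported_duration(1.0, rate)} s")
```

For a 1-second segment at 12 kHz this gives 0.25 s, which matches the playback length in the opusinfo output quoted earlier.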

> I use nanoid, and the older IDs used the standard library, the new one excludes a bunch of characters.

Okay. It looks like nanoid indeed uses the alphabet A-Za-z0-9_-. I've updated the regexes accordingly; all the URLs you tried before should work now.
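
The difference between the two character classes can be seen in a small sketch. The patterns below are hypothetical stand-ins (the module's actual regexes and descriptor grammar are not shown in this thread); they only demonstrate that `\w` accepts the underscore but stops at a hyphen.

```python
import re

# Hypothetical patterns for a short-link descriptor, for illustration only.
OLD_PATTERN = re.compile(r"rvm\.sh/(\w+)$")
NEW_PATTERN = re.compile(r"rvm\.sh/([A-Za-z0-9_-]+)$")

descriptor = "rvm.sh/0VDK-rWa2dsm1fC70sRn5"

print(OLD_PATTERN.search(descriptor))           # None: \w does not match "-"
print(NEW_PATTERN.search(descriptor).group(1))  # the full ID, hyphen included
```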

> Ah yeah I think it was PyOggError as after installing the missing opus lib it now works. So I can confirm it works as expected in this case.

👍

@drohen

drohen commented Oct 3, 2021

just wanted to check in to see if you were going to update any more on this? I think the only issue remaining was consistency with the pre-skip length. I don't think the playback length issue should affect the browser decoder, it's more of a semantic issue. Other than that I think this is mostly in a good place.

@ijc8
Owner Author

ijc8 commented Oct 4, 2021

Hey, thanks for the prodding - got a bit busy and had this on the back burner.
I was hoping to hear back about the PyOgg length issue. In my testing, it did cause issues for some players (e.g. VLC and Totem cut off early), but maybe browser decoders won't care - in any case, it's no reason to hold up the PR.
I'll look into increasing the pre-skip length later this week; if that suffices to fix the playback issues, I'll go ahead and merge this.

@ijc8
Owner Author

ijc8 commented Oct 7, 2021

I took a look at this tonight, with no luck. Even with increased pre-skip, there's a small gap at the start, regardless of the sample rate.
On the other hand, if I first write to a .wav and then encode to opus with ffmpeg, there is no gap, and only a small preskip (e.g. 312 samples). (I think 80 ms is only necessary for certain use cases.)
So, I think there is something else going awry.

PyOgg's high-level API for opus encoding seems generally somewhat flaky, which is fair enough, as it hasn't been pushed out in a stable release yet. At this point, I think the simplest approach to get correct output is to either use the C bindings provided by PyOgg directly, or just call out to an external tool like ffmpeg or opusenc.
I am leaning towards the latter; I have a working example (with no initial gap) that uses PyOgg's bindings for libopusenc, which is pretty straightforward, but this requires libopusenc, which unfortunately is not in the Ubuntu repos (though there is an unofficial .deb available here).

@ijc8
Owner Author

ijc8 commented Oct 7, 2021

Alright, got this working with ffmpeg-python. There are no gaps at the start of encoded segments, the user can now specify the bitrate directly (defaults to 12kbps), the code's a bit shorter, and hopefully installation will be easier.
Here's an example stream of "wavy" encoded at 12kbps: https://rvm.sh/0xgp7xc3g9cf.
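
For readers who want to see roughly what handing the encoding to ffmpeg involves, here is a minimal sketch that uses `subprocess` rather than ffmpeg-python. The function names, the raw float32 input format, and the exact flag set are assumptions for illustration; the flags actually used in this PR are not shown in the thread.

```python
import subprocess

def opus_encode_command(sample_rate=12000, bitrate="12k"):
    """Build an ffmpeg command that reads raw mono float32 PCM from stdin
    and writes an Ogg Opus stream to stdout."""
    return [
        "ffmpeg",
        "-hide_banner", "-loglevel", "error",  # limit log output to errors
        "-f", "f32le", "-ar", str(sample_rate), "-ac", "1", "-i", "pipe:0",
        "-c:a", "libopus", "-b:a", bitrate,
        "-f", "ogg", "pipe:1",
    ]

def encode_segment(pcm_bytes, **kwargs):
    """Encode one segment; requires an ffmpeg binary with libopus on PATH."""
    result = subprocess.run(opus_encode_command(**kwargs), input=pcm_bytes,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout

print(" ".join(opus_encode_command()))
```

Calling `encode_segment(raw_bytes, sample_rate=12000, bitrate="12k")` would return the encoded segment as bytes, assuming ffmpeg is installed.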

@ijc8
Owner Author

ijc8 commented Oct 12, 2021

@drohen Let me know if this works for you.

@drohen

drohen commented Oct 16, 2021

Apologies again for the delay. I don't have much time beyond my weekend to review stuff, and even then I have a long list of stuff I often need to do. I tested with your example above and a couple of my own; it mostly sounds great on my end (at least as good as I think it will sound for the current rvm system).

Two issues I encountered:

  • On my (ubuntu) system, ffmpeg seems to be outputting decode data to the terminal for each decoded segment, so there's just a constant stream of output information. Is there a way this can be made quiet?
  • I tried with VDK-rWa2dsm1fC70sRn5, which I mentioned earlier in this PR. Playback works for a little while and then just stops, with no information given as to why. It's strange; I'm not sure if it's the decoding or if it's the server/application not populating the decode queue. Do you encounter the same problem?

@drohen

drohen commented Oct 16, 2021

Additionally, I remembered why the pre-skip thing was important. In the updated version, in order to splice the audio together correctly (rather than cross-fade), the pre-skip needs to be known so these initial samples can be dropped from the decode buffer. Does the ffmpeg decoder drop these automatically? It seems the browser tool doesn't, so I can simply infer (from the hardcoded 80 ms in the browser encoder) that I can drop these samples. But if this will vary, I'll need to find a way to probe the file for the pre-skip length, or have it sent with the file data.

Speaking of file data: although unrelated to this PR, I'm currently trying to find an effective way to split up segments so they can be spliced together again, but from what I can see it's probably only possible with inconsistent segment lengths that always split at a zero crossing. I'm not certain of this, though; I simply can't figure out how to get the join point to stitch together nicely while also not losing any samples (my current design has pre-skip of previous audio that gets chopped, and 1 second of current audio that gets chopped at the beginning and end to its zero crossing, which is usually only a few hundred samples at most). I haven't found a way to get a consistent sine wave to not "throb", but my next experiment will attempt the inconsistent-length approach. I wonder if you have any thoughts on this, and if you think this will create much complexity in aleatora.

@ijc8
Owner Author

ijc8 commented Oct 18, 2021

> Apologies again for the delay, I don't have much time beyond my weekend to review stuff, and even then I have a long list of stuff I often need to do.

No worries, I'm in the same boat. 🚣

> I tested with your example above and a couple of my own; it mostly sounds great on my end (at least as good as I think it will sound for the current rvm system).

Glad to hear it!

> On my (ubuntu) system, ffmpeg seems to be outputting decode data to the terminal for each decoded segment, so there's just a constant stream of output information. Is there a way this can be made quiet?

Yep - fixed, thanks.

> I tried with VDK-rWa2dsm1fC70sRn5, which I mentioned earlier in this PR. Playback works for a little while and then just stops, with no information given as to why. It's strange; I'm not sure if it's the decoding or if it's the server/application not populating the decode queue. Do you encounter the same problem?

No. I've heard occasional hiccups near the start, but I haven't encountered playback stopping completely.

> Does the ffmpeg decoder drop these automatically?

Yes, ffmpeg will drop whatever pre-skip is declared in the file (as reported by opusinfo).
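
The declared pre-skip can also be read straight from the file, since it is stored in the `OpusHead` identification packet at the start of every Ogg Opus stream (a 16-bit little-endian count of 48 kHz samples, right after the version and channel-count bytes). The following is a minimal sketch; `read_opus_head` is a made-up helper that simply searches for the magic string rather than parsing Ogg pages properly.

```python
import struct

def read_opus_head(data):
    """Parse the OpusHead packet found near the start of an Ogg Opus file."""
    start = data.find(b"OpusHead")
    if start < 0:
        raise ValueError("no OpusHead packet found")
    # Fields after the 8-byte magic: version, channels, pre-skip,
    # input sample rate, output gain (all little-endian).
    version, channels, pre_skip, input_rate, gain = struct.unpack_from(
        "<BBHIh", data, start + 8)
    return {
        "channels": channels,
        "pre_skip": pre_skip,                      # in 48 kHz samples
        "pre_skip_ms": pre_skip * 1000 / 48000,
        "input_sample_rate": input_rate,
    }

# Synthetic header for demonstration (not a complete Ogg page).
fake = b"OpusHead" + struct.pack("<BBHIhB", 1, 1, 3840, 48000, 0, 0)
info = read_opus_head(fake)
print(info["pre_skip"], "samples =", info["pre_skip_ms"], "ms")
```

With a real file, something like `read_opus_head(open(path, "rb").read(4096))` should be enough, since the header sits in the first page.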

> It seems the browser tool doesn't, and hence I can simply infer (by using the hardcoded 80ms in the browser encoder) that I can drop these samples, but I guess if this will vary, then I'll need to find a way to probe the file for the pre-skip length, or have it sent with the file data.

Yeah, not sure what opus-recorder does here. I did a little experiment, and it seems that the browser itself (as in the <audio> tag and Web Audio API) does skip the pre-skip automatically. So you might consider the built-in decodeAudioData() unless you have reason to avoid it.

> I'm not certain of this though, but I simply can't figure out how to get the join point to stitch together nicely while also not losing any samples (my current design has pre-skip of previous audio that gets chopped, and 1 second of current audio that gets chopped at the beginning and end to its zero crossing, which is only usually a few hundred samples at most). I haven't found a way to get a consistent sine wave to not "throb" but my next experiment will attempt the inconsistent length approach. I wonder if you have any thoughts on this, and if you think this will create much complexity in aleatora.

Yes, if you want to avoid both clicks and throwing away (or enveloping) samples, I think you'll need inconsistent lengths. Say, for each segment, you lop off everything after the last zero-crossing and tack it on to the next segment. If the segments are played back in order, you just get the original sequence of samples back, but if the segments are played back out-of-order, you avoid clicks at the boundaries (because all segments begin and end with a zero-crossing).
(You're still liable to get artifacts with sine wave segments played back out-of-order, because some zero-crossings occur in the rising part of each cycle and some occur in the falling part, but this should avoid the "throb" during in-order runs.)
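
The segment-shifting idea described above can be sketched in plain Python. This is only an illustration under simple assumptions (a list of float samples, a fixed nominal segment length); the function names are made up and this is not the code used by aleatora or Rivalium.

```python
import math

def last_zero_crossing(samples):
    """Index i of the last sign change between samples[i-1] and samples[i],
    or None if the chunk never crosses zero."""
    for i in range(len(samples) - 1, 0, -1):
        if (samples[i - 1] < 0) != (samples[i] < 0):
            return i
    return None

def shift_segments(samples, nominal_length):
    """Split samples into variable-length segments that each end at a zero
    crossing, carrying the leftover tail into the next segment."""
    segments, carry = [], []
    for start in range(0, len(samples), nominal_length):
        chunk = carry + list(samples[start:start + nominal_length])
        cut = last_zero_crossing(chunk)
        if cut is None:          # no crossing yet: keep accumulating
            carry = chunk
            continue
        segments.append(chunk[:cut])
        carry = chunk[cut:]
    return segments, carry

wave = [math.sin(2 * math.pi * 5 * n / 100) for n in range(300)]
segments, carry = shift_segments(wave, 100)
print([len(s) for s in segments], len(carry))
```

Concatenating the segments (plus the final carry) in order reproduces the original samples exactly, which is the in-order property described above. A signal that never crosses zero would accumulate in `carry` indefinitely, so a real implementation would probably want a maximum length as a fallback.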

Presumably this segment-shifting will happen either before encoding + upload or after download + decoding. Either will add a little complexity to aleatora's rivalium module, but probably not much more than the current zero-crossing crop approach.

In the meantime, I think this PR is ready to go. ffmpeg-python seems to be working nicely, and the module now works with PyPy. The verbose decoding issue is resolved. I wasn't able to reproduce the issue you mentioned with VDK-rWa2dsm1fC70sRn5, but I'm happy to debug and get a fix in later if you run into it again.

@drohen

drohen commented Oct 20, 2021

Yeah I agree, go ahead and merge and let's jam sometime to try it all out.

Re decodeAudioData: yeah, when I first built this it wasn't working as expected, but maybe I know more now, and it is probably the superior option. Makes me think I should see if MediaRecorder now supports Opus on all platforms (I think Safari didn't last I checked).

Re zero crossing: you could avoid the further issue just by ensuring the cut is always at a negative-to-positive zero crossing.

@ijc8
Owner Author

ijc8 commented Oct 21, 2021

> Re zero crossing: you could avoid the further issue just by ensuring the cut is always at a negative-to-positive zero crossing.

Good idea.
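
Restricting cuts to rising crossings is a small change to how the cut point is chosen. A minimal sketch, with a made-up helper name, assuming a plain list of float samples:

```python
def last_rising_crossing(samples):
    """Index of the last sample where the signal goes from negative to
    non-negative, or None if there is no such point."""
    for i in range(len(samples) - 1, 0, -1):
        if samples[i - 1] < 0 <= samples[i]:
            return i
    return None

print(last_rising_crossing([0.5, -0.5, 0.5, -0.5]))
```

With this rule every segment ends on a negative sample and the next one starts on a non-negative sample, so segments joined out of order still meet with the same slope direction.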

> Yeah I agree, go ahead and merge and let's jam sometime to try it all out.

Sweet, shoot me an email to let me know when!

@ijc8 ijc8 merged commit fc247f5 into main Oct 21, 2021
@ijc8 ijc8 deleted the rivalium-integration branch October 21, 2021 03:57
Successfully merging this pull request may close these issues.

proposal: integrating streams from sludge server
2 participants