8x (or more) compression for your podcast archive.

merkelspam · Post by **merkelspam** » Sat Feb 17, 2024 6:49 am

How I scrunch my mp3s:

mp3toopus.sh.tgz

https://envs.sh/Fkn.mp3 780404 bytes

https://envs.sh/FkT.opus 37542 bytes

20.7x filesize reduction
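
The reduction figure follows from the two byte counts listed above. A quick way to check it, as a minimal sketch (the function name is only illustrative):

```python
def reduction_factor(original_bytes: int, compressed_bytes: int) -> float:
    """Return how many times smaller the compressed file is."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


mp3_bytes = 780404   # size listed for Fkn.mp3
opus_bytes = 37542   # size listed for FkT.opus

factor = reduction_factor(mp3_bytes, opus_bytes)
print(f"{factor:.2f}x smaller")  # prints "20.79x smaller"
```

The post's 20.7x is this value with the second decimal dropped.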

The script features volume normalization, aggressive compression, and a gate cutoff to save bits on silent parts.
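
The script itself is only attached as an archive, so the following is not its code. It is a rough Python sketch of what the three named stages do to a block of samples, assuming floating-point samples in the range -1.0 to 1.0. The thresholds, the ratio, the frame size, and the stage order are all assumptions chosen for illustration, and a real gate or compressor would also smooth its gain over time (attack/release), which is left out here.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def noise_gate(samples, threshold_db=-45.0, frame=160):
    """Replace every frame whose peak is below the threshold with silence."""
    threshold = db_to_gain(threshold_db)
    out = []
    for start in range(0, len(samples), frame):
        block = samples[start:start + frame]
        peak = max(abs(s) for s in block)
        out.extend(block if peak >= threshold else [0.0] * len(block))
    return out


def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Static compressor: levels above the threshold grow `ratio` times slower."""
    threshold = db_to_gain(threshold_db)
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            over_db = 20.0 * math.log10(mag / threshold)
            mag = threshold * db_to_gain(over_db / ratio)
        out.append(math.copysign(mag, s))
    return out


def normalize_peak(samples, target_db=-1.0):
    """Scale the signal so that its loudest sample sits at target_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = db_to_gain(target_db) / peak
    return [s * gain for s in samples]


def process(samples):
    """Gate, then compress, then normalize (the order is an assumption)."""
    return normalize_peak(compress(noise_gate(samples)))


# One frame of low-level noise followed by one frame of loud signal.
demo = [0.001] * 160 + [0.5, -0.5] * 80
result = process(demo)
print(result[0], max(abs(s) for s in result))
```

The gate is what saves bits: stretches that end up as exact digital silence are the cheapest thing a variable-bitrate encoder can be asked to encode.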

Sorry if I inadvertently reposted.

Output is optimized for understandable speech on small speakers, so I can listen to podcasts on a watch hung around my neck.

With more work I'd try to tune the gate/compander curve further, as the current one is a bit aggressive and distorts the audio.
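
For anyone who wants to experiment with the curve: a gate/compander curve is usually written as a list of (input dB, output dB) points with straight lines between them, which is the form both SoX's `compand` effect and FFmpeg's `compand` filter accept. The post does not say which tool the script uses, and the two point lists below are invented for illustration, not taken from the script. This sketch only evaluates such a curve so that two candidates can be compared before trying them on audio.

```python
from bisect import bisect_right


def transfer_db(level_db, points):
    """Map an input level (dB) to an output level (dB).

    `points` is a list of (in_db, out_db) pairs sorted by in_db.
    Values between points are linearly interpolated; values outside
    the covered range are clamped to the first or last output level.
    """
    xs = [p[0] for p in points]
    if level_db <= xs[0]:
        return points[0][1]
    if level_db >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, level_db)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (level_db - x0) / (x1 - x0)


# Hypothetical curves: a hard gate with a heavy boost, and a softer variant.
AGGRESSIVE = [(-90.0, -90.0), (-50.0, -90.0), (-40.0, -15.0), (0.0, -3.0)]
GENTLE = [(-90.0, -90.0), (-60.0, -90.0), (-45.0, -30.0),
          (-20.0, -12.0), (0.0, -3.0)]

for level in (-70.0, -45.0, -30.0, -10.0):
    print(level, transfer_db(level, AGGRESSIVE), transfer_db(level, GENTLE))
```

A steep segment right above the gate threshold is where the audible pumping tends to come from, so spreading that transition over more dB of input is the obvious first thing to try.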

Note that the script as-is *DELETES* the original file, so you'll probably want to comment out that "rm" command if you are just experimenting with it.

As with scripts generally, this is submitted for educational purposes only. Don't run a script without some inspection and comprehension.
:okhand

Atruepatriot · Post by **Atruepatriot** » Sat Feb 17, 2024 7:02 am

Cool thing about voice audio files is that even 8-bit is clear as a bell. A friend of mine owned the ancient "Traveler's" virtual world network on digital space many years ago, and it used 8-bit audio. It was amazingly full and clear as a bell in real time with no latency. Quality of music is another story...

Clayton · Post by **Clayton** » Sat Feb 17, 2024 10:39 am

Might be able to use neural nets to achieve even higher compression without any further sacrifices to quality. First, train a VAE to encode/decode many samples of human speech. You could even cut up, say, 3-second segments of your own podcast as training data. The audio files used during training should be encoded at the bitrate/quality you want as a *final* result. The VAE's internal encoding will be significantly smaller (fewer bits) than the original data. Once trained, feed the audio to be compressed into the encoder-side of the VAE and store the output as your compressed file. When you want to recover the audio, feed the compressed file into the decoder-side of the VAE. Obviously, this will be lossy compression.
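
One practical detail behind the "fewer bits" claim: a VAE's latent vector is made of floating-point numbers, so it only becomes a small file once each value is quantized to a few bits. The sketch below does not train anything. It only shows the bit budget and a plain uniform quantizer, and every number in it (latent size, bits per value, clipping limit, sample rate) is an assumption picked for illustration.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def latent_bitrate_bps(latent_dims, bits_per_dim, segment_seconds):
    """Bitrate if one quantized latent vector is stored per segment."""
    return latent_dims * bits_per_dim / segment_seconds


def quantize(values, bits=8, limit=4.0):
    """Clip values to [-limit, limit] and map them to integer codes."""
    levels = (1 << bits) - 1
    step = 2.0 * limit / levels
    return [round((max(-limit, min(limit, v)) + limit) / step) for v in values]


def dequantize(codes, bits=8, limit=4.0):
    """Map integer codes back to approximate latent values."""
    levels = (1 << bits) - 1
    step = 2.0 * limit / levels
    return [c * step - limit for c in codes]


# Hypothetical setup: 3-second segments, 1024 latent values, 8 bits each.
raw = pcm_bitrate_bps(16000, 16)
stored = latent_bitrate_bps(1024, 8, 3.0)
print(f"raw PCM: {raw} bit/s, stored latent: {stored:.0f} bit/s")

latent = [0.0, 1.5, -3.2, 10.0]      # stand-in for an encoder output
codes = quantize(latent)             # this is what would go in the file
restored = dequantize(codes)         # this is what the decoder would see
print(codes, restored)
```

Whether the decoder still sounds acceptable after this rounding depends entirely on the trained model, so the latent size and bit depth would have to be tuned by ear.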

merkelspam · Post by **merkelspam** » Sun Feb 18, 2024 1:17 am

Clayton wrote: Sat Feb 17, 2024 10:39 am

> Might be able to use neural nets to achieve even higher compression without any further sacrifices to quality. First, train a VAE to encode/decode many samples of human speech. You could even cut up, say, 3-second segments of your own podcast as training data. The audio files used during training should be encoded at the bitrate/quality you want as a *final* result. The VAE's internal encoding will be significantly smaller (fewer bits) than the original data. Once trained, feed the audio to be compressed into the encoder-side of the VAE and store the output as your compressed file. When you want to recover the audio, feed the compressed file into the decoder-side of the VAE. Obviously, this will be lossy compression.

https://github.com/google/lyra

It doesn't build for me. I'd need to get rid of the Starlark (the Bazel build files) and the Java. I want a pure C/C++ implementation for embedded devices.
