Home

Advertisement

Customize
jmspeex
08 December 2007 @ 11:52 pm
Speex 1.2beta3 has been tagged and will be up on the website shortly. There should even be Windows builds this time thanks to Alexander Chemeris. I'm expecting the next release to be named 1.2rc1. There's still a few things to address before 1.2, but I'm hoping the libspeexdsp API will be complete for rc1. Stay tuned.

There's another releasing coming up: a new Code-Excited Lapped Transform (CELT) codec prototype. This codec is (for now at least) meant to compete with neither Vorbis, nor Speex. Instead, the primary idea is to reduce latency to a minimum -- currently around 8 ms (compared to ~25 ms for Speex and ~100 ms for Vorbis). Of course, this comes with a price in terms of efficiency, but I'm already surprised the price isn't bigger. I've been mainly focusing on speech, but unlike Speex, I'm hoping this one will handle music as well. For the curious, I've put a 56 kbps CBR music file (original). This is still very experimental and everything is still likely to change, including the exact goals. I'm still trying to figure out how to put psychoacoustics into that. Stay tuned for the release of version 0.0.1 (or should I use a negative version number to make it clearer it's experimental?).

CELT is based on a paper I submitted to ICASSP and which I'm hoping will be accepted so I can make it available to everyone. The only difference is that the ICASSP paper was based on the FFT (non critically sampled), whereas this version is based on the MDCT. One part that is already published though is Timothy's explanation of the pulse codebook encoding along with some source code. Now, here's a challenge. Who can beat the algorithm on Timothy's page? Simply stated, the idea is to enumerate all combinations of M pulses in a vector of dimension N, knowing that pulses have unit magnitude and a sign, but all pulses at the same position need to have the same sign.

Updated: The full source for CELT is available at: http://downloads.us.xiph.org/releases/celt/celt-0.0.1.tar.gz or through git at http://git.xiph.org/celt.git

Updated again:: Speex 1.2beta3 is out.
 
 
jmspeex
23 November 2007 @ 02:17 pm
I realised recently that Speex tends to have a lot of array copies done using for loops instead of memmove()/memcpy(). However, I wasn't quite happy with just replacing everything with memmove()/memcpy() because of the very poor type safety they provide. For example, it's all too easy to change the type of an array and have memmove() doing the wrong thing without any warning. So, after a bit of thinking, I came up with the following macro, which I think should work:

#define SPEEX_MOVE(dst, src, n) (((dst)-(src)), memmove((dst), (src), (n)*sizeof(*(dst))))

Compared to memmove(), this macro does two things. First, it removes the need for using sizeof() in the length argument, removing a source of error. Second, the discarded (dst)-(src) value before the comma basically ensures that the compiler will generate an error if src and dst point to different types. Expect to see that macro appearing in Speex soon. I'd be curious to hear if anyone knows of any unwanted side effects of this, except of course the usual "don't use arguments with side effects" limitation.

Update: Actually, the following expression also has the advantage not producing a warning with gcc:

#define SPEEX_MOVE(dst, src, n) (memmove((dst), (src), (n)*sizeof(*(dst)) + 0*((dst)-(src)) ))
 
 
jmspeex
20 March 2007 @ 12:36 am
So a while ago, I wasn't careful with type lengths and wrote some code in the speex encoder (speexenc, not libspeex) that wouldn't work very well on 64-bit machines. More precisely, it would make speexenc crash on startup 100% of the time, so you can't really miss it if you have a 64-bit machine. Fortunately, someone noticed and the bug was promptly fixed. This should normally have been the end of the story... except that Ubuntu was going to ship Dapper with an older version (current Debian unstable).

Turns out that the bug was reported against Dapper very early on. A patch was even posted more than a month before the release of Dapper. From there, it took 11 months for the 2-line fix to be applied and released. And if it wasn't for me harassing some of the developers (thanks crimsun, tritium for pushing the fix in), I don't think the fix would never have made it.

Sometimes one wonders why it is that Ubuntu has a bug tracker. Another example is bug #52600. You can't see it because it's marked as a security bug, but considering I filed it more than 8 months ago, I don't think making it private makes sense anymore. That one comes down to the fact that any local user with no privilege can crash a Dapper machine very easily. You just compile the following program:

#include <sched.h>
int main() {
struct sched_param param;
param.sched_priority = sched_get_priority_max(SCHED_FIFO);
sched_setscheduler(0,SCHED_FIFO,¶m);
while(1);
}

and then execute it. What this does is simply ask the maximum real-time priority and then spin doing nothing, starving every single other process on the machine and forcing a reboot. While allowing SCHED_FIFO to some users in some circumstances makes sense, I can't understand why it's enabled for everyone on the system. It's a bit like making the shutdown command setuid root. Yeah for the Ubuntu LTS (Long Time to get Support) process!
 
 
jmspeex
02 December 2006 @ 11:22 pm
After a couple days fighting with this annoying overflow bug, I think I've managed to solve the problem. As you can see, some of the fixes are not very nice. It basically comes down to
  • Adding explicit saturation (SATURATE) before 32-bit to 16-bit conversions.
  • Scaling the signal up/down for some operations to avoid having to add saturation all over the place, especially in critical loops
  • PSHR* is evil. Well, not quite but can you shot the danger in the PSHR32 definition?

    • The moral of the story: saturating isn't great, but it still beats overflowing!
Tags: ,
 
 
jmspeex
16 November 2006 @ 01:14 am
OK, so I thought the fixed-point code in 1.2-beta1 was getting pretty good. But that was until a user ((wouldn't things be simple without them!) was able to make it fail horribly by feeding it totally clipped speech. It turns out that the file manages to trigger at least a half-dozen overflows all around the code, some of them easily fixed, some not.

So here's the deal with fixed-point. Some CPUs/DSPs support saturating versions of add/sub/mul/... and some don't. Most G.72x codecs are usually implemented assuming that they exist, so they don't need to worry about overflows. For Speex, I decided to do it without assuming hardware saturation, so it can run on ARM and other chips (including x86) that don't support saturation. And that's how everything suddenly becomes more complicated. If once in a while 0.5 + 0.6= 1.0, you usually don't care too much. On the other hand, if 0.5 + 0.6 = -0.9, then suddenly you do care.

So the fundamental question here is how much overflows on corrupted input can be tolerated (based on the "garbage in, garbage out" principle) and how much needs to be avoided regardless of the input? Answer when I get to the bottom of this. To be continued...
 
 
 
 

Advertisement

Customize