iki.fi/o

Circular blur

2010-09-06

This article describes an approach for efficient two-dimensional convolution with arbitrary circularly symmetric convolution kernels. A 2-d Gaussian kernel is linearly separable; the convolution can be split into a horizontal convolution followed by a vertical convolution. This makes 2-d Gaussian convolution relatively cheap, computationally, and “Gaussian blur” has become, partially for this reason, a popular operation in image processing. No other circularly symmetric convolution kernel is linearly separable. But this is only true when the kernel is limited to real numbers. This article introduces complex-valued convolution kernels that are both circularly symmetric and linearly separable, and goes on to show how these can be used to produce arbitrary circularly symmetric 2-d convolutions by 1-d convolutions, in contrast to the use of 2-d Fast Fourier Transform (FFT). Special focus is given to implementation of circular convolution, or convolution with a solid disc, which is useful for simulation of bokeh of camera lenses with fully open apertures.

A Gaussian convolution kernel, used in Gaussian blur (black = -maximum value, grey = 0, white = maximum value)

To produce circularly symmetric 2-d convolution, the condition that the 1-d kernel function f must satisfy is f(\sqrt{x^2+y^2}) = f(x)\times f(y). The condition may be broken down into a magnitude condition and a phase condition that must both be satisfied. The magnitude condition reads |f(\sqrt{x^2+y^2})|  = |f(x) \times f(y)| = |f(x)|\times|f(y)| , meaning that the kernel must have a Gaussian function as its magnitude function. Because in multiplication of complex numbers the phases are summed, the phase (argument) condition reads \theta(\sqrt{x^2+y^2}) =  \theta(x)+\theta(y), with the shorthand \theta(x) = \arg(f(x)). The only family of functions to satisfy this is \theta(x) = a x^2. The kernel function can be constructed as the product of a Gaussian function, here called the envelope, and a complex phasor with x^2 as its argument. In one dimension these kernels have the form e^{-a x^2} e^{i b x^2} =  e^{- a x^2 + i b x^2} = e^{(- a + b i) x^2}, where i is the imaginary unit, x is the spatial coordinate, a is the spatial scaling constant of the envelope, and b is the spatial scaling constant of the complex phasor. In two dimensions the form is e^{- a x^2 + i b x^2} e^{- a y^2 + i b y^2} =  e^{- a (x^2 + y^2) + i b (x^2 + y^2)} = e^{(- a + b i) (x^2 + y^2)}, y being the coordinate in the other dimension. Let’s see what these look like.

The real part of a phased Gaussian kernel

The imaginary part of a phased Gaussian kernel

The real part of a phased Gaussian kernel of infinite spatial scale of the magnitude part, showing the phasing only

The imaginary part of the phased Gaussian kernel of infinite spatial scale of the magnitude part, showing the phasing only

Magnitude of a phased Gaussian kernel, identical to the plain Gaussian kernel

These forms are not normalized to preserve average intensity in image filtering, but to give a value of 1 for x = 0, y = 0. Arbitrary circularly symmetric shapes can be constructed by a weighted sum of the real parts and imaginary parts of complex phasors of different frequencies. Fourier series is the standard approach for such designs. These give a repetitive pattern, but the repeats could be diminished by adjusting the spatial scale of the envelope. As necessary, sloping of the envelope can be compensated for in the weights of the sum, but simultaneous optimization of both will give the best results, as it can take advantage of that the envelopes need not be the same for each of the component kernels.

Let’s try a 5-component square wave approximation, designed manually with the Fourier series applet of Paul Falstad.

5-component equiripple square wave Fourier series

Real part of a square wave approximated by 5 phased Gaussian components of infinite spatial scale of the magnitude part.

The formula for this series is 0.54100\ +\ 0.66318\ \cos(x)\ -\ 0.20942\ \cos(3 x)\ +\ 0.10471\ \cos(5 x)\ -\ 0.08726\ \cos(7 x). A bit of scaling (by 0.9) and shifting (by -0.01) was then done to make the ripples go evenly around the zero level and to avoid values greater than 1.

This is a pretty good indicator of the capabilities of this approach. The weighted component kernels were:

Component 0

Component 1

Component 2

Component 3

Component 4

By lowering the spatial scale of the magnitude part of each component kernel, one gets a result like this:

First try on a circular convolution kernel

Not so bad! The halos would need to be eliminated, and the disk is also fading a bit toward the edges.

I searched for better circular kernels by global optimization with an equiripple cost function, with the number of component kernels and the transition bandwidth as the parameters. Transition bandwidth defines how sharp the edge of the disk is. The sharper it is, the more ringing there will be both inside and outside the disk. A transition bandwidth setting of 1 means that the width of the transition is as large as the radius of the disk inside. Below, 1-d component kernel formulas are given for a pass band spanning  x \in -1..1 and a dual stop band spanning  x\ \in\ -1.2\ ..\ -1\ \cup\ 1\ ..\ 1.2.

Number of components: 1, transition bandwidth: 0.200000, ripple: ±0.232417. Component 0: (cos(x*x*1.624835) * 0.767583 + sin(x*x*1.624835) * 1.862321) * exp(-0.862325*x* x)

1-d view

Number of components: 2, transition bandwidth: 0.200000, ripple: ±0.075459. Component 0: (cos(x*x*5.268909) * 0.411259 + sin(x*x*5.268909) * -0.548794) * exp(-0.886528*x *x) Component 1: (cos(x*x*1.558213) * 0.513282 + sin(x*x*1.558213) * 4.561110) * exp(-1.960518*x* x)

1-d view of the components and their sum (magenta)

Number of components: 3, transition bandwidth: 0.200000, ripple: ±0.026297. Component 0: (cos(x*x*5.043495) * 1.621035 + sin(x*x*5.043495) * -2.105439) * exp(-2.176490*x *x) Component 1: (cos(x*x*9.027613) * -0.280860 + sin(x*x*9.027613) * -0.162882) * exp(-1.019306* x*x) Component 2: (cos(x*x*1.597273) * -0.366471 + sin(x*x*1.597273) * 10.300301) * exp(-2.815110* x*x)

1-d view of the components and their sum (cyan)

Number of components: 4, transition bandwidth: 0.200000, ripple: ±0.010843. Component 0: (cos(x*x*1.553635) * -5.767909 + sin(x*x*1.553635) * 46.164397) * exp(-4.338459* x*x) Component 1: (cos(x*x*4.693183) * 9.795391 + sin(x*x*4.693183) * -15.227561) * exp(-3.839993* x*x) Component 2: (cos(x*x*8.178137) * -3.048324 + sin(x*x*8.178137) * 0.302959) * exp(-2.791880*x *x) Component 3: (cos(x*x*12.328289) * 0.010001 + sin(x*x*12.328289) * 0.244650) * exp(-1.342190* x*x)

1-d view of the components and their sum (brown)

Close-up

Number of components: 5, transition bandwidth: 0.200000, ripple: ±0.004062. Component 0: (cos(x*x*1.685979) * -22.356787 + sin(x*x*1.685979) * 85.912460) * exp(-4.892608 *x*x) Component 1: (cos(x*x*4.998496) * 35.918936 + sin(x*x*4.998496) * -28.875618) * exp(-4.711870 *x*x) Component 2: (cos(x*x*8.244168) * -13.212253 + sin(x*x*8.244168) * -1.578428) * exp(-4.052795 *x*x) Component 3: (cos(x*x*11.900859) * 0.507991 + sin(x*x*11.900859) * 1.816328) * exp(-2.929212* x*x) Component 4: (cos(x*x*16.116382) * 0.138051 + sin(x*x*16.116382) * -0.010000) * exp(-1.512961 *x*x)

1-d view of the components and their sum (magenta)

Close-up

Close-up of the pass band

Close-up of the stop band

The last one of the above, with 5 component kernels, is good enough for practical applications as the ripple is as small as 1/250, about as much as the least significant bit for 8-bits-per-channel graphics, so invisible in almost all situations. As the number of components is increased, the components have an increasingly large weighted amplitude, as much as 36 times their sum for the 5-component system. This may become a practical problem if one aims for convolution kernel shapes with sharp transitions. Here is one more, as a bonus:

Number of components: 6, transition bandwidth: 0.200000, ripple: ±0.001935. Component 0: (cos(x*x*1.981960) * -62.773778 + sin(x*x*1.981960) * 99.694943) * exp(-5.029513 *x*x) Component 1: (cos(x*x*6.159438) * 74.703895 + sin(x*x*6.159438) * 41.255198) * exp(-5.134785* x*x) Component 2: (cos(x*x*9.531306) * 0.154676 + sin(x*x*9.531306) * -84.608620) * exp(-6.171939* x*x) Component 3: (cos(x*x*12.618627) * -23.197236 + sin(x*x*12.618627) * 33.922147) * exp(-5.3924 39*x*x) Component 4: (cos(x*x*14.751538) * 12.326634 + sin(x*x*14.751538) * -4.453788) * exp(-5.04584 3*x*x) Component 5: (cos(x*x*18.798966) * -0.216125 + sin(x*x*18.798966) * -0.079862) * exp(-2.24716 8*x*x)

Disabling the ThinkPad X200 master control for wireless radios switch

2010-08-18

I became very annoyed by the reoccurring disturbances to Internet connectivity by the overly sensitive master control for wireless radios switch of my ThinkPad X200 laptop. So I superglued it to ON position.

The master control for wireless radios switch. Glued to ON position.

No help. The glitches were still there, shutting down WLAN randomly and whenever I pressed my palm against the palm rest. So it wasn’t just me accidentally pushing the switch, although I admit doing that too. I endured the nasty Windows message boxes and WLAN dropouts for quite a long time, until I finally decided to get rid of the problem once and for all by electrically hard-wiring the switch to ON position. A very good decision, one which I should’ve made much earlier!

Inside the laptop. That dreaded switch... In ON position, the outermost terminals were, or should have been, connected. In OFF position, the two rightmost (from this viewing angle) terminals were connected.

Removing the switch was easy. A firm grip with pliers, and a rolling motion towards the upper part of this image worked fine. I tried turning on the computer like this, with all terminals disconnected, but the state of the switch was detected as OFF. In my opinion, other than having no software disable for the switch, this was the design error made by Lenovo engineers. The switch should work so that the state is switched only when a connection is made.

A jump wire was soldered between the outer terminals. This did the trick!

And the light stays on! The end of my troubles.

If you want to void the warranty and blah blah, either have a look at the ThinkPad X200 and X200s Hardware Maintenance Manual, or follow this pictorial guide:

Remove the power cord and battery. Remove all screws from the back, except for the four screws that keep the screen hinges in place, and the RAM slot cover screws.

This is how many screws you should have at this point. Let me count... 10 with red tips and 1 smaller one. The screw holding the hard disk slot cover is not here. I broke the thin plastic cover long ago.

Pry open the wristrest with your fingernails

The wrist rest and keyboard should now come off nicely. (If not, you should have removed more screws!) Pull off their connectors.

With the keyboard and wrist rest away, this is how it looks in there

Now let's start removing screws from the inside. We need only to remove the ones that hold down the mainboard. Let's start with these two. Note that the screws down here are different, so sort the screws according to where you took them from so that you know where to put them back later.

This ribbon cable must be detached from the mainboard

Flip open the connector and you can simply lift up the end of the ribbon cable

Remove the two black screws (the other is already removed in this image) holding down the WLAN card. If you have a different model, you may have some other stuff here, so act accordingly.

Remove the WLAN card (or whatever you have there, that might hold down the mainboard)

Remove this screw

Remove these three screws (two black ones and one silver)

Remove the screws holding down the display connector

Lift up the display connector to detach it

Remove the hard disk. Could've done this much earlier...

Pry off the plastic strip that goes around the left, upper and right part of where the keyboard was. Start from either end.

Lift up the metal coverings so that you can pull out the mainboard underneath. You may have to detach some small pieces of adhesive tape first

When pulling out the mainboard, be careful not to injure this ribbon cable

When the mainboard has come loose, flip upside down the computer with the screen slightly open, so that you can allow the mainboard to rest on it. Now the culprit of all the troubles is nicely accessible... Remove the switch and solder a jump wire connection between the outer pads. (See the first three images of this article!)

When you have done the deed that had to be done, it is time to put everything back together again. Start by putting the mainboard in its correct place.

When putting back in the mainboard, again, be careful not to injure this ribbon cable

Push in the wireless LAN card (or whatever you have there)

Attach these three black screws

Attach these two black screws and the silver screw

Put the WLAN antenna cables under this ribbon cable

Lay the ribbon cable down into the connector

Close the connector

Make sure the WLAN antenna cables are properly connected to the WLAN card

Attach these two screws

Push down the display ribbon cable connector into its slot

Attach the two screws that hold the connector in place

Put back the hard disk

Reattach the large plastic strip

This is how the insides look at this point

Attach the keyboard connector

Attach the wristrest connector

Put the keyboard in place and snap the wrist rest in place

Reattach the screws in the back

The smaller screw goes here

Ahhhhh... the bliss.

Sine wave look-up table generation

2010-03-05

Here is a method for generating a sine look-up table in case you have little (a few kilobytes of) program memory. The idea goes like this: Let’s say you have a sine wave lookup table of length 1024 with a 24-bit amplitude range. If you take the difference between successive samples, the range of the numbers is reduced. If you repeat the process a total of 3 times, the data will be dominated by the 3rd differential of the quantization noise and will have a range of -4 to 3, which can be conveniently stored in 3 bits only. This program below does the reverse, it generates the sine table from the 3-bit values, and also takes advantage of the symmetry properties of sine. So, instead of wasting 1024 program memory words for storing the original sine table, in the c-language implementation below, the compressed data takes only 35 words and the decompression code hopefully not too much more.

/* Multiplierless 1024-sample, 24-bit sine table generator
 *
 * by Olli Niemitalo in 2010-03-07.
 * This work is placed in the public domain.
 */

#include <stdio.h>

/* Sine table with one extra sample to enable interpolation
 *
 * Sine with amplitude -0x800000 .. 0x800000 was rounded to
 * nearest integer and truncated to -0x800000 .. 0x7fffff.
 * int data type must be at least 24-bit.
 */
int sineTable[1025];

void generateTable() {
  int i, j, k;
  const int magic[33] = {
    0x691864, 0x622299, 0x2CB61A, 0x461622,
    0x62165A, 0x85965A, 0x0D3459, 0x65B10C,
    0x50B2D2, 0x4622D9, 0x88C45B, 0x461828,
    0x6616CC, 0x6CC2DA, 0x512543, 0x65B69A,
    0x6D98CC, 0x4DB50B, 0x86350C, 0x7136A2,
    0x6A974B, 0x6D531B, 0x70D514, 0x4EA714,
    0x5156A4, 0x393A9D, 0x714A6C, 0x755555,
    0x5246EB, 0x916556, 0x7245CD, 0xB4F3CE,
    0x6DBC7A
  };
  k = 0;
  for (i = 0; i < 33; i++) {
    for (j = 0; j < 24; j += 3) {
      sineTable[k++] = ((magic[i] >> j) & 7) - 4;
    }
  }
  sineTable[1] = 51472;
  for (i = 3; i > 0; i--) {
    for (j = i; j <= 256; j++) {
      k = sineTable[j - 1];
      sineTable[j] = k + sineTable[j];
      sineTable[513-j] = k;
      sineTable[511+j] = -k;
      sineTable[1025-j] = -k;
    }
  }
  sineTable[768] = -0x800000;
}

int main() {
  int i;

  generateTable();

 /* Printout */
  for (i = 0; i < 1025; i++) {
    printf("%d\t%d\n", i, sineTable[i]);
  }

  return 0;
}

Here is a spreadsheet that illustrates the algorithm: sinetable.xls. The above implementation additionally embeds the seeds 0 and -3 in the magic data

Marginal notes

These are miscellaneous notes and recipes.

2010-06-22

Calculation of Boltzmann factors in Google

A google search for (-6 kcal/mol) * -1 / (Avogadro constant)/ (Boltzmann constant) / (273.15 K + 42 K)) calculates the Boltzmann factor for a molecule being in a state with a free energy of -6 kcal / mol, at 42 degrees Celsius, 9.5805616.

2010-03-30

Logarithm of a sum of large numbers

When dealing with large numbers, it can be useful, as an approximation, not to deal with the numbers themselves, but with their logarithms. This has the added benefit that multiplication of two numbers corresponds to taking the sum of their logarithms. But what if you want to take the sum of the numbers, a and b, how can you avoid dealing with the possibly horrendously large numbers themselves? This problem is encountered for example when computing the total unnormalized probability (a Boltzmann factor or partition function) of mutually exclusive events. Here’s a way:

Presume that a>0, b>0, and a \ge b. If a < b you can always swap the two numbers.

\mathrm{ln}(a + b) = \mathrm{ln}\big(a(1+\frac{b}{a})\big) = \mathrm{ln}\,a + \mathrm{ln}(1+\frac{b}{a}) = \mathrm{ln}\, a + \mathrm{ln}\big(1 + \mathrm{e{x}p}(\mathrm{ln}\,b - \mathrm{ln}\,a)\big) .

If \mathrm{ln}\,b - \mathrm{ln}\,a is a very large negative number, then \mathrm{ln}(a + b) \approx \mathrm{ln}\,a.

For the logarithm of the difference of a and b, we would proceed similarly:

Presume that a>0, b>0, and a > b. These conditions are necessarily satisfied if \mathrm{log}\,a, \mathrm{log}\,b, and \mathrm{log}(a - b) can be calculated.

\mathrm{ln}(a - b) = \mathrm{ln}\big(a(1-\frac{b}{a})\big) =  \mathrm{ln}\,a + \mathrm{ln}(1-\frac{b}{a}) = \mathrm{ln}\, a +  \mathrm{ln}\big(1 - \mathrm{e{x}p}(\mathrm{ln}\,b - \mathrm{ln}\,a)\big)  .

If \mathrm{ln}\,b - \mathrm{ln}\,a is a very large negative number, then \mathrm{ln}(a - b) \approx \mathrm{ln}\,a.

\mathrm{ln}\big(1 + \mathrm{e{x}p}(x)\big) and \mathrm{ln}\big(1 - \mathrm{e{x}p}(x)\big)  are called Gaussian logarithms, because Carl Friedrich Gauss was the first to publish printed tables of them.

2010-03-02

Newbie bug with C++ nested templates

Say, you want to create a map from stings to vectors of strings. Why does this (with the appropriate STL includes):

typedef std::vector<std::string> VectorOfStrings;
typedef std::map<std::string, VectorOfStrings> StringToVectorOfStrings;

work, while this doesn’t:

typedef std::map<std::string, std::vector<std::string>> StringToVectorOfStrings;

The latter works if you modify it a little bit, so that the consecutive >’s are not recognized as >>:

typedef std::map<std::string, std::vector<std::string> > StringToVectorOfStrings;

2009-10-15

Two IRC channels in different windows using EPIC IRC client

Start your IRC session like this:

/window new
/window show 1
/bind ^I parse_command {/window size 6} {/window last} {/window move 1}

Now you have two windows on top of each other, and you are doing things in the bottom one. You can switch between the windows by pressing tab.

The normal usage is to join on two different channels:

/join #firstchannel

Press tab.

/join #secondchannel

Whenever you want to view or talk on the other channel, press tab.

2009-09-12

Embedding a Java Applet in a WordPress 2.8.4 post

In the HTML editor:

<div id="dyemixer">APPLET HERE<script type="text/javascript">// <![CDATA[ document.getElementById('dyemixer').innerHTML = '<applet width="550" height="420" code="DyeMixer.class" archive="http://yehar.com/blog/wp-content/uploads/2009/09/DyeMixer.jar" alt="To use this applet, you need a Java virtual machine (Java plug-in) for your web browser.">'; // ]]></script></div>

This also survives editing the post in the visual editor. You can even center the applet like you would a normal paragraph.  Note that the div’s id and the argument to getElementById must be identical. Using a similar JavaScript script, you should be able to put also other arbitrary HTML in a WordPress post or page.

2009-07-30

Generation of random numbers from a truncated exponential distribution

To generate a random number with a truncated exponential distribution:

  1. Generate a random number x from an exponential distribution with the rate parameter you want
  2. Calculate x modulo where-to-truncate. This would not be integer modulo but for example fmod() floating-point remainder.

Could be useful if you have exponentially distributed random numbers handy.

2002-11-25

Anti-imaged wavelet transform

This is an idea of a wavelet-type audio processing scheme, where you decompose the audio into octave bands, each octave sampled at half the sampling frequency compared to the one-up octave band. Then you process the bands in whatever way you wish, and reconstruct the signal from the bands. The benefit of the method compared to usual wavelets is the low level of imaging noise when bands are processed separately. The original music-dsp posting has the details:

http://aulos.calarts.edu/pipermail/music-dsp/2002-November/018600.html

multi

The processing scheme ("/2" means decimation by 2, "*2" means dilution by zeros to 2x samplerate)

fir

Frequency responses of filters that could be used in the scheme

2001-09-02

Frequency shifting (audio effect)

Frequency shifting can be an interesting audio effect, although it does not usually preserve harmonic relations. Here is how to do it the wrong and the right way:

blah

Complex exponential modulation illustrated on z-plane unit circle

Complex exponential modulation illustrated on z-plane unit  circle

The ugly mixture. Complicated aliasing caused by discarding the imaginary part from a complex exponential modulated signal.

baa

Anti-aliased frequency shifting illustrated on z-plane unit circle

2000-08-10

Daubechies wavelets 1-38 in a .h file

I copied some daubechies wavelets from a published database into this C header file, confirmed the data to be error-free and wrote a little introduction into wavelets in the comments.

Polynomial interpolation

deip.pdf

Abstract:

“This paper discusses piece-wise polynomial interpolators used in audio resampling and presents new low-order designs that are optimized for high-quality resampling of oversampled audio. Source code and useful tables for using the interpolators are included.”

Adlib / OPL2 / YM3812

This page is devoted to the classic PC soundcard, Adlib, and compatibles.

Reverse-engineering report

2008-04-20

Me and Matthew Gambrell reverse-engineered the YM3812/YMF262 ROM tables. These are the pictures we took:

ymf262ym3812ym3812_rom3_2ym3812_rom1_explainym3812_rom1_1 v2bymf262_rom1ymf262_idym3812_rom3_1ym3812_rom2_2ym3812_rom2_1ym3812_rom1_2ym3812_id2ym3812_id1

Adlib-DigiSnap

2000-03-06

Adlib-DigiSnap (ads.pdf) is a device that is capable of recording Adlib (OPL2/YM3812) digitally. Want to build one? Now you can! You will also need the PCB film and the MSDOS driver (with source code, also a newer version 2009/01/11 with everything for compiling it is here).

One-sided Chebyshev-type inequalities for bounded probability distributions

2007-12-14
Chebyshev’s inequality states that, for any probability distribution, at most 1/k^2 of the area of the probability density function lies more than k standard deviations away from the mean. We can do better, if we know that the distribution is bounded and we know the bounds.

Let X be a random variable bounded by 0 \le X \le M, where M > 0. Given the first two moments, E(X) and E(X^2), of its probability distribution, a sharp lower bound to P(X <L), where L > 0, is given by:

P(X < L)\ge \left\{ \begin{array}{ll}   0 & \mbox{if $E(X) > L$ and $E(X^2) < L E(X) + M E(X) - L M$,}\\\\   1 - \frac{L E(X) + M E(X) - E(X^2)}{L M} & \mbox{if \big($E(X) > L$ and $E(X^2) \ge L E(X) + M E(X) - L M$\big)}\\<br />
& \mbox{or \big($E(X) \le L$ and $E(X^2) \ge L E(X)$\big),}\\\\<br />
\frac{E(X)^2 - 2 L E(X) + L^2}{E(X^2) - 2 L E(X) + L^2} & \mbox{if $E(X) \le L$ and $E(X^2) < L E(X)$.} \end{array} \right.

That’s it. On a related note…

Let X be a random variable and L a constant both bounded by 0 \le X \le M and 0 \le L \le M, where M > 0.

Let Y be a random variable otherwise equal to X, but collecting the tail of X exceeding L to L:

Y =<br />
\left\{\begin{array}{ll}<br />
X & \mbox{if $X \le L$,} \\<br />
L & \mbox{if $X > L$.}\\<br />
\end{array}\right.

E(Y) and E(Y^2) cannot be determined knowing only E(X), E(X^2), M and L. However, sharp upper bounds to E(Y) and E(Y^2) are given by:

<br />
\begin{array}{c}<br />
E(Y) \le<br />
\left\{ \begin{array}{ll}<br />
L & \mbox{if $E(X) > L$ and $E(X^2) < L E(X) + M E(X) - L M$,} \\\\<br />
\frac{LE(X)+ME(X)-E(X^2)}{M} & \mbox{if \big($E(X) > L$ and $E(X^2) \ge L E(X) + M E(X) - L M$\big) or} \\<br />
& \mbox{\big($E(X) \le L$ and $E(X^2) \ge L E(X)$\big),} \\\\<br />
E(X) & \mbox{if $E(X) \le L$ and $E(X^2) < L E(X)$,}<br />
\end{array}\right.\\\\<br />
E(Y^2) \le<br />
\left\{ \begin{array}{ll}<br />
L^2 & \mbox{if $E(X) > L$ and $E(X^2) < L E(X) + M E(X) - L M$,} \\\\<br />
\frac{L^2E(X)+LME(X)-LE(X^2)}{M} & \mbox{if \big($E(X) > L$ and $E(X^2) \ge L E(X) + M E(X) - L M$\big) or} \\<br />
& \mbox{\big($E(X) \le L$ and $E(X^2) \ge L E(X)$\big),} \\\\<br />
E(X^2) & \mbox{if $E(X) \le L$ and $E(X^2) < L E(X)$.}<br />
\end{array}\right.<br />
\end{array}<br />

Beats me how to prove these formulas, but I tried with tens of thousands of randomly generated distributions and they always worked. The way I got the formulas was by trying to think of the “worst-case” distributions (that’s why these are sharp bounds). There were a few of these, corresponding to the different conditions. So I think I know the worst cases, but I don’t know how to show that these truly are the worst cases.

Henry Bottomley has written more about Chebyshev type inequalities.

Impulse response measurements

2006/08/10

To measure the impulse response of a noisy system, you don’t want to just send an impulse and see what comes out. Instead, you will play a signal that has succifiently high energy (long instead of loud), record the outcome and deconvolve from it your original signal, giving you the impulse response.

Golay codes

I measured these impulse responses from the home audio system of my friend Mikko Nelo at his city flat (check the pic for an overview. There might have been a subwoofer somewhere too, but I don’t remember…). Hehe, at least the bass is there. Miniature microphones were plugged into my ears, a minidisc recorder operated as a pre-amp and a laptop I borrowed from work as the player/recorder. A lengthy pair of Golay codes was used as the measurement signal because that way it was easy to deconvolve with existing tools. The following measurements are in binaural stereo 44100Hz 16bit WAV: ir_left.wav (158kB), ir_right.wav (112kB), ir_mono.wav (113kB). Here’s a piece of music unprocessed/processed to demonstrate the use of the impulse responses: irtest.mp3 (559kB). Headphones required!

Chirp signals

Any signal with a perfectly flat frequency content can be deconvolved by convolving with its reverse. Chirp signals (a.k.a. frequency sweeps) made using IFFT can be perfectly flat in the periodic signal sense. Here is such a 65536-sample chirp: chirp64k.wav (WARNING: SOUNDS NASTY). It can be used to analyze up to one second impulse responses. You must play the chirp signal in a loop a few times, record the result, deconvolve, and pick one of the middle impulse responses. Using this method, I have made inverse filters for headphones, which can be used for example with the RealReverb plugin for WinAmp.

Evolutionary real variable optimization in C++

2006/08/10

If you have a problem where you need to find the optimal values for a set of real variables, try this library written by Magnus Jonsson and me.


opti.cpp opti.hpp – Optimization library

MersenneTwister.h – Mersenne Twister random number generator required by the library (by Rick Wagner)

optitest.cpp keyboard.h – Example program with keyboard IO routines


Look at opti.hpp for the interfaces to the library. Read this text to know what is the idea of the various things and what can be done to improve the optimization.

You must present your optimization problem in form of a cost function that accepts a vector of parameters (real numbers) and gives out a single (real) number. The library will then try to find the combination of values for the parameters that minimizes the value your cost function. Sometimes a sub-optimal solution is found, but, depending on the hardness of the problem, often the global optimum is reached.

The basic idea of how an evolutionary algorithm works is as follows. First a population of parameter vectors is initialized to random values that cover the whole range of possible values for the optimal solution. Then the evolutionary algorithm will combine members of the population to create new trial vectors. The trial vectors may replace inferior members in the population. Ultimately, the population will dive into a minimum, hopefully the global minimum, of the cost function.

Different evolutionary strategies available in the library:

  • Differential evolution (class DE)

This algorithm adds scaled-down differences between population members to other population members to create trial vectors.

  • G3PCX (class G3, PCX is the default recombinator)

This algorithm is based on creating trial vectors randomly in the vicinity of existing population members. The extent of the randomness depends on the scattering of the population.

  • GreedyMagnus (class GreedyMagnus)

The name means it converges fast but may therefore get sucked into a local optimum.

In the library, the recombinator means the part of the evolutionary algorithm that combines population members to create the trial vectors. We made the library in a flexible way so that you can exchange the recombinators between the optimization strategies. A default recombinator is provided for each strategy, but some experimenting with the recombinator parameters may improve the reliability and the speed of the optimization, as different types of problems benefit from different approaches. The parameters in the recombinators are as follows.

  • DERecombinator

This recombinator is to be used with Differential Evolution. The parameter cr defines how many of the parameters from the trial vector will replace corresponding parameters in the inferior vector. Usually a value of 1 is OK, but if your population size is small, there is a danger that the dimensionality of your population might drop too low. A value of 0.999 will add a refreshing dash of randomness to the otherwise linear way of combining of the vectors, but will not disturb the over-all scheme too much. The parameter c is the number by which the difference between two population members is multiplied before it is added to another member to create the trial vector. Values such as 0.3, 0.4, 0.6, 0.7 and 0.8 have been useful.

  • PCXRecombinator

This recombinator is to be used with the G3 strategy. The parameter sd1 and sd2 are standard deviations of the random vectors that are created around existing population members, expressed relative to distances between randomly selected members of the populaton. It is fine to have the same value for both. 0.1 is a nice choice. You should normally stay under 1. A low number will make the population progress in smaller steps, thus better fine-tuning a found minimum, but a higher value will make the population explore more distant areas in the cost function. numparents means how many population members will be selected for the recombination process ecah time. 3 is the default value, which I haven’t found useful to change.

  • MagnusRecombinator

This recombinator has no adjustable parameters. It is to be used with GreedyMagnus.

The population size is something you must always define by yourself. Practical values are in the range of 30-1000, but even higher values might be used in over-night optimizations where you don’t want to take any chances. If the population is large enough to explore the cost function properly, the global minimum will be reached within a low numerical error. You should always use more population members than you have dimensions in your problem.

The optimization is advanced by repeated calling of the evolve() function, which advances the population somewhat (how many evaluations of the cost function it does depends on the implementation of the strategy) and returns the cost of the best parameter vector so far in the population. Also the average cost of the population can be quickly retrieved using the avarageCost() function. The optimization should be stopped when there is no more progress within the numerical accuracy that you require. The best vector in the population is returned by the function best(). Try to refrain from printing progress information on each iteration, or your program might be occupied mostly by the print function…
If many repeated runs of the optimization return the same solution, it is quite likely that you have reached the global minimum, because there are typically a huge number of local minima and if you are stuck at one, you probably won’t be stuck at the same the next time. Just look at the solved digits of the cost function value. These will work as a kind of fingerprint of the minimum. If you have problems getting into the global minimum or you are unsure that you are there, you may want to start with a similar problem but of a lower dimensionality. Then increase the dimensionality step by step. If you see a nice trend in the costs of the found solutions for each dimensionality, then this is an extra encouragement that you are at the global minimum.

To speed up things, reduce the running time of your cost function. If your cost function is a sum of a number of terms, then you can employ the compare variable passed to the cost function by the optimizer. If your cost seems to go above this value, then you can stop evaluation of the terms and just return compare or some larger number. This will instruct the optimizer that the evaluated trial vector is worse than another other vector it is comparing it against, and therefore can be discarded. (Note: GreedyMagnus will not speed up using this trick.) If you use this trick, then it makes sense to first evaluate terms you suspect will give the biggest penalty.

It is also possible to constrain the variables being optimized. You can do this by modifying the input vector at the beginning of the cost function. But carefully consider what might be the consequences of this modification. If you hard-limit some parameters, will this reduce the dimensionality of your population as you set a parameter to the same value in many of the vectors? It can be better to just penalize strongly solutions that go outside your bounds of choice, by adding a large penalty term. Perhaps even create a sort of a funnel in the cost function that will direct astray vectors back on track. This can be done by penalizing according to the severity of the constraint violation.

This is getting a bit problem-specific, but if you are doing something like least mean square approximation, you typically are integrating the error function through sampling. Do this so that you concentrate more samples at the known problem areas of the function. This way you can reduce the total number of samples significantly and will speed up the execution of the cost function.

Hilbert transform

This is probably the most efficient structure for implementing a Hilbert transform. Actually, it’s not a Hilbert transform, but two all-pass IIR filters whose phase difference is approximately 90 degrees over a range of frequencies symmetric around Nyquist/2. Laurent de Soras uses these kind of filters in his HIIR resampling library. This note is available in Maple format: hilbert.mws

2003/07

90 degree phase difference IIR allpass pair

The basic building block is:

> H_sect := (z, a) -> (a^2 – z^(-2)) / (1 – a^2 * z^(-2));

[Maple Math]

Equation 1. Allpass section taking one multiplication when a^2 is a known coefficient.

Each such allpass section has poles located on the real axis at a and - a , and zeros at 1/ a and -1/ a .

We construct two allpass filters from such sections:

> H_1 := z -> H_sect(z, 0.6923878)*H_sect(z, 0.9360654322959)*H_sect(z, 0.9882295226860)*H_sect(z, 0.9987488452737)*z^(-1);

[Maple Math]

Equation 2. First allpass filter, delayed by one sample.

> H_2 := z -> H_sect(z, 0.4021921162426)*H_sect(z, 0.8561710882420)*H_sect(z, 0.9722909545651)*H_sect(z, 0.9952884791278);

[Maple Math]

Equation 3. Second allpass filter, has 90 (+- 0.7) degrees relative phase to first filter over a long range of frequencies.

> plot({argument(H_1(exp(I*freq))), argument(H_2(exp(I*freq)))}, freq=0..Pi, phase = -Pi..Pi);

Figure 1. Phases of H_1 (red) and H_2 (green).

> plot(argument(H_2(exp(I*freq))/H_1(exp(I*freq))), freq=0..Pi, phase = -Pi..Pi);

Figure 2. Phase difference of H_2 and H_1.

> plot({argument(H_2(exp(I*freq))/H_1(exp(I*freq))), Pi/2}, freq=0..Pi, phase = Pi/2*0.95..Pi/2*1.05);

Figure 3. Phase difference of H_2 and H_1 (green), detail near 90 degrees (red).

> plot(argument(H_2(exp(I*freq))/H_1(exp(I*freq))), freq=-0.005..0.005, phase = -Pi..Pi);

Figure 4. Phase difference of H_2 and H_1, detail near 0 Hz.

You can use the 90 degree phase difference of the filters as such, or you can construct a complex filter and analyze its properties:

> H := z -> 0.5*(H_2(z)+I*H_1(z));

[Maple Math]

Equation 4. Combined complex filter that will remove negative frequencies.

> plot(20*log10(abs(H(exp(I*freq)))), freq=-Pi..Pi, dB=-60..1, axes=boxed);

Figure 5. Frequency response of filter in Equation 4.

> plot(20*log10(abs(H(exp(I*freq), 0.4))), freq=-0.01..0.01, dB=-60..1);

Figure 6. Frequency response of filter in Equation 4, transition band detail

> plot(20*log10(abs(H(exp(I*freq), 0.4))), freq=0..0.01, dB=-0.0005..0.0005);

Figure 7. Frequency response of filter in Equation 4, transition band – passband detail.

> plot(20*log10(abs(H(exp(I*freq)))), freq=0..Pi, magnitude=-0.0005..0.0005);

Figure 8. Frequency response of filter in Equation 4, passband detail

Transition bandwidth is 0.002 times the width of passband, stopband is attenuated down to -44 dB and passband ripple is 0.0002 dB.

Plenty cheap for a total of 8 multiplications (plus final scaling by 0.5)!

The coefficients were found using a generic evolutionary algorithm. I believe that it would be possible to design coefficients for this filter structure using the software by Artur Krukowski, which finds coefficients for halfband filters: http://www.cmsa.wmin.ac.uk/~artur/Poly.html

2004/01/13

Ben Saucer has succesfully implemented this filter pair to be used in his audio effect. Here are some useful (edited) excerpts from our e-mail correspondence.

Here’s a quick diagram of the allpass pair:

        ................ filter 1 .................
   +--> allpass --> allpass --> allpass --> allpass --> delay --> out1
   |
  in
   |    ................ filter 2 .................
   +--> allpass --> allpass --> allpass --> allpass ------------> out2 (+90 deg)

We can use cookbook formulas to convert an allpass section into code. A general IIR recurrence relation:

  out(t) = a0*in(t) + a1*in(t-1) + a2*in(t-2) + ...
         + b1*out(t-1) + b2*out(t-2) + ...

results in the transfer function:

         a0 + a1*z^-1 + a2*z^-2 + ...
  H(z) = ----------------------------
          1 - b1*z^-1 - b2*z^-2 - ...

The allpass section in question has the following transfer function:

          a^2 - z^-2
  H(z) = ------------
         1 - a^2 z^-2

We want to convert this into the recurrence relation. According to the cookbook formulas and the above transfer function:

  a0 = a^2, a2 = -1, b2 = a^2, rest of coefficients zero

  =>  out(t) = a^2*in(t) - in(t-2) + a^2*out(t-2)

which simplifies to the one-multiplication allpass section:

  out(t) = a^2*(in(t) + out(t-2)) - in(t-2)

Older Posts »

Powered by WordPress - Hosted by SuniSoft oy