Venturini's "MAM" technique modifies the basic Widrow-Hoff update by making the first few updates more rapid.

Suppose you want to estimate the value of a quantity x by looking at successive samples of it. Let your current estimate always be called x, and let a new sample be called x'. Then the regular Widrow-Hoff technique modifies x according to

x <- x + b(x'-x),

where b is called the "learning rate", 0 < b <= 1, and "<-" is the assignment symbol.

You can see that x is adjusted so as to bring it closer to x'. Usually b is quite small, like 0.2, so that the adjustment is conservative, i.e., a single sample does not affect x too much. Thus x changes slowly as samples are taken.
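As a minimal sketch of the update above (the function name and sample values here are mine, chosen to match the example used later), in Python:

```python
def widrow_hoff(x, sample, b=0.2):
    """One Widrow-Hoff update: move the estimate x a fraction b toward the sample."""
    return x + b * (sample - x)

# Starting from a poor initial guess, the estimate converges only slowly.
x = 0.0
for sample in [7, 8, 6, 7, 8]:
    x = widrow_hoff(x, sample)
# x is about 4.88 here -- still far from the true value near 7.2
```

This illustrates the slow start: with b = 0.2 and a bad initial guess of 0, five samples are not nearly enough to reach the neighborhood of the true value.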

Now to Venturini. Suppose you are just beginning to get samples of x, and suppose that, initially, you have very little idea what x should be. You have to initialize x at some value, maybe a guess, and that value could be very different from the true value! The Widrow-Hoff technique will cause x to converge to the true value, but this may take quite some time. Venturini's idea was: why not let the first values of x' influence x more strongly than they would under Widrow-Hoff?

To do this, the MAM technique uses ordinary averaging for the first N samples of x', where N is the largest integer less than or equal to 1/b. So if b = 0.2, N = 5. (Averaging the first k samples is equivalent to applying the update rule with learning rate 1/k in place of b.) Now suppose the first 5 values of x' are 7, 8, 6, 7, 8. Then the first 5 values of x are

7, (7+8)/2, (7+8+6)/3, (7+8+6+7)/4, (7+8+6+7+8)/5.

After the fifth sample, x = 36/5 = 7.2. It looks like the "true" value of x is near 7.2. But notice that x was close to 7.2 after the very first adjustment, independent of its initial value, which might have been, say, zero. So MAM is a good way to "start" the Widrow-Hoff process when you don't have a very good idea what the true value is. After N samples have been received, MAM switches to using regular Widrow-Hoff.
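The whole scheme can be sketched as follows (a minimal Python sketch; the function name and the way the sample count is passed in are my own choices, not part of the original description). It uses the fact that averaging the first k samples is the same as updating with learning rate 1/k:

```python
def mam_update(x, sample, count, b=0.2):
    """MAM: average over the first N = floor(1/b) samples, then switch
    to the regular Widrow-Hoff update with learning rate b.
    count is the number of samples seen so far, including this one."""
    n = int(1 / b)  # N = largest integer <= 1/b
    rate = 1.0 / count if count <= n else b
    return x + rate * (sample - x)

x = 0.0  # a poor initial guess
for count, sample in enumerate([7, 8, 6, 7, 8], start=1):
    x = mam_update(x, sample, count)
# after five samples x is the plain average 36/5 = 7.2,
# no matter what the initial guess was
```

Note that the very first update uses rate 1, so the initial value of x is overwritten entirely by the first sample; that is why MAM is insensitive to the starting guess.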

Could you explain the "MAM" technique, giving an example?