Let's Design and Build a (mostly) Digital Theremin!

Posted: 4/19/2017 6:15:21 PM 1201

From: Porto, Portugal

Joined: 3/16/2017

"Signal processing. From a dead pure sine to a living sound.

When I started my project, I thought of using algorithms with unchanged (or relatively slowly updated) parameters (formant filter frequency for example). Simulation of natural sound requires a lot of interrelations, and the high-quality sound requires updating the parameters at the sampling frequency." - ILYA

For a 180 MHz ARM with hardware floating point and a 44 kHz sample rate, about 1000-2000 floating point additions/multiplications may be done per sample. ARM has some nice instructions for loading/saving several registers in one command, etc. Overclocking to 240 MHz is possible.
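
As a rough check on that budget, the sketch below divides the clock rate by the sample rate. The exact rate of 44.1 kHz is an assumption (the post only says "44KHz"), and the function name is made up for illustration.

```python
# Rough per-sample cycle budget for a CPU running an audio loop.
# Assumption: "44KHz" in the post means the common 44.1 kHz rate.
SAMPLE_RATE_HZ = 44_100


def cycles_per_sample(clock_hz: float, sample_rate_hz: float) -> float:
    """Clock cycles available between two consecutive output samples."""
    return clock_hz / sample_rate_hz


base = cycles_per_sample(180e6, SAMPLE_RATE_HZ)
overclocked = cycles_per_sample(240e6, SAMPLE_RATE_HZ)

print(f"{base:.0f} cycles/sample at 180 MHz")
print(f"{overclocked:.0f} cycles/sample at 240 MHz")
```

With roughly 4000 cycles per sample at 180 MHz, the quoted 1000-2000 floating point operations corresponds to about 2-4 cycles per operation once loads, stores and loop overhead are counted.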

If processing is done by frames (N samples, e.g. ~1 ms), many parameters may simply be linearly interpolated between the start and end of the frame.

If filter parameters change slowly, then for a short interval (a frame) they can probably be calculated for the beginning and end of the period and interpolated in between. At least I hope so. Some tricks, alternative optimized algorithms, or pre-calculated tables may be used to fit into limited hardware.
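
A minimal sketch of the frame idea, assuming a simple one-pole low-pass (y += a * (x - y)) stands in for whatever filter is actually used. The coefficient is computed only at the frame boundaries and ramped linearly per sample. All names and values here are illustrative.

```python
def lerp_frame(start: float, end: float, n: int) -> list[float]:
    """Per-sample values ramping from start toward end over n samples.

    The end value itself is excluded so that the next frame can begin
    exactly at 'end' without repeating a value.
    """
    step = (end - start) / n
    return [start + step * i for i in range(n)]


def one_pole_lowpass_frame(x, state, a_start, a_end):
    """Filter one frame, interpolating the coefficient 'a' per sample."""
    out = []
    for sample, a in zip(x, lerp_frame(a_start, a_end, len(x))):
        state += a * (sample - state)  # one-pole low-pass update
        out.append(state)
    return out, state


# One 1 ms frame at 48 kHz: a unit step while the filter opens up.
frame = [1.0] * 48
y, final_state = one_pole_lowpass_frame(frame, 0.0, 0.01, 0.05)
```

The expensive part (turning a cutoff frequency into a coefficient) then runs once per frame, while the per-sample cost is one extra addition.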

Posted: 4/19/2017 7:06:12 PM 1202

dewster

From: Northern NJ, USA

Joined: 2/17/2012

If I could do a float in 2-4 cycles I'd be all over them. But they're not efficient in Hive so I'll be avoiding them as much as possible. Fixed point is a tough slog though, I can see why most DSP engineers throw up their hands and do everything in floats.

I'm going to do filter updates at the sampling frequency (48kHz). Anything lower and you might hear the updates. Filters don't necessarily have to be analytic (e.g. 0x1000 corresponding to a Q of 1 or some such with perfect linearity) so the filters can be sloppier and the parameters more efficiently computed.

Haven't totally gotten into vocal synthesis yet, but I'm hoping a realistic vocal tract can be simulated with maybe 10 filters, with continual adjustment of Q and cutoff. I'm sure that's doable in software on top of the Theremin hardware babysitting. Reverb is out, but I figure anyone can buy a better vocal / guitar pedal than I can do in SW.

Posted: 4/19/2017 7:28:16 PM 1203

ILYA

From: Theremin Motherland

Joined: 11/13/2005

Buggins

the frames mean a latency. To hell with it and Theremini!

Posted: 4/19/2017 7:44:10 PM 1204

dewster

From: Northern NJ, USA

Joined: 2/17/2012

"the frames mean a latency." - ILYA

To put it in perspective, keyboard players seem to tolerate >10ms of outright latency (transport delay) in MIDI instruments, which is >480 sample periods @ 48kHz.

I don't want to do the frame thing because I think it's generally a bad thing to do in a DSP synth, and it makes things more complicated. If I need more throughput (doubtful) I'll design some hardware to handle it and strap it onto the Hive register set.

Posted: 4/20/2017 3:48:51 AM 1205

Buggins

From: Porto, Portugal

Joined: 3/16/2017

"Buggins the frames mean a latency. To hell with it and Theremini!"

Audio latency itself is not very audible if the frame is short enough. 1-3 ms is probably acceptable - this delay is less than the time sound takes to travel from the speaker to your ear. If the synthesizer tracked the controls at a high enough rate (several times per frame) and then interpolated them across the frame using several points (instead of two points - start and end), you would not notice this audio frame latency.
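
To put the numbers in this discussion side by side, here is a small sketch that converts a latency into the equivalent speaker-to-ear distance and into sample periods. The speed of sound (343 m/s, dry air at about 20 C) is an assumed value, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 C


def latency_as_distance_m(latency_s: float) -> float:
    """Distance sound travels in air during the given latency."""
    return latency_s * SPEED_OF_SOUND_M_PER_S


def latency_in_samples(latency_s: float, sample_rate_hz: float) -> int:
    """Latency expressed in whole sample periods."""
    return round(latency_s * sample_rate_hz)


for ms in (1, 3, 6, 10):
    t = ms / 1000
    print(ms, "ms:",
          round(latency_as_distance_m(t), 2), "m,",
          latency_in_samples(t, 48_000), "samples at 48 kHz")
```

A 3 ms frame is therefore comparable to standing about a metre from the speaker.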

When a thereminist connects his theremin to a PC for effect processing, he adds latency (3-10 ms). But he can still perform well.

In my SoundTab soft theremin app, I find that 3 and even 6 ms frame lengths sound acceptable.

This app uses a Wacom tablet as a low latency input (horizontal pen movement - pitch, pressure - volume, vertical - some effect controller). Simply interpolating pitch and volume linearly over the frame gives a good enough response to the pen. Much better than the Theremini.

As an upset owner of a Theremini, I feel that its overall latency looks more like a 100 ms frame. I believe engineers from Moog cannot be using such long frames, and that the frame is shorter than 10 ms. (Has anyone tried a MiniMoog? Does it have audible latency?)

I think the Theremini's latency problem is in the theremin sensor part. The sensor output probably contains too much noise, and they use averaging to reduce hand position measurement noise. This averaging is what kills vibrato on the Theremini.

I've tried the Etherwave and some other theremins at Peter Theremin's express courses. The Theremini is far from analog theremins. That's why I've started my own digital theremin project. I think anyone can build a better digital theremin than the Theremini. Of course, Peter Theremin was able to play even on crap like the Theremini (the vibrato was awful, though).

Interesting: a Theremini in the same room as an Etherwave (even 5 meters from it) does not respond to hand movements at all. Instead, it plays whatever is currently being played on the Etherwave, with the notes shifted somewhat up or down. :) LOL

Posted: 4/20/2017 2:34:43 PM 1206

dewster

From: Northern NJ, USA

Joined: 2/17/2012

"As an upset owner of Theremini, I feel that it's overall latency looks more like 100ms frame. I believe engineers from Moog cannot use such long frames, and it's shorter than 10ms. (Did someone try MiniMoog? Does it have audible latency?)"

"I think Theremini's problem with latency is in theremin sensor part. Probably sensor output contains too much noise, and they use averaging to reduce hand position measurement noise. This averaging is what is killing vibrato in Theremini." - Vadim

My educated guess is that some of the Theremini latency is caused by the sensors and the synth being separated by the MIDI interface, as it is in most decent (non-toy) digital keyboards. It takes a lot to smooth out MIDI to the point where you don't hear steps and such. And the oscillators are very low voltage at the antenna so, as you say, they are probably low pass filtering / averaging like hell to lower environmental noise (I doubt they are using a CIC hum filter here, so multi-pole LPF << 50Hz).

"I think anyone can build better digital theremin than Theremini."

I haven't researched this, but I wonder if Moog Inc. isn't up against FCC RF emissions regulations? If so, then perhaps they are constrained to have low voltage at the antennas, and there isn't a lot you can do if that's the case. Certainly the Theremini calibration could be vastly improved to make the response and linearity more consistent. And the tuner display seems like it was designed by a committee of non-musicians.

"Interesting: Theremini in the same room as Etherwave (even 5 meters from it) does not response to hand movements at all. Instead, it plays what is currently played on Etherwave, with note some shift up or down. :) LOL"

Surprising! But not too surprising as the EW kicks out a pretty healthy signal and the Theremini is a wuss in that department. I found that I could place an FM modulated RF source near the Theremini and seriously mess with it. This is how I measured the gestural bandwidth to be somewhere between 1.6Hz and 2.4Hz in this TW post. Sad!

The way the Theremini turned out is really a shame. A good Theremin engineer could take 99% of what's in there and turn it into a real instrument. Moog Inc. probably should have concentrated on making a more stable, more playable, less expensive alternative to the EW with fewer bells and whistles. A few waveforms and a filter or two. They could have used a much less expensive computing platform to do that. They should have insulated the antennas too. It's pretty obvious that Bob Moog is gone and the pencil pushers are now in charge.

Posted: 4/21/2017 5:49:25 AM 1207

Buggins

From: Porto, Portugal

Joined: 3/16/2017

"Surprising! But not too surprising as the EW kicks out a pretty healthy signal and the Theremini is a wuss. I found that I could place an FM modulated RF source near the Theremini and seriously mess with it. This is how I measured the gestural bandwidth to be somewhere between 1.6Hz and 2.4Hz in this TW post. Sad!"

Wow. We are talking about 1-3 ms latency/response time as being bad. But 200-400 ms latency like the Theremini's is beyond evil.

"I haven't researched this, but I wonder if Moog Inc. isn't up against FCC RF emissions regulations? If so, then perhaps they are constrained to have low voltage at the antennas, and there isn't a lot you can do if that's the case."

Hmm. When I'm feeding an LC tank with 2 MHz at 5 V via a 300 ohm resistor, isn't a 50 cm antenna radiating like a good transmitter?

Is there any method to check if some device exceeds FCC rules?

Posted: 4/21/2017 8:38:30 PM 1208

dewster

From: Northern NJ, USA

Joined: 2/17/2012

"Wow. We are talking about 1-3ms latency/response time as bad one. But having 200-400ms latency like in Theremini is beyond evil." - Vadim

Well, there are two kinds of latency. One is "transport delay" or pure delay, like an echo. The other is "inertial delay", which is like phase delay through an RC network. Transport delay is much worse than inertial delay in a feedback (human playing a Theremin) scenario.
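
One way to see the difference is to compare phase lag. The sketch below models transport delay as a pure delay and inertial delay as a one-pole RC low-pass. The 100 ms figures and the 6 Hz vibrato rate are illustrative assumptions, not measurements from this thread.

```python
import math


def transport_phase_deg(freq_hz: float, delay_s: float) -> float:
    """Phase lag of a pure delay: grows linearly with frequency, unbounded."""
    return -360.0 * freq_hz * delay_s


def inertial_phase_deg(freq_hz: float, tau_s: float) -> float:
    """Phase lag of a one-pole RC low-pass: never exceeds 90 degrees."""
    return -math.degrees(math.atan(2 * math.pi * freq_hz * tau_s))


def inertial_gain(freq_hz: float, tau_s: float) -> float:
    """Amplitude response of the same one-pole RC low-pass."""
    return 1.0 / math.sqrt(1.0 + (2 * math.pi * freq_hz * tau_s) ** 2)


VIBRATO_HZ = 6.0  # assumed typical vibrato rate
print("pure delay, 100 ms:", transport_phase_deg(VIBRATO_HZ, 0.100), "deg")
print("RC, tau = 100 ms:  ", inertial_phase_deg(VIBRATO_HZ, 0.100), "deg,",
      "gain", inertial_gain(VIBRATO_HZ, 0.100))
```

The pure delay passes the vibrato at full depth but with a phase lag that keeps growing with frequency, which is what destabilizes a player-in-the-loop system. The RC-style lag is bounded at 90 degrees but attenuates fast gestures, which matches the complaint about averaging killing vibrato.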

"Hmm. When I'm feeding LC tank with 2MHz 5V via 300 ohm resistor, isn't 50cm antenna radiating as a good transmitter?"

I should have an equation for this, but I believe a Theremin antenna acts like a monopole above a ground, so the closer the antenna length is to a multiple of the wavelength at the operating frequency the more efficiently it radiates RF. At 2 MHz the wavelength is 3x10^8 m/s / 2x10^6 Hz = 150 m. If it's a direct ratio (I don't really know) then 0.5 m / 150 m = 1 / 300, so very little RF is being radiated. The radiated power puts an upper limit on tank Q, because it acts like a resistor (power sink). You can pretty much tell that not much power is radiated with a Theremin oscillator on the bench, because with a good coil you can get 100's of volts swing at the antenna with one mA @ 3.3V supply.
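
For a rough number, the sketch below uses the textbook approximation for an electrically short monopole over an ideal ground plane, R_r = 40 * pi^2 * (h / lambda)^2. This assumes a triangular current distribution and a perfect ground, neither of which a real Theremin on a stand will match, so treat the result as an order-of-magnitude estimate only.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength at the given frequency."""
    return SPEED_OF_LIGHT_M_PER_S / freq_hz


def short_monopole_radiation_resistance_ohm(height_m: float,
                                            freq_hz: float) -> float:
    """Approximate radiation resistance, valid only for height << wavelength.

    Assumes an ideal ground plane and triangular current distribution.
    """
    ratio = height_m / wavelength_m(freq_hz)
    return 40.0 * math.pi ** 2 * ratio ** 2


lam = wavelength_m(2e6)
r_rad = short_monopole_radiation_resistance_ohm(0.5, 2e6)
print(f"wavelength: {lam:.1f} m")
print(f"radiation resistance: {r_rad * 1000:.2f} milliohm")
```

Under these assumptions the radiation resistance scales with the square of the length ratio rather than directly with it, and comes out at a few milliohms for a 0.5 m rod at 2 MHz. That is consistent with the bench observation that very little power is radiated.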

RF transmission loss is an argument against making the Theremin sensors look like antennas. Plates - or ideally spheres I suppose - should radiate less because the largest dimension is smaller for a given surface area (which sets intrinsic C and sensitivity).

I've wasted a fair amount of time on-line looking for relevant antenna info, but they usually assume the antenna length is tuned, so everything is given in terms of useless impedances.

"Is there any method to check if some device exceeds FCC rules?"

I should have a ready answer for this too but I haven't even looked into it. I know that FCC radiated emissions were a huge deal when I was working on telephony equipment, they'd spend weeks sniffing the equipment rack with all kinds of antennas, installing shorting tooth conductors between the doors, ferrite beads on the cables, etc. Everyone got nervous at the end of a design cycle, and they'd often have to play games with the emissions results to get a pass ("assuming the cabinet doors stay closed", etc.).

Vadim, have you seen this?: http://humancond.org/wiki/user/ram/electro/capsense/0main. One way to meet emissions requirements in general is to spread the energy in the peaks out. The techniques in that article are quite interesting, but the main problem with it (other than the super low bandwidth) is the sensing instability. He's driving the shield, which exposes the shield / plate capacitance, which drifts with temperature. I think one could perhaps do like he does and design a low Q tank, drive it with a band-limited pseudo-random sequence (NRZ), and synchronously AM detect it, but without using an explicit shield. Might get around the drive / sense / emissions issues with high Q LC. Something to ponder, anyway.

Posted: 4/24/2017 11:10:36 PM 1209

dewster

From: Northern NJ, USA

Joined: 2/17/2012

Stop Me Before I Kill Tinker With Opcodes Again!

Fairly productive week or so. Removed the generally unused conditional goto and conditional interrupt return opcodes, and moved the new 16 bit jump opcodes into their place. Restored the read low and read byte opcodes (both unsigned) as I found myself needing them.

Also tinkered with the CLI code to handle carriage returns in the token parser (was using the space character to indicate the end of a command, return forces the situation). The basic CLI feels pretty robust and complete now.

Been coding rather "blind" in skanky old notepad, which doesn't do code formatting - but worse it only has one step of undo! So I researched code editors a bit and ended up downloading and installing notepad++. It seems pretty nice! I made my own HAL assembly language format definition, which took maybe an hour (took me 1/2 of that to get hex to format!) and it works great! Feast your eyes:

Ooo la la! It feels pretty, oh so pretty...

Theremin wise, I'm currently dealing with the LCD code, but should be testing the linearity of the pitch field very soon. A lot of what's been holding me up is not being 100% confident about scalability, as well as technical debt buildup. The re-write of the CLI and coding in a good editor has gone a long way to dispel that.

Posted: 4/26/2017 6:13:12 PM 1210

dewster

From: Northern NJ, USA

Joined: 2/17/2012

So while I'm writing the LCD code I'm thinking of incorporating the processing of the rotary encoders and push buttons (on the encoders) on the same thread. The overhead is low so why not? Maybe as an interrupt routine on thread 7, which currently handles the CLI.

My approach to what shows up in the register set for both has changed today. Previously for the rotary encoders I had two clear-on-read register bits per encoder, one for CW (clockwise) and another for CCW (counter clockwise) detent-to-detent action. A better use of those two bits is for them to give 01 (+1) on CW, and 11 (-1) on CCW, which requires only a simple sign extension to modify a running sum.
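
The two-bit encoding can be modelled as a 2-bit two's complement value. This is a Python sketch of the idea only, not Hive assembly, and the function names are invented for illustration.

```python
def sign_extend_2bit(bits: int) -> int:
    """Interpret the low two bits as 2-bit two's complement.

    0b00 -> 0 (no movement), 0b01 -> +1 (CW), 0b11 -> -1 (CCW).
    """
    bits &= 0b11
    return bits - 4 if bits & 0b10 else bits


def apply_encoder_reads(total: int, reads) -> int:
    """Accumulate a running sum from successive register reads."""
    for field in reads:
        total += sign_extend_2bit(field)
    return total


# Two CW detents, one CCW, one idle read, then two more CCW.
position = apply_encoder_reads(10, [0b01, 0b01, 0b11, 0b00, 0b11, 0b11])
```

Compared with separate CW and CCW flags, no branching on direction is needed: the field is sign extended and added.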

For the push buttons I also had two clear-on-read register bits, one for current state (1=pressed) and the other to indicate a change since the last read. But since the buttons will really only bounce when going from open to closed, we just need to debounce the closed to open interval. Which means we only need one bit per push button, with a 1 indicating that the button is either currently pressed or was pressed since the last read, with the "currently pressed" setting taking precedence over the clear-on-read clearing (necessary for correct tracking).
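
A behavioural sketch of the single-bit scheme, as I read the description above: the bit is set by any press, cleared by a read, and a button that is still held wins over the clear. The class and method names are hypothetical.

```python
class ButtonBit:
    """Hypothetical model of one clear-on-read push-button register bit."""

    def __init__(self) -> None:
        self.pressed = False  # raw switch state at the last hardware sample
        self.sticky = False   # set by any press, cleared by a read

    def sample(self, pressed: bool) -> None:
        """Hardware side: called at the sampling rate with the raw state."""
        self.pressed = pressed
        if pressed:
            self.sticky = True

    def read(self) -> int:
        """Software side: return the bit, then clear it unless still pressed."""
        value = int(self.sticky or self.pressed)
        self.sticky = self.pressed  # "currently pressed" overrides the clear
        return value


button = ButtonBit()
button.sample(True)   # short tap between two reads...
button.sample(False)  # ...already released when software looks
tap_seen = button.read()
after_tap = button.read()
```

A tap shorter than the read interval is still reported once, and contact bounce on closing only sets an already-set bit.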

I have a tendency to overthink this kind of stuff - both of these changes are so basic I couldn't see them before. I'm also combining the push button and rotary encoder Hive registers as this makes sense.

Thinking ahead to the assembly code that handles the encoders, I want to incorporate some kind of velocity sensing on the rotation speed to increase the inc/dec value for faster turns. I think maybe I can use the push button debounce interval for this. So I hooked an encoder up to my scope with a pull-up resistor and looked at what the software might encounter. A really fast spin with no knob gives detent-to-detent times of ~1 ms (1 kHz!). A more reasonable fast spin is maybe ~4 ms, and a fairly slow spin is around 50 ms. If we sample the hardware at 48 kHz we'll definitely catch all events. If we then sub-sample this at 2^-12 * 48 kHz we get a debounce rate of around 10 Hz (11.7 Hz to be exact), or an interval of roughly 100 ms (which should be adequate for LCD update too). The reasonably fast spin count in a 100 ms interval would be 25, the fairly slow spin 2. So I'm thinking of taking the debounce interval spin count, squaring it, dividing it by 4 (>>2), and adding this to the count (taking direction sign into account of course). This would give 25 + 25^2 / 4 = 181 per interval for fast, 2 + 2^2 / 4 = 3 for the slow, quadratic growth in between, and +/- 1 below. This will be the first thing I try anyway.
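
The proposed acceleration curve is easy to try out in a few lines. This is a sketch of the arithmetic only (count plus count squared shifted right by 2, with the sign restored afterwards); the function name is made up. Note that with integer arithmetic 2 + 2^2 / 4 evaluates to 3.

```python
SAMPLE_RATE_HZ = 48_000
DEBOUNCE_RATE_HZ = SAMPLE_RATE_HZ / 2 ** 12  # sub-sampled rate, about 11.7 Hz


def accelerated_step(count: int) -> int:
    """Map a signed per-interval detent count to an accelerated inc/dec value.

    Computes |count| + (|count|^2 >> 2) and restores the direction sign.
    """
    magnitude = abs(count)
    boosted = magnitude + ((magnitude * magnitude) >> 2)
    return boosted if count >= 0 else -boosted


for count in (1, 2, 3, 5, 10, 25, -25):
    print(count, "->", accelerated_step(count))
```

Single detents still move by exactly one, so fine adjustment is unaffected, while a fast spin covers a large range in one interval.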

Until you really get into the software, it's hard to know what the hardware should look like.