# Let's Design and Build a (mostly) Digital Theremin!

Posted: 1/3/2018 3:07:30 PM

From: Northern NJ, USA

Joined: 2/17/2012

"What are the effects of the difference between original and "modified" versions?" - tinkeringdude

There's a paper on it, and here's a poster (link) but I don't think they clearly convey the simple thing that's going on.  For any filter, analog or digital, reduction in gain controls the cutoff frequency (because it controls the rate of integration).  Each gain reduction is paired with an integrator, and the pair count tells you the order of the filter.  When the order is > 1 there will be additional gain reductions associated with damping.

When you have limited resolution, gain reductions can easily be problematic.  Even floats can't get you around this entirely because they also have limited resolution, so when you add something significantly smaller than the thing you are adding to, nothing changes.  (There may be situations where floats actually make things worse, because the exponent consumes some of the resolution space.)
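To make the float point concrete, here is a minimal Python sketch (not from the thread; `to_float32` is a helper invented for illustration) in which an addition is swallowed because the addend falls below the resolution of the value it is added to:

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 64-bit floats carry a 53-bit significand, so near 1e16 the spacing
# between representable values is 2 and adding 1 changes nothing.
big = 1.0e16
double_unchanged = (big + 1.0) == big

# 32-bit floats carry a 24-bit significand, so the same thing already
# happens at 2**24 = 16777216.
acc = to_float32(16777216.0)
single_unchanged = to_float32(acc + 1.0) == acc

print(double_unchanged, single_unchanged)
```

An integrator that accumulates small increments into a large running value hits the same wall, only less visibly.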

So where you place the gain reduction in relation to the integrator is critical to the basic resolution of the thing being integrated. If you place the gain reduction before the integrator you've reduced its resolution by truncating its least significant bits.  On the upside, the integration won't grow much larger than the thing feeding it, and the value of the integrator is often a direct filter output (LP, BP) which can be handy.  If you place the gain reduction after the integrator then the integration will grow to be much larger than the thing feeding it, but if you have headroom in the integrator then this is the best possible scenario because there is no truncation going on until after the integration, where there is excess resolution to throw away.
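A rough integer sketch of the two placements, assuming a single-pole low-pass where the gain reduction is a right shift by `K` bits (the function names, `K`, and `X` are made up for this example, and this is not the actual filter code discussed in the thread):

```python
def lp_gain_before(x: int, k: int, steps: int) -> int:
    """Gain reduction (shift) BEFORE the integrator: state is the output."""
    y = 0
    for _ in range(steps):
        # The low k bits of the error are discarded before integration.
        y += (x - y) >> k
    return y


def lp_gain_after(x: int, k: int, steps: int) -> int:
    """Gain reduction (shift) AFTER the integrator: needs a wide accumulator."""
    acc = 0
    for _ in range(steps):
        # The full-resolution error is integrated; acc grows to about x << k.
        acc += x - (acc >> k)
    return acc >> k  # truncation happens only here, on excess resolution


K = 8      # shift amount, i.e. a gain of 1/256
X = 1000   # constant (step) input

stalled = lp_gain_before(X, K, 10000)
settled = lp_gain_after(X, K, 10000)
print(stalled, settled)
```

With a step input, the gain-before version stops moving once the remaining error is smaller than `1 << K`, so it never reaches the input level, while the gain-after version settles exactly on it at the cost of `K` extra bits of headroom in the accumulator.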

Resolution reduction is bad because it directly degrades signal SNR, and makes the parameters less continuous.  And the least significant bits you chop off can sometimes remain in the filter and recirculate, causing low-level limit cycles (you want everything to decay to true zero).

I think Chamberlin picked his SV topology (gain reduction preceding integration) because it was a direct replacement of the analog SV integrators with digital integrators, and he likely was working with an 8 or 16 bit processor so wide integrators were perhaps problematic from a performance standpoint.  But he didn't go this route for his single pole HP/LP, so I don't totally get his thinking here.

"Btw, I have checked, the one I'm using isn't from Chamberlin, it's an approximation to it, by J.Dattorro. It can be tuned only to < 1/4 fs, which I forgot about because the project I took it out of uses oversampling all the way and it was never a problem, but I got just reminded of that fact ;)"

I also limit my filters and oscillators to 1/4 Fs.  For one thing, who can really hear 12kHz all that well?  For another, going above this causes all kinds of aliasing problems with oscillators, and stability problems with filters.  You never want to have positive feedback for any integrator in a filter, but you'll get it with high damping and high corner frequency, and guarding against instability with up-front parameter math can be a complicated thing (most stability criteria as implemented stay quite far away from actual instability as they are set for a single worst-case).

"Your wave sounds pretty clean, despite a lack of deliberate regards to band-limiting (?). It reminds me of that venerable speech synth singing "daisy, daisy..."."

No band-limiting at the source, and it's generated at Fs (48kHz).  I find that I'm always setting the glottal too "buzzy" during adjustment - I suppose it adds to the clarity, and it helps my older ears hear the formant details in the mix.  But once my ears have had a rest it sounds too harsh.  BTW, the wave looks very much like your LP sawtooth!  Mutating a sine can be incredibly effective, and, if done as gently as possible to get only the harmonic levels you need, can yield lower aliasing.

"As for vocal vs. friction sounds. I wonder why classical speech synth setups of this category tend to use an own resonator path for the noise shaping. After all, there is only one vocal tract, not two that are mixed together. Just that the different sound sources are located in different places within it."

I've noticed this too, and I think it's because the turbulence is often caused by the tongue (which isn't influenced by the nose or throat much), or by the lips (ditto), or by the nose opening (external to the body).  A sudden intake of breath is mostly a mouth resonance.  Deep breath sounds can use the vocal tract.  In Perry Cook's Thesis, noise can be injected at several points in the tract, and this seems like it might be the best approach?  I don't know why one would leave the tract itself entirely out of the equation, unless it is to remove those resonances that don't occur in real life.  But why not just modulate those resonances?

Posted: 1/3/2018 3:49:17 PM

From: Northern NJ, USA

Joined: 2/17/2012

"Heard your sample below, to me digital sounds flat like it lacks a soul or is unnatural." - Christopher

Probably a listener bias thing because you know it's digital.  If you believed it was analog you'd probably have a very different reaction.  One can get essentially the same results either way.

That reminds me, I need to go out in the forest and pick some fresh organic analog transistors.  I've grown tired of all this sterile man-made technology and need to get back to nature.

[EDIT] The glottal spreadsheet with the Hi-Lo wave shaping method is located here.

[EDIT] A female sample here.

Posted: 1/3/2018 4:59:22 PM

From: Russia

Joined: 9/8/2016

A sample of interesting, soft, volumetric sound.

Only I do not know if this is a termenvox.

https://youtu.be/6NOEbDjg3Wc

Posted: 1/3/2018 7:21:09 PM

From: 60 Miles North of San Diego, CA

Joined: 10/1/2014

Posted 5/13/2012, the start of this thread

I am not trying to persuade you to change direction (Don’t! I have been switching direction continuously in my theremin developments, and this is probably the main reason I'm broke!) – But, if while playing with your FPGA you see an easy way to provide a linearized CV, let me know!  I believe the future is a hybrid : Front-end which deals with the linearity and stability issues and allows control of range etc, and this to be followed by conventional (or unconventional) heterodyning analogue / mixed signal theremin which is ‘liberated’ from the constraints imposed by the antenna and its circuitry.

Fred

============================================

Six years later 1/02/18 a dewster sample. At the end of this sound bite you do sound exhausted.

Seven years ago my sample . Eventually you shared a breakthrough which gave me an answer I long looked for.

If I seem frustrated at times it is because I wanted more from you on my project than you had time to give. I do respect your knowledge. I was learning electronics as I went along. Now my theremin time is over, had family dismantle my hobby room, can't find anything, maybe that is good.

Christopher

---------------------------------------------------

Edit: One Master Thereminist stated the theremin is a "one trick pony", but if mastered? I add the theremin is not a gimmick but harnesses a real natural phenomenon. Why demean the name of Lev Sergeyevich when your design has little connection if any to his work. This is the most bothersome part, I really don't care what you build if you do not misrepresent it in the theremin community.

After many years of electronic gibberish your thread still goes nowhere, delete all the hundreds of posts that do not demonstrate anything of value. Look out for the newbies that stumble upon your 136 TW webpages in the search engines, then they leave misled or discouraged, even if they wanted to build digital. You have a responsibility with your knowledge to properly teach, not to become the latest Glasgow approach where the designer knew it was a good idea that could never work properly.

Posted: 1/3/2018 9:55:34 PM

From: Northern NJ, USA

Joined: 2/17/2012

I'm not burned out at all Christopher, quite the contrary.

Anyway, I wasn't at the point of thinking about the sound-making side of things until now.  But I can recommend that if you're looking to do a female voice in analog you should set up (buy or build) a bank of three or four tunable bandpass filters and try running your Theremin output through that.  You don't need a lot of tuning range nor Q, and the filters all go in parallel with a volume control for each.  A parametric equalizer will work, a graphic equalizer probably won't.  Each filter would take a quad op-amp max if you want to roll your own (and a stereo pot).

There are two ways to wrangle harmonics: additive and subtractive.  If you're doing additive via a fixed waveform, then the harmonics will track the pitch, and this is a somewhat unnatural thing to run across outside of synthesis (think Hammond organ).  So you can't get a fixed timbre with a fixed waveform, you either have to mutate the waveform with pitch (difficult to do) or turn to subtractive synthesis (harmonic rich source with filters to reduce / remove the harmonics you don't want) to get a fixed timbre.  The human voice is largely fixed timbre, and a good waveform to drive vocal synthesis looks a lot like a sawtooth (or some other signal rich in all harmonics, perhaps low-pass filtered).

The beauty of digital here is I can have 100 filters (I'm currently using 7 running on one thread), all controlled by 3 knobs, and each filter is just a subroutine call and some processor real-time, and not days and days of wiring and massive front panel real estate devoted to pots.  The hissing of analog actives and passives doesn't even enter into the picture (like it tends to do on analog mixing boards and such).
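As a sketch of what "each filter is just a subroutine call" can look like, here is a small parallel band-pass bank in Python. It assumes a textbook Chamberlin-style state-variable filter; the class and function names and the formant values are placeholders for illustration, not the filters or settings used on the prototype:

```python
import math

FS = 48000.0  # sample rate in Hz, as mentioned earlier in the thread


class BandPass:
    """Chamberlin-style state-variable filter; only the band-pass output is used."""

    def __init__(self, fc, q, fs=FS):
        self.f = 2.0 * math.sin(math.pi * fc / fs)  # tuning coefficient
        self.d = 1.0 / q                            # damping coefficient
        self.low = 0.0
        self.band = 0.0

    def step(self, x):
        self.low += self.f * self.band
        high = x - self.low - self.d * self.band
        self.band += self.f * high
        return self.band


def formant_bank(samples, formants, fs=FS):
    """Run samples through parallel band-pass filters and sum the weighted outputs.

    formants is a list of (center_hz, q, gain) tuples, one per filter.
    """
    bank = [(BandPass(fc, q, fs), gain) for fc, q, gain in formants]
    return [sum(gain * bp.step(x) for bp, gain in bank) for x in samples]


def sine(freq, n, fs=FS):
    """Test tone generator."""
    return [math.sin(2.0 * math.pi * freq * i / fs) for i in range(n)]


def rms(xs):
    return math.sqrt(sum(v * v for v in xs) / len(xs))


# Hypothetical formant settings, chosen only to exercise the code.
FORMANTS = [(700.0, 5.0, 1.0), (1200.0, 5.0, 0.5), (2600.0, 5.0, 0.25)]

voiced = formant_bank(sine(220.0, 4800), FORMANTS)
```

Adding another formant is one more tuple in the list, which is the digital counterpart of wiring up another op-amp filter and pot.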

The power of the dark side commands you to try this! :-)

Posted: 1/4/2018 6:52:51 AM

From: Germany

Joined: 8/30/2014

Ah, thanks for the detailed filter explanations, dewster! It all makes sense to me.
The 1/4 fs max filter cutoff would be a hindrance if I ran things at a sample rate like 24k for a system that's to output speech and some alarm sounds only.

Noise: I guess doing e.g. a voiced "S" may require at least not feeding the noise into all formant filters (equally strongly); with my current simple setup I have not gotten a good "zzzz" sound. I don't know whether the hiss is more or less "white" and one of the formants shapes the spectrum, or some of the coloration of the hiss is inherent in its production - then I guess one would need extra filters to get that.

As for the "lifeless" sound. The analog theremin sample posted had the typical theremin sound with the tone pot set to a vaguely voice-like setting, whereas the digital samples posted here were attempts to produce a straight voice sound, not an "analog theremin that wants to sing" sound. Why would one be disappointed not to hear one's favorite kind of sound then?
I don't know what dewster has planned, but I'm pretty sure "analog theremin model X" is a sound that could be imitated, among many others. From my theremin sounds exposure, there isn't that much of "life" going on that needs imitating, all of the animation comes from the hands?

Posted: 1/4/2018 4:54:04 PM

From: Northern NJ, USA

Joined: 2/17/2012

"Edit: One Master Thereminist stated the theremin is a "one trick pony", but if mastered? I add the theremin is not a gimmick but harnesses a real natural phenomenon. Why demean the name of Lev Sergeyevich when your design has little connection if any to his work. This is the most bothersome part, I really don't care what you build if you do not misrepresent it in the theremin community.

After many years of electronic gibberish your thread still goes nowhere, delete all the hundreds of posts that do not demonstrate anything of value. Look out for the newbies that stumble upon your 136 TW webpages in the search engines, then they leave misled or discouraged, even if they wanted to build digital. You have a responsibility with your knowledge to properly teach, not to become the latest Glasgow approach where the designer knew it was a good idea that could never work properly."  - Christopher

Christopher, please stop trolling me and spamming this thread.  You're being a huge negative vibe merchant.

If you feel that strongly about technology that you can't be arsed to understand, please start your own thread where you can rant and rave and post pictures that reveal unsuspecting viewers' IP addresses.

Posted: 1/4/2018 6:38:05 PM

From: Northern NJ, USA

Joined: 2/17/2012

"Noise: I guess doing e.g. a voiced "S" may require at least not feeding the noise into all formant filters (equally strongly); with my current simple setup I have not gotten a good "zzzz" sound. I don't know whether the hiss is more or less "white" and one of the formants shapes the spectrum, or some of the coloration of the hiss is inherent in its production - then I guess one would need extra filters to get that." - tinkeringdude

The vocal tract in Perry Cook's Thesis has noise injection at many points, with amplitude controls (I assume they are individually and likely dynamically adjustable) on each one.

The "zzzz" sound is interesting to think about.  It's clearly noise and glottal oscillation - maybe have the instantaneous amplitude of the glottal wave modulate the amplitude envelope of the noise?  This is what I want to try for the onset of vocalization, the transition between breath and pitch.
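One way the idea could be tried out, as an untested-by-ear Python sketch: a plain sine stands in for the glottal wave, and its positive half gates the noise amplitude. The function name, the half-wave rectified envelope, and the default levels are all assumptions made for this example:

```python
import math
import random


def voiced_fricative(n, pitch_hz, fs=48000.0, noise_level=0.3, seed=0):
    """Return (glottal, noise) sample lists; the noise is gated by the glottal wave."""
    rng = random.Random(seed)
    glottal, noise = [], []
    phase = 0.0
    for _ in range(n):
        g = math.sin(2.0 * math.pi * phase)   # stand-in for a real glottal waveform
        envelope = max(0.0, g)                # noise only during the positive half cycle
        glottal.append(g)
        noise.append(noise_level * envelope * rng.uniform(-1.0, 1.0))
        phase = (phase + pitch_hz / fs) % 1.0
    return glottal, noise


glottal, noise = voiced_fricative(4800, 200.0)
mix = [g + nz for g, nz in zip(glottal, noise)]  # both parts would then feed the formants
```

Keeping the two components separate leaves room to route the noise into only some of the formant filters, as discussed above.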

"As for the "lifeless" sound. The analog theremin sample posted had the typical theremin sound with the tone pot set to a vaguely voice-like setting, whereas the digital samples posted here were attempts to produce a straight voice sound, not an "analog theremin that wants to sing" sound. Why would one be disappointed not to hear one's favorite kind of sound then?

I don't know what dewster has planned, but I'm pretty sure "analog theremin model X" is a sound that could be imitated, among many others. From my theremin sounds exposure, there isn't that much of "life" going on that needs imitating, all of the animation comes from the hands?"

There are several threads here at TW regarding this (link, link).  Most analog Theremin waveforms look like sine waves mutated in various ways, which isn't too surprising given the results of heterodyning, and the following, often somewhat non-linear, volume control processing.

With the criticism of "sterile, digital sound" no one talks about the elephant in the room, which is generally the highest regarded modern professional Theremin in the world, the Moog Etherwave Pro.  It uses digital XOR gates (I assume for heterodyning) and binary ripple counters (I assume for bank switching, though they might also be somehow tapped for harmonics) in its voice generator.

The sonic goal for many Theremin designers is likely more historical (and perhaps accidental) than anything else.  They seem to opt for wave shapes that give certain (as you say) vaguely recognizable timbres (female voice, violin, etc.) over certain limited pitch ranges.  I admit that it has a certain charm, but I don't think I could listen to it long-term without one or more filters in the chain somewhere.  Clara's super buzzy tone starts driving me up the wall after a song or two.

Posted: 1/5/2018 10:12:29 PM

From: Northern NJ, USA

Joined: 2/17/2012

Trying to get the onset of oscillation more realistic: link

This is a subtraction by a constant of the exponential volume antenna value, which gives a non-linear roll-off for smaller values.  I think I need to do this on the linear side of things and use squaring or something to give more control as it needs to snap on a bit more and not interfere with the general gain when it's on.  I'm just doing this for the oscillator amplitude, and not the noise amplitude, which follows the volume antenna directly.

[EDIT] Having an absolutely linear pitch field over the entire range is amazingly freeing from several angles.  In the ogg file above I was hitting the high tenor note with my hand right at the antenna, and I kind of screwed up the vibrato because I was out of hand-wiggle headroom and didn't want to hit the antenna.  I can use the same hand wiggle everywhere from right at my body to right at the antenna and get the exact same vibrato depth.

And having variable sensitivity without impacting the pitch field linearity (literally just twisting one knob) means one can entertain all kinds of fingering and hand techniques one can't do on a conventional Theremin.  I've got the pitch side sensitivity set to 4 half steps (delta) per open/closed hand, so it takes 3 of these gestures to cover an octave.  Thinking of going to a perfect 5th (+7 half steps delta) but afraid that might be too sensitive.  Could go down to the 4th I suppose (-5 half steps delta).

I realize now that I should probably be concentrating on features that make my playing sound better, so that the presentation comes off less annoyingly to those who can actually play.  So after I get the vocal transition to an OK point I'll probably look into real pitch correction.  It might be as simple as using feedback to reduce the error to the nearest note center, with long enough time constants to allow for vibrato and glissando.  I've noticed that the pitch quantizer makes my pitch sound noticeably better when not using these techniques, but it's too crude of an effect (not crude in the way I implemented it but crude in a general sense) for general use.  Anyone who owns a Theremini probably knows what I'm talking about.

Posted: 1/6/2018 9:10:55 PM

From: Northern NJ, USA

Joined: 2/17/2012

True Pitch Correction Implemented!

I couldn't sleep this morning so started thinking about how to implement true pitch correction on the prototype.  I figured it would involve low pass filtering of the linear pitch number subnote information normalized to the range -1/2 to +1/2 (full scale) and applied as a correction to the full pitch number.  I thought of ways I might alter the strength of the correction.  Then, while I was looking to reuse some of the pitch quantizer code, I realized it already generates this correction signal, and that all I needed to do was low pass filter it.  So I designed a simple first order low pass filter that works rail-to-rail, stuck it in there, and after a bit of debugging and fiddling with parameter scaling, it works!  It works amazingly well!  I've got a second order low pass filter in there now, which doesn't seem to add all that much to what it was doing with the first order, but I'm still feeling my way around playing with pitch correction engaged.

So with the addition of one LPF and one knobbed parameter, I've got full-blown chromatic pitch correction that, when the cutoff is set to the high end, gives simple pitch quantization, and when set to the low end is quite subtle in the way it slowly corrects the pitch.  It seems to help me play better, and it doesn't stomp all over the vibrato like quantization does.  Definitely a keeper, all Theremins should have some form of pitch correction on tap, if only as noob training wheels (and to give some relief to those who are forced to listen to them practice).
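The scheme described above can be sketched in a few lines of floating-point Python. This is only an illustration of the idea (quantizer correction signal fed through a first order low-pass), with `pitch_correct` and `alpha` as made-up names; the prototype's fixed-point, rail-to-rail filter is not reproduced here:

```python
import math


def pitch_correct(pitches, alpha):
    """Chromatic pitch correction on a linear pitch number measured in half steps.

    alpha is the low-pass coefficient, 0 < alpha <= 1. With alpha = 1 the
    filter passes the correction straight through, giving plain quantization.
    """
    lp = 0.0
    out = []
    for p in pitches:
        nearest = math.floor(p + 0.5)     # nearest note center
        correction = nearest - p          # what a quantizer would add, within +/- 1/2
        lp += alpha * (correction - lp)   # first order low-pass of the correction
        out.append(p + lp)
    return out


# A held note 0.3 half steps sharp is slowly pulled onto the note center.
steady = pitch_correct([60.3] * 2000, alpha=0.01)
print(round(steady[-1], 3))
```

Because vibrato swings above and below the note center faster than the low-pass can follow, a small `alpha` leaves it nearly untouched while still removing a steady offset.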

That came together much easier and faster than I expected, pretty much falling off a log once I saw the clear path.  I've got a fairly solid code base to pick and choose from now, which helps a lot too.