Let's Design and Build a (mostly) Digital Theremin!

Posted: 11/25/2015 7:23:07 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

CRC-32

Been looking into error detection for software loads.  The usual approach is to do a CRC over the data, particularly as a check to see if the software load in Flash is valid before using it.  

Not surprisingly, engineers, programmers, and mathematicians had their mutually inconsistent ways with things, so it can take a day or more to come to grips with what's going on.  It's really more obfuscated than complicated, and the fact that hex dumps and C strings have a rather jumbled display order just adds to the fun.  Surprisingly, the standard CRC-32 polynomial is sub-optimal: it doesn't even flag all odd numbers of bit errors.  Koopman performs exhaustive computational searches for better polynomials and apparently hasn't finished for CRC-32.  I find it incredible that the world relies on CRC for so much, yet Koopman is doing this in his off time.  If anything should get funding, it's this kind of stuff.

The clearest explanation I could find was in Hacker's Delight, 2nd edition, though the simplest implementation is not shown in code form there.  Warren's hardware view sidesteps all the endian nonsense and language ambiguity, though his diagram is that of a left-shifting, non-flipped CRC & residue type.  For me (granted, a HW engineer) it helped to initially approach CRC implementation as an LFSR-based serial data scrambler, rather than a byte and table (or no table) arrangement.  The concept of parallel input such as bytes can then be pulled in later, but all the byte and/or word flipping can be confusing without an understanding of the underlying serial process, which has nothing to do with bytes, just bits and 32-bit values.  The byte & table approach is just a bunch of precomputed XORing, and a hardware implementation of the table could be easily replaced with a sea of XOR gates, which conveniently factors down to something fairly manageable.

Excel spreadsheet: http://www.mediafire.com/download/yxfyu871wf4yb08/CRC32_2015-11-20.xls

I wrote a Hive subroutine today that does one round on 32-bit input data.  It takes 5 cycles per pass through the loop, with one pass per bit.  I'll probably use this with the SPI Flash device that will be holding the software load and presets:

    ADDR     OC  SA  SB     IM       OP  Pseudo code               Comments
   0x100  0xc11  s1  s1      .      LIT  s1 := -306674912          -- SUB : CRC32 - poly : 0xedb88320
   0x101 0x8320   .   . -31968        L
   0x102 0xedb8   .   .  -4680        L
   0x103 0xb1f2  s2   .     31      BYT  s2 := 31                  -- loop idx : 31
   0x104 0xaffa  P2   .     -1    ADD_8  P2 += -1                  -- loop start, dec idx
   0x105 0x1c00  s0  s0      .    SK2_O  (s0==odd) ? pc+=2
   0x106 0x97f8  P0   .     -1   SHP_6U  P0 <<= -1 (u)
   0x107  0x402   .   .      2    JMP_8  pc += 2
   0x108 0x97f8  P0   .     -1   SHP_6U  P0 <<= -1 (u)
   0x109 0x3718  P0  s1      .      XOR  P0 ^= s1
   0x10a 0xff92  s2   .     -7 JMP_8NLZ  (s2!<0) ? pc += -7
   0x10b  0x106   .   .      6      POP  P2 P1
   0x10c 0x2df0  s0  P7      .      GTO  pc := P7
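
For anyone who doesn't read Hive, the same bit-serial process can be sketched in Python.  This is only a sketch under assumptions: it takes the listing to be the usual right-shifting (reflected) LFSR with polynomial 0xedb88320, assumes the 32-bit input has already been XORed into the running CRC, and assumes words are fed little-endian.  The function names, the 0xffffffff preset, and the final inversion are the common CRC-32 conventions rather than anything taken from the listing.

```python
POLY = 0xEDB88320  # reflected CRC-32 polynomial, same constant as the LIT above


def lfsr_round32(state):
    """Clock a 32-bit value through 32 steps of the right-shifting LFSR."""
    for _ in range(32):
        odd = state & 1
        state >>= 1            # unsigned shift right by one bit
        if odd:
            state ^= POLY      # feedback only when the bit shifted out was 1
    return state


def crc32_words(data):
    """CRC-32 of data whose length is a multiple of 4 (assumed little-endian words)."""
    crc = 0xFFFFFFFF                       # conventional preset (assumption)
    for i in range(0, len(data), 4):
        word = int.from_bytes(data[i:i + 4], "little")
        crc = lfsr_round32(crc ^ word)     # XOR the word in, then 32 shifts
    return crc ^ 0xFFFFFFFF                # conventional final inversion


def crc32_bytes(data):
    """Reference byte-at-a-time version of the same serial process."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            odd = crc & 1
            crc >>= 1
            if odd:
                crc ^= POLY
    return crc ^ 0xFFFFFFFF
```

With these conventions crc32_bytes(b"123456789") gives 0xcbf43926, the commonly published CRC-32 check value, and the word-wise and byte-wise versions agree on any input that is a whole number of words.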

Posted: 11/30/2015 10:16:40 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Rotary Encoders

These little guys kick out a two bit Gray code when you rotate the shaft.  If you use pull-up resistors (the norm) then you get an inverted Gray code:

clockwise: 11 (detent), 10, 00, 01, 11

counterclockwise: 11 (detent), 01, 00, 10, 11

I was going to use a hardware state machine, but am thinking that a software state machine would use less in the way of FPGA resources and be more malleable if bounce or race conditions were to arise.  With some thought, the Gray code can be trivially converted to binary by inverting both bits, taking the MSb as it is (i.e. inverted), and XORing the two inverted bits to form the LSb.  The next valid state is then formed by incrementing (CW) or decrementing (CCW) the current state.  A third bit can be used to keep track of CW or CCW rotation; if it is tacked onto the state as the MSb then it can be considered the sign bit, and everything pretty much just tracks.  We can then tack a zero onto the binary converted input and compare it to the next CW state, and simultaneously tack a one onto the binary converted input and compare it to the next CCW state.  If there is a match then the corresponding state is assigned (+1 or -1); otherwise there is no change.  A transition from state 3 to state 0 is a CW pulse, and a transition from state -3 to state 0 is a CCW pulse.  If the detent state is detected then the state is assigned 0, thus disallowing state -4.  The machine tolerates / rejects noise on single inputs, and requires full sequences to generate pulses.
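
A minimal Python sketch of that state machine, as I read the description above.  The names (gray_to_binary, EncoderDecoder, decode) are made up for illustration, the two encoder bits are assumed to arrive as separate 0/1 values in the order they are written in the sequences above, and Python's signed integers stand in for the 3-bit signed state.

```python
def gray_to_binary(msb, lsb):
    """Inverted 2-bit Gray code -> position 0..3 (0 is the detent)."""
    n_msb, n_lsb = msb ^ 1, lsb ^ 1           # undo the pull-up inversion
    return (n_msb << 1) | (n_msb ^ n_lsb)     # MSb as is, LSb = XOR of the two


class EncoderDecoder:
    def __init__(self):
        self.state = 0                        # signed: 1..3 CW, -1..-3 CCW

    def step(self, msb, lsb):
        """Feed one resynchronized sample; return +1 (CW), -1 (CCW) or 0."""
        pos = gray_to_binary(msb, lsb)
        if pos == 0:                          # detent: emit a pulse if a full
            pulse = 0                         # sequence was seen, then reset
            if self.state == 3:
                pulse = 1
            elif self.state == -3:
                pulse = -1
            self.state = 0
            return pulse
        if pos == self.state + 1:             # input with 0 tacked on: 0..3
            self.state += 1
        elif pos - 4 == self.state - 1:       # input with 1 tacked on: -4..-1
            self.state -= 1
        return 0                              # anything else: no change


def decode(samples):
    """Sum the pulses produced by a list of (msb, lsb) samples."""
    dec = EncoderDecoder()
    return sum(dec.step(m, l) for m, l in samples)
```

Feeding it the clockwise sequence 11, 10, 00, 01, 11 yields one +1 pulse, the counterclockwise sequence yields one -1 pulse, and a bounce such as 11, 10, 11 yields nothing.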

Implementation would be to resynchronize the encoder outputs to the internal clock via 2 cascaded flops per output, and present this to the processor in the internal register space.  The pushbutton on the shaft would be similarly pulled up via a resistor, resynced, and sent to the register.  So four encoders with three outputs each (CW, CCW, button) would require 4 x 3 = 12 register bits.  Sampling these at the audio update rate (50 kHz) should be entirely adequate if not overkill.  Separate up / down counters with hysteresis should work for debouncing the buttons.  If real-time becomes an issue then maybe only examine one encoder per update, cycling through them.
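
One possible reading of "up / down counters with hysteresis" for the buttons, sketched in Python.  The class name, the threshold value, and the choice to flip the output only at the counter rails are all assumptions made for illustration; the input is taken to be already inverted so that True means pressed.

```python
class Debouncer:
    def __init__(self, top=8):
        self.top = top          # samples needed to cross from one rail to the other
        self.count = 0          # saturating up/down counter
        self.pressed = False    # debounced output

    def step(self, raw_pressed):
        """Feed one resynchronized sample; return the debounced state."""
        if raw_pressed:
            self.count = min(self.count + 1, self.top)   # count up, saturate
        else:
            self.count = max(self.count - 1, 0)          # count down, saturate
        # Hysteresis: the output changes only when the counter hits a rail.
        if self.count == self.top:
            self.pressed = True
        elif self.count == 0:
            self.pressed = False
        return self.pressed
```

One instance per button would be stepped once per update; a short glitch moves the counter but never reaches the opposite rail, so the output holds.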

I used to own the Roland JV-1010 and it was pretty clear that they hooked the encoder directly to processor interrupts with little or no debounce.  You could freak the box out simply by spinning the knob a bit too fast, and it sometimes crashed even when you turned it slowly.  Behavior one could sort of tolerate in the studio, but not live.  Bad engineering, whatcha gonna do?

Posted: 12/1/2015 3:31:10 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

dewster,

do I understand correctly that the Hive was designed just for service control, not for DSP?
For example, I could not find useful instructions such as saturating arithmetic.

Posted: 12/1/2015 5:02:38 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

"dewster, do I understand correctly that the Hive was designed just for service control, not for DSP? For example, I could not find useful instructions such as saturating arithmetic."  -- ILYA

It's true that there aren't any DSP-centric opcodes in Hive, but I do intend to use it for DSP (generation, filtering, etc.).  It will just take more code and real-time to do it.  Super-specific / intense stuff could be off-loaded to FPGA logic.  It's a somewhat limited platform, but they all are in one way or another.

Incidentally, I've updated the Hive simulator to display pseudo-code, which helps to understand what exactly is going on or being entered:

http://www.mediafire.com/download/d853799bcv3hzip/hive_sim_2015-12-01.zip

The PDF in there has been updated to reflect the changes.  Inclusion of the pseudo-code view has drastically reduced my code comments.  For instance, here is the LOG2 subroutine:

    ADDR     OC  SA  SB     IM       OP  Pseudo code               Comments
   0x640 0xd010  s0   .      1  JMP_8NZ  (s0!=0) ? pc += 1         -- LOG2 SUB START
   0x641 0x2df0  s0  P7      .      GTO  pc := P7                  -- return if input zero
   0x642 0x3d01  s1  s0      .      LZC  s1 := lzc(s0)
   0x643 0x8c18  P0  s1      .    SHL_S  P0 <<= s1 (s)             -- normalize
   0x644 0xb1a6  s6   .     26      BYT  s6 := 26                  -- loop index
   0x645 0x8b08  P0  s0      .    MUL_U  P0 *= s0 (u)              -- square - Loop Start
   0x646 0x9019  P1   .      1   SHL_6S  P1 <<= 1
   0x647 0xe020  s0   .      2  JMP_8LZ  (s0<0) ? pc += 2          -- MSB
   0x648 0x9018  P0   .      1   SHL_6S  P0 <<= 1
   0x649 0xa019  P1   .      1    ADD_8  P1 += 1
   0x64a 0xaffe  P6   .     -1    ADD_8  P6 += -1
   0x64b 0xff96  s6   .     -7 JMP_8NLZ  (s6!<0) ? pc += -7        -- Loop End
   0x64c 0x3498  P0  P1      .      NOT  P0 := ~P1
   0x64d 0x2dfe  P6  P7      .      GTO  pc := P7; P6              -- LOG2 SUB END
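
For readers following along without the simulator, here is a Python sketch of the repeated-squaring log2 that the listing appears to implement.  It is my reading, not a verified translation: it assumes MUL_U keeps the upper 32 bits of the 64-bit product, that the result is a 5.27 fixed-point value, and that a zero input is simply returned as zero.  The function name is made up.

```python
def log2_fixed(x):
    """log2 of a 32-bit unsigned x as 5.27 fixed point (0 for x == 0)."""
    if x == 0:
        return 0
    lzc = 32 - x.bit_length()              # leading zero count
    m = (x << lzc) & 0xFFFFFFFF            # normalize: MSB set, value in [1, 2)
    acc = lzc                              # 5-bit seed; inverted at the end
    for _ in range(27):                    # one fractional bit per pass
        m = (m * m) >> 32                  # square, keep upper 32 bits: [1, 4)
        acc <<= 1
        if not (m & 0x80000000):           # square < 2: this log bit is 0
            m = (m << 1) & 0xFFFFFFFF      # renormalize back to [1, 2)
            acc += 1                       # store the bit inverted
    return ~acc & 0xFFFFFFFF               # NOT turns lzc into 31 - lzc too
```

Storing the bits inverted lets the single NOT at the end fix up both the fraction and the integer part, since the 5-bit complement of the leading zero count is 31 minus that count.  Dividing the result by 2**27 gives the logarithm as a real number, so log2_fixed(2) / 2**27 is exactly 1.0.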

Posted: 12/2/2015 6:19:24 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

dewster,

being an expert on FPGAs, can you tell me who holds the record for clock speed in the FPGA world?

Is it possible at the moment to implement a capture timer (based, maybe, on the ring counter schematic) at, say, 400 MHz and higher?

Posted: 12/3/2015 1:47:08 AM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

"dewster, being an expert on FPGAs, can you tell me who holds the record for clock speed in the FPGA world? Is it possible at the moment to implement a capture timer (based, maybe, on the ring counter schematic) at, say, 400 MHz and higher?"  - ILYA

The pin toggle rate is usually the limiting factor, the internals will generally run much faster (for simple constructs like narrow width counters and such).  For a price one can go to more expensive families and higher speed grades, but unless you are a big guy directly negotiating with the manufacturer to get the price way down for a ton of them, the price will be really high - which is why I tend to stick to the cheap low end stuff.  

If you are trying to resolve input phase more finely there are round-about ways of controlling the phase of one of the internal clock managers (PLL / DLL).  You can also statistically vary the drive period to reflect short-term fractional phase.  And an A/D input can resolve sine wave input more finely than digital means.

 

Posted: 12/5/2015 8:41:49 AM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

Visiting the manufacturers' web pages, I've not found clear information about the maximum internal clocks. I suppose that info is buried among tons of datasheets and verbiage.

So the question is the same: I need a capture timer with an internal clock >400 MHz. The capture signal is external, and the delay between input and capture is not essential, but it should be constant. Minimum depth: 12 bits.

What can I get using an FPGA/CPLD? (Your answer will decide whether I start learning FPGA/CPLD or keep waiting.)

Posted: 12/5/2015 5:17:43 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

ILYA, I agree the specs are a bit buried.  For an Altera Cyclone 4 speed grade 8, the datasheet shows 402 MHz as the maximum internal clock tree rate.   (The block RAMs and 18-bit multipliers run slower at 238 and 200 MHz, respectively.)  

The way to really know is to compile a design, as the tool is hyper aware of all timing.  As a test I stuck a PLL (50 MHz in / 400 MHz out) and a simple up-counter together and had it output the MSb of the counter to a pin (so the output pin toggle rate isn't a limiting factor).  In Quartus 10.1 web edition I'm seeing 428 MHz top speed for a 15 bit counter, which is limited by the tool to 402 MHz (max clock tree rate).  Going to 16 bits the top speed drops to 328 MHz (limited by the carry chain logic).  Since this is pushing the (cheap & slow) FPGA logic to the max, top speed really depends on how the counter output will be used internally.  

File is here: http://www.mediafire.com/download/zoh1n6adu41c8s2/speed_testing_2015-12-05.zip

Are you thinking of using something like this to measure the period of an external oscillator?  If so you could just let the oscillator run, snag the counter output at the input edges, and subtract the previous counter value (i.e. a differentiator).  If you need more bits the counter can likely be deconstructed into sections (registering the intermediate carry chains).  Low pass filtering the results would increase resolution (though, if you think about it, the standard trick of zero padding between samples to increase the distance between aliased images unfortunately won't work because as many zeros will be averaged as the count value, yielding a constant average - though you can do a bit of variable-rate filtering initially quite inexpensively).
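
The capture-and-subtract idea can be sketched in a few lines of Python.  The function name and the 15-bit default width (borrowed from the counter test above) are illustrative assumptions; the point is only that subtracting modulo the counter width makes rollover harmless, as long as each period is shorter than one full count.

```python
def periods_from_captures(captures, width=15):
    """Differentiate free-running counter captures into period counts."""
    mask = (1 << width) - 1
    # Modular subtraction: correct even when the counter wrapped between edges.
    return [(new - old) & mask for old, new in zip(captures, captures[1:])]
```

For example, captures of 32000, 100, and 968 from a 15-bit counter give two periods of 868 counts each, even though the counter rolled over between the first two edges.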

Posted: 12/5/2015 6:22:44 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

Thanx dewster!

Installing the design software is a great idea for evaluating the different FPGA families. Unfortunately, all of these software packages are monsters.

"Are you thinking of using something like this to measure the period of an external oscillator?"

No. I am puzzling over heterodyning in the digital domain.

Posted: 12/6/2015 1:58:03 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Analog Theremins Considered Harmful

Sort of kidding, but my point is this: other than helping the designer to understand exactly what a Theremin is in a musical sense, I think approaching digital Theremins from an analog Theremin angle is fraught with the very real danger of causing one to bark up the wrong trees forever (or until one burns out, whichever comes first).  A lot of analog Theremin technology and design technique "unlearning" has to take place before one can be effective at designing the digital variety.  I think livio demonstrated this best with his fairly unconventional oscillator, plate antenna, and linearizing software (likely based on working the capacitance backwards to distance).  With digital it's a C sensor problem first and foremost.

With that off my chest, I'm strongly considering ditching the entire "roughly linear in an exponential way" approach.  It's THE major analog stumbling block: with cramped near-field response which pretty much forces one to use long thin low sensitivity antennas, linearizing inductors that complicate tuning, etc.  I have to convert from exponential to linear and back again for the LED tuner and tilt / offset adjustments, and these aren't free in terms of processor real-time, so I'm thinking why not do a direct conversion of frequency (or period) to linear distance?  I've worked through the math (based on my experimental findings that plate/hand mutual capacitance is strongly linear with 1/distance) and it's roughly the same complexity when using either period or frequency as the input, with a single division (or inverse) the most processor time consuming mathematical function.  One nice thing with this approach is that the constants are related to real world parameters (inductance, antenna static capacitance, antenna mutual capacitance "gain", null distance) so setting them should be simpler (in theory).  (Post manipulation, this linear figure would of course still require exponentiation.)
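
To make the idea concrete, here is a rough Python sketch of such a conversion.  It is not the math referred to above, just one way it could go under stated assumptions: a plain LC tank with f = 1 / (2*pi*sqrt(L*C)), total capacitance C = C_static + k/d, and made-up component values.  The names are hypothetical and the null distance is left out entirely.

```python
import math

L_TANK = 1.0e-3        # tank inductance in henries (hypothetical value)
C_STATIC = 10.0e-12    # antenna static capacitance in farads (hypothetical)
K_MUTUAL = 0.05e-12    # mutual capacitance "gain" in farad-metres (hypothetical)


def freq_from_distance(d):
    """Forward model: hand distance in metres -> oscillator frequency in Hz."""
    c_total = C_STATIC + K_MUTUAL / d            # C assumed linear in 1/d
    return 1.0 / (2.0 * math.pi * math.sqrt(L_TANK * c_total))


def distance_from_freq(f):
    """Inverse model: frequency in Hz -> hand distance in metres."""
    w2l = (2.0 * math.pi * f) ** 2 * L_TANK      # equals 1 / C_total
    # Rearranged so that only one division is needed.
    return K_MUTUAL * w2l / (1.0 - C_STATIC * w2l)
```

Under these assumptions the inverse needs a couple of multiplies and a single division, and the three constants map directly onto measurable quantities.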

SNR, particularly when using a small plate antenna, should be sufficient.  A plate gives the best 1/d linearity, the highest sensitivity, and some welcome directivity.  I had an idea last night (simple shield drive) that I need to try out on the bench before parading it around which, if it pans out, would provide even stronger directivity and possibly a bit higher sensitivity.  I wonder how stable (in terms of C vs. temperature) one can construct a plate capacitor with an air (or other) dielectric?  Plates ~0.1 m on a side, separation ~25 mm (to give ~3 pF of mutual C).
