Let's Design and Build a (mostly) Digital Theremin!

Posted: 8/8/2017 6:09:32 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Thought you might like to see the sim so far:

The listing for thread 7 is above, at lower left is the register set (with only the basic core registers shown, none of the Theremin registers), at lower center is the stacks status for thread 7, and at lower right is some info / sim status.  I'll probably swap the register and info views to make room to view more registers on the lower right with the command line on the lower left.

In the listing display, the "BYT", "LIT", and "LINE" columns are new.  They show how many bytes are in the opcode, whether the opcode is a literal or not, and what line in the source code the opcode came from (extremely useful).  Note the explicitly variable lengths in the opcode column now.

The command line code isn't implemented yet (only doing single key stuff at the moment to prove the core and display) and I'm not sure if there will be any screens other than this one because I want to keep it really simple.  For the current single key stuff (which will likely change somewhat) the number keys select the thread to display, the 'c' key does one clock, the spacebar does 8 clocks, the 'h', 'u', and 'i' keys pick the display radix, and the 'q' key quits the sim.  I need ways to interact with the UART FIFOs, GPIO, and SPI; load a different *.hal file; issue core clear and external interrupts; and page/scroll through the memory listing display without execution.  (I wish IBM had started the function key numbering at F0 rather than F1...)
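For readers following along, the single key scheme described above could be sketched roughly as below. This is only an illustration, not the sim's actual code: the names `SimUi`, `Radix`, and `handleKey` are made up, the clock counter stands in for really stepping the core, and the mapping of 'h' / 'u' / 'i' to hex / unsigned / signed integer is an assumption, as is the thread range 0 to 7.

```cpp
#include <cassert>

// ASSUMPTION: 'h', 'u', 'i' mean hex, unsigned, and signed integer display.
enum class Radix { Hex, Unsigned, Signed };

// Hypothetical UI state; the real sim certainly holds more than this.
struct SimUi {
    int   thread = 0;           // thread currently displayed
    Radix radix  = Radix::Hex;  // display radix
    long  clocks = 0;           // stand-in for actually clocking the core
    bool  quit   = false;
};

// Dispatch one key press onto the UI state.
inline void handleKey(SimUi& ui, char key) {
    if (key >= '0' && key <= '7') {  // ASSUMPTION: threads are numbered 0..7
        ui.thread = key - '0';
        return;
    }
    switch (key) {
        case 'c': ui.clocks += 1;             break;  // one clock
        case ' ': ui.clocks += 8;             break;  // spacebar: 8 clocks
        case 'h': ui.radix = Radix::Hex;      break;
        case 'u': ui.radix = Radix::Unsigned; break;
        case 'i': ui.radix = Radix::Signed;   break;
        case 'q': ui.quit = true;             break;
        default:                              break;  // ignore unknown keys
    }
}
```

Keeping the dispatch in one small function like this would make it easy to swap the single keys for command line parsing later.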

The new sim snags pertinent state info for each thread when it crosses from stage 7 to stage 0 in the pipeline.  This way the state info remains stable throughout an 8 clock thread cycle for display purposes.  In the old sim the displayed state was dependent on the stage you were viewing, which was realistic, but cumbersome and a little confusing - after changing the thread view I usually had to do several single clocks to get past stage 5 where the fetch happens in order to see the next state.
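A minimal sketch of that snapshot idea, under stated assumptions: 8 threads staggered across 8 pipeline stages, and a made-up `ThreadState` with just two fields. The names `Sim`, `live`, `shown`, and `view` are hypothetical and the per-clock "work" is a placeholder, so this shows the latching technique rather than the real sim.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kThreads = 8;  // ASSUMPTION: one thread per pipeline stage
constexpr int kStages  = 8;

// Hypothetical per-thread state; the real sim tracks much more than this.
struct ThreadState {
    std::uint32_t pc     = 0;  // stand-in for "pertinent state"
    std::uint32_t cycles = 0;  // completed 8 clock thread cycles
};

struct Sim {
    std::array<ThreadState, kThreads> live{};   // changes on every clock
    std::array<ThreadState, kThreads> shown{};  // stable copy for the display
    std::array<int, kThreads>         stage{};  // current stage of each thread

    Sim() {
        // Stagger the threads so each one starts in a different stage.
        for (int t = 0; t < kThreads; ++t) stage[t] = t;
    }

    void clock() {
        for (int t = 0; t < kThreads; ++t) {
            live[t].pc += 1;  // placeholder for the real per-stage work
            if (stage[t] == kStages - 1) {
                live[t].cycles += 1;
                shown[t] = live[t];  // latch only at the stage 7 -> 0 crossing
            }
            stage[t] = (stage[t] + 1) % kStages;
        }
    }

    // The display reads only the latched copy, never the live state.
    const ThreadState& view(int t) const {
        assert(t >= 0 && t < kThreads);
        return shown[t];
    }
};
```

With this arrangement the view of a thread changes at most once every 8 clocks, regardless of which stage that thread happens to be in when the view is switched.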

===========

[EDIT] Wow, this thread got ~500 views in 24 hours.  Kind of hard to believe there are that many people in the world, much less visiting here, with the intersecting interests of Theremins and soft processors.  Bots?  Web page scraping?

Posted: 8/11/2017 3:27:45 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Not a lot of visual change, but all of the essential features are implemented and seem to be working:

I swapped the register and info views to make room to view more registers on the lower right with the command line on the lower left.  The Theremin registers aren't here, I will add them later, and may make them an executable command line option so that others who want to use the Hive core won't be bothered by them.

Doing 800,000 verification cycles (as seen on the command line at the lower left) takes 94 seconds of real time on my PC (AMD Athlon II X2 250, XP SP3). 800,000 cy / 94 sec = 8.5 kHz, and this is the per thread rate.  The core clock rate is 8 times this, or 68 kHz.  The FPGA hardware runs over 180 MHz, so the C++ sim is a couple of thousand times slower than the hardware, which really isn't too shabby.

Ten times the cycles (8,000,000) in the old sim take 81 seconds, so the new sim is roughly 12x slower than the old.  I haven't done any profiling, but I imagine the slowdown is due to data moving through the pipelines in a more realistic manner, with double the representation for clocking (asynchronous & registered).  Whatever, it's going to be easier to maintain the new sim code, and it's likely fast enough for what I need to do.  Sort of ironic that the things you do to speed up the hardware implementation slow down the software representation.
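The speed figures in this post can be checked with a few lines of arithmetic. The sketch below only uses numbers quoted above (800,000 cycles in 94 s, 8,000,000 cycles in 81 s, 8 clocks per thread cycle, 180 MHz hardware); the names `rateHz`, `SimSpeed`, and `simSpeed` are invented for the illustration.

```cpp
#include <cassert>

// Rate in Hz: cycles completed divided by wall-clock seconds.
inline double rateHz(double cycles, double seconds) {
    assert(seconds > 0.0);
    return cycles / seconds;
}

// The core clock is 8 times the per thread rate, as stated in the post.
constexpr int kClocksPerThreadCycle = 8;

struct SimSpeed {
    double threadHz;  // simulated per thread rate
    double coreHz;    // simulated core clock rate
    double slowdown;  // hardware clock / simulated core clock
};

inline SimSpeed simSpeed(double cycles, double seconds, double hardwareHz) {
    const double threadHz = rateHz(cycles, seconds);
    const double coreHz   = threadHz * kClocksPerThreadCycle;
    return {threadHz, coreHz, hardwareHz / coreHz};
}
```

Calling `simSpeed(800000.0, 94.0, 180e6)` gives roughly 8.5 kHz per thread, 68 kHz core, and a slowdown of about 2,600 relative to 180 MHz hardware. Dividing `rateHz(8000000.0, 81.0)` by the new per thread rate gives about 11.6, which matches the "roughly 12x" figure.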

One nifty feature I added to the sim last night is a cleared screen which shows the verbose results of the HAL assembly process to memory, with a "hit any key" pause to read them.  Today I added a MIF file write command which kicks out the FPGA config files. Together these features largely obviate the need for a stand-alone assembler.

(Will be going on vacation for a couple of weeks starting Sunday, so light to no posting in the interim.)

Posted: 8/11/2017 7:03:41 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

Have a good vacation Dewster! Don't forget about TW!

Posted: 8/11/2017 9:04:56 PM
oldtemecula

From: 60 Miles North of San Diego, CA

Joined: 10/1/2014

Hey dew,

Coding is not like riding a bike or the nasty, walk away for two weeks and it takes precious time to get the brain synapses fired up again. Have fun and we will not worry if you found paradise and do not come back.

Christopher

Posted: 8/11/2017 10:38:27 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

"Have a good vacation Dewster! Don't forget about TW!" - ILYA

Thanks ILYA, I won't forget!

"Coding is not like riding a bike or the nasty, walk away for two weeks and it takes precious time to get the brain synapses fired up again. Have fun and we will not worry if you found paradise and do not come back." - Christopher

That's the truth!  The main reason to code as simply and directly as possible.  I had this bad feeling when coding the initial sim (as an exercise in OOP) that I should have instead been doing so with as much fidelity to the hardware as possible.  Ah well, a rewrite was probably called for at this point anyway.  Leaving every possible option out and combining / simplifying those that remained is something you can only do more or less correctly in a later pass.  At the time of the initial sim I didn't know that I'd be implementing an assembly language down the road (I did everything I could to avoid it, but it makes too much sense in the developmental flow of things) so there were a bunch of editing options I was able to remove this pass.

Posted: 9/3/2017 8:10:37 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Back from vacation, had a great view of the solar eclipse when we were at Table Rock State Park in South Carolina, as it was in the path of totality.  This was my first total eclipse, and I must say it was every bit as amazing looking as any photo I've seen, like a black fireball in the sky.  It got really dark out, but not as dark as I expected.

Had a few ideas while I was away and am finally getting around to implementing them.  The main one was to make encoding of the opcodes more orthogonal.  Now that Hive is byte addressed and opcodes are a variable 1 to 4 bytes in length, all the pressure is off in terms of fitting everything into a fixed width field.  Which paradoxically somewhat confounds my thought processes as I've been working in a very constrained design space for so long I'm finding it difficult to change gears. Freedom overspill.

Anyway, I moved the opcodes around in the encode space so that those with 8 bit immediate values, those with 16 bit immediate values, and the versions of these without immediate values are now simple offsets of each other.  Doing so expanded the number of immediate add, subtract, and multiply operations to 4 each (1 normal and 3 extended) for each immediate size, which is fine and essentially free.  

I can't seem to put the issue of the inclusion or removal of the unconditional jump with 8 bit immediate to bed - every time I talk myself into excluding it I find myself arguing the other side and putting it back in, and vice-versa in an endless loop.  The immediate field as a signed binary value is non-standard (re-use of the stack selects, though this is used one-hot for stack pop and clear opcodes) and there is no 16 bit immediate version, but it has some utility.  In my command line code I count around 28 instances of it in 930 lines of code.  Another idea I implemented was code statistics reporting from the assembly process:

********************************************
* HIVE HAL Assembler : Core Version 0x1106 *
********************************************

HAL lines   : 930
Label passes: 3

OP bytes    : 1887
OP count    : 684
OP avg      : 2.75877
OP %        : 11.5173

LIT bytes   : 3528
LIT count   : 1047
LIT avg     : 3.36963
LIT %       : 21.5332

TOTAL bytes : 5415
TOTAL %     : 33.0505

Elapsed time: 0.182085s

*********************
* Exiting Assembler *
*********************
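The derived figures in that report can be reproduced from the raw byte and item counts. In the sketch below the struct and member names are made up, and the 16384 byte memory size is an assumption inferred from the percentages (1887 / 16384 = 11.5173%); the post does not state it.

```cpp
#include <cassert>

// Hypothetical container for the raw counts printed by the assembler.
struct CodeStats {
    int opBytes;
    int opCount;
    int litBytes;
    int litCount;
    int memBytes;  // ASSUMPTION: 16384, inferred from the reported percentages

    static double ratio(double num, double den) {
        assert(den > 0.0);
        return num / den;
    }

    int    totalBytes() const { return opBytes + litBytes; }
    double opAvg()      const { return ratio(opBytes, opCount); }    // bytes per opcode
    double litAvg()     const { return ratio(litBytes, litCount); }  // bytes per literal
    double opPct()      const { return 100.0 * ratio(opBytes, memBytes); }
    double litPct()     const { return 100.0 * ratio(litBytes, memBytes); }
    double totalPct()   const { return 100.0 * ratio(totalBytes(), memBytes); }
};
```

With `CodeStats s{1887, 684, 3528, 1047, 16384}` the averages come out at about 2.7588 and 3.3696 bytes, and the percentages at about 11.517, 21.533, and 33.051, in line with the listing.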

In the absence of an unconditional im8 jump, a conditional im8 jump is automatically substituted with the condition (s0 == s0) which is always true even if the stack is empty or otherwise uninitialized.  Both have the same execution time, but the conditional version consumes another byte, so this comes down to a matter of storage efficiency.  If 28 instances in this code is anywhere near the norm (who can say?) then we're saving 28 bytes in 5415 bytes total, which is roughly 0.5%.  So I'm rather inclined to drop it.  And tomorrow I'll come up with an even better argument to keep it, LOL!

I'm also considering adding unsigned conditional testing, but that's about it for my new ideas.  Really hope I'm at the scraping-the-bottom stage as I'm quite anxious to get back to real SW and Theremin work.

Posted: 9/4/2017 5:02:46 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

welcome back dewster!
Relation of the moon with the theremin? There is nothing easier!

(idea of Dmitry Gurovitch, implementation by our member Valery)

 

Posted: 9/5/2017 1:36:22 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Hi ILYA!  Wow, is that a Theremin?  If so, I've had similar thoughts as to that physical design, except mine would be a straight tube with square cross section, mounted at an angle to put the pitch end high and the volume end low.  This puts the controls at a good height for seeing / manipulating them, but they unfortunately also get angled.

(Found this crazy page when Googling for that image.)

======

Changing the SV according to my previous post was fairly quick and straightforward, and it doesn't seem to have broken anything. Changing the assembler / sim is proving to be more involved. I'm looking into more intelligent parsing steps to get it done.  The usual suspect is the minus sign which can be either an operator or a negative sign.  Read an interesting comment to an article over on Hacker News where the person was talking about a "micro compiling" workshop technique, where the input is changed slightly to something intermediate but closer to what you want and then output, with 40 or more steps in a chain of these transformations.  This makes a lot of sense as the total transformation is very difficult for the brain to take on at once; much more tractable implemented as baby steps.
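As an illustration of the minus sign problem, here is one common heuristic written as a single small pass: a '-' that follows an operand (a name, a number, or a closing parenthesis) is treated as subtraction, and any other '-' as negation. This is not the HAL assembler's parser, and the names `Minus` and `classifyMinus` are invented; it only shows how a small intermediate step can settle the ambiguity before the real parsing starts.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class Minus { Negate, Subtract };

// Classify every '-' in the input, in order of appearance.
// Heuristic: '-' directly after an operand is a subtraction;
// at the start, after another operator, or after '(' it is a negation.
inline std::vector<Minus> classifyMinus(const std::string& src) {
    std::vector<Minus> out;
    bool prevIsOperand = false;  // true after a name, a number, or ')'
    for (char ch : src) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (std::isspace(c)) continue;  // whitespace does not change context
        if (c == '-') {
            out.push_back(prevIsOperand ? Minus::Subtract : Minus::Negate);
            prevIsOperand = false;
        } else if (std::isalnum(c) || c == '_' || c == ')') {
            prevIsOperand = true;
        } else {
            prevIsOperand = false;  // other operators, '(' and punctuation
        }
    }
    return out;
}
```

For example, in `a - -3` the first minus is classified as a subtraction and the second as a negation. In a chain of small transformations, a pass like this could rewrite each negation to a distinct token so that later steps never see the ambiguity.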

Posted: 9/5/2017 3:29:32 PM
ILYA

From: Theremin Motherland

Joined: 11/13/2005

"is that a Theremin?  "

Yes, based on the assembled EW PCB.

https://translate.google.ru/translate?sl=ru&tl=en&js=y&prev=_t&hl=ru&ie=UTF-8&u=https%3A%2F%2Fjazzpeople.ru%2Fjazz-in-faces%2Fdmitriy-gurovich-termenvoks-luna%2F&edit-text=

 

Posted: 9/13/2017 10:54:02 PM
dewster

From: Northern NJ, USA

Joined: 2/17/2012

Lots of work done on Hive, the assembly language HAL, and the Hive simulator in the last week or so. 

Instead of getting rid of the unconditional jump with immediate byte distance, I decided to embrace it and add 16 and 24 bit versions - don't want to waste any memory access bandwidth opportunities.  In the assembler I removed the remaining "clumping", as the "puffing" step that comes before it is now a surgical insertion of spaces that doesn't need much fiddling afterward to tokenize and parse.  I added OP_HLT, a halt operation that is a one byte infinite loop - this is for convenience as it keeps me from having to declare self labels to jump to.  I also added more jump conditionals: unsigned less than and unsigned not less than, as well as odd and not odd.  This brings the single operand conditional count to 6, and the two operand conditional count also to 6, and the comparison syntax is now signed / unsigned where necessary to disambiguate (<s, <u, !<s, !<u).  Once again, the bit reduction operations return 0 and 1, rather than 0 and -1, which I've realized is more in line with C bools (in terms of adding a bool to an int, and bool bit vectors).  The mixed signed and unsigned add, subtract, and multiply are now US (unsigned, signed) rather than SU (signed, unsigned), which is easier to think about and perhaps more useful when the sign of the thing operating on the unsigned thing isn't known (i.e. a variable in a register).  It took years for the dust to settle on most of these final issues in my brain (and it may unsettle at any point!).
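To make the signed / unsigned distinction and the 0 / 1 reduction results concrete, here is a small C++ model. It assumes 32 bit registers, and the function names (`ltSigned`, `ltUnsigned`, `isOdd`, `reduceOr`) are made up for the illustration; OR reduction is used as one example of a bit reduction, without claiming it is the exact set Hive implements.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: 32 bit registers. The same bit pattern is compared two ways.

// "<s": interpret both operands as two's complement signed values.
constexpr bool ltSigned(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int32_t>(a) < static_cast<std::int32_t>(b);
}

// "<u": interpret both operands as plain unsigned values.
constexpr bool ltUnsigned(std::uint32_t a, std::uint32_t b) {
    return a < b;
}

// Single operand "odd" test: true when the least significant bit is set.
constexpr bool isOdd(std::uint32_t v) {
    return (v & 1u) != 0u;
}

// OR reduction returning 0 or 1 (not 0 or -1), so the result behaves
// like a C bool and can be added directly to a running count.
constexpr int reduceOr(std::uint32_t v) {
    return v != 0u ? 1 : 0;
}
```

The pattern 0xFFFFFFFF is -1 when read as signed, so `ltSigned(0xFFFFFFFFu, 1u)` is true while `ltUnsigned(0xFFFFFFFFu, 1u)` is false, which is why the syntax has to say which comparison is meant. With 0 / 1 results, something like `count += reduceOr(flags)` increments by exactly one; a 0 / -1 convention would decrement instead.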

Spent most of yesterday editing the verification assembly code; the sim and FPGA versions check out OK.  The FPGA place and route has difficulty making 190 MHz now, which isn't terribly surprising.  The logic element count is just over 3k so it hasn't grown significantly.

Getting tapped out (I hope!) on processor ideas to try, so I'm going back to algorithm polishing with the new opcodes and my 3 term float type (sign, mag, exp).  Then back to Theremin prototyping stuff.  Gotta do a major update on the Hive design doc at some point too (hopefully in the form of a book someday).

Processor design is like solving a gigantic, indistinct crossword.  You slowly fill in a few things here and there, and end up erasing a lot of it, but the more you do the more the momentum builds and the faster it goes, though the overall rate is glacially slow.  It's certainly the biggest puzzle I've ever worked on, with only the vaguest of clues at the outset.  Aside from attempting some new overall architecture (the main reason to do it in the first place), a major part of it is deciding on the set of basic operations you want / need to include, and you can only get a feel for this after you've designed a processor of some sort and programmed it to do mathy stuff, which is quite involved on its own.  And any significant programming activity like this pretty much requires an assembly language, so you're on the hook for that, as well as some form of efficient simulation.  

The thrust of hardware description languages like SV is to cover two distinctly different "modes": 1) code that describes hardware, and 2) code that verifies that description (test bench implementation).  But, as painful as it is, I must say that describing the design in two different languages (SV & C++) is incredibly valuable from several standpoints.  One form tends to catch logical errors in the other form so verification is strongly aided, and looking at the design from different angles helps to gain general insight which directly leads to improvements in the design.  And the C++ version can be used to make a full featured simulator and process the assembly language, so it's not wasted effort.
