Pitch & Volume Axis Parameter Modulation
There's a lot going on with modulation parameters that probably isn't all that apparent, even in modular analog synths. When, where, and how to scale and apply them has been one of the tougher nuts for me (hence much whinging - sorry!). Since I had to revisit this issue for the oscillator harmonic level input, I thought I'd post a somewhat clearer explanation of it while it's still firmly in my head.
The above is my current approach (implemented via subroutine) to modulating several parameters via the pitch and volume axis numbers. The pitch number associated with C0 and the volume number associated with -48dB are actually the same: 0xC000,0000, which is 3/4 of a full-scale unsigned 32-bit value. So, except for a factor of 4 weighting favoring the pitch side, the pitch and volume processing branches are identical.
Following the upper pitch branch: the (unsigned) linear pitch number has (unsigned) C0 subtracted from it. This is a saturating unsigned subtraction (negatives limit/clip to zero) so the result is also unsigned. The result is at most 1/4 of full scale, so it is multiplied by 4 to make it full-scale. After this the most significant bit [31] is flipped (logically negated) to give a +/-1/2 (full) range signed value. This is then (signed) multiplied with the (signed) modulation knob parameter PMOD, yielding a +/-1/4 range signed value. The volume branch is the same, but it uses the VMOD knob parameter and the result is arithmetically (signed) right shifted by 2 to provide a division by 4, yielding a +/-1/16 range signed value. These are combined and used as a signed offset to various things, such as the (unsigned) filter frequency shown at right. Note that combining the pitch and volume modulation numbers by simple addition is safe to do, as the sum can never exceed +/-5/16 (1/4 + 1/16), which is comfortably inside +/-1/2, but adding that sum to other things downstream requires saturating math. Even if you did this stuff with floats you would need to saturate somewhere.
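For anyone who would rather read code than a signal-flow description, here is a rough Python sketch of the two branches described above. It is not the actual subroutine: the function names are made up for illustration, the knob values are assumed to already be scaled to signed 32-bit numbers, and taking the upper 32 bits of the 64-bit product is my assumption about how the signed multiply produces the +/-1/4 range.

```python
U32_MAX = 0xFFFF_FFFF
AXIS_ZERO = 0xC000_0000  # C0 on the pitch axis, -48dB on the volume axis

def to_signed32(u):
    """Reinterpret an unsigned 32-bit pattern as a two's complement value."""
    return u - (1 << 32) if u & 0x8000_0000 else u

def axis_to_signed(axis):
    """Saturating subtract of the axis zero, *4, then flip the MSb [31]."""
    d = max(axis - AXIS_ZERO, 0)      # negatives clip to zero, at most 1/4
    full = (d << 2) & U32_MAX         # *4 brings it up to full scale
    return to_signed32(full ^ 0x8000_0000)  # +/-1/2 range signed value

def mul_s32_hi(a, b):
    """Signed 32x32 multiply keeping the upper 32 bits (assumed scaling)."""
    return (a * b) >> 32              # Python's >> is arithmetic on negatives

def combined_mod(pitch, volume, pmod, vmod):
    """Sum of the pitch branch (+/-1/4) and the volume branch (+/-1/16)."""
    p = mul_s32_hi(axis_to_signed(pitch), pmod)
    v = mul_s32_hi(axis_to_signed(volume), vmod) >> 2  # divide by 4
    return p + v                      # magnitude never exceeds 5/16

def apply_offset_u32(value, offset):
    """Saturating add of a signed offset to an unsigned 32-bit parameter."""
    return min(max(value + offset, 0), U32_MAX)
```

The last function is where the saturation lives: something like `apply_offset_u32(filter_freq, combined_mod(pitch, volume, pmod, vmod))`, with `filter_freq` standing in for whatever unsigned parameter is being modulated.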
Please note what isn't at all obvious (IMO) from the above diagram & explanation: the initial subtraction, *4, and NSb operations provide a signed value that "hinges" (has its zero crossing) about the midpoint of operation, which is C4 and -24dB, or 0xE000,0000. Multiplying this by another signed "hinged" value, PMOD / VMOD from the knob, creates a sort of "teeter-totter" with slope strength and direction controlled by the knob. Since the pivot point is mid-field, it's entirely possible to, say, make the knob value more positive and have it change the current static modulation negatively (this happens whenever the axis number sits below the pivot, where the hinged value is negative) - which is fairly counter-intuitive, but it seems that that's life in the big city.
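A small numeric illustration of the teeter-totter, again only a sketch: the helper names, the two knob values, and the "upper 32 bits of the product" scaling are assumptions chosen for the example, not taken from the actual code.

```python
AXIS_ZERO = 0xC000_0000  # C0 / -48dB
AXIS_MID = 0xE000_0000   # C4 / -24dB, the pivot

def hinged(axis):
    """Axis number -> signed value with its zero crossing at AXIS_MID."""
    d = max(axis - AXIS_ZERO, 0)                   # saturating subtract
    u = ((d << 2) & 0xFFFF_FFFF) ^ 0x8000_0000     # *4, then flip bit 31
    return u - (1 << 32) if u & 0x8000_0000 else u

def modulation(axis, knob):
    """Hinged axis value times a signed 32-bit knob value (assumed scaling)."""
    return (hinged(axis) * knob) >> 32

below = 0xD000_0000      # a quarter of the way up the field
above = 0xF000_0000      # three quarters of the way up
small_knob = 0x1000_0000
big_knob = 0x4000_0000

for name, axis in (("below", below), ("pivot", AXIS_MID), ("above", above)):
    print(name, modulation(axis, small_knob), modulation(axis, big_knob))
```

Below the pivot the larger positive knob value gives the more negative modulation, at the pivot the knob does nothing at all, and above the pivot the larger knob value gives the more positive modulation.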
The gain difference of 4 favoring the pitch side is to roughly make up for the generally more limited range the pitch field controls. For the filters, setting PMOD to +32 gives normal chromatic spacing, setting it to +63 gives ~twice this, and negative values give negative directional modulation. You really need one specific setting value to give exact chromatic tracking here (so ringing filters can be used as rough oscillators, so filters track the oscillators to yield harmonic content similar to fixed waveforms, etc.).
So the above is nothing new in terms of how I was doing the filter modulation, but I think the explanation of the central nuance is much clearer. The above is however new for the oscillator harmonic level modulation. Previously, I was only modulating the harmonic level with the volume axis, and only in a positive (louder = more harmonics) way, which wasn't entirely ideal. If you don't constrain the axis modulation and make it "hinge" about the mid-field you get most of the change happening at the far-field, where the volume is too low to really hear it change.
As for pitch axis modulation of harmonics, adding a bit of negative modulation seems to make human vocals more realistic sounding to me. Which makes sense because the low pitch end vocal "fry" needs lots of harmonics, whereas on the high pitch end the vocal process is entering falsetto territory, which tends to be more mellow. Traditional Theremin voices also tend to have more harmonics on the low pitch end due to oscillator coupling. For violin type sounds, negative volume modulation of the harmonics might help emulate the initial "skritching" of the bow.
From many synthesis standpoints, it is extremely useful having a harmonic level input on the oscillator.
=================
It strikes me as a bit odd that simple waveforms which merely accentuate upper harmonics (and not even in a fixed resonance way!) can sound anything like a human voice. Though I suppose waveforms like that don't generally exist in nature. It's kind of like flying insects that base their navigation on the sun and are therefore utterly confused by artificial lighting: everything our auditory system has been designed by nature to detect has been, up until very recently, fairly simple (to generate) waveforms + resonant structures.
=================
[EDIT] Here's a bit of "Danny Boy" (MP3) with the latest harmonic modulation. I must say that it feels a little strange controlling somewhat realistic sounding voices that aren't my own. Here's hoping that articulating (matrix mod) the formant frequencies will add even more realism.