"What are the effects of the difference between original and "modified" versions?" - tinkeringdude
There's a paper on it, and here's a poster (link), but I don't think they clearly convey the simple thing that's going on. For any filter, analog or digital, a reduction in gain controls the cutoff frequency (because it controls the rate of integration). Each gain reduction is paired with an integrator, and the pair count tells you the order of the filter. When the order is > 1 there will be additional gain reductions associated with damping.
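To make that concrete, here's a minimal Python sketch (names and values are mine, purely illustrative): a first-order lowpass is exactly one gain reduction feeding one integrator, and shrinking the gain lowers the cutoff.

```python
def one_pole_lp(x, g, state=0.0):
    # One gain reduction (g) paired with one integrator = first-order lowpass.
    # Smaller g -> slower integration -> lower cutoff.
    out = []
    for sample in x:
        state += g * (sample - state)  # gain reduction feeding the integrator
        out.append(state)
    return out

step = [1.0] * 64
fast = one_pole_lp(step, g=0.5)    # higher cutoff: settles in a few samples
slow = one_pole_lp(step, g=0.05)   # ~10x lower cutoff: settles ~10x slower
```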
When you have limited resolution, gain reductions can easily be problematic. Even floats can't get you around this entirely because they also have limited resolution, so when you add something significantly smaller than the thing you are adding to, nothing changes. (There may be situations where floats actually make things worse, because the exponent consumes some of the resolution space.)
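You can watch the float version of this happen in one line of Python:

```python
# float64 has a ~53-bit mantissa, so at 1e16 the spacing between
# representable values exceeds 1.0 and small additions are simply lost:
acc = 1e16
print(acc + 1.0 == acc)   # True: the add changed nothing
# float32 (~24-bit mantissa) hits the same wall much sooner, around 1e8.
```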
So where you place the gain reduction in relation to the integrator is critical to the basic resolution of the thing being integrated. If you place the gain reduction before the integrator, you've reduced the resolution of the thing being integrated by truncating its least significant bits. On the upside, the integrator won't grow much larger than the thing feeding it, and its value is often a direct filter output (LP, BP), which can be handy. If you place the gain reduction after the integrator, the integrator will grow to be much larger than the thing feeding it; but if you have the headroom, this is the best possible scenario, because no truncation happens until after the integration, where there is excess resolution to throw away.
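A hedged fixed-point sketch of the two placements, assuming a Q8 coefficient (constants are mine, just to show the mechanics):

```python
F, Q = 10, 8   # illustrative: cutoff coefficient F in Q8 (F / 2**Q ~ 0.039)

def lp_gain_before(x, y=0):
    # Gain reduction BEFORE the integrator: the >> Q truncates LSBs of the
    # very thing being integrated. State y is directly the LP output, and it
    # never grows much beyond the input range.
    out = []
    for s in x:
        y += ((s - y) * F) >> Q    # truncate, then integrate
        out.append(y)
    return out

def lp_gain_after(x, acc=0):
    # Gain reduction AFTER the integrator: accumulate at full resolution and
    # truncate only on the way out. acc grows ~(2**Q / F) times larger than
    # the input, so it needs headroom, but nothing is lost before integration.
    out = []
    for s in x:
        y = (acc * F) >> Q         # excess resolution thrown away here
        acc += s - y               # full-resolution integration
        out.append(y)
    return out
```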
Resolution reduction is bad because it directly degrades the signal's SNR and makes the parameters less continuous. And the least significant bits you chop off can sometimes remain in the filter and recirculate, causing low-level limit cycles (you want everything to decay to true zero).
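The failure to decay is easy to demonstrate with the same loop as lp_gain_before above. One caveat: Python's >> floors toward negative infinity, which happens to carry this particular case all the way to zero, so I use trunc-toward-zero division here to mimic C's integer math:

```python
F, Q = 10, 8
y = 0
for s in [1000] + [0] * 200:
    y += int((s - y) * F / (1 << Q))   # truncates toward zero, like C
print(y)   # stuck at 25 forever: |(-25 * 10)| < 256, so the update is 0
```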
I think Chamberlin picked his SV topology (gain reduction preceding integration) because it was a direct replacement of the analog SV integrators with digital ones, and he was likely working with an 8- or 16-bit processor, so wide integrators were perhaps problematic from a performance standpoint. But he didn't go this route for his single-pole HP/LP, so I don't totally get his thinking here.
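For reference, the Chamberlin SVF per-sample recurrence (as I understand it, with the tuning gain f sitting ahead of each of the two integrators, and q as the damping gain):

```python
import math

def chamberlin_svf(x, fc, q, fs=48000.0):
    f = 2.0 * math.sin(math.pi * fc / fs)  # tuning coefficient
    lp = bp = 0.0
    out = []
    for s in x:
        lp += f * bp            # gain reduction -> integrator 1 (LP output)
        hp = s - lp - q * bp    # damping is another gain reduction
        bp += f * hp            # gain reduction -> integrator 2 (BP output)
        out.append((lp, bp, hp))
    return out
```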
"Btw, I have checked, the one I'm using isn't from Chamberlin, it's an approximation to it, by J.Dattorro. It can be tuned only to < 1/4 fs, which I forgot about because the project I took it out of uses oversampling all the way and it was never a problem, but I got just reminded of that fact ;)"
I also limit my filters and oscillators to 1/4 Fs. For one thing, who can really hear 12 kHz all that well? For another, going above this causes all kinds of aliasing problems with oscillators, and stability problems with filters. You never want positive feedback around any integrator in a filter, but you'll get it with high damping and a high corner frequency, and guarding against instability with up-front parameter math can be a complicated thing (most stability criteria as implemented stay quite far away from actual instability, because they are set for a single worst case).
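The up-front guard can be as blunt as clamping the tuning before computing the coefficient; a sketch of what I mean (conservative by design, since the real stability boundary moves with both f and the damping):

```python
import math

def svf_tuning(fc, fs):
    # Single worst-case guard: clamp the corner to Fs/4 before computing
    # the Chamberlin coefficient. Crude, but it never has to know q.
    fc = min(fc, 0.25 * fs)
    return 2.0 * math.sin(math.pi * fc / fs)
```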
"Your wave sounds pretty clean, despite a lack of deliberate regards to band-limiting (?). It reminds me of that venerable speech synth singing "daisy, daisy..."."
No band-limiting at the source, and it's generated at Fs (48 kHz). I find that I'm always setting the glottal wave too "buzzy" during adjustment - I suppose it adds to the clarity, and it helps my older ears hear the formant details in the mix. But once my ears have had a rest, it sounds too harsh. BTW, the wave looks very much like your LP sawtooth! Mutating a sine can be incredibly effective and, if done as gently as possible so you get only the harmonic levels you need, can yield lower aliasing.
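One way to do the gentle mutation with exact control of the harmonic levels (my illustration, not necessarily how either of us does it) is Chebyshev shaping, since T_n(cos θ) = cos(n θ):

```python
import math

def shaped(theta, levels):
    # Mixing Chebyshev polynomials of a cosine sets each harmonic's level
    # exactly -- no energy above the highest harmonic you ask for, hence
    # less to alias.
    c = math.cos(theta)
    t_prev, t_cur = 1.0, c                  # T0, T1
    out = 0.0
    for amp in levels:                      # levels[k] = level of harmonic k+1
        out += amp * t_cur
        t_prev, t_cur = t_cur, 2.0 * c * t_cur - t_prev
    return out

# A slightly "buzzy" near-sine: fundamental plus quiet 2nd/3rd harmonics.
wave = [shaped(2 * math.pi * n / 64, [1.0, 0.08, 0.04]) for n in range(64)]
```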
"As for vocal vs. friction sounds. I wonder why classical speech synth setups of this category tend to use an own resonator path for the noise shaping. After all, there is only one vocal tract, not two that are mixed together. Just that the different sound sources are located in different places within it."
I've noticed this too, and I think it's because the turbulence is often caused by the tongue (which isn't influenced by the nose or throat much), or by the lips (ditto), or by the nose opening (external to the body). A sudden intake of breath is mostly a mouth resonance. Deep breath sounds can use the vocal tract. In Perry Cook's thesis, noise can be injected at several points in the tract, and this seems like it might be the best approach? I don't know why one would leave the tract itself entirely out of the equation, unless it is to remove those resonances that don't occur in real life. But why not just modulate those resonances instead?