"... lost time is never found again." - Benjamin Franklin
Weird sorta bug in the Intel / Altera Quartus FPGA toolset that I had the misfortune of stumbling across. I was adding some intermediate registering to the SPDIF component flag decoding in order to speed it up, and for the first time took a really good look at the failing timing paths associated with it. The component by itself could fairly easily hit the target clock rate of ~200MHz with comfortable (>12%) margin, but when instantiated in the full FPGA load (Hive processor with all of the peripherals) it would often (depending on the build seed number) fail timing closure.
Timing paths go from the clock source, to the launch register, through asynchronous logic and wires, to the latch register. The worst paths in the timing report were launched from a prescaling counter in the SPDIF component, and latched at the NCO divider (modulo counter) in the pitch and volume axes LC_DPLLs - registers which are internal to the components themselves and not at all brought out to a port, much less connected!
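To put rough numbers on what a failing path means, here is a minimal sketch of the usual setup slack arithmetic for a launch-to-latch path. All of the delay values are made up for illustration (they are not from my timing report), and the function name and parameters are hypothetical:

```python
def setup_slack_ns(clock_mhz, clk_to_q_ns, logic_ns, routing_ns, setup_ns, skew_ns=0.0):
    """Setup slack for one register-to-register path, in nanoseconds.

    Positive slack means the path meets timing; negative means it fails.
    """
    period_ns = 1000.0 / clock_mhz
    # Time at which data arrives at the latch register after the launch edge.
    data_arrival_ns = clk_to_q_ns + logic_ns + routing_ns
    # Latest time the data may arrive and still be captured on the next edge.
    data_required_ns = period_ns + skew_ns - setup_ns
    return data_required_ns - data_arrival_ns


# Hypothetical numbers: a short path inside one component versus a long wire
# between two components placed far apart on the die.
local_path = setup_slack_ns(200, clk_to_q_ns=0.3, logic_ns=1.5, routing_ns=1.0, setup_ns=0.1)
cross_chip_path = setup_slack_ns(200, clk_to_q_ns=0.3, logic_ns=1.5, routing_ns=3.4, setup_ns=0.1)

print("local path meets timing:", local_path > 0)
print("cross-chip path meets timing:", cross_chip_path > 0)
```

At 200MHz the whole budget is 5ns, so a path that wanders across the chip can eat it up in routing delay alone.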
So I checked, double-checked, and triple-checked the explicit and implicit port connections, but found nothing that could account for this. Then I spent a lot of time trying to get observability into the connection via the various views supplied by the tools - the best I found was by doing a "Report Timing... (In TimeQuest UI)" then right-clicking on the path in TimeQuest and doing a "Locate Path..." in the "Technology Map Viewer", which brings up an auto-generated schematic view of the logic as built by synthesis and place & route. It's a rather tricky viewer to navigate in, and the component port names can be kinda cryptic, but it showed a direct wire going from the prescaler LSb register to the NCO divider LSb register.
Thinking it was a bug in the tool that might be fixed in a later version of Quartus, I searched for the latest version that supports Windows XP, and nothing would tell me that, not even the Intel / Altera web page where you download them (!) - but by examining the various chicken entrails and tea leaves, it seems v13.1 is the latest. Downloaded that and the latest patch, installed and applied the patch, but it generated the same failing paths as the older version I was using (v10.1sp1). Yarg.
Synthesis of the SystemVerilog code performs a variety of optimizations on the final logical blob that gets presented to the fitter. These optimizations include the reduction / elimination of redundant asynchronous logic via Karnaugh mapping techniques and the like, and the removal of identical registers, all followed by a mapping to the logic available in the target device. I realized this morning that the LSb of the NCO divider was always driven by a '1' which makes it toggle at 1/2 the clock rate, and this is the same behavior the SPDIF prescaler LSb performs, so synthesis was mushing them together, causing a timing bottleneck. Why synthesis is doing this is unknown to me, as the "Timing-Driven Synthesis" option is checked, so one would think that it really ought to know better. And the fitter settings let it add redundant registers to reduce fan-out and speed things up, so there's not a lot of excuse here for why this is happening.
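A quick way to see why the two registers look identical to synthesis is to model them. This is only a sketch under assumptions: the prescaler is modeled as a plain +1 counter, the NCO divider as an accumulator whose increment has its LSb fixed at '1', both come out of reset at zero, and the widths and step value are invented for the example:

```python
def lsb_sequence(init, step, cycles, width=8):
    """Return the LSb of a free-running counter sampled on each clock cycle."""
    mask = (1 << width) - 1
    count = init & mask
    bits = []
    for _ in range(cycles):
        bits.append(count & 1)
        count = (count + step) & mask  # wrap like a fixed-width register
    return bits


# Hypothetical stand-ins for the two registers in question.
prescaler_lsb = lsb_sequence(init=0, step=1, cycles=16)          # counts by 1
nco_lsb = lsb_sequence(init=0, step=0x35, cycles=16, width=12)   # odd increment

# Adding an odd number flips the LSb every cycle, whatever the upper bits do,
# so both LSbs are the same divide-by-2 toggle with the same starting value.
print(prescaler_lsb == nco_lsb)
```

Same reset value, same next-state behavior, same clock: from the tool's point of view that's one register with two names, never mind that the two live in different components.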
The solution? Preset rather than reset the prescale count at initialization! With a different starting value the prescaler LSb is no longer a duplicate of the NCO divider LSb, so synthesis has nothing to merge. A one character change in the code after a day-and-a-half snipe hunt. This rather insidious tooling issue has caused me to lose a bit of confidence in the FPGA build process, not to mention causing me to lose a bit of time (and sleep).
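Here's a tiny model of why the initial value matters, assuming "preset" means the prescaler LSb comes up as 1 while the NCO divider LSb still comes up as 0 (the function name is hypothetical, and this only models the LSb toggle, not the full counters):

```python
def toggle_register(init, cycles):
    """Model a register that inverts itself on every clock, starting at init."""
    q = init & 1
    history = []
    for _ in range(cycles):
        history.append(q)
        q ^= 1  # flip on each clock edge
    return history


nco_lsb = toggle_register(init=0, cycles=8)           # NCO divider LSb, reset to 0
prescaler_reset = toggle_register(init=0, cycles=8)   # original code: reset to 0
prescaler_preset = toggle_register(init=1, cycles=8)  # the fix: preset to 1

print("reset version duplicates the NCO LSb: ", prescaler_reset == nco_lsb)
print("preset version duplicates the NCO LSb:", prescaler_preset == nco_lsb)
```

The preset version is the exact inverse of the NCO LSb on every cycle, which is apparently enough to keep the duplicate register removal from treating them as the same thing.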
Speaking of time, my personal odometer hit 60 yesterday - surreal. Beats the alternative, I suppose.