SubQ 1.1 Small trims inference cost with sub-quadratic scaling

SubQ 1.1 Small promises faster, leaner neural‑net inference on constrained devices.

The report details a redesign that reduces algorithmic complexity from quadratic to sub-quadratic with respect to model size. Benchmarks on ARM Cortex-A78 cores show up to 30% lower latency and half the memory footprint compared with the previous 1.0 release. The changes centre on a re-engineered attention mechanism and tighter quantisation, while keeping ImageNet accuracy within 0.5% of the original architecture's.
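To make the scaling claim concrete, here is a minimal Python sketch of how cost grows when the problem size doubles under a quadratic curve versus one possible sub-quadratic curve. The report does not say which sub-quadratic form SubQ achieves, so `n * log2(n)` is an assumption chosen purely for illustration, and the function names are hypothetical rather than taken from SubQ.

```python
import math


def quadratic_cost(n: int) -> float:
    """Abstract cost that grows with the square of the size n."""
    return float(n * n)


def subquadratic_cost(n: int) -> float:
    """Abstract cost n * log2(n): an ASSUMED example of sub-quadratic
    growth, not the curve stated in the report."""
    return n * math.log2(n)


def growth_factor(cost, n: int, scale: int = 2) -> float:
    """How many times more expensive the workload gets when n is
    multiplied by `scale`."""
    return cost(n * scale) / cost(n)


if __name__ == "__main__":
    for n in (256, 1024, 4096):
        quad = growth_factor(quadratic_cost, n)
        subq = growth_factor(subquadratic_cost, n)
        print(f"n={n}: quadratic x{quad:.2f}, n*log2(n) x{subq:.2f}")
```

Under these assumptions, doubling the size always quadruples the quadratic cost, while the `n * log2(n)` cost grows by a little more than a factor of two. That gap is what would let a larger model fit in the same compute budget if the claim holds.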

For edge developers, the improvement means longer battery life and the ability to run larger models on the same silicon. In a market where edge-AI players such as Edge Impulse and NVIDIA's Jetson Nano are pushing raw performance, SubQ's angle is efficiency at a comparable accuracy level. If the sub-quadratic claim holds across more workloads, it could shift the cost curve for on-device AI deployments.

The report is a technical deep-dive rather than a marketing brochure, but its language hints at positioning against models that rely on sheer compute power. It remains to be seen whether the gains hold beyond the specific benchmarks used.
