As an exercise, I've written a fully connected neural network in C++ and am testing it on MNIST. The network is 784x512x512x10 (784 inputs, two hidden layers of 512 units each, 10 outputs).
I'm looking for the fastest reasonable CPU to run this on.
I've tried an i9-9900K with 4x 16 GB RAM:
1 thread: 3.8k training samples/second
15 threads...