Looks like Google have released some details of their second-generation TPU: they put 4 chips on a board, call that a Cloud TPU, rated at 180 teraflops, and then connect 64 boards in an 8x8 torus, all of which fills a pair of double-width racks. That assembly is called a TPU pod (possibly two more adjacent double-width racks are also a necessary part of the pod). They're making it all available as a service - sign up now if you have an appropriate research project. Most important, perhaps, is that this new TPU is supposedly good for both inference and training - which probably means it does at least 16-bit arithmetic.
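For scale, here's a back-of-envelope tally of those numbers (a sketch only; the per-chip figure is just the board rating divided by four, which Google hasn't confirmed):

[code]
# Back-of-envelope arithmetic from the figures quoted above.
# Assumption: the 180-teraflop rating applies to one 4-chip Cloud TPU board.

CHIPS_PER_BOARD = 4
TFLOPS_PER_BOARD = 180
BOARDS_PER_POD = 64  # connected in an 8x8 torus

tflops_per_chip = TFLOPS_PER_BOARD / CHIPS_PER_BOARD
pod_tflops = TFLOPS_PER_BOARD * BOARDS_PER_POD

print(f"Per chip: {tflops_per_chip:.0f} TFLOPS")    # 45 TFLOPS
print(f"Per pod:  {pod_tflops / 1000:.2f} PFLOPS")  # 11.52 PFLOPS
[/code]

That works out to roughly 11.5 petaflops per pod.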
https://blog.google/topics/google-cloud ... -learning/
More photos
https://www.tensorflow.org/tfrc/
All that said, no detail about the internal architecture.
"Using these TPU pods, we've already seen dramatic improvements in training times. One of our new large-scale translation models used to take a full day to train on 32 of the best commercially-available GPUs—now it trains to the same accuracy in an afternoon using just one eighth of a TPU pod."