How exactly is it all about his hardware? If I remember correctly, he said in an interview that he was working with a single Titan Xp (12 GB VRAM), which seems pretty similar to the P100 on Colab if you ask me. Of course it's much better if you can work with that locally, but still: am I missing something? Is that interview outdated, or is Colab limiting in some other way? I've had no issues so far and my results keep improving.
His technique seems to play a bigger role to me. He picks scenes and actors that work well together, etc. And his training doesn't seem to be all that basic either. He does pre-training in a manner that is (according to him) unusual, switches things up 2-3 times during training, and only goes up to about 100k iterations, while people here get much worse results with well over that many.