Added "ctmulq" implementation of Poly1305 (using 64x64->128-bit multiplications when the platform provides them).
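
For readers unfamiliar with the primitive, here is a minimal sketch of a multiplication of two 64-bit operands that yields the full 128-bit product. It is not taken from this change: the names `u128_parts` and `mul64x64` are hypothetical, and the `__SIZEOF_INT128__` check assumes a GCC- or Clang-style compiler. The fallback branch shows the same result assembled from 32-bit halves when no 128-bit type exists.

```c
#include <stdint.h>

/* Hypothetical type for illustration: a 128-bit value as two 64-bit halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_parts;

/* Hypothetical helper: full 128-bit product of two 64-bit operands. */
static u128_parts mul64x64(uint64_t a, uint64_t b)
{
    u128_parts r;
#if defined(__SIZEOF_INT128__)
    /* Assumption: the compiler offers unsigned __int128 (GCC, Clang). */
    unsigned __int128 p = (unsigned __int128)a * b;
    r.lo = (uint64_t)p;
    r.hi = (uint64_t)(p >> 64);
#else
    /* Portable fallback: four 32x32->64 partial products. */
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;
    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;
    /* Sum of the three terms that land in bits 32..63; cannot overflow. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;
    r.lo = (mid << 32) | (uint32_t)p00;
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
#endif
    return r;
}
```

Whether such a multiplication runs in constant time depends on the target CPU and compiler, so the sketch makes no claim on that point.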