*The GPU implements the reciprocal function in hardware, but with reduced accuracy. So for a true 1/x (or a division) you must add a few MADs (multiply-adds) to refine the result.
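As a rough illustration of what "a few MADs" can look like, here is a minimal sketch of one Newton-Raphson refinement step on the CPU. The post does not say which refinement scheme is meant, so Newton-Raphson is an assumption. The function names are hypothetical, and `approx_rcp` only imitates a reduced-accuracy hardware reciprocal by discarding the low mantissa bits of a correct result.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a reduced-accuracy hardware reciprocal:
// computes 1/x and then zeroes the low 12 bits of the mantissa.
inline float approx_rcp(float x) {
    float r = 1.0f / x;
    std::uint32_t bits;
    std::memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFF000u;
    std::memcpy(&r, &bits, sizeof bits);
    return r;
}

// One Newton-Raphson step, r' = r + r * (1 - x * r),
// written as two fused multiply-adds.
inline float refine_rcp(float x, float r) {
    float e = std::fma(-x, r, 1.0f);  // residual e = 1 - x*r
    return std::fma(r, e, r);         // r + r*e
}

// Reduced-accuracy estimate followed by one refinement step.
inline float accurate_rcp(float x) {
    return refine_rcp(x, approx_rcp(x));
}

// Division expressed as a multiply by the refined reciprocal.
inline float divide(float a, float b) {
    return a * accurate_rcp(b);
}
```

Each step roughly doubles the number of correct bits, so how many steps you need depends on how accurate the initial estimate is. Special inputs (zero, infinity, denormals) are not handled here.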
More advanced functions such as sin, cos, exp, log and so on must be computed from those basic operations, and in this case the CPU has a huge advantage. You can look at the CAL++ sources for implementations of a few of these functions.
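To give a feel for the cost, here is a minimal sketch of a sine built only from multiplies and multiply-adds. This is not the CAL++ code. The name `poly_sin`, the simple range reduction, and the Taylor coefficients are all assumptions chosen for illustration, and the result is only reasonably accurate for moderate |x|.

```cpp
#include <cmath>

// Illustrative sine from basic operations only (hypothetical helper,
// not taken from CAL++).
inline float poly_sin(float x) {
    const float pi = 3.14159265358979f;

    // Range reduction: write x = k*pi + r with r in [-pi/2, pi/2].
    float k = std::round(x * (1.0f / pi));
    float r = std::fma(-k, pi, x);

    // Odd Taylor polynomial up to r^11, evaluated in Horner form.
    float r2 = r * r;
    float p = -1.0f / 39916800.0f;            // -1/11!
    p = std::fma(p, r2, 1.0f / 362880.0f);    // +1/9!
    p = std::fma(p, r2, -1.0f / 5040.0f);     // -1/7!
    p = std::fma(p, r2, 1.0f / 120.0f);       // +1/5!
    p = std::fma(p, r2, -1.0f / 6.0f);        // -1/3!
    p = std::fma(p, r2, 1.0f);
    float s = r * p;

    // sin(k*pi + r) = (-1)^k * sin(r)
    bool odd = std::fmod(k, 2.0f) != 0.0f;
    return odd ? -s : s;
}
```

Even this simplified version needs about ten dependent arithmetic operations per call, and a production version would also need a more careful range reduction for large arguments.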