A few things I can think of:
1. Try to create explicit scopes in your kernel by using additional curly braces. This may reduce the number of VGPRs required at any point, because variables go out of scope as soon as they are no longer needed.
2. If possible, reuse variables and keep the number of variables to a minimum.
3. Check whether some variables can be made const. They will probably get optimized away.
4. You can even consider keeping some variables in LDS (local memory) in case your register pressure is very high. Keep the LDS throughput and bank conflict issues in mind, though.
5. You can check the optimization options available when building the kernel; they may also help reduce VGPR usage.
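
As a rough illustration of points 1 to 3, here is a small sketch. It is written as plain C so it can be compiled on a host; the function name process_item and its parameters are made up for the example, and in a real kernel the index would come from get_global_id(0). Whether the extra braces actually lower the VGPR count depends on the compiler, which may already track variable lifetimes on its own, so compare the reported register usage before and after.

```c
#include <stddef.h>

/* Hypothetical per-work-item body (an assumption for illustration only).
   In an OpenCL kernel, `in` would be a __global pointer and `i` would be
   obtained from get_global_id(0). */
static float process_item(const float *in, size_t i, float gain)
{
    float acc = 0.0f;   /* the only value that lives across both stages */

    {   /* Stage 1: temporaries are confined to this block. */
        const float a = in[i] * gain;   /* const: never modified after init */
        const float b = a * a;
        acc += b;
    }   /* a and b are out of scope from here on */

    {   /* Stage 2: a fresh, short-lived temporary. */
        const float c = in[i] + 1.0f;
        acc += c;
    }

    return acc;
}
```

The idea is that only acc has to stay live across both stages, while a, b and c each exist for a few statements only.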
Someone else can probably add more.