
Build Your Local Coding Copilot with AMD Radeon GPU Platform

AMD_AI, Staff

Generative AI is changing the way software engineers work today. Did you know that you can build your own coding Copilot locally, using just an AMD Radeon™ graphics card? That's right. AMD provides powerful large-model inference acceleration through the latest AMD RDNA™ architecture, which powers not only cutting-edge gaming but also high-performance AI experiences. With the help of the open software development platform AMD ROCm™, software developers can now run GPT-like code generation on their desktop machines. This blog shows you how to build a personal coding Copilot with a Radeon graphics card, Continue (an open-source extension for VSCode and JetBrains that lets developers assemble their own modular AI software development system), LM Studio, and the latest open-source large model, Llama3.

 

Here is the recipe to set up the environment:

| Item | Version | Role | URL |
| --- | --- | --- | --- |
| Windows | Windows 11 | Host OS | |
| VSCode | | Integrated development environment | |
| Continue | | Copilot extension | https://www.continue.dev/ |
| LM Studio | v0.2.20 ROCm (supports Llama3) | LLM inference server | https://lmstudio.ai/rocm |
| AMD Radeon 7000 Series | | LLM inference accelerator | |

In this implementation, LM Studio is used to deploy Llama3-8B as an inference server, and the Continue extension, connected to the LM Studio server, acts as the copilot client in VSCode.

 

AMD_AI_0-1718087026052.png

A Brief Structure of the Coding Copilot System

 

The latest version of LM Studio ROCm, v0.2.22, supports AMD Radeon 7000 Series graphics cards (gfx1030/gfx1100/gfx1101/gfx1102) and has added Llama3 to its support list. It also runs other state-of-the-art LLMs, such as Mistral, with excellent performance on AMD ROCm.

 

AMD_AI_1-1718087152972.png

 

Step 1: Follow Experience Meta Llama 3 with AMD Ryzen™ AI and Radeon™ 7000 Series Graphics to set up LM Studio with Llama3.

 

In addition to working as a standalone chatbot, LM Studio can also act as an inference server. As shown in the picture below, with an LLM (e.g. Llama3-8B) selected, one click on the Local Inference Server button on the left-hand side of the LM Studio user interface launches an OpenAI-compatible HTTP inference service. The default address is http://localhost:1234.

 

AMD_AI_2-1718087215574.png

 

You may use curl to verify the service from PowerShell, as shown below.

AMD_AI_3-1718087254054.png
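For reference, here is a minimal sketch of such a request, assuming the default address http://localhost:1234 and LM Studio's OpenAI-compatible /v1/chat/completions endpoint; the model name is only a placeholder, since the server typically answers with whichever model is currently loaded.

```bash
# Quick sanity check against the local LM Studio server.
# From PowerShell, invoke curl.exe explicitly so the Invoke-WebRequest
# alias for "curl" is not picked up instead.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3-8b",
    "messages": [{ "role": "user", "content": "Write a hello world program in Python." }],
    "temperature": 0.7
  }'
```

A JSON response containing a `choices` array with the model's reply indicates the service is up.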

 

Step 2: Set up Continue in VSCode

Search for and install the Continue extension in VSCode.

AMD_AI_4-1718087308599.png

 

You will find that Continue works with LM Studio and other inference frameworks.

AMD_AI_5-1718087336222.png

 

Refer to https://continuedev.netlify.app/model-setup/configuration to modify Continue's config.json and set LM Studio as the default model provider. Open config.json and add the contents highlighted in the picture below; a sketch of the entry follows.

AMD_AI_6-1718087409827.png
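For reference, here is a minimal sketch of what that entry can look like, assuming Continue's lmstudio provider; the title, model, and apiBase values are illustrative and should be adjusted to match the model loaded in LM Studio.

```json
{
  "models": [
    {
      "title": "LM Studio (Llama3-8B)",
      "provider": "lmstudio",
      "model": "llama3-8b",
      "apiBase": "http://localhost:1234/v1"
    }
  ]
}
```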

 

Then choose LM Studio as the copilot backend (at the lower-left corner of the UI), and you can chat with Llama3 through Continue in VSCode.

AMD_AI_7-1718087434596.png

 

Continue provides a button to copy code from the chat into your code file.

AMD_AI_8-1718087458455.png

 

Right-click in the code editing window to bring up Continue's quick menu.

AMD_AI_9-1718087484019.png

 

At this point, automatic coding with the Llama3 model served by LM Studio through Continue is up and running. Continue lets users select the right AI model for the task, whether open-source or commercial, running locally or remotely, and used for chat, autocomplete, or embeddings. You can find more on its usage at https://continuedev.netlify.app/intro/.
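As a rough illustration of those roles, Continue's config.json can declare separate models for chat, autocomplete, and embeddings. The sketch below assumes the tabAutocompleteModel and embeddingsProvider keys from Continue's configuration schema; the titles and model names are illustrative, and transformers.js is Continue's bundled local embeddings option.

```json
{
  "models": [
    {
      "title": "LM Studio (chat)",
      "provider": "lmstudio",
      "model": "llama3-8b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "LM Studio (autocomplete)",
    "provider": "lmstudio",
    "model": "llama3-8b"
  },
  "embeddingsProvider": {
    "provider": "transformers.js"
  }
}
```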

 

Now you have your own AI Copilot running on an AMD Radeon graphics card. This is a simple, easy-to-use setup for individual developers, especially those who do not yet have access to cloud instances for large-scale AI inference.

 

The AMD ROCm open ecosystem is developing rapidly, with the latest LLMs supported on AMD GPUs and excellent software applications such as LM Studio. If you need more information on AMD AI acceleration solutions and developer ecosystem plans, please email amd_ai_mkt@amd.com.