

With GPT, Raspberry Pi becomes more advanced

Latest update time:2024-06-24
Is there any field left untouched by AI? Since ChatGPT became popular, the battle of large models has been raging, and this time the war has spread to the Raspberry Pi as well.

A while ago, Raspberry Pi officially went public. Although the listing did not attract much attention, it shows that the Raspberry Pi has remained popular among developers. In the past, many people regarded the Raspberry Pi, an SBC, as a "small toy" for development, while others used industrial Raspberry Pi boards for real productivity. Now GPT is here, and it has completely changed the Raspberry Pi.

Wang Zhaonan, Fu Bin | Author
Electronic Engineering World (ID: EEworldbbs) | Produced

Put GPT on the Raspberry Pi


In March 2023, people began trying to put GPT on the Raspberry Pi. A software developer named Georgi Gerganov created a tool called "llama.cpp" that can run Meta's GPT-3-class large language model LLaMA locally on a Mac laptop. Others have since run it successfully on the Raspberry Pi, although it runs very slowly.

At that time, researchers at Stanford University were also working on this technique. The Stanford open-source project Alpaca shared its implementation: 52,000 training examples were generated by calling the text-davinci-003 model through the OpenAI API, and that 52K corpus was used to fine-tune the LLaMA 7B model. The fine-tuning took less than 3 hours on 8 A100 graphics cards and ultimately matched the performance of OpenAI's text-davinci-003, at a total cost of under $600.

In February 2024, makers launched a project to run an LLM entirely locally. Called "World's Easiest GPT-like Voice Assistant", the project implements a GPT-style voice service that runs completely on the device, with no network connection. The approach: take a Raspberry Pi, such as an RPi 4, and attach a microphone and speaker as the input and output for voice interaction; install Whisper to convert speech picked up by the microphone into text and feed that text to the LLM; the LLM runs inference on the input and produces a text result; finally, another piece of installed software, eSpeak, converts that text into speech, which is played back through the speaker.
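The article gives no code for this pipeline, but the flow of one interaction turn can be sketched as follows. The `transcribe` and `infer` functions are hypothetical stand-ins for the real Whisper and LLM calls; only the eSpeak invocation matches the tools named above:

```python
import shutil
import subprocess


def transcribe(audio_path: str) -> str:
    """Stand-in for Whisper: turn recorded speech into text.
    (The real project runs the Whisper model here.)"""
    return "What is a Raspberry Pi?"


def infer(question: str) -> str:
    """Stand-in for the local LLM inference step."""
    return "A credit-card-sized computer that dreams big."


def speak(text: str) -> None:
    """Read text aloud with eSpeak if it is installed, else print it."""
    if shutil.which("espeak"):
        subprocess.run(["espeak", text], check=False)
    else:
        print(f"[speech] {text}")


def assistant_turn(audio_path: str) -> str:
    """One fully local turn: speech -> text -> LLM -> speech."""
    question = transcribe(audio_path)
    answer = infer(question)
    speak(answer)
    return answer
```

The point is the shape of the loop, not the stubs: every stage runs on the Pi itself, so no network round trip is ever needed.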

Recently, the Animism Team at Beijing University of Posts and Telecommunications also announced that it had successfully run a GPT-class model on the Raspberry Pi.

After quantizing and shrinking a GPT-3-class large model, it can be squeezed into the Raspberry Pi, just like "putting an elephant into a refrigerator". LLaMA-7B, with its 7 billion parameters, needs at least 8GB to 12GB of memory. That is a heavy burden for devices with weak computing power, whose memory capacity is often only 8GB or even 4GB; merely loading the model would shut them out of GPT, to say nothing of storing the intermediate results of model execution.
Model quantization compresses the model weights. If a larger model is wanted, or device memory is even more limited, quantization alone cannot support deployment. In that case, the model can be further split and migrated to a cluster of devices for execution, so that multiple devices share the huge memory overhead.
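As a rough illustration of the idea (this is plain symmetric 8-bit quantization, not the more elaborate scheme a real deployment such as GGUF Q4_K_M uses), each weight is stored as one byte plus a shared scale factor instead of four bytes:

```python
def quantize_int8(weights):
    """Symmetric 8-bit quantization: map each float weight onto [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale


def dequantize_int8(q_weights, scale):
    """Recover approximate float weights from the 8-bit values."""
    return [q * scale for q in q_weights]


weights = [0.62, -1.27, 0.003, 0.9]   # toy float32 weights
q, scale = quantize_int8(weights)     # 1 byte per weight instead of 4
approx = dequantize_int8(q, scale)    # close to, but not equal to, the originals
```

The 4x memory cut is what lets a 7B-parameter model fit where the full float32 weights never could; the price is the small rounding error visible in `approx`.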

As the model initializes and the weights are gradually loaded, a minimal intelligent agent with cognitive ability comes to life. When asked whether it knows Beijing University of Posts and Telecommunications, it thinks for a moment and then begins to introduce the university word by word. Although the response is slow, it successfully completes the entire inference process.


Super cool rover robot


A Raspberry Pi rover robot named Floyd has become very chatty thanks to its integration with ChatGPT.

Image source: Larry's Workbench

Floyd is the work of YouTube blogger Larry and uses a Raspberry Pi 4B as its main control board, assisted by a HAT (Hardware Attached on Top) that handles some of the external components, such as the servos operating the wheels and arm. The body of the robot appears to be made of metal, with the hardware completely exposed and mounted on the outside.

As far as ChatGPT bots go, Floyd has quite a few body parts to work with: it can move around on a set of wheels and even has a movable arm. And thanks to the ChatGPT integration, Floyd has been given the ability to speak. With a microphone and speaker, Floyd handles voice-to-text and text-to-speech interactions and gives customized responses on the spot.


Make another desktop robot


Maker David Packman also built a Raspberry Pi-based robot, BMO-AI, whose appearance was inspired by the robot BMO from the animated series Adventure Time. It has powerful interactive functions: offline wake-word detection, chatting through ChatGPT 3.5, and machine-vision-based image analysis and description.


BMO-AI uses a Raspberry Pi 3B+ running Raspbian Bullseye and an Adafruit CRICKIT HAT for servo motor control and for reading its button array. It has four points of articulation and a 5-inch display that shows facial expressions and images via the Pygame library. BMO also uses several other libraries to implement the following AI features:
  • Speech-to-text and text-to-speech, with offline wake-word detection, using Azure Speech Services.
  • Single-turn question answering using OpenAI ChatGPT 3.5 Turbo.
  • Multi-turn, context-aware chat using OpenAI ChatGPT 3.5 Turbo Chat.
  • Image analysis and captioning using the Azure Computer Vision services.
  • Image capture and sharing with the Raspberry Pi Camera Module v3.
  • A painting mode that uses Stable Diffusion to paint pictures from your spoken descriptions.
  • A DALL-E mode that creates an image from a description generated by ChatGPT 3.5.


Speech Recognition on Raspberry Pi


At present, most large models run on cloud servers, and terminal devices get responses by calling APIs. If the project team shuts down in a few years and the API is closed, the smart hardware users spent real money on may turn into a brick. That is why running completely offline has always been users' biggest concern.

Foreign netizens have successfully run speech recognition and a LLaMA-2 GPT on a Raspberry Pi 4 with a 128x64 I2C monochrome OLED display. The display does not need to be soldered; just enable the I2C interface in the Raspberry Pi settings.

Large Language Model

Now add the large language model. First, install the required libraries:

pip3 install llama-cpp-python
pip3 install huggingface-hub sentence-transformers langchain

Before using the LLM, you need to download it with the huggingface-cli tool:

huggingface-cli download TheBloke/Llama-2-7b-Chat-GGUF llama-2-7b-chat.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
OR
huggingface-cli download TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False

The Llama-2-7b-Chat-GGUF and TinyLlama-1.1B-Chat-v1.0-GGUF models were used. Smaller models run faster, but larger models may provide better results.
After downloading the model, use it:

from typing import Any, Dict, List, Optional

from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
from langchain.schema.output_parser import StrOutputParser

llm: Optional[LlamaCpp] = None
callback_manager: Any = None

model_file = "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"  # OR "llama-2-7b-chat.Q4_K_M.gguf"

template_tiny = """<|system|>
You are a smart mini computer named Raspberry Pi.
Write a short but funny answer.</s>
<|user|>
{question}</s>
<|assistant|>
"""

template_llama = """<s>[INST] <<SYS>>
You are a smart mini computer named Raspberry Pi.
Write a short but funny answer.
<</SYS>>
{question} [/INST]"""

template = template_tiny


def llm_init():
    """ Load large language model """
    global llm, callback_manager

    callback_manager = CallbackManager([StreamingCustomCallbackHandler()])
    llm = LlamaCpp(
        model_path=model_file,
        temperature=0.1,
        n_gpu_layers=0,
        n_batch=256,
        callback_manager=callback_manager,
        verbose=True,
    )


def llm_start(question: str):
    """ Ask LLM a question """
    global llm, template

    prompt = PromptTemplate(template=template, input_variables=["question"])
    chain = prompt | llm | StrOutputParser()
    chain.invoke({"question": question}, config={})

Using this model is straightforward, but here comes the next step: we need to stream the answer on the OLED screen. To do this, we will use a custom callback that will be executed every time the LLM generates a new token:

class StreamingCustomCallbackHandler(StreamingStdOutCallbackHandler):
    """ Callback handler for LLM streaming """

    def on_llm_start(self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any) -> None:
        """ Run when LLM starts running """
        print("<LLM Started>")

    def on_llm_end(self, response: Any, **kwargs: Any) -> None:
        """ Run when LLM ends running """
        print("<LLM Ended>")

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """ Run on new LLM token. Only available when streaming is enabled """
        print(f"{token}", end="")
        add_display_tokens(token)


Test

Finally, we combine all the parts. The code is simple:

if __name__ == "__main__":
    add_display_line("Init automatic speech recognition...")
    asr_init()

    add_display_line("Init LLaMA GPT...")
    llm_init()

    while True:
        # Q-A loop:
        add_display_line("Start speaking")
        add_display_line("")
        question = transcribe_mic(chunk_length_s=5.0)
        if len(question) > 0:
            add_display_tokens(f"> {question}")
            add_display_line("")

            llm_start(question)

Here, the Raspberry Pi records audio for 5 seconds, the speech recognition model converts the audio into text, and the recognized text is sent to the LLM; when the answer finishes, the process repeats. This approach could be improved, for example with automatic audio-level thresholding, but for a weekend demo it is good enough.
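The automatic audio-level thresholding mentioned above could be sketched as a simple RMS gate; the frame layout and threshold value here are illustrative assumptions, not part of the original project:

```python
def rms(frame):
    """Root-mean-square level of one frame of audio samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5


def trim_to_speech(frames, threshold=500.0):
    """Drop leading and trailing frames whose level is below the threshold,
    so the recording adapts to the speaker instead of a fixed 5 seconds."""
    loud = [i for i, frame in enumerate(frames) if rms(frame) >= threshold]
    if not loud:
        return []  # nothing above the gate: treat as silence
    return frames[loud[0]:loud[-1] + 1]


silence, speech = [0, 0, 0, 0], [900, -900, 800, -800]
frames = [silence, speech, speech, silence]
kept = trim_to_speech(frames)  # only the two speech frames survive
```

In a real recorder the gate would run on the live microphone stream, ending capture after a run of quiet frames rather than after a fixed timer.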

After running successfully on the Raspberry Pi, the output is as follows:


AI Polaroid Photo Implementation Using GPT+Raspberry Pi


Two designers, Kelin and Ryan, created a new AI species, the Poetry Camera.

Image source: Poetry Camera

The hardware is a Raspberry Pi Zero 2 W, a 12-megapixel Raspberry Pi Camera Module 3, a thermal printer, and some batteries, cables, and a memory card. Kelin and Ryan will also publish detailed build instructions for the Poetry Camera's hardware and software on GitHub, which means that anyone with the materials can create their own Poetry Camera.

The Poetry Camera is used just like any other camera: just press the shutter. The difference is that the Poetry Camera only produces poems, not photos. Let's first look at the technical logic of the Poetry Camera:

When the "shutter" button is pressed, the Poetry Camera transmits the photo taken by the camera to ChatGPT, which recognizes key information in the photo, such as colors, shapes, and objects, and then automatically generates a poem based on that visual data.
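The article does not show the Poetry Camera's code. As a sketch, the capture-to-poem step might package the photo into a vision-style chat request like this (the model name is a placeholder; sending the request and driving the thermal printer are omitted):

```python
import base64
import json


def build_poem_request(jpeg_bytes: bytes) -> str:
    """Package a captured photo into a vision-capable chat request that
    asks for a poem. The message layout follows OpenAI's vision chat
    format; the model name is an assumption."""
    image_b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    payload = {
        "model": "gpt-4o",  # placeholder: any vision-capable chat model
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Write a short poem about what you see in this photo."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    }
    return json.dumps(payload)


request_body = build_poem_request(b"\xff\xd8fake-jpeg-bytes")  # bytes from the camera
```

The returned poem text would then be fed straight to the thermal printer driver instead of being saved as an image file, which is exactly the design choice Ryan describes below.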

Source: Marilyn Hue/Instagram

After the poem is generated, it is sent to the Poetry Camera and printed out by a thermal printer.

Image source: Poetry Camera

In the end, the user receives a strip of paper with the poem printed on it, like a supermarket receipt.

As for why photos are not saved, Ryan's answer is that simplifying the functions makes the product easier to build, and the second consideration is privacy.

Make a watch that supports ChatGPT


YouTube blogger MayLabs demonstrated a Raspberry Pi-based smart watch that supports ChatGPT. The watch does not require a mobile phone or PC, can be used anywhere, and answers users' spoken questions through ChatGPT.

The watch, developed by a maker going by the pseudonym "Frumtha Fewchure," uses a Raspberry Pi 4B for processing.


The watch portion features an LED light to show that the microphone is enabled, a few buttons, a 0.96-inch two-tone OLED screen, and holders for two Apple Watch bands. The buttons are 6 x 6 x 4.3 mm tactile buttons. In addition, the watch has an LED that acts as an infrared transmitter, so the watch can be used as a universal remote control in an eventual update.

There are three buttons on the watch, and the Pi recognizes which one was pressed. With these you can bring up CPU stats or a watch face, but the most interesting button connects to ChatGPT so you can ask questions. Answers appear as text on the display, and also as audio if headphones are connected (wired or Bluetooth), since there are no speakers.


While the watch doesn't require a phone or PC, it does need an internet connection to interact with ChatGPT, for example over Wi-Fi on your home network. The video creator said he connected the device to his smartphone's hotspot when testing the watch in a coffee shop.

With the continuous advancement of artificial intelligence technology, the Raspberry Pi has become a popular platform for innovative AI projects. From a speech recognition system that runs offline, to the chatty ChatGPT-integrated rover robot Floyd, to the interactive desktop robot BMO-AI and a smart watch that supports ChatGPT, these projects demonstrate the Raspberry Pi's wide application potential in the field of AI.

The combination of Raspberry Pi and AI not only provides a platform for experimentation and creation for technology enthusiasts and developers, but also opens a door to the intelligent world for us.



· END ·








