This project demonstrates how to get a natural language recognizer up and running and connect it to the Google Assistant, using your AIY Projects voice kit. Along with everything the Google Assistant already does, you can add your own question and answer pairs. All in a handy little cardboard cube, powered by a Raspberry Pi.
Don’t own a kit? You can also integrate the Google Assistant into your own hardware by following the official Google Assistant SDK guides, or you can read below for links to purchase the AIY kit.
Assembling the kit and setting up the Google Assistant SDK should take about an hour and a half.
Open the box and verify you have all of the necessary components in your kit. You’ll also need a couple of tools for assembly.
This guide shows you how to assemble the AIY Projects voice kit.
The kit is composed of a simple cardboard form, a Raspberry Pi board, the Voice HAT (an accessory board for voice recognition), and a few common components.
By the end of this guide, your voice project will be assembled, with the Raspberry Pi board and other components connected and running. Then you’ll move on to the User’s Guide to bring it to life!
You’ll need to download the Voice Kit SD image using another computer. Both of the next steps can take several minutes for your computer to complete, so while you're waiting, get started on "Assemble the hardware" in the next step.
Get the Voice Kit SD image
Write the image to an SD card using a card writing utility (Etcher.io is a popular tool for this)
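Etcher is the easy path, but if you prefer a Linux or macOS terminal, the write step can be sketched with dd. The snippet below runs against scratch files so it is safe to try as-is; for a real card you would set if= to the downloaded image and of= to the SD card's device node (for example /dev/sdX; double-check with lsblk and run dd with sudo, since writing to the wrong device destroys its data). The voicekit.img and sdcard.img names are stand-ins.

```shell
# Create a small scratch "image" so this example is safe to run as-is.
# For the real kit, if= would be the downloaded Voice Kit image file.
dd if=/dev/zero of=voicekit.img bs=1M count=4

# Write the image to the "card" (here a scratch file; really /dev/sdX, with sudo).
dd if=voicekit.img of=sdcard.img bs=1M

# Verify the write by comparing the two byte-for-byte.
cmp voicekit.img sdcard.img && echo "image written OK"
```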
Our default platform and instructions are for Raspbian Linux. However, you can also build on Android Things, an IoT solution using Android APIs and Google services. Skip down to the Maker's Guide for instructions.
Find your Raspberry Pi 3 and the two plastic standoffs that came with your kit.
Insert the standoffs into the two yellow holes opposite the 40-pin box header on your Raspberry Pi 3. They should snap into place.
Take your Voice HAT accessory board and attach it to the Raspberry Pi 3 box header.
Gently press down to make sure the pins are secure. On the other side, press down to snap the spacers into place.
Find the speaker with the red and black wires attached. Insert the speaker’s red wire end into the “+” terminal on the Voice HAT blue screw connector.
Do the same for the black wire end into the “-” terminal. At this point, they should be sitting there unsecured.
Now screw the wires in place with a Phillips “00” screwdriver.
Gently tug on the wires to make sure they’re secure.
Find the 4-wire button cable: it has a white plug on one end and four separate wires with metal contacts on the other.
Insert the plug into one of the white connectors on the Voice HAT board.
Find the Voice HAT Microphone board and the 5-wire daughter board cable from your kit (pictured).
Insert the 5-wire plug into the Microphone board.
Connect the Microphone board to the Voice HAT board using the other white connector on the Voice HAT board.
Well done! Set aside your hardware for now.
Now let’s build the box. Find the larger cardboard piece with a bunch of holes on one side (pictured).
Fold along the creases, then find the side with four flaps and fold the one marked FOLD 1.
Do the same for the other folds, tucking FOLD 4 underneath to secure it in place.
Easy! Now set it aside.
Find the other cardboard piece that came with your kit (pictured). This will build the inner frame to hold the hardware.
Fold the flaps labeled 1 and 2 along the creases.
The flap above the 1 and 2 folds has a U-shaped cutout. Push it out.
Then fold the rest of the flap outward.
Fold the section labeled FOLD UP so that it’s flush with the surface you’re working on. There’s a little notch that folds behind the U-shaped flap to keep it in place.
The U-shaped flap should lay flush with the box side.
At this point, the cardboard might not hold its shape. Don’t worry: it’ll come together once it’s in the box.
Find your speaker (which is now attached to your Raspberry Pi 3).
Slide the speaker into the U-shaped pocket on the cardboard frame.
Turn the cardboard frame around.
Take the Pi + Voice HAT hardware and slide it into the bottom of the frame below flaps 1 + 2 (pictured).
The USB ports on the Pi should be exposed from the cardboard frame.
If your SD card is already inserted into the Pi, remove the SD card before sliding the hardware into the cardboard or it may break.
Let’s put it all together!
Take the cardboard box you assembled earlier and find the side with the seven speaker holes.
Slide the cardboard frame + hardware into the cardboard box, making sure that the speaker is aligned with the box side with the speaker holes.
Once it’s in, the Pi should be sitting on the bottom of the box.
Make sure your wires are still connected.
Check that your ports are aligned with the cardboard box holes.
Find your arcade button. There should be a button, a spacer, and a nut.
If they’re connected, unscrew the nut and spacer from the button.
Insert the button into the top flap of the cardboard box.
The pushable button side should face outward.
Screw on the spacer and then the nut to secure the button in place.
Next, find your button lamp components:
Insert the lamp into the black lamp holder.
Then attach the lamp holder to the micro-switch.
Insert the completed lamp into the button.
Secure the lamp in place by carefully rotating it clockwise. It may take some force to lock it in place.
Find the four colored wires with metal contacts that you previously connected to the Voice HAT board.
Following the picture above, attach the four metal contacts to the micro-switch.
Important: Wire color matters! Make sure each of the wires are attached to the same end as the picture.
Find the Voice HAT Microphone board.
The Microphone board sits below the button on the top flap.
Before you tape it down, check the other side of the cardboard flap to align the microphones with the two cardboard holes (see the picture in the next step).
Using some trusty scotch tape, tape the board to the top flap of the cardboard.
Turn it around and double check that your microphones are aligned with the cardboard holes.
That’s it! Close the box up.
Look at that! The device is assembled and ready to be used. Next you’ll connect it and boot it up.
Now that your box is assembled, plug your peripherals in:
The SD card can be tricky to remove after it’s been inserted. We recommend using either small, needle nose pliers to remove it, or attaching tape to the SD card before inserting so you can pull on the tape to remove it.
Insert your SD card (the one with the Voice Kit SD image) into the slot on the bottom side of the Raspberry Pi board. The SD card slot should be accessible through a cutout provided in the external cardboard form.
With the SD card in place and peripherals connected, plug in the power supply and the Raspberry Pi will begin booting up.
If you don’t see anything on your monitor, or you see "Openbox Syntax Error", check the troubleshooting guide in the appendix.
Click the network icon in the upper right corner of the Raspberry Pi desktop. Choose your preferred WiFi access point.
Once booted, the small red LED in the center of the Voice HAT and the LED inside the arcade button should both indicate the device is running by emitting a slow pulse. If you don’t see the LED pulse, check the troubleshooting guide in the appendix.
This script verifies the audio input and output components on the Voice HAT accessory board are working correctly. Double-click the Check Audio icon on your desktop.
When you click the script, it will run through each step listed below. Note: some of the steps require voice input, which you will be prompted for—so watch closely!
Follow along with the script and if everything is working correctly, you’ll see a message that says
The audio seems to be working
If you see an error message, follow the message details to resolve the issue and try again.
This script verifies that your WiFi is configured and working properly on the Raspberry Pi board. Double-click the Check WiFi icon on your desktop.
When you double-click the script, it will check your Raspberry Pi is connected to the internet over WiFi.
If everything is working correctly, you’ll see a message that says The WiFi connection seems to be working.
If you see an error, click on the network icon at the top right and verify you are connected to a valid access point.
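Under the hood, a WiFi check like this amounts to verifying that some well-known host is reachable over the current network. A minimal sketch of the idea (the URL and timeout below are assumptions for illustration, not the script's actual internals):

```python
import urllib.request

def check_connection(url='https://www.google.com', timeout_seconds=10):
    '''Return True if the URL is reachable over the current network,
    False if the request fails (no route, DNS failure, timeout, etc.).'''
    try:
        urllib.request.urlopen(url, timeout=timeout_seconds)
        return True
    except OSError:
        return False
```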
Congratulations on assembling the voice recognizer device and verifying the components are set up properly. Now you’ll need to connect the device to Google Cloud Platform.
To do that, open the User’s Guide and follow the instructions provided.
Congrats on assembling your voice recognizer device. Now, let’s bring it to life!
The voice recognizer uses the Google Assistant SDK to recognize speech, along with a local Python application that evaluates local commands. You can also use the Google Cloud Speech API. By the end of this guide, your voice recognizer will let you talk to the Google Assistant. Then check out the Maker’s guide for creative extensions to inspire you to use voice capabilities in your future projects.
To try the Google Assistant API, you need to first sign into Google Cloud Platform (GCP) and then enable the API.
Using your voice recognizer device, open up an internet browser and go to the Cloud Console
Use your Google account to sign in. If you don’t have one, you’ll need to create one. The Google Assistant API is free for personal use.
GCP uses projects to organize things. Create one for your voice recognizer box.
In the Cloud Console, click the drop-down button to the right of the “Google Cloud Platform” logo
From the dropdown, click Create project
Enter a project name and click Create
After your project is created, make sure the drop-down has your new project name displayed
In the Cloud Console, enable the "Google Assistant API".
In the Cloud Console, create an OAuth 2.0 client by going to API Manager > Credentials
Click Create credentials and select OAuth client ID
Select Other, enter a name to help you remember your credentials, then click Create.
A dialog window will pop up. Click OK. In the Credentials list, find your new credentials and click the download icon on the right.
Note: if you don't see the download icon, try expanding the width of your browser window or zooming out.
Find the JSON file you just downloaded (client_secrets_XXXX.json) and rename it to assistant.json. Then move it to /home/pi/assistant.json
On your desktop, click Start dev terminal and enter sudo systemctl stop voice-recognizer
Go to the Activity Controls panel. Make sure to log in with the same Google account as before.
You’re ready to turn it on: follow the manual start instructions under Using your device below
Use ssh -X to handle authentication through the browser when starting the example for the first time. Authorize access to the Google Assistant API when prompted.

Try an example query like "how many ounces in 2 cups" or "what's on my calendar?" and the Assistant should respond!

If the Assistant instead answers that "there are some basic settings that need your permission first...", perform step 8 again, being sure to use the same account that you used for the authorization step.

The voice recognizer doesn't run automatically by default. You can either run it as a service in the background, or, if you'd like to make changes to the code, run it manually. Manual start is required when using the Assistant API for the first time, and it lets you see some diagnostic output as well.
For the device to begin listening for your queries, start the voice recognizer app by double-clicking "Start dev terminal" on the Desktop and entering:
src/main.py
When you are done, press Ctrl-C to end the application.
As an alternative to running the application manually, you can run it as a system service. However, running it manually may be better when you're making code changes, since it allows fast restarts.
You start the service by entering sudo systemctl start voice-recognizer. You can stop the service by entering sudo systemctl stop voice-recognizer.
If you started with the preloaded system image (on an SD card), then the voice recognition service will not be started on-boot. If you would like for the service to automatically start on boot, then run sudo systemctl enable voice-recognizer once.
To learn about other commands, like stop and disable, see the systemctl command manual.
Your box has a range of responses that it displays through the bright LED inside the arcade style button mounted on top of the device. The LED signals can be configured to your preference by modifying ~/voice-recognizer-raspi/src/led.py.
Verify your device is up and running when it displays a slow pulse pattern. Once you see this, you’re ready to start speaking queries to the device.
| LED signal | Description |
|---|---|
| Pulse | The device is starting up, or the voice recognizer has not been started yet |
| Blink (every few seconds) | The device is ready to be used |
| On | The device is listening |
| Pulse | The device is thinking or responding |
| Pulse → off | The device is shutting down |
| 3 blinks → pause | There’s an error |
There’s a lot you can do with this project beyond the Assistant API. If you’re the curious type, we invite you to explore the Maker’s Guide for more ideas on how to hack this project, as well as how to use the Cloud Speech API as an alternative to the Assistant API.
This is a hackable project, so we encourage you to make this project your own! We’ve included a whole section on replacing the Google Assistant SDK with the Cloud Speech API to give you even more options. This guide gives you some creative extensions, settings, and even a different voice API to use.
We hope this project has sparked some new ideas for you.
Below are some options to change the device behavior and suggestions for extensions if you want to hack further.
If you’re using the SD card image provided, the source for the voice-recognizer app is already installed on your device. You can browse the Python source code at $HOME/voice-recognizer-raspi/
Alternately, the project source is available on GitHub at aiyprojects-raspbian.
The application can be configured by adjusting the properties found in $HOME/.config/voice-recognizer.ini. That file lets you configure the default activation trigger and which API to use for voice recognition. Try adding additional properties for your own extensions! And don’t worry: If you mess it up, there’s a backup copy kept in $HOME/voice-recognizer-raspi/config.
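As an illustration, a trimmed-down voice-recognizer.ini might look like the fragment below. The key names here are assumptions based on the options described in this guide (activation trigger and recognition API), not a verbatim copy; check the backup in $HOME/voice-recognizer-raspi/config for the real defaults.

```ini
# Hypothetical sketch of voice-recognizer.ini; key names are illustrative.

# Activation trigger: gpio (arcade button) or clap.
trigger = gpio

# Set to true to use the Cloud Speech API instead of the Assistant API.
cloud-speech = false
```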
By default, the voice recognizer activates after a single button press (see ~/voice-recognizer-raspi/src/triggers/*.py), but you can change the activation trigger when you manually start the application by including the -T flag. As another example, we’ve included an activation trigger that responds to a single clap or snap of your fingers.
python3 src/main.py -T {trigger-name}
| trigger-name | Description |
|---|---|
| gpio | Activates by pressing the arcade button |
| clap | Activates from a single clap or snap |
You can add additional triggers beyond these examples by modifying the code with your own ideas. To add a new activation trigger, you’ll need to create a new source file in the trigger folder, implement a subclass of Trigger (see ~/voice-recognizer-raspi/src/triggers/trigger.py) and add it to the command-line options.
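As a sketch of what such a subclass could look like, here is a hypothetical timer-based trigger that fires on its own after a delay. The Trigger class below is a minimal stand-in so the example runs by itself; in the kit you would subclass the real base class from ~/voice-recognizer-raspi/src/triggers/trigger.py instead (we assume the base class simply stores a callback to invoke when the trigger fires).

```python
import threading

class Trigger:
    '''Minimal stand-in for triggers.trigger.Trigger (assumption: the real
    base class just stores the callback to invoke when the trigger fires).'''
    def __init__(self):
        self.callback = None

    def set_callback(self, callback):
        self.callback = callback

class TimerTrigger(Trigger):
    '''Hypothetical trigger: fire the callback after a fixed delay.'''
    def __init__(self, delay_seconds=1.0):
        super().__init__()
        self.delay_seconds = delay_seconds

    def start(self):
        # Schedule the callback on a background timer thread.
        threading.Timer(self.delay_seconds, self.callback).start()
```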
Below is the code for the GPIO trigger for reference.
```python
import RPi.GPIO as GPIO
import time

from triggers.trigger import Trigger


class GpioTrigger(Trigger):
    '''Detect edges on the given GPIO channel and call the callback.'''

    DEBOUNCE_TIME = 0.05

    def __init__(self, channel, polarity=GPIO.FALLING,
                 pull_up_down=GPIO.PUD_UP):
        super().__init__()

        self.channel = channel
        self.polarity = polarity

        if polarity not in [GPIO.FALLING, GPIO.RISING]:
            raise ValueError('polarity must be GPIO.FALLING or GPIO.RISING')

        self.expected_value = polarity == GPIO.RISING
        self.event_detect_added = False

        GPIO.setmode(GPIO.BCM)
        GPIO.setup(channel, GPIO.IN, pull_up_down=pull_up_down)

    def start(self):
        if not self.event_detect_added:
            GPIO.add_event_detect(self.channel, self.polarity, callback=self.debounce)
            self.event_detect_added = True

    def debounce(self, _):
        '''Check that the input holds the expected value for the debounce period,
        to avoid false trigger on short pulses.'''
        start = time.time()
        while time.time() < start + self.DEBOUNCE_TIME:
            if GPIO.input(self.channel) != self.expected_value:
                return
            time.sleep(0.01)

        self.callback()
```
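The debounce logic in the GPIO trigger generalizes beyond hardware: accept an edge only if the input holds the expected value for the entire debounce window, so short glitches never fire the callback. A hardware-free sketch of the same idea (the function name is ours, for illustration):

```python
import time

def debounced(read_input, expected_value, debounce_time=0.05):
    '''Return True only if read_input() reports expected_value for the
    entire debounce window; any glitch aborts the detection.'''
    start = time.time()
    while time.time() < start + debounce_time:
        if read_input() != expected_value:
            return False
        time.sleep(0.01)
    return True
```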
Get the Google Assistant running on Android Things with these instructions.
Want to try another API? Follow the instructions below to try the Cloud Speech API, which recognizes your voice speech and converts it into text. The Cloud Speech API supports 80 languages, long audio clips, and the ability to add phrase hints for processing audio.
The voice recognizer cube uses Google’s Cloud Speech API. If you use it for less than 60 minutes a month, it’s free. Beyond that, the cost is $0.006 per 15 seconds. Don’t worry: you’ll get a reminder if you go over your free limit.
Open your workstation’s terminal and move your downloaded credentials.json file into place as cloud_speech.json by entering one of the following:
(using the local file system)

```shell
cp /path/to/downloaded/credentials.json ~/cloud_speech.json
```

(from another machine)

```shell
scp /path/to/downloaded/credentials.json pi@raspberrypi.local:~/cloud_speech.json
```
On your desktop, double-click the Check Cloud icon. Follow along with the script. If everything is working correctly, you’ll see this:
The cloud connection seems to be working
If you see an error message, follow the details and try the Check Cloud script again.
To issue a voice command, press the voice recognizer button once to activate it, then speak loudly and clearly. You’ll know the device is listening for a voice command when the LED in the arcade button is steadily on.
We’ve included a few example voice commands in our local dictionary as a starting point, but we encourage you to explore the code and add your own.
| Voice command | Response |
|---|---|
| Hello | Hello to you too |
| What time is it? | It is <time>. E.g. "It is ten to nine." |
| Tell me a joke | (listen for the joke response) |
| Volume up | Increase the volume by 10% and say the new level |
| Volume down | Decrease the volume by 10% and say the new level |
| Max volume | Increase volume to 100% |
You can create new actions and link them to new voice commands in ~/voice-recognizer-raspi/src/action.py.
To control an LED that you've connected to GPIO 4 (Driver0), add the following class to ~/voice-recognizer-raspi/src/action.py below the comment "Implement your own actions here":
```python
# =========================================
# Makers! Implement your own actions here.
# =========================================

import RPi.GPIO as GPIO


class GpioWrite(object):
    '''Write the given value to the given GPIO.'''

    def __init__(self, gpio, value):
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(gpio, GPIO.OUT)
        self.gpio = gpio
        self.value = value

    def run(self, command):
        GPIO.output(self.gpio, self.value)
```
Then add the following lines to ~/voice-recognizer-raspi/src/action.py below the comment "Add your own voice commands here":
```python
# =========================================
# Makers! Add your own voice commands here.
# =========================================

actor.add_keyword('light on', GpioWrite(4, True))
actor.add_keyword('light off', GpioWrite(4, False))
```
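Actions don't have to toggle GPIOs. Following the same pattern, any object with a run(voice_command) method can be registered with actor.add_keyword. As an illustration, here is a hypothetical action that runs a shell command (the class name and example keyword are ours, not part of the kit's code):

```python
import subprocess

class RunShellCommand(object):
    '''Hypothetical action: run a shell command when its keyword is heard.'''

    def __init__(self, command_line):
        self.command_line = command_line

    def run(self, voice_command):
        # Blocks until the command finishes; voice_command is the spoken text.
        subprocess.call(self.command_line, shell=True)
```

You would register it the same way, e.g. actor.add_keyword('take a photo', RunShellCommand('raspistill -o photo.jpg')), assuming a camera module is attached.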
| Function | GPIO | Description |
|---|---|---|
| Button | 23 | button is active low |
| LED | 25 | LED is active high |
| Driver0/GPIO4 | 4 | 500mA drive limit, can be used as GPIO |
| Driver1/GPIO17 | 17 | 500mA drive limit, can be used as GPIO |
| Driver2/GPIO27 | 27 | 500mA drive limit, can be used as GPIO |
| Driver3/GPIO22 | 22 | 500mA drive limit, can be used as GPIO |
| Servo0/GPIO26 | 26 | 25mA drive limit, can be used as GPIO |
| Servo1/GPIO6 | 6 | 25mA drive limit, can be used as GPIO |
| Servo2/GPIO13 | 13 | 25mA drive limit, can be used as GPIO |
| Servo3/GPIO5 | 5 | 25mA drive limit, can be used as GPIO |
| Servo4/GPIO12 | 12 | 25mA drive limit, can be used as GPIO |
| Servo5/GPIO24 | 24 | 25mA drive limit, can be used as GPIO |
| I2S | 20, 21, 19 | used by Voice HAT ALSA driver, not available to user |
| Amp Shutdown | 16 | used by Voice HAT ALSA driver, not available to user |
| I2C | 2, 3 | available as GPIO or I2C via Raspbian drivers |
| SPI | 7, 8, 9, 10, 11 | available as GPIO or SPI via Raspbian drivers |
| UART | 14, 15 | available as GPIO or UART via Raspbian drivers |
You can view logs to get a better sense of what’s happening under the (cardboard) hood if you’re running the voice-recognizer as a service.
With the voice-recognizer running manually or as a service, you can view all log output using journalctl.
```shell
sudo journalctl -u voice-recognizer -n 10 -f
```

The output will look something like this:

```
Clap your hands then speak, or press Ctrl+C to quit...
[2016-12-19 10:41:54,425] INFO:trigger:clap detected
[2016-12-19 10:41:54,426] INFO:main:listening...
[2016-12-19 10:41:54,427] INFO:main:recognizing...
[2016-12-19 10:41:55,048] INFO:oauth2client.client:Refreshing access_token
[2016-12-19 10:41:55,899] INFO:speech:endpointer_type: START_OF_SPEECH
[2016-12-19 10:41:57,522] INFO:speech:endpointer_type: END_OF_UTTERANCE
[2016-12-19 10:41:57,523] INFO:speech:endpointer_type: END_OF_AUDIO
[2016-12-19 10:41:57,524] INFO:main:thinking...
[2016-12-19 10:41:57,606] INFO:main:command: light on
[2016-12-19 10:41:57,614] INFO:main:ready...
```
In the log above, the listening... line is feedback from the recognizer that our request was accepted. For the meaning of the endpointer_type values (START_OF_SPEECH, END_OF_UTTERANCE, END_OF_AUDIO), see https://cloud.google.com/speech/reference/rest/v1beta1/EndpointerType
You did it! Whether this was your first hackable project or you’re a seasoned maker, we hope this project has sparked new ideas for you. Keep tinkering, there’s more to come.