Voice Kit

Project Overview

This project demonstrates how to get a natural language recognizer up and running and connect it to the Google Assistant, using your AIY Projects voice kit. Along with everything the Google Assistant already does, you can add your own question and answer pairs. All in a handy little cardboard cube, powered by a Raspberry Pi.

Don’t own a kit? You can also integrate the Google Assistant into your own hardware by following the official Google Assistant SDK guides, or you can read below for links to purchase the AIY kit.

Assembling the kit and setting up the Google Assistant SDK should take about an hour and a half.

Get the kit

Open the box and verify you have all of the necessary components in your kit. You’ll also need a couple of tools for assembly.

In your kit

  1. Voice HAT accessory board (×1)
  2. Voice HAT microphone board (×1)
  3. Plastic standoffs (×2)
  4. 3” speaker (wires attached) (×1)
  5. Arcade-style push button (×1)
  6. 4-wire button cable (×1)
  7. 5-wire daughter board cable (×1)
  8. External cardboard box (×1)
  9. Internal cardboard frame (×1)

Not included

  1. Raspberry Pi 3 (×1)
  2. SD card (×1)
  3. Size “00” Phillips screwdriver (×1)
  4. Scotch tape (×1)

Assembly Guide

This guide shows you how to assemble the AIY Projects voice kit.

The kit is composed of a simple cardboard form, a Raspberry Pi board, the Voice HAT (an accessory board for voice recognition), and a few common components.

By the end of this guide, your voice project will be assembled, with the Raspberry Pi board and other components connected and running. Then you’ll move on to the User’s Guide to bring it to life!

1

Get the Voice Kit SD Image

You’ll need to download the Voice Kit SD image using another computer. Both of the next steps can take several minutes for your computer to complete, so while you're waiting, get started on "Assemble the hardware" in the next step.

  1. Get the Voice Kit SD image

  2. Write the image to an SD card using a card writing utility (Etcher.io is a popular tool for this)

Building on Android Things

Our default platform and instructions are for Raspbian Linux. However, you can also build on Android Things, an IoT solution using Android APIs and Google services. Skip down to the Maker's Guide for instructions.

2

Assemble the hardware

Assemble the hardware image 1

Find your Raspberry Pi 3 and the two plastic standoffs that came with your kit.

Insert the standoffs into the two yellow holes opposite the 40-pin box header on your Raspberry Pi 3. They should snap into place.

Assemble the hardware image 2

Take your Voice HAT accessory board and attach it to the Raspberry Pi 3 box header.

Gently press down to make sure the pins are secure. On the other side, press down to snap the spacers into place.

Assemble the hardware image 3

Find the speaker with the red and black wires attached. Insert the speaker’s red wire end into the “+” terminal on the Voice HAT blue screw connector.

Do the same for the black wire end into the “-” terminal. At this point, they should be sitting there unsecured.

Assemble the hardware image 4

Now screw the wires in place with a Phillips “00” screwdriver.

Gently tug on the wires to make sure they’re secure.

Assemble the hardware image 5

Find the 4-wire button cable: it has a white plug on one end and four separate wires with metal contacts on the other.

Insert the plug into the white connector labeled “Button” on the Voice HAT board.

Assemble the hardware image 6

Find the Voice HAT Microphone board and the 5-wire daughter board cable from your kit (pictured).

Insert the 5-wire plug into the Microphone board.

Assemble the hardware image 7

Plug the Microphone board into the Voice HAT board using the white connector labeled “Mic.”

Step complete

Well done! Set aside your hardware for now.

3

Fold the cardboard

3.1. Build the box

Build the box image 1

Now let’s build the box. Find the larger cardboard piece with a bunch of holes on one side (pictured).

Fold along the creases, then find the side with four flaps and fold the one marked FOLD 1.

Build the box image 2

Do the same for the other folds, tucking FOLD 4 underneath to secure it in place.

Easy! Now set it aside.

3.2. Build the frame

Build the box image 1

Find the other cardboard piece that came with your kit (pictured). This will become the inner frame that holds the hardware.

Fold the flaps labeled 1 and 2 along the creases.

Build the box image 2

The flap above the 1 and 2 folds has a U-shaped cutout. Push it out.

Build the box image 3

Then fold the rest of the flap outward.

Fold the section labeled FOLD UP so that it’s flush with the surface you’re working on. There’s a little notch that folds behind the U-shaped flap to keep it in place.

Build the box image 4

The U-shaped flap should lie flush with the box side.

At this point, the cardboard might not hold its shape. Don’t worry: it’ll come together once it’s in the box.

Build the box image 5

Find your speaker (which is now attached to your Raspberry Pi 3).

Slide the speaker into the U-shaped pocket on the cardboard frame.

Build the box image 6

Turn the cardboard frame around.

Take the Pi + Voice HAT hardware and slide it into the bottom of the frame below flaps 1 + 2 (pictured).

The USB ports on the Pi should be exposed from the cardboard frame.

4

Put it all together

Warning

If your SD card is already inserted into the Pi, remove the SD card before sliding the hardware into the cardboard or it may break.

Put it all together image 1

Let’s put it all together!

Take the cardboard box you assembled earlier and find the side with the seven speaker holes.

Slide the cardboard frame + hardware into the cardboard box, making sure that the speaker is aligned with the box side with the speaker holes.

Put it all together image 2

Once it’s in, the Pi should be sitting on the bottom of the box.

Make sure your wires are still connected.

Put it all together image 3

Check that your ports are aligned with the cardboard box holes.

Put it all together image 4

Find your arcade button and set it into the top of your cardboard box.

Put it all together image 5

On the other side, screw on the washer to secure the button in place.

Put it all together image 6

Now let’s hook the button up. Find the four colored wires with metal contacts connected to the Voice HAT board. Connect the wires in the positions indicated by the image. Check the next step for another view.

Important: Wire color position matters! Check the next step to make sure your wires are correctly placed.

Put it all together image 7

Here’s another view to make sure your wires are correctly connected.

Looking at the small crown logo (the base of the crown facing toward you), the wires should be connected at these locations:

  • Blue: bottom left
  • Red: bottom right
  • Black: top right
  • White: top left
Put it all together image 8

The next step is attaching the microphone board to the cardboard.

We’re using double-sided tape here, but your standard-issue scotch tape works fine too.

Put it all together image 9

Line up the microphone board so the mics (the white boxes on the ends) are sitting aligned with the cardboard holes for maximum listening capability.

Put it all together image 10

Give it a turn and check that your mics are aligned correctly.

Put it all together image 11

Well done! Time to close it up.

5

Connect and boot the device

5.1. Connect peripherals

Plug peripherals in

Now that your box is assembled, plug your peripherals in:

  1. USB Keyboard
  2. USB Mouse
  3. HDMI Monitor

5.2. Boot the device

Note

The SD card can be tricky to remove after it’s been inserted. We recommend using small needle-nose pliers to remove it, or attaching tape to the SD card before inserting it so you can pull on the tape to remove it.

Build the box image 1

Insert your SD card (the one with the Voice Kit SD image) into the slot on the bottom side of the Raspberry Pi board. The SD card slot should be accessible through a cutout provided in the external cardboard form.

With the SD card in place and peripherals connected, plug in the power supply and the Raspberry Pi will begin booting up.

If you don’t see anything on your monitor, or you see "Openbox Syntax Error", check the troubleshooting guide in the appendix.

5.3. Connect to the internet

Build the box image 1

Click the network icon in the upper right corner of the Raspberry Pi desktop. Choose your preferred WiFi access point.

6

Verify it works

Once booted, the red LED on the Raspberry Pi near the power connector should be lit. If not, check the troubleshooting guide.

6.1. Check audio

Check audio image 1

This script verifies the audio input and output components on the Voice HAT accessory board are working correctly. Double-click the Check Audio icon on your desktop.

When you run the script, it will walk you through each step. Note: some of the steps require voice input, which you will be prompted for, so watch closely!

Check audio image 2

Follow along with the script. If everything is working correctly, you’ll see a message that says The audio seems to be working.

If you see an error message, follow the message details to resolve the issue and try again.

6.2. Check WiFi

Check WiFi image 1

This script verifies that your WiFi is configured and working properly on the Raspberry Pi board. Double-click the Check WiFi icon on your desktop.

When you double-click the script, it will check that your Raspberry Pi is connected to the internet over WiFi.

Check WiFi image 2

If everything is working correctly, you’ll see a message that says The WiFi connection seems to be working.

If you see an error, click on the network icon at the top right and verify you are connected to a valid access point.

Wrap up

Congratulations on assembling the voice recognizer device and verifying the components are set up properly. Now you’ll need to connect the device to Google Cloud Platform.

To do that, open the User’s Guide and follow the instructions provided.

7

Appendix

Troubleshooting Tips

  1. A red LED on the Raspberry Pi near the power connector should light. If it doesn't, unplug the power, unplug the connector to the microphone, and power-up again. If it lights after powering-up without the microphone, then the microphone board may be defective.
  2. The lamp in the button will not light up until you run a demo, so don't worry that it's off. (This is different from the AIY Essentials Guide, which describes an older software version.)
  3. If you don't see anything on your monitor, make sure the HDMI and power cables are fully inserted into the Raspberry Pi.
  4. If you see "Openbox Syntax Error", you'll need to rewrite the image to the SD card and try booting the device again.

User’s Guide

Congrats on assembling your voice recognizer device -- now, let’s bring it to life!

The voice recognizer uses the Google Assistant SDK to recognize speech, along with a local Python application that evaluates local commands. You can also use the Google Cloud Speech API. By the end of this guide, your voice recognizer will let you talk to the Google Assistant. Then check out the Maker’s guide for creative extensions to inspire you to use voice capabilities in your future projects.

1

Setting up your device

1.1. Connect to Google Cloud Platform

To try the Google Assistant API, you need to first sign into Google Cloud Platform (GCP) and then enable the API.

Log into GCP

Log into GCP 1

Using your voice recognizer device, open up an internet browser and go to the Cloud Console

I’ve never used Google Cloud Platform before

Use your Google account to sign in. If you don’t have one, you’ll need to create one. The Google Assistant API is free for personal use.

Create a project

GCP uses projects to organize things. Create one for your voice recognizer box.

Create a project 1

In the Cloud Console, click the drop-down button to the right of the “Google Cloud Platform” logo

Create a project 2

From the dropdown, click Create project

Create a project 3

Enter a project name and click Create

Create a project 4

After your project is created, make sure the drop-down has your new project name displayed

1.2. Turn on the Google Assistant API

Turn on the Google Assistant API image 1

In the Cloud Console, enable the "Google Assistant API".

Turn on the Google Assistant API image 2

In the Cloud Console, create an OAuth 2.0 client by going to API Manager > Credentials

Turn on the Google Assistant API image 3

Click Create credentials and select OAuth client ID

  • If this is your first time creating a client ID, you’ll need to configure your consent screen by clicking Configure consent screen. You’ll need to name your app (this name will appear in the authorization step)
Turn on the Google Assistant API image 4

Select Other, enter a name to help you remember your credentials, then click Create.

Turn on the Google Assistant API image 5

A dialog window will pop up. Click OK. In the Credentials list, find your new credentials and click the download icon on the right.

Note: if you don't see the download icon, try expanding the width of your browser window or zooming out.

Turn on the Google Assistant API image 6

Find the JSON file you just downloaded (client_secrets_XXXX.json) and rename it to assistant.json. Then move it to /home/pi/assistant.json
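
If you prefer the terminal, a single command does both the rename and the move. This is just a sketch: it assumes the browser saved the file to /home/pi/Downloads, and you'll need to substitute the XXXX part with your actual file name:

mv /home/pi/Downloads/client_secrets_XXXX.json /home/pi/assistant.json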

Turn on the Google Assistant API image 7

Go to the Activity Controls panel. Make sure to log in with the same Google account as before.

  • Turn on the following:
    1. Web and app activity
    2. Device information
    3. Voice and audio activity

  1. You’re ready to turn it on: follow the manual start instructions under Using your device below

    • You can also SSH from another computer. You’ll need to use ssh -X to handle authentication through the browser when starting the example for the first time.
  2. Authorize access to the Google Assistant API, when prompted

    • Make sure you're following the manual start instructions the first time - if you run as a service, you won't be prompted for authorization.
  3. Try an example query like "how many ounces in 2 cups" or "what's on my calendar?" and the Assistant should respond!

    • If the voice recognizer doesn't respond to your button presses or queries, you may need to restart.
    • If the response is Actually, there are some basic settings that need your permission first..., perform step 8 again, being sure to use the same account that you used for the authorization step.
3

Using your device

We provide three demo apps that showcase voice recognition and the Google Assistant with different capabilities. You can use them as templates to create your own apps.

When a demo app is running, the LED inside the arcade button and the LED in the center of the Voice HAT will pulse every few seconds. If you don’t see the LED pulse, check the troubleshooting guide.

  • assistant_library_demo.py: showcases the Google Assistant Library and hotword detection ("Okay, Google"). Supported on Raspberry Pi 2B and 3B.
  • assistant_grpc_demo.py: showcases the Google gRPC APIs and button trigger. Supported on Raspberry Pi 2B, 3B, and Zero W.
  • cloudspeech_demo.py: showcases the Google Cloud Speech APIs, button trigger, and custom voice commands. Supported on Raspberry Pi 2B, 3B, and Zero W.

3.1. Start the assistant library demo app

For the device to begin acting as your Google Assistant, much like a Google Home, start the assistant library demo app by double-clicking "Start dev terminal" on the Desktop and entering:

src/examples/voice/assistant_library_demo.py

The assistant library app has hotword detection built-in. To start a conversation with the Google Assistant, say "Okay, Google" or "Hey Google". When you are done, press Ctrl-C to end the application.
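
If you're curious what the demo is doing, here is a minimal sketch of the same idea. It is an illustration built from the APIs described in the Maker's Guide, not the demo's exact source: it starts the assistant library with your saved credentials and prints each event it reports.

import aiy.assistant.auth_helpers
from google.assistant.library import Assistant


def main():
    # Load the OAuth credentials you saved earlier as /home/pi/assistant.json.
    credentials = aiy.assistant.auth_helpers.get_assistant_credentials()
    # The Assistant object handles hotword detection and conversations;
    # start() yields events as they happen.
    with Assistant(credentials) as assistant:
        for event in assistant.start():
            print(event)


if __name__ == '__main__':
    main()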

3.2. Start the assistant gRPC demo app

Double-click "Start dev terminal" on the Desktop and enter:

src/examples/voice/assistant_grpc_demo.py

Unlike the assistant library demo, this demo does not support hotword detection. To ask Google Assistant a question, press the arcade button and speak. When you are done, either press the arcade button and say "goodbye", or simply press Ctrl-C to end the application.

3.3. Start the Cloud Speech demo app

The cloud speech demo makes use of the Google Cloud Speech APIs. If you do not need the conversations provided by Google Assistant, this is useful for building your own app to recognize voice commands. Details are described in the Maker’s Guide.

3.4. LED status codes

Your box has a range of responses that it displays through the bright LED inside the arcade button mounted on top of the device.

  • Blink (every few seconds): the device is ready to be used
  • On: the device is listening
  • Pulse: the device is thinking or responding
  • 3 blinks, then a pause: there’s an error

Extending the project

There’s a lot you can do with this project beyond the Assistant API. If you’re the curious type, we invite you to explore the Maker’s Guide for more ideas on how to hack this project, and explore ways to build your own custom Voice User Interface.

4

Appendix

Troubleshooting Tips

  1. If the lamp in the button doesn't light up when running a demo, check that the wire colors are the same as the picture in step 7 of the assembly instructions.

Maker’s Guide

This is a hackable project, so we encourage you to make this project your own! We’ve included a whole section on replacing the Google Assistant SDK with the Cloud Speech API to give you even more options. This guide gives you some creative extensions, settings, and even a different voice API to use.

We hope this project has sparked some new ideas for you.

1

Software extensions

Below are some options to change the device behavior and suggestions for extensions if you want to hack further.

1.1. Source code

If you’re using the SD card image provided, the source for the voice-recognizer app is already installed on your device. You can browse the Python source code at $HOME/AIY-voice-kit-python/src/

Alternatively, the project source is available on GitHub: https://github.com/google/aiyprojects-raspbian/tree/voicekit. Note that it lives on the "voicekit" branch.

1.2. Python API Reference

Please see the list below for the modules available for developer use. The full APIs are available on GitHub: https://github.com/google/aiyprojects-raspbian/tree/voicekit/src/aiy.

  • aiy.voicehat: get_button(), get_led(), get_status_ui(). For controlling the arcade button and the LED. See uses in any demo app.
  • aiy.audio: get_player(), get_recorder(), record_to_wave(), play_wave(), play_audio(), say(). For controlling the microphone and speaker; it can speak some text or play a wave file. See uses in assistant_grpc_demo.py and cloudspeech_demo.py.
  • aiy.cloudspeech: get_recognizer(). For accessing the Google Cloud Speech APIs. See uses in cloudspeech_demo.py.
  • aiy.i18n: set_locale_dir(), set_language_code(), get_language_code(). For customizing the language and locale. Not used directly by the demo apps, but some APIs depend on it; for example, aiy.audio.say() uses it for speech synthesis.
  • aiy.assistant.grpc: get_assistant(). For accessing the Google Assistant APIs via gRPC. See uses in assistant_grpc_demo.py.
  • google.assistant.library: the official Google Assistant Library for Python. See the online documentation.
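
As a quick taste of these modules, here is a minimal sketch (assuming you run it from the dev terminal on the Voice Kit SD image) that speaks a phrase and lights the LED inside the arcade button:

import aiy.audio
import aiy.voicehat


def main():
    # Speak through the Voice HAT speaker.
    aiy.audio.say('Hello from the Voice Kit')
    # Turn on the LED inside the arcade button.
    aiy.voicehat.get_led().set_state(aiy.voicehat.LED.ON)
    input('Press Enter to exit...')


if __name__ == '__main__':
    main()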

1.3. Create a new activation trigger

An activation trigger is a general term describing the condition on which we activate voice recognition or start a conversation with the Google Assistant. Previously you have seen two different types of activation triggers:

  1. Voice activation trigger
    This is the "Okay, Google" hotword detection in the assistant library demo. The assistant library continuously monitors the microphones on your VoiceHat. As soon as it detects that you said "Okay, Google", a conversation is started.

  2. Button trigger
    This is when you press the arcade button. Internally, it is connected to the GPIO on the Raspberry Pi (take a look at the driver code: aiy._drivers._button).

You may design and implement your own triggers. For example, you may have a motion detection sensor driver that can call a function when motion is detected:

# =========================================
# Makers! Implement your own actions here.
# =========================================

import aiy.audio
import aiy.cloudspeech


def main():
    '''Start voice recognition when motion is detected.'''
    # MotionDetector is your own sensor driver: WaitForMotion() should block
    # until motion is detected.
    my_motion_detector = MotionDetector()
    recognizer = aiy.cloudspeech.get_recognizer()
    aiy.audio.get_recorder().start()
    while True:
        my_motion_detector.WaitForMotion()
        text = recognizer.recognize()
        if text:
            aiy.audio.say('You said ' + text)


if __name__ == '__main__':
    main()

1.4. Use the Google Assistant library with a button

In the User's Guide, you learned to use the Google Assistant library to make Voice Kit into your own Google Home. Sometimes, we also want to use an external trigger to start a conversation with the Google Assistant. Example external triggers include the default button (GPIO trigger, demonstrated in cloudspeech_demo.py and assistant_grpc_demo.py), a motion sensor, or a clap trigger.

This section shows how to start a conversation with a button press. It is a little trickier because of the way the assistant library works. If you are new to programming, you may skip the "Design" subsection and jump to the "Implementation" subsection.

Design

Each Python app has a main thread, which executes your code in main(). For example, all our demo apps contain the following code:

 

if __name__ == '__main__':
    main()

It executes the main() function in the main thread. The assistant library runs an event loop:

 

...
for event in assistant.start():
    process_event(event)

The button driver has a method called "on_press" so you can tell it to run a function you provide every time the button is pressed. You may wonder why the following does not work with the assistant library:

 

...
def on_button_press(_):
    assistant.start_conversation()

...
aiy.voicehat.get_button().on_press(on_button_press)
for event in assistant.start():
    process_event(event)

Save it as my_demo.py, run it in the terminal, and press the button. Nothing happens. This is because the assistant library's event loop blocks the main thread, so the internal event loop inside the button driver never gets to run. For more details, take a look at how the button driver works (see src/aiy/_drivers/_button.py).

To summarize, the button driver runs an internal event loop (from the stock GPIO driver) in the main thread, and the assistant library also runs an event loop that blocks the main thread. To solve this problem and allow both event loops to run, we use Python's threading library and run the assistant library event loop in a separate thread. For more information, take a look at the official Python threading documentation.

Implementation

The source code for a working demo is at: src/examples/voice/assistant_library_with_button_demo.py

We created a class MyAssistant to capture all the logic. In its constructor, we created the thread that will be used to run the assistant library event loop:

 

...
class MyAssistant(object):
    def __init__(self):
        self._task = threading.Thread(target=self._run_task)

The "_run_task" function specified as the target will be run when you start the thread. In that function, we created an assistant library object and ran the event loop. This event loop is executed in the thread we created, separate from the main thread:

 

...
def _run_task(self):
    credentials = aiy.assistant.auth_helpers.get_assistant_credentials()
    with Assistant(credentials) as assistant:
        # Save assistant as self._assistant, so later the button press handler can use
        # it.
        self._assistant = assistant
        for event in assistant.start():
            self._process_event(event)

We have yet to hook up the button trigger at this point, because we want to wait until the Google Assistant is fully ready. In the "self._process_event" function, we enabled the button trigger when the API tells us it is ready to accept conversations:

 

...
def _process_event(self, event):
    ...
    if event.type == EventType.ON_START_FINISHED:
        # The Google Assistant is ready. Start the button trigger.
        aiy.voicehat.get_button().on_press(self._on_button_pressed)

This is the simplest demo of utilizing the button trigger. You may connect your own trigger with the assistant library the same way to start a conversation, mute/unmute the assistant, and do many other things.
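
For reference, the button handler itself can be as small as the sketch below; it fills in the _on_button_pressed method referenced in the excerpts above, and start_conversation() is the Google Assistant Library call that kicks off a conversation:

...
def _on_button_pressed(self):
    # Only safe to call once ON_START_FINISHED has fired and
    # self._assistant has been saved in _run_task().
    if self._assistant:
        self._assistant.start_conversation()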

2

Build on Android Things

Follow this codelab to get started with the Google Assistant on Android Things or just get the sample code on GitHub.

Android Things

3

Custom Voice User Interface

3.1. Change to the Cloud Speech API

Want to try another API? Follow the instructions below to try the Cloud Speech API, which recognizes your speech and converts it into text. The Cloud Speech API supports 80 languages, long audio clips, and the ability to add phrase hints for processing audio.

Turn on billing

Why do I need to turn on billing?

The voice recognizer cube uses Google’s Cloud Speech API. If you use it for less than 60 minutes a month, it’s free. Beyond that, the cost is $0.006 per 15 seconds. Don’t worry: you’ll get a reminder if you go over your free limit.

  1. In the Cloud Console, open the navigation menu
  2. Click Billing
  3. If you don’t have a billing account, then click New billing account and go through the setup
  4. Return to the main billing page, then click the My projects tab.
  5. Find the name of your new project. Make sure it’s connected to a billing account.
  6. To connect or change the billing account, click the three-dot button, then select Change billing account

Enable the API

  1. In the console, open the navigation menu and click API Manager
  2. Click ENABLE API
  3. Enter “Cloud Speech API” into the search bar, then click the name
  4. Click ENABLE to turn on the API

Create a service account and credentials

  1. Go to the left-hand navigation menu, click API Manager and then click Credentials
  2. Click Create credentials and then click Service account key from the list
  3. From the “Service account” dropdown, click New service account
  4. Enter a name so that you’ll know this is for your voice recognizer, like “Voice credentials”
  5. Select the Project viewer role
  6. Use the JSON key type
  7. Click Create
  8. Your credentials will download automatically. The file name contains your project name and some numbers: locate it and rename it to cloud_speech.json
  9. Open your workstation’s terminal. Move your credentials file to the correct folder by entering one of the following:

    (using the local file system)
    cp /path/to/downloaded/credentials.json ~/cloud_speech.json

    (from another machine)
    scp /path/to/downloaded/credentials.json pi@raspberrypi.local:~/cloud_speech.json

Start the demo app

On your desktop, double-click the Start Dev Terminal icon. Then start the app: src/examples/voice/cloudspeech_demo.py

Check that it works correctly

On your desktop, double-click the Check Cloud icon. Follow along with the script. If everything is working correctly, you’ll see this:

The cloud connection seems to be working

If you see an error message, follow the details and try the Check Cloud script again.

3.2. Voice commands

To issue a voice command, press the arcade button once to activate the voice recognizer and then speak loudly and clearly.

  • "turn on the light": the LED turns on and stays solid
  • "turn off the light": the LED turns off
  • "blink": the LED starts blinking
  • "goodbye": the app automatically exits

3.3. Create a new voice command (or action)

You can create new actions and link them to new voice commands by modifying src/examples/voice/cloudspeech_demo.py directly.

Example: repeat after me

To add a voice command, first tell the recognizer explicitly which phrase to expect. This improves the recognition rate:

 
recognizer.expect_phrase('repeat after me')

Then add the code to handle the command. We will use aiy.audio.say to repeat the recognized transcript:

 

...
# In the process loop. 'text' contains the transcript of the voice command.
if 'repeat after me' in text:
    # Remove the command from the text.
    to_repeat = text.replace('repeat after me', '', 1)
    aiy.audio.say(to_repeat)

The modified cloudspeech_demo.py looks like:

 

"""A demo of the Google CloudSpeech recognizer."""

import os

import aiy.audio
import aiy.cloudspeech
import aiy.voicehat


def main():
    recognizer = aiy.cloudspeech.get_recognizer()
    recognizer.expect_phrase('turn off the light')
    recognizer.expect_phrase('turn on the light')
    recognizer.expect_phrase('blink')
    recognizer.expect_phrase('repeat after me')

    button = aiy.voicehat.get_button()
    led = aiy.voicehat.get_led()
    aiy.audio.get_recorder().start()

    while True:
        print('Press the button and speak')
        button.wait_for_press()
        print('Listening...')
        text = recognizer.recognize()
        if text is None:
            print('Sorry, I did not hear you.')
        else:
            print('You said "', text, '"')
            if 'turn on the light' in text:
                led.set_state(aiy.voicehat.LED.ON)
            elif 'turn off the light' in text:
                led.set_state(aiy.voicehat.LED.OFF)
            elif 'blink' in text:
                led.set_state(aiy.voicehat.LED.BLINK)
            elif 'repeat after me' in text:
                to_repeat = text.replace('repeat after me', '', 1)
                aiy.audio.say(to_repeat)
            elif 'goodbye' in text:
                os._exit(0)


if __name__ == '__main__':
    main()

You may add more voice commands. Several ideas include a "time" command to make it speak out the current time or commands to control your smart light bulbs.
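
As one more illustration, here is a small hypothetical helper for a "time" command (datetime comes from the Python standard library). To wire it into the loop above, add recognizer.expect_phrase('what time is it') and an extra branch such as elif 'what time is it' in text: say_current_time().

import datetime

import aiy.audio


def say_current_time():
    # Speak the current time, for example "It is 10 41".
    now = datetime.datetime.now()
    aiy.audio.say('It is {} {:02d}'.format(now.hour, now.minute))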

3.4. Run your app automatically

Imagine you have customized an app with your own triggers and the Google Assistant library. It is an AIY-version of a personalized Google Home. Now you want to run the app automatically when your Raspberry Pi starts. All you have to do is make a system service (like the status-led service mentioned in the user's guide) and enable it.

Assume your app is src/my_assistant.py and we want to make a system service called "my_assistant". First, it is always a good idea to test your app and make sure it works as you expect. Then you need a systemd config file. Open your favorite text editor and save the following content as my_assistant.service:

 
[Unit]
Description=My awesome assistant app

[Service]
ExecStart=/bin/bash -c '/home/pi/AIY-voice-kit-python/env/bin/python3 -u src/my_assistant.py'
WorkingDirectory=/home/pi/AIY-voice-kit-python
Restart=always
User=pi

[Install]
WantedBy=multi-user.target

The config file is explained below.

  • Description=: a textual description of the service.
  • ExecStart=: the target executable to run. In this case, it executes the python3 interpreter and runs your my_assistant.py app.
  • WorkingDirectory=: the directory your app will be working in. By default, we use /home/pi/AIY-voice-kit-python. If you are working as a different user, please update the path accordingly. Note: systemd unit files do not expand $HOME, so you have to write /home/pi/ explicitly.
  • Restart=: the service should always be restarted if it stops with an error.
  • User=: the user to run the script as. By default we use the "pi" user. If you are working as a different user, please update accordingly.
  • WantedBy=: part of the dependency specification in systemd configuration. You just need to use this value here.

For more details on systemd configuration, please consult its manual page.

We also need to move the file to the correct location, so systemd can make use of it. To do so, move the file with the following command:

sudo mv my_assistant.service /lib/systemd/system/

Now your service has been configured! To enable your service, enter:

sudo systemctl enable my_assistant.service

Note how we are referring to the service by its service name, not the name of the script it runs. To disable your service, enter:

sudo systemctl disable my_assistant.service

To manually start your service, enter:

sudo service my_assistant start

To manually stop your service, enter:

sudo service my_assistant stop

To check the status of your service, enter:

sudo service my_assistant status
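
To watch your service's log output, journalctl works the same way as it does for the voice-recognizer service shown in the Appendix below:

sudo journalctl -u my_assistant -n 10 -f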

3.5. Use TensorFlow on device

Help your fellow makers experiment with on-device TensorFlow models by donating short speech recordings. This small web app will collect short snippets of speech, and upload them to cloud storage. We'll then use these recordings to train machine learning models that will eventually be able to run on-device, no Cloud needed.

4

Hardware extensions

4.1. Connecting additional sensors

GPIO

  • Button: GPIO 23 (button is active low)
  • LED: GPIO 25 (LED is active high)
  • Driver0/GPIO4: GPIO 4 (500mA drive limit, can be used as GPIO)
  • Driver1/GPIO17: GPIO 17 (500mA drive limit, can be used as GPIO)
  • Driver2/GPIO27: GPIO 27 (500mA drive limit, can be used as GPIO)
  • Driver3/GPIO22: GPIO 22 (500mA drive limit, can be used as GPIO)
  • Servo0/GPIO26: GPIO 26 (25mA drive limit, can be used as GPIO)
  • Servo1/GPIO6: GPIO 6 (25mA drive limit, can be used as GPIO)
  • Servo2/GPIO13: GPIO 13 (25mA drive limit, can be used as GPIO)
  • Servo3/GPIO5: GPIO 5 (25mA drive limit, can be used as GPIO)
  • Servo4/GPIO12: GPIO 12 (25mA drive limit, can be used as GPIO)
  • Servo5/GPIO24: GPIO 24 (25mA drive limit, can be used as GPIO)
  • I2S: GPIO 20, 21, 19 (used by the Voice HAT ALSA driver, not available to the user)
  • Amp Shutdown: GPIO 16 (used by the Voice HAT ALSA driver, not available to the user)
  • I2C: GPIO 2, 3 (available as GPIO or I2C via Raspbian drivers)
  • SPI: GPIO 7, 8, 9, 10, 11 (available as GPIO or SPI via Raspbian drivers)
  • UART: GPIO 14, 15 (available as GPIO or UART via Raspbian drivers)
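
As a starting point for your own sensors, here is a small sketch using the stock RPi.GPIO library (preinstalled on Raspbian) that drives Driver0 (GPIO 4) while the arcade button (GPIO 23, active low) is held down. Run it on its own rather than alongside a demo app, since the demos claim the button pin themselves:

import time

import RPi.GPIO as GPIO

BUTTON = 23   # arcade button, active low (see the table above)
DRIVER0 = 4   # Driver0, 500mA drive limit


def main():
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(BUTTON, GPIO.IN, pull_up_down=GPIO.PUD_UP)
    GPIO.setup(DRIVER0, GPIO.OUT, initial=GPIO.LOW)
    try:
        while True:
            # Turn the external load on while the button is held down.
            pressed = GPIO.input(BUTTON) == GPIO.LOW
            GPIO.output(DRIVER0, GPIO.HIGH if pressed else GPIO.LOW)
            time.sleep(0.05)
    finally:
        GPIO.cleanup()


if __name__ == '__main__':
    main()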

4.2. Google Actions + Particle Photon (via Dialogflow)

Want to learn how to use your Voice Kit to control other IoT devices? You can start here with a Particle Photon (a Wi-Fi development kit for IoT projects) and Dialogflow (a tool for creating conversational interfaces). This tutorial will show how to make your Voice Kit communicate with Dialogflow (and Actions on Google) to control an LED light with the Photon by voice.

Get all the code for this example here.

Android Things

What's included

This example ties together multiple technology platforms, so there are a few separate components included in this repo:

  • dialogflow-agent - an agent for Dialogflow
  • dialogflow-webhook - a web app to parse and react to the Dialogflow agent's webhook
  • particle-photon - a Photon app to handle web requests, and to turn the light on and off

We've included two separate web app implementations. Choose (and build on) the one that best suits your preferences:

This should be enough to get you started and on to building great things!

What you'll need

We’ll build our web app with Node.js, and will rely on some libraries to make life easier:

On the hardware side, you will need:

It's handy to have a breadboard, some hookup wire, and a bright LED, and the examples will show those in action. However, the Photon has an addressable LED built in, so you can use just the Photon itself to test all the code presented here if you prefer.

You'll also need accounts with:

  • Dialogflow (for understanding user voice queries)
  • Google Cloud (for hosting the webhook webapp/service)
  • Particle Cloud (for deploying your Photon code and communicating with the Particle API)

If you're just starting out, or if you're already comfortable with a microservices approach, you can use the 1-firebase-functions example — it's easy to configure and requires no other infrastructure setup. If you'd prefer to run it on a full server environment, or if you plan to build out a larger application from this, use the 2-app-engine example (which can also run on any other server of your choosing).

If you've got all those (or similar services/devices) good to go, then we're ready to start!

Getting started

Assuming you have all the required devices and accounts as noted above, the first thing you'll want to do is to set up apps on the corresponding services so you can get your devices talking to each other.

Local setup

First, you'll need to clone this repo, and cd into the newly-created directory.

git clone git@github.com:google/voice-iot-maker-demo.git
cd voice-iot-maker-demo

You should see three directories (alongside some additional files):

  • dialogflow-agent - the contents of the action to deploy on Dialogflow
  • dialogflow-webhook - a web application to parse the Google Actions/Dialogflow webhook (with server-based and cloud function options)
  • particle-photon - sample code to flash onto the Particle Photon

Once you’ve taken a look, we’ll move on!

Dialogflow

Using the Dialogflow account referenced above, you’ll want to create a Dialogflow agent. We'll be setting up a webhook to handle our triggers and send web requests to the Particle API.

  1. Create a new agent (or click here to begin). You can name it whatever you like
  2. Select Create a new Google project as well
  3. In the Settings section (click on the gear icon next to your project name), go to Export and Import
  4. Select Import from zip and upload the zip provided (./dialogflow-agent/voice-iot-maker-demo.zip)

You've now imported the basic app shell — take a look at the new ledControl intent (viewable from the Intents tab). You can have a look there now if you're curious, or continue on to fill out the app's details.

  1. Head over to the Integrations tab, and click Google Assistant.
  2. Scroll down to the bottom, and click Update Draft
  3. Go back to the General tab (in Settings), and scroll down to the Google Project details.
  4. Click on the Google Cloud link and check out the project that's been created for you. Feel free to customize this however you like.
  5. Click on the Actions on Google link, and go to 2 - App information
  6. Click Add, and fill in the details of your project there
    1. Add some sample invocations, as well as a pronunciation of your Assistant app's name
    2. Fill out the other required fields (description, picture, contact email, etc.)
  7. Scroll down to the bottom, and click Test Draft

You can now test out the conversational side of the app in one of two ways:

You can also try talking to your application on any Assistant-enabled device that you’re signed into.

However, if you’re following along step-by-step, it won't turn any lights on yet — we still have to set up the web service and the Photon app. Onward then!

Google Cloud

Depending on which hosting environment you want to use, cd into either ./dialogflow-webhook/1-firebase-functions or ./dialogflow-webhook/2-app-engine, and continue the setup instructions in that directory's README.md file.

IMPORTANT: Regardless of what hosting/deployment method you choose, make sure you return to the Dialogflow panel and go into the Fulfillment tab to update the URL field. Also, check that the DOMAINS field is set to "Enable webhook for all domains". Without doing these things, Dialogflow won't be able to talk to your new webhook.

Particle

Make sure the Photon is correctly set up and connected. (If it’s not configured yet, follow the steps in the Particle docs.)

You can upload your code to your Photon via the Particle web editor, the Particle Desktop IDE (based on Atom), or the Particle command-line tools.

We'll be using the CLI for this example, which you can install thusly:

sudo npm i particle-cli -g

To deploy via the command line, first make sure you’re logged in:

particle login

You can find out the ID of your device by running:

particle list

Then upload the code using that ID:

particle flash [YOUR-DEVICE-ID] particle-photon/particle-blink-demo.ino

The Photon should blink rapidly while the upload is in process, and when it's done (and calmly pulsing cyan), you're ready to go.

Note: Make sure you generate a Particle access token, and add that token (along with your Photon's device id) to your config.js file.

You can make sure it all works by running the following from your terminal:

curl https://api.particle.io/v1/devices/[YOUR-DEVICE-ID]/led -d access_token=[YOUR-ACCESS-TOKEN] -d led=on

If everything is configured properly, you should see something like the following:

{
    "id": "[YOUR-DEVICE-ID]",
    "last_app": "",
    "connected": true,
    "return_value": 1
}

You should see the Photon's light come on (along with an LED on the breadboard, if you've wired one up)! Doing the same with led=off will return a 0 instead of a 1, and will (you guessed it) turn the light off.

Note: If you ever see a "return_value":-1, that's an error message — something has gone wrong somewhere.

Putting it all together

Once you’ve uploaded all the code and each service is configured, it’s time to give it all a try! You can confirm that everything went to plan by going to either your Assistant-enabled device or the Google Actions simulator, asking to talk to your app ("talk to [APP-NAME]"), and typing "turn the light on". If all goes well, your LED should turn on!

Further reading

This application is just a taste of what's possible — how far you take this framework is up to you! Here are a few resources to help you continue on your journey:

5

Appendix

5.1. Log Data and Debugging

You can view logs to get a better sense of what’s happening under the (cardboard) hood if you’re running the voice-recognizer as a service.

Logs

With the voice-recognizer running manually or as a service, you can view all log output using journalctl.

sudo journalctl -u voice-recognizer -n 10 -f

Example logs
Clap your hands then speak, or press Ctrl+C to quit...
[2016-12-19 10:41:54,425] INFO:trigger:clap detected
[2016-12-19 10:41:54,426] INFO:main:listening...
[2016-12-19 10:41:54,427] INFO:main:recognizing...
[2016-12-19 10:41:55,048] INFO:oauth2client.client:Refreshing access_token
[2016-12-19 10:41:55,899] INFO:speech:endpointer_type: START_OF_SPEECH
[2016-12-19 10:41:57,522] INFO:speech:endpointer_type: END_OF_UTTERANCE
[2016-12-19 10:41:57,523] INFO:speech:endpointer_type: END_OF_AUDIO
[2016-12-19 10:41:57,524] INFO:main:thinking...
[2016-12-19 10:41:57,606] INFO:main:command: light on
[2016-12-19 10:41:57,614] INFO:main:ready...
  1. Any lines before and including this one are part of the initialization and are not important
  2. Here is where the main loop starts
  3. Each successful trigger is logged
  4. Once a trigger is recognized, the audio recording is activated
  5. … and a new session with the Cloud Speech API is started
  6. For this a new token is generated to send the recognition request
  7. Feedback from the recognizer that it is listening (our request was accepted)

    See https://cloud.google.com/speech/reference/rest/v1beta1/EndpointerType

  8. Same as line 7

  9. Same as line 7
  10. Back in the application, where we dispatch the command
  11. The command that has been dispatched
  12. The app is ready and waits for a trigger again

Project complete!

You did it! Whether this was your first hackable project or you’re a seasoned maker, we hope this project has sparked new ideas for you. Keep tinkering, there’s more to come.