Vision Kit

Do-it-yourself intelligent camera. Experiment with image recognition using neural networks.

Introduction

The AIY Vision Kit from Google lets you build your own intelligent camera that can see and recognize objects using machine learning. All of this fits in a handy little cardboard cube, powered by a Raspberry Pi.

Everything you need is provided in the kit, including the Raspberry Pi.

Get the kit

For product discontinuation plans, see the product lifecycle details.

Meet your kit

Welcome! Let’s get started

The following instructions show you how to assemble your AIY Vision Kit, connect to it, and run the Joy Detector demo, which recognizes faces and detects if they're smiling.

Then you can try running some other demos that detect other kinds of objects with the camera. You can even install your own custom-trained TensorFlow model.

Time required to build: 1.5 hours

If you have any issues while building the kit, check out our help page or contact us at support-aiyprojects@google.com.

Check your kit version

These instructions are for Vision Kit 1.1. Check your kit version by looking on the back of the white box sleeve in the bottom-left corner.

If it says version 1.1, proceed ahead! If it doesn’t have a version number, follow the assembly instructions for the earlier version.

Gather additional items

You’ll need some additional things, not included with your kit, to build it:

Micro USB power supply: The best option is a USB power supply that can provide 2.1 amps of power via a micro USB B connector. The second-best choice is a phone charger that also provides 2.1 A (sometimes called a fast charger). Don't try to power your Raspberry Pi from your computer: it will not be able to provide enough power, and it may corrupt the SD card, causing boot failures or other errors.

Below are two different options to connect to your kit. Choose the one that works best for you, based on what you have available:

Option 1: Use the AIY Projects app

Choose this option if you have access to an Android smartphone and a separate computer.

You’ll need:

  • Android smartphone
  • Windows, Mac, or Linux computer
  • Wi-Fi connection
  • Optional: Monitor or TV (any size will work) with an HDMI input. Many of the demos give you the opportunity to see what your Vision Kit’s camera sees, so it is helpful to connect a monitor or TV directly to your kit. If you don’t have one available, many of the demos will still work, but you won’t see the visual output.
  • Optional: Normal-sized HDMI cable and mini HDMI adapter

Option 2: Use a monitor, mouse, and keyboard

Choose this option if you don’t have access to an Android smartphone.

You’ll need:

  • Windows, Mac, or Linux computer
  • Mouse
  • Keyboard
  • Monitor or TV (any size will work) with an HDMI input
  • Normal-sized HDMI cable and mini HDMI adapter
  • Adapter to attach your mouse and keyboard to the kit. Below are two different options.

Adapter option A: USB On-the-go (OTG) adapter cable to convert the Raspberry Pi USB micro port to a normal-sized USB port. You can then use a keyboard/mouse combo that requires only one USB port.

Adapter option B: Micro USB Hub that provides multiple USB ports to connect to any traditional keyboard and mouse.

Get to know the hardware

Open your kit and get to know what’s inside.

Take note that the Electrical Hardware bag is underneath the Mechanical Hardware bag.

Missing something? Please send an email to support-aiyprojects@google.com and we will help direct you to finding a replacement.

In your kit

  1. Vision Bonnet (×1)
  2. Raspberry Pi Zero WH (×1)
  3. Raspberry Pi Camera v2 (×1)
  4. Long flex cable (×1)
  5. Push button (×1)
  6. Button harness (×1)
  7. Micro USB cable (×1)
  8. Piezo buzzer (×1)
  9. Privacy LED (×1)
  10. Short flex cable (×1)
  11. Button nut (×1)
  12. Tripod nut (×1)
  13. LED bezel (×1)
  14. Standoffs (×2)
  15. microSD card (×1)
  16. Camera box cardboard (×1)
  17. Internal frame cardboard (×1)

Build your kit

Get the Latest System Image

This kit requires a special version of the Raspberry Pi operating system that includes some extra AIY software.

Although the microSD card included with your kit is pre-flashed with the AIY system image, it's out of date. So before you begin, we highly recommend you download and install the latest system image and flash it to your microSD card. Otherwise, you might encounter some old bugs and some of the sample code might not work for you.

  1. Download the latest .img.xz file from our releases page on GitHub.
  2. Use an adapter to connect your microSD card to your computer.
  3. Download, install, and launch the Raspberry Pi Imager.
  4. Click Choose OS, scroll to the bottom, select Use custom, and find the .img.xz file you downloaded above.
  5. Click Choose storage to select your microSD card and then click Write to begin flashing the SD card.

Flashing the card can take a few minutes, so start assembling the kit. Once the kit is assembled, you'll put the card into it.

Fold the internal frame

Round up your parts

First, let’s build the internal frame that will go inside your camera box. Gather up:

  • Internal frame cardboard
  • Raspberry Pi Camera v2
  • Long flex cable
  • Piezo buzzer

Open connector latch

Start by finding your Raspberry Pi Camera v2 board and open the cable connector latch by pulling gently back on the black raised latch.

Need more help? The latch is pretty tiny: fingernails help open it. If you can’t tell whether it’s open or not, the latch will wiggle a little in the open position.

Insert long flex into latch

Grab your long flex cable. Find the wide end of the cable, and make sure the side with the copper stripes is facing away from you.

Insert the wide end until it hits the back of the connector. You’ll still see the edge of the cable when fully inserted, so don’t force it in.

Close connector latch

Close the cable connector latch by pressing down. You should feel the latch snap into place. Gently check that the cable is secure.

Set the camera board down — you’ll need it again in a few steps.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Orient the cardboard

Take the internal frame cardboard and orient it as shown in the photo.

Remove adhesive liner

In the middle of the cardboard is a rectangular cutout labeled A. Remove the adhesive liner from the cutout.

Fold up flap A

Fold the adhesive flap toward you, then against the frame. Press the flap firmly down against the cardboard frame so they stick together.

Flip it over

Flip the frame over as shown in the photo.

Insert piezo buzzer

Find your piezo buzzer and stick it to the adhesive flap that you just folded.

Orient the buzzer so that its wire follows the opening (and the side with the hole is facing towards you), as shown in the image.

Add the camera

Grab the camera you assembled earlier. Peel the clear sticker off the lens, and place the camera board aperture into the rectangular slot in the middle of the cardboard.

The lens should be facing towards you.

Flip to the other side

Inspect the cardboard from the other side and check that it matches the picture.

Fold down the top flap

Fold the top flap over the camera board.

Fold the left and right flaps

Holding the top flap down, fold the flaps on the left and the right of the board toward you to hold the camera in place.

There are two small cardboard notches on each side that will loosely hold the flaps in place.

Fold the bottom flap

While holding those flaps in, fold the bottom flap of the inner frame upwards. Lock the flap in place by securing the tabs into the notches.

The bottom of the assembly should look like a shelf.

Thread the buzzer wire

Thread the piezo buzzer wire through the circular opening next to the long flex.

Fold the long flex up

Fold the long flex upwards and crease it by pressing gently.

It’s okay to bend the long flex a bit, so don’t worry about damaging it.

Fold the long flex to the left

Then fold the long flex to the left at a 45-degree angle. The unconnected end should be aligned with the three slits on the left.

You’re going to weave the cable through the slits, like a shoestring.

Thread the long flex

There are three slits on the left flap of the frame. Thread the flex cable into the bottom slit, making sure that the side with the copper lines is facing away from you...

Keep threading

...and up towards you, through the middle slit...

More threading

...and then through the final slit. The cable should now be threaded through all three slits, and sticking out of the left side of the frame.

This keeps the long flex in place.

Set your internal frame aside for now.

Connect the boards

Gather your parts

Now we’re going to connect the circuit boards together. Round up:

  • Raspberry Pi: the green board, a small but mighty Linux computer designed for makers. It has an SD card slot, two USB connectors, and a mini HDMI connector.
  • Vision Bonnet: the blue board, an accessory for the Raspberry Pi that lets you run machine learning programs to identify images from the camera. It contains a special chip designed to run machine learning programs.
  • Standoffs (×2)
  • Short flex cable: a flexible circuit board, used to connect boards in electronics when rigid boards have to fit in a tight space.
  • Button harness

Orient your Raspberry Pi

Orient your Raspberry Pi so that the 40-pin header is on the left edge of the board, like the photo. (A header is a fancy electronics term for a set of wire connectors. In this case we refer to each wire as a pin, and there are 40 of them arranged in two columns. Headers are usually used to connect electronics together and provide electrical access to internal components.)

WARNING: First make sure your Raspberry Pi is disconnected from any power source and other components. Failure to do so may result in electric shock, serious injury, death, fire or damage to your board or other components and equipment.

Open the top cable connector

Open the flex connector latch by pulling gently back on the black raised latch. Be careful, as it only takes a little bit of effort to open. Make sure you're holding the board by its edges, as shown in the photo.

Need more help? The latch is pretty tiny: fingernails inserted on either side between the black and white parts will help open it. If you can’t tell whether it’s open or not, the latch will wiggle a little in the open position, and there will be a visible gap between the black and white parts of the connector.

Insert the short flex

Find your short flex cable. See that the ends are labeled Rasp Pi and Vision Bonnet. Take note of the Rasp Pi end, then flip the short flex over lengthwise (like the photo) so the side with the labels (and copper stripes) is facing away from you.

Insert the end of the short flex labeled Rasp Pi into the flex connector until the flex hits the back of the connector.

Close the cable connector latch

Close the cable connector latch by pressing down. You should feel the latch snap into place. Gently check that the cable is secure by lightly tugging on it, and make sure that it is inserted squarely.

WARNING: Failure to securely seat the ribbon in the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Orient your Vision Bonnet

Find your Vision Bonnet and orient it so the 40-pin header connector is on the left and the white connector is on the bottom.

There are two white-colored cable connectors on the board (one on the top, and one on the bottom), so make sure the 40-pin header is to your left when you hold it (as shown).

Open the cable connector latch

Look at the white cable connector from the side. If the tiny black latch is standing up above the white base, it is already open. If the black latch is lying flat, flush with the white base, it is closed.

If it is closed, open the cable connector latch by gently flipping the black latch upwards so that it stands up.

Fingernails or tweezers help here. Try to pull from the sides rather than the center of the flap.

Insert the short flex

Insert the short flex into the Vision Bonnet (the other side should be connected to the Raspberry Pi).

Make sure the side with the copper stripes (and labels) is still facing away from you, as shown in the picture.

Close the cable connector latch

Close the cable connector latch on the Vision Bonnet by flipping the black latch back down parallel to the white base. This will secure the short flex.

Flip the assembly over. Make sure the Rasp Pi and Vision Bonnet labels on the short flex correctly correspond to the boards they’re connected to.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Insert standoffs

Insert the standoffs into the Raspberry Pi board in the holes opposite the pin header.

Need more help?

The standoffs are tough to get in sometimes. You might need to press firmly and wiggle them into place. If you find you're having to use too much force, use a pair of pliers to squeeze the end of the standoffs while inserting into the holes.

Align the boards

Align the Vision Bonnet header connector with the pin header on the Raspberry Pi.

Push the short flex inward

Push the short flex into the space between the two boards.

Connect the boards and check connections

Firmly push the boards together to snap the standoffs into place. Once you've mated the two headers together, push from the center of the connectors (rather than the edges of the boards) to finish the connection. You may have to work your way around the board to make sure the standoffs snap in as well.

Make sure the standoffs have snapped into the boards and that the 40-pin header is pushed all the way down so that there is no gap between the two boards.

WARNING: Damaging the ribbon or failure to securely seat the Vision Bonnet may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Add the boards to the frame

Orient your boards

Now we can insert the boards into the internal frame.

Orient your boards so the Vision Bonnet is facing you, and the white cable connector is on the bottom.

If it isn’t already, open the cable connector latch on the Vision Bonnet by gently flipping the black latch upward.

Connect the boards and frame

Find your internal frame and the long flex that you threaded through the cardboard slits.

With the black tip of the long flex facing toward you, connect it to the white cable connector. Then flip the black latch into the parallel state to close it.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Slide the boards into the frame

Now that the boards are connected, slide the boards into the frame, with the Vision Bonnet (the blue one) on top. The boards slide into a slot that looks like a mouth :o

Lightly crease the twisted part of the long flex so that it lays closer against the cardboard.

Check frame and cables

Double-check that your internal frame assembly looks like the one pictured.

The lower tip of each white standoff should be on the outside of the internal frame walls.

Plug in the button harness

Find your button harness and plug it into the top of the Vision Bonnet board. Either end is fine. You can also remove the white tag.

You’ve built the frame and connected the boards. Now let’s build the camera box that it all goes into.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Build the box

Orient the cardboard

Now it’s time to fold the camera box. You’ll need:

  • Camera box cardboard
  • Tripod nut

Find the camera box cardboard and unfold it, holding it so that the lettered labels face towards you.

Fold A

Fold each of the flaps labeled A upwards.

Fold B

Now fold the two flaps labeled B toward you.

Fold C

While holding A and B in, fold the flap labeled C toward you.

Peel adhesives

You’re now going to secure the bottom of the box.

Peel the adhesive liner off the flaps labeled A...

Fold D

While applying pressure to the outer sides of the box, fold flap D over and onto both of the A flaps. Press down to secure the adhesive.

Two sets of arrows should now line up.

Insert tripod nut

Take the tripod nut and slide it, wider side facing down, into the slot labeled tripod nut.

Fold E

Fold the flaps labeled E inward.

Fold F

Fold the flap labeled F inward, keeping the E flaps inside the box.

Remove adhesive on E

Remove the adhesive on the right-hand E flap.

Fold G

Fold over the right-hand flap labeled G, creasing it very close to the edge of panel B. Secure the adhesive by pressing down.

The two arrows should now face each other.

Fold and secure the other side

Repeat the previous two steps on the other side. Ensure that the arrows are aligned.

Fold bottom flap

Fold down the bottom retaining flap, and bend it in the center so the tip points toward you.

Your camera box is now built! But don't close it yet, because you'll now put the rest inside.

Bring it together

Slide in the internal frame

Take your internal frame and slide it into the back of the camera box (as shown). Notice the base of the box contains a slot that aligns with the internal frame's left wall.

Check the boards and wires

Ensure that your Raspberry Pi and Vision Bonnet board are still sitting snugly in the internal frame and that your long flex cable is secure. Also check that the camera is visible through the camera hole on the other side of your box.

Now let’s put the finishing touches on your box.

Install the LED

Gather your parts

Now we’re going to attach the LED to your box. Find your:

  • Privacy LED
  • LED bezel

Install the LED bezel

Flip your box to the front side. You will see three holes.

Push the LED bezel into the top-left hole, above the camera aperture.

Install privacy LED

Turn the box around. Take the privacy LED and insert the end with the bulb into the LED bezel you just installed. It should snap into place.

Check other side

Make sure the privacy LED is peeking out on the other side.

Install the Hardware

Gather your parts

You’re in the home stretch! So, let’s install the button. You’ll need:

  • Push button
  • Button nut

Thread wires through nut

Gather the piezo buzzer, privacy LED, and button harness cables. Thread all three through the button nut.

The wider side of the button nut should be facing upwards, toward the top of the box.

Thread the cables through the box

Take those same cables and insert them through the hole on the top of the box.

Plug wires into push button

Get your push button. Hold it upside down, and check the board for the words PIEZO, LED, and BONNET (they’ll be tiny).

Then take the piezo buzzer cable and plug it into the slot on the left labeled PIEZO. Plug the privacy LED wire into the middle slot labeled LED, then the button harness into the black connector on the right.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Insert push button

Insert the push button into the hole.

Secure button nut

From the inside of the box, screw the button nut to secure the push button.

Make sure the wider, flanged side of the nut is facing upwards.

Check your completed box

The completed assembly should look like the image.

Before you close it up, it doesn’t hurt to make sure all your cables are still connected.

Close the box

Close the box and secure it with the tab.

Insert the SD card

Grab the microSD card that you flashed with the latest system image earlier; it holds all the software your kit needs.

With the arrow side facing up, insert your SD card into the silver slot on the Raspberry Pi, which you can find through the cardboard cutout on the side.

WARNING: Forcing connectors into misaligned ports may result in loose or detached connectors. Loose wires in the box can cause electric shock, shorts, or start a fire, which can lead to serious injury, death or damage to property.

Congrats, you’ve just assembled the Vision Kit hardware!

Now you’re ready to turn it on.

Try it out

Turn it on

Plug your Vision Kit into a power supply

Plug your Vision Kit into a wall power supply through the port labeled Power on your device.

See Meet your kit for power supply options. Do not plug your Vision Kit into a computer for power.

Let it boot up

To confirm that it’s connected to power, look into the hole in the cardboard labeled SD Card. You’ll see a green LED light flashing on the Raspberry Pi board.

Be patient while it boots up; the first boot takes a few minutes. You’ll know it’s booted when you hear a short tune. The software needs this time to install and configure settings. In the future, it’ll start faster.

When you decide to put away your kit, follow the steps to safely shut it down.

Use the Joy Detector

Try out the Joy Detector

Point the Vision Kit toward someone’s face (or your own) to try out the Joy Detector demo. When the camera detects a face, the button illuminates.

  • Ask them to smile
  • Then ask them to smile REALLY BIG
  • Then ask them to make a frowny face

The Joy Detector uses machine learning (the science of making predictions based on patterns and relationships that have been automatically discovered in data) to detect whether a person is smiling or frowning, and how strongly. A smile turns the button yellow, and a frown turns it blue.

If expressions are really big, a sound plays. If the camera sees more than one face, it evaluates each person’s face and sums the joy score of each face.
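
If you're curious what "sums the joy score" looks like in code, here's a tiny Python sketch. It assumes a list of face results with a joy_score attribute, as the face detection API provides; the real joy_detection_demo has more going on, so treat this as an illustration, not the demo's exact logic:

def total_joy(faces):
    # Sum the joy score of each detected face, as described above.
    return sum(face.joy_score for face in faces)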

Not working?

Try holding the camera at least an arm’s length away from the face you’re pointing it at. Sometimes the camera has trouble if the subject is too close. And be sure the subject is well lit from the front. Keep this in mind for all the demos that you try.

Take a photo

At any point while Joy Detector is running, you can snap a photo by pressing the button.

If you take a photo while the camera detects a face (the button is illuminated), it saves a second version of the photo that's annotated with the joy score.

These photos are saved on the SD card in the ~/Pictures/ directory. You'll learn how to access these photos after you connect to your kit in the next step.

Connect to your kit

Select an option

To try out other demos, you’ll connect to your Vision Kit so that you can give it commands. There are two options for connecting, explained in Meet your kit.

Follow the instructions below for one connection option, either with the AIY Projects app or with a monitor, mouse, and keyboard.

Option 1: AIY Projects app

Download the AIY Projects app

Go to the Google Play Store and download the AIY Projects app.

This app will allow you to connect your Vision Kit to a Wi-Fi network, and display an IP address which you’ll use to communicate with your Vision Kit wirelessly from a separate computer via SSH. (SSH stands for “secure shell.” It’s a way to securely connect from one computer to another.)

Psst: This app only works on Android smartphones. If you don’t have one, please use the alternate connection method (which uses a monitor, keyboard, and mouse).

Follow app instructions

Open the app and follow the onscreen instructions to pair with your Vision Kit.

Take note of the IP address — you’ll need it later. (An Internet Protocol address is a set of four numbers that identifies a device on a network. Every device on your network, such as your computer, phone, and Vision Kit, has a unique IP address. Using this address, one device can talk to another.) The app will also remember and display it on the home screen.

Not working? Make sure your Vision Kit is connected to a power supply.

If you run into errors, quit the app and try again.

If the device won’t pair, make sure the green LED on the Vision Bonnet is flashing. If it’s not flashing, it may have timed out. Press and hold the Vision Bonnet button for 5 seconds, and try again. If that doesn’t work, try restarting your phone.

Optional: Connect a monitor

A monitor is not required to run these demos, but if you have one available, it can be useful so that you can see what your Vision Kit is seeing.

If you have a monitor and mini-HDMI cable (or HDMI cable + mini-HDMI adapter) available, follow these instructions:

  • Unplug your kit from power
  • If it’s open, then close the back of your kit
  • Connect your monitor to the mini-HDMI port labeled HDMI on the back of your kit
  • Make sure your monitor is connected to power
  • Plug your kit back into power
  • Wait for the kit to boot; you’ll hear a beep when it’s ready

When your kit is booted, you’ll see a desktop with the AIY logo on the background.

A pop-up will tell you that the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects or expose it to the internet, but for now, you can leave it (we’ll explain more later).

If you don’t have a monitor, or when your kit is powered back on, go to the next step.

Connect your computer to Wi-Fi

Make sure your computer is on the same Wi-Fi network as your Vision Kit. This will allow you to connect to your kit through SSH.

Get your terminal ready

We’re going to connect your computer to the Raspberry Pi using SSH in a terminal. (A terminal is a text window where you can issue commands to your Raspberry Pi; SSH lets you do so from a separate computer.)

If you’re familiar with using a terminal, start an SSH session with pi@192.168.0.0 (but using the Raspberry Pi's real IP address from above), then skip ahead to entering the Raspberry Pi’s password.

If you're not familiar with a terminal, download and install the Chrome browser and Secure Shell Extension, and proceed to the next step.

Open the Secure Shell Extension

Once the extension is installed, open it.

If you’re using Chrome on a Windows, Mac, or Linux computer, you should see the Secure Shell Extension icon in the toolbar (look to the right of the address bar). Click that icon and then select Connection Dialog in the menu that appears.

If you’re using Chrome on a Chromebook, go to the app menu and type "secure shell app extension".

Connect to the Raspberry Pi

In the top field, type pi@192.168.0.0, replacing those numbers with the real IP address of your Raspberry Pi. After typing this in, click on the port field. The [ENTER] Connect button should light up.

Click [ENTER] Connect to continue.

Can’t connect? If you can’t connect, check to make sure the IP address you wrote down earlier is correct and that your Raspberry Pi is connected to the same Wi-Fi access point your computer is.

Note If you rewrite or replace your SD card, you will need to remove and re-add the Secure Shell Extension in Chrome. You can do this by right-clicking the icon in your toolbar and selecting "Remove", then re-adding it by following the instructions above.

Give the extension permission

Click Allow.

This gives permission to the SSH extension to access remote computers like your Raspberry Pi.

You will only need to do this once when you add the extension into Chrome.

Continue connecting

At the prompt, type yes and press enter to confirm that the displayed host key matches what is stored on your Raspberry Pi. (To verify that the computer you're connecting to is actually the one you expect, the Raspberry Pi generates a long number, the host key, and presents it to the extension each time. The extension saves this key somewhere safe so it can check that the computer you're speaking to is the right one.) You will only have to do this the first time you connect to your kit.

Enter the Raspberry Pi’s password

Enter the Raspberry Pi’s password at the prompt. The default, case-sensitive password is raspberry.

When you type, you won’t see the characters.

Note Your IP address might be different than the one shown in the example.

It’s okay if you see the warning line. It’s letting you know that the host key has been saved, and next time the extension will automatically compare what it stored with what the Raspberry Pi provides.

Having trouble? If the password is typed wrong, you’ll see “Permission denied, please try again” or “connection closed.” You’ll need to restart your connection by pressing the R key.

Confirm you’re connected

If the password was entered correctly, you’ll see a message about SSH being enabled by default, and the pi@raspberrypi:~ $ shell prompt will be green. (A shell is a program that waits for instructions from you and helps you make your computer work for you. Its prompt indicates that it is ready to receive commands and shows your current working directory (the tilde, ~, in this case), ending in a $ where you type your command.)

You will also see a warning that the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects, or expose it to the internet, but for now it’s okay to proceed.

Congrats! You’re now connected to your Vision Kit. Skip to the Try More Demos section to explore more Vision Kit demos.

Do I need to change my password? You'll want to change the pi user's password if you plan on using this kit in a project that is exposed to the open internet. It's not safe to expose it with a password everybody knows. If you plan on doing this, you'll want to use the passwd program. This is an advanced step, so for the purposes of this guide, we will assume you haven't changed the password for the pi user.

Note If you do change the password, make sure you keep your password written down somewhere safe in case you forget. It's not easy to recover it if you change it.

Option 2: With monitor, mouse, and keyboard

Gather your peripherals

Use this connection option if you don’t have access to an Android smartphone and second computer, or if you prefer to connect directly to your Raspberry Pi.

You’ll need a set of peripherals to interact with your Raspberry Pi, including a monitor, keyboard and mouse. Check here for suggestions.

Unplug your kit

Before plugging in your peripherals, unplug your kit from power.

Connect peripherals

Plug your monitor into the HDMI port and your keyboard and mouse into the Data port on your Vision Kit using one of the adapters described in Meet your kit.

Plug your monitor into power if it’s not already.

Plug your Vision Kit into a power supply

Plug your Raspberry Pi back into power via the Power port. To confirm that it’s connected to power, look into the hole in the cardboard labeled SD Card. You’ll see a green LED flashing on the Raspberry Pi board.

You’ll also see the Raspberry Pi logo in the top left corner of the monitor.

Wait for your device to boot, which will take about two minutes. It’s okay if your screen goes black while it’s booting. Be patient! You’ll know when it’s booted when you hear it beep.

Acknowledge the warning

You’ll see a desktop with the AIY logo on the background. A pop-up will tell you the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects or expose it to the internet, but for now, it's safe to click OK.

Do I need to change my password? You'll want to change the pi user's password if you plan on using this kit in a project that is exposed to the open internet. It's not safe to expose it with a password everybody knows. If you plan on doing this, you'll want to use the passwd program. This is an advanced step, so for the purposes of this guide, we will assume you haven't changed the password for the pi user.

Note If you do change the password, make sure you keep your password written down somewhere safe in case you forget; it's not easy to recover if you change it.

Open the terminal

Open the terminal (a text window where you can issue commands to your Raspberry Pi) by clicking the black rectangular icon on the taskbar at the top of the screen.

Now you’ll be able to issue commands to your Raspberry Pi.

Confirm you’re connected

You should now see the prompt pi@raspberrypi: ~ $. (The prompt is the shell's response indicating that it is ready to receive commands; it shows your current working directory (the tilde, ~, in this case) and ends in a $ where you type your command.)

Congrats! You’re ready to start issuing commands to your Raspberry Pi.

What if my prompt looks different? If you clicked on the Start dev terminal icon, you’ll see the prompt “pi@raspberrypi: ~/AIY-projects-python $” instead. This is because the Start dev terminal shortcut is set up to open a terminal and then set your working directory to “~/AIY-projects-python”. That’s fine for the purpose of these instructions. We’ll show you in a few steps how you can use the cd command to change your working directory.

Try more demos

See how computer vision works

Now that you’ve connected to your Vision Kit, you can try out other demos to experiment with machine learning.

View an image on your Pi

View your photos

If you connected a monitor to your Vision Kit, you can now look at the photos you captured. (Unfortunately, you cannot view the photos if your Vision Kit isn't directly plugged into a monitor.)

To view the photos captured with the Joy Detector, you first need to navigate your terminal to the ~/Pictures directory. (You might have heard the terms "folder" or "directory" before; they are synonyms for the same thing: a data structure that contains a listing of filenames and the location of their contents on disk. Think of a directory like a table of contents.) To get there, use the cd command. cd stands for “change directory”; think of it as clicking through file folders. You should see the path in the command line in blue. Capitalization matters: it’s cd, not CD. If you ever get lost or curious, typing pwd and pressing enter will display your current path. So type the following into the terminal and press enter:

cd ~/Pictures

Now type ls and press enter to see what’s in the directory. (ls is shorthand for “list” and prints out all of the files in the current working directory; it's a great way to look around and see what changed on disk. Hint: that’s an “l” as in lemon, not a #1.) You should see a list of filenames ending with .jpeg. Let's look at one of these.

Type the following command in your terminal and press enter, replacing <filename> with the filename you want to open (such as 2018-05-03_19.52.00.jpeg):

DISPLAY=:0 gpicview <filename>

This photo opens in a new window on the monitor that's plugged into the Vision Kit.

Tip: If you connected to your Vision Kit with monitor, mouse, and keyboard, you can enter the command without DISPLAY=:0.

To close the photo window from your terminal, press Ctrl-C. (Ctrl-C interrupts a running process and returns control back to the terminal prompt.)

What is gpicview? gpicview is an application that you can use to display an image. You need to prefix the command with “DISPLAY=:0” when connecting to your Pi via SSH to tell gpicview which display to show the image on.

Learn more about working in the terminal: Check out some guides from our friends at the Raspberry Pi Foundation: Conquer the Command Line and Linux Commands.

Not seeing anything on your monitor? If your monitor looks like it’s asleep, try typing Ctrl-C to interrupt your previous command and return to the prompt. Then type “DISPLAY=:0 xset s activate” and press enter. Then try to view the image again by typing the command above.

Stop the Joy Detector

Stop the Joy Detector

The Joy Detector runs by default, so you need to stop it before you can run another demo. To do this, type the following command and press enter:

sudo systemctl stop joy_detection_demo

After the demo stops, you are brought back to the command prompt. If you instead see an error, check the command for typos and try again.

Always stop any demos that are running before trying a new demo. If you don’t, you will run into errors.

However, the next time you reboot your kit, the Joy Detector demo will start running again. So if you want to disable it completely so that it does not start by default, type the following command into your prompt and press enter:

sudo systemctl disable joy_detection_demo

You can re-enable it later with:

sudo systemctl enable joy_detection_demo

For more information about these commands, see run your app at bootup.

See which demos are available

Move to the examples directory

To try out several other Vision Kit demos, move into the directory where they’re located. Type the following command into your prompt and press enter:

cd ~/AIY-projects-python/src/examples/vision

Your prompt should now say pi@raspberrypi:~/AIY-projects-python/src/examples/vision $.

Trying to Copy + Paste? Copying and pasting in a terminal is a little different than other applications you may be used to.

If you are using the Secure Shell Extension, to copy some text, highlight what you want by clicking and dragging with the left mouse button, and as soon as you let go of it, it'll copy it. To paste, click the right mouse button. On a touchpad this can be a little tricky, so try tapping or pressing in the lower right of the touchpad, or tapping with two fingers.

To copy text using the terminal on your Raspberry Pi: select the text, right-click, and select 'copy' from the menu. Left click where you want to paste the text, then right click and select 'paste' from the pop up menu.

Take a look around

Now that you’ve changed directories, type ls and press enter to see what’s inside your current directory.

Here you’ll see a list of files that end in .py. These are the example demos, written in Python, that you can run. (Python is the programming language we use for the majority of our demos and scripts; these files end in “.py”. It's a simple language and very easy to learn. You can find out more about Python at https://www.python.org/.)

Try image classification in the live camera

Start the image classification camera demo

The image classification camera demo uses an image classification model to identify objects in view of the Vision Kit.

To start it, type the following command and press enter:

./image_classification_camera.py

It might take a moment to fire up.

If it's working, a camera window pops up on your monitor (if one is attached) and the output from the model starts printing to your terminal. If you are brought back to the prompt after seeing error text, check the Using the Vision Kit section of the help page for troubleshooting tips.

Help! The camera is blocking my terminal window. If you are connected directly to your Raspberry Pi via mouse, monitor, and keyboard, the camera window might block your terminal. That’s okay - your terminal is still there in the background. Press Ctrl-C after pointing your camera at a few objects to stop the demo and close the camera window. Then you can scroll up in your terminal window to see what the camera identified. If you want to see the terminal and camera preview at the same time, you can connect your Raspberry Pi to Wi-Fi and then connect to it from another computer via SSH. For information about that setup, see the login setup for the Voice Kit.

Point the camera at stuff

Point your Vision Kit at a few objects, such as some office supplies or fruit. Check your terminal screen to see what the model is guessing. (A model is like a program for a neural network: a mathematical representation of all the different things the network can identify. Unlike a program, a model can't be written by hand; it has to be trained from hundreds or thousands of example images. When you show your Vision Kit a new image, the neural network uses the model to figure out whether the new image is like any image in the training data, and if so, which one.) The number next to each guess is its confidence score, which indicates how certain the model is that it identified the right object. The closer the score is to 1, the more confident the model is.

You might be surprised at the kinds of objects the model is good at guessing. What is it bad at? Try different angles of the same object and see how the confidence score changes.
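
Curious how a demo like this is put together? Here's a minimal Python sketch in the spirit of image_classification_camera.py, using the aiy.vision APIs described in the Maker's Guide below. The camera settings are illustrative, and the real script does more:

from picamera import PiCamera
from aiy.vision.inference import CameraInference
from aiy.vision.models import image_classification

with PiCamera(sensor_mode=4, framerate=30) as camera:
    with CameraInference(image_classification.model()) as inference:
        for result in inference.run():
            # get_classes returns (label, probability) pairs, best guess first.
            classes = image_classification.get_classes(result, top_k=3)
            print(', '.join('%s (%.2f)' % pair for pair in classes))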

Stop the image classification camera demo

The image classification camera demo will run indefinitely until you interrupt it.

When you’re done playing with the image classifier demo, press Ctrl-C to end it. This will bring you back to the prompt.

Don’t worry, you can always start the demo again.

Try face detection in the live camera

Start the face detection camera demo

This demo enables your Vision Kit to identify faces. It prints out how many faces it sees in the terminal, and if you have a monitor attached, it draws a box around each face it identifies.

To start the face detection demo, type the following command and press enter:

./face_detection_camera.py

If it's working, you will see a camera window pop up on your monitor (if one is attached) and the output from the model will start printing to your terminal. If you are brought back to the prompt after seeing error text, check out the Using the Vision Kit section of the help page for troubleshooting tips.

Help! The camera is blocking my terminal window. If you are connected directly to your Raspberry Pi via mouse, monitor, and keyboard, the camera window might block your terminal. That’s okay - your terminal is still there in the background. Press Ctrl-C after pointing your camera at a few objects to stop the demo and close the camera window. Then you can scroll up in your terminal window to see what the camera identified. If you want to see the terminal and camera preview at the same time, you can connect your Raspberry Pi to Wi-Fi and then connect to it from another computer via SSH. For information about that setup, see the login setup for the Voice Kit.

Point the camera at faces

Point the camera toward some faces and watch the demo output. Iteration tells you the number of times the model has run. num_faces is the model’s best guess at how many faces are in view of the camera.

Try moving the camera quickly, or farther away. Does it have a harder time guessing the number of faces?

Stop the face detection camera demo

When you’re done experimenting with the face detection demo, press Ctrl-C to end it. This will bring you back to the prompt.

Take a photo when a face is detected

Run the face camera trigger demo

With this demo, your Vision Kit automatically takes a photo when it detects a face. To start it, type the following command and press enter:

./face_camera_trigger.py

If you have a monitor attached, you’ll see a blinking cursor and a camera window will pop up. The demo will remain in this state until the camera sees a face and captures a photo.

Point the camera at faces

Point the camera at yourself or a friend. Try making a bunch of faces and experiment with what the machine considers to be a face.

When it sees a face, it will take a photo and create an image called faces.jpg in your current directory, and then close the camera window and bring you back to the prompt.

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Verify the photo was created

To verify that a photo was created, type ls at the prompt and press enter.

You should see a file called faces.jpg listed in your current directory.

To open the photo, see the instructions for how to View an image on your Pi.

Hint: Each time the face_camera_trigger demo captures a photo, it overwrites faces.jpg. If you want to rename the last photo you took so that you don’t overwrite it, type the following command and press enter:

mv faces.jpg newname.jpg

Take a photo using raspistill

Take a photo

The following demos show you how to use existing image files as input (instead of using the live camera feed). So you need to first capture a photo with the camera (or save a file into the same directory).

To capture a new photo named image.jpg, type the following command and press enter:

raspistill -w 1640 -h 922 -o image.jpg

The camera will wait 5 seconds, and then take a photo.

What should I name my file? image.jpg is the name of the file the command above writes to. You can name your file anything you want, as long as you use only letters, numbers, dashes, and underscores. You should end your filename with .jpg because this command is saving the image in the JPEG format.

What does this command mean? raspistill is a command that lets you capture photos using your Raspberry Pi camera module. The -w and -h flags specify the width and height for the image. The -o flag specifies the filename. For more information, see the raspistill documentation.
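
If you'd rather capture the photo from Python, the picamera library (pre-installed on the Raspberry Pi system image) can do roughly what the raspistill command above does. A minimal sketch, using the same dimensions and filename as above:

from time import sleep
from picamera import PiCamera

with PiCamera() as camera:
    camera.resolution = (1640, 922)  # same values as the -w and -h flags
    camera.start_preview()
    sleep(5)                         # mirror raspistill's 5-second warm-up
    camera.capture('image.jpg')      # same output name as the -o flag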

Verify your photo was created

To verify that a photo was created, type ls at the prompt and press enter. You should see the filename you used in the step above.

To open the photo, see the instructions for how to View an image on your Pi.

Tip: Press the up and down arrow keys at the prompt to scroll through a history of commands you've run. To rerun a command, it's easier to press the arrows until the one you want is shown. You can edit the command if needed, then press enter.

Try face detection on an image

Run the face detection demo

Now let’s use a photo you captured above with the face detection model. If you skipped that step, go back and take a photo or make sure you have a photo with a face on your SD card.

To run the demo, type the following command in your terminal and press enter:

./face_detection.py --input image.jpg

If you named your image file something different, replace image.jpg with the name of the file you want to use.

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

When it’s done, you should get something like this:

Face #0: face_score=0.989258, joy_score=0.969556, bbox=(632.0, -15.0, 782.0, 782.0)

face_score is how certain the model is that it’s found a face, and joy_score is how happy it appears the person is (both scores are out of 1). bbox tells you where the face is located in the image.
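
Under the hood, the demo drives the aiy.vision.models.face_detection API. A simplified sketch of the core, assuming an image.jpg in the current directory (the real face_detection.py adds argument parsing and an optional annotated output image):

from PIL import Image
from aiy.vision.inference import ImageInference
from aiy.vision.models import face_detection

with ImageInference(face_detection.model()) as inference:
    image = Image.open('image.jpg')
    for i, face in enumerate(face_detection.get_faces(inference.run(image))):
        # Each face carries the scores and bounding box described above.
        print('Face #%d: face_score=%f, joy_score=%f, bbox=%s' %
              (i, face.face_score, face.joy_score, face.bounding_box))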

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect a face in the photo. Try taking a new photo and then running the command again. Be sure your subject is well lit from the front and there are no bright lights directly behind them.

Try object detection on an image

Run the object detection demo

The object detection demo takes an image and checks whether it’s a cat, dog, or person.

First, you need an image ready: take a photo with the camera or save a photo on the SD card.

Then type the following command and press enter, replacing image.jpg with the file you want to use:

./object_detection.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

When it’s done, you should get something like this:

Object #0: kind=PERSON(1), score=0.959231, bbox=(359, 108, 896, 808)

kind is the type of object detected and score is how confident the model is about the result it gave. bbox is where that object is located in the image.
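
The object detection model has a matching Python API; here's a minimal sketch along the same lines as the face detection example above (treat the details as illustrative):

from PIL import Image
from aiy.vision.inference import ImageInference
from aiy.vision.models import object_detection

with ImageInference(object_detection.model()) as inference:
    image = Image.open('image.jpg')
    for i, obj in enumerate(object_detection.get_objects(inference.run(image))):
        # obj.kind, obj.score, and obj.bounding_box correspond to the
        # kind, score, and bbox fields printed above.
        print('Object #%d: %s' % (i, obj))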

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect a person, cat, or dog in the photo. Try taking a new photo and then running the command again.

Try dish classification on an image

Run the dish classifier demo

The dish classifier model can identify food from an image.

First, you need an image ready: take a photo with the camera or save a photo on the SD card.

Then type the following command and press enter, replacing image.jpg with the file you want to use:

./dish_classification.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

When it’s done processing (it may take a minute), you’ll get a list of results, along with the type of food identified and a probability score indicating how confident the model is of its answer (out of 1).

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect anything, womp. Try again with a different photo.

Try image classification on an image

Run the image classification demo

This is the same image classifier from above but now running against a captured image.

First, you need an image ready: take a photo with the camera or save a photo on the SD card.

Then type the following command and press enter, replacing image.jpg with the file you want to use:

./image_classification.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

Like the camera image classifier, you will get a list of results, which includes the kind of object and the model’s level of confidence.

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect anything, womp. Try taking a new photo and then running the command again.

Shut down your kit

When you’re done with your Vision Kit for the day, it’s important to shut it down properly before unplugging it, to make sure you don’t corrupt the SD card.

If you've connected your kit to a monitor, mouse, and keyboard, you can shut it down by opening the applications menu (the Raspberry Pi icon in the top-left corner of the desktop) and then clicking Shutdown.

Otherwise, if you're connected to the kit with an SSH terminal, type the following command and press enter:

sudo poweroff

After a few moments, the green LED on the Raspberry Pi will turn off (look through the hole labeled SD Card), indicating that the kit is powered off.

You can then safely unplug the power supply from your kit.

Reconnect your kit

To reconnect your kit, plug your kit back into the power supply and wait for it to boot up (about 2 minutes).

If you’re using a monitor, mouse, and keyboard, make sure they’re connected before you plug in your kit. Once the kit is booted, open up a terminal and you’re good to go.

If you’re using SSH, wait until the green LED stops flickering before connecting via SSH. Once your kit is booted, reconnect via the Secure Shell Extension (review the steps to connect to your kit). Note: You might have to re-pair your kit via the app.

What's next?

Congrats! You’ve set up your very own intelligent camera.

Now that you’ve got a taste for what the Vision Kit can do, you can start hacking the kit to build your own intelligent vision projects.

In the following Maker's Guide, you'll find documentation about the Python APIs and hardware features available in the Vision Kit. It also describes how you can train your own TensorFlow model to perform new machine vision tasks.

Share your creations with the maker community at #aiyprojects

Maker's guide

Heads up! This section assumes a much higher level of technical experience. So if you're new to programming, don't be discouraged if this is where you stop for now.

Python API library

To support various features in the Vision Kit, we've built a Python library that handles a lot of programming dirty work for you. It makes it easy to perform an inference with a vision model and draw a box around detected objects, and to use kit peripherals such as the button, LEDs, and extra GPIO pins.

These APIs are built into a Python package named aiy, which is pre-installed in the kit's system image. Just be sure that you've installed the latest system image.

To learn more about these APIs, refer to the API reference. In particular, the following APIs will be of interest for use with your Vision Kit:

  • aiy.toneplayer: A simple melodic music player for the piezo buzzer.
  • aiy.trackplayer: A tracker-based music player for the piezo buzzer.
  • aiy.vision.annotator: An annotation library that draws overlays on the Raspberry Pi’s camera preview.
  • aiy.vision.inference: An inference engine that communicates with the Vision Bonnet from the Raspberry Pi side.
  • aiy.vision.models: A collection of modules that perform ML inferences with specific types of image classification and object detection models.
  • aiy.board: APIs to use the button that’s attached to the Vision Bonnet’s button connector.
  • aiy.leds: APIs to control certain LEDs, such as the LEDs in the button and the privacy LED.
  • aiy.pins: Pin definitions for the bonnet's extra GPIO pins, for use with gpiozero.
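
As a taste of the peripheral APIs, here's a minimal sketch (assuming the latest system image) that waits for a button press and lights the button LED; check the API reference for the authoritative signatures:

from aiy.board import Board
from aiy.leds import Leds, Color

leds = Leds()
with Board() as board:
    print('Press the button...')
    board.button.wait_for_press()
    leds.update(Leds.rgb_on(Color.GREEN))  # light the button LED green
    board.button.wait_for_release()
    leds.update(Leds.rgb_off())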

Examples

You might find it easier to learn the aiy Python API if you start with an existing demo and modify it to do what you want.

You've seen some of these demos above, so they're already installed on your kit at ~/AIY-projects-python/src/examples/. You can also browse the examples on GitHub, where you'll find the source code for all the examples and more.

For instance, to learn more about the aiy.vision.inference and face_detection API, try running the face_detection.py example:

cd ~/AIY-projects-python/src/examples/vision

./face_detection.py --input image.jpg --output result.jpg

For each face detected in image.jpg, the demo prints information such as the face score and joy score. It also writes an image to the output location: a copy of the input image with a box drawn around each face.

To see how it works, open this file on your Raspberry Pi or see the source code here. Then start tweaking the code.
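
For a sense of what's inside these scripts, the camera-based face demos boil down to a loop like this (a simplified sketch, not the verbatim example source; the camera settings are illustrative):

from picamera import PiCamera
from aiy.vision.inference import CameraInference
from aiy.vision.models import face_detection

with PiCamera(sensor_mode=4, resolution=(1640, 1232)) as camera:
    with CameraInference(face_detection.model()) as inference:
        # Run inference on each camera frame and count the faces found.
        for result in inference.run():
            faces = face_detection.get_faces(result)
            print('Faces in view: %d' % len(faces))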

If you're more interested in programming hardware such as buttons and servos, see the section below about the GPIO expansion pins, which includes some other example code.

TensorFlow Model Compiler

To further customize your project, you can train a TensorFlow model to recognize new types of objects, and use our Vision Bonnet compiler to convert the model into a binary file that's compatible with the Vision Bonnet.

Give it a try right now by following our tutorial to retrain a classification model.

If you want to build your own TensorFlow model, beware that due to the limited hardware resources on the Vision Bonnet, there are constraints on what types of models can run on the device. We have tested and verified that the following model structures are supported on the Vision Bonnet.

Model type and supported configuration:

  • MobileNetV1: input size 160x160, depth multiplier = 0.5; or input size 192x192, depth multiplier = 1.0
  • MobileNetV1 + SSD: input size 256x256, depth multiplier = 0.125
  • SqueezeNet: input size 160x160, depth multiplier = 0.75

Retrain a classification model

For an example of how to retrain and compile a TensorFlow model for the Vision Bonnet, follow this Colab tutorial to retrain a classification model for the Vision Kit.

The tutorial uses Google Colab to run all the code in the cloud, so you don't need to worry about installing and running TensorFlow on your computer.

At the end of the tutorial, you'll have a new TensorFlow model that's trained to recognize five types of flowers and compiled for the Vision Bonnet, which you can download and run on the Vision Kit (as explained in the tutorial).

You can also modify the code directly in the browser (or download the code) to adjust the training parameters and provide your own training data. For example, you can replace the flowers training data with something else, like photos of different animals to train a pet detector.

Beware that although this script retrains an existing classification model, it still requires a large amount of training data to produce accurate results (usually hundreds of photos for each class). You can often find good, freely-available datasets online, such as from the Open Images Dataset.

Vision Bonnet compiler

Download the Vision Bonnet model compiler here.

To unzip the file, run tar -zxvf bonnet_model_compiler_yyyy_mm_dd.tgz. This should give you bonnet_model_compiler.par (you might need to chmod u+x bonnet_model_compiler.par to run it).

You can also download the TensorFlow models shipped with Vision Kit (except FaceDetection) in frozen graph format.

Note: The compiler works only on an x86-64 CPU running Linux. It was tested with Ubuntu 14.04. Do NOT run it on the Vision Kit.

Here's the basic command to compile a model:

./bonnet_model_compiler.par \
    --frozen_graph_path=<frozen_graph_path> \
    --output_graph_path=<output_graph_path> \
    --input_tensor_name=<input_tensor_name> \
    --output_tensor_names=<output_tensor_names> \
    --input_tensor_size=<input_tensor_size>

Take mobilenet_v1_160res_0.5_imagenet.pb as an example. Put mobilenet_v1_160res_0.5_imagenet.pb in the same folder as bonnet_model_compiler.par and run:

./bonnet_model_compiler.par \
    --frozen_graph_path=./mobilenet_v1_160res_0.5_imagenet.pb \
    --output_graph_path=./mobilenet_v1_160res_0.5_imagenet.binaryproto \
    --input_tensor_name="input" \
    --output_tensor_names="MobilenetV1/Predictions/Softmax" \
    --input_tensor_size=160

input_tensor_name is the name of the input node of the inference part of the TensorFlow graph; similarly, output_tensor_names are the names of the output nodes of the inference part. The README.md in the downloaded file contains this information.

Note: For the MobileNet SSD based model (mobilenet_ssd_256res_0.125_person_cat_dog.pb), the TensorFlow graph contains three parts: preprocessing + inference + post-processing. The input and output names you want to use are those of the inference part. If you look at the TF graph, you will find nodes with the name prefixes 'Preprocessor', 'FeatureExtractor', and 'Postprocessor' that correspond to each phase. This is why the input and output tensor names do not appear in the first and last few nodes of the TensorFlow graph.
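If you're unsure which node names to use for your own model, you can list every node in the frozen graph and look for the boundaries of the inference part. Here's a minimal sketch, assuming a TensorFlow 1.x installation on your workstation:

import tensorflow as tf

# Load the frozen graph and print each node's name and operation type.
graph_def = tf.GraphDef()
with open('mobilenet_ssd_256res_0.125_person_cat_dog.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
for node in graph_def.node:
    print(node.name, node.op)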

Due to the Vision Bonnet model constraints, it's best to make sure your model can run on the Vision Bonnet before you spend a lot of time training it. You can do this as follows:

  1. Use the checkpoint generated at training step 0 and export it as a frozen graph, or export a dummy model with random weights after defining your model in TensorFlow.

  2. Use our compiler to convert the frozen graph into binary format, and copy it onto the Vision Kit.

  3. Run the following script to make sure your model can run on the Vision Bonnet:
~/AIY-projects-python/src/examples/vision/any_model_camera.py \
  --model_path <path_to_model> \
  --input_height <h> \
  --input_width <w>

Constraints

  1. The model must take a square RGB image, and the input image size must be a multiple of 8.

    Note: The Vision Bonnet handles down-scaling; therefore, when running inference, you can provide an image that is larger than the model's input size, and the inference image's size does not need to be a multiple of 8.

  2. The model's first operator must be tf.nn.conv2d.

  3. The model should be trained in NHWC order.

  4. The model's structure should be acyclic.

  5. When running inference, the batch size is always 1.
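To make these constraints concrete, here's a minimal sketch of how a compliant model might begin, assuming TensorFlow 1.x (the variable names are illustrative only):

import tensorflow as tf

# Square RGB input whose side (160) is a multiple of 8; NHWC layout; batch size 1.
inputs = tf.placeholder(tf.float32, [1, 160, 160, 3], name='input')

# The first operator must be tf.nn.conv2d.
conv1_filter = tf.get_variable('conv1_w', [3, 3, 3, 8])
net = tf.nn.conv2d(inputs, conv1_filter,
                   strides=[1, 2, 2, 1], padding='SAME', data_format='NHWC')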

Supported operators and configurations

The following subset of TensorFlow operators can be processed by the model compiler and run on device. There are additional constraints on the inputs and parameters of some of these ops, imposed by the need for these ops to run efficiently on the Vision Bonnet processor.

tf.nn.conv2d
  Input tensor depth must be divisible by 8 unless it is the first operator of the model.
  filter: [k, k, in_channels, out_channels], k = 1, 2, 3, 4, 5
  strides: [1, s, s, 1], s = 1, 2
  padding: VALID or SAME
  data_format: NHWC

tf.nn.depthwise_conv2d
  filter: [k, k, in_channels, channel_multiplier], k = 3, 5; channel_multiplier = 1
  strides: [1, s, s, 1], s = 1, 2
  padding: VALID or SAME
  data_format: NHWC

tf.nn.max_pool
  Input tensor depth must be divisible by 8.
  ksize: [1, k, k, 1], k = 2, 3, 4, 5, 6, 7
  strides: [1, s, s, 1], s <= k
  padding: VALID or SAME
  data_format: NHWC

tf.nn.avg_pool
  ksize: [1, k, k, 1], k = 2, 3, 4, 5, 6, 7
  strides: [1, s, s, 1], s <= k
  padding: VALID or SAME
  data_format: NHWC

tf.matmul
  If a is an MxK matrix and b is a KxN matrix, K must be a multiple of 8.
  a: rank-1 or rank-2 tensor
  b: rank-1 or rank-2 tensor
  transpose_a: False
  transpose_b: False
  adjoint_a: False
  adjoint_b: False
  a_is_sparse: False
  b_is_sparse: False

tf.concat
  axis: 1, 2, or 3

tf.add
  Supported.

tf.multiply
  Supported.

tf.nn.softmax
  dim: -1

tf.sigmoid
  x: tensor's shape must be [1, 1, 1, k]

tf.nn.l2_normalize
  Input tensor depth must be a multiple of 8.
  dim: -1

tf.nn.relu
  Supported.

tf.nn.relu6
  Supported.

tf.tanh
  Supported.

tf.reshape
  The first dimension cannot be reshaped; that is, shape[0] must equal tensor.shape[0].

FAQ

I’m retraining an object detection model with TensorFlow’s object_detection tutorial and running into trouble.

The pretrained MobileNet-based model listed here uses a 300x300 input and a depth multiplier of 1.0, which is too big to run on the Vision Kit. You can train a smaller model with a supported configuration (MobileNet + SSD, input 256x256, depth multiplier 0.125); this requires changing the input size and depth multiplier. Unfortunately, if you are following their retraining tutorial, you cannot retrain (fine-tune) a depth multiplier 1.0 model to use a different depth multiplier, so you have to train from scratch.

How do I train and deploy a customized object detection model with TF’s object detection API?

Known supported architecture: MobileNet + SSD

Verified configuration:

Input height x width    Depth multiplier
256 x 256               0.125

Use the embedded version of the training configuration, embedded_ssd_mobilenet_v1_coco.config.

Let’s take training on the PASCAL VOC dataset locally as an example.

  1. Install the object detection API as described here.
  2. Prepare training and eval data for PASCAL VOC.
  3. Make changes to embedded_ssd_mobilenet_v1_coco.config following the instructions. The major changes:
    1. num_classes = 20 (instead of 90)
    2. Comment out fine_tune_checkpoint
    3. Update label_map_path and input_path (marked PATH_TO_BE_CONFIGURED)
  4. Start the training as described here.
  5. Export the inference graph using the instructions here. For trained_checkpoint_prefix, it is usually model.ckpt-${CHECKPOINT_NUMBER}.
    Note: it is highly recommended to check that your model can run on the Vision Bonnet as soon as you have checkpoint 0.
  6. Use bonnet_model_compiler to compile the model:
./bonnet_model_compiler.par \
  --frozen_graph_path=frozen_inference_graph.pb \
  --output_graph_path=customized_detector.binaryproto \
  --input_tensor_name="Preprocessor/sub" \
  --output_tensor_names="concat,concat_1" \
  --input_tensor_size=256 \
  --debug
  7. Run the following script to make sure your model can run on the Vision Bonnet:
~/AIY-projects-python/src/examples/vision/any_model_camera.py \
  --model_path <path_to_model> \
  --input_height <h> \
  --input_width <w>
  8. Write Python code to interpret the inference results; reusing src/aiy/vision/models/object_detection.py is a good starting point (see the sketch below).
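As a starting point for that last step, here's a minimal sketch of how you might load the compiled model and stream raw inference results from the camera. It uses the file name and input size from the hypothetical compile command above, and the input_normalizer values are an assumption you should match to your model's training preprocessing:

from picamera import PiCamera
from aiy.vision.inference import CameraInference, ModelDescriptor

# Read the binary graph produced by bonnet_model_compiler.par.
with open('customized_detector.binaryproto', 'rb') as f:
    compute_graph = f.read()

model = ModelDescriptor(
    name='customized_detector',
    input_shape=(1, 256, 256, 3),     # batch, height, width, channels (NHWC)
    input_normalizer=(128.0, 128.0),  # (mean, stddev) -- an assumption; match your training setup
    compute_graph=compute_graph)

with PiCamera(sensor_mode=4, framerate=30):
    with CameraInference(model) as inference:
        for result in inference.run():
            # Raw output tensors, keyed by the names passed to the compiler.
            print([name for name in result.tensors])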

GPIO Header Pinout

If you plan to take your project beyond the cardboard box, you might be wondering which GPIO pins are still available for your other hardware. Figure 1 shows exactly which Raspberry Pi pins are used by the Vision Bonnet.

Figure 1. GPIO pins used by the Vision Bonnet (highlighted pins are used)

GPIO Expansion Pins

The Vision Bonnet also includes a dedicated microcontroller (MCU) that enables the following additional features:

  • Control of four additional GPIO pins, freeing up the Pi GPIOs for other uses
  • PWM support for servo/motor control without taxing the Raspberry Pi's CPU
  • Analog input support for all GPIO pins via on-board analog-to-digital converter (ADC)
  • Control of the two LEDs on the bonnet

The extra GPIO pins are provided on the top of the Vision Bonnet (see figure 2). You can control the GPIOs and LEDs with the gpiozero library, using pin names PIN_A, PIN_B, PIN_C, PIN_D, LED_1, and LED_2.

Figure 2. GPIO expansion pins on the Vision Bonnet

The gpiozero-compatible pin definitions are provided by the aiy.pins package. You can use these definitions to construct standard gpiozero devices like LEDs, Servos, and Buttons.

If you want to dig deeper into these pins, check out the MCU (SAM D09) docs. The bonnet GPIO pin names correspond to the MCU pins as follows:

  • PIN_A = PA04
  • PIN_B = PA05
  • PIN_C = PA10
  • PIN_D = PA11

Also see how to read the analog voltages.

WARNING: Before you connect any wires to the Vision Bonnet, be sure your Raspberry Pi is disconnected from any power source. Failure to do so could result in electric shock, serious injury, death, fire or damage to your board or connected components and equipment.

LED example

Note: The following example code might not be installed on your SD card right out of the box. Be sure that you are running the latest system image.

Although the LEDs on the bonnet are easy to use, you probably want your light to appear somewhere else. So connect an LED to PIN_A and GND as shown in figure 3. (Be sure the long/bent leg of the LED is connected to PIN_A; the resistor can be any size over 50 ohms.)

Then run the led_chaser.py example code:

cd ~/AIY-projects-python/src/examples/gpiozero

./led_chaser.py

It takes several seconds for the script to begin. Once it does, your light will blink on and off. To stop, press Control+C.

If the light does not blink, continue to wait another 15 seconds. If it still does not blink, look for any errors in the terminal window. Then press Control+C to stop the script, power off the kit, and double check all wiring. Then try again.

Figure 3. An LED connected to the Vision Bonnet

The led_chaser.py script is designed to light up four LEDs in sequence, as shown here:

from time import sleep
from gpiozero import LED
from aiy.pins import (PIN_A, PIN_B, PIN_C, PIN_D)

# One LED on each of the bonnet's four expansion pins.
leds = (LED(PIN_A), LED(PIN_B), LED(PIN_C), LED(PIN_D))

# Light each LED in turn, half a second at a time, until interrupted.
while True:
    for led in leds:
        led.on()
        sleep(0.5)
        led.off()

Of course, the code works fine with just one LED connected. But once you have the one LED working, try connecting LEDs to PIN_B, PIN_C, and PIN_D in the same way, and run the code again.

Servo example

Because the GPIO pins on the Vision Bonnet are controlled by an on-board MCU, they perform pulse-width modulation (PWM) more precisely than the Raspberry Pi. So these pins are great for controlling servos.

To try it out, connect a servo to the GND, PIN_B, and 5V pins as shown in figure 4, and then run the servo_example.py script:

cd ~/AIY-projects-python/src/examples/gpiozero

./servo_example.py

It takes several seconds for the script to begin. Once it does, your servo should rotate back and forth between the minimum, maximum, and neutral position. But each servo can be a little different, so you might need to tune the parameters of the code to achieve a perfect alignment with your servo's full range of motion.

If the servo does not respond, continue to wait another 15 seconds. If it still does nothing, look for any errors in the terminal window. Then press Control+C to stop the script, power off the kit, and double check all wiring. Then try again.

Figure 4. A servo connected to the Vision Bonnet

The servo_example.py script uses the gpiozero Servo object to control the servo. The important parts of the script look like this:

from time import sleep
from gpiozero import Servo
from aiy.pins import PIN_B

# Create a servo with custom pulse widths to give the full dynamic range.
tuned_servo = Servo(PIN_B, min_pulse_width=.0005, max_pulse_width=.0019)

# Move the servo back and forth until the user terminates the example.
while True:
    tuned_servo.max()
    sleep(1)
    tuned_servo.mid()
    sleep(1)
    tuned_servo.min()
    sleep(1)

To adjust the rotation range of your servo, open the Python script and adjust the parameters of the Servo() constructor. Also see the Servo API documentation.

For more examples using the GPIO pins, see the AIY GitHub examples.

All of these example files are already available on your Vision Kit in the directory ~/AIY-projects-python/src/examples/. Just be sure you have the latest system image on your SD card.

Button Connector Pinout

If you want to modify the button interface (such as to change the actual button), be sure to follow the wiring pinout as shown in figure 5.

Figure 5. Pinout for the bonnet button connector

Note: The push button built onto the board functions exactly the same as the button connected to the button connector. They both activate GPIO23 on the Raspberry Pi.
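If you write your own code for the button, you can treat it like any other GPIO input. Here's a minimal sketch using gpiozero; it assumes the default active-low wiring (gpiozero's internal pull-up), so double-check your setup if you've modified the button interface:

from signal import pause
from gpiozero import Button

# The kit's button (on-board or on the connector) drives GPIO23.
button = Button(23)
button.when_pressed = lambda: print('Button pressed!')
pause()  # keep the script alive to receive button events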

Run your app at bootup

By default, your Vision Kit runs the Joy Detector demo when it boots up. This is enabled using a systemd service, which is defined with a .service configuration file at ~/AIY-projects-python/src/examples/vision/joy/joy_detection_demo.service, and it looks like this:

[Unit]
Description=AIY Joy Detection Demo
Requires=dev-vision_spicomm.device
After=dev-vision_spicomm.device
Wants=aiy-board-info.service
After=aiy-board-info.service

[Service]
Type=simple
Restart=no
User=pi
Environment=AIY_BOARD_NAME=AIY-Board
EnvironmentFile=-/run/aiy-board-info
ExecStart=/usr/bin/python3 /home/pi/AIY-projects-python/src/examples/vision/joy/joy_detection_demo.py --enable_streaming --mdns_name "${AIY_BOARD_NAME}" --blink_on_error

[Install]
WantedBy=multi-user.target

The .service file accepts a long list of configuration options, but this example provides everything you need for most programs you want to run at bootup.

To create a service like this to start your own app at bootup, just copy this configuration to a new file such as my_program.service (the name must end with .service). Then change ExecStart so it points to your program's Python file (and passes it any necessary parameters), and change Description to describe your program.
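For example, a hypothetical my_program.service for a program saved at /home/pi/Programs/my_program.py might look like this (keep the Requires and After lines only if your program uses the Vision Bonnet):

[Unit]
Description=My Program
Requires=dev-vision_spicomm.device
After=dev-vision_spicomm.device

[Service]
Type=simple
Restart=no
User=pi
ExecStart=/usr/bin/python3 /home/pi/Programs/my_program.py

[Install]
WantedBy=multi-user.target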

Then you need to put this file into the /lib/systemd/system/ directory. But instead of moving this file there, you can keep it with your program files and create a symbolic link (a "symlink") in /lib/systemd/system/ that points to the file. For example, let's say your config file is at ~/Programs/my_program.service. Then you can create your symlink as follows:

# Create the symlink
sudo ln -s ~/Programs/my_program.service /lib/systemd/system

# Reload the service files so the system knows about this new one
sudo systemctl daemon-reload

Now tell the system to run this service on bootup:

sudo systemctl enable my_program.service

All set! You can try rebooting now to see it work.

Or manually run it with this command:

sudo service my_program start

If you want to stop the service from running on bootup, disable it with this command:

sudo systemctl disable my_program.service

And to manually stop it once it's running, use this command:

sudo service my_program stop

You can check the status of your service with this command:

sudo service my_program status

If you'd like to better understand the service configuration file, see the .service config manual.

View Log Data

If you need to see more logs to help with debugging (or you're simply curious to see more output), you can view system logs and program-specific logs using the journalctl tool.

By default, this prints a lot of system information that won't be useful to you, so it's best if you launch your program as a service and then tell journalctl to print only the logs from that service.

For example, if you start the Joy Detector demo as a service (or it's already running, as usual), you can begin printing all log output for that service with this command:

sudo journalctl -u joy_detection_demo -f

The -f option continuously prints new log entries as they occur. To stop printing the log, press Control+C.
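The same pattern works for any service you created yourself. For example, with the hypothetical my_program service from the section above:

sudo journalctl -u my_program -f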

Models guide

Face Detector

The Face Detector model locates and identifies faces from an image. It also provides a “joy score” for each face.

Dog / Cat / Human Detector

The Dog / Cat / Human Detector can identify whether there’s a dog, cat, or person in an image and draw a box around the identified objects. It’s based on the MobileNet model architecture.

Dish Classifier

The Dish Classifier model is designed to identify food in an image. It’s based on the MobileNet model architecture and trained to recognize over 2,000 types of food.

Image Classifier

The Image Classifier demo is designed to identify 1,000 different types of objects. This demo can use either the SqueezeNet model or Google's MobileNet model architecture.
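For instance, here's a minimal sketch of how you might classify a single photo with the aiy.vision APIs (the image file name is a placeholder):

from PIL import Image
from aiy.vision.inference import ImageInference
from aiy.vision.models import image_classification

# Classify one image and print the five most likely classes.
with ImageInference(image_classification.model()) as inference:
    image = Image.open('image.jpg')
    classes = image_classification.get_classes(inference.run(image), top_k=5)
    for label, score in classes:
        print('%s: %.2f' % (label, score))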

Nature Explorer

The Nature Explorer demos use three machine learning models based on MobileNet, trained on photos contributed by the iNaturalist community. These models are built to recognize 4,080 different species (~960 birds, ~1,020 insects, ~2,100 plants).

In collaboration with iNaturalist

More information

System updates

To get the latest bug fixes and features, update the system image for your kit as follows:

  1. Download the latest .img.xz file from our releases page on GitHub.
  2. Use an adapter to connect your microSD card to your computer.
  3. Download, install, and launch the Raspberry Pi Imager.
  4. Click Choose OS, scroll to the bottom, select Use custom, and find the .img.xz file you downloaded above.
  5. Click Choose storage to select your microSD card and then click Write to begin flashing the SD card.

When flashing is done, put the microSD card back in your kit and you're good to go!

Support

If you're having trouble assembling your kit or running the demos, check out our help page or email us at support-aiyprojects@google.com.

Project complete!

You did it! Whether this was your first hackable project or you’re a seasoned maker, we hope this project has sparked new ideas for you. Keep tinkering, there’s more to come.