Vision Kit

Do-it-yourself intelligent camera. Experiment with image recognition using neural networks.

Meet your kit

Welcome! Let’s get started

The AIY Vision Kit from Google lets you build your own intelligent camera that can see and recognize objects using machine learning. All of this fits in a handy little cardboard cube, powered by a Raspberry Pi.

These instructions show you how to assemble your AIY Vision Kit, connect to it, and run the Joy Detector demo application.

Time required: 1.5 hours

If you have any issues while building the kit, check out our help page or contact us at support-aiyprojects@google.com.

Check your kit version

These instructions are for Vision Kit 1.1. Check your kit version by looking on the back of the white box sleeve in the bottom-left corner.

If it says version 1.1, proceed ahead! If it doesn’t have a version number, follow the assembly instructions for the earlier version.

Gather additional items

You’ll need some additional things, not included with your kit, to build it:

Micro USB power supply: The best option is a USB power supply that can provide 2.1 A of power via a micro USB connector. The second-best choice is a phone charger that also provides 2.1 A (sometimes called a fast charger). Don't try to power your Raspberry Pi from your computer: it won't be able to provide enough power, and it may corrupt the SD card, causing boot failures or other errors.

Below are two different options to connect to your kit. Choose the one that works best for you, based on what you have available:

Option 1: Use the AIY Projects app

Choose this option if you have access to an Android smartphone and a separate computer.

You’ll need:

  • Android smartphone
  • Windows, Mac, or Linux computer
  • WiFi connection
  • Optional: Monitor or TV (any size will work) with an HDMI input. Many of the demos let you see what your Vision Kit’s camera sees, so it’s helpful to connect a monitor or TV directly to your kit. If you don’t have one available, many of the demos will still work, but you won’t see the visual output.
  • Optional: Normal-sized HDMI cable and mini HDMI adapter

Option 2: Use a monitor, mouse, and keyboard

Choose this option if you don’t have access to an Android smartphone.

You’ll need:

  • Windows, Mac, or Linux computer
  • Mouse
  • Keyboard
  • Monitor or TV (any size will work) with an HDMI input
  • Normal-sized HDMI cable and mini HDMI adapter
  • Adapter to attach your mouse and keyboard to the kit. Below are two different options.

Adapter option A: USB On-the-go (OTG) adapter cable to convert the Raspberry Pi USB micro port to a normal-sized USB port. You can then use a keyboard/mouse combo that requires only one USB port.

Adapter option B: Micro USB Hub that provides multiple USB ports to connect to any traditional keyboard and mouse.

Get to know the hardware

Open your kit and get to know what’s inside.

Take note that the Electrical Hardware bag is underneath the Mechanical Hardware bag.

Missing something? Please send an email to support-aiyprojects@google.com and our customer support team will help you with a replacement.


In your kit

  1. Vision Bonnet (×1)
  2. Raspberry Pi Zero WH (×1)
  3. Raspberry Pi Camera v2 (×1)
  4. Long Flex (×1)
  5. Push Button (×1)
  6. Button harness (×1)
  7. Micro USB Cable (×1)
  8. Piezo buzzer (×1)
  9. Privacy LED (×1)
  10. Short Flex (×1)
  11. Button Nut (×1)
  12. Tripod nut (×1)
  13. LED bezel (×1)
  14. Standoffs (×2)
  15. Micro SD Card (×1)
  16. External Box (×1)
  17. Internal Box (×1)

Build your kit

Fold the internal frame

Round up your parts

First, let’s build the Internal Frame that will go inside your cardboard box. Gather up:

  • Internal Frame
  • Raspberry Pi Camera v2
  • Long Flex
  • Piezo Buzzer

Open connector latch

Start by finding your Raspberry Pi Camera v2 board and open the cable connector latch by pulling gently back on the black raised latch.

Need more help? The latch is pretty tiny: fingernails help open it. If you can’t tell whether it’s open or not, the latch will wiggle a little in the open position.

Insert Long Flex into latch

Grab your Long Flex. Find the wide end of the cable, and make sure the side with the copper stripes is facing away from you.

Insert the wide end until it hits the back of the connector. You’ll still see the edge of the cable when fully inserted, so don’t force it in.

Close connector latch

Close the cable connector latch by pressing down. You should feel the latch snap into place. Gently check that the cable is secure.

Set the camera board down — you’ll need it again in a few steps.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Orient the Internal Frame

Take the Internal Frame and orient it as shown in the photo.

Remove adhesive liner

In the middle of the cardboard is a rectangular cutout labeled A. Remove the adhesive liner from the cutout.

Fold up flap A

Fold the adhesive flap toward you, then against the frame. Press the flap firmly down against the cardboard frame so they stick together.

Flip it over

Flip the frame over as shown in the photo.

Insert Piezo Buzzer

Find your Piezo Buzzer and stick it to the adhesive flap that you just folded.

Orient the buzzer so that its wire follows the opening (and the side with the hole is facing towards you), as shown in the image.

Add the camera

Grab the camera you assembled earlier. Peel the clear sticker off the lens, and place the camera board aperture into the rectangular slot in the middle of the cardboard.

The lens should be facing towards you.

Flip to the other side

Check the view of the Internal Frame from the other side so that it matches the picture.

Fold down the top flap

Fold the top flap over the camera board.

Fold the left and right flaps

Holding the top flap down, fold the flaps on the left and the right of the board toward you to hold the camera in place.

There are two small cardboard notches on each side that will loosely hold the flaps in place.

Fold the bottom flap

While holding those flaps in, fold the bottom flap of the inner frame upwards. Lock the flap in place by securing the tabs into the notches.

The bottom of the assembly should look like a shelf.

Thread the buzzer wire

Thread the Piezo Buzzer wire through the circular opening next to the Long Flex.

Fold the Long Flex up

Fold the Long Flex upwards and crease it by pressing gently.

It’s okay to bend the Long Flex a bit, so don’t worry about damaging it.

Fold the Long Flex to the left

Then fold the Long Flex to the left at a 45-degree angle. The unconnected end should be aligned with the three slits on the left.

You’re going to weave the cable through the slits, like a shoestring.

Thread the Long Flex

There are three slits on the left flap of the frame. Thread the flex cable into the bottom slit, making sure that the side with the copper lines is facing away from you...

Keep threading

...and up towards you, through the middle slit...

More threading

...and then through the final slit. The cable should now be threaded through all three slits, and sticking out of the left side of the frame.

This keeps the Long Flex in place.

Set your Internal Frame aside for now.

Connect the boards

Gather your parts

Now we’re going to connect the circuit boards together. Round up:

  • Raspberry Pi: The green board is the Raspberry Pi, a small but mighty Linux computer designed for makers. It has an SD card slot, two micro USB connectors, and a mini HDMI connector.
  • Vision Bonnet: The blue board is the Vision Bonnet, an accessory for the Raspberry Pi that lets you run machine learning programs to identify images from the camera. It contains a special chip designed to run machine learning programs.
  • 2x Standoffs
  • Short Flex: The Short Flex is a flexible circuit board. It’s used to connect boards in electronics when rigid boards have to fit in a tight space.
  • Button Harness

Orient your Raspberry Pi

Orient your Raspberry Pi so that the 40-pin header is on the left edge of the board, like the photo. (A header is a fancy electronics term for a set of wire connectors. In this case, we refer to each wire as a pin, and there are 40 of them arranged in two columns. Headers are usually used to connect electronics together and provide electrical access to internal components.)

WARNING: First make sure your Raspberry Pi is disconnected from any power source and other components. Failure to do so may result in electric shock, serious injury, death, fire or damage to your board or other components and equipment.

Open the top cable connector

Open the flex connector latch by pulling gently back on the black raised latch. Be careful, as it only takes a little bit of effort to open. Make sure you're holding the board by its edges, as shown in the photo.

Need more help? The latch is pretty tiny: fingernails inserted on either side between the black and white parts will help open it. If you can’t tell whether it’s open or not, the latch will wiggle a little in the open position, and there will be a visible gap between the black and white parts of the connector.

Insert the Short Flex

Find your Short Flex. See that the ends are labeled Rasp Pi and Vision Bonnet. Take note of the Rasp Pi end, then flip the Short Flex over lengthwise (like the photo) so the side with the labels (and copper stripes) is facing away from you.

Insert the end of the Short Flex labeled Rasp Pi into the flex connector until the flex hits the back of the connector.

Close the cable connector latch

Close the cable connector latch by pressing down. You should feel the latch snap into place. Gently check that the cable is secure by lightly tugging on it, and make sure that it is inserted squarely.

WARNING: Failure to securely seat the ribbon in the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Orient your Vision Bonnet

Find your Vision Bonnet and orient it so the 40-pin header connector is on the left and the white connector is on the bottom.

There are two white-colored cable connectors on the board (one on the top, and one on the bottom), so make sure the 40-pin header is to your left when you hold it (as shown).

Flip open the cable connector latch

Check the state of the cable connector latch. Look at it from the side against a white background. If the tiny black latch is at a right angle to the white base, it is open. If the black latch is flush and parallel with the top of the white base, it is closed.

If it is closed, open the cable connector by gently flipping the black latch upwards so that it becomes perpendicular to the white base.

Fingernails or tweezers help here. Try to pull from the sides rather than the center of the flap.

Insert the Short Flex

Insert the Short Flex into the Vision Bonnet (the other side should be connected to the Raspberry Pi).

Make sure the side with the copper stripes (and labels) is still facing away from you, as shown in the picture.

Close the cable connector

Close the Vision Bonnet cable connector by flipping the black latch back down parallel to the white connector. This will secure the Short Flex.

Flip the assembly over. Make sure the Rasp Pi and Vision Bonnet labels on the Short Flex correctly correspond to the boards they’re connected to.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Insert Standoffs

Insert the Standoffs into the Raspberry Pi board in the holes opposite the pin header.

Need more help?

The Standoffs are tough to get in sometimes. You might need to press firmly and wiggle them into place. If you find you're having to use too much force, use a pair of pliers to squeeze the end of the Standoffs while inserting into the holes.

Align the boards

Align the Vision Bonnet header connector with the pin header on the Raspberry Pi.

Push the Short Flex inward

Push the Short Flex into the space between the two boards.

Connect the boards and check connections

Firmly push the boards together to snap the Standoffs into place. Once you've mated the two headers together, push from the center of the connectors (rather than the edges of the boards) to finish the connection. You may have to work your way around the board to make sure the standoffs snap in as well.

Make sure the Standoffs have snapped into the boards and that the 40-pin header is pushed all the way down so that there is no gap between the two boards.

WARNING: Damaging the ribbon or failure to securely seat the Vision Bonnet may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Add the boards to the frame

Orient your boards

Now we can insert the boards into the Internal Frame.

Orient your boards so the Vision Bonnet is facing you, and the white cable connector is on the bottom.

If it isn’t already, open the white connector on the top of the Vision Bonnet by gently flipping the black latch upward.

Connect the boards and frame

Find your Internal Frame and the Long Flex that you threaded through the cardboard slits.

With the copper stripes on the Long Flex facing away from you, connect it to the white cable connector. Then flip the black latch into the parallel state to close it.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Slide the boards into the frame

Now that the boards are connected, slide the boards into the frame, with the Vision Bonnet (the blue one) on top. The boards slide into a slot that looks like a mouth :o

Lightly crease the twisted part of the Long Flex so that it lays closer against the cardboard.

Check frame and cables

Double-check that your cardboard frame assembly looks like the one pictured.

Plug in the Button Harness

Find your Button Harness and plug it into the top of the Vision Bonnet board. Either end is fine. You can also remove the white tag.

You’ve built the frame and connected the boards. Now let’s build the box that it all goes into.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Build the box

Orient the cardboard box

Now it’s time to fold the cardboard box. You’ll need:

  • External Box
  • Tripod Nut

Find the External Box and unfold it, holding it so that the lettered labels face towards you.

Fold A

Fold each of the flaps labeled A upwards.

Fold B

Now fold the two flaps labeled B toward you.

Fold C

While holding A and B in, fold the flap labeled C toward you.

Peel adhesives

You’re now going to secure the bottom of the box.

Peel the adhesive liner off the flaps labeled A...

Fold D

While applying pressure to the outer sides of the box, fold flap D over and onto both of the A flaps...

Press down

...and press down to secure the adhesive.

Two sets of arrows should line up after you’ve secured the adhesive.

Insert Tripod Nut

Take the Tripod Nut and slide it, wider side facing down, into the slot labeled Tripod Nut.

Fold E

Fold the flaps labeled E inward.

Fold F

Fold the flap labeled F inward.

Remove adhesive on E

Remove the adhesive on the right-hand E flap.

Fold G

Fold over the right-hand flap labeled G, and secure the adhesive by pressing down.

The two arrows should face each other after you’ve secured the adhesive.

Fold and secure the other side

Repeat the previous two steps on the other side. Ensure that the arrows are aligned.

Fold bottom flap

Fold back the bottom retaining flap. It has a crease in the center.

Then fold the crease towards you.

Fold H

Fold down flap H. This forms the back side of your cardboard box.

Your box is built! Now let’s bring it all together.

Bring it together

Slide in the Internal Frame

Take your Internal Frame and slide it into the back of the cardboard box (as shown).

Check the boards and wires

Ensure that your Raspberry Pi and Vision Bonnet are still sitting snugly in the Internal Frame and that your Long Flex is secure. Also check that the camera is visible through the camera hole on the other side of your box.

Now let’s put the finishing touches on your box.

Install the LED

Gather your parts

Now we’re going to attach the LED to your box. Find your:

  • Privacy LED
  • LED Bezel

Install the LED Bezel

Flip your box to the front side. You will see three holes.

Push the LED Bezel into the top-left hole, above the camera aperture.

Install Privacy LED

Turn the box around. Take the Privacy LED and insert the end with the bulb into the LED Bezel you just installed. It should snap into place.

Check other side

Make sure the Privacy LED is peeking out on the other side.

Install the Hardware

Gather your parts

You’re in the home stretch! So, let’s install the button. You’ll need:

  • Push Button
  • Button Nut

Thread wires through nut

Gather the Piezo Buzzer, Privacy LED, and Button Harness cables. Thread all three through the Button Nut.

The wider side of the Button Nut should be facing upwards, toward the top of the box.

Thread the cables through the box

Take those same cables and insert them through the hole on the top of the box.

Plug wires into Push Button

Get your Push Button. Hold it upside down, and check the board for the words PIEZO, LED, and BONNET (they’ll be tiny).

Then take the Piezo Buzzer cable and plug it into the slot on the left labeled PIEZO. Plug the Privacy LED wire into the middle slot labeled LED, then the Button Harness into the black connector on the right.

WARNING: Failure to securely seat the connector may cause electric shock, short, or start a fire, and lead to serious injury, death, or damage to property.

Insert Push Button

Insert the Push Button into the hole.

Secure Button Nut

From the inside of the box, screw the Button Nut to secure the Push Button.

Make sure the wider, flanged side of the nut is facing upwards.

Check your completed box

The completed assembly should look like the image.

Before you close it up, it doesn’t hurt to make sure all your cables are still connected.

Close the box

Close the box and secure it with the tab.

Insert the SD Card

With the arrow side facing up, insert your SD Card into the silver slot on the Raspberry Pi, which you can find through the cardboard cutout labeled SD Card.

WARNING: Forcing connectors into misaligned ports may result in loose or detached connectors. Loose wires in the box can cause electric shock, shorts, or start a fire, which can lead to serious injury, death or damage to property.

Congrats, you’ve just assembled the Vision Kit hardware!

Now you’re ready to turn it on.

Try it out

Turn it on

Plug your Vision Kit into a power supply

Plug your Vision Kit into a wall power supply through the port labeled Power on your device.

See Meet your kit for power supply options. Do not plug your Vision Kit into a computer for power.

Let it boot up

To confirm that it’s connected to power, look into the hole in the cardboard labeled SD Card. You’ll see a green LED light flashing on the Raspberry Pi board.

Wait about four minutes to allow your device to boot. You’ll know it’s booted when your kit beeps. The software needs this time to install and configure settings. In the future, it’ll only take about two minutes to turn on.

Use the Joy Detector

Try out the Joy Detector

Point the Vision Kit toward someone’s face (or your own) to try out the Joy Detector Demo.

  • Ask them to smile
  • Then ask them to smile REALLY BIG
  • Then ask them to make a frowny face

The Joy Detector uses machine learning (the science of making predictions based on patterns and relationships that have been automatically discovered in data) to detect whether a person is smiling or frowning, and how much. Frowns turn the button blue, and smiles turn it yellow.

If an expression is really big, a sound will play. If the camera sees more than one face, it evaluates each person’s face and sums the joy scores of all the faces.

Not working?

Try holding the camera at least an arm’s length away from the face you’re pointing it at. Sometimes the camera has trouble if the subject is too close. Keep this in mind for all the demos that you try.

Connect to your kit

Select an option

To try out other demos, you’ll connect to your Vision Kit so that you can give it commands. There are two options for connecting, explained in Meet your kit.

Follow instructions for one connection option, either with the AIY Projects app or with a monitor, mouse, and keyboard.

Option 1: AIY Projects app

Download the AIY Projects app

Go to the Google Play Store and download the AIY Projects app.

This app will allow you to connect your Vision Kit to a WiFi network, and display an IP address which you’ll use to communicate with your Vision Kit wirelessly via a separate computer and SSH (which stands for “secure shell,” a way to securely connect from one computer to another).

Psst: This app only works on Android smartphones. If you don’t have one, please use the alternate connection method (which uses a monitor, keyboard, and mouse).

Follow app instructions

Open the app and follow the onscreen instructions to pair with your Vision Kit.

Take note of the IP address; you’ll need it later. The app will also remember it and display it on the home screen. (An IP address is a set of four numbers that identifies a device on a network. Every device on your network, such as your computer, phone, and Vision Kit, has a unique IP address that other devices use to talk to it.)

Not working? Make sure your Vision Kit is connected to a power supply.

If you run into errors, quit the app and try again.

If the device won’t pair, make sure the green LED on the Vision Bonnet is flashing. If it’s not flashing, it may have timed out. Press and hold the Vision Bonnet button for 5 seconds, and try again. If that doesn’t work, try restarting your phone.

Optional: Connect a monitor

A monitor is not required to run these demos, but if you have one available, it can be useful so that you can see what your Vision Kit is seeing.

If you have a monitor and mini HDMI cable (or HDMI cable + mini HDMI adapter) available, follow these instructions:

  • Unplug your kit from power
  • If it’s open, then close the back of your kit
  • Connect your monitor to the mini HDMI port labeled HDMI on the back of your kit
  • Make sure your monitor is connected to power
  • Plug your kit back into power
  • Wait for the kit to boot; you’ll hear a beep when it’s ready

When your kit is booted, you’ll see a desktop with the AIY logo on the background.

A pop-up will tell you that the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects or expose it to the internet, but for now, you can leave it (we’ll explain more later).

If you don’t have a monitor, or when your kit is powered back on, go to the next step.

Connect your computer to WiFi

Make sure your computer is on the same WiFi network as your Vision Kit. This will allow you to connect to your kit through SSH.

Get your terminal ready

We’re going to connect your computer to the Raspberry Pi using a terminal (a text window where you can issue commands to your Raspberry Pi). Once you are connected, the commands you type are the same across many operating systems, and they let you control the Raspberry Pi remotely from another computer.

Download and install the Chrome browser and Secure Shell Extension on your computer.

If you’re familiar with another method of using a terminal, feel free to use that on your own.
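For instance, if you’re on a Mac or Linux machine (or Windows with an OpenSSH client installed), a native terminal works too. This is just a sketch that assembles the command you would run; 192.168.1.42 is a hypothetical placeholder for the IP address you noted from the app:

```shell
# Replace this placeholder with the IP address shown in the AIY Projects app.
KIT_IP=192.168.1.42
# Print the connection command; run it (without the echo) to open an SSH session.
echo "ssh pi@${KIT_IP}"
```

The first time you connect, the SSH client will ask you to accept the kit’s host key, and the password prompt expects the Raspberry Pi’s password, as described in the following steps.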

Open the Secure Shell Extension

Once the extension is installed, open it.

If you’re using Chrome on a Windows, Mac, or Linux computer, the icon used to open the terminal will appear in your toolbar, next to the URL field. If you click on it, it will open a menu with "Connection Dialog" listed. Click on that, and it will open a new window with the form pictured here.

If you’re using Chrome on a Chromebook, go to the app menu and search for “Secure Shell”.

Connect to the Raspberry Pi

In the top field, type pi@00.00.00.0 replacing the 0’s with the IP address of your Raspberry Pi. After typing this in, click on the port field. The "[ENTER] Connect" button should light up.

Click "[ENTER] Connect" to continue.

Can’t connect? If you can’t connect, check to make sure the IP address you wrote down earlier is correct and that your Raspberry Pi is connected to the same WiFi access point your computer is.

Note If you rewrite or replace your SD card, you will need to remove and add the Secure Shell Extension from Chrome. You can do this by right clicking on the icon in your toolbar and selecting "Remove", then re-add it by following the instructions above.

Give the extension permission

Click Allow.

This gives permission to the SSH extension to access remote computers like your Raspberry Pi.

You will only need to do this once when you add the extension into Chrome.

Continue connecting

At the prompt, type yes and press enter to confirm that the displayed host key matches what is stored on your Raspberry Pi. You will only have to do this the first time you connect to your kit. (The SSH extension is designed to be secure, so it needs to check that the computer you’re connecting to is actually the one you expect. To make this possible, each computer generates a long number, its host key, and presents it to the extension for verification. The extension saves this key somewhere safe so it can confirm on later connections that the computer you’re speaking to is the right one.)

Enter the Raspberry Pi’s password

Enter the Raspberry Pi’s password at the prompt. The default, case-sensitive password is raspberry

When you type, you won’t see the characters.

Note Your IP address might be different than the one shown in the example.

It’s okay if you see the warning line. It’s letting you know that the host key has been saved; next time you connect, the extension will automatically compare what it just stored with what the Raspberry Pi provides.

Having trouble? If it’s typed wrong, you’ll see “Permission denied, please try again” or “connection closed.” You’ll need to re-start your connection by pressing the R key.

Confirm you’re connected

If the password was entered correctly, you’ll see a message about SSH being enabled by default, and a green pi@raspberrypi:~ $ prompt. (The shell is a program that waits for instructions from you and helps make your computer work for you. The prompt is its signal that it’s ready to receive commands; it shows your current working directory, the tilde, ~, in this case, and ends in a $ where you type your command.)

You will also see a warning that the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects, or expose it to the internet, but for now it’s okay to proceed.

Congrats! You’re now connected to your Vision Kit. Skip to the Try More Demos section to explore more Vision Kit demos.

Do I need to change my password? You'll want to change the pi user's password if you plan on using this kit in a project that is exposed to the open internet. It's not safe to expose it with a password everybody knows. If you plan on doing this, you'll want to use the passwd program. This is an advanced step, so for the purposes of this guide, we will assume you haven't changed the password for the pi user.

Note If you do change the password, make sure you keep your password written down somewhere safe in case you forget. It's not easy to recover it if you change it.

Option 2: With monitor, mouse, and keyboard

Gather your peripherals

Use this connection option if you don’t have access to an Android smartphone and a second computer, or if you prefer to connect directly to your Raspberry Pi.

You’ll need a set of peripherals to interact with your Raspberry Pi, including a monitor, keyboard and mouse. Check here for suggestions.

Unplug your kit

Before plugging in your peripherals, unplug your kit from power.

Connect peripherals

Plug your monitor into the HDMI port and your keyboard and mouse into the Data port on your Vision Kit using one of the adapters described in Meet your kit.

Plug your monitor into power if it’s not already.

Plug your Vision Kit into a power supply

Plug your Raspberry Pi back into power via the Power port. To confirm that it’s connected to power, look into the hole in the cardboard labeled SD Card. You’ll see a green LED flashing on the Raspberry Pi board.

You’ll also see the Raspberry Pi logo in the top left corner of the monitor.

Wait for your device to boot, which will take about two minutes. It’s okay if your screen goes black while it’s booting. Be patient! You’ll know when it’s booted when you hear it beep.

Acknowledge the warning

You’ll see a desktop with the AIY logo on the background. A pop-up will tell you the password for the Raspberry Pi user is set to the default. This is important if you plan to use this kit in other projects or expose it to the internet, but for now, it's safe to click OK.

Do I need to change my password? You'll want to change the pi user's password if you plan on using this kit in a project that is exposed to the open internet. It's not safe to expose it with a password everybody knows. If you plan on doing this, you'll want to use the passwd program. This is an advanced step, so for the purposes of this guide, we will assume you haven't changed the password for the pi user.

Note If you do change the password, make sure you keep your password written down somewhere safe in case you forget; it's not easy to recover if you change it.

Open the terminal

Open the terminal (a text window where you can issue commands to your Raspberry Pi) by clicking the black rectangular icon on the taskbar at the top of the screen.

Now you’ll be able to issue commands to your Raspberry Pi.

Confirm you’re connected

You should now see the prompt pi@raspberrypi:~ $. (The prompt is the shell’s way of telling you it’s ready to receive commands; it shows your current working directory, the tilde, ~, in this case, and ends in a $ where you type your command.)

Congrats! You’re ready to start issuing commands to your Raspberry Pi.

What if my prompt looks different? If you clicked on the Start dev terminal icon, you’ll see the prompt “pi@raspberrypi:~/AIY-projects-python $” instead. This is because the Start dev terminal shortcut is set up to open a terminal and then set your working directory to “~/AIY-projects-python”. That’s fine for the purpose of these instructions. We’ll show you in a few steps how you can use the cd command to change your working directory.
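To check that the shell is responding, you can try a couple of harmless, standard Linux commands at the prompt (nothing here is Vision Kit specific):

```shell
# Print the name of the logged-in user; on the kit this is "pi".
whoami
# Print your current working directory; right after login this is your home directory.
pwd
```

If both commands print something and return you to the prompt, your terminal is working.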

Try more demos

See how computer vision works

Now that you’ve connected to your Vision Kit, you can try out other demos to experiment with machine learning.

Stop the Joy Detector

Stop the Joy Detector

The Joy Detector runs by default, so you will need to stop it before you run another demo. To do this, type the following command and press enter:

sudo systemctl stop joy_detection_demo

Once the joy_detection_demo has stopped, you will be brought back to the command prompt with no output. If something went wrong, you’ll see an error message.

Always stop any demos that are running before trying a new demo; if you don’t, you will run into errors. You’ll need to do this step each time you unplug and replug your kit, because the Joy Detector starts automatically.

Note This will stop the demo from running while you are currently connected to your kit. However, the next time you reboot your kit (for example by unplugging and plugging it back in), the demo will start running again.

If you want to disable it completely so that it does not start by default, type the following command into your prompt and press enter:

sudo systemctl disable joy_detection_demo

See which demos are available

Move to the examples directory

To try out seven other Vision Kit demos, move into the directory where they’re located. (You might have heard the terms “folder” and “directory” before; they are synonyms for the same thing: a data structure that contains a listing of filenames and the location of their contents on disk. Think of a directory like a table of contents: each time you run the ls command, you’re listing the contents of one of these directories.) Type the following command into your prompt and press enter:

cd ~/AIY-projects-python/src/examples/vision

Your prompt should now say “pi@raspberrypi:~/AIY-projects-python/src/examples/vision $”.

What’s cd? cd stands for “change directory.” Think of it as clicking through file folders. You should see the path in the command line in blue. Capitalization matters: it’s cd, not CD. If you ever get lost or curious, type pwd and press enter to display your current path.

Trying to Copy + Paste? Copying and pasting in a terminal is a little different than other applications you may be used to.

If you are using the Secure Shell Extension, to copy some text, highlight what you want by clicking and dragging with the left mouse button, and as soon as you let go of it, it'll copy it. To paste, click the right mouse button. On a touchpad this can be a little tricky, so try tapping or pressing in the lower right of the touchpad, or tapping with two fingers.

To copy text using the terminal on your Raspberry Pi: select the text, right-click, and select 'copy' from the menu. Left click where you want to paste the text, then right click and select 'paste' from the pop up menu.

Take a look around

Now that you’ve changed directories, type ls and press enter to see what’s inside your current directory. Hint: that’s an “l” as in lemon, not a #1.

Here you’ll see a list of files that end in “.py”. These are the example demos that you can run.

What’s ls? ls is shorthand for “list” and prints out all of the files in the current working directory. It’s a great way to look around and see what changed on disk.

What’s Python? Python is a programming language that we use for the majority of our demos and scripts; these files end in “.py”. It’s a simple language and is very easy to learn. You can find out more about Python at https://www.python.org/.

Learn more about working in the terminal Check out some guides from our friends at the Raspberry Pi Foundation: Conquer the Command Line and Linux Commands.

Try image classification in the live camera

Try the image classification camera demo

To start the image classification camera demo, type the following command and press enter:

./image_classification_camera.py

It might take a moment to fire up. The demo uses an image classification model to identify objects in view of the Vision Kit.

If it's working, you will see a camera window pop up on your monitor (if one is attached) and the output from the model will start printing to your terminal. If you are brought back to the prompt after seeing error text, check the Using the Vision Kit section of the help page for troubleshooting tips.

What does “./” do?

./ tells the shell to run a file from the directory you’re currently in.

Test it out

Point your Vision Kit at a few objects. Check your terminal screen to see what the model is guessing. A model is like a program for a neural network: a mathematical representation of all the different things the neural network can identify. Unlike a program, a model can’t be written; it has to be trained from hundreds or thousands of example images. When you show your Vision Kit a new image, the neural network uses the model to figure out whether the new image is like any image in the training data, and if so, which one. The number next to each guess is its confidence score, which indicates how certain the model is that the object the camera is seeing is the object it identified. The closer the number is to 1, the more confident the model is.

You might be surprised at the kinds of objects the model is good at guessing. What is it bad at? Try different angles of the same object and see how the confidence score changes.
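If you’re curious how a confidence cutoff works in practice, here’s a minimal Python sketch. It’s not part of the kit’s code, and the guesses are made up, but it shows the kind of filtering you might apply to the demo’s output:

```python
# Sketch: keep only guesses whose confidence score clears a threshold.
# The (label, score) pairs below are invented examples, not real model output.
def top_guesses(guesses, threshold=0.4, top_k=3):
    """Return up to top_k guesses whose confidence meets the threshold."""
    kept = [g for g in guesses if g[1] >= threshold]
    kept.sort(key=lambda g: g[1], reverse=True)
    return kept[:top_k]

guesses = [("coffee mug", 0.83), ("cup", 0.41), ("vase", 0.12)]
print(top_guesses(guesses))  # the low-confidence "vase" guess is dropped
```

Raising the threshold trades recall for precision: the model volunteers fewer, but more certain, answers.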

Help! The camera is blocking my terminal window If you are connected directly to your Raspberry Pi via mouse, monitor, and keyboard, the camera window might block your terminal. That’s okay - your terminal is still there in the background. Press Ctrl-C after pointing your camera at a few objects to stop the demo and close the camera window. Then you can scroll up in your terminal window to see what the camera identified. If you want to see the terminal and camera preview at the same time, you can connect your Raspberry Pi to WiFi and then connect to it from another computer via SSH (see steps 48-63 of the Voice Kit guide for more information).

End the image classification camera demo

When you’re done playing with the image classifier demo, press Ctrl-C to end it. Ctrl-C interrupts the running process and returns control to the shell, bringing you back to the prompt.

The image classification camera demo will run indefinitely until you interrupt it or power the device off. Don’t worry: you can always press Ctrl-C to stop it.

Try face detection in the live camera

Try the face detection camera demo

To start the face detection demo, type the following command and press enter:

./face_detection_camera.py

This demo enables your Vision Kit to identify faces, print out how many it sees, and draw boxes around where the faces are located. You’ll be able to see this if you have an attached monitor.

If it's working, you will see a camera window pop up on your monitor (if one is attached) and the output from the model will start printing to your terminal. If you are brought back to the prompt after seeing error text, check out the Using the Vision Kit section of the help page for troubleshooting tips.

Try the demo

Move the camera around and watch the demo output. Iteration tells you the number of times the model has run. num_faces is the model’s best guess at how many faces are in view of the camera.

Try moving the camera quickly, or further away. Does it have a harder time guessing the number of faces?

Help! The camera is blocking my terminal window: If you are connected directly to your Raspberry Pi via mouse, monitor, and keyboard, the camera window might block your terminal. That’s okay - your terminal is still there in the background. Press Ctrl-C after pointing your camera at a few objects to stop the demo and close the camera window. Then you can scroll up in your terminal window to see what the camera identified. If you want to see the terminal and camera preview at the same time, you can connect your Raspberry Pi to WiFi and then connect to it from another computer via SSH (see steps 48-63 of the Voice Kit guide for more information).

End the face detection camera demo

When you’re done experimenting with the face detection demo, press Ctrl-C to end it. This will bring you back to the prompt.

Take a photo when a face is detected

Try the face camera trigger demo

With this demo, your Vision Kit will automatically take a photo when it detects a face. To start it, type the following command and press enter:

./face_camera_trigger.py

If you have a monitor attached, you’ll see a blinking cursor and a camera window pop up. It will remain in this state until the camera sees a face. Point the camera at yourself or a friend. Try making a bunch of faces and experiment with what the machine considers to be a face.

When it sees a face, it will take a photo and create an image called faces.jpg in your current directory, and then close the camera window and bring you back to the prompt.

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check that a photo was created

To check that a photo was created, type ls at the prompt and press enter.

You should now see a file called faces.jpg listed in your current directory.

Hint Each time you run face_camera_trigger.py you will overwrite faces.jpg. If you want to rename the last photo you took so that you don’t overwrite it, type the following command and press enter:

mv faces.jpg newname.jpg
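If you’d rather not rename photos by hand each time, a timestamped filename guarantees each capture gets a unique name. Here’s a small Python sketch of the idea (a hypothetical helper, not included with the kit):

```python
# Sketch: build a timestamped filename so each capture is kept,
# instead of overwriting faces.jpg on every run.
from datetime import datetime

def unique_name(prefix="faces", ext="jpg"):
    """Return e.g. 'faces-20240101-120000.jpg' based on the current time."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return f"{prefix}-{stamp}.{ext}"

print(unique_name())
```

You could then rename the demo’s output with something like mv faces.jpg followed by the generated name.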

View an image on your Pi

Optional: View the image

If you’re connected via SSH and have a monitor attached, type the following command and press enter to see the photo you just took:

DISPLAY=:0 gpicview faces.jpg

To close the photo window, press Ctrl-C.

If you’re connected to your Vision Kit via mouse and keyboard, type the following command and press enter:

gpicview faces.jpg

To close it, click the X on the photo window or press Q on your keyboard.

You can open up another image file by replacing “faces.jpg” with the name of the file you want to open.

If you don’t have a monitor, you will not be able to view the photo, but you can still use it in later demos.

What is gpicview? gpicview is an application that you can use to display an image. You need to type “DISPLAY=:0” when connected to your Pi via SSH to tell gpicview to display the image on the attached screen.

Not seeing anything on your monitor? If your monitor looks like it’s asleep, try pressing Ctrl-C to interrupt your previous command and return to the prompt. Then type “DISPLAY=:0 xset s activate” and press enter. Then try to view the image again by typing the command shown above.

Take a photo using raspistill

Take a photo

The next demos will show you how to use image files taken with your Vision Kit as input (instead of the live camera).

Type the following command and press enter, replacing image.jpg with whatever filename you’d like to use:

raspistill -w 1640 -h 922 -o image.jpg

After you press enter, the camera will wait 5 seconds and then take the photo.

What should I name my file? “image.jpg” is the name of the file the command will write the image to. You can name your file anything you want, as long as you use only letters, numbers, dashes, and underscores. It’s good practice to end your filename in .jpg since this command saves the image in the JPEG format.

What does this command mean? raspistill is a command that lets you capture photos using your Raspberry Pi camera module. The -w and -h flags tell the command the width and height of the image to capture from the camera. The -o flag tells the command which file to write the image to. The Raspberry Pi camera can produce images of different sizes, just like your smartphone; for the demos to work, we need to make sure the images aren’t too large.
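As a quick illustration of the filename rule above, here’s a Python sketch (a hypothetical check, not part of the kit) that tests whether a name uses only letters, numbers, dashes, and underscores and ends in .jpg:

```python
import re

# Sketch: validate a filename against the rule described above:
# letters, numbers, dashes, and underscores only, ending in .jpg.
VALID = re.compile(r"^[A-Za-z0-9_-]+\.jpg$")

def is_safe_jpg_name(name):
    return bool(VALID.match(name))

print(is_safe_jpg_name("my-photo_01.jpg"))  # True
print(is_safe_jpg_name("my photo.jpg"))     # False: contains a space
```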

Check that your photo was created

To check that your photo was written to your SD Card, type ls at the prompt and press enter. You should now see the filename you used in the step above.

To view this image, follow the instructions in step 106 using the new filename.

Quick tip Press the up and down arrow keys at the prompt to scroll through a history of commands you've run. To rerun a command, it's easier to press the arrows until the one you want is shown. You can edit the command if needed, then press enter.

Try face detection on an image

Try the face detection demo

Let’s use the photo you took in step 104 with the face detection model. If you skipped that step, check it for instructions or make sure you have a photo with a face on your SD card.

To run the demo, type the following command in your terminal and press enter (it might take a while to run):

./face_detection.py --input faces.jpg

If you named your image file from step 104 something different, or want to use a different image, replace “faces.jpg” with the name of the file you want to use.

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

When it’s done, you should get something like this:

Face #0: face_score=0.989258, joy_score=0.969556, bbox=(632.0, -15.0, 782.0, 782.0)

face_score is how certain the model is that it’s found a face, and joy_score is how happy it appears the person is (both scores are out of 1). bbox tells you where the face is located in the image.
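If you want to use these numbers in your own scripts, here’s a Python sketch (an illustrative parser, not part of the kit) that pulls the scores and bounding box out of one line of the demo’s printed output:

```python
import re

# Sketch: parse one line of the face detection demo's output
# (the line format shown above) into usable values.
line = "Face #0: face_score=0.989258, joy_score=0.969556, bbox=(632.0, -15.0, 782.0, 782.0)"

# Grab the two named scores as strings.
scores = dict(re.findall(r"(face_score|joy_score)=([\d.]+)", line))
# Grab the four bbox coordinates as floats.
bbox = tuple(float(v) for v in re.findall(r"-?\d+\.\d+", line.split("bbox=")[1]))

print(float(scores["joy_score"]), bbox)
```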

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect a face in the photo. Try taking another photo of a face using face_camera_trigger.py and then running the command again.

Try object detection on an image

Try the object detection demo

The object detection demo takes an image and checks whether it contains a person, cat, or dog.

To run the demo, type this command and press enter, replacing image.jpg with an image from a previous step:

./object_detection.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

It will take a minute to process the image and output the results on your terminal screen.

kind is the type of object detected and score is how confident the model is about the result it gave. bbox is where that object is located in the image.

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect a person, cat, or dog in the photo. Try taking another photo using step 107 and then running the command again.

Try dish classification on an image

Try the dish classifier demo

The dish classifier model can identify food from an image.

First, have an image of food ready. Step 107 explains how to take a photo.

Type in the following command and press enter, replacing image.jpg with the photo you took:

./dish_classifier.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

When it’s done processing (it may take a minute), you’ll get a list of results, along with the type of food identified and a probability score indicating how confident the model is of its answer (out of 1).

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect anything, womp. Try taking another photo of food using step 107 and then running the command again.

Try image classification on an image

Try the image classification demo

Let’s try the image classifier on the photos that you’ve taken.

Type the following command and press enter, replacing image.jpg with one of the photos you took earlier:

./image_classification.py --input image.jpg

Seeing an error? Check out the Using the Vision Kit section of the help page for troubleshooting tips.

Check the results

Like the camera image classifier, you will get a list of results, which includes the kind of object and the model’s level of confidence.

Nothing happened: If you’re brought back to the prompt and don’t see any output in white, the model didn’t detect anything, womp. Try taking another photo using step 107 and then running the command again.

Shut down your kit

When you’re done with your Vision Kit for the day, it’s important to shut it down properly to make sure you don’t corrupt the SD Card.

Type the following command and press enter:

sudo poweroff

Once you see the prompt to "(R)econnect, (C)hoose another connection, or E(x)it?" and the green LED on the Raspberry Pi has turned off, you can unplug the power supply from your kit safely. Remember when you reconnect your power supply to wait until the LED stops blinking before reconnecting your kit via SSH.

Reconnecting your kit

To reconnect your kit, plug your kit back into the power supply and wait for it to boot up (about 2 minutes).

If you’re using a monitor, mouse, and keyboard, make sure they’re connected before you boot your kit. Once the kit is booted, open up a terminal and you’re good to go.

If you’re using SSH, once your kit is booted reconnect via the Secure Shell Extension (see step 82 in connect to your kit). Note: You may have to re-pair your kit via the app.

What's next?

Congrats! You’ve set up your very own intelligent camera.

Now that you’ve got a taste of what the Vision Kit can do, we’d love to see what you do with it. In the next section, we’ve included hardware details, APIs, and tools to help you get your own intelligent vision projects up and running.

Share your creations with the maker community at #aiyprojects

Makers guide

TensorFlow Model Compiler

Vision Kit allows you to run a customized model on device. First, you must specify and train your model in TensorFlow. Learn more about TensorFlow.

To get started on deploying your model on Vision Kit, export the model as a frozen graph. You then compile the frozen graph using bonnet_model_compiler (licensed under Apache 2.0) into a binary file that can be loaded and run on Vision Bonnet.

NOTE: The compiler works only on x86_64 CPUs running Linux. It was tested on Ubuntu 14.04. Do NOT run it on the Vision Bonnet.

To unzip the file, do tar -zxvf bonnet_model_compiler_yyyy_mm_dd.tgz. This should give you bonnet_model_compiler.par (you might need to chmod u+x bonnet_model_compiler.par after downloading).

Due to limited hardware resource on Vision Bonnet, there are constraints on what type of models can run on device, detailed in the Constraints section below. The bonnet_model_compiler.par program will perform checks to make sure your customized model can run on device.

Note: You can run this tool as soon as you have a frozen graph, even before training has converged at all, to make sure your model can run on the Vision Bonnet. You can use the checkpoint generated at training step 0, or export a dummy model with random weights after defining your model in TensorFlow. In addition, it's highly recommended that you run the compiled binary on the Vision Bonnet as well to make sure it returns a result.
Constraints
  1. Model takes square RGB image and input image size must be a multiple of 8.

    Note: The Vision Bonnet handles down-scaling, so when doing inference you can provide an image that is larger than the model's input image size. The inference image's size also does not need to be a multiple of 8.

  2. Model's first operator must be tf.nn.conv2d.

  3. Model should be trained in NHWC order.

  4. Model's structure should be acyclic.

  5. When running inference, batch size is always 1.
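The size constraints above are easy to verify yourself before compiling. Here’s a minimal Python sketch of such a pre-check (a hypothetical helper; bonnet_model_compiler.par still performs the authoritative checks):

```python
# Sketch: pre-check constraint 1 above (square input, size a multiple of 8)
# before handing a model to the compiler.
def check_input_size(width, height):
    """Return a list of constraint violations (empty means OK)."""
    problems = []
    if width != height:
        problems.append("input image must be square")
    if width % 8 != 0:
        problems.append("input size must be a multiple of 8")
    return problems

print(check_input_size(160, 160))  # [] -> OK (the Mobilenet example size)
print(check_input_size(150, 150))  # 150 is not a multiple of 8
```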

Supported operators and configurations

The following subset of TensorFlow operators can be processed by the model compiler and run on device. There are additional constraints on the inputs and parameters of some of these ops, imposed by the need for them to run efficiently on the Vision Bonnet processor.

TF operators and their supported on-device configurations:

tf.nn.conv2d
    Input tensor depth must be divisible by 8 unless it is the first operator of the model.
    filter: [k, k, in_channels, out_channels], k = 1, 2, 3, 4, 5
    strides: [1, s, s, 1], s = 1, 2
    padding: VALID or SAME
    data_format: NHWC

tf.nn.depthwise_conv2d
    filter: [k, k, in_channels, channel_multiplier], k = 3, 5; channel_multiplier = 1
    strides: [1, s, s, 1], s = 1, 2
    padding: VALID or SAME
    data_format: NHWC

tf.nn.max_pool
    Input tensor depth must be divisible by 8.
    ksize: [1, k, k, 1], k = 2, 3, 4, 5, 6, 7
    strides: [1, s, s, 1], s <= k
    padding: VALID or SAME
    data_format: NHWC

tf.nn.avg_pool
    ksize: [1, k, k, 1], k = 2, 3, 4, 5, 6, 7
    strides: [1, s, s, 1], s <= k
    padding: VALID or SAME
    data_format: NHWC

tf.matmul
    If a is an MxK matrix and b is a KxN matrix, K must be a multiple of 8.
    a: rank-1 or rank-2 tensor
    b: rank-1 or rank-2 tensor
    transpose_a: False
    transpose_b: False
    adjoint_a: False
    adjoint_b: False
    a_is_sparse: False
    b_is_sparse: False

tf.concat
    axis: 1, 2, or 3

tf.add
    Supported.

tf.multiply
    Supported.

tf.nn.softmax
    dim: -1

tf.sigmoid
    x: tensor's shape must be [1, 1, 1, k]

tf.nn.l2_normalize
    Input tensor depth must be a multiple of 8.
    dim: -1

tf.nn.relu
    Supported.

tf.nn.relu6
    Supported.

tf.tanh
    Supported.

tf.reshape
    The first dimension of the tensor cannot be reshaped; that is, shape[0] = tensor.shape[0].
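To illustrate how the tf.nn.conv2d entry above reads, here’s a small Python sketch (a hypothetical checker, not an official tool) that tests a convolution configuration against those constraints:

```python
# Sketch: check a tf.nn.conv2d configuration against the constraints
# listed for it above (k in 1..5, s in 1..2, depth divisible by 8
# unless it is the model's first operator).
def conv2d_supported(k, s, in_channels, is_first_op=False):
    if k not in (1, 2, 3, 4, 5):
        return False
    if s not in (1, 2):
        return False
    if not is_first_op and in_channels % 8 != 0:
        return False
    return True

print(conv2d_supported(k=3, s=2, in_channels=32))  # True
print(conv2d_supported(k=3, s=1, in_channels=12))  # False: depth not divisible by 8
```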
Supported Common Graphs
Mobilenet: input size 160x160, depth multiplier = 0.5
SqueezeNet: input size 160x160, depth multiplier = 0.75
How to run the compiler

To run the compiler:

./bonnet_model_compiler.par -- \
    --frozen_graph_path=<frozen_graph_path> \
    --output_graph_path=<output_graph_path> \
    --input_tensor_name=<input_tensor_name> \
    --output_tensor_names=<output_tensor_names> \
    --input_tensor_size=<input_tensor_size>

Take mobilenet_v1_160res_0.5_imagenet.pb (available after download) as an example. Put mobilenet_v1_160res_0.5_imagenet.pb in the same folder as bonnet_model_compiler.par.

./bonnet_model_compiler.par -- \
    --frozen_graph_path=./mobilenet_v1_160res_0.5_imagenet.pb \
    --output_graph_path=./mobilenet_v1_160res_0.5_imagenet.binaryproto \
    --input_tensor_name="input" \
    --output_tensor_names="MobilenetV1/Predictions/Softmax" \
    --input_tensor_size=160

This generates the Vision Bonnet model binary ./mobilenet_v1_160res_0.5_imagenet.binaryproto.

GPIO Diagrams

GPIO Diagram Front

GPIO Diagram Back

AIY Microcontroller

The Vision Bonnet is the first AIY project to include a dedicated AIY microcontroller. The MCU adds features that the Raspberry Pi alone doesn’t provide. These GPIOs are accessible via 1mm pitch pins on the top of the Vision Bonnet (connector P2). The MCU enables:

  • PWM support for servo/motor control without taxing the Raspberry Pi Zero's CPU.
  • Control of the two LEDs on the bonnet.
  • More accurate analog channels than the Raspberry Pi Zero provides.
  • Freeing up Raspberry Pi GPIOs for other uses.

Control of the GPIOs is available in user-space code via sysfs nodes, or from Python with the included adaptation (aiy.vision.pins) of the popular gpiozero library. The aiy.vision.pins package provides gpiozero-compatible pin specifications for the additional I/O functionality on the bonnet. These definitions can be used to construct standard gpiozero devices like LEDs, Servos, and Buttons. Examples can be found in ~/AIY-projects-python/src/examples/vision/gpiozero/.

Project complete!

You did it! Whether this was your first hackable project or you’re a seasoned maker, we hope this project has sparked new ideas for you. Keep tinkering, there’s more to come.