This guide introduces OpenCV.js and OpenCV tools for the ESP32 Camera Web Server environment. As an example, we’ll build a simple ESP32 Camera Web Server that includes color detection and tracking of a moving object.
This tutorial is by no means an exhaustive treatment of all that OpenCV can offer to ESP32 camera web servers. It is expected that this introduction will inspire additional OpenCV work with the ESP32 cameras.
This project/tutorial is based on a project by Andrew R. Sass and was edited by Sara Santos.
If you have little or no experience with the ESP32 camera development boards, you can start with the following tutorial.
The project we’ll build throughout this tutorial creates a web server that allows color tracking of a moving object. On the web server interface, you adjust several settings to select the color you want to track. Then, the browser sends the real-time x and y coordinates of the center of mass of the moving object to the ESP32 board.
Here’s a preview of the web server.
Before proceeding with this project, make sure you meet the following prerequisites.
We’ll program the ESP32 board using Arduino IDE. So, you need the Arduino IDE installed as well as the ESP32 add-on:
VS Code (optional)
If you prefer to use VS Code + PlatformIO to program your board, you can follow the next tutorial to learn how to set up VS Code to work with the ESP32 boards.
Getting an ESP32 Camera
This project is compatible with any ESP32 camera board that features an OV2640 camera. There are several ESP32 camera models out there. For a comparison of the most popular cameras, you can refer to the next article:
Make sure you know the pin assignment for the camera board you’re using. For the pin assignment of the most popular boards, check this article:
Code – ESP32-CAM with OpenCV.js
The program consists of two parts:
Create a new Arduino sketch called OCV_ColorTrack_P and copy the following code.
Save that file.
Then, open a new tab in the Arduino IDE as shown in the following image.
Name it index_OCV_ColorTrack.h.
Copy the following into that file.
For the program to work properly, you need to insert your network credentials in the following variables in the OCV_ColorTrack_P.ino file:
Camera Pin Assignment
By default, the code uses the pin assignment for the ESP32-CAM AI-Thinker module.
If you’re using a different camera board, don’t forget to insert the right pin assignment. You can go to the following article to find the pinout for your board:
How the Code Works
Continue reading to learn how the code works, or skip to the next section.
To make the program easier to read and understand, annotations (ANN) have been added as comments in the program.
For example, the wiring for the ESP32-CAM is listed below the ANN:0 annotation in the .ino file. You can find ANN:0 with the Edit > Find command of the Arduino IDE.
The server program, OCV_ColorTrack_P.ino, is taken from ESP32-CAM Projects, Module 5 by Rui Santos and Sara Santos. It has a standard ESP32 camera setup(), which configures the camera and the server IP address and password.
Annotation 1 (ANN:1)
What is not standard in this server program is a set of vitally important instructions that set the Access-Control headers. See the code at ANN:1.
These instruct the browser to allow the camera image and OpenCV.js, which have different origins, to work together in the program. Without these instructions, the Chrome browser throws errors.
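To make "different origins" concrete, here is a small sketch using the standard URL class. Both URLs are assumptions for illustration: the camera address is the example IP used later in this tutorial with a hypothetical /capture path, and the script URL is just one place where OpenCV.js might be hosted.

```javascript
// Hypothetical URLs: the /capture path and the script location are
// illustrative only and are not taken from the project code.
const cameraImageUrl = "http://192.168.1.95/capture";
const openCvScriptUrl = "https://docs.opencv.org/master/opencv.js";

// Two URLs share an origin only if scheme, host, and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(new URL(cameraImageUrl).origin);  // http://192.168.1.95
console.log(new URL(openCvScriptUrl).origin); // https://docs.opencv.org
console.log(sameOrigin(cameraImageUrl, openCvScriptUrl)); // false
```

Because the two origins differ, the browser applies its cross-origin rules, which is why the server has to send the Access-Control headers.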
Annotation 2 (ANN:2)
The server loop() monitors client messages and decodes them via the ExecuteCommand() function found at ANN:2.
The original program uses this function to receive and apply the values of the sliders in the client, which control image characteristics and are transmitted by the client via a “fetch” instruction.
In the current program, this feature is used to communicate the “center of mass” of a color target detected by the client to the ESP32 server, a capability that is vital to a robotics application.
Other than extracting the x and y center-of-mass coordinates and printing them, there are no changes to the server program.
Client Sketch (OpenCV.js)
Apart from the image-characteristics slider routines, their data transmission to the server via “fetch”, and the error routine, all of which come from the original client program in the eBook referenced above, the client program here is new. It contains code devoted to applying OpenCV.js to the ESP32 camera image transmitted to the browser. As mentioned previously, a “fetch” is used to transmit color target data to the server.
The client code is sprinkled liberally with console.log instructions, which allow the user to see the results of the code. In Chrome, open the console by pressing CTRL+SHIFT+J.
Annotation 3 (ANN:3)
ANN:3 includes the latest version of OpenCV.js in our web page. Click here to learn more.
ANN:READY marks the module which signals that OpenCV.js has been initialized. Once initialization is complete, the Color Detection button can be clicked. While faster computers do not require this capability, it is included for the sake of completeness.
Annotation 4 (ANN:4)
The screenshot of the client program running on Chrome shows two columns, as created by the HTML section of the code. The left column shows the original camera image, which is transmitted at approximately 1 fps. This image, with an ID of ShowImage, is the source image for the OpenCV routine in the program.
ANN:4 marks the creation of src and its characteristics (rows, cols, etc.).
RGB Color Trackbars
Below the source image, in addition to the original three image-characteristics sliders (Quality, Brightness, and Contrast), there are RGB Color Trackbars.
These set the limits of the color range allowed in the “processed” image in the CV application. Code for the trackbars is found at ANN:5 and ANN:6.
The maximum and minimum values of red, green, and blue (RGB) are applied to the OpenCV function, inRange(), at ANN:7.
The image has 4 channels, RGBA, where A is the level of transparency. In this tutorial, A is set to 100% opacity, namely 255. The code relies on the fact that, besides the A plane, the image has 3 color planes (R, G, and B), with each pixel in each plane having a value between 0 and 255. The high and low limits are applied to the corresponding color plane for each pixel.
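The per-pixel test described above can be sketched in plain JavaScript. This is not the project's code and does not call OpenCV.js; it only mimics the range check on a flat RGBA array, and the limit values are hypothetical trackbar settings.

```javascript
// Pure-JavaScript sketch of the per-pixel range test on RGBA data.
// `rgba` is a flat array [R,G,B,A, R,G,B,A, ...]; `low` and `high` are
// 4-element limits in the same channel order.
function inRangeRgba(rgba, low, high) {
  const mask = new Uint8Array(rgba.length / 4);
  for (let p = 0; p < mask.length; p++) {
    let inside = true;
    for (let c = 0; c < 4; c++) {
      const v = rgba[4 * p + c];
      if (v < low[c] || v > high[c]) inside = false;
    }
    mask[p] = inside ? 255 : 0; // white only where every channel is within limits
  }
  return mask;
}

// Two pixels: a reddish one and a bluish one, both fully opaque (A = 255).
const pixels = Uint8ClampedArray.from([200, 40, 30, 255, 20, 30, 210, 255]);
const low = [150, 0, 0, 255];    // hypothetical minimums (R, G, B, A)
const high = [255, 90, 90, 255]; // hypothetical maximums (R, G, B, A)

console.log(Array.from(inRangeRgba(pixels, low, high))); // [ 255, 0 ]
```

The result is a single-channel mask: 255 where the pixel falls inside all four ranges, 0 elsewhere.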
Note that inRange() has a destination image which has been created previously in the program (ANN:8).
Important: every image created in an OpenCV program has to be deleted to avoid computer memory leakage (ANN:8A).
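One common way to make sure every image gets deleted is a try/finally block. The sketch below uses a stand-in class (FakeMat, a made-up name) instead of a real OpenCV.js Mat so that it runs anywhere; with OpenCV.js loaded, the same pattern would apply to Mats, whose memory lives on the Emscripten heap and is only released by calling delete().

```javascript
// Stand-in for an OpenCV.js Mat so the pattern runs without OpenCV.js.
// It only counts how many instances have not been deleted yet.
class FakeMat {
  static live = 0;
  constructor() { FakeMat.live++; }
  delete() { FakeMat.live--; }
}

function processFrame() {
  const src = new FakeMat();
  const mask = new FakeMat();
  try {
    // ... image processing would go here ...
    return "done";
  } finally {
    // Runs on success and on exceptions, so nothing is leaked.
    src.delete();
    mask.delete();
  }
}

processFrame();
console.log(FakeMat.live); // 0
```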
The destination image Mask1 is not shown in the program although it could be. However, it is used by the threshold() function immediately following inRange().
The threshold() function examines the composite source image pixel value and sets the corresponding destination value to either 0 or 255, depending on whether the source value is less than or greater than the threshold value. The top image in the right-hand column shows this binary image.
For the sake of completeness, an invert capability has been added to the binary image. When the INVERT button in the web page is clicked, the binary image is inverted (black becomes white, white becomes black) and subsequent processing is performed on the new image. The button is bistable, so that a second push returns the binary image to its original state.
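As a rough illustration of the two steps above, here is a plain-JavaScript version of a binary threshold followed by an inversion. The function names are hypothetical and the pixel values are made up; the project itself uses the OpenCV.js functions.

```javascript
// Binary threshold: values strictly above `thresh` become `maxval`, the rest 0.
function thresholdBinary(gray, thresh, maxval = 255) {
  return Uint8Array.from(gray, (v) => (v > thresh ? maxval : 0));
}

// Invert a binary image: black becomes white and white becomes black.
function invert(binary) {
  return Uint8Array.from(binary, (v) => 255 - v);
}

const gray = [0, 100, 101, 255];
const binary = thresholdBinary(gray, 100);

console.log(Array.from(binary));         // [ 0, 0, 255, 255 ]
console.log(Array.from(invert(binary))); // [ 255, 255, 0, 0 ]
```

Inverting twice returns the original binary image, which is the bistable behavior of the INVERT button.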
In the screenshot, a red cap is the target in an ordinary room environment with an ordinary 60W fluorescent lamp. The lamp emits red, green, and blue. The red cap reflects red, green, and blue but principally red. The method of detecting the amount of each reflected color will be described now. This method allows the RGB trackbars to be set with minimal effort. Its use is strongly advised.
The method involves using the Color Probe sliders. These two sliders, X Probe and Y Probe, are used to place a small white circle probe at a desired position in the bottom image of the right-hand column. The RGB values at this probe position are measured and used to set the inRange() RGB maximums and minimums described previously.
See ANN:9, 9A, 9B, and 9C for the code associated with this probe.
When the optimum values for a desired target are found using the X, Y probe and set with the trackbars, the target in the binary image is ideally white and the remainder of the image is black, as shown in the screenshot.
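The idea behind the probe can be sketched as follows, assuming the image is available as a flat RGBA array. The function names and the tolerance value are hypothetical; the tolerance is only one possible way to turn a measured value into a minimum and maximum.

```javascript
// Read the RGBA values at column x, row y of a flat RGBA array.
function probePixel(rgba, cols, x, y) {
  const i = (y * cols + x) * 4;
  return { r: rgba[i], g: rgba[i + 1], b: rgba[i + 2], a: rgba[i + 3] };
}

// Suggest trackbar limits around a measured value, clamped to 0..255.
function limitsAround(value, tolerance) {
  return {
    min: Math.max(0, value - tolerance),
    max: Math.min(255, value + tolerance),
  };
}

// A made-up 2x2 image; the bottom-right pixel is the "red cap".
const image = Uint8ClampedArray.from([
  10, 10, 10, 255,   20, 20, 20, 255,
  30, 30, 30, 255,   210, 45, 38, 255,
]);

const probe = probePixel(image, 2, 1, 1);
console.log(probe);                      // { r: 210, g: 45, b: 38, a: 255 }
console.log(limitsAround(probe.r, 30));  // { min: 180, max: 240 }
```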
This ideal typically can be realized only when lighting conditions can be closely controlled. Indoor, standard room lighting is acceptable. Filters can be used for optimal results but were not used here.
Here’s another example:
Once the binary image is deemed acceptable, the TRACKING button, which is bistable, can be clicked. ANN:10 marks the beginning of the tracking routine.
Since the INVERT capability is included only for completeness, only the case where b_invert equals false is of interest here.
ANN:11 marks the first step in tracking, findContours, the OpenCV algorithm that finds the contours of all the white objects in the binary image.
If the tracking button is pressed when the binary image is fully black, the instructions depending on findContours output will throw exceptions; the try-catch allows the program to continue safely, posting an output in the console log and the text box.
contours is the output of findContours and is an array of the contours of the white object(s) found in the binary image. contours.size() returns the number of elements in the array. The hierarchy output (contours inside other contours) is not of concern here, as there will be no white objects (outlined in black) inside other white objects.
ANN:12 marks the beginning of the code that finds the moments of the contours.
M00 is the zeroth moment, the “area” enclosed by a contour. In OpenCV, it is effectively the number of pixels enclosed by the contour. M10 and M01 are the sums of the x and y coordinates, respectively, of the enclosed pixels.
As usual, the origin of the x, y coordinate system is at the upper left corner of the image. X is positive horizontally to the right and Y is positive vertically downward. Therefore, M10/M00 and M01/M00 are the x and y coordinates of the centroid of a contour in the array.
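The centroid formula can be checked by hand with a small sketch. Note the assumption: this version sums over the white pixels of a binary mask directly, rather than over a contour as OpenCV does, but the definitions of M00, M10, and M01 are the same.

```javascript
// Compute M00, M10, M01 over the white pixels of a binary mask and
// return the centroid (M10/M00, M01/M00). The origin is the top-left pixel.
function centroidOfMask(mask, cols) {
  let m00 = 0, m10 = 0, m01 = 0;
  for (let i = 0; i < mask.length; i++) {
    if (mask[i] === 0) continue;
    const x = i % cols;              // column, positive to the right
    const y = Math.floor(i / cols);  // row, positive downward
    m00 += 1;
    m10 += x;
    m01 += y;
  }
  if (m00 === 0) return null; // fully black image: no centroid
  return { x: m10 / m00, y: m01 / m00, area: m00 };
}

// 4x3 mask with a 2x2 white block in columns 1-2, rows 0-1.
const mask = [
  0, 255, 255, 0,
  0, 255, 255, 0,
  0, 0,   0,   0,
];

console.log(centroidOfMask(mask, 4)); // { x: 1.5, y: 0.5, area: 4 }
```

Returning null for a fully black mask avoids the division by zero that M00 = 0 would otherwise cause.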
ANN:13 and ANN:13A mark the code that finds the largest-area contour in the array of contours using the MaxAreaArg function and transmits its centroid (x_cm, y_cm) to the ESP32 via a fetch instruction.
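Selecting the largest contour comes down to finding the index of the maximum in a list of areas. The sketch below is a generic version with a hypothetical name (indexOfLargest); it is not the project's MaxAreaArg function, and the area values are made up.

```javascript
// Return the index of the largest value, or -1 for an empty list.
function indexOfLargest(areas) {
  let best = -1;
  for (let i = 0; i < areas.length; i++) {
    if (best === -1 || areas[i] > areas[best]) best = i;
  }
  return best;
}

console.log(indexOfLargest([12, 480, 35])); // 1
console.log(indexOfLargest([]));            // -1
```

The -1 result for an empty list corresponds to the case where no white object was found in the binary image.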
While the program runs, the centroid coordinates are printed in the Serial Monitor, in the browser console, and in the text box on the web page. The ESP32 can use the centroid data for tracking purposes in robotic applications.
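To illustrate how coordinates could travel in a fetch request, here is a sketch that builds a URL with the centroid as query parameters. The parameter names x and y and the function name are assumptions made for this example; the actual project may format its command differently, so check the code at ANN:13A.

```javascript
// Build a request URL carrying the centroid as query parameters.
// The parameter names "x" and "y" are hypothetical.
function buildCentroidUrl(baseUrl, xCm, yCm) {
  const url = new URL(baseUrl);
  url.searchParams.set("x", String(Math.round(xCm)));
  url.searchParams.set("y", String(Math.round(yCm)));
  return url.toString();
}

const requestUrl = buildCentroidUrl("http://192.168.1.95/", 161.4, 87.6);
console.log(requestUrl); // http://192.168.1.95/?x=161&y=88

// In the browser, fetch(requestUrl) would send it to the ESP32.
// It is not called here so the snippet runs without a board attached.
```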
ANN:14 marks the code for a blue bounding rectangle around the largest-area contour and for the centroid of that contour. Both can be seen in the lower image in the right-hand column of the browser screen.
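A bounding rectangle is simply the smallest upright rectangle that contains every point of the contour. Here is a minimal sketch, assuming the contour is given as a list of integer {x, y} points; the width and height count pixels inclusively, and the sample points are made up.

```javascript
// Smallest upright rectangle containing all contour points.
// Width and height count pixels inclusively (max - min + 1).
function boundingRect(points) {
  if (points.length === 0) return null;
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const { x, y } of points) {
    minX = Math.min(minX, x);
    maxX = Math.max(maxX, x);
    minY = Math.min(minY, y);
    maxY = Math.max(maxY, y);
  }
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

const contour = [
  { x: 40, y: 30 }, { x: 90, y: 30 },
  { x: 90, y: 70 }, { x: 40, y: 70 },
];

console.log(boundingRect(contour)); // { x: 40, y: 30, width: 51, height: 41 }
```

Note that the center of this rectangle is not necessarily the centroid; the two coincide only for symmetric shapes.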
Below the lower image in the right-hand column, a text box contains selected outputs of the program including the X, Y Probe data, the centroid coordinates, and a catch output if an exception is generated as mentioned above.
Uploading the Code
After inserting your network credentials and the pinout for the camera you’re using, you can upload the code.
In the Tools menu, select the following settings before uploading the code to your board.
Testing the Program
After uploading the code, open the Serial Monitor at a baud rate of 115200. Press the on-board RST button, and the ESP IP address should be printed. In this case, the IP address is 192.168.1.95.
Open a browser on your local network and type the ESP32-CAM IP address.
Open the console log when the browser opens. Check if OpenCV.js is loading properly. At the bottom right corner of the web page, it should display “OpenCV.JS READY”.
Then left-click the Color Detection button in the upper left column of the browser window.
You should see a similar window and no error messages.
After setting the right settings to target a color in the Target-color Probe (as explained previously), click the Tracking button.
At the same time, the centroid coordinates of the target should be displayed on the web page as well as on the ESP32-CAM Serial Monitor.
None of the elements of the project described in this tutorial are new. The ESP32 Camera Web Server and OpenCV have each been described extensively and in detail in the literature.
The novelty here is the combination of these two technologies via OpenCV.js. The ESP32 camera, with its small size, Wi-Fi, high-tech, and low-cost capability, promises to be an interesting new front-end image-capture capability for OpenCV web server applications.
Learn more about the ESP32-CAM
We hope you liked this project. Learn more about the ESP32-CAM with our tutorials:
About Andrew R. Sass
This project/tutorial was developed by Andrew R. Sass. We’ve edited the tutorial to match our tutorials’ style. Apart from some CSS, the code is the original provided by Andrew.
Author background: Andrew (“DOC”) R. Sass holds a BSEE (MIT) and an MSEE and PhD EE (Purdue). He is a retired research engineer (integrated circuit components), a second-career retired teacher (AP Physics, Physics, Robotics), and has been a mentor of a local FIRST robotics team.