I’ve been working on an object counting project using the ESP32-CAM and OpenCV. I got the basic setup working, but I’m running into issues with accuracy in changing lighting conditions. I’m wondering if anyone here has tried improving detection using background subtraction or more advanced tracking techniques? Any tips for reducing false positives or handling multiple objects moving at once would be great!
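For anyone following along, here is a minimal sketch of the background subtraction idea mentioned above, assuming grayscale frames held as NumPy arrays. The function names, the `alpha` blend rate, and the `threshold` value are illustrative assumptions, not tuned settings. On a host machine, OpenCV's `cv2.createBackgroundSubtractorMOG2` provides a more robust version of the same idea.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (exponential running average)."""
    return (1.0 - alpha) * background + alpha * frame.astype(np.float64)

def foreground_mask(background, frame, threshold=25.0):
    """Mark pixels that differ from the background model by more than `threshold`."""
    return np.abs(frame.astype(np.float64) - background) > threshold

# Synthetic 8x8 grayscale scene: flat background at level 100 (made-up values).
background = np.full((8, 8), 100.0)

# New frame: ambient light drifted up by 10 levels, plus a bright 2x2 "object".
frame = np.full((8, 8), 110.0)
frame[2:4, 3:5] = 200.0

mask = foreground_mask(background, frame)
background = update_background(background, frame)

# Only the object pixels exceed the threshold; the small lighting drift does not.
print(int(mask.sum()), "foreground pixels")
```

Because the model keeps blending in new frames, slow lighting drift gets absorbed into the background instead of showing up as foreground. A common refinement is to update only the pixels where the mask is false, so a parked object is not absorbed too quickly.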
Some of the ESP32-CAMs I’ve used were a bit shoddy, but even with a good one, running naive object detection on an ESP32 will essentially max it out. You can train the face detection algorithms on images taken under the different lighting conditions to help account for them.
You can also implement some thresholding that takes the average brightness of the frame, normalizes it against other data (recent frames, for example), then maximizes contrast. That should help with most everything you’ve described.
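A rough sketch of that brightness normalization and contrast step, assuming 8-bit grayscale frames as NumPy arrays. The helper names, the `target_mean` of 128, and the 2nd/98th percentile cutoffs are assumptions chosen for illustration, not values from this thread.

```python
import numpy as np

def normalize_brightness(frame, target_mean=128.0):
    """Scale the frame so its mean brightness matches `target_mean`."""
    frame = frame.astype(np.float64)
    mean = frame.mean()
    if mean == 0:
        return frame  # all-black frame: nothing to scale
    return np.clip(frame * (target_mean / mean), 0, 255)

def stretch_contrast(frame, low_pct=2, high_pct=98):
    """Linearly stretch intensities between two percentiles to the 0-255 range."""
    frame = frame.astype(np.float64)
    lo, hi = np.percentile(frame, [low_pct, high_pct])
    if hi <= lo:
        return frame  # flat image: no contrast to stretch
    return np.clip((frame - lo) / (hi - lo) * 255.0, 0, 255)

# The same synthetic scene under dim light and under 4x brighter light.
dim = np.array([[20, 20, 40], [20, 40, 40]], dtype=np.uint8)
bright = dim * 4

dim_norm = normalize_brightness(dim)
bright_norm = normalize_brightness(bright)
stretched = stretch_contrast(dim_norm)

# After normalization the two lighting conditions look alike to later stages.
print(np.allclose(dim_norm, bright_norm))
print(stretched.min(), stretched.max())
```

Running a fixed threshold on the normalized, stretched frame instead of the raw one means the threshold does not have to be retuned every time the ambient light changes. This only compensates for global brightness shifts, not for shadows or local glare.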
If you decide you’re into running CV on small boards, a Pi with an Arducam works great, and a Jetson Nano works great with most cameras. You can also use most USB webcams with a Pi or Jetson.
The human eye and brain do lots of image processing “in the background” that is extremely difficult to imitate with cameras and computer algorithms. For example, mere color recognition under changing light conditions is a far more difficult problem than most people realize.
Also keep in mind that the eye has a roughly logarithmic response to illumination levels spanning many orders of magnitude, whereas camera sensors have a roughly linear response. As mentioned above, a better camera with greater bit depth will help overcome that range problem.
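To put numbers on the range problem, here is a small illustration comparing an idealized linear 8-bit quantizer with an idealized logarithmic one. Both functions and the luminance values are simplifying assumptions for the sake of the example, not models of a real sensor or of the eye.

```python
import numpy as np

# Scene luminance spanning five orders of magnitude (arbitrary units).
luminance = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0, 100_000.0])

def linear_codes(lum, full_scale=100_000.0, levels=256):
    """Idealized linear response: equal luminance steps get equal code steps."""
    codes = np.rint(lum * (levels - 1) / full_scale)
    return np.clip(codes, 0, levels - 1).astype(int)

def log_codes(lum, min_lum=1.0, full_scale=100_000.0, levels=256):
    """Idealized log response: equal luminance ratios get equal code steps."""
    span = np.log10(full_scale / min_lum)
    clipped = np.clip(lum, min_lum, full_scale)
    codes = np.rint(np.log10(clipped / min_lum) * (levels - 1) / span)
    return codes.astype(int)

lin = linear_codes(luminance)
log = log_codes(luminance)

# The linear quantizer maps the three darkest levels to the same code (0),
# while the log quantizer spaces all six levels evenly across the code range.
print("linear:", lin.tolist())
print("log:   ", log.tolist())
```

With only 256 linear codes, everything in the darkest three decades collapses into a single value, which is why extra bit depth helps: more codes leave more room to separate dark tones before any log or gamma mapping is applied.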
Thanks a lot for the suggestions! I’ll try tweaking the brightness threshold and contrast like you mentioned. Also, the OpenMV Cam looks really interesting. I might give that a shot for better performance.
That’s a great point. I hadn’t really thought about how much our eyes handle behind the scenes. Makes sense why lighting throws things off so much. I’ll definitely look into better camera options with higher dynamic range.