The design and implementation of a stereo vision system capable of determining the 3D-coordinates of a red laser dot within the field of vision.
The system has two cameras in a fixed, known position relative to each other. From both images the system will determine whether a red laser dot is visible. A red laser was selected because it seems to stand out very well, at least to human perception. The system will then determine the coordinates of the dot in both images and, knowing the camera configuration, calculate an estimate of the 3D-coordinates of the dot.
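To make the last step concrete, here is a minimal sketch of how a 3D estimate could be computed from the two image coordinates. It is not taken from Doublesight. It assumes the simplest possible configuration: two identical pinhole cameras with parallel optical axes, displaced only along the horizontal axis, and already calibrated. The names StereoRig, Point3 and triangulate, and all calibration fields, are hypothetical.

```cpp
#include <cassert>

struct Point3 { double x, y, z; };

// Hypothetical calibration data for an idealised, rectified camera pair.
struct StereoRig {
    double focal_px;   // focal length expressed in pixels
    double baseline_m; // distance between the two camera centres, in metres
    double cx, cy;     // principal point (image centre) in pixels
};

// xl, yl: dot position in the left image; xr: dot column in the right image.
// Returns false when the disparity is not positive (no valid depth).
bool triangulate(const StereoRig& rig, double xl, double yl, double xr, Point3& out) {
    assert(rig.focal_px > 0.0 && rig.baseline_m > 0.0);
    const double disparity = xl - xr;
    if (disparity <= 0.0) return false;
    // Similar triangles: depth is inversely proportional to disparity.
    const double z = rig.focal_px * rig.baseline_m / disparity;
    out.x = (xl - rig.cx) * z / rig.focal_px;
    out.y = (yl - rig.cy) * z / rig.focal_px;
    out.z = z;
    return true;
}
```

The result is expressed in the left camera's coordinate frame. Real webcams would also need lens distortion correction and rectification before these formulas apply.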
The system will utilize two standard webcams as cameras, connected via USB to an x86 Linux PC. The final program will be coded in C++ and it should run in near real time.
I started the project in time, bought the cameras and started coding the image capturing and processing framework in C++. That lasted for a while, but then I had to finish my report for my Information Technology Project GMMBayes. It took two months.
After that I continued coding and got the framework quite complete during my summer vacation. So now I am at the point where the real project would start, but the deadline was yesterday (15.7.2004, postponed twice by a month) and I should start doing real work. I will stop working on this project because I really have no time. If someone, perhaps on the next Machine Vision course, would like to continue my work, I would be happy to assist.
The software is called Doublesight, it is written in C++ and it has the following requirements for compilation and use:
Doublesight has its own command line interface: rather than taking command line parameters, it presents its own command prompt. The core of Doublesight is the Controller class, which has its own thread of execution. The user interface runs in the main thread and sends commands to the Controller. This keeps the user interface responsive while the Controller is busy.
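The split between the user interface thread and the Controller thread can be sketched roughly as follows. This is an illustration only, written with the C++11 standard library rather than whatever threading code Doublesight actually uses. Only the class name Controller comes from the text above; the methods post, stop and handled, and the string commands, are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

class Controller {
public:
    Controller() : worker_(&Controller::run, this) {}
    ~Controller() { stop(); }

    // Called from the user interface thread; returns immediately.
    void post(const std::string& command) {
        assert(!command.empty());
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(command);
        }
        wake_.notify_one();
    }

    // Queues a quit request behind any pending commands and waits for it.
    void stop() {
        if (worker_.joinable()) {
            post("quit");
            worker_.join();
        }
    }

    // Copy of the commands handled so far, in order.
    std::vector<std::string> handled() {
        std::lock_guard<std::mutex> lock(mutex_);
        return handled_;
    }

private:
    // Body of the Controller thread: sleep until a command arrives.
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            wake_.wait(lock, [this] { return !pending_.empty(); });
            const std::string command = pending_.front();
            pending_.pop();
            if (command == "quit") return;
            handled_.push_back(command); // placeholder for capturing, etc.
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::string> pending_;
    std::vector<std::string> handled_;
    std::thread worker_; // declared last so the members above exist first
};
```

The command prompt would call post for each line the user types and go straight back to reading input, which is what keeps it responsive.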
Theoretically Doublesight can handle any number of cameras, but in practice I found that I could not plug more than one camera into a single USB bus. This is probably due to hardware problems. Images are captured in the Controller thread.
Doublesight has certain image handling components called ViewProcessors, which derive from the ViewProcessor abstract class. Any number of ViewProcessors can be used and they are executed in the Controller thread. For a single shot, images from all configured cameras are captured, labelled with the camera name, and sent as a vector to each active ViewProcessor. ViewProcessors may do anything with the raw image data, which is in YUV420P format, but they should not change the original image.
(On the other hand, you could create filtering processors that do change the original image. Since the ViewProcessors are stored in a list, they are executed one by one in the order they were added, so you could create deterministic filter chains and still use even the original ViewProcessors. The filters could not reallocate the bitmap memory area, but they could even change the color format. Of course, the current ViewProcessors always expect to get YUV420P, so the results might be quite funny.)
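A filter chain of this kind might look roughly like the sketch below. It is not the real Doublesight interface: only the name ViewProcessor comes from the text, while CameraImage, process, run_chain and the two example processors are hypothetical. The sketch passes the images by non-const reference so that a filter can modify the pixels in place, which is an assumption made to illustrate the idea in the parenthesis above.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical image record: one YUV420P bitmap labelled with its camera.
struct CameraImage {
    std::string camera_name;
    int width;
    int height;
    std::vector<unsigned char> yuv420p; // Y plane, then U plane, then V plane
    std::size_t luma_size() const { return static_cast<std::size_t>(width) * height; }
};

class ViewProcessor {
public:
    virtual ~ViewProcessor() {}
    virtual void process(std::vector<CameraImage>& shot) = 0;
};

// A filtering processor: inverts the Y plane in place, leaves U and V alone.
class InvertLuminance : public ViewProcessor {
public:
    void process(std::vector<CameraImage>& shot) {
        for (CameraImage& image : shot) {
            assert(image.yuv420p.size() >= image.luma_size());
            for (std::size_t i = 0; i < image.luma_size(); ++i)
                image.yuv420p[i] = static_cast<unsigned char>(255 - image.yuv420p[i]);
        }
    }
};

// A read-only processor: records the mean luminance of each image.
class MeanLuminance : public ViewProcessor {
public:
    std::vector<double> means;
    void process(std::vector<CameraImage>& shot) {
        means.clear();
        for (const CameraImage& image : shot) {
            assert(image.luma_size() > 0 && image.yuv420p.size() >= image.luma_size());
            unsigned long sum = 0;
            for (std::size_t i = 0; i < image.luma_size(); ++i) sum += image.yuv420p[i];
            means.push_back(static_cast<double>(sum) / image.luma_size());
        }
    }
};

// Processors run one by one in insertion order, so later ones see the
// output of earlier filters.
void run_chain(const std::list<ViewProcessor*>& chain, std::vector<CameraImage>& shot) {
    for (ViewProcessor* processor : chain) processor->process(shot);
}
```

Because the list preserves insertion order, adding InvertLuminance before MeanLuminance makes the mean reflect the inverted image, while the opposite order reports the mean of the original.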
An abstract class called WindowedView is derived from the ViewProcessor class. WindowedView uses the TXWindow class to offer image display capabilities by creating an X window for each image (camera) in the vector. The TXWindow class is threaded and keeps displaying the image once it has been set. All current ViewProcessors are derived from WindowedView, but for the 3D application the WindowedView interface is not suitable.
The currently implemented ViewProcessors are:
A UML class diagram is here (34kiB).
Particularly in pictures 4 and 5 you can see the effect of the automatic exposure algorithm of the webcam. Picture 4 is taken "normally" and it is heavily over-exposed. Picture 5 is taken after the camera looked directly into an ultrabright white LED. The autoexposure seems to "lag" badly. Also see the difference between pictures 1 and 7. Picture 2 is taken with a sun shade on :-)
I had an idea of first finding bright spots in the luminance image by thresholding and by selecting segmented areas by size (and shape). Then the chrominance values of the found dots would be checked to see if each really is a red dot. This should have located the red laser pointer. The reason why I would not do this in RGB color space is that it would require a conversion from YUV space, and in the YUV space the brightness (luminance, Y-channel) and color (chrominances, U- and V-channels) are already separated.
I think the screen capture 4 shows the potential of my approach. You can clearly see that the laser dot is found in the thresholded image as a nice dot, and it is the only dot-like area that is red according to the chrominance view.
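The idea was never implemented, but a much simplified version of it could be sketched as below. This is a hypothetical illustration, not Doublesight code. It skips the segmentation and shape selection entirely: it just collects every pixel that is both bright in the Y plane and red in the V plane, rejects the result if too many pixels qualify, and reports the centroid. The names Yuv420pFrame, DotCandidate and find_red_dot, and all threshold values, are assumptions. The sketch also assumes planar YUV420P with even width and height, where U and V are subsampled by two in both directions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Yuv420pFrame {
    int width;
    int height;
    std::vector<unsigned char> data; // Y plane, then U plane, then V plane
};

struct DotCandidate {
    bool found;
    double x, y; // centroid in pixel coordinates
    int pixels;  // number of pixels that passed both tests
};

DotCandidate find_red_dot(const Yuv420pFrame& frame, int min_luma, int min_v, int max_pixels) {
    const int w = frame.width, h = frame.height;
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    const std::size_t luma_size = static_cast<std::size_t>(w) * h;
    const std::size_t chroma_size = luma_size / 4;
    assert(frame.data.size() == luma_size + 2 * chroma_size);
    const unsigned char* y_plane = &frame.data[0];
    const unsigned char* v_plane = y_plane + luma_size + chroma_size;

    long sum_x = 0, sum_y = 0;
    int count = 0;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            // Step 1: luminance threshold, no colour conversion needed.
            if (y_plane[y * w + x] < min_luma) continue;
            // Step 2: red shows up as a high V value; one V sample
            // covers a 2x2 block of luminance pixels.
            if (v_plane[(y / 2) * (w / 2) + x / 2] < min_v) continue;
            sum_x += x;
            sum_y += y;
            ++count;
        }
    }
    DotCandidate result = {false, 0.0, 0.0, count};
    // Crude stand-in for area size selection: a dot must be small.
    if (count == 0 || count > max_pixels) return result;
    result.found = true;
    result.x = static_cast<double>(sum_x) / count;
    result.y = static_cast<double>(sum_y) / count;
    return result;
}
```

A fuller version would label connected regions in the thresholded image and test each region's size, shape and chrominance separately, so that a second bright red object elsewhere in the view would not pull the centroid away.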
I did have trouble adjusting the cameras. I found no way to manually set the exposure time without losing all color information. That means the system cannot stand bright environments because of over-exposure, unless the cameras are fooled with a bright lamp first, or you try sun shades or other filter optics. Even then, you have to remember that these cameras are toys and you cannot expect too much from them.
Information on 3-dimensional optical measurement can be found in the Master's Thesis by Kari Jyrkinen (not yet published, contact me for more information).
Released the two source code packages.
Finishing the documentation.
The ChrominanceView got a lot better when I decided to scale the RGB-values, so that the greatest value is always 255. When I discard the Y-channel I still need some value for Y in the color conversion. I made an image (97kiB, 320x1200px) of the effect.
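The scaling can be sketched as follows. This is not the ChrominanceView code itself: the conversion coefficients (the common full-range BT.601 / JFIF ones), the constant luminance of 128, and the names Rgb and chrominance_to_rgb are all assumptions, since the values actually used are not stated here.

```cpp
#include <algorithm>
#include <cassert>

struct Rgb { unsigned char r, g, b; };

// Converts one (U, V) pair to RGB using a fixed stand-in luminance, then
// scales the result so that the greatest channel is always 255.
Rgb chrominance_to_rgb(unsigned char u, unsigned char v, double fixed_luma = 128.0) {
    assert(fixed_luma > 0.0 && fixed_luma <= 255.0);
    const double cb = u - 128.0;
    const double cr = v - 128.0;
    // Assumed full-range BT.601 conversion; negative results are clipped.
    const double r = std::max(0.0, fixed_luma + 1.402 * cr);
    const double g = std::max(0.0, fixed_luma - 0.344136 * cb - 0.714136 * cr);
    const double b = std::max(0.0, fixed_luma + 1.772 * cb);
    // With a positive luminance at least one channel is >= fixed_luma,
    // so the peak is never zero.
    const double peak = std::max(r, std::max(g, b));
    const double scale = 255.0 / peak;
    Rgb out;
    out.r = static_cast<unsigned char>(r * scale + 0.5);
    out.g = static_cast<unsigned char>(g * scale + 0.5);
    out.b = static_cast<unsigned char>(b * scale + 0.5);
    return out;
}
```

With this scaling a neutral chrominance (U = V = 128) comes out as white, and any color cast is shown at full brightness regardless of how dark the original pixel was.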
After about 3300 lines of C++ code I am finally getting to the image processing. LuminanceView (Y-channel) is already working and I can get gray scale "live" images from the two webcams. This is the first time the real intended engine structure of the program is working. Next I am going to implement ColorView (RGB-color) and ChrominanceView (UV-channels converted to RGB, constant luminance). After those, some nice luminance and chrominance histogram views so that I can start working toward the ultimate goal: locating the laser dot.
The program has a command line user interface through which I can set the camera parameters, take single shots, set it to run (in a separate thread) taking shots by itself, and load ViewProcessors like the LuminanceView. All X windows have their own threads and the UI has its own thread (the main thread, actually), so the user interfaces should be quite responsive.
If you are interested in the source code, I will make it available some day under some free licence.
I finally configured a cam at home. I have Linux 2.6.7 and the 8.x versions of the driver are out of the question. Fortunately the version 9 beta 2 seems to work fine. It's been a long time since I worked on this project, but now I'm coding again.
Yes! Even though the cam and drivers only support capturing YUV420P format images, I've managed to read the components from a raw file I produced into Matlab and visualize them. My own code can now capture images and adjust all video4linux and Philips specific parameters. Just as I thought, the camera overexposes the picture too easily. When I set the shutter speed manually I can get the laser dot clearly visible in indoor lighting.
I have finally got the webcams. I hooked one up to my workstation at work. After recompiling (and updating) the kernel I got the Linux driver working and got a live image out of the cam at 5 fps with Xawtv. After fetching the latest pwcx decompression kernel module, version 8.4, I think I also got a VGA resolution image. The camera has some problems adjusting to the light level; it tends to overexpose. I hope I can manage that somehow when I start coding my own program for it.
UPDATE: I got 15 fps with VGA resolution, not bad.
I have decided to order two Logitech QuickCam Pro 4000 webcams from Verkkokauppa.com (code 6171, price 80.90€ each). The camera has a 640x480 pixel CCD sensor and presumably good image quality. It should be supported under Linux according to the "Linux support for Philips USB webcams" page.
The project proposition has been accepted. It is task 14 on the Practical Assignment page of the course.