This book is about programming novel computer interfaces by
capturing images from a PC's webcam. The idea is to augment
(perhaps even replace) the familiar keyboard and mouse with input
derived from pictures of the user's movements, facial features, hand and
finger gestures, and visual tags such as barcodes.
Vision-based user interfaces (VBIs) have been a hot research topic
for decades, but only in the last five years or so have the
hardware and software become cheap enough, fast enough, and
feature-rich enough for the technology to be practical as well. For
instance, it's a rare computing device that doesn't now come with
a color megapixel camera capable of recording 30 images
(frames) per second.
The software side of VBIs is all about implementing
sophisticated computer vision algorithms that are fast enough for
(near) real-time processing of images. That problem was addressed
to a great extent by the release of version 1 of the OpenCV computer vision
library (http://opencv.org/) in 2006. Subsequent
releases have seen the library ported to a variety of machines,
programming languages, and OSes (Windows, Linux, OS X, Android,
iOS, and more).
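To make "fast enough" concrete: a camera delivering 30 frames per second leaves roughly 33 milliseconds to process each image before the next one arrives. The sketch below is only an illustration of that arithmetic in plain Java. The class name FrameBudget and its methods are hypothetical, and are not part of OpenCV, JavaCV, or the book's code.

```java
// Hypothetical helper (not from OpenCV/JavaCV or the book's code):
// works out how much time a vision algorithm has per webcam frame.
public class FrameBudget {

    // Milliseconds available to process one frame at the given frame rate.
    public static double millisPerFrame(double framesPerSecond) {
        if (framesPerSecond <= 0) {
            throw new IllegalArgumentException("frame rate must be positive");
        }
        return 1000.0 / framesPerSecond;
    }

    // True if an algorithm taking processingMillis per image can keep up
    // with the camera without dropping frames.
    public static boolean keepsUp(double processingMillis, double framesPerSecond) {
        return processingMillis <= millisPerFrame(framesPerSecond);
    }

    public static void main(String[] args) {
        double fps = 30.0;  // the webcam frame rate mentioned in the text
        System.out.println("Budget per frame (ms): " + millisPerFrame(fps));
        System.out.println("20 ms algorithm keeps up: " + keepsUp(20.0, fps));
        System.out.println("50 ms algorithm keeps up: " + keepsUp(50.0, fps));
    }
}
```

An algorithm that overruns this budget can still be usable if it skips frames, which is what "(near) real-time" allows for.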
Specifically, this book is about how to write VBI programs
using a Java binding of OpenCV called JavaCV
(https://code.google.com/p/javacv/) and an ordinary
laptop webcam. The chapters are divided into seven
parts:
- an introduction to VBIs (chapter 1)
- taking (snapping) pictures with a webcam (chapters 2 and
3)
- general computer vision techniques, which appear often in
later examples (chapters 4 and 5)
- VBIs that utilize information obtained from
the user's hands and/or fingers (chapters 6-8)
- VBIs that utilize the face (chapters 9-12)
- VBIs using different kinds of visual tags
(chapters 13-15, and also chapter 7)
- VBIs employing more than one webcam (chapters 16 and
17)
Early (sometimes very early) draft versions of the book's
chapters can be downloaded from here (see the links below).
I'll also be adding new chapters here that
don't appear in the book.
If you're looking for Killer Game Programming in Java
then it's here.
What this Book is not About
It's useful to list the things that this book does not
do.
- This book is not an introduction to Java. I'm going
to assume that you've already done an introductory course on
Java (or something similar), and so understand classes, objects, inheritance,
exception handling, basic threads, and graphics. A good
introductory textbook on Java is Thinking in Java, by Bruce Eckel.
It has won awards, and can be downloaded for free from
http://www.mindview.net/Books/TIJ/.
However, I will be explaining
more advanced topics such as Java Sound, networking, and Java
3D.
- This book is not a theoretical introduction to computer
vision or human computer interfaces. It's driven by
programming examples focused on specific vision-based user interface
problems. Of course, when a particular technique (e.g.
eigenfaces) comes up, I explain it, but without relying on
heavy-duty mathematics.
There are many excellent academic texts on computer vision.
One that I've found useful is:
Richard Szeliski, Computer Vision: Algorithms and
Applications, Springer, 2010;
http://szeliski.org/Book/
- This book doesn't cover every aspect of OpenCV,
which is enormous, and getting bigger all the time. I cover a
lot of topics, but there are
always more things to learn. Many times during this book, I'll
refer to the OpenCV website (http://opencv.org/) for more
information, and to the standard text on OpenCV:
Gary Bradski and Adrian Kaehler, Learning OpenCV:
Computer Vision with the OpenCV Library, O'Reilly Media,
2008; http://shop.oreilly.com/product/9780596516130.do
- This isn't a book about Android programming. All my
code runs on MS Windows with a laptop/PC webcam (tested
on XP and Windows 7, but I couldn't face installing Windows 8
☺). However, OpenCV, and most of the other libraries I use, are
supported across multiple platforms, such as Mac OS X, Linux,
and Android. In all cases, I use the Java bindings of the
libraries, but most have interfaces for several other programming
languages.
- You won't find page after page of code that you
have to type in. All the code examples are available
online, accessible from this page.
I only discuss the code that
contains interesting computer vision algorithms, or more
advanced Java programming (e.g. fancy uses of
concurrency).
Dr. Andrew Davison
E-mail: ad@coe.psu.ac.th