Motherboard Forums


Gesture Recognition 1 -- The Back Story

Don Y
      11-26-2011, 01:30 PM

[APOLOGIES in advance for the length of the post -- but there seems
to be a lot of ambiguity in this field, so I try to define the terms
that I use. I'll post this in two parts: this first part outlining
the problem domain and the second describing the implementation]

[[ *PLEASE* elide judiciously when replying. None of us need
to read this entire missive again just because you're too
lazy to edit it for your reply :< ]]

I'm tweaking a gesture recognizer that I wrote and still not
happy with the results.

In this context, "gesture recognizer" is akin to "pen recognizer"
(though not entirely). Specifically, the gesture is (currently)
"issued" by the fingertip (on a single-point touch pad) without
the use of a stylus, etc. In the future, this may migrate to
a camera- or accelerometer-based recognizer.

It's an "on-line" recognizer so it has access to the temporal
aspects of the gesture (vs. an off-line recognizer that only
sees a static "afterimage"). I.e., I can "watch" the gesture
as it is being "issued".
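To make "on-line" concrete, here's a minimal sketch in C of the kind of
timestamped sample stream such a recognizer consumes (the struct and
function names here are illustrative, not my actual code):

```c
#include <stddef.h>

/* One timestamped touch sample.  Because the recognizer sees these
 * as they arrive, stroke speed and direction changes are observable
 * -- information an off-line "afterimage" recognizer never gets. */
typedef struct {
    int x, y;            /* pad coordinates */
    unsigned long t_ms;  /* time since stroke start, in milliseconds */
} sample_t;

/* A stroke accumulates samples from finger-down to lift-off. */
typedef struct {
    sample_t buf[256];
    size_t n;
} stroke_t;

/* Called once per sample while the finger is down; the recognizer
 * can examine the partial stroke after every call. */
static void stroke_add(stroke_t *s, int x, int y, unsigned long t_ms)
{
    if (s->n < sizeof s->buf / sizeof s->buf[0]) {
        s->buf[s->n].x = x;
        s->buf[s->n].y = y;
        s->buf[s->n].t_ms = t_ms;
        s->n++;
    }
}
```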

The key point(s) to take away are:
- no need for explicit "training" (user-independent)
- rotation/scale invariant (subject to the gesture set)
- "real-time" interaction
- it's not (conventional) "writing" that is taking place
- a single point is traced through space
- no "afterimages" of gestures are present (i.e., no "ink")

The last point bears further emphasis: the user has no
obvious means of reviewing the gesture issued. E.g., with
a pen interface, you can "see" what you "wrote". More
importantly, you can see what the MACHINE thinks you wrote!
(i.e., if you *think* you drew an 'O' but the resulting
"image" looks more like a 'C', then you know the machine
didn't "see" what you intended it to see! So, if it ends
up doing something "unexpected", you know *why*!)

Finally, the interface is used to issue *commands*, not
"data entry" (though this is a small fib). As such, the
user has no opportunity to confirm/deny the results of the
recognizer before they are acted upon. E.g., in a pen
interface, if something is misrecognized, *you* can
detect that and discard/abort the entry. By contrast,
here, the entry is *acted upon* as soon as it is recognized!

In *practice*, to further clarify these issues, my prototype
is a small touchpad (~3" dia) mounted on the *lapel* of a
jacket. As such, the user can't "see" the gesture he is
issuing. Nor any "afterimage" (had there been "ink" involved).
Nor would it be a particularly convenient way of "writing".

OTOH, it only requires *one* hand for operation *and* leaves
your eyes free for other activities (IMO, two of the stupidest
aspects of Apple's products are that they require *two* hands
*and* two EYES to operate! Try using one while running down
the street or SAFELY driving a car! :> )

Recognizing a small set of gestures is relatively easy. But,
as the gesture set increases, the potential for ambiguities
quickly increases.

I've designed my gestures to try to minimize this. And, the
user interface has been designed with awareness of the
gesture input mechanism and its needs/constraints. For
example, the UI constrains the range of valid inputs at
any given time so that the gesture recognizer need not
be required to recognize the entire range of "potential"
gestures at all times. (This is also a profound efficiency
hack.) It also tries to avoid having similar gestures in the
same input set (e.g., 'O' vs '0') to increase the "distance"
between candidates. (Eventually, I would like to provide
a run-time mechanism that lets the application evaluate
the "orthogonality" of the input set on-the-fly.) The emphasis
in this approach is to dynamically trade complexity for
recognition accuracy.

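The UI/recognizer contract above might look something like this in C
(score() and the scalar "feature" are stand-ins of my own invention --
a real matcher compares whole point traces, not one number):

```c
#include <float.h>
#include <stddef.h>

typedef struct {
    const char *name;
    double feature;   /* stand-in for a real gesture template */
} gesture_t;

/* Stand-in matcher: distance between the template's feature and the
 * stroke's.  Lower is better. */
static double score(const gesture_t *g, double stroke_feature)
{
    double d = g->feature - stroke_feature;
    return d < 0 ? -d : d;
}

/* Classify against only the gestures the UI says are valid *now*.
 * Shrinking this active set is both the efficiency hack and the
 * ambiguity reducer: excluded look-alikes can't be misrecognized. */
static const gesture_t *recognize(const gesture_t **active,
                                  size_t n, double stroke_feature)
{
    const gesture_t *best = NULL;
    double best_d = DBL_MAX;
    size_t i;

    for (i = 0; i < n; i++) {
        double d = score(active[i], stroke_feature);
        if (d < best_d) { best_d = d; best = active[i]; }
    }
    return best;
}
```

The run-time "orthogonality" check alluded to above falls out of the
same machinery: score every active template against every other and
flag pairs whose mutual distance is too small.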
For example, when used for data entry (the "fib" alluded to
above), the size of the gesture set increases dramatically.
E.g., even simple numeric entry requires a dozen "extra"
gestures! OTOH, data entry tasks tend to be more "focused"
than command oriented activities -- so, the user can be
expected to be more careful in issuing those gestures.
Also, data entry has an expectation of "review" prior
to acceptance -- unlike "commands" that you don't want to
constantly be prompting the user for confirmation:
"I have detected the EMERGENCY STOP gesture. Do you
really mean to effect an emergency stop? <crash> Ooops!
Never mind..."

As an (insane) example of the potential application for
such an interface, one *should* be able to drive a *car*
using it!

[*Think* about the consequences of that. You surely
don't want to be requiring confirmation of every issued
gesture -- the interface would be *way* too clumsy
AND sluggish! You don't want the "driver" watching a
display for feedback to see what the recognizer's
decisions are (or, if it might have "missed" part of a
gesture). And, you surely don't want the recognizer to
misrecognize "turn left" for "come to a complete stop".
Yet, if the driver can spare the attention, he should
be able to "enter" a speed setting -- "45" -- for the
cruise control; or, dial a radio station; etc.]

Of course, the other big advantage of such an interface
is that it can be "wide" (support lots of virtual buttons)
as well as configurable -- in a small size/cost.
