
Gesture Recognition 1 -- The Back Story

Discussion in 'Embedded' started by Don Y, Nov 26, 2011.

  1. Don Y

    Don Y Guest


    [APOLOGIES in advance for the length of the post -- but there seems
    to be a lot of ambiguity in this field so I try to define terms
    that I use. I'll post this in two parts: this first part outlining
    the problem domain and the second describing the implementation]

    [[ *PLEASE* elide judiciously when replying. None of us need
    to read this entire missive again just because you're too
    lazy to edit it for your reply :< ]]

    I'm tweaking a gesture recognizer that I wrote and still not
    happy with the results.

    In this context, "gesture recognizer" is akin to "pen recognizer"
    (though not entirely). Specifically, the gesture is (currently)
    "issued" by the fingertip (on a single-point touch pad) without
    the use of a stylus, etc. In the future, this may migrate to
    a camera or accelerometer based recognizer.

    It's an "on-line" recognizer so it has access to the temporal
    aspects of the gesture (vs. an off-line recognizer that only
    sees a static "afterimage"). I.e., I can "watch" the gesture
    as it is being "issued".
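
    As a concrete sketch of what the recognizer "sees" (all names
    here are invented for illustration, not from any real API): a
    stream of timestamped samples, delivered as the finger moves,
    from which temporal features like speed can be derived:

        #include <math.h>
        #include <stdint.h>

        /* One timestamped touch sample, as delivered by the pad. */
        typedef struct {
            int16_t  x, y;      /* pad coordinates                  */
            uint32_t t_ms;      /* sample time, in milliseconds     */
            uint8_t  touching;  /* nonzero while the finger is down */
        } sample_t;

        /* Called per-sample, *while* the gesture is being issued. */
        void recognizer_feed(const sample_t *s);

        /* Temporal data an off-line recognizer never sees, e.g.
           instantaneous speed between consecutive samples:        */
        float speed(const sample_t *a, const sample_t *b)
        {
            float dx = b->x - a->x, dy = b->y - a->y;
            float dt = (b->t_ms - a->t_ms) / 1000.0f;
            return (dt > 0) ? sqrtf(dx * dx + dy * dy) / dt : 0.0f;
        }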

    The key point(s) to take away are:
    - no need for explicit "training" (user-independent)
    - rotation/scale invariant (subject to the gesture set;
      see the normalization sketch below)
    - "real-time" interaction
    - it's not (conventional) "writing" that is taking place
    - a single point is traced through space
    - no "afterimages" of gestures are present (i.e., no "ink")

    The last point bears further emphasis: the user has no
    obvious means of reviewing the gesture issued. E.g., with
    a pen interface, you can "see" what you "wrote". More
    importantly, you can see what the MACHINE thinks you wrote!
    (i.e., if you *think* you drew an 'O' but the resulting
    "image" looks more like a 'C', then you know the machine
    didn't "see" what you intended it to see! So, if it ends
    up doing something "unexpected", you know *why*!)

    Finally, the interface is used to issue *commands*, not
    "data entry" (though this is a small fib). As such, the
    user has no opportunity to confirm/deny the results of the
    recognizer before they are acted upon. E.g., in a pen
    interface, if something is misrecognized, *you* can
    detect that and discard/abort the entry. By contrast,
    here, the entry is *acted upon* as soon as it is recognized!
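
    [Sketch of one way to live with "acted upon as soon as it
    is recognized" -- a hypothetical accept/reject policy, value
    invented: only act when the best candidate beats the runner-up
    by a healthy margin; otherwise silently discard the gesture
    rather than risk acting on a misrecognition:]

        #define ACCEPT_MARGIN 0.20f  /* invented value; tune per set */

        /* Scores assumed normalized to [0,1], higher == better.    */
        int should_act(float best, float runner_up)
        {
            return (best - runner_up) >= ACCEPT_MARGIN;
        }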

    In *practice*, to further clarify these issues, my prototype
    is a small touchpad (~3" dia) mounted on the *lapel* of a
    jacket. As such, the user can't "see" the gesture he is
    issuing. Nor any "afterimage" (had their been "ink" involved).
    Nor would it be a particularly convenient way of "writing".

    OTOH, it only requires *one* hand for operation *and* leaves
    your eyes free for other activities (IMO, two of the stupidest
    aspects of Apple's products are that they require *two* hands
    *and* two EYES to operate! Try using one while running down
    the street or SAFELY driving a car! :> )

    Recognizing a small set of gestures is relatively easy. But,
    as the gesture set grows, the potential for ambiguities
    quickly increases.

    I've designed my gestures to try to minimize this. And, the
    user interface has been designed with awareness of the
    gesture input mechanism and its needs/constraints. For
    example, the UI constrains the range of valid inputs at
    any given time so that the gesture recognizer need not
    be required to recognize the entire range of "potential"
    gestures at all times. (This is also a profound efficiency
    hack.) It also tries to avoid having similar gestures in the
    same input set (e.g., 'O' vs. '0') to increase the "distance"
    between candidates. (Eventually, I would like to provide
    a run-time mechanism that lets the application evaluate
    the "orthogonality" of the input set on-the-fly). The emphasis
    in this approach is to dynamically trade complexity for
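
    [To make that concrete (names invented): the UI hands the
    recognizer a mask of currently-legal gestures, so anything
    out of context is never even scored:]

        #include <stdint.h>

        #define MAX_GESTURES 32

        extern float score_gesture(int id);  /* per-gesture matcher */

        static uint32_t active_set;  /* bit i set => gesture i legal */

        /* UI narrows the candidate set as the context changes.     */
        void set_active_gestures(uint32_t mask) { active_set = mask; }

        int classify(void)
        {
            int   id, best = -1;
            float s, best_score = 0.0f;

            for (id = 0; id < MAX_GESTURES; id++) {
                if (!(active_set & (1u << id)))
                    continue;        /* out of context: never scored */
                s = score_gesture(id);
                if (s > best_score) { best_score = s; best = id; }
            }
            return best;             /* -1: nothing recognized */
        }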

    For example, when used for data entry (the "fib" alluded to
    above), the size of the gesture set increases dramatically.
    E.g., even simple numeric entry requires a dozen "extra"
    gestures! OTOH, data entry tasks tend to be more "focused"
    than command oriented activities -- so, the user can be
    expected to be more careful in issuing those gestures.
    Also, data entry has an expectation of "review" prior
    to acceptance -- unlike "commands", where you don't want
    to be constantly prompting the user for confirmation:
    "I have detected the EMERGENCY STOP gesture. Do you
    really mean to effect an emergency stop? <crash> Ooops!
    Never mind..."

    As an (insane) example of the potential application for
    such an interface, one *should* be able to drive a *car*
    using it!

    [*Think* about the consequences of that. You surely
    don't want to be requiring confirmation of every issued
    gesture -- the interface would be *way* too clumsy
    AND sluggish! You don't want the "driver" watching a
    display for feedback to see what the recognizer's
    decisions are (or, if it might have "missed" part of a
    gesture). And, you surely don't want the recognizer to
    misrecognize "turn left" for "come to a complete stop".
    Yet, if the driver can spare the attention, he should
    be able to "enter" a speed setting -- "45" -- for the
    cruise control; or, dial a radio station; etc.]

    Of course, the other big advantage of such an interface
    is that it can be "wide" (support lots of virtual buttons)
    as well as configurable -- in a small size/cost.
    Don Y, Nov 26, 2011
