Category: HCI

Design interaction with Myo

11/24/2014

After spending some time thinking about how to use Myo for text input and experimenting a little, we can share some of our findings.

Strengths:

Myo does have some strengths when compared to other input devices.

1- It is minimally intrusive and can be used for mobile interaction.
2- It can sense hand poses.
3- It has an IMU with accelerometers and gyroscopes to detect arm movement.

Limitations:

Some of them are due to Myo's specific sensing technology:

1- There are only a few hand poses that Myo can recognize.
2- It seems to be sensitive to the initial calibration.
3- Hand poses can be stressful, and it is difficult to change between them (which is required for proper detection).
4- There are many false negatives (generally for the less stressful poses).

Some are shared with other 3D interface devices:

1-Live mic problem - hand poses can be activated by accident, though rarely.
2-Lack of spatial frame of reference.

Ideas:

To overcome the limitations we tried several things in our design:

1-Restrict the set of frequently used hand poses to the ones that produce fewer false negatives (wave-in and wave-out).
2-Use the initial set of recognized hand poses as primitives to obtain a larger set (by adding gesture and gyroscope data).
3-Allow for continuation of gestures. Since most gestures are stressful, allow the user to specify parameters by continuing the motion.
4-Try to use relative hand positions instead of absolute positions in space.
5-Try to use transitions between gestures that are easy to perform (some transitions are hard, e.g. between fist and open hand).
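As a rough illustration of idea 2, here is a minimal Python sketch of how a recognized pose could be combined with gyroscope data to get a larger gesture vocabulary. The pose names, the `Sample` type, the deg/s unit and the threshold are all assumptions made for this example; this is not the Myo SDK API and not the exact design we used.

```python
from dataclasses import dataclass

# Hypothetical pose labels; the real SDK identifiers may differ.
POSES = ("wave_in", "wave_out", "fist", "fingers_spread", "thumb_to_pinky")


@dataclass(frozen=True)
class Sample:
    pose: str         # pose label reported by the built-in classifier
    roll_rate: float  # gyroscope rate around the forearm axis (assumed deg/s)


def roll_direction(roll_rate, threshold=60.0):
    """Quantize the roll rate into three coarse classes."""
    if roll_rate > threshold:
        return "cw"
    if roll_rate < -threshold:
        return "ccw"
    return "still"


def compound_gesture(sample, threshold=60.0):
    """Pair the pose primitive with the coarse arm rotation."""
    return (sample.pose, roll_direction(sample.roll_rate, threshold))


# Every pose can be paired with three rotation classes.
vocabulary = {(p, d) for p in POSES for d in ("cw", "ccw", "still")}
```

With five poses and three rotation classes the vocabulary grows from 5 to 15 entries, while the user still only has to perform the original hand poses.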

We hope that these ideas are useful to others designing interaction techniques using Myo.


Myo - First Impressions

11/7/2014


In a previous blog entry I mentioned that Myo could be an interesting device for bare-hand text input. After being able to play with it for a while, I can offer my first impressions.

The armband seems pretty well designed. It has no hard edges and the material feels good to touch. It is somewhat bulky and even though it is not heavy on the arm, the aesthetics might be more appealing to men. It also comes with small clips that you can use to tighten the band if you have thin arms. There are no buttons, just a USB connector for charging and a glowing logo in one of the pods.

Every time you put on the armband you need to perform a special sync gesture. It is used to detect the arm and the orientation in which the armband is being worn. If for some reason it slips from the original position, you will need to calibrate again. This is not common though. After a successful calibration Myo will acknowledge by vibrating. It also vibrates when it is removed. This haptic feedback works very well and makes it straightforward to start using the armband without even looking at your display.

The SDK recognizes five different hand poses: palm pointing left, palm pointing right, spread fingers, fist and thumb to pinky. These poses can be combined with data from the IMU to create more complex patterns and detect movement. The API gives access to the detected pose, orientation data and the vibration motor. As we inferred, the poses are fairly independent of the arm orientation, which allows more flexibility for design.

The quality of the built-in classifier is also good for a first version. We noticed some false negatives and false positives, with the former being more common. Sometimes you need to repeat the same gesture several times before it gets acknowledged. In our limited experience the thumb-to-pinky pose seems the most difficult to recognize, which also correlates with the fact that it is the least stressful pose. The left and right hand poses are the most reliable; they are almost flawless. People seem to hold different opinions regarding which pose is the most tiresome, but in general all of them seem less fatiguing than we initially thought.

In retrospect the engineers at Thalmic made a good choice in selecting the poses, and the decision to ship with only five of them was also wise. However, as a researcher I wish I could have access to more data. This would allow me to play with new algorithms or just select poses that make more sense for my particular task. I hope they change this in the future like Leap Motion did. It does not need to provide all EMG data; just the "tension" values for each group of muscles would suffice: palm, thumb, fingers. I do not believe they have constructed their classifier in this way but I think it can be done. This seems a good tradeoff between flexibility, simplicity and battery life.

As we spend more time with the armband we might need to adapt our original ideas for text input. If the thumb-to-pinky gesture ends up being unreliable we might be tempted to remove it altogether from our design. The information from the IMU, on the other hand, may open more interesting directions for our work. We experimented with the rotation of the arm (roll), for example, and it seems pretty stable.
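For readers who want to try the same thing, here is a small Python sketch of how a roll angle can be extracted from an orientation quaternion. It assumes a unit quaternion (w, x, y, z) and that the x axis runs along the forearm; which axis actually corresponds to roll depends on the device's coordinate convention, so treat the axis choice as an assumption rather than as what the SDK defines.

```python
import math


def roll_from_quaternion(w, x, y, z):
    """Rotation around the x axis, in radians, for a unit quaternion.

    Standard quaternion-to-Euler conversion; the x axis is assumed
    (not confirmed) to run along the forearm.
    """
    return math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))


# Example: a quarter turn around the x axis.
half_angle = math.pi / 4
roll = roll_from_quaternion(math.cos(half_angle), math.sin(half_angle), 0.0, 0.0)
roll_degrees = math.degrees(roll)
```

Tracking this single angle over time is a cheap way to check how stable the roll reading is while the arm is held still.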

Summing up, even with the current limitations Myo is a very interesting device. It is ready for some cool new interaction proposals. Anyone?


The Swype and the Shark

10/10/2014


Swype is a commercial text input system originally developed by Swype Inc. and later acquired by Nuance Communications. It was first presented at TechCrunch50 in 2008. After that, several similar keyboards appeared for smartphones.

Swype (www.swype.com)

To enter a word in Swype, the user draws a continuous line over the characters without lifting the finger from the screen. The software uses an algorithm to determine the word that most closely matches the input pattern.

Since there is no need to be precise and since you do not need to lift your finger from the surface for each letter, you can type very fast. In fact, in 2010 a Swype employee gained an entry in the Guinness World Records for the fastest text message on a touchscreen mobile phone.

The technique used in Swype is similar to one presented five years earlier, at the CHI 2003 conference and later at UIST 2004. In the paper "Shorthand Writing on Stylus Keyboard", Zhai and Kristensson present SHARK, a technique that contains most of the benefits we see in Swype today.

In SHARK each word is also defined by a pattern drawn over a keyboard layout. In this work, however, the authors used the ATOMIK keyboard instead of QWERTY. ATOMIK was designed to minimize the gesture time between the keys that are used most frequently. The authors argue that gesture input for words can be better than tapping letters if you are able to design the gestures in a way that requires less visual precision. This explains why, even though it requires more travel time, it is still fast to input text with this technique. The authors, however, were targeting the use of gestures as shortcuts for the most used words. The idea is that after training a user would be able to input words without using the keyboard as a guide. They created an alphabet for the 100 most frequent words. Words that were not in this group used normal tapping input.

ATOMIK keyboard layout.

In SHARK words are recognized using dynamic programming to compare the distance between the gesture and a template. The most similar word is returned. If only the direction and shape of the gestures are considered, some ambiguity still remains. The authors proposed that the initial or center position of the gesture could be used to solve this problem. In SHARK2 they improved upon the previous system by adding a visual trace to the gestures (similar to what Swype does) and by ensuring that every word could be entered by the same method. This more advanced system combined information from shape and location classifiers as well as a language model to increase the discrimination power. The inclusion of visually guided input works very well as a scaffolding technique. Even after using Swype for a while, I don't think I was ever able to draw a word from memory alone.
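To make the template matching idea concrete, here is a minimal Python sketch that uses dynamic time warping, one common dynamic programming distance, to compare a drawn gesture against word templates and return the closest word. This is an illustration and not necessarily the exact matching used in SHARK; the template coordinates are made-up key centers on an imaginary layout.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two lists of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a point in a
                                 cost[i][j - 1],      # skip a point in b
                                 cost[i - 1][j - 1])  # match both points
    return cost[n][m]


def recognize(gesture, templates):
    """Return the word whose template is closest to the gesture."""
    return min(templates, key=lambda word: dtw_distance(gesture, templates[word]))


# Hypothetical templates: the key centers each word passes through.
templates = {
    "to": [(4.0, 0.0), (8.0, 0.0)],
    "it": [(7.0, 0.0), (4.0, 0.0)],
}
gesture = [(4.2, 0.1), (6.0, 0.0), (7.9, 0.1)]
best = recognize(gesture, templates)
```

Because the two templates cover similar keys in opposite directions, the direction of the stroke is what separates them here; a shape-only comparison that ignored position would still run into the ambiguity described above.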


Designing a wearable input system

9/30/2014


Wearable devices will change the way we use and experience technology. Many challenges still need to be solved such as better integration with clothing, better battery lifetime and many questions regarding interaction. These devices can have very small displays or even no display at all. When this happens, common techniques such as keyboards and touch screens are not so useful anymore.

Google Glass in a prescription frame.

Apple Watch.

Today many wearables rely on a smartphone companion for internet connection and some advanced tasks. However, many understand that decoupling from mobile devices is a necessary step to achieve the vision of a truly omnipresent and invisible technology. Connectivity is already on the way in devices like the Samsung Gear S, which takes its own SIM card for direct 3G access. A good interaction technique would greatly impact the use of these devices. Apple has recognized that we need to come up with different ideas, so it recently proposed using the watch crown as an interaction device.

The main problem is that we usually want to increase the expressiveness of input, not narrow it. For this reason, over the years we have supplemented keyboards with mice, tablets and touch screens. By making smaller devices we inadvertently tend to scale down the interaction possibilities too. Two exceptions to this are voice and gestural input: since they are not physically attached to the input target they can maintain their power regardless of the device size. Voice recognition, unfortunately, has some obvious disadvantages if you are in a noisy environment or don't want to bother people nearby.

Together with Ayshwarya, I have been studying the possibility of developing a gesture-based text input technique as part of our work for a Natural User Interface class at Virginia Tech. We decided to go with a selection technique for letters instead of drawing/handwriting. We believe that selection has the potential to be much faster, since you can create methods to choose a letter in constant time. In fact, with a few exceptions, most current input systems go this way. Another trend is to use some kind of prediction or correction algorithm to minimize mistakes and effort during input.

For selection, we noticed that three main actions are required:
1-Highlight a specific letter
2-Select the letter
3-Finish the word

Touch-based techniques implicitly highlight letters, since users can tell from their hand position and tactile perception when no letter has been selected. Other input devices, however, may require an explicitly designed disambiguation mechanism (such as a button) or use a technique that combines both highlighting and selection. Joysticks are a good example of the first group. They generally use a specific button to select a highlighted item. On the other hand, Swype unifies the two actions: the drawing of the curve is used to determine probable letters and then words. Another classic technique that combines highlighting and selection is Dasher. In this technique the user continuously steers a cursor, which runs over letters, selecting them. Finally, the purpose of the last action is to enable the user to end the word input early, based on suggestions from an autocomplete system.
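The three actions can be sketched as a tiny state machine. The Python class below is only an illustration of the highlight / select / finish split with an explicit select step (the joystick-style case); the class name, the linear alphabet and the `suggestion` argument are assumptions made for the example, not our final design.

```python
class SelectionInput:
    """Minimal model of letter selection with three explicit actions."""

    def __init__(self, letters="abcdefghijklmnopqrstuvwxyz"):
        self.letters = letters
        self.index = 0      # currently highlighted letter
        self.current = []   # letters selected for the word in progress
        self.words = []     # finished words

    def highlight(self, step):
        """Action 1: move the highlight by `step` positions (wraps around)."""
        self.index = (self.index + step) % len(self.letters)
        return self.letters[self.index]

    def select(self):
        """Action 2: commit the highlighted letter to the current word."""
        self.current.append(self.letters[self.index])

    def finish(self, suggestion=None):
        """Action 3: end the word, optionally accepting an autocomplete suggestion."""
        word = suggestion if suggestion is not None else "".join(self.current)
        self.words.append(word)
        self.current = []
        self.index = 0
        return word


session = SelectionInput()
session.highlight(7)   # moves to "h"
session.select()
session.highlight(1)   # moves to "i"
session.select()
first_word = session.finish()

session.highlight(7)   # "h" again, since the highlight resets after a word
session.select()
second_word = session.finish(suggestion="hello")
```

A technique like Swype or Dasher would merge `highlight` and `select` into a single continuous action, while `finish` stays as the shortcut that lets a suggestion replace the remaining letters.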

Regarding the input device, our first idea was to use Myo. This would theoretically allow users to input information using gestures and hand poses without the support of a desk or a computer, something closer to the wearable ideal.

Myo, from Thalmic Labs.

Myo works by analyzing electrical signals captured from the muscles of your arm, a process called electromyography. The signals are processed by a classifier that can distinguish 5 different hand poses. The armband also contains an IMU (gyroscope, accelerometer and magnetometer) that can be used to track the orientation of the arm relative to the ground and its acceleration.

Next steps involve generating some options for input mapping and some way of evaluating them.


Text Input II

9/18/2014


Let's talk more about text input. What do the existing methods have in common?
My first try culminated in the categorization of text input methods into four groups:

1-Physical keyboards
2-Drawing methods (Graffiti, gestures for drawing)
3-Virtual keyboards (Swype, virtual pointing)
4-Voice recognition

This classification is far from being complete and definitive. Some techniques can combine more than one input method or can be considered borderline. For example, you can activate virtual keys by using gestures, your eyes, touchscreens or hand poses. However, this classification is enough to provide a starting point for analysis.

Keyboards are good for several reasons: you can use your 10 fingers, you have a clear confirmation when a letter has been input and you can use your muscle memory to type without looking at the keyboard. Drawing techniques such as Graffiti are good because they can leverage your knowledge of writing and the letter shapes to memorize a large number of commands easily. Virtual keyboards use the fact that they are not real to improve input by changing the way you activate the keys. Finally, voice recognition uses the association between phonemes and written text to allow you to input text.

Each technique has different advantages and weaknesses. Graffiti may be slower than the keyboard but it is very fast to learn and can be used in constrained spaces. A good thing to do before going further is to try to establish some guidelines and principles that we can use to guide decisions later. After thinking for a while, I came up with the following dos and don'ts for text input methods:

Try to use something that is easy to recognize (e.g. keyboard layout, letter design, phonemes)
Avoid too much visual analysis/search
Use motor memory if possible
Avoid absolute hand positioning
Use both hands if possible
Avoid having moving things when using eye tracking
Avoid drawing entire letters
Avoid moving the hands/arms too much
Use redundancy if possible
Avoid having too many poses or complicated gestures for each letter
Avoid having more than one step to input a letter

Ideally we would want something that is fast to learn and use. Something as intuitive as drawing a letter and as accurate and fast as the keyboard. When we consider the application in virtual environments, where the user does not have a physical keyboard, further restrictions apply.

Many interesting ideas are ruled out from the beginning because they simply add more complexity to an existing input system. If you are selecting a letter, anything more than just pointing at it will not present a real gain. The only exception would be just looking at it instead of pointing with a device or your hand.

If we want a method that is easy to learn we are pretty much stuck with using a commonly known letter layout (QWERTY or alphabetical), speech or drawing.


Thinking about Text Input

9/12/2014


Even with the advent of the GUI, touchscreens and voice recognition, the basic mechanism for text input has remained mostly unchanged for more than 100 years. The keyboard and the QWERTY layout are still the most widely used method and still the fastest one.

The QWERTY layout was developed in 1868 to be used in railroad ticket typewriting. After that, many changes were made to the layout to try to improve speed, the most famous being Dvorak. Either way, typing is quite fast. The world record is 216 words per minute, achieved on an electric typewriter in 1946. The average typist is able to achieve 40 words per minute. Check the infographic and measure your own speed.

Mobile devices have reduced space to house a full keyboard, and virtual ones also lack haptic feedback. This makes typing more challenging. To cope with these restrictions, most mobile keyboards utilize some kind of input prediction to correct half-typed or mistyped words. This is an ongoing competition with several contenders such as Swype, Fleksy, SwiftKey and so on. The great majority, however, are still variations of the original QWERTY keyboard.

The QWERTY layout on a mechanical typewriter.

The latest high-tech variation.

When thinking about text input in immersive environments the situation gets even more complicated. Besides the lack of haptic feedback for keys there is also a loss of the reference frame. If you want to keep your hands away from a solid surface you eventually drift away from the keyboard or cannot keep striking the same positions. I believe that a good text input method will be necessary to increase the range of applications using gestures and VR.

Researchers have come up with a lot of different ideas for more natural or efficient text input in these conditions. However, none seems to have been established as a good solution for now, which is unfortunate.

For more information see: input methods


Interfaces in SciFi Movies

9/4/2014


Watching computer appearances in movies is always interesting.
It is fun to imagine if computer technology would really work in the way presented by Hollywood. Sometimes they manage to get computer scientists very puzzled though.

Click on the image on the side to read three plausible explanations for a question whose answer has eluded many for years (courtesy of Wolf Gnards).

When done right, however, movies with futuristic elements are a great opportunity to see new ideas and concepts from very skilled designers. Besides incredible motion graphics, most sci-fi movies routinely present some kind of advanced display or interface. One recurrent idea is the combination of holograms and 3D interaction with gestures. You can see it in several movies such as Tron, Iron Man and my favorite, District 9.

Iron Man 2 interface and holograms example from Bo Opfer on Vimeo.

DISTRICT 9 // Image Engine from Goldtooth Creative on Vimeo.

These sorts of motion control interfaces are well suited to movies, but what do they mean for real-world designs? Leap Motion has created a nice two-part video and post inspired by an analysis written by Noessel for Smashing Magazine. The video points out that even though movies seldom portray reality, they do create expectations in users for:

Feelings of power and mastery - Characters in movies are generally heroes who are either technology experts or possess and control some incredible technology. These impressions are usually conveyed by complex interfaces, wireframes and by the acting itself.
Immersion and Flow - Users expect to be immersed in the world or the activity being executed. If the interface detracts from the goal experience it will not be effective. They expect the interaction to be fun and rewarding.

These are nice guidelines but perhaps they are too general to help us with the specific goal of building better gestural interfaces. As a user you always want to be in control, not the opposite. It is also complex to require both immersion and flow. Not all applications provide immersion, and flow is hardly a property of the interface alone.

In Noessel's article, however, he observes that gestures in movies tend to fall into 7 different types of actions:

Wave to activate
Push to move
Turn to rotate
Swipe to dismiss
Point or touch to select
Extend the hand to shoot
Pinch and spread to scale

These actions are almost a 1:1 mapping to real-world gestures, so they tend to feel more intuitive and even advanced (in the sense that the system understands what you mean). This sets some guidelines on what users may expect from those gestures.

The most interesting point made by Noessel is really about the role of language in interfaces, but I will discuss it later.

If you are interested, NoteLoop has an interesting blog with a section dedicated to Movie UI.


Natural User Interfaces

8/28/2014


Natural User Interfaces (NUI) still lack a formal definition. The term can appear as a synonym for intuitive, easy or gesture-based interaction. So rather than trying to define one, I will list some characteristics that are usually associated with them:

They leverage real-world knowledge to achieve tasks
They are intuitive to learn
They offer direct manipulation
They are often invisible to the user

If you are new to the term, this is a cool video that has some examples and attempts to define some concepts:

In video games, the use of more natural interfaces helped people intimidated by technology to find the courage to jump in and play. Even if most hard-core gamers still prefer classic game controllers, these new interaction techniques really helped to expand the market for videogame consoles. Bowling with the Wiimote seems more natural and simple than using a combination of buttons and analog sticks. The Nintendo Wii, PlayStation Move and Microsoft Kinect are successful technologies that opened the path to the development of more natural interactions for videogames.

It is hard to talk about an absolutely natural interface, but the previous example shows that one interface can feel more natural than another. The extent of this feeling depends not only on the technology but also on the activity being performed and the actual user of the interface. This further complicates the issue, since NUI relies on previous background knowledge about how things are supposed to work. These assumptions are ultimately related to the culture and previous experience of each user.

With a few exceptions, most natural user interfaces are a cooler but not so efficient way to perform a task. If we look closely we can see that in a few examples some of the actions performed are far from being natural, even in the coolest complex hand/head/spirit tracking setup. In fact, a natural interaction does not require a complex setup with multiple degrees of freedom. Its quality emerges from the perfect coupling between the hardware, the software, the task and the user.

More on that later. Or not.


Human Computer Interaction

8/27/2014


I would like to talk a little about my new area of study. Coming from computer graphics research, it is going to be very fun and challenging to learn the diversity of techniques and the subtleties of the field.

Human Computer Interaction (HCI) deals with the interface between computers and people. It seeks to understand how humans behave when performing actions and interpreting computer output. At the same time it tries to develop new technologies or paradigms that can be useful to make this communication better.

The difficulties arise because we do not understand well how our brain and perception work. What are their limits and capabilities? Computers operate in a way that is quite different from us, and even the most powerful computer pales in comparison to small prodigies of the human mind. However, computers are very useful in many specific tasks and that is the motivation behind most research in computing.

Researchers in HCI generally employ a multidisciplinary approach when trying to devise better interaction models. One can get support from disciplines such as cognitive science, ergonomics, graphic design, engineering, ethnography, etc. Designs are also based on different metaphors and ideas. Windowed interfaces, for example, are available on most computers. They are based on discrete visualization areas that you can move around, open and close, each one associated with a specific document or piece of software. Windows like these only exist on the display and do not really represent anything. They are just abstractions that work really well. Depending on the display, you can use a mouse, keyboard or touch screen to interact.

Tangible User Interfaces, on the other hand, try to associate information and actions with physical entities. In this way information can have a presence in the real world: you can grab it, move it and combine it with your hands. This mapping need not be static and may change depending on the context. There are a couple of really interesting experiments with this idea.

Reactable tangible interface (photo by Daniel Williams)

Interfaces that are instinctive and employ mappings that are well fitted to the task or situation are called Natural User Interfaces. This approach generally makes use of more advanced techniques such as gesture recognition and head tracking to allow direct manipulation of virtual elements. This is a way to enable the use of our real world knowledge and expectations when interacting with the computer.

Tom Cruise operating a "natural user interface" in the movie Minority Report.

Human computer interaction is an exciting field that is constantly evolving through new theoretical perspectives, technology developments and a better understanding of the human being. Stay tuned!


Wallace Lages