Drawing invisible boundaries in conversational interfaces

December 06, 2017 by Eugene Wei

One of the things anyone who has worked on textual conversation interfaces, like chatbots, will tell you is that the challenge is dealing with the long tail of crazy things people will type. People love to abuse chatbots. Something about text-based conversation UIs invites Turing tests. Every game player remembers the moment they first abandoned their assigned mission in Grand Theft Auto to start driving around the city crashing into cars, running over pedestrians, just to exercise their freedom and explore just what happens when they escape the plot mechanic tree.

However, this type of user roaming or trolling happens much less with voice interfaces. Sure, the first time a user tries Siri or Alexa or whatever Google's voice assistant is called (it really needs a name, IMO, to avoid inheriting everything the word "Google" stands for), they may ask something ridiculous or snarky. However, that type of rogue input tends to trail off quickly, whereas it doesn't in textual conversation UIs.

I suspect some form of the uncanny valley and blame the affordances of text interfaces. Most text conversation UIs are visually indistinguishable from the messaging UIs used to communicate primarily with other human beings. Thus the interface invites the user to probe the chatbot's intelligence boundaries. Unfortunately, the seamless polish of the UI isn't matched by the capabilities of chatbots today, most of which are just dumb trees.

On the other hand, none of the voice assistants to date sounds close to replicating the natural way a human speaks. These voice assistants may have more human timbre, but the stiff elocution, the mispronunciations, the frequent mistakes in comprehension, all quickly inform the user that what they are dealing with is something of quite limited intelligence. The affordances draw palpable, if invisible, boundaries in the user's mind, and they quickly realize the low ROI on trying anything other than what is likely to be in the hard-coded response tree. In fact, I'd argue that the small jokes that these UIs insert, like answering random questions such as "what is the meaning of life?", may actually set these assistants up to disappoint people even more by encouraging more such questions the assistant isn't ready to answer (I found it amusing when Alexa answered my question, "Is Jon Snow dead?" two seasons ago, but then was disappointed when it still had the same abandoned answer a season later, after the question had already been answered by the program months ago).

The same invisible boundaries work immediately when speaking to one of those automated voice customer service menus. You immediately know to speak to these as if you're addressing an idiot who is also hard of hearing, and the goal is to complete the interaction as quickly as possible, or to divert to a human customer service rep at the earliest possible moment.

[I read on Twitter that one shortcut to get to a human when speaking to an automated voice response system is to curse, that the use of profanity is often a built-in trigger to turn you over to an operator. This is both an amusing and clever design but also feels like some odd admission of guilt on the part of the system designer.]
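
As a rough illustration of that profanity shortcut, here is a minimal sketch of a hard-coded response tree with an escape hatch. The word list, the canned replies, and the function name `route_utterance` are all hypothetical; no real voice response system is being described.

```python
import re

# Hypothetical trigger words and canned replies, for illustration only.
ESCALATION_WORDS = {"damn", "hell", "crap"}
CANNED_REPLIES = {
    "balance": "Your balance is available in the account menu.",
    "hours": "We are open 9 to 5, Monday through Friday.",
}
FALLBACK = "Sorry, I didn't catch that."


def route_utterance(utterance: str) -> str:
    """Return 'operator', a canned reply, or the fallback prompt."""
    # Lowercase and split into words so punctuation doesn't hide a match.
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    # The escape hatch is checked first: any trigger word hands off to a human.
    if words & ESCALATION_WORDS:
        return "operator"
    # Otherwise walk the (very shallow) response tree.
    for keyword, reply in CANNED_REPLIES.items():
        if keyword in words:
            return reply
    return FALLBACK
```

Anything outside the two keywords lands on the fallback, which is the "dumb tree" boundary users learn to expect from these systems.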

It is not easy, given the simplicity of textual UIs, to lower the user's expectations. However, given where the technology is for now, it may be necessary to erect such guardrails. Perhaps the font for the assistant should be some fixed-width typeface, to distinguish it from a human. Maybe some mechanical sound effects could convey the robotic nature of the machine writing the words, and perhaps the syntax should be less human in some ways, to lower expectations.

One of the huge problems with voice assistants, after all, is that the failures, when they occur, feel catastrophic from the user perspective. I may try a search on Google that doesn't return the results I want, but at least something comes back, and I'm usually sympathetic to the idea that what I want may not exist in an easily queryable form on the internet. However, though voice assistant errors occur much less frequently than before, when they do, it feels as if you're speaking to a careless design, and I mean careless in every sense of the word, from poorly crafted (why didn't the developer account for this obvious query?) to uncaring (as in emotionally cold).

Couples go to counseling over feeling as if they aren't being heard by each other. Some technology can get away with promising more than it can deliver, but when it comes to tech that is built around conversation, with all the expectations that very human mode of communication has accrued over the years, it's a dangerous game. In a map of the human brain, the neighborhoods of "you don't understand" and "you don't care" share a few exit ramps.

Evaluating mobile map designs

October 28, 2017 by Eugene Wei

I saw a few links to this recent comparison by Justin O'Beirne of the designs of Apple Maps vs. Google Maps. In it was a link to previous comparisons he made about a year ago. If you're into maps and design, it's a fairly quick read with a lot of useful time series screenshots from both applications to serve as reference points for those who don't open both apps regularly.

However, the entire evaluation seems to come from a perspective at odds with how the apps are actually used. O'Beirne's focus is on evaluating these applications from a cartographic standpoint, almost as if they're successors to old wall-hanging maps or giant road atlases like the ones my dad used to plot out our family road trips when we weren't wealthy enough to fly around the U.S.

The entire analysis is of how the maps look when the user hasn't entered any destination to navigate to (what I'll just refer to as the default map mode). Since most people use these apps as real-time navigation aids, especially while driving, the views O'Beirne dissects feel like edge cases (that's my hypothesis, of course; if someone out there has actual data on the % of time these apps are used for navigation versus not, I'd love to hear it, even if it's just directional to help frame the magnitude).

For example, much of O'Beirne's ink is spent on each application's road labels, often at really zoomed out levels of the map. I can't remember the last time I looked at any mobile mapping app at the eighth level of zoom. I've probably only spent a few minutes of my life in total in all of these apps at that level of the geographic hierarchy, and only to answer a trivia question or when visiting some region of the world on vacation.

What would be of greater utility to me, and what I've yet to find, is a design comparison of all the major mapping apps as navigation aids, a dissection of the UX in what I'll call their navigation modes. Such an analysis would be even more useful if it included Waze, which doesn't have the market share of Apple or Google Maps but which is popular among a certain set of drivers for its unique approach to evaluating traffic, among other things.

Such a comparison should analyze the visual comprehensibility of each app in navigation mode, which is very different from their default map views. How are roads depicted, what landmarks are shown, how clear is the selected path when seen only in the occasional sidelong glance while driving, which is about as much visual engagement as a user can offer if operating a 3,500-pound vehicle? How does the app balance textual information with the visualization of the roads ahead, and what other POIs or real world objects are shown? Waze, for example, shows me other Waze users in different forms depending on how many miles they've driven in the app and which visual avatars they've chosen.

Of course, the quality of the actual route would be paramount. It's difficult for a single driver to do A/B comparisons, but I still hope that someday someone will start running regular tests in which different cars, equipped with multiple phones, each logged into different apps, try to navigate to the same destination simultaneously. Over time, at some level of scale, such comparison data would be more instructive than the small sample size of the occasional self-reported anecdote.

[In the future, when we have large fleets of self-driving cars, they may produce insights that only large sample sizes can validate, like UPS's "our drivers save time by never turning left." I'd love it if Google Maps, Apple Maps, or Waze published some of what they've learned about driving given their massive data sets, a la OKCupid, but most of what they've published leans towards marketing drivel.]

Any analysis of navigation apps should also consider the voice prompts: how often does the map speak to you, how far in advance of the next turn are you notified, how clear are the instructions? What's the signal to noise? What are the default wording choices? Syntax? What voice options are offered? Both male and female voices? What accents?

Ultimately, what matters is getting to your destination in the safest, most efficient manner, but understanding how the applications' interfaces, underlying data, and algorithms influence that outcome would be of value to so many people who now rely on these apps every single day to get from point A to B. I'm looking for a Wirecutter-like battle of the navigation apps; may the best system win.

The other explicit choice O'Beirne makes is noted in a footnote:

We’re only looking at the default maps. (No personalization.)

It is, of course, difficult to evaluate personalization of a mapping app since you can generally only see how each map is personalized for yourself. However, much of the value of Google Maps lies in its personalization, or what I suspect is personalization. Given where we are in the evolution of many products and services, analyzing them in their non-personalized states is to disregard their chief modality.

When I use Google Maps in Manhattan, for example, I notice that the only points of interest (POIs) the map shows me at various levels of zoom seem to be places I've searched for most frequently (this is in the logged in state, which is how I always use the app). Given Google's reputation for being a world leader in crunching large data sets, it would be surprising if they weren't selecting POI labels, even for non-personalized versions of their maps, based on what people tend to search for most frequently.
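
To make that guess concrete, here is a minimal sketch of frequency-based label selection. It is purely hypothetical: the function name `pick_poi_labels`, the ranking rule, and the sample search logs are assumptions for illustration, not anything Google has described.

```python
from collections import Counter


def pick_poi_labels(personal_searches, global_searches, limit=3):
    """Rank places by the user's own search counts, then by overall popularity."""
    personal = Counter(personal_searches)
    popular = Counter(global_searches)
    candidates = set(personal) | set(popular)
    # Sort by personal count (descending), then global count (descending),
    # then name so that ties resolve the same way every time.
    ranked = sorted(candidates, key=lambda p: (-personal[p], -popular[p], p))
    return ranked[:limit]


# Hypothetical search logs.
everyone = ["Empire State Building"] * 5 + ["Strand Books"] * 2 + ["Joe's Pizza"]
mine = ["Joe's Pizza", "Joe's Pizza", "Strand Books"]

logged_in = pick_poi_labels(mine, everyone, limit=2)
logged_out = pick_poi_labels([], everyone, limit=2)
```

With an empty personal history the same function falls back to whatever is most searched overall, which is the non-personalized case speculated about above.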

In the old days, if you were making a map to be hung on the wall, or for a paper map or road atlas, what you chose as POIs would be fixed until the next edition of that map. You'd probably choose what felt like the most significant POIs based on reputation, ones that likely wouldn't be gone before the next update. Eiffel Tower? Sure. Some local coffee shop? Might be a Starbucks in three months, best leave that label off.

Now, maps can be updated dynamically. There will always be those who find any level of personalization creepy, and some of it is, but I also find the lack of personalization to be immensely frustrating in some services. That I search for reservations in SF on OpenTable and receive several hundred hits every time, sorted in who knows what order, instead of results that cluster my favorite or most frequently booked restaurants at the top, drives me batty.

When driving, personalization is even more valuable because it's often inconvenient or impossible to type or interact with the device for safety reasons. It's a great time saver to have Waze guess where I'm headed automatically ("Are you driving to work?" it asks me every weekday morning), and someday I just want to be able to say "give me directions to my sister's" and have it know where I'm headed.

My quick first-person assessment, despite the small sample size caveats noted earlier:

I know that Apple Maps, as the default on iOS, has the market share lead on iPhone by a healthy margin. Still, I'll never get past the time the app took me off to a dead end while I was on the way to a wedding, and I've not used it since except to glance at the design. It may have the most visually pleasing navigation mode aesthetic, but I don't trust their directions at the tails. Some products are judged not on their mean outcome but their handling of the tails. For me, navigation is one of those.
It's not clear if Apple Maps should have a data edge over Google Maps and Waze (Google bought Waze but has kept the app separate). Most drivers use it on the iPhone because it's the default, but Google got a head start in this space and also has a fleet of vehicles on the road taking Street View photos. Eventually, Google may augment that fleet with self-driving cars.
I trust Google Maps directions more than those of Apple Maps. However, I miss the usability of the first version of Google Maps, which came out on iOS way back with the first iPhone. I'd heard rumors Apple built that app for Google, but I'm not sure if that's true. The current flat design of Google Maps often strands me in a state in which I have no idea how to initiate navigation. I'd like to believe I'm a fairly sophisticated user and yet I sometimes sit there swiping and tapping in Google Maps like an idiot, trying to get it to start reading turn-by-turn directions. Drives me batty.

I use Waze the most when driving in the Bay Area or wherever I trust that there are enough other drivers using Waze that it will offer the quickest route to my destination. That seems true in most major metropolitan areas. I can tell a lot of users in San Francisco use Waze because sometimes, when I have to drive home to the city from the Peninsula, I find myself in a line of cars exiting the highway and navigating through some random neighborhood side street, one that no one would visit unless guided by an algorithmic deity.

I use Waze with my phone mounted to one of those phone clamps that holds the phone at eye level above my dashboard because the default Tesla navigation map is still on Google Maps and is notoriously oblivious to traffic when selecting a route and estimating an arrival time. Since I use Waze more than any other navigation app, I have more specific critiques.

One reason I use Waze is that it seems the quickest to respond to temporary buildups of traffic. I suspect it's because the UI has a dedicated, always visible button for reporting such traffic. Since I'm almost always the driver, I have no idea how people are able to do such reporting, but either a lot of passengers are doing the work or lots of drivers are able to do so while their cars are stuck in gridlock. The other alternative, that drivers are filing such reports while their cars are in motion, is frightening.
I don't understand the other social networking aspects of Waze. They're an utter distraction. I'm not immune to the intrinsic rewards of gamification, but in the driving context, where I can't really do much more than glance at my phone, it's all just noise. I don't feel a connection to the other random Waze drivers I see from time to time in the app, all of which are depicted as various pastel-hued cartoon sperm. In wider views of the map, all the various car avatars just add a lot of visual noise.
I wish I could turn off some of the extraneous voice alerts, like "Car stopped on the side of the road ahead." I'm almost always listening to a podcast in the background when driving, and the constant interruptions annoy me. There's nothing I can do about a car on the side of the road; I wish I could customize which alerts I had to hear.
The ads that drop down and cover almost half the screen are not just annoying but dangerous, as I have to glance over and then swipe them off the screen. That, in and of itself, is disqualifying. But beyond that, even while respecting the need for companies to make money, I can't imagine these ads generate a lot of revenue. I've never looked at one. If the ads are annoying, the occasional surveys asking me which ads/brands I've seen on Waze are doubly so. With Google's deep pockets behind Waze, there must be a way to limit ads to those moments where they're safe or clearly requested, for example when a user is researching where to get gas or a bite to eat. When a driver has hands on the wheel and is guiding a giant mass of metal at high velocity, no cognitive resources should be diverted to remembering what brands you recall seeing on the app.
Waze still doesn't understand how to penalize unprotected left turns, which are almost completely unusable in Los Angeles at any volume of traffic. At rush hour it's a fatal failure, like being ambushed by a video game foe that can kill you with one shot with no advance warning. As long as it remains unfixed, I use Google Maps when in LA. I can understand why knowledge sharing between the two companies may be limited by geographic separation despite being part of the same umbrella company, but that the apps don't borrow more basic lessons from each other seems a shame.
I use Bluetooth to listen to podcasts on Overcast when driving, and since I downloaded iOS 11, that connection has been very flaky. Also, if I don't have the podcast on and Waze gives me a voice cue, the podcast starts playing. I've tried quitting Overcast, and the podcast still starts playing every time Waze speaks to me. I had reached a good place in that Overcast would pause while Waze spoke so they wouldn't overlap, but since iOS 11 even that works inconsistently. This is just one of the bugs that iOS 11 has unleashed upon my phone; I really regret upgrading.
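
The unprotected left turn complaint above can be sketched as a cost penalty applied when comparing candidate routes. This is a minimal illustration under stated assumptions: the `Step` type, the penalty value, and the two sample routes are hypothetical, and nothing here reflects how Waze or Google Maps actually score routes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    seconds: float                  # estimated driving time for this segment
    unprotected_left: bool = False  # True if the segment ends in an unprotected left


def route_cost(steps, left_penalty_s=0.0):
    """Total time for a route, adding a fixed penalty per unprotected left turn."""
    return sum(
        s.seconds + (left_penalty_s if s.unprotected_left else 0.0) for s in steps
    )


def pick_route(routes, left_penalty_s=0.0):
    """Return the name of the cheapest route under the given penalty."""
    return min(routes, key=lambda name: route_cost(routes[name], left_penalty_s))


# Hypothetical candidates: a shortcut with one unprotected left, and a longer detour.
routes = {
    "shortcut": [Step(120), Step(60, unprotected_left=True), Step(90)],
    "detour": [Step(120), Step(100), Step(110)],
}

no_penalty_choice = pick_route(routes)                       # raw time only
rush_hour_choice = pick_route(routes, left_penalty_s=180.0)  # assumed 3-minute penalty
```

With no penalty the shortcut wins on raw time; once each unprotected left is assumed to cost three extra minutes, the detour becomes the cheaper route.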

IKEA's Billy bookcase

February 27, 2017 by Eugene Wei

Now there are 60-odd million in the world, nearly one for every 100 people - not bad for a humble bookcase.

In fact, so ubiquitous are they, Bloomberg uses them to compare purchasing power across the world.

According to the Bloomberg Billy Bookcase Index - yes, that's a thing - they cost most in Egypt, just over $100 (£79), whereas in Slovenia you can get them for less than $40 (£31).

A few of the interesting stats on Ikea's Billy bookcase series.

To get as rich as Mr Kamprad has, you have to make stuff that is both cheap and acceptably good.

And to get even richer, you make stuff that is both cheap and the best in its class, though that's not as easy with furniture as it is with software.

IKEA is an interesting example of disruption that I haven't read as many think pieces on as the usual suspects in tech.

I miss first-gen Google Maps for iOS

July 19, 2015 by Eugene Wei

This is an oldie, but still relevant: an informative deep dive into the design choices of Google Maps and Apple Maps on iOS.

I wish I had screens from the first version of Google Maps that shipped on the iPhone, a version that was rumored to have been built by Apple for Google. To me, that's still the most usable mapping app ever for iOS, and all subsequent versions, including both of the latest versions of Google Maps and Apple Maps, are more complex. The new maps may do more and offer more functionality, but if you just wanted to quickly get directions to a particular place, nothing beat the first-gen Google Maps for iOS.

Part of this is the result of the new flat design aesthetic, which is sleek but often opaque. In many ways, touchscreen user interfaces seem to have approached a local maximum in which the only innovation is coming up with new icons that users must learn. At some point, we're just substituting new abstractions and not making significant leaps forward in usability. More apps are better, on average, than the first generation of mobile apps, but the best designed apps today don't feel much better than the best apps from the dawn of the iOS App Store.

These days, the great leap forward in interface design feels like it's the complete removal of the abstraction of traditional software design. The interface that feels closest to achieving that in the near future is text, most often found in some sort of messaging interface. Following on its heels, with even greater potential as a democratic UI medium, is voice.

Asym spacing

July 14, 2015 by Eugene Wei

I've never heard of this typography concept: asym spacing.

But one tech company believes something as simple as increasing the size of spacing between certain words could improve people’s reading comprehension. Research going back decades has found that “chunking,” a technique that separates text into meaningful units, provides visual cues that help readers better process information.

...

The image below shows the before and after of Asym’s spacing on a paragraph of text. Quartz is also experimenting by manually adding Asym’s spaces to this article. The effect is subtle, but likely will irk keen-eyed copy editors (sorry!), especially those from the print world who are accustomed to deleting extraneous spaces.

No idea if the science behind this is solid, but I have heard of chunking. When I took a speed-reading class in grade school, they taught us two key principles. One was not to read aloud “inside your head,” and the other was not to read linearly, one word at a time, but to look at chunks of words (which also makes it hard to read linearly).

Maybe because I already chunk groups of words in regularly spaced text, or maybe because the asym spacing bunched odd groups of words together, I found the regularly spaced text (on the left) easier to read.