Jeff Zych

On Being a Generalist

I recently read Frank Chimero’s excellent article, “Designing in the Borderlands”. The gist of it is that we (designers, and the larger tech community) have constructed walls between various disciplines that we see as opposites, such as print vs. digital, text vs. image, and so on. However, the most interesting design happens in the borderlands, where these different media connect. He cites examples that combine physical and digital media, but the most interesting bit for me was his thoughts on roles that span disciplines:

For a long time, I perceived my practice’s sprawl as a defect—evidence of an itchy mind or a fear of commitment—but I am starting to learn that a disadvantage can turn into an advantage with a change of venue. The ability to cross borders is an asset. Who else could go from group to group and be welcomed? The pattern happens over and over: if you’re not a part of any group, you can move amongst them all by tip-toeing across the lines that connect them.

I have felt this way many times throughout my career (especially that “fear of commitment” part). I have long felt like a generalist who works in both design and engineering, and I label myself to help people understand what I do (not to mention the necessity of a title). But I’ve never cleanly fit into any discipline.

This line was further blurred by my graduate degree from UC Berkeley’s School of Information. The program brings together folks with diverse backgrounds, and produces T-shaped people who can think across disciplines and understand the broader context of their work, whether it be in engineering, design, policy & law, sociology, or any of the dozens of other fields our alumni call home.

These borderlands are the best place for a designer like me, and maybe like you, because the borderlands are where things connect. If you’re in the borderlands, your different tongues, your scattered thoughts, your lack of identification with a group, and all the things that used to be thought of as drawbacks in a specialist enclave become the hardened armor of a shrewd generalist in the borderlands.

Couldn’t have said it any better. Being able to move between groups and think across disciplines is more of an advantage than a disadvantage.

Why We Hire UI Engineers on the Design Team

This Happy Cog article by Stephen Caver perfectly encapsulates why we hire UI Engineers on the design team at Optimizely (as opposed to the engineering team). We want the folks coding a UI to be involved in the design process from the beginning, to understand the design system that underlies a user experience, and to be empowered to make design decisions while developing a UI. Successful designs must adapt to various contexts and degrade gracefully. The people most qualified to make those kinds of decisions are the ones writing the code. As said in the article, “In this new world, the best thing a developer can do is to acquire an eye for design—to be able to take design aesthetic, realize its essential components, and reinterpret them on the fly.” By embracing this mindset in our hiring and design process, we’ve found the end result is a higher quality product.

Matthew Carter’s “My Life in Typefaces”

I just got around to watching Matthew Carter’s excellent TED talk, “My Life in Typefaces”. In it, he talks about his experience designing type over the past five decades, and how technical constraints influenced his designs. The central question he tries to answer is, “Does a constraint force a compromise? By accepting a constraint, are you working to a lower standard?” This is a question that comes up in every discipline, and with every technological change. Matthew Carter’s take on this subject is interesting because he’s experienced numerous technological changes, and has designed superb typefaces for all of them.

At first blush, it’s easy to conclude that constraints force designers to compromise their vision. But design isn’t produced in a vacuum, and ultimately must be realized through one or more media (print, screen, radio, etc.). Therefore, one must work within constraints to produce the best designs. To do so, designers must understand the technology that enables their designs to be experienced, be it code, the printing process, or any other medium. As Matthew Carter said in the talk, “I’m a pragmatist, not an idealist, out of necessity,” which is a valuable lesson that all designers should take to heart.

Designing with instinct vs. data

Braden Kowitz wrote a great article exploring the ever-increasing tension between making design decisions based on instinct versus data. As he says, “It’s common to think of data and instincts as being opposing forces in design decisions.” This is especially true at an A/B testing company — we have a tendency to quantify and measure everything.

He goes on to say, “In reality, there’s a blurry line between the two,” and I couldn’t agree more. When I’m designing, I’m in the habit of always asking, “What data would help me make this decision?” Sometimes it’s usage logs, sometimes it’s user testing, sometimes it’s market research, and sometimes there isn’t anything but my own intuition. Even when there is data, it’s all just input I use to help me reach a decision. It’s not a good idea to blindly follow data, but it’s equally bad to only use your gut. As Braden said, it’s important to balance the two.

Did you A/B test the redesigned preview tool?

A lot of people have asked me if we A/B tested the redesigned preview tool. The question comes in two flavors: did we use A/B testing to validate impersonation was worth building (a.k.a. fake door testing); and, did we A/B test the redesigned UI against the old UI? Both are good questions, but the short answer is no. In this post I’m going to dig into both and explain why.

Fake Door Testing

Fake door testing (video) is a technique to measure interest in a new feature by building a “fake door” version of it that looks like the final version (but doesn’t actually work) and measuring how many people engage with it. Users who try the feature see an explanation of what it is, a note that it’s “Coming soon”, and usually the ability to “vote” for it or send feedback (the specifics vary depending on the context). This doesn’t need to be run as an A/B test, but setting it up as one lets you compare user behavior.
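To make the measurement concrete, here’s a minimal sketch (hypothetical event names, not Optimizely’s actual instrumentation) of summarizing fake-door engagement: of the users who saw the fake tab, how many clicked it or sent feedback?

```python
def engagement_rates(events):
    """events: list of (user_id, action) tuples, where action is
    'saw_tab', 'clicked_tab', or 'sent_feedback'."""
    saw = {u for u, a in events if a == "saw_tab"}
    clicked = {u for u, a in events if a == "clicked_tab"}
    feedback = {u for u, a in events if a == "sent_feedback"}
    n = len(saw) or 1  # avoid division by zero when no one saw the tab
    return {"click_rate": len(clicked & saw) / n,
            "feedback_rate": len(feedback & saw) / n}

# Two users saw the tab; one clicked it and also left feedback.
rates = engagement_rates([("u1", "saw_tab"), ("u2", "saw_tab"),
                          ("u1", "clicked_tab"), ("u1", "sent_feedback")])
# → {"click_rate": 0.5, "feedback_rate": 0.5}
```

Run as an A/B test, you’d compare these rates between variants; run standalone, they’re a rough gauge of interest.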

We could have added an “Impersonate” tab to the old UI, measured how many people tried to use it, and gathered feedback. This would have been cheap and easy. But we didn’t do this because the feature was inspired by our broader research around personalization. Our data pointed us towards this feature, so we were confident it would be worthwhile.

But more than that, measuring clicks and votes doesn’t tell you much. People can click out of curiosity, which doesn’t tell you if they’d actually use the feature or if it solves a real problem. Even if people send feedback saying they’d love it, what users say and what they do often differ. Actually talking to users to find pain points yields robust data that leads to richer solutions. The new impersonate functionality is one such example — no one had thought of it before, and it wasn’t on our feature request list or product roadmap.

However, not everyone has the resources to conduct user research. In that situation, fake door testing is a good way of cheaply getting feedback on a specific idea. After all, some data is better than no data.

A/B Testing the Redesigned UI

The second question is, “Did you A/B test the redesigned preview against the previous one?” We didn’t, primarily because there’s no good metric to use as a conversion goal. We added completely new functionality to the preview tool, so most metrics are the equivalent of comparing apples to oranges. For example, measuring how many people impersonate a visitor is meaningless because the old UI doesn’t even have that feature.

So at an interface level, there isn’t a good measurement. But what about at the product level? We could measure larger metrics, such as the number of targeted experiments being created, to see if the new UI has an effect. There are two problems with this. First, it will take a long time (many months) to reach a statistically significant difference because the conversion rate on most product metrics is low. Second, if we eventually measured a difference, there’s no guarantee it was caused by adding impersonation. The broader the metric, the more factors that influence it (such as other product updates). To overcome this you could freeze people in the “A” and “B” versions of the product, but given how long it takes to reach significance, this isn’t a good idea.
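The standard two-proportion sample-size formula shows why low conversion rates make this take months. As an illustration with made-up numbers (the 2% baseline and 10% relative lift are assumptions, not our actual metrics):

```python
from statistics import NormalDist

def sample_size_per_group(p_base, p_new, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-proportion z-test
    to detect a change from p_base to p_new."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    p_bar = (p_base + p_new) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p_base * (1 - p_base) + p_new * (1 - p_new)) ** 0.5) ** 2
    return num / (p_new - p_base) ** 2

# Detecting a 10% relative lift on a 2% baseline conversion rate:
n = sample_size_per_group(0.02, 0.022)  # roughly 80,000 users per group
```

At tens of thousands of users per variant, a product without consumer-scale traffic simply can’t reach significance quickly — and the required sample grows as the expected effect shrinks.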

Companies like Facebook and Google have enough traffic that they can roll out new features to a small percentage of users (say, 5%), and measure the impact on their core metrics. If any of those metrics take a dive, they revert users to the previous UI and keep iterating. When you have the scale of Facebook and Google, you can get significant data in a day. Unfortunately, like most companies, we don’t have this scale, so it isn’t an option.
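A percentage rollout like this is typically implemented by bucketing users deterministically on a hash of their ID, so each user’s experience stays stable across sessions. A minimal sketch (hypothetical names, not any company’s actual system):

```python
import hashlib

def in_rollout(user_id: str, feature: str, pct: float) -> bool:
    """Deterministically place `pct` percent of users in a feature rollout.
    Hashing (feature, user_id) gives each user a stable, uniform bucket."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    return bucket < pct / 100

# About 5% of users see the new UI; everyone else keeps the old one.
exposed = sum(in_rollout(f"user-{i}", "new-preview-tool", 5)
              for i in range(10_000))
```

Because the assignment is a pure function of the user ID, reverting the rollout is just lowering `pct`, and no per-user state needs to be stored.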

So How Do You Know The Redesign Was Worth It?

What people are really asking is how do we know the redesign was worth the effort? Being at an A/B testing company, everyone wants to A/B test everything. But in this case, there wasn’t a place for it. Like any method, split testing has its strengths and weaknesses.

No, Really — How Do You Know The Redesign Was Worth It?

Primarily via qualitative feedback (i.e. talking to users), which at a high level has been positive (but there are some improvements we can make). We’re also measuring people’s activities in the preview tool (e.g. changing tabs, impersonating visitors, etc.). So far, those are healthy. Finally, we’re keeping an eye on some product-level metrics, like the number of targeted experiments created. These metrics are part of our long-term personalization efforts, and we hope in the long run to see them go up. But the preview tool is just one piece of that puzzle, so we don’t expect anything noticeable from that alone.

The important theme here is that gathering data is important to ensure you’re making the best use of limited resources (time, people, etc.). But there’s a whole world of data beyond A/B testing, such as user research, surveys, product analytics, and so on. It’s important to keep in mind the advantages and disadvantages of each, and use the most appropriate ones at your disposal.
