You are here

A List Apart

Subscribe to A List Apart feed
Articles for people who make web sites.
Updated: 22 hours 29 min ago

Request with Intent: Caching Strategies in the Age of PWAs

Thu, 11/21/2019 - 06:30

Once upon a time, we relied on browsers to handle caching for us; as developers in those days, we had very little control. But then came Progressive Web Apps (PWAs), Service Workers, and the Cache API—and suddenly we have expansive power over what gets put in the cache and how it gets put there. We can now cache everything we want to… and therein lies a potential problem.

Media files—especially images—make up the bulk of average page weight these days, and it’s getting worse. In order to improve performance, it’s tempting to cache as much of this content as possible, but should we? In most cases, no. Even with all this newfangled technology at our fingertips, great performance still hinges on a simple rule: request only what you need and make each request as small as possible.

To provide the best possible experience for our users without abusing their network connection or their hard drive, it’s time to put a spin on some classic best practices, experiment with media caching strategies, and play around with a few Cache API tricks that Service Workers have hidden up their sleeves.

Best intentions

All those lessons we learned optimizing web pages for dial-up became super-useful again when mobile took off, and they continue to be applicable in the work we do for a global audience today. Unreliable or high latency network connections are still the norm in many parts of the world, reminding us that it’s never safe to assume a technical baseline lifts evenly or in sync with its corresponding cutting edge. And that’s the thing about performance best practices: history has borne out that approaches that are good for performance now will continue being good for performance in the future.

Before the advent of Service Workers, we could provide some instructions to browsers with respect to how long they should cache a particular resource, but that was about it. Documents and assets downloaded to a user’s machine would be dropped into a directory on their hard drive. When the browser assembled a request for a particular document or asset, it would peek in the cache first to see if it already had what it needed to possibly avoid hitting the network.

We have considerably more control over network requests and the cache these days, but that doesn’t excuse us from being thoughtful about the resources on our web pages.

Request only what you need

As I mentioned, the web today is lousy with media. Images and videos have become a dominant means of communication. They may convert well when it comes to sales and marketing, but they are hardly performant when it comes to download and rendering speed. With this in mind, each and every image (and video, etc.) should have to fight for its place on the page. 

A few years back, a recipe of mine was included in a newspaper story on cooking with spirits (alcohol, not ghosts). I don’t subscribe to the print version of that paper, so when the article came out I went to the site to take a look at how it turned out. During a recent redesign, the site had decided to load all articles into a nearly full-screen modal viewbox layered on top of their homepage. This meant requesting the article required requests for all of the assets associated with the article page plus all the contents and assets for the homepage. Oh, and the homepage had video ads—plural. And, yes, they auto-played.

I popped open DevTools and discovered the page had blown past 15 MB in page weight. Tim Kadlec had recently launched What Does My Site Cost?, so I decided to check out the damage. Turns out that the actual cost to view that page for the average US-based user was more than the cost of the print version of that day’s newspaper. That’s just messed up.

Sure, I could blame the folks who built the site for doing their readers such a disservice, but the reality is that none of us go to work with the goal of worsening our users’ experiences. This could happen to any of us. We could spend days scrutinizing the performance of a page only to have some committee decide to set that carefully crafted page atop a Times Square of auto-playing video ads. Imagine how much worse things would be if we were stacking two abysmally-performing pages on top of each other!

Media can be great for drawing attention when competition is high (e.g., on the homepage of a newspaper), but when you want readers to focus on a single task (e.g., reading the actual article), its value can drop from important to “nice to have.” Yes, studies have shown that images excel at drawing eyeballs, but once a visitor is on the article page, no one cares; we’re just making it take longer to download and more expensive to access. The situation only gets worse as we shove more media into the page. 

We must do everything in our power to reduce the weight of our pages, so avoid requests for things that don’t add value. For starters, if you’re writing an article about a data breach, resist the urge to include that ridiculous stock photo of some random dude in a hoodie typing on a computer in a very dark room.

Request the smallest file you can

Now that we’ve taken stock of what we do need to include, we must ask ourselves a critical question: How can we deliver it in the fastest way possible? This can be as simple as choosing the most appropriate image format for the content presented (and optimizing the heck out of it) or as complex as recreating assets entirely (for example, if switching from raster to vector imagery would be more efficient).

Offer alternate formats

When it comes to image formats, we don’t have to choose between performance and reach anymore. We can provide multiple options and let the browser decide which one to use, based on what it can handle.

You can accomplish this by offering multiple sources within a picture or video element. Start by creating multiple formats of the media asset. For example, with WebP and JPG, it’s likely that the WebP will have a smaller file size than the JPG (but check to make sure). With those alternate sources, you can drop them into a picture like this:

<picture> <source srcset="my.webp" type="image/webp"> <img src="my.jpg" alt="Descriptive text about the picture."> </picture>

Browsers that recognize the picture element will check the source element before making a decision about which image to request. If the browser supports the MIME type “image/webp,” it will kick off a request for the WebP format image. If not (or if the browser doesn’t recognize picture), it will request the JPG. 

The nice thing about this approach is that you’re serving the smallest image possible to the user without having to resort to any sort of JavaScript hackery.

You can take the same approach with video files:

<video controls> <source src="my.webm" type="video/webm"> <source src="my.mp4" type="video/mp4"> <p>Your browser doesn’t support native video playback, but you can <a href="my.mp4" download>download</a> this video instead.</p> </video>

Browsers that support WebM will request the first source, whereas browsers that don’t—but do understand MP4 videos—will request the second one. Browsers that don’t support the video element will fall back to the paragraph about downloading the file.

The order of your source elements matters. Browsers will choose the first usable source, so if you specify an optimized alternative format after a more widely compatible one, the alternative format may never get picked up.  

Depending on your situation, you might consider bypassing this markup-based approach and handle things on the server instead. For example, if a JPG is being requested and the browser supports WebP (which is indicated in the Accept header), there’s nothing stopping you from replying with a WebP version of the resource. In fact, some CDN services—Cloudinary, for instance—come with this sort of functionality right out of the box.

Offer different sizes

Formats aside, you may want to deliver alternate image sizes optimized for the current size of the browser’s viewport. After all, there’s no point loading an image that’s 3–4 times larger than the screen rendering it; that’s just wasting bandwidth. This is where responsive images come in.

Here’s an example:

<img src="medium.jpg" srcset="small.jpg 256w, medium.jpg 512w, large.jpg 1024w" sizes="(min-width: 30em) 30em, 100vw" alt="Descriptive text about the picture.">

There’s a lot going on in this super-charged img element, so I’ll break it down:

  • This img offers three size options for a given JPG: 256 px wide (small.jpg), 512 px wide (medium.jpg), and 1024 px wide (large.jpg). These are provided in the srcset attribute with corresponding width descriptors.
  • The src defines a default image source, which acts as a fallback for browsers that don’t support srcset. Your choice for the default image will likely depend on the context and general usage patterns. Often I’d recommend the smallest image be the default, but if the majority of your traffic is on older desktop browsers, you might want to go with the medium-sized image.
  • The sizes attribute is a presentational hint that informs the browser how the image will be rendered in different scenarios (its extrinsic size) once CSS has been applied. This particular example says that the image will be the full width of the viewport (100vw) until the viewport reaches 30 em in width (min-width: 30em), at which point the image will be 30 em wide. You can make the sizes value as complicated or as simple as you want; omitting it causes browsers to use the default value of 100vw.

You can even combine this approach with alternate formats and crops within a single picture.

Categories: Design

Responsible JavaScript: Part III

Thu, 11/14/2019 - 06:30

You’ve done everything you thought was possible to address your website’s JavaScript problem. You relied on the web platform where you could. You sidestepped Babel and found smaller framework alternatives. You whittled your application code down to its most streamlined form possible. Yet, things are just not fast enough. When websites fail to perform the way we as designers and developers expect them to, we inevitably turn on ourselves:

“What are we failing to do?” “What can we do with the code we have written?” “Which parts of our architecture are failing us?”

These are valid inquiries, as a fair share of performance woes do originate from our own code. Yet, assigning blame solely to ourselves blinds us to the unvarnished truth that a sizable onslaught of our performance problems comes from the outside.

When the third wheel crashes the party

Convenience always has a price, and the web is wracked by our collective preference for it.  JavaScript, in particular, is employed in a way that suggests a rapidly increasing tendency to outsource whatever it is that We (the first party) don’t want to do. At times, this is a necessary decision; it makes perfect financial and operational sense in many situations.

But make no mistake, third-party JavaScript is never cheap. It’s a devil’s bargain where vendors seduce you with solutions to your problem, yet conveniently fail to remind you that you have little to no control over the side effects that solution introduces. If a third-party provider adds features to their product, you bear the brunt. If they change their infrastructure, you will feel the effects of it. Those who use your site will become frustrated, and they aren’t going to bother grappling with an intolerable user experience. You can mitigate some of the symptoms of third parties, but you can’t cure the ailment unless you remove the solutions altogether—and that’s not always practical or possible.

In this installment of Responsible JavaScript, we’ll take a slightly less technical approach than in the previous installment. We are going to talk more about the human side of third parties. Then, we’ll go down some of the technical avenues for how you might go about tackling the problem.

Hindered by convenience

When we talk about the sorry state of the web today, some of us are quick to point out the role of developer convenience in contributing to the problem. While I share the view that developer convenience has a tendency to harm the user experience, they’re not the only kind of convenience that can turn a website into a sluggish, janky mess.

Operational conveniences can become precursors to a very thorny sort of technical debt. These conveniences are what we reach for when we can’t solve a pervasive problem on our own. They represent third-party solutions that address problems in the absence of architectural flexibility and/or adequate development resources.

Whenever an inconvenience arises, that is the time to have the discussion around how to tackle it in a way that’s comprehensive. So let’s talk about what it looks like to tackle that sort of scenario from a more human angle.

The problem is pain

The reason third parties come into play in the first place is pain. When a decision maker in an organization has felt enough pain around a certain problem, they’re going to do a very human thing, which is to find the fastest way to make that pain go away.

Markets will always find ways to address these pain points, even if the way they do so isn’t sustainable or even remotely helpful. Web accessibility overlays—third-party scripts that purport to automatically fix accessibility issues—are among the worst offenders. First, you fork over your money for a fix that doesn’t fix anything. Then you pay a wholly different sort of price when that “fix” harms the usability of your website. This is not a screed to discredit the usefulness of the tools some third-party vendors provide, but to illustrate how the adoption of third-party solutions happens, even those that are objectively awful

A Chrome performance trace of a long task kicked off by a third party’s web accessibility overlay script. The task occupies the main thread for roughly 600 ms on a 2017 Retina MacBook.

So when a vendor rolls up and promises to solve the very painful problem we’re having, there’s a good chance someone is going to nibble. If that someone is high enough in the hierarchy, they’ll exert downward pressure on others to buy in—if not circumvent them entirely in the decision-making process. Conversely, adoption of a third-party solution can also occur when those in the trenches are under pressure and lack sufficient resources to create the necessary features themselves.

Whatever the catalyst, it pays to gather your colleagues and collectively form a plan for navigating and mitigating the problems you’re facing.

Create a mitigation plan

Once people in an organization have latched onto a third-party solution, however ill-advised, the difficulty you’ll encounter in forcing a course change will depend on how urgent a need that solution serves. In fact, you shouldn’t try to convince proponents of the solution that their decision was wrong. Such efforts almost always backfire and can make people feel attacked and more resistant to what you’re telling them. Even worse, those efforts could create acrimony where people stop listening to each other completely, and that is a breeding ground for far worse problems to develop.

Grouse and commiserate amongst your peers if you must—as I myself have often done—but put your grievances aside and come up with a mitigation plan to guide your colleagues toward better outcomes. The nooks and crannies of your specific approach will depend on the third parties themselves and the structure of the organization, but the bones of it could look like the following series of questions.

What problem does this solution address?

There’s a reason why a third-party solution was selected, and this question will help you suss out whether the rationale for its adoption is sound. Remember, there are times decisions are made when all the necessary people are not in the room. You might be in a position where you have to react to the aftermath of that decision, but the answer to this question will lead you to a natural follow-up.

How long do we intend to use the solution?

This question will help you identify the solution’s shelf life. Was it introduced as a bandage, with the intent to remove it once the underlying problem has been addressed, such as in the case of an accessibility overlay? Or is the need more long-term, such as the data provided by an A/B testing suite? The other possibility is that the solution can never be effectively removed because it serves a crucial purpose, as in the case of analytics scripts. It’s like throwing a mattress in a swimming pool: it’s easy to throw in, but nigh impossible to drag back out.

In any case, you can’t know if a third-party script is here to stay if you don’t ask. Indeed, if you find out the solution is temporary, you can form a plan to eventually remove it from your site once the underlying problem it addresses has been resolved.

Who’s the point of contact if issues arise?

When a third-party solution is put into place, someone must be the point of contact for when—not if—issues arise.

I’ve seen what happens (far too often) when a third-party script gets out of control. For example, when a tag manager or an A/B testing framework’s JavaScript grows slowly and insidiously because marketers aren’t cleaning out old tags or completed A/B tests. It’s for precisely these reasons that responsibility needs to be attached to a specific person in your organization for third-party solutions currently in use on your site. What that responsibility entails will differ in every situation, but could include:

  • periodic monitoring of the third-party script’s footprint;
  • maintenance to ensure the third-party script doesn’t grow out of control;
  • occasional meetings to discuss the future of that vendor’s relationship with your organization;
  • identification of overlaps of functionality between multiple third parties, and if potential redundancies can be removed;
  • and ongoing research, especially to identify speedier alternatives that may act as better replacements for slow third-party scripts.

The idea of responsibility in this context should never be an onerous, draconian obligation you yoke your teammates with, but rather an exercise in encouraging mindfulness in your colleagues. Because without mindfulness, a third-party script’s ill effects on your website will be overlooked until it becomes a grumbling ogre in the room that can no longer be ignored. Assigning responsibility for third parties can help to prevent that from happening.

Ensuring responsible usage of third-party solutions

If you can put together a mitigation plan and get everyone on board, the work of ensuring the responsible use of third-party solutions can begin. Luckily for you, the actual technical work will be easier than trying to wrangle people. So if you’ve made it this far, all it will take to get results is time and persistence.

Load only what’s necessary

It may seem obvious, but load only what’s necessary. Judging by the amount of unused first-party JavaScript I see loaded—let alone third-party JavaScript—it’s clearly a problem. It’s like trying to clean your house by stuffing clutter into the closets. Regardless of whether they’re actually needed, it’s not uncommon for third-party scripts to be loaded on every single page, so refer to your point of contact to figure out which pages need which third-party scripts.

As an example, one of my past clients used a popular third-party tool across multiple brand sites to get a list of retailers for a given product. It demonstrated clear value, but that script only needed to be on a site’s product detail page. In reality, it was frequently loaded on every page. Culling this script from pages where it didn’t belong significantly boosted performance for non-product pages, which ostensibly reduced the friction on the conversion path.

Figuring out which pages need which third-party scripts requires you to do some decidedly untechnical work. You’ll actually have to get up from your desk and talk to the person who has been assigned responsibility for the third-party solution you’re grappling with. This is very difficult work for me, but it’s rewarding when good-faith collaboration happens, and good outcomes are realized as a result.

Self-host your third-party scripts

This advice isn’t a secret by any stretch. I even touched on it in the previous installment of this series, but it needs to be shouted from the rooftops at every opportunity: you should self-host as many third-party resources as possible. Whether this is feasible depends on the third-party script in question.

Is it some framework you’re grabbing from Google’s hosted libraries, cdnjs, or other similar provider? Self-host that sucker right now.

Casper found a way to self-host their Optimizely script and significantly reduced their start render time for their trouble. It really drives home the point that a major detriment of third-party resources is the fact that their mere existence on other servers is one of the worst performance bottlenecks we encounter.

If you’re looking to self-host an analytics solution or a similar sort of script, there’s a higher level of difficulty to contend with to self-host it. You may find that some third-party scripts simply can’t be self-hosted, but that doesn’t mean it isn’t worth the trouble to find out. If you find that self-hosting isn’t an option for a third-party script, don’t fret. There are other mitigations you can try.

Mask latency of cross-origin connections

If you can’t self-host your third-party scripts, the next best thing is to preconnect to servers that host them. WebPageTest’s Connection View does a fantastic job of showing you which servers your site gathers resources from, as well as the latency involved in establishing connections to them.

WebPageTest’s Connection View shows all the different servers a page requests resources from during load.

Preconnections are effective because they establish connections to third-party servers before the browser would otherwise discover them in due course. Parsing HTML takes time, and parsers are often blocked by stylesheets and other scripts. Wherever you can’t self-host third-party scripts, preconnections make perfect sense.

Maybe don’t preload third-party scripts

Preloading resources is one of those things that sounds fantastic at first—until you consider its potential to backfire, as Andy Davies points out. If you’re unfamiliar with preloading, it’s similar to preconnecting but goes a step further by instructing the browser to fetch a particular resource far sooner than it ordinarily would.

The drawback of preloading is that while it’s great for ensuring a resource gets loaded as soon as possible, it changes the discovery order of that resource. Whenever we do this, we’re implicitly saying that other resources are less important—including resources crucial to rendering or even core functionality.

It’s probably a safe bet that most of your third-party code is not as crucial to the functionality of your site as your own code. That said, if you must preload a third-party resource, ensure you’re only doing so for third-party scripts that are critical to page rendering.

If you do find yourself in a position where your site’s initial rendering depends on a third-party script, refer to your mitigation plan to see what you can do to eliminate or ameliorate your dependence on it. Depending on a third party for core functionality is never a good position to be in, as you’re relinquishing a lot of control to others who might not have your best interests in mind.

Lazy load non-essential third-party scripts

The best request is no request. If you have a third-party script that doesn’t need to be loaded right away, consider lazy loading it with an Intersection Observer. Here’s what it might look like to lazy load a Facebook Like button when it’s scrolled into the viewport:

let loadedFbScript = false; const intersectionListener = new IntersectionObserver(entries => { entries.forEach(entry => { if ((entry.isIntersecting || entry.intersectionRatio) && !loadedFbScript) { const scriptEl = document.createElement("script"); scriptEl.defer = true; scriptEl.crossOrigin = "anonymous"; scriptEl.src = ""; scriptEl.onload = () => { loadedFbScript = true; }; document.body.append(scriptEl); } }); }); intersectionListener.observe(document.querySelector(".fb-like"));

In the above snippet, we first set a variable to track whether we’ve loaded the Facebook SDK JavaScript. After that, an IntersectionListener is created that checks whether the observed element is in the viewport, and whether the Facebook SDK has been loaded. If the SDK JavaScript hasn’t been loaded, a reference to it is injected into the DOM, which will kick off a request for it.

You’re not going to be able to lazy load every third-party script. Some of them simply need to do their work at page load time, or otherwise can’t be deferred. Regardless, do the detective work to see if it’s possible to lazy load at least some of your third-party JavaScript.

One of the common concerns I hear from coworkers when I suggest lazy loading third-party scripts is how it can delay whatever interactions the third party provides. That’s a reasonable concern, because when you lazy load anything, a noticeable delay may occur as the resource loads. You can get around this to some extent with resource prefetching. This is different than preloading, which we discussed earlier. Prefetching consumes a comparable amount of data, yes, but prefetched resources are given lower priority and are less likely to contend for bandwidth with critical resources.

Staying on top of the problem

Keeping an eye on your third-party JavaScript requires mindfulness bordering on hypervigilance. When you recognize poor performance for the technical debt that it truly is, you’ll naturally slip into a frame of mind where you’ll recognize and address it as you would any other kind of technical debt.

Staying on top of third parties is refactoring—a sort that requires you to periodically perform tasks such as cleaning up tag managers and A/B tests, consolidating third-party solutions, eliminating any that are no longer needed, and applying the coding techniques discussed above. Moreover, you’ll need to work with your team to address this technical debt on a cyclical basis. This kind of work can’t be automated, so yes, you’ll need to knuckle down and have face-to-face, synchronous conversations with actual people.

If you’re already in the habit of scheduling “cleanup sprints” on some interval, then that is the time and space for you to address performance-related technical debt, regardless of whether it involves third- or first-party code. There’s a time for feature development, but that time should not comprise the whole of your working hours. Development shops that focus only on feature development are destined to be wholly consumed by the technical debt that will inevitably result.

So it will come to pass that in the fourth and final installment of this series we’ll discuss what it means to do the hard work of using JavaScript responsibly in the context of process. Therein, we’ll explore what it takes to unite your organization under the banner of making your website faster and more accessible, and therefore more usable for everyone, everywhere.

Categories: Design

Everyday Information Architecture: Auditing for Structure

Thu, 04/18/2019 - 05:45

Just as we need to understand our content before we can recategorize it, we need to understand the system before we try to rebuild it.

Enter the structural audit: a review of the site focused solely on its menus, links, flows, and hierarchies. I know you thought we were done with audits back in Chapter 2, but hear me out! Structural audits have an important and singular purpose: to help us build a new sitemap.

This isn’t about recreating the intended sitemap—no, this is about experiencing the site the way users experience it. This audit is meant to track and record the structure of the site as it really works.

Setting up the template

First, we’re gonna need another spreadsheet. (Look, it is not my fault that spreadsheets are the perfect system for recording audit data. I don’t make the rules.)

Because this involves building a spreadsheet from scratch, I keep a “template” at the top of my audit files—rows that I can copy and paste into each new audit (Fig 4.1). It’s a color-coded outline key that helps me track my page hierarchy and my place in the auditing process. When auditing thousands of pages, it’s easy to get dizzyingly lost, particularly when coming back into the sheet after a break; the key helps me stay oriented, no matter how deep the rabbit hole.

Fig 4.1: I use a color-coded outline key to record page hierarchy as I move through the audit. Wait, how many circles did Dante write about? Color-coding

Color is the easiest, quickest way to convey page depth at a glance. The repetition of black text, white cells, and gray lines can have a numbing effect—too many rows of sameness, and your eyes glaze over. My coloring may result in a spreadsheet that looks like a twee box of macarons, but at least I know, instantly, where I am.

The exact colors don’t really matter, but I find that the familiar mental model of a rainbow helps with recognition—the cooler the row color, the deeper into the site I know I must be.

The nested rainbow of pages is great when you’re auditing neatly nested pages—but most websites color outside the lines (pun extremely intended) with their structure. I leave my orderly rainbow behind to capture duplicate pages, circular links, external navigation, and other inconsistencies like:

  • On-page navigation. A bright text color denotes pages that are accessible via links within page content—not through the navigation. These pages are critical to site structure but are easily overlooked. Not every page needs to be displayed in the navigation menus, of course—news articles are a perfect example—but sometimes this indicates publishing errors.
  • External links. These are navigation links that go to pages outside the domain. They might be social media pages, or even sites held by the same company—but if the domain isn’t the one I’m auditing, I don’t need to follow it. I do need to note its existence in my spreadsheet, so I color the text as the red flag that it is. (As a general rule, I steer clients away from placing external links in navigation, in order to maintain a consistent experience. If there’s a need to send users offsite, I’ll suggest using a contextual, on-page link.)
  • Files. This mostly refers to PDFs, but can include Word files, slide decks, or anything else that requires downloading. As with external links, I want to capture anything that might disrupt the in-site browsing experience. (My audits usually filter out PDFs, but for organizations that overuse them, I’ll audit them separately to show how much “website” content is locked inside.)
  • Unknown hierarchy. Every once in a while, there’s a page that doesn’t seem to belong anywhere—maybe it’s missing from the menu, while its URL suggests it belongs in one section and its navigation scheme suggests another. These pages need to be discussed with their owners to determine whether the content needs to be considered in the new site.
  • Crosslinks. These are navigation links for pages that canonically live in a different section of the site—in other words, they’re duplicates. This often happens in footer navigation, which may repeat the main navigation or surface links to deeper-but-important pages (like a Contact page or a privacy policy). I don’t want to record the same information about the page twice, but I do need to know where the crosslink is, so I can track different paths to the content. I color these cells gray so they don’t draw my attention.

Note that coloring every row (and indenting, as you’ll see in a moment) can be a tedious process—unless you rely on Excel’s formatting brush. That tool applies all the right styles in just two quick clicks.

Outlines and page IDs

Color-coding is half of my template; the other half is the outline, which is how I keep track of the structure itself. (No big deal, just the entire point of the spreadsheet.)

Every page in the site gets assigned an ID. You are assigning this number; it doesn’t correspond to anything but your own perception of the navigation. This number does three things for you:

  1. It associates pages with their place in the site hierarchy. Decimals indicate levels, so the page ID can be decoded as the page’s place in the system.
  2. It gives each page a unique identifier, so you can easily refer to a particular page—saying “2.4.1” is much clearer than “you know that one page in the fourth product category?”
  3. You can keep using the ID in other contexts, like your sitemap. Then, later, when your team decides to wireframe pages 1.1.1 and 7.0, you’ll all be working from the same understanding.

Let me be completely honest: things might get goofy sometimes with the decimal outline. There will come a day when you’ll find yourself casually typing out “,” and at that moment, a fellow auditor somewhere in the universe will ring a tiny gong for you.

In addition to the IDs, I indent each level, which reinforces both the numbers and the colors. Each level down—each digit in the ID, each change in color—gets one indentation.

I identify top-level pages with a single number: 1.0, 2.0, 3.0, etc. The next page level in the first section would be 1.1, 1.2, 1.3, and so on. I mark the homepage as 0.0, which is mildly controversial—the homepage is technically a level above—but, look: I’ve got a lot of numbers to write, and I don’t need those numbers to tell me they’re under the homepage, so this is my system. Feel free to use the numbering system that work best for you.

Criteria and columns

So we’ve got some secret codes for tracking hierarchy and depth, but what about other structural criteria? What are our spreadsheet columns (Fig 4.2)? In addition to a column for Page ID, here’s what I cover:

  • URL. I don’t consistently fill out this column, because I already collected this data back in my automated audit. I include it every twenty entries or so (and on crosslinks or pages with unknown hierarchy) as another way of tracking progress, and as a direct link into the site itself.
  • Menu label/link. I include this column only if I notice a lot of mismatches between links, labels, and page names. Perfect agreement isn’t required; but frequent, significant differences between the language that leads to a page and the language on the page itself may indicate inconsistencies in editorial approach or backend structures.
  • Name/headline. Think of this as “what does the page owner call it?” It may be the H1, or an H2; it may match the link that brought you here, or the page title in the browser, or it may not.
  • Page title. This is for the name of the page in the metadata. Again, I don’t use this in every audit—particularly if the site uses the same long, branded metadata title for every single page—but frequent mismatches can be useful to track.
  • Section. While the template can indicate your level, it can’t tell you which area of the site you’re in—unless you write it down. (This may differ from the section data you applied to your automated audit, taken from the URL structure; here, you’re noting the section where the page appears.)
  • Notes. Finally, I keep a column to note specific challenges, and to track patterns I’m seeing across multiple pages—things like “Different template, missing subnav” or “Only visible from previous page.” My only caution here is that if you’re planning to share this audit with another person, make sure your notes are—ahem—professional. Unless you enjoy anxiously combing through hundreds of entries to revise comments like “Wow haha nope” (not that I would know anything about that).
Fig 4.2: A semi-complete structural audit. This view shows a lot of second- and third-level pages, as well as pages accessed through on-page navigation.

Depending on your project needs, there may be other columns, too. If, in addition to using this spreadsheet for your new sitemap, you want to use it in migration planning or template mapping, you may want columns for new URLs, or template types. 

You can get your own copy of my template as a downloadable Excel file. Feel free to tweak it to suit your style and needs; I know I always do. As long as your spreadsheet helps you understand the hierarchy and structure of your website, you’re good to go.

Gathering data

Setting up the template is one thing—actually filling it out is, admittedly, another. So how do we go from a shiny, new, naive spreadsheet to a complete, jaded, seen-some-stuff spreadsheet? I always liked Erin Kissane’s description of the process, from The Elements of Content Strategy:

Big inventories involve a lot of black coffee, a few late nights, and a playlist of questionable but cheering music prominently featuring the soundtrack of object-collecting video game Katamari Damacy. It takes quite a while to exhaustively inventory a large site, but it’s the only way to really understand what you have to work with.

We’re not talking about the same kind of exhaustive inventory she was describing (though I am recommending Katamari music). But even our less intensive approach is going to require your butt in a seat, your eyes on a screen, and a certain amount of patience and focus. You’re about to walk, with your fingers, through most of a website.

Start on the homepage. (We know that not all users start there, but we’ve got to have some kind of order to this process or we’ll never get through it.) Explore the main navigation before moving on to secondary navigation structures. Move left to right, top to bottom (assuming that is your language direction) over each page, looking for the links. You want to record every page you can reasonably access on the site, noting navigational and structural considerations as you go.

My advice as you work:

  • Use two monitors. I struggle immensely without two screens in this process, which involves constantly switching between spreadsheet and browser in rapid, tennis-match-like succession. If you don’t have access to multiple monitors, find whatever way is easiest for you to quickly flip between applications.
  • Record what you see. I generally note all visible menu links at the same level, then exhaust one section at a time. Sometimes this means I have to adjust what I initially observed, or backtrack to pages I missed earlier. You might prefer to record all data across a level before going deeper, and that would work, too. Just be consistent to minimize missed links.
  • Be alert to inconsistencies. On-page links, external links, and crosslinks can tell you a lot about the structure of the site, but they’re easy to overlook. Missed on-page links mean missed content; missed crosslinks mean duplicate work. (Note: the further you get into the site, the more you’ll start seeing crosslinks, given all the pages you’ve already recorded.)
  • Stick to what’s structurally relevant. A single file that’s not part of a larger pattern of file use is not going to change your understanding of the structure. Neither is recording every single blog post, quarterly newsletter, or news story in the archive. For content that’s dynamic, repeatable, and plentiful, I use an x in the page ID to denote more of the same. For example, a news archive with a page ID of 2.8 might show just one entry beneath it as 2.8.x; I don’t need to record every page up to 2.8.791 to understand that there are 791 articles on the site (assuming I noted that fact in an earlier content review).
  • Save. Save frequently. I cannot even begin to speak of the unfathomable heartbreak that is Microsoft Excel burning an unsaved audit to the ground.  

Knowing which links to follow, which to record, and how best to untangle structural confusion—that improves with time and experience. Performing structural audits will not only teach you about your current site, but will help you develop fluency in systems thinking—a boon when it comes time to document the new site.

Categories: Design

Nothing Fails Like Success

Thu, 04/11/2019 - 02:30

A family buys a house they can’t afford. They can’t make their monthly mortgage payments, so they borrow money from the Mob. Now they’re in debt to the bank and the Mob, live in fear of losing their home, and must do whatever their creditors tell them to do.

Welcome to the internet, 2019.

Buying something you can’t afford, and borrowing from organizations that don’t have your (or your customers’) best interest at heart, is the business plan of most internet startups. It’s why our digital services and social networks in 2019 are a garbage fire of lies, distortions, hate speech, tribalism, privacy violations, snake oil, dangerous idiocy, deflected responsibility, and whole new categories of unpunished ethical breaches and crimes.

From optimistically conceived origins and message statements about making the world a better place, too many websites and startups have become the leading edge of bias and trauma, especially for marginalized and at-risk groups.

Why (almost) everything sucks

Twitter, for instance, needs a lot of views for advertising to pay at the massive scale its investors demand. A lot of views means you can’t be too picky about what people share. If it’s misogynists or racists inspiring others who share their heinous beliefs to bring back the 1930s, hey, it’s measurable. If a powerful elected official’s out-of-control tweeting reduces churn and increases views, not only can you pay your investors, you can even take home a bonus. Maybe it can pay for that next meditation retreat.

You can cloak this basic economic trade-off in fifty layers of bullshit—say you believe in freedom of speech, or that the antidote to bad speech is more speech—but the fact is, hate speech is profitable. It’s killing our society and our planet, but it’s profitable. And the remaining makers of Twitter—the ones whose consciences didn’t send them packing years ago—no longer have a choice. The guy from the Mob is on his way over, and the vig is due.

Not to single out Twitter, but this is clearly the root cause of its seeming indifference to the destruction hate speech is doing to society…and will ultimately do to the platform. (But by then Jack will be able to afford to meditate full-time.)

Other companies do other evil things to pay their vig. When you owe the Mob, you have no choice. Like sell our data. Or lie about medical research.

There are internet companies (like Basecamp, or like Automattic, makers of, where I work) that charge money for their products and services, and use that money to grow their business. I wish more internet companies could follow that model, but it’s hard to retrofit a legitimate business model to a product that started its life as free.

And there are even some high-end news publications, such as The New York Times, The Washington Post, and The Guardian, that survive on a combination of advertising and flexible paywalls. But these options are not available to most digital publications and businesses.

Return with me to those Halcyon days…

Websites and internet startups used to be you and your friends making cool stuff for your other friends, and maybe building new friendships and even small communities in the process. (Even in 2019, that’s still how some websites and startups begin—as labors of love, fashioned by idealists in their spare time.)

Because they are labors of love; because we’ve spent 25 years training people to believe that websites, and news, and apps, and services should be free; because, when we begin a project, we can scarcely believe anyone will ever notice or care about it—for these reasons and more, the things we make digitally, especially on the web, are offered free of charge. We labor on, excited by positive feedback, and delighted to discover that, if we keep at it, our little community will grow.

Most such labors of love disappear after a year or two, as the creators drift out of touch with each other, get “real” jobs, fall in love, start families, or simply lose interest due to lack of attention from the public or the frustrations of spending weekends and holidays grinding away at an underappreciated site or app while their non-internet friends spend those same hours either having fun or earning money.

Along came money

But some of these startup projects catch on. And when they do, a certain class of investor smells ROI. And the naive cofounders, who never expected their product or service to really get anywhere, can suddenly envision themselves rich and Zuckerberg-famous. Or maybe they like the idea of quitting their day job, believing in themselves, and really going for it. After all, that is an empowering and righteous vision.

Maybe they believe that by taking the initial investment, they can do more good—that their product, if developed further, can actually help people. This is often the motivation behind agreeing to an initial investment deal, especially in categories like healthcare.

Or maybe the founders are problem solvers. Existing products or services in a given category have a big weakness. The problem solvers are sure that their idea is better. With enough capital, and a slightly bigger team, they can show the world how to do it right. Most inventions that have moved humankind forward followed exactly this path. It should lead to a better world (and it sometimes does). It shouldn’t produce privacy breaches and fake medicine and election-influencing bots and all the other plagues of our emerging digital civilization. So why does it?

Content wants to be paid

Primarily it is because these businesses have no business model. They were made and given away free. Now investors come along who can pay the founders, buy them an office, give them the money to staff up, and even help with PR and advertising to help them grow faster.

Now there are salaries and insurance and taxes and office space and travel and lecture tours and sales booths at SXSW, but there is still no charge for the product.

And the investor seeks a big return.

And when the initial investment is no longer enough to get the free-product company to scale to the big leagues, that’s when the really big investors come in with the really big bucks. And the company is suddenly famous overnight, and “everybody” is using the product, and it’s still free, and the investors are still expecting a giant payday.

Like I said—a house you can’t afford, so you go into debt to the bank and the Mob.

The money trap

Here it would be easy to blame capitalism, or at least untrammeled, under-regulated capitalism, which has often been a source of human suffering—not that capitalism, properly regulated, can’t also be a force for innovation which ameliorates suffering. That’s the dilemma for our society, and where you come down on free markets versus governmental regulation of businesses should be an intellectual decision, but these days it is a label, and we hate our neighbors for coming down a few degrees to the left or right of us. But I digress and oversimplify, and this isn’t a complaint about late stage capitalism per se, although it may smell like one.

No, the reason small companies created by idealists too frequently turn into consumer-defrauding forces for evil has to do with the amount of profit each new phase of investor expects to receive, and how quickly they expect to receive it, and the fact that the products and services are still free. And you know what they say about free products.

Nothing fails like success

A friend who’s a serial entrepreneur has started maybe a dozen internet businesses over the span of his career. They’ve all met a need in the marketplace. As a consequence, they’ve all found customers, and they’ve all made a profit. Yet his investors are rarely happy.

“Most of my startups have the decency to fail in the first year,” one investor told him. My friend’s business was taking in several million dollars a year and was slowly growing in staff and customers. It was profitable. Just not obscenely so.

And internet investors don’t want a modest return on their investment. They want an obscene profit right away, or a brutal loss, which they can write off their taxes. Making them a hundred million for the ten million they lent you is good. Losing their ten million is also good—they pay a lower tax bill that way, or they use the loss to fold a company, or they make a profit on the furniture while writing off the business as a loss…whatever rich people can legally do under our tax system, which is quite a lot.

What these folks don’t want is to lend you ten million dollars and get twelve million back.

You and I might go, “Wow! I just made two million dollars just for being privileged enough to have money to lend somebody else.” And that’s why you and I will never have ten million dollars to lend anybody. Because we would be grateful for it. And we would see a free two million dollars as a life-changing gift from God. But investors don’t think this way.

We didn’t start the fire, but we roasted our weenies in it

As much as we pretend to be a religious nation, our society worships these investors and their profits, worships companies that turn these profits, worships above all the myth of overnight success, which we use to motivate the hundreds of thousands of workers who will work nights and weekends for the owners in hopes of cashing in when the stock goes big.

Most times, even if the stock does go big, the owner has found a way to devalue it by the time it does. Owners have brilliant advisers they pay to figure out how to do those things. You and I don’t.

A Christmas memory

I remember visiting San Francisco years ago and scoring an invitation to Twitter’s Christmas party through a friend who worked there at the time. Twitter was, at the time, an app that worked via SMS and also via a website. Period.

Some third-party companies, starting with my friends at Iconfactory, had built iPhone apps for people who wanted to navigate Twitter via their newfangled iPhones instead of the web. Twitter itself hadn’t publicly addressed mobile and might not even have been thinking about it.

Although Twitter was transitioning from a fun cult thing—used by bloggers who attended SXSW Interactive in 2007—to an emerging cultural phenomenon, it was still quite basic in its interface and limited in its abilities. Which was not a bad thing. There is art in constraint, value in doing one thing well. As an outsider, if I’d thought about it, I would have guessed that Twitter’s entire team consisted of no more than 10 or 12 wild-eyed, sleep-deprived true believers.

Imagine my surprise, then, when I showed up at the Christmas party and discovered I’d be sharing dinner with hundreds of designers, developers, salespeople, and executives instead of the handful I’d naively anticipated meeting. (By now, of course, Twitter employs many thousands. It’s still not clear to an outsider why so many workers are needed.)

But one thing is clear: somebody has to pay for it all.

Freemium isn’t free

Employees, let alone thousands of them, on inflated Silicon Valley engineer salaries, aren’t free. Health insurance and parking and meals and HR and travel and expense accounts and meetups and software and hardware and office space and amenities aren’t free. Paying for all that while striving to repay investors tenfold means making a buck any way you can.

Since the product was born free and a paywall isn’t feasible, Twitter must rely on that old standby: advertising. Advertising may not generate enough revenue to keep your hometown newspaper (or most podcasts and content sites) in business, but at Twitter’s scale, it pays.

It pays because Twitter has so many active users. And what keeps those users coming back? Too often, it’s the dopamine of relentless tribalism—folks whose political beliefs match and reinforce mine in a constant unwinnable war of words with folks whose beliefs differ.

Of course, half the antagonists in a given brawl may be bots, paid for in secret by an organization that wants to make it appear that most citizens are against Net Neutrality, or that most Americans oppose even the most basic gun laws, or that our elected officials work for lizard people. The whole system is broken and dangerous, but it’s also addictive, and we can’t look away. From our naive belief that content wants to be free, and our inability to create businesses that pay for themselves, we are turning our era’s greatest inventions into engines of doom and despair.

Your turn

So here we are. Now what do we do about it?

It’s too late for current internet businesses (victims of their own success) that are mortgaged to the hilt in investor gelt. But could the next generation of internet startups learn from older, stable companies like Basecamp, and design products that pay for themselves via customer income—products that profit slowly and sustainably, allowing them to scale up in a similarly slow, sustainable fashion?

The self-payment model may not work for apps and sites that are designed as modest amusements or communities, but maybe those kinds of startups don’t need to make a buck—maybe they can simply be labors of love, like the websites we loved in the 1990s and early 2000s.

Along those same lines, can the IndieWeb, and products of IndieWeb thinking like, save us? Might they at least provide an alternative to the toxic aspects of our current social web, and restore the ownership of our data and content? And before you answer, RTFM.

On an individual and small collective basis, the IndieWeb already works. But does an IndieWeb approach scale to the general public? If it doesn’t scale yet, can we, who envision and design and build, create a new generation of tools that will help give birth to a flourishing, independent web? One that is as accessible to ordinary internet users as Twitter and Facebook and Instagram? Tantek Çelik thinks so, and he’s been right about the web for nearly 30 years. (For more about what Tantek thinks, listen to our conversation in Episode № 186 of The Big Web Show.)
Are these approaches mere whistling against a hurricane? Are most web and internet users content with how things are? What do you think? Share your thoughts on your personal website (dust yours off!) or (irony ahoy!) on your indie or mainstream social networks of choice using hashtag #LetsFixThis. I can’t wait to see what you have to say.

Categories: Design

Accessibility for Vestibular Disorders: How My Temporary Disability Changed My Perspective

Wed, 04/03/2019 - 18:55

Accessibility can be tricky. There are plenty of conditions to take into consideration, and many technical limitations and weird exceptions that make it quite hard to master for most designers and developers.

I never considered myself an accessibility expert, but I took great pride in making my projects Web Content Accessibility Guidelines (WCAG) compliant…ish. They would pass most automated tests, show perfectly in the accessibility tree, and work quite well with keyboard navigation. I would even try (and fail) to use a screen reader every now and then.

But life would give me a lesson I would probably never learn otherwise: last October, my abled life took a drastic change—I started to feel extremely dizzy, with a constant sensation of falling or spinning to the right. I was suffering from a bad case of vertigo caused by labyrinthitis that made it impossible to get anything done.

Vertigo can have a wide range of causes, the most common being a viral infection or tiny calcium crystal free floating in the inner ear, which is pretty much our body’s accelerometer. Any disruption in there sends the brain confusing signals about the body’s position, which causes really heavy nausea, dizziness, and headaches. If you’ve ever felt seasick, it’s quite a similar vibe. If not, think about that feeling when you just get off a rollercoaster…it’s like that, only all day long.

For most people, vertigo is something they’ll suffer just once in a lifetime, and it normally goes away in a week or two. Incidence is really high, with some estimates claiming that up to 40% of the population suffers vertigo at least once in their lifetime. Some people live all their lives with it (or with similar symptoms caused by a range of diseases and syndromes grouped under the umbrella term of vestibular disorders), with 4% of US adults reporting chronic problems with balance, and an additional 1.1% reporting chronic dizziness, according to the American Speech-Language-Hearing Association.

In my case, it was a little over a month. Here’s what I learned while going through it.

Slants can trigger vestibular symptoms

It all started as I was out for my daily jog. I felt slightly dizzy, then suddenly my vision got totally distorted. Everything appeared further away, like looking at a fun house’s distortion mirror. I stumbled back home and rested; at that moment I believed I might have over-exercised, and that hydration, food, and rest were all I needed. Time would prove me wrong.

What I later learned was that experiencing vertigo is a constant war between one of your inner ears telling the brain “everything is fine, we’re level and still” and the other ear shouting “oh my God, we’re falling, we’re falling!!!” Visual stimuli can act as an intermediary, supporting one ear’s message or the other’s. Vertigo can also work in the opposite way, with the dizziness interfering with your vision.

I quickly found that when symptoms peaked, staring at a distant object would ease the falling sensation somewhat.

In the same fashion, some visual stimuli would worsen it.

Vertical slants were a big offender in that sense. For instance, looking at a subtle vertical slant (the kind that you’d have to look at twice to make sure it’s not perfectly vertical) on a webpage would instantly trigger symptoms for me. Whether it was a page-long slant used to create some interest beside text or a tiny decoration to mark active tabs, looking at anything with slight slants would instantly send me into the rollercoaster.

Horizontal slants (whatever the degree) and harder vertical slants wouldn’t cause these issues.

My best guess is that slight vertical slants can look like forced perspective and therefore reinforce the falling-from-height sensation, so I would recommend avoiding vertical slants if you can, or make them super obvious. A slight slant looks like perspective, a harder one looks like a triangle.

Target size matters (even on mouse-assisted devices)

After a magnetic resonance imaging (MRI) scan, some tests to discard neurological conditions, and other treatments that proved ineffective, I was prescribed Cinnarizine.

Cinnarizine is a calcium channel blocker—to put it simply, it prevents the malfunctioning inner ear “accelerometer” from sending incorrect info to the brain. 
And it worked wonders. After ten days of being barely able to get out of bed, I was finally getting something closer to my normal life. I would still feel dizzy all the time, with some peaks throughout the day, but for the most part, it was much easier.

At this point, I was finally able to use the computer (but still unable to produce any code at all). To make the best of it, I set on a mission to self-experiment on accessibility for vestibular disorders. In testing, I found that one of the first things that struck me was that I would always miss targets (links and buttons).

I’m from the generation that grew up with desktop computers, so using a mouse is second nature. The pointer is pretty much an extension of my mind, as it is for many who use it regularly. But while Cinnarizine helped with the dizziness, it has a common side effect of negatively impacting coordination and fine motor skills (it is recommended not to drive or operate machinery while under treatment). It was not a surprise when I realized it would be much harder to get the pointer to do what I intended.

The common behavior would be: moving the pointer past the link I intended to click, clicking before reaching it at all, or having to try multiple times to click on smaller targets.

Success Criterion 2.5.5 Target Size (Level AAA) of the World Wide Web Consortium (W3C)’s WCAG recommends bigger target sizes so users can activate them easily. The obvious reason for this is that it’s harder to pinpoint targets on smaller screens with coarser inputs (i.e., touchscreens of mobile devices). A fairly common practice for developers is to set bigger target sizes for smaller viewport widths (assuming that control challenges are only touch-related), while neglecting the issue on big screens expected to be used with mouse input. I know I’m guilty of that myself.

Instead of targeting this behavior for just smaller screen sizes, there are plenty of reasons to create larger target sizes on all devices: it will benefit users with limited vision (when text is scaled up accordingly and colors are of sufficient contrast), users with mobility impairments such as hand tremors, and of course, users with difficulty with fine motor skills.

Font size and spacing

Even while “enjoying” the ease of symptoms provided by the treatment, reading anything still proved to be a challenge for the following three weeks.

I was completely unable to use mobile devices while suffering vertigo due to the smaller font sizes and spacing, so I was forced to use my desktop computer for everything.

I can say I was experiencing something similar to users with mild forms of dyslexia or attention disorders: whenever I got to a website that didn’t follow good font styling, I would find myself reading the same line over and over again.

This proves once again that accessibility is intersectional: when we improve things for a particular purpose it usually benefits users with other challenges as well. I used to believe recommendations on font styles were mostly intended for the nearsighted and those who have dyslexia. Turns out they are also critical for those with vertigo, and even for those with some cognitive differences. At the end of the day, everybody benefits from better readability.

Some actions you can take to improve readability are:

  • Keep line height to at least 1.5 times the font size (i.e., line-height: 1.5).
  • Set the spacing between paragraphs to at least 2.0 times the font size. We can do this by adjusting the margins using relative units such as em.
  • Letter spacing should be at least 0.12 times the font size. We can adjust this by using the letter-spacing CSS property, perhaps setting it in a relative unit.
  • Make sure to have good contrast between text and its background.
  • Keep font-weight at a reasonable level for the given font-family. Some fonts have thin strokes that make them harder to read. When using thinner fonts, try to improve contrast and font size accordingly, even more than what WCAG would suggest.
  • Choose fonts that are easy to read. There has been a large and still inconclusive debate on which font styles are better for users, but one thing I can say for sure is that popular fonts (as in fonts that the user might be already familiar with) are generally the least challenging for users with reading issues.

WCAG recommendations on text are fairly clear and fortunately are the most commonly implemented of recommendations, but even they can still fall short sometimes. So, better to follow specific guides on accessible text and your best judgement. Passing automated tests does not guarantee actual accessibility.

Another issue on which my experience with vertigo proved to be similar to that of people with dyslexia and attention disorders was how hard it was for me to keep my attention in just one place. In that sense…

Animations are bad (and parallax is pure evil)

Val Head has already covered visually-triggered vestibular disorders in an outstanding article, so I would recommend giving it a good read if you haven’t already.

To summarize, animations can trigger nausea, dizziness, and headaches in some users, so we should use them purposely and responsibly.

While most animations did not trigger my symptoms, parallax scrolling did. I’d never been a fan of parallax to begin with, as I found it confusing. And when you’re experiencing vertigo, the issues introduced by parallax scrolling compound.

Really, there are no words to describe just how bad a simple parallax effect, scrolljacking, or even background-attachment: fixed would make me feel. I would rather jump on one of those 20-G centrifuges astronauts use than look at a website with parallax scrolling.

Every time I encountered it, I would put the bucket beside me to good use and be forced to lie in bed for hours as I felt the room spinning around me, and no meds could get me out of it. It was THAT bad.

Though normal animations did not trigger a reaction as severe, they still posed a big problem. The extreme, conscious, focused effort it took to read would make it such that anything moving on the screen would instantly break my focus, and force me to start the paragraph all over. And I mean anything.

I would constantly find myself reading a website only to have the typical collapsing navigation bar on scroll distract me just enough that I’d totally lose count of where I was at. Autoplaying carousels were so annoying I would delete them using dev tools as soon as they showed up. Background videos would make me get out of the website desperately.

Over time I started using mouse selection as a pointer; a visual indication of what I’d already read so I could get back to it whenever something distracted me. Then I tried custom stylesheets to disable transforms and animations whenever possible, but that also meant many websites having critical elements not appear at all, as they were implemented to start off-screen or otherwise invisible, and show up on scroll.

Of course, deleting stuff via dev tools or using custom stylesheets is not something we can expect 99.99% of our users to even know about.

So if anything, consider reducing animations to a minimum. Provide users with controls to turn off non-essential animations (WCAG 2.2.3 Animation from Interactions) and to pause, stop, or hide them (WCAG 2.2.2 Pause, Stop, Hide). Implement animations and transitions in such a way that if the user disables them, critical elements still display.

And be extra careful with parallax: my recommendation is to, at the very least, try limiting its use to the header (“hero”) only, and be mindful of getting a smooth, realistic parallax experience. My vertigo self would have said, “just don’t freaking use parallax. Never. EVER.” But I guess that might be a hard idea to sell to stakeholders and designers.

Also consider learning how to use the prefers-reduced-motion feature query. This is a newer addition to the specs (it’s part of the Media Queries Level 5 module , which is at an early Editor’s Draft stage) that allows authors to apply selective styling depending on whether the user has requested the system to minimize the use of animations. OS and browser support for it is still quite limited, but the day will come when we will set any moving thing inside a query for when the user has no-preference, blocking animations from those who choose reduce.

After about a week of wrestling websites to provide a static experience, I remembered something that would prove to be my biggest ally while the vertigo lasted:

Reader mode

Some browsers include a “reader mode” that strips the content from any styling choices, isolates it from any distraction, and provides a perfect WCAG compliant layout for the text to maximize readability.

It is extremely helpful to provide a clear and consistent reading experience throughout multiple websites, especially for users with any kind of reading impairment.

I have to confess: before experiencing my vestibular disorder, I had never used Reader Mode (the formal name varies in browsers) or even checked if my projects were compatible with it. I didn’t even think it was such a useful feature, as a quick search for “reader mode” actually returned quite a few threads by users asking how to disable it or how to take the button for it out of Firefox’s address bar. (It seems some people are unwittingly activating it…perhaps the icon is not clear enough.)

Displaying the button to access Reader Mode is toggled by browser heuristics, which are based on the use (or not) of semantic tags in a page’s HTML. Unfortunately this meant not all websites provided such a “luxury.”

I really wish I wouldn’t have to say this in 2019…but please, please use semantic tags. Correct conversational semantics allow your website to be displayed in Reader Mode, and provide a better experience for users of screen readers. Again, accessibility is intersectional.

Reader Mode proved to be extremely useful while my vertigo lasted. But there was something even better:

Dark color schemes

By the fourth week, I started feeling mostly fine. I opened Visual Studio Code to try to get back to work. In doing so, it served me well to find one more revelation: a light-text-on-dark-background scheme was SO much easier for me to read. (Though I still was not able to return to work at this time.)

I was quite surprised, as I had always preferred light mode with dark-text-on-light-background for reading, and dark mode, with light-text-on-dark for coding. I didn’t know at the time that I was suffering from photophobia (which is a sensitivity to light), which was one of the reasons I found it hard to read on my desktop and to use my mobile device at all.

As far as I know, photophobia is not a common symptom of vestibular disorders, but there are many conditions that will trigger it, so it’s worth looking into for our projects’ accessibility.

CSS is also planning a media query to switch color schemes. Known as prefers-color-scheme, it allows applying styles based on the user’s stated preference for dark or light theming. It’s also part of the Media Queries Level 5 spec, and at the time of writing this article it’s only available in Safari Technology Preview, with Mozilla planning to ship it in the upcoming Firefox 67. Luckily there’s a PostCSS plugin that allows us to use it in most modern browsers by turning prefers-color-schemequeries into color-index queries, which have much better support.

If PostCSS is not your cup of tea, or for whatever reason you cannot use that approach to automate switching color schemes to a user’s preference, try at least to provide a theming option in your app’s configuration. Theming has become extremely simple since the release of CSS Custom Properties, so implementing this sort of switch is relatively easy and will greatly benefit anyone experiencing photophobia.

Moving on

After a month and some days, the vertigo disappeared completely, and I was able to return to work without needing any meds or further treatment. It should stay that way, as for most people it’s a once-in-a-lifetime occurrence.

I went back to my abled life, but the experience changed my mindset for good.

As I said before, I always cared for making my projects compatible for people using keyboard navigation and screen readers. But I learned the hard way that there are plenty of “invisible conditions” that are just as important to take into consideration: vestibular disorders, cognitive differences, dyslexia, and color blindness, just to name a few. I was totally neglecting those most of the time, barely addressing the issues in order to pass automated tests, which means I was unintentionally annoying some users by making websites inaccessible to them.

After my experience with vertigo, I’ve turned to an accessibility-first approach to design and development. Now I ask myself, “am I leaving anyone behind with this decision?,” before dropping a single line of code. Accessibility should never be an afterthought.

Making sure my projects work from the start for those with difficulties also improves the experience for everyone else. Think about how improving text styles for users with dyslexia, vertigo, or visual problems improves readability for all users, or how being able to control animations or choose a color scheme can be critical for users with attention disorders and photophobia, respectively, while also a nice feature for everybody.

It also turned my workflow into a much smoother development experience, as addressing accessibility issues from the beginning can mean a slower start, but it’s also much easier and faster than trying to fix broken accessibility afterwards.

I hope that by sharing my personal experience with vertigo, I’ve illustrated how we can all design and develop a better web for everybody. Remember, we’re all just temporarily abled.

Categories: Design

Responsible JavaScript: Part I

Thu, 03/28/2019 - 02:07

By the numbers, JavaScript is a performance liability. If the trend persists, the median page will be shipping at least 400 KB of it before too long, and that’s merely what’s transferred. Like other text-based resources, JavaScript is almost always served compressed—but that might be the only thing we’re getting consistently right in its delivery.

Unfortunately, while reducing resource transfer time is a big part of that whole performance thing, compression has no effect on how long browsers take to process a script once it arrives in its entirety. If a server sends 400 KB of compressed JavaScript, the actual amount browsers have to process after decompression is north of a megabyte. How well devices cope with these heavy workloads depends, well, on the deviceMuch has been written about how adept various devices are at processing lots of JavaScript, but the truth is, the amount of time it takes to process even a trivial amount of it varies greatly between devices.

Take, for example, this throwaway project of mine, which serves around 23 KB of uncompressed JavaScript. On a mid-2017 MacBook Pro, Chrome chews through this comparably tiny payload in about 25 ms. On a Nokia 2 Android phone, however, that figure balloons to around 190 ms. That’s not an insignificant amount of time, but in either case, the page gets interactive reasonably fast.

Now for the big question: how do you think that little Nokia 2 does on an average page? It chokes. Even on a fast connection, browsing the web on it is an exercise in patience as JavaScript-laden web pages brick it for considerable stretches of time.

Figure 1. A performance timeline overview of a Nokia 2 Android phone browsing on a page where excessive JavaScript monopolizes the main thread.

While devices and the networks they navigate the web on are largely improving, we’re eating those gains as trends suggest. We need to use JavaScript responsibly. That begins with understanding what we’re building as well as how we’re building it.

The mindset of “sites” versus “apps”

Nomenclature can be strange in that we sometimes loosely identify things with terms that are inaccurate, yet their meanings are implicitly understood by everyone. Sometimes we overload the term “bee” to also mean “wasp”, even though the differences between bees and wasps are substantial. Those differences can motivate you to deal with each one differently. For instance, we’ll want to destroy a wasp nest, but because bees are highly beneficial and vulnerable insects, we may opt to relocate them.

We can be just as fast and loose in interchanging the terms “website” and “web app”. The differences between them are less clear than those between yellowjackets and honeybees, but conflating them can bring about painful outcomes. The pain comes in the affordances we allow ourselves when something is merely a “website” versus a fully-featured “web app.” If you’re making an informational website for a business, you’re less likely to lean on a powerful framework to manage changes in the DOM or implement client-side routing—at least, I hope. Using tools so ill-suited for the task would not only be a detriment to the people who use that site but arguably less productive.

When we build a web app, though, look out. We’re installing packages which usher in hundreds—if not thousands—of dependencies, some of which we’re not sure are even safe. We’re also writing complicated configurations for module bundlers. In this frenzied, yet ubiquitous, sort of dev environment, it takes knowledge and vigilance to ensure what gets built is fast and accessible. If you doubt this, run npm ls --prod in your project’s root directory and see if you recognize everything in that list. Even if you do, that doesn’t account for third party scripts—of which I’m sure your site has at least a few.

What we tend to forget is that the environment websites and web apps occupy is one and the same. Both are subject to the same environmental pressures that the large gradient of networks and devices impose. Those constraints don’t suddenly vanish when we decide to call what we build “apps”, nor do our users’ phones gain magical new powers when we do so.

It’s our responsibility to evaluate who uses what we make, and accept that the conditions under which they access the internet can be different than what we’ve assumed. We need to know the purpose we’re trying to serve, and only then can we build something that admirably serves that purpose—even if it isn’t exciting to build.

That means reassessing our reliance on JavaScript and how the use of it—particularly to the exclusion of HTML and CSS—can tempt us to adopt unsustainable patterns which harm performance and accessibility.

Don’t let frameworks force you into unsustainable patterns

I’ve been witness to some strange discoveries in codebases when working with teams that depend on frameworks to help them be highly productive. One characteristic common among many of them is that poor accessibility and performance patterns often result. Take the React component below, for example:

import React, { Component } from "react"; import { validateEmail } from "helpers/validation"; class SignupForm extends Component { constructor (props) { super(props); this.handleSubmit = this.handleSubmit.bind(this); this.updateEmail = this.updateEmail.bind(this); = ""; } updateEmail (event) { this.setState({ email: }); } handleSubmit () { // If the email checks out, submit if (validateEmail( { // ... } } render () { return ( Enter your email: Sign Up ); } }

There are some notable accessibility issues here:

  1. A form that doesn’t use a <form> element is not a form. Indeed, you could paper over this by specifying role="form" in the parent <div>, but if you’re building a form—and this sure looks like one—use a <form> element with the proper action and method attributes. The action attribute is crucial, as it ensures the form will still do something in the absence of JavaScript—provided the component is server-rendered, of course.
  2. A <span> is not a substitute for a <label> element, which provides accessibility benefits <span>s don’t.
  3. If we intend to do something on the client side prior to submitting a form, then we should move the action bound to the <button> element's onClick handler to the <form> element’s onSubmit handler.
  4. Incidentally, why use JavaScript to validate an email address when HTML5 offers form validation controls in almost every browser back to IE 10? There’s an opportunity here to rely on the browser and use an appropriate input type, as well as the required attribute—but be aware that getting this to work right with screen readers takes a little know-how.
  5. While not an accessibility issue, this component doesn't rely on any state or lifecycle methods, which means it can be refactored into a stateless functional component, which uses considerably less JavaScript than a full-fledged React component.

Knowing these things, we can refactor this component:

import React from "react"; const SignupForm = props => { const handleSubmit = event => { // Needed in case we're sending data to the server XHR-style // (but will still work if server-rendered with JS disabled). event.preventDefault(); // Carry on... }; return ( <form method="POST" action="/signup" onSubmit={handleSubmit}> <label for="email" class="email-label">Enter your email:</label> <input type="email" id="email" required /> <button>Sign Up</button> </form> ); };

Not only is this component now more accessible, but it also uses less JavaScript. In a world that’s drowning in JavaScript, deleting lines of it should feel downright therapeutic. The browser gives us so much for free, and we should try to take advantage of that as often as possible.

This is not to say that inaccessible patterns occur only when frameworks are used, but rather that a sole preference for JavaScript will eventually surface gaps in our understanding of HTML and CSS. These knowledge gaps will often result in mistakes we may not even be aware of. Frameworks can be useful tools that increase our productivity, but continuing education in core web technologies is essential to creating usable experiences, no matter what tools we choose to use.

Rely on the web platform and you’ll go far, fast

While we’re on the subject of frameworks, it must be said that the web platform is a formidable framework of its own. As the previous section showed, we’re better off when we can rely on established markup patterns and browser features. The alternative is to reinvent them, and invite all the pain such endeavors all but guarantee us, or worse: merely assume that the author of every JavaScript package we install has solved the problem comprehensively and thoughtfully.


One of the tradeoffs developers are quick to make is to adopt the single page application (SPA) model, even if it’s not a fit for the project. Yes, you do gain better perceived performance with the client-side routing of an SPA, but what do you lose? The browser’s own navigation functionality—albeit synchronous—provides a slew of benefits. For one, history is managed according to a complex specification. Users without JavaScript—be it by their own choice or not—won’t lose access altogether. For SPAs to remain available when JavaScript is not, server-side rendering suddenly becomes a thing you have to consider.

Figure 2. A comparison of an example app loading on a slow connection. The app on the left depends entirely upon JavaScript to render a page. The app on the right renders a response on the server, but then uses client-side hydration to attach components to the existing server-rendered markup.

Accessibility is also harmed if a client-side router fails to let people know what content on the page has changed. This can leave those reliant on assistive technology to suss out what changes have occurred on the page, which can be an arduous task.

Then there’s our old nemesis: overhead. Some client-side routers are very small, but when you start with Reacta compatible router, and possibly even a state management library, you’re accepting that there’s a certain amount of code you can never optimize away—approximately 135 KB in this case. Carefully consider what you’re building and whether a client side router is worth the tradeoffs you’ll inevitably make. Typically, you’re better off without one.

If you’re concerned about the perceived navigation performance, you could lean on rel=prefetch to speculatively fetch documents on the same origin. This has a dramatic effect on improving perceived loading performance of pages, as the document is immediately available in the cache. Because prefetches are done at a low priority, they’re also less likely to contend with critical resources for bandwidth.

Figure 3. The HTML for the writing/ URL is prefetched on the initial page. When the writing/ URL is requested by the user, the HTML for it is loaded instantaneously from the browser cache.

The primary drawback with link prefetching is that you need to be aware that it can be potentially wasteful. Quicklink, a tiny link prefetching script from Google, mitigates this somewhat by checking if the current client is on a slow connection—or has data saver mode enabled—and avoids prefetching links on cross-origins by default.

Service workers are also hugely beneficial to perceived performance for returning users, whether we use client side routing or not—provided you know the ropesWhen we precache routes with a service worker, we get many of the same benefits as link prefetching, but with a much greater degree of control over requests and responses. Whether you think of your site as an “app” or not, adding a service worker to it is perhaps one of the most responsible uses of JavaScript that exists today.


If we’re installing a package to solve a layout problem, proceed with caution and ask “what am I trying to accomplish?” CSS is designed to do this job, and requires no abstractions to use effectively. Most layout issues JavaScript packages attempt to solve, like box placement, alignment, and sizingmanaging text overflow, and even entire layout systems, are solvable with CSS today. Modern layout engines like Flexbox and Grid are supported well enough that we shouldn’t need to start a project with any layout framework. CSS is the framework. When we have feature queries, progressively enhancing layouts to adopt new layout engines is suddenly not so hard.

/* Your mobile-first, non-CSS grid styles goes here */ /* The @supports rule below is ignored by browsers that don't support CSS grid, _or_ don't support @supports. */ @supports (display: grid) { /* Larger screen layout */ @media (min-width: 40em) { /* Your progressively enhanced grid layout styles go here */ } }

Using JavaScript solutions for layout and presentations problems is not new. It was something we did when we lied to ourselves in 2009 that every website had to look in IE6 exactly as it did in the more capable browsers of that time. If we’re still developing websites to look the same in every browser in 2019, we should reassess our development goals. There will always be some browser we’ll have to support that can’t do everything those modern, evergreen browsers can. Total visual parity on all platforms is not only a pursuit made in vain, it’s the principal foe of progressive enhancement.

I’m not here to kill JavaScript

Make no mistake, I have no ill will toward JavaScript. It’s given me a career and—if I’m being honest with myself—a source of enjoyment for over a decade. Like any long-term relationship, I learn more about it the more time I spend with it. It’s a mature, feature-rich language that only gets more capable and elegant with every passing year.

Yet, there are times when I feel like JavaScript and I are at odds. I am critical of JavaScript. Or maybe more accurately, I’m critical of how we’ve developed a tendency to view it as a first resort to building for the web. As I pick apart yet another bundle not unlike a tangled ball of Christmas tree lights, it’s become clear that the web is drunk on JavaScript. We reach for it for almost everything, even when the occasion doesn’t call for it. Sometimes I wonder how vicious the hangover will be.

In a series of articles to follow, I’ll be giving more practical advice to follow to stem the encroaching tide of excessive JavaScript and how we can wrangle it so that what we build for the web is usable—or at least more so—for everyone everywhere. Some of the advice will be preventative. Some will be mitigating “hair of the dog” measures. In either case, the outcomes will hopefully be the same. I believe that we all love the web and want to do right by it, but I want us to think about how to make it more resilient and inclusive for all.

Categories: Design

Canary in a Coal Mine: How Tech Provides Platforms for Hate

Tue, 03/19/2019 - 02:22

As I write this, the world is sending its thoughts and prayers to our Muslim cousins. The Christchurch act of terrorism has once again reminded the world that white supremacy’s rise is very real, that its perpetrators are no longer on the fringes of society, but centered in our holiest places of worship. People are begging us to not share videos of the mass murder or the hateful manifesto that the white supremacist terrorist wrote. That’s what he wants: for his proverbial message of hate to be spread to the ends of the earth.

We live in a time where you can stream a mass murder and hate crime from the comfort of your home. Children can access these videos, too.

As I work through the pure pain, unsurprised, observing the toll on Muslim communities (as a non-Muslim, who matters least in this event), I think of the imperative role that our industry plays in this story.

At time of writing, YouTube has failed to ban and to remove this video. If you search for the video (which I strongly advise against), it still comes up with a mere content warning; the same content warning that appears for casually risqué content. You can bypass the warning and watch people get murdered. Even when the video gets flagged and taken down, new ones get uploaded.

Human moderators have to relive watching this trauma over and over again for unlivable wages. News outlets are embedding the video into their articles and publishing the hateful manifesto. Why? What does this accomplish?

I was taught in journalism class that media (photos, video, infographics, etc.) should be additive (a progressive enhancement, if you will) and provide something to the story for the reader that words cannot.

Is it necessary to show murder for our dear readers to understand the cruelty and finality of it? Do readers gain something more from watching fellow humans have their lives stolen from them? What psychological damage are we inflicting upon millions of people   and for what?

Who benefits?

The mass shooter(s) who had a message to accompany their mass murder. News outlets are thirsty for perverse clicks to garner more ad revenue. We, by way of our platforms, give agency and credence to these acts of violence, then pilfer profits from them. Tech is a money-making accomplice to these hate crimes.

Christchurch is just one example in an endless array where the tools and products we create are used as a vehicle for harm and for hate.

Facebook and the Cambridge Analytica scandal played a critical role in the outcome of the 2016 presidential election. The concept of “race realism,” which is essentially a term that white supremacists use to codify their false racist pseudo-science, was actively tested on Facebook’s platform to see how the term would sit with people who are ignorantly sitting on the fringes of white supremacy. Full-blown white supremacists don’t need this soft language. This is how radicalization works.

The strategies articulated in the above article are not new. Racist propaganda predates social media platforms. What we have to be mindful with is that we’re building smarter tools with power we don’t yet fully understand: you can now have an AI-generated human face. Our technology is accelerating at a frightening rate, a rate faster than our reflective understanding of its impact.

Combine the time-tested methods of spreading white supremacy, the power to manipulate perception through technology, and the magnitude and reach that has become democratized and anonymized.

We’re staring at our own reflection in the Black Mirror.

The right to speak versus the right to survive

Tech has proven time and time again that it voraciously protects first amendment rights above all else. (I will also take this opportunity to remind you that the first amendment of the United States offers protection to the people from the government abolishing free speech, not from private money-making corporations).

Evelyn Beatrice Hall writes in The Friends of Voltaire, “I disapprove of what you say, but I will defend to the death your right to say it.” Fundamentally, Hall’s quote expresses that we must protect, possibly above all other freedoms, the freedom to say whatever we want to say. (Fun fact: The quote is often misattributed to Voltaire, but Hall actually wrote it to explain Voltaire’s ideologies.)

And the logical anchor here is sound: We must grant everyone else the same rights that we would like for ourselves. Former 99u editor Sean Blanda wrote a thoughtful piece on the “Other Side,” where he posits that we lack tolerance for people who don’t think like us, but that we must because we might one day be on the other side. I agree in theory.

But, what happens when a portion of the rights we grant to one group (let’s say, free speech to white supremacists) means the active oppression another group’s right (let’s say, every person of color’s right to live)?

James Baldwin expresses this idea with a clause, “We can disagree and still love each other unless your disagreement is rooted in my oppression and denial of my humanity and right to exist.”

It would seem that we have a moral quandary where two sets of rights cannot coexist. Do we protect the privilege for all users to say what they want, or do we protect all users from hate? Because of this perceived moral quandary, tech has often opted out of this conversation altogether. Platforms like Twitter and Facebook, two of the biggest offenders, continue to allow hate speech to ensue with irregular to no regulation.

When explicitly asked about his platform as a free-speech platform and its consequence to privacy and safety, Twitter CEO Jack Dorsey said,

“So we believe that we can only serve the public conversation, we can only stand for freedom of expression if people feel safe to express themselves in the first place. We can only do that if they feel that they are not being silenced.”

Dorsey and Twitter are most concerned about protecting expression and about not silencing people. In his mind, if he allows people to say whatever they want on his platform, he has succeeded. When asked about why he’s failed to implement AI to filter abuse like, say, Instagram had implemented, he said that he’s most concerned about being able to explain why the AI flagged something as abusive. Again, Dorsey protects the freedom of speech (and thus, the perpetrators of abuse) before the victims of abuse.

But he’s inconsistent about it. In a study by George Washington University comparing white nationalists and ISIS social media usage, Twitter’s freedom of speech was not granted to ISIS. Twitter suspended 1,100 accounts related to ISIS whereas it suspended only seven accounts related to Nazis, white nationalism, and white supremacy, despite the accounts having more than seven times the followers, and tweeting 25 times more than the ISIS accounts. Twitter here made a moral judgment that the fewer, less active, and less influential ISIS accounts were somehow not welcome on their platform, whereas the prolific and burgeoning Nazi and white supremacy accounts were.

So, Twitter has shown that it won’t protect free speech at all costs or for all users. We can only conclude that Twitter is either intentionally protecting white supremacy or simply doesn’t think it’s very dangerous. Regardless of which it is (I think I know), the outcome does not change the fact that white supremacy is running rampant on its platforms and many others.

Let’s brainwash ourselves for a moment and pretend like Twitter does want to support freedom of speech equitably and stays neutral and fair to complete this logical exercise: Going back to the dichotomy of rights example I provided earlier, where either the right to free speech or the right to safety and survival prevail, the rights and the power will fall into the hands of the dominant group or ideologue.

In case you are somehow unaware, the dominating ideologue, whether you’re a flagrant white supremacist or not, is white supremacy. White supremacy was baked into founding principles of the United States, the country where the majority of these platforms were founded and exist. (I am not suggesting that white supremacy doesn’t exist globally, as it does, evidenced most recently by the terrorist attack in Christchurch. I’m centering the conversation intentionally around the United States as it is my lived experience and where most of these companies operate.)

Facebook attempted to educate its team on white supremacy in order to address how to regulate free speech. A laugh-cry excerpt:

“White nationalism and calling for an exclusively white state is not a violation for our policy unless it explicitly excludes other PCs [protected characteristics].”

White nationalism is a softened synonym for white supremacy so that racists-lite can feel more comfortable with their transition into hate. White nationalism (a.k.a. white supremacy) by definition explicitly seeks to eradicate all people of color. So, Facebook should see white nationalist speech as exclusionary, and therefore a violation of their policies.

Regardless of what tech leaders like Dorsey or Facebook CEO Zuckerberg say or what mediocre and uninspired condolences they might offer, inaction is an action.

Companies that use terms and conditions or acceptable use policies to defend their inaction around hate speech are enabling and perpetuating white supremacy. Policies are written by humans to protect that group of human’s ideals. The message they use might be that they are protecting free speech, but hate speech is a form of free speech. So effectively, they are protecting hate speech. Well, as long as it’s for white supremacy and not the Islamic State.

Whether the motivation is fear (losing loyal Nazi customers and their sympathizers) or hate (because their CEO is a white supremacist), it does not change the impact: Hate speech is tolerated, enabled, and amplified by way of their platforms.

“That wasn’t our intent”

Product creators might be thinking, Hey, look, I don’t intentionally create a platform for hate. The way these features were used was never our intent.

Intent does not erase impact.

We cannot absolve ourselves of culpability merely because we failed to conceive such evil use cases when we built it. While we very well might not have created these platforms with the explicit intent to help Nazis or imagined it would be used to spread their hate, the reality is that our platforms are being used in this way.

As product creators, it is our responsibility to protect the safety of our users by stopping those that intend to or already cause them harm. Better yet, we ought to think of this before we build the platforms to prevent this in the first place.

The question to answer isn’t, “Have I made a place where people have the freedom to express themselves?” Instead we have to ask, “Have I made a place where everyone has the safety to exist?” If you have created a place where a dominant group can embroil and embolden hate against another group, you have failed to create a safe place. The foundations of hateful speech (beyond the psychological trauma of it) lead to events like Christchurch.

We must protect safety over speech.

The Domino Effect

This week, Slack banned 28 hate groups. What is most notable, to me, is that the groups did not break any parts of their Acceptable Use Policy. Slack issued a statement:

The use of Slack by hate groups runs counter to everything we believe in at Slack and is not welcome on our platform… Using Slack to encourage or incite hatred and violence against groups or individuals because of who they are is antithetical to our values and the very purpose of Slack.

That’s it.

It is not illegal for tech companies like Slack to ban groups from using their proprietary software because it is a private company that can regulate users if they do not align with their vision as a company. Think of it as the “no shoes, no socks, no service” model, but for tech.

Slack simply decided that supporting the workplace collaboration of Nazis around efficient ways to evangelize white supremacy was probably not in line with their company directives around inclusion. I imagine Slack also considered how their employees of color most ill-affected by white supremacy would feel working for a company that supported it, actively or not.

What makes the Slack example so notable is that they acted swiftly and on their own accord. Slack chose the safety of all their users over the speech of some.

When caught with their enablement of white supremacy, some companies will only budge under pressure from activist groups, users, and employees.

PayPal finally banned hate groups after Charlottesville and after Southern Poverty Law Center (SPLC) explicitly called them out for enabling hate. SPLC had identified this fact for three years prior. PayPal had ignored them for all three years.

Unfortunately, taking these “stances” against something as clearly and viscerally wrong as white supremacy is rare for companies to do. The tech industry tolerates this inaction through unspoken agreements.

If Facebook doesn’t do anything about racist political propaganda, YouTube doesn’t do anything about PewDiePie, and Twitter doesn’t do anything about disproportionate abuse against Black women, it says to the smaller players in the industry that they don’t have to either.

The tech industry reacts to its peers. When there is disruption, as was the case with Airbnb, who screened and rejected any guests who they believed to be partaking in the Unite the Right Charlottesville rally, companies follow suit. GoDaddy cancelled Daily Stormer’s domain registration and Google did the same when they attempted migration.

If one company, like Slack or Airbnb, decides to do something about the role it’s going to play, it creates a perverse kind of FOMO for the rest: Fear of missing out of doing the right thing and standing on the right side of history.

Don’t have FOMO, do something

The type of activism at those companies all started with one individual. If you want to be part of the solution, I’ve gathered some places to start. The list is not exhaustive, and, as with all things, I recommend researching beyond this abridged summary.

  1. Understand how white supremacy impacts you as an individual.
    Now, if you are a person of color, queer, disabled, or trans, it’s likely that you know this very intimately.


    If you are not any of those things, then you, as a majority person, need to understand how white supremacy protects you and works in your favor. It’s not easy work, it is uncomfortable and unfamiliar, but you have the most powerful tools to fix tech. The resources are aplenty, but my favorite abridged list:

    1. Seeing White podcast
    2. Ijeoma Oluo’s So you want to talk about race
    3. Reni Eddo-Lodge’s Why I’m no longer talking to white people about race (Very key read for UK folks)
    4. Robin DiAngelo’s White Fragility
  2. See where your company stands: Read your company’s policies like accepted use and privacy policies and find your CEO’s stance on safety and free speech.
    While these policies are baseline (and in the Slack example, sort of irrelevant), it’s important to known your company's track record. As an employee, your actions and decisions either uphold the ideologies behind the company or they don’t. Ask yourself if the company’s ideologies are worth upholding and whether they align with your own. Education will help you to flag if something contradicts those policies, or if the policies themselves allow for unethical activity.
  3. Examine everything you do critically on an ongoing basis.
    You may feel your role is small or that your company is immune—maybe you are responsible for the maintenance of one small algorithm. But consider how that algorithm or similar ones can be exploited. Some key questions I ask myself:
    1. Who benefits from this? Who is harmed?
    2. How could this be used for harm?
    3. Who does this exclude? Who is missing?
    4. What does this protect? For whom? Does it do so equitably?
  4. See something? Say something.
    If you believe that your company is creating something that is or can be used for harm, it is your responsibility to say something. Now, I’m not naïve to the fact that there is inherent risk in this. You might fear ostracization or termination. You need to protect yourself first. But you also need to do something.
    1. Find someone who you trust who might be at less risk. Maybe if you’re a nonbinary person of color, find a white cis man who is willing to speak up. Maybe if you’re a white man who is new to the company, find a white man who has more seniority or tenure. But also, consider how you have so much more relative privilege compared to most other people and that you might be the safest option.
    2. Unionize. Find peers who might feel the same way and write a collective statement.
    3. Get someone influential outside of the company (if knowledge is public) to say something.
  5. Listen to concerns, no matter how small, particularly if they’re coming from the most endangered groups.
    If your user or peer feels unsafe, you need to understand why. People often feel like small things can be overlooked, as their initial impact might be less, but it is in the smallest cracks that hate can grow. Allowing one insensitive comment about race is still allowing hate speech. If someone, particularly someone in a marginalized group, brings up a concern, you need to do your due diligence to listen to it and to understand its impact.

I cannot emphasize this last point enough.

What I say today is not new. Versions of this article have been written before. Women of color like me have voiced similar concerns not only in writing, but in design reviews, in closed door meetings to key stakeholders, in Slack DMs. We’ve blown our whistles.

But here is the power of white supremacy.

White supremacy is so ingrained in every single aspect of how this nation was built, how our corporations function, and who is in control. If you are not convinced of this, you are not paying attention or intentionally ignoring the truth.

Queer, Muslim, disabled, trans women and nonbinary folks of color — the marginalized groups most impacted by this — are the ones who are voicing these concerns most voraciously. Speaking up requires us to enter the spotlight and outside of safety—we take a risk and are not heard.

The silencing of our voices is one of many effective tools of white supremacy. Our silencing lives within every microaggression, each time we’re talked over, or not invited to partake in key decisions.

In tech, I feel I am a canary in a coal mine. I have sung my song to warn the miners of the toxicity. My sensitivity to it is heightened, because of my existence.

But the miners look at me and tell me that my lived experience is false. It does not align with their narrative as humans. They don’t understand why I sing.

If the people at the highest echelons of the tech industry—the white, male CEOs in power—fail to listen to its most marginalized people—the queer, disabled, trans, people of color—the fate of the canaries will too become the fate of the miners.

Categories: Design

Semantics to Screen Readers

Thu, 02/28/2019 - 05:37
As a child of the ’90s, one of my favorite movie quotes is from Harriet the Spy: “there are as many ways to live as there are people in this world, and each one deserves a closer look.” Likewise, there are as many ways to browse the web as there are people online. We each bring unique context to our web experience based on our values, technologies, environments, minds, and bodies. Assistive technologies (ATs), which are hardware and software that help us perceive and interact with digital content, come in diverse forms. ATs can use a whole host of user input, ranging from clicks and keystrokes to minor muscle movements. ATs may also present digital content in a variety of forms, such as Braille displays, color-shifted views, and decluttered user interfaces (UIs). One more commonly known type of AT is the screen reader. Programs such as JAWS, Narrator, NVDA, and VoiceOver can take digital content and present it to users through voice output, may display this output visually on the user’s screen, and can have Braille display and/or screen magnification capabilities built in. If you make websites, you may have tested your sites with a screen reader. But how do these and other assistive programs actually access your content? What information do they use? We’ll take a detailed step-by-step view of how the process works. (For simplicity we’ll continue to reference “browsers” and “screen readers” throughout this article. These are essentially shorthands for “browsers and other applications,” and “screen readers and other assistive technologies,” respectively.) The semantics-to-screen-readers pipeline Accessibility application programming interfaces (APIs) create a useful link between user applications and the assistive technologies that wish to interact with them. Accessibility APIs facilitate communicating accessibility information about user interfaces (UIs) to the ATs. The API expects information to be structured in a certain way, so that whether a button is properly marked up in web content or is sitting inside a native app taskbar, a button is a button is a button as far as ATs are concerned. That said, screen readers and other ATs can do some app-specific handling if they wish. On the web specifically, there are some browser and screen reader combinations where accessibility API information is supplemented by access to DOM structures. For this article, we’ll focus specifically on accessibility APIs as a link between web content and the screen reader. Here’s the breakdown of how web content reaches screen readers via accessibility APIs: The web developer uses host language markup (HTML, SVG, etc.), and potentially roles, states, and properties from the ARIA suite where needed to provide the semantics of their content. Semantic markup communicates what type an element is, what content it contains, what state it’s in, etc. The browser rendering engine (alternatively referred to as a “user agent”) takes this information and maps it into an accessibility API. Different accessibility APIs are available on different operating systems, so a browser that is available on multiple platforms should support multiple accessibility APIs. Accessibility API mappings are maintained on a lower level than web platform APIs, so web developers don’t directly interact with accessibility APIs. The accessibility API includes a collection of interfaces that browsers and other apps can plumb into, and generally acts as an intermediary between the browser and the screen reader. Accessibility APIs provide interfaces for representing the structure, relationships, semantics, and state of digital content, as well as means to surface dynamic changes to said content. Accessibility APIs also allow screen readers to retrieve and interact with content via the API. Again, web developers don’t interact with these APIs directly; the rendering engine handles translating web content into information useful to accessibility APIs. Examples of accessibility APIs The screen reader uses client-side methods from these accessibility APIs to retrieve and handle information exposed by the browser. In browsers where direct access to the Document Object Model (DOM) is permitted, some screen readers may also take additional information from the DOM tree. A screen reader can also interact with apps that use differing accessibility APIs. No matter where they get their information, screen readers can dream up any interaction modes they want to provide to their users (I’ve provided links to screen reader commands at the end of this article). Testing by site creators can help identify content that feels awkward in a particular navigation mode, such as multiple links with the same text (“Learn more”), as one example. Example of this pipeline: surfacing a button element to screen reader users Let’s suppose for a moment that a screen reader wants to understand what object is next in the accessibility tree (which I’ll explain further in the next section), so it can surface that object to the user as they navigate to it. The flow will go a little something like this: Diagram illustrating the steps involved in presenting the next object in a document; detailed list follows
  1. The screen reader requests information from the API about the next accessible object, relative to the current object.
  2. The API (as an intermediary) passes along this request to the browser.
  3. At some point, the browser references DOM and style information, and discovers that the relevant element is a non-hidden button: <button>Do a thing</button>.
  4. The browser maps this HTML button into the format the API expects, such as an accessible object with various properties: Name: Do a thing, Role: Button.
  5. The API returns this information from the browser to the screen reader.
  6. The screen reader can then surface this object to the user, perhaps stating “Button, Do a thing.”
Suppose that the screen reader user would now like to “click” this button. Here’s how their action flows all the way back to web content: Diagram illustrating the steps involved in routing a screen reader click to web content; detailed list follows
  1. The user provides a particular screen reader command, such as a keystroke or gesture.
  2. The screen reader calls a method into the API to invoke the button.
  3. The API forwards this interaction to the browser.
  4. How a browser may respond to incoming interactions depends on the context, but in this case the browser can raise this as a “click” event through web APIs. The browser should give no indication that the click came from an assistive technology, as doing so would violate the user’s right to privacy.
  5. The web developer has registered a JavaScript event listener for clicks; their callback function is now executed as if the user clicked with a mouse.
Now that we have a general sense of the pipeline, let’s go into a little more detail on the accessibility tree. The accessibility tree Dev Tools in Microsoft Edge showing the DOM tree and accessibility tree side by side; there are more nodes in the DOM tree The accessibility tree is a hierarchical representation of elements in a UI or document, as computed for an accessibility API. In modern browsers, the accessibility tree for a given document is a separate, parallel structure to the DOM tree. “Parallel” does not necessarily mean there is a 1:1 match between the nodes of these two trees. Some elements may be excluded from the accessibility tree, for example if they are hidden or are not semantically useful (think non-focusable wrapper divs without any semantics added by a web developer). This idea of a hierarchical structure is somewhat of an abstraction. The definition of what exactly an accessibility tree is in practice has been debated and partially defined in multiple places, so implementations may differ in various ways. For example, it’s not actually necessary to generate accessible objects for every element in the DOM whenever the DOM tree is constructed. As a performance consideration, a browser could choose to deal with only a subset of objects and their relationships at a time—that is, however much is necessary to fulfill the requests coming from ATs. The rendering engine could make these computations during all user sessions, or only do so when assistive technologies are actively running. Generally speaking, modern web browsers wait until after style computation to build up any accessible objects. Browsers wait in part because generated content (such as ::before and ::after) can contain text that can participate in calculation of the accessible object’s name. CSS styles can also impact accessible objects in other various ways: text styling can come through as attributes on accessible text ranges. Display property values can impact the computation of line text ranges. These are just a few ways in which style can impact accessibility semantics. Browsers may also use different structures as the basis for accessible object computation. One rendering engine may walk the DOM tree and cross-reference style computations to build up parallel tree structures; another engine may use only the nodes that are available in a style tree in order to build up their accessibility tree. User agent participants in the standards community are currently thinking through how we can better document our implementation details, and whether it might make sense to standardize more of these details further down the road. Let’s now focus on the branches of this tree, and explore how individual accessibility objects are computed. Building up accessible objects From API to API, an accessible object will generally include a few things:
  • Role, or the type of accessible object (for example, Button). The role tells a user how they can expect to interact with the control. It is typically presented when screen reader focus moves onto the accessible object, and it can be used to provide various other functionalities, such as skipping around content via one type of object.
  • Name, if specified. The name is an (ideally short) identifier that better helps the user identify and understand the purpose of an accessible object. The name is often presented when screen focus moves to the object (more on this later), can be used as an identifier when presenting a list of available objects, and can be used as a hook for functionalities such as voice commands.
  • Description and/or help text, if specified. We’ll use “Description” as a shorthand. The Description can be considered supplemental to the Name; it’s not the main identifier but can provide further information about the accessible object. Sometimes this is presented when moving focus to the accessible object, sometimes not; this variation depends on both the screen reader’s user experience design and the user’s chosen verbosity settings.
  • Properties and methods surfacing additional semantics. For simplicity’s sake, we won’t go through all of these. For your awareness, properties can include details like layout information or available interactions (such as invoking the element or modifying its value).
Let’s walk through an example using markup for a simple mood tracker. We’ll use simplified property names and values, because these can differ between accessibility APIs. <form> <label for="mood">On a scale of 1–10, what is your mood today?</label> <input id="mood" type="range" min="1" max="10" value="5" aria-describedby="helperText" /> <p id="helperText">Some helpful pointers about how to rate your mood.</p> <!-- Using a div with button role for the purposes of showing how the accessibility tree is created. Please use the button element! --> <div tabindex="0" role="button">Log Mood</div> </form> First up is our form element. This form doesn’t have any attributes that would give it an accessible Name, and a form landmark without a Name isn’t very useful when jumping between landmarks. Therefore, HTML mapping standards specify that it should be mapped as a group. Here’s the beginning of our tree:
  • Role: Group
Next up is the label. This one doesn’t have an accessible Name either, so we’ll just nest it as an object of role “Label” underneath the form:
  • Role: Group
    • Role: Label
Let’s add the range input, which will map into various APIs as a “Slider.” Due to the relationship created by the for attribute on the label and id attribute on the input, this slider will take its Name from the label contents. The aria-describedby attribute is another id reference and points to a paragraph with some text content, which will be used for the slider’s Description. The slider object’s properties will also store “labelledby” and “describedby” relationships pointing to these other elements. And it will specify the current, minimum, and maximum values of the slider. If one of these range values were not available, ARIA standards specify what should be the default value. Our updated tree:
  • Role: Group
    • Role: Label
    • Role: Slider Name: On a scale of 1–10, what is your mood today? Description: Some helpful pointers about how to rate your mood. LabelledBy: [label object] DescribedBy: helperText ValueNow: 5 ValueMin: 1 ValueMax: 10
The paragraph will be added as a simple paragraph object (“Text” or “Group” in some APIs):
  • Role: Group
    • Role: Label
    • Role: Slider Name: On a scale of 1–10, what is your mood today? Description: Some helpful pointers about how to rate your mood. LabelledBy: [label object] DescribedBy: helperText ValueNow: 5 ValueMin: 1 ValueMax: 10
    • Role: Paragraph
The final element is an example of when role semantics are added via the ARIA role attribute. This div will map as a Button with the name “Log Mood,” as buttons can take their name from their children. This button will also be surfaced as “invokable” to screen readers and other ATs; special types of buttons could provide expand/collapse functionality (buttons with the aria-expanded attribute), or toggle functionality (buttons with the aria-pressed attribute). Here’s our tree now:
  • Role: Group
    • Role: Label
    • Role: Slider Name: On a scale of 1–10, what is your mood today? Description: Some helpful pointers about how to rate your mood. LabelledBy: [label object] DescribedBy: helperText ValueNow: 5 ValueMin: 1 ValueMax: 10
    • Role: Paragraph
    • Role: Button Name: Log Mood
On choosing host language semantics Our sample markup mentions that it is preferred to use the HTML-native button element rather than a div with a role of “button.” Our buttonified div can be operated as a button via accessibility APIs, as the ARIA attribute is doing what it should—conveying semantics. But there’s a lot you can get for free when you choose native elements. In the case of button, that includes focus handling, user input handling, form submission, and basic styling. Aaron Gustafson has what he refers to as an “exhaustive treatise” on buttons in particular, but generally speaking it’s great to let the web platform do the heavy lifting of semantics and interaction for us when we can. ARIA roles, states, and properties are still a great tool to have in your toolbelt. Some good use cases for these are
  • providing further semantics and relationships that are not naturally expressed in the host language;
  • supplementing semantics in markup we perhaps don’t have complete control over;
  • patching potential cross-browser inconsistencies;
  • and making custom elements perceivable and operable to users of assistive technologies.
Notes on inclusion or exclusion in the tree Standards define some rules around when user agents should exclude elements from the accessibility tree. Excluded elements can include those hidden by CSS, or the aria-hidden or hidden attributes; their children would be excluded as well. Children of particular roles (like checkbox) can also be excluded from the tree, unless they meet special exceptions. The full rules can be found in the “Accessibility Tree” section of the ARIA specification. That being said, there are still some differences between implementers, some of which include more divs and spans in the tree than others do. Notes on name and description computation How names and descriptions are computed can be a bit confusing. Some elements have special rules, and some ARIA roles allow name computation from the element’s contents, whereas others do not. Name and description computation could probably be its own article, so we won’t get into all the details here (refer to “Further reading and resources” for some links). Some short pointers:
  • aria-label, aria-labelledby, and aria-describedby take precedence over other means of calculating name and description.
  • If you expect a particular HTML attribute to be used for the name, check the name computation rules for HTML elements. In your scenario, it may be used for the full description instead.
  • Generated content (::before and ::after) can participate in the accessible name when said name is taken from the element’s contents. That being said, web developers should not rely on pseudo-elements for non-decorative content, as this content could be lost when a stylesheet fails to load or user styles are applied to the page.
When in doubt, reach out to the community! Tag questions on social media with “#accessibility.” “#a11y” is a common shorthand; the “11” stands for “11 middle letters in the word ‘accessibility.’” If you find an inconsistency in a particular browser, file a bug! Bug tracker links are provided in “Further reading and resources.” Not just accessible objects Besides a hierarchical structure of objects, accessibility APIs also offer interfaces that allow ATs to interact with text. ATs can retrieve content text ranges, text selections, and a variety of text attributes that they can build experiences on top of. For example, if someone writes an email and uses color alone to highlight their added comments, the person reading the email could increase the verbosity of speech output in their screen reader to know when they’re encountering phrases with that styling. However, it would be better for the email author to include very brief text labels in this scenario. The big takeaway here for web developers is to keep in mind that the accessible name of an element may not always be surfaced in every navigation mode in every screen reader. So if your aria-label text isn’t being read out in a particular mode, the screen reader may be primarily using text interfaces and only conditionally stopping on objects. It may be worth your while to consider using text content—even if visually hidden—instead of text via an ARIA attribute. Read more thoughts on aria-label and aria-labelledby. Accessibility API events It is the responsibility of browsers to surface changes to content, structure, and user input. Browsers do this by sending the accessibility API notifications about various events, which screen readers can subscribe to; again, for performance reasons, browsers could choose to send notifications only when ATs are active. Let’s suppose that a screen reader wants to surface changes to a live region (an element with role="alert" or aria-live): Diagram illustrating the steps involved in announcing a live region via a screen reader; detailed list follows
  1. The screen reader subscribes to event notifications; it could subscribe to notifications of all types, or just certain types as categorized by the accessibility API. Let’s assume in our example that the screen reader is at least listening to live region change events.
  2. In the web content, the web developer changes the text content of a live region.
  3. The browser (provider) recognizes this as a live region change event, and sends the accessibility API a notification.
  4. The API passes this notification along to the screen reader.
  5. The screen reader can then use metadata from the notification to look up the relevant accessible objects via the accessibility API, and can surface the changes to the user.
ATs aren’t required to do anything with the information they retrieve. This can make it a bit trickier as a web developer to figure out why a screen reader isn’t announcing a change: it may be that notifications aren’t being raised (for example, because a browser is not sending notifications for a live region dynamically inserted into web content), or the AT is not subscribed or responding to that type of event. Testing with screen readers and dev tools While conformance checkers can help catch some basic accessibility issues, it’s ideal to walk through your content manually using a variety of contexts, such as
  • using a keyboard only;
  • with various OS accessibility settings turned on;
  • and at different zoom levels and text sizes, and so on.
As you do this, keep in mind the Web Content Accessibility Guidelines (WCAG 2.1), which give general guidelines around expectations for inclusive web content. If you can test with users after your own manual test passes, all the better! Robust accessibility testing could probably be its own series of articles. In this one, we’ll go over some tips for testing with screen readers, and catching accessibility errors as they are mapped into the accessibility API in a more general sense. Screen reader testing Screen readers exist in many forms: some are pre-installed on the operating system and others are separate applications that in some cases are free to download. The WebAIM screen reader user survey provides a list of commonly used screen reader and browser combinations among survey participants. The “Further reading and resources” section at the end of this article includes full screen reader user docs, and Deque University has a great set of screen reader command cheat sheets that you can refer to. Some actions you might take to test your content:
  • Read the next/previous item.
  • Read the next/previous line.
  • Read continuously from a particular point.
  • Jump by headings, landmarks, and links.
  • Tab around focusable elements only.
  • Get a summary of all elements of a particular type within the page.
  • Search the page for specific content.
  • Use table-specific commands to interact with your tables.
  • Jump around by form field; are field instructions discoverable in this navigational mode?
  • Use keyboard commands to interact with all interactive elements. Are your JavaScript-driven interactions still operable with screen readers (which can intercept key input in certain modes)? WAI-ARIA Authoring Practices 1.1 includes notes on expected keyboard interactions for various widgets.
  • Try out anything that creates a content change or results in navigating elsewhere. Would it be obvious, via screen reader output, that a change occurred?
Tracking down the source of unexpected behavior If a screen reader does not announce something as you’d expect, here are a few different checks you can run:
  • Does this reproduce with the same screen reader in multiple browsers on this OS? It may be an issue with the screen reader or your expectation may not match the screen reader’s user experience design. For example, a screen reader may choose to not expose the accessible name of a static, non-interactive element. Checking the user docs or filing a screen reader issue with a simple test case would be a great place to start.
  • Does this reproduce with multiple screen readers in the same browser, but not in other browsers on this OS? The browser in question may have an issue, there may be compatibility differences between browsers (such as a browser doing extra helpful but non-standard computations), or a screen reader’s support for a specific accessibility API may vary. Filing a browser issue with a simple test case would be a great place to start; if it’s not a browser bug, the developer can route it to the right place or make a code suggestion.
  • Does this reproduce with multiple screen readers in multiple browsers? There may be something you can adjust in your code, or your expectations may differ from standards and common practices.
  • How does this element’s accessibility properties and structure show up in browser dev tools?
Inspecting accessibility trees and properties in dev tools Major modern browsers provide dev tools to help you observe the structure of the accessibility tree as well as a given element’s accessibility properties. By observing which accessible objects are generated for your elements and which properties are exposed on a given element, you may be able to pinpoint issues that are occurring either in front-end code or in how the browser is mapping your content into the accessibility API. Let’s suppose that we are testing this piece of code in Microsoft Edge with a screen reader: <div class="form-row"> <label>Favorite color</label> <input id="myTextInput" type="text" /> </div> We’re navigating the page by form field, and when we land on this text field, the screen reader just tells us this is an “edit” control—it doesn’t mention a name for this element. Let’s check the tools for the element’s accessible name. 1. Inspect the element to bring up the dev tools. The Microsoft Edge dev tools, with an input element highlighted in the DOM tree 2. Bring up the accessibility tree for this page by clicking the accessibility tree button (a circle with two arrows) or pressing Ctrl+Shift+A (Windows). The accessibility tree button activated in the Microsoft Edge dev tools Reviewing the accessibility tree is an extra step for this particular flow but can be helpful to do. When the Accessibility Tree pane comes up, we notice there’s a tree node that just says “textbox:,” with nothing after the colon. That suggests there’s not a name for this element. (Also notice that the div around our form input didn’t make it into the accessibility tree; it was not semantically useful). 3. Open the Accessibility Properties pane, which is a sibling of the Styles pane. If we scroll down to the Name property—aha! It’s blank. No name is provided to the accessibility API. (Side note: some other accessibility properties are filtered out of this list by default; toggle the filter button—which looks like a funnel—in the pane to get the full list). The Accessibility Properties pane open in Microsoft Edge dev tools, in the same area as the Styles pane 4. Check the code. We realize that we didn’t associate the label with the text field; that is one strategy for providing an accessible name for a text input. We add for="myTextInput" to the label: <div class="form-row"> <label for="myTextInput">Favorite color</label> <input id="myTextInput" type="text" /> </div> And now the field has a name: The accessible Name property set to the value of “Favorite color” inside Microsoft Edge dev tools In another use case, we have a breadcrumb component, where the current page link is marked with aria-current="page": <nav class="breadcrumb" aria-label="Breadcrumb"> <ol> <li> <a href="/cat/">Category</a> </li> <li> <a href="/cat/sub/">Sub-Category</a> </li> <li> <a aria-current="page" href="/cat/sub/page/">Page</a> </li> </ol> </nav> When navigating onto the current page link, however, we don’t get any indication that this is the current page. We’re not exactly sure how this maps into accessibility properties, so we can reference a specification like Core Accessibility API Mappings 1.2 (Core-AAM). Under the “State and Property Mapping” table, we find mappings for “aria-current with non-false allowed value.” We can check for these listed properties in the Accessibility Properties pane. Microsoft Edge, at the time of writing, maps into UIA (UI Automation), so when we check AriaProperties, we find that yes, “current=page” is included within this property value. The accessible Name property set to the value of “Favorite color” inside Microsoft Edge dev tools Now we know that the value is presented correctly to the accessibility API, but the particular screen reader is not using the information. As a side note, Microsoft Edge’s current dev tools expose these accessibility API properties quite literally. Other browsers’ dev tools may simplify property names and values to make them easier to read, particularly if they support more than one accessibility API. The important bit is to find if there’s a property with roughly the name you expect and whether its value is what you expect. You can also use this method of checking through the property names and values if mapping specs, like Core-AAM, are a bit intimidating! Advanced accessibility tools While browser dev tools can tell us a lot about the accessibility semantics of our markup, they don’t generally include representations of text ranges or event notifications. On Windows, the Windows SDK includes advanced tools that can help debug these parts of MSAA or UIA mappings: Inspect and AccEvent (Accessible Event Watcher). Using these tools presumes knowledge of the Windows accessibility APIs, so if this is too granular for you and you’re stuck on an issue, please reach out to the relevant browser team! There is also an Accessibility Inspector in Xcode on MacOS, with which you can inspect web content in Safari. This tool can be accessed by going to Xcode > Open Developer Tool > Accessibility Inspector. Diversity of experience Equipped with an accessibility tree, detailed object information, event notifications, and methods for interacting with accessible objects, screen readers can craft a browsing experience tailored to their audiences. In this article, we’ve used the term “screen readers” as a proxy for a whole host of tools that may use accessibility APIs to provide the best user experience possible. Assistive technologies can use the APIs to augment presentation or support varying types of user input. Examples of other ATs include screen magnifiers, cognitive support tools, speech command programs, and some brilliant new app that hasn’t been dreamed up yet. Further, assistive technologies of the same “type” may differ in how they present information, and users who share the same tool may further adjust settings to their liking. As web developers, we don’t necessarily need to make sure that each instance surfaces information identically, because each user’s preferences will not be exactly the same. Our aim is to ensure that no matter how a user chooses to explore our sites, content is perceivable, operable, understandable, and robust. By testing with a variety of assistive technologies—including but not limited to screen readers—we can help create a better web for all the many people who use it. Further reading and resources
Categories: Design

Designing for Conversions

Thu, 02/14/2019 - 07:30
What makes creative successful? Creative work often lives in the land of feeling—we can say we like something, point to how happy the client is, or talk about how delighted users will be, but can’t objectively measure feelings. Measuring the success of creative work doesn’t have to stop with feeling. In fact, we can assign it numbers, do math with it, and track improvement to show clients objectively how well our creative is working for them. David Ogilvy once said, “If it doesn’t sell, it isn’t creative.” While success may not be a tangible metric for us, it is for our clients. They have hard numbers to meet, and as designers, we owe it to them to think about how our work can meet those goals. We can track sales, sure, but websites are ripe with other opportunities for measuring improvements. Designing for conversions will not only make you a more effective designer or copywriter, it will make you much more valuable to your clients, and that’s something we should all seek out. Wait—what’s a conversion? Before designing for conversions, let’s establish a baseline for what, exactly, we’re talking about. A conversion is an action taken by the user that accomplishes a business goal. If your site sells things, a conversion would be a sale. If you collect user information to achieve your business goals, like lead aggregation, it would be a form submission. Conversions can also be things like newsletter sign-ups or even hits on a page containing important information that you need users to read. You need some tangible action to measure the success of your site—that’s your conversion. Through analytics, you know how many people are coming to your site. You can use this to measure what percentage of users are converting. This number is your conversion rate, and it’s the single greatest metric for measuring the success of a creative change. In your analytics, you can set up goals and conversion funnels to track this for you (more on conversion funnels shortly). It doesn’t matter how slick that new form looks or how clever that headline is—if the conversion rate drops, it’s not a success. In fact, once you start measuring success by conversion rate, you’ll be surprised to see how even the cleverest designs applied in the wrong places can fail to achieve your goals. Conversions aren’t always a one-step process. Many of us have multi-step forms or long check-out processes where it can be very useful to track how far a user gets. It’s possible to set up multiple goals along the way so your analytics can give you this data. This is called a conversion funnel. Ideally, you’ll coordinate with the rest of your organization to get data beyond the website as well. For instance, changing button copy may lead to increased form submissions but a drop in conversions from lead to sale afterward. In this case, the button copy update probably confused users rather than selling them on the product. A good conversion funnel will safeguard against false positives like that. It’s also important to track the bounce rate, which is the percentage of users that hit a page and leave without converting or navigating to other pages. A higher bounce rate is an indication that there’s a mismatch between the user’s expectations when landing on your site and what they find once landing there. Bounce rate is really a part of the conversion funnel, and reducing bounce rate can be just as important as improving conversion rate. Great. So how do we do that? When I was first getting started in conversion-driven design, it honestly felt a little weird. It feels shady to focus obsessively on getting the user to complete an action. But this focus is in no way about tricking the user into doing something they don’t want to do—that’s a bad business model. As Gerry McGovern has commented, if business goals don’t align with customer goals, your business has no future. So if we’re not tricking users, what are we doing? Users come to your site with a problem, and they’re looking for a solution. The goal is to find users whose problems will be solved by choosing your product. With that in mind, improving the conversion rate doesn’t mean tricking users into doing something—it means showing the right users how to solve their problem. That means making two things clear: that your product will solve the user’s problem, and what the user must do to proceed. The first of these two points is the value proposition. This is how the user determines whether your product can solve his or her problem. It can be a simple description of the benefits, customer testimonials, or just a statement about what the product will do for the user. A page is not limited to one value proposition—it’s good to have several. (Hint: the page’s headline should almost always be a value proposition!) The user should be able to determine quickly why your product will be helpful in solving their problem. Once the value of your product has been made clear, you need to direct the user to convert with a call to action. A call to action tells the user what they must do to solve their problem—which, in your case, means to convert. Most buttons and links should be calls to action, but a bit of copy directly following a value proposition is a good place too. Users should never have to look around to find out what the next step is—it should be easy to spot and clear in its intention. Also, ease of access is a big success factor here. My team’s testing found that replacing a Request Information button (that pointed to a form page) with an actual form on every page significantly boosted the conversion rate. If you’re also trying to get information from a user, consider a big form at the top of the page so users can’t miss it. When they scroll down the page and are ready to convert, they remember the form and have no question as to what they have to do. So improving conversion rate (and, to some degree, decreasing bounce rate) is largely about adding clarity around the value proposition and call to action. There are other factors as well, like decreasing friction in the conversion process and improving performance, but these two things are where the magic happens, and conversion problems are usually problems with one of them. So, value propositions…how do I do those? The number one thing to remember when crafting a value proposition is that you’re not selling a product—you’re selling a solution. Value propositions begin with the user’s problem and focus on that. Users don’t care about the history of your company, how many awards you’ve won, or what clever puns you’ve come up with—they care about whether your product will solve their problem. If they don’t get the impression that it can do that, they will leave and go to a competitor. In my work with landing pages for career schools, we initially included pictures of people in graduation gowns and caps. We assumed that the most exciting part of going back to school was graduating. Data showed us that we were wrong. Our testing showed that photos of people doing the jobs they would be training for performed much better. In short, our assumption was that showing the product (the school) was more important than showing the benefit (a new career). The problem users were trying to solve wasn’t a diploma—it was a career, and focusing on the user showed a significant improvement in conversion rate. We had some clients that insisted on using their branding on the landing pages, including one school that wanted to use an eagle as their hero image because their main website had eagles everywhere. This absolutely bombed in conversions. No matter how strong or consistent your branding is, it will not outperform talking about users and their problems. Websites that get paid for clicks have mastered writing headlines this way. Clickbait headlines get a groan from copywriters—especially since they often use their powers for evil and not good—but there are some important lessons we can learn from them. Take this headline, for instance: Get an Associate’s degree in nursing Just like in the example above with the college graduates, we’re selling the product—not the benefit. This doesn’t necessarily show that we understand the user’s problem, and it does nothing to get them excited about our program. Compare that headline to this one: Is your job stuck in a rut? Get trained for a new career in nursing in only 18 months! In this case, we lead with the user’s problem. That immediately gets users’ attention. We then skip to a benefit: a quick turnaround. No time is wasted talking about the product—we save that for the body copy. The headline focuses entirely on the user. In your sign-up or check-out process, always lead with the information the user is most interested in. In our case, letting the user first select their school campus and area of study showed a significant improvement over leading with contact information. Similarly, put the less-exciting content last. In our testing, users were least excited about sharing their telephone number. Moving that field to be the last one in the form decreased form abandonment and improved conversions. As designers, be cognizant of what your copywriters are doing. If the headline is the primary value proposition (as it should be), make sure the headline is the focal point of your design. Ensure the messaging behind your design is in line with the messaging in the content. If there’s a disagreement in what the user’s problem is or how your product will solve that problem, the conversion rate will suffer. Once the value proposition has been clearly defined and stated, it’s time to focus on the call to action. What about the call to action? For conversion-driven sites, a good call to action is the most important component. If a user is ready to convert and has insufficient direction on how to do so, you lose a sale at 90 percent completion. It needs to be abundantly clear to the user how to proceed, and that’s where the call to action steps in. When crafting a call to action, don’t be shy. Buttons should be large, forms should be hard to miss, and language should be imperative. A call to action should be one of the first things the user notices on the page, even if he or she won’t be thinking about it again until after doing some research on the page. Having the next step right in front of the user vastly increases the chance of conversion, so users need to know that it’s there waiting. That said, a call to action should never get in the way of a value proposition. I see this all the time: a modal window shows as soon as I get to a site, asking me to subscribe to their mailing list before I have an inkling of the value the site can give me. I dismiss these without looking, and that call to action is completely missed. Make it clear how to convert, and make it easy, but don’t ask for a conversion before the user is ready. For situations like the one above, a better strategy might be asking me to subscribe as I exit the site; marketing to visitors who are leaving has been shown to be effective. In my former team’s tests, there were some design choices that could improve calls to action. For instance, picking a bright color that stood out from the rest of the site for the submit button did show an improvement in conversions, and reducing clutter around the call to action improved conversion rates by 232%. But most of the gains here were in either layout or copy; don’t get so caught up in minor design changes that you ignore more significant changes like these. Ease of access is another huge factor to consider. As mentioned above, when my team was getting started, we had a Request Information link in the main navigation and a button somewhere on the page that would lead the user to the form. The single biggest positive change we saw involved putting a form at the top of every page. For longer forms, we would break this form up into two or three steps, but having that first step in sight was a huge improvement, even if one click doesn’t seem like a lot of effort. Another important element is headings. Form headings should ask the user to do something. It’s one thing to label a form “Request Information”; it’s another to ask them to “Request Information Now.” Simply adding action words, like “now” or “today,” can change a description into an imperative action and improve conversion rates. With submit buttons, always take the opportunity to communicate value. The worst thing you can put on a submit button is the word “Submit.” We found that switching this button copy out with “Request Information” showed a significant improvement. Think about the implied direction of the interaction. “Submit” implies the user is giving something to us; “Request Information” implies we’re giving something to the user. The user is already apprehensive about handing over their information—communicate to them that they’re getting something out of the deal. Changing phrasing to be more personal to the user can also be very effective. One study showed that writing button copy in first person—for instance, “Create My Account” versus “Create Your Account”—showed a significant boost in conversions, boosting click-through rates by 90%. Users today are fearful that their information will be used for nefarious purposes. Make it a point to reassure them that their data is safe. Our testing showed that the best way to do this is to add a link to the privacy policy (“Your information is secure!”) with a little lock icon right next to the submit button. Users will often skip right over a small text link, so that lock icon is essential—so essential, in fact, that it may be more important than the privacy policy itself. I’m somewhat ashamed to admit this, but I once forgot to create a page for the privacy policy linked to from a landing page, so that little lock icon linked out to a 404. I expected a small boost in conversions when I finally uploaded the privacy policy, but nope—nobody noticed. Reassurance is a powerful thing. Measure, measure, measure One of the worst things you can do is push out a creative change, assume it’s great, and move on to the next task. A/B testing is ideal and will allow you to test a creative change directly against the old creative, eliminating other variables like time, media coverage, and anything else you might not be thinking of. Creative changes should be applied methodically and scientifically—just because two or three changes together show an improvement in conversion rate doesn’t mean that one of them wouldn’t perform better alone. Measuring tangible things like conversion rate not only helps your client or business, but can also give new purpose to your designs and creative decisions. It’s a lot easier to push for your creative decisions when you have hard data to back up why they’re the best choice for the client or project. Having this data on hand will give you more authority in dealing with clients or marketing folks, which is good for your creative and your career. If my time in the design world has taught me anything, it’s that, in the realm of creativity, certainty can be hard to come by. So, perhaps most importantly, objective measures of success give you, and your client, the reassurance that you’re doing the right thing.
Categories: Design

Paint the Picture, Not the Frame: How Browsers Provide Everything Users Need

Thu, 02/07/2019 - 06:50
Kip Williams, professor of psychology sciences at Purdue University, conducted a fascinating experiment called “cyberball.” In his experiment, a test subject and two other participants played a computer game of catch. At a predetermined time, the test subject was excluded from the game, forcing them to only observe as the clock ran down. The experience showed increases in self-reported levels of anger and sadness, as well as lowering levels of the four needs. The digital version of the experiment created results that matched the results of the original physical one, meaning that these feelings occurred regardless of context. After the game was concluded, the test subject was told that the other participants were robots, not other human participants. Interestingly, the reveal of automated competitors did not lessen the negative feelings reported. In fact, it increased feelings of anger, while also decreasing participants’ sense of willpower and/or self-regulation. In other words: people who feel they are rejected by a digital system will feel hurt and have their sense of autonomy reduced, even when they believe there isn’t another human directly responsible. So, what does this have to with browsers? Every adjustment to the appearance and behavior of the features browsers let you manipulate is a roll of the dice, gambling on the delight of some at the expense of alienating others. When using a browser to navigate the web, there’s a lot of sameness, until there isn't. Most of the time we’re hopping from page-to-page and site-to-site, clicking links, pressing buttons, watching videos, filling out forms, writing messages, etc. But every once in awhile we stumble across something new and novel that makes us pause to figure out what’s going on. Every website and web app is its own self-contained experience, with its own ideas of how things should look and behave. Some are closer to others, but each one requires learning how to operate the interface to a certain degree. Some browsers can also have parts of their functionality and appearance altered, meaning that as with websites, there can be unexpected discrepancies. We’ll unpack some of the nuance behind some of these features, and more importantly, why most of them are better off left alone. Scroll-to-top All the major desktop browsers allow you to hit the Home key on the keyboard to jump to the top of the page. Some scrollbar implementations allow you to click on the top of the scrollbar area to do the same. Some browsers allow you to type Command+Up (macOS) / Ctrl+Up (Windows), as well. People who use assistive technology like screen readers can use things like banner landmarks to navigate the same way (provided they are correctly declared in the site’s HTML). However, not every device has an easily discoverable way to invoke this functionality: many laptops don’t have a Home key on their keyboard. The tap-the-clock-to-jump-to-the-top functionality on iOS is difficult to discover, and can be surprising and frustrating if accidentally activated. You need specialized browser extensions to recreate screen reader landmark navigation techniques. One commonly implemented UI solution for longer pages is the scroll-to-top button. It’s often fixed to the bottom-right corner of the screen. Activating this control will take the user to the top of the page, regardless of how far down they’ve scrolled. If your site features a large amount of content per page, it may be worth investigating this UI pattern. Try looking at analytics and/or conducting user tests to see where and how often this feature is used. The caveat being if it’s used too often, it might be worth taking a long, hard look at your information architecture and content strategy. Three things I like about the scroll-to-top pattern are:
  • Its functionality is pretty obvious (especially if properly labeled).
  • Provided it is designed well, it can provide a decent-sized touch target in a thumb-friendly area. For motor control considerations, its touch target can be superior to narrow scroll or status bars, which can make for frustratingly small targets to hit.
  • It does not alter or remove existing scroll behavior, augmenting it instead. If somebody is used to one way of scrolling to the top, you’re not overriding it or interrupting it.
If you’re implementing this sort of functionality, I have four requests to help make the experience work for everyone (I find the Smooth Scroll library to be a helpful starting place):
  • Honor user requests for reduced motion. The dramatic scrolling effect of whipping from the bottom of the page to the top may be a vestibular trigger, a situation where the system that controls your body’s sense of physical position and orientation in the world is disrupted, causing things like headaches, nausea, vertigo, migraines, and hearing loss.
  • Ensure keyboard focus is moved to the top of the document, mirroring what occurs visually. Applying this practice will improve all users’ experiences. Otherwise, hitting Tab after scrolling to the top would send the user down to the first interactive element that follows where the focus had been before they activated the scroll button.
  • Ensure the button does not make other content unusable by obscuring it. Be sure to account for when the browser is in a zoomed-in state, not just in its default state.
  • Be mindful of other fixed-position elements. I’ve seen my fair share of websites that also have a chatbot or floating action button competing to live in the same space.
Scrollbars If you’re old enough to remember, it was once considered fashionable to style your website scrollbars. Internet Explorer allowed this customization via a series of vendor-specific properties. At best, they looked great! If the designer and developer were both skilled and detail-oriented, you’d get something that looked like a natural extension of the rest of the website. However, the stakes for a quality design were pretty high: scrollbars are part of an application’s interface, not a website’s. In inclusive design, it’s part of what we call external consistency. External consistency is the idea that an object’s functionality is informed and reinforced by similar implementations elsewhere. It’s why you can flip a wall switch in most houses and be guaranteed the lights come on instead of flushing the toilet. While scrollbars have some minor visual differences between operating systems (and operating system versions), they’re consistent externally in function. Scrollbars are also consistent internally, in that every window and program on the OS that requires scrolling has the same scrollbar treatment. If you customize your website's scrollbar colors, for less technologically literate people, yet another aspect of the interface has changed without warning or instruction on how to change it back. If the user is already confused about how things on the screen work, it’s one less familiar thing for them to cling to as stable and reliable. You might be rolling your eyes reading this, but I’d ask you to check out this incredible article by Jennifer Morrow instead. In it, she describes conducting a guerilla user test at a mall, only to have the session completely derailed when she discovers someone who has never used a computer before. What she discovers is as important as it is shocking. The gist of it is that some people (even those who have used a computer before) don’t understand the nuance of the various “layers” you navigate through to operate a computer: the hardware, the OS, the browser installed on the OS, the website the browser is displaying, the website’s modals and disclosure statements, etc. To them, the experience is flat. We should not expect these users to juggle this kind of cognitive overhead. These kinds of abstractions are crafted to be analogous to real-world objects, specifically so people can get what they want from a digital system without having to be programmers. Adding unnecessary complexity weakens these metaphors and gives users one less reference point to rely on. Remember the cyberball experiment. When a user is already in a distressed emotional state, our poorly-designed custom scrollbar might be the death-by-a-thousand-paper-cuts moment where they give up on trying to get what they want and reject the system entirely. While Morrow’s article was written in 2011, it’s just as relevant now as it was then. More and more people are using the internet globally, and more and more services integral to living daily life are getting digitized. It’s up to us as responsible designers and developers to be sure we make everyone, regardless of device, circumstance, or ability feel welcome. In addition to unnecessarily abandoning external consistency, there is the issue of custom scrollbar styling potentially not having sufficient color contrast. The too-light colors can create a situation where a person experiencing low-vision conditions won’t be able to perceive, and therefore operate, a website’s scrolling mechanism. This article won’t even begin to unpack the issues involved with custom implementations of scrollbars, where instead of theming the OS’s native scrollbars with CSS, one instead replaces them with a JavaScript solution. Trust me when I say I have yet to see one implemented in a way that could successfully and reliably recreate all features and functionality across all devices, OSes, browsers, and browsing modes. In my opinion? Don’t alter the default appearance of an OS’s scrollbars. Use that time to work on something else instead, say, checking for and fixing color contrast problems. Scrolling The main concern about altering scrolling behavior is one of consent: it’s taking an externally consistent, system-wide behavior and suddenly altering it without permission. The term scrolljacking has been coined to describe this practice. It is not to be confused with scrollytelling, a more considerate treatment of scrolling behavior that honors the OS’s scrolling settings. Altering the scrolling behavior on your website or web app can fly in the face of someone’s specific, expressed preferences. For some people, it’s simply an annoyance. For people with motor control concerns, it could make moving through a site difficult. In some extreme cases, the unannounced discrepancy between the amount of scrolling and the distance traveled can also be vestibular triggers. Another consideration is if your modified scrolling behavior accidentally locks out people who don’t use mice, touch, or trackpads to scroll. All in all, I think Robin Rendle said it best: Scrolljacking, as I shall now refer to it both sarcastically and honestly, is a failure of the web designer’s first objective; it attacks a standardised pattern and greedily assumes control over the user’s input. Highlighting Another OS feature we’re permitted to style in the browser is highlighted text. Much like scrollbars, this is an interface element that is shared by all apps on the OS, not just the browser. Breaking the external consistency of the OS’s highlighting color has a lot of the same concerns as styled scrollbars, namely altering the expected behavior of something that functions reliably everywhere else. It’s potentially disorienting and alienating, and may deny someone’s expressed preferences. Some people highlight text as they read. If your custom highlight style has a low contrast ratio between the highlighted text color and the highlighted text’s background color, the person reading your website or web app may be unable to perceive the text they’re highlighting. The effect will cause the text to seemingly disappear as they try to read. Other people just may not care for your aesthetic sensibilities. Both macOS and Windows allow you to specify a custom highlight color. In a scenario where someone has deliberately set a preference other than the system default, a styled highlight color may override their stated specifications. For me, the potential risks far outweigh the vanity of a bespoke highlight style—better to just leave it be. Text resizing Lots of people change text size to suit their needs. And that’s a good thing. We want people to be able to read our content and act upon it, regardless of whatever circumstances they may be experiencing. For the problem of too-small text, some designers turn to text resizing widgets, a custom UI pattern that lets a person cycle through a number of preset CSS font-size values. Commonly found in places with heavy text content, text resizing widgets are often paired with complex, multicolumn designs. News sites are a common example. Before I dive into my concerns with text resizing widgets, I want to ask: if you find that your site needs a specialized widget to manage your text size, why not just take the simpler route and increase your base text size? Like many accessibility concerns, a request for a larger font size isn’t necessarily indicative of a permanent disability condition. It’s often circumstantial, such as a situation where you’re showing a website on your office’s crappy projector. Browsers allow users to change their preferred default font size, resizing text across websites accordingly. Browsers excel at handling this setting when you write CSS that takes advantage of unitless line-height values and relative font-size units. Some designers may feel that granting this liberty to users somehow detracts from their intended branding. Good designers understand that there’s more to branding than just how something looks. It’s about implementing the initial design in the browser, then working with the browser’s capabilities to best serve the person using it. Even if things like the font size are adjusted, a strong brand will still shine through with the ease of your user flows, quality of your typography and palette, strength of your copywriting, etc. Unfortunately, custom browser text resizing widgets lack a universal approach. If you rely on browser text settings, it just works—consistently, with the same controls, gestures, and keyboard shortcuts, for every page on every website, even in less-than-ideal conditions. You don’t have to write and maintain extra code, test for regressions, or write copy instructing the user on where to find your site’s text resizing widget and how to use it. Behavioral consistency is incredibly important. Browser text resizing is applied to all text on the page proportionately every time the setting is changed. These settings are also retained for the next time you visit. Not every custom text resizing widget does this, nor will it resize all content to the degree stipulated by the Web Content Accessibility Guidelines. High-contrast themes When I say high-contrast themes, I’m not talking about things like a dark mode. I’m talking about a response to people reporting that they need to change your website or web app’s colors to be more visually accessible to them. Much like text resizing controls, themes that are designed to provide higher contrast color values are perplexing: if you’re taking the time to make one, why not just fix the insufficient contrast values in your regular CSS? Effectively managing themes in CSS is a complicated, resource-intensive affair, even under ideal situations. Most site-provided high-contrast themes are static in that the designer or developer made decisions about which color values to use, which can be a problem. Too much contrast has been known to be a trigger for things like migraines, as well as potentially making it difficult to focus for users with some forms of attention-deficit hyperactivity disorder (ADHD). The contrast conundrum leads us to a difficult thing to come to terms with when it comes to accessibility: what works for one person may actually inhibit another. Because of this, it’s important to make things open and interoperable. Leave ultimate control up to the end user so they may decide how to best interact with content. If you are going to follow through on providing this kind of feature, some advice: model it after the Windows High Contrast mode. It’s a specialized Windows feature that allows a person to force a high color palette onto all aspects of the OS’s UI, including anything the browser displays. It offers four themes out of the box but also allows a user to suit their individual needs by specifying their own colors. Your high contrast mode feature should do the same. Offer a range of themes with different palettes, and let the user pick colors that work best for them—it will guarantee that if your offerings fail, people still have the ability to self-select. Moving focus Keyboard focus is how people who rely on input such as keyboards, switch controls, voice inputs, eye tracking, and other forms of assistive technology navigate and operate digital interfaces. While you can do things like use the autofocus attribute to move keyboard focus to the first input on a page after it loads, it is not recommended. For people experiencing low- and no-vision conditions, it is equivalent to being abruptly and instantaneously moved to a new location. It’s a confusing and disorienting experience—there’s a reason why there’s a trope in sci-fi movies of people vomiting after being teleported for the first time. For people with motor control concerns, moving focus without their permission means they may be transported to a place where they didn’t intend to go. Digging themselves out of this location becomes annoying at best and effort-intensive at worst. Websites without heading elements or document landmarks to serve as navigational aids can worsen this effect. This is all about consent. Moving focus is fine so long as a person deliberately initiates an action that requires it (shifting focus to an opened modal, for example). I don’t come to your house and force you to click on things, so don’t move my keyboard focus unless I specifically ask you to. Let the browser handle keyboard focus. Provided you use semantic markup, browsers do this well. Some tips: The clipboard and browser history The clipboard is sacred space. Don’t prevent people from copying things to it, and don’t append extra content to what they copy. The same goes for browser history and back and forward buttons. Don’t mess around with time travel, and just let the browser do its job. Wrapping up In the game part of cyberball, the fun comes from being able to participate with others, passing the ball back and forth. With the web, fun comes from being able to navigate through it. In both situations, fun stops when people get locked out, forced to watch passively from the sidelines. Fortunately, the web doesn’t have to be one long cyberball experiment. While altering the powerful, assistive technology-friendly features of browsers can enhance the experience for some users, it carries a great risk of alienating others if changes are made with ignorance about exactly how much will be affected. Remember that this is all in the service of what ultimately matters: creating robust experiences that allow people to successfully use your website or web app regardless of their ability or circumstance. Sometimes the best strategy is to let things be.
Categories: Design

UX in the Age of Personalization

Thu, 01/17/2019 - 06:55
If you listened to episode 180 of The Big Web Show, you heard two key themes: 1) personalization is now woven into much of the fabric of our digital technology, and 2) designers need to be much more involved in its creation and deployment. In my previous article we took a broad look at the first topic: the practice of harvesting user data to personalize web content, including the rewards (this website gets me!) and risks (creepy!). In this piece, we will take a more detailed look at the UX practitioner’s emerging role in personalization design: from influencing technology selection, to data modeling, to page-level implementation. And it’s high time we did. A call to arms Just as UX people took up the torch around content strategy years ago, there is a watershed moment quickly approaching for personalization strategy. Simply put, the technology in this space is far outpacing the design practice. For example, while “personalized” emails have been around forever (“Dear COOLIN, …”), it’s now estimated that some 45% of organizations [PDF] have attempted to personalize their homepage. If that scares you, it should: the same report indicated that fewer than a third think it’s actually working. While good old “mail merge” personalization has been around forever, more organizations are now personalizing their website content. Source: Researchscape International survey of 300 marketing professionals from five countries, conducted February 22 to March 28, 2018. As Jeff MacIntyre points out, “personalization failures are typically design failures.” Indeed, many personalization programs are still driven primarily out of marketing and IT departments, a holdover from the legacy of the inbound, “creepy” targeted ad. Fixing that model will require the same paradigm shift we’ve used to tackle other challenges in our field: intentionally moving design “upstream,” in this case to technology selection, data collection, and page-level implementation. That’s where you come in. In fact, if you’re anything like me, you’ve been doing this, quietly, already. Here are just a few examples of UX-specific tasks I’ve completed on recent design projects that had personalization aspects:
  • aligning personalization to the core content strategy;
  • working with the marketing team to understand goals and objectives;
  • identifying user segments (personas) that may benefit from personalized content;
  • drafting personalization use cases;
  • assisting the technical team with product selection;
  • helping to define the user data model, including first- and third-party sources;
  • wireframing personalized components in the information architecture;
  • taking inventory of existing content to repurpose for personalization;
  • writing or editing new personalized copy;
  • working with the design team to create personalized images;
  • developing a personalization editorial calendar and governance model;
  • helping to set up and monitor results from a personalization pilot;
  • partnering with the analytics team to make iterative improvements;
  • being a voice for the personalization program’s ethical standards;
  • and monitoring customer feedback to make sure people aren’t freaking the f* out.
Sound familiar? Many of these are simply variants on the same, user-centered tactics you’ve relied on for years. The difference now is that personalization creates a “third dimension” of complexity relative to audience and content. We’ll define that complexity further in two parts: technical design and information design. (We should note again that the focus of this article is personalizing web content, although many of the same principles also apply to email and native applications.) Part 1: Personalization technical design Influencing technology decisions When clients or internal stakeholders come to you with a desire to “do personalization,” the first thing to ask is what does that mean. As you’ve likely noticed, the technology landscape has now matured to the point where you can “personalize” a digital experience based on just about anything, from basic geolocation to complex machine learning algorithms. What’s more, such features are increasingly baked into your own CMS or readily available from third-party plugins (see chart below). So defining what personalization is—and isn’t—is a critical first step. To accomplish this, I suggest asking two questions: 1) What data can you ethically collect on your users, and 2) which tactics best complement this data. Some capabilities may already exist in your current systems; some you may need to build into your future technology roadmap. The following is by no means an exhaustive list but highlights a few of the popular tactics out there today, and tools that support them: Tactic Definition Examples Geolocation Personalizing based on the physical location of the user, via a geolocation-enabled device or a web browser IP address (which can triangulate your position based on nearby wifi devices). Examples: If I’m in Washington, DC, show me promotions for DC. If I’m in Paris, show me promotions for Paris, in French.

Sample Tools: MaxMind, HTML5 API Quizzes and Profile Info A simple, cost-effective way to gather first-party user data by asking basic questions to help assign someone to a segment. Often done as a layover “intercept” when the user arrives, which can then be modified based on a cookied profile. Generally must be exceptionally brief to be effective. Examples: Are you interested in our service for home use or business use? Are you in the market to buy or sell a house? Campaign Source One of the most popular methods of personalization, it directs a user to a customized landing page based on incoming campaign data. Can be used for everything from passing a unique discount code to personalizing content on the entire site. Examples: Customize landing page based on incoming email campaigns, social media campaigns, and paid search campaigns. Clicks or Pages Viewed Slightly more advanced approach to personalizing based on behavior; common on ecommerce. Examples: Products you previously viewed; suggested content you’ve recently been looking at.

Sample Tools: Dynamic Yield, Optimizely SIC and NAICS Codes Standard Industrial Classification (SIC) and North American Industry Classification System (NAICS) for classifying industries based on a universal four-digit code, e.g., Manufacturing 2000–3999. Helpful for determining who is visiting you from a business location, based on incoming IP address. Examples: Show me a different message if I work in the fashion industry vs. hog farming.

Sample Tools: Marketo, Oracle (BlueKai), Demandbase Geofencing Contextual personalization within a “virtual perimeter.” Establishes a fixed geographical boundary based on your device location, typically through RFID or GPS. Your device can then take an action when you enter or leave the location. Examples: Show me my boarding pass when I’m at the airport. Remind me about unused gift cards when I enter the store.

Sample Tools:, Thinknear, Google Geofencing API. Behavioral Profiling Add a user to a segment based on similar users who fall into that segment. Often combined with machine learning to identify new segments that humans wouldn’t be able to predict. Examples: Sitecore pattern cards, e.g., impulse purchaser, buys in bulk, bargain hunter; expedites shipping. Machine Learning Identify patterns across large sets of data (often across channels) to better predict what a user will want. In theory, improves over time as algorithms “learn” from thousands of interactions. (Obvious downside: your site will need to support thousands of interactions.) Examples: Azure Machine Learning Studio, BloomReach (Hippo), Sitecore (xConnect, Cortex), Adobe Sensei. As you can see, the best tactic(s) can vary dramatically based on your audience and how they interact with you. For example, if you’re a high-volume, B2C ecommerce site, you may have enough click-stream data to support useful personalized product recommendations. Conversely, if you’re a B2B business with a qualified lead model and fewer unique visitors, you may be better served by third-party data to help you tailor your message based on industry type (NAICS code) or geography. To help illustrate this idea, let’s do a quick mapping of tactics relative to visitor volume and session time: To find your personalization “sweet spot,” consider your audience in terms of volume (number of visits) and average attention span (time on site). The good news here is that you needn’t have a massive data platform in place; you can begin to build audience profiles simply by asking users to self-identify via quizzes or profile info. But in either scenario, your goal is the same: help guide the technology decision toward a personalization approach that provides actual value to your audience, not “because we can.” Part 2: Personalization information design Personalization deliverables Once you have a sense of the technical possibilities, it’s time to determine how the personalized experience will look. Let’s pretend we’re designing for a venture several of you inquired about in my previous article: Reindeer Hugs International. As the name implies, this is a nonprofit that provides hugs to reindeer. RHI recently set new business goals and wants to personalize the website to help achieve them. Seems reputable. To address this goal, we propose four UX-specific deliverables:
  1. segments worksheet;
  2. campaigns worksheet;
  3. personalization wireframes;
  4. and personalization copy deck.
Following the technical model we discussed earlier, the first thing we do is define our audience based on existing site interaction patterns. We discover that RHI doesn’t get a ton of organic traffic, but they do have a reasonably active set of authenticated users (existing members) as well as some paid social media campaigns. Working with the marketing team, we propose personalizing the site for three high-potential segments, as follows: Segments worksheet Segment How to Identify Personalization Goal Messaging Strategy Current Members Logged in or made guest contribution (track via cookie) Improve engagement with current members by 10% You’re a hugging rock star, but you can hug it out even more. Non-member Males Inbound Facebook and Instagram campaigns Improve conversion with non-member males age 25–34 by 5% Make reindeer hugging manly again. Non-member Parents Inbound Facebook and Instagram campaigns Improve conversion with non-member parents age 31–49 by 5% Reindeer hugging is great for the kids. Next, let’s determine the specific value we could add for these segments when they come to the site. To do this, we’ll revisit a model that we looked at previously for the four personalization content types. This will help us organize the collective content or “campaign” we show each segment based on a specific personalization goal: A Personalization Content Model showing four flavors of personalized content. For example, current members who are logged in might benefit from a “Make Easier” campaign of links to members-only content. Conversely, each of our three segments could benefit from a personalized “Cross-Sell” campaign to help generate awareness. Let’s capture our ideas like this: Campaigns worksheet Segment Alert Make Easier Cross-Sell Enrich Current Members Geolocation Banner
Hugs needed in your area (displays to any user with location data). Links for members who are logged in, such as to profile information, a member directory, and reindeer friends catalog. Capital Campaign
Generate awareness by audience (minimum three distinct messages). Current Member Blog
Invest in creating original, hug-provoking content to further our brand. Non-member Males Age 25–34 Non-Member CTA In the non-member experience, this will be replaced by a CTA. Thought Leadership
Demonstrate that we are the definitive source for reindeer hugs. Non-member Parents Age 28–39 .multirow-span th, .multirow-span td { background: none !important; border-right: 1px solid #bfbfbf; border-bottom: 1px solid #bfbfbf; } .multirow-span thead th:last-child, .multirow-span td:last-child { border-right: none; } Personalization wireframes Now let’s decide where on the site we want to run these personalized campaigns. This isn’t too dissimilar from the work you already do around templates and components, with the addition that we can now have personalized zones. You can think of these as blocks where the CMS (or third-party plugin) will be running a series of calculations to determine the user segment in real-time (or based on a previously cached profile). To get the most coverage, these are typically dropped in at the template level. Here are examples for our home page template and interior page template: Showing component-level “zoning” on homepage and landing page templates. The colors correspond to the personalization content type. Everything in white is the non-personalized, or “static,” content, which never changes, regardless of who you are. The personalized zones themselves (color-coded based on our content model) will also have an underlying default or canonical content set that appears if the system doesn’t get a personalized match. (Note: this is also the version of the content that is typically indexed by search engines.) As you can see, an important rule of thumb is to personalize around the main content, not the entire page. There are a variety of reasons for this, including the risk of getting the audience wrong, effects on search indexing, and what’s known as the infinite content problem, i.e., can you realistically create content for every single audience on every single component? (Hint: no.) OK, we’re getting close! Finally, let’s look at what specifically we want the system to show in these slots. Based on our campaigns worksheet, we know how many permutations of content we need. We sit down with the creative team to design our targeted messages, including the copy, images, and calls to action. Here’s what the capital campaign (the blue zone) might look like for our three audiences: Personalization copy deck Reindeer Hugs International: Capital Campaign (Cross-Sell) Element Definition Asset Message A:
Current Member Headline: Take Your Hugs to the Next Level

Copy: You’re a hugging expert. But did you know you could hug two reindeers at once?

Primary CTA: Sign up for our Two-for-One Hugs

Secondary CTA: Learn More
Source: Current-Member.jpg
Full-size render: 900x450
Thumbnail render: 300x200 Message B:
Real Men Hug Headline: Real Men Hug Reindeer

Copy: Are you a real man?

Primary CTA: Prove It

Secondary CTA: [None]
Source: Man-Hug.jpg
Full-size render: 900x450
Thumbnail render: 300x200 Message C:
Parents with Young Kids Headline: Looking for a fun activity to do with the kids?

Copy: Reindeer hugs are 100% kid-friendly and 200% environmentally-friendly.

Primary CTA: Shop Our Family Plan

Secondary CTA: Learn More
Source: Parents-Kids.jpg
Full-size render: 900x450
Thumbnail render: 300x200 That’s a pretty good start. We would want to follow a similar approach to detail our other three content campaigns, including alerts (e.g., hugs needed in your area), make easier (e.g., member shortcuts), and enrichment content (e.g., blog articles on latest reindeer fashions). When all the campaigns are up and running, we might expect the homepage to look something like this when seen by two different audiences, simultaneously, in real-time, in different browser sessions: Wireframes illustrating the anticipated homepage delivery to two distinct audiences: Current Member (left) and Non-Member Male 25–34 (right). If the system did not get an audience match, a default or non-personalized set of content would be shown. Part 3: Advanced personalization techniques Digital Experience Platforms Of course, all of that work was fairly manual. If you are lucky enough to be working with an advanced DMP (Data Management Platform) or integrated DXP (Digital Experience Platform) then you have even more possibilities at your disposal. For example, machine learning and behavior profiling can help you discover segments over time that you might never have dreamed of (the study we referenced earlier showed that 26% of marketing programs have tried some form of algorithmic one-to-one approach; 68% still use rules-based targeting to segments). This can be enhanced via parametric scoring, where actioning off of multiple data inputs can help you create blends of audience types (in our example, a thirty-three-year-old dad might get 60 percent Parent and 40 percent Real Man … or whatever). Likewise, on the content side, content scoring can help you deliver more nuanced content. (For example, we might tag an article with 20 percent Reindeer Advocacy and 80 percent Hug Best Practices.) Platforms like Sitecore can even illustrate these metrics, like in this example of a pattern card: The diagram at left shows how a particular user scores (some combination of research and returns merchandise). This most closely correlates to the “Neurotic Shopper” card, so we might show this user content on our free-returns policy. Source: The Berndt Group. Cult of the complex While all of that is super cool, even the most tech-savvy among us will benefit from starting out “simple,” lest you fall prey to the cult of the complex. The manual process of identifying your target audience and use cases, for example, is foundational to building an extensible personalization program, regardless of your tech stack. At a minimum, this approach will help you get buy-in from your team and organization vs. just telling everyone the site will be personalized in a “black box” somewhere. And even with the best-in-class products, I have yet to find seamless “one-click” personalization, where the system somehow magically does everything from finding audiences to pumping out content, all in real time. We’ll get there one day, perhaps. But, in the meantime, it’s up to you.
Categories: Design

Conversations with Robots: Voice, Smart Agents &#038; the Case for Structured Content

Thu, 01/10/2019 - 06:55
In late 2016, Gartner predicted that 30 percent of web browsing sessions would be done without a screen by 2020. Earlier the same year, Comscore had predicted that half of all searches would be voice searches by 2020. Though there’s recent evidence to suggest that the 2020 picture may be more complicated than these broad-strokes projections imply, we’re already seeing the impact that voice search, artificial intelligence, and smart software agents like Alexa and Google Assistant are making on the way information is found and consumed on the web. In addition to the indexing function that traditional search engines perform, smart agents and AI-powered search algorithms are now bringing into the mainstream two additional modes of accessing information: aggregation and inference. As a result, design efforts that focus on creating visually effective pages are no longer sufficient to ensure the integrity or accuracy of content published on the web. Rather, by focusing on providing access to information in a structured, systematic way that is legible to both humans and machines, content publishers can ensure that their content is both accessible and accurate in these new contexts, whether or not they’re producing chatbots or tapping into AI directly. In this article, we’ll look at the forms and impact of structured content, and we’ll close with a set of resources that can help you get started with a structured content approach to information design. The role of structured content In their recent book, Designing Connected Content, Carrie Hane and Mike Atherton define structured content as content that is “planned, developed, and connected outside an interface so that it’s ready for any interface.” A structured content design approach frames content resources—like articles, recipes, product descriptions, how-tos, profiles, etc.—not as pages to be found and read, but as packages composed of small chunks of content data that all relate to one another in meaningful ways. In a structured content design process, the relationships between content chunks are explicitly defined and described. This makes both the content chunks and the relationships between them legible to algorithms. Algorithms can then interpret a content package as the “page” I’m looking for—or remix and adapt that same content to give me a list of instructions, the number of stars on a review, the amount of time left until an office closes, and any number of other concise answers to specific questions. Structured content is already a mainstay of many types of information on the web. Recipe listings, for instance, have been based on structured content for years. When I search, for example, “bouillabaisse recipe” on Google, I’m provided with a standard list of links to recipes, as well as an overview of recipe steps, an image, and a set of tags describing one example recipe: A “featured snippet” for on the Google results page. The same page viewed in Google’s Structured Data Testing Tool. The pane on the right shows the machine-readable values. This “featured snippet” view is possible because the content publisher,, has broken this recipe into the smallest meaningful chunks appropriate for this subject matter and audience, and then expressed information about those chunks and the relationships between them in a machine-readable way. In this example, has used both semantic HTML and linked data to make this content not merely a page, but also legible, accessible data that can be accurately interpreted, adapted, and remixed by algorithms and smart agents. Let’s look at each of these elements in turn to see how they work together across indexing, aggregation, and inference contexts. Software agent search and semantic HTML Semantic HTML is markup that communicates information about the meaningful relationships between document elements, as opposed to simply describing how they should look on screen. Semantic elements such as heading tags and list tags, for instance, indicate that the text they enclose is a heading (<h1>) for the set of list items (<li>) in the ordered list (<ol>) that follows. HTML structured in this way is both presentational and semantic because people know what headings and lists look like and mean, and algorithms can recognize them as elements with defined, interpretable relationships. HTML markup that focuses only on the presentational aspects of a “page” may look perfectly fine to a human reader but be completely illegible to an algorithm. Take, for example, the City of Boston website, redesigned a few years ago in collaboration with top-tier design and development partners. If I want to find information about how to pay a parking ticket, a link from the home page takes me directly to the “How to Pay a Parking Ticket” screen (scrolled to show detail): As a human reading this page, I easily understand what my options are for paying: I can pay online, in person, by mail, or over the phone. If I ask Google Assistant how to pay a parking ticket in Boston, however, things get a bit confusing: None of the links provided in the Google Assistant results take me directly to the “How to Pay a Parking Ticket” page, nor do the descriptions clearly let me know I’m on the right track. (I didn’t ask about requesting a hearing.) This is because the content on the City of Boston parking ticket page is styled to communicate content relationships visually to human readers but is not structured semantically in a way that also communicates those relationships to inquisitive algorithms. The City of Seattle’s “Pay My Ticket” page, though it lacks the polished visual style of Boston’s site, also communicates parking ticket payment options clearly to human visitors: The equivalent Google Assistant search, however, offers a much more helpful result than we see with Boston. In this case, the Google Assistant result links directly to the “Pay My Ticket” page and also lists several ways I can pay my ticket: online, by mail, and in person. Despite the visual simplicity of the City of Seattle parking ticket page, it more effectively ensures the integrity of its content across contexts because it’s composed of structured content that is marked up semantically. “Pay My Ticket” is a level-one heading (<h1>), and each of the options below it are level-two headings (<h2>), which indicate that they are subordinate to the level-one element. These elements, when designed well, communicate information hierarchy and relationships visually to readers, and semantically to algorithms. This structure allows Google Assistant to reasonably surmise that the text in these <h2> headings represents payment options under the <h1> heading “Pay My Ticket.” While this use of semantic HTML offers distinct advantages over the “page display” styling we saw on the City of Boston’s site, the Seattle page also shows a weakness that is typical of manual approaches to semantic HTML. You’ll notice that, in the Google Assistant results, the “Pay by Phone” option we saw on the web page was not listed. If we look at the markup of this page, we can see that while the three options found by Google Assistant are wrapped in both <strong> and <h2> tags, “Pay by Phone” is only marked up with an <h2>. This irregularity in semantic structure may be what’s causing Google Assistant to omit this option from its results. Although each of these elements would look the same to a sighted human creating this page, the machine interpreting it reads a difference. While WYSIWYG text entry fields can theoretically support semantic HTML, in practice they all too often fall prey to the idiosyncrasies of even the most well-intentioned content authors. By making meaningful content structure a core element of a site’s content management system, organizations can create semantically correct HTML for every element, every time. This is also the foundation that makes it possible to capitalize on the rich relationship descriptions afforded by linked data. Linked data and content aggregation In addition to finding and excerpting information, such as recipe steps or parking ticket payment options, search and software agent algorithms also now aggregate content from multiple sources by using linked data. In its most basic form, linked data is “a set of best practices for connecting structured data on the web.” Linked data extends the basic capabilities of semantic HTML by describing not only what kind of thing a page element is (“Pay My Ticket” is an <h1>), but also the real-world concept that thing represents: this <h1> represents a “pay action,” which inherits the structural characteristics of “trade actions” (the exchange of goods and services for money) and “actions” (activities carried out by an agent upon an object). Linked data creates a richer, more nuanced description of the relationship between page elements, and it provides the structural and conceptual information that algorithms need to meaningfully bring data together from disparate sources. Say, for example, that I want to gather more information about two recommendations I’ve been given for orthopedic surgeons. A search for a first recommendation, Scott Ruhlman, MD, brings up a set of links as well as a Knowledge Graph info box containing a photo, location, hours, phone number, and reviews from the web. If we run Dr. Ruhlman’s Swedish Hospital profile page through Google’s Structured Data Testing Tool, we can see that content about him is structured as small, discrete elements, each of which is marked up with descriptive types and attributes that communicate both the meaning of those attributes’ values and the way they fit together as a whole—all in a machine-readable format. In this example, Dr. Ruhlman’s profile is marked up with microdata based on the vocabulary. is a collaborative effort backed by Google, Yahoo, Bing, and Yandex that aims to create a common language for digital resources on the web. This structured content foundation provides the semantic base on which additional content relationships can be built. The Knowledge Graph info box, for instance, includes Google reviews, which are not part of Dr. Ruhlman’s profile, but which have been aggregated into this overview. The overview also includes an interactive map, made possible because Dr. Ruhlman’s office location is machine-readable. The search for a second recommendation, Stacey Donion, MD, provides a very different experience. Like the City of Boston site above, Dr. Donion’s profile on the Kaiser Permanente website is perfectly intelligible to a sighted human reader. But because its markup is entirely presentational, its content is virtually invisible to software agents. In this example, we can see that Google is able to find plenty of links to Dr. Donion in its standard index results, but it isn’t able to “understand” the information about those sources well enough to present an aggregated result. In this case, the Knowledge Graph knows Dr. Donion is a Kaiser Permanente physician, but it pulls in the wrong location and the wrong physician’s name in its attempt to build a Knowledge Graph display. You’ll also notice that while Dr. Stacey Donion is an exact match in all of the listed search results—which are numerous enough to fill the first results page—we’re shown a “did you mean” link for a different doctor. Stacy Donlon, MD, is a neurologist who practices at MultiCare Neuroscience Center, which is not affiliated with Kaiser Permanente. Multicare does, however, provide semantic and linked data-rich profiles for their physicians. Voice queries and content inference The increasing prevalence of voice as a mode of access to information makes providing structured, machine-intelligible content all the more important. Voice and smart software agents are not just freeing users from their keyboards, they’re changing user behavior. According to LSA Insider, there are several important differences between voice queries and typed queries. Voice queries tend to be:
  • longer;
  • more likely to ask who, what, and where;
  • more conversational;
  • and more specific.
In order to tailor results to these more specifically formulated queries, software agents have begun inferring intent and then using the linked data at their disposal to assemble a targeted, concise response. If I ask Google Assistant what time Dr. Ruhlman’s office closes, for instance, it responds, “Dr. Ruhlman’s office closes at 5 p.m.,” and displays this result: These results are not only aggregated from disparate sources, but are interpreted and remixed to provide a customized response to my specific question. Getting directions, placing a phone call, and accessing Dr. Ruhlman’s profile page on are all at the tips of my fingers. When I ask Google Assistant what time Dr. Donion’s office closes, the result is not only less helpful but actually points me in the wrong direction. Instead of a targeted selection of focused actions to follow up on my query, I’m presented with the hours of operation and contact information for MultiCare Neuroscience Center. MultiCare Neuroscience Center, you’ll recall, is where Dr. Donlon—the neuroscientist Google thinks I may be looking for, not the orthopedic surgeon I’m actually looking for—practices. Dr. Donlon’s profile page, much like Dr. Ruhlman’s, is semantically structured and marked up with linked data. To be fair, subsequent trials of this search did produce the generic (and partially incorrect) practice location for Dr. Donion (“Kaiser Permanente Orthopedics: Morris Joseph MD”). It is possible that through repeated exposure to the search term “Dr. Stacey Donion,” Google Assistant fine-tuned the responses it provided. The initial result, however, suggests that smart agents may be at least partially susceptible to the same availability heuristic that affects humans, wherein the information that is easiest to recall often seems the most correct. There’s not enough evidence in this small sample to support a broad claim that algorithms have “cognitive” bias, but even when we allow for potentially confounding variables, we can see the compounding problems we risk by ignoring structured content. “Donlon,” for example, may well be a more common name than “Donion” and may be easily mistyped on a QWERTY keyboard. Regardless, the Kaiser Permanente result we’re given above for Dr. Donion is for the wrong physician. Furthermore, in the Google Assistant voice search, the interaction format doesn’t verify whether we meant Dr. Donlon; it just provides us with her facility’s contact information. In these cases, providing clear, machine-readable content can only work to our advantage. The business case for structured content design In 2012, content strategist Karen McGrane wrote that “you don’t get to decide which platform or device your customers use to access your content: they do.” This statement was intended to help designers, strategists, and businesses prepare for the imminent rise of mobile. It continues to ring true for the era of linked data. With the growing prevalence of smart assistants and voice-based queries, an organization’s website is less and less likely to be a potential visitor’s first encounter with rich content. In many cases—such as finding location information, hours, phone numbers, and ratings—this pre-visit engagement may be a user’s only interaction with an information source. These kinds of quick interactions, however, are only one small piece of a much larger issue: linked data is increasingly key to maintaining the integrity of content online. The organizations I’ve used as examples, like the hospitals, government agencies, and colleges I’ve consulted with for years, don’t measure the success of their communications efforts in page views or ad clicks. Success for them means connecting patients, constituents, and community members with services and accurate information about the organization, wherever that information might be found. This communication-based definition of success readily applies to virtually any type of organization working to further its business goals on the web. The model of building pages and then expecting users to discover and parse those pages to answer questions, though time-tested in the pre-voice era, is quickly becoming insufficient for effective communication. It precludes organizations from participating in emergent patterns of information seeking and discovery. And—as we saw in the case of searching for information about physicians—it may lead software agents to make inferences based on insufficient or erroneous information, potentially routing customers to competitors who communicate more effectively. By communicating clearly in a digital context that now includes aggregation and inference, organizations are more effectively able to speak to their users where users actually are, be it on a website, a search engine results page, or a voice-controlled digital assistant. They are also able to maintain greater control over the accuracy of their messages by ensuring that the correct content can be found and communicated across contexts. Getting started: who and how Design practices that build bridges between user needs and technology requirements to meet business goals are crucial to making this vision a reality. Information architects, content strategists, developers, and experience designers all have a role to play in designing and delivering effective structured content solutions. Practitioners from across the design community have shared a wealth of resources in recent years on creating content systems that work for humans and algorithms alike. To learn more about implementing a structured content approach for your organization, these books and articles are a great place to start:
Categories: Design

Taming Data with JavaScript

Thu, 12/20/2018 - 06:55
I love data. I also love JavaScript. Yet, data and client-side JavaScript are often considered mutually exclusive. The industry typically sees data processing and aggregation as a back-end function, while JavaScript is just for displaying the pre-aggregated data. Bandwidth and processing time are seen as huge bottlenecks for dealing with data on the client side. And, for the most part, I agree. But there are situations where processing data in the browser makes perfect sense. In those use cases, how can we be successful? Think about the data Working with data in JavaScript requires both complete data and an understanding of the tools available without having to make unnecessary server calls. It helps to draw a distinction between trilateral data and summarized data. Trilateral data consists of raw, transactional data. This is the low-level detail that, by itself, is nearly impossible to analyze. On the other side of the spectrum you have your summarized data. This is the data that can be presented in a meaningful and thoughtful manner. We’ll call this our composed data. Most important to developers are the data structures that reside between our transactional details and our fully composed data. This is our “sweet spot.” These datasets are aggregated but contain more than what we need for the final presentation. They are multidimensional in that they have two or more different dimensions (and multiple measures) that provide flexibility for how the data can be presented. These datasets allow your end users to shape the data and extract information for further analysis. They are small and performant, but offer enough detail to allow for insights that you, as the author, may not have anticipated. Getting your data into perfect form so you can avoid any and all manipulation in the front end doesn’t need to be the goal. Instead, get the data reduced to a multidimensional dataset. Define several key dimensions (e.g., people, products, places, and time) and measures (e.g., sum, count, average, minimum, and maximum) that your client would be interested in. Finally, present the data on the page with form elements that can slice the data in a way that allows for deeper analysis. Creating datasets is a delicate balance. You’ll want to have enough data to make your analytics meaningful without putting too much stress on the client machine. This means coming up with clear, concise requirements. Depending on how wide your dataset is, you might need to include a lot of different dimensions and metrics. A few things to keep in mind:
  • Is the variety of content an edge case or something that will be used frequently? Go with the 80/20 rule: 80% of users generally need 20% of what’s available.
  • Is each dimension finite? Dimensions should always have a predetermined set of values. For example, an ever-increasing product inventory might be too overwhelming, whereas product categories might work nicely.
  • When possible, aggregate the data—dates especially. If you can get away with aggregating by years, do it. If you need to go down to quarters or months, you can, but avoid anything deeper.
  • Less is more. A dimension that has fewer values is better for performance. For instance, take a dataset with 200 rows. If you add another dimension that has four possible values, the most it will grow is 200 * 4 = 800 rows. If you add a dimension that has 50 values, it’ll grow 200 * 50 = 10,000 rows. This will be compounded with each dimension you add.
  • In multidimensional datasets, avoid summarizing measures that need to be recalculated every time the dataset changes. For instance, if you plan to show averages, you should include the total and the count. Calculate averages dynamically. This way, if you are summarizing the data, you can recalculate averages using the summarized values.
Make sure you understand the data you’re working with before attempting any of the above. You could make some wrong assumptions that lead to misinformed decisions. Data quality is always a top priority. This applies to the data you are both querying and manufacturing. Never take a dataset and make assumptions about a dimension or a measure. Don’t be afraid to ask for data dictionaries or other documentation about the data to help you understand what you are looking at. Data analysis is not something that you guess. There could be business rules applied, or data could be filtered out beforehand. If you don’t have this information in front of you, you can easily end up composing datasets and visualizations that are meaningless or—even worse—completely misleading. The following code example will help explain this further. Full code for this example can be found on GitHub. Our use case For our example we will use BuzzFeed’s dataset from “Where U.S. Refugees Come From—and Go—in Charts.” We’ll build a small app that shows us the number of refugees arriving in a selected state for a selected year. Specifically, we will show one of the following depending on the user’s request:
  • total arrivals for a state in a given year;
  • total arrivals for all years for a given state;
  • and total arrivals for all states in a given year.
The UI for selecting your state and year would be a simple form: The code will:
  1. Send a request for the data.
  2. Convert the results to JSON.
  3. Process the data.
  4. Log any errors to the console. (Note: To ensure that step 3 does not execute until after the complete dataset is retrieved, we use the then method and do all of our data processing within that block.)
  5. Display results back to the user.
We do not want to pass excessively large datasets over the wire to browsers for two main reasons: bandwidth and CPU considerations. Instead, we’ll aggregate the data on the server with Node.js. Source data: [{"year":2005,"origin":"Afghanistan","dest_state":"Alabama","dest_city":"Mobile","arrivals":0}, {"year":2006,"origin":"Afghanistan","dest_state":"Alabama","dest_city":"Mobile","arrivals":0}, ... ] Multidimensional Data: [{"year": 2005, "state": "Alabama","total": 1386}, {"year": 2005, "state": "Alaska", "total": 989}, ... ] How to get your data structure into place AJAX and the Fetch API There are a number of ways with JavaScript to retrieve data from an external source. Historically you would use an XHR request. XHR is widely supported but is also fairly complex and requires several different methods. There are also libraries like Axios or jQuery’s AJAX API. These can be helpful to reduce complexity and provide cross-browser support. These might be an option if you are already using these libraries, but we want to opt for native solutions whenever possible. Lastly, there is the more recent Fetch API. This is less widely supported, but it is straightforward and chainable. And if you are using a transpiler (e.g., Babel), it will convert your code to a more widely supported equivalent. For our use case, we’ll use the Fetch API to pull the data into our application: window.fetchData = window.fetchData || {}; fetch('./data/aggregate.json') .then(response => { // when the fetch executes we will convert the response // to json format and pass it to .then() return response.json(); }).then(jsonData => { // take the resulting dataset and assign to a global object window.fetchData.jsonData = jsonData; }).catch(err => { console.log("Fetch process failed", err); }); This code is a snippet from the main.js in the GitHub repo The fetch() method sends a request for the data, and we convert the results to JSON. To ensure that the next statement doesn’t execute until after the complete dataset is retrieved, we use the then() method and do all our data processing within that block. Lastly, we console.log() any errors. Our goal here is to identify the key dimensions we need for reporting—year and state—before we aggregate the number of arrivals for those dimensions, removing country of origin and destination city. You can refer to the Node.js script /preprocess/index.js from the GitHub repo for more details on how we accomplished this. It generates the aggregate.json file loaded by fetch() above. Multidimensional data The goal of multidimensional formatting is flexibility: data detailed enough that the user doesn’t need to send a query back to the server every time they want to answer a different question, but summarized so that your application isn’t churning through the entire dataset with every new slice of data. You need to anticipate the questions and provide data that formulates the answers. Clients want to be able to do some analysis without feeling constrained or completely overwhelmed. As with most APIs, we’ll be working with JSON data. JSON is a standard that is used by most APIs to send data to applications as objects consisting of name and value pairs. Before we get back to our use case, let’s look at a sample multidimensional dataset: const ds = [{ "year": 2005, "state": "Alabama", "total": 1386, "priorYear": 1201 }, { "year": 2005, "state": "Alaska", "total": 811, "priorYear": 1541 }, { "year": 2006, "state": "Alabama", "total": 989, "priorYear": 1386 }]; With your dataset properly aggregated, we can use JavaScript to further analyze it. Let’s take a look at some of JavaScript’s native array methods for composing data. How to work effectively with your data via JavaScript Array.filter() The filter() method of the Array prototype (Array.prototype.filter()) takes a function that tests every item in the array, returning another array containing only the values that passed the test. It allows you to create meaningful subsets of the data based on select dropdown or text filters. Provided you included meaningful, discrete dimensions for your multidimensional dataset, your user will be able to gain insight by viewing individual slices of data. ds.filter(d => d.state === "Alabama"); // Result [{ state: "Alabama", total: 1386, year: 2005, priorYear: 1201 },{ state: "Alabama", total: 989, year: 2006, priorYear: 1386 }] The map() method of the Array prototype ( takes a function and runs every array item through it, returning a new array with an equal number of elements. Mapping data gives you the ability to create related datasets. One use case for this is to map ambiguous data to more meaningful, descriptive data. Another is to take metrics and perform calculations on them to allow for more in-depth analysis. Use case #1—map data to more meaningful data: => (d.state.indexOf("Alaska")) ? "Contiguous US" : "Continental US"); // Result [ "Contiguous US", "Continental US", "Contiguous US" ] Use case #2—map data to calculated results: => Math.round(((d.priorYear - / * 100)); // Result [-13, 56, 40] Array.reduce() The reduce() method of the Array prototype (Array.prototype.reduce()) takes a function and runs every array item through it, returning an aggregated result. It’s most commonly used to do math, like to add or multiply every number in an array, although it can also be used to concatenate strings or do many other things. I have always found this one tricky; it’s best learned through example. When presenting data, you want to make sure it is summarized in a way that gives insight to your users. Even though you have done some general-level summarizing of the data server-side, this is where you allow for further aggregation based on the specific needs of the consumer. For our app we want to add up the total for every entry and show the aggregated result. We’ll do this by using reduce() to iterate through every record and add the current value to the accumulator. The final result will be the sum of all values (total) for the array. ds.reduce((accumulator, currentValue) => accumulator +, 0); // Result 3364 Applying these functions to our use case Once we have our data, we will assign an event to the “Get the Data” button that will present the appropriate subset of our data. Remember that we have several hundred items in our JSON data. The code for binding data via our button is in our main.js: document.getElementById("submitBtn").onclick = function(e){ e.preventDefault(); let state = document.getElementById("stateInput").value || "All" let year = document.getElementById("yearInput").value || "All" let subset = window.fetchData.filterData(year, state); if (subset.length == 0 ) subset.push({'state': 'N/A', 'year': 'N/A', 'total': 'N/A'}) document.getElementById("output").innerHTML = `<table class="table"> <thead> <tr> <th scope="col">State</th> <th scope="col">Year</th> <th scope="col">Arrivals</th> </tr> </thead> <tbody> <tr> <td>${subset[0].state}</td> <td>${subset[0].year}</td> <td>${subset[0].total}</td> </tr> </tbody> </table>` } If you leave either the state or year blank, that field will default to “All.” The following code is available in /js/main.js. You’ll want to look at the filterData() function, which is where we keep the lion’s share of the functionality for aggregation and filtering. // with our data returned from our fetch call, we are going to // filter the data on the values entered in the text boxes fetchData.filterData = function(yr, state) { // if "All" is entered for the year, we will filter on state // and reduce the years to get a total of all years if (yr === "All") { let total = this.jsonData.filter( // return all the data where state // is equal to the input box dState => (dState.state === state) .reduce((accumulator, currentValue) => { // aggregate the totals for every row that has // the matched value return accumulator +; }, 0); return [{'year': 'All', 'state': state, 'total': total}]; } ... // if a specific year and state are supplied, simply // return the filtered subset for year and state based // on the supplied values by chaining the two function // calls together let subset = this.jsonData.filter(dYr => dYr.year === yr) .filter(dSt => dSt.state === state); return subset; }; // code that displays the data in the HTML table follows this. See main.js. When a state or a year is blank, it will default to “All” and we will filter down our dataset to that particular dimension, and summarize the metric for all rows in that dimension. When both a year and a state are entered, we simply filter on the values. We now have a working example where we:
  • Start with a raw, transactional dataset;
  • Create a semi-aggregated, multidimensional dataset;
  • And dynamically build a fully composed result.
Note that once the data is pulled down by the client, we can manipulate the data in a number of different ways without having to make subsequent calls to the server. This is especially useful because if the user loses connectivity, they do not lose the ability to manipulate the data. This is useful if you are creating a progressive web app (PWA) that needs to be available offline. (If you are not sure if your web app should be a PWA, this article can help.) Once you get a firm handle on these three methods, you can create just about any analysis that you want on a dataset. Map a dimension in your dataset to a broader category and summarize using reduce. Combined with a library like D3, you can map this data into charts and graphs to allow a fully customizable data visualization. Conclusion This article gives a better sense of what is possible in JavaScript when working with data. As I mentioned, client-side JavaScript is in no way a substitute for translating and transforming data on the server, where the heavy lifting should be done. But by the same token, it also shouldn’t be completely ruled out when datasets are treated properly.
Categories: Design
©2019 Richard Esmonde. All rights reserved.