Look up a list of items and return matches in Google Sheets

Anyone who’s spent time in spreadsheets dreads receiving that request to match a random list of things with another of list of things. Right? Except when you are used to using VLOOKUP. That tried-and-true method is very good at finding a match in a list against a another (usually longer) list.

But what happens when you need to do multiple VLOOKUP-like requests at the same time, say when you have a list of Student IDs and need to look up a list of corresponding names?

Wouldn’t it be great if you could do that in only a single step instead of multiple columns, tabs, and a long process?

Luckily, there’s a trick you can use in Google Sheets to take those multiple lookups and turn them into a single formula.

TL;dr – Matching each item in a list against another list

the answer in case you don’t want to keep reading – get the Google Sheets example here.

Why does this matter, anyway?

Looking up a list one item at a time is slow. Looking up all of the terms in a list is really useful, especially when you have a long list of IDs and a long list of values that might or might not match that list. Also, this method works for comma-delimited lists of different lengths in adjacent rows (e.g. cell 1 listing attendance for course 1 has two students, and cell 2 listing attendance for course 2 has five students, and only two of the students overlap both courses.)

WHATWHY
Look up a list of ID values in Google Sheets, returning a new comma-delimited string with the results of your lookupTransform a list in a single step

Recipe for building our formula

Given a comma delimited list of items you want to look up where the mapped value is not obvious, e.g. “ID123,ID456,ID567”, look each item up against a table of other information about that item where the lookup key is each item in the first list.

For example, you might have a list of Unique IDs and want to find out and return the corresponding mapped value, e.g. the name of people in a class. To do this you need to take the peopleID and return a name.

Each item in the list is the key you want to lookup in another list. One way to do this in multiple steps is to use VLOOKUP where you would run a vlookup to get a different value, e.g. to find that ID123 has the student name “Tony Schmidt”.

To do this lookup, you need to use the QUERY function in Google sheets to match multiple simultaneous conditions.

Using the matches keyword in QUERY, transform your comma-delimited list into a pipe-delimited list, then transpose and join the result to get a returned comma-delimited string from your QUERY function.

(fun fact: the QUERY function returns an Array which makes this possible.)

How does the formula work?

First, get the list items from our original list into a format that QUERY understands. The QUERY function in Google Sheets uses the term “matches” to look up one or more terms in the QUERY action. Using the the SPLIT function, we find each item in the comma-delimited list and then JOIN to replace commas with a “|” character, indicating item 1 OR item 2 so that the QUERY uses multiple terms to match.

After transforming our original list into a pipe-delimited list, we write a QUERY against the list we’re trying to match.

The goal is to select a term from column A (the one in the list with the keys matching our original list) with the corresponding value we want from column B (the ID value for each list item).

This table contains the steps to transform our original list into the comma-delimited result of looking up each term

What does it look like when you’re done?

When finished, we took the original list of ID123, ID456, ID457 and turned into the keys (names) that matched these ids in our lookup table:

  • ID123 matches “Tony Schmidt”
  • ID456 matches “Leroy Kuhn”
  • ID567 matches “Connie Wright”

Concatenating these and returning them yields the result “Tony Schmidt, Leroy Kuhn, Connie Wright.” If you wanted to make this fancier, you could use a string e.g.

="STUDENT NAMES:" & CHAR(10) & join(", ",transpose(query(Sheet1!A2:B3,"Select B where A matches '"&join("|",split(A2,","))&"'",0)))

To format your string further:

STUDENT NAMES:
Tony Schmidt, Leroy Kuhn, Connie Wright

Using the “&” to concatenate a string and the CHAR function to add a line break.

Now that you’ve seen the steps, go ahead and copy this original sheet to try it yourself. Happy Matching!

Spreadsheets are not scary

xkcd-spreadsheet
https://xkcd.com/1906/

What was the last document you created? I’m guessing it was either an email, a Word document, or a Google Doc. Perhaps it was a Slack message but since when you use Slack you create several hundred messages a month (or day), so let’s not count that. Why wasn’t it a spreadsheet (Excel or Google Sheets)?

The most likely reason you didn’t create a spreadsheet is that you wanted the freedom of an open page (hint: View > gridlines turns them off in Google Sheets). Maybe you wanted to insert pictures (hint: use the Image function or insert > image). Or maybe you just like the formatting that seems to happen almost automatically in Cloud-based apps these days. But really, why didn’t you start with a spreadsheet?

Spreadsheets are structured. True, they end up looking that way when they are done. Why not use a tab in your spreadsheet to test an idea (growth in a metric over time, adjusted by factors that change) and produce a sensitivity table to help illustrate a business challenge you are facing. When you are thinking about taking a metric from x to y by when (improve yield by 40% in 30 days) it’s important to show the impact of meeting or missing your expectations.

A suggestion: start with a written statement of what you are trying to solve, e.g. “produce a month over month forecast for the sales team based on a certain number of sales per month and an average sale of $x.” Using the problem statement will frame how you set up your data – you need to have available data series for the problem that match your problem – and start you on the way to presenting the “dashboard” version of your solution. Keep it simple to start by labeling cells that affect others as assumptions, like “sales increase per month.”

Spreadsheets are structured like modular pieces to assemble into a larger whole. Start with the smallest item in your model or argument and test it, then use either a Named Range or a consistently placed cell (e.g. all of your calculations are in a single tab in the same column) to organize where you will find the results. Spreadsheets are more flexible than you think.

Those formulas are hard to copy from one tab to another. Anyone who has ever worked on a spreadsheet has had the experience of copying or changing the data and seeing the dreaded #N/A or #VALUE or #REF show up in place of the data or calculation that they wanted. (And if you have no idea what I am talking about, open a spreadsheet tab in sheets.google.com, enter a formula e.g.  =A1+A2 and then break the formula by putting in nonsense characters).

To test your ability to copy data from one tab to another, create a new tab, remove the gridlines for that tab, and title it “Dashboard”. If you can pull the data from the other tab (hint: just use a simple reference, e.g. “=YourOtherSheet!A2” to pull data from a tab called “YourOtherSheet” in cell A2) you can apply things like formatting and font choice to make your dashboard look like a document, not just a spreadsheet.

This method of data construction and display just plain feels weird. Building data-driven dashboards is a method to insert data conversations in your documents. The data needs to first exist in the format (or in a format that’s convertible to the format you need) and then you need to think about how to combine it. And you need to think of the visual display at the same time to make this a useful technique.

Spreadsheets aren’t scary if you think of the presentation layer as a sketch. You might even draw what you want to see on a Post-It note or find a dashboard that you like on the web and take a screen shot as a starting point. Then, consider what kind of data goes into the boxes and graphs to understand what you need to store in the data layer. Still confused, or want to learn more? Ben Collins is my go-to resource to explain Google Sheets to anyone. The best way to learn spreadsheets, however, is to start one.

 

 

Make a Boolean Expression From A List of Rows in Google Sheets

google-list

We all use search forms every day. There’s usually an advanced search form of some sort that allows you to enter a boolean expression. You know Boolean expressions as “This OR That” – an inclusive search that increases results – or “This AND That” – an exclusive search that limits results.

When you need to filter your results, a powerful method is to make a list of companies, individuals, skills – whatever you are using to get more precise answers – and create a Boolean expression to get a better initial set of results from a search engine. Yet when someone gives you a list of 10, 20, 50, or 100 items the idea of making a delimited list of “Item1” OR “Item2” OR “Item3” is more difficult than you’d like to be.

Never fear: Google Sheets to the rescue! This is a perfect example of a task for a Lazy Programmer™ to solve, as it’s much easier to have a spreadsheet count values and rearrange text for you than it is for you to do the work by hand. Let’s take a look.

As an example, I created a utility spreadsheet that takes a range of values, counts those values, and outputs a result string that delivers a boolean expression to paste into your favorite advanced search page. Whether you’d like to create a list of criteria to find your next job, search Twitter more effectively, or simply create a customized search for Google, this expression builder can help.

This spreadsheet has two formulas that do the work for you. The first one joins a range of values into a concatenated string (turns list of Orange and Apple into ‘Orange’ OR ‘Apple’ OR ”):

join

There are five functions in this expression:

  1. JOIN combines an expression of values into a string connecting each item with the string ‘ OR ‘;
  2. INDIRECT tells the spreadsheet to wait until the range expression in the formula from A2 to the last unblank value in column A is completed and we know the number of the last row to select;
  3. TEXT converts a complex expression into a value – in this case a calculation of the number of rows in the column of this spreadsheet minus the number of blank rows in the column of this spreadsheet – so that INDIRECT will take result of this expression and substitute “5”, the last row above that has a value. This results in a range from A2:A5 to feed back to JOIN;
  4. ROWS counts the number of rows in a range – the standard number in a Google Spreadsheet is 1000 – and returns an integer to feed back to the expression inside of TEXT;
  5. COUNTBLANK returns the number of blank rows in a range, so that our calculation of rows minus the number of blank rows leaves us with the number of rows containing values (4) or the total number of rows including the header (5).

Now that we have our formula, it creates the string:

'Orange' OR 'Apple' OR 'Banana' OR 'Grapefruit' OR ''

From the rows of Orange, Apple, Banana, Grapefruit above. And there is one problem here: that we need to remove the trailing “OR ”” that resulted from the expression.

Our second formula, REGEXREPLACE, removes the last “OR ”” from the string. RegEx is sort of like Black Magic, so you will probably need to use a helper site like Rubular. This string looks for a specific string right before the end of the line (represented by $)

OR ''

and replaces it with nothing.

regex

This leaves you with the strong you want to paste into a search form:

'Orange' OR 'Apple' OR 'Banana' OR 'Grapefruit'

You’re all set! Now, use this utility with any number of rows in your list of items to create a Boolean expression made for you automagically.

You need a better content calendar

We have a content publishing problem

photo by http://www.flickr.com/photos/robellisphotography/
photo by http://www.flickr.com/photos/robellisphotography/

Hey you there.  The one with the combination of WordPress, Hootsuite, Tweetdeck, Twitter, Buffer, Facebook, Pinterest, Google+, Slideshare, Excel, Word, and a Google Docs mishmash of information ending in Google Analytics, Mixpanel, Apptentive and others. You and I have the same problem. We want to be better at what we love – publishing valuable content for audiences that appreciate it – and we want to measure it. We also want to know which content published by which person at which time was effective. And we need to do this without the compendium of technical knowledge and project management skill that it takes today to get this done.

Consider this exchange and you’ll get the idea of why this is difficult.

a conversation among community manager types on Facebook
a conversation among community manager types on Facebook

There has to be a better way

You need a better content calendar (and so do I.) You’d like to have the ability to make a campaign, syndicate information to multiple channels and to track analytics in the same place. You need to schedule this content for days or weeks or months in advance. You’ll need to do this for multiple authors and also have a big red STOP button to make this information cease when bad things happen in the world.

I send apologies in advance to those people think that content calendars and scheduled publishing is bad. I think that it’s better to publish live than schedule, and I also feel that it’s better to set ideas in advance and follow through on those ideas when you are trying to drive sustained, measurable success. So perhaps these two goals are at odds, and perhaps not. In the meanwhile, we all need a better content calendar than just dumping everything in a Google Spreadsheet.

There are good signs – when I asked this question on Twitter – I heard from Meshfire, Relaborate and Brightpod. I also asked a group of about 5400 community manager types and got some great answers. And I also got the feeling that there are few people out there who are managing the publishing of multiple content authors in multiple channels in multiple campaigns having a simple workflow for approval with the precision and information that they are using to manage their email marketing campaigns.

What does this mean overall? Two words: Market Opportunity. Someone needs to build a content calendar and management service for normal people that is as easy as managing your blog posts in WordPress. That service needs to handle scheduling, analytics, and content funnel management for multiple people and campaigns across multiple channels. If this service already exists, I’d love to know about it so that I can use it.

Blog at WordPress.com.

Up ↑

%d bloggers like this: