Look up a list of items and return matches in Google Sheets

Anyone who’s spent time in spreadsheets dreads receiving that request to match a random list of things with another of list of things. Right? Except when you are used to using VLOOKUP. That tried-and-true method is very good at finding a match in a list against a another (usually longer) list.

But what happens when you need to do multiple VLOOKUP-like requests at the same time, say when you have a list of Student IDs and need to look up a list of corresponding names?

Wouldn’t it be great if you could do that in only a single step instead of multiple columns, tabs, and a long process?

Luckily, there’s a trick you can use in Google Sheets to take those multiple lookups and turn them into a single formula.

TL;dr – Matching each item in a list against another list

the answer in case you don’t want to keep reading – get the Google Sheets example here.

Why does this matter, anyway?

Looking up a list one item at a time is slow. Looking up all of the terms in a list is really useful, especially when you have a long list of IDs and a long list of values that might or might not match that list. Also, this method works for comma-delimited lists of different lengths in adjacent rows (e.g. cell 1 listing attendance for course 1 has two students, and cell 2 listing attendance for course 2 has five students, and only two of the students overlap both courses.)

WHATWHY
Look up a list of ID values in Google Sheets, returning a new comma-delimited string with the results of your lookupTransform a list in a single step

Recipe for building our formula

Given a comma delimited list of items you want to look up where the mapped value is not obvious, e.g. “ID123,ID456,ID567”, look each item up against a table of other information about that item where the lookup key is each item in the first list.

For example, you might have a list of Unique IDs and want to find out and return the corresponding mapped value, e.g. the name of people in a class. To do this you need to take the peopleID and return a name.

Each item in the list is the key you want to lookup in another list. One way to do this in multiple steps is to use VLOOKUP where you would run a vlookup to get a different value, e.g. to find that ID123 has the student name “Tony Schmidt”.

To do this lookup, you need to use the QUERY function in Google sheets to match multiple simultaneous conditions.

Using the matches keyword in QUERY, transform your comma-delimited list into a pipe-delimited list, then transpose and join the result to get a returned comma-delimited string from your QUERY function.

(fun fact: the QUERY function returns an Array which makes this possible.)

How does the formula work?

First, get the list items from our original list into a format that QUERY understands. The QUERY function in Google Sheets uses the term “matches” to look up one or more terms in the QUERY action. Using the the SPLIT function, we find each item in the comma-delimited list and then JOIN to replace commas with a “|” character, indicating item 1 OR item 2 so that the QUERY uses multiple terms to match.

After transforming our original list into a pipe-delimited list, we write a QUERY against the list we’re trying to match.

The goal is to select a term from column A (the one in the list with the keys matching our original list) with the corresponding value we want from column B (the ID value for each list item).

This table contains the steps to transform our original list into the comma-delimited result of looking up each term

What does it look like when you’re done?

When finished, we took the original list of ID123, ID456, ID457 and turned into the keys (names) that matched these ids in our lookup table:

  • ID123 matches “Tony Schmidt”
  • ID456 matches “Leroy Kuhn”
  • ID567 matches “Connie Wright”

Concatenating these and returning them yields the result “Tony Schmidt, Leroy Kuhn, Connie Wright.” If you wanted to make this fancier, you could use a string e.g.

="STUDENT NAMES:" & CHAR(10) & join(", ",transpose(query(Sheet1!A2:B3,"Select B where A matches '"&join("|",split(A2,","))&"'",0)))

To format your string further:

STUDENT NAMES:
Tony Schmidt, Leroy Kuhn, Connie Wright

Using the “&” to concatenate a string and the CHAR function to add a line break.

Now that you’ve seen the steps, go ahead and copy this original sheet to try it yourself. Happy Matching!

Applying Regular Expressions to Sort Items in Google Sheets

Lists are more useful when grouped.

When you need to analyze a large list of items, you need to create areas of commonality to enable grouping and sorting that list. A simple example of grouping might be to think about a grocery list, where certain items (bananas, oranges) are fruits and other items (paper plates, napkins) are non-perishable items that need to be managed differently. Short lists of a few items are easy to categorize; longer items take an effort to sort and manage the data; and large lists (>200 items) might require a programmatic approach.

Fortunately, spreadsheets are excellent for a large, boring sort that might drive you a bit nuts if you tried to do it by hand. Spreadsheets require a logical framework to decide exactly how to organize your items into categories, especially if you have multiple criteria. To organize your items, you might look for an obvious sort (alphabetical) or count individual things, but you are more likely trying to create a grouping that’s not obvious.

RegEx can help (really)

Regular expressions are a computer-geeky way of writing a pattern for text that you want to match, and provide a TRUE or FALSE result to your expression. This code is amazing and flexible, and also a bit hard to master (check out https://regexr.com/ if you want a handy playground to try them out). You use Regular Expressions when you want to apply the same test to multiple items, e.g. finding the domain name in an email, or retrieving a portion of text from a URL.

Using a logical test and validating it in Regular Expressions allows you to take an abstract problem (find the senior leaders in a list of titles) that’s easily understood by humans but needs instructions for computers to execute. This all sounds logical, but searching for Google Sheets RegEx formulas might leave you feeling like adding more than one condition at a time is a bit … challenging. So I decided to simplify that process.

Here’s a concrete example of applying multiple true/false conditions to a single problem. If you’ve built prospect lists for outbound campaigns, there’s one thing that almost always happens: you have to “bucket” the titles by groups to decide what to do next. You might manually decide that titles starting with CEO or “Chief” are C-level, and that ones starting with VP are VP-level, but it’s kind of a drag to do this manually. Wouldn’t it be easier to define the filters, then apply the true/false results of each filter to all of your titles at once?

A quick tool to help sort lists

I decided to build a small tool in Google Sheets to model using multiple filters to test the ability to arrange job titles into families. To build this utility, I used Google Sheets to list the testName I wanted to establish in each row, followed by an example title, a Regular Expression condition I wanted to match, and the result of TRUE or FALSE for that test. I also created a column to help you understand which part of the string/title matched your condition.

Several rows of tests, displaying the regular expression used to evaluate the test

The initial list has four tests, where the simplest is:

(?i)(mgr|manager)

Or – in English – “match everything that has either the string “mgr” or the string “manager”, in the source string, and don’t worry about whether it’s upper or lower case.”

These tests each evaluate a job title, determining what seniority it lives in based on a keyword or words in the title. I intended this rule stack evaluate from the top to the bottom, so you would evaluate the first rule, then stop if you get TRUE, then move on to the next rule. If you prefer, you can keep going even when you have multiple TRUE conditions as it will help you understand the distribution of your list between buckets.

A few titles, displaying the results of the testing run and the item that matched

Here’s how this works:

  1. Taking the title text from the “Example Title” column,
  2. We use the header column to look up the row of the test we want to execute. For isManager level, we would use a formula to look at the column labeled Test Name in the other tab, do a vLookup to find the row matching isManager, then read the expression in the Condition/Regex column, and return a TRUE/FALSE result using the REGEXMATCH formula, e.g. =REGEXMATCH(“Sr Manager of Spreadsheets”,”(?i)(mgr|manager)”)
  3. use conditional formatting to alert a TRUE result;
  4. and reverse the formula using a REGEXEXTRACT to identify the portion of the string that matched first.

Ok, so what?

Yes, you could apply these formulas one at a time or use complicated sorting methods to find boolean conditions that are true or false in your data set. This method is helpful because it allows you to flexibly add conditions (add a row in the RegEx Tester sheet; and a column in the ProspectTitles sheet labeled with that row’s test name). It also allows you to change your RegEx conditions to match your data set by updating a single field for each change you want to make in your expression across as many items as you want.

This method works well and quickly for evaluating several fields against a set of 10k records or fewer. As you think about longer lists that involve updating more than 50k cells at a time, you might combine your tests or use small portions of lists at a time to increase the speed (that’s a lot of VLOOKUPs 😉

Here’s a link to the example – make a copy and give it a try.

Blog at WordPress.com.

Up ↑

%d bloggers like this: