If you work with imported or foreign data in Microsoft Excel, chances are you have to clean the data a bit before you can use it. You might have to remove spaces at the beginning or end of values or reformat a column of values from text to numbers or vice versa. It’s also common to parse values when a column stores more than one value.
A good example of this is names. You’ll often find the entire name in a single column and that makes those values difficult to work with. In this tutorial, I’ll show you two quick ways to parse data using Excel’s Flash Fill and Power Query features.
I’m using Microsoft 365 on a Windows 10 64-bit system. You can use string functions in older versions including the .xls menu format. Flash Fill is available in Excel 2013 and later, and Power Query is available in Excel 2010 and later. In addition, Excel for the web supports Flash Fill. You can also open existing Power Query queries in Excel for the web, but you can’t create them in the browser edition.
SEE: Windows, Linux, and Mac commands everyone needs to know (free PDF) (TechRepublic)
How to parse using Microsoft Excel string functions
Before Flash Fill, most parsing required combining string functions such as RIGHT(), LEFT(), MID(), FIND() and LEN(). For example, the expression in column D of Figure A.
=IFERROR(RIGHT(B3,LEN(B3)-IFERROR(FIND(” “,B3,FIND(” “,B3)+1),FIND(” “,B3))),B3)
The above expression returns the last name from the values in column B. IFERROR() handles the error when there’s no second space. Fortunately, our name values are consistent in format, but that won’t always be the case. Even if IFERROR() is removed, it’s still a complex and tedious expression to work through.
Flash Fill can complete many parsing tasks in more recent versions of Microsoft Excel.
How to use Flash Fill to parse data in Microsoft Excel
Excel’s Flash Fill feature has been around since Excel 2013, so you may be familiar with it. When using it to parse, enter the first value in the target column, so Excel can evaluate the pattern. Sometimes, Flash Fill requires only one value. Because our data is consistent, this parsing task will be easy.
Let’s use Flash Fill to parse the last names from the name values in column B.
First, enter Harkins in C3 and press Ctrl + Enter. Most likely, Flash Fill won’t fill the column.
Then, enter the first few characters in Smyth, as shown in Figure B. As you can see, Flash Fill now has the pattern and offers to fill the entire column.
Press Enter to fill the remaining cells, as shown in Figure C.
Excel’s Flash Fill is a lot faster and easier than combining string functions. Even when the data isn’t consistent, this feature does a good job of finding the pattern.
Excel for the web supports Flash Fill. Look for it on the Data tab. Instead of pressing Ctrl + Enter, click Flash Fill in the Data Tools group.
SEE: 6 ways to save time using Flash Fill in Microsoft Excel (TechRepublic)
How to use Power Query to parse data in Microsoft Excel
Power Query is available to Microsoft Excel 2010 users and later. Its purpose is to retrieve and clean data, but it’s full of features that can do much more. Using Power Query, we’ll add a column and parse the last names into that column.
First, we need to load the data into Power Query as follows:
- Click anywhere inside the table that you want to load into Power Query.
- Click the Data tab.
- In the Get & Transform Data group, click From Table/Range.
- If you haven’t formatted the data as a Table, Power Query will prompt you to do so. The demonstration Table is TablePersonnel. You don’t need to name the Table, but a named Table is a bit easier to work with.
Figure D shows TablePersonnel in Power Query.
Once the data is in Power Query, add a column based on the Personnel column.
First, click the Personnel header to select the column, and click the Add Column tab.
In the General Group, choose From Selection from the Column From Examples dropdown (Figure E). Power Query adds a new blank column. To enter the new last name values for this column, we’ll use Power Query’s own version of Flash Fill.
Into the first cell of the new column, Column 1, enter Harkins (Figure F). Power Query responds by offering a set of last name values for the entire column. Look them over a bit. If they’re not correct, enter another last name. However, Power Query returns the correct list with just one name to set the pattern for our data.
Click OK to create and populate the new column with last names, as shown in Figure G.
All that’s left to do is to save the modified data to Microsoft Excel, as follows:
- Click the Home tab.
- In the Close group, click Close and Load from the Close & Load dropdown.
Figure H shows the parsed last names in an Excel sheet. Power Query uses the name TablePersonnel to name the new sheet. At this point, you can rename the new column. You can also do so in Power Query before saving the data.
Flash Fill was definitely quicker but knowing the Power Query method will help when you’re already working with data in Power Query.