Importing Data into PowerPivot Part 1

Jumpstart your PowerPivot knowledge with this post. This Excel add-on can do some serious heavy lifting when it comes to crunching data. Here is an easy way to practice using PowerPivot with a free large dataset.

Microsoft PowerPivot extends the capabilities within Excel to BI Analytics. The process is to pull data in from a variety of sources, and then build pivot tables to explore this data. This can be done with OLAP data, Access, Excel files, DB2, and a list of other sources.

I find I have significantly more flexibility when pulling data using this tool. Additionally, unlike OLAP, as the author of the content, you can refresh the content or keep a snapshot in time of the data.

For this article, I will demonstrate how to import from Access, create a simple join between two tables, and pivot the data into Excel.

Years ago, I took a course in statistics, and a group of us obtained a large dataset for a project. It was the Major League Baseball database, and it was – and still is freely available here:

http://www.seanlahman.com/baseball-archive/statistics/

I downloaded the 2016 version for writing this piece. It contains information on players, batting, fielding, and other baseball related stats.

Once I downloaded and saved the database I opened a new Excel file. I am using Excel 2016 which has PowerPivot built in. If you are using Office 2010 or 2013, PowerPivot is available in those versions as well.

Upon opening, if you have PowerPivot available it will show in the ribbon. If your version of Excel is 2010 or 2013 and you do not see PowerPivot in the ribbon follow the steps as outlined here:

https://support.office.com/en-us/article/Start-the-Power-Pivot-in-Microsoft-Excel-add-in-a891a66d-36e3-43fc-81e8-fc4798f39ea8

2017-07-20_191003

I select the PowerPivot tab and click the green icon over the word Manage.

A new window opens for the PowerPivot extension. This is tied to the existing Excel document. If I save the Excel file, any changes I make within PowerPivot will be saved.

To import the Access file:
Select: Get External Data -> from Other Sources
From the popup menu select: Access

Then use the browse button to locate the Access file. Note that the version of Access available with my MLB database was an earlier version, and I had to use the dropdown to select that format and see my file.

Once the database is identified, the application prompts me to import by selecting tables or writing a query. For this article, I choose select tables.

Next, I am presented with a list of tables from the MLB database. I choose, Batting and Master tables. I click finish and I can see the data from the two tables I selected.

To enable the data between the two tables to populate a pivot table correctly, I now use the icon on the top right called Diagram View.

I now see representations of both tables in diagram form. I click and drag while holding down my mouse from PlayerID field in the Master Table to the PlayerID field in the Batting Table. The two tables are now have a relationhship via the PlayerID Field

2017-07-20_215535

I want to also include a custom field, so I click on Data View and I select the Batting Sheet. I scroll all the way to the right and start typing into any cell of the first empty column:

=YearID&TeamID

Then I label the column YEAR_TEAM by entering this in the top row of that same column.

Next I choose the pivot table option from the top menu and import data into a new pivot table. PowerPivot sends me back to my Excel file and I have a Pivot Table I can start to work with.

This is where some knowledge of baseball will come in handy. Do you know any player who has played on more than one team during his career? Well, if not, you can use Brian McCann. McCann was on the Braves and the Yankees. We are going to look at his Home Run records every year he played in the MLB through 2016.

Add the fields shown below into the PivotTable. Keep in mind, once you add playerID it will be easier to first filter on “mccannbr01” (or whichever player you choose) prior to adding additional dimensions or measures.

2017-07-20_220202

Note that through adding the link between the Master and Batting tables the data can be filtered through attributes in another table.

Additionally, the YEAR_TEAM dimension is one we created, yet it can be easily added as a column to note which year and team the player was on.

Using other measures such as RBIs, this free data source is a great way to learn more about what one can do with PowerPivot.

I plan to post more on PowerPivot in the near future, but this is a good place to start.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s