But how do we get it there in the first place?
There are a couple of options – choosing the best one for your case will very much depend on the volume of data you need to analyse and the data infrastructure that your company provides.
So far I’ve used:
1. Import from text (a.k.a. the good old fashioned .csv or .txt file)
The “quick and dirty way” as I call it only makes sense to use if you are lucky enough to work with relatively small volumes of data (e.g. less than 10, 000 rows), which comes in the very exact format that you need, without requiring any further modifications or additions whatsoever to get your analysis done
Why are you being so dismissive?
Mostly because of file size considerations
The bigger the text file, the larger your Excel file will be. This will consequently mean that Excel will need increasingly more time to:
(1) (re-) calculate any formulas added to it (unless you have your calculation options set to ‘Manual’, Excel will attempt to re-calculate every time you try to re-filter your dataset or save your workbook)
(2) commit any layout/filtering changes made to the pivot tables in the file
This will ultimately cause people using your analysis to get increasingly annoyed at having to wait for 10+ minutes for switching a filter
Therefore, the bigger the text file, all the more reason to completely separate the generation from the analysis layouts by importing your data straight to an MS Access file and feeding it back to Excel from there.
2. OLE DB connection (get data from Microsoft Access)
It is really a very straightforward option – as long as you’ve got a pre-defined table or a query view in MS Access and the Access file is less than 2 GB in size, nothing stops you from linking it directly to Excel
Even though using MS Access might sound a bit outdated in the age of big data, I’ve found it a game-changer in my previous role as a Reporting Analyst when I needed to wrangle HR, Operational and Finance data coming from 5 different user interface systems and subsequently present it in a single-picture layout every month
Setting up an MS Access database (truly, a piece of cake) and making sure that each set of consolidated monthly figures got imported there has allowed me to:
(a) develop a bunch of spin-off analyses, thereby providing much more insight for more people
(b) provide not only the single month view that was required, but also fiscal year, calendar year, and year-over-year trends of operational data
(c) easily restate already generated data, whenever business conditions/relationships changed and a 2-years-backward view was needed
(d) meet ad-hoc reporting requests related to that dataset incredibly quickly
3. ODBC connection (get data from a relational database)
Even though slightly outdated, quite simplistic in nature, and offering super limited data storage capacity, Microsoft Access is still a relational database. Therefore, you can also set up a ODBC database connection to it as well
Why would you want to do that, though, if you can directly import from MS Access database?
Well, you can’t if you happen to be a Mac user. Excel for Mac does not support OLE DB connections and consequently they can’t re-filter any pivot tables based on that source.
So, should you have Mac users among your stakeholders, using the ODBC connection to link your MS Access data to Excel is the safest option to make sure they won’t lose any functionality.
Now comes the sweetest bit – if you happen to be among those lucky few that only need to use a single relational database to fulfil all their data needs (be it Teradata, Vertica, MySQL, MariaDB, PostgreDB or other), you can just:
(1) download and install its respective ODBC driver
(2) set up the connection: Control Panel –> Administrative Tools –> Set up data sources (ODBC)
(3) Once done, simply copy/paste your SQL queries in the Command text line of every ODBC connection you create
It is a huge time-saver, especially if your role is much more focused on preparing tailored ad-hoc analyses rather than on standardized and regular ones. Plus, it’s a super handy way of keeping track of the conditions that you used to retrieve your data, in case you need to re-visit it two months later.
4. MS Query (mix and match data from different MS Access tables)
I’ve only used this one because it was the mandatory way to complete a course assignment at Uni, and to be honest, I am not too impressed by this particular option – if you are curious why, you can find a bit more about it here
It also seems that Microsoft consider it quite obsolete as well, given that both Excel 2013 and Excel 2016 still only support it to enable backward compatibility
5. PowerPivot – the new kid on the bloc
Introduced as an add-in for Excel 2010, PowerPivot is now becoming more and more established, especially as Excel 2016 users no longer need to install it separately.
It might take some getting used to if you haven’t had any prior experience with designing and updating data models, but it is totally worth it if you find yourself in need of crunching huge datasets (and in no way am I exaggerating when I say huge…)
What you need to keep in mind prior building any model in PowerPivot:
(1) File size of a PowerPivot – based model may be larger than the file size of the same dataset based on an MS Access file (assuming you’re feeding data from MS Access directly to Pivot Tables)
(2) Unlike MS Access, PowerPivot does not allow you to specify the data types when importing from text or excel file. Instead, it detects and assigns those for you. Whilst this does spare you a bit of a hassle, it also means that you need to be super careful how you format every single column of your file
(3) Once imported, you cannot edit or delete individual rows from your PowerPivot data model – if you need to restate the data, you need to delete the whole table and re-import it again
(4) Excel for Mac renders pivot tables built on the basis of PowerPivot unfilterable, so if you’ve got Mac users amongst your audience, better not use it altogether