You can use the PowerPivot add-in for Excel 2010 to create a report from multiple Excel workbooks or worksheets, by joining the tables using the Primary and the Foreign key, such as ‘ProductID’ in a Sales table and a Pricing table.
In this example though, we want to combine the data in two Excel files that have an identical structure — sales data for the East and West regions. In this case, we can’t use a key to connect the tables; instead, we want to create one combined table from all the data.
The following technique allows you to import more than a million records from Excel, despite the fact that one worksheet can only contain up to 1,048,576 rows.
At least that’s possible in theory — on my computer it imported about 1.2 million, then gave up, after whining about memory resources.
Thanks to Excel MVP, Kirill Lapin, for sharing this very helpful tip with us. You can see more of Kirill’s work in last week’s posts on Combining Data from Two Excel Files in a Pivot Table.
Create a Connection in the Workbook
The key to this technique is to start by creating a workbook connection, before you launch PowerPivot.
- On the Excel Ribbon’s Data tab, click Connections.
- In the Workbook Connections window, click Add
- At the bottom of the Existing Connections window, click Browse for More.
- Navigate to the folder where your files are located.
- Select one of the files that you want to import — EastSales.xlsx in this example — and click Open.
- Select a table to import, and click OK.
- The new connection appears in the Workbook Connections window.
Combine the Data in PowerPivot
- Close the Workbook Connections window, and on the Ribbon, click the PowerPivot tab.
- Click PowerPivot Window, to launch the PowerPivot add-in.
Note: If you’re using Windows XP, the PowerPivot window has a menu bar. If you’re using Vista or Windows 7, you’ll see a Ribbon instead.
- On the Table menu, click Existing Connections, or, on the Ribbon, click Design, then Existing Connections.
- At the bottom of the Existing Connections window, under Workbook Connections, click on the connection that you added, and click Open.
- In the Table Import Wizard, click Next, then select the table, and click Finish
- After the data is successfully imported, click Close.
Change the SQL Statement
Now that the first table has been imported, you can change its properties, to combine it with data from the second table.
- On the Table menu, click Table Properties, or on the Ribbon, click the Design tab, then click Table Properties.
- At the right, from the Switch To drop down list, select Query Editor.
- Edit the SQL statement, to create a union query, combining the two tables. In this example, the SQL statement is:
SELECT [EastSales$].* FROM [EastSales$] UNION ALL SELECT * FROM ‘C:\_TESTWestSales.xlsx’.[WestSales$]
After you change the SQL statement, click the Validate button, to verify that the statement is correct, then click Save.
Note: The SQL query string can also be edited in the Excel workbook connection window, by selecting the connection, and clicking Properties. However, there’s no Validate feature there.
Create the Pivot Table
Next, you can create a pivot table from the combined data.
- On the Toolbar, click the Create a PivotTable button, or on the Ribbon, click the Home tab, then click PivotTable.
- Select a location for the pivot table, and click OK.
- Add fields to the pivot table layout, to see a summary of the data.
Here’s the pivot table that was created from the combined data, with columns for the East and West regions. The Report Layout is Tabular, and Number format is used, with thousands separator and zero decimals.
Detailed Instruction and Sample Files
To see detailed instructions for this technique, with more screen shots, visit the PowerPivot from Identical Structure Excel Files page on the Contextures website.
That page also has a link for downloading the East and West sales data that I used in this example.
Watch the PowerPivot Video
To see the steps for combining data from multiple tables in PowerPivot, you can watch this PowerPivot from Identical Excel Files video tutorial.
Download the PowerPivot Add-In
You can download the free PowerPivot add-in from the Microsoft website: PowerPivot Download (Note: This link is no longer available)
__________
[…] http://blog.contextures.com/archives/2010/09/06/powerpivot-from-identical-excel-files/ […]
Now, here comes some interesting stuff 🙂 I’ve done some quick’n’dirty testing.
For exactly the same data (1.048.469 records) from different sources the data import went as follows:
Source Max Memory Used Time to Load
TXT 157 MB 0:00:30
OLEDB connection to XLSX 475 MB 0:01:28
XLSX 830 MB 0:02:03
This may explain why Debra ran out of memory when she attempted to load more than 1.200.000 records 🙂
You need to add here the ~140 MB of memory that XL takes up on open, plus ~40 MB for the PowerPivot window 🙂
Hi,
The site on PowerPivot from Identical Excel Files shows the steps to combine TWO similar files.
Q: What are the steps for combining THREE OR MORE similar files? I got stuck at the section on Change the SQL Statement.
The below SQL Statement does not work for me in my Excel 2010.
SELECT [DATA$].* FROM ‘C:20110413CB1.xlsx’.[DATA$]
UNION ALL
SELECT [DATA$].* FROM ‘C:20110413CB2.xlsx’.[DATA$]
UNION ALL
SELECT [DATA$].* FROM ‘C:20110413CB3.xlsx’.[DATA$]
Please help me. Thank you.
Regards,
Ivan
Hi Ivan,
The error is in the SELECT clause. Between SELECT and FROM you must either list the field names to extract, e.g.:
SELECT [FieldName1],[FieldName2],[FieldName…] FROM ‘C:20110413CB1.xlsx’.[DATA$]
or an asterisk (*) to indicate that all fields need to be extracted, e.g.:
SELECT * FROM ‘C:20110413CB1.xlsx’.[DATA$]
Regards
Kirill
I’m trying this solution on two different access DB (one from July and one from August) which have the same table layout but I can’t get it to work.
I get an SQL errormessage saying “no colums specified”
Heres my sql string:
SELECT [tblDataIn August 2011].*
FROM [tblDataIn August 2011]
(The above is the sql generated in power pivot)
Union ALL
Select *
From ‘C:UsersxxxDocuments7 July DB 2011.accdb’.[tblDataIn Juli 2011$]
Any resolution on bringing in multiple access db’s? Having same issue.
Thanks.
Correct path, typo in above message.
From ‘C:UsersxxxDocuments7 July DB 2011.accdb’.[tblDataIn Juli 2011$]
for three or more excel sheets, you can use following formula that seems to be the only one to work:
SELECT [Sheet1$].*
FROM [Sheet1$]
UNION ALL
SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet2$]
UNION ALL
SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet3$]
Now, this works with data that can fit on an excel sheet.
Does anybody know how I can do this with extreme huge data which is to big to be put in an excelsheet of excel 2010?
In combination of SQL?
Best regards,
Thierry
I tried doing this with comma delmited text files but it won’t allow me to access the Switch to drop down list and select query editor.
Was anyone able to address Thierry’s question relating to do this with huge data files greater than 1.1 million rows (e.g. comma delimited Text Files with >2million rows)?
SELECT [‘Dist1$’].*
FROM [‘Data$’]
UNION ALL
SELECT *FROM `C:\Master\Dist1.xlsx`.[Data$]
The 4 lines above worked fine. But if I tried to add in another line, it no longer works. Does anyone know how can I make this work?
UNION ALL
SELECT *FROM `C:\Master\Dist2.xlsx`.[Data2$]
When I try to validate, the following error pops up
“The SQL statement is not valid. There are no columns detected in the statement.’
SELECT [dbo].[v_carbonreportingv2].* FROM [dbo].[v_carbonreportingv2]
UNION ALL
SELECT * FROM `C:\TEST\BEImportfilCarbonReporting_V2.xlsx`.[‘Import to database$’]
Incorrect syntax near ‘`’.
I’m trying to combine an excel worksheet with an sql server table but only gets syntax error. When I try to combine to excel worksheet I don’t get the syntax error. Is it not possible to combine excel worksheet and a database table?
I have the same columns in the worksheet as I have in the table, also same formatting…
I was having issues with the name of the file, which started by producing error: “The SQL statement is not valid. There are no columns detected in the statement” for this statement:-
SELECT [‘Prime ref prompt – Parties (1)$’].* FROM [‘Prime ref prompt – Parties (1)$’]
UNION ALL
SELECT * FROM ‘C:\CData\temp file.xlsx’.[data$]
So I (eventually) deleted the first two lines leaving:-
SELECT * FROM ‘C:\CData\temp file.xlsx’.[data$]
then I git the following error: ”C:\CData\temp file.xlsx” is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long.
What!?!? I thought. Then it occurred to me, It’s a SQL statement, use [] for containing names. This now works 🙂
SELECT [‘Prime ref prompt – Parties (1)$’].* FROM [‘Prime ref prompt – Parties (1)$’]
UNION ALL
SELECT * FROM [C:\CData\temp file.xlsx].[data$]
Also only use [‘ ‘] if the worksheet name contains spaces….. hope this helps someone else!
Tim – the [] did the trick. Thanks for posting.
I kept having the same trouble as the others here “The SQL statement is not valid. There are no columns detected in the statement.”
The issue boiled down to not having the same number of columns in the three datasets I was using.
Maybe of help to someone out there..
Thanks Neil! And sometimes there are columns that look empty, but have something in them that Excel sees as part of the used range.
Press Ctrl+End to see where the worksheet really ends.
Hi All,
I tried the above and it worked with with multiple workbooks. But whenever I add any new column in the source workbook, it does not reflect in the power pivot table . Can anyone tell me why and how to make the changes work in power pivot.
Anyone know how to do this with CSV files that reside in different directories (but are otherwise identical)?
Thanks!
Hi All,
Thanks Debra, the procedure work fine for me (consolidate identical sheets in Excel 2013)
When pivot table is created, I am not able to group because the command is grayed ?
Note : I have done the same procedure with excel 2007 and the group command work fine
Can anyone tell me why
Best regards
SELECT [FACTIT$].* FROM [FACTIT$]
UNION ALL
SELECT * FROM `Z:\xxxxxxxxxx\AVANTAGE TO EXCEL\CD.xlsx`.[FACTIT$]
UNION ALL
SELECT * FROM `Z:\xxxxxxxxxxx\AVANTAGE TO EXCEL\AVA USA-CA.xlsx`.[FACTIT$]
This sql statement works for 3 files from same structure excel files. but on another table on those same file
SELECT [FACTMA$].* FROM [FACTMA$]
UNION ALL
SELECT * FROM `Z:\xxxxxxxxxx\AVANTAGE TO EXCEL\CD.xlsx`.[FACTMA$]
UNION ALL
SELECT * FROM `Z:\xxxxxxxxxxx\AVANTAGE TO EXCEL\AVA USA-CA.xlsx`.[FACTMA$]
It gives me the “The SQL statement is not valid. There are no columns detected in the statement.”
it works on one table but not on the other. ??
Hi,
When I click valid and it was fine but when running the following SQL
SELECT [‘For Outlook(C1)$’].* FROM [‘For Outlook(C1)$’] where ([Userid] “”)
UNION ALL
SELECT [‘For Outlook(RO)$’].* FROM [‘For Outlook(RO)$’]
UNION ALL
SELECT * FROM `\\team.eworkplace.tas.com@SSL\DavWWWRoot\subentity\ShSvc\Documents\Estate Management\Database\Occupancy database\Database_DTTA_Occupancy.xlsm`.[‘For Outlook$’]
I encounter error
the last one is from another worksheet residing in SharePoint document library
OLE DB or ODBC error: Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done..
The current operation was cancelled because another operation in the transaction failed.
I have the same problem with getting this error message all of the sudden.
I basically have the same setup from multiple Excel files and the reload has been working for six months but all of the sudden I started getting this error message:
OLE DB or ODBC error: Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done..
The current operation was cancelled because another operation in the transaction failed.
Anyone have a solution?