Numbers with leading zeros in CSV files
The issue
If a comma separated value (csv) file that you are trying to import contains numeric data with leading zeros, Excel will truncate the zeros and only show the number.
Example
If we have a comma separated value (csv) file that contains data like this:
abc,005,abc
def,006,def
ghi,007,ghi
Then the ...,005,...
portion of the data will be imported to Excel as ...,5,...
Workarounds
Manual
If you do not have access to change the way the source file is created, I feel that the best solution is to do a Manual import. Alternate methods are to Format the data or Change the file extension but both these options are not ideal.
Programmatic
If you have access to change the way the source file is created, then the easiest option is to Change the original file. Alternately you can Change the file format, unless all your data cannot be represented in the ANSI characters.
Manual import
Instead of double clicking on the file to open it, click on the Data tab and select From Text in the Get External Data section.
An import text file dialog screen will be displayed, use this to navigate to your file, select it and then click on the Import button.
The first step of the Text Import Wizard will be displayed. Make sure that the Delimited radio button is highlighted and then click on the Next button.
The second step of the Text Import Wizard will be displayed. In the Delimiters section, tick the Comma tick box and then click on the Next button
The third step of the Text Import Wizard will be displayed. Select the column with the leading zeros by clicking on the data in it and then in the Column data format section click on the Text radio button. Click on the Finish button
The Import Data dialog will be displayed which allows you to choose where the data will be imported to.
Click on the OK button.
Your data will be imported into Excel.
This has also created a permanent link to the original file and file location that could be reused again.
Format the data
If the leading zeros are simply a formatting issue and the data in the column always has a length of three characters then open the file in Excel.
Select the column containing the data you want to format and on the Home tab use the drop-down in the Number section.
At the bottom of the drop-down select More number formats…
The Format cells window will be displayed.
In the Category section select Custom, in the Type box enter 000
and click on the OK button.
The numbers in the column will now be displayed with leading zeros.
The weakness of this method is that the cell value is stored in Excel as 5
but is displayed as 005
. If you want to use the data in formulas like EXACT, VLOOKUP, MATCH, etc. it might not return the expected values as 5
is not the same as 005
.
Change the file extension
Change the file extension and use the file import wizard as detailed above to control the formats. Files with a .dif extension are usually automatically opened by Excel when double clicked on. This may or may not work depending on what file associations have been set. The .dif format might not be associated with Excel and so will be opened in another program.
Change the original file
If you have access to change the way the CSV file is produced, you can change the data in the .csv file. By putting an ="
at the start of the data field and a "
at the end of the data field, it will be displayed correctly.
For example ...,=”005”,...
will be displayed in Excel as ...,005,...
Excel will have kept the data as a formula, but copying the column and using paste special values will get rid of the formula but retain the formatting
Change the file format
Use a Symbolic Link (SLK) file format in place of CSV. This allows greater control over the formatting, but is restricted to ANSI characters only.
A basic SLK file for the example data used in the screenshots would look like this:
ID;P
C;Y1;X1;K"abc"
C;X2;K"005"
C;X3;K"abc"
C;Y2;X1;K"def"
C;X2;K"006"
C;X3;K"def"
C;Y3;X1;K"ghi"
C;X2;K"007"
C;X3;K"ghi"
E
Double clicking on the file will open it in Excel, with the data looking as follows:
Looking at the file, the start of each row sets the type of record.
The ID;P
denotes the start of file and the E
, the end of the file.
The C
rows denote cells, giving the Row and Column address followed by the value.
The ;Yn
sets the row number and is applied to all subsequent rows in the SLK file until changed.
The ;Xn
sets the column number
The ;K
denotes the cell value
In this example no formatting is applied.
If the custom format of 000
mentioned above was applied to the middle column, the file would look like this.
ID;P
P;P000
C;Y1;X1;K"abc"
F;P0;X2
C;X2;K"005"
C;X3;K"abc"
C;Y2;X1;K"def"
F;P0;X2
C;X2;K"006"
C;X3;K"def"
C;Y3;X1;K"ghi"
F;P0;X2
C;X2;K"007"
C;X3;K"ghi"
E
The P
row sets the formatting.
THe F
row applies the formatting.
The ;Pn
references the format with P0
meaning the first P record.
The ;Xn
sets the column number with the row number inherited from the row above.
Complex files with multiple formats, different fonts and even formulas can be created. The easiest way to experiment is to create the file in Excel, save it in a SYLK (Symbolic Link) (*.slk) format and open it with a text editor.
There is an article on Wikipedia that provides an excellent level of detail on the file format.