ML with circles

Mearns Learns

Numbers with leading zeros in CSV files

The issue

If a comma separated value (csv) file that you are trying to import contains numeric data with leading zeros, Excel will truncate the zeros and only show the number.

Example

If we have a comma separated value (csv) file that contains data like this:

abc,005,abc
def,006,def
ghi,007,ghi

Then the ...,005,... portion of the data will be imported to Excel as ...,5,...

Workarounds

Manual

If you do not have access to change the way the source file is created, I feel that the best solution is to do a Manual import. Alternate methods are to Format the data or Change the file extension but both these options are not ideal.

Programmatic

If you have access to change the way the source file is created, then the easiest option is to Change the original file. Alternately you can Change the file format, unless all your data cannot be represented in the ANSI characters.

Manual import

Instead of double clicking on the file to open it, click on the Data tab and select From Text in the Get External Data section.

Screenshot of Excel Data tab menu Screenshot of Excel Data tab menu

An import text file dialog screen will be displayed, use this to navigate to your file, select it and then click on the Import button.

Screenshot of Excel import text file dialog Screenshot of Excel import text file dialog

The first step of the Text Import Wizard will be displayed. Make sure that the Delimited radio button is highlighted and then click on the Next button.

Screenshot of the first screen of the Excel Text Import Wizard Screenshot of the first screen of the Excel Text Import Wizard

The second step of the Text Import Wizard will be displayed. In the Delimiters section, tick the Comma tick box and then click on the Next button

Screenshot of the second screen of the Excel Text  Import Wizard Screenshot of the second screen of the Excel Text  Import Wizard

The third step of the Text Import Wizard will be displayed. Select the column with the leading zeros by clicking on the data in it and then in the Column data format section click on the Text radio button. Click on the Finish button

Screenshot of the third screen of the Excel Text Import Wizard Screenshot of the third screen of the Excel Text Import Wizard

The Import Data dialog will be displayed which allows you to choose where the data will be imported to.
Click on the OK button.

Screenshot of the Import Data screen" Screenshot of the Import Data screen"

Your data will be imported into Excel.

Screenshot of the imported data" Screenshot of the imported data"

This has also created a permanent link to the original file and file location that could be reused again.

Format the data

If the leading zeros are simply a formatting issue and the data in the column always has a length of three characters then open the file in Excel.

Select the column containing the data you want to format and on the Home tab use the drop-down in the Number section.

Screenshot of csv file open in Excel Screenshot of csv file open in Excel

At the bottom of the drop-down select More number formats…

Screenshot of csv file open in Excel Screenshot of csv file open in Excel

The Format cells window will be displayed. In the Category section select Custom, in the Type box enter 000 and click on the OK button.

Screenshot of format cells window Screenshot of format cells window

The numbers in the column will now be displayed with leading zeros.

Screenshot of formated column Screenshot of formated column

The weakness of this method is that the cell value is stored in Excel as 5 but is displayed as 005. If you want to use the data in formulas like EXACT, VLOOKUP, MATCH, etc. it might not return the expected values as 5 is not the same as 005.

Change the file extension

Change the file extension and use the file import wizard as detailed above to control the formats. Files with a .dif extension are usually automatically opened by Excel when double clicked on. This may or may not work depending on what file associations have been set. The .dif format might not be associated with Excel and so will be opened in another program.

Change the original file

If you have access to change the way the CSV file is produced, you can change the data in the .csv file. By putting an =" at the start of the data field and a " at the end of the data field, it will be displayed correctly. For example ...,=”005”,... will be displayed in Excel as ...,005,...

Excel will have kept the data as a formula, but copying the column and using paste special values will get rid of the formula but retain the formatting

Change the file format

Use a Symbolic Link (SLK) file format in place of CSV. This allows greater control over the formatting, but is restricted to ANSI characters only.

A basic SLK file for the example data used in the screenshots would look like this:

ID;P
C;Y1;X1;K"abc"
C;X2;K"005"
C;X3;K"abc"
C;Y2;X1;K"def"
C;X2;K"006"
C;X3;K"def"
C;Y3;X1;K"ghi"
C;X2;K"007"
C;X3;K"ghi"
E

Double clicking on the file will open it in Excel, with the data looking as follows:

Screenshot of the imported data" Screenshot of the imported data"

Looking at the file, the start of each row sets the type of record.
The ID;P denotes the start of file and the E, the end of the file.

The C rows denote cells, giving the Row and Column address followed by the value. The ;Yn sets the row number and is applied to all subsequent rows in the SLK file until changed. The ;Xn sets the column number The ;K denotes the cell value

In this example no formatting is applied.

If the custom format of 000 mentioned above was applied to the middle column, the file would look like this.

ID;P
P;P000
C;Y1;X1;K"abc"
F;P0;X2
C;X2;K"005"
C;X3;K"abc"
C;Y2;X1;K"def"
F;P0;X2
C;X2;K"006"
C;X3;K"def"
C;Y3;X1;K"ghi"
F;P0;X2
C;X2;K"007"
C;X3;K"ghi"
E

The P row sets the formatting.

THe F row applies the formatting. The ;Pnreferences the format with P0 meaning the first P record. The ;Xnsets the column number with the row number inherited from the row above.

Complex files with multiple formats, different fonts and even formulas can be created. The easiest way to experiment is to create the file in Excel, save it in a SYLK (Symbolic Link) (*.slk) format and open it with a text editor.

There is an article on Wikipedia that provides an excellent level of detail on the file format.