CSV File Format - 43 years on and it's still hard?

January 18, 2011 21:46 by Matt Whitfield

This is one issue that really bugs me. CSV files and support of them. It's a fairly simple concept - you have a file which contains tabular data in a grid, and each column is separated by a comma. Sounds simple, right? Well, it is. And it isn't.

CSV files are great as long as you don't have any data in them that contain commas or new line characters - at which point it tends to go a bit pear shaped in terms of universal support.

Let's look at a few of the extended rules around CSV that so many applications get wrong:

Fields with embedded commas must be delimited with double-quote characters

This one is a fairly obvious one. If you have three values - for example 'Hello', 'sir, how', 'are you?' - then you should have three columns in CSV. However, just dumping out the contents leaves you with four:

Hello,sir,how,are you?

So, fields that contain embedded commas have to be delimited:

Hello,"sir,how",are you?

That's nice, but now assigns special meaning to the double-quote character. So, if fields contain a double quote character - we need to follow another rule:

Fields with embedded double-quotes must escape the contained double-quotes if they are delimited

What does that actually mean?

Imagine the value 'test,"testing",test'. We would need to delimit that, because it contains commas. However, the double quote characters would then take on special meaning, and therefore we need to escape them. Escaping them just involves replacing a single double-quote with two:

"test,""testing"",test"

So we're all good - right? Again, almost. CSV records are generally one record per line - but what about fields that contain multiple lines of text? Again, they need to be delimited in the same way, and this can mean that a single CSV data record can span multiple lines. This leads us to our third really important rule:

Fields that span multiple lines must be delimited

Imagine the phrase 'testy\ntest' (i.e. a new line between testy and test). That would be:

"testy
test"

Now we are all good. Properly.

Was that really so hard?

Next time you use a standard piece of software (*cough* SSMS *cough*) try exporting CSV and see what happens...

Tags: ,

compatibility

Installing SQL Server 2000 on Windows 7

September 6, 2010 14:03 by Matt Whitfield

This is a question I see a lot on the net, and I don't really know why - SQL Server 2000 as a database engine works just fine under Windows 7. If you install it, and allow all the warnings to complain at you, then you will end up with a working installation.

Now, enterprise manager, the service manager and other ancillary applications don't fare so well, and have various issues from authentication through to complete failure.

This has the interesting effect that you have to start SQL Server 2000 manually after it's first installed, and you may have to change the service account using the services control panel - but that's not rocket science and is well documented around the net.

Once you've got the service up and running, you can run all your usual tools (whether it's Management Studio or SQL Everywhere) to interface with the database engine without issue. Having said that, I haven't tried to get full text indexes to play ball!

Tags: , ,

compatibility

Schema Surf

Tag cloud

Calendar

<<  April 2017  >>
MoTuWeThFrSaSu
272829303112
3456789
10111213141516
17181920212223
24252627282930
1234567

View posts in large calendar