The Emission Locus

Encoding source files and documents as UTF-8 in VS2005

I'm picky about the encoding of my source files, HTML documents, etc. Wherever possible I save as either UTF-8 or UTF-16LE, both with a BOM (byte order mark). This ensures that the document can always be read without any encoding issues.

Although it has much better Unicode support than VS2003, VS2005 isn't very helpful with automatically selecting a file encoding; it assumes that the windows-1252 code page is sufficient. VS2005 does have an "Advanced Save Options" dialog on the File menu to set the encoding, but it's not completely automated. Replaying a temporary macro recorded while using the dialog only brings the dialog up; it does not choose the encoding.

My solution is a macro hooked up to the IDE's document open event. It brings up the "Advanced Save Options" dialog if the file does not have a BOM and also matches one of the supported file extensions (this list can be modified to suit your preferences). It works for new and existing documents; if the file is read-only the dialog is suppressed.
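
The decision the macro makes on each document-open event can be sketched in a few lines. This is a language-neutral illustration in Python, not the macro's VB source; the function name, its parameters, and the extension list are all hypothetical, since the post does not show the macro's actual list.

```python
import os

# Hypothetical extension list for illustration only; the macro's real
# list is not shown in the post and can be changed to suit your preferences.
SUPPORTED_EXTENSIONS = {".cs", ".vb", ".cpp", ".h", ".html", ".xml", ".css", ".js"}


def should_prompt_for_encoding(path, has_bom, read_only):
    """Return True when the encoding dialog should be shown for this file."""
    if read_only:
        # The file can't be re-saved, so the dialog is suppressed.
        return False
    if has_bom:
        # The file already carries a byte order mark; nothing to fix.
        return False
    # Compare the extension case-insensitively against the supported list.
    extension = os.path.splitext(path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

For example, should_prompt_for_encoding("Program.cs", has_bom=False, read_only=False) returns True, while the same file opened read-only returns False.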

The EnvDTE.Document object doesn't expose anything like a "HasBOM" property, so I've worked around that by reading the first few bytes of the underlying file to detect the encoded BOM.
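
As a rough sketch of that byte-level check (again in Python rather than the macro's VB, and with a hypothetical function name), the first few bytes of the file can be compared against the well-known BOM signatures. The BOM constants come from Python's standard codecs module; which encodings the actual macro recognizes is an assumption here.

```python
import codecs

# Check the 4-byte UTF-32 BOMs before the 2-byte UTF-16 ones, because the
# UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
_BOM_SIGNATURES = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_bom(path):
    """Return the encoding name implied by the file's BOM, or None if absent."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM checked here is 4 bytes
    for bom, name in _BOM_SIGNATURES:
        if head.startswith(bom):
            return name
    return None
```

A "HasBOM" test is then simply detect_bom(path) is not None. Empty files and files shorter than a BOM fall through to None.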

To install, unpack FixFileEncoding.vsmacros from the attachment into a directory somewhere. Then use the Macro Explorer to load the macro project; when prompted, you must enable the event handling code or it won't work. If you don't feel comfortable loading someone else's macro file, I've also included the EnvironmentEvents.vb source file in the .zip file. In that case you'll need to create a new macro project and then import this module.

VS2005 does have a cool feature for XML-style documents: if the document starts with an <?xml?> declaration, its encoding value is modified to match whatever is selected in the "Advanced Save Options" dialog. Very nice.

Published Friday, June 15, 2007 10:09 PM by john

Attachment(s): FixFileEncoding.zip

Comments

# Update: Encoding source files and documents as UTF-8 in VS2005

Thursday, June 21, 2007 8:37 AM by The Emission Locus

The old code was a bit tacky, using a timer to trigger the file encoding dialog. And of course it was
