Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Most tragically, I have several TFS team projects with an en dash in their names.

If you are not familiar with the en dash, open MS Word, hold Alt, and type 8211 on the numeric keypad.

You will see a longer-looking dash. (Like this if it renders for you: –). If you open Notepad you will see a ? and if you try it in most Unicode editors you will see ?.

But I need it to be a dash, because I need to run a batch file against all my projects. The paths to the projects now have this en dash in them.

I make a file that holds the names of the projects and I feed them to my batch file.

But when it runs it puts ? in place of the –.

What can I do to keep my dash?

NOTE: I have two batch files. The first runs the second with parameters from a file.

My First one can be seen here. The second can be seen here. An example of the input file can be seen here.



1 Answer

The en dash character is Unicode code point U+2013. In Windows code page 1252 it is character number 150 (byte 0x96). In code page 437, character number 150 is û. So if û shows up where the en dash should be, it looks like one process is writing the file in code page 1252 while another is reading it using code page 437.
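The mismatch can be reproduced outside the console. The following is a minimal Python sketch, not part of the original batch setup, that encodes the en dash the way a code page 1252 writer would and decodes the resulting byte the way a code page 437 reader would. It uses only Python's built-in `cp1252` and `cp437` codecs.

```python
EN_DASH = "\u2013"  # the en dash, U+2013

# What a process using code page 1252 stores on disk: a single byte.
written = EN_DASH.encode("cp1252")

# What a process using code page 437 makes of that same byte.
misread = written.decode("cp437")

# Print numeric values only, so the output is safe on any console code page.
print("byte value:", written[0])
print("read back as: U+%04X" % ord(misread))
```

The byte value is 150 in both code pages. Only the character assigned to it differs.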

Ideally, all code pages other than Unicode should be thrown out along with World Wars, smallpox, and other relics of the 20th century. Unfortunately, the Windows console makes this rather difficult.

Since code page 437 is the default console code page for most Windows installations, I suspect that it is this default setting that is causing these problems. (File names are stored by the OS in Unicode, so that part, at least, is correct.) Since code page 437 does not include the en dash character, any system using that code page will have to resort to a fallback mechanism to render the file names, such as a question mark.
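As a rough illustration of that fallback, the sketch below tries to encode a name containing an en dash in code page 437. The project name is hypothetical, and Python's `errors="replace"` handler only approximates what Windows does; the actual Windows conversion may pick a different substitute character.

```python
name = "Team \u2013 Project"  # hypothetical project name containing an en dash

# A strict conversion fails because code page 437 has no en dash.
try:
    name.encode("cp437")
    representable = True
except UnicodeEncodeError:
    representable = False

# A lenient conversion substitutes a question mark for the missing character.
fallback = name.encode("cp437", errors="replace").decode("cp437")

print(representable)
print(fallback)
```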

By changing the console code page to something that does support the en dash character, such as 1252, this problem may be corrected.
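Before picking a code page, it may help to check which characters in the project names it can actually represent. The helper below is a hypothetical sketch (the function name and the sample project name are assumptions, not part of the original setup) that lists the characters a given code page cannot encode.

```python
def unsupported_chars(text, codepage):
    """Return the characters in text that the given code page cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


name = "Team \u2013 Project"  # hypothetical project name containing an en dash

# Report problem characters by code point, so the output is console-safe.
for codepage in ("cp437", "cp1252"):
    missing = unsupported_chars(name, codepage)
    print(codepage, ["U+%04X" % ord(ch) for ch in missing])
```

An empty list for a code page means every character in the name survives a round trip through it.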

You can change this code page with the following command.

chcp 1252

This command should probably be placed at the beginning of your batch file.

This is a terrible hack that will be necessary until you can convert your system to something modern that supports Unicode from top to bottom.

You may also want to try it in PowerShell, since PowerShell does support Unicode.

