Words

by James Classen

“Political Correctness” is all about going too far, I know. But this is just ridiculous. According to the folks at Language Log, the potentially offensive phrases mentioned in the original article (“hold down the fort”, “rule of thumb”, “going Dutch”, “handicap”) are not offensive at all. “Hold [down] the fort”, cited as being offensive to Native Americans, was apparently used by Sherman speaking of defending a fort against Confederate forces during the Civil War, not against Native Americans. It’s also used regarding defense against cattle rustlers, train robbers, horse thieves, and others, but there is no mention of “savage Indians” anywhere. “Rule of thumb” has nothing to do with wife beating. Twice in the 19th century judges acknowledged a supposed principle allowing a husband to beat his wife with a stick no thicker than his thumb, but they gave no citation for it (and summarily ruled against the abusers anyway). And this was certainly not the origin of the phrase. “Going Dutch” doesn’t imply that the people of the Netherlands are cheap, but possibly refers to a Dutch-style door, or to Dutch culture (where it is often appropriate to pay separately). And “handicap” has nothing to do with begging; it comes from a game of chance.

Here’s an interesting tidbit, though: when it comes to the use of the words “chord” and “cord”, everyone is wrong!

More interesting artifacts of a bygone age: what is the icon you click on in most applications to save a file? A floppy disk. Who uses those anymore? What do you click to send or receive e-mail? An envelope, which is also rarely used these days. What do you look for on your smartphone when you want to call someone? An “antique” handset. Oh, and “Wi-Fi” was coined as a brand name and has nothing to do with “hi-fi”: it doesn’t stand for “wireless fidelity” the way “hi-fi” stands for “high fidelity”, and nothing in IEEE 802.11 indicates quality.


Now, some tips on grammar, Microsoft Word, and Windows-based input in general. First up: hyphenation, which is trickier than you may realize. Most people use the “-” key on their keyboard without realizing its implications. The character it produces was and is called the “hyphen-minus”, and it carried that dual meaning throughout the days of the 7-bit US-ASCII character set and even the 8-bit ISO-8859-1 character set (not to be confused with ISO/IEC 8859-1:1998 or Windows codepage 1252). That’s all well and good, but Unicode has been around since 1988, and has been supported by the world’s dominant operating system (Windows) since Windows 95 (remember “unicows.dll”?). Because of this, we now have a more specific character for each of those jobs. Many will say they can’t spot the difference between -, ‐, −, —, –, and ‑, or the sometimes invisible ­ (yes, there is something in there). I can, and do, notice the difference most of the time, though the one between ‐, ­, and ‑ is impossible to spot unless it occurs at an obvious location.

Proper usage, in my estimation (I am only an amateur grammarian):

First, terminology: the hyphen-minus, U+002D (I’ll cover character entry in a bit). The hyphen, U+2010. The non-breaking hyphen, U+2011. The minus sign, U+2212. The em-dash, U+2014. The en-dash, U+2013. The figure dash, U+2012. The soft hyphen, U+00AD.
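To double-check which character is which, here is a small Python sketch that prints the code point and the official Unicode name for each character in this post. The `DASHES` list and the `describe` helper are names I made up for illustration; `unicodedata.name` is part of Python's standard library. The list also includes the non-breaking hyphen, U+2011, which comes up in the usage notes further down.

```python
import unicodedata

# The dash-like characters discussed in this post, written as escapes
# so they cannot be confused with one another in the source code.
DASHES = [
    "\u002D",  # hyphen-minus
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en-dash
    "\u2014",  # em-dash
    "\u2212",  # minus sign
    "\u00AD",  # soft hyphen
]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in DASHES:
    print(describe(ch))
```

Pasting a mystery dash from a document into `describe` is a quick way to find out which one the author (or Word) actually used.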

Basic character entry: on Windows, small numbers (less than 256) can be entered by holding the “Alt” key and typing the digit “0” followed by the character number. For instance, what I list above as U+002D is Unicode (hexadecimal) code point 2D, which translates to the decimal number 45. Thus it can be entered by holding “Alt” while typing the digits “045” in sequence on the number pad (not the row of numbers above the letters), then releasing “Alt”. Likewise, U+00AD can be entered with Alt+0173 (this latter notation is fairly standard for discussing entry of these characters). The leading zero is important. If you leave it out, you may wind up with an unexpected character, though there’s a time and place for those as well (they’re codepage-specific; perhaps I’ll cover that some other time).
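The hexadecimal-to-decimal step is easy to get wrong by hand, so here is a minimal Python sketch of the arithmetic. The function name `alt_code` is my own invention, and it only does the conversion described above. One caveat, as far as I know: Alt+0nnn numbers are interpreted in the Windows codepage, so values from 128 to 159 will not land on the matching Unicode code points, and this sketch makes no attempt to handle that.

```python
def alt_code(code_point):
    """Turn a small code point such as 'U+00AD' into 'Alt+0173' notation."""
    # Strip the 'U+' prefix and read the rest as a hexadecimal number.
    value = int(code_point.upper().replace("U+", ""), 16)
    if value > 255:
        raise ValueError("the Alt+0nnn method only covers values below 256")
    # A literal leading zero, followed by the decimal character number.
    return f"Alt+0{value}"


print(alt_code("U+002D"))
print(alt_code("U+00AD"))
```

Anything above 255 raises an error here, since those characters need the hexadecimal method instead.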

Advanced character entry: for numbers larger than 255, Windows does have a built-in method of entry, but it’s not enabled by default. The registry value EnableHexNumpad (REG_SZ) under the key HKCU\Control Panel\Input Method must be set to “1”. Then, when entering these higher values, hold “Alt” while pressing the numpad “+”, followed by the hexadecimal value, with the digits entered from the numpad and the letters from the keyboard. So U+2014 is entered by holding “Alt” while entering, sequentially, “+2014” on the number pad, then releasing “Alt”. It’s also possible to enter the smaller values this way, using the hexadecimal notation: U+00AD would be entered by holding “Alt” while entering “+ad” on the keyboard, with the plus sign from the number pad. Specific to Microsoft Word (and Wordpad), enter the hexadecimal number, like 2212, highlight it, and press “Alt+X”. If the digits are not preceded by a valid hexadecimal character, you may press “Alt+X” without highlighting (this key combination will also decode character values, should you wish to know what a single character’s hexadecimal value is).
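If you would rather not open the registry editor, the setting can also be written from Python with the standard-library `winreg` module. This is only a sketch: the function name `enable_hex_numpad` is mine, it obviously only works on Windows, and I believe you need to sign out and back in before the change takes effect.

```python
import sys

# Location and name of the setting described above.
KEY_PATH = r"Control Panel\Input Method"
VALUE_NAME = "EnableHexNumpad"


def enable_hex_numpad():
    """Set EnableHexNumpad to "1" (REG_SZ) for the current user."""
    if sys.platform != "win32":
        raise OSError("the registry only exists on Windows")
    import winreg  # imported here so the file still loads elsewhere

    # CreateKey opens the key, creating it first if it does not exist.
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_SZ, "1")


if __name__ == "__main__" and sys.platform == "win32":
    enable_hex_numpad()
```

It writes under HKEY_CURRENT_USER, so it should not need administrator rights.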

Some additional warnings: Microsoft Word will often change the font on you when entering specific characters. I’ve seen it switch from the default Cambria (version 2007) to Cambria Math or MS Mincho, depending on the character. Sometimes this is required, but that is often not the case, as Cambria and Calibri (the fonts that ship with Office 2007) handle most of the code points used in western typography. Some applications will hijack the last entry method explained, and ignore what it’s supposed to mean.

Back to usage:

Use a hyphen or hyphen-minus to join words to form a compound (where that compound is not in common usage), or to eliminate ambiguity, say between recreation and re-creation, or between a man-eating shark and a man (who is) eating shark. When adding a prefix to a proper noun, insert a hyphen (un-American). There are other rules, but these are the most common. When to use which is discussed below.

Use a soft hyphen when a word (especially a long word) can be broken at a specific point within, but for one reason or another, line breaks at word boundaries in the surrounding text would be awkward.

Use a hard hyphen, also called a non-breaking hyphen (U+2011), when a hyphen is necessary but a break across lines at that point is undesirable. In business, at least mine, this is often an issue with part numbers. When a part number is called out in a report, it will occasionally break at a hyphen-minus, and I admonish report authors to use the non-breaking version to avoid the break.
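For reports that are generated or cleaned up by a script, the part-number fix can be automated. The Python sketch below assumes a hypothetical part-number format (groups of digits joined by hyphen-minus characters), since I haven't described my company's real format here; the pattern and the function name `protect_part_numbers` are illustrative only.

```python
import re

NON_BREAKING_HYPHEN = "\u2011"

# Hypothetical format: two or more groups of digits joined by hyphen-minus.
PART_NUMBER = re.compile(r"\b\d+(?:-\d+)+\b")


def protect_part_numbers(text):
    """Swap the hyphen-minus inside part numbers for a non-breaking hyphen."""
    return PART_NUMBER.sub(
        lambda match: match.group(0).replace("-", NON_BREAKING_HYPHEN),
        text,
    )


report = "Use part 123-4567-89 in the re-created jig."
print(protect_part_numbers(report))
```

Ordinary hyphenated words are left alone because the pattern only matches digits. A phone number or a numeric range typed with a hyphen-minus would match too, so a real version would need a stricter pattern.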

Use a minus sign when discussing subtraction. On a related topic, use the multiplication sign (U+00D7), interpunct/middle dot (U+00B7), or dot operator (U+22C5) when discussing multiplication, rather than the asterisk (U+002A) or letters “X” (U+0058) or “x” (U+0078).

Use a figure dash as a separator between groups of numbers, but not to indicate a range or subtraction. For instance, in a US phone number: 555‒1212. This would also be an appropriate substitute in my company’s part numbers (as they are all digits).

Use an en-dash when discussing ranges of values, whatever they may be (numbers, dates, times, page numbers, etc.). Examples: “Jimmy Carter was President from January 1977–January 1981.” “Product designed for ages 3–5.” “For your assignment, read pages 16–38.” “The program lasts from 7:30 pm–10:00 pm.” It should not stand in for “to” or “and” in ranges of SI quantities: write “between 17 V and 28 V” in full, so as to avoid any possibility of confusion with subtraction. Another use of the en-dash is to denote a relationship between objects or names. “The Supreme Court voted 5–4…” “The Shays–Meehan bill…” “Kansas City Royals 4–2 victory…” Again, there are more rules, but these are the common ones.

Use an em-dash when inserting a parenthetical statement, as an alternative to parentheses. Also use it when quoting someone whose last word is cut off for some reason (reminiscence, interruption). See many of my entries in this blog for uses of the em-dash in the former sense. And enjoy this quote from Star Wars: A New Hope for an example of the latter: “I sense something; a presence I’ve not felt since—”.

Hope that clears things up!


So in 1952 a PhD student at MIT named David Huffman chose to write a term paper instead of taking a final exam for his information theory class, taught by Robert Fano. With this term paper, “A Method for the Construction of Minimum-Redundancy Codes”, he bested his professor’s collaborative work with Claude Shannon, the founder of information theory, at constructing an optimal binary encoding.

I heard this story when I took my information theory class, though I don’t think we were ever tasked with even implementing Huffman coding. So, out of boredom, I put together something in Python that does the job. Its usefulness is severely limited, as it expects a string input and produces a string output (plus the generated tree as a nested…well, list, basically). However, it does produce an optimal binary encoding for the input text, and can compress “The Raven” to 57.09% of its original size: 3603 bytes compared to 6311 in the original. On top of that, the tree would have to be stored as well, and in its Python format it is 1074 characters. Anyway, the longer the text, the better the ratio I’m likely to get, and the smaller the relative overhead of storing the Huffman tree.

But this is just step one of a larger goal: I wish to provide a clear guide to which audio compression algorithms make the most sense, and are objectively the “best”, for a particular application. That means a comparison between the big contenders in the lossy market: MP3, AAC, and Windows Media in the non-free corner, and Ogg Vorbis and Opus in the free corner. Along the way I suspect I’ll create image encoders and decoders and implementations of one or more compression algorithms (like DEFLATE), but I’m aware that, in the end, the choices of which bits of sound to chop away are what is going to matter.

This is a research project on my part, and, while I don’t expect to win any awards, hopefully what I turn up will help someone, somewhere, in their understanding of the algorithms involved and how best to digitally store their music libraries.
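For anyone curious what such a thing looks like, here is a minimal Huffman coder in Python. It is a fresh sketch written for this post, not the script that produced the numbers above, and every function name in it is my own choice. Like the program described, it takes a string, returns a string of “0” and “1” characters, and hands back the tree as a nested list.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Build a Huffman tree as nested [left, right] lists with character leaves."""
    if not text:
        raise ValueError("cannot build a tree for empty text")
    tiebreak = count()  # unique counter so the heap never compares tree nodes
    heap = [(weight, next(tiebreak), symbol)
            for symbol, weight in Counter(text).items()]
    heapq.heapify(heap)
    # Repeatedly merge the two lightest nodes until a single tree remains.
    while len(heap) > 1:
        weight1, _, left = heapq.heappop(heap)
        weight2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (weight1 + weight2, next(tiebreak), [left, right]))
    return heap[0][2]


def make_codes(tree, prefix=""):
    """Walk the tree and return a {character: bit string} table."""
    if isinstance(tree, str):
        # A text with one distinct character still needs a one-bit code.
        return {tree: prefix or "0"}
    left, right = tree
    codes = make_codes(left, prefix + "0")
    codes.update(make_codes(right, prefix + "1"))
    return codes


def encode(text):
    """Return (bits, tree), where bits is a string of '0' and '1' characters."""
    tree = build_tree(text)
    codes = make_codes(tree)
    return "".join(codes[ch] for ch in text), tree


def decode(bits, tree):
    """Rebuild the original text from the bit string and the tree."""
    if isinstance(tree, str):
        return tree * len(bits)
    out, node = [], tree
    for bit in bits:
        node = node[int(bit)]  # 0 goes left, 1 goes right
        if isinstance(node, str):
            out.append(node)
            node = tree
    return "".join(out)


text = "abracadabra"
bits, tree = encode(text)
print(len(bits), "bits instead of", 8 * len(text))
assert decode(bits, tree) == text
```

Since the output is a string of characters rather than packed bits, the compressed size in bytes is the length of `bits` divided by eight, rounded up, before counting whatever it costs to store the tree.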

Speaking of which, I’ve given in and installed iTunes on this computer. I refuse to give it control over my library, and my desktop player of choice will remain Winamp, unless/until I write my own player.

To continue my rambling: there was a new version of Catalyst that, for some reason, my computer wasn’t picking up in its daily checks for updates (checking daily is a little excessive, and it still didn’t notice?). Now that it’s installed, either the 2D driver 8.01.01.1253 or some other part of the driver package 8.982-120727a-145524C-ATI seems to have solved the blue screen issue. Yay for dual monitors! Now I need a decent mount for them…

On to games. Wizards has finally recognized that by ceasing the printing of its most popular table-top RPG rulebooks it was cheating itself out of (I’m sure quite a bit of) cash. So I was pleasantly surprised to find three core books for D&D 3.5 on the shelf at Prairie Dog today when I picked up my comics. I have purchased World of Warcraft: Mists of Pandaria, which comes out on the 25th. With the massive changes they made in version 5.0.4 (and have already patched to 5.0.5), I’m still re-learning how to play a warlock. But I’ll be a Panda monk just like a zillion other people come the 25th. And Batman: Arkham City Game of the Year edition finally came out on Steam. Just shy of a full year after initial release. And honestly? I’m no longer excited about it. After the newness of being a Panda monk wears off, I’ll probably play it, but at this point I’m not bothered. Perhaps I’ll pick it up during one of Steam’s famous awesome sales.

Perhaps it’s the apathy of the week catching up to my stomach, but right now food bores me. I’m hungry, it’s past 9pm and I haven’t eaten dinner, but I look in the fridge and, despite there being plenty of sandwich-making materials, I don’t want a sandwich. Besides, I’m out of bread. And milk. But when I think about going out, I’ve had this problem all week of not really wanting to eat anywhere. So I make my usual rounds: Pizza Hut, McAlister’s, Schlotzsky’s. Tomorrow’s lunch with the guys will probably be Genghis or the Mexican Irish place (Carlos O’Kelly’s), and before the week is up I’m sure I’ll see the inside of Freddy’s and/or Mr. Goodcents. All with zero enthusiasm. Cooking for one sucks (plus I hate the dishes that are concomitant with that duty). Eating out 7+ meals a week is extremely hard on the wallet. Perhaps it’s the 30 years of a sandwich every day that’s got me down, but it hasn’t bothered me until now, and it doesn’t feel like the blame belongs there. And the cooking-for-two option has a broken foot: I had to get a replacement credit card, and eHarmony won’t let me back in on a month-to-month plan. The only way I can re-up my subscription is by paying for 3, 6, or 12 months now, and the 3-month plan has a higher rate than month-to-month does (how stupid is that?). Not that I was talking to anyone. Maybe I’ll meet someone at The Color Run, or at “Sleepwalk with Me”. Unlikely, but possible.

Well, I need to stick something in this pie-hole, so I’ll call this entry quite long enough and begin another one tomorrow (which might be posted in a day, or a week, or a month).
