May 2008

If you manage, write, visit, or otherwise have anything to do with a web app that connects to a SQL Server database, good guy and Microsoft Program Manager Buck Woody wants you to read this:

[copied with permission from here]

You might have read recently that there have been ongoing SQL injection attacks against vulnerable web applications occurring over the last few months.  These attacks have received recurring attention in the press as they pop up in various geographies around the world. These attacks do not leverage any SQL Server vulnerabilities or any un-patched vulnerabilities in any Microsoft product – the attack vector is vulnerable custom applications. In fact, SQL Injection is a coding issue that can attack any database system, so it’s a good idea to learn how to defend against them.

In order to help you respond to and defend yourself from these attacks, Microsoft has an authoritative blog including talking points and guidance.  You can find this at this Technet location. (Retype the underlying URL if you like. I only linked it this way because it wrapped.)

Ok, if you didn’t visit the Technet link, visit it before reading on.

Thanks. Now I’ll add another bit of advice:

There’s a non-SQL injection issue here as well. The risk in question starts when a web application incorporates part of the URL into SQL and executes it blindly (SQL injection), but the risk to end users only occurs because the web app commits “HTML injection.” The web app unwittingly delivers a malicious bit of HTML that says “Hey browser, please run a script from this other web site.” That malicious bit of HTML won’t be sent to my browser if the web application doesn’t blindly incorporate table data (especially table data containing HTML tags) into the HTML pages it delivers.
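To make the first half of that concrete, here is a minimal sketch of the difference between pasting URL text into SQL and passing it as a parameter. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table, the function names, and the hostile string are all made up for the example. The real attacks went further and modified table data; this simplified version only shows how the pasted text changes the meaning of the query.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (id, name) VALUES (?, ?)",
        [(1, "Widget"), (2, "Gadget")],
    )
    conn.commit()
    return conn


def lookup_unsafe(conn, name_from_url):
    # Vulnerable: text taken from the URL becomes part of the SQL statement.
    sql = "SELECT id FROM products WHERE name = '" + name_from_url + "'"
    return [row[0] for row in conn.execute(sql)]


def lookup_safe(conn, name_from_url):
    # Parameterized: the driver treats the text as a value, never as SQL.
    sql = "SELECT id FROM products WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name_from_url,))]


conn = build_demo_db()
hostile = "x' OR '1'='1"  # hypothetical attacker-supplied URL fragment

print(sorted(lookup_unsafe(conn, hostile)))  # every row comes back
print(lookup_safe(conn, hostile))            # no product has that name
print(lookup_safe(conn, "Widget"))           # ordinary lookups still work
```

The parameterized version does not try to detect hostile input at all. It simply never lets the input be read as SQL.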

Here’s an analogy. When you fill a prescription, you get instructions like “Take one pill twice a day for seven days.” Those instructions probably get printed out of some database. If the instructions say “Chew up all the pills and wash them down with a cup of bleach,” something’s wrong with the pharmacy’s database. Something’s also wrong with the pharmacy for not catching the bogus instructions before dispensing the prescription. And if you follow the instructions, something’s wrong with you.

The risk Buck is drawing our attention to is like this, and the Technet blog tells us to secure our database. Just as importantly, we should pay attention to what we dispense, and not just assume that if we’re dispensing our data, it’s good data. Browsers often render (and in the case of scripts, execute) whatever a trusted site sends them, and if trusted sites send HTML out without vetting it, well, they shouldn’t be trusted. If you’re a web developer and you want your site to be trusted, then vet what you deliver.

I don’t do web apps, but I don’t think a responsible web app should send me script tags that refer to third-party sites. In fact, the web app probably shouldn’t send me any table data without scrubbing it for tags, non-printing ASCII characters, etc.
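As a rough sketch of what that scrubbing could look like, assuming the table data is meant to be shown as plain text and not as markup, something like the following would do. The function name and the sample value are hypothetical; html.escape is part of Python's standard library.

```python
import html


def scrub_for_html(value):
    """Make a table value safe to place inside an HTML page as text."""
    # Drop non-printing ASCII characters (BEL and friends), keeping
    # ordinary whitespace: tab, newline, and carriage return.
    printable = "".join(
        ch for ch in value
        if ch in "\t\n\r" or (ord(ch) >= 32 and ord(ch) != 127)
    )
    # Turn <, >, &, and quotes into entities so the browser displays
    # any tags as text instead of acting on them.
    return html.escape(printable, quote=True)


# Hypothetical table value that an attacker has tampered with.
tainted = 'Widget<script src="http://evil.example/x.js"></script>\x07'
print(scrub_for_html(tainted))
```

A site that legitimately stores HTML in its tables would need something more selective than this, such as an allow-list of permitted tags, but the principle is the same: decide what is allowed to go out, and do not assume the table contains only what you put there.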

Many years ago, we thought it was funny to email people BEL characters, and then someone figured out email shouldn’t be allowed to contain BEL. Years ago bulletin boards figured out they shouldn’t allow users to put any old HTML into their posts. The threat then was still minor – jokers figured out they could mess up some bulletin board formatting by posting opening tags without closing them. Apparently this was only half fixed. Web apps typically scrub what comes in through the expected channels, but a lot of web apps (most?) apparently don’t scrub the HTML they send out. They should. In fact, they must, now that the bad guys have figured out how to exploit sloppy web apps to modify table data by bypassing the expected route. The bad guys may soon find some more sloppy code and exploit it to mess with your data.

Just as it’s possible to scrub outgoing email for viruses, it should be possible (and routine) to scrub outgoing HTML for malicious content. While I don’t trust email attachments that have a “no viruses” sticker on them, and I wouldn’t trust a random site that tells me “this web page is safe,” I would trust Microsoft or another trustworthy source if they told me their web servers scrub all outgoing web pages for unexpected script tags.


Strange. The HTML source for this page shows a yield of 2.39%, not 92,318.20%. In both IE7 and Firefox, 2.39% shows up as the yield when the page is first rendered, but changes immediately to 92,318.20%. The yield is getting its significant digits from the Assets value, but why is a mystery to me.

Andrew Gelman dreams of the day when a journalist (like Ezra Klein) asks “Why?” the items on a list (like Rob Goodspeed’s) are in alphabetical order.

This drew my attention to the items on Barack Obama’s issues page, which as of today are not in alphabetical order (despite first appearance and various journalists’ reports that they are).

“Why?” is always a good question. So is “Why not?” If “Why not?” is the right question, something interesting might explain why [not]. Translation from another language, for example. A friend named Winternitz was listed as the first author of many joint papers, even after the citations were translated from Russian to English.

Why aren’t the items on Barack Obama’s issues page in alphabetical order? I don’t have an answer, but I wonder: Was the “Seniors & Social Security” issue once the “Social Security” issue?

… that teaching mathematical concepts with confusing real-world examples is not a good idea.

Last month, the journal Science published an article about learning mathematics. Many newspapers and magazines picked up the story and pitched it at readers, summarizing the research result more or less like ars technica did [here]: “[S]tudents who learn through real-world examples have a difficult time applying that knowledge to other situations.”

This doesn’t agree with my experience in the classroom, and I sought the source. The full text of the Science article isn’t free, and the following remarks are based on what seems to be an earlier report of the same research by the same authors: “Do Children Need Concrete Instantiations to Learn an Abstract Concept?,” in the Proceedings of the XXVIII Annual Conference of the Cognitive Science Society [accessed at on May 6, 2008.] Here the Ohio State researchers described their experiment.

In the first phase, subjects learned a mathematical concept.

[The concept] was a commutative group of order three. In other words, the rules were isomorphic to addition modulo three. The idea of modular arithmetic is that only a finite number of elements (or equivalent [sic] classes) are used. Addition modulo 3 considers only the numbers 0,1, and 2. Zero is the identity element of the group and is added as in regular addition: 0 + 0 = 0, 0 + 1 = 1, and 0 + 2 = 2. Furthermore, 1 + 1 = 2. However, a sum greater than or equal to 3 is never obtained. Instead, one would cycle back to 0. So, 1 + 2 = 0, 2 + 2 = 1, etc.
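For readers who would rather see the whole group at once, here is a small sketch that builds the addition table modulo 3 and checks the two properties the quoted passage relies on. The names are my own, not the paper's.

```python
ELEMENTS = (0, 1, 2)


def add_mod3(a, b):
    """Add two elements and cycle back to 0 when the sum reaches 3."""
    return (a + b) % 3


# Row a, column b holds a + b (mod 3).
table = [[add_mod3(a, b) for b in ELEMENTS] for a in ELEMENTS]

for row in table:
    print(row)

# Zero is the identity: adding it changes nothing.
identity_ok = all(add_mod3(0, x) == x and add_mod3(x, 0) == x for x in ELEMENTS)

# The group is commutative: order of the operands does not matter.
commutative_ok = all(
    add_mod3(a, b) == add_mod3(b, a) for a in ELEMENTS for b in ELEMENTS
)

print(identity_ok, commutative_ok)
```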

Subjects learned this concept through one of two scenarios: a scenario using geometric symbols with “no relevant concreteness” or a scenario “with relevant concreteness” using measuring cups containing various levels of liquid.

Ok, I’ll bite. I was a New Math child, and Mrs. Szeremeta taught us modular arithmetic with (analog) clocks. A “liquid left over” scenario with measuring cups ought to work, too. Most students “know” the idea of measuring cups.

Mrs. Szeremeta’s clocks worked, because the clock scenario contained “relevant concreteness,” and because its concreteness was familiar. For a concrete example to be a good teaching tool for an abstract concept, the concreteness has to be both relevant and familiar. Relevant means the concrete instantiation has to behave in real life more or less according to the rules of the abstract concept. Familiar means students know or can quickly learn how it works in real life. As the authors correctly observe, “the perceptual information communicated by the symbols themselves can act as reminders of the structural rules.”


In addition to scenarios “with no concreteness” and “with relevant concreteness,” other kinds of scenario exist: ones “with confounding concreteness,” or ones with “irrelevant concreteness,” or ones with “distracting concreteness.” A scenario with confounding concreteness is one that draws on the familiar, but where the familiar behavior works contrarily to the rules of the abstract concept.

Here is the authors’ concrete scenario:

To construct a condition that communicates relevant concreteness, a scenario was given for which students could draw upon their everyday knowledge to determine answers to test problems. The symbols were three images of measuring cups containing varying levels of liquid. Participants were told they need to determine a remaining amount when different measuring cups of liquid are combined.

So far so good, but unfortunately, the three images used to represent the equivalence classes were those of a 1/3-full, a 2/3-full, and a full measuring cup, representing the equivalence classes [1], [2], and [0] respectively.

Yes, the authors used a full measuring cup, not an empty one, to represent the additive identity, zero. The upshot of the experiment, then, in my opinion, was to compare an abstract implementation (geometric symbols with no concreteness), with a second implementation having both relevant and confounding concreteness. Relevant, because combining liquid and considering remainders works like addition in this group, but confounding, because one notion of the abstract concept is the idea of an additive identity (zero, nothing), which students learned to equate with a full measuring cup, which contains something, not nothing.

Students were expected to report the amount of liquid remaining after combining two amounts (but the result could not be “none”). In both the “cups” domain and the “no relevant concreteness” domain where squiggle = 0, disk = 1, and rhombus = 2, students learned to report correct answers (called “remainders” or “results”, depending on the domain) from the action of combining items.
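The two training domains can be sketched side by side as relabelings of the same arithmetic. This is my own reconstruction from the description above, not the authors' materials; the label strings and the combine function are made up for the illustration.

```python
# Each domain maps its three images to a class modulo 3.
CUP_CLASS = {"1/3 full": 1, "2/3 full": 2, "full": 0}
SYMBOL_CLASS = {"squiggle": 0, "disk": 1, "rhombus": 2}


def combine(domain, x, y):
    """Combine two images from one domain and return the resulting image."""
    total = (domain[x] + domain[y]) % 3
    # Look up which image stands for the resulting class.
    image_for_class = {cls: image for image, cls in domain.items()}
    return image_for_class[total]


# Same rule, two vocabularies.
print(combine(CUP_CLASS, "1/3 full", "2/3 full"))   # the "remainder"
print(combine(SYMBOL_CLASS, "disk", "rhombus"))     # the "result"

# The element that changes nothing when combined is the full cup.
print(combine(CUP_CLASS, "full", "1/3 full"))
```

Seen this way, the oddity is plain: in the cups domain the element that leaves everything unchanged is the image of a cup with the most liquid in it.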

The now combining-savvy students were challenged to learn similar rules in a new domain. The new symbols were images of a round ladybug, a tallish vase, and (I think) a cabochon ring viewed so its projection on the page is an eccentric ellipse with larger axis horizontal. They were told that the rules of this new system were similar to what they had previously learned, and that they should figure them out using their new knowledge.

Coincidentally, or perhaps not, the three images somewhat resembled baroque renditions of the earlier disk, rhombus, and squiggle, but bore little similarity to variously-filled measuring cups. Subjects were taught a “game” where two students each pointed to an item, and a “winner” then pointed to one. Subjects were expected to learn which object the winner would point to. Students might have discovered that the visual similarities disk~ladybug, rhombus~vase, and squiggle~cabochon explained the new behavior.

Certainly the new domain was an abstract one. Neither a pointing game like this nor any “way” of combining vases, ladybugs, and cabochons is part of the real world. I’d consider this new domain a purely symbolic one. When we teach mathematics, we want students to be able to transfer their knowledge to new domains both applied (with relevant concreteness) and abstract (symbolic, or with no relevant concreteness).

It doesn’t surprise me that students master a new abstract domain more easily if they’d previously mastered one; and I’d expect it to be easier to master a new domain with relevant concreteness if you’d previously mastered one of those.

The short of it? Interesting research, but the experimental design is flawed as far as answering the question posed. Certainly this is no reason to give up finding creative, relevant, and familiar examples for abstract mathematical ideas.