Thursday, April 26, 2012

You did _what_? Form-based XML editing

My most recent project has been building an editing system for APIS (the Advanced Papyrological Information System—I know, I know) data as part of the Papyrological Editor (PE). APIS records are now TEI files. They started out using a text-based format that was MARC-inspired (fields, not structure), but I converted them to TEI a while back to bring them in line with the other papyri.info data. HGV records (also TEI) already had an editing system in the PE, so it seemed like the task of adding an APIS editor, which would be a simpler beast, wouldn't be that hard, just a matter of extending the existing one. In fact, it was an awful struggle.

Partly this is my fault. The PE is a Rails application, running on JRuby, and my Rails knowledge was pretty rusty. I loved it when I was working on it full time a few years ago, but coming back to it now, and working on an application someone else built, I found it opaque and mentally exhausting. I had a hard time keeping in my head the different places business logic might be found, and figuring out where to graft on the pieces of the new editor was a continual fight against coder's block. The HGV editor relies on a YAML-based mapping file that documents where the editable information in an HGV document is to be found. This mapping is used to build a data structure that is used by the View to populate an editing form, and upon form submission, the process is reversed.

It's not at all unlike what XForms does, and in fact I was repeatedly saying to myself "Why didn't they just use XForms?" I got annoyed enough that I took some time to look at what it would take to just plug XForms into the application and use that for the APIS editor. The reluctant conclusion I came to was that there just aren't any XForms frameworks out there that I could do this with. And the XForms tutorials I was looking at didn't deal with data that looked like mine at all. TEI is flexible, but not all that regular, and every example I saw dealt with very regular XML data. Moreover, the only implementation I could find that wasn't a server-side framework (and I wasn't going to install another web service just to handle one form) was XSLTForms. It is impressive, but relies on in-browser XSLT transforms, which is fine if you have control of the whole page, but inconvenient for me, because I've already got a framework producing the shell of my pages. I just wanted something that would plug into what I already had. A bit sadder but wiser, I decided the team who built the HGV editor had done what they had to given what they had to work with.

Then, about a month ago, I got sick. Like bedridden sick. I actually took 3 days off work, which is probably the most sick leave I've taken since my children were born. While I was lying in bed waiting for the antibiotics to kick in, I got bored and thought I'd have a crack at rewriting the APIS editor. In JavaScript. There was already a perfectly good pure-XML editing view, where you can open your document, make changes, and save it back (having it validated etc. on the server), so why not build something that could load the XML into a form in the browser, and then rebuild the XML with the form's contents and post it back? Doesn't sound that hard. And so that's what I did. I should say, this seemed like a potentially very silly thing to do, adding another, non-standard, editing method that duplicates the functionality of an existing standard to a system that already has a working method. I'd spent a lot of time on figuring out how to refactor the existing editor because that seemed like the Right Thing To Do. Being sick, bored, and off the clock gave me the scope to play around a bit.

Let me talk a bit more about the constraints of the project. Data is stored in XML files in a Git repository. This means that not only do I want my edits to be sending back good XML, I want it to look as much as possible like the XML I got in the first place, with the same formatting, indentation, etc., so that Git can make sensible judgements about what has actually changed. I might want some data extracted from the XML to be processed a bit before it's put into the form, and vice versa. For example, I have dates with @when or @notBefore/@notAfter attributes that have values in the form YYYY-MM-DD, but I want my form to have separate Year, Month, and Day fields. Mixed (though only lightly mixed) content is possible in some elements. I need my form to be able to insert elements that may not be there at all in the source. I need it to deal with repeating data. I need it to deal with structured data (i.e. an XML fragment containing several pieces of data should be treated as a unit). And of course, it needs to be able to put data into the XML document in such a way that it will validate, so the order of inserted elements matters. Moreover, I need to build the form the way the framework expects it to be built, as a Rails View.
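To make the date example concrete, here is a minimal sketch of the kind of conversion pair I mean: one function on the way out of the XML, one on the way back in. The names splitDate and joinDate are made up for illustration, and accepting partial dates (YYYY or YYYY-MM) is an assumption on top of the YYYY-MM-DD form mentioned above.

```javascript
// Hypothetical helpers, not the editor's actual code.
// Split an attribute value such as "0154-03-07" into separate form fields.
// Assumption: partial dates (YYYY or YYYY-MM) are also accepted.
function splitDate(value) {
  const match = /^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(value.trim());
  if (!match) return null; // unrecognized value: let the caller decide what to do
  return { year: match[1], month: match[2] || "", day: match[3] || "" };
}

// Rebuild the attribute value from the form fields, zero-padding month and day.
// A day without a month is ignored, since YYYY--DD is not a usable value.
function joinDate(fields) {
  if (!fields.year) return "";
  const parts = [fields.year];
  if (fields.month) {
    parts.push(fields.month.padStart(2, "0"));
    if (fields.day) parts.push(fields.day.padStart(2, "0"));
  }
  return parts.join("-");
}
```

The point of keeping these as a matched pair is that a value which goes through the form untouched should come back byte-for-byte the same, so Git sees no change.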

So the tool needs to know about an instance document:

  1. how to get data out of the XML, maybe running it through a conversion function on the way out
  2. how to map that data to one or a group of form elements, so there needs to be a naming convention
  3. how to deal with repeating elements or groups of elements
  4. how to structure the data coming out of the form, possibly running it through a function, and being able to deal with optional sub-structures
  5. how to put the form data into the correct place in the XML, so some knowledge of the content model is necessary
  6. how to add missing ancestor elements (maybe with attributes) for new form data
Basically, it needs to be able to deal with XML as it occurs in the wild: nasty, brutish, and (one hopes) short.
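Point 5 is the one that is easiest to underestimate, so here is a rough sketch of what "some knowledge of the content model" could look like in its simplest form: an ordered list of child names, and a function that works out where a new child belongs. The childOrder list and the insertionIndex function are hypothetical, and a flat ordered list is a big simplification of a real schema.

```javascript
// Hypothetical content model: child element names in the order the schema
// expects them. The names and their order are illustrative only.
const childOrder = ["title", "author", "origDate", "origPlace", "note"];

// Given the names of the children already present (in document order),
// return the index at which a new child called `name` should be inserted
// so that the children still follow `order`.
function insertionIndex(existingNames, name, order) {
  const rank = order.indexOf(name);
  if (rank === -1) throw new Error("Unknown element: " + name);
  // Insert before the first existing child that must come after the new one.
  // Children with the same name are skipped, so repeats are appended to
  // their group; children missing from `order` are ignored.
  for (let i = 0; i < existingNames.length; i++) {
    if (order.indexOf(existingNames[i]) > rank) return i;
  }
  return existingNames.length; // nothing comes later: append at the end
}
```

In a browser DOM the returned index would typically feed something like parent.insertBefore(newElement, parent.children[index] || null). Note that this says nothing about the whitespace around the new element, which is a separate problem.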

Form-based XML editing is one of those things that sounds fairly easy, but is in fact fraught with complications. It's easy to get data out of XML (XPath!), and it's easy to manipulate an XML DOM, removing and adding elements, attributes, and text nodes. But it's actually quite hard to get everything right, and to make the XML's formatting stay consistent. In my next post, I'll talk about how I did it.

Tuesday, March 20, 2012

How to Apologize

The latest regrettable spasm of sexism in the programming world played out this afternoon, when a company called Sqoot announced a hackathon and caused the event to implode before it ever began by including this infuriating and insensitive line under "perks" [Update: to be clear, the context of the original page made it obvious that the presence of women serving beer was one of the perks for attendees]:
Women: Need another beer? Let one of our friendly (female) event staff get that for you.
Gag. Sqoot fairly quickly realized they had walked into a buzzsaw, as lots of people called them on it, and their sponsors started pulling their support. It's rather nice to see that kind of quick, public reaction. Cloudmine's blog post about it particularly impressed me. Sqoot issued an apology fairly swiftly, which I quote below:
Sqoot is hosting an API Jam in Boston at the end of March. One of the perks we (not our sponsors) listed on the event page was:

“Women: Need another beer? Let one of our friendly (female) event staff get that for you.”

While we thought this was a fun, harmless comment poking fun at the fact that hack-a-thons are typically male-dominated, others were offended. That was not our intention and thus we changed it.

We’re really sorry,

Avand & Mo
This didn't do much for a lot of people, but it got me thinking about apologies in tech in general, since they are actually crucial moments in the interaction between you and your customers/audience. When I worked at Lulu, Bob Young used to say that whenever you screw up, it's actually a tremendous opportunity to win a customer's loyalty by making it right. This applies both to small screwups (a customer's order never made it) and large ones (you did something that made lots of people mad). It strikes me that in this day and age, when the "non-apology" has become so frequent, people may actually not realize when it isn't appropriate to use conditional or evasive language in apologies. It's one thing if you're worried about being sued and can't admit culpability, or if you're someone like Rush Limbaugh, who's presumably concerned about appearing to back down in front of his audience. But if you're actually intent on repairing the damage done to your relationship with your customer or your audience, you need to be able to apologize properly.

So what are the elements of a good apology?

  1. I hear you.
  2. I am truly sorry.
  3. (semi-optional, depending on what happened) This is what went wrong.
  4. I am doing x to make sure this doesn't happen again and y to make it right with you.
  5. Thank you. I appreciate the feedback.
#1 is crucial. The person or group you're addressing has to know that you've heard their complaint and understand it. Apologies that lack this element sound cold and disconnected. And this is the main problem with Sqoot's "others were offended."  They aren't speaking to the people they offended. This is just guaranteed to further piss people off.

#2 should be unconditional. Not "I'm sorry if you were offended." Indeed, if you find yourself pushing the focus onto the people whom you pissed off at all, you may be sliding into non-apology territory. This isn't about them—they're mad because you made them mad. Note that a good apology is not defensive, and does not attempt to shift the blame, even if that blame belongs to an employee whom you've just fired.  If you did that, it's part of #4, the "how I'm fixing it" part, not the "I'm sorry" part. Don't try to save face in a genuine apology. Indicating that you meant no harm is fine, but if you're apologizing, it means you caused harm regardless of your intent.

#3 is a bit more tricky. People want to know how this could have happened, but it doesn't do to dwell on it too much, and this is another mistake Sqoot makes. They probably shouldn't quote the line that made everyone mad (it will make the readers mad all over again). It would have been enough to say they put something stupid and sexist into an event page which they now regret. On the other hand, you do have to acknowledge what happened and not look like you're trying to dodge it. So don't go into excruciating detail about what went wrong with a customer's order, for example. "I'm afraid you found a bug in our shopping cart" is probably enough detail. Sqoot's apology does this really badly: they explain exactly what they did, how it happened (we thought it was funny, because we're aware that these things are mostly male), and then contrast the "others" (who lack their sense of humor) who were offended. Explaining how you messed up does not mean defending yourself, and defending yourself in an apology must be handled delicately, or you look like an ass.

#4 Fix it, if you can. "We're refunding your order immediately and giving you a coupon", "I shall be entering rehab tomorrow morning", "We're donating $$ to x charity".

#5 Reconnect. When you screw up, people are paying very close attention to you, and it's an opportunity to show that you're a stand-up company/organization/person. You stand to win greater loyalty and affection by handling the problem effectively. The people who are complaining (assuming they are correct, of course) are helping you by showing you where you're wrong, or at least showing you a different perspective. Sqoot "signing" their apology is actually good, in this case, because it indicates the founders (I presume that's who they are) are taking responsibility. It's too bad they flubbed the middle bit.

Sunday, March 04, 2012

A spot of mansplaining

This is a bit of rambling, responding to Bethany Nowviskie's terrific "Don't circle the wagons", itself a response to Miriam Posner's "Some things to think about before you exhort everyone to code".

I'm a middle-aged, white, male programmer, so that's where I'm coming from. I can't help any of that, but doubtless it colors my perspective.

First, I have to say that the idea of coding being associated with prestige (as it seems now to be in DH) is rather foreign to my experience, but the rise of the brogrammer is probably a sign that in general it's not such a marginal activity anymore. These guys would probably have gone and gotten MBAs instead in years past.

DH is slightly uncomfortable territory for programmers, as I've written in the past, at least it is for people like me who mostly program rather than academics who write code to accomplish their ends. I speak in generalities, and there are good local exceptions, but we don't get adequate (or often any) professional development funding, we don't get research time, we don't get credit for academic stuff we may do (like writing papers, presenting at conferences, etc.), we don't get to lead projects, and our jobs are very often contingent on funding. All this in exchange for about a 30% pay cut over what we could be earning in industry. There are compensations of course: incredibly interesting work, really great people to work with, and often more flexible working conditions. That's worth putting up with a lot. I have a wife and young kids, and I'm rather fond of them, so being able to work from home and having decent working hours and vacation time is a major bonus for me.

None of that in any way accounts for the gender imbalance that Miriam is highlighting, though it does perhaps work as a general disincentive (in academia) to learn to code too well. I'd also say that there is nothing that can make you feel abysmally stupid quite like the discipline of programming. Errors as small as a single character (added or omitted) can make a program fail in any number of weird ways. I am frequently faced with problems that I don't know ahead of time I'll be able to solve. Ostensibly hard problems may be dead simple to fix, while simple tasks may end up taking days. It keeps you humble. But I would say that if you're the lone woman sitting in a class/lecture/lab, and you're feeling lost, you're not the only one, and it has nothing at all to do with your gender.

As Bethany cautions, please, please don't circle the wagons. It's my contention that most of the offensive things about programmer culture are neither intentional nor deeply ingrained but are actually artifacts of the gender/race imbalance.

[As an aside, I was interested in Miriam's remarks about Codecademy. I started working through it with my daughter a couple of weeks ago, and she was finding it incredibly frustrating. It does not fit her learning style at all. She, like me, needs to know why she has to type this thing. She finds being told to just type var myName="Grace" without any context stupid. In the end we gave up and started learning how to build web pages, and I'll reintroduce JavaScript in that context.]

Programmer culture is exclusionary though. Undergraduate CS programs have "weed out" courses. I've actually taken a couple of these, and the first one did weed me out: it managed to be hard and extremely boring at the same time. I only came back to it years later. This gets at an aspect of programmer culture though, a sort of "are you good enough?" ethic. It's not without foundation (a lot of people who self-identify as programmers can't program), but it also means that when you start to work with a new group, there's often a kind of ritual pissing contest where you have to prove your worth before you're treated as an equal. This kind of thing is irritating enough on its own, and it's easy to imagine it taking on sexist or racist overtones.

Programming also tends to squeeze out older folks. Actual age discrimination does happen, but it's also because staying current means almost totally rebooting your skillset every few years. The Pragmatic Programmer book recommends learning a new language every year, and this is crucial advice (my own record is more like one every 18 months or so, but that's been enough so far). If you let yourself get comfortable and coast, or go into management, your skills are going to be close to worthless in about 5 years. And, while your experience definitely gives you an edge, you're not guaranteed to have the best solution to any given problem. The 22-year-old who read something on Hacker News last night might have found an answer that totally blows your received wisdom out of the water.

[As another aside, the speed at which skills go stale means that any organization that doesn't invest in professional development for their programmers is being seriously stupid. Or they expect their devs to leave and be replaced every few years.]

The upshot is that the population of programmers not only skews male, it skews young. Put a bunch of young men together, particularly in small groups that are under a lot of pressure (in a startup, for example), and you get the sorts of tribal behaviors that make women and anyone over 35 uncomfortable. There's not just misogyny, there's hostility towards people who might want to have a life outside of work (e.g. people who have spouses and kids and like spending time with them). And this is both a cause of sexual/racial/age imbalance and a result. It's a self-reinforcing cycle.

But there isn't really one monolithic "coder culture", there are lots of them. Every company and institution has its own culture. Teams have cultures. Basically, any grouping of human beings is going to develop its own set of values and ways of doing things. It's what people do naturally. The leaders of these groups have a lot to do with how those cultures develop, but they aren't necessarily in any position to remedy imbalances.

Once you're in a position to hire people, you realize that hiring good developers is hard. In any pool of applicants, you're going to have a bunch of totally unsuitable people, a few maybes, and if you're lucky, one or two gems (or people you think might become a gem with a little polishing). Are any of these going to be women? Not unless you're really lucky, because the weeding out has already done its work. So once you're in a position to decide who gets hired, it's too late to redress any imbalance. The imbalance is not because leaders don't want to hire women, it's already there.

The only answer I can see is to get a lot more women into programming. If the CS curricula can't do it, maybe new modes like DH can. From what I've seen, the gender balance in DH, while still not great, is a lot less ruinous than in programming in general, and a CS degree is far from the only road into a programming career (I don't have one). I think the cycle can be broken. I don't think there's a lot of deeply ingrained misogyny or racism in coding culture. Rather, it's a boys' club because it happens to be mostly boys. If there were more women it would adjust. And I don't think that (mostly) it would put up much resistance to an influx of women. So circling the wagons is the exact opposite of what needs to be done.