NASA: Making it as difficult as possible to get the data you need

It's becoming clear to me that many (perhaps not all) NASA web sites and web services are set up in such a way that it's damn near impossible to get the information you need out of them without chanting the correct incantation and sacrificing a chicken. It's a bit frustrating, and a bit like the web c. 1999.

Case in point: I need to retrieve data from a JPL data service and a Goddard application to feed data into a little widget I'm working on for the Explore Mars site (old site still up).

In an ideal world, I'd query those services and they'd return something in spiffy XML or JSON format, which I could parse with a script or Flex, and I'd be done.

In a less ideal world, the web pages would be formatted in such a way that I could pull the data I was looking for out of the HTML source code using some clear delimiters.

Unfortunately, the reality is somewhat less convenient:

The JPL Horizons service has three ways to access data -

  1. A web service that returns data in a big <PRE> tag box, which for you non-geeks is basically a big wad of text. Computers don't like to find bits within big wads of text, at least not without substantial extra effort. Plus, the input parameters to the script that generates the results are entirely obfuscated.
  2. An interactive Telnet service that doesn't seem to have a way to pass all the parameters for the data you're seeking at once, and STILL returns a big wad of text at the end.
  3. A batch email service that does return some of the data we're looking for, but again, wrapped in a crapload of extra text.
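To give a feel for the "substantial extra effort" involved, here's a rough Python sketch of scraping numbers out of a <PRE> block. Everything specific in it is an assumption for illustration: the sample page, the BEGIN_DATA / END_DATA marker lines, and the column layout are all made up, and the real services will need their own markers and parsing rules.

```python
import re
from html import unescape


def extract_pre_text(html):
    """Return the text inside the first <pre>...</pre> block, or '' if none."""
    match = re.search(r"<pre[^>]*>(.*?)</pre>", html, flags=re.IGNORECASE | re.DOTALL)
    return unescape(match.group(1)) if match else ""


def rows_between(text, start_marker, end_marker):
    """Split the comma-separated lines found between two marker lines."""
    rows = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == start_marker:
            inside = True
        elif stripped == end_marker:
            inside = False
        elif inside and stripped:
            rows.append([field.strip() for field in stripped.split(",")])
    return rows


# Made-up sample page: the markers and columns are hypothetical, not a real response.
SAMPLE_HTML = """<html><body><PRE>
Some header text nobody asked for
BEGIN_DATA
2009-Jan-01 00:00, 1.5, 2.5
2009-Jan-02 00:00, 1.6, 2.4
END_DATA
Some footer text nobody asked for
</PRE></body></html>"""

rows = rows_between(extract_pre_text(SAMPLE_HTML), "BEGIN_DATA", "END_DATA")
for row in rows:
    print(row)
```

It works, but it's brittle: if the surrounding text or the markers ever change, the scraper breaks without telling you.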

The Goddard page is just as inconvenient - no friendly text tools there at all, just a goofy Java applet. Sigh. Guess I can run all the equations to calculate the data myself.

My hope is that I can make my own little corner of the NASA web-o-sphere somewhat more friendly to those who wish to get at the data without spending their days figuring out ways to scrape screens and parse emails.

Disclaimer: I understand why some of these tools may have been built this way. But c'mon. I know it really is rocket science, but is it that hard to push this data out into an XML file, or at least CSV, if you've already gone through all the trouble of making the calculations? Or just format your web pages so that the data is wrapped in a reasonably parseable tag structure?
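For the record, here's roughly how little code the "at least CSV" part takes once the numbers exist. This is only a sketch: the field names and values are made up, and I'm not claiming this is how any NASA tool stores its results.

```python
import csv
import io
import json

# Hypothetical, made-up results standing in for already-calculated data.
FIELDS = ["date", "distance_au", "light_minutes"]
ROWS = [
    {"date": "2009-01-01", "distance_au": 1.5, "light_minutes": 12.5},
    {"date": "2009-01-02", "distance_au": 1.6, "light_minutes": 13.3},
]


def to_csv(rows, fields):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the same rows to JSON text."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(ROWS, FIELDS)
json_text = to_json(ROWS)
print(csv_text)
print(json_text)
```

Either output can be read back with a stock parser, no chicken required.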

Comments

Heh... I warned ya. I remember going out for a couple beers with some JPL software guys back in 2004 who wanted to use a relational database for some project, and they got shot down because senior management didn't feel that relational database technology (a tech deployed in many places since the early 1980s, for the non-geeks playing along) was a sufficiently proven technology.

You should count your lucky stars that you can even get access to the data using the TCP/IP protocol or anything based on it.

Umm....I had some client send me an M$ Publisher file. Had to download something called "PDF Creator" in order to get it in. In fact I'd say it was the hardest thing I've had to do since....oh fudge it. You win.

Dear NASA,
The above opinions of my husband in no way reflect our family's desire to be employed and insured.

Is this all some sort of slang for that new porn page you're putting together? If so, I'm I love the: In an ideal world, I'd query those services and they'd return something in spiffy XML or JSON format, which I could parse with a script or Flex, and I'd be done. turned out.

That's is scorching hot.

Excuse my typos. I only write for a living. Gawd.