NLG

Here’s a selection of opening lines from the discovery papers of a completely random sample of planets announced in 2010:

With the discovery of extrasolar planets during the past 15 years, it has now become evident that our solar system is not unique. Similar to our Sun, many stars are believed to be hosts to giant and/or terrestrial-class planets and smaller objects.

In recent years, extending the threshold for exoplanet detection to yet lower and lower masses has been a significant endeavor for exoplanetary science. As at 2010 October, 31 exoplanets have been published with minimum (i.e., m sin(i)) masses of less than 20 Earth masses.

Radial velocity (RV) searches for extrasolar planets are discovering less massive planets by taking advantage of improved instrumental precision, higher observational cadence, and diagnostics to identify spurious signals. These discoveries include planets with minimum masses (M sin i) as low as 1.9 Earth masses (Mayor et al. 2009) and systems of multiple low-mass planets (Lovis et al. 2006; Fischer et al. 2008; Vogt et al. 2010). To date, 15 planets with M sin(i) < 10 Earth masses and 18 planets with M sin(i)=10–30 Earth masses have been discovered by the RV technique (Wright et al. 2010, Exoplanet Orbit Database).

Ground-based transit surveys have been very successful at discovering short-period (P < 5 days) transiting extrasolar planets (TEPs) since 2006.

There has been a rapid increase in the number of transiting planets discovered each year due to dedicated ground- and space-based surveys: HAT (Bakos et al. 2002), TrES (Alonso et al. 2004), XO (McCullough et al. 2005), WASP (Pollacco et al. 2006), CoRoT (Baglin et al. 2006) and Kepler (Borucki et al. 2010). This trend looks set to continue, with the discovery of over 35 new planets published already this year (mid 2010), which represents more than a third of the total number of transiting planets known.

These soothing, robotic cadences are familiar to everyone who writes introductions and discussions for planet discovery papers. Those astronomers write prose with machine-like precision. Machine-like. Hmm…

Last year, after one of our “Wouldn’t it be cool if?” conversations, Stefano Meschiari decided to take up the daunting challenge of developing an NLG (natural language generation) software package that can analyze radial velocity data, “discover” any statistically significant planets contained therein, and then write a publication-quality paper that includes a human-readable introduction and analysis.

Stefano soon produced an amazing first-draft package, which he’s named “BAM” — short for Big Automatic Machine. Check out this screen-capture video of the systemic console hooked up to the BAM:

The Big Automatic Machine in action

There are certain advantages to having a computer write planet detection papers… BAM can go out on the Internet and scour the catalogs and the literature, which allows it to place new planets smoothly into the broader context. By looking at where new planets fall within the confines of all the known distributions, it can spot trends, peculiarities, and facets of interest.
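To make the idea concrete, here is a minimal sketch of how a program might notice that a planet sits in a thinly populated corner of the mass-period diagram and turn that into a sentence. This is not BAM's actual code: the catalog values are invented for illustration, and the helper names (`neighbours`, `describe`), the 0.3 dex search radius, and the "fewer than two neighbours" threshold are all assumptions.

```python
import math

# Hypothetical catalog of (period in days, minimum mass in Jupiter masses).
# These numbers are made up for illustration, not drawn from a real catalog.
CATALOG = [
    (3.1, 0.9), (3.5, 1.1), (4.2, 0.7), (2.8, 1.3), (3.9, 1.0),
    (420.0, 1.8), (610.0, 2.5), (890.0, 1.2), (1200.0, 3.0),
    (35.0, 0.3),
]


def neighbours(period, mass, catalog, radius=0.3):
    """Count catalog planets within `radius` dex of the given point,
    measured in the log(period)-log(mass) plane."""
    count = 0
    for p, m in catalog:
        distance = math.hypot(math.log10(p / period), math.log10(m / mass))
        if distance <= radius:
            count += 1
    return count


def describe(name, period, mass, catalog, sparse_below=2):
    """Return a one-sentence, template-based description of where the
    planet falls relative to the catalog (threshold is an assumption)."""
    n = neighbours(period, mass, catalog)
    if n < sparse_below:
        region = "a sparsely populated region"
    else:
        region = "a well-populated region"
    return (
        f"{name} (P = {period:g} d, M sin i = {mass:g} M_Jup) lies in "
        f"{region} of the mass-period diagram."
    )


if __name__ == "__main__":
    print(describe("Planet b", 40.0, 0.3, CATALOG))
    print(describe("Planet c", 3.3, 1.0, CATALOG))
```

A real system would of course pull a live catalog and use a proper density estimate, but the basic pattern of measuring something, comparing it with the known population, and filling in a sentence template is the same.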

As an example, for the planets discussed in Stefano’s latest lead-authored paper, BAM notices that several of them fall in a somewhat sparsely populated region of the mass-period diagram:

With a little coaxing and advice from its human minders, it now produces the following discussion:

All the planets presented in this paper lie well within the existing exoplanet parameter envelopes (Fig. 15). Several of them lie in the so-called “desert” in the mass and semi-major axis distribution of extrasolar planets (Ida & Lin 2004). Monte-Carlo population synthesis models for extrasolar giant planet formation tend to suggest that planets migrate relatively rapidly through the period range between 10 and 100 days, and, in addition, often grow quickly through the mass range centered on the Saturnian mass. In the context of the overall planetary census, these four new planets help to further elucidate the various statistical properties of exoplanets. In particular, the discovery of multiple-planet systems helps in further characterizing the number of stars hosting multiple planetary companions and any correlations emerging in the distribution of orbital elements as suggested by observational clues (e.g. Wright et al. 2009).

With extrasolar planets as the topic, art retains a certain precedence over craft, and for the foreseeable future, BAM will be stuck with a learner’s permit, only allowed to drive if there’s a licensed driver in the car. I can imagine more mercenary, lawyerly applications, however, where it will be able to really come into its own.

BAM, with its perfect command of LaTeX, its dry analytic mindset, and its cautiously factual discussions, writes prose that is pretty much the opposite of the writing that you’ll find in Jack Kerouac’s On the Road. From Wikipedia:


Kerouac completed the first version of the novel during a three-week extended session of spontaneous confessional prose. Kerouac wrote the final draft in 20 days, with Joan, his wife, supplying him bowls of pea soup and mugs of coffee to keep him going. Before beginning, Kerouac cut sheets of tracing paper into long strips, wide enough for a type-writer, and taped them together into a 120-foot (37 m) long roll he then fed into the machine. This allowed him to type continuously without the interruption of reloading pages.

In the mid-1950s, at the urging of Allen Ginsberg and William Burroughs, Kerouac compiled a list of “essentials” for writing the spontaneous prose that makes up On the Road and his other work. Taken as a set of instructions, they seem almost perfectly designed to defy machine implementation in an NLG program. Take, for example, the prescription for implementing proper structure:

STRUCTURE OF WORK
Modern bizarre structures (science fiction, etc.) arise from language being dead, “different” themes give illusion of “new” life. Follow roughly outlines in outfanning movement over subject, as river rock, so mindflow over jewel-center need (run your mind over it, once) arriving at pivot, where what was dim-formed “beginning” becomes sharp-necessitating “ending” and language shortens in race to wire of time-race of work, following laws of Deep Form, to conclusion, last words, last trickle-Night is The End.

One gets the feeling that the computers are still a decade or so away…
