Big Data and Natural Language Processing: Remember Max Headroom? Think the financial report you read on Forbes was written by a human? Think again. Narrative Science is already disrupting people who work in media by automating the process of describing big data. More or less anything on a chart or Excel sheet can now be narrated in natural language. Now the CIA is interested: while visualizations have gotten plenty of attention as options for getting good stuff out of data, In-Q-Tel’s investment in Narrative Science suggests information in paragraphs could work too.
[Reproduced from GigaOM]
The CIA takes an interest in Narrative Science’s quick summaries of big data
[Jordan Novet, 06.06.13]
Narrative Science has attracted media attention because it has the potential of disrupting people who work in the media. The company’s technology can take heaps of data about, say, a sports game, a company’s quarterly earnings or a person’s life and surface the most important stuff.
Now it appears Narrative Science’s capability would be of use to the U.S. intelligence community, with the company announcing Wednesday a “strategic partnership and technology development agreement” with In-Q-Tel, the investment firm with roots in the CIA.
The partnership will help Narrative Science whip up a version of its Quill artificial-intelligence tool for government users. And it looks like the technology won’t be sitting on a shelf. In-Q-Tel invested in the company in order to help the intelligence community within 36 months, according to its website. In other words, the CIA and others might see an immediate need for this technology.
It’s not the first time investors have seen value in Narrative Science. Battery Ventures, SV Angel and others have put up more than $10 million in funding and debt rounds, according to a spokeswoman.
It’s hard to dispute the federal government’s claims to having big data — specifically the speed at which it comes in and the sheer size of it all. This point came through crystal-clear at GigaOM’s Structure:Data conference in March, when the CIA’s chief technology officer, Ira “Gus” Hunt, talked up the importance of being able to spot the important stuff amid vast and growing supplies of data, with more inputs coming online all the time. And at least in the CIA, that ability might not be at the level it should be:
“The goal we have is I have to be able to get the power of Big Data and the analytics into the hands of the average user. The only way that the real value is going to be realized by us, or even in the commercial sector and by individual companies, is when everybody has access to a tool and the data in order to get their jobs done and they don’t have to worry about it. Tomorrow what we want are really elegant, easy-to-use tools, the machines to do the heavy-lifting, and we want to get out of simple things like ‘search’: search is so broken in this peta-scale world that we’re talking about.”
The CIA isn’t the only organization facing challenges like that. The Defense Advanced Research Projects Agency (DARPA) is, too. Speaking at the Economist’s Information Forum event in San Francisco on Tuesday, the agency’s information innovation office director, Daniel Kaufman, said he wants computers to start presenting hypotheses to him. “What if the computer could ask a big data question?” he said. “… Tell me something interesting, and how do you know that that was interesting?”
Some software already exists for getting clues on data patterns. Ayasdi visualizes connections among otherwise disparate data points, and BeyondCore lets users dump in data and see correlations and the factors driving results, for instance.
While software is the key here, hardware is the enabler, and we’ll be talking about this connection during conversations with Jay Parikh, Facebook’s vice president of infrastructure engineering, and other luminaries at GigaOM’s Structure conference in San Francisco in a couple of weeks.