Continuing the legacy of The New York Times Index, which stretches back nearly to the founding of the newspaper, The New York Times and The New York Times Company Research & Development Lab have adopted Linked Open Data to maintain and share the newspaper’s extensive holdings. The New York Times’ suite of Linked Open Data datasets, tools and APIs is based in large part on the newspaper’s 150-year-old controlled vocabulary, which was released as 10,000 SKOS subject headings in January 2010.
The New York Times publicizes its projects through its blog (Open: All the News That’s Fit to printf()) and through social media. In addition to creating prototype tools such as Who Went Where, The New York Times promotes the use of its APIs and of the source code of its tools. Open has been a regularly updated blog since 2007, when The New York Times Company began its foray into the use and promotion of open source software.
The roots of The New York Times’ dataset lie in The New York Times Index, which was published quarterly beginning in 1913 and continues to be published today, although less frequently. These red-covered volumes contain a cross-referenced index of all of the names, articles and items that appear in the newspaper. Along with creating an authoritative controlled vocabulary, The New York Times Index also helped to establish The New York Times as a trusted research resource for students, scholars and librarians throughout the United States. As The New York Times continues to promote and develop its Linked Open Data assets, it can continue its legacy as an information innovator.
The New York Times began publishing its vocabulary as Linked Open Data in 2009, and by 2010 the vocabulary had grown to include 10,000 subject headings. As of March 2013, the dataset includes the names of 4,978 people, 1,489 organizations, 1,910 locations and 498 descriptors, for a reported total of 10,467 tags. These tags are available as RDF documents and also as HTML. Individual data records can be browsed alphabetically, downloaded in packages of SKOS files or queried using the 15 APIs available through The New York Times.
As of spring 2013, The New York Times has released 15 APIs. These range from a Movie Reviews API, which searches movie reviews by keyword and links to Critics’ Picks, to the TimesTags API, which matches queries to The New York Times’ controlled vocabulary. The documentation for the suite of APIs is hosted in the Developer section of The New York Times website, which includes an at-a-glance view of each API, suggested uses for it, and a forum for users and developers.
All of The New York Times’ APIs can return responses in JSON, and a smaller subset can also return XML or serialized PHP. Every one of the APIs uses the HTTP GET method.
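Because every request is a plain GET, a call to one of these APIs amounts to encoding parameters into a URL query string and decoding the JSON body that comes back. The following is a minimal sketch of that pattern, not the documented interface of any particular Times API: the base URL, the parameter names (`query`, `api-key`) and the response shape are placeholders assumed for illustration.

```python
import json
from typing import Any, Mapping
from urllib.parse import urlencode

# Hypothetical endpoint, used only to illustrate the request shape.
EXAMPLE_BASE_URL = "https://api.example.org/articles.json"


def build_get_url(base_url: str, params: Mapping[str, str]) -> str:
    """Return a GET request URL with the parameters encoded in the query string."""
    return f"{base_url}?{urlencode(params)}"


def decode_json_body(body: str) -> Any:
    """Decode a JSON response body into Python objects."""
    return json.loads(body)


# Parameter names here are assumptions, not taken from the API documentation.
example_url = build_get_url(
    EXAMPLE_BASE_URL,
    {"query": "linked open data", "api-key": "YOUR_KEY"},
)
print(example_url)

# A made-up response body standing in for what a server might return.
example_body = '{"results": [{"title": "Example headline"}]}'
print(decode_json_body(example_body)["results"][0]["title"])
```

The sketch deliberately stops short of sending the request, so it runs without a network connection or an API key.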
Who Went Where Tool
The New York Times, in partnership with its R&D Lab, has developed a prototype tool to encourage developers to use its dataset. The tool, “Who Went Where,” is a search engine that enables users to search for recent Times coverage of the alumni of a university or college. In addition to an introductory blog post that explains, step by step, how to create your own linked data application with New York Times Linked Open Data, the developers have also made the application’s source code available to the public.
“Who Went Where” is a jQuery application that queries DBpedia’s SPARQL endpoint. The high-level application control, as described on The New York Times’ Open blog, is as follows:
- The application starts by initializing an auto-complete field with the names of all the colleges and universities in DBpedia.
- When the user selects the name of an institution from the auto-complete field, the application queries DBpedia for the NYT identifiers of all the alumni of that institution. These identifiers are then used to query the New York Times Article Search API for the ten most recent articles about each alumnus. Then we use a little jQuery magic to display and format these articles.
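The lookup step described above can be sketched as follows. This is an illustrative Python outline rather than the application’s actual jQuery source. The SPARQL query is an assumption about how alumni and their NYT identifiers might be linked in DBpedia (via `dbo:almaMater` and `owl:sameAs`), the `format` request parameter is assumed to be accepted by the endpoint, and the institution URI and sample result are made up. The result parsing follows the standard SPARQL JSON results layout.

```python
import json
from typing import List
from urllib.parse import urlencode

DBPEDIA_SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def alumni_query(institution_uri: str) -> str:
    """Build a SPARQL query for alumni of an institution and their NYT identifiers.

    The property choices (dbo:almaMater, owl:sameAs) are assumptions made for
    this sketch, not a copy of the query used by "Who Went Where".
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?person ?nyt WHERE {{
  ?person dbo:almaMater <{institution_uri}> ;
          owl:sameAs ?nyt .
  FILTER(STRSTARTS(STR(?nyt), "http://data.nytimes.com/"))
}}
""".strip()


def sparql_request_url(query: str) -> str:
    """Encode the query as a GET request against the SPARQL endpoint."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_SPARQL_ENDPOINT}?{urlencode(params)}"


def extract_nyt_ids(results_json: str) -> List[str]:
    """Pull the ?nyt values out of a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [row["nyt"]["value"] for row in data["results"]["bindings"]]


# Hypothetical institution URI and a made-up result document for illustration.
query = alumni_query("http://dbpedia.org/resource/Example_University")
print(sparql_request_url(query))

sample_results = json.dumps({
    "head": {"vars": ["person", "nyt"]},
    "results": {"bindings": [{
        "person": {"type": "uri", "value": "http://dbpedia.org/resource/Example_Person"},
        "nyt": {"type": "uri", "value": "http://data.nytimes.com/EXAMPLE_ID"},
    }]},
})
print(extract_nyt_ids(sample_results))
```

Each identifier returned this way would then be passed to the Article Search API, one request per alumnus, to fetch that person’s most recent articles.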
“Who Went Where” not only showcases the value of The New York Times and its APIs, it is also an elegant and relatively straightforward application of these resources. The power of this tool is amplified by the documentation surrounding it, including the source code, which is freely available online.
Analysis of The New York Times Dataset, API and Tools
The New York Times has paired an extensive dataset with approachable documentation to create a powerful tool for promoting both the Linked Open Data movement and the newspaper’s resources. The “Who Went Where” search tool, built on one of The New York Times’ APIs, is not necessarily the most exciting application of Linked Open Data, but it does present an attainable and elegant framework for the use of APIs.
For more information visit:
- http://ebiquity.umbc.edu/blogger/2009/10/30/new-york-times-publishes-linked-open-data/