SHERPA   
. . . opening access to research  
RoMEO

SHERPA/RoMEO API Wish List

Version: 5th October 2012

API Queries | API Results | API XML Schema | RoMEO Data | RoMEO General

The wishes in this list were originally collated from a survey of SHERPA/RoMEO Application Programmers' Interface (API) users, notes from breakout sessions at the RoMEO API Workshop, Edinburgh, 1st September 2010, and subsequent feedback received from users. Remarks from the RoMEO developers on the wishes are given in italics. Please send any further ideas and feedback to Peter Millington (romeo@jisc.ac.uk).

The struck-through wishes (marked 'Done') have been fulfilled since the list was compiled.

API Queries

General

  1. Use true REST URLs for the API that can be used as URIs in linked data - e.g. http://www.sherpa.ac.uk/romeo/api/issn/0971-7544
    RoMEO endorses this proposal. Can arguments be eliminated totally, or would this get too complex?
    - Done
  2. Add an optional query argument for Publication date (year [& month]), and adjust the returned RoMEO colour and/or data depending on whether or not this is past any embargo date
    This perhaps may only be relevant if we retain RoMEO colours. Otherwise it is a matter for the user application.
  3. Add an optional query argument for the target archive type (institutional, subject, personal), and adjust the returned RoMEO colour and/or data to reflect specific permissions
    This perhaps may only be relevant if we retain RoMEO colours. Otherwise it is a matter for the user application.
  4. Query publisher by RoMEO update date
  5. Provide query arguments that allow users to specify the data they get back.
    It is already possible to control the amount of Mandate Compliance data that is returned. This principle could be extended to cover versions, Paid OA, etc.
  6. Query just the version the user wishes to archive (i.e. Publisher's version/PDF, Accepted, or Submitted version)
    In truth, this wish mainly concerns the amount of data being returned, but could also be a version/permission type combination query.
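
As an illustration of the REST-style wish above, the following Python sketch builds a URI of the form shown in the example. Only the /issn/ path pattern comes from this list; the function names are hypothetical, and the checksum test is the standard ISSN check-digit rule rather than anything specific to RoMEO.

```python
import re

# Base path taken from the example URI in the wish list; whether the live
# service uses exactly this layout is an assumption.
API_BASE = "http://www.sherpa.ac.uk/romeo/api"

ISSN_PATTERN = re.compile(r"^\d{4}-\d{3}[\dX]$")


def is_valid_issn(issn: str) -> bool:
    """Check the NNNN-NNNC format and the standard ISSN check digit."""
    issn = issn.upper()
    if not ISSN_PATTERN.match(issn):
        return False
    digits = issn.replace("-", "")
    # Weights 8 down to 2 apply to the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


def issn_uri(issn: str) -> str:
    """Return a REST-style journal URI (hypothetical helper)."""
    if not is_valid_issn(issn):
        raise ValueError(f"not a valid ISSN: {issn!r}")
    return f"{API_BASE}/issn/{issn.upper()}"


print(issn_uri("0971-7544"))
```

Validating the ISSN before building the URI keeps malformed identifiers out of linked data, which matters if the URI is to be reused as a persistent identifier.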

Journals

  1. Provide 'bulk query' options - e.g. querying a list of ISSNs or titles in one batch.
    Such queries might need a special output schema.
  2. Allow query by alternative ISSNs (e.g. e-ISSN)
    ESSN queries are already possible, although the ESSNs are not yet displayed in the returned data.
    - Done
  3. Allow query by name variants (e.g. abbreviations)
    Abbreviations are already handled, thanks to Entrez. Other alternative titles will be possible with the new RoMEO Journals database
    - Done
  4. Provide fuzzy searching to accommodate spelling variations in queries and international journal titles
    Major technical challenge, unless there is an off-the-shelf package that we can use. However, this wish might not be desirable, as spelling variations are often significant for distinguishing otherwise identical titles in different languages.
  5. Query by RoMEO Journals database persistent ID
  6. Find all the journals of a specified publisher. (Using any valid publisher API query that retrieves a single publisher)
    Would be possible with the RoMEO Journals and DOAJ databases only. Should this include or exclude any imprints that have their own policies?
  7. Provide a list of OA titles produced by subscription-based publishers
    Out of scope? Already being done by DOAJ? What about titles with mixed OA and subscription-only content?
  8. Query journals by subject
    Out of scope - This is the domain of user applications
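
The fuzzy-searching wish above, and the developers' caution about it, can be sketched with Python's standard difflib module. This is a minimal illustration and not a proposal for how RoMEO would implement it; the titles are invented for the example and the suggest_titles helper is hypothetical.

```python
import difflib

# Illustrative titles only; they are not taken from the RoMEO database.
TITLES = [
    "Journal of Chemistry",
    "Journal de Chimie",
    "Revista de Quimica",
]


def suggest_titles(query: str, titles=TITLES, cutoff: float = 0.8):
    """Return known titles that resemble the query, best match first."""
    # Compare in lower case, but return the titles in their original form.
    lookup = {t.lower(): t for t in titles}
    hits = difflib.get_close_matches(
        query.lower(), list(lookup), n=3, cutoff=cutoff
    )
    return [lookup[h] for h in hits]


print(suggest_titles("Jounal of Chemistry"))
```

The cutoff parameter is where the developers' concern shows up: a low value tolerates more misspellings but also starts returning similarly spelled titles in other languages, so a user application would probably want to present such matches as suggestions rather than as definitive hits.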

Publishers

  1. Drop queries by RoMEO colour.
    Instead, let people query by version and/or required permission
  2. Allow query by version and/or required permission
    Replacing query by RoMEO colour
  3. Query publishers by country - 2-letter ISO code
  4. Support SPARQL queries over linked data sets, via a SPARQL endpoint
    This is new technical territory for RoMEO, and could require major development effort.

API Results

Alternative Output Formats

  1. Provide a JSON output option
  2. Provide a JSONP output option
  3. Provide an RDF output option
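
For readers unfamiliar with JSONP, the sketch below shows in Python what such an output option amounts to on the server side: the JSON payload is wrapped in a call to a function named by the client. The payload fields and the to_jsonp helper are assumptions made for illustration, not part of the existing API.

```python
import json
import re

# Conservative pattern for a JavaScript identifier. Restricting the callback
# name this way avoids script injection through the query argument.
CALLBACK_PATTERN = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def to_jsonp(payload: dict, callback: str) -> str:
    """Wrap a JSON-serialisable payload in a JSONP callback call."""
    if not CALLBACK_PATTERN.match(callback):
        raise ValueError(f"unsafe callback name: {callback!r}")
    return f"{callback}({json.dumps(payload)});"


# Hypothetical payload; the field names are invented for this example.
print(to_jsonp({"numhits": 1, "issn": "0971-7544"}, "handleRomeo"))
```

A client page would load the resulting script from the API and receive the data through its own handleRomeo function, which is what allows cross-domain use from a browser.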

API XML Schema

General

  1. Use UTF-8 encoding for output instead of ISO-8859-1
    Already planned. Also need to consider whether or not to convert HTML entities for accented characters in the data.
  2. Use the common attribute xml:lang wherever language needs to be specified.
  3. Include URI attributes in the XML schema for each journal and publisher, that can be used as links in user applications.
  4. Use W3C-standard URIs
  5. Use standard element and attribute names from existing schemas (e.g. Dublin Core) wherever appropriate.
  6. Remove presentation layer HTML from XML data
    This is already planned for Publishers' Policy links. Is OK for Paid OA. Some restrictions and conditions will require attention.
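
To show what the xml:lang wish would mean for a user application, here is a small Python sketch that reads language-tagged titles with the standard library. The element names in the sample are hypothetical and do not reflect the actual RoMEO schema; only the use of the xml:lang attribute is the point.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment; element names are invented for this example.
SAMPLE = """<journal>
  <jtitle xml:lang="en">Journal of Chemistry</jtitle>
  <jtitle xml:lang="fr">Journal de Chimie</jtitle>
</journal>"""


def titles_by_language(xml_text: str) -> dict:
    """Map each language code to the title tagged with it."""
    root = ET.fromstring(xml_text)
    return {el.get(XML_LANG): el.text for el in root.iter("jtitle")}


print(titles_by_language(SAMPLE))
```

Because xml:lang is a common attribute defined by the XML specification, generic XML tools already understand it, which is the main argument for preferring it over a bespoke language attribute.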

Header

  1. Split <numhits> to give separate figures for the number of journal and publisher records found
  2. Provide a 'truncated' attribute to <numhits> or its successors to indicate when the results have been limited to 50 journals.
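
One possible shape for the two header wishes above is sketched below in Python. The attribute names journals, publishers and truncated are assumptions chosen for illustration; the list does not specify how the split figures would be expressed.

```python
import xml.etree.ElementTree as ET

# Hypothetical header; the attribute layout is an assumption, not the schema.
SAMPLE_HEADER = (
    '<header><numhits journals="50" publishers="1" truncated="true" /></header>'
)


def read_hits(xml_text: str) -> dict:
    """Extract separate hit counts and the truncation flag from a header."""
    node = ET.fromstring(xml_text).find("numhits")
    if node is None:
        raise ValueError("header has no <numhits> element")
    return {
        "journals": int(node.get("journals", "0")),
        "publishers": int(node.get("publishers", "0")),
        # Treat a missing attribute as "not truncated".
        "truncated": node.get("truncated", "false") == "true",
    }


print(read_hits(SAMPLE_HEADER))
```

With an explicit flag, a client no longer has to infer truncation from the count happening to equal 50.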

Journals Data

  1. For multilingual journals, place similar fields together in the XML (e.g. all alternative full titles), rather than grouping diverse fields by language.
    There may be other alternative structures that could be considered.
  2. Include elements or attributes in the XML schema for previous or subsequent journal titles, along with 'from' and 'to' dates.
    Alternatively, provide URIs for previous and subsequent titles, that user applications can then follow up for more data.
  3. Provide a new URI argument in the XML for a journal that can be used to launch a prepopulated RoMEO 'Suggestion' form to feed back changes of publisher, supply missing data, or send queries to the administrators.
    Human operation of the form, not robot, to limit abuse.

Publisher Data

  1. Provide separate data for Publisher's version/PDF - as in interactive RoMEO
  2. Provide machine-readable codes for terms and definitions in attributes.
    It would be desirable to use ONIX Licensing Terms as much as possible (see ONIX-PL Dictionary Issue 2)
  3. Provide embargo data in machine-readable attributes, so that the archivability of older articles can be determined.
    e.g. <restriction code="publishedembargo" value="6" units="months" />
  4. Include an attribute that indicates when a non-RoMEO-standard statement has been quoted verbatim from a publisher's policy documents.
    e.g. <condition verbatim="true">. If it is a statement that the publisher specifies must accompany a deposited eprint, this will be indicated by a 'type' code, e.g. <condition verbatim="true" statementtype="required">
  5. Use URI attributes to reference definitions of restrictions, conditions and similar data in the XML schema.
  6. Add an attribute to each policy statement with a reference or URI for the relevant publisher's document.
  7. Provide well-formed links to publishers' online copyright documents.
    Such links are already provided, although links to multiple documents are not yet separated, which is an issue for machine handling.
    - Done
  8. Provide link URLs to an online form for requesting information from the RoMEO paper archives for publishers lacking online policy documents.
  9. Provide a new URI argument in the XML for a publisher that can be used to launch a prepopulated RoMEO 'Suggestion' form to feed back policy changes, or send queries to the administrators.
    Human operation of the form, not robot, to limit abuse.
  10. Provide the SWORD API URL for repositories recommended or required in funders' OA mandates.
    This involves JULIET, which needs to store the OpenDOAR IDs for such repositories. OpenDOAR in turn needs a field for SWORD API URLs, and this data would need gathering.
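
The machine-readable embargo wish (item 3 above) would let a user application decide archivability for itself. The Python sketch below parses the example restriction element given in that wish and checks a publication date against it. The function names are hypothetical, and support for a "years" unit is an assumption, since the list only shows "months".

```python
import calendar
import xml.etree.ElementTree as ET
from datetime import date

# Example element quoted from the wish list.
SAMPLE = '<restriction code="publishedembargo" value="6" units="months" />'


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_end(restriction_xml: str, published: date) -> date:
    """Return the date on which the embargo expires."""
    node = ET.fromstring(restriction_xml)
    if node.get("code") != "publishedembargo":
        raise ValueError("not an embargo restriction")
    value = int(node.get("value"))
    units = node.get("units")
    if units == "months":
        return add_months(published, value)
    if units == "years":  # assumed unit, not shown in the wish list
        return add_months(published, 12 * value)
    raise ValueError(f"unsupported units: {units!r}")


def is_archivable(restriction_xml: str, published: date, today: date) -> bool:
    """True once the embargo period after publication has elapsed."""
    return today >= embargo_end(restriction_xml, published)


print(is_archivable(SAMPLE, date(2012, 1, 15), date(2012, 10, 5)))
```

This is the client-side counterpart to the publication-date query argument wished for under API Queries: with the embargo exposed as attributes, the date comparison can be done by the user application instead of by the API.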

RoMEO Data

Clarity & Authority

  1. Replace the confusing terms 'Pre-print' and 'Post-print'.
    We intend to replace these with terms recommended by the UK's Versions project - 'Author's submitted version' and 'Author's accepted version' respectively.
  2. Indicate the authority of each policy statement.
    Our plan is to link each statement to the relevant publisher's online document, or reference to RoMEO's paper records.
  3. Improve the guidance returned for 'Policies not in RoMEO' to include suggestions for other possible sources and databases.
  4. Use a 'traffic lights' scheme to indicate the level of difficulty for archiving, as an alternative to RoMEO colours and/or the tick-cross-question mark scheme for specific document versions

Data Model

  1. Where necessary provide policies at the journal level, not just at the publisher or imprint/client level
    Current plans do not yet accommodate journals with individual exceptional policies
    - Done
  2. Include prices of Paid Open Access options
    Not in the current MySQL schema. Data is available in a spreadsheet and at http://www.sherpa.ac.uk/romeo/PaidOA.php.
  3. Archive superseded RoMEO records, and provide a facility for viewing earlier versions.
  4. Retain publishers' old names and their persistent RoMEO IDs if they rebrand themselves.
    More of an admin workflow/policy matter. The relationship between the old and new records needs storing and handling.
  5. Add fields for dates when a journal-publisher relationship is established and/or terminates
    Already planned for the new MySQL schema, but as yet no input mechanism. (May be indicated in 'Notes')
  6. Record and display relationships between publishers and their imprints and clients.
    To some degree already handled by the new RoMEO Journals database. Already anticipated for the new publishers schema
    - Done
  7. Add fields for dates when a publisher-publisher relationship is established and/or terminates
  8. Add a 'Remarks' field for publisher records - e.g. to flag imminent changes of ownership.
    Already planned for the new MySQL schema.

RoMEO General

Admin

  1. Provide a mechanism (RSS or email?) for alerting partners about new phrases that require translation into their language.
  2. Provide an online form/table for partners to supply translations of new phrases.
    Such a tool was launched in November 2010.
    - Done

Coverage

  1. Continue adding more content to RoMEO
    Ongoing
  2. Add more data for non-English language publishers and their journals.
    Such work is ongoing in collaboration with overseas RoMEO partners.

Involving Publishers

  1. Provide an online form for publishers to check and suggest changes to their RoMEO entries
  2. Provide a policy generation tool for publishers, adapted from the administrators' input form.
    A prototype was demonstrated in summer 2011.

New Tools

  1. Provide an API for supplying RoMEO statistics - possibly using PSH (http://www.opendoar.org/demos/psh_prototype)

RSS Feeds

  1. Provide an RSS feed for the new RoMEO Journals database
    This could be in RDF format
  2. Provide an RSS feed in RDF format
    For current awareness of data changes. A non-RDF feed already exists for publisher data.

System

  1. Improve the API's speed
    This may be a systems and/or network issue for RoMEO generally, as well as a matter of optimising the efficiency of the API software.
  2. Establish mirror servers for RoMEO to improve reliability

© 2010, University of Nottingham