TheySay PreCeive API Documentation

TheySay PreCeive API is a platform-agnostic service which enables developers to access and mix-and-match our powerful text analysis processors that cover sentiment analysis, speculation detection, part-of-speech tagging, dependency parsing, and others. If you're building an application that cannot do without serious, state-of-the-art text analytics but don't want to delve deep into natural language processing, then this is the API for you.

Getting started with PreCeive API is easy. Test-drive our live public API demo, explore its end points below, and contact us to receive a development key. Need help or want to give feedback? Contact us - we'd love to hear from you!

API Clients

To help you get started, Open Source API clients are currently available for Java, Scala, Python, Ruby, Node.js, and R. PHP, C#, and Excel clients will be released soon.

HTTP Methods

PreCeive API follows REST principles. The following HTTP request methods are supported:

  • POST (recommended) - Query fields are in the request body and expressed as JSON. Example: { "text":"Patchy rain, sleet or snow in parts...", "level":"sentence" }.
  • GET - Query fields are expressed as parameters in the URL and must be URL-encoded. Note that GET exposes only a limited subset of available query fields. Example: /v1/sentiment?text=how%20cool%20is%20that!&level=sentence.
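As a sketch of the GET form, the query string can be built with standard URL encoding (using Python's standard library; the endpoint path and fields are taken from the examples above):

```python
from urllib.parse import urlencode, quote

# Build the query string for GET /v1/sentiment.
# GET exposes only a limited subset of fields, so just "text" and "level" here.
params = {"text": "how cool is that!", "level": "sentence"}
url = "http://api.theysay.io/v1/sentiment?" + urlencode(params, quote_via=quote)
print(url)
# text=how%20cool%20is%20that%21&level=sentence appended to the endpoint
```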

HTTP Response Codes

  • 200 OK - The request was successful.
  • 201 Created - The request was successful and a resource was created.
  • 400 Bad Request - The request could not be interpreted correctly or some required parameters were missing.
  • 401 Unauthorized - Authentication failed - double-check your username and/or password.
  • 405 Method Not Allowed - The requested method is not supported. Only GET and POST are allowed.
  • 429 Too Many Requests - Quota or rate limit exceeded (see below).
  • 500 Internal Server Error - Something is broken. Please contact us and we'll investigate.

Quotas and Rate Limits

We enforce two request quotas: requests per day and requests per minute. Your quotas depend on your API subscription. By default, the following rates apply:

  • Maximum 500 requests per day, reset at midnight UTC.
  • Maximum 30 requests per minute.

Responses returned by the API contain information about your quota in the following response header fields:

  • X-RequestLimit-Limit - # of requests that you can send in a day. Example: 15000.
  • X-RequestLimit-Remaining - # of requests that you can send before you will exceed your daily request limit. Example: 12323.
  • X-RequestLimit-Reset - When your next daily quota will be reset (in UTC epoch milliseconds). Example: 1360281599708.
  • X-RateLimit-IntervalSecs - The length of your rate limit window. Example: 60.
  • X-RateLimit-Limit - # of requests that you can send within your rate limit window. Example: 30.
  • X-RateLimit-Remaining - # of requests that you can send before you will exceed your rate limit. Example: 25.
  • X-RateLimit-Reset - When your next rate limit window will be reset (in UTC epoch milliseconds). Example: 1360254866709.

You can also see your current rate limit status by calling /rate_limit. Example: http://api.theysay.io/rate_limit.
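A client can read these headers to decide whether to back off. The following is a minimal sketch, assuming the response headers are available as a plain dict (the helper name is illustrative, not part of the API):

```python
import time

def seconds_until_reset(headers, now_ms=None):
    """Given the X-RateLimit-* response headers, return how many seconds
    to wait before the rate limit window resets, or 0.0 if requests
    are still available."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > 0:
        return 0.0
    reset_ms = int(headers["X-RateLimit-Reset"])  # UTC epoch milliseconds
    if now_ms is None:
        now_ms = time.time() * 1000
    return max(0.0, (reset_ms - now_ms) / 1000.0)

# Using the documented sample value for X-RateLimit-Reset:
headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1360254866709"}
wait = seconds_until_reset(headers, now_ms=1360254860709)  # 6 seconds to go
```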

For more information about quotas, rate limits, and subscriptions, contact us.

Maximum Request Length

The maximum length of the text body in each request is 20000 characters.

JSONP Support

Use the callback request parameter to add a JSONP wrapper. The returned Content-Type will be application/javascript.

GZIP Compression

Add Accept-Encoding: gzip to your request headers if you want the API to deliver a gzipped stream.
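On the client side, the gzipped body must be decompressed before parsing. A small sketch of that step (the response body here is simulated locally):

```python
import gzip
import json

# Simulate a gzipped API response body, then decompress and parse it
# as a client would after sending "Accept-Encoding: gzip".
body = json.dumps({"sentiment": {"label": "POSITIVE"}}).encode("utf-8")
compressed = gzip.compress(body)

decoded = json.loads(gzip.decompress(compressed).decode("utf-8"))
```

Most HTTP client libraries handle this transparently when the response carries `Content-Encoding: gzip`.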

Server Version Information

To obtain software build version details about the current API, call /version. Example: http://api.theysay.io/version.

Sentiment Analysis

Sentiment, a dimension of non-factuality in language closely related to subjectivity, affect, emotions, moods, and feelings, reflects psychological evaluation along the following fundamental poles:

  • positive / good / pros / favourable / desirable / recommended / thumbs up /... vs
  • negative / bad / cons / unfavourable / undesirable / not recommended / thumbs down /...

You can use the Sentiment Analysis service to discover and score deep, fine-grained sentiments and opinions in text. The analysis, produced by a human-like sentiment reasoning algorithm, captures both explicit "author sentiment" and general, implicit "reader sentiment": sentiment that goes beyond stated opinions and ultimately stems from affective common sense and from issues and events that are generally considered good vs. bad in the world.

The returned analysis includes majority sentiment labels, fine-grained 3-way positive/neutral/negative percentage scores, and other useful auxiliary fields.

POST

/v1/sentiment

Returns sentiment information about the entire text (document-level sentiment analysis).

Request fields:

  • "bias":{ p:d } (optional) - Sentiment coefficients (0 ≤ d ≤ 100) to control the (in)sensitivity of the sentiment analysis towards p ∈ { positive | neutral | negative } sentiment. Example: bias: { "positive":7.5 }
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "sentiment": {
    "label": "POSITIVE",
    "positive": 0.941,
    "negative": 0.0,
    "neutral": 0.059
  },
  "wordCount": 12
}
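Putting it together, a document-level request body contains just the "text" field, and the majority label plus its score can be read straight off the documented response. A minimal sketch (the response is the sample shown above, parsed locally; the input text is illustrative):

```python
import json

# Request body for document-level sentiment analysis (no "level" field).
request_body = json.dumps({"text": "What a fantastic, well-made product."})

# Parse the documented sample response and pick out the majority label
# and its 3-way score.
response_body = """
{
  "sentiment": {"label": "POSITIVE", "positive": 0.941,
                "negative": 0.0, "neutral": 0.059},
  "wordCount": 12
}
"""
result = json.loads(response_body)
label = result["sentiment"]["label"]          # "POSITIVE"
score = result["sentiment"][label.lower()]    # 0.941
```

Note that the three percentage scores sum to 1.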

POST

/v1/sentiment

Returns sentiment information about each sentence in the text (sentence-level sentiment analysis).

Request fields:

  • "bias":{ p:d } (optional) - Sentiment coefficients (0 ≤ d ≤ 100) to control the (in)sensitivity of the sentiment analysis towards p ∈ { positive | neutral | negative } sentiment. Example: bias: { "positive":7.5 }
  • "level":"sentence" - Selects sentence-level sentiment analysis.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "sentiment": {
    "label": "POSITIVE",
    "positive": 0.787,
    "negative": 0.16,
    "neutral": 0.053,
    "confidence": 0.668
  },
  "start": 0,
  "end": 36,
  "sentenceIndex": 0,
  "text": "The new French President Francois Hollande wants a '' growth pact '' in Europe - a set of reforms designed to boost European economies and mitigate the pain caused by government spending cuts across the continent ."
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "positive": 0.347,
    "negative": 0.627,
    "neutral": 0.026,
    "confidence": 0.614
  },
  "start": 37,
  "end": 68,
  "sentenceIndex": 1,
  "text": "All the bad loans made by eurozone banks may need to be cleaned up ( by injecting money into the banks ) because many national governments probably can not afford it ."
}]
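Since the sentence-level response is a JSON array with one object per sentence, a common post-processing step is to rank sentences by a score. A sketch using a trimmed version of the sample response above (the "text" fields are elided here for brevity):

```python
import json

# Find the most negative sentence in a sentence-level response.
sentences = json.loads("""
[{"sentiment": {"label": "POSITIVE", "negative": 0.16},
  "sentenceIndex": 0, "text": "..."},
 {"sentiment": {"label": "NEGATIVE", "negative": 0.627},
  "sentenceIndex": 1, "text": "..."}]
""")
worst = max(sentences, key=lambda s: s["sentiment"]["negative"])
```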

POST

/v1/sentiment

Returns sentiment information about each individual entity (term, keyword) mentioned in the text (entity-level sentiment analysis).

Request fields:

  • "bias":{ p:d } (optional) - Sentiment coefficients (0 ≤ d ≤ 100) to control the (in)sensitivity of the sentiment analysis towards p ∈ { positive | neutral | negative } sentiment. Example: bias: { "positive":7.5 }
  • "level":"entity" - Selects entity-level sentiment analysis.
  • "text" - The text that you want to analyse.

By default, all analysed entities are returned in the response. If you want to control which entities are included in the response, use the "targets" and "matching" fields to specify which entities you want.

  • "targets" accepts a list of regular expressions delimited by the | (%7C encoded) operator. Example: "targets":"market" or "targets":"market|business(es)?|opportunity|cost".

The specified target entities are matched against words in each entity NP using the following matching modes:

  • "matching":"head" (or none) (default) - The targets match the head noun of an entity NP (full head match).
  • "matching":"exact" - The targets match an entity NP (full match).
  • "matching":"phrase" - The targets can match anywhere inside an entity NP (substring search).

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "sentiment": {
    "label": "POSITIVE",
    "positive": 1.0,
    "negative": 0.0,
    "neutral": 0.0,
    "confidence": 0.756
  },
  "start": 2,
  "end": 2,
  "sentence": "'' This collaboration is sending a strong message to all the spammers : Stop sending us spam .",
  "sentenceHtml": "'' This <span class=\"entityMention\">collaboration</span> is sending a strong message to all the spammers : Stop sending us spam .",
  "text": "collaboration",
  "headNoun": "collaboration",
  "headNounIndex": 2,
  "salience": 1.0
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "positive": 0.412,
    "negative": 0.588,
    "neutral": 0,
    "confidence": 0.689
  },
  "start": 11,
  "end": 11,
  "sentence": "'' This collaboration is sending a strong message to all the spammers : Stop sending us spam .",
  "sentenceHtml": "'' This collaboration is sending a strong message to all the <span class=\"entityMention\">spammers</span> : Stop sending us spam .",
  "text": "spammers",
  "headNoun": "spammers",
  "headNounIndex": 11,
  "salience": 0.7
}]

POST

/v1/sentiment

Returns sentiment information about aggregated entities (terms, keywords) mentioned in the text. Individual entity mentions are grouped using lowercase head noun matching and scored using weighted sentiment scores.

Request fields:

  • "bias":{ p:d } (optional) - Sentiment coefficients (0 ≤ d ≤ 100) to control the (in)sensitivity of the sentiment analysis towards p ∈ { positive | neutral | negative } sentiment. Example: bias: { "positive":7.5 }
  • "level":"entityaggregate" - Selects aggregated entity-level sentiment analysis.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "entity": "osborne",
  "frequency": 2,
  "sentiment": {
    "label": "NEGATIVE",
    "positive": 0.0,
    "negative": 0.96,
    "neutral": 0.04,
    "confidence": 0.801
  },
  "salience": 1.0,
  "mentions": [{
    "sentiment": {
      "label": "NEGATIVE",
      "positive": 0.0,
      "negative": 0.851,
      "neutral": 0.149,
      "confidence": 0.775
    },
    "start": 0,
    "end": 1,
    "sentence": "Mr Osborne said the banking system was not working for its customers .",
    "sentenceHtml": " <span class=\"entityMention\">Mr Osborne</span> said the banking system was not working for its customers .",
    "text": "Mr Osborne",
    "headNoun": "Osborne",
    "headNounIndex": 1,
    "salience": 1.0
  }, {
    "sentiment": {
      "label": "NEGATIVE",
      "positive": 0.0,
      "negative": 0.861,
      "neutral": 0.139,
      "confidence": 0.827
    },
    "start": 13,
    "end": 13,
    "sentence": "Osborne also said that banks had failed to take responsibility for their actions .",
    "sentenceHtml": " <span class=\"entityMention\">Osborne</span> also said that banks had failed to take responsibility for their actions .",
    "text": "Osborne",
    "headNoun": "Osborne",
    "headNounIndex": 13,
    "salience": 1.0
  }]
}]

POST

/v1/sentiment

Returns sentiment information about detailed relations between entities (terms, keywords) mentioned in the text.

Request fields:

  • "level":"entityrelation" - Selects relational entity-level sentiment analysis.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
 "entity1": {
    "head": "Avanesov",
    "headIndex": 2,
    "text": "Russian Georgiy Avanesov"
  },
  "entity2": {
    "head": "botnet",
    "headIndex": 17,
    "text": "Bredolab botnet"
  },
  "sentiment": {
    "label": "NEGATIVE",
    "positive": 0.209,
    "negative": 0.523,
    "neutral": 0.268
  },
  "salience": 0.243,
  "sentence": "Russian Georgiy Avanesov was in May sentenced to four years in jail for being behind the Bredolab botnet which was believed to have been generating more than # 80,000 a month in revenue .",
  "sentenceHtml": " <span class=\"entity1\">Russian Georgiy Avanesov</span> was in May sentenced to four years in jail for being behind the <span class=\"entity2\">Bredolab botnet</span> which was believed to have been generating more than # 80,000 a month in revenue ."
}, {
  "entity1": {
    "head": "Avanesov",
    "headIndex": 2,
    "text": "Russian Georgiy Avanesov"
  },
  "entity2": {
    "head": "revenue",
    "headIndex": 32,
    "text": "revenue"
  },
  "sentiment": {
    "label": "POSITIVE",
    "positive": 0.377,
    "negative": 0.314,
    "neutral": 0.309
  },
  "salience": 0.155,
  "sentence": "Russian Georgiy Avanesov was in May sentenced to four years in jail for being behind the Bredolab botnet which was believed to have been generating more than # 80,000 a month in revenue .",
  "sentenceHtml": " <span class=\"entity1\">Russian Georgiy Avanesov</span> was in May sentenced to four years in jail for being behind the Bredolab botnet which was believed to have been generating more than # 80,000 a month in <span class=\"entity2\">revenue</span> ."
}]

POST

/v1/sentiment

Returns information about the flow of sentiment through the text (document-level sentiment timeline analysis). The analysis covers contextual sentence-level sentiment labels and positional co-ordinates for individual words in the text, which you can use to plot the temporal development (or flow) of sentiment through the text.

Request fields:

  • "level":"word" - Selects document-level sentiment timeline analysis.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.0
  },
  "wordIndex": 0,
  "text": "There"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.004
  },
  "wordIndex": 1,
  "text": "have"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.008
  },
  "wordIndex": 2,
  "text": "been"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.012
  },
  "wordIndex": 3,
  "text": "clashes"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.0170000000000001
  },
  "wordIndex": 4,
  "text": "throughout"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.0210000000000001
  },
  "wordIndex": 5,
  "text": "the"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.025
  },
  "wordIndex": 6,
  "text": "night"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.029
  },
  "wordIndex": 7,
  "text": "in"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.033
  },
  "wordIndex": 8,
  "text": "many"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.037
  },
  "wordIndex": 9,
  "text": "parts"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.042
  },
  "wordIndex": 10,
  "text": "of"
}, {
  "sentiment": {
    "label": "NEGATIVE",
    "timelineY": -1.046
  },
  "wordIndex": 11,
  "text": "Syria"
}, {
  "sentiment": {
    "label": "NEUTRAL",
    "timelineY": -1.046
  },
  "wordIndex": 12,
  "text": "."
}]
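To plot the timeline, the response can be reduced to a series of (wordIndex, timelineY) points. A sketch using a trimmed version of the sample response above:

```python
import json

# Turn the word-level timeline response into (x, y) points suitable
# for plotting the flow of sentiment through the text.
words = json.loads("""
[{"sentiment": {"label": "NEGATIVE", "timelineY": -1.0},
  "wordIndex": 0, "text": "There"},
 {"sentiment": {"label": "NEGATIVE", "timelineY": -1.004},
  "wordIndex": 1, "text": "have"}]
""")
points = [(w["wordIndex"], w["sentiment"]["timelineY"]) for w in words]
```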

Emotion Analysis

Beyond positive vs. negative sentiment polarity, a vast range of psychological dimensions exist in the realm of emotions/moods/feelings/affect. You can use the Emotion Analysis service to project the text onto a fine-grained, multi-dimensional emotion space which is more natural than a singular majority label. The returned analysis lists emotion dimension labels, each with a confidence value from the prediction, and covers the following basic unbounded emotion dimensions:

  • anger1D - 1-dimensional anger scale (> 0).
  • fear1D - 1-dimensional fear scale (> 0).
  • shame1D - 1-dimensional shame scale (> 0).
  • surprise1D - 1-dimensional surprise scale (> 0).
  • calm2D - 2-dimensional scale between calmness (> 0) vs. agitation (< 0).
  • happy2D - 2-dimensional scale between happiness (> 0) vs. sadness (< 0).
  • like2D - 2-dimensional scale between liking (> 0) vs. disliking/disgust (< 0).
  • sure2D - 2-dimensional scale between certainty/sureness (> 0) vs. uncertainty/unsureness (< 0).

POST

/v1/emotion

Returns emotion dimensions for the entire input text (document-level emotion analysis).

Request fields:

  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "emotions": [
    {
      "dimension": "anger1D",
      "score": 1.667
    },
    {
      "dimension": "calm2D",
      "score": -0.478
    },
    {
      "dimension": "fear1D",
      "score": 0
    },
    {
      "dimension": "happy2D",
      "score": 0
    },
    {
      "dimension": "like2D",
      "score": -1.4
    },
    {
      "dimension": "shame1D",
      "score": 0
    },
    {
      "dimension": "sure2D",
      "score": -0.667
    },
    {
      "dimension": "surprise1D",
      "score": 0
    }
  ]
}
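Because the response lists dimensions as an array, a dict keyed by dimension name is often more convenient for downstream use. A sketch using a subset of the sample response above:

```python
import json

# Index the emotion dimensions by name.
result = json.loads("""
{"emotions": [{"dimension": "anger1D", "score": 1.667},
              {"dimension": "like2D", "score": -1.4},
              {"dimension": "sure2D", "score": -0.667}]}
""")
scores = {e["dimension"]: e["score"] for e in result["emotions"]}

# Negative sure2D indicates uncertainty; negative like2D indicates disliking.
uncertain = scores["sure2D"] < 0
```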

POST

/v1/emotion

Returns emotion dimensions for each sentence in the input text (sentence-level emotion analysis).

Request fields:

  • "level":"sentence" - Selects sentence-level emotion analysis.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[
  {
    "emotions": [
      {
        "dimension": "anger1D",
        "score": 5
      },
      {
        "dimension": "calm2D",
        "score": -3.9
      },
      {
        "dimension": "fear1D",
        "score": 0
      },
      {
        "dimension": "happy2D",
        "score": 0
      },
      {
        "dimension": "like2D",
        "score": -2.533
      },
      {
        "dimension": "shame1D",
        "score": 0
      },
      {
        "dimension": "sure2D",
        "score": 0
      },
      {
        "dimension": "surprise1D",
        "score": 0
      }
    ],
    "start": 11,
    "end": 23,
    "sentenceIndex": 1,
    "text": "I have been called vile , villainous and evil for criticising her ."
  },
  {
    "emotions": [
      {
        "dimension": "anger1D",
        "score": 1.071
      },
      {
        "dimension": "calm2D",
        "score": -0.943
      },
      {
        "dimension": "fear1D",
        "score": 0.714
      },
      {
        "dimension": "happy2D",
        "score": -1.175
      },
      {
        "dimension": "like2D",
        "score": -0.536
      },
      {
        "dimension": "shame1D",
        "score": 0
      },
      {
        "dimension": "sure2D",
        "score": -0.286
      },
      {
        "dimension": "surprise1D",
        "score": 0.286
      }
    ],
    "start": 14,
    "end": 24,
    "sentenceIndex": 1,
    "text": "I wonder how many times she cried and considered suicide ."
  }
]

Speculation Detection

Speculative language describes or refers, directly or indirectly, to irrealis events that have yet to happen. Speculative expressions hence cover concepts as diverse as future, certainty, doubt, prediction, wanting, wishes, and waiting, to name a few.
This service detects speculative expressions at the sentence level. The response contains only 'positive' matches: if no speculative content is detected, the response is an empty list []. Any identified subtypes of speculation are denoted with the dot operator (.) (e.g. SPECULATION.ADVICE).

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/speculation

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "start": 0,
  "end": 8,
  "sentenceIndex": 0,
  "speculationType": "SPECULATION.ADVICE",
  "text": "It 's probably not advisable to use it ."
}]
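Since only positive matches are returned, an empty JSON array simply means no speculation was found. A sketch of handling both cases (the helper name is illustrative):

```python
import json

def speculative_sentences(response_body):
    """Return (sentenceIndex, speculationType) pairs from a
    /v1/speculation response. An empty array means no speculation
    was detected."""
    return [(m["sentenceIndex"], m["speculationType"])
            for m in json.loads(response_body)]

hits = speculative_sentences(
    '[{"start": 0, "end": 8, "sentenceIndex": 0,'
    ' "speculationType": "SPECULATION.ADVICE",'
    ' "text": "It \'s probably not advisable to use it ."}]')
none_found = speculative_sentences("[]")
```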

Risk Detection

This sentence-level service detects expressions that describe or refer to risk and danger, either directly or indirectly. The response contains only 'positive' matches: if no risk expressions are detected, the response is an empty list []. Any identified subtypes of risk are denoted with the dot operator (.) (e.g. RISK.SUBTYPE).

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/risk

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "start": 0,
  "end": 8,
  "sentenceIndex": 0,
  "riskType": "RISK",
  "text": "Your plan sounds plain dangerous in my mind."
}]

Intent Analysis

This sentence-level service detects expressions pertaining to intent, intentions, plans, and decisions. The response contains only 'positive' matches: if no intent expressions are detected, the response is an empty list []. Any identified subtypes of intent are denoted with the dot operator (.) (e.g. INTENT.DECISION).

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/intent

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "start": 0,
  "end": 11,
  "sentenceIndex": 0,
  "intentType": "INTENT.DECISION",
  "text": "I have made a decision to purchase the new improved camera model."
}]

Gender Classification

This end point allows you to predict the gender of the author who wrote the text. The prediction is based solely on the text itself - no user profile information is considered. The returned analysis offers gender labels (MALE vs. FEMALE) as well as confidence values from the predictions.

POST

/v1/gender

Returns a gender prediction for the entire input text (document-level gender detection).

Request fields:

  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "score": {
    "label": "MALE",
    "confidence": 0.997
  }
}

Humour Detection

A great many charged sentiment expressions convey humour, as evidenced by jokes; puns and word play; funny anecdotes, stories, proverbs, and sayings; one-liners, spoonerisms, and other highly creative linguistic devices. You can use the Humour Detection service to discover explicit and implicit humour signals in text. The returned analysis offers humour type labels (HUMOUR vs. NOT_HUMOUR) as well as confidence values from the predictions. Have fun!

POST

/v1/humour

Returns a humour prediction for the entire input text (document-level humour detection).

Request fields:

  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "score": {
    "label": "HUMOUR",
    "confidence": 0.941
  }
}

POST

/v1/humour

Returns humour predictions for each sentence in the input text (sentence-level humour detection).

Request fields:

  • "level":"sentence" - Selects sentence-level humour detection.
  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "score": {
    "label": "NOT_HUMOUR",
    "confidence": 0.763
  },
  "start": 64,
  "end": 90,
  "sentenceIndex": 3,
  "text": "The company said that the share award scheme is a one-off opportunity, but if it was successful, awards to employees would increase in the future."
  }, {
  "score": {
    "label": "HUMOUR",
    "confidence": 0.816
  },
  "start": 91,
  "end": 103,
  "sentenceIndex": 4,
  "text": "Shares will mainly be awarded to workers working in the \"famous chocolate factory\"... if u know wot i meen"
}]

Advertisement Detection

Because advertisements are spammy and almost invariably positive, they can skew sentiment measurements in a harmful way. This service allows you to detect texts that are or resemble advertisements. The returned analysis offers advertisement type labels (AD vs. NOT_AD) as well as confidence values from the predictions.

POST

/v1/ad

Returns an advertisement prediction for the entire input text (document-level advertisement detection).

Request fields:

  • "text" - The text that you want to analyse.

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "score": {
    "label": "AD",
    "confidence": 1
  }
}

Comparison Analysis

This sentence-level service detects comparative expressions. The response contains only 'positive' matches: if no comparative expressions are detected, the response is an empty list []. Any identified finer-grained comparative expressions are denoted with the dot operator (.) (e.g. COMPARISON.SUBTYPE).

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/comparison

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "start": 0,
  "end": 9,
  "sentenceIndex": 0,
  "comparisonType": "COMPARISON",
  "text": "Scala is much better than any other programming language ."
}]

Named Entity Recognition

This service detects expressions in the text snippet that refer explicitly or implicitly to

  • people and humans in general (PEOPLE)
  • places and locations (LOCATION)
  • organisations and companies (ORGANISATION)
  • times and dates (TIMEDATE)
  • monetary issues (MONEY)

For each identified expression (which can be a simple or complex Noun Phrase, Adjective Phrase, or Adverb Phrase), the detected Named Entity types are ranked by their salience (most salient first).

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/namedentity

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "head": "Hollande",
  "headIndex": 5,
  "start": 0,
  "end": 5,
  "sentence": "The new French President Francois Hollande wants a '' growth pact '' in Europe - a set of reforms designed to boost European economies and mitigate the pain caused by government spending cuts across the continent .",
  "sentenceHtml": "The new French President Francois Hollande wants a '' growth pact '' in Europe - a set of reforms designed to boost European economies and mitigate the pain caused by government spending cuts across the continent .",
  "text": "The new French President Francois Hollande",
  "namedEntityTypes": ["PEOPLE"]
}, {
  "head": "area",
  "headIndex": 7,
  "start": 6,
  "end": 15,
  "sentence": "The three lifeboats have been searching an area 25 miles ( 40km ) south of Wick , in the Beatrice oil field , for the two crew who remain missing .",
  "sentenceHtml": "The three lifeboats have been searching an area 25 miles ( 40km ) south of Wick , in the Beatrice oil field , for the two crew who remain missing .",
  "text": "an area 25 miles ( 40km ) south of Wick",
  "namedEntityTypes": ["LOCATION"]
}, {
  "head": "Co-op",
  "headIndex": 1,
  "start": 0,
  "end": 1,
  "sentence": "The Co-op will pay GBP350m upfront and up to an additional # 400m based on the performance of the combined business .",
  "sentenceHtml": "The Co-op will pay GBP350m upfront and up to an additional # 400m based on the performance of the combined business .",
  "text": "The Co-op",
  "namedEntityTypes": ["ORGANISATION"]
}, {
  "head": "shares",
  "headIndex": 31,
  "start": 30,
  "end": 31,
  "sentence": "The resolution for change was filed by Christian Brothers Investment Services ( CBIS ) and members of the Local Authority Pension Fund Forum ( LAPFF ) , organizations that own B shares .",
  "sentenceHtml": "The resolution for change was filed by Christian Brothers Investment Services ( CBIS ) and members of the Local Authority Pension Fund Forum ( LAPFF ) , organizations that own B shares .",
  "text": "B shares",
  "namedEntityTypes": ["MONEY"]
}]

Part-of-Speech Tagging

This service assigns word class types to individual words in the text snippet. The tagset used is largely compatible with the Penn Treebank Tagset.

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/postag

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "posTag": "PRP",
  "posTaggedWord": "I/PRP",
  "sentenceIndex": 0,
  "stem": "I|i",
  "text": "I",
  "wordIndex": 0
}, {
  "posTag": "MD",
  "posTaggedWord": "might/MD",
  "sentenceIndex": 0,
  "stem": "might|may",
  "text": "might",
  "wordIndex": 1
}, {
  "posTag": "VB",
  "posTaggedWord": "buy/VB",
  "sentenceIndex": 0,
  "stem": "buy",
  "text": "buy",
  "wordIndex": 2
}, {
  "posTag": "DT",
  "posTaggedWord": "a/DT",
  "sentenceIndex": 0,
  "stem": "a",
  "text": "a",
  "wordIndex": 3
}, {
  "posTag": "NNP",
  "posTaggedWord": "MacBookPro/NNP",
  "sentenceIndex": 0,
  "stem": "MacBookPro|macbookpro",
  "text": "MacBookPro",
  "wordIndex": 4
}, {
  "posTag": ".",
  "posTaggedWord": "./.",
  "sentenceIndex": 0,
  "stem": ".",
  "text": ".",
  "wordIndex": 5
}]
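Note that the "stem" field can carry alternative stems separated by "|" (e.g. "might|may" in the sample above). A sketch of splitting these into candidate base forms, using a trimmed version of the response:

```python
import json

# Map each surface word to its list of candidate stems.
tokens = json.loads("""
[{"posTag": "MD", "stem": "might|may", "text": "might", "wordIndex": 1},
 {"posTag": "VB", "stem": "buy", "text": "buy", "wordIndex": 2}]
""")
stems = {t["text"]: t["stem"].split("|") for t in tokens}
```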

Shallow Chunk Parsing

This service detects the boundaries of basic syntactic phrases in the text snippet. For each sentence, simple non-recursive Noun Phrase (NP) and Verb Phrase (VP) constituents are provided; tokens outside any phrase are returned with an empty chunkType.

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/chunkparse

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "chunk": {
    "chunkType": "",
    "end": 0,
    "sentenceIndex": 0,
    "start": 0,
    "text": "The"
  },
  "head": {
    "posTag": "DT",
    "posTaggedWord": "The/DT",
    "stem": "The",
    "text": "The",
    "wordIndex": 0
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 1,
    "sentenceIndex": 0,
    "start": 1,
    "text": "latest"
  },
  "head": {
    "posTag": "JJS",
    "posTaggedWord": "latest/JJS",
    "stem": "late",
    "text": "latest",
    "wordIndex": 1
  }
}, {
  "chunk": {
    "chunkType": "NP",
    "end": 2,
    "sentenceIndex": 0,
    "start": 0,
    "text": "The latest patch"
  },
  "head": {
    "posTag": "NN",
    "posTaggedWord": "patch/NN",
    "stem": "patch",
    "text": "patch",
    "wordIndex": 2
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 3,
    "sentenceIndex": 0,
    "start": 3,
    "text": "will"
  },
  "head": {
    "posTag": "MD",
    "posTaggedWord": "will/MD",
    "stem": "will",
    "text": "will",
    "wordIndex": 3
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 4,
    "sentenceIndex": 0,
    "start": 4,
    "text": "probably"
  },
  "head": {
    "posTag": "RB",
    "posTaggedWord": "probably/RB",
    "stem": "probably",
    "text": "probably",
    "wordIndex": 4
  }
}, {
  "chunk": {
    "chunkType": "VP",
    "end": 5,
    "sentenceIndex": 0,
    "start": 3,
    "text": "will probably solve"
  },
  "head": {
    "posTag": "VB",
    "posTaggedWord": "solve/VB",
    "stem": "solve",
    "text": "solve",
    "wordIndex": 5
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 6,
    "sentenceIndex": 0,
    "start": 6,
    "text": "all"
  },
  "head": {
    "posTag": "PDT",
    "posTaggedWord": "all/PDT",
    "stem": "all",
    "text": "all",
    "wordIndex": 6
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 7,
    "sentenceIndex": 0,
    "start": 7,
    "text": "your"
  },
  "head": {
    "posTag": "PRP$",
    "posTaggedWord": "your/PRP$",
    "stem": "your",
    "text": "your",
    "wordIndex": 7
  }
}, {
  "chunk": {
    "chunkType": "NP",
    "end": 8,
    "sentenceIndex": 0,
    "start": 6,
    "text": "all your problems"
  },
  "head": {
    "posTag": "NNS",
    "posTaggedWord": "problems/NNS",
    "stem": "problem",
    "text": "problems",
    "wordIndex": 8
  }
}, {
  "chunk": {
    "chunkType": "",
    "end": 9,
    "sentenceIndex": 0,
    "start": 9,
    "text": "."
  },
  "head": {
    "posTag": ".",
    "posTaggedWord": "./.",
    "stem": ".",
    "text": ".",
    "wordIndex": 9
  }
}]
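As the sample shows, single tokens outside any phrase come back with an empty chunkType, so filtering on a non-empty chunkType yields just the NP/VP constituents. A sketch using a trimmed version of the response above:

```python
import json

# Keep only the labelled phrase constituents (NP/VP chunks).
chunks = json.loads("""
[{"chunk": {"chunkType": "", "start": 0, "end": 0, "text": "The"}},
 {"chunk": {"chunkType": "NP", "start": 0, "end": 2,
            "text": "The latest patch"}},
 {"chunk": {"chunkType": "VP", "start": 3, "end": 5,
            "text": "will probably solve"}}]
""")
phrases = [c["chunk"]["text"] for c in chunks if c["chunk"]["chunkType"]]
```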

Dependency Parsing

This service analyses the grammatical structure of each sentence in the text snippet. For each sentence, typed syntactic dependencies between individual words are provided. The parses and the typed dependencies used resemble the labels and types described in the Cambridge Grammar of the English Language.

Request fields:

  • "text" - The text that you want to analyse.

POST

/v1/depparse

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "dependency": {
    "predicate": "nsubj(got, I)",
    "relation": "nsubj"
  },
  "dependent": {
    "text": "I",
    "stem": "I|i",
    "wordIndex": 0
  },
  "governor": {
    "text": "got",
    "stem": "got|get",
    "wordIndex": 1
  }
}, {
  "dependency": {
    "predicate": "(root, got)",
    "relation": ""
  },
  "dependent": {
    "text": "got",
    "stem": "got|get",
    "wordIndex": 1
  }
}, {
  "dependency": {
    "predicate": "det(camera, a)",
    "relation": "det"
  },
  "dependent": {
    "text": "a",
    "stem": "a",
    "wordIndex": 2
  },
  "governor": {
    "text": "camera",
    "stem": "camera",
    "wordIndex": 4
  }
}, {
  "dependency": {
    "predicate": "amod(camera, new)",
    "relation": "amod"
  },
  "dependent": {
    "text": "new",
    "stem": "new",
    "wordIndex": 3
  },
  "governor": {
    "text": "camera",
    "stem": "camera",
    "wordIndex": 4
  }
}, {
  "dependency": {
    "predicate": "dobj(got, camera)",
    "relation": "dobj"
  },
  "dependent": {
    "text": "camera",
    "stem": "camera",
    "wordIndex": 4
  },
  "governor": {
    "text": "got",
    "stem": "got|get",
    "wordIndex": 1
  }
}, {
  "dependency": {
    "predicate": "rel(takes, which)",
    "relation": "rel"
  },
  "dependent": {
    "text": "which",
    "stem": "which",
    "wordIndex": 5
  },
  "governor": {
    "text": "takes",
    "stem": "takes|take",
    "wordIndex": 6
  }
}, {
  "dependency": {
    "predicate": "rcmod(camera, takes)",
    "relation": "rcmod"
  },
  "dependent": {
    "text": "takes",
    "stem": "takes|take",
    "wordIndex": 6
  },
  "governor": {
    "text": "camera",
    "stem": "camera",
    "wordIndex": 4
  }
}, {
  "dependency": {
    "predicate": "amod(photos, brilliant)",
    "relation": "amod"
  },
  "dependent": {
    "text": "brilliant",
    "stem": "brilliant",
    "wordIndex": 7
  },
  "governor": {
    "text": "photos",
    "stem": "photos|photo",
    "wordIndex": 8
  }
}, {
  "dependency": {
    "predicate": "dobj(takes, photos)",
    "relation": "dobj"
  },
  "dependent": {
    "text": "photos",
    "stem": "photos|photo",
    "wordIndex": 8
  },
  "governor": {
    "text": "takes",
    "stem": "takes|take",
    "wordIndex": 6
  }
}, {
  "dependency": {
    "predicate": "(root, .)",
    "relation": ""
  },
  "dependent": {
    "text": ".",
    "stem": ".",
    "wordIndex": 9
  }
}]
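To illustrate how the flat list above encodes a parse, the sketch below (using two entries from this example response, decoded with Python's json module) rebuilds governor-to-dependent edges and treats entries without a governor as roots:

```python
import json

# Two entries from the dependency parsing response above.
response = json.loads("""
[{"dependency": {"predicate": "nsubj(got, I)", "relation": "nsubj"},
  "dependent": {"text": "I", "stem": "I|i", "wordIndex": 0},
  "governor": {"text": "got", "stem": "got|get", "wordIndex": 1}},
 {"dependency": {"predicate": "(root, got)", "relation": ""},
  "dependent": {"text": "got", "stem": "got|get", "wordIndex": 1}}]
""")

edges, roots = [], []
for entry in response:
    if "governor" in entry:  # a typed dependency between two words
        edges.append((entry["dependency"]["relation"],
                      entry["governor"]["text"],
                      entry["dependent"]["text"]))
    else:                    # no governor: a root of the sentence
        roots.append(entry["dependent"]["text"])

print(edges)  # [('nsubj', 'got', 'I')]
print(roots)  # ['got']
```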

Text Summarisation

This service generates a summary from the input text. The summary consists of sentences delimited by \n.

Request fields:

  • "ratio" (optional) - Controls the size of the summary (0 ≤ ratio ≤ 1.0, where 0 returns all sentences and 1.0 returns only the most salient sentence(s) in the input text).
  • "text" - The text that you want to summarise.

POST

/v1/summary

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "summary": "Charities criticise UK for ending humanitarian aid\nCharities have criticised the UK after the govt announced it would stop direct aid to Peru in 2019.\n UK ministers said their relationship with Peru is more about trade and not development as such."
}]
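Since the summary is returned as a single string with sentences delimited by \n, a client will typically split it back into sentences. A minimal sketch, using the summary string from the example response above:

```python
# The summary string from the example response above.
summary = ("Charities criticise UK for ending humanitarian aid\n"
           "Charities have criticised the UK after the govt announced "
           "it would stop direct aid to Peru in 2019.\n"
           " UK ministers said their relationship with Peru is more "
           "about trade and not development as such.")

# Split on the \n delimiter and strip stray whitespace around sentences.
sentences = [s.strip() for s in summary.split("\n")]
print(len(sentences))  # 3
```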

Language Detection

This service returns an ISO 639-1 natural language code for the input text.

Request fields:

  • "text" - The text for which you want a language code.

POST

/v1/langdetect

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "iso6391": "en"
}
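As with the other end points, a POST request carries the query fields as JSON. The sketch below shows only the payload shapes from this documentation; the HTTP transport, credentials, and the sample text are illustrative and omitted or hypothetical:

```python
import json

# Request body for /v1/langdetect, per the fields listed above.
request_body = json.dumps({"text": "How cool is that!"})

# Example response body as documented.
response_body = '{"iso6391": "en"}'
language = json.loads(response_body)["iso6391"]
print(language)  # en
```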

Resources for Sentiment Analysis

This end point allows you to manage the lexical resources that are used in the sentiment analysis on your account. By fine-tuning and customising word lists (adjectives, adverbs, nouns, verbs), you can adapt the sentiment analysis to a particular genre, domain, topic, or use case beyond the default generic resources.

Sentiment lexicon entries can have the following fields:

  • "polarity":p - The sentiment polarity of the lexicon entry, p ∈ { pos | ntr | neg }. Example: "polarity":"pos"
  • "reverse":r - The sentiment reversal property of the lexicon entry, r ∈ { rev | equ }. Example: "reverse":"rev"
  • "text" - The entry to be stored in the lexicon.

POST

/v1/resources/lexicons/sentiment/{ adjectives | adverbs | nouns | verbs }

Response

201 (Created)

GET

/v1/resources/lexicons/sentiment/{ adjectives | adverbs | nouns | verbs }

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "text": "quasi-intelligent",
  "polarity": "pos",
  "id": "51b0630a7a233d39005ecc1e"
}, {
  "text": "unemployment",
  "polarity": "ntr",
  "id": "51b0630a7a233d39005ecc1f"
}]
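As a sketch, a client might validate and serialise a lexicon entry before POSTing it to one of the word-list end points. The helper below is illustrative, not part of any official client; it only enforces the field values documented above:

```python
import json

VALID_POLARITY = {"pos", "ntr", "neg"}   # p from { pos | ntr | neg }
VALID_REVERSE = {"rev", "equ"}           # r from { rev | equ }

def lexicon_entry(text, polarity=None, reverse=None):
    """Build a JSON body for POST /v1/resources/lexicons/sentiment/..."""
    if polarity is not None and polarity not in VALID_POLARITY:
        raise ValueError("polarity must be one of pos, ntr, neg")
    if reverse is not None and reverse not in VALID_REVERSE:
        raise ValueError("reverse must be one of rev, equ")
    entry = {"text": text}
    if polarity is not None:
        entry["polarity"] = polarity
    if reverse is not None:
        entry["reverse"] = reverse
    return json.dumps(entry)

print(lexicon_entry("quasi-intelligent", polarity="pos"))
```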

GET

/v1/resources/lexicons/sentiment/{ adjective | adverb | noun | verb }/{ objectID }

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "text": "quasi-intelligent",
  "polarity": "pos",
  "id": "51b0630a7a233d39005ecc1e"
}

DELETE

/v1/resources/lexicons/sentiment/{ adjective | adverb | noun | verb }/{ objectID }

Response

200 (OK)

Resources for Entity Taxonomies

This end point allows you to manage the taxonomic resources that are used in the entity categorisation on your account. By adding pattern matching rules for taxonomic categories, you can categorise entity mentions into any desired taxonomic levels beyond the default head noun-based grouping.

Entity taxonomy entries can have the following fields:

  • "matchPattern" - A regex pattern for capturing entity mentions. Example: "matchPattern":"pizza(s)?"
  • "category" - The taxonomic category under which matched entity mentions should be categorised. Example: "category":"FOOD.PIZZA.ITALIAN"

POST

/v1/resources/taxonomies/entity

Response

201 (Created)

GET

/v1/resources/taxonomies/entity

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
[{
  "matchPattern": "pizza(s)?",
  "category": "FOOD.PIZZA",
  "id": "51b0780a7a233d4e005ecc1f"
}, {
  "matchPattern": "(beer|lager|bitter)",
  "category": "FOOD.DRINK",
  "id": "51b0781f7a233d48005ecc20"
}]
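To illustrate how a matchPattern captures entity mentions, the sketch below applies the rules from the response above with Python's re module. Note that the server-side matching semantics (case sensitivity, full vs. partial matching, rule precedence) are not specified in this documentation, so the choices below are assumptions:

```python
import re

# The taxonomy rules from the example response above.
rules = [
    {"matchPattern": "pizza(s)?", "category": "FOOD.PIZZA"},
    {"matchPattern": "(beer|lager|bitter)", "category": "FOOD.DRINK"},
]

def categorise(mention):
    """Return the category of the first rule whose pattern matches the mention."""
    for rule in rules:
        if re.fullmatch(rule["matchPattern"], mention, re.IGNORECASE):
            return rule["category"]
    return None  # no rule matched; the default head noun-based grouping applies

print(categorise("pizzas"))  # FOOD.PIZZA
print(categorise("lager"))   # FOOD.DRINK
```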

GET

/v1/resources/taxonomies/entity/{ objectID }

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "matchPattern": "(beer|lager|bitter)",
  "category": "FOOD.DRINK",
  "id": "51b0781f7a233d48005ecc20"
}

DELETE

/v1/resources/taxonomies/entity/{ objectID }

Response

200 (OK)

Feedback

AVAILABLE SOON!

If you spot 1) incorrect, odd, or funny analyses (classifications, predictions, tags, chunks, parses, labels, ranges, values, and the like), 2) structural anomalies or defects, or 3) general issues in the responses returned by the API, you can submit free-form feedback to us, and we will look into it. Your feedback is greatly appreciated!

Request fields:

  • "endpoint" - The name of the service end point that returned the response that you want to report.
  • "expected" (optional) - The correct or expected value(s) that the response should have contained.
  • "feedback" - Any detailed information or general comments that you can provide that will help us diagnose the issue.
  • "text" - The value of the "text" field that you used in the request sent to the end point in question.

POST

/v1/feedback

Response

201 (Created)
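A feedback request body using the fields documented above might look as follows; all field values here are illustrative placeholders, not real reports:

```python
import json

# Illustrative body for POST /v1/feedback.
feedback = {
    "endpoint": "/v1/depparse",
    "text": "I got a new camera which takes brilliant photos.",
    "expected": "a dobj relation between 'takes' and 'photos'",
    "feedback": "The relation assigned to 'photos' in this sentence looks odd.",
}
body = json.dumps(feedback)
print(body)
```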

Usage Data

You can monitor your API usage within a specific time period between two timestamps. The timestamps expect values that are compliant with the W3C date and time format.

Request fields:

  • "from" - The W3C start value for the query.
  • "to" (optional) - The W3C end value for the query. If omitted, defaults to now.

GET

/v1/usagestats?from=2013-02-01&to=2013-02-13

Response

200 (OK)
Content-Type: application/json; charset=UTF-8
{
  "username": "yourUserName",
  "from": "2013-02-06T00:00:00.000Z",
  "to": "2013-02-13T00:00:00.000Z",
  "requestCount": 193,
  "dailyUsage": [{
    "date": "2013-02-06T00:00:00.000Z",
    "requestCount": 0
  }, {
    "date": "2013-02-07T00:00:00.000Z",
    "requestCount": 3
  }, {
    "date": "2013-02-08T00:00:00.000Z",
    "requestCount": 97
  }, {
    "date": "2013-02-09T00:00:00.000Z",
    "requestCount": 0
  }, {
    "date": "2013-02-10T00:00:00.000Z",
    "requestCount": 0
  }, {
    "date": "2013-02-11T00:00:00.000Z",
    "requestCount": 15
  }, {
    "date": "2013-02-12T00:00:00.000Z",
    "requestCount": 72
  }, {
    "date": "2013-02-13T00:00:00.000Z",
    "requestCount": 6
  }]
}
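The per-day figures in dailyUsage add up to the top-level requestCount, which a client can use as a sanity check when aggregating usage:

```python
# requestCount per day, taken from the dailyUsage array above.
daily = [0, 3, 97, 0, 0, 15, 72, 6]
total = sum(daily)
print(total)  # 193, matching the top-level "requestCount"
```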