Version 2.1 Introduction

This document describes how to use version 2.1 of the scanii.com API. If you have any questions, please contact support.

The examples below use the cURL command line tool and should translate easily into any programming language.

📘 Notable changes from version 2.0

  • File resource now accepts a metadata argument for arbitrary key/value pairs to be stored alongside the resource
  • Added a beta account lookup resource

Endpoints

Scanii is a global content processing service with availability in the following regions/domains:

For the sake of this overview we will use api.scanii.com as the domain name, since it’s the simplest to use.

Basics

All access happens over HTTPS using the https://api.scanii.com domain name and the /v2.1 base path. All responses use JSON, and all dates are in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). For example, here’s a sample call to our Ping resource:

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:123456 https://api.scanii.com/v2.1/ping

All of our resources use HTTP basic authentication to uniquely identify clients, and since all interactions happen over TLS, credentials are always transmitted securely.

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:123456

In the example above, 8eb05c68f386421db2dd4929fc4f77ad:123456 represents the API key (8eb05c68f386421db2dd4929fc4f77ad) and its secret (123456). API credentials are created and managed by the user through Scanii’s web interface.
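For readers translating the cURL examples into code, here is a minimal Python sketch of what the -u flag does: it joins the key and secret with a colon, Base64-encodes the result, and sends it in an Authorization header. The function name basic_auth_header is a hypothetical helper for illustration, not part of any Scanii client library, and the credentials are the placeholder values from the example above.

```python
import base64


def basic_auth_header(api_key, api_secret=""):
    """Return an HTTP Basic Authorization header for the given credentials.

    Hypothetical helper: mirrors what `curl -u key:secret` sends.
    """
    raw = f"{api_key}:{api_secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder credentials copied from the cURL example above.
header = basic_auth_header("8eb05c68f386421db2dd4929fc4f77ad", "123456")
```

The resulting dictionary can be passed as request headers to whichever HTTP client you prefer.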

Protocol

HTTP response codes

Scanii strives to adhere to common REST principles for its responses whenever practical and retain consistent response codes across its resources.

Here are the common HTTP error response codes across our resources:

  • 400 - Request could not be understood
  • 401 - Authentication error
  • 403 - There is a problem with your API credentials
  • 404 - Invalid path
  • 413 - Content size is bigger than the max allowed by this resource

Success codes will vary by resource but will always be in the 2XX range.
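A client might keep the codes listed above in a small lookup table so that failures can be logged with a readable explanation. The sketch below is one possible way to do that; the names ERROR_CODES and describe_status are hypothetical, and the descriptions are copied from the list above.

```python
# Error codes and descriptions taken from the list above.
ERROR_CODES = {
    400: "Request could not be understood",
    401: "Authentication error",
    403: "There is a problem with your API credentials",
    404: "Invalid path",
    413: "Content size is bigger than the max allowed by this resource",
}


def describe_status(status):
    """Return a short description for an HTTP status code (hypothetical helper)."""
    if 200 <= status <= 299:
        return "Success"
    return ERROR_CODES.get(status, "Unlisted status code")
```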

HTTP response headers

Scanii utilizes HTTP response headers to provide metadata about the API response and its resources. For example:

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:123456789 -F file=@/tmp/foo.exe https://api.scanii.com/v2.1/files
HTTP/1.1 201 Created
Access-Control-Allow-Headers: Authorization
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 29 May 2015 05:33:30 GMT
Location: https://api.scanii.com/v2.1/files/9a1880dcb5e31d47c11be8ab243078ab
X-Runtime: 4164ms
X-Scanii-Host-Id: 613a7f69
X-Scanii-Request-Id: 0cb43907-684a-4439-a164-3e40edde1f48
Content-Length: 299
Connection: keep-alive
{
  "id" : "9a1880dcb5e31d47c11be8ab243078ab",
  "checksum" : "edbb54821bc3f5666be48184a822c3df59392c31",
  "content_length" : 1579562,
  "findings" : [ "av.crdf.malware-generic.2462546599.unofficial" ],
  "creation_date" : "2015-05-29T05:33:34.390Z",
  "content_type" : "application/x-msdownload"
}

Header and purpose:

  • X-Scanii-Request-Id - A unique identifier of the request being processed
  • Location - The resource unique location
  • X-Runtime - The amount of server side time it took to process the request
  • X-Scanii-Host-Id - A unique identifier of the server that processed the request
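The headers described above can be collected into a small dictionary for logging or support requests. This is a minimal sketch: response_metadata is a hypothetical helper name, and it assumes X-Runtime always looks like the samples in this document (an integer followed by "ms"). Header names are matched case-insensitively because the HTTP/2 samples in this document show them in lowercase.

```python
def response_metadata(headers):
    """Extract Scanii's metadata headers from a mapping of response headers.

    Hypothetical helper; assumes X-Runtime has the form "<integer>ms".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    runtime = lowered.get("x-runtime")
    runtime_ms = None
    if runtime is not None and runtime.endswith("ms") and runtime[:-2].isdigit():
        runtime_ms = int(runtime[:-2])
    return {
        "request_id": lowered.get("x-scanii-request-id"),
        "host_id": lowered.get("x-scanii-host-id"),
        "location": lowered.get("location"),
        "runtime_ms": runtime_ms,
    }
```

Recording the request id alongside your own logs makes it easier to reference a specific call when contacting support.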

Error payloads

Every unsuccessful API response includes a basic set of response elements:

  • error - a text message explaining what happened

Example:

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:123456789 -F file=@example.doc https://api.scanii.com/v2.1/files
HTTP/2 400 
date: Sun, 19 May 2019 18:02:44 GMT
content-type: application/json
content-length: 109
x-runtime: 1ms
x-scanii-host-id: 9a42da17
access-control-allow-origin: *
access-control-allow-headers: Authorization
x-scanii-request-id: 0c3de16c-1833-4489-b7a0-a745c31a818e
{
  "error" : "Regrettably, you did not send us any content to process - please see http://docs.scanii.com"
}
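In client code, the error element shown above can be turned into an exception. The sketch below is one way to do it, assuming you already have the status code and the raw response body as a string; ScaniiError and parse_response are hypothetical names, not part of an official client.

```python
import json


class ScaniiError(Exception):
    """Raised for non-2XX responses (hypothetical exception type)."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body):
    """Return the decoded JSON payload, or raise ScaniiError on failure."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        payload = {}  # the body was not JSON, e.g. an intermediary's error page
    if not isinstance(payload, dict):
        payload = {}
    if 200 <= status <= 299:
        return payload
    raise ScaniiError(status, payload.get("error", "no error message in response"))
```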

Success payloads

Every successful API response includes a basic set of response elements:

Element and purpose:

  • id - the unique identifier of this result
  • checksum - the SHA1 digest of the content processed
  • content_length - the length in bytes of the content processed
  • findings - what our content detection engines were able to identify while processing the content submitted
  • creation_date - ISO 8601 timestamp of when the content was processed
  • content_type - the media type of the content processed
  • metadata - arbitrary set of user-supplied key/value pairs
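The elements listed above map naturally onto a small typed record. The following sketch parses a processing result into a Python dataclass; ProcessingResult and parse_result are hypothetical names. The trailing "Z" is rewritten to "+00:00" because datetime.fromisoformat only accepts the "Z" suffix from Python 3.11 onward, and metadata is treated as optional because some sample responses in this document omit it.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProcessingResult:
    """Typed view of a processing result (hypothetical class)."""
    id: str
    checksum: str
    content_length: int
    findings: list
    creation_date: datetime
    content_type: str
    metadata: dict


def parse_result(body):
    """Decode the JSON body of a processing result into a ProcessingResult."""
    data = json.loads(body)
    created = datetime.fromisoformat(data["creation_date"].replace("Z", "+00:00"))
    return ProcessingResult(
        id=data["id"],
        checksum=data["checksum"],
        content_length=data["content_length"],
        findings=list(data["findings"]),
        creation_date=created,
        content_type=data["content_type"],
        metadata=dict(data.get("metadata", {})),  # absent in some responses
    )
```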

Findings follow a dot-notated hierarchy with the following prefixes, listed by engine:

  • Malware - content.malicious.
  • NSFW Language - content. + ISO_3166-1_alpha-2 country code + .language.nsfw.
  • NSFW Image - content.image.nsfw.

Examples: content.malicious.eicar-test-signature, content.image.nsfw.violence-weapons and content.en.language.nsfw.129
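Because findings are plain dotted strings, a client can route them by prefix. Here is a minimal sketch of such a classifier; classify_finding and its return labels are hypothetical, and the two-letter pattern is an assumption based on the content.en.language.nsfw.129 example above.

```python
import re

# Assumes the language findings carry a two-letter lowercase code, as in the example above.
_NSFW_LANGUAGE = re.compile(r"^content\.[a-z]{2}\.language\.nsfw\.")


def classify_finding(finding):
    """Map a finding string to the engine that produced it (hypothetical labels)."""
    if finding.startswith("content.malicious."):
        return "malware"
    if finding.startswith("content.image.nsfw."):
        return "nsfw_image"
    if _NSFW_LANGUAGE.match(finding):
        return "nsfw_language"
    return "other"
```

Findings that match none of the documented prefixes fall through to "other" rather than raising, so unfamiliar finding names do not break the client.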

Cross Origin Resource Sharing - CORS

Along with the HTTP headers listed above, our APIs also support Cross Origin Resource Sharing (http://en.wikipedia.org/wiki/Cross-origin_resource_sharing) for AJAX requests from any origin. Here’s what a sample request from a browser hitting our endpoint would look like:

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:123456789 -H "Origin: http://example.com" https://api.scanii.com/v2.1/ping
HTTP/1.1 200 OK
Access-Control-Allow-Headers: Authorization
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 29 May 2015 05:38:57 GMT
X-Runtime: 0ms
X-Scanii-Host-Id: 613a7f69
X-Scanii-Request-Id: 68177aeb-0648-4946-acaa-8b4de08bbb6b
Content-Length: 70
Connection: keep-alive
{
  "message" : "pong",
  "key" : "8eb05c68f386421db2dd4929fc4f77ad"
}

Limits

Our current API does not have an explicit call rate limit. We consider that an oversight on our part and aim to address it in the next version of our API. That said, if you send us a sudden burst of traffic (say, 10k requests in one second), it is likely to trigger a capacity scaling event, which can cause 500-level HTTP responses for a few seconds while more capacity is provisioned. We encourage all of our clients to add sensible retry logic (for example, up to 3 retries with a random wait of between 1 and 30 seconds) to cope with these scaling events.
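The retry suggestion above could be sketched as follows. This is an illustration under stated assumptions rather than an official client: call_with_retries is a hypothetical name, and request stands for any zero-argument callable of yours that performs the HTTP call and returns a (status, body) pair. Only 500-level responses are retried, and the sleep function is injectable so the wait can be replaced in tests.

```python
import random
import time


def call_with_retries(request, retries=3, min_wait=1.0, max_wait=30.0, sleep=time.sleep):
    """Call request() and retry on 5XX responses with a random wait.

    request: zero-argument callable returning (status, body).
    retries: maximum number of retries after the first attempt.
    """
    attempt = 0
    while True:
        status, body = request()
        # Return on anything other than a server error, or when out of retries.
        if status < 500 or attempt >= retries:
            return status, body
        attempt += 1
        sleep(random.uniform(min_wait, max_wait))
```

The random wait spreads retries from many clients over time, so they do not all return at the same instant while capacity is still being provisioned.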

API tour

For the examples below we will assume that the user has a valid API key, and we will walk through a few common API calls using the cURL command line tool.

Let’s start with a simple Ping call that tells us that our API key is ready for use:

$ curl -u 8eb05c68f386421db2dd4929fc4f77ad:12345678 https://api.scanii.com/v2.1/ping
{
    "message" : "pong",
    "key" : "8eb05c68f386421db2dd4929fc4f77ad"
}

Looks good. Now let’s send a file for synchronous processing (that is, the client waits until processing is complete):

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:12345678 -F "metadata[filename]=suba002.exe" -F file=@suba002.exe https://api.scanii.com/v2.1/files
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Access-Control-Allow-Headers: Authorization
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Sat, 12 Dec 2015 17:34:10 GMT
Location: https://api.scanii.com/v2.1/files/4d323e37433a76d59a78a97d5265ad67
X-Runtime: 1245ms
X-Scanii-Host-Id: 86ddeebc
X-Scanii-Request-Id: ec9a8d74-823b-457f-bdf8-0836c47f7534
Content-Length: 333
Connection: keep-alive
{
  "id" : "4d323e37433a76d59a78a97d5265ad67",
  "checksum" : "edbb54821bc3f5666be48184a822c3df59392c31",
  "content_length" : 1579562,
  "findings" : [ "av.win.trojan.agent-948155" ],
  "creation_date" : "2015-12-12T17:34:12.031Z",
  "content_type" : "application/x-msdownload",
  "metadata" : {
    "filename" : "suba002.exe"
  }
}

Well, it looks like Scanii found something in that file. The “findings” element is always a list of everything meaningful our engines encountered while processing the file; in this case, our antivirus engine found the “av.win.trojan.agent-948155” malware.

Also, in the example above we took advantage of Scanii’s custom metadata support to store the file name alongside the processed content. Arguments sent to us in the format metadata[key]=value are automatically saved with the resource, which makes metadata a great place to store information your business logic needs, such as the internal ID of the content or the name of the web server (or application) that generated the request.
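When building the request in code, the metadata[key]=value convention amounts to renaming each key before it is added to the multipart form. A minimal sketch, where metadata_fields is a hypothetical helper name and the internal_id key is an invented example:

```python
def metadata_fields(metadata):
    """Turn {"key": value} into form fields named "metadata[key]" (hypothetical helper)."""
    return {f"metadata[{key}]": str(value) for key, value in metadata.items()}


# Example: the file name from the cURL call above plus an invented internal id.
fields = metadata_fields({"filename": "suba002.exe", "internal_id": 42})
```

The returned dictionary can be merged into the other form fields your HTTP client sends alongside the file part.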

Now let’s say that you would like to batch process lots of files at once and be notified as processing completes. Here’s an example using our asynchronous callback endpoint:

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:12345678 -F file=@/Users/rafael/virus.exe -F callback=https://acme.com/scanii-webhook https://api.scanii.com/v2.1/files/async
HTTP/1.1 100 Continue
HTTP/1.1 202 Accepted
Access-Control-Allow-Headers: Authorization
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 29 May 2015 06:00:36 GMT
Location: https://api.scanii.com/v2.1/files/decad1d51b7981911113eb735739e73f
X-Runtime: 1113ms
X-Scanii-Host-Id: 613a7f69
X-Scanii-Request-Id: b12f1da6-f76b-4489-8e47-443c3ffc91ea
Content-Length: 41
Connection: keep-alive
{"id":"decad1d51b7981911113eb735739e73f"}

In the example above the API call returns immediately (without waiting for content processing to finish), and Scanii notifies the endpoint https://acme.com/scanii-webhook via an HTTP POST request once processing completes. The callback payload matches the usual JSON response payload, as shown below:

{
  "id" : "decad1d51b7981911113eb735739e73f",
  "checksum" : "edbb54821bc3f5666be48184a822c3df59392c31",
  "content_length" : 1579562,
  "findings" : [ "av.crdf.malware-generic.2462546599.unofficial" ],
  "creation_date" : "2015-05-29T06:00:37.772Z",
  "content_type" : "application/x-msdownload"
}

Also notice that all processed content has a unique, persistent, and retrievable locator:

$ curl -u 8eb05c68f386421db2dd4929fc4f77ad:12345678 https://api.scanii.com/v2.1/files/decad1d51b7981911113eb735739e73f
{
  "id" : "decad1d51b7981911113eb735739e73f",
  "checksum" : "edbb54821bc3f5666be48184a822c3df59392c31",
  "content_length" : 1579562,
  "findings" : [ "av.crdf.malware-generic.2462546599.unofficial" ],
  "creation_date" : "2015-05-29T06:00:37.772Z",
  "content_type" : "application/x-msdownload"
}

Lastly, let’s have Scanii fetch the content to be processed directly from a third-party resource (in this case a private Amazon S3 object that we will access using query string authentication):

$ curl -i -u 8eb05c68f386421db2dd4929fc4f77ad:12345678 --data-urlencode location='https://scanii.s3.amazonaws.com/eicarcom2.zip?AWSAccessKeyId=AKIAJNN3CBMBGCMQDU4A&Expires=1432966418&Signature=QjxrlqDq587fSDkhqfI5Kt2LVN8%3D' -d callback=https://acme.com/scanii-webhook https://api.scanii.com/v2.1/files/fetch
HTTP/1.1 202 Accepted
Access-Control-Allow-Headers: Authorization
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Fri, 29 May 2015 06:19:32 GMT
Location: https://api.scanii.com/v2.1/files/425700a659d88a3e4fc8551f2da1eed1
X-Runtime: 0ms
X-Scanii-Host-Id: 613a7f69
X-Scanii-Request-Id: 538547c8-37be-47b3-9333-ca489eb68bd5
Content-Length: 41
Connection: keep-alive
{"id":"425700a659d88a3e4fc8551f2da1eed1"}

In the example above we pass the location to be fetched and processed, and the results are eventually sent to a callback URL.
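The location value usually contains its own query string, so it has to be URL-encoded before being placed in the form body; that is what --data-urlencode does in the cURL call. A minimal Python sketch of the same encoding, where fetch_form_body is a hypothetical helper name and the example.com URL is an invented placeholder:

```python
from urllib.parse import parse_qs, urlencode


def fetch_form_body(location, callback=None):
    """Build the application/x-www-form-urlencoded body for a fetch request."""
    fields = {"location": location}
    if callback:
        fields["callback"] = callback
    return urlencode(fields)  # percent-encodes "&", "=", "?" and similar characters


# Invented placeholder URL with its own query string.
body = fetch_form_body(
    "https://example.com/file.zip?Expires=1432966418&Signature=abc%3D",
    callback="https://acme.com/scanii-webhook",
)
```

Without this encoding, the "&" inside the location URL would be read as a separator between form fields and the URL would be truncated.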

Using Temporary Authentication Tokens

Authentication tokens are temporary API credentials aimed at making client-side processing easier. With them, you can generate a token and hand it to an untrusted client, which can then submit content directly to Scanii for analysis.

Here’s how you create one:

$ curl -i -X POST -u a4d6b8a87ddd09c2647d97741bc380c4:12345678 -d timeout=30 https://api.scanii.com/v2.1/auth/tokens
HTTP/2 201 
date: Mon, 21 Sep 2020 11:48:34 GMT
content-type: application/json
content-length: 151
location: https://api.scanii.com/v2.1/auth/tokens/6602a804f7232817676146592de3a667
x-runtime: 112ms
x-scanii-host-id: 589e5e35
access-control-allow-origin: *
access-control-allow-headers: Authorization
x-scanii-request-id: 83ec1fda-6e13-4520-96d8-ed37ecf40269
{
  "id" : "6602a804f7232817676146592de3a667",
  "creation_date" : "2020-09-21T11:48:34.824146Z",
  "expiration_date" : "2020-09-21T11:49:04.824146Z"
}

In the example above we created a new temporary authentication token with a 30-second timeout, and we’re going to use it to authenticate a content analysis request. Please note that we’re using the temporary auth token as the authentication username and leaving the password blank:

$ curl -i -X POST -u 6602a804f7232817676146592de3a667: -F file=@contents.txt https://api.scanii.com/v2.1/files
HTTP/2 201 
date: Mon, 21 Sep 2020 11:52:31 GMT
content-type: application/json
content-length: 255
location: https://api.scanii.com/v2.1/files/fb7b800970fdaaac1a87d2b39bb5fb14
x-amzn-trace-id: Root=1-5f6893ff-a0676c31f9eecd8c82fc52e2;
x-runtime: 79ms
x-scanii-host-id: 4aad24e4
access-control-allow-origin: *
access-control-allow-headers: Authorization
x-scanii-request-id: 00661172-7c62-4d57-8fd6-5248ee880cf9
{
  "id" : "fb7b800970fdaaac1a87d2b39bb5fb14",
  "checksum" : "22596363b3de40b06f981fb85d82312e8c0ed511",
  "content_length" : 12,
  "findings" : [ ],
  "creation_date" : "2020-09-21T11:52:31.600929Z",
  "content_type" : "text/plain",
  "metadata" : { }
}
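Since tokens expire, a server handing them to clients may want to check the expiration_date returned at creation time before reusing one. The sketch below shows one way to do that; token_expired is a hypothetical helper name, and the token dictionary is the JSON payload from the token creation example above.

```python
from datetime import datetime, timezone


def token_expired(token, now=None):
    """Return True if the token's expiration_date is at or before `now` (UTC)."""
    # Rewrite the trailing "Z" so fromisoformat also works before Python 3.11.
    expires = datetime.fromisoformat(token["expiration_date"].replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires


# Payload copied from the token creation example above.
token = {
    "id": "6602a804f7232817676146592de3a667",
    "creation_date": "2020-09-21T11:48:34.824146Z",
    "expiration_date": "2020-09-21T11:49:04.824146Z",
}
```

This check relies on the local clock, so it is only a hint; a request made with an expired token would still need to be handled as an authentication failure.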

Congratulations, you have reached the end of our overview. Next, you should dig deeper into our Resources.
