How do the different detection engines work?

Scanii is a content detection service. What it detects depends on which detection engines are enabled. Engines can be enabled and disabled per API key, so each key can carry its own configuration.

This page covers our detection engines in detail. If you are looking for API limits, see our API documentation: https://github.com/uvasoftware/openapi

Malware Detection

Scanii’s original malware detection engine has evolved significantly over the years. Today, we believe its detection capabilities are on par with leading commercial solutions, but what truly sets us apart is our speed, simplicity, and developer-first experience.

To ensure high reliability, we use a redundant scanning setup: our proprietary engine is paired with a top-tier commercial engine as a fallback. Specifically, we license the Sophos engine under an OEM arrangement to augment our own detection capabilities, providing a second layer of defense without compromising performance.

That said, we encourage customers to see for themselves: create a free Scanii account and try it out.

Key Features

Automatically decompresses and analyzes archive formats (e.g., ZIP, GZIP, RAR), unless they are password-protected.
Uses a meta-engine approach that combines multiple detection layers for increased accuracy.
Accurately identifies the industry-standard EICAR test file.
Supports all file types.
Maximum practical file size: under 2 GB.
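
The archive and size limits above can be checked on the client before a file is submitted. The following is a minimal Python sketch, not part of Scanii's API: the function names are hypothetical, the 2 GB limit is assumed to mean 2 × 1024³ bytes, and only ZIP archives are inspected (using the encryption flag that the ZIP format stores for each entry).

```python
import os
import zipfile

# Assumption: "under 2 GB" is read as strictly less than 2 * 1024**3 bytes.
MALWARE_MAX_BYTES = 2 * 1024**3


def zip_has_encrypted_entries(source) -> bool:
    """Return True if any entry in the ZIP archive is marked as encrypted.

    `source` may be a path or a binary file-like object. Bit 0 of each
    entry's general-purpose flag is set when the entry is password-protected.
    """
    with zipfile.ZipFile(source) as archive:
        return any(info.flag_bits & 0x1 for info in archive.infolist())


def malware_preflight(path: str) -> list[str]:
    """Return reasons the file may not be fully analyzed (hypothetical helper)."""
    problems = []
    if os.path.getsize(path) >= MALWARE_MAX_BYTES:
        problems.append("file is 2 GB or larger")
    if zipfile.is_zipfile(path) and zip_has_encrypted_entries(path):
        problems.append("ZIP archive contains password-protected entries")
    return problems
```

An empty list from `malware_preflight` means neither of the two checks found a problem. It says nothing about other archive formats such as RAR, which this sketch does not inspect.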

Sample Files

To test, use the EICAR test file or sample Office documents that contain the EICAR string; both will be detected.
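
If you do not have an EICAR file at hand, one can be generated locally. The sketch below is plain Python and independent of Scanii; the helper names are hypothetical. It writes the standard 68-byte EICAR test string, which is harmless but is deliberately detected by antivirus products, so local antivirus software may quarantine the file as soon as it is written.

```python
from pathlib import Path

# The standard 68-character EICAR test string, split in two so that this
# source file itself is less likely to be flagged by a local scanner.
_EICAR_PARTS = (
    r"X5O!P%@AP[4\PZX54(P^)7CC)7}$",
    r"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*",
)


def eicar_bytes() -> bytes:
    """Return the EICAR test string as ASCII bytes."""
    return "".join(_EICAR_PARTS).encode("ascii")


def write_eicar_file(path: str = "eicar.com") -> Path:
    """Write the EICAR test file to `path` and return the resulting path."""
    target = Path(path)
    target.write_bytes(eicar_bytes())
    return target
```

Calling `write_eicar_file()` produces a file that can then be submitted to the malware engine like any other upload.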

NSFW Image Detection

NSFW Image is a modern, AI-driven detection engine built from the ground up to detect adult, offensive, or otherwise inappropriate visual content. It leverages the latest in computer vision and machine learning for accurate and fast detection.

If you're concerned about inappropriate images being shared by users, enabling this engine helps mitigate that risk.

Key Features

Supports 100+ image formats (see supported types here).
Recommended minimum resolution: 640×480.
Maximum practical file size: 4 MB.
Built using state-of-the-art AI models for visual content analysis.
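
The size and resolution guidance above can be turned into a quick client-side check. This is a minimal sketch with hypothetical names, not a Scanii API. It assumes that 4 MB means 4 × 1024² bytes and that the 640×480 recommendation applies regardless of orientation (long side at least 640 pixels, short side at least 480). The caller supplies the pixel dimensions, for example from an imaging library.

```python
from dataclasses import dataclass

# Assumptions: 4 MB = 4 * 1024**2 bytes; 640x480 is orientation-independent.
IMAGE_MAX_BYTES = 4 * 1024**2
MIN_LONG_SIDE = 640
MIN_SHORT_SIDE = 480


@dataclass
class ImageInfo:
    """Basic facts about an image, gathered by the caller."""
    size_bytes: int
    width: int
    height: int


def image_preflight(image: ImageInfo) -> list[str]:
    """Return warnings for an image that falls outside the stated guidance."""
    warnings = []
    if image.size_bytes > IMAGE_MAX_BYTES:
        warnings.append("file is larger than 4 MB")
    long_side = max(image.width, image.height)
    short_side = min(image.width, image.height)
    if long_side < MIN_LONG_SIDE or short_side < MIN_SHORT_SIDE:
        warnings.append("resolution is below the recommended 640x480")
    return warnings
```

Because the resolution is a recommendation rather than a hard limit, the sketch returns warnings instead of raising an error.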

Sample Files

We do not currently provide NSFW image samples, but such content is readily available for testing purposes.

NSFW Language Detection

The NSFW Language engine detects profane, offensive, or inappropriate language in nearly all file types, including images of documents. Even a phone camera photo of printed text with inappropriate language can be flagged.

It supports 23 languages and is powered by an open-source YARA ruleset.

🔗 NSFW Language YARA rules on GitHub

Key Features

Detects offensive language in both plain text and text extracted from images via OCR.
Supports all major file formats.
Built on open-source YARA rules, customizable for your use case.
Maximum practical file size: under 100 MB.
Note: Does not support handwritten text.
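
To give a feel for what rule-based language detection involves, here is a simplified Python sketch. It is not YARA and not Scanii's rule set: it swaps in a plain regular expression for whole-word, case-insensitive matching, and the word list is a harmless placeholder. All names are hypothetical.

```python
import re

# Placeholder terms for illustration only; not taken from Scanii's rules.
PLACEHOLDER_TERMS = ["darn", "heck"]


def build_matcher(terms):
    """Compile a case-insensitive, whole-word pattern for the given terms."""
    # Longer terms first so that overlapping alternatives prefer the longest.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def find_flagged_terms(text, matcher):
    """Return the distinct matched terms, lowercased and sorted."""
    return sorted({m.group(0).lower() for m in matcher.finditer(text)})
```

For example, with `matcher = build_matcher(PLACEHOLDER_TERMS)`, the call `find_flagged_terms("Well, DARN it. Heck!", matcher)` returns `["darn", "heck"]`, while `"darning needles"` yields no matches because only whole words are considered.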

Sample Files

We do not currently provide samples, but any file (e.g., Word, PDF, TXT) containing profane or offensive language will be flagged appropriately.

FAQ

What kind of content/files does the malware engine identify as malicious?

Scanii's malware detection engine is carefully optimized for an ultra-low false positive rate. This means that legitimate files are almost never incorrectly flagged as unsafe.

Our product is designed to operate at the edge, identifying harmful content before it reaches your systems. In this context, a false positive can have real-world consequences, such as preventing a student from uploading homework or blocking an essential business document.

To ensure reliability, we focus on detecting genuinely malicious content without overreaching. For example, unlike some overly aggressive engines, we do not automatically flag all Word documents with embedded content as malicious. Such an approach would significantly increase false positives, undermining trust and usability.

Scanii prioritizes accuracy and practicality: catching the threats that matter while letting your users work without unnecessary interruptions.