Scannedonly scalable samba anti-virus module

Does scannedonly work with other virus scanners?

It would be fairly easy for the creators of other anti-virus products to implement a daemon for the scannedonly samba module. Because ClamAV is open source and provides libClamAV, the author of the scannedonly samba VFS module was able to write a daemon for ClamAV. Release 0.20 also has an experimental daemon that connects to F-prot.

What about samba 3.5? What do I need now?

Samba 3.5 already contains the scannedonly VFS module, so you no longer need to build that part yourself. It does not contain the anti-virus scanning daemon, so you still need to install scannedonly: just don't specify --with-samba-source when you run ./configure.

The VFS module shipped with Samba may lag behind by a couple of features, because it takes longer for patches to be included in Samba.

What are the differences between samba-vscan and scannedonly?

Samba-vscan is a samba anti-virus module that performs on-access scanning through a daemon. If a file is downloaded three times, it will be scanned three times. When a file is requested, it is scanned first, and it is transferred to the user only if the scan finds nothing.

In contrast, scannedonly is a samba module that on-access only checks if the file has been scanned in the past. If not, opening fails. In addition to this check, it also notifies the daemon that scanning of this file is requested.
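The on-access decision described above can be sketched in a few lines of Python. This is only an illustration, not the module's actual code (the real module is a Samba VFS module). It assumes that the marker for a file is named ".scanned:" followed by the file name and lives in the same directory; the names marker_path, may_open and notify are hypothetical.

```python
import os
from typing import Callable

# Assumption: the marker is ".scanned:" + file name, in the same directory.
MARKER_PREFIX = ".scanned:"


def marker_path(path: str) -> str:
    """Return the assumed marker path that sits next to `path`."""
    directory, name = os.path.split(path)
    return os.path.join(directory, MARKER_PREFIX + name)


def may_open(path: str, notify: Callable[[str], None]) -> bool:
    """Allow the open only if a marker exists; otherwise request a scan."""
    try:
        os.stat(marker_path(path))  # a single stat() call decides
        return True
    except FileNotFoundError:
        notify(path)  # ask the daemon to scan; this open is refused
        return False
```

Here notify stands for whatever mechanism hands the path to the scanning daemon. The first call for an unscanned file returns False and triggers the notification; once the marker exists, every later call returns True without any scanning.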

Samba-vscan advantages over Scannedonly

With samba-vscan, files are scanned on-access, so a clean file on the server is always visible and available to the client. With scannedonly, file access may be denied on the first request. That first request notifies the daemon, the file is scanned, and only on a second request is the file available.

Scannedonly advantages over Samba-vscan

With scannedonly, files are scanned only once. With samba-vscan, a popular file may be scanned hundreds of times, which can have a significant impact on the server load. With scannedonly, once the file is scanned, it is immediately available to the client. Even if the scanning daemon is down, all these files stay available. Samba-vscan needs to scan each file on-access: if a file takes a long time to scan (large zip files), the connection may time out before the first byte is sent to the client. And if the scanning daemon is down, not a single file is available.

Can I run multiple daemons on a single host?

Yes you can. Use a different socket for each daemon.

Can I run the daemon on a different host from the samba module?

It is possible to perform anti-virus scanning on a different server than the samba server. This can be useful if for example your file server does not have the CPU power to do scanning. Install the daemon on the scanning server, and make it listen on UDP. It needs direct access to the same files. You can for example share the same files with NFS, or configure a separate samba share that doesn't have scannedonly configured. Mount the network filesystem in such a place that the pathnames are identical. The samba vfs module passes the path of the file to scan to the daemon, and not the contents of the file. Now install the scannedonly samba vfs module on the samba server, and configure UDP and the scanhost. You now have a separate scanning server and file server.
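The key point of this setup is that only the path travels over the network, never the file contents. A minimal Python sketch of such a notification follows. The wire format shown here (the UTF-8 path followed by a newline) and the function name request_scan are assumptions made for illustration; they are not taken from the scannedonly sources, and the port is left as a parameter for the same reason.

```python
import socket


def request_scan(path: str, scanhost: str, port: int) -> None:
    """Send the path of a file (not its contents) to the daemon over UDP."""
    # Assumed wire format: the UTF-8 encoded path followed by a newline.
    message = path.encode("utf-8") + b"\n"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (scanhost, port))
```

Because the daemon receives nothing but a path, it can only scan the file if that exact path resolves to the same file on the scanning server, which is why the network filesystem has to be mounted at an identical location.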

Why a .scanned: file? It doubles the number of files.

There are several obvious alternatives. A database could have been an option. However, the samba module would have to connect to the database and query it for each file, the database would have to look up the filename, and the result would have to be returned over the connection. This adds a lot of latency to each file access and degrades performance. Besides, the database probably takes more space on your filesystem than those 0 byte .scanned: files.

Extended filesystem attributes could have been an option. They take as much space as the 0 byte .scanned: files, and a lookup is quick and has little overhead. However, lots of filesystems do not support extended attributes, so this would limit the usability of the module.

So why the .scanned: files? They take up only an inode, because they are 0 bytes. To test whether a file has been scanned, only a stat() call on the filesystem is needed, which is very quick compared to a database lookup. All modern filesystems use database technology such as balanced trees for lookups anyway. The number of inodes in modern filesystems is also no longer a limiting factor. The .scanned: files are also easily scriptable. You can remove them with a simple find command or create them with a simple touch command.
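As an illustration of how easily the markers can be scripted, here is a small Python sketch that does the equivalent of the touch and find commands mentioned above. It assumes that a marker is named ".scanned:" followed by the file name and sits in the same directory; the function names mark_scanned and remove_markers are hypothetical.

```python
import os

# Assumption: the marker is ".scanned:" + file name, in the same directory.
MARKER_PREFIX = ".scanned:"


def mark_scanned(path: str) -> str:
    """Create an empty marker next to `path`, like touch would."""
    directory, name = os.path.split(path)
    marker = os.path.join(directory, MARKER_PREFIX + name)
    with open(marker, "a"):
        os.utime(marker, None)  # refresh the timestamp if it already exists
    return marker


def remove_markers(root: str) -> int:
    """Delete every marker below `root`, like find with a delete action."""
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(MARKER_PREFIX):
                os.remove(os.path.join(dirpath, name))
                removed += 1
    return removed
```

Removing all markers forces every file to be rescanned on its next request, which can be useful after a virus definition update. Creating a marker by hand makes a file available without scanning, so that should be done with care.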
