Files

Information about files

Files are one of the most important types of objects in the VirusTotal API. Our dataset holds more than 2 billion files that VirusTotal has analysed over the years. A file object can be obtained by uploading a new file to VirusTotal, by searching for an existing file hash, or by other means when searching in VT Enterprise services.

A file object ID is its SHA256 hash.
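
Because the object ID is the file's SHA256 hash, the ID of a local file can be computed before any lookup. The following is a minimal sketch using only the Python standard library; the function name `file_object_id` and the chunk size are illustrative choices, not part of the VirusTotal API.

```python
import hashlib
from pathlib import Path


def file_object_id(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a local file.

    The digest is the ID of the corresponding File object.
    The file is read in chunks so large samples do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is 64 lowercase hexadecimal characters, the same form as the "id" and "sha256" values of a File object.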

Object Attributes

A File object contains basic attributes about the file and its relationship with VirusTotal:

  • capabilities_tags: <list of strings> list of representative tags related to the file's capabilities. Only available for Premium API users.
  • creation_date: <integer> extracted when possible from the file's metadata. Indicates when it was built or compiled. It can also be faked by malware creators. UTC timestamp.
  • crowdsourced_ids_results: <list of dictionaries> IDS (Snort and Suricata) matches for the file. If the file is not a PCAP, the matches are taken from a PCAP generated after running the file in a sandbox. Results are sorted by severity level, with only one item per matched alert. Every item on the list contains:
    • alert_context: <list of dictionaries> context for every match of that alert:
      • dest_ip: <string> destination IP.
      • dest_port: <integer> destination port.
      • hostname: <string> in case the alert is related to an HTTP event, destination hostname.
      • protocol: <string> communication protocol.
      • src_ip: <string> source IP.
      • src_port: <integer> source port.
      • url: <string> in case the alert is related to an HTTP event, destination URL.
    • alert_severity: <string> one of high, medium, low or info.
    • rule_category: <string> alert category description.
    • rule_id: <string> Suricata/Snort rule SID.
    • rule_msg: <string> alert description.
    • rule_source: <string> rule source, determined by SID range.
  • crowdsourced_ids_stats: <dictionary> IDS results stats:
    • high: <integer> number of matched rules having a high severity.
    • info: <integer> number of matched rules having an info severity.
    • low: <integer> number of matched rules having a low severity.
    • medium: <integer> number of matched rules having a medium severity.
  • crowdsourced_yara_results: <list of dictionaries> YARA matches for the file. Every item on the list contains the following attributes:
    • author: <string> rule author.
    • description: <string> matched rule description.
    • match_in_subfile: <boolean> whether the match was in a subfile or not.
    • rule_name: <string> matched rule name.
    • ruleset_id: <string> VirusTotal's ruleset ID.
    • ruleset_name: <string> matched rule's ruleset name.
    • source: <string> ruleset source.
  • downloadable: <boolean> true if the file can be downloaded, false otherwise. Only available for Premium API users.
  • first_submission_date: <integer> date when the file was first seen in VirusTotal. UTC timestamp.
  • last_analysis_date: <integer> most recent scan date. UTC timestamp.
  • last_analysis_results: <dictionary> latest scan results. For more information about its format, check the Analysis object results attribute.
  • last_analysis_stats: <dictionary> a summary of the latest scan results. For more information about its format, check the Analysis object stats attribute.
  • last_modification_date: <integer> date when the object itself was last modified. UTC timestamp.
  • last_submission_date: <integer> most recent date the file was posted to VirusTotal. UTC timestamp.
  • main_icon: <dictionary> icon's relevant hashes, the dictionary contains two keys:
    • raw_md5: <string> icon's MD5 hash.
    • dhash: <string> icon's difference hash. It can be used to search for files with similar icons using the /intelligence/search endpoint.
  • md5: <string> file's MD5 hash.
  • meaningful_name: <string> the most interesting name out of all the file's names.
  • names: <list of strings> all file names associated with the file.
  • reputation: <integer> file's score calculated from all votes posted by the VirusTotal community. To know more about how reputation is calculated, check this article.
  • sandbox_verdicts: <dictionary> a summary of all sandbox verdicts for a given file. Each key is the sandbox name and each value is a dictionary containing the following keys:
    • category: <string> normalized verdict category. It can be one of suspicious, malicious, harmless or undetected.
    • confidence: <integer> verdict confidence from 0 to 100.
    • malware_classification: <list of strings> raw sandbox verdicts.
    • malware_names: <list of strings> malware family names.
    • sandbox_name: <string> sandbox that provided the verdict.
  • sha1: <string> file's SHA1 hash.
  • sha256: <string> file's SHA256 hash.
  • sigma_analysis_stats: <dictionary> number of matched Sigma rules, grouped by severity.
    • critical: <integer> number of matched critical severity rules.
    • high: <integer> number of matched high severity rules.
    • low: <integer> number of matched low severity rules.
    • medium: <integer> number of matched medium severity rules.
  • sigma_analysis_summary: <dictionary> number of matched Sigma rules grouped by severity, same as sigma_analysis_stats but split by ruleset. Each key is a ruleset name and each value is the stats for that specific ruleset.
  • size: <integer> file size in bytes.
  • tags: <list of strings> list of representative attributes.
  • times_submitted: <integer> number of times the file has been posted to VirusTotal.
  • total_votes: <dictionary> unweighted number of total votes from the community, divided into "harmless" and "malicious":
    • harmless: <integer> number of positive votes.
    • malicious: <integer> number of negative votes.
  • type_description: <string> describes the file type.
  • type_extension: <string> specifies file extension.
  • type_tag: <string> tag representing the file type. Can be used to filter by file type in VirusTotal intelligence searches.
  • unique_sources: <integer> number of different sources the file has been posted from.
  • vhash: <string> in-house similarity clustering algorithm value, based on a simple structural feature hash. It allows you to find similar files.
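
Several of the attributes above are UTC timestamps expressed as integers. As a minimal sketch, assuming they are seconds since the Unix epoch (which matches the magnitude of the sample values in this section), they can be converted to timezone-aware datetimes as follows. The helper name `parse_dates` and the `DATE_FIELDS` tuple are hypothetical and only cover the date attributes listed above.

```python
from datetime import datetime, timezone

# Date attributes documented in the list above (all UTC timestamps).
DATE_FIELDS = (
    "creation_date",
    "first_submission_date",
    "last_analysis_date",
    "last_modification_date",
    "last_submission_date",
)


def parse_dates(attributes: dict) -> dict:
    """Map each date attribute that is present to a timezone-aware UTC datetime."""
    return {
        name: datetime.fromtimestamp(attributes[name], tz=timezone.utc)
        for name in DATE_FIELDS
        if name in attributes
    }
```

Attributes that are absent are skipped, since some of them (creation_date, for instance) are only extracted when possible.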

Additionally, alongside each antivirus scan, VirusTotal runs a set of tools that collect more information about the file. The output of these tools is included in the "attributes" key, together with the fields described above. The tools and the data they extract are documented in the subsections below.

{
    "data": {
        "attributes": {
            "capabilities_tags": [
                "<strings>",...
            ],
            "creation_date": <int:timestamp>,
            "crowdsourced_ids_results": [
                {
                    "alert_context": [
                        {
                            "dest_ip": "<string>",
                            "dest_port": <int>,
                            "hostname": "<string>",
                            "protocol": "<string>",
                            "src_ip": "<string>",
                            "src_port": <int>,
                            "url": "<string>"
                        }
                    ],
                    "alert_severity": "<string>",
                    "rule_category": "<string>",
                    "rule_id": "<string>",
                    "rule_msg": "<string>",
                    "rule_source": "<string>"
                }
            ],
            "crowdsourced_ids_stats": {
                "info": <int>,
                "high": <int>,
                "low": <int>,
                "medium": <int>
            },
            "crowdsourced_yara_results": [
                {
                    "author": "<string>",
                    "description": "<string>",
                    "match_in_subfile": <boolean>,
                    "rule_name": "<string>",
                    "ruleset_id": "<string>",
                    "ruleset_name": "<string>",
                    "source": "<string>"
                }
            ],
            "downloadable": <bool>,
            "first_submission_date": <int:timestamp>,
            "last_analysis_date": <int:timestamp>,
            "last_analysis_results": {
                "<string:engine_name>": {
                    "category": "<string>",
                    "engine_name": "<string>",
                    "engine_update": "<string>",
                    "engine_version": "<string>",
                    "method": "<string>",
                    "result": "<string>"
                }
            },
            "last_analysis_stats": {
                "confirmed-timeout": <int>,
                "failure": <int>,
                "harmless": <int>,
                "malicious": <int>,
                "suspicious": <int>,
                "timeout": <int>,
                "type-unsupported": <int>,
                "undetected": <int>
            },
            "last_modification_date": <int:timestamp>,
            "last_submission_date": <int:timestamp>,
            "main_icon": {
                "dhash": "<string>",
                "raw_md5": "<string>"
            },
            "md5": "<string>",
            "meaningful_name": "<string>",
            "names": [
                "<strings>",...
            ],
            "reputation": <int>,
            "sandbox_verdicts": {
                "<string:sandbox_name>": {
                    "category": "<string>",
                    "confidence": <int>,
                    "malware_classification": [
                        "<string>"
                    ],
                    "malware_names": [
                        "<string>"
                    ],
                    "sandbox_name": "<string>"
                }
            },
            "sha1": "<string>",
            "sha256": "<string>",
            "sigma_analysis_stats": {
                "critical": <int>,
                "high": <int>,
                "low": <int>,
                "medium": <int>
            },
            "sigma_analysis_summary": {
                "<string:ruleset_name>": {
                    "critical": <int>,
                    "high": <int>,
                    "low": <int>,
                    "medium": <int>
                }
            },
            "size": <int>,
            "tags": [
                "<strings>",...
            ],
            "times_submitted": <int>,
            "total_votes": {
                "harmless": <int>,
                "malicious": <int>
            },
            "type_description": "<string>",
            "type_extension": "<string>",
            "type_tag": "<string>",
            "unique_sources": <int>,
            "vhash": "<string>"
        },
        "id": "<SHA256>",
        "links": {
            "self": "https://www.virustotal.com/ui/files/<SHA256>"
        },
        "type": "file"
    }
}

Example of a File object:

{
    "data": {
        "attributes": {
            "capabilities_tags": [
                "str_win32_internet_api",
                "cred_ff",
                "win_mutex",
                "keylogger",
                "str_win32_winsock2_library",
                "sniff_audio",
                "network_dropper",
                "ldpreload",
                "win_files_operation",
                "str_win32_wininet_library",
                "inject_thread"
            ],
            "creation_date": 1589251011,
            "crowdsourced_ids_results": [
              {
                "alert_context": [
                  {
                    "protocol": "TCP",
                    "src_ip": "152.126.25.42",
                    "src_port": 80
                  }
                ],
                "alert_severity": "high",
                "rule_category": "Potential Corporate Privacy Violation",
                "rule_id": "32481",
                "rule_msg": "POLICY-OTHER Remote non-JavaScript file found in script tag src attribute",
                "rule_source": "snort"
              }
            ],
            "crowdsourced_ids_stats": {
                "high": 1,
                "info": 0,
                "low": 0,
                "medium": 0
            },
            "crowdsourced_yara_results": [
                {
                    "description": "Detects a very evil attack",
                    "match_in_subfile": true,
                    "rule_name": "evil_a_b",
                    "ruleset_id": "000abc43",
                    "ruleset_name": "evilness",
                    "source": "https://example.com/evil/ruleset"
                }
            ],
            "downloadable": true,
            "first_submission_date": 1592134853,
            "last_analysis_date": 1592141610,
            "last_analysis_results": {
                "ALYac": {
                    "category": "malicious",
                    "engine_name": "ALYac",
                    "engine_update": "20200614",
                    "engine_version": "1.1.1.5",
                    "method": "blacklist",
                    "result": "Trojan.GenericKDZ.67102"
                },
                "APEX": {
                    "category": "malicious",
                    "engine_name": "APEX",
                    "engine_update": "20200613",
                    "engine_version": "6.36",
                    "method": "blacklist",
                    "result": "Malicious"
                },
                "AVG": {
                    "category": "malicious",
                    "engine_name": "AVG",
                    "engine_update": "20200614",
                    "engine_version": "18.4.3895.0",
                    "method": "blacklist",
                    "result": "Win32:PWSX-gen [Trj]"
                },
                "Acronis": {
                    "category": "undetected",
                    "engine_name": "Acronis",
                    "engine_update": "20200603",
                    "engine_version": "1.1.1.76",
                    "method": "blacklist",
                    "result": null
                }
            },
            "last_analysis_stats": {
                "confirmed-timeout": 0,
                "failure": 0,
                "harmless": 0,
                "malicious": 3,
                "suspicious": 0,
                "timeout": 0,
                "type-unsupported": 0,
                "undetected": 2
            },
            "last_modification_date": 1592141790,
            "last_submission_date": 1592141610,
            "md5": "5a430646b4d3c04f0b43b444ad48443f",
            "meaningful_name": "o4oz44Z4E444.exe",
            "names": [
                "myfile.exe",
                "o4oz44Z4E444.exe"
            ],
            "reputation": 0,
            "sandbox_verdicts": {
                "VirusTotal Jujubox": {
                    "category": "malicious",
                    "confidence": 70,
                    "malware_classification": [
                        "MALWARE",
                        "TROJAN"
                    ],
                    "malware_names": [
                        "XMRigMiner"
                    ],
                    "sandbox_name": "VirusTotal Jujubox"
                }
            },
            "sha1": "54fdf53af86f90bf446f0a5fe26f6e4fd5f4c9fd",
            "sha256": "3f6fa13af90cf967f0b5f5d07f413f9d1f39d2fa366f09ff760fcd3fd8bf6fbf",
            "sigma_analysis_stats": {
                "critical": 0,
                "high": 0,
                "low": 2,
                "medium": 0
            },
            "sigma_analysis_summary": {
                "Sigma Integrated Rule Set (GitHub)": {
                    "critical": 0,
                    "high": 0,
                    "low": 2,
                    "medium": 0
                }
            },
            "size": 374272,
            "tags": [
                "peexe",
                "runtime-modules",
                "assembly",
                "direct-cpu-clock-access",
                "detect-debug-environment"
            ],
            "times_submitted": 3,
            "total_votes": {
                "harmless": 0,
                "malicious": 0
            },
            "type_description": "Win32 EXE",
            "type_extension": "exe",
            "type_tag": "peexe",
            "unique_sources": 3,
            "vhash": "2350f6f515f29f93f147f0f0"
        },
        "id": "3f6fa13af90cf967f0b5f5d07f413f9d1f39d2fa366f09ff760fcd3fd8bf6fbf",
        "links": {
            "self": "https://www.virustotal.com/ui/files/3f6fa13af90cf967f0b5f5d07f413f9d1f39d2fa366f09ff760fcd3fd8bf6fbf"
        },
        "type": "file"
    }
}
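
As an illustration of how the scan attributes fit together, the sketch below derives a short detection summary from the "attributes" dictionary of a File object. The function name `detection_summary` and the trimmed `sample_attributes` dictionary are hypothetical; the sample only reuses values from the example above. Counting every outcome in last_analysis_stats as the total is a choice made here, not a rule stated by the API.

```python
def detection_summary(attributes: dict) -> dict:
    """Summarise the latest scan: malicious count, total outcomes, flagging engines."""
    stats = attributes.get("last_analysis_stats", {})
    results = attributes.get("last_analysis_results", {})
    # Engines whose latest verdict category is "malicious", in alphabetical order.
    flagged = sorted(
        engine
        for engine, verdict in results.items()
        if verdict.get("category") == "malicious"
    )
    return {
        "malicious": stats.get("malicious", 0),
        # Sum of all outcome counters, including timeouts and failures.
        "total": sum(stats.values()),
        "flagged_by": flagged,
    }


# Trimmed copy of the example attributes above (engine entries shortened).
sample_attributes = {
    "last_analysis_results": {
        "ALYac": {"category": "malicious", "result": "Trojan.GenericKDZ.67102"},
        "APEX": {"category": "malicious", "result": "Malicious"},
        "AVG": {"category": "malicious", "result": "Win32:PWSX-gen [Trj]"},
        "Acronis": {"category": "undetected", "result": None},
    },
    "last_analysis_stats": {
        "confirmed-timeout": 0,
        "failure": 0,
        "harmless": 0,
        "malicious": 3,
        "suspicious": 0,
        "timeout": 0,
        "type-unsupported": 0,
        "undetected": 2,
    },
}

summary = detection_summary(sample_attributes)
```

For the sample data, `summary["malicious"]` is 3, `summary["total"]` is 5 and `summary["flagged_by"]` lists ALYac, APEX and AVG.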

Relationships

In addition to the previously described attributes (and the ones described in the following subsections), File objects contain relationships with other objects in our dataset that can be retrieved as explained in the Relationships section.

The following table shows a summary of available relationships for file objects.

| Relationship | Description | Accessibility | Return object type |
|---|---|---|---|
| analyses | Analyses for the file. | VT Enterprise users only. | A list of Analyses. |
| behaviours | Behaviour reports for the file. See File behaviour. | Everyone. | A list of File behaviour. |
| bundled_files | Files bundled within the file. | Everyone. | A list of Files. |
| carbonblack_children | Files derived from the file according to Carbon Black. | VT Enterprise users only. | A list of Files. |
| carbonblack_parents | Files from where the file was derived according to Carbon Black. | VT Enterprise users only. | A list of Files. |
| ciphered_bundled_files | Files within a ciphered bundle. | VT Enterprise users only. | A list of Files. |
| ciphered_parents | Compressed bundle files where a file is contained. | VT Enterprise users only. | A list of Files. |
| clues | Clues for the file. | Everyone. | A list of Clues. |
| comments | Comments for the file. | Everyone. | A list of Comments. |
| compressed_parents | Compressed files that contain the file. | Everyone. | A list of Files. |
| contacted_domains | Domains contacted by the file. | Everyone. | A list of Domains. |
| contacted_ips | IP addresses contacted by the file. | Everyone. | A list of IP addresses. |
| contacted_urls | URLs contacted by the file. | Everyone. | A list of URLs. |
| dropped_files | Files dropped by the file during its execution. | Everyone. | A list of Files. |
| email_attachments | Files attached to the email. | VT Enterprise users only. | A list of Files. |
| email_parents | Email files that contained the file. | VT Enterprise users only. | A list of Files. |
| embedded_domains | Domain names embedded in the file. | VT Enterprise users only. | A list of Domains. |
| embedded_ips | IP addresses embedded in the file. | VT Enterprise users only. | A list of IP addresses. |
| embedded_urls | URLs embedded in the file. | VT Enterprise users only. | A list of URLs. |
| execution_parents | Files that executed the file. | Everyone. | A list of Files. |
| graphs | Graphs that include the file. | Everyone. | A list of Graphs. |
| itw_domains | In the wild domain names from where the file has been downloaded. | VT Enterprise users only. | A list of Domains. |
| itw_ips | In the wild IP addresses from where the file has been downloaded. | VT Enterprise users only. | A list of IP addresses. |
| itw_urls | In the wild URLs from where the file has been downloaded. | VT Enterprise users only. | A list of URLs. |
| overlay_children | Files contained by the file as an overlay. | VT Enterprise users only. | A list of Files. |
| overlay_parents | Files that contain the file as an overlay. | VT Enterprise users only. | A list of Files. |
| pcap_children | Files contained within the PCAP file. | Everyone. | A list of Files. |
| pcap_parents | PCAP files that contain the file. | Everyone. | A list of Files. |
| pe_resource_children | Files contained by a PE file as a resource. | Everyone. | A list of Files. |
| pe_resource_parents | PE files containing the file as a resource. | Everyone. | A list of Files. |
| sigma_analysis | Last Sigma analysis results. | VT Enterprise users only. | A single Sigma Analyses object. |
| similar_files | Files that are similar to the file. | VT Enterprise users only. | A list of Files. |
| submissions | Submissions for the file. | VT Enterprise users only. | A list of Submissions. |
| screenshots | Screenshots related to the sandbox execution of the file. | Everyone. | A list of Screenshots. |
| votes | Votes for the file. | Everyone. | A list of Votes. |
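
The relationship names in the table above can be validated before a request is made. The sketch below builds a relationship URL for a file. The base URL and the `/files/{id}/{relationship}` path layout are assumptions that are not stated in this section, so check them against the Relationships section before relying on them. The helper `relationship_url` is hypothetical, and authentication is not shown.

```python
from urllib.parse import quote

# Assumed API base URL; not taken from this section.
API_BASE = "https://www.virustotal.com/api/v3"

# Relationship names copied from the table above.
FILE_RELATIONSHIPS = frozenset({
    "analyses", "behaviours", "bundled_files", "carbonblack_children",
    "carbonblack_parents", "ciphered_bundled_files", "ciphered_parents",
    "clues", "comments", "compressed_parents", "contacted_domains",
    "contacted_ips", "contacted_urls", "dropped_files", "email_attachments",
    "email_parents", "embedded_domains", "embedded_ips", "embedded_urls",
    "execution_parents", "graphs", "itw_domains", "itw_ips", "itw_urls",
    "overlay_children", "overlay_parents", "pcap_children", "pcap_parents",
    "pe_resource_children", "pe_resource_parents", "sigma_analysis",
    "similar_files", "submissions", "screenshots", "votes",
})


def relationship_url(file_id: str, relationship: str) -> str:
    """Build the (assumed) URL for one relationship of a file object."""
    if relationship not in FILE_RELATIONSHIPS:
        raise ValueError(f"unknown file relationship: {relationship!r}")
    # Percent-encode the ID defensively, although a SHA256 hex digest needs no escaping.
    return f"{API_BASE}/files/{quote(file_id, safe='')}/{relationship}"
```

Rejecting unknown names locally avoids spending a request on a typo. Whether a given relationship is actually accessible still depends on the account type shown in the Accessibility column.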