Ian Whalley
IBM TJ Watson Research Center, PO Box 704, Yorktown Heights, NY 10598, USA
Tel +1-914-784-7808 · Fax +1-914-784-6054 · Email [email protected]
In the field of computing, Trojan horses have been around for even longer than computer viruses – but traditionally have been less of a cause for concern amongst the community of PC users. In recent years, however, they have been the focus of increased attention from anti-virus companies and heightened levels of user concern.
This paper aims to investigate the Trojan phenomenon; particular attention will be paid to the claims made by vendors in the field of non-viral malware (NVM) detection, and to those made by the parties who aim to test the vendors’ claims.
In addition, various attempts to define Trojan horses will be evaluated, and a new definition will be suggested. It is not expected that any such definitions will solve any of the basic problems, but they will help shed light on certain aspects.
The author will also investigate further techniques to test the claims of software vendors in this very difficult area.
As the 1990s draw to a close, the anti-virus industry is undergoing a sea change. Those who have followed the AV industry for any length of time are probably bored of hearing such sweeping generalisations, but this author believes his statement to be correct, and will attempt to justify it.
The risks to an average PC in 1999 are very different from the risks to that same PC in the early- and mid-90s, and the manufacturers of traditional anti-virus products are finding that they have to change their products in order to deal with these new threats. However, in order for the transition to be successful, other parts of the anti-virus industry will need to change along with the products.
The most obvious trend at the time of writing is in the area of network-aware viruses. In recent months we have seen more and more occurrences of this type of malware, and this threat looks set to grow as we enter the next century. The problem of malware that spreads using both the Internet and local-area networks is all too evident, and the art of defending against such malware is urgently in need of a boost.
There is concern, however, over another threat – the Trojans of the title.
It may initially seem somewhat odd that this paper sets out to discuss Trojan horses – Trojans have been around for a while, after all, and surely nothing much has changed. Indeed, the fact of the matter is that the average computer user is not encountering Trojan horses with any regularity (or indeed at all!) – the reader is referred to [1] and [2] (updated in [3]) for figures and a cogent analysis of the overall Trojan situation.
The standard example of an environment in which the threat of Trojan horses is very real and prevalent is America On-Line (AOL). There are several reasons why AOL is an ideal environment for Trojan horses – including:
If these facts are combined, the result is an environment in which the majority (those who know very little) are wide open to abuse from the minority (those who are experts with computers in general and AOL in particular).
Not surprisingly, such abuse has been taking place, with varying degrees of success, over the last few years, and looks set to continue into the future. The techniques will be entirely laughable to most people reading this paper, but that doesn’t stop them being successful and worthwhile to the attackers. The precise techniques are not really important, and there are many variations on a theme. A typical scenario would be as follows:
Bob, an AOL user, receives a message from a stranger (call her Alice) containing a program described as a patch for the AOL client that will speed up his access by 400%. Bob runs the program; it appears to do nothing, but in fact it quietly captures his AOL account information and transmits it to Alice.
The phenomenon of the so-called ‘AOL Trojan’ has been well documented in the past, and the reader is referred in particular to [4] for more information on the subject.
It is a sad fact that this type of attack is very easy to carry out. Thanks to modern programming languages and environments, writing a program of the type outlined above is well within the reach of many people. This trend – of bringing programming to the masses – is of course to be applauded; however, it is a mixed blessing. For example, if more people knew how to break into the average family car, there would be a higher rate of car crime. If more people knew how to commit credit card fraud in near untraceable ways, there would be a higher rate of credit card fraud. In the same way, as more people discover how easy it is to produce this type of Trojan, more and more people will start to produce it, and the problem will increase.
Let us suppose that, in the example above, Bob is running the top-of-the-line anti-virus product, McNortSophOlomon Anti-Virus (MNSOAV). Bob has kept the signatures in his copy of MNSOAV up-to-date – once a week it updates itself automatically from the Internet. From Bob’s point of view, he should have been protected – with the single (albeit significant) exception of running mysterious software sent to him from an untrusted source. However, Bob could argue that the precise reason he pays his subscription fees to the McNortSophOlomon Corporation is that he wishes to be protected if and when he forgets himself and does something silly. After all, when Bob is buying a car, he looks for one with airbags – not because he intends to go right ahead and drive his car at high speed into a brick wall, but so that if and when an unfortunate accident does occur, he will be protected.
However, the McNortSophOlomon Corporation could argue that its product is, after all, called MNSO Anti-Virus, and Bob is therefore not entitled to assume that he is protected against Trojans. Upon learning this, Bob can promptly cancel his subscription to MNSOAV (which, after all, isn’t helping him and doesn’t do what he wants), and subscribe instead to a competing product, Dr InocuTonAfeeOs Anti-Virus (DITAOAV), which does advertise itself as detecting Trojans.
Not only does DITAOAV advertise itself as detecting such things, Bob has done his research. DITAOAV has performed admirably in some recent tests examining the ability of this type of product to handle Trojan horses, and Bob feels reassured.
What of the tests in which Bob has put his trust? Consumer tests are essential in the modern world, due to the incredible complexity of the products available – after all, when Bob buys his car with airbags, he is unlikely to do his own consumer testing, unless he has a considerable amount of skill, time, and money (and luck…). Of course, matters are not this clear-cut in the AV arena – because Bob is clearly aware of viruses and the like, it is likely that he has kept a copy of any viruses he has encountered over time. This is standard practice, and whilst AV companies have attempted to discourage it in the past, it is a reasonable base-line test that Bob can use to make some trivial decisions about his anti-virus product. However, Bob is a knowledgeable chap, and realises that the tests he can do on his own are not enough. Therefore, he peruses the results of third-party tests, and uses them to help him decide which products he should consider and which he should not (for more information, the reader is referred to [9] & [12]).
Some time ago, testing agencies realised that it would be necessary for them to introduce tests to determine the ability of anti-virus products to handle Trojans, as two things were becoming increasingly evident:
– anti-virus products were increasingly being advertised as capable of detecting Trojans; and
– users of those products wanted to know whether such claims were justified.
These two reasons are nothing new – they are the same reasons that have justified traditional anti-virus product testing for many years; there is a demand for some form of comparison testing of the available products, and so a supply of such tests inevitably results.
It is interesting to pause for a moment and ponder – given the research documented in [1], why are manufacturers and users even remotely interested in Trojans? It appears to be another case of the ‘positive feedback effect’. That is to say, as soon as one producer takes the leap and advertises its products as capable of defending against Trojans, other producers have to jump into the game as well. Both the producers and the users then demand (for different reasons) testing of these ‘new’ features. Producers want some way to differentiate product from product, and tests can provide this; users want to know if the product they are considering buying is capable of doing that which it claims to do. The reader is referred to [9], [10] & [11] for more information on the symbiotic relationship between producers and reviewers of anti-virus products, and other related topics.
As demonstrated above, the marketplace has decided (in the parlance of capitalism) that anti-Trojan testing is required. Given this fact, an examination of the current tests involving Trojans is in order. The following sections cover the four major professional testing bodies.
7.1 University of Hamburg Virus Test Centre
The Virus Test Centre (VTC) is an academic research institution that publishes reviews of anti-virus software at approximately six-monthly intervals [15]. VTC does test the ability of anti-virus software to detect Trojans – see [16]. In particular, note the following:
As VTC malware tests demonstrated that almost all AV products are able to detect a significant part of malware, VTC now considers it´s [sic] malware test as *mandatory part of VTC tests*. Indeed, all essential AntiVirus manufacturers agreed that their product also be tested against VTC´s malware testbeds (one enterprise did not [sic] agree and was dropped from the list of products to be included in VTC tests). [17]
Unfortunately, VTC could not be contacted in time for this paper’s submission.
7.2 West Coast Labs
West Coast Labs is owned by West Coast Group (a privately held company located in Swansea, UK), which also owns the security publication ‘Secure Computing’ [19]. West Coast Labs is responsible for the Checkmark scheme for certifying security-related products – the anti-virus Checkmark scheme of the time is discussed in [9], although changes have been made since then [18].
West Coast Labs have recently introduced a ‘Trojan Checkmark’, part of the description of which is as follows:
For a product to be certified to Trojan Checkmark, the product must be able to detect all trojans in the West Coast Labs Trojan test suite. The product should not cause any false alarms (based on testing against the West Coast Labs false alarm test suite). …
A trojan test suite is maintained by West Coast Labs. The list of trojans will be published by West Coast Labs (possibly on the Checkmark web site) and copies of Trojans will be provided at the discretion of West Coast Labs to bona fide solution developers and members of the Trojan Checkmark scheme. …
Note
(a) The Trojan Checkmark may be subject to change as a result of developments in the field of trojan threats and testing.
(b) Additional levels of certification will be introduced as appropriate and in due course.
(c) Developers agree, as a condition of registering for the Trojan Checkmark, to provide copies of their collections of trojans to West Coast Labs. [20]
Note carefully the quoted provisions. Upon signing up for the Checkmark scheme, vendors provide the Trojans which they have in their collections to West Coast Labs. In addition, producers in the scheme can receive copies of at least part of the West Coast Labs test set.
West Coast states that it found maintaining the list of Trojans mentioned in [20] to be ‘totally impractical, as they came in faster than anticipated, so this idea had to be abandoned’ [21]. In addition:
I believe that all of our Trojan samples have been supplied by AV companies rather than drawn from the Wild. First we try to match them to existing samples; if they aren't already represented then we confirm that they are Trojans rather than viruses or innocent programs by examining their functions, either by analysis or execution (we find names reported by AV products are often a good indication of the file's functions). Borderline cases are omitted (as they are elsewhere in our collection – for instance, we decided not to include in our tests ALREADY.COM)
Apart from an occasion when Trojans were damaged in the process of sending them to the customer, our samples haven't been questioned. [21]
7.3 Virus Bulletin
Virus Bulletin [22] is a privately held company located in Abingdon, UK, and is under the same ownership as anti-virus company Sophos. At the time of writing, Virus Bulletin does not perform any type of anti-Trojan testing [23].
7.4 ICSA
The International Computer Security Association (ICSA) [24] is a privately held company located in Virginia, USA. At the time of writing, the ICSA does not perform any type of anti-Trojan testing [25]; however, it is involved in the creation of a group to examine the problem of defining what is and what is not a Trojan – for more information, see section 10.
It is useful to consider the requirements for building a good test-set of viruses – as will be seen, the requirements for constructing a set of Trojans are not as different as might have been thought.
When building a set of viruses against which to test anti-virus products, the tester is required to verify the viral nature of each sample in his set. This is a worthwhile exercise for the tester for reasons other than technical exactitude – if vendors cast aspersions on his results, the tester can be certain, and can demonstrate, that his samples are valid. This most basic requirement of test-set construction has been documented in several works – refer to [13], [26], and the works they reference.
This requirement is just as fundamental to the art of anti-Trojan product testing – without ‘proof’ that the samples the tester is using genuinely represent Trojans, the tester has no hope of performing tests which are either reputable or repeatable. Therein lies a fundamental problem.
8.1 The verification problem
How can the tester prove that the samples he has in his test-set are valid samples of Trojans? If the tester were doing tests against samples of viruses, such verification would be fairly simple, if time-consuming. The tester would simply take samples from the test-set, and replicate them. If the samples could be replicated, then the matter is closed – the samples are viruses. Of course, there are other issues – such as whether they are samples of the correct virus (particularly relevant for ItW [14] tests) – but these are secondary in this discussion to the fundamental question of ‘Is it a virus?’
In the case of viruses, the sample verification process can be rolled into the initial sample creation process – when the tester makes his initial samples, he can produce additional generations of replicants in order to verify that the samples he will use for his test-set are capable of replicating correctly. The tester will wish to document the fact that he has done this, and also document the system upon which the virus was replicated – this information will save time later on when and if he needs to prove the validity of the samples in the set.
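As an illustration of this record-keeping, a verified sample might be captured in a structure like the following – a minimal sketch in Python, in which the structure and all field names are this author’s invention rather than part of any established tool:

```python
# Hypothetical record of a verified virus sample; names are illustrative.
import hashlib
from dataclasses import dataclass

@dataclass
class VerifiedSample:
    name: str              # name of the virus the sample represents
    sha1: str              # fingerprint of the exact sample file
    replication_host: str  # system on which replication was observed
    generations: int       # generations produced during verification

def record_sample(path: str, name: str, host: str, generations: int) -> VerifiedSample:
    """Fingerprint a sample file and record the replication evidence for it."""
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return VerifiedSample(name, digest, host, generations)
```

With such a record on file for every sample, the tester can later demonstrate exactly what was verified, on which system, and to how many generations.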
This system only works thanks to the very basic definition of ‘virus’ – which for the purposes of this paper will be stated as ‘code which can make copies of itself’ [28]. Provided the tester can prove that his samples fulfil this requirement, he is on reasonably firm ground.
However, Trojan horses do not have this requirement – and this leaves both testers and producers with a remarkably basic question.
Below (in no particular order) are some previous definitions of the phrase ‘Trojan horse’:
A Trojan horse is a program which performs (or claims to perform) something useful, while in [sic] the same time intentionally performs, unknowingly to the user, some kind of destructive function. This destructive function is usually called a payload. [26]
A Trojan horse is a program which performs functions other than those stated in its specifications. These functions can be (and often are) malicious. [27]
A Trojan horse is, as the name suggests, a program which is allowed onto the user’s PC under false pretences, whereupon it has undesirable side effects. [28]
Trojan horse: A computer program with an apparently or actually useful function that contains additional (hidden) functions that surreptitiously exploit the legitimate authorizations of the invoking process to the detriment of security. [29]
A program which someone tells you is legitimate software, but which actually does something other than what the person claims it will do. [2] (Emphasis preserved from original)
A program which the user thinks or believes will do one thing, and which does that thing, but which also does something additional which the user would not approve of. [3]
These definitions are from various eras of computing – from the mid-80s to the present day – and in some senses reflect the changing nature of the problem. Almost inevitably, some are more useful and appropriate than others; of particular relevance, in the opinion of this author, is the final definition above [3]. However, a suggested new version is:
A program which the user thinks or believes will do one thing (the ‘perceived purpose’), and which may or may not do that thing, but which also does something else which is not necessary to accomplish the perceived purpose, and of which the user would not approve (the ‘payload’).
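The structure of this definition can be made explicit with a little notation – the following formalisation is this author’s own gloss, not part of the definition itself:

```latex
\mathrm{Trojan}(p, u) \iff \exists\, a \in \mathrm{does}(p) :\;
  a \notin \mathrm{necessary}\big(\mathrm{perceived}(p, u)\big)
  \;\wedge\; \neg\,\mathrm{approves}(u, a)
```

where p is the program, u the user, does(p) the set of actions p actually performs, perceived(p, u) the purpose of p as perceived by u, necessary(·) the actions needed to accomplish that purpose, and approves(u, a) the judgement of u on action a.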
The reader will notice that this definition is entirely subjective in nature – however, this is also true of the definitions suggested by other authors listed above. The main difference is that many of the other definitions do not make their subjectivity clear – the author’s definition (and the definition from which it is derived) make that subjectivity explicit.
It seems clear to this author that it is not possible to produce a definition of Trojan that is not subjective in nature. Bearing this in mind, the advantages and disadvantages of the above definition are discussed below.
9.1 Advantages of the suggested definition
9.1.1 No mention of ‘intent’
Many of the current definitions of Trojan refer to the program ‘intentionally’ doing something bad to the host system. In the opinion of this author, if Bob executes a program which is clearly described in the accompanying documentation as a word processor, but which formats some of the tracks on his hard drive (or damages his Linux installation, or some other ‘bad thing’), intent is not an issue. That program is a Trojan whether or not the author wrote the destructive routines deliberately (and with malice aforethought).
9.1.2 No requirement to carry out perceived purpose
To use the earlier example, when Bob receives a program from Alice that is described as a patch for the AOL client to speed up access 400%, that becomes its perceived purpose. If that program actually does something bad to Bob’s computer, or transmits his account information to Alice, or anything along those lines, that program is a Trojan even if it does not also carry out the perceived purpose.
9.1.3 No requirement for the payload to be destructive
The inclusion of words like ‘destructive’ and ‘damaging’ in some definitions needlessly restricts those definitions. If Alice’s program transmits Bob’s account information to Alice, that is without doubt an action of which Bob would not approve. However, it is not in and of itself destructive – it may become indirectly destructive (in some senses), depending on what Alice does with the information.
9.1.4 No mention of specification
The word ‘specification’ has many connotations in the field of computer science, none of which apply particularly well to the field of the user’s expectations of what a given piece of software is going to do. Talking instead of the perceived purpose of that piece of software more clearly illustrates the situation, whilst leaving the precise definition very open.
The perceived purpose is something that will vary from person to person – this is absolutely essential to a modern definition of a Trojan. Not only that, but it will vary according to the way in which a given program is distributed. For more discussion of this point, see section 9.3.
9.2 Disadvantages of the suggested definition
9.2.1 No mention of specification
One of the advantages mentioned above (section 9.1.4) is also a disadvantage – the fact that whether or not something is a Trojan can vary from person to person, and according to the way the object is presented, is a real problem for those producers attempting to defend against them. Again, for more discussion of this point, see section 9.3.
9.2.2 No mention of ‘intent’ (or: Bugs are Trojans?)
A side effect of removing the requirement for intent is that, under some circumstances, bugs could be considered to be Trojans. As mentioned above (section 9.1.1), if the bug causes behaviour ‘of which the user would not approve’, then this is entirely correct. However, if the bug causes a trap, or causes non-fatal errors (for example, it causes Word to make irrational and apparently random claims of incorrect grammar), it could be argued that this too is behaviour ‘of which the user would not approve’ – and yet it is surely unreasonable to classify the program as a Trojan simply because of insufficient sanity checking or a poor grammar checker.
This is due to the entirely subjective nature of the definition. However, within the boundaries of the definition, it is valid – some bugs (the accidental destruction of data on the disk mentioned above, for example) would indeed cause many people to classify the program exhibiting them as a Trojan.
9.3 The problem of perceived purpose
From the point of view of both producers and testers, introducing the concept of a program’s perceived purpose to the definition is a real problem, for a number of reasons, a couple of which are briefly mentioned below – the reader is no doubt able to think of several others.
9.3.1 Renaming
As briefly mentioned above, the perceived purpose of a given program can vary according to how that program is presented. This is true right down to that most basic of operations, renaming an executable. In many cases (particularly on the Internet), the filename is the only indication which a user has pertaining to the supposed purpose of the program.
For example, consider the case of the malware Happy99 (also known as Ska). This virus attacked Windows 9x TCP/IP services, and was able to ‘watch’ what the user was doing, and then send itself via email and Usenet postings to the same people and groups to which the user was posting. Messages sent by Happy99 arrived with no supporting text, merely an included executable file called HAPPY99.EXE [30]. The recipient of such a message has no indication of the nature of the attachment other than its name.
9.3.2 Other supporting documentation
Even if producers were able to account in some way for the name of a program, that is not enough. The program may come with other documentation (in the form of README.TXT, help files, even printed manuals), all of which can affect the perceived purpose of the program itself.
Now, the situation seems impossible – and it may very well be. The perceived purpose of a program will almost inevitably change depending on whether or not the user actually reads the documentation! What if the documentation is in a language that the user does not speak? What if the documentation has been mistranslated into the user’s native language, and the meaning has been damaged?
The fact that simply looking at the bytes which make up a given object is no longer enough to determine whether or not it should be flagged as a risk has far-reaching consequences. In the opinion of this author (and fortunately for users), the scenarios described above are comparatively unlikely to occur in significant numbers in the user community. However, this does not alter the importance of at least attempting to resolve some of these issues in the world of anti-Trojan testing.
To return to the more general problem of determining the perceived purpose of a given program, one of the testing agencies mentioned above is in the course of setting up an interesting-looking project to attempt to define the scope of the problem. Some of the plans for this project are outlined below.
As an attempt to move some distance towards resolving the problems discussed above, the ICSA is setting up a group of people to help construct a Trojan test-set. This is in response to requests from corporate customers of the ICSA, who perceive a need for such testing (given the research in [1], [2] & [3], presumably this is due at least in part to the factors outlined in section 6).
Current plans are that the group will be overseen by a board made up of members drawn from the ICSA, the corporate world, academia, and AV producers. Beneath the board sits a larger number of people from each of the above categories – between them, they will attempt to decide whether or not something is malware.
Each member of the group will examine submissions in an attempt to determine whether or not a given program is malware. For a given program, each member can (fundamentally) vote one of three ways – ‘definitely malware’, ‘definitely not malware’, or ‘can’t tell’. If the group is split, weighting factors are applied to each possible vote, and the combination of those weighted votes will place the suspect program in one of the three possible categories.
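To make the mechanism concrete, the following is a minimal sketch of such a weighted vote in Python – the ICSA has not, to this author’s knowledge, published its weights or thresholds, so the values below are purely illustrative:

```python
# Hypothetical weighted-vote combination; weights and threshold are assumed.
MALWARE, NOT_MALWARE, CANT_TELL = "definitely malware", "definitely not malware", "can't tell"

WEIGHTS = {MALWARE: 1.0, NOT_MALWARE: -1.0, CANT_TELL: 0.0}

def classify(votes, threshold=0.5):
    """Combine the members' votes into one of the three verdicts."""
    score = sum(WEIGHTS[v] for v in votes) / len(votes)
    if score >= threshold:
        return MALWARE
    if score <= -threshold:
        return NOT_MALWARE
    return CANT_TELL

# A split jury leaves the program unclassified:
# classify([MALWARE, MALWARE, CANT_TELL, NOT_MALWARE]) -> "can't tell"
```

Whatever the actual weights, the essential property is that a divided jury produces a ‘can’t tell’ verdict rather than a forced decision.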
In essence, the ICSA is proposing trial by jury for suspected malware – a system which should logically be one of the best available, given the entirely subjective nature of the definition of Trojan. A group of people, skilled in the art, must reasonably conclude that the suspect program is a Trojan before it is declared guilty.
Whilst this is, on the face of it, a promising idea, there are several factors that could cause it difficulties:
However, this author does not wish to predict doom for such a group before it has got off the ground – the results of an appropriately even-handed set of deliberations, such as are the apparent intention of the group, will undoubtedly make for an interesting research topic in and of themselves, and could shed considerable light on this murky area.
As an example, it would be educational to give the group copies of the following pieces of software, and ask them to determine whether or not they are malware:
It would also be instructive to determine whether or not the opinions of the group’s members on each of the above programs were altered depending upon whether or not they were aware of the controversy surrounding these programs. The reader is referred to [34], [35] & [36], in addition to numerous threads on the alt.comp.virus newsgroup.
What, then, of anti-Trojan testing? As discussed previously (see section 8), the requirements for anti-Trojan testing are harder to define than those for anti-virus testing. However, this author suggests the following process for building a Trojan test-set.
By the process described above, the tester can ensure that his test-set is both maintainable and justifiable. However, the reader may notice that the list above glosses over the tricky matter of ‘verification’ – the process by which the tester decides whether or not a sample is malware. Certainly it is not suitable for the tester simply to take samples given to him and arbitrarily put them in the test-set without any attempt at verification – this will very quickly lead to pollution of his set, and possibly even manipulation of his tests by unscrupulous producers.
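As a minimal illustration of the gate this implies – written in Python, with the verification step itself left abstract, since it is precisely the open problem under discussion:

```python
# Hypothetical admission check for a Trojan test-set. How verify() works
# is exactly the unresolved question; a jury process (section 10.1) is one option.
import hashlib

def admit(sample_path, test_set, verify):
    """Add a sample to the test-set only with a positive, recorded verdict."""
    with open(sample_path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    if digest in test_set:              # duplicate of an existing sample
        return False
    if verify(sample_path) != "malware":
        return False                    # clean or borderline samples stay out
    test_set[digest] = sample_path      # record the sample and its fingerprint
    return True
```

Crucially, the verdict (and who produced it) should be recorded alongside each sample, so that the tester can later justify its presence in the set.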
It is to be hoped that the ICSA’s approach (see section 10.1) will assist in this area – the author and readers must wait and see.
Given the scenarios outlined in the early part of this paper, interest in anti-Trojan testing seems doomed to continue, regardless of whether or not the average user is actually at any sort of risk from Trojans.
In the field of testing, two things appear certain:
It remains to be seen whether the relevance of such tests will increase in the future – due either to the testers raising their standards, or to the Trojan problem increasing.