In my previous essay dedicated to Thomas Lamarre’s concept of animetism I argued that when studying animation, we shouldn’t just take into consideration space (how objects are distributed and move across the frame), but also time, which I believe is key to understanding the essence of movement. Indeed, movement is not just motion through space, but also takes place in time : it is something dynamic. Here, I’d like to follow up on that statement and offer an account of how time is created and used in animation. To give my arguments more weight, I’ll use a comparative approach : I will show that the way animation presents time is radically different from that of live-action cinema, and that it is a determining factor in the difference between the two mediums.
Basically, my point will be this : in cinema, time only follows from the mechanical operations of the technical apparatus ; in other words, time is but recorded data. In animation, by contrast, time is created, both by the animator and the viewer. To establish this, I will focus on two techniques of animation : timing and framerate modulation.
As you may know, film is made of a series of still images that are projected at a certain speed that fools the brain into thinking there’s movement : that speed is 24 frames per second. This is the standard rate of projection in both cinema and animation. However, animators soon discovered that you could get away with shooting each drawing twice and still convey pretty much the same impression of movement : that means that there are only 12 new frames in a second, which is called animating “on twos”. The twelve other frames, which just repeat the previous image, are what I’ll be calling “leftover frames”. With the development of limited animation, animation got away with using fewer and fewer new frames : the framerate would often drop to 8 (animating “on threes”) or 6 (“on fours”) new frames per second. The film is still projected at the same speed (thanks to the leftover frames), but the movement no longer unfolds at that rate. This capacity of animation to use fewer drawings is what gave birth to timing and modulation.
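The arithmetic behind “on twos”, “on threes” and “on fours” can be sketched in a few lines of Python. This is only an illustration of the counts given above : the function name `frame_budget` and the parameter `hold` are hypothetical labels chosen for this sketch, not established terminology.

```python
PROJECTION_FPS = 24  # standard projection rate cited in the text


def frame_budget(hold, fps=PROJECTION_FPS):
    """Return (new_drawings, leftover_frames) for one second of animation.

    `hold` is the number of projected frames each drawing stays on screen:
    1 for "on ones", 2 for "on twos", and so on.
    """
    if hold < 1 or fps % hold != 0:
        raise ValueError("this sketch only handles holds that divide fps evenly")
    new_drawings = fps // hold
    leftover_frames = fps - new_drawings  # repeats of the previous drawing
    return new_drawings, leftover_frames


BUDGETS = {hold: frame_budget(hold) for hold in (1, 2, 3, 4)}

for hold, (new, leftover) in BUDGETS.items():
    print(f"on {hold}s: {new} new drawings, {leftover} leftover frames per second")
```

The projected total is always 24 ; only the split between new drawings and leftover frames changes.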
Timing refers to the fact that it is the individual animator who decides how many frames he will use for a given movement. For example, if you want to animate someone raising his arms, you may just use three drawings : one with the arms down, one with the arms in the middle, and one with the arms raised. This’ll probably be jerky, but you’ll have movement nevertheless. But you may want your movement to be more detailed : instead of just 3 positions, you may add 3 more, intermediate ones, to make it smoother. In the end, it’s the animator who chooses what positions will be the most important ones (the key frames) and how many intermediate drawings there are between each (the in-betweens). That kind of calculation is timing.
Modulation is closely linked to timing but operates on a slightly different scale. For one movement, the animator may choose to use different timings : for example, from the first key position (the arms down) to the second (the arms in the middle), it will be animated on twos ; but from the second to the third position (the arms raised), it will be animated on threes. In Western animation, this is called “spacing”, because the animator determines the “space” between each new pose. But what’s really important with modulation are the ideas of variation and irregularity : in a second of animation, the framerate may sometimes change two or three times.
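To make the arm-raising example concrete, here is a minimal Python sketch that expands a list of drawings into the frames the projector actually shows. The number of in-betweens, the drawing labels and the choice to hold the final pose for three frames are all assumptions made for illustration ; the only thing taken from the example above is that the first half is on twos and the second half on threes.

```python
def expand(poses, holds):
    """Expand drawings into the sequence of projected frames.

    poses -- drawing labels, in order
    holds -- how many projected frames each drawing is held for
    """
    if len(poses) != len(holds):
        raise ValueError("each drawing needs exactly one hold value")
    frames = []
    for pose, hold in zip(poses, holds):
        frames.extend([pose] * hold)
    return frames


# Hypothetical arm raise: three key poses, two in-betweens per half.
POSES = ["down", "ib1", "ib2", "middle", "ib3", "ib4", "raised"]
# down -> middle on twos, middle -> raised on threes (last pose held for 3).
HOLDS = [2, 2, 2, 3, 3, 3, 3]

FRAMES = expand(POSES, HOLDS)
# A frame is "new" when it differs from the frame projected just before it.
NEW_FLAGS = [i == 0 or FRAMES[i] != FRAMES[i - 1] for i in range(len(FRAMES))]

print(FRAMES)
print(f"{sum(NEW_FLAGS)} new drawings over {len(FRAMES)} projected frames")
```

Changing the values in `HOLDS` without touching `POSES` is, in this sketch, what modulation amounts to : the same drawings, distributed differently in time.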
What’s important to remember for now is that, in both cases, it’s the individual animator’s decision (1). This is even more important when considering Japanese animation, as the specific production process of anime leaves a lot of freedom to the individual animator. As I will show, this contrasts not only with certain kinds of Western animation, but with cinema, in which such variations are much harder to obtain and create.
Objective time : cinema
In 1907, not long after the development of cinema, French philosopher Henri Bergson published his major book, Creative Evolution. This was and still is one of philosophy’s major attempts to understand the Darwinian theory of evolution, but the subjects it treats are far more diverse. Among other things, Bergson makes a strong critique of cinema, and what he calls the “cinematic leaning of thought” : we believe that cinema gives us real movement, because we see it move. But in fact, cinema creates movement from immobility : projection of still images at a rapid pace. For Bergson, that’s not real movement : that is but an “indirect image” of movement that reflects our inability to see things for themselves : we break them down into their parts and see movement as just a series of still positions.
Bergson didn’t target cinema in order to condemn it, but he did give the name “cinematic” to a philosophical tendency he sought to suppress, and his account is one of the first philosophical theories of cinema. That’s why many cinema theorists, directly or indirectly, answered Bergson when they tried to justify the status of cinema as art and as a real representation of reality – that is, movement. Among these answers, one of the most influential and famous is without a doubt that of one of Bergson’s disciples, Gilles Deleuze, in his monumental 2-volume Cinema.
It is in the very first and introductory chapter that Deleuze confronts Bergson and masterfully shows that, despite his master’s claims, real, direct movement does exist in cinema. In cinema, there is representation of movement : what Deleuze calls the “movement-image”. The argument can be summed up as follows : when we watch a film, we don’t see the individual still frames. We see the movement they produce, and this thanks to the technical apparatus of cinema, which projects these frames at a regular and identical rhythm. As Deleuze says,
“But what [cinema] gives us, as has often been noted, isn’t the frame, it’s an average image to which movement isn’t added from the outside : the movement is a part of the average image as immediate data. […] To sum up, cinema doesn’t give an image to which it adds movement, it immediately gives us a movement-image.” [Deleuze, 1983, p.11] (2)
In other words, cinema isn’t the film on which are inscribed the individual still frames : if this were the case, Bergson would be right. Cinema is the film, and the projection apparatus of which time is an essential attribute : the projector projects images at a certain speed. And that also becomes visible in the materiality of the film : film reels have holes on the side, placed at equidistant intervals, and it is by grabbing through these holes that the projector makes the film move. It is because the holes (and the frames) all have the same size and are placed at regular intervals that the projection can also happen at a regular pace – in other words, that the rhythm of projection stays the same, and that time in cinema can be something objective, that doesn’t change whatever happens. This technical characteristic is what motivates Deleuze’s definition of cinema :
“Cinema is the system that reproduces movement as in any indifferent moment, i.e. according to equidistant instants chosen to give an impression of continuity. Any other system that would reproduce movement by an arrangement of poses projected as to flow in one another or to “transform”, is not part of cinema.” [Deleuze, 1983, p.14]
While this definition might seem very abstract, it’s in fact very simple ; what matters to my point is mostly the need to “give an impression of continuity” : the very nature of cinema is to convey a regular impression of movement. The movement on screen might be fast or slow, it might even be slow-motion, but from the technological standpoint, all these movements represent the same thing : they share the same “data”, the same number of frames and in the end all happen at the same speed. This means that, as Deleuze says, there are no singularities or exceptions in cinema, because its movement is created from “any indifferent moment” : the single frame taken by itself doesn’t matter, because it’s just an indifferent moment taken from a continuity and will never stand out nor need to.
From this point of view, there are no variations of intensity in cinema : every image could be replaced by any other. The creation of intensity does not therefore happen at this level, that of the individual frames, but at successive larger ones : shot composition and framing, camera movement, editing, etc. But all of these relate to direction, or what I would call cinematography (literally, writing with cinema, that is with movement), and not to the essence of cinema itself.
To sum up my argument here, I would say that it relies on two ideas. First, that in cinema, movement is real : it’s not just an addition of still frames. Considering that animation relies on the same projection apparatus (including film), we could say that here, animation and cinema are the same. But then, in cinema, movement is also objective. By objective, I mean that it is an essential attribute of cinema, that never changes, and that cannot be manipulated by any subjective agency (in the case of cinema : the director, the actors, etc.) (3). As I will now show, this is where the main difference between animation and cinema lies.
Subjective time : animation
It’s common practice among animation scholars [cf. Sifianos, 2012] to note that, even though Deleuze was so influential to cinema theory, he barely talked about animation, and what he said was mostly wrong. My stance is slightly more complex, but before detailing it, we must see what Deleuze really says about animation. In the roughly 650 pages of Cinema, Deleuze mentions animation twice : once in an unimportant footnote, and once in the introductory chapter, just after the definition I quoted. Deleuze’s position is clear : for him, animation is cinema.
“This becomes visible when one tries to define cartoons : if they are fully a part of cinema, it’s because the drawing isn’t a pose or an accomplished shape. It is the description of a shape that’s constantly in the making or coming undone thanks to the movement of lines and dots taken at indifferent moments of their trajectory. […] Cartoons do not present the description of a shape taken at a unique moment, but the continuity of the movement that describes the shape.” [Deleuze, 1983, p.14]
Here, what Deleuze does is fold animation back onto cinema by arguing that, just like in cinema, the frame-by-frame movement doesn’t really matter, as if each frame had just as much intensity and importance as all the others. That doesn’t mean that all frames are interchangeable, but that in terms of expressivity and importance to the overall movement, all frames are worth the same. I believe this does hold for certain kinds of animation : some hand-drawn full-animation works, where each projected image corresponds to one new drawing (but most of the time, it’s just one in two : full animation is mostly drawn at 12 frames per second). But most importantly, I believe it applies to techniques that didn’t exist yet when Deleuze wrote : those that rely on automatic inbetweening, like Flash or 3D animation (4). So Deleuze isn’t entirely wrong.
However, this definition doesn’t apply to any technique that uses, in one form or another, timing and modulation. That means a very large part of animation and, most importantly in our case, anime. Indeed, because of the dominant use of limited animation, the individual frame took on a new importance : each drawing is no longer an indifferent part of the movement, but a precious resource. When you’re animating on 3’s or 4’s, the movement will very probably look jerky ; so you can’t afford to just take a drawing away or make it look bad. This means that, in contrast to the monotonous intensity of cinema’s frames, limited animation clearly sets up a hierarchy. On the one hand, there are all the leftover frames (if you’re animating on 3’s, the anime standard, that’s 16 frames in a second), whose intensity can be equated to 1. It’s not 0, because they’re still taken up in the regular movement of projection ; but it never changes, never goes beyond or below this value. On the other hand, you have all the new frames, and most importantly the key frames, whose intensity would be 1+x, where “x” represents the impact and importance of the frame in the overall movement.
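As a purely notational restatement of this hierarchy, the two cases can be written side by side. The symbol I(f) for the intensity of a projected frame f, and the subscript on x, are introduced here only for illustration ; nothing in this sketch is a measured quantity, and the condition that x is non-negative is an assumption.

```latex
I(f) =
\begin{cases}
  1       & \text{if } f \text{ is a leftover frame,} \\
  1 + x_f & \text{if } f \text{ is a new frame, with } x_f \geq 0 \text{ (assumed).}
\end{cases}
```

On 3’s, one second of animation would then contain 16 frames of the first kind and 8 of the second.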
The most obvious example of this would be the Kanada school of animation, and especially its arguably most radical member, Hiroyuki Imaishi. The principle of Kanada-style animation is to raise the value of x as high as possible by having each key frame represent as striking a pose as possible. As it developed over time, and as most visible in this cut by Imaishi, it went so far as to suppress in-betweens : that’s a practice known as “snapping”. Snapping is interesting, because by omitting the in-betweens (5), it basically reduces their intensity to as close to 0 as possible : the value of each image is therefore not 1+x, but just x. But that doesn’t make the movement weaker in any way, because x’s value is very high from the start, and because of a phenomenon known as “closure”, which I’ll explain later on.
But even if we leave aside the Kanada style and its radically innovative use of omissions, my point is that timing dictates the intensity of each frame, and therefore that the intensity varies : one could say that it modulates, or rather that it is modulated by the animator, who decides the timing of his cuts. Modulation is in fact the key phenomenon here : if we imagine a cut that stays on 4’s for its entire duration, the rhythm will be very regular and its intensity pretty low. In other words, it’ll be boring. However, if the rhythm modulates, the movement itself becomes unpredictable : the viewer can’t actually know what’s going to happen next or, more accurately, how it will happen next.
The following scene, a cut by Yasuo Otsuka, is among the most famous examples of modulation in the history of Japanese animation because it is one of the first prominent uses of the technique (6). It’s relevant here because it shows that modulation isn’t just a specificity of limited animation : this comes from Hols, Prince of the Sun, which is almost entirely in full animation. But it’s also a good example of the unpredictability and irregularity created by modulation. Most of the scene, like the rest of the movie, is animated on 2’s, but Otsuka uses modulation to create certain effects. For example, just after it’s been hit by Hols’ harpoon, the fish sinks : this is on 3’s, which conveys how slowly the monster goes down. But when it suddenly comes back out of the water, it’s on 2’s : the fish moves faster, and the animation follows and picks up the pace. This sudden change of rhythm is as striking as the fish’s sudden attack. To create an even stronger sense of surprise, in the next shot, a close-up of Hols’ surprised face, the boy is animated on 1’s : his shock comes through the animation itself, and not just his expression.
A few seconds later, Otsuka uses another technique that’s not quite modulation, but which is related to it : as the fish suddenly swims from the right to the left of the frame, taking Hols along with him, the animation is on 2’s. However, there’s movement in each frame, as if it were animated on 1’s : on the first frame, the fish moves ; on the second, the background moves. This pattern gives an impression of constant movement, even though the animation is on 2’s. Then, at the end of the sequence, as the hurt fish frantically swims and writhes in pain, it’s animated on 1’s, which makes its speed and movement even more striking ; in contrast, the next shot, where we see rocks detaching from the cliff, is on 3’s, which helps us realize the weight of the rocks that are about to fall down.
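The alternating pattern described for the fish and the background can be sketched as two layers, each on 2’s, offset by one frame. This is a simplified model under stated assumptions : the layer names, the 12-frame window and the helper `layer_changes` are hypothetical, and the actual frame counts of the scene have not been measured here.

```python
def layer_changes(total_frames, hold, offset):
    """Frame indices on which a layer held for `hold` frames shows a new image."""
    return set(range(offset, total_frames, hold))


TOTAL = 12  # half a second at 24 frames per second (arbitrary window)

fish = layer_changes(TOTAL, hold=2, offset=0)        # fish moves on frames 0, 2, 4, ...
background = layer_changes(TOTAL, hold=2, offset=1)  # background moves on 1, 3, 5, ...

frames_with_change = fish | background

print(sorted(fish))
print(sorted(background))
print(len(frames_with_change), "of", TOTAL, "frames show some movement")
```

Each layer only changes every other frame, yet because the two never change on the same frame, something moves on every projected frame, which is the impression of 1’s described above.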
So now we can understand why, in animation, time is “subjective” : first, it’s not steady, just like subjective time isn’t, as opposed to the objective time of clocks : when I’m impatient or bored, time seems to flow faster or slower. That means that, while time is as much an essential quality of animation as it is of cinema, it isn’t in the same sense, because there’s a gap between the rate of projection and the rate of animation. To make the difference even clearer, I’d say that in cinema, there’s only time, whereas in animation, there exist both time and rhythm. The other reason is that rhythm is, as I’ve said about timing, determined by the individual animator : he’s the one who has the say on how things will move. Therefore, time is subjective, because it depends on individuals and not an objective technical apparatus.
Closure, and the different kinds of modulation
There is one last subjective aspect in animation that I have not yet explored in detail : that is the role of the viewer. As I’ve already said, the viewer is already a key component of film : it’s thanks to the phenomenon known as “persistence of vision” that we can see movement where there is only a projection of 24 frames per second. At surface level, there should be no difference here between cinema and animation, because the projection apparatus is the same in both cases. But as I’ve just shown, modulation creates, in animation, a gap between the number of frames projected and the number of frames drawn. That also changes the role of the spectator.
To get an idea of what’s going on, let’s take a step back from the brain-eye-projector apparatus and try to understand what’s happening in more general terms. The phenomenon at play has been analyzed by comic book artist and theorist Scott McCloud : he calls it “closure”, and describes it as “mentally completing that which is incomplete based on past experience” [McCloud, 1994, p.63]. A very clear example of it is how, with just a few drawn lines, we are able to infer that there is a drawing of a human face. Persistence of vision is, according to McCloud, an instance of closure.
Following this theory, comics differ from cinema in that closure is not mechanically provoked by a technical apparatus – in other words, it’s not imposed on the viewer. McCloud describes comics as a participatory medium where “the audience is a willing and conscious collaborator and closure is the agent of change, time and motion.” [McCloud, 1994, p.65]
The place where closure happens in comics is called the “gutter” : that’s the white space between panels. There are no panels, and therefore no gutters, in animation, but I believe that closure happens in much the same way – in other words, from the point of view of closure (that is, of the spectator), animation is closer to comics than to cinema. First, let’s notice that the two simplest forms of transitions in comics according to McCloud closely resemble animation, without the projection apparatus : most notably in the case of the baseball player, it’s like we just had two key frames without the in-betweens.
But in animation as much as in comics, we don’t need the in-betweens that much here : the key poses are enough, and our mind does the rest – it does closure and creates movement. The reason this applies to animation only, and not to cinema, is the projected/drawn frames gap I mentioned : wherever the animation is not on ones, there will be at least one still frame between each new frame. This means that there is a part of movement that’s not put into motion by the projector, but that depends on the viewer to make the link between old and new frames (7).
What does modulation have to do with all this ? As I’ve already said, modulation is a factor of irregularity in movement. Closure is what makes this irregularity acceptable, and even enjoyable : first, it’s what maintains continuity when the movement is not continuous ; moreover, modulation and timing play on closure itself. For example, in the Imaishi cut I gave as an example earlier, there are only key frames and barely any in-betweens : the viewer’s role is here very important, because he has to do all the in-betweening himself. But in full animation, the viewer has much less work to do, since there are many more in-betweens and new frames in the animation itself. In this context, my 1+x model can be reconsidered in more detail : the constant value 1, that of the leftover frame, is produced by the spectator himself – by closure. The variable x represents as much the intensity of the frame as the amount of work needed by the viewer to make closure. In Kanada-style animation, x is very high, because the key poses are very different from each other, which makes them striking, but also makes continuity and closure harder to achieve.
This being said, I think it’s possible to attempt a list of different kinds of modulation, and the different kinds of closure and intensity they involve. I will once again rely on McCloud, who gives a catalog of the 6 types of transitions one can find in comics. Since the two mediums are very different, I believe there are only 3 kinds of modulation effects, but I will keep McCloud’s concepts.
Action to action. McCloud defines it as follows : “transitions featuring a single subject in distinct action-to-action progressions” [1994, p.70]. The example of the baseball player I gave earlier is an instance of action to action. In animation, action to action modulation would be a single movement, animated at different timings.
For example, let’s take a look at the second shot of this cut by Hayao Miyazaki. As the pitcher prepares to throw his ball and stretches upward, the animation follows a 3-4 rhythm : the movement is slow and deliberate, and we can see it in every detail. But then, when he raises his leg, the rhythm gets faster and oscillates between 3’s and 2’s until the end of the shot, following the sudden acceleration of motion.
This kind of modulation is the simplest, because it’s simple timing, and it’s the least noticeable to the viewer : caught up in the continuity of movement, he will not consciously notice the change unless it’s very obvious (from 2’s to 4’s, for example) or he is very seasoned. Its general value is therefore not too strong, but the simplicity of this technique mustn’t be mistaken for a lack of importance : on the contrary, action to action modulation is what gives the movement its individuality and most of its texture. A cut can work or fail just because of its timing. What I call action to action, just a category of a larger technique, is what “framerate modulation” generally means in animation circles : the other types of modulation I list may not be considered as such by some. But since they involve a variation in the timing – which was my definition of modulation – I believe the word still applies to them.
Subject to subject. McCloud’s definition is : “[a transition that] takes us from subject-to-subject while staying within a same scene or idea.” [1994, p.71] In cinematographical terms, what he means by that is basically a cut to a new shot : an image of a different object or movement, and not just another phase in the same movement. However, considering that in animation, movement and intensity variation take place within the frame, and not from frame to frame, by subject to subject modulation, I here mean two or more characters or objects sharing the same frame but that do not share the same timing. To show in concrete terms what this means, let’s take this cut by Hisashi Mori.
During the entire cut, the fireball moves on 1’s ; in the penultimate shot, the giant bird moves on 2’s : for an instant, the two objects aren’t animated at the same pace. As the fireball hits the bird, action to action and subject to subject happen at the same time : writhing in pain, the bird switches to 1’s, while the fireball, which has lost some of its energy from the hit, is now animated on 3’s. Another famous example would be Yoshihiko Umakoshi’s Mushishi, where the fantastical Mushi creatures are systematically animated on 1’s, which emphasizes their otherness. Subject to subject modulation is more complex, because the animator has to take account of various timings at the same time. However, it’s also very expressive, since having two objects moving at different speeds helps the viewer appreciate their difference. That’s a variation on the well-known principle “if you want to show the difference between two characters, animate them doing the same thing” : the way they will each move in a different way will tell us a lot about them. The same can be said of timing.
Scene to scene. According to McCloud, these kinds of transitions are the ones that “transport us across significant distances of time and space” [1994, p.71]. Since, as I said, the movement in animation takes place in the space and time of a single frame, the “significant distances” that McCloud refers to can here simply mean a cut, the passage from one shot to another. This kind of modulation is therefore when, from one shot to the other, the timing changes. In the Otsuka scene I analyzed, the animator used it quite often : Hols rising in surprise was animated on 1’s, which created contrast with the previous and next shot on 2’s. For another, even more striking example, we can think of animator Yutaka Nakamura : what makes his cuts stand out is not only the insane talent the man has, but also the fact that his recent work is almost systematically on 1’s or 2’s. When your anime is mostly animated on 3’s, but suddenly there’s a sequence of multiple shots animated on 1’s, the sudden burst in movement is obvious. This is why the intensity created by such modulation is among the strongest. This kind of effect, which relies heavily on editing, is closely related to editing techniques in cinema : for example, having a shot with a character shown from afar, and then, without transition, a sudden close-up, will create the same kind of surprise. In Deleuze’s words, we could say that in such instances of modulation, “the image must change its power, switch to a superior amount of power” [Deleuze, 1983, p.54].
This last remark brings me to my conclusion. I argued that there was an essential difference between animation and cinema, that is, that they are two distinct mediums, because the representation of time that they each create is very different. Cinema relies on a regular, mechanically-produced time, whereas animation rests on irregular, handmade time thanks to the techniques known as timing and modulation.
That does not mean, however, that animation and cinema do not communicate : as I’ve said multiple times, the projection apparatus is identical. Moreover, they share a common element : cinematography, that is, all that involves editing, shot composition, etc. In Deleuze’s words, these are “determinations of the Whole” [1983, p.46], and not of the parts, that is of the individual shots or frames. This means that, even though animation and cinema share a strong analogy, they are still separated by their different ontologies, i.e. the different status that what they represent has. They could be compared to closely related, but ultimately different, languages : a Spanish speaker may understand some Italian, because the two resemble each other ; but that does not mean that he speaks Italian, because the grammars are different, and there are a lot of words that are not common. In the same way, someone knowledgeable about cinema may understand animation, because he has something to say (sometimes something more to say) about the common element, that is cinematography. But it’s not because he understands and talks about animation that he masters it : there will always be an element that cinema will not be able to assimilate – which is, as I have shown, time.
(1) The animator may eventually be corrected by the animation director, but for my argument, it amounts pretty much to the same thing.
(2) Please note that all the quotes from Deleuze have been translated by myself ; I have done my best, but I’m by no means a professional translator, so please excuse me for any mistaken or unclear translation (thanks to Calann for reviewing them).
(3) I am aware that experimental cinema may play with such factors ; but I’d answer that it is the very nature of experimental art to play with boundaries and set definitions.
(4) What this means, if we must absolutely follow Deleuze here, is that the integration of CGI by live-action cinema doesn’t mean, as some argue, that animation is replacing cinema. On the contrary, it’s just cinema (as a whole) made up of a live-action part and a “computer-action” part, so to speak, which is as much cinema because it rests on purely objective and regular time.
(5) Because snapping often involves this kind of omission of in-betweens, I think it’d be more accurate to call it “omission” : “snapping” describes the impression of the movement, but not what’s actually happening.
(6) Many people call this scene the first example of framerate modulation ; however, none of the places where I’ve read this cite any sources. I find it hard to believe it would be the first example ever of framerate modulation, so it (along with Otsuka’s other scenes in the movie) may be the first use of it in Japan, but even then, I’m not really sure that would be the case. But since I haven’t counted the exact framerate of every single Japanese animated production before 1968, I can’t say anything for certain except that further research would have to be done.
(7) This similarity has been noted by McCloud himself : cf. 1994, p.88 where the mind is described as an “in-betweener” and the comic artist as a “[key] animator”. This obviously doesn’t mean that comics and animation are the same thing, but that they share a strong analogy from which cinema is excluded.
Bergson, H. (2013). L’évolution créatrice [Creative Evolution]. Presses Universitaires de France, coll. “Quadrige”.
Clements, J. (2013). Anime : A History. Palgrave Macmillan.
Deleuze, G. (1983). Cinéma 1 : L’Image-Mouvement [Cinema 1 : The Movement-Image]. Les Éditions de Minuit, coll. “Critique”.
Lamarre, T. (2009). The Anime Machine : A Media Theory of Animation. University of Minnesota Press.
McCloud, S. (1994). Understanding Comics : The Invisible Art. HarperPerennial.
Sifianos, G. (2012). Esthétique du Cinéma d’Animation [Esthetics of Animation]. Le Cerf, coll. “7ème art”.
Tamerlane (2016). “An Introduction To Framerate Modulation”. Wave Motion Cannon. Retrieved from : https://wavemotioncannon.com/2016/12/31/an-introduction-to-framerate-modulation/