Synchronizing animation with MIDI audio

Using AniSprite and Maximum MIDI

This paper presents a way to play computer animation "on the beat" of a MIDI playback. Tempo changes in the MIDI file are handled transparently. The paper assumes a working knowledge of the engines that it uses: AniSprite and Maximum MIDI. Much of what is discussed here is also applicable to other animation or audio engines.

This paper also serves as a reference page for the extensions to the Maximum MIDI toolkit that I made to enable the animation-music synchronization.

What is AniSprite?

AniSprite is a multiple layer sprite animation library for Microsoft Windows 3.1x, Windows 3.1x with Win32s, Microsoft Windows 95/98/ME and Microsoft Windows NT/2000. AniSprite's features are covered in detail on a separate page.

By the way, just like Maximum MIDI, AniSprite is a "programmer's tool", not an "artist's tool".

AniSprite has its own timer that drives the movement of the sprites and the screen update, but it can also use an external timer. The only modification that a "typical" AniSprite application needs for MIDI synchronization is the replacement of the internal timer by an external "MIDI" timer.

What is Maximum MIDI?

Maximum MIDI is a toolkit to control MIDI input and output, and to manipulate the settings of MIDI devices and/or standard MIDI files. Maximum MIDI runs under Microsoft Windows 95/98 and Microsoft Windows NT (there exists an unsupported version for Windows 3.1; you will also need this version to use the Maximum MIDI toolkit from 16-bit software).

Maximum MIDI consists of three DLLs and seven C++ classes (based on MFC). The DLLs have a "C" interface, and do not require you to use MFC.

The Maximum MIDI toolkit aims at having a reliable foundation of low level code. The higher levels, implemented in user code, build on these lower levels and if the groundwork were unstable, the whole house of cards would collapse. While this sounds obvious, the dedication to build good low level code for Microsoft Windows does send you on a rough journey. (See, for example, the article "Overcoming Timer-Latency Problems in MIDI Sequencers" in the DirectSound knowledge base; if this link is broken, I am sorry to say that Microsoft must have, once again, changed the location of the document —or removed it altogether).

How do AniSprite and Maximum MIDI link?

Music is time-driven; animation is time-driven. In the design of Maximum MIDI, its author went to great lengths to get a stable, high-resolution timer. In the design of AniSprite, we also went to great lengths to get a reliable, high-resolution timer that is (optionally) synchronized to the vertical retrace.

So the key concept that AniSprite and Maximum MIDI share is a good and fast timer. Letting both products have their way, however, is not such a good idea, notwithstanding the quality of both timers. As the animation tip "Base all animations and timed events on a single timer" explains, the timers are bound to drift, and synchronizing the timers is tedious and error prone (been there, done that). It is much better to base all time-related elements on a single timer.

Now, Maximum MIDI really wants to do the timing itself, but AniSprite allows you to drive the animation with an external timer. So the trick explained in this paper is to make the timer of Maximum MIDI more flexible so that it becomes an appropriate timer for animation.

For an example that shows the end result, an animation that moves "on the beat" with MIDI music, see the "Sisters" animation.

By the way, AniSprite and Maximum MIDI are independent products: you can use one without requiring the other.

Where do I get the required software?

How synchronization works

Although MIDI is a "music" standard, it is more concerned with (real-time) performance than with "audio". MIDI encompasses a set of commands, which are not necessarily music-related, and a protocol that tells every system exactly when each command must be triggered. The idea to drive animation from MIDI is not new, but only a few toolkits or MIDI editors provide support for it.

There are several approaches to synchronize animation to MIDI playback:

The last approach is the most convenient and flexible; and it therefore is the path taken in this paper.

MIDI playback is measured in beats per minute (bpm), where a beat usually stands for a quarter note. Every beat is subdivided into a number of ticks per beat; the number of ticks per beat is always a multiple of 24, because of the way that ticks relate to the MIDI clocks. See the Maximum MIDI book (chapter seven) for detailed information (if you do not have the book yet, you may read this chapter on-line, courtesy of Maximum MIDI's author Paul Messick and Manning Publications Co.).

Animation speed is in frames per second (fps). A common speed for traditional animation on film (cartoons) is 12 fps where each drawing is shot twice. This produces an effective frame rate of 24 fps, which is the frame rate for film. In computer animation, frame rates are very flexible, but 15 fps pops up more often than others do.

To connect the MIDI tempo (beats per minute) with the animator's frame rate, Maximum MIDI introduces the concept "frames per beat". The number of frames in a (partial) animation and the number of beats in a (partial) score are both integral numbers. The ratio of the two may be a fractional value, however. For example, if you want to play 17 frames over 2 beats, the frames per beat ratio is 8.5.
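
As a rough illustration of the arithmetic (this is not code from the Maximum MIDI toolkit; the name frame_interval_us() is made up for this sketch), the time between two frames follows from the tempo and the two terms of the ratio. At 120 bpm a beat lasts 500000 microseconds, so 17 frames over 2 beats come out at roughly 58.8 milliseconds per frame, which is 17 fps.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Maximum MIDI: the interval between
 * two frames in microseconds, given the tempo in microseconds per beat
 * and the "frames per beat" ratio written as frames / beats.
 */
uint32_t frame_interval_us(uint32_t us_per_beat, unsigned frames, unsigned beats)
{
  uint64_t span;

  if (frames == 0)
    return 0;                   /* no frames requested, nothing to time */
  /* "frames" frames must fit in "beats" beats; round to nearest */
  span = (uint64_t)us_per_beat * beats;
  return (uint32_t)((span + frames / 2) / frames);
}
```

Keeping the ratio as two integers, rather than as a floating point value such as 8.5, avoids rounding drift in the ratio itself; only the final interval is rounded.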

New Maximum MIDI functionality

Version 1.59/A of the Maximum MIDI DLLs provides one new function and extends an existing message.

void SetBeatRatio(HSYNC hSync, WORD framerate, WORD beatrate);

hSync
The "synch device" handle.
framerate
The numerator of the "frames per beat" ratio; the number of frames to display in "beatrate" MIDI beats (again, a MIDI beat is a quarter note).
beatrate
The denominator of the "frames per beat" ratio; the number of MIDI beats that it takes to display "framerate" frames.

The "frames per beat" ratio, hence, is framerate/beatrate. The effect of this function is that it adjusts the interval at which the MIDI_BEAT message is sent to the window that is associated with the synch device.

The default values for "framerate" and "beatrate" are 1. In other words, by default, you get one MIDI_BEAT message for every quarter note.

MIDI_BEAT

wParam
Set to 1 if the MIDI beat message falls on a multiple of the "beatrate" parameter that was passed to SetBeatRatio(). Otherwise, this parameter is zero.
lParam
Set to the "synch device" handle.

Typical usage

After opening the standard MIDI file, the MIDI output device and the "sync" device, the next step is to set the resolution and the beat ratio of the MIDI file. You can copy the resolution from the MIDI file to the sync device with the functions GetSMFResolution() and SetResolution(). Then, you will have to decide how many frames you want per beat (more on this later). Assuming that the animation has some inherent rhythm, this is a simple matter of counting the number of frames between two "beats" and passing that to SetBeatRatio(). In this case, the beatrate parameter is often 1.

This is the lowest level of synchronization: play a number of frames at the tempo of the music. Not all images in the animation are of equal weight, though. If you have an animation of puppets dancing to the "Lambada", you will want the puppets to be at the end of a "swing" at every beat, not in the middle of one. This is the task of the animator. Next to a "beat", the start of a new measure is also an important "key point" to which you may want to synchronize.

To synchronize against the start of a measure, set the beatrate argument of function SetBeatRatio() to the number of quarter notes in the measure and the framerate value to the number of frames to show in the entire measure. The beat ratio, the quotient of framerate and beatrate, does not change. Playing 5 frames per quarter note, or 15 frames per 3 quarter notes, amounts to the same thing. What does change, though, is that the wParam parameter of the MIDI_BEAT message is 1 for every third quarter note in the latter case.
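
On the application side, a handler for MIDI_BEAT could use the flag in wParam to snap the animation back to the first frame of its cycle at every measure. The sketch below is plain C without any Windows or toolkit calls; the AnimState structure and on_midi_beat() are hypothetical names, and it assumes that the animation is a loop with a fixed number of frames per measure.

```c
/* Hypothetical animation state; not an AniSprite structure */
typedef struct {
  int frame;                    /* index of the frame to show next */
  int frames_per_measure;       /* the "framerate" passed to SetBeatRatio() */
} AnimState;

/* To be called once for every MIDI_BEAT message, with the wParam of that
 * message in "measure_flag". Returns the index of the frame to display.
 */
int on_midi_beat(AnimState *anim, int measure_flag)
{
  int show;

  if (measure_flag)
    anim->frame = 0;            /* a measure starts: restart the cycle */
  show = anim->frame;
  anim->frame = (anim->frame + 1) % anim->frames_per_measure;
  return show;
}
```

If no message was dropped, the reset is a no-op, because the frame counter has wrapped to zero by itself. If the animation did slip, the reset brings it back in step at the next measure.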

Two concerns: 1) How do you know what the time signature is? This you can read from the MIDI file, as the code snippet below shows. 2) How do you handle an anacrusis? Well,... I don't. I make sure that the MIDI file starts right on the first beat of the measure. Sometimes, one can get away with inserting a rest before the anacrusis to fill up the measure. At other times, one must get a bit more creative and add a lead-in; percussion works well in many cases.

Reading the time signature
  HSMF hSmf;                    /* initialized elsewhere */
  LPSTR Sig;
  DWORD size;
  int qnotes;                   /* quarter notes per measure */

  /* read the time signature and determine the number of
   * quarter notes per measure
   */
  if (ReadMetaEvent(hSmf, 0, META_TIME_SIG, &Sig, &size) != -1) {
    int num = Sig[0];           /* numerator of the time signature */
    int denom = 1 << Sig[1];    /* denominator, stored as a power of 2 */
    qnotes = 4 * num / denom;   /* integer division: 7:8 truncates to 3 */
  } else {
    qnotes = 4;                 /* no time signature, assume 4:4 */
  } /* if */

A topic that I rushed over in the above discussion is that of "tempo". When I wrote that "you will have to decide how many frames you want per beat", a question that may have formed in your head is "yes, but how fast is the music?". Maximum MIDI automatically parses and handles "tempo" meta events and moves them into the stream of the "normal" events. The "tempo" and "end of track" meta events are the only meta events that Maximum MIDI handles silently. Despite this, you can still browse the MIDI file and analyse all "tempo" meta events yourself, using ReadMetaEvent() with META_TEMPO. The returned (three-byte) parameter is the tempo in microseconds per beat.
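
In a standard MIDI file, the three bytes of a tempo meta event form a 24-bit number with the most significant byte first. A small sketch of the conversion to beats per minute follows. It assumes that the three bytes are returned in file order; the function names are mine and not part of the toolkit.

```c
#include <stdint.h>

/* Combine the three data bytes of a "set tempo" meta event
 * (most significant byte first) into microseconds per beat.
 */
uint32_t tempo_us_per_beat(const unsigned char data[3])
{
  return ((uint32_t)data[0] << 16) | ((uint32_t)data[1] << 8) | data[2];
}

/* Convert microseconds per beat to beats per minute;
 * returns 0.0 for an (invalid) tempo of zero.
 */
double tempo_bpm(uint32_t us_per_beat)
{
  return us_per_beat ? 60000000.0 / us_per_beat : 0.0;
}
```

For example, the bytes 0x07 0xA1 0x20 give 500000 microseconds per beat, which is 120 bpm.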

You cannot expect any MIDI file to be suitable to synchronize with any animation. It may sound obvious (I hope it does), but both music and animation attempt to convey a feeling; light or heavy, funny or serious,... If the animation and the music do not match, you'd better play one without the other. Personally, after selecting an animation sequence and a music sequence that go together, I often feel the need to "massage" both the animation and the music (but especially the animation). Most of the changes that I want to make in a MIDI file are to add or remove tempo changes, add volume changes, insert or remove silence in the beginning of the file, and insert cue points in the marker channel (more on this below). While there are many splendid sequencers around, the only tools that I could find that gave me the control that I wanted are the MIDI tools by Günter Nagler.

Synchronizing on cue points

With the above routines, you are now able to move animated objects ("animobs", for short) at a frame rate that is synchronous with the MIDI stream. This is synchronization at a low level. At a higher level, you may also want to synchronize the general progression of the animation with cue points in the musical piece. For example, if the animation is a dance, you may want the animobs to go through different "steps" in the refrain versus the verses of the song. To do this, you will have to know where a verse (or refrain) starts and, hence, the MIDI file must contain that information.

The MIDI specification defines "cue point" events that are intended for this kind of higher scale synchronization. It is not quite as simple as it may sound, however. The Maximum MIDI toolkit separates the meta events, like cue points, from the voice events. While playing notes (voice events), the Maximum MIDI toolkit skips any meta events, without notifying your program in any way. If you want to detect the cue points, you will have to check for them yourself, which makes your MIDI handling code quite a bit more complex. Oh, and there's that other difficulty to surmount: only a few MIDI editors/sequencers support the cue point events.

An alternative that I have used successfully is to reserve one of the 16 channels as a "marker channel": the notes in this channel are markers, or cue points; they are not to be played. At the beginning of the MIDI piece, I announce that the MIDI file contains a marker channel by adding a "text" meta event. The text meta event contains "@MarkerChan=" and the channel number. This syntax is similar to the one used in Karaoke MIDI files. For example, if MIDI channel 16 is the "marker channel", the text meta event contains the string "@MarkerChan=16".
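
Picking the channel number out of such a text meta event is a matter of a few lines. The sketch below is one possible way to do it; parse_marker_channel() and marker_note_on_status() are hypothetical helpers. It takes an explicit length, because the text of a meta event is not necessarily zero-terminated, and it ignores anything that follows the digits.

```c
#include <stddef.h>
#include <string.h>

/* Returns the marker channel (1..16) announced by the text, or 0 if the
 * text is not a valid "@MarkerChan=" announcement.
 */
int parse_marker_channel(const char *text, size_t length)
{
  static const char prefix[] = "@MarkerChan=";
  size_t plen = sizeof prefix - 1;
  size_t i;
  int channel = 0;

  if (length <= plen || memcmp(text, prefix, plen) != 0)
    return 0;
  for (i = plen; i < length && text[i] >= '0' && text[i] <= '9'; i++) {
    channel = channel * 10 + (text[i] - '0');
    if (channel > 16)
      return 0;                 /* MIDI has only 16 channels */
  }
  return (i > plen && channel >= 1) ? channel : 0;
}

/* The "note on" status byte for a channel numbered 1..16: on the wire,
 * channels are zero-based, so channel 16 gives 0x9F.
 */
int marker_note_on_status(int channel)
{
  return 0x90 + (channel - 1);
}
```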

Apart from this meta event, the MIDI piece now must also contain notes at the appropriate places in the marker channel. The pitch of each note indicates the type of marker; which pitch matches which marker is something that your program must resolve.
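
How a program resolves the pitches is entirely up to that program. As an illustration only, the mapping could be a simple switch; the marker types and the choice of pitches below are invented for this sketch (MIDI note number 60 is middle C).

```c
/* Hypothetical marker types; choose whatever your animation needs */
typedef enum {
  MARKER_NONE,
  MARKER_VERSE,
  MARKER_REFRAIN,
  MARKER_FINALE
} MarkerType;

/* Map the pitch (MIDI note number) of a note in the marker channel
 * to a marker type. Unknown pitches are ignored.
 */
MarkerType marker_from_pitch(int pitch)
{
  switch (pitch) {
  case 60: return MARKER_VERSE;         /* middle C */
  case 62: return MARKER_REFRAIN;       /* D above middle C */
  case 64: return MARKER_FINALE;        /* E above middle C */
  default: return MARKER_NONE;
  }
}
```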

The routine that reads the MIDI events from the MIDI file and pushes them to the output queue must monitor the marker channel, if one is present. Every "note on" event for the marker channel that is read from the stream is replaced by a "User Message". User messages are a feature of the Maximum MIDI toolkit starting with version 1.55. To change a note on event to a user message, all one has to do is to set the "status" byte of the "MidiEvent" structure to a value between 1 and 127 (inclusive).


A status byte of zero is reserved for tempo changes in the Maximum MIDI toolkit. Status bytes above 127 indicate valid MIDI events.

Replacing notes in the marker channel by user messages
  HSMF hSmf;            /* initialized elsewhere */
  HMOUT hMOut;          /* initialized elsewhere */
  int MarkerChannel;    /* initialized elsewhere; zero-based (0..15) */
  int num, i;
  MidiEvent events[BUFFERSIZE];

  num = (int)ReadSMF(hSmf, 0, events, BUFFERSIZE);
  for (i = 0; i < num; i++) {
    /* 0x90 is the "note on" status for the first channel */
    if (events[i].status == 0x90 + MarkerChannel)
      events[i].status = 1;     /* 1..127 marks a user message */
    PutMidiOut(hMOut, &events[i]);
  } /* for */

Questions & answers

How to handle time signatures such as 7:8?

In Maximum MIDI, a "beat" is defined as a quarter note. If you need to synchronize on the measure, change the time signature in the MIDI piece to 7:4.
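
The reason that 7:8 is awkward is that a measure of seven eighth notes lasts 3.5 quarter notes, which is not a whole number of beats. A small check along these lines (a sketch with a made-up function name, taking the two values as they are stored in the time signature meta event) tells whether a measure can be expressed in whole quarter notes:

```c
/* Returns 1 and stores the measure length in quarter notes in *qnotes
 * if the time signature spans a whole number of quarter notes; returns 0
 * otherwise. "denom_power" is the denominator as a power of 2, as stored
 * in the MIDI file (2 for a quarter note, 3 for an eighth note).
 */
int measure_in_quarter_notes(int num, int denom_power, int *qnotes)
{
  int denom;

  if (num <= 0 || denom_power < 0 || denom_power > 7)
    return 0;                   /* not a sensible time signature */
  denom = 1 << denom_power;
  if ((4 * num) % denom != 0)
    return 0;                   /* for example 7:8, which is 3.5 quarter notes */
  *qnotes = 4 * num / denom;
  return 1;
}
```

With this check, 6:8 yields 3 quarter notes and 7:4 yields 7, while 7:8 is rejected.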

How to act on tempo changes in the MIDI piece?

If your animation can cope with a flexible frame rate, tempo changes are handled automatically by SetBeatRatio() (it was designed for exactly this purpose). If your animation has to play at a fixed frame rate, you have no other option than to simulate the tempo change by adjusting the length of the notes.

What if you set the frames per beat ratio too high; that is, what if the animation cannot keep up with the music?

The timing circuitry in the Maximum MIDI toolkit will drive on and post one MIDI_BEAT message after another into the application's message queue. When the animation cannot run as fast as the MIDI_BEAT interval, the message queue will slowly fill up, and Microsoft Windows will get less and less responsive and finally lock up.

This is a problem for which there is no easy or automatic correction (assuming that you cannot make the animation magically go faster). Slowing down the music is often no option either, as this will probably sound awful. The only option that you have, in my opinion, is to prevent this from happening. That is, test the maximum frame rate before you run.
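
Such a test could look like the sketch below. It assumes that you have measured the worst-case time to produce one frame yourself, and that you know the fastest tempo in the piece (the smallest "microseconds per beat" value among its tempo events); can_keep_up() is a hypothetical helper, not a toolkit function.

```c
#include <stdint.h>

/* Returns 1 if an animation that needs at most "worst_frame_us"
 * microseconds per frame can follow "frames" frames per "beats" beats
 * at the fastest tempo of the piece; returns 0 otherwise.
 */
int can_keep_up(uint32_t min_us_per_beat, unsigned frames, unsigned beats,
                uint32_t worst_frame_us)
{
  /* the time available for "frames" frames must not be shorter than
   * the time needed to produce them; compare without dividing */
  uint64_t available = (uint64_t)min_us_per_beat * beats;
  uint64_t needed = (uint64_t)worst_frame_us * frames;

  return available >= needed;
}
```

For example, with 17 frames over 2 beats and 50 milliseconds per frame, the animation keeps up at 120 bpm (500000 microseconds per beat) but not at 240 bpm.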

Why are the new DLLs so much bigger than the original ones?

The original DLLs were compiled with Microsoft Visual C/C++ 1.5 (MxMidi16.dll) and Microsoft Visual C/C++ 5.0 (MxMidi95.dll and MxMidi32.dll). I do not have these compilers. I used Borland C++ 3.1 for MxMidi16.dll (Borland C++ 5.0 generated a bigger DLL, still) and Microsoft Visual C/C++ 6.0 for the two other DLLs.

Of course, I added a bit of code for the SetBeatRatio() function, but this does not account for the several KiB of size growth.

There are no modifications in the MFC classes?

Since I do not use MFC myself, I feel that I am not the appropriate person to potter about the MFC source. That said, the changes that are needed in the MFC sources are probably trivial.

A summary of the changes

Contrary to popular belief (in the surroundings of my work), I usually set out to change as little as is necessary. In the particular case of Maximum MIDI, I refrained from removing the "name mangling" in the exports of the 32-bit DLLs. Name mangling makes the DLLs harder than necessary to use with other C compilers (Borland, Watcom), so I think you should avoid it, but Maximum MIDI originally had mangled exports and, hence, this modified version has mangled exports. On the other hand, I could not keep myself from removing the majority of warning messages that my compilers issued, and I fixed two minor errors along the way.

The structure of the "sync device" was extended with several fields to keep the frame and beat rate values, plus a few extra fields that contain the pre-computed ratio. The algorithm that Maximum MIDI uses to send the MIDI_BEAT messages at accurate intervals is very similar to the one that it uses to pump the MIDI events at the correct times.

The new function SetBeatRatio() is added to SYNC.C; in addition, the functions SetTempo() and SetResolution() are modified to keep the internal, pre-computed ratio for the frames-per-beat up to date.

The two header files needed to contain the definition of the new function SetBeatRatio(). Since one header file is used for all three DLLs, the definition occurs multiple times in slightly different forms.

The "version information" in the resources of all three DLLs was set to 1.5.9.1 (numerical format) and "1.59/A" as string format. Unfortunately, I forgot to adjust the version number that GetMaxMidiVersion() returns, so that function still returns 1/59 (1 in the high byte, 59 decimal in the low byte).

Questions and support

Send questions and comments concerning the modifications to the DLLs to me at thiadmer@compuphase.com.

Note that the software is provided as is and that there are no warranties.