

Desktop Music Handbook

MIDI
History
What is MIDI
How Does it Work
What it Takes
MIDI Channels
MIDI Messages
Note On and Note Off Messages
Program Change Messages
Control Change Messages
System Messages
General MIDI
MIDI Hardware
Synthesizers
General Features of MIDI Hardware
Programmability
Samplers
Drum Machines
Guitar and Wind Controllers
MIDI Software
Notation Programs
Patch Editor/Librarians
Integrated Programs
Digital Audio
What is Digital Audio
Recording a Sound
Digital Audio Software
Sound Cards
Putting It All Together
Synchronization
Integrated Software
Summary
Glossary

Welcome to the World of DESKTOP MUSIC!

Everywhere you turn, you're likely to hear music made with a computer. From the concert hall to the local club, to radio, television and the movies, desktop music can be found all around. Today, making music with a computer is easier and more exciting than ever, and the capabilities available to you rival, in large part, those found in professional recording studios. We've written this primer to help you get going. It looks at the two main music applications found on the desktop today, MIDI and digital audio. Together, these two applications account for the vast majority of sounds in the popular music world, and also have far-reaching applications in many other areas of multimedia. Though it can't cover every subject in great detail, we hope that the primer will give you a good basic understanding of the topics, and will get you started in the right direction. We'll go through the basics of MIDI first, then proceed to digital audio, then conclude with a discussion of how you can combine the two. You'll also find an extensive glossary that covers all the terms mentioned here and many more. There's lots to cover, so let's get started.

MIDI

MIDI, or the Musical Instrument Digital Interface, is a means by which computers and musical instruments can communicate. It's a language that allows you to give instructions to a computer that it will then send to the synthesizer on your sound card, or to any other MIDI devices that you may have available. MIDI is a great way to work with music and has very powerful capabilities that will appeal to users of all levels. There are lots of unfamiliar terms and concepts in the MIDI language, though, and it's easy to get frustrated if you don't have a grasp of some basic ideas. The first section of this guide will help you understand what MIDI is and teach you what it can do for you.

History

MIDI was born in the early 1980s when electronic instrument makers, primarily in the US and Japan, recognized that their instruments needed to be able to talk to one another. After the details were worked out, manufacturers soon began to include electronic circuitry in their equipment that could understand the instructions MIDI used. Before long, nearly every instrument maker in the world had adopted the standard, and though there have been refinements and modifications to MIDI along the way, even the earliest MIDI instruments are still capable enough to be used today. Since its adoption, MIDI has dramatically changed the way music is created, performed and recorded.

What is MIDI

MIDI is a universally accepted standard for communicating information about a musical performance by digital means. It encompasses both a hardware and software component, and though it could be used for sending information about many other things, such as the control of lighting in a theater, or even to control your coffee maker, it was developed to transmit instructions about music. Like a music score, on which notes and other symbols are placed, a MIDI transmission carries instructions that must be acted on by some device that can make sound. While a clarinet or guitar player will interpret a written music score and produce the sound required, it is most likely a synthesizer or drum machine that will react to MIDI information. Fortunately for us, a complete set of these instructions can be captured and stored by a computer, and several types of music software can be used to edit and alter them. If the information is sent to several different MIDI devices, an entire electronic orchestra can be at the musician's disposal. MIDI does not (except in rare cases) actually transmit sound electronically; you couldn't connect a MIDI cable to a loudspeaker and expect to hear anything (you'd probably damage both your speakers and your ears if you tried!). Instead, it is the sound-producing capabilities of the synthesizer, whether it's on a sound card in your computer or a stand-alone device, that will create the sound you hear.

How Does it Work

A MIDI transmission consists of a series of signals, called bits (for binary digits), that pass through a MIDI cable. These signals are electrical impulses, some strong, some weak, that represent the 1s and 0s that make up the language of computers (any device that wants to send or receive MIDI data must be equipped with a microprocessor, the "brains" of every computer). When the impulses reach their destination, for example a synthesizer, the operating system of the synthesizer interprets them as a series of instructions that usually result in the production of a sound. This sound must be amplified, so the synthesizer will typically be connected to an amplifier or mixer.

The bits in a MIDI transmission move at a fairly high rate of speed, 31,250 per second to be exact, and are transmitted in a serial manner, meaning one after another. (A parallel transmission contains a number of signals that pass at the same time.) Every bit does not represent a different note or event, however. Bits are grouped into bytes, and one or more bytes make up a MIDI message, each of which conveys important information about some musical event (Fig. 1). (Each byte is actually eight bits long, though when bytes are being transmitted, two extra bits, called the start and stop bits, are added to the beginning and end of the byte, hence the 10-bit length shown in the figure.)

Fig 1. -MIDI data is transmitted using a 10-bit packet that includes a start and stop bit.-
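
If you're curious what that framing looks like in practice, here is a minimal, purely illustrative Python sketch (not taken from any MIDI library) of wrapping one byte in its start and stop bits:

    # Frame one MIDI byte for serial transmission: a start bit (0),
    # eight data bits sent least-significant-bit first, and a stop bit (1).
    def frame_byte(data_byte):
        assert 0 <= data_byte <= 255
        data_bits = [(data_byte >> i) & 1 for i in range(8)]  # LSB first
        return [0] + data_bits + [1]

    print(frame_byte(0x90))  # a Note On status byte, framed as 10 bits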

Some MIDI messages detail specific aspects of a musical performance: what notes should be heard, how loud they should be, what type of sound (trumpet, drum, flute) should play the notes, and so on, while others are more general in nature. Together, MIDI messages represent an entire language of musical actions, and can be used to convey all the details of a complete symphony or a simple hymn.

What it Takes

In order to communicate in the language of MIDI, a device should be able to send and receive MIDI information, though many common devices are designed to do primarily one or the other. A sound card in a computer, for example, must be given instructions that are generated by some other source; it cannot create any MIDI messages on its own. Similarly, certain electronic instruments, known as tone or sound modules, are also only able to respond to messages generated from the "outside." By contrast, a class of instruments called keyboard controllers is intended for transmitting MIDI data only, and has no way to make sound. Whatever their capabilities, all MIDI devices must contain a microprocessor, which is a computer chip that deciphers and acts upon MIDI messages, as well as physical connections called ports, for sending and receiving data.

MIDI Channels

One of the great capabilities of MIDI is its ability to transmit messages to different electronic musical instruments at the same time. Each instrument can distinguish which messages are for it because the messages contain channel information, which acts like an address or shipping label for the message. These MIDI channels are not physically separated, i.e., they are not transmitted on separate strands of wire. Rather, the different channel numbers (1-16) are contained in the beginning of the MIDI message, and determine whether an instrument or device will respond to that message. In this way, messages can be directed to certain devices, while other devices, which might also be receiving the information, will ignore them. Most newer instruments can be programmed to respond to any one or even all MIDI channels. Because of this, the user has extensive control over how different instruments react to the information that they receive.

There are certain classes of messages called system messages that don't use a channel, since they are intended for all devices connected to the MIDI chain. Messages that deal with tuning or timing information are in this category. There are also other cases where individual messages do not need their own channel label, for example when all the notes of a melody are to be played by a certain instrument on the same channel. In this case, a channel designation can be set at the beginning of the melodic sequence and used for all messages in that series.
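
As it happens, the channel number is carried in the low four bits of a channel message's first byte, stored as 0-15 but displayed to the user as 1-16. A tiny illustrative Python sketch:

    # Extract the channel (1-16) from the low nibble of a status byte.
    def channel_of(status_byte):
        return (status_byte & 0x0F) + 1

    print(channel_of(0x90))  # Note On status byte -> channel 1
    print(channel_of(0x93))  # Note On status byte -> channel 4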

MIDI Messages

MIDI messages are the language of MIDI; they are the words MIDI uses in a transmission to communicate the information that must pass from a source to a destination. There are many types of MIDI messages, though they all fall into two categories: channel messages and system messages. Channel messages are those that carry specific channel information, such as those described above. These include messages such as what note an instrument should play (called a Note Message), and Program Change messages, which tell the instrument what sound it should make while playing the note. System messages, as described above, are either intended for all the instruments currently connected to the transmitting device, or are meant to convey information to a specific instrument that is general in nature and doesn't represent specific details of a performance.

Most messages consist of at least two bytes. The first byte is called the status byte, which tells the receiving device what type of message it is. Basically, it identifies the message and prepares the device for a response. MIDI uses the numbers between 128 and 255 for this part of the message. What follows is the actual data the device needs; these bytes are called data bytes. They represent the details of the message; the values the instrument will use to perform its task. MIDI uses the numbers 0 to 127 for data bytes. Some messages use only one data byte, others need two, while some need none at all. We'll look at a few common messages to see what type of information they contain.
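
Before we do, notice that because the two ranges never overlap, a receiving device can sort incoming bytes with a single test on the top bit. A short illustrative Python sketch:

    # Status bytes occupy 128-255 (high bit set); data bytes occupy 0-127
    # (high bit clear), so one test on the top bit tells them apart.
    def is_status_byte(b):
        return b >= 0x80  # 128

    for b in (0x90, 60, 100, 0xC5):
        kind = "status" if is_status_byte(b) else "data"
        print(f"{b:3d} -> {kind} byte")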

Note On and Note Off Messages

Perhaps the most basic of all messages is the pair called Note On and Note Off. A Note On message is transmitted when a key is pressed on a keyboard, and a Note Off is transmitted when it is released. When a synthesizer receives a Note On message, it looks immediately for additional information, specifically, a data byte that details what note it should play and another that specifies how loud it should play it. MIDI has only 128 different numbers for designating pitch and loudness (or velocity) levels, so immediately after the Note On message is sent, a data byte representing a number between 0 and 127 will appear for the Note Number, followed by another that specifies the velocity level for that note. The note will continue to play until the Note Off message is received, and it too must contain note and velocity numbers. Note and velocity details must be included with the Note Off message because it is possible that a synthesizer will be playing several notes when the Note Off is received. If it received the Note Off without a specific Note number, it wouldn't know which note to stop playing. The Velocity number that appears with the Note Off is not quite as important; in fact, some synthesizers simply ignore it. Nevertheless, it will be sent as part of the data. The Note On / Note Off combination constitutes the most common pair of messages in any MIDI transmission, though there are many other parts of the transmission that we need to explore (Figure 3).

Fig 3. -The MIDI message Note On is followed by two data bytes, as is the Note Off message.-
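
Here is an illustrative Python sketch of the actual bytes involved, assuming channel 1 (on which the Note On and Note Off status bytes are 0x90 and 0x80):

    # Build three-byte Note On and Note Off messages for channel 1.
    NOTE_ON, NOTE_OFF = 0x90, 0x80

    def note_on(note, velocity):
        return bytes([NOTE_ON, note, velocity])

    def note_off(note, velocity=64):
        return bytes([NOTE_OFF, note, velocity])

    print(note_on(60, 100).hex())  # middle C, fairly loud: "903c64"
    print(note_off(60).hex())      # release the same note: "803c40"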

Program Change Messages

When a synthesizer is first turned on, it will load one of its sounds into its RAM (random access memory) and prepare itself to receive note messages. These sounds are permanently stored in the synthesizer's ROM (read only memory) and are, in essence, individual computer programs that tell the device how to create the required sound. When the synthesizer is directed to load a new sound, it must change the program it is currently running so it will be ready to play notes using the new tone. Hence, the MIDI message that tells the device what sound to make is called a Program Change message. Program Changes are followed by a single data byte.

MIDI devices use two different numbering schemes to catalog their programs, either 0-127 or 1-128, and it is important to know which scheme the different devices you will be using employ. A recent standardization of this numbering scheme, called the General MIDI specification, states that the numbers will run from 1-128, and also specifies which sounds will have what numbers. We'll take a closer look at General MIDI at the end of this section.
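
In the meantime, if you ever need to translate between the two numbering schemes, the arithmetic is trivial; here is a hypothetical Python helper (the data byte in the actual message is always 0-127):

    # Convert between a panel display that counts 1-128 and the
    # 0-127 data byte that actually travels down the MIDI cable.
    def displayed_to_wire(program):
        return program - 1

    def wire_to_displayed(data_byte):
        return data_byte + 1

    print(displayed_to_wire(1))  # GM program 1 (Acoustic Grand Piano) -> byte 0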

Control Change Messages

Control Change messages are used to represent some change in the status of a physical control on a device. These controls are the foot pedals, volume sliders, modulation wheels, and similar peripherals found on most electronic instruments. Some controls act like simple on and off switches; for example, the sustain pedal on a synthesizer can only be down or up, so a single data value is enough to specify which state the pedal is in. Other controls are continuously changing and need to be represented by more detailed data called continuous controller data. For example, if you move the pitch wheel on a synthesizer very slowly from its resting position to one extreme up or down, MIDI transmits data representing the wheel's position at numerous points along its path. In this case, the data must be very high in resolution, so 14-bit (2-byte) values are used. This provides a total of 16,384 values to track the movement of the wheel.

Controller data can be used for many different functions in MIDI, even multiple functions at the same time. For this reason, the different controller data streams are numbered from 0 to 127. Some of these controller numbers have become standardized to control certain tasks; for example, controller 10 (often abbreviated cc 10) is most often used for panning between left and right speakers, while controller 7 (cc 7) is typically used for volume changes. Many synthesizers allow the user to change the effect controller data will have. When this is possible, any controller could theoretically be used to control any aspect of a sound that changes over time.
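
To make this concrete, here is an illustrative Python sketch that builds a Control Change message (status byte 0xB0 plus the channel) and combines the two 7-bit data bytes of a 14-bit value such as the pitch wheel's position:

    # A Control Change message and a 14-bit (two data byte) value.
    def control_change(channel, controller, value):
        return bytes([0xB0 | (channel - 1), controller, value])

    def fourteen_bit(lsb, msb):
        return (msb << 7) | lsb  # two 7-bit bytes -> 0..16383

    print(control_change(1, 7, 100).hex())  # cc 7 (volume) = 100 on channel 1
    print(fourteen_bit(0x00, 0x40))         # 8192, the pitch wheel's center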

System Messages

One final category of MIDI messages is called system messages. There are several types of system messages, but they all share the characteristic of transmitting information without a channel assignment. As a result, all instruments that receive messages of this type would act upon them, though one particular type of system message, called system exclusive, is intended for communicating only with a device or devices made by a specific manufacturer. System exclusive is often used when a musician wants to transmit large amounts of data to a specific synthesizer or receive data from the device. Because all major instrument makers have an ID number (#7 for Kurzweil devices, #67 for Yamaha, etc.), a message can be "addressed" to one device and all other receiving instruments will see it, but ignore it. For example, all the instructions specifying how a synthesizer makes its sounds could be "dumped" from the device and stored on a computer. Users could then trade custom libraries of sounds, or reload all the original factory settings if their equipment's memory were wiped out. Moreover, a whole new setup of sounds could be sent by a computer just before actual note data was transmitted, thereby getting the instrument properly configured before the music starts.
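
The framing of a system exclusive message is simple, even though its contents are manufacturer-specific. An illustrative Python sketch with a made-up three-byte payload:

    # A system exclusive frame: 0xF0, the manufacturer's ID, any number
    # of 7-bit data bytes, and a closing 0xF7. The payload is hypothetical.
    def sysex(manufacturer_id, payload):
        assert all(b < 0x80 for b in payload), "SysEx data bytes must be 7-bit"
        return bytes([0xF0, manufacturer_id]) + payload + bytes([0xF7])

    msg = sysex(67, bytes([0x01, 0x02, 0x03]))  # ID #67 = Yamaha
    print(msg.hex())                            # "f043010203f7"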

Other system messages include timing messages, which provide information about the tempo of the music; and Song Position messages, which indicate where a recorded MIDI sequence should begin playback. These last messages are particularly useful with synthesizers that contain built-in sequencing capabilities.

General MIDI

Before General MIDI (GM) was popularized, there was no consistency in the way manufacturers numbered the sounds in their instruments, so that on one device program #1 might be a piano, while on another, it might be a flute. Because MIDI data files (or sequences) often contain program change instructions, i.e., the actual specifications for which sound an instrument should use to perform each layer of the music, it was unlikely that music created for one synthesizer would sound correct when performed by another. With the adoption of General MIDI, files that use its numbering scheme are now "portable," meaning they will sound identical, or nearly so, when played by different instruments. This assumes, of course, that the instruments conform to the GM specification (Table 1).

The General MIDI Program Change Specification

In addition to a standardized assignment of program change numbers, General MIDI includes several other guidelines, the most important of which is the use of Channel 10 for drum sounds. It also provides a Drum Map, which is a fixed assignment of certain drum sounds to specific MIDI note numbers (Table 2). For example, sending middle C, MIDI note #60, will trigger a high bongo sound on the receiving General MIDI instrument. A "C" one octave below, note #48, will produce a Hi-Mid tom, and so on. This mapping scheme provides yet another layer of standardization, thereby ensuring that MIDI sequences can be transported among different studios and desktop systems around the world.

The General MIDI Drum Map
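
In software, the Drum Map amounts to a lookup table from note numbers to drum sounds. A tiny Python fragment using the two assignments mentioned above (the full GM map covers notes 35 through 81):

    # A fragment of the General MIDI drum map on channel 10.
    GM_DRUM_MAP = {
        48: "Hi-Mid Tom",
        60: "High Bongo",
    }

    for note in (48, 60):
        print(f"MIDI note #{note} -> {GM_DRUM_MAP[note]}")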

MIDI Hardware

Different MIDI devices have different capabilities and functions. We'll look closely at the various options on a traditional synthesizer first, then explore some of the other options that are found on different types of instruments.

Synthesizers

When you first look at a synthesizer, you are likely to see a piano-style keyboard, one or more rows of buttons and perhaps a few "sliders" or "wheels" (Figure 4).

Fig 4. -A MIDI synthesizer (with integrated keyboard controller).-

Inside the synthesizer are the sound-producing components, the actual brains of the unit, that respond to messages received when a key is pressed on the keyboard or when a MIDI message is sent from some other source. The keyboard part of the unit is called a controller, which is a term used for any MIDI device that can initiate an action. There are other types of controllers, including those in the form of a guitar (guitar controllers), drum machines (drum controllers), and even those that look and work like wind instruments (wind controllers). It's possible to buy a controller that does not include the capability to make any sound, and it's just as easy to buy the sound-producing components alone, which are devices commonly known as tone or sound modules. In essence, the devices we commonly refer to as "synthesizers" are actually tone modules with integrated keyboard controllers attached.

Keyboard synthesizers are by far the most common MIDI devices, although the tone modules included with nearly all sound cards for the PC are also extremely common. Like any device that wants to join a MIDI conversation, synthesizers are equipped with the proper connectors that allow MIDI information to pass in, and sometimes out. These connectors, called MIDI ports, are usually grouped in threes: MIDI In, MIDI Out and MIDI Thru. Figure 5 below shows a standard arrangement of the ports on the back of a synthesizer, and also shows the end of a MIDI cable, which connects the sending and receiving devices. Unlike single-ended audio plugs (guitar cords and stereo RCA plugs), MIDI cables and ports use a 5-pin DIN connection. The MIDI communication does not have to be two-way; for example, the MIDI In of device one can be connected to the MIDI Out of device two, but not vice versa. The MIDI Thru port is used to relay the information that is sent to a device on to yet another unit without altering it in any way. By using this port, many MIDI instruments can be chained together, allowing a single controller to transmit to numerous different sound-producing devices simultaneously.

Fig 5. -Three standard MIDI ports and a MIDI cable.-

To connect a MIDI synthesizer to a computer, the computer must have a MIDI interface, which typically contains the same three MIDI ports described above. Like the synthesizer, the MIDI interface converts the electrical signals it receives to the proper format needed by the computer. The MIDI interface might be a separate card that installs into a free PC expansion card slot; it could be a stand-alone, external unit that attaches to the PC's parallel or serial port; or it might be an integrated part of a sound card. Some sound cards use proprietary connectors for their MIDI hookup and require an optional MIDI adapter for connections to external MIDI units. On the Macintosh, the interface is almost always external, and typically connects to either the modem or printer port.

General Features of MIDI Hardware

Keyboard and other MIDI controllers share many common features. Most have the ability to detect how hard a key was pressed. This feature, called Velocity Sensitivity, is used to determine a note's loudness, or amplitude. Like other controllers, a keyboard controller typically works by constantly watching the position of every key on the keyboard. A sensor determines whether a key is up in its at-rest position, or down. Then, whenever a key is pressed, the instrument knows exactly how long the key took to go down, and it assigns a value to that note by measuring the time the key took to travel from its starting point to the bottom of its motion. This value is called velocity, meaning "speed," but it actually determines how loud the note will be played. An instrument that has the ability to measure this speed is said to be velocity sensitive.

Synthesizers and tone modules have many other features, including the ability to play many notes at once. This capability, called polyphony (for "many sounds"), usually ranges from eight notes, up to a maximum of 32, or in rare cases, 64. (Musicians usually use the term Voices when describing the polyphonic capabilities of an instrument, so "8-voice polyphonic" means the device can play eight notes at once.) When a device receives a new message after it has already reached its maximum, it must decide how to allocate its resources. For example, it might choose to drop the oldest note it is playing, or maybe it would drop the lowest or softest note. Some instruments will just ignore the new note that puts it over the top. In a professional synthesizer, this allocation might be programmable by the user, though in many cases it is fixed by the manufacturer.
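
Here is a minimal Python sketch of one such allocation policy, "steal the oldest note"; as described above, real instruments may instead drop the lowest or softest note, or simply ignore the newcomer:

    # When every voice is busy, steal the voice of the oldest sounding note.
    class VoicePool:
        def __init__(self, polyphony=8):
            self.polyphony = polyphony
            self.active = []  # notes in the order they started

        def note_on(self, note):
            if len(self.active) >= self.polyphony:
                stolen = self.active.pop(0)  # drop the oldest note
                print(f"stealing voice from note {stolen}")
            self.active.append(note)

    pool = VoicePool(polyphony=2)
    for n in (60, 64, 67):   # the third note forces a steal
        pool.note_on(n)
    print(pool.active)       # [64, 67]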

It's important to keep in mind that certain sounds on a tone module might use up more than one voice. For example, even a simple flute sound could require two notes (or voices) of the available polyphony, while a complex, evolving sound, such as those often intended for use as movie soundtrack backgrounds, might require four or more voices. Playing a four-note chord using a sound that requires four voices could, in theory, use the entire polyphonic capability of a 16-voice synthesizer. Other sounds, such as drum sounds, typically use only a single note of polyphony, and are not likely to be needed for playing chords!

When a synthesizer can make more than one type of sound at the same time, it is called multitimbral. This term comes from the French word timbre (pronounced "tam-ber"), which means tone or sound color. If a synthesizer can make the sound of a trumpet, flute, clarinet and oboe simultaneously, it is clearly multitimbral. How many different timbres can be used at once is a significant factor in determining the usefulness of a tone module for one's music; for example, if you plan to write your next symphony using a single synthesizer, you should be sure it is at least 16-part multitimbral and has 24 or more voices of polyphony. For choral music, four-part multitimbral and 8-voice polyphony might be adequate, but obviously the more the merrier.

One final basic feature of a MIDI device is its ability to respond to instructions on several different MIDI channels at once. This subject was mentioned earlier, but to review, MIDI can keep all the different layers of a musical performance separate from one another by transmitting each layer on its own channel, so an instrument should be able to handle the full "bandwidth" of a MIDI transmission, which is 16 different channels. Most instruments allow the user to set the Reception Mode of a MIDI device, which determines how it will respond to the messages it receives. The most common (and useful!) Reception Mode is called OMNI OFF / POLY, which tells the device to distinguish what channel messages are on (OMNI OFF), and play back several notes at once if requested to do so (POLY, from polyphonic). Many older synths were limited to other reception modes, which kept them from distinguishing the different channels of a transmission. For example, if OMNI were ON, the device would play all messages without regard for their channel status. In nearly all recent devices, the Reception Mode is selectable, though OMNI OFF / POLY is by far the most common mode in use today.

Most synthesizers have the ability to assign one sound to play over part of the keyboard, and another sound to play over the rest. This is called keyboard splitting or zoning, and would allow you, for example, to play a bass guitar sound with the left hand on the low notes, and a piano sound with the right hand on the high notes (Figure 6). Synthesizers, by the way, typically offer keyboards that range from as few as four octaves, or forty-eight notes, to full, traditional piano lengths of just over seven octaves, or eighty-eight notes.

Fig 6. -A MIDI keyboard split into two zones.-

Programmability

There is a wide range of programming options available on synthesizers today, but most have capabilities that allow the user to design sounds with great precision. Normally, you can layer different sounds, combining a flute with a cymbal for example, or merging the beginning portion of a trumpet with the sustaining segment of a cello. Numerous filters are also typically available, which, like the tone controls on a stereo system, let you raise or lower a sound's treble or bass response. Another useful programming feature is an envelope generator. Because natural sounds do not remain static throughout their duration (the piano, for example, begins to fade away, or decay, immediately after a note is struck), these generators allow the user to change the way a sound evolves over time. Normally, the characteristic that changes most is the sound's amplitude (loudness), but envelopes might also be applied to the sound's pitch or even its timbre. The shape of the envelope is usually alterable, which allows the user to determine how long it takes for the sound to move through each of its "segments." In the example below, the sound will move gradually to its peak level during the attack segment, get a bit softer during the decay, maintain a steady level over the sustain segment, then slowly fade during the release (Figure 7).

Fig 7. -The four segments of an amplitude envelope.-
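
For the mathematically inclined, here is a minimal Python sketch of such an envelope; the segment lengths and sustain level are arbitrary values chosen for the example:

    # Amplitude (0-1) of a four-segment ADSR envelope at time t seconds,
    # for a note released at note_off seconds.
    def adsr(t, attack=0.1, decay=0.2, sustain_level=0.7,
             release=0.5, note_off=1.0):
        if t < attack:                       # rise to the peak
            return t / attack
        if t < attack + decay:               # fall to the sustain level
            return 1.0 - (1.0 - sustain_level) * (t - attack) / decay
        if t < note_off:                     # hold steady
            return sustain_level
        faded = sustain_level * (1 - (t - note_off) / release)
        return max(0.0, faded)               # fade out after release

    for t in (0.05, 0.2, 0.6, 1.25):
        print(f"t={t:.2f}s  amplitude={adsr(t):.2f}")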

Samplers

Samplers are electronic devices that allow you to record audio, manipulate it, and play it back using MIDI commands. In effect, they allow the entire range of acoustic sounds to be employed in a musical composition. Under the control of MIDI messages, dog barks, train whistles, car horns and more can be integrated alongside violins and guitars, but samplers can be used for a lot more than just sound effects. Because of their extensive capabilities, samplers are used to create entire original compositions, using exacting reproductions of traditional instruments. Composers can preview their orchestral works and arrangers can listen to elaborate horn arrangements before committing the music to notation. In addition to these tasks, an entire musical style has evolved that uses samplers to store short phrases from existing recordings that are then used as the basis for entirely new musical compositions. While some of these capabilities are possible using traditional synthesizers, samplers expand the musician's palette of sounds enormously.

All samplers contain sample RAM that is used to hold digital recordings while the sampler processes them and plays them back. The amount of RAM determines the total length of recording time available to the unit. For example, if a sampler were to record sound at the quality of a commercial CD, it would require just over 10 MEGS (10,000,000 bytes) of RAM to hold just one minute of stereo or two minutes of monophonic sound. Many professional samplers contain hard disk drives for more permanent storage of recordings, while some also include floppy drives for moving sounds into and out of the unit. Besides the standard audio outputs used for recording and playback, some samplers provide direct digital connections so sound can be moved back and forth to a digital tape recorder (DAT) or computer.
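
You can verify the RAM arithmetic above for yourself; using CD-quality figures (44,100 samples per second, 2 bytes per sample, two channels), a short Python calculation:

    # One minute of CD-quality stereo sound, in bytes.
    SAMPLE_RATE = 44_100   # samples per second
    BYTES_PER_SAMPLE = 2   # 16-bit resolution
    CHANNELS = 2           # stereo

    bytes_per_minute = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * 60
    print(bytes_per_minute)  # 10,584,000 bytes: "just over 10 MEGS"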

Among the many features of most samplers, one particular favorite is looping. This function allows the sampler to play a short segment of sound repeatedly. Using looping, small recordings can be played back for long periods of time, saving RAM and storage resources. When a sound loops, it merely plays through to the end, then restarts at the beginning or at some other point while the key is being held down. Looping works particularly well with string and wind sounds, but some sounds cannot be sustained; drum hits and other short sounds with sharp attacks, for example, simply do not loop well.

Among the other techniques samplers provide are filtering; crossfading, in which one sound fades out while another fades in; and pitch shifting, where the original pitch of a sampled sound is raised or lowered. Pitch shifting is very useful when you need to change or transpose the pitch of a sound, perhaps to change the key of your music. Unfortunately, after a certain amount of shifting in either direction, the sound will no longer resemble the original. It is very common for a sampler to use a technique known as multisampling, in which the original sound is recorded at numerous different pitch levels, and each individual sample is assigned to playback over a different range of the keyboard. In this way, no single sample has to be shifted beyond a small amount.
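
The math behind pitch shifting also explains why multisampling is needed: each semitone of shift multiplies the playback rate by the twelfth root of two, so large shifts stray far from the original recording. A quick Python sketch:

    # Playback-rate ratio for a pitch shift of a given number of semitones.
    def playback_ratio(semitones):
        return 2 ** (semitones / 12)

    print(f"{playback_ratio(2):.3f}")    # up a whole step:  ~1.122
    print(f"{playback_ratio(12):.3f}")   # up an octave:      2.000
    print(f"{playback_ratio(-12):.3f}")  # down an octave:    0.500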

Samplers provide numerous other manipulation techniques, some of which will be mentioned in the section on digital audio. These include time compression/expansion, which is the ability to stretch or shrink sounds without changing their pitch; amplitude modulation, a technique used to change the sample's amplitude (loudness) at a variable rate; and playing back a sound in reverse. In all, samplers offer versatile options to the musician for shaping and crafting their music.

Drum Machines

One final MIDI device is the drum machine and a related instrument, the drum controller (Figure 8). The drum machine, one of the most common of all MIDI peripherals, typically contains buttons or "pads" for playing drum sounds "live," and internal software to generate or store MIDI data. The sounds in the drum machine are most often sampled drums, i.e., digital recordings of actual acoustic drums. Unlike a sampler, you can rarely add your own sounds to such devices; instead you are limited to playback of the internal sounds, perhaps with some minor alterations.

While the buttons on a typical drum machine can be used to play the instrument in "real time," you can also record any pattern of button presses right into the device. When requested, the drum machine will then play back the patterns you've created. In this way, one can create elaborate drum parts "note by note," then play them back repeatedly and at any tempo required. Drum machines also typically include preset patterns, providing very realistic drum parts that musicians who don't play the instrument can use in their own productions. Unfortunately, many of these patterns sound "canned," and their overuse has created somewhat of a backlash against this type of device. Creative drum programming by capable musicians can, however, produce excellent results.

Guitar and Wind Controllers

While the vast majority of MIDI music emanates from keyboard controllers and synthesizers, instrument makers have come to realize that many other instrumentalists would like to share in the joy of MIDI. For this reason, various types of guitar and wind controllers have been created to provide a familiar performance interface for players of these instruments. While they typically produce no sound on their own, these instruments can be connected to tone modules or samplers, which then generate sound under their control.

Shaped and played like traditional six-string guitars, MIDI guitars contain small sensors that detect the player's finger position on the strings, as well as the amount of pressure applied to the string by the pick. Most can also track bending of the strings and convert this movement into continuous controller data. Some guitar controllers even allow the user to assign a different MIDI channel to every string, thereby offering the ability to play up to six different sounds on the receiving device simultaneously. Wind controllers can easily detect which keys have been closed by the player, but must make far more difficult measurements of the amount of air pressure passing through the device's mouthpiece. Typically, a sensor in the mouthpiece is used for such measurements, and over the years, the accuracy of wind controllers has improved dramatically. Because a single MIDI note can be used to generate an entire chord (if the receiving synthesizer is so programmed), musicians who have spent most of their lives playing monophonic (single-note) instruments now have the ability to play elaborate, chordal textures.

One final controlling device is the pitch-to-MIDI converter. This somewhat uncommon device is attached to a traditional acoustic wind instrument such as a saxophone or trumpet and converts the acoustic tones the instrument generates into MIDI notes. The pitch-to-MIDI converter offers perhaps the best of both worlds, in that a musician can use his or her favorite instrument to create a performance that combines "natural" and synthetic sounds. Unfortunately, the conversion is not always accurate, and these devices still must undergo some refinement before they are completely reliable. Nevertheless, converters are becoming more common, and offer musicians, including singers, tremendous expressive possibilities in a MIDI performance.

MIDI Software

There are many categories of MIDI software available. Perhaps the most common is the MIDI sequencer, which is a type of program that can record, edit and play back MIDI data. Sequencers, which originally were often found as stand-alone hardware devices, have very powerful capabilities to transform MIDI information, and today represent a very complex and mature category of software. Sequencers share many basic features, and allow the user to put the strength of a personal computer to the task of making music.

Like a multi-track tape recorder, sequencers most often arrange multiple layers of MIDI information into tracks. Each track represents an independent melody or part of the music. The number of tracks in a sequencer can range from as few as sixteen in an entry-level program, to hundreds, or even thousands in others. Each track can be used to hold any type of MIDI data, and there is no single standard for how this information should be arranged. Rather, the best sequencers give the user a high degree of flexibility in organizing the various types of information their music requires.

Figure 9 below shows the main screen of a popular Windows-based sequencer, Cakewalk Pro Audio(TM). Along the left side of the figure you can see the various tracks; the first sixteen tracks are shown here, but different screen resolutions would allow you to see more or fewer at once on your own monitor. Each track is assigned to a specific MIDI channel, though you can see that several of the tracks have the same setting. This indicates that the events on all of these tracks will go to the same destination. Most sequencers allow you to put information for several channels on the same track, though this could make editing the information somewhat more difficult. The right half of the screen represents the actual data, which is organized into segments called clips in Cakewalk.

Fig 9. -The Track View of Cakewalk Pro Audio.-

Sequencers typically provide different ways to view and edit your data, and it's important to understand the function of each of a program's work areas. Usually, one will find a Piano Roll view, where individual or small groups of notes can be altered; a Track Overview, where entire measures or even whole tracks can be manipulated; a Notation or Staff view, where the music is represented using standard music notation; and an Event View, which is a text-based list of all the events in one or more tracks. The editing options that such programs provide are numerous and vary greatly among programs, but typically, one can cut, copy and paste data, as well as apply extensive modifications to the music, such as raising or lowering the pitch and volume characteristics, and expanding or compressing the amount of time a section takes to play back.

Some programs also provide features that can assist the user with the operation of his or her MIDI hardware. It is not uncommon to find sequencers that will list all the different sounds in your synthesizer, allowing you to work with specific names rather than the less familiar patch numbers. Some will also import or export system exclusive (Sysx) data to a synthesizer, meaning you can load an entire setup of sounds before the first note is played. While they don't offer all the editing capabilities of full-blown patch editors (discussed later), these patch librarian features are very useful, especially in settings where there are two or more MIDI devices.

Overall, sequencers are the most common of all MIDI software programs, and provide tremendous power that can be applied to the production of music.

Notation Programs

Another category of MIDI software is the notation, or transcription, program (Figure 10). Because standard notation remains the most common way to represent music, an entire market has been established for programs that let musicians work "the old-fashioned way." Typically, these programs provide huge libraries of musical symbols that can be entered onto the page to produce professional-looking scores. Some even allow the user to create new symbols. Sophisticated page layout features, often comparable to high-end desktop publishing programs, are also included in the more advanced notation software, and all programs of this type offer printing options.

Fig 10. -A view of standard music notation.-

Most programs allow "point and click" entry as well as real-time transcription from a MIDI keyboard. With real-time entry, musicians can play their music directly into the program and see it appear instantly on screen as notation. Once the notes are recorded, numerous editing capabilities, such as the cut, copy and paste features of a word processor, are available. Other editing functions needed by musicians, such as the ability to shift or "transpose" the music up or down, are also commonly found.

Patch Editor/Librarians

Because of the complexity of many of today's synthesizers, an entire software niche has developed to facilitate the control of such devices from a computer. Patch editors typically display all of a synthesizer's programming controls on one or two computer screens, allowing the user to "see into" the synthesizer and control it directly from the computer keyboard (Figure 11). Rather than spend many minutes pushing buttons, trying to locate a particular screen within the synthesizer's own display, the patch editor lays all the device's parameters before the user, and allows him or her to make extensive changes with the sweep of the mouse or press of a few keys. Changes made on the computer screen are typically sent immediately to the device, making it possible to preview them before any permanent changes are made.

Stand-alone librarian programs, or those usually included with the patch editor, simply store all the device's sounds and make them available for quick searching or sorting. Typically, a librarian will request a "dump" from the device via Sysx, then show the user the sounds currently available on the instrument. This listing can then be stored on a computer and reloaded into the device if needed. Not only are the names of the patches stored, but also the specifications as to how the sounds are created. In other words, if the internal memory of a synthesizer were wiped out, the librarian could send a list of the original factory programs back to the synthesizer and return it to its original status.

Librarians are also commonly employed when users owning the same equipment wish to share programs they have created. Simply load the sounds into the librarian and save them on a floppy disk, then transport them to another computer anywhere in the world.

Integrated Programs

An interesting trend in MIDI software today is the appearance of integrated programs that combine many of the features of the programs listed above. Like their counterpart in the business world, the "desktop suite," these integrated programs offer professional sequencing, notation, patch librarian, and in some cases, digital audio functions in an all-in-one environment. This trend shows tremendous promise, and has far-reaching implications for the user. It will be exciting to see how far it develops.

Digital Audio

One of the most exciting developments in desktop music in recent years is the ability to work with digital audio on a home PC. Long the province of research institutions and recording studios, digital audio editing software has become nearly commonplace on the desktop, and is now among the most accessible and powerful types of computer software available. Recording, editing, and playing digital audio on a home computer gives the user considerable power to design and produce new sounds, and to edit and craft one's own music with great precision. Digital audio can be a highly technical and elusive concept though, and we'll try to make the terms and concepts perfectly clear.

What is Digital Audio

Digital audio is a numeric representation of sound; it is sound stored as numbers. In order to understand what the numbers mean, we need to review some of the basic principles of acoustics, the study of sound.

Sound is produced when molecules in the air are disturbed by some type of motion produced by a vibrating body. This body, which might be a guitar string, human vocal cord or garbage can, is set into motion because energy is applied to it. The guitar string is struck by a pick or finger, while the garbage can is hit perhaps by a hammer, but the basic result is the same: they both begin to vibrate. The rate and amount of vibration is critical to our perception of the sound. If it is not fast enough or strong enough, we won't hear it. But if the vibration occurs at least twenty times a second and the molecules in the air are moved enough (a more difficult phenomenon to measure), then we will hear sound. To understand the process better, let's take a closer look at a guitar string.

When the pick hits the string, the entire string moves back and forth at a certain rate of speed (Figure 12). This speed is called the frequency of the vibration. Because a single back-and-forth motion is called a cycle, we use a measure of frequency called cycles per second, or cps. This measure is also known as hertz, abbreviated Hz. Like other vibrating bodies, the string often vibrates at a very high frequency, so it is useful to use the abbreviation kHz to measure frequency in thousands of vibrations per second. A frequency of 2 kHz, then, signifies a frequency of 2,000 cycles per second, meaning the string goes through its back-and-forth motion 2,000 times per second. The actual distance the string moves is called its displacement, and is proportional to how hard we pluck it. The exact measurement used for this distance is not particularly important for our purposes, but we will often refer to the amplitude or strength of the vibration.

As the string moves, it displaces the molecules around it in a wave-like pattern, i.e., while the string moves back and forth, the molecules also move back and forth. The movement of the molecules is propagated in the air; individual molecules bump against molecules next to them, which in turn bump their neighbors, etc., until the molecules next to our ears are set in motion. At the end of the chain, these molecules move our eardrum in a pattern analogous to the original string movement, and we hear the sound. This pattern of motion, which is an air pressure wave, can be represented in many ways, for example as a mathematical formula, or graphically as a waveform. Figure 13 below shows the movement of the string over time: the segment marked "A" represents the string as it is pulled back by the pick; "B" shows it moving back towards its resting point; "C" represents the string moving through the resting point and onward to its outer limit; then "D" has it moving back towards the point of rest. This pattern repeats continuously until the friction of the molecules in the air gradually slows the string to a stop. In order for us to hear the string's tone, the pattern must repeat at least twenty times per second. This threshold, 20 cps, is the lower limit of human hearing. The fastest vibration we can hear is theoretically 20,000 cps, but in reality, it's probably closer to 15,000 or 17,000 cycles.

Fig 13. -The vibration pattern of a plucked string over time. Gradually, the motion will die out.-

If this back-and-forth motion were the only phenomenon involved in creating a sound, then all stringed instruments would probably sound much the same. We know this is not true, of course, and alas, the laws of physics are not quite so simple. In fact, the string vibrates not only along its entire length, but at one-half its length, one-third, one-fourth, one-fifth, etc. These additional vibrations occur at rates faster than that of the original vibration (known as the fundamental frequency), but are usually weaker in strength. Our ear doesn't hear each vibration individually, however. If it did, we would hear a multi-note chord every time a single note was played. Rather, all these vibrations are added together to form a complex or composite waveform that our ear perceives as a single tone (Figure 14).

Fig 14. -The making of a complex waveform. Vibrations occurring at different frequencies are added together to form a complex tone.-
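
If you would like to experiment with this idea, here is a small Python sketch that sums a fundamental with vibrations at two, three, and four times its frequency; the 1/n amplitude weighting is an arbitrary choice for the example:

    import math

    # Sum a fundamental and its weaker upper vibrations into one waveform.
    def complex_wave(t, fundamental=440.0, partials=4):
        return sum((1.0 / n) * math.sin(2 * math.pi * n * fundamental * t)
                   for n in range(1, partials + 1))

    # Sample the first few instants of the composite tone.
    for i in range(5):
        t = i / 44_100
        print(f"t={t:.5f}s  amplitude={complex_wave(t):+.3f}")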

This composite waveform still doesn't account for the uniqueness of the sound of different instruments, as there is one more major factor in determining the quality of the tone we hear. This is the resonator. In the case of the guitar, the resonator is the hollow block of wood to which the string is attached, i.e., the guitar body. This has a major impact on the sound we perceive when a guitar is played, as it actually enhances some of the vibrations produced by the string and diminishes or attenuates others. The ultimate effect of all the vibrations occurring simultaneously, as altered by the resonator, adds up to the sound we know as a guitar.

Recording a Sound

So what has all this got to do with digital audio? What is it we need to record from all of this motion in the air? It is the strength of the composite pressure wave created by all the vibrations that we must measure, very accurately and very often. That is the basic principle behind digital audio. When a microphone records a guitar playing, a small membrane in the mic (called the diaphragm) is set into motion in a pattern identical to that of the guitar's pressure wave. The diaphragm moves back and forth, creating an electrical current that is sent through a cable. The voltages in the cable are also "alternating" in strength at a very rapid rate: strong, weaker, weak, strong again. When the signal arrives at our measuring device, called an analog-to-digital (A/D) converter, the device measures how strong it is at every instant and sends a numeric value to a storage device, probably the hard drive in your computer. The A/D converter, along with its counterpart, the digital-to-analog (D/A) converter that turns the numbers back into voltages, is typically found as a component of your sound card, or as a stand-alone device.

There are several important aspects of this measuring process that we need to discuss. First is the rate at which we choose to examine the signal coming into the converter. It is a known fact of physics that we must measure, or sample, the signal at a rate at least twice as fast as the highest frequency we wish to capture. Let's say we are trying to record a moderately high note on a violin. Let's also assume that the fundamental frequency of this tone repeats 440 times per second (the note would be an "A," of course), and that we want to capture all vibrations up to five times the rate of the fundamental, or 2,200 cycles per second. To capture all the components of this note and convert the resulting sound into numbers, we would have to measure it 4,400 times per second.

But humans can hear tones that occur at rates well up into the tens of thousands of times per second, so our system must be capable of much better than that! In theory, we might want to capture an extremely high sound, for example one that actually contains a frequency component of 20,000 cps. In that case, our measurements must occur 40,000 times per second, which, in fact, would allow us to capture every possible sound that any human might be able to hear. Because of some complex laws that digital audio obeys, however, we use a rate of 44,100 measurements or "snapshots" of a sound per second in our professional equipment. This sampling rate, abbreviated 44.1 kHz (44.1 kilohertz), is one aspect of what we call CD-quality recording, as it is the same rate that commercial CDs use. Other common sampling rates are 11 kHz, 22 kHz, and, for some professional equipment, 48 kHz.

The other important issue is how accurate our measuring system will be. Will we have 20 different values to select from for each measurement? How about 200 or 2,000? How accurately do we need to represent the incredible variety of fluctuations in a pressure wave? Think about the different types of timepieces you know about. If your digital watch shows you minutes and seconds, that's adequate for most purposes. If you are doing scientific measurements of time, then you might need more accuracy, perhaps minutes, seconds, tenths, hundredths and even thousandths of seconds. Sound waves actually encompass an infinite range of strengths, but we must draw the line somewhere, or else we would need gigantic hard drives just to store the information for a short amount of sound. The music industry has settled on a system that provides 65,536 different values to assign to the amplitude (strength) of a waveform at any given instant. In a certain sense, that number represents a compromise, as we will definitely not capture every possible value that the amplitude can take. However, our ears can live with that compromise, and in any event, using a more sophisticated measuring system is simply not worth the extra cost in computing and storage resources.

Obviously you are wondering, "Why in the world did they choose 65,536?" The answer is simply that it is 2^16, that is, 2 raised to the 16th power (sixteen 2s multiplied together). This is the number of different values we can express in the binary numbering system if we use 16 bits, or 16 places. Recall from your high school math that the binary numbering system uses only two digits, 0 and 1, and that this is what computers use as well. A string of sixteen 1s in the binary system produces the number 65,535 in decimal, and a string of sixteen 0s is, of course, the decimal number 0. So from 0 through 65,535 we have 65,536 different numbers that we can express using 16 bits. Computers actually think in terms of eight-bit strings which, you will remember, are called bytes. Therefore, if we use numbers that are two bytes long to represent every different value in our system, we have the total range described above. One byte, or a string of 8 bits, would allow us to represent the numbers 0 through 255, and MIDI is quite happy with that range, but there is so much more detail in the digital audio world that our system must be far more sophisticated.
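
As a small illustration, here is a Python sketch that quantizes an incoming amplitude (scaled to the range -1.0 to 1.0) to one of those 65,536 levels:

    # Map an amplitude in -1.0..1.0 to the signed 16-bit range -32768..32767.
    def quantize_16bit(sample):
        level = round(sample * 32767)
        return max(-32768, min(32767, level))

    for v in (0.0, 0.5, -1.0, 0.123456):
        print(f"{v:+.6f} -> {quantize_16bit(v):+6d}")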

If you've followed the discussion up until now, you should have a pretty good idea of what is on a compact disc. It's a massive amount of numbers, each two bytes long, that represent the fluctuating amplitude of the pressure wave in front of the microphone that made the recording. No matter if the sound was an orchestra, a guitar or a car horn, the CD simply contains measurements of the pattern of motion produced by that sound. We can use our hard drives to record the information in the same form as that on a CD, or if we wish, we can use a somewhat less accurate representation. For example, if we choose not to capture the data as accurately as the CD, we might use only eight bits, or one byte, for each amplitude value. Such a measuring system has only what we call 8-bit accuracy, or resolution. This will have a significant impact on the quality of our representation, but it may be adequate for the purpose at hand. Or we might wish to take a measurement only 11,000 or 22,000 times a second, i.e., an 11 kHz or 22 kHz sampling rate, realizing that we will miss some detail, in particular the high end (upper frequencies) of the sound. In truth, such a rate may be good enough to represent certain types of sound; for example, the frequencies produced by the human voice are much lower than those produced by a cymbal, so we might be able to get the whole picture by sampling the voice at a lower rate. The decision regarding how accurate we need to be will be determined by the material we are recording and the amount of storage space we have available to hold the recording. These choices are usually made from within our audio software, so perhaps it's time to turn our attention to the PC.

Digital Audio Software

There are several common varieties of software used to manipulate digital audio data on a computer. The most popular is wave editing software, which is often included as part of the software packaged with sound cards. This type of software allows someone to work with a graphic representation of sound, the waveform, and cut, copy and paste it with the ease of a word processor (Figure 15). The software also typically includes a number of editing features that allow additional processing of the material; this processing can be used to create special effects, such as dogs barking backwards or gunshots being stretched to one hundred times their length. Features of this type fall into the category of signal processing, or digital signal processing (DSP), functions. Professional versions of waveform editors often cost several hundred dollars, but offer the user tremendous flexibility in the types of manipulation they can perform. By the way, on the IBM-compatible platform, digital audio files are typically called Wave files and carry the extension .WAV. On the Macintosh, the standard audio file type is the AIFF file.

Fig 15. -A graphic waveform display.-
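
Incidentally, you can inspect a Wave file's vital statistics with a few lines of Python using its standard wave module; "example.wav" below is a placeholder for any file of your own:

    import wave

    # Read the format information from a Wave file's header.
    with wave.open("example.wav", "rb") as wav:
        print("channels:    ", wav.getnchannels())           # 1 = mono, 2 = stereo
        print("sample width:", wav.getsampwidth(), "bytes")  # 2 = 16-bit
        print("sample rate: ", wav.getframerate(), "Hz")     # e.g. 44100
        seconds = wav.getnframes() / wav.getframerate()
        print(f"duration:     {seconds:.2f} s")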

Usually, wave editing software can accommodate no more than a single stereo file, though a new category, called multi-track software, lets the user work with several stereo files at once. After being manipulated and edited, these files are mixed together into a single composite stereo file that is sent to the left and right channel outputs of a sound card. In many cases, the multi-track software doesn't offer a full range of editing options; most often it is the signal processing functions that are omitted, but the ability to mix many different layers of audio is very appealing.
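
Conceptually, that mixdown is just sample-by-sample addition, with the result clamped to the legal 16-bit range. A minimal Python sketch:

    # Mix several tracks of 16-bit samples into one, preventing overflow.
    def mix(tracks):
        length = max(len(t) for t in tracks)
        mixed = []
        for i in range(length):
            total = sum(t[i] for t in tracks if i < len(t))
            mixed.append(max(-32768, min(32767, total)))  # clamp to 16 bits
        return mixed

    drums = [1000, -2000, 3000, -4000]
    guitar = [500, 500, 30000, 500]
    print(mix([drums, guitar]))  # [1500, -1500, 32767, -3500]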

One other type of editing software is used with dedicated hard-disk recording systems. These professional products are very sophisticated, and often very expensive. Their key advantage is that they provide extensive editing capabilities, such as those needed to make commercial audio recordings, and often include storage devices devoted to holding large amounts of high quality audio. They also provide multiple tracks of digital audio, in some cases up to ten or even twelve simultaneous tracks on a single PC, as well as multiple audio outputs. This makes them well suited for the production of radio and television commercials, where a vocal narration, sound effects and music soundtrack are often combined.

Sound Cards

Far less expensive than the dedicated hardware described above are the massively popular sound cards found in nearly every PC today. Much of the success of these products can be attributed to the fact that IBM-compatible computers never enjoyed the quality of sound production that the Macintosh(TM) had from its inception. When card maker Creative Labs reached the consumer with its industry-standard Sound Blaster(TM) card, it found a huge untapped market that is now quite saturated with products.

Sound cards typically serve several important functions. First, they contain a synthesizer that produces sound using either frequency modulation (FM) synthesis or wavetables of actual recorded audio data used for playback. FM is a somewhat dated synthesis method that uses one or more waves, called modulators, to alter the frequency of another, called the carrier. The range of sounds that can be produced is limited, though often adequate for simple sound effects or other game sounds. While the FM-style card has nearly disappeared from the market, most software manufacturers must include support for it in their products because of the vast number of such cards still installed in computers.
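
To make the idea concrete, here is a bare-bones sketch of two-operator FM in Python. The frequencies and modulation depth are arbitrary example values, and a real FM chip uses several operators shaped by envelopes, but the principle is the same:

    import math

    RATE = 44100          # samples per second
    CARRIER_HZ = 440.0    # the wave we hear
    MODULATOR_HZ = 220.0  # the wave that bends the carrier
    DEPTH = 5.0           # how strongly the modulator bends the carrier

    def fm_sample(n):
        t = n / RATE
        # The modulator's output is added to the carrier's phase.
        return math.sin(2 * math.pi * CARRIER_HZ * t
                        + DEPTH * math.sin(2 * math.pi * MODULATOR_HZ * t))

    one_second = [fm_sample(n) for n in range(RATE)]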

Nearly all newer cards use the preferable wavetable approach because it provides far more realistic sound. Wavetables are digital recordings that exist in some type of compressed form in the card's ROM (read only memory). These sounds can never be erased, but can be altered in numerous ways as they play back. For example, a trumpet sound could be reversed, or a piano could be layered with a snare drum. Depending upon the programmability provided by the manufacturer, this type of card can be quite flexible in the sounds it makes. Most wavetable cards, regardless of their manufacturer, offer a General MIDI soundset, which makes them compatible with many popular multimedia programs. Despite what their ads may claim, sound cards vary tremendously in quality, even those that use the same playback method. Magazine reviews and roundups are a good source of information for evaluating a card's characteristics.
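
The playback principle behind a wavetable card can be sketched very simply: store one cycle of a wave in a table, then step through the table at whatever rate yields the desired pitch. Real cards hold compressed recordings of actual instruments in ROM rather than a single sine cycle, so treat this only as an illustration of the lookup idea:

    import math

    RATE = 44100
    TABLE_SIZE = 1024
    # One stored cycle -- here a sine wave stands in for the ROM data.
    table = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

    def play(frequency_hz, n_samples):
        samples = []
        phase = 0.0
        step = frequency_hz * TABLE_SIZE / RATE  # table positions per sample
        for _ in range(n_samples):
            samples.append(table[int(phase) % TABLE_SIZE])
            phase += step
        return samples

    middle_c = play(261.63, RATE)  # one second of middle C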

Most cards also contain a MIDI interface for MIDI input and output, plus the digital to analog (D/A) and analog to digital (A/D) converters described above. While all MIDI interfaces are essentially created equal, there can be major differences among the converters on these cards. Many cards claim ``CD Quality Sound,'' which simply means they can record and play back audio at a sampling rate of 44.1 kHz using 16-bit resolution. Unfortunately, the personal computer was not originally intended to be a musical instrument, and the high level of electronic activity inside its case can cause interference problems with some cards. With properly built cards, these problems can be avoided, and most users won't experience any difficulties.

Putting it Altogether

MIDI and digital audio have coexisted in separate worlds until very recently. Now, using an entirely new class of software, we have the potential to work with both types of data within a single program. This new category, called simply integrated MIDI and digital audio software, solves many of the most nagging problems desktop musicians have had for years. The capabilities it offers greatly facilitate the integration of ``real world'' audio with the ``virtual'' world of MIDI tracks. Before we discuss this software, let's look at the way things used to work. Here's how musicians combined audio and MIDI in the past.

Synchronization

For many years, in home and professional music studios around the world, musicians have employed elaborate and somewhat complex means to join live audio with MIDI music. Guitarists, vocalists, drummers and others have used different synchronization techniques to mix their live playing with the music produced by their MIDI software. Typically, a musician would record live audio onto a tape recorder, then use the tape recorder to send information to the computer which told it when to start and stop playing. In this way, the music on the tape and the sequenced music could be perfectly aligned.

The information sent by the tape recorder in this case is known as SMPTE time code, and is actually an audio signal recorded (or ``striped'') on the tape. SMPTE (pronounced ``simp-tee'') serves as a timing reference for both the tape and the computer running the MIDI software. In essence, this code tells the software ``what time it is,'' i.e., where into the music it should be. If a MIDI drum part must start exactly one minute after the music on the tape recorder begins, then the sequencer will watch the time pass from the beginning of the tape (time 00:00), until it reaches time 01:00, at which point it begins to play. Sequencers can jump instantly to any time point that's required, so the sequencer will simply wait for its ``cue'' then start playing.

SMPTE, which stands for the Society of Motion Picture and Television Engineers, was initially created by the NASA space agency for use with its tracking stations. It provided an absolute timing reference that allowed the agency to keep track of when transmissions occurred. Like a digital clock, SMPTE works on a 24-hour cycle, and the precision it provides is considerable: a normal SMPTE time represents hours, minutes, seconds, and ``frames'' (Figure 16). The ``frames'' designation is important to the television and movie industries for tracking time in film and video productions. A frame in television occurs 30 times a second, while in film it represents an interval of 1/24th or 1/25th of a second, so SMPTE can measure time quite accurately. Because most professional video equipment is SMPTE-compatible, musicians creating audio for video productions can also use it to synchronize their music with the various types of video equipment they commonly work with. When scoring for films, it is an invaluable way for the composer to know exactly when a sound effect or music cue must begin and end.

Fig. 16 -An example of SMPTE time code, showing time in hours, minutes, seconds, and frames.-
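
The arithmetic behind this timekeeping is straightforward. The sketch below converts an hours:minutes:seconds:frames reading into an absolute position in seconds, the way a sequencer might before comparing it against a cue point; the 30 frames-per-second figure matches the television rate mentioned above:

    # Convert a SMPTE reading to seconds from the start of the tape.
    def smpte_to_seconds(hours, minutes, seconds, frames, fps=30):
        return hours * 3600 + minutes * 60 + seconds + frames / fps

    cue = smpte_to_seconds(0, 1, 0, 0)    # the drum part starts at 01:00
    now = smpte_to_seconds(0, 0, 59, 15)  # the tape is at 00:59 and 15 frames
    if now >= cue:
        print("start playing")            # not yet -- half a second to go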

Integrated Software

Rather than deal with the intricacies of SMPTE, today's musician can work with integrated software to combine audio and MIDI tracks with great precision. New programs like Cakewalk Pro Audio represent digital audio data in the same form as MIDI data, and allow the user to manipulate the two with ease. Once audio files are recorded onto disk, they can be aligned for playback along with the MIDI information, and what's more, numerous tracks of audio can be performed simultaneously. If synchronization with an external device is needed, the entire project can still be controlled by that device. Thus, the best features of multi-track audio software can now be found integrated with the advanced options of MIDI sequencers.

The number of audio tracks that can be mixed together in an integrated program, or in a stand-alone audio editor for that matter, is very much a function of the computer hardware being used for the task. In the IBM world, the processor (CPU) speed, access or ``seek'' time of the hard drive, and available system RAM are among the key components to evaluate. In the early years of desktop multimedia, software leader Microsoft produced a ``multimedia'' specification that described the minimal requirements for work of this type. That spec has been modified to keep up with enhancements in today's computers, and has, as of this writing, reached ``Level III'' status. This calls for a computer with a Pentium 75 MHz or better processor, at least 8 MEGS of RAM, a 540 MEG hard drive, a quad-speed CD-ROM player, a sound card that uses wavetable synthesis, and a video card that is MPEG 1 (a form of compression) compliant. Keep in mind that any component of a system can slow the process: a fast CPU with an inadequate hard drive can bring a system to its knees, for example. It's important that all the pieces of the system are well balanced and in good working order.
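
A rough calculation shows why the hard drive matters so much. Every track of CD-quality stereo audio that plays at once adds another 176,400 bytes per second that the drive must deliver without interruption (if your program stores mono tracks, halve the figure):

    # Sustained data rate for simultaneous CD-quality stereo tracks:
    # 44,100 samples/sec x 2 bytes/sample x 2 channels per track.
    BYTES_PER_TRACK = 44100 * 2 * 2

    for tracks in (1, 4, 8, 12):
        rate = tracks * BYTES_PER_TRACK
        print(tracks, "tracks need", rate / 1000000, "MB per second")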

Here's a tip to keep in mind: one of the easiest and most effective things you can do to prepare your system for recording or playing audio is to defragment your hard drive. A fragmented drive contains pieces of files spread over different physical locations, which makes the job of streaming data to and from that disk very difficult. Use one of the cleanup programs that comes with your operating system, such as defrag, before making recordings. Also, if possible, devote a separate drive partition to digital audio. When you first set up your computer, you can create partitions easily using DOS's fdisk program, but later, you'll have to back up your drive and reformat it.

Summary

We hope you've enjoyed this initial presentation of the ins and outs of desktop music and that it will encourage you to experiment on your own. Much of today's software is very powerful, though manufacturers have done a good job in making it easy to use, and you've got many hours of pleasure and excitement to look forward to. Of course the more you can learn about desktop music, the more you will get out of your equipment, so keep your eyes on the numerous books and magazines devoted to the subject, and consider subscribing to some of the multimedia newsgroups on the Internet. There's a whole world of music waiting for you, right on your desktop.


Glossary of MIDI and Digital Audio Terms

ACTIVE SENSING - a method by which a MIDI device detects disconnection. A message is sent to the receiver around three times per second, and if no message is received during this period, the unit assumes the MIDI connection has been broken. It then begins a routine to reestablish normal operation.

ADDITIVE SYNTHESIS - a synthesis method that builds complex waveforms by combining sine waves whose frequencies and amplitudes are independently variable.

ADSR - Attack, Decay, Sustain and Release, the four stages of an envelope that describe the shape of a sound over time. Attack is the time the sound takes to rise from an initial value of zero to its maximum level. Decay is the time it takes to fall from that peak to the sustain level. Sustain is the level at which the sound remains for as long as the note is held. Release is the time it takes to move from the sustain level to its final level, and typically begins when a note is let up. In most sound generators, these times and levels are programmable.

AFTER TOUCH - a measurement of the force applied by a performer to the key on a controller after it has been depressed. Either polyphonic, which measures the pressure on each individual key, or monophonic, reflecting the total pressure on all keys.

AIFF - the standard file format for storing audio information on an Apple Macintosh computer.

ALGORITHM - a set of instructions supplied to a computer for the purpose of solving a problem.

ALL NOTES OFF - a three byte MIDI channel message that instructs the receiving device to terminate all notes currently sounding.

ALIASING (FOLD-OVER) - ``false frequencies'' that are created when the material being sampled contains frequencies greater than one-half the sampling rate.

AMPLIFIER - a device that increases the amplitude, power or current of a signal. The resulting signal is a reproduction of the input signal at the increased level.

AMPLITUDE - the strength or magnitude of any changing quantity when compared to its ``at rest'' or ``zero'' value.

ANALOG - information which is continuously variable in nature.

ANALOG SYNTHESIS - a method of sound synthesis that relies on predefined waveforms to create sounds that vary over time. The amplitude, frequency and harmonic content of these waveforms can be manipulated to produce a vast number of different results.

ARPEGGIATE - to play the notes of a chord in succession rather than simultaneously.

ATTACK - the initial stage of an envelope. Refers to the time from the beginning of the sound to its highest or maximum level.

BANK - a storage location in a sampler or synthesizer that typically holds a large number of individual programs (sounds).

BINARY NUMBERS - a numbering system based on 2 in which 0 and 1 are the only available digits.

BITS (BYTES) - a bit is a binary digit, the unit of information used by a computer to store numbers. One bit equals a ``one'' or a ``zero''. Usually 8 bits equal one byte; however, MIDI uses a 10-bit byte that includes a start bit, the 8-bit data message, and a stop bit.

BUFFER - an area of RAM used to temporarily store data.

CENTRAL PROCESSING UNIT (CPU) - a silicon chip that performs calculations and acts as the brain of a computer.

CHANNELS - one of 16 different data paths that are available to carry messages in MIDI.

CHANNEL MESSAGE - a type of MIDI message that carries specific channel information.

CHORUSING - a doubling effect commonly found on a synthesizer or sampler that makes a single sound seem like an entire ensemble. The initial signal is split, and a copy appears at a slightly altered pitch from the original, or at a slightly later point in time. The pitch and time offsets are often controllable by a low frequency oscillator (LFO).

CONTINUOUS CONTROLLER - a type of MIDI message that is generated by the movement of a continuous control.

CONTROLLERS - various sliders, levers, knobs, or wheels typically found on a MIDI controller. Used to send continuous (as opposed to discrete) data to control some aspect of a sound.

DECIBEL - a decibel (or dB) is 1/10th of a bel, a relative measure of the ratio between two signal levels.

DC (DIRECT CURRENT) - an electrical current that flows in one direction.

DECAY - one of the four basic stages of an envelope. Refers to the time the sound takes to settle into its sustain level.

DEFAULT - the ``normal'' or ``startup'' state of a hardware device or software application.

DELAY - a common effect in a sampler or synthesizer that mimics the time difference between the arrival of a direct sound and the first reflection to reach the listener's ears.

DIGITAL AUDIO - the numeric representation of sound. Typically used as the means for storing sound information in a computer or sampler.

DIGITAL SYNTHESIS - the use of numbers to create sounds. The method most often used in today's synthesizers for generating sounds, as compared to the analog methods employed previously.

DIN PLUG - a five-pin connector used by MIDI equipment.

DISTORTION - a process, often found desirable by guitar players, that alters a sound's waveform.

DRUM MACHINE - an electronic device, usually controllable via MIDI commands, that contains samples of acoustic drum sounds. Used to create percussion parts and patterns.

DSP - digital signal processing. Processes used to alter sound in its digital form.

DYNAMICS - the relative loudness or softness of a piece of music.

ECHO - the repetition of a sound delayed in time by at least 50 milliseconds after the original. An effect often found in synthesizers and samplers.

ENVELOPE - changes in a sound over time, including alterations in a sound's amplitude, frequency and timbre.

ENVELOPE GENERATOR - a device or process in a synthesizer or other sound generator that creates a time varying signal used to control some aspect of the sound.

ERROR CORRECTION - a procedure found in digital audio systems that detects and corrects inaccurate or missing bits in the data stream.

EQUALIZATION (EQ) - boosting or cutting various frequencies in the spectrum of a sound.

FADE IN/OUT - a feature of most audio editing software that allows the user to apply a gradual amplitude increase or decrease over some segment of the sound.

FADER - also known as a slider or attenuator, this control allows the user to perform a gradual change to the amplitude of a signal. Commonly found as a feature of MIDI software programs.

FILTER - a circuit which permits certain frequencies to pass easily while inhibiting or preventing others. Typical filters include low pass, high pass, band pass, and band reject.

FLANGE - an effect applied to a sound wherein a delayed version of the sound is mixed with the original.

FM SYNTHESIS - a synthesis method based on the modulation of one signal (the carrier) by another (the modulator).

FREQUENCY - the rate per second at which an oscillating body vibrates, usually measured in Hertz (Hz). Humans can hear sounds whose frequencies fall roughly in the range of 20 Hz to 20 kHz.

FUNDAMENTAL FREQUENCY - the predominant frequency in a complex waveform. Typically provides the sound with its strongest pitch reference.

GRAPHIC EQUALIZER - a device type that applies a series of bandpass filters to a sound, each of which works on a certain range of the spectrum. The frequencies that fall within the range, typically one-third octave, can be boosted or cut.

HARMONIC - a sine wave component of a complex sound whose frequency is a whole number multiple of the fundamental frequency.

HARMONIC SERIES - also known as the ``overtone'' series, this is the series of frequencies in a sound that are whole number multiples of the fundamental.

HERTZ - a measurement used to represent the number of times per second a waveform repeats its pattern of motion (cycle).

KEYBOARD SPLIT - a keyboard setup in which different ranges of keys trigger different sounds. Also known as zoning.

LCD - Liquid Crystal Display. A small screen found on electronic instruments that displays data.

LFO - a low frequency oscillator that is used to alter a sound's frequency or amplitude.

LIBRARIAN - a category of MIDI software that is used to organize and store a MIDI device's patch (program) data.

LOCAL ON/OFF - a three byte channel message that determines the status of the Local On function of a MIDI device. LOCAL ON allows the instrument to produce sounds from both incoming MIDI data and its own keyboard. With LOCAL OFF, the instrument responds only to incoming MIDI data.

LOOP - to play a sequencer pattern or a portion of an audio sample repeatedly. The point to which the program returns, whether the beginning or some other point, is usually definable by the user.

METRONOME - a device or software function that produces a discrete pulse. Used to synchronize music with a specific tempo.

MIDI - the Musical Instrument Digital Interface. An international standard for communication between a musical instrument and a computer.

MIDI CLOCK - a system real time message that enables the synchronization of different MIDI devices. The standard rate is 24 divisions per beat.

MIDI INTERFACE - a device that adds a MIDI In, Out and sometimes Thru port to a desktop computer.

MIDI MERGE - a function used to combine MIDI data from several sources into a single stream.

MIDI MESSAGE - the different packets of data that form a MIDI transmission.

MIDI PATCHER - a device that allows the routing of one or more MIDI signals to various MIDI devices. Typically reconfigurable to allow for different routings of the data.

MIDI PORTS - the three connectors that pass MIDI data into (MIDI IN), out of (MIDI OUT) and through (MIDI THRU) a MIDI device.

MIDI TIME CODE (MTC) - a timing system used as a universal reference for all the devices in a MIDI network. Represents the information contained in a SMPTE signal using MIDI messages.

MIXER - a device that allows several different audio sources to be combined into one signal. Provides independent control over each source's loudness and stereo position.

MODULATION WHEEL - one of several common continuous controls on a MIDI device. Often used to add a vibrato effect to a sound.

MONOPHONIC - the ability to play only one note at a time. A characteristic of some older synthesizers.

MULTITIMBRAL - having the ability to produce many different musical timbres (sounds) at once.

MULTITRACK - in traditional recording technology, the ability to layer multiple different audio signals at once. In MIDI software, the ability to layer numerous MIDI data streams.

NOTE ON COMMANDS - a channel voice message that indicates a note is to begin sounding. Contains two additional data bytes: Note number and Note velocity.

NYQUIST FREQUENCY - the highest frequency that any given digital audio system can capture. Defined as one half the sampling rate of that system.

OCTAVE - a frequency ratio of 2:1. A musical distance (interval) of 12 semitones.

OSCILLATOR - an electronic device capable of generating a recurring waveform, or a digital process used by a synthesizer to generate the same.

OVERDUB - the ability to record one sound on top of another.

PATCH CORD - an audio cable used to connect the output of a device to an amplifier or mixer.

PAN - to move a signal from the left to the right of a stereo field, or vice versa.

PARAMETERS - characteristic elements of a sound that are usually programmable in a synthesizer or other MIDI device.

PARTIAL - a sine wave component of a complex sound.

PATCH EDITOR - a category of MIDI software used to control the sound characteristics of a synthesizer from a computer.

PATCHES - also variously known as programs, timbres, or voices. The name used for the sounds that can be generated by a MIDI device.

PERIOD - the time required for one cycle in a periodic waveform. Period is the inverse of frequency.

PHASE - the relative position of a wave to some reference point.

PITCH - the perceived highness or lowness of a sound, determined chiefly by its fundamental frequency.

PITCH BEND - a MIDI controller that can vary the pitch of a sound.

POLYPHONIC - the ability to play many different notes at once.

POTENTIOMETER (POT) - a variable resistor used to alter voltage.

PRESETS - typically, the sounds permanently stored by the manufacturer in a sound generating device.

PROGRAMS (SEE PATCHES)

PROGRAM CHANGE MESSAGE - a two byte MIDI message used to request that a synthesizer change the currently loaded program.

PUNCH IN/OUT - the ability to start and stop a recording at some point other than the beginning.

QUANTIZATION - rounding or truncating a value to the nearest reference value. In a sequencer, used to adjust recorded material so it will be performed precisely on a selected division of the beat. In digital audio, the range of numbers used for specifying amplitude levels of a recorded signal (16-bit quantization = 65,536 values; 8-bit = 256, etc.).

RAM - random access memory. The temporary storage area of a computer or sampler.

REAL TIME - a recording or realization of a sound processing procedure as it occurs. (see Step Time).

RECEPTION MODE - one of four basic configurations used by a synthesizer that determines how it will respond to incoming data.

ROM - read only memory. Permanent memory in a computer or MIDI device.

SAMPLER - an electronic device that can record, alter and playback digital audio data under the control of a MIDI data stream.

SAMPLING - digitizing a waveform by measuring its amplitude fluctuations at some precisely timed intervals. The accuracy of the measurements is a function of the bit resolution.

SAMPLING RATE - the rate at which samples of a waveform are made. Must be twice the highest frequency one wishes to capture. Commercial compact discs use a rate of 44,100 samples per second.

SEQUENCER - MIDI software or less commonly, a hardware device that can record, edit and playback a sequence of MIDI data.

SINE WAVE - the most basic waveform, consisting of a single partial. Forms the basis of all complex, periodic sounds.

SMPTE TIME CODE - a timing standard adopted by the Society of Motion Picture and Television Engineers for controlling different audio and video devices. Allows a sequencer and an external device such as a tape recorder to stay synchronized.

STEP TIME - entering notes one by one, as opposed to real time recording in a sequencer.

SONG POSITION POINTER (SPP) - a system-common message that specifies where in a sequencer a device should begin to play.

STANDARD MIDI FILE - a standardized form of data used for exchanging MIDI files between programs.

STATUS BYTE - the first byte of a MIDI message that specifies what type of message it is.

SUSTAIN PEDAL - a pedal on a MIDI controller (or acoustic piano) that keeps notes sounding even after their keys are released.

SYSTEM COMMON MESSAGES - MIDI messages used for various functions including tuning an instrument and song selection.

SYSTEM EXCLUSIVE MESSAGE - MIDI message used to communicate with a device made by a specific manufacturer.

SYSTEM REAL TIME MESSAGES - commands used to synchronize one MIDI device with another.

TEMPO - the rate of speed at which a musical composition proceeds. Usually uses a quarter note as the timing reference.

TIMBRE - the property of a sound that distinguishes it from all others. Tone color.

TREMOLO - a rapid alternation between two tones, usually a third apart. On a synthesizer, this effect can usually be controlled by the modulation wheel or modulation amount.

VELOCITY - a measure of the speed with which a key on a controller is pressed. Used to determine the volume characteristics of a note.

WAVEFORM - the graphical display of a sound pressure wave over time.

WAVETABLE - a storage location that contains data used to generate waveforms digitally.


Copyright © 1998 by Twelve Tone Systems, Inc. All rights reserved. Cakewalk is a registered trademark of Twelve Tone Systems, Inc. Other names may be trademarks or registered trademarks of their respective companies. Prices, specifications, and availability are subject to change without notice.