Audio Basics
The Wolfram Language provides built-in support for both programmatic and interactive audio processing, fully integrated with the Wolfram Language's powerful mathematical and algorithmic capabilities. You can create and import sound files, manipulate them with built-in functions, apply linear and nonlinear filters, and visualize them in any number of ways.
Audio[data] | in-core audio with samples given by data |
Audio[file] | out-of-core audio from a file |
Audio[url] | out-of-core audio from a URL |
Import[file] | import audio from a file |
AudioGenerator[model] | generate various oscillators and noises |
The simplest way to create an audio object is to wrap the Audio constructor around a vector of real values ranging from -1 to 1.
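As a minimal sketch of this constructor, the following builds one second of a sine tone from a plain list of reals. The 44100 Hz sample rate, the 440 Hz frequency, and the variable names are illustrative assumptions, not values taken from the original example.

```wolfram
(* 44100 samples of a 440 Hz sine; amplitude 0.5 keeps every value inside -1..1 *)
sr = 44100;
samples = Table[0.5 Sin[2. Pi 440 n/sr], {n, 0, sr - 1}];

(* wrap the real-valued vector in an in-core Audio object *)
tone = Audio[samples, SampleRate -> sr]
```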

Another way is to obtain an out-of-core audio object from a file on the local file system or any accessible remote location. Out-of-core audio objects do not store samples in memory.
This creates an out-of-core audio object from the Wolfram Language documentation directory ExampleData and displays the number of bytes used internally by the object:
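A sketch of what such a cell might look like, assuming the sample file `ExampleData/rule30.wav` is available in the documentation directory (the file name is an assumption; substitute any local audio file):

```wolfram
(* passing a file path to Audio keeps the samples on disk (out-of-core) *)
a = Audio["ExampleData/rule30.wav"]

(* the byte count reflects the object itself, not the samples stored in the file *)
ByteCount[a]
```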

You can use Import to create an in-core audio object from a file or URL.
This creates an in-core audio object and displays the number of bytes used internally by the object:
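For comparison, a hedged sketch of the in-core route, again assuming the sample file `ExampleData/rule30.wav` exists. The explicit "Audio" import element is used here so the result is an audio object:

```wolfram
(* Import reads the samples into memory, giving an in-core object *)
b = Import["ExampleData/rule30.wav", "Audio"]

(* this count now includes the sample data held in memory *)
ByteCount[b]
```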

This converts sound represented by SampledSoundFunction to audio:
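One way this conversion might be written is shown below. The generating function, the number of samples, and the 8000 Hz sample rate are illustrative assumptions:

```wolfram
(* a Sound defined by a function evaluated at 8000 integer sample positions, 8000 Hz *)
snd = Sound[SampledSoundFunction[Function[n, Sin[0.3 n]], 8000, 8000]];

(* wrapping the Sound object in Audio converts it to an audio object *)
Audio[snd]
```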

Various oscillators and noises can be created using AudioGenerator.
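A few representative calls, as a sketch; the chosen models, frequency, and durations are arbitrary:

```wolfram
(* two seconds of a sine oscillator at its default frequency *)
AudioGenerator["Sin", 2]

(* one second of a sine oscillator at 880 Hz *)
AudioGenerator[{"Sin", 880}, 1]

(* one second of white noise *)
AudioGenerator["White", 1]
```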

AudioLength[audio] | give the number of samples of an audio object |
Duration[audio] | give the duration of audio in seconds |
AudioChannels[audio] | give the number of channels present in the data for audio |
AudioSampleRate[audio] | give the sample rate associated with audio |
AudioType[audio] | give the type of values used for each sample element in audio |
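The property functions above can be tried on any audio object. This sketch uses a generated tone as a stand-in for the original example's audio:

```wolfram
(* a two-second test tone; any Audio object works here *)
a = AudioGenerator[{"Sin", 440}, 2];

AudioLength[a]       (* number of samples *)
Duration[a]          (* duration in seconds *)
AudioChannels[a]     (* number of channels *)
AudioSampleRate[a]   (* sample rate *)
AudioType[a]         (* type used to store each sample *)
```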

The array of sample values can be extracted using the function AudioData. By default, the function returns real values, but you can ask for a specific type using the optional "type" argument.

Here is the same fragment extracted from the out-of-core audio as a vector of 8-bit signed integers in the range -128 to 127:
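A sketch of both extractions, assuming the sample file `ExampleData/rule30.wav` and a one-millisecond fragment (both are illustrative choices, not necessarily those of the original cells):

```wolfram
a = Audio["ExampleData/rule30.wav"];

(* keep only the first millisecond so the printed array stays short *)
fragment = AudioTrim[a, {0, 0.001}];

(* default: real-valued samples *)
AudioData[fragment]

(* the same samples requested as 8-bit signed integers *)
AudioData[fragment, "SignedInteger8"]
```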

In the case of multichannel audio, the raw sample data is represented by a list of channel values (2D array).
This creates an out-of-core multichannel audio object and extracts a fragment of that audio object as a list of channel values:

A multichannel audio object can be split into a list of single-channel audio objects and conversely, a multichannel audio object can be created from any number of single-channel audio objects.
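The round trip between single-channel and multichannel audio can be sketched with AudioChannelCombine and AudioChannelSeparate. The two generated channels are illustrative stand-ins:

```wolfram
left = AudioGenerator[{"Sin", 440}, 1];
right = AudioGenerator["White", 1];

(* build one two-channel object from two single-channel objects *)
stereo = AudioChannelCombine[{left, right}];
AudioChannels[stereo]

(* the raw data is a 2D array: one row of samples per channel *)
Dimensions[AudioData[stereo]]

(* split back into a list of single-channel audio objects *)
AudioChannelSeparate[stereo]
```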

AudioPlot[audio] | plot the waveform of audio |
Periodogram[audio] | plot the squared magnitude of the discrete Fourier transform (power spectrum) of audio |
Spectrogram[audio] | plot the spectrogram of audio |
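Each of the three visualizations takes the audio object directly. A short sketch on a generated tone (the test signal is an assumption):

```wolfram
a = AudioGenerator[{"Sin", 440}, 1];

AudioPlot[a]      (* waveform over time *)
Periodogram[a]    (* power spectrum *)
Spectrogram[a]    (* time-frequency representation *)
```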

Many useful audio processing tasks require nothing more than simple arithmetic operations between two audio objects or between an audio object and a constant. For example, you can change the volume by multiplying an audio object by a constant factor, or shift all sample values by adding (subtracting) a constant to (from) an audio object. For this purpose, all Wolfram Language operators and functions with the attributes NumericFunction or Listable are overloaded to work with audio objects.
audio1+audio2 | add two audio objects |
n*audio | multiply a scalar by an audio object |
Mean[{a1,a2,…}] | compute the mean of a list of audio objects |
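The three forms in the table can be sketched as follows; the two test tones and the 0.5 factor are arbitrary choices:

```wolfram
a1 = AudioGenerator[{"Sin", 440}, 1];
a2 = AudioGenerator[{"Sin", 660}, 1];

a1 + a2           (* sample-wise sum; the result can exceed the -1..1 range *)
0.5 a1            (* scale every sample by 0.5 *)
Mean[{a1, a2}]    (* sample-wise mean of the two objects *)
```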

A system option called "IndeterminateValue" specifies the replacement for values that can result from an arithmetic operation but cannot be stored in an audio object. These include ComplexInfinity and Indeterminate.

This shows how "IndeterminateValue" is used to replace unacceptable values in the result of an arithmetical operation:

Complex numbers in the result of an arithmetical operation are replaced by the real parts of these values:

Consider the audio manipulation operations that change the audio duration by trimming, deleting, or padding. These operations serve a variety of useful purposes. Trimming and deleting allow you to create a new audio object from a selected portion of a larger one, while padding is typically used to extend an audio object at the ends to ensure uniform treatment of the end samples in many audio processing tasks.
AudioTrim[audio,{t1,t2}] | give an audio object consisting of only samples between t1 and t2 |
AudioPad[audio,m] | pad audio on both sides with m seconds of zeros |
AudioDelete[audio,{t1,t2}] | delete from time t1 to time t2 |
AudioSplit[audio,{t1,t2,…}] | split audio at times ti |
AudioPartition[audio,dur,offset] | partition audio into overlapping segments |
AudioIntervals[audio,crit] | return the intervals of audio for which the criterion crit is satisfied |
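A sketch of the duration-changing functions applied to a two-second test tone. The tone and all time values are illustrative assumptions, with times given in seconds:

```wolfram
a = AudioGenerator[{"Sin", 440}, 2];

AudioTrim[a, {0.5, 1.5}]       (* keep only the part between 0.5 s and 1.5 s *)
AudioPad[a, 1]                 (* add 1 s of zeros on both sides *)
AudioDelete[a, {0.5, 1.5}]     (* remove the part between 0.5 s and 1.5 s *)
AudioSplit[a, {0.5, 1.5}]      (* list of pieces split at 0.5 s and 1.5 s *)
AudioPartition[a, 0.5, 0.25]   (* 0.5 s segments taken every 0.25 s *)
```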

Intervals that satisfy various criteria can be computed and used to extract corresponding blocks of an audio object.
This computes the intervals where the maximal value is greater than 0.1 and the spectral centroid is below 800 Hz:
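A possible form of this computation is sketched below. It assumes the sample file `ExampleData/rule30.wav` and that the criterion can be written as a pure function over named measurements (`#Max`, `#SpectralCentroid`); treat both as assumptions to check against the AudioIntervals reference page.

```wolfram
a = Audio["ExampleData/rule30.wav"];

(* intervals where the peak value exceeds 0.1 and the spectral centroid is under 800 *)
intervals = AudioIntervals[a, #Max > 0.1 && #SpectralCentroid < 800 &]

(* extract the corresponding blocks, one trimmed object per interval *)
AudioTrim[a, #] & /@ intervals
```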

It is frequently necessary to change the sample rate of an audio object by resampling, or to normalize audio samples in some manner. Functions that perform these basic tasks are readily available.
AudioResample[audio,sr] | give a resampled audio object that has sample rate sr |
AudioNormalize[audio] | normalize audio so that the maximum absolute value of its samples is 1 |
AudioAmplify[audio,s] | multiply all samples of the audio by a factor s |
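These three functions can be sketched on a generated tone; the target sample rate and gain factor are arbitrary:

```wolfram
a = AudioGenerator[{"Sin", 440}, 1];

(* resample to 22050 Hz and confirm the new rate *)
AudioSampleRate[AudioResample[a, 22050]]

(* attenuate to a quarter of the original amplitude *)
quiet = AudioAmplify[a, 0.25];

(* rescale so that the largest absolute sample value is 1 *)
AudioNormalize[quiet]
```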

AudioJoin[list] | concatenate a list of audio objects |
AudioOverlay[list] | overlay a list of audio objects |
ConformAudio[list] | conform properties of each audio object from the list |
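A sketch of the three combining functions, using two generated signals of different length as stand-ins for the original examples:

```wolfram
a1 = AudioGenerator[{"Sin", 440}, 1];
a2 = AudioGenerator["White", 2];

AudioJoin[{a1, a2}]       (* play a1, then a2 *)
AudioOverlay[{a1, a2}]    (* play a1 and a2 at the same time *)
ConformAudio[{a1, a2}]    (* list of objects with matching properties *)
```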
