Determining the varying musical pitch of an audio stream has more steps than you probably expected.
Unless you can find a more specific third-party library, you can use an audio analysis function called the FFT (Fast Fourier Transform) to measure the frequency (pitch) of an audio signal. The AudioKit library referenced by @Obelix includes an FFT function. You can also use Apple’s own FFT tool, vDSP.FFT, which is part of vDSP in the Accelerate framework.
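An FFT outputs an array that is a histogram of the spectral components of the input audio signal. The array element with the largest magnitude marks the strongest frequency in the signal, which for a simple tone is usually the fundamental frequency. (For some instruments and voices a harmonic can be stronger than the fundamental, so treat the peak as an estimate.)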
After you determine the frequency of the audio signal, you can determine which frequency of the 12-tone musical scale it is closest to.
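As a rough illustration of that last step, here is a minimal sketch in plain Swift. It assumes equal temperament with A4 tuned to 440 Hz and uses MIDI note numbers (A4 = 69); the function name nearestNote and the note spellings are just placeholders for this example, not part of any library.

```swift
import Foundation

// Assumptions for this sketch: equal temperament, A4 = 440 Hz,
// MIDI note numbering (A4 = 69, C4 = 60), sharps only.
let noteNames = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

/// Returns the nearest equal-tempered note and the offset from it in cents.
func nearestNote(toFrequency frequency: Double,
                 a4: Double = 440.0) -> (name: String, midi: Int, cents: Double) {
    precondition(frequency > 0 && a4 > 0, "frequencies must be positive")
    // Fractional semitone distance from A4, shifted to MIDI numbering.
    let exactMidi = 69.0 + 12.0 * log2(frequency / a4)
    let midi = Int(exactMidi.rounded())
    // Positive cents means the input is sharp of the named note.
    let cents = (exactMidi - Double(midi)) * 100.0
    // Keep the pitch class and octave correct even for very low inputs.
    let pitchClass = ((midi % 12) + 12) % 12
    let octave = Int((Double(midi) / 12.0).rounded(.down)) - 1
    return (noteNames[pitchClass] + String(octave), midi, cents)
}

let note = nearestNote(toFrequency: 82.41)
print(note.name, note.midi)
```

The cents value is handy for a tuner-style display, since it tells you how sharp or flat the input is relative to the nearest note.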
The output of vDSP’s double-precision FFT is a DSPDoubleSplitComplex, a structure that points to two separate arrays holding the real and imaginary parts of the spectral components. To get the squared magnitude of a component, sum the squares of its real and imaginary parts. (No need to take the square root, because you’re only looking for the bin with the largest magnitude, not actually measuring the magnitude.)
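To make the peak search concrete, here is a small sketch in plain Swift. It does not call vDSP; the real and imag arrays simply stand in for the real and imaginary parts of the FFT output, and the function name peakBin is made up for this example. Skipping bin 0 is my own assumption: that bin holds the DC offset rather than a pitch.

```swift
/// Returns the index of the bin with the largest squared magnitude,
/// or nil if there are no bins above bin 0 to search.
/// `real` and `imag` are stand-ins for the FFT's real and imaginary parts.
func peakBin(real: [Double], imag: [Double]) -> Int? {
    precondition(real.count == imag.count, "real and imag must have equal length")
    guard real.count > 1 else { return nil }

    var bestIndex = 1
    var bestPower = -Double.infinity
    // Start at 1 to skip the DC bin (an assumption of this sketch).
    for k in 1..<real.count {
        // Squared magnitude: no square root needed just to find the maximum.
        let power = real[k] * real[k] + imag[k] * imag[k]
        if power > bestPower {
            bestPower = power
            bestIndex = k
        }
    }
    return bestIndex
}
```

With real data you would pass in the first half of the spectrum only, since the second half of a real signal's FFT mirrors the first.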
An FFT operates on a fixed number of input samples, which is called the “length” of the FFT. The frequency resolution (i.e., the width of each histogram increment in Hz) is the audio sampling rate divided by the FFT length. Choose the length “L” so that the resolution is less than the pitch difference between the two lowest notes that you will be evaluating.
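Here is a quick worked example of choosing L, as a sketch in plain Swift. The numbers are assumptions for illustration: a 44.1 kHz sampling rate and a lowest pair of notes E2 (about 82.41 Hz) and F2 (about 87.31 Hz). Rounding L up to a power of two is also an assumption, made because radix-2 FFT routines commonly expect it. The function names are hypothetical.

```swift
/// Smallest power-of-two FFT length whose bin width (sampleRate / length)
/// is below `minSpacingHz`, the gap between the two lowest notes of interest.
func fftLength(sampleRate: Double, minSpacingHz: Double) -> Int {
    precondition(sampleRate > 0 && minSpacingHz > 0, "inputs must be positive")
    var length = 1
    while sampleRate / Double(length) >= minSpacingHz {
        length *= 2
    }
    return length
}

/// Center frequency in Hz of a given FFT bin.
func binFrequency(bin: Int, sampleRate: Double, length: Int) -> Double {
    return Double(bin) * sampleRate / Double(length)
}

// Assumed example values: 44.1 kHz audio, lowest notes E2 and F2.
let sampleRate = 44_100.0
let chosenLength = fftLength(sampleRate: sampleRate, minSpacingHz: 87.31 - 82.41)
let resolution = sampleRate / Double(chosenLength)
print(chosenLength, resolution)
```

For these values the required length works out to 16384 samples, which is roughly 0.37 seconds of audio at 44.1 kHz. That trade-off between frequency resolution and time resolution is worth keeping in mind if you need fast updates on low notes.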
To create a “real-time FFT” that periodically reports the frequency of a real-time audio input stream, you need to periodically re-run the FFT function on a sliding selection (“window”) of the L most recent audio samples.
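The sliding window can be sketched like this in plain Swift. The type SlidingWindow, its hop parameter (how many new samples arrive between analyses), and the analyze callback are all hypothetical names for this example; in a real app the callback is where you would run the FFT on the window.

```swift
/// Keeps the `length` most recent samples and hands a full window to the
/// caller every `hop` new samples. A sketch only: removeFirst() is O(n),
/// so a ring buffer would be a better fit for production audio code.
struct SlidingWindow {
    private(set) var samples: [Float] = []
    let length: Int
    let hop: Int
    private var pending = 0   // samples received since the last analysis

    init(length: Int, hop: Int) {
        precondition(length > 0 && hop > 0, "length and hop must be positive")
        self.length = length
        self.hop = hop
        samples.reserveCapacity(length + 1)
    }

    /// Appends incoming samples, calling `analyze` whenever a full window
    /// is available and at least `hop` new samples have arrived.
    mutating func append(_ newSamples: [Float], analyze: ([Float]) -> Void) {
        for sample in newSamples {
            samples.append(sample)
            if samples.count > length {
                samples.removeFirst()   // drop the oldest sample
            }
            pending += 1
            if samples.count == length && pending >= hop {
                pending = 0
                analyze(samples)
            }
        }
    }
}

// Usage with tiny numbers so the behavior is easy to follow.
var window = SlidingWindow(length: 4, hop: 2)
var analyzed: [[Float]] = []
window.append([1, 2, 3, 4, 5, 6]) { analyzed.append($0) }
print(analyzed)
```

A smaller hop gives more frequent pitch updates at the cost of running the FFT more often; each window still covers the same L samples.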