Processing audio input as a stream

Before you start, you should already have some basic knowledge of Objective-C and C. Sometimes you need more than just a simple recording of external sounds. In my case I need the direct input stream from the iPhone microphone or headset.

There are many reasons to process the stream on the fly, for things like echoes or fancy delay effects on the input signal. You can make funny voices or build the next Shazam. It's up to you, and here is the first step: getting a callback on the stream.

It is pretty hard to find information about common techniques for processing the audio stream on iOS. This example, together with the sample project available on GitHub, is just a start. It gives you a basic overview of the things you have to do and implements a simple gain to boost the audio input. Since iOS 5 you don't need to write your own gain function, because the API now gives you direct control over the input level.
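
For reference, here is a minimal sketch of that iOS 5 API, using the Audio Session property for the input gain scalar. It is independent of the rest of the example and assumes the audio session has already been initialized and activated:

[code lang="objc"]
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

// Sketch: set the hardware input gain (0.0 - 1.0) via the Audio Session API (iOS 5+).
void setInputGainScalar(Float32 desiredGain)
{
    // Not every device/route supports input gain, so check first.
    UInt32 gainAvailable = 0;
    UInt32 size = sizeof(gainAvailable);
    OSStatus status = AudioSessionGetProperty(kAudioSessionProperty_InputGainAvailable,
                                              &size,
                                              &gainAvailable);

    if (status == noErr && gainAvailable) {
        status = AudioSessionSetProperty(kAudioSessionProperty_InputGainScalar,
                                         sizeof(desiredGain),
                                         &desiredGain);
    }

    if (status != noErr) {
        NSLog(@"Could not set input gain (status %d)", (int)status);
    }
}
[/code]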

Project Setup

Just start with an empty or window-based project. You only need one class to manage the audio signal. In my example project the view controller holds an AudioProcessor object, which contains all the functionality: starting, stopping and configuring the audio unit (AU).
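
To give you an idea of how the class is used, here is a rough sketch of the view controller side. The ivar name and the slider action are just assumptions for illustration; the real wiring is in the sample project:

[code lang="objc"]
#import <UIKit/UIKit.h>
#import "AudioProcessor.h"

@interface ViewController : UIViewController
{
    AudioProcessor *audioProcessor; // hypothetical ivar, see the sample project
}
@end

@implementation ViewController

- (void)viewDidLoad
{
    [super viewDidLoad];

    // create the processor and start pulling audio from the microphone
    audioProcessor = [[AudioProcessor alloc] init];
    [audioProcessor start];
}

// hypothetical slider action to control the input boost
- (IBAction)gainChanged:(UISlider *)sender
{
    [audioProcessor setGain:sender.value];
}

@end
[/code]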

In our case we have a file named AudioProcessor.h:

[code lang="objc"]
//
// AudioProcessor.h
//
// Created by Stefan Popp on 21.09.11.
// Copyright 2011 www.stefanpopp.de. All rights reserved.
//

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

// return max value for given values
#define max(a, b) (((a) > (b)) ? (a) : (b))
// return min value for given values
#define min(a, b) (((a) < (b)) ? (a) : (b))

#define kOutputBus 0
#define kInputBus 1

// our default sample rate
#define SAMPLE_RATE 44100.00

[/code]

We use the Audio Unit API, which is part of the AudioToolbox framework. kInputBus (1) later identifies the microphone input and kOutputBus (0) the default output (speaker) on the remote I/O unit.
The sample rate is 44.1 kHz, so a callback delivering 512 frames covers roughly 11.6 ms of audio.

[code lang="objc"]
@interface AudioProcessor : NSObject
{
    // Audio unit
    AudioComponentInstance audioUnit;

    // Audio buffers
    AudioBuffer audioBuffer;

    // gain
    float gain;
}

@property (readonly) AudioBuffer audioBuffer;
@property (readonly) AudioComponentInstance audioUnit;
@property (nonatomic) float gain;
[/code]

Our class needs an audio unit instance, an input buffer and, for the DSP (digital signal processing) part, a variable that holds the current gain multiplier.

[code lang="objc"]
-(AudioProcessor*)init;

-(void)initializeAudio;
-(void)processBuffer: (AudioBufferList*) audioBufferList;

// control object
-(void)start;
-(void)stop;

// gain
-(void)setGain:(float)gainValue;
-(float)getGain;

// error management
-(void)hasError:(int)statusCode:(char*)file:(int)line;
[/code]

Recording callback

Besides the initializer, I implemented a simple error-check method and start and stop functions for the AU.
The implementation is a lot more complex, so I commented it heavily. If something is still unclear, consult the Apple documentation, even if it is not the best documentation on this topic.

[code lang="objc"]

#import "AudioProcessor.h"

#pragma mark Recording callback

static OSStatus recordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData) {

    // the data gets rendered here
    AudioBuffer buffer;

    // a variable where we check the status
    OSStatus status;

    /**
     This is the reference to the object that owns the callback.
     */
    AudioProcessor *audioProcessor = (AudioProcessor*) inRefCon;

    /**
     At this point we define the number of channels, which is mono
     for the iPhone. The number of frames is usually 512 or 1024.
     */
    buffer.mDataByteSize = inNumberFrames * 2; // sample size
    buffer.mNumberChannels = 1; // one channel
    buffer.mData = malloc( inNumberFrames * 2 ); // buffer size

    // we put our buffer into a bufferlist array for rendering
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0] = buffer;

    // render input and check for error
    status = AudioUnitRender([audioProcessor audioUnit], ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList);
    [audioProcessor hasError:status:__FILE__:__LINE__];

    // process the bufferlist in the audio processor
    [audioProcessor processBuffer:&bufferList];

    // clean up the buffer
    free(bufferList.mBuffers[0].mData);

    return noErr;
}
[/code]
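
A word of caution: calling malloc and free inside a render callback is not ideal for real-time audio. The sample project keeps it simple, but if you ever run into dropouts you could allocate a buffer once and reuse it. Here is a minimal sketch of that idea; the renderBuffer ivar and its accessor are assumptions and are not part of the sample project:

[code lang="objc"]
// Sketch: in initializeAudio, allocate a reusable render buffer once,
// sized for the largest number of frames we expect per callback (hypothetical renderBuffer ivar).
renderBuffer.mNumberChannels = 1;
renderBuffer.mDataByteSize = 1024 * 2;
renderBuffer.mData = malloc(1024 * 2);

// Sketch: in recordingCallback, reuse it instead of calling malloc/free on every render.
// renderBuffer would be exposed as a readonly property, just like audioBuffer.
AudioBufferList bufferList;
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0] = [audioProcessor renderBuffer];
bufferList.mBuffers[0].mDataByteSize = inNumberFrames * 2; // bytes actually used this render
[/code]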

Playback callback

The recording callback is called every time new audio packets are available. The audio unit has to render the input into our buffer list before we hand it to the processBuffer: method of the audio processor object. At the end the temporary buffer has to be freed to avoid a memory leak.

[code lang="objc"]
#pragma mark Playback callback

static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {

    /**
     This is the reference to the object that owns the callback.
     */
    AudioProcessor *audioProcessor = (AudioProcessor*) inRefCon;

    // iterate over the output buffers and copy our processed audio into them
    for (int i=0; i < ioData->mNumberBuffers; i++) {
        AudioBuffer buffer = ioData->mBuffers[i];

        // find minimum size
        UInt32 size = min(buffer.mDataByteSize, [audioProcessor audioBuffer].mDataByteSize);

        // copy our processed buffer into the output buffer, which gets played after the function returns
        memcpy(buffer.mData, [audioProcessor audioBuffer].mData, size);

        // set the real data size on the output buffer
        ioData->mBuffers[i].mDataByteSize = size;
    }
    return noErr;
}
[/code]

The playback callback is only needed if you want to loop the processed signal back to the output. That is pretty useful for debugging your processed buffers, because you hear the result immediately.
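
If you want to analyze the input without hearing it, one option is to keep the render callback but fill the output with silence instead of copying the processed buffer. A minimal sketch of the loop body for that variant:

[code lang="objc"]
// Sketch: inside playbackCallback, output silence instead of the processed signal.
for (int i = 0; i < ioData->mNumberBuffers; i++) {
    // zeroed 16-bit PCM samples are silence
    memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
}
return noErr;
[/code]

Alternatively you can simply skip enabling I/O on the output bus during setup; then no playback callback is needed at all.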

[code lang="objc"]
#pragma mark objective-c class

@implementation AudioProcessor
@synthesize audioUnit, audioBuffer, gain;

-(AudioProcessor*)init
{
    self = [super init];
    if (self) {
        gain = 0;
        [self initializeAudio];
    }
    return self;
}
[/code]

Audio component description

I don't think this part needs a lot of explanation; the code below is well commented. If you have any questions, take a look at the Core Audio documentation =). The code describes the input and output processing: we define the component and the stream format, enable I/O on both buses and register our callbacks. The audio unit is created from our description and gets initialized at the bottom.

[code lang="objc"]
-(void)initializeAudio
{
    OSStatus status;

    // We define the audio component
    AudioComponentDescription desc;
    desc.componentType = kAudioUnitType_Output; // we want an output unit
    desc.componentSubType = kAudioUnitSubType_RemoteIO; // we want input and output
    desc.componentFlags = 0; // must be zero
    desc.componentFlagsMask = 0; // must be zero
    desc.componentManufacturer = kAudioUnitManufacturer_Apple; // select provider

    // find the AU component by description
    AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

    // create audio unit by component
    status = AudioComponentInstanceNew(inputComponent, &audioUnit);

    [self hasError:status:__FILE__:__LINE__];

    // enable recording (io) on the input bus
    UInt32 flag = 1;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO, // use io
                                  kAudioUnitScope_Input, // scope to input
                                  kInputBus, // select input bus (1)
                                  &flag, // set flag
                                  sizeof(flag));
    [self hasError:status:__FILE__:__LINE__];

    // enable playback (io) on the output bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO, // use io
                                  kAudioUnitScope_Output, // scope to output
                                  kOutputBus, // select output bus (0)
                                  &flag, // set flag
                                  sizeof(flag));
    [self hasError:status:__FILE__:__LINE__];

    /*
     We need to specify the format we want to work with.
     We use linear PCM because it is uncompressed and we work on raw data.

     We want 16 bits, 2 bytes per packet/frame, at 44.1 kHz
     */
    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate = SAMPLE_RATE;
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mChannelsPerFrame = 1;
    audioFormat.mBitsPerChannel = 16;
    audioFormat.mBytesPerPacket = 2;
    audioFormat.mBytesPerFrame = 2;

    // set the format on the output scope of the input bus (the data coming from the microphone)
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));

    [self hasError:status:__FILE__:__LINE__];

    // set the format on the input scope of the output bus (the data we feed to the speaker)
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Input,
                                  kOutputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));
    [self hasError:status:__FILE__:__LINE__];

    /**
     We need to define a callback structure which holds
     a pointer to the recordingCallback and a reference to
     the audio processor object
     */
    AURenderCallbackStruct callbackStruct;

    // set recording callback
    callbackStruct.inputProc = recordingCallback; // recordingCallback pointer
    callbackStruct.inputProcRefCon = self;

    // set the input callback to recordingCallback on the input bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_SetInputCallback,
                                  kAudioUnitScope_Global,
                                  kInputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));

    [self hasError:status:__FILE__:__LINE__];

    /*
     We do the same on the output stream to hear what is coming
     from the input stream
     */
    callbackStruct.inputProc = playbackCallback;
    callbackStruct.inputProcRefCon = self;

    // set playbackCallback as render callback on the output bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_SetRenderCallback,
                                  kAudioUnitScope_Global,
                                  kOutputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));
    [self hasError:status:__FILE__:__LINE__];

    // reset flag to 0
    flag = 0;

    /*
     we tell the audio unit NOT to allocate its own render buffer for the input bus,
     so that it writes directly into the buffer we pass to AudioUnitRender.
     */
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_ShouldAllocateBuffer,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &flag,
                                  sizeof(flag));
    [self hasError:status:__FILE__:__LINE__];

    /*
     we set the number of channels to mono and allocate our block size of
     512 samples (1024 bytes).
     */
    audioBuffer.mNumberChannels = 1;
    audioBuffer.mDataByteSize = 512 * 2;
    audioBuffer.mData = malloc( 512 * 2 );

    // Initialize the Audio Unit and cross fingers =)
    status = AudioUnitInitialize(audioUnit);
    [self hasError:status:__FILE__:__LINE__];

    NSLog(@"Started");
}
[/code]

AudioUnit control

I need some control over the AU, so I added start and stop functions to the class.

[code lang="objc"]
#pragma mark control stream

-(void)start
{
    // start the audio unit. You should hear something, hopefully 🙂
    OSStatus status = AudioOutputUnitStart(audioUnit);
    [self hasError:status:__FILE__:__LINE__];
}

-(void)stop
{
    // stop the audio unit
    OSStatus status = AudioOutputUnitStop(audioUnit);
    [self hasError:status:__FILE__:__LINE__];
}
[/code]

These two methods just expose the gain to the outside world.

[code lang="objc"]
-(void)setGain:(float)gainValue
{
    gain = gainValue;
}

-(float)getGain
{
    return gain;
}
[/code]
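
The gain here is a plain linear multiplier: 2.0 doubles the amplitude (roughly +6 dB), 0.5 halves it. If you prefer to think in decibels, a small helper like this (not part of the sample project) does the conversion:

[code lang="objc"]
#include <math.h>

// Sketch: convert a gain given in decibels into the linear factor expected by setGain:
static float linearGainFromDecibels(float dB)
{
    // +6 dB ~ factor 2.0, 0 dB = 1.0, -6 dB ~ factor 0.5
    return powf(10.0f, dB / 20.0f);
}

// usage: [audioProcessor setGain:linearGainFromDecibels(6.0f)];
[/code]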

Audio stream manipulation

I am not a fan of splitting a function across a blog post, so here is the whole buffer processing method, commented inline. Hopefully the comments are in better English than the rest of this post 😉

[code lang="objc"]
#pragma mark processing

-(void)processBuffer: (AudioBufferList*) audioBufferList
{
    AudioBuffer sourceBuffer = audioBufferList->mBuffers[0];

    // we check here if the input data byte size has changed
    if (audioBuffer.mDataByteSize != sourceBuffer.mDataByteSize) {
        // clear old buffer
        free(audioBuffer.mData);
        // assign the new byte size and allocate mData accordingly
        audioBuffer.mDataByteSize = sourceBuffer.mDataByteSize;
        audioBuffer.mData = malloc(sourceBuffer.mDataByteSize);
    }

    /**
     Here we modify the raw data buffer.
     In my example this is a simple input volume gain.
     iOS 5 has this on board now, but it works well as an example.
     */
    SInt16 *editBuffer = audioBufferList->mBuffers[0].mData;

    // loop over every sample (2 bytes each)
    for (int nb = 0; nb < (audioBufferList->mBuffers[0].mDataByteSize / 2); nb++) {

        // we check if the gain has been modified, to save resources
        if (gain != 0) {
            // we need more accuracy in our calculation, so we calculate with doubles
            double gainSample = ((double)editBuffer[nb]) / 32767.0;

            /*
             at this point we multiply by our gain factor.
             we multiply instead of adding, to avoid generating sound where there is silence:

             no noise
             0*10=0

             noise even on silence
             0+10=10
             */
            gainSample *= gain;

            /**
             our signal range cannot go beyond -1.0/1.0,
             so we clip the signal to that range
             */
            gainSample = (gainSample < -1.0) ? -1.0 : (gainSample > 1.0) ? 1.0 : gainSample;

            /*
             This is a little helper to shape our incoming wave.
             The sound gets pretty warm and the clipping noise is reduced a lot.
             Feel free to comment this line out and listen to the difference.

             You can see what it does at http://silentmatt.com/javascript-function-plotter/
             Paste this into the plotter: plot y=(1.5*x)-0.5*x*x*x
             */
            gainSample = (1.5 * gainSample) - 0.5 * gainSample * gainSample * gainSample;

            // scale the signal back to the 16-bit range
            gainSample = gainSample * 32767.0;

            // write the calculated sample back to the buffer
            editBuffer[nb] = (SInt16)gainSample;
        }
    }

    // copy the processed audio data into our audio buffer (read by the playback callback)
    memcpy(audioBuffer.mData, audioBufferList->mBuffers[0].mData, audioBufferList->mBuffers[0].mDataByteSize);
}
[/code]
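
Once you have this hook, other effects follow the same pattern. As a teaser, here is a rough sketch of a simple feedback echo; you could call it from processBuffer: with editBuffer and the sample count (mDataByteSize / 2). The delay length, the feedback amount and the static delay line are assumptions for illustration and are not part of the sample project:

[code lang="objc"]
#include <stdlib.h>

// Sketch: a simple feedback echo, roughly 250 ms of delay at 44.1 kHz.
// In a real class the delay line would be an ivar, allocated once and freed in dealloc.
static SInt16 *delayLine = NULL;
static const int delayLength = 11025; // 0.25 s * 44100 Hz
static int delayPos = 0;

static void applyEcho(SInt16 *samples, int count)
{
    if (delayLine == NULL) {
        delayLine = calloc(delayLength, sizeof(SInt16));
    }

    for (int i = 0; i < count; i++) {
        // mix the sample from 250 ms ago into the current one (feedback factor 0.5)
        SInt32 mixed = samples[i] + (delayLine[delayPos] / 2);

        // clip to the 16-bit range
        if (mixed > 32767) mixed = 32767;
        if (mixed < -32768) mixed = -32768;

        // feed the mixed sample back into the delay line and write it to the output
        delayLine[delayPos] = (SInt16)mixed;
        samples[i] = (SInt16)mixed;

        // advance the circular delay position
        delayPos = (delayPos + 1) % delayLength;
    }
}
[/code]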

My little error handler does not really handle errors; it just prints the status code with file and line and exits. That is good enough for this example.

[code lang="objc"]
#pragma mark Error handling

-(void)hasError:(int)statusCode:(char*)file:(int)line
{
    if (statusCode) {
        printf("Error Code responded %d in file %s on line %d\n", statusCode, file, line);
        exit(-1);
    }
}
@end
[/code]
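
Many Core Audio status codes are four-character codes (for example 'fmt?' for an unsupported data format), so it can help to print the status both as a number and as characters. A small sketch of such a helper, not part of the sample project:

[code lang="objc"]
#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Sketch: print an OSStatus as a number and, if possible, as a four-character code.
static void printStatus(OSStatus status)
{
    // reinterpret the 32-bit status as four characters (big endian, as Apple displays them)
    UInt32 bigEndian = CFSwapInt32HostToBig((UInt32)status);
    char code[5];
    memcpy(code, &bigEndian, 4);
    code[4] = '\0';

    BOOL printable = YES;
    for (int i = 0; i < 4; i++) {
        if (!isprint((unsigned char)code[i])) {
            printable = NO;
        }
    }

    if (printable) {
        printf("OSStatus %d ('%s')\n", (int)status, code);
    } else {
        printf("OSStatus %d\n", (int)status);
    }
}
[/code]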

Conclusion

It's pretty easy to get a callback and define a simple audio component. The hard part is finding the right tool for the things you want to do. While researching for this post I have seen a lot of ways to manipulate audio streams, but this is, in my opinion, the easiest and most controllable way.

You can read more about this topic on the following sites. MusicDSP is my personal favorite if you need professional ways of coding effects or methods for counting beats per minute. It's the best knowledge base for audio DSP I know.

CoreAudio documentation
MusicDSP.org
Apple Audio Session Programming Guide

Project source code

You can download the project source code on GitHub:
https://github.com/fuxx/MicInput